🧀 BigCheese.ai

aiOla open-sources ultra-fast 'multi-head' speech recognition model

aiOla has unveiled Whisper-Medusa, an open-source speech recognition model that runs 50% faster than OpenAI’s Whisper without sacrificing accuracy. The speedup comes from predicting ten tokens per decoding pass instead of one, which shortens transcription time, particularly for long-form audio. Whisper-Medusa is currently offered as a 10-head model, and aiOla plans to introduce a 20-head version.

  • Combines OpenAI's Whisper with aiOla's technology and runs over 50% faster.
  • Predicts ten tokens at a time.
  • Available on Hugging Face and GitHub.
  • Employs a multi-head attention architecture and weak supervision.
  • Understands over 100 languages and a range of business jargon.
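
To see why predicting several tokens per pass can reduce runtime, here is a minimal toy sketch in Python. It is not aiOla's implementation: `toy_model`, `toy_heads`, and `decode` are hypothetical stand-ins, the "model" is a fixed arithmetic rule, and the pass counts it produces say nothing about the reported 50% figure. It only illustrates the general idea of extra heads guessing several tokens ahead, with each guess checked against what the base decoder would have produced.

```python
from typing import List, Tuple


def toy_model(prefix: List[int]) -> int:
    """Hypothetical base decoder: the 'true' next token for a given prefix."""
    return (7 * len(prefix) + 3) % 50


def toy_heads(prefix: List[int], k: int) -> List[int]:
    """Hypothetical extra heads: guess the next k tokens in a single pass.

    Every fifth position is deliberately guessed wrong (-1) to simulate
    imperfect heads.
    """
    guesses = []
    for i in range(k):
        pos = len(prefix) + i
        guesses.append(-1 if pos % 5 == 4 else (7 * pos + 3) % 50)
    return guesses


def decode(length: int, k: int) -> Tuple[List[int], int]:
    """Generate `length` tokens with k guesses per pass.

    Returns the tokens and the number of decoding passes used.
    """
    tokens: List[int] = []
    passes = 0
    while len(tokens) < length:
        passes += 1
        for guess in toy_heads(tokens, k):
            if len(tokens) == length:
                break
            # A real system would verify all guesses in one batched pass;
            # this toy checks them one by one for clarity.
            expected = toy_model(tokens)
            if guess != expected:
                # Keep the verified token and end this pass at the first miss.
                tokens.append(expected)
                break
            tokens.append(guess)
    return tokens, passes


one_head_tokens, one_head_passes = decode(20, 1)
ten_head_tokens, ten_head_passes = decode(20, 10)
print(one_head_passes, ten_head_passes)
print(one_head_tokens == ten_head_tokens)
```

In this toy, both settings emit the same tokens because every guess is verified before it is kept; only the number of passes changes. How many guesses survive verification in a real model depends on how accurate the extra heads are.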