🧀 BigCheese.ai


Show HN: Open-source, native audio turn detection model


Smart Turn is an open-source, community-driven native audio turn detection model for conversational voice AI, built on a Wav2Vec2-BERT base model. It aims to overcome the limitations of VAD-based turn detection by using both linguistic and acoustic cues, and it currently supports only English. The project is licensed under BSD-2-Clause and invites contributions.

  • Uses Wav2Vec2-BERT
  • BSD-2-Clause license
  • Supports English
  • ~150 ms inference on GPU
  • Seeking contributions
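
To make the limitation of VAD-based turn detection concrete, here is a minimal sketch of the kind of silence-timeout baseline that an approach like Smart Turn is meant to improve on. It is not code from the Smart Turn project. The function name, the energy threshold, and the 25-frame timeout (about 500 ms at a hypothetical 20 ms per frame) are all illustrative assumptions.

```python
from typing import Optional, Sequence


def vad_end_of_turn(
    frame_energies: Sequence[float],
    energy_threshold: float = 0.01,   # assumed value, for illustration only
    silence_frames_needed: int = 25,  # assumed: ~500 ms at 20 ms per frame
) -> Optional[int]:
    """Return the frame index at which the turn is declared over, or None.

    A frame counts as speech when its energy exceeds energy_threshold. The
    turn ends once silence_frames_needed consecutive non-speech frames have
    followed some speech. No linguistic or prosodic information is used.
    """
    heard_speech = False
    silent_run = 0
    for i, energy in enumerate(frame_energies):
        if energy > energy_threshold:
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            silent_run += 1
            if silent_run >= silence_frames_needed:
                return i
    return None


# A speaker talks, pauses mid-thought for 30 frames, then continues.
frames = [0.5] * 10 + [0.0] * 30 + [0.5] * 10
print(vad_end_of_turn(frames))  # 34: fires during the pause, before speech resumes
```

Because the baseline only measures silence, it cuts in during a thinking pause and cannot tell a finished sentence from an unfinished one. A model that listens for linguistic and acoustic cues in the audio itself is intended to avoid that failure.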