AI speech generator 'reaches human parity' – but it's too dangerous to release


Microsoft's new AI, VALL-E 2, can reproduce a human voice from a few seconds of audio, reaching what its developers call human parity in speech generation. Despite these capabilities, Microsoft is not releasing VALL-E 2 to the public because of the potential for misuse, such as voice spoofing or impersonation.

  • VALL-E 2 achieves human speech parity.
  • The AI can mimic a voice from a few seconds of audio.
  • Audio quality affects VALL-E 2's output.
  • It was evaluated on the LibriSpeech and VCTK datasets.
  • Concerns over misuse limit its release.