🧀 BigCheese.ai

Show HN: AI assisted image editing with audio instructions

AAIELA is an open-source project on GitHub that lets users edit images with spoken commands. It combines AI models for computer vision, speech-to-text transcription, language understanding, and text-to-image inpainting.

  • Utilizes Detectron2 for segmentation.
  • Leverages Faster Whisper for audio transcription.
  • Employs language models like GPT-4 for language understanding.
  • Incorporates Stable Diffusion for image inpainting.
  • Aims to bridge spoken language and visual transformation.
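
The four components listed above suggest a pipeline: transcribe the audio, segment the image, let a language model choose what to edit, then inpaint that region. The following is a minimal sketch of how such stages might be chained, not AAIELA's actual code. Every name in it (`Region`, `EditPlan`, `AudioEditPipeline`, and the four stage callables) is hypothetical, and each stage is passed in as a plain function so that no particular library API is assumed.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class Region:
    """A segmented object: a class label plus its bounding box (hypothetical type)."""
    label: str
    box: Tuple[int, int, int, int]  # (x0, y0, x1, y1)


@dataclass
class EditPlan:
    """What the language-model stage decided to do (hypothetical type)."""
    target_label: str  # which segmented object to change
    prompt: str        # text prompt handed to the inpainting stage


@dataclass
class AudioEditPipeline:
    """Chains four pluggable stages; each one is an assumption about the design."""
    transcribe: Callable[[str], str]                # audio path -> transcript
    segment: Callable[[Any], List[Region]]          # image -> detected regions
    plan: Callable[[str, List[str]], EditPlan]      # transcript + labels -> plan
    inpaint: Callable[[Any, Region, str], Any]      # image, region, prompt -> image

    def run(self, image: Any, audio_path: str) -> Any:
        # 1. Speech-to-text on the spoken instruction.
        text = self.transcribe(audio_path)
        # 2. Find candidate objects in the image.
        regions = self.segment(image)
        # 3. Ask the language stage which object the instruction refers to.
        edit = self.plan(text, [r.label for r in regions])
        # 4. Look up the chosen region and fail loudly if it was not detected.
        matches = [r for r in regions if r.label == edit.target_label]
        if not matches:
            raise ValueError(f"no segmented region labelled {edit.target_label!r}")
        # 5. Inpaint only the first matching region with the planned prompt.
        return self.inpaint(image, matches[0], edit.prompt)
```

In a real setup, each callable would presumably wrap one of the models named above: Faster Whisper for `transcribe`, Detectron2 for `segment`, a model such as GPT-4 for `plan`, and Stable Diffusion for `inpaint`. Keeping the stages as injected functions also makes the flow easy to exercise with stand-in stubs before any model is loaded.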