🧀 BigCheese.ai

OmniVision-968M: Vision Language Model with 9x Token Reduction for Edge Devices

OmniVision-968M is billed as the world's smallest vision language model. Optimized for edge devices, it uses 9x fewer tokens than traditional models, which lowers computational cost, and it is reported to improve accuracy as well.

  • Sub-billion model
  • Enhances accuracy
  • 9x token reduction
  • Uses DPO training
  • Optimized for edge devices
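
To make the 9x figure concrete, here is a minimal sketch of one way such a reduction could work: concatenating each run of 9 consecutive image token vectors into one wider vector. The summary above does not say how OmniVision-968M achieves its reduction, so the merging scheme, the 729-token starting count, the 4-dimensional toy embeddings, and the helper names `reduced_token_count` and `merge_tokens` are all assumptions made for illustration.

```python
def reduced_token_count(image_tokens: int, factor: int = 9) -> int:
    """Return the token count after an exact `factor`-fold reduction."""
    if image_tokens % factor != 0:
        raise ValueError("token count must be divisible by the reduction factor")
    return image_tokens // factor


def merge_tokens(tokens: list[list[float]], factor: int = 9) -> list[list[float]]:
    """Concatenate each run of `factor` consecutive token vectors into one.

    The sequence becomes `factor` times shorter and each vector becomes
    `factor` times wider, so no values are discarded.
    """
    if len(tokens) % factor != 0:
        raise ValueError("token count must be divisible by the reduction factor")
    merged = []
    for start in range(0, len(tokens), factor):
        group = tokens[start:start + factor]
        merged.append([value for vector in group for value in vector])
    return merged


# Assumed example: 729 image tokens, each a toy 4-dimensional embedding.
tokens = [[float(i)] * 4 for i in range(729)]
merged = merge_tokens(tokens)

print(reduced_token_count(729))      # 81
print(len(merged), len(merged[0]))   # 81 36
```

Because self-attention cost grows roughly with the square of the sequence length, a 9x shorter image sequence means far fewer token pairs to score, which is one reason a reduction like this matters on edge hardware.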