🧀 BigCheese.ai


Dragonfly: A large vision-language model with multi-resolution zoom


Dragonfly is a new vision-language model that uses a multi-resolution zoom-and-select architecture to improve fine-grained visual understanding. It reports improved performance on commonsense visual question answering (QA) and image captioning benchmarks.

  • Together AI unveiled Dragonfly.
  • The model uses multi-resolution zoom.
  • It targets fine-grained visual understanding.
  • The models are open source and post competitive results.
  • It was developed in partnership with Stanford Medicine.
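
To make the zoom-and-select idea concrete, here is a toy sketch in plain Python. It is not Together AI's implementation: the image is a small grid of numbers, the function names (downsample, split_into_crops, detail_score, zoom_and_select) are hypothetical, and pixel variance stands in for whatever learned selection the real model uses. It only illustrates the general pattern of viewing one image at several resolutions, cutting each view into crops, and keeping the most informative crops.

```python
from statistics import pvariance


def downsample(image, factor):
    """Average-pool a 2D grid (list of equal-length rows) by an integer factor."""
    rows, cols = len(image) // factor, len(image[0]) // factor
    return [
        [
            sum(
                image[r * factor + dr][c * factor + dc]
                for dr in range(factor)
                for dc in range(factor)
            ) / (factor * factor)
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def split_into_crops(image, crop_size):
    """Cut the grid into non-overlapping square crops, keeping each crop's origin."""
    crops = []
    for top in range(0, len(image) - crop_size + 1, crop_size):
        for left in range(0, len(image[0]) - crop_size + 1, crop_size):
            crop = [row[left:left + crop_size] for row in image[top:top + crop_size]]
            crops.append(((top, left), crop))
    return crops


def detail_score(crop):
    """Assumed stand-in for a learned selector: pixel variance of the crop."""
    return pvariance([value for row in crop for value in row])


def zoom_and_select(image, crop_size, factors, k):
    """Return, for each downsampling factor, the k highest-scoring crops."""
    selected = {}
    for factor in factors:
        view = image if factor == 1 else downsample(image, factor)
        crops = split_into_crops(view, crop_size)
        crops.sort(key=lambda item: detail_score(item[1]), reverse=True)
        selected[factor] = crops[:k]
    return selected


# Usage: an 8x8 blank image with one bright detail in the lower-right quadrant.
image = [[0.0] * 8 for _ in range(8)]
image[5][6] = 1.0
result = zoom_and_select(image, crop_size=4, factors=(2, 1), k=1)
top_full_res_origin = result[1][0][0]  # origin of the crop kept at full resolution
```

In this sketch the coarse view (factor 2) yields a single crop covering the whole image, while the full-resolution view keeps only the crop containing the detail, which is the intuition behind spending capacity on the regions that matter.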