Dragonfly: A large vision-language model with multi-resolution zoom


View Website Dragonfly Arxiv Paper Dragonfly GitHub Together AI Twitter Together AI LinkedIn

Dragonfly, a new vision-language model, uses a multi-resolution zoom-and-select architecture to improve fine-grained visual understanding. It reports improved performance on commonsense visual QA and image captioning benchmarks.

Dragonfly unveiled by Together AI.
New model uses multi-resolution zoom.
Achieves fine-grained visual understanding.
Open-source models with competitive results.
Developed in partnership with Stanford Medicine.
