🧀 BigCheese.ai

ARIA: An Open Multimodal Native Mixture-of-Experts Model

The paper 'Aria: An Open Multimodal Native Mixture-of-Experts Model' presents Aria, an open-source mixture-of-experts model built natively for multimodal input. With 3.9B activated parameters per visual token and 3.5B per text token, it outperforms open models such as Pixtral-12B and is competitive with leading proprietary models. The authors pre-train Aria with a 4-stage pipeline that progressively builds language understanding, multimodal understanding, long-context handling, and instruction following.
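The "activated parameters" figure is the key to a mixture-of-experts design: each token is routed to only a few experts, so far fewer parameters fire per token than the model contains in total. The paper does not spell out its router here, so the following is a generic top-k MoE sketch (all names and shapes are illustrative assumptions, not Aria's implementation):

```python
import numpy as np

def moe_forward(x, experts, router_w, k=2):
    """Generic top-k mixture-of-experts layer (illustrative sketch,
    not Aria's actual router). Each token activates only k of the
    experts, which is why activated parameters per token can be much
    smaller than the model's total parameter count."""
    logits = x @ router_w                       # (n_tokens, n_experts)
    topk = np.argsort(logits, axis=-1)[:, -k:]  # k best experts per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = logits[t, topk[t]]
        gates = np.exp(sel - sel.max())
        gates /= gates.sum()                    # softmax over chosen experts
        for gate, e in zip(gates, topk[t]):
            out[t] += gate * experts[e](x[t])   # weighted expert outputs
    return out

# Tiny usage example: 3 linear "experts", 2 activated per token.
rng = np.random.default_rng(0)
d, n_exp, n_tok = 4, 3, 5
x = rng.normal(size=(n_tok, d))
router_w = rng.normal(size=(d, n_exp))
experts = [lambda v, W=rng.normal(size=(d, d)): W @ v for _ in range(n_exp)]
y = moe_forward(x, experts, router_w, k=2)
```

With k=2 of 3 experts active, each token touches roughly two-thirds of the expert parameters; production MoE models use many more experts, so the activated fraction is far smaller.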

  • Aria is an open-source model.
  • It outperforms Pixtral-12B.
  • Authors: Dongxu Li et al.
  • Submitted: 8 Oct 2024.
  • Venue: arXiv