🧀 BigCheese.ai


MaxText: A simple, performant and scalable JAX LLM


MaxText is an open-source large language model (LLM) written in Python/JAX and designed for high performance and scalability on Google Cloud TPUs and GPUs. It achieves high Model FLOPs Utilization (MFU), supports both training and inference, and includes open models such as Llama2, Mistral, and Gemma.

  • Achieves high MFU on TPU v5p.
  • Scalable to ~51K chips.
  • Compatible with TPUs and GPUs.
  • Supports models like Llama2 and Gemma.
  • Encourages forking and modification.
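
MFU is the ratio of the FLOPs a training run actually performs per second to the theoretical peak of the hardware it runs on. The sketch below shows one common way to estimate it, assuming the widely used approximation of about 6 FLOPs per parameter per token for a dense transformer's forward and backward pass. The function name, its parameters, and all input numbers are hypothetical placeholders chosen for illustration; they are not MaxText's own code or measured results.

```python
def model_flops_utilization(num_params, tokens_per_second, num_chips, peak_flops_per_chip):
    """Return MFU as a fraction, using the ~6 * N FLOPs-per-token approximation."""
    # Assumption: training a dense transformer costs roughly 6 FLOPs
    # per parameter per token (forward plus backward pass).
    achieved_flops_per_second = 6 * num_params * tokens_per_second
    # Theoretical peak of the whole slice: chips times per-chip peak.
    peak_flops_per_second = num_chips * peak_flops_per_chip
    return achieved_flops_per_second / peak_flops_per_second


# Illustrative placeholder inputs, not measurements from MaxText or any real TPU.
mfu = model_flops_utilization(
    num_params=1_000_000_000,     # 1B-parameter model
    tokens_per_second=100_000,    # observed training throughput
    num_chips=4,                  # accelerators in the slice
    peak_flops_per_chip=250e12,   # hypothetical peak FLOP/s per chip
)
print(f"MFU: {mfu:.1%}")  # prints "MFU: 60.0%"
```

To apply this to a real run, substitute the measured throughput and the accelerator's published peak FLOP/s. The 6 * N rule ignores attention FLOPs, so it slightly underestimates MFU for long sequence lengths.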