🧀 BigCheese.ai

Cerebras Launches the Fastest AI Inference

Cerebras Systems has launched Cerebras Inference, touted as the world’s fastest AI inference platform. The company claims it runs 20 times faster than NVIDIA GPU-based solutions while undercutting them on price, with pay-as-you-go rates starting at 10 cents per million tokens. Rather than trading quality for speed, the platform maintains state-of-the-art accuracy by serving models at 16-bit precision, and its performance claims have been recognized in independent AI benchmarks. The headline figures are listed below, followed by a rough sketch of what they imply in practice.

  • 1,800 tokens/s for Llama 3.1 8B
  • 450 tokens/s for Llama 3.1 70B
  • 20x faster than GPUs
  • 10c per million tokens
  • 16-bit accuracy maintained
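
To make these figures concrete, here is a minimal back-of-the-envelope sketch using only the numbers quoted above; the constants come from this post, while the function and variable names are illustrative and not part of any Cerebras API.

# Back-of-the-envelope math using only the figures quoted in this post.
# The constants below come from the article; the function and variable
# names are illustrative, not part of any Cerebras API.

TOKENS_PER_SECOND_8B = 1_800       # claimed Llama 3.1 8B throughput
TOKENS_PER_SECOND_70B = 450        # claimed Llama 3.1 70B throughput
USD_PER_MILLION_TOKENS = 0.10      # quoted entry price (10 cents per million tokens)


def seconds_to_generate(tokens: int, tokens_per_second: float) -> float:
    """Time in seconds to stream `tokens` at a given throughput."""
    return tokens / tokens_per_second


def cost_usd(tokens: int, usd_per_million: float = USD_PER_MILLION_TOKENS) -> float:
    """Dollar cost of `tokens` at a per-million-token price."""
    return tokens / 1_000_000 * usd_per_million


if __name__ == "__main__":
    n = 1_000_000  # one million output tokens
    for label, rate in (("Llama 3.1 8B", TOKENS_PER_SECOND_8B),
                        ("Llama 3.1 70B", TOKENS_PER_SECOND_70B)):
        secs = seconds_to_generate(n, rate)
        print(f"{label}: {secs:,.0f} s (~{secs / 60:.1f} min) for {n:,} tokens")
    print(f"Cost at the entry price: ${cost_usd(n):.2f}")

At the quoted rates, a million output tokens takes roughly nine minutes on the 8B model and about 37 minutes on the 70B model, and costs about ten cents at the quoted entry price.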