🧀 BigCheese.ai

AMD Unveils Its First Small Language Model AMD-135M

AMD introduces its first small language model, AMD-135M, a compact counterpart to large language models that offers advantages for specific use cases. Trained on 670 billion tokens using AMD Instinct MI250 accelerators, the model and its code variant are open-sourced to support development and innovation within the AI community. Using speculative decoding, AMD-135M improves inference performance by serving as a draft model that speeds up larger models such as CodeLlama-7b on select AMD platforms.

  • AMD-135M is the first SLM from AMD.
  • Uses 670B tokens for pretraining.
  • Trains in 6 days on 4 MI250 nodes.
  • Speculative Decoding enhances speed.
  • Model open-sourced for collaboration.