AMD introduces its first small language model, AMD-135M, a compact counterpart to large language models that offers advantages for specific use cases. Trained on 670 billion tokens using AMD Instinct MI250 accelerators, the model and its code variant are open-sourced to support development and innovation within the AI community. With speculative decoding, AMD-135M improves inference performance by serving as a draft model that speeds up larger models such as CodeLlama-7b on select AMD platforms.
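To illustrate the idea behind speculative decoding, here is a minimal sketch using toy stand-in models (the function names and the next-token rules are hypothetical, not AMD's actual implementation): a cheap draft model proposes a short run of tokens, and the larger target model verifies them, keeping the longest agreeing prefix plus one corrected token.

```python
# Toy sketch of greedy speculative decoding. Both "models" below are
# hypothetical placeholders: real systems would use AMD-135M as the
# draft and a larger model such as CodeLlama-7b as the target.

def draft_model(prefix):
    # Toy "small" model: predicts the next token as (last + 1) mod 10.
    return (prefix[-1] + 1) % 10

def target_model(prefix):
    # Toy "large" model: mostly agrees, but emits 0 after a 7.
    return 0 if prefix[-1] == 7 else (prefix[-1] + 1) % 10

def speculative_step(prefix, k=4):
    # The draft model cheaply proposes k tokens in sequence.
    draft = list(prefix)
    for _ in range(k):
        draft.append(draft_model(draft))
    proposed = draft[len(prefix):]

    # The target model checks each proposed token. In a real system this
    # verification is a single batched forward pass over all k proposals,
    # which is where the speedup over token-by-token decoding comes from.
    accepted = []
    for tok in proposed:
        expected = target_model(prefix + accepted)
        if tok == expected:
            accepted.append(tok)
        else:
            accepted.append(expected)  # keep the target's correction and stop
            break
    return prefix + accepted

seq = [5]
for _ in range(3):
    seq = speculative_step(seq)
print(seq)  # → [5, 6, 7, 0, 1, 2, 3, 4, 5, 6, 7, 0]
```

Because the target model only corrects the draft where they disagree, the output is identical to what the target model would produce alone; the draft model merely lets several tokens be verified per target-model pass.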