AMD introduces its first small language model, AMD-135M, a compact counterpart to large language models that offers advantages for specific use cases. Trained on 670 billion tokens using AMD Instinct MI250 accelerators, the model and its code variant are open-sourced to support development and innovation within the AI community. With speculative decoding, AMD-135M improves inference performance by serving as a draft model that speeds up larger models such as CodeLlama-7b on select AMD platforms.
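To illustrate the idea behind speculative decoding, here is a minimal sketch using toy stand-in models (the function names and the next-token rules are hypothetical, not AMD's actual implementation): a cheap draft model proposes a short run of tokens, and the larger target model verifies them, keeping the longest agreeing prefix plus one corrected token.

```python
# Toy sketch of greedy speculative decoding. Both "models" below are
# hypothetical placeholders: real systems would use AMD-135M as the
# draft and a larger model such as CodeLlama-7b as the target.

def draft_model(prefix):
    # Toy "small" model: predicts the next token as (last + 1) mod 10.
    return (prefix[-1] + 1) % 10

def target_model(prefix):
    # Toy "large" model: mostly agrees, but emits 0 after a 7.
    return 0 if prefix[-1] == 7 else (prefix[-1] + 1) % 10

def speculative_step(prefix, k=4):
    # The draft model cheaply proposes k tokens in sequence.
    draft = list(prefix)
    for _ in range(k):
        draft.append(draft_model(draft))
    proposed = draft[len(prefix):]

    # The target model checks each proposed token. In a real system this
    # verification is a single batched forward pass over all k proposals,
    # which is where the speedup over token-by-token decoding comes from.
    accepted = []
    for tok in proposed:
        expected = target_model(prefix + accepted)
        if tok == expected:
            accepted.append(tok)
        else:
            accepted.append(expected)  # keep the target's correction and stop
            break
    return prefix + accepted

seq = [5]
for _ in range(3):
    seq = speculative_step(seq)
print(seq)  # → [5, 6, 7, 0, 1, 2, 3, 4, 5, 6, 7, 0]
```

Because the target model only corrects the draft where they disagree, the output is identical to what the target model would produce alone; the draft model merely lets several tokens be verified per target-model pass.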