Lm.rs: Minimal CPU LLM inference in Rust with no dependency

The GitHub project lm.rs, created by Samuel Vitorino, demonstrates minimal LLM inference in Rust, aimed at running language models on the CPU without relying on machine learning libraries. The project includes benchmarks, download links for models and tokenizers, and instructions for converting models to the LMRS format.
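
The project's own kernels are not reproduced here; purely as an illustration of the kind of building block dependency-free CPU inference rests on, below is a minimal sketch of a row-major matrix-vector product in plain Rust (the function name, signature, and shapes are assumptions for the example, not lm.rs's API):

```rust
// Illustrative only: a naive matrix-vector product of the kind a
// dependency-free transformer forward pass is built from.
// `w` is a row-major (rows x cols) weight matrix, `x` the input vector.
fn matvec(out: &mut [f32], w: &[f32], x: &[f32], rows: usize, cols: usize) {
    assert_eq!(w.len(), rows * cols);
    assert_eq!(x.len(), cols);
    assert_eq!(out.len(), rows);
    for r in 0..rows {
        let row = &w[r * cols..(r + 1) * cols];
        out[r] = row.iter().zip(x).map(|(wv, xv)| wv * xv).sum();
    }
}

fn main() {
    // A 2x3 weight matrix applied to a length-3 vector.
    let w = [1.0, 0.0, 2.0, 0.5, 1.0, 0.0];
    let x = [3.0, 4.0, 5.0];
    let mut out = [0.0f32; 2];
    matvec(&mut out, &w, &x, 2, 3);
    println!("{:?}", out); // [13.0, 5.5]
}
```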

  • Project includes minimal Rust code for LLM CPU inference
  • Supports Llama 3.2 1B and 3B models
  • Models available on Hugging Face
  • Provides quantized model download options (illustrated in the sketch after this list)
  • WebUI available for backend interaction
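
The quantized downloads trade a little accuracy for much lower memory use by storing weights as small integers with scale factors. The exact LMRS layout is not described here, so the following is only a hedged sketch of a common blockwise 8-bit scheme; the block size, struct fields, and function names are assumptions for illustration:

```rust
// Illustrative only: blockwise 8-bit quantization as commonly used for
// CPU LLM inference. Block size and struct layout are assumptions,
// not the LMRS file format.
const BLOCK: usize = 32;

/// One block of quantized weights: 32 signed bytes plus one f32 scale.
struct QBlock {
    scale: f32,
    q: [i8; BLOCK],
}

/// Dot product between quantized weight blocks and an f32 activation vector.
fn qdot(blocks: &[QBlock], x: &[f32]) -> f32 {
    assert_eq!(blocks.len() * BLOCK, x.len());
    blocks
        .iter()
        .zip(x.chunks_exact(BLOCK))
        .map(|(b, xs)| {
            let partial: f32 = b.q.iter().zip(xs).map(|(&q, &xv)| q as f32 * xv).sum();
            partial * b.scale
        })
        .sum()
}

fn main() {
    // Quantize a constant weight of ~0.1 into one block (scale * q ≈ 0.1).
    let block = QBlock { scale: 0.1 / 127.0, q: [127; BLOCK] };
    let x = vec![1.0f32; BLOCK];
    // 32 weights of ~0.1 times inputs of 1.0 ≈ 3.2
    println!("{}", qdot(&[block], &x));
}
```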