Topics
Small Language Models
Compact, on-device, and edge-deployable models — strong capability per parameter for local and low-cost inference.
Small Language Models · Hugging Face
DistilBERT turns knowledge distillation for compact language models into a concrete research object, with evidence anchors, method tradeoffs, and limits for practical use.
Small Language Models · Google Research
MobileBERT turns mobile-friendly BERT compression into a concrete research object, with evidence anchors, method tradeoffs, and limits for practical use.
Small Language Models · Independent Researcher
TinyLlama turns open small language model training into a concrete research object, with evidence anchors, method tradeoffs, and limits for practical use.
Small Language Models · Meta AI
MobileLLM argues architecture matters more at sub-billion scale: deep-thin designs plus sharing improve 125M/350M models by 2.7%/4.3%, then 0.7%/0.8% more.
Small Language Models · Hugging Face
SmolLM2 is a 1.7B model overtrained on ~11T tokens through four data stages. It scores 68.7 on HellaSwag and 19.4 on MMLU-Pro, beating Llama3.2-1B — and ships every dataset, not just the weights.
Efficient AI · Microsoft Research
Phi-3-mini is a 3.8B-parameter model trained on 3.3T heavily filtered and synthetic tokens that hits 69% on MMLU and 8.38 on MT-bench — matching Mixtral 8x7B and GPT-3.5 while small enough to run on a phone.