Theorem Proving · LLM Reasoning

AlphaGeometry: Olympiad Geometry Without Human Proof Demonstrations

AlphaGeometry combines a neural language model with symbolic deduction, using synthetic theorems and proofs to reach near gold-medal performance on olympiad geometry.

What problem it solves

Geometry is a hard test for AI reasoning because natural-language statements, diagrams, auxiliary constructions, and formal proof steps all interact. Human proof demonstrations are scarce, and the space of possible auxiliary constructions grows too quickly for naive search to cover. AlphaGeometry asks whether a system can learn useful proof guidance without relying on human-written geometry solutions.

The core method

The system is neuro-symbolic. A neural language model proposes useful auxiliary constructions, while a symbolic deduction engine performs exact geometric reasoning. Instead of training on human proofs, AlphaGeometry synthesizes millions of theorems and proofs at different levels of difficulty, then trains the language model from scratch to guide the search process. The neural part suggests where to search; the symbolic part checks what follows.
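The propose-then-deduce loop can be sketched in a few lines. This is a toy illustration, not AlphaGeometry's implementation: facts are plain strings, the rule set is hand-written, and toy_propose is a hypothetical stand-in for the language model that simply walks a fixed candidate list. All names here (deduce_closure, solve, toy_propose, RULES) are assumptions made for the example, and the loop follows a single greedy path rather than a wider search over many proposals.

```python
from typing import Callable, FrozenSet, List, Optional, Set, Tuple

# A rule says: if every premise is known, the conclusion follows.
Rule = Tuple[FrozenSet[str], str]


def deduce_closure(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """Symbolic part: forward-chain until no rule adds a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def solve(
    premises: Set[str],
    goal: str,
    rules: List[Rule],
    propose: Callable[[Set[str], str], Optional[str]],
    max_constructions: int = 3,
) -> Optional[List[str]]:
    """Alternate deduction and proposal; return the constructions used, or None."""
    facts = set(premises)
    added: List[str] = []
    for attempt in range(max_constructions + 1):
        if goal in deduce_closure(facts, rules):
            return added
        if attempt == max_constructions:
            break
        suggestion = propose(facts, goal)  # "neural" part: where to look next
        if suggestion is None:
            break
        facts.add(suggestion)
        added.append(suggestion)
    return None


# Toy problem (hypothetical encoding): base angles of an isosceles triangle.
RULES: List[Rule] = [
    (frozenset({"AB = AC", "M is the midpoint of BC"}),
     "triangles ABM and ACM are congruent"),
    (frozenset({"triangles ABM and ACM are congruent"}),
     "angle ABC = angle ACB"),
]
CANDIDATES = ["N is the midpoint of AB", "M is the midpoint of BC"]


def toy_propose(facts: Set[str], goal: str) -> Optional[str]:
    """Stand-in for the language model: return the first unused candidate."""
    for candidate in CANDIDATES:
        if candidate not in facts:
            return candidate
    return None


result = solve({"AB = AC"}, "angle ABC = angle ACB", RULES, toy_propose)
print(result)
```

In this toy run the first suggestion does not help, so the loop asks for a second one, after which deduction reaches the goal. The proposer never asserts that the goal holds; it only adds objects for the deduction step to reason about, so a poor suggestion costs search time but cannot make the derivation unsound.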

Key results

On a benchmark of 30 recent olympiad-level geometry problems, AlphaGeometry solves 25. The previous best method solved 10, and the result approaches the level of an average IMO gold medallist. Under human expert evaluation, it solves all geometry problems from IMO 2000 and IMO 2015. It also produces human-readable proofs, although not always with the elegance or brevity a mathematician would prefer.

Why it matters

AlphaGeometry is important because it avoids the usual tradeoff between neural fluency and symbolic correctness. The language model is never asked to write the proof itself, so it has no chance to hallucinate one; it proposes constructions that expand the symbolic engine’s reach. That pattern is relevant beyond geometry: use learning to guide search, and use formal machinery to verify each step.

Limits and open questions

The system is specialized for Euclidean plane geometry, not general mathematics. Some generated proofs can be long and mechanically tedious, and translating messy real problem statements into the right formal representation remains hard. The broader question is whether the same synthetic-data plus symbolic-checking pattern can scale to algebra, combinatorics, formal theorem proving, and scientific reasoning.

One line: AlphaGeometry uses neural intuition to steer symbolic proof.