Leanstral Review: Mistral Just Shipped the AI That Proves Your Code Is Correct

Here is a fact that should give every engineering leader pause: 96% of developers say they do not fully trust AI-generated code.

They are using it anyway. Because it is fast, and deadlines are real. But they are shipping code they cannot verify. At scale, in critical systems, this is a ticking clock.

Mistral shipped Leanstral on March 16, 2026. It is a different kind of answer to the AI code quality problem: not more guardrails, not better test generation, but formal mathematical proof. Leanstral generates machine-checked proofs that your code satisfies a formal specification. That is a fundamentally different level of verification from “the tests pass.”

What Is Leanstral?

Leanstral is an open-source AI agent built specifically for Lean 4, the proof assistant and programming language that originated at Microsoft Research and is widely used in mathematics and software verification.

At its core: you write your code and your formal specification, and Leanstral generates the Lean 4 proof that your implementation satisfies the spec. If the proof compiles, the code is mathematically verified against that spec. Not “probably right.” Proven correct, to the extent the specification captures what you actually want.
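To make the code, spec, and proof split concrete, here is a minimal sketch. The function `double` and the theorem name `double_spec` are hypothetical, chosen only for illustration, and the proof assumes a recent Lean 4 toolchain in which the `omega` tactic ships with core.

```lean
-- Implementation: a hypothetical function we want to verify.
def double (n : Nat) : Nat := n + n

-- Specification and proof: for every natural number n,
-- `double n` equals `2 * n`. The theorem statement is the spec;
-- the tactic block after `by` is the proof Lean checks.
theorem double_spec (n : Nat) : double n = 2 * n := by
  unfold double   -- expose the definition: goal becomes n + n = 2 * n
  omega           -- linear arithmetic over Nat closes the goal
```

In this picture, the proof body after `by` is the part a tool like Leanstral would be asked to produce. The theorem statement is still yours to get right.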

Mistral released it under Apache 2.0: open-source and commercially usable, subject only to the license’s attribution and notice terms. They also offer a free API endpoint and embed it into Mistral Vibe (their coding environment) for teams that want managed access.

Why Formal Verification Matters Now

Formal verification has existed for decades. The reason it never went mainstream is the same reason most powerful tools do not: the learning curve is brutal, and the tooling friction is high. Writing Lean 4 proofs by hand requires deep expertise in type theory. Most engineers do not have it and do not want to acquire it.

Leanstral changes the equation. It lowers the barrier from “hire a formal methods expert” to “describe what your function should do.” The agent handles the proof generation. Lean’s compiler handles the verification. You get the guarantee without requiring your team to become mathematicians.

This matters specifically for:

Financial software where a calculation error is a liability event
Safety-critical systems (medical devices, aerospace, autonomous vehicles) where unverified behavior is a legal and ethical risk
Cryptography and security protocols where correctness is not optional
Smart contracts where a logic error is an irreversible financial loss
Frontier research mathematics where human verification is the bottleneck

Under the Hood

Leanstral is a 120-billion-parameter mixture-of-experts model, but it runs on just 6 billion active parameters per forward pass. This is Mistral’s signature architecture play: enormous model capacity, fraction of the compute cost.

The results back it up. On the MiniF2F benchmark (the standard evaluation set for formal math proofs):

Leanstral at pass@2: 26.3%, outperforming Claude Sonnet at $36 total cost vs $549
Leanstral at pass@4: 29.3%, a further 3 points for twice the attempts
Claude Opus remains ahead at higher pass counts, but at dramatically higher cost

For most practical use cases, Leanstral at pass@2 or pass@4 delivers verification results that would cost roughly 15x more with comparable commercial models ($549 vs $36 on the figures above). Because the weights are Apache 2.0 licensed, you can self-host it with no per-query fee beyond your own inference infrastructure.
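The arithmetic behind those numbers is easy to check, and in keeping with the subject it can be checked in Lean itself. This is a sketch using only the figures quoted above. The constant names are made up for illustration, and percentages are written in tenths of a percent so everything stays in natural numbers.

```lean
-- Costs quoted above, in whole US dollars per evaluation run.
def leanstralCost : Nat := 36
def sonnetCost : Nat := 549

-- The cost ratio 549 / 36 lies between 15 and 16 (it is about 15.25).
example : 15 * leanstralCost ≤ sonnetCost ∧ sonnetCost < 16 * leanstralCost := by
  decide

-- pass@2 = 26.3% and pass@4 = 29.3%, in tenths of a percent:
-- the gain from doubling the attempts is 3.0 points.
example : 293 - 263 = 30 := by
  decide
```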

How It Works in Practice

Leanstral integrates with Lean’s language server via MCP (Model Context Protocol), the same protocol used by Claude Code and Cursor for tooling integration. If you already work in a Lean 4 development environment, the integration is straightforward.

Mistral demonstrated it handling a real Stack Exchange debugging question: a change in Lean 4.29.0-rc6 had altered how a type alias behaved. Leanstral correctly diagnosed the definitional equality issue and identified that swapping def for abbrev would restore tactic matching. That is the kind of context-aware reasoning that distinguishes a trained specialist from a general-purpose model that happens to know some Lean syntax.
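For readers who have not hit the def versus abbrev distinction, here is a small illustration of the general mechanism. It is not the code from the Stack Exchange question, and the alias names `CountDef` and `CountAbbrev` are hypothetical. An `abbrev` is reducible, so instance search and many tactics unfold it automatically, while a plain `def` alias is left opaque in those situations.

```lean
-- Hypothetical aliases for illustration only.
def CountDef : Type := Nat        -- plain definition: not unfolded during instance search
abbrev CountAbbrev : Type := Nat  -- reducible: elaboration sees through it to Nat

-- Works: instance search unfolds `CountAbbrev` to `Nat` and finds `+` on `Nat`.
example (a b : CountAbbrev) : a + b = b + a := Nat.add_comm a b

-- The same statement over `CountDef` is expected to fail to elaborate,
-- because no addition instance is found for the un-unfolded alias:
-- example (a b : CountDef) : a + b = b + a := Nat.add_comm a b
```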

The workflow looks like this:

Write your function in Lean 4 (or have an AI coding assistant generate it)
Describe the formal specification—what inputs produce what outputs, what invariants must hold
Leanstral generates the proof
Lean’s compiler verifies the proof
If it compiles, your code is proven to satisfy the specification
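The steps above can be sketched end to end on a toy function. Everything named here (`myMax`, `myMax_upper_bound`, `myMax_is_input`) is a hypothetical example, and the proofs are hand-written stand-ins for what a proof-generating agent would be asked to supply. They assume a recent Lean 4 toolchain where `split` and `omega` are available without extra imports.

```lean
-- Step 1: the implementation.
def myMax (a b : Nat) : Nat :=
  if a ≤ b then b else a

-- Step 2: the specification, stated as theorems.
-- Invariant A: the result is at least as large as both inputs.
-- Steps 3 and 4: the proof after `by`, checked by Lean when the file compiles.
theorem myMax_upper_bound (a b : Nat) : a ≤ myMax a b ∧ b ≤ myMax a b := by
  unfold myMax
  -- Case-split on the `if`, then finish each case with linear arithmetic.
  split <;> omega

-- Invariant B: the result is always one of the two inputs.
theorem myMax_is_input (a b : Nat) : myMax a b = a ∨ myMax a b = b := by
  unfold myMax
  split <;> omega
```

Note that the two theorems together pin the function down. Either one alone would admit wrong implementations, which is why writing the specification remains the part that needs human judgment.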

Pricing and Access

Three ways to use Leanstral:

Free API endpoint: Mistral provides free access to the Leanstral model via their API. No pricing listed for the API tier—effectively free at launch for evaluation and open-source projects.

Mistral Vibe: Leanstral is embedded in Mistral’s coding environment. Vibe pricing follows Mistral’s standard API pricing (tokens-in/tokens-out), which is competitive with OpenAI and Anthropic at comparable quality tiers.

Self-hosted: Apache 2.0 weights, downloadable from Mistral’s model repository. If you have the inference infrastructure, this is zero per-query cost.

The cost comparison vs alternatives is worth stating clearly: using Claude Sonnet for the same verification tasks Leanstral handles would run approximately $549 per evaluation set where Leanstral costs $36. For teams running continuous formal verification in CI/CD, that cost gap compounds fast.

Pros and Cons

Pros:

Apache 2.0 — commercially usable, self-hostable, no vendor lock-in
Free API endpoint to start immediately
15x cheaper than comparable commercial models on formal verification benchmarks
Trained specifically for Lean 4 (not a general model with Lean syntax knowledge—a specialist)
Integrates with Lean’s standard language server tooling
One of the few open-source models purpose-built for formal code proof generation

Cons:

Lean 4 is a niche skill—if your team does not already work in it, the onboarding cost is real
Formal verification is not for every project—overkill for a landing page, essential for a payment processor
Still a research-adjacent tool at this stage; production integrations require tooling setup
Does not replace standard unit testing—it complements it for correctness guarantees, not behavioral coverage

Who Should Try Leanstral?

Strong fit:

Teams working on formal verification of mathematical algorithms
Security engineers building cryptographic protocols
Developers in regulated industries (finance, medical) who need verifiable correctness
Open-source mathematics projects using Lean (there are now hundreds)
ML researchers using formal methods to verify training algorithms

Probably not yet:

Full-stack web developers building CRUD applications
Teams without existing Lean 4 exposure who are not ready to invest in the learning curve
Startups moving fast where correctness at this level is a future problem, not a today problem

The Bigger Picture

Mistral has been methodically building a case that open-source AI can match closed-source frontier performance at a fraction of the cost. Leanstral is the latest data point in that argument. When an Apache 2.0 model at $36 outperforms a $549 proprietary model on a specialized benchmark, the cost-efficiency case for open models becomes very hard to ignore.

More importantly, Leanstral signals where AI coding tools are heading: from “generate code faster” to “generate code that can be proven correct.” That is not a marginal improvement. That is a different category of value for a different segment of the market—one that will only grow as AI-generated code infiltrates more critical systems.

→ Access Leanstral for free via Mistral API

FAQ

What is Leanstral? Leanstral is Mistral AI’s open-source AI agent for Lean 4, released March 16, 2026. It generates formal mathematical proofs that verify code correctness, using the Lean 4 proof assistant. Available under Apache 2.0 licensing with a free API endpoint.

What is Lean 4 and why does it matter for code verification? Lean 4 is a proof assistant and programming language in which you can state a specification and write a machine-checked proof that your code meets it. If the proof compiles, the code is guaranteed to satisfy that specification. It is used in safety-critical software, cryptography, financial systems, and advanced mathematics research.

Is Leanstral free to use? Yes. Mistral provides a free API endpoint and has released the weights under Apache 2.0. You can also self-host on your own infrastructure at no per-query cost.

How does Leanstral compare to Claude for formal verification? Leanstral achieves 26.3% on MiniF2F at pass@2 for a cost of approximately $36 per evaluation set. On the figures reported above, Claude Sonnet scores lower at a cost of approximately $549, while Claude Opus remains ahead at higher pass counts. Leanstral is purpose-trained for Lean 4 and significantly more cost-efficient for this specific task.

Does Leanstral replace unit testing? No. Formal verification and unit testing are complementary. Unit tests check that your code behaves correctly for specific inputs. Formal verification proves that your code satisfies a specification for all possible inputs. You want both.
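The difference shows up directly in Lean. In this sketch (the function `addTwo` and the theorem name are hypothetical), the first two checks behave like unit tests, each covering one input, while the theorem covers every natural number at once. It assumes a recent Lean 4 toolchain where `omega` is available.

```lean
def addTwo (n : Nat) : Nat := n + 2

-- Unit-test style: each line checks exactly one input.
example : addTwo 0 = 2 := rfl
example : addTwo 40 = 42 := rfl

-- Specification style: a single statement about all inputs.
theorem addTwo_gt (n : Nat) : n < addTwo n := by
  unfold addTwo
  omega
```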

Can I use Leanstral without knowing Lean 4? Leanstral generates the proofs, but you still need to work in the Lean 4 environment and understand how to write formal specifications. There is a learning curve. Mistral Vibe’s integration helps abstract some of this.
