Here is a fact that should give every engineering leader pause: 96% of developers say they do not fully trust AI-generated code.
They use it anyway, because it is fast and deadlines are real. But they are shipping code they cannot verify. At scale, in critical systems, that is a ticking clock.
Mistral shipped Leanstral on March 16, 2026. It is a different kind of answer to the AI code quality problem: not more guardrails, not better test generation, but formal mathematical proof. Leanstral generates machine-checkable proofs that your code satisfies a formal specification. That is a fundamentally different level of verification from “the tests pass.”
What Is Leanstral?
Leanstral is an open-source AI agent built specifically for Lean 4, the proof assistant and functional programming language originally developed at Microsoft Research and used in mathematics and in software verification.
At its core: you write your code and your formal specification, and Leanstral generates the Lean 4 proof that your implementation satisfies the spec. If the proof compiles, the code is mathematically verified against that spec. Not “probably right.” Proven correct, to the extent the specification says what you meant.
Mistral released it under Apache 2.0: open-source and commercially usable, subject only to the license's standard attribution and notice conditions. They also offer a free API endpoint and embed it into Mistral Vibe (their coding environment) for teams that want managed access.
Why Formal Verification Matters Now
Formal verification has existed for decades. The reason it never went mainstream is the same reason most powerful tools do not: the learning curve is brutal, and the tooling friction is high. Writing Lean 4 proofs by hand requires deep expertise in type theory. Most engineers do not have it and do not want to acquire it.
Leanstral changes the equation. It lowers the barrier from “hire a formal methods expert” to “describe what your function should do.” The agent handles the proof generation. Lean’s compiler handles the verification. You get the guarantee without requiring your team to become mathematicians.
This matters specifically for:
- Financial software where a calculation error is a liability event
- Safety-critical systems (medical devices, aerospace, autonomous vehicles) where unverified behavior is a legal and ethical risk
- Cryptography and security protocols where correctness is not optional
- Smart contracts where a logic error is an irreversible financial loss
- Frontier research mathematics where human verification is the bottleneck
Under the Hood
Leanstral is a 120-billion-parameter mixture-of-experts model, but it activates only about 6 billion parameters (5% of the total) per forward pass. This is Mistral’s signature architecture play: enormous model capacity at a fraction of the compute cost.
The results back it up. On the MiniF2F benchmark (the standard evaluation set for formal math proofs):
- Leanstral at pass@2: 26.3%, ahead of Claude Sonnet, at a total cost of $36 vs. $549 for Sonnet
- Leanstral at pass@4: 29.3%, so the score keeps improving with additional attempts
- Claude Opus remains ahead at higher pass counts, but at dramatically higher cost
For most practical use cases, Leanstral at pass@2 or pass@4 delivers verification results that would cost 10–15x more with comparable commercial models. Because the weights are Apache 2.0 licensed, you can also self-host it and pay no per-query fees beyond your own inference hardware.
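The upper end of that range can be checked directly from the two dollar figures quoted above. As a small illustration in Lean 4 itself (a sketch written for this article, not part of Mistral's published material), the statement below says that $549 is at least 15 times, and less than 16 times, $36. The exact ratio is 15.25.

```lean
-- Costs in whole dollars, taken from the benchmark figures quoted above.
-- 15 * 36 = 540 and 16 * 36 = 576, so 549 / 36 lies between 15 and 16.
example : 15 * 36 ≤ 549 ∧ 549 < 16 * 36 := by
  decide
```

If either inequality were false, `decide` would fail and the file would not compile.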
How It Works in Practice
Leanstral integrates with Lean’s language server (which speaks the Language Server Protocol) via MCP (Model Context Protocol), the same protocol that Claude Code and Cursor use for tool integration. If you already work in a Lean 4 development environment, the integration is straightforward.
Mistral demonstrated it handling a real Stack Exchange debugging question: Lean 4.29.0-rc6 had changed how a type alias behaved. Leanstral correctly diagnosed the definitional equality issue and identified that replacing def with abbrev would restore tactic matching. That is the kind of context-aware reasoning that distinguishes a trained specialist from a general-purpose model that happens to know some Lean syntax.
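The def versus abbrev distinction is easy to see in isolation. The sketch below is not the code from the Stack Exchange question; the type names UserIdDef and UserIdAbbrev are hypothetical, and it only illustrates the general behavior. Lean unfolds an abbrev automatically during instance search and in many tactics, while a plain def is not unfolded by them unless you ask.

```lean
-- `def` introduces a new constant that instance search does not unfold on its own.
def UserIdDef := Nat

-- `abbrev` introduces a reducible definition that is unfolded automatically.
abbrev UserIdAbbrev := Nat

-- Works: instance search sees through the abbreviation and finds addition on `Nat`.
example (a b : UserIdAbbrev) : UserIdAbbrev := a + b

-- Fails to elaborate if uncommented: no addition instance is found for `UserIdDef`.
-- example (a b : UserIdDef) : UserIdDef := a + b

-- The two types are still definitionally equal, so this check passes.
example : UserIdDef = Nat := rfl
```

The last line is why bugs of this kind are confusing: the alias and the underlying type really are equal, but automation that only unfolds reducible definitions never gets to see it.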
The workflow looks like this:
- Write your function in Lean 4 (or have an AI coding assistant generate it)
- Describe the formal specification—what inputs produce what outputs, what invariants must hold
- Leanstral generates the proof
- Lean’s compiler verifies the proof
- If it compiles, your code is mathematically proven correct
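As a rough illustration of those steps, here is a minimal hand-written sketch. The function clamp and the theorem clamp_in_range are hypothetical examples, not output from Leanstral; the proof body is the part a proof agent would be asked to produce.

```lean
-- Step 1: the function to verify (hypothetical example).
def clamp (lo hi x : Nat) : Nat :=
  if x < lo then lo else if hi < x then hi else x

-- Step 2: the specification, stated as a theorem.
-- For any bounds with lo ≤ hi, the result always lies inside [lo, hi].
theorem clamp_in_range (lo hi x : Nat) (h : lo ≤ hi) :
    lo ≤ clamp lo hi x ∧ clamp lo hi x ≤ hi := by
  -- Step 3: the proof. Unfold the definition, case-split on each `if`,
  -- and let the linear-arithmetic tactic `omega` close every case.
  unfold clamp
  split <;> (try split) <;> omega

-- Step 4: Lean checks the proof when the file is compiled.
-- Listing the axioms used confirms that no `sorry` placeholder slipped in.
#print axioms clamp_in_range
```

A proof stubbed out with `sorry` still compiles, with only a warning, and shows up in the `#print axioms` output as `sorryAx`. It is worth checking for that before treating a successful build as a finished proof.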
Pricing and Access
Three ways to use Leanstral:
Free API endpoint: Mistral provides free access to the Leanstral model via its API. No pricing is listed for this tier, so it is effectively free at launch for evaluation and open-source projects.
Mistral Vibe: Leanstral is embedded in Mistral’s coding environment. Vibe pricing follows Mistral’s standard API pricing (tokens-in/tokens-out), which is competitive with OpenAI and Anthropic at comparable quality tiers.
Self-hosted: Apache 2.0 weights, downloadable from Mistral’s model repository. If you have the inference infrastructure, this is zero per-query cost.
The cost comparison vs alternatives is worth stating clearly: using Claude Sonnet for the same verification tasks Leanstral handles would run approximately $549 per evaluation set where Leanstral costs $36. For teams running continuous formal verification in CI/CD, that cost gap compounds fast.
Pros and Cons
Pros:
- Apache 2.0 — commercially usable, self-hostable, no vendor lock-in
- Free API endpoint to start immediately
- Roughly 15x cheaper than Claude Sonnet on the benchmark cited above ($36 vs. $549)
- Trained specifically for Lean 4: a specialist, not a general model that happens to know Lean syntax
- Integrates with Lean’s standard language server tooling via MCP
- One of the few open-source models purpose-built for formal code proof generation
Cons:
- Lean 4 is a niche skill—if your team does not already work in it, the onboarding cost is real
- Formal verification is not for every project—overkill for a landing page, essential for a payment processor
- Still a research-adjacent tool at this stage; production integrations require tooling setup
- Does not replace standard unit testing: it adds correctness guarantees against a specification, while tests still provide behavioral coverage
Who Should Try Leanstral?
Strong fit:
- Teams working on formal verification of mathematical algorithms
- Security engineers building cryptographic protocols
- Developers in regulated industries (finance, medical) who need verifiable correctness
- Open-source mathematics projects using Lean (there are now hundreds)
- ML researchers using formal methods to verify training algorithms
Probably not yet:
- Full-stack web developers building CRUD applications
- Teams without existing Lean 4 exposure who are not ready to invest in the learning curve
- Startups moving fast where correctness at this level is a future problem, not a today problem
The Bigger Picture
Mistral has been methodically building a case that open-source AI can match closed-source frontier performance at a fraction of the cost. Leanstral is the latest data point in that argument. When an Apache 2.0 model at $36 outperforms a $549 proprietary model on a specialized benchmark, the cost-efficiency case for open models becomes very hard to ignore.
More importantly, Leanstral signals where AI coding tools are heading: from “generate code faster” to “generate code that can be proven correct.” That is not a marginal improvement. That is a different category of value for a different segment of the market—one that will only grow as AI-generated code infiltrates more critical systems.
FAQ
What is Leanstral? Leanstral is Mistral AI’s open-source AI agent for Lean 4, released March 16, 2026. It generates formal mathematical proofs that verify code correctness, using the Lean 4 proof assistant. Available under Apache 2.0 licensing with a free API endpoint.
What is Lean 4 and why does it matter for code verification? Lean 4 is a proof assistant and programming language in which you can state a specification and write a machine-checked proof that your code meets it. If the proof compiles, the code is guaranteed to satisfy that specification. It is used in safety-critical software, cryptography, financial systems, and advanced mathematics research.
Is Leanstral free to use? Yes. Mistral provides a free API endpoint and has released the weights under Apache 2.0. You can also self-host on your own infrastructure at no per-query cost.
How does Leanstral compare to Claude for formal verification? Leanstral achieves 26.3% on MiniF2F at pass@2 for a cost of approximately $36 per evaluation set. Claude Sonnet trails that score at a cost of approximately $549. Leanstral is purpose-trained for Lean 4 and significantly more cost-efficient for this specific task.
Does Leanstral replace unit testing? No. Formal verification and unit testing are complementary. Unit tests check that your code behaves correctly for specific inputs. Formal verification proves that your code satisfies a specification for all possible inputs. You want both.
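The difference is visible in Lean itself. In this minimal sketch (the function double is a hypothetical example), the first two checks behave like unit tests, each covering a single input, while the theorem covers every natural number at once.

```lean
-- Hypothetical function used only for illustration.
def double (n : Nat) : Nat := n + n

-- Test-style checks: each one covers a single concrete input.
example : double 0 = 0 := rfl
example : double 21 = 42 := rfl

-- Specification-style theorem: holds for every natural number n.
theorem double_eq_two_mul (n : Nat) : double n = 2 * n := by
  unfold double
  omega
```

The test-style checks would still pass for a buggy double that happened to be right on 0 and 21. The theorem would not.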
Can I use Leanstral without knowing Lean 4? Leanstral generates the proofs, but you still need to work in the Lean 4 environment and understand how to write formal specifications. There is a learning curve. Mistral Vibe’s integration helps abstract some of this.