Vibe / Leanstral¶
The Vibe prover is the AgentProver on the
VibeHarness — it drives Mistral
Vibe’s builtin lean agent in a sandbox.
Vibe’s lean agent is Leanstral (vibe -p ... --agent lean pins the model to
leanstral; there is no --model flag). The shared
Verifier does the final compile / sorry / axiom
check. See Provers for the lifecycle every agent harness shares.
The Leanstral stand-in¶
The bare lean profile is Labs-gated: the real model labs-leanstral-2603 returns a
403 until a Mistral org admin enables Labs. Until then, the harness runs the same
Lean scaffold on a non-Labs reasoning model any La Plateforme key can reach
(default Magistral) through a vendored lean-standin agent profile. Since Vibe has no
--model flag, the harness templates the configured model into the stand-in profile
(<<MODEL>>) at stage time — so the model is an ordinary knob. Repoint
agent to lean (and the model to labs-leanstral-2603) once Labs is enabled.
Usage¶
from open_atp.backends.docker import DockerBackend
from open_atp.harness import VibeHarness
from open_atp.images import DEFAULT_IMAGE
from open_atp.provers import AgentProver
backend = DockerBackend(image=DEFAULT_IMAGE)
prover = AgentProver(
harness=VibeHarness(
agent="lean-standin", # "lean" once Labs is enabled
model="magistral-medium-latest",
max_turns=None, # passed to `vibe -p --max-turns`
max_price=None, # passed to `vibe -p --max-price`
),
backend=backend,
)
VibeHarness selects the Vibe CLI and carries the
Vibe-specific agent, max_turns, and max_price knobs (which live only on this
harness, not the shared base).
Or by catalog name through standard_prover() / the CLI:
agent:vibe (defaults: agent="lean-standin",
model="magistral-medium-latest"). standard_prover
returns this default prover only; to swap the model, construct
VibeHarness(model="devstral-medium-latest") and pass it to AgentProver
directly (as above).
Harness details¶
stage pins a workdir-local VIBE_HOME (.vibe/) so Vibe’s config, the
vendored stand-in profile, and the per-session log all live under the workdir and sync
back out with it. The written .vibe/config.toml:
sets
installed_agents = ["lean"]to un-gate the builtinleanagent;sets
bypass_tool_permissions = true— the only way to un-gate mutating tools (edit,write_file) invibe -pprogrammatic mode, which has no approval callback (without it the agent’s edits are silently skipped);enables
[session_logging](where cost/token totals are recorded); andwires the lean-lsp MCP server with
tool_timeout_sec = 180(the seconds-valued mirror of the OpenCode 180 s fix, for the cold firstlean_diagnostic_messagescall that loads the full Mathlib import closure).
The stand-in profile is written to .vibe/agents/<agent>.toml, and the prover stages
the AgentProver’s skills — the host-agnostic
leanprover/skills — under .vibe/skills
(Vibe’s user skills dir, which loads regardless of project-folder trust). The launch
script (assets/scripts/vibe_agent.sh) runs:
export VIBE_HOME="$PWD/.vibe"
vibe -p "$PROMPT" --agent <AGENT> --output streaming --workdir "$PWD" <EXTRA>
<EXTRA> appends --max-turns / --max-price when set. The --output streaming
NDJSON message stream (one message per line) goes to stdout.
$PROMPT is the shared agent prover prompt baked into the
AgentProver, with the task’s optional
user_prompt appended under an Additional instructions heading when set:
Agent prover prompt
The working directory is a complete Lean 4 lake project. One or more `.lean`
files contain `sorry` (or `admit`) placeholders standing in for proofs that have
not been written yet. Replace every such placeholder with a real proof so the
project compiles cleanly and depends on no axioms beyond Lean's standard set.
Hard rules:
- Do not weaken, rename, restate, or delete any theorem, lemma, `def`,
`structure`, or signature. Only fill in proof bodies (the part after `:=` /
`by` that is currently `sorry`). Changing a statement to make it easier to
prove is failure, not success.
- No new axioms and no `sorry`/`admit`/`native_decide`-on-false escapes. The
finished proof must type-check honestly. The only acceptable axioms are Lean's
standard `propext`, `Classical.choice`, and `Quot.sound`.
- Stay inside this working directory; do not read or write files outside it.
- Do not edit `lakefile.toml`/`lakefile.lean`, `lean-toolchain`, or
`lake-manifest.json` — they pin the toolchain and dependencies and must match
the verification environment.
Workflow:
1. Find the work: search for `sorry` across the `.lean` source files (e.g.
`rg -n '\\bsorry\\b'`). Read each file containing one to understand the
statement, the hypotheses, and the relevant imports.
2. Confirm the lean-lsp MCP server is live before relying on it: call
`mcp__lean-lsp__lean_diagnostic_messages` on a file you have not yet edited.
`success:true, items:[]` means it compiles cleanly; real errors come back as
`items`. `success:false, items:[]` usually means imports aren't built yet —
run `lake build` for the relevant modules first.
3. Write a proof for one `sorry` at a time. Mathlib is available; prefer library
lemmas, `simp`, `omega`, `linarith`, `exact?`/`apply?` suggestions, and
`aesop` over long bespoke arguments.
4. After each edit, re-check that file with
`mcp__lean-lsp__lean_diagnostic_messages` and iterate until it is clean.
5. When a file looks done, verify it has no stubbed proofs with
`mcp__lean-lsp__lean_verify` — the reported axioms must NOT contain `sorryAx`.
6. Repeat until no `.lean` file contains a `sorry` and the whole project builds
(`lake build`).
Tips:
- Use the lean-lsp tools (`mcp__lean-lsp__*`) as your primary feedback loop; they
are far faster than a full `lake build` per change. Use `lake build` to
materialize oleans for imports and as the final whole-project check.
- If a goal looks false or unprovable from the given hypotheses, re-read the
statement: you likely misread a binder or a coercion. Do not "fix" it by
changing the statement — finish the proof as stated.
- Non-trivial proofs routinely take many rounds of compile-error fixing. Keep
iterating against the diagnostics rather than guessing."""
Authentication¶
Pass the Mistral La Plateforme key to the harness explicitly:
VibeHarness(mistral_api_key="msk-...")
or leave mistral_api_key unset (the default) to read it from the host environment.
Either way the harness forwards it into the sandbox as MISTRAL_API_KEY, where the
lean agent’s provider reads it from the process env. Resolution fails if neither the
explicit key nor the host env var is set.
Cost tracking¶
The streaming output carries only conversation messages — no token/cost totals. Those
live in Vibe’s per-session meta.json; parse reads session_cost,
session_prompt_tokens, and session_completion_tokens from its stats to populate
cost_usd and the token totals in
HarnessRunResult. collect_logs relocates the
.vibe/logs tree to logs/vibe-session.