OpenCode

The OpenCode prover is the AgentProver on the OpenCodeHarness — the OpenCode CLI driving the sorrys in a sandbox with the lean-lsp-mcp server. Unlike Claude Code and Codex, OpenCode is provider-agnostic: one CLI fronts Anthropic, OpenAI, Google, or DeepSeek, billed directly against that provider’s API. The shared Verifier does the final compile / sorry / axiom check. See Provers for the staging/diff lifecycle every agent harness shares.

Usage

from open_atp.backends.docker import DockerBackend
from open_atp.harness import OpenCodeHarness
from open_atp.images import DEFAULT_IMAGE
from open_atp.provers import AgentProver

backend = DockerBackend(image=DEFAULT_IMAGE)
prover = AgentProver(
    harness=OpenCodeHarness(model="claude-opus-4-8", effort="medium"),
    backend=backend,
)

OpenCodeHarness selects the OpenCode CLI; its provider is inferred from the model prefix unless set explicitly.

Or by catalog name through standard_prover() / the CLI: agent:opencode. The provider is inferred from the model prefix (claude-*anthropic, gpt-*openai, and so on), so any provider’s model is selected by name through the same model knob.

Harness details

stage writes an opencode.json carrying the inferred provider, the model and its reasoning-effort config, and the lean-lsp MCP server; the prover then stages the AgentProver’s skills — the host-agnostic leanprover/skills — under .agents/skills/. The MCP timeout is raised to 180 000 ms (180 s) — the first lean_diagnostic_messages call starts lake serve and loads the file’s full Mathlib import closure, which blows past the 60 s default on a cold, few-CPU sandbox. Reasoning effort maps per provider: Anthropic gets thinking: {type: "adaptive"} plus an effort output_config, while OpenAI / Google / DeepSeek get reasoningEffort. The launch script (assets/scripts/opencode_agent.sh) runs:

opencode run --dir /workspace/wd --format json \
    --model '<PROVIDER>/<MODEL>' \
    "$PROMPT"

The --format json event stream goes to stdout.

$PROMPT is the shared agent prover prompt baked into the AgentProver, with the task’s optional user_prompt appended under an Additional instructions heading when set:

Agent prover prompt
The working directory is a complete Lean 4 lake project. One or more `.lean`
files contain `sorry` (or `admit`) placeholders standing in for proofs that have
not been written yet. Replace every such placeholder with a real proof so the
project compiles cleanly and depends on no axioms beyond Lean's standard set.

Hard rules:
- Do not weaken, rename, restate, or delete any theorem, lemma, `def`,
  `structure`, or signature. Only fill in proof bodies (the part after `:=` /
  `by` that is currently `sorry`). Changing a statement to make it easier to
  prove is failure, not success.
- No new axioms and no `sorry`/`admit`/`native_decide`-on-false escapes. The
  finished proof must type-check honestly. The only acceptable axioms are Lean's
  standard `propext`, `Classical.choice`, and `Quot.sound`.
- Stay inside this working directory; do not read or write files outside it.
- Do not edit `lakefile.toml`/`lakefile.lean`, `lean-toolchain`, or
  `lake-manifest.json` — they pin the toolchain and dependencies and must match
  the verification environment.

Workflow:
1. Find the work: search for `sorry` across the `.lean` source files (e.g.
   `rg -n '\\bsorry\\b'`). Read each file containing one to understand the
   statement, the hypotheses, and the relevant imports.
2. Confirm the lean-lsp MCP server is live before relying on it: call
   `mcp__lean-lsp__lean_diagnostic_messages` on a file you have not yet edited.
   `success:true, items:[]` means it compiles cleanly; real errors come back as
   `items`. `success:false, items:[]` usually means imports aren't built yet —
   run `lake build` for the relevant modules first.
3. Write a proof for one `sorry` at a time. Mathlib is available; prefer library
   lemmas, `simp`, `omega`, `linarith`, `exact?`/`apply?` suggestions, and
   `aesop` over long bespoke arguments.
4. After each edit, re-check that file with
   `mcp__lean-lsp__lean_diagnostic_messages` and iterate until it is clean.
5. When a file looks done, verify it has no stubbed proofs with
   `mcp__lean-lsp__lean_verify` — the reported axioms must NOT contain `sorryAx`.
6. Repeat until no `.lean` file contains a `sorry` and the whole project builds
   (`lake build`).

Tips:
- Use the lean-lsp tools (`mcp__lean-lsp__*`) as your primary feedback loop; they
  are far faster than a full `lake build` per change. Use `lake build` to
  materialize oleans for imports and as the final whole-project check.
- If a goal looks false or unprovable from the given hypotheses, re-read the
  statement: you likely misread a binder or a coercion. Do not "fix" it by
  changing the statement — finish the proof as stated.
- Non-trivial proofs routinely take many rounds of compile-error fixing. Keep
  iterating against the diagnostics rather than guessing."""

Authentication

OpenCode bills directly against an API provider rather than a flat-rate subscription. Sign up for an API account with your chosen provider, fund it, and monitor consumption from that provider’s usage dashboard — see OpenCode providers for the full list. Pass the key matching your chosen provider to the harness explicitly:

OpenCodeHarness(model="claude-opus-4-8", provider_api_key="sk-...")

or leave provider_api_key unset (the default) to read it from the host environment, for example:

export DEEPSEEK_API_KEY=...

The provider is inferred from the model prefix unless you pass provider explicitly. Either way the harness forwards the selected provider’s key into the sandbox under its canonical env var (ANTHROPIC_API_KEY, OPENAI_API_KEY, GOOGLE_API_KEY, or DEEPSEEK_API_KEY); resolution fails if neither the explicit key nor the host env var is set.

Cost tracking

The OpenCode CLI reports a per-step cost and token breakdown for each provider call. parse sums step_finish events — input (tokens.input plus cache write/read), output (tokens.output), and cost — into cost_usd in HarnessRunResult, so cost comes straight from the provider via OpenCode.