provers¶
The concrete provers and the registry/factory over them. Each prover subclasses
AutomatedProver and funnels its output through the
shared Verifier. The agentic provers compose an
agent harness (the agent concern) with a
ComputeBackend (the compute concern).
Registry¶
The open_atp.provers registry maps a prover name to a constructed
AutomatedProver, wired to a shared sandbox
image/toolchain and compute backend. A caller then drives the prover directly
via prove(), which returns a
ProofResult with verification and cost. This is the
top-level surface re-exported from open_atp itself.
PROVERS enumerates the available provers (e.g.
PROVERS.CLAUDE, PROVERS.CODEX, PROVERS.ARISTOTLE).
get_prover() maps a member (or its string value) to a constructed
AutomatedProver, wiring in the shared image/toolchain,
the verify backend, and (for agentic provers) the agent backend.
- class open_atp.provers.PROVERS(*values)[source]¶
The provers
get_prover()accepts.Each member’s value is the registry key.
agent:<harness>members select anAgentProverharness;CLAUDEis the bareagent(config default,claude_code).- ARISTOTLE = 'aristotle'¶
- AXPROVER = 'agent:axprover'¶
- CLAUDE = 'agent'¶
- CODEX = 'agent:codex'¶
- NUMINA = 'numina'¶
- OPENCODE = 'agent:opencode'¶
- VIBE = 'vibe'¶
- open_atp.provers.get_prover(name, *, verification_backend, agent_backend=None, overrides=None)[source]¶
Construct the prover
nameagainst the shared verify backend.nameis aPROVERSmember (or its string value). The config is built from the name’s baked-in defaults + calleroverrides(per-prover knobs: model, effort, max_rounds, …). The sandbox image (and the toolchain + Mathlib pins projects are checked against) comes fromverification_backend’s config, not a parameter here. Agentic provers also receiveagent_backendfor generation (defaults to the verify backend), keeping the agent-vs-verify backend split available.- Parameters:
- name
PROVERSorstr The prover to build – a
PROVERSmember or its registry-key string (e.g.PROVERS.CLAUDEor"agent"). RaisesValueErrorfor an unknown name.- verification_backend
ComputeBackend The backend that runs the shared
Verifier(the cheap final compile/sorry/axiom check).- agent_backend
ComputeBackend, optional The backend agentic provers run generation on. Defaults to
verification_backend, so generation and verification share a backend unless you split them (e.g. generate on Modal, verify on local Docker). Ignored by non-agentic provers (e.g.AristotleProver).- overrides
Mapping[str,object], optional Per-prover config knobs layered over the name’s baked-in defaults (model, effort, max_rounds, …). Keys must be fields of the prover’s config class.
- name
- Returns:
- prover
AutomatedProver The constructed prover, ready to drive via
prove().
- prover
Examples
Construction is cheap and offline (the backend is wired in, not called), so the factory builds a ready-to-drive prover directly.
overrideslayer per-prover knobs over the name’s baked-in defaults:>>> from open_atp import PROVERS, get_prover >>> from open_atp.backends.docker import DockerBackend, DockerConfig >>> from open_atp.images import DEFAULT_IMAGE >>> backend = DockerBackend(DockerConfig(image=DEFAULT_IMAGE)) >>> prover = get_prover( ... PROVERS.CLAUDE, ... verification_backend=backend, ... overrides={"model": "claude-sonnet-4-6", "effort": "low"}, ... ) >>> prover.config.model 'claude-sonnet-4-6'
See each prover class for a family-specific construction example (
AgentProver,NuminaProver,AristotleProver).Drive a constructed prover with
prove()(this step runs the sandbox, so it needs a working Docker backend):from pathlib import Path from open_atp.lean import LeanProject, ProofTask task = ProofTask(project=LeanProject("path/to/lake/project")) result = prover.prove(task, output_dir=Path("runs/demo")) result.success # compiles, sorry-free, no foreign axioms result.cost_usd # estimated USD, when the prover reports it result.duration_s # wall-clock seconds
See Running a prover for more example usage.
- open_atp.provers.available_provers()[source]¶
The provers
get_prover()accepts.
Base¶
The base prover abstraction. An AutomatedProver is a
candidate generator; the base class owns the shared lifecycle (the public prove:
generate, then verify in the sandbox) so subclasses only implement _generate.
- class open_atp.provers.base.AutomatedProver(config, verification_backend)[source]¶
Generate candidate proofs, then verify them in a shared sandbox.
- prove(task, output_dir)[source]¶
Full lifecycle: reject-on-mismatch, generate, verify, write the result.
output_diris caller-chosen and is populated asoutput_dir/{wd,logs}/:wdis the completed lake project (the proof output) andlogsis the run record (the agentstdout.txt/stderr.txt,result.json, and any harness-specific rich logs).
- class open_atp.provers.base.AutomatedProverConfig(timeout_s=1800, env=<factory>)[source]¶
Base configuration shared by all provers.
Subclasses extend with their own knobs:
AgentProverConfig: harness (claude/opencode/codex), effort, skills, MCP.NuminaProverConfig: extends the agent config + max_rounds, helper API keys.AristotleProverConfig: api key, mode, poll interval.
The sandbox image (its tag plus the Lean toolchain + Mathlib pins the shared verifier checks every project against) lives on the backend’s
BackendConfig, not here – a prover inherits whatever image its verification backend runs.- Attributes:
- classmethod from_dict(data)[source]¶
Build a config from a mapping (e.g. parsed JSON), ignoring unknown keys.
The inverse of
dataclasses.asdict(): any tuple-typed field is restored from the list JSON round-trips it to. Unknown keys are dropped so a serialized superset (or a sibling subclass’s extra knobs) loads cleanly. Subclasses inherit this unchanged –clsresolves to the concrete config, so its own fields are honoured.
AgentProver¶
- class open_atp.provers.agent_prover.AgentProver(config, verification_backend, agent_backend=None)[source]¶
Bases:
AutomatedProverGenerate proofs by driving an agent CLI harness in a compute backend.
Composes an agent harness (the agent concern) with a
ComputeBackend(the compute concern): the harness edits the staged.leanfiles in place, then the sharedVerifierdoes the final compile/sorry/axiom check.codex,opencode,axprover, andvibeare this prover on a different harness.Examples
Build the config and construct the prover directly (the agent backend defaults to the verify backend):
>>> from open_atp.backends.docker import DockerBackend, DockerConfig >>> from open_atp.provers.agent_prover import AgentProver, AgentProverConfig >>> backend = DockerBackend(DockerConfig()) >>> config = AgentProverConfig( ... harness="claude_code", ... model="claude-opus-4-8", ... effort="high", ... ) >>> prover = AgentProver(config, verification_backend=backend) >>> prover.config.harness 'claude_code'
- config¶
- class open_atp.provers.agent_prover.AgentProverConfig(timeout_s=1800, env=<factory>, harness='claude_code', model='claude-opus-4-8', effort='high', assets='default', skills=None, plugins=None, extra_env=<factory>, agent='lean', max_turns=None, max_price=None, max_iterations=None)[source]¶
Bases:
AutomatedProverConfigConfiguration for
AgentProver.Extends
AutomatedProverConfig(timeout_s,env) with the agent-harness knobs.- Attributes:
- harness
str Which agent CLI to drive: one of
claude_code|opencode|codex|vibe. Defaultclaude_code.- model
str Model id passed to the harness. Default
claude-opus-4-8.- effort
str Reasoning-effort level passed to harnesses that support it. Default
high.- assets
str Vendored skill/prompt/MCP asset bundle to mount into the workdir. Default
default.- skills
list[str], optional Per-run override of the bundle’s skills, each a name (resolved from a vendored catalog) or a full path.
Nonekeeps the bundle’s own skills; an empty list mounts none. Applies to every harness.- plugins
list[str], optional Per-run override of the bundle’s plugins, same resolution as
skills. Claude-only; ignored by the other harnesses.- extra_env
dict[str,str] Additional environment variables forwarded into the agent sandbox. Default empty.
- agent
str Vibe-only: which vibe agent profile to drive (
leanis Leanstral;lean-standinthe non-Labs, model-templated stand-in). Ignored by the other harnesses. Defaultlean.- max_turns
int, optional Vibe-only programmatic-run guard passed straight to
vibe -p.Noneleaves it unset. Ignored by the other harnesses.- max_price
float, optional Vibe-only programmatic-run guard passed straight to
vibe -p.Noneleaves it unset. Ignored by the other harnesses.- max_iterations
int, optional AxProver-only cap on ax-prover’s proposer->builder->reviewer loop.
Nonekeeps ax-prover’s default (50). Ignored by the other harnesses.
- harness
NuminaProver¶
- class open_atp.provers.numina.NuminaProver(config, verification_backend, agent_backend=None)[source]¶
Bases:
AgentProverRun the Numina coordinator/subagent scaffold as an
AgentProver.A specialization of
AgentProverwired to the Numina asset bundle (coordinator prompt + vendored skills + subagent prompts); generation and the sharedVerifierwork exactly as in the base agent prover.Examples
Build the config and construct the prover directly:
>>> from open_atp.backends.docker import DockerBackend, DockerConfig >>> from open_atp.provers.numina import NuminaProver, NuminaProverConfig >>> backend = DockerBackend(DockerConfig()) >>> config = NuminaProverConfig() >>> prover = NuminaProver(config, verification_backend=backend) >>> prover.config.max_rounds 20
- config¶
- class open_atp.provers.numina.NuminaProverConfig(timeout_s=1800, env=<factory>, harness='claude_code', model='claude-opus-4-8', effort='high', assets='numina', skills=None, plugins=None, extra_env=<factory>, agent='lean', max_turns=None, max_price=None, max_iterations=None, max_rounds=20, max_consecutive_limits=2, helper_env_keys=('GEMINI_API_KEY', 'OPENAI_API_KEY', 'LEAN_LEANDEX_API_KEY', 'ANTHROPIC_API_KEY'), guard_statements=True, on_statement_change='error')[source]¶
Bases:
AgentProverConfigConfiguration for
NuminaProver.Extends
AgentProverConfigwith the Numina coordinator’s round-loop and helper-skill knobs.- Attributes:
- harness
str Fixed to
claude_code: Numina is claude-CLI driven and not configurable.- assets
str Asset bundle to mount. Default
numina(coordinator prompt + vendored skills + subagent prompts).- max_rounds
int Maximum number of coordinator rounds before the run stops. Default
20.- max_consecutive_limits
int Reset (start a fresh session) after this many consecutive LIMIT rounds. Default
2.- helper_env_keys
tuple[str, …] Helper-skill credentials forwarded into the sandbox when present in the host env; skills degrade/skip when their key is absent.
ANTHROPIC_API_KEYbacks the informal-prover skill’s Claude calls.- guard_statementsbool
Whether to snapshot the target theorems and reject runs that weaken or delete them. Default
True.- on_statement_change{“error”, “warn”}
Behavior on a weakened/deleted target theorem:
errorstops the run and restores the originals;warnrestores and continues. Defaulterror(rejects; safe).- extra_env
dict[str,str] Additional environment variables forwarded into the agent sandbox. Default empty.
- harness
AristotleProver¶
- class open_atp.provers.aristotle.AristotleProver(config, verification_backend)[source]¶
Bases:
AutomatedProverProve by handing the whole project to Harmonic’s hosted Aristotle agent.
Generation happens over the network (submit the lake project, wait, download the result archive, unpack it over the workdir); the shared
Verifierthen runs the same local compile/sorry/axiom check. Network-only, so it takes just the verify backend – noagent_backend.Examples
Build the config and construct the prover directly (network-only, so it takes just the verify backend):
>>> from open_atp.backends.docker import DockerBackend, DockerConfig >>> from open_atp.provers.aristotle import AristotleProver, AristotleProverConfig >>> backend = DockerBackend(DockerConfig()) >>> config = AristotleProverConfig() >>> prover = AristotleProver(config, verification_backend=backend) >>> prover.config.api_key_env 'ARISTOTLE_API_KEY'
- config¶
- class open_atp.provers.aristotle.AristotleProverConfig(timeout_s=1800, env=<factory>, api_key_env='ARISTOTLE_API_KEY', allow_agent_questions=False, max_connection_retries=5, max_resume_attempts=20, resume_backoff_seconds=5.0)[source]¶
Bases:
AutomatedProverConfigConfiguration for
AristotleProver.Extends
AutomatedProverConfig(timeout_s,env) with the hosted-API knobs.- Attributes:
- api_key_env
str Name of the environment variable holding the Harmonic API key. Default
ARISTOTLE_API_KEY.- allow_agent_questionsbool
Whether to let the hosted agent ask clarifying questions. Off by default: this is a headless API path and a prompt for stdin would hang the run.
- max_connection_retries
int Bounds per-call retries (list/refresh/download) when a connection drops. The hosted run lives server-side, so a dropped connection is recoverable: re-fetch rather than reporting the run failed. Default
5.- max_resume_attempts
int Bounds how many times we re-attach to the event stream when it drops mid-run. Default
20.- resume_backoff_seconds
float Initial sleep between retries/resumes, doubling (capped) between tries. Default
5.0.
- api_key_env