backends

A ComputeBackend runs a command over a working directory inside a Lean+Mathlib sandbox. It is the single load-bearing primitive of the platform, used both to run a coding agent and to run lake env lean ... for verification. A ComputeSession keeps that sandbox alive across several commands – generation then verification against the same hot filesystem – without paying a second spin-up.

Base

class open_atp.backends.base.ComputeBackend(config)[source]

Runs commands over a workdir in a sandbox carrying Lean + Mathlib.

container_home = '/root'

Absolute $HOME inside the sandbox; per-run credential dirs (an agent’s ~/.codex, say) are mounted under it. Overridden per backend.

run(workdir, command, *, env=None, mounts=None, timeout_s=None)[source]

Convenience: start() then block via CommandHandle.wait().

abstractmethod session(workdir, *, env=None, mounts=None, timeout_s=None)[source]

Open a persistent sandbox over workdir for multiple exec() calls.

Unlike start() (one command, then teardown), the returned ComputeSession stays alive until ComputeSession.close(), so the same hot sandbox can run generation and verification back to back.

env/mounts here pin the long-lived sandbox at creation (Docker bind mounts can only be set at docker run); per-command credentials go to ComputeSession.exec()’s own env.

abstractmethod start(workdir, command, *, env=None, mounts=None, timeout_s=None)[source]

Launch command with workdir mounted/synced into the sandbox.

The workdir is synced in before launch and synced back out on completion so that file mutations (completed proofs) are visible to the caller.

mounts are extra (host_path, container_path) bind mounts beyond the workdir – used to forward agent credential dirs (e.g. Codex’s ~/.codex) per run, without baking them into the backend config.

class open_atp.backends.base.BackendConfig(image=Image(name='open-atp:latest', lean_toolchain='leanprover/lean4:v4.28.0', mathlib_rev='v4.28.0'), timeout_s=1800, env=<factory>)[source]

Shared backend knobs. Subclasses add their own (Docker mounts, Modal CPU…).

Attributes:
imageImage

The sandbox image carrying Lean + Mathlib – its tag plus the toolchain and Mathlib revision the verifier checks projects against. Default DEFAULT_IMAGE.

timeout_sint

Wall-clock cap applied to a command when its call site does not pass an explicit timeout_s. Default 1800.

envMapping[str, str]

Environment variables baked into every command run in the sandbox. Default empty.

classmethod from_dict(data)[source]

Build a config from a mapping (e.g. parsed JSON), ignoring unknown keys.

The inverse of dataclasses.asdict(): any tuple-typed field (e.g. volumes) is restored from the list JSON round-trips it to. Unknown keys are dropped so a serialized superset (or a sibling backend’s extra knobs) loads cleanly. Subclasses inherit this unchanged – cls resolves to the concrete config, so its own fields are honoured.

class open_atp.backends.base.CommandHandle[source]

A live, streaming command. Concrete backends subclass and fill the hooks.

Streaming matters for agents (we want incremental stdout for cost/progress parsing) but is equally usable for a blocking lake invocation via wait().

cancel()[source]

Best-effort termination (e.g. docker kill / sandbox terminate).

stream()[source]

Yield stdout lines as they arrive.

wait()[source]

Drain to completion and return the final result.

class open_atp.backends.base.CommandResult(exit_code, stdout, stderr, duration_s)[source]

Outcome of a finished command run inside a backend.

class open_atp.backends.base.ComputeSession[source]

A persistent sandbox over a workdir, exec’d many times, torn down once.

The one-shot ComputeBackend.start()/ComputeBackend.run() pair creates a sandbox, runs a single command, and tears it down. A session keeps the sandbox alive so several commands can run against the same hot filesystem – the agent’s generation then the verifier’s compile – without paying a second spin-up.

Lifecycle invariant: teardown lives in close() (not in the handle’s wait), so a session MUST be used as a context manager and close() MUST be idempotent – otherwise an error between exec and close leaks the sandbox.

Examples

Open a session over a workdir and run several commands against the same hot sandbox – here generation followed by the compile, with one spin-up (needs a live backend, so this is illustrative rather than a doctest):

from open_atp.backends.docker import DockerBackend, DockerConfig
from open_atp.images import DEFAULT_IMAGE

backend = DockerBackend(DockerConfig(image=DEFAULT_IMAGE))
with backend.session(workdir) as session:
    with session.exec("lake env lean Demo.lean") as handle:
        result = handle.wait()
    session.sync_out()   # pull the agent's edits back to the host
# the sandbox is torn down on exit, even if exec raised
close()[source]

Tear the sandbox down (pull artifacts first where needed). Idempotent.

exec(command, *, env=None, timeout_s=None)[source]

Run command in the live sandbox; the handle does NOT tear it down.

sync_in()[source]

Push the host workdir into the sandbox (no-op when bind-mounted).

sync_out()[source]

Pull the sandbox workdir back to the host (no-op when bind-mounted).

Docker

class open_atp.backends.docker.DockerBackend(config)[source]

Bases: ComputeBackend

Run sandboxes as local docker containers over a bind-mounted workdir.

The workdir is bind-mounted (not copied), so a command’s file edits land directly on the host. Construction is offline – it just holds the DockerConfig; the daemon is only contacted when a command runs.

Examples

>>> from open_atp.backends.docker import DockerBackend, DockerConfig
>>> from open_atp.images import DEFAULT_IMAGE
>>> backend = DockerBackend(DockerConfig(image=DEFAULT_IMAGE))
>>> backend.name
'docker'
>>> backend.config.workdir_mount
'/workspace/wd'
config
container_home = '/home/agent'

Absolute $HOME inside the sandbox; per-run credential dirs (an agent’s ~/.codex, say) are mounted under it. Overridden per backend.

session(workdir, *, env=None, mounts=None, timeout_s=None)[source]

Open a persistent sandbox over workdir for multiple exec() calls.

Unlike start() (one command, then teardown), the returned ComputeSession stays alive until ComputeSession.close(), so the same hot sandbox can run generation and verification back to back.

env/mounts here pin the long-lived sandbox at creation (Docker bind mounts can only be set at docker run); per-command credentials go to ComputeSession.exec()’s own env.

class open_atp.backends.docker.DockerConfig(image=Image(name='open-atp:latest', lean_toolchain='leanprover/lean4:v4.28.0', mathlib_rev='v4.28.0'), timeout_s=1800, env=<factory>, workdir_mount='/workspace/wd', baked_lake='/workspace/.lake', volumes=())[source]

Bases: BackendConfig

Configuration for DockerBackend.

Extends BackendConfig (image, timeout_s, env) with Docker-specific knobs.

Attributes:
workdir_mountstr

Path inside the container where the workdir is bind-mounted. Default /workspace/wd.

baked_lakestr

Image-baked warm cache to symlink the workdir’s .lake to. Empty skips the symlink. Default /workspace/.lake.

volumestuple[tuple[str, str], …]

Extra -v host:container mounts (e.g. agent credential dirs). Default empty.

class open_atp.backends.docker.DockerSession(backend, container)[source]

Bases: ComputeSession

A live docker run -d container; exec many commands, docker kill once.

The workdir is bind-mounted, so sync_out()/sync_in() are no-ops – edits already live on the host.

backend
close()[source]

Tear the sandbox down (pull artifacts first where needed). Idempotent.

container
exec(command, *, env=None, timeout_s=None)[source]

Run command in the live sandbox; the handle does NOT tear it down.

sync_in()[source]

Push the host workdir into the sandbox (no-op when bind-mounted).

sync_out()[source]

Pull the sandbox workdir back to the host (no-op when bind-mounted).