backends¶
A ComputeBackend runs a command over a working
directory inside a Lean+Mathlib sandbox. It is the single load-bearing primitive of
the platform, used both to run a coding agent and to run lake env lean ... for
verification. A ComputeSession keeps that sandbox
alive across several commands – generation then verification against the same hot
filesystem – without paying a second spin-up.
Base¶
- class open_atp.backends.base.ComputeBackend(config)[source]¶
Runs commands over a workdir in a sandbox carrying Lean + Mathlib.
- container_home = '/root'¶
Absolute
$HOMEinside the sandbox; per-run credential dirs (an agent’s~/.codex, say) are mounted under it. Overridden per backend.
- run(workdir, command, *, env=None, mounts=None, timeout_s=None)[source]¶
Convenience:
start()then block viaCommandHandle.wait().
- abstractmethod session(workdir, *, env=None, mounts=None, timeout_s=None)[source]¶
Open a persistent sandbox over
workdirfor multipleexec()calls.Unlike
start()(one command, then teardown), the returnedComputeSessionstays alive untilComputeSession.close(), so the same hot sandbox can run generation and verification back to back.env/mountshere pin the long-lived sandbox at creation (Docker bind mounts can only be set atdocker run); per-command credentials go toComputeSession.exec()’s ownenv.
- abstractmethod start(workdir, command, *, env=None, mounts=None, timeout_s=None)[source]¶
Launch
commandwithworkdirmounted/synced into the sandbox.The workdir is synced in before launch and synced back out on completion so that file mutations (completed proofs) are visible to the caller.
mountsare extra(host_path, container_path)bind mounts beyond the workdir – used to forward agent credential dirs (e.g. Codex’s~/.codex) per run, without baking them into the backend config.
- class open_atp.backends.base.BackendConfig(image=Image(name='open-atp:latest', lean_toolchain='leanprover/lean4:v4.28.0', mathlib_rev='v4.28.0'), timeout_s=1800, env=<factory>)[source]¶
Shared backend knobs. Subclasses add their own (Docker mounts, Modal CPU…).
- Attributes:
- image
Image The sandbox image carrying Lean + Mathlib – its tag plus the toolchain and Mathlib revision the verifier checks projects against. Default
DEFAULT_IMAGE.- timeout_s
int Wall-clock cap applied to a command when its call site does not pass an explicit
timeout_s. Default1800.- env
Mapping[str,str] Environment variables baked into every command run in the sandbox. Default empty.
- image
- classmethod from_dict(data)[source]¶
Build a config from a mapping (e.g. parsed JSON), ignoring unknown keys.
The inverse of
dataclasses.asdict(): any tuple-typed field (e.g.volumes) is restored from the list JSON round-trips it to. Unknown keys are dropped so a serialized superset (or a sibling backend’s extra knobs) loads cleanly. Subclasses inherit this unchanged –clsresolves to the concrete config, so its own fields are honoured.
- class open_atp.backends.base.CommandHandle[source]¶
A live, streaming command. Concrete backends subclass and fill the hooks.
Streaming matters for agents (we want incremental stdout for cost/progress parsing) but is equally usable for a blocking
lakeinvocation viawait().
- class open_atp.backends.base.CommandResult(exit_code, stdout, stderr, duration_s)[source]¶
Outcome of a finished command run inside a backend.
- class open_atp.backends.base.ComputeSession[source]¶
A persistent sandbox over a workdir, exec’d many times, torn down once.
The one-shot
ComputeBackend.start()/ComputeBackend.run()pair creates a sandbox, runs a single command, and tears it down. A session keeps the sandbox alive so several commands can run against the same hot filesystem – the agent’s generation then the verifier’s compile – without paying a second spin-up.Lifecycle invariant: teardown lives in
close()(not in the handle’swait), so a session MUST be used as a context manager andclose()MUST be idempotent – otherwise an error between exec and close leaks the sandbox.Examples
Open a session over a workdir and run several commands against the same hot sandbox – here generation followed by the compile, with one spin-up (needs a live backend, so this is illustrative rather than a doctest):
from open_atp.backends.docker import DockerBackend, DockerConfig from open_atp.images import DEFAULT_IMAGE backend = DockerBackend(DockerConfig(image=DEFAULT_IMAGE)) with backend.session(workdir) as session: with session.exec("lake env lean Demo.lean") as handle: result = handle.wait() session.sync_out() # pull the agent's edits back to the host # the sandbox is torn down on exit, even if exec raised
Docker¶
- class open_atp.backends.docker.DockerBackend(config)[source]¶
Bases:
ComputeBackendRun sandboxes as local
dockercontainers over a bind-mounted workdir.The workdir is bind-mounted (not copied), so a command’s file edits land directly on the host. Construction is offline – it just holds the
DockerConfig; the daemon is only contacted when a command runs.Examples
>>> from open_atp.backends.docker import DockerBackend, DockerConfig >>> from open_atp.images import DEFAULT_IMAGE >>> backend = DockerBackend(DockerConfig(image=DEFAULT_IMAGE)) >>> backend.name 'docker' >>> backend.config.workdir_mount '/workspace/wd'
- config¶
- container_home = '/home/agent'¶
Absolute
$HOMEinside the sandbox; per-run credential dirs (an agent’s~/.codex, say) are mounted under it. Overridden per backend.
- session(workdir, *, env=None, mounts=None, timeout_s=None)[source]¶
Open a persistent sandbox over
workdirfor multipleexec()calls.Unlike
start()(one command, then teardown), the returnedComputeSessionstays alive untilComputeSession.close(), so the same hot sandbox can run generation and verification back to back.env/mountshere pin the long-lived sandbox at creation (Docker bind mounts can only be set atdocker run); per-command credentials go toComputeSession.exec()’s ownenv.
- class open_atp.backends.docker.DockerConfig(image=Image(name='open-atp:latest', lean_toolchain='leanprover/lean4:v4.28.0', mathlib_rev='v4.28.0'), timeout_s=1800, env=<factory>, workdir_mount='/workspace/wd', baked_lake='/workspace/.lake', volumes=())[source]¶
Bases:
BackendConfigConfiguration for
DockerBackend.Extends
BackendConfig(image,timeout_s,env) with Docker-specific knobs.- Attributes:
- workdir_mount
str Path inside the container where the workdir is bind-mounted. Default
/workspace/wd.- baked_lake
str Image-baked warm cache to symlink the workdir’s
.laketo. Empty skips the symlink. Default/workspace/.lake.- volumes
tuple[tuple[str,str], …] Extra
-v host:containermounts (e.g. agent credential dirs). Default empty.
- workdir_mount
- class open_atp.backends.docker.DockerSession(backend, container)[source]¶
Bases:
ComputeSessionA live
docker run -dcontainer; exec many commands,docker killonce.The workdir is bind-mounted, so
sync_out()/sync_in()are no-ops – edits already live on the host.- backend¶
- container¶
Modal¶
- class open_atp.backends.modal.ModalBackend(config)[source]¶
Bases:
ComputeBackendRun sandboxes as Modal Sandboxes, pushing the workdir up and pulling it back.
Unlike the bind-mounted Docker backend, the workdir is synced into the remote Sandbox and synced back out on completion. Construction is offline – it just holds the
ModalConfig; Modal is only contacted when a command runs.Examples
>>> from open_atp.backends.modal import ModalBackend, ModalConfig >>> from open_atp.images import DEFAULT_IMAGE >>> backend = ModalBackend(ModalConfig(image=DEFAULT_IMAGE, cpu=4.0)) >>> backend.name 'modal' >>> backend.config.cpu 4.0 >>> backend.config.app 'open-atp'
- config¶
- container_home = '/root'¶
Absolute
$HOMEinside the sandbox; per-run credential dirs (an agent’s~/.codex, say) are mounted under it. Overridden per backend.
- session(workdir, *, env=None, mounts=None, timeout_s=None)[source]¶
Open a persistent sandbox over
workdirfor multipleexec()calls.Unlike
start()(one command, then teardown), the returnedComputeSessionstays alive untilComputeSession.close(), so the same hot sandbox can run generation and verification back to back.env/mountshere pin the long-lived sandbox at creation (Docker bind mounts can only be set atdocker run); per-command credentials go toComputeSession.exec()’s ownenv.
- class open_atp.backends.modal.ModalConfig(image=Image(name='open-atp:latest', lean_toolchain='leanprover/lean4:v4.28.0', mathlib_rev='v4.28.0'), timeout_s=1800, env=<factory>, cpu=2.0, memory_mib=4096, app='open-atp')[source]¶
Bases:
BackendConfigConfiguration for
ModalBackend.Extends
BackendConfig(image,timeout_s,env) with Modal-specific knobs.
- class open_atp.backends.modal.ModalSession(backend, sb, workdir)[source]¶
Bases:
ComputeSessionA live Modal Sandbox: exec many commands over the pushed workdir, terminate once.
The filesystem is isolated, so
sync_out()(tar pull) andsync_in()(tar push) bridge host<->Sandbox when a caller needs the host workdir current between commands (e.g. a host-side diff, or Numina’s statement tracker).- backend¶
- exec(command, *, env=None, timeout_s=None)[source]¶
Run
commandin the live sandbox; the handle does NOT tear it down.
- sb¶
- workdir¶