backends¶
A ComputeBackend runs a command over a working
directory inside a Lean+Mathlib sandbox. It is the single load-bearing primitive of
the platform, used both to run a coding agent and to run lake env lean ... for
verification. A ComputeSession keeps that sandbox
alive across several commands – generation then verification against the same hot
filesystem – without paying a second spin-up.
Base¶
- class open_atp.backends.base.ComputeBackend(*, image: Image | Mapping[str, object] = Image(name='open-atp:latest', lean_toolchain='leanprover/lean4:v4.28.0', mathlib_rev='v4.28.0'), timeout_s: int = 1800, env: Mapping[str, str] | None = None)[source]¶
Runs commands over a workdir in a sandbox carrying Lean + Mathlib.
- Parameters:
- image
Image The sandbox image carrying Lean + Mathlib – its tag plus the toolchain and Mathlib revision the verifier checks projects against. Default
DEFAULT_IMAGE. A mapping is coerced to anImage(so a parsed config’s nestedimage:block works).- timeout_s
int Wall-clock cap applied to a command when its call site does not pass an explicit
timeout_s. Default1800.- env
Mapping[str,str], optional Environment variables baked into every command run in the sandbox. Default empty.
- image
- run(workdir: Path, command: str, *, env: Mapping[str, str] | None = None, mounts: Sequence[tuple[str, str]] | None = None, timeout_s: int | None = None) CommandResult[source]¶
Convenience:
start()then block viaCommandHandle.wait().- Parameters:
- workdir
pathlib.Path Host directory mounted/synced into the sandbox; mutations sync back out.
- command
str The shell command to run inside the sandbox.
- env
Mapping[str,str], optional Per-call environment variables, merged over the backend’s
env. Default empty.- mounts
Sequence[tuple[str,str]], optional Extra
(host_path, container_path)bind mounts beyond the workdir. Default empty.- timeout_s
int, optional Wall-clock cap for the command; falls back to the backend’s
timeout_s. DefaultNone.
- workdir
- Returns:
CommandResultExit code, captured stdout/stderr, and wall-clock duration.
- abstractmethod session(workdir: Path, *, env: Mapping[str, str] | None = None, mounts: Sequence[tuple[str, str]] | None = None, timeout_s: int | None = None) ComputeSession[source]¶
Open a persistent sandbox over
workdirfor multipleexec()calls.Unlike
start()(one command, then teardown), the returnedComputeSessionstays alive untilComputeSession.close(), so the same hot sandbox can run generation and verification back to back.env/mountshere pin the long-lived sandbox at creation (Docker bind mounts can only be set atdocker run); per-command credentials go toComputeSession.exec()’s ownenv.- Parameters:
- workdir
pathlib.Path Host directory mounted/synced into the sandbox for the session’s life.
- env
Mapping[str,str], optional Environment variables pinned on the sandbox at creation, merged over the backend’s
env. Default empty.- mounts
Sequence[tuple[str,str]], optional Extra
(host_path, container_path)mounts pinned at creation. Default empty.- timeout_s
int, optional Wall-clock cap for the sandbox; falls back to the backend’s
timeout_s. DefaultNone.
- workdir
- Returns:
ComputeSessionA live session over the workdir; close it via
ComputeSession.close()(use as a context manager).
- abstractmethod start(workdir: Path, command: str, *, env: Mapping[str, str] | None = None, mounts: Sequence[tuple[str, str]] | None = None, timeout_s: int | None = None) CommandHandle[source]¶
Launch
commandwithworkdirmounted/synced into the sandbox.The workdir is synced in before launch and synced back out on completion so that file mutations (completed proofs) are visible to the caller.
mountsare extra(host_path, container_path)bind mounts beyond the workdir – used to forward agent credential dirs (e.g. Codex’s~/.codex) per run, without baking them into the backend config.- Parameters:
- workdir
pathlib.Path Host directory mounted/synced into the sandbox; file mutations are synced back out on completion.
- command
str The shell command to run inside the sandbox.
- env
Mapping[str,str], optional Per-call environment variables, merged over the backend’s
env. Default empty.- mounts
Sequence[tuple[str,str]], optional Extra
(host_path, container_path)bind mounts beyond the workdir (e.g. agent credential dirs). Default empty.- timeout_s
int, optional Wall-clock cap for the command; falls back to the backend’s
timeout_s. DefaultNone.
- workdir
- Returns:
CommandHandleA live handle to stream or
wait()on.
- class open_atp.backends.base.CommandHandle[source]¶
A live, streaming command. Concrete backends subclass and fill the hooks.
Streaming matters for agents (we want incremental stdout for cost/progress parsing) but is equally usable for a blocking
lakeinvocation viawait().- stream() Iterator[str][source]¶
Yield stdout lines as they arrive.
- Yields:
strEach line of the command’s standard output, newline-stripped, as produced.
- wait() CommandResult[source]¶
Drain to completion and return the final result.
- Returns:
CommandResultExit code, captured stdout/stderr, and wall-clock duration.
- class open_atp.backends.base.CommandResult(exit_code: int, stdout: str, stderr: str, duration_s: float)[source]¶
Outcome of a finished command run inside a backend.
- class open_atp.backends.base.ComputeSession[source]¶
A persistent sandbox over a workdir, exec’d many times, torn down once.
The one-shot
ComputeBackend.start()/ComputeBackend.run()pair creates a sandbox, runs a single command, and tears it down. A session keeps the sandbox alive so several commands can run against the same hot filesystem – the agent’s generation then the verifier’s compile – without paying a second spin-up.Lifecycle invariant: teardown lives in
close()(not in the handle’swait), so a session MUST be used as a context manager andclose()MUST be idempotent – otherwise an error between exec and close leaks the sandbox.Examples
Open a session over a workdir and run several commands against the same hot sandbox – here generation followed by the compile, with one spin-up (needs a live backend, so this is illustrative rather than a doctest):
from open_atp.backends.docker import DockerBackend from open_atp.images import DEFAULT_IMAGE backend = DockerBackend(image=DEFAULT_IMAGE) with backend.session(workdir) as session: with session.exec("lake env lean Demo.lean") as handle: result = handle.wait() session.sync_out() # pull the agent's edits back to the host # the sandbox is torn down on exit, even if exec raised
Docker¶
- class open_atp.backends.docker.DockerBackend(*, image: Image | Mapping[str, object] = Image(name='open-atp:latest', lean_toolchain='leanprover/lean4:v4.28.0', mathlib_rev='v4.28.0'), timeout_s: int = 1800, env: Mapping[str, str] | None = None, workdir_mount: str = '/workspace/wd', baked_lake: str = '/workspace/.lake', volumes: tuple[tuple[str, str], ...] = ())[source]¶
Bases:
ComputeBackendRun sandboxes as local
dockercontainers over a bind-mounted workdir.The workdir is bind-mounted (not copied), so a command’s file edits land directly on the host. Construction is offline – it just records its knobs; the daemon is only contacted when a command runs.
- Parameters:
- image
Image The sandbox image carrying Lean + Mathlib – its tag plus the toolchain and Mathlib revision the verifier checks projects against. Default
DEFAULT_IMAGE. A mapping is coerced to anImage(so a parsed config’s nestedimage:block works).- timeout_s
int Wall-clock cap applied to a command when its call site does not pass an explicit
timeout_s. Default1800.- env
Mapping[str,str], optional Environment variables baked into every command run in the sandbox. Default empty.
- workdir_mount
str Path inside the container where the workdir is bind-mounted. Default
/workspace/wd.- baked_lake
str Image-baked warm cache to symlink the workdir’s
.laketo. Empty skips the symlink. Default/workspace/.lake.- volumes
tuple[tuple[str,str], …] Extra
-v host:containermounts (e.g. agent credential dirs). Default empty.
- image
Examples
>>> from open_atp.backends.docker import DockerBackend >>> from open_atp.images import DEFAULT_IMAGE >>> backend = DockerBackend(image=DEFAULT_IMAGE) >>> backend.name 'docker' >>> backend.workdir_mount '/workspace/wd'
- session(workdir: Path, *, env: Mapping[str, str] | None = None, mounts: Sequence[tuple[str, str]] | None = None, timeout_s: int | None = None) ComputeSession[source]¶
Start a detached keep-alive container for repeated
docker execcommands.All mounts are wired at
docker runtime (Docker can’t add them per-exec); the container lives untilDockerSession.close().- Parameters:
- workdir
pathlib.Path Host directory bind-mounted at
workdir_mountfor the session’s life.- env
Mapping[str,str], optional Environment variables pinned on the container at creation, merged over the backend’s
env. Default empty.- mounts
Sequence[tuple[str,str]], optional Extra
(host_path, container_path)bind mounts wired at creation (Docker can’t add them per-exec). Default empty.- timeout_s
int, optional Unused by Docker (the container has no built-in cap). Default
None.
- workdir
- Returns:
ComputeSessionA live
DockerSessionover the workdir.
- class open_atp.backends.docker.DockerSession(backend: DockerBackend, container: str)[source]¶
Bases:
ComputeSessionA live
docker run -dcontainer; exec many commands,docker killonce.The workdir is bind-mounted, so
sync_out()/sync_in()are no-ops – edits already live on the host.
Modal¶
- class open_atp.backends.modal.ModalBackend(*, image: Image | Mapping[str, object] = Image(name='open-atp:latest', lean_toolchain='leanprover/lean4:v4.28.0', mathlib_rev='v4.28.0'), timeout_s: int = 1800, env: Mapping[str, str] | None = None, cpu: float = 2.0, memory_mib: int = 4096, app: str = 'open-atp')[source]¶
Bases:
ComputeBackendRun sandboxes as Modal Sandboxes, pushing the workdir up and pulling it back.
Unlike the bind-mounted Docker backend, the workdir is synced into the remote Sandbox and synced back out on completion. Construction is offline – it just records its knobs; Modal is only contacted when a command runs.
- Parameters:
- image
Image The sandbox image carrying Lean + Mathlib – its tag plus the toolchain and Mathlib revision the verifier checks projects against. Default
DEFAULT_IMAGE. A mapping is coerced to anImage(so a parsed config’s nestedimage:block works).- timeout_s
int Wall-clock cap applied to a command when its call site does not pass an explicit
timeout_s. Default1800.- env
Mapping[str,str], optional Environment variables baked into every command run in the sandbox. Default empty.
- cpu
float CPU cores requested for the Modal Sandbox. Default
2.0.- memory_mib
int Memory (MiB) requested for the Modal Sandbox. Default
4096.- app
str Modal app the Sandbox is associated with (also the publish target of
open-atp build-modal-image). Defaultopen-atp.
- image
Examples
>>> from open_atp.backends.modal import ModalBackend >>> from open_atp.images import DEFAULT_IMAGE >>> backend = ModalBackend(image=DEFAULT_IMAGE, cpu=4.0) >>> backend.name 'modal' >>> backend.cpu 4.0 >>> backend.app 'open-atp'
- session(workdir: Path, *, env: Mapping[str, str] | None = None, mounts: Sequence[tuple[str, str]] | None = None, timeout_s: int | None = None) ComputeSession[source]¶
Provision a Sandbox over
workdirand keep it alive for many execs.The Sandbox lives until
ModalSession.close();env/mountspin it at creation.- Parameters:
- workdir
pathlib.Path Host directory pushed into the Sandbox at creation; bridge it back with
ModalSession.sync_out().- env
Mapping[str,str], optional Environment variables pinned on the Sandbox at creation, merged over the backend’s
env. Default empty.- mounts
Sequence[tuple[str,str]], optional Extra
(host_path, container_path)dirs pushed in at creation. Default empty.- timeout_s
int, optional Wall-clock cap for the Sandbox; falls back to the backend’s
timeout_s. DefaultNone.
- workdir
- Returns:
ComputeSessionA live
ModalSessionover the workdir.
- class open_atp.backends.modal.ModalSession(backend: ModalBackend, sb: modal.Sandbox, workdir: Path)[source]¶
Bases:
ComputeSessionA live Modal Sandbox: exec many commands over the pushed workdir, terminate once.
The filesystem is isolated, so
sync_out()(tar pull) andsync_in()(tar push) bridge host<->Sandbox when a caller needs the host workdir current between commands (e.g. a host-side diff, or Numina’s statement tracker).- exec(command: str, *, env: Mapping[str, str] | None = None, timeout_s: int | None = None) CommandHandle[source]¶
Exec
commandin the live Sandbox; close() owns teardown.- Parameters:
- command
str The shell command to run in the live Sandbox.
- env
Mapping[str,str], optional Per-command environment variables, forwarded as a one-off Modal secret (usually empty – credentials pin at session creation). Default empty.
- timeout_s
int, optional Unused per-exec; the cap is fixed on the Sandbox at creation. Default
None.
- command
- Returns:
CommandHandleA live
ModalSessionHandlewhosewaitleaves the Sandbox up for the next command.