`backends`¶

A ComputeBackend runs a command over a working directory inside a Lean+Mathlib sandbox. It is the single load-bearing primitive of the platform, used both to run a coding agent and to run lake env lean ... for verification. A ComputeSession keeps that sandbox alive across several commands – generation then verification against the same hot filesystem – without paying a second spin-up.

Base¶

class open_atp.backends.base.ComputeBackend(*, image: Image | Mapping[str, object] = Image(name='open-atp:latest', lean_toolchain='leanprover/lean4:v4.28.0', mathlib_rev='v4.28.0'), timeout_s: int = 1800, env: Mapping[str, str] | None = None)[source]¶

Runs commands over a workdir in a sandbox carrying Lean + Mathlib.

Parameters:

imageImage: The sandbox image carrying Lean + Mathlib – its tag plus the toolchain and Mathlib revision the verifier checks projects against. Default DEFAULT_IMAGE. A mapping is coerced to an Image (so a parsed config’s nested image: block works).
timeout_sint: Wall-clock cap applied to a command when its call site does not pass an explicit timeout_s. Default 1800.
envMapping[str, str], optional: Environment variables baked into every command run in the sandbox. Default empty.

run(workdir: Path, command: str, *, env: Mapping[str, str] | None = None, mounts: Sequence[tuple[str, str]] | None = None, timeout_s: int | None = None) → CommandResult[source]¶

Convenience: start() then block via CommandHandle.wait().

Parameters:

workdirpathlib.Path: Host directory mounted/synced into the sandbox; mutations sync back out.
commandstr: The shell command to run inside the sandbox.
envMapping[str, str], optional: Per-call environment variables, merged over the backend’s env. Default empty.
mountsSequence[tuple[str, str]], optional: Extra (host_path, container_path) bind mounts beyond the workdir. Default empty.
timeout_sint, optional: Wall-clock cap for the command; falls back to the backend’s timeout_s. Default None.

Returns:

CommandResult: Exit code, captured stdout/stderr, and wall-clock duration.

abstractmethod session(workdir: Path, *, env: Mapping[str, str] | None = None, mounts: Sequence[tuple[str, str]] | None = None, timeout_s: int | None = None) → ComputeSession[source]¶

Open a persistent sandbox over workdir for multiple exec() calls.

Unlike start() (one command, then teardown), the returned ComputeSession stays alive until ComputeSession.close(), so the same hot sandbox can run generation and verification back to back.

env/mounts here pin the long-lived sandbox at creation (Docker bind mounts can only be set at docker run); per-command credentials go to ComputeSession.exec()’s own env.

Parameters:

workdirpathlib.Path: Host directory mounted/synced into the sandbox for the session’s life.
envMapping[str, str], optional: Environment variables pinned on the sandbox at creation, merged over the backend’s env. Default empty.
mountsSequence[tuple[str, str]], optional: Extra (host_path, container_path) mounts pinned at creation. Default empty.
timeout_sint, optional: Wall-clock cap for the sandbox; falls back to the backend’s timeout_s. Default None.

Returns:

ComputeSession: A live session over the workdir; close it via ComputeSession.close() (use as a context manager).

abstractmethod start(workdir: Path, command: str, *, env: Mapping[str, str] | None = None, mounts: Sequence[tuple[str, str]] | None = None, timeout_s: int | None = None) → CommandHandle[source]¶

Launch command with workdir mounted/synced into the sandbox.

The workdir is synced in before launch and synced back out on completion so that file mutations (completed proofs) are visible to the caller.

mounts are extra (host_path, container_path) bind mounts beyond the workdir – used to forward agent credential dirs (e.g. Codex’s ~/.codex) per run, without baking them into the backend config.

Parameters:

workdirpathlib.Path: Host directory mounted/synced into the sandbox; file mutations are synced back out on completion.
commandstr: The shell command to run inside the sandbox.
envMapping[str, str], optional: Per-call environment variables, merged over the backend’s env. Default empty.
mountsSequence[tuple[str, str]], optional: Extra (host_path, container_path) bind mounts beyond the workdir (e.g. agent credential dirs). Default empty.
timeout_sint, optional: Wall-clock cap for the command; falls back to the backend’s timeout_s. Default None.

Returns:

CommandHandle: A live handle to stream or wait() on.

class open_atp.backends.base.CommandHandle[source]¶

A live, streaming command. Concrete backends subclass and fill the hooks.

Streaming matters for agents (we want incremental stdout for cost/progress parsing) but is equally usable for a blocking lake invocation via wait().

cancel() → None[source]¶: Best-effort termination (e.g. docker kill / sandbox terminate).

stream() → Iterator[str][source]¶

Yield stdout lines as they arrive.

Yields:

str: Each line of the command’s standard output, newline-stripped, as produced.

wait() → CommandResult[source]¶

Drain to completion and return the final result.

Returns:

CommandResult: Exit code, captured stdout/stderr, and wall-clock duration.

class open_atp.backends.base.CommandResult(exit_code: int, stdout: str, stderr: str, duration_s: float)[source]¶

Outcome of a finished command run inside a backend.

Attributes:

exit_codeint: The command’s process exit code (0 on success).
stdoutstr: The full captured standard output.
stderrstr: The full captured standard error.
duration_sfloat: Wall-clock time the command took, in seconds.

class open_atp.backends.base.ComputeSession[source]¶

A persistent sandbox over a workdir, exec’d many times, torn down once.

The one-shot ComputeBackend.start()/ComputeBackend.run() pair creates a sandbox, runs a single command, and tears it down. A session keeps the sandbox alive so several commands can run against the same hot filesystem – the agent’s generation then the verifier’s compile – without paying a second spin-up.

Lifecycle invariant: teardown lives in close() (not in the handle’s wait), so a session MUST be used as a context manager and close() MUST be idempotent – otherwise an error between exec and close leaks the sandbox.

Examples

Open a session over a workdir and run several commands against the same hot sandbox – here generation followed by the compile, with one spin-up (needs a live backend, so this is illustrative rather than a doctest):

from open_atp.backends.docker import DockerBackend
from open_atp.images import DEFAULT_IMAGE

backend = DockerBackend(image=DEFAULT_IMAGE)
with backend.session(workdir) as session:
    with session.exec("lake env lean Demo.lean") as handle:
        result = handle.wait()
    session.sync_out()   # pull the agent's edits back to the host
# the sandbox is torn down on exit, even if exec raised

close() → None[source]¶: Tear the sandbox down (pull artifacts first where needed). Idempotent.

exec(command: str, *, env: Mapping[str, str] | None = None, timeout_s: int | None = None) → CommandHandle[source]¶

Run command in the live sandbox; the handle does NOT tear it down.

Parameters:

commandstr: The shell command to run in the live sandbox.
envMapping[str, str], optional: Per-command environment variables, merged over the backend’s env. Default empty.
timeout_sint, optional: Wall-clock cap for the command; backends may fall back to their configured timeout_s. Default None.

Returns:

CommandHandle: A live handle to stream or wait() on. Its teardown hook does NOT close the session.

sync_in() → None[source]¶: Push the host workdir into the sandbox (no-op when bind-mounted).

sync_out() → None[source]¶: Pull the sandbox workdir back to the host (no-op when bind-mounted).

Docker¶

class open_atp.backends.docker.DockerBackend(*, image: Image | Mapping[str, object] = Image(name='open-atp:latest', lean_toolchain='leanprover/lean4:v4.28.0', mathlib_rev='v4.28.0'), timeout_s: int = 1800, env: Mapping[str, str] | None = None, workdir_mount: str = '/workspace/wd', baked_lake: str = '/workspace/.lake', volumes: tuple[tuple[str, str], ...] = ())[source]¶

Bases: ComputeBackend

Run sandboxes as local docker containers over a bind-mounted workdir.

The workdir is bind-mounted (not copied), so a command’s file edits land directly on the host. Construction is offline – it just records its knobs; the daemon is only contacted when a command runs.

Parameters:

imageImage: The sandbox image carrying Lean + Mathlib – its tag plus the toolchain and Mathlib revision the verifier checks projects against. Default DEFAULT_IMAGE. A mapping is coerced to an Image (so a parsed config’s nested image: block works).
timeout_sint: Wall-clock cap applied to a command when its call site does not pass an explicit timeout_s. Default 1800.
envMapping[str, str], optional: Environment variables baked into every command run in the sandbox. Default empty.
workdir_mountstr: Path inside the container where the workdir is bind-mounted. Default /workspace/wd.
baked_lakestr: Image-baked warm cache to symlink the workdir’s .lake to. Empty skips the symlink. Default /workspace/.lake.
volumestuple[tuple[str, str], …]: Extra -v host:container mounts (e.g. agent credential dirs). Default empty.

Examples

>>> from open_atp.backends.docker import DockerBackend
>>> from open_atp.images import DEFAULT_IMAGE
>>> backend = DockerBackend(image=DEFAULT_IMAGE)
>>> backend.name
'docker'
>>> backend.workdir_mount
'/workspace/wd'

session(workdir: Path, *, env: Mapping[str, str] | None = None, mounts: Sequence[tuple[str, str]] | None = None, timeout_s: int | None = None) → ComputeSession[source]¶

Start a detached keep-alive container for repeated docker exec commands.

All mounts are wired at docker run time (Docker can’t add them per-exec); the container lives until DockerSession.close().

Parameters:

workdirpathlib.Path: Host directory bind-mounted at workdir_mount for the session’s life.
envMapping[str, str], optional: Environment variables pinned on the container at creation, merged over the backend’s env. Default empty.
mountsSequence[tuple[str, str]], optional: Extra (host_path, container_path) bind mounts wired at creation (Docker can’t add them per-exec). Default empty.
timeout_sint, optional: Unused by Docker (the container has no built-in cap). Default None.

Returns:

ComputeSession: A live DockerSession over the workdir.

class open_atp.backends.docker.DockerSession(backend: DockerBackend, container: str)[source]¶

Bases: ComputeSession

A live docker run -d container; exec many commands, docker kill once.

The workdir is bind-mounted, so sync_out()/sync_in() are no-ops – edits already live on the host.

close() → None[source]¶: docker kill the keep-alive container. Idempotent.

exec(command: str, *, env: Mapping[str, str] | None = None, timeout_s: int | None = None) → CommandHandle[source]¶

docker exec command into the container; close() owns teardown.

Parameters:

commandstr: The shell command to run in the live container.
envMapping[str, str], optional: Per-command environment variables (docker exec -e). Default empty.
timeout_sint, optional: Unused by Docker (the container has no built-in cap). Default None.

Returns:

CommandHandle: A live DockerSessionHandle whose cancel is a no-op (the session owns teardown).

sync_in() → None[source]¶: No-op: the bind mount means host edits are already visible in-container.

sync_out() → None[source]¶: No-op: the bind mount means edits are already on the host.

Modal¶

class open_atp.backends.modal.ModalBackend(*, image: Image | Mapping[str, object] = Image(name='open-atp:latest', lean_toolchain='leanprover/lean4:v4.28.0', mathlib_rev='v4.28.0'), timeout_s: int = 1800, env: Mapping[str, str] | None = None, cpu: float = 2.0, memory_mib: int = 4096, app: str = 'open-atp')[source]¶

Bases: ComputeBackend

Run sandboxes as Modal Sandboxes, pushing the workdir up and pulling it back.

Unlike the bind-mounted Docker backend, the workdir is synced into the remote Sandbox and synced back out on completion. Construction is offline – it just records its knobs; Modal is only contacted when a command runs.

Parameters:

imageImage: The sandbox image carrying Lean + Mathlib – its tag plus the toolchain and Mathlib revision the verifier checks projects against. Default DEFAULT_IMAGE. A mapping is coerced to an Image (so a parsed config’s nested image: block works).
timeout_sint: Wall-clock cap applied to a command when its call site does not pass an explicit timeout_s. Default 1800.
envMapping[str, str], optional: Environment variables baked into every command run in the sandbox. Default empty.
cpufloat: CPU cores requested for the Modal Sandbox. Default 2.0.
memory_mibint: Memory (MiB) requested for the Modal Sandbox. Default 4096.
appstr: Modal app the Sandbox is associated with (also the publish target of open-atp build-modal-image). Default open-atp.

Examples

>>> from open_atp.backends.modal import ModalBackend
>>> from open_atp.images import DEFAULT_IMAGE
>>> backend = ModalBackend(image=DEFAULT_IMAGE, cpu=4.0)
>>> backend.name
'modal'
>>> backend.cpu
4.0
>>> backend.app
'open-atp'

session(workdir: Path, *, env: Mapping[str, str] | None = None, mounts: Sequence[tuple[str, str]] | None = None, timeout_s: int | None = None) → ComputeSession[source]¶

Provision a Sandbox over workdir and keep it alive for many execs.

The Sandbox lives until ModalSession.close(); env/mounts pin it at creation.

Parameters:

workdirpathlib.Path: Host directory pushed into the Sandbox at creation; bridge it back with ModalSession.sync_out().
envMapping[str, str], optional: Environment variables pinned on the Sandbox at creation, merged over the backend’s env. Default empty.
mountsSequence[tuple[str, str]], optional: Extra (host_path, container_path) dirs pushed in at creation. Default empty.
timeout_sint, optional: Wall-clock cap for the Sandbox; falls back to the backend’s timeout_s. Default None.

Returns:

ComputeSession: A live ModalSession over the workdir.

class open_atp.backends.modal.ModalSession(backend: ModalBackend, sb: modal.Sandbox, workdir: Path)[source]¶

Bases: ComputeSession

A live Modal Sandbox: exec many commands over the pushed workdir, terminate once.

The filesystem is isolated, so sync_out() (tar pull) and sync_in() (tar push) bridge host<->Sandbox when a caller needs the host workdir current between commands (e.g. a host-side diff, or Numina’s statement tracker).

close() → None[source]¶: Pull final artifacts, then terminate the Sandbox. Idempotent.

exec(command: str, *, env: Mapping[str, str] | None = None, timeout_s: int | None = None) → CommandHandle[source]¶

Exec command in the live Sandbox; close() owns teardown.

Parameters:

commandstr: The shell command to run in the live Sandbox.
envMapping[str, str], optional: Per-command environment variables, forwarded as a one-off Modal secret (usually empty – credentials pin at session creation). Default empty.
timeout_sint, optional: Unused per-exec; the cap is fixed on the Sandbox at creation. Default None.

Returns:

CommandHandle: A live ModalSessionHandle whose wait leaves the Sandbox up for the next command.

sync_in() → None[source]¶: Tar the host workdir up into the Sandbox.

sync_out() → None[source]¶: Tar the Sandbox workdir back out over the host workdir.

backends¶

Base¶

Docker¶

Modal¶

`backends`¶