backends

A ComputeBackend runs a command over a working directory inside a Lean+Mathlib sandbox. It is the single load-bearing primitive of the platform, used both to run a coding agent and to run lake env lean ... for verification. A ComputeSession keeps that sandbox alive across several commands – generation then verification against the same hot filesystem – without paying a second spin-up.

Base

class open_atp.backends.base.ComputeBackend(*, image: Image | Mapping[str, object] = Image(name='open-atp:latest', lean_toolchain='leanprover/lean4:v4.28.0', mathlib_rev='v4.28.0'), timeout_s: int = 1800, env: Mapping[str, str] | None = None)[source]

Runs commands over a workdir in a sandbox carrying Lean + Mathlib.

Parameters:
imageImage

The sandbox image carrying Lean + Mathlib – its tag plus the toolchain and Mathlib revision the verifier checks projects against. Default DEFAULT_IMAGE. A mapping is coerced to an Image (so a parsed config’s nested image: block works).

timeout_sint

Wall-clock cap applied to a command when its call site does not pass an explicit timeout_s. Default 1800.

envMapping[str, str], optional

Environment variables baked into every command run in the sandbox. Default empty.

run(workdir: Path, command: str, *, env: Mapping[str, str] | None = None, mounts: Sequence[tuple[str, str]] | None = None, timeout_s: int | None = None) CommandResult[source]

Convenience: start() then block via CommandHandle.wait().

Parameters:
workdirpathlib.Path

Host directory mounted/synced into the sandbox; mutations sync back out.

commandstr

The shell command to run inside the sandbox.

envMapping[str, str], optional

Per-call environment variables, merged over the backend’s env. Default empty.

mountsSequence[tuple[str, str]], optional

Extra (host_path, container_path) bind mounts beyond the workdir. Default empty.

timeout_sint, optional

Wall-clock cap for the command; falls back to the backend’s timeout_s. Default None.

Returns:
CommandResult

Exit code, captured stdout/stderr, and wall-clock duration.

abstractmethod session(workdir: Path, *, env: Mapping[str, str] | None = None, mounts: Sequence[tuple[str, str]] | None = None, timeout_s: int | None = None) ComputeSession[source]

Open a persistent sandbox over workdir for multiple exec() calls.

Unlike start() (one command, then teardown), the returned ComputeSession stays alive until ComputeSession.close(), so the same hot sandbox can run generation and verification back to back.

env/mounts here pin the long-lived sandbox at creation (Docker bind mounts can only be set at docker run); per-command credentials go to ComputeSession.exec()’s own env.

Parameters:
workdirpathlib.Path

Host directory mounted/synced into the sandbox for the session’s life.

envMapping[str, str], optional

Environment variables pinned on the sandbox at creation, merged over the backend’s env. Default empty.

mountsSequence[tuple[str, str]], optional

Extra (host_path, container_path) mounts pinned at creation. Default empty.

timeout_sint, optional

Wall-clock cap for the sandbox; falls back to the backend’s timeout_s. Default None.

Returns:
ComputeSession

A live session over the workdir; close it via ComputeSession.close() (use as a context manager).

abstractmethod start(workdir: Path, command: str, *, env: Mapping[str, str] | None = None, mounts: Sequence[tuple[str, str]] | None = None, timeout_s: int | None = None) CommandHandle[source]

Launch command with workdir mounted/synced into the sandbox.

The workdir is synced in before launch and synced back out on completion so that file mutations (completed proofs) are visible to the caller.

mounts are extra (host_path, container_path) bind mounts beyond the workdir – used to forward agent credential dirs (e.g. Codex’s ~/.codex) per run, without baking them into the backend config.

Parameters:
workdirpathlib.Path

Host directory mounted/synced into the sandbox; file mutations are synced back out on completion.

commandstr

The shell command to run inside the sandbox.

envMapping[str, str], optional

Per-call environment variables, merged over the backend’s env. Default empty.

mountsSequence[tuple[str, str]], optional

Extra (host_path, container_path) bind mounts beyond the workdir (e.g. agent credential dirs). Default empty.

timeout_sint, optional

Wall-clock cap for the command; falls back to the backend’s timeout_s. Default None.

Returns:
CommandHandle

A live handle to stream or wait() on.

class open_atp.backends.base.CommandHandle[source]

A live, streaming command. Concrete backends subclass and fill the hooks.

Streaming matters for agents (we want incremental stdout for cost/progress parsing) but is equally usable for a blocking lake invocation via wait().

cancel() None[source]

Best-effort termination (e.g. docker kill / sandbox terminate).

stream() Iterator[str][source]

Yield stdout lines as they arrive.

Yields:
str

Each line of the command’s standard output, newline-stripped, as produced.

wait() CommandResult[source]

Drain to completion and return the final result.

Returns:
CommandResult

Exit code, captured stdout/stderr, and wall-clock duration.

class open_atp.backends.base.CommandResult(exit_code: int, stdout: str, stderr: str, duration_s: float)[source]

Outcome of a finished command run inside a backend.

Attributes:
exit_codeint

The command’s process exit code (0 on success).

stdoutstr

The full captured standard output.

stderrstr

The full captured standard error.

duration_sfloat

Wall-clock time the command took, in seconds.

class open_atp.backends.base.ComputeSession[source]

A persistent sandbox over a workdir, exec’d many times, torn down once.

The one-shot ComputeBackend.start()/ComputeBackend.run() pair creates a sandbox, runs a single command, and tears it down. A session keeps the sandbox alive so several commands can run against the same hot filesystem – the agent’s generation then the verifier’s compile – without paying a second spin-up.

Lifecycle invariant: teardown lives in close() (not in the handle’s wait), so a session MUST be used as a context manager and close() MUST be idempotent – otherwise an error between exec and close leaks the sandbox.

Examples

Open a session over a workdir and run several commands against the same hot sandbox – here generation followed by the compile, with one spin-up (needs a live backend, so this is illustrative rather than a doctest):

from open_atp.backends.docker import DockerBackend
from open_atp.images import DEFAULT_IMAGE

backend = DockerBackend(image=DEFAULT_IMAGE)
with backend.session(workdir) as session:
    with session.exec("lake env lean Demo.lean") as handle:
        result = handle.wait()
    session.sync_out()   # pull the agent's edits back to the host
# the sandbox is torn down on exit, even if exec raised
close() None[source]

Tear the sandbox down (pull artifacts first where needed). Idempotent.

exec(command: str, *, env: Mapping[str, str] | None = None, timeout_s: int | None = None) CommandHandle[source]

Run command in the live sandbox; the handle does NOT tear it down.

Parameters:
commandstr

The shell command to run in the live sandbox.

envMapping[str, str], optional

Per-command environment variables, merged over the backend’s env. Default empty.

timeout_sint, optional

Wall-clock cap for the command; backends may fall back to their configured timeout_s. Default None.

Returns:
CommandHandle

A live handle to stream or wait() on. Its teardown hook does NOT close the session.

sync_in() None[source]

Push the host workdir into the sandbox (no-op when bind-mounted).

sync_out() None[source]

Pull the sandbox workdir back to the host (no-op when bind-mounted).

Docker

class open_atp.backends.docker.DockerBackend(*, image: Image | Mapping[str, object] = Image(name='open-atp:latest', lean_toolchain='leanprover/lean4:v4.28.0', mathlib_rev='v4.28.0'), timeout_s: int = 1800, env: Mapping[str, str] | None = None, workdir_mount: str = '/workspace/wd', baked_lake: str = '/workspace/.lake', volumes: tuple[tuple[str, str], ...] = ())[source]

Bases: ComputeBackend

Run sandboxes as local docker containers over a bind-mounted workdir.

The workdir is bind-mounted (not copied), so a command’s file edits land directly on the host. Construction is offline – it just records its knobs; the daemon is only contacted when a command runs.

Parameters:
imageImage

The sandbox image carrying Lean + Mathlib – its tag plus the toolchain and Mathlib revision the verifier checks projects against. Default DEFAULT_IMAGE. A mapping is coerced to an Image (so a parsed config’s nested image: block works).

timeout_sint

Wall-clock cap applied to a command when its call site does not pass an explicit timeout_s. Default 1800.

envMapping[str, str], optional

Environment variables baked into every command run in the sandbox. Default empty.

workdir_mountstr

Path inside the container where the workdir is bind-mounted. Default /workspace/wd.

baked_lakestr

Image-baked warm cache to symlink the workdir’s .lake to. Empty skips the symlink. Default /workspace/.lake.

volumestuple[tuple[str, str], …]

Extra -v host:container mounts (e.g. agent credential dirs). Default empty.

Examples

>>> from open_atp.backends.docker import DockerBackend
>>> from open_atp.images import DEFAULT_IMAGE
>>> backend = DockerBackend(image=DEFAULT_IMAGE)
>>> backend.name
'docker'
>>> backend.workdir_mount
'/workspace/wd'
session(workdir: Path, *, env: Mapping[str, str] | None = None, mounts: Sequence[tuple[str, str]] | None = None, timeout_s: int | None = None) ComputeSession[source]

Start a detached keep-alive container for repeated docker exec commands.

All mounts are wired at docker run time (Docker can’t add them per-exec); the container lives until DockerSession.close().

Parameters:
workdirpathlib.Path

Host directory bind-mounted at workdir_mount for the session’s life.

envMapping[str, str], optional

Environment variables pinned on the container at creation, merged over the backend’s env. Default empty.

mountsSequence[tuple[str, str]], optional

Extra (host_path, container_path) bind mounts wired at creation (Docker can’t add them per-exec). Default empty.

timeout_sint, optional

Unused by Docker (the container has no built-in cap). Default None.

Returns:
ComputeSession

A live DockerSession over the workdir.

class open_atp.backends.docker.DockerSession(backend: DockerBackend, container: str)[source]

Bases: ComputeSession

A live docker run -d container; exec many commands, docker kill once.

The workdir is bind-mounted, so sync_out()/sync_in() are no-ops – edits already live on the host.

close() None[source]

docker kill the keep-alive container. Idempotent.

exec(command: str, *, env: Mapping[str, str] | None = None, timeout_s: int | None = None) CommandHandle[source]

docker exec command into the container; close() owns teardown.

Parameters:
commandstr

The shell command to run in the live container.

envMapping[str, str], optional

Per-command environment variables (docker exec -e). Default empty.

timeout_sint, optional

Unused by Docker (the container has no built-in cap). Default None.

Returns:
CommandHandle

A live DockerSessionHandle whose cancel is a no-op (the session owns teardown).

sync_in() None[source]

No-op: the bind mount means host edits are already visible in-container.

sync_out() None[source]

No-op: the bind mount means edits are already on the host.