codex-exec-server is the library backing codex exec-server, a small JSON-RPC server for spawning and controlling subprocesses through codex-utils-pty.

It provides:

a CLI entrypoint: codex exec-server
a Rust connection client: ExecServerConnection
a small protocol module with shared request/response types

This crate owns the transport, protocol, and filesystem/process handlers. The top-level codex binary owns hidden helper dispatch for sandboxed filesystem operations and codex-linux-sandbox.

Client And Session Ownership

Remote environments expose one logical exec-server client to the rest of Codex. That client is not the same thing as one websocket connection.

Environment
  `- one RemoteExecServerClient
       |- RemoteExecServerSession
       |    |- logical_session_id
       |    |- current ExecServerConnection?
       |    |- one in-flight reconnect attempt
       |    |- terminal reconnect error?
       |    `- tracked process_sessions: HashMap<ProcessId, Weak<ProcessSession>>
       |- RemoteProcess -> RemoteExecProcess -> ProcessSessionHandle
       |- RemoteFileSystem
       `- HttpClient for RemoteExecServerClient

ExecServerConnection
  `- Inner
       |- RpcClient
       |- reader task
       |- disconnect latch
       |- connection-local process_session_routes
       |- connection-local HTTP body stream routes
       `- initialized session_id for this live binding

ProcessSessionHandle
  |- process_id
  |- Arc<ProcessSession>
  `- ProcessSessionControl

ProcessSession
  |- wake channel
  |- event log
  |- ordered event buffer
  |- failure state
  `- disconnect policy

ProcessSessionControl
  |- Connection(ExecServerConnection) for direct/test one-shot sessions
  `- RemoteClient(RemoteExecServerClient) for reconnecting remote environments

The main roles are:

RemoteExecServerClient: environment-owned logical client. Environment clones this into the remote process backend, remote filesystem, and remote HTTP capability so all remote APIs share one reconnecting session.
RemoteExecServerSession: durable logical-session state behind the client. It remembers the resumable session id, current live connection, one shared reconnect attempt, weak references to process sessions that may need rebinding, and any terminal resume error.
ExecServerConnection: one live JSON-RPC transport binding. It owns connection-local routing for notifications and streamed HTTP response bodies.
Inner: private per-connection machinery behind ExecServerConnection. It owns the RpcClient, reader task, disconnect latch, connection-local process notification routes, connection-local HTTP body stream routes, and initialized session id for that live binding.
ProcessSession: durable per-process client state owned by the live process handle. It keeps the local event log, wake cursor, and failure state that must survive connection replacement while that handle still exists.
ProcessSessionHandle: process-facing handle used by RemoteExecProcess. It routes reads, writes, terminate, and unregister through either a focused direct connection test path or the logical reconnecting client path.
ProcessSessionControl: small command-path enum for a ProcessSessionHandle. It is not an owner; it only chooses direct ExecServerConnection versus reconnecting RemoteExecServerClient.
RemoteProcess, RemoteFileSystem, and HttpClient: thin capability adapters. They should not own reconnect state themselves.

Reconnect invariants:

There is one shared reconnect attempt per RemoteExecServerClient, not one reconnect loop per API surface.
Reconnect resumes the same logical session id and rebinds tracked ProcessSession routes onto the replacement ExecServerConnection.
When a reconnecting process session loses its transport, it emits a local ResyncRequired event and wake so callers blocked on pushed events or wake notifications know to recover through process/read(afterSeq).
process/read may retry once after a transport-close race because its afterSeq cursor makes the replay read-only and recoverable.
process/start, process/write, process/terminate, filesystem RPCs, and http/request are not replayed after an ambiguous mid-request disconnect. They reconnect before later calls, but an in-flight call that may already have reached the server returns an error instead of risking duplicate side effects.
Streamed HTTP bodies are connection-local. A reconnect can start a later HTTP request, but it cannot resume body-delta delivery for an already-open stream.
Rendezvous uses the same logical split. The relay websocket and relay stream_id are transport beneath the exec-server logical session for the first reconnect slice. If a rendezvous websocket dies, the harness/client may establish a fresh relay stream, then re-run exec-server initialize with the prior session_id as resume_session_id. Existing process state recovers through process/read(after_seq), not relay-frame replay. Full same-stream relay resume/replay remains a later protocol slice that requires endpoint-held seq/ack/replay state.

Transport

The server speaks the shared codex-app-server-protocol message envelope on the wire.

The CLI entrypoint supports:

ws://IP:PORT (default)
--remote URL --executor-id ID [--name NAME]

Remote mode registers the local exec-server with the executor registry, then reconnects to the service-provided rendezvous websocket as the executor. It uses the standard Codex ChatGPT sign-in state; run codex login first when remote registration needs authentication. Containerized callers that receive an Agent Identity JWT in CODEX_ACCESS_TOKEN can opt into that auth path with --use-agent-identity-auth; Codex then registers an Agent task and sends the derived AgentAssertion headers on the registry request.

Wire framing:

local websocket: one JSON-RPC message per websocket frame
remote websocket: binary protobuf relay frames carrying JSON-RPC payloads

Remote Relay Message Format

In remote mode, the harness and executor communicate through rendezvous using codex.exec_server.relay.v1.RelayMessageFrame; the checked-in schema is in src/proto/codex.exec_server.relay.v1.proto. The relay frame carries stream identity plus endpoint-owned reliability metadata:

version
stream_id
body              // data | ack_frame | resume | reset | heartbeat
ack               // highest contiguous peer segment seq received
ack_bits          // bitset for peer segment seqs after ack
seq               // data only: segment sequence number
segment_index     // data only: 0-based index within message
segment_count     // data only: number of segments in message
payload           // data only: JSON-RPC message bytes or segment bytes
next_seq          // resume only: next sender seq
reason            // reset only: reset reason

stream_id identifies one virtual harness/executor JSON-RPC session on the executor websocket. The harness generates a UUIDv4 stream_id; the executor demuxes frames by stream_id and runs an independent ConnectionProcessor per stream.

Use segment-level sequence numbers for reliability:

seq = 0, 1, 2, 3, ...

Use contiguous segment sequence ranges to identify and stitch a segmented application message:

message_start_seq = seq - segment_index
segment_index = 0
segment_count = 1

message_start_seq is derived by the receiver, not sent on the wire. For unsplit messages, message_start_seq == seq, segment_index == 0, and segment_count == 1.

Use cumulative ack plus fixed-size ack_bits instead of variable ack ranges:

ack = highest contiguous received segment seq
bit i in ack_bits acknowledges seq = ack + 1 + i

Send ack and ack_bits redundantly on every outbound frame. Acks are not themselves acked. Acks, retries, duplicate suppression, segmentation, and reassembly are endpoint responsibilities; rendezvous only routes relay frames by stream_id.

Lifecycle

Each connection follows this sequence:

Send initialize.
Wait for the initialize response.
Send initialized.
Call process or filesystem RPCs.

If the server receives any notification other than initialized, it replies with an error using request id -1.

If the websocket connection closes, the server terminates any remaining managed processes for that client connection.

API

`initialize`

Initial handshake request.

Request params:

{
  "clientName": "my-client"
}

Response:

{}

`initialized`

Handshake acknowledgement notification sent by the client after a successful initialize response.

Params are currently ignored. Sending any other notification method is treated as an invalid request.

`process/start`

Starts a new managed process.

Request params:

{
  "processId": "proc-1",
  "argv": ["bash", "-lc", "printf 'hello\\n'"],
  "cwd": "/absolute/working/directory",
  "env": {
    "PATH": "/usr/bin:/bin"
  },
  "tty": true,
  "pipeStdin": false,
  "arg0": null
}

Field definitions:

processId: caller-chosen stable id for this process within the connection.
argv: command vector. It must be non-empty.
cwd: absolute working directory used for the child process.
env: environment variables passed to the child process.
tty: when true, spawn a PTY-backed interactive process.
pipeStdin: when true, keep non-PTY stdin writable via process/write.
arg0: optional argv0 override forwarded to codex-utils-pty.

Response:

{
  "processId": "proc-1"
}

Behavior notes:

Reusing an existing processId is rejected.
PTY-backed processes accept later writes through process/write.
Non-PTY processes reject writes unless pipeStdin is true.
Output is streamed asynchronously via process/output.
Exit is reported asynchronously via process/exited.

`process/read`

Reads buffered output and terminal state for a managed process.

Request params:

{
  "processId": "proc-1",
  "afterSeq": null,
  "maxBytes": 65536,
  "waitMs": 1000
}

Field definitions:

processId: managed process id returned by process/start.
afterSeq: optional sequence number cursor; when present, only newer chunks are returned.
maxBytes: optional response byte budget.
waitMs: optional long-poll timeout in milliseconds.

Response:

{
  "chunks": [],
  "nextSeq": 1,
  "exited": false,
  "exitCode": null,
  "closed": false,
  "failure": null
}

`process/write`

Writes raw bytes to a running process stdin.

Request params:

{
  "processId": "proc-1",
  "chunk": "aGVsbG8K"
}

chunk is base64-encoded raw bytes. In the example above it is hello\n.

Response:

{
  "status": "accepted"
}

Behavior notes:

Writes to an unknown processId are rejected.
Writes to a non-PTY process are rejected unless it started with pipeStdin.

`process/terminate`

Terminates a running managed process.

Request params:

{
  "processId": "proc-1"
}

Response:

{
  "running": true
}

If the process is already unknown or already removed, the server responds with:

{
  "running": false
}

Notifications

`process/output`

Streaming output chunk from a running process.

Params:

{
  "processId": "proc-1",
  "seq": 1,
  "stream": "stdout",
  "chunk": "aGVsbG8K"
}

Fields:

processId: process identifier
seq: per-process output sequence number
stream: "stdout", "stderr", or "pty"
chunk: base64-encoded output bytes

`process/exited`

Final process exit notification.

Params:

{
  "processId": "proc-1",
  "seq": 2,
  "exitCode": 0
}

`process/closed`

Notification emitted after process output is closed and the process handle is removed.

Params:

{
  "processId": "proc-1"
}

Filesystem RPCs

Filesystem methods use absolute paths and return JSON-RPC errors for invalid or unavailable paths:

fs/readFile
fs/writeFile
fs/createDirectory
fs/getMetadata
fs/readDirectory
fs/remove
fs/copy

Each filesystem request accepts an optional sandbox object. When sandbox contains a ReadOnly or WorkspaceWrite policy, the operation runs in a hidden helper process launched from the top-level codex executable and prepared through the shared sandbox transform path. Helper requests and responses are passed over stdin/stdout.

Errors

The server returns JSON-RPC errors with these codes:

-32600: invalid request
-32602: invalid params
-32603: internal error

Typical error cases:

unknown method
malformed params
empty argv
duplicate processId
writes to unknown processes
writes to non-PTY processes
sandbox-denied filesystem operations

Rust surface

The crate exports:

ExecServerClient
ExecServerError
ExecServerClientConnectOptions
RemoteExecServerConnectArgs
protocol request/response structs for process and filesystem RPCs
DEFAULT_LISTEN_URL and ExecServerListenUrlParseError
ExecServerRuntimePaths
run_main() for embedding the websocket server
RemoteExecutorConfig and run_remote_executor() for embedding remote registration mode

Callers must pass ExecServerRuntimePaths to run_main(). The top-level codex exec-server command builds these paths from the codex arg0 dispatch state. RemoteExecutorConfig::new(...) also takes the auth provider that remote registration should use; the CLI builds that provider from Codex auth state before starting remote mode.

Example session

Initialize:

{"id":1,"method":"initialize","params":{"clientName":"example-client"}}
{"id":1,"result":{}}
{"method":"initialized","params":{}}

Start a process:

{"id":2,"method":"process/start","params":{"processId":"proc-1","argv":["bash","-lc","printf 'ready\\n'; while IFS= read -r line; do printf 'echo:%s\\n' \"$line\"; done"],"cwd":"/tmp","env":{"PATH":"/usr/bin:/bin"},"tty":true,"pipeStdin":false,"arg0":null}}
{"id":2,"result":{"processId":"proc-1"}}
{"method":"process/output","params":{"processId":"proc-1","seq":1,"stream":"stdout","chunk":"cmVhZHkK"}}

Write to the process:

{"id":3,"method":"process/write","params":{"processId":"proc-1","chunk":"aGVsbG8K"}}
{"id":3,"result":{"status":"accepted"}}
{"method":"process/output","params":{"processId":"proc-1","seq":2,"stream":"stdout","chunk":"ZWNobzpoZWxsbwo="}}

Terminate it:

{"id":4,"method":"process/terminate","params":{"processId":"proc-1"}}
{"id":4,"result":{"running":true}}
{"method":"process/exited","params":{"processId":"proc-1","seq":3,"exitCode":0}}
{"method":"process/closed","params":{"processId":"proc-1"}}