## Why
Exec-server websocket handling had separate reader and writer tasks for
the same socket. That made websocket control-frame handling asymmetric:
the task reading frames could observe `Ping`, but the task allowed to
write frames was elsewhere. This PR moves each physical websocket onto
one always-running pump so the socket owner can handle application
frames and websocket control frames together.
## What changed
- Refactored direct exec-server websocket connections in `connection.rs`
to use one task that owns the websocket for outbound JSON-RPC, inbound
JSON-RPC, periodic keepalive pings, and `Ping` -> `Pong` replies.
- Refactored relay websocket handling in `relay.rs` the same way for
both the harness-side logical connection and the multiplexed executor
physical socket.
- Preserved the existing keepalive ownership policy: outbound direct
websocket clients still send periodic pings, inbound Axum accepts only
reply with pongs, and relay physical websocket endpoints keep their
existing periodic pings.
- Added focused websocket pump tests for ping/pong, binary JSON-RPC,
relay data, malformed relay text frames, and close/disconnect behavior.
- Reconnect behavior is intentionally left for a follow-up.
## Validation
- Devbox Bazel focused unit target:
- `//codex-rs/exec-server:exec-server-unit-tests
--test_filter='websocket_connection_|harness_connection_|multiplexed_executor_'`
## Why
`codex exec-server` should keep the existing public `ws://IP:PORT` URL
shape while serving that websocket connection through an HTTP upgrade
path internally. That keeps the client-facing configuration simple and
allows the listener to work through intermediate HTTP-aware
infrastructure.
## What changed
- keep the emitted and configured exec-server URL as `ws://IP:PORT`
- serve that websocket endpoint through Axum HTTP upgrade handling on
`/`
- expose `GET /readyz` from the same listener for readiness checks
- route upgraded Axum websocket streams through the shared JSON-RPC
connection machinery
- initialize the rustls crypto provider before websocket client
connections
- preserve inbound binary websocket JSON-RPC parsing for compatibility
with the prior transport behavior
## Verification
- `cargo test -p codex-exec-server --test health --test process --test
websocket --test initialize --test exec_process`
## Summary
This is the `exec-server` follow-up to #21759.
#21759 fixed the Windows `taskkill` output leak for the `rmcp-client`
MCP teardown path, but #22050 showed that `exec-server` still had a
parallel `taskkill /T /F` cleanup path in
`exec-server/src/connection.rs`. Because that command inherited the
parent stdio handles, Windows could still print `SUCCESS:` lines into
the user's terminal during stdio child cleanup.
This change silences that remaining `exec-server` callsite by
redirecting `taskkill` stdin, stdout, and stderr to `Stdio::null()`.
## What Changed
- add a Windows-only `Stdio` import in `exec-server/src/connection.rs`
- redirect the `taskkill` command in `kill_windows_process_tree` to
`Stdio::null()` for stdin, stdout, and stderr
- keep the existing kill semantics unchanged by still checking
`.status()` and preserving the existing fallback/logging behavior
## How to Test
Manual validation is Windows-only, so I did not run the UI repro path
locally here.
1. On Windows, use a Codex build from this branch.
2. Exercise an `exec-server` stdio flow that spawns a child process tree
and then triggers transport cleanup.
3. Confirm the child process tree is still torn down.
4. Confirm the terminal no longer shows `SUCCESS: The process with PID
... has been terminated.` lines during cleanup.
Targeted tests:
- `cargo test -p codex-exec-server
client::tests::dropping_stdio_client_terminates_spawned_process --
--exact`
- `cargo test -p codex-exec-server
client::tests::malformed_stdio_message_terminates_spawned_process --
--exact`
Notes:
- `cargo test -p codex-exec-server` still hits unrelated local macOS
`sandbox-exec: sandbox_apply: Operation not permitted` failures in
`tests/file_system.rs`.
## References
- Fixes the remaining callsite discussed in #22050
- Related earlier fix: #21759
## Why
Configured environments need to connect to exec-server instances that
are not necessarily already listening on a websocket URL. A
command-backed stdio transport lets Codex start an exec-server process,
speak JSON-RPC over its stdio streams, and clean up that child process
with the client lifetime.
**Stack position:** this is PR 2 of 5. It builds on the server-side
stdio listener from PR 1 and provides the client transport used by later
environment/config PRs.
## What Changed
- Add `ExecServerTransport` variants for websocket URLs and stdio shell
commands.
- Add stdio command connection support for `ExecServerClient`.
- Move websocket/stdio transport setup into `client_transport.rs` so
`client.rs` stays focused on shared JSON-RPC client, session, HTTP, and
notification behavior.
- Tie stdio child process cleanup to the JSON-RPC connection lifetime
with a RAII lifetime guard.
- Keep existing websocket environment behavior by adapting URL-backed
remotes to `ExecServerTransport::WebSocketUrl`.
## Stack
- 1. https://github.com/openai/codex/pull/20663 - Add stdio exec-server
listener
- **2. This PR:** https://github.com/openai/codex/pull/20664 - Add stdio
exec-server client transport
- 3. https://github.com/openai/codex/pull/20665 - Make environment
providers own default selection
- 4. https://github.com/openai/codex/pull/20666 - Add CODEX_HOME
environments TOML provider
- 5. https://github.com/openai/codex/pull/20667 - Load configured
environments from CODEX_HOME
Split from original draft: https://github.com/openai/codex/pull/20508
## Validation
Not run locally; this was split out of the original draft stack and then
refactored to separate transport setup from the base client.
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
This stack adds configured exec-server environments, including
environments reached over stdio. Before client-side stdio transports or
config can use that path, the exec-server binary itself needs a
first-class stdio listen mode so it can speak the same JSON-RPC protocol
over stdin/stdout that it already speaks over websockets.
**Stack position:** this is PR 1 of 5. It is the server-side transport
foundation for the stack.
## What Changed
- Accept `stdio` and `stdio://` for `codex exec-server --listen`.
- Promote the existing stdio `JsonRpcConnection` helper from test-only
code into normal exec-server transport code.
- Add parse coverage for stdio listen URLs while preserving the existing
websocket default.
## Stack
- **1. This PR:** https://github.com/openai/codex/pull/20663 - Add stdio
exec-server listener
- 2. https://github.com/openai/codex/pull/20664 - Add stdio exec-server
client transport
- 3. https://github.com/openai/codex/pull/20665 - Make environment
providers own default selection
- 4. https://github.com/openai/codex/pull/20666 - Add CODEX_HOME
environments TOML provider
- 5. https://github.com/openai/codex/pull/20667 - Load configured
environments from CODEX_HOME
Split from original draft: https://github.com/openai/codex/pull/20508
## Validation
Not run locally; this was split out of the original draft stack.
---------
Co-authored-by: Codex <noreply@openai.com>
This introduces session-scoped ownership for exec-server so ws
disconnects no longer immediately kill running remote exec processes,
and it prepares the protocol for reconnect-based resume.
- add session_id / resume_session_id to the exec-server initialize
handshake
- move process ownership under a shared session registry
- detach sessions on websocket disconnect and expire them after a TTL
instead of killing processes immediately (we will resume based on this)
- allow a new connection to resume an existing session and take over
notifications/ownership
- I use UUID to make them not predictable as we don't have auth for now
- make detached-session expiry authoritative at resume time so teardown
wins at the TTL boundary
- reject long-poll process/read calls that get resumed out from under an
older attachment
---------
Co-authored-by: Codex <noreply@openai.com>
Summary
- delete the deprecated stdio transport plumbing from the exec server
stack
- add a basic `exec_server()` harness plus test utilities to start a
server, send requests, and await events
- refresh exec-server dependencies, configs, and documentation to
reflect the new flow
Testing
- Not run (not requested)
---------
Co-authored-by: starr-openai <starr@openai.com>
Co-authored-by: Codex <noreply@openai.com>
Stacked PR 1/3.
This is the initialize-only exec-server stub slice: binary/client
scaffolding and protocol docs, without exec/filesystem implementation.
---------
Co-authored-by: Codex <noreply@openai.com>