This PR partially rebase `unified_exec` on the `exec-server` and adapt
the `exec-server` accordingly.
## What changed in `exec-server`
1. Replaced the old "broadcast-driven; process-global" event model with
process-scoped session events. The goal is to be able to have dedicated
handler for each process.
2. Add to protocol contract to support explicit lifecycle status and
stream ordering:
- `WriteResponse` now returns `WriteStatus` (Accepted, UnknownProcess,
StdinClosed, Starting) instead of a bool.
- Added seq fields to output/exited notifications.
- Added terminal process/closed notification.
3. Demultiplexed remote notifications into per-process channels. Same as
for the event sys
4. Local and remote backends now both implement ExecBackend.
5. Local backend wraps internal process ID/operations into per-process
ExecProcess objects.
6. Remote backend registers a session channel before launch and
unregisters on failed launch.
## What changed in `unified_exec`
1. Added unified process-state model and backend-neutral process
wrapper. This will probably disappear in the future, but it makes it
easier to keep the work flowing on both side.
- `UnifiedExecProcess` now handles both local PTY sessions and remote
exec-server processes through a shared `ProcessHandle`.
- Added `ProcessState` to track has_exited, exit_code, and terminal
failure message consistently across backends.
2. Routed write and lifecycle handling through process-level methods.
## Some rationals
1. The change centralizes execution transport in exec-server while
preserving policy and orchestration ownership in core, avoiding
duplicated launch approval logic. This comes from internal discussion.
2. Session-scoped events remove coupling/cross-talk between processes
and make stream ordering and terminal state explicit (seq, closed,
failed).
3. The failure-path surfacing (remote launch failures, write failures,
transport disconnects) makes command tool output and cleanup behavior
deterministic
## Follow-ups:
* Unify the concept of thread ID behind an obfuscated struct
* FD handling
* Full zsh-fork compatibility
* Full network sandboxing compatibility
* Handle ws disconnection
Fixes#15283.
## Summary
Older system bubblewrap builds reject `--argv0`, which makes our Linux
sandbox fail before the helper can re-exec. This PR keeps using system
`/usr/bin/bwrap` whenever it exists and only falls back to vendored
bwrap when the system binary is missing. That matters on stricter
AppArmor hosts, where the distro bwrap package also provides the policy
setup needed for user namespaces.
For old system bwrap, we avoid `--argv0` instead of switching binaries:
- pass the sandbox helper a full-path `argv0`,
- keep the existing `current_exe() + --argv0` path when the selected
launcher supports it,
- otherwise omit `--argv0` and re-exec through the helper's own
`argv[0]` path, whose basename still dispatches as
`codex-linux-sandbox`.
Also updates the launcher/warning tests and docs so they match the new
behavior: present-but-old system bwrap uses the compatibility path, and
only absent system bwrap falls back to vendored.
### Validation
1. Install Ubuntu 20.04 in a VM
2. Compile codex and run without bubblewrap installed - see a warning
about falling back to the vendored bwrap
3. Install bwrap and verify version is 0.4.0 without `argv0` support
4. run codex and use apply_patch tool without errors
<img width="802" height="631" alt="Screenshot 2026-03-25 at 11 48 36 PM"
src="https://github.com/user-attachments/assets/77248a29-aa38-4d7c-9833-496ec6a458b8"
/>
<img width="807" height="634" alt="Screenshot 2026-03-25 at 11 47 32 PM"
src="https://github.com/user-attachments/assets/5af8b850-a466-489b-95a6-455b76b5050f"
/>
<img width="812" height="635" alt="Screenshot 2026-03-25 at 11 45 45 PM"
src="https://github.com/user-attachments/assets/438074f0-8435-4274-a667-332efdd5cb57"
/>
<img width="801" height="623" alt="Screenshot 2026-03-25 at 11 43 56 PM"
src="https://github.com/user-attachments/assets/0dc8d3f5-e8cf-4218-b4b4-a4f7d9bf02e3"
/>
---------
Co-authored-by: Michael Bolin <mbolin@openai.com>
Add environment manager that is a singleton and is created early in
app-server (before skill manager, before config loading).
Use an environment variable to point to a running exec server.
I've seen several intermittent failures of
`get_auth_status_returns_token_after_proactive_refresh_recovery` today.
I investigated, and I found a couple of issues.
First, `getAuthStatus(refreshToken=true)` could refresh twice in one
request: once via `refresh_token_if_requested()` and again via the
proactive refresh path inside `auth_manager.auth()`. In the
permanent-failure case this produced an extra `/oauth/token` call and
made the app-server auth tests flaky. Use `auth_cached()` after an
explicit refresh request so the handler reuses the post-refresh auth
state instead of immediately re-entering proactive refresh logic. Keep
the existing proactive path for `refreshToken=false`.
Second, serialize auth refresh attempts in `AuthManager` have a
startup/request race. One proactive refresh could already be in flight
while a `getAuthStatus(refreshToken=false)` request entered
`auth().await`, causing a second `/oauth/token` call before the first
failure or refresh result had been recorded. Guarding the refresh flow
with a single async lock makes concurrent callers share one refresh
result, which prevents duplicate refreshes and stabilizes the
proactive-refresh auth tests.
## Summary
- move skill loading and management into codex-core-skills
- leave codex-core with the thin integration layer and shared wiring
## Testing
- CI
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
This change adds websocket authentication at the app-server transport
boundary and enforces it before JSON-RPC `initialize`, so authenticated
deployments reject unauthenticated clients during the websocket
handshake rather than after a connection has already been admitted.
During rollout, websocket auth is opt-in for non-loopback listeners so
we do not break existing remote clients. If `--ws-auth ...` is
configured, the server enforces auth during websocket upgrade. If auth
is not configured, non-loopback listeners still start, but app-server
logs a warning and the startup banner calls out that auth should be
configured before real remote use.
The server supports two auth modes: a file-backed capability token, and
a standard HMAC-signed JWT/JWS bearer token verified with the
`jsonwebtoken` crate, with optional issuer, audience, and clock-skew
validation. Capability tokens are normalized, hashed, and compared in
constant time. Short shared secrets for signed bearer tokens are
rejected at startup. Requests carrying an `Origin` header are rejected
with `403` by transport middleware, and authenticated clients present
credentials as `Authorization: Bearer <token>` during websocket upgrade.
## Validation
- `cargo test -p codex-app-server transport::auth`
- `cargo test -p codex-cli app_server_`
- `cargo clippy -p codex-app-server --all-targets -- -D warnings`
- `just bazel-lock-check`
Note: in the broad `cargo test -p codex-app-server
connection_handling_websocket` run, the touched websocket auth cases
passed, but unrelated Unix shutdown tests failed with a timeout in this
environment.
---------
Co-authored-by: Eric Traut <etraut@openai.com>
## Summary
- move the analytics events client into codex-analytics
- update codex-core and app-server callsites to use the new crate
## Testing
- CI
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- extract plugin identifiers and load-outcome types into codex-plugin
- update codex-core to consume the new plugin crate
## Testing
- CI
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- extract instruction fragment and user-instruction types into
codex-instructions
- update codex-core to consume the new crate
## Testing
- CI
---------
Co-authored-by: Codex <noreply@openai.com>
- move the shared byte-based middle truncation logic from `core` into
`codex-utils-string`
- keep token-specific truncation in `codex-core` so rollout can reuse
the shared helper in the next stacked PR
---------
Co-authored-by: Codex <noreply@openai.com>
- create `codex-git-utils` and move the shared git helpers into it with
file moves preserved for diff readability
- move the `GitInfo` helpers out of `core` so stacked rollout work can
depend on the shared crate without carrying its own git info module
---------
Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com>
Co-authored-by: Codex <noreply@openai.com>
This PR completes the conversion of non-interactive `codex exec` to use
app server rather than directly using core events and methods.
### Summary
- move `codex-exec` off exec-owned `AuthManager` and `ThreadManager`
state
- route exec bootstrap, resume, and auth refresh through existing
app-server paths
- replace legacy `codex/event/*` decoding in exec with typed app-server
notification handling
- update human and JSONL exec output adapters to translate existing
app-server notifications only
- clean up "app server client" layer by eliminating support for legacy
notifications; this is no longer needed
- remove exposure of `authManager` and `threadManager` from "app server
client" layer
### Testing
- `exec` has pretty extensive unit and integration tests already, and
these all pass
- In addition, I asked Codex to put together a comprehensive manual set
of tests to cover all of the `codex exec` functionality (including
command-line options), and it successfully generated and ran these tests
## Summary
- move the pure sandbox policy transform helpers from `codex-core` into
`codex-sandboxing`
- move the corresponding unit tests with the extracted implementation
- update `core` and `app-server` callers to import the moved APIs
directly, without re-exports or proxy methods
## Testing
- cargo test -p codex-sandboxing
- cargo test -p codex-core sandboxing
- cargo test -p codex-app-server --lib
- just fix -p codex-sandboxing
- just fix -p codex-core
- just fix -p codex-app-server
- just fmt
- just argument-comment-lint
## Summary
- move macOS permission merging/intersection logic and tests from
`codex-core` into `codex-sandboxing`
- move seatbelt policy builders, permissions logic, SBPL assets, and
their tests into `codex-sandboxing`
- keep `codex-core` owning only the seatbelt spawn wrapper and switch
call sites to import the moved APIs directly
## Notes
- no re-exports added
- moved the seatbelt tests with the implementation so internal helpers
could stay private
- local verification is still finishing while this PR is open
## Summary
- add a new `codex-sandboxing` crate for sandboxing extraction work
- move the pure Linux sandbox argv builders and their unit tests out of
`codex-core`
- keep `core::landlock` as the spawn wrapper and update direct callers
to use `codex_sandboxing::landlock`
## Testing
- `cargo test -p codex-sandboxing`
- `cargo test -p codex-core landlock`
- `cargo test -p codex-cli debug_sandbox`
- `just argument-comment-lint`
## Notes
- this is step 1 of the move plan aimed at minimizing per-PR diffs
- no re-exports or no-op proxy methods were added
Moves Code Mode to a new crate with no dependencies on codex. This
create encodes the code mode semantics that we want for lifetime,
mounting, tool calling.
The model-facing surface is mostly unchanged. `exec` still runs raw
JavaScript, `wait` still resumes or terminates a `cell_id`, nested tools
are still available through `tools.*`, and helpers like `text`, `image`,
`store`, `load`, `notify`, `yield_control`, and `exit` still exist.
The major change is underneath that surface:
- Old code mode was an external Node runtime.
- New code mode is an in-process V8 runtime embedded directly in Rust.
- Old code mode managed cells inside a long-lived Node runner process.
- New code mode manages cells in Rust, with one V8 runtime thread per
active `exec`.
- Old code mode used JSON protocol messages over child stdin/stdout plus
Node worker-thread messages.
- New code mode uses Rust channels and direct V8 callbacks/events.
This PR also fixes the two migration regressions that fell out of that
substrate change:
- `wait { terminate: true }` now waits for the V8 runtime to actually
stop before reporting termination.
- synchronous top-level `exit()` now succeeds again instead of surfacing
as a script error.
---
- `core/src/tools/code_mode/*` is now mostly an adapter layer for the
public `exec` / `wait` tools.
- `code-mode/src/service.rs` owns cell sessions and async control flow
in Rust.
- `code-mode/src/runtime/*.rs` owns the embedded V8 isolate and
JavaScript execution.
- each `exec` spawns a dedicated runtime thread plus a Rust
session-control task.
- helper globals are installed directly into the V8 context instead of
being injected through a source prelude.
- helper modules like `tools.js` and `@openai/code_mode` are synthesized
through V8 module resolution callbacks in Rust.
---
Also added a benchmark for showing the speed of init and use of a code
mode env:
```
$ cargo bench -p codex-code-mode --bench exec_overhead -- --samples 30 --warm-iterations 25 --tool-counts 0,32,128
Finished [`bench` profile [optimized]](https://doc.rust-lang.org/cargo/reference/profiles.html#default-profiles) target(s) in 0.18s
Running benches/exec_overhead.rs (target/release/deps/exec_overhead-008c440d800545ae)
exec_overhead: samples=30, warm_iterations=25, tool_counts=[0, 32, 128]
scenario tools samples warmups iters mean/exec p95/exec rssΔ p50 rssΔ max
cold_exec 0 30 0 1 1.13ms 1.20ms 8.05MiB 8.06MiB
warm_exec 0 30 1 25 473.43us 512.49us 912.00KiB 1.33MiB
cold_exec 32 30 0 1 1.03ms 1.15ms 8.08MiB 8.11MiB
warm_exec 32 30 1 25 509.73us 545.76us 960.00KiB 1.30MiB
cold_exec 128 30 0 1 1.14ms 1.19ms 8.30MiB 8.34MiB
warm_exec 128 30 1 25 575.08us 591.03us 736.00KiB 864.00KiB
memory uses a fresh-process max RSS delta for each scenario
```
---------
Co-authored-by: Codex <noreply@openai.com>
- emit a typed `thread/realtime/transcriptUpdated` notification from
live realtime transcript deltas
- expose that notification as flat `threadId`, `role`, and `text` fields
instead of a nested transcript array
- continue forwarding raw `handoff_request` items on
`thread/realtime/itemAdded`, including the accumulated
`active_transcript`
- update app-server docs, tests, and generated protocol schema artifacts
to match the delta-based payloads
---------
Co-authored-by: Codex <noreply@openai.com>
This adds a dummy v8-poc project that in Cargo links against our
prebuilt binaries and the ones provided by rusty_v8 for non musl
platforms. This demonstrates that we can successfully link and use v8 on
all platforms that we want to target.
In bazel things are slightly more complicated. Since the libraries as
published have libc++ linked in already we end up with a lot of double
linked symbols if we try to use them in bazel land. Instead we fall back
to building rusty_v8 and v8 from source (cached of course) on the
platforms we ship to.
There is likely some compatibility drift in the windows bazel builder
that we'll need to reconcile before we can re-enable them. I'm happy to
be on the hook to unwind that.
`CODEX_TEST_REMOTE_ENV` will make `test_codex` start the executor
"remotely" (inside a docker container) turning any integration test into
remote test.
### Preliminary /plugins TUI menu
- Adds a preliminary /plugins menu flow in both tui and tui_app_server.
- Fetches plugin list data asynchronously and shows loading/error/cached
states.
- Limits this first pass to the curated ChatGPT marketplace.
- Shows available plugins with installed/status metadata.
- Supports in-menu search over plugin display name, plugin id, plugin
name, and marketplace label.
- Opens a plugin detail view on selection, including summaries for
Skills, Apps, and MCP Servers, with back navigation.
### Testing
- Launch codex-cli with plugins enabled (`--enable plugins`).
- Run /plugins and verify:
- loading state appears first
- plugin list is shown
- search filters results
- selecting a plugin opens detail view, with a list of
skills/connectors/MCP servers for the plugin
- back action returns to the list.
- Verify disabled behavior by running /plugins without plugins enabled
(shows “Plugins are disabled” message).
- Launch with `--enable tui_app_server` (and plugins enabled) and repeat
the same /plugins flow; behavior should match.
- Split the feature system into a new `codex-features` crate.
- Cut `codex-core` and workspace consumers over to the new config and
warning APIs.
Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com>
Co-authored-by: Codex <noreply@openai.com>
- Move the auth implementation and token data into codex-login.
- Keep codex-core re-exporting that surface from codex-login for
existing callers.
---------
Co-authored-by: Codex <noreply@openai.com>
For each feature we have:
1. Trait exposed on environment
2. **Local Implementation** of the trait
3. Remote implementation that uses the client to proxy via network
4. Handler implementation that handles PRC requests and calls into
**Local Implementation**
- Move core/src/terminal.rs and its tests into a standalone
terminal-detection workspace crate.
- Update direct consumers to depend on codex-terminal-detection and
import terminal APIs directly.
---------
Co-authored-by: Codex <noreply@openai.com>
Stacked PR 2/3, based on the stub PR.
Adds the exec RPC implementation and process/event flow in exec-server
only.
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
Persist Stop-hook continuation prompts as `user` messages instead of
hidden `developer` messages + some requested integration tests
This is a followup to @pakrym 's comment in
https://github.com/openai/codex/pull/14532 to make sure stop-block
continuation prompts match training for turn loops
- Stop continuation now writes `<hook_prompt hook_run_id="...">stop
hook's user prompt<hook_prompt>`
- Introduces quick-xml dependency, though we already indirectly depended
on it anyway via syntect
- This PR only has about 500 lines of actual logic changes, the rest is
tests/schema
## Testing
Example run (with a sessionstart hook and 3 stop hooks) - this shows
context added by session start, then two stop hooks sending their own
additional prompts in a new turn. The model responds with a single
message addressing both. Then when that turn ends, the hooks detect that
they just ran using `stop_hook_active` and decide not to infinite loop
test files for this (unzip, move codex -> .codex):
[codex.zip](https://github.com/user-attachments/files/26075806/codex.zip)
```
› cats
• Running SessionStart hook: lighting the observatory
SessionStart hook (completed)
warning: Hi, I'm a session start hook for wizard-tower (startup).
hook context: A wimboltine stonpet is an exotic cuisine from hyperspace
• Cats are tiny zen wizards, my friend: equal parts nap, mystery, and chaos. If you want, we can talk
cat facts, cat breeds, cat names, or build something cat-themed in this repo.
• Running Stop hook: checking the tower wards
• Running Stop hook: sacking the guards
• Running Stop hook: hiring the guards
Stop hook (completed)
warning: Wizard Tower Stop hook reviewed the completed reply (177 chars).
Stop hook (blocked)
warning: Wizard Tower Stop hook continuing conversation
feedback: cook the stonpet
Stop hook (blocked)
warning: Wizard Tower Stop hook continuing conversation
feedback: eat the cooked stonpet
• Stonpet’s cooked, aloha style: flash-seared over a blue quasiflame, glazed with nebula salt, and
rested until the hyperspace juices settle.
Now we eat with gratitude, my friend. One mindful bite in, and the flavor is pure cosmic surf:
smoky, bright, and totally out of this dimension.
• Running Stop hook: checking the tower wards
• Running Stop hook: sacking the guards
• Running Stop hook: hiring the guards
Stop hook (completed)
warning: Wizard Tower Stop hook reviewed the completed reply (285 chars).
Stop hook (completed)
warning: Wizard Tower Stop hook saw a second pass and stayed calm to avoid a loop.
Stop hook (completed)
warning: Wizard Tower Stop hook saw a second pass and stayed calm to avoid a loop.
```
The idea is that codex-exec exposes an Environment struct with services
on it. Each of those is a trait.
Depending on construction parameters passed to Environment they are
either backed by local or remote server but core doesn't see these
differences.
Add a representation of the agent graph. This is now used for:
* Cascade close agents (when I close a parent, it close the kids)
* Cascade resume (oposite)
Later, this will also be used for post-compaction stuffing of the
context
Direct fix for: https://github.com/openai/codex/issues/14458
## Description
Dependent on:
- [responsesapi] https://github.com/openai/openai/pull/760991
- [codex-backend] https://github.com/openai/openai/pull/760985
`codex app-server -> codex-backend -> responsesapi` now reuses a
persistent websocket connection across many turns. This PR updates
tracing when using websockets so that each `response.create` websocket
request propagates the current tracing context, so we can get a holistic
end-to-end trace for each turn.
Tracing is propagated via special keys (`ws_request_header_traceparent`,
`ws_request_header_tracestate`) set in the `client_metadata` param in
Responses API.
Currently tracing on websockets is a bit broken because we only set
tracing context on ws connection time, so it's detached from a
`turn/start` request.
Summary
- delete the deprecated stdio transport plumbing from the exec server
stack
- add a basic `exec_server()` harness plus test utilities to start a
server, send requests, and await events
- refresh exec-server dependencies, configs, and documentation to
reflect the new flow
Testing
- Not run (not requested)
---------
Co-authored-by: starr-openai <starr@openai.com>
Co-authored-by: Codex <noreply@openai.com>
Stacked PR 1/3.
This is the initialize-only exec-server stub slice: binary/client
scaffolding and protocol docs, without exec/filesystem implementation.
---------
Co-authored-by: Codex <noreply@openai.com>
Cleanup image semantics in code mode.
`view_image` now returns `{image_url:string, details?: string}`
`image()` now allows both string parameter and `{image_url:string,
details?: string}`
Adds an environment crate and environment + file system abstraction.
Environment is a combination of attributes and services specific to
environment the agent is connected to:
File system, process management, OS, default shell.
The goal is to move most of agent logic that assumes environment to work
through the environment abstraction.
PR https://github.com/openai/codex/pull/14512 added an in-process app
server and started to wire up the tui to use it. We were originally
planning to modify the `tui` code in place, converting it to use the app
server a bit at a time using a hybrid adapter. We've since decided to
create an entirely new parallel `tui_app_server` implementation and do
the conversion all at once but retain the existing `tui` while we work
the bugs out of the new implementation.
This PR undoes the changes to the `tui` made in the PR #14512 and
restores the old initialization to its previous state. This allows us to
modify the `tui_app_server` without the risk of regressing the old `tui`
code. For example, we can start to remove support for all legacy core
events, like the ones that PR https://github.com/openai/codex/pull/14892
needed to ignore.
Testing:
* I manually verified that the old `tui` starts and shuts down without a
problem.
# Summary
This PR introduces the Windows sandbox runner IPC foundation that later
unified_exec work will build on.
The key point is that this is intentionally infrastructure-only. The new
IPC transport, runner plumbing, and ConPTY helpers are added here, but
the active elevated Windows sandbox path still uses the existing
request-file bootstrap. In other words, this change prepares the
transport and module layout we need for unified_exec without switching
production behavior over yet.
Part of this PR is also a source-layout cleanup: some Windows sandbox
files are moved into more explicit `elevated/`, `conpty/`, and shared
locations so it is clearer which code is for the elevated sandbox flow,
which code is legacy/direct-spawn behavior, and which helpers are shared
between them. That reorganization is intentional in this first PR so
later behavioral changes do not also have to carry a large amount of
file-move churn.
# Why This Is Needed For unified_exec
Windows elevated sandboxed unified_exec needs a long-lived,
bidirectional control channel between the CLI and a helper process
running under the sandbox user. That channel has to support:
- starting a process and reporting structured spawn success/failure
- streaming stdout/stderr back incrementally
- forwarding stdin over time
- terminating or polling a long-lived process
- supporting both pipe-backed and PTY-backed sessions
The existing elevated one-shot path is built around a request-file
bootstrap and does not provide those primitives cleanly. Before we can
turn on Windows sandbox unified_exec, we need the underlying runner
protocol and transport layer that can carry those lifecycle events and
streams.
# Why Windows Needs More Machinery Than Linux Or macOS
Linux and macOS can generally build unified_exec on top of the existing
sandbox/process model: the parent can spawn the child directly, retain
normal ownership of stdio or PTY handles, and manage the lifetime of the
sandboxed process without introducing a second control process.
Windows elevated sandboxing is different. To run inside the sandbox
boundary, we cross into a different user/security context and then need
to manage a long-lived process from outside that boundary. That means we
need an explicit helper process plus an IPC transport to carry spawn,
stdin, output, and exit events back and forth. The extra code here is
mostly that missing Windows sandbox infrastructure, not a conceptual
difference in unified_exec itself.
# What This PR Adds
- the framed IPC message types and transport helpers for parent <->
runner communication
- the renamed Windows command runner with both the existing request-file
bootstrap and the dormant IPC bootstrap
- named-pipe helpers for the elevated runner path
- ConPTY helpers and process-thread attribute plumbing needed for
PTY-backed sessions
- shared sandbox/process helpers that later PRs will reuse when
switching live execution paths over
- early file/module moves so later PRs can focus on behavior rather than
layout churn
# What This PR Does Not Yet Do
- it does not switch the active elevated one-shot path over to IPC yet
- it does not enable Windows sandbox unified_exec yet
- it does not remove the existing request-file bootstrap yet
So while this code compiles and the new path has basic validation, it is
not yet the exercised production path. That is intentional for this
first PR: the goal here is to land the transport and runner foundation
cleanly before later PRs start routing real command execution through
it.
# Follow-Ups
Planned follow-up PRs will:
1. switch elevated one-shot Windows sandbox execution to the new runner
IPC path
2. layer Windows sandbox unified_exec sessions on top of the same
transport
3. remove the legacy request-file path once the IPC-based path is live
# Validation
- `cargo build -p codex-windows-sandbox`
This PR replicates the `tui` code directory and creates a temporary
parallel `tui_app_server` directory. It also implements a new feature
flag `tui_app_server` to select between the two tui implementations.
Once the new app-server-based TUI is stabilized, we'll delete the old
`tui` directory and feature flag.
Add a protocol-level filesystem surface to the v2 app-server so Codex
clients can read and write files, inspect directories, and subscribe to
path changes without relying on host-specific helpers.
High-level changes:
- define the new v2 fs/readFile, fs/writeFile, fs/createDirectory,
fs/getMetadata, fs/readDirectory, fs/remove, fs/copy RPCs
- implement the app-server handlers, including absolute-path validation,
base64 file payloads, recursive copy/remove semantics
- document the API, regenerate protocol schemas/types, and add
end-to-end tests for filesystem operations, copy edge cases
Testing plan:
- validate protocol serialization and generated schema output for the
new fs request, response, and notification types
- run app-server integration coverage for file and directory CRUD paths,
metadata/readDirectory responses, copy failure modes, and absolute-path
validation
This PR is part of the effort to move the TUI on top of the app server.
In a previous PR, we introduced an in-process app server and moved
`exec` on top of it.
For the TUI, we want to do the migration in stages. The app server
doesn't currently expose all of the functionality required by the TUI,
so we're going to need to support a hybrid approach as we make the
transition.
This PR changes the TUI initialization to instantiate an in-process app
server and access its `AuthManager` and `ThreadManager` rather than
constructing its own copies. It also adds a placeholder TUI event
handler that will eventually translate app server events into TUI
events. App server notifications are accepted but ignored for now. It
also adds proper shutdown of the app server when the TUI terminates.
## Stacked PRs
This work is now effectively split across two steps:
- #14178: add custom CA support for browser and device-code login flows,
docs, and hermetic subprocess tests
- #14239: extend that shared custom CA handling across Codex HTTPS
clients and secure websocket TLS
Note: #14240 was merged into this branch while it was stacked on top of
this PR. This PR now subsumes that websocket follow-up and should be
treated as the combined change.
Builds on top of #14178.
## Problem
Custom CA support landed first in the login path, but the real
requirement is broader. Codex constructs outbound TLS clients in
multiple places, and both HTTPS and secure websocket paths can fail
behind enterprise TLS interception if they do not honor
`CODEX_CA_CERTIFICATE` or `SSL_CERT_FILE` consistently.
This PR broadens the shared custom-CA logic beyond login and applies the
same policy to websocket TLS, so the enterprise-proxy story is no longer
split between “HTTPS works” and “websockets still fail”.
## What This Delivers
Custom CA support is no longer limited to login. Codex outbound HTTPS
clients and secure websocket connections can now honor the same
`CODEX_CA_CERTIFICATE` / `SSL_CERT_FILE` configuration, so enterprise
proxy/intercept setups work more consistently end-to-end.
For users and operators, nothing new needs to be configured beyond the
same CA env vars introduced in #14178. The change is that more of Codex
now respects them, including websocket-backed flows that were previously
still using default trust roots.
I also manually validated the proxy path locally with mitmproxy using:
`CODEX_CA_CERTIFICATE=~/.mitmproxy/mitmproxy-ca-cert.pem
HTTPS_PROXY=http://127.0.0.1:8080 just codex`
with mitmproxy installed via `brew install mitmproxy` and configured as
the macOS system proxy.
## Mental model
`codex-client` is now the owner of shared custom-CA policy for outbound
TLS client construction. Reqwest callers start from the builder
configuration they already need, then pass that builder through
`build_reqwest_client_with_custom_ca(...)`. Websocket callers ask the
same module for a rustls client config when a custom CA bundle is
configured.
The env precedence is the same everywhere:
- `CODEX_CA_CERTIFICATE` wins
- otherwise fall back to `SSL_CERT_FILE`
- otherwise use system roots
The helper is intentionally narrow. It loads every usable certificate
from the configured PEM bundle into the appropriate root store and
returns either a configured transport or a typed error that explains
what went wrong.
## Non-goals
This does not add handshake-level integration tests against a live TLS
endpoint. It does not validate that the configured bundle forms a
meaningful certificate chain. It also does not try to force every
transport in the repo through one abstraction; it extends the shared CA
policy across the reqwest and websocket paths that actually needed it.
## Tradeoffs
The main tradeoff is centralizing CA behavior in `codex-client` while
still leaving adoption up to call sites. That keeps the implementation
additive and reviewable, but it means the rule "outbound Codex TLS that
should honor enterprise roots must use the shared helper" is still
partly enforced socially rather than by types.
For websockets, the shared helper only builds an explicit rustls config
when a custom CA bundle is configured. When no override env var is set,
websocket callers still use their ordinary default connector path.
## Architecture
`codex-client::custom_ca` now owns CA bundle selection, PEM
normalization, mixed-section parsing, certificate extraction, typed
CA-loading errors, and optional rustls client-config construction for
websocket TLS.
The affected consumers now call into that shared helper directly rather
than carrying login-local CA behavior:
- backend-client
- cloud-tasks
- RMCP client paths that use `reqwest`
- TUI voice HTTP paths
- `codex-core` default reqwest client construction
- `codex-api` websocket clients for both responses and realtime
websocket connections
The subprocess CA probe, env-sensitive integration tests, and shared PEM
fixtures also live in `codex-client`, which is now the actual owner of
the behavior they exercise.
## Observability
The shared CA path logs:
- which environment variable selected the bundle
- which path was loaded
- how many certificates were accepted
- when `TRUSTED CERTIFICATE` labels were normalized
- when CRLs were ignored
- where client construction failed
Returned errors remain user-facing and include the relevant env var,
path, and remediation hint. That same error model now applies whether
the failure surfaced while building a reqwest client or websocket TLS
configuration.
## Tests
Pure unit tests in `codex-client` cover env precedence and PEM
normalization behavior. Real client construction remains in subprocess
tests so the suite can control process env and avoid the macOS seatbelt
panic path that motivated the hermetic test split.
The subprocess coverage verifies:
- `CODEX_CA_CERTIFICATE` precedence over `SSL_CERT_FILE`
- fallback to `SSL_CERT_FILE`
- single-cert and multi-cert bundles
- malformed and empty-file errors
- OpenSSL `TRUSTED CERTIFICATE` handling
- CRL tolerance for well-formed CRL sections
The websocket side is covered by the existing `codex-api` / `codex-core`
websocket test suites plus the manual mitmproxy validation above.
---------
Co-authored-by: Ivan Zakharchanka <3axap4eHko@gmail.com>
Co-authored-by: Codex <noreply@openai.com>
## Stacked PRs
This work is split across three stacked PRs:
- #14178: add custom CA support for browser and device-code login flows,
docs, and hermetic subprocess tests
- #14239: broaden the shared custom CA path from login to other outbound
`reqwest` clients across Codex
- #14240: extend that shared custom CA handling to secure websocket TLS
so websocket connections honor the same CA env vars
Review order: #14178, then #14239, then #14240.
Supersedes #6864.
Thanks to @3axap4eHko for the original implementation and investigation
here. Although this version rearranges the code and history
significantly, the majority of the credit for this work belongs to them.
## Problem
Login flows need to work in enterprise environments where outbound TLS
is intercepted by an internal proxy or gateway. In those setups, system
root certificates alone are often insufficient to validate the OAuth and
device-code endpoints used during login. The change adds a
login-specific custom CA loading path, but the important contracts
around env precedence, PEM compatibility, test boundaries, and
probe-only workarounds need to be explicit so reviewers can understand
what behavior is intentional.
For users and operators, the behavior is simple: if login needs to trust
a custom root CA, set `CODEX_CA_CERTIFICATE` to a PEM file containing
one or more certificates. If that variable is unset, login falls back to
`SSL_CERT_FILE`. If neither is set, login uses system roots. Invalid or
empty PEM files now fail with an error that points back to those
environment variables and explains how to recover.
## What This Delivers
Users can now make Codex login work behind enterprise TLS interception
by pointing `CODEX_CA_CERTIFICATE` at a PEM bundle containing the
relevant root certificates. If that variable is unset, login falls back
to `SSL_CERT_FILE`, then to system roots.
This PR applies that behavior to both browser-based and device-code
login flows. It also makes login tolerant of the PEM shapes operators
actually have in hand: multi-certificate bundles, OpenSSL `TRUSTED
CERTIFICATE` labels, and bundles that include well-formed CRLs.
## Mental model
`codex-login` is the place where the login flows construct ad hoc
outbound HTTP clients. That makes it the right boundary for a narrow CA
policy: look for `CODEX_CA_CERTIFICATE`, fall back to `SSL_CERT_FILE`,
load every parseable certificate block in that bundle into a
`reqwest::Client`, and fail early with a clear user-facing error if the
bundle is unreadable or malformed.
The implementation is intentionally pragmatic about PEM input shape. It
accepts ordinary certificate bundles, multi-certificate bundles, OpenSSL
`TRUSTED CERTIFICATE` labels, and bundles that also contain CRLs. It
does not validate a certificate chain or prove a handshake; it only
constructs the root store used by login.
## Non-goals
This change does not introduce a general-purpose transport abstraction
for the rest of the product. It does not validate whether the provided
bundle forms a real chain, and it does not add handshake-level
integration tests against a live TLS server. It also does not change
login state management or OAuth semantics beyond ensuring the existing
flows share the same CA-loading rules.
## Tradeoffs
The main tradeoff is keeping this logic scoped to login-specific client
construction rather than lifting it into a broader shared HTTP layer.
That keeps the review surface smaller, but it also means future
login-adjacent code must continue to use `build_login_http_client()` or
it can silently bypass enterprise CA overrides.
The `TRUSTED CERTIFICATE` handling is also intentionally a local
compatibility shim. The rustls ecosystem does not currently accept that
PEM label upstream, so the code normalizes it locally and trims the
OpenSSL `X509_AUX` trailer bytes down to the certificate DER that
`reqwest` can consume.
## Architecture
`custom_ca.rs` is now the single place that owns login CA behavior. It
selects the CA file from the environment, reads it, normalizes PEM label
shape where needed, iterates mixed PEM sections with `rustls-pki-types`,
ignores CRLs, trims OpenSSL trust metadata when necessary, and returns
either a configured `reqwest::Client` or a typed error.
The browser login server and the device-code flow both call
`build_login_http_client()`, so they share the same trust-store policy.
Environment-sensitive tests run through the `login_ca_probe` helper
binary because those tests must control process-wide env vars and cannot
reliably build a real reqwest client in-process on macOS seatbelt runs.
## Observability
The custom CA path logs which environment variable selected the bundle,
which file path was loaded, how many certificates were accepted, when
`TRUSTED CERTIFICATE` labels were normalized, when CRLs were ignored,
and where client construction failed. Returned errors remain user-facing
and include the relevant path, env var, and remediation hint.
This gives enough signal for three audiences:
- users can see why login failed and which env/file caused it
- sysadmins can confirm which override actually won
- developers can tell whether the failure happened during file read, PEM
parsing, certificate registration, or final reqwest client construction
## Tests
Pure unit tests stay limited to env precedence and empty-value handling.
Real client construction lives in subprocess tests so the suite remains
hermetic with respect to process env and macOS sandbox behavior.
The subprocess tests verify:
- `CODEX_CA_CERTIFICATE` precedence over `SSL_CERT_FILE`
- fallback to `SSL_CERT_FILE`
- single-certificate and multi-certificate bundles
- malformed and empty-bundle errors
- OpenSSL `TRUSTED CERTIFICATE` handling
- CRL tolerance for well-formed CRL sections
The named PEM fixtures under `login/tests/fixtures/` are shared by the
tests so their purpose stays reviewable.
---------
Co-authored-by: Ivan Zakharchanka <3axap4eHko@gmail.com>
Co-authored-by: Codex <noreply@openai.com>