## Summary
- add unix:// app-server transport backed by the shared codex-uds crate
- reuse the websocket connection loop for axum and tungstenite-backed
streams
- add codex app-server proxy to bridge stdio clients to the control
socket
- tolerate Windows UDS backends that report a missing rendezvous path as
connection refused before binding
## Tests
- cargo test -p codex-app-server
control_socket_acceptor_forwards_websocket_text_messages_and_pings
- cargo test -p codex-app-server
- just fmt
- just fix -p codex-app-server
- git -c core.fsmonitor=false diff --check
Add a temporary internal `remote_plugin` feature flag that merges remote
marketplaces into `plugin/list` and routes `plugin/read` through the
remote APIs when needed, while keeping pure local marketplaces working
as before.
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
The device-key protocol needs an app-server implementation that keeps
local key operations behind the same request-processing boundary as
other v2 APIs.
app-server owns request dispatch, transport policy, documentation, and
JSON-RPC error shaping. `codex-device-key` owns key binding, validation,
platform provider selection, and signing mechanics. Keeping the adapter
thin makes the boundary easier to review and avoids moving local
key-management details into thread orchestration code.
## What changed
- Added `DeviceKeyApi` as the app-server adapter around
`DeviceKeyStore`.
- Converted protocol protection policies, payload variants, algorithms,
and protection classes to and from the device-key crate types.
- Encoded SPKI public keys and DER signatures as base64 protocol fields
(a sketch follows this list).
- Routed `device/key/create`, `device/key/public`, and `device/key/sign`
through `MessageProcessor`.
- Rejected remote transports before provider access while allowing local
`stdio` and in-process callers to reach the device-key API.
- Added stdio, in-process, and websocket tests for device-key validation
and transport policy.
- Documented the device-key methods in the app-server v2 method list.
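As a rough illustration of the base64 step above, here is a minimal
sketch using the `base64` crate; the struct and function are
hypothetical stand-ins, not the actual adapter surface:

```rust
use base64::Engine as _;
use base64::engine::general_purpose::STANDARD as BASE64;

/// Hypothetical protocol-side representation of a signing result:
/// raw DER bytes from the device-key crate become base64 strings.
struct SignResponse {
    /// SPKI-encoded public key, base64.
    public_key: String,
    /// DER-encoded signature, base64.
    signature: String,
}

fn to_protocol(spki_der: &[u8], signature_der: &[u8]) -> SignResponse {
    SignResponse {
        public_key: BASE64.encode(spki_der),
        signature: BASE64.encode(signature_der),
    }
}
```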
## Test coverage
- `device_key_create_rejects_empty_account_user_id`
- `in_process_allows_device_key_requests_to_reach_device_key_api`
- `device_key_methods_are_rejected_over_websocket`
## Stack
This is PR 3 of 4 in the device-key app-server stack. It is stacked on
#18429.
## Validation
- `cargo test -p codex-app-server device_key`
- `just fix -p codex-app-server`
## Why
Cloud-hosted sessions need a way for the service that starts or manages
a thread to provide session-owned config without treating all config as
if it came from the same user/project/workspace TOML stack.
The important boundary is ownership: some values should be controlled by
the session/orchestrator, some by the authenticated user, and later some
may come from the executor. The earlier broad config-store shape made
that boundary too fuzzy and overlapped heavily with the existing
filesystem-backed config loader. This PR starts with the smaller piece
we need now: a typed session config loader that can feed the existing
config layer stack while preserving the normal precedence and merge
behavior.
## What Changed
- Added `ThreadConfigLoader` and related typed payloads in
`codex-config` (a simplified sketch follows this list).
- `SessionThreadConfig` currently supports `model_provider`,
`model_providers`, and feature flags.
- `UserThreadConfig` is present as an ownership boundary, but does not
yet add TOML-backed fields.
- `NoopThreadConfigLoader` preserves existing behavior when no external
loader is configured.
- `StaticThreadConfigLoader` supports tests and simple callers.
- Taught thread config sources to produce ordinary `ConfigLayerEntry`
values so the existing `ConfigLayerStack` remains the place where
precedence and merging happen.
- Wired the loader through `ConfigBuilder`, the config loader, and
app-server startup paths so app-server can provide session-owned config
before deriving a thread config.
- Added coverage for:
- translating typed thread config into config layers,
- inserting thread config layers into the stack at the right precedence,
- applying session-provided model provider and feature settings when
app-server derives config from thread params.
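A minimal sketch of how these pieces fit together, with simplified
payloads and signatures (the real trait and types in `codex-config` are
richer):

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the session-owned config payload.
#[derive(Clone, Default)]
pub struct SessionThreadConfig {
    pub model_provider: Option<String>,
    pub features: BTreeMap<String, bool>,
}

/// Source of session-owned thread config, pluggable at startup.
pub trait ThreadConfigLoader: Send + Sync {
    fn load(&self) -> SessionThreadConfig;
}

/// Preserves existing behavior when no external loader is configured.
pub struct NoopThreadConfigLoader;

impl ThreadConfigLoader for NoopThreadConfigLoader {
    fn load(&self) -> SessionThreadConfig {
        SessionThreadConfig::default()
    }
}

/// Serves a fixed payload; useful for tests and simple callers.
pub struct StaticThreadConfigLoader(pub SessionThreadConfig);

impl ThreadConfigLoader for StaticThreadConfigLoader {
    fn load(&self) -> SessionThreadConfig {
        self.0.clone()
    }
}
```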
## Follow-Ups
This intentionally stops short of adding the remote/service transport.
The next pieces are expected to be:
1. Define the proto/API shape for this interface.
2. Add a client implementation that can source session config from the
service side.
## Verification
- Added unit coverage in `codex-config` for the loader and layer
conversion.
- Added `codex-core` config loader coverage for thread config layer
precedence.
- Added app-server coverage that verifies session thread config wins
over request-provided config for model provider and feature settings.
Split plugin loading, marketplace, and related infrastructure out of
core into `codex-core-plugins`, while keeping the core-facing
configuration and orchestration flow in `codex-core`.
---------
Co-authored-by: Codex <noreply@openai.com>
Builds on top of #17659
Move the filesystem and sqlite thread-listing operations into a local
`ThreadStore` implementation, and call `ThreadStore` from the places
that used to perform those filesystem/sqlite operations directly (see
the sketch below).
This is the first of a series of PRs that will implement the rest of the
local ThreadStore.
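A minimal sketch of that boundary, with hypothetical signatures (the
real trait carries more operations and richer types):

```rust
/// Hypothetical, simplified shape of the local thread store boundary:
/// callers see `StoredThread`, while `ThreadMetadata` stays a sqlite
/// implementation detail behind the trait.
#[async_trait::async_trait]
pub trait ThreadStore: Send + Sync {
    /// List stored threads, newest first, up to `limit`.
    async fn list_threads(&self, limit: usize) -> anyhow::Result<Vec<StoredThread>>;
}

pub struct StoredThread {
    pub id: String,
    pub updated_at: std::time::SystemTime,
}
```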
Testing:
- added unit tests for the thread store implementation
- adjusted some unit tests in the realtime + personality packages whose
callsites changed. Specifically, I'm trying to hide `ThreadMetadata`
inside the local implementation and make `ThreadMetadata` a sqlite
implementation detail rather than a public interface, preferring the
more general `StoredThread` interface instead
- added a corner case test for the personality migration package that
wasn't covered by the existing test suite
- adjusted the behavior of searched thread listing to run the existing
local rollout repair/backfill pass _before_ querying SQLite results, so
callers using `ThreadStore::list_threads` do not miss matches after a
partial metadata warm-up
Stacked on #16508.
This removes the temporary `codex-core` / `codex-login` re-export shims
from the ownership split and rewrites callsites to import directly from
`codex-model-provider-info`, `codex-models-manager`, `codex-api`,
`codex-protocol`, `codex-feedback`, and `codex-response-debug-context`.
No behavior change intended; this is the mechanical import cleanup layer
split out from the ownership move.
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
This finishes the config-type move out of `codex-core` by removing the
temporary compatibility shim in `codex_core::config::types`. Callers now
depend on `codex-config` directly, which keeps these config model types
owned by the config crate instead of re-expanding `codex-core` as a
transitive API surface.
## What Changed
- Removed the `codex-rs/core/src/config/types.rs` re-export shim and the
`core::config::ApprovalsReviewer` re-export.
- Updated `codex-core`, `codex-cli`, `codex-tui`, `codex-app-server`,
`codex-mcp-server`, and `codex-linux-sandbox` call sites to import
`codex_config::types` directly.
- Added explicit `codex-config` dependencies to downstream crates that
previously relied on the `codex-core` re-export.
- Regenerated `codex-rs/core/config.schema.json` after updating the
config docs path reference.
## Why
`codex-core` was re-exporting APIs owned by sibling `codex-*` crates,
which made downstream crates depend on `codex-core` as a proxy module
instead of the actual owner crate.
Removing those forwards makes crate boundaries explicit and lets leaf
crates drop unnecessary `codex-core` dependencies. In this PR, this
replaces the `codex-core` dependency with a direct `codex-login`
dependency in the following files:
```
codex-rs/backend-client/Cargo.toml
codex-rs/mcp-server/tests/common/Cargo.toml
```
## What
- Remove `codex-rs/core/src/lib.rs` re-exports for symbols owned by
`codex-login`, `codex-mcp`, `codex-rollout`, `codex-analytics`,
`codex-protocol`, `codex-shell-command`, `codex-sandboxing`,
`codex-tools`, and `codex-utils-path`.
- Delete the `default_client` forwarding shim in `codex-rs/core`.
- Update in-crate and downstream callsites to import directly from the
owning `codex-*` crate.
- Add direct Cargo dependencies where callsites now target the owner
crate, and remove `codex-core` from `codex-rs/backend-client`.
## Why
`codex-mcp` already owns the shared MCP API surface, including `auth`,
`McpConfig`, `CODEX_APPS_MCP_SERVER_NAME`, and tool-name helpers in
[`codex-rs/codex-mcp/src/mcp/mod.rs`](f61e85dbfb/codex-rs/codex-mcp/src/mcp/mod.rs (L1-L35)).
Re-exporting that surface from `codex_core::mcp` gives downstream crates
two import paths for the same API and hides the real crate dependency.
This PR keeps `codex_core::mcp` focused on the local `McpManager`
wrapper in
[`codex-rs/core/src/mcp.rs`](f61e85dbfb/codex-rs/core/src/mcp.rs (L13-L40))
and makes consumers import shared MCP APIs from `codex_mcp` directly.
## What
- Remove the `codex_mcp::mcp` re-export surface from `core/src/mcp.rs`.
- Update `codex-core` internals plus `codex-app-server`, `codex-cli`,
and `codex-tui` test code to import MCP APIs from `codex_mcp::mcp`
directly.
- Add explicit `codex-mcp` dependencies where those crates now use that
API surface, and refresh `Cargo.lock`.
## Verification
- `just bazel-lock-check`
- `cargo test -p codex-core -p codex-cli -p codex-tui`
- `codex-cli` passed.
- `codex-core` still fails five unrelated config tests in
`core/src/config/config_tests.rs` (`approvals_reviewer_*` and
`smart_approvals_alias_*`).
- A broader `cargo test -p codex-core -p codex-app-server -p codex-cli
-p codex-tui` run previously hung in `codex-app-server` test
`in_process_start_uses_requested_session_source_for_thread_start`.
## Why
`parse_tool_input_schema` and the supporting `JsonSchema` model were
living in `core/src/tools/spec.rs`, but they already serve callers
outside `codex-core`.
Keeping that shared schema parsing logic inside `codex-core` makes the
crate boundary harder to reason about and works against the guidance in
`AGENTS.md` to avoid growing `codex-core` when reusable code can live
elsewhere.
This change takes the first extraction step by moving the schema parsing
primitive into its own crate while keeping the rest of the tool-spec
assembly in `codex-core`.
## What changed
- added a new `codex-tools` crate under `codex-rs/tools`
- moved the shared tool input schema model and sanitizer/parser into
`tools/src/json_schema.rs` (a simplified sketch follows this list)
- kept `tools/src/lib.rs` exports-only, with the module-level unit tests
split into `json_schema_tests.rs`
- updated `codex-core` to use `codex-tools::JsonSchema` and re-export
`parse_tool_input_schema`
- updated `codex-app-server` dynamic tool validation to depend on
`codex-tools` directly instead of reaching through `codex-core`
- wired the new crate into the Cargo workspace and Bazel build graph
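For orientation, a simplified, self-contained sketch of the kind of
primitive that moved; this is not the actual `codex-tools`
implementation, and the error handling here is an assumption:

```rust
use serde::Deserialize;
use serde_json::Value;

/// Simplified stand-in for the shared tool input schema model; the real
/// `codex_tools::JsonSchema` is richer.
#[derive(Debug, Deserialize)]
pub struct JsonSchema {
    #[serde(rename = "type")]
    pub ty: String,
    #[serde(default)]
    pub properties: serde_json::Map<String, Value>,
    #[serde(default)]
    pub required: Vec<String>,
}

/// Hypothetical parse step: reject non-object schemas up front, then
/// deserialize into the typed model.
pub fn parse_tool_input_schema(raw: &Value) -> Result<JsonSchema, String> {
    if !raw.is_object() {
        return Err("tool input schema must be a JSON object".to_string());
    }
    serde_json::from_value(raw.clone()).map_err(|e| e.to_string())
}
```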
## Summary
This change adds websocket authentication at the app-server transport
boundary and enforces it before JSON-RPC `initialize`, so authenticated
deployments reject unauthenticated clients during the websocket
handshake rather than after a connection has already been admitted.
During rollout, websocket auth is opt-in for non-loopback listeners so
we do not break existing remote clients. If `--ws-auth ...` is
configured, the server enforces auth during websocket upgrade. If auth
is not configured, non-loopback listeners still start, but app-server
logs a warning and the startup banner calls out that auth should be
configured before real remote use.
The server supports two auth modes: a file-backed capability token, and
a standard HMAC-signed JWT/JWS bearer token verified with the
`jsonwebtoken` crate, with optional issuer, audience, and clock-skew
validation. Capability tokens are normalized, hashed, and compared in
constant time. Short shared secrets for signed bearer tokens are
rejected at startup. Requests carrying an `Origin` header are rejected
with `403` by transport middleware, and authenticated clients present
credentials as `Authorization: Bearer <token>` during websocket upgrade.
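As a sketch of the signed-bearer-token path, assuming HS256 and
hypothetical claim and parameter names, verification with the
`jsonwebtoken` crate looks roughly like this:

```rust
use jsonwebtoken::{decode, Algorithm, DecodingKey, Validation};
use serde::Deserialize;

#[derive(Deserialize)]
struct Claims {
    sub: String,
    exp: usize,
}

/// Verify an HMAC-signed bearer token with optional issuer/audience
/// checks and clock-skew leeway. Names here are illustrative.
fn verify_bearer(
    token: &str,
    secret: &[u8],
    issuer: Option<&str>,
    audience: Option<&str>,
    leeway_secs: u64,
) -> Result<Claims, jsonwebtoken::errors::Error> {
    let mut validation = Validation::new(Algorithm::HS256);
    validation.leeway = leeway_secs;
    if let Some(iss) = issuer {
        validation.set_issuer(&[iss]);
    }
    if let Some(aud) = audience {
        validation.set_audience(&[aud]);
    }
    let data = decode::<Claims>(token, &DecodingKey::from_secret(secret), &validation)?;
    Ok(data.claims)
}
```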
## Validation
- `cargo test -p codex-app-server transport::auth`
- `cargo test -p codex-cli app_server_`
- `cargo clippy -p codex-app-server --all-targets -- -D warnings`
- `just bazel-lock-check`
Note: in the broad `cargo test -p codex-app-server
connection_handling_websocket` run, the touched websocket auth cases
passed, but unrelated Unix shutdown tests failed with a timeout in this
environment.
---------
Co-authored-by: Eric Traut <etraut@openai.com>
- create `codex-git-utils` and move the shared git helpers into it with
file moves preserved for diff readability
- move the `GitInfo` helpers out of `core` so stacked rollout work can
depend on the shared crate without carrying its own git info module
---------
Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com>
Co-authored-by: Codex <noreply@openai.com>
## Summary
- move the pure sandbox policy transform helpers from `codex-core` into
`codex-sandboxing`
- move the corresponding unit tests with the extracted implementation
- update `core` and `app-server` callers to import the moved APIs
directly, without re-exports or proxy methods
## Testing
- cargo test -p codex-sandboxing
- cargo test -p codex-core sandboxing
- cargo test -p codex-app-server --lib
- just fix -p codex-sandboxing
- just fix -p codex-core
- just fix -p codex-app-server
- just fmt
- just argument-comment-lint
- Split the feature system into a new `codex-features` crate.
- Cut `codex-core` and workspace consumers over to the new config and
warning APIs.
Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com>
Co-authored-by: Codex <noreply@openai.com>
The idea is that codex-exec exposes an `Environment` struct with
services on it, each defined as a trait. Depending on the construction
parameters passed to `Environment`, those services are backed by either
a local or a remote server, but core does not see the difference.
Adds an environment crate plus environment and file system
abstractions. An `Environment` is a combination of attributes and
services specific to the environment the agent is connected to: file
system, process management, OS, and default shell. The goal is to move
most of the agent logic that makes assumptions about its environment to
work through the environment abstraction.
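A minimal sketch of that shape, with illustrative trait surfaces (the
real crate defines richer services):

```rust
use std::sync::Arc;

/// Illustrative service traits; the real crate defines richer surfaces.
pub trait FileSystem: Send + Sync {
    fn read_to_string(&self, path: &str) -> std::io::Result<String>;
}

pub trait ProcessManager: Send + Sync {
    fn spawn(&self, program: &str, args: &[String]) -> std::io::Result<u32>;
}

/// Environment bundles attributes and services; callers in core hold
/// trait objects and never learn whether the backing is local or remote.
pub struct Environment {
    pub os: String,
    pub default_shell: String,
    pub fs: Arc<dyn FileSystem>,
    pub processes: Arc<dyn ProcessManager>,
}
```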
Add a protocol-level filesystem surface to the v2 app-server so Codex
clients can read and write files, inspect directories, and subscribe to
path changes without relying on host-specific helpers.
High-level changes:
- define the new v2 fs/readFile, fs/writeFile, fs/createDirectory,
fs/getMetadata, fs/readDirectory, fs/remove, fs/copy RPCs
- implement the app-server handlers, including absolute-path validation,
base64 file payloads, and recursive copy/remove semantics (sketched
below)
- document the API, regenerate protocol schemas/types, and add
end-to-end tests for filesystem operations, copy edge cases
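A hedged sketch of the validation and payload handling an `fs/writeFile`
handler needs; the request fields here are hypothetical:

```rust
use base64::Engine as _;
use base64::engine::general_purpose::STANDARD as BASE64;
use std::path::Path;

/// Hypothetical request shape: clients send absolute paths and
/// base64-encoded file contents.
struct WriteFileParams {
    path: String,
    content_base64: String,
}

fn handle_write_file(params: &WriteFileParams) -> Result<(), String> {
    let path = Path::new(&params.path);
    // Reject relative paths before touching the filesystem.
    if !path.is_absolute() {
        return Err(format!("path must be absolute: {}", params.path));
    }
    let bytes = BASE64
        .decode(&params.content_base64)
        .map_err(|e| format!("invalid base64 payload: {e}"))?;
    std::fs::write(path, bytes).map_err(|e| e.to_string())
}
```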
Testing plan:
- validate protocol serialization and generated schema output for the
new fs request, response, and notification types
- run app-server integration coverage for file and directory CRUD paths,
metadata/readDirectory responses, copy failure modes, and absolute-path
validation
## Summary
This PR keeps app-server RPC request trace context alive for the full
lifetime of the work that request kicks off (e.g. for `thread/start`,
this is `app-server rpc handler -> tokio background task -> core op
submissions`). Previously we lost trace lineage once the request handler
returned or handed work off to background tasks.
This approach is especially relevant for `thread/start` and other RPC
handlers that run in a non-blocking way. In the near future we'll most
likely want to make all app-server handlers run in a non-blocking way by
default, and only queue operations that must operate in order (e.g.
thread RPCs per thread?), so we want to make sure tracing in app-server
just generally works.
Depends on https://github.com/openai/codex/pull/14300
**Before**
<img width="155" height="207" alt="image"
src="https://github.com/user-attachments/assets/c9487459-36f1-436c-beb7-fafeb40737af"
/>
**After**
<img width="299" height="337" alt="image"
src="https://github.com/user-attachments/assets/727392b2-d072-4427-9dc4-0502d8652dea"
/>
## What changed
- Keep request-scoped trace context around until we send the final
response or error, or the connection closes.
- Thread that trace context through detached `thread/start` work so
background startup stays attached to the originating request (see the
sketch after this list).
- Pass request trace context through to downstream core operations,
including:
- thread creation
- resume/fork flows
- turn submission
- review
- interrupt
- realtime conversation operations
- Add tracing tests that verify:
- remote W3C trace context is preserved for `thread/start`
- remote W3C trace context is preserved for `turn/start`
- downstream core spans stay under the originating request span
- request-scoped tracing state is cleaned up correctly
- Clean up shutdown behavior so detached background tasks and spawned
threads are drained before process exit.
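The core mechanic is keeping the request span attached across the
detached task; a minimal sketch with `tracing::Instrument` (illustrative,
not the exact app-server code):

```rust
use tracing::Instrument;

async fn handle_thread_start() {
    // Capture the request-scoped span before handing work off.
    let request_span = tracing::Span::current();
    tokio::spawn(
        async move {
            // Work done here stays a child of the originating request.
            tracing::info!("background thread startup");
        }
        .instrument(request_span),
    );
}
```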
## What changed
- This PR changes only the flaky test setup for
`turn_start_notify_payload_includes_initialize_client_name`.
- Instead of shelling out to `python3` to write the notify payload, the
test uses the first-party `codex-app-server-test-notify-capture` helper.
- The helper writes `notify.json` atomically and the test waits for the
file to exist before reading it.
## Why this fixes the flake
- The old test depended on an external Python interpreter being present
and behaving consistently on every CI runner.
- It also raced the file write: the test could observe the path before
the payload had been fully written, which produced partial reads and
intermittent assertion failures.
- Moving the write into a repo-owned helper removes the external
dependency, and atomic write-plus-wait makes the handoff deterministic.
## Scope
- Test-only change.
Healthcheck endpoints for the websocket server
- serve `GET /readyz` and `GET /healthz` from the same listener used for
`--listen ws://...`
- switch the websocket listener over to `axum` upgrade handling instead
of manual socket parsing (sketched below)
- add websocket transport coverage for the health endpoints and document
the new behavior
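A minimal sketch of serving both probes and the websocket upgrade from
one `axum` router; the route paths match this PR, everything else is
illustrative:

```rust
use axum::extract::ws::WebSocketUpgrade;
use axum::http::StatusCode;
use axum::response::IntoResponse;
use axum::routing::get;
use axum::Router;

async fn ws_handler(ws: WebSocketUpgrade) -> impl IntoResponse {
    // axum performs the websocket handshake; the closure gets the socket.
    ws.on_upgrade(|_socket| async move {
        // JSON-RPC frames would be read/written here.
    })
}

fn router() -> Router {
    Router::new()
        // Empty 200 responses, matching the curl output below.
        .route("/healthz", get(|| async { StatusCode::OK }))
        .route("/readyz", get(|| async { StatusCode::OK }))
        .route("/", get(ws_handler))
}
```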
Testing
- integration tests
- built and tested e2e
```
> curl -i http://127.0.0.1:9234/readyz
HTTP/1.1 200 OK
content-length: 0
date: Fri, 06 Mar 2026 19:20:23 GMT
> curl -i http://127.0.0.1:9234/healthz
HTTP/1.1 200 OK
content-length: 0
date: Fri, 06 Mar 2026 19:20:24 GMT
```
* Add the ability to stream stdin, stdout, and stderr
* Streaming of stdout and stderr has a configurable cap on the total
amount of transmitted bytes (with the ability to disable it)
* Add support for overriding environment variables
* Add the ability to terminate running applications (using
`command/exec/terminate`)
* Add TTY/PTY support, with the ability to resize the terminal (using
`command/exec/resize`)
This adds a first-class server request for MCP server elicitations:
`mcpServer/elicitation/request`.
Until now, MCP elicitation requests only showed up as a raw
`codex/event/elicitation_request` event from core. That made it hard for
v2 clients to handle elicitations using the same request/response flow
as other server-driven interactions (like shell and `apply_patch`
tools).
This also updates the underlying MCP elicitation request handling in
core to pass through the full MCP request (including URL and form data)
so we can expose it properly in app-server.
### Why not `item/mcpToolCall/elicitationRequest`?
MCP elicitations are related to MCP servers first, and only optionally
to a specific MCP tool call.
In the MCP protocol, elicitation is a server-to-client capability: the
server sends `elicitation/create`, and the client replies with an
elicitation result. RMCP models it that way as well.
In practice an elicitation is often triggered by an MCP tool call, but
not always.
### What changed
- add `mcpServer/elicitation/request` to the v2 app-server API
- translate core `codex/event/elicitation_request` events into the new
v2 server request
- map client responses back into `Op::ResolveElicitation` so the MCP
server can continue
- update app-server docs and generated protocol schema
- add an end-to-end app-server test that covers the full round trip
through a real RMCP elicitation flow
- The new test exercises a realistic case where an MCP tool call
triggers an elicitation, the app-server emits
`mcpServer/elicitation/request`, the client accepts it, and the tool
call resumes and completes successfully.
### app-server API flow
- Client starts a thread with `thread/start`.
- Client starts a turn with `turn/start`.
- App-server sends `item/started` for the `mcpToolCall`.
- While that tool call is in progress, app-server sends
`mcpServer/elicitation/request`.
- Client responds to that request with `{ action: "accept" | "decline" |
"cancel" }`.
- App-server sends `serverRequest/resolved`.
- App-server sends `item/completed` for the mcpToolCall.
- App-server sends `turn/completed`.
- If the turn is interrupted while the elicitation is pending,
app-server still sends `serverRequest/resolved` before the turn
finishes.
## Summary
- add the v2 `thread/metadata/update` API, including
protocol/schema/TypeScript exports and app-server docs
- patch stored thread `gitInfo` in sqlite without resuming the thread,
with validation plus support for explicit `null` clears
- repair missing sqlite thread rows from rollout data before patching,
and make those repairs safe by inserting only when absent and updating
only git columns so newer metadata is not clobbered
- keep sqlite authoritative for mutable thread git metadata by
preserving existing sqlite git fields during reconcile/backfill and only
using rollout `SessionMeta` git fields to fill gaps (sketched after this
list)
- add regression coverage for the endpoint, repair paths, concurrent
sqlite writes, clearing git fields, and rollout/backfill reconciliation
- fix the login server shutdown race so cancelling before the waiter
starts still terminates `block_until_done()` correctly
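The fill-gaps-only rule boils down to field-wise `Option::or`; a minimal
sketch with hypothetical field names:

```rust
#[derive(Default)]
struct GitInfo {
    branch: Option<String>,
    commit: Option<String>,
    repo_url: Option<String>,
}

/// Sqlite stays authoritative: rollout values only fill missing fields.
fn reconcile(sqlite: GitInfo, rollout: GitInfo) -> GitInfo {
    GitInfo {
        branch: sqlite.branch.or(rollout.branch),
        commit: sqlite.commit.or(rollout.commit),
        repo_url: sqlite.repo_url.or(rollout.repo_url),
    }
}
```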
## Testing
- `cargo test -p codex-state
apply_rollout_items_preserves_existing_git_branch_and_fills_missing_git_fields`
- `cargo test -p codex-state
update_thread_git_info_preserves_newer_non_git_metadata`
- `cargo test -p codex-core
backfill_sessions_preserves_existing_git_branch_and_fills_missing_git_fields`
- `cargo test -p codex-app-server thread_metadata_update`
- `cargo test`
- currently fails in existing `codex-core` grep-files tests with
`unsupported call: grep_files`:
- `suite::grep_files::grep_files_tool_collects_matches`
- `suite::grep_files::grep_files_tool_reports_empty_results`
## Summary
This removes the old app-server v1 methods and notifications we no
longer need, while keeping the small set the main codex app client still
depends on for now.
The remaining legacy surface is:
- `initialize`
- `getConversationSummary`
- `getAuthStatus`
- `gitDiffToRemote`
- `fuzzyFileSearch`
- `fuzzyFileSearch/sessionStart`
- `fuzzyFileSearch/sessionUpdate`
- `fuzzyFileSearch/sessionStop`
And the raw `codex/event/*` notifications emitted from core. These
notifications will be removed in a followup PR.
## What changed
- removed deprecated v1 request variants from the protocol and
app-server dispatcher
- removed deprecated typed notifications: `authStatusChange`,
`loginChatGptComplete`, and `sessionConfigured`
- updated the app-server test client to use v2 flows instead of deleted
v1 flows
- deleted legacy-only app-server test suites and added focused coverage
for `getConversationSummary`
- regenerated app-server schema fixtures and updated the MCP interface
docs to match the remaining compatibility surface
## Testing
- `just write-app-server-schema`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server`
## Summary
- write app-server SQLite logs at TRACE level when SQLite is enabled
- source app-server `/feedback` log attachments from SQLite for the
requested thread when available
- flush buffered SQLite log writes before `/feedback` queries them so
newly emitted events are not lost behind the async inserter
- include same-process threadless SQLite rows in those `/feedback` logs
so the attachment matches the process-wide feedback buffer more closely
- keep the existing in-memory ring buffer fallback unchanged, including
when the SQLite query returns no rows
## Details
- add a byte-bounded `query_feedback_logs` helper in `codex-state` so
`/feedback` does not fetch all rows before truncating (the selection
rule is sketched after this list)
- scope SQLite feedback logs to the requested thread plus threadless
rows from the same `process_uuid`
- format exported SQLite feedback lines with the log level prefix to
better match the in-memory feedback formatter
- add an explicit `LogDbLayer::flush()` control path and await it in
app-server before querying SQLite for feedback logs
- pass optional SQLite log bytes through `codex-feedback` as the
`codex-logs.log` attachment override
- leave TUI behavior unchanged apart from the updated `upload_feedback`
call signature
- add regression coverage for:
- newest-within-budget ordering
- excluding oversized newest rows
- including same-process threadless rows
- keeping the newest suffix across mixed thread and threadless rows
- matching the feedback formatter shape aside from span prefixes
- falling back to the in-memory snapshot when SQLite returns no logs
- flushing buffered SQLite rows before querying
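The byte-budget selection behind `query_feedback_logs` can be sketched
in memory like this (the real helper runs against SQLite, and names here
are illustrative):

```rust
/// Given log lines ordered newest-first, keep the newest lines that fit
/// within `budget_bytes`, then return them oldest-first for export.
fn newest_within_budget(lines_newest_first: Vec<String>, budget_bytes: usize) -> Vec<String> {
    let mut used = 0usize;
    let mut kept = Vec::new();
    for line in lines_newest_first {
        let cost = line.len() + 1; // account for the trailing newline
        if used + cost > budget_bytes {
            break; // oversized or over-budget rows are excluded
        }
        used += cost;
        kept.push(line);
    }
    kept.reverse();
    kept
}
```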
## Follow-up
- SQLite feedback exports still do not reproduce span prefixes like
`feedback-thread{thread_id=...}:`; there is a `TODO(ccunningham)` in
`codex-rs/state/src/log_db.rs` for that follow-up.
## Testing
- `cd codex-rs && cargo test -p codex-state`
- `cd codex-rs && cargo test -p codex-app-server`
- `cd codex-rs && just fmt`
### Overview
This PR adds the first piece of tracing for app-server JSON-RPC
requests.
There are two main changes:
- JSON-RPC requests can now take an optional W3C trace context at the
top level via a `trace` field (`traceparent` / `tracestate`).
- app-server now creates a dedicated request span for every inbound
JSON-RPC request in `MessageProcessor`, and uses the request-level trace
context as the parent when present.
For compatibility with existing flows, app-server still falls back to
the TRACEPARENT env var when there is no request-level traceparent.
This PR is intentionally scoped to the app-server boundary. In a
followup, we'll actually propagate trace context through the async
handoff into core execution spans like run_turn, which will make
app-server traces much more useful.
### Spans
A few details on the app-server span shape:
- each inbound request gets its own server span
- span/resource names are based on the JSON-RPC method (`initialize`,
`thread/start`, `turn/start`, etc.)
- spans record transport (stdio vs websocket), request id, connection
id, and client name/version when available (a sketch follows this list)
- `initialize` stores client metadata in session state so later requests
on the same connection can reuse it
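A rough sketch of such a request span; the span and field names are
illustrative:

```rust
use tracing::info_span;

fn request_span(method: &str, transport: &str, request_id: &str, conn_id: u64) -> tracing::Span {
    // One server span per inbound JSON-RPC request; the span/resource
    // name tracks the method, and metadata rides along as fields.
    info_span!(
        "app_server_request",
        rpc.method = %method,
        transport = %transport,
        request.id = %request_id,
        connection.id = conn_id,
    )
}
```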
## Why
`codex-rs/core/src/lib.rs` re-exported a broad set of types and modules
from `codex-protocol` and `codex-shell-command`. That made it easy for
workspace crates to import those APIs through `codex-core`, which in
turn hides dependency edges and makes it harder to reduce compile-time
coupling over time.
This change removes those public re-exports so call sites must import
from the source crates directly. Even when a crate still depends on
`codex-core` today, this makes dependency boundaries explicit and
unblocks future work to drop `codex-core` dependencies where possible.
## What Changed
- Removed public re-exports from `codex-rs/core/src/lib.rs` for:
- `codex_protocol::protocol` and related protocol/model types (including
`InitialHistory`)
- `codex_protocol::config_types` (`protocol_config_types`)
- `codex_shell_command::{bash, is_dangerous_command, is_safe_command,
parse_command, powershell}`
- Migrated workspace Rust call sites to import directly from:
- `codex_protocol::protocol`
- `codex_protocol::config_types`
- `codex_protocol::models`
- `codex_shell_command`
- Added explicit `Cargo.toml` dependencies (`codex-protocol` /
`codex-shell-command`) in crates that now import those crates directly.
- Kept `codex-core` internal modules compiling by using `pub(crate)`
aliases in `core/src/lib.rs` (internal-only, not part of the public
API).
- Updated the two utility crates that can already drop a `codex-core`
dependency edge entirely:
- `codex-utils-approval-presets`
- `codex-utils-cli`
## Verification
- `cargo test -p codex-utils-approval-presets`
- `cargo test -p codex-utils-cli`
- `cargo check --workspace --all-targets`
- `just clippy`
Hardens the codex-rs/app-server connection lifecycle and outbound
routing for websocket clients. Fixes some FUD I was having.
- Added per-connection disconnect signaling (CancellationToken) for
websocket transports.
- Split websocket handling into independent inbound/outbound tasks
coordinated by cancellation.
- Changed outbound routing so websocket connections use non-blocking
`try_send`; slow/full websocket writers are disconnected instead of
stalling broadcast delivery (sketched after this list).
- Kept stdio behavior blocking-on-send (no forced disconnect) so local
stdio clients are not dropped when queues are temporarily full.
- Simplified outbound router flow by removing deferred
pending_closed_connections handling.
- Added guards to drop incoming response/notification/error messages
from unknown connections.
- Fixed listener teardown race in thread listener tasks using a
listener_generation check so stale tasks do not clear newer listeners.
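A minimal sketch of the non-blocking send policy (illustrative message
type; the real router carries JSON-RPC messages):

```rust
use tokio::sync::mpsc;
use tokio_util::sync::CancellationToken;

/// Try to enqueue an outbound message for a websocket connection.
/// A full queue means the writer is too slow: disconnect it instead of
/// stalling broadcast delivery to every other connection.
fn send_to_websocket(
    tx: &mpsc::Sender<String>,
    cancel: &CancellationToken,
    msg: String,
) {
    match tx.try_send(msg) {
        Ok(()) => {}
        Err(mpsc::error::TrySendError::Full(_))
        | Err(mpsc::error::TrySendError::Closed(_)) => {
            // Signal the connection's inbound/outbound tasks to shut down.
            cancel.cancel();
        }
    }
}
```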
Fixes
https://linear.app/openai/issue/CODEX-4966/multiclient-handle-slow-notification-consumers
## Tests
Added/updated transport tests covering:
- broadcast does not block on a slow/full websocket connection
- stdio connection waits instead of disconnecting on full queue
I (maxj) have tested manually and will retest before landing
- add `LOG_FORMAT=json` support for app-server tracing logs via
`tracing_subscriber`'s built-in JSON formatter (sketched below)
- keep the default human-readable format unchanged and keep `RUST_LOG`
filtering behavior
- document the env var and update lockfile
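A minimal sketch of the formatter switch; the env var names are from
this PR, the wiring is illustrative:

```rust
use tracing_subscriber::EnvFilter;

fn init_logging() {
    // RUST_LOG keeps controlling the filter in both formats.
    let filter = EnvFilter::from_default_env();
    if std::env::var("LOG_FORMAT").as_deref() == Ok("json") {
        tracing_subscriber::fmt()
            .json()
            .with_env_filter(filter)
            .init();
    } else {
        tracing_subscriber::fmt().with_env_filter(filter).init();
    }
}
```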
Reapply "Add app-server transport layer with websocket support" with
additional fixes from https://github.com/openai/codex/pull/11313/changes
to avoid deadlocking.
This reverts commit 47356ff83c.
## Summary
To avoid deadlocking when queues are full, we maintain separate tokio
tasks dedicated to incoming vs outgoing event handling
- split the app-server main loop into two tasks in
`run_main_with_transport`
- inbound handling (`transport_event_rx`)
- outbound handling (`outgoing_rx` + `thread_created_rx`)
- separate incoming and outgoing websocket tasks
## Validation
Integration tests, testing thoroughly e2e in codex app w/ >10 concurrent
requests
<img width="1365" height="979" alt="Screenshot 2026-02-10 at 2 54 22 PM"
src="https://github.com/user-attachments/assets/47ca2c13-f322-4e5c-bedd-25859cbdc45f"
/>
---------
Co-authored-by: jif-oai <jif@openai.com>
We are removing feature-gated shared crates from the `codex-rs`
workspace. `codex-common` grouped several unrelated utilities behind
`[features]`, which made dependency boundaries harder to reason about
and worked against the ongoing effort to eliminate feature flags from
workspace crates.
Splitting these utilities into dedicated crates under `utils/` aligns
this area with existing workspace structure and keeps each dependency
explicit at the crate boundary.
## What changed
- Removed `codex-rs/common` (`codex-common`) from workspace members and
workspace dependencies.
- Added six new utility crates under `codex-rs/utils/`:
- `codex-utils-cli`
- `codex-utils-elapsed`
- `codex-utils-sandbox-summary`
- `codex-utils-approval-presets`
- `codex-utils-oss`
- `codex-utils-fuzzy-match`
- Migrated the corresponding modules out of `codex-common` into these
crates (with tests), and added matching `BUILD.bazel` targets.
- Updated direct consumers to use the new crates instead of
`codex-common`:
- `codex-rs/cli`
- `codex-rs/tui`
- `codex-rs/exec`
- `codex-rs/app-server`
- `codex-rs/mcp-server`
- `codex-rs/chatgpt`
- `codex-rs/cloud-tasks`
- Updated workspace lockfile entries to reflect the new dependency graph
and removal of `codex-common`.
- Adds --listen <URL> to codex app-server with two listen modes:
- stdio:// (default, existing behavior)
- ws://IP:PORT (new websocket transport)
- Refactors message routing to be connection-aware:
- Tracks per-connection session state (initialize/experimental
capability)
- Routes responses/errors to the originating connection
- Broadcasts server notifications/requests to initialized connections
- Updates initialization semantics to be per connection (not
process-global), and updates app-server docs accordingly.
- Adds websocket accept/read/write handling (JSON-RPC per text frame,
ping/pong handling, connection lifecycle events).
Testing
- Unit tests for transport URL parsing and targeted response/error
routing.
- New websocket integration test validating:
- per-connection initialization requirements
- no cross-connection response leakage
- same request IDs on different connections route independently.
We started working with MCP in Codex before
https://crates.io/crates/rmcp was mature, so we had our own crate for
MCP types that was generated from the MCP schema:
8b95d3e082/codex-rs/mcp-types/README.md
Now that `rmcp` is more mature, it makes more sense to use their MCP
types in Rust, as they handle details (like the `_meta` field) that our
custom version ignored. Though one advantage that our custom types had
is that our generated types implemented `JsonSchema` and `ts_rs::TS`,
whereas the types in `rmcp` do not. As such, part of the work of this PR
is leveraging the adapters between `rmcp` types and the serializable
types that are API for us (app server and MCP) introduced in #10356.
Note this PR results in a number of changes to
`codex-rs/app-server-protocol/schema`, which merit special attention
during review. We must ensure that these changes are still
backwards-compatible, which is possible because we have:
```diff
- export type CallToolResult = { content: Array<ContentBlock>, isError?: boolean, structuredContent?: JsonValue, };
+ export type CallToolResult = { content: Array<JsonValue>, structuredContent?: JsonValue, isError?: boolean, _meta?: JsonValue, };
```
so `ContentBlock` has been replaced with the more general `JsonValue`.
Note that `ContentBlock` was defined as:
```typescript
export type ContentBlock = TextContent | ImageContent | AudioContent | ResourceLink | EmbeddedResource;
```
so the deletion of those individual variants should not be a cause of
great concern.
Similarly, we have the following change in
`codex-rs/app-server-protocol/schema/typescript/Tool.ts`:
```diff
- export type Tool = { annotations?: ToolAnnotations, description?: string, inputSchema: ToolInputSchema, name: string, outputSchema?: ToolOutputSchema, title?: string, };
+ export type Tool = { name: string, title?: string, description?: string, inputSchema: JsonValue, outputSchema?: JsonValue, annotations?: JsonValue, icons?: Array<JsonValue>, _meta?: JsonValue, };
```
so:
- `annotations?: ToolAnnotations` ➡️ `JsonValue`
- `inputSchema: ToolInputSchema` ➡️ `JsonValue`
- `outputSchema?: ToolOutputSchema` ➡️ `JsonValue`
and two new fields: `icons?: Array<JsonValue>, _meta?: JsonValue`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10349).
* #10357
* __->__ #10349
* #10356
This enables a new use case where `codex app-server` is embedded into a
parent application that will directly own the user's ChatGPT auth
lifecycle, which means it owns the user’s auth tokens and refreshes them
when necessary. The parent application would just want a way to pass in
the auth tokens for codex to use directly.
The idea is that we are introducing a new "auth mode" currently only
exposed via app server: **`chatgptAuthTokens`** which consist of the
`id_token` (stores account metadata) and `access_token` (the bearer
token used directly for backend API calls). These auth tokens are only
stored in-memory. This new mode is in addition to the existing `apiKey`
and `chatgpt` auth modes.
This PR reuses the shape of our existing app-server account APIs as much
as possible:
- Update `account/login/start` with a new `chatgptAuthTokens` variant,
which will allow the client to pass in the tokens and have codex
app-server use them directly. Upon success, the server emits
`account/login/completed` and `account/updated` notifications.
- A new server->client request called
`account/chatgptAuthTokens/refresh` which the server can use whenever
the access token previously passed in has expired and it needs a new one
from the parent application.
I leveraged the core 401 retry loop which typically triggers auth token
refreshes automatically, but made it pluggable:
- **chatgpt** mode refreshes internally, as usual.
- **chatgptAuthTokens** mode calls the client via
`account/chatgptAuthTokens/refresh`, the client responds with updated
tokens, codex updates its in-memory auth, then retries. This RPC has a
10s timeout and handles JSON-RPC errors from the client (the refresh
seam is sketched at the end of this section).
Also, some additional things:
- chatgpt logins are blocked while external auth is active (you have to
log out first; typically clients will pick one or the other, not support
both)
- `account/logout` clears external auth in memory
- Ensures that if `forced_chatgpt_workspace_id` is set via the user's
config, we respect it in both:
- `account/login/start` with `chatgptAuthTokens` (returns a JSON-RPC
error back to the client)
- `account/chatgptAuthTokens/refresh` (fails the turn, and on next
request app-server will send another `account/chatgptAuthTokens/refresh`
request to the client).
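A hedged sketch of the pluggable refresh seam in the 401 retry loop; the
trait and all names here are hypothetical:

```rust
#[derive(Clone)]
pub struct ChatgptAuthTokens {
    pub id_token: String,
    pub access_token: String,
}

/// How the retry loop obtains fresh tokens differs by auth mode:
/// `chatgpt` refreshes internally; `chatgptAuthTokens` asks the client
/// over `account/chatgptAuthTokens/refresh`.
#[async_trait::async_trait]
pub trait TokenRefresher: Send + Sync {
    async fn refresh(&self) -> anyhow::Result<ChatgptAuthTokens>;
}

async fn call_with_retry<F, Fut>(
    refresher: &dyn TokenRefresher,
    mut access_token: String,
    mut call: F,
) -> anyhow::Result<reqwest::Response>
where
    F: FnMut(String) -> Fut,
    Fut: std::future::Future<Output = anyhow::Result<reqwest::Response>>,
{
    for attempt in 0..2 {
        let resp = call(access_token.clone()).await?;
        if resp.status() != reqwest::StatusCode::UNAUTHORIZED || attempt == 1 {
            return Ok(resp);
        }
        // On 401, ask the mode-specific refresher for new tokens and retry.
        access_token = refresher.refresh().await?.access_token;
    }
    unreachable!("the loop returns by the second attempt")
}
```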
In order to make Codex work with connectors, we add a built-in gateway
MCP that acts as a transparent proxy between the client and the
connectors. The gateway MCP collects actions that are accessible to the
user and sends them down to the user; when a connector action is chosen
to be called, the client invokes the action through the gateway MCP as
well.
- [x] Add the system built-in gateway MCP to list and run connectors.
- [x] Add the app server methods and protocol
This PR introduces a `codex-utils-cargo-bin` utility crate that
wraps/replaces our use of `assert_cmd::Command` and
`escargot::CargoBuild`.
As you can infer from the introduction of `buck_project_root()` in this
PR, I am attempting to make it possible to build Codex under
[Buck2](https://buck2.build) as well as `cargo`. With Buck2, I hope to
achieve faster incremental local builds (largely due to Buck2's
[dice](https://buck2.build/docs/insights_and_knowledge/modern_dice/)
build strategy, plus benefits from its local build daemon) as well as
faster CI builds if we invest in remote execution and caching.
See
https://buck2.build/docs/getting_started/what_is_buck2/#why-use-buck2-key-advantages
for more details about the performance advantages of Buck2.
Buck2 enforces stronger requirements in terms of build and test
isolation. It discourages assumptions about absolute paths (which is key
to enabling remote execution). Because the `CARGO_BIN_EXE_*` environment
variables that Cargo provides are absolute paths (which
`assert_cmd::Command` reads), this is a problem for Buck2, which is why
we need this `codex-utils-cargo-bin` utility.
My WIP-Buck2 setup sets the `CARGO_BIN_EXE_*` environment variables
passed to a `rust_test()` build rule as relative paths.
`codex-utils-cargo-bin` will resolve these values to absolute paths,
when necessary.
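A hedged sketch of that resolution rule (helper name hypothetical):

```rust
use std::path::{Path, PathBuf};

/// Resolve a `CARGO_BIN_EXE_*` value to an absolute path. Cargo always
/// provides absolute paths; the WIP Buck2 setup provides paths relative
/// to the project root, so those get joined against it.
fn resolve_bin_exe(value: &str, project_root: &Path) -> PathBuf {
    let p = Path::new(value);
    if p.is_absolute() {
        p.to_path_buf()
    } else {
        project_root.join(p)
    }
}
```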
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/8496).
* #8498
* __->__ #8496
This attempts to tighten up the types related to "config layers."
Currently, `ConfigLayerEntry` is defined as follows:
bef36f4ae7/codex-rs/core/src/config_loader/state.rs (L19-L25)
but the `source` field is a bit of a lie, as:
- for `ConfigLayerName::Mdm`, it is
`"com.openai.codex/config_toml_base64"`
- for `ConfigLayerName::SessionFlags`, it is `"--config"`
- for `ConfigLayerName::User`, it is `"config.toml"` (just the file
name, not the path to the `config.toml` on disk that was read)
- for `ConfigLayerName::System`, it seems like it is usually
`/etc/codex/managed_config.toml` in practice, though on Windows, it is
`%CODEX_HOME%/managed_config.toml`:
bef36f4ae7/codex-rs/core/src/config_loader/layer_io.rs (L84-L101)
All that is to say, in three of the four `ConfigLayerName` variants,
`source` is a `PathBuf` that is not an absolute path (or even a true
path).
This PR tries to uplevel things by eliminating `source` from
`ConfigLayerEntry` and turning `ConfigLayerName` into a disjoint union
named `ConfigLayerSource` that has the appropriate metadata for each
variant, favoring the use of `AbsolutePathBuf` where appropriate:
```rust
pub enum ConfigLayerSource {
/// Managed preferences layer delivered by MDM (macOS only).
#[serde(rename_all = "camelCase")]
#[ts(rename_all = "camelCase")]
Mdm { domain: String, key: String },
/// Managed config layer from a file (usually `managed_config.toml`).
#[serde(rename_all = "camelCase")]
#[ts(rename_all = "camelCase")]
System { file: AbsolutePathBuf },
/// Session-layer overrides supplied via `-c`/`--config`.
SessionFlags,
/// User config layer from a file (usually `config.toml`).
#[serde(rename_all = "camelCase")]
#[ts(rename_all = "camelCase")]
User { file: AbsolutePathBuf },
}
```