## Summary
`cargo test` has entails both running standard Rust tests and doctests.
It turns out that the doctest discovery is fairly slow, and it's a cost
you pay even for crates that don't include any doctests.
This PR disables doctests with `doctest = false` for crates that lack
any doctests.
For the collection of crates below, this speeds up test execution by
>4x.
E.g., before this PR:
```
Benchmark 1: cargo test -p codex-utils-absolute-path -p codex-utils-cache -p codex-utils-cli -p codex-utils-home-dir -p codex-utils-output-truncation -p codex-utils-path -p codex-utils-string -p codex-utils-template -p codex-utils-elapsed -p codex-utils-json-to-toml
Time (mean ± σ): 1.849 s ± 4.455 s [User: 0.752 s, System: 1.367 s]
Range (min … max): 0.418 s … 14.529 s 10 runs
```
And after:
```
Benchmark 1: cargo test -p codex-utils-absolute-path -p codex-utils-cache -p codex-utils-cli -p codex-utils-home-dir -p codex-utils-output-truncation -p codex-utils-path -p codex-utils-string -p codex-utils-template -p codex-utils-elapsed -p codex-utils-json-to-toml
Time (mean ± σ): 428.6 ms ± 6.9 ms [User: 187.7 ms, System: 219.7 ms]
Range (min … max): 418.0 ms … 436.8 ms 10 runs
```
For a single crate, with >2x speedup, before:
```
Benchmark 1: cargo test -p codex-utils-string
Time (mean ± σ): 491.1 ms ± 9.0 ms [User: 229.8 ms, System: 234.9 ms]
Range (min … max): 480.9 ms … 512.0 ms 10 runs
```
And after:
```
Benchmark 1: cargo test -p codex-utils-string
Time (mean ± σ): 213.9 ms ± 4.3 ms [User: 112.8 ms, System: 84.0 ms]
Range (min … max): 206.8 ms … 221.0 ms 13 runs
```
Co-authored-by: Codex <noreply@openai.com>
## Summary
This PR removes the synthetic `HashMap<String, ToolInfo>` keys from MCP
tool discovery. `McpConnectionManager::list_all_tools()` now returns
normalized `Vec<ToolInfo>`, and downstream code derives identity from
`ToolInfo::canonical_tool_name()`.
The motivation is to keep model-visible tool identity on
`ToolName`/`ToolInfo` instead of parallel string map keys, so future
namespace changes do not have to preserve otherwise-unused lookup keys.
## Changes
- Rename the MCP normalization path from `qualify_tools` to
`normalize_tools_for_model` and return tool values directly.
- Flow MCP tool lists through connectors, plugin injection, router/spec
building, code mode, and tool search as vectors/slices.
- Keep direct/deferred subtraction local to `mcp_tool_exposure`, using
`ToolName` values.
- Update tests to compare `ToolName` instances where MCP identity
matters.
## Validation
- `cargo test -p codex-mcp test_normalize_tools`
- `cargo test -p codex-core mcp_tool_exposure`
- `cargo test -p codex-core
direct_mcp_tools_register_namespaced_handlers`
- `cargo test -p codex-core
search_tool_registers_namespaced_mcp_tool_aliases`
- `just fix -p codex-mcp`
- `just fix -p codex-core`
## DISCLAIMER
This is experimental and no production service must rely on this
## Why
Built-in MCPs are product-owned runtime capabilities, but they were
previously flattened into the same config-backed stdio path as
user-configured servers. That made them depend on a hidden `codex
builtin-mcp` re-exec path, exposed them through config-oriented CLI
flows, and erased distinctions the runtime needs to preserve—most
notably whether an MCP call should count as external context for
memory-mode pollution.
## What changed
- Model product-owned built-ins separately from config-backed MCP
servers via `BuiltinMcpServer` and `EffectiveMcpServer`.
- Launch built-ins in process through a reusable async transport instead
of the hidden `builtin-mcp` stdio subcommand.
- Keep config-oriented CLI operations such as `codex mcp
list/get/login/logout` scoped to configured servers, while merging
built-ins only into the effective runtime server set.
- Retain server metadata after launch so parallel-tool support and
context classification come from the live server set; built-in
`memories` is now classified as local Codex state rather than external
context.
## Test plan
- `cargo test -p codex-mcp`
- `cargo test -p codex-core --test suite
builtin_memories_mcp_call_does_not_mark_thread_memory_mode_polluted_when_configured`
---------
Co-authored-by: Codex <noreply@openai.com>
# Motivation
Browser Use origin-access prompts are MCP elicitations, not direct
tool-call approval prompts, so they were bypassing the Guardian approval
path. We need a generic opt-in that lets eligible MCP elicitations use
Guardian when the current turn already routes approvals there.
# Description
Add a generic elicitation reviewer hook in codex-mcp and wire codex-core
to pass a Guardian reviewer callback when creating the MCP connection
manager. The reviewer validates explicit mcp_tool_call opt-in metadata,
builds a Guardian MCP tool-call review request from
server/tool/connector metadata and tool params, and maps Guardian
approval, denial, timeout, and cancellation decisions back to MCP
elicitation responses.
The new option to trigger this in the `_meta` object is:
```
"codex_request_type": "approval_request",
```
# Testing
- RUST_MIN_STACK=8388608 NEXTEST_STATUS_LEVEL=leak cargo nextest run
--no-fail-fast --cargo-profile ci-test --test-threads 2
- cargo clippy --tests -- -D warnings
- cargo fmt -- --config imports_granularity=Item --check
- cargo shear
- pnpm run format
- python3 .github/scripts/verify_cargo_workspace_manifests.py
- python3 .github/scripts/verify_tui_core_boundary.py
- python3 .github/scripts/verify_bazel_clippy_lints.py
- git diff --check
## Why
The core `Op::ListMcpTools` request path is no longer needed. Keeping it
around left a dead request/response surface alongside the app-server MCP
inventory APIs that own current server status listing.
## What Changed
- Removed `Op::ListMcpTools`, `EventMsg::McpListToolsResponse`, and the
core handler that built the MCP snapshot response.
- Removed the now-unused `codex-mcp` snapshot wrapper/export and passive
event handling arms in rollout and MCP-server consumers.
- Updated tests that used the old op as a synchronization hook to wait
on existing startup/skills events, and deleted the plugin test that only
exercised the removed listing op.
## Validation
- `cargo test -p codex-protocol`
- `cargo test -p codex-mcp`
- `cargo test -p codex-rollout -p codex-rollout-trace -p
codex-mcp-server`
- `cargo test -p codex-core --test all
pending_input::queued_inter_agent_mail`
- `cargo test -p codex-core --test all
rmcp_client::stdio_mcp_tool_call_includes_sandbox_state_meta`
- `cargo test -p codex-core --test all
rmcp_client::stdio_image_responses`
- `just fix -p codex-core -p codex-protocol -p codex-mcp -p
codex-rollout -p codex-rollout-trace -p codex-mcp-server`
## Summary
Xcode 26.4 was built against app-server behavior from before MCP
elicitation requests became client-visible in CLI 0.120.0 via #17043.
That client line does not expect the new events/messages, so this PR
restores the old behavior for exactly that client/version combination.
The compatibility handling stays in the app-server layer: when the
initialized client is `Xcode` and its version starts with `26.4`, the
app server marks the live Codex thread so MCP elicitations are
auto-denied. The flag is applied on thread start/resume/fork/turn
attachment, carried through `Codex`/`CodexThread`, and stored on
`McpConnectionManager` so refreshed MCP managers preserve the behavior.
## Notes
This is intentionally narrow and includes a TODO to remove the
compatibility path once Xcode 26.4 ages out.
## Why
MCP servers can provide `instructions` that explain what their tools are
for. Directly exposed MCP namespaces already use those instructions when
a connector description is not available, but deferred `tool_search`
results did not preserve that fallback. The direct path falls back from
connector metadata to server instructions, while the deferred path only
carried `connector_description` and otherwise fell back to generic
namespace text.
That meant a plain MCP server could provide useful model-facing guidance
and still appear as `Tools in the X namespace.` whenever it was
discovered lazily through `tool_search`.
## What changed
- Store one model-facing `namespace_description` on `ToolInfo`, using
connector descriptions for connector-backed tools and server
instructions for plain MCP servers.
- Thread that namespace description through the `tool_search` source
list, search indexing, and returned namespace metadata.
- Add an end-to-end regression test for deferred non-app MCP search
results exposing server instructions as the namespace description.
## Verification
- `cargo test -p codex-tools
search_tool_description_lists_each_mcp_source_once --lib`
- `cargo test -p codex-core --test all
tool_search_uses_non_app_mcp_server_instructions_as_namespace_description`
## Summary
- Treat `approval_mode = "approve"` as skip-review across all permission
modes.
- Remove the mode-specific split in the MCP auto-approval gate so
approved tools bypass review consistently.
- Expand regression coverage in the shared MCP helper and the core
tool-call flow.
## Testing
- `just fmt`
- `cargo test -p codex-mcp`
- `cargo test -p codex-core
approve_mode_skips_arc_and_guardian_in_every_permission_mode`
- `git diff --check`
- Full `cargo test -p codex-core` was also attempted, but the suite hit
an unrelated pre-existing stack overflow in an existing multi-agent test
# Why
Codex currently negotiates MCP `2025-06-18`, where the client
elicitation capability is represented as an empty object. We were still
serializing `capabilities.elicitation.form`, which belongs to the later
capability shape and can cause strict `2025-06-18` servers to reject
`initialize` with an unrecognized-field error.
This keeps the handshake aligned with the protocol version Codex
actually negotiates and fixes the compatibility regression tracked in
#17492.
# What
- Serialize the client elicitation capability as `elicitation: {}` for
`2025-06-18`.
- Keep elicitation advertised for both Codex Apps and custom MCP
servers.
- Tighten regression coverage so the unit test asserts both the Rust
value and the serialized wire shape.
- Add an app-server integration test that round-trips a form elicitation
from a custom MCP server; the existing connector round-trip continues to
cover the connector path.
# Verification
- `cargo test -p codex-mcp`
- `cargo test -p codex-app-server mcp_server_elicitation_round_trip`
- `cargo test -p codex-app-server
mcp_server_tool_call_round_trips_elicitation`
# Next steps
- Decide whether `tool_call_mcp_elicitation=false` should also suppress
capability advertisement during `initialize`.
- Revisit `form` / `url` capability advertisement when Codex is ready to
negotiate MCP `2025-11-25`, which defines that newer shape.
## Why
When an MCP or app tool is configured with approval mode `approve`
(always allow), users expect that decision to be authoritative. In
guardian auto-review mode, ARC could still return `ask-user`, which then
routed the approval question into guardian with the ARC reason as
context. That meant a tool explicitly configured as always allowed still
went through both safety monitors before running.
This change keeps the existing ARC behavior for non-auto-review
sessions, but avoids the ARC-to-guardian sequence when
`approvals_reviewer = auto_review` and the tool approval mode is
`approve`.
## What changed
- Short-circuit MCP tool approval handling when `approval_mode ==
approve` and `approvals_reviewer == auto_review`.
- Updated the MCP approval regression test so the auto-review case
asserts neither ARC nor guardian is called.
- Preserved existing tests that verify ARC can still block always-allow
MCP tools outside guardian auto-review mode.
## Verification
- `cargo test -p codex-core --lib mcp_tool_call`
Summary
- Add `[features.apps_mcp_path_override]` config with a `path` field for
overriding only the built-in apps MCP path.
- Keep existing host/base URL derivation unchanged and append the
configured path after that base.
- Regenerate the config schema with the custom feature-config case.
Test Plan
- Not run for latest revision; only `just fmt` and `just
write-config-schema` were run.
- Earlier revision: `cargo test -p codex-features`
- Earlier revision: `cargo test -p codex-mcp`
## Why
Several bug reports describe thread shutdown (including subagent
threads) leaving stdio MCP server processes behind. These reports all
point at the same lifecycle gap: Codex launches stdio MCP servers, but
the session-level shutdown path does not explicitly close MCP clients or
terminate the server process tree.
Fixes#12491Fixes#12976Fixes#18881Fixes#19469
## History
This is best understood as a regression/coverage gap in MCP session
lifecycle management, not as stdio MCP cleanup being absent all along.
#10710 added process-group cleanup for stdio MCP servers, but that
cleanup only runs when the `RmcpClient`/transport is dropped. The older
reports (#12491 and #12976) came after that cleanup existed, which
suggests the remaining problem was that some higher-level shutdown paths
kept the MCP manager alive or replaced it without explicitly draining
clients. The newer reports (#18881 and #19469) exposed the same family
around manager replacement and shutdown.
## What changed
- Added an explicit stdio MCP process handle in `codex-rmcp-client` so
local MCP servers terminate their process group and executor-backed MCP
servers call the executor process terminator.
- Added `RmcpClient::shutdown()` and manager-level MCP shutdown draining
so session shutdown, channel-close fallback, MCP refresh, and connector
probing stop owned MCP clients.
- Added regression coverage that starts a stdio MCP server, begins an
in-flight blocking tool call, shuts down the client, and asserts the
server process exits.
## Verification
- `cargo test -p codex-rmcp-client`
- `cargo test -p codex-mcp`
- `just fix -p codex-rmcp-client`
- `just fix -p codex-mcp`
- `just fix -p codex-core`
- Manual before/after validation with a temporary repro script:
- Pre-fix binary from `HEAD^` (`fed0a8f4fa`): reproduced the leak with
surviving MCP server and child PIDs, `survivors=[77583, 77592]`,
`leaked=true`.
- Post-fix binary from this branch (`67e318148b`): verified both MCP
processes were gone after interrupting `codex exec`, `survivors=[]`,
`leaked=false`.
## Why
After #19391, `PermissionProfile` and the split filesystem/network
policies could still be stored in parallel. That creates drift risk: a
profile can preserve deny globs, external enforcement, or split
filesystem entries while a cached projection silently loses those
details. This PR makes the profile the runtime source and derives
compatibility views from it.
## What Changed
- Removes stored filesystem/network sandbox projections from
`Permissions` and `SessionConfiguration`; their accessors now derive
from the canonical `PermissionProfile`.
- Derives legacy `SandboxPolicy` snapshots from profiles only where an
older API still needs that field.
- Updates MCP connection and elicitation state to track
`PermissionProfile` instead of `SandboxPolicy` for auto-approval
decisions.
- Adds semantic filesystem-policy comparison so cwd changes can preserve
richer profiles while still recognizing equivalent legacy projections
independent of entry ordering.
- Updates config/session tests to assert profile-derived projections
instead of parallel stored fields.
## Verification
- `cargo test -p codex-core direct_write_roots`
- `cargo test -p codex-core runtime_roots_to_legacy_projection`
- `cargo test -p codex-app-server
requested_permissions_trust_project_uses_permission_profile_intent`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19392).
* #19395
* #19394
* #19393
* __->__ #19392
## Why
The visibility cleanup in the base PR reduced what `codex-mcp` exposes,
but several files still made reviewers read private support machinery
before the public or crate-facing entry points. This ordering pass makes
each file easier to scan: exported API first, crate-visible MCP
internals next, then private helpers in breadth-first order from the
higher-level MCP flows to leaf utilities.
## What Changed
- Reordered `codex-mcp` exports so the runtime, configuration, snapshot,
auth, and helper surfaces are grouped by visibility and reader
importance.
- Moved public and crate-visible MCP items ahead of private helpers in
the auth, MCP planning/snapshot, connection manager, and tool-name
modules.
- Kept the change mechanical, with no behavior changes intended.
## Verification
- `cargo check -p codex-mcp`
## Why
`codex-mcp` currently exposes more API than the rest of the workspace
uses. Some of that surface is simply visibility that can be tightened,
and some of it is public helper code that remains compiler-valid because
it is exported even though no workspace caller uses it.
That distinction matters: Rust does not warn on exported API just
because the current workspace does not call it. This PR intentionally
treats those exported-but-workspace-unreferenced paths as stale
`codex-mcp` surface. The main example is MCP skill dependency
collection, where the active implementation now lives in
`codex-rs/core/src/mcp_skill_dependencies.rs`; keeping the older
`codex-mcp` copy makes it unclear which implementation owns skill MCP
installation.
## What Changed
- Pruned unused `codex-mcp` re-exports from `codex-mcp/src/lib.rs`.
- Removed non-runtime helper methods from `McpConnectionManager` so it
stays focused on live MCP clients.
- Made `ToolPluginProvenance` lookup methods crate-private.
- Removed workspace-unreferenced snapshot wrapper APIs and
qualified-tool grouping helpers.
- Deleted the duplicate `codex-mcp` skill dependency module and tests
now that skill MCP dependency handling is owned by `core`.
## Verification
- `cargo check -p codex-mcp`
## Why
MCP tool calls can receive a serialized `SandboxState` when a server
declares the sandbox-state capability. That state is one of the places
MCP runtimes learn what permissions Codex is operating under. As the
permissions migration makes `PermissionProfile` the canonical
representation, MCP consumers should be able to read that profile
directly instead of reconstructing permissions from the legacy
`SandboxPolicy`.
## What changed
- Adds optional `permissionProfile` to `codex_mcp::SandboxState`, while
keeping `sandboxPolicy` for existing MCP consumers.
- Populates `permissionProfile` from the current `TurnContext` when
serializing sandbox state for MCP tool calls.
## Verification
- Current GitHub Actions for this PR are passing.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18286).
* #18288
* #18287
* __->__ #18286
### Why
The RMCP layer needs a Streamable HTTP client that can talk either
directly over `reqwest` or through the executor HTTP runner without
duplicating MCP session logic higher in the stack. This PR adds that
client-side transport boundary so remote Streamable HTTP MCP can reuse
the same RMCP flow as the local path.
### What
- Add a shared `rmcp-client/src/streamable_http/` module with:
- `transport_client.rs` for the local-or-remote transport enum
- `local_client.rs` for the direct `reqwest` implementation
- `remote_client.rs` for the executor-backed implementation
- `common.rs` for the small shared Streamable HTTP helpers
- Teach `RmcpClient` to build Streamable HTTP transports in either local
or remote mode while keeping the existing OAuth ownership in RMCP.
- Translate remote POST, GET, and DELETE session operations into
executor `http/request` calls.
- Preserve RMCP session expiry handling and reconnect behavior for the
remote transport.
- Add remote transport coverage in
`rmcp-client/tests/streamable_http_remote.rs` and keep the shared test
support in `rmcp-client/tests/streamable_http_test_support.rs`.
### Verification
- `cargo check -p codex-rmcp-client`
- online CI
### Stack
1. #18581 protocol
2. #18582 runner
3. #18583 RMCP client
4. #18584 manager wiring and local/remote coverage
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
This PR fully reverts the previously merged Agent Identity runtime
integration from the old stack:
https://github.com/openai/codex/pull/17387/changes
It removes the Codex-side task lifecycle wiring, rollout/session
persistence, feature flag plumbing, lazy `auth.json` mutation,
background task auth paths, and request callsite changes introduced by
that stack.
This leaves the repo in a clean pre-AgentIdentity integration state so
the follow-up PRs can reintroduce the pieces in smaller reviewable
layers.
## Stack
1. This PR: full revert
2. https://github.com/openai/codex/pull/18871: move Agent Identity
business logic into a crate
3. https://github.com/openai/codex/pull/18785: add explicit
AgentIdentity auth mode and startup task allocation
4. https://github.com/openai/codex/pull/18811: migrate auth callsites
through AuthProvider
## Testing
Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.
## Summary
Making thread id optional so that we can better cache resources for MCPs
for connectors since their resource templates is universal and not
particular to projects.
- Make `mcpServer/resource/read` accept an optional `threadId`
- Read resources from the current MCP config when no thread is supplied
- Keep the existing thread-scoped path when `threadId` is present
- Update the generated schemas, README, and integration coverage
## Testing
- `just write-app-server-schema`
- `just fmt`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-mcp`
- `cargo test -p codex-app-server --test all mcp_resource`
- `just fix -p codex-mcp`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-app-server`
## Summary
Introduces a single background/control-plane agent task for ChatGPT
backend requests that do not have a thread-scoped task, with
`AuthManager` owning the default ChatGPT backend authorization decision.
Callers now ask `AuthManager` for the default ChatGPT backend
authorization header. `AuthManager` decides whether that is bearer or
background AgentAssertion based on config/internal state, while
low-level bootstrap paths can explicitly request bearer-only auth.
This PR is stacked on PR4 and focuses on the shared background task auth
plumbing plus the first tranche of backend/control-plane consumers. The
remaining callsite wiring is split into PR4.2 to keep review size down.
## Stack
- PR1: https://github.com/openai/codex/pull/17385 - add
`features.use_agent_identity`
- PR2: https://github.com/openai/codex/pull/17386 - register agent
identities when enabled
- PR3: https://github.com/openai/codex/pull/17387 - register agent tasks
when enabled
- PR3.1: https://github.com/openai/codex/pull/17978 - persist and
prewarm registered tasks per thread
- PR4: https://github.com/openai/codex/pull/17980 - use task-scoped
`AgentAssertion` for downstream calls
- PR4.1: this PR - introduce AuthManager-owned background/control-plane
`AgentAssertion` auth
- PR4.2: https://github.com/openai/codex/pull/18260 - use background
task auth for additional backend/control-plane calls
## What Changed
- add background task registration and assertion minting inside
`codex-login`
- persist `agent_identity.background_task_id` separately from
per-session task state
- make `BackgroundAgentTaskManager` private to `codex-login`; call sites
do not instantiate or pass it around
- teach `AuthManager` the ChatGPT backend base URL and feature-derived
background auth mode from resolved config
- expose bearer-only helpers for bootstrap/registration/refresh-style
paths that must not use AgentAssertion
- wire `AuthManager` default ChatGPT authorization through app listing,
connector directory listing, remote plugins, MCP status/listing,
analytics, and core-skills remote calls
- preserve bearer fallback when the feature is disabled, the backend
host is unsupported, or background task registration is not available
## Validation
- `just fmt`
- `cargo check -p codex-core -p codex-login -p codex-analytics -p
codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p
codex-models-manager -p codex-chatgpt -p codex-model-provider -p
codex-mcp -p codex-core-skills`
- `cargo test -p codex-login agent_identity`
- `cargo test -p codex-model-provider bearer_auth_provider`
- `cargo test -p codex-core agent_assertion`
- `cargo test -p codex-app-server remote_control`
- `cargo test -p codex-cloud-requirements fetch_cloud_requirements`
- `cargo test -p codex-models-manager manager::tests`
- `cargo test -p codex-chatgpt`
- `cargo test -p codex-cloud-tasks`
- `just fix -p codex-core -p codex-login -p codex-analytics -p
codex-app-server -p codex-cloud-requirements -p codex-cloud-tasks -p
codex-models-manager -p codex-chatgpt -p codex-model-provider -p
codex-mcp -p codex-core-skills`
- `just fix -p codex-app-server`
- `git diff --check`
## Summary
- Add the executor-backed RMCP stdio transport.
- Wire MCP stdio placement through the executor environment config.
- Cover local and executor-backed stdio paths with the existing MCP test
helpers.
## Stack
```text
o #18027 [6/6] Fail exec client operations after disconnect
│
@ #18212 [5/6] Wire executor-backed MCP stdio
│
o #18087 [4/6] Abstract MCP stdio server launching
│
o #18020 [3/6] Add pushed exec process events
│
o #18086 [2/6] Support piped stdin in exec process API
│
o #18085 [1/6] Add MCP server environment config
│
o main
```
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- Move local MCP stdio process startup behind a launcher trait.
- Preserve existing local stdio behavior while making transport creation
explicit.
## Stack
```text
o #18027 [6/6] Fail exec client operations after disconnect
│
o #18212 [5/6] Wire executor-backed MCP stdio
│
@ #18087 [4/6] Abstract MCP stdio server launching
│
o #18020 [3/6] Add pushed exec process events
│
o #18086 [2/6] Support piped stdin in exec process API
│
o #18085 [1/6] Add MCP server environment config
│
o main
```
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- Add an MCP server environment setting with local as the default.
- Thread the default through config serialization, schema generation,
and existing config fixtures.
## Stack
```text
o #18027 [8/8] Fail exec client operations after disconnect
│
o #18025 [7/8] Cover MCP stdio tests with executor placement
│
o #18089 [6/8] Wire remote MCP stdio through executor
│
o #18088 [5/8] Add executor process transport for MCP stdio
│
o #18087 [4/8] Abstract MCP stdio server launching
│
o #18020 [3/8] Add pushed exec process events
│
o #18086 [2/8] Support piped stdin in exec process API
│
@ #18085 [1/8] Add MCP server environment config
│
o main
```
Co-authored-by: Codex <noreply@openai.com>
Addresses https://github.com/openai/codex/issues/17143
Problem: TUI interrupts without an active turn stopped cancelling slow
MCP startup after routing through the app-server APIs.
Solution: Route no-active-turn interrupts through app-server as startup
cancels, acknowledge them immediately, and emit cancelled MCP startup
updates.
Testing: I manually confirmed that MCP cancellation didn't work prior to
this PR and works after the fix was in place.
## Why
#17763 moved sandbox-state delivery for MCP tool calls to request
`_meta` via the `codex/sandbox-state-meta` experimental capability.
Keeping the older `codex/sandbox-state` capability meant Codex still
maintained a second transport that pushed updates with the custom
`codex/sandbox-state/update` request at server startup and when the
session sandbox policy changed.
That duplicate MCP path is redundant with the per-tool-call metadata
path and makes the sandbox-state contract larger than needed. The
existing managed network proxy refresh on sandbox-policy changes is
still needed, so this keeps that behavior separate from the removed MCP
notification.
## What Changed
- Removed the exported `MCP_SANDBOX_STATE_CAPABILITY` and
`MCP_SANDBOX_STATE_METHOD` constants.
- Removed detection of `codex/sandbox-state` during MCP initialization
and stopped sending `codex/sandbox-state/update` at server startup.
- Removed the `McpConnectionManager::notify_sandbox_state_change`
plumbing while preserving the managed network proxy refresh when a user
turn changes sandbox policy.
- Slimmed `McpConnectionManager::new` so startup paths pass only the
initial `SandboxPolicy` needed for MCP elicitation state.
- Kept `codex/sandbox-state-meta` support intact; servers that opt in
still receive the current `SandboxState` on tool-call request `_meta`
([remaining call
path](ff2d3c1e72/codex-rs/core/src/mcp_tool_call.rs (L487-L526))).
- Added regression coverage for refreshing the live managed network
proxy on a per-turn sandbox-policy change.
## Verification
- `cargo test -p codex-core
new_turn_refreshes_managed_network_proxy_for_sandbox_change`
- `cargo test -p codex-mcp`
stacked on #17402.
MCP tools returned by `tool_search` (deferred tools) get registered in
our `ToolRegistry` with a different format than directly available
tools. this leads to two different ways of accessing MCP tools from our
tool catalog, only one of which works for each. fix this by registering
all MCP tools with the namespace format, since this info is already
available.
also, direct MCP tools are registered to responsesapi without a
namespace, while deferred MCP tools have a namespace. this means we can
receive MCP `FunctionCall`s in both formats from namespaces. fix this by
always registering MCP tools with namespace, regardless of deferral
status.
make code mode track `ToolName` provenance of tools so it can map the
literal JS function name string to the correct `ToolName` for
invocation, rather than supporting both in core.
this lets us unify to a single canonical `ToolName` representation for
each MCP tool and force everywhere to use that one, without supporting
fallbacks.
## Changes
Allows MCPs to opt in to receiving sandbox config info through `_meta`
on model-initiated tool calls. This lets MCPs adhere to the thread's
sandbox if they choose to.
## Details
- Adds the `codex/sandbox-state-meta` experimental MCP capability.
- Tracks whether each MCP server advertises that capability.
- When a server opts in, `codex-core` injects the current `SandboxState`
into model-initiated MCP tool-call request `_meta`.
## Verification
- added an integration test for the capability
## Why
For more advanced MCP usage, we want the model to be able to emit
parallel MCP tool calls and have Codex execute eligible ones
concurrently, instead of forcing all MCP calls through the serial block.
The main design choice was where to thread the config. I made this
server-level because parallel safety depends on the MCP server
implementation. Codex reads the flag from `mcp_servers`, threads the
opted-in server names into `ToolRouter`, and checks the parsed
`ToolPayload::Mcp { server, .. }` at execution time. That avoids relying
on model-visible tool names, which can be incomplete in
deferred/search-tool paths or ambiguous for similarly named
servers/tools.
## What was added
Added `supports_parallel_tool_calls` for MCP servers.
Before:
```toml
[mcp_servers.docs]
command = "docs-server"
```
After:
```toml
[mcp_servers.docs]
command = "docs-server"
supports_parallel_tool_calls = true
```
MCP calls remain serial by default. Only tools from opted-in servers are
eligible to run in parallel. Docs also now warn to enable this only when
the server’s tools are safe to run concurrently, especially around
shared state or read/write races.
## Testing
Tested with a local stdio MCP server exposing real delay tools. The
model/Responses side was mocked only to deterministically emit two MCP
calls in the same turn.
Each test called `query_with_delay` and `query_with_delay_2` with `{
"seconds": 25 }`.
| Build/config | Observed | Wall time |
| --- | --- | --- |
| main with flag enabled | serial | `58.79s` |
| PR with flag enabled | parallel | `31.73s` |
| PR without flag | serial | `56.70s` |
PR with flag enabled showed both tools start before either completed;
main and PR-without-flag completed the first delay before starting the
second.
Also added an integration test.
Additional checks:
- `cargo test -p codex-tools` passed
- `cargo test -p codex-core
mcp_parallel_support_uses_exact_payload_server` passed
- `git diff --check` passed
- [x] Expand tool search to custom MCPs.
- [x] Rename several variables/fields to be more generic.
Updated tool & server name lifecycles:
**Raw Identity**
ToolInfo.server_name is raw MCP server name.
ToolInfo.tool.name is raw MCP tool name.
MCP calls route back to raw via parse_tool_name() returning
(tool.server_name, tool.tool.name).
mcpServerStatus/list now groups by raw server and keys tools by
Tool.name: mod.rs:599
App-server just forwards that grouped raw snapshot:
codex_message_processor.rs:5245
**Callable Names**
On list-tools, we create provisional callable_namespace / callable_name:
mcp_connection_manager.rs:1556
For non-app MCP, provisional callable name starts as raw tool name.
For codex-apps, provisional callable name is sanitized and strips
connector name/id prefix; namespace includes connector name.
Then qualify_tools() sanitizes callable namespace + name to ASCII alnum
/ _ only: mcp_tool_names.rs:128
Note: this is stricter than Responses API. Hyphen is currently replaced
with _ for code-mode compatibility.
**Collision Handling**
We do initially collapse example-server and example_server to the same
base.
Then qualify_tools() detects distinct raw namespace identities behind
the same sanitized namespace and appends a hash to the callable
namespace: mcp_tool_names.rs:137
Same idea for tool-name collisions: hash suffix goes on callable tool
name.
Final list_all_tools() map key is callable_namespace + callable_name:
mcp_connection_manager.rs:769
**Direct Model Tools**
Direct MCP tool declarations use the full qualified sanitized key as the
Responses function name.
The raw rmcp Tool is converted but renamed for model exposure.
**Tool Search / Deferred**
Tool search result namespace = final ToolInfo.callable_namespace:
tool_search.rs:85
Tool search result nested name = final ToolInfo.callable_name:
tool_search.rs:86
Deferred tool handler is registered as "{namespace}:{name}":
tool_registry_plan.rs:248
When a function call comes back, core recombines namespace + name, looks
up the full qualified key, and gets the raw server/tool for MCP
execution: codex.rs:4353
**Separate Legacy Snapshot**
collect_mcp_snapshot_from_manager_with_detail() still returns a map
keyed by qualified callable name.
mcpServerStatus/list no longer uses that; it uses
McpServerStatusSnapshot, which is raw-inventory shaped.
## Summary
- bridge Codex Apps tools that declare `_meta["openai/fileParams"]`
through the OpenAI file upload flow
- mask those file params in model-visible tool schemas so the model
provides absolute local file paths instead of raw file payload objects
- rewrite those local file path arguments client-side into
`ProvidedFilePayload`-shaped objects before the normal MCP tool call
## Details
- applies to scalar and array file params declared in
`openai/fileParams`
- Codex uploads local files directly to the backend and uses the
uploaded file metadata to build the MCP tool arguments locally
- this PR is input-only
## Verification
- `just fmt`
- `cargo test -p codex-core mcp_tool_call -- --nocapture`
---------
Co-authored-by: Codex <noreply@openai.com>
Currently, when a MCP server sends an elicitation to Codex running in
Full Access (`sandbox_policy: DangerFullAccess` + `approval_policy:
Never`), the elicitations are auto-cancelled.
This PR updates the automatic handling of MCP elicitations to be
consistent with other approvals in full-access, where they are
auto-approved. Because MCP elicitations may actually require user input,
this mechanism is limited to empty form elicitations.
## Changeset
- Add policy helper shared with existing MCP tool call approval
auto-approve
- Update `ElicitationRequestManager` to auto-approve elicitations in
full access when `can_auto_accept_elicitation` is true.
- Add tests
Co-authored-by: Codex <noreply@openai.com>
Addresses #16971
Problem: Disabled MCP servers were still queried for streamable HTTP
auth status during MCP inventory, so unreachable disabled entries could
add startup latency.
Solution: Return `Unsupported` immediately for disabled MCP server
configs before bearer token/OAuth status discovery.
## Summary
- reduce public module visibility across Rust crates, preferring private
or crate-private modules with explicit crate-root public exports
- update external call sites and tests to use the intended public crate
APIs instead of reaching through module trees
- add the module visibility guideline to AGENTS.md
## Validation
- `cargo check --workspace --all-targets --message-format=short` passed
before the final fix/format pass
- `just fix` completed successfully
- `just fmt` completed successfully
- `git diff --check` passed
Addresses #16244
This was a performance regression introduced when we moved the TUI on
top of the app server API.
Problem: `/mcp` rebuilt a full MCP inventory through
`mcpServerStatus/list`, including resources and resource templates that
made the TUI wait on slow inventory probes.
Solution: add a lightweight `detail` mode to `mcpServerStatus/list`,
have `/mcp` request tools-and-auth only, and cover the fast path with
app-server and TUI tests.
Testing: Confirmed slow (multi-second) response prior to change and
immediate response after change.
I considered two options:
1. Change the existing `mcpServerStatus/list` API to accept an optional
"details" parameter so callers can request only a subset of the
information.
2. Add a separate `mcpServer/list` API that returns only the servers,
tools, and auth but omits the resources.
I chose option 1, but option 2 is also a reasonable approach.
- Split MCP runtime/server code out of `codex-core` into the new
`codex-mcp` crate. New/moved public structs/types include `McpConfig`,
`McpConnectionManager`, `ToolInfo`, `ToolPluginProvenance`,
`CodexAppsToolsCacheKey`, and the `McpManager` API
(`codex_mcp::mcp::McpManager` plus the `codex_core::mcp::McpManager`
wrapper/shim). New/moved functions include `with_codex_apps_mcp`,
`configured_mcp_servers`, `effective_mcp_servers`,
`collect_mcp_snapshot`, `collect_mcp_snapshot_from_manager`,
`qualified_mcp_tool_name_prefix`, and the MCP auth/skill-dependency
helpers. Why: this creates a focused MCP crate boundary and shrinks
`codex-core` without forcing every consumer to migrate in the same PR.
- Move MCP server config schema and persistence into `codex-config`.
New/moved structs/enums include `AppToolApproval`,
`McpServerToolConfig`, `McpServerConfig`, `RawMcpServerConfig`,
`McpServerTransportConfig`, `McpServerDisabledReason`, and
`codex_config::ConfigEditsBuilder`. New/moved functions include
`load_global_mcp_servers` and
`ConfigEditsBuilder::replace_mcp_servers`/`apply`. Why: MCP TOML
parsing/editing is config ownership, and this keeps config
validation/round-tripping (including per-tool approval overrides and
inline bearer-token rejection) in the config crate instead of
`codex-core`.
- Rewire `codex-core`, app-server, and plugin call sites onto the new
crates. Updated `Config::to_mcp_config(&self, plugins_manager)`,
`codex-rs/core/src/mcp.rs`, `codex-rs/core/src/connectors.rs`,
`codex-rs/core/src/codex.rs`,
`CodexMessageProcessor::list_mcp_server_status_task`, and
`utils/plugins/src/mcp_connector.rs` to build/pass the new MCP
config/runtime types. Why: plugin-provided MCP servers still merge with
user-configured servers, and runtime auth (`CodexAuth`) is threaded into
`with_codex_apps_mcp` / `collect_mcp_snapshot` explicitly so `McpConfig`
stays config-only.