## Why
For more advanced MCP usage, we want the model to be able to emit
parallel MCP tool calls and have Codex execute eligible ones
concurrently, instead of forcing all MCP calls through the serial block.
The main design choice was where to thread the config. I made this
server-level because parallel safety depends on the MCP server
implementation. Codex reads the flag from `mcp_servers`, threads the
opted-in server names into `ToolRouter`, and checks the parsed
`ToolPayload::Mcp { server, .. }` at execution time. That avoids relying
on model-visible tool names, which can be incomplete in
deferred/search-tool paths or ambiguous for similarly named
servers/tools.
## What was added
Added `supports_parallel_tool_calls` for MCP servers.
Before:
```toml
[mcp_servers.docs]
command = "docs-server"
```
After:
```toml
[mcp_servers.docs]
command = "docs-server"
supports_parallel_tool_calls = true
```
MCP calls remain serial by default. Only tools from opted-in servers are
eligible to run in parallel. Docs also now warn to enable this only when
the server’s tools are safe to run concurrently, especially around
shared state or read/write races.
## Testing
Tested with a local stdio MCP server exposing real delay tools. The
model/Responses side was mocked only to deterministically emit two MCP
calls in the same turn.
Each test called `query_with_delay` and `query_with_delay_2` with `{
"seconds": 25 }`.
| Build/config | Observed | Wall time |
| --- | --- | --- |
| main with flag enabled | serial | `58.79s` |
| PR with flag enabled | parallel | `31.73s` |
| PR without flag | serial | `56.70s` |
PR with flag enabled showed both tools start before either completed;
main and PR-without-flag completed the first delay before starting the
second.
Also added an integration test.
Additional checks:
- `cargo test -p codex-tools` passed
- `cargo test -p codex-core
mcp_parallel_support_uses_exact_payload_server` passed
- `git diff --check` passed
## Summary
When a `spawn_agent` call does a full-history fork, keep the parent's
effective agent type and model configuration instead of applying child
role/model overrides.
This is the minimal config-inheritance slice of #16055. Prompt-cache key
inheritance and MCP tool-surface stability are split into follow-up PRs.
## Design
- Reject `agent_type`, `model`, and `reasoning_effort` for v1
`fork_context` spawns.
- Reject `agent_type`, `model`, and `reasoning_effort` for v2
`fork_turns = "all"` spawns.
- Keep v2 partial-history forks (`fork_turns = "N"`) configurable;
requested model/reasoning overrides and role config still apply there.
- Keep non-forked spawn behavior unchanged.
## Tests
- `cargo +1.93.1 test -p codex-core spawn_agent_fork_context --lib`
- `cargo +1.93.1 test -p codex-core multi_agent_v2_spawn_fork_turns
--lib`
- `cargo +1.93.1 test -p codex-core
multi_agent_v2_spawn_partial_fork_turns_allows_agent_type_override
--lib`
## Summary
- add an exec-server `envPolicy` field; when present, the server starts
from its own process env and applies the shell environment policy there
- keep `env` as the exact environment for local/embedded starts, but
make it an overlay for remote unified-exec starts
- move the shell-environment-policy builder into `codex-config` so Core
and exec-server share the inherit/filter/set/include behavior
- overlay only runtime/sandbox/network deltas from Core onto the
exec-server-derived env
## Why
Remote unified exec was materializing the shell env inside Core and
forwarding the whole map to exec-server, so remote processes could
inherit the orchestrator machine's `HOME`, `PATH`, etc. This keeps the
base env on the executor while preserving Core-owned runtime additions
like `CODEX_THREAD_ID`, unified-exec defaults, network proxy env, and
sandbox marker env.
## Validation
- `just fmt`
- `git diff --check`
- `cargo test -p codex-exec-server --lib`
- `cargo test -p codex-core --lib unified_exec::process_manager::tests`
- `cargo test -p codex-core --lib exec_env::tests`
- `cargo test -p codex-core --lib exec_env_tests` (compile-only; filter
matched 0 tests)
- `cargo test -p codex-config --lib shell_environment` (compile-only;
filter matched 0 tests)
- `just bazel-lock-update`
## Known local validation issue
- `just bazel-lock-check` is not runnable in this checkout: it invokes
`./scripts/check-module-bazel-lock.sh`, which is missing.
---------
Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: pakrym-oai <pakrym@openai.com>
## Summary
- register flattened handler aliases for deferred MCP tools
- cover the node_repl-shaped deferred MCP call path in tool registry
tests
## Root Cause
Deferred MCP tools were registered only under their namespaced handler
key, e.g. `mcp__node_repl__:js`. If the model/bridge emitted the
flattened qualified name `mcp__node_repl__js`, core parsed it as an MCP
payload but dispatch looked up the flattened handler key and returned
`unsupported call` before reaching the MCP handler.
## Validation
- `just fmt`
- `cargo test -p codex-tools
search_tool_registers_deferred_mcp_flattened_handlers`
- `cargo test -p codex-core
search_tool_registers_namespaced_mcp_tool_aliases`
- `git diff --check`
**Summary**
This PR treats Guardian timeouts as distinct from explicit denials in
the core approval paths.
Timeouts now return timeout-specific guidance instead of Guardian
policy-rejection messaging.
It updates the command, shell, network, and MCP approval flows and adds
focused test coverage.
Addresses #17302
Problem: `thread/list` compared cwd filters with raw path equality, so
`resume --last` could miss Windows sessions when the saved cwd used a
verbatim path form and the current cwd did not.
Solution: Normalize cwd comparisons through the existing path comparison
utilities before falling back to direct equality, and add Windows
regression coverage for verbatim paths. I made this a general utility
function and replaced all of the duplicated instance of it across the
code base.
## Summary
- Add `TimedOut` to Guardian/review carrier types:
- `ReviewDecision::TimedOut`
- `GuardianAssessmentStatus::TimedOut`
- app-server v2 `GuardianApprovalReviewStatus::TimedOut`
- Regenerate app-server JSON/TypeScript schemas for the new wire shape.
- Wire the new status through core/app-server/TUI mappings with
conservative fail-closed handling.
- Keep `TimedOut` non-user-selectable in the approval UI.
**Does not change runtime behavior yet; emitting `TimeOut` and
parent-model timeout messaging will come in followup PRs**
## Description
This PR introduces `review_id` as the stable identifier for guardian
reviews and exposes it in app-server `item/autoApprovalReview/started`
and `item/autoApprovalReview/completed` events.
Internally, guardian rejection state is now keyed by `review_id` instead
of the reviewed tool item ID. `target_item_id` is still included when a
review maps to a concrete thread item, but it is no longer overloaded as
the review lifecycle identifier.
## Motivation
We'd like to give users the ability to preempt a guardian review while
it's running (approve or decline).
However, we can't implement the API that allows the user to override a
running guardian review because we didn't have a unique `review_id` per
guardian review. Using `target_item_id` is not correct since:
- with execve reviews, there can be multiple execve calls (and therefore
guardian reviews) per shell command
- with network policy reviews, there is no target item ID
The PR that actually implements user overrides will use `review_id` as
the stable identifier.
## Summary
- preserve legacy Windows elevated sandbox behavior for existing
policies
- add elevated-only support for split filesystem policies that can be
represented as readable-root overrides, writable-root overrides, and
extra deny-write carveouts
- resolve those elevated filesystem overrides during sandbox transform
and thread them through setup and policy refresh
- keep failing closed for explicit unreadable (`none`) carveouts and
reopened writable descendants under read-only carveouts
- for explicit read-only-under-writable-root carveouts, materialize
missing carveout directories during elevated setup before applying the
deny-write ACL
- document the elevated vs restricted-token support split in the core
README
## Example
Given a split filesystem policy like:
```toml
":root" = "read"
":cwd" = "write"
"./docs" = "read"
"C:/scratch" = "write"
```
the elevated backend now provisions the readable-root overrides,
writable-root overrides, and extra deny-write carveouts during setup and
refresh instead of collapsing back to the legacy workspace-only shape.
If a read-only carveout under a writable root is missing at setup time,
elevated setup creates that carveout as an empty directory before
applying its deny-write ACE; otherwise the sandboxed command could
create it later and bypass the carveout. This is only for explicit
policy carveouts. Best-effort workspace protections like `.codex/` and
`.agents/` still skip missing directories.
A policy like:
```toml
"/workspace" = "write"
"/workspace/docs" = "read"
"/workspace/docs/tmp" = "write"
```
still fails closed, because the elevated backend does not reopen
writable descendants under read-only carveouts yet.
---------
Co-authored-by: Codex <noreply@openai.com>
- [x] Expand tool search to custom MCPs.
- [x] Rename several variables/fields to be more generic.
Updated tool & server name lifecycles:
**Raw Identity**
ToolInfo.server_name is raw MCP server name.
ToolInfo.tool.name is raw MCP tool name.
MCP calls route back to raw via parse_tool_name() returning
(tool.server_name, tool.tool.name).
mcpServerStatus/list now groups by raw server and keys tools by
Tool.name: mod.rs:599
App-server just forwards that grouped raw snapshot:
codex_message_processor.rs:5245
**Callable Names**
On list-tools, we create provisional callable_namespace / callable_name:
mcp_connection_manager.rs:1556
For non-app MCP, provisional callable name starts as raw tool name.
For codex-apps, provisional callable name is sanitized and strips
connector name/id prefix; namespace includes connector name.
Then qualify_tools() sanitizes callable namespace + name to ASCII alnum
/ _ only: mcp_tool_names.rs:128
Note: this is stricter than Responses API. Hyphen is currently replaced
with _ for code-mode compatibility.
**Collision Handling**
We do initially collapse example-server and example_server to the same
base.
Then qualify_tools() detects distinct raw namespace identities behind
the same sanitized namespace and appends a hash to the callable
namespace: mcp_tool_names.rs:137
Same idea for tool-name collisions: hash suffix goes on callable tool
name.
Final list_all_tools() map key is callable_namespace + callable_name:
mcp_connection_manager.rs:769
**Direct Model Tools**
Direct MCP tool declarations use the full qualified sanitized key as the
Responses function name.
The raw rmcp Tool is converted but renamed for model exposure.
**Tool Search / Deferred**
Tool search result namespace = final ToolInfo.callable_namespace:
tool_search.rs:85
Tool search result nested name = final ToolInfo.callable_name:
tool_search.rs:86
Deferred tool handler is registered as "{namespace}:{name}":
tool_registry_plan.rs:248
When a function call comes back, core recombines namespace + name, looks
up the full qualified key, and gets the raw server/tool for MCP
execution: codex.rs:4353
**Separate Legacy Snapshot**
collect_mcp_snapshot_from_manager_with_detail() still returns a map
keyed by qualified callable name.
mcpServerStatus/list no longer uses that; it uses
McpServerStatusSnapshot, which is raw-inventory shaped.
## Summary
App-server v2 already receives turn-scoped `clientMetadata`, but the
Rust app-server was dropping it before the outbound Responses request.
This change keeps the fix lightweight by threading that metadata through
the existing turn-metadata path rather than inventing a new transport.
## What we're trying to do and why
We want turn-scoped metadata from the app-server protocol layer,
especially fields like Hermes/GAAS run IDs, to survive all the way to
the actual Responses API request so it is visible in downstream
websocket request logging and analytics.
The specific bug was:
- app-server protocol uses camelCase `clientMetadata`
- Responses transport already has an existing turn metadata carrier:
`x-codex-turn-metadata`
- websocket transport already rewrites that header into
`request.request_body.client_metadata["x-codex-turn-metadata"]`
- but the Rust app-server never parsed or stored `clientMetadata`, so
nothing from the app-server request was making it into that existing
path
This PR fixes that without adding a new header or a second metadata
channel.
## How we did it
### Protocol surface
- Add optional `clientMetadata` to v2 `TurnStartParams` and
`TurnSteerParams`
- Regenerate the JSON schema / TypeScript fixtures
- Update app-server docs to describe the field and its behavior
### Runtime plumbing
- Add a dedicated core op for app-server user input carrying turn-scoped
metadata: `Op::UserInputWithClientMetadata`
- Wire `turn/start` and `turn/steer` through that op / signature path
instead of dropping the metadata at the message-processor boundary
- Store the metadata in `TurnMetadataState`
### Transport behavior
- Reuse the existing serialized `x-codex-turn-metadata` payload
- Merge the new app-server `clientMetadata` into that JSON additively
- Do **not** replace built-in reserved fields already present in the
turn metadata payload
- Keep websocket behavior unchanged at the outer shape level: it still
sends only `client_metadata["x-codex-turn-metadata"]`, but that JSON
string now contains the merged fields
- Keep HTTP fallback behavior unchanged except that the existing
`x-codex-turn-metadata` header now includes the merged fields too
### Request shape before / after
Before, a websocket `response.create` looked like:
```json
{
"type": "response.create",
"client_metadata": {
"x-codex-turn-metadata": "{\"session_id\":\"...\",\"turn_id\":\"...\"}"
}
}
```
Even if the app-server caller supplied `clientMetadata`, it was not
represented there.
After, the same request shape is preserved, but the serialized payload
now includes the new turn-scoped fields:
```json
{
"type": "response.create",
"client_metadata": {
"x-codex-turn-metadata": "{\"session_id\":\"...\",\"turn_id\":\"...\",\"fiber_run_id\":\"fiber-start-123\",\"origin\":\"gaas\"}"
}
}
```
## Validation
### Targeted tests added / updated
- protocol round-trip coverage for `clientMetadata` on `turn/start` and
`turn/steer`
- protocol round-trip coverage for `Op::UserInputWithClientMetadata`
- `TurnMetadataState` merge test proving client metadata is added
without overwriting reserved built-in fields
- websocket request-shape test proving outbound `response.create`
contains merged metadata inside
`client_metadata["x-codex-turn-metadata"]`
- app-server integration tests proving:
- `turn/start` forwards `clientMetadata` into the outbound Responses
request path
- websocket warmup + real turn request both behave correctly
- `turn/steer` updates the follow-up request metadata
### Commands run
- `just write-app-server-schema`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-protocol`
- `cargo test -p codex-core
turn_metadata_state_merges_client_metadata_without_replacing_reserved_fields
--lib`
- `cargo test -p codex-core --test all
responses_websocket_preserves_custom_turn_metadata_fields`
- `cargo test -p codex-app-server --test all client_metadata`
- `cargo test -p codex-app-server --test all
turn_start_forwards_client_metadata_to_responses_websocket_request_body_v2
-- --nocapture`
- `just fmt`
- `just fix -p codex-core -p codex-protocol -p codex-app-server-protocol
-p codex-app-server`
- `just fix -p codex-exec -p codex-tui-app-server`
- `just argument-comment-lint`
### Full suite note
`cargo test` in `codex-rs` still fails in:
-
`suite::v2::turn_interrupt::turn_interrupt_resolves_pending_command_approval_request`
I verified that same failure on a clean detached `HEAD` worktree with an
isolated `CARGO_TARGET_DIR`, so it is not caused by this patch.
## Summary
- detect remote exec-server sessions in the unified-exec runtime
- bypass the local shell-snapshot bootstrap only for those remote
sessions
- preserve existing local snapshot wrapping, PowerShell UTF-8 prefixing,
sandbox orchestration, and zsh-fork handling
## Why
The shell snapshot file is currently captured and stored next to Core.
If Core wraps a remote command with `. /path/to/local/snapshot`, the
process starts on the executor and tries to source a path from the
orchestrator filesystem. This keeps remote commands from receiving that
known-local path until shell snapshots are captured/restored on the
executor side.
## Validation
- `just fmt`
- `git diff --check`
- `cargo test -p codex-core --lib tools::runtimes::tests`
Co-authored-by: Codex <noreply@openai.com>
**Disabling Image-Gen for Non-SIWC Codex Users**
We are only enabling image-gen feature for SIWC Codex users until there
comes a fix in ResponsesAPI to omit output from responses.completed, to
prevent the following issues:
1. websocket blows up due to heavier load (images) than before (text)
2. http parser streams through n^2 of n-base64 bytes (sum of base64s of
all images generated in turn) that causes long delays in
turn_completion.
## Summary
- Carry `AbsolutePathBuf` through tool cwd parsing/resolution instead of
resolving workdirs to raw `PathBuf`s.
- Type exec/sandbox request cwd fields as `AbsolutePathBuf` through
`ExecParams`, `ExecRequest`, `SandboxCommand`, and unified exec runtime
requests.
- Keep `PathBuf` conversions at external/event boundaries and update
existing tests/fixtures for the typed cwd.
## Validation
- `cargo check -p codex-core --tests`
- `cargo check -p codex-sandboxing --tests`
- `cargo test -p codex-sandboxing`
- `cargo test -p codex-core --lib tools::handlers::`
- `just fix -p codex-sandboxing`
- `just fix -p codex-core`
- `just fmt`
Full `codex-core` test suite was not run locally; per repo guidance I
kept local validation targeted.
Allow multi_agent_v2 features to have its own temporary configuration
under `[features.multi_agent_v2]`
```
[features.multi_agent_v2]
enabled = true
usage_hint_enabled = false
usage_hint_text = "Custom delegation guidance."
hide_spawn_agent_metadata = true
```
Absent `usage_hint_text` means use the default hint.
```
[features]
multi_agent_v2 = true
```
still works as the boolean shorthand.
## Summary
Fix network proxy sessions so changing sandbox mode recomputes the
effective managed network policy and applies it to the already-running
per-session proxy.
## Root Cause
`danger_full_access_denylist_only` injects `"*"` only while building the
proxy spec for Full Access. Sessions built that spec once at startup, so
a later permission switch to Full Access left the live proxy in its
original restricted policy. Switching back needed the same recompute
path to remove the synthetic wildcard again.
## What Changed
- Preserve the original managed network proxy config/requirements so the
effective spec can be recomputed for a new sandbox policy.
- Refresh the current session proxy when sandbox settings change, then
reapply exec-policy network overlays.
- Add an in-place proxy state update path while rejecting
listener/port/SOCKS changes that cannot be hot-reloaded.
- Keep runtime proxy settings cheap to snapshot and update.
- Add regression coverage for workspace-write -> Full Access ->
workspace-write.
## Summary
- run apply_patch through the executor filesystem when a remote
environment is present instead of shelling out to the local process
- thread the executor FileSystem into apply_patch interception and keep
existing local behavior for non-remote turns
- make the apply_patch integration harness use the executor filesystem
for setup/assertions
- add remote-aware skips for turn-diff coverage that still reads the
test-runner filesystem
## Why
Remote apply_patch needed to mutate the remote workspace instead of the
local checkout. The tests also needed to seed and assert workspace state
through the same filesystem abstraction so local and remote runs
exercise the same behavior.
## Validation
- `just fmt`
- `git diff --check`
- `cargo check -p core_test_support --tests`
- `cargo test -p codex-core --test all
suite::shell_serialization::apply_patch_custom_tool_call -- --nocapture`
- `cargo test -p codex-core --test all
suite::apply_patch_cli::apply_patch_cli_updates_file_appends_trailing_newline
-- --nocapture`
- remote `cargo test -p codex-core --test all apply_patch_cli --
--nocapture` (229 passed)
- Migrate apply-patch verification and application internals to use the
async `ExecutorFileSystem` abstraction from `exec-server`.
- Convert apply-patch `cwd` handling to `AbsolutePathBuf` through the
verifier/parser/handler boundary.
Doesn't change how the tool itself works.
## Summary
https://github.com/openai/codex/pull/13860 changed the serialized output
format of Unified Exec. This PR reverts those changes and some related
test changes
## Testing
- [x] Update tests
---------
Co-authored-by: Codex <noreply@openai.com>
## Description
This PR changes guardian transcript compaction so oversized
conversations no longer collapse into a nearly empty placeholder.
Before this change, if the retained user history alone exceeded the
message budget, guardian would replace the entire transcript with
`<transcript omitted to preserve budget for planned action>`!
That meant approvals, especially network approvals, could lose the
recent tool call and tool result that explained what guardian was
actually reviewing. Now we keep a compact but usable transcript instead
of dropping it all.
### Before
```
The following is the Codex agent history whose request action you are assessing...
>>> TRANSCRIPT START
<transcript omitted to preserve budget for planned action>
>>> TRANSCRIPT END
Conversation transcript omitted due to size.
The Codex agent has requested the following action:
>>> APPROVAL REQUEST START
Retry reason:
Sandbox blocked outbound network access.
Assess the exact planned action below. Use read-only tool checks when local state matters.
Planned action JSON:
{
"tool": "network_access",
"target": "https://example.com:443",
"host": "example.com",
"protocol": "https",
"port": 443
}
>>> APPROVAL REQUEST END
```
### After
```
The following is the Codex agent history whose request action you are assessing...
>>> TRANSCRIPT START
[1] user: Please investigate why uploads to example.com are failing and retry if needed.
[8] user: If the request looks correct, go ahead and try again with network access.
[9] tool shell call: {"command":["curl","-X","POST","https://example.com/upload"],"cwd":"/repo"}
[10] tool shell result: sandbox blocked outbound network access
>>> TRANSCRIPT END
Some conversation entries were omitted.
The Codex agent has requested the following action:
>>> APPROVAL REQUEST START
Retry reason:
Sandbox blocked outbound network access.
Assess the exact planned action below. Use read-only tool checks when local state matters.
Planned action JSON:
{
"tool": "network_access",
"target": "https://example.com:443",
"host": "example.com",
"protocol": "https",
"port": 443
}
>>> APPROVAL REQUEST END
```
Addresses #15527
Problem: Nested `codex exec` commands could source a shell snapshot that
re-exported the parent `CODEX_THREAD_ID`, so commands inside the nested
session were attributed to the wrong thread.
Solution: Reapply the live command env's `CODEX_THREAD_ID` after
sourcing the snapshot.
Problem: The multi-agent followup interrupt test polled history before
interrupt cleanup and mailbox wakeup were guaranteed to settle, which
made it flaky under CI scheduling variance.
Solution: Wait for the child turn's `TurnAborted(Interrupted)` event
before asserting that the redirected assistant envelope is recorded and
no plain user message is left behind.
## Summary
- reduce public module visibility across Rust crates, preferring private
or crate-private modules with explicit crate-root public exports
- update external call sites and tests to use the intended public crate
APIs instead of reaching through module trees
- add the module visibility guideline to AGENTS.md
## Validation
- `cargo check --workspace --all-targets --message-format=short` passed
before the final fix/format pass
- `just fix` completed successfully
- `just fmt` completed successfully
- `git diff --check` passed
## Summary
- make `CODEX_EXEC_SERVER_URL=none` map to an explicit disabled
environment mode instead of inferring from a missing URL
- expose environment capabilities (`exec_enabled`, `filesystem_enabled`)
so tool building can gate behavior explicitly and future
multi-environment work has a clearer seam
- suppress env-backed tools when the relevant capability is unavailable,
including exec tools, `js_repl`, `apply_patch`, `list_dir`, and
`view_image`
- keep handler/runtime backstops so disabled environments still reject
execution if a tool path somehow bypasses registration
## Testing
- `just fmt`
- `cargo test -p codex-exec-server`
- `cargo test -p codex-tools
disabled_environment_omits_environment_backed_tools`
- `cargo test -p codex-tools
environment_capabilities_gate_exec_and_filesystem_tools_independently`
- remote devbox Bazel build via `codex-applied-devbox`:
`//codex-rs/cli:cli`
Stacked on #16508.
This removes the temporary `codex-core` / `codex-login` re-export shims
from the ownership split and rewrites callsites to import directly from
`codex-model-provider-info`, `codex-models-manager`, `codex-api`,
`codex-protocol`, `codex-feedback`, and `codex-response-debug-context`.
No behavior change intended; this is the mechanical import cleanup layer
split out from the ownership move.
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- split `models-manager` out of `core` and add `ModelsManagerConfig`
plus `Config::to_models_manager_config()` so model metadata paths stop
depending on `core::Config`
- move login-owned/auth-owned code out of `core` into `codex-login`,
move model provider config into `codex-model-provider-info`, move API
bridge mapping into `codex-api`, move protocol-owned types/impls into
`codex-protocol`, and move response debug helpers into a dedicated
`response-debug-context` crate
- move feedback tag emission into `codex-feedback`, relocate tests to
the crates that now own the code, and keep broad temporary re-exports so
this PR avoids a giant import-only rewrite
## Major moves and decisions
- created `codex-models-manager` as the owner for model
cache/catalog/config/model info logic, including the new
`ModelsManagerConfig` struct
- created `codex-model-provider-info` as the owner for provider config
parsing/defaults and kept temporary `codex-login`/`codex-core`
re-exports for old import paths
- moved `api_bridge` error mapping + `CoreAuthProvider` into
`codex-api`, while `codex-login::api_bridge` temporarily re-exports
those symbols and keeps the `auth_provider_from_auth` wrapper
- moved `auth_env_telemetry` and `provider_auth` ownership to
`codex-login`
- moved `CodexErr` ownership to `codex-protocol::error`, plus
`StreamOutput`, `bytes_to_string_smart`, and network policy helpers to
protocol-owned modules
- created `codex-response-debug-context` for
`extract_response_debug_context`, `telemetry_transport_error_message`,
and related response-debug plumbing instead of leaving that behavior in
`core`
- moved `FeedbackRequestTags`, `emit_feedback_request_tags`, and
`emit_feedback_request_tags_with_auth_env` to `codex-feedback`
- deferred removal of temporary re-exports and the mechanical import
rewrites to a stacked follow-up PR so this PR stays reviewable
## Test moves
- moved auth refresh coverage from `core/tests/suite/auth_refresh.rs` to
`login/tests/suite/auth_refresh.rs`
- moved text encoding coverage from
`core/tests/suite/text_encoding_fix.rs` to
`protocol/src/exec_output_tests.rs`
- moved model info override coverage from
`core/tests/suite/model_info_overrides.rs` to
`models-manager/src/model_info_overrides_tests.rs`
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
This continues the compile-time cleanup from #16630. `SessionTask`
implementations are monomorphized, but `Session` stores the task behind
a `dyn` boundary so it can drive and abort heterogenous turn tasks
uniformly. That means we can move the `#[async_trait]` expansion off the
implementation trait, keep a small boxed adapter only at the storage
boundary, and preserve the existing task lifecycle semantics while
reducing the amount of generated async-trait glue in `codex-core`.
One measurement caveat showed up while exploring this: a warm
incremental benchmark based on `touch core/src/tasks/mod.rs && cargo
check -p codex-core --lib` was basically flat, but that was the wrong
benchmark for this change. Using package-clean `codex-core` rebuilds,
like #16630, shows the real win.
Relevant pre-change code:
- [`SessionTask` with
`#[async_trait]`](3c7f013f97/codex-rs/core/src/tasks/mod.rs (L129-L182))
- [`RunningTask` storing `Arc<dyn
SessionTask>`](3c7f013f97/codex-rs/core/src/state/turn.rs (L69-L77))
## What changed
- Switched `SessionTask::{run, abort}` to native RPITIT futures with
explicit `Send` bounds.
- Added a private `AnySessionTask` adapter that boxes those futures only
at the `Arc<dyn ...>` storage boundary.
- Updated `RunningTask` to store `Arc<dyn AnySessionTask>` and removed
`#[async_trait]` from the concrete task impls plus test-only
`SessionTask` impls.
## Timing
Benchmarked package-clean `codex-core` rebuilds with dependencies left
warm:
```shell
cargo check -p codex-core --lib >/dev/null
cargo clean -p codex-core >/dev/null
/usr/bin/time -p cargo +nightly rustc -p codex-core --lib -- \
-Z time-passes \
-Z time-passes-format=json >/dev/null
```
| revision | rustc `total` | process `real` | `generate_crate_metadata`
| `MIR_borrow_checking` | `monomorphization_collector_graph_walk` |
| --- | ---: | ---: | ---: | ---: | ---: |
| parent `3c7f013f9735` | 67.21s | 67.71s | 24.61s | 23.43s | 22.43s |
| this PR `2cafd783ac22` | 35.08s | 35.60s | 8.01s | 7.25s | 7.15s |
| delta | -47.8% | -47.4% | -67.5% | -69.1% | -68.1% |
For completeness, the warm touched-file benchmark stayed flat (`1.96s`
parent vs `1.97s` this PR), which is why that benchmark should not be
used to evaluate this refactor.
## Verification
- Ran `cargo test -p codex-core`; this change compiled and task-related
tests passed before hitting the same unrelated 5
`config::tests::*guardian*` failures already present on the parent
stack.
## Why
This finishes the config-type move out of `codex-core` by removing the
temporary compatibility shim in `codex_core::config::types`. Callers now
depend on `codex-config` directly, which keeps these config model types
owned by the config crate instead of re-expanding `codex-core` as a
transitive API surface.
## What Changed
- Removed the `codex-rs/core/src/config/types.rs` re-export shim and the
`core::config::ApprovalsReviewer` re-export.
- Updated `codex-core`, `codex-cli`, `codex-tui`, `codex-app-server`,
`codex-mcp-server`, and `codex-linux-sandbox` call sites to import
`codex_config::types` directly.
- Added explicit `codex-config` dependencies to downstream crates that
previously relied on the `codex-core` re-export.
- Regenerated `codex-rs/core/config.schema.json` after updating the
config docs path reference.
## Why
#16513 moved pure tool-registry planning into `codex-tools`, but much of
the corresponding spec/feature-gating coverage still lived in
`codex-core`. That leaves the tests for planner behavior in the crate
that no longer owns that logic and makes the next extraction steps
harder to review.
## What
Move the planner-only `spec_tests.rs` coverage into
`codex-rs/tools/src/tool_registry_plan_tests.rs` and wire it up from
`codex-rs/tools/src/tool_registry_plan.rs` using the crate-local `#[path
= "tool_registry_plan_tests.rs"] mod tests;` pattern.
The `codex-core` test file now keeps the core-side integration checks:
router-visible model tool lists, namespaced handler alias registration,
shell adapter behavior, and MCP schema edge cases that still exercise
the `core` binding layer.
## Verification
- `cargo test -p codex-tools`
- `cargo test -p codex-core tools::spec::tests`