## Why
Elevated Windows sandbox setup currently assumes that the firewall rules
it writes will take effect. On managed Windows hosts, local firewall
policy changes can be ignored or only partially apply across the active
profiles, which means setup can appear to succeed without providing the
expected network isolation.
## What changed
- Query `INetFwPolicy2::LocalPolicyModifyState` before configuring the
elevated sandbox firewall rules.
- Fail setup when Windows reports that local firewall policy edits are
ineffective or only apply to some current profiles.
- Surface that condition with a dedicated
`helper_firewall_policy_ineffective` setup error code so support and
IT-facing diagnostics can distinguish it from COM access failures.
- Add focused coverage for effective policy, group-policy override, and
partial-profile coverage cases.
## Testing
- `cargo test -p codex-windows-sandbox --bin
codex-windows-sandbox-setup`
## Why
`chatwidget.rs` is still carrying too many unrelated responsibilities in
one file. #22269 started a five-phase cleanup to move coherent behavior
domains into focused modules while keeping `chatwidget.rs` as the
composition layer. #22407 completed phase 2 by extracting input and
submission flow, and #22433 completed phase 3 by extracting protocol,
replay, streaming, and tool lifecycle handling.
This PR is phase 4. It keeps moving high-churn UI coordination out of
the central widget by extracting settings, popups, and status surfaces
without changing the visible behavior those flows already provide. This
is once again a mechanical movement of existing functions. No functional
changes.
## What Changed
- Added focused modules for runtime settings/model coordination,
model/reasoning/collaboration popups,
settings/personality/theme/audio/experimental popups, permission
prompts, status setup/output controls, and Windows sandbox prompt flows.
- Moved the remaining rate-limit nudge/status helpers and connectors
popup/loading/update helpers into their existing focused modules.
- Preserved the existing picker flows, approval behavior, status/title
setup previews, rate-limit notices, and connectors/app list behavior
while shrinking `chatwidget.rs` back toward orchestration.
- Left `codex-rs/tui/src/chatwidget.rs` as the registration and
composition surface for these extracted behaviors.
## Cleanup Phases
The five-phase cleanup plan from #22269 is:
1. Phase 1: mechanical helper and state moves. Completed in #22269.
2. Phase 2: extract input and submission flow, including queued user
messages, shell prompt submission, pending steer restoration, and thread
input snapshot/restore behavior. Completed in #22407.
3. Phase 3: extract protocol, replay, streaming, and tool lifecycle
handling, while preserving active-cell grouping, transcript
invalidation, interrupt deferral, and final-message separator behavior.
Completed in #22433.
4. Phase 4: extract settings, popups, and status surfaces, including
model/reasoning/collaboration/personality popups, permission prompts,
rate-limit UI, and connectors helpers. This PR.
5. Phase 5: clean up the remaining constructor and orchestration code
once the larger behavior domains have moved out, leaving `chatwidget.rs`
as the composition layer.
## Verification
- `cargo check -p codex-tui`
- `cargo test -p codex-tui chatwidget::tests::permissions`
- `cargo test -p codex-tui chatwidget::tests::status_surface_previews`
- `cargo test -p codex-tui chatwidget::tests::popups_and_settings`
- `cargo test -p codex-tui chatwidget::tests::status_and_layout`
`cargo test -p codex-tui` also compiles and begins running, but aborts
in the unchanged app-side test
`app::tests::discard_side_thread_keeps_local_state_when_server_close_fails`
with a reproducible stack overflow.
## Why
`TurnContext::cwd` and `TurnContext::resolve_path` are being phased out
in favor of using the selected turn environment cwd directly.
Deprecating both APIs makes any new direct dependency visible while
preserving the existing migration path for current callers.
## What Changed
- Marked `TurnContext::cwd` and `TurnContext::resolve_path` as
deprecated with guidance to use the selected turn environment cwd
instead.
- Added exact `#[allow(deprecated)]` suppressions at each existing
direct usage site, including tests, rather than adding crate-wide
suppression.
- Kept the change behavior-preserving: current cwd reads, writes, and
path resolution continue to use the same values.
## Verification
- `just fmt`
- `cargo check -p codex-core`
- `cargo check -p codex-core --tests`
- `git diff --check`
# Description
We need to set the appropriate Product SKU for full functionality for
the apps endpoints for each type of client
# Testing
`./target/debug/codex --enable app`
<img width="1786" height="398" alt="CleanShot 2026-05-12 at 11 51 25@2x"
src="https://github.com/user-attachments/assets/2142f768-fc72-4fcb-8f39-9bd0d8569170"
/>
Regular slack flows seem to work, also curling these endpoints with the
correct SKU returns the right apps
## Why
`code_mode_only` filters code-mode nested tools out of the top-level
tool list. For multi-agent v2, we need a rollout shape where the
collaboration tools remain callable as normal model tools without also
being embedded into the code-mode `exec` tool declaration.
Related to this:
https://openai-corpws.slack.com/archives/C0AQLHB4U75/p1778660267922549
## What Changed
- Adds `features.multi_agent_v2.non_code_mode_only`, including config
resolution, profile override handling, and generated schema coverage.
- Introduces `ToolExposure::DirectModelOnly` so a tool can be included
in the initial model-visible list while staying out of the nested
code-mode tool surface.
- Applies that exposure to the multi-agent v2 tools when the new flag is
set: `spawn_agent`, `send_message`, `followup_task`, `wait_agent`,
`close_agent`, and `list_agents`.
- Updates code-mode-only filtering so direct-model-only tools remain
visible while ordinary nested code-mode tools are still hidden.
## Verification
- Added config parsing/profile tests for `non_code_mode_only`.
- Added tool spec coverage for the code-mode-only multi-agent v2
exposure behavior.
## Why
Recent session history showed no active use of the raw `shell`,
`local_shell`, or `container.exec` execution surfaces. Keeping those
handlers/specs wired into core leaves duplicate shell execution paths
alongside the supported `shell_command` and unified exec tools.
## What changed
- Removed the raw `shell` handler/spec and its `ShellToolCallParams`
protocol helper.
- Removed the legacy `local_shell` and `container.exec` handler/spec
plumbing while preserving persisted-history compatibility for old
response items.
- Normalized model/config `default` and `local` shell selections to
`shell_command`.
- Pruned tests that exercised removed raw-shell/local-shell/apply-patch
variants and kept coverage on `shell_command`, unified exec, and
freeform `apply_patch`.
## Verification
- `git diff --check`
- `cargo test -p codex-protocol`
- `cargo test -p codex-tools`
- `cargo test -p codex-core tools::handlers::shell`
- `cargo test -p codex-core tools::spec`
- `cargo test -p codex-core tools::router`
- `cargo test -p codex-core
active_call_preserves_triggering_command_context`
- `cargo test -p codex-core guardian_tests`
- `cargo test -p codex-core --test all shell_serialization`
- `cargo test -p codex-core --test all apply_patch_cli`
- `cargo test -p codex-core --test all shell_command_`
- `cargo test -p codex-core --test all local_shell`
- `cargo test -p codex-core --test all otel::`
- `cargo test -p codex-core --test all hooks::`
- `just fix -p codex-core`
- `just fix -p codex-tools`
## Why
Stop sending duplicate `session_id`/`thread_id` headers. We only want
the hyphenated forms as `_` is rejected by some proxies
Related discussion here:
https://openai.slack.com/archives/C095U48JNL9/p1778508316923179
## What
- Keep `session-id` and `thread-id`
- Remove the underscore aliases
## Why
Deferred tools were tracked with separate side-channel filtering after
tool specs had already been assembled. That made the registry
responsible for executing tools while the router/spec planner separately
decided whether those same tools should be exposed to the model up
front.
This PR makes exposure part of the tool handler contract so direct
versus deferred availability travels with the executable tool
registration.
Next step will be to simplify registration
## What Changed
- Adds `ToolExposure` to `codex-tools` and exposes it through
`ToolExecutor`, defaulting tools to `Direct`.
- Teaches dynamic tools and MCP handlers to mark deferred tools as
`Deferred` at construction time.
- Renames the registry object-safe wrapper from `AnyToolHandler` to
`RegisteredTool` and uses `ToolExposure` when deciding whether to
include a handler's spec in the initial model-visible tool list.
- Refactors tool spec planning to derive direct specs and deferred
search entries from registered handlers, removing the router's
special-case deferred dynamic tool filtering.
## Verification
- Not run.
## Why
Codex intentionally ignores unknown `config.toml` fields by default so
older and newer config files keep working across versions. That leniency
also makes typo detection hard because misspelled or misplaced keys
disappear silently.
This change adds an opt-in strict config mode so users and tooling can
fail fast on unrecognized config fields without changing the default
permissive behavior.
This feature is possible because `serde_ignored` exposes the exact
signal Codex needs: it lets Codex run ordinary Serde deserialization
while recording fields Serde would otherwise ignore. That avoids
requiring `#[serde(deny_unknown_fields)]` across every config type and
keeps strict validation opt-in around the existing config model.
## What Changed
### Added strict config validation
- Added `serde_ignored`-based validation for `ConfigToml` in
`codex-rs/config/src/strict_config.rs`.
- Combined `serde_ignored` with `serde_path_to_error` so strict mode
preserves typed config error paths while also collecting fields Serde
would otherwise ignore.
- Added strict-mode validation for unknown `[features]` keys, including
keys that would otherwise be accepted by `FeaturesToml`'s flattened
boolean map.
- Kept typed config errors ahead of ignored-field reporting, so
malformed known fields are reported before unknown-field diagnostics.
- Added source-range diagnostics for top-level and nested unknown config
fields, including non-file managed preference source names.
### Kept parsing single-pass per source
- Reworked file and managed-config loading so strict validation reuses
the already parsed `TomlValue` for that source.
- For actual config files and managed config strings, the loader now
reads once, parses once, and validates that same parsed value instead of
deserializing multiple times.
- Validated `-c` / `--config` override layers with the same
base-directory context used for normal relative-path resolution, so
unknown override keys are still reported when another override contains
a relative path.
### Scoped `--strict-config` to config-heavy entry points
- Added support for `--strict-config` on the main config-loading entry
points where it is most useful:
- `codex`
- `codex resume`
- `codex fork`
- `codex exec`
- `codex review`
- `codex mcp-server`
- `codex app-server` when running the server itself
- the standalone `codex-app-server` binary
- the standalone `codex-exec` binary
- Commands outside that set now reject `--strict-config` early with
targeted errors instead of accepting it everywhere through shared CLI
plumbing.
- `codex app-server` subcommands such as `proxy`, `daemon`, and
`generate-*` are intentionally excluded from the first rollout.
- When app-server strict mode sees invalid config, app-server exits with
the config error instead of logging a warning and continuing with
defaults.
- Introduced a dedicated `ReviewCommand` wrapper in `codex-rs/cli`
instead of extending shared `ReviewArgs`, so `--strict-config` stays on
the outer config-loading command surface and does not become part of the
reusable review payload used by `codex exec review`.
### Coverage
- Added tests for top-level and nested unknown config fields, unknown
`[features]` keys, typed-error precedence, source-location reporting,
and non-file managed preference source names.
- Added CLI coverage showing invalid `--enable`, invalid `--disable`,
and unknown `-c` overrides still error when `--strict-config` is
present, including compound-looking feature names such as
`multi_agent_v2.subagent_usage_hint_text`.
- Added integration coverage showing both `codex app-server
--strict-config` and standalone `codex-app-server --strict-config` exit
with an error for unknown config fields instead of starting with
fallback defaults.
- Added coverage showing unsupported command surfaces reject
`--strict-config` with explicit errors.
## Example Usage
Run Codex with strict config validation enabled:
```shell
codex --strict-config
```
Strict config mode is also available on the supported config-heavy
subcommands:
```shell
codex --strict-config exec "explain this repository"
codex review --strict-config --uncommitted
codex mcp-server --strict-config
codex app-server --strict-config --listen off
codex-app-server --strict-config --listen off
```
For example, if `~/.codex/config.toml` contains a typo in a key name:
```toml
model = "gpt-5"
approval_polic = "on-request"
```
then `codex --strict-config` reports the misspelled key instead of
silently ignoring it. The path is shortened to `~` here for readability:
```text
$ codex --strict-config
Error loading config.toml:
~/.codex/config.toml:2:1: unknown configuration field `approval_polic`
|
2 | approval_polic = "on-request"
| ^^^^^^^^^^^^^^
```
Without `--strict-config`, Codex keeps the existing permissive behavior
and ignores the unknown key.
Strict config mode also validates ad-hoc `-c` / `--config` overrides:
```text
$ codex --strict-config -c foo=bar
Error: unknown configuration field `foo` in -c/--config override
$ codex --strict-config -c features.foo=true
Error: unknown configuration field `features.foo` in -c/--config override
```
Invalid feature toggles are rejected too, including values that look
like nested config paths:
```text
$ codex --strict-config --enable does_not_exist
Error: Unknown feature flag: does_not_exist
$ codex --strict-config --disable does_not_exist
Error: Unknown feature flag: does_not_exist
$ codex --strict-config --enable multi_agent_v2.subagent_usage_hint_text
Error: Unknown feature flag: multi_agent_v2.subagent_usage_hint_text
```
Unsupported commands reject the flag explicitly:
```text
$ codex --strict-config cloud list
Error: `--strict-config` is not supported for `codex cloud`
```
## Verification
The `codex-cli` `strict_config` tests cover invalid `--enable`, invalid
`--disable`, the compound `multi_agent_v2.subagent_usage_hint_text`
case, unknown `-c` overrides, app-server strict startup failure through
`codex app-server`, and rejection for unsupported commands such as
`codex cloud`, `codex mcp`, `codex remote-control`, and `codex
app-server proxy`.
The config and config-loader tests cover unknown top-level fields,
unknown nested fields, unknown `[features]` keys, source-location
reporting, non-file managed config sources, and `-c` validation for keys
such as `features.foo`.
The app-server test suite covers standalone `codex-app-server
--strict-config` startup failure for an unknown config field.
## Documentation
The Codex CLI docs on developers.openai.com/codex should mention
`--strict-config` as an opt-in validation mode for supported
config-heavy entry points once this ships.
## Why
`chatwidget.rs` is still carrying too many unrelated responsibilities in
one file. #22269 started a five-phase cleanup to move coherent behavior
domains into focused modules while keeping `chatwidget.rs` as the
composition layer. #22407 completed phase 2 by extracting input and
submission flow.
This PR is phase 3. It keeps moving high-churn event handling out of the
central widget by extracting protocol, replay, streaming, and tool
lifecycle handling without changing the visible behavior those flows
already provide. This is once again just a mechanical movement of
existing functions. No functional changes.
## What Changed
- Added focused modules for protocol request dispatch, replay rendering,
assistant/plan/reasoning streaming, turn runtime bookkeeping, hook
lifecycle handling, command lifecycle handling, tool lifecycle
rendering, and interactive tool request prompts.
- Kept active-cell grouping, transcript invalidation, interrupt
deferral, and final-message separator behavior in the same flows, just
moved into smaller files.
- Added module header comments to the new files so the ownership
boundaries are explicit.
- Left `codex-rs/tui/src/chatwidget.rs` as the registration and
orchestration surface for these extracted behaviors.
## Cleanup Phases
The five-phase cleanup plan from #22269 is:
1. Phase 1: mechanical helper and state moves. Completed in #22269.
2. Phase 2: extract input and submission flow, including queued user
messages, shell prompt submission, pending steer restoration, and thread
input snapshot/restore behavior. Completed in #22407.
3. Phase 3: extract protocol, replay, streaming, and tool lifecycle
handling, while preserving active-cell grouping, transcript
invalidation, interrupt deferral, and final-message separator behavior.
This PR.
4. Phase 4: extract settings, popups, and status surfaces, including
model/reasoning/collaboration/personality popups, permission prompts,
rate-limit UI, and connectors helpers.
5. Phase 5: clean up the remaining constructor and orchestration code
once the larger behavior domains have moved out, leaving `chatwidget.rs`
as the composition layer.
## Why
The memories extension has several distinct responsibilities:
registering its prompt and tool contributors, enforcing local-memory
filesystem boundaries, implementing list/read/search behavior, and
wrapping that backend as extension tools. Those responsibilities were
concentrated in `lib.rs`, `local.rs`, and the tool modules, which made
follow-up work harder to review and risked growing files through
unrelated edits.
This PR reorganizes the crate so each responsibility has a narrower
owner while preserving the same extension entrypoint and memory tool
behavior.
## What Changed
- Moved extension lifecycle, prompt, and tool registration into
`src/extension.rs`, leaving `src/lib.rs` as the small crate entrypoint.
- Split `LocalMemoriesBackend` helpers into `local/list.rs`,
`local/path.rs`, `local/read.rs`, and `local/search.rs`.
- Centralized tool names and limits at the crate level, and kept the
backend and extension implementation crate-private.
- Made `memory_list`, `memory_read`, and `memory_search` tool executors
generic over `MemoriesBackend`, so tests can exercise the full executor
path without depending on tool internals.
- Consolidated and expanded memory extension tests in `src/tests.rs`,
including read/search tool output coverage, multi-query search, windowed
`all_within_lines`, and legacy `query` rejection.
## Testing
- Not run locally.
## Why
Picker-style UI in the TUI has accumulated a mix of hardcoded navigation
keys. Some lists supported page movement, some did not; some accepted
Vim-like keys, while others only accepted arrows; and tabbed or
horizontally adjustable pickers had no shared keymap action for
left/right movement.
This PR makes picker/list navigation consistent and configurable so
users can rely on the same defaults across the TUI.
## What Changed
- Adds shared list keymap actions for:
- vertical movement: `move_up`, `move_down`
- horizontal movement: `move_left`, `move_right`
- paging and jumps: `page_up`, `page_down`, `jump_top`, `jump_bottom`
- Adds defaults:
- Up/down: arrows, `Ctrl+P/N`, `Ctrl+K/J`, and plain `k/j` where text
input is not active
- Page up/down: `PageUp/PageDown` and `Ctrl+B/F`
- First/last: `Home/End`
- Left/right: `Left/Right` and `Ctrl+H/L`
- Wires the shared list keymap through picker and list surfaces
including session resume, multi-select, tabbed selection lists,
settings-style lists, app-link selection, MCP elicitation,
request-user-input, and the OSS selection wizard.
- Keeps search behavior intact by reserving printable characters for
query text in searchable pickers.
- Updates keymap setup actions, config schema, snapshots, and focused
coverage for the new list actions.
## How to Test
1. Start Codex from this branch and open the session picker, for example
with an existing session history.
2. In the session list, verify that `Ctrl+J/K` moves the selection
down/up.
3. Verify that `Ctrl+F/B` pages down/up and `Home/End` jumps to the
first/last visible session.
4. Type printable search text such as `j` or `k` and confirm it updates
the query instead of navigating.
5. Focus a picker control that changes values horizontally, such as a
session picker toolbar control, and verify `Ctrl+H/L` changes the
focused value like left/right arrows.
Targeted tests run:
- `cargo test -p codex-tui keymap::tests::`
- `cargo test -p codex-tui keymap_setup::tests::`
- `cargo test -p codex-tui horizontal_list_keys`
- `cargo test -p codex-tui page_and_jump_navigation_use_list_keymap`
- `cargo test -p codex-tui ctrl_h_l_move_provider_selection`
- `cargo test -p codex-tui scroll_state::tests`
- `cargo test -p codex-tui
switching_tabs_changes_visible_items_and_clears_search`
- `cargo test -p codex-tui toggle_sort_key_reloads_with_new_sort`
Also ran `just write-config-schema`, `just fmt`, `just fix -p
codex-tui`, `just argument-comment-lint`, and `git diff --check`.
Note: `cargo test -p codex-tui` was attempted and still aborts in the
pre-existing
`tests::fork_last_filters_latest_session_by_cwd_unless_show_all` stack
overflow, which is unrelated to this branch.
## Why
Extensions can observe thread and turn lifecycle events today, but there
was no single host-owned hook for changes to the effective thread
configuration. That makes features that need to react to model,
permission, or tool-suggest updates either depend on individual mutation
paths or risk going stale after runtime config refreshes.
This adds a typed config-change contributor so extension-owned state can
stay synchronized with the effective thread config while the host
remains responsible for deciding when config changed.
## What Changed
- Added `ConfigContributor<C>` to `codex_extension_api`, with
before/after immutable snapshots of the effective config plus
session/thread extension stores.
- Added registry builder/accessor support through `config_contributor`
and `config_contributors`.
- Emits config-change callbacks after committed updates from session
settings, per-turn setting updates, and `refresh_runtime_config`.
- Builds effective config snapshots only when config contributors are
registered, and suppresses no-op callbacks when the before/after
snapshots are equal.
- Added a core session regression test that verifies contributors
observe both model changes and user-layer runtime config changes,
including access to session and thread extension stores.
## Validation
Added `config_change_contributor_observes_effective_config_changes` in
`codex-rs/core/src/session/tests.rs` to cover the new contributor path.
## Why
Spawned agents can already override `model` and `reasoning_effort`, but
they have no equivalent way to opt into a model-supported service tier.
That makes it impossible to preserve or intentionally select tiered
execution behavior when delegating work to a sub-agent, even though the
model catalog already advertises supported `service_tiers`.
## What changed
- Add optional `service_tier` to both legacy and `MultiAgentV2`
`spawn_agent` tool inputs.
- Show each picker-visible model's supported service tier ids and
descriptions in the `spawn_agent` tool guidance.
- Resolve service tier selection after the child agent's effective model
is known.
- Inherit the parent tier when omitted and still supported by the final
child model; otherwise clear it.
- Reject explicit unsupported tier requests with a model-facing error.
- Keep explicit `service_tier` usable on full-history forks, while still
honoring the existing model/reasoning fork restrictions.
- Hide `service_tier` alongside other spawn metadata when
`hide_spawn_agent_metadata` is enabled.
## Verification
Added focused coverage for:
- v1/v2 `spawn_agent` schema exposure for `service_tier`
- tier descriptions in spawn guidance
- hidden-metadata suppression
- explicit supported tier selection
- explicit unknown and unsupported tier rejection
- inherited tier preservation or clearing based on child-model support
- full-history fork acceptance for explicit service tiers in both v1 and
v2
Local Rust tests were not run in this workspace per repo guidance; the
new coverage is included for CI.
## Why
We added Zellij-specific TUI workarounds because older Zellij behavior
did not work with Codex's normal terminal model:
- #8555 made `tui.alternate_screen = "auto"` disable alternate screen in
Zellij so transcript history stayed available.
- #16578 avoided scroll-region operations in Zellij by emitting raw
newlines and using a separate composer styling path.
This PR removes both workarounds because the latest Zellij release
tested locally (`zellij 0.44.1`) works correctly with Codex's standard
TUI behavior: normal alternate-screen handling, redraw, and history
insertion.
## What Changed
- Removed the `InsertHistoryMode::Zellij` path and the Zellij-only
newline scrollback insertion behavior.
- Removed cached `is_zellij` state from the TUI and composer.
- Removed Zellij-specific composer styling, the helper snapshot, and the
`TerminalInfo::is_zellij()` convenience method that only served this
workaround.
- Changed `tui.alternate_screen = "auto"` to use alternate screen for
Zellij too; `--no-alt-screen` and `tui.alternate_screen = "never"` still
preserve the inline mode escape hatch.
- Updated the generated config schema description for
`tui.alternate_screen`.
## How to Test
Manual smoke path used with `zellij 0.44.1`:
1. Build and run this branch inside a Zellij `0.44.1` session with
default config.
2. Start Codex normally and produce enough assistant/tool output to
create scrollback.
3. Confirm the transcript remains readable, the composer renders
normally, and scrolling through terminal history works.
4. Resize the Zellij pane while output exists and confirm the TUI
redraws without duplicated, missing, or stale rows.
5. Compare with `--no-alt-screen` or `-c tui.alternate_screen=never` if
you want to verify the inline fallback still works.
Targeted tests:
- `just write-config-schema`
- `just fmt`
- `just fix -p codex-tui`
- `cargo test -p codex-terminal-detection`
- `cargo test -p codex-tui alternate_screen_auto_uses_alt_screen`
Attempted but did not complete locally:
- `cargo test -p codex-tui` built and ran the new test successfully,
then failed later on unrelated local failures in
`status_permissions_full_disk_managed_*` and a stack overflow in
`tests::fork_last_filters_latest_session_by_cwd_unless_show_all`.
## Documentation
No developers.openai.com Codex documentation update is needed for this
revert.
## Summary
- add a scoped level_id to ExtensionData and expose it through
level_id()
- remove thread_id/turn_id parameters from extension contributor inputs
where the scoped ExtensionData already carries that identity
- move turn-scoped extension data onto TurnContext so token usage and
lifecycle contributors can share the same turn store
## Testing
- cargo check -p codex-extension-api -p codex-core --tests
- cargo test -p codex-extension-api
- cargo test -p codex-guardian
- cargo test -p codex-core --lib
record_token_usage_info_notifies_extension_contributors
- cargo test -p codex-core --lib
submission_loop_channel_close_emits_thread_stop_lifecycle
- cargo test -p codex-core --lib
submission_loop_channel_close_aborts_active_turn_before_thread_stop_lifecycle
- just fix -p codex-extension-api
- just fix -p codex-guardian
- just fix -p codex-core
- just fmt
## Note
- Attempted cargo test -p codex-core; it aborted in
agent::control::tests::spawn_agent_fork_last_n_turns_keeps_only_recent_turns
with the existing stack overflow before the full suite completed.
## Why
Extensions need a stable place to observe token accounting after Codex
folds model-provider usage into the session's cached `TokenUsageInfo`.
Without a contributor hook, extension-owned features that need last-turn
or cumulative token usage have to duplicate session plumbing or infer
state from client-facing `TokenCount` notifications.
## What changed
- Added `TokenUsageContributor` to `codex-extension-api`, passing
session/thread `ExtensionData`, `ThreadId`, turn id, and the current
`TokenUsageInfo`.
- Added registry builder/storage support for token-usage contributors.
- Invoked registered contributors from
`Session::record_token_usage_info` after the session token cache is
updated and before the client `TokenCount` notification is emitted.
## Testing
- Added `record_token_usage_info_notifies_extension_contributors`,
covering cumulative token usage updates and access to both extension
stores.
## Why
The thread lifecycle contributor hooks from #22476 should observe every
session teardown. The explicit `Op::Shutdown` path already emitted
`on_thread_stop`, but when `submission_loop` exited because its
submission channel closed, it only tore down runtime services. That
meant extensions could miss the thread-stop lifecycle signal on implicit
runtime shutdown.
## What Changed
- Split shared runtime teardown into `shutdown_runtime_services(...)`.
- Split thread-stop lifecycle emission into
`emit_thread_stop_lifecycle(...)`.
- Reused those helpers from both explicit shutdown and the channel-close
shutdown path.
- Tracked whether `Op::Shutdown` was received so the explicit path does
not double-emit lifecycle events after it exits the loop.
- Added a regression test that closes the submission channel and asserts
`ThreadLifecycleContributor::on_thread_stop` runs once with the expected
thread/session stores.
## Testing
- `cargo test -p codex-core
submission_loop_channel_close_emits_thread_stop_lifecycle`
## Why
Extensions can already contribute prompt, tool, turn-item, and
thread-lifecycle behavior, but there was no explicit host-owned hook for
per-turn setup and cleanup. That makes extension-private turn state
awkward: an extension either has to stash it outside the turn lifecycle
or depend on core runtime objects.
This adds a small turn lifecycle boundary. Extensions receive stable
identifiers plus the existing session, thread, and turn `ExtensionData`
stores, while core keeps owning task scheduling, cancellation, and turn
teardown.
## What Changed
- Added `TurnLifecycleContributor` with `on_turn_start`, `on_turn_stop`,
and `on_turn_abort` callbacks in `codex-rs/ext/extension-api`.
- Added typed `TurnStartInput`, `TurnStopInput`, and `TurnAbortInput`
payloads that expose `thread_id`, `turn_id`, `session_store`,
`thread_store`, and `turn_store`.
- Registered and re-exported turn lifecycle contributors through
`ExtensionRegistry` and `ExtensionRegistryBuilder`.
- Wired `Session` to emit turn start, stop, and abort callbacks from the
existing turn/task lifecycle paths.
- Carried the turn-scoped `ExtensionData` through `RunningTask` and
`RemovedTask` so stop/abort callbacks receive the same turn store
created at turn start.
## Verification
- Not run locally.
## Why
Extensions that need thread-scoped state currently only get a start-time
callback. That is enough for seeding stores, but it leaves the host
without a shared extension seam for later thread rehydrate and flush
work as thread ownership evolves. This PR turns that start-only seam
into a host-owned thread lifecycle contributor contract so
extension-private state can stay behind the extension API instead of
leaking extra orchestration through core.
## What changed
- Replaced `ThreadStartContributor` with `ThreadLifecycleContributor`
and added typed lifecycle inputs for thread start, resume, and stop. The
contract lives in
[`contributors/thread_lifecycle.rs`](d0e9211f70/codex-rs/ext/extension-api/src/contributors/thread_lifecycle.rs (L1-L64)).
- Kept the existing start-time behavior intact by routing session
construction through `on_thread_start`.
- Invoked `on_thread_stop` during session shutdown before thread-scoped
extension state is dropped, while isolating contributor failures behind
warning logs.
- Migrated `git-attribution` and `guardian` onto the lifecycle
registration path.
- Renamed the extension registry plumbing from start-specific
contributors to lifecycle-specific contributors.
## Notes
`on_thread_resume` is introduced at the API boundary here so extensions
can target the final lifecycle shape; host resume dispatch can be wired
where that runtime path is finalized.
## Summary
- move `plugin/list` from the shared `config` read queue onto a
dedicated `plugin-list` shared-read queue
- move `plugin/read` onto that same dedicated shared-read queue as well
- keep the existing scheduler behavior unchanged
- allow plugin list/read operations to proceed independently of
config-family writes, accepting temporary stale or transient read errors
during concurrent mutations
## Validation
- `just fmt`
- `cargo test -p codex-app-server-protocol`
## Why
Extension tools were split across two public runtime contracts:
`codex-tool-api` exposed `ToolBundle` plus its own call/spec/error
types, while core native tools used `codex_tools::ToolExecutor`. That
made contributed tool specs and execution behavior easy to drift apart
and added another crate boundary for what should be one executable-tool
seam.
This PR makes `ToolExecutor` the single runtime contract and keeps
extension-specific pinning in `codex-extension-api`.
## Remaining todo
https://github.com/openai/codex/pull/22369/changes#diff-b935ea8245c3ce568a30cff660175fa6390b66b872ae409e1e2e965738250741R5
Either generic `Invocation` or sub-extract the `ToolCall` and clean
`ToolInvocation`
## What changed
- Removed the `codex-tool-api` workspace crate and its dependencies from
core and `codex-extension-api`.
- Made `codex_tools::ToolExecutor` object-safe with `async_trait` so
extension contributors can return a dyn executor.
- Added the extension-facing aliases under
`ext/extension-api/src/contributors/tools.rs`, including
`ExtensionToolExecutor = dyn ToolExecutor<ToolCall, Output =
ExtensionToolOutput>`.
- Changed `ToolContributor::tools` to return extension executors
directly instead of `ToolBundle`s.
- Updated core’s extension tool handler/registry/router path to adapt
those extension executors into the existing native `ToolInvocation`
runtime path.
- Added focused coverage for extension tools being registered,
model-visible, dispatchable, and not replacing built-in tools.
## Verification
- `cargo test -p codex-tools`
- `cargo test -p codex-extension-api`
## Why
Codex still models model-visible tools and executable behavior largely
inside `codex-core`, which makes it harder to evolve the tool system
toward a single reusable abstraction for built-ins, MCP-backed tools,
dynamic tools, and later tools injected from outside core.
This PR takes the next incremental step in that direction by moving the
common execution-facing pieces out of core and separating them from
core-only orchestration. The intent is to let shared tool abstractions
improve in one place, while `codex-core` keeps the parts that are still
inherently host-specific today, such as `ToolInvocation`, dispatch
wiring, and hook integration.
This PR is mostly moving things around. The only interesting piece is
this abstraction:
https://github.com/openai/codex/pull/22359/changes#diff-81af519002548ba51ed102bdaaf77e081d40a1e73a6e5f9b104bbbc96a6f1b3dR13
## What changed
- Added `codex_tools::ToolExecutor<Invocation>` as the shared execution
trait for model-visible tools.
- Moved the reusable execution support types from `codex-core` into
`codex-tools`:
- `FunctionCallError`
- `ToolPayload`
- `ToolOutput`
- Refactored core tool implementations so that execution behavior lives
on `ToolExecutor<ToolInvocation>`, while `ToolHandler` remains the
core-local extension point for hook payloads, telemetry tags, diff
consumers, and other orchestration concerns.
- Kept the registry and dispatch flow behaviorally unchanged while
making the shared/extracted boundary explicit across built-in, MCP,
dynamic, extension-backed, shell, and multi-agent tool handlers.
## Verification
- `cargo test -p codex-tools`
- `just fix -p codex-tools`
- `just fix -p codex-core`
- `cargo test -p codex-core` progressed through the updated tool
surfaces and then hit the existing unrelated multi-agent stack overflow
in
`tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`.
## Why
`codex-extension-api` needs an approval hook that lets an installed
extension own a rendered approval-review prompt and produce the final
`ReviewDecision`. The prior interceptor stub only exposed a yes/no claim
and did not model the review result itself, which left the host with the
missing half of the control flow.
## What changed
- Replaces `ApprovalInterceptorContributor` with
[`ApprovalReviewContributor`](c49d17531e/codex-rs/ext/extension-api/src/contributors.rs (L43-L55)),
which may claim a rendered prompt and return an async `ReviewDecision`.
- Re-exports the new contributor and future types from `extension-api`.
- Adds registry support through `approval_review_contributor(...)` plus
[`ExtensionRegistry::approval_review(...)`](c49d17531e/codex-rs/ext/extension-api/src/registry.rs (L90-L101)),
which returns the first installed contributor that claims the prompt.
## Summary
- move the `view_image` sandbox filesystem-read unit test onto a
temporary cwd
- keep the turn cwd and selected turn environment cwd aligned inside the
test
- avoid leaving `core/image.png` behind in the repo checkout after the
test runs
## Root cause
The test wrote `image.png` beneath `turn.cwd`, and the shared session
test helper defaults that cwd to the current repo directory when no
override is provided.
## Validation
- `just fmt`
- `cargo test -p codex-core
tools::handlers::view_image::tests::handle_passes_sandbox_context_for_local_filesystem_reads`
Adds plugin/share/checkout to turn a shared remote plugin into a local
working copy under ~/plugins/<name>.
Registers the copy in the managed personal marketplace and records the
remote-to-local mapping for later share/save flows.
---------
Co-authored-by: Codex <noreply@openai.com>
# Why
Hook trust happens through the TUI in `/hooks` so it can block
non-interactive use cases. This flag will allow users that are using
codex headlessly to bypass hooks when they want to.
# What
This adds one invocation-scoped escape hatch.
- the CLI flag sets a runtime-only `bypass_hook_trust` override; there
is no durable `config.toml` setting
- hook discovery still respects normal enablement, so explicitly
disabled hooks remain disabled
- we show a `--dangerously-bypass-hook-trust is enabled. Enabled hooks
may run without review for this invocation.` message on startup so
accidental use is visible in both interactive and exec flows
This keeps “enabled” and “trusted” as separate concepts in the normal
path, while giving CI/E2E callers a stable way to opt into the
exceptional path when they already control the hook set.
# Why
Linked worktrees currently load their own project hook declarations, so
the same repo can present different hook definitions depending on which
checkout is active. https://github.com/openai/codex/pull/21762 tried to
share trust by giving matching worktree hooks a shared synthetic key,
but review pointed out that divergent worktree hook definitions would
then fight over one `trusted_hash`.
Instead of introducing a second trust model, this makes linked worktrees
use the root checkout as the single source of truth for project hook
declarations. Worktree-local project config can still diverge for
unrelated settings, but project hooks now keep one real source path and
one trust state per repo.
# What
- Teach project config loading to remember the matching root-checkout
`.codex/` folder for actual linked-worktree project layers.
- Keep ordinary project config sourced from the worktree, but replace
project hook declarations with the root checkout's matching layer before
hook discovery runs, including linked-worktree layers with `.codex/` but
no local `config.toml`.
- Make hook discovery use that authoritative hook folder for both
`hooks.json` and TOML hook source paths, so linked worktrees produce the
same hook key and trust state as the root checkout.
- Cover the linked-worktree path plus regressions for missing worktree
`config.toml` and nested non-worktree project roots.
## Why
`UnavailableDummyTools` kept synthetic placeholder tools alive for
historical tool calls whose backing MCP tool was no longer available.
That path adds stale model-visible tool specs and special routing at the
point where unavailable MCP calls should use ordinary current-tool
handling. This removes the runtime backfill instead of preserving a
second compatibility lane.
## Is it safe to remove?
The unavailable tools were added in #17853 after a CS issue when a
previously-called MCP tool failed to load and was omitted from the CS
spec. Now that we have tool search, I think this is resolved:
- API merges tools from previous TST output into effective tool set so
theyre always in CS spec
- if an MCP tool surfaced by TST later becomes unavailable, the model
can still call it and it will just return model-visible error
- both TST output and function call output are dropped on compaction so
model will not remember old calls to MCP post compaction
## What changed
- Delete unavailable-tool collection, placeholder handler, router/spec
plumbing, and obsolete placeholder coverage.
- Keep `features.unavailable_dummy_tools` as a removed no-op feature
tombstone so existing configs still parse cleanly.
- Add an integration-style `tool_search` regression test showing that a
deferred MCP tool surfaced through `tool_search` still routes through
MCP and returns a model-visible tool-call error rather than `unsupported
call`.
## Verification
- `cargo test -p codex-core tool_search`
## Why
`chatwidget.rs` is still carrying too many unrelated responsibilities in
one file. #22269 started a five-phase effort to move coherent behavior
domains into focused modules while keeping `chatwidget.rs` as the
composition layer.
This PR is phase 2 of that plan. It extracts the input and submission
flow as a mechanical move before the later protocol, popup/status, and
constructor/orchestration phases.
## What Changed
- Added `codex-rs/tui/src/chatwidget/input_flow.rs` for composer input
results, queued user-message draining, pending-input previews, and
mode-specific submission entry points.
- Added `codex-rs/tui/src/chatwidget/input_submission.rs` for
user-message construction/submission, shell prompt submission,
structured mention resolution, and blocked image draft restoration.
- Added `codex-rs/tui/src/chatwidget/input_restore.rs` for
initial-message submission, pending steer restoration after interrupts,
and thread input snapshot/restore behavior.
- Registered the new modules and removed the moved `ChatWidget` impl
methods from `codex-rs/tui/src/chatwidget.rs`.
## Follow-On Refactor Phases
The five-phase plan from #22269 is:
- Phase 1: mechanical helper and state moves. Completed in #22269.
- Phase 2: extract input and submission flow, including queued user
messages, shell prompt submission, pending steer restoration, and thread
input snapshot/restore behavior. This PR.
- Phase 3: extract protocol, replay, streaming, and tool lifecycle
handling, while preserving active-cell grouping, transcript
invalidation, interrupt deferral, and final-message separator behavior.
- Phase 4: extract settings, popups, and status surfaces, including
model/reasoning/collaboration/personality popups, permission prompts,
rate-limit UI, and connectors helpers.
- Phase 5: clean up the remaining constructor and orchestration code
once the larger behavior domains have moved out, leaving `chatwidget.rs`
as the composition layer.
## Why
Added support for UDS connections in `codex --remote`.
TUI also now connects to local app-server using UDS by default if it is
running and set to listen to UDS connection.
## What Changed
- Introduced `RemoteAppServerEndpoint` with `WebSocket` and `UnixSocket`
variants.
- Reused the existing JSON-RPC-over-WebSocket protocol over either a TCP
WebSocket stream or a UDS stream.
- Updated `codex --remote` to accept `ws://host:port`,
`wss://host:port`, `unix://`, and `unix://PATH`.
- Kept `--remote-auth-token-env` restricted to `wss://` and loopback
`ws://` remotes.
- Added a fast TUI startup probe for the default daemon socket, falling
back to the embedded app server when the daemon is absent or
unresponsive.
## Verification
- Manually verified that the updated remote flow works.
- Added coverage for UDS remote round trips, WebSocket auth headers,
auth-token transport policy, remote address parsing, and missing-daemon
fallback.
- Ran focused remote test coverage locally.
- Keep shared-with-me as the plugin/list request kind, but return
private plugins under workspace-shared-with-me-private.
- Add workspace-shared-with-me-unlisted for installed workspace plugins
with UNLISTED discoverability,
## Why
This builds on the handler-owned spec refactor by moving deferred
tool-search metadata to the same handlers that already own tool specs.
The registry builder no longer needs a separate prebuilt
`tool_search_entries` path; it can collect searchable entries from
deferred handlers directly.
## What changed
- Added `search_info()` to tool handlers and implemented it for MCP and
dynamic handlers.
- Reused handler `spec()` output when constructing tool-search entries,
adapting it into the deferred `LoadableToolSpec` shape expected by
`tool_search`.
- Simplified `build_tool_registry_builder(...)` so `tool_search`
registration is based on deferred handlers with search info.
- Removed the old standalone search-entry builders and now-unused
`codex-tools` discovery helper exports.
## Verification
- `cargo test -p codex-core tools::handlers::tool_search::tests:: --
--nocapture`
- `cargo test -p codex-core tools::spec_plan::tests::search_tool --
--nocapture`
- `cargo test -p codex-core tools::spec::tests:: -- --nocapture`
- `cargo test -p codex-core tools::spec_plan::tests:: -- --nocapture`
- `cargo test -p codex-tools`
- `just fix -p codex-core`
- `just fix -p codex-tools`
## Why
Code mode already builds the merged nested `ToolSpec`s that feed the
`exec` prompt. Keeping a separate `tool_namespaces` map in the planning
path duplicated that metadata and left extra wrapper plumbing in
`spec.rs`.
## What changed
- derive code-mode namespace descriptions from the merged
`ToolSpec::Namespace` entries before building the code-mode handlers
- extract `build_code_mode_handlers(...)` so the code-mode-specific
planning stays in one place
- remove `tool_namespaces` from `ToolRegistryBuildParams`
- delete the now-unused `McpToolPlanInputs` wrapper and related test
helper plumbing
## Testing
- `cargo test -p codex-core spec_plan`
## Why
`CODEX_RS_SSE_FIXTURE` let integration-style CLI, exec, and TUI tests
bypass the normal Responses transport by reading SSE from local files.
That kept test-only behavior wired through production client code. The
affected tests can stay hermetic by using the existing
`core_test_support::responses` mock server and passing `openai_base_url`
instead.
## What Changed
- Removed the `CODEX_RS_SSE_FIXTURE` flag,
`codex_api::stream_from_fixture`, the `env-flags` dependency, and the
checked-in SSE fixture files.
- Repointed the affected core, exec, and TUI tests at `MockServer` with
the existing SSE event constructors.
- Removed the Bazel test data plumbing for the deleted fixtures and
refreshed cargo/Bazel lock state.
## Verification
- `cargo build -p codex-cli`
- `cargo test -p codex-api`
- `cargo test -p codex-core --test all responses_api_stream_cli`
- `cargo test -p codex-core --test all
integration_creates_and_checks_session_file`
- `cargo test -p codex-exec --test all ephemeral`
- `cargo test -p codex-exec --test all resume`
- `cargo test -p codex-tui --test all
resume_startup_does_not_consume_model_availability_nux_count`
- `just bazel-lock-update`
- `just bazel-lock-check`
- `just fix -p codex-api -p codex-core -p codex-exec -p codex-tui`
- `git diff --check`
## Why
Enterprise-managed hook policy needs a narrow way to require Codex to
ignore user-controlled lifecycle hooks without adopting the broader
trust-precedence model from earlier hook work. This keeps the policy
anchored in `requirements.toml`, so admins can opt into managed hooks
only while normal `config.toml` files cannot enable the restriction
themselves.
## What changed
- Added `allow_managed_hooks_only` to the requirements data flow and
preserved explicit `false` values.
- Also adds it to /debug-config
- Marked MDM, system, and legacy managed config layers as managed for
hook discovery.
- Updated hook discovery so `allow_managed_hooks_only = true`:
- keeps managed requirements hooks and managed config-layer hooks,
- skips user/project/session `hooks.json` and `[hooks]` entries with
concise startup warnings,
- skips current unmanaged plugin hooks,
- ignores any `allow_managed_hooks_only` key placed in ordinary
`config.toml` layers.
## Why
hook semantics treat `session_id` as shared across a root session and
its subagents. Codex hooks were still emitting the current thread ID,
which made spawned agents look like independent sessions and made it
harder for hook integrations to correlate work across a root thread and
its spawned helpers
This change makes hooks use Codex's existing shared session identity so
hook `session_id` matches the root-thread session across spawned
subagents.
## What Changed
- switch hook payloads to use the existing shared session identity from
core instead of the current thread ID
- cover all hook surfaces that expose `session_id`, including
`SessionStart`, tool hooks, compact hooks, prompt-submit hooks, stop
hooks, and legacy after-agent dispatch
## Why
Guardian review selection was hard-coded in `core`, which worked for the
default OpenAI path but did not give provider implementations a way to
choose backend-specific reviewer model IDs. That matters for Amazon
Bedrock: guardian review should run through the Bedrock/Mantle provider
using Bedrock's `openai.gpt-5.4` model ID, instead of accidentally
selecting a reviewer model that implies the OpenAI backend.
## What Changed
- Added provider-owned approval review model selection via
`ModelProvider::approval_review_model_selection`.
- Moved the existing default selection policy into the provider
abstraction: prefer the requested reviewer model when it is available,
otherwise fall back to the active turn model, preferring `Low` reasoning
when supported.
- Added an Amazon Bedrock override that pins guardian review to
`openai.gpt-5.4` with `Low` reasoning.
## Why
PR #21843 removed the TCP websocket app-server listener, but that also
removed functionality that still needs to exist. Restoring it as-is
would reopen the old remote exposure problem, so this keeps the restored
listener while making remote and non-loopback usage require explicit
auth.
## What Changed
- Mostly reverts #21843 and reapplies the small merge-conflict
resolutions needed on top of current main.
- Restores ws://IP:PORT parsing, the app-server TCP websocket acceptor,
websocket auth CLI flags, and the associated tests.
- The only intentional behavior change from the restored code is that
non-loopback websocket listeners now fail startup unless --ws-auth
capability-token or --ws-auth signed-bearer-token is configured.
Loopback listeners remain available for local and SSH-forwarding
workflows.
## Reviewer Focus
Please focus review on the small auth-enforcement delta layered on top
of the revert:
- codex-rs/app-server-transport/src/transport/websocket.rs:
start_websocket_acceptor now rejects unauthenticated non-loopback
websocket binds before accepting connections.
- codex-rs/app-server-transport/src/transport/auth.rs: helper logic
classifies unauthenticated non-loopback listeners.
- codex-rs/app-server/tests/suite/v2/connection_handling_websocket.rs:
tests cover unauthenticated ws://0.0.0.0 startup rejection and
authenticated non-loopback capability-token startup.
Everything else is intended to be revert/merge-conflict restoration
rather than new product behavior.
## Verification
- Manually verified that TUI remoting is restored and that auth is
enforced for non-localhost urls.
- Adds localVersion to plugin summaries and remoteVersion to share
context, including generated API schemas.
- Hydrates local and remote plugin versions from manifests and remote
release metadata.
- Adds default-on plugin_sharing gate for shared-with-me listing and
plugin/share/save, with disabled-path errors
and focused coverage.
## Summary
Plugin Creator now documents the shorter local-plugin handoff URL that
the app can interpret directly.
[#22221](https://github.com/openai/codex/pull/22221) teaches the skill
to end marketplace-backed creation flows with named View and Share
links; this follow-up updates those examples so the skill only emits the
normalized plugin name, the absolute marketplace path, and optional
share mode.
The documented shape is:
```txt
codex://plugins/<normalized-plugin-name>?marketplacePath=<absolute-marketplace-json-path>
codex://plugins/<normalized-plugin-name>?marketplacePath=<absolute-marketplace-json-path>&mode=share
```
The skill text now states exactly where the normalized plugin name
belongs, exactly where the absolute marketplace path belongs, and that
it should not add `pluginName` or `hostId` query parameters.
## Testing
Tests: plugin-creator skill validation.