## Summary
- split the single PR-blocking Bazel Windows test leg into four Windows
shard jobs
- preserve the existing required Windows Bazel check name with a
lightweight aggregate gate
- keep Linux/macOS Bazel test jobs and the separate Windows
clippy/release jobs unchanged
## Why
The ordinary PR Windows Bazel test leg was one GitHub Actions job, so
Bazel only had in-job parallelism. This gives that lane real job-level
fanout across separate Windows hosts while keeping the target set
disjoint via stable label hashing.
## Evidence
- final pre-rebase green run: `25774733562`
- Windows shard target counts: `61/212`, `48/212`, `52/212`, `51/212`
- Windows test fanout completed in about 7m29s versus a recent
monolithic median around 22m26s
## Notes
- this is scoped to the Bazel Windows test leg only
- each shard keeps the existing Windows cross-compile/RBE path and
restores the former monolithic Windows test cache
- shard jobs do not upload duplicate repository caches after test work,
keeping cache cleanup off the PR-blocking shard path
- no local validation run; relying on GitHub Actions for the
workflow-shaped check
Co-authored-by: Codex <noreply@openai.com>
## Why
Remote control starts by letting `codex-backend` initialize against the
app-server as an infrastructure health/proxy client before the real
remote client connects. App-server initialization also sets the
process-wide `originator` from `client_info.name`, so `codex-backend`
could become the sticky originator for later model/API requests even
after the real client initialized.
## What changed
- Treat `codex-backend` as a non-originating initialize client,
alongside the existing `codex_app_server_daemon` probe client.
- Preserve normal per-connection initialize behavior, including session
metadata and initialize analytics.
- Add regression coverage that verifies `codex-backend` initialize does
not replace the default originator.
## Testing
- `cargo test -p codex-app-server --test all
initialize_codex_backend_does_not_override_originator`
## Summary
It appears this config flag has been broken/a noop for quite some time:
since https://github.com/openai/codex/pull/8850. Let's simplify and get
rid of this.
## Testing
- [x] Updated unit tests
## Why
Elevated Windows sandbox setup currently assumes that the firewall rules
it writes will take effect. On managed Windows hosts, local firewall
policy changes can be ignored or only partially apply across the active
profiles, which means setup can appear to succeed without providing the
expected network isolation.
## What changed
- Query `INetFwPolicy2::LocalPolicyModifyState` before configuring the
elevated sandbox firewall rules.
- Fail setup when Windows reports that local firewall policy edits are
ineffective or only apply to some current profiles.
- Surface that condition with a dedicated
`helper_firewall_policy_ineffective` setup error code so support and
IT-facing diagnostics can distinguish it from COM access failures.
- Add focused coverage for effective policy, group-policy override, and
partial-profile coverage cases.
## Testing
- `cargo test -p codex-windows-sandbox --bin
codex-windows-sandbox-setup`
## Why
`chatwidget.rs` is still carrying too many unrelated responsibilities in
one file. #22269 started a five-phase cleanup to move coherent behavior
domains into focused modules while keeping `chatwidget.rs` as the
composition layer. #22407 completed phase 2 by extracting input and
submission flow, and #22433 completed phase 3 by extracting protocol,
replay, streaming, and tool lifecycle handling.
This PR is phase 4. It keeps moving high-churn UI coordination out of
the central widget by extracting settings, popups, and status surfaces
without changing the visible behavior those flows already provide. This
is once again a mechanical movement of existing functions. No functional
changes.
## What Changed
- Added focused modules for runtime settings/model coordination,
model/reasoning/collaboration popups,
settings/personality/theme/audio/experimental popups, permission
prompts, status setup/output controls, and Windows sandbox prompt flows.
- Moved the remaining rate-limit nudge/status helpers and connectors
popup/loading/update helpers into their existing focused modules.
- Preserved the existing picker flows, approval behavior, status/title
setup previews, rate-limit notices, and connectors/app list behavior
while shrinking `chatwidget.rs` back toward orchestration.
- Left `codex-rs/tui/src/chatwidget.rs` as the registration and
composition surface for these extracted behaviors.
## Cleanup Phases
The five-phase cleanup plan from #22269 is:
1. Phase 1: mechanical helper and state moves. Completed in #22269.
2. Phase 2: extract input and submission flow, including queued user
messages, shell prompt submission, pending steer restoration, and thread
input snapshot/restore behavior. Completed in #22407.
3. Phase 3: extract protocol, replay, streaming, and tool lifecycle
handling, while preserving active-cell grouping, transcript
invalidation, interrupt deferral, and final-message separator behavior.
Completed in #22433.
4. Phase 4: extract settings, popups, and status surfaces, including
model/reasoning/collaboration/personality popups, permission prompts,
rate-limit UI, and connectors helpers. This PR.
5. Phase 5: clean up the remaining constructor and orchestration code
once the larger behavior domains have moved out, leaving `chatwidget.rs`
as the composition layer.
## Verification
- `cargo check -p codex-tui`
- `cargo test -p codex-tui chatwidget::tests::permissions`
- `cargo test -p codex-tui chatwidget::tests::status_surface_previews`
- `cargo test -p codex-tui chatwidget::tests::popups_and_settings`
- `cargo test -p codex-tui chatwidget::tests::status_and_layout`
`cargo test -p codex-tui` also compiles and begins running, but aborts
in the unchanged app-side test
`app::tests::discard_side_thread_keeps_local_state_when_server_close_fails`
with a reproducible stack overflow.
## Why
`TurnContext::cwd` and `TurnContext::resolve_path` are being phased out
in favor of using the selected turn environment cwd directly.
Deprecating both APIs makes any new direct dependency visible while
preserving the existing migration path for current callers.
## What Changed
- Marked `TurnContext::cwd` and `TurnContext::resolve_path` as
deprecated with guidance to use the selected turn environment cwd
instead.
- Added exact `#[allow(deprecated)]` suppressions at each existing
direct usage site, including tests, rather than adding crate-wide
suppression.
- Kept the change behavior-preserving: current cwd reads, writes, and
path resolution continue to use the same values.
## Verification
- `just fmt`
- `cargo check -p codex-core`
- `cargo check -p codex-core --tests`
- `git diff --check`
# Description
We need to set the appropriate Product SKU for full functionality for
the apps endpoints for each type of client
# Testing
`./target/debug/codex --enable app`
<img width="1786" height="398" alt="CleanShot 2026-05-12 at 11 51 25@2x"
src="https://github.com/user-attachments/assets/2142f768-fc72-4fcb-8f39-9bd0d8569170"
/>
Regular slack flows seem to work, also curling these endpoints with the
correct SKU returns the right apps
## Why
`code_mode_only` filters code-mode nested tools out of the top-level
tool list. For multi-agent v2, we need a rollout shape where the
collaboration tools remain callable as normal model tools without also
being embedded into the code-mode `exec` tool declaration.
Related to this:
https://openai-corpws.slack.com/archives/C0AQLHB4U75/p1778660267922549
## What Changed
- Adds `features.multi_agent_v2.non_code_mode_only`, including config
resolution, profile override handling, and generated schema coverage.
- Introduces `ToolExposure::DirectModelOnly` so a tool can be included
in the initial model-visible list while staying out of the nested
code-mode tool surface.
- Applies that exposure to the multi-agent v2 tools when the new flag is
set: `spawn_agent`, `send_message`, `followup_task`, `wait_agent`,
`close_agent`, and `list_agents`.
- Updates code-mode-only filtering so direct-model-only tools remain
visible while ordinary nested code-mode tools are still hidden.
## Verification
- Added config parsing/profile tests for `non_code_mode_only`.
- Added tool spec coverage for the code-mode-only multi-agent v2
exposure behavior.
## Why
Recent session history showed no active use of the raw `shell`,
`local_shell`, or `container.exec` execution surfaces. Keeping those
handlers/specs wired into core leaves duplicate shell execution paths
alongside the supported `shell_command` and unified exec tools.
## What changed
- Removed the raw `shell` handler/spec and its `ShellToolCallParams`
protocol helper.
- Removed the legacy `local_shell` and `container.exec` handler/spec
plumbing while preserving persisted-history compatibility for old
response items.
- Normalized model/config `default` and `local` shell selections to
`shell_command`.
- Pruned tests that exercised removed raw-shell/local-shell/apply-patch
variants and kept coverage on `shell_command`, unified exec, and
freeform `apply_patch`.
## Verification
- `git diff --check`
- `cargo test -p codex-protocol`
- `cargo test -p codex-tools`
- `cargo test -p codex-core tools::handlers::shell`
- `cargo test -p codex-core tools::spec`
- `cargo test -p codex-core tools::router`
- `cargo test -p codex-core
active_call_preserves_triggering_command_context`
- `cargo test -p codex-core guardian_tests`
- `cargo test -p codex-core --test all shell_serialization`
- `cargo test -p codex-core --test all apply_patch_cli`
- `cargo test -p codex-core --test all shell_command_`
- `cargo test -p codex-core --test all local_shell`
- `cargo test -p codex-core --test all otel::`
- `cargo test -p codex-core --test all hooks::`
- `just fix -p codex-core`
- `just fix -p codex-tools`
## Why
Stop sending duplicate `session_id`/`thread_id` headers. We only want
the hyphenated forms as `_` is rejected by some proxies
Related discussion here:
https://openai.slack.com/archives/C095U48JNL9/p1778508316923179
## What
- Keep `session-id` and `thread-id`
- Remove the underscore aliases
## Why
Deferred tools were tracked with separate side-channel filtering after
tool specs had already been assembled. That made the registry
responsible for executing tools while the router/spec planner separately
decided whether those same tools should be exposed to the model up
front.
This PR makes exposure part of the tool handler contract so direct
versus deferred availability travels with the executable tool
registration.
Next step will be to simplify registration
## What Changed
- Adds `ToolExposure` to `codex-tools` and exposes it through
`ToolExecutor`, defaulting tools to `Direct`.
- Teaches dynamic tools and MCP handlers to mark deferred tools as
`Deferred` at construction time.
- Renames the registry object-safe wrapper from `AnyToolHandler` to
`RegisteredTool` and uses `ToolExposure` when deciding whether to
include a handler's spec in the initial model-visible tool list.
- Refactors tool spec planning to derive direct specs and deferred
search entries from registered handlers, removing the router's
special-case deferred dynamic tool filtering.
## Verification
- Not run.