Addresses #16622
Problem: bare local file links in TUI markdown render percent-encoded
path bytes literally, unlike file:// links.
Solution: decode bare path targets before local-path expansion and add
regression coverage for spaces and Unicode.
Problem: The resume picker used awkward "Created at" and "Updated at"
headers, and its relative timestamps changed while navigating because
they were recomputed on each redraw.
Solution: Rename the headers to "Created" and "Updated", and anchor
relative timestamp formatting to the picker load time so the displayed
ages stay stable while browsing.
Addresses #16781
Problem: `codex exec --ephemeral` backfilled empty `turn/completed`
items with `thread/read(includeTurns=true)`, which app-server rejects
for ephemeral threads.
This is a regression introduced in the recent conversion of "exec" to
use app server rather than call the core directly.
Solution: Skip turn-item backfill for ephemeral exec threads while
preserving the existing recovery path for non-ephemeral sessions.
Addresses #16832
Problem: After `/fast on`, the TUI omitted an explicit service-tier
clear on later turns, so `/fast off` left app-server sessions stuck on
`priority` until restart.
Solution: Always submit the current service tier with user turns,
including an explicit clear when Fast mode is off, and add a regression
test for the `/fast on` -> `/fast off` flow.
Addresses #16584
Problem: TUI word-wise cursor movement treated entire CJK runs as a
single word, so Option/Alt+Left and Right skipped too far when editing
East Asian text.
Solution: Use Unicode word-boundary segments within each non-whitespace
run so CJK text advances one segment at a time while preserving
separator and delete-word behavior, and add regression coverage for CJK
and mixed-script navigation.
Testing: Manually tested solution by pasting text that includes CJK
characters into the composer and confirmed that keyboard navigation
worked correctly (after confirming it didn't prior to the change).
This adds end-to-end coverage for `responses-api-proxy` request dumps
when Codex spawns a subagent and validates that the `x-codex-window-id`
and `x-openai-subagent` are properly set.
Addresses #13614
Problem: `codex exec --help` implied that `--full-auto` also changed
exec approval mode, even though non-interactive exec stays headless and
does not support interactive approval prompts.
Solution: clarify the `--full-auto` help text so it only describes the
sandbox behavior it actually enables for `codex exec`.
Addresses #15535
Problem: `codex exec --help` advertised a second positional `[COMMAND]`
even though `exec` only accepts a prompt or a subcommand.
Solution: Override the `exec` usage string so the help output shows the
two supported invocation forms instead of the phantom positional.
Problem: `rejects_escalated_permissions_when_policy_not_on_request`
retried a real shell command after asserting the escalation rejection,
so Windows CI could fail on command startup timing instead of approval
behavior.
Solution: Keep the rejection assertion, verify no turn permissions were
granted, and assert through exec-policy evaluation that the same command
would be allowed without escalation instead of timing a subprocess.
This test was flaking on Windows.
Problem: The Windows CI test for turn metadata compared git remote URLs
byte-for-byte even though equivalent remotes can be formatted
differently across Git code paths.
Solution: Normalize the expected and actual origin URLs in the test by
trimming whitespace, removing a trailing slash, and stripping a trailing
.git suffix before comparing.
## Why
`provider_auth_command_supplies_bearer_token` and
`provider_auth_command_refreshes_after_401` were still flaky under
Windows Bazel because the generated fixture used `powershell.exe`, whose
startup can be slow enough to trip the provider-auth timeout in CI.
## What
Replace the generated Windows auth fixture script in
`codex-rs/core/tests/suite/client.rs` with a small `.cmd` script
executed by `cmd.exe /D /Q /C`, and advance `tokens.txt` one line at a
time so the refresh-after-401 test still gets the second token on the
second invocation.
Also align the fixture timeout with the provider-auth default (`5_000`
ms) to avoid introducing a test-only timing budget that is stricter than
production behavior.
## Testing
Left to CI, specifically the Windows Bazel
`//codex-rs/core:core-all-test` coverage for the two provider-auth
command tests.
This makes Responses API proxy request/response dumping first-class by
adding an optional `--dump-dir` flag that emits paired JSON files with
shared sequence/timestamp prefixes, captures full request and response
headers and records parsed JSON bodies.
Send pending mailbox mail after completed reasoning or commentary items
so follow-up requests can pick it up mid-turn.
---------
Co-authored-by: Codex <noreply@openai.com>
This adds an `include_environment_context` config/profile flag that
defaults on, and guards both initial injection and later environment
updates to allow skipping injection of `<environment_context>`.
This PR adds root and profile config switches to omit the generated
`<permissions instructions>` and `<apps_instructions>` prompt blocks
while keeping both enabled by default, and it gates both the initial
developer-context injection and later permissions diff injection so
turning the permissions block off stays effective across turn-context
overrides.
Also added a prompt debug tool that can be used as `codex debug
prompt-input "hello"` and dumps the constructed items list.
# Why this PR exists
This PR is trying to fix a coverage gap in the Windows Bazel Rust test
lane.
Before this change, the Windows `bazel test //...` job was nominally
part of PR CI, but a non-trivial set of `//codex-rs/...` Rust test
targets did not actually contribute test signal on Windows. In
particular, targets such as `//codex-rs/core:core-unit-tests`,
`//codex-rs/core:core-all-test`, and `//codex-rs/login:login-unit-tests`
were incompatible during Bazel analysis on the Windows gnullvm platform,
so they never reached test execution there. That is why the
Cargo-powered Windows CI job could surface Windows-only failures that
the Bazel-powered job did not report: Cargo was executing those tests,
while Bazel was silently dropping them from the runnable target set.
The main goal of this PR is to make the Windows Bazel test lane execute
those Rust test targets instead of skipping them during analysis, while
still preserving `windows-gnullvm` as the target configuration for the
code under test. In other words: use an MSVC host/exec toolchain where
Bazel helper binaries and build scripts need it, but continue compiling
the actual crate targets with the Windows gnullvm cfgs that our current
Bazel matrix is supposed to exercise.
# Important scope note
This branch intentionally removes the non-resource-loading `.rs` test
and production-code changes from the earlier
`codex/windows-bazel-rust-test-coverage` branch. The only Rust source
changes kept here are runfiles/resource-loading fixes in TUI tests:
- `codex-rs/tui/src/chatwidget/tests.rs`
- `codex-rs/tui/tests/manager_dependency_regression.rs`
That is deliberate. Since the corresponding tests already pass under
Cargo, this PR is meant to test whether Bazel infrastructure/toolchain
fixes alone are enough to get a healthy Windows Bazel test signal,
without changing test behavior for Windows timing, shell output, or
SQLite file-locking.
# How this PR changes the Windows Bazel setup
## 1. Split Windows host/exec and target concerns in the Bazel test lane
The core change is that the Windows Bazel test job now opts into an MSVC
host platform for Bazel execution-time tools, but only for `bazel test`,
not for the Bazel clippy build.
Files:
- `.github/workflows/bazel.yml`
- `.github/scripts/run-bazel-ci.sh`
- `MODULE.bazel`
What changed:
- `run-bazel-ci.sh` now accepts `--windows-msvc-host-platform`.
- When that flag is present on Windows, the wrapper appends
`--host_platform=//:local_windows_msvc` unless the caller already
provided an explicit `--host_platform`.
- `bazel.yml` passes that wrapper flag only for the Windows `bazel test
//...` job.
- The Bazel clippy job intentionally does **not** pass that flag, so
clippy stays on the default Windows gnullvm host/exec path and continues
linting against the target cfgs we care about.
- `run-bazel-ci.sh` also now forwards `CODEX_JS_REPL_NODE_PATH` on
Windows and normalizes the `node` executable path with `cygpath -w`, so
tests that need Node resolve the runner's Node installation correctly
under the Windows Bazel test environment.
Why this helps:
- The original incompatibility chain was mostly on the **exec/tool**
side of the graph, not in the Rust test code itself. Moving host tools
to MSVC lets Bazel resolve helper binaries and generators that were not
viable on the gnullvm exec platform.
- Keeping the target platform on gnullvm preserves cfg coverage for the
crates under test, which is important because some Windows behavior
differs between `msvc` and `gnullvm`.
## 2. Teach the repo's Bazel Rust macro about Windows link flags and
integration-test knobs
Files:
- `defs.bzl`
- `codex-rs/core/BUILD.bazel`
- `codex-rs/otel/BUILD.bazel`
- `codex-rs/tui/BUILD.bazel`
What changed:
- Replaced the old gnullvm-only linker flag block with
`WINDOWS_RUSTC_LINK_FLAGS`, which now handles both Windows ABIs:
- gnullvm gets `-C link-arg=-Wl,--stack,8388608`
- MSVC gets `-C link-arg=/STACK:8388608`, `-C
link-arg=/NODEFAULTLIB:libucrt.lib`, and `-C link-arg=ucrt.lib`
- Threaded those Windows link flags into generated `rust_binary`,
unit-test binaries, and integration-test binaries.
- Extended `codex_rust_crate(...)` with:
- `integration_test_args`
- `integration_test_timeout`
- Used those new knobs to:
- mark `//codex-rs/core:core-all-test` as a long-running integration
test
- serialize `//codex-rs/otel:otel-all-test` with `--test-threads=1`
- Added `src/**/*.rs` to `codex-rs/tui` test runfiles, because one
regression test scans source files at runtime and Bazel does not expose
source-tree directories unless they are declared as data.
Why this helps:
- Once host-side MSVC tools are available, we still need the generated
Rust test binaries to link correctly on Windows. The MSVC-side
stack/UCRT flags make those binaries behave more like their Cargo-built
equivalents.
- The integration-test macro knobs avoid hardcoding one-off test
behavior in ad hoc BUILD rules and make the generated test targets more
expressive where Bazel and Cargo have different runtime defaults.
## 3. Patch `rules_rs` / `rules_rust` so Windows MSVC exec-side Rust and
build scripts are actually usable
Files:
- `MODULE.bazel`
- `patches/rules_rs_windows_exec_linker.patch`
- `patches/rules_rust_windows_bootstrap_process_wrapper_linker.patch`
- `patches/rules_rust_windows_build_script_runner_paths.patch`
- `patches/rules_rust_windows_exec_msvc_build_script_env.patch`
- `patches/rules_rust_windows_msvc_direct_link_args.patch`
- `patches/rules_rust_windows_process_wrapper_skip_temp_outputs.patch`
- `patches/BUILD.bazel`
What these patches do:
- `rules_rs_windows_exec_linker.patch`
- Adds a `rust-lld` filegroup for Windows Rust toolchain repos,
symlinked to `lld-link.exe` from `PATH`.
- Marks Windows toolchains as using a direct linker driver.
- Supplies Windows stdlib link flags for both gnullvm and MSVC.
- `rules_rust_windows_bootstrap_process_wrapper_linker.patch`
- For Windows MSVC Rust targets, prefers the Rust toolchain linker over
an inherited C++ linker path like `clang++`.
- This specifically avoids the broken mixed-mode command line where
rustc emits MSVC-style `/NOLOGO` / `/LIBPATH:` / `/OUT:` arguments but
Bazel still invokes `clang++.exe`.
- `rules_rust_windows_build_script_runner_paths.patch`
- Normalizes forward-slash execroot-relative paths into Windows path
separators before joining them on Windows.
- Uses short Windows paths for `RUSTC`, `OUT_DIR`, and the build-script
working directory to avoid path-length and quoting issues in third-party
build scripts.
- Exposes `RULES_RUST_BAZEL_BUILD_SCRIPT_RUNNER=1` to build scripts so
crate-local patches can detect "this is running under Bazel's
build-script runner".
- Fixes the Windows runfiles cleanup filter so generated files with
retained suffixes are actually retained.
- `rules_rust_windows_exec_msvc_build_script_env.patch`
- For exec-side Windows MSVC build scripts, stops force-injecting
Bazel's `CC`, `CXX`, `LD`, `CFLAGS`, and `CXXFLAGS` when that would send
GNU-flavored tool paths/flags into MSVC-oriented Cargo build scripts.
- Rewrites or strips GNU-only `--sysroot`, MinGW include/library paths,
stack-protector, and `_FORTIFY_SOURCE` flags on the MSVC exec path.
- The practical effect is that build scripts can fall back to the Visual
Studio toolchain environment already exported by CI instead of crashing
inside Bazel's hermetic `clang.exe` setup.
- `rules_rust_windows_msvc_direct_link_args.patch`
- When using a direct linker on Windows, stops forwarding GNU driver
flags such as `-L...` and `--sysroot=...` that `lld-link.exe` does not
understand.
- Passes non-`.lib` native artifacts as explicit `-Clink-arg=<path>`
entries when needed.
- Filters C++ runtime libraries to `.lib` artifacts on the Windows
direct-driver path.
- `rules_rust_windows_process_wrapper_skip_temp_outputs.patch`
- Excludes transient `*.tmp*` and `*.rcgu.o` files from process-wrapper
dependency search-path consolidation, so unstable compiler outputs do
not get treated as real link search-path inputs.
Why this helps:
- The host-platform split alone was not enough. Once Bazel started
analyzing/running previously incompatible Rust tests on Windows, the
next failures were in toolchain plumbing:
- MSVC-targeted Rust tests were being linked through `clang++` with
MSVC-style arguments.
- Cargo build scripts running under Bazel's Windows MSVC exec platform
were handed Unix/GNU-flavored path and flag shapes.
- Some generated paths were too long or had path-separator forms that
third-party Windows build scripts did not tolerate.
- These patches make that mixed Bazel/Cargo/Rust/MSVC path workable
enough for the test lane to actually build and run the affected crates.
## 4. Patch third-party crate build scripts that were not robust under
Bazel's Windows MSVC build-script path
Files:
- `MODULE.bazel`
- `patches/aws-lc-sys_windows_msvc_prebuilt_nasm.patch`
- `patches/ring_windows_msvc_include_dirs.patch`
- `patches/zstd-sys_windows_msvc_include_dirs.patch`
What changed:
- `aws-lc-sys`
- Detects Bazel's Windows MSVC build-script runner via
`RULES_RUST_BAZEL_BUILD_SCRIPT_RUNNER` or a `bazel-out` manifest-dir
path.
- Uses `clang-cl` for Bazel Windows MSVC builds when no explicit
`CC`/`CXX` is set.
- Allows prebuilt NASM on the Bazel Windows MSVC path even when `nasm`
is not available directly in the runner environment.
- Avoids canonicalizing `CARGO_MANIFEST_DIR` in the Bazel Windows MSVC
case, because that path may point into Bazel output/runfiles state where
preserving the given path is more reliable than forcing a local
filesystem canonicalization.
- `ring`
- Under the Bazel Windows MSVC build-script runner, copies the
pregenerated source tree into `OUT_DIR` and uses that as the
generated-source root.
- Adds include paths needed by MSVC compilation for
Fiat/curve25519/P-256 generated headers.
- Rewrites a few relative includes in C sources so the added include
directories are sufficient.
- `zstd-sys`
- Adds MSVC-only include directories for `compress`, `decompress`, and
feature-gated dictionary/legacy/seekable sources.
- Skips `-fvisibility=hidden` on MSVC targets, where that
GCC/Clang-style flag is not the right mechanism.
Why this helps:
- After the `rules_rust` plumbing started running build scripts on the
Windows MSVC exec path, some third-party crates still failed for
crate-local reasons: wrong compiler choice, missing include directories,
build-script assumptions about manifest paths, or Unix-only C compiler
flags.
- These crate patches address those crate-local assumptions so the
larger toolchain change can actually reach first-party Rust test
execution.
## 5. Keep the only `.rs` test changes to Bazel/Cargo runfiles parity
Files:
- `codex-rs/tui/src/chatwidget/tests.rs`
- `codex-rs/tui/tests/manager_dependency_regression.rs`
What changed:
- Instead of asking `find_resource!` for a directory runfile like
`src/chatwidget/snapshots` or `src`, these tests now resolve one known
file runfile first and then walk to its parent directory.
Why this helps:
- Bazel runfiles are more reliable for explicitly declared files than
for source-tree directories that happen to exist in a Cargo checkout.
- This keeps the tests working under both Cargo and Bazel without
changing their actual assertions.
# What we tried before landing on this shape, and why those attempts did
not work
## Attempt 1: Force `--host_platform=//:local_windows_msvc` for all
Windows Bazel jobs
This did make the previously incompatible test targets show up during
analysis, but it also pushed the Bazel clippy job and some unrelated
build actions onto the MSVC exec path.
Why that was bad:
- Windows clippy started running third-party Cargo build scripts with
Bazel's MSVC exec settings and crashed in crates such as `tree-sitter`
and `libsqlite3-sys`.
- That was a regression in a job that was previously giving useful
gnullvm-targeted lint signal.
What this PR does instead:
- The wrapper flag is opt-in, and `bazel.yml` uses it only for the
Windows `bazel test` lane.
- The clippy lane stays on the default Windows gnullvm host/exec
configuration.
## Attempt 2: Broaden the `rules_rust` linker override to all Windows
Rust actions
This fixed the MSVC test-lane failure where normal `rust_test` targets
were linked through `clang++` with MSVC-style arguments, but it broke
the default gnullvm path.
Why that was bad:
-
`@@rules_rs++rules_rust+rules_rust//util/process_wrapper:process_wrapper`
on the gnullvm exec platform started linking with `lld-link.exe` and
then failed to resolve MinGW-style libraries such as `-lkernel32`,
`-luser32`, and `-lmingw32`.
What this PR does instead:
- The linker override is restricted to Windows MSVC targets only.
- The gnullvm path keeps its original linker behavior, while MSVC uses
the direct Windows linker.
## Attempt 3: Keep everything on pure Windows gnullvm and patch the V8 /
Python incompatibility chain instead
This would have preserved a single Windows ABI everywhere, but it is a
much larger project than this PR.
Why that was not the practical first step:
- The original incompatibility chain ran through exec-side generators
and helper tools, not only through crate code.
- `third_party/v8` is already special-cased on Windows gnullvm because
`rusty_v8` only publishes Windows prebuilts under MSVC names.
- Fixing that path likely means deeper changes in
V8/rules_python/rules_rust toolchain resolution and generator execution,
not just one local CI flag.
What this PR does instead:
- Keep gnullvm for the target cfgs we want to exercise.
- Move only the Windows test lane's host/exec platform to MSVC, then
patch the build-script/linker boundary enough for that split
configuration to work.
## Attempt 4: Validate compatibility with `bazel test --nobuild ...`
This turned out to be a misleading local validation command.
Why:
- `bazel test --nobuild ...` can successfully analyze targets and then
still exit 1 with "Couldn't start the build. Unable to run tests"
because there are no runnable test actions after `--nobuild`.
Better local check:
```powershell
bazel build --nobuild --keep_going --host_platform=//:local_windows_msvc //codex-rs/login:login-unit-tests //codex-rs/core:core-unit-tests //codex-rs/core:core-all-test
```
# Which patches probably deserve upstream follow-up
My rough take is that the `rules_rs` / `rules_rust` patches are the
highest-value upstream candidates, because they are fixing generic
Windows host/exec + MSVC direct-linker behavior rather than
Codex-specific test logic.
Strong upstream candidates:
- `patches/rules_rs_windows_exec_linker.patch`
- `patches/rules_rust_windows_bootstrap_process_wrapper_linker.patch`
- `patches/rules_rust_windows_build_script_runner_paths.patch`
- `patches/rules_rust_windows_exec_msvc_build_script_env.patch`
- `patches/rules_rust_windows_msvc_direct_link_args.patch`
- `patches/rules_rust_windows_process_wrapper_skip_temp_outputs.patch`
Why these seem upstreamable:
- They address general-purpose problems in the Windows MSVC exec path:
- missing direct-linker exposure for Rust toolchains
- wrong linker selection when rustc emits MSVC-style args
- Windows path normalization/short-path issues in the build-script
runner
- forwarding GNU-flavored CC/link flags into MSVC Cargo build scripts
- unstable temp outputs polluting process-wrapper search-path state
Potentially upstreamable crate patches, but likely with more care:
- `patches/zstd-sys_windows_msvc_include_dirs.patch`
- `patches/ring_windows_msvc_include_dirs.patch`
- `patches/aws-lc-sys_windows_msvc_prebuilt_nasm.patch`
Notes on those:
- The `zstd-sys` and `ring` include-path fixes look fairly generic for
MSVC/Bazel build-script environments and may be straightforward to
propose upstream after we confirm CI stability.
- The `aws-lc-sys` patch is useful, but it includes a Bazel-specific
environment probe and CI-specific compiler fallback behavior. That
probably needs a cleaner upstream-facing shape before sending it out, so
upstream maintainers are not forced to adopt Codex's exact CI
assumptions.
Probably not worth upstreaming as-is:
- The repo-local Starlark/test target changes in `defs.bzl`,
`codex-rs/*/BUILD.bazel`, and `.github/scripts/run-bazel-ci.sh` are
mostly Codex-specific policy and CI wiring, not generic rules changes.
# Validation notes for reviewers
On this branch, I ran the following local checks after dropping the
non-resource-loading Rust edits:
```powershell
cargo test -p codex-tui
just --shell 'C:\Program Files\Git\bin\bash.exe' --shell-arg -lc -- fix -p codex-tui
python .\tools\argument-comment-lint\run-prebuilt-linter.py -p codex-tui
just --shell 'C:\Program Files\Git\bin\bash.exe' --shell-arg -lc fmt
```
One local caveat:
- `just argument-comment-lint` still fails on this Windows machine for
an unrelated Bazel toolchain-resolution issue in
`//codex-rs/exec:exec-all-test`, so I used the direct prebuilt linter
for `codex-tui` as the local fallback.
# Expected reviewer takeaway
If this PR goes green, the important conclusion is that the Windows
Bazel test coverage gap was primarily a Bazel host/exec toolchain
problem, not a need to make the Rust tests themselves Windows-specific.
That would be a strong signal that the deleted non-resource-loading Rust
test edits from the earlier branch should stay out, and that future work
should focus on upstreaming the generic `rules_rs` / `rules_rust`
Windows fixes and reducing the crate-local patch surface.
The `OPENAI_BASE_URL` environment variable has been a significant
support issue, so we decided to deprecate it in favor of an
`openai_base_url` config key. We've had the deprecation warning in place
for about a month, so users have had time to migrate to the new
mechanism. This PR removes support for `OPENAI_BASE_URL` entirely.
## Why
Extracted from [#16528](https://github.com/openai/codex/pull/16528) so
the Windows Bazel app-server test failures can be reviewed independently
from the rest of that PR.
This PR targets:
-
`suite::v2::thread_shell_command::thread_shell_command_runs_as_standalone_turn_and_persists_history`
-
`suite::v2::thread_start::thread_start_with_elevated_sandbox_trusts_project_and_followup_loads_project_config`
-
`suite::v2::thread_start::thread_start_with_nested_git_cwd_trusts_repo_root`
There were two Windows-specific assumptions baked into those tests and
the underlying trust lookup:
- project trust keys were persisted and looked up using raw path
strings, but Bazel's Windows test environment can surface canonicalized
paths with `\\?\` / UNC prefixes or normalized symlink/junction targets,
so follow-up `thread/start` requests no longer matched the project entry
that had just been written
- `item/commandExecution/outputDelta` assertions compared exact trailing
line endings even though shell output chunk boundaries and CRLF handling
can differ on Windows, and Bazel made that timing-sensitive mismatch
visible
There was also one behavior bug separate from the assertion cleanup:
`thread/start` decided whether to persist trust from the final resolved
sandbox policy, but on Windows an explicit `workspace-write` request may
be downgraded to `read-only`. That incorrectly skipped writing trust
even though the request had asked to elevate the project, so the new
logic also keys off the requested sandbox mode.
## What
- Canonicalize project trust keys when persisting/loading `[projects]`
entries, while still accepting legacy raw keys for existing configs.
- Persist project trust when `thread/start` explicitly requests
`workspace-write` or `danger-full-access`, even if the resolved policy
is later downgraded on Windows.
- Make the Windows app-server tests compare persisted trust paths and
command output deltas in a path/newline-normalized way.
## Verification
- Existing app-server v2 tests cover the three failing Windows Bazel
cases above.
- Keep only parent system/developer/user messages plus assistant
final-answer messages in forked child history.
- Strip parent tool/reasoning items and remove the unmatched synthetic
spawn output.
## Summary
The skill list opened by '$' shows `interface.display_name` preferably
if available but the sorting order of the search results use the
`skill.name` for sorting the results regardless.
This can be clearly seen in this example below: I expected with "pr" as
the search term to have "PR Babysitter" be the first item, but instead
it's way down the list.
The reason is because "PR Babysitter" skill name is "babysit-pr" and
therefore it doesn't rank as high as "pr-review-triage".
This PR fixes this behavior.
| Before | After |
| --- | --- |
| <img width="659" height="376" alt="image"
src="https://github.com/user-attachments/assets/51a71491-62ec-4163-a6f3-943ddf55856d"
/> | <img width="618" height="429" alt="image"
src="https://github.com/user-attachments/assets/f5ec4f4a-c539-4a5d-bdc5-c3e3e630f530"
/> |
## Testing
- `just fmt`
- `cargo test -p codex-tui
bottom_pane::skill_popup::tests::display_name_match_sorting_beats_worse_secondary_search_term_matches
--lib -- --exact`
- `cargo test -p codex-tui`
## Why
We were seeing failures in the following tests as part of trying to get
all the tests running under Bazel on Windows in CI
(https://github.com/openai/codex/pull/16528):
```
suite::shell_command::unicode_output::with_login
suite::shell_command::unicode_output::without_login
```
Certainly `PATHEXT` should have been included in the extra `CORE_VARS`
list, so we fix that up here, but also take things a step further for
now by forcibly ensuring it is set on Windows in the return value of
`create_env()`. Once we get the Windows Bazel build working reliably
(i.e., after #16528 is merged), we should come back to this and confirm
we can remove the special case in `create_env()`.
## What
- Split core env inheritance into `COMMON_CORE_VARS` plus
platform-specific allowlists for Windows and Unix in
[`exec_env.rs`](1b55c88fbf/codex-rs/core/src/exec_env.rs (L45-L81)).
- Preserve `PATHEXT`, `USERNAME`, and `USERPROFILE` on Windows, and
`HOME` / locale vars on Unix.
- Backfill a default `PATHEXT` in `create_env()` on Windows if the
parent env does not provide one, so child process launch still works in
stripped-down Bazel environments.
- Extend the Windows exec-env test to assert mixed-case `PathExt`
survives case-insensitive core filtering, and document why the
shell-command Unicode test goes through a child process.
## Verification
- `cargo test -p codex-core exec_env::tests`
Addresses #16124
Problem: `codex --remote --cd <path>` canonicalized the path locally and
then omitted it from remote thread lifecycle requests, so remote-only
working directories failed or were ignored.
Solution: Keep remote startup on the local cwd, forward explicit `--cd`
values verbatim to `thread/start`, `thread/resume`, and `thread/fork`,
and cover the behavior with `codex-tui` tests.
Testing: I manually tested `--remote --cd` with both absolute and
relative paths and validated correct behavior.
---
Update based on code review feedback:
Problem: Remote `--cd` was forwarded to `thread/resume` and
`thread/fork`, but not to `thread/list` lookups, so `--resume --last`
and picker flows could select a session from the wrong cwd; relative cwd
filters also failed against stored absolute paths.
Solution: Apply explicit remote `--cd` to `thread/list` lookups for
`--last` and picker flows, normalize relative cwd filters on the
app-server before exact matching, and document/test the behavior.
Addresses #11555
Problem: macOS malloc stack-logging diagnostics could leak into the TUI
composer and get misclassified as pasted user input.
Solution: Strip `MallocStackLogging*` and `MallocLogFile*` during macOS
pre-main hardening and document the additional env cleanup.
Addresses #15640
Problem: `codex exec` panicked on macOS when sandboxed proxy discovery
hit a NULL `SCDynamicStore` handle in `system-configuration`.
Solution: Bump `hyper-util` and `system-configuration` to versions that
handle denied `configd` lookups safely, and refresh the Bazel lockfile.
Testing: Verified using the manual `printf '(version 1) (allow default)
(deny mach-lookup (global-name
"com.apple.SystemConfiguration.configd"))' > /tmp/deny-configd.sb
sandbox-exec -f /tmp/deny-configd.sb codex exec -s danger-full-access
"echo test"`. Prior to the fix, this caused a panic.
Addresses #15282
Problem: Codex warned about missing system bubblewrap even when
sandboxing was disabled.
Solution: Gate the bwrap warning on the active sandbox policy and skip
it for danger-full-access and external-sandbox modes.
Addresses #16671 and #14927
Problem: `mcpServerStatus/list` rebuilt MCP tool groups from sanitized
tool prefixes but looked them up by unsanitized server names, so
hyphenated servers rendered as having no tools in `/mcp`. This was
reported as a regression when the TUI switched to use the app server.
Solution: Build each server's tool map using the original server name's
sanitized prefix, include effective runtime MCP servers in the status
response, and add a regression test for hyphenated server names.
Addresses #16454
Problem: `/copy` could keep stale output after a turn with
commentary-only assistant text.
Solution: Cache the latest non-empty agent message during a turn and
promote it on turn completion.
Stacked on #16508.
This removes the temporary `codex-core` / `codex-login` re-export shims
from the ownership split and rewrites callsites to import directly from
`codex-model-provider-info`, `codex-models-manager`, `codex-api`,
`codex-protocol`, `codex-feedback`, and `codex-response-debug-context`.
No behavior change intended; this is the mechanical import cleanup layer
split out from the ownership move.
---------
Co-authored-by: Codex <noreply@openai.com>
## Why
This is a follow-up to #16665. The Windows `unicode_output` test should
still exercise a child process so it verifies PowerShell's UTF-8 output
configuration, but `$env:COMSPEC` depends on that environment variable
surviving the curated Bazel test environment.
Using `cmd.exe` keeps the child-process coverage while avoiding both
bare `cmd` + `PATHEXT` lookup and `$env:COMSPEC` env passthrough
assumptions.
## What
- Run `cmd.exe /c echo naïve_café` in the Windows branch of
`unicode_output`.
## Verification
- `cargo test -p codex-core unicode_output`
## Why
Windows Bazel shell tests launch PowerShell with a curated environment,
so `PATHEXT` may be absent. The existing `unicode_output` test invokes
bare `cmd`, which can fail before the test exercises UTF-8 child-process
output.
## What
- Use `$env:COMSPEC /c echo naïve_café` in the Windows branch of
`unicode_output`.
- Preserve the external child-process path instead of switching the test
to a PowerShell builtin.
## Verification
- `cargo test -p codex-core unicode_output`
https://github.com/openai/codex/pull/16460 was a large PR created by
Codex to try to get the tests to pass under Bazel on Windows. Indeed, it
successfully ran all of the tests under `//codex-rs/core:` with its
changes to `codex-rs/core/`, though the full set of changes seems to be
too broad.
This PR tries to port the key changes, which are:
- Under Bazel, the `USERNAME` environment variable is not guaranteed to
be set on Windows, so for tests that need a non-empty env var as a
convenient substitute for an env var containing an API key, just use
`PATH`. Note that `PATH` is unlikely to contain characters that are not
allowed in an HTTP header value.
- Specify `"powershell.exe"` instead of just `"powershell"` in case the
`PATHEXT` env var gets lost in the shuffle.
## Summary
- split `models-manager` out of `core` and add `ModelsManagerConfig`
plus `Config::to_models_manager_config()` so model metadata paths stop
depending on `core::Config`
- move login-owned/auth-owned code out of `core` into `codex-login`,
move model provider config into `codex-model-provider-info`, move API
bridge mapping into `codex-api`, move protocol-owned types/impls into
`codex-protocol`, and move response debug helpers into a dedicated
`response-debug-context` crate
- move feedback tag emission into `codex-feedback`, relocate tests to
the crates that now own the code, and keep broad temporary re-exports so
this PR avoids a giant import-only rewrite
## Major moves and decisions
- created `codex-models-manager` as the owner for model
cache/catalog/config/model info logic, including the new
`ModelsManagerConfig` struct
- created `codex-model-provider-info` as the owner for provider config
parsing/defaults and kept temporary `codex-login`/`codex-core`
re-exports for old import paths
- moved `api_bridge` error mapping + `CoreAuthProvider` into
`codex-api`, while `codex-login::api_bridge` temporarily re-exports
those symbols and keeps the `auth_provider_from_auth` wrapper
- moved `auth_env_telemetry` and `provider_auth` ownership to
`codex-login`
- moved `CodexErr` ownership to `codex-protocol::error`, plus
`StreamOutput`, `bytes_to_string_smart`, and network policy helpers to
protocol-owned modules
- created `codex-response-debug-context` for
`extract_response_debug_context`, `telemetry_transport_error_message`,
and related response-debug plumbing instead of leaving that behavior in
`core`
- moved `FeedbackRequestTags`, `emit_feedback_request_tags`, and
`emit_feedback_request_tags_with_auth_env` to `codex-feedback`
- deferred removal of temporary re-exports and the mechanical import
rewrites to a stacked follow-up PR so this PR stays reviewable
## Test moves
- moved auth refresh coverage from `core/tests/suite/auth_refresh.rs` to
`login/tests/suite/auth_refresh.rs`
- moved text encoding coverage from
`core/tests/suite/text_encoding_fix.rs` to
`protocol/src/exec_output_tests.rs`
- moved model info override coverage from
`core/tests/suite/model_info_overrides.rs` to
`models-manager/src/model_info_overrides_tests.rs`
---------
Co-authored-by: Codex <noreply@openai.com>
Addresses #16655
Problem: `codex login --api-key` failed in Clap before Codex could show
the deprecation guidance.
Solution: Allow the hidden `--api-key` flag to parse with zero or one
values so both forms reach the `--with-api-key` message.
## Why
The Windows `ProviderAuthScript` test helpers do not need PowerShell.
Running them through `cmd.exe` is enough to emit the next fixture token
and rotate `tokens.txt`, and it avoids a PowerShell-specific dependency
in these tests.
## What changed
- Replaced the Windows `print-token.ps1` fixtures with `print-token.cmd`
in `codex-rs/core/src/models_manager/manager_tests.rs` and
`codex-rs/login/src/auth/auth_tests.rs`.
- Switched the failing external-auth helper in
`codex-rs/login/src/auth/auth_tests.rs` from `powershell.exe -Command
'exit 1'` to `cmd.exe /d /s /c 'exit /b 1'`.
- Updated Windows timeout comments so they no longer call out PowerShell
specifically.
## Verification
- `cargo test -p codex-login`
- `cargo test -p codex-core` (fails in unrelated
`core/src/config/config_tests.rs` assertions in this checkout)
## Why
`thread/shellCommand` executes the raw command string through the
current user shell, which is PowerShell on Windows. The two v2
app-server tests in `app-server/tests/suite/v2/thread_shell_command.rs`
used POSIX `printf`, so Bazel CI on Windows failed with `printf` not
being recognized as a PowerShell command.
For reference, the user-shell task wraps commands with the active shell
before execution:
[`core/src/tasks/user_shell.rs`](7a3eec6fdb/codex-rs/core/src/tasks/user_shell.rs (L120-L126)).
## What Changed
Added a test-local helper that builds a shell-appropriate output command
and expected newline sequence from `default_user_shell()`:
- PowerShell: `Write-Output '...'` with `\r\n`
- Cmd: `echo ...` with `\r\n`
- POSIX shells: `printf '%s\n' ...` with `\n`
Both `thread_shell_command_runs_as_standalone_turn_and_persists_history`
and `thread_shell_command_uses_existing_active_turn` now use that
helper.
## Verification
- `cargo test -p codex-app-server thread_shell_command`
- Persist trusted cwd state during thread/start when the resolved
sandbox is elevated.
- Add app-server coverage for trusted root resolution and confirm
turn/start does not mutate trust.
## Why
This continues the compile-time cleanup from #16630. `SessionTask`
implementations are monomorphized, but `Session` stores the task behind
a `dyn` boundary so it can drive and abort heterogenous turn tasks
uniformly. That means we can move the `#[async_trait]` expansion off the
implementation trait, keep a small boxed adapter only at the storage
boundary, and preserve the existing task lifecycle semantics while
reducing the amount of generated async-trait glue in `codex-core`.
One measurement caveat showed up while exploring this: a warm
incremental benchmark based on `touch core/src/tasks/mod.rs && cargo
check -p codex-core --lib` was basically flat, but that was the wrong
benchmark for this change. Using package-clean `codex-core` rebuilds,
like #16630, shows the real win.
Relevant pre-change code:
- [`SessionTask` with
`#[async_trait]`](3c7f013f97/codex-rs/core/src/tasks/mod.rs (L129-L182))
- [`RunningTask` storing `Arc<dyn
SessionTask>`](3c7f013f97/codex-rs/core/src/state/turn.rs (L69-L77))
## What changed
- Switched `SessionTask::{run, abort}` to native RPITIT futures with
explicit `Send` bounds.
- Added a private `AnySessionTask` adapter that boxes those futures only
at the `Arc<dyn ...>` storage boundary.
- Updated `RunningTask` to store `Arc<dyn AnySessionTask>` and removed
`#[async_trait]` from the concrete task impls plus test-only
`SessionTask` impls.
## Timing
Benchmarked package-clean `codex-core` rebuilds with dependencies left
warm:
```shell
cargo check -p codex-core --lib >/dev/null
cargo clean -p codex-core >/dev/null
/usr/bin/time -p cargo +nightly rustc -p codex-core --lib -- \
-Z time-passes \
-Z time-passes-format=json >/dev/null
```
| revision | rustc `total` | process `real` | `generate_crate_metadata`
| `MIR_borrow_checking` | `monomorphization_collector_graph_walk` |
| --- | ---: | ---: | ---: | ---: | ---: |
| parent `3c7f013f9735` | 67.21s | 67.71s | 24.61s | 23.43s | 22.43s |
| this PR `2cafd783ac22` | 35.08s | 35.60s | 8.01s | 7.25s | 7.15s |
| delta | -47.8% | -47.4% | -67.5% | -69.1% | -68.1% |
For completeness, the warm touched-file benchmark stayed flat (`1.96s`
parent vs `1.97s` this PR), which is why that benchmark should not be
used to evaluate this refactor.
## Verification
- Ran `cargo test -p codex-core`; this change compiled and task-related
tests passed before hitting the same unrelated 5
`config::tests::*guardian*` failures already present on the parent
stack.
## TL;DR
Fixes the issues when using Codex CLI with Zellij multiplexer. Before
this PR there would be no scrollback when using it inside a zellij
terminal.
## Problem
Addresses #2558
Zellij does not support ANSI scroll-region manipulation (`DECSTBM` /
Reverse Index) or the alternate screen buffer in the way traditional
terminals do. When codex's TUI runs inside Zellij, two things break: (1)
inline history insertion corrupts the display because the scroll-region
escape sequences are silently dropped or mishandled, and (2) the
composer textarea renders with inherited background/foreground styles
that produce unreadable text against Zellij's pane chrome.
## Mental model
The fix introduces a **Zellij mode** — a runtime boolean detected once
at startup via `codex_terminal_detection::terminal_info().is_zellij()` —
that gates two subsystems onto Zellij-safe terminal strategies:
- **History insertion** (`insert_history.rs`): Instead of using
`DECSTBM` scroll regions and Reverse Index (`ESC M`) to slide content
above the viewport, Zellij mode scrolls the screen by emitting `\n` at
the bottom row and then writes history lines at absolute positions. This
avoids every escape sequence Zellij mishandles.
- **Viewport expansion** (`tui.rs`): When the viewport grows taller than
available space, the standard path uses `scroll_region_up` on the
backend. Zellij mode instead emits newlines at the screen bottom to push
content up, then invalidates the ratatui diff buffer so the next draw is
a full repaint.
- **Composer rendering** (`chat_composer.rs`, `textarea.rs`): All text
rendering in the input area uses an explicit `base_style` with
`Color::Reset` foreground, preventing Zellij's pane styling from
bleeding into the textarea. The prompt chevron (`›`) and placeholder
text use explicit color constants instead of relying on `.bold()` /
`.dim()` modifiers that render inconsistently under Zellij.
## Non-goals
- This change does not fix or improve Zellij's terminal emulation
itself.
- It does not rearchitect the inline viewport model; it adds a parallel
code path gated on detection.
- It does not touch the alternate-screen disable logic (that already
existed and continues to use `is_zellij` via the same detection).
## Tradeoffs
- **Code duplication in `insert_history.rs`**: The Zellij and Standard
branches share the line-rendering loop (color setup, span merging,
`write_spans`) but differ in the scrolling preamble. The duplication is
intentional — merging them would force a complex conditional state
machine that's harder to reason about than two flat sequences.
- **`invalidate_viewport` after every Zellij history flush or viewport
expansion**: This forces a full repaint on every draw cycle in Zellij,
which is more expensive than ratatui's normal diff-based rendering. This
is necessary because Zellij's lack of scroll-region support means the
diff buffer's assumptions about what's on screen are invalid after we
manually move content.
- **Explicit colors vs semantic modifiers**: Replacing `.bold()` /
`.dim()` with `Color::Cyan` / `Color::DarkGray` / `Color::White` in the
Zellij branch sacrifices theme-awareness for correctness. If the project
ever adopts a theming system, Zellij styling will need to participate.
## Architecture
The Zellij detection flag flows through three layers:
1. **`codex_terminal_detection`** — `TerminalInfo::is_zellij()` (new
convenience method) reads the already-detected `Multiplexer` variant.
2. **`Tui` struct** — caches `is_zellij` at construction; passes it into
`update_inline_viewport`, `flush_pending_history_lines`, and
`insert_history_lines_with_mode`.
3. **`ChatComposer` struct** — independently caches `is_zellij` at
construction; uses it in `render_textarea` for style decisions.
The two caches (`Tui.is_zellij` and `ChatComposer.is_zellij`) are read
from the same global `OnceLock<TerminalInfo>`, so they always agree.
## Observability
No new logging, metrics, or tracing is introduced. Diagnosis depends on:
- Whether `ZELLIJ` or `ZELLIJ_SESSION_NAME` env vars are set (the
detection heuristic).
- Visual inspection of the rendered TUI inside Zellij vs a standard
terminal.
- The insta snapshot `zellij_empty_composer` captures the Zellij-mode
render path.
## Tests
- `terminal_info_reports_is_zellij` — unit test in `terminal-detection`
confirming the convenience method.
- `zellij_empty_composer_snapshot` — insta snapshot in `chat_composer`
validating the Zellij render path for an empty composer.
- `vt100_zellij_mode_inserts_history_and_updates_viewport` — integration
test in `insert_history` verifying that Zellij-mode history insertion
writes content and shifts the viewport.
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses #16560
Problem: `/status` stopped showing the source thread id in forked TUI
sessions after the app-server migration.
Solution: Carry fork source ids through app-server v2 thread data and
the TUI session adapter, and update TUI fixtures so `/status` matches
the old TUI behavior.
Recently, I merged a number of PRs to increase startup timeouts for
scripts that ran under PowerShell, but in the failure for
`suite::codex_tool::test_shell_command_approval_triggers_elicitation`, I
found this in the error logs when running on Bazel with BuildBuddy:
```
[mcp stderr] 2026-04-02T19:54:10.758951Z ERROR codex_core::tools::router: error=Exit code: 1
[mcp stderr] Wall time: 0.2 seconds
[mcp stderr] Output:
[mcp stderr] 'New-Item' is not recognized as an internal or external command,
[mcp stderr] operable program or batch file.
[mcp stderr]
```
This error implies that the command was run under `cmd.exe` instead of
`pwsh.exe`. Under GitHub Actions, I suspect that the `%PATH%` that is
passed to our Bazel builder is scrubbed such that our tests cannot find
PowerShell where GitHub installs it. Having these explicit fallback
paths should help.
While we could enable these only for tests, I don't see any harm in
keeping them in production, as well.
Addresses #16389
Problem: `/review` follow-ups can crash when app-server TUI steers with
a stale active turn id; #14717 introduced the client-side race, and
#15714 only handled the “no active turn” half.
Solution: Treat turn-id mismatch as stale cached state too, sync to the
server’s current turn id, retry once, and let review turns fall into the
existing queue path.