## Summary
Follow-up to #24459 and partial behavioral revert of `a71fc47` / #16699.
- Stop removing `MallocStackLogging*` and `MallocLogFile*` from macOS
pre-main hardening.
- Remove documentation that claims Codex suppresses those allocator
diagnostic controls.
- Retain the shared `remove_env_vars_with_prefix` refactor and existing
`LD_` / `DYLD_` hardening.
## Why
#24459 fixes the composer-corruption problem at the terminal stderr
boundary while preserving redirected stderr. With that guard in place,
stripping macOS malloc diagnostic settings is unnecessary and can hide
diagnostics intentionally enabled by callers.
## Validation
- `just fmt`
- `just test -p codex-process-hardening`
- `just argument-comment-lint-from-source -p codex-process-hardening`
- `git diff --check`
## Why
Fixes#17139.
On macOS, runtime diagnostics such as `MallocStackLogging` messages can
be written directly to process stderr while the inline TUI owns the
terminal. Those bytes paint into the same viewport as the composer
without passing through the renderer or composer state, making
diagnostic output appear to leak into the input area.
## What Changed
- Add a macOS terminal stderr guard while the inline TUI owns the
viewport.
- Restore stderr when Codex returns terminal ownership for external
interactive programs, suspend/resume, panic handling, and normal
shutdown.
- Add an fd-level regression test that verifies output is suppressed
only while terminal ownership is held and restored at each handoff
boundary.
## How to Test
1. On macOS, launch the interactive TUI and leave the composer visible.
2. Exercise the workflow that triggers an allocator/runtime stderr
diagnostic during an active session, as reported in #17139.
3. Confirm the diagnostic no longer overwrites the active composer
region.
4. Suspend or exit the TUI and confirm subsequent terminal stderr output
remains visible.
The platform diagnostic is environment-dependent, so the deterministic
regression check is the new fd-lifecycle test in
`tui::terminal_stderr::tests::suppresses_stderr_only_while_terminal_is_owned`.
Targeted validation:
- `just argument-comment-lint-from-source -p codex-tui` passed.
- `just test -p codex-tui` exercised and passed the new stderr-guard
regression test. The full invocation currently fails in two unrelated
guardian-policy tests,
`update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default`
and
`update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`,
which reproduce when rerun in isolation.
## Why
Numbered Markdown findings become hard to scan when long items visually
run together or when wrapped explanatory paragraphs lose their list
indentation. This is especially visible in review output: the next
number can look attached to the previous finding, and paragraph
continuation rows can jump back toward the left margin instead of
staying grouped beneath their item.
<table><tr><td>
<center>Before</center>
<img width="1718" height="836" alt="CleanShot 2026-05-24 at 14 00 49"
src="https://github.com/user-attachments/assets/f1ee0023-50fa-4f81-a641-ae08b17b99bd"
/>
</td></tr>
<tr><td>
<center>After</center>
<img width="1714" height="906" alt="image"
src="https://github.com/user-attachments/assets/b123a5e0-a232-47bf-96d5-c935295f7c0a"
/>
</td></tr>
</table>
## What Changed
- Insert a blank separator before a sibling list item when the previous
item occupies more than one rendered line.
- Preserve compact rendering for lists whose sibling items each render
on one line.
- Preserve list-body leading whitespace when transient streamed
assistant rows require another wrapping pass for history display, so
wrapped paragraphs stay aligned beneath their item.
- Share the existing leading-whitespace prefix logic used by history
insertion instead of introducing a second indentation rule.
- Keep streamed Markdown output aligned with completed rendering and add
snapshots for findings-style spacing and streamed paragraph indentation.
## How to Test
1. Start Codex from this branch and open the recorded repro session
`019e563f-7d58-7ff2-8ec7-828f20fa61ca`.
2. Inspect the numbered `Findings` list whose items contain explanatory
paragraphs.
3. Confirm each multiline finding is separated from the next numbered
finding by one blank line.
4. Confirm wrapped rows of each indented paragraph remain aligned
beneath the finding body, rather than returning to the left edge.
5. Render a short one-line numbered or unordered list and confirm its
items remain compact without added blank rows.
Targeted tests:
- `just test -p codex-tui history_cell insert_history markdown_render
markdown_stream streaming::controller`
- `just argument-comment-lint-from-source -p codex-tui`
## Related Work
PR #24346 changes Markdown table column allocation in parallel. This PR
is intentionally limited to list-item readability and history wrapping;
both branches touch `codex-rs/tui/src/markdown_render.rs`, so a small
merge conflict may need resolution depending on merge order.
## Why
Markdown tables with a long path-heavy column could allocate almost all
available width to that column and collapse neighboring prose columns to
only a few characters. In rollout summaries this made `Unit` and `What
It Adds` difficult to read, even though the long `Files` values were the
content best suited to wrapping.
The affected example also specified `Files` as right aligned in its
markdown delimiter (`---:`). This change preserves that requested
alignment while improving how width is distributed.
| Before | After |
|---|---|
| <img width="1709" height="764" alt="image"
src="https://github.com/user-attachments/assets/932ab21c-b72d-48a2-9aad-b69da87a0968"
/> | <img width="1711" height="855" alt="image"
src="https://github.com/user-attachments/assets/4028bd20-2228-4c2f-be8a-1866325b7f62"
/> |
## What Changed
- Classify table columns as narrative, token-heavy, or compact during
width allocation.
- Shrink token-heavy path and URL columns before shrinking narrative
prose, while preserving compact counts and short labels longest.
- Use readable soft floors for narrative and token-heavy content before
falling back to tighter layouts.
- Add snapshot coverage for a rollout-shaped table containing
right-aligned file paths and prose columns.
## How to Test
1. Render a markdown table with `Unit`, right-aligned `Files`, `Adds`,
`Removes`, and `What It Adds` columns at a constrained terminal width.
2. Put long repository paths in `Files` and sentence-length content in
`Unit` and `What It Adds`.
3. Confirm that `Files` remains right aligned but wraps before the
narrative columns become unreadable.
4. Confirm that the compact numeric columns remain easy to scan.
Targeted tests:
- `just test -p codex-tui markdown_render`
Validation note: `just test -p codex-tui` was also attempted and reached
two existing unrelated failures in
`app::tests::update_feature_flags_disabling_guardian_*`; the markdown
rendering regression test passes in the targeted run.
## Why
Users have been reporting missing sessions in the app. The app server
thread listing is backed by the SQLite state DB, but the durable source
of truth for a thread still exists on disk as rollout JSONL. When the
state DB is incomplete, doctor should be able to show the mismatch
directly instead of leaving users with a generic state health result.
## What changed
This adds a `threads` doctor check that compares active and archived
rollout files under `CODEX_HOME` with rows in the SQLite `threads`
table. The check reports missing rollout rows, stale DB rows, archive
flag mismatches, duplicate rollout thread IDs, duplicate DB paths,
source/provider summaries, and bounded samples of affected rollout
paths.
It also adds a read-only state audit helper in `codex-rs/state` so
doctor can inspect thread rows without creating, migrating, or repairing
the database.
## Sample output
```text
⚠ threads rollout files are missing from the state DB
default model provider openai
rollout DB active files 3910
rollout DB archived files 2037
rollout DB scan errors 0
rollout DB malformed file names 0
rollout DB scan cap reached false
rollout DB rows 5499
rollout DB active rows 3462
rollout DB archived rows 2037
rollout DB missing active rows 448
rollout DB missing archived rows 0
rollout DB stale rows 0
rollout DB archive mismatches 0
rollout DB duplicate rollout thread ids 0
rollout DB duplicate DB paths 0
rollout DB model providers openai=5359, lmstudio=35, mock_provider=33, lite_llm=26, proxy=26, ollama=15, lms=4, local-usage-limit=1
rollout DB sources vscode=2587, cli=1494, subagent:thread_spawn=577, subagent:other=502, exec=281, subagent:memory_consolidation=46, subagent:review=9, unknown=3
rollout DB missing active sample ~/.codex/sessions/2026/0…857e-a923c712e066.jsonl
rollout DB missing active sample ~/.codex/sessions/2025/0…877a-766dff25c68d.jsonl
rollout DB missing active sample ~/.codex/sessions/2025/0…a8b1-7bbadc836f6e.jsonl
rollout DB missing active sample ~/.codex/sessions/2025/0…a218-e6197f3f62f8.jsonl
rollout DB missing active sample ~/.codex/sessions/2025/0…9011-7e30784f9932.jsonl
```
## Summary
The TUI `/mcp` inventory flow should reflect the app server’s MCP status
response. It was also joining those results with the TUI process’s local
`config.mcp_servers`, which can diverge once MCP state is owned by a
remote app server and cause stale local command, URL, status, or
empty-state details to render.
This change removes the local config join from the app-server-backed
inventory renderer. The TUI now renders directly from the existing
`mcpServerStatus/list` payload and treats an empty status response as
the empty MCP inventory state.
## Known limitation
The existing `mcpServerStatus/list` payload does not include
disabled-state or disabled-reason fields. To preserve the current
app-server API, this PR does not try to infer that state from
client-local config. If remote `/mcp` needs to show disabled/reason
details again, that should come from app-server-owned status data in a
follow-up.
Related to #22914, #22915, and #22916.
## Why
TUI onboarding trusted-project persistence should go through the same
app-server config write path as other config mutations. Writing
`config.toml` directly from the trust widget bypasses that layer and can
let onboarding proceed even when the trust decision was not actually
persisted.
## What changed
- Added a TUI config helper that writes the existing project trust
structure through `config/batchWrite`.
- Persists trust decisions as `projects.<project>.trust_level =
"trusted"` using the existing project trust key helper.
- Changed the trust directory widget to only record the user selection;
onboarding performs the app-server write before reporting success.
- Keeps the user on the trust screen and shows an error if app-server
persistence fails.
## Verification
- `cargo test -p codex-tui --lib
trust_persistence_failure_keeps_trust_step_in_progress`
- `cargo test -p codex-tui --lib
trusted_project_edit_targets_project_trust_level`
- Manual: built the local `codex-cli`, accepted the trust prompt in a
temp project, confirmed `projects.<project>.trust_level = "trusted"`,
and simulated an unwritable config to verify onboarding stays on the
trust screen without writing trust.
## Summary
Manual provider selection during `codex --oss` startup was still
persisting `oss_provider` through the legacy local `config.toml` writer.
That bypasses the app-server-owned config mutation path used by the TUI,
so this routes the write through the app server config API instead.
The net behavior is intentionally narrow: only an interactive picker
selection is persisted. Auto-detected single-running-provider startup
and explicit `--local-provider` startup remain ephemeral, so merely
having one backend running does not make that provider sticky for future
runs.
## What Changed
- Removed the TUI picker’s direct dependency on
`set_default_oss_provider`.
- Had `oss_selection` report whether the returned provider came from the
interactive picker.
- Carried only manually selected providers into startup persistence.
- Wrote `oss_provider` via `config/batchWrite` once the app server
session is available.
- Logged a warning and continued startup if the app-server config write
fails.
## Verification
Manually smoke-tested the real `codex-tui` binary with a temporary
`CODEX_HOME`, pseudo-terminal input, and a fake LM Studio HTTP server:
- Interactive picker selection persisted `oss_provider = "lmstudio"`.
- Non-picker `--local-provider lmstudio` startup did not persist
`oss_provider`.
Fixes#24093.
## Why
`--dangerously-bypass-hook-trust` is a supported CLI flag intended for
headless or automated runs where enabled hooks should be allowed to run
without requiring persisted trust. In the TUI, startup hook review still
opened whenever hooks looked untrusted, so a launch using the bypass
could block on the interactive "Hooks need review" prompt.
The tricky case is persistent app-server resume: a resume may attach to
an already-running thread, where resume config overrides are ignored. In
that path, hiding the startup review would be wrong because the existing
hook engine may still filter untrusted hooks.
## What Changed
- Startup hook review now skips the prompt only when hook trust bypass
is actually safe for that launch.
- The TUI forwards `bypass_hook_trust` through the app-server request
config for fresh thread start/resume/fork paths, and the app-server
applies it as a runtime-only `ConfigOverrides` value rather than
treating it like a `config.toml` setting.
- Persistent app-server resumes keep the startup review prompt so users
still have a chance to trust hooks when the running thread cannot
receive the bypass override.
## Verification
- Added focused coverage for startup hook review with and without
`bypass_hook_trust`.
- Extended existing TUI/app-server config override tests to cover
forwarding and applying `bypass_hook_trust`.
## Summary
Fixes#24411.
`/status` currently has no way to show when the TUI is talking to Codex
through a remote transport. That makes embedded local sessions, local
daemon sessions, and true remote sessions look the same, and it hides
the remote server version when debugging connection-specific behavior.
This PR adds a single `Remote` row for non-embedded connections only.
The row shows the sanitized connection address and a dimmed version
parenthetical, preserving the existing status output for embedded local
sessions.
<img width="791" height="144" alt="image"
src="https://github.com/user-attachments/assets/529d7940-1c45-4586-8b06-f20a1f04b771"
/>
## Verification
- Manually validated when connecting remotely (either implicitly to
local daemon or explicitly)
## Summary
The compact TUI status line already renders rate-limit percentages as
remaining capacity, but the text did not say so. That made high-usage
red indicators ambiguous because values like `weekly 6%` could be read
as either used or remaining.
This PR labels the compact rate-limit values explicitly as `left` across
the status line, terminal title, and setup previews.
Addresses #24274
## Why
We are seeing cases where users have an old background app-server still
running. `codex doctor` already reports background server state, but
without the running app-server version it is harder to diagnose
behaviors that depend on the daemon build.
## What changed
- Reused the app-server daemon's passive initialize probe through a
narrow `probe_app_server_version` helper.
- Updated the `codex doctor` Background Server section to report
`app-server version: <version>` when the socket is reachable.
- Preserved the not-running OK behavior and report `app-server version:
unavailable (<short error>)` when a socket exists but the passive probe
fails.
## Why
Issue #23031 was hard to diagnose from existing `codex doctor` output
because support could not see the OS language, resolved Git install, Git
repo metadata, Windows console mode/code page, or terminal-title inputs
that affect the TUI startup path. This adds those read-only signals to
`codex doctor` so Windows, Linux, and macOS reports carry the context
needed to investigate similar terminal rendering regressions.
Refs #23031
## What Changed
- Add a `system.environment` check for OS type/version, OS language, and
locale env vars.
- Add a `git.environment` check for the selected Git executable, PATH
Git candidates, version, exec path/build options, repository root,
branch, `.git` entry, and `core.fsmonitor`.
- Add Windows console code page and VT-processing mode details to
terminal diagnostics.
- Add a `terminal.title` check for configured/default title items and
resolved project-title source/value.
- Surface startup warning counts in config diagnostics and teach human
output to render the new categories.
## How to Test
1. On Windows, check out this branch and run `cargo run -p codex-cli --
doctor --summary`.
2. Confirm the Environment section includes `system`, `git`, `terminal`,
and `title` rows.
3. Run `cargo run -p codex-cli -- doctor --json`.
4. Confirm the JSON contains `system.environment`, `git.environment`,
and `terminal.title`; on Windows, confirm `terminal.env` details include
console code pages and `VT processing` for stdout/stderr.
5. From a non-git directory, run the same `doctor --json` command and
confirm the Git check reports `repo detected: false` rather than
warning.
Targeted tests:
- `cargo test -p codex-cli doctor`
- `cargo test -p codex-cli`
Move plugin tar.gz packing and unpacking into a shared core-plugins
archive helper so uploaded bundles are decoded through the same tar
handling used for installs. This removes duplicate archive logic,
supports GNU long-name entries on extraction, and keeps size, traversal,
link, and entry-type checks in one place.
## Summary
Change code-mode stored value updates to merge writes by key instead of
replacing the session's complete stored-value map after each cell
completes.
Previously, each cell received a snapshot of stored values and returned
the complete resulting map. When multiple cells ran concurrently, a
later completion could overwrite values written by another cell because
it committed an older snapshot.
This change moves stored-value ownership into `CodeModeService`:
- Each runtime starts from the service's current stored values.
- Runtime completion reports only keys written by that cell.
- The service merges those writes into the current stored-value map on
successful completion.
- Core no longer replaces its stored-value state from a cell result.
As a result, concurrently executing cells can update different stored
keys without clobbering one another.
The move into CodeModeService is motivated by a desire to have this
lifetime tied to a new lifetime object on that side in a subsequent PR.
# Why
`PreToolUse`, `PostToolUse`, and `updatedInput` coverage for local
function tools currently depends on each handler remembering to wire up
the hook contract itself. That makes coverage easy to miss as new
function tools are added, even though most of them share the same basic
shape: a model-facing function call with JSON arguments.
# What
This makes `CoreToolRuntime` provide the default hook contract for
ordinary local function tools:
- build generic `PreToolUse` and `PostToolUse` payloads from the
function tool name and arguments
- apply `updatedInput` rewrites back into function-tool arguments
through the same default path
- let tool outputs override the post-hook input or response when they
have a more stable hook-facing contract
The exceptions stay explicit:
- hosted tools remain outside the generic local function path
- code-mode `wait` and `write_stdin` opt out for now
- `PostToolUse` feedback replaces only the model-visible response, so
code mode keeps its typed tool result
With the generic path in place, the MCP and extension-tool adapters no
longer need their own duplicate pre/post hook plumbing. The new coverage
exercises the registry default plus end-to-end local function behavior
for pre-hook blocking, `updatedInput` rewriting, and post-hook context.
## Why
The package layout gives Codex a stable place for runtime helpers that
should travel with the entrypoint. `shell_zsh_fork` still required users
to configure `zsh_path` manually, even though we already publish
prebuilt zsh fork artifacts.
This PR builds on #24129 and uses the shared DotSlash artifact fetcher
to include the zsh fork in Codex packages when a matching target
artifact exists. Packaged Codex builds can then discover the bundled
fork automatically; the user/profile `zsh_path` override is removed so
the feature uses the package-managed artifact instead of a legacy path
knob.
## What Changed
- Added `scripts/codex_package/codex-zsh`, a checked-in DotSlash
manifest for the current macOS arm64 and Linux zsh fork artifacts.
- Taught `scripts/build_codex_package.py` to fetch the matching zsh fork
artifact and install it at `codex-resources/zsh/bin/zsh` when available
for the selected target.
- Added package layout validation for the optional bundled zsh resource.
- Added `InstallContext::bundled_zsh_path()` and
`InstallContext::bundled_zsh_bin_dir()` for package-layout resource
discovery.
- Threaded the packaged zsh path through config loading as the runtime
`zsh_path` for packaged installs, and removed the config/profile/CLI
override path.
- Kept the packaged default zsh override typed as `AbsolutePathBuf`
until the existing runtime `Config::zsh_path` boundary.
- Updated app-server zsh-fork integration tests to spawn
`codex-app-server` from a temporary package layout with
`codex-resources/zsh/bin/zsh`, matching the new packaged discovery path
instead of setting `zsh_path` in config.
- Switched package executable copying from metadata-preserving `copy2()`
to `copyfile()` plus explicit executable bits, which avoids macOS
file-flag failures when local smoke tests use system binaries as inputs.
## Testing
To verify that the `zsh` executable from the Codex package is picked up
correctly, first I ran:
```shell
./scripts/build_codex_package.py
```
which created:
```
/private/var/folders/vw/x2knqmks50sfhfpy27nftl900000gp/T/codex-package-pms94kdp/
```
so then I ran:
```
/private/var/folders/vw/x2knqmks50sfhfpy27nftl900000gp/T/codex-package-pms94kdp/bin/codex exec --enable shell_zsh_fork 'run `echo $0`'
```
which reported the following, as expected:
```
/private/var/folders/vw/x2knqmks50sfhfpy27nftl900000gp/T/codex-package-pms94kdp/codex-resources/zsh/bin/zsh
```
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/23756).
* #23768
* __->__ #23756
## Why
Remote-control websocket reconnects currently use the shared exponential
backoff helper without a local ceiling, so a long failure streak can
stretch retries out indefinitely and leave the runtime behavior hard to
inspect from logs.
## What Changed
Cap the remote-control reconnect delay at 30 seconds, then reset the
reconnect attempt counter once that capped delay is emitted so the next
failure starts from the initial jittered delay again.
The reconnect failure log now records the attempt number, chosen delay,
and whether the cap triggered a reset, with a separate info log when the
backoff counter is reset after the cap.
## Verification
`just test -p codex-app-server-transport`
Related issue: N/A
## Why
The zsh release workflow currently publishes macOS arm64 and Linux zsh
fork artifacts, but no macOS x64 artifact. The Codex package builder
therefore cannot include codex-resources/zsh/bin/zsh for
x86_64-apple-darwin packages.
## What Changed
- Added an x86_64-apple-darwin row to the macOS zsh release matrix.
- Runs that row on macos-15-large, the Intel macOS runner appropriate
for the native zsh build.
- Added the matching macos-x86_64 platform to the zsh DotSlash publish
config so the generated release manifest can reference the new tarball.
## Why
`openai/openai#947613` adds `X-Codex-Rate-Limit-Reached-Type` for Codex
workspace credit-depletion and spend-cap responses. The CLI currently
reads the adjacent promo header but otherwise renders generic
usage-limit copy, so those responses do not explain the
workspace-specific action the user needs to take.
Backend dependency: https://github.com/openai/openai/pull/947613
## What Changed
- Parse `X-Codex-Rate-Limit-Reached-Type` in the usage-limit error
handling path alongside `x-codex-promo-message`.
- Keep the header value parsing with the shared `RateLimitReachedType`
enum.
- Carry the parsed type on `UsageLimitReachedError` and render
client-owned copy for the four workspace owner/member credit and
spend-cap values.
- Preserve existing promo and plan-based text for absent, generic, or
unknown header values.
- Keep the existing TUI workspace-owner nudge state path unchanged; the
response header only selects the displayed error string.
- Add focused display coverage for all specific type values and the
generic fallback case.
## Test Plan
- Added `usage_limit_reached_error_formats_rate_limit_reached_types`
coverage.
- Not run manually, per request; CI runs validation on the pushed
commit.
## Why
The turn loop no longer needs to decide when a `ModelClientSession`
should reset its websocket state after compaction. That reset behavior
belongs inside the model client, where the websocket cache and retry
state are owned. The repo guidance now calls this out explicitly so
future changes let the incremental request logic decide whether the
previous request can be reused.
## What Changed
- Removed the `reset_client_session` return value from pre-sampling and
auto-compact helpers in `core/src/session/turn.rs`.
- Changed compaction helpers to return `CodexResult<()>` so callers only
handle success or failure.
- Made `ModelClientSession::reset_websocket_session` private to
`core/src/client.rs`, leaving it callable only from model-client
internals.
- Added `AGENTS.md` guidance not to call `reset_client_session`
unnecessarily.
## Validation
- `just test -p codex-core session::turn`
## Why
Before changing the Codex Bridge JSON schema policy, add integration
coverage around real connector-like MCP tool schemas. The existing unit
tests cover individual sanitizer behaviors, but they do not make it easy
to see whether full fixture schemas keep model-visible guidance, prune
only unreachable definitions, drop unsupported JSON Schema fields, and
stay within the Responses API schema budget.
## What Changed
- Added `tools/tests/json_schema_policy_fixtures.rs`, which converts MCP
tool fixtures through `mcp_tool_to_responses_api_tool` and validates the
resulting Responses tool parameters.
- Added connector-style fixtures for Slack, Google Calendar, Google
Drive, Notion, and Microsoft Outlook Email under
`tools/tests/fixtures/json_schema_policy/`.
- Added fixture assertions for preserved guidance, pruned definitions,
expected field drops after `JsonSchema` conversion, marker count
baselines, and dangling local `$ref` prevention.
- Added a real oversized golden Notion `create_page` input schema
fixture to exercise the compaction path that strips descriptions, drops
root `$defs`, rewrites local refs, and fits the compacted schema under
the budget.
## Summary
- add Divan benchmarks for prompt image re-encoding paths
- wire the image benchmark smoke test into Rust CI workflows
## Why
Image prompt handling includes re-encoding work that benefits from
repeatable benchmark coverage so changes can be measured in CI and
locally.
This already helped identify a potential regression from changing compiler flags.
## Impact
Developers can run and compare the new image re-encoding benchmarks, and
CI exercises the benchmark target via the Rust benchmark smoke test.
## Why
The idea here is to erase the difference between initial and followup
inputs to a turn. Followup inputs are already represented as TurnInput.
Eventual goal is not to have explicit on task input at all and pull
everything from input Q.
## What Changed
- Changes `SessionTask::run` and the erased `AnySessionTask::run` path
to accept `Vec<TurnInput>`.
- Wraps user-submitted spawn input as `TurnInput::UserInput` at the
session task start boundary.
- Updates `run_turn` to record initial `TurnInput` using the same hook
and recording path used for pending input.
- Keeps review-specific conversion local to `ReviewTask`, where the
sub-Codex one-shot API still expects `Vec<UserInput>`.
- Moves the synthetic compact prompt into `CompactTask` and starts
compact tasks with empty task input.
## Validation
- `cargo check -p codex-core`
- `just test -p codex-core -E
'test(task_finish_emits_turn_item_lifecycle_for_leftover_pending_user_input)
| test(queued_response_items_for_next_turn_move_into_next_active_turn) |
test(steered_input_reopens_mailbox_delivery_for_current_turn)'`
## Why
The package builder already fetches `rg` from a checked-in DotSlash
manifest. The zsh packaging work needs the same
fetch/cache/size-check/SHA-256/extract path for another manifest, but
keeping that refactor inside the zsh PR makes the review harder to
follow.
This PR factors the existing `rg`-specific implementation into a
reusable helper with no intended behavior change for `rg` packaging.
## What Changed
- Added `scripts/codex_package/dotslash.py` for checked-in DotSlash
manifest parsing, archive download, cache reuse, size validation,
SHA-256 validation, and member extraction.
- Updated `scripts/codex_package/ripgrep.py` to delegate to the shared
helper.
- Preserved the existing `rg` manifest path, cache key, destination
filename, and executable-bit behavior.
## Testing
- `python3 -m py_compile scripts/codex_package/dotslash.py
scripts/codex_package/ripgrep.py scripts/codex_package/cli.py
scripts/codex_package/layout.py scripts/codex_package/zsh.py`
- `python3 -m unittest discover scripts/codex_package`
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/24129).
* #23768
* #23756
* __->__ #24129
## What changed
- Add a distinct `responses_compaction_v2` value for
`CodexCompactionEvent.implementation`.
- Emit that value from the remote compaction v2 path.
- Keep local compaction as `responses` and legacy `/responses/compact`
as `responses_compact`.
## Why
Remote compaction v2 and local prompt-based compaction were both
reported as `responses`, which made the analytics table collapse two
different compaction mechanisms into one implementation bucket.
## Validation
- `just fmt`
- `just test -p codex-analytics`
`just test -p codex-core` was started locally, but this PR is
intentionally being pushed for CI to finish the remaining validation.
## Why
Standalone image generation needs a typed `codex-api` client surface for
the Codex image proxy routes before the harness and model-facing tool
layers are wired in.
## What changed
- Added `ImagesClient` support for JSON `images/generations` and
`images/edits` requests.
- Added typed request and response shapes for generation, JSON edit
image URLs, image metadata, and base64 image outputs.
- Kept generation model slugs open-ended while requiring the generation
model field that the downstream endpoint expects.
- Exported the new client and image types from `codex-api`.
- Added coverage for generation and edit wire shapes, extra response
metadata that the client ignores, and malformed image responses missing
`data`.
## Validation
- `cargo test -p codex-api`
- `just fix -p codex-api`
- `just fmt`
- `git diff --check main`
## Summary
- add `--oauth-client-id` and `--oauth-resource` options for streamable
HTTP `codex mcp add` registrations
- persist those options in MCP server config and use them during the
immediate OAuth login flow
- cover add-time serialization of both OAuth options in the CLI
integration tests
## Testing
- `just fmt`
- `cargo test -p codex-cli`
- `just fix -p codex-cli`
## Why
[Recent PR](https://github.com/openai/codex/pull/22709) removed
`trace_id` from `TurnContextItem`.
## What changed
- Add to `TurnStartedEvent` so rollout consumers can correlate turns
with telemetry traces.
- Note that the branch name is out of date because I originally re-added
to `TurnContextItem`, but we decided to move it to `TurnStartedEvent`.
## Verification
- `cargo test -p codex-protocol`
- `cargo test -p codex-core --lib
regular_turn_emits_turn_started_without_waiting_for_startup_prewarm`
- `cargo test -p codex-core --test all
emits_warning_when_resumed_model_differs`
- `cargo test -p codex-rollout`
- `cargo test -p codex-state`
## Why
`codex sandbox` now always runs the host sandbox backend, so it should
accept the same profile selection mechanism as the rest of the runtime
CLI surface. Without `--profile`, sandbox debugging can exercise only
the default config stack unless users manually translate profile config
into ad hoc `-c` overrides.
Supporting `--profile` lets sandbox invocations load
`$CODEX_HOME/<name>.config.toml`, including permission profile
configuration, before resolving the sandbox policy for the command being
run.
## What Changed
- Added `--profile NAME` / `-p NAME` to the host-specific `codex
sandbox` argument structs as `config_profile`.
- Allowed root-level `codex --profile NAME sandbox ...` and made a
sandbox-local `codex sandbox --profile NAME ...` override the root
selection.
- Threaded `LoaderOverrides` through sandbox config loading so selected
config profile files participate in permission resolution before the
legacy read-only fallback.
- Documented the new sandbox flag in `codex-rs/README.md`.
## Verification
- Added parser coverage for `codex sandbox --profile`.
- Added sandbox config-loader coverage that verifies selected config
profile loader overrides select the profile config rather than falling
back to read-only.
- Ran `cargo test -p codex-cli`.
## Why
Older Git for Windows versions can leave the Windows console output mode
without virtual terminal processing after Codex runs git metadata
commands in a repository. When the TUI later emits ANSI control
sequences for redraws, restore, or image rendering, Windows Terminal can
show raw escape bytes or leave the prompt/status area corrupted.
This is a targeted mitigation for the repo-conditioned Windows rendering
corruption reported in #23888 and related reports #23512 and #23628.
Updating Git avoids the trigger for affected users, but Codex should
also reassert the terminal mode before it writes TUI control sequences.
| Before | After |
|---|---|
| <img width="2100" height="1359" alt="CleanShot 2026-05-22 at 11 23 21"
src="https://github.com/user-attachments/assets/3218c379-5f97-4c71-ab25-805c9d20578a"
/> | <img width="2100" height="1359" alt="CleanShot 2026-05-22 at 11 23
58"
src="https://github.com/user-attachments/assets/55ac72bb-37d0-400e-99bc-12dd5ea4092d"
/> |
## What Changed
- Re-enable Windows virtual terminal processing for stdout and stderr
before TUI mode setup, restore, redraw, resume, and pet image render
paths.
- Treat invalid, null, or non-console handles as no-ops so redirected or
non-console output is unaffected.
- Keep the helper as a no-op on non-Windows platforms.
## How to Test
1. On Windows Terminal with a Git 2.28.0 for Windows install, start
Codex inside a valid Git repository.
2. Start a new Codex CLI session.
3. Confirm the prompt, working indicator, and bottom status line remain
readable instead of showing raw ANSI escape sequences.
4. Repeat outside a Git repository to confirm the ordinary non-repo
startup path is unchanged.
Targeted tests:
- Not run locally; the behavior depends on Windows console mode APIs and
the current worktree is on macOS.
Now that users can install via `curl` (or `irm`), we should tell them
about it so they no longer need to use `npm`!
Note that on one Windows machine I tested on, when I ran:
```
irm https://chatgpt.com/codex/install.ps1 | iex
```
I got this error:
```
iex : The property 'OSArchitecture' cannot be found on this object. Verify that the property exists.
At line:1 char:45
+ irm https://chatgpt.com/codex/install.ps1 | iex
+ ~~~
+ CategoryInfo : NotSpecified: (:) [Invoke-Expression], PropertyNotFoundException
+ FullyQualifiedErrorId : PropertyNotFoundStrict,Microsoft.PowerShell.Commands.InvokeExpressionCommand
```
so we'll recommend the following that works from both `cmd.exe` and
PowerShell:
```
powershell -ExecutionPolicy ByPass -c "irm https://chatgpt.com/codex/install.ps1 | iex"
```
This PR makes a slight update to `codex-rs/tui/src/update_action.rs` to
match.
## Why
Windows sandbox diagnostics currently append to a single `sandbox.log`
under `CODEX_HOME/.sandbox`. That file never rolls over, which makes it
hard to safely include sandbox diagnostics in future feedback reports
without risking unbounded growth.
## What changed
- Replaced direct append-open sandbox logging with
`tracing_appender::rolling::RollingFileAppender`.
- Configured sandbox logs to rotate daily using names like
`sandbox.YYYY-MM-DD.log`.
- Added a conservative `MAX_LOG_FILES` cap of 90 retained matching log
files.
- Routed the Windows sandbox setup helper through the same rolling
writer.
- Added helpers for resolving the current daily sandbox log path so
future feedback upload work can use the same filename logic.
- Updated tests and test diagnostics to read the dated daily log file.
This intentionally does not include sandbox logs in `/feedback` yet;
scrubbing and attachment behavior can happen in a follow-up.
## Testing
- `cargo fmt -p codex-windows-sandbox`
- `cargo check -p codex-windows-sandbox`
- `cargo test -p codex-windows-sandbox`
- `cargo test -p codex-windows-sandbox logging::tests`
- `cargo clippy -p codex-windows-sandbox --all-targets -- -D warnings`
Add new enterprise requirement gate.
Validation:
- `cargo test -p codex-config --lib`
- `cargo test -p codex-app-server-protocol --lib`
- `cargo test -p codex-tui --lib debug_config`
- `cargo test -p codex-app-server --lib` *(fails: stack overflow in
`in_process::tests::in_process_start_initializes_and_handles_typed_v2_request`;
reproduces when run alone)*
## Why
Legacy `[profiles.<name>]` config tables and the legacy `profile`
selector are being retired in favor of profile files selected with
`--profile <name>`. After #23886 removed the CLI-side legacy profile
plumbing, the app-server config surface still exposed those fields and
still carried conversion code for the old protocol shape.
## What changed
- Remove `profile`, `profiles`, and `ProfileV2` from the app-server
config protocol/schema output so `config/read` no longer returns legacy
profile config.
- Drop the old v1 `UserSavedConfig` profile conversion path from
`config`.
- Reject new app-server config writes under `profiles.*` with the same
migration direction used for `profile`, while still allowing callers to
clear existing legacy profile tables.
- Refresh app-server config coverage and the experimental API README
example around the remaining `Config` nesting path.
## Verification
- Added config-manager coverage that `config/read` omits legacy profile
config, `profiles.*` writes are rejected, and existing legacy profile
tables can still be cleared.
- Updated the v2 config RPC test to cover the rejected `profiles.*`
batch-write path.
## Why
`codex sandbox` previously required an OS subcommand like `linux`,
`macos`, or `windows`, even though the command can only run the sandbox
backend available on the current host. That made the CLI imply a
cross-OS choice that does not exist.
## What changed
- Collapse `codex sandbox <os>` into `codex sandbox [COMMAND]...` by
wiring the `sandbox` parser directly to the host-specific backend args
with `cfg`.
- Keep the existing backend runners for Seatbelt, Linux sandbox, and
Windows restricted token.
- Rename the public Windows debug sandbox runner to
`run_command_under_windows_sandbox` for clarity.
- Update the Rust sandbox docs and related README references to describe
host OS selection and avoid pointing readers at legacy `sandbox_mode`
config.
## Arg0 compatibility
The `codex-linux-sandbox` helper path is still handled before normal CLI
parsing. `arg0_dispatch()` checks whether the executable basename is
`codex-linux-sandbox` and directly calls
`codex_linux_sandbox::run_main()`, so removing the `sandbox linux`
parser branch does not affect the arg0 helper flow.
## Verification
- `cargo test -p codex-cli`
- `cargo test -p codex-arg0`
- `just fix -p codex-cli`
## Why
The TUI currently creates a shared plaintext `codex-tui.log` under the
default log directory. That append-only file can keep growing across
runs even though the TUI already records diagnostics in bounded local
stores.
Make the plaintext file log an explicit troubleshooting choice instead
of a default side effect.
This is possible because logs are also stored in the DB with proper
rotation
## What changed
- Only install the TUI file logging layer when `log_dir` is explicitly
set.
- Remove the prior `codex-tui.log` at startup before an opt-in file
layer is created.
- Clarify the `log_dir` config/schema text and `docs/install.md` example
so users opt in with `codex -c log_dir=...` when they need a plaintext
log.
## Why
Remote compaction v2 sends a normal `/responses` request with a
compaction trigger. It should follow the retry semantics used by normal
Responses streaming calls for transient stream/request failures, while
keeping a smaller per-transport retry budget because compact attempts
can run much longer than normal turns.
## What changed
- Add a v2 compaction retry loop that uses `stream_max_retries`,
matching normal Responses turn retry mechanics.
- Cap the compact v2 retry budget at 2 retries per transport with
`min(stream_max_retries, 2)`.
- Retry retryable request-open and post-open stream collection failures
through the same loop.
- Use the existing 200ms exponential backoff and requested retry delay
handling used by normal turn retries.
- Emit the same `Reconnecting... n/max` stream-error notification
pattern.
- Fall back from WebSockets to HTTPS after the compact v2 stream retry
budget is exhausted, then reset the retry counter for HTTPS.
- Keep final remote-compaction failure logging after retries/fallback
are exhausted.
- Treat compact stream EOF before `response.completed` as a retryable
stream failure.
- Add compact v2 regression coverage with `request_max_retries = 0` and
`stream_max_retries = 2`, covering both request-open failure and
opened-stream EOF in one end-to-end test.
## Tests
- `just fmt`
- `cargo test -p codex-core remote_compact_v2`
- `just fix -p codex-core`
`cargo test` for the core and other crates fails on a fresh macOS
checkout without the right stack size variable. This change encourages
using the just test command that sets the environment up correctly.
As a bonus, this should encourage agents to get more benefit out of
nextest's parallel execution.
`#[serde(default)]` wasn't sufficient for our generated TS types to
reflect that clients didn't have to set them. We also need
`skip_serializing_if = "std::ops::Not::not"`. This is already a rule in
our agents.md file.
Updates our build script to pull down the artifacts like we do in CI for
building v8 into our targets.
This changes the flow so that we now pre-install rusty v8 assets for all
of our release targets from pre-built in workflow.
Secondarily if running it locally we now optionally pull the assets down
on python run assuming the user hasn't set the proper values, it then
provides them.
Sorry for the miss here.
## Why
`--profile` now selects `<name>.config.toml`, so the legacy `profile`
selector should not be reintroduced through config write or MCP tool
paths. A matching legacy selector in base user config also needs the
same migration guard as a matching legacy `[profiles.<name>]` table so
profile loading fails with one clear migration error instead of mixing
the old and new profile models.
## What
- reject non-null app-server config writes to the top-level legacy
`profile` selector
- make `--profile <name>` reject base user config that still selects the
same legacy `profile = "<name>"` value, alongside the existing matching
legacy profile-table guard
- reject removed MCP `codex` tool fields such as `profile` by denying
unknown tool-call parameters and exposing that restriction in the
generated schema
- add regression coverage for the app-server write paths, config loader
guard, and MCP tool input/schema behavior
## Verification
- targeted regression tests cover the new app-server, config loader, and
MCP rejection paths
## Summary
- drop the dead legacy profile usage metric and active-profile
conversation-start fields
- update role comments so they describe provider and service-tier
preservation without legacy config-profile wording
- pair the code cleanup with the file-backed profile docs update in
openai/developers-website#1476
## Testing
- `just fmt`
- `cargo test -p codex-otel`
- `cargo test -p codex-core` *(fails: existing stack overflow in
`mcp_tool_call::tests::guardian_mode_mcp_denial_returns_rationale_message`)*
- `cargo test -p codex-core --lib
mcp_tool_call::tests::guardian_mode_mcp_denial_returns_rationale_message`
*(fails with the same stack overflow)*
## Why
`/feedback` asks `ThreadManager` for the selected agent subtree before
it uploads logs. The previous live subtree path reconstructed
parent-child links by iterating every loaded thread and awaiting each
thread config snapshot, so unrelated loaded-thread state could stall
feedback subtree enumeration.
The loaded-thread set already belongs to
[`ThreadManagerState`](50e6644c94/codex-rs/core/src/thread_manager.rs).
Reading thread-spawn parents from the captured `CodexThread` session
sources at that boundary keeps unload and resume behavior manager-owned
while avoiding per-session config inspection.
## What Changed
- expose parent-child thread-spawn edges for loaded, non-internal
threads from `ThreadManagerState`
- build the live child map from those edges while keeping agent metadata
lookup and ordering in `AgentControl`
- add regression coverage for live subtree enumeration when no state DB
is available
## Validation
- `git diff --check`
- local Rust tests not run per request
## Why
[#23883](https://github.com/openai/codex/pull/23883) moved the
user-facing `--profile` flag onto profile v2 and
[#23886](https://github.com/openai/codex/pull/23886) removed CLI
forwarding for the legacy profile-v1 path. Core and TUI config
persistence still carried `active_profile` and
`ConfigEditsBuilder::with_profile`, which let later writes continue
targeting legacy `[profiles.<name>]` tables after profile selection
moved to profile-v2 config files.
## What
- Remove legacy profile routing from
[`ConfigEditsBuilder`](4b38e9c22e/codex-rs/core/src/config/edit.rs (L1064-L1294)),
so core config edits no longer carry `with_profile` or infer
`[profiles.*]` write targets from a `profile` key.
- Drop `active_profile` plumbing from runtime `Config`, TUI
startup/state, app-server config override forwarding, and Windows
sandbox setup persistence.
- Make app-server-backed TUI config edits use unscoped model,
service-tier, feature, Auto-review, plan-mode, and Windows sandbox paths
through
[`tui/src/config_update.rs`](4b38e9c22e/codex-rs/tui/src/config_update.rs (L43-L112)).
- Update config edit coverage so legacy `profile` state stays untouched
by direct model writes, and remove tests whose only contract was the
deleted profile-scoped persistence path.
## Testing
- Not run locally.
## Why
[#23883](https://github.com/openai/codex/pull/23883) moved user-facing
`--profile` selection onto profile v2, and
[#23886](https://github.com/openai/codex/pull/23886) removed the old CLI
`config_profile` override path. Core still had a second legacy path:
`profile = "..."` could select `[profiles.*]` values while runtime
config was built. Keeping that resolver alive preserves the old
precedence model and profile-carrying surfaces even though profile
selection now points at `$CODEX_HOME/<name>.config.toml`.
## What
- Reject legacy top-level `profile = "..."` config while loading runtime
config, with an error that points callers at `--profile <name>` and
`<name>.config.toml` in the [core load
path](3d923366ec/codex-rs/core/src/config/mod.rs (L2524-L2531)).
- Remove the remaining profile-v1 merge points from runtime config
resolution, including features, permissions, model/provider selection,
web search, Windows sandbox settings, TUI settings, role reloads, and
OSS provider lookup.
- Drop the leftover profile override surface from
[`ConfigOverrides`](3d923366ec/codex-rs/core/src/config/mod.rs (L2118-L2148))
and from the MCP server `codex` tool schema.
- Prune profile-precedence tests that only exercised the removed
resolver and replace them with rejection coverage for the legacy
selector.
## Testing
- Not run in this metadata pass.
- Added
[`legacy_profile_selection_is_rejected`](3d923366ec/codex-rs/core/src/config/config_tests.rs (L7942-L7965))
coverage for the new runtime guard.
## Why
`codex --profile <name> mcp ...` should reach the same profile-v2
migration guard as runtime commands. Otherwise legacy
`[profiles.<name>]` users see the generic command-scope rejection
instead of the existing guidance to move settings into
`$CODEX_HOME/<name>.config.toml`.
## What
- Allow `codex mcp` through the `--profile` subcommand gate.
- Pass profile loader overrides into the MCP entry point only to
validate profile-v2 migration when a profile is present.
- Keep MCP add/remove/list/get/login/logout behavior otherwise
unchanged; this does not add profile-scoped MCP server management.
- Cover the legacy profile migration error for `codex --profile work mcp
list`.
## Testing
- `cargo test -p codex-cli`
## Summary
- set `NODE_USE_ENV_PROXY=1` when Codex applies managed network proxy
environment overrides
- keep the Node opt-in in the proxy environment key set used by
shell/runtime env handling
- cover the new env var in the focused network proxy env test
## Why
Codex already sets HTTP proxy environment variables for child processes
when the managed network proxy is active. Node's built-in network
behavior needs the `NODE_USE_ENV_PROXY` opt-in to honor those env vars,
so Node-based skill scripts can otherwise skip the managed proxy path
and fail under restricted network access.
## Validation
- `just fmt` in `codex-rs`
- `cargo test -p codex-network-proxy` in `codex-rs`
## Summary
- Treat MCP tools with `readOnlyHint: true` as parallel-safe even when
`supports_parallel_tool_calls` is unset or `false`.
- Keep server-level `supports_parallel_tool_calls` as an additive
override for non-read-only tools.
- Add focused unit coverage for the MCP handler eligibility decision.
- Update RMCP integration coverage to keep the serial baseline on a
mutable tool, verify read-only concurrency without server opt-in, and
preserve the server opt-in concurrency path separately.
## Testing
- `just fmt`
- `cargo test -p codex-core --lib tools::handlers::mcp::tests::`
- `cargo test -p codex-core --test all
stdio_mcp_read_only_tool_calls_run_concurrently_without_server_opt_in`
- `cargo test -p codex-core --test all
stdio_mcp_parallel_tool_calls_opt_in_runs_concurrently`
- `cargo test -p codex-rmcp-client`