Commit Graph

5954 Commits

Author SHA1 Message Date
jif-oai
9271e84b79 feat: add manual and remote_v2 tags to compaction metric (#24608)
## Why
`codex.task.compact` only distinguished `local` vs `remote`, which made
it hard to answer simple counter questions in Statsig. Manual `/compact`
and automatic compaction were collapsed together, and the legacy remote
path was also collapsed with `remote_compaction_v2`.

## What Changed
- route `codex.task.compact` through a shared helper in
`core/src/tasks/mod.rs`
- add a `manual=true|false` tag so manual and automatic compaction can
be counted separately
- split the remote tag into `remote` and `remote_v2`
- emit the metric from the inline auto-compaction path in
`core/src/session/turn.rs` as well as the manual `CompactTask` path in
`core/src/tasks/compact.rs`
- add focused unit coverage for the new tag shapes in
`core/src/tasks/mod_tests.rs`

## Verification
- added unit coverage in `core/src/tasks/mod_tests.rs` covering manual
`remote_v2` tags and automatic `local` tags
2026-05-26 18:47:42 +02:00
viyatb-oai
f6fd753039 tui: add named permission profile picker (#21559)
## Why

Users who opt into named permission profiles through
`default_permissions` or `[permissions.*]` should stay in named-profile
semantics when they open `/permissions`. The legacy picker rewrites
those users into anonymous preset state, which loses the active profile
identity and hides custom configured profiles.

## What changed

- Switch `/permissions` to a profile-aware picker when profile mode is
active.
- Show friendly built-in labels instead of raw `:` profile syntax.
- Include configured custom profiles and their descriptions in the
picker.
- Route selections through the split TUI profile-selection flow below
this PR.
- Add TUI snapshots and regression coverage for built-ins, custom
profiles, and conflicting legacy runtime overrides.

## Stack

1. [#22931](https://github.com/openai/codex/pull/22931):
runtime/session/network propagation for active permission profiles.
2. [#23708](https://github.com/openai/codex/pull/23708): TUI selection
plumbing and guardrail flow.
3. **This PR**: profile-aware `/permissions` menu and custom profile
display.

## UX impact

In profile mode, `/permissions` shows the same human-facing built-ins
users already know:

```text
Default
Auto-review
Full Access
Read Only
locked-down
web-enabled
```

Selecting `locked-down` keeps `active_permission_profile =
Some("locked-down")`; selecting a built-in keeps the friendly label
while switching to its named built-in profile.

## Screenshots

Live `$test-tui` smoke screenshots uploaded through GitHub attachments:

**Profile mode with built-ins and custom profiles**

<img width="832" alt="Profile mode permissions picker with custom
profiles"
src="https://github.com/user-attachments/assets/58b72431-418c-4839-9e39-575076db4c8f"
/>

**Legacy mode remains anonymous preset picker**

<img width="1232" alt="Legacy permissions picker"
src="https://github.com/user-attachments/assets/95f413ab-4cee-411c-9afb-92580a885c97"
/>

<img width="1296" height="906" alt="image"
src="https://github.com/user-attachments/assets/ea381a78-9904-4aa2-828f-b7f2e43f60f2"
/>

<img width="705" height="207" alt="Screenshot 2026-05-18 at 2 58 00 PM"
src="https://github.com/user-attachments/assets/2fa6dd71-0296-449e-a6de-a72d78a1cb70"
/>

## Validation

- `git diff --cached --check` before commit.
- Full test run skipped at the user request while pushing the split
stack.
2026-05-26 16:39:55 +00:00
jif-oai
ef6528c6c7 feat: gate dedicated memories tools in config (#24600)
## Why

The memories extension already has dedicated `list`, `read`, `search`,
and `add_ad_hoc_note` tools, but app-server registration was still
disabled. The memories app collaborator needs an explicit config switch
so those native extension tools can be exposed intentionally, without
making ordinary memory prompt usage automatically register the dedicated
tool surface.

## What changed

- Added `[memories].dedicated_tools`, defaulting to `false`, to
`MemoriesToml` / `MemoriesConfig`.
- Regenerated `core/config.schema.json` for the new setting.
- Registered the memories extension as a `ToolContributor`, while
keeping tool contribution gated on both memories being enabled and
`dedicated_tools = true`.
- Added tests for the disabled default, the enabled dedicated-tools
path, and installer registration.

## Verification

- `just test -p codex-config -p codex-memories-extension`
2026-05-26 18:18:58 +02:00
Eric Traut
b84c5898df tui: include exec sessions in resume list (#24503)
## Why

Fixes #24502.

`codex resume --include-non-interactive` should include sessions created
by `codex exec`, but the TUI was sending no `sourceKinds` filter to
`thread/list` for that mode. `thread/list` treats omitted or empty
`sourceKinds` as interactive-only (`cli`, `vscode`), so exec sessions
were still filtered out.

## What Changed

- Added a shared TUI `resume_source_kinds` helper so both resume lookup
paths always pass explicit `sourceKinds` to `thread/list`.
- Kept the default resume behavior scoped to `cli` and `vscode`.
- Made `--include-non-interactive` include `exec` and `appServer`
sessions, while continuing to exclude subagent and unknown sources.

## Verification

Added focused coverage for both affected TUI request builders:

- `latest_session_lookup_params_can_include_non_interactive_sources`
- `remote_thread_list_params_can_include_non_interactive_sources`
2026-05-26 08:27:10 -07:00
pakrym-oai
ff7513cd83 Move MCP tool naming mode into manager (#21576)
## Why

The `non_prefixed_mcp_tool_names` feature should be applied where MCP
tools become model-visible, not by remapping names later in core.
Keeping the decision in `McpConnectionManager` construction makes
`ToolInfo` the single shaped view that spec building, deferred tool
search, routing, and unavailable-tool placeholders can consume directly.

This also preserves the existing external behavior while the feature is
off, and keeps the feature-on behavior for code mode and hooks explicit
at the manager boundary.

## What Changed

- Add `McpToolNameMode` to `codex-mcp` and flow it through `McpConfig`
into `McpConnectionManager::new`.
- Normalize MCP `ToolInfo` names in the manager using either
legacy-prefixed namespaces or non-prefixed namespaces; the legacy path
adds `mcp__` without restoring the old trailing namespace suffix.
- Remove the core-side MCP name remapping path so specs, tool search,
session resolution, and unavailable-tool placeholder construction use
the manager-provided `ToolName` values directly.
- Keep code mode flattening on the `__` namespace separator.
- Preserve hook compatibility by giving non-prefixed MCP hook names
legacy `mcp__...` matcher aliases.
- Add/adjust integration and unit coverage for non-prefixed code-mode
behavior, hook matching with the feature on and off, and manager-level
legacy prefixing.

## Testing

- `cargo test -p codex-mcp --lib`
- `cargo test -p codex-core --lib tools::spec::tests -- --nocapture`
- `cargo test -p codex-core --lib mcp_tools -- --nocapture`
- `cargo test -p codex-core --lib mcp_tool_exposure -- --nocapture`
- `cargo test -p codex-core --test all mcp_tool -- --nocapture`
- `cargo test -p codex-core --test all search_tool -- --nocapture`
- `cargo test -p codex-core --test all hooks_mcp -- --nocapture`
- `cargo test -p codex-core --test all
code_mode_uses_non_prefixed_mcp_tool_names_when_feature_enabled --
--nocapture`
- `cargo test -p codex-tools`
- `cargo test -p codex-features`
2026-05-26 08:21:15 -07:00
pakrym-oai
b637fd26aa [codex] Make active turn task singular (#24105)
## Why

`ActiveTurn` already runs at most one task: starting a task requires
that no task is present, and replacement aborts existing work first.
Representing that state as an `IndexMap` leaves a multi-task shape for a
single-task invariant and makes each lifecycle lookup operate like a
collection lookup.

The slot remains optional because goal continuation uses an empty active
turn as a reservation while deciding whether to start continuation work.

## What changed

- Replace `ActiveTurn.tasks` with `task: Option<RunningTask>`.
- Update task abort/completion, session lookup and steering, input-queue
matching, goal reservation, and network-approval lookup to operate on
the singular slot.
- Mutate the singular task slot directly instead of retaining
collection-era add/remove/take helpers.
- Record token usage on the completing active task span without a
regular-task-only opt-in flag.

## Validation

- `cargo test -p codex-core --lib session::tests::steer_input`
- `cargo test -p codex-core --lib
session::tests::abort_empty_active_turn_preserves_pending_input`
- `cargo test -p codex-core --lib
session::tests::queued_response_items_for_next_turn_move_into_next_active_turn`
- `cargo test -p codex-core --lib
session::tests::active_goal_continuation_runs_again_after_no_tool_turn`
- `cargo test -p codex-core --lib
session::tests::abort_regular_task_emits_turn_aborted_only`
- `cargo test -p codex-core --lib session::input_queue::tests`
2026-05-26 08:20:58 -07:00
Eric Traut
0f91e869bd Use thread config for TUI MCP inventory (#24532)
## Summary
`/mcp` in the TUI should reflect the current loaded thread, including
project-local MCP servers from that thread config. Before this change,
`mcpServerStatus/list` only read the latest global MCP config, so the
active chat could miss project-local servers.

This adds optional `threadId` to `mcpServerStatus/list`. When present,
app-server resolves the loaded thread and lists MCP status from the
refreshed effective config for that thread; when omitted, existing
global config behavior stays unchanged.

The TUI now sends the active chat thread id for `/mcp` and `/mcp
verbose`, carries that origin through the async inventory result, and
ignores stale completions if the user has switched threads before the
fetch returns. The app-server schemas were regenerated.

## Follow-up
Once this app-server API change lands, the desktop app should make the
same `threadId` plumbing so its MCP inventory also uses the current
thread config.

Fixes #23874
2026-05-26 07:44:04 -07:00
jif-oai
c4e53d103c Wire app-server extension event sink (#24586)
## Why

The goal extension already emits `ThreadGoalUpdated` events, but
production app-server thread extensions were built with the default
no-op extension event sink. That meant extension-driven goal updates
could be produced without ever reaching app-server clients.

## What changed

- Build app-server thread extensions with a host-provided
`ExtensionEventSink`.
- Add an app-server sink that converts extension `ThreadGoalUpdated`
events into `ServerNotification::ThreadGoalUpdated` broadcasts.
- Use the existing bounded outgoing message channel via `try_send` so
event forwarding cannot create an unbounded queue.
- Pass `NoopExtensionEventSink` in app-server tests that construct a
`ThreadManager` without an app-server host.
- Refresh `Cargo.lock` for the existing `codex-memories-extension`
`codex-otel` dependency.

## Verification

- `just test -p codex-app-server
extensions::tests::app_server_event_sink_forwards_thread_goal_updates`
2026-05-26 15:28:02 +02:00
jif-oai
01a8bf0ae3 Add memory tool call metrics to memories extension (#24583)
## Why

The memories extension now receives a metrics exporter, but the useful
extension-owned signal is the memory tool call itself: which operation
ran, which memory area it touched, whether the backend call succeeded,
and whether the result was truncated.

## What changed

- Added the `codex.memories.tool.call` counter in
`ext/memories/src/metrics.rs`.
- Emit that counter from `memories/add_ad_hoc_note`, `memories/list`,
`memories/read`, and `memories/search` after backend execution.
- Tag each call with `tool`, `operation`, `scope`, `status`, and
`truncated`.
- Pass the existing `MetricsClient` through the memories extension into
the tool executors; tests use `None`.

## Verification

- `just test -p codex-memories-extension`
2026-05-26 15:27:51 +02:00
jif-oai
b77be36896 fix: drop flake (#24588)
Dropping already commented out stuff
2026-05-26 15:07:26 +02:00
jif-oai
c37884d5eb Wire metrics client into memories extension (#24567)
## Summary

- let the memories extension capture the process-global OTEL metrics
client at install time
- keep app-server/TUI/exec extension construction APIs unchanged
- store the metrics client for future memory metrics without emitting
any metrics yet

## Test plan

- `just fmt`
- `just bazel-lock-update`
- `just bazel-lock-check`
- Not run: tests/clippy per request; CI will cover them
2026-05-26 13:56:46 +02:00
jif-oai
3936ed221d Add ad-hoc memory note tool (#24562)
## Why

Codex memory updates currently rely on instructions that tell agents to
create ad-hoc note files directly in the memory workspace. The memories
extension already has a `MemoriesBackend` abstraction for local storage
and future non-filesystem backends, so the ad-hoc note writer should
live behind that same interface instead of baking local filesystem
assumptions into the tool shape.

## What

- Adds a `memories/add_ad_hoc_note` tool to the existing memories tool
bundle.
- Extends `MemoriesBackend` with `add_ad_hoc_note` plus request/response
types so remote memory stores can implement the same operation later.
- Implements the local backend by creating append-only notes under
`extensions/ad_hoc/notes`.
- Validates the tool-provided filename contract
(`YYYY-MM-DDTHH-MM-SS-<slug>.md`), rejects path-like filenames, rejects
empty notes, and uses create-new semantics so existing notes are never
overwritten.
- Keeps memories tool contribution behind the existing commented-out
registration path; this defines the tool surface without newly exposing
it through app-server.

## Test Plan

- `just test -p codex-memories-extension`
2026-05-26 12:23:24 +02:00
jif-oai
de513a83f3 chore: move memory prompt builder into extension (#24558)
## Why

The memories extension now owns the read-path developer instructions it
injects at thread start. Keeping that prompt builder and template in
`codex-memories-read` left the extension depending on a helper crate for
extension-specific prompt assembly, and kept async template/truncation
dependencies in the read crate after the remaining read surface no
longer needed them.

## What changed

- Moved `prompts.rs`, its tests, and `templates/memories/read_path.md`
from `memories/read` into `ext/memories`.
- Wired `MemoryExtension` to call the local prompt builder and added the
moved templates to `ext/memories/BUILD.bazel` compile data.
- Removed the now-unused prompt export and prompt-related dependencies
from `codex-memories-read`.

## Testing

- Not run locally.
2026-05-26 11:53:47 +02:00
jif-oai
d579dafb70 chore: drop orphaned codex memories MCP crate (#24555)
## Why

The memory read-tool surface had two implementations: the app-server
extension path under `ext/memories`, and an unused `codex-memories-mcp`
workspace crate under `memories/mcp`. The MCP crate no longer has
reverse dependents, so keeping it around preserves duplicate backend,
schema, and tool code that is not part of the live app-server memory
path.

Dropping the orphaned crate makes the remaining memory crate split
clearer: `memories/read` owns read-path prompt/citation helpers,
`memories/write` owns the write pipeline, and `ext/memories` owns the
app-server extension integration.

## What changed

- Removed the `memories/mcp` crate and its Bazel/Cargo metadata.
- Removed `memories/mcp` from the Rust workspace and lockfile.
- Updated `memories/README.md` so it only lists the remaining reusable
memory crates.

## Verification

- `cargo metadata --format-version 1 --no-deps` succeeds.
2026-05-26 11:29:37 +02:00
jif-oai
7f9ab6e083 [wip] goal shift (#23858) 2026-05-26 11:22:18 +02:00
rhan-oai
04a8580f33 centralize Responses retry policy (#24131)
## Why

#23951 added remote compaction v2 retries, but it left the retry and WS
-> HTTPS fallback behavior duplicated between normal Responses turns and
compaction. This follow-up centralizes the common retry handling so
future changes to fallback, retry delay, retry notifications, and retry
sleep do not have to be kept in sync across both callsites.

## What changed

- Added `core/src/responses_retry.rs` with a shared handler for
retryable Responses stream errors.
- Reused that handler from normal turn sampling and remote compaction
v2.
- Kept each callsite responsible for its retry budget: normal turns
still use `stream_max_retries`, while compaction v2 still uses
`min(stream_max_retries, 2)`.
- Preserved caller-specific behavior around non-retryable errors,
context-window errors, usage-limit errors, and compact-specific final
failure logging.

The shared handler now owns:

- WS -> HTTPS fallback warning emission
- retry delay selection, including server-requested stream retry delay
- retry logging
- first-WebSocket-retry notification suppression
- `Reconnecting... n/max` stream-error notification
- sleeping before the next retry attempt

## Verification

- `cargo test -p codex-core remote_compact_v2`
- `cargo test -p codex-core websocket_fallback`
- `just fix -p codex-core`

Did not run the full workspace test suite.

---------

Co-authored-by: jif-oai <jif@openai.com>
2026-05-26 11:01:18 +02:00
jif-oai
4f7d6b4ef7 chore: stop consuming legacy config profiles (#24076)
## Why

The old config-profile mechanism should no longer influence runtime
behavior now that profile selection has moved to file-based `--profile`
config files. Core already rejects a selected legacy `profile = "..."`
with a migration error in
[`core/src/config/mod.rs`](d6451fcb79/codex-rs/core/src/config/mod.rs (L2521-L2529)),
but a few residual consumers still read legacy `[profiles.*]` data while
performing managed-feature checks and personality migration.

That kept dead legacy profile state relevant after selection had been
removed, and could make personality migration depend on a stale or
missing old profile.

## What changed

- Stop scanning legacy `[profiles.*]` feature settings when validating
managed feature requirements.
- Make personality migration consider only top-level `personality` and
`model_provider` settings.
- Remove the now-unused `ConfigToml::get_config_profile` helper.
- Update personality migration coverage to verify that legacy profile
personality fields and missing legacy profile names no longer affect
that migration path.

This keeps the legacy `profile` / `profiles` config shape available for
the remaining compatibility and migration diagnostics; it only removes
these behavior consumers.

## Verification

- Updated `core/tests/suite/personality_migration.rs` for the new
legacy-profile behavior.
- Focused test command: `cargo test -p codex-core
personality_migration`.
2026-05-26 10:34:43 +02:00
Eric Traut
e8651516f4 Log rollout writer OS errors (#24474)
## Why

Refs #24425.

We have seen rollout JSONL corruption that appears consistent with a
rollout write failing after partially appending a line, followed by a
retry that appends the same item again. The available user logs did not
include the underlying OS error, so it is hard to tell whether the
trigger was `ENOSPC`, quota exhaustion, a filesystem error, or something
else.

This PR adds the missing diagnostics for future reports.

## What changed

- Include `ErrorKind` and `raw_os_error()` in rollout writer failure
logs.
- Preserve the existing append-only rollout write path; this PR is
diagnostic-only.

## Verification

- `just test -p codex-rollout`
2026-05-26 10:33:22 +02:00
Felipe Coury
8a94430bb2 fix(process-hardening): preserve macos malloc diagnostics (#24479)
## Summary

Follow-up to #24459 and partial behavioral revert of `a71fc47` / #16699.

- Stop removing `MallocStackLogging*` and `MallocLogFile*` from macOS
pre-main hardening.
- Remove documentation that claims Codex suppresses those allocator
diagnostic controls.
- Retain the shared `remove_env_vars_with_prefix` refactor and existing
`LD_` / `DYLD_` hardening.

## Why

#24459 fixes the composer-corruption problem at the terminal stderr
boundary while preserving redirected stderr. With that guard in place,
stripping macOS malloc diagnostic settings is unnecessary and can hide
diagnostics intentionally enabled by callers.

## Validation

- `just fmt`
- `just test -p codex-process-hardening`
- `just argument-comment-lint-from-source -p codex-process-hardening`
- `git diff --check`
2026-05-25 17:26:10 -03:00
Felipe Coury
599416d733 fix(tui): prevent macos stderr from corrupting composer (#24459)
## Why

Fixes #17139.

On macOS, runtime diagnostics such as `MallocStackLogging` messages can
be written directly to process stderr while the inline TUI owns the
terminal. Those bytes paint into the same viewport as the composer
without passing through the renderer or composer state, making
diagnostic output appear to leak into the input area.

## What Changed

- Add a macOS terminal stderr guard while the inline TUI owns the
viewport.
- Restore stderr when Codex returns terminal ownership for external
interactive programs, suspend/resume, panic handling, and normal
shutdown.
- Add an fd-level regression test that verifies output is suppressed
only while terminal ownership is held and restored at each handoff
boundary.

## How to Test

1. On macOS, launch the interactive TUI and leave the composer visible.
2. Exercise the workflow that triggers an allocator/runtime stderr
diagnostic during an active session, as reported in #17139.
3. Confirm the diagnostic no longer overwrites the active composer
region.
4. Suspend or exit the TUI and confirm subsequent terminal stderr output
remains visible.

The platform diagnostic is environment-dependent, so the deterministic
regression check is the new fd-lifecycle test in
`tui::terminal_stderr::tests::suppresses_stderr_only_while_terminal_is_owned`.

Targeted validation:
- `just argument-comment-lint-from-source -p codex-tui` passed.
- `just test -p codex-tui` exercised and passed the new stderr-guard
regression test. The full invocation currently fails in two unrelated
guardian-policy tests,
`update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default`
and
`update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`,
which reproduce when rerun in isolation.
2026-05-25 19:53:40 +00:00
Felipe Coury
14d80e55cd fix(tui): improve multiline markdown list readability (#24351)
## Why

Numbered Markdown findings become hard to scan when long items visually
run together or when wrapped explanatory paragraphs lose their list
indentation. This is especially visible in review output: the next
number can look attached to the previous finding, and paragraph
continuation rows can jump back toward the left margin instead of
staying grouped beneath their item.

<table><tr><td>
<center>Before</center>
<img width="1718" height="836" alt="CleanShot 2026-05-24 at 14 00 49"
src="https://github.com/user-attachments/assets/f1ee0023-50fa-4f81-a641-ae08b17b99bd"
/>
</td></tr>
<tr><td> 
<center>After</center>
<img width="1714" height="906" alt="image"
src="https://github.com/user-attachments/assets/b123a5e0-a232-47bf-96d5-c935295f7c0a"
/>
</td></tr>
</table>

## What Changed

- Insert a blank separator before a sibling list item when the previous
item occupies more than one rendered line.
- Preserve compact rendering for lists whose sibling items each render
on one line.
- Preserve list-body leading whitespace when transient streamed
assistant rows require another wrapping pass for history display, so
wrapped paragraphs stay aligned beneath their item.
- Share the existing leading-whitespace prefix logic used by history
insertion instead of introducing a second indentation rule.
- Keep streamed Markdown output aligned with completed rendering and add
snapshots for findings-style spacing and streamed paragraph indentation.

## How to Test

1. Start Codex from this branch and open the recorded repro session
`019e563f-7d58-7ff2-8ec7-828f20fa61ca`.
2. Inspect the numbered `Findings` list whose items contain explanatory
paragraphs.
3. Confirm each multiline finding is separated from the next numbered
finding by one blank line.
4. Confirm wrapped rows of each indented paragraph remain aligned
beneath the finding body, rather than returning to the left edge.
5. Render a short one-line numbered or unordered list and confirm its
items remain compact without added blank rows.

Targeted tests:

- `just test -p codex-tui history_cell insert_history markdown_render
markdown_stream streaming::controller`
- `just argument-comment-lint-from-source -p codex-tui`

## Related Work

PR #24346 changes Markdown table column allocation in parallel. This PR
is intentionally limited to list-item readability and history wrapping;
both branches touch `codex-rs/tui/src/markdown_render.rs`, so a small
merge conflict may need resolution depending on merge order.
2026-05-25 15:42:28 -03:00
Felipe Coury
20d1b7674d fix(tui): improve markdown table column allocation (#24346)
## Why

Markdown tables with a long path-heavy column could allocate almost all
available width to that column and collapse neighboring prose columns to
only a few characters. In rollout summaries this made `Unit` and `What
It Adds` difficult to read, even though the long `Files` values were the
content best suited to wrapping.

The affected example also specified `Files` as right aligned in its
markdown delimiter (`---:`). This change preserves that requested
alignment while improving how width is distributed.

| Before | After |
|---|---|
| <img width="1709" height="764" alt="image"
src="https://github.com/user-attachments/assets/932ab21c-b72d-48a2-9aad-b69da87a0968"
/> | <img width="1711" height="855" alt="image"
src="https://github.com/user-attachments/assets/4028bd20-2228-4c2f-be8a-1866325b7f62"
/> |


## What Changed

- Classify table columns as narrative, token-heavy, or compact during
width allocation.
- Shrink token-heavy path and URL columns before shrinking narrative
prose, while preserving compact counts and short labels longest.
- Use readable soft floors for narrative and token-heavy content before
falling back to tighter layouts.
- Add snapshot coverage for a rollout-shaped table containing
right-aligned file paths and prose columns.

## How to Test

1. Render a markdown table with `Unit`, right-aligned `Files`, `Adds`,
`Removes`, and `What It Adds` columns at a constrained terminal width.
2. Put long repository paths in `Files` and sentence-length content in
`Unit` and `What It Adds`.
3. Confirm that `Files` remains right aligned but wraps before the
narrative columns become unreadable.
4. Confirm that the compact numeric columns remain easy to scan.

Targeted tests:
- `just test -p codex-tui markdown_render`

Validation note: `just test -p codex-tui` was also attempted and reached
two existing unrelated failures in
`app::tests::update_feature_flags_disabling_guardian_*`; the markdown
rendering regression test passes in the targeted run.
2026-05-25 15:09:17 -03:00
Eric Traut
a7836744cc Add doctor thread inventory audit (#24305)
## Why

Users have been reporting missing sessions in the app. The app server
thread listing is backed by the SQLite state DB, but the durable source
of truth for a thread still exists on disk as rollout JSONL. When the
state DB is incomplete, doctor should be able to show the mismatch
directly instead of leaving users with a generic state health result.

## What changed

This adds a `threads` doctor check that compares active and archived
rollout files under `CODEX_HOME` with rows in the SQLite `threads`
table. The check reports missing rollout rows, stale DB rows, archive
flag mismatches, duplicate rollout thread IDs, duplicate DB paths,
source/provider summaries, and bounded samples of affected rollout
paths.

It also adds a read-only state audit helper in `codex-rs/state` so
doctor can inspect thread rows without creating, migrating, or repairing
the database.

## Sample output

```text
  ⚠ threads      rollout files are missing from the state DB
      default model provider   openai
      rollout DB active files  3910
      rollout DB archived files 2037
      rollout DB scan errors   0
      rollout DB malformed file names 0
      rollout DB scan cap reached false
      rollout DB rows          5499
      rollout DB active rows   3462
      rollout DB archived rows 2037
      rollout DB missing active rows 448
      rollout DB missing archived rows 0
      rollout DB stale rows    0
      rollout DB archive mismatches 0
      rollout DB duplicate rollout thread ids 0
      rollout DB duplicate DB paths 0
      rollout DB model providers openai=5359, lmstudio=35, mock_provider=33, lite_llm=26, proxy=26, ollama=15, lms=4, local-usage-limit=1
      rollout DB sources       vscode=2587, cli=1494, subagent:thread_spawn=577, subagent:other=502, exec=281, subagent:memory_consolidation=46, subagent:review=9, unknown=3
      rollout DB missing active sample ~/.codex/sessions/2026/0…857e-a923c712e066.jsonl
      rollout DB missing active sample ~/.codex/sessions/2025/0…877a-766dff25c68d.jsonl
      rollout DB missing active sample ~/.codex/sessions/2025/0…a8b1-7bbadc836f6e.jsonl
      rollout DB missing active sample ~/.codex/sessions/2025/0…a218-e6197f3f62f8.jsonl
      rollout DB missing active sample ~/.codex/sessions/2025/0…9011-7e30784f9932.jsonl
```
2026-05-25 10:29:06 -07:00
Eric Traut
613e5149a4 TUI config cleanup: MCP inventory (#24265)
## Summary

The TUI `/mcp` inventory flow should reflect the app server’s MCP status
response. It was also joining those results with the TUI process’s local
`config.mcp_servers`, which can diverge once MCP state is owned by a
remote app server and cause stale local command, URL, status, or
empty-state details to render.

This change removes the local config join from the app-server-backed
inventory renderer. The TUI now renders directly from the existing
`mcpServerStatus/list` payload and treats an empty status response as
the empty MCP inventory state.

## Known limitation

The existing `mcpServerStatus/list` payload does not include
disabled-state or disabled-reason fields. To preserve the current
app-server API, this PR does not try to infer that state from
client-local config. If remote `/mcp` needs to show disabled/reason
details again, that should come from app-server-owned status data in a
follow-up.

Related to #22914, #22915, and #22916.
2026-05-25 09:56:21 -07:00
Eric Traut
bb55736906 TUI config cleanup: trusted projects (#24255)
## Why
TUI onboarding trusted-project persistence should go through the same
app-server config write path as other config mutations. Writing
`config.toml` directly from the trust widget bypasses that layer and can
let onboarding proceed even when the trust decision was not actually
persisted.

## What changed
- Added a TUI config helper that writes the existing project trust
structure through `config/batchWrite`.
- Persists trust decisions as `projects.<project>.trust_level =
"trusted"` using the existing project trust key helper.
- Changed the trust directory widget to only record the user selection;
onboarding performs the app-server write before reporting success.
- Keeps the user on the trust screen and shows an error if app-server
persistence fails.

## Verification
- `cargo test -p codex-tui --lib
trust_persistence_failure_keeps_trust_step_in_progress`
- `cargo test -p codex-tui --lib
trusted_project_edit_targets_project_trust_level`
- Manual: built the local `codex-cli`, accepted the trust prompt in a
temp project, confirmed `projects.<project>.trust_level = "trusted"`,
and simulated an unwritable config to verify onboarding stays on the
trust screen without writing trust.
2026-05-25 09:54:05 -07:00
Eric Traut
f05fd0e661 TUI config cleanup: oss_provider (#24254)
## Summary

Manual provider selection during `codex --oss` startup was still
persisting `oss_provider` through the legacy local `config.toml` writer.
That bypasses the app-server-owned config mutation path used by the TUI,
so this routes the write through the app server config API instead.

The net behavior is intentionally narrow: only an interactive picker
selection is persisted. Auto-detected single-running-provider startup
and explicit `--local-provider` startup remain ephemeral, so merely
having one backend running does not make that provider sticky for future
runs.

## What Changed

- Removed the TUI picker’s direct dependency on
`set_default_oss_provider`.
- Had `oss_selection` report whether the returned provider came from the
interactive picker.
- Carried only manually selected providers into startup persistence.
- Wrote `oss_provider` via `config/batchWrite` once the app server
session is available.
- Logged a warning and continued startup if the app-server config write
fails.

## Verification

Manually smoke-tested the real `codex-tui` binary with a temporary
`CODEX_HOME`, pseudo-terminal input, and a fake LM Studio HTTP server:

- Interactive picker selection persisted `oss_provider = "lmstudio"`.
- Non-picker `--local-provider lmstudio` startup did not persist
`oss_provider`.
2026-05-25 09:53:39 -07:00
Eric Traut
5fb5e47767 Respect hook trust bypass during TUI startup (#24317)
Fixes #24093.

## Why

`--dangerously-bypass-hook-trust` is a supported CLI flag intended for
headless or automated runs where enabled hooks should be allowed to run
without requiring persisted trust. In the TUI, startup hook review still
opened whenever hooks looked untrusted, so a launch using the bypass
could block on the interactive "Hooks need review" prompt.

The tricky case is persistent app-server resume: a resume may attach to
an already-running thread, where resume config overrides are ignored. In
that path, hiding the startup review would be wrong because the existing
hook engine may still filter untrusted hooks.

## What Changed

- Startup hook review now skips the prompt only when hook trust bypass
is actually safe for that launch.
- The TUI forwards `bypass_hook_trust` through the app-server request
config for fresh thread start/resume/fork paths, and the app-server
applies it as a runtime-only `ConfigOverrides` value rather than
treating it like a `config.toml` setting.
- Persistent app-server resumes keep the startup review prompt so users
still have a chance to trust hooks when the running thread cannot
receive the bypass override.

## Verification

- Added focused coverage for startup hook review with and without
`bypass_hook_trust`.
- Extended existing TUI/app-server config override tests to cover
forwarding and applying `bypass_hook_trust`.
2026-05-25 09:44:21 -07:00
Eric Traut
913270a689 Show remote connection details in /status (#24420)
## Summary

Fixes #24411.

`/status` currently has no way to show when the TUI is talking to Codex
through a remote transport. That makes embedded local sessions, local
daemon sessions, and true remote sessions look the same, and it hides
the remote server version when debugging connection-specific behavior.

This PR adds a single `Remote` row for non-embedded connections only.
The row shows the sanitized connection address and a dimmed version
parenthetical, preserving the existing status output for embedded local
sessions.

<img width="791" height="144" alt="image"
src="https://github.com/user-attachments/assets/529d7940-1c45-4586-8b06-f20a1f04b771"
/>


## Verification

- Manually validated when connecting remotely (either implicitly to
local daemon or explicitly)
2026-05-25 09:42:42 -07:00
Eric Traut
caebff3d66 tui: label compact rate-limit percentages (#24314)
## Summary

The compact TUI status line already renders rate-limit percentages as
remaining capacity, but the text did not say so. That made high-usage
red indicators ambiguous because values like `weekly 6%` could be read
as either used or remaining.

This PR labels the compact rate-limit values explicitly as `left` across
the status line, terminal title, and setup previews.

Addresses #24274
2026-05-25 09:41:32 -07:00
Eric Traut
6491d1207f Report app-server version in codex doctor (#24311)
## Why

We are seeing cases where users have an old background app-server still
running. `codex doctor` already reports background server state, but
without the running app-server version it is harder to diagnose
behaviors that depend on the daemon build.

## What changed

- Reused the app-server daemon's passive initialize probe through a
narrow `probe_app_server_version` helper.
- Updated the `codex doctor` Background Server section to report
`app-server version: <version>` when the socket is reachable.
- Preserved the not-running OK behavior and report `app-server version:
unavailable (<short error>)` when a socket exists but the passive probe
fails.
2026-05-25 09:41:12 -07:00
Felipe Coury
9f42c89c01 feat(doctor): add environment diagnostics (#24261)
## Why

Issue #23031 was hard to diagnose from existing `codex doctor` output
because support could not see the OS language, resolved Git install, Git
repo metadata, Windows console mode/code page, or terminal-title inputs
that affect the TUI startup path. This adds those read-only signals to
`codex doctor` so Windows, Linux, and macOS reports carry the context
needed to investigate similar terminal rendering regressions.

Refs #23031

## What Changed

- Add a `system.environment` check for OS type/version, OS language, and
locale env vars.
- Add a `git.environment` check for the selected Git executable, PATH
Git candidates, version, exec path/build options, repository root,
branch, `.git` entry, and `core.fsmonitor`.
- Add Windows console code page and VT-processing mode details to
terminal diagnostics.
- Add a `terminal.title` check for configured/default title items and
resolved project-title source/value.
- Surface startup warning counts in config diagnostics and teach human
output to render the new categories.

## How to Test

1. On Windows, check out this branch and run `cargo run -p codex-cli --
doctor --summary`.
2. Confirm the Environment section includes `system`, `git`, `terminal`,
and `title` rows.
3. Run `cargo run -p codex-cli -- doctor --json`.
4. Confirm the JSON contains `system.environment`, `git.environment`,
and `terminal.title`; on Windows, confirm `terminal.env` details include
console code pages and `VT processing` for stdout/stderr.
5. From a non-git directory, run the same `doctor --json` command and
confirm the Git check reports `repo detected: false` rather than
warning.

Targeted tests:

- `cargo test -p codex-cli doctor`
- `cargo test -p codex-cli`
2026-05-24 15:34:35 +00:00
xl-openai
7d47056ea4 fix: plugin bundle archive handling for upload and install (#23983)
Move plugin tar.gz packing and unpacking into a shared core-plugins
archive helper so uploaded bundles are decoded through the same tar
handling used for installs. This removes duplicate archive logic,
supports GNU long-name entries on extraction, and keeps size, traversal,
link, and entry-type checks in one place.
2026-05-22 19:31:39 -07:00
Channing Conger
f94157a4b2 code-mode: merge stored values by key (#24159)
## Summary

Change code-mode stored value updates to merge writes by key instead of
replacing the session's complete stored-value map after each cell
completes.

Previously, each cell received a snapshot of stored values and returned
the complete resulting map. When multiple cells ran concurrently, a
later completion could overwrite values written by another cell because
it committed an older snapshot.

This change moves stored-value ownership into `CodeModeService`:

- Each runtime starts from the service's current stored values.
- Runtime completion reports only keys written by that cell.
- The service merges those writes into the current stored-value map on
successful completion.
- Core no longer replaces its stored-value state from a cell result.

As a result, concurrently executing cells can update different stored
keys without clobbering one another.

The move into CodeModeService is motivated by a desire to have this
lifetime tied to a new lifetime object on that side in a subsequent PR.
2026-05-22 19:09:02 -07:00
Abhinav
5c20513a1b Default function tools into tool hooks (#23757)
# Why

`PreToolUse`, `PostToolUse`, and `updatedInput` coverage for local
function tools currently depends on each handler remembering to wire up
the hook contract itself. That makes coverage easy to miss as new
function tools are added, even though most of them share the same basic
shape: a model-facing function call with JSON arguments.

# What

This makes `CoreToolRuntime` provide the default hook contract for
ordinary local function tools:

- build generic `PreToolUse` and `PostToolUse` payloads from the
function tool name and arguments
- apply `updatedInput` rewrites back into function-tool arguments
through the same default path
- let tool outputs override the post-hook input or response when they
have a more stable hook-facing contract

The exceptions stay explicit:

- hosted tools remain outside the generic local function path
- code-mode `wait` and `write_stdin` opt out for now
- `PostToolUse` feedback replaces only the model-visible response, so
code mode keeps its typed tool result

With the generic path in place, the MCP and extension-tool adapters no
longer need their own duplicate pre/post hook plumbing. The new coverage
exercises the registry default plus end-to-end local function behavior
for pre-hook blocking, `updatedInput` rewriting, and post-hook context.
2026-05-23 00:56:58 +00:00
Michael Bolin
c7bcb90f9b package: include zsh fork in Codex package (#23756)
## Why

The package layout gives Codex a stable place for runtime helpers that
should travel with the entrypoint. `shell_zsh_fork` still required users
to configure `zsh_path` manually, even though we already publish
prebuilt zsh fork artifacts.

This PR builds on #24129 and uses the shared DotSlash artifact fetcher
to include the zsh fork in Codex packages when a matching target
artifact exists. Packaged Codex builds can then discover the bundled
fork automatically; the user/profile `zsh_path` override is removed so
the feature uses the package-managed artifact instead of a legacy path
knob.

## What Changed

- Added `scripts/codex_package/codex-zsh`, a checked-in DotSlash
manifest for the current macOS arm64 and Linux zsh fork artifacts.
- Taught `scripts/build_codex_package.py` to fetch the matching zsh fork
artifact and install it at `codex-resources/zsh/bin/zsh` when available
for the selected target.
- Added package layout validation for the optional bundled zsh resource.
- Added `InstallContext::bundled_zsh_path()` and
`InstallContext::bundled_zsh_bin_dir()` for package-layout resource
discovery.
- Threaded the packaged zsh path through config loading as the runtime
`zsh_path` for packaged installs, and removed the config/profile/CLI
override path.
- Kept the packaged default zsh override typed as `AbsolutePathBuf`
until the existing runtime `Config::zsh_path` boundary.
- Updated app-server zsh-fork integration tests to spawn
`codex-app-server` from a temporary package layout with
`codex-resources/zsh/bin/zsh`, matching the new packaged discovery path
instead of setting `zsh_path` in config.
- Switched package executable copying from metadata-preserving `copy2()`
to `copyfile()` plus explicit executable bits, which avoids macOS
file-flag failures when local smoke tests use system binaries as inputs.

## Testing

To verify that the `zsh` executable from the Codex package is picked up
correctly, first I ran:

```shell
./scripts/build_codex_package.py
```

which created:

```
/private/var/folders/vw/x2knqmks50sfhfpy27nftl900000gp/T/codex-package-pms94kdp/
```

so then I ran:

```
/private/var/folders/vw/x2knqmks50sfhfpy27nftl900000gp/T/codex-package-pms94kdp/bin/codex exec --enable shell_zsh_fork 'run `echo $0`'
```

which reported the following, as expected:

```
/private/var/folders/vw/x2knqmks50sfhfpy27nftl900000gp/T/codex-package-pms94kdp/codex-resources/zsh/bin/zsh
```



---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/23756).
* #23768
* __->__ #23756
2026-05-22 17:54:07 -07:00
Anton Panasenko
03e6c5f600 fix(remote-control): cap reconnect backoff (#24164)
## Why

Remote-control websocket reconnects currently use the shared exponential
backoff helper without a local ceiling, so a long failure streak can
stretch retries out indefinitely and leave the runtime behavior hard to
inspect from logs.

## What Changed

Cap the remote-control reconnect delay at 30 seconds, then reset the
reconnect attempt counter once that capped delay is emitted so the next
failure starts from the initial jittered delay again.

The reconnect failure log now records the attempt number, chosen delay,
and whether the cap triggered a reset, with a separate info log when the
backoff counter is reset after the cap.

## Verification

`just test -p codex-app-server-transport`

Related issue: N/A
2026-05-23 00:38:22 +00:00
dhruvgupta-oai
4bcabbfbec Display workspace usage limit error copy from response header (#24114)
## Why

`openai/openai#947613` adds `X-Codex-Rate-Limit-Reached-Type` for Codex
workspace credit-depletion and spend-cap responses. The CLI currently
reads the adjacent promo header but otherwise renders generic
usage-limit copy, so those responses do not explain the
workspace-specific action the user needs to take.

Backend dependency: https://github.com/openai/openai/pull/947613

## What Changed

- Parse `X-Codex-Rate-Limit-Reached-Type` in the usage-limit error
handling path alongside `x-codex-promo-message`.
- Keep the header value parsing with the shared `RateLimitReachedType`
enum.
- Carry the parsed type on `UsageLimitReachedError` and render
client-owned copy for the four workspace owner/member credit and
spend-cap values.
- Preserve existing promo and plan-based text for absent, generic, or
unknown header values.
- Keep the existing TUI workspace-owner nudge state path unchanged; the
response header only selects the displayed error string.
- Add focused display coverage for all specific type values and the
generic fallback case.

## Test Plan

- Added `usage_limit_reached_error_formats_rate_limit_reached_types`
coverage.
- Not run manually, per request; CI runs validation on the pushed
commit.
2026-05-22 23:58:49 +00:00
pakrym-oai
6ad3a83509 [codex] Remove external client session reset plumbing (#24157)
## Why

The turn loop no longer needs to decide when a `ModelClientSession`
should reset its websocket state after compaction. That reset behavior
belongs inside the model client, where the websocket cache and retry
state are owned. The repo guidance now calls this out explicitly so
future changes let the incremental request logic decide whether the
previous request can be reused.

## What Changed

- Removed the `reset_client_session` return value from pre-sampling and
auto-compact helpers in `core/src/session/turn.rs`.
- Changed compaction helpers to return `CodexResult<()>` so callers only
handle success or failure.
- Made `ModelClientSession::reset_websocket_session` private to
`core/src/client.rs`, leaving it callable only from model-client
internals.
- Added `AGENTS.md` guidance not to call `reset_client_session`
unnecessarily.

## Validation

- `just test -p codex-core session::turn`
2026-05-22 16:46:25 -07:00
Celia Chen
10ac2781eb chore: add JSON schema policy fixture coverage (#24152)
## Why

Before changing the Codex Bridge JSON schema policy, add integration
coverage around real connector-like MCP tool schemas. The existing unit
tests cover individual sanitizer behaviors, but they do not make it easy
to see whether full fixture schemas keep model-visible guidance, prune
only unreachable definitions, drop unsupported JSON Schema fields, and
stay within the Responses API schema budget.

## What Changed

- Added `tools/tests/json_schema_policy_fixtures.rs`, which converts MCP
tool fixtures through `mcp_tool_to_responses_api_tool` and validates the
resulting Responses tool parameters.
- Added connector-style fixtures for Slack, Google Calendar, Google
Drive, Notion, and Microsoft Outlook Email under
`tools/tests/fixtures/json_schema_policy/`.
- Added fixture assertions for preserved guidance, pruned definitions,
expected field drops after `JsonSchema` conversion, marker count
baselines, and dangling local `$ref` prevention.
- Added a real oversized golden Notion `create_page` input schema
fixture to exercise the compaction path that strips descriptions, drops
root `$defs`, rewrites local refs, and fits the compacted schema under
the budget.
2026-05-22 16:31:33 -07:00
Adam Perry @ OpenAI
7924743c38 [codex] Add image re-encoding benchmarks (#23935)
## Summary
- add Divan benchmarks for prompt image re-encoding paths
- wire the image benchmark smoke test into Rust CI workflows

## Why
Image prompt handling includes re-encoding work that benefits from
repeatable benchmark coverage so changes can be measured in CI and
locally.

This already helped identify a potential regression from changing compiler flags.

## Impact
Developers can run and compare the new image re-encoding benchmarks, and
CI exercises the benchmark target via the Rust benchmark smoke test.
2026-05-22 22:38:40 +00:00
pakrym-oai
fbd4efa9ed [codex] Use TurnInput for session task input (#24151)
## Why

The idea here is to erase the difference between initial and followup
inputs to a turn. Followup inputs are already represented as TurnInput.

Eventual goal is not to have explicit on task input at all and pull
everything from input Q.

## What Changed

- Changes `SessionTask::run` and the erased `AnySessionTask::run` path
to accept `Vec<TurnInput>`.
- Wraps user-submitted spawn input as `TurnInput::UserInput` at the
session task start boundary.
- Updates `run_turn` to record initial `TurnInput` using the same hook
and recording path used for pending input.
- Keeps review-specific conversion local to `ReviewTask`, where the
sub-Codex one-shot API still expects `Vec<UserInput>`.
- Moves the synthetic compact prompt into `CompactTask` and starts
compact tasks with empty task input.

## Validation

- `cargo check -p codex-core`
- `just test -p codex-core -E
'test(task_finish_emits_turn_item_lifecycle_for_leftover_pending_user_input)
| test(queued_response_items_for_next_turn_move_into_next_active_turn) |
test(steered_input_reopens_mailbox_delivery_for_current_turn)'`
2026-05-22 15:21:08 -07:00
rhan-oai
6419402a7c [codex-analytics] split compaction v2 analytics implementation (#24146)
## What changed

- Add a distinct `responses_compaction_v2` value for
`CodexCompactionEvent.implementation`.
- Emit that value from the remote compaction v2 path.
- Keep local compaction as `responses` and legacy `/responses/compact`
as `responses_compact`.

## Why

Remote compaction v2 and local prompt-based compaction were both
reported as `responses`, which made the analytics table collapse two
different compaction mechanisms into one implementation bucket.

## Validation

- `just fmt`
- `just test -p codex-analytics`

`just test -p codex-core` was started locally, but this PR is
intentionally being pushed for CI to finish the remaining validation.
2026-05-22 21:34:22 +00:00
Won Park
423488480f Add typed Images client to codex-api (#23989)
## Why

Standalone image generation needs a typed `codex-api` client surface for
the Codex image proxy routes before the harness and model-facing tool
layers are wired in.

## What changed

- Added `ImagesClient` support for JSON `images/generations` and
`images/edits` requests.
- Added typed request and response shapes for generation, JSON edit
image URLs, image metadata, and base64 image outputs.
- Kept generation model slugs open-ended while requiring the generation
model field that the downstream endpoint expects.
- Exported the new client and image types from `codex-api`.
- Added coverage for generation and edit wire shapes, extra response
metadata that the client ignores, and malformed image responses missing
`data`.

## Validation

- `cargo test -p codex-api`
- `just fix -p codex-api`
- `just fmt`
- `git diff --check main`
2026-05-22 14:10:55 -07:00
Matthew Zeng
6963145cb6 Support OAuth options in codex mcp add (#24120)
## Summary
- add `--oauth-client-id` and `--oauth-resource` options for streamable
HTTP `codex mcp add` registrations
- persist those options in MCP server config and use them during the
immediate OAuth login flow
- cover add-time serialization of both OAuth options in the CLI
integration tests

## Testing
- `just fmt`
- `cargo test -p codex-cli`
- `just fix -p codex-cli`
2026-05-22 13:21:01 -07:00
mchen-oai
3c83e57bfa Add trace_id to TurnStartedEvent (#23980)
## Why
[Recent PR](https://github.com/openai/codex/pull/22709) removed
`trace_id` from `TurnContextItem`.

## What changed
- Add to `TurnStartedEvent` so rollout consumers can correlate turns
with telemetry traces.
- Note that the branch name is out of date because I originally re-added
to `TurnContextItem`, but we decided to move it to `TurnStartedEvent`.

## Verification
- `cargo test -p codex-protocol`
- `cargo test -p codex-core --lib
regular_turn_emits_turn_started_without_waiting_for_startup_prewarm`
- `cargo test -p codex-core --test all
emits_warning_when_resumed_model_differs`
- `cargo test -p codex-rollout`
- `cargo test -p codex-state`
2026-05-22 13:10:56 -07:00
Michael Bolin
36a71a88bf cli: support --profile for codex sandbox (#24110)
## Why

`codex sandbox` now always runs the host sandbox backend, so it should
accept the same profile selection mechanism as the rest of the runtime
CLI surface. Without `--profile`, sandbox debugging can exercise only
the default config stack unless users manually translate profile config
into ad hoc `-c` overrides.

Supporting `--profile` lets sandbox invocations load
`$CODEX_HOME/<name>.config.toml`, including permission profile
configuration, before resolving the sandbox policy for the command being
run.

## What Changed

- Added `--profile NAME` / `-p NAME` to the host-specific `codex
sandbox` argument structs as `config_profile`.
- Allowed root-level `codex --profile NAME sandbox ...` and made a
sandbox-local `codex sandbox --profile NAME ...` override the root
selection.
- Threaded `LoaderOverrides` through sandbox config loading so selected
config profile files participate in permission resolution before the
legacy read-only fallback.
- Documented the new sandbox flag in `codex-rs/README.md`.

## Verification

- Added parser coverage for `codex sandbox --profile`.
- Added sandbox config-loader coverage that verifies selected config
profile loader overrides select the profile config rather than falling
back to read-only.
- Ran `cargo test -p codex-cli`.
2026-05-22 13:00:53 -07:00
Felipe Coury
acd851e89f fix(tui): restore Windows VT before TUI renders (#24082)
## Why

Older Git for Windows versions can leave the Windows console output mode
without virtual terminal processing after Codex runs git metadata
commands in a repository. When the TUI later emits ANSI control
sequences for redraws, restore, or image rendering, Windows Terminal can
show raw escape bytes or leave the prompt/status area corrupted.

This is a targeted mitigation for the repo-conditioned Windows rendering
corruption reported in #23888 and related reports #23512 and #23628.
Updating Git avoids the trigger for affected users, but Codex should
also reassert the terminal mode before it writes TUI control sequences.

| Before | After |
|---|---|
| <img width="2100" height="1359" alt="CleanShot 2026-05-22 at 11 23 21"
src="https://github.com/user-attachments/assets/3218c379-5f97-4c71-ab25-805c9d20578a"
/> | <img width="2100" height="1359" alt="CleanShot 2026-05-22 at 11 23
58"
src="https://github.com/user-attachments/assets/55ac72bb-37d0-400e-99bc-12dd5ea4092d"
/> |


## What Changed

- Re-enable Windows virtual terminal processing for stdout and stderr
before TUI mode setup, restore, redraw, resume, and pet image render
paths.
- Treat invalid, null, or non-console handles as no-ops so redirected or
non-console output is unaffected.
- Keep the helper as a no-op on non-Windows platforms.

## How to Test

1. On Windows Terminal with a Git 2.28.0 for Windows install, start
Codex inside a valid Git repository.
2. Start a new Codex CLI session.
3. Confirm the prompt, working indicator, and bottom status line remain
readable instead of showing raw ANSI escape sequences.
4. Repeat outside a Git repository to confirm the ordinary non-repo
startup path is unchanged.

Targeted tests:
- Not run locally; the behavior depends on Windows console mode APIs and
the current worktree is on macOS.
2026-05-22 16:20:09 -03:00
Michael Bolin
75b7e06621 docs: update README.md to mention curl-based installer (#24106)
Now that users can install via `curl` (or `irm`), we should tell them
about it so they no longer need to use `npm`!

Note that on one Windows machine I tested on, when I ran:

```
irm https://chatgpt.com/codex/install.ps1 | iex
```

I got this error:

```
iex : The property 'OSArchitecture' cannot be found on this object. Verify that the property exists.
At line:1 char:45
+ irm https://chatgpt.com/codex/install.ps1 | iex
+                                             ~~~
    + CategoryInfo          : NotSpecified: (:) [Invoke-Expression], PropertyNotFoundException
    + FullyQualifiedErrorId : PropertyNotFoundStrict,Microsoft.PowerShell.Commands.InvokeExpressionCommand
```

so we'll recommend the following that works from both `cmd.exe` and
PowerShell:

```
powershell -ExecutionPolicy ByPass -c "irm https://chatgpt.com/codex/install.ps1 | iex"
```

This PR makes a slight update to `codex-rs/tui/src/update_action.rs` to
match.
2026-05-22 18:39:08 +00:00
iceweasel-oai
5b1b6a20dd [codex] Use rolling files for Windows sandbox logs (#24117)
## Why

Windows sandbox diagnostics currently append to a single `sandbox.log`
under `CODEX_HOME/.sandbox`. That file never rolls over, which makes it
hard to safely include sandbox diagnostics in future feedback reports
without risking unbounded growth.

## What changed

- Replaced direct append-open sandbox logging with
`tracing_appender::rolling::RollingFileAppender`.
- Configured sandbox logs to rotate daily using names like
`sandbox.YYYY-MM-DD.log`.
- Added a conservative `MAX_LOG_FILES` cap of 90 retained matching log
files.
- Routed the Windows sandbox setup helper through the same rolling
writer.
- Added helpers for resolving the current daily sandbox log path so
future feedback upload work can use the same filename logic.
- Updated tests and test diagnostics to read the dated daily log file.

This intentionally does not include sandbox logs in `/feedback` yet;
scrubbing and attachment behavior can happen in a follow-up.

## Testing

- `cargo fmt -p codex-windows-sandbox`
- `cargo check -p codex-windows-sandbox`
- `cargo test -p codex-windows-sandbox`
- `cargo test -p codex-windows-sandbox logging::tests`
- `cargo clippy -p codex-windows-sandbox --all-targets -- -D warnings`
2026-05-22 11:37:01 -07:00
adams-oai
865ca936db Add new enterprise requirement gate (#23736)
Add new enterprise requirement gate.

Validation:
- `cargo test -p codex-config --lib`
- `cargo test -p codex-app-server-protocol --lib`
- `cargo test -p codex-tui --lib debug_config`
- `cargo test -p codex-app-server --lib` *(fails: stack overflow in
`in_process::tests::in_process_start_initializes_and_handles_typed_v2_request`;
reproduces when run alone)*
2026-05-22 11:33:44 -07:00