### Summary
* Added instruction on using `request_user_input`
* Added the output to be json with `plan` key and the actual plan as the
value.
* Remove `PLAN.md` write because that gets into sandbox issue. We can
add it back later.
Users of Azure endpoints are reporting that when they use `/review`,
they sometimes see an error "Invalid 'input[3].id". I suspect this is
specific to the Azure implementation of the `responses` API. The Azure
team generally copies the OpenAI code for this endpoint, but they do
have minor differences and sometimes lag in rolling out bug fixes or
updates.
The error appears to be triggered because the `/review` implementation
is using a user ID with a colon in it.
Addresses #9360
Bumps [ctor](https://github.com/mmastrac/rust-ctor) from 0.5.0 to 0.6.3.
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/mmastrac/rust-ctor/commits">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [arc-swap](https://github.com/vorner/arc-swap) from 1.7.1 to
1.8.0.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/vorner/arc-swap/blob/master/CHANGELOG.md">arc-swap's
changelog</a>.</em></p>
<blockquote>
<h1>1.8.0</h1>
<ul>
<li>Support for Pin (<a
href="https://redirect.github.com/vorner/arc-swap/issues/185">#185</a>,
<a
href="https://redirect.github.com/vorner/arc-swap/issues/183">#183</a>).</li>
<li>Fix (hopefully) crash on ARM (<a
href="https://redirect.github.com/vorner/arc-swap/issues/164">#164</a>).</li>
<li>Fix Miri check (<a
href="https://redirect.github.com/vorner/arc-swap/issues/186">#186</a>,
<a
href="https://redirect.github.com/vorner/arc-swap/issues/156">#156</a>).</li>
<li>Fix support for Rust 1.31.0.</li>
<li>Some minor clippy lints.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="2540d266a8"><code>2540d26</code></a>
Version bump to 1.8.0</li>
<li><a
href="9981e3af23"><code>9981e3a</code></a>
Keep "old" Cargo.lock around</li>
<li><a
href="57a8abbfc4"><code>57a8abb</code></a>
Fix documentation links</li>
<li><a
href="346c5b642b"><code>346c5b6</code></a>
Fix some clippy warnings</li>
<li><a
href="0bd349a56b"><code>0bd349a</code></a>
Fix support for Rust 1.31.0</li>
<li><a
href="57aa5224c1"><code>57aa522</code></a>
Merge pull request <a
href="https://redirect.github.com/vorner/arc-swap/issues/185">#185</a>
from SpriteOvO/pin</li>
<li><a
href="4c0c4ab321"><code>4c0c4ab</code></a>
Implement <code>RefCnt</code> for <code>Pin\<Arc></code> and
<code>Pin\<Rc></code></li>
<li><a
href="e596275acf"><code>e596275</code></a>
Avoid warnings about hidden lifetimes</li>
<li><a
href="d849a2d17e"><code>d849a2d</code></a>
Use SeqCst in debt-lists</li>
<li><a
href="1f9b221da9"><code>1f9b221</code></a>
Merge pull request <a
href="https://redirect.github.com/vorner/arc-swap/issues/186">#186</a>
from nbdd0121/prov</li>
<li>Additional commits viewable in <a
href="https://github.com/vorner/arc-swap/compare/v1.7.1...v1.8.0">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
We already support reading from `config.toml` through a symlink, but the
code was not properly handling updates to a symlinked config file. This
PR generalizes safe symlink-chain resolution and atomic writes into
path_utils, updating all config write paths to use the shared logic
(including set_default_oss_provider, which previously didn't use the
common path), and adds tests for symlink chains and cycles.
This resolves#6646.
Notes:
* Symlink cycles or resolution failures replace the top-level symlink
with a real file.
* Shared config write path now handles symlinks consistently across
edits, defaults, and empty-user-layer creation.
This PR was inspired by https://github.com/openai/codex/pull/9437, which
was contributed by @ryoppippi
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
Include a link to a bug report or enhancement request.
Fixes#9058
## Summary
When the transcript backtrack preview is armed (press `Esc`), allow
navigating to newer user messages with the `→` arrow, in addition to
navigating backwards with `Esc`/`←`, before confirming with `Enter`.
## Changes
- Backtrack preview navigation: `Esc`/`←` steps to older user messages,
`→` steps to newer ones, `Enter` edits the selected message (clamped at
bounds, no wrap-around).
- Transcript overlay footer hints updated to advertise `esc/←`, `→`, and
`enter` when a message is highlighted.
## Related
- WSL shortcut-overlay snapshot determinism: #9359
## Testing
- `just fmt`
- `just fix -p codex-tui`
- `just fix -p codex-tui2`
- `cargo test -p codex-tui app_backtrack::`
- `cargo test -p codex-tui pager_overlay::`
- `cargo test -p codex-tui2 app_backtrack::`
- `cargo test -p codex-tui2 pager_overlay::`
---------
Co-authored-by: Josh McKinney <joshka@openai.com>
### Description
- Remove the now-unused `instructions` field from the session metadata
to simplify SessionMeta and stop propagating transient instruction text
through the rollout recorder API. This was only saving
user_instructions, and was never being read.
- Stop passing user instructions into the rollout writer at session
creation so the rollout header only contains canonical session metadata.
### Testing
- Ran `just fmt` which completed successfully.
- Ran `just fix -p codex-protocol`, `just fix -p codex-core`, `just fix
-p codex-app-server`, `just fix -p codex-tui`, and `just fix -p
codex-tui2` which completed (Clippy fixes applied) as part of
verification.
- Ran `cargo test -p codex-protocol` which passed (28 tests).
- Ran `cargo test -p codex-core` which showed failures in a small set of
tests (not caused by the protocol type change directly):
`default_client::tests::test_create_client_sets_default_headers`,
several `models_manager::manager::tests::refresh_available_models_*`,
and `shell_snapshot::tests::linux_sh_snapshot_includes_sections` (these
tests failed in this CI run).
- Ran `cargo test -p codex-app-server` which reported several failing
integration tests (including
`suite::codex_message_processor_flow::test_codex_jsonrpc_conversation_flow`,
`suite::output_schema::send_user_turn_*`, and
`suite::user_agent::get_user_agent_returns_current_codex_user_agent`).
- `cargo test -p codex-tui` and `cargo test -p codex-tui2` were
attempted but aborted due to disk space exhaustion (`No space left on
device`).
------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_696bd8ce632483228d298cf07c7eb41c)
## Summary
We have a variety of things we refer to as instructions in the code
base: our current canonical terms are:
- base instructions (raw string)
- developer instructions (has a type in protocol)
- user instructions
We also have `instructions` floating around in various places. We should
standardize on the above, and start using types to prevent them from
ending up in the wrong place. There will be additional PRs, but I'm
going to keep these small so we can easily follow them!
## Testing
- [x] Tests pass, this is purely a file move
Fix unified_exec_timeouts to use a unique variable value rather than
"codex" which was causing false positives when running tests locally
(presumably from my bash prompts). Discovered while running tests to
validate another change.
Fixes https://github.com/openai/codex/issues/9413
Test Plan:
Ran test locally on my fedora 43 x86_64 machine with:
```
cd codex/cargo-rs
cargo nextest run --all-features --no-fail-fast unified_exec::tests::unified_exec_timeouts
```
Before, unified_exec_timeouts fails:
```
Finished `test` profile [unoptimized + debuginfo] target(s) in 0.38s
────────────
Nextest run ID fa2b4949-a66c-408c-8002-32c52c70ec4f with nextest profile: default
Starting 1 test across 107 binaries (3211 tests skipped)
FAIL [ 5.667s] codex-core unified_exec::tests::unified_exec_timeouts
stdout ───
running 1 test
test unified_exec::tests::unified_exec_timeouts ... FAILED
failures:
failures:
unified_exec::tests::unified_exec_timeouts
test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 774 filtered out; finished in 5.66s
stderr ───
thread 'unified_exec::tests::unified_exec_timeouts' (459601) panicked at core/src/unified_exec/mod.rs:381:9:
timeout too short should yield incomplete output
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
────────────
Summary [ 5.677s] 1 test run: 0 passed, 1 failed, 3211 skipped
FAIL [ 5.667s] codex-core unified_exec::tests::unified_exec_timeouts
error: test run failed
```
After, works:
```
Finished `test` profile [unoptimized + debuginfo] target(s) in 0.34s
────────────
Nextest run ID f49e9004-e30b-4049-b0ff-283b543a1cd7 with nextest profile: default
Starting 1 test across 107 binaries (3211 tests skipped)
SLOW [> 15.000s] codex-core unified_exec::tests::unified_exec_timeouts
PASS [ 17.666s] codex-core unified_exec::tests::unified_exec_timeouts
────────────
Summary [ 17.676s] 1 test run: 1 passed (1 slow), 3211 skipped
```
Document the backtrack/rollback state machine and invariants between the
transcript overlay, in-flight “live tail”, and core thread state (tui + tui2).
Also adjust behavior for correctness:
- Track a single pending rollback and block additional rollbacks until core responds.
- Defer trimming transcript cells until ThreadRolledBack for the active session.
- Clear the guard on ThreadRollbackFailed so the user can retry.
- After a confirmed trim, schedule a one-shot scrollback refresh on the next draw.
- Clear stale pending rollback state when switching sessions.
---------
Co-authored-by: Josh McKinney <joshka@openai.com>
**Goal**: Prevent response.failed events with `invalid_prompt` from
being treated as retryable errors so the UI shows the actual error
message instead of continually retrying.
**Before**: Codex would continue to retry despite the prompt being
marked as disallowed
**After**: Codex will stop retrying once prompt is marked disallowed
- Merge `model` and `reasoning_effort` under collaboration modes.
- Add additional instructions for custom collaboration mode
- Default to Custom to not change behavior
Summary:
- Add forked_from to SessionMeta/SessionConfiguredEvent and persist it
for forked sessions.
- Surface forked_from in /status for tui + tui2 and add snapshots.
Add support for returning threads by either `created_at` OR `updated_at`
descending. Previously core always returned threads ordered by
`created_at`.
This PR:
- updates core to be able to list threads by `updated_at` OR
`created_at` descending based on what the caller wants
- also update `thread/list` in app-server to expose this (default to
`created_at` if not specified)
All existing codepaths (app-server, TUI) still default to `created_at`,
so no behavior change is expected with this PR.
**Implementation**
To sort by `updated_at` is a bit nontrivial (whereas `created_at` is
easy due to the way we structure the folders and filenames on disk,
which are all based on `created_at`).
The most naive way to do this without introducing a cache file or sqlite
DB (which we have to implement/maintain) is to scan files in reverse
`created_at` order on disk, and look at the file's mtime (last modified
timestamp according to the filesystem) until we reach `MAX_SCAN_FILES`
(currently set to 10,000). Then, we can return the most recent N
threads.
Based on some quick and dirty benchmarking on my machine with ~1000
rollout files, calling `thread/list` with limit 50, the `updated_at`
path is slower as expected due to all the I/O:
- updated-at: average 103.10 ms
- created-at: average 41.10 ms
Those absolute numbers aren't a big deal IMO, but we can certainly
optimize this in a followup if needed by introducing more state stored
on disk.
**Caveat**
There's also a limitation in that any files older than `MAX_SCAN_FILES`
will be excluded, which means if a user continues a REALLY old thread,
it's possible to not be included. In practice that should not be too big
of an issue.
If a user makes...
- 1000 rollouts/day → threads older than 10 days won't show up
- 100 rollouts/day → ~100 days
If this becomes a problem for some reason, even more motivation to
implement an updated_at cache.
Implemented /fork to fork the current session directly (no picker),
handling it via a new ForkCurrentSession app event in both tui and tui2.
Updated slash command descriptions/tooltips and adjusted the fork tests
accordingly. Removed the unused in-session fork picker event.
- capture the header from SSE/WS handshakes, store it per
ModelClientSession using `Oncelock`, echo it on turn-scoped requests,
and add SSE+WS integration tests for within-turn persistence +
cross-turn reset.
- keep `x-codex-turn-state` sticky within a user turn to maintain
routing continuity for retries/tool follow-ups.
PR #9245 made `codex resume --last` honor cwd, but I forgot to make the
same change for `codex exec resume --last`. This PR fixes the
inconsistency.
This addresses #8700
A thread can now be spawned by another thread. In order to process the
approval requests of such sub-threads, we need to detect those event and
show them in the TUI.
This is a temporary solution while the UX is being figured out. This PR
should be reverted once done