## Summary
musl 1.2.5 includes [several fixes to DNS over
TCP](https://www.openwall.com/lists/musl/2024/03/01/2), which appears to
be the root cause of #6116.
This approach is a bit janky, but according to codex:
> On the Ubuntu 24.04 runners we use, apt-cache policy musl-tools shows
only the distro build (1.2.4-2ubuntu2)"
We should build with this version and confirm.
## Testing
- [ ] TODO: test and see if this fixes Azure issues
V2 for `account/updated` and `account/logout` for app server. correspond
to old `authStatusChange` and `LogoutChatGpt` respectively. Followup PRs
will make other v2 endpoints call `account/updated` instead of
`authStatusChange` too.
## Problem
The `is_api_message` function in `conversation_history.rs` had a
misalignment between its documentation and implementation:
- **Comment stated**: "Anything that is not a system message or
'reasoning' message is considered an API message"
- **Code behavior**: Was returning `true` for `ResponseItem::Reasoning`,
meaning reasoning messages were incorrectly treated as API messages
This inconsistency could lead to reasoning messages being persisted in
conversation history when they should be filtered out.
## Root Cause
Investigation revealed that reasoning messages are explicitly excluded
throughout the codebase:
1. **Chat completions API** (lines 267-272 in `chat_completions.rs`)
omits reasoning from conversation history:
```rust
ResponseItem::Reasoning { .. } | ResponseItem::Other => {
// Omit these items from the conversation history.
continue;
}
```
2. **Existing tests** like `drops_reasoning_when_last_role_is_user` and
`ignores_reasoning_before_last_user` validate that reasoning should be
excluded from API payloads
## Solution
Fixed the `is_api_message` function to align with its documentation and
the rest of the codebase:
```rust
// Before: Reasoning was incorrectly returning true
ResponseItem::Reasoning { .. } | ResponseItem::WebSearchCall { .. } => true,
// After: Reasoning correctly returns false
ResponseItem::WebSearchCall { .. } => true,
ResponseItem::Reasoning { .. } | ResponseItem::Other => false,
```
## Testing
- Enhanced existing test to verify reasoning messages are properly
filtered out
- All 264 core tests pass, including 8 chat completions tests that
validate reasoning behavior
- No regressions introduced
This ensures reasoning messages are consistently excluded from API
message processing across the entire codebase.
I have read the CLA Document and I hereby sign the CLA
Closes#4452
This fixes a usability issue where users with symlinked folders in their
working directory couldn't search those files using the `@` file search
feature.
## Rationale
The "bug" was in the file search implementation in
`codex-rs/file-search/src/lib.rs`. The `WalkBuilder` was using default
settings which don't follow symlinks, causing two related issues:
1. Partial search results: The `@` search would find symlinked
directories but couldn't find files inside them
2. Inconsistent behavior: Users expect symlinked folders to behave like
regular folders in search results.
## Root cause
The `ignore` crate's `WalkBuilder` defaults to `.follow_links(false)`
[[source](9802945e63/crates/ignore/src/walk.rs (L532))],
so when traversing the file system, it would:
- Detect symlinked directories as directory entries
- But not traverse into them to index their contents
- The `get_file_path` function would then filter out actual directories,
leaving only the symlinked folder itself as a result
Fix: Added `.follow_links(true)` to the `WalkBuilder` configuration,
making the file search follow symlinks and index their contents just
like regular directories.
This change maintains backward compatibility since symlink following is
generally expected behavior for file search tools, and it aligns with
how users expect the `@` search feature to work.
Co-authored-by: Eric Traut <etraut@openai.com>
I was missing an example config.toml, and following docs/config.md alone
was slower. I had GPT-5 scan the codebase for every accepted config key,
check the defaults, and generate a single example config.toml with
annotations. It lists all keys Codex reads from TOML, sets each to its
effective default where it exists, leaves optional ones commented, and
adds short comments on purpose and valid values. This should make
onboarding faster and reduce configuration errors. I can rename it to
config.example.toml or move it under docs/ if you prefer.
This fixes bug #6121.
The `input_messages` field passed to the notify handler is currently
empty because the logic is incorrectly including the OutputText rather
than InputText. I've fixed that and added proper filtering to remove
messages associated with AGENTS.md and other context injected by the
harness.
Testing: I wrote a notify handler and verified that the user prompt is
correctly passed through to the handler.
## Summary
- replace the word part enum with a simple `is_word_separator` helper
- keep word-boundary logic aligned with the helper and punctuation-aware
behavior
- extend forward/backward deletion tests to cover whitespace around
separators
## Testing
- just fix -p codex-tui
- cargo test -p codex-tui
------
https://chatgpt.com/codex/tasks/task_i_68f91c71d838832ca2a3c4f0ec1b55d4
This value is used to determine whether mid-turn compaction is required.
Reasoning items are only excluded between turns (and soon will start to
be preserved even across turns) so it's incorrect to subtract
reasoning_output_tokens mid term.
This will result in higher values reported between turns but we are also
looking into preserving reasoning items for the entire conversation to
improve performance and caching.
Error message for attempting to OAuth with a remote RCP is incorrect and
misleading. The correct config is
```
[features]
rmcp_client = true
```
Co-authored-by: Eric Traut <etraut@openai.com>
This pull request adds a new documentation section to explain the
available slash commands in Codex. The update introduces a clear
overview and a reference table for built-in commands, making it easier
for users to understand and utilize these features.
Documentation updates:
* Added a new section to `docs/slash_commands.md` describing what slash
commands are and listing all built-in commands with their purposes in a
formatted table.
Hi OpenAI Codex team, currently "Visit chatgpt.com/codex/settings/usage
for up-to-date information on rate limits and credits" message in status
card and error messages. For now, without the "https://" prefix, the
link cannot be clicked directly from most terminals or chat interfaces.
<img width="636" height="127" alt="Screenshot 2025-11-02 at 22 47 06"
src="https://github.com/user-attachments/assets/5ea11e8b-fb74-451c-85dc-f4d492b2678b"
/>
---
The fix is intent to improve this issue:
- It makes the link clickable in terminals that support it, hence better
accessibility
- It follows standard URL formatting practices
- It maintains consistency with other links in the application (like the
existing "https://openai.com/chatgpt/pricing" links)
Thank you!
Addresses issue https://github.com/openai/codex/issues/3582 where an
"archive conversation" command in the extension fails on Windows.
The problem is that the `archive_conversation` api server call is not
canonicalizing the path to the rollout path when performing its check to
verify that the rollout path is in the sessions directory. This causes
it to fail 100% of the time on Windows.
Testing: I was able to repro the error on Windows 100% prior to this
change. After the change, I'm no longer able to repro.
When I enable `experimental_sandbox_command_assessment`, I get an
incorrect deprecation warning: "experimental_sandbox_command_assessment
is deprecated. Use experimental_sandbox_command_assessment instead."
This PR fixes this error.
* Removed sandbox risk categories; feedback indicates that these are not
that useful and "less is more"
* Tweaked the assessment prompt to generate terser answers
* Fixed bug in orchestrator that prevents this feature from being
exposed in the extension
Fixes#4161
Currently Codex uses a regex to parse the "Please try again in 1.898s"
OpenAI-style rate limit message, so that it can wait the correct
duration before retrying. Azure OpenAI returns a different error that
looks like "Rate limit exceeded. Try again in 35 seconds."
This PR extends the regex and parsing code to match in a more fuzzy
manner, handling anything matching the pattern "try again in
\<duration>\<unit>".
## Summary
This PR fixes a broken self-referencing link in the contributing
documentation.
## Changes
- Removed the phrase 'Following the [development
setup](#development-workflow) instructions above' from the Development
workflow section
- The link referenced a non-existent section and the phrase didn't make
logical sense in context
## Before
The text referenced 'development setup instructions above' but:
1. No section called 'development setup' exists
2. There were no instructions 'above' that point
3. The link pointed to the same section it was in
## After
Simplified to: 'Ensure your change is free of lint warnings and test
failures.'
## Type
Documentation fix
I have read the CLA Document and I hereby sign the CLA
Co-authored-by: Ritesh Chauhan <sagar.chauhn11@gmail.com>
## Summary
Can never have enough tests on this code path - checking that json
inside a shell call is deserialized correctly.
## Tests
- [x] These are tests 😎
I finished reading “Getting Started,” but couldn’t find the
“Configuration” section in the README. After following the link, I
realized “Configuration” is in a separate file, so I updated the README
accordingly.
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
Co-authored-by: Eric Traut <etraut@openai.com>
Bumps [indexmap](https://github.com/indexmap-rs/indexmap) from 2.10.0 to
2.11.4.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/indexmap-rs/indexmap/blob/main/RELEASES.md">indexmap's
changelog</a>.</em></p>
<blockquote>
<h2>2.11.4 (2025-09-18)</h2>
<ul>
<li>Updated the <code>hashbrown</code> dependency to a range allowing
0.15 or 0.16.</li>
</ul>
<h2>2.11.3 (2025-09-15)</h2>
<ul>
<li>Make the minimum <code>serde</code> version only apply when
"serde" is enabled.</li>
</ul>
<h2>2.11.2 (2025-09-15)</h2>
<ul>
<li>Switched the "serde" feature to depend on
<code>serde_core</code>, improving build
parallelism in cases where other dependents have enabled
"serde/derive".</li>
</ul>
<h2>2.11.1 (2025-09-08)</h2>
<ul>
<li>Added a <code>get_key_value_mut</code> method to
<code>IndexMap</code>.</li>
<li>Removed the unnecessary <code>Ord</code> bound on
<code>insert_sorted_by</code> methods.</li>
</ul>
<h2>2.11.0 (2025-08-22)</h2>
<ul>
<li>Added <code>insert_sorted_by</code> and
<code>insert_sorted_by_key</code> methods to <code>IndexMap</code>,
<code>IndexSet</code>, and <code>VacantEntry</code>, like customizable
versions of <code>insert_sorted</code>.</li>
<li>Added <code>is_sorted</code>, <code>is_sorted_by</code>, and
<code>is_sorted_by_key</code> methods to
<code>IndexMap</code> and <code>IndexSet</code>, as well as their
<code>Slice</code> counterparts.</li>
<li>Added <code>sort_by_key</code> and <code>sort_unstable_by_key</code>
methods to <code>IndexMap</code> and
<code>IndexSet</code>, as well as parallel counterparts.</li>
<li>Added <code>replace_index</code> methods to <code>IndexMap</code>,
<code>IndexSet</code>, and <code>VacantEntry</code>
to replace the key (or set value) at a given index.</li>
<li>Added optional <code>sval</code> serialization support.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="03f9e58626"><code>03f9e58</code></a>
Merge pull request <a
href="https://redirect.github.com/indexmap-rs/indexmap/issues/418">#418</a>
from a1phyr/hashbrown_0.16</li>
<li><a
href="ee6080d480"><code>ee6080d</code></a>
Release 2.11.4</li>
<li><a
href="a7da8f181e"><code>a7da8f1</code></a>
Use a range for hashbrown</li>
<li><a
href="0cd5aefb44"><code>0cd5aef</code></a>
Update <code>hashbrown</code> to 0.16</li>
<li><a
href="fd5c819daf"><code>fd5c819</code></a>
Merge pull request <a
href="https://redirect.github.com/indexmap-rs/indexmap/issues/417">#417</a>
from cuviper/release-2.11.3</li>
<li><a
href="9321145e1f"><code>9321145</code></a>
Release 2.11.3</li>
<li><a
href="7b485688c2"><code>7b48568</code></a>
Merge pull request <a
href="https://redirect.github.com/indexmap-rs/indexmap/issues/416">#416</a>
from cuviper/release-2.11.2</li>
<li><a
href="49ce7fa471"><code>49ce7fa</code></a>
Release 2.11.2</li>
<li><a
href="58fd834804"><code>58fd834</code></a>
Merge pull request <a
href="https://redirect.github.com/indexmap-rs/indexmap/issues/414">#414</a>
from DaniPopes/serde_core</li>
<li><a
href="5dc1d6ab31"><code>5dc1d6a</code></a>
Depend on <code>serde_core</code> instead of <code>serde</code></li>
<li>Additional commits viewable in <a
href="https://github.com/indexmap-rs/indexmap/compare/2.10.0...2.11.4">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Eric Traut <etraut@openai.com>
Bumps [anyhow](https://github.com/dtolnay/anyhow) from 1.0.99 to
1.0.100.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/dtolnay/anyhow/releases">anyhow's
releases</a>.</em></p>
<blockquote>
<h2>1.0.100</h2>
<ul>
<li>Teach clippy to lint formatting arguments in <code>bail!</code>,
<code>ensure!</code>, <code>anyhow!</code> (<a
href="https://redirect.github.com/dtolnay/anyhow/issues/426">#426</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="18c2598afa"><code>18c2598</code></a>
Release 1.0.100</li>
<li><a
href="f2719888cb"><code>f271988</code></a>
Merge pull request <a
href="https://redirect.github.com/dtolnay/anyhow/issues/426">#426</a>
from dtolnay/clippyfmt</li>
<li><a
href="52f2115a1f"><code>52f2115</code></a>
Mark macros with clippy::format_args</li>
<li><a
href="da5fd9d5a3"><code>da5fd9d</code></a>
Raise minimum tested compiler to rust 1.76</li>
<li><a
href="211e4092b7"><code>211e409</code></a>
Opt in to generate-macro-expansion when building on docs.rs</li>
<li><a
href="b48fc02c32"><code>b48fc02</code></a>
Enforce trybuild >= 1.0.108</li>
<li><a
href="d5f59fbd45"><code>d5f59fb</code></a>
Update ui test suite to nightly-2025-09-07</li>
<li><a
href="238415d25b"><code>238415d</code></a>
Update ui test suite to nightly-2025-08-24</li>
<li><a
href="3bab0709a3"><code>3bab070</code></a>
Update actions/checkout@v4 -> v5</li>
<li><a
href="42492546e3"><code>4249254</code></a>
Order cap-lints flag in the same order as thiserror build script</li>
<li>See full diff in <a
href="https://github.com/dtolnay/anyhow/compare/1.0.99...1.0.100">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Eric Traut <etraut@openai.com>
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 2.0.16 to
2.0.17.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/dtolnay/thiserror/releases">thiserror's
releases</a>.</em></p>
<blockquote>
<h2>2.0.17</h2>
<ul>
<li>Use differently named __private module per patch release (<a
href="https://redirect.github.com/dtolnay/thiserror/issues/434">#434</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="72ae716e6d"><code>72ae716</code></a>
Release 2.0.17</li>
<li><a
href="599fdce83a"><code>599fdce</code></a>
Merge pull request <a
href="https://redirect.github.com/dtolnay/thiserror/issues/434">#434</a>
from dtolnay/private</li>
<li><a
href="9ec05f6b38"><code>9ec05f6</code></a>
Use differently named __private module per patch release</li>
<li><a
href="d2c492b549"><code>d2c492b</code></a>
Raise minimum tested compiler to rust 1.76</li>
<li><a
href="fc3ab9501d"><code>fc3ab95</code></a>
Opt in to generate-macro-expansion when building on docs.rs</li>
<li><a
href="819fe29dbb"><code>819fe29</code></a>
Update ui test suite to nightly-2025-09-12</li>
<li><a
href="259f48c549"><code>259f48c</code></a>
Enforce trybuild >= 1.0.108</li>
<li><a
href="470e6a681c"><code>470e6a6</code></a>
Update ui test suite to nightly-2025-08-24</li>
<li><a
href="544e191e6e"><code>544e191</code></a>
Update actions/checkout@v4 -> v5</li>
<li><a
href="cbc1ebad3e"><code>cbc1eba</code></a>
Delete duplicate cap-lints flag from build script</li>
<li>See full diff in <a
href="https://github.com/dtolnay/thiserror/compare/2.0.16...2.0.17">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
You can trigger a rebase of this PR by commenting `@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
> **Note**
> Automatic rebases have been disabled on this pull request as it has
been open for over 30 days.
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Eric Traut <etraut@openai.com>
## What?
Fixed error handling in `insert_history_lines_to_writer` where all
terminal operations were silently ignoring errors via `.ok()`.
## Why?
Silent I/O failures could leave the terminal in an inconsistent state
(e.g., scroll region not reset) with no way to debug. This violates Rust
error handling best practices.
## How?
- Changed function signature to return `io::Result<()>`
- Replaced all `.ok()` calls with `?` operator to propagate errors
- Added `tracing::warn!` in wrapper function for backward compatibility
- Updated 15 test call sites to handle Result with `.expect()`
## Testing
- ✅ Pass all tests
## Type of Change
- [x] Bug fix (non-breaking change)
---------
Signed-off-by: Huaiwu Li <lhwzds@gmail.com>
Co-authored-by: Eric Traut <etraut@openai.com>
we are seeing [reports](https://github.com/openai/codex/issues/6004) of
users having verbosity in their config.toml and facing issues.
gpt-5-codex doesn't accept other values rather than medium for
verbosity.
Fixes a Markdown parsing issue where a list item used `*` without a
following space (`*Line ranges ...`). Per CommonMark, a space after the
list marker is required. Updated to `* Line ranges ...` so the guideline
renders as a standalone bullet. This change improves readability and
prevents mis-parsing in renderers.
Co-authored-by: Eric Traut <etraut@openai.com>
## Summary
- add the `/exit` slash command alongside `/quit` and reuse shared exit
handling
- refactor the chat widget to funnel quit, exit, logout, and shutdown
flows through a common `request_exit` helper
- add focused unit tests that confirm both `/quit` and `/exit` send an
`ExitRequest`
## Testing
- `just fmt`
- `just fix -p codex-tui`
- `cargo test -p codex-tui`
------
https://chatgpt.com/codex/tasks/task_i_6903d5a8f47c8321bf180f031f2fa330
- Added the new codex-windows-sandbox crate that builds both a library
entry point (run_windows_sandbox_capture) and a CLI executable to launch
commands inside a Windows restricted-token sandbox, including ACL
management, capability SID provisioning, network lockdown, and output
capture
(windows-sandbox-rs/src/lib.rs:167, windows-sandbox-rs/src/main.rs:54).
- Introduced the experimental WindowsSandbox feature flag and wiring so
Windows builds can opt into the sandbox:
SandboxType::WindowsRestrictedToken, the in-process execution path, and
platform sandbox selection now honor the flag (core/src/features.rs:47,
core/src/config.rs:1224, core/src/safety.rs:19,
core/src/sandboxing/mod.rs:69, core/src/exec.rs:79,
core/src/exec.rs:172).
- Updated workspace metadata to include the new crate and its
Windows-specific dependencies so the core crate can link against it
(codex-rs/
Cargo.toml:91, core/Cargo.toml:86).
- Added a PowerShell bootstrap script that installs the Windows
toolchain, required CLI utilities, and builds the workspace to ease
development
on the platform (scripts/setup-windows.ps1:1).
- Landed a Python smoke-test suite that exercises
read-only/workspace-write policies, ACL behavior, and network denial for
the Windows sandbox
binary (windows-sandbox-rs/sandbox_smoketests.py:1).
# Summary
This PR is related to the Issue #3978 and contains a fix to the seatbelt
profile for macOS that allows to run java/jdk tooling from the sandbox.
I have found that the included change is the minimum change to make it
run on my machine.
There is a unit test added by codex when making this fix. I wonder if it
is useful since you need java installed on the target machine for it to
be relevant. I can remove it it is better.
Fixes#3978
There's still some debate about whether we want to expose
`tools.view_image` or `feature.view_image` so those are left unchanged
for now, but this old `include_view_image_tool` config is good-to-go.
Also updated the doc to reflect that `view_image` tool is now by default
true.
Pull request template, minimal:
---
### **What?**
Minor change (low-hanging fruit).
### **Why?**
To improve code quality or documentation with minimal risk and effort.
### **How?**
Edited directly via VSCode Editor.
---
**Checklist (pre-PR):**
* [x] I have read the CLA Document and hereby sign the CLA.
* [x] I reviewed the “Contributing” markdown file for this project.
*This template meets standard external (non-OpenAI) PR requirements and
signals compliance for maintainers.*
Co-authored-by: Eric Traut <etraut@openai.com>
We had this annotation everywhere in app-server APIs which made it so
that fields get serialized as `field?: T`, meaning if the field as
`None` we would omit the field in the payload. Removing this annotation
changes it so that we return `field: T | null` instead, which makes
codex app-server's API more aligned with the convention of public OpenAI
APIs like Responses.
Separately, remove the `#[ts(optional_fields = nullable)]` annotations
that were recently added which made all the TS types become `field?: T |
null` which is not great since clients need to handle undefined and
null.
I think generally it'll be best to have optional types be either:
- `field: T | null` (preferred, aligned with public OpenAI APIs)
- `field?: T` where we have to, such as types generated from the MCP
schema:
https://github.com/modelcontextprotocol/modelcontextprotocol/blob/main/schema/2025-06-18/schema.ts
(see changes to `mcp-types/`)
I updated @etraut-openai's unit test to check that all generated TS
types are one or the other, not both (so will error if we have a type
that has `field?: T | null`). I don't think there's currently a good use
case for that - but we can always revisit.
## Summary
Duplicates the tests in `apply_patch_cli.rs`, but tests the freeform
apply_patch tool as opposed to the function call path. The good news is
that all the tests pass with zero logical tests, with the exception of
the heredoc, which doesn't really make sense in the freeform tool
context anyway.
@jif-oai since you wrote the original tests in #5557, I'd love your
opinion on the right way to DRY these test cases between the two. Happy
to set up a more sophisticated harness, but didn't want to go down the
rabbit hole until we agreed on the right pattern
## Testing
- [x] These are tests
## Summary
- add a debug-only `/rollout` slash command that prints the rollout file
path or reports when none is known
- surface the new command in the slash command metadata and cover it
with unit tests
<img width="539" height="99" alt="image"
src="https://github.com/user-attachments/assets/688e1334-8a06-4576-abb8-ada33b458661"
/>
## Summary
- re-enable the TypeScript SDK test that verifies local images are
forwarded to `codex exec`
## Testing
- `pnpm test` *(fails: unable to download pnpm 10.8.1 because external
network access is blocked in the sandbox)*
------
https://chatgpt.com/codex/tasks/task_i_690289cb861083209fd006867e2adfb1
Adds AgentMessageContentDelta, ReasoningContentDelta,
ReasoningRawContentDelta item streaming events while maintaining
compatibility for old events.
---------
Co-authored-by: Owen Lin <owen@openai.com>
In this PR, I am exploring migrating task kind to an invocation of
Codex. The main reason would be getting rid off multiple
`ConversationHistory` state and streamlining our context/history
management.
This approach depends on opening a channel between the sub-codex and
codex. This channel is responsible for forwarding `interactive`
(`approvals`) and `non-interactive` events. The `task` is responsible
for handling those events.
This opens the door for implementing `codex as a tool`, replacing
`compact` and `review`, and potentially subagents.
One consideration is this code is very similar to `app-server` specially
in the approval part. If in the future we wanted an interactive
`sub-codex` we should consider using `codex-mcp`
The goal is to have a single place where we actually write files
In a follow-up PR, will move everything config related in a dedicated
module and move the helpers in a dedicated file
We currently have nested enums when sending raw response items in the
app-server protocol. This makes downstream schemas confusing because we
need to embed `type`-discriminated enums within each other.
This PR adds a small wrapper around the response item so we can keep the
schemas separate
This PR addresses a current hole in the TypeScript code generation for
the API server protocol. Fields that are marked as "Optional<>" in the
Rust code are serialized such that the value is omitted when it is
deserialized — appearing as `undefined`, but the TS type indicates
(incorrectly) that it is always defined but possibly `null`. This can
lead to subtle errors that the TypeScript compiler doesn't catch. The
fix is to include the `#[ts(optional_fields = nullable)]` macro for all
protocol structs that contain one or more `Optional<>` fields.
This PR also includes a new test that validates that all TS protocol
code containing "| null" in its type is marked optional ("?") to catch
cases where `#[ts(optional_fields = nullable)]` is omitted.
feature: Add "!cmd" user shell execution
This change lets users run local shell commands directly from the TUI by
prefixing their input with ! (e.g. !ls). Output is truncated to keep the
exec cell usable, and Ctrl-C cleanly
interrupts long-running commands (e.g. !sleep 10000).
**Summary of changes**
- Route Op::RunUserShellCommand through a dedicated UserShellCommandTask
(core/src/tasks/user_shell.rs), keeping the task logic out of codex.rs.
- Reuse the existing tool router: the task constructs a ToolCall for the
local_shell tool and relies on ShellHandler, so no manual MCP tool
lookup is required.
- Emit exec lifecycle events (ExecCommandBegin/ExecCommandEnd) so the
TUI can show command metadata, live output, and exit status.
**End-to-end flow**
**TUI handling**
1. ChatWidget::submit_user_message (TUI) intercepts messages starting
with !.
2. Non-empty commands dispatch Op::RunUserShellCommand { command };
empty commands surface a help hint.
3. No UserInput items are created, so nothing is enqueued for the model.
**Core submission loop**
4. The submission loop routes the op to handlers::run_user_shell_command
(core/src/codex.rs).
5. A fresh TurnContext is created and Session::spawn_user_shell_command
enqueues UserShellCommandTask.
**Task execution**
6. UserShellCommandTask::run emits TaskStartedEvent, formats the
command, and prepares a ToolCall targeting local_shell.
7. ToolCallRuntime::handle_tool_call dispatches to ShellHandler.
**Shell tool runtime**
8. ShellHandler::run_exec_like launches the process via the unified exec
runtime, honoring sandbox and shell policies, and emits
ExecCommandBegin/End.
9. Stdout/stderr are captured for the UI, but the task does not turn the
resulting ToolOutput into a model response.
**Completion**
10. After ExecCommandEnd, the task finishes without an assistant
message; the session marks it complete and the exec cell displays the
final output.
**Conversation context**
- The command and its output never enter the conversation history or the
model prompt; the flow is local-only.
- Only exec/task events are emitted for UI rendering.
**Demo video**
https://github.com/user-attachments/assets/fcd114b0-4304-4448-a367-a04c43e0b996
Found that the VS Code Codex extension throws “Error starting
conversation” when initializing a conversation with Git for Windows’
bash on PATH.
Debugging showed the bash-detection logic did not return as expected;
this change makes it reliable in that scenario.
Possibly related to issue #2841.
# Preserve PATH precedence & fix Windows MCP env propagation
## Problem & intent
Preserve user PATH precedence and reduce Windows setup friction for MCP
servers by avoiding PATH reordering and ensuring Windows child processes
receive essential env vars.
- Addresses: #4180#5225#2945#3245#3385#2892#3310#3457#4370
- Supersedes: #4182, #3866, #3828 (overlapping/inferior once this
merges)
- Notes: #2626 / #2646 are the original PATH-mutation sources being
corrected.
---
## Before / After
**Before**
- PATH was **prepended** with an `apply_patch` helper dir (Rust + Node
wrapper), reordering tools and breaking virtualenvs/shims on
macOS/Linux.
- On Windows, MCP servers missed core env vars and often failed to start
without explicit per-server env blocks.
**After**
- Helper dir is **appended** to PATH (preserves user/tool precedence).
- Windows MCP child env now includes common core variables and mirrors
`PATH` → `Path`, so typical CLIs/plugins work **without** per-server env
blocks.
---
## Scope of change
### `codex-rs/arg0/src/lib.rs`
- Append temp/helper dir to `PATH` instead of prepending.
### `codex-cli/bin/codex.js`
- Mirror the same append behavior for the Node wrapper.
### `codex-rs/rmcp-client/src/utils.rs`
- Expand Windows `DEFAULT_ENV_VARS` (e.g., `COMSPEC`, `SYSTEMROOT`,
`PROGRAMFILES*`, `APPDATA`, etc.).
- Mirror `PATH` → `Path` for Windows child processes.
- Small unit test; conditional `mut` + `clippy` cleanup.
---
## Security effects
No broadened privileges. Only environment propagation for well-known
Windows keys on stdio MCP child processes. No sandbox policy changes and
no network additions.
---
## Testing evidence
**Static**
- `cargo fmt`
- `cargo clippy -p codex-arg0 -D warnings` → **clean**
- `cargo clippy -p codex-rmcp-client -D warnings` → **clean**
- `cargo test -p codex-rmcp-client` → **13 passed**
**Manual**
- Local verification on Windows PowerShell 5/7 and WSL (no `unused_mut`
warnings on non-Windows targets).
---
## Checklist
- [x] Append (not prepend) helper dir to PATH in Rust and Node wrappers
- [x] Windows MCP child inherits core env vars; `PATH` mirrored to
`Path`
- [x] `cargo fmt` / `clippy` clean across touched crates
- [x] Unit tests updated/passing where applicable
- [x] Cross-platform behavior preserved (macOS/Linux PATH precedence
intact)
This PR adds an option to app server to allow conversation summaries to
be fetched from just the conversation id rather than rollout path for
convenience at the cost of some latency to discover the rollout path.
This convenience is non-trivial as it allows app servers to simply
maintain conversation ids rather than rollout paths and the associated
platform (Windows) handling associated with storing and encoding them
correctly.
Follow-up to https://github.com/openai/codex/pull/5063
Refined the app-server export pipeline so JSON Schema variants and
discriminator fields are annotated with descriptive, stable titles
before writing the bundle. This eliminates anonymous enum names in the
generated Pydantic models (goodbye Type7) while keeping downstream
tooling simple. Added shared helpers to derive titles and literals, and
reused them across the traversal logic for clarity. Running just fix -p
codex-app-server-protocol, just fmt, and cargo test -p
codex-app-server-protocol validates the change.
solves: https://github.com/openai/codex/issues/5675
Block non-image uploads in the view_image workflow. We now confirm the
file’s MIME is image/* before building the data URL; otherwise we emit a
“unsupported MIME type” error to the model. This stops the agent from
sending application/json blobs that the Responses API rejects with 400s.
<img width="409" height="556" alt="Screenshot 2025-10-28 at 1 15 10 PM"
src="https://github.com/user-attachments/assets/a92199e8-2769-4b1d-8e33-92d9238c90fe"
/>
Addresses https://github.com/openai/codex/issues/5773
Testing: I tested that images work (regardless of order that they are
associated with the task prompt) in both the CLI and Extension. Also
verified that conversations in CLI and extension with images can be
resumed.
This fixes an issue where messages sent during the final response stream
would seem to disappear, because the "queued messages" UI wasn't shown
during streaming.
There's a lot of visual noise in app-server's integration tests due to
the number of `.expect("<some_msg>")` lines which are largely redundant
/ not very useful. Clean them up by using `anyhow::Result` + `?`
consistently.
Replaces the existing pattern of:
```
let codex_home = TempDir::new().expect("create temp dir");
create_config_toml(codex_home.path()).expect("write config.toml");
let mut mcp = McpProcess::new(codex_home.path())
.await
.expect("spawn mcp process");
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize())
.await
.expect("initialize timeout")
.expect("initialize request");
```
With:
```
let codex_home = TempDir::new()?;
create_config_toml(codex_home.path())?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
```
This PR is a follow-up to #5591. It allows users to choose which auth
storage mode they want by using the new
`cli_auth_credentials_store_mode` config.
## Summary
- Coerce Windows `workspace-write` configs back to read-only, surface
the forced downgrade in the approvals popup,
and funnel users toward WSL or Full Access.
- Add WSL installation instructions to the Auto preset on Windows while
keeping the preset available for other
platforms.
- Skip the trust-on-first-run prompt on native Windows so new folders
remain read-only without additional
confirmation.
- Expose a structured sandbox policy resolution from config to flag
Windows downgrades and adjust tests (core,
exec, TUI) to reflect the new behavior; provide a Windows-only approvals
snapshot.
## Testing
- cargo fmt
- cargo test -p codex-core
config::tests::add_dir_override_extends_workspace_writable_roots
- cargo test -p codex-exec
suite::resume::exec_resume_preserves_cli_configuration_overrides
- cargo test -p codex-tui
chatwidget::tests::approvals_selection_popup_snapshot
- cargo test -p codex-tui
approvals_popup_includes_wsl_note_for_auto_mode
- cargo test -p codex-tui windows_skips_trust_prompt
- just fix -p codex-core
- just fix -p codex-tui
fixing drag/drop photos bug in codex
state of the world before:
sometimes, when you drag screenshots into codex, the image does not
properly render into context. instead, the file name is shown in
quotation marks.
https://github.com/user-attachments/assets/3c0e540a-505c-4ec0-b634-e9add6a73119
the screenshot is not actually included in agent context. the agent
needs to manually call the view_image tool to see the screenshot. this
can be unreliable especially if the image is part of a longer prompt and
is dependent on the agent going out of its way to view the image.
state of the world after:
https://github.com/user-attachments/assets/5f2b7bf7-8a3f-4708-85f3-d68a017bfd97
now, images will always be directly embedded into chat context
## Technical Details
- MacOS sends screenshot paths with a narrow no‑break space right before
the “AM/PM” suffix, which used to trigger our non‑ASCII fallback in the
paste burst detector.
- That fallback flushed the partially buffered paste immediately, so the
path arrived in two separate `handle_paste` calls (quoted prefix +
`PM.png'`). The split string could not be normalized to a real path, so
we showed the quoted filename instead of embedding the image.
- We now append non‑ASCII characters into the burst buffer when a burst
is already active. Finder’s payload stays intact, the path normalizes,
and the image attaches automatically.
- When no burst is active (e.g. during IME typing), non‑ASCII characters
still bypass the buffer so text entry remains responsive.
It's pretty amazing we have gotten here without the ability for the
model to see image content from MCP tool calls.
This PR builds off of 4391 and fixes#4819. I would like @KKcorps to get
adequete credit here but I also want to get this fix in ASAP so I gave
him a week to update it and haven't gotten a response so I'm going to
take it across the finish line.
This test highlights how absured the current situation is. I asked the
model to read this image using the Chrome MCP
<img width="2378" height="674" alt="image"
src="https://github.com/user-attachments/assets/9ef52608-72a2-4423-9f5e-7ae36b2b56e0"
/>
After this change, it correctly outputs:
> Captured the page: image dhows a dark terminal-style UI labeled
`OpenAI Codex (v0.0.0)` with prompt `model: gpt-5-codex medium` and
working directory `/codex/codex-rs`
(and more)
Before this change, it said:
> Took the full-page screenshot you asked for. It shows a long,
horizontally repeating pattern of stylized people in orange, light-blue,
and mustard clothing, holding hands in alternating poses against a white
background. No text or other graphics-just rows of flat illustration
stretching off to the right.
Without this change, the Figma, Playwright, Chrome, and other visual MCP
servers are pretty much entirely useless.
I tested this change with the openai respones api as well as a third
party completions api
Makes sense to move this struct to `app-server-protocol/` since we want
to serialize as camelCase, but we don't for structs defined in
`protocol/`
It was:
```
export type Account = { "type": "ApiKey", api_key: string, } | { "type": "chatgpt", email: string | null, plan_type: PlanType, };
```
But we want:
```
export type Account = { "type": "apiKey", apiKey: string, } | { "type": "chatgpt", email: string | null, planType: PlanType, };
```
move the truncation logic to conversation history to use on any tool
output. This will help us in avoiding edge cases while truncating the
tool calls and mcp calls.
Follow-up PR to #5569. Add Keyring Support for Auth Storage in Codex CLI
as well as a hybrid mode (default to persisting in keychain but fall
back to file when unavailable.)
It also refactors out the keyringstore implementation from rmcp-client
[here](https://github.com/openai/codex/blob/main/codex-rs/rmcp-client/src/oauth.rs)
to a new keyring-store crate.
There will be a follow-up that picks the right credential mode depending
on the config, instead of hardcoding `AuthCredentialsStoreMode::File`.
This PR introduces a new `Auth Storage` abstraction layer that takes
care of read, write, and load of auth tokens based on the
AuthCredentialsStoreMode. It is similar to how we handle MCP client
oauth
[here](https://github.com/openai/codex/blob/main/codex-rs/rmcp-client/src/oauth.rs).
Instead of reading and writing directly from disk for auth tokens, Codex
CLI workflows now should instead use this auth storage using the public
helper functions.
This PR is just a refactor of the current code so the behavior stays the
same. We will add support for keyring and hybrid mode in follow-up PRs.
I have read the CLA Document and I hereby sign the CLA
This PR does the following:
1. Changes `try_refresh_token` to handle the case where the endpoint
returns a response without an `id_token`. The OpenID spec indicates that
this field is optional and clients should not assume it's present.
2. Changes the `attempt_stream_responses` to propagate token refresh
errors rather than silently ignoring them.
3. Fixes a typo in a couple of error messages (unrelated to the above,
but something I noticed in passing) - "reconnect" should be spelled
without a hyphen.
This PR does not implement the additional suggestion from @pakrym-oai
that we should sign out when receiving `refresh_token_expired` from the
refresh endpoint. Leaving this as a follow-on because I'm undecided on
whether this should be implemented in `try_refresh_token` or its
callers.
This adds an RPC to the app server to the the `ConversationSummary` via
a rollout path. Now that the VS Code extension supports showing the
Codex UI in an editor panel where the URI of the panel maps to the
rollout file, we need to be able to get the `ConversationSummary` from
the rollout file directly.
An AppServer client should be able to use any (`model_provider`, `model`) in the user's config. `NewConversationParams` already supported specifying the `model`, but this PR expands it to support `model_provider`, as well.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/5793).
* #5803
* __->__ #5793
Because conversations that use the Responses API can have encrypted
reasoning messages, trying to resume a conversation with a different
provider could lead to confusing "failed to decrypt" errors. (This is
reproducible by starting a conversation using ChatGPT login and resuming
it as a conversation that uses OpenAI models via Azure.)
This changes `ListConversationsParams` to take a `model_providers:
Option<Vec<String>>` and adds `model_provider` on each
`ConversationSummary` it returns so these cases can be disambiguated.
Note this ended up making changes to
`codex-rs/core/src/rollout/tests.rs` because it had a number of cases
where it expected `Some` for the value of `next_cursor`, but the list of
rollouts was complete, so according to this docstring:
bcd64c7e72/codex-rs/app-server-protocol/src/protocol.rs (L334-L337)
If there are no more items to return, then `next_cursor` should be
`None`. This PR updates that logic.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/5658).
* #5803
* #5793
* __->__ #5658
Revert #5642 because this generates:
```
// GENERATED CODE! DO NOT MODIFY BY HAND!
// This file was generated by [ts-rs](https://github.com/Aleph-Alpha/ts-rs). Do not edit this file manually.
export type GetAccountResponse = Account | null;
```
But `Account` is unknown.
The unique use of `#[ts(export)]` on `GetAccountResponse` is also
suspicious as are the changes to
`codex-rs/app-server-protocol/src/export.rs` since the existing system
has worked fine for quite some time.
Though a pure backout of #5642 puts things in a state where, as the PR
noted, the following does not work:
```
cargo run -p codex-app-server-protocol --bin export -- --out DIR
```
So in addition to the backout, this PR adds:
```rust
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct GetAccountResponse {
pub account: Account,
}
```
and changes `GetAccount.response` as follows:
```diff
- response: Option<Account>,
+ response: GetAccountResponse,
```
making it consistent with other types.
With this change, I verified that both of the following work:
```
just codex generate-ts --out /tmp/somewhere
cargo run -p codex-app-server-protocol --bin export -- --out /tmp/somewhere-else
```
The generated TypeScript is as follows:
```typescript
// GetAccountResponse.ts
import type { Account } from "./Account";
export type GetAccountResponse = { account: Account, };
```
and
```typescript
// Account.ts
import type { PlanType } from "./PlanType";
export type Account = { "type": "ApiKey", api_key: string, } | { "type": "chatgpt", email: string | null, plan_type: PlanType, };
```
Though while the inconsistency between `"type": "ApiKey"` and `"type":
"chatgpt"` is quite concerning, I'm not sure if that format is ever
written to disk in any case, but @owenlin0, I would recommend looking
into that.
Also, it appears that the types in `codex-rs/protocol/src/account.rs`
are used exclusively by the `app-server-protocol` crate, so perhaps they
should just be moved there?
Currently, `approval_policy` is supported in profiles, but
`sandbox_mode` is not. This PR adds support for `sandbox_mode`.
Note: a fix for this was submitted in [this
PR](https://github.com/openai/codex/pull/2397), but the underlying code
has changed significantly since then.
This addresses issue #3034
This PR fixes a test that is sporadically failing in CI.
The problem is that two unit tests (the older `login_and_cancel_chatgpt`
and a recently added
`login_chatgpt_includes_forced_workspace_query_param`) exercise code
paths that start the login server. The server binds to a hard-coded
localhost port number, so attempts to start more than one server at the
same time will fail. If these two tests happen to run concurrently, one
of them will fail.
To fix this, I've added a simple mutex. We can use this same mutex for
future tests that use the same pattern.
This PR adds support for a model-based summary and risk assessment for
commands that violate the sandbox policy and require user approval. This
aids the user in evaluating whether the command should be approved.
The feature works by taking a failed command and passing it back to the
model and asking it to summarize the command, give it a risk level (low,
medium, high) and a risk category (e.g. "data deletion" or "data
exfiltration"). It uses a new conversation thread so the context in the
existing thread doesn't influence the answer. If the call to the model
fails or takes longer than 5 seconds, it falls back to the current
behavior.
For now, this is an experimental feature and is gated by a config key
`experimental_sandbox_command_assessment`.
Here is a screen shot of the approval prompt showing the risk assessment
and summary.
<img width="723" height="282" alt="image"
src="https://github.com/user-attachments/assets/4597dd7c-d5a0-4e9f-9d13-414bd082fd6b"
/>
The API schema export is currently broken:
```
> cargo run -p codex-app-server-protocol --bin export -- --out DIR
Error: this type cannot be exported
```
This PR fixes the error message so we get more info:
```
> cargo run -p codex-app-server-protocol --bin export -- --out DIR
Error: failed to export client responses: dependency core::option::Option<codex_protocol::account::Account> cannot be exported
```
And fixes the root cause which is the `account/read` response.
## Summary
- wrap the default reqwest::Client inside a new
CodexHttpClient/CodexRequestBuilder pair and log the HTTP method, URL,
and status for each request
- update the auth/model/provider plumbing to use the new builder helpers
so headers and bearer auth continue to be applied consistently
- add the shared `http` dependency that backs the header conversion
helpers
## Testing
- `CODEX_SANDBOX=seatbelt CODEX_SANDBOX_NETWORK_DISABLED=1 cargo test -p
codex-core`
- `CODEX_SANDBOX=seatbelt CODEX_SANDBOX_NETWORK_DISABLED=1 cargo test -p
codex-chatgpt`
- `CODEX_SANDBOX=seatbelt CODEX_SANDBOX_NETWORK_DISABLED=1 cargo test -p
codex-tui`
------
https://chatgpt.com/codex/tasks/task_i_68fa5038c17483208b1148661c5873be
1. I have seen too many reports of people hitting startup timeout errors
and thinking Codex is broken. Hopefully this will help people
self-serve. We may also want to consider raising the timeout to ~15s.
2. Make it more clear what PAT is (personal access token) in the GitHub
error
<img width="2378" height="674" alt="CleanShot 2025-10-23 at 22 05 06"
src="https://github.com/user-attachments/assets/d148ce1d-ade3-4511-84a4-c164aefdb5c5"
/>
I want to centralize input processing and management to
`ConversationHistory`. This would need `ConversationHistory` to have
access to `token_info` (i.e. preventing adding a big input to the
history). Besides, it makes more sense to have it on
`ConversationHistory` than `state`.
Walk the sessions tree instead of using file_search so gitignored
CODEX_HOME directories can resume sessions. Add a regression test that
covers a .gitignore'd sessions directory.
Fixes#5247Fixes#5412
---------
Co-authored-by: Owen Lin <owen@openai.com>
Currently we collect all all turn items in a vector, then we add it to
the history on success. This result in losing those items on errors
including aborting `ctrl+c`.
This PR:
- Adds the ability for the tool call to handle cancellation
- bubble the turn items up to where we are recording this info
Admittedly, this logic is an ad-hoc logic that doesn't handle a lot of
error edge cases. The right thing to do is recording to the history on
the spot as `items`/`tool calls output` come. However, this isn't
possible because of having different `task_kind` that has different
`conversation_histories`. The `try_run_turn` has no idea what thread are
we using. We cannot also pass an `arc` to the `conversation_histories`
because it's a private element of `state`.
That's said, `abort` is the most common case and we should cover it
until we remove `task kind`
This shows the aggregated (stdout + stderr) buffer regardless of exit
code.
Many commands output useful / relevant info on stdout when returning a
non-zero exit code, or the same on stderr when returning an exit code of
0. Often, useful info is present on both stdout AND stderr. Also, the
model sees both. So it is confusing to see commands listed as "(no
output)" that in fact do have output, just on the stream that doesn't
match the exit status, or to see some sort of trivial output like "Tests
failed" but lacking any information about the actual failure.
As such, always display the aggregated output in the display. Transcript
mode remains unchanged as it was already displaying the text that the
model sees, which seems correct for transcript mode.
These are the schema definitions for the new JSON-RPC APIs associated
with accounts. These are not wired up to business logic yet and will
currently throw an internal error indicating these are unimplemented.
- ensure paste burst flush preserves ASCII characters before IME commits
- add regression test covering digit followed by Japanese text
submission
Fixesopenai/codex#4356
Co-authored-by: Josh McKinney <joshka@openai.com>
Codex will now send an `account/rateLimits/updated` notification
whenever the user's rate limits are updated.
This is implemented by just transforming the existing TokenCount event.
We are doing some ad-hoc logic while dealing with conversation history.
Ideally, we shouldn't mutate `vec[responseitem]` manually at all and
should depend on `ConversationHistory` for those changes.
Those changes are:
- Adding input to the history
- Removing items from the history
- Correcting history
I am also adding some `error` logs for cases we shouldn't ideally face.
For example, we shouldn't be missing `toolcalls` or `outputs`. We
shouldn't hit `ContextWindowExceeded` while performing `compact`
This refactor will give us granular control over our context management.
I haven't heard of any issues with the studio rmcp client so let's
remove the legacy one and default to the new one.
Any code changes are moving code from the adapter inline but there
should be no meaningful functionality changes.
1. Adds AgentMessage, Reasoning, WebSearch items.
2. Switches the ResponseItem parsing to use new items and then also emit
3. Removes user-item kind and filters out "special" (environment) user
items when returning to clients.
## What
- Add the `--cask` flag to the Homebrew update command for Codex.
## Why
- `brew upgrade codex` alone does not update the cask, so users were not
getting the right upgrade instructions.
## How
- Update `UpdateAction::BrewUpgrade` in `codex-rs/tui/src/updates.rs` to
use `upgrade --cask codex`.
## Testing
- [x] cargo test -p codex-tui
Co-authored-by: Thibault Sottiaux <tibo@openai.com>
While we do not want to encourage users to hardcode secrets in their
`config.toml` file, it should be possible to pass an API key
programmatically. For example, when using `codex app-server`, it is
possible to pass a "bag of configuration" as part of the
`NewConversationParams`:
682d05512f/codex-rs/app-server-protocol/src/protocol.rs (L248-L251)
When using `codex app-server`, it's not practical to change env vars of
the `codex app-server` process on the fly (which is how we usually read
API key values), so this helps with that.
## Summary
- make the plan tool available by default by removing the feature flag
and always registering the handler
- drop plan-tool CLI and API toggles across the exec, TUI, MCP server,
and app server code paths
- update tests and configs to reflect the always-on plan tool and guard
workspace restriction tests against env leakage
## Testing
Manually tested the extension.
------
https://chatgpt.com/codex/tasks/task_i_68f67a3ff2d083209562a773f814c1f9
This #[serial] approach is not ideal. I am tracking a separate issue to
create an injectable env var provider but I want to fix these tests
first.
Fixes#5447
Today `sub_id` is an ID of a single incoming Codex Op submition. We then
associate all events triggered by this operation using the same
`sub_id`.
At the same time we are also creating a TurnContext per submission and
we'd like to start associating some events (item added/item completed)
with an entire turn instead of just the operation that started it.
Using turn context when sending events give us flexibility to change
notification scheme.
Expose the session cwd in the notify payload and update docs so scripts
and extensions receive the real project path; users get accurate
project-aware notifications in CLI and VS Code.
Fixes#5387
Because the GitHub MCP is one of the most popular MCPs and it
confusingly doesn't support OAuth, we should make it more clear how to
make it work so people don't think Codex is broken.
Without proper `zsh -lc` parsing, we lose some things like proper
command parsing, turn diff tracking, safe command checks, and other
things we expect from raw or `bash -lc` commands.
Some MCP servers expose a lot of tools. In those cases, it is reasonable
to allow/denylist tools for Codex to use so it doesn't get overwhelmed
with too many tools.
The new configuration options available in the `mcp_server` toml table
are:
* `enabled_tools`
* `disabled_tools`
Fixes#4796
Adds a `GET account/rateLimits/read` API to app-server. This calls the
codex backend to fetch the user's current rate limits.
This would be helpful in checking rate limits without having to send a
message.
For calling the codex backend usage API, I generated the types and
manually copied the relevant ones into `codex-backend-openapi-types`.
It'll be nice to extend our internal openapi generator to support Rust
so we don't have to run these manual steps.
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
We don't instruct the model to use citations, so it never emits them.
Further, ratatui [doesn't currently support rendering links into the
terminal with OSC 8](https://github.com/ratatui/ratatui/issues/1028), so
even if we did parse citations, we can't correctly render them.
So, remove all the code related to rendering them.
Adds a new ItemStarted event and delivers UserMessage as the first item
type (more to come).
Renames `InputItem` to `UserInput` considering we're using the `Item`
suffix for actual items.
The backend will be returning unix timestamps (seconds since epoch)
instead of RFC 3339 strings. This will make it more ergonomic for
developers to integrate against - no string parsing.
Add shared helper to format warnings when add-dir is incompatible with
the sandbox. Surface the warning in the TUI entrypoint and document the
limitation for add-dir.
Add annotations and an export script that let us generate app-server
protocol types as typescript and JSONSchema.
The script itself is a bit hacky because we need to manually label some
of the types. Unfortunately it seems that enum variants don't get good
names by default and end up with something like `EventMsg1`,
`EventMsg2`, etc. I'm not an expert in this by any means, but since this
is only run manually and we already need to enumerate the types required
to describe the protocol, it didn't seem that much worse. An ideal
solution here would be to have some kind of root that we could generate
schemas for in one go, but I'm not sure if that's compatible with how we
generate the protocol today.
Extends shell wrapper stripping in TUI to handle `zsh -lc` in addition
to `bash -lc`.
Currently, Linux users (and macOS users with zsh profiles) see cluttered
command headers like `• Ran zsh -lc "echo hello"` instead of `• Ran echo
hello`. This happens because `codex-rs/tui/src/exec_command.rs` only
checks for literal `"bash"`, ignoring `zsh` and absolute paths like
`/usr/bin/zsh`.
**Changes:**
- Added `is_login_shell_with_lc` helper that extracts shell basename and
matches against `bash` or `zsh`
- Updated pattern matching to use the helper instead of hardcoded check
- Added test coverage for zsh and absolute paths (`/usr/bin/zsh`,
`/bin/bash`)
**Testing:**
```bash
cd codex-rs
cargo test strip_bash_lc_and_escape -p codex-tui
```
All 4 test cases pass (bash, zsh, and absolute paths for both).
Closes#4201
Extend `run` and `runStreamed` input to be either a `string` or
structured input. A structured input is an array of text parts and/or
image paths, which will then be fed to the CLI through the `--image`
argument. Text parts are combined with double newlines. For instance:
```ts
const turn = await thread.run([
{ type: "text", text: "Describe these screenshots" },
{ type: "local_image", path: "./ui.png" },
{ type: "local_image", path: "./diagram.jpg" },
{ type: "text", text: "Thanks!" },
]);
```
Ends up launching the CLI with:
```
codex exec --image foo.png --image bar.png "Describe these screenshots\n\nThanks!"
```
The complete `Input` type for both function now is:
```ts
export type UserInput =
| {
type: "text";
text: string;
}
| {
type: "local_image";
path: string;
};
export type Input = string | UserInput[];
```
This brings the Codex SDK closer to feature parity with the CLI.
Adresses #5280 .
This should make it more clear that specific tools come from MCP
servers.
#4806 requested that we add the server name but we already do that.
Fixes#4806
Tightened the docs so the sandbox guide matches reality, noted the new
tools.view_image toggle next to web search, and linked the README to the
getting-started guide which now owns the familiar tips (backtrack, --cd,
--add-dir, etc.).
Updated the configuration guide so it matches the current CLI behavior.
Clarified the platform-specific default model, explained how custom
model-providers interact with bundled ones, refreshed the streamable
HTTP/MCP section with accurate guidance on the RMCP client and OAuth
flag, and removed stale keys from the reference table.
Update FAQ, improve general structure for config, add more links across
the sections in the documentation, remove out of date and duplicate
content and better explain certain concepts such as approvals and
sandboxing.
Add a `--add-dir` CLI flag so sessions can use extra writable roots in
addition to the ones specified in the config file. These are ephemerally
added during the session only.
Fixes#3303Fixes#2797
The goal of this change:
1. Unify user input and user turn implementation.
2. Have a single place where turn/session setting overrides are applied.
3. Have a single place where turn context is created.
4. Create TurnContext only for actual turn and have a separate structure
for current session settings (reuse ConfigureSession)
Expand the custom prompts documentation and link it from other guides. Show saved prompt metadata in the slash-command popup, with tests covering description fallbacks.
Exit when a requested resume session is missing after restoring the
terminal and print a helpful message instructing users how to resume
existing sessions.
Partially addresses #5247.
# What
Updates the install command in the changelog template (`cliff.toml`)
from
```
npm install -g codex@version
```
to
```
npm install -g @openai/codex@<version>
```
# Why
The current command is incorrect, it tries installs the old “codex”
static site generator rather than the OpenAI Codex CLI.
# How
Edited only the header string in `cliff.toml` to point to
@openai/codex@<version>. No changelog regeneration or other files
touched.
Fixes#2059
Co-authored-by: Thibault Sottiaux <tibo@openai.com>
Fixes#4870#4717#3260#4431#2718#4898#5036
- Fix the chat composer “phantom space” bug that appeared when
backspacing CJK (and other double-width) characters after the composer
got a uniform background in 43b63ccae89c….
- Pull diff_buffers’s clear-to-end logic forward to iterate by display
width, so wide graphemes are counted correctly when computing the
trailing column.
- Keep modifier-aware detection so styled cells are still flushed, and
add a regression test (diff_buffers_clear_to_end_starts_after_wide_char)
that covers the CJK deletion scenario.
---------
Co-authored-by: Josh McKinney <joshka@openai.com>
This change ensures that we store the absolute time instead of relative
offsets of when the primary and secondary rate limits will reset.
Previously these got recalculated relative to current time, which leads
to the displayed reset times to change over time, including after doing
a codex resume.
For previously changed sessions, this will cause the reset times to not
show due to this being a breaking change:
<img width="524" height="55" alt="Screenshot 2025-10-17 at 5 14 18 PM"
src="https://github.com/user-attachments/assets/53ebd43e-da25-4fef-9c47-94a529d40265"
/>
Fixes https://github.com/openai/codex/issues/4761
I dropped the build of the old cli from the flake, where the default.nix
already seemed to removed in a previous iterations. Then I updated
flake.nix and codex-rs expression to be able to build again (see
individual commits for details).
Tested by running the following builds:
```
$ nix build .#packages.x86_64-linux.codex-rs
$ nix build .#packages.aarch64-darwin.codex-cli
```
---------
Signed-off-by: Jörg Thalheim <joerg@thalheim.io>
`ParsedCommand::Read` has a `name` field that attempts to identify the
name of the file being read, but the file may not be in the `cwd` in
which the command is invoked as demonstrated by this existing unit test:
0139f6780c/codex-rs/core/src/parse_command.rs (L250-L260)
As you can see, `tui/Cargo.toml` is the relative path to the file being
read.
This PR introduces a new `path: PathBuf` field to `ParsedCommand::Read`
that attempts to capture this information. When possible, this is an
absolute path, though when relative, it should be resolved against the
`cwd` that will be used to run the command to derive the absolute path.
This should make it easier for clients to provide UI for a "read file"
event that corresponds to the command execution.
This makes stdio mcp servers more flexible by allowing users to specify
the cwd to run the server command from and adding additional environment
variables to be passed through to the server.
Example config using the test server in this repo:
```toml
[mcp_servers.test_stdio]
cwd = "/Users/<user>/code/codex/codex-rs"
command = "cargo"
args = ["run", "--bin", "test_stdio_server"]
env_vars = ["MCP_TEST_VALUE"]
```
@bolinfest I know you hate these env var tests but let's roll with this
for now. I may take a stab at the env guard + serial macro at some
point.
This adds two new config fields to streamable http mcp servers:
`http_headers`: a map of key to value
`env_http_headers` a map of key to env var which will be resolved at
request time
All headers will be passed to all MCP requests to that server just like
authorization headers.
There is a test ensuring that headers are not passed to other servers.
Fixes#5180
## Summary
When using the trusted state during tui startup, we created a new
WorkspaceWrite policy without checking the config.toml for a
`sandbox_workspace_write` field. This would result in us setting the
sandbox_mode as workspace-write, but ignoring the field if the user had
set `sandbox_workspace_write` without also setting `sandbox_mode` in the
config.toml. This PR adds support for respecting
`sandbox_workspace_write` setting in config.toml in the trusted
directory flow, and adds tests to cover this case.
## Testing
- [x] Added unit tests
Also: fixed the contents of the `APPLE_CERTIFICATE_P12` and
`APPLE_CERTIFICATE_PASSWORD` secrets, so the code-signing step will use
the right certificate now.
## Summary
- add a kill buffer to the text area and wire Ctrl+Y to yank it
- capture text from Ctrl+W, Ctrl+U, and Ctrl+K operations into the kill
buffer
- add regression coverage ensuring the last kill can be yanked back
Fixes#5017
------
https://chatgpt.com/codex/tasks/task_i_68e95bf06c48832cbf3d2ba8fa2035d2
This adds `parsed_cmd: Vec<ParsedCommand>` to `ExecApprovalRequestEvent`
in the core protocol (`protocol/src/protocol.rs`), which is also what
this field is named on `ExecCommandBeginEvent`. Honestly, I don't love
the name (it sounds like a single command, but it is actually a list of
them), but I don't want to get distracted by a naming discussion right
now.
This also adds `parsed_cmd` to `ExecCommandApprovalParams` in
`codex-rs/app-server-protocol/src/protocol.rs`, so it will be available
via `codex app-server`, as well.
For consistency, I also updated `ExecApprovalElicitRequestParams` in
`codex-rs/mcp-server/src/exec_approval.rs` to include this field under
the name `codex_parsed_cmd`, as that struct already has a number of
special `codex_*` fields. Note this is the code for when Codex is used
as an MCP _server_ and therefore has to conform to the official spec for
an MCP elicitation type.
Note these two types were identical, so it seems clear to standardize on the one in `codex_protocol` and eliminate the `Into` stuff.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/5218).
* #5222
* __->__ #5218
1. If Codex detects that a `codex mcp add -url …` server supports oauth,
it will auto-initiate the login flow.
2. If the TUI starts and a MCP server supports oauth but isn't logged
in, it will give the user an explicit warning telling them to log in.
Tightened the CLI integration tests to stop relying on wall-clock
sleeps—new fs watcher helper waits for session files instead of timing
out, and SSE mocks/fixtures make the flows deterministic.
keep a 1 cell margin at the right edge of the screen in the composer
(and in the user message in history).
this lets us print clear-to-EOL 1 char before the end of the line in
history, so that resizing the terminal will keep the background color
(at least in iterm/terminal.app). it also stops the cursor in the
textarea from floating off the right edge.
---------
Co-authored-by: joshka-oai <joshka@openai.com>
Add proper feature flag instead of having custom flags for everything.
This is just for experimental/wip part of the code
It can be used through CLI:
```bash
codex --enable unified_exec --disable view_image_tool
```
Or in the `config.toml`
```toml
# Global toggles applied to every profile unless overridden.
[features]
apply_patch_freeform = true
view_image_tool = false
```
Follow-up:
In a following PR, the goal is to have a default have `bundles` of
features that we can associate to a model
Refactor trust_directory to use ColumnRenderable & friends, thus
correcting wrapping behavior at small widths. Also introduce
RowRenderable with fixed-width rows.
- fixed wrapping in trust_directory
- changed selector cursor to match other list item selections
- allow y/n to work as well as 1/2
- fixed key_hint to be standard
before:
<img width="661" height="550" alt="Screenshot 2025-10-09 at 9 50 36 AM"
src="https://github.com/user-attachments/assets/e01627aa-bee4-4e25-8eca-5575c43f05bf"
/>
after:
<img width="661" height="550" alt="Screenshot 2025-10-09 at 9 51 31 AM"
src="https://github.com/user-attachments/assets/cb816cbd-7609-4c83-b62f-b4dba392d79a"
/>
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
This adds a queryable auth status for MCP servers which is useful:
1. To determine whether a streamable HTTP server supports auth or not
based on whether or not it supports RFC 8414-3.2
2. Allow us to build a better user experience on top of MCP status
Instead of querying all 256 terminal colors on startup, which was slow
in some terminals, hardcode the default xterm palette.
Additionally, tweak the shimmer so that it blends between default_fg and
default_bg, instead of "dark gray" (according to the terminal) and pure
white (regardless of terminal theme).
Clear the history cursor before checking for duplicate submissions so
sending the same message twice exits history mode. This prevents Up/Down
from staying stuck in history browsing after duplicate sends.
1. You can now add streamable http servers via the CLI
2. As part of this, I'm also changing the existing bearer_token plain
text config field with ane env var
```
mcp add github --url https://api.githubcopilot.com/mcp/ --bearer-token-env-var=GITHUB_PAT
```
This lets users/companies explicitly choose whether to force/disallow
the keyring/fallback file storage for mcp credentials.
People who develop with Codex will want to use this until we sign
binaries or else each ad-hoc debug builds will require keychain access
on every build. I don't love this and am open to other ideas for how to
handle that.
```toml
mcp_oauth_credentials_store = "auto"
mcp_oauth_credentials_store = "file"
mcp_oauth_credentials_store = "keyrung"
```
Defaults to `auto`
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
## Summary
- ensure the TypeScript SDK sets CODEX_INTERNAL_ORIGINATOR_OVERRIDE to
codex_sdk_ts when spawning the Codex CLI
- extend the responses proxy test helper to capture request headers for
assertions
- add coverage that verifies Codex threads launched from the TypeScript
SDK send the codex_sdk_ts originator header
## Testing
- Not Run (not requested)
------
https://chatgpt.com/codex/tasks/task_i_68e561b125248320a487f129093d16e7
We use to put the review prompt in the first user message as well to
bypass statsig overrides, but now that's been resolved and instructions
are being respected, so we're duplicating the review instructions.
this fixes an issue where text lines with long words would sometimes
overflow.
- the default penalties for the OptimalFit algorithm allow overflowing
in some cases. this seems insane to me, and i considered just banning
the OptimalFit algorithm by disabling the 'smawk' feature on textwrap,
but decided to keep it and just bump the overflow penalty to ~infinity
since optimal fit does sometimes produce nicer wrappings. it's not clear
this is worth it, though, and maybe we should just dump the optimal fit
algorithm completely.
- user history messages weren't rendering with the same wrap algorithm
as used in the composer, which sometimes resulted in wrapping messages
differently in the history vs. in the composer.
Before this change:
```
tamird@L03G26TD12 codex-rs % codex
zsh: do you wish to see all 3864 possibilities (1285 lines)?
```
After this change:
```
tamird@L03G26TD12 codex-rs % codex
app-server -- [experimental] Run the app server
apply a -- Apply the latest diff produced by Codex agent as a `git apply` to your local working tree
cloud -- [EXPERIMENTAL] Browse tasks from Codex Cloud and apply changes locally
completion -- Generate shell completion scripts
debug -- Internal debugging commands
exec e -- Run Codex non-interactively
generate-ts -- Internal: generate TypeScript protocol bindings
help -- Print this message or the help of the given subcommand(s)
login -- Manage login
logout -- Remove stored authentication credentials
mcp -- [experimental] Run Codex as an MCP server and manage MCP servers
mcp-server -- [experimental] Run the Codex MCP server (stdio transport)
responses-api-proxy -- Internal: run the responses API proxy
resume -- Resume a previous interactive session (picker by default; use --last to continue the most recent)
```
when exiting a session that was started with `codex resume`, the note
about how to resume again wasn't being printed.
thanks @aibrahim-oai for pointing out this issue!
`http_config.auth_header` automatically added `Bearer `. By adding it
ourselves, we were sending `Bearer Bearer <token>`.
I confirmed that the GitHub MCP initialization 400s before and works
now.
I also optimized the oauth flow to not check the keyring if you
explicitly pass in a bearer token.
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
## Summary
- add a `codex sandbox` subcommand with macOS and Linux targets while
keeping the legacy `codex debug` aliases
- update documentation to highlight the new sandbox entrypoints and
point existing references to the new command
- clarify the core README about the linux sandbox helper alias
## Testing
- just fmt
- just fix -p codex-cli
- cargo test -p codex-cli
------
https://chatgpt.com/codex/tasks/task_i_68e2e00ca1e8832d8bff53aa0b50b49e
## Summary
- replace manual wiremock SSE mounts in the compact suite with the
shared response helpers
- simplify the exec auth_env integration test by using the
mount_sse_once_match helper
- rely on mount_sse_sequence plus server request collection to replace
the bespoke SeqResponder utility in tests
## Testing
- just fmt
------
https://chatgpt.com/codex/tasks/task_i_68e2e238f2a88320a337f0b9e4098093
## Summary
- expand the TypeScript SDK README with streaming, architecture, and API
docs
- refresh quick start examples and clarify thread management options
## Testing
- Not Run (docs only)
---------
Co-authored-by: pakrym-oai <pakrym@openai.com>
## Summary
- add a reusable `ev_response_created` helper that builds
`response.created` SSE events for integration tests
- update the exec and core integration suites to use the new helper
instead of repeating manual JSON literals
- keep the streaming fixtures consistent by relying on the shared helper
in every touched test
## Testing
- `just fmt`
------
https://chatgpt.com/codex/tasks/task_i_68e1fe885bb883208aafffb94218da61
`codex-responses-api-proxy` is designed so that there should be exactly
one copy of the API key in memory (that is `mlock`'d on UNIX), but in
practice, I was seeing two when I dumped the process data from
`/proc/$PID/mem`.
It appears that `std::io::stdin()` maintains an internal `BufReader`
that we cannot zero out, so this PR changes the implementation on UNIX
so that we use a low-level `read(2)` instead.
Even though it seems like it would be incredibly unlikely, we also make
this logic tolerant of short reads. Either `\n` or `EOF` must be sent to
signal the end of the key written to stdin.
## Summary
- replace manual event polling loops in several core test suites with
the shared wait_for_event helpers
- keep prior assertions intact by using closure captures for stateful
expectations, including plan updates, patch lifecycles, and review flow
checks
- rely on wait_for_event_with_timeout where longer waits are required,
simplifying timeout handling
## Testing
- just fmt
------
https://chatgpt.com/codex/tasks/task_i_68e1d58582d483208febadc5f90dd95e
## Summary
This PR is an alternative approach to #4711, but instead of changing our
storage, parses out shell calls in the client and reserializes them on
the fly before we send them out as part of the request.
What this changes:
1. Adds additional serialization logic when the
ApplyPatchToolType::Freeform is in use.
2. Adds a --custom-apply-patch flag to enable this setting on a
session-by-session basis.
This change is delicate, but is not meant to be permanent. It is meant
to be the first step in a migration:
1. (This PR) Add in-flight serialization with config
2. Update model_family default
3. Update serialization logic to store turn outputs in a structured
format, with logic to serialize based on model_family setting.
4. Remove this rewrite in-flight logic.
## Test Plan
- [x] Additional unit tests added
- [x] Integration tests added
- [x] Tested locally
In the past, we were treating `input exceeded context window` as a
streaming error and retrying on it. Retrying on it has no point because
it won't change the behavior. In this PR, we surface the error to the
client without retry and also send a token count event to indicate that
the context window is full.
<img width="650" height="125" alt="image"
src="https://github.com/user-attachments/assets/c26b1213-4c27-4bfc-90f4-51a270a3efd5"
/>
We truncate the output of exec commands to not blow the context window.
However, some cases we weren't doing that. This caused reports of people
with 76% context window left facing `input exceeded context window`
which is weird.
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
The `experimental_use_rmcp_client` flag is still useful to:
1. Toggle between stdio clients
2. Enable oauth beacuse we want to land
https://github.com/modelcontextprotocol/rust-sdk/pull/469,
https://github.com/openai/codex/pull/4677, and binary signing before we
enable it by default
However, for no-auth http servers, there is only one option so we don't
need the flag and it seems to be working pretty well.
Codex isn’t great yet on Windows outside of WSL, and while we’ve merged
https://github.com/openai/codex/pull/4269 to reduce the repetitive
manual approvals on readonly commands, we’ve noticed that users seem to
have more issues with GPT-5-Codex than with GPT-5 on Windows.
This change makes GPT-5 the default for Windows users while we continue
to improve the CLI harness and model for GPT-5-Codex on Windows.
## Summary
- Factor `load_config_as_toml` into `core::config_loader` so config
loading is reusable across callers.
- Layer `~/.codex/config.toml`, optional `~/.codex/managed_config.toml`,
and macOS managed preferences (base64) with recursive table merging and
scoped threads per source.
## Config Flow
```
Managed prefs (macOS profile: com.openai.codex/config_toml_base64)
▲
│
~/.codex/managed_config.toml │ (optional file-based override)
▲
│
~/.codex/config.toml (user-defined settings)
```
- The loader searches under the resolved `CODEX_HOME` directory
(defaults to `~/.codex`).
- Managed configs let administrators ship fleet-wide overrides via
device profiles which is useful for enforcing certain settings like
sandbox or approval defaults.
- For nested hash tables: overlays merge recursively. Child tables are
merged key-by-key, while scalar or array values replace the prior layer
entirely. This lets admins add or tweak individual fields without
clobbering unrelated user settings.
Apparently we were not running our `pnpm run prettier` check in CI, so
many files that were covered by the existing Prettier check were not
well-formatted.
This updates CI and formats the files.
This PR adds oauth login support to streamable http servers when
`experimental_use_rmcp_client` is enabled.
This PR is large but represents the minimal amount of work required for
this to work. To keep this PR smaller, login can only be done with
`codex mcp login` and `codex mcp logout` but it doesn't appear in `/mcp`
or `codex mcp list` yet. Fingers crossed that this is the last large MCP
PR and that subsequent PRs can be smaller.
Under the hood, credentials are stored using platform credential
managers using the [keyring crate](https://crates.io/crates/keyring).
When the keyring isn't available, it falls back to storing credentials
in `CODEX_HOME/.credentials.json` which is consistent with how other
coding agents handle authentication.
I tested this on macOS, Windows, WSL (ubuntu), and Linux. I wasn't able
to test the dbus store on linux but did verify that the fallback works.
One quirk is that if you have credentials, during development, every
build will have its own ad-hoc binary so the keyring won't recognize the
reader as being the same as the write so it may ask for the user's
password. I may add an override to disable this or allow
users/enterprises to opt-out of the keyring storage if it causes issues.
<img width="5064" height="686" alt="CleanShot 2025-09-30 at 19 31 40"
src="https://github.com/user-attachments/assets/9573f9b4-07f1-4160-83b8-2920db287e2d"
/>
<img width="745" height="486" alt="image"
src="https://github.com/user-attachments/assets/9562649b-ea5f-4f22-ace2-d0cb438b143e"
/>
This issue was due to the fact that the timeout is not always sufficient
to have enough character for truncation + a race between synthetic
timeout and process kill
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
This updates `codex exec` so that, by default, most of the agent's
activity is written to stderr so that only the final agent message is
written to stdout. This makes it easier to pipe `codex exec` into
another tool without extra filtering.
I introduced `#![deny(clippy::print_stdout)]` to help enforce this
change and renamed the `ts_println!()` macro to `ts_msg()` because (1)
it no longer calls `println!()` and (2), `ts_eprintln!()` seemed too
long of a name.
While here, this also adds `-o` as an alias for `--output-last-message`.
Fixes https://github.com/openai/codex/issues/1670
# Tool System Refactor
- Centralizes tool definitions and execution in `core/src/tools/*`:
specs (`spec.rs`), handlers (`handlers/*`), router (`router.rs`),
registry/dispatch (`registry.rs`), and shared context (`context.rs`).
One registry now builds the model-visible tool list and binds handlers.
- Router converts model responses to tool calls; Registry dispatches
with consistent telemetry via `codex-rs/otel` and unified error
handling. Function, Local Shell, MCP, and experimental `unified_exec`
all flow through this path; legacy shell aliases still work.
- Rationale: reduce per‑tool boilerplate, keep spec/handler in sync, and
make adding tools predictable and testable.
Example: `read_file`
- Spec: `core/src/tools/spec.rs` (see `create_read_file_tool`,
registered by `build_specs`).
- Handler: `core/src/tools/handlers/read_file.rs` (absolute `file_path`,
1‑indexed `offset`, `limit`, `L#: ` prefixes, safe truncation).
- E2E test: `core/tests/suite/read_file.rs` validates the tool returns
the requested lines.
## Next steps:
- Decompose `handle_container_exec_with_params`
- Add parallel tool calls
Previously, users could supply their API key directly via:
```shell
codex login --api-key KEY
```
but this has the drawback that `KEY` is more likely to end up in shell
history, can be read from `/proc`, etc.
This PR removes support for `--api-key` and replaces it with
`--with-api-key`, which reads the key from stdin, so either of these are
better options:
```
printenv OPENAI_API_KEY | codex login --with-api-key
codex login --with-api-key < my_key.txt
```
Other CLIs, such as `gh auth login --with-token`, follow the same
practice.
Before when you would enter `/di`, hit tab on `/diff`, and then hit
enter, it would execute `/diff`. But now it's just sending it as a text.
This fixes the issue.
### Summary
* Updated fuzzy search result to include the file name.
* This should not affect CLI usage and the UI there will be addressed in
a separate PR.
### Testing
Tested locally and with the extension.
### Screenshot
<img width="431" height="244" alt="Screenshot 2025-10-02 at 11 08 44 AM"
src="https://github.com/user-attachments/assets/ba2ca299-a81d-4453-9242-1750e945aea2"
/>
---------
Co-authored-by: shijie.rao <shijie.rao@squareup.com>
- prefix command approval reasons with "Reason:"
- show keyboard shortcuts for some ListSelectionItems
- remove "description" lines for approval options, and make the labels
more verbose
- add a spacer line in diff display after the path
and some other minor refactors that go along with the above.
<img width="859" height="508" alt="Screenshot 2025-10-02 at 1 24 50 PM"
src="https://github.com/user-attachments/assets/4fa7ecaf-3d3a-406a-bb4d-23e30ce3e5cf"
/>
Fixes#4176
Some common tools provide a schema (even if just an empty object schema)
as the value for `additionalProperties`. The parsing as it currently
stands fails when it encounters this. This PR updates the schema to
accept a schema object in addition to a boolean value, per the JSON
Schema spec.
We get spurrious reports that the model writes fenced code blocks
without an info tag which then causes auto-language detection in the
extension to incorrectly highlight the code and show the wrong language.
The model should really always include a tag when it can.
This PR fixes a bug that results in a hang in the oauth login flow if a
user logs in, then logs out, then logs in again without first closing
the browser window.
Root cause of problem: We use a local web server for the oauth flow, and
it's implemented using the `tiny_http` rust crate. During the first
login, a socket is created between the browser and the server. The
`tiny_http` library creates worker threads that persist for as long as
this socket remains open. Currently, there's no way to close the
connection on the server side — the library provides no API to do this.
The library also filters all "Connect: close" headers, which makes it
difficult to tell the client browser to close the connection. On the
second login attempt, the browser uses the existing connection rather
than creating a new one. Since that connection is associated with a
server instance that no longer exists, it is effectively ignored.
I considered switching from `tiny_http` to a different web server
library, but that would have been a big change with significant
regression risk. This PR includes a more surgical fix that works around
the limitation of `tiny_http` and sends a "Connect: close" header on the
last "success" page of the oauth flow.
Before this PR:
```typescript
export type RequestId = string | bigint;
```
After:
```typescript
export type RequestId = string | number;
```
`bigint` introduces headaches in TypeScript without providing any real
value.
I just had to use this like so:
```
./codex-rs/scripts/create_github_release --publish-alpha --emergency-version-override 0.43.0-alpha.10
```
because the build for `0.43.0-alpha.9` failed:
https://github.com/openai/codex/actions/runs/18167317356
## Summary
- show the remaining context window percentage in `/status` alongside
existing token usage details
- replace the composer shortcut prompt with the context window
percentage (or an unavailable message) while a task is running
- update TUI snapshots to reflect the new context window line
## Testing
- cargo test -p codex-tui
------
https://chatgpt.com/codex/tasks/task_i_68dc6e7397ac8321909d7daff25a396c
## Summary
- show a dim “(no output)” placeholder when an executed command produces
no stdout or stderr so empty runs are visible
- update TUI snapshots to include the new placeholder in history
renderings
## Testing
- cargo test -p codex-tui
------
https://chatgpt.com/codex/tasks/task_i_68dc056c1d5883218fe8d9929e9b1657
**Summary**
This PR fixes an issue in the device code login flow where trailing
slashes in the issuer URL could cause malformed URLs during codex token
exchange step
**Test**
Before the changes
`Error logging in with device code: device code exchange failed: error
decoding response body`
After the changes
`Successfully logged in`
Implement command safety for PowerShell commands on Windows
This change adds a new Windows-specific command-safety module under
`codex-rs/core/src/command_safety/windows_safe_commands.rs` to strictly
sanitise PowerShell invocations. Key points:
- Introduce `is_safe_command_windows()` to only allow explicitly
read-only PowerShell calls.
- Parse and split PowerShell invocations (including inline `-Command`
scripts and pipelines).
- Block unsafe switches (`-File`, `-EncodedCommand`, `-ExecutionPolicy`,
unknown flags, call operators, redirections, separators).
- Whitelist only read-only cmdlets (`Get-ChildItem`, `Get-Content`,
`Select-Object`, etc.), safe Git subcommands (`status`, `log`, `show`,
`diff`, `cat-file`), and ripgrep without unsafe options.
- Add comprehensive unit tests covering allowed and rejected command
patterns (nested calls, side effects, chaining, redirections).
This ensures Codex on Windows can safely execute discover-only
PowerShell workflows without risking destructive operations.
There was a bit of copypasta I put up with when were publishing two
packages to npm, but now that it's three, I created some more scripts to
consolidate things.
With this change, I ran:
```shell
./scripts/stage_npm_packages.py --release-version 0.43.0-alpha.8 --package codex --package codex-responses-api-proxy --package codex-sdk
```
Indeed when it finished, I ended up with:
```shell
$ tree dist
dist
└── npm
├── codex-npm-0.43.0-alpha.8.tgz
├── codex-responses-api-proxy-npm-0.43.0-alpha.8.tgz
└── codex-sdk-npm-0.43.0-alpha.8.tgz
$ tar tzvf dist/npm/codex-sdk-npm-0.43.0-alpha.8.tgz
-rwxr-xr-x 0 0 0 25476720 Oct 26 1985 package/vendor/aarch64-apple-darwin/codex/codex
-rwxr-xr-x 0 0 0 29871400 Oct 26 1985 package/vendor/aarch64-unknown-linux-musl/codex/codex
-rwxr-xr-x 0 0 0 28368096 Oct 26 1985 package/vendor/x86_64-apple-darwin/codex/codex
-rwxr-xr-x 0 0 0 36029472 Oct 26 1985 package/vendor/x86_64-unknown-linux-musl/codex/codex
-rw-r--r-- 0 0 0 10926 Oct 26 1985 package/LICENSE
-rw-r--r-- 0 0 0 30187520 Oct 26 1985 package/vendor/aarch64-pc-windows-msvc/codex/codex.exe
-rw-r--r-- 0 0 0 35277824 Oct 26 1985 package/vendor/x86_64-pc-windows-msvc/codex/codex.exe
-rw-r--r-- 0 0 0 4842 Oct 26 1985 package/dist/index.js
-rw-r--r-- 0 0 0 1347 Oct 26 1985 package/package.json
-rw-r--r-- 0 0 0 9867 Oct 26 1985 package/dist/index.js.map
-rw-r--r-- 0 0 0 12 Oct 26 1985 package/README.md
-rw-r--r-- 0 0 0 4287 Oct 26 1985 package/dist/index.d.ts
```
# Extract and Centralize Sandboxing
- Goal: Improve safety and clarity by centralizing sandbox planning and
execution.
- Approach:
- Add planner (ExecPlan) and backend registry (Direct/Seatbelt/Linux)
with run_with_plan.
- Refactor codex.rs to plan-then-execute; handle failures/escalation via
the plan.
- Delegate apply_patch to the codex binary and run it with an empty env
for determinism.
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
We continue the separation between `codex app-server` and `codex
mcp-server`.
In particular, we introduce a new crate, `codex-app-server-protocol`,
and migrate `codex-rs/protocol/src/mcp_protocol.rs` into it, renaming it
`codex-rs/app-server-protocol/src/protocol.rs`.
Because `ConversationId` was defined in `mcp_protocol.rs`, we move it
into its own file, `codex-rs/protocol/src/conversation_id.rs`, and
because it is referenced in a ton of places, we have to touch a lot of
files as part of this PR.
We also decide to get away from proper JSON-RPC 2.0 semantics, so we
also introduce `codex-rs/app-server-protocol/src/jsonrpc_lite.rs`, which
is basically the same `JSONRPCMessage` type defined in `mcp-types`
except with all of the `"jsonrpc": "2.0"` removed.
Getting rid of `"jsonrpc": "2.0"` makes our serialization logic
considerably simpler, as we can lean heavier on serde to serialize
directly into the wire format that we use now.
Manually curating `protocol-ts/src/lib.rs` was error-prone, as expected.
I finally asked Codex to write some Rust macros so we can ensure that:
- For every variant of `ClientRequest` and `ServerRequest`, there is an
associated `params` and `response` type.
- All response types are included automatically in the output of `codex
generate-ts`.
I don't believe there is any upside in making process hardening opt-in
for Codex CLI releases. If you want to tinker with Codex CLI, then build
from source (or run as `root`)?
Fixes:
- Removed overdeclaration of types that were unnecessary because they
were already included by induction.
- Reordered list of response types to match the enum order, making it
easier to identify what was missing.
- Added `ExecArbitraryCommandResponse` because it was missing.
- Leveraged `use codex_protocol::mcp_protocol::*;` to make the file more
readable.
- Removed crate dependency on `mcp-types` now that we have separate the
app server from the MCP server:
https://github.com/openai/codex/pull/4471
My next move is to come up with some scheme that ensures request types
always have a response type and that the response type is automatically
included with the output of `codex generate-ts`.
This ensures changes the generated TypeScript type for `ClientRequest`
so that instead of this:
```typescript
/**
* Request from the client to the server.
*/
export type ClientRequest =
| { method: "initialize"; id: RequestId; params: InitializeParams }
| { method: "newConversation"; id: RequestId; params: NewConversationParams }
// ...
| { method: "getUserAgent"; id: RequestId }
| { method: "userInfo"; id: RequestId }
// ...
```
we have this:
```typescript
/**
* Request from the client to the server.
*/
export type ClientRequest =
| { method: "initialize"; id: RequestId; params: InitializeParams }
| { method: "newConversation"; id: RequestId; params: NewConversationParams }
// ...
| { method: "getUserAgent"; id: RequestId; params: undefined }
| { method: "userInfo"; id: RequestId; params: undefined }
// ...
```
which makes TypeScript happier when it comes to destructuring instances
of `ClientRequest` because it does not complain about `params` not being
guaranteed to exist anymore.
Update prompt to prevent codex to use Python script or fancy commands to
edit files.
## Testing:
3 scenarios have been considered:
1. Rename codex to meca_code. Proceed to the whole refactor file by
file. Don't ask for approval at each step
2. Add a description to every single function you can find in the repo
3. Rewrite codex.rs in a more idiomatic way. Make sure to touch ONLY
this file and that clippy does not complain at the end
Before this update, 22% (estimation as it's sometimes hard to find all
the creative way the model find to edit files) of the file editions
where made using something else than a raw `apply_patch`
After this update, not a single edition without `apply_patch` was found
[EDIT]
I managed to have a few `["bash", "-lc", "apply_path"]` when reaching <
10% context left
Here's the logic:
1. If text is empty and selector is open:
- Enter on a prompt without args should autosubmit the prompt
- Enter on a prompt with numeric args should add `/prompts:name ` to the
text input
- Enter on a prompt with named args should add `/prompts:name ARG1=""
ARG2=""` to the text input
2. If text is not empty but no args are passed:
- For prompts with numeric args -> we allow it to submit (params are
optional)
- For prompts with named args -> we throw an error (all params should
have values)
<img width="454" height="246" alt="Screenshot 2025-09-23 at 2 23 21 PM"
src="https://github.com/user-attachments/assets/fd180a1b-7d17-42ec-b231-8da48828b811"
/>
This is a very large PR with some non-backwards-compatible changes.
Historically, `codex mcp` (or `codex mcp serve`) started a JSON-RPC-ish
server that had two overlapping responsibilities:
- Running an MCP server, providing some basic tool calls.
- Running the app server used to power experiences such as the VS Code
extension.
This PR aims to separate these into distinct concepts:
- `codex mcp-server` for the MCP server
- `codex app-server` for the "application server"
Note `codex mcp` still exists because it already has its own subcommands
for MCP management (`list`, `add`, etc.)
The MCP logic continues to live in `codex-rs/mcp-server` whereas the
refactored app server logic is in the new `codex-rs/app-server` folder.
Note that most of the existing integration tests in
`codex-rs/mcp-server/tests/suite` were actually for the app server, so
all the tests have been moved with the exception of
`codex-rs/mcp-server/tests/suite/mod.rs`.
Because this is already a large diff, I tried not to change more than I
had to, so `codex-rs/app-server/tests/common/mcp_process.rs` still uses
the name `McpProcess` for now, but I will do some mechanical renamings
to things like `AppServer` in subsequent PRs.
While `mcp-server` and `app-server` share some overlapping functionality
(like reading streams of JSONL and dispatching based on message types)
and some differences (completely different message types), I ended up
doing a bit of copypasta between the two crates, as both have somewhat
similar `message_processor.rs` and `outgoing_message.rs` files for now,
though I expect them to diverge more in the near future.
One material change is that of the initialize handshake for `codex
app-server`, as we no longer use the MCP types for that handshake.
Instead, we update `codex-rs/protocol/src/mcp_protocol.rs` to add an
`Initialize` variant to `ClientRequest`, which takes the `ClientInfo`
object we need to update the `USER_AGENT_SUFFIX` in
`codex-rs/app-server/src/message_processor.rs`.
One other material change is in
`codex-rs/app-server/src/codex_message_processor.rs` where I eliminated
a use of the `send_event_as_notification()` method I am generally trying
to deprecate (because it blindly maps an `EventMsg` into a
`JSONNotification`) in favor of `send_server_notification()`, which
takes a `ServerNotification`, as that is intended to be a custom enum of
all notification types supported by the app server. So to make this
update, I had to introduce a new variant of `ServerNotification`,
`SessionConfigured`, which is a non-backwards compatible change with the
old `codex mcp`, and clients will have to be updated after the next
release that contains this PR. Note that
`codex-rs/app-server/tests/suite/list_resume.rs` also had to be update
to reflect this change.
I introduced `codex-rs/utils/json-to-toml/src/lib.rs` as a small utility
crate to avoid some of the copying between `mcp-server` and
`app-server`.
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
# test
```
codex-rs % export CODEX_DEVICE_AUTH_BASE_URL=http://localhost:3007
codex-rs % cargo run --bin codex login --experimental_use-device-code
Compiling codex-login v0.0.0 (/Users/rakesh/code/codex/codex-rs/login)
Compiling codex-mcp-server v0.0.0 (/Users/rakesh/code/codex/codex-rs/mcp-server)
Compiling codex-tui v0.0.0 (/Users/rakesh/code/codex/codex-rs/tui)
Compiling codex-cli v0.0.0 (/Users/rakesh/code/codex/codex-rs/cli)
Finished `dev` profile [unoptimized + debuginfo] target(s) in 2.90s
Running `target/debug/codex login --experimental_use-device-code`
To authenticate, enter this code when prompted: 6Q27-KBVRF with interval 5
^C
```
The error in the last line is since the poll endpoint is not yet
implemented
Fixing the "? for shortcuts"
- Only show the hint when composer is empty
- Don't reset footer on new task updates
- Reorder the elements
- Align the "?" and "/" with overlay on and off
Based on #4364
### Title
## otel
Codex can emit [OpenTelemetry](https://opentelemetry.io/) **log events**
that
describe each run: outbound API requests, streamed responses, user
input,
tool-approval decisions, and the result of every tool invocation. Export
is
**disabled by default** so local runs remain self-contained. Opt in by
adding an
`[otel]` table and choosing an exporter.
```toml
[otel]
environment = "staging" # defaults to "dev"
exporter = "none" # defaults to "none"; set to otlp-http or otlp-grpc to send events
log_user_prompt = false # defaults to false; redact prompt text unless explicitly enabled
```
Codex tags every exported event with `service.name = "codex-cli"`, the
CLI
version, and an `env` attribute so downstream collectors can distinguish
dev/staging/prod traffic. Only telemetry produced inside the
`codex_otel`
crate—the events listed below—is forwarded to the exporter.
### Event catalog
Every event shares a common set of metadata fields: `event.timestamp`,
`conversation.id`, `app.version`, `auth_mode` (when available),
`user.account_id` (when available), `terminal.type`, `model`, and
`slug`.
With OTEL enabled Codex emits the following event types (in addition to
the
metadata above):
- `codex.api_request`
- `cf_ray` (optional)
- `attempt`
- `duration_ms`
- `http.response.status_code` (optional)
- `error.message` (failures)
- `codex.sse_event`
- `event.kind`
- `duration_ms`
- `error.message` (failures)
- `input_token_count` (completion only)
- `output_token_count` (completion only)
- `cached_token_count` (completion only, optional)
- `reasoning_token_count` (completion only, optional)
- `tool_token_count` (completion only)
- `codex.user_prompt`
- `prompt_length`
- `prompt` (redacted unless `log_user_prompt = true`)
- `codex.tool_decision`
- `tool_name`
- `call_id`
- `decision` (`approved`, `approved_for_session`, `denied`, or `abort`)
- `source` (`config` or `user`)
- `codex.tool_result`
- `tool_name`
- `call_id`
- `arguments`
- `duration_ms` (execution time for the tool)
- `success` (`"true"` or `"false"`)
- `output`
### Choosing an exporter
Set `otel.exporter` to control where events go:
- `none` – leaves instrumentation active but skips exporting. This is
the
default.
- `otlp-http` – posts OTLP log records to an OTLP/HTTP collector.
Specify the
endpoint, protocol, and headers your collector expects:
```toml
[otel]
exporter = { otlp-http = {
endpoint = "https://otel.example.com/v1/logs",
protocol = "binary",
headers = { "x-otlp-api-key" = "${OTLP_TOKEN}" }
}}
```
- `otlp-grpc` – streams OTLP log records over gRPC. Provide the endpoint
and any
metadata headers:
```toml
[otel]
exporter = { otlp-grpc = {
endpoint = "https://otel.example.com:4317",
headers = { "x-otlp-meta" = "abc123" }
}}
```
If the exporter is `none` nothing is written anywhere; otherwise you
must run or point to your
own collector. All exporters run on a background batch worker that is
flushed on
shutdown.
If you build Codex from source the OTEL crate is still behind an `otel`
feature
flag; the official prebuilt binaries ship with the feature enabled. When
the
feature is disabled the telemetry hooks become no-ops so the CLI
continues to
function without the extra dependencies.
---------
Co-authored-by: Anton Panasenko <apanasenko@openai.com>
This PR expands `.github/workflows/rust-release.yml` so that it also
builds and publishes the `npm` module for
`@openai/codex-responses-api-proxy` in addition to `@openai/codex`. Note
both `npm` modules are similar, in that they each contain a single `.js`
file that is a thin launcher around the appropriate native executable.
(Since we have a minimal dependency on Node.js, I also lowered the
minimum version from 20 to 16 and verified that works on my machine.)
As part of this change, we tighten up some of the docs around
`codex-responses-api-proxy` and ensure the details regarding protecting
the `OPENAI_API_KEY` in memory match the implementation.
To test the `npm` build process, I ran:
```
./codex-cli/scripts/build_npm_package.py --package codex-responses-api-proxy --version 0.43.0-alpha.3
```
which stages the `npm` module for `@openai/codex-responses-api-proxy` in
a temp directory, using the binary artifacts from
https://github.com/openai/codex/releases/tag/rust-v0.43.0-alpha.3.
This should make the `codex-responses-api-proxy` binaries available for
all platforms in a GitHub Release as well as a corresponding DotSlash
file.
Making `codex-responses-api-proxy` available as an `npm` module will be
done in a follow-up PR.
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/4404).
* __->__ #4406
* #4404
* #4403
This removes the `codex responses-api-proxy` subcommand in favor of
running it as a standalone CLI.
As part of this change, we:
- remove the dependency on `tokio`/`async/await` as well as `codex_arg0`
- introduce the use of `pre_main_hardening()` so `CODEX_SECURE_MODE=1`
is not required
---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/4404).
* #4406
* __->__ #4404
* #4403
This is likely the reason that I saw some conversations "freeze up" when
using the proxy.
Note the client in `core` does not specify a timeout when making
requests to the Responses API, so the proxy should not, either.
This PR adds support for streamable HTTP MCP servers when the
`experimental_use_rmcp_client` is enabled.
To set one up, simply add a new mcp server config with the url:
```
[mcp_servers.figma]
url = "http://127.0.0.1:3845/mcp"
```
It also supports an optional `bearer_token` which will be provided in an
authorization header. The full oauth flow is not supported yet.
The config parsing will throw if it detects that the user mixed and
matched config fields (like command + bearer token or url + env).
The best way to review it is to review `core/src` and then
`rmcp-client/src/rmcp_client.rs` first. The rest is tests and
propagating the `Transport` struct around the codebase.
Example with the Figma MCP:
<img width="5084" height="1614" alt="CleanShot 2025-09-26 at 13 35 40"
src="https://github.com/user-attachments/assets/eaf2771e-df3e-4300-816b-184d7dec5a28"
/>
- Render `send a message to load usage data` in the beginning of the
session
- Render `data not available yet` if received no rate limits
- nit case
- Deleted stall snapshots that were moved to
`codex-rs/tui/src/status/snapshots`
The [official Rust
SDK](57fc428c57)
has come a long way since we first started our mcp client implementation
5 months ago and, today, it is much more complete than our own
stdio-only implementation.
This PR introduces a new config flag `experimental_use_rmcp_client`
which will use a new mcp client powered by the sdk instead of our own.
To keep this PR simple, I've only implemented the same stdio MCP
functionality that we had but will expand on it with future PRs.
---------
Co-authored-by: pakrym-oai <pakrym@openai.com>
Adds a 1-per-turn todo-list item and item.updated event
```jsonl
{"type":"item.started","item":{"id":"item_6","item_type":"todo_list","items":[{"text":"Record initial two-step plan now","completed":false},{"text":"Update progress to next step","completed":false}]}}
{"type":"item.updated","item":{"id":"item_6","item_type":"todo_list","items":[{"text":"Record initial two-step plan now","completed":true},{"text":"Update progress to next step","completed":false}]}}
{"type":"item.completed","item":{"id":"item_6","item_type":"todo_list","items":[{"text":"Record initial two-step plan now","completed":true},{"text":"Update progress to next step","completed":false}]}}
```
Details are in `responses-api-proxy/README.md`, but the key contribution
of this PR is a new subcommand, `codex responses-api-proxy`, which reads
the auth token for use with the OpenAI Responses API from `stdin` at
startup and then proxies `POST` requests to `/v1/responses` over to
`https://api.openai.com/v1/responses`, injecting the auth token as part
of the `Authorization` header.
The expectation is that `codex responses-api-proxy` is launched by a
privileged user who has access to the auth token so that it can be used
by unprivileged users of the Codex CLI on the same host.
If the client only has one user account with `sudo`, one option is to:
- run `sudo codex responses-api-proxy --http-shutdown --server-info
/tmp/server-info.json` to start the server
- record the port written to `/tmp/server-info.json`
- relinquish their `sudo` privileges (which is irreversible!) like so:
```
sudo deluser $USER sudo || sudo gpasswd -d $USER sudo || true
```
- use `codex` with the proxy (see `README.md`)
- when done, make a `GET` request to the server using the `PORT` from
`server-info.json` to shut it down:
```shell
curl --fail --silent --show-error "http://127.0.0.1:$PORT/shutdown"
```
To protect the auth token, we:
- allocate a 1024 byte buffer on the stack and write `"Bearer "` into it
to start
- we then read from `stdin`, copying to the contents into the buffer
after the prefix
- after verifying the input looks good, we create a `String` from that
buffer (so the data is now on the heap)
- we zero out the stack-allocated buffer using
https://crates.io/crates/zeroize so it is not optimized away by the
compiler
- we invoke `.leak()` on the `String` so we can treat its contents as a
`&'static str`, as it will live for the rest of the processs
- on UNIX, we `mlock(2)` the memory backing the `&'static str`
- when using the `&'static str` when building an HTTP request, we use
`HeaderValue::from_static()` to avoid copying the `&str`
- we also invoke `.set_sensitive(true)` on the `HeaderValue`, which in
theory indicates to other parts of the HTTP stack that the header should
be treated with "special care" to avoid leakage:
439d1c50d7/src/header/value.rs (L346-L376)
- Refactor Exec Cell into its own module
- update exec command rendering to inline the first command line
- limit continuation lines
- always show trimmed output
Extracting tasks in a module and start abstraction behind a Trait (more
to come on this but each task will be tackled in a dedicated PR)
The goal was to drop the ActiveTask and to have a (potentially) set of
tasks during each turn
Certain shell commands are potentially dangerous, and we want to check
for them.
Unless the user has explicitly approved a command, we will *always* ask
them for approval
when one of these commands is encountered, regardless of whether they
are in a sandbox, or what their approval policy is.
The first (of probably many) such examples is `git reset --hard`. We
will be conservative and check for any `git reset`
This addresses bug #4092
Testing:
* Confirmed error occurs prior to fix if logging in using API key and no
`~/.codex` directory exists
* Confirmed after fix that `~/.codex` directory is properly created and
error doesn't occur
Adds a new `item.started` event to `codex exec` and implements it for
command_execution item type.
```jsonl
{"type":"session.created","session_id":"019982d1-75f0-7920-b051-e0d3731a5ed8"}
{"type":"item.completed","item":{"id":"item_0","item_type":"reasoning","text":"**Executing commands securely**\n\nI'm thinking about how the default harness typically uses \"bash -lc,\" while historically \"bash\" is what we've been using. The command should be executed as a string in our CLI, so using \"bash -lc 'echo hello'\" is optimal but calling \"echo hello\" directly feels safer. The sandbox makes sure environment variables like CODEX_SANDBOX_NETWORK_DISABLED=1 are set, so I won't ask for approval. I just need to run \"echo hello\" and correctly present the output."}}
{"type":"item.completed","item":{"id":"item_1","item_type":"reasoning","text":"**Preparing for tool calls**\n\nI realize that I need to include a preamble before making any tool calls. So, I'll first state the preamble in the commentary channel, then proceed with the tool call. After that, I need to present the final message along with the output. It's possible that the CLI will show the output inline, but I must ensure that I present the result clearly regardless. Let's move forward and get this organized!"}}
{"type":"item.completed","item":{"id":"item_2","item_type":"assistant_message","text":"Running `echo` to confirm shell access and print output."}}
{"type":"item.started","item":{"id":"item_3","item_type":"command_execution","command":"bash -lc echo hello","aggregated_output":"","exit_code":null,"status":"in_progress"}}
{"type":"item.completed","item":{"id":"item_3","item_type":"command_execution","command":"bash -lc echo hello","aggregated_output":"hello\n","exit_code":0,"status":"completed"}}
{"type":"item.completed","item":{"id":"item_4","item_type":"assistant_message","text":"hello"}}
```
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
This changes the reqwest client used in tests to be sandbox-friendly,
and skips a bunch of other tests that don't work inside the
sandbox/without network.
This pull request add a new experimental format of JSON output.
You can try it using `codex exec --experimental-json`.
Design takes a lot of inspiration from Responses API items and stream
format.
# Session and items
Each invocation of `codex exec` starts or resumes a session.
Session contains multiple high-level item types:
1. Assistant message
2. Assistant thinking
3. Command execution
4. File changes
5. To-do lists
6. etc.
# Events
Session and items are going through their life cycles which is
represented by events.
Session is `session.created` or `session.resumed`
Items are `item.added`, `item.updated`, `item.completed`,
`item.require_approval` (or other item types like `item.output_delta`
when we need streaming).
So a typical session can look like:
<details>
```
{
"type": "session.created",
"session_id": "01997dac-9581-7de3-b6a0-1df8256f2752"
}
{
"type": "item.completed",
"item": {
"id": "itm_0",
"item_type": "assistant_message",
"text": "I’ll locate the top-level README and remove its first line. Then I’ll show a quick summary of what changed."
}
}
{
"type": "item.completed",
"item": {
"id": "itm_1",
"item_type": "command_execution",
"command": "bash -lc ls -la | sed -n '1,200p'",
"aggregated_output": "pyenv: cannot rehash: /Users/pakrym/.pyenv/shims isn't writable\ntotal 192\ndrwxr-xr-x@ 33 pakrym staff 1056 Sep 24 14:36 .\ndrwxr-xr-x 41 pakrym staff 1312 Sep 24 09:17 ..\n-rw-r--r--@ 1 pakrym staff 6 Jul 9 16:16 .codespellignore\n-rw-r--r--@ 1 pakrym staff 258 Aug 13 09:40 .codespellrc\ndrwxr-xr-x@ 5 pakrym staff 160 Jul 23 08:26 .devcontainer\n-rw-r--r--@ 1 pakrym staff 6148 Jul 22 10:03 .DS_Store\ndrwxr-xr-x@ 15 pakrym staff 480 Sep 24 14:38 .git\ndrwxr-xr-x@ 12 pakrym staff 384 Sep 2 16:00 .github\n-rw-r--r--@ 1 pakrym staff 778 Jul 9 16:16 .gitignore\ndrwxr-xr-x@ 3 pakrym staff 96 Aug 11 09:37 .husky\n-rw-r--r--@ 1 pakrym staff 104 Jul 9 16:16 .npmrc\n-rw-r--r--@ 1 pakrym staff 96 Sep 2 08:52 .prettierignore\n-rw-r--r--@ 1 pakrym staff 170 Jul 9 16:16 .prettierrc.toml\ndrwxr-xr-x@ 5 pakrym staff 160 Sep 14 17:43 .vscode\ndrwxr-xr-x@ 2 pakrym staff 64 Sep 11 11:37 2025-09-11\n-rw-r--r--@ 1 pakrym staff 5505 Sep 18 09:28 AGENTS.md\n-rw-r--r--@ 1 pakrym staff 92 Sep 2 08:52 CHANGELOG.md\n-rw-r--r--@ 1 pakrym staff 1145 Jul 9 16:16 cliff.toml\ndrwxr-xr-x@ 11 pakrym staff 352 Sep 24 13:03 codex-cli\ndrwxr-xr-x@ 38 pakrym staff 1216 Sep 24 14:38 codex-rs\ndrwxr-xr-x@ 18 pakrym staff 576 Sep 23 11:01 docs\n-rw-r--r--@ 1 pakrym staff 2038 Jul 9 16:16 flake.lock\n-rw-r--r--@ 1 pakrym staff 1434 Jul 9 16:16 flake.nix\n-rw-r--r--@ 1 pakrym staff 10926 Jul 9 16:16 LICENSE\ndrwxr-xr-x@ 465 pakrym staff 14880 Jul 15 07:36 node_modules\n-rw-r--r--@ 1 pakrym staff 242 Aug 5 08:25 NOTICE\n-rw-r--r--@ 1 pakrym staff 578 Aug 14 12:31 package.json\n-rw-r--r--@ 1 pakrym staff 498 Aug 11 09:37 pnpm-lock.yaml\n-rw-r--r--@ 1 pakrym staff 58 Aug 11 09:37 pnpm-workspace.yaml\n-rw-r--r--@ 1 pakrym staff 2402 Jul 9 16:16 PNPM.md\n-rw-r--r--@ 1 pakrym staff 4393 Sep 12 14:36 README.md\ndrwxr-xr-x@ 4 pakrym staff 128 Sep 18 09:28 scripts\ndrwxr-xr-x@ 2 pakrym staff 64 Sep 11 11:34 tmp\n",
"exit_code": 0,
"status": "completed"
}
}
{
"type": "item.completed",
"item": {
"id": "itm_2",
"item_type": "reasoning",
"text": "**Reviewing README.md file**\n\nI've located the README.md file at the root, and it’s 4393 bytes. Now, I need to remove the first line, but first, I should check its content to make sure I’m patching it correctly. I’ll use sed to display the first 20 lines. By reviewing those lines, I can determine exactly what needs to be removed before I proceed with the editing. Let's do this carefully!"
}
}
{
"type": "item.completed",
"item": {
"id": "itm_3",
"item_type": "command_execution",
"command": "bash -lc sed -n '1,40p' README.md",
"aggregated_output": "<h1 align=\"center\">OpenAI Codex CLI</h1>\n\n<p align=\"center\"><code>npm i -g @openai/codex</code><br />or <code>brew install codex</code></p>\n\n<p align=\"center\"><strong>Codex CLI</strong> is a coding agent from OpenAI that runs locally on your computer.\n</br>\n</br>If you want Codex in your code editor (VS Code, Cursor, Windsurf), <a href=\"https://developers.openai.com/codex/ide\">install in your IDE</a>\n</br>If you are looking for the <em>cloud-based agent</em> from OpenAI, <strong>Codex Web</strong>, go to <a href=\"https://chatgpt.com/codex\">chatgpt.com/codex</a></p>\n\n<p align=\"center\">\n <img src=\"./.github/codex-cli-splash.png\" alt=\"Codex CLI splash\" width=\"80%\" />\n </p>\n\n---\n\n## Quickstart\n\n### Installing and running Codex CLI\n\nInstall globally with your preferred package manager. If you use npm:\n\n```shell\nnpm install -g @openai/codex\n```\n\nAlternatively, if you use Homebrew:\n\n```shell\nbrew install codex\n```\n\nThen simply run `codex` to get started:\n\n```shell\ncodex\n```\n\n<details>\n<summary>You can also go to the <a href=\"https://github.com/openai/codex/releases/latest\">latest GitHub Release</a> and download the appropriate binary for your platform.</summary>\n\n",
"exit_code": 0,
"status": "completed"
}
}
{
"type": "item.completed",
"item": {
"id": "itm_4",
"item_type": "reasoning",
"text": "**Updating README file**\n\nThe user wants to update the README by removing the first line, which is an HTML h1 tag. I'll proceed to delete that line while keeping the rest intact. To do this, I need to use the apply_patch tool and create a patch in the diff format. I’ll take care to include context lines around the change. After removing the header, I’m considering whether to delete the following blank line as well to keep it tidy."
}
}
{
"type": "item.completed",
"item": {
"id": "itm_5",
"item_type": "reasoning",
"text": "**Crafting the patch**\n\nThe user has asked to remove just the first line, without mentioning the blank lines. Technically, leaving a blank line at the start is acceptable, but it's not ideal. Ideally, I'd prefer to remove the first line while maintaining the subsequent blank line. However, if I can also remove that blank line, it could tidy things up even more. Still, since the user's request is strictly to remove only the first line, I'll stick to that and craft the patch accordingly."
}
}
{
"type": "item.completed",
"item": {
"id": "itm_6",
"item_type": "assistant_message",
"text": "Applying the change by removing the first line from `README.md`."
}
}
{
"type": "item.completed",
"item": {
"id": "itm_7",
"item_type": "file_change",
"changes": [
{
"path": "/Users/pakrym/code/codex/README.md",
"kind": "update"
}
],
"status": "completed"
}
}
{
"type": "item.completed",
"item": {
"id": "itm_8",
"item_type": "reasoning",
"text": "**Validating README Update**\n\nI updated the README.md and now I need to validate that the first line has been successfully removed. To do this, I'll check the first 10 lines of the document to ensure everything looks correct. It’s always good to double-check these kinds of changes, so I’m going to take that extra step to make sure the update was executed properly. Let's see what those initial lines say!"
}
}
{
"type": "item.completed",
"item": {
"id": "itm_9",
"item_type": "command_execution",
"command": "bash -lc sed -n '1,20p' README.md",
"aggregated_output": "\n<p align=\"center\"><code>npm i -g @openai/codex</code><br />or <code>brew install codex</code></p>\n\n<p align=\"center\"><strong>Codex CLI</strong> is a coding agent from OpenAI that runs locally on your computer.\n</br>\n</br>If you want Codex in your code editor (VS Code, Cursor, Windsurf), <a href=\"https://developers.openai.com/codex/ide\">install in your IDE</a>\n</br>If you are looking for the <em>cloud-based agent</em> from OpenAI, <strong>Codex Web</strong>, go to <a href=\"https://chatgpt.com/codex\">chatgpt.com/codex</a></p>\n\n<p align=\"center\">\n <img src=\"./.github/codex-cli-splash.png\" alt=\"Codex CLI splash\" width=\"80%\" />\n </p>\n\n---\n\n## Quickstart\n\n### Installing and running Codex CLI\n\nInstall globally with your preferred package manager. If you use npm:\n\n",
"exit_code": 0,
"status": "completed"
}
}
{
"type": "item.completed",
"item": {
"id": "itm_10",
"item_type": "assistant_message",
"text": "- Updated `README.md:1` by removing the opening H1 line.\n- The README now starts with the install instructions paragraph.\n\nWant me to also remove the leading blank line at the top?"
}
}
```
</details>
The idea is to give users fully formatted items they can use directly in
their rendering/application logic and avoid having them building up
items manually based on events (unless they want to for streaming).
This PR implements only the `item.completed` payload for some event
types, more event types and item types to come.
---------
Co-authored-by: Michael Bolin <mbolin@openai.com>
I would like to be able to swap in a different way to resolve model
sampling requests, so this refactoring consolidates things behind
`attempt_stream_responses()` to make that easier. Ideally, we would
support an in-memory backend that we can use in our integration tests,
for example.
Instead of overwriting the contents of the composer when pressing
<kbd>Esc</kbd> when there's a queued message, prepend the queued
message(s) to the composer draft.
Because the `codex` process could contain sensitive information in
memory, such as API keys, we add logic so that when
`CODEX_SECURE_MODE=1` is specified, we avail ourselves of whatever the
operating system provides to restrict observability/tampering, which
includes:
- disabling `ptrace(2)`, so it is not possible to attach to the process
with a debugger, such as `gdb`
- disabling core dumps
Admittedly, a user with root privileges can defeat these safeguards.
For now, we only add support for this in the `codex` multitool, but we
may ultimately want to support this in some of the smaller CLIs that are
buildable out of our Cargo workspace.
## Current State Observations
- `Session` currently holds many unrelated responsibilities (history,
approval queues, task handles, rollout recorder, shell discovery, token
tracking, etc.), making it hard to reason about ownership and lifetimes.
- The anonymous `State` struct inside `codex.rs` mixes session-long data
with turn-scoped queues and approval bookkeeping.
- Turn execution (`run_task`) relies on ad-hoc local variables that
should conceptually belong to a per-turn state object.
- External modules (`codex::compact`, tests) frequently poke the raw
`Session.state` mutex, which couples them to implementation details.
- Interrupts, approvals, and rollout persistence all have bespoke
cleanup paths, contributing to subtle bugs when a turn is aborted
mid-flight.
## Desired End State
- Keep a slim `Session` object that acts as the orchestrator and façade.
It should expose a focused API (submit, approvals, interrupts, event
emission) without storing unrelated fields directly.
- Introduce a `state` module that encapsulates all mutable data
structures:
- `SessionState`: session-persistent data (history, approved commands,
token/rate-limit info, maybe user preferences).
- `ActiveTurn`: metadata for the currently running turn (sub-id, task
kind, abort handle) and an `Arc<TurnState>`.
- `TurnState`: all turn-scoped pieces (pending inputs, approval waiters,
diff tracker, review history, auto-compact flags, last agent message,
outstanding tool call bookkeeping).
- Group long-lived helpers/managers into a dedicated `SessionServices`
struct so `Session` does not accumulate "random" fields.
- Provide clear, lock-safe APIs so other modules never touch raw
mutexes.
- Ensure every turn creates/drops a `TurnState` and that
interrupts/finishes delegate cleanup to it.
For the most part, we try to avoid environment variables in favor of
config options so the environment variables do not leak into child
processes. These environment variables are no longer honored, so let's
delete them to be clear.
Ultimately, I would also like to eliminate `CODEX_RS_SSE_FIXTURE` in
favor of something cleaner.
refactors command_safety files into its own package, so we can add
platform-specific ones
Also creates a windows-specific of `is_known_safe_command` that just
returns false always, since that is what happens today.
This eliminates a "bounce" at the end of streaming where we hide the
status indicator at the end of the turn and the composer moves up two
lines.
Also, simplify streaming further by removing the HistorySink and
inverting control, and collapsing a few single-element structures.
Bumps [chrono](https://github.com/chronotope/chrono) from 0.4.41 to
0.4.42.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/chronotope/chrono/releases">chrono's
releases</a>.</em></p>
<blockquote>
<h2>0.4.42</h2>
<h2>What's Changed</h2>
<ul>
<li>Add fuzzer for DateTime::parse_from_str by <a
href="https://github.com/tyler92"><code>@tyler92</code></a> in <a
href="https://redirect.github.com/chronotope/chrono/pull/1700">chronotope/chrono#1700</a></li>
<li>Fix wrong amount of micro/milliseconds by <a
href="https://github.com/nmlt"><code>@nmlt</code></a> in <a
href="https://redirect.github.com/chronotope/chrono/pull/1703">chronotope/chrono#1703</a></li>
<li>Add warning about MappedLocalTime and wasm by <a
href="https://github.com/lutzky"><code>@lutzky</code></a> in <a
href="https://redirect.github.com/chronotope/chrono/pull/1702">chronotope/chrono#1702</a></li>
<li>Fix incorrect parsing of fixed-length second fractions by <a
href="https://github.com/chris-leach"><code>@chris-leach</code></a> in
<a
href="https://redirect.github.com/chronotope/chrono/pull/1705">chronotope/chrono#1705</a></li>
<li>Fix cfgs for <code>wasm32-linux</code> support by <a
href="https://github.com/arjunr2"><code>@arjunr2</code></a> in <a
href="https://redirect.github.com/chronotope/chrono/pull/1707">chronotope/chrono#1707</a></li>
<li>Fix OpenHarmony's <code>tzdata</code> parsing by <a
href="https://github.com/ldm0"><code>@ldm0</code></a> in <a
href="https://redirect.github.com/chronotope/chrono/pull/1679">chronotope/chrono#1679</a></li>
<li>Convert NaiveDate to/from days since unix epoch by <a
href="https://github.com/findepi"><code>@findepi</code></a> in <a
href="https://redirect.github.com/chronotope/chrono/pull/1715">chronotope/chrono#1715</a></li>
<li>Add <code>?Sized</code> bound to related methods of
<code>DelayedFormat::write_to</code> by <a
href="https://github.com/Huliiiiii"><code>@Huliiiiii</code></a> in <a
href="https://redirect.github.com/chronotope/chrono/pull/1721">chronotope/chrono#1721</a></li>
<li>Add <code>from_timestamp_secs</code> method to <code>DateTime</code>
by <a href="https://github.com/jasonaowen"><code>@jasonaowen</code></a>
in <a
href="https://redirect.github.com/chronotope/chrono/pull/1719">chronotope/chrono#1719</a></li>
<li>Migrate to core::error::Error by <a
href="https://github.com/benbrittain"><code>@benbrittain</code></a> in
<a
href="https://redirect.github.com/chronotope/chrono/pull/1704">chronotope/chrono#1704</a></li>
<li>Upgrade to windows-bindgen 0.63 by <a
href="https://github.com/djc"><code>@djc</code></a> in <a
href="https://redirect.github.com/chronotope/chrono/pull/1730">chronotope/chrono#1730</a></li>
<li>strftime: simplify error handling by <a
href="https://github.com/djc"><code>@djc</code></a> in <a
href="https://redirect.github.com/chronotope/chrono/pull/1731">chronotope/chrono#1731</a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="f3fd15f976"><code>f3fd15f</code></a>
Bump version to 0.4.42</li>
<li><a
href="5cf5603500"><code>5cf5603</code></a>
strftime: add regression test case</li>
<li><a
href="a6231701ee"><code>a623170</code></a>
strftime: simplify error handling</li>
<li><a
href="36fbfb1221"><code>36fbfb1</code></a>
strftime: move specifier handling out of match to reduce rightward
drift</li>
<li><a
href="7f413c363b"><code>7f413c3</code></a>
strftime: yield None early</li>
<li><a
href="9d5dfe1640"><code>9d5dfe1</code></a>
strftime: outline constants</li>
<li><a
href="e5f6be7db4"><code>e5f6be7</code></a>
strftime: move error() method below caller</li>
<li><a
href="d516c2764d"><code>d516c27</code></a>
strftime: merge impl blocks</li>
<li><a
href="0ee2172fb9"><code>0ee2172</code></a>
strftime: re-order items to keep impls together</li>
<li><a
href="757a8b0226"><code>757a8b0</code></a>
Upgrade to windows-bindgen 0.63</li>
<li>Additional commits viewable in <a
href="https://github.com/chronotope/chrono/compare/v0.4.41...v0.4.42">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
This commit removes the `once_cell` dependency from `Cargo.toml` files
in the `codex-rs` and `apply-patch` directories, replacing its usage
with `std::sync::LazyLock` and `std::sync::OnceLock` where applicable.
This change simplifies the dependency tree and utilizes standard library
features for lazy initialization.
# External (non-OpenAI) Pull Request Requirements
Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
I am not sure what is going on, as
https://github.com/openai/codex/pull/3660 introduced this new logic and
I swear that CI was green before I merged that PR, but I am seeing
failures in this CI job this morning. This feels like a
non-backwards-compatible change in `gh`, but that feels unlikely...
Nevertheless, this is what I currently see on my laptop:
```
$ gh --version
gh version 2.76.2 (2025-07-30)
https://github.com/cli/cli/releases/tag/v2.76.2
$ gh run list --workflow .github/workflows/rust-release.yml --branch rust-v0.40.0 --json workflowName,url,headSha --jq 'first(.[])'
{
"headSha": "5268705a69713752adcbd8416ef9e84a683f7aa3",
"url": "https://github.com/openai/codex/actions/runs/17952349351",
"workflowName": ".github/workflows/rust-release.yml"
}
```
Looking at sample output from an old GitHub issue
(https://github.com/cli/cli/issues/6678), it appears that, at least at
one point in time, the `workflowName` was _not_ the path to the
workflow.
Bumps [tempfile](https://github.com/Stebalien/tempfile) from 3.20.0 to
3.22.0.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/Stebalien/tempfile/blob/master/CHANGELOG.md">tempfile's
changelog</a>.</em></p>
<blockquote>
<h2>3.22.0</h2>
<ul>
<li>Updated <code>windows-sys</code> requirement to allow version
0.61.x</li>
<li>Remove <code>unstable-windows-keep-open-tempfile</code>
feature.</li>
</ul>
<h2>3.21.0</h2>
<ul>
<li>Updated <code>windows-sys</code> requirement to allow version
0.60.x</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="f720dbe098"><code>f720dbe</code></a>
chore: release 3.22.0</li>
<li><a
href="55d742cb5d"><code>55d742c</code></a>
chore: remove deprecated unstable feature flag</li>
<li><a
href="bc41a0b586"><code>bc41a0b</code></a>
build(deps): update windows-sys requirement from >=0.52, <0.61 to
>=0.52, <0....</li>
<li><a
href="3c55387ede"><code>3c55387</code></a>
test: make sure we don't drop tempdirs early (<a
href="https://redirect.github.com/Stebalien/tempfile/issues/373">#373</a>)</li>
<li><a
href="17bf644406"><code>17bf644</code></a>
doc(builder): clarify permissions (<a
href="https://redirect.github.com/Stebalien/tempfile/issues/372">#372</a>)</li>
<li><a
href="c7423f1761"><code>c7423f1</code></a>
doc(env): document the alternative to setting the tempdir (<a
href="https://redirect.github.com/Stebalien/tempfile/issues/371">#371</a>)</li>
<li><a
href="5af60ca9e3"><code>5af60ca</code></a>
test(wasi): run a few tests that shouldn't have been disabled (<a
href="https://redirect.github.com/Stebalien/tempfile/issues/370">#370</a>)</li>
<li><a
href="6c0c56198a"><code>6c0c561</code></a>
fix(doc): temp_dir doesn't check if writable</li>
<li><a
href="48bff5f54c"><code>48bff5f</code></a>
test(tempdir): configure tempdir on wasi</li>
<li><a
href="704a1d2752"><code>704a1d2</code></a>
test(tempdir): cleanup tempdir tests and run more tests on wasi</li>
<li>Additional commits viewable in <a
href="https://github.com/Stebalien/tempfile/compare/v3.20.0...v3.22.0">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
[//]: # (dependabot-start)
⚠️ **Dependabot is rebasing this PR** ⚠️
Rebasing might not happen immediately, so don't worry if this takes some
time.
Note: if you make any changes to this PR yourself, they will take
precedence over the rebase.
---
[//]: # (dependabot-end)
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.224 to
1.0.226.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/serde-rs/serde/releases">serde's
releases</a>.</em></p>
<blockquote>
<h2>v1.0.226</h2>
<ul>
<li>Deduplicate variant matching logic inside generated Deserialize impl
for adjacently tagged enums (<a
href="https://redirect.github.com/serde-rs/serde/issues/2935">#2935</a>,
thanks <a
href="https://github.com/Mingun"><code>@Mingun</code></a>)</li>
</ul>
<h2>v1.0.225</h2>
<ul>
<li>Avoid triggering a deprecation warning in derived Serialize and
Deserialize impls for a data structure that contains its own
deprecations (<a
href="https://redirect.github.com/serde-rs/serde/issues/2879">#2879</a>,
thanks <a
href="https://github.com/rcrisanti"><code>@rcrisanti</code></a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="1799547846"><code>1799547</code></a>
Release 1.0.226</li>
<li><a
href="2dbeefb11b"><code>2dbeefb</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/serde/issues/2935">#2935</a>
from Mingun/dedupe-adj-enums</li>
<li><a
href="8a3c29ff19"><code>8a3c29f</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/serde/issues/2986">#2986</a>
from dtolnay/didnotwork</li>
<li><a
href="defc24d361"><code>defc24d</code></a>
Remove "did not work" comment from test suite</li>
<li><a
href="2316610760"><code>2316610</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/serde/issues/2929">#2929</a>
from Mingun/flatten-enum-tests</li>
<li><a
href="c09e2bd690"><code>c09e2bd</code></a>
Add tests for flatten unit variant in adjacently tagged (tag + content)
enums</li>
<li><a
href="fe7dcc4cd8"><code>fe7dcc4</code></a>
Test all possible orders of map entries for enum-flatten-in-struct
representa...</li>
<li><a
href="a20e66e131"><code>a20e66e</code></a>
Check serialization in
flatten::enum_::internally_tagged::unit_enum_with_unkn...</li>
<li><a
href="1c1a5d95cd"><code>1c1a5d9</code></a>
Reorder struct_ and newtype tests of adjacently_tagged enums to match
order i...</li>
<li><a
href="ee3c2372fb"><code>ee3c237</code></a>
Opt in to generate-macro-expansion when building on docs.rs</li>
<li>Additional commits viewable in <a
href="https://github.com/serde-rs/serde/compare/v1.0.224...v1.0.226">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
This updates our release process so that when we build an alpha of the
Codex CLI (as determined by pushing a tag of the format
`rust-v<cli-version>-alpha.<alpha-version>`), we will now publish the
corresponding npm module publicly, but under the `alpha` tag. As you can
see, this PR adds `--tag alpha` to the `npm publish` command, as
appropriate.
We try to ensure ripgrep (`rg`) is provided with Codex.
- For `brew`, we declare it as a dependency of our formula:
08d82d8b00/Formula/c/codex.rb (L24)
- For `npm`, we declare `@vscode/ripgrep` as a dependency, which
installs the platform-specific binary as part of a `postinstall` script:
fdb8dadcae/codex-cli/package.json (L22)
- Users who download the CLI directly from GitHub Releases are on their
own.
In practice, I have seen `@vscode/ripgrep` fail on occasion. Here is a
trace from a GitHub workflow:
```
npm error code 1
npm error path /Users/runner/hostedtoolcache/node/20.19.5/arm64/lib/node_modules/@openai/codex/node_modules/@vscode/ripgrep
npm error command failed
npm error command sh -c node ./lib/postinstall.js
npm error Finding release for v13.0.0-13
npm error GET https://api.github.com/repos/microsoft/ripgrep-prebuilt/releases/tags/v13.0.0-13
npm error Deleting invalid download cache
npm error Download attempt 1 failed, retrying in 2 seconds...
npm error Finding release for v13.0.0-13
npm error GET https://api.github.com/repos/microsoft/ripgrep-prebuilt/releases/tags/v13.0.0-13
npm error Deleting invalid download cache
npm error Download attempt 2 failed, retrying in 4 seconds...
npm error Finding release for v13.0.0-13
npm error GET https://api.github.com/repos/microsoft/ripgrep-prebuilt/releases/tags/v13.0.0-13
npm error Deleting invalid download cache
npm error Download attempt 3 failed, retrying in 8 seconds...
npm error Finding release for v13.0.0-13
npm error GET https://api.github.com/repos/microsoft/ripgrep-prebuilt/releases/tags/v13.0.0-13
npm error Deleting invalid download cache
npm error Download attempt 4 failed, retrying in 16 seconds...
npm error Finding release for v13.0.0-13
npm error GET https://api.github.com/repos/microsoft/ripgrep-prebuilt/releases/tags/v13.0.0-13
npm error Deleting invalid download cache
npm error Error: Request failed: 403
```
To eliminate this error, this PR changes things so that we vendor the
`rg` binary into https://www.npmjs.com/package/@openai/codex so it is
guaranteed to be included when a user runs `npm i -g @openai/codex`.
The downside of this approach is the increase in package size: we
include the `rg` executable for six architectures (in addition to the
six copies of `codex` we already include). In a follow-up, I plan to add
support for "slices" of our npm module, so that soon users will be able
to do:
```
npm install -g @openai/codex@aarch64-apple-darwin
```
Admittedly, this is a sizable change and I tried to clean some things up
in the process:
- `install_native_deps.sh` has been replaced by `install_native_deps.py`
- `stage_release.sh` and `stage_rust_release.py` has been replaced by
`build_npm_package.py`
We now vendor in a DotSlash file for ripgrep (as a modest attempt to
facilitate local testing) and then build up the extension by:
- creating a temp directory and copying `package.json` over to it with
the target value for `"version"`
- finding the GitHub workflow that corresponds to the
`--release-version` and copying the various `codex` artifacts to
respective `vendor/TARGET_TRIPLE/codex` folder
- downloading the `rg` artifacts specified in the DotSlash file and
copying them over to the respective `vendor/TARGET_TRIPLE/path` folder
- if `--pack-output` is specified, runs `npm pack` on the temp directory
To test, I downloaded the artifact produced by this CI job:
https://github.com/openai/codex/actions/runs/17961595388/job/51085840022?pr=3660
and verified that `node ./bin/codex.js 'which -a rg'` worked as
intended.
### Summary
Sometimes in exec runs, we want to allow the model to use the
`update_plan` tool, but that's not easily configurable. This change adds
a feature flag for this, and formats the output so it's human-readable
## Test Plan
<img width="1280" height="354" alt="Screenshot 2025-09-11 at 12 39
44 AM"
src="https://github.com/user-attachments/assets/72e11070-fb98-47f5-a784-5123ca7333d9"
/>
- Only show the usage data section when signed in with ChatGPT. (Tested
with Chat auth and API auth.)
- Friendlier string change.
- Also removed `.dim()` on the string, since it was the only string in
`/status` that was dim.
## Summary
Introduces a “ghost commit” workflow that snapshots the tree without
touching refs.
1. git commit-tree writes an unreferenced commit object from the current
index, optionally pointing to the current HEAD as its parent.
2. We then stash that commit id and use git restore --source <ghost> to
roll the worktree (and index) back to the recorded snapshot later on.
## Details
- Ghost commits live only as loose objects—we never update branches or
tags—so the repo history stays untouched while still giving us a full
tree snapshot.
- Force-included paths let us stage otherwise ignored files before
capturing the tree.
- Restoration rehydrates both tracked and force-included files while
leaving untracked/ignored files alone.
## Summary
- refactor the stream retry integration tests to construct conversations
through `TestCodex`
- remove bespoke config and tempdir setup now handled by the shared
builder
## Testing
- cargo test -p codex-core --test all
stream_error_allows_next_turn::continue_after_stream_error
- cargo test -p codex-core --test all
stream_no_completed::retries_on_early_close
------
https://chatgpt.com/codex/tasks/task_i_68d2b94d83888320bc75a0bc3bd77b49
Adds a "View Stack" to the bottom pane to allow for pushing/popping
bottom panels.
`esc` will go back instead of dismissing.
Benefit: We retain the "selection state" of a parent panel (e.g. the
review panel).
Backtracking multiple times could drop earlier turns. We now derive the
active user-turn positions from the transcript on demand (keying off the
latest session header) instead of caching state. This keeps the replayed
context intact during repeated edits and adds a regression test.
The only file to watch is the cargo.toml
All the others come from just fix + a few manual small fix
The set of rules have been taken from the list of clippy rules
arbitrarily while trying to optimise the learning and style of the code
while limiting the loss of productivity
Adds the following options:
1. Review current changes
2. Review a specific commit
3. Review against a base branch (PR style)
4. Custom instructions
<img width="487" height="330" alt="Screenshot 2025-09-20 at 2 11 36 PM"
src="https://github.com/user-attachments/assets/edb0aaa5-5747-47fa-881f-cc4c4f7fe8bc"
/>
---
\+ Adds the following UI helpers:
1. Makes list selection searchable
2. Adds navigation to the bottom pane, so you could add a stack of
popups
3. Basic custom prompt view
We currently get information about rate limits in the response headers.
We want to forward them to the clients to have better transparency.
UI/UX plans have been discussed and this information is needed.
Currently, we change the tool description according to the sandbox
policy and approval policy. This breaks the cache when the user hits
`/approvals`. This PR does the following:
- Always use the shell with escalation parameter:
- removes `create_shell_tool_for_sandbox` and always uses unified tool
via `create_shell_tool`
- Reject the func call when the model uses escalation parameter when it
cannot.
### Why Use `tokio::sync::Mutex`
`std::sync::Mutex` are not _async-aware_. As a result, they will block
the entire thread instead of just yielding the task. Furthermore they
can be poisoned which is not the case of `tokio` Mutex.
This allows the Tokio runtime to continue running other tasks while
waiting for the lock, preventing deadlocks and performance bottlenecks.
In general, this is preferred in async environment
This change instructs the model to install any missing command. Else
tokens are wasted when it tries to run
commands that aren't available multiple times before installing them.
Often, `gh` infers `--repo` when it is run from a Git clone, but our
`publish-npm` step is designed to avoid the overhead of cloning the
repo, so add the `--repo` option explicitly to fix things.
The build for `v0.37.0-alpha.3` failed on the `Create GitHub Release`
step:
https://github.com/openai/codex/actions/runs/17786866086/job/50556513221
with:
```
⚠️ GitHub release failed with status: 403
{"message":"Resource not accessible by integration","documentation_url":"https://docs.github.com/rest/releases/releases#create-a-release","status":"403"}
Skip retry — your GitHub token/PAT does not have the required permission to create a release
```
I believe I should have not introduced a top-level `permissions` for the
workflow in https://github.com/openai/codex/pull/3431 because that
affected the `permissions` for each job in the workflow.
This PR introduces `publish-npm` as its own job, which allows us to:
- consolidate all the Node.js-related steps required for publishing
- limit the reach of the `id-token: write` permission
- skip it altogether if is an alpha build
With this PR, each of `release`, `publish-npm`, and `update-branch` has
an explicit `permissions` block.
Proposal: We want to record a dev message like so:
```
{
"type": "message",
"role": "user",
"content": [
{
"type": "input_text",
"text": "<user_action>
<context>User initiated a review task. Here's the full review output from reviewer model. User may select one or more comments to resolve.</context>
<action>review</action>
<results>
{findings_str}
</results>
</user_action>"
}
]
},
```
Without showing in the chat transcript.
Rough idea, but it fixes issue where the user finishes a review thread,
and asks the parent "fix the rest of the review issues" thinking that
the parent knows about it.
### Question: Why not a tool call?
Because the agent didn't make the call, it was a human. + we haven't
implemented sub-agents yet, and we'll need to think about the way we
represent these human-led tool calls for the agent.
1. Adds the environment prompt (including cwd) to review thread
2. Prepends the review prompt as a user message (temporary fix so the
instructions are not replaced on backend)
3. Sets reasoning to low
4. Sets default review model to `gpt-5-codex`
## Summary
SendUserTurn has not been correctly handling updates to policies. While
the tui protocol handles this in `Op::OverrideTurnContext`, the
SendUserTurn should be appending `EnvironmentContext` messages when the
sandbox settings change. MCP client behavior should match the cli
behavior, so we update `SendUserTurn` message to match.
## Testing
- [x] Added prompt caching tests
Bumps [wildmatch](https://github.com/becheran/wildmatch) from 2.4.0 to
2.5.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/becheran/wildmatch/releases">wildmatch's
releases</a>.</em></p>
<blockquote>
<h2>v2.5.0</h2>
<p><a
href="https://redirect.github.com/becheran/wildmatch/pull/27">becheran/wildmatch#27</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="b39902c120"><code>b39902c</code></a>
chore: Release wildmatch version 2.5.0</li>
<li><a
href="87a8cf4c80"><code>87a8cf4</code></a>
Merge pull request <a
href="https://redirect.github.com/becheran/wildmatch/issues/28">#28</a>
from smichaku/micha/fix-unicode-case-insensitive-matching</li>
<li><a
href="a3ab4903f5"><code>a3ab490</code></a>
fix: Fix unicode matching for non-ASCII characters</li>
<li>See full diff in <a
href="https://github.com/becheran/wildmatch/compare/v2.4.0...v2.5.0">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.219 to
1.0.223.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/serde-rs/serde/releases">serde's
releases</a>.</em></p>
<blockquote>
<h2>v1.0.223</h2>
<ul>
<li>Fix serde_core documentation links (<a
href="https://redirect.github.com/serde-rs/serde/issues/2978">#2978</a>)</li>
</ul>
<h2>v1.0.222</h2>
<ul>
<li>Make <code>serialize_with</code> attribute produce code that works
if respanned to 2024 edition (<a
href="https://redirect.github.com/serde-rs/serde/issues/2950">#2950</a>,
thanks <a href="https://github.com/aytey"><code>@aytey</code></a>)</li>
</ul>
<h2>v1.0.221</h2>
<ul>
<li>Documentation improvements (<a
href="https://redirect.github.com/serde-rs/serde/issues/2973">#2973</a>)</li>
<li>Deprecate <code>serde_if_integer128!</code> macro (<a
href="https://redirect.github.com/serde-rs/serde/issues/2975">#2975</a>)</li>
</ul>
<h2>v1.0.220</h2>
<ul>
<li>Add a way for data formats to depend on serde traits without waiting
for serde_derive compilation: <a
href="https://docs.rs/serde_core">https://docs.rs/serde_core</a> (<a
href="https://redirect.github.com/serde-rs/serde/issues/2608">#2608</a>,
thanks <a
href="https://github.com/osiewicz"><code>@osiewicz</code></a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="6c316d7cb5"><code>6c316d7</code></a>
Release 1.0.223</li>
<li><a
href="a4ac0c2bc6"><code>a4ac0c2</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/serde/issues/2978">#2978</a>
from dtolnay/htmlrooturl</li>
<li><a
href="ed76364f87"><code>ed76364</code></a>
Change serde_core's html_root_url to docs.rs/serde_core</li>
<li><a
href="57e21a1afa"><code>57e21a1</code></a>
Release 1.0.222</li>
<li><a
href="bb58726133"><code>bb58726</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/serde/issues/2950">#2950</a>
from aytey/fix_lifetime_issue_2024</li>
<li><a
href="3f6925125b"><code>3f69251</code></a>
Delete unneeded field of MapDeserializer</li>
<li><a
href="fd4decf2fe"><code>fd4decf</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/serde/issues/2976">#2976</a>
from dtolnay/content</li>
<li><a
href="00b1b6b2b5"><code>00b1b6b</code></a>
Move Content's Deserialize impl from serde_core to serde</li>
<li><a
href="cf141aa8c7"><code>cf141aa</code></a>
Move Content's Clone impl from serde_core to serde</li>
<li><a
href="ff3aee490a"><code>ff3aee4</code></a>
Release 1.0.221</li>
<li>Additional commits viewable in <a
href="https://github.com/serde-rs/serde/compare/v1.0.219...v1.0.223">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.143 to
1.0.145.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/serde-rs/json/releases">serde_json's
releases</a>.</em></p>
<blockquote>
<h2>v1.0.145</h2>
<ul>
<li>Raise serde version requirement to >=1.0.220</li>
</ul>
<h2>v1.0.144</h2>
<ul>
<li>Switch serde dependency to serde_core (<a
href="https://redirect.github.com/serde-rs/json/issues/1285">#1285</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="efa66e3a1d"><code>efa66e3</code></a>
Release 1.0.145</li>
<li><a
href="23679e2b9d"><code>23679e2</code></a>
Add serde version constraint</li>
<li><a
href="fc27bafbf7"><code>fc27baf</code></a>
Release 1.0.144</li>
<li><a
href="caef3c6ea6"><code>caef3c6</code></a>
Ignore uninlined_format_args pedantic clippy lint</li>
<li><a
href="81ba3aaaff"><code>81ba3aa</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/json/issues/1285">#1285</a>
from dtolnay/serdecore</li>
<li><a
href="d21e8ce7a7"><code>d21e8ce</code></a>
Switch serde dependency to serde_core</li>
<li><a
href="6beb6cd596"><code>6beb6cd</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/json/issues/1286">#1286</a>
from dtolnay/up</li>
<li><a
href="1dbc803749"><code>1dbc803</code></a>
Raise required compiler to Rust 1.61</li>
<li><a
href="0bf5d87003"><code>0bf5d87</code></a>
Enforce trybuild >= 1.0.108</li>
<li><a
href="d12e943590"><code>d12e943</code></a>
Update actions/checkout@v4 -> v5</li>
<li>See full diff in <a
href="https://github.com/serde-rs/json/compare/v1.0.143...v1.0.145">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Reported height was `20` instead of `21`, so `area.height >=
MIN_ANIMATION_HEIGHT` was `false` and therefore `show_animation` was
`false`, so the animation never displayed.
Changes:
- skip the welcome animation when the terminal area is below 60x21
- skip the model upgrade animation when the terminal area is below 60x24
to avoid clipping
---------
Co-authored-by: Michael Bolin <mbolin@openai.com>
Bumps [tracing-subscriber](https://github.com/tokio-rs/tracing) from
0.3.19 to 0.3.20.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/tokio-rs/tracing/releases">tracing-subscriber's
releases</a>.</em></p>
<blockquote>
<h2>tracing-subscriber 0.3.20</h2>
<p><strong>Security Fix</strong>: ANSI Escape Sequence Injection
(CVE-TBD)</p>
<h2>Impact</h2>
<p>Previous versions of tracing-subscriber were vulnerable to ANSI
escape sequence injection attacks. Untrusted user input containing ANSI
escape sequences could be injected into terminal output when logged,
potentially allowing attackers to:</p>
<ul>
<li>Manipulate terminal title bars</li>
<li>Clear screens or modify terminal display</li>
<li>Potentially mislead users through terminal manipulation</li>
</ul>
<p>In isolation, impact is minimal, however security issues have been
found in terminal emulators that enabled an attacker to use ANSI escape
sequences via logs to exploit vulnerabilities in the terminal
emulator.</p>
<h2>Solution</h2>
<p>Version 0.3.20 fixes this vulnerability by escaping ANSI control
characters in when writing events to destinations that may be printed to
the terminal.</p>
<h2>Affected Versions</h2>
<p>All versions of tracing-subscriber prior to 0.3.20 are affected by
this vulnerability.</p>
<h2>Recommendations</h2>
<p>Immediate Action Required: We recommend upgrading to
tracing-subscriber 0.3.20 immediately, especially if your
application:</p>
<ul>
<li>Logs user-provided input (form data, HTTP headers, query parameters,
etc.)</li>
<li>Runs in environments where terminal output is displayed to
users</li>
</ul>
<h2>Migration</h2>
<p>This is a patch release with no breaking API changes. Simply update
your Cargo.toml:</p>
<pre lang="toml"><code>[dependencies]
tracing-subscriber = "0.3.20"
</code></pre>
<h2>Acknowledgments</h2>
<p>We would like to thank <a href="http://github.com/zefr0x">zefr0x</a>
who responsibly reported the issue at
<code>security@tokio.rs</code>.</p>
<p>If you believe you have found a security vulnerability in any
tokio-rs project, please email us at <code>security@tokio.rs</code>.</p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="4c52ca5266"><code>4c52ca5</code></a>
fmt: fix ANSI escape sequence injection vulnerability (<a
href="https://redirect.github.com/tokio-rs/tracing/issues/3368">#3368</a>)</li>
<li><a
href="f71cebe41e"><code>f71cebe</code></a>
subscriber: impl Clone for EnvFilter (<a
href="https://redirect.github.com/tokio-rs/tracing/issues/3360">#3360</a>)</li>
<li><a
href="3a1f571102"><code>3a1f571</code></a>
Fix CI (<a
href="https://redirect.github.com/tokio-rs/tracing/issues/3361">#3361</a>)</li>
<li><a
href="e63ef57f3d"><code>e63ef57</code></a>
chore: prepare tracing-attributes 0.1.30 (<a
href="https://redirect.github.com/tokio-rs/tracing/issues/3316">#3316</a>)</li>
<li><a
href="6e59a13b1a"><code>6e59a13</code></a>
attributes: fix tracing::instrument regression around shadowing (<a
href="https://redirect.github.com/tokio-rs/tracing/issues/3311">#3311</a>)</li>
<li><a
href="e4df761275"><code>e4df761</code></a>
tracing: update core to 0.1.34 and attributes to 0.1.29 (<a
href="https://redirect.github.com/tokio-rs/tracing/issues/3305">#3305</a>)</li>
<li><a
href="643f392ebb"><code>643f392</code></a>
chore: prepare tracing-attributes 0.1.29 (<a
href="https://redirect.github.com/tokio-rs/tracing/issues/3304">#3304</a>)</li>
<li><a
href="d08e7a6eea"><code>d08e7a6</code></a>
chore: prepare tracing-core 0.1.34 (<a
href="https://redirect.github.com/tokio-rs/tracing/issues/3302">#3302</a>)</li>
<li><a
href="6e70c571d3"><code>6e70c57</code></a>
tracing-subscriber: count numbers of enters in <code>Timings</code> (<a
href="https://redirect.github.com/tokio-rs/tracing/issues/2944">#2944</a>)</li>
<li><a
href="c01d4fd9de"><code>c01d4fd</code></a>
fix docs and enable CI on <code>main</code> branch (<a
href="https://redirect.github.com/tokio-rs/tracing/issues/3295">#3295</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/tokio-rs/tracing/compare/tracing-subscriber-0.3.19...tracing-subscriber-0.3.20">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [slab](https://github.com/tokio-rs/slab) from 0.4.10 to 0.4.11.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/tokio-rs/slab/releases">slab's
releases</a>.</em></p>
<blockquote>
<h2>v0.4.11</h2>
<ul>
<li>Fix <code>Slab::get_disjoint_mut</code> out of bounds (<a
href="https://redirect.github.com/tokio-rs/slab/issues/152">#152</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/tokio-rs/slab/blob/master/CHANGELOG.md">slab's
changelog</a>.</em></p>
<blockquote>
<h1>0.4.11 (August 8, 2025)</h1>
<ul>
<li>Fix <code>Slab::get_disjoint_mut</code> out of bounds (<a
href="https://redirect.github.com/tokio-rs/slab/issues/152">#152</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="2e5779f8eb"><code>2e5779f</code></a>
Release v0.4.11 (<a
href="https://redirect.github.com/tokio-rs/slab/issues/153">#153</a>)</li>
<li><a
href="2d65c514bc"><code>2d65c51</code></a>
Fix get_disjoint_mut error condition (<a
href="https://redirect.github.com/tokio-rs/slab/issues/152">#152</a>)</li>
<li>See full diff in <a
href="https://github.com/tokio-rs/slab/compare/v0.4.10...v0.4.11">compare
view</a></li>
</ul>
</details>
<br />
[](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)
Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.
[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)
---
<details>
<summary>Dependabot commands and options</summary>
<br />
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
You can disable automated security fix PRs for this repo from the
[Security Alerts page](https://github.com/openai/codex/network/alerts).
</details>
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Summary
- common: use exact equality for Swiftfox exclusion to avoid hiding
future slugs that merely contain the substring
- core: treat missing internal_storage.json as expected (debug), warn
only on real IO/parse errors
- tui: drop DEBUG_HIGH gate; always consider showing rollout prompt, but
suppress under ApiKey auth mode
When logging in using ChatGPT using the `codex login` command, a
successful login should write a new `auth.json` file with the ChatGPT
token information. The old code attempted to retain the API key and
merge the token information into the existing `auth.json` file. With the
new simplified login mechanism, `auth.json` should have auth information
for only ChatGPT or API Key, not both.
The `codex login --api-key <key>` code path was already doing the right
thing here, but the `codex login` command was incorrect. This PR fixes
the problem and adds test cases for both commands.
This PR addresses an edge-case bug that appears in the VS Code extension
in the following situation:
1. Log in using ChatGPT (using either the CLI or extension). This will
create an `auth.json` file.
2. Manually modify `config.toml` to specify a custom provider.
3. Start a fresh copy of the VS Code extension.
The profile menu in the VS Code extension will indicate that you are
logged in using ChatGPT even though you're not.
This is caused by the `get_auth_status` method returning an
`auth_method: 'chatgpt'` when a custom provider is configured and it
doesn't use OpenAI auth (i.e. `requires_openai_auth` is false). The
method should always return `auth_method: None` if
`requires_openai_auth` is false.
The same bug also causes the NUX (new user experience) screen to be
displayed in the VSCE in this situation.
## Summary
Resolves a merge conflict between #3597 and #3560, and adds tests to
double check our apply_patch configuration.
## Testing
- [x] Added unit tests
---------
Co-authored-by: dedrisian-oai <dedrisian@openai.com>
Adding the ability to resume conversations.
we have one verb `resume`.
Behavior:
`tui`:
`codex resume`: opens session picker
`codex resume --last`: continue last message
`codex resume <session id>`: continue conversation with `session id`
`exec`:
`codex resume --last`: continue last conversation
`codex resume <session id>`: continue conversation with `session id`
Implementation:
- I added a function to find the path in `~/.codex/sessions/` with a
`UUID`. This is helpful in resuming with session id.
- Added the above mentioned flags
- Added lots of testing
There are exactly 4 types of flaky tests in Windows x86 right now:
1. `review_input_isolated_from_parent_history` => Times out waiting for
closing events
2. `review_does_not_emit_agent_message_on_structured_output` => Times
out waiting for closing events
3. `auto_compact_runs_after_token_limit_hit` => Times out waiting for
closing events
4. `auto_compact_runs_after_token_limit_hit` => Also has a problem where
auto compact should add a third request, but receives 4 requests.
1, 2, and 3 seem to be solved with increasing threads on windows runner
from 2 -> 4.
Don't know yet why # 4 is happening, but probably also because of
WireMock issues on windows causing races.
We need to construct the history different when compact happens. For
this, we need to just consider the history after compact and convert
compact to a response item.
This needs to change and use `build_compact_history` when this #3446 is
merged.
No (intended) functional change.
This refactors the transcript view to hold a list of HistoryCells
instead of a list of Lines. This simplifies and makes much of the logic
more robust, as well as laying the groundwork for future changes, e.g.
live-updating history cells in the transcript.
Similar to #2879 in goal. Fixes#2755.
## 📝 Review Mode -- Core
This PR introduces the Core implementation for Review mode:
- New op `Op::Review { prompt: String }:` spawns a child review task
with isolated context, a review‑specific system prompt, and a
`Config.review_model`.
- `EnteredReviewMode`: emitted when the child review session starts.
Every event from this point onwards reflects the review session.
- `ExitedReviewMode(Option<ReviewOutputEvent>)`: emitted when the review
finishes or is interrupted, with optional structured findings:
```json
{
"findings": [
{
"title": "<≤ 80 chars, imperative>",
"body": "<valid Markdown explaining *why* this is a problem; cite files/lines/functions>",
"confidence_score": <float 0.0-1.0>,
"priority": <int 0-3>,
"code_location": {
"absolute_file_path": "<file path>",
"line_range": {"start": <int>, "end": <int>}
}
}
],
"overall_correctness": "patch is correct" | "patch is incorrect",
"overall_explanation": "<1-3 sentence explanation justifying the overall_correctness verdict>",
"overall_confidence_score": <float 0.0-1.0>
}
```
## Questions
### Why separate out its own message history?
We want the review thread to match the training of our review models as
much as possible -- that means using a custom prompt, removing user
instructions, and starting a clean chat history.
We also want to make sure the review thread doesn't leak into the parent
thread.
### Why do this as a mode, vs. sub-agents?
1. We want review to be a synchronous task, so it's fine for now to do a
bespoke implementation.
2. We're still unclear about the final structure for sub-agents. We'd
prefer to land this quickly and then refactor into sub-agents without
rushing that implementation.
this adds some more capabilities to the default sandbox which I feel are
safe. Most are in the
[renderer.sb](https://source.chromium.org/chromium/chromium/src/+/main:sandbox/policy/mac/renderer.sb)
sandbox for chrome renderers, which i feel is fair game for codex
commands.
Specific changes:
1. Allow processes in the sandbox to send signals to any other process
in the same sandbox (e.g. child processes or daemonized processes),
instead of just themselves.
2. Allow user-preference-read
3. Allow process-info* to anything in the same sandbox. This is a bit
wider than Chromium allows, but it seems OK to me to allow anything in
the sandbox to get details about other processes in the same sandbox.
Bazel uses these to e.g. wait for another process to exit.
4. Allow all CPU feature detection, this seems harmless to me. It's
wider than Chromium, but Chromium is concerned about fingerprinting, and
tightly controls what CPU features they actually care about, and we
don't have either that restriction or that advantage.
5. Allow new sysctl-reads:
```
(sysctl-name "vm.loadavg")
(sysctl-name-prefix "kern.proc.pgrp.")
(sysctl-name-prefix "kern.proc.pid.")
(sysctl-name-prefix "net.routetable.")
```
bazel needs these for waiting on child processes and for communicating
with its local build server, i believe. I wonder if we should just allow
all (sysctl-read), as reading any arbitrary info about the system seems
fine to me.
6. Allow iokit-open on RootDomainUserClient. This has to do with power
management I believe, and Chromium allows renderers to do this, so okay.
Bazel needs it to boot successfully, possibly for sleep/wake callbacks?
7. Mach lookup to `com.apple.system.opendirectoryd.libinfo`, which has
to do with user data, and which Chrome allows.
8. Mach lookup to `com.apple.PowerManagement.control`. Chromium allows
its GPU process to do this, but not its renderers. Bazel needs this to
boot, probably relatedly to sleep/wake stuff.
Azure Responses API doesn't work well with store:false and response
items.
If store = false and id is sent an error is thrown that ID is not found
If store = false and id is not sent an error is thrown that ID is
required
Add detection for Azure urls and add a workaround to preserve reasoning
item IDs and send store:true
sometimes the model forgets to actually invoke `apply_patch` and puts a
patch as the script body. trying to execute this as bash sometimes
creates files named `,` or `{` or does other unknown things, so catch
this situation and return an error to the model.
## Compact feature:
1. Stops the model when the context window become too large
2. Add a user turn, asking for the model to summarize
3. Build a bridge that contains all the previous user message + the
summary. Rendered from a template
4. Start sampling again from a clean conversation with only that bridge
It turns out that we want slightly different behavior for the
`SetDefaultModel` RPC because some models do not work with reasoning
(like GPT-4.1), so we should be able to explicitly clear this value.
Verified in `codex-rs/mcp-server/tests/suite/set_default_model.rs`.
## Summary
Standardizes the shell description across sandbox_types, since we cover
this in the prompt, and have moved necessary details (like
network_access and writeable workspace roots) to EnvironmentContext
messages.
## Test Plan
- [x] updated unit tests
2025-09-12 14:24:09 -04:00
1128 changed files with 118632 additions and 23114 deletions
For MacOS and Linux: copy the output of `uname -mprs`
For Windows: copy the output of `"$([Environment]::OSVersion | ForEach-Object VersionString) $(if ([Environment]::Is64BitOperatingSystem) { "x64" } else { "x86" })"` in the PowerShell console
- type:textarea
id:actual
attributes:
label:What issue are you seeing?
description:Please include the full error messages and prompts with PII redacted. If possible, please provide text instead of a screenshot.
validations:
required:true
- type:textarea
id:steps
attributes:
label:What steps can reproduce the bug?
description:Explain the bug and provide a code snippet that can reproduce it.
description:Explain the bug and provide a code snippet that can reproduce it. Please include session id, token limit usage, context window usage if applicable.
validations:
required:true
- type:textarea
@@ -44,11 +59,6 @@ body:
attributes:
label:What is the expected behavior?
description:If possible, please provide text instead of a screenshot.
- type:textarea
id:actual
attributes:
label:What do you see instead?
description:If possible, please provide text instead of a screenshot.
label:What version of the VS Code extension are you using?
validations:
required:true
- type:input
id:plan
attributes:
label:What subscription do you have?
validations:
required:true
- type:input
id:ide
attributes:
label:Which IDE are you using?
description:Like `VS Code`, `Cursor`, `Windsurf`, etc.
validations:
required:true
- type:input
id:platform
attributes:
@@ -26,11 +36,18 @@ body:
description:|
For MacOS and Linux: copy the output of `uname -mprs`
For Windows: copy the output of `"$([Environment]::OSVersion | ForEach-Object VersionString) $(if ([Environment]::Is64BitOperatingSystem) { "x64" } else { "x86" })"` in the PowerShell console
- type:textarea
id:actual
attributes:
label:What issue are you seeing?
description:Please include the full error messages and prompts with PII redacted. If possible, please provide text instead of a screenshot.
validations:
required:true
- type:textarea
id:steps
attributes:
label:What steps can reproduce the bug?
description:Explain the bug and provide a code snippet that can reproduce it.
description:Explain the bug and provide a code snippet that can reproduce it. Please include session id, token limit usage, context window usage if applicable.
validations:
required:true
- type:textarea
@@ -38,11 +55,6 @@ body:
attributes:
label:What is the expected behavior?
description:If possible, please provide text instead of a screenshot.
- type:textarea
id:actual
attributes:
label:What do you see instead?
description:If possible, please provide text instead of a screenshot.
You are an assistant that reviews GitHub issues for the repository.
Your job is to choose the most appropriate existing labels for the issue described later in this prompt.
Follow these rules:
- Only pick labels out of the list below.
- Prefer a small set of precise labels over many broad ones.
- If none of the labels fit, respond with an empty JSON array: []
- Output must be a JSON array of label names (strings) with no additional commentary.
Labels to apply:
1. bug — Reproducible defects in Codex products (CLI, VS Code extension, web, auth).
2. enhancement — Feature requests or usability improvements that ask for new capabilities, better ergonomics, or quality-of-life tweaks.
3. extension — VS Code (or other IDE) extension-specific issues.
4. windows-os — Bugs or friction specific to Windows environments (PowerShell behavior, path handling, copy/paste, OS-specific auth or tooling failures).
5. mcp — Topics involving Model Context Protocol servers/clients.
6. codex-web — Issues targeting the Codex web UI/Cloud experience.
8. azure — Problems or requests tied to Azure OpenAI deployments.
9. documentation — Updates or corrections needed in docs/README/config references (broken links, missing examples, outdated keys, clarification requests).
10. model-behavior — Undesirable LLM behavior: forgetting goals, refusing work, hallucinating environment details, quota misreports, or other reasoning/performance anomalies.
Issue information is available in environment variables:
You are an assistant that triages new GitHub issues by identifying potential duplicates.
You will receive the following JSON files located in the current working directory:
- `codex-current-issue.json`: JSON object describing the newly created issue (fields: number, title, body).
- `codex-existing-issues.json`: JSON array of recent issues (each element includes number, title, body, createdAt).
Instructions:
- Compare the current issue against the existing issues to find up to five that appear to describe the same underlying problem or request.
- Focus on the underlying intent and context of each issue—such as reported symptoms, feature requests, reproduction steps, or error messages—rather than relying solely on string similarity or synthetic metrics.
- After your analysis, validate your results in 1-2 lines explaining your decision to return the selected matches.
You are an assistant that reviews GitHub issues for the repository.
Your job is to choose the most appropriate existing labels for the issue described later in this prompt.
Follow these rules:
- Only pick labels out of the list below.
- Prefer a small set of precise labels over many broad ones.
Labels to apply:
1. bug — Reproducible defects in Codex products (CLI, VS Code extension, web, auth).
2. enhancement — Feature requests or usability improvements that ask for new capabilities, better ergonomics, or quality-of-life tweaks.
3. extension — VS Code (or other IDE) extension-specific issues.
4. windows-os — Bugs or friction specific to Windows environments (always when PowerShell is mentioned, path handling, copy/paste, OS-specific auth or tooling failures).
5. mcp — Topics involving Model Context Protocol servers/clients.
6. codex-web — Issues targeting the Codex web UI/Cloud experience.
8. azure — Problems or requests tied to Azure OpenAI deployments.
9. documentation — Updates or corrections needed in docs/README/config references (broken links, missing examples, outdated keys, clarification requests).
10. model-behavior — Undesirable LLM behavior: forgetting goals, refusing work, hallucinating environment details, quota misreports, or other reasoning/performance anomalies.
@@ -4,14 +4,22 @@ In the codex-rs folder where the rust code lives:
- Crate names are prefixed with `codex-`. For example, the `core` folder's crate is named `codex-core`
- When using format! and you can inline variables into {}, always do that.
- Install any commands the repo relies on (for example `just`, `rg`, or `cargo-insta`) if they aren't already available before running instructions here.
- Never add or modify any code related to `CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR` or `CODEX_SANDBOX_ENV_VAR`.
- You operate in a sandbox where `CODEX_SANDBOX_NETWORK_DISABLED=1` will be set whenever you use the `shell` tool. Any existing code that uses `CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR` was authored with this fact in mind. It is often used to early exit out of tests that the author knew you would not be able to run given your sandbox limitations.
- Similarly, when you spawn a process using Seatbelt (`/usr/bin/sandbox-exec`), `CODEX_SANDBOX=seatbelt` will be set on the child process. Integration tests that want to run Seatbelt themselves cannot be run under Seatbelt, so checks for `CODEX_SANDBOX=seatbelt` are also often used to early exit out of tests, as appropriate.
- Always collapse if statements per https://rust-lang.github.io/rust-clippy/master/index.html#collapsible_if
- Always inline format! args when possible per https://rust-lang.github.io/rust-clippy/master/index.html#uninlined_format_args
- Use method references over closures when possible per https://rust-lang.github.io/rust-clippy/master/index.html#redundant_closure_for_method_calls
- Do not use unsigned integer even if the number cannot be negative.
- When writing tests, prefer comparing the equality of entire objects over fields one by one.
- When making a change that adds or changes an API, ensure that the documentation in the `docs/` folder is up to date if applicable.
Run `just fmt` (in `codex-rs` directory) automatically after making Rust code changes; do not ask for approval to run it. Before finalizing a change to `codex-rs`, run `just fix -p <project>` (in `codex-rs` directory) to fix any linter issues in the code. Prefer scoping with `-p` to avoid slow workspace‑wide Clippy builds; only run `just fix` without `-p` if you changed shared crates. Additionally, run the tests:
1. Run the test for the specific project that was changed. For example, if changes were made in `codex-rs/tui`, run `cargo test -p codex-tui`.
2. Once those pass, if any changes were made in common, core, or protocol, run the complete test suite with `cargo test --all-features`.
When running interactively, ask the user before running `just fix` to finalize. `just fmt` does not require approval. project-specific or individual tests can be run without asking the user, but do ask the user before running the complete test suite.
When running interactively, ask the user before running `just fix` to finalize. `just fmt` does not require approval. project-specific or individual tests can be run without asking the user, but do ask the user before running the complete test suite.
- Prefer Stylize helpers: use "text".dim(), .bold(), .cyan(), .italic(), .underlined() instead of manual Style where possible.
- Prefer simple conversions: use "text".into() for spans and vec![…].into() for lines; when inference is ambiguous (e.g., Paragraph::new/Cell::from), use Line::from(spans) or Span::from(text).
- Computed styles: if the Style is computed at runtime, using `Span::styled` is OK (`Span::from(text).set_style(style)` is also acceptable).
@@ -38,6 +47,7 @@ See `codex-rs/tui/styles.md`.
- Compactness: prefer the form that stays on one line after rustfmt; if only one of Line::from(vec![…]) or vec![…].into() avoids wrapping, choose that. If both wrap, pick the one with fewer wrapped lines.
### Text wrapping
- Always use textwrap::wrap to wrap plain strings.
- If you have a ratatui Line and you want to wrap it, use the helpers in tui/src/wrapping.rs, e.g. word_wrap_lines / word_wrap_line.
- If you need to indent wrapped lines, use the initial_indent / subsequent_indent options from RtOptions if you can, rather than writing custom logic.
@@ -59,8 +69,34 @@ This repo uses snapshot tests (via `insta`), especially in `codex-rs/tui`, to va
-`cargo insta accept -p codex-tui`
If you don’t have the tool:
-`cargo install cargo-insta`
### Test assertions
- Tests should use pretty_assertions::assert_eq for clearer diffs. Import this at the top of the test module if it isn't already.
### Integration tests (core)
- Prefer the utilities in `core_test_support::responses` when writing end-to-end Codex tests.
- All `mount_sse*` helpers return a `ResponseMock`; hold onto it so you can assert against outbound `/responses` POST bodies.
- Use `ResponseMock::single_request()` when a test should only issue one POST, or `ResponseMock::requests()` to inspect every captured `ResponsesRequest`.
-`ResponsesRequest` exposes helpers (`body_json`, `input`, `function_call_output`, `custom_tool_call_output`, `call_output`, `header`, `path`, `query_param`) so assertions can target structured payloads instead of manual JSON digging.
- Build SSE payloads with the provided `ev_*` constructors and the `sse(...)`.
- Typical pattern:
```rust
let mock = responses::mount_sse_once(&server, responses::sse(vec.
<details>
<summary>You can also go to the <a href="https://github.com/openai/codex/releases/latest">latest GitHub Release</a> and download the appropriate binary for your platform.</summary>
@@ -63,8 +63,7 @@ You can also use Codex with an API key, but this requires [additional setup](./d
### Model Context Protocol (MCP)
Codex CLI supports [MCP servers](./docs/advanced.md#model-context-protocol-mcp). Enable by adding an `mcp_servers` section to your `~/.codex/config.toml`.
Codex can access MCP servers. To configure them, refer to the [config docs](./docs/config.md#mcp_servers).
### Configuration
@@ -76,16 +75,22 @@ Codex CLI supports a rich set of configuration options, with preferences stored
@@ -513,7 +513,7 @@ Codex runs model-generated commands in a sandbox. If a proposed command or file
<details>
<summary>Does it work on Windows?</summary>
Not directly. It requires [Windows Subsystem for Linux (WSL2)](https://learn.microsoft.com/en-us/windows/wsl/install) - Codex has been tested on macOS and Linux with Node 22.
Not directly. It requires [Windows Subsystem for Linux (WSL2)](https://learn.microsoft.com/en-us/windows/wsl/install) - Codex is regularly tested on macOS and Linux with Node 20+, and also supports Node 16.
@@ -4,18 +4,23 @@ We provide Codex CLI as a standalone, native executable to ensure a zero-depende
## Installing Codex
Today, the easiest way to install Codex is via `npm`, though we plan to publish Codex to other package managers soon.
Today, the easiest way to install Codex is via `npm`:
```shell
npm i -g @openai/codex@native
npm i -g @openai/codex
codex
```
You can also download a platform-specific release directly from our [GitHub Releases](https://github.com/openai/codex/releases).
You can also install via Homebrew (`brew install --cask codex`) or download a platform-specific release directly from our [GitHub Releases](https://github.com/openai/codex/releases).
## Documentation quickstart
- First run with Codex? Follow the walkthrough in [`docs/getting-started.md`](../docs/getting-started.md) for prompts, keyboard shortcuts, and session management.
- Already shipping with Codex and want deeper control? Jump to [`docs/advanced.md`](../docs/advanced.md) and the configuration reference at [`docs/config.md`](../docs/config.md).
## What's new in the Rust CLI
While we are [working to close the gap between the TypeScript and Rust implementations of Codex CLI](https://github.com/openai/codex/issues/1262), note that the Rust CLI has a number of features that the TypeScript CLI does not!
The Rust implementation is now the maintained Codex CLI and serves as the default experience. It includes a number of features that the legacy TypeScript CLI never supported.
### Config
@@ -23,14 +28,22 @@ Codex supports a rich set of configuration options. Note that the Rust CLI uses
### Model Context Protocol Support
Codex CLI functions as an MCP client that can connect to MCP servers on startup. See the [`mcp_servers`](../docs/config.md#mcp_servers) section in the configuration documentation for details.
#### MCP client
It is still experimental, but you can also launch Codex as an MCP _server_ by running `codex mcp`. Use the [`@modelcontextprotocol/inspector`](https://github.com/modelcontextprotocol/inspector) to try it out:
Codex CLI functions as an MCP client that allows the Codex CLI and IDE extension to connect to MCP servers on startup. See the [`configuration documentation`](../docs/config.md#mcp_servers) for details.
#### MCP server (experimental)
Codex can be launched as an MCP _server_ by running `codex mcp-server`. This allows _other_ MCP clients to use Codex as a tool for another agent.
Use the [`@modelcontextprotocol/inspector`](https://github.com/modelcontextprotocol/inspector) to try it out:
Use `codex mcp` to add/list/get/remove MCP server launchers defined in `config.toml`, and `codex mcp-server` to run the MCP server directly.
### Notifications
You can enable notifications by configuring a script that is run whenever the agent finishes a turn. The [notify documentation](../docs/config.md#notify) includes a detailed example that explains how to get desktop notifications via [terminal-notifier](https://github.com/julienXX/terminal-notifier) on macOS.
@@ -39,39 +52,22 @@ You can enable notifications by configuring a script that is run whenever the ag
To run Codex non-interactively, run `codex exec PROMPT` (you can also pass the prompt via `stdin`) and Codex will work on your task until it decides that it is done and exits. Output is printed to the terminal directly. You can set the `RUST_LOG` environment variable to see more about what's going on.
### Use `@` for file search
Typing `@` triggers a fuzzy-filename search over the workspace root. Use up/down to select among the results and Tab or Enter to replace the `@` with the selected path. You can use Esc to cancel the search.
### Esc–Esc to edit a previous message
When the chat composer is empty, press Esc to prime “backtrack” mode. Press Esc again to open a transcript preview highlighting the last user message; press Esc repeatedly to step to older user messages. Press Enter to confirm and Codex will fork the conversation from that point, trim the visible transcript accordingly, and pre‑fill the composer with the selected user message so you can edit and resubmit it.
In the transcript preview, the footer shows an `Esc edit prev` hint while editing is active.
### `--cd`/`-C` flag
Sometimes it is not convenient to `cd` to the directory you want Codex to use as the "working root" before running Codex. Fortunately, `codex` supports a `--cd` option so you can specify whatever folder you want. You can confirm that Codex is honoring `--cd` by double-checking the **workdir** it reports in the TUI at the start of a new session.
### Shell completions
Generate shell completion scripts via:
```shell
codex completion bash
codex completion zsh
codex completion fish
```
### Experimenting with the Codex Sandbox
To test to see what happens when a command is run under the sandbox provided by Codex, we provide the following subcommands in Codex CLI:
"Generated TypeScript still includes unions with `undefined` in {undefined_offenders:?}"
);
// If this assertion fails, it means a field was generated as
// "?: T | null" — i.e., both optional (undefined) and nullable (null).
// We only want either "?: T" or ": T | null".
assert!(
optional_nullable_offenders.is_empty(),
"Generated TypeScript has optional fields with nullable types (disallowed '?: T | null'), add #[ts(optional)] to fix:\n{optional_nullable_offenders:?}"
`codex app-server` is the harness Codex uses to power rich interfaces such as the [Codex VS Code extension](https://marketplace.visualstudio.com/items?itemName=openai.chatgpt). The message schema is currently unstable, but those who wish to build experimental UIs on top of Codex may find it valuable.
## Protocol
Similar to [MCP](https://modelcontextprotocol.io/), `codex app-server` supports bidirectional communication, streaming JSONL over stdio. The protocol is JSON-RPC 2.0, though the `"jsonrpc":"2.0"` header is omitted.
## Message Schema
Currently, you can dump a TypeScript version of the schema using `codex generate-ts`. It is specific to the version of Codex you used to run `generate-ts`, so the two are guaranteed to be compatible.
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.