Compare commits

...

82 Commits

Author SHA1 Message Date
Eric Traut
5ebfafab97 Fixed formatting issues 2025-11-12 11:15:44 -08:00
Eric Traut
4ce63c70a8 Merge branch 'main' into patch-1 2025-11-12 11:35:46 -06:00
Abkari Mohammed Sayeem
e201a3ea55 Fix Prettier formatting - move colons outside bold markers
Changed **Text:** to **Text**: throughout the file for Prettier compliance. Punctuation should be outside emphasis markers per Prettier markdown rules.
2025-11-12 22:52:32 +05:30
pakrym-oai
7d9ad3effd Fix otel tests (#6541)
Mount responses only once, remove unneeded retries and add a final
assistant messages to complete the turn.
2025-11-12 16:35:34 +00:00
Michael Bolin
c3a710ee14 chore: verify boolean values can be parsed as config overrides (#6516)
This is important to ensure that this:

```
codex --enable unified_exec
```

and this:

```
codex --config features.unified_exec=true
```

are equivalent. Also that when it is passed programmatically:


807e2c27f0/codex-rs/app-server-protocol/src/protocol/v1.rs (L55)

then this should work for `config`:

```json
{"features": {"shell_command_tool": true}}
```

though I believe also this:

```json
{"features.shell_command_tool": true}
```
2025-11-12 08:19:16 -08:00
Michael Bolin
29364f3a9b feat: shell_command tool (#6510)
This adds support for a new variant of the shell tool behind a flag. To
test, run `codex` with `--enable shell_command_tool`, which will
register the tool with Codex under the name `shell_command` that accepts
the following shape:

```python
{
  command: str
  workdir: str | None,
  timeout_ms: int | None,
  with_escalated_permissions: bool | None,
  justification: str | None,
}
```

This is comparable to the existing tool registered under
`shell`/`container.exec`. The primary difference is that it accepts
`command` as a `str` instead of a `str[]`. The `shell_command` tool
executes by running `execvp(["bash", "-lc", command])`, though the exact
arguments to `execvp(3)` depend on the user's default shell.

The hypothesis is that this will simplify things for the model. For
example, on Windows, instead of generating:

```json
{"command": ["pwsh.exe", "-NoLogo", "-Command", "ls -Name"]}
```

The model could simply generate:

```json
{"command": "ls -Name"}
```

As part of this change, I extracted some logic out of `user_shell.rs` as
`Shell::derive_exec_args()` so that it can be reused in
`codex-rs/core/src/tools/handlers/shell.rs`. Note the original code
generated exec arg lists like:

```javascript
["bash", "-lc", command]
["zsh", "-lc", command]
["pwsh.exe", "-NoProfile", "-Command", command]
```

Using `-l` for Bash and Zsh, but then specifying `-NoProfile` for
PowerShell seemed inconsistent to me, so I changed this in the new
implementation while also adding a `use_login_shell: bool` option to
make this explicit. If we decide to add a `login: bool` to
`ShellCommandToolCallParams` like we have for unified exec:


807e2c27f0/codex-rs/core/src/tools/handlers/unified_exec.rs (L33-L34)

Then this should make it straightforward to support.
2025-11-12 08:18:57 -08:00
Abkari Mohammed Sayeem
e9abecfb68 Fix heading levels - change #### to ### for examples
Prettier requires consistent heading levels throughout the document. Changed example headings from #### (h4) to ### (h3) to match the document structure and pass formatting checks.
2025-11-12 18:49:13 +05:30
jif-oai
530db0ad73 feat: warning switch model on resume (#6507)
<img width="1259" height="40" alt="Screenshot 2025-11-11 at 14 01 41"
src="https://github.com/user-attachments/assets/48ead3d2-d89c-4d8a-a578-82d9663dbd88"
/>
2025-11-12 11:13:37 +00:00
Gabriel Peal
424bfecd0b Re-add prettier log-level=warn to generate-ts (#6528)
I added it in https://github.com/openai/codex/pull/6342 but it was
removed in
https://github.com/openai/codex/pull/5063/files#diff-e2aa6dad1e886b7765158a27aefd1be5de99baa71b44f6bc5ce3fe462b9ae5d3R135
as a result of a bad diamond merge
2025-11-11 21:30:01 -05:00
Lionel Cheng
eb1c651c00 Update full-auto description with on-request (#6523)
This PR fixes #6522 by correcting the comment for `full-auto` in both
`codex-rs/exec/src/cli.rs` and `codex-rs/tui/src/cli.rs` from `-a
on-failure` to `-a on-request` to make it coherent with
`codex-rs/tui/src/lib.rs:97-105`:

```rust
pub async fn run_main(
    mut cli: Cli,
    codex_linux_sandbox_exe: Option<PathBuf>,
) -> std::io::Result<AppExitInfo> {
    let (sandbox_mode, approval_policy) = if cli.full_auto {
        (
            Some(SandboxMode::WorkspaceWrite),
            Some(AskForApproval::OnRequest),
        )
```

Running `just codex --help` or `just codex exec --help` should now yield
the correct description of `full-auto` CLI argument.

Signed-off-by: lionelchg <lionel.cheng@hotmail.fr>
2025-11-11 15:59:20 -08:00
Celia Chen
e357fc723d [app-server] add item started/completed events for turn items (#6517)
This one should be quite straightforward, as it's just a translation of
TurnItem events we already emit to ThreadItem that app-server exposes to
customers.

To test, cp my change to owen/app_server_test_client and do the
following:
```
cargo build -p codex-cli
RUST_LOG=codex_app_server=info CODEX_BIN=target/debug/codex cargo run -p codex-app-server-test-client -- send-message-v2 "hello"
```

example event before (still kept there for backward compatibility):
```
{
<   "method": "codex/event/item_completed",
<   "params": {
<     "conversationId": "019a74cc-fad9-7ab3-83a3-f42827b7b074",
<     "id": "0",
<     "msg": {
<       "item": {
<         "Reasoning": {
<           "id": "rs_03d183492e07e20a016913a936eb8c81a1a7671a103fee8afc",
<           "raw_content": [],
<           "summary_text": [
<             "Hey! What would you like to work on? I can explore the repo, run specific tests, or implement a change. Let's keep it short and straightforward. There's no need for a lengthy introduction or elaborate planning, just a friendly greeting and an open offer to help. I want to make sure the user feels welcomed and understood right from the start. It's all about keeping the tone friendly and concise!"
<           ]
<         }
<       },
<       "thread_id": "019a74cc-fad9-7ab3-83a3-f42827b7b074",
<       "turn_id": "0",
<       "type": "item_completed"
<     }
<   }
< }
```

after (v2):
```
< {
<   "method": "item/completed",
<   "params": {
<     "item": {
<       "id": "rs_03d183492e07e20a016913a936eb8c81a1a7671a103fee8afc",
<       "text": "Hey! What would you like to work on? I can explore the repo, run specific tests, or implement a change. Let's keep it short and straightforward. There's no need for a lengthy introduction or elaborate planning, just a friendly greeting and an open offer to help. I want to make sure the user feels welcomed and understood right from the start. It's all about keeping the tone friendly and concise!",
<       "type": "reasoning"
<     }
<   }
< }
```
2025-11-11 22:43:24 +00:00
Abkari Mohammed Sayeem
286cb2a021 Fix Prettier formatting - add newline at end of file
Fixes the Prettier formatting check failure by adding a newline character at the end of the file, which is required by the linter.
2025-11-12 00:50:43 +05:30
Abkari Mohammed Sayeem
8c3a2b1302 Reduce examples to 2 and remove Implementation reference section
Per reviewer feedback:
- Reduced from 4 examples to 2 (kept Example 1: Basic named arguments and Example 2: Mixed positional and named arguments)
- Removed Example 3: Using positional arguments
- Removed Example 4: Draft PR helper
- Removed entire Implementation reference section as it doesn't belong in public docs
2025-11-12 00:20:01 +05:30
pakrym-oai
807e2c27f0 Add unified exec escalation handling and tests (#6492)
Similar implementation to the shell tool
2025-11-11 08:19:35 -08:00
jif-oai
ad279eacdc nit: logs to trace (#6503) 2025-11-11 13:37:06 +00:00
jif-oai
052b052832 Enable ghost_commit feature by default (#6041)
## Summary
- enable the ghost_commit feature flag by default

## Testing
- just fmt

------
https://chatgpt.com/codex/tasks/task_i_6904ce2d0370832dbb3c2c09a90fb188
2025-11-11 09:20:46 +00:00
Celia Chen
6951872776 [hygiene][app-server] have a helper function for duplicate code in turn APIs (#6488)
turn_start and turn_interrupt have some logic that can be shared. have a
helper function for it.
2025-11-11 02:44:47 +00:00
pakrym-oai
bb7b0213a8 Colocate more of bash parsing (#6489)
Move a few callsites that were detecting `bash -lc` into a shared
helper.
2025-11-11 02:38:36 +00:00
pakrym-oai
6c36318bd8 Use codex-linux-sandbox in unified exec (#6480)
Unified exec isn't working on Linux because we don't provide the correct
arg0.

The library we use for pty management doesn't allow setting arg0
separately from executable. Use the same aliasing strategy we use for
`apply_patch` for `codex-linux-sandbox`.

Use `#[ctor]` hack to dispatch codex-linux-sandbox calls.


Addresses https://github.com/openai/codex/issues/6450
2025-11-10 17:17:09 -08:00
zhao-oai
930f81a17b flip rate limit status bar (#6482)
flipping rate limit status bar to match chat.com/codex/settings/usage

<img width="848" height="420" alt="Screenshot 2025-11-10 at 4 53 41 PM"
src="https://github.com/user-attachments/assets/e326db3f-4405-412d-9e62-337282ec9a35"
/>
2025-11-11 01:13:10 +00:00
iceweasel-oai
9aff64e017 upload Windows .exe file artifacts for CLI releases (#6478)
This PR is to unlock future WinGet installation. WinGet struggles to
create command aliases when installing from nested ZIPs on some clients,
so adding raw Windows x64/Arm64 executables lets the manifest use
InstallerType: portable with direct EXEs, which reliably registers the
codex alias. This makes “winget install → codex” work out of the box
without PATH changes.

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>
2025-11-10 23:31:06 +00:00
Owen Lin
3838d6739c [app-server] update macro to make renaming methods less boilerplate-y (#6470)
We already do this for notification definitions and it's really nice.

Verified there are no changes to actual exported files by diff'ing
before and after this change.
2025-11-10 15:15:08 -08:00
Josh McKinney
60deb6773a refactor(tui): job-control for Ctrl-Z handling (#6477)
- Moved the unix-only suspend/resume logic into a dedicated job_control
module housing SuspendContext, replacing scattered cfg-gated fields and
helpers in tui.rs.
- Tui now holds a single suspend_context (Arc-backed) instead of
multiple atomics, and the event stream uses it directly for Ctrl-Z
handling.
- Added detailed docs around the suspend/resume flow, cursor tracking,
and the Arc/atomic ownership model for the 'static event stream.
- Renamed the process-level SIGTSTP helper to suspend_process and the
cursor tracker to set_cursor_y to better reflect their roles.
2025-11-10 23:13:43 +00:00
Jeremy Rose
0271c20d8f add codex debug seatbelt --log-denials (#4098)
This adds a debugging tool for analyzing why certain commands fail to
execute under the sandbox.

Example output:

```
$ codex debug seatbelt --log-denials bash -lc "(echo foo > ~/foo.txt)"
bash: /Users/nornagon/foo.txt: Operation not permitted

=== Sandbox denials ===
(bash) file-write-data /dev/tty
(bash) file-write-data /dev/ttys001
(bash) sysctl-read kern.ngroups
(bash) file-write-create /Users/nornagon/foo.txt
```

It operates by:

1. spawning `log stream` to watch system logs, and
2. tracking all descendant PIDs using kqueue + proc_listchildpids.

this is a "best-effort" technique, as `log stream` may drop logs(?), and
kqueue + proc_listchildpids isn't atomic and can end up missing very
short-lived processes. But it works well enough in my testing to be
useful :)
2025-11-10 22:48:14 +00:00
George Nesterenok
52e97b9b6b Fix wayland image paste error (#4824)
## Summary
- log and surface clipboard failures instead of silently ignoring them
when `Ctrl+V` pastes an image (`paste_image_to_temp_png()` now feeds an
error history cell)
- enable `arboard`’s `wayland-data-control` feature so native Wayland
sessions can deliver image data without XWayland
- keep the success path unchanged: valid images still attach and show
the `[image …]` placeholder as before

Fixes #4818

---------

Co-authored-by: Eric Traut <etraut@openai.com>
Co-authored-by: Jeremy Rose <172423086+nornagon-openai@users.noreply.github.com>
2025-11-10 14:35:30 -08:00
Owen Lin
2ac49fea58 [app-server] chore: move initialize out of deprecated API section (#6468)
Self-explanatory - `initialize` is not a deprecated API and works
equally well with the v2 APIs.
2025-11-10 20:24:36 +00:00
jif-oai
f01f2ec9ee feat: add workdir to unified_exec (#6466) 2025-11-10 19:53:36 +00:00
zhao-oai
980886498c Add user command event types (#6246)
adding new user command event, logic in TUI to render user command
events
2025-11-10 19:18:45 +00:00
Ahmed Ibrahim
e743d251a7 Add opt-out for rate limit model nudge (#6433)
## Summary
- add a `hide_rate_limit_model_nudge` notice flag plus config edit
plumbing so the rate limit reminder preference is persisted and
documented
- extend the chat widget prompt with a "never show again" option, and
wire new app events so selecting it hides future nudges immediately and
writes the config
- add unit coverage and refresh the snapshot for the three-option prompt

## Testing
- `just fmt`
- `just fix -p codex-tui`
- `just fix -p codex-core`
- `cargo test -p codex-tui`
- `cargo test -p codex-core` *(fails at
`exec::tests::kill_child_process_group_kills_grandchildren_on_timeout`:
grandchild process still alive)*

------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_6910d7f407748321b2661fc355416994)
2025-11-10 09:21:53 -08:00
Shijie Rao
788badd221 fix: update brew auto update version check (#6238)
### Summary
* Use
`https://github.com/Homebrew/homebrew-cask/blob/main/Casks/c/codex.rb`
to get the latest available version for brew usage.
2025-11-10 09:05:00 -08:00
Owen Lin
fbdedd9a06 [app-server] feat: add command to generate json schema (#6406)
Add a `codex generate-json-schema` command for generating a JSON schema
bundle of app-server types, analogous to the existing `codex
generate-ts` command for Typescript.
2025-11-10 16:59:14 +00:00
Eric Traut
5916153157 Don't lock PRs that have been closed without merging (#6422)
The CLA action is designed to automatically lock a PR when it is closed.
This preserves the CLA agreement statements, preventing the contributor
from deleting them after the fact. However, this action is currently
locking PRs that are closed without merging. I'd like to keep such PRs
open so the contributor can respond with additional comments. I'm
currently manually unlocking PRs that I close, but I'd like to eliminate
this manual step.
2025-11-10 08:46:11 -08:00
Eric Traut
b46012e483 Support exiting from the login menu (#6419)
I recently fixed a bug in [this
PR](https://github.com/openai/codex/pull/6285) that prevented Ctrl+C
from dismissing the login menu in the TUI and leaving the user unauthed.

A [user pointed out](https://github.com/openai/codex/issues/6418) that
this makes Ctrl+C can no longer be used to exit the app. This PR changes
the behavior so we exit the app rather than ignoring the Ctrl+C.
2025-11-10 08:43:11 -08:00
Owen Lin
42683dadfb fix: use generate_ts from app_server_protocol (#6407)
Update `codex generate-ts` to use the TS export code from
`app-server-protocol/src/export.rs`.

I realized there were two duplicate implementations of Typescript export
code:
- `app-server-protocol/src/export.rs`
- the `codex-protocol-ts` crate

The `codex-protocol-ts` crate that `codex generate-ts` uses is out of
date now since it doesn't handle the V2 namespace from:
https://github.com/openai/codex/pull/6212.
2025-11-10 08:08:12 -08:00
Eric Traut
65cb1a1b77 Updated docs to reflect recent changes in web_search configuration (#6376)
This is a simplified version of [a
PR](https://github.com/openai/codex/pull/6134) supplied by a community
member.

It updates the docs to reflect a recent config deprecation.
2025-11-10 07:57:56 -08:00
jif-oai
50a77dc138 Move compact (#6454) 2025-11-10 11:59:48 +00:00
Raduan A.
557ac63094 Fix config documentation: correct TOML parsing description (#6424)
The CLI help text and inline comments incorrectly stated that -c
key=value flag parses values as JSON, when the implementation actually
uses TOML parsing via parse_toml_value(). This caused confusion when
users attempted to configure MCP servers using JSON syntax based on the
documentation.

Changes:
- Updated help text to correctly state TOML parsing instead of JSON

Fixes #4531
2025-11-09 22:58:32 -08:00
Andrew Nikolin
131c384361 Fix warning message phrasing (#6446)
Small fix for sentence phrasing in the warning message

Co-authored-by: AndrewNikolin <877163+AndrewNikolin@users.noreply.github.com>
2025-11-09 22:12:28 -08:00
dependabot[bot]
e2598f5094 chore(deps): bump zeroize from 1.8.1 to 1.8.2 in /codex-rs (#6444)
Bumps [zeroize](https://github.com/RustCrypto/utils) from 1.8.1 to
1.8.2.
<details>
<summary>Commits</summary>
<ul>
<li><a
href="c100874101"><code>c100874</code></a>
zeroize v1.8.2 (<a
href="https://redirect.github.com/RustCrypto/utils/issues/1229">#1229</a>)</li>
<li><a
href="3940ccbebd"><code>3940ccb</code></a>
Switch from <code>doc_auto_cfg</code> to <code>doc_cfg</code> (<a
href="https://redirect.github.com/RustCrypto/utils/issues/1228">#1228</a>)</li>
<li><a
href="c68a5204b2"><code>c68a520</code></a>
Fix Nightly warnings (<a
href="https://redirect.github.com/RustCrypto/utils/issues/1080">#1080</a>)</li>
<li><a
href="b15cc6c1cd"><code>b15cc6c</code></a>
cargo: point <code>repository</code> metadata to clonable URLs (<a
href="https://redirect.github.com/RustCrypto/utils/issues/1079">#1079</a>)</li>
<li><a
href="3db6690f7b"><code>3db6690</code></a>
zeroize: fix <code>homepage</code>/<code>repository</code> in Cargo.toml
(<a
href="https://redirect.github.com/RustCrypto/utils/issues/1076">#1076</a>)</li>
<li>See full diff in <a
href="https://github.com/RustCrypto/utils/compare/zeroize-v1.8.1...zeroize-v1.8.2">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=zeroize&package-manager=cargo&previous-version=1.8.1&new-version=1.8.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-09 22:03:36 -08:00
dependabot[bot]
78b2aeea55 chore(deps): bump askama from 0.12.1 to 0.14.0 in /codex-rs (#6443)
Bumps [askama](https://github.com/askama-rs/askama) from 0.12.1 to
0.14.0.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/askama-rs/askama/releases">askama's
releases</a>.</em></p>
<blockquote>
<h2>v0.14.0</h2>
<h2>Added Features</h2>
<ul>
<li>Implement <code>Values</code> on tuple by <a
href="https://github.com/GuillaumeGomez"><code>@​GuillaumeGomez</code></a>
in <a
href="https://redirect.github.com/askama-rs/askama/pull/391">askama-rs/askama#391</a></li>
<li>Pass variables to sub-templates more reliably even if indirectly by
<a href="https://github.com/Kijewski"><code>@​Kijewski</code></a> in <a
href="https://redirect.github.com/askama-rs/askama/pull/397">askama-rs/askama#397</a></li>
<li>Implement <code>first</code> and <code>blank</code> arguments for
<code>|indent</code> by <a
href="https://github.com/Kijewski"><code>@​Kijewski</code></a> in <a
href="https://redirect.github.com/askama-rs/askama/pull/401">askama-rs/askama#401</a></li>
<li>Add named arguments for builtin filters by <a
href="https://github.com/Kijewski"><code>@​Kijewski</code></a> in <a
href="https://redirect.github.com/askama-rs/askama/pull/403">askama-rs/askama#403</a></li>
<li>Add <code>unique</code> filter by <a
href="https://github.com/GuillaumeGomez"><code>@​GuillaumeGomez</code></a>
in <a
href="https://redirect.github.com/askama-rs/askama/pull/405">askama-rs/askama#405</a></li>
</ul>
<h2>Bug Fixes And Consistency</h2>
<ul>
<li><code>askama_derive</code> accidentally exposed as a feature by <a
href="https://github.com/Kijewski"><code>@​Kijewski</code></a> in <a
href="https://redirect.github.com/askama-rs/askama/pull/384">askama-rs/askama#384</a></li>
<li>Track config files by <a
href="https://github.com/GuillaumeGomez"><code>@​GuillaumeGomez</code></a>
in <a
href="https://redirect.github.com/askama-rs/askama/pull/385">askama-rs/askama#385</a></li>
<li>If using local variable as value when creating a new variable, do
not put it behind a reference by <a
href="https://github.com/GuillaumeGomez"><code>@​GuillaumeGomez</code></a>
in <a
href="https://redirect.github.com/askama-rs/askama/pull/392">askama-rs/askama#392</a></li>
<li>generator: make <code>CARGO_MANIFEST_DIR</code> part of
<code>ConfigKey</code> by <a
href="https://github.com/strickczq"><code>@​strickczq</code></a> in <a
href="https://redirect.github.com/askama-rs/askama/pull/395">askama-rs/askama#395</a></li>
<li>Do not put question mark initialization expressions behind a
reference by <a
href="https://github.com/GuillaumeGomez"><code>@​GuillaumeGomez</code></a>
in <a
href="https://redirect.github.com/askama-rs/askama/pull/400">askama-rs/askama#400</a></li>
<li>Update to more current rust version on readthedocs by <a
href="https://github.com/Kijewski"><code>@​Kijewski</code></a> in <a
href="https://redirect.github.com/askama-rs/askama/pull/410">askama-rs/askama#410</a></li>
<li>Fix <code>unique</code> filter implementation by <a
href="https://github.com/GuillaumeGomez"><code>@​GuillaumeGomez</code></a>
in <a
href="https://redirect.github.com/askama-rs/askama/pull/417">askama-rs/askama#417</a></li>
<li>Add <code>|titlecase</code> as alias for <code>|title</code> by <a
href="https://github.com/Kijewski"><code>@​Kijewski</code></a> in <a
href="https://redirect.github.com/askama-rs/askama/pull/416">askama-rs/askama#416</a></li>
</ul>
<h2>Further Changes</h2>
<ul>
<li>book: add page about <code>FastWritable</code> by <a
href="https://github.com/Kijewski"><code>@​Kijewski</code></a> in <a
href="https://redirect.github.com/askama-rs/askama/pull/407">askama-rs/askama#407</a></li>
<li>Add throughput to derive benchmark by <a
href="https://github.com/Kijewski"><code>@​Kijewski</code></a> in <a
href="https://redirect.github.com/askama-rs/askama/pull/413">askama-rs/askama#413</a></li>
<li>Move <code>FastWritable</code> into <code>askama</code> root by <a
href="https://github.com/GuillaumeGomez"><code>@​GuillaumeGomez</code></a>
in <a
href="https://redirect.github.com/askama-rs/askama/pull/411">askama-rs/askama#411</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/strickczq"><code>@​strickczq</code></a>
made their first contribution in <a
href="https://redirect.github.com/askama-rs/askama/pull/395">askama-rs/askama#395</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/askama-rs/askama/compare/v0.13.0...v0.14.0">https://github.com/askama-rs/askama/compare/v0.13.0...v0.14.0</a></p>
<h2>v0.13.1</h2>
<h2>What's Changed</h2>
<ul>
<li><code>askama_derive</code> accidentally exposed as a feature by <a
href="https://github.com/Kijewski"><code>@​Kijewski</code></a> in <a
href="https://redirect.github.com/askama-rs/askama/pull/384">askama-rs/askama#384</a></li>
<li>Track config files by <a
href="https://github.com/GuillaumeGomez"><code>@​GuillaumeGomez</code></a>
in <a
href="https://redirect.github.com/askama-rs/askama/pull/385">askama-rs/askama#385</a></li>
<li>Implement <code>Values</code> on tuple by <a
href="https://github.com/GuillaumeGomez"><code>@​GuillaumeGomez</code></a>
in <a
href="https://redirect.github.com/askama-rs/askama/pull/391">askama-rs/askama#391</a></li>
<li>generator: make <code>CARGO_MANIFEST_DIR</code> part of
<code>ConfigKey</code> by <a
href="https://github.com/strickczq"><code>@​strickczq</code></a> in <a
href="https://redirect.github.com/askama-rs/askama/pull/395">askama-rs/askama#395</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/strickczq"><code>@​strickczq</code></a>
made their first contribution in <a
href="https://redirect.github.com/askama-rs/askama/pull/395">askama-rs/askama#395</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/askama-rs/askama/compare/v0.13.0...v0.13.1">https://github.com/askama-rs/askama/compare/v0.13.0...v0.13.1</a></p>
<h2>v0.13.0 – Rinja is Askama, again!</h2>
<p>With this release, the <a
href="https://blog.guillaume-gomez.fr/articles/2024-07-31+docs.rs+switching+jinja+template+framework+from+tera+to+rinja">fork</a>
rinja got merged back into the main project. Please have a look at our
<a
href="https://blog.guillaume-gomez.fr/articles/2025-03-19+Askama+and+Rinja+merge">blog
post</a> for more information about the split and the merge.</p>
<h2>What's Changed</h2>
<p>This release (v0.13.0), when <a
href="https://github.com/askama-rs/askama/compare/0.12.1...v0.13.0">compared
to</a> the last stable askama release (v0.12.1), consists of:</p>
<ul>
<li>over 1000 commits</li>
<li>with changes in over 500 files</li>
<li>with over 40k additions and 8000 deletions</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="95867ac8ce"><code>95867ac</code></a>
Merge pull request <a
href="https://redirect.github.com/askama-rs/askama/issues/416">#416</a>
from Kijewski/pr-upgrading-0.14</li>
<li><a
href="61b7422497"><code>61b7422</code></a>
Add <code>|titlecase</code> as alias for <code>|title</code></li>
<li><a
href="79be271593"><code>79be271</code></a>
Run doctests</li>
<li><a
href="72bbe3ede1"><code>72bbe3e</code></a>
Bump version number to v0.14.0</li>
<li><a
href="57750338fa"><code>5775033</code></a>
book: update <code>upgrading.md</code></li>
<li><a
href="a5b43c0aa2"><code>a5b43c0</code></a>
Fix <code>unique</code> filter implementation</li>
<li><a
href="7fccbdf1d7"><code>7fccbdf</code></a>
Remove usage of <code>nextest</code></li>
<li><a
href="6a16256f24"><code>6a16256</code></a>
Fix new clippy lints</li>
<li><a
href="04a4d5b020"><code>04a4d5b</code></a>
Update MSRV to 1.83</li>
<li><a
href="d2a788a740"><code>d2a788a</code></a>
Add doc about <code>unique</code> filter</li>
<li>Additional commits viewable in <a
href="https://github.com/askama-rs/askama/compare/0.12.1...v0.14.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=askama&package-manager=cargo&previous-version=0.12.1&new-version=0.14.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-09 22:02:26 -08:00
dependabot[bot]
082d2fa19a chore(deps): bump taiki-e/install-action from 2.60.0 to 2.62.49 (#6438)
Bumps
[taiki-e/install-action](https://github.com/taiki-e/install-action) from
2.60.0 to 2.62.49.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/taiki-e/install-action/releases">taiki-e/install-action's
releases</a>.</em></p>
<blockquote>
<h2>2.62.49</h2>
<ul>
<li>
<p>Update <code>cargo-binstall@latest</code> to 1.15.11.</p>
</li>
<li>
<p>Update <code>cargo-auditable@latest</code> to 0.7.2.</p>
</li>
<li>
<p>Update <code>vacuum@latest</code> to 0.20.2.</p>
</li>
</ul>
<h2>2.62.48</h2>
<ul>
<li>
<p>Update <code>mise@latest</code> to 2025.11.3.</p>
</li>
<li>
<p>Update <code>cargo-audit@latest</code> to 0.22.0.</p>
</li>
<li>
<p>Update <code>vacuum@latest</code> to 0.20.1.</p>
</li>
<li>
<p>Update <code>uv@latest</code> to 0.9.8.</p>
</li>
<li>
<p>Update <code>cargo-udeps@latest</code> to 0.1.60.</p>
</li>
<li>
<p>Update <code>zizmor@latest</code> to 1.16.3.</p>
</li>
</ul>
<h2>2.62.47</h2>
<ul>
<li>
<p>Update <code>vacuum@latest</code> to 0.20.0.</p>
</li>
<li>
<p>Update <code>cargo-nextest@latest</code> to 0.9.111.</p>
</li>
<li>
<p>Update <code>cargo-shear@latest</code> to 1.6.2.</p>
</li>
</ul>
<h2>2.62.46</h2>
<ul>
<li>
<p>Update <code>vacuum@latest</code> to 0.19.5.</p>
</li>
<li>
<p>Update <code>syft@latest</code> to 1.37.0.</p>
</li>
<li>
<p>Update <code>mise@latest</code> to 2025.11.2.</p>
</li>
<li>
<p>Update <code>knope@latest</code> to 0.21.5.</p>
</li>
</ul>
<h2>2.62.45</h2>
<ul>
<li>
<p>Update <code>zizmor@latest</code> to 1.16.2.</p>
</li>
<li>
<p>Update <code>cargo-binstall@latest</code> to 1.15.10.</p>
</li>
<li>
<p>Update <code>ubi@latest</code> to 0.8.4.</p>
</li>
<li>
<p>Update <code>mise@latest</code> to 2025.11.1.</p>
</li>
<li>
<p>Update <code>cargo-semver-checks@latest</code> to 0.45.0.</p>
</li>
</ul>
<h2>2.62.44</h2>
<ul>
<li>Update <code>mise@latest</code> to 2025.11.0.</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/taiki-e/install-action/blob/main/CHANGELOG.md">taiki-e/install-action's
changelog</a>.</em></p>
<blockquote>
<h1>Changelog</h1>
<p>All notable changes to this project will be documented in this
file.</p>
<p>This project adheres to <a href="https://semver.org">Semantic
Versioning</a>.</p>
<!-- raw HTML omitted -->
<h2>[Unreleased]</h2>
<h2>[2.62.49] - 2025-11-09</h2>
<ul>
<li>
<p>Update <code>cargo-binstall@latest</code> to 1.15.11.</p>
</li>
<li>
<p>Update <code>cargo-auditable@latest</code> to 0.7.2.</p>
</li>
<li>
<p>Update <code>vacuum@latest</code> to 0.20.2.</p>
</li>
</ul>
<h2>[2.62.48] - 2025-11-08</h2>
<ul>
<li>
<p>Update <code>mise@latest</code> to 2025.11.3.</p>
</li>
<li>
<p>Update <code>cargo-audit@latest</code> to 0.22.0.</p>
</li>
<li>
<p>Update <code>vacuum@latest</code> to 0.20.1.</p>
</li>
<li>
<p>Update <code>uv@latest</code> to 0.9.8.</p>
</li>
<li>
<p>Update <code>cargo-udeps@latest</code> to 0.1.60.</p>
</li>
<li>
<p>Update <code>zizmor@latest</code> to 1.16.3.</p>
</li>
</ul>
<h2>[2.62.47] - 2025-11-05</h2>
<ul>
<li>
<p>Update <code>vacuum@latest</code> to 0.20.0.</p>
</li>
<li>
<p>Update <code>cargo-nextest@latest</code> to 0.9.111.</p>
</li>
<li>
<p>Update <code>cargo-shear@latest</code> to 1.6.2.</p>
</li>
</ul>
<h2>[2.62.46] - 2025-11-04</h2>
<ul>
<li>
<p>Update <code>vacuum@latest</code> to 0.19.5.</p>
</li>
<li>
<p>Update <code>syft@latest</code> to 1.37.0.</p>
</li>
<li>
<p>Update <code>mise@latest</code> to 2025.11.2.</p>
</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="44c6d64aa6"><code>44c6d64</code></a>
Release 2.62.49</li>
<li><a
href="3a701df4c2"><code>3a701df</code></a>
Update <code>cargo-binstall@latest</code> to 1.15.11</li>
<li><a
href="4242e04eb8"><code>4242e04</code></a>
Update <code>cargo-auditable@latest</code> to 0.7.2</li>
<li><a
href="3df5533ef8"><code>3df5533</code></a>
Update <code>vacuum@latest</code> to 0.20.2</li>
<li><a
href="e797ba6a25"><code>e797ba6</code></a>
Release 2.62.48</li>
<li><a
href="bcf91e02ac"><code>bcf91e0</code></a>
Update <code>mise@latest</code> to 2025.11.3</li>
<li><a
href="e78113b60c"><code>e78113b</code></a>
Update <code>cargo-audit@latest</code> to 0.22.0</li>
<li><a
href="0ef486444e"><code>0ef4864</code></a>
Update <code>vacuum@latest</code> to 0.20.1</li>
<li><a
href="5eda7b1985"><code>5eda7b1</code></a>
Update <code>uv@latest</code> to 0.9.8</li>
<li><a
href="3853a413e6"><code>3853a41</code></a>
Update <code>cargo-udeps@latest</code> to 0.1.60</li>
<li>Additional commits viewable in <a
href="0c5db7f7f8...44c6d64aa6">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=taiki-e/install-action&package-manager=github_actions&previous-version=2.60.0&new-version=2.62.49)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-09 22:00:08 -08:00
dependabot[bot]
7c7c7567d5 chore(deps): bump codespell-project/actions-codespell from 2.1 to 2.2 (#6437)
Bumps
[codespell-project/actions-codespell](https://github.com/codespell-project/actions-codespell)
from 2.1 to 2.2.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/codespell-project/actions-codespell/releases">codespell-project/actions-codespell's
releases</a>.</em></p>
<blockquote>
<h2>v2.2</h2>
<!-- raw HTML omitted -->
<h2>What's Changed</h2>
<ul>
<li>Add the config file option and tests by <a
href="https://github.com/rdimaio"><code>@​rdimaio</code></a> in <a
href="https://redirect.github.com/codespell-project/actions-codespell/pull/80">codespell-project/actions-codespell#80</a></li>
<li>Use <code>pip install</code> with <code>--no-cache-dir</code> in the
Dockerfile by <a
href="https://github.com/PeterDaveHello"><code>@​PeterDaveHello</code></a>
in <a
href="https://redirect.github.com/codespell-project/actions-codespell/pull/89">codespell-project/actions-codespell#89</a></li>
<li>Upgrade to Python 3.13 by <a
href="https://github.com/candrews"><code>@​candrews</code></a> in <a
href="https://redirect.github.com/codespell-project/actions-codespell/pull/82">codespell-project/actions-codespell#82</a></li>
<li>Add checkout action and problem matcher to README by <a
href="https://github.com/vadi2"><code>@​vadi2</code></a> in <a
href="https://redirect.github.com/codespell-project/actions-codespell/pull/32">codespell-project/actions-codespell#32</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/rdimaio"><code>@​rdimaio</code></a> made
their first contribution in <a
href="https://redirect.github.com/codespell-project/actions-codespell/pull/80">codespell-project/actions-codespell#80</a></li>
<li><a
href="https://github.com/PeterDaveHello"><code>@​PeterDaveHello</code></a>
made their first contribution in <a
href="https://redirect.github.com/codespell-project/actions-codespell/pull/89">codespell-project/actions-codespell#89</a></li>
<li><a href="https://github.com/candrews"><code>@​candrews</code></a>
made their first contribution in <a
href="https://redirect.github.com/codespell-project/actions-codespell/pull/82">codespell-project/actions-codespell#82</a></li>
<li><a href="https://github.com/vadi2"><code>@​vadi2</code></a> made
their first contribution in <a
href="https://redirect.github.com/codespell-project/actions-codespell/pull/32">codespell-project/actions-codespell#32</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/codespell-project/actions-codespell/compare/v2...v2.2">https://github.com/codespell-project/actions-codespell/compare/v2...v2.2</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="8f01853be1"><code>8f01853</code></a>
MAINT: Release notes</li>
<li><a
href="23a4abea24"><code>23a4abe</code></a>
Add checkout action and problem matcher to README (<a
href="https://redirect.github.com/codespell-project/actions-codespell/issues/32">#32</a>)</li>
<li><a
href="906f13fba1"><code>906f13f</code></a>
Upgrade to Python 3.13 (<a
href="https://redirect.github.com/codespell-project/actions-codespell/issues/82">#82</a>)</li>
<li><a
href="df0bba344d"><code>df0bba3</code></a>
[pre-commit.ci] pre-commit autoupdate (<a
href="https://redirect.github.com/codespell-project/actions-codespell/issues/85">#85</a>)</li>
<li><a
href="c460eef33e"><code>c460eef</code></a>
Use <code>pip install</code> with <code>--no-cache-dir</code> in the
Dockerfile (<a
href="https://redirect.github.com/codespell-project/actions-codespell/issues/89">#89</a>)</li>
<li><a
href="037a23a348"><code>037a23a</code></a>
Add the config file option and tests (<a
href="https://redirect.github.com/codespell-project/actions-codespell/issues/80">#80</a>)</li>
<li><a
href="8d1a4b1bd9"><code>8d1a4b1</code></a>
Bump actions/setup-python from 5 to 6 (<a
href="https://redirect.github.com/codespell-project/actions-codespell/issues/92">#92</a>)</li>
<li><a
href="71286cb40f"><code>71286cb</code></a>
Bump actions/checkout from 4 to 5 (<a
href="https://redirect.github.com/codespell-project/actions-codespell/issues/91">#91</a>)</li>
<li><a
href="fad9339798"><code>fad9339</code></a>
[pre-commit.ci] pre-commit autoupdate (<a
href="https://redirect.github.com/codespell-project/actions-codespell/issues/84">#84</a>)</li>
<li>See full diff in <a
href="406322ec52...8f01853be1">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=codespell-project/actions-codespell&package-manager=github_actions&previous-version=2.1&new-version=2.2)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-09 21:58:51 -08:00
iceweasel-oai
625f2208c4 For npm upgrade on Windows, go through cmd.exe to get path traversal working (#6387)
On Windows, `npm` by itself does not resolve under std::process::Command
which does not consider PATHEXT to resolve it to `npm.cmd` in the PATH.
By running the npm upgrade command via cmd.exe we get proper path
semantics so it actually works.
2025-11-09 21:07:44 -08:00
kinopeee
5f1fab0e7c fix(cloud-tasks): respect cli_auth_credentials_store config (#5856)
## Problem

`codex cloud` always instantiated `AuthManager` with `File` mode,
ignoring the user's actual `cli_auth_credentials_store` setting. This
caused users with `cli_auth_credentials_store = "keyring"` (or `"auto"`)
to see "Not signed in" errors even when they had valid credentials
stored in the system keyring.

## Root cause

The code called `Config::load_from_base_config_with_overrides()` with an
empty `ConfigToml::default()`, which always returned `File` as the
default store mode instead of loading the actual user configuration.

## Solution

- **Added `util::load_cli_auth_manager()` helper**  
Properly loads user config via
`load_config_as_toml_with_cli_overrides()` and extracts the
`cli_auth_credentials_store` setting before creating `AuthManager`.

- **Updated callers**  
  - `init_backend()` - used when starting cloud tasks UI
  - `build_chatgpt_headers()` - used for API requests

## Testing

-  `just fmt`
-  `just fix -p codex-cloud-tasks`
-  `cargo test -p codex-cloud-tasks`

## Files changed

- `codex-rs/cloud-tasks/src/lib.rs`
- `codex-rs/cloud-tasks/src/util.rs`

## Verification

Users with keyring-based auth can now run `codex cloud` successfully
without "Not signed in" errors.

---------

Co-authored-by: Eric Traut <etraut@openai.com>
Co-authored-by: celia-oai <celia@openai.com>
2025-11-10 00:35:08 +00:00
Oliver Mannion
c07461e6f3 fix(seatbelt): Allow reading hw.physicalcpu (#6421)
Allow reading `hw.physicalcpu` so numpy can be imported when running in
the sandbox.
 
resolves #6420
2025-11-09 08:53:36 -08:00
Raduan A.
8b80a0a269 Fix SDK documentation: replace 'file diffs' with 'file change notifications' (#6425)
The TypeScript SDK's README incorrectly claimed that runStreamed() emits
"file diffs". However, the FileChangeItem type only contains metadata
(path, kind, status) without actual diff content.

Updated line 36 to accurately describe the SDK as providing "file change
notifications" instead of "file diffs" to match the actual
implementation in items.ts.

Fixes #5850
2025-11-09 08:37:16 -08:00
iceweasel-oai
a47181e471 more world-writable warning improvements (#6389)
3 improvements:
1. show up to 3 actual paths that are world-writable
2. do the scan/warning for Read-Only mode too, because it also applies
there
3. remove the "Cancel" option since it doesn't always apply (like on
startup)
2025-11-08 11:35:43 -08:00
Raduan A.
5beb6167c8 feat(tui): Display keyboard shortcuts inline for approval options (#5889)
Shows single-key shortcuts (y, a, n) next to approval options to make
them more discoverable. Previously these shortcuts worked but were
hidden, making the feature hard to discover.

Changes:
- "Yes, proceed" now shows "y" shortcut
- "Yes, and don't ask again" now shows "a" shortcut
- "No, and tell Codex..." continues to show "esc" shortcut

This improves UX by surfacing the quick keyboard shortcuts that were
already functional but undiscoverable in the UI.

---

Update:

added parentheses for better visual clarity 
<img width="1540" height="486" alt="CleanShot 2025-11-05 at 11 47 07@2x"
src="https://github.com/user-attachments/assets/f951c34a-9ec8-4b81-b151-7b2ccba94658"
/>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Eric Traut <etraut@openai.com>
2025-11-08 09:08:42 -08:00
iceweasel-oai
917f39ec12 Improve world-writable scan (#6381)
1. scan many more directories since it's much faster than the original
implementation
2. limit overall scan time to 2s
3. skip some directories that are noisy - ApplicationData, Installer,
etc.
2025-11-07 21:28:55 -08:00
Luca King
a2fdfce02a Kill shell tool process groups on timeout (#5258)
## Summary
- launch shell tool processes in their own process group so Codex owns
the full tree
- on timeout or ctrl-c, send SIGKILL to the process group before
terminating the tracked child
- document that the default shell/unified_exec timeout remains 1000 ms

## Original Bug
Long-lived shell tool commands hang indefinitely because the timeout
handler only terminated the direct child process; any grandchildren it
spawned kept running and held the PTY open, preventing Codex from
regaining control.

## Repro Original Bug
Install next.js and run `next dev` (which is a long-running shell
process with children). On openai:main, it will cause the agent to
permanently get stuck here until human intervention. On this branch,
this command will be terminated successfully after timeout_ms which will
unblock the agent. This is a critical fix for unmonitored / lightly
monitored agents that don't have immediate human observation to unblock
them.

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>
Co-authored-by: Michael Bolin <bolinfest@gmail.com>
2025-11-07 17:54:35 -08:00
pakrym-oai
91b16b8682 Don't request approval for safe commands in unified exec (#6380) 2025-11-07 16:36:04 -08:00
Alexander Smirnov
183fc8e01a core: replace Cloudflare 403 HTML with friendly message (#6252)
### Motivation

When Codex is launched from a region where Cloudflare blocks access (for
example, Russia), the CLI currently dumps Cloudflare’s entire HTML error
page. This isn’t actionable and makes it hard for users to understand
what happened. We want to detect the Cloudflare block and show a
concise, user-friendly explanation instead.

### What Changed

- Added CLOUDFLARE_BLOCKED_MESSAGE and a friendly_message() helper to
UnexpectedResponseError. Whenever we see a 403 whose body contains the
Cloudflare block notice, we now emit a single-line message (Access
blocked by Cloudflare…) while preserving the HTTP status and request id.
All other responses keep the original behaviour.
- Added two focused unit tests:
- unexpected_status_cloudflare_html_is_simplified ensures the Cloudflare
HTML case yields the friendly message.
- unexpected_status_non_html_is_unchanged confirms plain-text 403s still
return the raw body.

### Testing

- cargo build -p codex-cli
- cargo test -p codex-core
- just fix -p codex-core
- cargo test --all-features

---------

Co-authored-by: Eric Traut <etraut@openai.com>
2025-11-07 15:55:16 -08:00
Josh McKinney
9fba811764 refactor(terminal): cleanup deprecated flush logic (#6373)
Removes flush logic that was leftover to test against ratatui's flush
Cleaned up the flush logic so it's a bit more intent revealing.
DrawCommand now owns the Cells that it draws as this works around a
borrow checker problem.
2025-11-07 15:54:07 -08:00
Celia Chen
db408b9e62 [App-server] add initialization to doc (#6377)
Address comments in #6353.
2025-11-07 23:52:20 +00:00
Jakob Malmo
2eecc1a2e4 fix(wsl): normalize Windows paths during update (#6086) (#6097)
When running under WSL, the update command could receive Windows-style
absolute paths (e.g., `C:\...`) and pass them to Linux processes
unchanged, which fails because WSL expects those paths in
`/mnt/<drive>/...` form.

This patch adds a tiny helper in the CLI (`cli/src/wsl_paths.rs`) that:
- Detects WSL (`WSL_DISTRO_NAME` or `"microsoft"` in `/proc/version`)  
- Converts `X:\...` → `/mnt/x/...`  

`run_update_action` now normalizes the package-manager command and
arguments under WSL before spawning.
Non-WSL platforms are unaffected.  

Includes small unit tests for the converter.  

**Fixes:** #6086, #6084

Co-authored-by: Eric Traut <etraut@openai.com>
2025-11-07 14:49:17 -08:00
Dan Hernandez
c76528ca1f [SDK] Add network_access and web_search options to TypeScript SDK (#6367)
## Summary

This PR adds two new optional boolean fields to `ThreadOptions` in the
TypeScript SDK:

- **`networkAccess`**: Enables network access in the sandbox by setting
`sandbox_workspace_write.network_access` config
- **`webSearch`**: Enables the web search tool by setting
`tools.web_search` config

These options map to existing Codex configuration options and are
properly threaded through the SDK layers:
1. `ThreadOptions` (threadOptions.ts) - User-facing API
2. `CodexExecArgs` (exec.ts) - Internal execution args  
3. CLI flags via `--config` in the `codex exec` command

## Changes

- `sdk/typescript/src/threadOptions.ts`: Added `networkAccess` and
`webSearch` fields to `ThreadOptions` type
- `sdk/typescript/src/exec.ts`: Added fields to `CodexExecArgs` and CLI
flag generation
- `sdk/typescript/src/thread.ts`: Pass options through to exec layer

## Test Plan

- [x] Build succeeds (`pnpm build`)
- [x] Linter passes (`pnpm lint`)
- [x] Type definitions are properly exported
- [ ] Manual testing with sample code (to be done by reviewer)

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-11-07 13:19:34 -08:00
Michael Bolin
bb47f2226f feat: add --promote-alpha option to create_github_release script (#6370)
Historically, running `create_github_release --publish-release` would
always publish a new release from latest `main`, which isn't always the
best idea. We should really publish an alpha, let it bake, and then
promote it.

This PR introduces a new flag, `--promote-alpha`, which does exactly
that. It also works with `--dry-run`, so you can sanity check the commit
it will use as the base commit for the new release before running it for
real.

```shell
$ ./codex-rs/scripts/create_github_release --dry-run --promote-alpha 0.56.0-alpha.2
Publishing version 0.56.0
Running gh api GET /repos/openai/codex/git/refs/tags/rust-v0.56.0-alpha.2
Running gh api GET /repos/openai/codex/git/tags/7d4ef77bc35b011aa0c76c5cbe6cd7d3e53f1dfe
Running gh api GET /repos/openai/codex/compare/main...8b49211e67d3c863df5ecc13fc5f88516a20fa69
Would publish version 0.56.0 using base commit 62474a30e8 derived from rust-v0.56.0-alpha.2.
```
2025-11-07 20:05:22 +00:00
Jeremy Rose
c6ab92bc50 tui: add comments to tui.rs (#6369) 2025-11-07 18:17:52 +00:00
pakrym-oai
4c1a6f0ee0 Promote shell config tool to model family config (#6351) 2025-11-07 10:11:11 -08:00
Owen Lin
361d43b969 [app-server] doc: update README for threads and turns (#6368)
Self explanatory!
2025-11-07 17:02:49 +00:00
Celia Chen
2e81f1900d [App-server] Add auth v2 doc & update codex mcp interface auth section (#6353)
Added doc for auth v2 endpoints. Updated the auth section in Codex MCP
interface doc too.
2025-11-07 08:17:19 -08:00
Owen Lin
2030b28083 [app-server] feat: expose additional fields on Thread (#6338)
Add the following fields to Thread:

```
    pub preview: String,
    pub model_provider: String,
    pub created_at: i64,
```

Will prob need another PR once this lands:
https://github.com/openai/codex/pull/6337
2025-11-07 04:08:45 +00:00
Celia Chen
e84e39940b [App-server] Implement account/read endpoint (#6336)
This PR does two things:
1. add a new function in core that maps the core-internal plan type to
the external plan type;
2. implement account/read that get account status (v2 of
`getAuthStatus`).
2025-11-06 19:43:13 -08:00
pakrym-oai
e8905f6d20 Prefer wait_for_event over wait_for_event_with_timeout (#6349) 2025-11-06 18:11:11 -08:00
Shane Vitarana
316352be94 Fix apply_patch rename move path resolution (#5486)
Fixes https://github.com/openai/codex/issues/5485.

Fixed rename hunks so `apply_patch` resolves the destination path using
the verifier’s effective cwd, ensuring patches that run under `cd
<worktree> && apply_patch` stay inside the worktree.

Added a regression test
(`test_apply_patch_resolves_move_path_with_effective_cwd`) that
reproduced the old behavior (dest path resolved in the main repo) and
now passes.

Related to https://github.com/openai/codex/issues/5483.

Co-authored-by: Eric Traut <etraut@openai.com>
2025-11-06 17:02:09 -08:00
pakrym-oai
f8b30af6dc Prefer wait_for_event over wait_for_event_with_timeout. (#6346)
No need to specify the timeout in most cases.
2025-11-06 16:14:43 -08:00
Eric Traut
039a4b070e Updated the AI labeler rules to match the most recent issue tracker labels (#6347)
This PR updates the AI prompt used for the workflow that adds automated
labels to incoming issues. I've been updating and refining the list of
labels as I work through the issue backlog, and the old prompt was
becoming somewhat outdated.
2025-11-06 16:02:12 -08:00
pakrym-oai
c368c6aeea Remove shell tool when unified exec is enabled (#6345)
Also drop streameable shell that's just an alias for unified exec.
2025-11-06 15:46:24 -08:00
Eric Traut
0c647bc566 Don't retry "insufficient_quota" errors (#6340)
This PR makes an "insufficient quota" error fatal so we don't attempt to
retry it multiple times in the agent loop.

We have multiple bug reports from users about intermittent retry
behaviors, and this could explain some of them. With this change, we'll
eliminate the retries and surface a clear error message.

The PR is a nearly identical copy of [this
PR](https://github.com/openai/codex/pull/4837) contributed by
@abimaelmartell. The original PR has gone stale. Rather than wait for
the contributor to resolve merge conflicts, I wanted to get this change
in.
2025-11-06 15:12:01 -08:00
Ejaz Ahmed
e30f65118d feat: Enable CTRL-n and CTRL-p for navigating slash commands, files, history (#1994)
Adds CTRL-n and CTRL-p navigation for slash commands, files, and
history.
Closes #1992

Co-authored-by: Eric Traut <etraut@openai.com>
2025-11-06 14:58:18 -08:00
Jeremy Rose
1bd2d7a659 tui: fix backtracking past /status (#6335)
Fixes https://github.com/openai/codex/issues/4722

Supersedes https://github.com/openai/codex/pull/5058

Ideally we'd have a clearer way of separating history per-session than
by detecting a specific history cell type, but this is a fairly minimal
fix for now.
2025-11-06 14:50:07 -08:00
Gabriel Peal
65d53fd4b1 Make generate_ts prettier output warn-only (#6342)
Before, every file would be outputted with the time prettier spent
formatting it. This made downstream scripts way too noisy.
2025-11-06 17:45:51 -05:00
pakrym-oai
b5349202e9 Freeform unified exec output formatting (#6233) 2025-11-06 22:14:27 +00:00
Gabriel Peal
1b8cc8b625 [App Server] Add more session metadata to listConversations (#6337)
This unlocks a few new product experience for app server consumers
2025-11-06 17:13:24 -05:00
Jeremy Rose
8501b0b768 core: widen sandbox to allow certificate ops when network is enabled (#5980)
This allows `gh api` to work in the workspace-write sandbox w/ network
enabled. Without this we see e.g.

```
$ codex debug seatbelt --full-auto gh api repos/openai/codex/pulls --paginate -X GET -F state=all
Get "https://api.github.com/repos/openai/codex/pulls?per_page=100&state=all": tls: failed to verify certificate: x509: OSStatus -26276
```
2025-11-06 12:47:20 -08:00
Eric Traut
fe7eb18104 Updated contributing guidelines and PR template to request link to bug report in PR notes (#6332)
Some PRs are being submitted without reference to existing bug reports
or feature requests. This updates the PR template and contributing
guidelines to request that all PRs from the community contain such a
link. This provides additional context and helps prioritize, track, and
assess PRs.
2025-11-06 12:02:39 -08:00
Thibault Sottiaux
8c75ed39d5 feat: clarify that gpt-5-codex should not amend commits unless requested (#6333) 2025-11-06 11:42:47 -08:00
Owen Lin
fdb9fa301e chore: move relevant tests to app-server/tests/suite/v2 (#6289)
These are technically app-server v2 APIs, so move them to the same
directory as the others.
2025-11-06 10:53:17 -08:00
iceweasel-oai
871d442b8e Windows Sandbox: Show Everyone-writable directory warning (#6283)
Show a warning when Auto Sandbox mode becomes enabled, if we detect
Everyone-writable directories, since they cannot be protected by the
current implementation of the Sandbox.

This PR also includes changes to how we detect Everyone-writable to be
*much* faster
2025-11-06 10:44:42 -08:00
Eric Traut
8258ad88a0 Merge branch 'main' into patch-1 2025-11-01 15:49:11 -05:00
Abkari Mohammed Sayeem
3de1e54474 Merge branch 'main' into patch-1 2025-10-31 17:59:42 +05:30
Abkari Mohammed Sayeem
2d3387169c Fix documentation errors for Custom Prompts named arguments and add canonical examples
The Custom Prompts documentation (docs/prompts.md) was incomplete for named arguments:

1. Documentation for custom prompts was incomplete - named argument usage was mentioned briefly but lacked comprehensive canonical examples showing proper syntax and behavior.

2. Fixed by adding canonical, tested syntax and examples:
   - Example 1: Basic named arguments with TICKET_ID and TICKET_TITLE
   - Example 2: Mixed positional and named arguments with FILE and FOCUS
   - Example 3: Using positional arguments
   - Example 4: Updated draftpr example to use proper $FEATURE_NAME syntax
   - Added clear usage examples showing KEY=value syntax
   - Added expanded prompt examples showing the result
   - Documented error handling and validation requirements

3. Added Implementation Reference section that references the relevant feature implementation from the codebase (PRs #4470 and #4474 for initial implementation, #5332 and #5403 for clarifications).

This addresses issue #5039 by providing complete, accurate documentation for named argument usage in custom prompts.
2025-10-29 17:53:23 +05:30
149 changed files with 6248 additions and 1923 deletions

View File

@@ -4,3 +4,5 @@ Before opening this Pull Request, please read the dedicated "Contributing" markd
https://github.com/openai/codex/blob/main/docs/contributing.md
If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes.
Include a link to a bug report or enhancement request.

View File

@@ -16,10 +16,27 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: contributor-assistant/github-action@v2.6.1
# Run on close only if the PR was merged. This will lock the PR to preserve
# the CLA agreement. We don't want to lock PRs that have been closed without
# merging because the contributor may want to respond with additional comments.
# This action has a "lock-pullrequest-aftermerge" option that can be set to false,
# but that would unconditionally skip locking even in cases where the PR was merged.
if: |
github.event_name == 'pull_request_target' ||
github.event.comment.body == 'recheck' ||
github.event.comment.body == 'I have read the CLA Document and I hereby sign the CLA'
(
github.event_name == 'pull_request_target' &&
(
github.event.action == 'opened' ||
github.event.action == 'synchronize' ||
(github.event.action == 'closed' && github.event.pull_request.merged == true)
)
) ||
(
github.event_name == 'issue_comment' &&
(
github.event.comment.body == 'recheck' ||
github.event.comment.body == 'I have read the CLA Document and I hereby sign the CLA'
)
)
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:

View File

@@ -22,6 +22,6 @@ jobs:
- name: Annotate locations with typos
uses: codespell-project/codespell-problem-matcher@b80729f885d32f78a716c2f107b4db1025001c42 # v1
- name: Codespell
uses: codespell-project/actions-codespell@406322ec52dd7b488e48c1c4b82e2a8b3a1bf630 # v2.1
uses: codespell-project/actions-codespell@8f01853be192eb0f849a5c7d721450e7a467c579 # v2.2
with:
ignore_words_file: .codespellignore

View File

@@ -26,21 +26,36 @@ jobs:
prompt: |
You are an assistant that reviews GitHub issues for the repository.
Your job is to choose the most appropriate existing labels for the issue described later in this prompt.
Your job is to choose the most appropriate labels for the issue described later in this prompt.
Follow these rules:
- Only pick labels out of the list below.
- Prefer a small set of precise labels over many broad ones.
Labels to apply:
- Add one (and only one) of the following three labels to distinguish the type of issue. Default to "bug" if unsure.
1. bug — Reproducible defects in Codex products (CLI, VS Code extension, web, auth).
2. enhancement — Feature requests or usability improvements that ask for new capabilities, better ergonomics, or quality-of-life tweaks.
3. extension — VS Code (or other IDE) extension-specific issues.
4. windows-os — Bugs or friction specific to Windows environments (always when PowerShell is mentioned, path handling, copy/paste, OS-specific auth or tooling failures).
5. mcp — Topics involving Model Context Protocol servers/clients.
6. codex-web — Issues targeting the Codex web UI/Cloud experience.
8. azure — Problems or requests tied to Azure OpenAI deployments.
9. documentation — Updates or corrections needed in docs/README/config references (broken links, missing examples, outdated keys, clarification requests).
10. model-behaviorUndesirable LLM behavior: forgetting goals, refusing work, hallucinating environment details, quota misreports, or other reasoning/performance anomalies.
3. documentation — Updates or corrections needed in docs/README/config references (broken links, missing examples, outdated keys, clarification requests).
- If applicable, add one of the following labels to specify which sub-product or product surface the issue relates to.
1. CLI — the Codex command line interface.
2. extension — VS Code (or other IDE) extension-specific issues.
3. codex-web — Issues targeting the Codex web UI/Cloud experience.
4. github-actionIssues with the Codex GitHub action.
5. iOS — Issues with the Codex iOS app.
- Additionally add zero or more of the following labels that are relevant to the issue content. Prefer a small set of precise labels over many broad ones.
1. windows-os — Bugs or friction specific to Windows environments (always when PowerShell is mentioned, path handling, copy/paste, OS-specific auth or tooling failures).
2. mcp — Topics involving Model Context Protocol servers/clients.
3. mcp-server — Problems related to the codex mcp-server command, where codex runs as an MCP server.
4. azure — Problems or requests tied to Azure OpenAI deployments.
5. model-behavior — Undesirable LLM behavior: forgetting goals, refusing work, hallucinating environment details, quota misreports, or other reasoning/performance anomalies.
6. code-review — Issues related to the code review feature or functionality.
7. auth - Problems related to authentication, login, or access tokens.
8. codex-exec - Problems related to the "codex exec" command or functionality.
9. context-management - Problems related to compaction, context windows, or available context reporting.
10. custom-model - Problems that involve using custom model providers, local models, or OSS models.
11. rate-limits - Problems related to token limits, rate limits, or token usage reporting.
12. sandbox - Issues related to local sandbox environments or tool call approvals to override sandbox restrictions.
13. tool-calls - Problems related to specific tool call invocations including unexpected errors, failures, or hangs.
14. TUI - Problems with the terminal user interface (TUI) including keyboard shortcuts, copy & pasting, menus, or screen update issues.
Issue number: ${{ github.event.issue.number }}

View File

@@ -76,7 +76,7 @@ jobs:
steps:
- uses: actions/checkout@v5
- uses: dtolnay/rust-toolchain@1.90
- uses: taiki-e/install-action@0c5db7f7f897c03b771660e91d065338615679f4 # v2
- uses: taiki-e/install-action@44c6d64aa62cd779e873306675c7a58e86d6d532 # v2
with:
tool: cargo-shear
version: 1.5.1
@@ -170,7 +170,7 @@ jobs:
# Install and restore sccache cache
- name: Install sccache
uses: taiki-e/install-action@0c5db7f7f897c03b771660e91d065338615679f4 # v2
uses: taiki-e/install-action@44c6d64aa62cd779e873306675c7a58e86d6d532 # v2
with:
tool: sccache
version: 0.7.5
@@ -228,7 +228,7 @@ jobs:
- name: Install cargo-chef
if: ${{ matrix.profile == 'release' }}
uses: taiki-e/install-action@0c5db7f7f897c03b771660e91d065338615679f4 # v2
uses: taiki-e/install-action@44c6d64aa62cd779e873306675c7a58e86d6d532 # v2
with:
tool: cargo-chef
version: 0.1.71
@@ -370,7 +370,7 @@ jobs:
cargo-home-${{ matrix.runner }}-${{ matrix.target }}-${{ matrix.profile }}-
- name: Install sccache
uses: taiki-e/install-action@0c5db7f7f897c03b771660e91d065338615679f4 # v2
uses: taiki-e/install-action@44c6d64aa62cd779e873306675c7a58e86d6d532 # v2
with:
tool: sccache
version: 0.7.5
@@ -399,7 +399,7 @@ jobs:
sccache-${{ matrix.runner }}-${{ matrix.target }}-${{ matrix.profile }}-${{ hashFiles('**/Cargo.lock') }}-
sccache-${{ matrix.runner }}-${{ matrix.target }}-${{ matrix.profile }}-
- uses: taiki-e/install-action@0c5db7f7f897c03b771660e91d065338615679f4 # v2
- uses: taiki-e/install-action@44c6d64aa62cd779e873306675c7a58e86d6d532 # v2
with:
tool: nextest
version: 0.9.103

View File

@@ -295,6 +295,15 @@ jobs:
# ${{ matrix.target }}
dest="dist/${{ matrix.target }}"
# We want to ship the raw Windows executables in the GitHub Release
# in addition to the compressed archives. Keep the originals for
# Windows targets; remove them elsewhere to limit the number of
# artifacts that end up in the GitHub Release.
keep_originals=false
if [[ "${{ matrix.runner }}" == windows* ]]; then
keep_originals=true
fi
# For compatibility with environments that lack the `zstd` tool we
# additionally create a `.tar.gz` for all platforms and `.zip` for
# Windows alongside every single binary that we publish. The end result is:
@@ -324,7 +333,11 @@ jobs:
# Also create .zst (existing behaviour) *and* remove the original
# uncompressed binary to keep the directory small.
zstd -T0 -19 --rm "$dest/$base"
zstd_args=(-T0 -19)
if [[ "${keep_originals}" == false ]]; then
zstd_args+=(--rm)
fi
zstd "${zstd_args[@]}" "$dest/$base"
done
- name: Remove signing keychain

View File

@@ -84,6 +84,7 @@ If you dont have the tool:
- Use `ResponseMock::single_request()` when a test should only issue one POST, or `ResponseMock::requests()` to inspect every captured `ResponsesRequest`.
- `ResponsesRequest` exposes helpers (`body_json`, `input`, `function_call_output`, `custom_tool_call_output`, `call_output`, `header`, `path`, `query_param`) so assertions can target structured payloads instead of manual JSON digging.
- Build SSE payloads with the provided `ev_*` constructors and the `sse(...)`.
- Prefer `wait_for_event` over `wait_for_event_with_timeout`.
- Typical pattern:

207
codex-rs/Cargo.lock generated
View File

@@ -211,6 +211,7 @@ dependencies = [
"parking_lot",
"percent-encoding",
"windows-sys 0.59.0",
"wl-clipboard-rs",
"x11rb",
]
@@ -237,46 +238,44 @@ dependencies = [
[[package]]
name = "askama"
version = "0.12.1"
version = "0.14.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b79091df18a97caea757e28cd2d5fda49c6cd4bd01ddffd7ff01ace0c0ad2c28"
checksum = "f75363874b771be265f4ffe307ca705ef6f3baa19011c149da8674a87f1b75c4"
dependencies = [
"askama_derive",
"askama_escape",
"humansize",
"num-traits",
"itoa",
"percent-encoding",
"serde",
"serde_json",
]
[[package]]
name = "askama_derive"
version = "0.12.5"
version = "0.14.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "19fe8d6cb13c4714962c072ea496f3392015f0989b1a2847bb4b2d9effd71d83"
checksum = "129397200fe83088e8a68407a8e2b1f826cf0086b21ccdb866a722c8bcd3a94f"
dependencies = [
"askama_parser",
"basic-toml",
"mime",
"mime_guess",
"memchr",
"proc-macro2",
"quote",
"rustc-hash 2.1.1",
"serde",
"serde_derive",
"syn 2.0.104",
]
[[package]]
name = "askama_escape"
version = "0.10.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "619743e34b5ba4e9703bba34deac3427c72507c7159f5fd030aea8cac0cfe341"
[[package]]
name = "askama_parser"
version = "0.2.1"
version = "0.14.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "acb1161c6b64d1c3d83108213c2a2533a342ac225aabd0bda218278c2ddb00c0"
checksum = "d6ab5630b3d5eaf232620167977f95eb51f3432fc76852328774afbd242d4358"
dependencies = [
"nom",
"memchr",
"serde",
"serde_derive",
"winnow",
]
[[package]]
@@ -981,21 +980,23 @@ dependencies = [
"codex-mcp-server",
"codex-process-hardening",
"codex-protocol",
"codex-protocol-ts",
"codex-responses-api-proxy",
"codex-rmcp-client",
"codex-stdio-to-uds",
"codex-tui",
"codex-windows-sandbox",
"ctor 0.5.0",
"libc",
"owo-colors",
"predicates",
"pretty_assertions",
"regex-lite",
"serde_json",
"supports-color",
"tempfile",
"tokio",
"toml",
"tracing",
]
[[package]]
@@ -1066,6 +1067,7 @@ dependencies = [
"chrono",
"codex-app-server-protocol",
"codex-apply-patch",
"codex-arg0",
"codex-async-utils",
"codex-file-search",
"codex-git",
@@ -1080,6 +1082,7 @@ dependencies = [
"codex-windows-sandbox",
"core-foundation 0.9.4",
"core_test_support",
"ctor 0.5.0",
"dirs",
"dunce",
"env-flags",
@@ -1365,16 +1368,6 @@ dependencies = [
"uuid",
]
[[package]]
name = "codex-protocol-ts"
version = "0.0.0"
dependencies = [
"anyhow",
"clap",
"codex-app-server-protocol",
"ts-rs",
]
[[package]]
name = "codex-responses-api-proxy"
version = "0.0.0"
@@ -1452,8 +1445,10 @@ dependencies = [
"codex-login",
"codex-ollama",
"codex-protocol",
"codex-windows-sandbox",
"color-eyre",
"crossterm",
"derive_more 2.0.1",
"diffy",
"dirs",
"dunce",
@@ -1564,6 +1559,7 @@ version = "0.1.0"
dependencies = [
"anyhow",
"dirs-next",
"dunce",
"rand 0.8.5",
"serde",
"serde_json",
@@ -1657,6 +1653,15 @@ dependencies = [
"unicode-segmentation",
]
[[package]]
name = "convert_case"
version = "0.7.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bb402b8d4c85569410425650ce3eddc7d698ed96d39a73f941b08fb63082f1e7"
dependencies = [
"unicode-segmentation",
]
[[package]]
name = "core-foundation"
version = "0.9.4"
@@ -2002,7 +2007,7 @@ version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "cb7330aeadfbe296029522e6c40f315320aba36fc43a5b3632f3795348f3bd22"
dependencies = [
"convert_case",
"convert_case 0.6.0",
"proc-macro2",
"quote",
"syn 2.0.104",
@@ -2015,6 +2020,7 @@ version = "2.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bda628edc44c4bb645fbe0f758797143e4e07926f7ebf4e9bdfbd3d2ce621df3"
dependencies = [
"convert_case 0.7.1",
"proc-macro2",
"quote",
"syn 2.0.104",
@@ -2878,15 +2884,6 @@ version = "1.0.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "df3b46402a9d5adb4c86a0cf463f42e19994e3ee891101b1841f30a545cb49a9"
[[package]]
name = "humansize"
version = "2.1.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6cb51c9a029ddc91b07a787f1d86b53ccfa49b0e86688c946ebe8d3555685dd7"
dependencies = [
"libm",
]
[[package]]
name = "hyper"
version = "1.7.0"
@@ -3530,12 +3527,6 @@ dependencies = [
"pkg-config",
]
[[package]]
name = "libm"
version = "0.2.15"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f9fbbcab51052fe104eb5e5d351cf728d30a5be1fe14d9be8a3b097481fb97de"
[[package]]
name = "libredox"
version = "0.1.6"
@@ -4302,6 +4293,16 @@ dependencies = [
"windows-sys 0.52.0",
]
[[package]]
name = "os_pipe"
version = "1.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "db335f4760b14ead6290116f2427bf33a14d4f0617d49f78a246de10c1831224"
dependencies = [
"libc",
"windows-sys 0.59.0",
]
[[package]]
name = "owo-colors"
version = "4.2.2"
@@ -4449,7 +4450,7 @@ checksum = "3af6b589e163c5a788fab00ce0c0366f6efbb9959c2f9874b224936af7fce7e1"
dependencies = [
"base64",
"indexmap 2.12.0",
"quick-xml",
"quick-xml 0.38.0",
"serde",
"time",
]
@@ -4678,6 +4679,15 @@ version = "2.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a993555f31e5a609f617c12db6250dedcac1b0a85076912c436e6fc9b2c8e6a3"
[[package]]
name = "quick-xml"
version = "0.37.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "331e97a1af0bf59823e6eadffe373d7b27f485be8748f71471c662c1f269b7fb"
dependencies = [
"memchr",
]
[[package]]
name = "quick-xml"
version = "0.38.0"
@@ -6715,6 +6725,18 @@ version = "0.1.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c4013970217383f67b18aef68f6fb2e8d409bc5755227092d32efb0422ba24b8"
[[package]]
name = "tree_magic_mini"
version = "3.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f943391d896cdfe8eec03a04d7110332d445be7df856db382dd96a730667562c"
dependencies = [
"memchr",
"nom",
"once_cell",
"petgraph",
]
[[package]]
name = "try-lock"
version = "0.2.5"
@@ -7052,6 +7074,76 @@ dependencies = [
"web-sys",
]
[[package]]
name = "wayland-backend"
version = "0.3.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "673a33c33048a5ade91a6b139580fa174e19fb0d23f396dca9fa15f2e1e49b35"
dependencies = [
"cc",
"downcast-rs",
"rustix 1.0.8",
"smallvec",
"wayland-sys",
]
[[package]]
name = "wayland-client"
version = "0.31.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c66a47e840dc20793f2264eb4b3e4ecb4b75d91c0dd4af04b456128e0bdd449d"
dependencies = [
"bitflags 2.10.0",
"rustix 1.0.8",
"wayland-backend",
"wayland-scanner",
]
[[package]]
name = "wayland-protocols"
version = "0.32.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "efa790ed75fbfd71283bd2521a1cfdc022aabcc28bdcff00851f9e4ae88d9901"
dependencies = [
"bitflags 2.10.0",
"wayland-backend",
"wayland-client",
"wayland-scanner",
]
[[package]]
name = "wayland-protocols-wlr"
version = "0.3.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "efd94963ed43cf9938a090ca4f7da58eb55325ec8200c3848963e98dc25b78ec"
dependencies = [
"bitflags 2.10.0",
"wayland-backend",
"wayland-client",
"wayland-protocols",
"wayland-scanner",
]
[[package]]
name = "wayland-scanner"
version = "0.31.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "54cb1e9dc49da91950bdfd8b848c49330536d9d1fb03d4bfec8cae50caa50ae3"
dependencies = [
"proc-macro2",
"quick-xml 0.37.5",
"quote",
]
[[package]]
name = "wayland-sys"
version = "0.31.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "34949b42822155826b41db8e5d0c1be3a2bd296c747577a43a3e6daefc296142"
dependencies = [
"pkg-config",
]
[[package]]
name = "web-sys"
version = "0.3.77"
@@ -7623,6 +7715,25 @@ dependencies = [
"bitflags 2.10.0",
]
[[package]]
name = "wl-clipboard-rs"
version = "0.9.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8e5ff8d0e60065f549fafd9d6cb626203ea64a798186c80d8e7df4f8af56baeb"
dependencies = [
"libc",
"log",
"os_pipe",
"rustix 0.38.44",
"tempfile",
"thiserror 2.0.17",
"tree_magic_mini",
"wayland-backend",
"wayland-client",
"wayland-protocols",
"wayland-protocols-wlr",
]
[[package]]
name = "writeable"
version = "0.6.2"
@@ -7791,9 +7902,9 @@ dependencies = [
[[package]]
name = "zeroize"
version = "1.8.1"
version = "1.8.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ced3678a2879b30306d323f4542626697a464a97c0a07c9aebf7ebca65cd4dde"
checksum = "b97154e67e32c85465826e8bcc1c59429aaaf107c1e4a9e53c8d8ccd5eff88d0"
dependencies = [
"zeroize_derive",
]

View File

@@ -25,7 +25,6 @@ members = [
"ollama",
"process-hardening",
"protocol",
"protocol-ts",
"rmcp-client",
"responses-api-proxy",
"stdio-to-uds",
@@ -75,7 +74,6 @@ codex-ollama = { path = "ollama" }
codex-otel = { path = "otel" }
codex-process-hardening = { path = "process-hardening" }
codex-protocol = { path = "protocol" }
codex-protocol-ts = { path = "protocol-ts" }
codex-responses-api-proxy = { path = "responses-api-proxy" }
codex-rmcp-client = { path = "rmcp-client" }
codex-stdio-to-uds = { path = "stdio-to-uds" }
@@ -87,7 +85,7 @@ codex-utils-pty = { path = "utils/pty" }
codex-utils-readiness = { path = "utils/readiness" }
codex-utils-string = { path = "utils/string" }
codex-utils-tokenizer = { path = "utils/tokenizer" }
codex-windows-sandbox = { path = "windows-sandbox" }
codex-windows-sandbox = { path = "windows-sandbox-rs" }
core_test_support = { path = "core/tests/common" }
mcp-types = { path = "mcp-types" }
mcp_test_support = { path = "mcp-server/tests/common" }
@@ -96,8 +94,8 @@ mcp_test_support = { path = "mcp-server/tests/common" }
allocative = "0.3.3"
ansi-to-tui = "7.0.0"
anyhow = "1"
arboard = "3"
askama = "0.12"
arboard = { version = "3", features = ["wayland-data-control"] }
askama = "0.14"
assert_cmd = "2"
assert_matches = "1.5.0"
async-channel = "2.3.1"
@@ -213,7 +211,7 @@ which = "6"
wildmatch = "2.5.0"
wiremock = "0.6"
zeroize = "1.8.1"
zeroize = "1.8.2"
[workspace.lints]
rust = {}

View File

@@ -58,7 +58,7 @@ To test to see what happens when a command is run under the sandbox provided by
```
# macOS
codex sandbox macos [--full-auto] [COMMAND]...
codex sandbox macos [--full-auto] [--log-denials] [COMMAND]...
# Linux
codex sandbox linux [--full-auto] [COMMAND]...
@@ -67,7 +67,7 @@ codex sandbox linux [--full-auto] [COMMAND]...
codex sandbox windows [--full-auto] [COMMAND]...
# Legacy aliases
codex debug seatbelt [--full-auto] [COMMAND]...
codex debug seatbelt [--full-auto] [--log-denials] [COMMAND]...
codex debug landlock [--full-auto] [COMMAND]...
```

View File

@@ -92,6 +92,8 @@ pub fn generate_ts(out_dir: &Path, prettier: Option<&Path>) -> Result<()> {
{
let status = Command::new(prettier_bin)
.arg("--write")
.arg("--log-level")
.arg("warn")
.args(ts_files.iter().map(|p| p.as_os_str()))
.status()
.with_context(|| format!("Failed to invoke Prettier at {}", prettier_bin.display()))?;
@@ -666,6 +668,8 @@ fn ts_files_in_recursive(dir: &Path) -> Result<Vec<PathBuf>> {
Ok(files)
}
/// Generate an index.ts file that re-exports all generated types.
/// This allows consumers to import all types from a single file.
fn generate_index_ts(out_dir: &Path) -> Result<PathBuf> {
let mut entries: Vec<String> = Vec::new();
let mut stems: Vec<String> = ts_files_in(out_dir)?

View File

@@ -46,7 +46,7 @@ macro_rules! client_request_definitions {
(
$(
$(#[$variant_meta:meta])*
$variant:ident {
$variant:ident $(=> $wire:literal)? {
params: $(#[$params_meta:meta])* $params:ty,
response: $response:ty,
}
@@ -58,6 +58,7 @@ macro_rules! client_request_definitions {
pub enum ClientRequest {
$(
$(#[$variant_meta])*
$(#[serde(rename = $wire)] #[ts(rename = $wire)])?
$variant {
#[serde(rename = "id")]
request_id: RequestId,
@@ -101,105 +102,78 @@ macro_rules! client_request_definitions {
}
client_request_definitions! {
/// NEW APIs
// Thread lifecycle
#[serde(rename = "thread/start")]
#[ts(rename = "thread/start")]
ThreadStart {
params: v2::ThreadStartParams,
response: v2::ThreadStartResponse,
},
#[serde(rename = "thread/resume")]
#[ts(rename = "thread/resume")]
ThreadResume {
params: v2::ThreadResumeParams,
response: v2::ThreadResumeResponse,
},
#[serde(rename = "thread/archive")]
#[ts(rename = "thread/archive")]
ThreadArchive {
params: v2::ThreadArchiveParams,
response: v2::ThreadArchiveResponse,
},
#[serde(rename = "thread/list")]
#[ts(rename = "thread/list")]
ThreadList {
params: v2::ThreadListParams,
response: v2::ThreadListResponse,
},
#[serde(rename = "thread/compact")]
#[ts(rename = "thread/compact")]
ThreadCompact {
params: v2::ThreadCompactParams,
response: v2::ThreadCompactResponse,
},
#[serde(rename = "turn/start")]
#[ts(rename = "turn/start")]
TurnStart {
params: v2::TurnStartParams,
response: v2::TurnStartResponse,
},
#[serde(rename = "turn/interrupt")]
#[ts(rename = "turn/interrupt")]
TurnInterrupt {
params: v2::TurnInterruptParams,
response: v2::TurnInterruptResponse,
},
#[serde(rename = "model/list")]
#[ts(rename = "model/list")]
ModelList {
params: v2::ModelListParams,
response: v2::ModelListResponse,
},
#[serde(rename = "account/login/start")]
#[ts(rename = "account/login/start")]
LoginAccount {
params: v2::LoginAccountParams,
response: v2::LoginAccountResponse,
},
#[serde(rename = "account/login/cancel")]
#[ts(rename = "account/login/cancel")]
CancelLoginAccount {
params: v2::CancelLoginAccountParams,
response: v2::CancelLoginAccountResponse,
},
#[serde(rename = "account/logout")]
#[ts(rename = "account/logout")]
LogoutAccount {
params: #[ts(type = "undefined")] #[serde(skip_serializing_if = "Option::is_none")] Option<()>,
response: v2::LogoutAccountResponse,
},
#[serde(rename = "account/rateLimits/read")]
#[ts(rename = "account/rateLimits/read")]
GetAccountRateLimits {
params: #[ts(type = "undefined")] #[serde(skip_serializing_if = "Option::is_none")] Option<()>,
response: v2::GetAccountRateLimitsResponse,
},
#[serde(rename = "feedback/upload")]
#[ts(rename = "feedback/upload")]
FeedbackUpload {
params: v2::FeedbackUploadParams,
response: v2::FeedbackUploadResponse,
},
#[serde(rename = "account/read")]
#[ts(rename = "account/read")]
GetAccount {
params: #[ts(type = "undefined")] #[serde(skip_serializing_if = "Option::is_none")] Option<()>,
response: v2::GetAccountResponse,
},
/// DEPRECATED APIs below
Initialize {
params: v1::InitializeParams,
response: v1::InitializeResponse,
},
/// NEW APIs
// Thread lifecycle
ThreadStart => "thread/start" {
params: v2::ThreadStartParams,
response: v2::ThreadStartResponse,
},
ThreadResume => "thread/resume" {
params: v2::ThreadResumeParams,
response: v2::ThreadResumeResponse,
},
ThreadArchive => "thread/archive" {
params: v2::ThreadArchiveParams,
response: v2::ThreadArchiveResponse,
},
ThreadList => "thread/list" {
params: v2::ThreadListParams,
response: v2::ThreadListResponse,
},
ThreadCompact => "thread/compact" {
params: v2::ThreadCompactParams,
response: v2::ThreadCompactResponse,
},
TurnStart => "turn/start" {
params: v2::TurnStartParams,
response: v2::TurnStartResponse,
},
TurnInterrupt => "turn/interrupt" {
params: v2::TurnInterruptParams,
response: v2::TurnInterruptResponse,
},
ModelList => "model/list" {
params: v2::ModelListParams,
response: v2::ModelListResponse,
},
LoginAccount => "account/login/start" {
params: v2::LoginAccountParams,
response: v2::LoginAccountResponse,
},
CancelLoginAccount => "account/login/cancel" {
params: v2::CancelLoginAccountParams,
response: v2::CancelLoginAccountResponse,
},
LogoutAccount => "account/logout" {
params: #[ts(type = "undefined")] #[serde(skip_serializing_if = "Option::is_none")] Option<()>,
response: v2::LogoutAccountResponse,
},
GetAccountRateLimits => "account/rateLimits/read" {
params: #[ts(type = "undefined")] #[serde(skip_serializing_if = "Option::is_none")] Option<()>,
response: v2::GetAccountRateLimitsResponse,
},
FeedbackUpload => "feedback/upload" {
params: v2::FeedbackUploadParams,
response: v2::FeedbackUploadResponse,
},
GetAccount => "account/read" {
params: v2::GetAccountParams,
response: v2::GetAccountResponse,
},
/// DEPRECATED APIs below
NewConversation {
params: v1::NewConversationParams,
response: v1::NewConversationResponse,
@@ -263,6 +237,7 @@ client_request_definitions! {
params: #[ts(type = "undefined")] #[serde(skip_serializing_if = "Option::is_none")] Option<()>,
response: v1::LogoutChatGptResponse,
},
/// DEPRECATED in favor of GetAccount
GetAuthStatus {
params: v1::GetAuthStatusParams,
response: v1::GetAuthStatusResponse,
@@ -758,12 +733,17 @@ mod tests {
fn serialize_get_account() -> Result<()> {
let request = ClientRequest::GetAccount {
request_id: RequestId::Integer(5),
params: None,
params: v2::GetAccountParams {
refresh_token: false,
},
};
assert_eq!(
json!({
"method": "account/read",
"id": 5,
"params": {
"refreshToken": false
}
}),
serde_json::to_value(&request)?,
);
@@ -772,19 +752,16 @@ mod tests {
#[test]
fn account_serializes_fields_in_camel_case() -> Result<()> {
let api_key = v2::Account::ApiKey {
api_key: "secret".to_string(),
};
let api_key = v2::Account::ApiKey {};
assert_eq!(
json!({
"type": "apiKey",
"apiKey": "secret",
}),
serde_json::to_value(&api_key)?,
);
let chatgpt = v2::Account::Chatgpt {
email: Some("user@example.com".to_string()),
email: "user@example.com".to_string(),
plan_type: PlanType::Plus,
};
assert_eq!(

View File

@@ -11,6 +11,7 @@ use codex_protocol::models::ResponseItem;
use codex_protocol::protocol::AskForApproval;
use codex_protocol::protocol::EventMsg;
use codex_protocol::protocol::SandboxPolicy;
use codex_protocol::protocol::SessionSource;
use codex_protocol::protocol::TurnAbortReason;
use schemars::JsonSchema;
use serde::Deserialize;
@@ -113,6 +114,18 @@ pub struct ConversationSummary {
pub preview: String,
pub timestamp: Option<String>,
pub model_provider: String,
pub cwd: PathBuf,
pub cli_version: String,
pub source: SessionSource,
pub git_info: Option<ConversationGitInfo>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "snake_case")]
pub struct ConversationGitInfo {
pub sha: Option<String>,
pub branch: Option<String>,
pub origin_url: Option<String>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]

View File

@@ -6,6 +6,8 @@ use codex_protocol::ConversationId;
use codex_protocol::account::PlanType;
use codex_protocol::config_types::ReasoningEffort;
use codex_protocol::config_types::ReasoningSummary;
use codex_protocol::items::AgentMessageContent as CoreAgentMessageContent;
use codex_protocol::items::TurnItem as CoreTurnItem;
use codex_protocol::protocol::RateLimitSnapshot as CoreRateLimitSnapshot;
use codex_protocol::protocol::RateLimitWindow as CoreRateLimitWindow;
use codex_protocol::user_input::UserInput as CoreUserInput;
@@ -123,14 +125,11 @@ impl From<codex_protocol::protocol::SandboxPolicy> for SandboxPolicy {
pub enum Account {
#[serde(rename = "apiKey", rename_all = "camelCase")]
#[ts(rename = "apiKey", rename_all = "camelCase")]
ApiKey { api_key: String },
ApiKey {},
#[serde(rename = "chatgpt", rename_all = "camelCase")]
#[ts(rename = "chatgpt", rename_all = "camelCase")]
Chatgpt {
email: Option<String>,
plan_type: PlanType,
},
Chatgpt { email: String, plan_type: PlanType },
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
@@ -193,11 +192,20 @@ pub struct GetAccountRateLimitsResponse {
pub rate_limits: RateLimitSnapshot,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct GetAccountParams {
#[serde(default)]
pub refresh_token: bool,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct GetAccountResponse {
pub account: Account,
pub account: Option<Account>,
pub requires_openai_auth: bool,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Default, JsonSchema, TS)]
@@ -348,6 +356,11 @@ pub struct ThreadCompactResponse {}
#[ts(export_to = "v2/")]
pub struct Thread {
pub id: String,
/// Usually the first user message in the thread, if available.
pub preview: String,
pub model_provider: String,
/// Unix timestamp (in seconds) when the thread was created.
pub created_at: i64,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
@@ -446,6 +459,17 @@ impl UserInput {
}
}
impl From<CoreUserInput> for UserInput {
fn from(value: CoreUserInput) -> Self {
match value {
CoreUserInput::Text { text } => UserInput::Text { text },
CoreUserInput::Image { image_url } => UserInput::Image { url: image_url },
CoreUserInput::LocalImage { path } => UserInput::LocalImage { path },
_ => unreachable!("unsupported user input variant"),
}
}
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(tag = "type", rename_all = "camelCase")]
#[ts(tag = "type")]
@@ -503,6 +527,42 @@ pub enum ThreadItem {
},
}
impl From<CoreTurnItem> for ThreadItem {
fn from(value: CoreTurnItem) -> Self {
match value {
CoreTurnItem::UserMessage(user) => ThreadItem::UserMessage {
id: user.id,
content: user.content.into_iter().map(UserInput::from).collect(),
},
CoreTurnItem::AgentMessage(agent) => {
let text = agent
.content
.into_iter()
.map(|entry| match entry {
CoreAgentMessageContent::Text { text } => text,
})
.collect::<String>();
ThreadItem::AgentMessage { id: agent.id, text }
}
CoreTurnItem::Reasoning(reasoning) => {
let text = if !reasoning.summary_text.is_empty() {
reasoning.summary_text.join("\n")
} else {
reasoning.raw_content.join("\n")
};
ThreadItem::Reasoning {
id: reasoning.id,
text,
}
}
CoreTurnItem::WebSearch(search) => ThreadItem::WebSearch {
id: search.id,
query: search.query,
},
}
}
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
@@ -697,3 +757,100 @@ pub struct AccountLoginCompletedNotification {
pub success: bool,
pub error: Option<String>,
}
#[cfg(test)]
mod tests {
use super::*;
use codex_protocol::items::AgentMessageContent;
use codex_protocol::items::AgentMessageItem;
use codex_protocol::items::ReasoningItem;
use codex_protocol::items::TurnItem;
use codex_protocol::items::UserMessageItem;
use codex_protocol::items::WebSearchItem;
use codex_protocol::user_input::UserInput as CoreUserInput;
use pretty_assertions::assert_eq;
use std::path::PathBuf;
#[test]
fn core_turn_item_into_thread_item_converts_supported_variants() {
let user_item = TurnItem::UserMessage(UserMessageItem {
id: "user-1".to_string(),
content: vec![
CoreUserInput::Text {
text: "hello".to_string(),
},
CoreUserInput::Image {
image_url: "https://example.com/image.png".to_string(),
},
CoreUserInput::LocalImage {
path: PathBuf::from("local/image.png"),
},
],
});
assert_eq!(
ThreadItem::from(user_item),
ThreadItem::UserMessage {
id: "user-1".to_string(),
content: vec![
UserInput::Text {
text: "hello".to_string(),
},
UserInput::Image {
url: "https://example.com/image.png".to_string(),
},
UserInput::LocalImage {
path: PathBuf::from("local/image.png"),
},
],
}
);
let agent_item = TurnItem::AgentMessage(AgentMessageItem {
id: "agent-1".to_string(),
content: vec![
AgentMessageContent::Text {
text: "Hello ".to_string(),
},
AgentMessageContent::Text {
text: "world".to_string(),
},
],
});
assert_eq!(
ThreadItem::from(agent_item),
ThreadItem::AgentMessage {
id: "agent-1".to_string(),
text: "Hello world".to_string(),
}
);
let reasoning_item = TurnItem::Reasoning(ReasoningItem {
id: "reasoning-1".to_string(),
summary_text: vec!["line one".to_string(), "line two".to_string()],
raw_content: vec![],
});
assert_eq!(
ThreadItem::from(reasoning_item),
ThreadItem::Reasoning {
id: "reasoning-1".to_string(),
text: "line one\nline two".to_string(),
}
);
let search_item = TurnItem::WebSearch(WebSearchItem {
id: "search-1".to_string(),
query: "docs".to_string(),
});
assert_eq!(
ThreadItem::from(search_item),
ThreadItem::WebSearch {
id: "search-1".to_string(),
query: "docs".to_string(),
}
);
}
}

View File

@@ -1,6 +1,6 @@
# codex-app-server
`codex app-server` is the harness Codex uses to power rich interfaces such as the [Codex VS Code extension](https://marketplace.visualstudio.com/items?itemName=openai.chatgpt). The message schema is currently unstable, but those who wish to build experimental UIs on top of Codex may find it valuable.
`codex app-server` is the interface Codex uses to power rich interfaces such as the [Codex VS Code extension](https://marketplace.visualstudio.com/items?itemName=openai.chatgpt). The message schema is currently unstable, but those who wish to build experimental UIs on top of Codex may find it valuable.
## Protocol
@@ -8,8 +8,253 @@ Similar to [MCP](https://modelcontextprotocol.io/), `codex app-server` supports
## Message Schema
Currently, you can dump a TypeScript version of the schema using `codex generate-ts`. It is specific to the version of Codex you used to run `generate-ts`, so the two are guaranteed to be compatible.
Currently, you can dump a TypeScript version of the schema using `codex app-server generate-ts`, or a JSON Schema bundle via `codex app-server generate-json-schema`. Each output is specific to the version of Codex you used to run the command, so the generated artifacts are guaranteed to match that version.
```
codex generate-ts --out DIR
codex app-server generate-ts --out DIR
codex app-server generate-json-schema --out DIR
```
## Initialization
Clients must send a single `initialize` request before invoking any other method, then acknowledge with an `initialized` notification. The server returns the user agent string it will present to upstream services; subsequent requests issued before initialization receive a `"Not initialized"` error, and repeated `initialize` calls receive an `"Already initialized"` error.
Example:
```json
{ "method": "initialize", "id": 0, "params": {
"clientInfo": { "name": "codex-vscode", "title": "Codex VS Code Extension", "version": "0.1.0" }
} }
{ "id": 0, "result": { "userAgent": "codex-app-server/0.1.0 codex-vscode/0.1.0" } }
{ "method": "initialized" }
```
## Core primitives
We have 3 top level primitives:
- Thread - a conversation between the Codex agent and a user. Each thread contains multiple turns.
- Turn - one turn of the conversation, typically starting with a user message and finishing with an agent message. Each turn contains multiple items.
- Item - represents user inputs and agent outputs as part of the turn, persisted and used as the context for future conversations.
## Thread & turn endpoints
The JSON-RPC API exposes dedicated methods for managing Codex conversations. Threads store long-lived conversation metadata, and turns store the per-message exchange (input → Codex output, including streamed items). Use the thread APIs to create, list, or archive sessions, then drive the conversation with turn APIs and notifications.
### Quick reference
- `thread/start` — create a new thread; emits `thread/started` and auto-subscribes you to turn/item events for that thread.
- `thread/resume` — reopen an existing thread by id so subsequent `turn/start` calls append to it.
- `thread/list` — page through stored rollouts; supports cursor-based pagination and optional `modelProviders` filtering.
- `thread/archive` — move a threads rollout file into the archived directory; returns `{}` on success.
- `turn/start` — add user input to a thread and begin Codex generation; responds with the initial `turn` object and streams `turn/started`, `item/*`, and `turn/completed` notifications.
- `turn/interrupt` — request cancellation of an in-flight turn by `(thread_id, turn_id)`; success is an empty `{}` response and the turn finishes with `status: "interrupted"`.
### 1) Start or resume a thread
Start a fresh thread when you need a new Codex conversation.
```json
{ "method": "thread/start", "id": 10, "params": {
// Optionally set config settings. If not specified, will use the user's
// current config settings.
"model": "gpt-5-codex",
"cwd": "/Users/me/project",
"approvalPolicy": "never",
"sandbox": "workspaceWrite",
} }
{ "id": 10, "result": {
"thread": {
"id": "thr_123",
"preview": "",
"modelProvider": "openai",
"createdAt": 1730910000
}
} }
{ "method": "thread/started", "params": { "thread": { } } }
```
To continue a stored session, call `thread/resume` with the `thread.id` you previously recorded. The response shape matches `thread/start`, and no additional notifications are emitted:
```json
{ "method": "thread/resume", "id": 11, "params": { "threadId": "thr_123" } }
{ "id": 11, "result": { "thread": { "id": "thr_123", } } }
```
### 2) List threads (pagination & filters)
`thread/list` lets you render a history UI. Pass any combination of:
- `cursor` — opaque string from a prior response; omit for the first page.
- `limit` — server defaults to a reasonable page size if unset.
- `modelProviders` — restrict results to specific providers; unset, null, or an empty array will include all providers.
Example:
```json
{ "method": "thread/list", "id": 20, "params": {
"cursor": null,
"limit": 25,
} }
{ "id": 20, "result": {
"data": [
{ "id": "thr_a", "preview": "Create a TUI", "modelProvider": "openai", "createdAt": 1730831111 },
{ "id": "thr_b", "preview": "Fix tests", "modelProvider": "openai", "createdAt": 1730750000 }
],
"nextCursor": "opaque-token-or-null"
} }
```
When `nextCursor` is `null`, youve reached the final page.
### 3) Archive a thread
Use `thread/archive` to move the persisted rollout (stored as a JSONL file on disk) into the archived sessions directory.
```json
{ "method": "thread/archive", "id": 21, "params": { "threadId": "thr_b" } }
{ "id": 21, "result": {} }
```
An archived thread will not appear in future calls to `thread/list`.
### 4) Start a turn (send user input)
Turns attach user input (text or images) to a thread and trigger Codex generation. The `input` field is a list of discriminated unions:
- `{"type":"text","text":"Explain this diff"}`
- `{"type":"image","url":"https://…png"}`
- `{"type":"localImage","path":"/tmp/screenshot.png"}`
You can optionally specify config overrides on the new turn. If specified, these settings become the default for subsequent turns on the same thread.
```json
{ "method": "turn/start", "id": 30, "params": {
"threadId": "thr_123",
"input": [ { "type": "text", "text": "Run tests" } ],
// Below are optional config overrides
"cwd": "/Users/me/project",
"approvalPolicy": "unlessTrusted",
"sandboxPolicy": {
"mode": "workspaceWrite",
"writableRoots": ["/Users/me/project"],
"networkAccess": true
},
"model": "gpt-5-codex",
"effort": "medium",
"summary": "concise"
} }
{ "id": 30, "result": { "turn": {
"id": "turn_456",
"status": "inProgress",
"items": [],
"error": null
} } }
```
### 5) Interrupt an active turn
You can cancel a running Turn with `turn/interrupt`.
```json
{ "method": "turn/interrupt", "id": 31, "params": {
"threadId": "thr_123",
"turnId": "turn_456"
} }
{ "id": 31, "result": {} }
```
The server requests cancellations for running subprocesses, then emits a `turn/completed` event with `status: "interrupted"`. Rely on the `turn/completed` to know when Codex-side cleanup is done.
## Auth endpoints
The JSON-RPC auth/account surface exposes request/response methods plus server-initiated notifications (no `id`). Use these to determine auth state, start or cancel logins, logout, and inspect ChatGPT rate limits.
### Quick reference
- `account/read` — fetch current account info; optionally refresh tokens.
- `account/login/start` — begin login (`apiKey` or `chatgpt`).
- `account/login/completed` (notify) — emitted when a login attempt finishes (success or error).
- `account/login/cancel` — cancel a pending ChatGPT login by `loginId`.
- `account/logout` — sign out; triggers `account/updated`.
- `account/updated` (notify) — emitted whenever auth mode changes (`authMode`: `apikey`, `chatgpt`, or `null`).
- `account/rateLimits/read` — fetch ChatGPT rate limits; updates arrive via `account/rateLimits/updated` (notify).
### 1) Check auth state
Request:
```json
{ "method": "account/read", "id": 1, "params": { "refreshToken": false } }
```
Response examples:
```json
{ "id": 1, "result": { "account": null, "requiresOpenaiAuth": false } } // No OpenAI auth needed (e.g., OSS/local models)
{ "id": 1, "result": { "account": null, "requiresOpenaiAuth": true } } // OpenAI auth required (typical for OpenAI-hosted models)
{ "id": 1, "result": { "account": { "type": "apiKey" }, "requiresOpenaiAuth": true } }
{ "id": 1, "result": { "account": { "type": "chatgpt", "email": "user@example.com", "planType": "pro" }, "requiresOpenaiAuth": true } }
```
Field notes:
- `refreshToken` (bool): set `true` to force a token refresh.
- `requiresOpenaiAuth` reflects the active provider; when `false`, Codex can run without OpenAI credentials.
### 2) Log in with an API key
1. Send:
```json
{ "method": "account/login/start", "id": 2, "params": { "type": "apiKey", "apiKey": "sk-…" } }
```
2. Expect:
```json
{ "id": 2, "result": { "type": "apiKey" } }
```
3. Notifications:
```json
{ "method": "account/login/completed", "params": { "loginId": null, "success": true, "error": null } }
{ "method": "account/updated", "params": { "authMode": "apikey" } }
```
### 3) Log in with ChatGPT (browser flow)
1. Start:
```json
{ "method": "account/login/start", "id": 3, "params": { "type": "chatgpt" } }
{ "id": 3, "result": { "type": "chatgpt", "loginId": "<uuid>", "authUrl": "https://chatgpt.com/…&redirect_uri=http%3A%2F%2Flocalhost%3A<port>%2Fauth%2Fcallback" } }
```
2. Open `authUrl` in a browser; the app-server hosts the local callback.
3. Wait for notifications:
```json
{ "method": "account/login/completed", "params": { "loginId": "<uuid>", "success": true, "error": null } }
{ "method": "account/updated", "params": { "authMode": "chatgpt" } }
```
### 4) Cancel a ChatGPT login
```json
{ "method": "account/login/cancel", "id": 4, "params": { "loginId": "<uuid>" } }
{ "method": "account/login/completed", "params": { "loginId": "<uuid>", "success": false, "error": "…" } }
```
### 5) Logout
```json
{ "method": "account/logout", "id": 5 }
{ "id": 5, "result": {} }
{ "method": "account/updated", "params": { "authMode": null } }
```
### 6) Rate limits (ChatGPT)
```json
{ "method": "account/rateLimits/read", "id": 6 }
{ "id": 6, "result": { "rateLimits": { "primary": { "usedPercent": 25, "windowDurationMins": 15, "resetsAt": 1730947200 }, "secondary": null } } }
{ "method": "account/rateLimits/updated", "params": { "rateLimits": { … } } }
```
Field notes:
- `usedPercent` is current usage within the OpenAI quota window.
- `windowDurationMins` is the quota window length.
- `resetsAt` is a Unix timestamp (seconds) for the next reset.
### Dev notes
- `codex app-server generate-ts --out <dir>` emits v2 types under `v2/`.
- `codex app-server generate-json-schema --out <dir>` outputs `codex_app_server_protocol.schemas.json`.
- See [“Authentication and authorization” in the config docs](../../docs/config.md#authentication-and-authorization) for configuration knobs.

View File

@@ -4,6 +4,9 @@ use crate::fuzzy_file_search::run_fuzzy_file_search;
use crate::models::supported_models;
use crate::outgoing_message::OutgoingMessageSender;
use crate::outgoing_message::OutgoingNotification;
use chrono::DateTime;
use chrono::Utc;
use codex_app_server_protocol::Account;
use codex_app_server_protocol::AccountLoginCompletedNotification;
use codex_app_server_protocol::AccountRateLimitsUpdatedNotification;
use codex_app_server_protocol::AccountUpdatedNotification;
@@ -20,6 +23,7 @@ use codex_app_server_protocol::CancelLoginAccountParams;
use codex_app_server_protocol::CancelLoginAccountResponse;
use codex_app_server_protocol::CancelLoginChatGptResponse;
use codex_app_server_protocol::ClientRequest;
use codex_app_server_protocol::ConversationGitInfo;
use codex_app_server_protocol::ConversationSummary;
use codex_app_server_protocol::ExecCommandApprovalParams;
use codex_app_server_protocol::ExecCommandApprovalResponse;
@@ -29,7 +33,9 @@ use codex_app_server_protocol::FeedbackUploadParams;
use codex_app_server_protocol::FeedbackUploadResponse;
use codex_app_server_protocol::FuzzyFileSearchParams;
use codex_app_server_protocol::FuzzyFileSearchResponse;
use codex_app_server_protocol::GetAccountParams;
use codex_app_server_protocol::GetAccountRateLimitsResponse;
use codex_app_server_protocol::GetAccountResponse;
use codex_app_server_protocol::GetAuthStatusParams;
use codex_app_server_protocol::GetAuthStatusResponse;
use codex_app_server_protocol::GetConversationSummaryParams;
@@ -40,6 +46,8 @@ use codex_app_server_protocol::GitDiffToRemoteResponse;
use codex_app_server_protocol::InputItem as WireInputItem;
use codex_app_server_protocol::InterruptConversationParams;
use codex_app_server_protocol::InterruptConversationResponse;
use codex_app_server_protocol::ItemCompletedNotification;
use codex_app_server_protocol::ItemStartedNotification;
use codex_app_server_protocol::JSONRPCErrorError;
use codex_app_server_protocol::ListConversationsParams;
use codex_app_server_protocol::ListConversationsResponse;
@@ -130,8 +138,10 @@ use codex_protocol::ConversationId;
use codex_protocol::config_types::ForcedLoginMethod;
use codex_protocol::items::TurnItem;
use codex_protocol::models::ResponseItem;
use codex_protocol::protocol::GitInfo;
use codex_protocol::protocol::RateLimitSnapshot as CoreRateLimitSnapshot;
use codex_protocol::protocol::RolloutItem;
use codex_protocol::protocol::SessionMetaLine;
use codex_protocol::protocol::USER_MESSAGE_BEGIN;
use codex_protocol::user_input::UserInput as CoreInputItem;
use codex_utils_json_to_toml::json_to_toml;
@@ -190,6 +200,30 @@ enum ApiVersion {
}
impl CodexMessageProcessor {
async fn conversation_from_thread_id(
&self,
thread_id: &str,
) -> Result<(ConversationId, Arc<CodexConversation>), JSONRPCErrorError> {
// Resolve conversation id from v2 thread id string.
let conversation_id =
ConversationId::from_string(thread_id).map_err(|err| JSONRPCErrorError {
code: INVALID_REQUEST_ERROR_CODE,
message: format!("invalid thread id: {err}"),
data: None,
})?;
let conversation = self
.conversation_manager
.get_conversation(conversation_id)
.await
.map_err(|_| JSONRPCErrorError {
code: INVALID_REQUEST_ERROR_CODE,
message: format!("conversation not found: {conversation_id}"),
data: None,
})?;
Ok((conversation_id, conversation))
}
pub fn new(
auth_manager: Arc<AuthManager>,
conversation_manager: Arc<ConversationManager>,
@@ -270,12 +304,8 @@ impl CodexMessageProcessor {
ClientRequest::CancelLoginAccount { request_id, params } => {
self.cancel_login_v2(request_id, params).await;
}
ClientRequest::GetAccount {
request_id,
params: _,
} => {
self.send_unimplemented_error(request_id, "account/read")
.await;
ClientRequest::GetAccount { request_id, params } => {
self.get_account(request_id, params).await;
}
ClientRequest::ResumeConversation { request_id, params } => {
self.handle_resume_conversation(request_id, params).await;
@@ -798,13 +828,17 @@ impl CodexMessageProcessor {
}
}
async fn refresh_token_if_requested(&self, do_refresh: bool) {
if do_refresh && let Err(err) = self.auth_manager.refresh_token().await {
tracing::warn!("failed to refresh token whilte getting account: {err}");
}
}
async fn get_auth_status(&self, request_id: RequestId, params: GetAuthStatusParams) {
let include_token = params.include_token.unwrap_or(false);
let do_refresh = params.refresh_token.unwrap_or(false);
if do_refresh && let Err(err) = self.auth_manager.refresh_token().await {
tracing::warn!("failed to refresh token while getting auth status: {err}");
}
self.refresh_token_if_requested(do_refresh).await;
// Determine whether auth is required based on the active model provider.
// If a custom provider is configured with `requires_openai_auth == false`,
@@ -849,6 +883,56 @@ impl CodexMessageProcessor {
self.outgoing.send_response(request_id, response).await;
}
async fn get_account(&self, request_id: RequestId, params: GetAccountParams) {
let do_refresh = params.refresh_token;
self.refresh_token_if_requested(do_refresh).await;
// Whether auth is required for the active model provider.
let requires_openai_auth = self.config.model_provider.requires_openai_auth;
if !requires_openai_auth {
let response = GetAccountResponse {
account: None,
requires_openai_auth,
};
self.outgoing.send_response(request_id, response).await;
return;
}
let account = match self.auth_manager.auth() {
Some(auth) => Some(match auth.mode {
AuthMode::ApiKey => Account::ApiKey {},
AuthMode::ChatGPT => {
let email = auth.get_account_email();
let plan_type = auth.account_plan_type();
match (email, plan_type) {
(Some(email), Some(plan_type)) => Account::Chatgpt { email, plan_type },
_ => {
let error = JSONRPCErrorError {
code: INVALID_REQUEST_ERROR_CODE,
message:
"email and plan type are required for chatgpt authentication"
.to_string(),
data: None,
};
self.outgoing.send_error(request_id, error).await;
return;
}
}
}
}),
None => None,
};
let response = GetAccountResponse {
account,
requires_openai_auth,
};
self.outgoing.send_response(request_id, response).await;
}
async fn get_user_agent(&self, request_id: RequestId) {
let user_agent = get_codex_user_agent();
let response = GetUserAgentResponse { user_agent };
@@ -1146,8 +1230,31 @@ impl CodexMessageProcessor {
match self.conversation_manager.new_conversation(config).await {
Ok(new_conv) => {
let thread = Thread {
id: new_conv.conversation_id.to_string(),
let conversation_id = new_conv.conversation_id;
let rollout_path = new_conv.session_configured.rollout_path.clone();
let fallback_provider = self.config.model_provider_id.as_str();
// A bit hacky, but the summary contains a lot of useful information for the thread
// that unfortunately does not get returned from conversation_manager.new_conversation().
let thread = match read_summary_from_rollout(
rollout_path.as_path(),
fallback_provider,
)
.await
{
Ok(summary) => summary_to_thread(summary),
Err(err) => {
warn!(
"failed to load summary for new thread {}: {}",
conversation_id, err
);
Thread {
id: conversation_id.to_string(),
preview: String::new(),
model_provider: self.config.model_provider_id.clone(),
created_at: chrono::Utc::now().timestamp(),
}
}
};
let response = ThreadStartResponse {
@@ -1157,12 +1264,12 @@ impl CodexMessageProcessor {
// Auto-attach a conversation listener when starting a thread.
// Use the same behavior as the v1 API with experimental_raw_events=false.
if let Err(err) = self
.attach_conversation_listener(new_conv.conversation_id, false)
.attach_conversation_listener(conversation_id, false)
.await
{
tracing::warn!(
"failed to attach listener for conversation {}: {}",
new_conv.conversation_id,
conversation_id,
err.message
);
}
@@ -1260,12 +1367,7 @@ impl CodexMessageProcessor {
}
};
let data = summaries
.into_iter()
.map(|s| Thread {
id: s.conversation_id.to_string(),
})
.collect();
let data = summaries.into_iter().map(summary_to_thread).collect();
let response = ThreadListResponse { data, next_cursor };
self.outgoing.send_response(request_id, response).await;
@@ -1352,6 +1454,8 @@ impl CodexMessageProcessor {
.await
{
Ok(_) => {
let thread = summary_to_thread(summary);
// Auto-attach a conversation listener when resuming a thread.
if let Err(err) = self
.attach_conversation_listener(conversation_id, false)
@@ -1364,11 +1468,7 @@ impl CodexMessageProcessor {
);
}
let response = ThreadResumeResponse {
thread: Thread {
id: conversation_id.to_string(),
},
};
let response = ThreadResumeResponse { thread };
self.outgoing.send_response(request_id, response).await;
}
Err(err) => {
@@ -1510,7 +1610,18 @@ impl CodexMessageProcessor {
let items = page
.items
.into_iter()
.filter_map(|it| extract_conversation_summary(it.path, &it.head, &fallback_provider))
.filter_map(|it| {
let session_meta_line = it.head.first().and_then(|first| {
serde_json::from_value::<SessionMetaLine>(first.clone()).ok()
})?;
extract_conversation_summary(
it.path,
&it.head,
&session_meta_line.meta,
session_meta_line.git.as_ref(),
fallback_provider.as_str(),
)
})
.collect::<Vec<_>>();
// Encode next_cursor as a plain string
@@ -2060,34 +2171,14 @@ impl CodexMessageProcessor {
}
async fn turn_start(&self, request_id: RequestId, params: TurnStartParams) {
// Resolve conversation id from v2 thread id string.
let conversation_id = match ConversationId::from_string(&params.thread_id) {
Ok(id) => id,
Err(err) => {
let error = JSONRPCErrorError {
code: INVALID_REQUEST_ERROR_CODE,
message: format!("invalid thread id: {err}"),
data: None,
};
let (_, conversation) = match self.conversation_from_thread_id(&params.thread_id).await {
Ok(v) => v,
Err(error) => {
self.outgoing.send_error(request_id, error).await;
return;
}
};
let Ok(conversation) = self
.conversation_manager
.get_conversation(conversation_id)
.await
else {
let error = JSONRPCErrorError {
code: INVALID_REQUEST_ERROR_CODE,
message: format!("conversation not found: {conversation_id}"),
data: None,
};
self.outgoing.send_error(request_id, error).await;
return;
};
// Keep a copy of v2 inputs for the notification payload.
let v2_inputs_for_notif = params.input.clone();
@@ -2161,33 +2252,14 @@ impl CodexMessageProcessor {
async fn turn_interrupt(&mut self, request_id: RequestId, params: TurnInterruptParams) {
let TurnInterruptParams { thread_id, .. } = params;
// Resolve conversation id from v2 thread id string.
let conversation_id = match ConversationId::from_string(&thread_id) {
Ok(id) => id,
Err(err) => {
let error = JSONRPCErrorError {
code: INVALID_REQUEST_ERROR_CODE,
message: format!("invalid thread id: {err}"),
data: None,
};
self.outgoing.send_error(request_id, error).await;
return;
}
};
let Ok(conversation) = self
.conversation_manager
.get_conversation(conversation_id)
.await
else {
let error = JSONRPCErrorError {
code: INVALID_REQUEST_ERROR_CODE,
message: format!("conversation not found: {conversation_id}"),
data: None,
let (conversation_id, conversation) =
match self.conversation_from_thread_id(&thread_id).await {
Ok(v) => v,
Err(error) => {
self.outgoing.send_error(request_id, error).await;
return;
}
};
self.outgoing.send_error(request_id, error).await;
return;
};
// Record the pending interrupt so we can reply when TurnAborted arrives.
{
@@ -2539,6 +2611,20 @@ async fn apply_bespoke_event_handling(
.await;
}
}
EventMsg::ItemStarted(item_started_event) => {
let item: ThreadItem = item_started_event.item.clone().into();
let notification = ItemStartedNotification { item };
outgoing
.send_server_notification(ServerNotification::ItemStarted(notification))
.await;
}
EventMsg::ItemCompleted(item_completed_event) => {
let item: ThreadItem = item_completed_event.item.clone().into();
let notification = ItemCompletedNotification { item };
outgoing
.send_server_notification(ServerNotification::ItemCompleted(notification))
.await;
}
// If this is a TurnAborted, reply to any pending interrupt requests.
EventMsg::TurnAborted(turn_aborted_event) => {
let pending = {
@@ -2671,16 +2757,25 @@ async fn read_summary_from_rollout(
)));
};
let session_meta = serde_json::from_value::<SessionMeta>(first.clone()).map_err(|_| {
IoError::other(format!(
"rollout at {} does not start with session metadata",
path.display()
))
})?;
let session_meta_line =
serde_json::from_value::<SessionMetaLine>(first.clone()).map_err(|_| {
IoError::other(format!(
"rollout at {} does not start with session metadata",
path.display()
))
})?;
let SessionMetaLine {
meta: session_meta,
git,
} = session_meta_line;
if let Some(summary) =
extract_conversation_summary(path.to_path_buf(), &head, fallback_provider)
{
if let Some(summary) = extract_conversation_summary(
path.to_path_buf(),
&head,
&session_meta,
git.as_ref(),
fallback_provider,
) {
return Ok(summary);
}
@@ -2691,7 +2786,9 @@ async fn read_summary_from_rollout(
};
let model_provider = session_meta
.model_provider
.clone()
.unwrap_or_else(|| fallback_provider.to_string());
let git_info = git.as_ref().map(map_git_info);
Ok(ConversationSummary {
conversation_id: session_meta.id,
@@ -2699,19 +2796,20 @@ async fn read_summary_from_rollout(
path: path.to_path_buf(),
preview: String::new(),
model_provider,
cwd: session_meta.cwd,
cli_version: session_meta.cli_version,
source: session_meta.source,
git_info,
})
}
fn extract_conversation_summary(
path: PathBuf,
head: &[serde_json::Value],
session_meta: &SessionMeta,
git: Option<&GitInfo>,
fallback_provider: &str,
) -> Option<ConversationSummary> {
let session_meta = match head.first() {
Some(first_line) => serde_json::from_value::<SessionMeta>(first_line.clone()).ok()?,
None => return None,
};
let preview = head
.iter()
.filter_map(|value| serde_json::from_value::<ResponseItem>(value.clone()).ok())
@@ -2733,7 +2831,9 @@ fn extract_conversation_summary(
let conversation_id = session_meta.id;
let model_provider = session_meta
.model_provider
.clone()
.unwrap_or_else(|| fallback_provider.to_string());
let git_info = git.map(map_git_info);
Some(ConversationSummary {
conversation_id,
@@ -2741,13 +2841,53 @@ fn extract_conversation_summary(
path,
preview: preview.to_string(),
model_provider,
cwd: session_meta.cwd.clone(),
cli_version: session_meta.cli_version.clone(),
source: session_meta.source.clone(),
git_info,
})
}
fn map_git_info(git_info: &GitInfo) -> ConversationGitInfo {
ConversationGitInfo {
sha: git_info.commit_hash.clone(),
branch: git_info.branch.clone(),
origin_url: git_info.repository_url.clone(),
}
}
fn parse_datetime(timestamp: Option<&str>) -> Option<DateTime<Utc>> {
timestamp.and_then(|ts| {
chrono::DateTime::parse_from_rfc3339(ts)
.ok()
.map(|dt| dt.with_timezone(&chrono::Utc))
})
}
fn summary_to_thread(summary: ConversationSummary) -> Thread {
let ConversationSummary {
conversation_id,
preview,
timestamp,
model_provider,
..
} = summary;
let created_at = parse_datetime(timestamp.as_deref());
Thread {
id: conversation_id.to_string(),
preview,
model_provider,
created_at: created_at.map(|dt| dt.timestamp()).unwrap_or(0),
}
}
#[cfg(test)]
mod tests {
use super::*;
use anyhow::Result;
use codex_protocol::protocol::SessionSource;
use pretty_assertions::assert_eq;
use serde_json::json;
use tempfile::TempDir;
@@ -2786,8 +2926,11 @@ mod tests {
}),
];
let session_meta = serde_json::from_value::<SessionMeta>(head[0].clone())?;
let summary =
extract_conversation_summary(path.clone(), &head, "test-provider").expect("summary");
extract_conversation_summary(path.clone(), &head, &session_meta, None, "test-provider")
.expect("summary");
let expected = ConversationSummary {
conversation_id,
@@ -2795,6 +2938,10 @@ mod tests {
path,
preview: "Count to 5".to_string(),
model_provider: "test-provider".to_string(),
cwd: PathBuf::from("/"),
cli_version: "0.0.0".to_string(),
source: SessionSource::VSCode,
git_info: None,
};
assert_eq!(summary, expected);
@@ -2839,6 +2986,10 @@ mod tests {
path: path.clone(),
preview: String::new(),
model_provider: "fallback".to_string(),
cwd: PathBuf::new(),
cli_version: String::new(),
source: SessionSource::VSCode,
git_info: None,
};
assert_eq!(summary, expected);

View File

@@ -19,6 +19,7 @@ use codex_app_server_protocol::CancelLoginChatGptParams;
use codex_app_server_protocol::ClientInfo;
use codex_app_server_protocol::ClientNotification;
use codex_app_server_protocol::FeedbackUploadParams;
use codex_app_server_protocol::GetAccountParams;
use codex_app_server_protocol::GetAuthStatusParams;
use codex_app_server_protocol::InitializeParams;
use codex_app_server_protocol::InterruptConversationParams;
@@ -249,6 +250,15 @@ impl McpProcess {
self.send_request("account/rateLimits/read", None).await
}
/// Send an `account/read` JSON-RPC request.
pub async fn send_get_account_request(
&mut self,
params: GetAccountParams,
) -> anyhow::Result<i64> {
let params = Some(serde_json::to_value(params)?);
self.send_request("account/read", params).await
}
/// Send a `feedback/upload` JSON-RPC request.
pub async fn send_feedback_upload_request(
&mut self,

View File

@@ -7,8 +7,6 @@ mod fuzzy_file_search;
mod interrupt;
mod list_resume;
mod login;
mod model_list;
mod rate_limits;
mod send_message;
mod set_default_model;
mod user_agent;

View File

@@ -2,11 +2,15 @@ use anyhow::Result;
use anyhow::bail;
use app_test_support::McpProcess;
use app_test_support::to_response;
use app_test_support::ChatGptAuthFixture;
use app_test_support::write_chatgpt_auth;
use codex_app_server_protocol::Account;
use codex_app_server_protocol::AuthMode;
use codex_app_server_protocol::CancelLoginAccountParams;
use codex_app_server_protocol::CancelLoginAccountResponse;
use codex_app_server_protocol::GetAuthStatusParams;
use codex_app_server_protocol::GetAuthStatusResponse;
use codex_app_server_protocol::GetAccountParams;
use codex_app_server_protocol::GetAccountResponse;
use codex_app_server_protocol::JSONRPCError;
use codex_app_server_protocol::JSONRPCResponse;
use codex_app_server_protocol::LoginAccountResponse;
@@ -15,6 +19,7 @@ use codex_app_server_protocol::RequestId;
use codex_app_server_protocol::ServerNotification;
use codex_core::auth::AuthCredentialsStoreMode;
use codex_login::login_with_api_key;
use codex_protocol::account::PlanType as AccountPlanType;
use pretty_assertions::assert_eq;
use serial_test::serial;
use std::path::Path;
@@ -25,22 +30,30 @@ use tokio::time::timeout;
const DEFAULT_READ_TIMEOUT: std::time::Duration = std::time::Duration::from_secs(10);
// Helper to create a minimal config.toml for the app server
fn create_config_toml(
codex_home: &Path,
forced_method: Option<&str>,
forced_workspace_id: Option<&str>,
) -> std::io::Result<()> {
#[derive(Default)]
struct CreateConfigTomlParams {
forced_method: Option<String>,
forced_workspace_id: Option<String>,
requires_openai_auth: Option<bool>,
}
fn create_config_toml(codex_home: &Path, params: CreateConfigTomlParams) -> std::io::Result<()> {
let config_toml = codex_home.join("config.toml");
let forced_line = if let Some(method) = forced_method {
let forced_line = if let Some(method) = params.forced_method {
format!("forced_login_method = \"{method}\"\n")
} else {
String::new()
};
let forced_workspace_line = if let Some(ws) = forced_workspace_id {
let forced_workspace_line = if let Some(ws) = params.forced_workspace_id {
format!("forced_chatgpt_workspace_id = \"{ws}\"\n")
} else {
String::new()
};
let requires_line = match params.requires_openai_auth {
Some(true) => "requires_openai_auth = true\n".to_string(),
Some(false) => String::new(),
None => String::new(),
};
let contents = format!(
r#"
model = "mock-model"
@@ -57,6 +70,7 @@ base_url = "http://127.0.0.1:0/v1"
wire_api = "chat"
request_max_retries = 0
stream_max_retries = 0
{requires_line}
"#
);
std::fs::write(config_toml, contents)
@@ -65,7 +79,7 @@ stream_max_retries = 0
#[tokio::test]
async fn logout_account_removes_auth_and_notifies() -> Result<()> {
let codex_home = TempDir::new()?;
create_config_toml(codex_home.path(), None, None)?;
create_config_toml(codex_home.path(), CreateConfigTomlParams::default())?;
login_with_api_key(
codex_home.path(),
@@ -104,27 +118,25 @@ async fn logout_account_removes_auth_and_notifies() -> Result<()> {
"auth.json should be deleted"
);
let status_id = mcp
.send_get_auth_status_request(GetAuthStatusParams {
include_token: Some(true),
refresh_token: Some(false),
let get_id = mcp
.send_get_account_request(GetAccountParams {
refresh_token: false,
})
.await?;
let status_resp: JSONRPCResponse = timeout(
let get_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(status_id)),
mcp.read_stream_until_response_message(RequestId::Integer(get_id)),
)
.await??;
let status: GetAuthStatusResponse = to_response(status_resp)?;
assert_eq!(status.auth_method, None);
assert_eq!(status.auth_token, None);
let account: GetAccountResponse = to_response(get_resp)?;
assert_eq!(account.account, None);
Ok(())
}
#[tokio::test]
async fn login_account_api_key_succeeds_and_notifies() -> Result<()> {
let codex_home = TempDir::new()?;
create_config_toml(codex_home.path(), None, None)?;
create_config_toml(codex_home.path(), CreateConfigTomlParams::default())?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
@@ -171,7 +183,13 @@ async fn login_account_api_key_succeeds_and_notifies() -> Result<()> {
#[tokio::test]
async fn login_account_api_key_rejected_when_forced_chatgpt() -> Result<()> {
let codex_home = TempDir::new()?;
create_config_toml(codex_home.path(), Some("chatgpt"), None)?;
create_config_toml(
codex_home.path(),
CreateConfigTomlParams {
forced_method: Some("chatgpt".to_string()),
..Default::default()
},
)?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
@@ -195,7 +213,13 @@ async fn login_account_api_key_rejected_when_forced_chatgpt() -> Result<()> {
#[tokio::test]
async fn login_account_chatgpt_rejected_when_forced_api() -> Result<()> {
let codex_home = TempDir::new()?;
create_config_toml(codex_home.path(), Some("api"), None)?;
create_config_toml(
codex_home.path(),
CreateConfigTomlParams {
forced_method: Some("api".to_string()),
..Default::default()
},
)?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
@@ -219,7 +243,7 @@ async fn login_account_chatgpt_rejected_when_forced_api() -> Result<()> {
#[serial(login_port)]
async fn login_account_chatgpt_start() -> Result<()> {
let codex_home = TempDir::new()?;
create_config_toml(codex_home.path(), None, None)?;
create_config_toml(codex_home.path(), CreateConfigTomlParams::default())?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
@@ -285,7 +309,13 @@ async fn login_account_chatgpt_start() -> Result<()> {
#[serial(login_port)]
async fn login_account_chatgpt_includes_forced_workspace_query_param() -> Result<()> {
let codex_home = TempDir::new()?;
create_config_toml(codex_home.path(), None, Some("ws-forced"))?;
create_config_toml(
codex_home.path(),
CreateConfigTomlParams {
forced_workspace_id: Some("ws-forced".to_string()),
..Default::default()
},
)?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
@@ -307,3 +337,156 @@ async fn login_account_chatgpt_includes_forced_workspace_query_param() -> Result
);
Ok(())
}
#[tokio::test]
async fn get_account_no_auth() -> Result<()> {
let codex_home = TempDir::new()?;
create_config_toml(
codex_home.path(),
CreateConfigTomlParams {
requires_openai_auth: Some(true),
..Default::default()
},
)?;
let mut mcp = McpProcess::new_with_env(codex_home.path(), &[("OPENAI_API_KEY", None)]).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
let params = GetAccountParams {
refresh_token: false,
};
let request_id = mcp.send_get_account_request(params).await?;
let resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(request_id)),
)
.await??;
let account: GetAccountResponse = to_response(resp)?;
assert_eq!(account.account, None, "expected no account");
assert_eq!(account.requires_openai_auth, true);
Ok(())
}
#[tokio::test]
async fn get_account_with_api_key() -> Result<()> {
let codex_home = TempDir::new()?;
create_config_toml(
codex_home.path(),
CreateConfigTomlParams {
requires_openai_auth: Some(true),
..Default::default()
},
)?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
let req_id = mcp
.send_login_account_api_key_request("sk-test-key")
.await?;
let resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(req_id)),
)
.await??;
let _login_ok = to_response::<LoginAccountResponse>(resp)?;
let params = GetAccountParams {
refresh_token: false,
};
let request_id = mcp.send_get_account_request(params).await?;
let resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(request_id)),
)
.await??;
let received: GetAccountResponse = to_response(resp)?;
let expected = GetAccountResponse {
account: Some(Account::ApiKey {}),
requires_openai_auth: true,
};
assert_eq!(received, expected);
Ok(())
}
#[tokio::test]
async fn get_account_when_auth_not_required() -> Result<()> {
let codex_home = TempDir::new()?;
create_config_toml(
codex_home.path(),
CreateConfigTomlParams {
requires_openai_auth: Some(false),
..Default::default()
},
)?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
let params = GetAccountParams {
refresh_token: false,
};
let request_id = mcp.send_get_account_request(params).await?;
let resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(request_id)),
)
.await??;
let received: GetAccountResponse = to_response(resp)?;
let expected = GetAccountResponse {
account: None,
requires_openai_auth: false,
};
assert_eq!(received, expected);
Ok(())
}
#[tokio::test]
async fn get_account_with_chatgpt() -> Result<()> {
let codex_home = TempDir::new()?;
create_config_toml(
codex_home.path(),
CreateConfigTomlParams {
requires_openai_auth: Some(true),
..Default::default()
},
)?;
write_chatgpt_auth(
codex_home.path(),
ChatGptAuthFixture::new("access-chatgpt")
.email("user@example.com")
.plan_type("pro"),
AuthCredentialsStoreMode::File,
)?;
let mut mcp = McpProcess::new_with_env(codex_home.path(), &[("OPENAI_API_KEY", None)]).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
let params = GetAccountParams {
refresh_token: false,
};
let request_id = mcp.send_get_account_request(params).await?;
let resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(request_id)),
)
.await??;
let received: GetAccountResponse = to_response(resp)?;
let expected = GetAccountResponse {
account: Some(Account::Chatgpt {
email: "user@example.com".to_string(),
plan_type: AccountPlanType::Pro,
}),
requires_openai_auth: true,
};
assert_eq!(received, expected);
Ok(())
}

View File

@@ -1,4 +1,6 @@
mod account;
mod model_list;
mod rate_limits;
mod thread_archive;
mod thread_list;
mod thread_resume;

View File

@@ -19,7 +19,7 @@ use tokio::time::timeout;
const DEFAULT_TIMEOUT: Duration = Duration::from_secs(10);
const INVALID_REQUEST_ERROR_CODE: i64 = -32600;
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
#[tokio::test]
async fn list_models_returns_all_models_with_large_limit() -> Result<()> {
let codex_home = TempDir::new()?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
@@ -106,7 +106,7 @@ async fn list_models_returns_all_models_with_large_limit() -> Result<()> {
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
#[tokio::test]
async fn list_models_pagination_works() -> Result<()> {
let codex_home = TempDir::new()?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
@@ -159,7 +159,7 @@ async fn list_models_pagination_works() -> Result<()> {
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
#[tokio::test]
async fn list_models_rejects_invalid_cursor() -> Result<()> {
let codex_home = TempDir::new()?;
let mut mcp = McpProcess::new(codex_home.path()).await?;

View File

@@ -26,7 +26,7 @@ use wiremock::matchers::path;
const DEFAULT_READ_TIMEOUT: std::time::Duration = std::time::Duration::from_secs(10);
const INVALID_REQUEST_ERROR_CODE: i64 = -32600;
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
#[tokio::test]
async fn get_account_rate_limits_requires_auth() -> Result<()> {
let codex_home = TempDir::new()?;
@@ -51,7 +51,7 @@ async fn get_account_rate_limits_requires_auth() -> Result<()> {
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
#[tokio::test]
async fn get_account_rate_limits_requires_chatgpt_auth() -> Result<()> {
let codex_home = TempDir::new()?;
@@ -78,7 +78,7 @@ async fn get_account_rate_limits_requires_chatgpt_auth() -> Result<()> {
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
#[tokio::test]
async fn get_account_rate_limits_returns_snapshot() -> Result<()> {
let codex_home = TempDir::new()?;
write_chatgpt_auth(

View File

@@ -102,6 +102,11 @@ async fn thread_list_pagination_next_cursor_none_on_last_page() -> Result<()> {
next_cursor: cursor1,
} = to_response::<ThreadListResponse>(page1_resp)?;
assert_eq!(data1.len(), 2);
for thread in &data1 {
assert_eq!(thread.preview, "Hello");
assert_eq!(thread.model_provider, "mock_provider");
assert!(thread.created_at > 0);
}
let cursor1 = cursor1.expect("expected nextCursor on first page");
// Page 2: with cursor → expect next_cursor None when no more results.
@@ -122,6 +127,11 @@ async fn thread_list_pagination_next_cursor_none_on_last_page() -> Result<()> {
next_cursor: cursor2,
} = to_response::<ThreadListResponse>(page2_resp)?;
assert!(data2.len() <= 2);
for thread in &data2 {
assert_eq!(thread.preview, "Hello");
assert_eq!(thread.model_provider, "mock_provider");
assert!(thread.created_at > 0);
}
assert_eq!(cursor2, None, "expected nextCursor to be null on last page");
Ok(())
@@ -200,6 +210,11 @@ async fn thread_list_respects_provider_filter() -> Result<()> {
let ThreadListResponse { data, next_cursor } = to_response::<ThreadListResponse>(resp)?;
assert_eq!(data.len(), 1);
assert_eq!(next_cursor, None);
let thread = &data[0];
assert_eq!(thread.preview, "X");
assert_eq!(thread.model_provider, "other_provider");
let expected_ts = chrono::DateTime::parse_from_rfc3339("2025-01-02T11:00:00Z")?.timestamp();
assert_eq!(thread.created_at, expected_ts);
Ok(())
}

View File

@@ -49,7 +49,7 @@ async fn thread_resume_returns_existing_thread() -> Result<()> {
.await??;
let ThreadResumeResponse { thread: resumed } =
to_response::<ThreadResumeResponse>(resume_resp)?;
assert_eq!(resumed.id, thread.id);
assert_eq!(resumed, thread);
Ok(())
}

View File

@@ -42,6 +42,15 @@ async fn thread_start_creates_thread_and_emits_started() -> Result<()> {
.await??;
let ThreadStartResponse { thread } = to_response::<ThreadStartResponse>(resp)?;
assert!(!thread.id.is_empty(), "thread id should not be empty");
assert!(
thread.preview.is_empty(),
"new threads should start with an empty preview"
);
assert_eq!(thread.model_provider, "mock_provider");
assert!(
thread.created_at > 0,
"created_at should be a positive UNIX timestamp"
);
// A corresponding thread/started notification should arrive.
let notif: JSONRPCNotification = timeout(
@@ -51,7 +60,7 @@ async fn thread_start_creates_thread_and_emits_started() -> Result<()> {
.await??;
let started: ThreadStartedNotification =
serde_json::from_value(notif.params.expect("params must be present"))?;
assert_eq!(started.thread.id, thread.id);
assert_eq!(started.thread, thread);
Ok(())
}

View File

@@ -288,7 +288,7 @@ pub fn maybe_parse_apply_patch_verified(argv: &[String], cwd: &Path) -> MaybeApp
path,
ApplyPatchFileChange::Update {
unified_diff,
move_path: move_path.map(|p| cwd.join(p)),
move_path: move_path.map(|p| effective_cwd.join(p)),
new_content: contents,
},
);
@@ -1603,6 +1603,53 @@ g
);
}
#[test]
fn test_apply_patch_resolves_move_path_with_effective_cwd() {
let session_dir = tempdir().unwrap();
let worktree_rel = "alt";
let worktree_dir = session_dir.path().join(worktree_rel);
fs::create_dir_all(&worktree_dir).unwrap();
let source_name = "old.txt";
let dest_name = "renamed.txt";
let source_path = worktree_dir.join(source_name);
fs::write(&source_path, "before\n").unwrap();
let patch = wrap_patch(&format!(
r#"*** Update File: {source_name}
*** Move to: {dest_name}
@@
-before
+after"#
));
let shell_script = format!("cd {worktree_rel} && apply_patch <<'PATCH'\n{patch}\nPATCH");
let argv = vec!["bash".into(), "-lc".into(), shell_script];
let result = maybe_parse_apply_patch_verified(&argv, session_dir.path());
let action = match result {
MaybeApplyPatchVerified::Body(action) => action,
other => panic!("expected verified body, got {other:?}"),
};
assert_eq!(action.cwd, worktree_dir);
let change = action
.changes()
.get(&worktree_dir.join(source_name))
.expect("source file change present");
match change {
ApplyPatchFileChange::Update { move_path, .. } => {
assert_eq!(
move_path.as_deref(),
Some(worktree_dir.join(dest_name).as_path())
);
}
other => panic!("expected update change, got {other:?}"),
}
}
#[test]
fn test_apply_patch_fails_on_write_error() {
let dir = tempdir().unwrap();

View File

@@ -11,32 +11,7 @@ const LINUX_SANDBOX_ARG0: &str = "codex-linux-sandbox";
const APPLY_PATCH_ARG0: &str = "apply_patch";
const MISSPELLED_APPLY_PATCH_ARG0: &str = "applypatch";
/// While we want to deploy the Codex CLI as a single executable for simplicity,
/// we also want to expose some of its functionality as distinct CLIs, so we use
/// the "arg0 trick" to determine which CLI to dispatch. This effectively allows
/// us to simulate deploying multiple executables as a single binary on Mac and
/// Linux (but not Windows).
///
/// When the current executable is invoked through the hard-link or alias named
/// `codex-linux-sandbox` we *directly* execute
/// [`codex_linux_sandbox::run_main`] (which never returns). Otherwise we:
///
/// 1. Load `.env` values from `~/.codex/.env` before creating any threads.
/// 2. Construct a Tokio multi-thread runtime.
/// 3. Derive the path to the current executable (so children can re-invoke the
/// sandbox) when running on Linux.
/// 4. Execute the provided async `main_fn` inside that runtime, forwarding any
/// error. Note that `main_fn` receives `codex_linux_sandbox_exe:
/// Option<PathBuf>`, as an argument, which is generally needed as part of
/// constructing [`codex_core::config::Config`].
///
/// This function should be used to wrap any `main()` function in binary crates
/// in this workspace that depends on these helper CLIs.
pub fn arg0_dispatch_or_else<F, Fut>(main_fn: F) -> anyhow::Result<()>
where
F: FnOnce(Option<PathBuf>) -> Fut,
Fut: Future<Output = anyhow::Result<()>>,
{
pub fn arg0_dispatch() -> Option<TempDir> {
// Determine if we were invoked via the special alias.
let mut args = std::env::args_os();
let argv0 = args.next().unwrap_or_default();
@@ -76,10 +51,7 @@ where
// before creating any threads/the Tokio runtime.
load_dotenv();
// Retain the TempDir so it exists for the lifetime of the invocation of
// this executable. Admittedly, we could invoke `keep()` on it, but it
// would be nice to avoid leaving temporary directories behind, if possible.
let _path_entry = match prepend_path_entry_for_apply_patch() {
match prepend_path_entry_for_codex_aliases() {
Ok(path_entry) => Some(path_entry),
Err(err) => {
// It is possible that Codex will proceed successfully even if
@@ -87,7 +59,39 @@ where
eprintln!("WARNING: proceeding, even though we could not update PATH: {err}");
None
}
};
}
}
/// While we want to deploy the Codex CLI as a single executable for simplicity,
/// we also want to expose some of its functionality as distinct CLIs, so we use
/// the "arg0 trick" to determine which CLI to dispatch. This effectively allows
/// us to simulate deploying multiple executables as a single binary on Mac and
/// Linux (but not Windows).
///
/// When the current executable is invoked through the hard-link or alias named
/// `codex-linux-sandbox` we *directly* execute
/// [`codex_linux_sandbox::run_main`] (which never returns). Otherwise we:
///
/// 1. Load `.env` values from `~/.codex/.env` before creating any threads.
/// 2. Construct a Tokio multi-thread runtime.
/// 3. Derive the path to the current executable (so children can re-invoke the
/// sandbox) when running on Linux.
/// 4. Execute the provided async `main_fn` inside that runtime, forwarding any
/// error. Note that `main_fn` receives `codex_linux_sandbox_exe:
/// Option<PathBuf>`, as an argument, which is generally needed as part of
/// constructing [`codex_core::config::Config`].
///
/// This function should be used to wrap any `main()` function in binary crates
/// in this workspace that depends on these helper CLIs.
pub fn arg0_dispatch_or_else<F, Fut>(main_fn: F) -> anyhow::Result<()>
where
F: FnOnce(Option<PathBuf>) -> Fut,
Fut: Future<Output = anyhow::Result<()>>,
{
// Retain the TempDir so it exists for the lifetime of the invocation of
// this executable. Admittedly, we could invoke `keep()` on it, but it
// would be nice to avoid leaving temporary directories behind, if possible.
let _path_entry = arg0_dispatch();
// Regular invocation create a Tokio runtime and execute the provided
// async entry-point.
@@ -144,11 +148,16 @@ where
///
/// IMPORTANT: This function modifies the PATH environment variable, so it MUST
/// be called before multiple threads are spawned.
fn prepend_path_entry_for_apply_patch() -> std::io::Result<TempDir> {
pub fn prepend_path_entry_for_codex_aliases() -> std::io::Result<TempDir> {
let temp_dir = TempDir::new()?;
let path = temp_dir.path();
for filename in &[APPLY_PATCH_ARG0, MISSPELLED_APPLY_PATCH_ARG0] {
for filename in &[
APPLY_PATCH_ARG0,
MISSPELLED_APPLY_PATCH_ARG0,
#[cfg(target_os = "linux")]
LINUX_SANDBOX_ARG0,
] {
let exe = std::env::current_exe()?;
#[cfg(unix)]

View File

@@ -30,13 +30,14 @@ codex-login = { workspace = true }
codex-mcp-server = { workspace = true }
codex-process-hardening = { workspace = true }
codex-protocol = { workspace = true }
codex-protocol-ts = { workspace = true }
codex-responses-api-proxy = { workspace = true }
codex-rmcp-client = { workspace = true }
codex-stdio-to-uds = { workspace = true }
codex-tui = { workspace = true }
ctor = { workspace = true }
libc = { workspace = true }
owo-colors = { workspace = true }
regex-lite = { workspace = true}
serde_json = { workspace = true }
supports-color = { workspace = true }
toml = { workspace = true }
@@ -47,6 +48,7 @@ tokio = { workspace = true, features = [
"rt-multi-thread",
"signal",
] }
tracing = { workspace = true }
[target.'cfg(target_os = "windows")'.dependencies]
codex_windows_sandbox = { package = "codex-windows-sandbox", path = "../windows-sandbox-rs" }

View File

@@ -1,3 +1,8 @@
#[cfg(target_os = "macos")]
mod pid_tracker;
#[cfg(target_os = "macos")]
mod seatbelt;
use std::path::PathBuf;
use codex_common::CliConfigOverrides;
@@ -5,6 +10,7 @@ use codex_core::config::Config;
use codex_core::config::ConfigOverrides;
use codex_core::exec_env::create_env;
use codex_core::landlock::spawn_command_under_linux_sandbox;
#[cfg(target_os = "macos")]
use codex_core::seatbelt::spawn_command_under_seatbelt;
use codex_core::spawn::StdioPolicy;
use codex_protocol::config_types::SandboxMode;
@@ -14,12 +20,17 @@ use crate::SeatbeltCommand;
use crate::WindowsCommand;
use crate::exit_status::handle_exit_status;
#[cfg(target_os = "macos")]
use seatbelt::DenialLogger;
#[cfg(target_os = "macos")]
pub async fn run_command_under_seatbelt(
command: SeatbeltCommand,
codex_linux_sandbox_exe: Option<PathBuf>,
) -> anyhow::Result<()> {
let SeatbeltCommand {
full_auto,
log_denials,
config_overrides,
command,
} = command;
@@ -29,10 +40,19 @@ pub async fn run_command_under_seatbelt(
config_overrides,
codex_linux_sandbox_exe,
SandboxType::Seatbelt,
log_denials,
)
.await
}
#[cfg(not(target_os = "macos"))]
pub async fn run_command_under_seatbelt(
_command: SeatbeltCommand,
_codex_linux_sandbox_exe: Option<PathBuf>,
) -> anyhow::Result<()> {
anyhow::bail!("Seatbelt sandbox is only available on macOS");
}
pub async fn run_command_under_landlock(
command: LandlockCommand,
codex_linux_sandbox_exe: Option<PathBuf>,
@@ -48,6 +68,7 @@ pub async fn run_command_under_landlock(
config_overrides,
codex_linux_sandbox_exe,
SandboxType::Landlock,
false,
)
.await
}
@@ -67,11 +88,13 @@ pub async fn run_command_under_windows(
config_overrides,
codex_linux_sandbox_exe,
SandboxType::Windows,
false,
)
.await
}
enum SandboxType {
#[cfg(target_os = "macos")]
Seatbelt,
Landlock,
Windows,
@@ -83,6 +106,7 @@ async fn run_command_under_sandbox(
config_overrides: CliConfigOverrides,
codex_linux_sandbox_exe: Option<PathBuf>,
sandbox_type: SandboxType,
log_denials: bool,
) -> anyhow::Result<()> {
let sandbox_mode = create_sandbox_mode(full_auto);
let config = Config::load_with_cli_overrides(
@@ -125,6 +149,8 @@ async fn run_command_under_sandbox(
let env_map = env.clone();
let command_vec = command.clone();
let base_dir = config.codex_home.clone();
// Preflight audit is invoked elsewhere at the appropriate times.
let res = tokio::task::spawn_blocking(move || {
run_windows_sandbox_capture(
policy_str,
@@ -167,7 +193,13 @@ async fn run_command_under_sandbox(
}
}
#[cfg(target_os = "macos")]
let mut denial_logger = log_denials.then(DenialLogger::new).flatten();
#[cfg(not(target_os = "macos"))]
let _ = log_denials;
let mut child = match sandbox_type {
#[cfg(target_os = "macos")]
SandboxType::Seatbelt => {
spawn_command_under_seatbelt(
command,
@@ -199,8 +231,27 @@ async fn run_command_under_sandbox(
unreachable!("Windows sandbox should have been handled above");
}
};
#[cfg(target_os = "macos")]
if let Some(denial_logger) = &mut denial_logger {
denial_logger.on_child_spawn(&child);
}
let status = child.wait().await?;
#[cfg(target_os = "macos")]
if let Some(denial_logger) = denial_logger {
let denials = denial_logger.finish().await;
eprintln!("\n=== Sandbox denials ===");
if denials.is_empty() {
eprintln!("None found.");
} else {
for seatbelt::SandboxDenial { name, capability } in denials {
eprintln!("({name}) {capability}");
}
}
}
handle_exit_status(status);
}

View File

@@ -0,0 +1,372 @@
use std::collections::HashSet;
use tokio::task::JoinHandle;
use tracing::warn;
/// Tracks the (recursive) descendants of a process by using `kqueue` to watch for fork events, and
/// `proc_listchildpids` to list the children of a process.
pub(crate) struct PidTracker {
kq: libc::c_int,
handle: JoinHandle<HashSet<i32>>,
}
impl PidTracker {
pub(crate) fn new(root_pid: i32) -> Option<Self> {
if root_pid <= 0 {
return None;
}
let kq = unsafe { libc::kqueue() };
let handle = tokio::task::spawn_blocking(move || track_descendants(kq, root_pid));
Some(Self { kq, handle })
}
pub(crate) async fn stop(self) -> HashSet<i32> {
trigger_stop_event(self.kq);
self.handle.await.unwrap_or_default()
}
}
unsafe extern "C" {
fn proc_listchildpids(
ppid: libc::c_int,
buffer: *mut libc::c_void,
buffersize: libc::c_int,
) -> libc::c_int;
}
/// Wrap proc_listchildpids.
fn list_child_pids(parent: i32) -> Vec<i32> {
unsafe {
let mut capacity: usize = 16;
loop {
let mut buf: Vec<i32> = vec![0; capacity];
let count = proc_listchildpids(
parent as libc::c_int,
buf.as_mut_ptr() as *mut libc::c_void,
(buf.len() * std::mem::size_of::<i32>()) as libc::c_int,
);
if count <= 0 {
return Vec::new();
}
let returned = count as usize;
if returned < capacity {
buf.truncate(returned);
return buf;
}
capacity = capacity.saturating_mul(2).max(returned + 16);
}
}
}
fn pid_is_alive(pid: i32) -> bool {
if pid <= 0 {
return false;
}
let res = unsafe { libc::kill(pid as libc::pid_t, 0) };
if res == 0 {
true
} else {
matches!(
std::io::Error::last_os_error().raw_os_error(),
Some(libc::EPERM)
)
}
}
enum WatchPidError {
ProcessGone,
Other(std::io::Error),
}
/// Add `pid` to the watch list in `kq`.
fn watch_pid(kq: libc::c_int, pid: i32) -> Result<(), WatchPidError> {
if pid <= 0 {
return Err(WatchPidError::ProcessGone);
}
let kev = libc::kevent {
ident: pid as libc::uintptr_t,
filter: libc::EVFILT_PROC,
flags: libc::EV_ADD | libc::EV_CLEAR,
fflags: libc::NOTE_FORK | libc::NOTE_EXEC | libc::NOTE_EXIT,
data: 0,
udata: std::ptr::null_mut(),
};
let res = unsafe { libc::kevent(kq, &kev, 1, std::ptr::null_mut(), 0, std::ptr::null()) };
if res < 0 {
let err = std::io::Error::last_os_error();
if err.raw_os_error() == Some(libc::ESRCH) {
Err(WatchPidError::ProcessGone)
} else {
Err(WatchPidError::Other(err))
}
} else {
Ok(())
}
}
fn watch_children(
kq: libc::c_int,
parent: i32,
seen: &mut HashSet<i32>,
active: &mut HashSet<i32>,
) {
for child_pid in list_child_pids(parent) {
add_pid_watch(kq, child_pid, seen, active);
}
}
/// Watch `pid` and its children, updating `seen` and `active` sets.
fn add_pid_watch(kq: libc::c_int, pid: i32, seen: &mut HashSet<i32>, active: &mut HashSet<i32>) {
if pid <= 0 {
return;
}
let newly_seen = seen.insert(pid);
let mut should_recurse = newly_seen;
if active.insert(pid) {
match watch_pid(kq, pid) {
Ok(()) => {
should_recurse = true;
}
Err(WatchPidError::ProcessGone) => {
active.remove(&pid);
return;
}
Err(WatchPidError::Other(err)) => {
warn!("failed to watch pid {pid}: {err}");
active.remove(&pid);
return;
}
}
}
if should_recurse {
watch_children(kq, pid, seen, active);
}
}
const STOP_IDENT: libc::uintptr_t = 1;
fn register_stop_event(kq: libc::c_int) -> bool {
let kev = libc::kevent {
ident: STOP_IDENT,
filter: libc::EVFILT_USER,
flags: libc::EV_ADD | libc::EV_CLEAR,
fflags: 0,
data: 0,
udata: std::ptr::null_mut(),
};
let res = unsafe { libc::kevent(kq, &kev, 1, std::ptr::null_mut(), 0, std::ptr::null()) };
res >= 0
}
fn trigger_stop_event(kq: libc::c_int) {
if kq < 0 {
return;
}
let kev = libc::kevent {
ident: STOP_IDENT,
filter: libc::EVFILT_USER,
flags: 0,
fflags: libc::NOTE_TRIGGER,
data: 0,
udata: std::ptr::null_mut(),
};
let _ = unsafe { libc::kevent(kq, &kev, 1, std::ptr::null_mut(), 0, std::ptr::null()) };
}
/// Put all of the above together to track all the descendants of `root_pid`.
fn track_descendants(kq: libc::c_int, root_pid: i32) -> HashSet<i32> {
if kq < 0 {
let mut seen = HashSet::new();
seen.insert(root_pid);
return seen;
}
if !register_stop_event(kq) {
let mut seen = HashSet::new();
seen.insert(root_pid);
let _ = unsafe { libc::close(kq) };
return seen;
}
let mut seen: HashSet<i32> = HashSet::new();
let mut active: HashSet<i32> = HashSet::new();
add_pid_watch(kq, root_pid, &mut seen, &mut active);
const EVENTS_CAP: usize = 32;
let mut events: [libc::kevent; EVENTS_CAP] =
unsafe { std::mem::MaybeUninit::zeroed().assume_init() };
let mut stop_requested = false;
loop {
if active.is_empty() {
if !pid_is_alive(root_pid) {
break;
}
add_pid_watch(kq, root_pid, &mut seen, &mut active);
if active.is_empty() {
continue;
}
}
let nev = unsafe {
libc::kevent(
kq,
std::ptr::null::<libc::kevent>(),
0,
events.as_mut_ptr(),
EVENTS_CAP as libc::c_int,
std::ptr::null(),
)
};
if nev < 0 {
let err = std::io::Error::last_os_error();
if err.kind() == std::io::ErrorKind::Interrupted {
continue;
}
break;
}
if nev == 0 {
continue;
}
for ev in events.iter().take(nev as usize) {
let pid = ev.ident as i32;
if ev.filter == libc::EVFILT_USER && ev.ident == STOP_IDENT {
stop_requested = true;
break;
}
if (ev.flags & libc::EV_ERROR) != 0 {
if ev.data == libc::ESRCH as isize {
active.remove(&pid);
}
continue;
}
if (ev.fflags & libc::NOTE_FORK) != 0 {
watch_children(kq, pid, &mut seen, &mut active);
}
if (ev.fflags & libc::NOTE_EXIT) != 0 {
active.remove(&pid);
}
}
if stop_requested {
break;
}
}
let _ = unsafe { libc::close(kq) };
seen
}
#[cfg(test)]
mod tests {
use super::*;
use std::process::Command;
use std::process::Stdio;
use std::time::Duration;
#[test]
fn pid_is_alive_detects_current_process() {
let pid = std::process::id() as i32;
assert!(pid_is_alive(pid));
}
#[cfg(target_os = "macos")]
#[test]
fn list_child_pids_includes_spawned_child() {
let mut child = Command::new("/bin/sleep")
.arg("5")
.stdin(Stdio::null())
.spawn()
.expect("failed to spawn child process");
let child_pid = child.id() as i32;
let parent_pid = std::process::id() as i32;
let mut found = false;
for _ in 0..100 {
if list_child_pids(parent_pid).contains(&child_pid) {
found = true;
break;
}
std::thread::sleep(Duration::from_millis(10));
}
let _ = child.kill();
let _ = child.wait();
assert!(found, "expected to find child pid {child_pid} in list");
}
#[cfg(target_os = "macos")]
#[tokio::test]
async fn pid_tracker_collects_spawned_children() {
let tracker = PidTracker::new(std::process::id() as i32).expect("failed to create tracker");
let mut child = Command::new("/bin/sleep")
.arg("0.1")
.stdin(Stdio::null())
.spawn()
.expect("failed to spawn child process");
let child_pid = child.id() as i32;
let parent_pid = std::process::id() as i32;
let _ = child.wait();
let seen = tracker.stop().await;
assert!(
seen.contains(&parent_pid),
"expected tracker to include parent pid {parent_pid}"
);
assert!(
seen.contains(&child_pid),
"expected tracker to include child pid {child_pid}"
);
}
#[cfg(target_os = "macos")]
#[tokio::test]
async fn pid_tracker_collects_bash_subshell_descendants() {
let tracker = PidTracker::new(std::process::id() as i32).expect("failed to create tracker");
let child = Command::new("/bin/bash")
.arg("-c")
.arg("(sleep 0.1 & echo $!; wait)")
.stdin(Stdio::null())
.stdout(Stdio::piped())
.stderr(Stdio::null())
.spawn()
.expect("failed to spawn bash");
let output = child.wait_with_output().unwrap().stdout;
let subshell_pid = String::from_utf8_lossy(&output)
.trim()
.parse::<i32>()
.expect("failed to parse subshell pid");
let seen = tracker.stop().await;
assert!(
seen.contains(&subshell_pid),
"expected tracker to include subshell pid {subshell_pid}"
);
}
}

View File

@@ -0,0 +1,114 @@
use std::collections::HashSet;
use tokio::io::AsyncBufReadExt;
use tokio::process::Child;
use tokio::task::JoinHandle;
use super::pid_tracker::PidTracker;
pub struct SandboxDenial {
pub name: String,
pub capability: String,
}
pub struct DenialLogger {
log_stream: Child,
pid_tracker: Option<PidTracker>,
log_reader: Option<JoinHandle<Vec<u8>>>,
}
impl DenialLogger {
pub(crate) fn new() -> Option<Self> {
let mut log_stream = start_log_stream()?;
let stdout = log_stream.stdout.take()?;
let log_reader = tokio::spawn(async move {
let mut reader = tokio::io::BufReader::new(stdout);
let mut logs = Vec::new();
let mut chunk = Vec::new();
loop {
match reader.read_until(b'\n', &mut chunk).await {
Ok(0) | Err(_) => break,
Ok(_) => {
logs.extend_from_slice(&chunk);
chunk.clear();
}
}
}
logs
});
Some(Self {
log_stream,
pid_tracker: None,
log_reader: Some(log_reader),
})
}
pub(crate) fn on_child_spawn(&mut self, child: &Child) {
if let Some(root_pid) = child.id() {
self.pid_tracker = PidTracker::new(root_pid as i32);
}
}
pub(crate) async fn finish(mut self) -> Vec<SandboxDenial> {
let pid_set = match self.pid_tracker {
Some(tracker) => tracker.stop().await,
None => Default::default(),
};
if pid_set.is_empty() {
return Vec::new();
}
let _ = self.log_stream.kill().await;
let _ = self.log_stream.wait().await;
let logs_bytes = match self.log_reader.take() {
Some(handle) => handle.await.unwrap_or_default(),
None => Vec::new(),
};
let logs = String::from_utf8_lossy(&logs_bytes);
let mut seen: HashSet<(String, String)> = HashSet::new();
let mut denials: Vec<SandboxDenial> = Vec::new();
for line in logs.lines() {
if let Ok(json) = serde_json::from_str::<serde_json::Value>(line)
&& let Some(msg) = json.get("eventMessage").and_then(|v| v.as_str())
&& let Some((pid, name, capability)) = parse_message(msg)
&& pid_set.contains(&pid)
&& seen.insert((name.clone(), capability.clone()))
{
denials.push(SandboxDenial { name, capability });
}
}
denials
}
}
fn start_log_stream() -> Option<Child> {
use std::process::Stdio;
const PREDICATE: &str = r#"(((processID == 0) AND (senderImagePath CONTAINS "/Sandbox")) OR (subsystem == "com.apple.sandbox.reporting"))"#;
tokio::process::Command::new("log")
.args(["stream", "--style", "ndjson", "--predicate", PREDICATE])
.stdin(Stdio::null())
.stdout(Stdio::piped())
.stderr(Stdio::null())
.kill_on_drop(true)
.spawn()
.ok()
}
fn parse_message(msg: &str) -> Option<(i32, String, String)> {
// Example message:
// Sandbox: processname(1234) deny(1) capability-name args...
static RE: std::sync::OnceLock<regex_lite::Regex> = std::sync::OnceLock::new();
let re = RE.get_or_init(|| {
#[expect(clippy::unwrap_used)]
regex_lite::Regex::new(r"^Sandbox:\s*(.+?)\((\d+)\)\s+deny\(.*?\)\s*(.+)$").unwrap()
});
let (_, [name, pid_str, capability]) = re.captures(msg)?.extract();
let pid = pid_str.trim().parse::<i32>().ok()?;
Some((pid, name.to_string(), capability.to_string()))
}

View File

@@ -11,6 +11,10 @@ pub struct SeatbeltCommand {
#[arg(long = "full-auto", default_value_t = false)]
pub full_auto: bool,
/// While the command runs, capture macOS sandbox denials via `log stream` and print them after exit
#[arg(long = "log-denials", default_value_t = false)]
pub log_denials: bool,
#[clap(skip)]
pub config_overrides: CliConfigOverrides,

View File

@@ -1,3 +1,4 @@
use clap::Args;
use clap::CommandFactory;
use clap::Parser;
use clap_complete::Shell;
@@ -20,14 +21,17 @@ use codex_exec::Cli as ExecCli;
use codex_responses_api_proxy::Args as ResponsesApiProxyArgs;
use codex_tui::AppExitInfo;
use codex_tui::Cli as TuiCli;
use codex_tui::updates::UpdateAction;
use codex_tui::update_action::UpdateAction;
use owo_colors::OwoColorize;
use std::path::PathBuf;
use supports_color::Stream;
mod mcp_cmd;
#[cfg(not(windows))]
mod wsl_paths;
use crate::mcp_cmd::McpCli;
use codex_core::config::Config;
use codex_core::config::ConfigOverrides;
use codex_core::features::is_known_feature_key;
@@ -79,8 +83,8 @@ enum Subcommand {
/// [experimental] Run the Codex MCP server (stdio transport).
McpServer,
/// [experimental] Run the app server.
AppServer,
/// [experimental] Run the app server or related tooling.
AppServer(AppServerCommand),
/// Generate shell completion scripts.
Completion(CompletionCommand),
@@ -96,9 +100,6 @@ enum Subcommand {
/// Resume a previous interactive session (picker by default; use --last to continue the most recent).
Resume(ResumeCommand),
/// Internal: generate TypeScript protocol bindings.
#[clap(hide = true)]
GenerateTs(GenerateTsCommand),
/// [EXPERIMENTAL] Browse tasks from Codex Cloud and apply changes locally.
#[clap(name = "cloud", alias = "cloud-tasks")]
Cloud(CloudTasksCli),
@@ -205,6 +206,22 @@ struct LogoutCommand {
}
#[derive(Debug, Parser)]
struct AppServerCommand {
/// Omit to run the app server; specify a subcommand for tooling.
#[command(subcommand)]
subcommand: Option<AppServerSubcommand>,
}
#[derive(Debug, clap::Subcommand)]
enum AppServerSubcommand {
/// [experimental] Generate TypeScript bindings for the app server protocol.
GenerateTs(GenerateTsCommand),
/// [experimental] Generate JSON Schema for the app server protocol.
GenerateJsonSchema(GenerateJsonSchemaCommand),
}
#[derive(Debug, Args)]
struct GenerateTsCommand {
/// Output directory where .ts files will be written
#[arg(short = 'o', long = "out", value_name = "DIR")]
@@ -215,6 +232,13 @@ struct GenerateTsCommand {
prettier: Option<PathBuf>,
}
#[derive(Debug, Args)]
struct GenerateJsonSchemaCommand {
/// Output directory where the schema bundle will be written
#[arg(short = 'o', long = "out", value_name = "DIR")]
out_dir: PathBuf,
}
#[derive(Debug, Parser)]
struct StdioToUdsCommand {
/// Path to the Unix domain socket to connect to.
@@ -267,10 +291,30 @@ fn handle_app_exit(exit_info: AppExitInfo) -> anyhow::Result<()> {
/// Run the update action and print the result.
fn run_update_action(action: UpdateAction) -> anyhow::Result<()> {
println!();
let (cmd, args) = action.command_args();
let cmd_str = action.command_str();
println!("Updating Codex via `{cmd_str}`...");
let status = std::process::Command::new(cmd).args(args).status()?;
let status = {
#[cfg(windows)]
{
// On Windows, run via cmd.exe so .CMD/.BAT are correctly resolved (PATHEXT semantics).
std::process::Command::new("cmd")
.args(["/C", &cmd_str])
.status()?
}
#[cfg(not(windows))]
{
let (cmd, args) = action.command_args();
let command_path = crate::wsl_paths::normalize_for_wsl(cmd);
let normalized_args: Vec<String> = args
.iter()
.map(crate::wsl_paths::normalize_for_wsl)
.collect();
std::process::Command::new(&command_path)
.args(&normalized_args)
.status()?
}
};
if !status.success() {
anyhow::bail!("`{cmd_str}` failed with status {status}");
}
@@ -387,9 +431,20 @@ async fn cli_main(codex_linux_sandbox_exe: Option<PathBuf>) -> anyhow::Result<()
prepend_config_flags(&mut mcp_cli.config_overrides, root_config_overrides.clone());
mcp_cli.run().await?;
}
Some(Subcommand::AppServer) => {
codex_app_server::run_main(codex_linux_sandbox_exe, root_config_overrides).await?;
}
Some(Subcommand::AppServer(app_server_cli)) => match app_server_cli.subcommand {
None => {
codex_app_server::run_main(codex_linux_sandbox_exe, root_config_overrides).await?;
}
Some(AppServerSubcommand::GenerateTs(gen_cli)) => {
codex_app_server_protocol::generate_ts(
&gen_cli.out_dir,
gen_cli.prettier.as_deref(),
)?;
}
Some(AppServerSubcommand::GenerateJsonSchema(gen_cli)) => {
codex_app_server_protocol::generate_json(&gen_cli.out_dir)?;
}
},
Some(Subcommand::Resume(ResumeCommand {
session_id,
last,
@@ -504,9 +559,6 @@ async fn cli_main(codex_linux_sandbox_exe: Option<PathBuf>) -> anyhow::Result<()
tokio::task::spawn_blocking(move || codex_stdio_to_uds::run(socket_path.as_path()))
.await??;
}
Some(Subcommand::GenerateTs(gen_cli)) => {
codex_protocol_ts::generate_ts(&gen_cli.out_dir, gen_cli.prettier.as_deref())?;
}
Some(Subcommand::Features(FeaturesCli { sub })) => match sub {
FeaturesSubcommand::List => {
// Respect root-level `-c` overrides plus top-level flags like `--profile`.

View File

@@ -0,0 +1,76 @@
use std::ffi::OsStr;
/// WSL-specific path helpers used by the updater logic.
///
/// See https://github.com/openai/codex/issues/6086.
pub fn is_wsl() -> bool {
#[cfg(target_os = "linux")]
{
if std::env::var_os("WSL_DISTRO_NAME").is_some() {
return true;
}
match std::fs::read_to_string("/proc/version") {
Ok(version) => version.to_lowercase().contains("microsoft"),
Err(_) => false,
}
}
#[cfg(not(target_os = "linux"))]
{
false
}
}
/// Convert a Windows absolute path (`C:\foo\bar` or `C:/foo/bar`) to a WSL mount path (`/mnt/c/foo/bar`).
/// Returns `None` if the input does not look like a Windows drive path.
pub fn win_path_to_wsl(path: &str) -> Option<String> {
let bytes = path.as_bytes();
if bytes.len() < 3
|| bytes[1] != b':'
|| !(bytes[2] == b'\\' || bytes[2] == b'/')
|| !bytes[0].is_ascii_alphabetic()
{
return None;
}
let drive = (bytes[0] as char).to_ascii_lowercase();
let tail = path[3..].replace('\\', "/");
if tail.is_empty() {
return Some(format!("/mnt/{drive}"));
}
Some(format!("/mnt/{drive}/{tail}"))
}
/// If under WSL and given a Windows-style path, return the equivalent `/mnt/<drive>/…` path.
/// Otherwise returns the input unchanged.
pub fn normalize_for_wsl<P: AsRef<OsStr>>(path: P) -> String {
let value = path.as_ref().to_string_lossy().to_string();
if !is_wsl() {
return value;
}
if let Some(mapped) = win_path_to_wsl(&value) {
return mapped;
}
value
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn win_to_wsl_basic() {
assert_eq!(
win_path_to_wsl(r"C:\Temp\codex.zip").as_deref(),
Some("/mnt/c/Temp/codex.zip")
);
assert_eq!(
win_path_to_wsl("D:/Work/codex.tgz").as_deref(),
Some("/mnt/d/Work/codex.tgz")
);
assert!(win_path_to_wsl("/home/user/codex").is_none());
}
#[test]
fn normalize_is_noop_on_unix_paths() {
assert_eq!(normalize_for_wsl("/home/u/x"), "/home/u/x");
}
}

View File

@@ -8,6 +8,7 @@ pub mod util;
pub use cli::Cli;
use anyhow::anyhow;
use codex_login::AuthManager;
use std::io::IsTerminal;
use std::io::Read;
use std::path::PathBuf;
@@ -56,20 +57,8 @@ async fn init_backend(user_agent_suffix: &str) -> anyhow::Result<BackendContext>
};
append_error_log(format!("startup: base_url={base_url} path_style={style}"));
let auth = match codex_core::config::find_codex_home()
.ok()
.map(|home| {
let store_mode = codex_core::config::Config::load_from_base_config_with_overrides(
codex_core::config::ConfigToml::default(),
codex_core::config::ConfigOverrides::default(),
home.clone(),
)
.map(|cfg| cfg.cli_auth_credentials_store_mode)
.unwrap_or_default();
codex_login::AuthManager::new(home, false, store_mode)
})
.and_then(|am| am.auth())
{
let auth_manager = util::load_auth_manager().await;
let auth = match auth_manager.as_ref().and_then(AuthManager::auth) {
Some(auth) => auth,
None => {
eprintln!(

View File

@@ -2,6 +2,10 @@ use base64::Engine as _;
use chrono::Utc;
use reqwest::header::HeaderMap;
use codex_core::config::Config;
use codex_core::config::ConfigOverrides;
use codex_login::AuthManager;
pub fn set_user_agent_suffix(suffix: &str) {
if let Ok(mut guard) = codex_core::default_client::USER_AGENT_SUFFIX.lock() {
guard.replace(suffix.to_string());
@@ -54,6 +58,18 @@ pub fn extract_chatgpt_account_id(token: &str) -> Option<String> {
.map(str::to_string)
}
pub async fn load_auth_manager() -> Option<AuthManager> {
// TODO: pass in cli overrides once cloud tasks properly support them.
let config = Config::load_with_cli_overrides(Vec::new(), ConfigOverrides::default())
.await
.ok()?;
Some(AuthManager::new(
config.codex_home,
false,
config.cli_auth_credentials_store_mode,
))
}
/// Build headers for ChatGPT-backed requests: `User-Agent`, optional `Authorization`,
/// and optional `ChatGPT-Account-Id`.
pub async fn build_chatgpt_headers() -> HeaderMap {
@@ -69,31 +85,22 @@ pub async fn build_chatgpt_headers() -> HeaderMap {
USER_AGENT,
HeaderValue::from_str(&ua).unwrap_or(HeaderValue::from_static("codex-cli")),
);
if let Ok(home) = codex_core::config::find_codex_home() {
let store_mode = codex_core::config::Config::load_from_base_config_with_overrides(
codex_core::config::ConfigToml::default(),
codex_core::config::ConfigOverrides::default(),
home.clone(),
)
.map(|cfg| cfg.cli_auth_credentials_store_mode)
.unwrap_or_default();
let am = codex_login::AuthManager::new(home, false, store_mode);
if let Some(auth) = am.auth()
&& let Ok(tok) = auth.get_token().await
&& !tok.is_empty()
if let Some(am) = load_auth_manager().await
&& let Some(auth) = am.auth()
&& let Ok(tok) = auth.get_token().await
&& !tok.is_empty()
{
let v = format!("Bearer {tok}");
if let Ok(hv) = HeaderValue::from_str(&v) {
headers.insert(AUTHORIZATION, hv);
}
if let Some(acc) = auth
.get_account_id()
.or_else(|| extract_chatgpt_account_id(&tok))
&& let Ok(name) = HeaderName::from_bytes(b"ChatGPT-Account-Id")
&& let Ok(hv) = HeaderValue::from_str(&acc)
{
let v = format!("Bearer {tok}");
if let Ok(hv) = HeaderValue::from_str(&v) {
headers.insert(AUTHORIZATION, hv);
}
if let Some(acc) = auth
.get_account_id()
.or_else(|| extract_chatgpt_account_id(&tok))
&& let Ok(name) = HeaderName::from_bytes(b"ChatGPT-Account-Id")
&& let Ok(hv) = HeaderValue::from_str(&acc)
{
headers.insert(name, hv);
}
headers.insert(name, hv);
}
}
headers

View File

@@ -19,8 +19,8 @@ use toml::Value;
pub struct CliConfigOverrides {
/// Override a configuration value that would otherwise be loaded from
/// `~/.codex/config.toml`. Use a dotted path (`foo.bar.baz`) to override
/// nested values. The `value` portion is parsed as JSON. If it fails to
/// parse as JSON, the raw string is used as a literal.
/// nested values. The `value` portion is parsed as TOML. If it fails to
/// parse as TOML, the raw string is used as a literal.
///
/// Examples:
/// - `-c model="o3"`
@@ -59,7 +59,7 @@ impl CliConfigOverrides {
return Err(format!("Empty key in override: {s}"));
}
// Attempt to parse as JSON. If that fails, treat it as a raw
// Attempt to parse as TOML. If that fails, treat it as a raw
// string. This allows convenient usage such as
// `-c model=o3` without the quotes.
let value: Value = match parse_toml_value(value_str) {
@@ -151,6 +151,15 @@ mod tests {
assert_eq!(v.as_integer(), Some(42));
}
#[test]
fn parses_bool() {
let true_literal = parse_toml_value("true").expect("parse");
assert_eq!(true_literal.as_bool(), Some(true));
let false_literal = parse_toml_value("false").expect("parse");
assert_eq!(false_literal.as_bool(), Some(false));
}
#[test]
fn fails_on_unquoted_string() {
assert!(parse_toml_value("hello").is_err());

View File

@@ -32,6 +32,7 @@ codex-utils-pty = { workspace = true }
codex-utils-readiness = { workspace = true }
codex-utils-string = { workspace = true }
codex-utils-tokenizer = { workspace = true }
codex-windows-sandbox = { package = "codex-windows-sandbox", path = "../windows-sandbox-rs" }
dirs = { workspace = true }
dunce = { workspace = true }
env-flags = { workspace = true }
@@ -83,7 +84,6 @@ tree-sitter-bash = { workspace = true }
uuid = { workspace = true, features = ["serde", "v4", "v5"] }
which = { workspace = true }
wildmatch = { workspace = true }
codex_windows_sandbox = { package = "codex-windows-sandbox", path = "../windows-sandbox-rs" }
[target.'cfg(target_os = "linux")'.dependencies]
@@ -104,7 +104,9 @@ openssl-sys = { workspace = true, features = ["vendored"] }
[dev-dependencies]
assert_cmd = { workspace = true }
assert_matches = { workspace = true }
codex-arg0 = { workspace = true }
core_test_support = { workspace = true }
ctor = { workspace = true }
escargot = { workspace = true }
image = { workspace = true, features = ["jpeg", "png"] }
maplit = { workspace = true }

View File

@@ -16,6 +16,7 @@ You are Codex, based on GPT-5. You are running as a coding agent in the Codex CL
* If asked to make a commit or code edits and there are unrelated changes to your work or changes that you didn't make in those files, don't revert those changes.
* If the changes are in files you've touched recently, you should read carefully and understand how you can work with the changes rather than reverting them.
* If the changes are in unrelated files, just ignore them and don't revert them.
- Do not amend a commit unless explicitly requested to do so.
- While you are working, you might notice unexpected changes that you didn't make. If this happens, STOP IMMEDIATELY and ask the user how they would like to proceed.
- **NEVER** use destructive commands like `git reset --hard` or `git checkout --` unless specifically requested or approved by the user.

View File

@@ -26,10 +26,12 @@ use crate::config::Config;
use crate::default_client::CodexHttpClient;
use crate::error::RefreshTokenFailedError;
use crate::error::RefreshTokenFailedReason;
use crate::token_data::PlanType;
use crate::token_data::KnownPlan as InternalKnownPlan;
use crate::token_data::PlanType as InternalPlanType;
use crate::token_data::TokenData;
use crate::token_data::parse_id_token;
use crate::util::try_parse_error_message;
use codex_protocol::account::PlanType as AccountPlanType;
use serde_json::Value;
use thiserror::Error;
@@ -202,7 +204,34 @@ impl CodexAuth {
self.get_current_token_data().and_then(|t| t.id_token.email)
}
pub(crate) fn get_plan_type(&self) -> Option<PlanType> {
/// Account-facing plan classification derived from the current token.
/// Returns a high-level `AccountPlanType` (e.g., Free/Plus/Pro/Team/…)
/// mapped from the ID token's internal plan value. Prefer this when you
/// need to make UI or product decisions based on the user's subscription.
pub fn account_plan_type(&self) -> Option<AccountPlanType> {
let map_known = |kp: &InternalKnownPlan| match kp {
InternalKnownPlan::Free => AccountPlanType::Free,
InternalKnownPlan::Plus => AccountPlanType::Plus,
InternalKnownPlan::Pro => AccountPlanType::Pro,
InternalKnownPlan::Team => AccountPlanType::Team,
InternalKnownPlan::Business => AccountPlanType::Business,
InternalKnownPlan::Enterprise => AccountPlanType::Enterprise,
InternalKnownPlan::Edu => AccountPlanType::Edu,
};
self.get_current_token_data()
.and_then(|t| t.id_token.chatgpt_plan_type)
.map(|pt| match pt {
InternalPlanType::Known(k) => map_known(&k),
InternalPlanType::Unknown(_) => AccountPlanType::Unknown,
})
}
/// Raw internal plan value from the ID token.
/// Exposes the underlying `token_data::PlanType` without mapping it to the
/// public `AccountPlanType`. Use this when downstream code needs to inspect
/// internal/unknown plan strings exactly as issued in the token.
pub(crate) fn get_plan_type(&self) -> Option<InternalPlanType> {
self.get_current_token_data()
.and_then(|t| t.id_token.chatgpt_plan_type)
}
@@ -609,8 +638,9 @@ mod tests {
use crate::config::ConfigOverrides;
use crate::config::ConfigToml;
use crate::token_data::IdTokenInfo;
use crate::token_data::KnownPlan;
use crate::token_data::PlanType;
use crate::token_data::KnownPlan as InternalKnownPlan;
use crate::token_data::PlanType as InternalPlanType;
use codex_protocol::account::PlanType as AccountPlanType;
use base64::Engine;
use codex_protocol::config_types::ForcedLoginMethod;
@@ -727,7 +757,7 @@ mod tests {
tokens: Some(TokenData {
id_token: IdTokenInfo {
email: Some("user@example.com".to_string()),
chatgpt_plan_type: Some(PlanType::Known(KnownPlan::Pro)),
chatgpt_plan_type: Some(InternalPlanType::Known(InternalKnownPlan::Pro)),
chatgpt_account_id: None,
raw_jwt: fake_jwt,
},
@@ -981,6 +1011,54 @@ mod tests {
.contains("ChatGPT login is required, but an API key is currently being used.")
);
}
#[test]
fn plan_type_maps_known_plan() {
let codex_home = tempdir().unwrap();
let _jwt = write_auth_file(
AuthFileParams {
openai_api_key: None,
chatgpt_plan_type: "pro".to_string(),
chatgpt_account_id: None,
},
codex_home.path(),
)
.expect("failed to write auth file");
let auth = super::load_auth(codex_home.path(), false, AuthCredentialsStoreMode::File)
.expect("load auth")
.expect("auth available");
pretty_assertions::assert_eq!(auth.account_plan_type(), Some(AccountPlanType::Pro));
pretty_assertions::assert_eq!(
auth.get_plan_type(),
Some(InternalPlanType::Known(InternalKnownPlan::Pro))
);
}
#[test]
fn plan_type_maps_unknown_to_unknown() {
let codex_home = tempdir().unwrap();
let _jwt = write_auth_file(
AuthFileParams {
openai_api_key: None,
chatgpt_plan_type: "mystery-tier".to_string(),
chatgpt_account_id: None,
},
codex_home.path(),
)
.expect("failed to write auth file");
let auth = super::load_auth(codex_home.path(), false, AuthCredentialsStoreMode::File)
.expect("load auth")
.expect("auth available");
pretty_assertions::assert_eq!(auth.account_plan_type(), Some(AccountPlanType::Unknown));
pretty_assertions::assert_eq!(
auth.get_plan_type(),
Some(InternalPlanType::Unknown("mystery-tier".to_string()))
);
}
}
/// Central manager providing a single source of truth for auth.json derived

View File

@@ -88,17 +88,33 @@ pub fn try_parse_word_only_commands_sequence(tree: &Tree, src: &str) -> Option<V
Some(commands)
}
pub fn is_well_known_sh_shell(shell: &str) -> bool {
if shell == "bash" || shell == "zsh" {
return true;
}
let shell_name = std::path::Path::new(shell)
.file_name()
.and_then(|s| s.to_str())
.unwrap_or(shell);
matches!(shell_name, "bash" | "zsh")
}
pub fn extract_bash_command(command: &[String]) -> Option<(&str, &str)> {
let [shell, flag, script] = command else {
return None;
};
if !matches!(flag.as_str(), "-lc" | "-c") || !is_well_known_sh_shell(shell) {
return None;
}
Some((shell, script))
}
/// Returns the sequence of plain commands within a `bash -lc "..."` or
/// `zsh -lc "..."` invocation when the script only contains word-only commands
/// joined by safe operators.
pub fn parse_shell_lc_plain_commands(command: &[String]) -> Option<Vec<Vec<String>>> {
let [shell, flag, script] = command else {
return None;
};
if flag != "-lc" || !(shell == "bash" || shell == "zsh") {
return None;
}
let (_, script) = extract_bash_command(command)?;
let tree = try_parse_shell(script)?;
try_parse_word_only_commands_sequence(&tree, script)

View File

@@ -447,6 +447,8 @@ impl ModelClient {
return Err(StreamAttemptError::Fatal(codex_err));
} else if error.r#type.as_deref() == Some("usage_not_included") {
return Err(StreamAttemptError::Fatal(CodexErr::UsageNotIncluded));
} else if is_quota_exceeded_error(&error) {
return Err(StreamAttemptError::Fatal(CodexErr::QuotaExceeded));
}
}
}
@@ -844,6 +846,8 @@ async fn process_sse<S>(
Ok(error) => {
if is_context_window_error(&error) {
response_error = Some(CodexErr::ContextWindowExceeded);
} else if is_quota_exceeded_error(&error) {
response_error = Some(CodexErr::QuotaExceeded);
} else {
let delay = try_parse_retry_after(&error);
let message = error.message.clone().unwrap_or_default();
@@ -975,6 +979,10 @@ fn is_context_window_error(error: &Error) -> bool {
error.code.as_deref() == Some("context_length_exceeded")
}
fn is_quota_exceeded_error(error: &Error) -> bool {
error.code.as_deref() == Some("insufficient_quota")
}
#[cfg(test)]
mod tests {
use super::*;
@@ -1307,6 +1315,41 @@ mod tests {
}
}
#[tokio::test]
async fn quota_exceeded_error_is_fatal() {
let raw_error = r#"{"type":"response.failed","sequence_number":3,"response":{"id":"resp_fatal_quota","object":"response","created_at":1759771626,"status":"failed","background":false,"error":{"code":"insufficient_quota","message":"You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors."},"incomplete_details":null}}"#;
let sse1 = format!("event: response.failed\ndata: {raw_error}\n\n");
let provider = ModelProviderInfo {
name: "test".to_string(),
base_url: Some("https://test.com".to_string()),
env_key: Some("TEST_API_KEY".to_string()),
env_key_instructions: None,
experimental_bearer_token: None,
wire_api: WireApi::Responses,
query_params: None,
http_headers: None,
env_http_headers: None,
request_max_retries: Some(0),
stream_max_retries: Some(0),
stream_idle_timeout_ms: Some(1000),
requires_openai_auth: false,
};
let otel_event_manager = otel_event_manager();
let events = collect_events(&[sse1.as_bytes()], provider, otel_event_manager).await;
assert_eq!(events.len(), 1);
match &events[0] {
Err(err @ CodexErr::QuotaExceeded) => {
assert_eq!(err.to_string(), CodexErr::QuotaExceeded.to_string());
}
other => panic!("unexpected quota exceeded event: {other:?}"),
}
}
// ────────────────────────────
// Table-driven test from `main`
// ────────────────────────────

View File

@@ -6,6 +6,7 @@ use std::sync::atomic::AtomicU64;
use crate::AuthManager;
use crate::client_common::REVIEW_PROMPT;
use crate::compact;
use crate::features::Feature;
use crate::function_tool::FunctionCallError;
use crate::mcp::auth::McpAuthStatusEntry;
@@ -66,6 +67,8 @@ use crate::error::Result as CodexResult;
use crate::exec::StreamOutput;
// Removed: legacy executor wiring replaced by ToolOrchestrator flows.
// legacy normalize_exec_result no longer used after orchestrator migration
use crate::compact::build_compacted_history;
use crate::compact::collect_user_messages;
use crate::mcp::auth::compute_auth_statuses;
use crate::mcp_connection_manager::McpConnectionManager;
use crate::model_family::find_family_for_model;
@@ -94,6 +97,7 @@ use crate::protocol::Submission;
use crate::protocol::TokenCountEvent;
use crate::protocol::TokenUsage;
use crate::protocol::TurnDiffEvent;
use crate::protocol::WarningEvent;
use crate::rollout::RolloutRecorder;
use crate::rollout::RolloutRecorderParams;
use crate::shell;
@@ -129,10 +133,6 @@ use codex_protocol::user_input::UserInput;
use codex_utils_readiness::Readiness;
use codex_utils_readiness::ReadinessFlag;
pub mod compact;
use self::compact::build_compacted_history;
use self::compact::collect_user_messages;
/// The high-level interface to the Codex system.
/// It operates as a queue pair where you send submissions and receive events.
pub struct Codex {
@@ -675,6 +675,34 @@ impl Session {
let rollout_items = conversation_history.get_rollout_items();
let persist = matches!(conversation_history, InitialHistory::Forked(_));
// If resuming, warn when the last recorded model differs from the current one.
if let InitialHistory::Resumed(_) = conversation_history
&& let Some(prev) = rollout_items.iter().rev().find_map(|it| {
if let RolloutItem::TurnContext(ctx) = it {
Some(ctx.model.as_str())
} else {
None
}
})
{
let curr = turn_context.client.get_model();
if prev != curr {
warn!(
"resuming session with different model: previous={prev}, current={curr}"
);
self.send_event(
&turn_context,
EventMsg::Warning(WarningEvent {
message: format!(
"This session was recorded with model `{prev}` but is resuming with `{curr}`. \
Consider switching back to `{prev}` as it may affect Codex performance."
),
}),
)
.await;
}
}
// Always add response items to conversation history
let reconstructed_history =
self.reconstruct_history_from_rollout(&turn_context, &rollout_items);
@@ -968,7 +996,7 @@ impl Session {
}
/// Append ResponseItems to the in-memory conversation history only.
async fn record_into_history(&self, items: &[ResponseItem]) {
pub(crate) async fn record_into_history(&self, items: &[ResponseItem]) {
let mut state = self.state.lock().await;
state.record_items(items.iter());
}
@@ -1020,7 +1048,7 @@ impl Session {
items
}
async fn persist_rollout_items(&self, items: &[RolloutItem]) {
pub(crate) async fn persist_rollout_items(&self, items: &[RolloutItem]) {
let recorder = {
let guard = self.services.rollout.lock().await;
guard.clone()
@@ -1037,7 +1065,7 @@ impl Session {
state.clone_history()
}
async fn update_token_usage_info(
pub(crate) async fn update_token_usage_info(
&self,
turn_context: &TurnContext,
token_usage: Option<&TokenUsage>,
@@ -1054,7 +1082,7 @@ impl Session {
self.send_token_count_event(turn_context).await;
}
async fn update_rate_limits(
pub(crate) async fn update_rate_limits(
&self,
turn_context: &TurnContext,
new_rate_limits: RateLimitSnapshot,
@@ -1075,7 +1103,7 @@ impl Session {
self.send_event(turn_context, event).await;
}
async fn set_total_tokens_full(&self, turn_context: &TurnContext) {
pub(crate) async fn set_total_tokens_full(&self, turn_context: &TurnContext) {
let context_window = turn_context.client.get_model_context_window();
if let Some(context_window) = context_window {
{
@@ -1118,7 +1146,11 @@ impl Session {
self.send_event(turn_context, event).await;
}
async fn notify_stream_error(&self, turn_context: &TurnContext, message: impl Into<String>) {
pub(crate) async fn notify_stream_error(
&self,
turn_context: &TurnContext,
message: impl Into<String>,
) {
let event = EventMsg::StreamError(StreamErrorEvent {
message: message.into(),
});
@@ -1643,8 +1675,7 @@ async fn spawn_review_thread(
let mut review_features = config.features.clone();
review_features
.disable(crate::features::Feature::WebSearchRequest)
.disable(crate::features::Feature::ViewImageTool)
.disable(crate::features::Feature::StreamableShell);
.disable(crate::features::Feature::ViewImageTool);
let tools_config = ToolsConfig::new(&ToolsConfigParams {
model_family: &review_model_family,
features: &review_features,
@@ -1928,6 +1959,7 @@ async fn run_turn(
return Err(CodexErr::UsageLimitReached(e));
}
Err(CodexErr::UsageNotIncluded) => return Err(CodexErr::UsageNotIncluded),
Err(e @ CodexErr::QuotaExceeded) => return Err(e),
Err(e @ CodexErr::RefreshTokenFailed(_)) => return Err(e),
Err(e) => {
// Use the configured provider-specific stream retry budget.
@@ -2320,6 +2352,7 @@ mod tests {
use crate::tools::context::ToolOutput;
use crate::tools::context::ToolPayload;
use crate::tools::handlers::ShellHandler;
use crate::tools::handlers::UnifiedExecHandler;
use crate::tools::registry::ToolHandler;
use crate::turn_diff_tracker::TurnDiffTracker;
use codex_app_server_protocol::AuthMode;
@@ -3059,6 +3092,48 @@ mod tests {
assert!(exec_output.output.contains("hi"));
}
#[tokio::test]
async fn unified_exec_rejects_escalated_permissions_when_policy_not_on_request() {
use crate::protocol::AskForApproval;
use crate::turn_diff_tracker::TurnDiffTracker;
let (session, mut turn_context_raw) = make_session_and_context();
turn_context_raw.approval_policy = AskForApproval::OnFailure;
let session = Arc::new(session);
let turn_context = Arc::new(turn_context_raw);
let tracker = Arc::new(tokio::sync::Mutex::new(TurnDiffTracker::new()));
let handler = UnifiedExecHandler;
let resp = handler
.handle(ToolInvocation {
session: Arc::clone(&session),
turn: Arc::clone(&turn_context),
tracker: Arc::clone(&tracker),
call_id: "exec-call".to_string(),
tool_name: "exec_command".to_string(),
payload: ToolPayload::Function {
arguments: serde_json::json!({
"cmd": "echo hi",
"with_escalated_permissions": true,
"justification": "need unsandboxed execution",
})
.to_string(),
},
})
.await;
let Err(FunctionCallError::RespondToModel(output)) = resp else {
panic!("expected error result");
};
let expected = format!(
"approval policy is {policy:?}; reject command — you cannot ask for escalated permissions if the approval policy is {policy:?}",
policy = turn_context.approval_policy
);
pretty_assertions::assert_eq!(output, expected);
}
#[test]
fn mcp_init_error_display_prompts_for_github_pat() {
let server_name = "github";

View File

@@ -1,4 +1,38 @@
use codex_protocol::protocol::AskForApproval;
use codex_protocol::protocol::SandboxPolicy;
use crate::bash::parse_shell_lc_plain_commands;
use crate::is_safe_command::is_known_safe_command;
pub fn requires_initial_appoval(
policy: AskForApproval,
sandbox_policy: &SandboxPolicy,
command: &[String],
with_escalated_permissions: bool,
) -> bool {
if is_known_safe_command(command) {
return false;
}
match policy {
AskForApproval::Never | AskForApproval::OnFailure => false,
AskForApproval::OnRequest => {
// In DangerFullAccess, only prompt if the command looks dangerous.
if matches!(sandbox_policy, SandboxPolicy::DangerFullAccess) {
return command_might_be_dangerous(command);
}
// In restricted sandboxes (ReadOnly/WorkspaceWrite), do not prompt for
// nonescalated, nondangerous commands — let the sandbox enforce
// restrictions (e.g., block network/write) without a user prompt.
let wants_escalation: bool = with_escalated_permissions;
if wants_escalation {
return true;
}
command_might_be_dangerous(command)
}
AskForApproval::UnlessTrusted => !is_known_safe_command(command),
}
}
pub fn command_might_be_dangerous(command: &[String]) -> bool {
if is_dangerous_to_call_with_exec(command) {

View File

@@ -1,10 +1,10 @@
use std::sync::Arc;
use super::Session;
use super::TurnContext;
use super::get_last_assistant_message_from_turn;
use crate::Prompt;
use crate::client_common::ResponseEvent;
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::codex::get_last_assistant_message_from_turn;
use crate::error::CodexErr;
use crate::error::Result as CodexResult;
use crate::protocol::AgentMessageEvent;
@@ -25,7 +25,7 @@ use codex_protocol::user_input::UserInput;
use futures::prelude::*;
use tracing::error;
pub const SUMMARIZATION_PROMPT: &str = include_str!("../../templates/compact/prompt.md");
pub const SUMMARIZATION_PROMPT: &str = include_str!("../templates/compact/prompt.md");
const COMPACT_USER_MESSAGE_MAX_TOKENS: usize = 20_000;
pub(crate) async fn run_inline_auto_compact_task(
@@ -164,7 +164,7 @@ async fn run_compact_task_inner(
sess.send_event(&turn_context, event).await;
let warning = EventMsg::Warning(WarningEvent {
message: "Heads up: Long conversations and multiple compactions can cause the model to be less accurate. Start new a new conversation when possible to keep conversations small and targeted.".to_string(),
message: "Heads up: Long conversations and multiple compactions can cause the model to be less accurate. Start a new conversation when possible to keep conversations small and targeted.".to_string(),
});
sess.send_event(&turn_context, warning).await;
}

View File

@@ -23,6 +23,10 @@ pub enum ConfigEdit {
},
/// Toggle the acknowledgement flag under `[notice]`.
SetNoticeHideFullAccessWarning(bool),
/// Toggle the Windows world-writable directories warning acknowledgement flag.
SetNoticeHideWorldWritableWarning(bool),
/// Toggle the rate limit model nudge acknowledgement flag.
SetNoticeHideRateLimitModelNudge(bool),
/// Toggle the Windows onboarding acknowledgement flag.
SetWindowsWslSetupAcknowledged(bool),
/// Replace the entire `[mcp_servers]` table.
@@ -239,6 +243,16 @@ impl ConfigDocument {
&[Notice::TABLE_KEY, "hide_full_access_warning"],
value(*acknowledged),
)),
ConfigEdit::SetNoticeHideWorldWritableWarning(acknowledged) => Ok(self.write_value(
Scope::Global,
&[Notice::TABLE_KEY, "hide_world_writable_warning"],
value(*acknowledged),
)),
ConfigEdit::SetNoticeHideRateLimitModelNudge(acknowledged) => Ok(self.write_value(
Scope::Global,
&[Notice::TABLE_KEY, "hide_rate_limit_model_nudge"],
value(*acknowledged),
)),
ConfigEdit::SetWindowsWslSetupAcknowledged(acknowledged) => Ok(self.write_value(
Scope::Global,
&["windows_wsl_setup_acknowledged"],
@@ -473,6 +487,18 @@ impl ConfigEditsBuilder {
self
}
pub fn set_hide_world_writable_warning(mut self, acknowledged: bool) -> Self {
self.edits
.push(ConfigEdit::SetNoticeHideWorldWritableWarning(acknowledged));
self
}
pub fn set_hide_rate_limit_model_nudge(mut self, acknowledged: bool) -> Self {
self.edits
.push(ConfigEdit::SetNoticeHideRateLimitModelNudge(acknowledged));
self
}
pub fn set_windows_wsl_setup_acknowledged(mut self, acknowledged: bool) -> Self {
self.edits
.push(ConfigEdit::SetWindowsWslSetupAcknowledged(acknowledged));
@@ -720,6 +746,34 @@ hide_full_access_warning = true
assert_eq!(contents, expected);
}
#[test]
fn blocking_set_hide_rate_limit_model_nudge_preserves_table() {
let tmp = tempdir().expect("tmpdir");
let codex_home = tmp.path();
std::fs::write(
codex_home.join(CONFIG_TOML_FILE),
r#"[notice]
existing = "value"
"#,
)
.expect("seed");
apply_blocking(
codex_home,
None,
&[ConfigEdit::SetNoticeHideRateLimitModelNudge(true)],
)
.expect("persist");
let contents =
std::fs::read_to_string(codex_home.join(CONFIG_TOML_FILE)).expect("read config");
let expected = r#"[notice]
existing = "value"
hide_rate_limit_model_nudge = true
"#;
assert_eq!(contents, expected);
}
#[test]
fn blocking_replace_mcp_servers_round_trips() {
let tmp = tempdir().expect("tmpdir");

View File

@@ -241,8 +241,6 @@ pub struct Config {
/// When `true`, run a model-based assessment for commands denied by the sandbox.
pub experimental_sandbox_command_assessment: bool,
pub use_experimental_streamable_shell_tool: bool,
/// If set to `true`, used only the experimental unified exec tool.
pub use_experimental_unified_exec_tool: bool,
@@ -655,7 +653,6 @@ pub struct ConfigToml {
/// Legacy, now use features
pub experimental_instructions_file: Option<PathBuf>,
pub experimental_compact_prompt_file: Option<PathBuf>,
pub experimental_use_exec_command_tool: Option<bool>,
pub experimental_use_unified_exec_tool: Option<bool>,
pub experimental_use_rmcp_client: Option<bool>,
pub experimental_use_freeform_apply_patch: Option<bool>,
@@ -999,7 +996,6 @@ impl Config {
let include_apply_patch_tool_flag = features.enabled(Feature::ApplyPatchFreeform);
let tools_web_search_request = features.enabled(Feature::WebSearchRequest);
let use_experimental_streamable_shell_tool = features.enabled(Feature::StreamableShell);
let use_experimental_unified_exec_tool = features.enabled(Feature::UnifiedExec);
let use_experimental_use_rmcp_client = features.enabled(Feature::RmcpClient);
let experimental_sandbox_command_assessment =
@@ -1156,7 +1152,6 @@ impl Config {
include_apply_patch_tool: include_apply_patch_tool_flag,
tools_web_search_request,
experimental_sandbox_command_assessment,
use_experimental_streamable_shell_tool,
use_experimental_unified_exec_tool,
use_experimental_use_rmcp_client,
features,
@@ -1715,7 +1710,6 @@ trust_level = "trusted"
fn legacy_toggles_map_to_features() -> std::io::Result<()> {
let codex_home = TempDir::new()?;
let cfg = ConfigToml {
experimental_use_exec_command_tool: Some(true),
experimental_use_unified_exec_tool: Some(true),
experimental_use_rmcp_client: Some(true),
experimental_use_freeform_apply_patch: Some(true),
@@ -1729,12 +1723,11 @@ trust_level = "trusted"
)?;
assert!(config.features.enabled(Feature::ApplyPatchFreeform));
assert!(config.features.enabled(Feature::StreamableShell));
assert!(config.features.enabled(Feature::UnifiedExec));
assert!(config.features.enabled(Feature::RmcpClient));
assert!(config.include_apply_patch_tool);
assert!(config.use_experimental_streamable_shell_tool);
assert!(config.use_experimental_unified_exec_tool);
assert!(config.use_experimental_use_rmcp_client);
@@ -2902,7 +2895,6 @@ model_verbosity = "high"
include_apply_patch_tool: false,
tools_web_search_request: false,
experimental_sandbox_command_assessment: false,
use_experimental_streamable_shell_tool: false,
use_experimental_unified_exec_tool: false,
use_experimental_use_rmcp_client: false,
features: Features::with_defaults(),
@@ -2974,7 +2966,6 @@ model_verbosity = "high"
include_apply_patch_tool: false,
tools_web_search_request: false,
experimental_sandbox_command_assessment: false,
use_experimental_streamable_shell_tool: false,
use_experimental_unified_exec_tool: false,
use_experimental_use_rmcp_client: false,
features: Features::with_defaults(),
@@ -3061,7 +3052,6 @@ model_verbosity = "high"
include_apply_patch_tool: false,
tools_web_search_request: false,
experimental_sandbox_command_assessment: false,
use_experimental_streamable_shell_tool: false,
use_experimental_unified_exec_tool: false,
use_experimental_use_rmcp_client: false,
features: Features::with_defaults(),
@@ -3134,7 +3124,6 @@ model_verbosity = "high"
include_apply_patch_tool: false,
tools_web_search_request: false,
experimental_sandbox_command_assessment: false,
use_experimental_streamable_shell_tool: false,
use_experimental_unified_exec_tool: false,
use_experimental_use_rmcp_client: false,
features: Features::with_defaults(),

View File

@@ -25,7 +25,6 @@ pub struct ConfigProfile {
pub experimental_compact_prompt_file: Option<PathBuf>,
pub include_apply_patch_tool: Option<bool>,
pub experimental_use_unified_exec_tool: Option<bool>,
pub experimental_use_exec_command_tool: Option<bool>,
pub experimental_use_rmcp_client: Option<bool>,
pub experimental_use_freeform_apply_patch: Option<bool>,
pub experimental_sandbox_command_assessment: Option<bool>,

View File

@@ -358,6 +358,10 @@ pub struct Tui {
pub struct Notice {
/// Tracks whether the user has acknowledged the full access warning prompt.
pub hide_full_access_warning: Option<bool>,
/// Tracks whether the user has acknowledged the Windows world-writable directories warning.
pub hide_world_writable_warning: Option<bool>,
/// Tracks whether the user opted out of the rate limit model switch reminder.
pub hide_rate_limit_model_nudge: Option<bool>,
}
impl Notice {

View File

@@ -109,6 +109,9 @@ pub enum CodexErr {
#[error("{0}")]
ConnectionFailed(ConnectionFailedError),
#[error("Quota exceeded. Check your plan and billing details.")]
QuotaExceeded,
#[error(
"To use Codex with your ChatGPT plan, upgrade to Plus: https://openai.com/chatgpt/pricing."
)]
@@ -235,18 +238,44 @@ pub struct UnexpectedResponseError {
pub request_id: Option<String>,
}
const CLOUDFLARE_BLOCKED_MESSAGE: &str =
"Access blocked by Cloudflare. This usually happens when connecting from a restricted region";
impl UnexpectedResponseError {
fn friendly_message(&self) -> Option<String> {
if self.status != StatusCode::FORBIDDEN {
return None;
}
if !self.body.contains("Cloudflare") || !self.body.contains("blocked") {
return None;
}
let mut message = format!("{CLOUDFLARE_BLOCKED_MESSAGE} (status {})", self.status);
if let Some(id) = &self.request_id {
message.push_str(&format!(", request id: {id}"));
}
Some(message)
}
}
impl std::fmt::Display for UnexpectedResponseError {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(
f,
"unexpected status {}: {}{}",
self.status,
self.body,
self.request_id
.as_ref()
.map(|id| format!(", request id: {id}"))
.unwrap_or_default()
)
if let Some(friendly) = self.friendly_message() {
write!(f, "{friendly}")
} else {
write!(
f,
"unexpected status {}: {}{}",
self.status,
self.body,
self.request_id
.as_ref()
.map(|id| format!(", request id: {id}"))
.unwrap_or_default()
)
}
}
}
@@ -662,6 +691,35 @@ mod tests {
});
}
#[test]
fn unexpected_status_cloudflare_html_is_simplified() {
let err = UnexpectedResponseError {
status: StatusCode::FORBIDDEN,
body: "<html><body>Cloudflare error: Sorry, you have been blocked</body></html>"
.to_string(),
request_id: Some("ray-id".to_string()),
};
let status = StatusCode::FORBIDDEN.to_string();
assert_eq!(
err.to_string(),
format!("{CLOUDFLARE_BLOCKED_MESSAGE} (status {status}), request id: ray-id")
);
}
#[test]
fn unexpected_status_non_html_is_unchanged() {
let err = UnexpectedResponseError {
status: StatusCode::FORBIDDEN,
body: "plain text error".to_string(),
request_id: None,
};
let status = StatusCode::FORBIDDEN.to_string();
assert_eq!(
err.to_string(),
format!("unexpected status {status}: plain text error")
);
}
#[test]
fn usage_limit_reached_includes_hours_and_minutes() {
let base = Utc.with_ymd_and_hms(2024, 1, 1, 0, 0, 0).unwrap();

View File

@@ -14,6 +14,7 @@ use tracing::warn;
use uuid::Uuid;
use crate::user_instructions::UserInstructions;
use crate::user_shell_command::is_user_shell_command_text;
fn is_session_prefix(text: &str) -> bool {
let trimmed = text.trim_start();
@@ -31,7 +32,7 @@ fn parse_user_message(message: &[ContentItem]) -> Option<UserMessageItem> {
for content_item in message.iter() {
match content_item {
ContentItem::InputText { text } => {
if is_session_prefix(text) {
if is_session_prefix(text) || is_user_shell_command_text(text) {
return None;
}
content.push(UserInput::Text { text: text.clone() });
@@ -197,7 +198,14 @@ mod tests {
text: "# AGENTS.md instructions for test_directory\n\n<INSTRUCTIONS>\ntest_text\n</INSTRUCTIONS>".to_string(),
}],
},
];
ResponseItem::Message {
id: None,
role: "user".to_string(),
content: vec![ContentItem::InputText {
text: "<user_shell_command>echo 42</user_shell_command>".to_string(),
}],
},
];
for item in items {
let turn_item = parse_turn_item(&item);

View File

@@ -313,6 +313,10 @@ pub(crate) mod errors {
SandboxTransformError::MissingLinuxSandboxExecutable => {
CodexErr::LandlockSandboxExecutableNotProvided
}
#[cfg(not(target_os = "macos"))]
SandboxTransformError::SeatbeltUnavailable => CodexErr::UnsupportedOperation(
"seatbelt sandbox is only available on macOS".to_string(),
),
}
}
}
@@ -514,6 +518,7 @@ async fn consume_truncated_output(
}
Err(_) => {
// timeout
kill_child_process_group(&mut child)?;
child.start_kill()?;
// Debatable whether `child.wait().await` should be called here.
(synthetic_exit_status(EXIT_CODE_SIGNAL_BASE + TIMEOUT_CODE), true)
@@ -521,6 +526,7 @@ async fn consume_truncated_output(
}
}
_ = tokio::signal::ctrl_c() => {
kill_child_process_group(&mut child)?;
child.start_kill()?;
(synthetic_exit_status(EXIT_CODE_SIGNAL_BASE + SIGKILL_CODE), false)
}
@@ -617,6 +623,38 @@ fn synthetic_exit_status(code: i32) -> ExitStatus {
std::process::ExitStatus::from_raw(code as u32)
}
#[cfg(unix)]
fn kill_child_process_group(child: &mut Child) -> io::Result<()> {
use std::io::ErrorKind;
if let Some(pid) = child.id() {
let pid = pid as libc::pid_t;
let pgid = unsafe { libc::getpgid(pid) };
if pgid == -1 {
let err = std::io::Error::last_os_error();
if err.kind() != ErrorKind::NotFound {
return Err(err);
}
return Ok(());
}
let result = unsafe { libc::killpg(pgid, libc::SIGKILL) };
if result == -1 {
let err = std::io::Error::last_os_error();
if err.kind() != ErrorKind::NotFound {
return Err(err);
}
}
}
Ok(())
}
#[cfg(not(unix))]
fn kill_child_process_group(_: &mut Child) -> io::Result<()> {
Ok(())
}
#[cfg(test)]
mod tests {
use super::*;
@@ -689,4 +727,51 @@ mod tests {
let output = make_exec_output(exit_code, "", "", "");
assert!(is_likely_sandbox_denied(SandboxType::LinuxSeccomp, &output));
}
#[cfg(unix)]
#[tokio::test]
async fn kill_child_process_group_kills_grandchildren_on_timeout() -> Result<()> {
let command = vec![
"/bin/bash".to_string(),
"-c".to_string(),
"sleep 60 & echo $!; sleep 60".to_string(),
];
let env: HashMap<String, String> = std::env::vars().collect();
let params = ExecParams {
command,
cwd: std::env::current_dir()?,
timeout_ms: Some(500),
env,
with_escalated_permissions: None,
justification: None,
arg0: None,
};
let output = exec(params, SandboxType::None, &SandboxPolicy::ReadOnly, None).await?;
assert!(output.timed_out);
let stdout = output.stdout.from_utf8_lossy().text;
let pid_line = stdout.lines().next().unwrap_or("").trim();
let pid: i32 = pid_line.parse().map_err(|error| {
io::Error::new(
io::ErrorKind::InvalidData,
format!("Failed to parse pid from stdout '{pid_line}': {error}"),
)
})?;
let mut killed = false;
for _ in 0..20 {
// Use kill(pid, 0) to check if the process is alive.
if unsafe { libc::kill(pid, 0) } == -1
&& let Some(libc::ESRCH) = std::io::Error::last_os_error().raw_os_error()
{
killed = true;
break;
}
tokio::time::sleep(Duration::from_millis(100)).await;
}
assert!(killed, "grandchild process with pid {pid} is still alive");
Ok(())
}
}

View File

@@ -29,8 +29,9 @@ pub enum Stage {
pub enum Feature {
/// Use the single unified PTY-backed exec tool.
UnifiedExec,
/// Use the streamable exec-command/write-stdin tool pair.
StreamableShell,
/// Use the shell command tool that takes `command` as a single string of
/// shell instead of an array of args passed to `execvp(3)`.
ShellCommandTool,
/// Enable experimental RMCP features such as OAuth login.
RmcpClient,
/// Include the freeform apply_patch tool.
@@ -118,8 +119,9 @@ impl Features {
self.enabled.contains(&f)
}
pub fn enable(&mut self, f: Feature) {
pub fn enable(&mut self, f: Feature) -> &mut Self {
self.enabled.insert(f);
self
}
pub fn disable(&mut self, f: Feature) -> &mut Self {
@@ -178,7 +180,6 @@ impl Features {
let base_legacy = LegacyFeatureToggles {
experimental_sandbox_command_assessment: cfg.experimental_sandbox_command_assessment,
experimental_use_freeform_apply_patch: cfg.experimental_use_freeform_apply_patch,
experimental_use_exec_command_tool: cfg.experimental_use_exec_command_tool,
experimental_use_unified_exec_tool: cfg.experimental_use_unified_exec_tool,
experimental_use_rmcp_client: cfg.experimental_use_rmcp_client,
tools_web_search: cfg.tools.as_ref().and_then(|t| t.web_search),
@@ -197,7 +198,7 @@ impl Features {
.experimental_sandbox_command_assessment,
experimental_use_freeform_apply_patch: config_profile
.experimental_use_freeform_apply_patch,
experimental_use_exec_command_tool: config_profile.experimental_use_exec_command_tool,
experimental_use_unified_exec_tool: config_profile.experimental_use_unified_exec_tool,
experimental_use_rmcp_client: config_profile.experimental_use_rmcp_client,
tools_web_search: config_profile.tools_web_search,
@@ -253,8 +254,8 @@ pub const FEATURES: &[FeatureSpec] = &[
default_enabled: false,
},
FeatureSpec {
id: Feature::StreamableShell,
key: "streamable_shell",
id: Feature::ShellCommandTool,
key: "shell_command_tool",
stage: Stage::Experimental,
default_enabled: false,
},
@@ -292,7 +293,7 @@ pub const FEATURES: &[FeatureSpec] = &[
id: Feature::GhostCommit,
key: "ghost_commit",
stage: Stage::Experimental,
default_enabled: false,
default_enabled: true,
},
FeatureSpec {
id: Feature::WindowsSandbox,

View File

@@ -17,10 +17,6 @@ const ALIASES: &[Alias] = &[
legacy_key: "experimental_use_unified_exec_tool",
feature: Feature::UnifiedExec,
},
Alias {
legacy_key: "experimental_use_exec_command_tool",
feature: Feature::StreamableShell,
},
Alias {
legacy_key: "experimental_use_rmcp_client",
feature: Feature::RmcpClient,
@@ -54,7 +50,6 @@ pub struct LegacyFeatureToggles {
pub include_apply_patch_tool: Option<bool>,
pub experimental_sandbox_command_assessment: Option<bool>,
pub experimental_use_freeform_apply_patch: Option<bool>,
pub experimental_use_exec_command_tool: Option<bool>,
pub experimental_use_unified_exec_tool: Option<bool>,
pub experimental_use_rmcp_client: Option<bool>,
pub tools_web_search: Option<bool>,
@@ -81,12 +76,6 @@ impl LegacyFeatureToggles {
self.experimental_use_freeform_apply_patch,
"experimental_use_freeform_apply_patch",
);
set_if_some(
features,
Feature::StreamableShell,
self.experimental_use_exec_command_tool,
"experimental_use_exec_command_tool",
);
set_if_some(
features,
Feature::UnifiedExec,

View File

@@ -81,6 +81,7 @@ mod function_tool;
mod state;
mod tasks;
mod user_notification;
mod user_shell_command;
pub mod util;
pub use apply_patch::CODEX_APPLY_PATCH_ARG1;
@@ -99,11 +100,12 @@ pub use client_common::Prompt;
pub use client_common::REVIEW_PROMPT;
pub use client_common::ResponseEvent;
pub use client_common::ResponseStream;
pub use codex::compact::content_items_to_text;
pub use codex_protocol::models::ContentItem;
pub use codex_protocol::models::LocalShellAction;
pub use codex_protocol::models::LocalShellExecAction;
pub use codex_protocol::models::LocalShellStatus;
pub use codex_protocol::models::ResponseItem;
pub use compact::content_items_to_text;
pub use event_mapping::parse_turn_item;
pub mod compact;
pub mod otel_init;

View File

@@ -1,5 +1,6 @@
use crate::config::types::ReasoningSummaryFormat;
use crate::tools::handlers::apply_patch::ApplyPatchToolType;
use crate::tools::spec::ConfigShellToolType;
/// The `instructions` field in the payload sent to a model should always start
/// with this content.
@@ -29,12 +30,6 @@ pub struct ModelFamily {
// Define if we need a special handling of reasoning summary
pub reasoning_summary_format: ReasoningSummaryFormat,
// This should be set to true when the model expects a tool named
// "local_shell" to be provided. Its contract must be understood natively by
// the model such that its description can be omitted.
// See https://platform.openai.com/docs/guides/tools-local-shell
pub uses_local_shell_tool: bool,
/// Whether this model supports parallel tool calls when using the
/// Responses API.
pub supports_parallel_tool_calls: bool,
@@ -57,6 +52,9 @@ pub struct ModelFamily {
/// If the model family supports setting the verbosity level when using Responses API.
pub support_verbosity: bool,
/// Preferred shell tool type for this model family when features do not override it.
pub shell_type: ConfigShellToolType,
}
macro_rules! model_family {
@@ -64,19 +62,20 @@ macro_rules! model_family {
$slug:expr, $family:expr $(, $key:ident : $value:expr )* $(,)?
) => {{
// defaults
#[allow(unused_mut)]
let mut mf = ModelFamily {
slug: $slug.to_string(),
family: $family.to_string(),
needs_special_apply_patch_instructions: false,
supports_reasoning_summaries: false,
reasoning_summary_format: ReasoningSummaryFormat::None,
uses_local_shell_tool: false,
supports_parallel_tool_calls: false,
apply_patch_tool_type: None,
base_instructions: BASE_INSTRUCTIONS.to_string(),
experimental_supported_tools: Vec::new(),
effective_context_window_percent: 95,
support_verbosity: false,
shell_type: ConfigShellToolType::Default,
};
// apply overrides
$(
@@ -105,8 +104,8 @@ pub fn find_family_for_model(slug: &str) -> Option<ModelFamily> {
model_family!(
slug, "codex-mini-latest",
supports_reasoning_summaries: true,
uses_local_shell_tool: true,
needs_special_apply_patch_instructions: true,
shell_type: ConfigShellToolType::Local,
)
} else if slug.starts_with("gpt-4.1") {
model_family!(
@@ -119,6 +118,8 @@ pub fn find_family_for_model(slug: &str) -> Option<ModelFamily> {
model_family!(slug, "gpt-4o", needs_special_apply_patch_instructions: true)
} else if slug.starts_with("gpt-3.5") {
model_family!(slug, "gpt-3.5", needs_special_apply_patch_instructions: true)
} else if slug.starts_with("porcupine") {
model_family!(slug, "porcupine", shell_type: ConfigShellToolType::UnifiedExec)
} else if slug.starts_with("test-gpt-5-codex") {
model_family!(
slug, slug,
@@ -181,12 +182,12 @@ pub fn derive_default_model_family(model: &str) -> ModelFamily {
needs_special_apply_patch_instructions: false,
supports_reasoning_summaries: false,
reasoning_summary_format: ReasoningSummaryFormat::None,
uses_local_shell_tool: false,
supports_parallel_tool_calls: false,
apply_patch_tool_type: None,
base_instructions: BASE_INSTRUCTIONS.to_string(),
experimental_supported_tools: Vec::new(),
effective_context_window_percent: 95,
support_verbosity: false,
shell_type: ConfigShellToolType::Default,
}
}

View File

@@ -1,3 +1,4 @@
use crate::bash::extract_bash_command;
use crate::bash::try_parse_shell;
use crate::bash::try_parse_word_only_commands_sequence;
use codex_protocol::parse_command::ParsedCommand;
@@ -853,6 +854,29 @@ mod tests {
}],
);
}
#[test]
fn bin_bash_lc_sed() {
assert_parsed(
&shlex_split_safe("/bin/bash -lc 'sed -n '1,10p' Cargo.toml'"),
vec![ParsedCommand::Read {
cmd: "sed -n '1,10p' Cargo.toml".to_string(),
name: "Cargo.toml".to_string(),
path: PathBuf::from("Cargo.toml"),
}],
);
}
#[test]
fn bin_zsh_lc_sed() {
assert_parsed(
&shlex_split_safe("/bin/zsh -lc 'sed -n '1,10p' Cargo.toml'"),
vec![ParsedCommand::Read {
cmd: "sed -n '1,10p' Cargo.toml".to_string(),
name: "Cargo.toml".to_string(),
path: PathBuf::from("Cargo.toml"),
}],
);
}
}
pub fn parse_command_impl(command: &[String]) -> Vec<ParsedCommand> {
@@ -1166,18 +1190,13 @@ fn parse_find_query_and_path(tail: &[String]) -> (Option<String>, Option<String>
}
fn parse_shell_lc_commands(original: &[String]) -> Option<Vec<ParsedCommand>> {
let [shell, flag, script] = original else {
return None;
};
if flag != "-lc" || !(shell == "bash" || shell == "zsh") {
return None;
}
let (_, script) = extract_bash_command(original)?;
if let Some(tree) = try_parse_shell(script)
&& let Some(all_commands) = try_parse_word_only_commands_sequence(&tree, script)
&& !all_commands.is_empty()
{
let script_tokens = shlex_split(script)
.unwrap_or_else(|| vec![shell.clone(), flag.clone(), script.clone()]);
let script_tokens = shlex_split(script).unwrap_or_else(|| vec![script.to_string()]);
// Strip small formatting helpers (e.g., head/tail/awk/wc/etc) so we
// bias toward the primary command when pipelines are present.
// First, drop obvious small formatting helpers (e.g., wc/awk/etc).
@@ -1186,7 +1205,7 @@ fn parse_shell_lc_commands(original: &[String]) -> Option<Vec<ParsedCommand>> {
let filtered_commands = drop_small_formatting_commands(all_commands);
if filtered_commands.is_empty() {
return Some(vec![ParsedCommand::Unknown {
cmd: script.clone(),
cmd: script.to_string(),
}]);
}
// Build parsed commands, tracking `cd` segments to compute effective file paths.
@@ -1250,7 +1269,7 @@ fn parse_shell_lc_commands(original: &[String]) -> Option<Vec<ParsedCommand>> {
});
if has_pipe && has_sed_n {
ParsedCommand::Read {
cmd: script.clone(),
cmd: script.to_string(),
name,
path,
}
@@ -1295,7 +1314,7 @@ fn parse_shell_lc_commands(original: &[String]) -> Option<Vec<ParsedCommand>> {
return Some(commands);
}
Some(vec![ParsedCommand::Unknown {
cmd: script.clone(),
cmd: script.to_string(),
}])
}

View File

@@ -14,8 +14,11 @@ use crate::exec::StdoutStream;
use crate::exec::execute_exec_env;
use crate::landlock::create_linux_sandbox_command_args;
use crate::protocol::SandboxPolicy;
#[cfg(target_os = "macos")]
use crate::seatbelt::MACOS_PATH_TO_SEATBELT_EXECUTABLE;
#[cfg(target_os = "macos")]
use crate::seatbelt::create_seatbelt_command_args;
#[cfg(target_os = "macos")]
use crate::spawn::CODEX_SANDBOX_ENV_VAR;
use crate::spawn::CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR;
use crate::tools::sandboxing::SandboxablePreference;
@@ -56,6 +59,9 @@ pub enum SandboxPreference {
pub(crate) enum SandboxTransformError {
#[error("missing codex-linux-sandbox executable path")]
MissingLinuxSandboxExecutable,
#[cfg(not(target_os = "macos"))]
#[error("seatbelt sandbox is only available on macOS")]
SeatbeltUnavailable,
}
#[derive(Default)]
@@ -107,6 +113,7 @@ impl SandboxManager {
let (command, sandbox_env, arg0_override) = match sandbox {
SandboxType::None => (command, HashMap::new(), None),
#[cfg(target_os = "macos")]
SandboxType::MacosSeatbelt => {
let mut seatbelt_env = HashMap::new();
seatbelt_env.insert(CODEX_SANDBOX_ENV_VAR.to_string(), "seatbelt".to_string());
@@ -117,6 +124,8 @@ impl SandboxManager {
full_command.append(&mut args);
(full_command, seatbelt_env, None)
}
#[cfg(not(target_os = "macos"))]
SandboxType::MacosSeatbelt => return Err(SandboxTransformError::SeatbeltUnavailable),
SandboxType::LinuxSeccomp => {
let exe = codex_linux_sandbox_exe
.ok_or(SandboxTransformError::MissingLinuxSandboxExecutable)?;

View File

@@ -1,4 +1,7 @@
#![cfg(target_os = "macos")]
use std::collections::HashMap;
use std::ffi::CStr;
use std::path::Path;
use std::path::PathBuf;
use tokio::process::Child;
@@ -9,6 +12,7 @@ use crate::spawn::StdioPolicy;
use crate::spawn::spawn_child_async;
const MACOS_SEATBELT_BASE_POLICY: &str = include_str!("seatbelt_base_policy.sbpl");
const MACOS_SEATBELT_NETWORK_POLICY: &str = include_str!("seatbelt_network_policy.sbpl");
/// When working with `sandbox-exec`, only consider `sandbox-exec` in `/usr/bin`
/// to defend against an attacker trying to inject a malicious version on the
@@ -44,27 +48,24 @@ pub(crate) fn create_seatbelt_command_args(
sandbox_policy: &SandboxPolicy,
sandbox_policy_cwd: &Path,
) -> Vec<String> {
let (file_write_policy, extra_cli_args) = {
let (file_write_policy, file_write_dir_params) = {
if sandbox_policy.has_full_disk_write_access() {
// Allegedly, this is more permissive than `(allow file-write*)`.
(
r#"(allow file-write* (regex #"^/"))"#.to_string(),
Vec::<String>::new(),
Vec::new(),
)
} else {
let writable_roots = sandbox_policy.get_writable_roots_with_cwd(sandbox_policy_cwd);
let mut writable_folder_policies: Vec<String> = Vec::new();
let mut cli_args: Vec<String> = Vec::new();
let mut file_write_params = Vec::new();
for (index, wr) in writable_roots.iter().enumerate() {
// Canonicalize to avoid mismatches like /var vs /private/var on macOS.
let canonical_root = wr.root.canonicalize().unwrap_or_else(|_| wr.root.clone());
let root_param = format!("WRITABLE_ROOT_{index}");
cli_args.push(format!(
"-D{root_param}={}",
canonical_root.to_string_lossy()
));
file_write_params.push((root_param.clone(), canonical_root));
if wr.read_only_subpaths.is_empty() {
writable_folder_policies.push(format!("(subpath (param \"{root_param}\"))"));
@@ -76,9 +77,9 @@ pub(crate) fn create_seatbelt_command_args(
for (subpath_index, ro) in wr.read_only_subpaths.iter().enumerate() {
let canonical_ro = ro.canonicalize().unwrap_or_else(|_| ro.clone());
let ro_param = format!("WRITABLE_ROOT_{index}_RO_{subpath_index}");
cli_args.push(format!("-D{ro_param}={}", canonical_ro.to_string_lossy()));
require_parts
.push(format!("(require-not (subpath (param \"{ro_param}\")))"));
file_write_params.push((ro_param, canonical_ro));
}
let policy_component = format!("(require-all {} )", require_parts.join(" "));
writable_folder_policies.push(policy_component);
@@ -86,13 +87,13 @@ pub(crate) fn create_seatbelt_command_args(
}
if writable_folder_policies.is_empty() {
("".to_string(), Vec::<String>::new())
("".to_string(), Vec::new())
} else {
let file_write_policy = format!(
"(allow file-write*\n{}\n)",
writable_folder_policies.join(" ")
);
(file_write_policy, cli_args)
(file_write_policy, file_write_params)
}
}
};
@@ -105,7 +106,7 @@ pub(crate) fn create_seatbelt_command_args(
// TODO(mbolin): apply_patch calls must also honor the SandboxPolicy.
let network_policy = if sandbox_policy.has_full_network_access() {
"(allow network-outbound)\n(allow network-inbound)\n(allow system-socket)"
MACOS_SEATBELT_NETWORK_POLICY
} else {
""
};
@@ -114,17 +115,49 @@ pub(crate) fn create_seatbelt_command_args(
"{MACOS_SEATBELT_BASE_POLICY}\n{file_read_policy}\n{file_write_policy}\n{network_policy}"
);
let dir_params = [file_write_dir_params, macos_dir_params()].concat();
let mut seatbelt_args: Vec<String> = vec!["-p".to_string(), full_policy];
seatbelt_args.extend(extra_cli_args);
let definition_args = dir_params
.into_iter()
.map(|(key, value)| format!("-D{key}={value}", value = value.to_string_lossy()));
seatbelt_args.extend(definition_args);
seatbelt_args.push("--".to_string());
seatbelt_args.extend(command);
seatbelt_args
}
/// Wraps libc::confstr to return a String.
fn confstr(name: libc::c_int) -> Option<String> {
let mut buf = vec![0_i8; (libc::PATH_MAX as usize) + 1];
let len = unsafe { libc::confstr(name, buf.as_mut_ptr(), buf.len()) };
if len == 0 {
return None;
}
// confstr guarantees NUL-termination when len > 0.
let cstr = unsafe { CStr::from_ptr(buf.as_ptr()) };
cstr.to_str().ok().map(ToString::to_string)
}
/// Wraps confstr to return a canonicalized PathBuf.
fn confstr_path(name: libc::c_int) -> Option<PathBuf> {
let s = confstr(name)?;
let path = PathBuf::from(s);
path.canonicalize().ok().or(Some(path))
}
fn macos_dir_params() -> Vec<(String, PathBuf)> {
if let Some(p) = confstr_path(libc::_CS_DARWIN_USER_CACHE_DIR) {
return vec![("DARWIN_USER_CACHE_DIR".to_string(), p)];
}
vec![]
}
#[cfg(test)]
mod tests {
use super::MACOS_SEATBELT_BASE_POLICY;
use super::create_seatbelt_command_args;
use super::macos_dir_params;
use crate::protocol::SandboxPolicy;
use pretty_assertions::assert_eq;
use std::fs;
@@ -134,11 +167,6 @@ mod tests {
#[test]
fn create_seatbelt_args_with_read_only_git_subpath() {
if cfg!(target_os = "windows") {
// /tmp does not exist on Windows, so skip this test.
return;
}
// Create a temporary workspace with two writable roots: one containing
// a top-level .git directory and one without it.
let tmp = TempDir::new().expect("tempdir");
@@ -199,6 +227,12 @@ mod tests {
format!("-DWRITABLE_ROOT_2={}", cwd.to_string_lossy()),
];
expected_args.extend(
macos_dir_params()
.into_iter()
.map(|(key, value)| format!("-D{key}={value}", value = value.to_string_lossy())),
);
expected_args.extend(vec![
"--".to_string(),
"/bin/echo".to_string(),
@@ -210,11 +244,6 @@ mod tests {
#[test]
fn create_seatbelt_args_for_cwd_as_git_repo() {
if cfg!(target_os = "windows") {
// /tmp does not exist on Windows, so skip this test.
return;
}
// Create a temporary workspace with two writable roots: one containing
// a top-level .git directory and one without it.
let tmp = TempDir::new().expect("tempdir");
@@ -292,6 +321,12 @@ mod tests {
expected_args.push(format!("-DWRITABLE_ROOT_2={p}"));
}
expected_args.extend(
macos_dir_params()
.into_iter()
.map(|(key, value)| format!("-D{key}={value}", value = value.to_string_lossy())),
);
expected_args.extend(vec![
"--".to_string(),
"/bin/echo".to_string(),

View File

@@ -49,6 +49,7 @@
(sysctl-name "hw.packages")
(sysctl-name "hw.pagesize_compat")
(sysctl-name "hw.pagesize")
(sysctl-name "hw.physicalcpu")
(sysctl-name "hw.physicalcpu_max")
(sysctl-name "hw.tbfrequency_compat")
(sysctl-name "hw.vectorunit")

View File

@@ -0,0 +1,30 @@
; when network access is enabled, these policies are added after those in seatbelt_base_policy.sbpl
; Ref https://source.chromium.org/chromium/chromium/src/+/main:sandbox/policy/mac/network.sb;drc=f8f264d5e4e7509c913f4c60c2639d15905a07e4
(allow network-outbound)
(allow network-inbound)
(allow system-socket)
(allow mach-lookup
; Used to look up the _CS_DARWIN_USER_CACHE_DIR in the sandbox.
(global-name "com.apple.bsd.dirhelper")
(global-name "com.apple.system.opendirectoryd.membership")
; Communicate with the security server for TLS certificate information.
(global-name "com.apple.SecurityServer")
(global-name "com.apple.networkd")
(global-name "com.apple.ocspd")
(global-name "com.apple.trustd.agent")
; Read network configuration.
(global-name "com.apple.SystemConfiguration.DNSConfiguration")
(global-name "com.apple.SystemConfiguration.configd")
)
(allow sysctl-read
(sysctl-name-regex #"^net.routetable")
)
(allow file-write*
(subpath (param "DARWIN_USER_CACHE_DIR"))
)

View File

@@ -31,16 +31,37 @@ pub enum Shell {
impl Shell {
pub fn name(&self) -> Option<String> {
match self {
Shell::Zsh(zsh) => std::path::Path::new(&zsh.shell_path)
.file_name()
.map(|s| s.to_string_lossy().to_string()),
Shell::Bash(bash) => std::path::Path::new(&bash.shell_path)
.file_name()
.map(|s| s.to_string_lossy().to_string()),
Shell::Zsh(ZshShell { shell_path, .. }) | Shell::Bash(BashShell { shell_path, .. }) => {
std::path::Path::new(shell_path)
.file_name()
.map(|s| s.to_string_lossy().to_string())
}
Shell::PowerShell(ps) => Some(ps.exe.clone()),
Shell::Unknown => None,
}
}
/// Takes a string of shell and returns the full list of command args to
/// use with `exec()` to run the shell command.
pub fn derive_exec_args(&self, command: &str, use_login_shell: bool) -> Vec<String> {
match self {
Shell::Zsh(ZshShell { shell_path, .. }) | Shell::Bash(BashShell { shell_path, .. }) => {
let arg = if use_login_shell { "-lc" } else { "-c" };
vec![shell_path.clone(), arg.to_string(), command.to_string()]
}
Shell::PowerShell(ps) => {
let mut args = vec![ps.exe.clone(), "-NoLogo".to_string()];
if !use_login_shell {
args.push("-NoProfile".to_string());
}
args.push("-Command".to_string());
args.push(command.to_string());
args
}
Shell::Unknown => shlex::split(command).unwrap_or_else(|| vec![command.to_string()]),
}
}
}
#[cfg(unix)]

View File

@@ -64,24 +64,32 @@ pub(crate) async fn spawn_child_async(
// any child processes that were spawned as part of a `"shell"` tool call
// to also be terminated.
// This relies on prctl(2), so it only works on Linux.
#[cfg(target_os = "linux")]
#[cfg(unix)]
unsafe {
#[cfg(target_os = "linux")]
let parent_pid = libc::getpid();
cmd.pre_exec(move || {
// This prctl call effectively requests, "deliver SIGTERM when my
// current parent dies."
if libc::prctl(libc::PR_SET_PDEATHSIG, libc::SIGTERM) == -1 {
if libc::setpgid(0, 0) == -1 {
return Err(std::io::Error::last_os_error());
}
// Though if there was a race condition and this pre_exec() block is
// run _after_ the parent (i.e., the Codex process) has already
// exited, then parent will be the closest configured "subreaper"
// ancestor process, or PID 1 (init). If the Codex process has exited
// already, so should the child process.
if libc::getppid() != parent_pid {
libc::raise(libc::SIGTERM);
// This relies on prctl(2), so it only works on Linux.
#[cfg(target_os = "linux")]
{
// This prctl call effectively requests, "deliver SIGTERM when my
// current parent dies."
if libc::prctl(libc::PR_SET_PDEATHSIG, libc::SIGTERM) == -1 {
return Err(std::io::Error::last_os_error());
}
// Though if there was a race condition and this pre_exec() block is
// run _after_ the parent (i.e., the Codex process) has already
// exited, then parent will be the closest configured "subreaper"
// ancestor process, or PID 1 (init). If the Codex process has exited
// already, so should the child process.
if libc::getppid() != parent_pid {
libc::raise(libc::SIGTERM);
}
}
Ok(())
});

View File

@@ -4,7 +4,7 @@ use async_trait::async_trait;
use tokio_util::sync::CancellationToken;
use crate::codex::TurnContext;
use crate::codex::compact;
use crate::compact;
use crate::state::TaskKind;
use codex_protocol::user_input::UserInput;

View File

@@ -75,12 +75,12 @@ async fn start_review_conversation(
// Avoid loading project docs; reviewer only needs findings
sub_agent_config.project_doc_max_bytes = 0;
// Carry over review-only feature restrictions so the delegate cannot
// re-enable blocked tools (web search, view image, streamable shell).
// re-enable blocked tools (web search, view image).
sub_agent_config
.features
.disable(crate::features::Feature::WebSearchRequest)
.disable(crate::features::Feature::ViewImageTool)
.disable(crate::features::Feature::StreamableShell);
.disable(crate::features::Feature::ViewImageTool);
// Set explicit review rubric for the sub-agent
sub_agent_config.base_instructions = Some(crate::REVIEW_PROMPT.to_string());
(run_codex_conversation_one_shot(

View File

@@ -1,28 +1,35 @@
use std::sync::Arc;
use std::time::Duration;
use async_trait::async_trait;
use codex_protocol::models::ShellToolCallParams;
use codex_async_utils::CancelErr;
use codex_async_utils::OrCancelExt;
use codex_protocol::user_input::UserInput;
use tokio::sync::Mutex;
use tokio_util::sync::CancellationToken;
use tracing::error;
use uuid::Uuid;
use crate::codex::TurnContext;
use crate::exec::ExecToolCallOutput;
use crate::exec::SandboxType;
use crate::exec::StdoutStream;
use crate::exec::StreamOutput;
use crate::exec::execute_exec_env;
use crate::exec_env::create_env;
use crate::parse_command::parse_command;
use crate::protocol::EventMsg;
use crate::protocol::ExecCommandBeginEvent;
use crate::protocol::ExecCommandEndEvent;
use crate::protocol::SandboxPolicy;
use crate::protocol::TaskStartedEvent;
use crate::sandboxing::ExecEnv;
use crate::state::TaskKind;
use crate::tools::context::ToolPayload;
use crate::tools::parallel::ToolCallRuntime;
use crate::tools::router::ToolCall;
use crate::tools::router::ToolRouter;
use crate::turn_diff_tracker::TurnDiffTracker;
use crate::tools::format_exec_output_str;
use crate::user_shell_command::user_shell_command_record_item;
use super::SessionTask;
use super::SessionTaskContext;
const USER_SHELL_TOOL_NAME: &str = "local_shell";
#[derive(Clone)]
pub(crate) struct UserShellCommandTask {
command: String,
@@ -56,56 +63,131 @@ impl SessionTask for UserShellCommandTask {
// Execute the user's script under their default shell when known; this
// allows commands that use shell features (pipes, &&, redirects, etc.).
// We do not source rc files or otherwise reformat the script.
let shell_invocation = match session.user_shell() {
crate::shell::Shell::Zsh(zsh) => vec![
zsh.shell_path.clone(),
"-lc".to_string(),
self.command.clone(),
],
crate::shell::Shell::Bash(bash) => vec![
bash.shell_path.clone(),
"-lc".to_string(),
self.command.clone(),
],
crate::shell::Shell::PowerShell(ps) => vec![
ps.exe.clone(),
"-NoProfile".to_string(),
"-Command".to_string(),
self.command.clone(),
],
crate::shell::Shell::Unknown => {
shlex::split(&self.command).unwrap_or_else(|| vec![self.command.clone()])
}
};
let use_login_shell = true;
let shell_invocation = session
.user_shell()
.derive_exec_args(&self.command, use_login_shell);
let params = ShellToolCallParams {
let call_id = Uuid::new_v4().to_string();
let raw_command = self.command.clone();
let parsed_cmd = parse_command(&shell_invocation);
session
.send_event(
turn_context.as_ref(),
EventMsg::ExecCommandBegin(ExecCommandBeginEvent {
call_id: call_id.clone(),
command: shell_invocation.clone(),
cwd: turn_context.cwd.clone(),
parsed_cmd,
is_user_shell_command: true,
}),
)
.await;
let exec_env = ExecEnv {
command: shell_invocation,
workdir: None,
cwd: turn_context.cwd.clone(),
env: create_env(&turn_context.shell_environment_policy),
timeout_ms: None,
sandbox: SandboxType::None,
with_escalated_permissions: None,
justification: None,
arg0: None,
};
let tool_call = ToolCall {
tool_name: USER_SHELL_TOOL_NAME.to_string(),
call_id: Uuid::new_v4().to_string(),
payload: ToolPayload::LocalShell { params },
};
let stdout_stream = Some(StdoutStream {
sub_id: turn_context.sub_id.clone(),
call_id: call_id.clone(),
tx_event: session.get_tx_event(),
});
let router = Arc::new(ToolRouter::from_config(&turn_context.tools_config, None));
let tracker = Arc::new(Mutex::new(TurnDiffTracker::new()));
let runtime = ToolCallRuntime::new(
Arc::clone(&router),
Arc::clone(&session),
Arc::clone(&turn_context),
Arc::clone(&tracker),
);
let sandbox_policy = SandboxPolicy::DangerFullAccess;
let exec_result = execute_exec_env(exec_env, &sandbox_policy, stdout_stream)
.or_cancel(&cancellation_token)
.await;
if let Err(err) = runtime
.handle_tool_call(tool_call, cancellation_token)
.await
{
error!("user shell command failed: {err:?}");
match exec_result {
Err(CancelErr::Cancelled) => {
let aborted_message = "command aborted by user".to_string();
let exec_output = ExecToolCallOutput {
exit_code: -1,
stdout: StreamOutput::new(String::new()),
stderr: StreamOutput::new(aborted_message.clone()),
aggregated_output: StreamOutput::new(aborted_message.clone()),
duration: Duration::ZERO,
timed_out: false,
};
let output_items = [user_shell_command_record_item(&raw_command, &exec_output)];
session
.record_conversation_items(turn_context.as_ref(), &output_items)
.await;
session
.send_event(
turn_context.as_ref(),
EventMsg::ExecCommandEnd(ExecCommandEndEvent {
call_id,
stdout: String::new(),
stderr: aborted_message.clone(),
aggregated_output: aborted_message.clone(),
exit_code: -1,
duration: Duration::ZERO,
formatted_output: aborted_message,
}),
)
.await;
}
Ok(Ok(output)) => {
session
.send_event(
turn_context.as_ref(),
EventMsg::ExecCommandEnd(ExecCommandEndEvent {
call_id: call_id.clone(),
stdout: output.stdout.text.clone(),
stderr: output.stderr.text.clone(),
aggregated_output: output.aggregated_output.text.clone(),
exit_code: output.exit_code,
duration: output.duration,
formatted_output: format_exec_output_str(&output),
}),
)
.await;
let output_items = [user_shell_command_record_item(&raw_command, &output)];
session
.record_conversation_items(turn_context.as_ref(), &output_items)
.await;
}
Ok(Err(err)) => {
error!("user shell command failed: {err:?}");
let message = format!("execution error: {err:?}");
let exec_output = ExecToolCallOutput {
exit_code: -1,
stdout: StreamOutput::new(String::new()),
stderr: StreamOutput::new(message.clone()),
aggregated_output: StreamOutput::new(message.clone()),
duration: Duration::ZERO,
timed_out: false,
};
session
.send_event(
turn_context.as_ref(),
EventMsg::ExecCommandEnd(ExecCommandEndEvent {
call_id,
stdout: exec_output.stdout.text.clone(),
stderr: exec_output.stderr.text.clone(),
aggregated_output: exec_output.aggregated_output.text.clone(),
exit_code: exec_output.exit_code,
duration: exec_output.duration,
formatted_output: format_exec_output_str(&exec_output),
}),
)
.await;
let output_items = [user_shell_command_record_item(&raw_command, &exec_output)];
session
.record_conversation_items(turn_context.as_ref(), &output_items)
.await;
}
}
None
}

View File

@@ -19,6 +19,7 @@ pub use mcp::McpHandler;
pub use mcp_resource::McpResourceHandler;
pub use plan::PlanHandler;
pub use read_file::ReadFileHandler;
pub use shell::ShellCommandHandler;
pub use shell::ShellHandler;
pub use test_sync::TestSyncHandler;
pub use unified_exec::UnifiedExecHandler;

View File

@@ -1,4 +1,5 @@
use async_trait::async_trait;
use codex_protocol::models::ShellCommandToolCallParams;
use codex_protocol::models::ShellToolCallParams;
use std::sync::Arc;
@@ -25,6 +26,8 @@ use crate::tools::sandboxing::ToolCtx;
pub struct ShellHandler;
pub struct ShellCommandHandler;
impl ShellHandler {
fn to_exec_params(params: ShellToolCallParams, turn_context: &TurnContext) -> ExecParams {
ExecParams {
@@ -39,6 +42,28 @@ impl ShellHandler {
}
}
impl ShellCommandHandler {
fn to_exec_params(
params: ShellCommandToolCallParams,
session: &crate::codex::Session,
turn_context: &TurnContext,
) -> ExecParams {
let shell = session.user_shell();
let use_login_shell = true;
let command = shell.derive_exec_args(&params.command, use_login_shell);
ExecParams {
command,
cwd: turn_context.resolve_path(params.workdir.clone()),
timeout_ms: params.timeout_ms,
env: create_env(&turn_context.shell_environment_policy),
with_escalated_permissions: params.with_escalated_permissions,
justification: params.justification,
arg0: None,
}
}
}
#[async_trait]
impl ToolHandler for ShellHandler {
fn kind(&self) -> ToolKind {
@@ -102,6 +127,49 @@ impl ToolHandler for ShellHandler {
}
}
#[async_trait]
impl ToolHandler for ShellCommandHandler {
fn kind(&self) -> ToolKind {
ToolKind::Function
}
fn matches_kind(&self, payload: &ToolPayload) -> bool {
matches!(payload, ToolPayload::Function { .. })
}
async fn handle(&self, invocation: ToolInvocation) -> Result<ToolOutput, FunctionCallError> {
let ToolInvocation {
session,
turn,
tracker,
call_id,
tool_name,
payload,
} = invocation;
let ToolPayload::Function { arguments } = payload else {
return Err(FunctionCallError::RespondToModel(format!(
"unsupported payload for shell_command handler: {tool_name}"
)));
};
let params: ShellCommandToolCallParams = serde_json::from_str(&arguments).map_err(|e| {
FunctionCallError::RespondToModel(format!("failed to parse function arguments: {e:?}"))
})?;
let exec_params = Self::to_exec_params(params, session.as_ref(), turn.as_ref());
ShellHandler::run_exec_like(
tool_name.as_str(),
exec_params,
session,
turn,
tracker,
call_id,
false,
)
.await
}
}
impl ShellHandler {
async fn run_exec_like(
tool_name: &str,
@@ -240,3 +308,49 @@ impl ShellHandler {
})
}
}
#[cfg(test)]
mod tests {
use crate::is_safe_command::is_known_safe_command;
use crate::shell::BashShell;
use crate::shell::Shell;
use crate::shell::ZshShell;
/// The logic for is_known_safe_command() has heuristics for known shells,
/// so we must ensure the commands generated by [ShellCommandHandler] can be
/// recognized as safe if the `command` is safe.
#[test]
fn commands_generated_by_shell_command_handler_can_be_matched_by_is_known_safe_command() {
let bash_shell = Shell::Bash(BashShell {
shell_path: "/bin/bash".to_string(),
bashrc_path: "/home/user/.bashrc".to_string(),
});
assert_safe(&bash_shell, "ls -la");
let zsh_shell = Shell::Zsh(ZshShell {
shell_path: "/bin/zsh".to_string(),
zshrc_path: "/home/user/.zshrc".to_string(),
});
assert_safe(&zsh_shell, "ls -la");
#[cfg(target_os = "windows")]
{
use crate::shell::PowerShellConfig;
let powershell = Shell::PowerShell(PowerShellConfig {
exe: "pwsh.exe".to_string(),
bash_exe_fallback: None,
});
assert_safe(&powershell, "ls -Name");
}
}
fn assert_safe(shell: &Shell, command: &str) {
assert!(is_known_safe_command(
&shell.derive_exec_args(command, /* use_login_shell */ true)
));
assert!(is_known_safe_command(
&shell.derive_exec_args(command, /* use_login_shell */ false)
));
}
}

View File

@@ -1,8 +1,7 @@
use std::time::Duration;
use std::path::PathBuf;
use async_trait::async_trait;
use serde::Deserialize;
use serde::Serialize;
use crate::function_tool::FunctionCallError;
use crate::protocol::EventMsg;
@@ -27,6 +26,8 @@ pub struct UnifiedExecHandler;
#[derive(Debug, Deserialize)]
struct ExecCommandArgs {
cmd: String,
#[serde(default)]
workdir: Option<String>,
#[serde(default = "default_shell")]
shell: String,
#[serde(default = "default_login")]
@@ -35,6 +36,10 @@ struct ExecCommandArgs {
yield_time_ms: Option<u64>,
#[serde(default)]
max_output_tokens: Option<usize>,
#[serde(default)]
with_escalated_permissions: Option<bool>,
#[serde(default)]
justification: Option<String>,
}
#[derive(Debug, Deserialize)]
@@ -99,6 +104,34 @@ impl ToolHandler for UnifiedExecHandler {
"failed to parse exec_command arguments: {err:?}"
))
})?;
let ExecCommandArgs {
cmd,
workdir,
shell,
login,
yield_time_ms,
max_output_tokens,
with_escalated_permissions,
justification,
} = args;
if with_escalated_permissions.unwrap_or(false)
&& !matches!(
context.turn.approval_policy,
codex_protocol::protocol::AskForApproval::OnRequest
)
{
return Err(FunctionCallError::RespondToModel(format!(
"approval policy is {policy:?}; reject command — you cannot ask for escalated permissions if the approval policy is {policy:?}",
policy = context.turn.approval_policy
)));
}
let workdir = workdir
.as_deref()
.filter(|value| !value.is_empty())
.map(PathBuf::from);
let cwd = workdir.clone().unwrap_or_else(|| context.turn.cwd.clone());
let event_ctx = ToolEventCtx::new(
context.session.as_ref(),
@@ -106,18 +139,20 @@ impl ToolHandler for UnifiedExecHandler {
&context.call_id,
None,
);
let emitter =
ToolEmitter::unified_exec(args.cmd.clone(), context.turn.cwd.clone(), true);
let emitter = ToolEmitter::unified_exec(cmd.clone(), cwd.clone(), true);
emitter.emit(event_ctx, ToolEventStage::Begin).await;
manager
.exec_command(
ExecCommandRequest {
command: &args.cmd,
shell: &args.shell,
login: args.login,
yield_time_ms: args.yield_time_ms,
max_output_tokens: args.max_output_tokens,
command: &cmd,
shell: &shell,
login,
yield_time_ms,
max_output_tokens,
workdir,
with_escalated_permissions,
justification,
},
&context,
)
@@ -163,11 +198,7 @@ impl ToolHandler for UnifiedExecHandler {
.await;
}
let content = serialize_response(&response).map_err(|err| {
FunctionCallError::RespondToModel(format!(
"failed to serialize unified exec output: {err:?}"
))
})?;
let content = format_response(&response);
Ok(ToolOutput::Function {
content,
@@ -177,32 +208,30 @@ impl ToolHandler for UnifiedExecHandler {
}
}
#[derive(Serialize)]
struct SerializedUnifiedExecResponse<'a> {
chunk_id: &'a str,
wall_time_seconds: f64,
output: &'a str,
#[serde(skip_serializing_if = "Option::is_none")]
session_id: Option<i32>,
#[serde(skip_serializing_if = "Option::is_none")]
exit_code: Option<i32>,
#[serde(skip_serializing_if = "Option::is_none")]
original_token_count: Option<usize>,
}
fn format_response(response: &UnifiedExecResponse) -> String {
let mut sections = Vec::new();
fn serialize_response(response: &UnifiedExecResponse) -> Result<String, serde_json::Error> {
let payload = SerializedUnifiedExecResponse {
chunk_id: &response.chunk_id,
wall_time_seconds: duration_to_seconds(response.wall_time),
output: &response.output,
session_id: response.session_id,
exit_code: response.exit_code,
original_token_count: response.original_token_count,
};
if !response.chunk_id.is_empty() {
sections.push(format!("Chunk ID: {}", response.chunk_id));
}
serde_json::to_string(&payload)
}
let wall_time_seconds = response.wall_time.as_secs_f64();
sections.push(format!("Wall time: {wall_time_seconds:.4} seconds"));
fn duration_to_seconds(duration: Duration) -> f64 {
duration.as_secs_f64()
if let Some(exit_code) = response.exit_code {
sections.push(format!("Process exited with code {exit_code}"));
}
if let Some(session_id) = response.session_id {
sections.push(format!("Process running with session ID {session_id}"));
}
if let Some(original_token_count) = response.original_token_count {
sections.push(format!("Original token count: {original_token_count}"));
}
sections.push("Output:".to_string());
sections.push(response.output.clone());
sections.join("\n")
}

View File

@@ -65,9 +65,9 @@ impl ToolCallRuntime {
Ok(Self::aborted_response(&call, secs))
},
res = async {
tracing::info!("waiting for tool gate");
tracing::trace!("waiting for tool gate");
readiness.wait_ready().await;
tracing::info!("tool gate released");
tracing::trace!("tool gate released");
let _guard = if supports_parallel {
Either::Left(lock.read().await)
} else {

View File

@@ -4,8 +4,7 @@ Runtime: shell
Executes shell requests under the orchestrator: asks for approval when needed,
builds a CommandSpec, and runs it under the current SandboxAttempt.
*/
use crate::command_safety::is_dangerous_command::command_might_be_dangerous;
use crate::command_safety::is_safe_command::is_known_safe_command;
use crate::command_safety::is_dangerous_command::requires_initial_appoval;
use crate::exec::ExecToolCallOutput;
use crate::protocol::SandboxPolicy;
use crate::sandboxing::execute_env;
@@ -121,28 +120,12 @@ impl Approvable<ShellRequest> for ShellRuntime {
policy: AskForApproval,
sandbox_policy: &SandboxPolicy,
) -> bool {
if is_known_safe_command(&req.command) {
return false;
}
match policy {
AskForApproval::Never | AskForApproval::OnFailure => false,
AskForApproval::OnRequest => {
// In DangerFullAccess, only prompt if the command looks dangerous.
if matches!(sandbox_policy, SandboxPolicy::DangerFullAccess) {
return command_might_be_dangerous(&req.command);
}
// In restricted sandboxes (ReadOnly/WorkspaceWrite), do not prompt for
// nonescalated, nondangerous commands — let the sandbox enforce
// restrictions (e.g., block network/write) without a user prompt.
let wants_escalation = req.with_escalated_permissions.unwrap_or(false);
if wants_escalation {
return true;
}
command_might_be_dangerous(&req.command)
}
AskForApproval::UnlessTrusted => !is_known_safe_command(&req.command),
}
requires_initial_appoval(
policy,
sandbox_policy,
&req.command,
req.with_escalated_permissions.unwrap_or(false),
)
}
fn wants_escalated_first_attempt(&self, req: &ShellRequest) -> bool {

View File

@@ -1,3 +1,4 @@
use crate::command_safety::is_dangerous_command::requires_initial_appoval;
/*
Runtime: unified exec
@@ -21,7 +22,9 @@ use crate::tools::sandboxing::with_cached_approval;
use crate::unified_exec::UnifiedExecError;
use crate::unified_exec::UnifiedExecSession;
use crate::unified_exec::UnifiedExecSessionManager;
use codex_protocol::protocol::AskForApproval;
use codex_protocol::protocol::ReviewDecision;
use codex_protocol::protocol::SandboxPolicy;
use futures::future::BoxFuture;
use std::collections::HashMap;
use std::path::PathBuf;
@@ -31,6 +34,8 @@ pub struct UnifiedExecRequest {
pub command: Vec<String>,
pub cwd: PathBuf,
pub env: HashMap<String, String>,
pub with_escalated_permissions: Option<bool>,
pub justification: Option<String>,
}
impl ProvidesSandboxRetryData for UnifiedExecRequest {
@@ -46,6 +51,7 @@ impl ProvidesSandboxRetryData for UnifiedExecRequest {
pub struct UnifiedExecApprovalKey {
pub command: Vec<String>,
pub cwd: PathBuf,
pub escalated: bool,
}
pub struct UnifiedExecRuntime<'a> {
@@ -53,8 +59,20 @@ pub struct UnifiedExecRuntime<'a> {
}
impl UnifiedExecRequest {
pub fn new(command: Vec<String>, cwd: PathBuf, env: HashMap<String, String>) -> Self {
Self { command, cwd, env }
pub fn new(
command: Vec<String>,
cwd: PathBuf,
env: HashMap<String, String>,
with_escalated_permissions: Option<bool>,
justification: Option<String>,
) -> Self {
Self {
command,
cwd,
env,
with_escalated_permissions,
justification,
}
}
}
@@ -81,6 +99,7 @@ impl Approvable<UnifiedExecRequest> for UnifiedExecRuntime<'_> {
UnifiedExecApprovalKey {
command: req.command.clone(),
cwd: req.cwd.clone(),
escalated: req.with_escalated_permissions.unwrap_or(false),
}
}
@@ -95,7 +114,10 @@ impl Approvable<UnifiedExecRequest> for UnifiedExecRuntime<'_> {
let call_id = ctx.call_id.to_string();
let command = req.command.clone();
let cwd = req.cwd.clone();
let reason = ctx.retry_reason.clone();
let reason = ctx
.retry_reason
.clone()
.or_else(|| req.justification.clone());
let risk = ctx.risk.clone();
Box::pin(async move {
with_cached_approval(&session.services, key, || async move {
@@ -106,6 +128,24 @@ impl Approvable<UnifiedExecRequest> for UnifiedExecRuntime<'_> {
.await
})
}
fn wants_initial_approval(
&self,
req: &UnifiedExecRequest,
policy: AskForApproval,
sandbox_policy: &SandboxPolicy,
) -> bool {
requires_initial_appoval(
policy,
sandbox_policy,
&req.command,
req.with_escalated_permissions.unwrap_or(false),
)
}
fn wants_escalated_first_attempt(&self, req: &UnifiedExecRequest) -> bool {
req.with_escalated_permissions.unwrap_or(false)
}
}
impl<'a> ToolRuntime<UnifiedExecRequest, UnifiedExecSession> for UnifiedExecRuntime<'a> {
@@ -115,8 +155,15 @@ impl<'a> ToolRuntime<UnifiedExecRequest, UnifiedExecSession> for UnifiedExecRunt
attempt: &SandboxAttempt<'_>,
_ctx: &ToolCtx<'_>,
) -> Result<UnifiedExecSession, ToolError> {
let spec = build_command_spec(&req.command, &req.cwd, &req.env, None, None, None)
.map_err(|_| ToolError::Rejected("missing command line for PTY".to_string()))?;
let spec = build_command_spec(
&req.command,
&req.cwd,
&req.env,
None,
req.with_escalated_permissions,
req.justification.clone(),
)
.map_err(|_| ToolError::Rejected("missing command line for PTY".to_string()))?;
let exec_env = attempt
.env_for(&spec)
.map_err(|err| ToolError::Codex(err.into()))?;

View File

@@ -15,11 +15,13 @@ use serde_json::json;
use std::collections::BTreeMap;
use std::collections::HashMap;
#[derive(Debug, Clone)]
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum ConfigShellToolType {
Default,
Local,
Streamable,
UnifiedExec,
/// Takes a command as a single string to be run in the user's default shell.
ShellCommand,
}
#[derive(Debug, Clone)]
@@ -28,7 +30,6 @@ pub(crate) struct ToolsConfig {
pub apply_patch_tool_type: Option<ApplyPatchToolType>,
pub web_search_request: bool,
pub include_view_image_tool: bool,
pub experimental_unified_exec_tool: bool,
pub experimental_supported_tools: Vec<String>,
}
@@ -43,18 +44,16 @@ impl ToolsConfig {
model_family,
features,
} = params;
let use_streamable_shell_tool = features.enabled(Feature::StreamableShell);
let experimental_unified_exec_tool = features.enabled(Feature::UnifiedExec);
let include_apply_patch_tool = features.enabled(Feature::ApplyPatchFreeform);
let include_web_search_request = features.enabled(Feature::WebSearchRequest);
let include_view_image_tool = features.enabled(Feature::ViewImageTool);
let shell_type = if use_streamable_shell_tool {
ConfigShellToolType::Streamable
} else if model_family.uses_local_shell_tool {
ConfigShellToolType::Local
let shell_type = if features.enabled(Feature::UnifiedExec) {
ConfigShellToolType::UnifiedExec
} else if features.enabled(Feature::ShellCommandTool) {
ConfigShellToolType::ShellCommand
} else {
ConfigShellToolType::Default
model_family.shell_type.clone()
};
let apply_patch_tool_type = match model_family.apply_patch_tool_type {
@@ -74,7 +73,6 @@ impl ToolsConfig {
apply_patch_tool_type,
web_search_request: include_web_search_request,
include_view_image_tool,
experimental_unified_exec_tool,
experimental_supported_tools: model_family.experimental_supported_tools.clone(),
}
}
@@ -144,6 +142,15 @@ fn create_exec_command_tool() -> ToolSpec {
description: Some("Shell command to execute.".to_string()),
},
);
properties.insert(
"workdir".to_string(),
JsonSchema::String {
description: Some(
"Optional working directory to run the command in; defaults to the turn cwd."
.to_string(),
),
},
);
properties.insert(
"shell".to_string(),
JsonSchema::String {
@@ -174,6 +181,24 @@ fn create_exec_command_tool() -> ToolSpec {
),
},
);
properties.insert(
"with_escalated_permissions".to_string(),
JsonSchema::Boolean {
description: Some(
"Whether to request escalated permissions. Set to true if command needs to be run without sandbox restrictions"
.to_string(),
),
},
);
properties.insert(
"justification".to_string(),
JsonSchema::String {
description: Some(
"Only set if with_escalated_permissions is true. 1-sentence explanation of why we want to run this command."
.to_string(),
),
},
);
ToolSpec::Function(ResponsesApiTool {
name: "exec_command".to_string(),
@@ -281,6 +306,53 @@ fn create_shell_tool() -> ToolSpec {
})
}
fn create_shell_command_tool() -> ToolSpec {
let mut properties = BTreeMap::new();
properties.insert(
"command".to_string(),
JsonSchema::String {
description: Some(
"The shell script to execute in the user's default shell".to_string(),
),
},
);
properties.insert(
"workdir".to_string(),
JsonSchema::String {
description: Some("The working directory to execute the command in".to_string()),
},
);
properties.insert(
"timeout_ms".to_string(),
JsonSchema::Number {
description: Some("The timeout for the command in milliseconds".to_string()),
},
);
properties.insert(
"with_escalated_permissions".to_string(),
JsonSchema::Boolean {
description: Some("Whether to request escalated permissions. Set to true if command needs to be run without sandbox restrictions".to_string()),
},
);
properties.insert(
"justification".to_string(),
JsonSchema::String {
description: Some("Only set if with_escalated_permissions is true. 1-sentence explanation of why we want to run this command.".to_string()),
},
);
ToolSpec::Function(ResponsesApiTool {
name: "shell_command".to_string(),
description: "Runs a shell command string and returns its output.".to_string(),
strict: false,
parameters: JsonSchema::Object {
properties,
required: Some(vec!["command".to_string()]),
additional_properties: Some(false.into()),
},
})
}
fn create_view_image_tool() -> ToolSpec {
// Support only local filesystem path.
let mut properties = BTreeMap::new();
@@ -870,6 +942,7 @@ pub(crate) fn build_specs(
use crate::tools::handlers::McpResourceHandler;
use crate::tools::handlers::PlanHandler;
use crate::tools::handlers::ReadFileHandler;
use crate::tools::handlers::ShellCommandHandler;
use crate::tools::handlers::ShellHandler;
use crate::tools::handlers::TestSyncHandler;
use crate::tools::handlers::UnifiedExecHandler;
@@ -885,16 +958,8 @@ pub(crate) fn build_specs(
let view_image_handler = Arc::new(ViewImageHandler);
let mcp_handler = Arc::new(McpHandler);
let mcp_resource_handler = Arc::new(McpResourceHandler);
let shell_command_handler = Arc::new(ShellCommandHandler);
let use_unified_exec = config.experimental_unified_exec_tool
|| matches!(config.shell_type, ConfigShellToolType::Streamable);
if use_unified_exec {
builder.push_spec(create_exec_command_tool());
builder.push_spec(create_write_stdin_tool());
builder.register_handler("exec_command", unified_exec_handler.clone());
builder.register_handler("write_stdin", unified_exec_handler);
}
match &config.shell_type {
ConfigShellToolType::Default => {
builder.push_spec(create_shell_tool());
@@ -902,8 +967,14 @@ pub(crate) fn build_specs(
ConfigShellToolType::Local => {
builder.push_spec(ToolSpec::LocalShell {});
}
ConfigShellToolType::Streamable => {
// Already handled by use_unified_exec.
ConfigShellToolType::UnifiedExec => {
builder.push_spec(create_exec_command_tool());
builder.push_spec(create_write_stdin_tool());
builder.register_handler("exec_command", unified_exec_handler.clone());
builder.register_handler("write_stdin", unified_exec_handler);
}
ConfigShellToolType::ShellCommand => {
builder.push_spec(create_shell_command_tool());
}
}
@@ -911,6 +982,7 @@ pub(crate) fn build_specs(
builder.register_handler("shell", shell_handler.clone());
builder.register_handler("container.exec", shell_handler.clone());
builder.register_handler("local_shell", shell_handler);
builder.register_handler("shell_command", shell_command_handler);
builder.push_spec_with_parallel_support(create_list_mcp_resources_tool(), true);
builder.push_spec_with_parallel_support(create_list_mcp_resource_templates_tool(), true);
@@ -1045,7 +1117,8 @@ mod tests {
match config.shell_type {
ConfigShellToolType::Default => Some("shell"),
ConfigShellToolType::Local => Some("local_shell"),
ConfigShellToolType::Streamable => None,
ConfigShellToolType::UnifiedExec => None,
ConfigShellToolType::ShellCommand => Some("shell_command"),
}
}
@@ -1095,7 +1168,7 @@ mod tests {
}
#[test]
fn test_full_toolset_specs_for_gpt5_codex() {
fn test_full_toolset_specs_for_gpt5_codex_unified_exec_web_search() {
let model_family = find_family_for_model("gpt-5-codex")
.expect("gpt-5-codex should be a valid model family");
let mut features = Features::with_defaults();
@@ -1129,7 +1202,6 @@ mod tests {
for spec in [
create_exec_command_tool(),
create_write_stdin_tool(),
create_shell_tool(),
create_list_mcp_resources_tool(),
create_list_mcp_resource_templates_tool(),
create_read_mcp_resource_tool(),
@@ -1156,32 +1228,106 @@ mod tests {
}
}
#[test]
fn test_build_specs_contains_expected_basics() {
let model_family = find_family_for_model("codex-mini-latest")
.expect("codex-mini-latest should be a valid model family");
let mut features = Features::with_defaults();
features.enable(Feature::WebSearchRequest);
features.enable(Feature::UnifiedExec);
fn assert_model_tools(model_family: &str, features: &Features, expected_tools: &[&str]) {
let model_family = find_family_for_model(model_family)
.unwrap_or_else(|| panic!("{model_family} should be a valid model family"));
let config = ToolsConfig::new(&ToolsConfigParams {
model_family: &model_family,
features: &features,
features,
});
let (tools, _) = build_specs(&config, Some(HashMap::new())).build();
let tool_names = tools.iter().map(|t| t.spec.name()).collect::<Vec<_>>();
assert_eq!(
&tool_names,
assert_eq!(&tool_names, &expected_tools,);
}
#[test]
fn test_build_specs_gpt5_codex_default() {
assert_model_tools(
"gpt-5-codex",
&Features::with_defaults(),
&[
"shell",
"list_mcp_resources",
"list_mcp_resource_templates",
"read_mcp_resource",
"update_plan",
"apply_patch",
"view_image",
],
);
}
#[test]
fn test_build_specs_gpt5_codex_unified_exec_web_search() {
assert_model_tools(
"gpt-5-codex",
Features::with_defaults()
.enable(Feature::UnifiedExec)
.enable(Feature::WebSearchRequest),
&[
"exec_command",
"write_stdin",
"list_mcp_resources",
"list_mcp_resource_templates",
"read_mcp_resource",
"update_plan",
"apply_patch",
"web_search",
"view_image",
],
);
}
#[test]
fn test_codex_mini_defaults() {
assert_model_tools(
"codex-mini-latest",
&Features::with_defaults(),
&[
"local_shell",
"list_mcp_resources",
"list_mcp_resource_templates",
"read_mcp_resource",
"update_plan",
"view_image",
],
);
}
#[test]
fn test_porcupine_defaults() {
assert_model_tools(
"porcupine",
&Features::with_defaults(),
&[
"exec_command",
"write_stdin",
"list_mcp_resources",
"list_mcp_resource_templates",
"read_mcp_resource",
"update_plan",
"view_image",
],
);
}
#[test]
fn test_codex_mini_unified_exec_web_search() {
assert_model_tools(
"codex-mini-latest",
Features::with_defaults()
.enable(Feature::UnifiedExec)
.enable(Feature::WebSearchRequest),
&[
"exec_command",
"write_stdin",
"list_mcp_resources",
"list_mcp_resource_templates",
"read_mcp_resource",
"update_plan",
"web_search",
"view_image",
]
],
);
}
@@ -1205,6 +1351,22 @@ mod tests {
assert_contains_tool_names(&tools, &subset);
}
#[test]
fn test_build_specs_shell_command_present() {
assert_model_tools(
"codex-mini-latest",
Features::with_defaults().enable(Feature::ShellCommandTool),
&[
"shell_command",
"list_mcp_resources",
"list_mcp_resource_templates",
"read_mcp_resource",
"update_plan",
"view_image",
],
);
}
#[test]
#[ignore]
fn test_parallel_support_flags() {
@@ -1660,6 +1822,21 @@ mod tests {
assert_eq!(description, expected);
}
#[test]
fn test_shell_command_tool() {
let tool = super::create_shell_command_tool();
let ToolSpec::Function(ResponsesApiTool {
description, name, ..
}) = &tool
else {
panic!("expected function tool");
};
assert_eq!(name, "shell_command");
let expected = "Runs a shell command string and returns its output.";
assert_eq!(description, expected);
}
#[test]
fn test_get_openai_tools_mcp_tools_with_additional_properties_schema() {
let model_family = find_family_for_model("gpt-5-codex")

View File

@@ -70,6 +70,9 @@ pub(crate) struct ExecCommandRequest<'a> {
pub login: bool,
pub yield_time_ms: Option<u64>,
pub max_output_tokens: Option<usize>,
pub workdir: Option<PathBuf>,
pub with_escalated_permissions: Option<bool>,
pub justification: Option<String>,
}
#[derive(Debug)]
@@ -199,6 +202,9 @@ mod tests {
login: true,
yield_time_ms,
max_output_tokens: None,
workdir: None,
with_escalated_permissions: None,
justification: None,
},
&context,
)

View File

@@ -1,3 +1,4 @@
use std::path::PathBuf;
use std::sync::Arc;
use tokio::sync::Notify;
@@ -38,6 +39,10 @@ impl UnifiedExecSessionManager {
request: ExecCommandRequest<'_>,
context: &UnifiedExecContext,
) -> Result<UnifiedExecResponse, UnifiedExecError> {
let cwd = request
.workdir
.clone()
.unwrap_or_else(|| context.turn.cwd.clone());
let shell_flag = if request.login { "-lc" } else { "-c" };
let command = vec![
request.shell.to_string(),
@@ -45,7 +50,15 @@ impl UnifiedExecSessionManager {
request.command.to_string(),
];
let session = self.open_session_with_sandbox(command, context).await?;
let session = self
.open_session_with_sandbox(
command,
cwd.clone(),
request.with_escalated_permissions,
request.justification,
context,
)
.await?;
let max_tokens = resolve_max_tokens(request.max_output_tokens);
let yield_time_ms =
@@ -66,7 +79,7 @@ impl UnifiedExecSessionManager {
None
} else {
Some(
self.store_session(session, context, request.command, start)
self.store_session(session, context, request.command, cwd.clone(), start)
.await,
)
};
@@ -87,6 +100,7 @@ impl UnifiedExecSessionManager {
Self::emit_exec_end_from_context(
context,
request.command.to_string(),
cwd,
response.output.clone(),
exit,
response.wall_time,
@@ -211,6 +225,7 @@ impl UnifiedExecSessionManager {
session: UnifiedExecSession,
context: &UnifiedExecContext,
command: &str,
cwd: PathBuf,
started_at: Instant,
) -> i32 {
let session_id = self
@@ -222,7 +237,7 @@ impl UnifiedExecSessionManager {
turn_ref: Arc::clone(&context.turn),
call_id: context.call_id.clone(),
command: command.to_string(),
cwd: context.turn.cwd.clone(),
cwd,
started_at,
};
self.sessions.lock().await.insert(session_id, entry);
@@ -258,6 +273,7 @@ impl UnifiedExecSessionManager {
async fn emit_exec_end_from_context(
context: &UnifiedExecContext,
command: String,
cwd: PathBuf,
aggregated_output: String,
exit_code: i32,
duration: Duration,
@@ -276,7 +292,7 @@ impl UnifiedExecSessionManager {
&context.call_id,
None,
);
let emitter = ToolEmitter::unified_exec(command, context.turn.cwd.clone(), true);
let emitter = ToolEmitter::unified_exec(command, cwd, true);
emitter
.emit(event_ctx, ToolEventStage::Success(output))
.await;
@@ -290,24 +306,35 @@ impl UnifiedExecSessionManager {
.command
.split_first()
.ok_or(UnifiedExecError::MissingCommandLine)?;
let spawned =
codex_utils_pty::spawn_pty_process(program, args, env.cwd.as_path(), &env.env)
.await
.map_err(|err| UnifiedExecError::create_session(err.to_string()))?;
let spawned = codex_utils_pty::spawn_pty_process(
program,
args,
env.cwd.as_path(),
&env.env,
&env.arg0,
)
.await
.map_err(|err| UnifiedExecError::create_session(err.to_string()))?;
UnifiedExecSession::from_spawned(spawned, env.sandbox).await
}
pub(super) async fn open_session_with_sandbox(
&self,
command: Vec<String>,
cwd: PathBuf,
with_escalated_permissions: Option<bool>,
justification: Option<String>,
context: &UnifiedExecContext,
) -> Result<UnifiedExecSession, UnifiedExecError> {
let mut orchestrator = ToolOrchestrator::new();
let mut runtime = UnifiedExecRuntime::new(self);
let req = UnifiedExecToolRequest::new(
command,
context.turn.cwd.clone(),
cwd,
create_env(&context.turn.shell_environment_policy),
with_escalated_permissions,
justification,
);
let tool_ctx = ToolCtx {
session: context.session.as_ref(),

View File

@@ -0,0 +1,108 @@
use std::time::Duration;
use codex_protocol::models::ContentItem;
use codex_protocol::models::ResponseItem;
use crate::exec::ExecToolCallOutput;
use crate::tools::format_exec_output_str;
pub const USER_SHELL_COMMAND_OPEN: &str = "<user_shell_command>";
pub const USER_SHELL_COMMAND_CLOSE: &str = "</user_shell_command>";
pub fn is_user_shell_command_text(text: &str) -> bool {
let trimmed = text.trim_start();
let lowered = trimmed.to_ascii_lowercase();
lowered.starts_with(USER_SHELL_COMMAND_OPEN)
}
fn format_duration_line(duration: Duration) -> String {
let duration_seconds = duration.as_secs_f64();
format!("Duration: {duration_seconds:.4} seconds")
}
fn format_user_shell_command_body(command: &str, exec_output: &ExecToolCallOutput) -> String {
let mut sections = Vec::new();
sections.push("<command>".to_string());
sections.push(command.to_string());
sections.push("</command>".to_string());
sections.push("<result>".to_string());
sections.push(format!("Exit code: {}", exec_output.exit_code));
sections.push(format_duration_line(exec_output.duration));
sections.push("Output:".to_string());
sections.push(format_exec_output_str(exec_output));
sections.push("</result>".to_string());
sections.join("\n")
}
pub fn format_user_shell_command_record(command: &str, exec_output: &ExecToolCallOutput) -> String {
let body = format_user_shell_command_body(command, exec_output);
format!("{USER_SHELL_COMMAND_OPEN}\n{body}\n{USER_SHELL_COMMAND_CLOSE}")
}
pub fn user_shell_command_record_item(
command: &str,
exec_output: &ExecToolCallOutput,
) -> ResponseItem {
ResponseItem::Message {
id: None,
role: "user".to_string(),
content: vec![ContentItem::InputText {
text: format_user_shell_command_record(command, exec_output),
}],
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::exec::StreamOutput;
use pretty_assertions::assert_eq;
#[test]
fn detects_user_shell_command_text_variants() {
assert!(is_user_shell_command_text(
"<user_shell_command>\necho hi\n</user_shell_command>"
));
assert!(!is_user_shell_command_text("echo hi"));
}
#[test]
fn formats_basic_record() {
let exec_output = ExecToolCallOutput {
exit_code: 0,
stdout: StreamOutput::new("hi".to_string()),
stderr: StreamOutput::new(String::new()),
aggregated_output: StreamOutput::new("hi".to_string()),
duration: Duration::from_secs(1),
timed_out: false,
};
let item = user_shell_command_record_item("echo hi", &exec_output);
let ResponseItem::Message { content, .. } = item else {
panic!("expected message");
};
let [ContentItem::InputText { text }] = content.as_slice() else {
panic!("expected input text");
};
assert_eq!(
text,
"<user_shell_command>\n<command>\necho hi\n</command>\n<result>\nExit code: 0\nDuration: 1.0000 seconds\nOutput:\nhi\n</result>\n</user_shell_command>"
);
}
#[test]
fn uses_aggregated_output_over_streams() {
let exec_output = ExecToolCallOutput {
exit_code: 42,
stdout: StreamOutput::new("stdout-only".to_string()),
stderr: StreamOutput::new("stderr-only".to_string()),
aggregated_output: StreamOutput::new("combined output wins".to_string()),
duration: Duration::from_millis(120),
timed_out: false,
};
let record = format_user_shell_command_record("false", &exec_output);
assert_eq!(
record,
"<user_shell_command>\n<command>\nfalse\n</command>\n<result>\nExit code: 42\nDuration: 0.1200 seconds\nOutput:\ncombined output wins\n</result>\n</user_shell_command>"
);
}
}

View File

@@ -61,6 +61,18 @@ impl ResponsesRequest {
self.0.body_json().unwrap()
}
/// Returns all `input_text` spans from `message` inputs for the provided role.
pub fn message_input_texts(&self, role: &str) -> Vec<String> {
self.inputs_of_type("message")
.into_iter()
.filter(|item| item.get("role").and_then(Value::as_str) == Some(role))
.filter_map(|item| item.get("content").and_then(Value::as_array).cloned())
.flatten()
.filter(|span| span.get("type").and_then(Value::as_str) == Some("input_text"))
.filter_map(|span| span.get("text").and_then(Value::as_str).map(str::to_owned))
.collect()
}
pub fn input(&self) -> Vec<Value> {
self.0.body_json::<Value>().unwrap()["input"]
.as_array()
@@ -434,12 +446,6 @@ pub async fn mount_sse_once(server: &MockServer, body: String) -> ResponseMock {
response_mock
}
pub async fn mount_sse(server: &MockServer, body: String) -> ResponseMock {
let (mock, response_mock) = base_mock();
mock.respond_with(sse_response(body)).mount(server).await;
response_mock
}
pub async fn start_mock_server() -> MockServer {
MockServer::builder()
.body_print_limit(BodyPrintLimit::Limited(80_000))

View File

@@ -13,7 +13,7 @@ use core_test_support::responses::mount_sse_sequence;
use core_test_support::responses::sse;
use core_test_support::responses::start_mock_server;
use core_test_support::test_codex::test_codex;
use core_test_support::wait_for_event_with_timeout;
use core_test_support::wait_for_event;
use regex_lite::Regex;
use serde_json::json;
@@ -42,8 +42,6 @@ async fn interrupt_long_running_tool_emits_turn_aborted() {
let codex = test_codex().build(&server).await.unwrap().codex;
let wait_timeout = Duration::from_secs(5);
// Kick off a turn that triggers the function call.
codex
.submit(Op::UserInput {
@@ -55,22 +53,12 @@ async fn interrupt_long_running_tool_emits_turn_aborted() {
.unwrap();
// Wait until the exec begins to avoid a race, then interrupt.
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::ExecCommandBegin(_)),
wait_timeout,
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::ExecCommandBegin(_))).await;
codex.submit(Op::Interrupt).await.unwrap();
// Expect TurnAborted soon after.
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TurnAborted(_)),
wait_timeout,
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TurnAborted(_))).await;
}
/// After an interrupt we expect the next request to the model to include both
@@ -107,8 +95,6 @@ async fn interrupt_tool_records_history_entries() {
let fixture = test_codex().build(&server).await.unwrap();
let codex = Arc::clone(&fixture.codex);
let wait_timeout = Duration::from_millis(100);
codex
.submit(Op::UserInput {
items: vec![UserInput::Text {
@@ -118,22 +104,12 @@ async fn interrupt_tool_records_history_entries() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::ExecCommandBegin(_)),
wait_timeout,
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::ExecCommandBegin(_))).await;
tokio::time::sleep(Duration::from_secs_f32(0.1)).await;
codex.submit(Op::Interrupt).await.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TurnAborted(_)),
wait_timeout,
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TurnAborted(_))).await;
codex
.submit(Op::UserInput {
@@ -144,12 +120,7 @@ async fn interrupt_tool_records_history_entries() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TaskComplete(_)),
wait_timeout,
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
let requests = response_mock.requests();
assert!(

View File

@@ -1,6 +1,7 @@
#![allow(clippy::unwrap_used, clippy::expect_used)]
use anyhow::Result;
use codex_core::features::Feature;
use codex_core::model_family::find_family_for_model;
use codex_core::protocol::ApplyPatchApprovalRequestEvent;
use codex_core::protocol::AskForApproval;
@@ -23,14 +24,13 @@ use core_test_support::skip_if_no_network;
use core_test_support::test_codex::TestCodex;
use core_test_support::test_codex::test_codex;
use core_test_support::wait_for_event;
use core_test_support::wait_for_event_with_timeout;
use pretty_assertions::assert_eq;
use regex_lite::Regex;
use serde_json::Value;
use serde_json::json;
use std::env;
use std::fs;
use std::path::PathBuf;
use std::time::Duration;
use wiremock::Mock;
use wiremock::MockServer;
use wiremock::ResponseTemplate;
@@ -73,6 +73,10 @@ enum ActionKind {
RunCommand {
command: &'static [&'static str],
},
RunUnifiedExecCommand {
command: &'static str,
justification: Option<&'static str>,
},
ApplyPatchFunction {
target: TargetPath,
content: &'static str,
@@ -83,6 +87,9 @@ enum ActionKind {
},
}
const DEFAULT_UNIFIED_EXEC_JUSTIFICATION: &str =
"Requires escalated permissions to bypass the sandbox in tests.";
impl ActionKind {
async fn prepare(
&self,
@@ -136,6 +143,26 @@ impl ActionKind {
let event = shell_event(call_id, &command, 1_000, with_escalated_permissions)?;
Ok((event, Some(command)))
}
ActionKind::RunUnifiedExecCommand {
command,
justification,
} => {
let event = exec_command_event(
call_id,
command,
Some(1000),
with_escalated_permissions,
*justification,
)?;
Ok((
event,
Some(vec![
"/bin/bash".to_string(),
"-lc".to_string(),
command.to_string(),
]),
))
}
ActionKind::ApplyPatchFunction { target, content } => {
let (path, patch_path) = target.resolve_for_patch(test);
let _ = fs::remove_file(&path);
@@ -185,6 +212,28 @@ fn shell_event(
Ok(ev_function_call(call_id, "shell", &args_str))
}
fn exec_command_event(
call_id: &str,
cmd: &str,
yield_time_ms: Option<u64>,
with_escalated_permissions: bool,
justification: Option<&str>,
) -> Result<Value> {
let mut args = json!({
"cmd": cmd.to_string(),
});
if let Some(yield_time_ms) = yield_time_ms {
args["yield_time_ms"] = json!(yield_time_ms);
}
if with_escalated_permissions {
args["with_escalated_permissions"] = json!(true);
let reason = justification.unwrap_or(DEFAULT_UNIFIED_EXEC_JUSTIFICATION);
args["justification"] = json!(reason);
}
let args_str = serde_json::to_string(&args)?;
Ok(ev_function_call(call_id, "exec_command", &args_str))
}
#[derive(Clone)]
enum Expectation {
FileCreated {
@@ -208,6 +257,9 @@ enum Expectation {
CommandSuccess {
stdout_contains: &'static str,
},
CommandFailure {
output_contains: &'static str,
},
}
impl Expectation {
@@ -339,6 +391,19 @@ impl Expectation {
result.stdout
);
}
Expectation::CommandFailure { output_contains } => {
assert_ne!(
result.exit_code,
Some(0),
"expected non-zero exit for command failure: {}",
result.stdout
);
assert!(
result.stdout.contains(output_contains),
"command failure stderr missing {output_contains:?}: {}",
result.stdout
);
}
}
Ok(())
}
@@ -364,7 +429,7 @@ struct ScenarioSpec {
sandbox_policy: SandboxPolicy,
action: ActionKind,
with_escalated_permissions: bool,
requires_apply_patch_tool: bool,
features: Vec<Feature>,
model_override: Option<&'static str>,
outcome: Outcome,
expectation: Expectation,
@@ -412,10 +477,24 @@ fn parse_result(item: &Value) -> CommandResult {
let stdout = parsed["output"].as_str().unwrap_or_default().to_string();
CommandResult { exit_code, stdout }
}
Err(_) => CommandResult {
exit_code: None,
stdout: output_str.to_string(),
},
Err(_) => {
let regex =
Regex::new(r"(?s)^.*?Process exited with code (\d+)\n.*?Output:\n(.*)$").unwrap();
// parse freeform output
if let Some(captures) = regex.captures(output_str) {
let exit_code = captures.get(1).unwrap().as_str().parse::<i64>().unwrap();
let output = captures.get(2).unwrap().as_str();
CommandResult {
exit_code: Some(exit_code),
stdout: output.to_string(),
}
} else {
CommandResult {
exit_code: None,
stdout: output_str.to_string(),
}
}
}
}
}
@@ -423,16 +502,12 @@ async fn expect_exec_approval(
test: &TestCodex,
expected_command: &[String],
) -> ExecApprovalRequestEvent {
let event = wait_for_event_with_timeout(
&test.codex,
|event| {
matches!(
event,
EventMsg::ExecApprovalRequest(_) | EventMsg::TaskComplete(_)
)
},
Duration::from_secs(5),
)
let event = wait_for_event(&test.codex, |event| {
matches!(
event,
EventMsg::ExecApprovalRequest(_) | EventMsg::TaskComplete(_)
)
})
.await;
match event {
@@ -449,16 +524,12 @@ async fn expect_patch_approval(
test: &TestCodex,
expected_call_id: &str,
) -> ApplyPatchApprovalRequestEvent {
let event = wait_for_event_with_timeout(
&test.codex,
|event| {
matches!(
event,
EventMsg::ApplyPatchApprovalRequest(_) | EventMsg::TaskComplete(_)
)
},
Duration::from_secs(5),
)
let event = wait_for_event(&test.codex, |event| {
matches!(
event,
EventMsg::ApplyPatchApprovalRequest(_) | EventMsg::TaskComplete(_)
)
})
.await;
match event {
@@ -472,16 +543,12 @@ async fn expect_patch_approval(
}
async fn wait_for_completion_without_approval(test: &TestCodex) {
let event = wait_for_event_with_timeout(
&test.codex,
|event| {
matches!(
event,
EventMsg::ExecApprovalRequest(_) | EventMsg::TaskComplete(_)
)
},
Duration::from_secs(5),
)
let event = wait_for_event(&test.codex, |event| {
matches!(
event,
EventMsg::ExecApprovalRequest(_) | EventMsg::TaskComplete(_)
)
})
.await;
match event {
@@ -520,7 +587,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "danger-on-request",
},
with_escalated_permissions: false,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::Auto,
expectation: Expectation::FileCreated {
@@ -537,7 +604,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
response_body: "danger-network-ok",
},
with_escalated_permissions: false,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::Auto,
expectation: Expectation::NetworkSuccess {
@@ -552,7 +619,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
command: &["echo", "trusted-unless"],
},
with_escalated_permissions: false,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::Auto,
expectation: Expectation::CommandSuccess {
@@ -568,7 +635,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "danger-on-failure",
},
with_escalated_permissions: false,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::Auto,
expectation: Expectation::FileCreated {
@@ -585,7 +652,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "danger-unless-trusted",
},
with_escalated_permissions: false,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::ExecApproval {
decision: ReviewDecision::Approved,
@@ -605,7 +672,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "danger-never",
},
with_escalated_permissions: false,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::Auto,
expectation: Expectation::FileCreated {
@@ -622,7 +689,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "read-only-approval",
},
with_escalated_permissions: true,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::ExecApproval {
decision: ReviewDecision::Approved,
@@ -641,7 +708,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
command: &["echo", "trusted-read-only"],
},
with_escalated_permissions: false,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::Auto,
expectation: Expectation::CommandSuccess {
@@ -657,7 +724,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
response_body: "should-not-see",
},
with_escalated_permissions: false,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::Auto,
expectation: Expectation::NetworkFailure { expect_tag: "ERR:" },
@@ -671,7 +738,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "should-not-write",
},
with_escalated_permissions: true,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::ExecApproval {
decision: ReviewDecision::Denied,
@@ -692,7 +759,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "read-only-on-failure",
},
with_escalated_permissions: false,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::ExecApproval {
decision: ReviewDecision::Approved,
@@ -712,7 +779,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
response_body: "read-only-network-ok",
},
with_escalated_permissions: true,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::ExecApproval {
decision: ReviewDecision::Approved,
@@ -731,7 +798,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "shell-apply-patch",
},
with_escalated_permissions: false,
requires_apply_patch_tool: true,
features: vec![],
model_override: None,
outcome: Outcome::PatchApproval {
decision: ReviewDecision::Approved,
@@ -751,7 +818,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "function-apply-patch",
},
with_escalated_permissions: false,
requires_apply_patch_tool: true,
features: vec![],
model_override: Some("gpt-5-codex"),
outcome: Outcome::Auto,
expectation: Expectation::PatchApplied {
@@ -768,7 +835,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "function-patch-danger",
},
with_escalated_permissions: false,
requires_apply_patch_tool: true,
features: vec![Feature::ApplyPatchFreeform],
model_override: Some("gpt-5-codex"),
outcome: Outcome::Auto,
expectation: Expectation::PatchApplied {
@@ -785,7 +852,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "function-patch-outside",
},
with_escalated_permissions: false,
requires_apply_patch_tool: true,
features: vec![],
model_override: Some("gpt-5-codex"),
outcome: Outcome::PatchApproval {
decision: ReviewDecision::Approved,
@@ -805,7 +872,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "function-patch-outside-denied",
},
with_escalated_permissions: false,
requires_apply_patch_tool: true,
features: vec![],
model_override: Some("gpt-5-codex"),
outcome: Outcome::PatchApproval {
decision: ReviewDecision::Denied,
@@ -825,7 +892,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "shell-patch-outside",
},
with_escalated_permissions: false,
requires_apply_patch_tool: true,
features: vec![],
model_override: None,
outcome: Outcome::PatchApproval {
decision: ReviewDecision::Approved,
@@ -845,7 +912,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "function-patch-unless-trusted",
},
with_escalated_permissions: false,
requires_apply_patch_tool: true,
features: vec![],
model_override: Some("gpt-5-codex"),
outcome: Outcome::PatchApproval {
decision: ReviewDecision::Approved,
@@ -865,7 +932,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "function-patch-never",
},
with_escalated_permissions: false,
requires_apply_patch_tool: true,
features: vec![],
model_override: Some("gpt-5-codex"),
outcome: Outcome::Auto,
expectation: Expectation::FileNotCreated {
@@ -884,7 +951,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "read-only-unless-trusted",
},
with_escalated_permissions: false,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::ExecApproval {
decision: ReviewDecision::Approved,
@@ -904,7 +971,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "read-only-never",
},
with_escalated_permissions: false,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::Auto,
expectation: Expectation::FileNotCreated {
@@ -924,7 +991,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
command: &["echo", "trusted-never"],
},
with_escalated_permissions: false,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::Auto,
expectation: Expectation::CommandSuccess {
@@ -940,7 +1007,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "workspace-on-request",
},
with_escalated_permissions: false,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::Auto,
expectation: Expectation::FileCreated {
@@ -957,7 +1024,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
response_body: "workspace-network-blocked",
},
with_escalated_permissions: false,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::Auto,
expectation: Expectation::NetworkFailure { expect_tag: "ERR:" },
@@ -971,7 +1038,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "workspace-on-request-outside",
},
with_escalated_permissions: true,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::ExecApproval {
decision: ReviewDecision::Approved,
@@ -991,7 +1058,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
response_body: "workspace-network-ok",
},
with_escalated_permissions: false,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::Auto,
expectation: Expectation::NetworkSuccess {
@@ -1008,7 +1075,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "workspace-on-failure",
},
with_escalated_permissions: false,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::ExecApproval {
decision: ReviewDecision::Approved,
@@ -1028,7 +1095,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "workspace-unless-trusted",
},
with_escalated_permissions: false,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::ExecApproval {
decision: ReviewDecision::Approved,
@@ -1048,7 +1115,7 @@ fn scenarios() -> Vec<ScenarioSpec> {
content: "workspace-never",
},
with_escalated_permissions: false,
requires_apply_patch_tool: false,
features: vec![],
model_override: None,
outcome: Outcome::Auto,
expectation: Expectation::FileNotCreated {
@@ -1060,6 +1127,62 @@ fn scenarios() -> Vec<ScenarioSpec> {
},
},
},
ScenarioSpec {
name: "unified exec on request no approval for safe command",
approval_policy: OnRequest,
sandbox_policy: SandboxPolicy::DangerFullAccess,
action: ActionKind::RunUnifiedExecCommand {
command: "echo \"hello unified exec\"",
justification: None,
},
with_escalated_permissions: false,
features: vec![Feature::UnifiedExec],
model_override: None,
outcome: Outcome::Auto,
expectation: Expectation::CommandSuccess {
stdout_contains: "hello unified exec",
},
},
#[cfg(not(all(target_os = "linux", target_arch = "aarch64")))]
// Linux sandbox arg0 test workaround doesn't work on ARM
ScenarioSpec {
name: "unified exec on request escalated requires approval",
approval_policy: OnRequest,
sandbox_policy: SandboxPolicy::ReadOnly,
action: ActionKind::RunUnifiedExecCommand {
command: "python3 -c 'print('\"'\"'escalated unified exec'\"'\"')'",
justification: Some(DEFAULT_UNIFIED_EXEC_JUSTIFICATION),
},
with_escalated_permissions: true,
features: vec![Feature::UnifiedExec],
model_override: None,
outcome: Outcome::ExecApproval {
decision: ReviewDecision::Approved,
expected_reason: Some(DEFAULT_UNIFIED_EXEC_JUSTIFICATION),
},
expectation: Expectation::CommandSuccess {
stdout_contains: "escalated unified exec",
},
},
ScenarioSpec {
name: "unified exec on request requires approval unless trusted",
approval_policy: AskForApproval::UnlessTrusted,
sandbox_policy: SandboxPolicy::DangerFullAccess,
action: ActionKind::RunUnifiedExecCommand {
command: "git reset --hard",
justification: None,
},
with_escalated_permissions: false,
features: vec![Feature::UnifiedExec],
model_override: None,
outcome: Outcome::ExecApproval {
decision: ReviewDecision::Denied,
expected_reason: None,
},
expectation: Expectation::CommandFailure {
output_contains: "rejected by user",
},
},
]
}
@@ -1079,7 +1202,7 @@ async fn run_scenario(scenario: &ScenarioSpec) -> Result<()> {
let server = start_mock_server().await;
let approval_policy = scenario.approval_policy;
let sandbox_policy = scenario.sandbox_policy.clone();
let requires_apply_patch_tool = scenario.requires_apply_patch_tool;
let features = scenario.features.clone();
let model_override = scenario.model_override;
let mut builder = test_codex().with_config(move |config| {
@@ -1089,8 +1212,8 @@ async fn run_scenario(scenario: &ScenarioSpec) -> Result<()> {
config.model = model.to_string();
config.model_family =
find_family_for_model(model).expect("model should map to a known family");
if requires_apply_patch_tool {
config.include_apply_patch_tool = true;
for feature in features {
config.features.enable(feature);
}
});
let test = builder.build(&server).await?;

View File

@@ -32,7 +32,6 @@ use core_test_support::skip_if_no_network;
use core_test_support::test_codex::TestCodex;
use core_test_support::test_codex::test_codex;
use core_test_support::wait_for_event;
use core_test_support::wait_for_event_with_timeout;
use futures::StreamExt;
use serde_json::json;
use std::io::Write;
@@ -1117,26 +1116,20 @@ async fn context_window_error_sets_total_tokens_to_model_window() -> anyhow::Res
})
.await?;
use std::time::Duration;
let token_event = wait_for_event_with_timeout(
&codex,
|event| {
matches!(
event,
EventMsg::TokenCount(payload)
if payload.info.as_ref().is_some_and(|info| {
info.model_context_window == Some(info.total_token_usage.total_tokens)
&& info.total_token_usage.total_tokens > 0
})
)
},
Duration::from_secs(5),
)
let token_event = wait_for_event(&codex, |event| {
matches!(
event,
EventMsg::TokenCount(payload)
if payload.info.as_ref().is_some_and(|info| {
info.model_context_window == Some(info.total_token_usage.total_tokens)
&& info.total_token_usage.total_tokens > 0
})
)
})
.await;
let EventMsg::TokenCount(token_payload) = token_event else {
unreachable!("wait_for_event_with_timeout returned unexpected event");
unreachable!("wait_for_event returned unexpected event");
};
let info = token_payload

View File

@@ -54,7 +54,7 @@ const COMPACT_PROMPT_MARKER: &str =
pub(super) const TEST_COMPACT_PROMPT: &str =
"You are performing a CONTEXT CHECKPOINT COMPACTION for a tool.\nTest-only compact prompt.";
pub(super) const COMPACT_WARNING_MESSAGE: &str = "Heads up: Long conversations and multiple compactions can cause the model to be less accurate. Start new a new conversation when possible to keep conversations small and targeted.";
pub(super) const COMPACT_WARNING_MESSAGE: &str = "Heads up: Long conversations and multiple compactions can cause the model to be less accurate. Start a new conversation when possible to keep conversations small and targeted.";
fn auto_summary(summary: &str) -> String {
summary.to_string()

View File

@@ -18,12 +18,11 @@ async fn emits_deprecation_notice_for_legacy_feature_flag() -> anyhow::Result<()
let server = start_mock_server().await;
let mut builder = test_codex().with_config(|config| {
config.features.enable(Feature::StreamableShell);
config.features.record_legacy_usage_force(
"experimental_use_exec_command_tool",
Feature::StreamableShell,
);
config.use_experimental_streamable_shell_tool = true;
config.features.enable(Feature::UnifiedExec);
config
.features
.record_legacy_usage_force("use_experimental_unified_exec_tool", Feature::UnifiedExec);
config.use_experimental_unified_exec_tool = true;
});
let TestCodex { codex, .. } = builder.build(&server).await?;
@@ -37,13 +36,13 @@ async fn emits_deprecation_notice_for_legacy_feature_flag() -> anyhow::Result<()
let DeprecationNoticeEvent { summary, details } = notice;
assert_eq!(
summary,
"`experimental_use_exec_command_tool` is deprecated. Use `streamable_shell` instead."
"`use_experimental_unified_exec_tool` is deprecated. Use `unified_exec` instead."
.to_string(),
);
assert_eq!(
details.as_deref(),
Some(
"Enable it with `--enable streamable_shell` or `[features].streamable_shell` in config.toml. See https://github.com/openai/codex/blob/main/docs/config.md#feature-flags for details."
"Enable it with `--enable unified_exec` or `[features].unified_exec` in config.toml. See https://github.com/openai/codex/blob/main/docs/config.md#feature-flags for details."
),
);

View File

@@ -1,4 +1,17 @@
// Aggregates all former standalone integration tests as modules.
use codex_arg0::arg0_dispatch;
use ctor::ctor;
use tempfile::TempDir;
// This code runs before any other tests are run.
// It allows the test binary to behave like codex and dispatch to apply_patch and codex-linux-sandbox
// based on the arg0.
// NOTE: this doesn't work on ARM
#[ctor]
pub static CODEX_ALIASES_TEMP_DIR: TempDir = unsafe {
#[allow(clippy::unwrap_used)]
arg0_dispatch().unwrap()
};
#[cfg(not(target_os = "windows"))]
mod abort_tasks;
@@ -26,6 +39,7 @@ mod model_overrides;
mod model_tools;
mod otel;
mod prompt_caching;
mod quota_exceeded;
mod read_file;
mod resume;
mod review;

View File

@@ -60,7 +60,6 @@ async fn collect_tool_identifiers_for_model(model: &str) -> Vec<String> {
config.features.disable(Feature::ApplyPatchFreeform);
config.features.disable(Feature::ViewImageTool);
config.features.disable(Feature::WebSearchRequest);
config.features.disable(Feature::StreamableShell);
config.features.disable(Feature::UnifiedExec);
let conversation_manager =

View File

@@ -1,3 +1,4 @@
use codex_core::features::Feature;
use codex_protocol::protocol::AskForApproval;
use codex_protocol::protocol::EventMsg;
use codex_protocol::protocol::Op;
@@ -8,14 +9,12 @@ use core_test_support::responses::ev_assistant_message;
use core_test_support::responses::ev_completed;
use core_test_support::responses::ev_custom_tool_call;
use core_test_support::responses::ev_function_call;
use core_test_support::responses::mount_sse;
use core_test_support::responses::mount_sse_once;
use core_test_support::responses::sse;
use core_test_support::responses::start_mock_server;
use core_test_support::test_codex::TestCodex;
use core_test_support::test_codex::test_codex;
use core_test_support::wait_for_event_with_timeout;
use std::time::Duration;
use core_test_support::wait_for_event;
use tracing_test::traced_test;
use core_test_support::responses::ev_local_shell_call;
@@ -38,12 +37,7 @@ async fn responses_api_emits_api_request_event() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TaskComplete(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
logs_assert(|lines: &[&str]| {
lines
@@ -84,12 +78,7 @@ async fn process_sse_emits_tracing_for_output_item() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TaskComplete(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
logs_assert(|lines: &[&str]| {
lines
@@ -112,8 +101,7 @@ async fn process_sse_emits_failed_event_on_parse_error() {
let TestCodex { codex, .. } = test_codex()
.with_config(move |config| {
config.model_provider.request_max_retries = Some(0);
config.model_provider.stream_max_retries = Some(0);
config.features.disable(Feature::GhostCommit);
})
.build(&server)
.await
@@ -128,12 +116,7 @@ async fn process_sse_emits_failed_event_on_parse_error() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TaskComplete(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
logs_assert(|lines: &[&str]| {
lines
@@ -157,8 +140,7 @@ async fn process_sse_records_failed_event_when_stream_closes_without_completed()
let TestCodex { codex, .. } = test_codex()
.with_config(move |config| {
config.model_provider.request_max_retries = Some(0);
config.model_provider.stream_max_retries = Some(0);
config.features.disable(Feature::GhostCommit);
})
.build(&server)
.await
@@ -173,12 +155,7 @@ async fn process_sse_records_failed_event_when_stream_closes_without_completed()
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TaskComplete(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
logs_assert(|lines: &[&str]| {
lines
@@ -211,11 +188,18 @@ async fn process_sse_failed_event_records_response_error_message() {
})]),
)
.await;
mount_sse_once(
&server,
sse(vec![
ev_assistant_message("msg-1", "local shell done"),
ev_completed("done"),
]),
)
.await;
let TestCodex { codex, .. } = test_codex()
.with_config(move |config| {
config.model_provider.request_max_retries = Some(0);
config.model_provider.stream_max_retries = Some(0);
config.features.disable(Feature::GhostCommit);
})
.build(&server)
.await
@@ -230,12 +214,7 @@ async fn process_sse_failed_event_records_response_error_message() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TaskComplete(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
logs_assert(|lines: &[&str]| {
lines
@@ -266,11 +245,18 @@ async fn process_sse_failed_event_logs_parse_error() {
})]),
)
.await;
mount_sse_once(
&server,
sse(vec![
ev_assistant_message("msg-1", "local shell done"),
ev_completed("done"),
]),
)
.await;
let TestCodex { codex, .. } = test_codex()
.with_config(move |config| {
config.model_provider.request_max_retries = Some(0);
config.model_provider.stream_max_retries = Some(0);
config.features.disable(Feature::GhostCommit);
})
.build(&server)
.await
@@ -285,12 +271,7 @@ async fn process_sse_failed_event_logs_parse_error() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TaskComplete(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
logs_assert(|lines: &[&str]| {
lines
@@ -319,8 +300,7 @@ async fn process_sse_failed_event_logs_missing_error() {
let TestCodex { codex, .. } = test_codex()
.with_config(move |config| {
config.model_provider.request_max_retries = Some(0);
config.model_provider.stream_max_retries = Some(0);
config.features.disable(Feature::GhostCommit);
})
.build(&server)
.await
@@ -335,12 +315,7 @@ async fn process_sse_failed_event_logs_missing_error() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TaskComplete(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
logs_assert(|lines: &[&str]| {
lines
@@ -367,10 +342,18 @@ async fn process_sse_failed_event_logs_response_completed_parse_error() {
)
.await;
mount_sse_once(
&server,
sse(vec![
ev_assistant_message("msg-1", "local shell done"),
ev_completed("done"),
]),
)
.await;
let TestCodex { codex, .. } = test_codex()
.with_config(move |config| {
config.model_provider.request_max_retries = Some(0);
config.model_provider.stream_max_retries = Some(0);
config.features.disable(Feature::GhostCommit);
})
.build(&server)
.await
@@ -385,12 +368,7 @@ async fn process_sse_failed_event_logs_response_completed_parse_error() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TaskComplete(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
logs_assert(|lines: &[&str]| {
lines
@@ -440,12 +418,7 @@ async fn process_sse_emits_completed_telemetry() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TaskComplete(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
logs_assert(|lines: &[&str]| {
lines
@@ -469,7 +442,7 @@ async fn process_sse_emits_completed_telemetry() {
async fn handle_response_item_records_tool_result_for_custom_tool_call() {
let server = start_mock_server().await;
mount_sse(
mount_sse_once(
&server,
sse(vec![
ev_custom_tool_call(
@@ -481,11 +454,18 @@ async fn handle_response_item_records_tool_result_for_custom_tool_call() {
]),
)
.await;
mount_sse_once(
&server,
sse(vec![
ev_assistant_message("msg-1", "local shell done"),
ev_completed("done"),
]),
)
.await;
let TestCodex { codex, .. } = test_codex()
.with_config(move |config| {
config.model_provider.request_max_retries = Some(0);
config.model_provider.stream_max_retries = Some(0);
config.features.disable(Feature::GhostCommit);
})
.build(&server)
.await
@@ -500,12 +480,7 @@ async fn handle_response_item_records_tool_result_for_custom_tool_call() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TokenCount(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TokenCount(_))).await;
logs_assert(|lines: &[&str]| {
let line = lines
@@ -537,7 +512,7 @@ async fn handle_response_item_records_tool_result_for_custom_tool_call() {
async fn handle_response_item_records_tool_result_for_function_call() {
let server = start_mock_server().await;
mount_sse(
mount_sse_once(
&server,
sse(vec![
ev_function_call("function-call", "nonexistent", "{\"value\":1}"),
@@ -546,10 +521,18 @@ async fn handle_response_item_records_tool_result_for_function_call() {
)
.await;
mount_sse_once(
&server,
sse(vec![
ev_assistant_message("msg-1", "local shell done"),
ev_completed("done"),
]),
)
.await;
let TestCodex { codex, .. } = test_codex()
.with_config(move |config| {
config.model_provider.request_max_retries = Some(0);
config.model_provider.stream_max_retries = Some(0);
config.features.disable(Feature::GhostCommit);
})
.build(&server)
.await
@@ -564,12 +547,7 @@ async fn handle_response_item_records_tool_result_for_function_call() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TokenCount(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TokenCount(_))).await;
logs_assert(|lines: &[&str]| {
let line = lines
@@ -601,7 +579,7 @@ async fn handle_response_item_records_tool_result_for_function_call() {
async fn handle_response_item_records_tool_result_for_local_shell_missing_ids() {
let server = start_mock_server().await;
mount_sse(
mount_sse_once(
&server,
sse(vec![
serde_json::json!({
@@ -620,10 +598,18 @@ async fn handle_response_item_records_tool_result_for_local_shell_missing_ids()
)
.await;
mount_sse_once(
&server,
sse(vec![
ev_assistant_message("msg-1", "local shell done"),
ev_completed("done"),
]),
)
.await;
let TestCodex { codex, .. } = test_codex()
.with_config(move |config| {
config.model_provider.request_max_retries = Some(0);
config.model_provider.stream_max_retries = Some(0);
config.features.disable(Feature::GhostCommit);
})
.build(&server)
.await
@@ -638,12 +624,7 @@ async fn handle_response_item_records_tool_result_for_local_shell_missing_ids()
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TokenCount(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TokenCount(_))).await;
logs_assert(|lines: &[&str]| {
let line = lines
@@ -669,7 +650,7 @@ async fn handle_response_item_records_tool_result_for_local_shell_missing_ids()
async fn handle_response_item_records_tool_result_for_local_shell_call() {
let server = start_mock_server().await;
mount_sse(
mount_sse_once(
&server,
sse(vec![
ev_local_shell_call("shell-call", "completed", vec!["/bin/echo", "shell"]),
@@ -678,10 +659,18 @@ async fn handle_response_item_records_tool_result_for_local_shell_call() {
)
.await;
mount_sse_once(
&server,
sse(vec![
ev_assistant_message("msg-1", "local shell done"),
ev_completed("done"),
]),
)
.await;
let TestCodex { codex, .. } = test_codex()
.with_config(move |config| {
config.model_provider.request_max_retries = Some(0);
config.model_provider.stream_max_retries = Some(0);
config.features.disable(Feature::GhostCommit);
})
.build(&server)
.await
@@ -696,12 +685,7 @@ async fn handle_response_item_records_tool_result_for_local_shell_call() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TokenCount(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TokenCount(_))).await;
logs_assert(|lines: &[&str]| {
let line = lines
@@ -765,10 +749,23 @@ fn tool_decision_assertion<'a>(
#[traced_test]
async fn handle_container_exec_autoapprove_from_config_records_tool_decision() {
let server = start_mock_server().await;
mount_sse(
mount_sse_once(
&server,
sse(vec![
ev_local_shell_call("auto_config_call", "completed", vec!["/bin/echo", "hello"]),
ev_local_shell_call(
"auto_config_call",
"completed",
vec!["/bin/echo", "local shell"],
),
ev_completed("done"),
]),
)
.await;
mount_sse_once(
&server,
sse(vec![
ev_assistant_message("msg-1", "local shell done"),
ev_completed("done"),
]),
)
@@ -778,8 +775,6 @@ async fn handle_container_exec_autoapprove_from_config_records_tool_decision() {
.with_config(|config| {
config.approval_policy = AskForApproval::OnRequest;
config.sandbox_policy = SandboxPolicy::DangerFullAccess;
config.model_provider.request_max_retries = Some(0);
config.model_provider.stream_max_retries = Some(0);
})
.build(&server)
.await
@@ -794,12 +789,7 @@ async fn handle_container_exec_autoapprove_from_config_records_tool_decision() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TokenCount(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
logs_assert(tool_decision_assertion(
"auto_config_call",
@@ -812,7 +802,7 @@ async fn handle_container_exec_autoapprove_from_config_records_tool_decision() {
#[traced_test]
async fn handle_container_exec_user_approved_records_tool_decision() {
let server = start_mock_server().await;
mount_sse(
mount_sse_once(
&server,
sse(vec![
ev_local_shell_call("user_approved_call", "completed", vec!["/bin/date"]),
@@ -821,11 +811,18 @@ async fn handle_container_exec_user_approved_records_tool_decision() {
)
.await;
mount_sse_once(
&server,
sse(vec![
ev_assistant_message("msg-1", "local shell done"),
ev_completed("done"),
]),
)
.await;
let TestCodex { codex, .. } = test_codex()
.with_config(|config| {
config.approval_policy = AskForApproval::UnlessTrusted;
config.model_provider.request_max_retries = Some(0);
config.model_provider.stream_max_retries = Some(0);
})
.build(&server)
.await
@@ -840,12 +837,7 @@ async fn handle_container_exec_user_approved_records_tool_decision() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::ExecApprovalRequest(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::ExecApprovalRequest(_))).await;
codex
.submit(Op::ExecApproval {
@@ -855,12 +847,7 @@ async fn handle_container_exec_user_approved_records_tool_decision() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TokenCount(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TokenCount(_))).await;
logs_assert(tool_decision_assertion(
"user_approved_call",
@@ -874,7 +861,7 @@ async fn handle_container_exec_user_approved_records_tool_decision() {
async fn handle_container_exec_user_approved_for_session_records_tool_decision() {
let server = start_mock_server().await;
mount_sse(
mount_sse_once(
&server,
sse(vec![
ev_local_shell_call("user_approved_session_call", "completed", vec!["/bin/date"]),
@@ -882,12 +869,18 @@ async fn handle_container_exec_user_approved_for_session_records_tool_decision()
]),
)
.await;
mount_sse_once(
&server,
sse(vec![
ev_assistant_message("msg-1", "local shell done"),
ev_completed("done"),
]),
)
.await;
let TestCodex { codex, .. } = test_codex()
.with_config(|config| {
config.approval_policy = AskForApproval::UnlessTrusted;
config.model_provider.request_max_retries = Some(0);
config.model_provider.stream_max_retries = Some(0);
})
.build(&server)
.await
@@ -902,12 +895,7 @@ async fn handle_container_exec_user_approved_for_session_records_tool_decision()
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::ExecApprovalRequest(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::ExecApprovalRequest(_))).await;
codex
.submit(Op::ExecApproval {
@@ -917,12 +905,7 @@ async fn handle_container_exec_user_approved_for_session_records_tool_decision()
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TokenCount(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TokenCount(_))).await;
logs_assert(tool_decision_assertion(
"user_approved_session_call",
@@ -936,7 +919,7 @@ async fn handle_container_exec_user_approved_for_session_records_tool_decision()
async fn handle_sandbox_error_user_approves_retry_records_tool_decision() {
let server = start_mock_server().await;
mount_sse(
mount_sse_once(
&server,
sse(vec![
ev_local_shell_call("sandbox_retry_call", "completed", vec!["/bin/date"]),
@@ -944,12 +927,18 @@ async fn handle_sandbox_error_user_approves_retry_records_tool_decision() {
]),
)
.await;
mount_sse_once(
&server,
sse(vec![
ev_assistant_message("msg-1", "local shell done"),
ev_completed("done"),
]),
)
.await;
let TestCodex { codex, .. } = test_codex()
.with_config(|config| {
config.approval_policy = AskForApproval::UnlessTrusted;
config.model_provider.request_max_retries = Some(0);
config.model_provider.stream_max_retries = Some(0);
})
.build(&server)
.await
@@ -964,12 +953,7 @@ async fn handle_sandbox_error_user_approves_retry_records_tool_decision() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::ExecApprovalRequest(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::ExecApprovalRequest(_))).await;
codex
.submit(Op::ExecApproval {
@@ -979,12 +963,7 @@ async fn handle_sandbox_error_user_approves_retry_records_tool_decision() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TokenCount(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TokenCount(_))).await;
logs_assert(tool_decision_assertion(
"sandbox_retry_call",
@@ -998,7 +977,7 @@ async fn handle_sandbox_error_user_approves_retry_records_tool_decision() {
async fn handle_container_exec_user_denies_records_tool_decision() {
let server = start_mock_server().await;
mount_sse(
mount_sse_once(
&server,
sse(vec![
ev_local_shell_call("user_denied_call", "completed", vec!["/bin/date"]),
@@ -1007,11 +986,17 @@ async fn handle_container_exec_user_denies_records_tool_decision() {
)
.await;
mount_sse_once(
&server,
sse(vec![
ev_assistant_message("msg-1", "local shell done"),
ev_completed("done"),
]),
)
.await;
let TestCodex { codex, .. } = test_codex()
.with_config(|config| {
config.approval_policy = AskForApproval::UnlessTrusted;
config.model_provider.request_max_retries = Some(0);
config.model_provider.stream_max_retries = Some(0);
})
.build(&server)
.await
@@ -1026,12 +1011,7 @@ async fn handle_container_exec_user_denies_records_tool_decision() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::ExecApprovalRequest(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::ExecApprovalRequest(_))).await;
codex
.submit(Op::ExecApproval {
@@ -1041,12 +1021,7 @@ async fn handle_container_exec_user_denies_records_tool_decision() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TokenCount(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TokenCount(_))).await;
logs_assert(tool_decision_assertion(
"user_denied_call",
@@ -1060,7 +1035,7 @@ async fn handle_container_exec_user_denies_records_tool_decision() {
async fn handle_sandbox_error_user_approves_for_session_records_tool_decision() {
let server = start_mock_server().await;
mount_sse(
mount_sse_once(
&server,
sse(vec![
ev_local_shell_call("sandbox_session_call", "completed", vec!["/bin/date"]),
@@ -1068,12 +1043,18 @@ async fn handle_sandbox_error_user_approves_for_session_records_tool_decision()
]),
)
.await;
mount_sse_once(
&server,
sse(vec![
ev_assistant_message("msg-1", "local shell done"),
ev_completed("done"),
]),
)
.await;
let TestCodex { codex, .. } = test_codex()
.with_config(|config| {
config.approval_policy = AskForApproval::UnlessTrusted;
config.model_provider.request_max_retries = Some(0);
config.model_provider.stream_max_retries = Some(0);
})
.build(&server)
.await
@@ -1088,12 +1069,7 @@ async fn handle_sandbox_error_user_approves_for_session_records_tool_decision()
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::ExecApprovalRequest(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::ExecApprovalRequest(_))).await;
codex
.submit(Op::ExecApproval {
@@ -1103,12 +1079,7 @@ async fn handle_sandbox_error_user_approves_for_session_records_tool_decision()
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TokenCount(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TokenCount(_))).await;
logs_assert(tool_decision_assertion(
"sandbox_session_call",
@@ -1122,7 +1093,7 @@ async fn handle_sandbox_error_user_approves_for_session_records_tool_decision()
async fn handle_sandbox_error_user_denies_records_tool_decision() {
let server = start_mock_server().await;
mount_sse(
mount_sse_once(
&server,
sse(vec![
ev_local_shell_call("sandbox_deny_call", "completed", vec!["/bin/date"]),
@@ -1131,11 +1102,18 @@ async fn handle_sandbox_error_user_denies_records_tool_decision() {
)
.await;
mount_sse_once(
&server,
sse(vec![
ev_assistant_message("msg-1", "local shell done"),
ev_completed("done"),
]),
)
.await;
let TestCodex { codex, .. } = test_codex()
.with_config(|config| {
config.approval_policy = AskForApproval::UnlessTrusted;
config.model_provider.request_max_retries = Some(0);
config.model_provider.stream_max_retries = Some(0);
})
.build(&server)
.await
@@ -1150,12 +1128,7 @@ async fn handle_sandbox_error_user_denies_records_tool_decision() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::ExecApprovalRequest(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::ExecApprovalRequest(_))).await;
codex
.submit(Op::ExecApproval {
@@ -1165,12 +1138,7 @@ async fn handle_sandbox_error_user_denies_records_tool_decision() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TokenCount(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TokenCount(_))).await;
logs_assert(tool_decision_assertion(
"sandbox_deny_call",

View File

@@ -0,0 +1,72 @@
use anyhow::Result;
use codex_core::protocol::EventMsg;
use codex_core::protocol::Op;
use codex_protocol::user_input::UserInput;
use core_test_support::responses::ev_response_created;
use core_test_support::responses::mount_sse_once;
use core_test_support::responses::sse;
use core_test_support::responses::start_mock_server;
use core_test_support::skip_if_no_network;
use core_test_support::test_codex::test_codex;
use core_test_support::wait_for_event;
use pretty_assertions::assert_eq;
use serde_json::json;
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn quota_exceeded_emits_single_error_event() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let mut builder = test_codex();
mount_sse_once(
&server,
sse(vec![
ev_response_created("resp-1"),
json!({
"type": "response.failed",
"response": {
"id": "resp-1",
"error": {
"code": "insufficient_quota",
"message": "You exceeded your current quota, please check your plan and billing details."
}
}
}),
]),
)
.await;
let test = builder.build(&server).await?;
test.codex
.submit(Op::UserInput {
items: vec![UserInput::Text {
text: "quota?".into(),
}],
})
.await
.unwrap();
let mut error_events = 0;
loop {
let event = wait_for_event(&test.codex, |_| true).await;
match event {
EventMsg::Error(err) => {
error_events += 1;
assert_eq!(
err.message,
"Quota exceeded. Check your plan and billing details."
);
}
EventMsg::TaskComplete(_) => break,
_ => {}
}
}
assert_eq!(error_events, 1, "expected exactly one Codex:Error event");
Ok(())
}

View File

@@ -0,0 +1,70 @@
#![allow(clippy::unwrap_used, clippy::expect_used)]
use codex_core::AuthManager;
use codex_core::CodexAuth;
use codex_core::ConversationManager;
use codex_core::NewConversation;
use codex_core::protocol::EventMsg;
use codex_core::protocol::InitialHistory;
use codex_core::protocol::ResumedHistory;
use codex_core::protocol::RolloutItem;
use codex_core::protocol::TurnContextItem;
use codex_core::protocol::WarningEvent;
use codex_protocol::ConversationId;
use core::time::Duration;
use core_test_support::load_default_config_for_test;
use core_test_support::wait_for_event;
use tempfile::TempDir;
fn resume_history(config: &codex_core::config::Config, previous_model: &str, rollout_path: &std::path::Path) -> InitialHistory {
let turn_ctx = TurnContextItem {
cwd: config.cwd.clone(),
approval_policy: config.approval_policy,
sandbox_policy: config.sandbox_policy.clone(),
model: previous_model.to_string(),
effort: config.model_reasoning_effort,
summary: config.model_reasoning_summary,
};
InitialHistory::Resumed(ResumedHistory {
conversation_id: ConversationId::default(),
history: vec![RolloutItem::TurnContext(turn_ctx)],
rollout_path: rollout_path.to_path_buf(),
})
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn emits_warning_when_resumed_model_differs() {
// Arrange a config with a current model and a prior rollout recorded under a different model.
let home = TempDir::new().expect("tempdir");
let mut config = load_default_config_for_test(&home);
config.model = "current-model".to_string();
// Ensure cwd is absolute (the helper sets it to the temp dir already).
assert!(config.cwd.is_absolute());
let rollout_path = home.path().join("rollout.jsonl");
std::fs::write(&rollout_path, "").expect("create rollout placeholder");
let initial_history = resume_history(&config, "previous-model", &rollout_path);
let conversation_manager = ConversationManager::with_auth(CodexAuth::from_api_key("test"));
let auth_manager = AuthManager::from_auth_for_testing(CodexAuth::from_api_key("test"));
// Act: resume the conversation.
let NewConversation { conversation, .. } = conversation_manager
.resume_conversation_with_history(config, initial_history, auth_manager)
.await
.expect("resume conversation");
// Assert: a Warning event is emitted describing the model mismatch.
let warning = wait_for_event(&conversation, |ev| matches!(ev, EventMsg::Warning(_))).await;
let EventMsg::Warning(WarningEvent { message }) = warning else {
panic!("expected warning event");
};
assert!(message.contains("previous-model"));
assert!(message.contains("current-model"));
// Drain the TaskComplete/Shutdown window to avoid leaking tasks between tests.
// The warning is emitted during initialization, so a short sleep is sufficient.
tokio::time::sleep(Duration::from_millis(50)).await;
}

View File

@@ -23,7 +23,6 @@ use core_test_support::load_default_config_for_test;
use core_test_support::load_sse_fixture_with_id_from_str;
use core_test_support::skip_if_no_network;
use core_test_support::wait_for_event;
use core_test_support::wait_for_event_with_timeout;
use pretty_assertions::assert_eq;
use std::path::PathBuf;
use std::sync::Arc;
@@ -246,37 +245,31 @@ async fn review_filters_agent_message_related_events() {
let mut saw_exited = false;
// Drain until TaskComplete; assert filtered events never surface.
wait_for_event_with_timeout(
&codex,
|event| match event {
EventMsg::TaskComplete(_) => true,
EventMsg::EnteredReviewMode(_) => {
saw_entered = true;
false
wait_for_event(&codex, |event| match event {
EventMsg::TaskComplete(_) => true,
EventMsg::EnteredReviewMode(_) => {
saw_entered = true;
false
}
EventMsg::ExitedReviewMode(_) => {
saw_exited = true;
false
}
// The following must be filtered by review flow
EventMsg::AgentMessageContentDelta(_) => {
panic!("unexpected AgentMessageContentDelta surfaced during review")
}
EventMsg::AgentMessageDelta(_) => {
panic!("unexpected AgentMessageDelta surfaced during review")
}
EventMsg::ItemCompleted(ev) => match &ev.item {
codex_protocol::items::TurnItem::AgentMessage(_) => {
panic!("unexpected ItemCompleted for TurnItem::AgentMessage surfaced during review")
}
EventMsg::ExitedReviewMode(_) => {
saw_exited = true;
false
}
// The following must be filtered by review flow
EventMsg::AgentMessageContentDelta(_) => {
panic!("unexpected AgentMessageContentDelta surfaced during review")
}
EventMsg::AgentMessageDelta(_) => {
panic!("unexpected AgentMessageDelta surfaced during review")
}
EventMsg::ItemCompleted(ev) => match &ev.item {
codex_protocol::items::TurnItem::AgentMessage(_) => {
panic!(
"unexpected ItemCompleted for TurnItem::AgentMessage surfaced during review"
)
}
_ => false,
},
_ => false,
},
tokio::time::Duration::from_secs(5),
)
_ => false,
})
.await;
assert!(saw_entered && saw_exited, "missing review lifecycle events");
@@ -335,25 +328,21 @@ async fn review_does_not_emit_agent_message_on_structured_output() {
// Drain events until TaskComplete; ensure none are AgentMessage.
let mut saw_entered = false;
let mut saw_exited = false;
wait_for_event_with_timeout(
&codex,
|event| match event {
EventMsg::TaskComplete(_) => true,
EventMsg::AgentMessage(_) => {
panic!("unexpected AgentMessage during review with structured output")
}
EventMsg::EnteredReviewMode(_) => {
saw_entered = true;
false
}
EventMsg::ExitedReviewMode(_) => {
saw_exited = true;
false
}
_ => false,
},
tokio::time::Duration::from_secs(5),
)
wait_for_event(&codex, |event| match event {
EventMsg::TaskComplete(_) => true,
EventMsg::AgentMessage(_) => {
panic!("unexpected AgentMessage during review with structured output")
}
EventMsg::EnteredReviewMode(_) => {
saw_entered = true;
false
}
EventMsg::ExitedReviewMode(_) => {
saw_exited = true;
false
}
_ => false,
})
.await;
assert!(saw_entered && saw_exited, "missing review lifecycle events");

View File

@@ -25,7 +25,6 @@ use core_test_support::responses::mount_sse_once_match;
use core_test_support::skip_if_no_network;
use core_test_support::test_codex::test_codex;
use core_test_support::wait_for_event;
use core_test_support::wait_for_event_with_timeout;
use escargot::CargoBuild;
use mcp_types::ContentBlock;
use serde_json::Value;
@@ -125,11 +124,9 @@ async fn stdio_server_round_trip() -> anyhow::Result<()> {
})
.await?;
let begin_event = wait_for_event_with_timeout(
&fixture.codex,
|ev| matches!(ev, EventMsg::McpToolCallBegin(_)),
Duration::from_secs(10),
)
let begin_event = wait_for_event(&fixture.codex, |ev| {
matches!(ev, EventMsg::McpToolCallBegin(_))
})
.await;
let EventMsg::McpToolCallBegin(begin) = begin_event else {
@@ -268,11 +265,9 @@ async fn stdio_image_responses_round_trip() -> anyhow::Result<()> {
.await?;
// Wait for tool begin/end and final completion.
let begin_event = wait_for_event_with_timeout(
&fixture.codex,
|ev| matches!(ev, EventMsg::McpToolCallBegin(_)),
Duration::from_secs(10),
)
let begin_event = wait_for_event(&fixture.codex, |ev| {
matches!(ev, EventMsg::McpToolCallBegin(_))
})
.await;
let EventMsg::McpToolCallBegin(begin) = begin_event else {
unreachable!("begin");
@@ -465,11 +460,9 @@ async fn stdio_image_completions_round_trip() -> anyhow::Result<()> {
})
.await?;
let begin_event = wait_for_event_with_timeout(
&fixture.codex,
|ev| matches!(ev, EventMsg::McpToolCallBegin(_)),
Duration::from_secs(10),
)
let begin_event = wait_for_event(&fixture.codex, |ev| {
matches!(ev, EventMsg::McpToolCallBegin(_))
})
.await;
let EventMsg::McpToolCallBegin(begin) = begin_event else {
unreachable!("begin");
@@ -609,11 +602,9 @@ async fn stdio_server_propagates_whitelisted_env_vars() -> anyhow::Result<()> {
})
.await?;
let begin_event = wait_for_event_with_timeout(
&fixture.codex,
|ev| matches!(ev, EventMsg::McpToolCallBegin(_)),
Duration::from_secs(10),
)
let begin_event = wait_for_event(&fixture.codex, |ev| {
matches!(ev, EventMsg::McpToolCallBegin(_))
})
.await;
let EventMsg::McpToolCallBegin(begin) = begin_event else {
@@ -762,11 +753,9 @@ async fn streamable_http_tool_call_round_trip() -> anyhow::Result<()> {
})
.await?;
let begin_event = wait_for_event_with_timeout(
&fixture.codex,
|ev| matches!(ev, EventMsg::McpToolCallBegin(_)),
Duration::from_secs(10),
)
let begin_event = wait_for_event(&fixture.codex, |ev| {
matches!(ev, EventMsg::McpToolCallBegin(_))
})
.await;
let EventMsg::McpToolCallBegin(begin) = begin_event else {
@@ -947,11 +936,9 @@ async fn streamable_http_with_oauth_round_trip() -> anyhow::Result<()> {
})
.await?;
let begin_event = wait_for_event_with_timeout(
&fixture.codex,
|ev| matches!(ev, EventMsg::McpToolCallBegin(_)),
Duration::from_secs(10),
)
let begin_event = wait_for_event(&fixture.codex, |ev| {
matches!(ev, EventMsg::McpToolCallBegin(_))
})
.await;
let EventMsg::McpToolCallBegin(begin) = begin_event else {

View File

@@ -1,5 +1,3 @@
use std::time::Duration;
use codex_core::ModelProviderInfo;
use codex_core::WireApi;
use codex_core::protocol::EventMsg;
@@ -9,7 +7,7 @@ use core_test_support::load_sse_fixture_with_id;
use core_test_support::skip_if_no_network;
use core_test_support::test_codex::TestCodex;
use core_test_support::test_codex::test_codex;
use core_test_support::wait_for_event_with_timeout;
use core_test_support::wait_for_event;
use wiremock::Mock;
use wiremock::MockServer;
use wiremock::ResponseTemplate;
@@ -96,19 +94,9 @@ async fn continue_after_stream_error() {
.unwrap();
// Expect an Error followed by TaskComplete so the session is released.
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::Error(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::Error(_))).await;
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TaskComplete(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
// 2) Second turn: now send another prompt that should succeed using the
// mock server SSE stream. If the agent failed to clear the running task on
@@ -122,10 +110,5 @@ async fn continue_after_stream_error() {
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TaskComplete(_)),
Duration::from_secs(5),
)
.await;
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
}

View File

@@ -1,8 +1,6 @@
//! Verifies that the agent retries when the SSE stream terminates before
//! delivering a `response.completed` event.
use std::time::Duration;
use codex_core::ModelProviderInfo;
use codex_core::WireApi;
use codex_core::protocol::EventMsg;
@@ -13,7 +11,7 @@ use core_test_support::load_sse_fixture_with_id;
use core_test_support::skip_if_no_network;
use core_test_support::test_codex::TestCodex;
use core_test_support::test_codex::test_codex;
use core_test_support::wait_for_event_with_timeout;
use core_test_support::wait_for_event;
use wiremock::Mock;
use wiremock::MockServer;
use wiremock::Request;
@@ -103,10 +101,5 @@ async fn retries_on_early_close() {
.unwrap();
// Wait until TaskComplete (should succeed after retry).
wait_for_event_with_timeout(
&codex,
|event| matches!(event, EventMsg::TaskComplete(_)),
Duration::from_secs(10),
)
.await;
wait_for_event(&codex, |event| matches!(event, EventMsg::TaskComplete(_))).await;
}

View File

@@ -1,7 +1,8 @@
#![cfg(not(target_os = "windows"))]
use std::collections::HashMap;
use std::sync::OnceLock;
use anyhow::Context;
use anyhow::Result;
use codex_core::features::Feature;
use codex_core::protocol::AskForApproval;
@@ -10,6 +11,7 @@ use codex_core::protocol::Op;
use codex_core::protocol::SandboxPolicy;
use codex_protocol::config_types::ReasoningSummary;
use codex_protocol::user_input::UserInput;
use core_test_support::assert_regex_match;
use core_test_support::responses::ev_assistant_message;
use core_test_support::responses::ev_completed;
use core_test_support::responses::ev_function_call;
@@ -23,7 +25,7 @@ use core_test_support::test_codex::TestCodex;
use core_test_support::test_codex::test_codex;
use core_test_support::wait_for_event;
use core_test_support::wait_for_event_match;
use core_test_support::wait_for_event_with_timeout;
use regex_lite::Regex;
use serde_json::Value;
use serde_json::json;
@@ -35,7 +37,95 @@ fn extract_output_text(item: &Value) -> Option<&str> {
})
}
fn collect_tool_outputs(bodies: &[Value]) -> Result<HashMap<String, Value>> {
#[derive(Debug)]
struct ParsedUnifiedExecOutput {
chunk_id: Option<String>,
wall_time_seconds: f64,
session_id: Option<i32>,
exit_code: Option<i32>,
original_token_count: Option<usize>,
output: String,
}
#[allow(clippy::expect_used)]
fn parse_unified_exec_output(raw: &str) -> Result<ParsedUnifiedExecOutput> {
static OUTPUT_REGEX: OnceLock<Regex> = OnceLock::new();
let regex = OUTPUT_REGEX.get_or_init(|| {
Regex::new(concat!(
r#"(?s)^(?:Total output lines: \d+\n\n)?"#,
r#"(?:Chunk ID: (?P<chunk_id>[^\n]+)\n)?"#,
r#"Wall time: (?P<wall_time>-?\d+(?:\.\d+)?) seconds\n"#,
r#"(?:Process exited with code (?P<exit_code>-?\d+)\n)?"#,
r#"(?:Process running with session ID (?P<session_id>-?\d+)\n)?"#,
r#"(?:Original token count: (?P<original_token_count>\d+)\n)?"#,
r#"Output:\n?(?P<output>.*)$"#,
))
.expect("valid unified exec output regex")
});
let cleaned = raw.trim_matches('\r');
let captures = regex
.captures(cleaned)
.ok_or_else(|| anyhow::anyhow!("missing Output section in unified exec output {raw}"))?;
let chunk_id = captures
.name("chunk_id")
.map(|value| value.as_str().to_string());
let wall_time_seconds = captures
.name("wall_time")
.expect("wall_time group present")
.as_str()
.parse::<f64>()
.context("failed to parse wall time seconds")?;
let exit_code = captures
.name("exit_code")
.map(|value| {
value
.as_str()
.parse::<i32>()
.context("failed to parse exit code from unified exec output")
})
.transpose()?;
let session_id = captures
.name("session_id")
.map(|value| {
value
.as_str()
.parse::<i32>()
.context("failed to parse session id from unified exec output")
})
.transpose()?;
let original_token_count = captures
.name("original_token_count")
.map(|value| {
value
.as_str()
.parse::<usize>()
.context("failed to parse original token count from unified exec output")
})
.transpose()?;
let output = captures
.name("output")
.expect("output group present")
.as_str()
.to_string();
Ok(ParsedUnifiedExecOutput {
chunk_id,
wall_time_seconds,
session_id,
exit_code,
original_token_count,
output,
})
}
fn collect_tool_outputs(bodies: &[Value]) -> Result<HashMap<String, ParsedUnifiedExecOutput>> {
let mut outputs = HashMap::new();
for body in bodies {
if let Some(items) = body.get("input").and_then(Value::as_array) {
@@ -50,8 +140,8 @@ fn collect_tool_outputs(bodies: &[Value]) -> Result<HashMap<String, Value>> {
if trimmed.is_empty() {
continue;
}
let parsed: Value = serde_json::from_str(trimmed).map_err(|err| {
anyhow::anyhow!("failed to parse tool output content {trimmed:?}: {err}")
let parsed = parse_unified_exec_output(content).with_context(|| {
format!("failed to parse unified exec output for {call_id}")
})?;
outputs.insert(call_id.to_string(), parsed);
}
@@ -133,6 +223,90 @@ async fn unified_exec_emits_exec_command_begin_event() -> Result<()> {
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn unified_exec_respects_workdir_override() -> Result<()> {
skip_if_no_network!(Ok(()));
skip_if_sandbox!(Ok(()));
let server = start_mock_server().await;
let mut builder = test_codex().with_config(|config| {
config.use_experimental_unified_exec_tool = true;
config.features.enable(Feature::UnifiedExec);
});
let TestCodex {
codex,
cwd,
session_configured,
..
} = builder.build(&server).await?;
let workdir = cwd.path().join("uexec_workdir_test");
std::fs::create_dir_all(&workdir)?;
let call_id = "uexec-workdir";
let args = json!({
"cmd": "pwd",
"yield_time_ms": 250,
"workdir": workdir.to_string_lossy().to_string(),
});
let responses = vec![
sse(vec![
ev_response_created("resp-1"),
ev_function_call(call_id, "exec_command", &serde_json::to_string(&args)?),
ev_completed("resp-1"),
]),
sse(vec![
ev_response_created("resp-2"),
ev_assistant_message("msg-1", "finished"),
ev_completed("resp-2"),
]),
];
mount_sse_sequence(&server, responses).await;
let session_model = session_configured.model.clone();
codex
.submit(Op::UserTurn {
items: vec![UserInput::Text {
text: "run workdir test".into(),
}],
final_output_json_schema: None,
cwd: cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
wait_for_event(&codex, |event| matches!(event, EventMsg::TaskComplete(_))).await;
let requests = server.received_requests().await.expect("recorded requests");
assert!(!requests.is_empty(), "expected at least one POST request");
let bodies = requests
.iter()
.map(|req| req.body_json::<Value>().expect("request json"))
.collect::<Vec<_>>();
let outputs = collect_tool_outputs(&bodies)?;
let output = outputs
.get(call_id)
.expect("missing exec_command workdir output");
let output_text = output.output.trim();
let output_canonical = std::fs::canonicalize(output_text)?;
let expected_canonical = std::fs::canonicalize(&workdir)?;
assert_eq!(
output_canonical, expected_canonical,
"pwd should reflect the requested workdir override"
);
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn unified_exec_emits_exec_command_end_event() -> Result<()> {
skip_if_no_network!(Ok(()));
@@ -391,8 +565,6 @@ async fn unified_exec_emits_output_delta_for_write_stdin() -> Result<()> {
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn unified_exec_skips_begin_event_for_empty_input() -> Result<()> {
use tokio::time::Duration;
skip_if_no_network!(Ok(()));
skip_if_sandbox!(Ok(()));
@@ -468,7 +640,7 @@ async fn unified_exec_skips_begin_event_for_empty_input() -> Result<()> {
let mut begin_events = Vec::new();
loop {
let event_msg = wait_for_event_with_timeout(&codex, |_| true, Duration::from_secs(2)).await;
let event_msg = wait_for_event(&codex, |_| true).await;
match event_msg {
EventMsg::ExecCommandBegin(event) => begin_events.push(event),
EventMsg::TaskComplete(_) => break,
@@ -556,51 +728,38 @@ async fn exec_command_reports_chunk_and_exit_metadata() -> Result<()> {
.get(call_id)
.expect("missing exec_command metadata output");
let chunk_id = metadata
.get("chunk_id")
.and_then(Value::as_str)
.expect("missing chunk_id");
let chunk_id = metadata.chunk_id.as_ref().expect("missing chunk_id");
assert_eq!(chunk_id.len(), 6, "chunk id should be 6 hex characters");
assert!(
chunk_id.chars().all(|c| c.is_ascii_hexdigit()),
"chunk id should be hexadecimal: {chunk_id}"
);
let wall_time = metadata
.get("wall_time_seconds")
.and_then(Value::as_f64)
.unwrap_or_default();
let wall_time = metadata.wall_time_seconds;
assert!(
wall_time >= 0.0,
"wall_time_seconds should be non-negative, got {wall_time}"
);
assert!(
metadata.get("session_id").is_none(),
metadata.session_id.is_none(),
"exec_command for a completed process should not include session_id"
);
let exit_code = metadata
.get("exit_code")
.and_then(Value::as_i64)
.expect("expected exit_code");
let exit_code = metadata.exit_code.expect("expected exit_code");
assert_eq!(exit_code, 0, "expected successful exit");
let output_text = metadata
.get("output")
.and_then(Value::as_str)
.expect("missing output text");
let output_text = &metadata.output;
assert!(
output_text.contains("tokens truncated"),
"expected truncation notice in output: {output_text:?}"
);
let original_tokens = metadata
.get("original_token_count")
.and_then(Value::as_u64)
.expect("missing original_token_count");
.original_token_count
.expect("missing original_token_count") as usize;
assert!(
original_tokens as usize > 6,
original_tokens > 6,
"original token count should exceed max_output_tokens"
);
@@ -711,39 +870,34 @@ async fn write_stdin_returns_exit_metadata_and_clears_session() -> Result<()> {
.get(start_call_id)
.expect("missing start output for exec_command");
let session_id = start_output
.get("session_id")
.and_then(Value::as_i64)
.session_id
.expect("expected session id from exec_command");
assert!(
session_id >= 0,
"session_id should be non-negative, got {session_id}"
);
assert!(
start_output.get("exit_code").is_none(),
start_output.exit_code.is_none(),
"initial exec_command should not include exit_code while session is running"
);
let send_output = outputs
.get(send_call_id)
.expect("missing write_stdin echo output");
let echoed = send_output
.get("output")
.and_then(Value::as_str)
.unwrap_or_default();
let echoed = send_output.output.as_str();
assert!(
echoed.contains("hello unified exec"),
"expected echoed output from cat, got {echoed:?}"
);
let echoed_session = send_output
.get("session_id")
.and_then(Value::as_i64)
.session_id
.expect("write_stdin should return session id while process is running");
assert_eq!(
echoed_session, session_id,
"write_stdin should reuse existing session id"
);
assert!(
send_output.get("exit_code").is_none(),
send_output.exit_code.is_none(),
"write_stdin should not include exit_code while process is running"
);
@@ -751,18 +905,17 @@ async fn write_stdin_returns_exit_metadata_and_clears_session() -> Result<()> {
.get(exit_call_id)
.expect("missing exit metadata output");
assert!(
exit_output.get("session_id").is_none(),
exit_output.session_id.is_none(),
"session_id should be omitted once the process exits"
);
let exit_code = exit_output
.get("exit_code")
.and_then(Value::as_i64)
.exit_code
.expect("expected exit_code after sending EOF");
assert_eq!(exit_code, 0, "cat should exit cleanly after EOF");
let exit_chunk = exit_output
.get("chunk_id")
.and_then(Value::as_str)
.chunk_id
.as_ref()
.expect("missing chunk id for exit output");
assert!(
exit_chunk.chars().all(|c| c.is_ascii_hexdigit()),
@@ -964,26 +1117,18 @@ async fn unified_exec_reuses_session_via_stdin() -> Result<()> {
let start_output = outputs
.get(first_call_id)
.expect("missing first unified_exec output");
let session_id = start_output["session_id"].as_i64().unwrap_or_default();
let session_id = start_output.session_id.unwrap_or_default();
assert!(
session_id >= 0,
"expected session id in first unified_exec response"
);
assert!(
start_output["output"]
.as_str()
.unwrap_or_default()
.is_empty()
);
assert!(start_output.output.is_empty());
let reuse_output = outputs
.get(second_call_id)
.expect("missing reused unified_exec output");
assert_eq!(
reuse_output["session_id"].as_i64().unwrap_or_default(),
session_id
);
let echoed = reuse_output["output"].as_str().unwrap_or_default();
assert_eq!(reuse_output.session_id.unwrap_or_default(), session_id);
let echoed = reuse_output.output.as_str();
assert!(
echoed.contains("hello unified exec"),
"expected echoed output, got {echoed:?}"
@@ -1100,7 +1245,7 @@ PY
let start_output = outputs
.get(first_call_id)
.expect("missing initial unified_exec output");
let session_id = start_output["session_id"].as_i64().unwrap_or_default();
let session_id = start_output.session_id.unwrap_or_default();
assert!(
session_id >= 0,
"expected session id from initial unified_exec response"
@@ -1109,7 +1254,7 @@ PY
let poll_output = outputs
.get(second_call_id)
.expect("missing poll unified_exec output");
let poll_text = poll_output["output"].as_str().unwrap_or_default();
let poll_text = poll_output.output.as_str();
assert!(
poll_text.contains("TAIL-MARKER"),
"expected poll output to contain tail marker, got {poll_text:?}"
@@ -1209,16 +1354,11 @@ async fn unified_exec_timeout_and_followup_poll() -> Result<()> {
let outputs = collect_tool_outputs(&bodies)?;
let first_output = outputs.get(first_call_id).expect("missing timeout output");
assert_eq!(first_output["session_id"], 0);
assert!(
first_output["output"]
.as_str()
.unwrap_or_default()
.is_empty()
);
assert_eq!(first_output.session_id, Some(0));
assert!(first_output.output.is_empty());
let poll_output = outputs.get(second_call_id).expect("missing poll output");
let output_text = poll_output["output"].as_str().unwrap_or_default();
let output_text = poll_output.output.as_str();
assert!(
output_text.contains("ready"),
"expected ready output, got {output_text:?}"
@@ -1226,3 +1366,162 @@ async fn unified_exec_timeout_and_followup_poll() -> Result<()> {
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
// Skipped on arm because the ctor logic to handle arg0 doesn't work on ARM
#[cfg(not(target_arch = "arm"))]
async fn unified_exec_formats_large_output_summary() -> Result<()> {
skip_if_no_network!(Ok(()));
skip_if_sandbox!(Ok(()));
let server = start_mock_server().await;
let mut builder = test_codex().with_config(|config| {
config.features.enable(Feature::UnifiedExec);
});
let TestCodex {
codex,
cwd,
session_configured,
..
} = builder.build(&server).await?;
let script = r#"python3 - <<'PY'
for i in range(300):
print(f"line-{i}")
PY
"#;
let call_id = "uexec-large-output";
let args = serde_json::json!({
"cmd": script,
"yield_time_ms": 500,
});
let responses = vec![
sse(vec![
ev_response_created("resp-1"),
ev_function_call(call_id, "exec_command", &serde_json::to_string(&args)?),
ev_completed("resp-1"),
]),
sse(vec![
ev_assistant_message("msg-1", "done"),
ev_completed("resp-2"),
]),
];
mount_sse_sequence(&server, responses).await;
let session_model = session_configured.model.clone();
codex
.submit(Op::UserTurn {
items: vec![UserInput::Text {
text: "summarize large output".into(),
}],
final_output_json_schema: None,
cwd: cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
wait_for_event(&codex, |event| matches!(event, EventMsg::TaskComplete(_))).await;
let requests = server.received_requests().await.expect("recorded requests");
assert!(!requests.is_empty(), "expected at least one POST request");
let bodies = requests
.iter()
.map(|req| req.body_json::<Value>().expect("request json"))
.collect::<Vec<_>>();
let outputs = collect_tool_outputs(&bodies)?;
let large_output = outputs.get(call_id).expect("missing large output summary");
assert_regex_match(
concat!(
r"(?s)",
r"line-0.*?",
r"\[\.{3} omitted \d+ of \d+ lines \.{3}\].*?",
r"line-299",
),
&large_output.output,
);
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn unified_exec_runs_under_sandbox() -> Result<()> {
skip_if_no_network!(Ok(()));
skip_if_sandbox!(Ok(()));
let server = start_mock_server().await;
let mut builder = test_codex().with_config(|config| {
config.features.enable(Feature::UnifiedExec);
});
let TestCodex {
codex,
cwd,
session_configured,
..
} = builder.build(&server).await?;
let call_id = "uexec";
let args = serde_json::json!({
"cmd": "echo 'hello'",
"yield_time_ms": 500,
});
let responses = vec![
sse(vec![
ev_response_created("resp-1"),
ev_function_call(call_id, "exec_command", &serde_json::to_string(&args)?),
ev_completed("resp-1"),
]),
sse(vec![
ev_assistant_message("msg-1", "done"),
ev_completed("resp-2"),
]),
];
mount_sse_sequence(&server, responses).await;
let session_model = session_configured.model.clone();
codex
.submit(Op::UserTurn {
items: vec![UserInput::Text {
text: "summarize large output".into(),
}],
final_output_json_schema: None,
cwd: cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
// Important!
sandbox_policy: SandboxPolicy::ReadOnly,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
wait_for_event(&codex, |event| matches!(event, EventMsg::TaskComplete(_))).await;
let requests = server.received_requests().await.expect("recorded requests");
assert!(!requests.is_empty(), "expected at least one POST request");
let bodies = requests
.iter()
.map(|req| req.body_json::<Value>().expect("request json"))
.collect::<Vec<_>>();
let outputs = collect_tool_outputs(&bodies)?;
let output = outputs.get(call_id).expect("missing output");
assert_regex_match("hello[\r\n]+", &output.output);
Ok(())
}

View File

@@ -2,35 +2,20 @@ use codex_core::ConversationManager;
use codex_core::NewConversation;
use codex_core::protocol::EventMsg;
use codex_core::protocol::ExecCommandEndEvent;
use codex_core::protocol::ExecOutputStream;
use codex_core::protocol::Op;
use codex_core::protocol::TurnAbortReason;
use core_test_support::assert_regex_match;
use core_test_support::load_default_config_for_test;
use core_test_support::responses;
use core_test_support::wait_for_event;
use core_test_support::wait_for_event_match;
use regex_lite::escape;
use std::path::PathBuf;
use std::process::Command;
use std::process::Stdio;
use tempfile::TempDir;
fn detect_python_executable() -> Option<String> {
let candidates = ["python3", "python"];
candidates.iter().find_map(|candidate| {
Command::new(candidate)
.arg("--version")
.stdout(Stdio::null())
.stderr(Stdio::null())
.status()
.ok()
.and_then(|status| status.success().then(|| (*candidate).to_string()))
})
}
#[tokio::test]
async fn user_shell_cmd_ls_and_cat_in_temp_dir() {
let Some(python) = detect_python_executable() else {
eprintln!("skipping test: python3 not found in PATH");
return;
};
// Create a temporary working directory with a known file.
let cwd = TempDir::new().unwrap();
let file_name = "hello.txt";
@@ -55,10 +40,8 @@ async fn user_shell_cmd_ls_and_cat_in_temp_dir() {
.await
.expect("create new conversation");
// 1) python should list the file
let list_cmd = format!(
"{python} -c \"import pathlib; print('\\n'.join(sorted(p.name for p in pathlib.Path('.').iterdir())))\""
);
// 1) shell command should list the file
let list_cmd = "ls".to_string();
codex
.submit(Op::RunUserShellCommand { command: list_cmd })
.await
@@ -76,10 +59,8 @@ async fn user_shell_cmd_ls_and_cat_in_temp_dir() {
"ls output should include {file_name}, got: {stdout:?}"
);
// 2) python should print the file contents verbatim
let cat_cmd = format!(
"{python} -c \"import pathlib; print(pathlib.Path('{file_name}').read_text(), end='')\""
);
// 2) shell command should print the file contents verbatim
let cat_cmd = format!("cat {file_name}");
codex
.submit(Op::RunUserShellCommand { command: cat_cmd })
.await
@@ -95,7 +76,7 @@ async fn user_shell_cmd_ls_and_cat_in_temp_dir() {
};
assert_eq!(exit_code, 0);
if cfg!(windows) {
// Windows' Python writes CRLF line endings; normalize so the assertion remains portable.
// Windows shells emit CRLF line endings; normalize so the assertion remains portable.
stdout = stdout.replace("\r\n", "\n");
}
assert_eq!(stdout, contents);
@@ -103,10 +84,6 @@ async fn user_shell_cmd_ls_and_cat_in_temp_dir() {
#[tokio::test]
async fn user_shell_cmd_can_be_interrupted() {
let Some(python) = detect_python_executable() else {
eprintln!("skipping test: python3 not found in PATH");
return;
};
// Set up isolated config and conversation.
let codex_home = TempDir::new().unwrap();
let config = load_default_config_for_test(&codex_home);
@@ -121,7 +98,7 @@ async fn user_shell_cmd_can_be_interrupted() {
.expect("create new conversation");
// Start a long-running command and then interrupt it.
let sleep_cmd = format!("{python} -c \"import time; time.sleep(5)\"");
let sleep_cmd = "sleep 5".to_string();
codex
.submit(Op::RunUserShellCommand { command: sleep_cmd })
.await
@@ -138,3 +115,137 @@ async fn user_shell_cmd_can_be_interrupted() {
};
assert_eq!(ev.reason, TurnAbortReason::Interrupted);
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn user_shell_command_history_is_persisted_and_shared_with_model() -> anyhow::Result<()> {
let server = responses::start_mock_server().await;
let mut builder = core_test_support::test_codex::test_codex();
let test = builder.build(&server).await?;
#[cfg(windows)]
let command = r#"$val = $env:CODEX_SANDBOX; if ([string]::IsNullOrEmpty($val)) { $val = 'not-set' } ; [System.Console]::Write($val)"#.to_string();
#[cfg(not(windows))]
let command = r#"sh -c "printf '%s' \"${CODEX_SANDBOX:-not-set}\"""#.to_string();
test.codex
.submit(Op::RunUserShellCommand {
command: command.clone(),
})
.await?;
let begin_event = wait_for_event_match(&test.codex, |ev| match ev {
EventMsg::ExecCommandBegin(event) => Some(event.clone()),
_ => None,
})
.await;
assert!(begin_event.is_user_shell_command);
let matches_last_arg = begin_event.command.last() == Some(&command);
let matches_split = shlex::split(&command).is_some_and(|split| split == begin_event.command);
assert!(
matches_last_arg || matches_split,
"user command begin event should include the original command; got: {:?}",
begin_event.command
);
let delta_event = wait_for_event_match(&test.codex, |ev| match ev {
EventMsg::ExecCommandOutputDelta(event) => Some(event.clone()),
_ => None,
})
.await;
assert_eq!(delta_event.stream, ExecOutputStream::Stdout);
let chunk_text =
String::from_utf8(delta_event.chunk.clone()).expect("user command chunk is valid utf-8");
assert_eq!(chunk_text.trim(), "not-set");
let end_event = wait_for_event_match(&test.codex, |ev| match ev {
EventMsg::ExecCommandEnd(event) => Some(event.clone()),
_ => None,
})
.await;
assert_eq!(end_event.exit_code, 0);
assert_eq!(end_event.stdout.trim(), "not-set");
let _ = wait_for_event(&test.codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
let responses = vec![responses::sse(vec![
responses::ev_response_created("resp-1"),
responses::ev_assistant_message("msg-1", "done"),
responses::ev_completed("resp-1"),
])];
let mock = responses::mount_sse_sequence(&server, responses).await;
test.submit_turn("follow-up after shell command").await?;
let request = mock.single_request();
let command_message = request
.message_input_texts("user")
.into_iter()
.find(|text| text.contains("<user_shell_command>"))
.expect("command message recorded in request");
let command_message = command_message.replace("\r\n", "\n");
let escaped_command = escape(&command);
let expected_pattern = format!(
r"(?m)\A<user_shell_command>\n<command>\n{escaped_command}\n</command>\n<result>\nExit code: 0\nDuration: [0-9]+(?:\.[0-9]+)? seconds\nOutput:\nnot-set\n</result>\n</user_shell_command>\z"
);
assert_regex_match(&expected_pattern, &command_message);
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn user_shell_command_output_is_truncated_in_history() -> anyhow::Result<()> {
let server = responses::start_mock_server().await;
let mut builder = core_test_support::test_codex::test_codex();
let test = builder.build(&server).await?;
#[cfg(windows)]
let command = r#"for ($i=1; $i -le 400; $i++) { Write-Output $i }"#.to_string();
#[cfg(not(windows))]
let command = "seq 1 400".to_string();
test.codex
.submit(Op::RunUserShellCommand {
command: command.clone(),
})
.await?;
let end_event = wait_for_event_match(&test.codex, |ev| match ev {
EventMsg::ExecCommandEnd(event) => Some(event.clone()),
_ => None,
})
.await;
assert_eq!(end_event.exit_code, 0);
let _ = wait_for_event(&test.codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
let responses = vec![responses::sse(vec![
responses::ev_response_created("resp-1"),
responses::ev_assistant_message("msg-1", "done"),
responses::ev_completed("resp-1"),
])];
let mock = responses::mount_sse_sequence(&server, responses).await;
test.submit_turn("follow-up after shell command").await?;
let request = mock.single_request();
let command_message = request
.message_input_texts("user")
.into_iter()
.find(|text| text.contains("<user_shell_command>"))
.expect("command message recorded in request");
let command_message = command_message.replace("\r\n", "\n");
let head = (1..=128).map(|i| format!("{i}\n")).collect::<String>();
let tail = (273..=400).map(|i| format!("{i}\n")).collect::<String>();
let truncated_body =
format!("Total output lines: 400\n\n{head}\n[... omitted 144 of 400 lines ...]\n\n{tail}");
let escaped_command = escape(&command);
let escaped_truncated_body = escape(&truncated_body);
let expected_pattern = format!(
r"(?m)\A<user_shell_command>\n<command>\n{escaped_command}\n</command>\n<result>\nExit code: 0\nDuration: [0-9]+(?:\.[0-9]+)? seconds\nOutput:\n{escaped_truncated_body}\n</result>\n</user_shell_command>\z"
);
assert_regex_match(&expected_pattern, &command_message);
Ok(())
}

View File

@@ -21,7 +21,8 @@ At a glance:
- `getUserSavedConfig`, `setDefaultModel`, `getUserAgent`, `userInfo`
- `model/list` → enumerate available models and reasoning options
- Auth
- `loginApiKey`, `loginChatGpt`, `cancelLoginChatGpt`, `logoutChatGpt`, `getAuthStatus`
- `account/read`, `account/login/start`, `account/login/cancel`, `account/logout`, `account/rateLimits/read`
- notifications: `account/login/completed`, `account/updated`, `account/rateLimits/updated`
- Utilities
- `gitDiffToRemote`, `execOneOffCommand`
- Approvals (server → client requests)
@@ -113,11 +114,7 @@ The client must reply with `{ decision: "allow" | "deny" }` for each request.
## Auth helpers
For ChatGPT or APIkey based auth flows, the server exposes helpers:
- `loginApiKey { apiKey }`
- `loginChatGpt` → returns `{ loginId, authUrl }`; browser completes flow; then `loginChatGptComplete` notification follows
- `cancelLoginChatGpt { loginId }`, `logoutChatGpt`, `getAuthStatus { includeToken?, refreshToken? }`
For the complete request/response shapes and flow examples, see the [“Auth endpoints (v2)” section in the appserver README](../app-server/README.md#auth-endpoints-v2).
## Example: start and send a message

View File

@@ -30,7 +30,7 @@ pub struct Cli {
#[arg(long = "profile", short = 'p')]
pub config_profile: Option<String>,
/// Convenience alias for low-friction sandboxed automatic execution (-a on-failure, --sandbox workspace-write).
/// Convenience alias for low-friction sandboxed automatic execution (-a on-request, --sandbox workspace-write).
#[arg(long = "full-auto", default_value_t = false)]
pub full_auto: bool,

View File

@@ -548,7 +548,7 @@ fn warning_event_produces_error_item() {
let out = ep.collect_thread_events(&event(
"e1",
EventMsg::Warning(WarningEvent {
message: "Heads up: Long conversations and multiple compactions can cause the model to be less accurate. Start new a new conversation when possible to keep conversations small and targeted.".to_string(),
message: "Heads up: Long conversations and multiple compactions can cause the model to be less accurate. Start a new conversation when possible to keep conversations small and targeted.".to_string(),
}),
));
assert_eq!(
@@ -557,7 +557,7 @@ fn warning_event_produces_error_item() {
item: ThreadItem {
id: "item_0".to_string(),
details: ThreadItemDetails::Error(ErrorItem {
message: "Heads up: Long conversations and multiple compactions can cause the model to be less accurate. Start new a new conversation when possible to keep conversations small and targeted.".to_string(),
message: "Heads up: Long conversations and multiple compactions can cause the model to be less accurate. Start a new conversation when possible to keep conversations small and targeted.".to_string(),
}),
},
})]

View File

@@ -1,21 +0,0 @@
[package]
edition = "2024"
name = "codex-protocol-ts"
version = { workspace = true }
[lints]
workspace = true
[lib]
name = "codex_protocol_ts"
path = "src/lib.rs"
[[bin]]
name = "codex-protocol-ts"
path = "src/main.rs"
[dependencies]
anyhow = { workspace = true }
clap = { workspace = true, features = ["derive"] }
codex-app-server-protocol = { workspace = true }
ts-rs = { workspace = true }

View File

@@ -1,10 +0,0 @@
#!/bin/bash
set -euo pipefail
cd "$(dirname "$0")"/..
tmpdir=$(mktemp -d)
just codex generate-ts --prettier ../node_modules/.bin/prettier --out "$tmpdir"
echo "wrote output to $tmpdir"

View File

@@ -1,133 +0,0 @@
use anyhow::Context;
use anyhow::Result;
use anyhow::anyhow;
use codex_app_server_protocol::ClientNotification;
use codex_app_server_protocol::ClientRequest;
use codex_app_server_protocol::ServerNotification;
use codex_app_server_protocol::ServerRequest;
use codex_app_server_protocol::export_client_responses;
use codex_app_server_protocol::export_server_responses;
use std::ffi::OsStr;
use std::fs;
use std::io::Read;
use std::io::Write;
use std::path::Path;
use std::path::PathBuf;
use std::process::Command;
use ts_rs::TS;
const HEADER: &str = "// GENERATED CODE! DO NOT MODIFY BY HAND!\n\n";
pub fn generate_ts(out_dir: &Path, prettier: Option<&Path>) -> Result<()> {
ensure_dir(out_dir)?;
// Generate the TS bindings client -> server messages.
ClientRequest::export_all_to(out_dir)?;
export_client_responses(out_dir)?;
ClientNotification::export_all_to(out_dir)?;
// Generate the TS bindings server -> client messages.
ServerRequest::export_all_to(out_dir)?;
export_server_responses(out_dir)?;
ServerNotification::export_all_to(out_dir)?;
// Generate index.ts that re-exports all types.
generate_index_ts(out_dir)?;
// Prepend header to each generated .ts file
let ts_files = ts_files_in(out_dir)?;
for file in &ts_files {
prepend_header_if_missing(file)?;
}
// Format with Prettier by passing individual files (no shell globbing)
if let Some(prettier_bin) = prettier
&& !ts_files.is_empty()
{
let status = Command::new(prettier_bin)
.arg("--write")
.args(ts_files.iter().map(|p| p.as_os_str()))
.status()
.with_context(|| format!("Failed to invoke Prettier at {}", prettier_bin.display()))?;
if !status.success() {
return Err(anyhow!("Prettier failed with status {status}"));
}
}
Ok(())
}
fn ensure_dir(dir: &Path) -> Result<()> {
fs::create_dir_all(dir)
.with_context(|| format!("Failed to create output directory {}", dir.display()))
}
fn prepend_header_if_missing(path: &Path) -> Result<()> {
let mut content = String::new();
{
let mut f = fs::File::open(path)
.with_context(|| format!("Failed to open {} for reading", path.display()))?;
f.read_to_string(&mut content)
.with_context(|| format!("Failed to read {}", path.display()))?;
}
if content.starts_with(HEADER) {
return Ok(());
}
let mut f = fs::File::create(path)
.with_context(|| format!("Failed to open {} for writing", path.display()))?;
f.write_all(HEADER.as_bytes())
.with_context(|| format!("Failed to write header to {}", path.display()))?;
f.write_all(content.as_bytes())
.with_context(|| format!("Failed to write content to {}", path.display()))?;
Ok(())
}
fn ts_files_in(dir: &Path) -> Result<Vec<PathBuf>> {
let mut files = Vec::new();
for entry in
fs::read_dir(dir).with_context(|| format!("Failed to read dir {}", dir.display()))?
{
let entry = entry?;
let path = entry.path();
if path.is_file() && path.extension() == Some(OsStr::new("ts")) {
files.push(path);
}
}
files.sort();
Ok(files)
}
/// Generate an index.ts file that re-exports all generated types.
/// This allows consumers to import all types from a single file.
fn generate_index_ts(out_dir: &Path) -> Result<PathBuf> {
let mut entries: Vec<String> = Vec::new();
let mut stems: Vec<String> = ts_files_in(out_dir)?
.into_iter()
.filter_map(|p| {
let stem = p.file_stem()?.to_string_lossy().into_owned();
if stem == "index" { None } else { Some(stem) }
})
.collect();
stems.sort();
stems.dedup();
for name in stems {
entries.push(format!("export type {{ {name} }} from \"./{name}\";\n"));
}
let mut content =
String::with_capacity(HEADER.len() + entries.iter().map(String::len).sum::<usize>());
content.push_str(HEADER);
for line in &entries {
content.push_str(line);
}
let index_path = out_dir.join("index.ts");
let mut f = fs::File::create(&index_path)
.with_context(|| format!("Failed to create {}", index_path.display()))?;
f.write_all(content.as_bytes())
.with_context(|| format!("Failed to write {}", index_path.display()))?;
Ok(index_path)
}

View File

@@ -1,20 +0,0 @@
use anyhow::Result;
use clap::Parser;
use std::path::PathBuf;
#[derive(Parser, Debug)]
#[command(about = "Generate TypeScript bindings for the Codex protocol")]
struct Args {
/// Output directory where .ts files will be written
#[arg(short = 'o', long = "out", value_name = "DIR")]
out_dir: PathBuf,
/// Optional path to the Prettier executable to format generated files
#[arg(short = 'p', long = "prettier", value_name = "PRETTIER_BIN")]
prettier: Option<PathBuf>,
}
fn main() -> Result<()> {
let args = Args::parse();
codex_protocol_ts::generate_ts(&args.out_dir, args.prettier.as_deref())
}

Some files were not shown because too many files have changed in this diff Show More