Compare commits

113 Commits

Author SHA1 Message Date
Ahmed Ibrahim
b3f6608e6b reconfig 2025-11-21 12:14:58 -08:00
Dylan Hurd
0e051644a9 fix(scripts) next_minor_version should reset patch number (#7050)
## Summary
When incrementing the minor version, we should reset patch to 0, rather
than keeping it.
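
A minimal sketch of the intended behavior, assuming a version is represented as a `(major, minor, patch)` tuple. The Rust function below is illustrative only and is not the release script itself:

```rust
/// Hypothetical helper: bump the minor component and reset the patch
/// component to 0 instead of carrying it over.
fn next_minor_version(version: (u64, u64, u64)) -> (u64, u64, u64) {
    let (major, minor, _patch) = version;
    (major, minor + 1, 0)
}
```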

## Testing
- [x] tested locally with dry_run and `get_latest_release_version`
mocked out

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>
2025-11-21 10:17:12 -08:00
Michael Bolin
40d14c0756 fix: clear out duplicate entries for bash in the GitHub release (#7103)
https://github.com/openai/codex/pull/7005 introduced a new part of the
release process that added multiple files named `bash` in the `dist/`
folder used as the basis of the GitHub Release. I believe that all file
names in a GitHub Release have to be unique, which is why the recent
release build failed:

https://github.com/openai/codex/actions/runs/19577669780/job/56070183504

Based on the output of the **List** step, I believe these are the
appropriate artifacts to delete as a quick fix.
2025-11-21 09:59:30 -08:00
jif-oai
af65666561 chore: drop model_max_output_tokens (#7100) 2025-11-21 17:42:54 +00:00
Owen Lin
2ae1f81d84 [app-server] feat: add Declined status for command exec (#7101)
Add a `Declined` status for when we request an approval from the user
and the user declines. This allows us to distinguish from commands that
actually ran, but failed.

This behaves similarly to apply_patch / FileChange, which does the same
thing.
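
As a rough illustration of the distinction, here is a sketch with hypothetical names (`CommandExecutionStatus` and `final_status` are assumptions for this example, not the app-server types):

```rust
/// Hypothetical status enum; the variant names other than `Declined`
/// are assumptions made for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CommandExecutionStatus {
    InProgress,
    Completed,
    Failed,
    Declined,
}

/// `approved` is `None` when no approval was requested.
/// `exit_code` is `None` while the command has not finished.
fn final_status(approved: Option<bool>, exit_code: Option<i32>) -> CommandExecutionStatus {
    match (approved, exit_code) {
        // The user said no: the command never ran, so it did not "fail".
        (Some(false), _) => CommandExecutionStatus::Declined,
        (_, Some(0)) => CommandExecutionStatus::Completed,
        (_, Some(_)) => CommandExecutionStatus::Failed,
        (_, None) => CommandExecutionStatus::InProgress,
    }
}
```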
2025-11-21 09:19:39 -08:00
Michael Bolin
d363a0968e feat: codex-shell-tool-mcp (#7005)
This adds a GitHub workflow for building a new npm module we are
experimenting with that contains an MCP server for running Bash
commands. The new workflow, `shell-tool-mcp`, is a dependency of the
general `release` workflow so that we continue to use one version number
for all artifacts across the project in one GitHub release.

`.github/workflows/shell-tool-mcp.yml` is the primary workflow
introduced by this PR, which does the following:

- builds the `codex-exec-mcp-server` and `codex-execve-wrapper`
executables for both arm64 and x64 versions of Mac and Linux (preferring
the MUSL version for Linux)
- builds Bash (dynamically linked) for a [comically] large number of
platforms (both x64 and arm64 for most) with a small patch specified by
`shell-tool-mcp/patches/bash-exec-wrapper.patch`:
  - `debian-11`
  - `debian-12`
  - `ubuntu-20.04`
  - `ubuntu-22.04`
  - `ubuntu-24.04`
  - `centos-9`
  - `macos-13` (x64 only)
  - `macos-14` (arm64 only)
  - `macos-15` (arm64 only)
- builds the TypeScript for the [new] Node module declared in the
`shell-tool-mcp/` folder, which creates `bin/mcp-server.js`
- adds all of the native binaries to `shell-tool-mcp/vendor/` folder;
`bin/mcp-server.js` does a runtime check to determine which ones to
execute
- uses `npm pack` to create the `.tgz` for the module
- if `publish: true` is set, invokes the `npm publish` call with the
`.tgz`

The justification for building Bash for so many different operating
systems is that, since it is dynamically linked, we want to increase
our confidence that the version we build is compatible with the glibc
of whatever OS we end up running on. (Note this is less of a concern with
`codex-exec-mcp-server` and `codex-execve-wrapper` on Linux, as they are
statically linked.)

This PR also introduces the code for the npm module in `shell-tool-mcp/`
(the proposed module name is `@openai/codex-shell-tool-mcp`). Initially,
I intended the module to be a single file of vanilla JavaScript (like
[`codex-cli/bin/codex.js`](ab5972d447/codex-cli/bin/codex.js)),
but some of the logic seemed a bit tricky, so I decided to port it to
TypeScript and add unit tests.

`shell-tool-mcp/src/index.ts` defines the `main()` function for the
module, which performs runtime checks to determine the clang triple to
find the path to the Rust executables within the `vendor/` folder
(`resolveTargetTriple()`). It uses a combination of `readOsRelease()`
and `resolveBashPath()` to determine the correct Bash executable to run
in the environment. Ultimately, it spawns a command like the following:

```
codex-exec-mcp-server \
    --execve codex-execve-wrapper \
    --bash custom-bash "$@"
```
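
The triple lookup can be sketched as follows. The module itself is TypeScript, so this Rust function is only an illustration of the mapping implied by the `vendor/` layout. Its name and its `os`/`arch` string inputs are assumptions:

```rust
/// Maps an OS/architecture pair to the `vendor/` subfolder name.
/// The input strings follow `std::env::consts::{OS, ARCH}`.
fn resolve_target_triple(os: &str, arch: &str) -> Option<&'static str> {
    match (os, arch) {
        ("macos", "aarch64") => Some("aarch64-apple-darwin"),
        ("macos", "x86_64") => Some("x86_64-apple-darwin"),
        ("linux", "aarch64") => Some("aarch64-unknown-linux-musl"),
        ("linux", "x86_64") => Some("x86_64-unknown-linux-musl"),
        // Any other platform has no vendored binaries in this layout.
        _ => None,
    }
}
```

A caller could pass `std::env::consts::OS` and `std::env::consts::ARCH` and treat `None` as an unsupported platform.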

Note `.github/workflows/shell-tool-mcp-ci.yml` defines a fairly standard
CI job for the module (`format`/`build`/`test`).

To test this PR, I pushed this branch to my personal fork of Codex and
ran the CI job there:

https://github.com/bolinfest/codex/actions/runs/19564311320

Admittedly, the graph looks a bit wild now:

<img width="5115" height="2969" alt="Screenshot 2025-11-20 at 11 44
58 PM"
src="https://github.com/user-attachments/assets/cc5ef306-efc1-4ed7-a137-5347e394f393"
/>

But when it finished, I was able to download `codex-shell-tool-mcp-npm`
from the **Artifacts** for the workflow in an empty temp directory,
unzip the `.zip` and then the `.tgz` inside it, followed by `xattr -rc
.` to remove the quarantine bits. Then I ran:

```shell
npx @modelcontextprotocol/inspector node /private/tmp/foobar4/package/bin/mcp-server.js
```

which launched the MCP Inspector and I was able to use it as expected!
This bodes well that this should work once the package is published to
npm:

```shell
npx @modelcontextprotocol/inspector npx @openai/codex-shell-tool-mcp
```

Also, to verify the package contains what I expect:

```shell
/tmp/foobar4/package$ tree
.
├── bin
│   └── mcp-server.js
├── package.json
├── README.md
└── vendor
    ├── aarch64-apple-darwin
    │   ├── bash
    │   │   ├── macos-14
    │   │   │   └── bash
    │   │   └── macos-15
    │   │       └── bash
    │   ├── codex-exec-mcp-server
    │   └── codex-execve-wrapper
    ├── aarch64-unknown-linux-musl
    │   ├── bash
    │   │   ├── centos-9
    │   │   │   └── bash
    │   │   ├── debian-11
    │   │   │   └── bash
    │   │   ├── debian-12
    │   │   │   └── bash
    │   │   ├── ubuntu-20.04
    │   │   │   └── bash
    │   │   ├── ubuntu-22.04
    │   │   │   └── bash
    │   │   └── ubuntu-24.04
    │   │       └── bash
    │   ├── codex-exec-mcp-server
    │   └── codex-execve-wrapper
    ├── x86_64-apple-darwin
    │   ├── bash
    │   │   └── macos-13
    │   │       └── bash
    │   ├── codex-exec-mcp-server
    │   └── codex-execve-wrapper
    └── x86_64-unknown-linux-musl
        ├── bash
        │   ├── centos-9
        │   │   └── bash
        │   ├── debian-11
        │   │   └── bash
        │   ├── debian-12
        │   │   └── bash
        │   ├── ubuntu-20.04
        │   │   └── bash
        │   ├── ubuntu-22.04
        │   │   └── bash
        │   └── ubuntu-24.04
        │       └── bash
        ├── codex-exec-mcp-server
        └── codex-execve-wrapper

26 directories, 26 files
```
2025-11-21 08:16:36 -08:00
jif-oai
bce030ddb5 Revert "fix: read max_output_tokens param from config" (#7088)
Reverts openai/codex#4139
2025-11-21 11:40:02 +01:00
iceweasel-oai
f4af6e389e Windows Sandbox: support network_access and exclude_tmpdir_env_var (#7030) 2025-11-20 22:59:55 -08:00
Eric Traut
b315b22f7b Fixed the deduplicator github action (#7070)
It stopped working (found zero duplicates) starting three days ago when
the model was switched from `gpt-5` to `gpt-5.1`. I'm not sure why it
stopped working. This is an attempt to get it working again by using the
default model for the codex action (which is presumably
`gpt-5.1-codex-max`).
2025-11-20 22:46:55 -08:00
Yorling
c9e149fd5c fix: read max_output_tokens param from config (#4139)
Request param `max_output_tokens` is documented in
`https://github.com/openai/codex/blob/main/docs/config.md`,
but nothing in the code reads it from the config. This commit reads it
from the config for the GPT Responses API.

see https://github.com/openai/codex/issues/4138 for issue report.

Signed-off-by: Yorling <shallowcloud@yeah.net>
2025-11-20 22:46:34 -08:00
Eric Traut
bacdc004be Fixed two tests that can fail in some environments that have global git rewrite rules (#7068)
This fixes https://github.com/openai/codex/issues/7044
2025-11-20 22:45:40 -08:00
pakrym-oai
ab5972d447 Support all types of search actions (#7061)
Fixes the 

```
{
  "error": {
    "message": "Invalid value: 'other'. Supported values are: 'search', 'open_page', and 'find_in_page'.",
    "type": "invalid_request_error",
    "param": "input[150].action.type",
    "code": "invalid_value"
  }
}
```
error.


The actual-actual fix here is supporting an absent `query` parameter.
2025-11-20 20:45:28 -08:00
pakrym-oai
767b66f407 Migrate coverage to shell_command (#7042) 2025-11-21 03:44:00 +00:00
pakrym-oai
830ab4ce20 Support full powershell paths in is_safe_command (#7055)
New shell implementation always uses full paths.
2025-11-20 19:29:15 -08:00
Dylan Hurd
3f73e2c892 fix(app-server) remove www warning (#7046)
### Summary
After #7022, we no longer need this warning. We should also clean up the
schema for the notification, but this is a quick fix to just stop the
behavior in the VSCE.

## Testing
- [x] Ran locally
2025-11-20 19:18:39 -08:00
Dylan Hurd
1822ffe870 feat(tui): default reasoning selection to medium (#7040)
## Summary
- allow selection popups to request an initial highlighted row
- begin the /models reasoning selector focused on the default effort

## Testing
- just fmt
- just fix -p codex-tui
- cargo test -p codex-tui



https://github.com/user-attachments/assets/b322aeb1-e8f3-4578-92f7-5c2fa5ee4c98



------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_691f75e8fc188322a910fbe2138666ef)
2025-11-20 17:06:04 -08:00
Celia Chen
7e2165f394 [app-server] update doc with codex error info (#6941)
Document new codex error info. Also fixed the name from
`codex_error_code` to `codex_error_info`.
2025-11-21 01:02:37 +00:00
Michael Bolin
8e5f38c0f0 feat: waiting for an elicitation should not count against a shell tool timeout (#6973)
Previously, we were running into an issue where we would run the `shell`
tool call with a timeout of 10s, but if it fired an elicitation asking for
user approval, the time the user took to respond to the elicitation was
counted against the 10s timeout, so the `shell` tool call would fail with
a timeout error unless the user was very fast!

This PR addresses this issue by introducing a "stopwatch" abstraction
that is used to manage the timeout. The idea is:

- `Stopwatch::new()` is called with the _real_ timeout of the `shell`
tool call.
- `process_exec_tool_call()` is called with the `Cancellation` variant
of `ExecExpiration` because it should not manage its own timeout in this
case
- the `Stopwatch` expiration is wired up to the `cancel_rx` passed to
`process_exec_tool_call()`
- when an elicitation for the `shell` tool call is received, the
`Stopwatch` pauses
- because it is possible for multiple elicitations to arrive
concurrently, it keeps track of the number of "active pauses" and does
not resume until that counter goes down to zero
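
The pause-counting idea above can be sketched with the standard library alone. This is not the `Stopwatch` from the PR (which is async): `PausableStopwatch` and its explicit `now` parameters are assumptions chosen so the logic can be followed and tested without sleeping:

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the PR's `Stopwatch`.
struct PausableStopwatch {
    limit: Duration,
    accumulated: Duration,
    running_since: Option<Instant>,
    active_pauses: u32,
}

impl PausableStopwatch {
    fn new(limit: Duration, now: Instant) -> Self {
        Self { limit, accumulated: Duration::ZERO, running_since: Some(now), active_pauses: 0 }
    }

    /// Called when an elicitation arrives; only the first pause stops the clock.
    fn pause(&mut self, now: Instant) {
        if self.active_pauses == 0 {
            if let Some(start) = self.running_since.take() {
                self.accumulated += now.duration_since(start);
            }
        }
        self.active_pauses += 1;
    }

    /// Called when an elicitation is answered; the clock restarts only
    /// once every outstanding pause has been released.
    fn resume(&mut self, now: Instant) {
        if self.active_pauses == 0 {
            return;
        }
        self.active_pauses -= 1;
        if self.active_pauses == 0 {
            self.running_since = Some(now);
        }
    }

    fn elapsed(&self, now: Instant) -> Duration {
        match self.running_since {
            Some(start) => self.accumulated + now.duration_since(start),
            None => self.accumulated,
        }
    }

    fn is_expired(&self, now: Instant) -> bool {
        self.elapsed(now) >= self.limit
    }
}
```

Time spent while `active_pauses > 0` is never added to `accumulated`, so a slow human response cannot expire the timeout in this model.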

I verified that I can test the MCP server using
`@modelcontextprotocol/inspector` and specify `git status` as the
`command` with a timeout of 500ms and that the elicitation pops up and I
have all the time in the world to respond whereas previous to this PR,
that would not have been possible.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/6973).
* #7005
* __->__ #6973
* #6972
2025-11-20 16:45:38 -08:00
Ahmed Ibrahim
1388e99674 fix flaky tool_call_output_exceeds_limit_truncated_chars_limit (#7043)
I suspect this is flaky because the wall time can become 0, 0.1, or 1.
2025-11-20 16:36:29 -08:00
Michael Bolin
f56d1dc8fc feat: update process_exec_tool_call() to take a cancellation token (#6972)
This updates `ExecParams` so that instead of taking `timeout_ms:
Option<u64>`, it now takes a more general cancellation mechanism,
`ExecExpiration`, which is an enum that includes a
`Cancellation(tokio_util::sync::CancellationToken)` variant.

If the cancellation token is fired, then `process_exec_tool_call()`
returns in the same way as if a timeout was exceeded.
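
A std-only sketch of the shape of this change. The real variant wraps `tokio_util::sync::CancellationToken`. Here an `Arc<AtomicBool>` stands in for the token, and `ExpirationSketch` and `is_expired` are hypothetical names:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for `ExecExpiration`.
enum ExpirationSketch {
    /// The callee manages its own deadline.
    Timeout(Duration),
    /// The caller decides when to stop (an `AtomicBool` stands in for
    /// a cancellation token in this sketch).
    Cancellation(Arc<AtomicBool>),
}

impl ExpirationSketch {
    /// Both variants report expiry through the same path, mirroring
    /// "returns in the same way as if a timeout was exceeded".
    fn is_expired(&self, started: Instant, now: Instant) -> bool {
        match self {
            ExpirationSketch::Timeout(limit) => now.duration_since(started) >= *limit,
            ExpirationSketch::Cancellation(flag) => flag.load(Ordering::SeqCst),
        }
    }
}
```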

This is necessary so that in #6973, we can manage the timeout logic
external to the `process_exec_tool_call()` because we want to "suspend"
the timeout when an elicitation from a human user is pending.








---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/6972).
* #7005
* #6973
* __->__ #6972
2025-11-20 16:29:57 -08:00
Ahmed Ibrahim
9be310041b migrate collect_tool_identifiers_for_model to test_codex (#7041)
Maybe this fixes the flakiness.
2025-11-20 16:02:50 -08:00
Xiao-Yong Jin
0fbcdd77c8 core: make shell behavior portable on FreeBSD (#7039)
- Use /bin/sh instead of /bin/bash on FreeBSD/OpenBSD in the process
group timeout test to avoid command-not-found failures.

- Accept /usr/local/bin/bash as a valid SHELL path to match common
FreeBSD installations.

- Switch the shell serialization duration test to /bin/sh for improved
portability across Unix platforms.

With this change, `cargo test -p codex-core --lib` runs and passes on
FreeBSD.
2025-11-20 16:01:35 -08:00
Celia Chen
9bce050385 [app-server & core] introduce new codex error code and v2 app-server error events (#6938)
This PR does two things:
1. populate a new `codex_error_code` protocol in error events sent from
core to client;
2. old v1 core events `codex/event/stream_error` and `codex/event/error`
will now both become `error`. We also show codex error code for
turncompleted -> error status.

new events in app server test:
```
< {
<   "method": "codex/event/stream_error",
<   "params": {
<     "conversationId": "019aa34c-0c14-70e0-9706-98520a760d67",
<     "id": "0",
<     "msg": {
<       "codex_error_code": {
<         "response_stream_disconnected": {
<           "http_status_code": 401
<         }
<       },
<       "message": "Reconnecting... 2/5",
<       "type": "stream_error"
<     }
<   }
< }

< {
<   "method": "error",
<   "params": {
<     "error": {
<       "codexErrorCode": {
<         "responseStreamDisconnected": {
<           "httpStatusCode": 401
<         }
<       },
<       "message": "Reconnecting... 2/5"
<     }
<   }
< }

< {
<   "method": "turn/completed",
<   "params": {
<     "turn": {
<       "error": {
<         "codexErrorCode": {
<           "responseTooManyFailedAttempts": {
<             "httpStatusCode": 401
<           }
<         },
<         "message": "exceeded retry limit, last status: 401 Unauthorized, request id: 9a1b495a1a97ed3e-SJC"
<       },
<       "id": "0",
<       "items": [],
<       "status": "failed"
<     }
<   }
< }
```
2025-11-20 23:06:55 +00:00
iceweasel-oai
3f92ad4190 add deny ACEs for world writable dirs (#7022)
Our Restricted Token contains 3 SIDs (Logon, Everyone, {WorkspaceWrite
Capability || ReadOnly Capability})

Because it must include Everyone, we were vulnerable to directories
that allow writes to Everyone. Even though those directories do not have
ACEs that enable our capability SIDs to write to them, they could still
be written to even in ReadOnly mode, or even in WorkspaceWrite mode if
they are outside of a writable root.

A solution to this is to explicitly add *Deny* ACEs to these
directories, always for the ReadOnly Capability SID, and for the
WorkspaceWrite SID if the directory is outside of a workspace root.

Under a restricted token, Windows always checks Deny ACEs before Allow
ACEs, so even though our restricted token would allow a write to these
directories due to the Everyone SID, it fails first because of the Deny
ACE on the capability SID.
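
A deliberately simplified toy model of why the Deny ACE wins. This is not the Windows access-check algorithm, and the SID strings and type names below are placeholders invented for the sketch:

```rust
#[derive(Clone, Copy, PartialEq, Eq)]
enum AceKind {
    Allow,
    Deny,
}

/// Toy ACE: a kind plus the SID it applies to (plain strings here).
struct Ace<'a> {
    kind: AceKind,
    sid: &'a str,
}

fn applies(ace: &Ace, token_sids: &[&str]) -> bool {
    token_sids.iter().any(|sid| *sid == ace.sid)
}

/// Toy rule: any matching Deny ACE rejects the write before Allow
/// ACEs are considered; otherwise one matching Allow ACE is enough.
fn write_allowed(token_sids: &[&str], acl: &[Ace]) -> bool {
    if acl.iter().any(|a| a.kind == AceKind::Deny && applies(a, token_sids)) {
        return false;
    }
    acl.iter().any(|a| a.kind == AceKind::Allow && applies(a, token_sids))
}
```

In this model, a directory whose only ACE allows `Everyone` is writable by the restricted token, and adding a Deny ACE for the capability SID flips the result.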
2025-11-20 14:50:33 -08:00
Ahmed Ibrahim
54ee302a06 Attempt to fix unified_exec_formats_large_output_summary flakiness (#7029)
Second attempt to fix this test after
https://github.com/openai/codex/pull/6884. I think this flakiness is
happening because yield_time is too small for a 10,000 step loop in
Python.
2025-11-20 14:38:04 -08:00
Ahmed Ibrahim
44fa06ae36 fix flaky test: approval_matrix_covers_all_modes (#7028)
Looks like it sometimes flakes around 30. Let's give it more time.
2025-11-20 14:37:42 -08:00
pakrym-oai
856f97f449 Delete shell_command feature (#7024) 2025-11-20 14:14:56 -08:00
zhao-oai
fe7a3f0c2b execpolicycheck command in codex cli (#7012)
Adds the execpolicycheck tool to the codex CLI.

This is useful for validating policies (can be multiple) against
commands.

It will also surface errors in policy syntax:
<img width="1150" height="281" alt="Screenshot 2025-11-19 at 12 46
21 PM"
src="https://github.com/user-attachments/assets/8f99b403-564c-4172-acc9-6574a8d13dc3"
/>

This PR also changes the output format when there's no match in the CLI.
Instead of returning the raw string `noMatch`, we return
`{"noMatch":{}}`.

This PR is a rewrite of https://github.com/openai/codex/pull/6932 (due
to the numerous merge conflicts present in the original PR).

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>
2025-11-20 16:44:31 -05:00
zhao-oai
c30ca0d5b6 increasing user shell timeout to 1 hour (#7025)
Sets the user shell timeout to an unreasonably high value, since there
isn't an easy way to have a command run without timeouts.

Currently, the user shell command timeout is 10 seconds.
2025-11-20 13:39:16 -08:00
Weiller Carvalho
a8a6cbdd1c fix: route feedback issue links by category (#6840)
## Summary
- TUI feedback note now only links to the bug-report template when the
category is bug/bad result.
- Good result/other feedback shows a thank-you+thread ID instead of
funneling people to file a bug.
- Added a helper + unit test so future changes keep the behavior
consistent.

## Testing
  - just fmt
  - just fix -p codex-tui
  - cargo test -p codex-tui

  Fixes #6839
2025-11-20 13:20:03 -08:00
Dmitri Khokhlov
e4257f432e codex-exec: allow resume --last to read prompt #6717 (#6719)
### Description

- codex exec --json resume --last "<prompt>" bailed out because clap
treated the prompt as SESSION_ID. I removed the conflicts_with flag and
reinterpret that positional as a prompt when
--last is set, so the flow now keeps working in JSON mode.
(codex-rs/exec/src/cli.rs:84-104, codex-rs/exec/src/lib.rs:75-130)
- Added a regression test that exercises resume --last in JSON mode to
ensure the prompt is accepted and the rollout file is updated.
(codex-rs/exec/tests/suite/resume.rs:126-178)
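
The reinterpretation can be sketched independently of clap. The function and struct below are hypothetical and only model the rule described above (with `--last` and a single positional, that positional is the prompt). They are not the code in `cli.rs`:

```rust
#[derive(Debug, PartialEq)]
struct ResumeArgs {
    session_id: Option<String>,
    prompt: Option<String>,
}

/// Hypothetical post-parse step: decide what the positionals mean.
fn interpret_resume_args(
    last: bool,
    first_positional: Option<String>,
    second_positional: Option<String>,
) -> ResumeArgs {
    if last && second_positional.is_none() {
        // `resume --last "<prompt>"`: no session id is needed, so the
        // lone positional is taken as the prompt.
        ResumeArgs { session_id: None, prompt: first_positional }
    } else {
        ResumeArgs { session_id: first_positional, prompt: second_positional }
    }
}
```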

### Testing

  - just fmt
  - cargo test -p codex-exec
  - just fix -p codex-exec
  - cargo test -p codex-exec

#6717

Signed-off-by: Dmitri Khokhlov <dkhokhlov@cribl.io>
2025-11-20 13:10:49 -08:00
Jeremy Rose
2c793083f4 tui: centralize markdown styling and make inline code cyan (#7023)
<img width="762" height="271" alt="Screenshot 2025-11-20 at 12 54 06 PM"
src="https://github.com/user-attachments/assets/10021d63-27eb-407b-8fcc-43740e3bfb0f"
/>
2025-11-20 21:06:22 +00:00
Lionel Cheng
e150798baf Bumped number of fuzzy search results from 8 to 20 (#7013)
I just noticed that in the VSCode / Codex extension when you type @ the
number of results is around 70:

- small video of searching for `mod.rs` inside `codex` repository:
https://github.com/user-attachments/assets/46e53d31-adff-465e-b32b-051c4c1c298c

- while in the CLI the number of results is currently 8, which is
quite small:
<img width="615" height="439" alt="Screenshot 2025-11-20 at 09 42 04"
src="https://github.com/user-attachments/assets/1c6d12cb-3b1f-4d5b-9ad3-6b12975eaaec"
/>

I bumped it to 20. I had several cases where I wanted a file and did not
find it because the number of results was too small.

Signed-off-by: lionel-oai <lionel@openai.com>
Co-authored-by: lionel-oai <lionel@openai.com>
2025-11-20 12:33:12 -08:00
Kyuheon Kim
33a6cc66ab fix(cli): correct mcp add usage order (#6827)
## Summary
- add an explicit `override_usage` string to `AddArgs` so clap prints
`<NAME>` before the command/url choice, matching the actual parser and
docs

### Before

Usage: codex mcp add [OPTIONS] <COMMAND|--url <URL>> <NAME>


### After

Usage: codex mcp add [OPTIONS] <NAME> [--url <URL> | -- <COMMAND>...]

---------

Signed-off-by: kyuheon-kr <kyuheon.kr@gmail.com>
2025-11-20 12:32:12 -08:00
pakrym-oai
52d0ec4cd8 Delete tiktoken-rs (#7018) 2025-11-20 11:15:04 -08:00
LIHUA
397279d46e Fix: Improve text encoding for shell output in VSCode preview (#6178) (#6182)
## 🐛 Problem

Users running commands with non-ASCII characters (like Russian text
"пример") in Windows/WSL environments experience garbled text in
VSCode's shell preview window, with Unicode replacement characters (�)
appearing instead of the actual text.

**Issue**: https://github.com/openai/codex/issues/6178

## 🔧 Root Cause

The issue was in `StreamOutput<Vec<u8>>::from_utf8_lossy()` method in
`codex-rs/core/src/exec.rs`, which used `String::from_utf8_lossy()` to
convert shell output bytes to strings. This function immediately
replaces any invalid UTF-8 byte sequences with replacement characters,
without attempting to decode using other common encodings.

In Windows/WSL environments, shell output often uses encodings like:

- Windows-1252 (common Windows encoding)
- Latin-1/ISO-8859-1 (extended ASCII)

## 🛠️ Solution

Replaced the simple `String::from_utf8_lossy()` call with intelligent
encoding detection via a new `bytes_to_string_smart()` function that
tries multiple encoding strategies:

1. **UTF-8** (fast path for valid UTF-8)
2. **Windows-1252** (handles Windows-specific characters in 0x80-0x9F
range)
3. **Latin-1** (fallback for extended ASCII)
4. **Lossy UTF-8** (final fallback, same as before)
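
The fallback chain can be sketched with the standard library alone. This is not the PR's `bytes_to_string_smart()`: the function names are hypothetical, no detection heuristics are modeled, and the lossy step is omitted because the Latin-1 step in this sketch can never fail:

```rust
/// Code points for Windows-1252 bytes 0x80..=0x9F; `None` marks the
/// five bytes that Windows-1252 leaves undefined.
const CP1252_HIGH: [Option<char>; 32] = [
    Some('\u{20AC}'), None, Some('\u{201A}'), Some('\u{0192}'),
    Some('\u{201E}'), Some('\u{2026}'), Some('\u{2020}'), Some('\u{2021}'),
    Some('\u{02C6}'), Some('\u{2030}'), Some('\u{0160}'), Some('\u{2039}'),
    Some('\u{0152}'), None, Some('\u{017D}'), None,
    None, Some('\u{2018}'), Some('\u{2019}'), Some('\u{201C}'),
    Some('\u{201D}'), Some('\u{2022}'), Some('\u{2013}'), Some('\u{2014}'),
    Some('\u{02DC}'), Some('\u{2122}'), Some('\u{0161}'), Some('\u{203A}'),
    Some('\u{0153}'), None, Some('\u{017E}'), Some('\u{0178}'),
];

/// Strict Windows-1252 decode: fails if an undefined byte appears.
fn decode_cp1252(bytes: &[u8]) -> Option<String> {
    bytes
        .iter()
        .map(|&b| match b {
            0x80..=0x9F => CP1252_HIGH[(b - 0x80) as usize],
            // ASCII and 0xA0..=0xFF coincide with Latin-1.
            _ => Some(b as char),
        })
        .collect()
}

fn decode_shell_output(bytes: &[u8]) -> String {
    // 1. Fast path: valid UTF-8 is returned unchanged.
    if let Ok(text) = std::str::from_utf8(bytes) {
        return text.to_owned();
    }
    // 2. Windows-1252, which covers curly quotes and similar in 0x80..=0x9F.
    if let Some(text) = decode_cp1252(bytes) {
        return text;
    }
    // 3. Latin-1: every byte maps to the code point of the same value.
    bytes.iter().map(|&b| b as char).collect()
}
```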

## 📁 Changes

### New Files

- `codex-rs/core/src/text_encoding.rs` - Smart encoding detection module
- `codex-rs/core/tests/suite/text_encoding_fix.rs` - Integration tests

### Modified Files

- `codex-rs/core/src/lib.rs` - Added text_encoding module
- `codex-rs/core/src/exec.rs` - Updated StreamOutput::from_utf8_lossy()
- `codex-rs/core/tests/suite/mod.rs` - Registered new test module

## Testing

- **5 unit tests** covering UTF-8, Windows-1252, Latin-1, and fallback
scenarios
- **2 integration tests** simulating the exact Issue #6178 scenario
- **Demonstrates improvement** over the previous
`String::from_utf8_lossy()` approach

All tests pass:

```bash
cargo test -p codex-core text_encoding
cargo test -p codex-core test_shell_output_encoding_issue_6178
```

## 🎯 Impact

- **Eliminates garbled text** in VSCode shell preview for non-ASCII
content
- **Supports Windows/WSL environments** with proper encoding detection
- **Zero performance impact** for UTF-8 text (fast path)
- **Backward compatible** - UTF-8 content works exactly as before
- **Handles edge cases** with robust fallback mechanism

## 🧪 Test Scenarios

The fix has been tested with:

- Russian text ("пример")
- Windows-1252 quotation marks ("“test”")
- Latin-1 accented characters ("café")
- Mixed encoding content
- Invalid byte sequences (graceful fallback)

## 📋 Checklist

- [X] Addresses the reported issue
- [X] Includes comprehensive tests
- [X] Maintains backward compatibility
- [X] Follows project coding conventions
- [X] No breaking changes

---------

Co-authored-by: Josh McKinney <joshka@openai.com>
2025-11-20 11:04:11 -08:00
pakrym-oai
30ca89424c Always fallback to real shell (#6953)
Either cmd.exe or `/bin/sh`.
2025-11-20 10:58:46 -08:00
Eric Traut
d909048a85 Added feature switch to disable animations in TUI (#6870)
This PR adds support for a new feature flag `tui.animations`. By
default, the TUI uses animations in its welcome screen, "working"
spinners, and "shimmer" effects. These animations can interfere with
screen readers, so it's good to provide a way to disable them.

This change is inspired by [a
PR](https://github.com/openai/codex/pull/4014) contributed by @Orinks.
That PR has faltered a bit, but I think the core idea is sound. This
version incorporates feedback from @aibrahim-oai. In particular:
1. It uses a feature flag (`tui.animations`) rather than the unqualified
CLI key `no-animations`. Feature flags are the preferred way to expose
boolean switches. They are also exposed via CLI command switches.
2. It includes more complete documentation.
3. It disables a few animations that the other PR omitted.
2025-11-20 10:40:08 -08:00
jif-oai
888c6dd9e7 fix: command formatting for user commands (#7002) 2025-11-20 17:29:15 +01:00
hanson-openai
b5dd189067 Allow unified_exec to early exit (if the process terminates before yield_time_ms) (#6867)
Thread an `exit_notify` tokio `Notify` through to the
`UnifiedExecSession` so that we can return early if the command
terminates before `yield_time_ms`.

As Codex review correctly pointed out below 🙌 we also need an
`exit_signaled` flag so that commands which finish before we start
waiting can also exit early.

Since the default `yield_time_ms` is now 10s, this means that we don't
have to wait 10s for trivial commands like ls, sed, etc (which are the
majority of agent commands 😅)
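
The notify-plus-flag pattern can be sketched with `std::sync` in place of tokio's `Notify`. `ExitSignal` and its methods are hypothetical names. The boolean plays the role of `exit_signaled`, so an exit that happens before the wait starts is not lost:

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Hypothetical std-only analogue of `exit_notify` + `exit_signaled`.
#[derive(Default)]
struct ExitSignal {
    exited: Mutex<bool>,
    notify: Condvar,
}

impl ExitSignal {
    /// Called by whoever observes the process terminating.
    fn signal_exit(&self) {
        *self.exited.lock().unwrap() = true;
        self.notify.notify_all();
    }

    /// Waits up to `yield_time`, returning early (with `true`) as soon
    /// as the exit flag is set, including when it was set beforehand.
    fn wait_for_exit(&self, yield_time: Duration) -> bool {
        let guard = self.exited.lock().unwrap();
        let (guard, _timed_out) = self
            .notify
            .wait_timeout_while(guard, yield_time, |exited| !*exited)
            .unwrap();
        *guard
    }
}
```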

---------

Co-authored-by: jif-oai <jif@openai.com>
2025-11-20 13:34:41 +01:00
Michael Bolin
54e6e4ac32 fix: when displaying execv, show file instead of arg0 (#6966)
After merging https://github.com/openai/codex/pull/6958, I realized that
the `command` I was displaying was not quite right. Since we know it, we
should show the _exact_ program being executed (the first arg to
`execve(2)`) rather than `arg0` to be more precise.

Below is the same command I used to test
https://github.com/openai/codex/pull/6958, but now you can see it shows
`/Users/mbolin/.openai/bin/git` instead of just `git`.

<img width="1526" height="1444" alt="image"
src="https://github.com/user-attachments/assets/428128d1-c658-456e-a64e-fc6a0009cb34"
/>
2025-11-19 22:42:58 -08:00
Michael Bolin
e8af41de8a fix: clean up elicitation used by exec-server (#6958)
Using appropriate message/title fields, I think this looks better now:

<img width="3370" height="3208" alt="image"
src="https://github.com/user-attachments/assets/e9bbf906-4ba8-4563-affc-62cdc6c97342"
/>

Though note that in the current version of the Inspector (`0.17.2`), you
cannot hit **Submit** until you fill out the field. I believe this is a
bug in the Inspector, as it does not properly handle the case when all
fields are optional. I put up a fix:

https://github.com/modelcontextprotocol/inspector/pull/926
2025-11-20 04:59:17 +00:00
Owen Lin
d6c30ed25e [app-server] feat: v2 apply_patch approval flow (#6760)
This PR adds the API V2 version of the apply_patch approval flow, which
centers around `ThreadItem::FileChange`.

This PR wires the new RPC (`item/fileChange/requestApproval`, V2 only)
and related events (`item/started`, `item/completed` for
`ThreadItem::FileChange`, which are emitted in both V1 and V2) through
the app-server
protocol. The new approval RPC is only sent when the user initiates a
turn with the new `turn/start` API so we don't break backwards
compatibility with VSCE.

Similar to https://github.com/openai/codex/pull/6758, the approach I
took was to make as few changes to the Codex core as possible,
leveraging existing `EventMsg` core events, and translating those in
app-server. I did have to add a few additional fields to
`EventMsg::PatchApplyBegin` and `EventMsg::PatchApplyEnd`, but those
were fairly lightweight.

However, the `EventMsg`s emitted by core are the following:
```
1) Auto-approved (no request for approval)

- EventMsg::PatchApplyBegin
- EventMsg::PatchApplyEnd

2) Approved by user
- EventMsg::ApplyPatchApprovalRequest
- EventMsg::PatchApplyBegin
- EventMsg::PatchApplyEnd

3) Declined by user
- EventMsg::ApplyPatchApprovalRequest
- EventMsg::PatchApplyBegin
- EventMsg::PatchApplyEnd
```

For a request triggering an approval, this would result in:
```
item/fileChange/requestApproval
item/started
item/completed
```

which is different from the `ThreadItem::CommandExecution` flow
introduced in https://github.com/openai/codex/pull/6758, which does the
below and is preferable:
```
item/started
item/commandExecution/requestApproval
item/completed
```

To fix this, we leverage `TurnSummaryStore` on codex_message_processor
to store a little bit of state, allowing us to fire `item/started` and
`item/fileChange/requestApproval` whenever we receive the underlying
`EventMsg::ApplyPatchApprovalRequest`, and no-oping when we receive the
`EventMsg::PatchApplyBegin` later.

This is much less invasive than modifying the order of EventMsg within
core (I tried).
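
A sketch of that bookkeeping, under the assumption that remembering which items have already been announced is enough. `CoreEvent` and `StartedItems` are hypothetical stand-ins for `EventMsg` and `TurnSummaryStore`, and the notifications are reduced to their method names:

```rust
use std::collections::HashSet;

/// Hypothetical, trimmed-down stand-in for the relevant `EventMsg` variants.
enum CoreEvent {
    ApplyPatchApprovalRequest { call_id: String },
    PatchApplyBegin { call_id: String },
    PatchApplyEnd { call_id: String },
}

/// Remembers which items already had `item/started` emitted.
#[derive(Default)]
struct StartedItems {
    started: HashSet<String>,
}

impl StartedItems {
    /// Returns the app-server method names to emit for one core event.
    fn translate(&mut self, event: &CoreEvent) -> Vec<&'static str> {
        match event {
            CoreEvent::ApplyPatchApprovalRequest { call_id } => {
                self.started.insert(call_id.clone());
                vec!["item/started", "item/fileChange/requestApproval"]
            }
            CoreEvent::PatchApplyBegin { call_id } => {
                // `insert` returns false if the approval path already
                // announced this item, making the later event a no-op.
                if self.started.insert(call_id.clone()) {
                    vec!["item/started"]
                } else {
                    vec![]
                }
            }
            CoreEvent::PatchApplyEnd { call_id } => {
                self.started.remove(call_id);
                vec!["item/completed"]
            }
        }
    }
}
```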

The resulting payloads:
```
{
  "method": "item/started",
  "params": {
    "item": {
      "changes": [
        {
          "diff": "Hello from Codex!\n",
          "kind": "add",
          "path": "/Users/owen/repos/codex/codex-rs/APPROVAL_DEMO.txt"
        }
      ],
      "id": "call_Nxnwj7B3YXigfV6Mwh03d686",
      "status": "inProgress",
      "type": "fileChange"
    }
  }
}
```

```
{
  "id": 0,
  "method": "item/fileChange/requestApproval",
  "params": {
    "grantRoot": null,
    "itemId": "call_Nxnwj7B3YXigfV6Mwh03d686",
    "reason": null,
    "threadId": "019a9e11-8295-7883-a283-779e06502c6f",
    "turnId": "1"
  }
}
```

```
{
  "id": 0,
  "result": {
    "decision": "accept"
  }
}
```

```
{
  "method": "item/completed",
  "params": {
    "item": {
      "changes": [
        {
          "diff": "Hello from Codex!\n",
          "kind": "add",
          "path": "/Users/owen/repos/codex/codex-rs/APPROVAL_DEMO.txt"
        }
      ],
      "id": "call_Nxnwj7B3YXigfV6Mwh03d686",
      "status": "completed",
      "type": "fileChange"
    }
  }
}
```
2025-11-19 20:13:31 -08:00
zhao-oai
fb9849e1e3 migrating execpolicy -> execpolicy-legacy and execpolicy2 -> execpolicy (#6956) 2025-11-19 19:14:10 -08:00
Celia Chen
72a1453ac5 Revert "[core] add optional status_code to error events (#6865)" (#6955)
This reverts commit c2ec477d93.

2025-11-20 01:26:14 +00:00
Ahmed Ibrahim
6d67b8b283 stop model migration screen after first time. (#6954)
It got serialized incorrectly.
2025-11-19 17:17:04 -08:00
zhao-oai
74a75679d9 update execpolicy quickstart readme (#6952) 2025-11-19 16:57:27 -08:00
pakrym-oai
92e3046733 Single pass truncation (#6914) 2025-11-19 16:56:37 -08:00
zhao-oai
65c13f1ae7 execpolicy2 core integration (#6641)
This PR threads execpolicy2 into codex-core.

Activated via the `exec_policy` feature flag (on by default).

Reads and parses all .codexpolicy files in `codex_home/codex`.

Refactored the tool runtime API to integrate execpolicy logic.

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>
2025-11-19 16:50:43 -08:00
Dylan Hurd
b00a7cf40d fix(shell) fallback shells (#6948)
## Summary
Add fallbacks when user_shell_path does not resolve to a known shell
type

## Testing
- [x] Tests still pass
2025-11-19 16:41:38 -08:00
Michael Bolin
13d378f2ce chore: refactor exec-server to prepare it for standalone MCP use (#6944)
This PR reorganizes things slightly so that:

- Instead of a single multitool executable, `codex-exec-server`, we now
have two executables:
  - `codex-exec-mcp-server` to launch the MCP server
  - `codex-execve-wrapper` is the `execve(2)` wrapper to use with the
  `BASH_EXEC_WRAPPER` environment variable
- `BASH_EXEC_WRAPPER` must be a single executable: it cannot be a
command string composed of an executable with args (i.e., it no longer
adds the `escalate` subcommand, as before)
- `codex-exec-mcp-server` takes `--bash` and `--execve` as options.
Though if `--execve` is not specified, the MCP server will check the
directory containing `std::env::current_exe()` and attempt to use the
file named `codex-execve-wrapper` within it. In development, this works
out since these executables are side-by-side in the `target/debug`
folder.
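
The `--execve` defaulting rule can be sketched as below. The function name and the choice to pass the executable path in as a parameter (so the logic is testable) are assumptions. The sketch also does not check that the sibling file exists:

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: prefer an explicit `--execve` value, otherwise
/// look for `codex-execve-wrapper` next to the running executable.
fn resolve_execve_wrapper(explicit: Option<PathBuf>, current_exe: &Path) -> Option<PathBuf> {
    explicit.or_else(|| {
        current_exe
            .parent()
            .map(|dir| dir.join("codex-execve-wrapper"))
    })
}
```

A caller would typically pass the result of `std::env::current_exe()` as `current_exe`.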

With respect to testing, this also fixes an important bug in
`dummy_exec_policy()`, as I was using `ends_with()` as if it applied to
a `String`, but in this case, it is used with a `&Path`, so the
semantics are slightly different.
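
For readers unfamiliar with the distinction, here is a small sketch of how `ends_with()` differs between `str` and `Path` in the Rust standard library. The helper names are made up for illustration, and the sketch does not reproduce the actual `dummy_exec_policy()` code.

```rust
use std::path::Path;

/// `str::ends_with` matches any trailing run of characters.
fn str_suffix_match(program: &str, name: &str) -> bool {
    program.ends_with(name)
}

/// `Path::ends_with` only matches whole trailing path components.
fn path_component_match(program: &Path, name: &str) -> bool {
    program.ends_with(name)
}
```

So `"/usr/bin/mygit"` ends with `"git"` as a string, but the path `/usr/bin/mygit` does not end with the component `git`.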

Putting this all together, I was able to test this by running the
following:

```
~/code/codex/codex-rs$ npx @modelcontextprotocol/inspector \
    ./target/debug/codex-exec-mcp-server --bash ~/code/bash/bash
```

If I try to run `git status` in `/Users/mbolin/code/codex` via the
`shell` tool from the MCP server:

<img width="1589" height="1335" alt="image"
src="https://github.com/user-attachments/assets/9db6aea8-7fbc-4675-8b1f-ec446685d6c4"
/>

then I get prompted with the following elicitation, as expected:

<img width="1589" height="1335" alt="image"
src="https://github.com/user-attachments/assets/21b68fe0-494d-4562-9bad-0ddc55fc846d"
/>

Though a current limitation is that the `shell` tool defaults to a
timeout of 10s, which means I only have 10s to respond to the
elicitation. Ideally, the time spent waiting for a response from a human
should not count against the timeout for the command execution. I will
address this in a subsequent PR.

---

Note `~/code/bash/bash` was created by doing:

```
cd ~/code
git clone https://github.com/bminor/bash
cd bash
git checkout a8a1c2fac029404d3f42cd39f5a20f24b6e4fe4b
<apply the patch below>
./configure
make
```

The patch:

```
diff --git a/execute_cmd.c b/execute_cmd.c
index 070f5119..d20ad2b9 100644
--- a/execute_cmd.c
+++ b/execute_cmd.c
@@ -6129,6 +6129,19 @@ shell_execve (char *command, char **args, char **env)
   char sample[HASH_BANG_BUFSIZ];
   size_t larray;

+  char* exec_wrapper = getenv("BASH_EXEC_WRAPPER");
+  if (exec_wrapper && *exec_wrapper && !whitespace (*exec_wrapper))
+    {
+      char *orig_command = command;
+
+      larray = strvec_len (args);
+
+      memmove (args + 2, args, (++larray) * sizeof (char *));
+      args[0] = exec_wrapper;
+      args[1] = orig_command;
+      command = exec_wrapper;
+    }
+
```
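
Read as a data transformation, the lines of the patch shown above prepend the wrapper and the original command path to the argument vector. The following Rust sketch models only that rewrite; `wrap_argv` is a hypothetical name, and the whitespace test assumes Bash's `whitespace()` macro checks for a space or a tab.

```rust
/// Model of the argv rewrite: returns the program to exec and its argv.
/// `wrapper` models the value of `BASH_EXEC_WRAPPER`, if set.
fn wrap_argv(wrapper: Option<&str>, command: &str, args: &[&str]) -> (String, Vec<String>) {
    // The wrapper is used only when it is non-empty and does not start
    // with whitespace (assumed here to mean a space or a tab).
    let usable = wrapper
        .filter(|w| !w.is_empty() && !w.starts_with(' ') && !w.starts_with('\t'));
    match usable {
        Some(w) => {
            // argv becomes [wrapper, original command path, original argv...].
            let mut argv = vec![w.to_string(), command.to_string()];
            argv.extend(args.iter().map(|a| a.to_string()));
            (w.to_string(), argv)
        }
        None => (
            command.to_string(),
            args.iter().map(|a| a.to_string()).collect(),
        ),
    }
}
```

The wrapper therefore receives the resolved command path as its first argument, followed by the untouched original argument vector.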
2025-11-19 16:38:14 -08:00
Lionel Cheng
a6597a9958 Fix/correct reasoning display (#6749)
This closes #6748 by implementing fallback to
`model_family.default_reasoning_effort` in `reasoning_effort` display of
`/status` when no `model_reasoning_effort` is set in the configuration.

## common/src/config_summary.rs

- `create_config_summary_entries` now fills the "reasoning effort" entry
with the explicit `config.model_reasoning_effort` when present and falls
back to `config.model_family.default_reasoning_effort` when it is
`None`, instead of emitting the literal string `none`.
- This ensures downstream consumers such as `tui/src/status/helpers.rs`
continue to work unchanged while automatically picking up model-family
defaults when the user has not selected a reasoning effort.
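
A minimal sketch of the fallback described above, using simplified stand-in types rather than the real `Config` and `ModelFamily`. The function name `reasoning_effort_entry` is hypothetical, and printing `none` when both values are absent is an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ReasoningEffort {
    Low,
    Medium,
    High,
}

/// Simplified stand-in for the model family metadata.
struct ModelFamily {
    default_reasoning_effort: Option<ReasoningEffort>,
}

/// Simplified stand-in for the user configuration.
struct Config {
    model_reasoning_effort: Option<ReasoningEffort>,
    model_family: ModelFamily,
}

/// Prefer the explicit setting, then the model-family default.
fn reasoning_effort_entry(config: &Config) -> String {
    config
        .model_reasoning_effort
        .or(config.model_family.default_reasoning_effort)
        .map(|effort| format!("{effort:?}").to_lowercase())
        // Assumption: fall back to "none" only when neither value is set.
        .unwrap_or_else(|| "none".to_string())
}
```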

## tui/src/status/helpers.rs / core/src/model_family.rs

`ModelFamily::default_reasoning_effort` metadata is set to `medium` for
both `gpt-5*-codex` and `gpt-5` models following the default behaviour
of the API and recommendation of the codebase:
- per https://platform.openai.com/docs/api-reference/responses/create
`gpt-5` defaults to `medium` reasoning when no preset is passed
- there is no mention of the preset for `gpt-5.1-codex` in the API docs
but `medium` is the default setting for `gpt-5.1-codex` as per
`codex-rs/tui/src/chatwidget/snapshots/codex_tui__chatwidget__tests__model_reasoning_selection_popup.snap`

---------

Signed-off-by: lionelchg <lionel.cheng@hotmail.fr>
Co-authored-by: Eric Traut <etraut@openai.com>
2025-11-19 15:52:24 -08:00
Beehive Innovations
692989c277 fix(context left after review): review footer context after /review (#5610)
## Summary
- show live review token usage while `/review` runs and restore the main
session indicator afterward
- add regression coverage for the footer behavior

## Testing
- just fmt
- cargo test -p codex-tui

Fixes #5604

---------

Signed-off-by: Fahad <fahad@2doapp.com>
2025-11-19 22:50:07 +00:00
iceweasel-oai
2fde03b4a0 stop over-reporting world-writable directories (#6936)
Fix world-writable audit false positives by expanding generic
permissions with MapGenericMask and then checking only concrete write
bits. The earlier check looked for FILE_GENERIC_WRITE/generic masks
directly, which share bits with read permissions and could flag an
Everyone read ACE as writable.
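
The false positive can be illustrated with the documented Windows file access-right values. This is a portable sketch, not the sandbox code: it assumes the mask has already been expanded (the real fix uses `MapGenericMask`, which is not called here), and the exact set of concrete write bits checked by the fix is an assumption.

```rust
// File access-right constants as documented for the Windows API.
const FILE_WRITE_DATA: u32 = 0x0002;
const FILE_APPEND_DATA: u32 = 0x0004;
const FILE_WRITE_EA: u32 = 0x0010;
const FILE_WRITE_ATTRIBUTES: u32 = 0x0100;
const FILE_GENERIC_READ: u32 = 0x0012_0089;
const FILE_GENERIC_WRITE: u32 = 0x0012_0116;

/// Old-style check: any overlap with FILE_GENERIC_WRITE counts as writable.
/// READ_CONTROL and SYNCHRONIZE are shared with FILE_GENERIC_READ, so a
/// read-only ACE is flagged too.
fn naive_is_writable(mask: u32) -> bool {
    (mask & FILE_GENERIC_WRITE) != 0
}

/// Stricter check: only concrete write bits count (assumed bit set).
fn concrete_is_writable(mask: u32) -> bool {
    let write_bits =
        FILE_WRITE_DATA | FILE_APPEND_DATA | FILE_WRITE_EA | FILE_WRITE_ATTRIBUTES;
    (mask & write_bits) != 0
}
```

A read-only mask such as `FILE_GENERIC_READ` trips the naive check but not the concrete one.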
2025-11-19 13:59:17 -08:00
Michael Bolin
056c8f8279 fix: prepare ExecPolicy in exec-server for execpolicy2 cutover (#6888)
This PR introduces an extra layer of abstraction to prepare us for the
migration to execpolicy2:

- introduces a new trait, `EscalationPolicy`, whose `determine_action()`
method is responsible for producing the `EscalateAction`
- the existing `ExecPolicy` typedef is changed to return an intermediate
`ExecPolicyOutcome` instead of `EscalateAction`
- the default implementation of `EscalationPolicy`,
`McpEscalationPolicy`, composes `ExecPolicy`
- the `ExecPolicyOutcome` includes `codex_execpolicy2::Decision`, which
has a `Prompt` variant
- when `McpEscalationPolicy` gets `Decision::Prompt` back from
`ExecPolicy`, it prompts the user via an MCP elicitation and maps the
result into an `ElicitationAction`
- now that the end user can reply to an elicitation with `Decline` or
`Cancel`, we introduce a new variant, `EscalateAction::Deny`, which the
client handles by returning exit code `1` without running anything
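
The layering described above might look roughly like the following synchronous sketch. Everything here is simplified and partly hypothetical: only `Decision::Prompt`, `EscalateAction::Deny`, `EscalationPolicy`, `determine_action()`, and `McpEscalationPolicy` are named in the description; the other variants, the closure-based fields, and `client_exit_code` are placeholders, and the intermediate `ExecPolicyOutcome` is collapsed into `Decision`.

```rust
/// Policy verdict. Only `Prompt` is named in the PR description;
/// `Allow` and `Forbidden` are placeholders.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Decision {
    Allow,
    Prompt,
    Forbidden,
}

/// The user's reply to an elicitation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ElicitationAction {
    Accept,
    Decline,
    Cancel,
}

/// Only `Deny` is named in the PR description; `Run` is a placeholder.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EscalateAction {
    Run,
    Deny,
}

trait EscalationPolicy {
    fn determine_action(&self, program: &str) -> EscalateAction;
}

/// Composes a policy function with a way to ask the user.
struct McpEscalationPolicy<P, E> {
    exec_policy: P,
    elicit: E,
}

impl<P, E> EscalationPolicy for McpEscalationPolicy<P, E>
where
    P: Fn(&str) -> Decision,
    E: Fn(&str) -> ElicitationAction,
{
    fn determine_action(&self, program: &str) -> EscalateAction {
        match (self.exec_policy)(program) {
            Decision::Allow => EscalateAction::Run,
            Decision::Forbidden => EscalateAction::Deny,
            // Ask the user only when the policy cannot decide on its own.
            Decision::Prompt => match (self.elicit)(program) {
                ElicitationAction::Accept => EscalateAction::Run,
                ElicitationAction::Decline | ElicitationAction::Cancel => EscalateAction::Deny,
            },
        }
    }
}

/// A denied command exits with code 1 without running anything.
fn client_exit_code(action: EscalateAction) -> Option<i32> {
    match action {
        EscalateAction::Deny => Some(1),
        EscalateAction::Run => None,
    }
}
```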

Note the way the elicitation is created is still not quite right, but I
will fix that once we have things running end-to-end for real in a
follow-up PR.
2025-11-19 13:55:29 -08:00
Celia Chen
c2ec477d93 [core] add optional status_code to error events (#6865)
We want to surface error status codes to clients more clearly. Add an
optional `status_code` to error events (thread error, error, stream error)
so the app server can later expose the status code to clients.

In the event log:
```
< {
<   "method": "codex/event/stream_error",
<   "params": {
<     "conversationId": "019a9a32-f576-7292-9711-8e57e8063536",
<     "id": "0",
<     "msg": {
<       "message": "Reconnecting... 5/5",
<       "status_code": 401,
<       "type": "stream_error"
<     }
<   }
< }
< {
<   "method": "codex/event/error",
<   "params": {
<     "conversationId": "019a9a32-f576-7292-9711-8e57e8063536",
<     "id": "0",
<     "msg": {
<       "message": "exceeded retry limit, last status: 401 Unauthorized, request id: 9a0cb03a485067f7-SJC",
<       "status_code": 401,
<       "type": "error"
<     }
<   }
< }
```
2025-11-19 19:51:21 +00:00
Dylan Hurd
20982d5c6a fix(app-server) move windows world writable warning (#6916)
## Summary
Move the app-server warning into `process_new_conversation`.

## Testing
- [x] Tested locally
2025-11-19 11:24:49 -08:00
pakrym-oai
64ae9aa3c3 Keep gpt-5.1-codex the default (#6922) 2025-11-19 11:08:10 -08:00
zhao-oai
72af589398 storing credits (#6858)
Expand the rate-limit cache/TUI: store credit snapshots alongside the
primary and secondary windows, and render “Credits” when the backend
reports they exist (unlimited vs. rounded integer balances).
2025-11-19 10:49:35 -08:00
iceweasel-oai
b3d320433f have world_writable_warning_details accept cwd as a param (#6913)
this enables app-server to pass in the correct workspace cwd for the
current conversation
2025-11-19 10:10:03 -08:00
jif-oai
91a1d20e2d use another prompt (#6912) 2025-11-19 17:47:47 +00:00
jif-oai
87716e7cd0 NITs (#6911) 2025-11-19 17:43:51 +00:00
jif-oai
8976551f0d Fix ordering 2 (#6910) 2025-11-19 17:40:27 +00:00
jif-oai
f1d6767685 fix: ordering (#6909) 2025-11-19 17:39:07 +00:00
Ahmed Ibrahim
d62cab9a06 fix: don't truncate at new lines (#6907) 2025-11-19 17:05:48 +00:00
Ahmed Ibrahim
d5dfba2509 feat: arcticfox in the wild (#6906)
<img width="485" height="600" alt="image"
src="https://github.com/user-attachments/assets/4341740d-dd58-4a3e-b69a-33a3be0606c5"
/>

---------

Co-authored-by: jif-oai <jif@openai.com>
2025-11-19 16:31:06 +00:00
Owen Lin
1924500250 [app-server] populate thread>turns>items on thread/resume (#6848)
This PR allows clients to render historical messages when resuming a
thread via `thread/resume` by reading from the list of `EventMsg`
payloads loaded from the rollout, and then transforming them into Turns
and ThreadItems to be returned on the `Thread` object.

This is implemented by leveraging `SessionConfiguredNotification` which
returns this list of `EventMsg` objects when resuming a conversation,
and then applying a stateful `ThreadHistoryBuilder` that parses from
this EventMsg log and transforms it into Turns and ThreadItems.

Note that we only persist a subset of `EventMsg`s in a rollout as
defined in `policy.rs`, so we lose fidelity whenever we resume a thread
compared to when we streamed the thread's turns originally. However,
this behavior is at parity with the legacy API.
2025-11-19 15:58:09 +00:00
jif-oai
cfc57e14c7 nit: useless log to debug (#6898)
When you type too fast in most terminals, it gets interpreted as paste,
making this log spam
2025-11-19 12:32:53 +00:00
Dylan Hurd
15b5eb30ed fix(core) Support changing /approvals before conversation (#6836)
## Summary
Setting `/approvals` before the start of a conversation was not updating
the environment_context for a conversation. Not sure exactly when this
problem was introduced, but this should reduce model confusion
dramatically.

## Testing
- [x] Added unit test to reproduce bug, confirmed fix with update
- [x] Tested locally
2025-11-19 11:32:48 +00:00
jif-oai
3e9e1d993d chore: consolidate compaction token usage (#6894) 2025-11-19 11:26:01 +00:00
Dylan Hurd
44c747837a chore(app-server) world-writable windows notification (#6880)
## Summary
On app-server startup, detect whether the experimental sandbox is
enabled, and send a notification.

**Note**
New conversations will not respect the feature because we [ignore cli
overrides in
NewConversation](a75321a64c/codex-rs/app-server/src/codex_message_processor.rs (L1237-L1252)).
However, this should be okay, since we don't actually use config for
this, we use a [global
variable](87cce88f48/codex-rs/core/src/safety.rs (L105-L110)).
We should carefully unwind this setup at some point.


## Testing
- [ ] In progress: testing locally

---------

Co-authored-by: jif-oai <jif@openai.com>
2025-11-19 11:19:34 +00:00
jif-oai
4985a7a444 fix: parallel tool call instruction injection (#6893) 2025-11-19 11:01:57 +00:00
jif-oai
10d571f236 nit: stable (#6895) 2025-11-19 10:43:43 +00:00
jif-oai
956d3bfac6 feat: warning large commits (#6838) 2025-11-19 10:22:10 +00:00
Thibault Sottiaux
73488657cb fix label (#6892) 2025-11-19 10:11:30 +00:00
Ahmed Ibrahim
efebc62fb7 Move shell to use truncate_text (#6842)
Move shell to use the configurable `truncate_text`

---------

Co-authored-by: pakrym-oai <pakrym@openai.com>
2025-11-19 01:56:08 -08:00
pakrym-oai
75f38f16dd Run remote auto compaction (#6879) 2025-11-19 00:43:58 -08:00
Ahmed Ibrahim
0440a3f105 flaky-unified_exec_formats_large_output_summary (#6884)
2025-11-19 00:00:37 -08:00
pakrym-oai
ee0484a98c shell_command returns freeform output (#6860)
Instead of returning structured output and then re-formatting it into
freeform, return the freeform output from shell_command tool.

Keep `shell` as the default tool for GPT-5.
2025-11-18 23:38:43 -08:00
Dylan Hurd
7e0e675db4 chore(core) arcticfox (#6876)
..
2025-11-18 23:38:08 -08:00
Dylan Hurd
84458f12f6 fix(tui) ghost snapshot notifications (#6881)
## Summary
- avoid surfacing ghost snapshot warnings in the TUI when snapshot
creation fails, logging the conditions instead
- continue to capture successful ghost snapshots without changing
existing behavior

## Testing
- `cargo test -p codex-core` *(fails:
default_client::tests::test_create_client_sets_default_headers,
default_client::tests::test_get_codex_user_agent,
exec::tests::kill_child_process_group_kills_grandchildren_on_timeout)*

------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_691c02238db08322927c47b8c2d72c4c)
2025-11-18 23:23:00 -08:00
Ahmed Ibrahim
793063070b fix: typos in model picker (#6859)
2025-11-19 06:29:02 +00:00
ae
030d1d5b1c chore: update windows docs url (#6877)
- Testing: None
2025-11-19 06:24:17 +00:00
ae
7e6316d4aa feat: tweak windows sandbox strings (#6875)
New strings:
1. Approval mode picker just says "Select Approval Mode"
1. Updated "Auto" to "Agent"
1. When you select "Agent", you get "Agent mode on Windows uses an
experimental sandbox to limit network and filesystem access. [Learn
more]"
1. Updated world-writable warning to "The Windows sandbox cannot protect
writes to folders that are writable by Everyone. Consider removing write
access for Everyone from the following folders: {folders}"

---------

Co-authored-by: iceweasel-oai <iceweasel@openai.com>
2025-11-19 06:00:06 +00:00
Michael Bolin
a75321a64c fix: add more fields to ThreadStartResponse and ThreadResumeResponse (#6847)
This adds the following fields to `ThreadStartResponse` and
`ThreadResumeResponse`:

```rust
    pub model: String,
    pub model_provider: String,
    pub cwd: PathBuf,
    pub approval_policy: AskForApproval,
    pub sandbox: SandboxPolicy,
    pub reasoning_effort: Option<ReasoningEffort>,
```

This is important because these fields are optional in
`ThreadStartParams` and `ThreadResumeParams`, so the caller needs to be
able to determine what values were ultimately used to start/resume the
conversation. (Though note that any of these could be changed later
between turns in the conversation.)

Though to get this information reliably, it must be read from the
internal `SessionConfiguredEvent` that is created in response to the
start of a conversation. Because `SessionConfiguredEvent` (as defined in
`codex-rs/protocol/src/protocol.rs`) did not have all of these fields, a
number of them had to be added as part of this PR.

Because `SessionConfiguredEvent` is referenced in many tests, test
instances of `SessionConfiguredEvent` had to be updated, as well, which
is why this PR touches so many files.
2025-11-18 21:18:43 -08:00
ae
7508e4fd2d chore: update windows sandbox docs (#6872) 2025-11-18 21:02:04 -08:00
pakrym-oai
cac0a6a29d Remote compaction on by-default (#6866) 2025-11-19 02:21:57 +00:00
Celia Chen
b395dc1be6 [app-server] introduce turn/completed v2 event (#6800)
Similar to the logic in
`codex/codex-rs/exec/src/event_processor_with_jsonl_output.rs`.
Translation of v1 -> v2 events:
- `codex/event/task_complete` -> `turn/completed`
- `codex/event/turn_aborted` -> `turn/completed` with `interrupted` status
- `codex/event/error` -> `turn/completed` with `error` status

this PR also makes `items` field in `Turn` optional. For now, we only
populate it when we resume a thread, and leave it as None for all other
places until we properly rewrite core to keep track of items.
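
The v1 to v2 translation above amounts to a small lookup. The sketch below is illustrative only: `TurnStatus` and `to_turn_completed` are hypothetical names, and the `Completed` status for `task_complete` is an assumption, since the description does not name it.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TurnStatus {
    /// Assumed status for a normally finished turn.
    Completed,
    Interrupted,
    Error,
}

/// Map a v1 event method to the status carried by `turn/completed`.
/// Returns `None` for events that do not end a turn.
fn to_turn_completed(v1_method: &str) -> Option<TurnStatus> {
    match v1_method {
        "codex/event/task_complete" => Some(TurnStatus::Completed),
        "codex/event/turn_aborted" => Some(TurnStatus::Interrupted),
        "codex/event/error" => Some(TurnStatus::Error),
        _ => None,
    }
}
```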

tested using the codex app server client. example new event:
```
< {
<   "method": "turn/completed",
<   "params": {
<     "turn": {
<       "id": "0",
<       "items": [],
<       "status": "interrupted"
<     }
<   }
< }
```
2025-11-19 01:55:24 +00:00
zhao-oai
4288091f63 update credit status details (#6862) 2025-11-19 01:40:22 +00:00
Jeremy Rose
526eb3ff82 tui: add branch to 'codex resume', filter by cwd (#6232)
By default, show only sessions whose cwd matches the current cwd.
`--all` shows sessions from all cwds. Also, show the branch name from
the rollout metadata.

<img width="1091" height="638" alt="Screenshot 2025-11-04 at 3 30 47 PM"
src="https://github.com/user-attachments/assets/aae90308-6115-455f-aff7-22da5f1d9681"
/>
2025-11-19 00:47:37 +00:00
iceweasel-oai
b952bd2649 smoketest for browser vuln, rough draft of Windows security doc (#6822) 2025-11-18 16:43:34 -08:00
iceweasel-oai
cf57320b9f windows sandbox: support multiple workspace roots (#6854)
The Windows sandbox did not previously support multiple workspace roots
via config. Now it does.
2025-11-18 16:35:00 -08:00
zhao-oai
4fb714fb46 updating codex backend models (#6855) 2025-11-18 16:25:50 -08:00
Jeremy Rose
c1391b9f94 exec-server (#6630) 2025-11-19 00:20:19 +00:00
Eric Traut
9275e93364 Fix tests so they don't emit an extraneous config.toml in the source tree (#6853)
This PR fixes the `release_event_does_not_change_selection` test so it
doesn't cause an extra `config.toml` to be emitted in the sources when
running the tests locally. Prior to this fix, I needed to delete this
file every time I ran the tests to prevent it from showing up as an
uncommitted source file.
2025-11-18 15:27:45 -08:00
Owen Lin
b3a824ae3c [app-server-test-client] feat: auto approve command (#6852) 2025-11-18 15:25:02 -08:00
Eric Traut
ab30453dee Improved runtime of generated_ts_has_no_optional_nullable_fields test (#6851)
The `generated_ts_has_no_optional_nullable_fields` test was occasionally
failing on slow CI nodes because of a timeout. This change reduces the
work done by the test. It adds some "options" for the `generate_ts`
function so it can skip work that's not needed for the test.
2025-11-18 15:24:32 -08:00
jif-oai
c56d0c159b fix: local compaction (#6844) 2025-11-18 22:18:10 +00:00
simister
0bf857bc91 Fix typo in config.md for MCP server (#6845)
2025-11-18 14:06:13 -08:00
Anton Panasenko
f7a921039c [codex][otel] support mtls configuration (#6228)
fix for https://github.com/openai/codex/issues/6153

supports mTLS configuration and includes TLS features in the library
build to enable secure HTTPS connections with custom root certificates.

grpc:
https://docs.rs/tonic/0.13.1/src/tonic/transport/channel/endpoint.rs.html#63
https:
https://docs.rs/reqwest/0.12.23/src/reqwest/async_impl/client.rs.html#516
2025-11-18 14:01:01 -08:00
jif-oai
8ddae8cde3 feat: review in app server (#6613) 2025-11-18 21:58:54 +00:00
Dylan Hurd
29ca89c414 chore(config) enable shell_command (#6843)
## Summary
Enables shell_command as default for `gpt-5*` and `codex-*` models.

## Testing
- [x] Updated unit tests
2025-11-18 12:46:02 -08:00
iceweasel-oai
4bada5a84d Prompt to turn on windows sandbox when auto mode selected. (#6618)
- stop prompting users to install WSL 
- prompt users to turn on Windows sandbox when auto mode requested.

<img width="1660" height="195" alt="Screenshot 2025-11-17 110612"
src="https://github.com/user-attachments/assets/c67fc239-a227-417e-94bb-599a8ed8f11e"
/>
<img width="1684" height="168" alt="Screenshot 2025-11-17 110637"
src="https://github.com/user-attachments/assets/d18c3370-830d-4971-8746-04757ae2f709"
/>
<img width="1655" height="293" alt="Screenshot 2025-11-17 110719"
src="https://github.com/user-attachments/assets/d21f6ce9-c23e-4842-baf6-8938b77c16db"
/>
2025-11-18 11:38:18 -08:00
Ahmed Ibrahim
3de8790714 Add the utility to truncate by tokens (#6746)
- This PR lays the groundwork for truncating by tokens. This path will
initially be used by unified exec and the context manager (mainly
responsible for MCP calls).
- Expose a new config option, `calls_output_max_tokens`.
- Use `tokens` as the main budget unit, but truncate based on the model
family by introducing `TruncationPolicy`.
- Introduce `truncate_text` as a router for truncation based on the
mode.

In upcoming PRs:
- Remove `truncate_with_line_bytes_budget`.
- Add the ability for the model to override the token budget.
2025-11-18 11:36:23 -08:00
Alejandro Peña
b035c604b0 Update faq.md section on supported models (#6832)
Update faq.md to recommend usage of GPT-5.1 Codex, the latest Codex
model from OpenAI.
2025-11-18 09:38:45 -08:00
zhao-oai
e9e644a119 fixing localshell tool calls (#6823)
- Local-shell tool responses were always tagged as
`ExecCommandSource::UserShell` because handler would call
`run_exec_like` with `is_user_shell_cmd` set to true.
- Treat `ToolPayload::LocalShell` the same as other model generated
shell tool calls by deleting `is_user_shell_cmd` from `run_exec_like`
(since actual user shell commands follow a separate code path)
2025-11-18 17:28:26 +00:00
jif-oai
f5d9939cda feat: enable parallel tool calls (#6796) 2025-11-18 17:10:14 +00:00
jif-oai
838531d3e4 feat: remote compaction (#6795)
Co-authored-by: pakrym-oai <pakrym@openai.com>
2025-11-18 16:51:16 +00:00
jif-oai
0eb2e6f9ee nit: app server (#6830) 2025-11-18 16:34:13 +00:00
jif-oai
c20df79a38 nit: mark ghost commit as stable (#6833) 2025-11-18 16:05:49 +00:00
jif-oai
fc55fd7a81 feat: git branch tooling (#6831) 2025-11-18 15:26:09 +00:00
Lael
f3d4e210d8 🐛 fix(rmcp-client): refresh OAuth tokens using expires_at (#6574)
## Summary
- persist OAuth credential expiry timestamps and rehydrate `expires_in`
- proactively refresh rmcp OAuth tokens when `expires_at` is near, then
persist
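
A minimal sketch of the expiry logic summarized above, under stated assumptions: `StoredToken`, `should_refresh`, `rehydrate_expires_in`, and the `skew` window are hypothetical names chosen for illustration, not the rmcp-client API.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical persisted credential with an absolute expiry timestamp.
struct StoredToken {
    expires_at: Option<SystemTime>,
}

/// Refresh proactively once `now` is within `skew` of the expiry.
/// Tokens without a recorded expiry are left alone in this sketch.
fn should_refresh(token: &StoredToken, now: SystemTime, skew: Duration) -> bool {
    match token.expires_at {
        Some(expires_at) => now + skew >= expires_at,
        None => false,
    }
}

/// Recompute a relative `expires_in` from the persisted absolute
/// timestamp, clamping to zero when the token has already expired.
fn rehydrate_expires_in(expires_at: SystemTime, now: SystemTime) -> Duration {
    expires_at.duration_since(now).unwrap_or(Duration::ZERO)
}
```

Persisting the absolute timestamp matters because a relative `expires_in` is only meaningful at the moment the token was issued.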

## Testing
- just fmt
- just fix -p codex-rmcp-client
- cargo test -p codex-rmcp-client

Fixes #6572
2025-11-18 02:16:58 -05:00
Dylan Hurd
28ebe1c97a fix(windows) shell_command on windows, minor parsing (#6811)
## Summary
Enables shell_command for windows users, and starts adding some basic
command parsing here, to at least remove powershell prefixes. We'll
follow this up with command parsing but I wanted to land this change
separately with some basic UX.

**NOTE**: This implementation parses bash and powershell on both
platforms. In theory this is possible, since you can use git bash on
windows or powershell on linux. In practice, this may not be worth the
complexity of supporting, so I don't feel strongly about the current
approach vs. platform-specific branching.

## Testing
- [x] Added a bunch of tests 
- [x] Ran on both windows and os x
2025-11-17 22:23:53 -08:00
307 changed files with 15843 additions and 4355 deletions


@@ -46,7 +46,6 @@ jobs:
with:
openai-api-key: ${{ secrets.CODEX_OPENAI_API_KEY }}
allow-users: "*"
model: gpt-5.1
prompt: |
You are an assistant that triages new GitHub issues by identifying potential duplicates.


@@ -371,8 +371,20 @@ jobs:
           path: |
             codex-rs/dist/${{ matrix.target }}/*
+  shell-tool-mcp:
+    name: shell-tool-mcp
+    needs: tag-check
+    uses: ./.github/workflows/shell-tool-mcp.yml
+    with:
+      release-tag: ${{ github.ref_name }}
+      # We are not ready to publish yet.
+      publish: false
+    secrets: inherit
   release:
-    needs: build
+    needs:
+      - build
+      - shell-tool-mcp
     name: release
     runs-on: ubuntu-latest
     permissions:
@@ -395,6 +407,14 @@ jobs:
       - name: List
         run: ls -R dist/
+      # This is a temporary fix: we should modify shell-tool-mcp.yml so these
+      # files do not end up in dist/ in the first place.
+      - name: Delete entries from dist/ that should not go in the release
+        run: |
+          rm -rf dist/shell-tool-mcp*
+          ls -R dist/
       - name: Define release name
         id: release_name
         run: |

48
.github/workflows/shell-tool-mcp-ci.yml vendored Normal file

@@ -0,0 +1,48 @@
name: shell-tool-mcp CI

on:
  push:
    paths:
      - "shell-tool-mcp/**"
      - ".github/workflows/shell-tool-mcp-ci.yml"
      - "pnpm-lock.yaml"
      - "pnpm-workspace.yaml"
  pull_request:
    paths:
      - "shell-tool-mcp/**"
      - ".github/workflows/shell-tool-mcp-ci.yml"
      - "pnpm-lock.yaml"
      - "pnpm-workspace.yaml"

env:
  NODE_VERSION: 22

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v5

      - name: Setup pnpm
        uses: pnpm/action-setup@v4
        with:
          run_install: false

      - name: Setup Node.js
        uses: actions/setup-node@v5
        with:
          node-version: ${{ env.NODE_VERSION }}
          cache: "pnpm"

      - name: Install dependencies
        run: pnpm install --frozen-lockfile

      - name: Format check
        run: pnpm --filter @openai/codex-shell-tool-mcp run format

      - name: Run tests
        run: pnpm --filter @openai/codex-shell-tool-mcp test

      - name: Build
        run: pnpm --filter @openai/codex-shell-tool-mcp run build

412
.github/workflows/shell-tool-mcp.yml vendored Normal file

@@ -0,0 +1,412 @@
name: shell-tool-mcp

on:
  workflow_call:
    inputs:
      release-version:
        description: Version to publish (x.y.z or x.y.z-alpha.N). Defaults to GITHUB_REF_NAME when it starts with rust-v.
        required: false
        type: string
      release-tag:
        description: Tag name to use when downloading release artifacts (defaults to rust-v<version>).
        required: false
        type: string
      publish:
        description: Whether to publish to npm when the version is releasable.
        required: false
        default: true
        type: boolean

env:
  NODE_VERSION: 22
jobs:
  metadata:
    runs-on: ubuntu-latest
    outputs:
      version: ${{ steps.compute.outputs.version }}
      release_tag: ${{ steps.compute.outputs.release_tag }}
      should_publish: ${{ steps.compute.outputs.should_publish }}
      npm_tag: ${{ steps.compute.outputs.npm_tag }}
    steps:
      - name: Compute version and tags
        id: compute
        run: |
          set -euo pipefail
          version="${{ inputs.release-version }}"
          release_tag="${{ inputs.release-tag }}"
          if [[ -z "$version" ]]; then
            if [[ -n "$release_tag" && "$release_tag" =~ ^rust-v.+ ]]; then
              version="${release_tag#rust-v}"
            elif [[ "${GITHUB_REF_NAME:-}" =~ ^rust-v.+ ]]; then
              version="${GITHUB_REF_NAME#rust-v}"
              release_tag="${GITHUB_REF_NAME}"
            else
              echo "release-version is required when GITHUB_REF_NAME is not a rust-v tag."
              exit 1
            fi
          fi
          if [[ -z "$release_tag" ]]; then
            release_tag="rust-v${version}"
          fi
          npm_tag=""
          should_publish="false"
          if [[ "$version" =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
            should_publish="true"
          elif [[ "$version" =~ ^[0-9]+\.[0-9]+\.[0-9]+-alpha\.[0-9]+$ ]]; then
            should_publish="true"
            npm_tag="alpha"
          fi
          echo "version=${version}" >> "$GITHUB_OUTPUT"
          echo "release_tag=${release_tag}" >> "$GITHUB_OUTPUT"
          echo "npm_tag=${npm_tag}" >> "$GITHUB_OUTPUT"
          echo "should_publish=${should_publish}" >> "$GITHUB_OUTPUT"

  rust-binaries:
    name: Build Rust - ${{ matrix.target }}
    needs: metadata
    runs-on: ${{ matrix.runner }}
    timeout-minutes: 30
    defaults:
      run:
        working-directory: codex-rs
    strategy:
      fail-fast: false
      matrix:
        include:
          - runner: macos-15-xlarge
            target: aarch64-apple-darwin
          - runner: macos-15-xlarge
            target: x86_64-apple-darwin
          - runner: ubuntu-24.04
            target: x86_64-unknown-linux-musl
            install_musl: true
          - runner: ubuntu-24.04-arm
            target: aarch64-unknown-linux-musl
            install_musl: true
    steps:
      - name: Checkout repository
        uses: actions/checkout@v5

      - uses: dtolnay/rust-toolchain@1.90
        with:
          targets: ${{ matrix.target }}

      - if: ${{ matrix.install_musl }}
        name: Install musl build dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y musl-tools pkg-config

      - name: Build exec server binaries
        run: cargo build --release --target ${{ matrix.target }} --bin codex-exec-mcp-server --bin codex-execve-wrapper

      - name: Stage exec server binaries
        run: |
          dest="${GITHUB_WORKSPACE}/artifacts/vendor/${{ matrix.target }}"
          mkdir -p "$dest"
          cp "target/${{ matrix.target }}/release/codex-exec-mcp-server" "$dest/"
          cp "target/${{ matrix.target }}/release/codex-execve-wrapper" "$dest/"

      - uses: actions/upload-artifact@v4
        with:
          name: shell-tool-mcp-rust-${{ matrix.target }}
          path: artifacts/**
          if-no-files-found: error

  bash-linux:
    name: Build Bash (Linux) - ${{ matrix.variant }} - ${{ matrix.target }}
    needs: metadata
    runs-on: ${{ matrix.runner }}
    timeout-minutes: 30
    container:
      image: ${{ matrix.image }}
    strategy:
      fail-fast: false
      matrix:
        include:
          - runner: ubuntu-24.04
            target: x86_64-unknown-linux-musl
            variant: ubuntu-24.04
            image: ubuntu:24.04
          - runner: ubuntu-24.04
            target: x86_64-unknown-linux-musl
            variant: ubuntu-22.04
            image: ubuntu:22.04
          - runner: ubuntu-24.04
            target: x86_64-unknown-linux-musl
            variant: ubuntu-20.04
            image: ubuntu:20.04
          - runner: ubuntu-24.04
            target: x86_64-unknown-linux-musl
            variant: debian-12
            image: debian:12
          - runner: ubuntu-24.04
            target: x86_64-unknown-linux-musl
            variant: debian-11
            image: debian:11
          - runner: ubuntu-24.04
            target: x86_64-unknown-linux-musl
            variant: centos-9
            image: quay.io/centos/centos:stream9
          - runner: ubuntu-24.04-arm
            target: aarch64-unknown-linux-musl
            variant: ubuntu-24.04
            image: arm64v8/ubuntu:24.04
          - runner: ubuntu-24.04-arm
            target: aarch64-unknown-linux-musl
            variant: ubuntu-22.04
            image: arm64v8/ubuntu:22.04
          - runner: ubuntu-24.04-arm
            target: aarch64-unknown-linux-musl
            variant: ubuntu-20.04
            image: arm64v8/ubuntu:20.04
          - runner: ubuntu-24.04-arm
            target: aarch64-unknown-linux-musl
            variant: debian-12
            image: arm64v8/debian:12
          - runner: ubuntu-24.04-arm
            target: aarch64-unknown-linux-musl
            variant: debian-11
            image: arm64v8/debian:11
          - runner: ubuntu-24.04-arm
            target: aarch64-unknown-linux-musl
            variant: centos-9
            image: quay.io/centos/centos:stream9
    steps:
      - name: Install build prerequisites
        shell: bash
        run: |
          set -euo pipefail
          if command -v apt-get >/dev/null 2>&1; then
            apt-get update
            DEBIAN_FRONTEND=noninteractive apt-get install -y git build-essential bison autoconf gettext
          elif command -v dnf >/dev/null 2>&1; then
            dnf install -y git gcc gcc-c++ make bison autoconf gettext
          elif command -v yum >/dev/null 2>&1; then
            yum install -y git gcc gcc-c++ make bison autoconf gettext
          else
            echo "Unsupported package manager in container"
            exit 1
          fi

      - name: Checkout repository
        uses: actions/checkout@v5

      - name: Build patched Bash
        shell: bash
        run: |
          set -euo pipefail
          git clone --depth 1 https://github.com/bminor/bash /tmp/bash
          cd /tmp/bash
          git fetch --depth 1 origin a8a1c2fac029404d3f42cd39f5a20f24b6e4fe4b
          git checkout a8a1c2fac029404d3f42cd39f5a20f24b6e4fe4b
          git apply "${GITHUB_WORKSPACE}/shell-tool-mcp/patches/bash-exec-wrapper.patch"
          ./configure --without-bash-malloc
          cores="$(command -v nproc >/dev/null 2>&1 && nproc || getconf _NPROCESSORS_ONLN)"
          make -j"${cores}"
          dest="${GITHUB_WORKSPACE}/artifacts/vendor/${{ matrix.target }}/bash/${{ matrix.variant }}"
          mkdir -p "$dest"
          cp bash "$dest/bash"

      - uses: actions/upload-artifact@v4
        with:
          name: shell-tool-mcp-bash-${{ matrix.target }}-${{ matrix.variant }}
          path: artifacts/**
          if-no-files-found: error

  bash-darwin:
    name: Build Bash (macOS) - ${{ matrix.variant }} - ${{ matrix.target }}
    needs: metadata
    runs-on: ${{ matrix.runner }}
    timeout-minutes: 30
    strategy:
      fail-fast: false
      matrix:
        include:
          - runner: macos-15-xlarge
            target: aarch64-apple-darwin
            variant: macos-15
          - runner: macos-14
            target: aarch64-apple-darwin
            variant: macos-14
          - runner: macos-13
            target: x86_64-apple-darwin
            variant: macos-13
    steps:
      - name: Checkout repository
        uses: actions/checkout@v5

      - name: Build patched Bash
        shell: bash
        run: |
          set -euo pipefail
          git clone --depth 1 https://github.com/bminor/bash /tmp/bash
          cd /tmp/bash
          git fetch --depth 1 origin a8a1c2fac029404d3f42cd39f5a20f24b6e4fe4b
          git checkout a8a1c2fac029404d3f42cd39f5a20f24b6e4fe4b
          git apply "${GITHUB_WORKSPACE}/shell-tool-mcp/patches/bash-exec-wrapper.patch"
          ./configure --without-bash-malloc
          cores="$(getconf _NPROCESSORS_ONLN)"
          make -j"${cores}"
          dest="${GITHUB_WORKSPACE}/artifacts/vendor/${{ matrix.target }}/bash/${{ matrix.variant }}"
          mkdir -p "$dest"
          cp bash "$dest/bash"

      - uses: actions/upload-artifact@v4
        with:
          name: shell-tool-mcp-bash-${{ matrix.target }}-${{ matrix.variant }}
          path: artifacts/**
          if-no-files-found: error

  package:
    name: Package npm module
    needs:
      - metadata
      - rust-binaries
      - bash-linux
      - bash-darwin
    runs-on: ubuntu-latest
    env:
      PACKAGE_VERSION: ${{ needs.metadata.outputs.version }}
    steps:
      - name: Checkout repository
        uses: actions/checkout@v5

      - name: Setup pnpm
        uses: pnpm/action-setup@v4
        with:
          version: 10.8.1
          run_install: false

      - name: Setup Node.js
        uses: actions/setup-node@v5
        with:
          node-version: ${{ env.NODE_VERSION }}

      - name: Install JavaScript dependencies
        run: pnpm install --frozen-lockfile

      - name: Build (shell-tool-mcp)
        run: pnpm --filter @openai/codex-shell-tool-mcp run build

      - name: Download build artifacts
        uses: actions/download-artifact@v4
        with:
          path: artifacts

      - name: Assemble staging directory
        id: staging
        shell: bash
        run: |
          set -euo pipefail
          staging="${STAGING_DIR}"
          mkdir -p "$staging" "$staging/vendor"
          cp shell-tool-mcp/README.md "$staging/"
          cp shell-tool-mcp/package.json "$staging/"
          cp -R shell-tool-mcp/bin "$staging/"
          found_vendor="false"
          shopt -s nullglob
          for vendor_dir in artifacts/*/vendor; do
            rsync -av "$vendor_dir/" "$staging/vendor/"
            found_vendor="true"
          done
          if [[ "$found_vendor" == "false" ]]; then
            echo "No vendor payloads were downloaded."
            exit 1
          fi
          node - <<'NODE'
          import fs from "node:fs";
          import path from "node:path";
          const stagingDir = process.env.STAGING_DIR;
          const version = process.env.PACKAGE_VERSION;
          const pkgPath = path.join(stagingDir, "package.json");
          const pkg = JSON.parse(fs.readFileSync(pkgPath, "utf8"));
          pkg.version = version;
          fs.writeFileSync(pkgPath, JSON.stringify(pkg, null, 2) + "\n");
          NODE
          echo "dir=$staging" >> "$GITHUB_OUTPUT"
        env:
          STAGING_DIR: ${{ runner.temp }}/shell-tool-mcp

      - name: Ensure binaries are executable
        run: |
          set -euo pipefail
          staging="${{ steps.staging.outputs.dir }}"
          chmod +x \
            "$staging"/vendor/*/codex-exec-mcp-server \
            "$staging"/vendor/*/codex-execve-wrapper \
            "$staging"/vendor/*/bash/*/bash

      - name: Create npm tarball
        shell: bash
        run: |
          set -euo pipefail
          mkdir -p dist/npm
          staging="${{ steps.staging.outputs.dir }}"
          pack_info=$(cd "$staging" && npm pack --ignore-scripts --json --pack-destination "${GITHUB_WORKSPACE}/dist/npm")
          filename=$(PACK_INFO="$pack_info" node -e 'const data = JSON.parse(process.env.PACK_INFO); console.log(data[0].filename);')
          mv "dist/npm/${filename}" "dist/npm/codex-shell-tool-mcp-npm-${PACKAGE_VERSION}.tgz"
- uses: actions/upload-artifact@v4
with:
name: codex-shell-tool-mcp-npm
path: dist/npm/codex-shell-tool-mcp-npm-${{ env.PACKAGE_VERSION }}.tgz
if-no-files-found: error
publish:
name: Publish npm package
needs:
- metadata
- package
if: ${{ inputs.publish && needs.metadata.outputs.should_publish == 'true' }}
runs-on: ubuntu-latest
permissions:
id-token: write
contents: read
steps:
- name: Setup pnpm
uses: pnpm/action-setup@v4
with:
version: 10.8.1
run_install: false
- name: Setup Node.js
uses: actions/setup-node@v5
with:
node-version: ${{ env.NODE_VERSION }}
registry-url: https://registry.npmjs.org
scope: "@openai"
- name: Update npm
run: npm install -g npm@latest
- name: Download npm tarball
uses: actions/download-artifact@v4
with:
name: codex-shell-tool-mcp-npm
path: dist/npm
- name: Publish to npm
env:
NPM_TAG: ${{ needs.metadata.outputs.npm_tag }}
VERSION: ${{ needs.metadata.outputs.version }}
shell: bash
run: |
set -euo pipefail
tag_args=()
if [[ -n "${NPM_TAG}" ]]; then
tag_args+=(--tag "${NPM_TAG}")
fi
npm publish "dist/npm/codex-shell-tool-mcp-npm-${VERSION}.tgz" "${tag_args[@]}"


@@ -69,7 +69,38 @@ Codex can access MCP servers. To configure them, refer to the [config docs](./do
Codex CLI supports a rich set of configuration options, with preferences stored in `~/.codex/config.toml`. For full configuration options, see [Configuration](./docs/config.md).
---
### Execpolicy Quickstart
Codex can enforce your own rules-based execution policy before it runs shell commands.
1. Create a policy directory: `mkdir -p ~/.codex/policy`.
2. Create one or more `.codexpolicy` files in that directory. Codex automatically loads every `.codexpolicy` file it finds there at startup.
3. Write `prefix_rule` entries to describe the commands you want to allow, prompt, or block:
```starlark
prefix_rule(
pattern = ["git", ["push", "fetch"]],
decision = "prompt", # allow | prompt | forbidden
match = [["git", "push", "origin", "main"]], # examples that must match
not_match = [["git", "status"]], # examples that must not match
)
```
- `pattern` is a list of shell tokens, evaluated from left to right; wrap tokens in a nested list to express alternatives (e.g., match both `push` and `fetch`).
- `decision` sets the severity; Codex picks the strictest decision when multiple rules match (forbidden > prompt > allow).
- `match` and `not_match` act as (optional) unit tests. Codex validates them when it loads your policy, so you get feedback if an example has unexpected behavior.
In this example rule, if Codex wants to run commands with the prefix `git push` or `git fetch`, it will first ask for user approval.
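As a rough illustration of the matching behavior described above, the following Rust sketch models prefix rules with nested alternatives and the "strictest decision wins" ordering. All types and function names here are hypothetical and are not the `codex-execpolicy` API. The sketch also assumes that tokens after the matched prefix are ignored and that a command matching no rule yields no decision, which the text above does not specify.

```rust
// Hypothetical sketch: these types are illustrative, not the codex-execpolicy API.

/// Decisions ordered from least to most strict, so `max` picks the strictest.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Decision {
    Allow,
    Prompt,
    Forbidden,
}

/// One pattern position: the tokens accepted there (a single entry for a literal).
type Alternatives = Vec<&'static str>;

struct PrefixRule {
    pattern: Vec<Alternatives>,
    decision: Decision,
}

impl PrefixRule {
    /// A rule matches when each pattern position accepts the command token at
    /// the same index. Tokens after the prefix are ignored (an assumption).
    fn matches(&self, command: &[&str]) -> bool {
        command.len() >= self.pattern.len()
            && self
                .pattern
                .iter()
                .zip(command)
                .all(|(alternatives, token)| alternatives.iter().any(|alt| alt == token))
    }
}

/// Returns the strictest decision among all matching rules,
/// or `None` when no rule matches (an assumption of this sketch).
fn evaluate(rules: &[PrefixRule], command: &[&str]) -> Option<Decision> {
    rules
        .iter()
        .filter(|rule| rule.matches(command))
        .map(|rule| rule.decision)
        .max()
}
```

Deriving `Ord` on the enum keeps the precedence (forbidden > prompt > allow) in one place: the variant order is the strictness order.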
Use the `codex execpolicy check` subcommand to preview decisions before you save a rule (see the [`codex-execpolicy` README](./codex-rs/execpolicy/README.md) for syntax details):
```shell
codex execpolicy check --policy ~/.codex/policy/default.codexpolicy git push origin main
```
Pass multiple `--policy` flags to test how several files combine, and use `--pretty` for formatted JSON output. See the [`codex-rs/execpolicy` README](./codex-rs/execpolicy/README.md) for a more detailed walkthrough of the available syntax.
Note: `execpolicy` commands are still in preview. The API may have breaking changes in the future.
---
### Docs & FAQ


@@ -7,3 +7,7 @@ slow-timeout = { period = "15s", terminate-after = 2 }
# Do not add new tests here
filter = 'test(rmcp_client) | test(humanlike_typing_1000_chars_appears_live_no_placeholder)'
slow-timeout = { period = "1m", terminate-after = 4 }
[[profile.default.overrides]]
filter = 'test(approval_matrix_covers_all_modes)'
slow-timeout = { period = "30s", terminate-after = 2 }

132
codex-rs/Cargo.lock generated

@@ -187,8 +187,10 @@ dependencies = [
"codex-app-server-protocol",
"codex-core",
"codex-protocol",
"core_test_support",
"serde",
"serde_json",
"shlex",
"tokio",
"uuid",
"wiremock",
@@ -260,7 +262,7 @@ dependencies = [
"memchr",
"proc-macro2",
"quote",
"rustc-hash 2.1.1",
"rustc-hash",
"serde",
"serde_derive",
"syn 2.0.104",
@@ -726,6 +728,17 @@ version = "0.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "613afe47fcd5fac7ccf1db93babcb082c5994d996f20b8b159f2ad1658eb5724"
[[package]]
name = "chardetng"
version = "0.1.17"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "14b8f0b65b7b08ae3c8187e8d77174de20cb6777864c6b832d8ad365999cf1ea"
dependencies = [
"cfg-if",
"encoding_rs",
"memchr",
]
[[package]]
name = "chrono"
version = "0.4.42"
@@ -857,6 +870,7 @@ dependencies = [
"serde",
"serde_json",
"serial_test",
"shlex",
"tempfile",
"tokio",
"toml",
@@ -879,6 +893,7 @@ dependencies = [
"serde",
"serde_json",
"strum_macros 0.27.2",
"thiserror 2.0.17",
"ts-rs",
"uuid",
]
@@ -989,6 +1004,7 @@ dependencies = [
"codex-common",
"codex-core",
"codex-exec",
"codex-execpolicy",
"codex-login",
"codex-mcp-server",
"codex-process-hardening",
@@ -1080,11 +1096,13 @@ dependencies = [
"async-trait",
"base64",
"bytes",
"chardetng",
"chrono",
"codex-app-server-protocol",
"codex-apply-patch",
"codex-arg0",
"codex-async-utils",
"codex-execpolicy",
"codex-file-search",
"codex-git",
"codex-keyring-store",
@@ -1094,13 +1112,13 @@ dependencies = [
"codex-utils-pty",
"codex-utils-readiness",
"codex-utils-string",
"codex-utils-tokenizer",
"codex-windows-sandbox",
"core-foundation 0.9.4",
"core_test_support",
"ctor 0.5.0",
"dirs",
"dunce",
"encoding_rs",
"env-flags",
"escargot",
"eventsource-stream",
@@ -1182,9 +1200,47 @@ dependencies = [
"wiremock",
]
[[package]]
name = "codex-exec-server"
version = "0.0.0"
dependencies = [
"anyhow",
"async-trait",
"clap",
"codex-core",
"libc",
"path-absolutize",
"pretty_assertions",
"rmcp",
"serde",
"serde_json",
"shlex",
"socket2 0.6.0",
"tempfile",
"tokio",
"tokio-util",
"tracing",
"tracing-subscriber",
]
[[package]]
name = "codex-execpolicy"
version = "0.0.0"
dependencies = [
"anyhow",
"clap",
"multimap",
"pretty_assertions",
"serde",
"serde_json",
"shlex",
"starlark",
"thiserror 2.0.17",
]
[[package]]
name = "codex-execpolicy-legacy"
version = "0.0.0"
dependencies = [
"allocative",
"anyhow",
@@ -1202,21 +1258,6 @@ dependencies = [
"tempfile",
]
[[package]]
name = "codex-execpolicy2"
version = "0.0.0"
dependencies = [
"anyhow",
"clap",
"multimap",
"pretty_assertions",
"serde",
"serde_json",
"shlex",
"starlark",
"thiserror 2.0.17",
]
[[package]]
name = "codex-feedback"
version = "0.0.0"
@@ -1366,6 +1407,7 @@ dependencies = [
"codex-app-server-protocol",
"codex-protocol",
"eventsource-stream",
"http",
"opentelemetry",
"opentelemetry-otlp",
"opentelemetry-semantic-conventions",
@@ -1399,6 +1441,7 @@ dependencies = [
"icu_provider",
"mcp-types",
"mime_guess",
"pretty_assertions",
"schemars 0.8.22",
"serde",
"serde_json",
@@ -1589,23 +1632,12 @@ dependencies = [
name = "codex-utils-string"
version = "0.0.0"
[[package]]
name = "codex-utils-tokenizer"
version = "0.0.0"
dependencies = [
"anyhow",
"codex-utils-cache",
"pretty_assertions",
"thiserror 2.0.17",
"tiktoken-rs",
"tokio",
]
[[package]]
name = "codex-windows-sandbox"
version = "0.1.0"
dependencies = [
"anyhow",
"codex-protocol",
"dirs-next",
"dunce",
"rand 0.8.5",
@@ -1747,6 +1779,7 @@ dependencies = [
"notify",
"regex-lite",
"serde_json",
"shlex",
"tempfile",
"tokio",
"walkdir",
@@ -2421,17 +2454,6 @@ dependencies = [
"once_cell",
]
[[package]]
name = "fancy-regex"
version = "0.13.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "531e46835a22af56d1e3b66f04844bed63158bc094a628bec1d321d9b4c44bf2"
dependencies = [
"bit-set",
"regex-automata",
"regex-syntax 0.8.5",
]
[[package]]
name = "fastrand"
version = "2.3.0"
@@ -3719,11 +3741,13 @@ dependencies = [
"assert_cmd",
"codex-core",
"codex-mcp-server",
"core_test_support",
"mcp-types",
"os_info",
"pretty_assertions",
"serde",
"serde_json",
"shlex",
"tokio",
"wiremock",
]
@@ -4756,7 +4780,7 @@ dependencies = [
"pin-project-lite",
"quinn-proto",
"quinn-udp",
"rustc-hash 2.1.1",
"rustc-hash",
"rustls",
"socket2 0.6.0",
"thiserror 2.0.17",
@@ -4776,7 +4800,7 @@ dependencies = [
"lru-slab",
"rand 0.9.2",
"ring",
"rustc-hash 2.1.1",
"rustc-hash",
"rustls",
"rustls-pki-types",
"slab",
@@ -5121,12 +5145,6 @@ version = "0.1.25"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "989e6739f80c4ad5b13e0fd7fe89531180375b18520cc8c82080e4dc4035b84f"
[[package]]
name = "rustc-hash"
version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "08d43f7aa6b08d49f382cde6a7982047c3426db949b1424bc4b7ec9ae12c6ce2"
[[package]]
name = "rustc-hash"
version = "2.1.1"
@@ -5174,6 +5192,7 @@ version = "0.23.29"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2491382039b29b9b11ff08b76ff6c97cf287671dbb74f0be44bda389fffe9bd1"
dependencies = [
"log",
"once_cell",
"ring",
"rustls-pki-types",
@@ -6346,21 +6365,6 @@ dependencies = [
"zune-jpeg",
]
[[package]]
name = "tiktoken-rs"
version = "0.9.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3a19830747d9034cd9da43a60eaa8e552dfda7712424aebf187b7a60126bae0d"
dependencies = [
"anyhow",
"base64",
"bstr",
"fancy-regex",
"lazy_static",
"regex",
"rustc-hash 1.1.0",
]
[[package]]
name = "time"
version = "0.3.44"
@@ -6601,8 +6605,10 @@ dependencies = [
"percent-encoding",
"pin-project",
"prost",
"rustls-native-certs",
"socket2 0.5.10",
"tokio",
"tokio-rustls",
"tokio-stream",
"tower",
"tower-layer",


@@ -16,8 +16,9 @@ members = [
"common",
"core",
"exec",
"exec-server",
"execpolicy",
"execpolicy2",
"execpolicy-legacy",
"keyring-store",
"file-search",
"linux-sandbox",
@@ -40,7 +41,6 @@ members = [
"utils/pty",
"utils/readiness",
"utils/string",
"utils/tokenizer",
]
resolver = "2"
@@ -66,6 +66,7 @@ codex-chatgpt = { path = "chatgpt" }
codex-common = { path = "common" }
codex-core = { path = "core" }
codex-exec = { path = "exec" }
codex-execpolicy = { path = "execpolicy" }
codex-feedback = { path = "feedback" }
codex-file-search = { path = "file-search" }
codex-git = { path = "utils/git" }
@@ -88,7 +89,6 @@ codex-utils-json-to-toml = { path = "utils/json-to-toml" }
codex-utils-pty = { path = "utils/pty" }
codex-utils-readiness = { path = "utils/readiness" }
codex-utils-string = { path = "utils/string" }
codex-utils-tokenizer = { path = "utils/tokenizer" }
codex-windows-sandbox = { path = "windows-sandbox-rs" }
core_test_support = { path = "core/tests/common" }
mcp-types = { path = "mcp-types" }
@@ -109,6 +109,7 @@ axum = { version = "0.8", default-features = false }
base64 = "0.22.1"
bytes = "1.10.1"
chrono = "0.4.42"
chardetng = "0.1.17"
clap = "4"
clap_complete = "4"
color-eyre = "0.6.3"
@@ -121,6 +122,7 @@ dotenvy = "0.15.7"
dunce = "1.0.4"
env-flags = "0.1.1"
env_logger = "0.11.5"
encoding_rs = "0.8.35"
escargot = "0.5"
eventsource-stream = "0.2.3"
futures = { version = "0.3", default-features = false }
@@ -167,7 +169,6 @@ reqwest = "0.12"
rmcp = { version = "0.8.5", default-features = false }
schemars = "0.8.22"
seccompiler = "0.5.0"
sentry = "0.34.0"
serde = "1"
serde_json = "1"
serde_with = "3.14"
@@ -176,6 +177,7 @@ sha1 = "0.10.6"
sha2 = "0.10"
shlex = "1.3.0"
similar = "2.7.0"
socket2 = "0.6.0"
starlark = "0.13.0"
strum = "0.27.2"
strum_macros = "0.27.2"
@@ -185,7 +187,6 @@ tempfile = "3.23.0"
test-log = "0.2.18"
textwrap = "0.16.2"
thiserror = "2.0.17"
tiktoken-rs = "0.9"
time = "0.3"
tiny_http = "0.12"
tokio = "1"
@@ -263,7 +264,6 @@ ignored = [
"icu_provider",
"openssl-sys",
"codex-utils-readiness",
"codex-utils-tokenizer",
]
[profile.release]


@@ -19,6 +19,7 @@ schemars = { workspace = true }
serde = { workspace = true, features = ["derive"] }
serde_json = { workspace = true }
strum_macros = { workspace = true }
thiserror = { workspace = true }
ts-rs = { workspace = true }
uuid = { workspace = true, features = ["serde", "v7"] }


@@ -61,7 +61,32 @@ pub fn generate_types(out_dir: &Path, prettier: Option<&Path>) -> Result<()> {
Ok(())
}
#[derive(Clone, Copy, Debug)]
pub struct GenerateTsOptions {
pub generate_indices: bool,
pub ensure_headers: bool,
pub run_prettier: bool,
}
impl Default for GenerateTsOptions {
fn default() -> Self {
Self {
generate_indices: true,
ensure_headers: true,
run_prettier: true,
}
}
}
pub fn generate_ts(out_dir: &Path, prettier: Option<&Path>) -> Result<()> {
generate_ts_with_options(out_dir, prettier, GenerateTsOptions::default())
}
pub fn generate_ts_with_options(
out_dir: &Path,
prettier: Option<&Path>,
options: GenerateTsOptions,
) -> Result<()> {
let v2_out_dir = out_dir.join("v2");
ensure_dir(out_dir)?;
ensure_dir(&v2_out_dir)?;
@@ -74,17 +99,28 @@ pub fn generate_ts(out_dir: &Path, prettier: Option<&Path>) -> Result<()> {
export_server_responses(out_dir)?;
ServerNotification::export_all_to(out_dir)?;
generate_index_ts(out_dir)?;
generate_index_ts(&v2_out_dir)?;
if options.generate_indices {
generate_index_ts(out_dir)?;
generate_index_ts(&v2_out_dir)?;
}
// Ensure our header is present on all TS files (root + subdirs like v2/).
let ts_files = ts_files_in_recursive(out_dir)?;
for file in &ts_files {
prepend_header_if_missing(file)?;
let mut ts_files = Vec::new();
let should_collect_ts_files =
options.ensure_headers || (options.run_prettier && prettier.is_some());
if should_collect_ts_files {
ts_files = ts_files_in_recursive(out_dir)?;
}
if options.ensure_headers {
for file in &ts_files {
prepend_header_if_missing(file)?;
}
}
// Optionally run Prettier on all generated TS files.
if let Some(prettier_bin) = prettier
if options.run_prettier
&& let Some(prettier_bin) = prettier
&& !ts_files.is_empty()
{
let status = Command::new(prettier_bin)
@@ -723,7 +759,13 @@ mod tests {
let _guard = TempDirGuard(output_dir.clone());
generate_ts(&output_dir, None)?;
// Avoid doing more work than necessary to keep the test from timing out.
let options = GenerateTsOptions {
generate_indices: false,
ensure_headers: false,
run_prettier: false,
};
generate_ts_with_options(&output_dir, None, options)?;
let mut undefined_offenders = Vec::new();
let mut optional_nullable_offenders = BTreeSet::new();
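
The `GenerateTsOptions` change above follows a common pattern: a `Copy` options struct whose `Default` enables everything, so existing callers keep their behavior while tests opt out of expensive steps. The sketch below restates that gating logic in isolation. `plan_steps` and the step labels are hypothetical stand-ins introduced only for illustration. As a simplification, the sketch omits the check from the diff that skips Prettier when no TS files were collected.

```rust
// Hypothetical sketch: `plan_steps` is illustrative and does not exist in the diff.

#[derive(Clone, Copy, Debug)]
struct GenerateTsOptions {
    generate_indices: bool,
    ensure_headers: bool,
    run_prettier: bool,
}

impl Default for GenerateTsOptions {
    /// Every post-processing step is on by default, matching the diff.
    fn default() -> Self {
        Self {
            generate_indices: true,
            ensure_headers: true,
            run_prettier: true,
        }
    }
}

/// Lists the post-processing steps that would run for the given options.
/// `has_prettier` stands in for `prettier.is_some()` in the diff.
fn plan_steps(options: GenerateTsOptions, has_prettier: bool) -> Vec<&'static str> {
    let mut steps = Vec::new();
    if options.generate_indices {
        steps.push("generate index.ts");
    }
    // The file listing is only needed when a later step will consume it.
    let will_format = options.run_prettier && has_prettier;
    if options.ensure_headers || will_format {
        steps.push("collect ts files");
    }
    if options.ensure_headers {
        steps.push("prepend headers");
    }
    if will_format {
        steps.push("run prettier");
    }
    steps
}
```

A caller that wants to disable a single step can use struct update syntax, for example `GenerateTsOptions { ensure_headers: false, ..Default::default() }`.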


@@ -7,5 +7,6 @@ pub use export::generate_ts;
pub use export::generate_types;
pub use jsonrpc_lite::*;
pub use protocol::common::*;
pub use protocol::thread_history::*;
pub use protocol::v1::*;
pub use protocol::v2::*;


@@ -129,6 +129,10 @@ client_request_definitions! {
params: v2::TurnInterruptParams,
response: v2::TurnInterruptResponse,
},
ReviewStart => "review/start" {
params: v2::ReviewStartParams,
response: v2::TurnStartResponse,
},
ModelList => "model/list" {
params: v2::ModelListParams,
@@ -374,7 +378,7 @@ macro_rules! server_notification_definitions {
impl TryFrom<JSONRPCNotification> for ServerNotification {
type Error = serde_json::Error;
fn try_from(value: JSONRPCNotification) -> Result<Self, Self::Error> {
fn try_from(value: JSONRPCNotification) -> Result<Self, serde_json::Error> {
serde_json::from_value(serde_json::to_value(value)?)
}
}
@@ -434,6 +438,13 @@ server_request_definitions! {
response: v2::CommandExecutionRequestApprovalResponse,
},
/// Sent when approval is requested for a specific file change.
/// This request is used for Turns started via turn/start.
FileChangeRequestApproval => "item/fileChange/requestApproval" {
params: v2::FileChangeRequestApprovalParams,
response: v2::FileChangeRequestApprovalResponse,
},
/// DEPRECATED APIs below
/// Request to approve a patch.
/// This request is used for Turns started via the legacy APIs (i.e. SendUserTurn, SendUserMessage).
@@ -476,6 +487,7 @@ pub struct FuzzyFileSearchResponse {
server_notification_definitions! {
/// NEW NOTIFICATIONS
Error => "error" (v2::ErrorNotification),
ThreadStarted => "thread/started" (v2::ThreadStartedNotification),
TurnStarted => "turn/started" (v2::TurnStartedNotification),
TurnCompleted => "turn/completed" (v2::TurnCompletedNotification),
@@ -490,6 +502,9 @@ server_notification_definitions! {
ReasoningSummaryPartAdded => "item/reasoning/summaryPartAdded" (v2::ReasoningSummaryPartAddedNotification),
ReasoningTextDelta => "item/reasoning/textDelta" (v2::ReasoningTextDeltaNotification),
/// Notifies the user of world-writable directories on Windows, which cannot be protected by the sandbox.
WindowsWorldWritableWarning => "windows/worldWritableWarning" (v2::WindowsWorldWritableWarningNotification),
#[serde(rename = "account/login/completed")]
#[ts(rename = "account/login/completed")]
#[strum(serialize = "account/login/completed")]
@@ -524,7 +539,7 @@ mod tests {
let request = ClientRequest::NewConversation {
request_id: RequestId::Integer(42),
params: v1::NewConversationParams {
model: Some("gpt-5.1-codex".to_string()),
model: Some("gpt-5.1-codex-max".to_string()),
model_provider: None,
profile: None,
cwd: None,
@@ -542,7 +557,7 @@ mod tests {
"method": "newConversation",
"id": 42,
"params": {
"model": "gpt-5.1-codex",
"model": "gpt-5.1-codex-max",
"modelProvider": null,
"profile": null,
"cwd": null,


@@ -2,5 +2,6 @@
// Exposes protocol pieces used by `lib.rs` via `pub use protocol::common::*;`.
pub mod common;
pub mod thread_history;
pub mod v1;
pub mod v2;


@@ -0,0 +1,409 @@
use crate::protocol::v2::ThreadItem;
use crate::protocol::v2::Turn;
use crate::protocol::v2::TurnStatus;
use crate::protocol::v2::UserInput;
use codex_protocol::protocol::AgentReasoningEvent;
use codex_protocol::protocol::AgentReasoningRawContentEvent;
use codex_protocol::protocol::EventMsg;
use codex_protocol::protocol::TurnAbortedEvent;
use codex_protocol::protocol::UserMessageEvent;
/// Convert persisted [`EventMsg`] entries into a sequence of [`Turn`] values.
///
/// The purpose of this is to convert the EventMsgs persisted in a rollout file
/// into a sequence of Turns and ThreadItems, which allows the client to render
/// the historical messages when resuming a thread.
pub fn build_turns_from_event_msgs(events: &[EventMsg]) -> Vec<Turn> {
let mut builder = ThreadHistoryBuilder::new();
for event in events {
builder.handle_event(event);
}
builder.finish()
}
struct ThreadHistoryBuilder {
turns: Vec<Turn>,
current_turn: Option<PendingTurn>,
next_turn_index: i64,
next_item_index: i64,
}
impl ThreadHistoryBuilder {
fn new() -> Self {
Self {
turns: Vec::new(),
current_turn: None,
next_turn_index: 1,
next_item_index: 1,
}
}
fn finish(mut self) -> Vec<Turn> {
self.finish_current_turn();
self.turns
}
/// This function should handle all EventMsg variants that can be persisted in a rollout file.
/// See `should_persist_event_msg` in `codex-rs/core/rollout/policy.rs`.
fn handle_event(&mut self, event: &EventMsg) {
match event {
EventMsg::UserMessage(payload) => self.handle_user_message(payload),
EventMsg::AgentMessage(payload) => self.handle_agent_message(payload.message.clone()),
EventMsg::AgentReasoning(payload) => self.handle_agent_reasoning(payload),
EventMsg::AgentReasoningRawContent(payload) => {
self.handle_agent_reasoning_raw_content(payload)
}
EventMsg::TokenCount(_) => {}
EventMsg::EnteredReviewMode(_) => {}
EventMsg::ExitedReviewMode(_) => {}
EventMsg::UndoCompleted(_) => {}
EventMsg::TurnAborted(payload) => self.handle_turn_aborted(payload),
_ => {}
}
}
fn handle_user_message(&mut self, payload: &UserMessageEvent) {
self.finish_current_turn();
let mut turn = self.new_turn();
let id = self.next_item_id();
let content = self.build_user_inputs(payload);
turn.items.push(ThreadItem::UserMessage { id, content });
self.current_turn = Some(turn);
}
fn handle_agent_message(&mut self, text: String) {
if text.is_empty() {
return;
}
let id = self.next_item_id();
self.ensure_turn()
.items
.push(ThreadItem::AgentMessage { id, text });
}
fn handle_agent_reasoning(&mut self, payload: &AgentReasoningEvent) {
if payload.text.is_empty() {
return;
}
// If the last item is a reasoning item, add the new text to the summary.
if let Some(ThreadItem::Reasoning { summary, .. }) = self.ensure_turn().items.last_mut() {
summary.push(payload.text.clone());
return;
}
// Otherwise, create a new reasoning item.
let id = self.next_item_id();
self.ensure_turn().items.push(ThreadItem::Reasoning {
id,
summary: vec![payload.text.clone()],
content: Vec::new(),
});
}
fn handle_agent_reasoning_raw_content(&mut self, payload: &AgentReasoningRawContentEvent) {
if payload.text.is_empty() {
return;
}
// If the last item is a reasoning item, add the new text to the content.
if let Some(ThreadItem::Reasoning { content, .. }) = self.ensure_turn().items.last_mut() {
content.push(payload.text.clone());
return;
}
// Otherwise, create a new reasoning item.
let id = self.next_item_id();
self.ensure_turn().items.push(ThreadItem::Reasoning {
id,
summary: Vec::new(),
content: vec![payload.text.clone()],
});
}
fn handle_turn_aborted(&mut self, _payload: &TurnAbortedEvent) {
let Some(turn) = self.current_turn.as_mut() else {
return;
};
turn.status = TurnStatus::Interrupted;
}
fn finish_current_turn(&mut self) {
if let Some(turn) = self.current_turn.take() {
if turn.items.is_empty() {
return;
}
self.turns.push(turn.into());
}
}
fn new_turn(&mut self) -> PendingTurn {
PendingTurn {
id: self.next_turn_id(),
items: Vec::new(),
status: TurnStatus::Completed,
}
}
fn ensure_turn(&mut self) -> &mut PendingTurn {
if self.current_turn.is_none() {
let turn = self.new_turn();
return self.current_turn.insert(turn);
}
if let Some(turn) = self.current_turn.as_mut() {
return turn;
}
unreachable!("current turn must exist after initialization");
}
fn next_turn_id(&mut self) -> String {
let id = format!("turn-{}", self.next_turn_index);
self.next_turn_index += 1;
id
}
fn next_item_id(&mut self) -> String {
let id = format!("item-{}", self.next_item_index);
self.next_item_index += 1;
id
}
fn build_user_inputs(&self, payload: &UserMessageEvent) -> Vec<UserInput> {
let mut content = Vec::new();
if !payload.message.trim().is_empty() {
content.push(UserInput::Text {
text: payload.message.clone(),
});
}
if let Some(images) = &payload.images {
for image in images {
content.push(UserInput::Image { url: image.clone() });
}
}
content
}
}
struct PendingTurn {
id: String,
items: Vec<ThreadItem>,
status: TurnStatus,
}
impl From<PendingTurn> for Turn {
fn from(value: PendingTurn) -> Self {
Self {
id: value.id,
items: value.items,
status: value.status,
}
}
}
#[cfg(test)]
mod tests {
use super::*;
use codex_protocol::protocol::AgentMessageEvent;
use codex_protocol::protocol::AgentReasoningEvent;
use codex_protocol::protocol::AgentReasoningRawContentEvent;
use codex_protocol::protocol::TurnAbortReason;
use codex_protocol::protocol::TurnAbortedEvent;
use codex_protocol::protocol::UserMessageEvent;
use pretty_assertions::assert_eq;
#[test]
fn builds_multiple_turns_with_reasoning_items() {
let events = vec![
EventMsg::UserMessage(UserMessageEvent {
message: "First turn".into(),
images: Some(vec!["https://example.com/one.png".into()]),
}),
EventMsg::AgentMessage(AgentMessageEvent {
message: "Hi there".into(),
}),
EventMsg::AgentReasoning(AgentReasoningEvent {
text: "thinking".into(),
}),
EventMsg::AgentReasoningRawContent(AgentReasoningRawContentEvent {
text: "full reasoning".into(),
}),
EventMsg::UserMessage(UserMessageEvent {
message: "Second turn".into(),
images: None,
}),
EventMsg::AgentMessage(AgentMessageEvent {
message: "Reply two".into(),
}),
];
let turns = build_turns_from_event_msgs(&events);
assert_eq!(turns.len(), 2);
let first = &turns[0];
assert_eq!(first.id, "turn-1");
assert_eq!(first.status, TurnStatus::Completed);
assert_eq!(first.items.len(), 3);
assert_eq!(
first.items[0],
ThreadItem::UserMessage {
id: "item-1".into(),
content: vec![
UserInput::Text {
text: "First turn".into(),
},
UserInput::Image {
url: "https://example.com/one.png".into(),
}
],
}
);
assert_eq!(
first.items[1],
ThreadItem::AgentMessage {
id: "item-2".into(),
text: "Hi there".into(),
}
);
assert_eq!(
first.items[2],
ThreadItem::Reasoning {
id: "item-3".into(),
summary: vec!["thinking".into()],
content: vec!["full reasoning".into()],
}
);
let second = &turns[1];
assert_eq!(second.id, "turn-2");
assert_eq!(second.items.len(), 2);
assert_eq!(
second.items[0],
ThreadItem::UserMessage {
id: "item-4".into(),
content: vec![UserInput::Text {
text: "Second turn".into()
}],
}
);
assert_eq!(
second.items[1],
ThreadItem::AgentMessage {
id: "item-5".into(),
text: "Reply two".into(),
}
);
}
#[test]
fn splits_reasoning_when_interleaved() {
let events = vec![
EventMsg::UserMessage(UserMessageEvent {
message: "Turn start".into(),
images: None,
}),
EventMsg::AgentReasoning(AgentReasoningEvent {
text: "first summary".into(),
}),
EventMsg::AgentReasoningRawContent(AgentReasoningRawContentEvent {
text: "first content".into(),
}),
EventMsg::AgentMessage(AgentMessageEvent {
message: "interlude".into(),
}),
EventMsg::AgentReasoning(AgentReasoningEvent {
text: "second summary".into(),
}),
];
let turns = build_turns_from_event_msgs(&events);
assert_eq!(turns.len(), 1);
let turn = &turns[0];
assert_eq!(turn.items.len(), 4);
assert_eq!(
turn.items[1],
ThreadItem::Reasoning {
id: "item-2".into(),
summary: vec!["first summary".into()],
content: vec!["first content".into()],
}
);
assert_eq!(
turn.items[3],
ThreadItem::Reasoning {
id: "item-4".into(),
summary: vec!["second summary".into()],
content: Vec::new(),
}
);
}
#[test]
fn marks_turn_as_interrupted_when_aborted() {
let events = vec![
EventMsg::UserMessage(UserMessageEvent {
message: "Please do the thing".into(),
images: None,
}),
EventMsg::AgentMessage(AgentMessageEvent {
message: "Working...".into(),
}),
EventMsg::TurnAborted(TurnAbortedEvent {
reason: TurnAbortReason::Replaced,
}),
EventMsg::UserMessage(UserMessageEvent {
message: "Let's try again".into(),
images: None,
}),
EventMsg::AgentMessage(AgentMessageEvent {
message: "Second attempt complete.".into(),
}),
];
let turns = build_turns_from_event_msgs(&events);
assert_eq!(turns.len(), 2);
let first_turn = &turns[0];
assert_eq!(first_turn.status, TurnStatus::Interrupted);
assert_eq!(first_turn.items.len(), 2);
assert_eq!(
first_turn.items[0],
ThreadItem::UserMessage {
id: "item-1".into(),
content: vec![UserInput::Text {
text: "Please do the thing".into()
}],
}
);
assert_eq!(
first_turn.items[1],
ThreadItem::AgentMessage {
id: "item-2".into(),
text: "Working...".into(),
}
);
let second_turn = &turns[1];
assert_eq!(second_turn.status, TurnStatus::Completed);
assert_eq!(second_turn.items.len(), 2);
assert_eq!(
second_turn.items[0],
ThreadItem::UserMessage {
id: "item-3".into(),
content: vec![UserInput::Text {
text: "Let's try again".into()
}],
}
);
assert_eq!(
second_turn.items[1],
ThreadItem::AgentMessage {
id: "item-4".into(),
text: "Second attempt complete.".into(),
}
);
}
}
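
The grouping rule that `ThreadHistoryBuilder` applies can be summarized in a reduced form: a user message closes the current turn and opens a new one, agent output attaches to the current turn (creating one if needed), and an abort event marks the current turn as interrupted. The sketch below is a simplified stand-in, not the code from the diff. `Event`, `Turn`, and `build_turns` are hypothetical types that replace `EventMsg`, `Turn`, and `ThreadItem` with plain strings, and reasoning items and item ids are left out.

```rust
// Hypothetical, simplified types: the diff uses `EventMsg`, `Turn`, and `ThreadItem`.

#[derive(Debug)]
enum Event {
    User(String),
    Agent(String),
    Aborted,
}

#[derive(Debug, PartialEq)]
struct Turn {
    id: String,
    items: Vec<String>,
    interrupted: bool,
}

/// Allocates the next sequential turn id, mirroring `next_turn_id` in the diff.
fn new_turn(next_index: &mut u32) -> Turn {
    let id = format!("turn-{next_index}");
    *next_index += 1;
    Turn { id, items: Vec::new(), interrupted: false }
}

fn build_turns(events: &[Event]) -> Vec<Turn> {
    let mut turns = Vec::new();
    let mut current: Option<Turn> = None;
    let mut next_index = 1;

    for event in events {
        match event {
            Event::User(text) => {
                // A user message finishes the previous turn and starts a new one.
                turns.extend(current.take());
                let mut turn = new_turn(&mut next_index);
                turn.items.push(format!("user: {text}"));
                current = Some(turn);
            }
            Event::Agent(text) => {
                // Agent output joins the open turn, creating one if none exists.
                current
                    .get_or_insert_with(|| new_turn(&mut next_index))
                    .items
                    .push(format!("agent: {text}"));
            }
            Event::Aborted => {
                // An abort with no open turn is ignored.
                if let Some(turn) = current.as_mut() {
                    turn.interrupted = true;
                }
            }
        }
    }

    // Flush the last open turn, if any.
    turns.extend(current);
    turns
}
```

Using `Option::get_or_insert_with` covers the same case as `ensure_turn` in the diff without needing an `unreachable!` branch.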


@@ -11,6 +11,8 @@ use codex_protocol::items::AgentMessageContent as CoreAgentMessageContent;
use codex_protocol::items::TurnItem as CoreTurnItem;
use codex_protocol::models::ResponseItem;
use codex_protocol::parse_command::ParsedCommand as CoreParsedCommand;
use codex_protocol::protocol::CodexErrorInfo as CoreCodexErrorInfo;
use codex_protocol::protocol::CreditsSnapshot as CoreCreditsSnapshot;
use codex_protocol::protocol::RateLimitSnapshot as CoreRateLimitSnapshot;
use codex_protocol::protocol::RateLimitWindow as CoreRateLimitWindow;
use codex_protocol::user_input::UserInput as CoreUserInput;
@@ -19,6 +21,7 @@ use schemars::JsonSchema;
use serde::Deserialize;
use serde::Serialize;
use serde_json::Value as JsonValue;
use thiserror::Error;
use ts_rs::TS;
// Macro to declare a camelCased API v2 enum mirroring a core enum which
@@ -46,6 +49,72 @@ macro_rules! v2_enum_from_core {
};
}
/// This translation layer makes sure that we expose Codex error codes in camelCase.
///
/// When an upstream HTTP status is available (for example, from the Responses API or a provider),
/// it is forwarded in `httpStatusCode` on the relevant `codexErrorInfo` variant.
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Eq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub enum CodexErrorInfo {
ContextWindowExceeded,
UsageLimitExceeded,
HttpConnectionFailed {
#[serde(rename = "httpStatusCode")]
#[ts(rename = "httpStatusCode")]
http_status_code: Option<u16>,
},
/// Failed to connect to the response SSE stream.
ResponseStreamConnectionFailed {
#[serde(rename = "httpStatusCode")]
#[ts(rename = "httpStatusCode")]
http_status_code: Option<u16>,
},
InternalServerError,
Unauthorized,
BadRequest,
SandboxError,
/// The response SSE stream disconnected in the middle of a turn before completion.
ResponseStreamDisconnected {
#[serde(rename = "httpStatusCode")]
#[ts(rename = "httpStatusCode")]
http_status_code: Option<u16>,
},
/// Reached the retry limit for responses.
ResponseTooManyFailedAttempts {
#[serde(rename = "httpStatusCode")]
#[ts(rename = "httpStatusCode")]
http_status_code: Option<u16>,
},
Other,
}
impl From<CoreCodexErrorInfo> for CodexErrorInfo {
fn from(value: CoreCodexErrorInfo) -> Self {
match value {
CoreCodexErrorInfo::ContextWindowExceeded => CodexErrorInfo::ContextWindowExceeded,
CoreCodexErrorInfo::UsageLimitExceeded => CodexErrorInfo::UsageLimitExceeded,
CoreCodexErrorInfo::HttpConnectionFailed { http_status_code } => {
CodexErrorInfo::HttpConnectionFailed { http_status_code }
}
CoreCodexErrorInfo::ResponseStreamConnectionFailed { http_status_code } => {
CodexErrorInfo::ResponseStreamConnectionFailed { http_status_code }
}
CoreCodexErrorInfo::InternalServerError => CodexErrorInfo::InternalServerError,
CoreCodexErrorInfo::Unauthorized => CodexErrorInfo::Unauthorized,
CoreCodexErrorInfo::BadRequest => CodexErrorInfo::BadRequest,
CoreCodexErrorInfo::SandboxError => CodexErrorInfo::SandboxError,
CoreCodexErrorInfo::ResponseStreamDisconnected { http_status_code } => {
CodexErrorInfo::ResponseStreamDisconnected { http_status_code }
}
CoreCodexErrorInfo::ResponseTooManyFailedAttempts { http_status_code } => {
CodexErrorInfo::ResponseTooManyFailedAttempts { http_status_code }
}
CoreCodexErrorInfo::Other => CodexErrorInfo::Other,
}
}
}
v2_enum_from_core!(
pub enum AskForApproval from codex_protocol::protocol::AskForApproval {
UnlessTrusted, OnFailure, OnRequest, Never
@@ -402,6 +471,12 @@ pub struct ThreadStartParams {
#[ts(export_to = "v2/")]
pub struct ThreadStartResponse {
pub thread: Thread,
pub model: String,
pub model_provider: String,
pub cwd: PathBuf,
pub approval_policy: AskForApproval,
pub sandbox: SandboxPolicy,
pub reasoning_effort: Option<ReasoningEffort>,
}
#[derive(Serialize, Deserialize, Debug, Default, Clone, PartialEq, JsonSchema, TS)]
@@ -444,6 +519,12 @@ pub struct ThreadResumeParams {
#[ts(export_to = "v2/")]
pub struct ThreadResumeResponse {
pub thread: Thread,
pub model: String,
pub model_provider: String,
pub cwd: PathBuf,
pub approval_policy: AskForApproval,
pub sandbox: SandboxPolicy,
pub reasoning_effort: Option<ReasoningEffort>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
@@ -505,6 +586,10 @@ pub struct Thread {
pub created_at: i64,
/// [UNSTABLE] Path to the thread on disk.
pub path: PathBuf,
/// Only populated on a `thread/resume` response.
/// For all other responses and notifications returning a Thread,
/// the turns field will be an empty list.
pub turns: Vec<Turn>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
@@ -519,25 +604,37 @@ pub struct AccountUpdatedNotification {
#[ts(export_to = "v2/")]
pub struct Turn {
pub id: String,
/// Only populated on a `thread/resume` response.
/// For all other responses and notifications returning a Turn,
/// the items field will be an empty list.
pub items: Vec<ThreadItem>,
#[serde(flatten)]
pub status: TurnStatus,
pub error: Option<TurnError>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS, Error)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
#[error("{message}")]
pub struct TurnError {
pub message: String,
pub codex_error_info: Option<CodexErrorInfo>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct ErrorNotification {
pub error: TurnError,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(tag = "status", rename_all = "camelCase")]
#[ts(tag = "status", export_to = "v2/")]
pub enum TurnStatus {
Completed,
Interrupted,
Failed,
Failed { error: TurnError },
InProgress,
}
@@ -562,6 +659,45 @@ pub struct TurnStartParams {
pub summary: Option<ReasoningSummary>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct ReviewStartParams {
pub thread_id: String,
pub target: ReviewTarget,
/// When true, also append the final review message to the original thread.
#[serde(default)]
pub append_to_original_thread: bool,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(tag = "type", rename_all = "camelCase")]
#[ts(tag = "type", export_to = "v2/")]
pub enum ReviewTarget {
/// Review the working tree: staged, unstaged, and untracked files.
UncommittedChanges,
/// Review changes between the current branch and the given base branch.
#[serde(rename_all = "camelCase")]
#[ts(rename_all = "camelCase")]
BaseBranch { branch: String },
/// Review the changes introduced by a specific commit.
#[serde(rename_all = "camelCase")]
#[ts(rename_all = "camelCase")]
Commit {
sha: String,
/// Optional human-readable label (e.g., commit subject) for UIs.
title: Option<String>,
},
/// Arbitrary instructions, equivalent to the old free-form prompt.
#[serde(rename_all = "camelCase")]
#[ts(rename_all = "camelCase")]
Custom { instructions: String },
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
@@ -723,6 +859,7 @@ pub enum CommandExecutionStatus {
InProgress,
Completed,
Failed,
Declined,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
@@ -735,20 +872,23 @@ pub struct FileUpdateChange {
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[serde(tag = "type", rename_all = "camelCase")]
#[ts(tag = "type")]
#[ts(export_to = "v2/")]
pub enum PatchChangeKind {
Add,
Delete,
Update,
Update { move_path: Option<PathBuf> },
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub enum PatchApplyStatus {
InProgress,
Completed,
Failed,
Declined,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
@@ -814,8 +954,6 @@ pub struct Usage {
#[ts(export_to = "v2/")]
pub struct TurnCompletedNotification {
pub turn: Turn,
// TODO: should usage be stored on the Turn object, and we return that instead?
pub usage: Usage,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
@@ -883,6 +1021,15 @@ pub struct McpToolCallProgressNotification {
pub message: String,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct WindowsWorldWritableWarningNotification {
pub sample_paths: Vec<String>,
pub extra_count: usize,
pub failed_scan: bool,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
@@ -916,6 +1063,26 @@ pub struct CommandExecutionRequestApprovalResponse {
pub accept_settings: Option<CommandExecutionRequestAcceptSettings>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct FileChangeRequestApprovalParams {
pub thread_id: String,
pub turn_id: String,
pub item_id: String,
/// Optional explanatory reason (e.g. request for extra write access).
pub reason: Option<String>,
/// [UNSTABLE] When set, the agent is asking the user to allow writes under this root
/// for the remainder of the session (unclear if this is honored today).
pub grant_root: Option<PathBuf>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[ts(export_to = "v2/")]
pub struct FileChangeRequestApprovalResponse {
pub decision: ApprovalDecision,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
@@ -929,6 +1096,7 @@ pub struct AccountRateLimitsUpdatedNotification {
pub struct RateLimitSnapshot {
pub primary: Option<RateLimitWindow>,
pub secondary: Option<RateLimitWindow>,
pub credits: Option<CreditsSnapshot>,
}
impl From<CoreRateLimitSnapshot> for RateLimitSnapshot {
@@ -936,6 +1104,7 @@ impl From<CoreRateLimitSnapshot> for RateLimitSnapshot {
Self {
primary: value.primary.map(RateLimitWindow::from),
secondary: value.secondary.map(RateLimitWindow::from),
credits: value.credits.map(CreditsSnapshot::from),
}
}
}
@@ -959,6 +1128,25 @@ impl From<CoreRateLimitWindow> for RateLimitWindow {
}
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct CreditsSnapshot {
pub has_credits: bool,
pub unlimited: bool,
pub balance: Option<String>,
}
impl From<CoreCreditsSnapshot> for CreditsSnapshot {
fn from(value: CoreCreditsSnapshot) -> Self {
Self {
has_credits: value.has_credits,
unlimited: value.unlimited,
balance: value.balance,
}
}
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
@@ -981,6 +1169,7 @@ mod tests {
use codex_protocol::items::WebSearchItem;
use codex_protocol::user_input::UserInput as CoreUserInput;
use pretty_assertions::assert_eq;
use serde_json::json;
use std::path::PathBuf;
#[test]
@@ -1066,4 +1255,20 @@ mod tests {
}
);
}
#[test]
fn codex_error_info_serializes_http_status_code_in_camel_case() {
let value = CodexErrorInfo::ResponseTooManyFailedAttempts {
http_status_code: Some(401),
};
assert_eq!(
serde_json::to_value(value).unwrap(),
json!({
"responseTooManyFailedAttempts": {
"httpStatusCode": 401
}
})
);
}
}

@@ -17,15 +17,22 @@ use clap::Parser;
use clap::Subcommand;
use codex_app_server_protocol::AddConversationListenerParams;
use codex_app_server_protocol::AddConversationSubscriptionResponse;
use codex_app_server_protocol::ApprovalDecision;
use codex_app_server_protocol::AskForApproval;
use codex_app_server_protocol::ClientInfo;
use codex_app_server_protocol::ClientRequest;
use codex_app_server_protocol::CommandExecutionRequestAcceptSettings;
use codex_app_server_protocol::CommandExecutionRequestApprovalParams;
use codex_app_server_protocol::CommandExecutionRequestApprovalResponse;
use codex_app_server_protocol::FileChangeRequestApprovalParams;
use codex_app_server_protocol::FileChangeRequestApprovalResponse;
use codex_app_server_protocol::GetAccountRateLimitsResponse;
use codex_app_server_protocol::InitializeParams;
use codex_app_server_protocol::InitializeResponse;
use codex_app_server_protocol::InputItem;
use codex_app_server_protocol::JSONRPCMessage;
use codex_app_server_protocol::JSONRPCNotification;
use codex_app_server_protocol::JSONRPCRequest;
use codex_app_server_protocol::JSONRPCResponse;
use codex_app_server_protocol::LoginChatGptCompleteNotification;
use codex_app_server_protocol::LoginChatGptResponse;
@@ -36,14 +43,17 @@ use codex_app_server_protocol::SandboxPolicy;
use codex_app_server_protocol::SendUserMessageParams;
use codex_app_server_protocol::SendUserMessageResponse;
use codex_app_server_protocol::ServerNotification;
use codex_app_server_protocol::ServerRequest;
use codex_app_server_protocol::ThreadStartParams;
use codex_app_server_protocol::ThreadStartResponse;
use codex_app_server_protocol::TurnStartParams;
use codex_app_server_protocol::TurnStartResponse;
use codex_app_server_protocol::TurnStatus;
use codex_app_server_protocol::UserInput as V2UserInput;
use codex_protocol::ConversationId;
use codex_protocol::protocol::Event;
use codex_protocol::protocol::EventMsg;
use serde::Serialize;
use serde::de::DeserializeOwned;
use serde_json::Value;
use uuid::Uuid;
@@ -502,10 +512,9 @@ impl CodexClient {
ServerNotification::TurnCompleted(payload) => {
if payload.turn.id == turn_id {
println!("\n< turn/completed notification: {:?}", payload.turn.status);
if let Some(error) = payload.turn.error {
if let TurnStatus::Failed { error } = &payload.turn.status {
println!("[turn error] {}", error.message);
}
println!("< usage: {:?}", payload.usage);
break;
}
}
@@ -603,8 +612,8 @@ impl CodexClient {
JSONRPCMessage::Notification(notification) => {
self.pending_notifications.push_back(notification);
}
JSONRPCMessage::Request(_) => {
bail!("unexpected request from codex app-server");
JSONRPCMessage::Request(request) => {
self.handle_server_request(request)?;
}
}
}
@@ -624,8 +633,8 @@ impl CodexClient {
// No outstanding requests, so ignore stray responses/errors for now.
continue;
}
JSONRPCMessage::Request(_) => {
bail!("unexpected request from codex app-server");
JSONRPCMessage::Request(request) => {
self.handle_server_request(request)?;
}
}
}
@@ -661,6 +670,115 @@ impl CodexClient {
fn request_id(&self) -> RequestId {
RequestId::String(Uuid::new_v4().to_string())
}
fn handle_server_request(&mut self, request: JSONRPCRequest) -> Result<()> {
let server_request = ServerRequest::try_from(request)
.context("failed to deserialize ServerRequest from JSONRPCRequest")?;
match server_request {
ServerRequest::CommandExecutionRequestApproval { request_id, params } => {
self.handle_command_execution_request_approval(request_id, params)?;
}
ServerRequest::FileChangeRequestApproval { request_id, params } => {
self.approve_file_change_request(request_id, params)?;
}
other => {
bail!("received unsupported server request: {other:?}");
}
}
Ok(())
}
fn handle_command_execution_request_approval(
&mut self,
request_id: RequestId,
params: CommandExecutionRequestApprovalParams,
) -> Result<()> {
let CommandExecutionRequestApprovalParams {
thread_id,
turn_id,
item_id,
reason,
risk,
} = params;
println!(
"\n< commandExecution approval requested for thread {thread_id}, turn {turn_id}, item {item_id}"
);
if let Some(reason) = reason.as_deref() {
println!("< reason: {reason}");
}
if let Some(risk) = risk.as_ref() {
println!("< risk assessment: {risk:?}");
}
let response = CommandExecutionRequestApprovalResponse {
decision: ApprovalDecision::Accept,
accept_settings: Some(CommandExecutionRequestAcceptSettings { for_session: false }),
};
self.send_server_request_response(request_id, &response)?;
println!("< approved commandExecution request for item {item_id}");
Ok(())
}
fn approve_file_change_request(
&mut self,
request_id: RequestId,
params: FileChangeRequestApprovalParams,
) -> Result<()> {
let FileChangeRequestApprovalParams {
thread_id,
turn_id,
item_id,
reason,
grant_root,
} = params;
println!(
"\n< fileChange approval requested for thread {thread_id}, turn {turn_id}, item {item_id}"
);
if let Some(reason) = reason.as_deref() {
println!("< reason: {reason}");
}
if let Some(grant_root) = grant_root.as_deref() {
println!("< grant root: {}", grant_root.display());
}
let response = FileChangeRequestApprovalResponse {
decision: ApprovalDecision::Accept,
};
self.send_server_request_response(request_id, &response)?;
println!("< approved fileChange request for item {item_id}");
Ok(())
}
fn send_server_request_response<T>(&mut self, request_id: RequestId, response: &T) -> Result<()>
where
T: Serialize,
{
let message = JSONRPCMessage::Response(JSONRPCResponse {
id: request_id,
result: serde_json::to_value(response)?,
});
self.write_jsonrpc_message(message)
}
fn write_jsonrpc_message(&mut self, message: JSONRPCMessage) -> Result<()> {
let payload = serde_json::to_string(&message)?;
let pretty = serde_json::to_string_pretty(&message)?;
print_multiline_with_prefix("> ", &pretty);
if let Some(stdin) = self.stdin.as_mut() {
writeln!(stdin, "{payload}")?;
stdin
.flush()
.context("failed to flush response to codex app-server")?;
return Ok(());
}
bail!("codex app-server stdin closed")
}
}
fn print_multiline_with_prefix(prefix: &str, payload: &str) {

@@ -53,3 +53,4 @@ serial_test = { workspace = true }
tempfile = { workspace = true }
toml = { workspace = true }
wiremock = { workspace = true }
shlex = { workspace = true }

@@ -65,6 +65,7 @@ The JSON-RPC API exposes dedicated methods for managing Codex conversations. Thr
- `thread/archive` — move a thread's rollout file into the archived directory; returns `{}` on success.
- `turn/start` — add user input to a thread and begin Codex generation; responds with the initial `turn` object and streams `turn/started`, `item/*`, and `turn/completed` notifications.
- `turn/interrupt` — request cancellation of an in-flight turn by `(thread_id, turn_id)`; success is an empty `{}` response and the turn finishes with `status: "interrupted"`.
- `review/start` — kick off Codex's automated reviewer for a thread; responds like `turn/start` and emits an `item/completed` notification with a `codeReview` item when results are ready.
### 1) Start or resume a thread
@@ -181,6 +182,58 @@ You can cancel a running Turn with `turn/interrupt`.
The server requests cancellations for running subprocesses, then emits a `turn/completed` event with `status: "interrupted"`. Rely on the `turn/completed` to know when Codex-side cleanup is done.
### 6) Request a code review
Use `review/start` to run Codex's reviewer on the currently checked-out project. The request takes the thread id plus a `target` describing what should be reviewed:
- `{"type":"uncommittedChanges"}` — staged, unstaged, and untracked files.
- `{"type":"baseBranch","branch":"main"}` — diff against the provided branch's upstream (see prompt for the exact `git merge-base`/`git diff` instructions Codex will run).
- `{"type":"commit","sha":"abc1234","title":"Optional subject"}` — review a specific commit.
- `{"type":"custom","instructions":"Free-form reviewer instructions"}` — fallback prompt equivalent to the legacy manual review request.
- `appendToOriginalThread` (bool, default `false`) — when `true`, Codex also records a final assistant-style message with the review summary in the original thread. When `false`, only the `codeReview` item is emitted for the review run and no extra message is added to the original thread.
Example request/response:
```json
{ "method": "review/start", "id": 40, "params": {
"threadId": "thr_123",
"appendToOriginalThread": true,
"target": { "type": "commit", "sha": "1234567deadbeef", "title": "Polish tui colors" }
} }
{ "id": 40, "result": { "turn": {
"id": "turn_900",
"status": "inProgress",
"items": [
{ "type": "userMessage", "id": "turn_900", "content": [ { "type": "text", "text": "Review commit 1234567: Polish tui colors" } ] }
],
"error": null
} } }
```
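The target shapes and the example request above could be assembled on the client side roughly as follows. This is a minimal, std-only sketch: the `ReviewTarget` enum here is a local mirror of the documented JSON shapes, and `json_string`, `to_json`, and `review_start_request` are hypothetical helpers, not part of the app-server protocol crate. A real client would more likely serialize the protocol types with `serde`.

```rust
/// Hypothetical client-side mirror of the documented `target` shapes.
enum ReviewTarget {
    UncommittedChanges,
    BaseBranch { branch: String },
    Commit { sha: String, title: Option<String> },
    Custom { instructions: String },
}

/// Minimal JSON string escaping (quotes, backslashes, control characters).
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

impl ReviewTarget {
    /// Renders the tagged-union JSON object for this target.
    fn to_json(&self) -> String {
        match self {
            ReviewTarget::UncommittedChanges => r#"{"type":"uncommittedChanges"}"#.to_string(),
            ReviewTarget::BaseBranch { branch } => {
                format!(r#"{{"type":"baseBranch","branch":{}}}"#, json_string(branch))
            }
            ReviewTarget::Commit { sha, title } => {
                // An absent title is sent as JSON null in this sketch.
                let title = match title {
                    Some(t) => json_string(t),
                    None => "null".to_string(),
                };
                format!(
                    r#"{{"type":"commit","sha":{},"title":{}}}"#,
                    json_string(sha),
                    title
                )
            }
            ReviewTarget::Custom { instructions } => {
                format!(r#"{{"type":"custom","instructions":{}}}"#, json_string(instructions))
            }
        }
    }
}

/// Builds a single-line `review/start` JSON-RPC request.
fn review_start_request(id: u64, thread_id: &str, target: &ReviewTarget, append: bool) -> String {
    format!(
        r#"{{"method":"review/start","id":{},"params":{{"threadId":{},"appendToOriginalThread":{},"target":{}}}}}"#,
        id,
        json_string(thread_id),
        append,
        target.to_json()
    )
}
```

Calling `review_start_request(40, "thr_123", &target, true)` with a `Commit` target yields a one-line request equivalent to the JSON example above.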
Codex streams the usual `turn/started` notification followed by an `item/started`
with the same `codeReview` item id so clients can show progress:
```json
{ "method": "item/started", "params": { "item": {
"type": "codeReview",
"id": "turn_900",
"review": "current changes"
} } }
```
When the reviewer finishes, the server emits `item/completed` containing the same
`codeReview` item with the final review text:
```json
{ "method": "item/completed", "params": { "item": {
"type": "codeReview",
"id": "turn_900",
"review": "Looks solid overall...\n\n- Prefer Stylize helpers — app.rs:10-20\n ..."
} } }
```
The `review` string is plain text that already bundles the overall explanation plus a bullet list for each structured finding (matching `ThreadItem::CodeReview` in the generated schema). Use this notification to render the reviewer output in your client.
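One way a client might track the reviewer's progress is to key state by the `codeReview` item id, replacing the short hint from `item/started` with the final text from `item/completed`. The sketch below assumes the client has already parsed the notifications; `ReviewState` and `ReviewTracker` are hypothetical names, not protocol types.

```rust
use std::collections::HashMap;

/// Hypothetical client-side view of a `codeReview` item's progress.
#[derive(Debug, Clone, PartialEq)]
enum ReviewState {
    /// Seen `item/started`; holds the short hint shown while the review runs.
    Running { hint: String },
    /// Seen `item/completed`; holds the final plain-text review.
    Done { review: String },
}

/// Tracks review items by their item id.
#[derive(Default)]
struct ReviewTracker {
    by_item_id: HashMap<String, ReviewState>,
}

impl ReviewTracker {
    /// Record an `item/started` notification for a `codeReview` item.
    fn on_item_started(&mut self, item_id: &str, review: &str) {
        self.by_item_id.insert(
            item_id.to_string(),
            ReviewState::Running { hint: review.to_string() },
        );
    }

    /// Record an `item/completed` notification, replacing any running state.
    fn on_item_completed(&mut self, item_id: &str, review: &str) {
        self.by_item_id.insert(
            item_id.to_string(),
            ReviewState::Done { review: review.to_string() },
        );
    }

    /// Current state for an item id, if any notification was seen for it.
    fn state(&self, item_id: &str) -> Option<&ReviewState> {
        self.by_item_id.get(item_id)
    }
}
```

Because both notifications carry the same item id, the completed entry simply overwrites the running one, so a UI can re-render from `state(item_id)` after each notification.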
## Auth endpoints
The JSON-RPC auth/account surface exposes request/response methods plus server-initiated notifications (no `id`). Use these to determine auth state, start or cancel logins, logout, and inspect ChatGPT rate limits.
@@ -286,6 +339,29 @@ Event notifications are the server-initiated event stream for thread lifecycles,
The app-server streams JSON-RPC notifications while a turn is running. Each turn starts with `turn/started` (initial `turn`) and ends with `turn/completed` (final `turn`); clients subscribe to the events they care about and render each item incrementally as updates arrive. The per-item lifecycle is always: `item/started` → zero or more item-specific deltas → `item/completed`.
- `turn/started` — `{ turn }` with the turn id, empty `items`, and `status: "inProgress"`.
- `turn/completed` — `{ turn }` where `turn.status` is `completed`, `interrupted`, or `failed`; failures carry `{ error: { message, codexErrorInfo? } }`.
Today both notifications carry an empty `items` array even when item events were streamed; rely on `item/*` notifications for the canonical item list until this is fixed.
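A client could branch on the final status along these lines. The types below are local mirrors written for illustration (they follow the status values described above and keep only the `message` field of the error); `summarize` is a hypothetical helper.

```rust
/// Local mirror of the error payload; only `message` is modeled here.
#[derive(Debug, Clone, PartialEq)]
struct TurnError {
    message: String,
}

/// Local mirror of the turn status values described above.
#[derive(Debug, Clone, PartialEq)]
enum TurnStatus {
    Completed,
    Interrupted,
    Failed { error: TurnError },
    InProgress,
}

/// Returns a one-line summary for a finished turn, or `None` while the
/// turn is still in progress.
fn summarize(status: &TurnStatus) -> Option<String> {
    match status {
        TurnStatus::Completed => Some("turn completed".to_string()),
        TurnStatus::Interrupted => Some("turn interrupted".to_string()),
        TurnStatus::Failed { error } => Some(format!("turn failed: {}", error.message)),
        TurnStatus::InProgress => None,
    }
}
```

Keeping the error inside the `Failed` variant means the message is only reachable when the turn actually failed, so the match cannot read a stale error from a successful turn.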
#### Errors
An `error` event is emitted whenever the server hits an error mid-turn (for example, upstream model errors or quota limits). It carries the same `{ error: { message, codexErrorInfo? } }` payload as `turn.status: "failed"` and may precede that terminal notification.
`codexErrorInfo` maps to the `CodexErrorInfo` enum. Common values:
- `ContextWindowExceeded`
- `UsageLimitExceeded`
- `HttpConnectionFailed { httpStatusCode? }`: upstream HTTP failures including 4xx/5xx
- `ResponseStreamConnectionFailed { httpStatusCode? }`: failure to connect to the response SSE stream
- `ResponseStreamDisconnected { httpStatusCode? }`: disconnect of the response SSE stream in the middle of a turn before completion
- `ResponseTooManyFailedAttempts { httpStatusCode? }`
- `BadRequest`
- `Unauthorized`
- `SandboxError`
- `InternalServerError`
- `Other`: all unclassified errors
When an upstream HTTP status is available (for example, from the Responses API or a provider), it is forwarded in `httpStatusCode` on the relevant `codexErrorInfo` variant.
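As a rough illustration of reading the forwarded status, the sketch below mirrors the variants listed above in a local enum and pulls out the status code where one exists. The enum is a hand-written stand-in for the generated protocol type, and `http_status_code` and `describe` are hypothetical helpers. Note that the serialization test in this change shows the wire form in camelCase (for example `responseTooManyFailedAttempts` with `httpStatusCode`).

```rust
/// Local stand-in for the `CodexErrorInfo` variants listed above.
#[derive(Debug, Clone, PartialEq)]
enum CodexErrorInfo {
    ContextWindowExceeded,
    UsageLimitExceeded,
    HttpConnectionFailed { http_status_code: Option<u16> },
    ResponseStreamConnectionFailed { http_status_code: Option<u16> },
    ResponseStreamDisconnected { http_status_code: Option<u16> },
    ResponseTooManyFailedAttempts { http_status_code: Option<u16> },
    BadRequest,
    Unauthorized,
    SandboxError,
    InternalServerError,
    Other,
}

impl CodexErrorInfo {
    /// Upstream HTTP status, if this variant carries one and it was provided.
    fn http_status_code(&self) -> Option<u16> {
        match self {
            Self::HttpConnectionFailed { http_status_code }
            | Self::ResponseStreamConnectionFailed { http_status_code }
            | Self::ResponseStreamDisconnected { http_status_code }
            | Self::ResponseTooManyFailedAttempts { http_status_code } => *http_status_code,
            _ => None,
        }
    }
}

/// Formats an error message, appending the upstream status when known.
fn describe(message: &str, info: Option<&CodexErrorInfo>) -> String {
    match info.and_then(CodexErrorInfo::http_status_code) {
        Some(code) => format!("{} (upstream HTTP {})", message, code),
        None => message.to_string(),
    }
}
```

Since `codexErrorInfo` is itself optional and `httpStatusCode` is optional within it, the helper collapses both layers into a single `Option<u16>` before formatting.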
#### Thread items
`ThreadItem` is the tagged union carried in turn responses and `item/*` notifications. Currently we support events for the following items:

@@ -1,24 +1,33 @@
use crate::codex_message_processor::ApiVersion;
use crate::codex_message_processor::PendingInterrupts;
use crate::codex_message_processor::TurnSummary;
use crate::codex_message_processor::TurnSummaryStore;
use crate::outgoing_message::OutgoingMessageSender;
use codex_app_server_protocol::AccountRateLimitsUpdatedNotification;
use codex_app_server_protocol::AgentMessageDeltaNotification;
use codex_app_server_protocol::ApplyPatchApprovalParams;
use codex_app_server_protocol::ApplyPatchApprovalResponse;
use codex_app_server_protocol::ApprovalDecision;
use codex_app_server_protocol::CodexErrorInfo as V2CodexErrorInfo;
use codex_app_server_protocol::CommandAction as V2ParsedCommand;
use codex_app_server_protocol::CommandExecutionOutputDeltaNotification;
use codex_app_server_protocol::CommandExecutionRequestApprovalParams;
use codex_app_server_protocol::CommandExecutionRequestApprovalResponse;
use codex_app_server_protocol::CommandExecutionStatus;
use codex_app_server_protocol::ErrorNotification;
use codex_app_server_protocol::ExecCommandApprovalParams;
use codex_app_server_protocol::ExecCommandApprovalResponse;
use codex_app_server_protocol::FileChangeRequestApprovalParams;
use codex_app_server_protocol::FileChangeRequestApprovalResponse;
use codex_app_server_protocol::FileUpdateChange;
use codex_app_server_protocol::InterruptConversationResponse;
use codex_app_server_protocol::ItemCompletedNotification;
use codex_app_server_protocol::ItemStartedNotification;
use codex_app_server_protocol::McpToolCallError;
use codex_app_server_protocol::McpToolCallResult;
use codex_app_server_protocol::McpToolCallStatus;
use codex_app_server_protocol::PatchApplyStatus;
use codex_app_server_protocol::PatchChangeKind as V2PatchChangeKind;
use codex_app_server_protocol::ReasoningSummaryPartAddedNotification;
use codex_app_server_protocol::ReasoningSummaryTextDeltaNotification;
use codex_app_server_protocol::ReasoningTextDeltaNotification;
@@ -26,7 +35,11 @@ use codex_app_server_protocol::SandboxCommandAssessment as V2SandboxCommandAsses
use codex_app_server_protocol::ServerNotification;
use codex_app_server_protocol::ServerRequestPayload;
use codex_app_server_protocol::ThreadItem;
use codex_app_server_protocol::Turn;
use codex_app_server_protocol::TurnCompletedNotification;
use codex_app_server_protocol::TurnError;
use codex_app_server_protocol::TurnInterruptResponse;
use codex_app_server_protocol::TurnStatus;
use codex_core::CodexConversation;
use codex_core::parse_command::shlex_join;
use codex_core::protocol::ApplyPatchApprovalRequestEvent;
@@ -34,12 +47,17 @@ use codex_core::protocol::Event;
use codex_core::protocol::EventMsg;
use codex_core::protocol::ExecApprovalRequestEvent;
use codex_core::protocol::ExecCommandEndEvent;
use codex_core::protocol::FileChange as CoreFileChange;
use codex_core::protocol::McpToolCallBeginEvent;
use codex_core::protocol::McpToolCallEndEvent;
use codex_core::protocol::Op;
use codex_core::protocol::ReviewDecision;
use codex_core::review_format::format_review_findings_block;
use codex_protocol::ConversationId;
use codex_protocol::protocol::ReviewOutputEvent;
use std::collections::HashMap;
use std::convert::TryFrom;
use std::path::PathBuf;
use std::sync::Arc;
use tokio::sync::oneshot;
use tracing::error;
@@ -52,30 +70,84 @@ pub(crate) async fn apply_bespoke_event_handling(
conversation: Arc<CodexConversation>,
outgoing: Arc<OutgoingMessageSender>,
pending_interrupts: PendingInterrupts,
turn_summary_store: TurnSummaryStore,
api_version: ApiVersion,
) {
let Event { id: event_id, msg } = event;
match msg {
EventMsg::TaskComplete(_ev) => {
handle_turn_complete(conversation_id, event_id, &outgoing, &turn_summary_store).await;
}
EventMsg::ApplyPatchApprovalRequest(ApplyPatchApprovalRequestEvent {
call_id,
turn_id,
changes,
reason,
grant_root,
}) => {
let params = ApplyPatchApprovalParams {
conversation_id,
call_id,
file_changes: changes,
reason,
grant_root,
};
let rx = outgoing
.send_request(ServerRequestPayload::ApplyPatchApproval(params))
.await;
tokio::spawn(async move {
on_patch_approval_response(event_id, rx, conversation).await;
});
}
}) => match api_version {
ApiVersion::V1 => {
let params = ApplyPatchApprovalParams {
conversation_id,
call_id,
file_changes: changes.clone(),
reason,
grant_root,
};
let rx = outgoing
.send_request(ServerRequestPayload::ApplyPatchApproval(params))
.await;
tokio::spawn(async move {
on_patch_approval_response(event_id, rx, conversation).await;
});
}
ApiVersion::V2 => {
// Until we migrate the core to be aware of a first class FileChangeItem
// and emit the corresponding EventMsg, we repurpose the call_id as the item_id.
let item_id = call_id.clone();
let patch_changes = convert_patch_changes(&changes);
let first_start = {
let mut map = turn_summary_store.lock().await;
let summary = map.entry(conversation_id).or_default();
summary.file_change_started.insert(item_id.clone())
};
if first_start {
let item = ThreadItem::FileChange {
id: item_id.clone(),
changes: patch_changes.clone(),
status: PatchApplyStatus::InProgress,
};
let notification = ItemStartedNotification { item };
outgoing
.send_server_notification(ServerNotification::ItemStarted(notification))
.await;
}
let params = FileChangeRequestApprovalParams {
thread_id: conversation_id.to_string(),
turn_id: turn_id.clone(),
item_id: item_id.clone(),
reason,
grant_root,
};
let rx = outgoing
.send_request(ServerRequestPayload::FileChangeRequestApproval(params))
.await;
tokio::spawn(async move {
on_file_change_request_approval_response(
event_id,
conversation_id,
item_id,
patch_changes,
rx,
conversation,
outgoing,
turn_summary_store,
)
.await;
});
}
},
EventMsg::ExecApprovalRequest(ExecApprovalRequestEvent {
call_id,
turn_id,
@@ -103,12 +175,20 @@ pub(crate) async fn apply_bespoke_event_handling(
});
}
ApiVersion::V2 => {
let item_id = call_id.clone();
let command_actions = parsed_cmd
.iter()
.cloned()
.map(V2ParsedCommand::from)
.collect::<Vec<_>>();
let command_string = shlex_join(&command);
let params = CommandExecutionRequestApprovalParams {
thread_id: conversation_id.to_string(),
turn_id: turn_id.clone(),
// Until we migrate the core to be aware of a first class CommandExecutionItem
// and emit the corresponding EventMsg, we repurpose the call_id as the item_id.
item_id: call_id.clone(),
item_id: item_id.clone(),
reason,
risk: risk.map(V2SandboxCommandAssessment::from),
};
@@ -118,8 +198,17 @@ pub(crate) async fn apply_bespoke_event_handling(
))
.await;
tokio::spawn(async move {
on_command_execution_request_approval_response(event_id, rx, conversation)
.await;
on_command_execution_request_approval_response(
event_id,
item_id,
command_string,
cwd,
command_actions,
rx,
conversation,
outgoing,
)
.await;
});
}
},
@@ -189,6 +278,42 @@ pub(crate) async fn apply_bespoke_event_handling(
.await;
}
}
EventMsg::Error(ev) => {
let turn_error = TurnError {
message: ev.message,
codex_error_info: ev.codex_error_info.map(V2CodexErrorInfo::from),
};
handle_error(conversation_id, turn_error.clone(), &turn_summary_store).await;
outgoing
.send_server_notification(ServerNotification::Error(ErrorNotification {
error: turn_error,
}))
.await;
}
EventMsg::StreamError(ev) => {
// We don't need to update the turn summary store for stream errors as they are intermediate error states for retries,
// but we notify the client.
let turn_error = TurnError {
message: ev.message,
codex_error_info: ev.codex_error_info.map(V2CodexErrorInfo::from),
};
outgoing
.send_server_notification(ServerNotification::Error(ErrorNotification {
error: turn_error,
}))
.await;
}
EventMsg::EnteredReviewMode(review_request) => {
let notification = ItemStartedNotification {
item: ThreadItem::CodeReview {
id: event_id.clone(),
review: review_request.user_facing_hint,
},
};
outgoing
.send_server_notification(ServerNotification::ItemStarted(notification))
.await;
}
EventMsg::ItemStarted(item_started_event) => {
let item: ThreadItem = item_started_event.item.clone().into();
let notification = ItemStartedNotification { item };
@@ -203,17 +328,80 @@ pub(crate) async fn apply_bespoke_event_handling(
.send_server_notification(ServerNotification::ItemCompleted(notification))
.await;
}
EventMsg::ExitedReviewMode(review_event) => {
let review_text = match review_event.review_output {
Some(output) => render_review_output_text(&output),
None => REVIEW_FALLBACK_MESSAGE.to_string(),
};
let notification = ItemCompletedNotification {
item: ThreadItem::CodeReview {
id: event_id,
review: review_text,
},
};
outgoing
.send_server_notification(ServerNotification::ItemCompleted(notification))
.await;
}
EventMsg::PatchApplyBegin(patch_begin_event) => {
// Until we migrate the core to be aware of a first class FileChangeItem
// and emit the corresponding EventMsg, we repurpose the call_id as the item_id.
let item_id = patch_begin_event.call_id.clone();
let first_start = {
let mut map = turn_summary_store.lock().await;
let summary = map.entry(conversation_id).or_default();
summary.file_change_started.insert(item_id.clone())
};
if first_start {
let item = ThreadItem::FileChange {
id: item_id.clone(),
changes: convert_patch_changes(&patch_begin_event.changes),
status: PatchApplyStatus::InProgress,
};
let notification = ItemStartedNotification { item };
outgoing
.send_server_notification(ServerNotification::ItemStarted(notification))
.await;
}
}
EventMsg::PatchApplyEnd(patch_end_event) => {
// Until we migrate the core to be aware of a first class FileChangeItem
// and emit the corresponding EventMsg, we repurpose the call_id as the item_id.
let item_id = patch_end_event.call_id.clone();
let status = if patch_end_event.success {
PatchApplyStatus::Completed
} else {
PatchApplyStatus::Failed
};
let changes = convert_patch_changes(&patch_end_event.changes);
complete_file_change_item(
conversation_id,
item_id,
changes,
status,
outgoing.as_ref(),
&turn_summary_store,
)
.await;
}
EventMsg::ExecCommandBegin(exec_command_begin_event) => {
let item_id = exec_command_begin_event.call_id.clone();
let command_actions = exec_command_begin_event
.parsed_cmd
.into_iter()
.map(V2ParsedCommand::from)
.collect::<Vec<_>>();
let command = shlex_join(&exec_command_begin_event.command);
let cwd = exec_command_begin_event.cwd;
let item = ThreadItem::CommandExecution {
id: exec_command_begin_event.call_id.clone(),
command: shlex_join(&exec_command_begin_event.command),
cwd: exec_command_begin_event.cwd,
id: item_id,
command,
cwd,
status: CommandExecutionStatus::InProgress,
command_actions: exec_command_begin_event
.parsed_cmd
.into_iter()
.map(V2ParsedCommand::from)
.collect(),
command_actions,
aggregated_output: None,
exit_code: None,
duration_ms: None,
@@ -251,6 +439,10 @@ pub(crate) async fn apply_bespoke_event_handling(
} else {
CommandExecutionStatus::Failed
};
let command_actions = parsed_cmd
.into_iter()
.map(V2ParsedCommand::from)
.collect::<Vec<_>>();
let aggregated_output = if aggregated_output.is_empty() {
None
@@ -265,7 +457,7 @@ pub(crate) async fn apply_bespoke_event_handling(
command: shlex_join(&command),
cwd,
status,
command_actions: parsed_cmd.into_iter().map(V2ParsedCommand::from).collect(),
command_actions,
aggregated_output,
exit_code: Some(exit_code),
duration_ms: Some(duration_ms),
@@ -298,12 +490,127 @@ pub(crate) async fn apply_bespoke_event_handling(
}
}
}
handle_turn_interrupted(conversation_id, event_id, &outgoing, &turn_summary_store)
.await;
}
_ => {}
}
}
async fn emit_turn_completed_with_status(
event_id: String,
status: TurnStatus,
outgoing: &OutgoingMessageSender,
) {
let notification = TurnCompletedNotification {
turn: Turn {
id: event_id,
items: vec![],
status,
},
};
outgoing
.send_server_notification(ServerNotification::TurnCompleted(notification))
.await;
}
async fn complete_file_change_item(
conversation_id: ConversationId,
item_id: String,
changes: Vec<FileUpdateChange>,
status: PatchApplyStatus,
outgoing: &OutgoingMessageSender,
turn_summary_store: &TurnSummaryStore,
) {
{
let mut map = turn_summary_store.lock().await;
if let Some(summary) = map.get_mut(&conversation_id) {
summary.file_change_started.remove(&item_id);
}
}
let item = ThreadItem::FileChange {
id: item_id,
changes,
status,
};
let notification = ItemCompletedNotification { item };
outgoing
.send_server_notification(ServerNotification::ItemCompleted(notification))
.await;
}
async fn complete_command_execution_item(
item_id: String,
command: String,
cwd: PathBuf,
command_actions: Vec<V2ParsedCommand>,
status: CommandExecutionStatus,
outgoing: &OutgoingMessageSender,
) {
let item = ThreadItem::CommandExecution {
id: item_id,
command,
cwd,
status,
command_actions,
aggregated_output: None,
exit_code: None,
duration_ms: None,
};
let notification = ItemCompletedNotification { item };
outgoing
.send_server_notification(ServerNotification::ItemCompleted(notification))
.await;
}
async fn find_and_remove_turn_summary(
conversation_id: ConversationId,
turn_summary_store: &TurnSummaryStore,
) -> TurnSummary {
let mut map = turn_summary_store.lock().await;
map.remove(&conversation_id).unwrap_or_default()
}
async fn handle_turn_complete(
conversation_id: ConversationId,
event_id: String,
outgoing: &OutgoingMessageSender,
turn_summary_store: &TurnSummaryStore,
) {
let turn_summary = find_and_remove_turn_summary(conversation_id, turn_summary_store).await;
let status = if let Some(error) = turn_summary.last_error {
TurnStatus::Failed { error }
} else {
TurnStatus::Completed
};
emit_turn_completed_with_status(event_id, status, outgoing).await;
}
async fn handle_turn_interrupted(
conversation_id: ConversationId,
event_id: String,
outgoing: &OutgoingMessageSender,
turn_summary_store: &TurnSummaryStore,
) {
find_and_remove_turn_summary(conversation_id, turn_summary_store).await;
emit_turn_completed_with_status(event_id, TurnStatus::Interrupted, outgoing).await;
}
async fn handle_error(
conversation_id: ConversationId,
error: TurnError,
turn_summary_store: &TurnSummaryStore,
) {
let mut map = turn_summary_store.lock().await;
map.entry(conversation_id).or_default().last_error = Some(error);
}
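The per-conversation bookkeeping above (`handle_error` records the last error, `find_and_remove_turn_summary` takes the summary out when the turn ends) can be sketched with the standard library alone. This is a minimal sketch, not the code from the diff: `ConversationId` is replaced by a plain `u64`, `TurnError` by a `String`, and a blocking `std::sync::Mutex` stands in for the async mutex the diff awaits; `record_error` and `take_summary` are hypothetical names.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Hypothetical stand-in for the real ConversationId type.
type ConversationId = u64;

// Simplified summary: only the last error, stored as a String (assumption).
#[derive(Default, Debug, PartialEq)]
struct TurnSummary {
    last_error: Option<String>,
}

type TurnSummaryStore = Arc<Mutex<HashMap<ConversationId, TurnSummary>>>;

// Record the most recent error, creating the entry on first use.
fn record_error(store: &TurnSummaryStore, id: ConversationId, message: &str) {
    let mut map = store.lock().unwrap();
    map.entry(id).or_default().last_error = Some(message.to_string());
}

// Remove and return the summary; a missing entry yields the default.
fn take_summary(store: &TurnSummaryStore, id: ConversationId) -> TurnSummary {
    let mut map = store.lock().unwrap();
    map.remove(&id).unwrap_or_default()
}

fn main() {
    let store: TurnSummaryStore = Arc::new(Mutex::new(HashMap::new()));
    record_error(&store, 1, "boom");
    // The first take sees the error; the second sees a fresh default,
    // which is why a later turn on the same conversation reports no error.
    assert_eq!(take_summary(&store, 1).last_error, Some("boom".to_string()));
    assert_eq!(take_summary(&store, 1).last_error, None);
}
```

Removing the entry when the turn ends, rather than clearing a field, appears to be what keeps an error from one turn from leaking into the next.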
async fn on_patch_approval_response(
event_id: String,
receiver: oneshot::Receiver<JsonValue>,
@@ -382,42 +689,194 @@ async fn on_exec_approval_response(
}
}
async fn on_command_execution_request_approval_response(
const REVIEW_FALLBACK_MESSAGE: &str = "Reviewer failed to output a response.";
fn render_review_output_text(output: &ReviewOutputEvent) -> String {
let mut sections = Vec::new();
let explanation = output.overall_explanation.trim();
if !explanation.is_empty() {
sections.push(explanation.to_string());
}
if !output.findings.is_empty() {
let findings = format_review_findings_block(&output.findings, None);
let trimmed = findings.trim();
if !trimmed.is_empty() {
sections.push(trimmed.to_string());
}
}
if sections.is_empty() {
REVIEW_FALLBACK_MESSAGE.to_string()
} else {
sections.join("\n\n")
}
}
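The section-joining logic of `render_review_output_text` can be illustrated in isolation. A minimal sketch, assuming the findings have already been formatted into a single string (the real code calls `format_review_findings_block`, which is not reproduced here); `render_sections` is a hypothetical name.

```rust
const FALLBACK: &str = "Reviewer failed to output a response.";

// Join the non-empty, trimmed sections with a blank line between them,
// falling back to a fixed message when nothing remains.
fn render_sections(explanation: &str, findings_block: &str) -> String {
    let parts = [explanation.trim(), findings_block.trim()];
    let sections: Vec<&str> = parts
        .iter()
        .copied()
        .filter(|section| !section.is_empty())
        .collect();
    if sections.is_empty() {
        FALLBACK.to_string()
    } else {
        sections.join("\n\n")
    }
}

fn main() {
    println!("{}", render_sections("  Looks fine overall. ", "- one finding"));
    println!("{}", render_sections("   ", ""));
}
```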
fn convert_patch_changes(changes: &HashMap<PathBuf, CoreFileChange>) -> Vec<FileUpdateChange> {
let mut converted: Vec<FileUpdateChange> = changes
.iter()
.map(|(path, change)| FileUpdateChange {
path: path.to_string_lossy().into_owned(),
kind: map_patch_change_kind(change),
diff: format_file_change_diff(change),
})
.collect();
converted.sort_by(|a, b| a.path.cmp(&b.path));
converted
}
fn map_patch_change_kind(change: &CoreFileChange) -> V2PatchChangeKind {
match change {
CoreFileChange::Add { .. } => V2PatchChangeKind::Add,
CoreFileChange::Delete { .. } => V2PatchChangeKind::Delete,
CoreFileChange::Update { move_path, .. } => V2PatchChangeKind::Update {
move_path: move_path.clone(),
},
}
}
fn format_file_change_diff(change: &CoreFileChange) -> String {
match change {
CoreFileChange::Add { content } => content.clone(),
CoreFileChange::Delete { content } => content.clone(),
CoreFileChange::Update {
unified_diff,
move_path,
} => {
if let Some(path) = move_path {
format!("{unified_diff}\n\nMoved to: {}", path.display())
} else {
unified_diff.clone()
}
}
}
}
#[allow(clippy::too_many_arguments)]
async fn on_file_change_request_approval_response(
event_id: String,
conversation_id: ConversationId,
item_id: String,
changes: Vec<FileUpdateChange>,
receiver: oneshot::Receiver<JsonValue>,
conversation: Arc<CodexConversation>,
codex: Arc<CodexConversation>,
outgoing: Arc<OutgoingMessageSender>,
turn_summary_store: TurnSummaryStore,
) {
let response = receiver.await;
let value = match response {
Ok(value) => value,
let (decision, completion_status) = match response {
Ok(value) => {
let response = serde_json::from_value::<FileChangeRequestApprovalResponse>(value)
.unwrap_or_else(|err| {
error!("failed to deserialize FileChangeRequestApprovalResponse: {err}");
FileChangeRequestApprovalResponse {
decision: ApprovalDecision::Decline,
}
});
let (decision, completion_status) = match response.decision {
ApprovalDecision::Accept => (ReviewDecision::Approved, None),
ApprovalDecision::Decline => {
(ReviewDecision::Denied, Some(PatchApplyStatus::Declined))
}
ApprovalDecision::Cancel => {
(ReviewDecision::Abort, Some(PatchApplyStatus::Declined))
}
};
// Allow EventMsg::PatchApplyEnd to emit ItemCompleted for accepted patches.
// Only short-circuit on declines/cancels/failures.
(decision, completion_status)
}
Err(err) => {
error!("request failed: {err:?}");
return;
(ReviewDecision::Denied, Some(PatchApplyStatus::Failed))
}
};
let response = serde_json::from_value::<CommandExecutionRequestApprovalResponse>(value)
.unwrap_or_else(|err| {
error!("failed to deserialize CommandExecutionRequestApprovalResponse: {err}");
CommandExecutionRequestApprovalResponse {
decision: ApprovalDecision::Decline,
accept_settings: None,
}
});
if let Some(status) = completion_status {
complete_file_change_item(
conversation_id,
item_id,
changes,
status,
outgoing.as_ref(),
&turn_summary_store,
)
.await;
}
let CommandExecutionRequestApprovalResponse {
decision,
accept_settings,
} = response;
if let Err(err) = codex
.submit(Op::PatchApproval {
id: event_id,
decision,
})
.await
{
error!("failed to submit PatchApproval: {err}");
}
}
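The mapping from the client's approval response to a core decision plus an optional early completion status can be sketched on its own. A minimal sketch under stated assumptions: the enums below are hypothetical stand-ins that mirror the variant names in the diff, and `None` as input stands in for the receiver error branch.

```rust
// Hypothetical stand-ins mirroring the variant names used in the diff.
#[derive(Debug, PartialEq, Clone, Copy)]
enum ApprovalDecision {
    Accept,
    Decline,
    Cancel,
}

#[derive(Debug, PartialEq)]
enum ReviewDecision {
    Approved,
    Denied,
    Abort,
}

#[derive(Debug, PartialEq)]
enum PatchApplyStatus {
    Declined,
    Failed,
}

// `None` models a failed request; a `Some(status)` result means the item
// is completed immediately instead of waiting for a later end event.
fn map_file_change_response(
    response: Option<ApprovalDecision>,
) -> (ReviewDecision, Option<PatchApplyStatus>) {
    match response {
        Some(ApprovalDecision::Accept) => (ReviewDecision::Approved, None),
        Some(ApprovalDecision::Decline) => {
            (ReviewDecision::Denied, Some(PatchApplyStatus::Declined))
        }
        Some(ApprovalDecision::Cancel) => {
            (ReviewDecision::Abort, Some(PatchApplyStatus::Declined))
        }
        None => (ReviewDecision::Denied, Some(PatchApplyStatus::Failed)),
    }
}

fn main() {
    let (decision, status) = map_file_change_response(Some(ApprovalDecision::Decline));
    println!("{decision:?} {status:?}");
}
```

Returning the status as an `Option` keeps the accepted path silent, so only the later apply-end event reports completion for an approved patch.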
let decision = match (decision, accept_settings) {
(ApprovalDecision::Accept, Some(settings)) if settings.for_session => {
ReviewDecision::ApprovedForSession
#[allow(clippy::too_many_arguments)]
async fn on_command_execution_request_approval_response(
event_id: String,
item_id: String,
command: String,
cwd: PathBuf,
command_actions: Vec<V2ParsedCommand>,
receiver: oneshot::Receiver<JsonValue>,
conversation: Arc<CodexConversation>,
outgoing: Arc<OutgoingMessageSender>,
) {
let response = receiver.await;
let (decision, completion_status) = match response {
Ok(value) => {
let response = serde_json::from_value::<CommandExecutionRequestApprovalResponse>(value)
.unwrap_or_else(|err| {
error!("failed to deserialize CommandExecutionRequestApprovalResponse: {err}");
CommandExecutionRequestApprovalResponse {
decision: ApprovalDecision::Decline,
accept_settings: None,
}
});
let CommandExecutionRequestApprovalResponse {
decision,
accept_settings,
} = response;
let (decision, completion_status) = match (decision, accept_settings) {
(ApprovalDecision::Accept, Some(settings)) if settings.for_session => {
(ReviewDecision::ApprovedForSession, None)
}
(ApprovalDecision::Accept, _) => (ReviewDecision::Approved, None),
(ApprovalDecision::Decline, _) => (
ReviewDecision::Denied,
Some(CommandExecutionStatus::Declined),
),
(ApprovalDecision::Cancel, _) => (
ReviewDecision::Abort,
Some(CommandExecutionStatus::Declined),
),
};
(decision, completion_status)
}
Err(err) => {
error!("request failed: {err:?}");
(ReviewDecision::Denied, Some(CommandExecutionStatus::Failed))
}
(ApprovalDecision::Accept, _) => ReviewDecision::Approved,
(ApprovalDecision::Decline, _) => ReviewDecision::Denied,
(ApprovalDecision::Cancel, _) => ReviewDecision::Abort,
};
if let Some(status) = completion_status {
complete_command_execution_item(
item_id.clone(),
command.clone(),
cwd.clone(),
command_actions.clone(),
status,
outgoing.as_ref(),
)
.await;
}
if let Err(err) = conversation
.submit(Op::ExecApproval {
id: event_id,
@@ -486,13 +945,171 @@ async fn construct_mcp_tool_call_end_notification(
#[cfg(test)]
mod tests {
use super::*;
use crate::CHANNEL_CAPACITY;
use crate::outgoing_message::OutgoingMessage;
use crate::outgoing_message::OutgoingMessageSender;
use anyhow::Result;
use anyhow::anyhow;
use anyhow::bail;
use codex_core::protocol::McpInvocation;
use mcp_types::CallToolResult;
use mcp_types::ContentBlock;
use mcp_types::TextContent;
use pretty_assertions::assert_eq;
use serde_json::Value as JsonValue;
use std::collections::HashMap;
use std::time::Duration;
use tokio::sync::Mutex;
use tokio::sync::mpsc;
fn new_turn_summary_store() -> TurnSummaryStore {
Arc::new(Mutex::new(HashMap::new()))
}
#[tokio::test]
async fn test_handle_error_records_message() -> Result<()> {
let conversation_id = ConversationId::new();
let turn_summary_store = new_turn_summary_store();
handle_error(
conversation_id,
TurnError {
message: "boom".to_string(),
codex_error_info: Some(V2CodexErrorInfo::InternalServerError),
},
&turn_summary_store,
)
.await;
let turn_summary = find_and_remove_turn_summary(conversation_id, &turn_summary_store).await;
assert_eq!(
turn_summary.last_error,
Some(TurnError {
message: "boom".to_string(),
codex_error_info: Some(V2CodexErrorInfo::InternalServerError),
})
);
Ok(())
}
#[tokio::test]
async fn test_handle_turn_complete_emits_completed_without_error() -> Result<()> {
let conversation_id = ConversationId::new();
let event_id = "complete1".to_string();
let (tx, mut rx) = mpsc::channel(CHANNEL_CAPACITY);
let outgoing = Arc::new(OutgoingMessageSender::new(tx));
let turn_summary_store = new_turn_summary_store();
handle_turn_complete(
conversation_id,
event_id.clone(),
&outgoing,
&turn_summary_store,
)
.await;
let msg = rx
.recv()
.await
.ok_or_else(|| anyhow!("should send one notification"))?;
match msg {
OutgoingMessage::AppServerNotification(ServerNotification::TurnCompleted(n)) => {
assert_eq!(n.turn.id, event_id);
assert_eq!(n.turn.status, TurnStatus::Completed);
}
other => bail!("unexpected message: {other:?}"),
}
assert!(rx.try_recv().is_err(), "no extra messages expected");
Ok(())
}
#[tokio::test]
async fn test_handle_turn_interrupted_emits_interrupted_with_error() -> Result<()> {
let conversation_id = ConversationId::new();
let event_id = "interrupt1".to_string();
let turn_summary_store = new_turn_summary_store();
handle_error(
conversation_id,
TurnError {
message: "oops".to_string(),
codex_error_info: None,
},
&turn_summary_store,
)
.await;
let (tx, mut rx) = mpsc::channel(CHANNEL_CAPACITY);
let outgoing = Arc::new(OutgoingMessageSender::new(tx));
handle_turn_interrupted(
conversation_id,
event_id.clone(),
&outgoing,
&turn_summary_store,
)
.await;
let msg = rx
.recv()
.await
.ok_or_else(|| anyhow!("should send one notification"))?;
match msg {
OutgoingMessage::AppServerNotification(ServerNotification::TurnCompleted(n)) => {
assert_eq!(n.turn.id, event_id);
assert_eq!(n.turn.status, TurnStatus::Interrupted);
}
other => bail!("unexpected message: {other:?}"),
}
assert!(rx.try_recv().is_err(), "no extra messages expected");
Ok(())
}
#[tokio::test]
async fn test_handle_turn_complete_emits_failed_with_error() -> Result<()> {
let conversation_id = ConversationId::new();
let event_id = "complete_err1".to_string();
let turn_summary_store = new_turn_summary_store();
handle_error(
conversation_id,
TurnError {
message: "bad".to_string(),
codex_error_info: Some(V2CodexErrorInfo::Other),
},
&turn_summary_store,
)
.await;
let (tx, mut rx) = mpsc::channel(CHANNEL_CAPACITY);
let outgoing = Arc::new(OutgoingMessageSender::new(tx));
handle_turn_complete(
conversation_id,
event_id.clone(),
&outgoing,
&turn_summary_store,
)
.await;
let msg = rx
.recv()
.await
.ok_or_else(|| anyhow!("should send one notification"))?;
match msg {
OutgoingMessage::AppServerNotification(ServerNotification::TurnCompleted(n)) => {
assert_eq!(n.turn.id, event_id);
assert_eq!(
n.turn.status,
TurnStatus::Failed {
error: TurnError {
message: "bad".to_string(),
codex_error_info: Some(V2CodexErrorInfo::Other),
}
}
);
}
other => bail!("unexpected message: {other:?}"),
}
assert!(rx.try_recv().is_err(), "no extra messages expected");
Ok(())
}
#[tokio::test]
async fn test_construct_mcp_tool_call_begin_notification_with_args() {
@@ -522,6 +1139,123 @@ mod tests {
assert_eq!(notification, expected);
}
#[tokio::test]
async fn test_handle_turn_complete_emits_error_multiple_turns() -> Result<()> {
// Conversation A will have two turns; Conversation B will have one turn.
let conversation_a = ConversationId::new();
let conversation_b = ConversationId::new();
let turn_summary_store = new_turn_summary_store();
let (tx, mut rx) = mpsc::channel(CHANNEL_CAPACITY);
let outgoing = Arc::new(OutgoingMessageSender::new(tx));
// Turn 1 on conversation A
let a_turn1 = "a_turn1".to_string();
handle_error(
conversation_a,
TurnError {
message: "a1".to_string(),
codex_error_info: Some(V2CodexErrorInfo::BadRequest),
},
&turn_summary_store,
)
.await;
handle_turn_complete(
conversation_a,
a_turn1.clone(),
&outgoing,
&turn_summary_store,
)
.await;
// Turn 1 on conversation B
let b_turn1 = "b_turn1".to_string();
handle_error(
conversation_b,
TurnError {
message: "b1".to_string(),
codex_error_info: None,
},
&turn_summary_store,
)
.await;
handle_turn_complete(
conversation_b,
b_turn1.clone(),
&outgoing,
&turn_summary_store,
)
.await;
// Turn 2 on conversation A
let a_turn2 = "a_turn2".to_string();
handle_turn_complete(
conversation_a,
a_turn2.clone(),
&outgoing,
&turn_summary_store,
)
.await;
// Verify: A turn 1
let msg = rx
.recv()
.await
.ok_or_else(|| anyhow!("should send first notification"))?;
match msg {
OutgoingMessage::AppServerNotification(ServerNotification::TurnCompleted(n)) => {
assert_eq!(n.turn.id, a_turn1);
assert_eq!(
n.turn.status,
TurnStatus::Failed {
error: TurnError {
message: "a1".to_string(),
codex_error_info: Some(V2CodexErrorInfo::BadRequest),
}
}
);
}
other => bail!("unexpected message: {other:?}"),
}
// Verify: B turn 1
let msg = rx
.recv()
.await
.ok_or_else(|| anyhow!("should send second notification"))?;
match msg {
OutgoingMessage::AppServerNotification(ServerNotification::TurnCompleted(n)) => {
assert_eq!(n.turn.id, b_turn1);
assert_eq!(
n.turn.status,
TurnStatus::Failed {
error: TurnError {
message: "b1".to_string(),
codex_error_info: None,
}
}
);
}
other => bail!("unexpected message: {other:?}"),
}
// Verify: A turn 2
let msg = rx
.recv()
.await
.ok_or_else(|| anyhow!("should send third notification"))?;
match msg {
OutgoingMessage::AppServerNotification(ServerNotification::TurnCompleted(n)) => {
assert_eq!(n.turn.id, a_turn2);
assert_eq!(n.turn.status, TurnStatus::Completed);
}
other => bail!("unexpected message: {other:?}"),
}
assert!(rx.try_recv().is_err(), "no extra messages expected");
Ok(())
}
#[tokio::test]
async fn test_construct_mcp_tool_call_begin_notification_without_args() {
let begin_event = McpToolCallBeginEvent {


@@ -60,6 +60,8 @@ use codex_app_server_protocol::RemoveConversationSubscriptionResponse;
use codex_app_server_protocol::RequestId;
use codex_app_server_protocol::ResumeConversationParams;
use codex_app_server_protocol::ResumeConversationResponse;
use codex_app_server_protocol::ReviewStartParams;
use codex_app_server_protocol::ReviewTarget;
use codex_app_server_protocol::SandboxMode;
use codex_app_server_protocol::SendUserMessageParams;
use codex_app_server_protocol::SendUserMessageResponse;
@@ -81,6 +83,7 @@ use codex_app_server_protocol::ThreadStartParams;
use codex_app_server_protocol::ThreadStartResponse;
use codex_app_server_protocol::ThreadStartedNotification;
use codex_app_server_protocol::Turn;
use codex_app_server_protocol::TurnError;
use codex_app_server_protocol::TurnInterruptParams;
use codex_app_server_protocol::TurnStartParams;
use codex_app_server_protocol::TurnStartResponse;
@@ -89,6 +92,7 @@ use codex_app_server_protocol::TurnStatus;
use codex_app_server_protocol::UserInfoResponse;
use codex_app_server_protocol::UserInput as V2UserInput;
use codex_app_server_protocol::UserSavedConfig;
use codex_app_server_protocol::build_turns_from_event_msgs;
use codex_backend_client::Client as BackendClient;
use codex_core::AuthManager;
use codex_core::CodexConversation;
@@ -109,12 +113,15 @@ use codex_core::config_loader::load_config_as_toml;
use codex_core::default_client::get_codex_user_agent;
use codex_core::exec::ExecParams;
use codex_core::exec_env::create_env;
use codex_core::features::Feature;
use codex_core::find_conversation_path_by_id_str;
use codex_core::get_platform_sandbox;
use codex_core::git_info::git_diff_to_remote;
use codex_core::parse_cursor;
use codex_core::protocol::EventMsg;
use codex_core::protocol::Op;
use codex_core::protocol::ReviewRequest;
use codex_core::protocol::SessionConfiguredEvent;
use codex_core::read_head_for_summary;
use codex_feedback::CodexFeedback;
use codex_login::ServerOptions as LoginServerOptions;
@@ -132,6 +139,7 @@ use codex_protocol::protocol::USER_MESSAGE_BEGIN;
use codex_protocol::user_input::UserInput as CoreInputItem;
use codex_utils_json_to_toml::json_to_toml;
use std::collections::HashMap;
use std::collections::HashSet;
use std::ffi::OsStr;
use std::io::Error as IoError;
use std::path::Path;
@@ -151,6 +159,15 @@ use uuid::Uuid;
type PendingInterruptQueue = Vec<(RequestId, ApiVersion)>;
pub(crate) type PendingInterrupts = Arc<Mutex<HashMap<ConversationId, PendingInterruptQueue>>>;
/// Per-conversation accumulation of the latest states e.g. error message while a turn runs.
#[derive(Default, Clone)]
pub(crate) struct TurnSummary {
pub(crate) file_change_started: HashSet<String>,
pub(crate) last_error: Option<TurnError>,
}
pub(crate) type TurnSummaryStore = Arc<Mutex<HashMap<ConversationId, TurnSummary>>>;
// Duration before a ChatGPT login attempt is abandoned.
const LOGIN_CHATGPT_TIMEOUT: Duration = Duration::from_secs(10 * 60);
struct ActiveLogin {
@@ -158,8 +175,8 @@ struct ActiveLogin {
login_id: Uuid,
}
impl ActiveLogin {
fn drop(&self) {
impl Drop for ActiveLogin {
fn drop(&mut self) {
self.shutdown_handle.shutdown();
}
}
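The switch from an inherent `drop(&self)` method to `impl Drop` means cleanup runs whenever the value is dropped, including implicitly at end of scope, instead of only when a caller remembers to invoke the method. A minimal sketch of that behaviour, assuming a hypothetical `Guard` type whose "shutdown" merely flips a shared flag:

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical guard; the shared flag stands in for a real shutdown handle.
struct Guard {
    shutdown_called: Rc<Cell<bool>>,
}

impl Drop for Guard {
    fn drop(&mut self) {
        // Runs on explicit `drop(guard)` and on implicit scope exit alike.
        self.shutdown_called.set(true);
    }
}

// Replace the guard in a slot, shutting down any previous one first.
fn replace_guard(slot: &mut Option<Guard>, next: Guard) {
    if let Some(existing) = slot.take() {
        drop(existing);
    }
    *slot = Some(next);
}

fn main() {
    let first_flag = Rc::new(Cell::new(false));
    let second_flag = Rc::new(Cell::new(false));
    let mut slot = Some(Guard { shutdown_called: first_flag.clone() });
    replace_guard(&mut slot, Guard { shutdown_called: second_flag.clone() });
    assert!(first_flag.get());
    assert!(!second_flag.get());
}
```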
@@ -175,6 +192,7 @@ pub(crate) struct CodexMessageProcessor {
active_login: Arc<Mutex<Option<ActiveLogin>>>,
// Queue of pending interrupt requests per conversation. We reply when TurnAborted arrives.
pending_interrupts: PendingInterrupts,
turn_summary_store: TurnSummaryStore,
pending_fuzzy_searches: Arc<Mutex<HashMap<String, Arc<AtomicBool>>>>,
feedback: CodexFeedback,
}
@@ -227,11 +245,97 @@ impl CodexMessageProcessor {
conversation_listeners: HashMap::new(),
active_login: Arc::new(Mutex::new(None)),
pending_interrupts: Arc::new(Mutex::new(HashMap::new())),
turn_summary_store: Arc::new(Mutex::new(HashMap::new())),
pending_fuzzy_searches: Arc::new(Mutex::new(HashMap::new())),
feedback,
}
}
fn review_request_from_target(
target: ReviewTarget,
append_to_original_thread: bool,
) -> Result<(ReviewRequest, String), JSONRPCErrorError> {
fn invalid_request(message: String) -> JSONRPCErrorError {
JSONRPCErrorError {
code: INVALID_REQUEST_ERROR_CODE,
message,
data: None,
}
}
match target {
// TODO(jif) those messages will be extracted in a follow-up PR.
ReviewTarget::UncommittedChanges => Ok((
ReviewRequest {
prompt: "Review the current code changes (staged, unstaged, and untracked files) and provide prioritized findings.".to_string(),
user_facing_hint: "current changes".to_string(),
append_to_original_thread,
},
"Review uncommitted changes".to_string(),
)),
ReviewTarget::BaseBranch { branch } => {
let branch = branch.trim().to_string();
if branch.is_empty() {
return Err(invalid_request("branch must not be empty".to_string()));
}
let prompt = format!("Review the code changes against the base branch '{branch}'. Start by finding the merge diff between the current branch and {branch}'s upstream e.g. (`git merge-base HEAD \"$(git rev-parse --abbrev-ref \"{branch}@{{upstream}}\")\"`), then run `git diff` against that SHA to see what changes we would merge into the {branch} branch. Provide prioritized, actionable findings.");
let hint = format!("changes against '{branch}'");
let display = format!("Review changes against base branch '{branch}'");
Ok((
ReviewRequest {
prompt,
user_facing_hint: hint,
append_to_original_thread,
},
display,
))
}
ReviewTarget::Commit { sha, title } => {
let sha = sha.trim().to_string();
if sha.is_empty() {
return Err(invalid_request("sha must not be empty".to_string()));
}
let brief_title = title
.map(|t| t.trim().to_string())
.filter(|t| !t.is_empty());
let prompt = if let Some(title) = brief_title.clone() {
format!("Review the code changes introduced by commit {sha} (\"{title}\"). Provide prioritized, actionable findings.")
} else {
format!("Review the code changes introduced by commit {sha}. Provide prioritized, actionable findings.")
};
let short_sha = sha.chars().take(7).collect::<String>();
let hint = format!("commit {short_sha}");
let display = if let Some(title) = brief_title {
format!("Review commit {short_sha}: {title}")
} else {
format!("Review commit {short_sha}")
};
Ok((
ReviewRequest {
prompt,
user_facing_hint: hint,
append_to_original_thread,
},
display,
))
}
ReviewTarget::Custom { instructions } => {
let trimmed = instructions.trim().to_string();
if trimmed.is_empty() {
return Err(invalid_request("instructions must not be empty".to_string()));
}
Ok((
ReviewRequest {
prompt: trimmed.clone(),
user_facing_hint: trimmed.clone(),
append_to_original_thread,
},
trimmed,
))
}
}
}
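The `ReviewTarget::Commit` branch trims its inputs, rejects an empty SHA, and shortens the SHA to seven characters for display. A minimal sketch of just that display-text logic, assuming a plain `String` error in place of `JSONRPCErrorError`; `commit_display` is a hypothetical helper name.

```rust
// Build the user-facing display text for a commit review.
// Errors are plain strings here (assumption), not JSON-RPC errors.
fn commit_display(sha: &str, title: Option<&str>) -> Result<String, String> {
    let sha = sha.trim();
    if sha.is_empty() {
        return Err("sha must not be empty".to_string());
    }
    // Treat a whitespace-only title the same as a missing one.
    let title = title.map(str::trim).filter(|t| !t.is_empty());
    // `chars().take(7)` never panics, even when the SHA is shorter than 7.
    let short_sha: String = sha.chars().take(7).collect();
    Ok(match title {
        Some(title) => format!("Review commit {short_sha}: {title}"),
        None => format!("Review commit {short_sha}"),
    })
}

fn main() {
    match commit_display(" 0e051644a9abcdef ", Some(" reset patch number ")) {
        Ok(text) => println!("{text}"),
        Err(err) => eprintln!("{err}"),
    }
}
```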
pub async fn process_request(&mut self, request: ClientRequest) {
match request {
ClientRequest::Initialize { .. } => {
@@ -263,6 +367,9 @@ impl CodexMessageProcessor {
ClientRequest::TurnInterrupt { request_id, params } => {
self.turn_interrupt(request_id, params).await;
}
ClientRequest::ReviewStart { request_id, params } => {
self.review_start(request_id, params).await;
}
ClientRequest::NewConversation { request_id, params } => {
// Do not tokio::spawn() to process new_conversation()
// asynchronously because we need to ensure the conversation is
@@ -417,7 +524,7 @@ impl CodexMessageProcessor {
{
let mut guard = self.active_login.lock().await;
if let Some(active) = guard.take() {
active.drop();
drop(active);
}
}
@@ -525,7 +632,7 @@ impl CodexMessageProcessor {
{
let mut guard = self.active_login.lock().await;
if let Some(existing) = guard.take() {
existing.drop();
drop(existing);
}
*guard = Some(ActiveLogin {
shutdown_handle: shutdown_handle.clone(),
@@ -615,7 +722,7 @@ impl CodexMessageProcessor {
{
let mut guard = self.active_login.lock().await;
if let Some(existing) = guard.take() {
existing.drop();
drop(existing);
}
*guard = Some(ActiveLogin {
shutdown_handle: shutdown_handle.clone(),
@@ -704,7 +811,7 @@ impl CodexMessageProcessor {
let mut guard = self.active_login.lock().await;
if guard.as_ref().map(|l| l.login_id) == Some(login_id) {
if let Some(active) = guard.take() {
active.drop();
drop(active);
}
Ok(())
} else {
@@ -758,7 +865,7 @@ impl CodexMessageProcessor {
{
let mut guard = self.active_login.lock().await;
if let Some(active) = guard.take() {
active.drop();
drop(active);
}
}
@@ -1063,7 +1170,7 @@ impl CodexMessageProcessor {
let exec_params = ExecParams {
command: params.command,
cwd,
timeout_ms,
expiration: timeout_ms.into(),
env,
with_escalated_permissions: None,
justification: None,
@@ -1135,7 +1242,7 @@ impl CodexMessageProcessor {
let overrides = ConfigOverrides {
model,
config_profile: profile,
cwd: cwd.map(PathBuf::from),
cwd: cwd.clone().map(PathBuf::from),
approval_policy,
sandbox_mode,
model_provider,
@@ -1147,7 +1254,17 @@ impl CodexMessageProcessor {
..Default::default()
};
let config = match derive_config_from_params(overrides, cli_overrides).await {
// Persist windows sandbox feature.
// TODO: persist default config in general.
let mut cli_overrides = cli_overrides.unwrap_or_default();
if cfg!(windows) && self.config.features.enabled(Feature::WindowsSandbox) {
cli_overrides.insert(
"features.enable_experimental_windows_sandbox".to_string(),
serde_json::json!(true),
);
}
let config = match derive_config_from_params(overrides, Some(cli_overrides)).await {
Ok(config) => config,
Err(err) => {
let error = JSONRPCErrorError {
@@ -1212,8 +1329,12 @@ impl CodexMessageProcessor {
match self.conversation_manager.new_conversation(config).await {
Ok(new_conv) => {
let conversation_id = new_conv.conversation_id;
let rollout_path = new_conv.session_configured.rollout_path.clone();
let NewConversation {
conversation_id,
session_configured,
..
} = new_conv;
let rollout_path = session_configured.rollout_path.clone();
let fallback_provider = self.config.model_provider_id.as_str();
// A bit hacky, but the summary contains a lot of useful information for the thread
@@ -1238,8 +1359,22 @@ impl CodexMessageProcessor {
}
};
let SessionConfiguredEvent {
model,
model_provider_id,
cwd,
approval_policy,
sandbox_policy,
..
} = session_configured;
let response = ThreadStartResponse {
thread: thread.clone(),
model,
model_provider: model_provider_id,
cwd,
approval_policy: approval_policy.into(),
sandbox: sandbox_policy.into(),
reasoning_effort: session_configured.reasoning_effort,
};
// Auto-attach a conversation listener when starting a thread.
@@ -1521,6 +1656,11 @@ impl CodexMessageProcessor {
session_configured,
..
}) => {
let SessionConfiguredEvent {
rollout_path,
initial_messages,
..
} = session_configured;
// Auto-attach a conversation listener when resuming a thread.
if let Err(err) = self
.attach_conversation_listener(conversation_id, false, ApiVersion::V2)
@@ -1533,8 +1673,8 @@ impl CodexMessageProcessor {
);
}
let thread = match read_summary_from_rollout(
session_configured.rollout_path.as_path(),
let mut thread = match read_summary_from_rollout(
rollout_path.as_path(),
fallback_model_provider.as_str(),
)
.await
@@ -1545,14 +1685,27 @@ impl CodexMessageProcessor {
request_id,
format!(
"failed to load rollout `{}` for conversation {conversation_id}: {err}",
session_configured.rollout_path.display()
rollout_path.display()
),
)
.await;
return;
}
};
let response = ThreadResumeResponse { thread };
thread.turns = initial_messages
.as_deref()
.map_or_else(Vec::new, build_turns_from_event_msgs);
let response = ThreadResumeResponse {
thread,
model: session_configured.model,
model_provider: session_configured.model_provider_id,
cwd: session_configured.cwd,
approval_policy: session_configured.approval_policy.into(),
sandbox: session_configured.sandbox_policy.into(),
reasoning_effort: session_configured.reasoning_effort,
};
self.outgoing.send_response(request_id, response).await;
}
Err(err) => {
@@ -1803,6 +1956,15 @@ impl CodexMessageProcessor {
include_apply_patch_tool,
} = overrides;
// Persist windows sandbox feature.
let mut cli_overrides = cli_overrides.unwrap_or_default();
if cfg!(windows) && self.config.features.enabled(Feature::WindowsSandbox) {
cli_overrides.insert(
"features.enable_experimental_windows_sandbox".to_string(),
serde_json::json!(true),
);
}
let overrides = ConfigOverrides {
model,
config_profile: profile,
@@ -1818,7 +1980,7 @@ impl CodexMessageProcessor {
..Default::default()
};
derive_config_from_params(overrides, cli_overrides).await
derive_config_from_params(overrides, Some(cli_overrides)).await
}
None => Ok(self.config.as_ref().clone()),
};
@@ -2272,9 +2434,6 @@ impl CodexMessageProcessor {
}
};
// Keep a copy of v2 inputs for the notification payload.
let v2_inputs_for_notif = params.input.clone();
// Map v2 input items to core input items.
let mapped_items: Vec<CoreInputItem> = params
.input
@@ -2314,12 +2473,8 @@ impl CodexMessageProcessor {
Ok(turn_id) => {
let turn = Turn {
id: turn_id.clone(),
items: vec![ThreadItem::UserMessage {
id: turn_id,
content: v2_inputs_for_notif,
}],
items: vec![],
status: TurnStatus::InProgress,
error: None,
};
let response = TurnStartResponse { turn: turn.clone() };
@@ -2342,6 +2497,64 @@ impl CodexMessageProcessor {
}
}
async fn review_start(&self, request_id: RequestId, params: ReviewStartParams) {
let ReviewStartParams {
thread_id,
target,
append_to_original_thread,
} = params;
let (_, conversation) = match self.conversation_from_thread_id(&thread_id).await {
Ok(v) => v,
Err(error) => {
self.outgoing.send_error(request_id, error).await;
return;
}
};
let (review_request, display_text) =
match Self::review_request_from_target(target, append_to_original_thread) {
Ok(value) => value,
Err(err) => {
self.outgoing.send_error(request_id, err).await;
return;
}
};
let turn_id = conversation.submit(Op::Review { review_request }).await;
match turn_id {
Ok(turn_id) => {
let mut items = Vec::new();
if !display_text.is_empty() {
items.push(ThreadItem::UserMessage {
id: turn_id.clone(),
content: vec![V2UserInput::Text { text: display_text }],
});
}
let turn = Turn {
id: turn_id.clone(),
items,
status: TurnStatus::InProgress,
};
let response = TurnStartResponse { turn: turn.clone() };
self.outgoing.send_response(request_id, response).await;
let notif = TurnStartedNotification { turn };
self.outgoing
.send_server_notification(ServerNotification::TurnStarted(notif))
.await;
}
Err(err) => {
let error = JSONRPCErrorError {
code: INTERNAL_ERROR_CODE,
message: format!("failed to start review: {err}"),
data: None,
};
self.outgoing.send_error(request_id, error).await;
}
}
}
async fn turn_interrupt(&mut self, request_id: RequestId, params: TurnInterruptParams) {
let TurnInterruptParams { thread_id, .. } = params;
@@ -2441,6 +2654,7 @@ impl CodexMessageProcessor {
let outgoing_for_task = self.outgoing.clone();
let pending_interrupts = self.pending_interrupts.clone();
let turn_summary_store = self.turn_summary_store.clone();
let api_version_for_task = api_version;
tokio::spawn(async move {
loop {
@@ -2497,6 +2711,7 @@ impl CodexMessageProcessor {
conversation.clone(),
outgoing_for_task.clone(),
pending_interrupts.clone(),
turn_summary_store.clone(),
api_version_for_task,
)
.await;
@@ -2791,6 +3006,7 @@ fn summary_to_thread(summary: ConversationSummary) -> Thread {
model_provider,
created_at: created_at.map(|dt| dt.timestamp()).unwrap_or(0),
path,
turns: Vec::new(),
}
}


@@ -19,6 +19,10 @@ pub(crate) async fn run_fuzzy_file_search(
roots: Vec<String>,
cancellation_flag: Arc<AtomicBool>,
) -> Vec<FuzzyFileSearchResult> {
if roots.is_empty() {
return Vec::new();
}
#[expect(clippy::expect_used)]
let limit_per_root =
NonZero::new(LIMIT_PER_ROOT).expect("LIMIT_PER_ROOT should be a valid non-zero usize");


@@ -47,7 +47,7 @@ pub async fn run_main(
) -> IoResult<()> {
// Set up channels.
let (incoming_tx, mut incoming_rx) = mpsc::channel::<JSONRPCMessage>(CHANNEL_CAPACITY);
let (outgoing_tx, mut outgoing_rx) = mpsc::unbounded_channel::<OutgoingMessage>();
let (outgoing_tx, mut outgoing_rx) = mpsc::channel::<OutgoingMessage>(CHANNEL_CAPACITY);
// Task: read from stdin, push to `incoming_tx`.
let stdin_reader_handle = tokio::spawn({


@@ -6,7 +6,6 @@ use crate::outgoing_message::OutgoingMessageSender;
use codex_app_server_protocol::ClientInfo;
use codex_app_server_protocol::ClientRequest;
use codex_app_server_protocol::InitializeResponse;
use codex_app_server_protocol::JSONRPCError;
use codex_app_server_protocol::JSONRPCErrorError;
use codex_app_server_protocol::JSONRPCNotification;
@@ -118,6 +117,7 @@ impl MessageProcessor {
self.outgoing.send_response(request_id, response).await;
self.initialized = true;
return;
}
}


@@ -19,12 +19,12 @@ use crate::error_code::INTERNAL_ERROR_CODE;
/// Sends messages to the client and manages request callbacks.
pub(crate) struct OutgoingMessageSender {
next_request_id: AtomicI64,
sender: mpsc::UnboundedSender<OutgoingMessage>,
sender: mpsc::Sender<OutgoingMessage>,
request_id_to_callback: Mutex<HashMap<RequestId, oneshot::Sender<Result>>>,
}
impl OutgoingMessageSender {
pub(crate) fn new(sender: mpsc::UnboundedSender<OutgoingMessage>) -> Self {
pub(crate) fn new(sender: mpsc::Sender<OutgoingMessage>) -> Self {
Self {
next_request_id: AtomicI64::new(0),
sender,
@@ -45,8 +45,12 @@ impl OutgoingMessageSender {
}
let outgoing_message =
OutgoingMessage::Request(request.request_with_id(outgoing_message_id));
let _ = self.sender.send(outgoing_message);
OutgoingMessage::Request(request.request_with_id(outgoing_message_id.clone()));
if let Err(err) = self.sender.send(outgoing_message).await {
warn!("failed to send request {outgoing_message_id:?} to client: {err:?}");
let mut request_id_to_callback = self.request_id_to_callback.lock().await;
request_id_to_callback.remove(&outgoing_message_id);
}
rx_approve
}
@@ -72,7 +76,9 @@ impl OutgoingMessageSender {
match serde_json::to_value(response) {
Ok(result) => {
let outgoing_message = OutgoingMessage::Response(OutgoingResponse { id, result });
let _ = self.sender.send(outgoing_message);
if let Err(err) = self.sender.send(outgoing_message).await {
warn!("failed to send response to client: {err:?}");
}
}
Err(err) => {
self.send_error(
@@ -89,21 +95,29 @@ impl OutgoingMessageSender {
}
pub(crate) async fn send_server_notification(&self, notification: ServerNotification) {
let _ = self
if let Err(err) = self
.sender
.send(OutgoingMessage::AppServerNotification(notification));
.send(OutgoingMessage::AppServerNotification(notification))
.await
{
warn!("failed to send server notification to client: {err:?}");
}
}
/// All notifications should be migrated to [`ServerNotification`] and
/// [`OutgoingMessage::Notification`] should be removed.
pub(crate) async fn send_notification(&self, notification: OutgoingNotification) {
let outgoing_message = OutgoingMessage::Notification(notification);
let _ = self.sender.send(outgoing_message);
if let Err(err) = self.sender.send(outgoing_message).await {
warn!("failed to send notification to client: {err:?}");
}
}
pub(crate) async fn send_error(&self, id: RequestId, error: JSONRPCErrorError) {
let outgoing_message = OutgoingMessage::Error(OutgoingError { id, error });
let _ = self.sender.send(outgoing_message);
if let Err(err) = self.sender.send(outgoing_message).await {
warn!("failed to send error to client: {err:?}");
}
}
}
@@ -215,6 +229,7 @@ mod tests {
resets_at: Some(123),
}),
secondary: None,
credits: None,
},
});
@@ -229,7 +244,8 @@ mod tests {
"windowDurationMins": 15,
"resetsAt": 123
},
"secondary": null
"secondary": null,
"credits": null
}
},
}),
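The hunks above replace an unbounded outgoing channel with a bounded one and log send failures instead of discarding them with `let _ =`. The diff itself uses tokio's `mpsc::channel` and `.send(...).await`. The following is only a dependency-free sketch of the same idea using the standard library's `sync_channel` as a stand-in; the `Outgoing` type and the `send_or_warn` helper are hypothetical names introduced for illustration.

```rust
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};

/// Hypothetical stand-in for the server's outgoing message type.
#[derive(Debug, PartialEq)]
enum Outgoing {
    Notification(String),
}

/// Sends a message and reports a failure instead of silently dropping it.
/// Returns `true` if the message was queued.
fn send_or_warn(tx: &SyncSender<Outgoing>, msg: Outgoing) -> bool {
    match tx.send(msg) {
        Ok(()) => true,
        Err(err) => {
            eprintln!("failed to send notification to client: {err:?}");
            false
        }
    }
}

fn main() {
    // Capacity 1: a second message does not fit until the first is received.
    let (tx, rx) = sync_channel::<Outgoing>(1);
    assert!(send_or_warn(&tx, Outgoing::Notification("first".into())));

    // A bounded channel pushes back on the producer once it is full.
    assert!(matches!(
        tx.try_send(Outgoing::Notification("second".into())),
        Err(TrySendError::Full(_))
    ));

    // Draining the channel frees capacity again.
    assert_eq!(rx.recv().unwrap(), Outgoing::Notification("first".into()));

    // Once the receiver is gone, sending fails and the failure is logged.
    drop(rx);
    assert!(!send_or_warn(&tx, Outgoing::Notification("late".into())));
}
```

In the async version a full channel suspends the sending task at `.await` rather than blocking a thread, but the failure mode when the receiver is dropped is analogous.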

@@ -24,3 +24,5 @@ tokio = { workspace = true, features = [
] }
uuid = { workspace = true }
wiremock = { workspace = true }
core_test_support = { path = "../../../core/tests/common" }
shlex = { workspace = true }

@@ -9,12 +9,14 @@ pub use auth_fixtures::ChatGptIdTokenClaims;
pub use auth_fixtures::encode_id_token;
pub use auth_fixtures::write_chatgpt_auth;
use codex_app_server_protocol::JSONRPCResponse;
pub use core_test_support::format_with_current_shell;
pub use core_test_support::format_with_current_shell_display;
pub use mcp_process::McpProcess;
pub use mock_model_server::create_mock_chat_completions_server;
pub use mock_model_server::create_mock_chat_completions_server_unchecked;
pub use responses::create_apply_patch_sse_response;
pub use responses::create_final_assistant_message_sse_response;
pub use responses::create_shell_sse_response;
pub use responses::create_shell_command_sse_response;
pub use rollout::create_fake_rollout;
use serde::de::DeserializeOwned;

@@ -35,6 +35,7 @@ use codex_app_server_protocol::NewConversationParams;
use codex_app_server_protocol::RemoveConversationListenerParams;
use codex_app_server_protocol::RequestId;
use codex_app_server_protocol::ResumeConversationParams;
use codex_app_server_protocol::ReviewStartParams;
use codex_app_server_protocol::SendUserMessageParams;
use codex_app_server_protocol::SendUserTurnParams;
use codex_app_server_protocol::ServerRequest;
@@ -377,6 +378,15 @@ impl McpProcess {
self.send_request("turn/interrupt", params).await
}
/// Send a `review/start` JSON-RPC request (v2).
pub async fn send_review_start_request(
&mut self,
params: ReviewStartParams,
) -> anyhow::Result<i64> {
let params = Some(serde_json::to_value(params)?);
self.send_request("review/start", params).await
}
/// Send a `cancelLoginChatGpt` JSON-RPC request.
pub async fn send_cancel_login_chat_gpt_request(
&mut self,

@@ -1,17 +1,18 @@
use serde_json::json;
use std::path::Path;
pub fn create_shell_sse_response(
pub fn create_shell_command_sse_response(
command: Vec<String>,
workdir: Option<&Path>,
timeout_ms: Option<u64>,
call_id: &str,
) -> anyhow::Result<String> {
// The `arguments`` for the `shell` tool is a serialized JSON object.
// The `arguments` for the `shell_command` tool is a serialized JSON object.
let command_str = shlex::try_join(command.iter().map(String::as_str))?;
let tool_call_arguments = serde_json::to_string(&json!({
"command": command,
"command": command_str,
"workdir": workdir.map(|w| w.to_string_lossy()),
"timeout": timeout_ms
"timeout_ms": timeout_ms
}))?;
let tool_call = json!({
"choices": [
@@ -21,7 +22,7 @@ pub fn create_shell_sse_response(
{
"id": call_id,
"function": {
"name": "shell",
"name": "shell_command",
"arguments": tool_call_arguments
}
}
@@ -62,10 +63,10 @@ pub fn create_apply_patch_sse_response(
patch_content: &str,
call_id: &str,
) -> anyhow::Result<String> {
// Use shell command to call apply_patch with heredoc format
let shell_command = format!("apply_patch <<'EOF'\n{patch_content}\nEOF");
// Use shell_command to call apply_patch with heredoc format
let command = format!("apply_patch <<'EOF'\n{patch_content}\nEOF");
let tool_call_arguments = serde_json::to_string(&json!({
"command": ["bash", "-lc", shell_command]
"command": command
}))?;
let tool_call = json!({
@@ -76,7 +77,7 @@ pub fn create_apply_patch_sse_response(
{
"id": call_id,
"function": {
"name": "shell",
"name": "shell_command",
"arguments": tool_call_arguments
}
}
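The test helper above now sends `command` as a single shell string (built with `shlex::try_join`) rather than as an argument vector. To show what that joining step amounts to, here is a hand-rolled sketch of POSIX-style quoting. It is not the `shlex` crate's implementation, and `quote_arg` and `join_command` are hypothetical names; the set of characters treated as safe is an assumption chosen for the example.

```rust
/// Quotes one argument for a POSIX shell.
/// Hand-rolled stand-in for illustration, not the `shlex` crate's algorithm.
fn quote_arg(arg: &str) -> String {
    // Assumed "safe" set: characters that need no quoting in a POSIX shell.
    let is_safe = !arg.is_empty()
        && arg
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || "-_./=:,+@%".contains(c));
    if is_safe {
        arg.to_string()
    } else {
        // Wrap in single quotes; an embedded ' becomes '\'' (close, escaped quote, reopen).
        format!("'{}'", arg.replace('\'', r"'\''"))
    }
}

/// Joins an argument vector into one shell command string.
fn join_command(args: &[&str]) -> String {
    args.iter()
        .map(|a| quote_arg(a))
        .collect::<Vec<_>>()
        .join(" ")
}

fn main() {
    println!("{}", join_command(&["python3", "-c", "print(42)"]));
    println!("{}", join_command(&["echo", "second", "turn"]));
}
```

Under these assumptions the first call yields the same string the updated tests pass to `format_with_current_shell`, namely `python3 -c 'print(42)'`.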

@@ -2,7 +2,8 @@ use anyhow::Result;
use app_test_support::McpProcess;
use app_test_support::create_final_assistant_message_sse_response;
use app_test_support::create_mock_chat_completions_server;
use app_test_support::create_shell_sse_response;
use app_test_support::create_shell_command_sse_response;
use app_test_support::format_with_current_shell;
use app_test_support::to_response;
use codex_app_server_protocol::AddConversationListenerParams;
use codex_app_server_protocol::AddConversationSubscriptionResponse;
@@ -56,7 +57,7 @@ async fn test_codex_jsonrpc_conversation_flow() -> Result<()> {
// Create a mock model server that immediately ends each turn.
// Two turns are expected: initial session configure + one user message.
let responses = vec![
create_shell_sse_response(
create_shell_command_sse_response(
vec!["ls".to_string()],
Some(&working_directory),
Some(5000),
@@ -175,7 +176,7 @@ async fn test_send_user_turn_changes_approval_policy_behavior() -> Result<()> {
// Mock server will request a python shell call for the first and second turn, then finish.
let responses = vec![
create_shell_sse_response(
create_shell_command_sse_response(
vec![
"python3".to_string(),
"-c".to_string(),
@@ -186,7 +187,7 @@ async fn test_send_user_turn_changes_approval_policy_behavior() -> Result<()> {
"call1",
)?,
create_final_assistant_message_sse_response("done 1")?,
create_shell_sse_response(
create_shell_command_sse_response(
vec![
"python3".to_string(),
"-c".to_string(),
@@ -267,11 +268,7 @@ async fn test_send_user_turn_changes_approval_policy_behavior() -> Result<()> {
ExecCommandApprovalParams {
conversation_id,
call_id: "call1".to_string(),
command: vec![
"python3".to_string(),
"-c".to_string(),
"print(42)".to_string(),
],
command: format_with_current_shell("python3 -c 'print(42)'"),
cwd: working_directory.clone(),
reason: None,
risk: None,
@@ -353,23 +350,15 @@ async fn test_send_user_turn_updates_sandbox_and_cwd_between_turns() -> Result<(
std::fs::create_dir(&second_cwd)?;
let responses = vec![
create_shell_sse_response(
vec![
"bash".to_string(),
"-lc".to_string(),
"echo first turn".to_string(),
],
create_shell_command_sse_response(
vec!["echo".to_string(), "first".to_string(), "turn".to_string()],
None,
Some(5000),
"call-first",
)?,
create_final_assistant_message_sse_response("done first")?,
create_shell_sse_response(
vec![
"bash".to_string(),
"-lc".to_string(),
"echo second turn".to_string(),
],
create_shell_command_sse_response(
vec!["echo".to_string(), "second".to_string(), "turn".to_string()],
None,
Some(5000),
"call-second",
@@ -481,13 +470,9 @@ async fn test_send_user_turn_updates_sandbox_and_cwd_between_turns() -> Result<(
exec_begin.cwd, second_cwd,
"exec turn should run from updated cwd"
);
let expected_command = format_with_current_shell("echo second turn");
assert_eq!(
exec_begin.command,
vec![
"bash".to_string(),
"-lc".to_string(),
"echo second turn".to_string()
],
exec_begin.command, expected_command,
"exec turn should run expected command"
);

@@ -27,7 +27,7 @@ fn create_config_toml(codex_home: &Path) -> std::io::Result<()> {
std::fs::write(
config_toml,
r#"
model = "gpt-5.1-codex"
model = "gpt-5.1-codex-max"
approval_policy = "on-request"
sandbox_mode = "workspace-write"
model_reasoning_summary = "detailed"
@@ -87,7 +87,7 @@ async fn get_config_toml_parses_all_fields() -> Result<()> {
}),
forced_chatgpt_workspace_id: Some("12345678-0000-0000-0000-000000000000".into()),
forced_login_method: Some(ForcedLoginMethod::Chatgpt),
model: Some("gpt-5.1-codex".into()),
model: Some("gpt-5.1-codex-max".into()),
model_reasoning_effort: Some(ReasoningEffort::High),
model_reasoning_summary: Some(ReasoningSummary::Detailed),
model_verbosity: Some(Verbosity::Medium),

@@ -19,7 +19,7 @@ use tokio::time::timeout;
use app_test_support::McpProcess;
use app_test_support::create_mock_chat_completions_server;
use app_test_support::create_shell_sse_response;
use app_test_support::create_shell_command_sse_response;
use app_test_support::to_response;
const DEFAULT_READ_TIMEOUT: std::time::Duration = std::time::Duration::from_secs(10);
@@ -56,7 +56,7 @@ async fn shell_command_interruption() -> anyhow::Result<()> {
std::fs::create_dir(&working_directory)?;
// Create mock server with a single SSE response: the long sleep command
let server = create_mock_chat_completions_server(vec![create_shell_sse_response(
let server = create_mock_chat_completions_server(vec![create_shell_command_sse_response(
shell_command.clone(),
Some(&working_directory),
Some(10_000), // 10 seconds timeout in ms

@@ -57,7 +57,7 @@ fn create_config_toml(codex_home: &Path) -> std::io::Result<()> {
std::fs::write(
config_toml,
r#"
model = "gpt-5.1-codex"
model = "gpt-5.1-codex-max"
model_reasoning_effort = "medium"
"#,
)

@@ -1,6 +1,7 @@
mod account;
mod model_list;
mod rate_limits;
mod review;
mod thread_archive;
mod thread_list;
mod thread_resume;

@@ -45,6 +45,33 @@ async fn list_models_returns_all_models_with_large_limit() -> Result<()> {
} = to_response::<ModelListResponse>(response)?;
let expected_models = vec![
Model {
id: "gpt-5.1-codex-max".to_string(),
model: "gpt-5.1-codex-max".to_string(),
display_name: "gpt-5.1-codex-max".to_string(),
description: "Latest Codex-optimized flagship for deep and fast reasoning.".to_string(),
supported_reasoning_efforts: vec![
ReasoningEffortOption {
reasoning_effort: ReasoningEffort::Low,
description: "Fast responses with lighter reasoning".to_string(),
},
ReasoningEffortOption {
reasoning_effort: ReasoningEffort::Medium,
description: "Balances speed and reasoning depth for everyday tasks"
.to_string(),
},
ReasoningEffortOption {
reasoning_effort: ReasoningEffort::High,
description: "Maximizes reasoning depth for complex problems".to_string(),
},
ReasoningEffortOption {
reasoning_effort: ReasoningEffort::XHigh,
description: "Extra high reasoning depth for complex problems".to_string(),
},
],
default_reasoning_effort: ReasoningEffort::Medium,
is_default: true,
},
Model {
id: "gpt-5.1-codex".to_string(),
model: "gpt-5.1-codex".to_string(),
@@ -66,7 +93,7 @@ async fn list_models_returns_all_models_with_large_limit() -> Result<()> {
},
],
default_reasoning_effort: ReasoningEffort::Medium,
is_default: true,
is_default: false,
},
Model {
id: "gpt-5.1-codex-mini".to_string(),
@@ -147,7 +174,7 @@ async fn list_models_pagination_works() -> Result<()> {
} = to_response::<ModelListResponse>(first_response)?;
assert_eq!(first_items.len(), 1);
assert_eq!(first_items[0].id, "gpt-5.1-codex");
assert_eq!(first_items[0].id, "gpt-5.1-codex-max");
let next_cursor = first_cursor.ok_or_else(|| anyhow!("cursor for second page"))?;
let second_request = mcp
@@ -169,7 +196,7 @@ async fn list_models_pagination_works() -> Result<()> {
} = to_response::<ModelListResponse>(second_response)?;
assert_eq!(second_items.len(), 1);
assert_eq!(second_items[0].id, "gpt-5.1-codex-mini");
assert_eq!(second_items[0].id, "gpt-5.1-codex");
let third_cursor = second_cursor.ok_or_else(|| anyhow!("cursor for third page"))?;
let third_request = mcp
@@ -191,8 +218,30 @@ async fn list_models_pagination_works() -> Result<()> {
} = to_response::<ModelListResponse>(third_response)?;
assert_eq!(third_items.len(), 1);
assert_eq!(third_items[0].id, "gpt-5.1");
assert!(third_cursor.is_none());
assert_eq!(third_items[0].id, "gpt-5.1-codex-mini");
let fourth_cursor = third_cursor.ok_or_else(|| anyhow!("cursor for fourth page"))?;
let fourth_request = mcp
.send_list_models_request(ModelListParams {
limit: Some(1),
cursor: Some(fourth_cursor.clone()),
})
.await?;
let fourth_response: JSONRPCResponse = timeout(
DEFAULT_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(fourth_request)),
)
.await??;
let ModelListResponse {
data: fourth_items,
next_cursor: fourth_cursor,
} = to_response::<ModelListResponse>(fourth_response)?;
assert_eq!(fourth_items.len(), 1);
assert_eq!(fourth_items[0].id, "gpt-5.1");
assert!(fourth_cursor.is_none());
Ok(())
}
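The pagination test above walks four pages of size one and expects the final page to carry no cursor. A minimal sketch of cursor-based paging with that contract might look like the following. The `list_page` function and the cursor encoding (the next start index as a string) are assumptions made for illustration; the diff does not show how the server encodes its cursors.

```rust
/// Returns one page of items plus the cursor for the next page, if any.
/// Assumption: the cursor is the next start index encoded as a string.
fn list_page(items: &[&str], limit: usize, cursor: Option<&str>) -> (Vec<String>, Option<String>) {
    let start = cursor
        .and_then(|c| c.parse::<usize>().ok())
        .unwrap_or(0)
        .min(items.len());
    let end = start.saturating_add(limit).min(items.len());
    let page = items[start..end].iter().map(|s| s.to_string()).collect();
    // Only hand out a cursor when more items remain.
    let next_cursor = (end < items.len()).then(|| end.to_string());
    (page, next_cursor)
}

fn main() {
    // Model order as asserted by the updated test.
    let models = ["gpt-5.1-codex-max", "gpt-5.1-codex", "gpt-5.1-codex-mini", "gpt-5.1"];
    let mut cursor: Option<String> = None;
    loop {
        let (page, next) = list_page(&models, 1, cursor.as_deref());
        println!("{page:?}");
        match next {
            Some(c) => cursor = Some(c),
            None => break, // last page reached
        }
    }
}
```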

@@ -152,6 +152,7 @@ async fn get_account_rate_limits_returns_snapshot() -> Result<()> {
window_duration_mins: Some(1440),
resets_at: Some(secondary_reset_timestamp),
}),
credits: None,
},
};
assert_eq!(received, expected);

@@ -0,0 +1,279 @@
use anyhow::Result;
use app_test_support::McpProcess;
use app_test_support::create_final_assistant_message_sse_response;
use app_test_support::create_mock_chat_completions_server_unchecked;
use app_test_support::to_response;
use codex_app_server_protocol::ItemCompletedNotification;
use codex_app_server_protocol::ItemStartedNotification;
use codex_app_server_protocol::JSONRPCError;
use codex_app_server_protocol::JSONRPCNotification;
use codex_app_server_protocol::JSONRPCResponse;
use codex_app_server_protocol::RequestId;
use codex_app_server_protocol::ReviewStartParams;
use codex_app_server_protocol::ReviewTarget;
use codex_app_server_protocol::ThreadItem;
use codex_app_server_protocol::ThreadStartParams;
use codex_app_server_protocol::ThreadStartResponse;
use codex_app_server_protocol::TurnStartResponse;
use codex_app_server_protocol::TurnStatus;
use serde_json::json;
use tempfile::TempDir;
use tokio::time::timeout;
const DEFAULT_READ_TIMEOUT: std::time::Duration = std::time::Duration::from_secs(10);
const INVALID_REQUEST_ERROR_CODE: i64 = -32600;
#[tokio::test]
async fn review_start_runs_review_turn_and_emits_code_review_item() -> Result<()> {
let review_payload = json!({
"findings": [
{
"title": "Prefer Stylize helpers",
"body": "Use .dim()/.bold() chaining instead of manual Style.",
"confidence_score": 0.9,
"priority": 1,
"code_location": {
"absolute_file_path": "/tmp/file.rs",
"line_range": {"start": 10, "end": 20}
}
}
],
"overall_correctness": "good",
"overall_explanation": "Looks solid overall with minor polish suggested.",
"overall_confidence_score": 0.75
})
.to_string();
let responses = vec![create_final_assistant_message_sse_response(
&review_payload,
)?];
let server = create_mock_chat_completions_server_unchecked(responses).await;
let codex_home = TempDir::new()?;
create_config_toml(codex_home.path(), &server.uri())?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
let thread_id = start_default_thread(&mut mcp).await?;
let review_req = mcp
.send_review_start_request(ReviewStartParams {
thread_id: thread_id.clone(),
append_to_original_thread: true,
target: ReviewTarget::Commit {
sha: "1234567deadbeef".to_string(),
title: Some("Tidy UI colors".to_string()),
},
})
.await?;
let review_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(review_req)),
)
.await??;
let TurnStartResponse { turn } = to_response::<TurnStartResponse>(review_resp)?;
let turn_id = turn.id.clone();
assert_eq!(turn.status, TurnStatus::InProgress);
assert_eq!(turn.items.len(), 1);
match &turn.items[0] {
ThreadItem::UserMessage { content, .. } => {
assert_eq!(content.len(), 1);
assert!(matches!(
&content[0],
codex_app_server_protocol::UserInput::Text { .. }
));
}
other => panic!("expected user message, got {other:?}"),
}
let _started: JSONRPCNotification = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("turn/started"),
)
.await??;
let item_started: JSONRPCNotification = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("item/started"),
)
.await??;
let started: ItemStartedNotification =
serde_json::from_value(item_started.params.expect("params must be present"))?;
match started.item {
ThreadItem::CodeReview { id, review } => {
assert_eq!(id, turn_id);
assert_eq!(review, "commit 1234567");
}
other => panic!("expected code review item, got {other:?}"),
}
let mut review_body: Option<String> = None;
for _ in 0..5 {
let review_notif: JSONRPCNotification = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("item/completed"),
)
.await??;
let completed: ItemCompletedNotification =
serde_json::from_value(review_notif.params.expect("params must be present"))?;
match completed.item {
ThreadItem::CodeReview { id, review } => {
assert_eq!(id, turn_id);
review_body = Some(review);
break;
}
ThreadItem::UserMessage { .. } => continue,
other => panic!("unexpected item/completed payload: {other:?}"),
}
}
let review = review_body.expect("did not observe a code review item");
assert!(review.contains("Prefer Stylize helpers"));
assert!(review.contains("/tmp/file.rs:10-20"));
Ok(())
}
#[tokio::test]
async fn review_start_rejects_empty_base_branch() -> Result<()> {
let server = create_mock_chat_completions_server_unchecked(vec![]).await;
let codex_home = TempDir::new()?;
create_config_toml(codex_home.path(), &server.uri())?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
let thread_id = start_default_thread(&mut mcp).await?;
let request_id = mcp
.send_review_start_request(ReviewStartParams {
thread_id,
append_to_original_thread: true,
target: ReviewTarget::BaseBranch {
branch: " ".to_string(),
},
})
.await?;
let error: JSONRPCError = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_error_message(RequestId::Integer(request_id)),
)
.await??;
assert_eq!(error.error.code, INVALID_REQUEST_ERROR_CODE);
assert!(
error.error.message.contains("branch must not be empty"),
"unexpected message: {}",
error.error.message
);
Ok(())
}
#[tokio::test]
async fn review_start_rejects_empty_commit_sha() -> Result<()> {
let server = create_mock_chat_completions_server_unchecked(vec![]).await;
let codex_home = TempDir::new()?;
create_config_toml(codex_home.path(), &server.uri())?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
let thread_id = start_default_thread(&mut mcp).await?;
let request_id = mcp
.send_review_start_request(ReviewStartParams {
thread_id,
append_to_original_thread: true,
target: ReviewTarget::Commit {
sha: "\t".to_string(),
title: None,
},
})
.await?;
let error: JSONRPCError = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_error_message(RequestId::Integer(request_id)),
)
.await??;
assert_eq!(error.error.code, INVALID_REQUEST_ERROR_CODE);
assert!(
error.error.message.contains("sha must not be empty"),
"unexpected message: {}",
error.error.message
);
Ok(())
}
#[tokio::test]
async fn review_start_rejects_empty_custom_instructions() -> Result<()> {
let server = create_mock_chat_completions_server_unchecked(vec![]).await;
let codex_home = TempDir::new()?;
create_config_toml(codex_home.path(), &server.uri())?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
let thread_id = start_default_thread(&mut mcp).await?;
let request_id = mcp
.send_review_start_request(ReviewStartParams {
thread_id,
append_to_original_thread: true,
target: ReviewTarget::Custom {
instructions: "\n\n".to_string(),
},
})
.await?;
let error: JSONRPCError = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_error_message(RequestId::Integer(request_id)),
)
.await??;
assert_eq!(error.error.code, INVALID_REQUEST_ERROR_CODE);
assert!(
error
.error
.message
.contains("instructions must not be empty"),
"unexpected message: {}",
error.error.message
);
Ok(())
}
async fn start_default_thread(mcp: &mut McpProcess) -> Result<String> {
let thread_req = mcp
.send_thread_start_request(ThreadStartParams {
model: Some("mock-model".to_string()),
..Default::default()
})
.await?;
let thread_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(thread_req)),
)
.await??;
let ThreadStartResponse { thread, .. } = to_response::<ThreadStartResponse>(thread_resp)?;
Ok(thread.id)
}
fn create_config_toml(codex_home: &std::path::Path, server_uri: &str) -> std::io::Result<()> {
let config_toml = codex_home.join("config.toml");
std::fs::write(
config_toml,
format!(
r#"
model = "mock-model"
approval_policy = "never"
sandbox_mode = "read-only"
model_provider = "mock_provider"
[model_providers.mock_provider]
name = "Mock provider"
base_url = "{server_uri}/v1"
wire_api = "chat"
request_max_retries = 0
stream_max_retries = 0
"#
),
)
}
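The three rejection tests above send whitespace-only values (`" "`, `"\t"`, `"\n\n"`) and expect an invalid-request error whose message names the offending field. The server-side check is not part of this diff, so the following is only a sketch of validation that would satisfy those tests. The `ReviewTarget` enum here is a partial local mirror of the protocol type, and `RpcError` and `validate_target` are hypothetical names.

```rust
const INVALID_REQUEST_ERROR_CODE: i64 = -32600;

/// Partial, local mirror of the protocol's review target (assumption).
#[allow(dead_code)]
#[derive(Debug)]
enum ReviewTarget {
    BaseBranch { branch: String },
    Commit { sha: String, title: Option<String> },
    Custom { instructions: String },
}

/// Hypothetical error shape carrying a JSON-RPC code and message.
#[derive(Debug, PartialEq)]
struct RpcError {
    code: i64,
    message: String,
}

/// Rejects targets whose required text field is empty after trimming whitespace.
fn validate_target(target: &ReviewTarget) -> Result<(), RpcError> {
    let (field, value) = match target {
        ReviewTarget::BaseBranch { branch } => ("branch", branch),
        ReviewTarget::Commit { sha, .. } => ("sha", sha),
        ReviewTarget::Custom { instructions } => ("instructions", instructions),
    };
    if value.trim().is_empty() {
        return Err(RpcError {
            code: INVALID_REQUEST_ERROR_CODE,
            message: format!("{field} must not be empty"),
        });
    }
    Ok(())
}

fn main() {
    let blank = ReviewTarget::BaseBranch { branch: " ".to_string() };
    println!("{:?}", validate_target(&blank));
    let ok = ReviewTarget::Commit { sha: "1234567deadbeef".to_string(), title: None };
    println!("{:?}", validate_target(&ok));
}
```

Trimming before the emptiness check is what lets a single rule cover spaces, tabs, and newlines alike.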

@@ -35,7 +35,7 @@ async fn thread_archive_moves_rollout_into_archived_directory() -> Result<()> {
mcp.read_stream_until_response_message(RequestId::Integer(start_id)),
)
.await??;
let ThreadStartResponse { thread } = to_response::<ThreadStartResponse>(start_resp)?;
let ThreadStartResponse { thread, .. } = to_response::<ThreadStartResponse>(start_resp)?;
assert!(!thread.id.is_empty());
// Locate the rollout path recorded for this thread id.

@@ -1,13 +1,17 @@
use anyhow::Result;
use app_test_support::McpProcess;
use app_test_support::create_fake_rollout;
use app_test_support::create_mock_chat_completions_server;
use app_test_support::to_response;
use codex_app_server_protocol::JSONRPCResponse;
use codex_app_server_protocol::RequestId;
use codex_app_server_protocol::ThreadItem;
use codex_app_server_protocol::ThreadResumeParams;
use codex_app_server_protocol::ThreadResumeResponse;
use codex_app_server_protocol::ThreadStartParams;
use codex_app_server_protocol::ThreadStartResponse;
use codex_app_server_protocol::TurnStatus;
use codex_app_server_protocol::UserInput;
use codex_protocol::models::ContentItem;
use codex_protocol::models::ResponseItem;
use tempfile::TempDir;
@@ -27,7 +31,7 @@ async fn thread_resume_returns_original_thread() -> Result<()> {
// Start a thread.
let start_id = mcp
.send_thread_start_request(ThreadStartParams {
model: Some("gpt-5.1-codex".to_string()),
model: Some("gpt-5.1-codex-max".to_string()),
..Default::default()
})
.await?;
@@ -36,7 +40,7 @@ async fn thread_resume_returns_original_thread() -> Result<()> {
mcp.read_stream_until_response_message(RequestId::Integer(start_id)),
)
.await??;
let ThreadStartResponse { thread } = to_response::<ThreadStartResponse>(start_resp)?;
let ThreadStartResponse { thread, .. } = to_response::<ThreadStartResponse>(start_resp)?;
// Resume it via v2 API.
let resume_id = mcp
@@ -50,13 +54,73 @@ async fn thread_resume_returns_original_thread() -> Result<()> {
mcp.read_stream_until_response_message(RequestId::Integer(resume_id)),
)
.await??;
let ThreadResumeResponse { thread: resumed } =
to_response::<ThreadResumeResponse>(resume_resp)?;
let ThreadResumeResponse {
thread: resumed, ..
} = to_response::<ThreadResumeResponse>(resume_resp)?;
assert_eq!(resumed, thread);
Ok(())
}
#[tokio::test]
async fn thread_resume_returns_rollout_history() -> Result<()> {
let server = create_mock_chat_completions_server(vec![]).await;
let codex_home = TempDir::new()?;
create_config_toml(codex_home.path(), &server.uri())?;
let preview = "Saved user message";
let conversation_id = create_fake_rollout(
codex_home.path(),
"2025-01-05T12-00-00",
"2025-01-05T12:00:00Z",
preview,
Some("mock_provider"),
)?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
let resume_id = mcp
.send_thread_resume_request(ThreadResumeParams {
thread_id: conversation_id.clone(),
..Default::default()
})
.await?;
let resume_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(resume_id)),
)
.await??;
let ThreadResumeResponse { thread, .. } = to_response::<ThreadResumeResponse>(resume_resp)?;
assert_eq!(thread.id, conversation_id);
assert_eq!(thread.preview, preview);
assert_eq!(thread.model_provider, "mock_provider");
assert!(thread.path.is_absolute());
assert_eq!(
thread.turns.len(),
1,
"expected rollouts to include one turn"
);
let turn = &thread.turns[0];
assert_eq!(turn.status, TurnStatus::Completed);
assert_eq!(turn.items.len(), 1, "expected user message item");
match &turn.items[0] {
ThreadItem::UserMessage { content, .. } => {
assert_eq!(
content,
&vec![UserInput::Text {
text: preview.to_string()
}]
);
}
other => panic!("expected user message item, got {other:?}"),
}
Ok(())
}
#[tokio::test]
async fn thread_resume_prefers_path_over_thread_id() -> Result<()> {
let server = create_mock_chat_completions_server(vec![]).await;
@@ -68,7 +132,7 @@ async fn thread_resume_prefers_path_over_thread_id() -> Result<()> {
let start_id = mcp
.send_thread_start_request(ThreadStartParams {
model: Some("gpt-5.1-codex".to_string()),
model: Some("gpt-5.1-codex-max".to_string()),
..Default::default()
})
.await?;
@@ -77,7 +141,7 @@ async fn thread_resume_prefers_path_over_thread_id() -> Result<()> {
mcp.read_stream_until_response_message(RequestId::Integer(start_id)),
)
.await??;
let ThreadStartResponse { thread } = to_response::<ThreadStartResponse>(start_resp)?;
let ThreadStartResponse { thread, .. } = to_response::<ThreadStartResponse>(start_resp)?;
let thread_path = thread.path.clone();
let resume_id = mcp
@@ -93,8 +157,9 @@ async fn thread_resume_prefers_path_over_thread_id() -> Result<()> {
mcp.read_stream_until_response_message(RequestId::Integer(resume_id)),
)
.await??;
let ThreadResumeResponse { thread: resumed } =
to_response::<ThreadResumeResponse>(resume_resp)?;
let ThreadResumeResponse {
thread: resumed, ..
} = to_response::<ThreadResumeResponse>(resume_resp)?;
assert_eq!(resumed, thread);
Ok(())
@@ -112,7 +177,7 @@ async fn thread_resume_supports_history_and_overrides() -> Result<()> {
// Start a thread.
let start_id = mcp
.send_thread_start_request(ThreadStartParams {
model: Some("gpt-5.1-codex".to_string()),
model: Some("gpt-5.1-codex-max".to_string()),
..Default::default()
})
.await?;
@@ -121,7 +186,7 @@ async fn thread_resume_supports_history_and_overrides() -> Result<()> {
mcp.read_stream_until_response_message(RequestId::Integer(start_id)),
)
.await??;
let ThreadStartResponse { thread } = to_response::<ThreadStartResponse>(start_resp)?;
let ThreadStartResponse { thread, .. } = to_response::<ThreadStartResponse>(start_resp)?;
let history_text = "Hello from history";
let history = vec![ResponseItem::Message {
@@ -147,10 +212,13 @@ async fn thread_resume_supports_history_and_overrides() -> Result<()> {
mcp.read_stream_until_response_message(RequestId::Integer(resume_id)),
)
.await??;
let ThreadResumeResponse { thread: resumed } =
to_response::<ThreadResumeResponse>(resume_resp)?;
let ThreadResumeResponse {
thread: resumed,
model_provider,
..
} = to_response::<ThreadResumeResponse>(resume_resp)?;
assert!(!resumed.id.is_empty());
assert_eq!(resumed.model_provider, "mock_provider");
assert_eq!(model_provider, "mock_provider");
assert_eq!(resumed.preview, history_text);
Ok(())

@@ -40,13 +40,17 @@ async fn thread_start_creates_thread_and_emits_started() -> Result<()> {
mcp.read_stream_until_response_message(RequestId::Integer(req_id)),
)
.await??;
let ThreadStartResponse { thread } = to_response::<ThreadStartResponse>(resp)?;
let ThreadStartResponse {
thread,
model_provider,
..
} = to_response::<ThreadStartResponse>(resp)?;
assert!(!thread.id.is_empty(), "thread id should not be empty");
assert!(
thread.preview.is_empty(),
"new threads should start with an empty preview"
);
assert_eq!(thread.model_provider, "mock_provider");
assert_eq!(model_provider, "mock_provider");
assert!(
thread.created_at > 0,
"created_at should be a positive UNIX timestamp"

@@ -3,16 +3,19 @@
use anyhow::Result;
use app_test_support::McpProcess;
use app_test_support::create_mock_chat_completions_server;
use app_test_support::create_shell_sse_response;
use app_test_support::create_shell_command_sse_response;
use app_test_support::to_response;
use codex_app_server_protocol::JSONRPCNotification;
use codex_app_server_protocol::JSONRPCResponse;
use codex_app_server_protocol::RequestId;
use codex_app_server_protocol::ThreadStartParams;
use codex_app_server_protocol::ThreadStartResponse;
use codex_app_server_protocol::TurnCompletedNotification;
use codex_app_server_protocol::TurnInterruptParams;
use codex_app_server_protocol::TurnInterruptResponse;
use codex_app_server_protocol::TurnStartParams;
use codex_app_server_protocol::TurnStartResponse;
use codex_app_server_protocol::TurnStatus;
use codex_app_server_protocol::UserInput as V2UserInput;
use tempfile::TempDir;
use tokio::time::timeout;
@@ -38,7 +41,7 @@ async fn turn_interrupt_aborts_running_turn() -> Result<()> {
std::fs::create_dir(&working_directory)?;
// Mock server: long-running shell command then (after abort) nothing else needed.
let server = create_mock_chat_completions_server(vec![create_shell_sse_response(
let server = create_mock_chat_completions_server(vec![create_shell_command_sse_response(
shell_command.clone(),
Some(&working_directory),
Some(10_000),
@@ -62,7 +65,7 @@ async fn turn_interrupt_aborts_running_turn() -> Result<()> {
mcp.read_stream_until_response_message(RequestId::Integer(thread_req)),
)
.await??;
let ThreadStartResponse { thread } = to_response::<ThreadStartResponse>(thread_resp)?;
let ThreadStartResponse { thread, .. } = to_response::<ThreadStartResponse>(thread_resp)?;
// Start a turn that triggers a long-running command.
let turn_req = mcp
@@ -99,7 +102,18 @@ async fn turn_interrupt_aborts_running_turn() -> Result<()> {
.await??;
let _resp: TurnInterruptResponse = to_response::<TurnInterruptResponse>(interrupt_resp)?;
// No fields to assert on; successful deserialization confirms proper response shape.
let completed_notif: JSONRPCNotification = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("turn/completed"),
)
.await??;
let completed: TurnCompletedNotification = serde_json::from_value(
completed_notif
.params
.expect("turn/completed params must be present"),
)?;
assert_eq!(completed.turn.status, TurnStatus::Interrupted);
Ok(())
}

@@ -1,22 +1,32 @@
use anyhow::Result;
use app_test_support::McpProcess;
use app_test_support::create_apply_patch_sse_response;
use app_test_support::create_final_assistant_message_sse_response;
use app_test_support::create_mock_chat_completions_server;
use app_test_support::create_mock_chat_completions_server_unchecked;
use app_test_support::create_shell_sse_response;
use app_test_support::create_shell_command_sse_response;
use app_test_support::format_with_current_shell_display;
use app_test_support::to_response;
use codex_app_server_protocol::ApprovalDecision;
use codex_app_server_protocol::CommandExecutionRequestApprovalResponse;
use codex_app_server_protocol::CommandExecutionStatus;
use codex_app_server_protocol::FileChangeRequestApprovalResponse;
use codex_app_server_protocol::ItemCompletedNotification;
use codex_app_server_protocol::ItemStartedNotification;
use codex_app_server_protocol::JSONRPCNotification;
use codex_app_server_protocol::JSONRPCResponse;
use codex_app_server_protocol::PatchApplyStatus;
use codex_app_server_protocol::PatchChangeKind;
use codex_app_server_protocol::RequestId;
use codex_app_server_protocol::ServerRequest;
use codex_app_server_protocol::ThreadItem;
use codex_app_server_protocol::ThreadStartParams;
use codex_app_server_protocol::ThreadStartResponse;
use codex_app_server_protocol::TurnCompletedNotification;
use codex_app_server_protocol::TurnStartParams;
use codex_app_server_protocol::TurnStartResponse;
use codex_app_server_protocol::TurnStartedNotification;
use codex_app_server_protocol::TurnStatus;
use codex_app_server_protocol::UserInput as V2UserInput;
use codex_core::protocol_config_types::ReasoningEffort;
use codex_core::protocol_config_types::ReasoningSummary;
@@ -57,7 +67,7 @@ async fn turn_start_emits_notifications_and_accepts_model_override() -> Result<(
mcp.read_stream_until_response_message(RequestId::Integer(thread_req)),
)
.await??;
let ThreadStartResponse { thread } = to_response::<ThreadStartResponse>(thread_resp)?;
let ThreadStartResponse { thread, .. } = to_response::<ThreadStartResponse>(thread_resp)?;
// Start a turn with only input and thread_id set (no overrides).
let turn_req = mcp
@@ -118,13 +128,17 @@ async fn turn_start_emits_notifications_and_accepts_model_override() -> Result<(
)
.await??;
// And we should ultimately get a task_complete without having to add a
// legacy conversation listener explicitly (auto-attached by thread/start).
let _task_complete: JSONRPCNotification = timeout(
let completed_notif: JSONRPCNotification = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("codex/event/task_complete"),
mcp.read_stream_until_notification_message("turn/completed"),
)
.await??;
let completed: TurnCompletedNotification = serde_json::from_value(
completed_notif
.params
.expect("turn/completed params must be present"),
)?;
assert_eq!(completed.turn.status, TurnStatus::Completed);
Ok(())
}
@@ -157,7 +171,7 @@ async fn turn_start_accepts_local_image_input() -> Result<()> {
mcp.read_stream_until_response_message(RequestId::Integer(thread_req)),
)
.await??;
let ThreadStartResponse { thread } = to_response::<ThreadStartResponse>(thread_resp)?;
let ThreadStartResponse { thread, .. } = to_response::<ThreadStartResponse>(thread_resp)?;
let image_path = codex_home.path().join("image.png");
// No need to actually write the file; we just exercise the input path.
@@ -191,7 +205,7 @@ async fn turn_start_exec_approval_toggle_v2() -> Result<()> {
// Mock server: first turn requests a shell call (elicitation), then completes.
// Second turn same, but we'll set approval_policy=never to avoid elicitation.
let responses = vec![
create_shell_sse_response(
create_shell_command_sse_response(
vec![
"python3".to_string(),
"-c".to_string(),
@@ -202,7 +216,7 @@ async fn turn_start_exec_approval_toggle_v2() -> Result<()> {
"call1",
)?,
create_final_assistant_message_sse_response("done 1")?,
create_shell_sse_response(
create_shell_command_sse_response(
vec![
"python3".to_string(),
"-c".to_string(),
@@ -233,7 +247,7 @@ async fn turn_start_exec_approval_toggle_v2() -> Result<()> {
mcp.read_stream_until_response_message(RequestId::Integer(start_id)),
)
.await??;
let ThreadStartResponse { thread } = to_response::<ThreadStartResponse>(start_resp)?;
let ThreadStartResponse { thread, .. } = to_response::<ThreadStartResponse>(start_resp)?;
// turn/start — expect CommandExecutionRequestApproval request from server
let first_turn_id = mcp
@@ -274,6 +288,11 @@ async fn turn_start_exec_approval_toggle_v2() -> Result<()> {
mcp.read_stream_until_notification_message("codex/event/task_complete"),
)
.await??;
timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("turn/completed"),
)
.await??;
// Second turn with approval_policy=never should not elicit approval
let second_turn_id = mcp
@@ -297,6 +316,150 @@ async fn turn_start_exec_approval_toggle_v2() -> Result<()> {
.await??;
// Ensure we do NOT receive a CommandExecutionRequestApproval request before task completes
timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("codex/event/task_complete"),
)
.await??;
timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("turn/completed"),
)
.await??;
Ok(())
}
#[tokio::test]
async fn turn_start_exec_approval_decline_v2() -> Result<()> {
skip_if_no_network!(Ok(()));
let tmp = TempDir::new()?;
let codex_home = tmp.path().to_path_buf();
let workspace = tmp.path().join("workspace");
std::fs::create_dir(&workspace)?;
let responses = vec![
create_shell_command_sse_response(
vec![
"python3".to_string(),
"-c".to_string(),
"print(42)".to_string(),
],
None,
Some(5000),
"call-decline",
)?,
create_final_assistant_message_sse_response("done")?,
];
let server = create_mock_chat_completions_server(responses).await;
create_config_toml(codex_home.as_path(), &server.uri(), "untrusted")?;
let mut mcp = McpProcess::new(codex_home.as_path()).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
let start_id = mcp
.send_thread_start_request(ThreadStartParams {
model: Some("mock-model".to_string()),
..Default::default()
})
.await?;
let start_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(start_id)),
)
.await??;
let ThreadStartResponse { thread, .. } = to_response::<ThreadStartResponse>(start_resp)?;
let turn_id = mcp
.send_turn_start_request(TurnStartParams {
thread_id: thread.id.clone(),
input: vec![V2UserInput::Text {
text: "run python".to_string(),
}],
cwd: Some(workspace.clone()),
..Default::default()
})
.await?;
let turn_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(turn_id)),
)
.await??;
let TurnStartResponse { turn } = to_response::<TurnStartResponse>(turn_resp)?;
let started_command_execution = timeout(DEFAULT_READ_TIMEOUT, async {
loop {
let started_notif = mcp
.read_stream_until_notification_message("item/started")
.await?;
let started: ItemStartedNotification =
serde_json::from_value(started_notif.params.clone().expect("item/started params"))?;
if let ThreadItem::CommandExecution { .. } = started.item {
return Ok::<ThreadItem, anyhow::Error>(started.item);
}
}
})
.await??;
let ThreadItem::CommandExecution { id, status, .. } = started_command_execution else {
unreachable!("loop ensures we break on command execution items");
};
assert_eq!(id, "call-decline");
assert_eq!(status, CommandExecutionStatus::InProgress);
let server_req = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_request_message(),
)
.await??;
let ServerRequest::CommandExecutionRequestApproval { request_id, params } = server_req else {
panic!("expected CommandExecutionRequestApproval request")
};
assert_eq!(params.item_id, "call-decline");
assert_eq!(params.thread_id, thread.id);
assert_eq!(params.turn_id, turn.id);
mcp.send_response(
request_id,
serde_json::to_value(CommandExecutionRequestApprovalResponse {
decision: ApprovalDecision::Decline,
accept_settings: None,
})?,
)
.await?;
let completed_command_execution = timeout(DEFAULT_READ_TIMEOUT, async {
loop {
let completed_notif = mcp
.read_stream_until_notification_message("item/completed")
.await?;
let completed: ItemCompletedNotification = serde_json::from_value(
completed_notif
.params
.clone()
.expect("item/completed params"),
)?;
if let ThreadItem::CommandExecution { .. } = completed.item {
return Ok::<ThreadItem, anyhow::Error>(completed.item);
}
}
})
.await??;
let ThreadItem::CommandExecution {
id,
status,
exit_code,
aggregated_output,
..
} = completed_command_execution
else {
unreachable!("loop ensures we break on command execution items");
};
assert_eq!(id, "call-decline");
assert_eq!(status, CommandExecutionStatus::Declined);
assert!(exit_code.is_none());
assert!(aggregated_output.is_none());
timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("codex/event/task_complete"),
@@ -321,23 +484,15 @@ async fn turn_start_updates_sandbox_and_cwd_between_turns_v2() -> Result<()> {
std::fs::create_dir(&second_cwd)?;
let responses = vec![
create_shell_sse_response(
vec![
"bash".to_string(),
"-lc".to_string(),
"echo first turn".to_string(),
],
create_shell_command_sse_response(
vec!["echo".to_string(), "first".to_string(), "turn".to_string()],
None,
Some(5000),
"call-first",
)?,
create_final_assistant_message_sse_response("done first")?,
create_shell_sse_response(
vec![
"bash".to_string(),
"-lc".to_string(),
"echo second turn".to_string(),
],
create_shell_command_sse_response(
vec!["echo".to_string(), "second".to_string(), "turn".to_string()],
None,
Some(5000),
"call-second",
@@ -362,7 +517,7 @@ async fn turn_start_updates_sandbox_and_cwd_between_turns_v2() -> Result<()> {
mcp.read_stream_until_response_message(RequestId::Integer(start_id)),
)
.await??;
let ThreadStartResponse { thread } = to_response::<ThreadStartResponse>(start_resp)?;
let ThreadStartResponse { thread, .. } = to_response::<ThreadStartResponse>(start_resp)?;
// first turn with workspace-write sandbox and first_cwd
let first_turn = mcp
@@ -443,7 +598,8 @@ async fn turn_start_updates_sandbox_and_cwd_between_turns_v2() -> Result<()> {
unreachable!("loop ensures we break on command execution items");
};
assert_eq!(cwd, second_cwd);
assert_eq!(command, "bash -lc 'echo second turn'");
let expected_command = format_with_current_shell_display("echo second turn");
assert_eq!(command, expected_command);
assert_eq!(status, CommandExecutionStatus::InProgress);
timeout(
@@ -455,6 +611,308 @@ async fn turn_start_updates_sandbox_and_cwd_between_turns_v2() -> Result<()> {
Ok(())
}
#[tokio::test]
async fn turn_start_file_change_approval_v2() -> Result<()> {
skip_if_no_network!(Ok(()));
if cfg!(windows) {
// TODO apply_patch approvals are not parsed from powershell commands yet
return Ok(());
}
let tmp = TempDir::new()?;
let codex_home = tmp.path().join("codex_home");
std::fs::create_dir(&codex_home)?;
let workspace = tmp.path().join("workspace");
std::fs::create_dir(&workspace)?;
let patch = r#"*** Begin Patch
*** Add File: README.md
+new line
*** End Patch
"#;
let responses = vec![
create_apply_patch_sse_response(patch, "patch-call")?,
create_final_assistant_message_sse_response("patch applied")?,
];
let server = create_mock_chat_completions_server(responses).await;
create_config_toml(&codex_home, &server.uri(), "untrusted")?;
let mut mcp = McpProcess::new(&codex_home).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
let start_req = mcp
.send_thread_start_request(ThreadStartParams {
model: Some("mock-model".to_string()),
cwd: Some(workspace.to_string_lossy().into_owned()),
..Default::default()
})
.await?;
let start_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(start_req)),
)
.await??;
let ThreadStartResponse { thread, .. } = to_response::<ThreadStartResponse>(start_resp)?;
let turn_req = mcp
.send_turn_start_request(TurnStartParams {
thread_id: thread.id.clone(),
input: vec![V2UserInput::Text {
text: "apply patch".into(),
}],
cwd: Some(workspace.clone()),
..Default::default()
})
.await?;
let turn_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(turn_req)),
)
.await??;
let TurnStartResponse { turn } = to_response::<TurnStartResponse>(turn_resp)?;
let started_file_change = timeout(DEFAULT_READ_TIMEOUT, async {
loop {
let started_notif = mcp
.read_stream_until_notification_message("item/started")
.await?;
let started: ItemStartedNotification =
serde_json::from_value(started_notif.params.clone().expect("item/started params"))?;
if let ThreadItem::FileChange { .. } = started.item {
return Ok::<ThreadItem, anyhow::Error>(started.item);
}
}
})
.await??;
let ThreadItem::FileChange {
ref id,
status,
ref changes,
} = started_file_change
else {
unreachable!("loop ensures we break on file change items");
};
assert_eq!(id, "patch-call");
assert_eq!(status, PatchApplyStatus::InProgress);
let started_changes = changes.clone();
let server_req = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_request_message(),
)
.await??;
let ServerRequest::FileChangeRequestApproval { request_id, params } = server_req else {
panic!("expected FileChangeRequestApproval request")
};
assert_eq!(params.item_id, "patch-call");
assert_eq!(params.thread_id, thread.id);
assert_eq!(params.turn_id, turn.id);
let expected_readme_path = workspace.join("README.md");
let expected_readme_path = expected_readme_path.to_string_lossy().into_owned();
pretty_assertions::assert_eq!(
started_changes,
vec![codex_app_server_protocol::FileUpdateChange {
path: expected_readme_path.clone(),
kind: PatchChangeKind::Add,
diff: "new line\n".to_string(),
}]
);
mcp.send_response(
request_id,
serde_json::to_value(FileChangeRequestApprovalResponse {
decision: ApprovalDecision::Accept,
})?,
)
.await?;
let completed_file_change = timeout(DEFAULT_READ_TIMEOUT, async {
loop {
let completed_notif = mcp
.read_stream_until_notification_message("item/completed")
.await?;
let completed: ItemCompletedNotification = serde_json::from_value(
completed_notif
.params
.clone()
.expect("item/completed params"),
)?;
if let ThreadItem::FileChange { .. } = completed.item {
return Ok::<ThreadItem, anyhow::Error>(completed.item);
}
}
})
.await??;
let ThreadItem::FileChange { ref id, status, .. } = completed_file_change else {
unreachable!("loop ensures we break on file change items");
};
assert_eq!(id, "patch-call");
assert_eq!(status, PatchApplyStatus::Completed);
timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("codex/event/task_complete"),
)
.await??;
let readme_contents = std::fs::read_to_string(expected_readme_path)?;
assert_eq!(readme_contents, "new line\n");
Ok(())
}
#[tokio::test]
async fn turn_start_file_change_approval_decline_v2() -> Result<()> {
skip_if_no_network!(Ok(()));
if cfg!(windows) {
// TODO apply_patch approvals are not parsed from powershell commands yet
return Ok(());
}
let tmp = TempDir::new()?;
let codex_home = tmp.path().join("codex_home");
std::fs::create_dir(&codex_home)?;
let workspace = tmp.path().join("workspace");
std::fs::create_dir(&workspace)?;
let patch = r#"*** Begin Patch
*** Add File: README.md
+new line
*** End Patch
"#;
let responses = vec![
create_apply_patch_sse_response(patch, "patch-call")?,
create_final_assistant_message_sse_response("patch declined")?,
];
let server = create_mock_chat_completions_server(responses).await;
create_config_toml(&codex_home, &server.uri(), "untrusted")?;
let mut mcp = McpProcess::new(&codex_home).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
let start_req = mcp
.send_thread_start_request(ThreadStartParams {
model: Some("mock-model".to_string()),
cwd: Some(workspace.to_string_lossy().into_owned()),
..Default::default()
})
.await?;
let start_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(start_req)),
)
.await??;
let ThreadStartResponse { thread, .. } = to_response::<ThreadStartResponse>(start_resp)?;
let turn_req = mcp
.send_turn_start_request(TurnStartParams {
thread_id: thread.id.clone(),
input: vec![V2UserInput::Text {
text: "apply patch".into(),
}],
cwd: Some(workspace.clone()),
..Default::default()
})
.await?;
let turn_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(turn_req)),
)
.await??;
let TurnStartResponse { turn } = to_response::<TurnStartResponse>(turn_resp)?;
let started_file_change = timeout(DEFAULT_READ_TIMEOUT, async {
loop {
let started_notif = mcp
.read_stream_until_notification_message("item/started")
.await?;
let started: ItemStartedNotification =
serde_json::from_value(started_notif.params.clone().expect("item/started params"))?;
if let ThreadItem::FileChange { .. } = started.item {
return Ok::<ThreadItem, anyhow::Error>(started.item);
}
}
})
.await??;
let ThreadItem::FileChange {
ref id,
status,
ref changes,
} = started_file_change
else {
unreachable!("loop ensures we break on file change items");
};
assert_eq!(id, "patch-call");
assert_eq!(status, PatchApplyStatus::InProgress);
let started_changes = changes.clone();
let server_req = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_request_message(),
)
.await??;
let ServerRequest::FileChangeRequestApproval { request_id, params } = server_req else {
panic!("expected FileChangeRequestApproval request")
};
assert_eq!(params.item_id, "patch-call");
assert_eq!(params.thread_id, thread.id);
assert_eq!(params.turn_id, turn.id);
let expected_readme_path = workspace.join("README.md");
let expected_readme_path_str = expected_readme_path.to_string_lossy().into_owned();
pretty_assertions::assert_eq!(
started_changes,
vec![codex_app_server_protocol::FileUpdateChange {
path: expected_readme_path_str.clone(),
kind: PatchChangeKind::Add,
diff: "new line\n".to_string(),
}]
);
mcp.send_response(
request_id,
serde_json::to_value(FileChangeRequestApprovalResponse {
decision: ApprovalDecision::Decline,
})?,
)
.await?;
let completed_file_change = timeout(DEFAULT_READ_TIMEOUT, async {
loop {
let completed_notif = mcp
.read_stream_until_notification_message("item/completed")
.await?;
let completed: ItemCompletedNotification = serde_json::from_value(
completed_notif
.params
.clone()
.expect("item/completed params"),
)?;
if let ThreadItem::FileChange { .. } = completed.item {
return Ok::<ThreadItem, anyhow::Error>(completed.item);
}
}
})
.await??;
let ThreadItem::FileChange { ref id, status, .. } = completed_file_change else {
unreachable!("loop ensures we break on file change items");
};
assert_eq!(id, "patch-call");
assert_eq!(status, PatchApplyStatus::Declined);
timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("codex/event/task_complete"),
)
.await??;
assert!(
!expected_readme_path.exists(),
"declined patch should not be applied"
);
Ok(())
}
// Helper to create a config.toml pointing at the mock model server.
fn create_config_toml(
codex_home: &Path,

@@ -30,6 +30,7 @@ pub use standalone_executable::main;
pub const APPLY_PATCH_TOOL_INSTRUCTIONS: &str = include_str!("../apply_patch_tool_instructions.md");
const APPLY_PATCH_COMMANDS: [&str; 2] = ["apply_patch", "applypatch"];
const APPLY_PATCH_SHELLS: [&str; 3] = ["bash", "zsh", "sh"];
#[derive(Debug, Error, PartialEq)]
pub enum ApplyPatchError {
@@ -96,6 +97,13 @@ pub struct ApplyPatchArgs {
pub workdir: Option<String>,
}
fn shell_supports_apply_patch(shell: &str) -> bool {
std::path::Path::new(shell)
.file_name()
.and_then(|name| name.to_str())
.is_some_and(|name| APPLY_PATCH_SHELLS.contains(&name))
}
pub fn maybe_parse_apply_patch(argv: &[String]) -> MaybeApplyPatch {
match argv {
// Direct invocation: apply_patch <patch>
@@ -104,7 +112,7 @@ pub fn maybe_parse_apply_patch(argv: &[String]) -> MaybeApplyPatch {
Err(e) => MaybeApplyPatch::PatchParseError(e),
},
// Bash heredoc form: (optional `cd <path> &&`) apply_patch <<'EOF' ...
[bash, flag, script] if bash == "bash" && flag == "-lc" => {
[shell, flag, script] if shell_supports_apply_patch(shell) && flag == "-lc" => {
match extract_apply_patch_from_bash(script) {
Ok((body, workdir)) => match parse_patch(&body) {
Ok(mut source) => {
@@ -224,12 +232,12 @@ pub fn maybe_parse_apply_patch_verified(argv: &[String], cwd: &Path) -> MaybeApp
);
}
}
[bash, flag, script] if bash == "bash" && flag == "-lc" => {
if parse_patch(script).is_ok() {
return MaybeApplyPatchVerified::CorrectnessError(
ApplyPatchError::ImplicitInvocation,
);
}
[shell, flag, script]
if shell_supports_apply_patch(shell)
&& flag == "-lc"
&& parse_patch(script).is_ok() =>
{
return MaybeApplyPatchVerified::CorrectnessError(ApplyPatchError::ImplicitInvocation);
}
_ => {}
}
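The basename check introduced in `shell_supports_apply_patch` can be exercised in isolation. The following is a minimal sketch that assumes only the standard library; the `main` function and the sample paths are illustrative and are not part of this change.

```rust
use std::path::Path;

// Same allow-list as in the diff above.
const APPLY_PATCH_SHELLS: [&str; 3] = ["bash", "zsh", "sh"];

// Accepts a bare shell name or a path whose final component is an allowed shell.
fn shell_supports_apply_patch(shell: &str) -> bool {
    Path::new(shell)
        .file_name()
        .and_then(|name| name.to_str())
        .is_some_and(|name| APPLY_PATCH_SHELLS.contains(&name))
}

fn main() {
    // Bare names and absolute paths both match on the final path component.
    assert!(shell_supports_apply_patch("bash"));
    assert!(shell_supports_apply_patch("/bin/zsh"));
    // Shells outside the allow-list are rejected, including suffixed names.
    assert!(!shell_supports_apply_patch("fish"));
    assert!(!shell_supports_apply_patch("bash.exe"));
}
```

Matching on the final path component rather than the whole string is what lets `/bin/zsh -lc ...` take the same heredoc branch that previously required the literal `bash`.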

@@ -1,4 +1,5 @@
use crate::types::CodeTaskDetailsResponse;
use crate::types::CreditStatusDetails;
use crate::types::PaginatedListTaskListItem;
use crate::types::RateLimitStatusPayload;
use crate::types::RateLimitWindowSnapshot;
@@ -6,6 +7,7 @@ use crate::types::TurnAttemptsSiblingTurnsResponse;
use anyhow::Result;
use codex_core::auth::CodexAuth;
use codex_core::default_client::get_codex_user_agent;
use codex_protocol::protocol::CreditsSnapshot;
use codex_protocol::protocol::RateLimitSnapshot;
use codex_protocol::protocol::RateLimitWindow;
use reqwest::header::AUTHORIZATION;
@@ -272,19 +274,23 @@ impl Client {
// rate limit helpers
fn rate_limit_snapshot_from_payload(payload: RateLimitStatusPayload) -> RateLimitSnapshot {
let Some(details) = payload
let rate_limit_details = payload
.rate_limit
.and_then(|inner| inner.map(|boxed| *boxed))
else {
return RateLimitSnapshot {
primary: None,
secondary: None,
};
.and_then(|inner| inner.map(|boxed| *boxed));
let (primary, secondary) = if let Some(details) = rate_limit_details {
(
Self::map_rate_limit_window(details.primary_window),
Self::map_rate_limit_window(details.secondary_window),
)
} else {
(None, None)
};
RateLimitSnapshot {
primary: Self::map_rate_limit_window(details.primary_window),
secondary: Self::map_rate_limit_window(details.secondary_window),
primary,
secondary,
credits: Self::map_credits(payload.credits),
}
}
@@ -306,6 +312,19 @@ impl Client {
})
}
fn map_credits(credits: Option<Option<Box<CreditStatusDetails>>>) -> Option<CreditsSnapshot> {
let details = match credits {
Some(Some(details)) => *details,
_ => return None,
};
Some(CreditsSnapshot {
has_credits: details.has_credits,
unlimited: details.unlimited,
balance: details.balance.and_then(|inner| inner),
})
}
fn window_minutes_from_seconds(seconds: i32) -> Option<i64> {
if seconds <= 0 {
return None;
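The nested-`Option` handling in `map_credits` treats "field absent" and "field present but null" the same way. A minimal sketch of that behavior follows, assuming simplified stand-in structs (the real `CreditStatusDetails` and `CreditsSnapshot` live in other crates) and using `Option::flatten` as an equivalent spelling of the `match` and `and_then(|inner| inner)` shown in the diff.

```rust
// Hypothetical stand-ins for the generated model and the protocol snapshot.
#[derive(Debug, Default)]
struct CreditStatusDetails {
    has_credits: bool,
    unlimited: bool,
    balance: Option<Option<String>>,
}

#[derive(Debug, PartialEq)]
struct CreditsSnapshot {
    has_credits: bool,
    unlimited: bool,
    balance: Option<String>,
}

fn map_credits(credits: Option<Option<Box<CreditStatusDetails>>>) -> Option<CreditsSnapshot> {
    // `flatten` collapses both `None` (absent) and `Some(None)` (null) to `None`.
    let details = *credits.flatten()?;
    Some(CreditsSnapshot {
        has_credits: details.has_credits,
        unlimited: details.unlimited,
        balance: details.balance.flatten(),
    })
}

fn main() {
    assert_eq!(map_credits(None), None);
    assert_eq!(map_credits(Some(None)), None);

    let details = CreditStatusDetails {
        has_credits: true,
        unlimited: false,
        balance: Some(Some("12.50".to_string())),
    };
    let snapshot = map_credits(Some(Some(Box::new(details)))).expect("credits present");
    assert_eq!(snapshot.balance.as_deref(), Some("12.50"));
}
```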

@@ -1,3 +1,4 @@
pub use codex_backend_openapi_models::models::CreditStatusDetails;
pub use codex_backend_openapi_models::models::PaginatedListTaskListItem;
pub use codex_backend_openapi_models::models::PlanType;
pub use codex_backend_openapi_models::models::RateLimitStatusDetails;

@@ -26,6 +26,7 @@ codex-cloud-tasks = { path = "../cloud-tasks" }
codex-common = { workspace = true, features = ["cli"] }
codex-core = { workspace = true }
codex-exec = { workspace = true }
codex-execpolicy = { workspace = true }
codex-login = { workspace = true }
codex-mcp-server = { workspace = true }
codex-process-hardening = { workspace = true }

@@ -138,11 +138,7 @@ async fn run_command_under_sandbox(
{
use codex_windows_sandbox::run_windows_sandbox_capture;
let policy_str = match &config.sandbox_policy {
codex_core::protocol::SandboxPolicy::DangerFullAccess => "workspace-write",
codex_core::protocol::SandboxPolicy::ReadOnly => "read-only",
codex_core::protocol::SandboxPolicy::WorkspaceWrite { .. } => "workspace-write",
};
let policy_str = serde_json::to_string(&config.sandbox_policy)?;
let sandbox_cwd = sandbox_policy_cwd.clone();
let cwd_clone = cwd.clone();
@@ -153,7 +149,7 @@ async fn run_command_under_sandbox(
// Preflight audit is invoked elsewhere at the appropriate times.
let res = tokio::task::spawn_blocking(move || {
run_windows_sandbox_capture(
policy_str,
policy_str.as_str(),
&sandbox_cwd,
base_dir.as_path(),
command_vec,

@@ -18,6 +18,7 @@ use codex_cli::login::run_logout;
use codex_cloud_tasks::Cli as CloudTasksCli;
use codex_common::CliConfigOverrides;
use codex_exec::Cli as ExecCli;
use codex_execpolicy::ExecPolicyCheckCommand;
use codex_responses_api_proxy::Args as ResponsesApiProxyArgs;
use codex_tui::AppExitInfo;
use codex_tui::Cli as TuiCli;
@@ -93,6 +94,10 @@ enum Subcommand {
#[clap(visible_alias = "debug")]
Sandbox(SandboxArgs),
/// Execpolicy tooling.
#[clap(hide = true)]
Execpolicy(ExecpolicyCommand),
/// Apply the latest diff produced by Codex agent as a `git apply` to your local working tree.
#[clap(visible_alias = "a")]
Apply(ApplyCommand),
@@ -134,6 +139,10 @@ struct ResumeCommand {
#[arg(long = "last", default_value_t = false, conflicts_with = "session_id")]
last: bool,
/// Show all sessions (disables cwd filtering and shows CWD column).
#[arg(long = "all", default_value_t = false)]
all: bool,
#[clap(flatten)]
config_overrides: TuiCli,
}
@@ -158,6 +167,19 @@ enum SandboxCommand {
Windows(WindowsCommand),
}
#[derive(Debug, Parser)]
struct ExecpolicyCommand {
#[command(subcommand)]
sub: ExecpolicySubcommand,
}
#[derive(Debug, clap::Subcommand)]
enum ExecpolicySubcommand {
/// Check execpolicy files against a command.
#[clap(name = "check")]
Check(ExecPolicyCheckCommand),
}
#[derive(Debug, Parser)]
struct LoginCommand {
#[clap(skip)]
@@ -323,6 +345,10 @@ fn run_update_action(action: UpdateAction) -> anyhow::Result<()> {
Ok(())
}
fn run_execpolicycheck(cmd: ExecPolicyCheckCommand) -> anyhow::Result<()> {
cmd.run()
}
#[derive(Debug, Default, Parser, Clone)]
struct FeatureToggles {
/// Enable a feature (repeatable). Equivalent to `-c features.<name>=true`.
@@ -448,6 +474,7 @@ async fn cli_main(codex_linux_sandbox_exe: Option<PathBuf>) -> anyhow::Result<()
Some(Subcommand::Resume(ResumeCommand {
session_id,
last,
all,
config_overrides,
})) => {
interactive = finalize_resume_interactive(
@@ -455,6 +482,7 @@ async fn cli_main(codex_linux_sandbox_exe: Option<PathBuf>) -> anyhow::Result<()
root_config_overrides.clone(),
session_id,
last,
all,
config_overrides,
);
let exit_info = codex_tui::run_main(interactive, codex_linux_sandbox_exe).await?;
@@ -543,6 +571,9 @@ async fn cli_main(codex_linux_sandbox_exe: Option<PathBuf>) -> anyhow::Result<()
.await?;
}
},
Some(Subcommand::Execpolicy(ExecpolicyCommand { sub })) => match sub {
ExecpolicySubcommand::Check(cmd) => run_execpolicycheck(cmd)?,
},
Some(Subcommand::Apply(mut apply_cli)) => {
prepend_config_flags(
&mut apply_cli.config_overrides,
@@ -611,6 +642,7 @@ fn finalize_resume_interactive(
root_config_overrides: CliConfigOverrides,
session_id: Option<String>,
last: bool,
show_all: bool,
resume_cli: TuiCli,
) -> TuiCli {
// Start with the parsed interactive CLI so resume shares the same
@@ -619,6 +651,7 @@ fn finalize_resume_interactive(
interactive.resume_picker = resume_session_id.is_none() && !last;
interactive.resume_last = last;
interactive.resume_session_id = resume_session_id;
interactive.resume_show_all = show_all;
// Merge resume-scoped flags and overrides with highest precedence.
merge_resume_cli_flags(&mut interactive, resume_cli);
@@ -702,13 +735,21 @@ mod tests {
let Subcommand::Resume(ResumeCommand {
session_id,
last,
all,
config_overrides: resume_cli,
}) = subcommand.expect("resume present")
else {
unreachable!()
};
finalize_resume_interactive(interactive, root_overrides, session_id, last, resume_cli)
finalize_resume_interactive(
interactive,
root_overrides,
session_id,
last,
all,
resume_cli,
)
}
fn sample_exit_info(conversation: Option<&str>) -> AppExitInfo {
@@ -775,6 +816,7 @@ mod tests {
assert!(interactive.resume_picker);
assert!(!interactive.resume_last);
assert_eq!(interactive.resume_session_id, None);
assert!(!interactive.resume_show_all);
}
#[test]
@@ -783,6 +825,7 @@ mod tests {
assert!(!interactive.resume_picker);
assert!(interactive.resume_last);
assert_eq!(interactive.resume_session_id, None);
assert!(!interactive.resume_show_all);
}
#[test]
@@ -791,6 +834,14 @@ mod tests {
assert!(!interactive.resume_picker);
assert!(!interactive.resume_last);
assert_eq!(interactive.resume_session_id.as_deref(), Some("1234"));
assert!(!interactive.resume_show_all);
}
#[test]
fn resume_all_flag_sets_show_all() {
let interactive = finalize_from_args(["codex", "resume", "--all"].as_ref());
assert!(interactive.resume_picker);
assert!(interactive.resume_show_all);
}
#[test]

@@ -79,6 +79,7 @@ pub struct GetArgs {
}
#[derive(Debug, clap::Parser)]
#[command(override_usage = "codex mcp add [OPTIONS] <NAME> (--url <URL> | -- <COMMAND>...)")]
pub struct AddArgs {
/// Name for the MCP server configuration.
pub name: String,

@@ -0,0 +1,58 @@
use std::fs;
use assert_cmd::Command;
use pretty_assertions::assert_eq;
use serde_json::json;
use tempfile::TempDir;
#[test]
fn execpolicy_check_matches_expected_json() -> Result<(), Box<dyn std::error::Error>> {
let codex_home = TempDir::new()?;
let policy_path = codex_home.path().join("policy.codexpolicy");
fs::write(
&policy_path,
r#"
prefix_rule(
pattern = ["git", "push"],
decision = "forbidden",
)
"#,
)?;
let output = Command::cargo_bin("codex")?
.env("CODEX_HOME", codex_home.path())
.args([
"execpolicy",
"check",
"--policy",
policy_path
.to_str()
.expect("policy path should be valid UTF-8"),
"git",
"push",
"origin",
"main",
])
.output()?;
assert!(output.status.success());
let result: serde_json::Value = serde_json::from_slice(&output.stdout)?;
assert_eq!(
result,
json!({
"match": {
"decision": "forbidden",
"matchedRules": [
{
"prefixRuleMatch": {
"matchedPrefix": ["git", "push"],
"decision": "forbidden"
}
}
]
}
})
);
Ok(())
}

@@ -0,0 +1,52 @@
/*
* codex-backend
*
* codex-backend
*
* The version of the OpenAPI document: 0.0.1
*
* Generated by: https://openapi-generator.tech
*/
use serde::Deserialize;
use serde::Serialize;
#[derive(Clone, Default, Debug, PartialEq, Serialize, Deserialize)]
pub struct CreditStatusDetails {
#[serde(rename = "has_credits")]
pub has_credits: bool,
#[serde(rename = "unlimited")]
pub unlimited: bool,
#[serde(
rename = "balance",
default,
with = "::serde_with::rust::double_option",
skip_serializing_if = "Option::is_none"
)]
pub balance: Option<Option<String>>,
#[serde(
rename = "approx_local_messages",
default,
with = "::serde_with::rust::double_option",
skip_serializing_if = "Option::is_none"
)]
pub approx_local_messages: Option<Option<Vec<serde_json::Value>>>,
#[serde(
rename = "approx_cloud_messages",
default,
with = "::serde_with::rust::double_option",
skip_serializing_if = "Option::is_none"
)]
pub approx_cloud_messages: Option<Option<Vec<serde_json::Value>>>,
}
impl CreditStatusDetails {
pub fn new(has_credits: bool, unlimited: bool) -> CreditStatusDetails {
CreditStatusDetails {
has_credits,
unlimited,
balance: None,
approx_local_messages: None,
approx_cloud_messages: None,
}
}
}

@@ -32,3 +32,6 @@ pub use self::rate_limit_status_details::RateLimitStatusDetails;
pub mod rate_limit_window_snapshot;
pub use self::rate_limit_window_snapshot::RateLimitWindowSnapshot;
pub mod credit_status_details;
pub use self::credit_status_details::CreditStatusDetails;

@@ -23,6 +23,13 @@ pub struct RateLimitStatusPayload {
skip_serializing_if = "Option::is_none"
)]
pub rate_limit: Option<Option<Box<models::RateLimitStatusDetails>>>,
#[serde(
rename = "credits",
default,
with = "::serde_with::rust::double_option",
skip_serializing_if = "Option::is_none"
)]
pub credits: Option<Option<Box<models::CreditStatusDetails>>>,
}
impl RateLimitStatusPayload {
@@ -30,12 +37,15 @@ impl RateLimitStatusPayload {
RateLimitStatusPayload {
plan_type,
rate_limit: None,
credits: None,
}
}
}
#[derive(Clone, Copy, Debug, Eq, PartialEq, Ord, PartialOrd, Hash, Serialize, Deserialize)]
pub enum PlanType {
#[serde(rename = "guest")]
Guest,
#[serde(rename = "free")]
Free,
#[serde(rename = "go")]
@@ -44,6 +54,8 @@ pub enum PlanType {
Plus,
#[serde(rename = "pro")]
Pro,
#[serde(rename = "free_workspace")]
FreeWorkspace,
#[serde(rename = "team")]
Team,
#[serde(rename = "business")]
@@ -52,6 +64,8 @@ pub enum PlanType {
Education,
#[serde(rename = "quorum")]
Quorum,
#[serde(rename = "k12")]
K12,
#[serde(rename = "enterprise")]
Enterprise,
#[serde(rename = "edu")]
@@ -60,6 +74,6 @@ pub enum PlanType {
impl Default for PlanType {
fn default() -> PlanType {
Self::Free
Self::Guest
}
}

@@ -7,7 +7,6 @@
*
* Generated by: https://openapi-generator.tech
*/
use serde::Deserialize;
use serde::Serialize;

@@ -24,21 +24,21 @@ pub fn builtin_approval_presets() -> Vec<ApprovalPreset> {
ApprovalPreset {
id: "read-only",
label: "Read Only",
description: "Codex can read files and answer questions. Codex requires approval to make edits, run commands, or access network.",
description: "Requires approval to edit files and run commands.",
approval: AskForApproval::OnRequest,
sandbox: SandboxPolicy::ReadOnly,
},
ApprovalPreset {
id: "auto",
label: "Auto",
description: "Codex can read files, make edits, and run commands in the workspace. Codex requires approval to work outside the workspace or access network.",
label: "Agent",
description: "Read and edit files, and run commands.",
approval: AskForApproval::OnRequest,
sandbox: SandboxPolicy::new_workspace_write_policy(),
},
ApprovalPreset {
id: "full-access",
label: "Full Access",
description: "Codex can read files, make edits, and run commands with network access, without approval. Exercise caution.",
label: "Agent (full access)",
description: "Codex can edit files outside this workspace and run commands with network access. Exercise caution when using.",
approval: AskForApproval::Never,
sandbox: SandboxPolicy::DangerFullAccess,
},

View File

@@ -15,13 +15,12 @@ pub fn create_config_summary_entries(config: &Config) -> Vec<(&'static str, Stri
if config.model_provider.wire_api == WireApi::Responses
&& config.model_family.supports_reasoning_summaries
{
entries.push((
"reasoning effort",
config
.model_reasoning_effort
.map(|effort| effort.to_string())
.unwrap_or_else(|| "none".to_string()),
));
let reasoning_effort = config
.model_reasoning_effort
.or(config.model_family.default_reasoning_effort)
.map(|effort| effort.to_string())
.unwrap_or_else(|| "none".to_string());
entries.push(("reasoning effort", reasoning_effort));
entries.push((
"reasoning summaries",
config.model_reasoning_summary.to_string(),
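
The new fallback chain in this hunk (explicit config value, then the model family default, then the literal "none") can be sketched in isolation. This is a minimal sketch, not the project's code: the `ReasoningEffort` enum, its `Display` strings, and the helper name `reasoning_effort_label` are simplified stand-ins assumed for illustration.

```rust
use std::fmt;

// Simplified stand-in for the real ReasoningEffort type (assumption).
#[derive(Debug, Clone, Copy, PartialEq)]
enum ReasoningEffort {
    Low,
    Medium,
    High,
}

impl fmt::Display for ReasoningEffort {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let label = match self {
            ReasoningEffort::Low => "low",
            ReasoningEffort::Medium => "medium",
            ReasoningEffort::High => "high",
        };
        f.write_str(label)
    }
}

// Hypothetical helper: prefer the configured effort, fall back to the
// model family's default, and report "none" only when both are absent.
fn reasoning_effort_label(
    configured: Option<ReasoningEffort>,
    family_default: Option<ReasoningEffort>,
) -> String {
    configured
        .or(family_default)
        .map(|effort| effort.to_string())
        .unwrap_or_else(|| "none".to_string())
}
```

With this shape, a session that sets no effort reports the family default rather than "none", which appears to be the intent of the change.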

View File

@@ -4,6 +4,10 @@ use codex_app_server_protocol::AuthMode;
use codex_core::protocol_config_types::ReasoningEffort;
use once_cell::sync::Lazy;
pub const HIDE_GPT5_1_MIGRATION_PROMPT_CONFIG: &str = "hide_gpt5_1_migration_prompt";
pub const HIDE_GPT_5_1_CODEX_MAX_MIGRATION_PROMPT_CONFIG: &str =
"hide_gpt-5.1-codex-max_migration_prompt";
/// A reasoning effort option that can be surfaced for a model.
#[derive(Debug, Clone, Copy)]
pub struct ReasoningEffortPreset {
@@ -17,6 +21,7 @@ pub struct ReasoningEffortPreset {
pub struct ModelUpgrade {
pub id: &'static str,
pub reasoning_effort_mapping: Option<HashMap<ReasoningEffort, ReasoningEffort>>,
pub migration_config_key: &'static str,
}
/// Metadata describing a Codex-supported model.
@@ -38,10 +43,40 @@ pub struct ModelPreset {
pub is_default: bool,
/// recommended upgrade model
pub upgrade: Option<ModelUpgrade>,
/// Whether this preset should appear in the picker UI.
pub show_in_picker: bool,
}
static PRESETS: Lazy<Vec<ModelPreset>> = Lazy::new(|| {
vec![
ModelPreset {
id: "gpt-5.1-codex-max",
model: "gpt-5.1-codex-max",
display_name: "gpt-5.1-codex-max",
description: "Latest Codex-optimized flagship for deep and fast reasoning.",
default_reasoning_effort: ReasoningEffort::Medium,
supported_reasoning_efforts: &[
ReasoningEffortPreset {
effort: ReasoningEffort::Low,
description: "Fast responses with lighter reasoning",
},
ReasoningEffortPreset {
effort: ReasoningEffort::Medium,
description: "Balances speed and reasoning depth for everyday tasks",
},
ReasoningEffortPreset {
effort: ReasoningEffort::High,
description: "Maximizes reasoning depth for complex problems",
},
ReasoningEffortPreset {
effort: ReasoningEffort::XHigh,
description: "Extra high reasoning depth for complex problems",
},
],
is_default: true,
upgrade: None,
show_in_picker: true,
},
ModelPreset {
id: "gpt-5.1-codex",
model: "gpt-5.1-codex",
@@ -62,8 +97,13 @@ static PRESETS: Lazy<Vec<ModelPreset>> = Lazy::new(|| {
description: "Maximizes reasoning depth for complex or ambiguous problems",
},
],
is_default: true,
upgrade: None,
is_default: false,
upgrade: Some(ModelUpgrade {
id: "gpt-5.1-codex-max",
reasoning_effort_mapping: None,
migration_config_key: HIDE_GPT_5_1_CODEX_MAX_MIGRATION_PROMPT_CONFIG,
}),
show_in_picker: true,
},
ModelPreset {
id: "gpt-5.1-codex-mini",
@@ -82,7 +122,12 @@ static PRESETS: Lazy<Vec<ModelPreset>> = Lazy::new(|| {
},
],
is_default: false,
upgrade: None,
upgrade: Some(ModelUpgrade {
id: "gpt-5.1-codex-max",
reasoning_effort_mapping: None,
migration_config_key: HIDE_GPT_5_1_CODEX_MAX_MIGRATION_PROMPT_CONFIG,
}),
show_in_picker: true,
},
ModelPreset {
id: "gpt-5.1",
@@ -105,7 +150,12 @@ static PRESETS: Lazy<Vec<ModelPreset>> = Lazy::new(|| {
},
],
is_default: false,
upgrade: None,
upgrade: Some(ModelUpgrade {
id: "gpt-5.1-codex-max",
reasoning_effort_mapping: None,
migration_config_key: HIDE_GPT_5_1_CODEX_MAX_MIGRATION_PROMPT_CONFIG,
}),
show_in_picker: true,
},
// Deprecated models.
ModelPreset {
@@ -130,9 +180,11 @@ static PRESETS: Lazy<Vec<ModelPreset>> = Lazy::new(|| {
],
is_default: false,
upgrade: Some(ModelUpgrade {
id: "gpt-5.1-codex",
id: "gpt-5.1-codex-max",
reasoning_effort_mapping: None,
migration_config_key: HIDE_GPT_5_1_CODEX_MAX_MIGRATION_PROMPT_CONFIG,
}),
show_in_picker: false,
},
ModelPreset {
id: "gpt-5-codex-mini",
@@ -154,7 +206,9 @@ static PRESETS: Lazy<Vec<ModelPreset>> = Lazy::new(|| {
upgrade: Some(ModelUpgrade {
id: "gpt-5.1-codex-mini",
reasoning_effort_mapping: None,
migration_config_key: HIDE_GPT5_1_MIGRATION_PROMPT_CONFIG,
}),
show_in_picker: false,
},
ModelPreset {
id: "gpt-5",
@@ -182,21 +236,22 @@ static PRESETS: Lazy<Vec<ModelPreset>> = Lazy::new(|| {
],
is_default: false,
upgrade: Some(ModelUpgrade {
id: "gpt-5.1",
reasoning_effort_mapping: Some(HashMap::from([(
ReasoningEffort::Minimal,
ReasoningEffort::Low,
)])),
id: "gpt-5.1-codex-max",
reasoning_effort_mapping: None,
migration_config_key: HIDE_GPT_5_1_CODEX_MAX_MIGRATION_PROMPT_CONFIG,
}),
show_in_picker: false,
},
]
});
pub fn builtin_model_presets(_auth_mode: Option<AuthMode>) -> Vec<ModelPreset> {
// leave auth mode for later use
pub fn builtin_model_presets(auth_mode: Option<AuthMode>) -> Vec<ModelPreset> {
PRESETS
.iter()
.filter(|preset| preset.upgrade.is_none())
.filter(|preset| match auth_mode {
Some(AuthMode::ApiKey) => preset.show_in_picker && preset.id != "gpt-5.1-codex-max",
_ => preset.show_in_picker,
})
.cloned()
.collect()
}
@@ -208,10 +263,21 @@ pub fn all_model_presets() -> &'static Vec<ModelPreset> {
#[cfg(test)]
mod tests {
use super::*;
use codex_app_server_protocol::AuthMode;
#[test]
fn only_one_default_model_is_configured() {
let default_models = PRESETS.iter().filter(|preset| preset.is_default).count();
assert!(default_models == 1);
}
#[test]
fn gpt_5_1_codex_max_hidden_for_api_key_auth() {
let presets = builtin_model_presets(Some(AuthMode::ApiKey));
assert!(
presets
.iter()
.all(|preset| preset.id != "gpt-5.1-codex-max")
);
}
}
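
The picker filter added to `builtin_model_presets` can be exercised on its own. The sketch below assumes trimmed-down `AuthMode` and `ModelPreset` types (only the fields the filter reads) and a hypothetical `picker_presets` function that takes the preset list as a parameter instead of reading the `PRESETS` static.

```rust
// Simplified stand-ins for the real types (assumptions for illustration).
#[derive(Debug, Clone, Copy, PartialEq)]
enum AuthMode {
    ApiKey,
    ChatGPT,
}

#[derive(Debug, Clone, PartialEq)]
struct ModelPreset {
    id: &'static str,
    show_in_picker: bool,
}

// Hypothetical helper mirroring the filter in the diff: hidden presets are
// always dropped, and API-key sessions additionally lose gpt-5.1-codex-max.
fn picker_presets(all: &[ModelPreset], auth_mode: Option<AuthMode>) -> Vec<ModelPreset> {
    all.iter()
        .filter(|preset| match auth_mode {
            Some(AuthMode::ApiKey) => {
                preset.show_in_picker && preset.id != "gpt-5.1-codex-max"
            }
            _ => preset.show_in_picker,
        })
        .cloned()
        .collect()
}
```

Note that the old filter keyed on `upgrade.is_none()`; since every non-default preset now carries an upgrade target, visibility is controlled by the separate `show_in_picker` flag instead.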

View File

@@ -19,9 +19,11 @@ async-trait = { workspace = true }
base64 = { workspace = true }
bytes = { workspace = true }
chrono = { workspace = true, features = ["serde"] }
chardetng = { workspace = true }
codex-app-server-protocol = { workspace = true }
codex-apply-patch = { workspace = true }
codex-async-utils = { workspace = true }
codex-execpolicy = { workspace = true }
codex-file-search = { workspace = true }
codex-git = { workspace = true }
codex-keyring-store = { workspace = true }
@@ -31,11 +33,11 @@ codex-rmcp-client = { workspace = true }
codex-utils-pty = { workspace = true }
codex-utils-readiness = { workspace = true }
codex-utils-string = { workspace = true }
codex-utils-tokenizer = { workspace = true }
codex-windows-sandbox = { package = "codex-windows-sandbox", path = "../windows-sandbox-rs" }
dirs = { workspace = true }
dunce = { workspace = true }
env-flags = { workspace = true }
encoding_rs = { workspace = true }
eventsource-stream = { workspace = true }
futures = { workspace = true }
http = { workspace = true }

View File

@@ -0,0 +1,117 @@
You are Codex, based on GPT-5. You are running as a coding agent in the Codex CLI on a user's computer.
## General
- When searching for text or files, prefer using `rg` or `rg --files` respectively because `rg` is much faster than alternatives like `grep`. (If the `rg` command is not found, then use alternatives.)
## Editing constraints
- Default to ASCII when editing or creating files. Only introduce non-ASCII or other Unicode characters when there is a clear justification and the file already uses them.
- Add succinct code comments that explain what is going on if code is not self-explanatory. You should not add comments like "Assigns the value to the variable", but a brief comment might be useful ahead of a complex code block that the user would otherwise have to spend time parsing out. Usage of these comments should be rare.
- Try to use apply_patch for single file edits, but it is fine to explore other options to make the edit if it does not work well. Do not use apply_patch for changes that are auto-generated (i.e. generating package.json or running a lint or format command like gofmt) or when scripting is more efficient (such as search and replacing a string across a codebase).
- You may be in a dirty git worktree.
* NEVER revert existing changes you did not make unless explicitly requested, since these changes were made by the user.
* If asked to make a commit or code edits and there are unrelated changes to your work or changes that you didn't make in those files, don't revert those changes.
* If the changes are in files you've touched recently, you should read carefully and understand how you can work with the changes rather than reverting them.
* If the changes are in unrelated files, just ignore them and don't revert them.
- Do not amend a commit unless explicitly requested to do so.
- While you are working, you might notice unexpected changes that you didn't make. If this happens, STOP IMMEDIATELY and ask the user how they would like to proceed.
- **NEVER** use destructive commands like `git reset --hard` or `git checkout --` unless specifically requested or approved by the user.
## Plan tool
When using the planning tool:
- Skip using the planning tool for straightforward tasks (roughly the easiest 25%).
- Do not make single-step plans.
- After you make a plan, update it once you have performed one of the sub-tasks that you shared on the plan.
## Codex CLI harness, sandboxing, and approvals
The Codex CLI harness supports several different configurations for sandboxing and escalation approvals that the user can choose from.
Filesystem sandboxing defines which files can be read or written. The options for `sandbox_mode` are:
- **read-only**: The sandbox only permits reading files.
- **workspace-write**: The sandbox permits reading files, and editing files in `cwd` and `writable_roots`. Editing files in other directories requires approval.
- **danger-full-access**: No filesystem sandboxing - all commands are permitted.
Network sandboxing defines whether network can be accessed without approval. Options for `network_access` are:
- **restricted**: Requires approval
- **enabled**: No approval needed
Approvals are your mechanism to get user consent to run shell commands without the sandbox. Possible configuration options for `approval_policy` are:
- **untrusted**: The harness will escalate most commands for user approval, apart from a limited allowlist of safe "read" commands.
- **on-failure**: The harness will allow all commands to run in the sandbox (if enabled), and failures will be escalated to the user for approval to run again without the sandbox.
- **on-request**: Commands will be run in the sandbox by default, and you can specify in your tool call if you want to escalate a command to run without sandboxing. (Note that this mode is not always available. If it is, you'll see parameters for it in the `shell` command description.)
- **never**: This is a non-interactive mode where you may NEVER ask the user for approval to run commands. Instead, you must always persist and work around constraints to solve the task for the user. You MUST do your utmost best to finish the task and validate your work before yielding. If this mode is paired with `danger-full-access`, take advantage of it to deliver the best outcome for the user. Further, in this mode, your default testing philosophy is overridden: Even if you don't see local patterns for testing, you may add tests and scripts to validate your work. Just remove them before yielding.
When you are running with `approval_policy == on-request`, and sandboxing enabled, here are scenarios where you'll need to request approval:
- You need to run a command that writes to a directory that requires it (e.g. running tests that write to /var)
- You need to run a GUI app (e.g., open/xdg-open/osascript) to open browsers or files.
- You are running sandboxed and need to run a command that requires network access (e.g. installing packages)
- If you run a command that is important to solving the user's query, but it fails because of sandboxing, rerun the command with approval. ALWAYS proceed to use the `with_escalated_permissions` and `justification` parameters - do not message the user before requesting approval for the command.
- You are about to take a potentially destructive action such as an `rm` or `git reset` that the user did not explicitly ask for
- (for all of these, you should weigh alternative paths that do not require approval)
When `sandbox_mode` is set to read-only, you'll need to request approval for any command that isn't a read.
You will be told what filesystem sandboxing, network sandboxing, and approval mode are active in a developer or user message. If you are not told about this, assume that you are running with workspace-write, network sandboxing enabled, and approval on-failure.
Although approvals introduce friction for the user because your work is paused until the user responds, you should leverage them when necessary to accomplish important work. If completing the task requires escalated permissions, do not let these settings or the sandbox deter you from attempting to accomplish the user's task unless `approval_policy` is set to "never", in which case never ask for approvals.
When requesting approval to execute a command that will require escalated privileges:
- Provide the `with_escalated_permissions` parameter with the boolean value true
- Include a short, 1 sentence explanation for why you need to enable `with_escalated_permissions` in the justification parameter
## Special user requests
- If the user makes a simple request (such as asking for the time) which you can fulfill by running a terminal command (such as `date`), you should do so.
- If the user asks for a "review", default to a code review mindset: prioritise identifying bugs, risks, behavioural regressions, and missing tests. Findings must be the primary focus of the response - keep summaries or overviews brief and only after enumerating the issues. Present findings first (ordered by severity with file/line references), follow with open questions or assumptions, and offer a change-summary only as a secondary detail. If no findings are discovered, state that explicitly and mention any residual risks or testing gaps.
## Frontend tasks
When doing frontend design tasks, avoid collapsing into "AI slop" or safe, average-looking layouts.
Aim for interfaces that feel intentional, bold, and a bit surprising.
- Typography: Use expressive, purposeful fonts and avoid default stacks (Inter, Roboto, Arial, system).
- Color & Look: Choose a clear visual direction; define CSS variables; avoid purple-on-white defaults. No purple bias or dark mode bias.
- Motion: Use a few meaningful animations (page-load, staggered reveals) instead of generic micro-motions.
- Background: Don't rely on flat, single-color backgrounds; use gradients, shapes, or subtle patterns to build atmosphere.
- Overall: Avoid boilerplate layouts and interchangeable UI patterns. Vary themes, type families, and visual languages across outputs.
- Ensure the page loads properly on both desktop and mobile
Exception: If working within an existing website or design system, preserve the established patterns, structure, and visual language.
## Presenting your work and final message
You are producing plain text that will later be styled by the CLI. Follow these rules exactly. Formatting should make results easy to scan, but not feel mechanical. Use judgment to decide how much structure adds value.
- Default: be very concise; friendly coding teammate tone.
- Ask only when needed; suggest ideas; mirror the user's style.
- For substantial work, summarize clearly; follow final-answer formatting.
- Skip heavy formatting for simple confirmations.
- Don't dump large files you've written; reference paths only.
- No "save/copy this file" - User is on the same machine.
- Offer logical next steps (tests, commits, build) briefly; add verify steps if you couldn't do something.
- For code changes:
* Lead with a quick explanation of the change, and then give more details on the context covering where and why a change was made. Do not start this explanation with "summary", just jump right in.
* If there are natural next steps the user may want to take, suggest them at the end of your response. Do not make suggestions if there are no natural next steps.
* When suggesting multiple options, use numeric lists for the suggestions so the user can quickly respond with a single number.
- The user does not see command execution outputs. When asked to show the output of a command (e.g. `git show`), relay the important details in your answer or summarize the key lines so the user understands the result.
### Final answer structure and style guidelines
- Plain text; CLI handles styling. Use structure only when it helps scanability.
- Headers: optional; short Title Case (1-3 words) wrapped in **…**; no blank line before the first bullet; add only if they truly help.
- Bullets: use - ; merge related points; keep to one line when possible; 4-6 per list ordered by importance; keep phrasing consistent.
- Monospace: backticks for commands/paths/env vars/code ids and inline examples; use for literal keyword bullets; never combine with **.
- Code samples or multi-line snippets should be wrapped in fenced code blocks; include an info string as often as possible.
- Structure: group related bullets; order sections general → specific → supporting; for subsections, start with a bolded keyword bullet, then items; match complexity to the task.
- Tone: collaborative, concise, factual; present tense, active voice; self-contained; no "above/below"; parallel wording.
- Don'ts: no nested bullets/hierarchies; no ANSI codes; don't cram unrelated keywords; keep keyword lists short—wrap/reformat if long; avoid naming formatting styles in answers.
- Adaptation: code explanations → precise, structured with code refs; simple tasks → lead with outcome; big changes → logical walkthrough + rationale + next actions; casual one-offs → plain sentences, no headers/bullets.
- File References: When referencing files in your response follow the below rules:
* Use inline code to make file paths clickable.
* Each reference should have a stand alone path. Even if it's the same file.
* Accepted: absolute, workspace-relative, a/ or b/ diff prefixes, or bare filename/suffix.
* Optionally include line/column (1-based): :line[:column] or #Lline[Ccolumn] (column defaults to 1).
* Do not use URIs like file://, vscode://, or https://.
* Do not provide a range of lines.
* Examples: src/app.ts, src/app.ts:42, b/server/index.js#L10, C:\repo\project\main.rs:12:5

View File

@@ -100,7 +100,7 @@ pub fn extract_bash_command(command: &[String]) -> Option<(&str, &str)> {
if !matches!(flag.as_str(), "-lc" | "-c")
|| !matches!(
detect_shell_type(&PathBuf::from(shell)),
Some(ShellType::Zsh) | Some(ShellType::Bash)
Some(ShellType::Zsh) | Some(ShellType::Bash) | Some(ShellType::Sh)
)
{
return None;

View File

@@ -81,6 +81,7 @@ pub(crate) async fn stream_chat_completions(
ResponseItem::CustomToolCallOutput { .. } => {}
ResponseItem::WebSearchCall { .. } => {}
ResponseItem::GhostSnapshot { .. } => {}
ResponseItem::CompactionSummary { .. } => {}
}
}
@@ -320,7 +321,8 @@ pub(crate) async fn stream_chat_completions(
}
ResponseItem::Reasoning { .. }
| ResponseItem::WebSearchCall { .. }
| ResponseItem::Other => {
| ResponseItem::Other
| ResponseItem::CompactionSummary { .. } => {
// Omit these items from the conversation history.
continue;
}

View File

@@ -26,6 +26,7 @@ use tokio::sync::mpsc;
use tokio::time::timeout;
use tokio_util::io::ReaderStream;
use tracing::debug;
use tracing::enabled;
use tracing::trace;
use tracing::warn;
@@ -55,6 +56,7 @@ use crate::model_family::ModelFamily;
use crate::model_provider_info::ModelProviderInfo;
use crate::model_provider_info::WireApi;
use crate::openai_model_info::get_model_info;
use crate::protocol::CreditsSnapshot;
use crate::protocol::RateLimitSnapshot;
use crate::protocol::RateLimitWindow;
use crate::protocol::TokenUsage;
@@ -78,6 +80,18 @@ struct Error {
resets_at: Option<i64>,
}
#[derive(Debug, Serialize)]
struct CompactHistoryRequest<'a> {
model: &'a str,
input: &'a [ResponseItem],
instructions: &'a str,
}
#[derive(Debug, Deserialize)]
struct CompactHistoryResponse {
output: Vec<ResponseItem>,
}
#[derive(Debug, Clone)]
pub struct ModelClient {
config: Arc<Config>,
@@ -507,6 +521,70 @@ impl ModelClient {
pub fn get_auth_manager(&self) -> Option<Arc<AuthManager>> {
self.auth_manager.clone()
}
pub async fn compact_conversation_history(&self, prompt: &Prompt) -> Result<Vec<ResponseItem>> {
if prompt.input.is_empty() {
return Ok(Vec::new());
}
let auth_manager = self.auth_manager.clone();
let auth = auth_manager.as_ref().and_then(|m| m.auth());
let mut req_builder = self
.provider
.create_compact_request_builder(&self.client, &auth)
.await?;
if let SessionSource::SubAgent(sub) = &self.session_source {
let subagent = if let crate::protocol::SubAgentSource::Other(label) = sub {
label.clone()
} else {
serde_json::to_value(sub)
.ok()
.and_then(|v| v.as_str().map(std::string::ToString::to_string))
.unwrap_or_else(|| "other".to_string())
};
req_builder = req_builder.header("x-openai-subagent", subagent);
}
if let Some(auth) = auth.as_ref()
&& auth.mode == AuthMode::ChatGPT
&& let Some(account_id) = auth.get_account_id()
{
req_builder = req_builder.header("chatgpt-account-id", account_id);
}
let payload = CompactHistoryRequest {
model: &self.config.model,
input: &prompt.input,
instructions: &prompt.get_full_instructions(&self.config.model_family),
};
if enabled!(tracing::Level::TRACE) {
trace!(
"POST to {}: {}",
self.provider
.get_compact_url(&auth)
.unwrap_or("<none>".to_string()),
serde_json::to_value(&payload).unwrap_or_default()
);
}
let response = req_builder
.json(&payload)
.send()
.await
.map_err(|source| CodexErr::ConnectionFailed(ConnectionFailedError { source }))?;
let status = response.status();
let body = response
.text()
.await
.map_err(|source| CodexErr::ConnectionFailed(ConnectionFailedError { source }))?;
if !status.is_success() {
return Err(CodexErr::UnexpectedStatus(UnexpectedResponseError {
status,
body,
request_id: None,
}));
}
let CompactHistoryResponse { output } = serde_json::from_str(&body)?;
Ok(output)
}
}
enum StreamAttemptError {
@@ -649,7 +727,13 @@ fn parse_rate_limit_snapshot(headers: &HeaderMap) -> Option<RateLimitSnapshot> {
"x-codex-secondary-reset-at",
);
Some(RateLimitSnapshot { primary, secondary })
let credits = parse_credits_snapshot(headers);
Some(RateLimitSnapshot {
primary,
secondary,
credits,
})
}
fn parse_rate_limit_window(
@@ -676,6 +760,20 @@ fn parse_rate_limit_window(
})
}
fn parse_credits_snapshot(headers: &HeaderMap) -> Option<CreditsSnapshot> {
let has_credits = parse_header_bool(headers, "x-codex-credits-has-credits")?;
let unlimited = parse_header_bool(headers, "x-codex-credits-unlimited")?;
let balance = parse_header_str(headers, "x-codex-credits-balance")
.map(str::trim)
.filter(|value| !value.is_empty())
.map(std::string::ToString::to_string);
Some(CreditsSnapshot {
has_credits,
unlimited,
balance,
})
}
fn parse_header_f64(headers: &HeaderMap, name: &str) -> Option<f64> {
parse_header_str(headers, name)?
.parse::<f64>()
@@ -687,6 +785,17 @@ fn parse_header_i64(headers: &HeaderMap, name: &str) -> Option<i64> {
parse_header_str(headers, name)?.parse::<i64>().ok()
}
fn parse_header_bool(headers: &HeaderMap, name: &str) -> Option<bool> {
let raw = parse_header_str(headers, name)?;
if raw.eq_ignore_ascii_case("true") || raw == "1" {
Some(true)
} else if raw.eq_ignore_ascii_case("false") || raw == "0" {
Some(false)
} else {
None
}
}
fn parse_header_str<'a>(headers: &'a HeaderMap, name: &str) -> Option<&'a str> {
headers.get(name)?.to_str().ok()
}
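
The credits-header parsing added in this file can be tried without the HTTP stack. The following is a rough sketch under stated assumptions: a plain `HashMap<&str, &str>` stands in for `HeaderMap`, `CreditsSnapshot` is re-declared locally with the three fields visible in the diff, and the names `parse_bool` and `parse_credits` are hypothetical.

```rust
use std::collections::HashMap;

// Local stand-in for the CreditsSnapshot type used in the diff (assumption).
#[derive(Debug, PartialEq)]
struct CreditsSnapshot {
    has_credits: bool,
    unlimited: bool,
    balance: Option<String>,
}

// Accepts "true"/"false" in any ASCII case, plus "1"/"0"; anything else is None.
fn parse_bool(raw: &str) -> Option<bool> {
    if raw.eq_ignore_ascii_case("true") || raw == "1" {
        Some(true)
    } else if raw.eq_ignore_ascii_case("false") || raw == "0" {
        Some(false)
    } else {
        None
    }
}

// Hypothetical: headers are modelled as a plain map keyed by header name.
// Both boolean headers are required; the balance is optional and is
// dropped when it is empty after trimming.
fn parse_credits(headers: &HashMap<&str, &str>) -> Option<CreditsSnapshot> {
    let has_credits = parse_bool(headers.get("x-codex-credits-has-credits").copied()?)?;
    let unlimited = parse_bool(headers.get("x-codex-credits-unlimited").copied()?)?;
    let balance = headers
        .get("x-codex-credits-balance")
        .copied()
        .map(str::trim)
        .filter(|value| !value.is_empty())
        .map(str::to_owned);
    Some(CreditsSnapshot {
        has_credits,
        unlimited,
        balance,
    })
}
```

As in the diff, a missing or unparseable boolean header makes the whole snapshot `None`, so the rate-limit snapshot simply carries no credits information in that case.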

View File

@@ -136,7 +136,7 @@ fn reserialize_shell_outputs(items: &mut [ResponseItem]) {
}
fn is_shell_tool_name(name: &str) -> bool {
matches!(name, "shell" | "container.exec" | "shell_command")
matches!(name, "shell" | "container.exec")
}
#[derive(Deserialize)]
@@ -165,11 +165,9 @@ fn build_structured_output(parsed: &ExecOutputJson) -> String {
));
let mut output = parsed.output.clone();
if let Some(total_lines) = extract_total_output_lines(&parsed.output) {
if let Some((stripped, total_lines)) = strip_total_output_header(&parsed.output) {
sections.push(format!("Total output lines: {total_lines}"));
if let Some(stripped) = strip_total_output_header(&output) {
output = stripped.to_string();
}
output = stripped.to_string();
}
sections.push("Output:".to_string());
@@ -178,19 +176,12 @@ fn build_structured_output(parsed: &ExecOutputJson) -> String {
sections.join("\n")
}
fn extract_total_output_lines(output: &str) -> Option<u32> {
let marker_start = output.find("[... omitted ")?;
let marker = &output[marker_start..];
let (_, after_of) = marker.split_once(" of ")?;
let (total_segment, _) = after_of.split_once(' ')?;
total_segment.parse::<u32>().ok()
}
fn strip_total_output_header(output: &str) -> Option<&str> {
fn strip_total_output_header(output: &str) -> Option<(&str, u32)> {
let after_prefix = output.strip_prefix("Total output lines: ")?;
let (_, remainder) = after_prefix.split_once('\n')?;
let (total_segment, remainder) = after_prefix.split_once('\n')?;
let total_lines = total_segment.parse::<u32>().ok()?;
let remainder = remainder.strip_prefix('\n').unwrap_or(remainder);
Some(remainder)
Some((remainder, total_lines))
}
#[derive(Debug)]
@@ -431,7 +422,7 @@ mod tests {
expects_apply_patch_instructions: false,
},
InstructionsTestCase {
slug: "gpt-5.1-codex",
slug: "gpt-5.1-codex-max",
expects_apply_patch_instructions: false,
},
];
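
For reference, the merged `strip_total_output_header` from this file's diff is reproduced below as a standalone function with comments added. It uses only the standard library; the sample inputs in the usage note are made up for illustration.

```rust
// Splits a leading "Total output lines: N" header off `output`.
// Returns the remaining text together with the parsed count, or None when
// the header is absent, has no trailing newline, or N is not a valid u32.
fn strip_total_output_header(output: &str) -> Option<(&str, u32)> {
    let after_prefix = output.strip_prefix("Total output lines: ")?;
    let (total_segment, remainder) = after_prefix.split_once('\n')?;
    let total_lines = total_segment.parse::<u32>().ok()?;
    // If a blank line follows the header, drop it as well.
    let remainder = remainder.strip_prefix('\n').unwrap_or(remainder);
    Some((remainder, total_lines))
}
```

Given the input `"Total output lines: 42\n\nhello"`, this yields the remainder `"hello"` and the count `42`. Folding the count into the return value is what lets the caller drop the separate `extract_total_output_lines` scan for the `[... omitted` marker.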

View File

@@ -7,12 +7,16 @@ use std::sync::atomic::AtomicU64;
use crate::AuthManager;
use crate::client_common::REVIEW_PROMPT;
use crate::compact;
use crate::compact::run_inline_auto_compact_task;
use crate::compact::should_use_remote_compact_task;
use crate::compact_remote::run_inline_remote_auto_compact_task;
use crate::features::Feature;
use crate::function_tool::FunctionCallError;
use crate::parse_command::parse_command;
use crate::parse_turn_item;
use crate::response_processing::process_items;
use crate::terminal;
use crate::truncate::TruncationPolicy;
use crate::user_notification::UserNotifier;
use crate::util::error_or_panic;
use async_channel::Receiver;
@@ -55,6 +59,7 @@ use crate::ModelProviderInfo;
use crate::client::ModelClient;
use crate::client_common::Prompt;
use crate::client_common::ResponseEvent;
use crate::compact::collect_user_messages;
use crate::config::Config;
use crate::config::types::ShellEnvironmentPolicy;
use crate::context_manager::ContextManager;
@@ -63,10 +68,6 @@ use crate::error::CodexErr;
use crate::error::Result as CodexResult;
#[cfg(test)]
use crate::exec::StreamOutput;
// Removed: legacy executor wiring replaced by ToolOrchestrator flows.
// legacy normalize_exec_result no longer used after orchestrator migration
use crate::compact::build_compacted_history;
use crate::compact::collect_user_messages;
use crate::mcp::auth::compute_auth_statuses;
use crate::mcp_connection_manager::McpConnectionManager;
use crate::model_family::find_family_for_model;
@@ -78,7 +79,6 @@ use crate::protocol::ApplyPatchApprovalRequestEvent;
use crate::protocol::AskForApproval;
use crate::protocol::BackgroundEventEvent;
use crate::protocol::DeprecationNoticeEvent;
use crate::protocol::ErrorEvent;
use crate::protocol::Event;
use crate::protocol::EventMsg;
use crate::protocol::ExecApprovalRequestEvent;
@@ -120,6 +120,7 @@ use crate::user_instructions::UserInstructions;
use crate::user_notification::UserNotification;
use crate::util::backoff;
use codex_async_utils::OrCancelExt;
use codex_execpolicy::Policy as ExecPolicy;
use codex_otel::otel_event_manager::OtelEventManager;
use codex_protocol::config_types::ReasoningEffort as ReasoningEffortConfig;
use codex_protocol::config_types::ReasoningSummary as ReasoningSummaryConfig;
@@ -127,11 +128,11 @@ use codex_protocol::models::ContentItem;
use codex_protocol::models::FunctionCallOutputPayload;
use codex_protocol::models::ResponseInputItem;
use codex_protocol::models::ResponseItem;
use codex_protocol::protocol::CodexErrorInfo;
use codex_protocol::protocol::InitialHistory;
use codex_protocol::user_input::UserInput;
use codex_utils_readiness::Readiness;
use codex_utils_readiness::ReadinessFlag;
use codex_utils_tokenizer::warm_model_cache;
/// The high-level interface to the Codex system.
/// It operates as a queue pair where you send submissions and receive events.
@@ -165,6 +166,10 @@ impl Codex {
let user_instructions = get_user_instructions(&config).await;
let exec_policy = crate::exec_policy::exec_policy_for(&config.features, &config.codex_home)
.await
.map_err(|err| CodexErr::Fatal(format!("failed to load execpolicy: {err}")))?;
let config = Arc::new(config);
let session_configuration = SessionConfiguration {
@@ -181,6 +186,7 @@ impl Codex {
cwd: config.cwd.clone(),
original_config_do_not_use: Arc::clone(&config),
features: config.features.clone(),
exec_policy,
session_source,
};
@@ -278,6 +284,8 @@ pub(crate) struct TurnContext {
pub(crate) final_output_json_schema: Option<Value>,
pub(crate) codex_linux_sandbox_exe: Option<PathBuf>,
pub(crate) tool_call_gate: Arc<ReadinessFlag>,
pub(crate) exec_policy: Arc<ExecPolicy>,
pub(crate) truncation_policy: TruncationPolicy,
}
impl TurnContext {
@@ -294,7 +302,6 @@ impl TurnContext {
}
}
#[allow(dead_code)]
#[derive(Clone)]
pub(crate) struct SessionConfiguration {
/// Provider identifier ("openai", "openrouter", ...).
@@ -334,6 +341,8 @@ pub(crate) struct SessionConfiguration {
/// Set of feature flags for this session
features: Features,
/// Execpolicy policy, applied only when enabled by feature flag.
exec_policy: Arc<ExecPolicy>,
// TODO(pakrym): Remove config from here
original_config_do_not_use: Arc<Config>,
@@ -404,7 +413,7 @@ impl Session {
);
let client = ModelClient::new(
Arc::new(per_turn_config),
Arc::new(per_turn_config.clone()),
auth_manager,
otel_event_manager,
provider,
@@ -434,6 +443,8 @@ impl Session {
final_output_json_schema: None,
codex_linux_sandbox_exe: config.codex_linux_sandbox_exe.clone(),
tool_call_gate: Arc::new(ReadinessFlag::new()),
exec_policy: session_configuration.exec_policy.clone(),
truncation_policy: TruncationPolicy::new(&per_turn_config),
}
}
@@ -481,7 +492,7 @@ impl Session {
// - load history metadata
let rollout_fut = RolloutRecorder::new(&config, rollout_params);
let default_shell_fut = shell::default_user_shell();
let default_shell = shell::default_user_shell();
let history_meta_fut = crate::message_history::history_metadata(&config);
let auth_statuses_fut = compute_auth_statuses(
config.mcp_servers.iter(),
@@ -489,12 +500,8 @@ impl Session {
);
// Join all independent futures.
let (rollout_recorder, default_shell, (history_log_id, history_entry_count), auth_statuses) = tokio::join!(
rollout_fut,
default_shell_fut,
history_meta_fut,
auth_statuses_fut
);
let (rollout_recorder, (history_log_id, history_entry_count), auth_statuses) =
tokio::join!(rollout_fut, history_meta_fut, auth_statuses_fut);
let rollout_recorder = rollout_recorder.map_err(|e| {
error!("failed to initialize rollout recorder: {e:#}");
@@ -536,7 +543,6 @@ impl Session {
config.model_reasoning_effort,
config.model_reasoning_summary,
config.model_context_window,
config.model_max_output_tokens,
config.model_auto_compact_token_limit,
config.approval_policy,
config.sandbox_policy.clone(),
@@ -547,9 +553,6 @@ impl Session {
// Create the mutable state for the Session.
let state = SessionState::new(session_configuration.clone());
// Warm the tokenizer cache for the session model without blocking startup.
warm_model_cache(&session_configuration.model);
let services = SessionServices {
mcp_connection_manager: Arc::new(RwLock::new(McpConnectionManager::default())),
mcp_startup_cancellation_token: CancellationToken::new(),
@@ -581,6 +584,10 @@ impl Session {
msg: EventMsg::SessionConfigured(SessionConfiguredEvent {
session_id: conversation_id,
model: session_configuration.model.clone(),
model_provider_id: config.model_provider_id.clone(),
approval_policy: session_configuration.approval_policy,
sandbox_policy: session_configuration.sandbox_policy.clone(),
cwd: session_configuration.cwd.clone(),
reasoning_effort: session_configuration.model_reasoning_effort,
history_log_id,
history_entry_count,
@@ -681,7 +688,8 @@ impl Session {
let reconstructed_history =
self.reconstruct_history_from_rollout(&turn_context, &rollout_items);
if !reconstructed_history.is_empty() {
self.record_into_history(&reconstructed_history).await;
self.record_into_history(&reconstructed_history, &turn_context)
.await;
}
// If persisting, persist all rollout items as-is (recorder filters)
@@ -902,6 +910,7 @@ impl Session {
let event = EventMsg::ApplyPatchApprovalRequest(ApplyPatchApprovalRequestEvent {
call_id,
turn_id: turn_context.sub_id.clone(),
changes,
reason,
grant_root,
@@ -938,7 +947,7 @@ impl Session {
turn_context: &TurnContext,
items: &[ResponseItem],
) {
self.record_into_history(items).await;
self.record_into_history(items, turn_context).await;
self.persist_rollout_response_items(items).await;
self.send_raw_response_items(turn_context, items).await;
}
@@ -952,17 +961,25 @@ impl Session {
for item in rollout_items {
match item {
RolloutItem::ResponseItem(response_item) => {
history.record_items(std::iter::once(response_item));
history.record_items(
std::iter::once(response_item),
turn_context.truncation_policy,
);
}
RolloutItem::Compacted(compacted) => {
let snapshot = history.get_history();
let user_messages = collect_user_messages(&snapshot);
let rebuilt = build_compacted_history(
self.build_initial_context(turn_context),
&user_messages,
&compacted.message,
);
history.replace(rebuilt);
// TODO(jif) clean
if let Some(replacement) = &compacted.replacement_history {
history.replace(replacement.clone());
} else {
let user_messages = collect_user_messages(&snapshot);
let rebuilt = compact::build_compacted_history(
self.build_initial_context(turn_context),
&user_messages,
&compacted.message,
);
history.replace(rebuilt);
}
}
_ => {}
}
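
Note on the hunk above: replaying a rollout now prefers a stored `replacement_history` and only rebuilds from the summary when none was recorded. A minimal sketch of that fallback, using toy types invented for illustration (`RolloutEntry` and `replay` are hypothetical, and the real fallback also re-adds the initial context and earlier user messages):

```rust
// Toy stand-ins for rollout items; these are NOT the crate's real types.
#[derive(Clone, Debug, PartialEq)]
enum RolloutEntry {
    Item(String),
    Compacted {
        message: String,
        replacement: Option<Vec<String>>,
    },
}

/// Rebuild an in-memory history from a recorded rollout.
fn replay(entries: &[RolloutEntry]) -> Vec<String> {
    let mut history: Vec<String> = Vec::new();
    for entry in entries {
        match entry {
            RolloutEntry::Item(item) => history.push(item.clone()),
            RolloutEntry::Compacted { message, replacement } => {
                history = match replacement {
                    // A recorded replacement wins: use it verbatim.
                    Some(items) => items.clone(),
                    // Simplified fallback: keep only a summary entry.
                    None => vec![format!("summary: {message}")],
                };
            }
        }
    }
    history
}
```

Either branch discards everything recorded before the compaction point, so items that follow a `Compacted` entry are appended to the compacted state rather than to the full transcript.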
@@ -971,9 +988,13 @@ impl Session {
}
/// Append ResponseItems to the in-memory conversation history only.
pub(crate) async fn record_into_history(&self, items: &[ResponseItem]) {
pub(crate) async fn record_into_history(
&self,
items: &[ResponseItem],
turn_context: &TurnContext,
) {
let mut state = self.state.lock().await;
state.record_items(items.iter());
state.record_items(items.iter(), turn_context.truncation_policy);
}
pub(crate) async fn replace_history(&self, items: Vec<ResponseItem>) {
@@ -990,6 +1011,15 @@ impl Session {
self.persist_rollout_items(&rollout_items).await;
}
pub async fn enabled(&self, feature: Feature) -> bool {
self.state
.lock()
.await
.session_configuration
.features
.enabled(feature)
}
async fn send_raw_response_items(&self, turn_context: &TurnContext, items: &[ResponseItem]) {
for item in items {
self.send_event(
@@ -1018,7 +1048,7 @@ impl Session {
Some(turn_context.cwd.clone()),
Some(turn_context.approval_policy),
Some(turn_context.sandbox_policy.clone()),
Some(self.user_shell().clone()),
self.user_shell().clone(),
)));
items
}
@@ -1057,11 +1087,14 @@ impl Session {
self.send_token_count_event(turn_context).await;
}
pub(crate) async fn override_last_token_usage_estimate(
&self,
turn_context: &TurnContext,
estimated_total_tokens: i64,
) {
pub(crate) async fn recompute_token_usage(&self, turn_context: &TurnContext) {
let Some(estimated_total_tokens) = self
.clone_history()
.await
.estimate_token_count(turn_context)
else {
return;
};
{
let mut state = self.state.lock().await;
let mut info = state.token_info().unwrap_or(TokenUsageInfo {
@@ -1155,9 +1188,14 @@ impl Session {
&self,
turn_context: &TurnContext,
message: impl Into<String>,
codex_error: CodexErr,
) {
let codex_error_info = CodexErrorInfo::ResponseStreamDisconnected {
http_status_code: codex_error.http_status_code_value(),
};
let event = EventMsg::StreamError(StreamErrorEvent {
message: message.into(),
codex_error_info: Some(codex_error_info),
});
self.send_event(turn_context, event).await;
}
@@ -1167,14 +1205,7 @@ impl Session {
turn_context: Arc<TurnContext>,
cancellation_token: CancellationToken,
) {
if !self
.state
.lock()
.await
.session_configuration
.features
.enabled(Feature::GhostCommit)
{
if !self.enabled(Feature::GhostCommit).await {
return;
}
let token = match turn_context.tool_call_gate.subscribe().await {
@@ -1311,7 +1342,10 @@ impl Session {
}
async fn submission_loop(sess: Arc<Session>, config: Arc<Config>, rx_sub: Receiver<Submission>) {
let mut previous_context: Option<Arc<TurnContext>> = None;
// Seed with context in case there is an OverrideTurnContext first.
let mut previous_context: Option<Arc<TurnContext>> =
Some(sess.new_turn(SessionSettingsUpdate::default()).await);
// To break out of this loop, send Op::Shutdown.
while let Ok(sub) = rx_sub.recv().await {
debug!(?sub, "Submission");
@@ -1407,6 +1441,7 @@ mod handlers {
use crate::tasks::UndoTask;
use crate::tasks::UserShellCommandTask;
use codex_protocol::custom_prompts::CustomPrompt;
use codex_protocol::protocol::CodexErrorInfo;
use codex_protocol::protocol::ErrorEvent;
use codex_protocol::protocol::Event;
use codex_protocol::protocol::EventMsg;
@@ -1415,6 +1450,7 @@ mod handlers {
use codex_protocol::protocol::ReviewDecision;
use codex_protocol::protocol::ReviewRequest;
use codex_protocol::protocol::TurnAbortReason;
use codex_protocol::user_input::UserInput;
use std::sync::Arc;
use tracing::info;
@@ -1623,16 +1659,15 @@ mod handlers {
let turn_context = sess
.new_turn_with_sub_id(sub_id, SessionSettingsUpdate::default())
.await;
// Attempt to inject input into current task
if let Err(items) = sess
.inject_input(vec![UserInput::Text {
sess.spawn_task(
Arc::clone(&turn_context),
vec![UserInput::Text {
text: turn_context.compact_prompt().to_string(),
}])
.await
{
sess.spawn_task(Arc::clone(&turn_context), items, CompactTask)
.await;
}
}],
CompactTask,
)
.await;
}
pub async fn shutdown(sess: &Arc<Session>, sub_id: String) -> bool {
@@ -1653,6 +1688,7 @@ mod handlers {
id: sub_id.clone(),
msg: EventMsg::Error(ErrorEvent {
message: "Failed to shutdown rollout recorder".to_string(),
codex_error_info: Some(CodexErrorInfo::Other),
}),
};
sess.send_event_raw(event).await;
@@ -1758,6 +1794,8 @@ async fn spawn_review_thread(
final_output_json_schema: None,
codex_linux_sandbox_exe: parent_turn_context.codex_linux_sandbox_exe.clone(),
tool_call_gate: Arc::new(ReadinessFlag::new()),
exec_policy: parent_turn_context.exec_policy.clone(),
truncation_policy: TruncationPolicy::new(&per_turn_config),
};
// Seed the child task with the review prompt as the initial user message.
@@ -1765,7 +1803,12 @@ async fn spawn_review_thread(
text: review_prompt,
}];
let tc = Arc::new(review_turn_context);
sess.spawn_task(tc.clone(), input, ReviewTask).await;
sess.spawn_task(
tc.clone(),
input,
ReviewTask::new(review_request.append_to_original_thread),
)
.await;
// Announce entering review mode so UIs can switch modes.
sess.send_event(&tc, EventMsg::EnteredReviewMode(review_request))
@@ -1866,7 +1909,12 @@ pub(crate) async fn run_task(
// as long as compaction works well in getting us way below the token limit, we shouldn't worry about being in an infinite loop.
if token_limit_reached {
compact::run_inline_auto_compact_task(sess.clone(), turn_context.clone()).await;
if should_use_remote_compact_task(&sess).await {
run_inline_remote_auto_compact_task(sess.clone(), turn_context.clone())
.await;
} else {
run_inline_auto_compact_task(sess.clone(), turn_context.clone()).await;
}
continue;
}
@@ -1895,9 +1943,7 @@ pub(crate) async fn run_task(
}
Err(e) => {
info!("Turn error: {e:#}");
let event = EventMsg::Error(ErrorEvent {
message: e.to_string(),
});
let event = EventMsg::Error(e.to_error_event(None));
sess.send_event(&turn_context, event).await;
// let the user continue the conversation
break;
@@ -1937,12 +1983,32 @@ async fn run_turn(
.client
.get_model_family()
.supports_parallel_tool_calls;
let parallel_tool_calls = model_supports_parallel;
// TODO(jif) revert once testing phase is done.
let parallel_tool_calls = model_supports_parallel
&& sess
.state
.lock()
.await
.session_configuration
.features
.enabled(Feature::ParallelToolCalls);
let mut base_instructions = turn_context.base_instructions.clone();
if parallel_tool_calls {
static INSTRUCTIONS: &str = include_str!("../templates/parallel/instructions.md");
if let Some(family) =
find_family_for_model(&sess.state.lock().await.session_configuration.model)
{
let mut new_instructions = base_instructions.unwrap_or(family.base_instructions);
new_instructions.push_str(INSTRUCTIONS);
base_instructions = Some(new_instructions);
}
}
let prompt = Prompt {
input,
tools: router.specs(),
parallel_tool_calls,
base_instructions_override: turn_context.base_instructions.clone(),
base_instructions_override: base_instructions,
output_schema: turn_context.final_output_json_schema.clone(),
};
@@ -2002,6 +2068,7 @@ async fn run_turn(
sess.notify_stream_error(
&turn_context,
format!("Reconnecting... {retries}/{max_retries}"),
e,
)
.await;
@@ -2320,6 +2387,7 @@ mod tests {
use crate::config::ConfigOverrides;
use crate::config::ConfigToml;
use crate::exec::ExecToolCallOutput;
use crate::shell::default_user_shell;
use crate::tools::format_exec_output_str;
use crate::protocol::CompactedItem;
@@ -2429,8 +2497,9 @@ mod tests {
duration: StdDuration::from_secs(1),
timed_out: true,
};
let (_, turn_context) = make_session_and_context();
let out = format_exec_output_str(&exec);
let out = format_exec_output_str(&exec, turn_context.truncation_policy);
assert_eq!(
out,
@@ -2546,6 +2615,7 @@ mod tests {
cwd: config.cwd.clone(),
original_config_do_not_use: Arc::clone(&config),
features: Features::default(),
exec_policy: Arc::new(ExecPolicy::empty()),
session_source: SessionSource::Exec,
};
@@ -2557,7 +2627,7 @@ mod tests {
unified_exec_manager: UnifiedExecSessionManager::default(),
notifier: UserNotifier::new(None),
rollout: Mutex::new(None),
user_shell: shell::Shell::Unknown,
user_shell: default_user_shell(),
show_raw_agent_reasoning: config.show_raw_agent_reasoning,
auth_manager: Arc::clone(&auth_manager),
otel_event_manager: otel_event_manager.clone(),
@@ -2623,6 +2693,7 @@ mod tests {
cwd: config.cwd.clone(),
original_config_do_not_use: Arc::clone(&config),
features: Features::default(),
exec_policy: Arc::new(ExecPolicy::empty()),
session_source: SessionSource::Exec,
};
@@ -2634,7 +2705,7 @@ mod tests {
unified_exec_manager: UnifiedExecSessionManager::default(),
notifier: UserNotifier::new(None),
rollout: Mutex::new(None),
user_shell: shell::Shell::Unknown,
user_shell: default_user_shell(),
show_raw_agent_reasoning: config.show_raw_agent_reasoning,
auth_manager: Arc::clone(&auth_manager),
otel_event_manager: otel_event_manager.clone(),
@@ -2753,7 +2824,8 @@ mod tests {
let input = vec![UserInput::Text {
text: "start review".to_string(),
}];
sess.spawn_task(Arc::clone(&tc), input, ReviewTask).await;
sess.spawn_task(Arc::clone(&tc), input, ReviewTask::new(true))
.await;
sess.abort_all_tasks(TurnAbortReason::Interrupted).await;
@@ -2871,7 +2943,7 @@ mod tests {
for item in &initial_context {
rollout_items.push(RolloutItem::ResponseItem(item.clone()));
}
live_history.record_items(initial_context.iter());
live_history.record_items(initial_context.iter(), turn_context.truncation_policy);
let user1 = ResponseItem::Message {
id: None,
@@ -2880,7 +2952,7 @@ mod tests {
text: "first user".to_string(),
}],
};
live_history.record_items(std::iter::once(&user1));
live_history.record_items(std::iter::once(&user1), turn_context.truncation_policy);
rollout_items.push(RolloutItem::ResponseItem(user1.clone()));
let assistant1 = ResponseItem::Message {
@@ -2890,13 +2962,13 @@ mod tests {
text: "assistant reply one".to_string(),
}],
};
live_history.record_items(std::iter::once(&assistant1));
live_history.record_items(std::iter::once(&assistant1), turn_context.truncation_policy);
rollout_items.push(RolloutItem::ResponseItem(assistant1.clone()));
let summary1 = "summary one";
let snapshot1 = live_history.get_history();
let user_messages1 = collect_user_messages(&snapshot1);
let rebuilt1 = build_compacted_history(
let rebuilt1 = compact::build_compacted_history(
session.build_initial_context(turn_context),
&user_messages1,
summary1,
@@ -2904,6 +2976,7 @@ mod tests {
live_history.replace(rebuilt1);
rollout_items.push(RolloutItem::Compacted(CompactedItem {
message: summary1.to_string(),
replacement_history: None,
}));
let user2 = ResponseItem::Message {
@@ -2913,7 +2986,7 @@ mod tests {
text: "second user".to_string(),
}],
};
live_history.record_items(std::iter::once(&user2));
live_history.record_items(std::iter::once(&user2), turn_context.truncation_policy);
rollout_items.push(RolloutItem::ResponseItem(user2.clone()));
let assistant2 = ResponseItem::Message {
@@ -2923,13 +2996,13 @@ mod tests {
text: "assistant reply two".to_string(),
}],
};
live_history.record_items(std::iter::once(&assistant2));
live_history.record_items(std::iter::once(&assistant2), turn_context.truncation_policy);
rollout_items.push(RolloutItem::ResponseItem(assistant2.clone()));
let summary2 = "summary two";
let snapshot2 = live_history.get_history();
let user_messages2 = collect_user_messages(&snapshot2);
let rebuilt2 = build_compacted_history(
let rebuilt2 = compact::build_compacted_history(
session.build_initial_context(turn_context),
&user_messages2,
summary2,
@@ -2937,6 +3010,7 @@ mod tests {
live_history.replace(rebuilt2);
rollout_items.push(RolloutItem::Compacted(CompactedItem {
message: summary2.to_string(),
replacement_history: None,
}));
let user3 = ResponseItem::Message {
@@ -2946,7 +3020,7 @@ mod tests {
text: "third user".to_string(),
}],
};
live_history.record_items(std::iter::once(&user3));
live_history.record_items(std::iter::once(&user3), turn_context.truncation_policy);
rollout_items.push(RolloutItem::ResponseItem(user3.clone()));
let assistant3 = ResponseItem::Message {
@@ -2956,7 +3030,7 @@ mod tests {
text: "assistant reply three".to_string(),
}],
};
live_history.record_items(std::iter::once(&assistant3));
live_history.record_items(std::iter::once(&assistant3), turn_context.truncation_policy);
rollout_items.push(RolloutItem::ResponseItem(assistant3.clone()));
(rollout_items, live_history.get_history())
@@ -2976,6 +3050,7 @@ mod tests {
let session = Arc::new(session);
let mut turn_context = Arc::new(turn_context_raw);
let timeout_ms = 1000;
let params = ExecParams {
command: if cfg!(windows) {
vec![
@@ -2991,7 +3066,7 @@ mod tests {
]
},
cwd: turn_context.cwd.clone(),
timeout_ms: Some(1000),
expiration: timeout_ms.into(),
env: HashMap::new(),
with_escalated_permissions: Some(true),
justification: Some("test".to_string()),
@@ -3000,7 +3075,12 @@ mod tests {
let params2 = ExecParams {
with_escalated_permissions: Some(false),
..params.clone()
command: params.command.clone(),
cwd: params.cwd.clone(),
expiration: timeout_ms.into(),
env: HashMap::new(),
justification: params.justification.clone(),
arg0: None,
};
let turn_diff_tracker = Arc::new(tokio::sync::Mutex::new(TurnDiffTracker::new()));
@@ -3020,7 +3100,7 @@ mod tests {
arguments: serde_json::json!({
"command": params.command.clone(),
"workdir": Some(turn_context.cwd.to_string_lossy().to_string()),
"timeout_ms": params.timeout_ms,
"timeout_ms": params.expiration.timeout_ms(),
"with_escalated_permissions": params.with_escalated_permissions,
"justification": params.justification.clone(),
})
@@ -3057,7 +3137,7 @@ mod tests {
arguments: serde_json::json!({
"command": params2.command.clone(),
"workdir": Some(turn_context.cwd.to_string_lossy().to_string()),
"timeout_ms": params2.timeout_ms,
"timeout_ms": params2.expiration.timeout_ms(),
"with_escalated_permissions": params2.with_escalated_permissions,
"justification": params2.justification.clone(),
})

View File

@@ -1,6 +1,8 @@
use codex_protocol::protocol::AskForApproval;
use codex_protocol::protocol::SandboxPolicy;
use crate::sandboxing::SandboxPermissions;
use crate::bash::parse_shell_lc_plain_commands;
use crate::is_safe_command::is_known_safe_command;
@@ -8,7 +10,7 @@ pub fn requires_initial_appoval(
policy: AskForApproval,
sandbox_policy: &SandboxPolicy,
command: &[String],
with_escalated_permissions: bool,
sandbox_permissions: SandboxPermissions,
) -> bool {
if is_known_safe_command(command) {
return false;
@@ -24,8 +26,7 @@ pub fn requires_initial_appoval(
// In restricted sandboxes (ReadOnly/WorkspaceWrite), do not prompt for
// nonescalated, nondangerous commands — let the sandbox enforce
// restrictions (e.g., block network/write) without a user prompt.
let wants_escalation: bool = with_escalated_permissions;
if wants_escalation {
if sandbox_permissions.requires_escalated_permissions() {
return true;
}
command_might_be_dangerous(command)

View File

@@ -267,6 +267,20 @@ mod tests {
}
}
#[test]
fn windows_powershell_full_path_is_safe() {
if !cfg!(windows) {
// Windows only: on non-Windows targets `Path` does not treat `\` as a separator,
// so the full path would not reduce to the executable name.
return;
}
assert!(is_known_safe_command(&vec_str(&[
r"C:\Program Files\PowerShell\7\pwsh.exe",
"-Command",
"Get-Location",
])));
}
#[test]
fn bash_lc_safe_examples() {
assert!(is_known_safe_command(&vec_str(&["bash", "-lc", "ls"])));

View File

@@ -1,4 +1,5 @@
use shlex::split as shlex_split;
use std::path::Path;
/// On Windows, we conservatively allow only clearly read-only PowerShell invocations
/// that match a small safelist. Anything else (including direct CMD commands) is unsafe.
@@ -131,8 +132,14 @@ fn split_into_commands(tokens: Vec<String>) -> Option<Vec<Vec<String>>> {
/// Returns true when the executable name is one of the supported PowerShell binaries.
fn is_powershell_executable(exe: &str) -> bool {
let executable_name = Path::new(exe)
.file_name()
.and_then(|osstr| osstr.to_str())
.unwrap_or(exe)
.to_ascii_lowercase();
matches!(
exe.to_ascii_lowercase().as_str(),
executable_name.as_str(),
"powershell" | "powershell.exe" | "pwsh" | "pwsh.exe"
)
}
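
Note on the hunk above: the check now compares only the final path component, so a full path to the binary matches as well as a bare name. A standalone sketch of the same idea (`looks_like_powershell` is a hypothetical name used only here):

```rust
use std::path::Path;

/// Compare only the last path component, case-insensitively, against the
/// known PowerShell executable names. Falls back to the raw string when the
/// path has no final component or is not valid UTF-8.
fn looks_like_powershell(exe: &str) -> bool {
    let name = Path::new(exe)
        .file_name()
        .and_then(|os| os.to_str())
        .unwrap_or(exe)
        .to_ascii_lowercase();
    matches!(
        name.as_str(),
        "powershell" | "powershell.exe" | "pwsh" | "pwsh.exe"
    )
}
```

`Path` splits on the separators of the target platform, so a backslash-separated Windows path reduces to its file name only when compiled for Windows. That is why the accompanying tests return early on other platforms.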
@@ -313,6 +320,27 @@ mod tests {
])));
}
#[test]
fn accepts_full_path_powershell_invocations() {
if !cfg!(windows) {
// Windows only: on non-Windows targets `Path` does not treat `\` as a separator,
// so the full path would not reduce to the executable name.
return;
}
assert!(is_safe_command_windows(&vec_str(&[
r"C:\Program Files\PowerShell\7\pwsh.exe",
"-NoProfile",
"-Command",
"Get-ChildItem -Path .",
])));
assert!(is_safe_command_windows(&vec_str(&[
r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe",
"-Command",
"Get-Content Cargo.toml",
])));
}
#[test]
fn allows_read_only_pipelines_and_git_usage() {
assert!(is_safe_command_windows(&vec_str(&[

View File

@@ -7,15 +7,18 @@ use crate::codex::TurnContext;
use crate::codex::get_last_assistant_message_from_turn;
use crate::error::CodexErr;
use crate::error::Result as CodexResult;
use crate::features::Feature;
use crate::protocol::AgentMessageEvent;
use crate::protocol::CompactedItem;
use crate::protocol::ErrorEvent;
use crate::protocol::EventMsg;
use crate::protocol::TaskStartedEvent;
use crate::protocol::TurnContextItem;
use crate::protocol::WarningEvent;
use crate::truncate::truncate_middle;
use crate::truncate::TruncationPolicy;
use crate::truncate::approx_token_count;
use crate::truncate::truncate_text;
use crate::util::backoff;
use codex_app_server_protocol::AuthMode;
use codex_protocol::items::TurnItem;
use codex_protocol::models::ContentItem;
use codex_protocol::models::ResponseInputItem;
@@ -29,12 +32,22 @@ pub const SUMMARIZATION_PROMPT: &str = include_str!("../templates/compact/prompt
pub const SUMMARY_PREFIX: &str = include_str!("../templates/compact/summary_prefix.md");
const COMPACT_USER_MESSAGE_MAX_TOKENS: usize = 20_000;
pub(crate) async fn should_use_remote_compact_task(session: &Session) -> bool {
session
.services
.auth_manager
.auth()
.is_some_and(|auth| auth.mode == AuthMode::ChatGPT)
&& session.enabled(Feature::RemoteCompaction).await
}
pub(crate) async fn run_inline_auto_compact_task(
sess: Arc<Session>,
turn_context: Arc<TurnContext>,
) {
let prompt = turn_context.compact_prompt().to_string();
let input = vec![UserInput::Text { text: prompt }];
run_compact_task_inner(sess, turn_context, input).await;
}
@@ -42,13 +55,12 @@ pub(crate) async fn run_compact_task(
sess: Arc<Session>,
turn_context: Arc<TurnContext>,
input: Vec<UserInput>,
) -> Option<String> {
) {
let start_event = EventMsg::TaskStarted(TaskStartedEvent {
model_context_window: turn_context.client.get_model_context_window(),
});
sess.send_event(&turn_context, start_event).await;
run_compact_task_inner(sess.clone(), turn_context, input).await;
None
}
async fn run_compact_task_inner(
@@ -59,7 +71,10 @@ async fn run_compact_task_inner(
let initial_input_for_turn: ResponseInputItem = ResponseInputItem::from(input);
let mut history = sess.clone_history().await;
history.record_items(&[initial_input_for_turn.into()]);
history.record_items(
&[initial_input_for_turn.into()],
turn_context.truncation_policy,
);
let mut truncated_count = 0usize;
@@ -112,9 +127,7 @@ async fn run_compact_task_inner(
continue;
}
sess.set_total_tokens_full(turn_context.as_ref()).await;
let event = EventMsg::Error(ErrorEvent {
message: e.to_string(),
});
let event = EventMsg::Error(e.to_error_event(None));
sess.send_event(&turn_context, event).await;
return;
}
@@ -125,14 +138,13 @@ async fn run_compact_task_inner(
sess.notify_stream_error(
turn_context.as_ref(),
format!("Reconnecting... {retries}/{max_retries}"),
e,
)
.await;
tokio::time::sleep(delay).await;
continue;
} else {
let event = EventMsg::Error(ErrorEvent {
message: e.to_string(),
});
let event = EventMsg::Error(e.to_error_event(None));
sess.send_event(&turn_context, event).await;
return;
}
@@ -155,18 +167,11 @@ async fn run_compact_task_inner(
.collect();
new_history.extend(ghost_snapshots);
sess.replace_history(new_history).await;
if let Some(estimated_tokens) = sess
.clone_history()
.await
.estimate_token_count(&turn_context)
{
sess.override_last_token_usage_estimate(&turn_context, estimated_tokens)
.await;
}
sess.recompute_token_usage(&turn_context).await;
let rollout_item = RolloutItem::Compacted(CompactedItem {
message: summary_text.clone(),
replacement_history: None,
});
sess.persist_rollout_items(&[rollout_item]).await;
@@ -229,7 +234,7 @@ pub(crate) fn build_compacted_history(
initial_context,
user_messages,
summary_text,
COMPACT_USER_MESSAGE_MAX_TOKENS * 4,
COMPACT_USER_MESSAGE_MAX_TOKENS,
)
}
@@ -237,20 +242,21 @@ fn build_compacted_history_with_limit(
mut history: Vec<ResponseItem>,
user_messages: &[String],
summary_text: &str,
max_bytes: usize,
max_tokens: usize,
) -> Vec<ResponseItem> {
let mut selected_messages: Vec<String> = Vec::new();
if max_bytes > 0 {
let mut remaining = max_bytes;
if max_tokens > 0 {
let mut remaining = max_tokens;
for message in user_messages.iter().rev() {
if remaining == 0 {
break;
}
if message.len() <= remaining {
let tokens = approx_token_count(message);
if tokens <= remaining {
selected_messages.push(message.clone());
remaining = remaining.saturating_sub(message.len());
remaining = remaining.saturating_sub(tokens);
} else {
let (truncated, _) = truncate_middle(message, remaining);
let truncated = truncate_text(message, TruncationPolicy::Tokens(remaining));
selected_messages.push(truncated);
break;
}
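
Note on the hunk above: the selection budget switches from bytes to approximate tokens. A rough sketch of the newest-first selection, under stated assumptions: `approx_tokens` here assumes about four bytes per token (the crate's `approx_token_count` may differ), the over-budget message is cut to a plain prefix instead of going through `truncate_text`, and the final reversal to chronological order is a choice made for this sketch.

```rust
/// Assumption for this sketch: roughly four bytes per token, rounded up.
fn approx_tokens(text: &str) -> usize {
    (text.len() + 3) / 4
}

/// Walk messages newest-first, keeping whole messages while they fit in the
/// token budget and cutting the first one that does not fit.
fn select_recent(messages: &[String], max_tokens: usize) -> Vec<String> {
    let mut selected = Vec::new();
    let mut remaining = max_tokens;
    for message in messages.iter().rev() {
        if remaining == 0 {
            break;
        }
        let tokens = approx_tokens(message);
        if tokens <= remaining {
            selected.push(message.clone());
            remaining -= tokens;
        } else {
            // Simplification: keep a prefix of `remaining * 4` characters.
            // For non-ASCII text this counts characters, not bytes.
            let prefix: String = message.chars().take(remaining * 4).collect();
            selected.push(prefix);
            break;
        }
    }
    // Return the kept messages oldest-first.
    selected.reverse();
    selected
}
```

Counting in tokens keeps the limit in the same unit as the model's context window, so the constant no longer needs the `* 4` byte conversion at the call site.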
@@ -299,7 +305,8 @@ async fn drain_to_completed(
};
match event {
Ok(ResponseEvent::OutputItemDone(item)) => {
sess.record_into_history(std::slice::from_ref(&item)).await;
sess.record_into_history(std::slice::from_ref(&item), turn_context)
.await;
}
Ok(ResponseEvent::RateLimits(snapshot)) => {
sess.update_rate_limits(turn_context, snapshot).await;
@@ -317,6 +324,7 @@ async fn drain_to_completed(
#[cfg(test)]
mod tests {
use super::*;
use pretty_assertions::assert_eq;
@@ -408,16 +416,16 @@ mod tests {
}
#[test]
fn build_compacted_history_truncates_overlong_user_messages() {
fn build_token_limited_compacted_history_truncates_overlong_user_messages() {
// Use a small truncation limit so the test remains fast while still validating
// that oversized user content is truncated.
let max_bytes = 128;
let big = "X".repeat(max_bytes + 50);
let max_tokens = 16;
let big = "word ".repeat(200);
let history = super::build_compacted_history_with_limit(
Vec::new(),
std::slice::from_ref(&big),
"SUMMARY",
max_bytes,
max_tokens,
);
assert_eq!(history.len(), 2);
@@ -450,7 +458,7 @@ mod tests {
}
#[test]
fn build_compacted_history_appends_summary_message() {
fn build_token_limited_compacted_history_appends_summary_message() {
let initial_context: Vec<ResponseItem> = Vec::new();
let user_messages = vec!["first user message".to_string()];
let summary_text = "summary text";

View File

@@ -0,0 +1,83 @@
use std::sync::Arc;
use crate::Prompt;
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::error::Result as CodexResult;
use crate::protocol::AgentMessageEvent;
use crate::protocol::CompactedItem;
use crate::protocol::EventMsg;
use crate::protocol::RolloutItem;
use crate::protocol::TaskStartedEvent;
use codex_protocol::models::ResponseItem;
pub(crate) async fn run_inline_remote_auto_compact_task(
sess: Arc<Session>,
turn_context: Arc<TurnContext>,
) {
run_remote_compact_task_inner(&sess, &turn_context).await;
}
pub(crate) async fn run_remote_compact_task(sess: Arc<Session>, turn_context: Arc<TurnContext>) {
let start_event = EventMsg::TaskStarted(TaskStartedEvent {
model_context_window: turn_context.client.get_model_context_window(),
});
sess.send_event(&turn_context, start_event).await;
run_remote_compact_task_inner(&sess, &turn_context).await;
}
async fn run_remote_compact_task_inner(sess: &Arc<Session>, turn_context: &Arc<TurnContext>) {
if let Err(err) = run_remote_compact_task_inner_impl(sess, turn_context).await {
let event = EventMsg::Error(
err.to_error_event(Some("Error running remote compact task".to_string())),
);
sess.send_event(turn_context, event).await;
}
}
async fn run_remote_compact_task_inner_impl(
sess: &Arc<Session>,
turn_context: &Arc<TurnContext>,
) -> CodexResult<()> {
let mut history = sess.clone_history().await;
let prompt = Prompt {
input: history.get_history_for_prompt(),
tools: vec![],
parallel_tool_calls: false,
base_instructions_override: turn_context.base_instructions.clone(),
output_schema: None,
};
let mut new_history = turn_context
.client
.compact_conversation_history(&prompt)
.await?;
// Required to keep `/undo` available after compaction
let ghost_snapshots: Vec<ResponseItem> = history
.get_history()
.iter()
.filter(|item| matches!(item, ResponseItem::GhostSnapshot { .. }))
.cloned()
.collect();
if !ghost_snapshots.is_empty() {
new_history.extend(ghost_snapshots);
}
sess.replace_history(new_history.clone()).await;
sess.recompute_token_usage(turn_context).await;
let compacted_item = CompactedItem {
message: String::new(),
replacement_history: Some(new_history),
};
sess.persist_rollout_items(&[RolloutItem::Compacted(compacted_item)])
.await;
let event = EventMsg::AgentMessage(AgentMessageEvent {
message: "Compact task completed".to_string(),
});
sess.send_event(turn_context, event).await;
Ok(())
}
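
Note on the new file above: after remote compaction, ghost snapshots from the old history are appended to the compacted history so `/undo` keeps working. A minimal sketch of that carry-over step, using a toy `Item` enum invented for illustration (the real `ResponseItem` has many more variants and richer payloads):

```rust
// Toy stand-in for history items; NOT the crate's `ResponseItem`.
#[derive(Clone, Debug, PartialEq)]
enum Item {
    Message(String),
    GhostSnapshot(u32),
}

/// Append every ghost snapshot from `old` to the compacted history,
/// preserving their original relative order.
fn carry_over_snapshots(old: &[Item], mut compacted: Vec<Item>) -> Vec<Item> {
    compacted.extend(
        old.iter()
            .filter(|item| matches!(item, Item::GhostSnapshot(_)))
            .cloned(),
    );
    compacted
}
```

Extending with an empty iterator is a no-op, so this sketch needs no emptiness check before the `extend` call.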

View File

@@ -4,7 +4,6 @@ use crate::config::types::Notice;
use anyhow::Context;
use codex_protocol::config_types::ReasoningEffort;
use codex_protocol::config_types::TrustLevel;
use codex_utils_tokenizer::warm_model_cache;
use std::collections::BTreeMap;
use std::path::Path;
use std::path::PathBuf;
@@ -231,9 +230,6 @@ impl ConfigDocument {
fn apply(&mut self, edit: &ConfigEdit) -> anyhow::Result<bool> {
match edit {
ConfigEdit::SetModel { model, effort } => Ok({
if let Some(model) = &model {
warm_model_cache(model)
}
let mut mutated = false;
mutated |= self.write_profile_value(
&["model"],
@@ -550,6 +546,15 @@ impl ConfigEditsBuilder {
self
}
/// Enable or disable a feature flag by key under the `[features]` table.
pub fn set_feature_enabled(mut self, key: &str, enabled: bool) -> Self {
self.edits.push(ConfigEdit::SetPath {
segments: vec!["features".to_string(), key.to_string()],
value: value(enabled),
});
self
}
/// Apply edits on a blocking thread.
pub fn apply_blocking(self) -> anyhow::Result<()> {
apply_blocking(&self.codex_home, self.profile.as_deref(), &self.edits)
@@ -836,6 +841,36 @@ hide_gpt5_1_migration_prompt = true
assert_eq!(contents, expected);
}
#[test]
fn blocking_set_hide_gpt_5_1_codex_max_migration_prompt_preserves_table() {
let tmp = tempdir().expect("tmpdir");
let codex_home = tmp.path();
std::fs::write(
codex_home.join(CONFIG_TOML_FILE),
r#"[notice]
existing = "value"
"#,
)
.expect("seed");
apply_blocking(
codex_home,
None,
&[ConfigEdit::SetNoticeHideModelMigrationPrompt(
"hide_gpt-5.1-codex-max_migration_prompt".to_string(),
true,
)],
)
.expect("persist");
let contents =
std::fs::read_to_string(codex_home.join(CONFIG_TOML_FILE)).expect("read config");
let expected = r#"[notice]
existing = "value"
"hide_gpt-5.1-codex-max_migration_prompt" = true
"#;
assert_eq!(contents, expected);
}
#[test]
fn blocking_replace_mcp_servers_round_trips() {
let tmp = tempdir().expect("tmpdir");

View File

@@ -61,9 +61,6 @@ pub mod edit;
pub mod profile;
pub mod types;
#[cfg(target_os = "windows")]
pub const OPENAI_DEFAULT_MODEL: &str = "gpt-5.1";
#[cfg(not(target_os = "windows"))]
pub const OPENAI_DEFAULT_MODEL: &str = "gpt-5.1-codex";
const OPENAI_DEFAULT_REVIEW_MODEL: &str = "gpt-5.1-codex";
pub const GPT_5_CODEX_MEDIUM_MODEL: &str = "gpt-5.1-codex";
@@ -81,7 +78,7 @@ pub struct Config {
/// Optional override of model selection.
pub model: String,
/// Model used specifically for review sessions. Defaults to "gpt-5.1-codex".
/// Model used specifically for review sessions. Defaults to "gpt-5.1-codex-max".
pub review_model: String,
pub model_family: ModelFamily,
@@ -89,9 +86,6 @@ pub struct Config {
/// Size of the context window for the model, in tokens.
pub model_context_window: Option<i64>,
/// Maximum number of output tokens.
pub model_max_output_tokens: Option<i64>,
/// Token usage threshold triggering auto-compaction of conversation history.
pub model_auto_compact_token_limit: Option<i64>,
@@ -163,6 +157,9 @@ pub struct Config {
/// and turn completions when not focused.
pub tui_notifications: Notifications,
/// Enable ASCII animations and shimmer effects in the TUI.
pub animations: bool,
/// The directory that should be treated as the current working directory
/// for the session. All relative paths inside the business-logic layer are
/// resolved against this path.
@@ -195,6 +192,9 @@ pub struct Config {
/// Additional filenames to try when looking for project-level docs.
pub project_doc_fallback_filenames: Vec<String>,
/// Token budget applied when storing tool/function outputs in the context manager.
pub tool_output_token_limit: Option<usize>,
/// Directory containing all Codex state (defaults to `~/.codex` but can be
/// overridden by the `CODEX_HOME` environment variable).
pub codex_home: PathBuf,
@@ -567,9 +567,6 @@ pub struct ConfigToml {
/// Size of the context window for the model, in tokens.
pub model_context_window: Option<i64>,
/// Maximum number of output tokens.
pub model_max_output_tokens: Option<i64>,
/// Token usage threshold triggering auto-compaction of conversation history.
pub model_auto_compact_token_limit: Option<i64>,
@@ -636,6 +633,9 @@ pub struct ConfigToml {
/// Ordered list of fallback filenames to look for when AGENTS.md is missing.
pub project_doc_fallback_filenames: Option<Vec<String>>,
/// Token budget applied when storing tool/function outputs in the context manager.
pub tool_output_token_limit: Option<usize>,
/// Profile to use from the `profiles` map.
pub profile: Option<String>,
@@ -1116,11 +1116,6 @@ impl Config {
let model_context_window = cfg
.model_context_window
.or_else(|| openai_model_info.as_ref().map(|info| info.context_window));
let model_max_output_tokens = cfg.model_max_output_tokens.or_else(|| {
openai_model_info
.as_ref()
.map(|info| info.max_output_tokens)
});
let model_auto_compact_token_limit = cfg.model_auto_compact_token_limit.or_else(|| {
openai_model_info
.as_ref()
@@ -1172,7 +1167,6 @@ impl Config {
review_model,
model_family,
model_context_window,
model_max_output_tokens,
model_auto_compact_token_limit,
model_provider_id,
model_provider,
@@ -1209,6 +1203,7 @@ impl Config {
}
})
.collect(),
tool_output_token_limit: cfg.tool_output_token_limit,
codex_home,
history,
file_opener: cfg.file_opener.unwrap_or(UriBasedFileOpener::VsCode),
@@ -1249,6 +1244,7 @@ impl Config {
.as_ref()
.map(|t| t.notifications.clone())
.unwrap_or_default(),
animations: cfg.tui.as_ref().map(|t| t.animations).unwrap_or(true),
otel: {
let t: OtelConfigToml = cfg.otel.unwrap_or_default();
let log_user_prompt = t.log_user_prompt.unwrap_or(false);
@@ -1313,6 +1309,16 @@ impl Config {
Ok(Some(s))
}
}
pub fn set_windows_sandbox_globally(&mut self, value: bool) {
crate::safety::set_windows_sandbox_enabled(value);
if value {
self.features.enable(Feature::WindowsSandbox);
} else {
self.features.disable(Feature::WindowsSandbox);
}
self.forced_auto_mode_downgraded_on_windows = !value;
}
}
fn default_model() -> String {
@@ -2943,7 +2949,6 @@ model_verbosity = "high"
review_model: OPENAI_DEFAULT_REVIEW_MODEL.to_string(),
model_family: find_family_for_model("o3").expect("known model slug"),
model_context_window: Some(200_000),
model_max_output_tokens: Some(100_000),
model_auto_compact_token_limit: Some(180_000),
model_provider_id: "openai".to_string(),
model_provider: fixture.openai_provider.clone(),
@@ -2961,6 +2966,7 @@ model_verbosity = "high"
model_providers: fixture.model_provider_map.clone(),
project_doc_max_bytes: PROJECT_DOC_MAX_BYTES,
project_doc_fallback_filenames: Vec::new(),
tool_output_token_limit: None,
codex_home: fixture.codex_home(),
history: History::default(),
file_opener: UriBasedFileOpener::VsCode,
@@ -2988,6 +2994,7 @@ model_verbosity = "high"
notices: Default::default(),
disable_paste_burst: false,
tui_notifications: Default::default(),
animations: true,
otel: OtelConfig::default(),
},
o3_profile_config
@@ -3014,7 +3021,6 @@ model_verbosity = "high"
review_model: OPENAI_DEFAULT_REVIEW_MODEL.to_string(),
model_family: find_family_for_model("gpt-3.5-turbo").expect("known model slug"),
model_context_window: Some(16_385),
model_max_output_tokens: Some(4_096),
model_auto_compact_token_limit: Some(14_746),
model_provider_id: "openai-chat-completions".to_string(),
model_provider: fixture.openai_chat_completions_provider.clone(),
@@ -3032,6 +3038,7 @@ model_verbosity = "high"
model_providers: fixture.model_provider_map.clone(),
project_doc_max_bytes: PROJECT_DOC_MAX_BYTES,
project_doc_fallback_filenames: Vec::new(),
tool_output_token_limit: None,
codex_home: fixture.codex_home(),
history: History::default(),
file_opener: UriBasedFileOpener::VsCode,
@@ -3059,6 +3066,7 @@ model_verbosity = "high"
notices: Default::default(),
disable_paste_burst: false,
tui_notifications: Default::default(),
animations: true,
otel: OtelConfig::default(),
};
@@ -3100,7 +3108,6 @@ model_verbosity = "high"
review_model: OPENAI_DEFAULT_REVIEW_MODEL.to_string(),
model_family: find_family_for_model("o3").expect("known model slug"),
model_context_window: Some(200_000),
model_max_output_tokens: Some(100_000),
model_auto_compact_token_limit: Some(180_000),
model_provider_id: "openai".to_string(),
model_provider: fixture.openai_provider.clone(),
@@ -3118,6 +3125,7 @@ model_verbosity = "high"
model_providers: fixture.model_provider_map.clone(),
project_doc_max_bytes: PROJECT_DOC_MAX_BYTES,
project_doc_fallback_filenames: Vec::new(),
tool_output_token_limit: None,
codex_home: fixture.codex_home(),
history: History::default(),
file_opener: UriBasedFileOpener::VsCode,
@@ -3145,6 +3153,7 @@ model_verbosity = "high"
notices: Default::default(),
disable_paste_burst: false,
tui_notifications: Default::default(),
animations: true,
otel: OtelConfig::default(),
};
@@ -3172,7 +3181,6 @@ model_verbosity = "high"
review_model: OPENAI_DEFAULT_REVIEW_MODEL.to_string(),
model_family: find_family_for_model("gpt-5.1").expect("known model slug"),
model_context_window: Some(272_000),
model_max_output_tokens: Some(128_000),
model_auto_compact_token_limit: Some(244_800),
model_provider_id: "openai".to_string(),
model_provider: fixture.openai_provider.clone(),
@@ -3190,6 +3198,7 @@ model_verbosity = "high"
model_providers: fixture.model_provider_map.clone(),
project_doc_max_bytes: PROJECT_DOC_MAX_BYTES,
project_doc_fallback_filenames: Vec::new(),
tool_output_token_limit: None,
codex_home: fixture.codex_home(),
history: History::default(),
file_opener: UriBasedFileOpener::VsCode,
@@ -3217,6 +3226,7 @@ model_verbosity = "high"
notices: Default::default(),
disable_paste_burst: false,
tui_notifications: Default::default(),
animations: true,
otel: OtelConfig::default(),
};
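
The hunks above resolve each setting by preferring an explicit config value and falling back to a default. A minimal sketch of that pattern, assuming simplified stand-in types (`ModelInfo`, `TuiToml`, `ConfigToml`, `Resolved` and `resolve` are hypothetical names for illustration, not the real definitions):

```rust
// Hypothetical, trimmed-down stand-ins for the types touched in this diff.
struct ModelInfo {
    context_window: i64,
}

struct TuiToml {
    animations: bool,
}

struct ConfigToml {
    model_context_window: Option<i64>,
    tui: Option<TuiToml>,
}

struct Resolved {
    model_context_window: Option<i64>,
    animations: bool,
}

fn resolve(cfg: &ConfigToml, model_info: Option<&ModelInfo>) -> Resolved {
    Resolved {
        // An explicit user value wins; otherwise use the model's known value, if any.
        model_context_window: cfg
            .model_context_window
            .or_else(|| model_info.map(|info| info.context_window)),
        // Animations stay enabled unless a `tui` table explicitly disables them.
        animations: cfg.tui.as_ref().map(|t| t.animations).unwrap_or(true),
    }
}
```

With no `tui` table and no override, `resolve` yields the model's context window and `animations == true`, which matches the `animations: true` expectation in the test fixtures above.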

@@ -282,6 +282,14 @@ pub enum OtelHttpProtocol {
Json,
}
#[derive(Deserialize, Debug, Clone, PartialEq, Default)]
#[serde(rename_all = "kebab-case")]
pub struct OtelTlsConfig {
pub ca_certificate: Option<PathBuf>,
pub client_certificate: Option<PathBuf>,
pub client_private_key: Option<PathBuf>,
}
/// Which OTEL exporter to use.
#[derive(Deserialize, Debug, Clone, PartialEq)]
#[serde(rename_all = "kebab-case")]
@@ -289,12 +297,18 @@ pub enum OtelExporterKind {
None,
OtlpHttp {
endpoint: String,
#[serde(default)]
headers: HashMap<String, String>,
protocol: OtelHttpProtocol,
#[serde(default)]
tls: Option<OtelTlsConfig>,
},
OtlpGrpc {
endpoint: String,
#[serde(default)]
headers: HashMap<String, String>,
#[serde(default)]
tls: Option<OtelTlsConfig>,
},
}
@@ -349,6 +363,15 @@ pub struct Tui {
/// Defaults to `true`.
#[serde(default)]
pub notifications: Notifications,
/// Enable animations (welcome screen, shimmer effects, spinners).
/// Defaults to `true`.
#[serde(default = "default_true")]
pub animations: bool,
}
const fn default_true() -> bool {
true
}
/// Settings for notices we display to users via the tui and app-server clients
@@ -364,6 +387,9 @@ pub struct Notice {
pub hide_rate_limit_model_nudge: Option<bool>,
/// Tracks whether the user has seen the model migration prompt
pub hide_gpt5_1_migration_prompt: Option<bool>,
/// Tracks whether the user has seen the gpt-5.1-codex-max migration prompt
#[serde(rename = "hide_gpt-5.1-codex-max_migration_prompt")]
pub hide_gpt_5_1_codex_max_migration_prompt: Option<bool>,
}
impl Notice {

@@ -1,21 +1,15 @@
use crate::codex::TurnContext;
use crate::context_manager::normalize;
use crate::truncate;
use crate::truncate::format_output_for_model_body;
use crate::truncate::globally_truncate_function_output_items;
use crate::truncate::TruncationPolicy;
use crate::truncate::approx_token_count;
use crate::truncate::truncate_function_output_items_with_policy;
use crate::truncate::truncate_text;
use codex_protocol::models::FunctionCallOutputPayload;
use codex_protocol::models::ResponseItem;
use codex_protocol::protocol::TokenUsage;
use codex_protocol::protocol::TokenUsageInfo;
use codex_utils_tokenizer::Tokenizer;
use std::ops::Deref;
const CONTEXT_WINDOW_HARD_LIMIT_FACTOR: f64 = 1.1;
const CONTEXT_WINDOW_HARD_LIMIT_BYTES: usize =
(truncate::MODEL_FORMAT_MAX_BYTES as f64 * CONTEXT_WINDOW_HARD_LIMIT_FACTOR) as usize;
const CONTEXT_WINDOW_HARD_LIMIT_LINES: usize =
(truncate::MODEL_FORMAT_MAX_LINES as f64 * CONTEXT_WINDOW_HARD_LIMIT_FACTOR) as usize;
/// Transcript of conversation history
#[derive(Debug, Clone, Default)]
pub(crate) struct ContextManager {
@@ -50,7 +44,7 @@ impl ContextManager {
}
/// `items` is ordered from oldest to newest.
pub(crate) fn record_items<I>(&mut self, items: I)
pub(crate) fn record_items<I>(&mut self, items: I, policy: TruncationPolicy)
where
I: IntoIterator,
I::Item: std::ops::Deref<Target = ResponseItem>,
@@ -62,7 +56,7 @@ impl ContextManager {
continue;
}
let processed = Self::process_item(&item);
let processed = self.process_item(item_ref, policy);
self.items.push(processed);
}
}
@@ -80,26 +74,21 @@ impl ContextManager {
history
}
// Estimate the number of tokens in the history. Return None if no tokenizer
// is available. This does not consider the reasoning traces.
// /!\ The value is a lower bound estimate and does not represent the exact
// context length.
// Estimate token usage using byte-based heuristics from the truncation helpers.
// This is a coarse lower bound, not a tokenizer-accurate count.
pub(crate) fn estimate_token_count(&self, turn_context: &TurnContext) -> Option<i64> {
let model = turn_context.client.get_model();
let tokenizer = Tokenizer::for_model(model.as_str()).ok()?;
let model_family = turn_context.client.get_model_family();
let base_tokens =
i64::try_from(approx_token_count(model_family.base_instructions.as_str()))
.unwrap_or(i64::MAX);
Some(
self.items
.iter()
.map(|item| {
serde_json::to_string(&item)
.map(|item| tokenizer.count(&item))
.unwrap_or_default()
})
.sum::<i64>()
+ tokenizer.count(model_family.base_instructions.as_str()),
)
let items_tokens = self.items.iter().fold(0i64, |acc, item| {
let serialized = serde_json::to_string(item).unwrap_or_default();
let item_tokens = i64::try_from(approx_token_count(&serialized)).unwrap_or(i64::MAX);
acc.saturating_add(item_tokens)
});
Some(base_tokens.saturating_add(items_tokens))
}
pub(crate) fn remove_first_item(&mut self) {
@@ -150,18 +139,18 @@ impl ContextManager {
items.retain(|item| !matches!(item, ResponseItem::GhostSnapshot { .. }));
}
fn process_item(item: &ResponseItem) -> ResponseItem {
fn process_item(&self, item: &ResponseItem, policy: TruncationPolicy) -> ResponseItem {
let policy_with_serialization_budget = policy.mul(1.2);
match item {
ResponseItem::FunctionCallOutput { call_id, output } => {
let truncated = format_output_for_model_body(
output.content.as_str(),
CONTEXT_WINDOW_HARD_LIMIT_BYTES,
CONTEXT_WINDOW_HARD_LIMIT_LINES,
);
let truncated_items = output
.content_items
.as_ref()
.map(|items| globally_truncate_function_output_items(items));
let truncated =
truncate_text(output.content.as_str(), policy_with_serialization_budget);
let truncated_items = output.content_items.as_ref().map(|items| {
truncate_function_output_items_with_policy(
items,
policy_with_serialization_budget,
)
});
ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output: FunctionCallOutputPayload {
@@ -172,11 +161,7 @@ impl ContextManager {
}
}
ResponseItem::CustomToolCallOutput { call_id, output } => {
let truncated = format_output_for_model_body(
output,
CONTEXT_WINDOW_HARD_LIMIT_BYTES,
CONTEXT_WINDOW_HARD_LIMIT_LINES,
);
let truncated = truncate_text(output, policy_with_serialization_budget);
ResponseItem::CustomToolCallOutput {
call_id: call_id.clone(),
output: truncated,
@@ -188,6 +173,7 @@ impl ContextManager {
| ResponseItem::FunctionCall { .. }
| ResponseItem::WebSearchCall { .. }
| ResponseItem::CustomToolCall { .. }
| ResponseItem::CompactionSummary { .. }
| ResponseItem::GhostSnapshot { .. }
| ResponseItem::Other => item.clone(),
}
@@ -205,7 +191,8 @@ fn is_api_message(message: &ResponseItem) -> bool {
| ResponseItem::CustomToolCallOutput { .. }
| ResponseItem::LocalShellCall { .. }
| ResponseItem::Reasoning { .. }
| ResponseItem::WebSearchCall { .. } => true,
| ResponseItem::WebSearchCall { .. }
| ResponseItem::CompactionSummary { .. } => true,
ResponseItem::GhostSnapshot { .. } => false,
ResponseItem::Other => false,
}
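
The new `estimate_token_count` replaces the tokenizer with a byte-based heuristic and saturating arithmetic. A rough sketch of that idea follows. The body of `approx_token_count` is not shown in this diff, so the 4-bytes-per-token ratio below is an assumption, and items are passed as already-serialized strings to keep the example dependency-free:

```rust
// Assumption: roughly 4 bytes per token. The real helper may use a different ratio.
const APPROX_BYTES_PER_TOKEN: usize = 4;

// Hypothetical stand-in for `truncate::approx_token_count`.
fn approx_token_count(text: &str) -> usize {
    (text.len() + APPROX_BYTES_PER_TOKEN - 1) / APPROX_BYTES_PER_TOKEN
}

// Sum the estimate for the base instructions and every serialized history item.
// Saturating arithmetic means an absurdly large history clamps at i64::MAX
// instead of overflowing.
fn estimate_token_count(base_instructions: &str, serialized_items: &[String]) -> i64 {
    let base_tokens = i64::try_from(approx_token_count(base_instructions)).unwrap_or(i64::MAX);
    let items_tokens = serialized_items.iter().fold(0i64, |acc, item| {
        let item_tokens = i64::try_from(approx_token_count(item)).unwrap_or(i64::MAX);
        acc.saturating_add(item_tokens)
    });
    base_tokens.saturating_add(items_tokens)
}
```

Because the estimate depends only on byte length, it needs no model lookup and cannot fail, which is presumably why the tokenizer import is dropped in this file.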

@@ -1,9 +1,8 @@
use super::*;
use crate::context_manager::MODEL_FORMAT_MAX_LINES;
use crate::truncate;
use crate::truncate::TruncationPolicy;
use codex_git::GhostCommit;
use codex_protocol::models::ContentItem;
use codex_protocol::models::FunctionCallOutputContentItem;
use codex_protocol::models::FunctionCallOutputPayload;
use codex_protocol::models::LocalShellAction;
use codex_protocol::models::LocalShellExecAction;
@@ -13,6 +12,9 @@ use codex_protocol::models::ReasoningItemReasoningSummary;
use pretty_assertions::assert_eq;
use regex_lite::Regex;
const EXEC_FORMAT_MAX_BYTES: usize = 10_000;
const EXEC_FORMAT_MAX_TOKENS: usize = 2_500;
fn assistant_msg(text: &str) -> ResponseItem {
ResponseItem::Message {
id: None,
@@ -25,7 +27,9 @@ fn assistant_msg(text: &str) -> ResponseItem {
fn create_history_with_items(items: Vec<ResponseItem>) -> ContextManager {
let mut h = ContextManager::new();
h.record_items(items.iter());
// Use a generous but fixed token budget; tests only rely on truncation
// behavior, not on a specific model's token limit.
h.record_items(items.iter(), TruncationPolicy::Tokens(10_000));
h
}
@@ -52,9 +56,14 @@ fn reasoning_msg(text: &str) -> ResponseItem {
}
}
fn truncate_exec_output(content: &str) -> String {
truncate::truncate_text(content, TruncationPolicy::Tokens(EXEC_FORMAT_MAX_TOKENS))
}
#[test]
fn filters_non_api_messages() {
let mut h = ContextManager::default();
let policy = TruncationPolicy::Tokens(10_000);
// System messages are not API messages; Other is ignored.
let system = ResponseItem::Message {
id: None,
@@ -64,12 +73,12 @@ fn filters_non_api_messages() {
}],
};
let reasoning = reasoning_msg("thinking...");
h.record_items([&system, &reasoning, &ResponseItem::Other]);
h.record_items([&system, &reasoning, &ResponseItem::Other], policy);
// User and assistant should be retained.
let u = user_msg("hi");
let a = assistant_msg("hello");
h.record_items([&u, &a]);
h.record_items([&u, &a], policy);
let items = h.contents();
assert_eq!(
@@ -223,7 +232,7 @@ fn normalization_retains_local_shell_outputs() {
ResponseItem::FunctionCallOutput {
call_id: "shell-1".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
content: "Total output lines: 1\n\nok".to_string(),
..Default::default()
},
},
@@ -237,6 +246,9 @@ fn normalization_retains_local_shell_outputs() {
#[test]
fn record_items_truncates_function_call_output_content() {
let mut history = ContextManager::new();
// Any reasonably small token budget works; the test only cares that
// truncation happens and the marker is present.
let policy = TruncationPolicy::Tokens(1_000);
let long_line = "a very long line to trigger truncation\n";
let long_output = long_line.repeat(2_500);
let item = ResponseItem::FunctionCallOutput {
@@ -248,15 +260,20 @@ fn record_items_truncates_function_call_output_content() {
},
};
history.record_items([&item]);
history.record_items([&item], policy);
assert_eq!(history.items.len(), 1);
match &history.items[0] {
ResponseItem::FunctionCallOutput { output, .. } => {
assert_ne!(output.content, long_output);
assert!(
output.content.starts_with("Total output lines:"),
"expected truncated summary, got {}",
output.content.contains("tokens truncated"),
"expected token-based truncation marker, got {}",
output.content
);
assert!(
output.content.contains("tokens truncated"),
"expected truncation marker, got {}",
output.content
);
}
@@ -267,6 +284,7 @@ fn record_items_truncates_function_call_output_content() {
#[test]
fn record_items_truncates_custom_tool_call_output_content() {
let mut history = ContextManager::new();
let policy = TruncationPolicy::Tokens(1_000);
let line = "custom output that is very long\n";
let long_output = line.repeat(2_500);
let item = ResponseItem::CustomToolCallOutput {
@@ -274,23 +292,50 @@ fn record_items_truncates_custom_tool_call_output_content() {
output: long_output.clone(),
};
history.record_items([&item]);
history.record_items([&item], policy);
assert_eq!(history.items.len(), 1);
match &history.items[0] {
ResponseItem::CustomToolCallOutput { output, .. } => {
assert_ne!(output, &long_output);
assert!(
output.starts_with("Total output lines:"),
"expected truncated summary, got {output}"
output.contains("tokens truncated"),
"expected token-based truncation marker, got {output}"
);
assert!(
output.contains("tokens truncated") || output.contains("bytes truncated"),
"expected truncation marker, got {output}"
);
}
other => panic!("unexpected history item: {other:?}"),
}
}
fn assert_truncated_message_matches(message: &str, line: &str, total_lines: usize) {
let pattern = truncated_message_pattern(line, total_lines);
#[test]
fn record_items_respects_custom_token_limit() {
let mut history = ContextManager::new();
let policy = TruncationPolicy::Tokens(10);
let long_output = "tokenized content repeated many times ".repeat(200);
let item = ResponseItem::FunctionCallOutput {
call_id: "call-custom-limit".to_string(),
output: FunctionCallOutputPayload {
content: long_output,
success: Some(true),
..Default::default()
},
};
history.record_items([&item], policy);
let stored = match &history.items[0] {
ResponseItem::FunctionCallOutput { output, .. } => output,
other => panic!("unexpected history item: {other:?}"),
};
assert!(stored.content.contains("tokens truncated"));
}
fn assert_truncated_message_matches(message: &str, line: &str, expected_removed: usize) {
let pattern = truncated_message_pattern(line);
let regex = Regex::new(&pattern).unwrap_or_else(|err| {
panic!("failed to compile regex {pattern}: {err}");
});
@@ -302,28 +347,22 @@ fn assert_truncated_message_matches(message: &str, line: &str, total_lines: usiz
.expect("missing body capture")
.as_str();
assert!(
body.len() <= truncate::MODEL_FORMAT_MAX_BYTES,
body.len() <= EXEC_FORMAT_MAX_BYTES,
"body exceeds byte limit: {} bytes",
body.len()
);
let removed: usize = captures
.name("removed")
.expect("missing removed capture")
.as_str()
.parse()
.unwrap_or_else(|err| panic!("invalid removed tokens: {err}"));
assert_eq!(removed, expected_removed, "mismatched removed token count");
}
fn truncated_message_pattern(line: &str, total_lines: usize) -> String {
let head_lines = MODEL_FORMAT_MAX_LINES / 2;
let tail_lines = MODEL_FORMAT_MAX_LINES - head_lines;
let head_take = head_lines.min(total_lines);
let tail_take = tail_lines.min(total_lines.saturating_sub(head_take));
let omitted = total_lines.saturating_sub(head_take + tail_take);
fn truncated_message_pattern(line: &str) -> String {
let escaped_line = regex_lite::escape(line);
if omitted == 0 {
return format!(
r"(?s)^Total output lines: {total_lines}\n\n(?P<body>{escaped_line}.*\n\[\.{{3}} output truncated to fit {max_bytes} bytes \.{{3}}]\n\n.*)$",
max_bytes = truncate::MODEL_FORMAT_MAX_BYTES,
);
}
format!(
r"(?s)^Total output lines: {total_lines}\n\n(?P<body>{escaped_line}.*\n\[\.{{3}} omitted {omitted} of {total_lines} lines \.{{3}}]\n\n.*)$",
)
format!(r"(?s)^(?P<body>{escaped_line}.*?)(?:\r?)?…(?P<removed>\d+) tokens truncated…(?:.*)?$")
}
#[test]
@@ -331,35 +370,18 @@ fn format_exec_output_truncates_large_error() {
let line = "very long execution error line that should trigger truncation\n";
let large_error = line.repeat(2_500); // way beyond both byte and line limits
let truncated = truncate::format_output_for_model_body(
&large_error,
truncate::MODEL_FORMAT_MAX_BYTES,
truncate::MODEL_FORMAT_MAX_LINES,
);
let truncated = truncate_exec_output(&large_error);
let total_lines = large_error.lines().count();
assert_truncated_message_matches(&truncated, line, total_lines);
assert_truncated_message_matches(&truncated, line, 36250);
assert_ne!(truncated, large_error);
}
#[test]
fn format_exec_output_marks_byte_truncation_without_omitted_lines() {
let long_line = "a".repeat(truncate::MODEL_FORMAT_MAX_BYTES + 50);
let truncated = truncate::format_output_for_model_body(
&long_line,
truncate::MODEL_FORMAT_MAX_BYTES,
truncate::MODEL_FORMAT_MAX_LINES,
);
let long_line = "a".repeat(EXEC_FORMAT_MAX_BYTES + 10000);
let truncated = truncate_exec_output(&long_line);
assert_ne!(truncated, long_line);
let marker_line = format!(
"[... output truncated to fit {} bytes ...]",
truncate::MODEL_FORMAT_MAX_BYTES
);
assert!(
truncated.contains(&marker_line),
"missing byte truncation marker: {truncated}"
);
assert_truncated_message_matches(&truncated, "a", 2500);
assert!(
!truncated.contains("omitted"),
"line omission marker should not appear when no lines were dropped: {truncated}"
@@ -369,42 +391,25 @@ fn format_exec_output_marks_byte_truncation_without_omitted_lines() {
#[test]
fn format_exec_output_returns_original_when_within_limits() {
let content = "example output\n".repeat(10);
assert_eq!(
truncate::format_output_for_model_body(
&content,
truncate::MODEL_FORMAT_MAX_BYTES,
truncate::MODEL_FORMAT_MAX_LINES
),
content
);
assert_eq!(truncate_exec_output(&content), content);
}
#[test]
fn format_exec_output_reports_omitted_lines_and_keeps_head_and_tail() {
let total_lines = truncate::MODEL_FORMAT_MAX_LINES + 100;
let total_lines = 2_000;
let filler = "x".repeat(64);
let content: String = (0..total_lines)
.map(|idx| format!("line-{idx}\n"))
.map(|idx| format!("line-{idx}-{filler}\n"))
.collect();
let truncated = truncate::format_output_for_model_body(
&content,
truncate::MODEL_FORMAT_MAX_BYTES,
truncate::MODEL_FORMAT_MAX_LINES,
);
let omitted = total_lines - truncate::MODEL_FORMAT_MAX_LINES;
let expected_marker = format!("[... omitted {omitted} of {total_lines} lines ...]");
let truncated = truncate_exec_output(&content);
assert_truncated_message_matches(&truncated, "line-0-", 34_723);
assert!(
truncated.contains(&expected_marker),
"missing omitted marker: {truncated}"
);
assert!(
truncated.contains("line-0\n"),
truncated.contains("line-0-"),
"expected head line to remain: {truncated}"
);
let last_line = format!("line-{}\n", total_lines - 1);
let last_line = format!("line-{}-", total_lines - 1);
assert!(
truncated.contains(&last_line),
"expected tail line to remain: {truncated}"
@@ -413,101 +418,15 @@ fn format_exec_output_reports_omitted_lines_and_keeps_head_and_tail() {
#[test]
fn format_exec_output_prefers_line_marker_when_both_limits_exceeded() {
let total_lines = truncate::MODEL_FORMAT_MAX_LINES + 42;
let total_lines = 300;
let long_line = "x".repeat(256);
let content: String = (0..total_lines)
.map(|idx| format!("line-{idx}-{long_line}\n"))
.collect();
let truncated = truncate::format_output_for_model_body(
&content,
truncate::MODEL_FORMAT_MAX_BYTES,
truncate::MODEL_FORMAT_MAX_LINES,
);
let truncated = truncate_exec_output(&content);
assert!(
truncated.contains("[... omitted 42 of 298 lines ...]"),
"expected omitted marker when line count exceeds limit: {truncated}"
);
assert!(
!truncated.contains("output truncated to fit"),
"line omission marker should take precedence over byte marker: {truncated}"
);
}
#[test]
fn truncates_across_multiple_under_limit_texts_and_reports_omitted() {
// Arrange: several text items, none exceeding per-item limit, but total exceeds budget.
let budget = truncate::MODEL_FORMAT_MAX_BYTES;
let t1_len = (budget / 2).saturating_sub(10);
let t2_len = (budget / 2).saturating_sub(10);
let remaining_after_t1_t2 = budget.saturating_sub(t1_len + t2_len);
let t3_len = 50; // gets truncated to remaining_after_t1_t2
let t4_len = 5; // omitted
let t5_len = 7; // omitted
let t1 = "a".repeat(t1_len);
let t2 = "b".repeat(t2_len);
let t3 = "c".repeat(t3_len);
let t4 = "d".repeat(t4_len);
let t5 = "e".repeat(t5_len);
let item = ResponseItem::FunctionCallOutput {
call_id: "call-omit".to_string(),
output: FunctionCallOutputPayload {
content: "irrelevant".to_string(),
content_items: Some(vec![
FunctionCallOutputContentItem::InputText { text: t1 },
FunctionCallOutputContentItem::InputText { text: t2 },
FunctionCallOutputContentItem::InputImage {
image_url: "img:mid".to_string(),
},
FunctionCallOutputContentItem::InputText { text: t3 },
FunctionCallOutputContentItem::InputText { text: t4 },
FunctionCallOutputContentItem::InputText { text: t5 },
]),
success: Some(true),
},
};
let mut history = ContextManager::new();
history.record_items([&item]);
assert_eq!(history.items.len(), 1);
let json = serde_json::to_value(&history.items[0]).expect("serialize to json");
let output = json
.get("output")
.expect("output field")
.as_array()
.expect("array output");
// Expect: t1 (full), t2 (full), image, t3 (truncated), summary mentioning 2 omitted.
assert_eq!(output.len(), 5);
let first = output[0].as_object().expect("first obj");
assert_eq!(first.get("type").unwrap(), "input_text");
let first_text = first.get("text").unwrap().as_str().unwrap();
assert_eq!(first_text.len(), t1_len);
let second = output[1].as_object().expect("second obj");
assert_eq!(second.get("type").unwrap(), "input_text");
let second_text = second.get("text").unwrap().as_str().unwrap();
assert_eq!(second_text.len(), t2_len);
assert_eq!(
output[2],
serde_json::json!({"type": "input_image", "image_url": "img:mid"})
);
let fourth = output[3].as_object().expect("fourth obj");
assert_eq!(fourth.get("type").unwrap(), "input_text");
let fourth_text = fourth.get("text").unwrap().as_str().unwrap();
assert_eq!(fourth_text.len(), remaining_after_t1_t2);
let summary = output[4].as_object().expect("summary obj");
assert_eq!(summary.get("type").unwrap(), "input_text");
let summary_text = summary.get("text").unwrap().as_str().unwrap();
assert!(summary_text.contains("omitted 2 text items"));
assert_truncated_message_matches(&truncated, "line-0-", 17_423);
}
//TODO(aibrahim): run CI in release mode.
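
The tests above exercise `truncate_text` with a `TruncationPolicy` and look for a `…N tokens truncated…` marker between a retained head and tail. The following is a minimal sketch of such a middle-truncation helper, not the actual implementation. The 4-bytes-per-token ratio, the even head/tail split, the `ceil` in `mul`, and the `bytes truncated` marker wording are all assumptions, although the ratio is consistent with the expected counts in the tests (for example 20,000 bytes under `Tokens(2_500)` reporting 2500 removed tokens):

```rust
// Assumption: about 4 bytes per token when converting a token budget to bytes.
const APPROX_BYTES_PER_TOKEN: usize = 4;

#[derive(Clone, Copy, Debug, PartialEq)]
enum TruncationPolicy {
    Bytes(usize),
    Tokens(usize),
}

impl TruncationPolicy {
    // Scale the budget, e.g. to leave headroom for serialization overhead.
    fn mul(self, factor: f64) -> Self {
        match self {
            Self::Bytes(n) => Self::Bytes((n as f64 * factor).ceil() as usize),
            Self::Tokens(n) => Self::Tokens((n as f64 * factor).ceil() as usize),
        }
    }

    fn byte_budget(self) -> usize {
        match self {
            Self::Bytes(n) => n,
            Self::Tokens(n) => n.saturating_mul(APPROX_BYTES_PER_TOKEN),
        }
    }
}

// Keep the head and tail of `text` and replace the middle with a marker.
fn truncate_text(text: &str, policy: TruncationPolicy) -> String {
    let budget = policy.byte_budget();
    if text.len() <= budget {
        return text.to_string();
    }
    // Split the budget between head and tail, then snap to UTF-8 boundaries.
    let mut head_end = budget / 2;
    while !text.is_char_boundary(head_end) {
        head_end -= 1;
    }
    let mut tail_start = text.len() - (budget - budget / 2);
    while !text.is_char_boundary(tail_start) {
        tail_start += 1;
    }
    let removed_bytes = tail_start - head_end;
    let marker = match policy {
        TruncationPolicy::Bytes(_) => format!("…{removed_bytes} bytes truncated…"),
        TruncationPolicy::Tokens(_) => {
            let removed_tokens =
                (removed_bytes + APPROX_BYTES_PER_TOKEN - 1) / APPROX_BYTES_PER_TOKEN;
            format!("…{removed_tokens} tokens truncated…")
        }
    };
    format!("{}{}{}", &text[..head_end], marker, &text[tail_start..])
}
```

Input within the budget is returned unchanged, mirroring `format_exec_output_returns_original_when_within_limits`. Longer input keeps both ends, so the first and last lines of a long log survive.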

@@ -1,7 +1,4 @@
mod history;
mod normalize;
pub(crate) use crate::truncate::MODEL_FORMAT_MAX_BYTES;
pub(crate) use crate::truncate::MODEL_FORMAT_MAX_LINES;
pub(crate) use crate::truncate::format_output_for_model_body;
pub(crate) use history::ContextManager;

@@ -6,6 +6,7 @@ use crate::codex::TurnContext;
use crate::protocol::AskForApproval;
use crate::protocol::SandboxPolicy;
use crate::shell::Shell;
use crate::shell::default_user_shell;
use codex_protocol::config_types::SandboxMode;
use codex_protocol::models::ContentItem;
use codex_protocol::models::ResponseItem;
@@ -28,7 +29,7 @@ pub(crate) struct EnvironmentContext {
pub sandbox_mode: Option<SandboxMode>,
pub network_access: Option<NetworkAccess>,
pub writable_roots: Option<Vec<PathBuf>>,
pub shell: Option<Shell>,
pub shell: Shell,
}
impl EnvironmentContext {
@@ -36,7 +37,7 @@ impl EnvironmentContext {
cwd: Option<PathBuf>,
approval_policy: Option<AskForApproval>,
sandbox_policy: Option<SandboxPolicy>,
shell: Option<Shell>,
shell: Shell,
) -> Self {
Self {
cwd,
@@ -110,7 +111,7 @@ impl EnvironmentContext {
} else {
None
};
EnvironmentContext::new(cwd, approval_policy, sandbox_policy, None)
EnvironmentContext::new(cwd, approval_policy, sandbox_policy, default_user_shell())
}
}
@@ -121,7 +122,7 @@ impl From<&TurnContext> for EnvironmentContext {
Some(turn_context.approval_policy),
Some(turn_context.sandbox_policy.clone()),
// Shell is not configurable from turn to turn
None,
default_user_shell(),
)
}
}
@@ -169,11 +170,9 @@ impl EnvironmentContext {
}
lines.push(" </writable_roots>".to_string());
}
if let Some(shell) = self.shell
&& let Some(shell_name) = shell.name()
{
lines.push(format!(" <shell>{shell_name}</shell>"));
}
let shell_name = self.shell.name();
lines.push(format!(" <shell>{shell_name}</shell>"));
lines.push(ENVIRONMENT_CONTEXT_CLOSE_TAG.to_string());
lines.join("\n")
}
@@ -193,12 +192,18 @@ impl From<EnvironmentContext> for ResponseItem {
#[cfg(test)]
mod tests {
use crate::shell::BashShell;
use crate::shell::ZshShell;
use crate::shell::ShellType;
use super::*;
use pretty_assertions::assert_eq;
fn fake_shell() -> Shell {
Shell {
shell_type: ShellType::Bash,
shell_path: PathBuf::from("/bin/bash"),
}
}
fn workspace_write_policy(writable_roots: Vec<&str>, network_access: bool) -> SandboxPolicy {
SandboxPolicy::WorkspaceWrite {
writable_roots: writable_roots.into_iter().map(PathBuf::from).collect(),
@@ -214,7 +219,7 @@ mod tests {
Some(PathBuf::from("/repo")),
Some(AskForApproval::OnRequest),
Some(workspace_write_policy(vec!["/repo", "/tmp"], false)),
None,
fake_shell(),
);
let expected = r#"<environment_context>
@@ -226,6 +231,7 @@ mod tests {
<root>/repo</root>
<root>/tmp</root>
</writable_roots>
<shell>bash</shell>
</environment_context>"#;
assert_eq!(context.serialize_to_xml(), expected);
@@ -237,13 +243,14 @@ mod tests {
None,
Some(AskForApproval::Never),
Some(SandboxPolicy::ReadOnly),
None,
fake_shell(),
);
let expected = r#"<environment_context>
<approval_policy>never</approval_policy>
<sandbox_mode>read-only</sandbox_mode>
<network_access>restricted</network_access>
<shell>bash</shell>
</environment_context>"#;
assert_eq!(context.serialize_to_xml(), expected);
@@ -255,13 +262,14 @@ mod tests {
None,
Some(AskForApproval::OnFailure),
Some(SandboxPolicy::DangerFullAccess),
None,
fake_shell(),
);
let expected = r#"<environment_context>
<approval_policy>on-failure</approval_policy>
<sandbox_mode>danger-full-access</sandbox_mode>
<network_access>enabled</network_access>
<shell>bash</shell>
</environment_context>"#;
assert_eq!(context.serialize_to_xml(), expected);
@@ -274,13 +282,13 @@ mod tests {
Some(PathBuf::from("/repo")),
Some(AskForApproval::OnRequest),
Some(workspace_write_policy(vec!["/repo"], false)),
None,
fake_shell(),
);
let context2 = EnvironmentContext::new(
Some(PathBuf::from("/repo")),
Some(AskForApproval::Never),
Some(workspace_write_policy(vec!["/repo"], true)),
None,
fake_shell(),
);
assert!(!context1.equals_except_shell(&context2));
}
@@ -291,13 +299,13 @@ mod tests {
Some(PathBuf::from("/repo")),
Some(AskForApproval::OnRequest),
Some(SandboxPolicy::new_read_only_policy()),
None,
fake_shell(),
);
let context2 = EnvironmentContext::new(
Some(PathBuf::from("/repo")),
Some(AskForApproval::OnRequest),
Some(SandboxPolicy::new_workspace_write_policy()),
None,
fake_shell(),
);
assert!(!context1.equals_except_shell(&context2));
@@ -309,13 +317,13 @@ mod tests {
Some(PathBuf::from("/repo")),
Some(AskForApproval::OnRequest),
Some(workspace_write_policy(vec!["/repo", "/tmp", "/var"], false)),
None,
fake_shell(),
);
let context2 = EnvironmentContext::new(
Some(PathBuf::from("/repo")),
Some(AskForApproval::OnRequest),
Some(workspace_write_policy(vec!["/repo", "/tmp"], true)),
None,
fake_shell(),
);
assert!(!context1.equals_except_shell(&context2));
@@ -327,17 +335,19 @@ mod tests {
Some(PathBuf::from("/repo")),
Some(AskForApproval::OnRequest),
Some(workspace_write_policy(vec!["/repo"], false)),
Some(Shell::Bash(BashShell {
Shell {
shell_type: ShellType::Bash,
shell_path: "/bin/bash".into(),
})),
},
);
let context2 = EnvironmentContext::new(
Some(PathBuf::from("/repo")),
Some(AskForApproval::OnRequest),
Some(workspace_write_policy(vec!["/repo"], false)),
Some(Shell::Zsh(ZshShell {
Shell {
shell_type: ShellType::Zsh,
shell_path: "/bin/zsh".into(),
})),
},
);
assert!(context1.equals_except_shell(&context2));
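
With `shell` now a plain `Shell` instead of `Option<Shell>`, the `<shell>` element is always emitted. A small sketch of that serialization, assuming a reduced `Shell` shape modeled on the test fixture above (the `name` mapping, the two-space indentation, and the free function `serialize_to_xml` are illustrative assumptions):

```rust
use std::path::PathBuf;

#[derive(Clone, Copy)]
enum ShellType {
    Bash,
    Zsh,
}

// Mirrors the fields used by `fake_shell()` in the tests; other details are omitted.
struct Shell {
    shell_type: ShellType,
    shell_path: PathBuf,
}

impl Shell {
    // Assumed mapping from shell type to the name written into the XML.
    fn name(&self) -> &'static str {
        match self.shell_type {
            ShellType::Bash => "bash",
            ShellType::Zsh => "zsh",
        }
    }
}

// Optional fields are still skipped when absent, but the shell line is unconditional.
fn serialize_to_xml(cwd: Option<&str>, shell: &Shell) -> String {
    let mut lines = vec!["<environment_context>".to_string()];
    if let Some(cwd) = cwd {
        lines.push(format!("  <cwd>{cwd}</cwd>"));
    }
    lines.push(format!("  <shell>{}</shell>", shell.name()));
    lines.push("</environment_context>".to_string());
    lines.join("\n")
}
```

Making the field non-optional removes the nested `if let` in the serializer and is why every expected XML string in the tests gains a `<shell>bash</shell>` line.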

@@ -2,13 +2,16 @@ use crate::codex::ProcessedResponseItem;
use crate::exec::ExecToolCallOutput;
use crate::token_data::KnownPlan;
use crate::token_data::PlanType;
use crate::truncate::truncate_middle;
use crate::truncate::TruncationPolicy;
use crate::truncate::truncate_text;
use chrono::DateTime;
use chrono::Datelike;
use chrono::Local;
use chrono::Utc;
use codex_async_utils::CancelErr;
use codex_protocol::ConversationId;
use codex_protocol::protocol::CodexErrorInfo;
use codex_protocol::protocol::ErrorEvent;
use codex_protocol::protocol::RateLimitSnapshot;
use reqwest::StatusCode;
use serde_json;
@@ -429,6 +432,57 @@ impl CodexErr {
pub fn downcast_ref<T: std::any::Any>(&self) -> Option<&T> {
(self as &dyn std::any::Any).downcast_ref::<T>()
}
/// Translate core error to client-facing protocol error.
pub fn to_codex_protocol_error(&self) -> CodexErrorInfo {
match self {
CodexErr::ContextWindowExceeded => CodexErrorInfo::ContextWindowExceeded,
CodexErr::UsageLimitReached(_)
| CodexErr::QuotaExceeded
| CodexErr::UsageNotIncluded => CodexErrorInfo::UsageLimitExceeded,
CodexErr::RetryLimit(_) => CodexErrorInfo::ResponseTooManyFailedAttempts {
http_status_code: self.http_status_code_value(),
},
CodexErr::ConnectionFailed(_) => CodexErrorInfo::HttpConnectionFailed {
http_status_code: self.http_status_code_value(),
},
CodexErr::ResponseStreamFailed(_) => CodexErrorInfo::ResponseStreamConnectionFailed {
http_status_code: self.http_status_code_value(),
},
CodexErr::RefreshTokenFailed(_) => CodexErrorInfo::Unauthorized,
CodexErr::SessionConfiguredNotFirstEvent
| CodexErr::InternalServerError
| CodexErr::InternalAgentDied => CodexErrorInfo::InternalServerError,
CodexErr::UnsupportedOperation(_) | CodexErr::ConversationNotFound(_) => {
CodexErrorInfo::BadRequest
}
CodexErr::Sandbox(_) => CodexErrorInfo::SandboxError,
_ => CodexErrorInfo::Other,
}
}
pub fn to_error_event(&self, message_prefix: Option<String>) -> ErrorEvent {
let error_message = self.to_string();
let message: String = match message_prefix {
Some(prefix) => format!("{prefix}: {error_message}"),
None => error_message,
};
ErrorEvent {
message,
codex_error_info: Some(self.to_codex_protocol_error()),
}
}
pub fn http_status_code_value(&self) -> Option<u16> {
let http_status_code = match self {
CodexErr::RetryLimit(err) => Some(err.status),
CodexErr::UnexpectedStatus(err) => Some(err.status),
CodexErr::ConnectionFailed(err) => err.source.status(),
CodexErr::ResponseStreamFailed(err) => err.source.status(),
_ => None,
};
http_status_code.as_ref().map(StatusCode::as_u16)
}
}
pub fn get_error_message_ui(e: &CodexErr) -> String {
@@ -461,7 +515,10 @@ pub fn get_error_message_ui(e: &CodexErr) -> String {
_ => e.to_string(),
};
truncate_middle(&message, ERROR_MESSAGE_UI_MAX_BYTES).0
truncate_text(
&message,
TruncationPolicy::Bytes(ERROR_MESSAGE_UI_MAX_BYTES),
)
}
#[cfg(test)]
@@ -474,6 +531,10 @@ mod tests {
use chrono::Utc;
use codex_protocol::protocol::RateLimitWindow;
use pretty_assertions::assert_eq;
use reqwest::Response;
use reqwest::ResponseBuilderExt;
use reqwest::StatusCode;
use reqwest::Url;
fn rate_limit_snapshot() -> RateLimitSnapshot {
let primary_reset_at = Utc
@@ -495,6 +556,7 @@ mod tests {
window_minutes: Some(120),
resets_at: Some(secondary_reset_at),
}),
credits: None,
}
}
@@ -568,6 +630,33 @@ mod tests {
assert_eq!(get_error_message_ui(&err), "stdout only");
}
#[test]
fn to_error_event_handles_response_stream_failed() {
let response = http::Response::builder()
.status(StatusCode::TOO_MANY_REQUESTS)
.url(Url::parse("http://example.com").unwrap())
.body("")
.unwrap();
let source = Response::from(response).error_for_status_ref().unwrap_err();
let err = CodexErr::ResponseStreamFailed(ResponseStreamFailed {
source,
request_id: Some("req-123".to_string()),
});
let event = err.to_error_event(Some("prefix".to_string()));
assert_eq!(
event.message,
"prefix: Error while reading the server response: HTTP status client error (429 Too Many Requests) for url (http://example.com/), request id: req-123"
);
assert_eq!(
event.codex_error_info,
Some(CodexErrorInfo::ResponseStreamConnectionFailed {
http_status_code: Some(429)
})
);
}
#[test]
fn sandbox_denied_reports_exit_code_when_no_output_available() {
let output = ExecToolCallOutput {


@@ -117,7 +117,7 @@ pub fn parse_turn_item(item: &ResponseItem) -> Option<TurnItem> {
..
} => Some(TurnItem::WebSearch(WebSearchItem {
id: id.clone().unwrap_or_default(),
query: query.clone(),
query: query.clone().unwrap_or_default(),
})),
_ => None,
}
@@ -306,7 +306,7 @@ mod tests {
id: Some("ws_1".to_string()),
status: Some("completed".to_string()),
action: WebSearchAction::Search {
query: "weather".to_string(),
query: Some("weather".to_string()),
},
};


@@ -14,6 +14,7 @@ use tokio::io::AsyncRead;
use tokio::io::AsyncReadExt;
use tokio::io::BufReader;
use tokio::process::Child;
use tokio_util::sync::CancellationToken;
use crate::error::CodexErr;
use crate::error::Result;
@@ -28,8 +29,9 @@ use crate::sandboxing::ExecEnv;
use crate::sandboxing::SandboxManager;
use crate::spawn::StdioPolicy;
use crate::spawn::spawn_child_async;
use crate::text_encoding::bytes_to_string_smart;
const DEFAULT_TIMEOUT_MS: u64 = 10_000;
pub const DEFAULT_EXEC_COMMAND_TIMEOUT_MS: u64 = 10_000;
// Hardcode these since it does not seem worth including the libc crate just
// for these.
@@ -46,20 +48,59 @@ const AGGREGATE_BUFFER_INITIAL_CAPACITY: usize = 8 * 1024; // 8 KiB
/// Aggregation still collects full output; only the live event stream is capped.
pub(crate) const MAX_EXEC_OUTPUT_DELTAS_PER_CALL: usize = 10_000;
#[derive(Clone, Debug)]
#[derive(Debug)]
pub struct ExecParams {
pub command: Vec<String>,
pub cwd: PathBuf,
pub timeout_ms: Option<u64>,
pub expiration: ExecExpiration,
pub env: HashMap<String, String>,
pub with_escalated_permissions: Option<bool>,
pub justification: Option<String>,
pub arg0: Option<String>,
}
impl ExecParams {
pub fn timeout_duration(&self) -> Duration {
Duration::from_millis(self.timeout_ms.unwrap_or(DEFAULT_TIMEOUT_MS))
/// Mechanism to terminate an exec invocation before it finishes naturally.
#[derive(Debug)]
pub enum ExecExpiration {
Timeout(Duration),
DefaultTimeout,
Cancellation(CancellationToken),
}
impl From<Option<u64>> for ExecExpiration {
fn from(timeout_ms: Option<u64>) -> Self {
timeout_ms.map_or(ExecExpiration::DefaultTimeout, |timeout_ms| {
ExecExpiration::Timeout(Duration::from_millis(timeout_ms))
})
}
}
impl From<u64> for ExecExpiration {
fn from(timeout_ms: u64) -> Self {
ExecExpiration::Timeout(Duration::from_millis(timeout_ms))
}
}
impl ExecExpiration {
async fn wait(self) {
match self {
ExecExpiration::Timeout(duration) => tokio::time::sleep(duration).await,
ExecExpiration::DefaultTimeout => {
tokio::time::sleep(Duration::from_millis(DEFAULT_EXEC_COMMAND_TIMEOUT_MS)).await
}
ExecExpiration::Cancellation(cancel) => {
cancel.cancelled().await;
}
}
}
/// If ExecExpiration is a timeout, returns the timeout in milliseconds.
pub(crate) fn timeout_ms(&self) -> Option<u64> {
match self {
ExecExpiration::Timeout(duration) => Some(duration.as_millis() as u64),
ExecExpiration::DefaultTimeout => Some(DEFAULT_EXEC_COMMAND_TIMEOUT_MS),
ExecExpiration::Cancellation(_) => None,
}
}
}
@@ -95,7 +136,7 @@ pub async fn process_exec_tool_call(
let ExecParams {
command,
cwd,
timeout_ms,
expiration,
env,
with_escalated_permissions,
justification,
@@ -114,7 +155,7 @@ pub async fn process_exec_tool_call(
args: args.to_vec(),
cwd,
env,
timeout_ms,
expiration,
with_escalated_permissions,
justification,
};
@@ -122,7 +163,7 @@ pub async fn process_exec_tool_call(
let manager = SandboxManager::new();
let exec_env = manager
.transform(
&spec,
spec,
sandbox_policy,
sandbox_type,
sandbox_cwd,
@@ -131,7 +172,7 @@ pub async fn process_exec_tool_call(
.map_err(CodexErr::from)?;
// Route through the sandboxing module for a single, unified execution path.
crate::sandboxing::execute_env(&exec_env, sandbox_policy, stdout_stream).await
crate::sandboxing::execute_env(exec_env, sandbox_policy, stdout_stream).await
}
pub(crate) async fn execute_exec_env(
@@ -143,7 +184,7 @@ pub(crate) async fn execute_exec_env(
command,
cwd,
env,
timeout_ms,
expiration,
sandbox,
with_escalated_permissions,
justification,
@@ -153,7 +194,7 @@ pub(crate) async fn execute_exec_env(
let params = ExecParams {
command,
cwd,
timeout_ms,
expiration,
env,
with_escalated_permissions,
justification,
@@ -178,16 +219,18 @@ async fn exec_windows_sandbox(
command,
cwd,
env,
timeout_ms,
expiration,
..
} = params;
// TODO(iceweasel-oai): run_windows_sandbox_capture should support all
// variants of ExecExpiration, not just timeout.
let timeout_ms = expiration.timeout_ms();
let policy_str = match sandbox_policy {
SandboxPolicy::DangerFullAccess => "workspace-write",
SandboxPolicy::ReadOnly => "read-only",
SandboxPolicy::WorkspaceWrite { .. } => "workspace-write",
};
let policy_str = serde_json::to_string(sandbox_policy).map_err(|err| {
CodexErr::Io(io::Error::other(format!(
"failed to serialize Windows sandbox policy: {err}"
)))
})?;
let sandbox_cwd = cwd.clone();
let codex_home = find_codex_home().map_err(|err| {
CodexErr::Io(io::Error::other(format!(
@@ -196,7 +239,7 @@ async fn exec_windows_sandbox(
})?;
let spawn_res = tokio::task::spawn_blocking(move || {
run_windows_sandbox_capture(
policy_str,
policy_str.as_str(),
&sandbox_cwd,
codex_home.as_ref(),
command,
@@ -415,7 +458,7 @@ impl StreamOutput<String> {
impl StreamOutput<Vec<u8>> {
pub fn from_utf8_lossy(&self) -> StreamOutput<String> {
StreamOutput {
text: String::from_utf8_lossy(&self.text).to_string(),
text: bytes_to_string_smart(&self.text),
truncated_after_lines: self.truncated_after_lines,
}
}
@@ -444,15 +487,17 @@ async fn exec(
stdout_stream: Option<StdoutStream>,
) -> Result<RawExecToolCallOutput> {
#[cfg(target_os = "windows")]
if sandbox == SandboxType::WindowsRestrictedToken {
if sandbox == SandboxType::WindowsRestrictedToken
&& !matches!(sandbox_policy, SandboxPolicy::DangerFullAccess)
{
return exec_windows_sandbox(params, sandbox_policy).await;
}
let timeout = params.timeout_duration();
let ExecParams {
command,
cwd,
env,
arg0,
expiration,
..
} = params;
@@ -473,14 +518,14 @@ async fn exec(
env,
)
.await?;
consume_truncated_output(child, timeout, stdout_stream).await
consume_truncated_output(child, expiration, stdout_stream).await
}
/// Consumes the output of a child process, truncating it so it is suitable for
/// use as the output of a `shell` tool call. Also enforces specified timeout.
async fn consume_truncated_output(
mut child: Child,
timeout: Duration,
expiration: ExecExpiration,
stdout_stream: Option<StdoutStream>,
) -> Result<RawExecToolCallOutput> {
// Both stdout and stderr were configured with `Stdio::piped()`
@@ -514,20 +559,14 @@ async fn consume_truncated_output(
));
let (exit_status, timed_out) = tokio::select! {
result = tokio::time::timeout(timeout, child.wait()) => {
match result {
Ok(status_result) => {
let exit_status = status_result?;
(exit_status, false)
}
Err(_) => {
// timeout
kill_child_process_group(&mut child)?;
child.start_kill()?;
// Debatable whether `child.wait().await` should be called here.
(synthetic_exit_status(EXIT_CODE_SIGNAL_BASE + TIMEOUT_CODE), true)
}
}
status_result = child.wait() => {
let exit_status = status_result?;
(exit_status, false)
}
_ = expiration.wait() => {
kill_child_process_group(&mut child)?;
child.start_kill()?;
(synthetic_exit_status(EXIT_CODE_SIGNAL_BASE + TIMEOUT_CODE), true)
}
_ = tokio::signal::ctrl_c() => {
kill_child_process_group(&mut child)?;
@@ -779,6 +818,15 @@ mod tests {
#[cfg(unix)]
#[tokio::test]
async fn kill_child_process_group_kills_grandchildren_on_timeout() -> Result<()> {
// On Linux/macOS, /bin/bash is typically present; on FreeBSD/OpenBSD,
// prefer /bin/sh to avoid NotFound errors.
#[cfg(any(target_os = "freebsd", target_os = "openbsd"))]
let command = vec![
"/bin/sh".to_string(),
"-c".to_string(),
"sleep 60 & echo $!; sleep 60".to_string(),
];
#[cfg(all(unix, not(any(target_os = "freebsd", target_os = "openbsd"))))]
let command = vec![
"/bin/bash".to_string(),
"-c".to_string(),
@@ -788,7 +836,7 @@ mod tests {
let params = ExecParams {
command,
cwd: std::env::current_dir()?,
timeout_ms: Some(500),
expiration: 500.into(),
env,
with_escalated_permissions: None,
justification: None,
@@ -822,4 +870,62 @@ mod tests {
assert!(killed, "grandchild process with pid {pid} is still alive");
Ok(())
}
#[tokio::test]
async fn process_exec_tool_call_respects_cancellation_token() -> Result<()> {
let command = long_running_command();
let cwd = std::env::current_dir()?;
let env: HashMap<String, String> = std::env::vars().collect();
let cancel_token = CancellationToken::new();
let cancel_tx = cancel_token.clone();
let params = ExecParams {
command,
cwd: cwd.clone(),
expiration: ExecExpiration::Cancellation(cancel_token),
env,
with_escalated_permissions: None,
justification: None,
arg0: None,
};
tokio::spawn(async move {
tokio::time::sleep(Duration::from_millis(1_000)).await;
cancel_tx.cancel();
});
let result = process_exec_tool_call(
params,
SandboxType::None,
&SandboxPolicy::DangerFullAccess,
cwd.as_path(),
&None,
None,
)
.await;
let output = match result {
Err(CodexErr::Sandbox(SandboxErr::Timeout { output })) => output,
other => panic!("expected timeout error, got {other:?}"),
};
assert!(output.timed_out);
assert_eq!(output.exit_code, EXEC_TIMEOUT_EXIT_CODE);
Ok(())
}
#[cfg(unix)]
fn long_running_command() -> Vec<String> {
vec![
"/bin/sh".to_string(),
"-c".to_string(),
"sleep 30".to_string(),
]
}
#[cfg(windows)]
fn long_running_command() -> Vec<String> {
vec![
"powershell.exe".to_string(),
"-NonInteractive".to_string(),
"-NoLogo".to_string(),
"-Command".to_string(),
"Start-Sleep -Seconds 30".to_string(),
]
}
}
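
The `ExecExpiration` change above replaces a bare `Option<u64>` timeout with an enum that callers can build via `.into()`. A minimal, std-only sketch of that conversion pattern follows. The names `Expiration` and `DEFAULT_TIMEOUT_MS` are illustrative stand-ins, and the `Cancellation` variant is left out because it depends on `tokio_util::sync::CancellationToken`, so this is an approximation of the idea rather than the code in the diff.

```rust
use std::time::Duration;

/// Assumed default, mirroring the 10_000 ms constant shown in the diff.
const DEFAULT_TIMEOUT_MS: u64 = 10_000;

/// Simplified stand-in for `ExecExpiration` (hypothetical name).
/// The cancellation-token variant is intentionally omitted.
#[derive(Debug, PartialEq)]
enum Expiration {
    Timeout(Duration),
    DefaultTimeout,
}

impl From<Option<u64>> for Expiration {
    /// `None` means "use the default"; `Some(ms)` is an explicit timeout.
    fn from(timeout_ms: Option<u64>) -> Self {
        timeout_ms.map_or(Expiration::DefaultTimeout, |ms| {
            Expiration::Timeout(Duration::from_millis(ms))
        })
    }
}

impl Expiration {
    /// Effective timeout in milliseconds for either variant.
    fn timeout_ms(&self) -> u64 {
        match self {
            Expiration::Timeout(duration) => duration.as_millis() as u64,
            Expiration::DefaultTimeout => DEFAULT_TIMEOUT_MS,
        }
    }
}
```

Keeping the "no value given" case as its own variant, instead of resolving it to a number at construction time, lets the default change in one place without touching call sites.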


@@ -0,0 +1,365 @@
use std::io::ErrorKind;
use std::path::Path;
use std::path::PathBuf;
use std::sync::Arc;
use crate::command_safety::is_dangerous_command::requires_initial_appoval;
use codex_execpolicy::Decision;
use codex_execpolicy::Evaluation;
use codex_execpolicy::Policy;
use codex_execpolicy::PolicyParser;
use codex_protocol::protocol::AskForApproval;
use codex_protocol::protocol::SandboxPolicy;
use thiserror::Error;
use tokio::fs;
use crate::bash::parse_shell_lc_plain_commands;
use crate::features::Feature;
use crate::features::Features;
use crate::sandboxing::SandboxPermissions;
use crate::tools::sandboxing::ApprovalRequirement;
const FORBIDDEN_REASON: &str = "execpolicy forbids this command";
const PROMPT_REASON: &str = "execpolicy requires approval for this command";
const POLICY_DIR_NAME: &str = "policy";
const POLICY_EXTENSION: &str = "codexpolicy";
#[derive(Debug, Error)]
pub enum ExecPolicyError {
#[error("failed to read execpolicy files from {dir}: {source}")]
ReadDir {
dir: PathBuf,
source: std::io::Error,
},
#[error("failed to read execpolicy file {path}: {source}")]
ReadFile {
path: PathBuf,
source: std::io::Error,
},
#[error("failed to parse execpolicy file {path}: {source}")]
ParsePolicy {
path: String,
source: codex_execpolicy::Error,
},
}
pub(crate) async fn exec_policy_for(
features: &Features,
codex_home: &Path,
) -> Result<Arc<Policy>, ExecPolicyError> {
if !features.enabled(Feature::ExecPolicy) {
return Ok(Arc::new(Policy::empty()));
}
let policy_dir = codex_home.join(POLICY_DIR_NAME);
let policy_paths = collect_policy_files(&policy_dir).await?;
let mut parser = PolicyParser::new();
for policy_path in &policy_paths {
let contents =
fs::read_to_string(policy_path)
.await
.map_err(|source| ExecPolicyError::ReadFile {
path: policy_path.clone(),
source,
})?;
let identifier = policy_path.to_string_lossy().to_string();
parser
.parse(&identifier, &contents)
.map_err(|source| ExecPolicyError::ParsePolicy {
path: identifier,
source,
})?;
}
let policy = Arc::new(parser.build());
tracing::debug!(
"loaded execpolicy from {} files in {}",
policy_paths.len(),
policy_dir.display()
);
Ok(policy)
}
fn evaluate_with_policy(
policy: &Policy,
command: &[String],
approval_policy: AskForApproval,
) -> Option<ApprovalRequirement> {
let commands = parse_shell_lc_plain_commands(command).unwrap_or_else(|| vec![command.to_vec()]);
let evaluation = policy.check_multiple(commands.iter());
match evaluation {
Evaluation::Match { decision, .. } => match decision {
Decision::Forbidden => Some(ApprovalRequirement::Forbidden {
reason: FORBIDDEN_REASON.to_string(),
}),
Decision::Prompt => {
let reason = PROMPT_REASON.to_string();
if matches!(approval_policy, AskForApproval::Never) {
Some(ApprovalRequirement::Forbidden { reason })
} else {
Some(ApprovalRequirement::NeedsApproval {
reason: Some(reason),
})
}
}
Decision::Allow => Some(ApprovalRequirement::Skip),
},
Evaluation::NoMatch { .. } => None,
}
}
pub(crate) fn create_approval_requirement_for_command(
policy: &Policy,
command: &[String],
approval_policy: AskForApproval,
sandbox_policy: &SandboxPolicy,
sandbox_permissions: SandboxPermissions,
) -> ApprovalRequirement {
if let Some(requirement) = evaluate_with_policy(policy, command, approval_policy) {
return requirement;
}
if requires_initial_appoval(
approval_policy,
sandbox_policy,
command,
sandbox_permissions,
) {
ApprovalRequirement::NeedsApproval { reason: None }
} else {
ApprovalRequirement::Skip
}
}
async fn collect_policy_files(dir: &Path) -> Result<Vec<PathBuf>, ExecPolicyError> {
let mut read_dir = match fs::read_dir(dir).await {
Ok(read_dir) => read_dir,
Err(err) if err.kind() == ErrorKind::NotFound => return Ok(Vec::new()),
Err(source) => {
return Err(ExecPolicyError::ReadDir {
dir: dir.to_path_buf(),
source,
});
}
};
let mut policy_paths = Vec::new();
while let Some(entry) =
read_dir
.next_entry()
.await
.map_err(|source| ExecPolicyError::ReadDir {
dir: dir.to_path_buf(),
source,
})?
{
let path = entry.path();
let file_type = entry
.file_type()
.await
.map_err(|source| ExecPolicyError::ReadDir {
dir: dir.to_path_buf(),
source,
})?;
if path
.extension()
.and_then(|ext| ext.to_str())
.is_some_and(|ext| ext == POLICY_EXTENSION)
&& file_type.is_file()
{
policy_paths.push(path);
}
}
policy_paths.sort();
Ok(policy_paths)
}
#[cfg(test)]
mod tests {
use super::*;
use crate::features::Feature;
use crate::features::Features;
use codex_protocol::protocol::AskForApproval;
use codex_protocol::protocol::SandboxPolicy;
use pretty_assertions::assert_eq;
use std::fs;
use tempfile::tempdir;
#[tokio::test]
async fn returns_empty_policy_when_feature_disabled() {
let mut features = Features::with_defaults();
features.disable(Feature::ExecPolicy);
let temp_dir = tempdir().expect("create temp dir");
let policy = exec_policy_for(&features, temp_dir.path())
.await
.expect("policy result");
let commands = [vec!["rm".to_string()]];
assert!(matches!(
policy.check_multiple(commands.iter()),
Evaluation::NoMatch { .. }
));
assert!(!temp_dir.path().join(POLICY_DIR_NAME).exists());
}
#[tokio::test]
async fn collect_policy_files_returns_empty_when_dir_missing() {
let temp_dir = tempdir().expect("create temp dir");
let policy_dir = temp_dir.path().join(POLICY_DIR_NAME);
let files = collect_policy_files(&policy_dir)
.await
.expect("collect policy files");
assert!(files.is_empty());
}
#[tokio::test]
async fn loads_policies_from_policy_subdirectory() {
let temp_dir = tempdir().expect("create temp dir");
let policy_dir = temp_dir.path().join(POLICY_DIR_NAME);
fs::create_dir_all(&policy_dir).expect("create policy dir");
fs::write(
policy_dir.join("deny.codexpolicy"),
r#"prefix_rule(pattern=["rm"], decision="forbidden")"#,
)
.expect("write policy file");
let policy = exec_policy_for(&Features::with_defaults(), temp_dir.path())
.await
.expect("policy result");
let command = [vec!["rm".to_string()]];
assert!(matches!(
policy.check_multiple(command.iter()),
Evaluation::Match { .. }
));
}
#[tokio::test]
async fn ignores_policies_outside_policy_dir() {
let temp_dir = tempdir().expect("create temp dir");
fs::write(
temp_dir.path().join("root.codexpolicy"),
r#"prefix_rule(pattern=["ls"], decision="prompt")"#,
)
.expect("write policy file");
let policy = exec_policy_for(&Features::with_defaults(), temp_dir.path())
.await
.expect("policy result");
let command = [vec!["ls".to_string()]];
assert!(matches!(
policy.check_multiple(command.iter()),
Evaluation::NoMatch { .. }
));
}
#[test]
fn evaluates_bash_lc_inner_commands() {
let policy_src = r#"
prefix_rule(pattern=["rm"], decision="forbidden")
"#;
let mut parser = PolicyParser::new();
parser
.parse("test.codexpolicy", policy_src)
.expect("parse policy");
let policy = parser.build();
let forbidden_script = vec![
"bash".to_string(),
"-lc".to_string(),
"rm -rf /tmp".to_string(),
];
let requirement =
evaluate_with_policy(&policy, &forbidden_script, AskForApproval::OnRequest)
.expect("expected match for forbidden command");
assert_eq!(
requirement,
ApprovalRequirement::Forbidden {
reason: FORBIDDEN_REASON.to_string()
}
);
}
#[test]
fn approval_requirement_prefers_execpolicy_match() {
let policy_src = r#"prefix_rule(pattern=["rm"], decision="prompt")"#;
let mut parser = PolicyParser::new();
parser
.parse("test.codexpolicy", policy_src)
.expect("parse policy");
let policy = parser.build();
let command = vec!["rm".to_string()];
let requirement = create_approval_requirement_for_command(
&policy,
&command,
AskForApproval::OnRequest,
&SandboxPolicy::DangerFullAccess,
SandboxPermissions::UseDefault,
);
assert_eq!(
requirement,
ApprovalRequirement::NeedsApproval {
reason: Some(PROMPT_REASON.to_string())
}
);
}
#[test]
fn approval_requirement_respects_approval_policy() {
let policy_src = r#"prefix_rule(pattern=["rm"], decision="prompt")"#;
let mut parser = PolicyParser::new();
parser
.parse("test.codexpolicy", policy_src)
.expect("parse policy");
let policy = parser.build();
let command = vec!["rm".to_string()];
let requirement = create_approval_requirement_for_command(
&policy,
&command,
AskForApproval::Never,
&SandboxPolicy::DangerFullAccess,
SandboxPermissions::UseDefault,
);
assert_eq!(
requirement,
ApprovalRequirement::Forbidden {
reason: PROMPT_REASON.to_string()
}
);
}
#[test]
fn approval_requirement_falls_back_to_heuristics() {
let command = vec!["python".to_string()];
let empty_policy = Policy::empty();
let requirement = create_approval_requirement_for_command(
&empty_policy,
&command,
AskForApproval::UnlessTrusted,
&SandboxPolicy::ReadOnly,
SandboxPermissions::UseDefault,
);
assert_eq!(
requirement,
ApprovalRequirement::NeedsApproval { reason: None }
);
}
}
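
The decision flow in `evaluate_with_policy` and `create_approval_requirement_for_command` can be summarised as a single match: an execpolicy rule wins when it matches, a `prompt` decision becomes a denial when approvals are disabled, and the heuristic is consulted only when no rule matched. The sketch below models that with plain enums. `Requirement`, `requirement_for`, and the boolean parameters are hypothetical simplifications, and the reason strings are dropped.

```rust
/// Simplified stand-in for `codex_execpolicy::Decision`.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Decision {
    Allow,
    Prompt,
    Forbidden,
}

/// Simplified stand-in for `ApprovalRequirement` (reasons omitted).
#[derive(Debug, PartialEq)]
enum Requirement {
    Skip,
    NeedsApproval,
    Forbidden,
}

/// `policy_match` is `None` when no execpolicy rule matched the command.
/// `never_ask` models `AskForApproval::Never`.
/// `heuristic_needs_approval` is a hypothetical stand-in for the result of
/// the fallback safety check.
fn requirement_for(
    policy_match: Option<Decision>,
    never_ask: bool,
    heuristic_needs_approval: bool,
) -> Requirement {
    match policy_match {
        Some(Decision::Forbidden) => Requirement::Forbidden,
        // A prompt cannot be shown when approvals are disabled, so deny.
        Some(Decision::Prompt) if never_ask => Requirement::Forbidden,
        Some(Decision::Prompt) => Requirement::NeedsApproval,
        Some(Decision::Allow) => Requirement::Skip,
        // No rule matched: fall back to the heuristic.
        None if heuristic_needs_approval => Requirement::NeedsApproval,
        None => Requirement::Skip,
    }
}
```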


@@ -27,11 +27,10 @@ pub enum Stage {
/// Unique features toggled via configuration.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub enum Feature {
/// Create a ghost commit at each turn.
GhostCommit,
/// Use the single unified PTY-backed exec tool.
UnifiedExec,
/// Use the shell command tool that takes `command` as a single string of
/// shell instead of an array of args passed to `execvp(3)`.
ShellCommandTool,
/// Enable experimental RMCP features such as OAuth login.
RmcpClient,
/// Include the freeform apply_patch tool.
@@ -40,14 +39,18 @@ pub enum Feature {
ViewImageTool,
/// Allow the model to request web searches.
WebSearchRequest,
/// Gate the execpolicy enforcement for shell/unified exec.
ExecPolicy,
/// Enable the model-based risk assessments for sandboxed commands.
SandboxCommandAssessment,
/// Create a ghost commit at each turn.
GhostCommit,
/// Enable Windows sandbox (restricted token) on Windows.
WindowsSandbox,
/// Remote compaction enabled (only for ChatGPT auth)
RemoteCompaction,
/// Enable the default shell tool.
ShellTool,
/// Allow model to call multiple tools in parallel (only for models supporting it).
ParallelToolCalls,
}
impl Feature {
@@ -249,18 +252,26 @@ pub struct FeatureSpec {
}
pub const FEATURES: &[FeatureSpec] = &[
// Stable features.
FeatureSpec {
id: Feature::GhostCommit,
key: "undo",
stage: Stage::Stable,
default_enabled: true,
},
FeatureSpec {
id: Feature::ViewImageTool,
key: "view_image_tool",
stage: Stage::Stable,
default_enabled: true,
},
// Unstable features.
FeatureSpec {
id: Feature::UnifiedExec,
key: "unified_exec",
stage: Stage::Experimental,
default_enabled: false,
},
FeatureSpec {
id: Feature::ShellCommandTool,
key: "shell_command_tool",
stage: Stage::Experimental,
default_enabled: false,
},
FeatureSpec {
id: Feature::RmcpClient,
key: "rmcp_client",
@@ -273,18 +284,18 @@ pub const FEATURES: &[FeatureSpec] = &[
stage: Stage::Beta,
default_enabled: false,
},
FeatureSpec {
id: Feature::ViewImageTool,
key: "view_image_tool",
stage: Stage::Stable,
default_enabled: true,
},
FeatureSpec {
id: Feature::WebSearchRequest,
key: "web_search_request",
stage: Stage::Stable,
default_enabled: false,
},
FeatureSpec {
id: Feature::ExecPolicy,
key: "exec_policy",
stage: Stage::Experimental,
default_enabled: true,
},
FeatureSpec {
id: Feature::SandboxCommandAssessment,
key: "experimental_sandbox_command_assessment",
@@ -292,14 +303,20 @@ pub const FEATURES: &[FeatureSpec] = &[
default_enabled: false,
},
FeatureSpec {
id: Feature::GhostCommit,
key: "ghost_commit",
id: Feature::WindowsSandbox,
key: "enable_experimental_windows_sandbox",
stage: Stage::Experimental,
default_enabled: false,
},
FeatureSpec {
id: Feature::RemoteCompaction,
key: "remote_compaction",
stage: Stage::Experimental,
default_enabled: true,
},
FeatureSpec {
id: Feature::WindowsSandbox,
key: "enable_experimental_windows_sandbox",
id: Feature::ParallelToolCalls,
key: "parallel",
stage: Stage::Experimental,
default_enabled: false,
},


@@ -825,11 +825,21 @@ mod tests {
.await
.expect("Should collect git info from repo");
let remote_url_output = Command::new("git")
.args(["remote", "get-url", "origin"])
.current_dir(&repo_path)
.output()
.await
.expect("Failed to read remote url");
// Some dev environments rewrite remotes (e.g., force SSH), so compare against
// whatever URL Git reports instead of a fixed placeholder.
let expected_remote = String::from_utf8(remote_url_output.stdout)
.unwrap()
.trim()
.to_string();
// Should have repository URL
assert_eq!(
git_info.repository_url,
Some("https://github.com/example/repo.git".to_string())
);
assert_eq!(git_info.repository_url, Some(expected_remote));
}
#[tokio::test]


@@ -13,6 +13,7 @@ mod client;
mod client_common;
pub mod codex;
mod codex_conversation;
mod compact_remote;
pub use codex_conversation::CodexConversation;
mod codex_delegate;
mod command_safety;
@@ -24,6 +25,7 @@ mod environment_context;
pub mod error;
pub mod exec;
pub mod exec_env;
mod exec_policy;
pub mod features;
mod flags;
pub mod git_info;
@@ -34,8 +36,10 @@ mod mcp_tool_call;
mod message_history;
mod model_provider_info;
pub mod parse_command;
pub mod powershell;
mod response_processing;
pub mod sandboxing;
mod text_encoding;
pub mod token_data;
mod truncate;
mod unified_exec;


@@ -4,6 +4,7 @@ use codex_protocol::config_types::Verbosity;
use crate::config::types::ReasoningSummaryFormat;
use crate::tools::handlers::apply_patch::ApplyPatchToolType;
use crate::tools::spec::ConfigShellToolType;
use crate::truncate::TruncationPolicy;
/// The `instructions` field in the payload sent to a model should always start
/// with this content.
@@ -11,6 +12,7 @@ const BASE_INSTRUCTIONS: &str = include_str!("../prompt.md");
const GPT_5_CODEX_INSTRUCTIONS: &str = include_str!("../gpt_5_codex_prompt.md");
const GPT_5_1_INSTRUCTIONS: &str = include_str!("../gpt_5_1_prompt.md");
const GPT_5_1_CODEX_MAX_INSTRUCTIONS: &str = include_str!("../gpt-5.1-codex-max_prompt.md");
/// A model family is a group of models that share certain characteristics.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
@@ -66,6 +68,8 @@ pub struct ModelFamily {
/// Preferred shell tool type for this model family when features do not override it.
pub shell_type: ConfigShellToolType,
pub truncation_policy: TruncationPolicy,
}
macro_rules! model_family {
@@ -89,6 +93,7 @@ macro_rules! model_family {
shell_type: ConfigShellToolType::Default,
default_verbosity: None,
default_reasoning_effort: None,
truncation_policy: TruncationPolicy::Bytes(10_000),
};
// apply overrides
@@ -132,6 +137,19 @@ pub fn find_family_for_model(slug: &str) -> Option<ModelFamily> {
model_family!(slug, "gpt-4o", needs_special_apply_patch_instructions: true)
} else if slug.starts_with("gpt-3.5") {
model_family!(slug, "gpt-3.5", needs_special_apply_patch_instructions: true)
} else if slug.starts_with("robin") {
model_family!(
slug, "gpt-5.1",
supports_reasoning_summaries: true,
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
support_verbosity: true,
default_verbosity: Some(Verbosity::Low),
base_instructions: GPT_5_1_INSTRUCTIONS.to_string(),
default_reasoning_effort: Some(ReasoningEffort::Medium),
truncation_policy: TruncationPolicy::Bytes(10_000),
shell_type: ConfigShellToolType::ShellCommand,
supports_parallel_tool_calls: true,
)
} else if slug.starts_with("test-gpt-5") {
model_family!(
slug, slug,
@@ -145,7 +163,9 @@ pub fn find_family_for_model(slug: &str) -> Option<ModelFamily> {
"test_sync_tool".to_string(),
],
supports_parallel_tool_calls: true,
shell_type: ConfigShellToolType::ShellCommand,
support_verbosity: true,
truncation_policy: TruncationPolicy::Tokens(10_000),
)
// Internal models.
@@ -161,11 +181,25 @@ pub fn find_family_for_model(slug: &str) -> Option<ModelFamily> {
"list_dir".to_string(),
"read_file".to_string(),
],
shell_type: ConfigShellToolType::ShellCommand,
supports_parallel_tool_calls: true,
support_verbosity: true,
truncation_policy: TruncationPolicy::Tokens(10_000),
)
// Production models.
} else if slug.starts_with("gpt-5.1-codex-max") {
model_family!(
slug, slug,
supports_reasoning_summaries: true,
reasoning_summary_format: ReasoningSummaryFormat::Experimental,
base_instructions: GPT_5_1_CODEX_MAX_INSTRUCTIONS.to_string(),
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
shell_type: ConfigShellToolType::ShellCommand,
supports_parallel_tool_calls: true,
support_verbosity: false,
truncation_policy: TruncationPolicy::Tokens(10_000),
)
} else if slug.starts_with("gpt-5-codex")
|| slug.starts_with("gpt-5.1-codex")
|| slug.starts_with("codex-")
@@ -176,7 +210,10 @@ pub fn find_family_for_model(slug: &str) -> Option<ModelFamily> {
reasoning_summary_format: ReasoningSummaryFormat::Experimental,
base_instructions: GPT_5_CODEX_INSTRUCTIONS.to_string(),
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
shell_type: ConfigShellToolType::ShellCommand,
supports_parallel_tool_calls: true,
support_verbosity: false,
truncation_policy: TruncationPolicy::Tokens(10_000),
)
} else if slug.starts_with("gpt-5.1") {
model_family!(
@@ -187,13 +224,18 @@ pub fn find_family_for_model(slug: &str) -> Option<ModelFamily> {
default_verbosity: Some(Verbosity::Low),
base_instructions: GPT_5_1_INSTRUCTIONS.to_string(),
default_reasoning_effort: Some(ReasoningEffort::Medium),
truncation_policy: TruncationPolicy::Bytes(10_000),
shell_type: ConfigShellToolType::ShellCommand,
supports_parallel_tool_calls: true,
)
} else if slug.starts_with("gpt-5") {
model_family!(
slug, "gpt-5",
supports_reasoning_summaries: true,
needs_special_apply_patch_instructions: true,
shell_type: ConfigShellToolType::Default,
support_verbosity: true,
truncation_policy: TruncationPolicy::Bytes(10_000),
)
} else {
None
@@ -216,5 +258,6 @@ pub fn derive_default_model_family(model: &str) -> ModelFamily {
shell_type: ConfigShellToolType::Default,
default_verbosity: None,
default_reasoning_effort: None,
truncation_policy: TruncationPolicy::Bytes(10_000),
}
}
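
`find_family_for_model` relies on the order of its `starts_with` checks: `gpt-5.1-codex-max` is tested before `gpt-5.1-codex`, which comes before `gpt-5.1`, which comes before `gpt-5`. The sketch below isolates that "most specific prefix first" rule. `family_for` and its prefix table are hypothetical, and it returns only the matched prefix, whereas the real function builds a full `ModelFamily`.

```rust
/// Returns the first prefix that matches `slug`. Order matters: a shorter
/// prefix listed first would shadow every longer prefix that extends it.
fn family_for(slug: &str) -> Option<&'static str> {
    // Hypothetical table; ordered from most to least specific.
    const PREFIXES: &[&str] = &[
        "gpt-5.1-codex-max",
        "gpt-5.1-codex",
        "gpt-5.1",
        "gpt-5",
    ];
    PREFIXES.iter().copied().find(|prefix| slug.starts_with(*prefix))
}
```

If `"gpt-5"` were moved to the top of the table, every slug in this list would resolve to it, which is why new, more specific branches have to be added above the general ones.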


@@ -8,6 +8,7 @@
use crate::CodexAuth;
use crate::default_client::CodexHttpClient;
use crate::default_client::CodexRequestBuilder;
use crate::error::CodexErr;
use codex_app_server_protocol::AuthMode;
use serde::Deserialize;
use serde::Serialize;
@@ -109,21 +110,7 @@ impl ModelProviderInfo {
client: &'a CodexHttpClient,
auth: &Option<CodexAuth>,
) -> crate::error::Result<CodexRequestBuilder> {
let effective_auth = if let Some(secret_key) = &self.experimental_bearer_token {
Some(CodexAuth::from_api_key(secret_key))
} else {
match self.api_key() {
Ok(Some(key)) => Some(CodexAuth::from_api_key(&key)),
Ok(None) => auth.clone(),
Err(err) => {
if auth.is_some() {
auth.clone()
} else {
return Err(err);
}
}
}
};
let effective_auth = self.effective_auth(auth)?;
let url = self.get_full_url(&effective_auth);
@@ -136,6 +123,51 @@ impl ModelProviderInfo {
Ok(self.apply_http_headers(builder))
}
pub async fn create_compact_request_builder<'a>(
&'a self,
client: &'a CodexHttpClient,
auth: &Option<CodexAuth>,
) -> crate::error::Result<CodexRequestBuilder> {
if self.wire_api != WireApi::Responses {
return Err(CodexErr::UnsupportedOperation(
"Compaction endpoint requires Responses API providers".to_string(),
));
}
let effective_auth = self.effective_auth(auth)?;
let url = self.get_compact_url(&effective_auth).ok_or_else(|| {
CodexErr::UnsupportedOperation(
"Compaction endpoint requires Responses API providers".to_string(),
)
})?;
let mut builder = client.post(url);
if let Some(auth) = effective_auth.as_ref() {
builder = builder.bearer_auth(auth.get_token().await?);
}
Ok(self.apply_http_headers(builder))
}
fn effective_auth(&self, auth: &Option<CodexAuth>) -> crate::error::Result<Option<CodexAuth>> {
if let Some(secret_key) = &self.experimental_bearer_token {
return Ok(Some(CodexAuth::from_api_key(secret_key)));
}
match self.api_key() {
Ok(Some(key)) => Ok(Some(CodexAuth::from_api_key(&key))),
Ok(None) => Ok(auth.clone()),
Err(err) => {
if auth.is_some() {
Ok(auth.clone())
} else {
Err(err)
}
}
}
}
fn get_query_string(&self) -> String {
self.query_params
.as_ref()
@@ -173,6 +205,18 @@ impl ModelProviderInfo {
}
}
pub(crate) fn get_compact_url(&self, auth: &Option<CodexAuth>) -> Option<String> {
if self.wire_api != WireApi::Responses {
return None;
}
let full = self.get_full_url(auth);
if let Some((path, query)) = full.split_once('?') {
Some(format!("{path}/compact?{query}"))
} else {
Some(format!("{full}/compact"))
}
}
pub(crate) fn is_azure_responses_endpoint(&self) -> bool {
if self.wire_api != WireApi::Responses {
return false;
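
A note on `get_compact_url` in the hunk above: the `/compact` suffix has to go before any query string, not after it. The following is a minimal standalone sketch of that rule, assuming a plain string URL; `compact_url` is a hypothetical name for illustration and not part of the crate.

```rust
/// Hypothetical standalone version of the URL derivation shown in the diff.
/// Appends `/compact` to the path while keeping any query string intact.
fn compact_url(full: &str) -> String {
    match full.split_once('?') {
        // Query present: splice the suffix between the path and the query.
        Some((path, query)) => format!("{path}/compact?{query}"),
        // No query: append the suffix directly.
        None => format!("{full}/compact"),
    }
}
```

With this shape, a provider URL that carries something like an `api-version` query parameter keeps it after the new path segment.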

View File

@@ -2,7 +2,6 @@ use crate::model_family::ModelFamily;
// Shared constants for commonly used window/token sizes.
pub(crate) const CONTEXT_WINDOW_272K: i64 = 272_000;
pub(crate) const MAX_OUTPUT_TOKENS_128K: i64 = 128_000;
/// Metadata about a model, particularly OpenAI models.
/// We may want to consider including details like the pricing for
@@ -14,19 +13,15 @@ pub(crate) struct ModelInfo {
/// Size of the context window in tokens. This is the maximum size of the input context.
pub(crate) context_window: i64,
/// Maximum number of output tokens that can be generated for the model.
pub(crate) max_output_tokens: i64,
/// Token threshold where we should automatically compact conversation history. This considers
/// input tokens + output tokens of this turn.
pub(crate) auto_compact_token_limit: Option<i64>,
}
impl ModelInfo {
const fn new(context_window: i64, max_output_tokens: i64) -> Self {
const fn new(context_window: i64) -> Self {
Self {
context_window,
max_output_tokens,
auto_compact_token_limit: Some(Self::default_auto_compact_limit(context_window)),
}
}
@@ -42,45 +37,44 @@ pub(crate) fn get_model_info(model_family: &ModelFamily) -> Option<ModelInfo> {
// OSS models have a 128k shared token pool.
// Arbitrarily splitting it: 3/4 input context, 1/4 output.
// https://openai.com/index/gpt-oss-model-card/
"gpt-oss-20b" => Some(ModelInfo::new(96_000, 32_000)),
"gpt-oss-120b" => Some(ModelInfo::new(96_000, 32_000)),
"gpt-oss-20b" => Some(ModelInfo::new(96_000)),
"gpt-oss-120b" => Some(ModelInfo::new(96_000)),
// https://platform.openai.com/docs/models/o3
"o3" => Some(ModelInfo::new(200_000, 100_000)),
"o3" => Some(ModelInfo::new(200_000)),
// https://platform.openai.com/docs/models/o4-mini
"o4-mini" => Some(ModelInfo::new(200_000, 100_000)),
"o4-mini" => Some(ModelInfo::new(200_000)),
// https://platform.openai.com/docs/models/codex-mini-latest
"codex-mini-latest" => Some(ModelInfo::new(200_000, 100_000)),
"codex-mini-latest" => Some(ModelInfo::new(200_000)),
// As of Jun 25, 2025, gpt-4.1 defaults to gpt-4.1-2025-04-14.
// https://platform.openai.com/docs/models/gpt-4.1
"gpt-4.1" | "gpt-4.1-2025-04-14" => Some(ModelInfo::new(1_047_576, 32_768)),
"gpt-4.1" | "gpt-4.1-2025-04-14" => Some(ModelInfo::new(1_047_576)),
// As of Jun 25, 2025, gpt-4o defaults to gpt-4o-2024-08-06.
// https://platform.openai.com/docs/models/gpt-4o
"gpt-4o" | "gpt-4o-2024-08-06" => Some(ModelInfo::new(128_000, 16_384)),
"gpt-4o" | "gpt-4o-2024-08-06" => Some(ModelInfo::new(128_000)),
// https://platform.openai.com/docs/models/gpt-4o?snapshot=gpt-4o-2024-05-13
"gpt-4o-2024-05-13" => Some(ModelInfo::new(128_000, 4_096)),
"gpt-4o-2024-05-13" => Some(ModelInfo::new(128_000)),
// https://platform.openai.com/docs/models/gpt-4o?snapshot=gpt-4o-2024-11-20
"gpt-4o-2024-11-20" => Some(ModelInfo::new(128_000, 16_384)),
"gpt-4o-2024-11-20" => Some(ModelInfo::new(128_000)),
// https://platform.openai.com/docs/models/gpt-3.5-turbo
"gpt-3.5-turbo" => Some(ModelInfo::new(16_385, 4_096)),
"gpt-3.5-turbo" => Some(ModelInfo::new(16_385)),
_ if slug.starts_with("gpt-5-codex") || slug.starts_with("gpt-5.1-codex") => {
Some(ModelInfo::new(CONTEXT_WINDOW_272K, MAX_OUTPUT_TOKENS_128K))
_ if slug.starts_with("gpt-5-codex")
|| slug.starts_with("gpt-5.1-codex")
|| slug.starts_with("gpt-5.1-codex-max") =>
{
Some(ModelInfo::new(CONTEXT_WINDOW_272K))
}
_ if slug.starts_with("gpt-5") => {
Some(ModelInfo::new(CONTEXT_WINDOW_272K, MAX_OUTPUT_TOKENS_128K))
}
_ if slug.starts_with("gpt-5") => Some(ModelInfo::new(CONTEXT_WINDOW_272K)),
_ if slug.starts_with("codex-") => {
Some(ModelInfo::new(CONTEXT_WINDOW_272K, MAX_OUTPUT_TOKENS_128K))
}
_ if slug.starts_with("codex-") => Some(ModelInfo::new(CONTEXT_WINDOW_272K)),
_ => None,
}

View File

@@ -5,6 +5,7 @@ use crate::default_client::originator;
use codex_otel::config::OtelExporter;
use codex_otel::config::OtelHttpProtocol;
use codex_otel::config::OtelSettings;
use codex_otel::config::OtelTlsConfig as OtelTlsSettings;
use codex_otel::otel_provider::OtelProvider;
use std::error::Error;
@@ -21,6 +22,7 @@ pub fn build_provider(
endpoint,
headers,
protocol,
tls,
} => {
let protocol = match protocol {
Protocol::Json => OtelHttpProtocol::Json,
@@ -34,14 +36,28 @@ pub fn build_provider(
.map(|(k, v)| (k.clone(), v.clone()))
.collect(),
protocol,
tls: tls.as_ref().map(|config| OtelTlsSettings {
ca_certificate: config.ca_certificate.clone(),
client_certificate: config.client_certificate.clone(),
client_private_key: config.client_private_key.clone(),
}),
}
}
Kind::OtlpGrpc { endpoint, headers } => OtelExporter::OtlpGrpc {
Kind::OtlpGrpc {
endpoint,
headers,
tls,
} => OtelExporter::OtlpGrpc {
endpoint: endpoint.clone(),
headers: headers
.iter()
.map(|(k, v)| (k.clone(), v.clone()))
.collect(),
tls: tls.as_ref().map(|config| OtelTlsSettings {
ca_certificate: config.ca_certificate.clone(),
client_certificate: config.client_certificate.clone(),
client_private_key: config.client_private_key.clone(),
}),
},
};

View File

@@ -1,6 +1,7 @@
use crate::bash::extract_bash_command;
use crate::bash::try_parse_shell;
use crate::bash::try_parse_word_only_commands_sequence;
use crate::powershell::extract_powershell_command;
use codex_protocol::parse_command::ParsedCommand;
use shlex::split as shlex_split;
use shlex::try_join as shlex_try_join;
@@ -11,6 +12,11 @@ pub fn shlex_join(tokens: &[String]) -> String {
.unwrap_or_else(|_| "<command included NUL byte>".to_string())
}
/// Extracts the shell and script from a command, regardless of platform
pub fn extract_shell_command(command: &[String]) -> Option<(&str, &str)> {
extract_bash_command(command).or_else(|| extract_powershell_command(command))
}
/// DO NOT REVIEW THIS CODE BY HAND
/// This parsing code is quite complex and not easy to hand-modify.
/// The easiest way to iterate is to add unit tests and have Codex fix the implementation.
@@ -877,6 +883,42 @@ mod tests {
}],
);
}
#[test]
fn powershell_command_is_stripped() {
assert_parsed(
&vec_str(&["powershell", "-Command", "Get-ChildItem"]),
vec![ParsedCommand::Unknown {
cmd: "Get-ChildItem".to_string(),
}],
);
}
#[test]
fn pwsh_with_noprofile_and_c_alias_is_stripped() {
assert_parsed(
&vec_str(&["pwsh", "-NoProfile", "-c", "Write-Host hi"]),
vec![ParsedCommand::Unknown {
cmd: "Write-Host hi".to_string(),
}],
);
}
#[test]
fn powershell_with_path_is_stripped() {
let command = if cfg!(windows) {
"C:\\windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe"
} else {
"/usr/local/bin/powershell.exe"
};
assert_parsed(
&vec_str(&[command, "-NoProfile", "-c", "Write-Host hi"]),
vec![ParsedCommand::Unknown {
cmd: "Write-Host hi".to_string(),
}],
);
}
}
pub fn parse_command_impl(command: &[String]) -> Vec<ParsedCommand> {
@@ -884,6 +926,12 @@ pub fn parse_command_impl(command: &[String]) -> Vec<ParsedCommand> {
return commands;
}
if let Some((_, script)) = extract_powershell_command(command) {
return vec![ParsedCommand::Unknown {
cmd: script.to_string(),
}];
}
let normalized = normalize_tokens(command);
let parts = if contains_connectors(&normalized) {
@@ -1190,6 +1238,7 @@ fn parse_find_query_and_path(tail: &[String]) -> (Option<String>, Option<String>
}
fn parse_shell_lc_commands(original: &[String]) -> Option<Vec<ParsedCommand>> {
// Only handle bash/zsh here; PowerShell is stripped separately without bash parsing.
let (_, script) = extract_bash_command(original)?;
if let Some(tree) = try_parse_shell(script)

View File

@@ -0,0 +1,93 @@
use std::path::PathBuf;
use crate::shell::ShellType;
use crate::shell::detect_shell_type;
const POWERSHELL_FLAGS: &[&str] = &["-nologo", "-noprofile", "-command", "-c"];
/// Extract the PowerShell script body from an invocation such as:
///
/// - ["pwsh", "-NoProfile", "-Command", "Get-ChildItem -Recurse | Select-String foo"]
/// - ["powershell.exe", "-Command", "Write-Host hi"]
/// - ["powershell", "-NoLogo", "-NoProfile", "-Command", "...script..."]
///
/// Returns (`shell`, `script`) when the first arg is a PowerShell executable and a
/// `-Command` (or `-c`) flag is present followed by a script string.
pub fn extract_powershell_command(command: &[String]) -> Option<(&str, &str)> {
if command.len() < 3 {
return None;
}
let shell = &command[0];
if detect_shell_type(&PathBuf::from(shell)) != Some(ShellType::PowerShell) {
return None;
}
// Find the first occurrence of -Command (accept common short alias -c as well)
let mut i = 1usize;
while i + 1 < command.len() {
let flag = &command[i];
// Reject unknown flags
if !POWERSHELL_FLAGS.contains(&flag.to_ascii_lowercase().as_str()) {
return None;
}
if flag.eq_ignore_ascii_case("-Command") || flag.eq_ignore_ascii_case("-c") {
let script = &command[i + 1];
return Some((shell, script.as_str()));
}
i += 1;
}
None
}
#[cfg(test)]
mod tests {
use super::extract_powershell_command;
#[test]
fn extracts_basic_powershell_command() {
let cmd = vec![
"powershell".to_string(),
"-Command".to_string(),
"Write-Host hi".to_string(),
];
let (_shell, script) = extract_powershell_command(&cmd).expect("extract");
assert_eq!(script, "Write-Host hi");
}
#[test]
fn extracts_lowercase_flags() {
let cmd = vec![
"powershell".to_string(),
"-nologo".to_string(),
"-command".to_string(),
"Write-Host hi".to_string(),
];
let (_shell, script) = extract_powershell_command(&cmd).expect("extract");
assert_eq!(script, "Write-Host hi");
}
#[test]
fn extracts_full_path_powershell_command() {
let command = if cfg!(windows) {
"C:\\windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe".to_string()
} else {
"/usr/local/bin/powershell.exe".to_string()
};
let cmd = vec![command, "-Command".to_string(), "Write-Host hi".to_string()];
let (_shell, script) = extract_powershell_command(&cmd).expect("extract");
assert_eq!(script, "Write-Host hi");
}
#[test]
fn extracts_with_noprofile_and_alias() {
let cmd = vec![
"pwsh".to_string(),
"-NoProfile".to_string(),
"-c".to_string(),
"Get-ChildItem | Select-String foo".to_string(),
];
let (_shell, script) = extract_powershell_command(&cmd).expect("extract");
assert_eq!(script, "Get-ChildItem | Select-String foo");
}
}
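
The flag scan in `extract_powershell_command` rejects any invocation that carries a flag outside the allow-list, which is easy to miss when reading the loop. Below is a simplified sketch of the same idea. `looks_like_powershell` is a hypothetical stand-in for `detect_shell_type` that only inspects the file stem, and `extract_script` is an illustrative name, not the crate's API.

```rust
use std::path::Path;

const POWERSHELL_FLAGS: &[&str] = &["-nologo", "-noprofile", "-command", "-c"];

/// Hypothetical stand-in for `detect_shell_type`: checks the file stem only.
fn looks_like_powershell(shell: &str) -> bool {
    let stem = Path::new(shell)
        .file_stem()
        .and_then(|s| s.to_str())
        .unwrap_or("");
    stem.eq_ignore_ascii_case("pwsh") || stem.eq_ignore_ascii_case("powershell")
}

/// Returns the script that follows `-Command` / `-c`, or `None` if the
/// invocation is not PowerShell or uses a flag outside the allow-list.
fn extract_script(command: &[String]) -> Option<&str> {
    let (shell, rest) = command.split_first()?;
    if !looks_like_powershell(shell) {
        return None;
    }
    let mut iter = rest.iter();
    while let Some(flag) = iter.next() {
        let lower = flag.to_ascii_lowercase();
        // Unknown flags make the whole command opaque, so bail out.
        if !POWERSHELL_FLAGS.contains(&lower.as_str()) {
            return None;
        }
        if lower == "-command" || lower == "-c" {
            // The script is the argument right after the flag, if any.
            return iter.next().map(String::as_str);
        }
    }
    None
}
```

Under these assumptions, `["pwsh", "-File", "x.ps1"]` yields `None` because `-File` is not in the allow-list, which matches the "reject unknown flags" comment in the diff.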

View File

@@ -27,7 +27,8 @@ pub(crate) fn should_persist_response_item(item: &ResponseItem) -> bool {
| ResponseItem::CustomToolCall { .. }
| ResponseItem::CustomToolCallOutput { .. }
| ResponseItem::WebSearchCall { .. }
| ResponseItem::GhostSnapshot { .. } => true,
| ResponseItem::GhostSnapshot { .. }
| ResponseItem::CompactionSummary { .. } => true,
ResponseItem::Other => false,
}
}

View File

@@ -814,6 +814,7 @@ async fn test_tail_skips_trailing_non_responses() -> Result<()> {
timestamp: format!("{ts}-compacted"),
item: RolloutItem::Compacted(CompactedItem {
message: "compacted".into(),
replacement_history: None,
}),
};
writeln!(file, "{}", serde_json::to_string(&compacted_line)?)?;

View File

@@ -8,6 +8,7 @@ ready-to-spawn environment.
pub mod assessment;
use crate::exec::ExecExpiration;
use crate::exec::ExecToolCallOutput;
use crate::exec::SandboxType;
use crate::exec::StdoutStream;
@@ -26,23 +27,45 @@ use std::collections::HashMap;
use std::path::Path;
use std::path::PathBuf;
#[derive(Clone, Debug)]
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum SandboxPermissions {
UseDefault,
RequireEscalated,
}
impl SandboxPermissions {
pub fn requires_escalated_permissions(self) -> bool {
matches!(self, SandboxPermissions::RequireEscalated)
}
}
impl From<bool> for SandboxPermissions {
fn from(with_escalated_permissions: bool) -> Self {
if with_escalated_permissions {
SandboxPermissions::RequireEscalated
} else {
SandboxPermissions::UseDefault
}
}
}
#[derive(Debug)]
pub struct CommandSpec {
pub program: String,
pub args: Vec<String>,
pub cwd: PathBuf,
pub env: HashMap<String, String>,
pub timeout_ms: Option<u64>,
pub expiration: ExecExpiration,
pub with_escalated_permissions: Option<bool>,
pub justification: Option<String>,
}
#[derive(Clone, Debug)]
#[derive(Debug)]
pub struct ExecEnv {
pub command: Vec<String>,
pub cwd: PathBuf,
pub env: HashMap<String, String>,
pub timeout_ms: Option<u64>,
pub expiration: ExecExpiration,
pub sandbox: SandboxType,
pub with_escalated_permissions: Option<bool>,
pub justification: Option<String>,
@@ -93,13 +116,13 @@ impl SandboxManager {
pub(crate) fn transform(
&self,
spec: &CommandSpec,
mut spec: CommandSpec,
policy: &SandboxPolicy,
sandbox: SandboxType,
sandbox_policy_cwd: &Path,
codex_linux_sandbox_exe: Option<&PathBuf>,
) -> Result<ExecEnv, SandboxTransformError> {
let mut env = spec.env.clone();
let mut env = spec.env;
if !policy.has_full_network_access() {
env.insert(
CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR.to_string(),
@@ -108,8 +131,8 @@ impl SandboxManager {
}
let mut command = Vec::with_capacity(1 + spec.args.len());
command.push(spec.program.clone());
command.extend(spec.args.iter().cloned());
command.push(spec.program);
command.append(&mut spec.args);
let (command, sandbox_env, arg0_override) = match sandbox {
SandboxType::None => (command, HashMap::new(), None),
@@ -154,12 +177,12 @@ impl SandboxManager {
Ok(ExecEnv {
command,
cwd: spec.cwd.clone(),
cwd: spec.cwd,
env,
timeout_ms: spec.timeout_ms,
expiration: spec.expiration,
sandbox,
with_escalated_permissions: spec.with_escalated_permissions,
justification: spec.justification.clone(),
justification: spec.justification,
arg0: arg0_override,
})
}
@@ -170,9 +193,9 @@ impl SandboxManager {
}
pub async fn execute_env(
env: &ExecEnv,
env: ExecEnv,
policy: &SandboxPolicy,
stdout_stream: Option<StdoutStream>,
) -> crate::error::Result<ExecToolCallOutput> {
execute_exec_env(env.clone(), policy, stdout_stream).await
execute_exec_env(env, policy, stdout_stream).await
}
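
The `From<bool>` impl on `SandboxPermissions` lets callers that still carry a boolean flag convert with `.into()`. A minimal sketch of that conversion follows. The enum and its impls are copied from the hunk above so the block compiles on its own, while `permissions_from_legacy` is a hypothetical helper that assumes a missing flag should mean `UseDefault`.

```rust
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum SandboxPermissions {
    UseDefault,
    RequireEscalated,
}

impl SandboxPermissions {
    pub fn requires_escalated_permissions(self) -> bool {
        matches!(self, SandboxPermissions::RequireEscalated)
    }
}

impl From<bool> for SandboxPermissions {
    fn from(with_escalated_permissions: bool) -> Self {
        if with_escalated_permissions {
            SandboxPermissions::RequireEscalated
        } else {
            SandboxPermissions::UseDefault
        }
    }
}

/// Hypothetical helper (not in the diff): maps the legacy `Option<bool>` field
/// to the enum, assuming `None` means "no escalation requested".
fn permissions_from_legacy(flag: Option<bool>) -> SandboxPermissions {
    flag.unwrap_or(false).into()
}
```

Because the enum derives `Copy`, `requires_escalated_permissions` can take `self` by value without forcing callers to clone.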

View File

@@ -7,64 +7,41 @@ pub enum ShellType {
Zsh,
Bash,
PowerShell,
Sh,
Cmd,
}
#[derive(Debug, PartialEq, Eq, Clone, Serialize, Deserialize)]
pub struct ZshShell {
pub struct Shell {
pub(crate) shell_type: ShellType,
pub(crate) shell_path: PathBuf,
}
#[derive(Debug, PartialEq, Eq, Clone, Serialize, Deserialize)]
pub struct BashShell {
pub(crate) shell_path: PathBuf,
}
#[derive(Debug, PartialEq, Eq, Clone, Serialize, Deserialize)]
pub struct PowerShellConfig {
pub(crate) shell_path: PathBuf, // Executable name or path, e.g. "pwsh" or "powershell.exe".
}
#[derive(Debug, PartialEq, Eq, Clone, Serialize, Deserialize)]
pub enum Shell {
Zsh(ZshShell),
Bash(BashShell),
PowerShell(PowerShellConfig),
Unknown,
}
impl Shell {
pub fn name(&self) -> Option<String> {
match self {
Shell::Zsh(ZshShell { shell_path, .. }) | Shell::Bash(BashShell { shell_path, .. }) => {
std::path::Path::new(shell_path)
.file_name()
.map(|s| s.to_string_lossy().to_string())
}
Shell::PowerShell(ps) => ps
.shell_path
.file_stem()
.map(|s| s.to_string_lossy().to_string()),
Shell::Unknown => None,
pub fn name(&self) -> &'static str {
match self.shell_type {
ShellType::Zsh => "zsh",
ShellType::Bash => "bash",
ShellType::PowerShell => "powershell",
ShellType::Sh => "sh",
ShellType::Cmd => "cmd",
}
}
/// Takes a shell command string and returns the full list of command args to
/// use with `exec()` to run it.
pub fn derive_exec_args(&self, command: &str, use_login_shell: bool) -> Vec<String> {
match self {
Shell::Zsh(ZshShell { shell_path, .. }) | Shell::Bash(BashShell { shell_path, .. }) => {
match self.shell_type {
ShellType::Zsh | ShellType::Bash | ShellType::Sh => {
let arg = if use_login_shell { "-lc" } else { "-c" };
vec![
shell_path.to_string_lossy().to_string(),
self.shell_path.to_string_lossy().to_string(),
arg.to_string(),
command.to_string(),
]
}
Shell::PowerShell(ps) => {
let mut args = vec![
ps.shell_path.to_string_lossy().to_string(),
"-NoLogo".to_string(),
];
ShellType::PowerShell => {
let mut args = vec![self.shell_path.to_string_lossy().to_string()];
if !use_login_shell {
args.push("-NoProfile".to_string());
}
@@ -73,7 +50,12 @@ impl Shell {
args.push(command.to_string());
args
}
Shell::Unknown => shlex::split(command).unwrap_or_else(|| vec![command.to_string()]),
ShellType::Cmd => {
let mut args = vec![self.shell_path.to_string_lossy().to_string()];
args.push("/c".to_string());
args.push(command.to_string());
args
}
}
}
}
@@ -146,19 +128,34 @@ fn get_shell_path(
None
}
fn get_zsh_shell(path: Option<&PathBuf>) -> Option<ZshShell> {
fn get_zsh_shell(path: Option<&PathBuf>) -> Option<Shell> {
let shell_path = get_shell_path(ShellType::Zsh, path, "zsh", vec!["/bin/zsh"]);
shell_path.map(|shell_path| ZshShell { shell_path })
shell_path.map(|shell_path| Shell {
shell_type: ShellType::Zsh,
shell_path,
})
}
fn get_bash_shell(path: Option<&PathBuf>) -> Option<BashShell> {
fn get_bash_shell(path: Option<&PathBuf>) -> Option<Shell> {
let shell_path = get_shell_path(ShellType::Bash, path, "bash", vec!["/bin/bash"]);
shell_path.map(|shell_path| BashShell { shell_path })
shell_path.map(|shell_path| Shell {
shell_type: ShellType::Bash,
shell_path,
})
}
fn get_powershell_shell(path: Option<&PathBuf>) -> Option<PowerShellConfig> {
fn get_sh_shell(path: Option<&PathBuf>) -> Option<Shell> {
let shell_path = get_shell_path(ShellType::Sh, path, "sh", vec!["/bin/sh"]);
shell_path.map(|shell_path| Shell {
shell_type: ShellType::Sh,
shell_path,
})
}
fn get_powershell_shell(path: Option<&PathBuf>) -> Option<Shell> {
let shell_path = get_shell_path(
ShellType::PowerShell,
path,
@@ -167,32 +164,61 @@ fn get_powershell_shell(path: Option<&PathBuf>) -> Option<PowerShellConfig> {
)
.or_else(|| get_shell_path(ShellType::PowerShell, path, "powershell", vec![]));
shell_path.map(|shell_path| PowerShellConfig { shell_path })
shell_path.map(|shell_path| Shell {
shell_type: ShellType::PowerShell,
shell_path,
})
}
fn get_cmd_shell(path: Option<&PathBuf>) -> Option<Shell> {
let shell_path = get_shell_path(ShellType::Cmd, path, "cmd", vec![]);
shell_path.map(|shell_path| Shell {
shell_type: ShellType::Cmd,
shell_path,
})
}
fn ultimate_fallback_shell() -> Shell {
if cfg!(windows) {
Shell {
shell_type: ShellType::Cmd,
shell_path: PathBuf::from("cmd.exe"),
}
} else {
Shell {
shell_type: ShellType::Sh,
shell_path: PathBuf::from("/bin/sh"),
}
}
}
pub fn get_shell_by_model_provided_path(shell_path: &PathBuf) -> Shell {
detect_shell_type(shell_path)
.and_then(|shell_type| get_shell(shell_type, Some(shell_path)))
.unwrap_or(Shell::Unknown)
.unwrap_or(ultimate_fallback_shell())
}
pub fn get_shell(shell_type: ShellType, path: Option<&PathBuf>) -> Option<Shell> {
match shell_type {
ShellType::Zsh => get_zsh_shell(path).map(Shell::Zsh),
ShellType::Bash => get_bash_shell(path).map(Shell::Bash),
ShellType::PowerShell => get_powershell_shell(path).map(Shell::PowerShell),
ShellType::Zsh => get_zsh_shell(path),
ShellType::Bash => get_bash_shell(path),
ShellType::PowerShell => get_powershell_shell(path),
ShellType::Sh => get_sh_shell(path),
ShellType::Cmd => get_cmd_shell(path),
}
}
pub fn detect_shell_type(shell_path: &PathBuf) -> Option<ShellType> {
match shell_path.as_os_str().to_str() {
Some("zsh") => Some(ShellType::Zsh),
Some("sh") => Some(ShellType::Sh),
Some("cmd") => Some(ShellType::Cmd),
Some("bash") => Some(ShellType::Bash),
Some("pwsh") => Some(ShellType::PowerShell),
Some("powershell") => Some(ShellType::PowerShell),
_ => {
let shell_name = shell_path.file_stem();
if let Some(shell_name) = shell_name
&& shell_name != shell_path
{
@@ -204,14 +230,29 @@ pub fn detect_shell_type(shell_path: &PathBuf) -> Option<ShellType> {
}
}
pub async fn default_user_shell() -> Shell {
pub fn default_user_shell() -> Shell {
default_user_shell_from_path(get_user_shell_path())
}
fn default_user_shell_from_path(user_shell_path: Option<PathBuf>) -> Shell {
if cfg!(windows) {
get_shell(ShellType::PowerShell, None).unwrap_or(Shell::Unknown)
get_shell(ShellType::PowerShell, None).unwrap_or(ultimate_fallback_shell())
} else {
get_user_shell_path()
let user_default_shell = user_shell_path
.and_then(|shell| detect_shell_type(&shell))
.and_then(|shell_type| get_shell(shell_type, None))
.unwrap_or(Shell::Unknown)
.and_then(|shell_type| get_shell(shell_type, None));
let shell_with_fallback = if cfg!(target_os = "macos") {
user_default_shell
.or_else(|| get_shell(ShellType::Zsh, None))
.or_else(|| get_shell(ShellType::Bash, None))
} else {
user_default_shell
.or_else(|| get_shell(ShellType::Bash, None))
.or_else(|| get_shell(ShellType::Zsh, None))
};
shell_with_fallback.unwrap_or(ultimate_fallback_shell())
}
}
@@ -251,6 +292,14 @@ mod detect_shell_type_tests {
detect_shell_type(&PathBuf::from("powershell.exe")),
Some(ShellType::PowerShell)
);
assert_eq!(
detect_shell_type(&PathBuf::from(if cfg!(windows) {
"C:\\windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe"
} else {
"/usr/local/bin/pwsh"
})),
Some(ShellType::PowerShell)
);
assert_eq!(
detect_shell_type(&PathBuf::from("pwsh.exe")),
Some(ShellType::PowerShell)
@@ -259,6 +308,19 @@ mod detect_shell_type_tests {
detect_shell_type(&PathBuf::from("/usr/local/bin/pwsh")),
Some(ShellType::PowerShell)
);
assert_eq!(
detect_shell_type(&PathBuf::from("/bin/sh")),
Some(ShellType::Sh)
);
assert_eq!(detect_shell_type(&PathBuf::from("sh")), Some(ShellType::Sh));
assert_eq!(
detect_shell_type(&PathBuf::from("cmd")),
Some(ShellType::Cmd)
);
assert_eq!(
detect_shell_type(&PathBuf::from("cmd.exe")),
Some(ShellType::Cmd)
);
}
}
@@ -274,10 +336,17 @@ mod tests {
fn detects_zsh() {
let zsh_shell = get_shell(ShellType::Zsh, None).unwrap();
let ZshShell { shell_path } = match zsh_shell {
Shell::Zsh(zsh_shell) => zsh_shell,
_ => panic!("expected zsh shell"),
};
let shell_path = zsh_shell.shell_path;
assert_eq!(shell_path, PathBuf::from("/bin/zsh"));
}
#[test]
#[cfg(target_os = "macos")]
fn fish_fallback_to_zsh() {
let zsh_shell = default_user_shell_from_path(Some(PathBuf::from("/bin/fish")));
let shell_path = zsh_shell.shell_path;
assert_eq!(shell_path, PathBuf::from("/bin/zsh"));
}
@@ -285,18 +354,60 @@ mod tests {
#[test]
fn detects_bash() {
let bash_shell = get_shell(ShellType::Bash, None).unwrap();
let BashShell { shell_path } = match bash_shell {
Shell::Bash(bash_shell) => bash_shell,
_ => panic!("expected bash shell"),
};
let shell_path = bash_shell.shell_path;
assert!(
shell_path == PathBuf::from("/bin/bash")
|| shell_path == PathBuf::from("/usr/bin/bash"),
|| shell_path == PathBuf::from("/usr/bin/bash")
|| shell_path == PathBuf::from("/usr/local/bin/bash"),
"shell path: {shell_path:?}",
);
}
#[test]
fn detects_sh() {
let sh_shell = get_shell(ShellType::Sh, None).unwrap();
let shell_path = sh_shell.shell_path;
assert!(
shell_path == PathBuf::from("/bin/sh") || shell_path == PathBuf::from("/usr/bin/sh"),
"shell path: {shell_path:?}",
);
}
#[test]
fn can_run_on_shell_test() {
let cmd = "echo \"Works\"";
if cfg!(windows) {
assert!(shell_works(
get_shell(ShellType::PowerShell, None),
"Out-String 'Works'",
true,
));
assert!(shell_works(get_shell(ShellType::Cmd, None), cmd, true,));
assert!(shell_works(Some(ultimate_fallback_shell()), cmd, true));
} else {
assert!(shell_works(Some(ultimate_fallback_shell()), cmd, true));
assert!(shell_works(get_shell(ShellType::Zsh, None), cmd, false));
assert!(shell_works(get_shell(ShellType::Bash, None), cmd, true));
assert!(shell_works(get_shell(ShellType::Sh, None), cmd, true));
}
}
fn shell_works(shell: Option<Shell>, command: &str, required: bool) -> bool {
if let Some(shell) = shell {
let args = shell.derive_exec_args(command, false);
let output = Command::new(args[0].clone())
.args(&args[1..])
.output()
.unwrap();
assert!(output.status.success());
assert!(String::from_utf8_lossy(&output.stdout).contains("Works"));
true
} else {
!required
}
}
#[tokio::test]
async fn test_current_shell_detects_zsh() {
let shell = Command::new("sh")
@@ -308,10 +419,11 @@ mod tests {
let shell_path = String::from_utf8_lossy(&shell.stdout).trim().to_string();
if shell_path.ends_with("/zsh") {
assert_eq!(
default_user_shell().await,
Shell::Zsh(ZshShell {
default_user_shell(),
Shell {
shell_type: ShellType::Zsh,
shell_path: PathBuf::from(shell_path),
})
}
);
}
}
@@ -322,11 +434,8 @@ mod tests {
return;
}
let powershell_shell = default_user_shell().await;
let PowerShellConfig { shell_path } = match powershell_shell {
Shell::PowerShell(powershell_shell) => powershell_shell,
_ => panic!("expected powershell shell"),
};
let powershell_shell = default_user_shell();
let shell_path = powershell_shell.shell_path;
assert!(shell_path.ends_with("pwsh.exe") || shell_path.ends_with("powershell.exe"));
}
@@ -338,10 +447,7 @@ mod tests {
}
let powershell_shell = get_shell(ShellType::PowerShell, None).unwrap();
let PowerShellConfig { shell_path } = match powershell_shell {
Shell::PowerShell(powershell_shell) => powershell_shell,
_ => panic!("expected powershell shell"),
};
let shell_path = powershell_shell.shell_path;
assert!(shell_path.ends_with("pwsh.exe") || shell_path.ends_with("powershell.exe"));
}
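
The refactor in this file replaces the per-shell structs with a single `Shell { shell_type, shell_path }`, so `derive_exec_args` dispatches on `shell_type` alone. Here is a reduced sketch of that dispatch, assuming only the POSIX-style and `cmd` branches. The PowerShell branch is left out because part of it is elided in the hunk, and the types here are trimmed copies for illustration only.

```rust
use std::path::PathBuf;

/// Trimmed copy of the diff's `ShellType` (PowerShell and Zsh omitted).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ShellType {
    Bash,
    Sh,
    Cmd,
}

struct Shell {
    shell_type: ShellType,
    shell_path: PathBuf,
}

impl Shell {
    /// Builds the argv used to run `command` through this shell.
    fn derive_exec_args(&self, command: &str, use_login_shell: bool) -> Vec<String> {
        let path = self.shell_path.to_string_lossy().to_string();
        match self.shell_type {
            ShellType::Bash | ShellType::Sh => {
                // `-lc` starts a login shell before running the command.
                let arg = if use_login_shell { "-lc" } else { "-c" };
                vec![path, arg.to_string(), command.to_string()]
            }
            // `cmd` has no login-shell concept; `/c` runs the command and exits.
            ShellType::Cmd => vec![path, "/c".to_string(), command.to_string()],
        }
    }
}
```

One consequence of the single-struct design is that there is no `Unknown` variant left to handle: every `Shell` value can produce an argv, which is why the diff introduces `ultimate_fallback_shell()`.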

View File

@@ -7,6 +7,7 @@ use crate::context_manager::ContextManager;
use crate::protocol::RateLimitSnapshot;
use crate::protocol::TokenUsage;
use crate::protocol::TokenUsageInfo;
use crate::truncate::TruncationPolicy;
/// Persistent, session-scoped state previously stored directly on `Session`.
pub(crate) struct SessionState {
@@ -18,20 +19,21 @@ pub(crate) struct SessionState {
impl SessionState {
/// Create a new session state mirroring previous `State::default()` semantics.
pub(crate) fn new(session_configuration: SessionConfiguration) -> Self {
let history = ContextManager::new();
Self {
session_configuration,
history: ContextManager::new(),
history,
latest_rate_limits: None,
}
}
// History helpers
pub(crate) fn record_items<I>(&mut self, items: I)
pub(crate) fn record_items<I>(&mut self, items: I, policy: TruncationPolicy)
where
I: IntoIterator,
I::Item: std::ops::Deref<Target = ResponseItem>,
{
self.history.record_items(items)
self.history.record_items(items, policy);
}
pub(crate) fn clone_history(&self) -> ContextManager {

View File

@@ -1,15 +1,12 @@
use std::sync::Arc;
use async_trait::async_trait;
use tokio_util::sync::CancellationToken;
use crate::codex::TurnContext;
use crate::compact;
use crate::state::TaskKind;
use codex_protocol::user_input::UserInput;
use super::SessionTask;
use super::SessionTaskContext;
use crate::codex::TurnContext;
use crate::state::TaskKind;
use async_trait::async_trait;
use codex_protocol::user_input::UserInput;
use tokio_util::sync::CancellationToken;
#[derive(Clone, Copy, Default)]
pub(crate) struct CompactTask;
@@ -27,6 +24,13 @@ impl SessionTask for CompactTask {
input: Vec<UserInput>,
_cancellation_token: CancellationToken,
) -> Option<String> {
compact::run_compact_task(session.clone_session(), ctx, input).await
let session = session.clone_session();
if crate::compact::should_use_remote_compact_task(&session).await {
crate::compact_remote::run_remote_compact_task(session, ctx).await
} else {
crate::compact::run_compact_task(session, ctx, input).await
}
None
}
}

View File

@@ -1,10 +1,14 @@
use crate::codex::TurnContext;
use crate::protocol::EventMsg;
use crate::protocol::WarningEvent;
use crate::state::TaskKind;
use crate::tasks::SessionTask;
use crate::tasks::SessionTaskContext;
use async_trait::async_trait;
use codex_git::CreateGhostCommitOptions;
use codex_git::GhostSnapshotReport;
use codex_git::GitToolingError;
use codex_git::capture_ghost_snapshot_report;
use codex_git::create_ghost_commit;
use codex_protocol::models::ResponseItem;
use codex_protocol::user_input::UserInput;
@@ -39,6 +43,27 @@ impl SessionTask for GhostSnapshotTask {
_ = cancellation_token.cancelled() => true,
_ = async {
let repo_path = ctx_for_task.cwd.clone();
// First, compute a snapshot report so we can warn about
// large untracked directories before running the heavier
// snapshot logic.
if let Ok(Ok(report)) = tokio::task::spawn_blocking({
let repo_path = repo_path.clone();
move || {
let options = CreateGhostCommitOptions::new(&repo_path);
capture_ghost_snapshot_report(&options)
}
})
.await
&& let Some(message) = format_large_untracked_warning(&report) {
session
.session
.send_event(
&ctx_for_task,
EventMsg::Warning(WarningEvent { message }),
)
.await;
}
// Required to run in a dedicated blocking pool.
match tokio::task::spawn_blocking(move || {
let options = CreateGhostCommitOptions::new(&repo_path);
@@ -56,23 +81,18 @@ impl SessionTask for GhostSnapshotTask {
.await;
info!("ghost commit captured: {}", ghost_commit.id());
}
Ok(Err(err)) => {
warn!(
Ok(Err(err)) => match err {
GitToolingError::NotAGitRepository { .. } => info!(
sub_id = ctx_for_task.sub_id.as_str(),
"failed to capture ghost snapshot: {err}"
);
let message = match err {
GitToolingError::NotAGitRepository { .. } => {
"Snapshots disabled: current directory is not a Git repository."
.to_string()
}
_ => format!("Snapshots disabled after ghost snapshot error: {err}."),
};
session
.session
.notify_background_event(&ctx_for_task, message)
.await;
}
"skipping ghost snapshot because current directory is not a Git repository"
),
_ => {
warn!(
sub_id = ctx_for_task.sub_id.as_str(),
"failed to capture ghost snapshot: {err}"
);
}
},
Err(err) => {
warn!(
sub_id = ctx_for_task.sub_id.as_str(),
@@ -108,3 +128,22 @@ impl GhostSnapshotTask {
Self { token }
}
}
fn format_large_untracked_warning(report: &GhostSnapshotReport) -> Option<String> {
if report.large_untracked_dirs.is_empty() {
return None;
}
const MAX_DIRS: usize = 3;
let mut parts: Vec<String> = Vec::new();
for dir in report.large_untracked_dirs.iter().take(MAX_DIRS) {
parts.push(format!("{} ({} files)", dir.path.display(), dir.file_count));
}
if report.large_untracked_dirs.len() > MAX_DIRS {
let remaining = report.large_untracked_dirs.len() - MAX_DIRS;
parts.push(format!("{remaining} more"));
}
Some(format!(
"Repository snapshot encountered large untracked directories: {}. This can slow Codex; consider adding these paths to .gitignore or disabling undo in your config.",
parts.join(", ")
))
}
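
To make the truncation behaviour of `format_large_untracked_warning` concrete, here is a small sketch of just the list-building part. The two structs are hypothetical stand-ins for the `codex_git` report types and carry only the fields the diff reads; the real types may differ. `summarize_large_dirs` is an illustrative name.

```rust
use std::path::PathBuf;

/// Hypothetical stand-in for an entry in the report's large-directory list.
struct LargeUntrackedDir {
    path: PathBuf,
    file_count: usize,
}

/// Hypothetical stand-in for `GhostSnapshotReport` (fields assumed from usage).
struct GhostSnapshotReport {
    large_untracked_dirs: Vec<LargeUntrackedDir>,
}

/// Lists at most three directories and folds the rest into an "N more" entry.
fn summarize_large_dirs(report: &GhostSnapshotReport) -> Option<String> {
    if report.large_untracked_dirs.is_empty() {
        return None;
    }
    const MAX_DIRS: usize = 3;
    let mut parts: Vec<String> = report
        .large_untracked_dirs
        .iter()
        .take(MAX_DIRS)
        .map(|dir| format!("{} ({} files)", dir.path.display(), dir.file_count))
        .collect();
    let total = report.large_untracked_dirs.len();
    if total > MAX_DIRS {
        parts.push(format!("{} more", total - MAX_DIRS));
    }
    Some(parts.join(", "))
}
```

Capping the list keeps the warning event short even when a repository has many large untracked directories.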

View File

@@ -23,8 +23,18 @@ use codex_protocol::user_input::UserInput;
use super::SessionTask;
use super::SessionTaskContext;
#[derive(Clone, Copy, Default)]
pub(crate) struct ReviewTask;
#[derive(Clone, Copy)]
pub(crate) struct ReviewTask {
append_to_original_thread: bool,
}
impl ReviewTask {
pub(crate) fn new(append_to_original_thread: bool) -> Self {
Self {
append_to_original_thread,
}
}
}
#[async_trait]
impl SessionTask for ReviewTask {
@@ -52,13 +62,25 @@ impl SessionTask for ReviewTask {
None => None,
};
if !cancellation_token.is_cancelled() {
exit_review_mode(session.clone_session(), output.clone(), ctx.clone()).await;
exit_review_mode(
session.clone_session(),
output.clone(),
ctx.clone(),
self.append_to_original_thread,
)
.await;
}
None
}
async fn abort(&self, session: Arc<SessionTaskContext>, ctx: Arc<TurnContext>) {
exit_review_mode(session.clone_session(), None, ctx).await;
exit_review_mode(
session.clone_session(),
None,
ctx,
self.append_to_original_thread,
)
.await;
}
}
@@ -175,32 +197,35 @@ pub(crate) async fn exit_review_mode(
session: Arc<Session>,
review_output: Option<ReviewOutputEvent>,
ctx: Arc<TurnContext>,
append_to_original_thread: bool,
) {
let user_message = if let Some(out) = review_output.clone() {
let mut findings_str = String::new();
let text = out.overall_explanation.trim();
if !text.is_empty() {
findings_str.push_str(text);
}
if !out.findings.is_empty() {
let block = format_review_findings_block(&out.findings, None);
findings_str.push_str(&format!("\n{block}"));
}
crate::client_common::REVIEW_EXIT_SUCCESS_TMPL.replace("{results}", &findings_str)
} else {
crate::client_common::REVIEW_EXIT_INTERRUPTED_TMPL.to_string()
};
if append_to_original_thread {
let user_message = if let Some(out) = review_output.clone() {
let mut findings_str = String::new();
let text = out.overall_explanation.trim();
if !text.is_empty() {
findings_str.push_str(text);
}
if !out.findings.is_empty() {
let block = format_review_findings_block(&out.findings, None);
findings_str.push_str(&format!("\n{block}"));
}
crate::client_common::REVIEW_EXIT_SUCCESS_TMPL.replace("{results}", &findings_str)
} else {
crate::client_common::REVIEW_EXIT_INTERRUPTED_TMPL.to_string()
};
session
.record_conversation_items(
&ctx,
&[ResponseItem::Message {
id: None,
role: "user".to_string(),
content: vec![ContentItem::InputText { text: user_message }],
}],
)
.await;
session
.record_conversation_items(
&ctx,
&[ResponseItem::Message {
id: None,
role: "user".to_string(),
content: vec![ContentItem::InputText { text: user_message }],
}],
)
.await;
}
session
.send_event(
ctx.as_ref(),

View File

@@ -31,6 +31,8 @@ use crate::user_shell_command::user_shell_command_record_item;
use super::SessionTask;
use super::SessionTaskContext;
const USER_SHELL_TIMEOUT_MS: u64 = 60 * 60 * 1000; // 1 hour
#[derive(Clone)]
pub(crate) struct UserShellCommandTask {
command: String,
@@ -93,7 +95,9 @@ impl SessionTask for UserShellCommandTask {
command: command.clone(),
cwd: cwd.clone(),
env: create_env(&turn_context.shell_environment_policy),
timeout_ms: None,
// TODO(zhao-oai): Now that we have ExecExpiration::Cancellation, we
// should use that instead of an "arbitrarily large" timeout here.
expiration: USER_SHELL_TIMEOUT_MS.into(),
sandbox: SandboxType::None,
with_escalated_permissions: None,
justification: None,
@@ -122,7 +126,11 @@ impl SessionTask for UserShellCommandTask {
duration: Duration::ZERO,
timed_out: false,
};
let output_items = [user_shell_command_record_item(&raw_command, &exec_output)];
let output_items = [user_shell_command_record_item(
&raw_command,
&exec_output,
&turn_context,
)];
session
.record_conversation_items(turn_context.as_ref(), &output_items)
.await;
@@ -164,12 +172,19 @@ impl SessionTask for UserShellCommandTask {
aggregated_output: output.aggregated_output.text.clone(),
exit_code: output.exit_code,
duration: output.duration,
formatted_output: format_exec_output_str(&output),
formatted_output: format_exec_output_str(
&output,
turn_context.truncation_policy,
),
}),
)
.await;
let output_items = [user_shell_command_record_item(&raw_command, &output)];
let output_items = [user_shell_command_record_item(
&raw_command,
&output,
&turn_context,
)];
session
.record_conversation_items(turn_context.as_ref(), &output_items)
.await;
@@ -201,11 +216,18 @@ impl SessionTask for UserShellCommandTask {
aggregated_output: exec_output.aggregated_output.text.clone(),
exit_code: exec_output.exit_code,
duration: exec_output.duration,
formatted_output: format_exec_output_str(&exec_output),
formatted_output: format_exec_output_str(
&exec_output,
turn_context.truncation_policy,
),
}),
)
.await;
let output_items = [user_shell_command_record_item(&raw_command, &exec_output)];
let output_items = [user_shell_command_record_item(
&raw_command,
&exec_output,
&turn_context,
)];
session
.record_conversation_items(turn_context.as_ref(), &output_items)
.await;
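
Review note: the hunks above replace `timeout_ms` with `expiration` and rely on `.into()` from both `u64` and `Option<u64>`. The sketch below shows one way such conversions could look. It is a hypothetical stand-in, not the real `ExecExpiration`: the variant names other than `Cancellation` (which the TODO mentions and which is omitted here), the `None` result for the default case, and the derive list are all assumptions.

```rust
use std::time::Duration;

// Hypothetical stand-in for the `ExecExpiration` type referenced in the diff.
// The real type also has a `Cancellation` variant (per the TODO); it is left out here.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ExecExpiration {
    Timeout(Duration),
    DefaultTimeout,
}

// Lets call sites write `USER_SHELL_TIMEOUT_MS.into()`.
impl From<u64> for ExecExpiration {
    fn from(ms: u64) -> Self {
        ExecExpiration::Timeout(Duration::from_millis(ms))
    }
}

// Lets call sites write `params.timeout_ms.into()` when the field is an Option<u64>.
impl From<Option<u64>> for ExecExpiration {
    fn from(ms: Option<u64>) -> Self {
        match ms {
            Some(ms) => ExecExpiration::from(ms),
            None => ExecExpiration::DefaultTimeout,
        }
    }
}

impl ExecExpiration {
    // Converts back to the Option<u64> shape used by request structs.
    // Returning None for DefaultTimeout is an assumption of this sketch.
    fn timeout_ms(&self) -> Option<u64> {
        match self {
            ExecExpiration::Timeout(d) => Some(d.as_millis() as u64),
            ExecExpiration::DefaultTimeout => None,
        }
    }
}
```

Under these assumptions a one-hour constant converts to `Timeout(3600s)` and an unset timeout converts to `DefaultTimeout`, so existing call sites only need `.into()`.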

View File

@@ -0,0 +1,461 @@
//! Text encoding detection and conversion utilities for shell output.
//!
//! Windows users frequently run into code pages such as CP1251 or CP866 when invoking commands
//! through VS Code. Those bytes show up as invalid UTF-8 and used to be replaced with the standard
//! Unicode replacement character. We now lean on `chardetng` and `encoding_rs` so we can
//! automatically detect and decode the vast majority of legacy encodings before falling back to
//! lossy UTF-8 decoding.
use chardetng::EncodingDetector;
use encoding_rs::Encoding;
use encoding_rs::IBM866;
use encoding_rs::WINDOWS_1252;
/// Attempts to convert arbitrary bytes to UTF-8 with best-effort encoding detection.
pub fn bytes_to_string_smart(bytes: &[u8]) -> String {
if bytes.is_empty() {
return String::new();
}
if let Ok(utf8_str) = std::str::from_utf8(bytes) {
return utf8_str.to_owned();
}
let encoding = detect_encoding(bytes);
decode_bytes(bytes, encoding)
}
// Windows-1252 reassigns a handful of 0x80-0x9F slots to smart punctuation (curly quotes, dashes,
// ™). CP866 uses those *same byte values* for uppercase Cyrillic letters. When chardetng sees shell
// snippets that mix these bytes with ASCII it sometimes guesses IBM866, so “smart quotes” render as
// Cyrillic garbage (“УФЦ”) in VS Code. However, CP866 uppercase tokens are perfectly valid output
// (e.g., `ПРИ test`) so we cannot flip every 0x80-0x9F byte to Windows-1252 either. The compromise
// is to only coerce IBM866 to Windows-1252 when (a) the high bytes are exclusively the punctuation
// values listed below and (b) we spot adjacent ASCII. This targets the real failure case without
// clobbering legitimate Cyrillic text. If another code page has a similar collision, introduce a
// dedicated allowlist (like this one) plus unit tests that capture the actual shell output we want
// to preserve. Windows-1252 byte values for smart punctuation.
const WINDOWS_1252_PUNCT_BYTES: [u8; 8] = [
0x91, // ‘ (left single quotation mark)
0x92, // ’ (right single quotation mark)
0x93, // “ (left double quotation mark)
0x94, // ” (right double quotation mark)
0x95, // • (bullet)
0x96, // – (en dash)
0x97, // — (em dash)
0x99, // ™ (trade mark sign)
];
fn detect_encoding(bytes: &[u8]) -> &'static Encoding {
let mut detector = EncodingDetector::new();
detector.feed(bytes, true);
let (encoding, _is_confident) = detector.guess_assess(None, true);
// chardetng occasionally reports IBM866 for short strings that only contain Windows-1252 “smart
// punctuation” bytes (0x80-0x9F) because that range maps to Cyrillic letters in IBM866. When
// those bytes show up alongside an ASCII word (typical shell output: `"“`test), we know the
// intent was likely CP1252 quotes/dashes. Prefer WINDOWS_1252 in that specific situation so we
// render the characters users expect instead of Cyrillic junk. References:
// - Windows-1252 reserving 0x80-0x9F for curly quotes/dashes:
// https://en.wikipedia.org/wiki/Windows-1252
// - CP866 mapping 0x93/0x94/0x96 to Cyrillic letters, so the same bytes show up as “УФЦ” when
// mis-decoded: https://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/PC/CP866.TXT
if encoding == IBM866 && looks_like_windows_1252_punctuation(bytes) {
return WINDOWS_1252;
}
encoding
}
fn decode_bytes(bytes: &[u8], encoding: &'static Encoding) -> String {
let (decoded, _, had_errors) = encoding.decode(bytes);
if had_errors {
return String::from_utf8_lossy(bytes).into_owned();
}
decoded.into_owned()
}
/// Detect whether the byte stream looks like Windows-1252 “smart punctuation” wrapped around
/// otherwise-ASCII text.
///
/// Context: IBM866 and Windows-1252 share the 0x80-0x9F slot range. In IBM866 these bytes decode to
/// Cyrillic letters, whereas Windows-1252 maps them to curly quotes and dashes. chardetng can guess
/// IBM866 for short snippets that only contain those bytes, which turns shell output such as
/// `“test”` into unreadable Cyrillic. To avoid that, we treat inputs comprising a handful of bytes
/// from the problematic range plus ASCII letters as CP1252 punctuation. We deliberately do *not*
/// cap how many of those punctuation bytes we accept: VS Code frequently prints several quoted
/// phrases (e.g., `"foo" "bar"`), and truncating the count would once again mis-decode those as
/// Cyrillic. If we discover additional encodings with overlapping byte ranges, prefer adding
/// encoding-specific byte allowlists like `WINDOWS_1252_PUNCT_BYTES` and tests that exercise real-world
/// shell snippets.
fn looks_like_windows_1252_punctuation(bytes: &[u8]) -> bool {
let mut saw_extended_punctuation = false;
let mut saw_ascii_word = false;
for &byte in bytes {
if byte >= 0xA0 {
return false;
}
if (0x80..=0x9F).contains(&byte) {
if !is_windows_1252_punct(byte) {
return false;
}
saw_extended_punctuation = true;
}
if byte.is_ascii_alphabetic() {
saw_ascii_word = true;
}
}
saw_extended_punctuation && saw_ascii_word
}
fn is_windows_1252_punct(byte: u8) -> bool {
WINDOWS_1252_PUNCT_BYTES.contains(&byte)
}
#[cfg(test)]
mod tests {
use super::*;
use encoding_rs::BIG5;
use encoding_rs::EUC_KR;
use encoding_rs::GBK;
use encoding_rs::ISO_8859_2;
use encoding_rs::ISO_8859_3;
use encoding_rs::ISO_8859_4;
use encoding_rs::ISO_8859_5;
use encoding_rs::ISO_8859_6;
use encoding_rs::ISO_8859_7;
use encoding_rs::ISO_8859_8;
use encoding_rs::ISO_8859_10;
use encoding_rs::ISO_8859_13;
use encoding_rs::SHIFT_JIS;
use encoding_rs::WINDOWS_874;
use encoding_rs::WINDOWS_1250;
use encoding_rs::WINDOWS_1251;
use encoding_rs::WINDOWS_1253;
use encoding_rs::WINDOWS_1254;
use encoding_rs::WINDOWS_1255;
use encoding_rs::WINDOWS_1256;
use encoding_rs::WINDOWS_1257;
use encoding_rs::WINDOWS_1258;
use pretty_assertions::assert_eq;
#[test]
fn test_utf8_passthrough() {
// Fast path: valid UTF-8 should skip encoding detection and come back unchanged.
let utf8_text = "Hello, мир! 世界";
let bytes = utf8_text.as_bytes();
assert_eq!(bytes_to_string_smart(bytes), utf8_text);
}
#[test]
fn test_cp1251_russian_text() {
// Cyrillic text emitted by PowerShell/WSL in CP1251 should decode cleanly.
let bytes = b"\xEF\xF0\xE8\xEC\xE5\xF0"; // "пример" encoded with Windows-1251
assert_eq!(bytes_to_string_smart(bytes), "пример");
}
#[test]
fn test_cp1251_privet_word() {
// Regression: CP1251 words like "Привет" must not be mis-identified as Windows-1252.
let bytes = b"\xCF\xF0\xE8\xE2\xE5\xF2"; // "Привет" encoded with Windows-1251
assert_eq!(bytes_to_string_smart(bytes), "Привет");
}
#[test]
fn test_koi8_r_privet_word() {
// KOI8-R output should decode to the original Cyrillic as well.
let bytes = b"\xF0\xD2\xC9\xD7\xC5\xD4"; // "Привет" encoded with KOI8-R
assert_eq!(bytes_to_string_smart(bytes), "Привет");
}
#[test]
fn test_cp866_russian_text() {
// Legacy consoles (cmd.exe) commonly emit CP866 bytes for Cyrillic content.
let bytes = b"\xAF\xE0\xA8\xAC\xA5\xE0"; // "пример" encoded with CP866
assert_eq!(bytes_to_string_smart(bytes), "пример");
}
#[test]
fn test_cp866_uppercase_text() {
// Ensure the IBM866 heuristic still returns IBM866 for uppercase-only words.
let bytes = b"\x8F\x90\x88"; // "ПРИ" encoded with CP866 uppercase letters
assert_eq!(bytes_to_string_smart(bytes), "ПРИ");
}
#[test]
fn test_cp866_uppercase_followed_by_ascii() {
// Regression test: uppercase CP866 tokens next to ASCII text should not be treated as
// CP1252.
let bytes = b"\x8F\x90\x88 test"; // "ПРИ test" encoded with CP866 uppercase letters followed by ASCII
assert_eq!(bytes_to_string_smart(bytes), "ПРИ test");
}
#[test]
fn test_windows_1252_quotes() {
// Smart detection should map Windows-1252 punctuation into proper Unicode.
let bytes = b"\x93\x94test";
assert_eq!(bytes_to_string_smart(bytes), "\u{201C}\u{201D}test");
}
#[test]
fn test_windows_1252_multiple_quotes() {
// Longer snippets of punctuation (e.g., “foo” “bar”) should still flip to CP1252.
let bytes = b"\x93foo\x94 \x96 \x93bar\x94";
assert_eq!(
bytes_to_string_smart(bytes),
"\u{201C}foo\u{201D} \u{2013} \u{201C}bar\u{201D}"
);
}
#[test]
fn test_windows_1252_privet_gibberish_is_preserved() {
// Windows-1252 cannot encode Cyrillic; if the input literally contains "ПÑ..." we should not "fix" it.
let bytes = "Привет".as_bytes();
assert_eq!(bytes_to_string_smart(bytes), "Привет");
}
#[test]
fn test_iso8859_1_latin_text() {
// ISO-8859-1 (code page 28591) is the Latin segment used by LatArCyrHeb.
// encoding_rs unifies ISO-8859-1 with Windows-1252, so reuse that constant here.
let (encoded, _, had_errors) = WINDOWS_1252.encode("Hello");
assert!(!had_errors, "failed to encode Latin sample");
assert_eq!(bytes_to_string_smart(encoded.as_ref()), "Hello");
}
#[test]
fn test_iso8859_2_central_european_text() {
// ISO-8859-2 (code page 28592) covers additional Central European glyphs.
let (encoded, _, had_errors) = ISO_8859_2.encode("Příliš žluťoučký kůň");
assert!(!had_errors, "failed to encode ISO-8859-2 sample");
assert_eq!(
bytes_to_string_smart(encoded.as_ref()),
"Příliš žluťoučký kůň"
);
}
#[test]
fn test_iso8859_3_south_europe_text() {
// ISO-8859-3 (code page 28593) adds support for Maltese/Esperanto letters.
// chardetng rarely distinguishes ISO-8859-3 from neighboring Latin code pages, so we rely on
// an ASCII-only sample to ensure round-tripping still succeeds.
let (encoded, _, had_errors) = ISO_8859_3.encode("Esperanto and Maltese");
assert!(!had_errors, "failed to encode ISO-8859-3 sample");
assert_eq!(
bytes_to_string_smart(encoded.as_ref()),
"Esperanto and Maltese"
);
}
#[test]
fn test_iso8859_4_baltic_text() {
// ISO-8859-4 (code page 28594) targets the Baltic/Nordic repertoire.
let sample = "Šis ir rakstzīmju kodēšanas tests. Dažās valodās, kurās tiek \
izmantotas latīņu valodas burti, lēmuma pieņemšanai mums ir nepieciešams \
vairāk ieguldījuma.";
let (encoded, _, had_errors) = ISO_8859_4.encode(sample);
assert!(!had_errors, "failed to encode ISO-8859-4 sample");
assert_eq!(bytes_to_string_smart(encoded.as_ref()), sample);
}
#[test]
fn test_iso8859_5_cyrillic_text() {
// ISO-8859-5 (code page 28595) covers the Cyrillic portion.
let (encoded, _, had_errors) = ISO_8859_5.encode("Привет");
assert!(!had_errors, "failed to encode Cyrillic sample");
assert_eq!(bytes_to_string_smart(encoded.as_ref()), "Привет");
}
#[test]
fn test_iso8859_6_arabic_text() {
// ISO-8859-6 (code page 28596) covers the Arabic glyphs.
let (encoded, _, had_errors) = ISO_8859_6.encode("مرحبا");
assert!(!had_errors, "failed to encode Arabic sample");
assert_eq!(bytes_to_string_smart(encoded.as_ref()), "مرحبا");
}
#[test]
fn test_iso8859_7_greek_text() {
// ISO-8859-7 (code page 28597) is used for Greek locales.
let (encoded, _, had_errors) = ISO_8859_7.encode("Καλημέρα");
assert!(!had_errors, "failed to encode ISO-8859-7 sample");
assert_eq!(bytes_to_string_smart(encoded.as_ref()), "Καλημέρα");
}
#[test]
fn test_iso8859_8_hebrew_text() {
// ISO-8859-8 (code page 28598) covers the Hebrew glyphs.
let (encoded, _, had_errors) = ISO_8859_8.encode("שלום");
assert!(!had_errors, "failed to encode Hebrew sample");
assert_eq!(bytes_to_string_smart(encoded.as_ref()), "שלום");
}
#[test]
fn test_iso8859_9_turkish_text() {
// ISO-8859-9 (code page 28599) mirrors Latin-1 but inserts Turkish letters.
// encoding_rs exposes the equivalent Windows-1254 mapping.
let (encoded, _, had_errors) = WINDOWS_1254.encode("İstanbul");
assert!(!had_errors, "failed to encode ISO-8859-9 sample");
assert_eq!(bytes_to_string_smart(encoded.as_ref()), "İstanbul");
}
#[test]
fn test_iso8859_10_nordic_text() {
// ISO-8859-10 (code page 28600) adds additional Nordic letters.
let sample = "Þetta er prófun fyrir Ægir og Øystein.";
let (encoded, _, had_errors) = ISO_8859_10.encode(sample);
assert!(!had_errors, "failed to encode ISO-8859-10 sample");
assert_eq!(bytes_to_string_smart(encoded.as_ref()), sample);
}
#[test]
fn test_iso8859_11_thai_text() {
// ISO-8859-11 (code page 28601) mirrors TIS-620 / Windows-874 for Thai.
let sample = "ภาษาไทยสำหรับการทดสอบ ISO-8859-11";
// encoding_rs exposes the equivalent Windows-874 encoding, so use that constant.
let (encoded, _, had_errors) = WINDOWS_874.encode(sample);
assert!(!had_errors, "failed to encode ISO-8859-11 sample");
assert_eq!(bytes_to_string_smart(encoded.as_ref()), sample);
}
// ISO-8859-12 was never standardized, and encodings 14-16 cannot be distinguished reliably
// without the heuristics we removed (chardetng generally reports neighboring Latin pages), so
// we intentionally omit coverage for those slots until the detector can identify them.
#[test]
fn test_iso8859_13_baltic_text() {
// ISO-8859-13 (code page 28603) is common across Baltic languages.
let (encoded, _, had_errors) = ISO_8859_13.encode("Sveiki");
assert!(!had_errors, "failed to encode ISO-8859-13 sample");
assert_eq!(bytes_to_string_smart(encoded.as_ref()), "Sveiki");
}
#[test]
fn test_windows_1250_central_european_text() {
let (encoded, _, had_errors) = WINDOWS_1250.encode("Příliš žluťoučký kůň");
assert!(!had_errors, "failed to encode Central European sample");
assert_eq!(
bytes_to_string_smart(encoded.as_ref()),
"Příliš žluťoučký kůň"
);
}
#[test]
fn test_windows_1251_encoded_text() {
let (encoded, _, had_errors) = WINDOWS_1251.encode("Привет из Windows-1251");
assert!(!had_errors, "failed to encode Windows-1251 sample");
assert_eq!(
bytes_to_string_smart(encoded.as_ref()),
"Привет из Windows-1251"
);
}
#[test]
fn test_windows_1253_greek_text() {
let (encoded, _, had_errors) = WINDOWS_1253.encode("Γειά σου");
assert!(!had_errors, "failed to encode Greek sample");
assert_eq!(bytes_to_string_smart(encoded.as_ref()), "Γειά σου");
}
#[test]
fn test_windows_1254_turkish_text() {
let (encoded, _, had_errors) = WINDOWS_1254.encode("İstanbul");
assert!(!had_errors, "failed to encode Turkish sample");
assert_eq!(bytes_to_string_smart(encoded.as_ref()), "İstanbul");
}
#[test]
fn test_windows_1255_hebrew_text() {
let (encoded, _, had_errors) = WINDOWS_1255.encode("שלום");
assert!(!had_errors, "failed to encode Windows-1255 Hebrew sample");
assert_eq!(bytes_to_string_smart(encoded.as_ref()), "שלום");
}
#[test]
fn test_windows_1256_arabic_text() {
let (encoded, _, had_errors) = WINDOWS_1256.encode("مرحبا");
assert!(!had_errors, "failed to encode Windows-1256 Arabic sample");
assert_eq!(bytes_to_string_smart(encoded.as_ref()), "مرحبا");
}
#[test]
fn test_windows_1257_baltic_text() {
let (encoded, _, had_errors) = WINDOWS_1257.encode("Pērkons");
assert!(!had_errors, "failed to encode Baltic sample");
assert_eq!(bytes_to_string_smart(encoded.as_ref()), "Pērkons");
}
#[test]
fn test_windows_1258_vietnamese_text() {
let (encoded, _, had_errors) = WINDOWS_1258.encode("Xin chào");
assert!(!had_errors, "failed to encode Vietnamese sample");
assert_eq!(bytes_to_string_smart(encoded.as_ref()), "Xin chào");
}
#[test]
fn test_windows_874_thai_text() {
let (encoded, _, had_errors) = WINDOWS_874.encode("สวัสดีครับ นี่คือการทดสอบภาษาไทย");
assert!(!had_errors, "failed to encode Thai sample");
assert_eq!(
bytes_to_string_smart(encoded.as_ref()),
"สวัสดีครับ นี่คือการทดสอบภาษาไทย"
);
}
#[test]
fn test_windows_932_shift_jis_text() {
let (encoded, _, had_errors) = SHIFT_JIS.encode("こんにちは");
assert!(!had_errors, "failed to encode Shift-JIS sample");
assert_eq!(bytes_to_string_smart(encoded.as_ref()), "こんにちは");
}
#[test]
fn test_windows_936_gbk_text() {
let (encoded, _, had_errors) = GBK.encode("你好,世界,这是一个测试");
assert!(!had_errors, "failed to encode GBK sample");
assert_eq!(
bytes_to_string_smart(encoded.as_ref()),
"你好,世界,这是一个测试"
);
}
#[test]
fn test_windows_949_korean_text() {
let (encoded, _, had_errors) = EUC_KR.encode("안녕하세요");
assert!(!had_errors, "failed to encode Korean sample");
assert_eq!(bytes_to_string_smart(encoded.as_ref()), "안녕하세요");
}
#[test]
fn test_windows_950_big5_text() {
let (encoded, _, had_errors) = BIG5.encode("繁體");
assert!(!had_errors, "failed to encode Big5 sample");
assert_eq!(bytes_to_string_smart(encoded.as_ref()), "繁體");
}
#[test]
fn test_latin1_cafe() {
// Latin-1 bytes remain common in Western-European locales; decode them directly.
let bytes = b"caf\xE9"; // codespell:ignore caf
assert_eq!(bytes_to_string_smart(bytes), "café");
}
#[test]
fn test_preserves_ansi_sequences() {
// ANSI escape sequences should survive regardless of the detected encoding.
let bytes = b"\x1b[31mred\x1b[0m";
assert_eq!(bytes_to_string_smart(bytes), "\x1b[31mred\x1b[0m");
}
#[test]
fn test_fallback_to_lossy() {
// Completely invalid sequences fall back to the old lossy behavior.
let invalid_bytes = [0xFF, 0xFE, 0xFD];
let result = bytes_to_string_smart(&invalid_bytes);
assert_eq!(result, String::from_utf8_lossy(&invalid_bytes));
}
}
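
Review note: the IBM866-to-Windows-1252 coercion can be exercised without `chardetng` or `encoding_rs`. The sketch below restates the acceptance rule from `looks_like_windows_1252_punctuation` using only the standard library. The byte values and their Unicode mappings are the ones listed in `WINDOWS_1252_PUNCT_BYTES`; the names `PUNCT`, `looks_like_cp1252_punct`, and `decode_cp1252_punct` are hypothetical, and the decoder handles only ASCII plus those eight bytes.

```rust
// Byte values from WINDOWS_1252_PUNCT_BYTES, paired with the characters Windows-1252 assigns.
const PUNCT: [(u8, char); 8] = [
    (0x91, '\u{2018}'), // left single quotation mark
    (0x92, '\u{2019}'), // right single quotation mark
    (0x93, '\u{201C}'), // left double quotation mark
    (0x94, '\u{201D}'), // right double quotation mark
    (0x95, '\u{2022}'), // bullet
    (0x96, '\u{2013}'), // en dash
    (0x97, '\u{2014}'), // em dash
    (0x99, '\u{2122}'), // trade mark sign
];

// Accepts input only if every non-ASCII byte is one of the eight punctuation bytes
// and at least one ASCII letter is present.
fn looks_like_cp1252_punct(bytes: &[u8]) -> bool {
    let mut saw_punct = false;
    let mut saw_ascii_letter = false;
    for &b in bytes {
        if b >= 0x80 {
            if !PUNCT.iter().any(|&(p, _)| p == b) {
                return false;
            }
            saw_punct = true;
        } else if b.is_ascii_alphabetic() {
            saw_ascii_letter = true;
        }
    }
    saw_punct && saw_ascii_letter
}

// Hypothetical helper: decodes ASCII plus the eight punctuation bytes, None for anything else.
fn decode_cp1252_punct(bytes: &[u8]) -> Option<String> {
    bytes
        .iter()
        .map(|&b| {
            if b.is_ascii() {
                Some(b as char)
            } else {
                PUNCT.iter().find(|&&(p, _)| p == b).map(|&(_, c)| c)
            }
        })
        .collect()
}
```

The rule accepts `b"\x93\x94test"` but rejects `b"\x8F\x90\x88 test"`, because 0x8F, 0x90, and 0x88 are not in the allowlist. That is the distinction the `test_cp866_uppercase_followed_by_ascii` regression test protects.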

View File

@@ -88,6 +88,7 @@ pub(crate) enum ToolEmitter {
cwd: PathBuf,
source: ExecCommandSource,
parsed_cmd: Vec<ParsedCommand>,
freeform: bool,
},
ApplyPatch {
changes: HashMap<PathBuf, FileChange>,
@@ -103,13 +104,19 @@ pub(crate) enum ToolEmitter {
}
impl ToolEmitter {
pub fn shell(command: Vec<String>, cwd: PathBuf, source: ExecCommandSource) -> Self {
pub fn shell(
command: Vec<String>,
cwd: PathBuf,
source: ExecCommandSource,
freeform: bool,
) -> Self {
let parsed_cmd = parse_command(&command);
Self::Shell {
command,
cwd,
source,
parsed_cmd,
freeform,
}
}
@@ -144,6 +151,7 @@ impl ToolEmitter {
cwd,
source,
parsed_cmd,
..
},
stage,
) => {
@@ -171,15 +179,17 @@ impl ToolEmitter {
ctx.turn,
EventMsg::PatchApplyBegin(PatchApplyBeginEvent {
call_id: ctx.call_id.to_string(),
turn_id: ctx.turn.sub_id.clone(),
auto_approved: *auto_approved,
changes: changes.clone(),
}),
)
.await;
}
(Self::ApplyPatch { .. }, ToolEventStage::Success(output)) => {
(Self::ApplyPatch { changes, .. }, ToolEventStage::Success(output)) => {
emit_patch_end(
ctx,
changes.clone(),
output.stdout.text.clone(),
output.stderr.text.clone(),
output.exit_code == 0,
@@ -187,11 +197,12 @@ impl ToolEmitter {
.await;
}
(
Self::ApplyPatch { .. },
Self::ApplyPatch { changes, .. },
ToolEventStage::Failure(ToolEventFailure::Output(output)),
) => {
emit_patch_end(
ctx,
changes.clone(),
output.stdout.text.clone(),
output.stderr.text.clone(),
output.exit_code == 0,
@@ -199,10 +210,17 @@ impl ToolEmitter {
.await;
}
(
Self::ApplyPatch { .. },
Self::ApplyPatch { changes, .. },
ToolEventStage::Failure(ToolEventFailure::Message(message)),
) => {
emit_patch_end(ctx, String::new(), (*message).to_string(), false).await;
emit_patch_end(
ctx,
changes.clone(),
String::new(),
(*message).to_string(),
false,
)
.await;
}
(
Self::UnifiedExec {
@@ -234,6 +252,19 @@ impl ToolEmitter {
self.emit(ctx, ToolEventStage::Begin).await;
}
fn format_exec_output_for_model(
&self,
output: &ExecToolCallOutput,
ctx: ToolEventCtx<'_>,
) -> String {
match self {
Self::Shell { freeform: true, .. } => {
super::format_exec_output_for_model_freeform(output, ctx.turn.truncation_policy)
}
_ => super::format_exec_output_for_model_structured(output, ctx.turn.truncation_policy),
}
}
pub async fn finish(
&self,
ctx: ToolEventCtx<'_>,
@@ -241,7 +272,7 @@ impl ToolEmitter {
) -> Result<String, FunctionCallError> {
let (event, result) = match out {
Ok(output) => {
let content = super::format_exec_output_for_model(&output);
let content = self.format_exec_output_for_model(&output, ctx);
let exit_code = output.exit_code;
let event = ToolEventStage::Success(output);
let result = if exit_code == 0 {
@@ -253,7 +284,7 @@ impl ToolEmitter {
}
Err(ToolError::Codex(CodexErr::Sandbox(SandboxErr::Timeout { output })))
| Err(ToolError::Codex(CodexErr::Sandbox(SandboxErr::Denied { output }))) => {
let response = super::format_exec_output_for_model(&output);
let response = self.format_exec_output_for_model(&output, ctx);
let event = ToolEventStage::Failure(ToolEventFailure::Output(*output));
let result = Err(FunctionCallError::RespondToModel(response));
(event, result)
@@ -342,7 +373,7 @@ async fn emit_exec_stage(
aggregated_output: output.aggregated_output.text.clone(),
exit_code: output.exit_code,
duration: output.duration,
formatted_output: format_exec_output_str(&output),
formatted_output: format_exec_output_str(&output, ctx.turn.truncation_policy),
};
emit_exec_end(ctx, exec_input, exec_result).await;
}
@@ -388,15 +419,23 @@ async fn emit_exec_end(
.await;
}
async fn emit_patch_end(ctx: ToolEventCtx<'_>, stdout: String, stderr: String, success: bool) {
async fn emit_patch_end(
ctx: ToolEventCtx<'_>,
changes: HashMap<PathBuf, FileChange>,
stdout: String,
stderr: String,
success: bool,
) {
ctx.session
.send_event(
ctx.turn,
EventMsg::PatchApplyEnd(PatchApplyEndEvent {
call_id: ctx.call_id.to_string(),
turn_id: ctx.turn.sub_id.clone(),
stdout,
stderr,
success,
changes,
}),
)
.await;
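
Review note: `format_exec_output_for_model` selects a formatter by matching on the `freeform` field of the `Shell` variant. The sketch below isolates that dispatch. `Emitter`, `Output`, and both formatter bodies are hypothetical placeholders; the real `format_exec_output_for_model_freeform` and `format_exec_output_for_model_structured` also take a truncation policy and their output formats are not shown in this diff.

```rust
// Hypothetical, trimmed-down stand-ins for ToolEmitter and ExecToolCallOutput.
enum Emitter {
    Shell { freeform: bool },
    ApplyPatch,
}

struct Output {
    exit_code: i32,
    text: String,
}

// Placeholder format; the real freeform layout is an assumption here.
fn format_freeform(out: &Output) -> String {
    format!("Exit code: {}\nOutput:\n{}", out.exit_code, out.text)
}

// Placeholder format; the real structured layout is an assumption here.
fn format_structured(out: &Output) -> String {
    format!("{{\"exit_code\":{},\"output\":{:?}}}", out.exit_code, out.text)
}

impl Emitter {
    // Only a Shell emitter created with freeform = true takes the freeform path;
    // every other variant and flag value falls through to the structured formatter.
    fn format_for_model(&self, out: &Output) -> String {
        match self {
            Emitter::Shell { freeform: true, .. } => format_freeform(out),
            _ => format_structured(out),
        }
    }
}
```

Matching on the literal `freeform: true` keeps the structured formatter as the default, so new emitter variants get structured output unless they opt in.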

View File

@@ -9,9 +9,11 @@ use crate::apply_patch::convert_apply_patch_to_protocol;
use crate::codex::TurnContext;
use crate::exec::ExecParams;
use crate::exec_env::create_env;
use crate::exec_policy::create_approval_requirement_for_command;
use crate::function_tool::FunctionCallError;
use crate::is_safe_command::is_known_safe_command;
use crate::protocol::ExecCommandSource;
use crate::sandboxing::SandboxPermissions;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
use crate::tools::context::ToolPayload;
@@ -35,7 +37,7 @@ impl ShellHandler {
ExecParams {
command: params.command,
cwd: turn_context.resolve_path(params.workdir.clone()),
timeout_ms: params.timeout_ms,
expiration: params.timeout_ms.into(),
env: create_env(&turn_context.shell_environment_policy),
with_escalated_permissions: params.with_escalated_permissions,
justification: params.justification,
@@ -57,7 +59,7 @@ impl ShellCommandHandler {
ExecParams {
command,
cwd: turn_context.resolve_path(params.workdir.clone()),
timeout_ms: params.timeout_ms,
expiration: params.timeout_ms.into(),
env: create_env(&turn_context.shell_environment_policy),
with_escalated_permissions: params.with_escalated_permissions,
justification: params.justification,
@@ -130,7 +132,7 @@ impl ToolHandler for ShellHandler {
turn,
tracker,
call_id,
true,
false,
)
.await
}
@@ -178,7 +180,7 @@ impl ToolHandler for ShellCommandHandler {
turn,
tracker,
call_id,
false,
true,
)
.await
}
@@ -192,7 +194,7 @@ impl ShellHandler {
turn: Arc<TurnContext>,
tracker: crate::tools::context::SharedTurnDiffTracker,
call_id: String,
is_user_shell_command: bool,
freeform: bool,
) -> Result<ToolOutput, FunctionCallError> {
// Approval policy guard for explicit escalation in non-OnRequest modes.
if exec_params.with_escalated_permissions.unwrap_or(false)
@@ -241,7 +243,7 @@ impl ShellHandler {
let req = ApplyPatchRequest {
patch: apply.action.patch.clone(),
cwd: apply.action.cwd.clone(),
timeout_ms: exec_params.timeout_ms,
timeout_ms: exec_params.expiration.timeout_ms(),
user_explicitly_approved: apply.user_explicitly_approved_this_action,
codex_exe: turn.codex_linux_sandbox_exe.clone(),
};
@@ -285,24 +287,30 @@ impl ShellHandler {
}
}
// Regular shell execution path.
let source = if is_user_shell_command {
ExecCommandSource::UserShell
} else {
ExecCommandSource::Agent
};
let emitter =
ToolEmitter::shell(exec_params.command.clone(), exec_params.cwd.clone(), source);
let source = ExecCommandSource::Agent;
let emitter = ToolEmitter::shell(
exec_params.command.clone(),
exec_params.cwd.clone(),
source,
freeform,
);
let event_ctx = ToolEventCtx::new(session.as_ref(), turn.as_ref(), &call_id, None);
emitter.begin(event_ctx).await;
let req = ShellRequest {
command: exec_params.command.clone(),
cwd: exec_params.cwd.clone(),
timeout_ms: exec_params.timeout_ms,
timeout_ms: exec_params.expiration.timeout_ms(),
env: exec_params.env.clone(),
with_escalated_permissions: exec_params.with_escalated_permissions,
justification: exec_params.justification.clone(),
approval_requirement: create_approval_requirement_for_command(
&turn.exec_policy,
&exec_params.command,
turn.approval_policy,
&turn.sandbox_policy,
SandboxPermissions::from(exec_params.with_escalated_permissions.unwrap_or(false)),
),
};
let mut orchestrator = ToolOrchestrator::new();
let mut runtime = ShellRuntime::new();
@@ -330,29 +338,30 @@ mod tests {
use std::path::PathBuf;
use crate::is_safe_command::is_known_safe_command;
use crate::shell::BashShell;
use crate::shell::PowerShellConfig;
use crate::shell::Shell;
use crate::shell::ZshShell;
use crate::shell::ShellType;
/// The logic for is_known_safe_command() has heuristics for known shells,
/// so we must ensure the commands generated by [ShellCommandHandler] can be
/// recognized as safe if the `command` is safe.
#[test]
fn commands_generated_by_shell_command_handler_can_be_matched_by_is_known_safe_command() {
let bash_shell = Shell::Bash(BashShell {
let bash_shell = Shell {
shell_type: ShellType::Bash,
shell_path: PathBuf::from("/bin/bash"),
});
};
assert_safe(&bash_shell, "ls -la");
let zsh_shell = Shell::Zsh(ZshShell {
let zsh_shell = Shell {
shell_type: ShellType::Zsh,
shell_path: PathBuf::from("/bin/zsh"),
});
};
assert_safe(&zsh_shell, "ls -la");
let powershell = Shell::PowerShell(PowerShellConfig {
let powershell = Shell {
shell_type: ShellType::PowerShell,
shell_path: PathBuf::from("pwsh.exe"),
});
};
assert_safe(&powershell, "ls -Name");
}

Some files were not shown because too many files have changed in this diff.