Compare commits

...

84 Commits

Author SHA1 Message Date
Ahmed Ibrahim
c414265e98 async 2025-12-04 13:16:21 -08:00
Ahmed Ibrahim
903b7774bc Add models endpoint (#7603)
- Use the codex-api crate to introduce models endpoint. 
- Add `models` to codex core tests helpers
- Add `ModelsInfo` for the endpoint return type
2025-12-04 12:57:54 -08:00
Ahmed Ibrahim
6e6338aa87 Inline response recording and remove process_items indirection (#7310)
- Inline response recording during streaming: `run_turn` now records
items as they arrive instead of building a `ProcessedResponseItem` list
and post‑processing via `process_items`.
- Simplify turn handling: `handle_output_item_done` returns the
follow‑up signal + optional tool future; `needs_follow_up` is set only
there, and in‑flight tool futures are drained once at the end (errors
logged, no extra state writes).
- Flattened stream loop: removed `process_items` indirection and the
extra output queue
- Tests: relaxed `tool_parallelism::tool_results_grouped` to allow any
completion order while still requiring matching call/output IDs.
2025-12-04 12:17:54 -08:00
Jeremy Rose
7dfc3a4dc7 add --branch to codex cloud exec (#7602)
Adds `--branch` to `codex cloud exec` to set base branch.
2025-12-04 12:00:18 -08:00
Ahmed Ibrahim
9b2055586d remove model_family from `config` (#7571)
- Remove `model_family` from `config`
- Make sure to still override config elements related to `model_family`
like supporting reasoning
2025-12-04 11:57:58 -08:00
Maxime Savard
ce0b38c056 FIX: WSL Paste image does not work (#6793)
## Related issues:  
- https://github.com/openai/codex/issues/3939  
- https://github.com/openai/codex/issues/2292  
- https://github.com/openai/codex/issues/7528 (after the correction in
https://github.com/openai/codex/pull/3990)

**Area:** `codex-cli` (image handling / clipboard & file uploads)  
**Platforms affected:** WSL (Ubuntu on Windows 10/11). No behavior
change on native Linux/macOS/Windows.

## Summary

This PR fixes image pasting and file uploads when running `codex-cli`
inside WSL. Previously, image operations failed silently or with
permission errors because paths weren't properly mapped between Windows
and WSL filesystems.

## Visual Result

<img width="1118" height="798" alt="image"
src="https://github.com/user-attachments/assets/14e10bc4-6b71-4d1f-b2a6-52c0a67dd069"
/>

## Previous Rust CLI

<img width="1175" height="859" alt="image"
src="https://github.com/user-attachments/assets/7ef41e29-9118-42c9-903c-7116d21e1751"
/>

## Root cause

The CLI assumed native Linux/Windows environments and didn't handle the
WSL↔Windows boundary:

- Used Linux paths for files that lived on the Windows host
- Missing path normalization between Windows (`C:\...`) and WSL
(`/mnt/c/...`)
- Clipboard access failed under WSL

### Why `Ctrl+V` doesn't work in WSL terminals

Most WSL terminal emulators (Windows Terminal, ConEmu, etc.) intercept
`Ctrl+V` at the terminal level to paste text from the Windows clipboard.
This keypress never reaches the CLI application itself, so our clipboard
image handler never gets triggered. Users need `Ctrl+Alt+V`.

## Changes

### WSL detection & path mapping

- Detects WSL by checking `/proc/sys/kernel/osrelease` and the
`WSL_INTEROP` env var
- Maps Windows drive paths to WSL mount paths (`C:\...` → `/mnt/c/...`)
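
A minimal sketch of the detection and mapping above (illustrative only; the
function names here are assumptions, not the actual
`protocol/src/platform.rs` API):

```rust
use std::path::PathBuf;

// Sketch: WSL kernels report "microsoft" in the release string, and WSL
// sets WSL_INTEROP for Windows interop.
fn is_wsl() -> bool {
    let osrelease =
        std::fs::read_to_string("/proc/sys/kernel/osrelease").unwrap_or_default();
    osrelease.to_lowercase().contains("microsoft") || std::env::var_os("WSL_INTEROP").is_some()
}

// Sketch: map `C:\Users\me\img.png` to `/mnt/c/Users/me/img.png`.
fn windows_path_to_wsl(path: &str) -> Option<PathBuf> {
    let mut chars = path.chars();
    let drive = chars.next()?.to_ascii_lowercase();
    if !drive.is_ascii_alphabetic() || chars.next()? != ':' {
        return None;
    }
    let rest: String = chars.map(|c| if c == '\\' { '/' } else { c }).collect();
    Some(PathBuf::from(format!("/mnt/{drive}{rest}")))
}
```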

### Clipboard fallback for WSL

- When clipboard access fails under WSL, falls back to PowerShell to
extract images from the Windows clipboard
- Saves to a temp file and maps the path back to WSL
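
Roughly, the fallback shells out to Windows PowerShell, saves the clipboard
image to a temp file on the Windows side, and maps that path back into WSL.
A sketch (the temp path and exact script are assumptions;
`Get-Clipboard -Format Image` exists in Windows PowerShell 5.1, not pwsh 7+):

```rust
use std::process::Command;

// Sketch: dump the Windows clipboard image to a temp PNG via PowerShell.
fn clipboard_image_via_powershell() -> std::io::Result<std::process::ExitStatus> {
    let script = r#"$img = Get-Clipboard -Format Image
if ($img -eq $null) { exit 1 }
$img.Save('C:\Temp\codex-paste.png')"#;
    Command::new("powershell.exe")
        .args(["-NoProfile", "-NonInteractive", "-Command", script])
        .status()
    // The caller would then map C:\Temp\codex-paste.png back to
    // /mnt/c/Temp/codex-paste.png before attaching it.
}
```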

### UI improvements

- Shows `Ctrl+Alt+V` hint on WSL (many terminals intercept plain
`Ctrl+V`)
- Better error messages for unreadable images

## Performance

- Negligible overhead. The fallback adds a single FS copy to a temp file
only when needed.
- Direct streaming remains the default.

## Files changed

- `protocol/src/lib.rs` – Added platform detection module  
- `protocol/src/models.rs` – Added WSL path mapping for local images  
- `protocol/src/platform.rs` – New module with WSL detection utilities  
- `tui/src/bottom_pane/chat_composer.rs` – Added base64 data URL support
and WSL path mapping
- `tui/src/bottom_pane/footer.rs` – WSL-aware keyboard shortcuts  
- `tui/src/clipboard_paste.rs` – PowerShell clipboard fallback

## How to reproduce the original bug (pre-fix)

1. Run `codex-cli` inside WSL2 on Windows.  
2. Paste an image from the Windows clipboard or drag an image from
`C:\...` into the terminal.
3. Observe that the image is not attached (silent failure) or an error
is logged; no artifact reaches the tool.

## How to verify the fix

1. Build this branch and run `codex-cli` inside WSL2.  
2. Paste from clipboard and drag from both Windows and WSL paths.  
3. Confirm that the image appears in the tool and the CLI shows a single
concise info line (no warning unless fallback was used).

I’m happy to adjust paths, naming, or split helpers into a separate
module if you prefer.

## How to try this branch

If you want to try this before it’s merged, you can use my Git branch:

Repository: https://github.com/Waxime64/codex.git  
Branch: `wsl-image-2`

1. Start WSL on your Windows machine.
2. Clone the repository and switch to the branch:
   ```bash
   git clone https://github.com/Waxime64/codex.git
   cd codex
   git checkout wsl-image-2
   # then go into the Rust workspace root, e.g.:
   cd codex-rs
   ```
3. Build the TUI binary:
   ```bash
   cargo build -p codex-tui --bin codex-tui --release
   ```
4. Install the binary:
   ```bash
   sudo install -m 0755 target/release/codex-tui /usr/local/bin/codex
   ```
5. From the project directory where you want to use Codex, start it with:
   ```bash
   cd /path/to/your/project
   /usr/local/bin/codex
   ```

On WSL, use Ctrl+Alt+V to paste an image from the Windows clipboard into
the chat.
2025-12-04 10:50:20 -08:00
Dylan Hurd
37c36024c7 chore(core): test apply_patch_cli on Windows (#7554)
## Summary
These tests pass on Windows; let's enable them.

## Testing
- [x] These are more tests
2025-12-04 10:39:45 -08:00
jif-oai
291b54a762 chore: review in read-only (#7593) 2025-12-04 10:01:12 -08:00
jif-oai
2b5d0b2935 feat: update sandbox policy to allow TTY (#7580)
**Change**: Seatbelt now allows file-ioctl on /dev/ttys[0-9]+ even
without the sandbox extension so pre-created PTYs remain interactive
(Python REPL, shells).

**Risk**: A seatbelted process that already holds a PTY fd (including
one it shouldn’t) could issue tty ioctls like TIOCSTI or termios changes
on that fd. This doesn’t allow opening new PTYs or reading/writing them;
it only broadens ioctl capability on existing fds.

**Why acceptable**: We already hand the child its PTY for interactive
use; restoring ioctls is required for isatty() and prompts to work. The
attack requires being given or inheriting a sensitive PTY fd; by design
we don’t hand untrusted processes other users’ PTYs (we don't hand them
any PTYs actually), so the practical exposure is limited to the PTY
intentionally allocated for the session.

**Validation**:
Running
```
start a python interpreter and keep it running
```
Followed by:
* `calculate 1+1 using it` -> works as expected
* `Use this Python session to run the command just fix in
/Users/jif/code/codex/codex-rs` -> does not work as expected
2025-12-04 17:58:58 +00:00
zhao-oai
404a1ea34b Update execpolicy.md (#7595) 2025-12-04 17:55:42 +00:00
jif-oai
36edb412b1 fix: release session ID when not used (#7592) 2025-12-04 17:42:16 +00:00
jif-oai
1b2509f05a chore: default warning messages to true (#7588) 2025-12-04 17:29:23 +00:00
pakrym-oai
f1b7cdc3bd Use shared check sandboxing (#7547) 2025-12-04 08:34:09 -08:00
pakrym-oai
c4e18f1b63 Slightly better status display for unified exec (#7563)
Trim bash -lc
2025-12-04 08:32:54 -08:00
jif-oai
8f4e00e1f1 chore: tool tip for /prompt (#7591) 2025-12-04 15:13:49 +00:00
zhao-oai
87666695ba execpolicy tui flow (#7543)
## Updating the `execpolicy` TUI flow

In the TUI, when going through the command approval flow, codex will now
ask the user if they would like to whitelist the FIRST unmatched command
among a chain of commands.

For example, let's say the agent wants to run `apple | pear` with an
empty `execpolicy`

Neither `apple` nor `pear` will match an `execpolicy` rule. Thus, when
prompting the user, the codex TUI will ask whether they would like to
whitelist `apple`.

If the agent wants to run `apple | pear` again, the user will be prompted
again because `pear` is still unknown. This time, they will be asked
whether they'd like to whitelist `pear`.
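
In other words, the approval flow walks the pipeline and surfaces the first
command no rule matches, roughly (an illustrative model, not the real
`execpolicy` API):

```rust
// Return the first command in a pipeline that no execpolicy rule matches.
fn first_unmatched<'a>(chain: &[&'a str], matches_rule: impl Fn(&str) -> bool) -> Option<&'a str> {
    chain.iter().copied().find(|cmd| !matches_rule(cmd))
}

fn main() {
    // After the user whitelists `apple`, the next prompt targets `pear`.
    let rules = |cmd: &str| cmd == "apple";
    assert_eq!(first_unmatched(&["apple", "pear"], rules), Some("pear"));
}
```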

Here's a demo video of this flow:


https://github.com/user-attachments/assets/fd160717-f6cb-46b0-9f4a-f0a974d4e710

This PR also removed the `allow for this session` option from the TUI.
2025-12-04 07:58:13 +00:00
ae
871f44f385 Add Enterprise plan to ChatGPT login description (#6918)
## Summary
- update ChatGPT onboarding login description to mention Enterprise
plans alongside Plus, Pro, and Team

## Testing
- just fmt


------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_691e088daf20832c88d8b667adf45128)
2025-12-03 23:47:46 -08:00
zhao-oai
3d35cb4619 Refactor execpolicy fallback evaluation (#7544)
## Refactor of the `execpolicy` crate

To illustrate why we need this refactor, consider an agent attempting to
run `apple | rm -rf ./`. Suppose `apple` is allowed by `execpolicy`.
Before this PR, `execpolicy` would consider `apple` and `rm -rf ./` and
only render one rule match: `Allow`. We would skip any heuristic checks on
`rm -rf ./` and immediately approve `apple | rm -rf ./` to run.

To fix this, we now thread a `fallback` evaluation function into
`execpolicy` that runs when no `execpolicy` rules match a given command.
In our example, we would run `fallback` on `rm -rf ./` and prevent
`apple | rm -rf ./` from being run without approval.
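
A toy model of the new flow (illustrative only; the real crate's types and
signatures differ):

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum Decision {
    Allow,
    NeedsApproval,
}

// Every command in the pipeline is evaluated; commands with no matching
// rule go through `fallback` instead of being silently skipped.
fn evaluate_chain(
    rule_match: impl Fn(&str) -> Option<Decision>,
    fallback: impl Fn(&str) -> Decision,
    chain: &[&str],
) -> Decision {
    for cmd in chain {
        match rule_match(cmd).unwrap_or_else(|| fallback(cmd)) {
            Decision::Allow => continue,
            Decision::NeedsApproval => return Decision::NeedsApproval,
        }
    }
    Decision::Allow
}

fn main() {
    let rule_match = |cmd: &str| (cmd == "apple").then_some(Decision::Allow);
    let fallback = |cmd: &str| {
        if cmd.starts_with("rm ") { Decision::NeedsApproval } else { Decision::Allow }
    };
    // Previously, `apple`'s Allow was the only rule match considered, so the
    // whole pipeline ran; now the fallback flags `rm -rf ./`.
    assert_eq!(
        evaluate_chain(rule_match, fallback, &["apple", "rm -rf ./"]),
        Decision::NeedsApproval
    );
}
```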
2025-12-03 23:39:48 -08:00
zhao-oai
e925a380dc whitelist command prefix integration in core and tui (#7033)
This PR enables the TUI to approve commands and add their prefixes to an
allowlist:
<img width="708" height="605" alt="Screenshot 2025-11-21 at 4 18 07 PM"
src="https://github.com/user-attachments/assets/56a19893-4553-4770-a881-becf79eeda32"
/>

note: we only show the option to whitelist the command when
1) the command is not multi-part (e.g. `git add -A && git commit -m 'hello
world'`)
2) the command is not already matched by an existing rule
2025-12-03 23:17:02 -08:00
Jeremy Rose
ccdeb9d9c4 use markdown for rendering tips (#7557)
## Summary
- render tooltip content through the markdown renderer and prepend a
bold Tip label
- wrap tooltips at the available width using the indent’s measured width
before adding the indent

## Testing
- `/root/.cargo/bin/just fmt`
- `RUSTFLAGS="--cfg tokio_unstable" TOKIO_UNSTABLE=1
/root/.cargo/bin/just fix -p codex-tui` *(fails: codex-tui tests
reference tokio::time::advance/start_paused gated behind the tokio
test-util feature)*
- `RUSTFLAGS="--cfg tokio_unstable" TOKIO_UNSTABLE=1 cargo test -p
codex-tui` *(fails: codex-tui tests reference
tokio::time::advance/start_paused gated behind the tokio test-util
feature)*

------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_693081406050832c9772ae9fa5dd77ca)
2025-12-04 04:58:35 +00:00
Ahmed Ibrahim
67e67e054f Migrate codex max (#7566)
- make codex max the default
- fix: we were doing some async work in a sync function, which caused
the TUI to panic
2025-12-03 20:54:48 -08:00
Eric Traut
edd98dd3b7 Remove test from #7481 that doesn't add much value (#7558)
Follow-up from PR #7481
2025-12-03 19:10:54 -08:00
Celia Chen
3e6cd5660c [app-server] make file_path for config optional (#7560)
When writing to config using `config/value/write` or `config/batchWrite`,
we currently require a `config/read` first in order to get the correct
file path to write to. Make this optional so we read from the default
user config file if it is not passed in.
2025-12-04 03:08:18 +00:00
Ahmed Ibrahim
cee37a32b2 Migrate model family to models manager (#7565)
This PR moves `ModelsFamily` to `openai_models`. It also propagates
`ModelsManager` to session services and uses it to drive the model
family. We also make `derive_default_model_family` private because it's a
step towards what we want: one place that provides model configuration.

This is a second step toward having one source of truth for model
information and config: `ModelsManager`.

Next steps would be to remove `ModelsFamily` from config. That's massive
because it's used in 41 places, mostly before launching `codex`. Also, we
need to make `find_family_for_model` private. That's also big because
it's used in 21 places, almost all of them tests.
2025-12-03 18:49:47 -08:00
Ahmed Ibrahim
8da91d1c89 Migrate tui to use models manager (#7555)
- This PR treats the `ModelsManager` like `AuthManager` and propagates it
into the TUI, replacing `builtin_model_presets`
- We are also decreasing the visibility of `builtin_model_presets`

based on https://github.com/openai/codex/pull/7552
2025-12-03 18:00:47 -08:00
Ahmed Ibrahim
00cc00ead8 Introduce ModelsManager and migrate app-server to use it. (#7552) 2025-12-03 17:17:56 -08:00
muyuanjin
70b97790be fix: wrap long exec lines in transcript overlay (#7481)
What
-----
- Fix the Ctrl+T transcript overlay so that very long exec output lines
are soft‑wrapped to the viewport width instead of being rendered as a
single truncated row.
- Add a regression test to `TranscriptOverlay` to ensure long exec
outputs are rendered on multiple lines in the overlay.

Why
----
- Previously, the transcript overlay rendered extremely long single exec
lines as one on‑screen row and simply cut them off at the right edge,
with no horizontal scrolling.
- This made it impossible to inspect the full content of long tool/exec
outputs in the transcript view, even though the main TUI view already
wrapped those lines.
- Fixes #7454.

How
----
- Update `ExecCell::transcript_lines` to wrap exec output lines using
the existing `RtOptions`/`word_wrap_line` helpers so that transcript
rendering is width‑aware.
- Reuse the existing line utilities to expand the wrapped `Line` values
into the transcript overlay, preserving styling while respecting the
current viewport width.
- Add `transcript_overlay_wraps_long_exec_output_lines` test in
`pager_overlay.rs` that constructs a long single‑line exec output,
renders the transcript overlay into a small buffer, and asserts that the
long marker string spans multiple rendered lines.
2025-12-03 16:45:08 -08:00
Michael Bolin
1cfc967eb8 fix: Features should be immutable over the lifetime of a session/thread (#7540)
I noticed that `features: Features` was defined on `struct
SessionConfiguration`, which is commonly owned by `SessionState`, which
is in turn owned by `Session`.

However, I do not believe that `Features` should be allowed to be modified
over the course of a session (if the feature state is not invariant, it
makes it harder to reason about), which argues that it should live on
`Session` rather than `SessionState` or `SessionConfiguration`.

This PR moves `Features` to `Session` and updates all call sites. It
appears the only place we were mutating `Features` was:

- in tests
- the sub-agent config for a review task:


3ef76ff29d/codex-rs/core/src/tasks/review.rs (L86-L89)

Note this change also means it is no longer an `async` call to check the
state of a feature, eliminating the possibility of a
[TOCTTOU](https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use)
error between checking the state of a feature and acting on it:


3ef76ff29d/codex-rs/core/src/codex.rs (L1069-L1076)
2025-12-03 16:12:31 -08:00
xl-openai
9a50a04400 feat: Support listing and selecting skills via $ or /skills (#7506)
List/Select skills with $-mention or /skills
2025-12-03 15:12:46 -08:00
Owen Lin
231ff19ca2 [app-server] fix: add thread_id to turn/plan/updated (#7553)
Realized we're missing this while migrating VSCE.
2025-12-03 15:00:07 -08:00
Aofei Sheng
de08c735a6 feat(tui): map Ctrl-P/N to arrow navigation in textarea (#7530)
- Treat Ctrl-P/N (and their C0 fallbacks) the same as Up/Down so cursor
movement matches popup/history behavior and control bytes never land in
the buffer

Fixes #7529

Signed-off-by: Aofei Sheng <aofei@aofeisheng.com>
2025-12-03 14:43:31 -08:00
muyuanjin
3395ebd96e fix(tui): limit user shell output by screen lines (#7448)
What
- Limit the TUI "user shell" output panel by the number of visible
screen lines rather than by the number of logical lines.
- Apply middle truncation after wrapping, so a few extremely long lines
cannot expand into hundreds of visible lines.
- Add a regression test to guard this behavior.

Why
When the `ExecCommandSource::UserShell` tool returns a small number of
very long logical lines, the TUI wraps those lines into many visual
lines. The existing truncation logic applied
`USER_SHELL_TOOL_CALL_MAX_LINES` to the number of logical lines *before*
wrapping.

As a result, a command like:

- `Ran bash -lc "grep -R --line-number 'maskAssetId' ."`

or a synthetic command that prints a single ~50,000‑character line, can
produce hundreds of screen lines and effectively flood the viewport. The
intended middle truncation for user shell output does not take effect in
this scenario.

How
- In `codex-rs/tui/src/exec_cell/render.rs`, change the `ExecCell`
rendering path for `ExecCommandSource::UserShell` so that:
- Each logical line from `CommandOutput::aggregated_output` is first
wrapped via `word_wrap_line` into multiple screen lines using the
appropriate `RtOptions` and width from the `EXEC_DISPLAY_LAYOUT`
configuration.
- `truncate_lines_middle` is then applied to the wrapped screen lines,
with `USER_SHELL_TOOL_CALL_MAX_LINES` as the limit. This means the limit
is enforced on visible screen lines, not logical lines.
- The existing layout struct (`ExecDisplayLayout`) continues to provide
`output_max_lines`, so user shell output is subject to both
`USER_SHELL_TOOL_CALL_MAX_LINES` and the layout-specific
`output_max_lines` constraint.
- Keep using `USER_SHELL_TOOL_CALL_MAX_LINES` as the cap, but interpret
it as a per‑tool‑call limit on screen lines.
- Add a regression test `user_shell_output_is_limited_by_screen_lines`
in `codex-rs/tui/src/exec_cell/render.rs` that:
- Constructs two extremely long logical lines containing a short marker
(`"Z"`), so each wrapped screen line still contains the marker.
  - Wraps them at a narrow width to generate many screen lines.
- Asserts that the unbounded wrapped output would exceed
`USER_SHELL_TOOL_CALL_MAX_LINES` screen lines.
- Renders an `ExecCell` for `ExecCommandSource::UserShell` at the same
width and counts rendered lines containing the marker.
- Asserts `output_screen_lines <= USER_SHELL_TOOL_CALL_MAX_LINES`,
guarding against regressions where truncation happens before wrapping.

This change keeps user shell output readable while ensuring it cannot
flood the TUI, even when the tool emits a few extremely long lines.
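
The key point is the ordering: wrap to screen lines first, then apply the
middle truncation. A simplified model (byte-based wrapping for brevity; the
real code uses `word_wrap_line` and `truncate_lines_middle` with
`RtOptions`):

```rust
// Simplified: wrap logical lines to `width` FIRST, then cap visible lines.
fn visible_lines(logical: &[&str], width: usize, max_lines: usize) -> Vec<String> {
    let wrapped: Vec<String> = logical
        .iter()
        .flat_map(|line| {
            line.as_bytes()
                .chunks(width.max(1))
                .map(|chunk| String::from_utf8_lossy(chunk).into_owned())
                .collect::<Vec<_>>()
        })
        .collect();
    if max_lines == 0 || wrapped.len() <= max_lines {
        return wrapped;
    }
    // Middle truncation: keep the head and tail, elide the rest.
    let head = max_lines / 2;
    let tail = max_lines - head - 1;
    let mut out = wrapped[..head].to_vec();
    out.push(format!("… {} lines elided …", wrapped.len() - head - tail));
    out.extend_from_slice(&wrapped[wrapped.len() - tail..]);
    out
}
```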

Tests
- `cargo test -p codex-tui`

Issue
- Fixes #7447
2025-12-03 13:43:17 -08:00
Ahmed Ibrahim
71504325d3 Migrate model preset (#7542)
- Introduce `openai_models` in `/core`
- Move `PRESETS` under it
- Move `ModelPreset`, `ModelUpgrade`, and `ReasoningEffortPreset` to
`protocol`
- Introduce `Op::ListModels` and `EventMsg::AvailableModels`

Next steps:
- migrate `app-server` and `tui` to use the introduced Operation
2025-12-03 20:30:43 +00:00
jif-oai
7f068cfbcc fix: main (#7546) 2025-12-03 20:15:12 +00:00
jif-oai
9e6c2c1e64 feat: add pycache to excluded directories (#7545) 2025-12-03 20:06:55 +00:00
jif-oai
8d0f023fa9 chore: update unified exec sandboxing detection (#7541)
No integration test for now because it would make them flaky. Tracking
it in my todos to add some once we have a clock based system for
integration tests
2025-12-03 20:06:47 +00:00
Ahmed Ibrahim
2ad980abf4 add slash resume (#7302)
`codex resume` isn't that discoverable. Adding it to the slash commands
can help
2025-12-03 11:25:44 -08:00
Owen Lin
3ef76ff29d chore: conversation_id -> thread_id in app-server feedback/upload (#7538)
Use `thread_id: Option<String>` instead of `conversation_id:
Option<ConversationId>` to be consistent with the rest of app-server v2
APIs.
2025-12-03 18:47:35 +00:00
Owen Lin
844de19561 chore: delete unused TodoList item from app-server (#7537)
This item is sent as a turn notification instead: `turn/plan/updated`,
similar to Turn diffs (which is `turn/diff/updated`).

We treat these concepts as ephemeral compared to Items which are usually
persisted.
2025-12-03 18:47:12 +00:00
Owen Lin
343aa35db1 chore: update app-server README (#7510)
Just keeping the README up to date.

- Reorganize structure a bit to read more naturally
- Update RPC methods
- Update events
2025-12-03 10:41:38 -08:00
Shijie Rao
c3e4f920b4 chore: remove bun env var detect (#7534)
### Summary


[Thread](https://openai.slack.com/archives/C08JZTV654K/p1764780129457519)

We were a bit aggressive in assuming the package installer based on BUN
env variables. Here we remove those checks.
2025-12-03 10:23:45 -08:00
Shijie Rao
4785344c9c feat: support list mcp servers in app server (#7505)
### Summary
Added `mcp/servers/list`, whose response is equivalent to the `/mcp` slash
command in the CLI. This will be used in VSCE MCP settings to show login
status, available tools, etc.
2025-12-03 09:51:46 -08:00
Jeremy Rose
9b3251f28f seatbelt: allow openpty() (#7507)
This allows `openpty(3)` to run in the default sandbox. Also permit
reading `kern.argmax`, which is the maximum number of arguments to
exec().
2025-12-03 09:15:38 -08:00
jif-oai
45f3250eec feat: codex tool tips (#7440)
<img width="551" height="316" alt="Screenshot 2025-12-01 at 12 22 26"
src="https://github.com/user-attachments/assets/6ca3deff-8ef8-4f74-a8e1-e5ea13fd6740"
/>
2025-12-03 16:29:13 +00:00
jif-oai
51307eaf07 feat: retroactive image placeholder to prevent poisoning (#6774)
If an image can't be read by the API, it will poison the entire history,
preventing any new turn on the conversation.
This detects such cases and replaces the image with a placeholder.
2025-12-03 11:35:56 +00:00
jif-oai
42ae738f67 feat: model warning in case of apply patch (#7494) 2025-12-03 09:07:31 +00:00
Dylan Hurd
00ef9d3784 fix(tui) Support image paste from clipboard on native Windows (#7514)
Closes #3404 

## Summary
On Windows, Ctrl+V does not work for the same reason that Cmd+V does not
work on macOS. This PR adds Alt/Option+V detection, which allows Windows
users to paste images from the clipboard using Alt+V.

We could swap between just Ctrl on Mac and just Alt on Windows, but this
felt simpler - I don't feel strongly about it.

Note that this will NOT address image pasting in WSL environments, due
to issues with WSL <> Windows clipboards. I'm planning to address that
in a separate PR since it will likely warrant some discussion.

## Testing
- [x] Tested locally on a Mac and Windows laptop
2025-12-02 22:12:49 -08:00
Robby He
f3989f6092 fix(unified_exec): use platform default shell when unified_exec shell… (#7486)
# Unified Exec Shell Selection on Windows

## Problem

reference issue #7466

The `unified_exec` handler currently deserializes model-provided tool
calls into the `ExecCommandArgs` struct:

```rust
#[derive(Debug, Deserialize)]
struct ExecCommandArgs {
    cmd: String,
    #[serde(default)]
    workdir: Option<String>,
    #[serde(default = "default_shell")]
    shell: String,
    #[serde(default = "default_login")]
    login: bool,
    #[serde(default = "default_exec_yield_time_ms")]
    yield_time_ms: u64,
    #[serde(default)]
    max_output_tokens: Option<usize>,
    #[serde(default)]
    with_escalated_permissions: Option<bool>,
    #[serde(default)]
    justification: Option<String>,
}
```

The `shell` field uses a hard-coded default:

```rust
fn default_shell() -> String {
    "/bin/bash".to_string()
}
```

When the model returns a tool call JSON that only contains `cmd` (which
is the common case), Serde fills in `shell` with this default value.
Later, `get_command` uses that value as if it were a model-provided
shell path:

```rust
fn get_command(args: &ExecCommandArgs) -> Vec<String> {
    let shell = get_shell_by_model_provided_path(&PathBuf::from(args.shell.clone()));
    shell.derive_exec_args(&args.cmd, args.login)
}
```

On Unix, this usually resolves to `/bin/bash` and works as expected.
However, on Windows this behavior is problematic:

- The hard-coded `"/bin/bash"` is not a valid Windows path.
- `get_shell_by_model_provided_path` treats this as a model-specified
shell, and tries to resolve it (e.g. via `which::which("bash")`), which
may or may not exist and may not behave as intended.
- In practice, this leads to commands being executed under a non-default
or non-existent shell on Windows (for example, WSL bash), instead of the
expected Windows PowerShell or `cmd.exe`.

The core of the issue is that **"model did not specify `shell`" is
currently interpreted as "the model explicitly requested `/bin/bash`"**,
which is both Unix-specific and wrong on Windows.

## Proposed Solution

Instead of hard-coding `"/bin/bash"` into `ExecCommandArgs`, we should
distinguish between:

1. **The model explicitly specifying a shell**, e.g.:

   ```json
   {
     "cmd": "echo hello",
     "shell": "pwsh"
   }
   ```

In this case, we *do* want to respect the model’s choice and use
`get_shell_by_model_provided_path`.

2. **The model omitting the `shell` field entirely**, e.g.:

   ```json
   {
     "cmd": "echo hello"
   }
   ```

In this case, we should *not* assume `/bin/bash`. Instead, we should use
`default_user_shell()` and let the platform decide.

To express this distinction, we can:

1. Change `shell` to be optional in `ExecCommandArgs`:

   ```rust
   #[derive(Debug, Deserialize)]
   struct ExecCommandArgs {
       cmd: String,
       #[serde(default)]
       workdir: Option<String>,
       #[serde(default)]
       shell: Option<String>,
       #[serde(default = "default_login")]
       login: bool,
       #[serde(default = "default_exec_yield_time_ms")]
       yield_time_ms: u64,
       #[serde(default)]
       max_output_tokens: Option<usize>,
       #[serde(default)]
       with_escalated_permissions: Option<bool>,
       #[serde(default)]
       justification: Option<String>,
   }
   ```

Here, the absence of `shell` in the JSON is represented as `shell:
None`, rather than a hard-coded string value.
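2. Update `get_command` to respect an explicitly provided shell and
otherwise defer to `default_user_shell()`. A sketch of that dispatch
(illustrative; the exact implementation may differ):

   ```rust
   fn get_command(args: &ExecCommandArgs) -> Vec<String> {
       let shell = match &args.shell {
           // The model explicitly chose a shell: respect it.
           Some(path) => get_shell_by_model_provided_path(&PathBuf::from(path)),
           // No shell specified: let the platform decide (PowerShell/cmd.exe
           // on Windows, the default shell on Unix).
           None => default_user_shell(),
       };
       shell.derive_exec_args(&args.cmd, args.login)
   }
   ```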
2025-12-02 21:49:25 -08:00
Matthew Zeng
dbec741ef0 Update device code auth strings. (#7498)
- [x] Update device code auth strings.
2025-12-02 17:36:38 -08:00
Michael Bolin
06e7667d0e fix: inline function marked as dead code (#7508)
I was debugging something else and noticed we could eliminate an
instance of `#[allow(dead_code)]` pretty easily.
2025-12-03 00:50:34 +00:00
Ahmed Ibrahim
1ef1fe67ec improve resume performance (#7303)
Reading the tail can be costly if we have a very big rollout item. We
can just read the file metadata instead.
2025-12-02 16:39:40 -08:00
Michael Bolin
ee191dbe81 fix: path resolution bug in npx (#7134)
When running `npx @openai/codex-shell-tool-mcp`, the old code derived
`__dirname` from `process.argv[1]`, which points to npx’s transient
wrapper script in
`~/.npm/_npx/134d0fb7e1a27652/node_modules/.bin/codex-shell-tool-mcp`.
That made `vendorRoot` resolve to `<npx cache>/vendor`, so the startup
checks failed with "Required binary missing" because it looked for
`codex-execve-wrapper` in the wrong place.

By relying on the real module `__dirname` and `path.resolve(__dirname,
"..", "vendor")`, the package now anchors to its installed location
under `node_modules/@openai/codex-shell-tool-mcp/`, so the bundled
binaries are found and npx launches correctly.
2025-12-02 16:37:14 -08:00
Joshua Sutton
ad9eeeb287 Ensure duplicate-length paste placeholders stay distinct (#7431)
Fix issue #7430 
Generate unique numbered placeholders for multiple large pastes of the
same length so deleting one no longer removes the others.
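
One way to picture the fix (the placeholder text and counter scheme here
are assumptions, not the actual implementation):

```rust
use std::collections::HashMap;

// Number repeated pastes of the same length so their placeholders differ,
// and deleting one can no longer match the others.
fn paste_placeholder(len: usize, seen: &mut HashMap<usize, u32>) -> String {
    let count = seen.entry(len).and_modify(|c| *c += 1).or_insert(1);
    if *count == 1 {
        format!("[Pasted {len} chars]")
    } else {
        format!("[Pasted {len} chars #{count}]")
    }
}
```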

Signed-off-by: Joshua <joshua1s@protonmail.com>
2025-12-02 16:16:01 -08:00
Michael Bolin
6b5b9a687e feat: support --version flag for @openai/codex-shell-tool-mcp (#7504)
I find it helpful to easily verify which version is running.

Tested:

```shell
~/code/codex3/codex-rs/exec-server$ cargo run --bin codex-exec-mcp-server -- --help
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.19s
     Running `/Users/mbolin/code/codex3/codex-rs/target/debug/codex-exec-mcp-server --help`
Usage: codex-exec-mcp-server [OPTIONS]

Options:
      --execve <EXECVE_WRAPPER>  Executable to delegate execve(2) calls to in Bash
      --bash <BASH_PATH>         Path to Bash that has been patched to support execve() wrapping
  -h, --help                     Print help
  -V, --version                  Print version
~/code/codex3/codex-rs/exec-server$ cargo run --bin codex-exec-mcp-server -- --version
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.17s
     Running `/Users/mbolin/code/codex3/codex-rs/target/debug/codex-exec-mcp-server --version`
codex-exec-server 0.0.0
```
2025-12-02 23:43:25 +00:00
Josh McKinney
58e1e570fa refactor: tui.rs extract several pieces (#7461)
Pull FrameRequester out of tui.rs into its own module and make a
FrameScheduler struct. This is effectively an Actor/Handler approach
(see https://ryhl.io/blog/actors-with-tokio/). Adds tests and docs.

Small refactor of pending_viewport_area logic.
2025-12-02 15:19:27 -08:00
Michael Bolin
ec93b6daf3 chore: make create_approval_requirement_for_command an async fn (#7501)
I think this might help with https://github.com/openai/codex/pull/7033
because `create_approval_requirement_for_command()` will soon need
access to `Session.state`, which is a `tokio::sync::Mutex` that needs to
be accessed via `async`.
2025-12-02 15:01:15 -08:00
liam
4d4778ec1c Trim history.jsonl when history.max_bytes is set (#6242)
This PR honors the `history.max_bytes` configuration parameter by
trimming `history.jsonl` whenever it grows past the configured limit.
While appending new entries we retain the newest record, drop the oldest
lines to stay within the byte budget, and serialize the compacted file
back to disk under the same lock to keep writers safe.
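
A minimal sketch of the trimming pass (illustrative; the real
implementation rewrites the file while holding the history lock):

```rust
// Drop the oldest lines until the total byte size fits the budget, always
// retaining the newest record even if it alone exceeds `max_bytes`.
fn trim_history(lines: Vec<String>, max_bytes: u64) -> Vec<String> {
    let mut total: u64 = lines.iter().map(|l| l.len() as u64 + 1).sum(); // +1 per '\n'
    let mut start = 0;
    while total > max_bytes && start + 1 < lines.len() {
        total -= lines[start].len() as u64 + 1;
        start += 1;
    }
    lines[start..].to_vec()
}
```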
2025-12-02 14:01:05 -08:00
Owen Lin
77c457121e fix: remove serde(flatten) annotation for TurnError (#7499)
The problem with using `serde(flatten)` on Turn status is that it
conditionally serializes the `error` field, which is not the pattern we
want in API v2 where all fields on an object should always be returned.

```
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct Turn {
    pub id: String,
    /// Only populated on a `thread/resume` response.
    /// For all other responses and notifications returning a Turn,
    /// the items field will be an empty list.
    pub items: Vec<ThreadItem>,
    #[serde(flatten)]
    pub status: TurnStatus,
}

#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(tag = "status", rename_all = "camelCase")]
#[ts(tag = "status", export_to = "v2/")]
pub enum TurnStatus {
    Completed,
    Interrupted,
    Failed { error: TurnError },
    InProgress,
}
```

serializes to:
```
{
  "id": "turn-123",
  "items": [],
  "status": "completed"
}

{
  "id": "turn-123",
  "items": [],
  "status": "failed",
  "error": {
    "message": "Tool timeout",
    "codexErrorInfo": null
  }
}
```

Instead we want:
```
{
  "id": "turn-123",
  "items": [],
  "status": "completed",
  "error": null
}

{
  "id": "turn-123",
  "items": [],
  "status": "failed",
  "error": {
    "message": "Tool timeout",
    "codexErrorInfo": null
  }
}
```
2025-12-02 21:39:10 +00:00
zhao-oai
5ebdc9af1b persisting credits if new snapshot does not contain credit info (#7490)
In response to incoming changes to the Responses headers, where the
header may sometimes not contain credit info (no longer forcing a credit
check).
2025-12-02 16:23:24 -05:00
Michael Bolin
f6a7da4ac3 fix: drop lock once it is no longer needed (#7500)
I noticed this while doing a post-commit review of https://github.com/openai/codex/pull/7467.
2025-12-02 20:46:26 +00:00
zhao-oai
1d09ac89a1 execpolicy helpers (#7032)
This PR
- adds a helper function to amend `.codexpolicy` files with new prefix
rules
- adds a utility to `Policy` allowing prefix rules to be added to
existing `Policy` structs

Both additions will be helpful as we thread `.codexpolicy` into the TUI
workflow.
2025-12-02 15:05:27 -05:00
Ahmed Ibrahim
127e307f89 Show token used when context window is unknown (#7497)
- Show context window usage in tokens instead of percentage when the
window length is unknown.
2025-12-02 11:45:50 -08:00
Ahmed Ibrahim
21ad1c1c90 Use non-blocking mutex (#7467) 2025-12-02 10:50:46 -08:00
lionel-oai
349734e38d Fix: track only untracked paths in ghost snapshots (#7470)
# Ghost snapshot ignores

This PR should close #7067, #7395, #7405.

Prior to this change the ghost snapshot task ran `git status
--ignored=matching` so the report picked up literally every ignored
file. When a directory only contained entries matched by patterns such
as `dozens/*.txt`, `/test123/generated/*.html`, or `/wp-includes/*`, Git
still enumerated them and the large-untracked-dir detection treated the
parent directory as “large,” even though everything inside was
intentionally ignored.

By removing `--ignored=matching` we only capture true untracked paths
now, so those patterns stay out of the snapshot report and no longer
trigger the “large untracked directories” warning.

---------

Signed-off-by: lionelchg <lionel.cheng@hotmail.fr>
Co-authored-by: lionelchg <lionel.cheng@hotmail.fr>
2025-12-02 19:42:33 +01:00
jif-oai
2222cab9ea feat: ignore standard directories (#7483) 2025-12-02 18:42:07 +00:00
Owen Lin
c2f8c4e9f4 fix: add ts number annotations for app-server v2 types (#7492)
These will be more ergonomic to work with in Typescript.
2025-12-02 18:09:41 +00:00
jif-oai
72b95db12f feat: intercept apply_patch for unified_exec (#7446) 2025-12-02 17:54:02 +00:00
Owen Lin
37ee6bf2c3 chore: remove mention of experimental/unstable from app-server README (#7474) 2025-12-02 17:35:05 +00:00
pakrym-oai
8b1e397211 Add request logging back (#7471)
Having full requests helps debugging
2025-12-02 07:57:55 -08:00
jif-oai
85e687c74a feat: add one off commands to app-server v2 (#7452) 2025-12-02 11:56:09 +00:00
jif-oai
9ee855ec57 feat: add warning message for the model (#7445)
Add a warning message as a user turn to the model if the model does not
behave as expected (here, for example, if the model opens too many
`unified_exec` sessions)
2025-12-02 11:56:00 +00:00
jif-oai
4b78e2ab09 chore: review everywhere (#7444) 2025-12-02 11:26:27 +00:00
jif-oai
85e2fabc9f feat: alias compaction (#7442) 2025-12-02 09:21:30 +00:00
Thibault Sottiaux
a8d5ad37b8 feat: experimental support for skills.md (#7412)
This change prototypes support for Skills with the CLI. This is an
**experimental** feature for internal testing.

---------

Co-authored-by: Gav Verma <gverma@openai.com>
2025-12-01 20:22:35 -08:00
Manoel Calixto
32e4a3a4d7 fix(tui): handle WSL clipboard image paths (#3990)
Fixes #3939 
Fixes #2803

## Summary
- convert Windows clipboard file paths into their `/mnt/<drive>`
equivalents when running inside WSL so pasted images resolve correctly
- add WSL detection helpers and share them with unit tests to cover both
native Windows and WSL clipboard normalization cases
- improve the test suite by exercising Windows path handling plus a
dedicated WSL conversion scenario and keeping the code path guarded by
targeted cfgs

## Testing
- just fmt
- cargo test -p codex-tui
- cargo clippy -p codex-tui --tests
- just fix -p codex-tui

## Screenshots
_Codex TUI screenshot:_
<img width="1880" height="848" alt="describe this copied image"
src="https://github.com/user-attachments/assets/c620d43c-f45c-451e-8893-e56ae85a5eea"
/>

_GitHub docs directory screenshot:_
<img width="1064" height="478" alt="image-copied"
src="https://github.com/user-attachments/assets/eb5eef6c-eb43-45a0-8bfe-25c35bcae753"
/>

Co-authored-by: Eric Traut <etraut@openai.com>
2025-12-01 16:54:20 -08:00
Steve Mostovoy
f443555728 fix(core): enable history lookup on windows (#7457)
- Add portable history log id helper to support inode-like tracking on
Unix and creation time on Windows
- Refactor history metadata and lookup to share code paths and allow
nonzero log ids across platforms
- Add coverage for lookup stability after appends
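
The portable id might look roughly like this (a sketch inferred from the
summary above; the function name is an assumption):

```rust
use std::fs::Metadata;

// Inode-based identity on Unix; creation time stands in for it on Windows.
#[cfg(unix)]
fn history_log_id(meta: &Metadata) -> u64 {
    use std::os::unix::fs::MetadataExt;
    meta.ino()
}

#[cfg(windows)]
fn history_log_id(meta: &Metadata) -> u64 {
    use std::os::windows::fs::MetadataExt;
    meta.creation_time() // 100 ns intervals since 1601-01-01 (UTC)
}
```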
2025-12-01 16:29:01 -08:00
Celia Chen
ff4ca9959c [app-server] Add ImageView item (#7468)
Add view_image tool call as image_view item.

Before:
```
< {
<   "method": "codex/event/view_image_tool_call",
<   "params": {
<     "conversationId": "019adc2f-2922-7e43-ace9-64f394019616",
<     "id": "0",
<     "msg": {
<       "call_id": "call_nBQDxnTfZQtgjGpVoGuDnRjz",
<       "path": "/Users/celia/code/codex/codex-rs/app-server-protocol/codex-cli-login.png",
<       "type": "view_image_tool_call"
<     }
<   }
< }
```

After:
```
< {
<   "method": "item/started",
<   "params": {
<     "item": {
<       "id": "call_nBQDxnTfZQtgjGpVoGuDnRjz",
<       "path": "/Users/celia/code/codex/codex-rs/app-server-protocol/codex-cli-login.png",
<       "type": "imageView"
<     },
<     "threadId": "019adc2f-2922-7e43-ace9-64f394019616",
<     "turnId": "0"
<   }
< }

< {
<   "method": "item/completed",
<   "params": {
<     "item": {
<       "id": "call_nBQDxnTfZQtgjGpVoGuDnRjz",
<       "path": "/Users/celia/code/codex/codex-rs/app-server-protocol/codex-cli-login.png",
<       "type": "imageView"
<     },
<     "threadId": "019adc2f-2922-7e43-ace9-64f394019616",
<     "turnId": "0"
<   }
< }
```
2025-12-01 23:56:05 +00:00
Dylan Hurd
5b25915d7e fix(apply_patch) tests for shell_command (#7307)
## Summary
Adds test coverage for invocations of apply_patch via shell_command with
heredoc, to validate behavior.

## Testing
- [x] These are tests
2025-12-01 15:09:22 -08:00
Michael Bolin
c0564edebe chore: update to rmcp@0.10.0 to pick up support for custom client notifications (#7462)
In https://github.com/openai/codex/pull/7112, I updated our `rmcp`
dependency to point to a personal fork while I tried to upstream my
proposed change. Now that
https://github.com/modelcontextprotocol/rust-sdk/pull/556 has been
upstreamed and included in the `0.10.0` release of the crate, we can go
back to using the mainline release.
2025-12-01 14:01:50 -08:00
linuxmetel
c936c68c84 fix: prevent MCP startup failure on missing 'type' field (#7417)
Fixes issue #7416, where codex-cli produced an "MCP startup failure on
missing 'type' field" error at startup.

- Cause: serde in `convert_to_rmcp`
(`codex-rs/rmcp-client/src/utils.rs`) failed because no `r#type` value
was provided
- Fix: set a default `r#type` value in the corresponding structs
2025-12-01 13:58:20 -05:00
Kaden Gruizenga
41760f8a09 docs: clarify codex max defaults and xhigh availability (#7449)
## Summary
Adds the missing `xhigh` reasoning level everywhere it should have been
documented, and makes clear it only works with `gpt-5.1-codex-max`.

## Changes

* `docs/config.md`

* Add `xhigh` to the official list of reasoning levels with a note that
`xhigh` is exclusive to Codex Max.

* `docs/example-config.md`

* Update the example comment adding `xhigh` as a valid option but only
for Codex Max.

* `docs/faq.md`

  * Update the model recommendation to `GPT-5.1 Codex Max`.
* Mention that users can choose `high` or the newly documented `xhigh`
level when using Codex Max.
2025-12-01 10:46:53 -08:00
Albert O'Shea
440c7acd8f fix: nix build missing rmcp output hash (#7436)
The output hash for `rmcp-0.9.0` was missing from the nix package (i.e.
`error: No hash was found while vendoring the git dependency
rmcp-0.9.0.`), blocking the build.
2025-12-01 10:45:31 -08:00
Ali Towaiji
0cc3b50228 Fix recent_commits(limit=0) returning 1 commit instead of 0 (#7334)
Fixes #7333

This is a small bug fix.

This PR fixes an inconsistency in `recent_commits` where `limit == 0`
still returns 1 commit due to the use of `limit.max(1)` when
constructing the `git log -n` argument.

Expected behavior: requesting 0 commits should return an empty list.

This PR:
- returns an empty `Vec` when `limit == 0`
- adds a test for `recent_commits(limit == 0)` that fails before the
change and passes afterwards
- maintains existing behavior for `limit > 0`

This aligns behavior with API expectations and avoids downstream
consumers misinterpreting the repository as having commit history when
`limit == 0` is used to explicitly request none.
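
The shape of the fix (a sketch; `git_log_n` is a hypothetical stand-in for
the actual `git log -n` invocation):

```rust
fn recent_commits(limit: usize) -> Vec<String> {
    if limit == 0 {
        return Vec::new(); // was `limit.max(1)`, which still ran `git log -n 1`
    }
    git_log_n(limit)
}

// Hypothetical stand-in for the real git invocation.
fn git_log_n(n: usize) -> Vec<String> {
    (0..n).map(|i| format!("commit-{i}")).collect()
}
```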

Happy to adjust if the current behavior is intentional.
2025-12-01 10:14:36 -08:00
Owen Lin
8532876ad8 [app-server] fix: emit item/fileChange/outputDelta for file change items (#7399) 2025-12-01 17:52:34 +00:00
189 changed files with 9416 additions and 2539 deletions

View File

@@ -14,6 +14,7 @@ In the codex-rs folder where the rust code lives:
- Do not use unsigned integer even if the number cannot be negative.
- When writing tests, prefer comparing the equality of entire objects over fields one by one.
- When making a change that adds or changes an API, ensure that the documentation in the `docs/` folder is up to date if applicable.
- Always prefer async functions when possible.
Run `just fmt` (in `codex-rs` directory) automatically after making Rust code changes; do not ask for approval to run it. Before finalizing a change to `codex-rs`, run `just fix -p <project>` (in `codex-rs` directory) to fix any linter issues in the code. Prefer scoping with `-p` to avoid slow workspacewide Clippy builds; only run `just fix` without `-p` if you changed shared crates. Additionally, run the tests:

View File

@@ -95,14 +95,6 @@ function detectPackageManager() {
return "bun";
}
if (
process.env.BUN_INSTALL ||
process.env.BUN_INSTALL_GLOBAL_DIR ||
process.env.BUN_INSTALL_BIN_DIR
) {
return "bun";
}
return userAgent ? "npm" : null;
}

47
codex-rs/Cargo.lock generated
View File

@@ -858,6 +858,7 @@ dependencies = [
"http",
"pretty_assertions",
"regex-lite",
"reqwest",
"serde",
"serde_json",
"thiserror 2.0.17",
@@ -865,6 +866,7 @@ dependencies = [
"tokio-test",
"tokio-util",
"tracing",
"wiremock",
]
[[package]]
@@ -1068,6 +1070,7 @@ dependencies = [
"serde_json",
"thiserror 2.0.17",
"tokio",
"tracing",
]
[[package]]
@@ -1116,12 +1119,10 @@ name = "codex-common"
version = "0.0.0"
dependencies = [
"clap",
"codex-app-server-protocol",
"codex-core",
"codex-lmstudio",
"codex-ollama",
"codex-protocol",
"once_cell",
"serde",
"toml",
]
@@ -1186,6 +1187,7 @@ dependencies = [
"seccompiler",
"serde",
"serde_json",
"serde_yaml",
"serial_test",
"sha1",
"sha2",
@@ -1281,6 +1283,7 @@ dependencies = [
"serde_json",
"shlex",
"starlark",
"tempfile",
"thiserror 2.0.17",
]
@@ -1612,6 +1615,7 @@ dependencies = [
"textwrap 0.16.2",
"tokio",
"tokio-stream",
"tokio-util",
"toml",
"tracing",
"tracing-appender",
@@ -1621,6 +1625,7 @@ dependencies = [
"unicode-segmentation",
"unicode-width 0.2.1",
"url",
"uuid",
"vt100",
]
@@ -4464,6 +4469,12 @@ version = "1.0.15"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "57c0d7b74b563b49d38dae00a0c37d4d6de9b432382b2892f0574ddcae73fd0a"
[[package]]
name = "pastey"
version = "0.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "57d6c094ee800037dff99e02cab0eaf3142826586742a270ab3d7a62656bd27a"
[[package]]
name = "path-absolutize"
version = "3.1.1"
@@ -5142,8 +5153,9 @@ dependencies = [
[[package]]
name = "rmcp"
version = "0.9.0"
source = "git+https://github.com/bolinfest/rust-sdk?branch=pr556#4d9cc16f4c76c84486344f542ed9a3e9364019ba"
version = "0.10.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "38b18323edc657390a6ed4d7a9110b0dec2dc3ed128eb2a123edfbafabdbddc5"
dependencies = [
"async-trait",
"base64",
@@ -5154,7 +5166,7 @@ dependencies = [
"http-body",
"http-body-util",
"oauth2",
"paste",
"pastey",
"pin-project-lite",
"process-wrap",
"rand 0.9.2",
@@ -5176,8 +5188,9 @@ dependencies = [
[[package]]
name = "rmcp-macros"
version = "0.9.0"
source = "git+https://github.com/bolinfest/rust-sdk?branch=pr556#4d9cc16f4c76c84486344f542ed9a3e9364019ba"
version = "0.10.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c75d0a62676bf8c8003c4e3c348e2ceb6a7b3e48323681aaf177fdccdac2ce50"
dependencies = [
"darling 0.21.3",
"proc-macro2",
@@ -5765,6 +5778,19 @@ dependencies = [
"syn 2.0.104",
]
[[package]]
name = "serde_yaml"
version = "0.9.34+deprecated"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6a8b1a1a2ebf674015cc02edccce75287f1a0130d394307b36743c2f5d504b47"
dependencies = [
"indexmap 2.12.0",
"itoa",
"ryu",
"serde",
"unsafe-libyaml",
]
[[package]]
name = "serial2"
version = "0.2.31"
@@ -6576,6 +6602,7 @@ dependencies = [
"futures-sink",
"futures-util",
"pin-project-lite",
"slab",
"tokio",
]
@@ -6979,6 +7006,12 @@ version = "0.2.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ebc1c04c71510c7f702b52b7c350734c9ff1295c464a03335b00bb84fc54f853"
[[package]]
name = "unsafe-libyaml"
version = "0.2.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "673aac59facbab8a9007c7f6108d11f63b603f7cabff99fabf650fea5c32b861"
[[package]]
name = "untrusted"
version = "0.9.0"

View File

@@ -59,15 +59,15 @@ license = "Apache-2.0"
# Internal
app_test_support = { path = "app-server/tests/common" }
codex-ansi-escape = { path = "ansi-escape" }
codex-api = { path = "codex-api" }
codex-app-server = { path = "app-server" }
codex-app-server-protocol = { path = "app-server-protocol" }
codex-apply-patch = { path = "apply-patch" }
codex-arg0 = { path = "arg0" }
codex-async-utils = { path = "async-utils" }
codex-backend-client = { path = "backend-client" }
codex-api = { path = "codex-api" }
codex-client = { path = "codex-client" }
codex-chatgpt = { path = "chatgpt" }
codex-client = { path = "codex-client" }
codex-common = { path = "common" }
codex-core = { path = "core" }
codex-exec = { path = "exec" }
@@ -169,15 +169,16 @@ pulldown-cmark = "0.10"
rand = "0.9"
ratatui = "0.29.0"
ratatui-macros = "0.6.0"
regex-lite = "0.1.7"
regex = "1.12.2"
regex-lite = "0.1.7"
reqwest = "0.12"
rmcp = { version = "0.9.0", default-features = false }
rmcp = { version = "0.10.0", default-features = false }
schemars = "0.8.22"
seccompiler = "0.5.0"
sentry = "0.34.0"
serde = "1"
serde_json = "1"
serde_yaml = "0.9"
serde_with = "3.16"
serial_test = "3.2.0"
sha1 = "0.10.6"
@@ -288,7 +289,6 @@ opt-level = 0
# ratatui = { path = "../../ratatui" }
crossterm = { git = "https://github.com/nornagon/crossterm", branch = "nornagon/color-query" }
ratatui = { git = "https://github.com/nornagon/ratatui", branch = "nornagon-v0.29.0-patch" }
rmcp = { git = "https://github.com/bolinfest/rust-sdk", branch = "pr556" }
# Uncomment to debug local changes.
# rmcp = { path = "../../rust-sdk/crates/rmcp" }

View File

@@ -139,6 +139,11 @@ client_request_definitions! {
response: v2::ModelListResponse,
},
McpServersList => "mcpServers/list" {
params: v2::ListMcpServersParams,
response: v2::ListMcpServersResponse,
},
LoginAccount => "account/login/start" {
params: v2::LoginAccountParams,
response: v2::LoginAccountResponse,
@@ -164,6 +169,12 @@ client_request_definitions! {
response: v2::FeedbackUploadResponse,
},
/// Execute a command (argv vector) under the server's sandbox.
OneOffCommandExec => "command/exec" {
params: v2::CommandExecParams,
response: v2::CommandExecResponse,
},
ConfigRead => "config/read" {
params: v2::ConfigReadParams,
response: v2::ConfigReadResponse,
@@ -511,6 +522,7 @@ server_notification_definitions! {
ItemCompleted => "item/completed" (v2::ItemCompletedNotification),
AgentMessageDelta => "item/agentMessage/delta" (v2::AgentMessageDeltaNotification),
CommandExecutionOutputDelta => "item/commandExecution/outputDelta" (v2::CommandExecutionOutputDeltaNotification),
FileChangeOutputDelta => "item/fileChange/outputDelta" (v2::FileChangeOutputDeltaNotification),
McpToolCallProgress => "item/mcpToolCall/progress" (v2::McpToolCallProgressNotification),
AccountUpdated => "account/updated" (v2::AccountUpdatedNotification),
AccountRateLimitsUpdated => "account/rateLimits/updated" (v2::AccountRateLimitsUpdatedNotification),

View File

@@ -0,0 +1,15 @@
use crate::protocol::v1;
use crate::protocol::v2;
impl From<v1::ExecOneOffCommandParams> for v2::CommandExecParams {
fn from(value: v1::ExecOneOffCommandParams) -> Self {
Self {
command: value.command,
timeout_ms: value
.timeout_ms
.map(|timeout| i64::try_from(timeout).unwrap_or(60_000)),
cwd: value.cwd,
sandbox_policy: value.sandbox_policy.map(std::convert::Into::into),
}
}
}

View File

@@ -2,6 +2,7 @@
// Exposes protocol pieces used by `lib.rs` via `pub use protocol::common::*;`.
pub mod common;
mod mappers;
pub mod thread_history;
pub mod v1;
pub mod v2;

View File

@@ -1,5 +1,6 @@
use crate::protocol::v2::ThreadItem;
use crate::protocol::v2::Turn;
use crate::protocol::v2::TurnError;
use crate::protocol::v2::TurnStatus;
use crate::protocol::v2::UserInput;
use codex_protocol::protocol::AgentReasoningEvent;
@@ -142,6 +143,7 @@ impl ThreadHistoryBuilder {
PendingTurn {
id: self.next_turn_id(),
items: Vec::new(),
error: None,
status: TurnStatus::Completed,
}
}
@@ -190,6 +192,7 @@ impl ThreadHistoryBuilder {
struct PendingTurn {
id: String,
items: Vec<ThreadItem>,
error: Option<TurnError>,
status: TurnStatus,
}
@@ -198,6 +201,7 @@ impl From<PendingTurn> for Turn {
Self {
id: value.id,
items: value.items,
error: value.error,
status: value.status,
}
}

View File

@@ -3,11 +3,11 @@ use std::path::PathBuf;
use codex_protocol::ConversationId;
use codex_protocol::config_types::ForcedLoginMethod;
use codex_protocol::config_types::ReasoningEffort;
use codex_protocol::config_types::ReasoningSummary;
use codex_protocol::config_types::SandboxMode;
use codex_protocol::config_types::Verbosity;
use codex_protocol::models::ResponseItem;
use codex_protocol::openai_models::ReasoningEffort;
use codex_protocol::parse_command::ParsedCommand;
use codex_protocol::protocol::AskForApproval;
use codex_protocol::protocol::EventMsg;

View File

@@ -2,14 +2,13 @@ use std::collections::HashMap;
use std::path::PathBuf;
use crate::protocol::common::AuthMode;
use codex_protocol::ConversationId;
use codex_protocol::account::PlanType;
use codex_protocol::approvals::SandboxCommandAssessment as CoreSandboxCommandAssessment;
use codex_protocol::config_types::ReasoningEffort;
use codex_protocol::config_types::ReasoningSummary;
use codex_protocol::items::AgentMessageContent as CoreAgentMessageContent;
use codex_protocol::items::TurnItem as CoreTurnItem;
use codex_protocol::models::ResponseItem;
use codex_protocol::openai_models::ReasoningEffort;
use codex_protocol::parse_command::ParsedCommand as CoreParsedCommand;
use codex_protocol::plan_tool::PlanItemArg as CorePlanItemArg;
use codex_protocol::plan_tool::StepStatus as CorePlanStepStatus;
@@ -22,6 +21,9 @@ use codex_protocol::protocol::TokenUsage as CoreTokenUsage;
use codex_protocol::protocol::TokenUsageInfo as CoreTokenUsageInfo;
use codex_protocol::user_input::UserInput as CoreUserInput;
use mcp_types::ContentBlock as McpContentBlock;
use mcp_types::Resource as McpResource;
use mcp_types::ResourceTemplate as McpResourceTemplate;
use mcp_types::Tool as McpTool;
use schemars::JsonSchema;
use serde::Deserialize;
use serde::Serialize;
@@ -138,6 +140,15 @@ v2_enum_from_core!(
}
);
v2_enum_from_core!(
pub enum McpAuthStatus from codex_protocol::protocol::McpAuthStatus {
Unsupported,
NotLoggedIn,
BearerToken,
OAuth
}
);
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Eq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
@@ -198,6 +209,8 @@ pub struct OverriddenMetadata {
pub struct ConfigWriteResponse {
pub status: WriteStatus,
pub version: String,
/// Canonical path to the config file that was written.
pub file_path: String,
pub overridden_metadata: Option<OverriddenMetadata>,
}
@@ -234,10 +247,11 @@ pub struct ConfigReadResponse {
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct ConfigValueWriteParams {
pub file_path: String,
pub key_path: String,
pub value: JsonValue,
pub merge_strategy: MergeStrategy,
/// Path to the config file to write; defaults to the user's `config.toml` when omitted.
pub file_path: Option<String>,
pub expected_version: Option<String>,
}
@@ -245,8 +259,9 @@ pub struct ConfigValueWriteParams {
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct ConfigBatchWriteParams {
pub file_path: String,
pub edits: Vec<ConfigEdit>,
/// Path to the config file to write; defaults to the user's `config.toml` when omitted.
pub file_path: Option<String>,
pub expected_version: Option<String>,
}
@@ -615,13 +630,44 @@ pub struct ModelListResponse {
pub next_cursor: Option<String>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct ListMcpServersParams {
/// Opaque pagination cursor returned by a previous call.
pub cursor: Option<String>,
/// Optional page size; defaults to a server-defined value.
pub limit: Option<u32>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct McpServer {
pub name: String,
pub tools: std::collections::HashMap<String, McpTool>,
pub resources: Vec<McpResource>,
pub resource_templates: Vec<McpResourceTemplate>,
pub auth_status: McpAuthStatus,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct ListMcpServersResponse {
pub data: Vec<McpServer>,
/// Opaque cursor to pass to the next call to continue after the last item.
/// If None, there are no more items to return.
pub next_cursor: Option<String>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct FeedbackUploadParams {
pub classification: String,
pub reason: Option<String>,
pub conversation_id: Option<ConversationId>,
pub thread_id: Option<String>,
pub include_logs: bool,
}
@@ -632,6 +678,26 @@ pub struct FeedbackUploadResponse {
pub thread_id: String,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct CommandExecParams {
pub command: Vec<String>,
#[ts(type = "number | null")]
pub timeout_ms: Option<i64>,
pub cwd: Option<PathBuf>,
pub sandbox_policy: Option<SandboxPolicy>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct CommandExecResponse {
pub exit_code: i32,
pub stdout: String,
pub stderr: String,
}
// === Threads, Turns, and Items ===
// Thread APIs
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Default, JsonSchema, TS)]
@@ -766,6 +832,7 @@ pub struct Thread {
/// Model provider used for this thread (for example, 'openai').
pub model_provider: String,
/// Unix timestamp (in seconds) when the thread was created.
#[ts(type = "number")]
pub created_at: i64,
/// [UNSTABLE] Path to the thread on disk.
pub path: PathBuf,
@@ -856,8 +923,9 @@ pub struct Turn {
/// For all other responses and notifications returning a Turn,
/// the items field will be an empty list.
pub items: Vec<ThreadItem>,
#[serde(flatten)]
pub status: TurnStatus,
/// Only populated when the Turn's status is failed.
pub error: Option<TurnError>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS, Error)]
@@ -879,12 +947,12 @@ pub struct ErrorNotification {
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(tag = "status", rename_all = "camelCase")]
#[ts(tag = "status", export_to = "v2/")]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub enum TurnStatus {
Completed,
Interrupted,
Failed { error: TurnError },
Failed,
InProgress,
}
@@ -1053,6 +1121,7 @@ pub enum ThreadItem {
/// The command's exit code.
exit_code: Option<i32>,
/// The duration of the command execution in milliseconds.
#[ts(type = "number | null")]
duration_ms: Option<i64>,
},
#[serde(rename_all = "camelCase")]
@@ -1078,9 +1147,6 @@ pub enum ThreadItem {
WebSearch { id: String, query: String },
#[serde(rename_all = "camelCase")]
#[ts(rename_all = "camelCase")]
TodoList { id: String, items: Vec<TodoItem> },
#[serde(rename_all = "camelCase")]
#[ts(rename_all = "camelCase")]
ImageView { id: String, path: String },
#[serde(rename_all = "camelCase")]
#[ts(rename_all = "camelCase")]
@@ -1183,15 +1249,6 @@ pub struct McpToolCallError {
pub message: String,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct TodoItem {
pub id: String,
pub text: String,
pub completed: bool,
}
// === Server Notifications ===
// Thread/Turn lifecycle notifications and item progress events
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
@@ -1241,6 +1298,7 @@ pub struct TurnDiffUpdatedNotification {
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct TurnPlanUpdatedNotification {
pub thread_id: String,
pub turn_id: String,
pub explanation: Option<String>,
pub plan: Vec<TurnPlanStep>,
@@ -1319,6 +1377,7 @@ pub struct ReasoningSummaryTextDeltaNotification {
pub turn_id: String,
pub item_id: String,
pub delta: String,
#[ts(type = "number")]
pub summary_index: i64,
}
@@ -1329,6 +1388,7 @@ pub struct ReasoningSummaryPartAddedNotification {
pub thread_id: String,
pub turn_id: String,
pub item_id: String,
#[ts(type = "number")]
pub summary_index: i64,
}
@@ -1340,6 +1400,7 @@ pub struct ReasoningTextDeltaNotification {
pub turn_id: String,
pub item_id: String,
pub delta: String,
#[ts(type = "number")]
pub content_index: i64,
}
@@ -1353,6 +1414,16 @@ pub struct CommandExecutionOutputDeltaNotification {
pub delta: String,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct FileChangeOutputDeltaNotification {
pub thread_id: String,
pub turn_id: String,
pub item_id: String,
pub delta: String,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
@@ -1464,7 +1535,9 @@ impl From<CoreRateLimitSnapshot> for RateLimitSnapshot {
#[ts(export_to = "v2/")]
pub struct RateLimitWindow {
pub used_percent: i32,
#[ts(type = "number | null")]
pub window_duration_mins: Option<i64>,
#[ts(type = "number | null")]
pub resets_at: Option<i64>,
}

View File

@@ -563,7 +563,9 @@ impl CodexClient {
ServerNotification::TurnCompleted(payload) => {
if payload.turn.id == turn_id {
println!("\n< turn/completed notification: {:?}", payload.turn.status);
if let TurnStatus::Failed { error } = &payload.turn.status {
if payload.turn.status == TurnStatus::Failed
&& let Some(error) = payload.turn.error
{
println!("[turn error] {}", error.message);
}
break;

View File

@@ -31,6 +31,7 @@ chrono = { workspace = true }
serde = { workspace = true, features = ["derive"] }
serde_json = { workspace = true }
sha2 = { workspace = true }
mcp-types = { workspace = true }
tempfile = { workspace = true }
toml = { workspace = true }
tokio = { workspace = true, features = [

View File

@@ -1,15 +1,15 @@
# codex-app-server
`codex app-server` is the interface Codex uses to power rich interfaces such as the [Codex VS Code extension](https://marketplace.visualstudio.com/items?itemName=openai.chatgpt). The message schema is currently unstable, but those who wish to build experimental UIs on top of Codex may find it valuable.
`codex app-server` is the interface Codex uses to power rich interfaces such as the [Codex VS Code extension](https://marketplace.visualstudio.com/items?itemName=openai.chatgpt).
## Table of Contents
- [Protocol](#protocol)
- [Message Schema](#message-schema)
- [Core Primitives](#core-primitives)
- [Lifecycle Overview](#lifecycle-overview)
- [Initialization](#initialization)
- [Core primitives](#core-primitives)
- [Thread & turn endpoints](#thread--turn-endpoints)
- [Events (work-in-progress)](#events-work-in-progress)
- [API Overview](#api-overview)
- [Events](#events)
- [Auth endpoints](#auth-endpoints)
## Protocol
@@ -25,6 +25,15 @@ codex app-server generate-ts --out DIR
codex app-server generate-json-schema --out DIR
```
## Core Primitives
The API exposes three top-level primitives representing an interaction between a user and Codex:
- **Thread**: A conversation between a user and the Codex agent. Each thread contains multiple turns.
- **Turn**: One turn of the conversation, typically starting with a user message and finishing with an agent message. Each turn contains multiple items.
- **Item**: Represents user inputs and agent outputs as part of the turn, persisted and used as the context for future conversations. Example items include user message, agent reasoning, agent message, shell command, file edit, etc.
Use the thread APIs to create, list, or archive conversations. Drive a conversation with turn APIs and stream progress via turn notifications.
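To make the nesting concrete, here is a minimal sketch of a `Turn` as it could appear in a response (the camelCase names for `id`, `items`, `status`, and `error` match the schema; the item discriminator and the `userMessage` content shape are shown for illustration only):
```json
{ "turn": {
    "id": "turn_1",
    "status": "completed",
    "error": null,
    "items": [
      { "type": "userMessage", "id": "item_0",
        "content": [ { "type": "text", "text": "List the repo files" } ] }
    ]
} }
```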
## Lifecycle Overview
- Initialize once: Immediately after launching the codex app-server process, send an `initialize` request with your client metadata, then emit an `initialized` notification. Any other request before this handshake gets rejected.
@@ -37,28 +46,16 @@ codex app-server generate-json-schema --out DIR
Clients must send a single `initialize` request before invoking any other method, then acknowledge with an `initialized` notification. The server returns the user agent string it will present to upstream services; subsequent requests issued before initialization receive a `"Not initialized"` error, and repeated `initialize` calls receive an `"Already initialized"` error.
Example:
Applications building on top of `codex app-server` should identify themselves via the `clientInfo` parameter.
Example (from OpenAI's official VSCode extension):
```json
{ "method": "initialize", "id": 0, "params": {
"clientInfo": { "name": "codex-vscode", "title": "Codex VS Code Extension", "version": "0.1.0" }
} }
{ "id": 0, "result": { "userAgent": "codex-app-server/0.1.0 codex-vscode/0.1.0" } }
{ "method": "initialized" }
```
## Core primitives
We have 3 top level primitives:
- Thread - a conversation between the Codex agent and a user. Each thread contains multiple turns.
- Turn - one turn of the conversation, typically starting with a user message and finishing with an agent message. Each turn contains multiple items.
- Item - represents user inputs and agent outputs as part of the turn, persisted and used as the context for future conversations.
## Thread & turn endpoints
The JSON-RPC API exposes dedicated methods for managing Codex conversations. Threads store long-lived conversation metadata, and turns store the per-message exchange (input → Codex output, including streamed items). Use the thread APIs to create, list, or archive sessions, then drive the conversation with turn APIs and notifications.
### Quick reference
## API Overview
- `thread/start` — create a new thread; emits `thread/started` and auto-subscribes you to turn/item events for that thread.
- `thread/resume` — reopen an existing thread by id so subsequent `turn/start` calls append to it.
- `thread/list` — page through stored rollouts; supports cursor-based pagination and optional `modelProviders` filtering.
@@ -66,8 +63,15 @@ The JSON-RPC API exposes dedicated methods for managing Codex conversations. Thr
- `turn/start` — add user input to a thread and begin Codex generation; responds with the initial `turn` object and streams `turn/started`, `item/*`, and `turn/completed` notifications.
- `turn/interrupt` — request cancellation of an in-flight turn by `(thread_id, turn_id)`; success is an empty `{}` response and the turn finishes with `status: "interrupted"`.
- `review/start` — kick off Codex's automated reviewer for a thread; responds like `turn/start` and emits `item/started`/`item/completed` notifications with `enteredReviewMode` and `exitedReviewMode` items, plus a final assistant `agentMessage` containing the review.
- `command/exec` — run a single command under the server sandbox without starting a thread/turn (handy for utilities and validation).
- `model/list` — list available models (with reasoning effort options).
- `feedback/upload` — submit a feedback report (classification + optional reason/logs and conversation_id); returns the tracking thread id.
- `command/exec` — run a single command under the server sandbox without starting a thread/turn (handy for utilities and validation).
- `config/read` — fetch the effective config on disk after resolving config layering.
- `config/value/write` — write a single config key/value to the user's config.toml on disk (see the sketch after this list).
- `config/batchWrite` — apply multiple config edits atomically to the user's config.toml on disk.
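As a quick illustration, a `config/value/write` exchange might look like the following sketch (the camelCase param casing, the `"replace"` strategy value, and the `"ok"` status string are assumptions based on the schema; `filePath` can be omitted to target the user `config.toml`):
```json
{ "method": "config/value/write", "id": 20, "params": {
    "keyPath": "model",
    "value": "gpt-new",
    "mergeStrategy": "replace"
} }
{ "id": 20, "result": { "status": "ok", "version": "...", "filePath": "/home/me/.codex/config.toml" } }
```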
### 1) Start or resume a thread
### Example: Start or resume a thread
Start a fresh thread when you need a new Codex conversation.
@@ -98,7 +102,7 @@ To continue a stored session, call `thread/resume` with the `thread.id` you prev
{ "id": 11, "result": { "thread": { "id": "thr_123", } } }
```
### 2) List threads (pagination & filters)
### Example: List threads (with pagination & filters)
`thread/list` lets you render a history UI. Pass any combination of:
- `cursor` — opaque string from a prior response; omit for the first page.
@@ -123,7 +127,7 @@ Example:
When `nextCursor` is `null`, you've reached the final page.
### 3) Archive a thread
### Example: Archive a thread
Use `thread/archive` to move the persisted rollout (stored as a JSONL file on disk) into the archived sessions directory.
@@ -134,7 +138,7 @@ Use `thread/archive` to move the persisted rollout (stored as a JSONL file on di
An archived thread will not appear in future calls to `thread/list`.
### 4) Start a turn (send user input)
### Example: Start a turn (send user input)
Turns attach user input (text or images) to a thread and trigger Codex generation. The `input` field is a list of discriminated unions:
@@ -168,7 +172,7 @@ You can optionally specify config overrides on the new turn. If specified, these
} } }
```
### 5) Interrupt an active turn
### Example: Interrupt an active turn
You can cancel a running Turn with `turn/interrupt`.
@@ -182,7 +186,7 @@ You can cancel a running Turn with `turn/interrupt`.
The server requests cancellation of running subprocesses, then emits a `turn/completed` event with `status: "interrupted"`. Rely on `turn/completed` to know when Codex-side cleanup is done.
### 6) Request a code review
### Example: Request a code review
Use `review/start` to run Codex's reviewer on the currently checked-out project. The request takes the thread id plus a `target` describing what should be reviewed:
@@ -241,7 +245,26 @@ containing an `exitedReviewMode` item with the final review text:
The `review` string is plain text that already bundles the overall explanation plus a bullet list for each structured finding (matching `ThreadItem::ExitedReviewMode` in the generated schema). Use this notification to render the reviewer output in your client.
## Events (work-in-progress)
### Example: One-off command execution
Run a standalone command (argv vector) in the server's sandbox without creating a thread or turn:
```json
{ "method": "command/exec", "id": 32, "params": {
"command": ["ls", "-la"],
"cwd": "/Users/me/project", // optional; defaults to server cwd
"sandboxPolicy": { "type": "workspaceWrite" }, // optional; defaults to user config
"timeoutMs": 10000 // optional; ms timeout; defaults to server timeout
} }
{ "id": 32, "result": { "exitCode": 0, "stdout": "...", "stderr": "" } }
```
Notes:
- Empty `command` arrays are rejected.
- `sandboxPolicy` accepts the same shape used by `turn/start` (e.g., `dangerFullAccess`, `readOnly`, `workspaceWrite` with flags).
- When omitted, `timeoutMs` falls back to the server default.
## Events
Event notifications are the server-initiated event stream for thread lifecycles, turn lifecycles, and the items within them. After you start or resume a thread, keep reading stdout for `thread/started`, `turn/*`, and `item/*` notifications.
@@ -251,11 +274,12 @@ The app-server streams JSON-RPC notifications while a turn is running. Each turn
- `turn/started``{ turn }` with the turn id, empty `items`, and `status: "inProgress"`.
- `turn/completed``{ turn }` where `turn.status` is `completed`, `interrupted`, or `failed`; failures carry `{ error: { message, codexErrorInfo? } }`.
- `turn/diff/updated``{ threadId, turnId, diff }` represents the up-to-date snapshot of the turn-level unified diff, emitted after every FileChange item. `diff` is the latest aggregated unified diff across every file change in the turn. UIs can render this to show the full "what changed" view without stitching individual `fileChange` items.
- `turn/plan/updated``{ turnId, explanation?, plan }` whenever the agent shares or changes its plan; each `plan` entry is `{ step, status }` with `status` in `pending`, `inProgress`, or `completed`.
Today both notifications carry an empty `items` array even when item events were streamed; rely on `item/*` notifications for the canonical item list until this is fixed.
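For example, a failed turn's completion might look like this sketch (the `error` object mirrors `TurnError`; the `codexErrorInfo` string value is an assumption about the enum's serialization):
```json
{ "method": "turn/completed", "params": {
    "turn": {
      "id": "turn_7",
      "items": [],
      "status": "failed",
      "error": { "message": "bad request", "codexErrorInfo": "badRequest" }
    }
} }
```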
#### Thread items
#### Items
`ThreadItem` is the tagged union carried in turn responses and `item/*` notifications. Currently we support events for the following items:
- `userMessage``{id, content}` where `content` is a list of user inputs (`text`, `image`, or `localImage`).
@@ -265,6 +289,9 @@ Today both notifications carry an empty `items` array even when item events were
- `fileChange``{id, changes, status}` describing proposed edits; `changes` list `{path, kind, diff}` and `status` is `inProgress`, `completed`, `failed`, or `declined`.
- `mcpToolCall``{id, server, tool, status, arguments, result?, error?}` describing MCP calls; `status` is `inProgress`, `completed`, or `failed`.
- `webSearch``{id, query}` for a web search request issued by the agent.
- `imageView``{id, path}` emitted when the agent invokes the image viewer tool.
- `enteredReviewMode``{id, review}` sent when the reviewer starts; `review` is a short user-facing label such as `"current changes"` or the requested target description.
- `exitedReviewMode``{id, review}` emitted when the reviewer finishes; `review` is the full plain-text review (usually, overall notes plus bullet point findings).
- `compacted` - `{threadId, turnId}` emitted when Codex compacts the conversation history; this can happen automatically.
All items emit two shared lifecycle events:
@@ -282,7 +309,7 @@ There are additional item-specific events:
- `item/commandExecution/outputDelta` — streams stdout/stderr for the command; append deltas in order to render live output alongside `aggregatedOutput` in the final item.
Final `commandExecution` items include parsed `commandActions`, `status`, `exitCode`, and `durationMs` so the UI can summarize what ran and whether it succeeded.
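As an illustration, a final `commandExecution` item might serialize like this sketch (the `type` discriminator and the elided `commandActions` shape are assumptions; `exitCode` and `durationMs` match the schema):
```json
{ "type": "commandExecution", "id": "item_3",
  "commandActions": [],
  "status": "completed",
  "exitCode": 0,
  "durationMs": 1240,
  "aggregatedOutput": "README.md\nsrc\n" }
```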
#### fileChange
`fileChange` items contain a `changes` list with `{path, kind, diff}` entries (`kind` is `add`, `delete`, or `update` with an optional `movePath`). The `status` tracks whether the apply succeeded (`completed`), `failed`, or was `declined`.
- `item/fileChange/outputDelta` - carries the tool-call response of the underlying `apply_patch` call.
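A minimal sketch of that notification, matching `FileChangeOutputDeltaNotification` in the schema:
```json
{ "method": "item/fileChange/outputDelta", "params": {
    "threadId": "thr_123",
    "turnId": "turn_7",
    "itemId": "patch-call",
    "delta": "..."
} }
```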
### Errors
The `error` event is emitted whenever the server hits an error mid-turn (for example, upstream model errors or quota limits). It carries the same `{ error: { message, codexErrorInfo? } }` payload as a failed turn and may precede that terminal `turn/completed` notification.
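A sketch of the shape (the wire method name `error` is taken from the event name above; only the documented payload field is shown):
```json
{ "method": "error", "params": {
    "error": { "message": "usage limit reached", "codexErrorInfo": "other" }
} }
```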
@@ -331,7 +358,7 @@ UI guidance for IDEs: surface an approval dialog as soon as the request arrives.
The JSON-RPC auth/account surface exposes request/response methods plus server-initiated notifications (no `id`). Use these to determine auth state, start or cancel logins, logout, and inspect ChatGPT rate limits.
### Quick reference
### API Overview
- `account/read` — fetch current account info; optionally refresh tokens.
- `account/login/start` — begin login (`apiKey` or `chatgpt`).
- `account/login/completed` (notify) — emitted when a login attempt finishes (success or error).
@@ -416,9 +443,3 @@ Field notes:
- `usedPercent` is current usage within the OpenAI quota window.
- `windowDurationMins` is the quota window length.
- `resetsAt` is a Unix timestamp (seconds) for the next reset.
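Putting the field notes together, a single window might look like this (a sketch of one window object only; the surrounding snapshot shape is omitted):
```json
{ "usedPercent": 42, "windowDurationMins": 300, "resetsAt": 1764892800 }
```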
### Dev notes
- `codex app-server generate-ts --out <dir>` emits v2 types under `v2/`.
- `codex app-server generate-json-schema --out <dir>` outputs `codex_app_server_protocol.schemas.json`.
- See [“Authentication and authorization” in the config docs](../../docs/config.md#authentication-and-authorization) for configuration knobs.

View File

@@ -18,6 +18,7 @@ use codex_app_server_protocol::ContextCompactedNotification;
use codex_app_server_protocol::ErrorNotification;
use codex_app_server_protocol::ExecCommandApprovalParams;
use codex_app_server_protocol::ExecCommandApprovalResponse;
use codex_app_server_protocol::FileChangeOutputDeltaNotification;
use codex_app_server_protocol::FileChangeRequestApprovalParams;
use codex_app_server_protocol::FileChangeRequestApprovalResponse;
use codex_app_server_protocol::FileUpdateChange;
@@ -61,6 +62,7 @@ use codex_core::protocol::ReviewDecision;
use codex_core::protocol::TokenCountEvent;
use codex_core::protocol::TurnDiffEvent;
use codex_core::review_format::format_review_findings_block;
use codex_core::review_prompts;
use codex_protocol::ConversationId;
use codex_protocol::plan_tool::UpdatePlanArgs;
use codex_protocol::protocol::ReviewOutputEvent;
@@ -177,6 +179,7 @@ pub(crate) async fn apply_bespoke_event_handling(
cwd,
reason,
risk,
proposed_execpolicy_amendment: _,
parsed_cmd,
}) => match api_version {
ApiVersion::V1 => {
@@ -350,8 +353,32 @@ pub(crate) async fn apply_bespoke_event_handling(
}))
.await;
}
EventMsg::ViewImageToolCall(view_image_event) => {
let item = ThreadItem::ImageView {
id: view_image_event.call_id.clone(),
path: view_image_event.path.to_string_lossy().into_owned(),
};
let started = ItemStartedNotification {
thread_id: conversation_id.to_string(),
turn_id: event_turn_id.clone(),
item: item.clone(),
};
outgoing
.send_server_notification(ServerNotification::ItemStarted(started))
.await;
let completed = ItemCompletedNotification {
thread_id: conversation_id.to_string(),
turn_id: event_turn_id.clone(),
item,
};
outgoing
.send_server_notification(ServerNotification::ItemCompleted(completed))
.await;
}
EventMsg::EnteredReviewMode(review_request) => {
let review = review_request.user_facing_hint;
let review = review_request
.user_facing_hint
.unwrap_or_else(|| review_prompts::user_facing_hint(&review_request.target));
let item = ThreadItem::EnteredReviewMode {
id: event_turn_id.clone(),
review,
@@ -501,17 +528,44 @@ pub(crate) async fn apply_bespoke_event_handling(
.await;
}
EventMsg::ExecCommandOutputDelta(exec_command_output_delta_event) => {
let notification = CommandExecutionOutputDeltaNotification {
thread_id: conversation_id.to_string(),
turn_id: event_turn_id.clone(),
item_id: exec_command_output_delta_event.call_id.clone(),
delta: String::from_utf8_lossy(&exec_command_output_delta_event.chunk).to_string(),
let item_id = exec_command_output_delta_event.call_id.clone();
let delta = String::from_utf8_lossy(&exec_command_output_delta_event.chunk).to_string();
// The underlying EventMsg::ExecCommandOutputDelta is used for shell, unified_exec,
// and apply_patch tool calls. We represent apply_patch with the FileChange item, and
// everything else with the CommandExecution item.
//
// We need to detect which item type it is so we can emit the right notification.
// We already have state tracking FileChange items on item/started, so let's use that.
let is_file_change = {
let map = turn_summary_store.lock().await;
map.get(&conversation_id)
.is_some_and(|summary| summary.file_change_started.contains(&item_id))
};
outgoing
.send_server_notification(ServerNotification::CommandExecutionOutputDelta(
notification,
))
.await;
if is_file_change {
let notification = FileChangeOutputDeltaNotification {
thread_id: conversation_id.to_string(),
turn_id: event_turn_id.clone(),
item_id,
delta,
};
outgoing
.send_server_notification(ServerNotification::FileChangeOutputDelta(
notification,
))
.await;
} else {
let notification = CommandExecutionOutputDeltaNotification {
thread_id: conversation_id.to_string(),
turn_id: event_turn_id.clone(),
item_id,
delta,
};
outgoing
.send_server_notification(ServerNotification::CommandExecutionOutputDelta(
notification,
))
.await;
}
}
EventMsg::ExecCommandEnd(exec_command_end_event) => {
let ExecCommandEndEvent {
@@ -608,6 +662,7 @@ pub(crate) async fn apply_bespoke_event_handling(
}
EventMsg::PlanUpdate(plan_update_event) => {
handle_turn_plan_update(
conversation_id,
&event_turn_id,
plan_update_event,
api_version,
@@ -640,6 +695,7 @@ async fn handle_turn_diff(
}
async fn handle_turn_plan_update(
conversation_id: ConversationId,
event_turn_id: &str,
plan_update_event: UpdatePlanArgs,
api_version: ApiVersion,
@@ -647,6 +703,7 @@ async fn handle_turn_plan_update(
) {
if let ApiVersion::V2 = api_version {
let notification = TurnPlanUpdatedNotification {
thread_id: conversation_id.to_string(),
turn_id: event_turn_id.to_string(),
explanation: plan_update_event.explanation,
plan: plan_update_event
@@ -665,6 +722,7 @@ async fn emit_turn_completed_with_status(
conversation_id: ConversationId,
event_turn_id: String,
status: TurnStatus,
error: Option<TurnError>,
outgoing: &OutgoingMessageSender,
) {
let notification = TurnCompletedNotification {
@@ -672,6 +730,7 @@ async fn emit_turn_completed_with_status(
turn: Turn {
id: event_turn_id,
items: vec![],
error,
status,
},
};
@@ -760,13 +819,12 @@ async fn handle_turn_complete(
) {
let turn_summary = find_and_remove_turn_summary(conversation_id, turn_summary_store).await;
let status = if let Some(error) = turn_summary.last_error {
TurnStatus::Failed { error }
} else {
TurnStatus::Completed
let (status, error) = match turn_summary.last_error {
Some(error) => (TurnStatus::Failed, Some(error)),
None => (TurnStatus::Completed, None),
};
emit_turn_completed_with_status(conversation_id, event_turn_id, status, outgoing).await;
emit_turn_completed_with_status(conversation_id, event_turn_id, status, error, outgoing).await;
}
async fn handle_turn_interrupted(
@@ -781,6 +839,7 @@ async fn handle_turn_interrupted(
conversation_id,
event_turn_id,
TurnStatus::Interrupted,
None,
outgoing,
)
.await;
@@ -1253,6 +1312,7 @@ mod tests {
OutgoingMessage::AppServerNotification(ServerNotification::TurnCompleted(n)) => {
assert_eq!(n.turn.id, event_turn_id);
assert_eq!(n.turn.status, TurnStatus::Completed);
assert_eq!(n.turn.error, None);
}
other => bail!("unexpected message: {other:?}"),
}
@@ -1293,6 +1353,7 @@ mod tests {
OutgoingMessage::AppServerNotification(ServerNotification::TurnCompleted(n)) => {
assert_eq!(n.turn.id, event_turn_id);
assert_eq!(n.turn.status, TurnStatus::Interrupted);
assert_eq!(n.turn.error, None);
}
other => bail!("unexpected message: {other:?}"),
}
@@ -1332,14 +1393,13 @@ mod tests {
match msg {
OutgoingMessage::AppServerNotification(ServerNotification::TurnCompleted(n)) => {
assert_eq!(n.turn.id, event_turn_id);
assert_eq!(n.turn.status, TurnStatus::Failed);
assert_eq!(
n.turn.status,
TurnStatus::Failed {
error: TurnError {
message: "bad".to_string(),
codex_error_info: Some(V2CodexErrorInfo::Other),
}
}
n.turn.error,
Some(TurnError {
message: "bad".to_string(),
codex_error_info: Some(V2CodexErrorInfo::Other),
})
);
}
other => bail!("unexpected message: {other:?}"),
@@ -1366,7 +1426,16 @@ mod tests {
],
};
handle_turn_plan_update("turn-123", update, ApiVersion::V2, &outgoing).await;
let conversation_id = ConversationId::new();
handle_turn_plan_update(
conversation_id,
"turn-123",
update,
ApiVersion::V2,
&outgoing,
)
.await;
let msg = rx
.recv()
@@ -1374,6 +1443,7 @@ mod tests {
.ok_or_else(|| anyhow!("should send one notification"))?;
match msg {
OutgoingMessage::AppServerNotification(ServerNotification::TurnPlanUpdated(n)) => {
assert_eq!(n.thread_id, conversation_id.to_string());
assert_eq!(n.turn_id, "turn-123");
assert_eq!(n.explanation.as_deref(), Some("need plan"));
assert_eq!(n.plan.len(), 2);
@@ -1600,14 +1670,13 @@ mod tests {
match msg {
OutgoingMessage::AppServerNotification(ServerNotification::TurnCompleted(n)) => {
assert_eq!(n.turn.id, a_turn1);
assert_eq!(n.turn.status, TurnStatus::Failed);
assert_eq!(
n.turn.status,
TurnStatus::Failed {
error: TurnError {
message: "a1".to_string(),
codex_error_info: Some(V2CodexErrorInfo::BadRequest),
}
}
n.turn.error,
Some(TurnError {
message: "a1".to_string(),
codex_error_info: Some(V2CodexErrorInfo::BadRequest),
})
);
}
other => bail!("unexpected message: {other:?}"),
@@ -1621,14 +1690,13 @@ mod tests {
match msg {
OutgoingMessage::AppServerNotification(ServerNotification::TurnCompleted(n)) => {
assert_eq!(n.turn.id, b_turn1);
assert_eq!(n.turn.status, TurnStatus::Failed);
assert_eq!(
n.turn.status,
TurnStatus::Failed {
error: TurnError {
message: "b1".to_string(),
codex_error_info: None,
}
}
n.turn.error,
Some(TurnError {
message: "b1".to_string(),
codex_error_info: None,
})
);
}
other => bail!("unexpected message: {other:?}"),
@@ -1643,6 +1711,7 @@ mod tests {
OutgoingMessage::AppServerNotification(ServerNotification::TurnCompleted(n)) => {
assert_eq!(n.turn.id, a_turn2);
assert_eq!(n.turn.status, TurnStatus::Completed);
assert_eq!(n.turn.error, None);
}
other => bail!("unexpected message: {other:?}"),
}

View File

@@ -21,9 +21,9 @@ use codex_app_server_protocol::CancelLoginAccountParams;
use codex_app_server_protocol::CancelLoginAccountResponse;
use codex_app_server_protocol::CancelLoginChatGptResponse;
use codex_app_server_protocol::ClientRequest;
use codex_app_server_protocol::CommandExecParams;
use codex_app_server_protocol::ConversationGitInfo;
use codex_app_server_protocol::ConversationSummary;
use codex_app_server_protocol::ExecOneOffCommandParams;
use codex_app_server_protocol::ExecOneOffCommandResponse;
use codex_app_server_protocol::FeedbackUploadParams;
use codex_app_server_protocol::FeedbackUploadResponse;
@@ -45,6 +45,8 @@ use codex_app_server_protocol::InterruptConversationParams;
use codex_app_server_protocol::JSONRPCErrorError;
use codex_app_server_protocol::ListConversationsParams;
use codex_app_server_protocol::ListConversationsResponse;
use codex_app_server_protocol::ListMcpServersParams;
use codex_app_server_protocol::ListMcpServersResponse;
use codex_app_server_protocol::LoginAccountParams;
use codex_app_server_protocol::LoginApiKeyParams;
use codex_app_server_protocol::LoginApiKeyResponse;
@@ -52,6 +54,7 @@ use codex_app_server_protocol::LoginChatGptCompleteNotification;
use codex_app_server_protocol::LoginChatGptResponse;
use codex_app_server_protocol::LogoutAccountResponse;
use codex_app_server_protocol::LogoutChatGptResponse;
use codex_app_server_protocol::McpServer;
use codex_app_server_protocol::ModelListParams;
use codex_app_server_protocol::ModelListResponse;
use codex_app_server_protocol::NewConversationParams;
@@ -64,7 +67,7 @@ use codex_app_server_protocol::ResumeConversationResponse;
use codex_app_server_protocol::ReviewDelivery as ApiReviewDelivery;
use codex_app_server_protocol::ReviewStartParams;
use codex_app_server_protocol::ReviewStartResponse;
use codex_app_server_protocol::ReviewTarget;
use codex_app_server_protocol::ReviewTarget as ApiReviewTarget;
use codex_app_server_protocol::SandboxMode;
use codex_app_server_protocol::SendUserMessageParams;
use codex_app_server_protocol::SendUserMessageResponse;
@@ -119,11 +122,14 @@ use codex_core::exec_env::create_env;
use codex_core::features::Feature;
use codex_core::find_conversation_path_by_id_str;
use codex_core::git_info::git_diff_to_remote;
use codex_core::mcp::collect_mcp_snapshot;
use codex_core::mcp::group_tools_by_server;
use codex_core::parse_cursor;
use codex_core::protocol::EventMsg;
use codex_core::protocol::Op;
use codex_core::protocol::ReviewDelivery as CoreReviewDelivery;
use codex_core::protocol::ReviewRequest;
use codex_core::protocol::ReviewTarget as CoreReviewTarget;
use codex_core::protocol::SessionConfiguredEvent;
use codex_core::read_head_for_summary;
use codex_feedback::CodexFeedback;
@@ -135,6 +141,7 @@ use codex_protocol::config_types::ForcedLoginMethod;
use codex_protocol::items::TurnItem;
use codex_protocol::models::ResponseItem;
use codex_protocol::protocol::GitInfo as CoreGitInfo;
use codex_protocol::protocol::McpAuthStatus as CoreMcpAuthStatus;
use codex_protocol::protocol::RateLimitSnapshot as CoreRateLimitSnapshot;
use codex_protocol::protocol::RolloutItem;
use codex_protocol::protocol::SessionMetaLine;
@@ -255,7 +262,7 @@ impl CodexMessageProcessor {
}
fn review_request_from_target(
target: ReviewTarget,
target: ApiReviewTarget,
) -> Result<(ReviewRequest, String), JSONRPCErrorError> {
fn invalid_request(message: String) -> JSONRPCErrorError {
JSONRPCErrorError {
@@ -265,73 +272,52 @@ impl CodexMessageProcessor {
}
}
match target {
// TODO(jif) those messages will be extracted in a follow-up PR.
ReviewTarget::UncommittedChanges => Ok((
ReviewRequest {
prompt: "Review the current code changes (staged, unstaged, and untracked files) and provide prioritized findings.".to_string(),
user_facing_hint: "current changes".to_string(),
},
"Review uncommitted changes".to_string(),
)),
ReviewTarget::BaseBranch { branch } => {
let cleaned_target = match target {
ApiReviewTarget::UncommittedChanges => ApiReviewTarget::UncommittedChanges,
ApiReviewTarget::BaseBranch { branch } => {
let branch = branch.trim().to_string();
if branch.is_empty() {
return Err(invalid_request("branch must not be empty".to_string()));
}
let prompt = format!("Review the code changes against the base branch '{branch}'. Start by finding the merge diff between the current branch and {branch}'s upstream e.g. (`git merge-base HEAD \"$(git rev-parse --abbrev-ref \"{branch}@{{upstream}}\")\"`), then run `git diff` against that SHA to see what changes we would merge into the {branch} branch. Provide prioritized, actionable findings.");
let hint = format!("changes against '{branch}'");
let display = format!("Review changes against base branch '{branch}'");
Ok((
ReviewRequest {
prompt,
user_facing_hint: hint,
},
display,
))
ApiReviewTarget::BaseBranch { branch }
}
ReviewTarget::Commit { sha, title } => {
ApiReviewTarget::Commit { sha, title } => {
let sha = sha.trim().to_string();
if sha.is_empty() {
return Err(invalid_request("sha must not be empty".to_string()));
}
let brief_title = title
let title = title
.map(|t| t.trim().to_string())
.filter(|t| !t.is_empty());
let prompt = if let Some(title) = brief_title.clone() {
format!("Review the code changes introduced by commit {sha} (\"{title}\"). Provide prioritized, actionable findings.")
} else {
format!("Review the code changes introduced by commit {sha}. Provide prioritized, actionable findings.")
};
let short_sha = sha.chars().take(7).collect::<String>();
let hint = format!("commit {short_sha}");
let display = if let Some(title) = brief_title {
format!("Review commit {short_sha}: {title}")
} else {
format!("Review commit {short_sha}")
};
Ok((
ReviewRequest {
prompt,
user_facing_hint: hint,
},
display,
))
ApiReviewTarget::Commit { sha, title }
}
ReviewTarget::Custom { instructions } => {
ApiReviewTarget::Custom { instructions } => {
let trimmed = instructions.trim().to_string();
if trimmed.is_empty() {
return Err(invalid_request("instructions must not be empty".to_string()));
return Err(invalid_request(
"instructions must not be empty".to_string(),
));
}
ApiReviewTarget::Custom {
instructions: trimmed,
}
Ok((
ReviewRequest {
prompt: trimmed.clone(),
user_facing_hint: trimmed.clone(),
},
trimmed,
))
}
}
};
let core_target = match cleaned_target {
ApiReviewTarget::UncommittedChanges => CoreReviewTarget::UncommittedChanges,
ApiReviewTarget::BaseBranch { branch } => CoreReviewTarget::BaseBranch { branch },
ApiReviewTarget::Commit { sha, title } => CoreReviewTarget::Commit { sha, title },
ApiReviewTarget::Custom { instructions } => CoreReviewTarget::Custom { instructions },
};
let hint = codex_core::review_prompts::user_facing_hint(&core_target);
let review_request = ReviewRequest {
target: core_target,
user_facing_hint: Some(hint.clone()),
};
Ok((review_request, hint))
}
pub async fn process_request(&mut self, request: ClientRequest) {
@@ -383,6 +369,9 @@ impl CodexMessageProcessor {
ClientRequest::ModelList { request_id, params } => {
self.list_models(request_id, params).await;
}
ClientRequest::McpServersList { request_id, params } => {
self.list_mcp_servers(request_id, params).await;
}
ClientRequest::LoginAccount { request_id, params } => {
self.login_v2(request_id, params).await;
}
@@ -467,9 +456,12 @@ impl CodexMessageProcessor {
ClientRequest::FuzzyFileSearch { request_id, params } => {
self.fuzzy_file_search(request_id, params).await;
}
ClientRequest::ExecOneOffCommand { request_id, params } => {
ClientRequest::OneOffCommandExec { request_id, params } => {
self.exec_one_off_command(request_id, params).await;
}
ClientRequest::ExecOneOffCommand { request_id, params } => {
self.exec_one_off_command(request_id, params.into()).await;
}
ClientRequest::ConfigRead { .. }
| ClientRequest::ConfigValueWrite { .. }
| ClientRequest::ConfigBatchWrite { .. } => {
@@ -1154,7 +1146,7 @@ impl CodexMessageProcessor {
}
}
async fn exec_one_off_command(&self, request_id: RequestId, params: ExecOneOffCommandParams) {
async fn exec_one_off_command(&self, request_id: RequestId, params: CommandExecParams) {
tracing::debug!("ExecOneOffCommand params: {params:?}");
if params.command.is_empty() {
@@ -1169,7 +1161,9 @@ impl CodexMessageProcessor {
let cwd = params.cwd.unwrap_or_else(|| self.config.cwd.clone());
let env = create_env(&self.config.shell_environment_policy);
let timeout_ms = params.timeout_ms;
let timeout_ms = params
.timeout_ms
.and_then(|timeout_ms| u64::try_from(timeout_ms).ok());
let exec_params = ExecParams {
command: params.command,
cwd,
@@ -1182,6 +1176,7 @@ impl CodexMessageProcessor {
let effective_policy = params
.sandbox_policy
.map(|policy| policy.to_core())
.unwrap_or_else(|| self.config.sandbox_policy.clone());
let codex_linux_sandbox_exe = self.config.codex_linux_sandbox_exe.clone();
@@ -1867,8 +1862,7 @@ impl CodexMessageProcessor {
async fn list_models(&self, request_id: RequestId, params: ModelListParams) {
let ModelListParams { limit, cursor } = params;
let auth_mode = self.auth_manager.auth().map(|auth| auth.mode);
let models = supported_models(auth_mode);
let models = supported_models(self.conversation_manager.clone()).await;
let total = models.len();
if total == 0 {
@@ -1922,6 +1916,85 @@ impl CodexMessageProcessor {
self.outgoing.send_response(request_id, response).await;
}
async fn list_mcp_servers(&self, request_id: RequestId, params: ListMcpServersParams) {
let snapshot = collect_mcp_snapshot(self.config.as_ref()).await;
let tools_by_server = group_tools_by_server(&snapshot.tools);
let mut server_names: Vec<String> = self
.config
.mcp_servers
.keys()
.cloned()
.chain(snapshot.auth_statuses.keys().cloned())
.chain(snapshot.resources.keys().cloned())
.chain(snapshot.resource_templates.keys().cloned())
.collect();
server_names.sort();
server_names.dedup();
let total = server_names.len();
let limit = params.limit.unwrap_or(total as u32).max(1) as usize;
let effective_limit = limit.min(total);
let start = match params.cursor {
Some(cursor) => match cursor.parse::<usize>() {
Ok(idx) => idx,
Err(_) => {
let error = JSONRPCErrorError {
code: INVALID_REQUEST_ERROR_CODE,
message: format!("invalid cursor: {cursor}"),
data: None,
};
self.outgoing.send_error(request_id, error).await;
return;
}
},
None => 0,
};
if start > total {
let error = JSONRPCErrorError {
code: INVALID_REQUEST_ERROR_CODE,
message: format!("cursor {start} exceeds total MCP servers {total}"),
data: None,
};
self.outgoing.send_error(request_id, error).await;
return;
}
let end = start.saturating_add(effective_limit).min(total);
let data: Vec<McpServer> = server_names[start..end]
.iter()
.map(|name| McpServer {
name: name.clone(),
tools: tools_by_server.get(name).cloned().unwrap_or_default(),
resources: snapshot.resources.get(name).cloned().unwrap_or_default(),
resource_templates: snapshot
.resource_templates
.get(name)
.cloned()
.unwrap_or_default(),
auth_status: snapshot
.auth_statuses
.get(name)
.cloned()
.unwrap_or(CoreMcpAuthStatus::Unsupported)
.into(),
})
.collect();
let next_cursor = if end < total {
Some(end.to_string())
} else {
None
};
let response = ListMcpServersResponse { data, next_cursor };
self.outgoing.send_response(request_id, response).await;
}
async fn handle_resume_conversation(
&self,
request_id: RequestId,
@@ -2469,6 +2542,7 @@ impl CodexMessageProcessor {
let turn = Turn {
id: turn_id.clone(),
items: vec![],
error: None,
status: TurnStatus::InProgress,
};
@@ -2510,6 +2584,7 @@ impl CodexMessageProcessor {
Turn {
id: turn_id,
items,
error: None,
status: TurnStatus::InProgress,
}
}
@@ -2945,10 +3020,26 @@ impl CodexMessageProcessor {
let FeedbackUploadParams {
classification,
reason,
conversation_id,
thread_id,
include_logs,
} = params;
let conversation_id = match thread_id.as_deref() {
Some(thread_id) => match ConversationId::from_string(thread_id) {
Ok(conversation_id) => Some(conversation_id),
Err(err) => {
let error = JSONRPCErrorError {
code: INVALID_REQUEST_ERROR_CODE,
message: format!("invalid thread id: {err}"),
data: None,
};
self.outgoing.send_error(request_id, error).await;
return;
}
},
None => None,
};
let snapshot = self.feedback.snapshot(conversation_id);
let thread_id = snapshot.thread_id.clone();

View File

@@ -109,12 +109,17 @@ impl ConfigApi {
async fn apply_edits(
&self,
file_path: String,
file_path: Option<String>,
expected_version: Option<String>,
edits: Vec<(String, JsonValue, MergeStrategy)>,
) -> Result<ConfigWriteResponse, JSONRPCErrorError> {
let allowed_path = self.codex_home.join(CONFIG_FILE_NAME);
if !paths_match(&allowed_path, &file_path) {
let provided_path = file_path
.as_ref()
.map(PathBuf::from)
.unwrap_or_else(|| allowed_path.clone());
if !paths_match(&allowed_path, &provided_path) {
return Err(config_write_error(
ConfigWriteErrorCode::ConfigLayerReadonly,
"Only writes to the user config are allowed",
@@ -190,9 +195,16 @@ impl ConfigApi {
.map(|_| WriteStatus::OkOverridden)
.unwrap_or(WriteStatus::Ok);
let file_path = provided_path
.canonicalize()
.unwrap_or(provided_path.clone())
.display()
.to_string();
Ok(ConfigWriteResponse {
status,
version: updated_layers.user.version.clone(),
file_path,
overridden_metadata: overridden,
})
}
@@ -587,15 +599,14 @@ fn canonical_json(value: &JsonValue) -> JsonValue {
}
}
fn paths_match(expected: &Path, provided: &str) -> bool {
let provided_path = PathBuf::from(provided);
fn paths_match(expected: &Path, provided: &Path) -> bool {
if let (Ok(expanded_expected), Ok(expanded_provided)) =
(expected.canonicalize(), provided_path.canonicalize())
(expected.canonicalize(), provided.canonicalize())
{
return expanded_expected == expanded_provided;
}
expected == provided_path
expected == provided
}
fn value_at_path<'a>(root: &'a TomlValue, segments: &[String]) -> Option<&'a TomlValue> {
@@ -795,7 +806,7 @@ mod tests {
let result = api
.write_value(ConfigValueWriteParams {
file_path: tmp.path().join(CONFIG_FILE_NAME).display().to_string(),
file_path: Some(tmp.path().join(CONFIG_FILE_NAME).display().to_string()),
key_path: "approval_policy".to_string(),
value: json!("never"),
merge_strategy: MergeStrategy::Replace,
@@ -832,7 +843,7 @@ mod tests {
let api = ConfigApi::new(tmp.path().to_path_buf(), vec![]);
let error = api
.write_value(ConfigValueWriteParams {
file_path: tmp.path().join(CONFIG_FILE_NAME).display().to_string(),
file_path: Some(tmp.path().join(CONFIG_FILE_NAME).display().to_string()),
key_path: "model".to_string(),
value: json!("gpt-5"),
merge_strategy: MergeStrategy::Replace,
@@ -852,6 +863,30 @@ mod tests {
);
}
#[tokio::test]
async fn write_value_defaults_to_user_config_path() {
let tmp = tempdir().expect("tempdir");
std::fs::write(tmp.path().join(CONFIG_FILE_NAME), "").unwrap();
let api = ConfigApi::new(tmp.path().to_path_buf(), vec![]);
api.write_value(ConfigValueWriteParams {
file_path: None,
key_path: "model".to_string(),
value: json!("gpt-new"),
merge_strategy: MergeStrategy::Replace,
expected_version: None,
})
.await
.expect("write succeeds");
let contents =
std::fs::read_to_string(tmp.path().join(CONFIG_FILE_NAME)).expect("read config");
assert!(
contents.contains("model = \"gpt-new\""),
"config.toml should be updated even when file_path is omitted"
);
}
#[tokio::test]
async fn invalid_user_value_rejected_even_if_overridden_by_managed() {
let tmp = tempdir().expect("tempdir");
@@ -872,7 +907,7 @@ mod tests {
let error = api
.write_value(ConfigValueWriteParams {
file_path: tmp.path().join(CONFIG_FILE_NAME).display().to_string(),
file_path: Some(tmp.path().join(CONFIG_FILE_NAME).display().to_string()),
key_path: "approval_policy".to_string(),
value: json!("bogus"),
merge_strategy: MergeStrategy::Replace,
@@ -957,7 +992,7 @@ mod tests {
let result = api
.write_value(ConfigValueWriteParams {
file_path: tmp.path().join(CONFIG_FILE_NAME).display().to_string(),
file_path: Some(tmp.path().join(CONFIG_FILE_NAME).display().to_string()),
key_path: "approval_policy".to_string(),
value: json!("on-request"),
merge_strategy: MergeStrategy::Replace,

View File

@@ -1,12 +1,15 @@
use codex_app_server_protocol::AuthMode;
use std::sync::Arc;
use codex_app_server_protocol::Model;
use codex_app_server_protocol::ReasoningEffortOption;
use codex_common::model_presets::ModelPreset;
use codex_common::model_presets::ReasoningEffortPreset;
use codex_common::model_presets::builtin_model_presets;
use codex_core::ConversationManager;
use codex_protocol::openai_models::ModelPreset;
use codex_protocol::openai_models::ReasoningEffortPreset;
pub fn supported_models(auth_mode: Option<AuthMode>) -> Vec<Model> {
builtin_model_presets(auth_mode)
pub async fn supported_models(conversation_manager: Arc<ConversationManager>) -> Vec<Model> {
conversation_manager
.list_models()
.await
.into_iter()
.map(model_from_preset)
.collect()
@@ -27,7 +30,7 @@ fn model_from_preset(preset: ModelPreset) -> Model {
}
fn reasoning_efforts_from_preset(
efforts: &'static [ReasoningEffortPreset],
efforts: Vec<ReasoningEffortPreset>,
) -> Vec<ReasoningEffortOption> {
efforts
.iter()

View File

@@ -23,10 +23,10 @@ use codex_app_server_protocol::SendUserTurnResponse;
use codex_app_server_protocol::ServerRequest;
use codex_core::protocol::AskForApproval;
use codex_core::protocol::SandboxPolicy;
use codex_core::protocol_config_types::ReasoningEffort;
use codex_core::protocol_config_types::ReasoningSummary;
use codex_core::spawn::CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR;
use codex_protocol::config_types::SandboxMode;
use codex_protocol::openai_models::ReasoningEffort;
use codex_protocol::parse_command::ParsedCommand;
use codex_protocol::protocol::Event;
use codex_protocol::protocol::EventMsg;

View File

@@ -10,10 +10,10 @@ use codex_app_server_protocol::Tools;
use codex_app_server_protocol::UserSavedConfig;
use codex_core::protocol::AskForApproval;
use codex_protocol::config_types::ForcedLoginMethod;
use codex_protocol::config_types::ReasoningEffort;
use codex_protocol::config_types::ReasoningSummary;
use codex_protocol::config_types::SandboxMode;
use codex_protocol::config_types::Verbosity;
use codex_protocol::openai_models::ReasoningEffort;
use pretty_assertions::assert_eq;
use std::collections::HashMap;
use std::path::Path;

View File

@@ -206,7 +206,7 @@ model = "gpt-old"
let write_id = mcp
.send_config_value_write_request(ConfigValueWriteParams {
file_path: codex_home.path().join("config.toml").display().to_string(),
file_path: None,
key_path: "model".to_string(),
value: json!("gpt-new"),
merge_strategy: MergeStrategy::Replace,
@@ -219,8 +219,16 @@ model = "gpt-old"
)
.await??;
let write: ConfigWriteResponse = to_response(write_resp)?;
let expected_file_path = codex_home
.path()
.join("config.toml")
.canonicalize()
.unwrap()
.display()
.to_string();
assert_eq!(write.status, WriteStatus::Ok);
assert_eq!(write.file_path, expected_file_path);
assert!(write.overridden_metadata.is_none());
let verify_id = mcp
@@ -254,7 +262,7 @@ model = "gpt-old"
let write_id = mcp
.send_config_value_write_request(ConfigValueWriteParams {
file_path: codex_home.path().join("config.toml").display().to_string(),
file_path: Some(codex_home.path().join("config.toml").display().to_string()),
key_path: "model".to_string(),
value: json!("gpt-new"),
merge_strategy: MergeStrategy::Replace,
@@ -288,7 +296,7 @@ async fn config_batch_write_applies_multiple_edits() -> Result<()> {
let batch_id = mcp
.send_config_batch_write_request(ConfigBatchWriteParams {
file_path: codex_home.path().join("config.toml").display().to_string(),
file_path: Some(codex_home.path().join("config.toml").display().to_string()),
edits: vec![
ConfigEdit {
key_path: "sandbox_mode".to_string(),
@@ -314,6 +322,14 @@ async fn config_batch_write_applies_multiple_edits() -> Result<()> {
.await??;
let batch_write: ConfigWriteResponse = to_response(batch_resp)?;
assert_eq!(batch_write.status, WriteStatus::Ok);
let expected_file_path = codex_home
.path()
.join("config.toml")
.canonicalize()
.unwrap()
.display()
.to_string();
assert_eq!(batch_write.file_path, expected_file_path);
let read_id = mcp
.send_config_read_request(ConfigReadParams {

View File

@@ -11,7 +11,7 @@ use codex_app_server_protocol::ModelListParams;
use codex_app_server_protocol::ModelListResponse;
use codex_app_server_protocol::ReasoningEffortOption;
use codex_app_server_protocol::RequestId;
use codex_protocol::config_types::ReasoningEffort;
use codex_protocol::openai_models::ReasoningEffort;
use pretty_assertions::assert_eq;
use tempfile::TempDir;
use tokio::time::timeout;

View File

@@ -93,7 +93,7 @@ async fn review_start_runs_review_turn_and_emits_code_review_item() -> Result<()
match started.item {
ThreadItem::EnteredReviewMode { id, review } => {
assert_eq!(id, turn_id);
assert_eq!(review, "commit 1234567");
assert_eq!(review, "commit 1234567: Tidy UI colors");
saw_entered_review_mode = true;
break;
}

View File

@@ -11,6 +11,7 @@ use app_test_support::to_response;
use codex_app_server_protocol::ApprovalDecision;
use codex_app_server_protocol::CommandExecutionRequestApprovalResponse;
use codex_app_server_protocol::CommandExecutionStatus;
use codex_app_server_protocol::FileChangeOutputDeltaNotification;
use codex_app_server_protocol::FileChangeRequestApprovalResponse;
use codex_app_server_protocol::ItemCompletedNotification;
use codex_app_server_protocol::ItemStartedNotification;
@@ -29,8 +30,8 @@ use codex_app_server_protocol::TurnStartResponse;
use codex_app_server_protocol::TurnStartedNotification;
use codex_app_server_protocol::TurnStatus;
use codex_app_server_protocol::UserInput as V2UserInput;
use codex_core::protocol_config_types::ReasoningEffort;
use codex_core::protocol_config_types::ReasoningSummary;
use codex_protocol::openai_models::ReasoningEffort;
use core_test_support::skip_if_no_network;
use pretty_assertions::assert_eq;
use std::path::Path;
@@ -725,6 +726,26 @@ async fn turn_start_file_change_approval_v2() -> Result<()> {
)
.await?;
let output_delta_notif = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("item/fileChange/outputDelta"),
)
.await??;
let output_delta: FileChangeOutputDeltaNotification = serde_json::from_value(
output_delta_notif
.params
.clone()
.expect("item/fileChange/outputDelta params"),
)?;
assert_eq!(output_delta.thread_id, thread.id);
assert_eq!(output_delta.turn_id, turn.id);
assert_eq!(output_delta.item_id, "patch-call");
assert!(
!output_delta.delta.is_empty(),
"expected delta to be non-empty, got: {}",
output_delta.delta
);
let completed_file_change = timeout(DEFAULT_READ_TIMEOUT, async {
loop {
let completed_notif = mcp

View File

@@ -18,6 +18,8 @@ use codex_cli::login::run_logout;
use codex_cloud_tasks::Cli as CloudTasksCli;
use codex_common::CliConfigOverrides;
use codex_exec::Cli as ExecCli;
use codex_exec::Command as ExecCommand;
use codex_exec::ReviewArgs;
use codex_execpolicy::ExecPolicyCheckCommand;
use codex_responses_api_proxy::Args as ResponsesApiProxyArgs;
use codex_tui::AppExitInfo;
@@ -72,6 +74,9 @@ enum Subcommand {
#[clap(visible_alias = "e")]
Exec(ExecCli),
/// Run a code review non-interactively.
Review(ReviewArgs),
/// Manage login.
Login(LoginCommand),
@@ -449,6 +454,15 @@ async fn cli_main(codex_linux_sandbox_exe: Option<PathBuf>) -> anyhow::Result<()
);
codex_exec::run_main(exec_cli, codex_linux_sandbox_exe).await?;
}
Some(Subcommand::Review(review_args)) => {
let mut exec_cli = ExecCli::try_parse_from(["codex", "exec"])?;
exec_cli.command = Some(ExecCommand::Review(review_args));
prepend_config_flags(
&mut exec_cli.config_overrides,
root_config_overrides.clone(),
);
codex_exec::run_main(exec_cli, codex_linux_sandbox_exe).await?;
}
Some(Subcommand::McpServer) => {
codex_mcp_server::run_main(codex_linux_sandbox_exe, root_config_overrides).await?;
}

View File

@@ -40,17 +40,15 @@ prefix_rule(
assert_eq!(
result,
json!({
"match": {
"decision": "forbidden",
"matchedRules": [
{
"prefixRuleMatch": {
"matchedPrefix": ["git", "push"],
"decision": "forbidden"
}
"decision": "forbidden",
"matchedRules": [
{
"prefixRuleMatch": {
"matchedPrefix": ["git", "push"],
"decision": "forbidden"
}
]
}
}
]
})
);

View File

@@ -28,6 +28,10 @@ pub struct ExecCommand {
#[arg(long = "env", value_name = "ENV_ID")]
pub environment: String,
/// Git branch to run in Codex Cloud.
#[arg(long = "branch", value_name = "BRANCH", default_value = "main")]
pub branch: String,
/// Number of assistant attempts (best-of-N).
#[arg(
long = "attempts",

View File

@@ -101,6 +101,7 @@ async fn run_exec_command(args: crate::cli::ExecCommand) -> anyhow::Result<()> {
let crate::cli::ExecCommand {
query,
environment,
branch,
attempts,
} = args;
let ctx = init_backend("codex_cloud_tasks_exec").await?;
@@ -110,7 +111,7 @@ async fn run_exec_command(args: crate::cli::ExecCommand) -> anyhow::Result<()> {
&*ctx.backend,
&env_id,
&prompt,
"main",
&branch,
false,
attempts,
)

View File

@@ -25,6 +25,8 @@ anyhow = { workspace = true }
assert_matches = { workspace = true }
pretty_assertions = { workspace = true }
tokio-test = { workspace = true }
wiremock = { workspace = true }
reqwest = { workspace = true }
[lints]
workspace = true

View File

@@ -1,8 +1,8 @@
use crate::error::ApiError;
use codex_protocol::config_types::ReasoningEffort as ReasoningEffortConfig;
use codex_protocol::config_types::ReasoningSummary as ReasoningSummaryConfig;
use codex_protocol::config_types::Verbosity as VerbosityConfig;
use codex_protocol::models::ResponseItem;
use codex_protocol::openai_models::ReasoningEffort as ReasoningEffortConfig;
use codex_protocol::protocol::RateLimitSnapshot;
use codex_protocol::protocol::TokenUsage;
use futures::Stream;

View File

@@ -1,4 +1,5 @@
pub mod chat;
pub mod compact;
pub mod models;
pub mod responses;
mod streaming;

View File

@@ -0,0 +1,216 @@
use crate::auth::AuthProvider;
use crate::auth::add_auth_headers;
use crate::error::ApiError;
use crate::provider::Provider;
use crate::telemetry::run_with_request_telemetry;
use codex_client::HttpTransport;
use codex_client::RequestTelemetry;
use codex_protocol::openai_models::ModelsResponse;
use http::HeaderMap;
use http::Method;
use std::sync::Arc;
pub struct ModelsClient<T: HttpTransport, A: AuthProvider> {
transport: T,
provider: Provider,
auth: A,
request_telemetry: Option<Arc<dyn RequestTelemetry>>,
}
impl<T: HttpTransport, A: AuthProvider> ModelsClient<T, A> {
pub fn new(transport: T, provider: Provider, auth: A) -> Self {
Self {
transport,
provider,
auth,
request_telemetry: None,
}
}
pub fn with_telemetry(mut self, request: Option<Arc<dyn RequestTelemetry>>) -> Self {
self.request_telemetry = request;
self
}
fn path(&self) -> &'static str {
"models"
}
pub async fn list_models(
&self,
client_version: &str,
extra_headers: HeaderMap,
) -> Result<ModelsResponse, ApiError> {
let builder = || {
let mut req = self.provider.build_request(Method::GET, self.path());
req.headers.extend(extra_headers.clone());
let separator = if req.url.contains('?') { '&' } else { '?' };
req.url = format!("{}{}client_version={client_version}", req.url, separator);
add_auth_headers(&self.auth, req)
};
let resp = run_with_request_telemetry(
self.provider.retry.to_policy(),
self.request_telemetry.clone(),
builder,
|req| self.transport.execute(req),
)
.await?;
serde_json::from_slice::<ModelsResponse>(&resp.body).map_err(|e| {
ApiError::Stream(format!(
"failed to decode models response: {e}; body: {}",
String::from_utf8_lossy(&resp.body)
))
})
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::provider::RetryConfig;
use crate::provider::WireApi;
use async_trait::async_trait;
use codex_client::Request;
use codex_client::Response;
use codex_client::StreamResponse;
use codex_client::TransportError;
use http::HeaderMap;
use http::StatusCode;
use pretty_assertions::assert_eq;
use serde_json::json;
use std::sync::Arc;
use std::sync::Mutex;
use std::time::Duration;
#[derive(Clone, Default)]
struct CapturingTransport {
last_request: Arc<Mutex<Option<Request>>>,
body: Arc<ModelsResponse>,
}
#[async_trait]
impl HttpTransport for CapturingTransport {
async fn execute(&self, req: Request) -> Result<Response, TransportError> {
*self.last_request.lock().unwrap() = Some(req);
let body = serde_json::to_vec(&*self.body).unwrap();
Ok(Response {
status: StatusCode::OK,
headers: HeaderMap::new(),
body: body.into(),
})
}
async fn stream(&self, _req: Request) -> Result<StreamResponse, TransportError> {
Err(TransportError::Build("stream should not run".to_string()))
}
}
#[derive(Clone, Default)]
struct DummyAuth;
impl AuthProvider for DummyAuth {
fn bearer_token(&self) -> Option<String> {
None
}
}
fn provider(base_url: &str) -> Provider {
Provider {
name: "test".to_string(),
base_url: base_url.to_string(),
query_params: None,
wire: WireApi::Responses,
headers: HeaderMap::new(),
retry: RetryConfig {
max_attempts: 1,
base_delay: Duration::from_millis(1),
retry_429: false,
retry_5xx: true,
retry_transport: true,
},
stream_idle_timeout: Duration::from_secs(1),
}
}
#[tokio::test]
async fn appends_client_version_query() {
let response = ModelsResponse { models: Vec::new() };
let transport = CapturingTransport {
last_request: Arc::new(Mutex::new(None)),
body: Arc::new(response),
};
let client = ModelsClient::new(
transport.clone(),
provider("https://example.com/api/codex"),
DummyAuth,
);
let result = client
.list_models("0.99.0", HeaderMap::new())
.await
.expect("request should succeed");
assert_eq!(result.models.len(), 0);
let url = transport
.last_request
.lock()
.unwrap()
.as_ref()
.unwrap()
.url
.clone();
assert_eq!(
url,
"https://example.com/api/codex/models?client_version=0.99.0"
);
}
#[tokio::test]
async fn parses_models_response() {
let response = ModelsResponse {
models: vec![
serde_json::from_value(json!({
"slug": "gpt-test",
"display_name": "gpt-test",
"description": "desc",
"default_reasoning_level": "medium",
"supported_reasoning_levels": ["low", "medium", "high"],
"shell_type": "shell_command",
"visibility": "list",
"minimal_client_version": [0, 99, 0],
"supported_in_api": true,
"priority": 1
}))
.unwrap(),
],
};
let transport = CapturingTransport {
last_request: Arc::new(Mutex::new(None)),
body: Arc::new(response),
};
let client = ModelsClient::new(
transport,
provider("https://example.com/api/codex"),
DummyAuth,
);
let result = client
.list_models("0.99.0", HeaderMap::new())
.await
.expect("request should succeed");
assert_eq!(result.models.len(), 1);
assert_eq!(result.models[0].slug, "gpt-test");
assert_eq!(result.models[0].supported_in_api, true);
assert_eq!(result.models[0].priority, 1);
}
}

View File

@@ -22,6 +22,7 @@ pub use crate::common::create_text_param_for_request;
pub use crate::endpoint::chat::AggregateStreamExt;
pub use crate::endpoint::chat::ChatClient;
pub use crate::endpoint::compact::CompactClient;
pub use crate::endpoint::models::ModelsClient;
pub use crate::endpoint::responses::ResponsesClient;
pub use crate::endpoint::responses::ResponsesOptions;
pub use crate::error::ApiError;

View File

@@ -0,0 +1,100 @@
use codex_api::AuthProvider;
use codex_api::ModelsClient;
use codex_api::provider::Provider;
use codex_api::provider::RetryConfig;
use codex_api::provider::WireApi;
use codex_client::ReqwestTransport;
use codex_protocol::openai_models::ClientVersion;
use codex_protocol::openai_models::ModelInfo;
use codex_protocol::openai_models::ModelVisibility;
use codex_protocol::openai_models::ModelsResponse;
use codex_protocol::openai_models::ReasoningLevel;
use codex_protocol::openai_models::ShellType;
use http::HeaderMap;
use http::Method;
use wiremock::Mock;
use wiremock::MockServer;
use wiremock::ResponseTemplate;
use wiremock::matchers::method;
use wiremock::matchers::path;
#[derive(Clone, Default)]
struct DummyAuth;
impl AuthProvider for DummyAuth {
fn bearer_token(&self) -> Option<String> {
None
}
}
fn provider(base_url: &str) -> Provider {
Provider {
name: "test".to_string(),
base_url: base_url.to_string(),
query_params: None,
wire: WireApi::Responses,
headers: HeaderMap::new(),
retry: RetryConfig {
max_attempts: 1,
base_delay: std::time::Duration::from_millis(1),
retry_429: false,
retry_5xx: true,
retry_transport: true,
},
stream_idle_timeout: std::time::Duration::from_secs(1),
}
}
#[tokio::test]
async fn models_client_hits_models_endpoint() {
let server = MockServer::start().await;
let base_url = format!("{}/api/codex", server.uri());
let response = ModelsResponse {
models: vec![ModelInfo {
slug: "gpt-test".to_string(),
display_name: "gpt-test".to_string(),
description: Some("desc".to_string()),
default_reasoning_level: ReasoningLevel::Medium,
supported_reasoning_levels: vec![
ReasoningLevel::Low,
ReasoningLevel::Medium,
ReasoningLevel::High,
],
shell_type: ShellType::ShellCommand,
visibility: ModelVisibility::List,
minimal_client_version: ClientVersion(0, 1, 0),
supported_in_api: true,
priority: 1,
}],
};
Mock::given(method("GET"))
.and(path("/api/codex/models"))
.respond_with(
ResponseTemplate::new(200)
.insert_header("content-type", "application/json")
.set_body_json(&response),
)
.mount(&server)
.await;
let transport = ReqwestTransport::new(reqwest::Client::new());
let client = ModelsClient::new(transport, provider(&base_url), DummyAuth);
let result = client
.list_models("0.1.0", HeaderMap::new())
.await
.expect("models request should succeed");
assert_eq!(result.models.len(), 1);
assert_eq!(result.models[0].slug, "gpt-test");
let received = server
.received_requests()
.await
.expect("should capture requests");
assert_eq!(received.len(), 1);
assert_eq!(received[0].method, Method::GET.as_str());
assert_eq!(received[0].url.path(), "/api/codex/models");
}

View File

@@ -1,21 +1,22 @@
[package]
name = "codex-client"
version.workspace = true
edition.workspace = true
license.workspace = true
name = "codex-client"
version.workspace = true
[dependencies]
async-trait = { workspace = true }
bytes = { workspace = true }
eventsource-stream = { workspace = true }
futures = { workspace = true }
http = { workspace = true }
rand = { workspace = true }
reqwest = { workspace = true, features = ["json", "stream"] }
serde = { workspace = true, features = ["derive"] }
serde_json = { workspace = true }
thiserror = { workspace = true }
tokio = { workspace = true, features = ["macros", "rt", "time", "sync"] }
tracing = { workspace = true }
[lints]
workspace = true

View File

@@ -8,6 +8,9 @@ use futures::stream::BoxStream;
use http::HeaderMap;
use http::Method;
use http::StatusCode;
use tracing::Level;
use tracing::enabled;
use tracing::trace;
pub type ByteStream = BoxStream<'static, Result<Bytes, TransportError>>;
@@ -83,6 +86,15 @@ impl HttpTransport for ReqwestTransport {
}
async fn stream(&self, req: Request) -> Result<StreamResponse, TransportError> {
if enabled!(Level::TRACE) {
trace!(
"{} to {}: {}",
req.method,
req.url,
req.body.as_deref().unwrap_or_default()
);
}
let builder = self.build(req)?;
let resp = builder.send().await.map_err(Self::map_error)?;
let status = resp.status();
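
The trace is gated on the TRACE level, so nothing is formatted unless a subscriber opts in. A sketch of enabling it, assuming the tracing-subscriber crate with its env-filter feature (neither appears in this diff):

fn init_trace_logging() {
    // Route codex-client request traces to the log output; keep other crates at info.
    tracing_subscriber::fmt()
        .with_env_filter("info,codex_client=trace")
        .init();
}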

View File

@@ -9,12 +9,10 @@ workspace = true
[dependencies]
clap = { workspace = true, features = ["derive", "wrap_help"], optional = true }
codex-app-server-protocol = { workspace = true }
codex-core = { workspace = true }
codex-lmstudio = { workspace = true }
codex-ollama = { workspace = true }
codex-protocol = { workspace = true }
once_cell = { workspace = true }
serde = { workspace = true, optional = true }
toml = { workspace = true, optional = true }

View File

@@ -12,15 +12,14 @@ pub fn create_config_summary_entries(config: &Config) -> Vec<(&'static str, Stri
("approval", config.approval_policy.to_string()),
("sandbox", summarize_sandbox_policy(&config.sandbox_policy)),
];
if config.model_provider.wire_api == WireApi::Responses
&& config.model_family.supports_reasoning_summaries
{
if config.model_provider.wire_api == WireApi::Responses {
let reasoning_effort = config
.model_reasoning_effort
.or(config.model_family.default_reasoning_effort)
.map(|effort| effort.to_string())
.unwrap_or_else(|| "none".to_string());
entries.push(("reasoning effort", reasoning_effort));
.map(|effort| effort.to_string());
entries.push((
"reasoning effort",
reasoning_effort.unwrap_or_else(|| "none".to_string()),
));
entries.push((
"reasoning summaries",
config.model_reasoning_summary.to_string(),

View File

@@ -32,8 +32,6 @@ mod config_summary;
pub use config_summary::create_config_summary_entries;
// Shared fuzzy matcher (used by TUI selection popups and other UI filtering)
pub mod fuzzy_match;
// Shared model presets used by TUI and MCP server
pub mod model_presets;
// Shared approval presets (AskForApproval + Sandbox) used by TUI and MCP server
// Not to be confused with AskForApproval, which we should probably rename to EscalationPolicy.
pub mod approval_presets;

View File

@@ -52,6 +52,7 @@ regex-lite = { workspace = true }
reqwest = { workspace = true, features = ["json", "stream"] }
serde = { workspace = true, features = ["derive"] }
serde_json = { workspace = true }
serde_yaml = { workspace = true }
sha1 = { workspace = true }
sha2 = { workspace = true }
shlex = { workspace = true }

View File

@@ -33,12 +33,20 @@ pub(crate) fn map_api_error(err: ApiError) -> CodexErr {
headers,
body,
} => {
if status == http::StatusCode::INTERNAL_SERVER_ERROR {
let body_text = body.unwrap_or_default();
if status == http::StatusCode::BAD_REQUEST {
if body_text
.contains("The image data you provided does not represent a valid image")
{
CodexErr::InvalidImageRequest()
} else {
CodexErr::InvalidRequest(body_text)
}
} else if status == http::StatusCode::INTERNAL_SERVER_ERROR {
CodexErr::InternalServerError
} else if status == http::StatusCode::TOO_MANY_REQUESTS {
if let Some(body) = body
&& let Ok(err) = serde_json::from_str::<UsageErrorResponse>(&body)
{
if let Ok(err) = serde_json::from_str::<UsageErrorResponse>(&body_text) {
if err.error.error_type.as_deref() == Some("usage_limit_reached") {
let rate_limits = headers.as_ref().and_then(parse_rate_limit);
let resets_at = err
@@ -62,7 +70,7 @@ pub(crate) fn map_api_error(err: ApiError) -> CodexErr {
} else {
CodexErr::UnexpectedStatus(UnexpectedResponseError {
status,
body: body.unwrap_or_default(),
body: body_text,
request_id: extract_request_id(headers.as_ref()),
})
}
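
The 400 branch is the behavioral change here; a distilled sketch of just that classification rule (a free function for illustration, not the shape of the real code):

fn classify_bad_request(body: Option<String>) -> CodexErr {
    let body_text = body.unwrap_or_default();
    if body_text.contains("The image data you provided does not represent a valid image") {
        // Surfaced as a dedicated error so callers can special-case bad image uploads.
        CodexErr::InvalidImageRequest()
    } else {
        CodexErr::InvalidRequest(body_text)
    }
}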

View File

@@ -70,7 +70,9 @@ pub(crate) async fn apply_patch(
)
.await;
match rx_approve.await.unwrap_or_default() {
ReviewDecision::Approved | ReviewDecision::ApprovedForSession => {
ReviewDecision::Approved
| ReviewDecision::ApprovedExecpolicyAmendment { .. }
| ReviewDecision::ApprovedForSession => {
InternalApplyPatchInvocation::DelegateToExec(ApplyPatchExec {
action,
user_explicitly_approved_this_action: true,

View File

@@ -1201,4 +1201,8 @@ impl AuthManager {
self.reload();
Ok(removed)
}
pub fn get_auth_mode(&self) -> Option<AuthMode> {
self.auth().map(|a| a.mode)
}
}
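
The new accessor has exactly one consumer in this commit range; a sketch of that call site (taken from the ConversationManager diff further down):

// The auth mode seeds the ModelsManager at construction time.
let models_manager = Arc::new(ModelsManager::new(auth_manager.get_auth_mode()));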

View File

@@ -20,9 +20,9 @@ use codex_api::error::ApiError;
use codex_app_server_protocol::AuthMode;
use codex_otel::otel_event_manager::OtelEventManager;
use codex_protocol::ConversationId;
use codex_protocol::config_types::ReasoningEffort as ReasoningEffortConfig;
use codex_protocol::config_types::ReasoningSummary as ReasoningSummaryConfig;
use codex_protocol::models::ResponseItem;
use codex_protocol::openai_models::ReasoningEffort as ReasoningEffortConfig;
use codex_protocol::protocol::SessionSource;
use eventsource_stream::Event;
use eventsource_stream::EventStreamError;
@@ -46,10 +46,10 @@ use crate::default_client::build_reqwest_client;
use crate::error::CodexErr;
use crate::error::Result;
use crate::flags::CODEX_RS_SSE_FIXTURE;
use crate::model_family::ModelFamily;
use crate::model_provider_info::ModelProviderInfo;
use crate::model_provider_info::WireApi;
use crate::openai_model_info::get_model_info;
use crate::openai_models::model_family::ModelFamily;
use crate::tools::spec::create_tools_json_for_chat_completions_api;
use crate::tools::spec::create_tools_json_for_responses_api;
@@ -57,6 +57,7 @@ use crate::tools::spec::create_tools_json_for_responses_api;
pub struct ModelClient {
config: Arc<Config>,
auth_manager: Option<Arc<AuthManager>>,
model_family: ModelFamily,
otel_event_manager: OtelEventManager,
provider: ModelProviderInfo,
conversation_id: ConversationId,
@@ -70,6 +71,7 @@ impl ModelClient {
pub fn new(
config: Arc<Config>,
auth_manager: Option<Arc<AuthManager>>,
model_family: ModelFamily,
otel_event_manager: OtelEventManager,
provider: ModelProviderInfo,
effort: Option<ReasoningEffortConfig>,
@@ -80,6 +82,7 @@ impl ModelClient {
Self {
config,
auth_manager,
model_family,
otel_event_manager,
provider,
conversation_id,
@@ -90,16 +93,18 @@ impl ModelClient {
}
pub fn get_model_context_window(&self) -> Option<i64> {
let pct = self.config.model_family.effective_context_window_percent;
let model_family = self.get_model_family();
let effective_context_window_percent = model_family.effective_context_window_percent;
self.config
.model_context_window
.or_else(|| get_model_info(&self.config.model_family).map(|info| info.context_window))
.map(|w| w.saturating_mul(pct) / 100)
.or_else(|| get_model_info(&model_family).map(|info| info.context_window))
.map(|w| w.saturating_mul(effective_context_window_percent) / 100)
}
pub fn get_auto_compact_token_limit(&self) -> Option<i64> {
let model_family = self.get_model_family();
self.config.model_auto_compact_token_limit.or_else(|| {
get_model_info(&self.config.model_family).and_then(|info| info.auto_compact_token_limit)
get_model_info(&model_family).and_then(|info| info.auto_compact_token_limit)
})
}
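
A worked instance of the saturating percentage math above, with illustrative numbers (a 272_000-token window trimmed to 95%):

let window: i64 = 272_000;
let effective_context_window_percent: i64 = 95; // illustrative value
assert_eq!(window.saturating_mul(effective_context_window_percent) / 100, 258_400);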
@@ -149,9 +154,8 @@ impl ModelClient {
}
let auth_manager = self.auth_manager.clone();
let instructions = prompt
.get_full_instructions(&self.config.model_family)
.into_owned();
let model_family = self.get_model_family();
let instructions = prompt.get_full_instructions(&model_family).into_owned();
let tools_json = create_tools_json_for_chat_completions_api(&prompt.tools)?;
let api_prompt = build_api_prompt(prompt, instructions, tools_json);
let conversation_id = self.conversation_id.to_string();
@@ -204,16 +208,13 @@ impl ModelClient {
}
let auth_manager = self.auth_manager.clone();
let instructions = prompt
.get_full_instructions(&self.config.model_family)
.into_owned();
let model_family = self.get_model_family();
let instructions = prompt.get_full_instructions(&model_family).into_owned();
let tools_json: Vec<Value> = create_tools_json_for_responses_api(&prompt.tools)?;
let reasoning = if self.config.model_family.supports_reasoning_summaries {
let reasoning = if model_family.supports_reasoning_summaries {
Some(Reasoning {
effort: self
.effort
.or(self.config.model_family.default_reasoning_effort),
effort: self.effort.or(model_family.default_reasoning_effort),
summary: Some(self.summary),
})
} else {
@@ -226,15 +227,15 @@ impl ModelClient {
vec![]
};
let verbosity = if self.config.model_family.support_verbosity {
let verbosity = if model_family.support_verbosity {
self.config
.model_verbosity
.or(self.config.model_family.default_verbosity)
.or(model_family.default_verbosity)
} else {
if self.config.model_verbosity.is_some() {
warn!(
"model_verbosity is set but ignored as the model does not support verbosity: {}",
self.config.model_family.family
model_family.family
);
}
None
@@ -305,7 +306,7 @@ impl ModelClient {
/// Returns the currently configured model family.
pub fn get_model_family(&self) -> ModelFamily {
self.config.model_family.clone()
self.model_family.clone()
}
/// Returns the current reasoning effort setting.
@@ -342,7 +343,7 @@ impl ModelClient {
.with_telemetry(Some(request_telemetry));
let instructions = prompt
.get_full_instructions(&self.config.model_family)
.get_full_instructions(&self.get_model_family())
.into_owned();
let payload = ApiCompactionInput {
model: &self.config.model,

View File

@@ -1,6 +1,6 @@
use crate::client_common::tools::ToolSpec;
use crate::error::Result;
use crate::model_family::ModelFamily;
use crate::openai_models::model_family::ModelFamily;
pub use codex_api::common::ResponseEvent;
use codex_apply_patch::APPLY_PATCH_TOOL_INSTRUCTIONS;
use codex_protocol::models::ResponseItem;
@@ -252,7 +252,7 @@ impl Stream for ResponseStream {
#[cfg(test)]
mod tests {
use crate::model_family::find_family_for_model;
use crate::openai_models::model_family::find_family_for_model;
use codex_api::ResponsesApiRequest;
use codex_api::common::OpenAiVerbosity;
use codex_api::common::TextControls;
@@ -309,7 +309,7 @@ mod tests {
},
];
for test_case in test_cases {
let model_family = find_family_for_model(test_case.slug).expect("known model slug");
let model_family = find_family_for_model(test_case.slug);
let expected = if test_case.expects_apply_patch_instructions {
format!(
"{}\n{}",

File diff suppressed because it is too large

View File

@@ -25,6 +25,7 @@ use crate::codex::Session;
use crate::codex::TurnContext;
use crate::config::Config;
use crate::error::CodexErr;
use crate::openai_models::models_manager::ModelsManager;
use codex_protocol::protocol::InitialHistory;
/// Start an interactive sub-Codex conversation and return IO channels.
@@ -35,6 +36,7 @@ use codex_protocol::protocol::InitialHistory;
pub(crate) async fn run_codex_conversation_interactive(
config: Config,
auth_manager: Arc<AuthManager>,
models_manager: Arc<ModelsManager>,
parent_session: Arc<Session>,
parent_ctx: Arc<TurnContext>,
cancel_token: CancellationToken,
@@ -46,6 +48,7 @@ pub(crate) async fn run_codex_conversation_interactive(
let CodexSpawnOk { codex, .. } = Codex::spawn(
config,
auth_manager,
models_manager,
initial_history.unwrap_or(InitialHistory::New),
SessionSource::SubAgent(SubAgentSource::Review),
)
@@ -88,9 +91,11 @@ pub(crate) async fn run_codex_conversation_interactive(
/// Convenience wrapper for one-time use with an initial prompt.
///
/// Internally calls the interactive variant, then immediately submits the provided input.
#[allow(clippy::too_many_arguments)]
pub(crate) async fn run_codex_conversation_one_shot(
config: Config,
auth_manager: Arc<AuthManager>,
models_manager: Arc<ModelsManager>,
input: Vec<UserInput>,
parent_session: Arc<Session>,
parent_ctx: Arc<TurnContext>,
@@ -103,6 +108,7 @@ pub(crate) async fn run_codex_conversation_one_shot(
let io = run_codex_conversation_interactive(
config,
auth_manager,
models_manager,
parent_session,
parent_ctx,
child_cancel.clone(),
@@ -275,6 +281,7 @@ async fn handle_exec_approval(
event.cwd,
event.reason,
event.risk,
event.proposed_execpolicy_amendment,
);
let decision = await_approval_with_cancel(
approval_fut,

View File

@@ -32,13 +32,13 @@ pub const SUMMARIZATION_PROMPT: &str = include_str!("../templates/compact/prompt
pub const SUMMARY_PREFIX: &str = include_str!("../templates/compact/summary_prefix.md");
const COMPACT_USER_MESSAGE_MAX_TOKENS: usize = 20_000;
pub(crate) async fn should_use_remote_compact_task(session: &Session) -> bool {
pub(crate) fn should_use_remote_compact_task(session: &Session) -> bool {
session
.services
.auth_manager
.auth()
.is_some_and(|auth| auth.mode == AuthMode::ChatGPT)
&& session.enabled(Feature::RemoteCompaction).await
&& session.enabled(Feature::RemoteCompaction)
}
pub(crate) async fn run_inline_auto_compact_task(

View File

@@ -2,8 +2,8 @@ use crate::config::CONFIG_TOML_FILE;
use crate::config::types::McpServerConfig;
use crate::config::types::Notice;
use anyhow::Context;
use codex_protocol::config_types::ReasoningEffort;
use codex_protocol::config_types::TrustLevel;
use codex_protocol::openai_models::ReasoningEffort;
use std::collections::BTreeMap;
use std::path::Path;
use std::path::PathBuf;
@@ -574,7 +574,7 @@ impl ConfigEditsBuilder {
mod tests {
use super::*;
use crate::config::types::McpServerTransportConfig;
use codex_protocol::config_types::ReasoningEffort;
use codex_protocol::openai_models::ReasoningEffort;
use pretty_assertions::assert_eq;
use tempfile::tempdir;
use tokio::runtime::Builder;

View File

@@ -22,26 +22,25 @@ use crate::features::FeatureOverrides;
use crate::features::Features;
use crate::features::FeaturesToml;
use crate::git_info::resolve_root_git_project_for_trust;
use crate::model_family::ModelFamily;
use crate::model_family::derive_default_model_family;
use crate::model_family::find_family_for_model;
use crate::model_provider_info::LMSTUDIO_OSS_PROVIDER_ID;
use crate::model_provider_info::ModelProviderInfo;
use crate::model_provider_info::OLLAMA_OSS_PROVIDER_ID;
use crate::model_provider_info::built_in_model_providers;
use crate::openai_model_info::get_model_info;
use crate::openai_models::model_family::find_family_for_model;
use crate::project_doc::DEFAULT_PROJECT_DOC_FILENAME;
use crate::project_doc::LOCAL_PROJECT_DOC_FILENAME;
use crate::protocol::AskForApproval;
use crate::protocol::SandboxPolicy;
use crate::util::resolve_path;
use codex_app_server_protocol::Tools;
use codex_app_server_protocol::UserSavedConfig;
use codex_protocol::config_types::ForcedLoginMethod;
use codex_protocol::config_types::ReasoningEffort;
use codex_protocol::config_types::ReasoningSummary;
use codex_protocol::config_types::SandboxMode;
use codex_protocol::config_types::TrustLevel;
use codex_protocol::config_types::Verbosity;
use codex_protocol::openai_models::ReasoningEffort;
use codex_rmcp_client::OAuthCredentialsStoreMode;
use dirs::home_dir;
use dunce::canonicalize;
@@ -61,9 +60,8 @@ pub mod edit;
pub mod profile;
pub mod types;
pub const OPENAI_DEFAULT_MODEL: &str = "gpt-5.1-codex";
const OPENAI_DEFAULT_REVIEW_MODEL: &str = "gpt-5.1-codex";
pub const GPT_5_CODEX_MEDIUM_MODEL: &str = "gpt-5.1-codex";
pub const OPENAI_DEFAULT_MODEL: &str = "gpt-5.1-codex-max";
const OPENAI_DEFAULT_REVIEW_MODEL: &str = "gpt-5.1-codex-max";
/// Maximum number of bytes of the documentation that will be embedded. Larger
/// files are *silently truncated* to this size so we do not take up too much of
@@ -81,8 +79,6 @@ pub struct Config {
/// Model used specifically for review sessions. Defaults to "gpt-5.1-codex-max".
pub review_model: String,
pub model_family: ModelFamily,
/// Size of the context window for the model, in tokens.
pub model_context_window: Option<i64>,
@@ -160,6 +156,9 @@ pub struct Config {
/// Enable ASCII animations and shimmer effects in the TUI.
pub animations: bool,
/// Show startup tooltips in the TUI welcome screen.
pub show_tooltips: bool,
/// The directory that should be treated as the current working directory
/// for the session. All relative paths inside the business-logic layer are
/// resolved against this path.
@@ -192,6 +191,7 @@ pub struct Config {
/// Additional filenames to try when looking for project-level docs.
pub project_doc_fallback_filenames: Vec<String>,
// todo(aibrahim): this should be used in the override model family
/// Token budget applied when storing tool/function outputs in the context manager.
pub tool_output_token_limit: Option<usize>,
@@ -222,6 +222,12 @@ pub struct Config {
/// request using the Responses API.
pub model_reasoning_summary: ReasoningSummary,
/// Optional override to force-enable reasoning summaries for the configured model.
pub model_supports_reasoning_summaries: Option<bool>,
/// Optional override to force reasoning summary format for the configured model.
pub model_reasoning_summary_format: Option<ReasoningSummaryFormat>,
/// Optional verbosity control for GPT-5 models (Responses API `text.verbosity`).
pub model_verbosity: Option<Verbosity>,
@@ -1016,15 +1022,8 @@ impl Config {
let additional_writable_roots: Vec<PathBuf> = additional_writable_roots
.into_iter()
.map(|path| {
let absolute = if path.is_absolute() {
path
} else {
resolved_cwd.join(path)
};
match canonicalize(&absolute) {
Ok(canonical) => canonical,
Err(_) => absolute,
}
let absolute = resolve_path(&resolved_cwd, &path);
canonicalize(&absolute).unwrap_or(absolute)
})
.collect();
let active_project = cfg
@@ -1112,15 +1111,7 @@ impl Config {
.or(cfg.model)
.unwrap_or_else(default_model);
let mut model_family =
find_family_for_model(&model).unwrap_or_else(|| derive_default_model_family(&model));
if let Some(supports_reasoning_summaries) = cfg.model_supports_reasoning_summaries {
model_family.supports_reasoning_summaries = supports_reasoning_summaries;
}
if let Some(model_reasoning_summary_format) = cfg.model_reasoning_summary_format {
model_family.reasoning_summary_format = model_reasoning_summary_format;
}
let model_family = find_family_for_model(&model);
let openai_model_info = get_model_info(&model_family);
let model_context_window = cfg
@@ -1177,7 +1168,6 @@ impl Config {
let config = Self {
model,
review_model,
model_family,
model_context_window,
model_auto_compact_token_limit,
model_provider_id,
@@ -1233,6 +1223,8 @@ impl Config {
.model_reasoning_summary
.or(cfg.model_reasoning_summary)
.unwrap_or_default(),
model_supports_reasoning_summaries: cfg.model_supports_reasoning_summaries,
model_reasoning_summary_format: cfg.model_reasoning_summary_format.clone(),
model_verbosity: config_profile.model_verbosity.or(cfg.model_verbosity),
chatgpt_base_url: config_profile
.chatgpt_base_url
@@ -1258,6 +1250,7 @@ impl Config {
.map(|t| t.notifications.clone())
.unwrap_or_default(),
animations: cfg.tui.as_ref().map(|t| t.animations).unwrap_or(true),
show_tooltips: cfg.tui.as_ref().map(|t| t.show_tooltips).unwrap_or(true),
otel: {
let t: OtelConfigToml = cfg.otel.unwrap_or_default();
let log_user_prompt = t.log_user_prompt.unwrap_or(false);
@@ -1299,11 +1292,7 @@ impl Config {
return Ok(None);
};
let full_path = if p.is_relative() {
cwd.join(p)
} else {
p.to_path_buf()
};
let full_path = resolve_path(cwd, p);
let contents = std::fs::read_to_string(&full_path).map_err(|e| {
std::io::Error::new(
@@ -1436,6 +1425,7 @@ persistence = "none"
let tui = parsed.tui.expect("config should include tui section");
assert_eq!(tui.notifications, Notifications::Enabled(true));
assert!(tui.show_tooltips);
}
#[test]
@@ -2960,7 +2950,6 @@ model_verbosity = "high"
Config {
model: "o3".to_string(),
review_model: OPENAI_DEFAULT_REVIEW_MODEL.to_string(),
model_family: find_family_for_model("o3").expect("known model slug"),
model_context_window: Some(200_000),
model_auto_compact_token_limit: Some(180_000),
model_provider_id: "openai".to_string(),
@@ -2988,6 +2977,8 @@ model_verbosity = "high"
show_raw_agent_reasoning: false,
model_reasoning_effort: Some(ReasoningEffort::High),
model_reasoning_summary: ReasoningSummary::Detailed,
model_supports_reasoning_summaries: None,
model_reasoning_summary_format: None,
model_verbosity: None,
chatgpt_base_url: "https://chatgpt.com/backend-api/".to_string(),
base_instructions: None,
@@ -3009,6 +3000,7 @@ model_verbosity = "high"
disable_paste_burst: false,
tui_notifications: Default::default(),
animations: true,
show_tooltips: true,
otel: OtelConfig::default(),
},
o3_profile_config
@@ -3033,7 +3025,6 @@ model_verbosity = "high"
let expected_gpt3_profile_config = Config {
model: "gpt-3.5-turbo".to_string(),
review_model: OPENAI_DEFAULT_REVIEW_MODEL.to_string(),
model_family: find_family_for_model("gpt-3.5-turbo").expect("known model slug"),
model_context_window: Some(16_385),
model_auto_compact_token_limit: Some(14_746),
model_provider_id: "openai-chat-completions".to_string(),
@@ -3061,6 +3052,8 @@ model_verbosity = "high"
show_raw_agent_reasoning: false,
model_reasoning_effort: None,
model_reasoning_summary: ReasoningSummary::default(),
model_supports_reasoning_summaries: None,
model_reasoning_summary_format: None,
model_verbosity: None,
chatgpt_base_url: "https://chatgpt.com/backend-api/".to_string(),
base_instructions: None,
@@ -3082,6 +3075,7 @@ model_verbosity = "high"
disable_paste_burst: false,
tui_notifications: Default::default(),
animations: true,
show_tooltips: true,
otel: OtelConfig::default(),
};
@@ -3121,7 +3115,6 @@ model_verbosity = "high"
let expected_zdr_profile_config = Config {
model: "o3".to_string(),
review_model: OPENAI_DEFAULT_REVIEW_MODEL.to_string(),
model_family: find_family_for_model("o3").expect("known model slug"),
model_context_window: Some(200_000),
model_auto_compact_token_limit: Some(180_000),
model_provider_id: "openai".to_string(),
@@ -3149,6 +3142,8 @@ model_verbosity = "high"
show_raw_agent_reasoning: false,
model_reasoning_effort: None,
model_reasoning_summary: ReasoningSummary::default(),
model_supports_reasoning_summaries: None,
model_reasoning_summary_format: None,
model_verbosity: None,
chatgpt_base_url: "https://chatgpt.com/backend-api/".to_string(),
base_instructions: None,
@@ -3170,6 +3165,7 @@ model_verbosity = "high"
disable_paste_burst: false,
tui_notifications: Default::default(),
animations: true,
show_tooltips: true,
otel: OtelConfig::default(),
};
@@ -3195,7 +3191,6 @@ model_verbosity = "high"
let expected_gpt5_profile_config = Config {
model: "gpt-5.1".to_string(),
review_model: OPENAI_DEFAULT_REVIEW_MODEL.to_string(),
model_family: find_family_for_model("gpt-5.1").expect("known model slug"),
model_context_window: Some(272_000),
model_auto_compact_token_limit: Some(244_800),
model_provider_id: "openai".to_string(),
@@ -3223,6 +3218,8 @@ model_verbosity = "high"
show_raw_agent_reasoning: false,
model_reasoning_effort: Some(ReasoningEffort::High),
model_reasoning_summary: ReasoningSummary::Detailed,
model_supports_reasoning_summaries: None,
model_reasoning_summary_format: None,
model_verbosity: Some(Verbosity::High),
chatgpt_base_url: "https://chatgpt.com/backend-api/".to_string(),
base_instructions: None,
@@ -3244,6 +3241,7 @@ model_verbosity = "high"
disable_paste_burst: false,
tui_notifications: Default::default(),
animations: true,
show_tooltips: true,
otel: OtelConfig::default(),
};

View File

@@ -2,10 +2,10 @@ use serde::Deserialize;
use std::path::PathBuf;
use crate::protocol::AskForApproval;
use codex_protocol::config_types::ReasoningEffort;
use codex_protocol::config_types::ReasoningSummary;
use codex_protocol::config_types::SandboxMode;
use codex_protocol::config_types::Verbosity;
use codex_protocol::openai_models::ReasoningEffort;
/// Collection of common configuration options that a user can define as a unit
/// in `config.toml`.

View File

@@ -256,8 +256,8 @@ pub struct History {
/// If true, history entries will not be written to disk.
pub persistence: HistoryPersistence,
/// If set, the maximum size of the history file in bytes.
/// TODO(mbolin): Not currently honored.
/// If set, the maximum size of the history file in bytes. The oldest entries
/// are dropped once the file exceeds this limit.
pub max_bytes: Option<usize>,
}
@@ -368,6 +368,11 @@ pub struct Tui {
/// Defaults to `true`.
#[serde(default = "default_true")]
pub animations: bool,
/// Show startup tooltips in the TUI welcome screen.
/// Defaults to `true`.
#[serde(default = "default_true")]
pub show_tooltips: bool,
}
const fn default_true() -> bool {

View File

@@ -5,6 +5,8 @@ use crate::truncate::approx_token_count;
use crate::truncate::approx_tokens_from_byte_count;
use crate::truncate::truncate_function_output_items_with_policy;
use crate::truncate::truncate_text;
use codex_protocol::models::ContentItem;
use codex_protocol::models::FunctionCallOutputContentItem;
use codex_protocol::models::FunctionCallOutputPayload;
use codex_protocol::models::ResponseItem;
use codex_protocol::protocol::TokenUsage;
@@ -118,6 +120,37 @@ impl ContextManager {
self.items = items;
}
pub(crate) fn replace_last_turn_images(&mut self, placeholder: &str) {
let Some(last_item) = self.items.last_mut() else {
return;
};
match last_item {
ResponseItem::Message { role, content, .. } if role == "user" => {
for item in content.iter_mut() {
if matches!(item, ContentItem::InputImage { .. }) {
*item = ContentItem::InputText {
text: placeholder.to_string(),
};
}
}
}
ResponseItem::FunctionCallOutput { output, .. } => {
let Some(content_items) = output.content_items.as_mut() else {
return;
};
for item in content_items.iter_mut() {
if matches!(item, FunctionCallOutputContentItem::InputImage { .. }) {
*item = FunctionCallOutputContentItem::InputText {
text: placeholder.to_string(),
};
}
}
}
_ => {}
}
}
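
An effect sketch for the new helper, assuming a `manager: ContextManager` whose last item is a user message carrying an image (`manager` and the placeholder text are hypothetical):

// Before: content = [InputText("compare these"), InputImage { .. }]
manager.replace_last_turn_images("[image elided]");
// After:  content = [InputText("compare these"), InputText("[image elided]")]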
pub(crate) fn update_token_info(
&mut self,
usage: &TokenUsage,

View File

@@ -7,6 +7,7 @@ use crate::codex_conversation::CodexConversation;
use crate::config::Config;
use crate::error::CodexErr;
use crate::error::Result as CodexResult;
use crate::openai_models::models_manager::ModelsManager;
use crate::protocol::Event;
use crate::protocol::EventMsg;
use crate::protocol::SessionConfiguredEvent;
@@ -14,6 +15,7 @@ use crate::rollout::RolloutRecorder;
use codex_protocol::ConversationId;
use codex_protocol::items::TurnItem;
use codex_protocol::models::ResponseItem;
use codex_protocol::openai_models::ModelPreset;
use codex_protocol::protocol::InitialHistory;
use codex_protocol::protocol::RolloutItem;
use codex_protocol::protocol::SessionSource;
@@ -35,6 +37,7 @@ pub struct NewConversation {
pub struct ConversationManager {
conversations: Arc<RwLock<HashMap<ConversationId, Arc<CodexConversation>>>>,
auth_manager: Arc<AuthManager>,
models_manager: Arc<ModelsManager>,
session_source: SessionSource,
}
@@ -42,8 +45,9 @@ impl ConversationManager {
pub fn new(auth_manager: Arc<AuthManager>, session_source: SessionSource) -> Self {
Self {
conversations: Arc::new(RwLock::new(HashMap::new())),
auth_manager,
auth_manager: auth_manager.clone(),
session_source,
models_manager: Arc::new(ModelsManager::new(auth_manager.get_auth_mode())),
}
}
@@ -61,14 +65,19 @@ impl ConversationManager {
}
pub async fn new_conversation(&self, config: Config) -> CodexResult<NewConversation> {
self.spawn_conversation(config, self.auth_manager.clone())
.await
self.spawn_conversation(
config,
self.auth_manager.clone(),
self.models_manager.clone(),
)
.await
}
async fn spawn_conversation(
&self,
config: Config,
auth_manager: Arc<AuthManager>,
models_manager: Arc<ModelsManager>,
) -> CodexResult<NewConversation> {
let CodexSpawnOk {
codex,
@@ -76,6 +85,7 @@ impl ConversationManager {
} = Codex::spawn(
config,
auth_manager,
models_manager,
InitialHistory::New,
self.session_source.clone(),
)
@@ -152,6 +162,7 @@ impl ConversationManager {
} = Codex::spawn(
config,
auth_manager,
self.models_manager.clone(),
initial_history,
self.session_source.clone(),
)
@@ -189,10 +200,25 @@ impl ConversationManager {
let CodexSpawnOk {
codex,
conversation_id,
} = Codex::spawn(config, auth_manager, history, self.session_source.clone()).await?;
} = Codex::spawn(
config,
auth_manager,
self.models_manager.clone(),
history,
self.session_source.clone(),
)
.await?;
self.finalize_spawn(codex, conversation_id).await
}
pub async fn list_models(&self) -> Vec<ModelPreset> {
self.models_manager.available_models.read().await.clone()
}
pub fn get_models_manager(&self) -> Arc<ModelsManager> {
self.models_manager.clone()
}
}
/// Return a prefix of `items` obtained by cutting strictly before the nth user message

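A minimal consumer sketch for the new accessors (only `new`, `list_models`, and `ModelPreset` are from this diff; `auth_manager` and `session_source` are whatever the host already passes):

let manager = ConversationManager::new(auth_manager, session_source);
let presets: Vec<ModelPreset> = manager.list_models().await;
tracing::info!(count = presets.len(), "model presets cached by ModelsManager");
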
View File

@@ -1,4 +1,3 @@
use crate::codex::ProcessedResponseItem;
use crate::exec::ExecToolCallOutput;
use crate::token_data::KnownPlan;
use crate::token_data::PlanType;
@@ -61,9 +60,7 @@ pub enum SandboxErr {
pub enum CodexErr {
// todo(aibrahim): get rid of this error carrying the dangling artifacts
#[error("turn aborted. Something went wrong? Hit `/feedback` to report the issue.")]
TurnAborted {
dangling_artifacts: Vec<ProcessedResponseItem>,
},
TurnAborted,
/// Returned by ResponsesClient when the SSE stream disconnects or errors out **after** the HTTP
/// handshake has succeeded but **before** it finished emitting `response.completed`.
@@ -103,6 +100,14 @@ pub enum CodexErr {
#[error("{0}")]
UnexpectedStatus(UnexpectedResponseError),
/// Invalid request.
#[error("{0}")]
InvalidRequest(String),
/// Invalid image.
#[error("Image poisoning")]
InvalidImageRequest(),
#[error("{0}")]
UsageLimitReached(UsageLimitReachedError),
@@ -173,9 +178,7 @@ pub enum CodexErr {
impl From<CancelErr> for CodexErr {
fn from(_: CancelErr) -> Self {
CodexErr::TurnAborted {
dangling_artifacts: Vec::new(),
}
CodexErr::TurnAborted
}
}

View File

@@ -485,6 +485,19 @@ pub struct ExecToolCallOutput {
pub timed_out: bool,
}
impl Default for ExecToolCallOutput {
fn default() -> Self {
Self {
exit_code: 0,
stdout: StreamOutput::new(String::new()),
stderr: StreamOutput::new(String::new()),
aggregated_output: StreamOutput::new(String::new()),
duration: Duration::ZERO,
timed_out: false,
}
}
}
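
The point of the Default impl is struct-update syntax at call sites and in tests; a sketch:

// Override only the interesting field; everything else stays empty/zero.
let failed = ExecToolCallOutput {
    exit_code: 1,
    ..Default::default()
};
assert!(!failed.timed_out);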
#[cfg_attr(not(target_os = "windows"), allow(unused_variables))]
async fn exec(
params: ExecParams,

View File

@@ -4,25 +4,35 @@ use std::path::PathBuf;
use std::sync::Arc;
use crate::command_safety::is_dangerous_command::requires_initial_appoval;
use codex_execpolicy::AmendError;
use codex_execpolicy::Decision;
use codex_execpolicy::Error as ExecPolicyRuleError;
use codex_execpolicy::Evaluation;
use codex_execpolicy::Policy;
use codex_execpolicy::PolicyParser;
use codex_execpolicy::RuleMatch;
use codex_execpolicy::blocking_append_allow_prefix_rule;
use codex_protocol::approvals::ExecPolicyAmendment;
use codex_protocol::protocol::AskForApproval;
use codex_protocol::protocol::SandboxPolicy;
use thiserror::Error;
use tokio::fs;
use tokio::sync::RwLock;
use tokio::task::spawn_blocking;
use crate::bash::parse_shell_lc_plain_commands;
use crate::features::Feature;
use crate::features::Features;
use crate::sandboxing::SandboxPermissions;
use crate::tools::sandboxing::ApprovalRequirement;
use crate::tools::sandboxing::ExecApprovalRequirement;
const FORBIDDEN_REASON: &str = "execpolicy forbids this command";
const PROMPT_CONFLICT_REASON: &str =
"execpolicy requires approval for this command, but AskForApproval is set to Never";
const PROMPT_REASON: &str = "execpolicy requires approval for this command";
const POLICY_DIR_NAME: &str = "policy";
const POLICY_EXTENSION: &str = "codexpolicy";
const DEFAULT_POLICY_FILE: &str = "default.codexpolicy";
#[derive(Debug, Error)]
pub enum ExecPolicyError {
@@ -45,12 +55,30 @@ pub enum ExecPolicyError {
},
}
#[derive(Debug, Error)]
pub enum ExecPolicyUpdateError {
#[error("failed to update execpolicy file {path}: {source}")]
AppendRule { path: PathBuf, source: AmendError },
#[error("failed to join blocking execpolicy update task: {source}")]
JoinBlockingTask { source: tokio::task::JoinError },
#[error("failed to update in-memory execpolicy: {source}")]
AddRule {
#[from]
source: ExecPolicyRuleError,
},
#[error("cannot append execpolicy rule because execpolicy feature is disabled")]
FeatureDisabled,
}
pub(crate) async fn exec_policy_for(
features: &Features,
codex_home: &Path,
) -> Result<Arc<Policy>, ExecPolicyError> {
) -> Result<Arc<RwLock<Policy>>, ExecPolicyError> {
if !features.enabled(Feature::ExecPolicy) {
return Ok(Arc::new(Policy::empty()));
return Ok(Arc::new(RwLock::new(Policy::empty())));
}
let policy_dir = codex_home.join(POLICY_DIR_NAME);
@@ -74,7 +102,7 @@ pub(crate) async fn exec_policy_for(
})?;
}
let policy = Arc::new(parser.build());
let policy = Arc::new(RwLock::new(parser.build()));
tracing::debug!(
"loaded execpolicy from {} files in {}",
policy_paths.len(),
@@ -84,59 +112,133 @@ pub(crate) async fn exec_policy_for(
Ok(policy)
}
fn evaluate_with_policy(
policy: &Policy,
command: &[String],
approval_policy: AskForApproval,
) -> Option<ApprovalRequirement> {
let commands = parse_shell_lc_plain_commands(command).unwrap_or_else(|| vec![command.to_vec()]);
let evaluation = policy.check_multiple(commands.iter());
match evaluation {
Evaluation::Match { decision, .. } => match decision {
Decision::Forbidden => Some(ApprovalRequirement::Forbidden {
reason: FORBIDDEN_REASON.to_string(),
}),
Decision::Prompt => {
let reason = PROMPT_REASON.to_string();
if matches!(approval_policy, AskForApproval::Never) {
Some(ApprovalRequirement::Forbidden { reason })
} else {
Some(ApprovalRequirement::NeedsApproval {
reason: Some(reason),
})
}
}
Decision::Allow => Some(ApprovalRequirement::Skip {
bypass_sandbox: true,
}),
},
Evaluation::NoMatch { .. } => None,
}
pub(crate) fn default_policy_path(codex_home: &Path) -> PathBuf {
codex_home.join(POLICY_DIR_NAME).join(DEFAULT_POLICY_FILE)
}
pub(crate) fn create_approval_requirement_for_command(
policy: &Policy,
pub(crate) async fn append_execpolicy_amendment_and_update(
codex_home: &Path,
current_policy: &Arc<RwLock<Policy>>,
prefix: &[String],
) -> Result<(), ExecPolicyUpdateError> {
let policy_path = default_policy_path(codex_home);
let prefix = prefix.to_vec();
spawn_blocking({
let policy_path = policy_path.clone();
let prefix = prefix.clone();
move || blocking_append_allow_prefix_rule(&policy_path, &prefix)
})
.await
.map_err(|source| ExecPolicyUpdateError::JoinBlockingTask { source })?
.map_err(|source| ExecPolicyUpdateError::AppendRule {
path: policy_path,
source,
})?;
current_policy
.write()
.await
.add_prefix_rule(&prefix, Decision::Allow)?;
Ok(())
}
/// Returns a proposed execpolicy amendment only when heuristics caused
/// the prompt decision, so we can offer to apply that amendment for future runs.
///
/// The amendment uses the first command that heuristics marked as `Prompt`. If any explicit
/// execpolicy rule also prompts, we return `None` because applying the amendment would not
/// skip that policy requirement.
///
/// Examples:
/// - execpolicy: empty. Command: `["python"]`. Heuristics prompt -> `Some(vec!["python"])`.
/// - execpolicy: empty. Command: `["bash", "-c", "cd /some/folder && prog1 --option1 arg1 && prog2 --option2 arg2"]`.
/// Parsed commands include `cd /some/folder`, `prog1 --option1 arg1`, and `prog2 --option2 arg2`. If heuristics allow `cd` but prompt
/// on `prog1`, we return `Some(vec!["prog1", "--option1", "arg1"])`.
/// - execpolicy: contains a `prompt for prefix ["prog2"]` rule. For the same command as above,
/// we return `None` because an execpolicy prompt still applies even if we amend execpolicy to allow ["prog1", "--option1", "arg1"].
fn proposed_execpolicy_amendment(evaluation: &Evaluation) -> Option<ExecPolicyAmendment> {
if evaluation.decision != Decision::Prompt {
return None;
}
let mut first_prompt_from_heuristics: Option<Vec<String>> = None;
for rule_match in &evaluation.matched_rules {
match rule_match {
RuleMatch::HeuristicsRuleMatch { command, decision } => {
if *decision == Decision::Prompt && first_prompt_from_heuristics.is_none() {
first_prompt_from_heuristics = Some(command.clone());
}
}
_ if rule_match.decision() == Decision::Prompt => {
return None;
}
_ => {}
}
}
first_prompt_from_heuristics.map(ExecPolicyAmendment::from)
}
/// Only return PROMPT_REASON when an execpolicy rule drove the prompt decision.
fn derive_prompt_reason(evaluation: &Evaluation) -> Option<String> {
evaluation.matched_rules.iter().find_map(|rule_match| {
if !matches!(rule_match, RuleMatch::HeuristicsRuleMatch { .. })
&& rule_match.decision() == Decision::Prompt
{
Some(PROMPT_REASON.to_string())
} else {
None
}
})
}
pub(crate) async fn create_exec_approval_requirement_for_command(
exec_policy: &Arc<RwLock<Policy>>,
features: &Features,
command: &[String],
approval_policy: AskForApproval,
sandbox_policy: &SandboxPolicy,
sandbox_permissions: SandboxPermissions,
) -> ApprovalRequirement {
if let Some(requirement) = evaluate_with_policy(policy, command, approval_policy) {
return requirement;
}
if requires_initial_appoval(
approval_policy,
sandbox_policy,
command,
sandbox_permissions,
) {
ApprovalRequirement::NeedsApproval { reason: None }
} else {
ApprovalRequirement::Skip {
bypass_sandbox: false,
) -> ExecApprovalRequirement {
let commands = parse_shell_lc_plain_commands(command).unwrap_or_else(|| vec![command.to_vec()]);
let heuristics_fallback = |cmd: &[String]| {
if requires_initial_appoval(approval_policy, sandbox_policy, cmd, sandbox_permissions) {
Decision::Prompt
} else {
Decision::Allow
}
};
let policy = exec_policy.read().await;
let evaluation = policy.check_multiple(commands.iter(), &heuristics_fallback);
let has_policy_allow = evaluation.matched_rules.iter().any(|rule_match| {
!matches!(rule_match, RuleMatch::HeuristicsRuleMatch { .. })
&& rule_match.decision() == Decision::Allow
});
match evaluation.decision {
Decision::Forbidden => ExecApprovalRequirement::Forbidden {
reason: FORBIDDEN_REASON.to_string(),
},
Decision::Prompt => {
if matches!(approval_policy, AskForApproval::Never) {
ExecApprovalRequirement::Forbidden {
reason: PROMPT_CONFLICT_REASON.to_string(),
}
} else {
ExecApprovalRequirement::NeedsApproval {
reason: derive_prompt_reason(&evaluation),
proposed_execpolicy_amendment: if features.enabled(Feature::ExecPolicy) {
proposed_execpolicy_amendment(&evaluation)
} else {
None
},
}
}
}
Decision::Allow => ExecApprovalRequirement::Skip {
bypass_sandbox: has_policy_allow,
},
}
}
@@ -195,6 +297,7 @@ mod tests {
use codex_protocol::protocol::SandboxPolicy;
use pretty_assertions::assert_eq;
use std::fs;
use std::sync::Arc;
use tempfile::tempdir;
#[tokio::test]
@@ -208,10 +311,19 @@ mod tests {
.expect("policy result");
let commands = [vec!["rm".to_string()]];
assert!(matches!(
policy.check_multiple(commands.iter()),
Evaluation::NoMatch { .. }
));
assert_eq!(
Evaluation {
decision: Decision::Allow,
matched_rules: vec![RuleMatch::HeuristicsRuleMatch {
command: vec!["rm".to_string()],
decision: Decision::Allow
}],
},
policy
.read()
.await
.check_multiple(commands.iter(), &|_| Decision::Allow)
);
assert!(!temp_dir.path().join(POLICY_DIR_NAME).exists());
}
@@ -242,10 +354,19 @@ mod tests {
.await
.expect("policy result");
let command = [vec!["rm".to_string()]];
assert!(matches!(
policy.check_multiple(command.iter()),
Evaluation::Match { .. }
));
assert_eq!(
Evaluation {
decision: Decision::Forbidden,
matched_rules: vec![RuleMatch::PrefixRuleMatch {
matched_prefix: vec!["rm".to_string()],
decision: Decision::Forbidden
}],
},
policy
.read()
.await
.check_multiple(command.iter(), &|_| Decision::Allow)
);
}
#[tokio::test]
@@ -261,14 +382,23 @@ mod tests {
.await
.expect("policy result");
let command = [vec!["ls".to_string()]];
assert!(matches!(
policy.check_multiple(command.iter()),
Evaluation::NoMatch { .. }
));
assert_eq!(
Evaluation {
decision: Decision::Allow,
matched_rules: vec![RuleMatch::HeuristicsRuleMatch {
command: vec!["ls".to_string()],
decision: Decision::Allow
}],
},
policy
.read()
.await
.check_multiple(command.iter(), &|_| Decision::Allow)
);
}
#[test]
fn evaluates_bash_lc_inner_commands() {
#[tokio::test]
async fn evaluates_bash_lc_inner_commands() {
let policy_src = r#"
prefix_rule(pattern=["rm"], decision="forbidden")
"#;
@@ -276,7 +406,7 @@ prefix_rule(pattern=["rm"], decision="forbidden")
parser
.parse("test.codexpolicy", policy_src)
.expect("parse policy");
let policy = parser.build();
let policy = Arc::new(RwLock::new(parser.build()));
let forbidden_script = vec![
"bash".to_string(),
@@ -284,86 +414,325 @@ prefix_rule(pattern=["rm"], decision="forbidden")
"rm -rf /tmp".to_string(),
];
let requirement =
evaluate_with_policy(&policy, &forbidden_script, AskForApproval::OnRequest)
.expect("expected match for forbidden command");
let requirement = create_exec_approval_requirement_for_command(
&policy,
&Features::with_defaults(),
&forbidden_script,
AskForApproval::OnRequest,
&SandboxPolicy::DangerFullAccess,
SandboxPermissions::UseDefault,
)
.await;
assert_eq!(
requirement,
ApprovalRequirement::Forbidden {
ExecApprovalRequirement::Forbidden {
reason: FORBIDDEN_REASON.to_string()
}
);
}
#[test]
fn approval_requirement_prefers_execpolicy_match() {
#[tokio::test]
async fn exec_approval_requirement_prefers_execpolicy_match() {
let policy_src = r#"prefix_rule(pattern=["rm"], decision="prompt")"#;
let mut parser = PolicyParser::new();
parser
.parse("test.codexpolicy", policy_src)
.expect("parse policy");
let policy = parser.build();
let policy = Arc::new(RwLock::new(parser.build()));
let command = vec!["rm".to_string()];
let requirement = create_approval_requirement_for_command(
let requirement = create_exec_approval_requirement_for_command(
&policy,
&Features::with_defaults(),
&command,
AskForApproval::OnRequest,
&SandboxPolicy::DangerFullAccess,
SandboxPermissions::UseDefault,
);
)
.await;
assert_eq!(
requirement,
ApprovalRequirement::NeedsApproval {
reason: Some(PROMPT_REASON.to_string())
ExecApprovalRequirement::NeedsApproval {
reason: Some(PROMPT_REASON.to_string()),
proposed_execpolicy_amendment: None,
}
);
}
#[test]
fn approval_requirement_respects_approval_policy() {
#[tokio::test]
async fn exec_approval_requirement_respects_approval_policy() {
let policy_src = r#"prefix_rule(pattern=["rm"], decision="prompt")"#;
let mut parser = PolicyParser::new();
parser
.parse("test.codexpolicy", policy_src)
.expect("parse policy");
let policy = parser.build();
let policy = Arc::new(RwLock::new(parser.build()));
let command = vec!["rm".to_string()];
let requirement = create_approval_requirement_for_command(
let requirement = create_exec_approval_requirement_for_command(
&policy,
&Features::with_defaults(),
&command,
AskForApproval::Never,
&SandboxPolicy::DangerFullAccess,
SandboxPermissions::UseDefault,
);
)
.await;
assert_eq!(
requirement,
ApprovalRequirement::Forbidden {
reason: PROMPT_REASON.to_string()
ExecApprovalRequirement::Forbidden {
reason: PROMPT_CONFLICT_REASON.to_string()
}
);
}
#[test]
fn approval_requirement_falls_back_to_heuristics() {
let command = vec!["python".to_string()];
#[tokio::test]
async fn exec_approval_requirement_falls_back_to_heuristics() {
let command = vec!["cargo".to_string(), "build".to_string()];
let empty_policy = Policy::empty();
let requirement = create_approval_requirement_for_command(
let empty_policy = Arc::new(RwLock::new(Policy::empty()));
let requirement = create_exec_approval_requirement_for_command(
&empty_policy,
&Features::with_defaults(),
&command,
AskForApproval::UnlessTrusted,
&SandboxPolicy::ReadOnly,
SandboxPermissions::UseDefault,
);
)
.await;
assert_eq!(
requirement,
ApprovalRequirement::NeedsApproval { reason: None }
ExecApprovalRequirement::NeedsApproval {
reason: None,
proposed_execpolicy_amendment: Some(ExecPolicyAmendment::new(command))
}
);
}
#[tokio::test]
async fn heuristics_apply_when_other_commands_match_policy() {
let policy_src = r#"prefix_rule(pattern=["apple"], decision="allow")"#;
let mut parser = PolicyParser::new();
parser
.parse("test.codexpolicy", policy_src)
.expect("parse policy");
let policy = Arc::new(RwLock::new(parser.build()));
let command = vec![
"bash".to_string(),
"-lc".to_string(),
"apple | orange".to_string(),
];
assert_eq!(
create_exec_approval_requirement_for_command(
&policy,
&Features::with_defaults(),
&command,
AskForApproval::UnlessTrusted,
&SandboxPolicy::DangerFullAccess,
SandboxPermissions::UseDefault,
)
.await,
ExecApprovalRequirement::NeedsApproval {
reason: None,
proposed_execpolicy_amendment: Some(ExecPolicyAmendment::new(vec![
"orange".to_string()
]))
}
);
}
#[tokio::test]
async fn append_execpolicy_amendment_updates_policy_and_file() {
let codex_home = tempdir().expect("create temp dir");
let current_policy = Arc::new(RwLock::new(Policy::empty()));
let prefix = vec!["echo".to_string(), "hello".to_string()];
append_execpolicy_amendment_and_update(codex_home.path(), &current_policy, &prefix)
.await
.expect("update policy");
let evaluation = current_policy.read().await.check(
&["echo".to_string(), "hello".to_string(), "world".to_string()],
&|_| Decision::Allow,
);
assert!(matches!(
evaluation,
Evaluation {
decision: Decision::Allow,
..
}
));
let contents = fs::read_to_string(default_policy_path(codex_home.path()))
.expect("policy file should have been created");
assert_eq!(
contents,
r#"prefix_rule(pattern=["echo", "hello"], decision="allow")
"#
);
}
#[tokio::test]
async fn append_execpolicy_amendment_rejects_empty_prefix() {
let codex_home = tempdir().expect("create temp dir");
let current_policy = Arc::new(RwLock::new(Policy::empty()));
let result =
append_execpolicy_amendment_and_update(codex_home.path(), &current_policy, &[]).await;
assert!(matches!(
result,
Err(ExecPolicyUpdateError::AppendRule {
source: AmendError::EmptyPrefix,
..
})
));
}
#[tokio::test]
async fn proposed_execpolicy_amendment_is_present_for_single_command_without_policy_match() {
let command = vec!["cargo".to_string(), "build".to_string()];
let empty_policy = Arc::new(RwLock::new(Policy::empty()));
let requirement = create_exec_approval_requirement_for_command(
&empty_policy,
&Features::with_defaults(),
&command,
AskForApproval::UnlessTrusted,
&SandboxPolicy::ReadOnly,
SandboxPermissions::UseDefault,
)
.await;
assert_eq!(
requirement,
ExecApprovalRequirement::NeedsApproval {
reason: None,
proposed_execpolicy_amendment: Some(ExecPolicyAmendment::new(command))
}
);
}
#[tokio::test]
async fn proposed_execpolicy_amendment_is_disabled_when_execpolicy_feature_disabled() {
let command = vec!["cargo".to_string(), "build".to_string()];
let mut features = Features::with_defaults();
features.disable(Feature::ExecPolicy);
let requirement = create_exec_approval_requirement_for_command(
&Arc::new(RwLock::new(Policy::empty())),
&features,
&command,
AskForApproval::UnlessTrusted,
&SandboxPolicy::ReadOnly,
SandboxPermissions::UseDefault,
)
.await;
assert_eq!(
requirement,
ExecApprovalRequirement::NeedsApproval {
reason: None,
proposed_execpolicy_amendment: None,
}
);
}
#[tokio::test]
async fn proposed_execpolicy_amendment_is_omitted_when_policy_prompts() {
let policy_src = r#"prefix_rule(pattern=["rm"], decision="prompt")"#;
let mut parser = PolicyParser::new();
parser
.parse("test.codexpolicy", policy_src)
.expect("parse policy");
let policy = Arc::new(RwLock::new(parser.build()));
let command = vec!["rm".to_string()];
let requirement = create_exec_approval_requirement_for_command(
&policy,
&Features::with_defaults(),
&command,
AskForApproval::OnRequest,
&SandboxPolicy::DangerFullAccess,
SandboxPermissions::UseDefault,
)
.await;
assert_eq!(
requirement,
ExecApprovalRequirement::NeedsApproval {
reason: Some(PROMPT_REASON.to_string()),
proposed_execpolicy_amendment: None,
}
);
}
#[tokio::test]
async fn proposed_execpolicy_amendment_is_present_for_multi_command_scripts() {
let command = vec![
"bash".to_string(),
"-lc".to_string(),
"cargo build && echo ok".to_string(),
];
let requirement = create_exec_approval_requirement_for_command(
&Arc::new(RwLock::new(Policy::empty())),
&Features::with_defaults(),
&command,
AskForApproval::UnlessTrusted,
&SandboxPolicy::ReadOnly,
SandboxPermissions::UseDefault,
)
.await;
assert_eq!(
requirement,
ExecApprovalRequirement::NeedsApproval {
reason: None,
proposed_execpolicy_amendment: Some(ExecPolicyAmendment::new(vec![
"cargo".to_string(),
"build".to_string()
])),
}
);
}
#[tokio::test]
async fn proposed_execpolicy_amendment_uses_first_no_match_in_multi_command_scripts() {
let policy_src = r#"prefix_rule(pattern=["cat"], decision="allow")"#;
let mut parser = PolicyParser::new();
parser
.parse("test.codexpolicy", policy_src)
.expect("parse policy");
let policy = Arc::new(RwLock::new(parser.build()));
let command = vec![
"bash".to_string(),
"-lc".to_string(),
"cat && apple".to_string(),
];
assert_eq!(
create_exec_approval_requirement_for_command(
&policy,
&Features::with_defaults(),
&command,
AskForApproval::UnlessTrusted,
&SandboxPolicy::ReadOnly,
SandboxPermissions::UseDefault,
)
.await,
ExecApprovalRequirement::NeedsApproval {
reason: None,
proposed_execpolicy_amendment: Some(ExecPolicyAmendment::new(vec![
"apple".to_string()
])),
}
);
}
}

View File

@@ -27,16 +27,23 @@ pub enum Stage {
/// Unique features toggled via configuration.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub enum Feature {
// Stable.
/// Create a ghost commit at each turn.
GhostCommit,
/// Include the view_image tool.
ViewImageTool,
/// Send warnings to the model to correct its tool usage.
ModelWarnings,
/// Enable the default shell tool.
ShellTool,
// Experimental
/// Use the single unified PTY-backed exec tool.
UnifiedExec,
/// Enable experimental RMCP features such as OAuth login.
RmcpClient,
/// Include the freeform apply_patch tool.
ApplyPatchFreeform,
/// Include the view_image tool.
ViewImageTool,
/// Allow the model to request web searches.
WebSearchRequest,
/// Gate the execpolicy enforcement for shell/unified exec.
@@ -47,10 +54,10 @@ pub enum Feature {
WindowsSandbox,
/// Remote compaction enabled (only for ChatGPT auth)
RemoteCompaction,
/// Enable the default shell tool.
ShellTool,
/// Allow model to call multiple tools in parallel (only for models supporting it).
ParallelToolCalls,
/// Experimental skills injection (CLI flag-driven).
Skills,
}
impl Feature {
@@ -265,6 +272,18 @@ pub const FEATURES: &[FeatureSpec] = &[
stage: Stage::Stable,
default_enabled: true,
},
FeatureSpec {
id: Feature::ShellTool,
key: "shell_tool",
stage: Stage::Stable,
default_enabled: true,
},
FeatureSpec {
id: Feature::ModelWarnings,
key: "warnings",
stage: Stage::Stable,
default_enabled: true,
},
// Unstable features.
FeatureSpec {
id: Feature::UnifiedExec,
@@ -321,9 +340,9 @@ pub const FEATURES: &[FeatureSpec] = &[
default_enabled: false,
},
FeatureSpec {
id: Feature::ShellTool,
key: "shell_tool",
stage: Stage::Stable,
default_enabled: true,
id: Feature::Skills,
key: "skills",
stage: Stage::Experimental,
default_enabled: false,
},
];
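
A sketch of how these flags are consumed, using only Features calls that appear elsewhere in this diff (`with_defaults`, `disable`, `enabled`):

let mut features = Features::with_defaults();
assert!(features.enabled(Feature::ShellTool));   // stable, on by default
assert!(!features.enabled(Feature::Skills));     // experimental, off by default
features.disable(Feature::ExecPolicy);           // opt out of execpolicy gating
assert!(!features.enabled(Feature::ExecPolicy));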

View File

@@ -2,6 +2,7 @@ use std::collections::HashSet;
use std::path::Path;
use std::path::PathBuf;
use crate::util::resolve_path;
use codex_app_server_protocol::GitSha;
use codex_protocol::protocol::GitInfo;
use futures::future::join_all;
@@ -131,11 +132,15 @@ pub async fn recent_commits(cwd: &Path, limit: usize) -> Vec<CommitLogEntry> {
}
let fmt = "%H%x1f%ct%x1f%s"; // <sha> <US> <commit_time> <US> <subject>
let n = limit.max(1).to_string();
let Some(log_out) =
run_git_command_with_timeout(&["log", "-n", &n, &format!("--pretty=format:{fmt}")], cwd)
.await
else {
let limit_arg = (limit > 0).then(|| limit.to_string());
let mut args: Vec<String> = vec!["log".to_string()];
if let Some(n) = &limit_arg {
args.push("-n".to_string());
args.push(n.clone());
}
args.push(format!("--pretty=format:{fmt}"));
let arg_refs: Vec<&str> = args.iter().map(String::as_str).collect();
let Some(log_out) = run_git_command_with_timeout(&arg_refs, cwd).await else {
return Vec::new();
};
if !log_out.status.success() {
@@ -544,11 +549,7 @@ pub fn resolve_root_git_project_for_trust(cwd: &Path) -> Option<PathBuf> {
.trim()
.to_string();
let git_dir_path_raw = if Path::new(&git_dir_s).is_absolute() {
PathBuf::from(&git_dir_s)
} else {
base.join(&git_dir_s)
};
let git_dir_path_raw = resolve_path(base, &PathBuf::from(&git_dir_s));
// Normalize to handle macOS /var vs /private/var and resolve ".." segments.
let git_dir_path = std::fs::canonicalize(&git_dir_path_raw).unwrap_or(git_dir_path_raw);
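
One subtle change in `recent_commits` above: a zero `limit` used to be clamped to 1 and now omits `-n` entirely. A behavior sketch (`repo_root: &Path` is assumed):

// limit == 0: the `-n` flag is omitted, so the full log comes back.
let full_history = recent_commits(repo_root, 0).await;
// limit > 0: behaves as before, i.e. `git log -n 5 --pretty=...`.
let latest_five = recent_commits(repo_root, 5).await;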

View File

@@ -32,6 +32,7 @@ pub mod git_info;
pub mod landlock;
pub mod mcp;
mod mcp_connection_manager;
pub mod openai_models;
pub use mcp_connection_manager::MCP_SANDBOX_STATE_CAPABILITY;
pub use mcp_connection_manager::MCP_SANDBOX_STATE_NOTIFICATION;
pub use mcp_connection_manager::SandboxState;
@@ -40,8 +41,8 @@ mod message_history;
mod model_provider_info;
pub mod parse_command;
pub mod powershell;
mod response_processing;
pub mod sandboxing;
mod stream_events_utils;
mod text_encoding;
pub mod token_data;
mod truncate;
@@ -58,6 +59,7 @@ pub use model_provider_info::create_oss_provider_with_base_url;
mod conversation_manager;
mod event_mapping;
pub mod review_format;
pub mod review_prompts;
pub use codex_protocol::protocol::InitialHistory;
pub use conversation_manager::ConversationManager;
pub use conversation_manager::NewConversation;
@@ -65,13 +67,13 @@ pub use conversation_manager::NewConversation;
pub use auth::AuthManager;
pub use auth::CodexAuth;
pub mod default_client;
pub mod model_family;
mod openai_model_info;
pub mod project_doc;
mod rollout;
pub(crate) mod safety;
pub mod seatbelt;
pub mod shell;
pub mod skills;
pub mod spawn;
pub mod terminal;
mod tools;

View File

@@ -1 +1,168 @@
pub mod auth;
use std::collections::HashMap;
use async_channel::unbounded;
use codex_protocol::protocol::McpListToolsResponseEvent;
use mcp_types::Tool as McpTool;
use tokio_util::sync::CancellationToken;
use crate::config::Config;
use crate::mcp::auth::compute_auth_statuses;
use crate::mcp_connection_manager::McpConnectionManager;
const MCP_TOOL_NAME_PREFIX: &str = "mcp";
const MCP_TOOL_NAME_DELIMITER: &str = "__";
pub async fn collect_mcp_snapshot(config: &Config) -> McpListToolsResponseEvent {
if config.mcp_servers.is_empty() {
return McpListToolsResponseEvent {
tools: HashMap::new(),
resources: HashMap::new(),
resource_templates: HashMap::new(),
auth_statuses: HashMap::new(),
};
}
let auth_status_entries = compute_auth_statuses(
config.mcp_servers.iter(),
config.mcp_oauth_credentials_store_mode,
)
.await;
let mut mcp_connection_manager = McpConnectionManager::default();
let (tx_event, rx_event) = unbounded();
drop(rx_event);
let cancel_token = CancellationToken::new();
mcp_connection_manager
.initialize(
config.mcp_servers.clone(),
config.mcp_oauth_credentials_store_mode,
auth_status_entries.clone(),
tx_event,
cancel_token.clone(),
)
.await;
let snapshot =
collect_mcp_snapshot_from_manager(&mcp_connection_manager, auth_status_entries).await;
cancel_token.cancel();
snapshot
}
pub fn split_qualified_tool_name(qualified_name: &str) -> Option<(String, String)> {
let mut parts = qualified_name.split(MCP_TOOL_NAME_DELIMITER);
let prefix = parts.next()?;
if prefix != MCP_TOOL_NAME_PREFIX {
return None;
}
let server_name = parts.next()?;
let tool_name: String = parts.collect::<Vec<_>>().join(MCP_TOOL_NAME_DELIMITER);
if tool_name.is_empty() {
return None;
}
Some((server_name.to_string(), tool_name))
}
pub fn group_tools_by_server(
tools: &HashMap<String, McpTool>,
) -> HashMap<String, HashMap<String, McpTool>> {
let mut grouped = HashMap::new();
for (qualified_name, tool) in tools {
if let Some((server_name, tool_name)) = split_qualified_tool_name(qualified_name) {
grouped
.entry(server_name)
.or_insert_with(HashMap::new)
.insert(tool_name, tool.clone());
}
}
grouped
}
pub(crate) async fn collect_mcp_snapshot_from_manager(
mcp_connection_manager: &McpConnectionManager,
auth_status_entries: HashMap<String, crate::mcp::auth::McpAuthStatusEntry>,
) -> McpListToolsResponseEvent {
let (tools, resources, resource_templates) = tokio::join!(
mcp_connection_manager.list_all_tools(),
mcp_connection_manager.list_all_resources(),
mcp_connection_manager.list_all_resource_templates(),
);
let auth_statuses = auth_status_entries
.iter()
.map(|(name, entry)| (name.clone(), entry.auth_status))
.collect();
McpListToolsResponseEvent {
tools: tools
.into_iter()
.map(|(name, tool)| (name, tool.tool))
.collect(),
resources,
resource_templates,
auth_statuses,
}
}
#[cfg(test)]
mod tests {
use super::*;
use mcp_types::ToolInputSchema;
use pretty_assertions::assert_eq;
fn make_tool(name: &str) -> McpTool {
McpTool {
annotations: None,
description: None,
input_schema: ToolInputSchema {
properties: None,
required: None,
r#type: "object".to_string(),
},
name: name.to_string(),
output_schema: None,
title: None,
}
}
#[test]
fn split_qualified_tool_name_returns_server_and_tool() {
assert_eq!(
split_qualified_tool_name("mcp__alpha__do_thing"),
Some(("alpha".to_string(), "do_thing".to_string()))
);
}
#[test]
fn split_qualified_tool_name_rejects_invalid_names() {
assert_eq!(split_qualified_tool_name("other__alpha__do_thing"), None);
assert_eq!(split_qualified_tool_name("mcp__alpha__"), None);
}
#[test]
fn group_tools_by_server_strips_prefix_and_groups() {
let mut tools = HashMap::new();
tools.insert("mcp__alpha__do_thing".to_string(), make_tool("do_thing"));
tools.insert(
"mcp__alpha__nested__op".to_string(),
make_tool("nested__op"),
);
tools.insert("mcp__beta__do_other".to_string(), make_tool("do_other"));
let mut expected_alpha = HashMap::new();
expected_alpha.insert("do_thing".to_string(), make_tool("do_thing"));
expected_alpha.insert("nested__op".to_string(), make_tool("nested__op"));
let mut expected_beta = HashMap::new();
expected_beta.insert("do_other".to_string(), make_tool("do_other"));
let mut expected = HashMap::new();
expected.insert("alpha".to_string(), expected_alpha);
expected.insert("beta".to_string(), expected_beta);
assert_eq!(group_tools_by_server(&tools), expected);
}
}
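The delimiter handling above is worth seeing end to end: the `__` delimiter may also occur inside the tool segment, so everything after the server name is re-joined. A standalone sketch with the split function copied from the module (the server and tool names are illustrative):

```rust
const MCP_TOOL_NAME_PREFIX: &str = "mcp";
const MCP_TOOL_NAME_DELIMITER: &str = "__";

// Copied from the module above.
fn split_qualified_tool_name(qualified_name: &str) -> Option<(String, String)> {
    let mut parts = qualified_name.split(MCP_TOOL_NAME_DELIMITER);
    let prefix = parts.next()?;
    if prefix != MCP_TOOL_NAME_PREFIX {
        return None;
    }
    let server_name = parts.next()?;
    let tool_name: String = parts.collect::<Vec<_>>().join(MCP_TOOL_NAME_DELIMITER);
    if tool_name.is_empty() {
        return None;
    }
    Some((server_name.to_string(), tool_name))
}

fn main() {
    // "docs" and "search__pages" are illustrative; the point is that a
    // delimiter inside the tool segment survives the round trip.
    assert_eq!(
        split_qualified_tool_name("mcp__docs__search__pages"),
        Some(("docs".to_string(), "search__pages".to_string()))
    );
}
```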

View File

@@ -12,7 +12,6 @@ use std::env;
use std::ffi::OsString;
use std::path::PathBuf;
use std::sync::Arc;
use std::sync::Mutex;
use std::time::Duration;
use crate::mcp::auth::McpAuthStatusEntry;
@@ -55,6 +54,7 @@ use serde::Serialize;
use serde_json::json;
use sha1::Digest;
use sha1::Sha1;
use tokio::sync::Mutex;
use tokio::sync::oneshot;
use tokio::task::JoinSet;
use tokio_util::sync::CancellationToken;
@@ -128,7 +128,7 @@ struct ElicitationRequestManager {
}
impl ElicitationRequestManager {
fn resolve(
async fn resolve(
&self,
server_name: String,
id: RequestId,
@@ -136,7 +136,7 @@ impl ElicitationRequestManager {
) -> Result<()> {
self.requests
.lock()
.map_err(|e| anyhow!("failed to lock elicitation requests: {e:?}"))?
.await
.remove(&(server_name, id))
.ok_or_else(|| anyhow!("elicitation request not found"))?
.send(response)
@@ -151,7 +151,8 @@ impl ElicitationRequestManager {
let server_name = server_name.clone();
async move {
let (tx, rx) = oneshot::channel();
if let Ok(mut lock) = elicitation_requests.lock() {
{
let mut lock = elicitation_requests.lock().await;
lock.insert((server_name.clone(), id.clone()), tx);
}
let _ = tx_event
@@ -365,13 +366,15 @@ impl McpConnectionManager {
.context("failed to get client")
}
pub fn resolve_elicitation(
pub async fn resolve_elicitation(
&self,
server_name: String,
id: RequestId,
response: ElicitationResponse,
) -> Result<()> {
self.elicitation_requests.resolve(server_name, id, response)
self.elicitation_requests
.resolve(server_name, id, response)
.await
}
/// Returns a single map that contains all tools. Each key is the
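The swap from `std::sync::Mutex` to `tokio::sync::Mutex` is what lets `resolve` become `async`: the lock is acquired with `.await`, so a task yields rather than blocking a runtime worker while holding the map, and a tokio mutex cannot be poisoned, which is why the `map_err` branch on the lock disappears. A minimal self-contained sketch of the pattern (the key and payload types are illustrative stand-ins for `(String, RequestId)` and `ElicitationResponse`):

```rust
use std::collections::HashMap;
use tokio::sync::Mutex;
use tokio::sync::oneshot;

// Illustrative stand-in for (server_name, RequestId).
type Key = (String, u64);

struct Requests {
    inner: Mutex<HashMap<Key, oneshot::Sender<String>>>,
}

impl Requests {
    // `lock().await` yields to the scheduler instead of blocking a worker
    // thread, and a tokio Mutex cannot be poisoned, so no `map_err` is
    // needed on the lock itself.
    async fn resolve(&self, key: Key, response: String) -> Result<(), &'static str> {
        self.inner
            .lock()
            .await
            .remove(&key)
            .ok_or("elicitation request not found")?
            .send(response)
            .map_err(|_| "responder dropped")
    }
}

#[tokio::main]
async fn main() {
    let requests = Requests { inner: Mutex::new(HashMap::new()) };
    let (tx, rx) = oneshot::channel();
    requests.inner.lock().await.insert(("server".to_string(), 1), tx);
    requests
        .resolve(("server".to_string(), 1), "ok".to_string())
        .await
        .unwrap();
    assert_eq!(rx.await.unwrap(), "ok");
}
```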

View File

@@ -16,8 +16,14 @@
use std::fs::File;
use std::fs::OpenOptions;
use std::io::BufRead;
use std::io::BufReader;
use std::io::Read;
use std::io::Result;
use std::io::Seek;
use std::io::SeekFrom;
use std::io::Write;
use std::path::Path;
use std::path::PathBuf;
use serde::Deserialize;
@@ -39,10 +45,13 @@ use std::os::unix::fs::PermissionsExt;
/// Filename that stores the message history inside `~/.codex`.
const HISTORY_FILENAME: &str = "history.jsonl";
/// When history exceeds the hard cap, trim it down to this fraction of `max_bytes`.
const HISTORY_SOFT_CAP_RATIO: f64 = 0.8;
const MAX_RETRIES: usize = 10;
const RETRY_SLEEP: Duration = Duration::from_millis(100);
#[derive(Serialize, Deserialize, Debug, Clone)]
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq)]
pub struct HistoryEntry {
pub session_id: String,
pub ts: u64,
@@ -97,11 +106,12 @@ pub(crate) async fn append_entry(
.map_err(|e| std::io::Error::other(format!("failed to serialise history entry: {e}")))?;
line.push('\n');
// Open in append-only mode.
// Open the history file for read/write access (append-only on Unix).
let mut options = OpenOptions::new();
options.append(true).read(true).create(true);
options.read(true).write(true).create(true);
#[cfg(unix)]
{
options.append(true);
options.mode(0o600);
}
@@ -110,6 +120,8 @@ pub(crate) async fn append_entry(
// Ensure permissions.
ensure_owner_only_permissions(&history_file).await?;
let history_max_bytes = config.history.max_bytes;
// Perform a blocking write under an advisory write lock using std::fs.
tokio::task::spawn_blocking(move || -> Result<()> {
// Retry a few times to avoid indefinite blocking when contended.
@@ -117,8 +129,12 @@ pub(crate) async fn append_entry(
match history_file.try_lock() {
Ok(()) => {
// While holding the exclusive lock, write the full line.
// We do not open the file with `append(true)` on Windows, so ensure the
// cursor is positioned at the end before writing.
history_file.seek(SeekFrom::End(0))?;
history_file.write_all(line.as_bytes())?;
history_file.flush()?;
enforce_history_limit(&mut history_file, history_max_bytes)?;
return Ok(());
}
Err(std::fs::TryLockError::WouldBlock) => {
@@ -138,27 +154,144 @@ pub(crate) async fn append_entry(
Ok(())
}
/// Trim the history file to honor `max_bytes`, dropping the oldest lines while holding
/// the write lock so the newest entry is always retained. When the file exceeds the
/// hard cap, it rewrites the remaining tail to a soft cap to avoid trimming again
/// immediately on the next write.
fn enforce_history_limit(file: &mut File, max_bytes: Option<usize>) -> Result<()> {
let Some(max_bytes) = max_bytes else {
return Ok(());
};
if max_bytes == 0 {
return Ok(());
}
let max_bytes = match u64::try_from(max_bytes) {
Ok(value) => value,
Err(_) => return Ok(()),
};
let mut current_len = file.metadata()?.len();
if current_len <= max_bytes {
return Ok(());
}
let mut reader_file = file.try_clone()?;
reader_file.seek(SeekFrom::Start(0))?;
let mut buf_reader = BufReader::new(reader_file);
let mut line_lengths = Vec::new();
let mut line_buf = String::new();
loop {
line_buf.clear();
let bytes = buf_reader.read_line(&mut line_buf)?;
if bytes == 0 {
break;
}
line_lengths.push(bytes as u64);
}
if line_lengths.is_empty() {
return Ok(());
}
let last_index = line_lengths.len() - 1;
let trim_target = trim_target_bytes(max_bytes, line_lengths[last_index]);
let mut drop_bytes = 0u64;
let mut idx = 0usize;
while current_len > trim_target && idx < last_index {
current_len = current_len.saturating_sub(line_lengths[idx]);
drop_bytes += line_lengths[idx];
idx += 1;
}
if drop_bytes == 0 {
return Ok(());
}
let mut reader = buf_reader.into_inner();
reader.seek(SeekFrom::Start(drop_bytes))?;
let capacity = usize::try_from(current_len).unwrap_or(0);
let mut tail = Vec::with_capacity(capacity);
reader.read_to_end(&mut tail)?;
file.set_len(0)?;
file.seek(SeekFrom::Start(0))?;
file.write_all(&tail)?;
file.flush()?;
Ok(())
}
fn trim_target_bytes(max_bytes: u64, newest_entry_len: u64) -> u64 {
let soft_cap_bytes = ((max_bytes as f64) * HISTORY_SOFT_CAP_RATIO)
.floor()
.clamp(1.0, max_bytes as f64) as u64;
soft_cap_bytes.max(newest_entry_len)
}
/// Asynchronously fetch the history file's *identifier* (inode on Unix) and
/// the current number of entries by counting newline characters.
pub(crate) async fn history_metadata(config: &Config) -> (u64, usize) {
let path = history_filepath(config);
history_metadata_for_file(&path).await
}
#[cfg(unix)]
let log_id = {
use std::os::unix::fs::MetadataExt;
// Obtain metadata (async) to get the identifier.
let meta = match fs::metadata(&path).await {
Ok(m) => m,
Err(e) if e.kind() == std::io::ErrorKind::NotFound => return (0, 0),
Err(_) => return (0, 0),
};
meta.ino()
/// Given a `log_id` (on Unix this is the file's inode number,
/// on Windows this is the file's creation time) and a zero-based
/// `offset`, return the corresponding `HistoryEntry` if the identifier matches
/// the current history file **and** the requested offset exists. Any I/O or
/// parsing errors are logged and result in `None`.
///
/// Note this function is not async because it uses a sync advisory file
/// locking API.
pub(crate) fn lookup(log_id: u64, offset: usize, config: &Config) -> Option<HistoryEntry> {
let path = history_filepath(config);
lookup_history_entry(&path, log_id, offset)
}
/// On Unix systems, ensure the file permissions are `0o600` (rw-------). If the
/// permissions cannot be changed, the error is propagated to the caller.
#[cfg(unix)]
async fn ensure_owner_only_permissions(file: &File) -> Result<()> {
let metadata = file.metadata()?;
let current_mode = metadata.permissions().mode() & 0o777;
if current_mode != 0o600 {
let mut perms = metadata.permissions();
perms.set_mode(0o600);
let perms_clone = perms.clone();
let file_clone = file.try_clone()?;
tokio::task::spawn_blocking(move || file_clone.set_permissions(perms_clone)).await??;
}
Ok(())
}
#[cfg(windows)]
// On Windows, simply succeed.
async fn ensure_owner_only_permissions(_file: &File) -> Result<()> {
Ok(())
}
async fn history_metadata_for_file(path: &Path) -> (u64, usize) {
let log_id = match fs::metadata(path).await {
Ok(metadata) => history_log_id(&metadata).unwrap_or(0),
Err(e) if e.kind() == std::io::ErrorKind::NotFound => return (0, 0),
Err(_) => return (0, 0),
};
#[cfg(not(unix))]
let log_id = 0u64;
// Open the file.
let mut file = match fs::File::open(&path).await {
let mut file = match fs::File::open(path).await {
Ok(f) => f,
Err(_) => return (log_id, 0),
};
@@ -179,21 +312,11 @@ pub(crate) async fn history_metadata(config: &Config) -> (u64, usize) {
(log_id, count)
}
/// Given a `log_id` (on Unix this is the file's inode number) and a zero-based
/// `offset`, return the corresponding `HistoryEntry` if the identifier matches
/// the current history file **and** the requested offset exists. Any I/O or
/// parsing errors are logged and result in `None`.
///
/// Note this function is not async because it uses a sync advisory file
/// locking API.
#[cfg(unix)]
pub(crate) fn lookup(log_id: u64, offset: usize, config: &Config) -> Option<HistoryEntry> {
fn lookup_history_entry(path: &Path, log_id: u64, offset: usize) -> Option<HistoryEntry> {
use std::io::BufRead;
use std::io::BufReader;
use std::os::unix::fs::MetadataExt;
let path = history_filepath(config);
let file: File = match OpenOptions::new().read(true).open(&path) {
let file: File = match OpenOptions::new().read(true).open(path) {
Ok(f) => f,
Err(e) => {
tracing::warn!(error = %e, "failed to open history file");
@@ -209,7 +332,9 @@ pub(crate) fn lookup(log_id: u64, offset: usize, config: &Config) -> Option<Hist
}
};
if metadata.ino() != log_id {
let current_log_id = history_log_id(&metadata)?;
if log_id != 0 && current_log_id != log_id {
return None;
}
@@ -256,31 +381,238 @@ pub(crate) fn lookup(log_id: u64, offset: usize, config: &Config) -> Option<Hist
None
}
/// Fallback stub for non-Unix systems: currently always returns `None`.
#[cfg(not(unix))]
pub(crate) fn lookup(log_id: u64, offset: usize, config: &Config) -> Option<HistoryEntry> {
let _ = (log_id, offset, config);
#[cfg(unix)]
fn history_log_id(metadata: &std::fs::Metadata) -> Option<u64> {
use std::os::unix::fs::MetadataExt;
Some(metadata.ino())
}
#[cfg(windows)]
fn history_log_id(metadata: &std::fs::Metadata) -> Option<u64> {
use std::os::windows::fs::MetadataExt;
Some(metadata.creation_time())
}
#[cfg(not(any(unix, windows)))]
fn history_log_id(_metadata: &std::fs::Metadata) -> Option<u64> {
None
}
/// On Unix systems ensure the file permissions are `0o600` (rw-------). If the
/// permissions cannot be changed, the error is propagated to the caller.
#[cfg(unix)]
async fn ensure_owner_only_permissions(file: &File) -> Result<()> {
let metadata = file.metadata()?;
let current_mode = metadata.permissions().mode() & 0o777;
if current_mode != 0o600 {
let mut perms = metadata.permissions();
perms.set_mode(0o600);
let perms_clone = perms.clone();
let file_clone = file.try_clone()?;
tokio::task::spawn_blocking(move || file_clone.set_permissions(perms_clone)).await??;
}
Ok(())
}
#[cfg(test)]
mod tests {
use super::*;
use crate::config::Config;
use crate::config::ConfigOverrides;
use crate::config::ConfigToml;
use codex_protocol::ConversationId;
use pretty_assertions::assert_eq;
use std::fs::File;
use std::io::Write;
use tempfile::TempDir;
#[cfg(not(unix))]
async fn ensure_owner_only_permissions(_file: &File) -> Result<()> {
// For now, on non-Unix, simply succeed.
Ok(())
#[tokio::test]
async fn lookup_reads_history_entries() {
let temp_dir = TempDir::new().expect("create temp dir");
let history_path = temp_dir.path().join(HISTORY_FILENAME);
let entries = vec![
HistoryEntry {
session_id: "first-session".to_string(),
ts: 1,
text: "first".to_string(),
},
HistoryEntry {
session_id: "second-session".to_string(),
ts: 2,
text: "second".to_string(),
},
];
let mut file = File::create(&history_path).expect("create history file");
for entry in &entries {
writeln!(
file,
"{}",
serde_json::to_string(entry).expect("serialize history entry")
)
.expect("write history entry");
}
let (log_id, count) = history_metadata_for_file(&history_path).await;
assert_eq!(count, entries.len());
let second_entry =
lookup_history_entry(&history_path, log_id, 1).expect("fetch second history entry");
assert_eq!(second_entry, entries[1]);
}
#[tokio::test]
async fn lookup_uses_stable_log_id_after_appends() {
let temp_dir = TempDir::new().expect("create temp dir");
let history_path = temp_dir.path().join(HISTORY_FILENAME);
let initial = HistoryEntry {
session_id: "first-session".to_string(),
ts: 1,
text: "first".to_string(),
};
let appended = HistoryEntry {
session_id: "second-session".to_string(),
ts: 2,
text: "second".to_string(),
};
let mut file = File::create(&history_path).expect("create history file");
writeln!(
file,
"{}",
serde_json::to_string(&initial).expect("serialize initial entry")
)
.expect("write initial entry");
let (log_id, count) = history_metadata_for_file(&history_path).await;
assert_eq!(count, 1);
let mut append = std::fs::OpenOptions::new()
.append(true)
.open(&history_path)
.expect("open history file for append");
writeln!(
append,
"{}",
serde_json::to_string(&appended).expect("serialize appended entry")
)
.expect("append history entry");
let fetched =
lookup_history_entry(&history_path, log_id, 1).expect("lookup appended history entry");
assert_eq!(fetched, appended);
}
#[tokio::test]
async fn append_entry_trims_history_when_beyond_max_bytes() {
let codex_home = TempDir::new().expect("create temp dir");
let mut config = Config::load_from_base_config_with_overrides(
ConfigToml::default(),
ConfigOverrides::default(),
codex_home.path().to_path_buf(),
)
.expect("load config");
let conversation_id = ConversationId::new();
let entry_one = "a".repeat(200);
let entry_two = "b".repeat(200);
let history_path = codex_home.path().join("history.jsonl");
append_entry(&entry_one, &conversation_id, &config)
.await
.expect("write first entry");
let first_len = std::fs::metadata(&history_path).expect("metadata").len();
let limit_bytes = first_len + 10;
config.history.max_bytes =
Some(usize::try_from(limit_bytes).expect("limit should fit into usize"));
append_entry(&entry_two, &conversation_id, &config)
.await
.expect("write second entry");
let contents = std::fs::read_to_string(&history_path).expect("read history");
let entries = contents
.lines()
.map(|line| serde_json::from_str::<HistoryEntry>(line).expect("parse entry"))
.collect::<Vec<HistoryEntry>>();
assert_eq!(
entries.len(),
1,
"only one entry left because entry_one should be evicted"
);
assert_eq!(entries[0].text, entry_two);
assert!(std::fs::metadata(&history_path).expect("metadata").len() <= limit_bytes);
}
#[tokio::test]
async fn append_entry_trims_history_to_soft_cap() {
let codex_home = TempDir::new().expect("create temp dir");
let mut config = Config::load_from_base_config_with_overrides(
ConfigToml::default(),
ConfigOverrides::default(),
codex_home.path().to_path_buf(),
)
.expect("load config");
let conversation_id = ConversationId::new();
let short_entry = "a".repeat(200);
let long_entry = "b".repeat(400);
let history_path = codex_home.path().join("history.jsonl");
append_entry(&short_entry, &conversation_id, &config)
.await
.expect("write first entry");
let short_entry_len = std::fs::metadata(&history_path).expect("metadata").len();
append_entry(&long_entry, &conversation_id, &config)
.await
.expect("write second entry");
let two_entry_len = std::fs::metadata(&history_path).expect("metadata").len();
let long_entry_len = two_entry_len
.checked_sub(short_entry_len)
.expect("second entry length should be larger than first entry length");
config.history.max_bytes = Some(
usize::try_from((2 * long_entry_len) + (short_entry_len / 2))
.expect("max bytes should fit into usize"),
);
append_entry(&long_entry, &conversation_id, &config)
.await
.expect("write third entry");
let contents = std::fs::read_to_string(&history_path).expect("read history");
let entries = contents
.lines()
.map(|line| serde_json::from_str::<HistoryEntry>(line).expect("parse entry"))
.collect::<Vec<HistoryEntry>>();
assert_eq!(entries.len(), 1);
assert_eq!(entries[0].text, long_entry);
let pruned_len = std::fs::metadata(&history_path).expect("metadata").len();
let max_bytes = config
.history
.max_bytes
.expect("max bytes should be configured") as u64;
assert!(pruned_len <= max_bytes);
let soft_cap_bytes = ((max_bytes as f64) * HISTORY_SOFT_CAP_RATIO)
.floor()
.clamp(1.0, max_bytes as f64) as u64;
let len_without_first = 2 * long_entry_len;
assert!(
len_without_first <= max_bytes,
"dropping only the first entry would satisfy the hard cap"
);
assert!(
len_without_first > soft_cap_bytes,
"soft cap should require more aggressive trimming than the hard cap"
);
assert_eq!(pruned_len, long_entry_len);
assert!(pruned_len <= soft_cap_bytes.max(long_entry_len));
}
}
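The soft-cap arithmetic in `trim_target_bytes` is easiest to check with concrete numbers; the sketch below copies the function from the hunk above and works through two illustrative cases:

```rust
/// Copied from the hunk above: trim to 80% of the hard cap, but never
/// below the length of the newest entry, which is always retained.
const HISTORY_SOFT_CAP_RATIO: f64 = 0.8;

fn trim_target_bytes(max_bytes: u64, newest_entry_len: u64) -> u64 {
    let soft_cap_bytes = ((max_bytes as f64) * HISTORY_SOFT_CAP_RATIO)
        .floor()
        .clamp(1.0, max_bytes as f64) as u64;
    soft_cap_bytes.max(newest_entry_len)
}

fn main() {
    // A 1_000-byte cap with a small newest entry trims down to the
    // 800-byte soft cap, leaving headroom before the next trim...
    assert_eq!(trim_target_bytes(1_000, 120), 800);
    // ...while a newest entry larger than the soft cap is kept whole.
    assert_eq!(trim_target_bytes(1_000, 950), 950);
}
```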

View File

@@ -1,4 +1,4 @@
use crate::model_family::ModelFamily;
use crate::openai_models::model_family::ModelFamily;
// Shared constants for commonly used window/token sizes.
pub(crate) const CONTEXT_WINDOW_272K: i64 = 272_000;
@@ -76,6 +76,8 @@ pub(crate) fn get_model_info(model_family: &ModelFamily) -> Option<ModelInfo> {
_ if slug.starts_with("codex-") => Some(ModelInfo::new(CONTEXT_WINDOW_272K)),
_ if slug.starts_with("exp-") => Some(ModelInfo::new(CONTEXT_WINDOW_272K)),
_ => None,
}
}

View File

@@ -0,0 +1,3 @@
pub mod model_family;
pub mod model_presets;
pub mod models_manager;

View File

@@ -1,6 +1,7 @@
use codex_protocol::config_types::ReasoningEffort;
use codex_protocol::config_types::Verbosity;
use codex_protocol::openai_models::ReasoningEffort;
use crate::config::Config;
use crate::config::types::ReasoningSummaryFormat;
use crate::tools::handlers::apply_patch::ApplyPatchToolType;
use crate::tools::spec::ConfigShellToolType;
@@ -8,11 +9,11 @@ use crate::truncate::TruncationPolicy;
/// The `instructions` field in the payload sent to a model should always start
/// with this content.
const BASE_INSTRUCTIONS: &str = include_str!("../prompt.md");
const BASE_INSTRUCTIONS: &str = include_str!("../../prompt.md");
const GPT_5_CODEX_INSTRUCTIONS: &str = include_str!("../gpt_5_codex_prompt.md");
const GPT_5_1_INSTRUCTIONS: &str = include_str!("../gpt_5_1_prompt.md");
const GPT_5_1_CODEX_MAX_INSTRUCTIONS: &str = include_str!("../gpt-5.1-codex-max_prompt.md");
const GPT_5_CODEX_INSTRUCTIONS: &str = include_str!("../../gpt_5_codex_prompt.md");
const GPT_5_1_INSTRUCTIONS: &str = include_str!("../../gpt_5_1_prompt.md");
const GPT_5_1_CODEX_MAX_INSTRUCTIONS: &str = include_str!("../../gpt-5.1-codex-max_prompt.md");
/// A model family is a group of models that share certain characteristics.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
@@ -72,6 +73,18 @@ pub struct ModelFamily {
pub truncation_policy: TruncationPolicy,
}
impl ModelFamily {
pub fn with_config_overrides(mut self, config: &Config) -> Self {
if let Some(supports_reasoning_summaries) = config.model_supports_reasoning_summaries {
self.supports_reasoning_summaries = supports_reasoning_summaries;
}
if let Some(reasoning_summary_format) = config.model_reasoning_summary_format.as_ref() {
self.reasoning_summary_format = reasoning_summary_format.clone();
}
self
}
}
macro_rules! model_family {
(
$slug:expr, $family:expr $(, $key:ident : $value:expr )* $(,)?
@@ -100,13 +113,14 @@ macro_rules! model_family {
$(
mf.$key = $value;
)*
Some(mf)
mf
}};
}
// todo(aibrahim): remove this function
/// Returns a `ModelFamily` for the given model slug, falling back to a
/// derived default family when the slug does not match a known one.
pub fn find_family_for_model(slug: &str) -> Option<ModelFamily> {
pub fn find_family_for_model(slug: &str) -> ModelFamily {
if slug.starts_with("o3") {
model_family!(
slug, "o3",
@@ -238,11 +252,11 @@ pub fn find_family_for_model(slug: &str) -> Option<ModelFamily> {
truncation_policy: TruncationPolicy::Bytes(10_000),
)
} else {
None
derive_default_model_family(slug)
}
}
pub fn derive_default_model_family(model: &str) -> ModelFamily {
fn derive_default_model_family(model: &str) -> ModelFamily {
ModelFamily {
slug: model.to_string(),
family: model.to_string(),

View File

@@ -1,76 +1,38 @@
use std::collections::HashMap;
use codex_app_server_protocol::AuthMode;
use codex_core::protocol_config_types::ReasoningEffort;
use codex_protocol::openai_models::ModelPreset;
use codex_protocol::openai_models::ModelUpgrade;
use codex_protocol::openai_models::ReasoningEffort;
use codex_protocol::openai_models::ReasoningEffortPreset;
use once_cell::sync::Lazy;
pub const HIDE_GPT5_1_MIGRATION_PROMPT_CONFIG: &str = "hide_gpt5_1_migration_prompt";
pub const HIDE_GPT_5_1_CODEX_MAX_MIGRATION_PROMPT_CONFIG: &str =
"hide_gpt-5.1-codex-max_migration_prompt";
/// A reasoning effort option that can be surfaced for a model.
#[derive(Debug, Clone, Copy)]
pub struct ReasoningEffortPreset {
/// Effort level that the model supports.
pub effort: ReasoningEffort,
/// Short human description shown next to the effort in UIs.
pub description: &'static str,
}
#[derive(Debug, Clone)]
pub struct ModelUpgrade {
pub id: &'static str,
pub reasoning_effort_mapping: Option<HashMap<ReasoningEffort, ReasoningEffort>>,
pub migration_config_key: &'static str,
}
/// Metadata describing a Codex-supported model.
#[derive(Debug, Clone)]
pub struct ModelPreset {
/// Stable identifier for the preset.
pub id: &'static str,
/// Model slug (e.g., "gpt-5").
pub model: &'static str,
/// Display name shown in UIs.
pub display_name: &'static str,
/// Short human description shown in UIs.
pub description: &'static str,
/// Reasoning effort applied when none is explicitly chosen.
pub default_reasoning_effort: ReasoningEffort,
/// Supported reasoning effort options.
pub supported_reasoning_efforts: &'static [ReasoningEffortPreset],
/// Whether this is the default model for new users.
pub is_default: bool,
/// Recommended upgrade model.
pub upgrade: Option<ModelUpgrade>,
/// Whether this preset should appear in the picker UI.
pub show_in_picker: bool,
}
static PRESETS: Lazy<Vec<ModelPreset>> = Lazy::new(|| {
vec![
ModelPreset {
id: "gpt-5.1-codex-max",
model: "gpt-5.1-codex-max",
display_name: "gpt-5.1-codex-max",
description: "Latest Codex-optimized flagship for deep and fast reasoning.",
id: "gpt-5.1-codex-max".to_string(),
model: "gpt-5.1-codex-max".to_string(),
display_name: "gpt-5.1-codex-max".to_string(),
description: "Latest Codex-optimized flagship for deep and fast reasoning.".to_string(),
default_reasoning_effort: ReasoningEffort::Medium,
supported_reasoning_efforts: &[
supported_reasoning_efforts: vec![
ReasoningEffortPreset {
effort: ReasoningEffort::Low,
description: "Fast responses with lighter reasoning",
description: "Fast responses with lighter reasoning".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::Medium,
description: "Balances speed and reasoning depth for everyday tasks",
description: "Balances speed and reasoning depth for everyday tasks".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::High,
description: "Maximizes reasoning depth for complex problems",
description: "Maximizes reasoning depth for complex problems".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::XHigh,
description: "Extra high reasoning depth for complex problems",
description: "Extra high reasoning depth for complex problems".to_string(),
},
],
is_default: true,
@@ -78,184 +40,184 @@ static PRESETS: Lazy<Vec<ModelPreset>> = Lazy::new(|| {
show_in_picker: true,
},
ModelPreset {
id: "gpt-5.1-codex",
model: "gpt-5.1-codex",
display_name: "gpt-5.1-codex",
description: "Optimized for codex.",
id: "gpt-5.1-codex".to_string(),
model: "gpt-5.1-codex".to_string(),
display_name: "gpt-5.1-codex".to_string(),
description: "Optimized for codex.".to_string(),
default_reasoning_effort: ReasoningEffort::Medium,
supported_reasoning_efforts: &[
supported_reasoning_efforts: vec![
ReasoningEffortPreset {
effort: ReasoningEffort::Low,
description: "Fastest responses with limited reasoning",
description: "Fastest responses with limited reasoning".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::Medium,
description: "Dynamically adjusts reasoning based on the task",
description: "Dynamically adjusts reasoning based on the task".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::High,
description: "Maximizes reasoning depth for complex or ambiguous problems",
description: "Maximizes reasoning depth for complex or ambiguous problems"
.to_string(),
},
],
is_default: false,
upgrade: Some(ModelUpgrade {
id: "gpt-5.1-codex-max",
id: "gpt-5.1-codex-max".to_string(),
reasoning_effort_mapping: None,
migration_config_key: HIDE_GPT_5_1_CODEX_MAX_MIGRATION_PROMPT_CONFIG,
migration_config_key: HIDE_GPT_5_1_CODEX_MAX_MIGRATION_PROMPT_CONFIG.to_string(),
}),
show_in_picker: true,
},
ModelPreset {
id: "gpt-5.1-codex-mini",
model: "gpt-5.1-codex-mini",
display_name: "gpt-5.1-codex-mini",
description: "Optimized for codex. Cheaper, faster, but less capable.",
id: "gpt-5.1-codex-mini".to_string(),
model: "gpt-5.1-codex-mini".to_string(),
display_name: "gpt-5.1-codex-mini".to_string(),
description: "Optimized for codex. Cheaper, faster, but less capable.".to_string(),
default_reasoning_effort: ReasoningEffort::Medium,
supported_reasoning_efforts: &[
supported_reasoning_efforts: vec![
ReasoningEffortPreset {
effort: ReasoningEffort::Medium,
description: "Dynamically adjusts reasoning based on the task",
description: "Dynamically adjusts reasoning based on the task".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::High,
description: "Maximizes reasoning depth for complex or ambiguous problems",
description: "Maximizes reasoning depth for complex or ambiguous problems"
.to_string(),
},
],
is_default: false,
upgrade: Some(ModelUpgrade {
id: "gpt-5.1-codex-max",
id: "gpt-5.1-codex-max".to_string(),
reasoning_effort_mapping: None,
migration_config_key: HIDE_GPT_5_1_CODEX_MAX_MIGRATION_PROMPT_CONFIG,
migration_config_key: HIDE_GPT_5_1_CODEX_MAX_MIGRATION_PROMPT_CONFIG.to_string(),
}),
show_in_picker: true,
},
ModelPreset {
id: "gpt-5.1",
model: "gpt-5.1",
display_name: "gpt-5.1",
description: "Broad world knowledge with strong general reasoning.",
id: "gpt-5.1".to_string(),
model: "gpt-5.1".to_string(),
display_name: "gpt-5.1".to_string(),
description: "Broad world knowledge with strong general reasoning.".to_string(),
default_reasoning_effort: ReasoningEffort::Medium,
supported_reasoning_efforts: &[
supported_reasoning_efforts: vec![
ReasoningEffortPreset {
effort: ReasoningEffort::Low,
description: "Balances speed with some reasoning; useful for straightforward queries and short explanations",
description: "Balances speed with some reasoning; useful for straightforward queries and short explanations".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::Medium,
description: "Provides a solid balance of reasoning depth and latency for general-purpose tasks",
description: "Provides a solid balance of reasoning depth and latency for general-purpose tasks".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::High,
description: "Maximizes reasoning depth for complex or ambiguous problems",
description: "Maximizes reasoning depth for complex or ambiguous problems".to_string(),
},
],
is_default: false,
upgrade: Some(ModelUpgrade {
id: "gpt-5.1-codex-max",
id: "gpt-5.1-codex-max".to_string(),
reasoning_effort_mapping: None,
migration_config_key: HIDE_GPT_5_1_CODEX_MAX_MIGRATION_PROMPT_CONFIG,
migration_config_key: HIDE_GPT_5_1_CODEX_MAX_MIGRATION_PROMPT_CONFIG.to_string(),
}),
show_in_picker: true,
},
// Deprecated models.
ModelPreset {
id: "gpt-5-codex",
model: "gpt-5-codex",
display_name: "gpt-5-codex",
description: "Optimized for codex.",
id: "gpt-5-codex".to_string(),
model: "gpt-5-codex".to_string(),
display_name: "gpt-5-codex".to_string(),
description: "Optimized for codex.".to_string(),
default_reasoning_effort: ReasoningEffort::Medium,
supported_reasoning_efforts: &[
supported_reasoning_efforts: vec![
ReasoningEffortPreset {
effort: ReasoningEffort::Low,
description: "Fastest responses with limited reasoning",
description: "Fastest responses with limited reasoning".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::Medium,
description: "Dynamically adjusts reasoning based on the task",
description: "Dynamically adjusts reasoning based on the task".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::High,
description: "Maximizes reasoning depth for complex or ambiguous problems",
description: "Maximizes reasoning depth for complex or ambiguous problems".to_string(),
},
],
is_default: false,
upgrade: Some(ModelUpgrade {
id: "gpt-5.1-codex-max",
id: "gpt-5.1-codex-max".to_string(),
reasoning_effort_mapping: None,
migration_config_key: HIDE_GPT_5_1_CODEX_MAX_MIGRATION_PROMPT_CONFIG,
migration_config_key: HIDE_GPT_5_1_CODEX_MAX_MIGRATION_PROMPT_CONFIG.to_string(),
}),
show_in_picker: false,
},
ModelPreset {
id: "gpt-5-codex-mini",
model: "gpt-5-codex-mini",
display_name: "gpt-5-codex-mini",
description: "Optimized for codex. Cheaper, faster, but less capable.",
id: "gpt-5-codex-mini".to_string(),
model: "gpt-5-codex-mini".to_string(),
display_name: "gpt-5-codex-mini".to_string(),
description: "Optimized for codex. Cheaper, faster, but less capable.".to_string(),
default_reasoning_effort: ReasoningEffort::Medium,
supported_reasoning_efforts: &[
supported_reasoning_efforts: vec![
ReasoningEffortPreset {
effort: ReasoningEffort::Medium,
description: "Dynamically adjusts reasoning based on the task",
description: "Dynamically adjusts reasoning based on the task".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::High,
description: "Maximizes reasoning depth for complex or ambiguous problems",
description: "Maximizes reasoning depth for complex or ambiguous problems".to_string(),
},
],
is_default: false,
upgrade: Some(ModelUpgrade {
id: "gpt-5.1-codex-mini",
id: "gpt-5.1-codex-mini".to_string(),
reasoning_effort_mapping: None,
migration_config_key: HIDE_GPT5_1_MIGRATION_PROMPT_CONFIG,
migration_config_key: HIDE_GPT5_1_MIGRATION_PROMPT_CONFIG.to_string(),
}),
show_in_picker: false,
},
ModelPreset {
id: "gpt-5",
model: "gpt-5",
display_name: "gpt-5",
description: "Broad world knowledge with strong general reasoning.",
id: "gpt-5".to_string(),
model: "gpt-5".to_string(),
display_name: "gpt-5".to_string(),
description: "Broad world knowledge with strong general reasoning.".to_string(),
default_reasoning_effort: ReasoningEffort::Medium,
supported_reasoning_efforts: &[
supported_reasoning_efforts: vec![
ReasoningEffortPreset {
effort: ReasoningEffort::Minimal,
description: "Fastest responses with little reasoning",
description: "Fastest responses with little reasoning".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::Low,
description: "Balances speed with some reasoning; useful for straightforward queries and short explanations",
description: "Balances speed with some reasoning; useful for straightforward queries and short explanations".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::Medium,
description: "Provides a solid balance of reasoning depth and latency for general-purpose tasks",
description: "Provides a solid balance of reasoning depth and latency for general-purpose tasks".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::High,
description: "Maximizes reasoning depth for complex or ambiguous problems",
description: "Maximizes reasoning depth for complex or ambiguous problems".to_string(),
},
],
is_default: false,
upgrade: Some(ModelUpgrade {
id: "gpt-5.1-codex-max",
id: "gpt-5.1-codex-max".to_string(),
reasoning_effort_mapping: None,
migration_config_key: HIDE_GPT_5_1_CODEX_MAX_MIGRATION_PROMPT_CONFIG,
migration_config_key: HIDE_GPT_5_1_CODEX_MAX_MIGRATION_PROMPT_CONFIG.to_string(),
}),
show_in_picker: false,
},
]
});
pub fn builtin_model_presets(auth_mode: Option<AuthMode>) -> Vec<ModelPreset> {
pub(crate) fn builtin_model_presets(_auth_mode: Option<AuthMode>) -> Vec<ModelPreset> {
PRESETS
.iter()
.filter(|preset| match auth_mode {
Some(AuthMode::ApiKey) => preset.show_in_picker && preset.id != "gpt-5.1-codex-max",
_ => preset.show_in_picker,
})
.filter(|preset| preset.show_in_picker)
.cloned()
.collect()
}
// todo(aibrahim): remove this once we migrate tests
pub fn all_model_presets() -> &'static Vec<ModelPreset> {
&PRESETS
}
@@ -263,21 +225,10 @@ pub fn all_model_presets() -> &'static Vec<ModelPreset> {
#[cfg(test)]
mod tests {
use super::*;
use codex_app_server_protocol::AuthMode;
#[test]
fn only_one_default_model_is_configured() {
let default_models = PRESETS.iter().filter(|preset| preset.is_default).count();
assert!(default_models == 1);
}
#[test]
fn gpt_5_1_codex_max_hidden_for_api_key_auth() {
let presets = builtin_model_presets(Some(AuthMode::ApiKey));
assert!(
presets
.iter()
.all(|preset| preset.id != "gpt-5.1-codex-max")
);
}
}

View File

@@ -0,0 +1,34 @@
use codex_app_server_protocol::AuthMode;
use codex_protocol::openai_models::ModelPreset;
use tokio::sync::RwLock;
use crate::config::Config;
use crate::openai_models::model_family::ModelFamily;
use crate::openai_models::model_family::find_family_for_model;
use crate::openai_models::model_presets::builtin_model_presets;
#[derive(Debug)]
pub struct ModelsManager {
pub available_models: RwLock<Vec<ModelPreset>>,
pub etag: String,
pub auth_mode: Option<AuthMode>,
}
impl ModelsManager {
pub fn new(auth_mode: Option<AuthMode>) -> Self {
Self {
available_models: RwLock::new(builtin_model_presets(auth_mode)),
etag: String::new(),
auth_mode,
}
}
pub async fn refresh_available_models(&self) {
let models = builtin_model_presets(self.auth_mode);
*self.available_models.write().await = models;
}
pub fn construct_model_family(&self, model: &str, config: &Config) -> ModelFamily {
find_family_for_model(model).with_config_overrides(config)
}
}

View File

@@ -14,6 +14,9 @@
//! 3. We do **not** walk past the Git root.
use crate::config::Config;
use crate::features::Feature;
use crate::skills::load_skills;
use crate::skills::render_skills_section;
use dunce::canonicalize as normalize_path;
use std::path::PathBuf;
use tokio::io::AsyncReadExt;
@@ -31,18 +34,47 @@ const PROJECT_DOC_SEPARATOR: &str = "\n\n--- project-doc ---\n\n";
/// Combines `Config::instructions` and `AGENTS.md` (if present) into a single
/// string of instructions.
pub(crate) async fn get_user_instructions(config: &Config) -> Option<String> {
match read_project_docs(config).await {
Ok(Some(project_doc)) => match &config.user_instructions {
Some(original_instructions) => Some(format!(
"{original_instructions}{PROJECT_DOC_SEPARATOR}{project_doc}"
)),
None => Some(project_doc),
},
Ok(None) => config.user_instructions.clone(),
let skills_section = if config.features.enabled(Feature::Skills) {
let skills_outcome = load_skills(config);
for err in &skills_outcome.errors {
error!(
"failed to load skill {}: {}",
err.path.display(),
err.message
);
}
render_skills_section(&skills_outcome.skills)
} else {
None
};
let project_docs = match read_project_docs(config).await {
Ok(docs) => docs,
Err(e) => {
error!("error trying to find project doc: {e:#}");
config.user_instructions.clone()
return config.user_instructions.clone();
}
};
let combined_project_docs = merge_project_docs_with_skills(project_docs, skills_section);
let mut parts: Vec<String> = Vec::new();
if let Some(instructions) = config.user_instructions.clone() {
parts.push(instructions);
}
if let Some(project_doc) = combined_project_docs {
if !parts.is_empty() {
parts.push(PROJECT_DOC_SEPARATOR.to_string());
}
parts.push(project_doc);
}
if parts.is_empty() {
None
} else {
Some(parts.concat())
}
}
@@ -195,12 +227,25 @@ fn candidate_filenames<'a>(config: &'a Config) -> Vec<&'a str> {
names
}
fn merge_project_docs_with_skills(
project_doc: Option<String>,
skills_section: Option<String>,
) -> Option<String> {
match (project_doc, skills_section) {
(Some(doc), Some(skills)) => Some(format!("{doc}\n\n{skills}")),
(Some(doc), None) => Some(doc),
(None, Some(skills)) => Some(skills),
(None, None) => None,
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::config::ConfigOverrides;
use crate::config::ConfigToml;
use std::fs;
use std::path::PathBuf;
use tempfile::TempDir;
/// Helper that returns a `Config` pointing at `root` and using `limit` as
@@ -219,6 +264,7 @@ mod tests {
config.cwd = root.path().to_path_buf();
config.project_doc_max_bytes = limit;
config.features.enable(Feature::Skills);
config.user_instructions = instructions.map(ToOwned::to_owned);
config
@@ -447,4 +493,60 @@ mod tests {
.eq(DEFAULT_PROJECT_DOC_FILENAME)
);
}
#[tokio::test]
async fn skills_are_appended_to_project_doc() {
let tmp = tempfile::tempdir().expect("tempdir");
fs::write(tmp.path().join("AGENTS.md"), "base doc").unwrap();
let cfg = make_config(&tmp, 4096, None);
create_skill(
cfg.codex_home.clone(),
"pdf-processing",
"extract from pdfs",
);
let res = get_user_instructions(&cfg)
.await
.expect("instructions expected");
let expected_path = dunce::canonicalize(
cfg.codex_home
.join("skills/pdf-processing/SKILL.md")
.as_path(),
)
.unwrap_or_else(|_| cfg.codex_home.join("skills/pdf-processing/SKILL.md"));
let expected_path_str = expected_path.to_string_lossy().replace('\\', "/");
let usage_rules = "- Discovery: Available skills are listed in project docs and may also appear in a runtime \"## Skills\" section (name + description + file path). These are the sources of truth; skill bodies live on disk at the listed paths.\n- Trigger rules: If the user names a skill (with `$SkillName` or plain text) OR the task clearly matches a skill's description, you must use that skill for that turn. Multiple mentions mean use them all. Do not carry skills across turns unless re-mentioned.\n- Missing/blocked: If a named skill isn't in the list or the path can't be read, say so briefly and continue with the best fallback.\n- How to use a skill (progressive disclosure):\n 1) After deciding to use a skill, open its `SKILL.md`. Read only enough to follow the workflow.\n 2) If `SKILL.md` points to extra folders such as `references/`, load only the specific files needed for the request; don't bulk-load everything.\n 3) If `scripts/` exist, prefer running or patching them instead of retyping large code blocks.\n 4) If `assets/` or templates exist, reuse them instead of recreating from scratch.\n- Description as trigger: The YAML `description` in `SKILL.md` is the primary trigger signal; rely on it to decide applicability. If unsure, ask a brief clarification before proceeding.\n- Coordination and sequencing:\n - If multiple skills apply, choose the minimal set that covers the request and state the order you'll use them.\n - Announce which skill(s) you're using and why (one short line). If you skip an obvious skill, say why.\n- Context hygiene:\n - Keep context small: summarize long sections instead of pasting them; only load extra files when needed.\n - Avoid deeply nested references; prefer one-hop files explicitly linked from `SKILL.md`.\n - When variants exist (frameworks, providers, domains), pick only the relevant reference file(s) and note that choice.\n- Safety and fallback: If a skill can't be applied cleanly (missing files, unclear instructions), state the issue, pick the next-best approach, and continue.";
let expected = format!(
"base doc\n\n## Skills\nThese skills are discovered at startup from ~/.codex/skills; each entry shows name, description, and file path so you can open the source for full instructions. Content is not inlined to keep context lean.\n- pdf-processing: extract from pdfs (file: {expected_path_str})\n{usage_rules}"
);
assert_eq!(res, expected);
}
#[tokio::test]
async fn skills_render_without_project_doc() {
let tmp = tempfile::tempdir().expect("tempdir");
let cfg = make_config(&tmp, 4096, None);
create_skill(cfg.codex_home.clone(), "linting", "run clippy");
let res = get_user_instructions(&cfg)
.await
.expect("instructions expected");
let expected_path =
dunce::canonicalize(cfg.codex_home.join("skills/linting/SKILL.md").as_path())
.unwrap_or_else(|_| cfg.codex_home.join("skills/linting/SKILL.md"));
let expected_path_str = expected_path.to_string_lossy().replace('\\', "/");
let usage_rules = "- Discovery: Available skills are listed in project docs and may also appear in a runtime \"## Skills\" section (name + description + file path). These are the sources of truth; skill bodies live on disk at the listed paths.\n- Trigger rules: If the user names a skill (with `$SkillName` or plain text) OR the task clearly matches a skill's description, you must use that skill for that turn. Multiple mentions mean use them all. Do not carry skills across turns unless re-mentioned.\n- Missing/blocked: If a named skill isn't in the list or the path can't be read, say so briefly and continue with the best fallback.\n- How to use a skill (progressive disclosure):\n 1) After deciding to use a skill, open its `SKILL.md`. Read only enough to follow the workflow.\n 2) If `SKILL.md` points to extra folders such as `references/`, load only the specific files needed for the request; don't bulk-load everything.\n 3) If `scripts/` exist, prefer running or patching them instead of retyping large code blocks.\n 4) If `assets/` or templates exist, reuse them instead of recreating from scratch.\n- Description as trigger: The YAML `description` in `SKILL.md` is the primary trigger signal; rely on it to decide applicability. If unsure, ask a brief clarification before proceeding.\n- Coordination and sequencing:\n - If multiple skills apply, choose the minimal set that covers the request and state the order you'll use them.\n - Announce which skill(s) you're using and why (one short line). If you skip an obvious skill, say why.\n- Context hygiene:\n - Keep context small: summarize long sections instead of pasting them; only load extra files when needed.\n - Avoid deeply nested references; prefer one-hop files explicitly linked from `SKILL.md`.\n - When variants exist (frameworks, providers, domains), pick only the relevant reference file(s) and note that choice.\n- Safety and fallback: If a skill can't be applied cleanly (missing files, unclear instructions), state the issue, pick the next-best approach, and continue.";
let expected = format!(
"## Skills\nThese skills are discovered at startup from ~/.codex/skills; each entry shows name, description, and file path so you can open the source for full instructions. Content is not inlined to keep context lean.\n- linting: run clippy (file: {expected_path_str})\n{usage_rules}"
);
assert_eq!(res, expected);
}
fn create_skill(codex_home: PathBuf, name: &str, description: &str) {
let skill_dir = codex_home.join(format!("skills/{name}"));
fs::create_dir_all(&skill_dir).unwrap();
let content = format!("---\nname: {name}\ndescription: {description}\n---\n\n# Body\n");
fs::write(skill_dir.join("SKILL.md"), content).unwrap();
}
}
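The merge order implemented above (user instructions, then the separator, then the project doc with skills appended) hinges on `merge_project_docs_with_skills`; the function is copied below with two illustrative cases showing that the skills section renders whether or not an `AGENTS.md` exists:

```rust
// Copied from the hunk above.
fn merge_project_docs_with_skills(
    project_doc: Option<String>,
    skills_section: Option<String>,
) -> Option<String> {
    match (project_doc, skills_section) {
        (Some(doc), Some(skills)) => Some(format!("{doc}\n\n{skills}")),
        (Some(doc), None) => Some(doc),
        (None, Some(skills)) => Some(skills),
        (None, None) => None,
    }
}

fn main() {
    // With an AGENTS.md present, skills are appended after a blank line...
    assert_eq!(
        merge_project_docs_with_skills(Some("base doc".into()), Some("## Skills".into())),
        Some("base doc\n\n## Skills".to_string())
    );
    // ...and with no project doc, the skills section stands alone.
    assert_eq!(
        merge_project_docs_with_skills(None, Some("## Skills".into())),
        Some("## Skills".to_string())
    );
}
```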

View File

@@ -1,70 +0,0 @@
use crate::codex::Session;
use crate::codex::TurnContext;
use codex_protocol::models::FunctionCallOutputPayload;
use codex_protocol::models::ResponseInputItem;
use codex_protocol::models::ResponseItem;
use tracing::warn;
/// Process streamed `ResponseItem`s from the model into the pair of:
/// - items we should record in conversation history; and
/// - `ResponseInputItem`s to send back to the model on the next turn.
pub(crate) async fn process_items(
processed_items: Vec<crate::codex::ProcessedResponseItem>,
sess: &Session,
turn_context: &TurnContext,
) -> (Vec<ResponseInputItem>, Vec<ResponseItem>) {
let mut outputs_to_record = Vec::<ResponseItem>::new();
let mut new_inputs_to_record = Vec::<ResponseItem>::new();
let mut responses = Vec::<ResponseInputItem>::new();
for processed_response_item in processed_items {
let crate::codex::ProcessedResponseItem { item, response } = processed_response_item;
if let Some(response) = &response {
responses.push(response.clone());
}
match response {
Some(ResponseInputItem::FunctionCallOutput { call_id, output }) => {
new_inputs_to_record.push(ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output: output.clone(),
});
}
Some(ResponseInputItem::CustomToolCallOutput { call_id, output }) => {
new_inputs_to_record.push(ResponseItem::CustomToolCallOutput {
call_id: call_id.clone(),
output: output.clone(),
});
}
Some(ResponseInputItem::McpToolCallOutput { call_id, result }) => {
let output = match result {
Ok(call_tool_result) => FunctionCallOutputPayload::from(&call_tool_result),
Err(err) => FunctionCallOutputPayload {
content: err.clone(),
success: Some(false),
..Default::default()
},
};
new_inputs_to_record.push(ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output,
});
}
None => {}
_ => {
warn!("Unexpected response item: {item:?} with response: {response:?}");
}
};
outputs_to_record.push(item);
}
let all_items_to_record = [outputs_to_record, new_inputs_to_record].concat();
// Only attempt to take the lock if there is something to record.
if !all_items_to_record.is_empty() {
sess.record_conversation_items(turn_context, &all_items_to_record)
.await;
}
(responses, all_items_to_record)
}

View File

@@ -0,0 +1,93 @@
use codex_git::merge_base_with_head;
use codex_protocol::protocol::ReviewRequest;
use codex_protocol::protocol::ReviewTarget;
use std::path::Path;
#[derive(Clone, Debug, PartialEq)]
pub struct ResolvedReviewRequest {
pub target: ReviewTarget,
pub prompt: String,
pub user_facing_hint: String,
}
const UNCOMMITTED_PROMPT: &str = "Review the current code changes (staged, unstaged, and untracked files) and provide prioritized findings.";
const BASE_BRANCH_PROMPT_BACKUP: &str = "Review the code changes against the base branch '{branch}'. Start by finding the merge base between the current branch and {branch}'s upstream, e.g. (`git merge-base HEAD \"$(git rev-parse --abbrev-ref \"{branch}@{upstream}\")\"`), then run `git diff` against that SHA to see what changes we would merge into the {branch} branch. Provide prioritized, actionable findings.";
const BASE_BRANCH_PROMPT: &str = "Review the code changes against the base branch '{baseBranch}'. The merge base commit for this comparison is {mergeBaseSha}. Run `git diff {mergeBaseSha}` to inspect the changes relative to {baseBranch}. Provide prioritized, actionable findings.";
const COMMIT_PROMPT_WITH_TITLE: &str = "Review the code changes introduced by commit {sha} (\"{title}\"). Provide prioritized, actionable findings.";
const COMMIT_PROMPT: &str =
"Review the code changes introduced by commit {sha}. Provide prioritized, actionable findings.";
pub fn resolve_review_request(
request: ReviewRequest,
cwd: &Path,
) -> anyhow::Result<ResolvedReviewRequest> {
let target = request.target;
let prompt = review_prompt(&target, cwd)?;
let user_facing_hint = request
.user_facing_hint
.unwrap_or_else(|| user_facing_hint(&target));
Ok(ResolvedReviewRequest {
target,
prompt,
user_facing_hint,
})
}
pub fn review_prompt(target: &ReviewTarget, cwd: &Path) -> anyhow::Result<String> {
match target {
ReviewTarget::UncommittedChanges => Ok(UNCOMMITTED_PROMPT.to_string()),
ReviewTarget::BaseBranch { branch } => {
if let Some(commit) = merge_base_with_head(cwd, branch)? {
Ok(BASE_BRANCH_PROMPT
.replace("{baseBranch}", branch)
.replace("{mergeBaseSha}", &commit))
} else {
Ok(BASE_BRANCH_PROMPT_BACKUP.replace("{branch}", branch))
}
}
ReviewTarget::Commit { sha, title } => {
if let Some(title) = title {
Ok(COMMIT_PROMPT_WITH_TITLE
.replace("{sha}", sha)
.replace("{title}", title))
} else {
Ok(COMMIT_PROMPT.replace("{sha}", sha))
}
}
ReviewTarget::Custom { instructions } => {
let prompt = instructions.trim();
if prompt.is_empty() {
anyhow::bail!("Review prompt cannot be empty");
}
Ok(prompt.to_string())
}
}
}
pub fn user_facing_hint(target: &ReviewTarget) -> String {
match target {
ReviewTarget::UncommittedChanges => "current changes".to_string(),
ReviewTarget::BaseBranch { branch } => format!("changes against '{branch}'"),
ReviewTarget::Commit { sha, title } => {
let short_sha: String = sha.chars().take(7).collect();
if let Some(title) = title {
format!("commit {short_sha}: {title}")
} else {
format!("commit {short_sha}")
}
}
ReviewTarget::Custom { instructions } => instructions.trim().to_string(),
}
}
impl From<ResolvedReviewRequest> for ReviewRequest {
fn from(resolved: ResolvedReviewRequest) -> Self {
ReviewRequest {
target: resolved.target,
user_facing_hint: Some(resolved.user_facing_hint),
}
}
}
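The prompt constants rely on plain `str::replace` substitution rather than `format!`; a short sketch of the `BaseBranch` path with illustrative branch and SHA values (in the real flow, `merge_base_with_head` supplies the SHA when one exists):

```rust
// Copied from above; {baseBranch} and {mergeBaseSha} are literal markers,
// not format! placeholders.
const BASE_BRANCH_PROMPT: &str = "Review the code changes against the base branch '{baseBranch}'. The merge base commit for this comparison is {mergeBaseSha}. Run `git diff {mergeBaseSha}` to inspect the changes relative to {baseBranch}. Provide prioritized, actionable findings.";

fn main() {
    // "main" and "abc1234" are illustrative stand-in values.
    let prompt = BASE_BRANCH_PROMPT
        .replace("{baseBranch}", "main")
        .replace("{mergeBaseSha}", "abc1234");
    assert!(prompt.contains("git diff abc1234"));
}
```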

View File

@@ -9,6 +9,7 @@ use std::sync::atomic::AtomicBool;
use time::OffsetDateTime;
use time::PrimitiveDateTime;
use time::format_description::FormatItem;
use time::format_description::well_known::Rfc3339;
use time::macros::format_description;
use uuid::Uuid;
@@ -39,18 +40,15 @@ pub struct ConversationItem {
pub path: PathBuf,
/// First up to `HEAD_RECORD_LIMIT` JSONL records parsed as JSON (includes meta line).
pub head: Vec<serde_json::Value>,
/// Last up to `TAIL_RECORD_LIMIT` JSONL response records parsed as JSON.
pub tail: Vec<serde_json::Value>,
/// RFC3339 timestamp string for when the session was created, if available.
pub created_at: Option<String>,
/// RFC3339 timestamp string for the most recent response in the tail, if available.
/// RFC3339 timestamp string for the most recent update (from file mtime).
pub updated_at: Option<String>,
}
#[derive(Default)]
struct HeadTailSummary {
head: Vec<serde_json::Value>,
tail: Vec<serde_json::Value>,
saw_session_meta: bool,
saw_user_event: bool,
source: Option<SessionSource>,
@@ -62,7 +60,6 @@ struct HeadTailSummary {
/// Hard cap to bound worst-case work per request.
const MAX_SCAN_FILES: usize = 10000;
const HEAD_RECORD_LIMIT: usize = 10;
const TAIL_RECORD_LIMIT: usize = 10;
/// Pagination cursor identifying a file by timestamp and UUID.
#[derive(Debug, Clone, PartialEq, Eq)]
@@ -141,13 +138,6 @@ pub(crate) async fn get_conversations(
Ok(result)
}
/// Load the full contents of a single conversation session file at `path`.
/// Returns the entire file contents as a String.
#[allow(dead_code)]
pub(crate) async fn get_conversation(path: &Path) -> io::Result<String> {
tokio::fs::read_to_string(path).await
}
/// Load conversation file paths from disk using directory traversal.
///
/// Directory layout: `~/.codex/sessions/YYYY/MM/DD/rollout-YYYY-MM-DDThh-mm-ss-<uuid>.jsonl`
@@ -212,9 +202,8 @@ async fn traverse_directories_for_paths(
more_matches_available = true;
break 'outer;
}
// Read head and simultaneously detect message events within the same
// first N JSONL records to avoid a second file read.
let summary = read_head_and_tail(&path, HEAD_RECORD_LIMIT, TAIL_RECORD_LIMIT)
// Read head and detect message events; stop once meta + user are found.
let summary = read_head_summary(&path, HEAD_RECORD_LIMIT)
.await
.unwrap_or_default();
if !allowed_sources.is_empty()
@@ -233,16 +222,19 @@ async fn traverse_directories_for_paths(
if summary.saw_session_meta && summary.saw_user_event {
let HeadTailSummary {
head,
tail,
created_at,
mut updated_at,
..
} = summary;
updated_at = updated_at.or_else(|| created_at.clone());
if updated_at.is_none() {
updated_at = file_modified_rfc3339(&path)
.await
.unwrap_or(None)
.or_else(|| created_at.clone());
}
items.push(ConversationItem {
path,
head,
tail,
created_at,
updated_at,
});
@@ -384,11 +376,7 @@ impl<'a> ProviderMatcher<'a> {
}
}
async fn read_head_and_tail(
path: &Path,
head_limit: usize,
tail_limit: usize,
) -> io::Result<HeadTailSummary> {
async fn read_head_summary(path: &Path, head_limit: usize) -> io::Result<HeadTailSummary> {
use tokio::io::AsyncBufReadExt;
let file = tokio::fs::File::open(path).await?;
@@ -441,107 +429,30 @@ async fn read_head_and_tail(
}
}
}
if summary.saw_session_meta && summary.saw_user_event {
break;
}
}
if tail_limit != 0 {
let (tail, updated_at) = read_tail_records(path, tail_limit).await?;
summary.tail = tail;
summary.updated_at = updated_at;
}
Ok(summary)
}
/// Read up to `HEAD_RECORD_LIMIT` records from the start of the rollout file at `path`.
/// This should be enough to produce a summary including the session meta line.
pub async fn read_head_for_summary(path: &Path) -> io::Result<Vec<serde_json::Value>> {
let summary = read_head_and_tail(path, HEAD_RECORD_LIMIT, 0).await?;
let summary = read_head_summary(path, HEAD_RECORD_LIMIT).await?;
Ok(summary.head)
}
async fn read_tail_records(
path: &Path,
max_records: usize,
) -> io::Result<(Vec<serde_json::Value>, Option<String>)> {
use std::io::SeekFrom;
use tokio::io::AsyncReadExt;
use tokio::io::AsyncSeekExt;
if max_records == 0 {
return Ok((Vec::new(), None));
}
const CHUNK_SIZE: usize = 8192;
let mut file = tokio::fs::File::open(path).await?;
let mut pos = file.seek(SeekFrom::End(0)).await?;
if pos == 0 {
return Ok((Vec::new(), None));
}
let mut buffer: Vec<u8> = Vec::new();
let mut latest_timestamp: Option<String> = None;
loop {
let slice_start = match (pos > 0, buffer.iter().position(|&b| b == b'\n')) {
(true, Some(idx)) => idx + 1,
_ => 0,
};
let (tail, newest_ts) = collect_last_response_values(&buffer[slice_start..], max_records);
if latest_timestamp.is_none() {
latest_timestamp = newest_ts.clone();
}
if tail.len() >= max_records || pos == 0 {
return Ok((tail, latest_timestamp.or(newest_ts)));
}
let read_size = CHUNK_SIZE.min(pos as usize);
if read_size == 0 {
return Ok((tail, latest_timestamp.or(newest_ts)));
}
pos -= read_size as u64;
file.seek(SeekFrom::Start(pos)).await?;
let mut chunk = vec![0; read_size];
file.read_exact(&mut chunk).await?;
chunk.extend_from_slice(&buffer);
buffer = chunk;
}
}
fn collect_last_response_values(
buffer: &[u8],
max_records: usize,
) -> (Vec<serde_json::Value>, Option<String>) {
use std::borrow::Cow;
if buffer.is_empty() || max_records == 0 {
return (Vec::new(), None);
}
let text: Cow<'_, str> = String::from_utf8_lossy(buffer);
let mut collected_rev: Vec<serde_json::Value> = Vec::new();
let mut latest_timestamp: Option<String> = None;
for line in text.lines().rev() {
let trimmed = line.trim();
if trimmed.is_empty() {
continue;
}
let parsed: serde_json::Result<RolloutLine> = serde_json::from_str(trimmed);
let Ok(rollout_line) = parsed else { continue };
let RolloutLine { timestamp, item } = rollout_line;
if let RolloutItem::ResponseItem(item) = item
&& let Ok(val) = serde_json::to_value(&item)
{
if latest_timestamp.is_none() {
latest_timestamp = Some(timestamp.clone());
}
collected_rev.push(val);
if collected_rev.len() == max_records {
break;
}
}
}
collected_rev.reverse();
(collected_rev, latest_timestamp)
async fn file_modified_rfc3339(path: &Path) -> io::Result<Option<String>> {
let meta = tokio::fs::metadata(path).await?;
let modified = meta.modified().ok();
let Some(modified) = modified else {
return Ok(None);
};
let dt = OffsetDateTime::from(modified);
Ok(dt.format(&Rfc3339).ok())
}
/// Locate a recorded conversation rollout file by its UUID string using the existing

View File

@@ -16,13 +16,11 @@ use crate::rollout::INTERACTIVE_SESSION_SOURCES;
use crate::rollout::list::ConversationItem;
use crate::rollout::list::ConversationsPage;
use crate::rollout::list::Cursor;
use crate::rollout::list::get_conversation;
use crate::rollout::list::get_conversations;
use anyhow::Result;
use codex_protocol::ConversationId;
use codex_protocol::models::ContentItem;
use codex_protocol::models::ResponseItem;
use codex_protocol::protocol::CompactedItem;
use codex_protocol::protocol::EventMsg;
use codex_protocol::protocol::RolloutItem;
use codex_protocol::protocol::RolloutLine;
@@ -226,28 +224,28 @@ async fn test_list_conversations_latest_first() {
"model_provider": "test-provider",
})];
let updated_times: Vec<Option<String>> =
page.items.iter().map(|i| i.updated_at.clone()).collect();
let expected = ConversationsPage {
items: vec![
ConversationItem {
path: p1,
head: head_3,
tail: Vec::new(),
created_at: Some("2025-01-03T12-00-00".into()),
updated_at: Some("2025-01-03T12-00-00".into()),
updated_at: updated_times.first().cloned().flatten(),
},
ConversationItem {
path: p2,
head: head_2,
tail: Vec::new(),
created_at: Some("2025-01-02T12-00-00".into()),
updated_at: Some("2025-01-02T12-00-00".into()),
updated_at: updated_times.get(1).cloned().flatten(),
},
ConversationItem {
path: p3,
head: head_1,
tail: Vec::new(),
created_at: Some("2025-01-01T12-00-00".into()),
updated_at: Some("2025-01-01T12-00-00".into()),
updated_at: updated_times.get(2).cloned().flatten(),
},
],
next_cursor: None,
@@ -355,6 +353,8 @@ async fn test_pagination_cursor() {
"source": "vscode",
"model_provider": "test-provider",
})];
let updated_page1: Vec<Option<String>> =
page1.items.iter().map(|i| i.updated_at.clone()).collect();
let expected_cursor1: Cursor =
serde_json::from_str(&format!("\"2025-03-04T09-00-00|{u4}\"")).unwrap();
let expected_page1 = ConversationsPage {
@@ -362,16 +362,14 @@ async fn test_pagination_cursor() {
ConversationItem {
path: p5,
head: head_5,
tail: Vec::new(),
created_at: Some("2025-03-05T09-00-00".into()),
updated_at: Some("2025-03-05T09-00-00".into()),
updated_at: updated_page1.first().cloned().flatten(),
},
ConversationItem {
path: p4,
head: head_4,
tail: Vec::new(),
created_at: Some("2025-03-04T09-00-00".into()),
updated_at: Some("2025-03-04T09-00-00".into()),
updated_at: updated_page1.get(1).cloned().flatten(),
},
],
next_cursor: Some(expected_cursor1.clone()),
@@ -422,6 +420,8 @@ async fn test_pagination_cursor() {
"source": "vscode",
"model_provider": "test-provider",
})];
let updated_page2: Vec<Option<String>> =
page2.items.iter().map(|i| i.updated_at.clone()).collect();
let expected_cursor2: Cursor =
serde_json::from_str(&format!("\"2025-03-02T09-00-00|{u2}\"")).unwrap();
let expected_page2 = ConversationsPage {
@@ -429,16 +429,14 @@ async fn test_pagination_cursor() {
ConversationItem {
path: p3,
head: head_3,
tail: Vec::new(),
created_at: Some("2025-03-03T09-00-00".into()),
updated_at: Some("2025-03-03T09-00-00".into()),
updated_at: updated_page2.first().cloned().flatten(),
},
ConversationItem {
path: p2,
head: head_2,
tail: Vec::new(),
created_at: Some("2025-03-02T09-00-00".into()),
updated_at: Some("2025-03-02T09-00-00".into()),
updated_at: updated_page2.get(1).cloned().flatten(),
},
],
next_cursor: Some(expected_cursor2.clone()),
@@ -473,13 +471,14 @@ async fn test_pagination_cursor() {
"source": "vscode",
"model_provider": "test-provider",
})];
let updated_page3: Vec<Option<String>> =
page3.items.iter().map(|i| i.updated_at.clone()).collect();
let expected_page3 = ConversationsPage {
items: vec![ConversationItem {
path: p1,
head: head_1,
tail: Vec::new(),
created_at: Some("2025-03-01T09-00-00".into()),
updated_at: Some("2025-03-01T09-00-00".into()),
updated_at: updated_page3.first().cloned().flatten(),
}],
next_cursor: None,
num_scanned_files: 5, // scanned 05, 04 (anchor), 03, 02 (anchor), 01
@@ -510,7 +509,7 @@ async fn test_get_conversation_contents() {
.unwrap();
let path = &page.items[0].path;
let content = get_conversation(path).await.unwrap();
let content = tokio::fs::read_to_string(path).await.unwrap();
// Page equality (single item)
let expected_path = home
@@ -533,9 +532,8 @@ async fn test_get_conversation_contents() {
items: vec![ConversationItem {
path: expected_path,
head: expected_head,
tail: Vec::new(),
created_at: Some(ts.into()),
updated_at: Some(ts.into()),
updated_at: page.items[0].updated_at.clone(),
}],
next_cursor: None,
num_scanned_files: 1,
@@ -570,7 +568,7 @@ async fn test_get_conversation_contents() {
}
#[tokio::test]
async fn test_tail_includes_last_response_items() -> Result<()> {
async fn test_updated_at_uses_file_mtime() -> Result<()> {
let temp = TempDir::new().unwrap();
let home = temp.path();
@@ -636,229 +634,16 @@ async fn test_tail_includes_last_response_items() -> Result<()> {
)
.await?;
let item = page.items.first().expect("conversation item");
let tail_len = item.tail.len();
assert_eq!(tail_len, 10usize.min(total_messages));
let expected: Vec<serde_json::Value> = (total_messages - tail_len..total_messages)
.map(|idx| {
serde_json::json!({
"type": "message",
"role": "assistant",
"content": [
{
"type": "output_text",
"text": format!("reply-{idx}"),
}
],
})
})
.collect();
assert_eq!(item.tail, expected);
assert_eq!(item.created_at.as_deref(), Some(ts));
let expected_updated = format!("{ts}-{last:02}", last = total_messages - 1);
assert_eq!(item.updated_at.as_deref(), Some(expected_updated.as_str()));
Ok(())
}
#[tokio::test]
async fn test_tail_handles_short_sessions() -> Result<()> {
let temp = TempDir::new().unwrap();
let home = temp.path();
let ts = "2025-06-02T08-30-00";
let uuid = Uuid::from_u128(7);
let day_dir = home.join("sessions").join("2025").join("06").join("02");
fs::create_dir_all(&day_dir)?;
let file_path = day_dir.join(format!("rollout-{ts}-{uuid}.jsonl"));
let mut file = File::create(&file_path)?;
let conversation_id = ConversationId::from_string(&uuid.to_string())?;
let meta_line = RolloutLine {
timestamp: ts.to_string(),
item: RolloutItem::SessionMeta(SessionMetaLine {
meta: SessionMeta {
id: conversation_id,
timestamp: ts.to_string(),
instructions: None,
cwd: ".".into(),
originator: "test_originator".into(),
cli_version: "test_version".into(),
source: SessionSource::VSCode,
model_provider: Some("test-provider".into()),
},
git: None,
}),
};
writeln!(file, "{}", serde_json::to_string(&meta_line)?)?;
let user_event_line = RolloutLine {
timestamp: ts.to_string(),
item: RolloutItem::EventMsg(EventMsg::UserMessage(UserMessageEvent {
message: "hi".into(),
images: None,
})),
};
writeln!(file, "{}", serde_json::to_string(&user_event_line)?)?;
for idx in 0..3 {
let response_line = RolloutLine {
timestamp: format!("{ts}-{idx:02}"),
item: RolloutItem::ResponseItem(ResponseItem::Message {
id: None,
role: "assistant".into(),
content: vec![ContentItem::OutputText {
text: format!("short-{idx}"),
}],
}),
};
writeln!(file, "{}", serde_json::to_string(&response_line)?)?;
}
drop(file);
let provider_filter = provider_vec(&[TEST_PROVIDER]);
let page = get_conversations(
home,
1,
None,
INTERACTIVE_SESSION_SOURCES,
Some(provider_filter.as_slice()),
TEST_PROVIDER,
)
.await?;
let tail = &page.items.first().expect("conversation item").tail;
assert_eq!(tail.len(), 3);
let expected: Vec<serde_json::Value> = (0..3)
.map(|idx| {
serde_json::json!({
"type": "message",
"role": "assistant",
"content": [
{
"type": "output_text",
"text": format!("short-{idx}"),
}
],
})
})
.collect();
assert_eq!(tail, &expected);
let expected_updated = format!("{ts}-{last:02}", last = 2);
assert_eq!(
page.items[0].updated_at.as_deref(),
Some(expected_updated.as_str())
);
Ok(())
}
#[tokio::test]
async fn test_tail_skips_trailing_non_responses() -> Result<()> {
let temp = TempDir::new().unwrap();
let home = temp.path();
let ts = "2025-06-03T10-00-00";
let uuid = Uuid::from_u128(11);
let day_dir = home.join("sessions").join("2025").join("06").join("03");
fs::create_dir_all(&day_dir)?;
let file_path = day_dir.join(format!("rollout-{ts}-{uuid}.jsonl"));
let mut file = File::create(&file_path)?;
let conversation_id = ConversationId::from_string(&uuid.to_string())?;
let meta_line = RolloutLine {
timestamp: ts.to_string(),
item: RolloutItem::SessionMeta(SessionMetaLine {
meta: SessionMeta {
id: conversation_id,
timestamp: ts.to_string(),
instructions: None,
cwd: ".".into(),
originator: "test_originator".into(),
cli_version: "test_version".into(),
source: SessionSource::VSCode,
model_provider: Some("test-provider".into()),
},
git: None,
}),
};
writeln!(file, "{}", serde_json::to_string(&meta_line)?)?;
let user_event_line = RolloutLine {
timestamp: ts.to_string(),
item: RolloutItem::EventMsg(EventMsg::UserMessage(UserMessageEvent {
message: "hello".into(),
images: None,
})),
};
writeln!(file, "{}", serde_json::to_string(&user_event_line)?)?;
for idx in 0..4 {
let response_line = RolloutLine {
timestamp: format!("{ts}-{idx:02}"),
item: RolloutItem::ResponseItem(ResponseItem::Message {
id: None,
role: "assistant".into(),
content: vec![ContentItem::OutputText {
text: format!("response-{idx}"),
}],
}),
};
writeln!(file, "{}", serde_json::to_string(&response_line)?)?;
}
let compacted_line = RolloutLine {
timestamp: format!("{ts}-compacted"),
item: RolloutItem::Compacted(CompactedItem {
message: "compacted".into(),
replacement_history: None,
}),
};
writeln!(file, "{}", serde_json::to_string(&compacted_line)?)?;
let shutdown_event = RolloutLine {
timestamp: format!("{ts}-shutdown"),
item: RolloutItem::EventMsg(EventMsg::ShutdownComplete),
};
writeln!(file, "{}", serde_json::to_string(&shutdown_event)?)?;
drop(file);
let provider_filter = provider_vec(&[TEST_PROVIDER]);
let page = get_conversations(
home,
1,
None,
INTERACTIVE_SESSION_SOURCES,
Some(provider_filter.as_slice()),
TEST_PROVIDER,
)
.await?;
let tail = &page.items.first().expect("conversation item").tail;
let expected: Vec<serde_json::Value> = (0..4)
.map(|idx| {
serde_json::json!({
"type": "message",
"role": "assistant",
"content": [
{
"type": "output_text",
"text": format!("response-{idx}"),
}
],
})
})
.collect();
assert_eq!(tail, &expected);
let expected_updated = format!("{ts}-{last:02}", last = 3);
assert_eq!(
page.items[0].updated_at.as_deref(),
Some(expected_updated.as_str())
);
let updated = item
.updated_at
.as_deref()
.and_then(|s| chrono::DateTime::parse_from_rfc3339(s).ok())
.map(|dt| dt.with_timezone(&chrono::Utc))
.expect("updated_at set from file mtime");
let now = chrono::Utc::now();
let age = now - updated;
assert!(age.num_seconds().abs() < 30);
Ok(())
}
@@ -913,22 +698,22 @@ async fn test_stable_ordering_same_second_pagination() {
"model_provider": "test-provider",
})]
};
let updated_page1: Vec<Option<String>> =
page1.items.iter().map(|i| i.updated_at.clone()).collect();
let expected_cursor1: Cursor = serde_json::from_str(&format!("\"{ts}|{u2}\"")).unwrap();
let expected_page1 = ConversationsPage {
items: vec![
ConversationItem {
path: p3,
head: head(u3),
tail: Vec::new(),
created_at: Some(ts.to_string()),
updated_at: Some(ts.to_string()),
updated_at: updated_page1.first().cloned().flatten(),
},
ConversationItem {
path: p2,
head: head(u2),
tail: Vec::new(),
created_at: Some(ts.to_string()),
updated_at: Some(ts.to_string()),
updated_at: updated_page1.get(1).cloned().flatten(),
},
],
next_cursor: Some(expected_cursor1.clone()),
@@ -953,13 +738,14 @@ async fn test_stable_ordering_same_second_pagination() {
.join("07")
.join("01")
.join(format!("rollout-2025-07-01T00-00-00-{u1}.jsonl"));
let updated_page2: Vec<Option<String>> =
page2.items.iter().map(|i| i.updated_at.clone()).collect();
let expected_page2 = ConversationsPage {
items: vec![ConversationItem {
path: p1,
head: head(u1),
tail: Vec::new(),
created_at: Some(ts.to_string()),
updated_at: Some(ts.to_string()),
updated_at: updated_page2.first().cloned().flatten(),
}],
next_cursor: None,
num_scanned_files: 3, // scanned u3, u2 (anchor), u1

View File

@@ -6,6 +6,7 @@ use codex_apply_patch::ApplyPatchAction;
use codex_apply_patch::ApplyPatchFileChange;
use crate::exec::SandboxType;
use crate::util::resolve_path;
use crate::protocol::AskForApproval;
use crate::protocol::SandboxPolicy;
@@ -150,11 +151,7 @@ fn is_write_patch_constrained_to_writable_paths(
// and roots are converted to absolute, normalized forms before the
// prefix check.
let is_path_writable = |p: &PathBuf| {
let abs = if p.is_absolute() {
p.clone()
} else {
cwd.join(p)
};
let abs = resolve_path(cwd, p);
let abs = match normalize(&abs) {
Some(v) => v,
None => return false,

View File

@@ -10,13 +10,14 @@ use crate::client::ModelClient;
use crate::client_common::Prompt;
use crate::client_common::ResponseEvent;
use crate::config::Config;
use crate::openai_models::models_manager::ModelsManager;
use crate::protocol::SandboxPolicy;
use askama::Template;
use codex_otel::otel_event_manager::OtelEventManager;
use codex_protocol::ConversationId;
use codex_protocol::config_types::ReasoningEffort as ReasoningEffortConfig;
use codex_protocol::models::ContentItem;
use codex_protocol::models::ResponseItem;
use codex_protocol::openai_models::ReasoningEffort as ReasoningEffortConfig;
use codex_protocol::protocol::SandboxCommandAssessment;
use codex_protocol::protocol::SessionSource;
use futures::StreamExt;
@@ -46,6 +47,7 @@ pub(crate) async fn assess_command(
auth_manager: Arc<AuthManager>,
parent_otel: &OtelEventManager,
conversation_id: ConversationId,
models_manager: Arc<ModelsManager>,
session_source: SessionSource,
call_id: &str,
command: &[String],
@@ -124,12 +126,14 @@ pub(crate) async fn assess_command(
output_schema: Some(sandbox_assessment_schema()),
};
let child_otel =
parent_otel.with_model(config.model.as_str(), config.model_family.slug.as_str());
let model_family = models_manager.construct_model_family(&config.model, &config);
let child_otel = parent_otel.with_model(config.model.as_str(), model_family.slug.as_str());
let client = ModelClient::new(
Arc::clone(&config),
Some(auth_manager),
model_family,
child_otel,
provider,
Some(SANDBOX_ASSESSMENT_REASONING_EFFORT),

View File

@@ -53,6 +53,7 @@
(sysctl-name "hw.physicalcpu_max")
(sysctl-name "hw.tbfrequency_compat")
(sysctl-name "hw.vectorunit")
(sysctl-name "kern.argmax")
(sysctl-name "kern.hostname")
(sysctl-name "kern.maxfilesperproc")
(sysctl-name "kern.maxproc")
@@ -72,7 +73,8 @@
(sysctl-name-prefix "net.routetable.")
)
; Allow Java to set CPU type grade when required
; Allow Java to read some CPU info. This is misclassified as a "write" because
; userspace passes a memory buffer to the sysctl, but conceptually it is a read.
(allow sysctl-write
(sysctl-name "kern.grade_cputype"))
@@ -86,10 +88,20 @@
(global-name "com.apple.system.opendirectoryd.libinfo")
)
; Added on top of Chrome profile
; Needed for Python multiprocessing on macOS (SemLock)
(allow ipc-posix-sem)
(allow mach-lookup
(global-name "com.apple.PowerManagement.control")
)
; allow openpty()
(allow pseudo-tty)
(allow file-read* file-write* file-ioctl (literal "/dev/ptmx"))
(allow file-read* file-write*
(require-all
(regex #"^/dev/ttys[0-9]+")
(extension "com.apple.sandbox.pty")))
; PTYs created before entering seatbelt may lack the extension; allow ioctl
; on those slave ttys so interactive shells detect a TTY and remain functional.
(allow file-ioctl (regex #"^/dev/ttys[0-9]+"))

View File

@@ -0,0 +1,291 @@
use crate::config::Config;
use crate::skills::model::SkillError;
use crate::skills::model::SkillLoadOutcome;
use crate::skills::model::SkillMetadata;
use dunce::canonicalize as normalize_path;
use serde::Deserialize;
use std::collections::VecDeque;
use std::error::Error;
use std::fmt;
use std::fs;
use std::path::Path;
use std::path::PathBuf;
use tracing::error;
#[derive(Debug, Deserialize)]
struct SkillFrontmatter {
name: String,
description: String,
}
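For reference, a minimal sketch of how this struct deserializes SKILL.md frontmatter with serde_yaml; the field values are hypothetical, mirroring the test fixtures further down:

```rust
// Sketch: deserializing frontmatter the way the loader does, assuming serde_yaml.
#[derive(Debug, serde::Deserialize)]
struct SkillFrontmatter {
    name: String,
    description: String,
}

fn main() {
    let yaml = "name: demo-skill\ndescription: does things carefully";
    let parsed: SkillFrontmatter = serde_yaml::from_str(yaml).unwrap();
    assert_eq!(parsed.name, "demo-skill");
    assert_eq!(parsed.description, "does things carefully");
}
```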
const SKILLS_FILENAME: &str = "SKILL.md";
const SKILLS_DIR_NAME: &str = "skills";
const MAX_NAME_LEN: usize = 100;
const MAX_DESCRIPTION_LEN: usize = 500;
#[derive(Debug)]
enum SkillParseError {
Read(std::io::Error),
MissingFrontmatter,
InvalidYaml(serde_yaml::Error),
MissingField(&'static str),
InvalidField { field: &'static str, reason: String },
}
impl fmt::Display for SkillParseError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
SkillParseError::Read(e) => write!(f, "failed to read file: {e}"),
SkillParseError::MissingFrontmatter => {
write!(f, "missing YAML frontmatter delimited by ---")
}
SkillParseError::InvalidYaml(e) => write!(f, "invalid YAML: {e}"),
SkillParseError::MissingField(field) => write!(f, "missing field `{field}`"),
SkillParseError::InvalidField { field, reason } => {
write!(f, "invalid {field}: {reason}")
}
}
}
}
impl Error for SkillParseError {}
pub fn load_skills(config: &Config) -> SkillLoadOutcome {
let mut outcome = SkillLoadOutcome::default();
let roots = skill_roots(config);
for root in roots {
discover_skills_under_root(&root, &mut outcome);
}
outcome
.skills
.sort_by(|a, b| a.name.cmp(&b.name).then_with(|| a.path.cmp(&b.path)));
outcome
}
fn skill_roots(config: &Config) -> Vec<PathBuf> {
vec![config.codex_home.join(SKILLS_DIR_NAME)]
}
fn discover_skills_under_root(root: &Path, outcome: &mut SkillLoadOutcome) {
let Ok(root) = normalize_path(root) else {
return;
};
if !root.is_dir() {
return;
}
let mut queue: VecDeque<PathBuf> = VecDeque::from([root]);
while let Some(dir) = queue.pop_front() {
let entries = match fs::read_dir(&dir) {
Ok(entries) => entries,
Err(e) => {
error!("failed to read skills dir {}: {e:#}", dir.display());
continue;
}
};
for entry in entries.flatten() {
let path = entry.path();
let file_name = match path.file_name().and_then(|f| f.to_str()) {
Some(name) => name,
None => continue,
};
if file_name.starts_with('.') {
continue;
}
let Ok(file_type) = entry.file_type() else {
continue;
};
if file_type.is_symlink() {
continue;
}
if file_type.is_dir() {
queue.push_back(path);
continue;
}
if file_type.is_file() && file_name == SKILLS_FILENAME {
match parse_skill_file(&path) {
Ok(skill) => outcome.skills.push(skill),
Err(err) => outcome.errors.push(SkillError {
path,
message: err.to_string(),
}),
}
}
}
}
}
fn parse_skill_file(path: &Path) -> Result<SkillMetadata, SkillParseError> {
let contents = fs::read_to_string(path).map_err(SkillParseError::Read)?;
let frontmatter = extract_frontmatter(&contents).ok_or(SkillParseError::MissingFrontmatter)?;
let parsed: SkillFrontmatter =
serde_yaml::from_str(&frontmatter).map_err(SkillParseError::InvalidYaml)?;
let name = sanitize_single_line(&parsed.name);
let description = sanitize_single_line(&parsed.description);
validate_field(&name, MAX_NAME_LEN, "name")?;
validate_field(&description, MAX_DESCRIPTION_LEN, "description")?;
let resolved_path = normalize_path(path).unwrap_or_else(|_| path.to_path_buf());
Ok(SkillMetadata {
name,
description,
path: resolved_path,
})
}
fn sanitize_single_line(raw: &str) -> String {
raw.split_whitespace().collect::<Vec<_>>().join(" ")
}
fn validate_field(
value: &str,
max_len: usize,
field_name: &'static str,
) -> Result<(), SkillParseError> {
if value.is_empty() {
return Err(SkillParseError::MissingField(field_name));
}
if value.len() > max_len {
return Err(SkillParseError::InvalidField {
field: field_name,
reason: format!("exceeds maximum length of {max_len} characters"),
});
}
Ok(())
}
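A short sketch of the two failure modes this enforces, assuming the constants and error enum defined above:

```rust
#[test]
fn validate_field_failure_modes() {
    // Empty values surface as MissingField; over-long values as InvalidField.
    assert!(matches!(
        validate_field("", MAX_NAME_LEN, "name"),
        Err(SkillParseError::MissingField("name"))
    ));
    let too_long = "a".repeat(MAX_DESCRIPTION_LEN + 1);
    assert!(matches!(
        validate_field(&too_long, MAX_DESCRIPTION_LEN, "description"),
        Err(SkillParseError::InvalidField { field: "description", .. })
    ));
}
```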
fn extract_frontmatter(contents: &str) -> Option<String> {
let mut lines = contents.lines();
if !matches!(lines.next(), Some(line) if line.trim() == "---") {
return None;
}
let mut frontmatter_lines: Vec<&str> = Vec::new();
let mut found_closing = false;
for line in lines.by_ref() {
if line.trim() == "---" {
found_closing = true;
break;
}
frontmatter_lines.push(line);
}
if frontmatter_lines.is_empty() || !found_closing {
return None;
}
Some(frontmatter_lines.join("\n"))
}
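For illustration, a hedged example of the extraction; the SKILL.md body and field values here are hypothetical:

```rust
#[test]
fn frontmatter_extraction_example() {
    // Only the text between the two `---` delimiters is returned.
    let contents = "---\nname: demo\ndescription: does things\n---\n\n# Body\n";
    let fm = extract_frontmatter(contents).expect("frontmatter present");
    assert_eq!(fm, "name: demo\ndescription: does things");

    // A missing closing delimiter means no frontmatter at all.
    assert_eq!(extract_frontmatter("---\nname: bad"), None);
}
```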
#[cfg(test)]
mod tests {
use super::*;
use crate::config::ConfigOverrides;
use crate::config::ConfigToml;
use tempfile::TempDir;
fn make_config(codex_home: &TempDir) -> Config {
let mut config = Config::load_from_base_config_with_overrides(
ConfigToml::default(),
ConfigOverrides::default(),
codex_home.path().to_path_buf(),
)
.expect("defaults for test should always succeed");
config.cwd = codex_home.path().to_path_buf();
config
}
fn write_skill(codex_home: &TempDir, dir: &str, name: &str, description: &str) -> PathBuf {
let skill_dir = codex_home.path().join(format!("skills/{dir}"));
fs::create_dir_all(&skill_dir).unwrap();
let indented_description = description.replace('\n', "\n ");
let content = format!(
"---\nname: {name}\ndescription: |-\n {indented_description}\n---\n\n# Body\n"
);
let path = skill_dir.join(SKILLS_FILENAME);
fs::write(&path, content).unwrap();
path
}
#[test]
fn loads_valid_skill() {
let codex_home = tempfile::tempdir().expect("tempdir");
write_skill(&codex_home, "demo", "demo-skill", "does things\ncarefully");
let cfg = make_config(&codex_home);
let outcome = load_skills(&cfg);
assert!(
outcome.errors.is_empty(),
"unexpected errors: {:?}",
outcome.errors
);
assert_eq!(outcome.skills.len(), 1);
let skill = &outcome.skills[0];
assert_eq!(skill.name, "demo-skill");
assert_eq!(skill.description, "does things carefully");
let path_str = skill.path.to_string_lossy().replace('\\', "/");
assert!(
path_str.ends_with("skills/demo/SKILL.md"),
"unexpected path {path_str}"
);
}
#[test]
fn skips_hidden_and_invalid() {
let codex_home = tempfile::tempdir().expect("tempdir");
let hidden_dir = codex_home.path().join("skills/.hidden");
fs::create_dir_all(&hidden_dir).unwrap();
fs::write(
hidden_dir.join(SKILLS_FILENAME),
"---\nname: hidden\ndescription: hidden\n---\n",
)
.unwrap();
// Invalid because missing closing frontmatter.
let invalid_dir = codex_home.path().join("skills/invalid");
fs::create_dir_all(&invalid_dir).unwrap();
fs::write(invalid_dir.join(SKILLS_FILENAME), "---\nname: bad").unwrap();
let cfg = make_config(&codex_home);
let outcome = load_skills(&cfg);
assert_eq!(outcome.skills.len(), 0);
assert_eq!(outcome.errors.len(), 1);
assert!(
outcome.errors[0]
.message
.contains("missing YAML frontmatter"),
"expected frontmatter error"
);
}
#[test]
fn enforces_length_limits() {
let codex_home = tempfile::tempdir().expect("tempdir");
let long_desc = "a".repeat(MAX_DESCRIPTION_LEN + 1);
write_skill(&codex_home, "too-long", "toolong", &long_desc);
let cfg = make_config(&codex_home);
let outcome = load_skills(&cfg);
assert_eq!(outcome.skills.len(), 0);
assert_eq!(outcome.errors.len(), 1);
assert!(
outcome.errors[0].message.contains("invalid description"),
"expected length error"
);
}
}

View File

@@ -0,0 +1,9 @@
pub mod loader;
pub mod model;
pub mod render;
pub use loader::load_skills;
pub use model::SkillError;
pub use model::SkillLoadOutcome;
pub use model::SkillMetadata;
pub use render::render_skills_section;

View File

@@ -0,0 +1,20 @@
use std::path::PathBuf;
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SkillMetadata {
pub name: String,
pub description: String,
pub path: PathBuf,
}
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SkillError {
pub path: PathBuf,
pub message: String,
}
#[derive(Debug, Clone, Default)]
pub struct SkillLoadOutcome {
pub skills: Vec<SkillMetadata>,
pub errors: Vec<SkillError>,
}

View File

@@ -0,0 +1,42 @@
use crate::skills::model::SkillMetadata;
pub fn render_skills_section(skills: &[SkillMetadata]) -> Option<String> {
if skills.is_empty() {
return None;
}
let mut lines: Vec<String> = Vec::new();
lines.push("## Skills".to_string());
lines.push("These skills are discovered at startup from ~/.codex/skills; each entry shows name, description, and file path so you can open the source for full instructions. Content is not inlined to keep context lean.".to_string());
for skill in skills {
let path_str = skill.path.to_string_lossy().replace('\\', "/");
lines.push(format!(
"- {}: {} (file: {})",
skill.name, skill.description, path_str
));
}
lines.push(
r###"- Discovery: Available skills are listed in project docs and may also appear in a runtime "## Skills" section (name + description + file path). These are the sources of truth; skill bodies live on disk at the listed paths.
- Trigger rules: If the user names a skill (with `$SkillName` or plain text) OR the task clearly matches a skill's description, you must use that skill for that turn. Multiple mentions mean use them all. Do not carry skills across turns unless re-mentioned.
- Missing/blocked: If a named skill isn't in the list or the path can't be read, say so briefly and continue with the best fallback.
- How to use a skill (progressive disclosure):
1) After deciding to use a skill, open its `SKILL.md`. Read only enough to follow the workflow.
2) If `SKILL.md` points to extra folders such as `references/`, load only the specific files needed for the request; don't bulk-load everything.
3) If `scripts/` exist, prefer running or patching them instead of retyping large code blocks.
4) If `assets/` or templates exist, reuse them instead of recreating from scratch.
- Description as trigger: The YAML `description` in `SKILL.md` is the primary trigger signal; rely on it to decide applicability. If unsure, ask a brief clarification before proceeding.
- Coordination and sequencing:
- If multiple skills apply, choose the minimal set that covers the request and state the order you'll use them.
- Announce which skill(s) you're using and why (one short line). If you skip an obvious skill, say why.
- Context hygiene:
- Keep context small: summarize long sections instead of pasting them; only load extra files when needed.
- Avoid deeply nested references; prefer one-hop files explicitly linked from `SKILL.md`.
- When variants exist (frameworks, providers, domains), pick only the relevant reference file(s) and note that choice.
- Safety and fallback: If a skill can't be applied cleanly (missing files, unclear instructions), state the issue, pick the next-best approach, and continue."###
.to_string(),
);
Some(lines.join("\n"))
}
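A small sketch of the rendered section for a single skill, assuming `SkillMetadata` from skills/model.rs is in scope; the path is hypothetical:

```rust
use std::path::PathBuf;

#[test]
fn renders_one_skill_line() {
    let skills = vec![SkillMetadata {
        name: "demo-skill".into(),
        description: "does things carefully".into(),
        path: PathBuf::from("/home/user/.codex/skills/demo/SKILL.md"),
    }];
    let section = render_skills_section(&skills).expect("non-empty input yields a section");
    assert!(section.starts_with("## Skills"));
    assert!(section.contains(
        "- demo-skill: does things carefully (file: /home/user/.codex/skills/demo/SKILL.md)"
    ));
}
```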

View File

@@ -3,6 +3,7 @@ use std::sync::Arc;
use crate::AuthManager;
use crate::RolloutRecorder;
use crate::mcp_connection_manager::McpConnectionManager;
use crate::openai_models::models_manager::ModelsManager;
use crate::tools::sandboxing::ApprovalStore;
use crate::unified_exec::UnifiedExecSessionManager;
use crate::user_notification::UserNotifier;
@@ -20,6 +21,7 @@ pub(crate) struct SessionServices {
pub(crate) user_shell: crate::shell::Shell,
pub(crate) show_raw_agent_reasoning: bool,
pub(crate) auth_manager: Arc<AuthManager>,
pub(crate) models_manager: Arc<ModelsManager>,
pub(crate) otel_event_manager: OtelEventManager,
pub(crate) tool_approvals: Mutex<ApprovalStore>,
}

View File

@@ -62,7 +62,10 @@ impl SessionState {
}
pub(crate) fn set_rate_limits(&mut self, snapshot: RateLimitSnapshot) {
self.latest_rate_limits = Some(snapshot);
self.latest_rate_limits = Some(merge_rate_limit_credits(
self.latest_rate_limits.as_ref(),
snapshot,
));
}
pub(crate) fn token_info_and_rate_limits(
@@ -79,3 +82,14 @@ impl SessionState {
self.history.get_total_token_usage()
}
}
// New snapshots sometimes omit credits; carry the previous value forward.
fn merge_rate_limit_credits(
previous: Option<&RateLimitSnapshot>,
mut snapshot: RateLimitSnapshot,
) -> RateLimitSnapshot {
if snapshot.credits.is_none() {
snapshot.credits = previous.and_then(|prior| prior.credits.clone());
}
snapshot
}
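A minimal sketch of the carry-forward behavior, using a stub `RateLimitSnapshot` with only a `credits` field (the real struct has more fields):

```rust
#[derive(Clone, Debug, PartialEq)]
struct RateLimitSnapshot {
    credits: Option<String>,
}

fn merge_rate_limit_credits(
    previous: Option<&RateLimitSnapshot>,
    mut snapshot: RateLimitSnapshot,
) -> RateLimitSnapshot {
    if snapshot.credits.is_none() {
        snapshot.credits = previous.and_then(|prior| prior.credits.clone());
    }
    snapshot
}

fn main() {
    let prior = RateLimitSnapshot { credits: Some("42".into()) };
    // A fresh snapshot without credits inherits them from the previous one.
    let merged = merge_rate_limit_credits(Some(&prior), RateLimitSnapshot { credits: None });
    assert_eq!(merged.credits.as_deref(), Some("42"));
    // A snapshot that does carry credits keeps its own value.
    let merged =
        merge_rate_limit_credits(Some(&prior), RateLimitSnapshot { credits: Some("7".into()) });
    assert_eq!(merged.credits.as_deref(), Some("7"));
}
```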

View File

@@ -0,0 +1,212 @@
use std::pin::Pin;
use std::sync::Arc;
use codex_protocol::items::TurnItem;
use tokio_util::sync::CancellationToken;
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::error::CodexErr;
use crate::error::Result;
use crate::function_tool::FunctionCallError;
use crate::parse_turn_item;
use crate::tools::parallel::ToolCallRuntime;
use crate::tools::router::ToolRouter;
use codex_protocol::models::FunctionCallOutputPayload;
use codex_protocol::models::ResponseInputItem;
use codex_protocol::models::ResponseItem;
use futures::Future;
use tracing::debug;
/// Handle a completed output item from the model stream, recording it and
/// queuing any tool execution futures. This records items immediately so
/// history and rollout stay in sync even if the turn is later cancelled.
pub(crate) type InFlightFuture<'f> =
Pin<Box<dyn Future<Output = Result<ResponseInputItem>> + Send + 'f>>;
#[derive(Default)]
pub(crate) struct OutputItemResult {
pub last_agent_message: Option<String>,
pub needs_follow_up: bool,
pub tool_future: Option<InFlightFuture<'static>>,
}
pub(crate) struct HandleOutputCtx {
pub sess: Arc<Session>,
pub turn_context: Arc<TurnContext>,
pub tool_runtime: ToolCallRuntime,
pub cancellation_token: CancellationToken,
}
pub(crate) async fn handle_output_item_done(
ctx: &mut HandleOutputCtx,
item: ResponseItem,
previously_active_item: Option<TurnItem>,
) -> Result<OutputItemResult> {
let mut output = OutputItemResult::default();
match ToolRouter::build_tool_call(ctx.sess.as_ref(), item.clone()).await {
// The model emitted a tool call; log it, persist the item immediately, and queue the tool execution.
Ok(Some(call)) => {
let payload_preview = call.payload.log_payload().into_owned();
tracing::info!("ToolCall: {} {}", call.tool_name, payload_preview);
ctx.sess
.record_conversation_items(&ctx.turn_context, std::slice::from_ref(&item))
.await;
let cancellation_token = ctx.cancellation_token.child_token();
let tool_runtime = ctx.tool_runtime.clone();
let tool_future: InFlightFuture<'static> = Box::pin(async move {
let response_input = tool_runtime
.handle_tool_call(call, cancellation_token)
.await?;
Ok(response_input)
});
output.needs_follow_up = true;
output.tool_future = Some(tool_future);
}
// No tool call: convert messages/reasoning into turn items and mark them as complete.
Ok(None) => {
if let Some(turn_item) = handle_non_tool_response_item(&item).await {
if previously_active_item.is_none() {
ctx.sess
.emit_turn_item_started(&ctx.turn_context, &turn_item)
.await;
}
ctx.sess
.emit_turn_item_completed(&ctx.turn_context, turn_item)
.await;
}
ctx.sess
.record_conversation_items(&ctx.turn_context, std::slice::from_ref(&item))
.await;
let last_agent_message = last_assistant_message_from_item(&item);
output.last_agent_message = last_agent_message;
}
// Guardrail: the model issued a LocalShellCall without an id; surface the error back into history.
Err(FunctionCallError::MissingLocalShellCallId) => {
let msg = "LocalShellCall without call_id or id";
ctx.turn_context
.client
.get_otel_event_manager()
.log_tool_failed("local_shell", msg);
tracing::error!(msg);
let response = ResponseInputItem::FunctionCallOutput {
call_id: String::new(),
output: FunctionCallOutputPayload {
content: msg.to_string(),
..Default::default()
},
};
ctx.sess
.record_conversation_items(&ctx.turn_context, std::slice::from_ref(&item))
.await;
if let Some(response_item) = response_input_to_response_item(&response) {
ctx.sess
.record_conversation_items(
&ctx.turn_context,
std::slice::from_ref(&response_item),
)
.await;
}
output.needs_follow_up = true;
}
// The tool request should be answered directly (or was denied); push that response into the transcript.
Err(FunctionCallError::RespondToModel(message))
| Err(FunctionCallError::Denied(message)) => {
let response = ResponseInputItem::FunctionCallOutput {
call_id: String::new(),
output: FunctionCallOutputPayload {
content: message,
..Default::default()
},
};
ctx.sess
.record_conversation_items(&ctx.turn_context, std::slice::from_ref(&item))
.await;
if let Some(response_item) = response_input_to_response_item(&response) {
ctx.sess
.record_conversation_items(
&ctx.turn_context,
std::slice::from_ref(&response_item),
)
.await;
}
output.needs_follow_up = true;
}
// A fatal error occurred; surface it back into history.
Err(FunctionCallError::Fatal(message)) => {
return Err(CodexErr::Fatal(message));
}
}
Ok(output)
}
pub(crate) async fn handle_non_tool_response_item(item: &ResponseItem) -> Option<TurnItem> {
debug!(?item, "Output item");
match item {
ResponseItem::Message { .. }
| ResponseItem::Reasoning { .. }
| ResponseItem::WebSearchCall { .. } => parse_turn_item(item),
ResponseItem::FunctionCallOutput { .. } | ResponseItem::CustomToolCallOutput { .. } => {
debug!("unexpected tool output from stream");
None
}
_ => None,
}
}
pub(crate) fn last_assistant_message_from_item(item: &ResponseItem) -> Option<String> {
if let ResponseItem::Message { role, content, .. } = item
&& role == "assistant"
{
return content.iter().rev().find_map(|ci| match ci {
codex_protocol::models::ContentItem::OutputText { text } => Some(text.clone()),
_ => None,
});
}
None
}
pub(crate) fn response_input_to_response_item(input: &ResponseInputItem) -> Option<ResponseItem> {
match input {
ResponseInputItem::FunctionCallOutput { call_id, output } => {
Some(ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output: output.clone(),
})
}
ResponseInputItem::CustomToolCallOutput { call_id, output } => {
Some(ResponseItem::CustomToolCallOutput {
call_id: call_id.clone(),
output: output.clone(),
})
}
ResponseInputItem::McpToolCallOutput { call_id, result } => {
let output = match result {
Ok(call_tool_result) => FunctionCallOutputPayload::from(call_tool_result),
Err(err) => FunctionCallOutputPayload {
content: err.clone(),
success: Some(false),
..Default::default()
},
};
Some(ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output,
})
}
_ => None,
}
}
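A hedged sketch of the back-to-front message scan, using the protocol types referenced above:

```rust
use codex_protocol::models::ContentItem;
use codex_protocol::models::ResponseItem;

#[test]
fn picks_last_output_text() {
    let item = ResponseItem::Message {
        id: None,
        role: "assistant".into(),
        content: vec![
            ContentItem::OutputText { text: "first".into() },
            ContentItem::OutputText { text: "second".into() },
        ],
    };
    // The scan runs back-to-front, so the most recent OutputText wins.
    assert_eq!(
        last_assistant_message_from_item(&item),
        Some("second".to_string())
    );
}
```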

View File

@@ -25,7 +25,7 @@ impl SessionTask for CompactTask {
_cancellation_token: CancellationToken,
) -> Option<String> {
let session = session.clone_session();
if crate::compact::should_use_remote_compact_task(&session).await {
if crate::compact::should_use_remote_compact_task(&session) {
crate::compact_remote::run_remote_compact_task(session, ctx).await
} else {
crate::compact::run_compact_task(session, ctx, input).await

View File

@@ -19,6 +19,7 @@ use tracing::warn;
use crate::AuthManager;
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::openai_models::models_manager::ModelsManager;
use crate::protocol::EventMsg;
use crate::protocol::TaskCompleteEvent;
use crate::protocol::TurnAbortReason;
@@ -55,6 +56,10 @@ impl SessionTaskContext {
pub(crate) fn auth_manager(&self) -> Arc<AuthManager> {
Arc::clone(&self.session.services.auth_manager)
}
pub(crate) fn models_manager(&self) -> Arc<ModelsManager> {
Arc::clone(&self.session.services.models_manager)
}
}
/// Async task that drives a [`Session`] turn.

View File

@@ -16,6 +16,7 @@ use tokio_util::sync::CancellationToken;
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::codex_delegate::run_codex_conversation_one_shot;
use crate::protocol::SandboxPolicy;
use crate::review_format::format_review_findings_block;
use crate::review_format::render_review_output_text;
use crate::state::TaskKind;
@@ -77,6 +78,7 @@ async fn start_review_conversation(
) -> Option<async_channel::Receiver<Event>> {
let config = ctx.client.config();
let mut sub_agent_config = config.as_ref().clone();
sub_agent_config.sandbox_policy = SandboxPolicy::new_read_only_policy();
// Run with only reviewer rubric — drop outer user_instructions
sub_agent_config.user_instructions = None;
// Avoid loading project docs; reviewer only needs findings
@@ -93,6 +95,7 @@ async fn start_review_conversation(
(run_codex_conversation_one_shot(
sub_agent_config,
session.auth_manager(),
session.models_manager(),
input,
session.clone_session(),
ctx.clone(),

View File

@@ -1,4 +1,5 @@
use std::collections::BTreeMap;
use std::path::Path;
use crate::apply_patch;
use crate::apply_patch::InternalApplyPatchInvocation;
@@ -7,7 +8,10 @@ use crate::client_common::tools::FreeformTool;
use crate::client_common::tools::FreeformToolFormat;
use crate::client_common::tools::ResponsesApiTool;
use crate::client_common::tools::ToolSpec;
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::function_tool::FunctionCallError;
use crate::tools::context::SharedTurnDiffTracker;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
use crate::tools::context::ToolPayload;
@@ -164,6 +168,86 @@ pub enum ApplyPatchToolType {
Function,
}
#[allow(clippy::too_many_arguments)]
pub(crate) async fn intercept_apply_patch(
command: &[String],
cwd: &Path,
timeout_ms: Option<u64>,
session: &Session,
turn: &TurnContext,
tracker: Option<&SharedTurnDiffTracker>,
call_id: &str,
tool_name: &str,
) -> Result<Option<ToolOutput>, FunctionCallError> {
match codex_apply_patch::maybe_parse_apply_patch_verified(command, cwd) {
codex_apply_patch::MaybeApplyPatchVerified::Body(changes) => {
session
.record_model_warning(
format!("apply_patch was requested via {tool_name}. Use the apply_patch tool instead of exec_command."),
turn,
)
.await;
match apply_patch::apply_patch(session, turn, call_id, changes).await {
InternalApplyPatchInvocation::Output(item) => {
let content = item?;
Ok(Some(ToolOutput::Function {
content,
content_items: None,
success: Some(true),
}))
}
InternalApplyPatchInvocation::DelegateToExec(apply) => {
let emitter = ToolEmitter::apply_patch(
convert_apply_patch_to_protocol(&apply.action),
!apply.user_explicitly_approved_this_action,
);
let event_ctx =
ToolEventCtx::new(session, turn, call_id, tracker.as_ref().copied());
emitter.begin(event_ctx).await;
let req = ApplyPatchRequest {
patch: apply.action.patch.clone(),
cwd: apply.action.cwd.clone(),
timeout_ms,
user_explicitly_approved: apply.user_explicitly_approved_this_action,
codex_exe: turn.codex_linux_sandbox_exe.clone(),
};
let mut orchestrator = ToolOrchestrator::new();
let mut runtime = ApplyPatchRuntime::new();
let tool_ctx = ToolCtx {
session,
turn,
call_id: call_id.to_string(),
tool_name: tool_name.to_string(),
};
let out = orchestrator
.run(&mut runtime, &req, &tool_ctx, turn, turn.approval_policy)
.await;
let event_ctx =
ToolEventCtx::new(session, turn, call_id, tracker.as_ref().copied());
let content = emitter.finish(event_ctx, out).await?;
Ok(Some(ToolOutput::Function {
content,
content_items: None,
success: Some(true),
}))
}
}
}
codex_apply_patch::MaybeApplyPatchVerified::CorrectnessError(parse_error) => {
Err(FunctionCallError::RespondToModel(format!(
"apply_patch verification failed: {parse_error}"
)))
}
codex_apply_patch::MaybeApplyPatchVerified::ShellParseError(error) => {
tracing::trace!("Failed to parse apply_patch input, {error:?}");
Ok(None)
}
codex_apply_patch::MaybeApplyPatchVerified::NotApplyPatch => Ok(None),
}
}
/// Returns a custom tool that can be used to edit files. Well-suited for GPT-5 models
/// https://platform.openai.com/docs/guides/function-calling#custom-tools
pub(crate) fn create_apply_patch_freeform_tool() -> ToolSpec {

View File

@@ -3,13 +3,10 @@ use codex_protocol::models::ShellCommandToolCallParams;
use codex_protocol::models::ShellToolCallParams;
use std::sync::Arc;
use crate::apply_patch;
use crate::apply_patch::InternalApplyPatchInvocation;
use crate::apply_patch::convert_apply_patch_to_protocol;
use crate::codex::TurnContext;
use crate::exec::ExecParams;
use crate::exec_env::create_env;
use crate::exec_policy::create_approval_requirement_for_command;
use crate::exec_policy::create_exec_approval_requirement_for_command;
use crate::function_tool::FunctionCallError;
use crate::is_safe_command::is_known_safe_command;
use crate::protocol::ExecCommandSource;
@@ -19,11 +16,10 @@ use crate::tools::context::ToolOutput;
use crate::tools::context::ToolPayload;
use crate::tools::events::ToolEmitter;
use crate::tools::events::ToolEventCtx;
use crate::tools::handlers::apply_patch::intercept_apply_patch;
use crate::tools::orchestrator::ToolOrchestrator;
use crate::tools::registry::ToolHandler;
use crate::tools::registry::ToolKind;
use crate::tools::runtimes::apply_patch::ApplyPatchRequest;
use crate::tools::runtimes::apply_patch::ApplyPatchRuntime;
use crate::tools::runtimes::shell::ShellRequest;
use crate::tools::runtimes::shell::ShellRuntime;
use crate::tools::sandboxing::ToolCtx;
@@ -210,81 +206,19 @@ impl ShellHandler {
}
// Intercept apply_patch if present.
match codex_apply_patch::maybe_parse_apply_patch_verified(
if let Some(output) = intercept_apply_patch(
&exec_params.command,
&exec_params.cwd,
) {
codex_apply_patch::MaybeApplyPatchVerified::Body(changes) => {
match apply_patch::apply_patch(session.as_ref(), turn.as_ref(), &call_id, changes)
.await
{
InternalApplyPatchInvocation::Output(item) => {
// Programmatic apply_patch path; return its result.
let content = item?;
return Ok(ToolOutput::Function {
content,
content_items: None,
success: Some(true),
});
}
InternalApplyPatchInvocation::DelegateToExec(apply) => {
let emitter = ToolEmitter::apply_patch(
convert_apply_patch_to_protocol(&apply.action),
!apply.user_explicitly_approved_this_action,
);
let event_ctx = ToolEventCtx::new(
session.as_ref(),
turn.as_ref(),
&call_id,
Some(&tracker),
);
emitter.begin(event_ctx).await;
let req = ApplyPatchRequest {
patch: apply.action.patch.clone(),
cwd: apply.action.cwd.clone(),
timeout_ms: exec_params.expiration.timeout_ms(),
user_explicitly_approved: apply.user_explicitly_approved_this_action,
codex_exe: turn.codex_linux_sandbox_exe.clone(),
};
let mut orchestrator = ToolOrchestrator::new();
let mut runtime = ApplyPatchRuntime::new();
let tool_ctx = ToolCtx {
session: session.as_ref(),
turn: turn.as_ref(),
call_id: call_id.clone(),
tool_name: tool_name.to_string(),
};
let out = orchestrator
.run(&mut runtime, &req, &tool_ctx, &turn, turn.approval_policy)
.await;
let event_ctx = ToolEventCtx::new(
session.as_ref(),
turn.as_ref(),
&call_id,
Some(&tracker),
);
let content = emitter.finish(event_ctx, out).await?;
return Ok(ToolOutput::Function {
content,
content_items: None,
success: Some(true),
});
}
}
}
codex_apply_patch::MaybeApplyPatchVerified::CorrectnessError(parse_error) => {
return Err(FunctionCallError::RespondToModel(format!(
"apply_patch verification failed: {parse_error}"
)));
}
codex_apply_patch::MaybeApplyPatchVerified::ShellParseError(error) => {
tracing::trace!("Failed to parse shell command, {error:?}");
// Fall through to regular shell execution.
}
codex_apply_patch::MaybeApplyPatchVerified::NotApplyPatch => {
// Fall through to regular shell execution.
}
exec_params.expiration.timeout_ms(),
session.as_ref(),
turn.as_ref(),
Some(&tracker),
&call_id,
tool_name,
)
.await?
{
return Ok(output);
}
let source = ExecCommandSource::Agent;
@@ -297,6 +231,17 @@ impl ShellHandler {
let event_ctx = ToolEventCtx::new(session.as_ref(), turn.as_ref(), &call_id, None);
emitter.begin(event_ctx).await;
let features = session.features();
let exec_approval_requirement = create_exec_approval_requirement_for_command(
&turn.exec_policy,
&features,
&exec_params.command,
turn.approval_policy,
&turn.sandbox_policy,
SandboxPermissions::from(exec_params.with_escalated_permissions.unwrap_or(false)),
)
.await;
let req = ShellRequest {
command: exec_params.command.clone(),
cwd: exec_params.cwd.clone(),
@@ -304,13 +249,7 @@ impl ShellHandler {
env: exec_params.env.clone(),
with_escalated_permissions: exec_params.with_escalated_permissions,
justification: exec_params.justification.clone(),
approval_requirement: create_approval_requirement_for_command(
&turn.exec_policy,
&exec_params.command,
turn.approval_policy,
&turn.sandbox_policy,
SandboxPermissions::from(exec_params.with_escalated_permissions.unwrap_or(false)),
),
exec_approval_requirement,
};
let mut orchestrator = ToolOrchestrator::new();
let mut runtime = ShellRuntime::new();

View File

@@ -6,6 +6,7 @@ use crate::protocol::EventMsg;
use crate::protocol::ExecCommandOutputDeltaEvent;
use crate::protocol::ExecCommandSource;
use crate::protocol::ExecOutputStream;
use crate::shell::default_user_shell;
use crate::shell::get_shell_by_model_provided_path;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
@@ -13,6 +14,7 @@ use crate::tools::context::ToolPayload;
use crate::tools::events::ToolEmitter;
use crate::tools::events::ToolEventCtx;
use crate::tools::events::ToolEventStage;
use crate::tools::handlers::apply_patch::intercept_apply_patch;
use crate::tools::registry::ToolHandler;
use crate::tools::registry::ToolKind;
use crate::unified_exec::ExecCommandRequest;
@@ -30,8 +32,8 @@ struct ExecCommandArgs {
cmd: String,
#[serde(default)]
workdir: Option<String>,
#[serde(default = "default_shell")]
shell: String,
#[serde(default)]
shell: Option<String>,
#[serde(default = "default_login")]
login: bool,
#[serde(default = "default_exec_yield_time_ms")]
@@ -64,10 +66,6 @@ fn default_write_stdin_yield_time_ms() -> u64 {
250
}
fn default_shell() -> String {
"/bin/bash".to_string()
}
fn default_login() -> bool {
true
}
@@ -103,6 +101,7 @@ impl ToolHandler for UnifiedExecHandler {
let ToolInvocation {
session,
turn,
tracker,
call_id,
tool_name,
payload,
@@ -147,18 +146,34 @@ impl ToolHandler for UnifiedExecHandler {
codex_protocol::protocol::AskForApproval::OnRequest
)
{
manager.release_process_id(&process_id).await;
return Err(FunctionCallError::RespondToModel(format!(
"approval policy is {policy:?}; reject command — you cannot ask for escalated permissions if the approval policy is {policy:?}",
policy = context.turn.approval_policy
)));
}
let workdir = workdir
.as_deref()
.filter(|value| !value.is_empty())
.map(PathBuf::from);
let workdir = workdir.filter(|value| !value.is_empty());
let workdir = workdir.map(|dir| context.turn.resolve_path(Some(dir)));
let cwd = workdir.clone().unwrap_or_else(|| context.turn.cwd.clone());
if let Some(output) = intercept_apply_patch(
&command,
&cwd,
Some(yield_time_ms),
context.session.as_ref(),
context.turn.as_ref(),
Some(&tracker),
&context.call_id,
tool_name.as_str(),
)
.await?
{
manager.release_process_id(&process_id).await;
return Ok(output);
}
let event_ctx = ToolEventCtx::new(
context.session.as_ref(),
context.turn.as_ref(),
@@ -241,7 +256,12 @@ impl ToolHandler for UnifiedExecHandler {
}
fn get_command(args: &ExecCommandArgs) -> Vec<String> {
let shell = get_shell_by_model_provided_path(&PathBuf::from(args.shell.clone()));
let shell = if let Some(shell_str) = &args.shell {
get_shell_by_model_provided_path(&PathBuf::from(shell_str))
} else {
default_user_shell()
};
shell.derive_exec_args(&args.cmd, args.login)
}
@@ -273,3 +293,65 @@ fn format_response(response: &UnifiedExecResponse) -> String {
sections.join("\n")
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_get_command_uses_default_shell_when_unspecified() {
let json = r#"{"cmd": "echo hello"}"#;
let args: ExecCommandArgs =
serde_json::from_str(json).expect("deserialize ExecCommandArgs");
assert!(args.shell.is_none());
let command = get_command(&args);
assert_eq!(command.len(), 3);
assert_eq!(command[2], "echo hello");
}
#[test]
fn test_get_command_respects_explicit_bash_shell() {
let json = r#"{"cmd": "echo hello", "shell": "/bin/bash"}"#;
let args: ExecCommandArgs =
serde_json::from_str(json).expect("deserialize ExecCommandArgs");
assert_eq!(args.shell.as_deref(), Some("/bin/bash"));
let command = get_command(&args);
assert_eq!(command[2], "echo hello");
}
#[test]
fn test_get_command_respects_explicit_powershell_shell() {
let json = r#"{"cmd": "echo hello", "shell": "powershell"}"#;
let args: ExecCommandArgs =
serde_json::from_str(json).expect("deserialize ExecCommandArgs");
assert_eq!(args.shell.as_deref(), Some("powershell"));
let command = get_command(&args);
assert_eq!(command[2], "echo hello");
}
#[test]
fn test_get_command_respects_explicit_cmd_shell() {
let json = r#"{"cmd": "echo hello", "shell": "cmd"}"#;
let args: ExecCommandArgs =
serde_json::from_str(json).expect("deserialize ExecCommandArgs");
assert_eq!(args.shell.as_deref(), Some("cmd"));
let command = get_command(&args);
assert_eq!(command[2], "echo hello");
}
}

View File

@@ -11,14 +11,14 @@ use crate::error::get_error_message_ui;
use crate::exec::ExecToolCallOutput;
use crate::sandboxing::SandboxManager;
use crate::tools::sandboxing::ApprovalCtx;
use crate::tools::sandboxing::ApprovalRequirement;
use crate::tools::sandboxing::ExecApprovalRequirement;
use crate::tools::sandboxing::ProvidesSandboxRetryData;
use crate::tools::sandboxing::SandboxAttempt;
use crate::tools::sandboxing::SandboxOverride;
use crate::tools::sandboxing::ToolCtx;
use crate::tools::sandboxing::ToolError;
use crate::tools::sandboxing::ToolRuntime;
use crate::tools::sandboxing::default_approval_requirement;
use crate::tools::sandboxing::default_exec_approval_requirement;
use codex_protocol::protocol::AskForApproval;
use codex_protocol::protocol::ReviewDecision;
@@ -54,17 +54,17 @@ impl ToolOrchestrator {
// 1) Approval
let mut already_approved = false;
let requirement = tool.approval_requirement(req).unwrap_or_else(|| {
default_approval_requirement(approval_policy, &turn_ctx.sandbox_policy)
let requirement = tool.exec_approval_requirement(req).unwrap_or_else(|| {
default_exec_approval_requirement(approval_policy, &turn_ctx.sandbox_policy)
});
match requirement {
ApprovalRequirement::Skip { .. } => {
otel.tool_decision(otel_tn, otel_ci, ReviewDecision::Approved, otel_cfg);
ExecApprovalRequirement::Skip { .. } => {
otel.tool_decision(otel_tn, otel_ci, &ReviewDecision::Approved, otel_cfg);
}
ApprovalRequirement::Forbidden { reason } => {
ExecApprovalRequirement::Forbidden { reason } => {
return Err(ToolError::Rejected(reason));
}
ApprovalRequirement::NeedsApproval { reason } => {
ExecApprovalRequirement::NeedsApproval { reason, .. } => {
let mut risk = None;
if let Some(metadata) = req.sandbox_retry_data() {
@@ -88,13 +88,15 @@ impl ToolOrchestrator {
};
let decision = tool.start_approval_async(req, approval_ctx).await;
otel.tool_decision(otel_tn, otel_ci, decision, otel_user.clone());
otel.tool_decision(otel_tn, otel_ci, &decision, otel_user.clone());
match decision {
ReviewDecision::Denied | ReviewDecision::Abort => {
return Err(ToolError::Rejected("rejected by user".to_string()));
}
ReviewDecision::Approved | ReviewDecision::ApprovedForSession => {}
ReviewDecision::Approved
| ReviewDecision::ApprovedExecpolicyAmendment { .. }
| ReviewDecision::ApprovedForSession => {}
}
already_approved = true;
}
@@ -169,13 +171,15 @@ impl ToolOrchestrator {
};
let decision = tool.start_approval_async(req, approval_ctx).await;
otel.tool_decision(otel_tn, otel_ci, decision, otel_user);
otel.tool_decision(otel_tn, otel_ci, &decision, otel_user);
match decision {
ReviewDecision::Denied | ReviewDecision::Abort => {
return Err(ToolError::Rejected("rejected by user".to_string()));
}
ReviewDecision::Approved | ReviewDecision::ApprovedForSession => {}
ReviewDecision::Approved
| ReviewDecision::ApprovedExecpolicyAmendment { .. }
| ReviewDecision::ApprovedForSession => {}
}
}

View File

@@ -17,6 +17,7 @@ use crate::tools::router::ToolRouter;
use codex_protocol::models::FunctionCallOutputPayload;
use codex_protocol::models::ResponseInputItem;
#[derive(Clone)]
pub(crate) struct ToolCallRuntime {
router: Arc<ToolRouter>,
session: Arc<Session>,

View File

@@ -127,6 +127,7 @@ impl Approvable<ApplyPatchRequest> for ApplyPatchRuntime {
cwd,
Some(reason),
risk,
None,
)
.await
} else if user_explicitly_approved {

View File

@@ -9,7 +9,7 @@ use crate::sandboxing::execute_env;
use crate::tools::runtimes::build_command_spec;
use crate::tools::sandboxing::Approvable;
use crate::tools::sandboxing::ApprovalCtx;
use crate::tools::sandboxing::ApprovalRequirement;
use crate::tools::sandboxing::ExecApprovalRequirement;
use crate::tools::sandboxing::ProvidesSandboxRetryData;
use crate::tools::sandboxing::SandboxAttempt;
use crate::tools::sandboxing::SandboxOverride;
@@ -32,7 +32,7 @@ pub struct ShellRequest {
pub env: std::collections::HashMap<String, String>,
pub with_escalated_permissions: Option<bool>,
pub justification: Option<String>,
pub approval_requirement: ApprovalRequirement,
pub exec_approval_requirement: ExecApprovalRequirement,
}
impl ProvidesSandboxRetryData for ShellRequest {
@@ -107,22 +107,32 @@ impl Approvable<ShellRequest> for ShellRuntime {
Box::pin(async move {
with_cached_approval(&session.services, key, move || async move {
session
.request_command_approval(turn, call_id, command, cwd, reason, risk)
.request_command_approval(
turn,
call_id,
command,
cwd,
reason,
risk,
req.exec_approval_requirement
.proposed_execpolicy_amendment()
.cloned(),
)
.await
})
.await
})
}
fn approval_requirement(&self, req: &ShellRequest) -> Option<ApprovalRequirement> {
Some(req.approval_requirement.clone())
fn exec_approval_requirement(&self, req: &ShellRequest) -> Option<ExecApprovalRequirement> {
Some(req.exec_approval_requirement.clone())
}
fn sandbox_mode_for_first_attempt(&self, req: &ShellRequest) -> SandboxOverride {
if req.with_escalated_permissions.unwrap_or(false)
|| matches!(
req.approval_requirement,
ApprovalRequirement::Skip {
req.exec_approval_requirement,
ExecApprovalRequirement::Skip {
bypass_sandbox: true
}
)

View File

@@ -10,7 +10,7 @@ use crate::exec::ExecExpiration;
use crate::tools::runtimes::build_command_spec;
use crate::tools::sandboxing::Approvable;
use crate::tools::sandboxing::ApprovalCtx;
use crate::tools::sandboxing::ApprovalRequirement;
use crate::tools::sandboxing::ExecApprovalRequirement;
use crate::tools::sandboxing::ProvidesSandboxRetryData;
use crate::tools::sandboxing::SandboxAttempt;
use crate::tools::sandboxing::SandboxOverride;
@@ -36,7 +36,7 @@ pub struct UnifiedExecRequest {
pub env: HashMap<String, String>,
pub with_escalated_permissions: Option<bool>,
pub justification: Option<String>,
pub approval_requirement: ApprovalRequirement,
pub exec_approval_requirement: ExecApprovalRequirement,
}
impl ProvidesSandboxRetryData for UnifiedExecRequest {
@@ -66,7 +66,7 @@ impl UnifiedExecRequest {
env: HashMap<String, String>,
with_escalated_permissions: Option<bool>,
justification: Option<String>,
approval_requirement: ApprovalRequirement,
exec_approval_requirement: ExecApprovalRequirement,
) -> Self {
Self {
command,
@@ -74,7 +74,7 @@ impl UnifiedExecRequest {
env,
with_escalated_permissions,
justification,
approval_requirement,
exec_approval_requirement,
}
}
}
@@ -125,22 +125,35 @@ impl Approvable<UnifiedExecRequest> for UnifiedExecRuntime<'_> {
Box::pin(async move {
with_cached_approval(&session.services, key, || async move {
session
.request_command_approval(turn, call_id, command, cwd, reason, risk)
.request_command_approval(
turn,
call_id,
command,
cwd,
reason,
risk,
req.exec_approval_requirement
.proposed_execpolicy_amendment()
.cloned(),
)
.await
})
.await
})
}
fn approval_requirement(&self, req: &UnifiedExecRequest) -> Option<ApprovalRequirement> {
Some(req.approval_requirement.clone())
fn exec_approval_requirement(
&self,
req: &UnifiedExecRequest,
) -> Option<ExecApprovalRequirement> {
Some(req.exec_approval_requirement.clone())
}
fn sandbox_mode_for_first_attempt(&self, req: &UnifiedExecRequest) -> SandboxOverride {
if req.with_escalated_permissions.unwrap_or(false)
|| matches!(
req.approval_requirement,
ApprovalRequirement::Skip {
req.exec_approval_requirement,
ExecApprovalRequirement::Skip {
bypass_sandbox: true
}
)

View File

@@ -13,6 +13,7 @@ use crate::sandboxing::CommandSpec;
use crate::sandboxing::SandboxManager;
use crate::sandboxing::SandboxTransformError;
use crate::state::SessionServices;
use codex_protocol::approvals::ExecPolicyAmendment;
use codex_protocol::protocol::AskForApproval;
use codex_protocol::protocol::ReviewDecision;
use std::collections::HashMap;
@@ -88,26 +89,43 @@ pub(crate) struct ApprovalCtx<'a> {
// Specifies what tool orchestrator should do with a given tool call.
#[derive(Clone, Debug, PartialEq, Eq)]
pub(crate) enum ApprovalRequirement {
pub(crate) enum ExecApprovalRequirement {
/// No approval required for this tool call.
Skip {
/// The first attempt should skip sandboxing (e.g., when explicitly
/// greenlit by policy).
bypass_sandbox: bool,
},
/// Approval required for this tool call
NeedsApproval { reason: Option<String> },
/// Execution forbidden for this tool call
/// Approval required for this tool call.
NeedsApproval {
reason: Option<String>,
/// Proposed execpolicy amendment to skip future approvals for similar commands
/// See core/src/exec_policy.rs for more details on how proposed_execpolicy_amendment is determined.
proposed_execpolicy_amendment: Option<ExecPolicyAmendment>,
},
/// Execution forbidden for this tool call.
Forbidden { reason: String },
}
impl ExecApprovalRequirement {
pub fn proposed_execpolicy_amendment(&self) -> Option<&ExecPolicyAmendment> {
match self {
Self::NeedsApproval {
proposed_execpolicy_amendment: Some(prefix),
..
} => Some(prefix),
_ => None,
}
}
}
/// - Never, OnFailure: do not ask
/// - OnRequest: ask unless sandbox policy is DangerFullAccess
/// - UnlessTrusted: always ask
pub(crate) fn default_approval_requirement(
pub(crate) fn default_exec_approval_requirement(
policy: AskForApproval,
sandbox_policy: &SandboxPolicy,
) -> ApprovalRequirement {
) -> ExecApprovalRequirement {
let needs_approval = match policy {
AskForApproval::Never | AskForApproval::OnFailure => false,
AskForApproval::OnRequest => !matches!(sandbox_policy, SandboxPolicy::DangerFullAccess),
@@ -115,9 +133,12 @@ pub(crate) fn default_approval_requirement(
};
if needs_approval {
ApprovalRequirement::NeedsApproval { reason: None }
ExecApprovalRequirement::NeedsApproval {
reason: None,
proposed_execpolicy_amendment: None,
}
} else {
ApprovalRequirement::Skip {
ExecApprovalRequirement::Skip {
bypass_sandbox: false,
}
}
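The mapping reads as a small truth table; a self-contained sketch with simplified stand-ins for the protocol enums (the real `SandboxPolicy` has more variants than shown here):

```rust
// Simplified stand-ins for codex_protocol's AskForApproval and SandboxPolicy.
#[derive(Clone, Copy)]
enum AskForApproval { Never, OnFailure, OnRequest, UnlessTrusted }

#[derive(Clone, Copy, PartialEq)]
enum SandboxPolicy { DangerFullAccess, ReadOnly }

fn needs_approval(policy: AskForApproval, sandbox: SandboxPolicy) -> bool {
    match policy {
        // Never, OnFailure: do not ask.
        AskForApproval::Never | AskForApproval::OnFailure => false,
        // OnRequest: ask unless sandboxing is already fully disabled.
        AskForApproval::OnRequest => sandbox != SandboxPolicy::DangerFullAccess,
        // UnlessTrusted: always ask.
        AskForApproval::UnlessTrusted => true,
    }
}

fn main() {
    assert!(needs_approval(AskForApproval::OnRequest, SandboxPolicy::ReadOnly));
    assert!(!needs_approval(AskForApproval::OnRequest, SandboxPolicy::DangerFullAccess));
    assert!(needs_approval(AskForApproval::UnlessTrusted, SandboxPolicy::DangerFullAccess));
}
```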
@@ -149,10 +170,9 @@ pub(crate) trait Approvable<Req> {
matches!(policy, AskForApproval::Never)
}
/// Override the default approval requirement. Return `Some(_)` to specify
/// a custom requirement, or `None` to fall back to
/// policy-based default.
fn approval_requirement(&self, _req: &Req) -> Option<ApprovalRequirement> {
/// Return `Some(_)` to specify a custom exec approval requirement, or `None`
/// to fall back to policy-based default.
fn exec_approval_requirement(&self, _req: &Req) -> Option<ExecApprovalRequirement> {
None
}
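A sketch of how a runtime might use the hook above to sidestep the policy-based default; the trait and request types here are simplified, not the crate's actual signatures:

```rust
// Trimmed-down version of the Approvable hook for illustration.
enum ExecApprovalRequirement {
    Skip { bypass_sandbox: bool },
}

trait Approvable<Req> {
    // Return Some(_) for a custom requirement, None for the policy default.
    fn exec_approval_requirement(&self, _req: &Req) -> Option<ExecApprovalRequirement> {
        None
    }
}

struct TrustedToolRuntime; // hypothetical runtime
struct TrustedRequest;

impl Approvable<TrustedRequest> for TrustedToolRuntime {
    fn exec_approval_requirement(&self, _req: &TrustedRequest) -> Option<ExecApprovalRequirement> {
        // This runtime precomputes its requirement instead of falling back
        // to the AskForApproval-based default.
        Some(ExecApprovalRequirement::Skip { bypass_sandbox: false })
    }
}

fn main() {
    let rt = TrustedToolRuntime;
    assert!(rt.exec_approval_requirement(&TrustedRequest).is_some());
}
```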

View File

@@ -2,7 +2,7 @@ use crate::client_common::tools::ResponsesApiTool;
use crate::client_common::tools::ToolSpec;
use crate::features::Feature;
use crate::features::Features;
use crate::model_family::ModelFamily;
use crate::openai_models::model_family::ModelFamily;
use crate::tools::handlers::PLAN_TOOL;
use crate::tools::handlers::apply_patch::ApplyPatchToolType;
use crate::tools::handlers::apply_patch::create_apply_patch_freeform_tool;
@@ -1118,7 +1118,7 @@ pub(crate) fn build_specs(
#[cfg(test)]
mod tests {
use crate::client_common::tools::FreeformTool;
use crate::model_family::find_family_for_model;
use crate::openai_models::model_family::find_family_for_model;
use crate::tools::registry::ConfiguredToolSpec;
use mcp_types::ToolInputSchema;
use pretty_assertions::assert_eq;
@@ -1213,8 +1213,7 @@ mod tests {
#[test]
fn test_full_toolset_specs_for_gpt5_codex_unified_exec_web_search() {
let model_family = find_family_for_model("gpt-5-codex")
.expect("gpt-5-codex should be a valid model family");
let model_family = find_family_for_model("gpt-5-codex");
let mut features = Features::with_defaults();
features.enable(Feature::UnifiedExec);
features.enable(Feature::WebSearchRequest);
@@ -1273,8 +1272,7 @@ mod tests {
}
fn assert_model_tools(model_family: &str, features: &Features, expected_tools: &[&str]) {
let model_family = find_family_for_model(model_family)
.unwrap_or_else(|| panic!("{model_family} should be a valid model family"));
let model_family = find_family_for_model(model_family);
let config = ToolsConfig::new(&ToolsConfigParams {
model_family: &model_family,
features,
@@ -1466,7 +1464,7 @@ mod tests {
#[test]
fn test_build_specs_default_shell_present() {
let model_family = find_family_for_model("o3").expect("o3 should be a valid model family");
let model_family = find_family_for_model("o3");
let mut features = Features::with_defaults();
features.enable(Feature::WebSearchRequest);
features.enable(Feature::UnifiedExec);
@@ -1487,8 +1485,7 @@ mod tests {
#[test]
#[ignore]
fn test_parallel_support_flags() {
let model_family = find_family_for_model("gpt-5-codex")
.expect("codex-mini-latest should be a valid model family");
let model_family = find_family_for_model("gpt-5-codex");
let mut features = Features::with_defaults();
features.disable(Feature::ViewImageTool);
features.enable(Feature::UnifiedExec);
@@ -1507,8 +1504,7 @@ mod tests {
#[test]
fn test_test_model_family_includes_sync_tool() {
let model_family = find_family_for_model("test-gpt-5-codex")
.expect("test-gpt-5-codex should be a valid model family");
let model_family = find_family_for_model("test-gpt-5-codex");
let mut features = Features::with_defaults();
features.disable(Feature::ViewImageTool);
let config = ToolsConfig::new(&ToolsConfigParams {
@@ -1537,7 +1533,7 @@ mod tests {
#[test]
fn test_build_specs_mcp_tools_converted() {
let model_family = find_family_for_model("o3").expect("o3 should be a valid model family");
let model_family = find_family_for_model("o3");
let mut features = Features::with_defaults();
features.enable(Feature::UnifiedExec);
features.enable(Feature::WebSearchRequest);
@@ -1631,7 +1627,7 @@ mod tests {
#[test]
fn test_build_specs_mcp_tools_sorted_by_name() {
let model_family = find_family_for_model("o3").expect("o3 should be a valid model family");
let model_family = find_family_for_model("o3");
let mut features = Features::with_defaults();
features.enable(Feature::UnifiedExec);
let config = ToolsConfig::new(&ToolsConfigParams {
@@ -1706,8 +1702,7 @@ mod tests {
#[test]
fn test_mcp_tool_property_missing_type_defaults_to_string() {
let model_family = find_family_for_model("gpt-5-codex")
.expect("gpt-5-codex should be a valid model family");
let model_family = find_family_for_model("gpt-5-codex");
let mut features = Features::with_defaults();
features.enable(Feature::UnifiedExec);
features.enable(Feature::WebSearchRequest);
@@ -1763,8 +1758,7 @@ mod tests {
#[test]
fn test_mcp_tool_integer_normalized_to_number() {
let model_family = find_family_for_model("gpt-5-codex")
.expect("gpt-5-codex should be a valid model family");
let model_family = find_family_for_model("gpt-5-codex");
let mut features = Features::with_defaults();
features.enable(Feature::UnifiedExec);
features.enable(Feature::WebSearchRequest);
@@ -1816,8 +1810,7 @@ mod tests {
#[test]
fn test_mcp_tool_array_without_items_gets_default_string_items() {
let model_family = find_family_for_model("gpt-5-codex")
.expect("gpt-5-codex should be a valid model family");
let model_family = find_family_for_model("gpt-5-codex");
let mut features = Features::with_defaults();
features.enable(Feature::UnifiedExec);
features.enable(Feature::WebSearchRequest);
@@ -1873,8 +1866,7 @@ mod tests {
#[test]
fn test_mcp_tool_anyof_defaults_to_string() {
let model_family = find_family_for_model("gpt-5-codex")
.expect("gpt-5-codex should be a valid model family");
let model_family = find_family_for_model("gpt-5-codex");
let mut features = Features::with_defaults();
features.enable(Feature::UnifiedExec);
features.enable(Feature::WebSearchRequest);
@@ -1985,8 +1977,7 @@ Examples of valid command strings:
#[test]
fn test_get_openai_tools_mcp_tools_with_additional_properties_schema() {
let model_family = find_family_for_model("gpt-5-codex")
.expect("gpt-5-codex should be a valid model family");
let model_family = find_family_for_model("gpt-5-codex");
let mut features = Features::with_defaults();
features.enable(Feature::UnifiedExec);
features.enable(Feature::WebSearchRequest);
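Every updated test above drops the `Option` handling around `find_family_for_model`, which implies the lookup is now infallible and resolves unknown model strings to some default family. A rough sketch of that shape; the fallback behavior and `ModelFamily` fields are invented for illustration:

```rust
// Hypothetical infallible lookup: unknown models fall back to a generic
// family instead of returning None, so callers no longer need .expect(...).
#[derive(Clone, Debug)]
struct ModelFamily {
    slug: String,
}

fn find_family_for_model(model: &str) -> ModelFamily {
    match model {
        // Known families keep their dedicated configuration.
        "o3" | "gpt-5-codex" | "test-gpt-5-codex" => ModelFamily { slug: model.to_string() },
        // Everything else resolves to a default family (assumed behavior).
        other => ModelFamily { slug: other.to_string() },
    }
}

fn main() {
    // Both known and unknown models now yield a usable family.
    assert_eq!(find_family_for_model("gpt-5-codex").slug, "gpt-5-codex");
    assert_eq!(find_family_for_model("not-a-model").slug, "not-a-model");
}
```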

View File

@@ -26,10 +26,10 @@ impl TruncationPolicy {
}
}
pub fn new(config: &Config) -> Self {
pub fn new(config: &Config, truncation_policy: TruncationPolicy) -> Self {
let config_token_limit = config.tool_output_token_limit;
match config.model_family.truncation_policy {
match truncation_policy {
TruncationPolicy::Bytes(family_bytes) => {
if let Some(token_limit) = config_token_limit {
Self::Bytes(approx_bytes_for_tokens(token_limit))
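The hunk above is cut short, but the visible arm shows an explicit `tool_output_token_limit` from the config overriding the family's byte-based default. A self-contained sketch of that precedence, assuming the ~4-bytes-per-token conversion implied by the `UNIFIED_EXEC_OUTPUT_MAX_TOKENS` definition in the next file:

```rust
// Assumed conversion, consistent with MAX_TOKENS = MAX_BYTES / 4 below.
fn approx_bytes_for_tokens(tokens: usize) -> usize {
    tokens * 4
}

#[derive(Debug, PartialEq)]
enum TruncationPolicy {
    Bytes(usize),
    Tokens(usize),
}

fn resolve(config_token_limit: Option<usize>, family_default: TruncationPolicy) -> TruncationPolicy {
    match family_default {
        TruncationPolicy::Bytes(family_bytes) => match config_token_limit {
            // An explicit token limit wins, expressed back in bytes.
            Some(token_limit) => TruncationPolicy::Bytes(approx_bytes_for_tokens(token_limit)),
            None => TruncationPolicy::Bytes(family_bytes),
        },
        // Non-byte defaults pass through unchanged (assumed; the hunk is truncated).
        other => other,
    }
}

fn main() {
    assert_eq!(
        resolve(Some(1_000), TruncationPolicy::Bytes(10_000)),
        TruncationPolicy::Bytes(4_000)
    );
}
```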

View File

@@ -48,6 +48,9 @@ pub(crate) const UNIFIED_EXEC_OUTPUT_MAX_BYTES: usize = 1024 * 1024; // 1 MiB
pub(crate) const UNIFIED_EXEC_OUTPUT_MAX_TOKENS: usize = UNIFIED_EXEC_OUTPUT_MAX_BYTES / 4;
pub(crate) const MAX_UNIFIED_EXEC_SESSIONS: usize = 64;
// Send a warning message to the model when the session count reaches this number.
pub(crate) const WARNING_UNIFIED_EXEC_SESSIONS: usize = 60;
pub(crate) struct UnifiedExecContext {
pub session: Arc<Session>,
pub turn: Arc<TurnContext>,

View File

@@ -149,15 +149,11 @@ impl UnifiedExecSession {
guard.snapshot()
}
fn sandbox_type(&self) -> SandboxType {
pub(crate) fn sandbox_type(&self) -> SandboxType {
self.sandbox_type
}
pub(super) async fn check_for_sandbox_denial(&self) -> Result<(), UnifiedExecError> {
if self.sandbox_type() == SandboxType::None || !self.has_exited() {
return Ok(());
}
let _ =
tokio::time::timeout(Duration::from_millis(20), self.output_notify.notified()).await;
@@ -167,30 +163,40 @@ impl UnifiedExecSession {
aggregated.extend_from_slice(&chunk);
}
let aggregated_text = String::from_utf8_lossy(&aggregated).to_string();
let exit_code = self.exit_code().unwrap_or(-1);
self.check_for_sandbox_denial_with_text(&aggregated_text)
.await?;
Ok(())
}
pub(super) async fn check_for_sandbox_denial_with_text(
&self,
text: &str,
) -> Result<(), UnifiedExecError> {
let sandbox_type = self.sandbox_type();
if sandbox_type == SandboxType::None || !self.has_exited() {
return Ok(());
}
let exit_code = self.exit_code().unwrap_or(-1);
let exec_output = ExecToolCallOutput {
exit_code,
stdout: StreamOutput::new(aggregated_text.clone()),
stderr: StreamOutput::new(String::new()),
aggregated_output: StreamOutput::new(aggregated_text.clone()),
duration: Duration::ZERO,
timed_out: false,
stderr: StreamOutput::new(text.to_string()),
aggregated_output: StreamOutput::new(text.to_string()),
..Default::default()
};
if is_likely_sandbox_denied(self.sandbox_type(), &exec_output) {
if is_likely_sandbox_denied(sandbox_type, &exec_output) {
let snippet = formatted_truncate_text(
&aggregated_text,
text,
TruncationPolicy::Tokens(UNIFIED_EXEC_OUTPUT_MAX_TOKENS),
);
let message = if snippet.is_empty() {
format!("exit code {exit_code}")
format!("Session exited with code {exit_code}")
} else {
snippet
};
return Err(UnifiedExecError::sandbox_denied(message, exec_output));
}
Ok(())
}
@@ -205,10 +211,7 @@ impl UnifiedExecSession {
} = spawned;
let managed = Self::new(session, output_rx, sandbox_type);
let exit_ready = match exit_rx.try_recv() {
Ok(_) | Err(TryRecvError::Closed) => true,
Err(TryRecvError::Empty) => false,
};
let exit_ready = matches!(exit_rx.try_recv(), Ok(_) | Err(TryRecvError::Closed));
if exit_ready {
managed.signal_exit();
@@ -216,7 +219,7 @@ impl UnifiedExecSession {
return Ok(managed);
}
if tokio::time::timeout(Duration::from_millis(50), &mut exit_rx)
if tokio::time::timeout(Duration::from_millis(150), &mut exit_rx)
.await
.is_ok()
{
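The exit-detection pattern above (a non-blocking `try_recv` probe followed by a short grace-period await, now 150ms instead of 50ms) works standalone; a minimal sketch using tokio's oneshot and timeout APIs:

```rust
use tokio::sync::oneshot;
use tokio::sync::oneshot::error::TryRecvError;
use tokio::time::{timeout, Duration};

async fn detect_quick_exit(mut exit_rx: oneshot::Receiver<()>) -> bool {
    // Closed counts as exited too: the sender is dropped when the process dies.
    let exit_ready = matches!(exit_rx.try_recv(), Ok(_) | Err(TryRecvError::Closed));
    if exit_ready {
        return true;
    }
    // Give fast-failing commands a moment to report before treating the
    // session as long-running.
    timeout(Duration::from_millis(150), &mut exit_rx).await.is_ok()
}

#[tokio::main]
async fn main() {
    let (tx, rx) = oneshot::channel();
    tx.send(()).unwrap();
    assert!(detect_quick_exit(rx).await);
}
```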

View File

@@ -10,12 +10,13 @@ use tokio::time::Duration;
use tokio::time::Instant;
use tokio_util::sync::CancellationToken;
use crate::bash::extract_bash_command;
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::exec::ExecToolCallOutput;
use crate::exec::StreamOutput;
use crate::exec_env::create_env;
use crate::exec_policy::create_approval_requirement_for_command;
use crate::exec_policy::create_exec_approval_requirement_for_command;
use crate::protocol::BackgroundEventEvent;
use crate::protocol::EventMsg;
use crate::protocol::ExecCommandSource;
@@ -41,6 +42,7 @@ use super::UnifiedExecContext;
use super::UnifiedExecError;
use super::UnifiedExecResponse;
use super::UnifiedExecSessionManager;
use super::WARNING_UNIFIED_EXEC_SESSIONS;
use super::WriteStdinRequest;
use super::clamp_yield_time;
use super::generate_chunk_id;
@@ -109,6 +111,11 @@ impl UnifiedExecSessionManager {
}
}
pub(crate) async fn release_process_id(&self, process_id: &str) {
let mut store = self.session_store.lock().await;
store.remove(process_id);
}
pub(crate) async fn exec_command(
&self,
request: ExecCommandRequest,
@@ -127,7 +134,15 @@ impl UnifiedExecSessionManager {
request.justification,
context,
)
.await?;
.await;
let session = match session {
Ok(session) => session,
Err(err) => {
self.release_process_id(&request.process_id).await;
return Err(err);
}
};
let max_tokens = resolve_max_tokens(request.max_output_tokens);
let yield_time_ms = clamp_yield_time(request.yield_time_ms);
@@ -153,8 +168,24 @@ impl UnifiedExecSessionManager {
let has_exited = session.has_exited();
let exit_code = session.exit_code();
let chunk_id = generate_chunk_id();
let process_id = if has_exited {
None
let process_id = request.process_id.clone();
if has_exited {
self.release_process_id(&request.process_id).await;
let exit = exit_code.unwrap_or(-1);
Self::emit_exec_end_from_context(
context,
&request.command,
cwd,
output.clone(),
exit,
wall_time,
// We always emit the process ID in order to keep consistency between the Begin
// event and the End event.
Some(process_id),
)
.await;
session.check_for_sandbox_denial_with_text(&text).await?;
} else {
// Only store session if not exited.
self.store_session(
@@ -163,45 +194,29 @@ impl UnifiedExecSessionManager {
&request.command,
cwd.clone(),
start,
request.process_id.clone(),
process_id,
)
.await;
Some(request.process_id.clone())
};
let original_token_count = approx_token_count(&text);
Self::emit_waiting_status(&context.session, &context.turn, &request.command).await;
};
let original_token_count = approx_token_count(&text);
let response = UnifiedExecResponse {
event_call_id: context.call_id.clone(),
chunk_id,
wall_time,
output,
process_id: process_id.clone(),
process_id: if has_exited {
None
} else {
Some(request.process_id.clone())
},
exit_code,
original_token_count: Some(original_token_count),
session_command: Some(request.command.clone()),
};
if !has_exited {
Self::emit_waiting_status(&context.session, &context.turn, &request.command).await;
}
// If the command completed during this call, emit an ExecCommandEnd via the emitter.
if has_exited {
let exit = response.exit_code.unwrap_or(-1);
Self::emit_exec_end_from_context(
context,
&request.command,
cwd,
response.output.clone(),
exit,
response.wall_time,
// We always emit the process ID in order to keep consistency between the Begin
// event and the End event.
Some(request.process_id),
)
.await;
}
Ok(response)
}
@@ -421,9 +436,22 @@ impl UnifiedExecSessionManager {
started_at,
last_used: started_at,
};
let mut store = self.session_store.lock().await;
Self::prune_sessions_if_needed(&mut store);
store.sessions.insert(process_id, entry);
let number_sessions = {
let mut store = self.session_store.lock().await;
Self::prune_sessions_if_needed(&mut store);
store.sessions.insert(process_id, entry);
store.sessions.len()
};
if number_sessions >= WARNING_UNIFIED_EXEC_SESSIONS {
context
.session
.record_model_warning(
format!("The maximum number of unified exec sessions you can keep open is {WARNING_UNIFIED_EXEC_SESSIONS} and you currently have {number_sessions} sessions open. Reuse older sessions or close them to prevent automatic pruning of old session"),
&context.turn
)
.await;
};
}
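The count is taken inside a narrow lock scope and the warning is emitted after the guard drops, so the store mutex is never held across an await. A minimal sketch of that shape with simplified types:

```rust
use std::collections::HashMap;
use std::sync::Arc;
use tokio::sync::Mutex;

const WARNING_UNIFIED_EXEC_SESSIONS: usize = 60;

async fn store_and_maybe_warn(
    store: Arc<Mutex<HashMap<String, ()>>>,
    process_id: String,
) -> Option<String> {
    let number_sessions = {
        let mut guard = store.lock().await;
        guard.insert(process_id, ());
        guard.len()
    }; // guard dropped here, before any subsequent await
    (number_sessions >= WARNING_UNIFIED_EXEC_SESSIONS)
        .then(|| format!("{number_sessions} unified exec sessions open"))
}

#[tokio::main]
async fn main() {
    let store = Arc::new(Mutex::new(HashMap::new()));
    assert!(store_and_maybe_warn(store, "p1".into()).await.is_none());
}
```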
async fn emit_exec_end_from_entry(
@@ -498,7 +526,11 @@ impl UnifiedExecSessionManager {
turn: &Arc<TurnContext>,
command: &[String],
) {
let command_display = command.join(" ");
let command_display = if let Some((_, script)) = extract_bash_command(command) {
script.to_string()
} else {
command.join(" ")
};
let message = format!("Waiting for `{command_display}`");
session
.send_event(
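A simplified sketch of the display logic; this `extract_bash_command` is a guess at the helper's behavior (peeling the script out of a `bash -lc <script>` invocation), not the crate's implementation:

```rust
// Assumed behavior: return (shell, script) when the command is a bash -c wrapper.
fn extract_bash_command(command: &[String]) -> Option<(&str, &str)> {
    match command {
        [shell, flag, script]
            if shell.ends_with("bash") && flag.starts_with('-') && flag.contains('c') =>
        {
            Some((shell.as_str(), script.as_str()))
        }
        _ => None,
    }
}

fn waiting_message(command: &[String]) -> String {
    // Prefer the inner script so the status line reads naturally.
    let display = match extract_bash_command(command) {
        Some((_, script)) => script.to_string(),
        None => command.join(" "),
    };
    format!("Waiting for `{display}`")
}

fn main() {
    let cmd = vec!["bash".to_string(), "-lc".to_string(), "cargo test".to_string()];
    // Shows the script itself rather than the `bash -lc` wrapper.
    assert_eq!(waiting_message(&cmd), "Waiting for `cargo test`");
}
```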
@@ -538,21 +570,25 @@ impl UnifiedExecSessionManager {
context: &UnifiedExecContext,
) -> Result<UnifiedExecSession, UnifiedExecError> {
let env = apply_unified_exec_env(create_env(&context.turn.shell_environment_policy));
let features = context.session.features();
let mut orchestrator = ToolOrchestrator::new();
let mut runtime = UnifiedExecRuntime::new(self);
let exec_approval_requirement = create_exec_approval_requirement_for_command(
&context.turn.exec_policy,
&features,
command,
context.turn.approval_policy,
&context.turn.sandbox_policy,
SandboxPermissions::from(with_escalated_permissions.unwrap_or(false)),
)
.await;
let req = UnifiedExecToolRequest::new(
command.to_vec(),
cwd,
env,
with_escalated_permissions,
justification,
create_approval_requirement_for_command(
&context.turn.exec_policy,
command,
context.turn.approval_policy,
&context.turn.sandbox_policy,
SandboxPermissions::from(with_escalated_permissions.unwrap_or(false)),
),
exec_approval_requirement,
);
let tool_ctx = ToolCtx {
session: context.session.as_ref(),
@@ -633,9 +669,9 @@ impl UnifiedExecSessionManager {
collected
}
fn prune_sessions_if_needed(store: &mut SessionStore) {
fn prune_sessions_if_needed(store: &mut SessionStore) -> bool {
if store.sessions.len() < MAX_UNIFIED_EXEC_SESSIONS {
return;
return false;
}
let meta: Vec<(String, Instant, bool)> = store
@@ -646,7 +682,10 @@ impl UnifiedExecSessionManager {
if let Some(session_id) = Self::session_id_to_prune_from_meta(&meta) {
store.remove(&session_id);
return true;
}
false
}
// Centralized pruning policy so we can easily swap strategies later.
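One plausible strategy behind `session_id_to_prune_from_meta`, sketched from the `(id, last_used, has_exited)` tuples collected above: prune exited sessions first, then the least recently used. The policy itself is an assumption, not the crate's actual implementation:

```rust
use std::time::Instant;

fn session_id_to_prune_from_meta(meta: &[(String, Instant, bool)]) -> Option<String> {
    meta.iter()
        // Exited sessions get the smaller key component (!true == false),
        // so they sort first; among equals, the oldest last_used wins.
        .min_by_key(|(_, last_used, has_exited)| (!*has_exited, *last_used))
        .map(|(id, _, _)| id.clone())
}

fn main() {
    let now = Instant::now();
    let meta = vec![
        ("a".to_string(), now, false),
        ("b".to_string(), now, true), // exited, so pruned first
    ];
    assert_eq!(session_id_to_prune_from_meta(&meta), Some("b".to_string()));
}
```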

View File

@@ -1,3 +1,5 @@
use std::path::Path;
use std::path::PathBuf;
use std::time::Duration;
use rand::Rng;
@@ -14,11 +16,11 @@ pub(crate) fn backoff(attempt: u64) -> Duration {
Duration::from_millis((base as f64 * jitter) as u64)
}
pub(crate) fn error_or_panic(message: String) {
pub(crate) fn error_or_panic(message: impl std::string::ToString) {
if cfg!(debug_assertions) || env!("CARGO_PKG_VERSION").contains("alpha") {
panic!("{message}");
panic!("{}", message.to_string());
} else {
error!("{message}");
error!("{}", message.to_string());
}
}
@@ -37,6 +39,14 @@ pub(crate) fn try_parse_error_message(text: &str) -> String {
text.to_string()
}
pub fn resolve_path(base: &Path, path: &PathBuf) -> PathBuf {
if path.is_absolute() {
path.clone()
} else {
base.join(path)
}
}
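`resolve_path` is shown in full above, so its behavior can be pinned down with a usage sketch: relative paths join onto the base, absolute paths pass through unchanged (illustrated with Unix-style paths):

```rust
use std::path::{Path, PathBuf};

fn resolve_path(base: &Path, path: &PathBuf) -> PathBuf {
    if path.is_absolute() { path.clone() } else { base.join(path) }
}

fn main() {
    let base = Path::new("/repo");
    // Relative input: anchored under the base directory.
    assert_eq!(
        resolve_path(base, &PathBuf::from("src/main.rs")),
        PathBuf::from("/repo/src/main.rs")
    );
    // Absolute input: returned as-is.
    assert_eq!(
        resolve_path(base, &PathBuf::from("/etc/hosts")),
        PathBuf::from("/etc/hosts")
    );
}
```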
#[cfg(test)]
mod tests {
use super::*;

Some files were not shown because too many files have changed in this diff.