Compare commits

...

44 Commits

Author SHA1 Message Date
pakrym-oai
0e4d129379 ws 2026-01-08 08:20:41 -08:00
pakrym-oai
018de994b0 Stop using AuthManager as the source of codex_home (#8846) 2026-01-07 18:56:20 +00:00
Ahmed Ibrahim
c31960b13a remove unnecessary todos (#8842)
> // todo(aibrahim): why are we passing model here while it can change?

we update it on each turn with `.with_model`

> //TODO(aibrahim): run CI in release mode.

although it's good to have, release builds take double the time tests
take.

> // todo(aibrahim): make this async function

we figured out another way of doing this sync
2026-01-07 10:43:10 -08:00
Ahmed Ibrahim
9179c9deac Merge Modelfamily into modelinfo (#8763)
- Merge ModelFamily into ModelInfo
- Remove logic for adding instructions to apply patch
- Add compaction limit and visible context window to `ModelInfo`
2026-01-07 10:35:09 -08:00
Michael Bolin
a1e81180f8 fix: upgrade lru crate to 0.16.3 (#8845)
See https://rustsec.org/advisories/RUSTSEC-2026-0002.

Though our `ratatui` fork has a transitive dep on an older version of
the `lru` crate, so to get CI green ASAP, this PR also adds an exception
to `deny.toml` for `RUSTSEC-2026-0002`, but hopefully this will be
short-lived.
2026-01-07 10:11:27 -08:00
pakrym-oai
fedcb8f63c Move tests below auth manager (#8840)
To simplify future diffs
2026-01-07 17:36:44 +00:00
jif-oai
116059c3a0 chore: unify conversation with thread name (#8830)
Done and verified by Codex + refactor feature of RustRover
2026-01-07 17:04:53 +00:00
Thibault Sottiaux
0d788e6263 fix: handle early codex exec exit (#8825)
Fixes CodexExec to avoid missing early process exits by registering the
exit handler up front and deferring the error until after stdout is
drained, and adds a regression test that simulates a fast-exit child
while still producing output so hangs are caught.
2026-01-07 08:54:27 -08:00
jif-oai
4cef89a122 chore: rename unified exec sessions (#8822)
Renaming done by Codex
2026-01-07 16:12:47 +00:00
Thibault Sottiaux
124a09e577 fix: handle /review arguments in TUI (#8823)
Handle /review <instructions> in the TUI and TUI2 by routing it as a
custom review command instead of plain text, wiring command dispatch and
adding composer coverage so typing /review text starts a review directly
rather than posting a message. User impact: /review with arguments now
kicks off the review flow, previously it would just forward as a plain
command and not actually start a review.
2026-01-07 13:14:55 +00:00
Thibault Sottiaux
a59052341d fix: parse git apply paths correctly (#8824)
Fixes apply.rs path parsing so 
- quoted diff headers are tokenized and extracted correctly, 
- /dev/null headers are ignored before prefix stripping to avoid bogus
dev/null paths, and
- git apply output paths are unescaped from C-style quoting.

**Why**
This prevents potentially missed staging and misclassified paths when
applying or reverting patches, which could lead to incorrect behavior
for repos with spaces or escaped characters in filenames.

**Impact**
I checked and this is only used in the cloud tasks support and `codex
apply <task_id>` flow.
2026-01-07 13:00:31 +00:00
jif-oai
8372d61be7 chore: silent just fmt (#8820)
Done to avoid spammy warnings to end up in the model context without
having to switch to nightly
```
Warning: can't set `imports_granularity = Item`, unstable features are only available in nightly channel.
```
2026-01-07 12:16:38 +00:00
Thibault Sottiaux
230a045ac9 chore: stabilize core tool parallelism test (#8805)
Set login=false for the shell tool in the timing-based parallelism test
so it does not depend on slow user login shells, making the test
deterministic without user-facing changes. This prevents occasional
flakes when running locally.
2026-01-07 09:26:47 +00:00
charley-oai
3389465c8d Enable model upgrade popup even when selected model is no longer in picker (#8802)
With `config.toml`:
```
model = "gpt-5.1-codex"
```
(where `gpt-5.1-codex` has `show_in_picker: false` in
[`model_presets.rs`](https://github.com/openai/codex/blob/main/codex-rs/core/src/models_manager/model_presets.rs);
this happens if the user hasn't used codex in a while so they didn't see
the popup before their model was changed to `show_in_picker: false`)

The upgrade picker used to not show (because `gpt-5.1-codex` was
filtered out of the model list in code). Now, the filtering is done
downstream in tui and app-server, so the model upgrade popup shows:

<img width="1503" height="227" alt="Screenshot 2026-01-06 at 5 04 37 PM"
src="https://github.com/user-attachments/assets/26144cc2-0b3f-4674-ac17-e476781ec548"
/>
2026-01-06 19:32:27 -08:00
Thibault Sottiaux
8b4d27dfcd fix: truncate long approval prefixes when rendering (#8734)
Fixes inscrutable multiline approval requests:
<img width="686" height="844" alt="image"
src="https://github.com/user-attachments/assets/cf9493dc-79e6-4168-8020-0ef0fe676d5e"
/>
2026-01-06 15:17:01 -08:00
Michael Bolin
dc1a568dc7 fix: populate the release notes when the release is created (#8799)
Use the contents of the commit message from the commit associated with
the tag (that contains the version bump) as the release notes by writing
them to a file and then specifying the file as the `body_path` of
`softprops/action-gh-release@v2`.
2026-01-06 15:02:39 -08:00
sayan-oai
54ded1a3c0 add web_search_cached flag (#8795)
Add `web_search_cached` feature to config. Enables `web_search` tool
with access only to cached/indexed results (see
[docs](https://platform.openai.com/docs/guides/tools-web-search#live-internet-access)).

This takes precedence over the existing `web_search_request`, which
continues to enable `web_search` over live results as it did before.

`web_search_cached` is disabled for review mode, as `web_search_request`
is.
2026-01-06 14:53:59 -08:00
Celia Chen
11d4f3f45e [app-server] fix config loading for conversations (#8765)
Currently we don't load config properly for app server conversations.
see:
https://linear.app/openai/issue/CODEX-3956/config-flags-not-respected-in-codex-app-server.
This PR fixes that by respecting the config passed in.

Tested by running `cargo build -p codex-cli &&
RUST_LOG=codex_app_server=debug CODEX_BIN=target/debug/codex cargo run
-p codex-app-server-test-client -- \
--config
model_providers.mock_provider.base_url=\"http://localhost:4010/v2\" \
    --config model_provider=\"mock_provider\" \
    --config model_providers.mock_provider.name="hello" \
    send-message-v2 "hello"`
and verified that the mock_provider is called instead of default
provider.

#closes
https://linear.app/openai/issue/CODEX-3956/config-flags-not-respected-in-codex-app-server

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>
2026-01-06 22:02:17 +00:00
Owen Lin
8b7ec31ba7 feat(app-server): thread/rollback API (#8454)
Add `thread/rollback` to app-server to support IDEs undo-ing the last N
turns of a thread.

For context, an IDE partner will be supporting an "undo" capability
where the IDE (the app-server client) will be responsible for reverting
the local changes made during the last turn. To support this well, we
also need a way to drop the last turn (or more generally, the last N
turns) from the agent's context. This is what `thread/rollback` does.

**Core idea**: A Thread rollback is represented as a persisted event
message (EventMsg::ThreadRollback) in the rollout JSONL file, not by
rewriting history. On resume, both the model's context (core replay) and
the UI turn list (app-server v2's thread history builder) apply these
markers so the pruned history is consistent across live conversations
and `thread/resume`.

Implementation notes:
- Rollback only affects agent context and appends to the rollout file;
clients are responsible for reverting files on disk.
- If a thread rollback is currently in progress, subsequent
`thread/rollback` calls are rejected.
- Because we use `CodexConversation::submit` and codex core tracks
active turns, returning an error on concurrent rollbacks is communicated
via an `EventMsg::Error` with a new variant
`CodexErrorInfo::ThreadRollbackFailed`. app-server watches for that and
sends the BAD_REQUEST RPC response.

Tests cover thread rollbacks in both core and app-server, including when
`num_turns` > existing turns (which clears all turns).

**Note**: this explicitly does **not** behave like `/undo` which we just
removed from the CLI, which does the opposite of what `thread/rollback`
does. `/undo` reverts local changes via ghost commits/snapshots and does
not modify the agent's context / conversation history.
2026-01-06 21:23:48 +00:00
jif-oai
188f79afee feat: drop agent bus and store the agent status in codex directly (#8788) 2026-01-06 19:44:39 +00:00
Josh McKinney
a0b2d03302 Clear copy pill background and add snapshot test (#8777)
### Motivation
- Fix a visual bug where transcript text could bleed through the
on-screen copy "pill" overlay.
- Ensure the copy affordance fully covers the underlying buffer so the
pill background is solid and consistent with styling.
- Document the approach in-code to make the background-clearing
rationale explicit.

### Description
- Clear the pill area before drawing by iterating `Rect::positions()`
and calling `cell.set_symbol(" ")` and `cell.set_style(base_style)` in
`render_copy_pill` in `transcript_copy_ui.rs`.
- Added an explanatory comment for why the pill background is explicitly
cleared.
- Added a unit test `copy_pill_clears_background` and committed the
corresponding snapshot file to validate the rendering behavior.

### Testing
- Ran `just fmt` (formatting completed; non-blocking environment warning
may appear).
- Ran `just fix -p codex-tui2` to apply lints/fixes (completed). 
- Ran `cargo test -p codex-tui2` and all tests passed (snapshot updated
and tests succeeded).

------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_695c9b23e9b8832997d5a457c4d83410)
2026-01-06 11:21:26 -08:00
xl-openai
4ce9d0aa7b suppress popups while browsing input history (#8772) 2026-01-06 11:13:21 -08:00
jif-oai
1dd1355df3 feat: agent controller (#8783)
Added an agent control plane that lets sessions spawn or message other
conversations via `AgentControl`.

`AgentBus` (core/src/agent/bus.rs) keeps track of the last known status
of a conversation.

ConversationManager now holds shared state behind an Arc so AgentControl
keeps only a weak back-reference, the goal is just to avoid explicit
cycle reference.

Follow-ups:
* Build a small tool in the TUI to be able to see every agent and send
manual message to each of them
* Handle approval requests in this TUI
* Add tools to spawn/communicate between agents (see related design)
* Define agent types
2026-01-06 19:08:02 +00:00
Javi
915352b10c feat: add analytics config setting (#8350) 2026-01-06 19:04:13 +00:00
jif-oai
740bf0e755 chore: clear background terminals on interrupt (#8786) 2026-01-06 19:01:07 +00:00
jif-oai
d1c6329c32 feat: forced tool tips (#8752)
Force an announcement tooltip in the CLI. This query the gh repo on this
[file](https://raw.githubusercontent.com/openai/codex/main/announcement_tip.toml)
which contains announcements in TOML looking like this:
```
# Example announcement tips for Codex TUI.
# Each [[announcements]] entry is evaluated in order; the last matching one is shown.
# Dates are UTC, formatted as YYYY-MM-DD. The from_date is inclusive and the to_date is exclusive.
# version_regex matches against the CLI version (env!("CARGO_PKG_VERSION")); omit to apply to all versions.
# target_app specify which app should display the announcement (cli, vsce, ...).

[[announcements]]
content = "Welcome to Codex! Check out the new onboarding flow."
from_date = "2024-10-01"
to_date = "2024-10-15"
version_regex = "^0\\.0\\.0$"
target_app = "cli"
``` 

To make this efficient, the announcement is queried on a best effort
basis at the launch of the CLI (no refresh made after this).
This is done in an async way and we display the announcement (with 100%
probability) iff the announcement is available, the cache is correctly
warmed and there is a matching announcement (matching is recomputed for
each new session).
2026-01-06 18:02:05 +00:00
Owen Lin
cab7136fb3 chore: add model/list call to app-server-test-client (#8331)
Allows us to run `cargo run -p codex-app-server-test-client --
model-list` to return the list of models over app-server.
2026-01-06 17:50:17 +00:00
jif-oai
32db8ea5ca feat: add head-tail buffer for unified_exec (#8735) 2026-01-06 15:48:44 +00:00
Abdelkader Boudih
06e21c7a65 fix: update model examples to gpt-5.2 (#8566)
The models are outdated and sometime get used by GPT when it to try
delegate.

I have read the CLA Document and I hereby sign the CLA
2026-01-06 08:47:29 -07:00
Michael Bolin
7ecd0dc9b3 fix: stop honoring CODEX_MANAGED_CONFIG_PATH environment variable in production (#8762) 2026-01-06 07:10:27 -08:00
jif-oai
8858012fd1 chore: emit unified exec begin only when PTY exist (#8780) 2026-01-06 13:12:54 +00:00
Thibault Sottiaux
6346e4f560 fix: fix readiness subscribe token wrap-around (#8770)
Fixes ReadinessFlag::subscribe to avoid handing out token 0 or duplicate
tokens on i32 wrap-around, adds regression tests, and prevents readiness
gates from getting stuck waiting on an unmarkable or mis-authorized
token.
2026-01-06 13:09:02 +00:00
Josh McKinney
4c3d2a5bbe fix: render cwd-relative paths in tui (#8771)
Display paths relative to the cwd before checking git roots so view
image tool calls keep project-local names in jj/no-.git workspaces.
2026-01-06 03:17:40 +00:00
Josh McKinney
c92dbea7c1 tui2: stop baking streaming wraps; reflow agent markdown (#8761)
Background
Streaming assistant prose in tui2 was being rendered with viewport-width
wrapping during streaming, then stored in history cells as already split
`Line`s. Those width-derived breaks became indistinguishable from hard
newlines, so the transcript could not "un-split" on resize. This also
degraded copy/paste, since soft wraps looked like hard breaks.

What changed
- Introduce width-agnostic `MarkdownLogicalLine` output in
`tui2/src/markdown_render.rs`, preserving markdown wrap semantics:
initial/subsequent indents, per-line style, and a preformatted flag.
- Update the streaming collector (`tui2/src/markdown_stream.rs`) to emit
logical lines (newline-gated) and remove any captured viewport width.
- Update streaming orchestration (`tui2/src/streaming/*`) to queue and
emit logical lines, producing `AgentMessageCell::new_logical(...)`.
- Make `AgentMessageCell` store logical lines and wrap at render time in
`HistoryCell::transcript_lines_with_joiners(width)`, emitting joiners so
copy/paste can join soft-wrap continuations correctly.

Overlay deferral
When an overlay is active, defer *cells* (not rendered `Vec<Line>`) and
render them at overlay close time. This avoids baking width-derived
wraps based on a stale width.

Tests + docs
- Add resize/reflow regression tests + snapshots for streamed agent
output.
- Expand module/API docs for the new logical-line streaming pipeline and
clarify joiner semantics.
- Align scrollback-related docs/comments with current tui2 behavior
(main draw loop does not flush queued "history lines" to the terminal).

More details
See `codex-rs/tui2/docs/streaming_wrapping_design.md` for the full
problem statement and solution approach, and
`codex-rs/tui2/docs/tui_viewport_and_history.md` for viewport vs printed
output behavior.
2026-01-05 18:37:58 -08:00
Thibault Sottiaux
771f1ca6ab fix: accept whitespace-padded patch markers (#8746)
Trim whitespace when validating '*** Begin Patch'/'*** End Patch'
markers in codex-apply-patch so padded marker lines parse as intended,
and add regression coverage (unit + fixture scenario); this avoids
apply_patch failures when models include extra spacing. Tested with
cargo test -p codex-apply-patch.
2026-01-05 17:41:23 -08:00
Dylan Hurd
b1c93e135b chore(apply-patch) additional scenarios (#8230)
## Summary
More apply-patch scenarios

## Testing
- [x] This pr only adds tests
2026-01-05 15:56:38 -08:00
Curtis 'Fjord' Hawthorne
5f8776d34d Allow global exec flags after resume and fix CI codex build/timeout (#8440)
**Motivation**
- Bring `codex exec resume` to parity with top‑level flags so global
options (git check bypass, json, model, sandbox toggles) work after the
subcommand, including when outside a git repo.

**Description**
- Exec CLI: mark `--skip-git-repo-check`, `--json`, `--model`,
`--full-auto`, and `--dangerously-bypass-approvals-and-sandbox` as
global so they’re accepted after `resume`.
- Tests: add `exec_resume_accepts_global_flags_after_subcommand` to
verify those flags work when passed after `resume`.

**Testing**
- `just fmt`
- `cargo test -p codex-exec` (pass; ran with elevated perms to allow
network/port binds)
- Manual: exercised `codex exec resume` with global flags after the
subcommand to confirm behavior.
2026-01-05 22:12:09 +00:00
xl-openai
58a91a0b50 Use ConfigLayerStack for skills discovery. (#8497)
Use ConfigLayerStack to get all folders while loading skills.
2026-01-05 13:47:39 -08:00
Matthew Zeng
c29afc0cf3 [device-auth] Update login instruction for headless environments. (#8753)
We've seen reports that people who try to login on a remote/headless
machine will open the login link on their own machine and got errors.
Update the instructions to ask those users to use `codex login
--device-auth` instead.

<img width="1434" height="938" alt="CleanShot 2026-01-05 at 11 35 02@2x"
src="https://github.com/user-attachments/assets/2b209953-6a42-4eb0-8b55-bb0733f2e373"
/>
2026-01-05 13:46:42 -08:00
Michael Bolin
cafb07fe6e feat: add justification arg to prefix_rule() in *.rules (#8751)
Adds an optional `justification` parameter to the `prefix_rule()`
execpolicy DSL so policy authors can attach human-readable rationale to
a rule. That justification is propagated through parsing/matching and
can be surfaced to the model (or approval UI) when a command is blocked
or requires approval.

When a command is rejected (or gated behind approval) due to policy, a
generic message makes it hard for the model/user to understand what went
wrong and what to do instead. Allowing policy authors to supply a short
justification improves debuggability and helps guide the model toward
compliant alternatives.

Example:

```python
prefix_rule(
    pattern = ["git", "push"],
    decision = "forbidden",
    justification = "pushing is blocked in this repo",
)
```

If Codex tried to run `git push origin main`, now the failure would
include:

```
`git push origin main` rejected: pushing is blocked in this repo
```

whereas previously, all it was told was:

```
execpolicy forbids this command
```
2026-01-05 21:24:48 +00:00
iceweasel-oai
07f077dfb3 best effort to "hide" Sandbox users (#8492)
The elevated sandbox creates two new Windows users - CodexSandboxOffline
and CodexSandboxOnline. This is necessary, so this PR does all that it
can to "hide" those users. It uses the registry plus directory flags (on
their home directories) to get them to show up as little as possible.
2026-01-05 12:29:10 -08:00
Abrar Ahmed
7cf6f1c723 Use issuer URL in device auth prompt link (#7858)
## Summary

When using device-code login with a custom issuer
(`--experimental_issuer`), Codex correctly uses that issuer for the auth
flow — but the **terminal prompt still told users to open the default
OpenAI device URL** (`https://auth.openai.com/codex/device`). That’s
confusing and can send users to the **wrong domain** (especially for
enterprise/staging issuers). This PR updates the prompt (and related
URLs) to consistently use the configured issuer. 🎯

---

## 🔧 What changed

* 🔗 **Device auth prompt link** now uses the configured issuer (instead
of a hard-coded OpenAI URL)
* 🧭 **Redirect callback URL** is derived from the same issuer for
consistency
* 🧼 Minor cleanup: normalize the issuer base URL once and reuse it
(avoids formatting quirks like trailing `/`)

---

## 🧪 Repro + Before/After

### ▶️ Command

```bash
codex login --device-auth --experimental_issuer https://auth.example.com
```

###  Before (wrong link shown)

```text
1. Open this link in your browser and sign in to your account
   https://auth.openai.com/codex/device
```

###  After (correct link shown)

```text
1. Open this link in your browser and sign in to your account
   https://auth.example.com/codex/device
```

Full example output (same as before, but with the correct URL):

```text
Welcome to Codex [v0.72.0]
OpenAI's command-line coding agent

Follow these steps to sign in with ChatGPT using device code authorization:

1. Open this link in your browser and sign in to your account
   https://auth.example.com/codex/device

2. Enter this one-time code (expires in 15 minutes)
   BUT6-0M8K4

Device codes are a common phishing target. Never share this code.
```

---

##  Test plan

* 🟦 `codex login --device-auth` (default issuer): output remains
unchanged
* 🟩 `codex login --device-auth --experimental_issuer
https://auth.example.com`:

  * prompt link points to the issuer 
  * callback URL is derived from the same issuer 
  * no double slashes / mismatched domains 

Co-authored-by: Eric Traut <etraut@openai.com>
2026-01-05 13:09:05 -07:00
Gav Verma
57f8158608 chore: improve skills render section (#8459)
This change improves the skills render section
- Separate the skills list from usage rules with clear subheadings
- Define skill more clearly upfront
- Remove confusing trigger/discovery wording and make reference-following guidance more actionable
2026-01-05 11:55:26 -08:00
iceweasel-oai
95580f229e never let sandbox write to .codex/ or .codex/.sandbox/ (#8683)
Never treat .codex or .codex/.sandbox as a workspace root.
Handle write permissions to .codex/.sandbox in a single method so that
the sandbox setup/runner can write logs and other setup files to that
directory.
2026-01-05 11:54:21 -08:00
220 changed files with 9875 additions and 3460 deletions

View File

@@ -323,6 +323,26 @@ jobs:
- name: Checkout repository
uses: actions/checkout@v6
- name: Generate release notes from tag commit message
id: release_notes
shell: bash
run: |
set -euo pipefail
# On tag pushes, GITHUB_SHA may be a tag object for annotated tags;
# peel it to the underlying commit.
commit="$(git rev-parse "${GITHUB_SHA}^{commit}")"
notes_path="${RUNNER_TEMP}/release-notes.md"
# Use the commit message for the commit the tag points at (not the
# annotated tag message).
git log -1 --format=%B "${commit}" > "${notes_path}"
# Ensure trailing newline so GitHub's markdown renderer doesn't
# occasionally run the last line into subsequent content.
echo >> "${notes_path}"
echo "path=${notes_path}" >> "${GITHUB_OUTPUT}"
- uses: actions/download-artifact@v7
with:
path: dist
@@ -395,6 +415,7 @@ jobs:
with:
name: ${{ steps.release_name.outputs.name }}
tag_name: ${{ github.ref_name }}
body_path: ${{ steps.release_notes.outputs.path }}
files: dist/**
# Mark as prerelease only when the version has a suffix after x.y.z
# (e.g. -alpha, -beta). Otherwise publish a normal release.

16
announcement_tip.toml Normal file
View File

@@ -0,0 +1,16 @@
# Example announcement tips for Codex TUI.
# Each [[announcements]] entry is evaluated in order; the last matching one is shown.
# Dates are UTC, formatted as YYYY-MM-DD. The from_date is inclusive and the to_date is exclusive.
# version_regex matches against the CLI version (env!("CARGO_PKG_VERSION")); omit to apply to all versions.
# target_app specify which app should display the announcement (cli, vsce, ...).
[[announcements]]
content = "Welcome to Codex! Check out the new onboarding flow."
from_date = "2024-10-01"
to_date = "2024-10-15"
target_app = "cli"
[[announcements]]
content = "This is a test announcement"
version_regex = "^0\\.0\\.0$"
to_date = "2026-01-10"

61
codex-rs/Cargo.lock generated
View File

@@ -360,7 +360,7 @@ dependencies = [
"objc2-foundation",
"parking_lot",
"percent-encoding",
"windows-sys 0.52.0",
"windows-sys 0.60.2",
"wl-clipboard-rs",
"x11rb",
]
@@ -982,8 +982,10 @@ dependencies = [
"thiserror 2.0.17",
"tokio",
"tokio-test",
"tokio-tungstenite",
"tokio-util",
"tracing",
"url",
"wiremock",
]
@@ -1866,7 +1868,7 @@ dependencies = [
name = "codex-utils-cache"
version = "0.0.0"
dependencies = [
"lru 0.16.2",
"lru 0.16.3",
"sha1",
"tokio",
]
@@ -2344,6 +2346,12 @@ dependencies = [
"syn 2.0.104",
]
[[package]]
name = "data-encoding"
version = "2.9.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2a2330da5de22e8a3cb63252ce2abb30116bf5265e89c0e01bc17015ce30a476"
[[package]]
name = "dbus"
version = "0.9.9"
@@ -2781,7 +2789,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "778e2ac28f6c47af28e4907f13ffd1e1ddbd400980a9abd7c8df189bf578a5ad"
dependencies = [
"libc",
"windows-sys 0.52.0",
"windows-sys 0.60.2",
]
[[package]]
@@ -2889,7 +2897,7 @@ checksum = "0ce92ff622d6dadf7349484f42c93271a0d49b7cc4d466a936405bacbe10aa78"
dependencies = [
"cfg-if",
"rustix 1.0.8",
"windows-sys 0.52.0",
"windows-sys 0.59.0",
]
[[package]]
@@ -3830,7 +3838,7 @@ checksum = "e04d7f318608d35d4b61ddd75cbdaee86b023ebe2bd5a66ee0915f0bf93095a9"
dependencies = [
"hermit-abi",
"libc",
"windows-sys 0.52.0",
"windows-sys 0.59.0",
]
[[package]]
@@ -4151,9 +4159,9 @@ dependencies = [
[[package]]
name = "lru"
version = "0.16.2"
version = "0.16.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "96051b46fc183dc9cd4a223960ef37b9af631b55191852a8274bfef064cda20f"
checksum = "a1dc47f592c06f33f8e3aea9591776ec7c9f9e4124778ff8a3c3b87159f7e593"
dependencies = [
"hashbrown 0.16.0",
]
@@ -5331,7 +5339,7 @@ dependencies = [
"once_cell",
"socket2 0.6.1",
"tracing",
"windows-sys 0.52.0",
"windows-sys 0.60.2",
]
[[package]]
@@ -5459,7 +5467,7 @@ dependencies = [
"indoc",
"itertools 0.14.0",
"kasuari",
"lru 0.16.2",
"lru 0.16.3",
"strum 0.27.2",
"thiserror 2.0.17",
"unicode-segmentation",
@@ -5710,7 +5718,7 @@ dependencies = [
"errno",
"libc",
"linux-raw-sys 0.4.15",
"windows-sys 0.52.0",
"windows-sys 0.59.0",
]
[[package]]
@@ -5723,7 +5731,7 @@ dependencies = [
"errno",
"libc",
"linux-raw-sys 0.9.4",
"windows-sys 0.52.0",
"windows-sys 0.60.2",
]
[[package]]
@@ -7095,6 +7103,18 @@ dependencies = [
"tokio-stream",
]
[[package]]
name = "tokio-tungstenite"
version = "0.26.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7a9daff607c6d2bf6c16fd681ccb7eecc83e4e2cdc1ca067ffaadfca5de7f084"
dependencies = [
"futures-util",
"log",
"tokio",
"tungstenite",
]
[[package]]
name = "tokio-util"
version = "0.7.16"
@@ -7489,6 +7509,23 @@ dependencies = [
"ratatui-core",
]
[[package]]
name = "tungstenite"
version = "0.26.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4793cb5e56680ecbb1d843515b23b6de9a75eb04b66643e256a396d43be33c13"
dependencies = [
"bytes",
"data-encoding",
"http 1.3.1",
"httparse",
"log",
"rand 0.9.2",
"sha1",
"thiserror 2.0.17",
"utf-8",
]
[[package]]
name = "typenum"
version = "1.18.0"
@@ -8011,7 +8048,7 @@ version = "0.1.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "cf221c93e13a30d793f7645a0e7762c55d169dbb0a49671918a2319d289b10bb"
dependencies = [
"windows-sys 0.52.0",
"windows-sys 0.59.0",
]
[[package]]

View File

@@ -152,7 +152,7 @@ landlock = "0.4.4"
lazy_static = "1"
libc = "0.2.177"
log = "0.4"
lru = "0.16.2"
lru = "0.16.3"
maplit = "1.0.2"
mime_guess = "2.0.5"
multimap = "0.10.0"
@@ -207,6 +207,7 @@ thiserror = "2.0.17"
time = "0.3"
tiny_http = "0.12"
tokio = "1"
tokio-tungstenite = "0.26.1"
tokio-stream = "0.1.18"
tokio-test = "0.4"
tokio-util = "0.7.16"

View File

@@ -113,6 +113,10 @@ client_request_definitions! {
params: v2::ThreadArchiveParams,
response: v2::ThreadArchiveResponse,
},
ThreadRollback => "thread/rollback" {
params: v2::ThreadRollbackParams,
response: v2::ThreadRollbackResponse,
},
ThreadList => "thread/list" {
params: v2::ThreadListParams,
response: v2::ThreadListResponse,
@@ -565,7 +569,7 @@ client_notification_definitions! {
mod tests {
use super::*;
use anyhow::Result;
use codex_protocol::ConversationId;
use codex_protocol::ThreadId;
use codex_protocol::account::PlanType;
use codex_protocol::parse_command::ParsedCommand;
use codex_protocol::protocol::AskForApproval;
@@ -614,7 +618,7 @@ mod tests {
#[test]
fn conversation_id_serializes_as_plain_string() -> Result<()> {
let id = ConversationId::from_string("67e55044-10b1-426f-9247-bb680e5fe0c8")?;
let id = ThreadId::from_string("67e55044-10b1-426f-9247-bb680e5fe0c8")?;
assert_eq!(
json!("67e55044-10b1-426f-9247-bb680e5fe0c8"),
@@ -625,11 +629,10 @@ mod tests {
#[test]
fn conversation_id_deserializes_from_plain_string() -> Result<()> {
let id: ConversationId =
serde_json::from_value(json!("67e55044-10b1-426f-9247-bb680e5fe0c8"))?;
let id: ThreadId = serde_json::from_value(json!("67e55044-10b1-426f-9247-bb680e5fe0c8"))?;
assert_eq!(
ConversationId::from_string("67e55044-10b1-426f-9247-bb680e5fe0c8")?,
ThreadId::from_string("67e55044-10b1-426f-9247-bb680e5fe0c8")?,
id,
);
Ok(())
@@ -650,7 +653,7 @@ mod tests {
#[test]
fn serialize_server_request() -> Result<()> {
let conversation_id = ConversationId::from_string("67e55044-10b1-426f-9247-bb680e5fe0c8")?;
let conversation_id = ThreadId::from_string("67e55044-10b1-426f-9247-bb680e5fe0c8")?;
let params = v1::ExecCommandApprovalParams {
conversation_id,
call_id: "call-42".to_string(),

View File

@@ -6,6 +6,7 @@ use crate::protocol::v2::UserInput;
use codex_protocol::protocol::AgentReasoningEvent;
use codex_protocol::protocol::AgentReasoningRawContentEvent;
use codex_protocol::protocol::EventMsg;
use codex_protocol::protocol::ThreadRolledBackEvent;
use codex_protocol::protocol::TurnAbortedEvent;
use codex_protocol::protocol::UserMessageEvent;
@@ -57,6 +58,7 @@ impl ThreadHistoryBuilder {
EventMsg::TokenCount(_) => {}
EventMsg::EnteredReviewMode(_) => {}
EventMsg::ExitedReviewMode(_) => {}
EventMsg::ThreadRolledBack(payload) => self.handle_thread_rollback(payload),
EventMsg::UndoCompleted(_) => {}
EventMsg::TurnAborted(payload) => self.handle_turn_aborted(payload),
_ => {}
@@ -130,6 +132,23 @@ impl ThreadHistoryBuilder {
turn.status = TurnStatus::Interrupted;
}
fn handle_thread_rollback(&mut self, payload: &ThreadRolledBackEvent) {
self.finish_current_turn();
let n = usize::try_from(payload.num_turns).unwrap_or(usize::MAX);
if n >= self.turns.len() {
self.turns.clear();
} else {
self.turns.truncate(self.turns.len().saturating_sub(n));
}
// Re-number subsequent synthetic ids so the pruned history is consistent.
self.next_turn_index =
i64::try_from(self.turns.len().saturating_add(1)).unwrap_or(i64::MAX);
let item_count: usize = self.turns.iter().map(|t| t.items.len()).sum();
self.next_item_index = i64::try_from(item_count.saturating_add(1)).unwrap_or(i64::MAX);
}
fn finish_current_turn(&mut self) {
if let Some(turn) = self.current_turn.take() {
if turn.items.is_empty() {
@@ -213,6 +232,7 @@ mod tests {
use codex_protocol::protocol::AgentMessageEvent;
use codex_protocol::protocol::AgentReasoningEvent;
use codex_protocol::protocol::AgentReasoningRawContentEvent;
use codex_protocol::protocol::ThreadRolledBackEvent;
use codex_protocol::protocol::TurnAbortReason;
use codex_protocol::protocol::TurnAbortedEvent;
use codex_protocol::protocol::UserMessageEvent;
@@ -410,4 +430,95 @@ mod tests {
}
);
}
#[test]
fn drops_last_turns_on_thread_rollback() {
let events = vec![
EventMsg::UserMessage(UserMessageEvent {
message: "First".into(),
images: None,
}),
EventMsg::AgentMessage(AgentMessageEvent {
message: "A1".into(),
}),
EventMsg::UserMessage(UserMessageEvent {
message: "Second".into(),
images: None,
}),
EventMsg::AgentMessage(AgentMessageEvent {
message: "A2".into(),
}),
EventMsg::ThreadRolledBack(ThreadRolledBackEvent { num_turns: 1 }),
EventMsg::UserMessage(UserMessageEvent {
message: "Third".into(),
images: None,
}),
EventMsg::AgentMessage(AgentMessageEvent {
message: "A3".into(),
}),
];
let turns = build_turns_from_event_msgs(&events);
let expected = vec![
Turn {
id: "turn-1".into(),
status: TurnStatus::Completed,
error: None,
items: vec![
ThreadItem::UserMessage {
id: "item-1".into(),
content: vec![UserInput::Text {
text: "First".into(),
}],
},
ThreadItem::AgentMessage {
id: "item-2".into(),
text: "A1".into(),
},
],
},
Turn {
id: "turn-2".into(),
status: TurnStatus::Completed,
error: None,
items: vec![
ThreadItem::UserMessage {
id: "item-3".into(),
content: vec![UserInput::Text {
text: "Third".into(),
}],
},
ThreadItem::AgentMessage {
id: "item-4".into(),
text: "A3".into(),
},
],
},
];
assert_eq!(turns, expected);
}
#[test]
fn thread_rollback_clears_all_turns_when_num_turns_exceeds_history() {
let events = vec![
EventMsg::UserMessage(UserMessageEvent {
message: "One".into(),
images: None,
}),
EventMsg::AgentMessage(AgentMessageEvent {
message: "A1".into(),
}),
EventMsg::UserMessage(UserMessageEvent {
message: "Two".into(),
images: None,
}),
EventMsg::AgentMessage(AgentMessageEvent {
message: "A2".into(),
}),
EventMsg::ThreadRolledBack(ThreadRolledBackEvent { num_turns: 99 }),
];
let turns = build_turns_from_event_msgs(&events);
assert_eq!(turns, Vec::<Turn>::new());
}
}

View File

@@ -1,7 +1,7 @@
use std::collections::HashMap;
use std::path::PathBuf;
use codex_protocol::ConversationId;
use codex_protocol::ThreadId;
use codex_protocol::config_types::ForcedLoginMethod;
use codex_protocol::config_types::ReasoningSummary;
use codex_protocol::config_types::SandboxMode;
@@ -68,7 +68,7 @@ pub struct NewConversationParams {
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct NewConversationResponse {
pub conversation_id: ConversationId,
pub conversation_id: ThreadId,
pub model: String,
pub reasoning_effort: Option<ReasoningEffort>,
pub rollout_path: PathBuf,
@@ -77,7 +77,7 @@ pub struct NewConversationResponse {
#[derive(Serialize, Deserialize, Debug, Clone, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct ResumeConversationResponse {
pub conversation_id: ConversationId,
pub conversation_id: ThreadId,
pub model: String,
pub initial_messages: Option<Vec<EventMsg>>,
pub rollout_path: PathBuf,
@@ -90,9 +90,9 @@ pub enum GetConversationSummaryParams {
#[serde(rename = "rolloutPath")]
rollout_path: PathBuf,
},
ConversationId {
ThreadId {
#[serde(rename = "conversationId")]
conversation_id: ConversationId,
conversation_id: ThreadId,
},
}
@@ -113,7 +113,7 @@ pub struct ListConversationsParams {
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct ConversationSummary {
pub conversation_id: ConversationId,
pub conversation_id: ThreadId,
pub path: PathBuf,
pub preview: String,
pub timestamp: Option<String>,
@@ -143,7 +143,7 @@ pub struct ListConversationsResponse {
#[serde(rename_all = "camelCase")]
pub struct ResumeConversationParams {
pub path: Option<PathBuf>,
pub conversation_id: Option<ConversationId>,
pub conversation_id: Option<ThreadId>,
pub history: Option<Vec<ResponseItem>>,
pub overrides: Option<NewConversationParams>,
}
@@ -158,7 +158,7 @@ pub struct AddConversationSubscriptionResponse {
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct ArchiveConversationParams {
pub conversation_id: ConversationId,
pub conversation_id: ThreadId,
pub rollout_path: PathBuf,
}
@@ -198,7 +198,7 @@ pub struct GitDiffToRemoteResponse {
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct ApplyPatchApprovalParams {
pub conversation_id: ConversationId,
pub conversation_id: ThreadId,
/// Use to correlate this with [codex_core::protocol::PatchApplyBeginEvent]
/// and [codex_core::protocol::PatchApplyEndEvent].
pub call_id: String,
@@ -219,7 +219,7 @@ pub struct ApplyPatchApprovalResponse {
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct ExecCommandApprovalParams {
pub conversation_id: ConversationId,
pub conversation_id: ThreadId,
/// Use to correlate this with [codex_core::protocol::ExecCommandBeginEvent]
/// and [codex_core::protocol::ExecCommandEndEvent].
pub call_id: String,
@@ -369,14 +369,14 @@ pub struct SandboxSettings {
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct SendUserMessageParams {
pub conversation_id: ConversationId,
pub conversation_id: ThreadId,
pub items: Vec<InputItem>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct SendUserTurnParams {
pub conversation_id: ConversationId,
pub conversation_id: ThreadId,
pub items: Vec<InputItem>,
pub cwd: PathBuf,
pub approval_policy: AskForApproval,
@@ -395,7 +395,7 @@ pub struct SendUserTurnResponse {}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct InterruptConversationParams {
pub conversation_id: ConversationId,
pub conversation_id: ThreadId,
}
#[derive(Serialize, Deserialize, Debug, Clone, JsonSchema, TS)]
@@ -411,7 +411,7 @@ pub struct SendUserMessageResponse {}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct AddConversationListenerParams {
pub conversation_id: ConversationId,
pub conversation_id: ThreadId,
#[serde(default)]
pub experimental_raw_events: bool,
}
@@ -445,7 +445,7 @@ pub struct LoginChatGptCompleteNotification {
#[derive(Serialize, Deserialize, Debug, Clone, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct SessionConfiguredNotification {
pub session_id: ConversationId,
pub session_id: ThreadId,
pub model: String,
pub reasoning_effort: Option<ReasoningEffort>,
pub history_log_id: u64,

View File

@@ -89,6 +89,7 @@ pub enum CodexErrorInfo {
InternalServerError,
Unauthorized,
BadRequest,
ThreadRollbackFailed,
SandboxError,
/// The response SSE stream disconnected in the middle of a turn before completion.
ResponseStreamDisconnected {
@@ -119,6 +120,7 @@ impl From<CoreCodexErrorInfo> for CodexErrorInfo {
CoreCodexErrorInfo::InternalServerError => CodexErrorInfo::InternalServerError,
CoreCodexErrorInfo::Unauthorized => CodexErrorInfo::Unauthorized,
CoreCodexErrorInfo::BadRequest => CodexErrorInfo::BadRequest,
CoreCodexErrorInfo::ThreadRollbackFailed => CodexErrorInfo::ThreadRollbackFailed,
CoreCodexErrorInfo::SandboxError => CodexErrorInfo::SandboxError,
CoreCodexErrorInfo::ResponseStreamDisconnected { http_status_code } => {
CodexErrorInfo::ResponseStreamDisconnected { http_status_code }
@@ -330,6 +332,15 @@ pub struct ProfileV2 {
pub additional: HashMap<String, JsonValue>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "snake_case")]
#[ts(export_to = "v2/")]
pub struct AnalyticsConfig {
pub enabled: Option<bool>,
#[serde(default, flatten)]
pub additional: HashMap<String, JsonValue>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "snake_case")]
#[ts(export_to = "v2/")]
@@ -354,6 +365,7 @@ pub struct Config {
pub model_reasoning_effort: Option<ReasoningEffort>,
pub model_reasoning_summary: Option<ReasoningSummary>,
pub model_verbosity: Option<Verbosity>,
pub analytics: Option<AnalyticsConfig>,
#[serde(default, flatten)]
pub additional: HashMap<String, JsonValue>,
}
@@ -1045,6 +1057,30 @@ pub struct ThreadArchiveParams {
#[ts(export_to = "v2/")]
pub struct ThreadArchiveResponse {}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct ThreadRollbackParams {
pub thread_id: String,
/// The number of turns to drop from the end of the thread. Must be >= 1.
///
/// This only modifies the thread's history and does not revert local file changes
/// that have been made by the agent. Clients are responsible for reverting these changes.
pub num_turns: u32,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
pub struct ThreadRollbackResponse {
/// The updated thread after applying the rollback, with `turns` populated.
///
/// The ThreadItems stored in each Turn are lossy since we explicitly do not
/// persist all agent interactions, such as command executions. This is the same
/// behavior as `thread/resume`.
pub thread: Thread,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
#[ts(export_to = "v2/")]
@@ -1183,7 +1219,7 @@ pub struct Thread {
pub source: SessionSource,
/// Optional Git metadata captured when the thread was created.
pub git_info: Option<GitInfo>,
/// Only populated on a `thread/resume` response.
/// Only populated on `thread/resume` and `thread/rollback` responses.
/// For all other responses and notifications returning a Thread,
/// the turns field will be an empty list.
pub turns: Vec<Turn>,
@@ -1211,6 +1247,7 @@ pub struct ThreadTokenUsageUpdatedNotification {
pub struct ThreadTokenUsage {
pub total: TokenUsageBreakdown,
pub last: TokenUsageBreakdown,
// TODO(aibrahim): make this not optional
#[ts(type = "number | null")]
pub model_context_window: Option<i64>,
}

View File

@@ -13,6 +13,7 @@ use std::time::Duration;
use anyhow::Context;
use anyhow::Result;
use anyhow::bail;
use clap::ArgAction;
use clap::Parser;
use clap::Subcommand;
use codex_app_server_protocol::AddConversationListenerParams;
@@ -35,6 +36,8 @@ use codex_app_server_protocol::JSONRPCRequest;
use codex_app_server_protocol::JSONRPCResponse;
use codex_app_server_protocol::LoginChatGptCompleteNotification;
use codex_app_server_protocol::LoginChatGptResponse;
use codex_app_server_protocol::ModelListParams;
use codex_app_server_protocol::ModelListResponse;
use codex_app_server_protocol::NewConversationParams;
use codex_app_server_protocol::NewConversationResponse;
use codex_app_server_protocol::RequestId;
@@ -49,7 +52,7 @@ use codex_app_server_protocol::TurnStartParams;
use codex_app_server_protocol::TurnStartResponse;
use codex_app_server_protocol::TurnStatus;
use codex_app_server_protocol::UserInput as V2UserInput;
use codex_protocol::ConversationId;
use codex_protocol::ThreadId;
use codex_protocol::protocol::Event;
use codex_protocol::protocol::EventMsg;
use serde::Serialize;
@@ -65,6 +68,19 @@ struct Cli {
#[arg(long, env = "CODEX_BIN", default_value = "codex")]
codex_bin: String,
/// Forwarded to the `codex` CLI as `--config key=value`. Repeatable.
///
/// Example:
/// `--config 'model_providers.mock.base_url="http://localhost:4010/v2"'`
#[arg(
short = 'c',
long = "config",
value_name = "key=value",
action = ArgAction::Append,
global = true
)]
config_overrides: Vec<String>,
#[command(subcommand)]
command: CliCommand,
}
@@ -113,37 +129,54 @@ enum CliCommand {
TestLogin,
/// Fetch the current account rate limits from the Codex app-server.
GetAccountRateLimits,
/// List the available models from the Codex app-server.
#[command(name = "model-list")]
ModelList,
}
fn main() -> Result<()> {
let Cli { codex_bin, command } = Cli::parse();
let Cli {
codex_bin,
config_overrides,
command,
} = Cli::parse();
match command {
CliCommand::SendMessage { user_message } => send_message(codex_bin, user_message),
CliCommand::SendMessageV2 { user_message } => send_message_v2(codex_bin, user_message),
CliCommand::SendMessage { user_message } => {
send_message(&codex_bin, &config_overrides, user_message)
}
CliCommand::SendMessageV2 { user_message } => {
send_message_v2(&codex_bin, &config_overrides, user_message)
}
CliCommand::TriggerCmdApproval { user_message } => {
trigger_cmd_approval(codex_bin, user_message)
trigger_cmd_approval(&codex_bin, &config_overrides, user_message)
}
CliCommand::TriggerPatchApproval { user_message } => {
trigger_patch_approval(codex_bin, user_message)
trigger_patch_approval(&codex_bin, &config_overrides, user_message)
}
CliCommand::NoTriggerCmdApproval => no_trigger_cmd_approval(codex_bin),
CliCommand::NoTriggerCmdApproval => no_trigger_cmd_approval(&codex_bin, &config_overrides),
CliCommand::SendFollowUpV2 {
first_message,
follow_up_message,
} => send_follow_up_v2(codex_bin, first_message, follow_up_message),
CliCommand::TestLogin => test_login(codex_bin),
CliCommand::GetAccountRateLimits => get_account_rate_limits(codex_bin),
} => send_follow_up_v2(
&codex_bin,
&config_overrides,
first_message,
follow_up_message,
),
CliCommand::TestLogin => test_login(&codex_bin, &config_overrides),
CliCommand::GetAccountRateLimits => get_account_rate_limits(&codex_bin, &config_overrides),
CliCommand::ModelList => model_list(&codex_bin, &config_overrides),
}
}
fn send_message(codex_bin: String, user_message: String) -> Result<()> {
let mut client = CodexClient::spawn(codex_bin)?;
fn send_message(codex_bin: &str, config_overrides: &[String], user_message: String) -> Result<()> {
let mut client = CodexClient::spawn(codex_bin, config_overrides)?;
let initialize = client.initialize()?;
println!("< initialize response: {initialize:?}");
let conversation = client.new_conversation()?;
let conversation = client.start_thread()?;
println!("< newConversation response: {conversation:?}");
let subscription = client.add_conversation_listener(&conversation.conversation_id)?;
@@ -154,51 +187,66 @@ fn send_message(codex_bin: String, user_message: String) -> Result<()> {
client.stream_conversation(&conversation.conversation_id)?;
client.remove_conversation_listener(subscription.subscription_id)?;
client.remove_thread_listener(subscription.subscription_id)?;
Ok(())
}
fn send_message_v2(codex_bin: String, user_message: String) -> Result<()> {
send_message_v2_with_policies(codex_bin, user_message, None, None)
fn send_message_v2(
codex_bin: &str,
config_overrides: &[String],
user_message: String,
) -> Result<()> {
send_message_v2_with_policies(codex_bin, config_overrides, user_message, None, None)
}
fn trigger_cmd_approval(codex_bin: String, user_message: Option<String>) -> Result<()> {
fn trigger_cmd_approval(
codex_bin: &str,
config_overrides: &[String],
user_message: Option<String>,
) -> Result<()> {
let default_prompt =
"Run `touch /tmp/should-trigger-approval` so I can confirm the file exists.";
let message = user_message.unwrap_or_else(|| default_prompt.to_string());
send_message_v2_with_policies(
codex_bin,
config_overrides,
message,
Some(AskForApproval::OnRequest),
Some(SandboxPolicy::ReadOnly),
)
}
fn trigger_patch_approval(codex_bin: String, user_message: Option<String>) -> Result<()> {
fn trigger_patch_approval(
codex_bin: &str,
config_overrides: &[String],
user_message: Option<String>,
) -> Result<()> {
let default_prompt =
"Create a file named APPROVAL_DEMO.txt containing a short hello message using apply_patch.";
let message = user_message.unwrap_or_else(|| default_prompt.to_string());
send_message_v2_with_policies(
codex_bin,
config_overrides,
message,
Some(AskForApproval::OnRequest),
Some(SandboxPolicy::ReadOnly),
)
}
fn no_trigger_cmd_approval(codex_bin: String) -> Result<()> {
fn no_trigger_cmd_approval(codex_bin: &str, config_overrides: &[String]) -> Result<()> {
let prompt = "Run `touch should_not_trigger_approval.txt`";
send_message_v2_with_policies(codex_bin, prompt.to_string(), None, None)
send_message_v2_with_policies(codex_bin, config_overrides, prompt.to_string(), None, None)
}
fn send_message_v2_with_policies(
codex_bin: String,
codex_bin: &str,
config_overrides: &[String],
user_message: String,
approval_policy: Option<AskForApproval>,
sandbox_policy: Option<SandboxPolicy>,
) -> Result<()> {
let mut client = CodexClient::spawn(codex_bin)?;
let mut client = CodexClient::spawn(codex_bin, config_overrides)?;
let initialize = client.initialize()?;
println!("< initialize response: {initialize:?}");
@@ -222,11 +270,12 @@ fn send_message_v2_with_policies(
}
fn send_follow_up_v2(
codex_bin: String,
codex_bin: &str,
config_overrides: &[String],
first_message: String,
follow_up_message: String,
) -> Result<()> {
let mut client = CodexClient::spawn(codex_bin)?;
let mut client = CodexClient::spawn(codex_bin, config_overrides)?;
let initialize = client.initialize()?;
println!("< initialize response: {initialize:?}");
@@ -259,8 +308,8 @@ fn send_follow_up_v2(
Ok(())
}
fn test_login(codex_bin: String) -> Result<()> {
let mut client = CodexClient::spawn(codex_bin)?;
fn test_login(codex_bin: &str, config_overrides: &[String]) -> Result<()> {
let mut client = CodexClient::spawn(codex_bin, config_overrides)?;
let initialize = client.initialize()?;
println!("< initialize response: {initialize:?}");
@@ -289,8 +338,8 @@ fn test_login(codex_bin: String) -> Result<()> {
}
}
fn get_account_rate_limits(codex_bin: String) -> Result<()> {
let mut client = CodexClient::spawn(codex_bin)?;
fn get_account_rate_limits(codex_bin: &str, config_overrides: &[String]) -> Result<()> {
let mut client = CodexClient::spawn(codex_bin, config_overrides)?;
let initialize = client.initialize()?;
println!("< initialize response: {initialize:?}");
@@ -301,6 +350,18 @@ fn get_account_rate_limits(codex_bin: String) -> Result<()> {
Ok(())
}
fn model_list(codex_bin: &str, config_overrides: &[String]) -> Result<()> {
let mut client = CodexClient::spawn(codex_bin, config_overrides)?;
let initialize = client.initialize()?;
println!("< initialize response: {initialize:?}");
let response = client.model_list(ModelListParams::default())?;
println!("< model/list response: {response:?}");
Ok(())
}
struct CodexClient {
child: Child,
stdin: Option<ChildStdin>,
@@ -309,8 +370,12 @@ struct CodexClient {
}
impl CodexClient {
fn spawn(codex_bin: String) -> Result<Self> {
let mut codex_app_server = Command::new(&codex_bin)
fn spawn(codex_bin: &str, config_overrides: &[String]) -> Result<Self> {
let mut cmd = Command::new(codex_bin);
for override_kv in config_overrides {
cmd.arg("--config").arg(override_kv);
}
let mut codex_app_server = cmd
.arg("app-server")
.stdin(Stdio::piped())
.stdout(Stdio::piped())
@@ -351,7 +416,7 @@ impl CodexClient {
self.send_request(request, request_id, "initialize")
}
fn new_conversation(&mut self) -> Result<NewConversationResponse> {
fn start_thread(&mut self) -> Result<NewConversationResponse> {
let request_id = self.request_id();
let request = ClientRequest::NewConversation {
request_id: request_id.clone(),
@@ -363,7 +428,7 @@ impl CodexClient {
fn add_conversation_listener(
&mut self,
conversation_id: &ConversationId,
conversation_id: &ThreadId,
) -> Result<AddConversationSubscriptionResponse> {
let request_id = self.request_id();
let request = ClientRequest::AddConversationListener {
@@ -377,7 +442,7 @@ impl CodexClient {
self.send_request(request, request_id, "addConversationListener")
}
fn remove_conversation_listener(&mut self, subscription_id: Uuid) -> Result<()> {
fn remove_thread_listener(&mut self, subscription_id: Uuid) -> Result<()> {
let request_id = self.request_id();
let request = ClientRequest::RemoveConversationListener {
request_id: request_id.clone(),
@@ -395,7 +460,7 @@ impl CodexClient {
fn send_user_message(
&mut self,
conversation_id: &ConversationId,
conversation_id: &ThreadId,
message: &str,
) -> Result<SendUserMessageResponse> {
let request_id = self.request_id();
@@ -452,7 +517,17 @@ impl CodexClient {
self.send_request(request, request_id, "account/rateLimits/read")
}
fn stream_conversation(&mut self, conversation_id: &ConversationId) -> Result<()> {
fn model_list(&mut self, params: ModelListParams) -> Result<ModelListResponse> {
let request_id = self.request_id();
let request = ClientRequest::ModelList {
request_id: request_id.clone(),
params,
};
self.send_request(request, request_id, "model/list")
}
fn stream_conversation(&mut self, conversation_id: &ThreadId) -> Result<()> {
loop {
let notification = self.next_notification()?;
@@ -589,7 +664,7 @@ impl CodexClient {
fn extract_event(
&self,
notification: JSONRPCNotification,
conversation_id: &ConversationId,
conversation_id: &ThreadId,
) -> Result<Option<Event>> {
let params = notification
.params
@@ -603,7 +678,7 @@ impl CodexClient {
let conversation_value = map
.remove("conversationId")
.context("event missing conversationId")?;
let notification_conversation: ConversationId = serde_json::from_value(conversation_value)
let notification_conversation: ThreadId = serde_json::from_value(conversation_value)
.context("conversationId was not a valid UUID")?;
if &notification_conversation != conversation_id {

View File

@@ -72,6 +72,7 @@ Example (from OpenAI's official VSCode extension):
- `thread/resume` — reopen an existing thread by id so subsequent `turn/start` calls append to it.
- `thread/list` — page through stored rollouts; supports cursor-based pagination and optional `modelProviders` filtering.
- `thread/archive` — move a threads rollout file into the archived directory; returns `{}` on success.
- `thread/rollback` — drop the last N turns from the agents in-memory context and persist a rollback marker in the rollout so future resumes see the pruned history; returns the updated `thread` (with `turns` populated) on success.
- `turn/start` — add user input to a thread and begin Codex generation; responds with the initial `turn` object and streams `turn/started`, `item/*`, and `turn/completed` notifications.
- `turn/interrupt` — request cancellation of an in-flight turn by `(thread_id, turn_id)`; success is an empty `{}` response and the turn finishes with `status: "interrupted"`.
- `review/start` — kick off Codexs automated reviewer for a thread; responds like `turn/start` and emits `item/started`/`item/completed` notifications with `enteredReviewMode` and `exitedReviewMode` items, plus a final assistant `agentMessage` containing the review.

View File

@@ -1,7 +1,13 @@
use crate::codex_message_processor::ApiVersion;
use crate::codex_message_processor::PendingInterrupts;
use crate::codex_message_processor::PendingRollbacks;
use crate::codex_message_processor::TurnSummary;
use crate::codex_message_processor::TurnSummaryStore;
use crate::codex_message_processor::read_event_msgs_from_rollout;
use crate::codex_message_processor::read_summary_from_rollout;
use crate::codex_message_processor::summary_to_thread;
use crate::error_code::INTERNAL_ERROR_CODE;
use crate::error_code::INVALID_REQUEST_ERROR_CODE;
use crate::outgoing_message::OutgoingMessageSender;
use codex_app_server_protocol::AccountRateLimitsUpdatedNotification;
use codex_app_server_protocol::AgentMessageDeltaNotification;
@@ -27,6 +33,7 @@ use codex_app_server_protocol::FileUpdateChange;
use codex_app_server_protocol::InterruptConversationResponse;
use codex_app_server_protocol::ItemCompletedNotification;
use codex_app_server_protocol::ItemStartedNotification;
use codex_app_server_protocol::JSONRPCErrorError;
use codex_app_server_protocol::McpToolCallError;
use codex_app_server_protocol::McpToolCallResult;
use codex_app_server_protocol::McpToolCallStatus;
@@ -40,6 +47,7 @@ use codex_app_server_protocol::ServerNotification;
use codex_app_server_protocol::ServerRequestPayload;
use codex_app_server_protocol::TerminalInteractionNotification;
use codex_app_server_protocol::ThreadItem;
use codex_app_server_protocol::ThreadRollbackResponse;
use codex_app_server_protocol::ThreadTokenUsage;
use codex_app_server_protocol::ThreadTokenUsageUpdatedNotification;
use codex_app_server_protocol::Turn;
@@ -50,9 +58,11 @@ use codex_app_server_protocol::TurnInterruptResponse;
use codex_app_server_protocol::TurnPlanStep;
use codex_app_server_protocol::TurnPlanUpdatedNotification;
use codex_app_server_protocol::TurnStatus;
use codex_core::CodexConversation;
use codex_app_server_protocol::build_turns_from_event_msgs;
use codex_core::CodexThread;
use codex_core::parse_command::shlex_join;
use codex_core::protocol::ApplyPatchApprovalRequestEvent;
use codex_core::protocol::CodexErrorInfo as CoreCodexErrorInfo;
use codex_core::protocol::Event;
use codex_core::protocol::EventMsg;
use codex_core::protocol::ExecApprovalRequestEvent;
@@ -66,7 +76,7 @@ use codex_core::protocol::TokenCountEvent;
use codex_core::protocol::TurnDiffEvent;
use codex_core::review_format::format_review_findings_block;
use codex_core::review_prompts;
use codex_protocol::ConversationId;
use codex_protocol::ThreadId;
use codex_protocol::plan_tool::UpdatePlanArgs;
use codex_protocol::protocol::ReviewOutputEvent;
use std::collections::HashMap;
@@ -78,14 +88,17 @@ use tracing::error;
type JsonValue = serde_json::Value;
#[allow(clippy::too_many_arguments)]
pub(crate) async fn apply_bespoke_event_handling(
event: Event,
conversation_id: ConversationId,
conversation: Arc<CodexConversation>,
conversation_id: ThreadId,
conversation: Arc<CodexThread>,
outgoing: Arc<OutgoingMessageSender>,
pending_interrupts: PendingInterrupts,
pending_rollbacks: PendingRollbacks,
turn_summary_store: TurnSummaryStore,
api_version: ApiVersion,
fallback_model_provider: String,
) {
let Event {
id: event_turn_id,
@@ -337,6 +350,26 @@ pub(crate) async fn apply_bespoke_event_handling(
.await;
}
EventMsg::Error(ev) => {
let message = ev.message.clone();
let codex_error_info = ev.codex_error_info.clone();
// If this error belongs to an in-flight `thread/rollback` request, fail that request
// (and clear pending state) so subsequent rollbacks are unblocked.
//
// Don't send a notification for this error.
if matches!(
codex_error_info,
Some(CoreCodexErrorInfo::ThreadRollbackFailed)
) {
return handle_thread_rollback_failed(
conversation_id,
message,
&pending_rollbacks,
&outgoing,
)
.await;
};
let turn_error = TurnError {
message: ev.message,
codex_error_info: ev.codex_error_info.map(V2CodexErrorInfo::from),
@@ -345,7 +378,7 @@ pub(crate) async fn apply_bespoke_event_handling(
handle_error(conversation_id, turn_error.clone(), &turn_summary_store).await;
outgoing
.send_server_notification(ServerNotification::Error(ErrorNotification {
error: turn_error,
error: turn_error.clone(),
will_retry: false,
thread_id: conversation_id.to_string(),
turn_id: event_turn_id.clone(),
@@ -690,6 +723,58 @@ pub(crate) async fn apply_bespoke_event_handling(
)
.await;
}
EventMsg::ThreadRolledBack(_rollback_event) => {
let pending = {
let mut map = pending_rollbacks.lock().await;
map.remove(&conversation_id)
};
if let Some(request_id) = pending {
let rollout_path = conversation.rollout_path();
let response = match read_summary_from_rollout(
rollout_path.as_path(),
fallback_model_provider.as_str(),
)
.await
{
Ok(summary) => {
let mut thread = summary_to_thread(summary);
match read_event_msgs_from_rollout(rollout_path.as_path()).await {
Ok(events) => {
thread.turns = build_turns_from_event_msgs(&events);
ThreadRollbackResponse { thread }
}
Err(err) => {
let error = JSONRPCErrorError {
code: INTERNAL_ERROR_CODE,
message: format!(
"failed to load rollout `{}`: {err}",
rollout_path.display()
),
data: None,
};
outgoing.send_error(request_id, error).await;
return;
}
}
}
Err(err) => {
let error = JSONRPCErrorError {
code: INTERNAL_ERROR_CODE,
message: format!(
"failed to load rollout `{}`: {err}",
rollout_path.display()
),
data: None,
};
outgoing.send_error(request_id, error).await;
return;
}
};
outgoing.send_response(request_id, response).await;
}
}
EventMsg::TurnDiff(turn_diff_event) => {
handle_turn_diff(
conversation_id,
@@ -716,7 +801,7 @@ pub(crate) async fn apply_bespoke_event_handling(
}
async fn handle_turn_diff(
conversation_id: ConversationId,
conversation_id: ThreadId,
event_turn_id: &str,
turn_diff_event: TurnDiffEvent,
api_version: ApiVersion,
@@ -735,7 +820,7 @@ async fn handle_turn_diff(
}
async fn handle_turn_plan_update(
conversation_id: ConversationId,
conversation_id: ThreadId,
event_turn_id: &str,
plan_update_event: UpdatePlanArgs,
api_version: ApiVersion,
@@ -759,7 +844,7 @@ async fn handle_turn_plan_update(
}
async fn emit_turn_completed_with_status(
conversation_id: ConversationId,
conversation_id: ThreadId,
event_turn_id: String,
status: TurnStatus,
error: Option<TurnError>,
@@ -780,7 +865,7 @@ async fn emit_turn_completed_with_status(
}
async fn complete_file_change_item(
conversation_id: ConversationId,
conversation_id: ThreadId,
item_id: String,
changes: Vec<FileUpdateChange>,
status: PatchApplyStatus,
@@ -812,7 +897,7 @@ async fn complete_file_change_item(
#[allow(clippy::too_many_arguments)]
async fn complete_command_execution_item(
conversation_id: ConversationId,
conversation_id: ThreadId,
turn_id: String,
item_id: String,
command: String,
@@ -845,7 +930,7 @@ async fn complete_command_execution_item(
async fn maybe_emit_raw_response_item_completed(
api_version: ApiVersion,
conversation_id: ConversationId,
conversation_id: ThreadId,
turn_id: &str,
item: codex_protocol::models::ResponseItem,
outgoing: &OutgoingMessageSender,
@@ -865,7 +950,7 @@ async fn maybe_emit_raw_response_item_completed(
}
async fn find_and_remove_turn_summary(
conversation_id: ConversationId,
conversation_id: ThreadId,
turn_summary_store: &TurnSummaryStore,
) -> TurnSummary {
let mut map = turn_summary_store.lock().await;
@@ -873,7 +958,7 @@ async fn find_and_remove_turn_summary(
}
async fn handle_turn_complete(
conversation_id: ConversationId,
conversation_id: ThreadId,
event_turn_id: String,
outgoing: &OutgoingMessageSender,
turn_summary_store: &TurnSummaryStore,
@@ -889,7 +974,7 @@ async fn handle_turn_complete(
}
async fn handle_turn_interrupted(
conversation_id: ConversationId,
conversation_id: ThreadId,
event_turn_id: String,
outgoing: &OutgoingMessageSender,
turn_summary_store: &TurnSummaryStore,
@@ -906,8 +991,33 @@ async fn handle_turn_interrupted(
.await;
}
async fn handle_thread_rollback_failed(
conversation_id: ThreadId,
message: String,
pending_rollbacks: &PendingRollbacks,
outgoing: &OutgoingMessageSender,
) {
let pending_rollback = {
let mut map = pending_rollbacks.lock().await;
map.remove(&conversation_id)
};
if let Some(request_id) = pending_rollback {
outgoing
.send_error(
request_id,
JSONRPCErrorError {
code: INVALID_REQUEST_ERROR_CODE,
message: message.clone(),
data: None,
},
)
.await;
}
}
async fn handle_token_count_event(
conversation_id: ConversationId,
conversation_id: ThreadId,
turn_id: String,
token_count_event: TokenCountEvent,
outgoing: &OutgoingMessageSender,
@@ -935,7 +1045,7 @@ async fn handle_token_count_event(
}
async fn handle_error(
conversation_id: ConversationId,
conversation_id: ThreadId,
error: TurnError,
turn_summary_store: &TurnSummaryStore,
) {
@@ -946,7 +1056,7 @@ async fn handle_error(
async fn on_patch_approval_response(
event_turn_id: String,
receiver: oneshot::Receiver<JsonValue>,
codex: Arc<CodexConversation>,
codex: Arc<CodexThread>,
) {
let response = receiver.await;
let value = match response {
@@ -988,7 +1098,7 @@ async fn on_patch_approval_response(
async fn on_exec_approval_response(
event_turn_id: String,
receiver: oneshot::Receiver<JsonValue>,
conversation: Arc<CodexConversation>,
conversation: Arc<CodexThread>,
) {
let response = receiver.await;
let value = match response {
@@ -1086,11 +1196,11 @@ fn format_file_change_diff(change: &CoreFileChange) -> String {
#[allow(clippy::too_many_arguments)]
async fn on_file_change_request_approval_response(
event_turn_id: String,
conversation_id: ConversationId,
conversation_id: ThreadId,
item_id: String,
changes: Vec<FileUpdateChange>,
receiver: oneshot::Receiver<JsonValue>,
codex: Arc<CodexConversation>,
codex: Arc<CodexThread>,
outgoing: Arc<OutgoingMessageSender>,
turn_summary_store: TurnSummaryStore,
) {
@@ -1155,13 +1265,13 @@ async fn on_file_change_request_approval_response(
#[allow(clippy::too_many_arguments)]
async fn on_command_execution_request_approval_response(
event_turn_id: String,
conversation_id: ConversationId,
conversation_id: ThreadId,
item_id: String,
command: String,
cwd: PathBuf,
command_actions: Vec<V2ParsedCommand>,
receiver: oneshot::Receiver<JsonValue>,
conversation: Arc<CodexConversation>,
conversation: Arc<CodexThread>,
outgoing: Arc<OutgoingMessageSender>,
) {
let response = receiver.await;
@@ -1334,7 +1444,7 @@ mod tests {
#[tokio::test]
async fn test_handle_error_records_message() -> Result<()> {
let conversation_id = ConversationId::new();
let conversation_id = ThreadId::new();
let turn_summary_store = new_turn_summary_store();
handle_error(
@@ -1362,7 +1472,7 @@ mod tests {
#[tokio::test]
async fn test_handle_turn_complete_emits_completed_without_error() -> Result<()> {
let conversation_id = ConversationId::new();
let conversation_id = ThreadId::new();
let event_turn_id = "complete1".to_string();
let (tx, mut rx) = mpsc::channel(CHANNEL_CAPACITY);
let outgoing = Arc::new(OutgoingMessageSender::new(tx));
@@ -1394,7 +1504,7 @@ mod tests {
#[tokio::test]
async fn test_handle_turn_interrupted_emits_interrupted_with_error() -> Result<()> {
let conversation_id = ConversationId::new();
let conversation_id = ThreadId::new();
let event_turn_id = "interrupt1".to_string();
let turn_summary_store = new_turn_summary_store();
handle_error(
@@ -1436,7 +1546,7 @@ mod tests {
#[tokio::test]
async fn test_handle_turn_complete_emits_failed_with_error() -> Result<()> {
let conversation_id = ConversationId::new();
let conversation_id = ThreadId::new();
let event_turn_id = "complete_err1".to_string();
let turn_summary_store = new_turn_summary_store();
handle_error(
@@ -1501,7 +1611,7 @@ mod tests {
],
};
let conversation_id = ConversationId::new();
let conversation_id = ThreadId::new();
handle_turn_plan_update(
conversation_id,
@@ -1535,7 +1645,7 @@ mod tests {
#[tokio::test]
async fn test_handle_token_count_event_emits_usage_and_rate_limits() -> Result<()> {
let conversation_id = ConversationId::new();
let conversation_id = ThreadId::new();
let turn_id = "turn-123".to_string();
let (tx, mut rx) = mpsc::channel(CHANNEL_CAPACITY);
let outgoing = Arc::new(OutgoingMessageSender::new(tx));
@@ -1620,7 +1730,7 @@ mod tests {
#[tokio::test]
async fn test_handle_token_count_event_without_usage_info() -> Result<()> {
let conversation_id = ConversationId::new();
let conversation_id = ThreadId::new();
let turn_id = "turn-456".to_string();
let (tx, mut rx) = mpsc::channel(CHANNEL_CAPACITY);
let outgoing = Arc::new(OutgoingMessageSender::new(tx));
@@ -1654,7 +1764,7 @@ mod tests {
},
};
let thread_id = ConversationId::new().to_string();
let thread_id = ThreadId::new().to_string();
let turn_id = "turn_1".to_string();
let notification = construct_mcp_tool_call_notification(
begin_event.clone(),
@@ -1684,8 +1794,8 @@ mod tests {
#[tokio::test]
async fn test_handle_turn_complete_emits_error_multiple_turns() -> Result<()> {
// Conversation A will have two turns; Conversation B will have one turn.
let conversation_a = ConversationId::new();
let conversation_b = ConversationId::new();
let conversation_a = ThreadId::new();
let conversation_b = ThreadId::new();
let turn_summary_store = new_turn_summary_store();
let (tx, mut rx) = mpsc::channel(CHANNEL_CAPACITY);
@@ -1812,7 +1922,7 @@ mod tests {
},
};
let thread_id = ConversationId::new().to_string();
let thread_id = ThreadId::new().to_string();
let turn_id = "turn_2".to_string();
let notification = construct_mcp_tool_call_notification(
begin_event.clone(),
@@ -1863,7 +1973,7 @@ mod tests {
result: Ok(result),
};
let thread_id = ConversationId::new().to_string();
let thread_id = ThreadId::new().to_string();
let turn_id = "turn_3".to_string();
let notification = construct_mcp_tool_call_end_notification(
end_event.clone(),
@@ -1906,7 +2016,7 @@ mod tests {
result: Err("boom".to_string()),
};
let thread_id = ConversationId::new().to_string();
let thread_id = ThreadId::new().to_string();
let turn_id = "turn_4".to_string();
let notification = construct_mcp_tool_call_end_notification(
end_event.clone(),
@@ -1940,7 +2050,7 @@ mod tests {
let (tx, mut rx) = mpsc::channel(CHANNEL_CAPACITY);
let outgoing = OutgoingMessageSender::new(tx);
let unified_diff = "--- a\n+++ b\n".to_string();
let conversation_id = ConversationId::new();
let conversation_id = ThreadId::new();
handle_turn_diff(
conversation_id,
@@ -1975,7 +2085,7 @@ mod tests {
async fn test_handle_turn_diff_is_noop_for_v1() -> Result<()> {
let (tx, mut rx) = mpsc::channel(CHANNEL_CAPACITY);
let outgoing = OutgoingMessageSender::new(tx);
let conversation_id = ConversationId::new();
let conversation_id = ThreadId::new();
handle_turn_diff(
conversation_id,

File diff suppressed because it is too large Load Diff

View File

@@ -9,6 +9,7 @@ use codex_app_server_protocol::ConfigWriteResponse;
use codex_app_server_protocol::JSONRPCErrorError;
use codex_core::config::ConfigService;
use codex_core::config::ConfigServiceError;
use codex_core::config_loader::LoaderOverrides;
use serde_json::json;
use std::path::PathBuf;
use toml::Value as TomlValue;
@@ -19,9 +20,13 @@ pub(crate) struct ConfigApi {
}
impl ConfigApi {
pub(crate) fn new(codex_home: PathBuf, cli_overrides: Vec<(String, TomlValue)>) -> Self {
pub(crate) fn new(
codex_home: PathBuf,
cli_overrides: Vec<(String, TomlValue)>,
loader_overrides: LoaderOverrides,
) -> Self {
Self {
service: ConfigService::new(codex_home, cli_overrides),
service: ConfigService::new(codex_home, cli_overrides, loader_overrides),
}
}

View File

@@ -1,7 +1,8 @@
#![deny(clippy::print_stdout, clippy::print_stderr)]
use codex_common::CliConfigOverrides;
use codex_core::config::Config;
use codex_core::config::ConfigBuilder;
use codex_core::config_loader::LoaderOverrides;
use std::io::ErrorKind;
use std::io::Result as IoResult;
use std::path::PathBuf;
@@ -42,6 +43,7 @@ const CHANNEL_CAPACITY: usize = 128;
pub async fn run_main(
codex_linux_sandbox_exe: Option<PathBuf>,
cli_config_overrides: CliConfigOverrides,
loader_overrides: LoaderOverrides,
) -> IoResult<()> {
// Set up channels.
let (incoming_tx, mut incoming_rx) = mpsc::channel::<JSONRPCMessage>(CHANNEL_CAPACITY);
@@ -78,7 +80,11 @@ pub async fn run_main(
format!("error parsing -c overrides: {e}"),
)
})?;
let config = Config::load_with_cli_overrides(cli_kv_overrides.clone())
let loader_overrides_for_config_api = loader_overrides.clone();
let config = ConfigBuilder::default()
.cli_overrides(cli_kv_overrides.clone())
.loader_overrides(loader_overrides)
.build()
.await
.map_err(|e| {
std::io::Error::new(ErrorKind::InvalidData, format!("error loading config: {e}"))
@@ -120,11 +126,13 @@ pub async fn run_main(
let processor_handle = tokio::spawn({
let outgoing_message_sender = OutgoingMessageSender::new(outgoing_tx);
let cli_overrides: Vec<(String, TomlValue)> = cli_kv_overrides.clone();
let loader_overrides = loader_overrides_for_config_api;
let mut processor = MessageProcessor::new(
outgoing_message_sender,
codex_linux_sandbox_exe,
std::sync::Arc::new(config),
cli_overrides,
loader_overrides,
feedback.clone(),
);
async move {

View File

@@ -1,10 +1,42 @@
use codex_app_server::run_main;
use codex_arg0::arg0_dispatch_or_else;
use codex_common::CliConfigOverrides;
use codex_core::config_loader::LoaderOverrides;
use std::path::PathBuf;
// Debug-only test hook: lets integration tests point the server at a temporary
// managed config file without writing to /etc.
const MANAGED_CONFIG_PATH_ENV_VAR: &str = "CODEX_APP_SERVER_MANAGED_CONFIG_PATH";
fn main() -> anyhow::Result<()> {
arg0_dispatch_or_else(|codex_linux_sandbox_exe| async move {
run_main(codex_linux_sandbox_exe, CliConfigOverrides::default()).await?;
let managed_config_path = managed_config_path_from_debug_env();
let loader_overrides = LoaderOverrides {
managed_config_path,
..Default::default()
};
run_main(
codex_linux_sandbox_exe,
CliConfigOverrides::default(),
loader_overrides,
)
.await?;
Ok(())
})
}
fn managed_config_path_from_debug_env() -> Option<PathBuf> {
#[cfg(debug_assertions)]
{
if let Ok(value) = std::env::var(MANAGED_CONFIG_PATH_ENV_VAR) {
return if value.is_empty() {
None
} else {
Some(PathBuf::from(value))
};
}
}
None
}

View File

@@ -18,8 +18,9 @@ use codex_app_server_protocol::JSONRPCRequest;
use codex_app_server_protocol::JSONRPCResponse;
use codex_app_server_protocol::RequestId;
use codex_core::AuthManager;
use codex_core::ConversationManager;
use codex_core::ThreadManager;
use codex_core::config::Config;
use codex_core::config_loader::LoaderOverrides;
use codex_core::default_client::USER_AGENT_SUFFIX;
use codex_core::default_client::get_codex_user_agent;
use codex_feedback::CodexFeedback;
@@ -41,6 +42,7 @@ impl MessageProcessor {
codex_linux_sandbox_exe: Option<PathBuf>,
config: Arc<Config>,
cli_overrides: Vec<(String, TomlValue)>,
loader_overrides: LoaderOverrides,
feedback: CodexFeedback,
) -> Self {
let outgoing = Arc::new(outgoing);
@@ -49,20 +51,21 @@ impl MessageProcessor {
false,
config.cli_auth_credentials_store_mode,
);
let conversation_manager = Arc::new(ConversationManager::new(
let thread_manager = Arc::new(ThreadManager::new(
config.codex_home.clone(),
auth_manager.clone(),
SessionSource::VSCode,
));
let codex_message_processor = CodexMessageProcessor::new(
auth_manager,
conversation_manager,
thread_manager,
outgoing.clone(),
codex_linux_sandbox_exe,
Arc::clone(&config),
cli_overrides.clone(),
feedback,
);
let config_api = ConfigApi::new(config.codex_home.clone(), cli_overrides);
let config_api = ConfigApi::new(config.codex_home.clone(), cli_overrides, loader_overrides);
Self {
outgoing,

View File

@@ -2,19 +2,17 @@ use std::sync::Arc;
use codex_app_server_protocol::Model;
use codex_app_server_protocol::ReasoningEffortOption;
use codex_core::ConversationManager;
use codex_core::ThreadManager;
use codex_core::config::Config;
use codex_protocol::openai_models::ModelPreset;
use codex_protocol::openai_models::ReasoningEffortPreset;
pub async fn supported_models(
conversation_manager: Arc<ConversationManager>,
config: &Config,
) -> Vec<Model> {
conversation_manager
pub async fn supported_models(thread_manager: Arc<ThreadManager>, config: &Config) -> Vec<Model> {
thread_manager
.list_models(config)
.await
.into_iter()
.filter(|preset| preset.show_in_picker)
.map(model_from_preset)
.collect()
}

View File

@@ -45,6 +45,7 @@ use codex_app_server_protocol::SetDefaultModelParams;
use codex_app_server_protocol::ThreadArchiveParams;
use codex_app_server_protocol::ThreadListParams;
use codex_app_server_protocol::ThreadResumeParams;
use codex_app_server_protocol::ThreadRollbackParams;
use codex_app_server_protocol::ThreadStartParams;
use codex_app_server_protocol::TurnInterruptParams;
use codex_app_server_protocol::TurnStartParams;
@@ -197,7 +198,7 @@ impl McpProcess {
}
/// Send a `removeConversationListener` JSON-RPC request.
pub async fn send_remove_conversation_listener_request(
pub async fn send_remove_thread_listener_request(
&mut self,
params: RemoveConversationListenerParams,
) -> anyhow::Result<i64> {
@@ -316,6 +317,15 @@ impl McpProcess {
self.send_request("thread/archive", params).await
}
/// Send a `thread/rollback` JSON-RPC request.
pub async fn send_thread_rollback_request(
&mut self,
params: ThreadRollbackParams,
) -> anyhow::Result<i64> {
let params = Some(serde_json::to_value(params)?);
self.send_request("thread/rollback", params).await
}
/// Send a `thread/list` JSON-RPC request.
pub async fn send_thread_list_request(
&mut self,

View File

@@ -15,7 +15,7 @@ fn preset_to_info(preset: &ModelPreset, priority: i32) -> ModelInfo {
slug: preset.id.clone(),
display_name: preset.display_name.clone(),
description: Some(preset.description.clone()),
default_reasoning_level: preset.default_reasoning_effort,
default_reasoning_level: Some(preset.default_reasoning_effort),
supported_reasoning_levels: preset.supported_reasoning_efforts.clone(),
shell_type: ConfigShellToolType::ShellCommand,
visibility: if preset.show_in_picker {
@@ -26,14 +26,16 @@ fn preset_to_info(preset: &ModelPreset, priority: i32) -> ModelInfo {
supported_in_api: true,
priority,
upgrade: preset.upgrade.as_ref().map(|u| u.id.clone()),
base_instructions: None,
base_instructions: "base instructions".to_string(),
supports_reasoning_summaries: false,
support_verbosity: false,
default_verbosity: None,
apply_patch_tool_type: None,
truncation_policy: TruncationPolicyConfig::bytes(10_000),
supports_parallel_tool_calls: false,
context_window: None,
context_window: Some(272_000),
auto_compact_token_limit: None,
effective_context_window_percent: 95,
experimental_supported_tools: Vec::new(),
}
}

View File

@@ -1,5 +1,5 @@
use anyhow::Result;
use codex_protocol::ConversationId;
use codex_protocol::ThreadId;
use codex_protocol::protocol::GitInfo;
use codex_protocol::protocol::SessionMeta;
use codex_protocol::protocol::SessionMetaLine;
@@ -28,7 +28,7 @@ pub fn create_fake_rollout(
) -> Result<String> {
let uuid = Uuid::new_v4();
let uuid_str = uuid.to_string();
let conversation_id = ConversationId::from_string(&uuid_str)?;
let conversation_id = ThreadId::from_string(&uuid_str)?;
// sessions/YYYY/MM/DD derived from filename_ts (YYYY-MM-DDThh-mm-ss)
let year = &filename_ts[0..4];

View File

@@ -145,9 +145,7 @@ async fn test_codex_jsonrpc_conversation_flow() -> Result<()> {
// 4) removeConversationListener
let remove_listener_id = mcp
.send_remove_conversation_listener_request(RemoveConversationListenerParams {
subscription_id,
})
.send_remove_thread_listener_request(RemoveConversationListenerParams { subscription_id })
.await?;
let remove_listener_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,

View File

@@ -6,7 +6,7 @@ use codex_app_server_protocol::JSONRPCNotification;
use codex_app_server_protocol::JSONRPCResponse;
use codex_app_server_protocol::ListConversationsParams;
use codex_app_server_protocol::ListConversationsResponse;
use codex_app_server_protocol::NewConversationParams; // reused for overrides shape
use codex_app_server_protocol::NewConversationParams;
use codex_app_server_protocol::RequestId;
use codex_app_server_protocol::ResumeConversationParams;
use codex_app_server_protocol::ResumeConversationResponse;

View File

@@ -1,8 +1,8 @@
mod archive_conversation;
mod archive_thread;
mod auth;
mod codex_message_processor_flow;
mod config;
mod create_conversation;
mod create_thread;
mod fuzzy_file_search;
mod interrupt;
mod list_resume;

View File

@@ -13,7 +13,7 @@ use codex_app_server_protocol::NewConversationResponse;
use codex_app_server_protocol::RequestId;
use codex_app_server_protocol::SendUserMessageParams;
use codex_app_server_protocol::SendUserMessageResponse;
use codex_protocol::ConversationId;
use codex_protocol::ThreadId;
use codex_protocol::models::ContentItem;
use codex_protocol::models::ResponseItem;
use codex_protocol::protocol::RawResponseItemEvent;
@@ -81,7 +81,7 @@ async fn test_send_message_success() -> Result<()> {
#[expect(clippy::expect_used)]
async fn send_message(
message: &str,
conversation_id: ConversationId,
conversation_id: ThreadId,
mcp: &mut McpProcess,
) -> Result<()> {
// Now exercise sendUserMessage.
@@ -220,7 +220,7 @@ async fn test_send_message_session_not_found() -> Result<()> {
let mut mcp = McpProcess::new(codex_home.path()).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
let unknown = ConversationId::new();
let unknown = ThreadId::new();
let req_id = mcp
.send_send_user_message_request(SendUserMessageParams {
conversation_id: unknown,
@@ -268,10 +268,7 @@ stream_max_retries = 0
}
#[expect(clippy::expect_used)]
async fn read_raw_response_item(
mcp: &mut McpProcess,
conversation_id: ConversationId,
) -> ResponseItem {
async fn read_raw_response_item(mcp: &mut McpProcess, conversation_id: ThreadId) -> ResponseItem {
loop {
let raw_notification: JSONRPCNotification = timeout(
DEFAULT_READ_TIMEOUT,

View File

@@ -184,7 +184,10 @@ writable_roots = [{}]
let mut mcp = McpProcess::new_with_env(
codex_home.path(),
&[("CODEX_MANAGED_CONFIG_PATH", Some(&managed_path_str))],
&[(
"CODEX_APP_SERVER_MANAGED_CONFIG_PATH",
Some(&managed_path_str),
)],
)
.await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;

View File

@@ -7,6 +7,7 @@ mod review;
mod thread_archive;
mod thread_list;
mod thread_resume;
mod thread_rollback;
mod thread_start;
mod turn_interrupt;
mod turn_start;

View File

@@ -74,7 +74,7 @@ async fn list_models_returns_all_models_with_large_limit() -> Result<()> {
},
ReasoningEffortOption {
reasoning_effort: ReasoningEffort::XHigh,
description: "Extra high reasoning for complex problems".to_string(),
description: "Extra high reasoning depth for complex problems".to_string(),
},
],
default_reasoning_effort: ReasoningEffort::Medium,

View File

@@ -8,7 +8,7 @@ use codex_app_server_protocol::ThreadArchiveResponse;
use codex_app_server_protocol::ThreadStartParams;
use codex_app_server_protocol::ThreadStartResponse;
use codex_core::ARCHIVED_SESSIONS_SUBDIR;
use codex_core::find_conversation_path_by_id_str;
use codex_core::find_thread_path_by_id_str;
use std::path::Path;
use tempfile::TempDir;
use tokio::time::timeout;
@@ -39,7 +39,7 @@ async fn thread_archive_moves_rollout_into_archived_directory() -> Result<()> {
assert!(!thread.id.is_empty());
// Locate the rollout path recorded for this thread id.
let rollout_path = find_conversation_path_by_id_str(codex_home.path(), &thread.id)
let rollout_path = find_thread_path_by_id_str(codex_home.path(), &thread.id)
.await?
.expect("expected rollout path for thread id to exist");
assert!(

View File

@@ -0,0 +1,177 @@
use anyhow::Result;
use app_test_support::McpProcess;
use app_test_support::create_final_assistant_message_sse_response;
use app_test_support::create_mock_chat_completions_server_unchecked;
use app_test_support::to_response;
use codex_app_server_protocol::JSONRPCResponse;
use codex_app_server_protocol::RequestId;
use codex_app_server_protocol::ThreadItem;
use codex_app_server_protocol::ThreadResumeParams;
use codex_app_server_protocol::ThreadResumeResponse;
use codex_app_server_protocol::ThreadRollbackParams;
use codex_app_server_protocol::ThreadRollbackResponse;
use codex_app_server_protocol::ThreadStartParams;
use codex_app_server_protocol::ThreadStartResponse;
use codex_app_server_protocol::TurnStartParams;
use codex_app_server_protocol::UserInput as V2UserInput;
use pretty_assertions::assert_eq;
use tempfile::TempDir;
use tokio::time::timeout;
const DEFAULT_READ_TIMEOUT: std::time::Duration = std::time::Duration::from_secs(10);
#[tokio::test]
async fn thread_rollback_drops_last_turns_and_persists_to_rollout() -> Result<()> {
// Three Codex turns hit the mock model (session start + two turn/start calls).
let responses = vec![
create_final_assistant_message_sse_response("Done")?,
create_final_assistant_message_sse_response("Done")?,
create_final_assistant_message_sse_response("Done")?,
];
let server = create_mock_chat_completions_server_unchecked(responses).await;
let codex_home = TempDir::new()?;
create_config_toml(codex_home.path(), &server.uri())?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
// Start a thread.
let start_id = mcp
.send_thread_start_request(ThreadStartParams {
model: Some("mock-model".to_string()),
..Default::default()
})
.await?;
let start_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(start_id)),
)
.await??;
let ThreadStartResponse { thread, .. } = to_response::<ThreadStartResponse>(start_resp)?;
// Two turns.
let first_text = "First";
let turn1_id = mcp
.send_turn_start_request(TurnStartParams {
thread_id: thread.id.clone(),
input: vec![V2UserInput::Text {
text: first_text.to_string(),
}],
..Default::default()
})
.await?;
let _turn1_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(turn1_id)),
)
.await??;
let _completed1 = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("turn/completed"),
)
.await??;
let turn2_id = mcp
.send_turn_start_request(TurnStartParams {
thread_id: thread.id.clone(),
input: vec![V2UserInput::Text {
text: "Second".to_string(),
}],
..Default::default()
})
.await?;
let _turn2_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(turn2_id)),
)
.await??;
let _completed2 = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("turn/completed"),
)
.await??;
// Roll back the last turn.
let rollback_id = mcp
.send_thread_rollback_request(ThreadRollbackParams {
thread_id: thread.id.clone(),
num_turns: 1,
})
.await?;
let rollback_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(rollback_id)),
)
.await??;
let ThreadRollbackResponse {
thread: rolled_back_thread,
} = to_response::<ThreadRollbackResponse>(rollback_resp)?;
assert_eq!(rolled_back_thread.turns.len(), 1);
assert_eq!(rolled_back_thread.turns[0].items.len(), 2);
match &rolled_back_thread.turns[0].items[0] {
ThreadItem::UserMessage { content, .. } => {
assert_eq!(
content,
&vec![V2UserInput::Text {
text: first_text.to_string()
}]
);
}
other => panic!("expected user message item, got {other:?}"),
}
// Resume and confirm the history is pruned.
let resume_id = mcp
.send_thread_resume_request(ThreadResumeParams {
thread_id: thread.id,
..Default::default()
})
.await?;
let resume_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(resume_id)),
)
.await??;
let ThreadResumeResponse { thread, .. } = to_response::<ThreadResumeResponse>(resume_resp)?;
assert_eq!(thread.turns.len(), 1);
assert_eq!(thread.turns[0].items.len(), 2);
match &thread.turns[0].items[0] {
ThreadItem::UserMessage { content, .. } => {
assert_eq!(
content,
&vec![V2UserInput::Text {
text: first_text.to_string()
}]
);
}
other => panic!("expected user message item, got {other:?}"),
}
Ok(())
}
fn create_config_toml(codex_home: &std::path::Path, server_uri: &str) -> std::io::Result<()> {
let config_toml = codex_home.join("config.toml");
std::fs::write(
config_toml,
format!(
r#"
model = "mock-model"
approval_policy = "never"
sandbox_mode = "read-only"
model_provider = "mock_provider"
[model_providers.mock_provider]
name = "Mock provider for test"
base_url = "{server_uri}/v1"
wire_api = "chat"
request_max_retries = 0
stream_max_retries = 0
"#
),
)
}

View File

@@ -227,11 +227,14 @@ fn check_start_and_end_lines_strict(
first_line: Option<&&str>,
last_line: Option<&&str>,
) -> Result<(), ParseError> {
let first_line = first_line.map(|line| line.trim());
let last_line = last_line.map(|line| line.trim());
match (first_line, last_line) {
(Some(&first), Some(&last)) if first == BEGIN_PATCH_MARKER && last == END_PATCH_MARKER => {
(Some(first), Some(last)) if first == BEGIN_PATCH_MARKER && last == END_PATCH_MARKER => {
Ok(())
}
(Some(&first), _) if first != BEGIN_PATCH_MARKER => Err(InvalidPatchError(String::from(
(Some(first), _) if first != BEGIN_PATCH_MARKER => Err(InvalidPatchError(String::from(
"The first line of the patch must be '*** Begin Patch'",
))),
_ => Err(InvalidPatchError(String::from(
@@ -444,6 +447,25 @@ fn test_parse_patch() {
"The last line of the patch must be '*** End Patch'".to_string()
))
);
assert_eq!(
parse_patch_text(
concat!(
"*** Begin Patch",
" ",
"\n*** Add File: foo\n+hi\n",
" ",
"*** End Patch"
),
ParseMode::Strict
)
.unwrap()
.hunks,
vec![AddFile {
path: PathBuf::from("foo"),
contents: "hi\n".to_string()
}]
);
assert_eq!(
parse_patch_text(
"*** Begin Patch\n\

View File

@@ -0,0 +1 @@
obsolete

View File

@@ -0,0 +1,3 @@
*** Begin Patch
*** Delete File: obsolete.txt
*** End Patch

View File

@@ -0,0 +1,6 @@
*** Begin Patch
*** Update File: file.txt
@@
-one
+two
*** End Patch

View File

@@ -0,0 +1,2 @@
line1
line3

View File

@@ -0,0 +1,3 @@
line1
line2
line3

View File

@@ -0,0 +1,7 @@
*** Begin Patch
*** Update File: lines.txt
@@
line1
-line2
line3
*** End Patch

View File

@@ -0,0 +1,2 @@
first
second updated

View File

@@ -0,0 +1,2 @@
first
second

View File

@@ -0,0 +1,8 @@
*** Begin Patch
*** Update File: tail.txt
@@
first
-second
+second updated
*** End of File
*** End Patch

View File

@@ -283,7 +283,7 @@ struct StdioToUdsCommand {
fn format_exit_messages(exit_info: AppExitInfo, color_enabled: bool) -> Vec<String> {
let AppExitInfo {
token_usage,
conversation_id,
thread_id: conversation_id,
..
} = exit_info;
@@ -480,7 +480,12 @@ async fn cli_main(codex_linux_sandbox_exe: Option<PathBuf>) -> anyhow::Result<()
}
Some(Subcommand::AppServer(app_server_cli)) => match app_server_cli.subcommand {
None => {
codex_app_server::run_main(codex_linux_sandbox_exe, root_config_overrides).await?;
codex_app_server::run_main(
codex_linux_sandbox_exe,
root_config_overrides,
codex_core::config_loader::LoaderOverrides::default(),
)
.await?;
}
Some(AppServerSubcommand::GenerateTs(gen_cli)) => {
codex_app_server_protocol::generate_ts(
@@ -785,7 +790,7 @@ mod tests {
use super::*;
use assert_matches::assert_matches;
use codex_core::protocol::TokenUsage;
use codex_protocol::ConversationId;
use codex_protocol::ThreadId;
use pretty_assertions::assert_eq;
fn finalize_from_args(args: &[&str]) -> TuiCli {
@@ -825,9 +830,7 @@ mod tests {
};
AppExitInfo {
token_usage,
conversation_id: conversation
.map(ConversationId::from_string)
.map(Result::unwrap),
thread_id: conversation.map(ThreadId::from_string).map(Result::unwrap),
update_action: None,
}
}
@@ -836,7 +839,7 @@ mod tests {
fn format_exit_messages_skips_zero_usage() {
let exit_info = AppExitInfo {
token_usage: TokenUsage::default(),
conversation_id: None,
thread_id: None,
update_action: None,
};
let lines = format_exit_messages(exit_info, false);

View File

@@ -59,3 +59,61 @@ prefix_rule(
Ok(())
}
#[test]
fn execpolicy_check_includes_justification_when_present() -> Result<(), Box<dyn std::error::Error>>
{
let codex_home = TempDir::new()?;
let policy_path = codex_home.path().join("rules").join("policy.rules");
fs::create_dir_all(
policy_path
.parent()
.expect("policy path should have a parent"),
)?;
fs::write(
&policy_path,
r#"
prefix_rule(
pattern = ["git", "push"],
decision = "forbidden",
justification = "pushing is blocked in this repo",
)
"#,
)?;
let output = Command::new(codex_utils_cargo_bin::cargo_bin("codex")?)
.env("CODEX_HOME", codex_home.path())
.args([
"execpolicy",
"check",
"--rules",
policy_path
.to_str()
.expect("policy path should be valid UTF-8"),
"git",
"push",
"origin",
"main",
])
.output()?;
assert!(output.status.success());
let result: serde_json::Value = serde_json::from_slice(&output.stdout)?;
assert_eq!(
result,
json!({
"decision": "forbidden",
"matchedRules": [
{
"prefixRuleMatch": {
"matchedPrefix": ["git", "push"],
"decision": "forbidden",
"justification": "pushing is blocked in this repo"
}
}
]
})
);
Ok(())
}

View File

@@ -15,7 +15,9 @@ serde = { workspace = true, features = ["derive"] }
serde_json = { workspace = true }
thiserror = { workspace = true }
tokio = { workspace = true, features = ["macros", "rt", "sync", "time"] }
tokio-tungstenite = { workspace = true }
tracing = { workspace = true }
url = { workspace = true }
eventsource-stream = { workspace = true }
regex-lite = { workspace = true }
tokio-util = { workspace = true, features = ["codec"] }

View File

@@ -2,4 +2,5 @@ pub mod chat;
pub mod compact;
pub mod models;
pub mod responses;
pub mod responses_ws;
mod streaming;

View File

@@ -215,14 +215,14 @@ mod tests {
"supported_in_api": true,
"priority": 1,
"upgrade": null,
"base_instructions": null,
"base_instructions": "base instructions",
"supports_reasoning_summaries": false,
"support_verbosity": false,
"default_verbosity": null,
"apply_patch_tool_type": null,
"truncation_policy": {"mode": "bytes", "limit": 10_000},
"supports_parallel_tool_calls": false,
"context_window": null,
"context_window": 272_000,
"experimental_supported_tools": [],
}))
.unwrap(),

View File

@@ -0,0 +1,708 @@
use crate::auth::AuthProvider;
use crate::common::Prompt as ApiPrompt;
use crate::common::ResponseEvent;
use crate::common::ResponseStream;
use crate::endpoint::responses::ResponsesOptions;
use crate::error::ApiError;
use crate::provider::Provider;
use crate::requests::ResponsesRequestBuilder;
use codex_client::TransportError;
use codex_protocol::models::ResponseItem;
use codex_protocol::protocol::TokenUsage;
use futures::SinkExt;
use futures::StreamExt;
use http::HeaderMap;
use http::HeaderValue;
use serde::Deserialize;
use serde_json::Value;
use std::sync::Arc;
use tokio::net::TcpStream;
use tokio::sync::Mutex;
use tokio::sync::OwnedSemaphorePermit;
use tokio::sync::Semaphore;
use tokio::sync::mpsc;
use tokio_tungstenite::MaybeTlsStream;
use tokio_tungstenite::WebSocketStream;
use tokio_tungstenite::connect_async;
use tokio_tungstenite::tungstenite;
use tokio_tungstenite::tungstenite::Message;
use tracing::debug;
use tracing::trace;
use url::Url;
const WS_BUFFER: usize = 1600;
type WsStream = WebSocketStream<MaybeTlsStream<TcpStream>>;
type WsSender = futures::stream::SplitSink<WsStream, Message>;
#[derive(Clone)]
pub struct ResponsesWsSession<A: AuthProvider + Clone> {
inner: Arc<ResponsesWsInner<A>>,
}
struct ResponsesWsInner<A: AuthProvider + Clone> {
provider: Provider,
auth: A,
connection: Mutex<Option<Arc<ResponsesWsConnection>>>,
state: Arc<Mutex<WsSessionState>>,
turn_gate: Arc<Semaphore>,
}
#[derive(Default)]
struct WsSessionState {
last_sent_len: usize,
active: bool,
}
struct ResponsesWsConnection {
sender: Mutex<WsSender>,
receiver: Mutex<mpsc::Receiver<Result<String, ApiError>>>,
}
impl<A: AuthProvider + Clone> ResponsesWsSession<A> {
pub fn new(provider: Provider, auth: A) -> Self {
Self {
inner: Arc::new(ResponsesWsInner {
provider,
auth,
connection: Mutex::new(None),
state: Arc::new(Mutex::new(WsSessionState::default())),
turn_gate: Arc::new(Semaphore::new(1)),
}),
}
}
pub async fn reset(&self) {
{
let mut guard = self.inner.connection.lock().await;
*guard = None;
}
let mut state = self.inner.state.lock().await;
state.last_sent_len = 0;
state.active = false;
}
pub async fn stream_prompt(
&self,
model: &str,
prompt: &ApiPrompt,
options: ResponsesOptions,
) -> Result<ResponseStream, ApiError> {
let ResponsesOptions {
reasoning,
include,
prompt_cache_key,
text,
store_override,
conversation_id,
session_source,
extra_headers,
} = options;
let request = ResponsesRequestBuilder::new(model, &prompt.instructions, &prompt.input)
.tools(&prompt.tools)
.parallel_tool_calls(prompt.parallel_tool_calls)
.reasoning(reasoning)
.include(include)
.prompt_cache_key(prompt_cache_key)
.text(text)
.conversation(conversation_id)
.session_source(session_source)
.store_override(store_override)
.extra_headers(extra_headers)
.build(&self.inner.provider)?;
let input_len = prompt.input.len();
let event = {
let mut state = self.inner.state.lock().await;
let should_reset = !state.active || input_len < state.last_sent_len;
if should_reset {
state.last_sent_len = 0;
}
state.active = true;
if should_reset {
build_create_event(request.body)?
} else {
let delta = prompt
.input
.get(state.last_sent_len..)
.unwrap_or_default()
.to_vec();
build_append_event(delta)
}
};
let permit = self
.inner
.turn_gate
.clone()
.acquire_owned()
.await
.map_err(|_| ApiError::Stream("responses websocket closed".into()))?;
let connection = self.ensure_connection(request.headers).await?;
if let Err(err) = connection.send(&event).await {
self.reset().await;
return Err(err);
}
Ok(spawn_ws_response_stream(
connection,
self.inner.state.clone(),
input_len,
permit,
))
}
async fn ensure_connection(
&self,
extra_headers: HeaderMap,
) -> Result<Arc<ResponsesWsConnection>, ApiError> {
let existing = { self.inner.connection.lock().await.clone() };
if let Some(connection) = existing {
return Ok(connection);
}
let connection =
ResponsesWsConnection::connect(&self.inner.provider, &self.inner.auth, extra_headers)
.await?;
let connection = Arc::new(connection);
let mut guard = self.inner.connection.lock().await;
if guard.is_none() {
*guard = Some(connection.clone());
}
Ok(connection)
}
}
impl ResponsesWsConnection {
async fn connect<A: AuthProvider>(
provider: &Provider,
auth: &A,
extra_headers: HeaderMap,
) -> Result<Self, ApiError> {
let url = ws_url(provider)?;
let headers = build_ws_headers(provider, auth, extra_headers);
let request = build_ws_request(url, headers)?;
let (stream, _response) = connect_async(request).await.map_err(map_ws_error)?;
let (sender, mut receiver) = stream.split();
let (tx, rx) = mpsc::channel(WS_BUFFER);
tokio::spawn(async move {
loop {
let message = receiver.next().await;
let message = match message {
Some(Ok(message)) => message,
Some(Err(err)) => {
let _ = tx
.send(Err(ApiError::Stream(format!("websocket error: {err}"))))
.await;
return;
}
None => {
let _ = tx
.send(Err(ApiError::Stream(
"websocket closed unexpectedly".into(),
)))
.await;
return;
}
};
match message {
Message::Text(text) => {
if tx.send(Ok(text.to_string())).await.is_err() {
return;
}
}
Message::Binary(bytes) => {
if let Ok(text) = String::from_utf8(bytes.to_vec())
&& tx.send(Ok(text)).await.is_err()
{
return;
}
}
Message::Close(_) => {
let _ = tx
.send(Err(ApiError::Stream("websocket closed".into())))
.await;
return;
}
Message::Ping(_) | Message::Pong(_) => {}
_ => {}
}
}
});
Ok(Self {
sender: Mutex::new(sender),
receiver: Mutex::new(rx),
})
}
async fn send(&self, payload: &Value) -> Result<(), ApiError> {
let text = serde_json::to_string(payload)
.map_err(|err| ApiError::Stream(format!("failed to encode ws payload: {err}")))?;
let mut sender = self.sender.lock().await;
sender
.send(Message::Text(text.into()))
.await
.map_err(|err| ApiError::Stream(format!("websocket send failed: {err}")))
}
}
fn build_create_event(body: Value) -> Result<Value, ApiError> {
let Value::Object(mut payload) = body else {
return Err(ApiError::Stream(
"responses create body was not an object".into(),
));
};
payload.remove("stream");
payload.remove("background");
let mut event = serde_json::Map::new();
event.insert(
"type".to_string(),
Value::String("response.create".to_string()),
);
event.extend(payload);
Ok(Value::Object(event))
}
fn build_append_event(input: Vec<ResponseItem>) -> Value {
serde_json::json!({
"type": "response.append",
"input": input,
})
}
fn ws_url(provider: &Provider) -> Result<Url, ApiError> {
let url = provider.url_for_path("responses");
let mut url = Url::parse(&url)
.map_err(|err| ApiError::Stream(format!("invalid websocket url: {err}")))?;
let scheme = match url.scheme() {
"https" => "wss",
"http" => "ws",
"wss" => "wss",
"ws" => "ws",
other => {
return Err(ApiError::Stream(format!(
"unsupported websocket scheme: {other}"
)));
}
};
if url.scheme() != scheme {
url.set_scheme(scheme)
.map_err(|_| ApiError::Stream("failed to set websocket scheme".into()))?;
}
Ok(url)
}
fn build_ws_headers<A: AuthProvider>(
provider: &Provider,
auth: &A,
extra_headers: HeaderMap,
) -> HeaderMap {
let mut headers = provider.headers.clone();
headers.extend(extra_headers);
if let Some(token) = auth.bearer_token()
&& let Ok(header) = format!("Bearer {token}").parse()
{
let _ = headers.insert(http::header::AUTHORIZATION, header);
}
if let Some(account_id) = auth.account_id()
&& let Ok(header) = HeaderValue::from_str(&account_id)
{
let _ = headers.insert("ChatGPT-Account-ID", header);
}
headers
}
fn build_ws_request(url: Url, headers: HeaderMap) -> Result<http::Request<()>, ApiError> {
let mut builder = http::Request::builder()
.method(http::Method::GET)
.uri(url.as_str());
for (name, value) in headers.iter() {
builder = builder.header(name, value);
}
builder
.body(())
.map_err(|err| ApiError::Stream(format!("failed to build websocket request: {err}")))
}
fn map_ws_error(err: tungstenite::Error) -> ApiError {
let transport = match err {
tungstenite::Error::Http(response) => TransportError::Http {
status: response.status(),
headers: Some(response.headers().clone()),
body: None,
},
tungstenite::Error::Url(err) => TransportError::Build(err.to_string()),
tungstenite::Error::Io(err) => TransportError::Network(err.to_string()),
other => TransportError::Network(other.to_string()),
};
ApiError::Transport(transport)
}
fn spawn_ws_response_stream(
connection: Arc<ResponsesWsConnection>,
state: Arc<Mutex<WsSessionState>>,
input_len: usize,
permit: OwnedSemaphorePermit,
) -> ResponseStream {
let (tx_event, rx_event) = mpsc::channel::<Result<ResponseEvent, ApiError>>(WS_BUFFER);
tokio::spawn(async move {
let _permit = permit;
let mut output_count: usize = 0;
let mut draining = false;
let mut can_send = true;
let mut receiver = connection.receiver.lock().await;
loop {
let message = receiver.recv().await;
let message = match message {
Some(message) => message,
None => {
if can_send && !draining {
let _ = tx_event
.send(Err(ApiError::Stream(
"websocket closed while awaiting responses".into(),
)))
.await;
}
let mut state = state.lock().await;
state.active = false;
state.last_sent_len = 0;
return;
}
};
match message {
Ok(text) => {
trace!("WS event: {text}");
let event: WsEvent = match serde_json::from_str(&text) {
Ok(event) => event,
Err(err) => {
debug!("Failed to parse WS event: {err}");
continue;
}
};
match event.kind.as_str() {
"response.output_item.done" => {
let Some(item_val) = event.item else {
continue;
};
let Ok(item) = serde_json::from_value::<ResponseItem>(item_val) else {
debug!("failed to parse ResponseItem from output_item.done");
continue;
};
output_count = output_count.saturating_add(1);
if can_send
&& tx_event
.send(Ok(ResponseEvent::OutputItemDone(item)))
.await
.is_err()
{
can_send = false;
}
}
"response.output_item.added" => {
let Some(item_val) = event.item else {
continue;
};
let Ok(item) = serde_json::from_value::<ResponseItem>(item_val) else {
debug!("failed to parse ResponseItem from output_item.added");
continue;
};
if can_send
&& tx_event
.send(Ok(ResponseEvent::OutputItemAdded(item)))
.await
.is_err()
{
can_send = false;
}
}
"response.output_text.delta" => {
if let Some(delta) = event.delta
&& can_send
&& tx_event
.send(Ok(ResponseEvent::OutputTextDelta(delta)))
.await
.is_err()
{
can_send = false;
}
}
"response.reasoning_summary_text.delta" => {
if let (Some(delta), Some(summary_index)) =
(event.delta, event.summary_index)
&& can_send
&& tx_event
.send(Ok(ResponseEvent::ReasoningSummaryDelta {
delta,
summary_index,
}))
.await
.is_err()
{
can_send = false;
}
}
"response.reasoning_text.delta" => {
if let (Some(delta), Some(content_index)) =
(event.delta, event.content_index)
&& can_send
&& tx_event
.send(Ok(ResponseEvent::ReasoningContentDelta {
delta,
content_index,
}))
.await
.is_err()
{
can_send = false;
}
}
"response.reasoning_summary_part.added" => {
if let Some(summary_index) = event.summary_index
&& can_send
&& tx_event
.send(Ok(ResponseEvent::ReasoningSummaryPartAdded {
summary_index,
}))
.await
.is_err()
{
can_send = false;
}
}
"response.created" => {
if can_send
&& tx_event.send(Ok(ResponseEvent::Created {})).await.is_err()
{
can_send = false;
}
}
"response.failed" => {
let error = map_failed_response(&event);
if can_send && tx_event.send(Err(error)).await.is_err() {
can_send = false;
}
let mut state = state.lock().await;
state.active = false;
state.last_sent_len = 0;
draining = true;
}
"response.done" | "response.completed" => {
let completed = match completed_event(&event) {
Ok(event) => event,
Err(err) => {
if can_send {
let _ = tx_event.send(Err(err)).await;
}
let mut state = state.lock().await;
state.active = false;
state.last_sent_len = 0;
return;
}
};
if !draining {
if can_send {
let _ = tx_event.send(Ok(completed)).await;
}
let mut state = state.lock().await;
state.last_sent_len = input_len.saturating_add(output_count);
state.active = true;
}
return;
}
_ => {}
}
}
Err(err) => {
if can_send && !draining {
let _ = tx_event.send(Err(err)).await;
}
let mut state = state.lock().await;
state.active = false;
state.last_sent_len = 0;
return;
}
}
}
});
ResponseStream { rx_event }
}
#[derive(Debug, Deserialize)]
#[allow(dead_code)]
struct Error {
r#type: Option<String>,
code: Option<String>,
message: Option<String>,
plan_type: Option<String>,
resets_at: Option<i64>,
}
#[derive(Debug, Deserialize)]
struct ResponseCompleted {
id: String,
#[serde(default)]
usage: Option<ResponseUsage>,
}
#[derive(Debug, Deserialize, Clone)]
struct ResponseUsage {
#[serde(default)]
input_tokens: i64,
#[serde(default)]
input_tokens_details: Option<ResponseInputTokensDetails>,
#[serde(default)]
output_tokens: i64,
#[serde(default)]
output_tokens_details: Option<ResponseOutputTokensDetails>,
#[serde(default)]
total_tokens: i64,
}
impl From<ResponseUsage> for TokenUsage {
fn from(value: ResponseUsage) -> Self {
TokenUsage {
input_tokens: value.input_tokens,
cached_input_tokens: value
.input_tokens_details
.map(|d| d.cached_tokens)
.unwrap_or(0),
output_tokens: value.output_tokens,
reasoning_output_tokens: value
.output_tokens_details
.map(|d| d.reasoning_tokens)
.unwrap_or(0),
total_tokens: value.total_tokens,
}
}
}
#[derive(Debug, Deserialize, Clone)]
struct ResponseInputTokensDetails {
cached_tokens: i64,
}
#[derive(Debug, Deserialize, Clone)]
struct ResponseOutputTokensDetails {
reasoning_tokens: i64,
}
#[derive(Deserialize, Debug)]
struct WsEvent {
#[serde(rename = "type")]
kind: String,
response: Option<Value>,
item: Option<Value>,
delta: Option<String>,
summary_index: Option<i64>,
content_index: Option<i64>,
#[serde(default)]
usage: Option<ResponseUsage>,
}
fn completed_event(event: &WsEvent) -> Result<ResponseEvent, ApiError> {
if let Some(response) = &event.response {
let completed =
serde_json::from_value::<ResponseCompleted>(response.clone()).map_err(|err| {
ApiError::Stream(format!("failed to parse response.completed: {err}"))
})?;
return Ok(ResponseEvent::Completed {
response_id: completed.id,
token_usage: completed.usage.map(Into::into),
});
}
if let Some(usage) = event.usage.clone() {
return Ok(ResponseEvent::Completed {
response_id: String::new(),
token_usage: Some(usage.into()),
});
}
Ok(ResponseEvent::Completed {
response_id: String::new(),
token_usage: None,
})
}
fn map_failed_response(event: &WsEvent) -> ApiError {
let Some(resp_val) = event.response.clone() else {
return ApiError::Stream("response.failed event received".into());
};
let Some(error) = resp_val.get("error") else {
return ApiError::Stream("response.failed event received".into());
};
let Ok(error) = serde_json::from_value::<Error>(error.clone()) else {
return ApiError::Stream("response.failed event received".into());
};
if is_context_window_error(&error) {
ApiError::ContextWindowExceeded
} else if is_quota_exceeded_error(&error) {
ApiError::QuotaExceeded
} else if is_usage_not_included(&error) {
ApiError::UsageNotIncluded
} else {
let delay = try_parse_retry_after(&error);
let message = error.message.unwrap_or_default();
ApiError::Retryable { message, delay }
}
}
fn try_parse_retry_after(err: &Error) -> Option<std::time::Duration> {
if err.code.as_deref() != Some("rate_limit_exceeded") {
return None;
}
let re = rate_limit_regex();
if let Some(message) = &err.message
&& let Some(captures) = re.captures(message)
{
let seconds = captures.get(1);
let unit = captures.get(2);
if let (Some(value), Some(unit)) = (seconds, unit) {
let value = value.as_str().parse::<f64>().ok()?;
let unit = unit.as_str().to_ascii_lowercase();
if unit == "s" || unit.starts_with("second") {
return Some(std::time::Duration::from_secs_f64(value));
} else if unit == "ms" {
return Some(std::time::Duration::from_millis(value as u64));
}
}
}
None
}
fn is_context_window_error(error: &Error) -> bool {
error.code.as_deref() == Some("context_length_exceeded")
}
fn is_quota_exceeded_error(error: &Error) -> bool {
error.code.as_deref() == Some("insufficient_quota")
}
fn is_usage_not_included(error: &Error) -> bool {
error.code.as_deref() == Some("usage_not_included")
}
fn rate_limit_regex() -> &'static regex_lite::Regex {
static RE: std::sync::OnceLock<regex_lite::Regex> = std::sync::OnceLock::new();
#[expect(clippy::unwrap_used)]
RE.get_or_init(|| {
regex_lite::Regex::new(r"(?i)try again in\\s*(\\d+(?:\\.\\d+)?)\\s*(s|ms|seconds?)")
.unwrap()
})
}

View File

@@ -25,6 +25,7 @@ pub use crate::endpoint::compact::CompactClient;
pub use crate::endpoint::models::ModelsClient;
pub use crate::endpoint::responses::ResponsesClient;
pub use crate::endpoint::responses::ResponsesOptions;
pub use crate::endpoint::responses_ws::ResponsesWsSession;
pub use crate::error::ApiError;
pub use crate::provider::Provider;
pub use crate::provider::WireApi;

View File

@@ -56,7 +56,7 @@ async fn models_client_hits_models_endpoint() {
slug: "gpt-test".to_string(),
display_name: "gpt-test".to_string(),
description: Some("desc".to_string()),
default_reasoning_level: ReasoningEffort::Medium,
default_reasoning_level: Some(ReasoningEffort::Medium),
supported_reasoning_levels: vec![
ReasoningEffortPreset {
effort: ReasoningEffort::Low,
@@ -76,14 +76,16 @@ async fn models_client_hits_models_endpoint() {
supported_in_api: true,
priority: 1,
upgrade: None,
base_instructions: None,
base_instructions: "base instructions".to_string(),
supports_reasoning_summaries: false,
support_verbosity: false,
default_verbosity: None,
apply_patch_tool_type: None,
truncation_policy: TruncationPolicyConfig::bytes(10_000),
supports_parallel_tool_calls: false,
context_window: None,
context_window: Some(272_000),
auto_compact_token_limit: None,
effective_context_window_percent: 95,
experimental_supported_tools: Vec::new(),
}],
};

View File

@@ -0,0 +1,386 @@
You are a coding agent running in the Codex CLI, a terminal-based coding assistant. Codex CLI is an open source project led by OpenAI. You are expected to be precise, safe, and helpful.
Your capabilities:
- Receive user prompts and other context provided by the harness, such as files in the workspace.
- Communicate with the user by streaming thinking & responses, and by making & updating plans.
- Emit function calls to run terminal commands and apply patches. Depending on how this specific run is configured, you can request that these function calls be escalated to the user for approval before running. More on this in the "Sandbox and approvals" section.
Within this context, Codex refers to the open-source agentic coding interface (not the old Codex language model built by OpenAI).
# How you work
## Personality
Your default personality and tone is concise, direct, and friendly. You communicate efficiently, always keeping the user clearly informed about ongoing actions without unnecessary detail. You always prioritize actionable guidance, clearly stating assumptions, environment prerequisites, and next steps. Unless explicitly asked, you avoid excessively verbose explanations about your work.
# AGENTS.md spec
- Repos often contain AGENTS.md files. These files can appear anywhere within the repository.
- These files are a way for humans to give you (the agent) instructions or tips for working within the container.
- Some examples might be: coding conventions, info about how code is organized, or instructions for how to run or test code.
- Instructions in AGENTS.md files:
- The scope of an AGENTS.md file is the entire directory tree rooted at the folder that contains it.
- For every file you touch in the final patch, you must obey instructions in any AGENTS.md file whose scope includes that file.
- Instructions about code style, structure, naming, etc. apply only to code within the AGENTS.md file's scope, unless the file states otherwise.
- More-deeply-nested AGENTS.md files take precedence in the case of conflicting instructions.
- Direct system/developer/user instructions (as part of a prompt) take precedence over AGENTS.md instructions.
- The contents of the AGENTS.md file at the root of the repo and any directories from the CWD up to the root are included with the developer message and don't need to be re-read. When working in a subdirectory of CWD, or a directory outside the CWD, check for any AGENTS.md files that may be applicable.
## Responsiveness
### Preamble messages
Before making tool calls, send a brief preamble to the user explaining what youre about to do. When sending preamble messages, follow these principles and examples:
- **Logically group related actions**: if youre about to run several related commands, describe them together in one preamble rather than sending a separate note for each.
- **Keep it concise**: be no more than 1-2 sentences, focused on immediate, tangible next steps. (812 words for quick updates).
- **Build on prior context**: if this is not your first tool call, use the preamble message to connect the dots with whats been done so far and create a sense of momentum and clarity for the user to understand your next actions.
- **Keep your tone light, friendly and curious**: add small touches of personality in preambles feel collaborative and engaging.
- **Exception**: Avoid adding a preamble for every trivial read (e.g., `cat` a single file) unless its part of a larger grouped action.
**Examples:**
- “Ive explored the repo; now checking the API route definitions.”
- “Next, Ill patch the config and update the related tests.”
- “Im about to scaffold the CLI commands and helper functions.”
- “Ok cool, so Ive wrapped my head around the repo. Now digging into the API routes.”
- “Configs looking tidy. Next up is patching helpers to keep things in sync.”
- “Finished poking at the DB gateway. I will now chase down error handling.”
- “Alright, build pipeline order is interesting. Checking how it reports failures.”
- “Spotted a clever caching util; now hunting where it gets used.”
## Planning
You have access to an `update_plan` tool which tracks steps and progress and renders them to the user. Using the tool helps demonstrate that you've understood the task and convey how you're approaching it. Plans can help to make complex, ambiguous, or multi-phase work clearer and more collaborative for the user. A good plan should break the task into meaningful, logically ordered steps that are easy to verify as you go.
Note that plans are not for padding out simple work with filler steps or stating the obvious. The content of your plan should not involve doing anything that you aren't capable of doing (i.e. don't try to test things that you can't test). Do not use plans for simple or single-step queries that you can just do or answer immediately.
Do not repeat the full contents of the plan after an `update_plan` call — the harness already displays it. Instead, summarize the change made and highlight any important context or next step.
Before running a command, consider whether or not you have completed the previous step, and make sure to mark it as completed before moving on to the next step. It may be the case that you complete all steps in your plan after a single pass of implementation. If this is the case, you can simply mark all the planned steps as completed. Sometimes, you may need to change plans in the middle of a task: call `update_plan` with the updated plan and make sure to provide an `explanation` of the rationale when doing so.
Use a plan when:
- The task is non-trivial and will require multiple actions over a long time horizon.
- There are logical phases or dependencies where sequencing matters.
- The work has ambiguity that benefits from outlining high-level goals.
- You want intermediate checkpoints for feedback and validation.
- When the user asked you to do more than one thing in a single prompt
- The user has asked you to use the plan tool (aka "TODOs")
- You generate additional steps while working, and plan to do them before yielding to the user
### Examples
**High-quality plans**
Example 1:
1. Add CLI entry with file args
2. Parse Markdown via CommonMark library
3. Apply semantic HTML template
4. Handle code blocks, images, links
5. Add error handling for invalid files
Example 2:
1. Define CSS variables for colors
2. Add toggle with localStorage state
3. Refactor components to use variables
4. Verify all views for readability
5. Add smooth theme-change transition
Example 3:
1. Set up Node.js + WebSocket server
2. Add join/leave broadcast events
3. Implement messaging with timestamps
4. Add usernames + mention highlighting
5. Persist messages in lightweight DB
6. Add typing indicators + unread count
**Low-quality plans**
Example 1:
1. Create CLI tool
2. Add Markdown parser
3. Convert to HTML
Example 2:
1. Add dark mode toggle
2. Save preference
3. Make styles look good
Example 3:
1. Create single-file HTML game
2. Run quick sanity check
3. Summarize usage instructions
If you need to write a plan, only write high quality plans, not low quality ones.
## Task execution
You are a coding agent. Please keep going until the query is completely resolved, before ending your turn and yielding back to the user. Only terminate your turn when you are sure that the problem is solved. Autonomously resolve the query to the best of your ability, using the tools available to you, before coming back to the user. Do NOT guess or make up an answer.
You MUST adhere to the following criteria when solving queries:
- Working on the repo(s) in the current environment is allowed, even if they are proprietary.
- Analyzing code for vulnerabilities is allowed.
- Showing user code and tool call details is allowed.
- Use the `apply_patch` tool to edit files (NEVER try `applypatch` or `apply-patch`, only `apply_patch`): {"command":["apply_patch","*** Begin Patch\\n*** Update File: path/to/file.py\\n@@ def example():\\n- pass\\n+ return 123\\n*** End Patch"]}
If completing the user's task requires writing or modifying files, your code and final answer should follow these coding guidelines, though user instructions (i.e. AGENTS.md) may override these guidelines:
- Fix the problem at the root cause rather than applying surface-level patches, when possible.
- Avoid unneeded complexity in your solution.
- Do not attempt to fix unrelated bugs or broken tests. It is not your responsibility to fix them. (You may mention them to the user in your final message though.)
- Update documentation as necessary.
- Keep changes consistent with the style of the existing codebase. Changes should be minimal and focused on the task.
- Use `git log` and `git blame` to search the history of the codebase if additional context is required.
- NEVER add copyright or license headers unless specifically requested.
- Do not waste tokens by re-reading files after calling `apply_patch` on them. The tool call will fail if it didn't work. The same goes for making folders, deleting folders, etc.
- Do not `git commit` your changes or create new git branches unless explicitly requested.
- Do not add inline comments within code unless explicitly requested.
- Do not use one-letter variable names unless explicitly requested.
- NEVER output inline citations like "【F:README.md†L5-L14】" in your outputs. The CLI is not able to render these so they will just be broken in the UI. Instead, if you output valid filepaths, users will be able to click on them to open the files in their editor.
## Sandbox and approvals
The Codex CLI harness supports several different sandboxing, and approval configurations that the user can choose from.
Filesystem sandboxing prevents you from editing files without user approval. The options are:
- **read-only**: You can only read files.
- **workspace-write**: You can read files. You can write to files in your workspace folder, but not outside it.
- **danger-full-access**: No filesystem sandboxing.
Network sandboxing prevents you from accessing network without approval. Options are
- **restricted**
- **enabled**
Approvals are your mechanism to get user consent to perform more privileged actions. Although they introduce friction to the user because your work is paused until the user responds, you should leverage them to accomplish your important work. Do not let these settings or the sandbox deter you from attempting to accomplish the user's task. Approval options are
- **untrusted**: The harness will escalate most commands for user approval, apart from a limited allowlist of safe "read" commands.
- **on-failure**: The harness will allow all commands to run in the sandbox (if enabled), and failures will be escalated to the user for approval to run again without the sandbox.
- **on-request**: Commands will be run in the sandbox by default, and you can specify in your tool call if you want to escalate a command to run without sandboxing. (Note that this mode is not always available. If it is, you'll see parameters for it in the `shell` command description.)
- **never**: This is a non-interactive mode where you may NEVER ask the user for approval to run commands. Instead, you must always persist and work around constraints to solve the task for the user. You MUST do your utmost best to finish the task and validate your work before yielding. If this mode is pared with `danger-full-access`, take advantage of it to deliver the best outcome for the user. Further, in this mode, your default testing philosophy is overridden: Even if you don't see local patterns for testing, you may add tests and scripts to validate your work. Just remove them before yielding.
When you are running with approvals `on-request`, and sandboxing enabled, here are scenarios where you'll need to request approval:
- You need to run a command that writes to a directory that requires it (e.g. running tests that write to /tmp)
- You need to run a GUI app (e.g., open/xdg-open/osascript) to open browsers or files.
- You are running sandboxed and need to run a command that requires network access (e.g. installing packages)
- If you run a command that is important to solving the user's query, but it fails because of sandboxing, rerun the command with approval.
- You are about to take a potentially destructive action such as an `rm` or `git reset` that the user did not explicitly ask for
- (For all of these, you should weigh alternative paths that do not require approval.)
Note that when sandboxing is set to read-only, you'll need to request approval for any command that isn't a read.
You will be told what filesystem sandboxing, network sandboxing, and approval mode are active in a developer or user message. If you are not told about this, assume that you are running with workspace-write, network sandboxing ON, and approval on-failure.
## Validating your work
If the codebase has tests or the ability to build or run, consider using them to verify that your work is complete.
When testing, your philosophy should be to start as specific as possible to the code you changed so that you can catch issues efficiently, then make your way to broader tests as you build confidence. If there's no test for the code you changed, and if the adjacent patterns in the codebases show that there's a logical place for you to add a test, you may do so. However, do not add tests to codebases with no tests.
Similarly, once you're confident in correctness, you can suggest or use formatting commands to ensure that your code is well formatted. If there are issues you can iterate up to 3 times to get formatting right, but if you still can't manage it's better to save the user time and present them a correct solution where you call out the formatting in your final message. If the codebase does not have a formatter configured, do not add one.
For all of testing, running, building, and formatting, do not attempt to fix unrelated bugs. It is not your responsibility to fix them. (You may mention them to the user in your final message though.)
Be mindful of whether to run validation commands proactively. In the absence of behavioral guidance:
- When running in non-interactive approval modes like **never** or **on-failure**, proactively run tests, lint and do whatever you need to ensure you've completed the task.
- When working in interactive approval modes like **untrusted**, or **on-request**, hold off on running tests or lint commands until the user is ready for you to finalize your output, because these commands take time to run and slow down iteration. Instead suggest what you want to do next, and let the user confirm first.
- When working on test-related tasks, such as adding tests, fixing tests, or reproducing a bug to verify behavior, you may proactively run tests regardless of approval mode. Use your judgement to decide whether this is a test-related task.
## Ambition vs. precision
For tasks that have no prior context (i.e. the user is starting something brand new), you should feel free to be ambitious and demonstrate creativity with your implementation.
If you're operating in an existing codebase, you should make sure you do exactly what the user asks with surgical precision. Treat the surrounding codebase with respect, and don't overstep (i.e. changing filenames or variables unnecessarily). You should balance being sufficiently ambitious and proactive when completing tasks of this nature.
You should use judicious initiative to decide on the right level of detail and complexity to deliver based on the user's needs. This means showing good judgment that you're capable of doing the right extras without gold-plating. This might be demonstrated by high-value, creative touches when scope of the task is vague; while being surgical and targeted when scope is tightly specified.
## Sharing progress updates
For especially longer tasks that you work on (i.e. requiring many tool calls, or a plan with multiple steps), you should provide progress updates back to the user at reasonable intervals. These updates should be structured as a concise sentence or two (no more than 8-10 words long) recapping progress so far in plain language: this update demonstrates your understanding of what needs to be done, progress so far (i.e. files explores, subtasks complete), and where you're going next.
Before doing large chunks of work that may incur latency as experienced by the user (i.e. writing a new file), you should send a concise message to the user with an update indicating what you're about to do to ensure they know what you're spending time on. Don't start editing or writing large files before informing the user what you are doing and why.
The messages you send before tool calls should describe what is immediately about to be done next in very concise language. If there was previous work done, this preamble message should also include a note about the work done so far to bring the user along.
## Presenting your work and final message
Your final message should read naturally, like an update from a concise teammate. For casual conversation, brainstorming tasks, or quick questions from the user, respond in a friendly, conversational tone. You should ask questions, suggest ideas, and adapt to the users style. If you've finished a large amount of work, when describing what you've done to the user, you should follow the final answer formatting guidelines to communicate substantive changes. You don't need to add structured formatting for one-word answers, greetings, or purely conversational exchanges.
You can skip heavy formatting for single, simple actions or confirmations. In these cases, respond in plain sentences with any relevant next step or quick option. Reserve multi-section structured responses for results that need grouping or explanation.
The user is working on the same computer as you, and has access to your work. As such there's no need to show the full contents of large files you have already written unless the user explicitly asks for them. Similarly, if you've created or modified files using `apply_patch`, there's no need to tell users to "save the file" or "copy the code into a file"—just reference the file path.
If there's something that you think you could help with as a logical next step, concisely ask the user if they want you to do so. Good examples of this are running tests, committing changes, or building out the next logical component. If theres something that you couldn't do (even with approval) but that the user might want to do (such as verifying changes by running the app), include those instructions succinctly.
Brevity is very important as a default. You should be very concise (i.e. no more than 10 lines), but can relax this requirement for tasks where additional detail and comprehensiveness is important for the user's understanding.
### Final answer structure and style guidelines
You are producing plain text that will later be styled by the CLI. Follow these rules exactly. Formatting should make results easy to scan, but not feel mechanical. Use judgment to decide how much structure adds value.
**Section Headers**
- Use only when they improve clarity — they are not mandatory for every answer.
- Choose descriptive names that fit the content
- Keep headers short (13 words) and in `**Title Case**`. Always start headers with `**` and end with `**`
- Leave no blank line before the first bullet under a header.
- Section headers should only be used where they genuinely improve scanability; avoid fragmenting the answer.
**Bullets**
- Use `-` followed by a space for every bullet.
- Merge related points when possible; avoid a bullet for every trivial detail.
- Keep bullets to one line unless breaking for clarity is unavoidable.
- Group into short lists (46 bullets) ordered by importance.
- Use consistent keyword phrasing and formatting across sections.
**Monospace**
- Wrap all commands, file paths, env vars, and code identifiers in backticks (`` `...` ``).
- Apply to inline examples and to bullet keywords if the keyword itself is a literal file/command.
- Never mix monospace and bold markers; choose one based on whether its a keyword (`**`) or inline code/path (`` ` ``).
**File References**
When referencing files in your response, make sure to include the relevant start line and always follow the below rules:
* Use inline code to make file paths clickable.
* Each reference should have a stand alone path. Even if it's the same file.
* Accepted: absolute, workspacerelative, a/ or b/ diff prefixes, or bare filename/suffix.
* Line/column (1based, optional): :line[:column] or #Lline[Ccolumn] (column defaults to 1).
* Do not use URIs like file://, vscode://, or https://.
* Do not provide range of lines
* Examples: src/app.ts, src/app.ts:42, b/server/index.js#L10, C:\repo\project\main.rs:12:5
**Structure**
- Place related bullets together; dont mix unrelated concepts in the same section.
- Order sections from general → specific → supporting info.
- For subsections (e.g., “Binaries” under “Rust Workspace”), introduce with a bolded keyword bullet, then list items under it.
- Match structure to complexity:
- Multi-part or detailed results → use clear headers and grouped bullets.
- Simple results → minimal headers, possibly just a short list or paragraph.
**Tone**
- Keep the voice collaborative and natural, like a coding partner handing off work.
- Be concise and factual — no filler or conversational commentary and avoid unnecessary repetition
- Use present tense and active voice (e.g., “Runs tests” not “This will run tests”).
- Keep descriptions self-contained; dont refer to “above” or “below”.
- Use parallel structure in lists for consistency.
**Dont**
- Dont use literal words “bold” or “monospace” in the content.
- Dont nest bullets or create deep hierarchies.
- Dont output ANSI escape codes directly — the CLI renderer applies them.
- Dont cram unrelated keywords into a single bullet; split for clarity.
- Dont let keyword lists run long — wrap or reformat for scanability.
Generally, ensure your final answers adapt their shape and depth to the request. For example, answers to code explanations should have a precise, structured explanation with code references that answer the question directly. For tasks with a simple implementation, lead with the outcome and supplement only with whats needed for clarity. Larger changes can be presented as a logical walkthrough of your approach, grouping related steps, explaining rationale where it adds value, and highlighting next actions to accelerate the user. Your answers should provide the right level of detail while being easily scannable.
For casual greetings, acknowledgements, or other one-off conversational messages that are not delivering substantive information or structured results, respond naturally without section headers or bullet formatting.
# Tool Guidelines
## Shell commands
When using the shell, you must adhere to the following guidelines:
- When searching for text or files, prefer using `rg` or `rg --files` respectively because `rg` is much faster than alternatives like `grep`. (If the `rg` command is not found, then use alternatives.)
- Do not use python scripts to attempt to output larger chunks of a file.
## `update_plan`
A tool named `update_plan` is available to you. You can use it to keep an uptodate, stepbystep plan for the task.
To create a new plan, call `update_plan` with a short list of 1sentence steps (no more than 5-7 words each) with a `status` for each step (`pending`, `in_progress`, or `completed`).
When steps have been completed, use `update_plan` to mark each finished step as `completed` and the next step you are working on as `in_progress`. There should always be exactly one `in_progress` step until everything is done. You can mark multiple items as complete in a single `update_plan` call.
If all steps are complete, ensure you call `update_plan` to mark all steps as `completed`.
## `apply_patch`
Use the `apply_patch` shell command to edit files.
Your patch language is a strippeddown, fileoriented diff format designed to be easy to parse and safe to apply. You can think of it as a highlevel envelope:
*** Begin Patch
[ one or more file sections ]
*** End Patch
Within that envelope, you get a sequence of file operations.
You MUST include a header to specify the action you are taking.
Each operation starts with one of three headers:
*** Add File: <path> - create a new file. Every following line is a + line (the initial contents).
*** Delete File: <path> - remove an existing file. Nothing follows.
*** Update File: <path> - patch an existing file in place (optionally with a rename).
May be immediately followed by *** Move to: <new path> if you want to rename the file.
Then one or more “hunks”, each introduced by @@ (optionally followed by a hunk header).
Within a hunk each line starts with:
For instructions on [context_before] and [context_after]:
- By default, show 3 lines of code immediately above and 3 lines immediately below each change. If a change is within 3 lines of a previous change, do NOT duplicate the first changes [context_after] lines in the second changes [context_before] lines.
- If 3 lines of context is insufficient to uniquely identify the snippet of code within the file, use the @@ operator to indicate the class or function to which the snippet belongs. For instance, we might have:
@@ class BaseClass
[3 lines of pre-context]
- [old_code]
+ [new_code]
[3 lines of post-context]
- If a code block is repeated so many times in a class or function such that even a single `@@` statement and 3 lines of context cannot uniquely identify the snippet of code, you can use multiple `@@` statements to jump to the right context. For instance:
@@ class BaseClass
@@ def method():
[3 lines of pre-context]
- [old_code]
+ [new_code]
[3 lines of post-context]
The full grammar definition is below:
Patch := Begin { FileOp } End
Begin := "*** Begin Patch" NEWLINE
End := "*** End Patch" NEWLINE
FileOp := AddFile | DeleteFile | UpdateFile
AddFile := "*** Add File: " path NEWLINE { "+" line NEWLINE }
DeleteFile := "*** Delete File: " path NEWLINE
UpdateFile := "*** Update File: " path NEWLINE [ MoveTo ] { Hunk }
MoveTo := "*** Move to: " newPath NEWLINE
Hunk := "@@" [ header ] NEWLINE { HunkLine } [ "*** End of File" NEWLINE ]
HunkLine := (" " | "-" | "+") text NEWLINE
A full patch can combine several operations:
*** Begin Patch
*** Add File: hello.txt
+Hello world
*** Update File: src/app.py
*** Move to: src/main.py
@@ def greet():
-print("Hi")
+print("Hello, world!")
*** Delete File: obsolete.txt
*** End Patch
It is important to remember:
- You must include a header with your intended action (Add/Delete/Update)
- You must prefix new lines with `+` even when creating a new file
- File references can only be relative, NEVER ABSOLUTE.
You can invoke apply_patch like:
```
shell {"command":["apply_patch","*** Begin Patch\n*** Add File: hello.txt\n+Hello, world!\n*** End Patch\n"]}
```

View File

@@ -0,0 +1,188 @@
use crate::CodexThread;
use crate::agent::AgentStatus;
use crate::error::CodexErr;
use crate::error::Result as CodexResult;
use crate::thread_manager::ThreadManagerState;
use codex_protocol::ThreadId;
use codex_protocol::protocol::EventMsg;
use codex_protocol::protocol::Op;
use codex_protocol::user_input::UserInput;
use std::sync::Arc;
use std::sync::Weak;
/// Control-plane handle for multi-agent operations.
/// `AgentControl` is held by each session (via `SessionServices`). It provides capability to
/// spawn new agents and the inter-agent communication layer.
#[derive(Clone, Default)]
pub(crate) struct AgentControl {
/// Weak handle back to the global thread registry/state.
/// This is `Weak` to avoid reference cycles and shadow persistence of the form
/// `ThreadManagerState -> CodexThread -> Session -> SessionServices -> ThreadManagerState`.
manager: Weak<ThreadManagerState>,
}
impl AgentControl {
/// Construct a new `AgentControl` that can spawn/message agents via the given manager state.
pub(crate) fn new(manager: Weak<ThreadManagerState>) -> Self {
Self { manager }
}
#[allow(dead_code)] // Used by upcoming multi-agent tooling.
/// Spawn a new agent thread and submit the initial prompt.
///
/// If `headless` is true, a background drain task is spawned to prevent unbounded event growth
/// of the channel queue when there is no client actively reading the thread events.
pub(crate) async fn spawn_agent(
&self,
config: crate::config::Config,
prompt: String,
headless: bool,
) -> CodexResult<ThreadId> {
let state = self.upgrade()?;
let new_thread = state.spawn_new_thread(config, self.clone()).await?;
if headless {
spawn_headless_drain(Arc::clone(&new_thread.thread));
}
self.send_prompt(new_thread.thread_id, prompt).await?;
Ok(new_thread.thread_id)
}
#[allow(dead_code)] // Used by upcoming multi-agent tooling.
/// Send a `user` prompt to an existing agent thread.
pub(crate) async fn send_prompt(
&self,
agent_id: ThreadId,
prompt: String,
) -> CodexResult<String> {
let state = self.upgrade()?;
state
.send_op(
agent_id,
Op::UserInput {
items: vec![UserInput::Text { text: prompt }],
final_output_json_schema: None,
},
)
.await
}
#[allow(dead_code)] // Used by upcoming multi-agent tooling.
/// Fetch the last known status for `agent_id`, returning `NotFound` when unavailable.
pub(crate) async fn get_status(&self, agent_id: ThreadId) -> AgentStatus {
let Ok(state) = self.upgrade() else {
// No agent available if upgrade fails.
return AgentStatus::NotFound;
};
let Ok(thread) = state.get_thread(agent_id).await else {
return AgentStatus::NotFound;
};
thread.agent_status().await
}
fn upgrade(&self) -> CodexResult<Arc<ThreadManagerState>> {
self.manager
.upgrade()
.ok_or_else(|| CodexErr::UnsupportedOperation("thread manager dropped".to_string()))
}
}
/// When an agent is spawned "headless" (no UI/view attached), there may be no consumer polling
/// `CodexThread::next_event()`. The underlying event channel is unbounded, so the producer can
/// accumulate events indefinitely. This drain task prevents that memory growth by polling and
/// discarding events until shutdown.
fn spawn_headless_drain(thread: Arc<CodexThread>) {
tokio::spawn(async move {
loop {
match thread.next_event().await {
Ok(event) => {
if matches!(event.msg, EventMsg::ShutdownComplete) {
break;
}
}
Err(err) => {
tracing::warn!("failed to receive event from agent: {err:?}");
break;
}
}
}
});
}
#[cfg(test)]
mod tests {
use super::*;
use crate::agent::agent_status_from_event;
use codex_protocol::protocol::ErrorEvent;
use codex_protocol::protocol::TaskCompleteEvent;
use codex_protocol::protocol::TaskStartedEvent;
use codex_protocol::protocol::TurnAbortReason;
use codex_protocol::protocol::TurnAbortedEvent;
use pretty_assertions::assert_eq;
#[tokio::test]
async fn send_prompt_errors_when_manager_dropped() {
let control = AgentControl::default();
let err = control
.send_prompt(ThreadId::new(), "hello".to_string())
.await
.expect_err("send_prompt should fail without a manager");
assert_eq!(
err.to_string(),
"unsupported operation: thread manager dropped"
);
}
#[tokio::test]
async fn get_status_returns_not_found_without_manager() {
let control = AgentControl::default();
let got = control.get_status(ThreadId::new()).await;
assert_eq!(got, AgentStatus::NotFound);
}
#[tokio::test]
async fn on_event_updates_status_from_task_started() {
let status = agent_status_from_event(&EventMsg::TaskStarted(TaskStartedEvent {
model_context_window: None,
}));
assert_eq!(status, Some(AgentStatus::Running));
}
#[tokio::test]
async fn on_event_updates_status_from_task_complete() {
let status = agent_status_from_event(&EventMsg::TaskComplete(TaskCompleteEvent {
last_agent_message: Some("done".to_string()),
}));
let expected = AgentStatus::Completed(Some("done".to_string()));
assert_eq!(status, Some(expected));
}
#[tokio::test]
async fn on_event_updates_status_from_error() {
let status = agent_status_from_event(&EventMsg::Error(ErrorEvent {
message: "boom".to_string(),
codex_error_info: None,
}));
let expected = AgentStatus::Errored("boom".to_string());
assert_eq!(status, Some(expected));
}
#[tokio::test]
async fn on_event_updates_status_from_turn_aborted() {
let status = agent_status_from_event(&EventMsg::TurnAborted(TurnAbortedEvent {
reason: TurnAbortReason::Interrupted,
}));
let expected = AgentStatus::Errored("Interrupted".to_string());
assert_eq!(status, Some(expected));
}
#[tokio::test]
async fn on_event_updates_status_from_shutdown_complete() {
let status = agent_status_from_event(&EventMsg::ShutdownComplete);
assert_eq!(status, Some(AgentStatus::Shutdown));
}
}

View File

@@ -0,0 +1,6 @@
pub(crate) mod control;
pub(crate) mod status;
pub(crate) use codex_protocol::protocol::AgentStatus;
pub(crate) use control::AgentControl;
pub(crate) use status::agent_status_from_event;

View File

@@ -0,0 +1,15 @@
use codex_protocol::protocol::AgentStatus;
use codex_protocol::protocol::EventMsg;
/// Derive the next agent status from a single emitted event.
/// Returns `None` when the event does not affect status tracking.
pub(crate) fn agent_status_from_event(msg: &EventMsg) -> Option<AgentStatus> {
match msg {
EventMsg::TaskStarted(_) => Some(AgentStatus::Running),
EventMsg::TaskComplete(ev) => Some(AgentStatus::Completed(ev.last_agent_message.clone())),
EventMsg::TurnAborted(ev) => Some(AgentStatus::Errored(format!("{:?}", ev.reason))),
EventMsg::Error(ev) => Some(AgentStatus::Errored(ev.message.clone())),
EventMsg::ShutdownComplete => Some(AgentStatus::Shutdown),
_ => None,
}
}

View File

@@ -32,11 +32,7 @@ use crate::token_data::parse_id_token;
use crate::util::try_parse_error_message;
use codex_client::CodexHttpClient;
use codex_protocol::account::PlanType as AccountPlanType;
#[cfg(any(test, feature = "test-support"))]
use once_cell::sync::Lazy;
use serde_json::Value;
#[cfg(any(test, feature = "test-support"))]
use tempfile::TempDir;
use thiserror::Error;
#[derive(Debug, Clone)]
@@ -66,9 +62,6 @@ const REFRESH_TOKEN_UNKNOWN_MESSAGE: &str =
const REFRESH_TOKEN_URL: &str = "https://auth.openai.com/oauth/token";
pub const REFRESH_TOKEN_URL_OVERRIDE_ENV_VAR: &str = "CODEX_REFRESH_TOKEN_URL_OVERRIDE";
#[cfg(any(test, feature = "test-support"))]
static TEST_AUTH_TEMP_DIRS: Lazy<Mutex<Vec<TempDir>>> = Lazy::new(|| Mutex::new(Vec::new()));
#[derive(Debug, Error)]
pub enum RefreshTokenError {
#[error("{0}")]
@@ -630,6 +623,155 @@ struct CachedAuth {
auth: Option<CodexAuth>,
}
/// Central manager providing a single source of truth for auth.json derived
/// authentication data. It loads once (or on preference change) and then
/// hands out cloned `CodexAuth` values so the rest of the program has a
/// consistent snapshot.
///
/// External modifications to `auth.json` will NOT be observed until
/// `reload()` is called explicitly. This matches the design goal of avoiding
/// different parts of the program seeing inconsistent auth data midrun.
#[derive(Debug)]
pub struct AuthManager {
codex_home: PathBuf,
inner: RwLock<CachedAuth>,
enable_codex_api_key_env: bool,
auth_credentials_store_mode: AuthCredentialsStoreMode,
}
impl AuthManager {
/// Create a new manager loading the initial auth using the provided
/// preferred auth method. Errors loading auth are swallowed; `auth()` will
/// simply return `None` in that case so callers can treat it as an
/// unauthenticated state.
pub fn new(
codex_home: PathBuf,
enable_codex_api_key_env: bool,
auth_credentials_store_mode: AuthCredentialsStoreMode,
) -> Self {
let auth = load_auth(
&codex_home,
enable_codex_api_key_env,
auth_credentials_store_mode,
)
.ok()
.flatten();
Self {
codex_home,
inner: RwLock::new(CachedAuth { auth }),
enable_codex_api_key_env,
auth_credentials_store_mode,
}
}
#[cfg(any(test, feature = "test-support"))]
/// Create an AuthManager with a specific CodexAuth, for testing only.
pub fn from_auth_for_testing(auth: CodexAuth) -> Arc<Self> {
let cached = CachedAuth { auth: Some(auth) };
Arc::new(Self {
codex_home: PathBuf::from("non-existent"),
inner: RwLock::new(cached),
enable_codex_api_key_env: false,
auth_credentials_store_mode: AuthCredentialsStoreMode::File,
})
}
#[cfg(any(test, feature = "test-support"))]
/// Create an AuthManager with a specific CodexAuth and codex home, for testing only.
pub fn from_auth_for_testing_with_home(auth: CodexAuth, codex_home: PathBuf) -> Arc<Self> {
let cached = CachedAuth { auth: Some(auth) };
Arc::new(Self {
codex_home,
inner: RwLock::new(cached),
enable_codex_api_key_env: false,
auth_credentials_store_mode: AuthCredentialsStoreMode::File,
})
}
/// Current cached auth (clone). May be `None` if not logged in or load failed.
pub fn auth(&self) -> Option<CodexAuth> {
self.inner.read().ok().and_then(|c| c.auth.clone())
}
/// Force a reload of the auth information from auth.json. Returns
/// whether the auth value changed.
pub fn reload(&self) -> bool {
let new_auth = load_auth(
&self.codex_home,
self.enable_codex_api_key_env,
self.auth_credentials_store_mode,
)
.ok()
.flatten();
if let Ok(mut guard) = self.inner.write() {
let changed = !AuthManager::auths_equal(&guard.auth, &new_auth);
guard.auth = new_auth;
changed
} else {
false
}
}
fn auths_equal(a: &Option<CodexAuth>, b: &Option<CodexAuth>) -> bool {
match (a, b) {
(None, None) => true,
(Some(a), Some(b)) => a == b,
_ => false,
}
}
/// Convenience constructor returning an `Arc` wrapper.
pub fn shared(
codex_home: PathBuf,
enable_codex_api_key_env: bool,
auth_credentials_store_mode: AuthCredentialsStoreMode,
) -> Arc<Self> {
Arc::new(Self::new(
codex_home,
enable_codex_api_key_env,
auth_credentials_store_mode,
))
}
/// Attempt to refresh the current auth token (if any). On success, reload
/// the auth state from disk so other components observe refreshed token.
/// If the token refresh fails in a permanent (nontransient) way, logs out
/// to clear invalid auth state.
pub async fn refresh_token(&self) -> Result<Option<String>, RefreshTokenError> {
let auth = match self.auth() {
Some(a) => a,
None => return Ok(None),
};
match auth.refresh_token().await {
Ok(token) => {
// Reload to pick up persisted changes.
self.reload();
Ok(Some(token))
}
Err(e) => {
tracing::error!("Failed to refresh token: {}", e);
Err(e)
}
}
}
/// Log out by deleting the ondisk auth.json (if present). Returns Ok(true)
/// if a file was removed, Ok(false) if no auth file existed. On success,
/// reloads the inmemory auth cache so callers immediately observe the
/// unauthenticated state.
pub fn logout(&self) -> std::io::Result<bool> {
let removed = super::auth::logout(&self.codex_home, self.auth_credentials_store_mode)?;
// Always reload to clear any cached auth (even if file absent).
self.reload();
Ok(removed)
}
pub fn get_auth_mode(&self) -> Option<AuthMode> {
self.auth().map(|a| a.mode)
}
}
#[cfg(test)]
mod tests {
use super::*;
@@ -1051,162 +1193,3 @@ mod tests {
pretty_assertions::assert_eq!(auth.account_plan_type(), Some(AccountPlanType::Unknown));
}
}
/// Central manager providing a single source of truth for auth.json derived
/// authentication data. It loads once (or on preference change) and then
/// hands out cloned `CodexAuth` values so the rest of the program has a
/// consistent snapshot.
///
/// External modifications to `auth.json` will NOT be observed until
/// `reload()` is called explicitly. This matches the design goal of avoiding
/// different parts of the program seeing inconsistent auth data midrun.
#[derive(Debug)]
pub struct AuthManager {
codex_home: PathBuf,
inner: RwLock<CachedAuth>,
enable_codex_api_key_env: bool,
auth_credentials_store_mode: AuthCredentialsStoreMode,
}
impl AuthManager {
/// Create a new manager loading the initial auth using the provided
/// preferred auth method. Errors loading auth are swallowed; `auth()` will
/// simply return `None` in that case so callers can treat it as an
/// unauthenticated state.
pub fn new(
codex_home: PathBuf,
enable_codex_api_key_env: bool,
auth_credentials_store_mode: AuthCredentialsStoreMode,
) -> Self {
let auth = load_auth(
&codex_home,
enable_codex_api_key_env,
auth_credentials_store_mode,
)
.ok()
.flatten();
Self {
codex_home,
inner: RwLock::new(CachedAuth { auth }),
enable_codex_api_key_env,
auth_credentials_store_mode,
}
}
#[cfg(any(test, feature = "test-support"))]
#[expect(clippy::expect_used)]
/// Create an AuthManager with a specific CodexAuth, for testing only.
pub fn from_auth_for_testing(auth: CodexAuth) -> Arc<Self> {
let cached = CachedAuth { auth: Some(auth) };
let temp_dir = tempfile::tempdir().expect("temp codex home");
let codex_home = temp_dir.path().to_path_buf();
TEST_AUTH_TEMP_DIRS
.lock()
.expect("lock test codex homes")
.push(temp_dir);
Arc::new(Self {
codex_home,
inner: RwLock::new(cached),
enable_codex_api_key_env: false,
auth_credentials_store_mode: AuthCredentialsStoreMode::File,
})
}
#[cfg(any(test, feature = "test-support"))]
/// Create an AuthManager with a specific CodexAuth and codex home, for testing only.
pub fn from_auth_for_testing_with_home(auth: CodexAuth, codex_home: PathBuf) -> Arc<Self> {
let cached = CachedAuth { auth: Some(auth) };
Arc::new(Self {
codex_home,
inner: RwLock::new(cached),
enable_codex_api_key_env: false,
auth_credentials_store_mode: AuthCredentialsStoreMode::File,
})
}
/// Current cached auth (clone). May be `None` if not logged in or load failed.
pub fn auth(&self) -> Option<CodexAuth> {
self.inner.read().ok().and_then(|c| c.auth.clone())
}
pub fn codex_home(&self) -> &Path {
&self.codex_home
}
/// Force a reload of the auth information from auth.json. Returns
/// whether the auth value changed.
pub fn reload(&self) -> bool {
let new_auth = load_auth(
&self.codex_home,
self.enable_codex_api_key_env,
self.auth_credentials_store_mode,
)
.ok()
.flatten();
if let Ok(mut guard) = self.inner.write() {
let changed = !AuthManager::auths_equal(&guard.auth, &new_auth);
guard.auth = new_auth;
changed
} else {
false
}
}
fn auths_equal(a: &Option<CodexAuth>, b: &Option<CodexAuth>) -> bool {
match (a, b) {
(None, None) => true,
(Some(a), Some(b)) => a == b,
_ => false,
}
}
/// Convenience constructor returning an `Arc` wrapper.
pub fn shared(
codex_home: PathBuf,
enable_codex_api_key_env: bool,
auth_credentials_store_mode: AuthCredentialsStoreMode,
) -> Arc<Self> {
Arc::new(Self::new(
codex_home,
enable_codex_api_key_env,
auth_credentials_store_mode,
))
}
/// Attempt to refresh the current auth token (if any). On success, reload
/// the auth state from disk so other components observe refreshed token.
/// If the token refresh fails in a permanent (nontransient) way, logs out
/// to clear invalid auth state.
pub async fn refresh_token(&self) -> Result<Option<String>, RefreshTokenError> {
let auth = match self.auth() {
Some(a) => a,
None => return Ok(None),
};
match auth.refresh_token().await {
Ok(token) => {
// Reload to pick up persisted changes.
self.reload();
Ok(Some(token))
}
Err(e) => {
tracing::error!("Failed to refresh token: {}", e);
Err(e)
}
}
}
/// Log out by deleting the ondisk auth.json (if present). Returns Ok(true)
/// if a file was removed, Ok(false) if no auth file existed. On success,
/// reloads the inmemory auth cache so callers immediately observe the
/// unauthenticated state.
pub fn logout(&self) -> std::io::Result<bool> {
let removed = super::auth::logout(&self.codex_home, self.auth_credentials_store_mode)?;
// Always reload to clear any cached auth (even if file absent).
self.reload();
Ok(removed)
}
pub fn get_auth_mode(&self) -> Option<AuthMode> {
self.auth().map(|a| a.mode)
}
}

View File

@@ -19,9 +19,10 @@ use codex_api::create_text_param_for_request;
use codex_api::error::ApiError;
use codex_app_server_protocol::AuthMode;
use codex_otel::otel_manager::OtelManager;
use codex_protocol::ConversationId;
use codex_protocol::ThreadId;
use codex_protocol::config_types::ReasoningSummary as ReasoningSummaryConfig;
use codex_protocol::models::ResponseItem;
use codex_protocol::openai_models::ModelInfo;
use codex_protocol::openai_models::ReasoningEffort as ReasoningEffortConfig;
use codex_protocol::protocol::SessionSource;
use eventsource_stream::Event;
@@ -46,10 +47,11 @@ use crate::default_client::build_reqwest_client;
use crate::error::CodexErr;
use crate::error::Result;
use crate::features::FEATURES;
use crate::flags::CODEX_RS_RESPONSES_WS;
use crate::flags::CODEX_RS_SSE_FIXTURE;
use crate::model_provider_info::ModelProviderInfo;
use crate::model_provider_info::WireApi;
use crate::models_manager::model_family::ModelFamily;
use crate::responses_ws::ResponsesWsManager;
use crate::tools::spec::create_tools_json_for_chat_completions_api;
use crate::tools::spec::create_tools_json_for_responses_api;
@@ -57,10 +59,11 @@ use crate::tools::spec::create_tools_json_for_responses_api;
pub struct ModelClient {
config: Arc<Config>,
auth_manager: Option<Arc<AuthManager>>,
model_family: ModelFamily,
model_info: ModelInfo,
otel_manager: OtelManager,
provider: ModelProviderInfo,
conversation_id: ConversationId,
responses_ws: Option<Arc<ResponsesWsManager>>,
conversation_id: ThreadId,
effort: Option<ReasoningEffortConfig>,
summary: ReasoningSummaryConfig,
session_source: SessionSource,
@@ -71,20 +74,22 @@ impl ModelClient {
pub fn new(
config: Arc<Config>,
auth_manager: Option<Arc<AuthManager>>,
model_family: ModelFamily,
model_info: ModelInfo,
otel_manager: OtelManager,
provider: ModelProviderInfo,
responses_ws: Option<Arc<ResponsesWsManager>>,
effort: Option<ReasoningEffortConfig>,
summary: ReasoningSummaryConfig,
conversation_id: ConversationId,
conversation_id: ThreadId,
session_source: SessionSource,
) -> Self {
Self {
config,
auth_manager,
model_family,
model_info,
otel_manager,
provider,
responses_ws,
conversation_id,
effort,
summary,
@@ -93,11 +98,11 @@ impl ModelClient {
}
pub fn get_model_context_window(&self) -> Option<i64> {
let model_family = self.get_model_family();
let effective_context_window_percent = model_family.effective_context_window_percent;
model_family
.context_window
.map(|w| w.saturating_mul(effective_context_window_percent) / 100)
let model_info = self.get_model_info();
let effective_context_window_percent = model_info.effective_context_window_percent;
model_info.context_window.map(|context_window| {
context_window.saturating_mul(effective_context_window_percent) / 100
})
}
pub fn config(&self) -> Arc<Config> {
@@ -115,7 +120,12 @@ impl ModelClient {
/// based on the `show_raw_agent_reasoning` flag in the config.
pub async fn stream(&self, prompt: &Prompt) -> Result<ResponseStream> {
match self.provider.wire_api {
WireApi::Responses => self.stream_responses_api(prompt).await,
WireApi::Responses => {
if *CODEX_RS_RESPONSES_WS && let Some(manager) = self.responses_ws.as_ref() {
return self.stream_responses_ws(prompt, manager).await;
}
self.stream_responses_api(prompt).await
}
WireApi::Chat => {
let api_stream = self.stream_chat_completions(prompt).await?;
@@ -146,8 +156,8 @@ impl ModelClient {
}
let auth_manager = self.auth_manager.clone();
let model_family = self.get_model_family();
let instructions = prompt.get_full_instructions(&model_family).into_owned();
let model_info = self.get_model_info();
let instructions = prompt.get_full_instructions(&model_info).into_owned();
let tools_json = create_tools_json_for_chat_completions_api(&prompt.tools)?;
let api_prompt = build_api_prompt(prompt, instructions, tools_json);
let conversation_id = self.conversation_id.to_string();
@@ -200,13 +210,14 @@ impl ModelClient {
}
let auth_manager = self.auth_manager.clone();
let model_family = self.get_model_family();
let instructions = prompt.get_full_instructions(&model_family).into_owned();
let model_info = self.get_model_info();
let instructions = prompt.get_full_instructions(&model_info).into_owned();
let tools_json: Vec<Value> = create_tools_json_for_responses_api(&prompt.tools)?;
let reasoning = if model_family.supports_reasoning_summaries {
let default_reasoning_effort = model_info.default_reasoning_level;
let reasoning = if model_info.supports_reasoning_summaries {
Some(Reasoning {
effort: self.effort.or(model_family.default_reasoning_effort),
effort: self.effort.or(default_reasoning_effort),
summary: if self.summary == ReasoningSummaryConfig::None {
None
} else {
@@ -223,15 +234,13 @@ impl ModelClient {
vec![]
};
let verbosity = if model_family.support_verbosity {
self.config
.model_verbosity
.or(model_family.default_verbosity)
let verbosity = if model_info.support_verbosity {
self.config.model_verbosity.or(model_info.default_verbosity)
} else {
if self.config.model_verbosity.is_some() {
warn!(
"model_verbosity is set but ignored as the model does not support verbosity: {}",
model_family.family
model_info.slug
);
}
None
@@ -284,6 +293,108 @@ impl ModelClient {
}
}
async fn stream_responses_ws(
&self,
prompt: &Prompt,
manager: &Arc<ResponsesWsManager>,
) -> Result<ResponseStream> {
if let Some(path) = &*CODEX_RS_SSE_FIXTURE {
warn!(path, "Streaming from fixture");
let stream = codex_api::stream_from_fixture(path, self.provider.stream_idle_timeout())
.map_err(map_api_error)?;
return Ok(map_response_stream(stream, self.otel_manager.clone()));
}
let auth_manager = self.auth_manager.clone();
let model_info = self.get_model_info();
let instructions = prompt.get_full_instructions(&model_info).into_owned();
let tools_json: Vec<Value> = create_tools_json_for_responses_api(&prompt.tools)?;
let default_reasoning_effort = model_info.default_reasoning_level;
let reasoning = if model_info.supports_reasoning_summaries {
Some(Reasoning {
effort: self.effort.or(default_reasoning_effort),
summary: if self.summary == ReasoningSummaryConfig::None {
None
} else {
Some(self.summary)
},
})
} else {
None
};
let include: Vec<String> = if reasoning.is_some() {
vec!["reasoning.encrypted_content".to_string()]
} else {
vec![]
};
let verbosity = if model_info.support_verbosity {
self.config.model_verbosity.or(model_info.default_verbosity)
} else {
if self.config.model_verbosity.is_some() {
warn!(
"model_verbosity is set but ignored as the model does not support verbosity: {}",
model_info.slug
);
}
None
};
let text = create_text_param_for_request(verbosity, &prompt.output_schema);
let api_prompt = build_api_prompt(prompt, instructions.clone(), tools_json);
let conversation_id = self.conversation_id.to_string();
let session_source = self.session_source.clone();
let mut refreshed = false;
loop {
let auth = auth_manager.as_ref().and_then(|m| m.auth());
let api_provider = self
.provider
.to_api_provider(auth.as_ref().map(|a| a.mode))?;
let api_auth = auth_provider_from_auth(auth.clone(), &self.provider).await?;
let options = ApiResponsesOptions {
reasoning: reasoning.clone(),
include: include.clone(),
prompt_cache_key: Some(conversation_id.clone()),
text: text.clone(),
store_override: None,
conversation_id: Some(conversation_id.clone()),
session_source: Some(session_source.clone()),
extra_headers: beta_feature_headers(&self.config),
};
let stream_result = manager
.stream_prompt(
api_provider,
api_auth,
&self.get_model(),
&api_prompt,
options,
)
.await;
match stream_result {
Ok(stream) => {
return Ok(map_response_stream(stream, self.otel_manager.clone()));
}
Err(ApiError::Transport(TransportError::Http { status, .. }))
if status == StatusCode::UNAUTHORIZED =>
{
manager.reset().await;
handle_unauthorized(status, &mut refreshed, &auth_manager, &auth).await?;
continue;
}
Err(err) => {
manager.reset().await;
return Err(map_api_error(err));
}
}
}
}
pub fn get_provider(&self) -> ModelProviderInfo {
self.provider.clone()
}
@@ -298,12 +409,11 @@ impl ModelClient {
/// Returns the currently configured model slug.
pub fn get_model(&self) -> String {
self.get_model_family().get_model_slug().to_string()
self.model_info.slug.clone()
}
/// Returns the currently configured model family.
pub fn get_model_family(&self) -> ModelFamily {
self.model_family.clone()
pub fn get_model_info(&self) -> ModelInfo {
self.model_info.clone()
}
/// Returns the current reasoning effort setting.
@@ -340,7 +450,7 @@ impl ModelClient {
.with_telemetry(Some(request_telemetry));
let instructions = prompt
.get_full_instructions(&self.get_model_family())
.get_full_instructions(&self.get_model_info())
.into_owned();
let payload = ApiCompactionInput {
model: &self.get_model(),

View File

@@ -1,15 +1,13 @@
use crate::client_common::tools::ToolSpec;
use crate::error::Result;
use crate::models_manager::model_family::ModelFamily;
pub use codex_api::common::ResponseEvent;
use codex_apply_patch::APPLY_PATCH_TOOL_INSTRUCTIONS;
use codex_protocol::models::ResponseItem;
use codex_protocol::openai_models::ModelInfo;
use futures::Stream;
use serde::Deserialize;
use serde_json::Value;
use std::borrow::Cow;
use std::collections::HashSet;
use std::ops::Deref;
use std::pin::Pin;
use std::task::Context;
use std::task::Poll;
@@ -44,28 +42,12 @@ pub struct Prompt {
}
impl Prompt {
pub(crate) fn get_full_instructions<'a>(&'a self, model: &'a ModelFamily) -> Cow<'a, str> {
let base = self
.base_instructions_override
.as_deref()
.unwrap_or(model.base_instructions.deref());
// When there are no custom instructions, add apply_patch_tool_instructions if:
// - the model needs special instructions (4.1)
// AND
// - there is no apply_patch tool present
let is_apply_patch_tool_present = self.tools.iter().any(|tool| match tool {
ToolSpec::Function(f) => f.name == "apply_patch",
ToolSpec::Freeform(f) => f.name == "apply_patch",
_ => false,
});
if self.base_instructions_override.is_none()
&& model.needs_special_apply_patch_instructions
&& !is_apply_patch_tool_present
{
Cow::Owned(format!("{base}\n{APPLY_PATCH_TOOL_INSTRUCTIONS}"))
} else {
Cow::Borrowed(base)
}
pub(crate) fn get_full_instructions<'a>(&'a self, model: &'a ModelInfo) -> Cow<'a, str> {
Cow::Borrowed(
self.base_instructions_override
.as_deref()
.unwrap_or(model.base_instructions.as_str()),
)
}
pub(crate) fn get_formatted_input(&self) -> Vec<ResponseItem> {
@@ -195,8 +177,13 @@ pub(crate) mod tools {
LocalShell {},
// TODO: Understand why we get an error on web_search although the API docs say it's supported.
// https://platform.openai.com/docs/guides/tools-web-search?api-mode=responses#:~:text=%7B%20type%3A%20%22web_search%22%20%7D%2C
// The `external_web_access` field determines whether the web search is over cached or live content.
// https://platform.openai.com/docs/guides/tools-web-search#live-internet-access
#[serde(rename = "web_search")]
WebSearch {},
WebSearch {
#[serde(skip_serializing_if = "Option::is_none")]
external_web_access: Option<bool>,
},
#[serde(rename = "custom")]
Freeform(FreeformTool),
}
@@ -206,7 +193,7 @@ pub(crate) mod tools {
match self {
ToolSpec::Function(tool) => tool.name.as_str(),
ToolSpec::LocalShell {} => "local_shell",
ToolSpec::WebSearch {} => "web_search",
ToolSpec::WebSearch { .. } => "web_search",
ToolSpec::Freeform(tool) => tool.name.as_str(),
}
}
@@ -272,6 +259,8 @@ mod tests {
let prompt = Prompt {
..Default::default()
};
let prompt_with_apply_patch_instructions =
include_str!("../prompt_with_apply_patch_instructions.md");
let test_cases = vec![
InstructionsTestCase {
slug: "gpt-3.5",
@@ -312,19 +301,16 @@ mod tests {
];
for test_case in test_cases {
let config = test_config();
let model_family =
ModelsManager::construct_model_family_offline(test_case.slug, &config);
let expected = if test_case.expects_apply_patch_instructions {
format!(
"{}\n{}",
model_family.clone().base_instructions,
APPLY_PATCH_TOOL_INSTRUCTIONS
)
} else {
model_family.clone().base_instructions
};
let model_info = ModelsManager::construct_model_info_offline(test_case.slug, &config);
if test_case.expects_apply_patch_instructions {
assert_eq!(
model_info.base_instructions.as_str(),
prompt_with_apply_patch_instructions
);
}
let full = prompt.get_full_instructions(&model_family);
let expected = model_info.base_instructions.as_str();
let full = prompt.get_full_instructions(&model_info);
assert_eq!(full, expected);
}
}

View File

@@ -8,6 +8,9 @@ use std::sync::atomic::Ordering;
use crate::AuthManager;
use crate::SandboxState;
use crate::agent::AgentControl;
use crate::agent::AgentStatus;
use crate::agent::agent_status_from_event;
use crate::client_common::REVIEW_PROMPT;
use crate::compact;
use crate::compact::run_inline_auto_compact_task;
@@ -16,10 +19,11 @@ use crate::compact_remote::run_inline_remote_auto_compact_task;
use crate::exec_policy::ExecPolicyManager;
use crate::features::Feature;
use crate::features::Features;
use crate::flags::CODEX_RS_RESPONSES_WS;
use crate::models_manager::manager::ModelsManager;
use crate::models_manager::model_family::ModelFamily;
use crate::parse_command::parse_command;
use crate::parse_turn_item;
use crate::responses_ws::ResponsesWsManager;
use crate::stream_events_utils::HandleOutputCtx;
use crate::stream_events_utils::handle_non_tool_response_item;
use crate::stream_events_utils::handle_output_item_done;
@@ -29,9 +33,10 @@ use crate::user_notification::UserNotifier;
use crate::util::error_or_panic;
use async_channel::Receiver;
use async_channel::Sender;
use codex_protocol::ConversationId;
use codex_protocol::ThreadId;
use codex_protocol::approvals::ExecPolicyAmendment;
use codex_protocol::items::TurnItem;
use codex_protocol::openai_models::ModelInfo;
use codex_protocol::protocol::FileChange;
use codex_protocol::protocol::HasLegacyEvent;
use codex_protocol::protocol::ItemCompletedEvent;
@@ -143,7 +148,7 @@ use crate::tools::sandboxing::ApprovalStore;
use crate::tools::spec::ToolsConfig;
use crate::tools::spec::ToolsConfigParams;
use crate::turn_diff_tracker::TurnDiffTracker;
use crate::unified_exec::UnifiedExecSessionManager;
use crate::unified_exec::UnifiedExecProcessManager;
use crate::user_instructions::DeveloperInstructions;
use crate::user_instructions::UserInstructions;
use crate::user_notification::UserNotification;
@@ -167,6 +172,8 @@ pub struct Codex {
pub(crate) next_id: AtomicU64,
pub(crate) tx_sub: Sender<Submission>,
pub(crate) rx_event: Receiver<Event>,
// Last known status of the agent.
pub(crate) agent_status: Arc<RwLock<AgentStatus>>,
}
/// Wrapper returned by [`Codex::spawn`] containing the spawned [`Codex`],
@@ -174,7 +181,9 @@ pub struct Codex {
/// unique session id.
pub struct CodexSpawnOk {
pub codex: Codex,
pub conversation_id: ConversationId,
pub thread_id: ThreadId,
#[deprecated(note = "use thread_id")]
pub conversation_id: ThreadId,
}
pub(crate) const INITIAL_SUBMIT_ID: &str = "";
@@ -207,21 +216,23 @@ fn maybe_push_chat_wire_api_deprecation(
impl Codex {
/// Spawn a new [`Codex`] and initialize the session.
pub async fn spawn(
pub(crate) async fn spawn(
config: Config,
auth_manager: Arc<AuthManager>,
models_manager: Arc<ModelsManager>,
skills_manager: Arc<SkillsManager>,
conversation_history: InitialHistory,
session_source: SessionSource,
agent_control: AgentControl,
) -> CodexResult<CodexSpawnOk> {
let (tx_sub, rx_sub) = async_channel::bounded(SUBMISSION_CHANNEL_CAPACITY);
let (tx_event, rx_event) = async_channel::unbounded();
let loaded_skills = config
.features
.enabled(Feature::Skills)
.then(|| skills_manager.skills_for_cwd(&config.cwd));
let loaded_skills = if config.features.enabled(Feature::Skills) {
Some(skills_manager.skills_for_config(&config))
} else {
None
};
if let Some(outcome) = &loaded_skills {
for err in &outcome.errors {
@@ -272,6 +283,7 @@ impl Codex {
// Generate a unique ID for the lifetime of this Codex session.
let session_source_clone = session_configuration.session_source.clone();
let agent_status = Arc::new(RwLock::new(AgentStatus::PendingInit));
let session = Session::new(
session_configuration,
@@ -280,16 +292,18 @@ impl Codex {
models_manager.clone(),
exec_policy,
tx_event.clone(),
Arc::clone(&agent_status),
conversation_history,
session_source_clone,
skills_manager,
agent_control,
)
.await
.map_err(|e| {
error!("Failed to create session: {e:#}");
map_session_init_error(&e, &config.codex_home)
})?;
let conversation_id = session.conversation_id;
let thread_id = session.conversation_id;
// This task will run until Op::Shutdown is received.
tokio::spawn(submission_loop(session, config, rx_sub));
@@ -297,11 +311,14 @@ impl Codex {
next_id: AtomicU64::new(0),
tx_sub,
rx_event,
agent_status,
};
#[allow(deprecated)]
Ok(CodexSpawnOk {
codex,
conversation_id,
thread_id,
conversation_id: thread_id,
})
}
@@ -334,14 +351,20 @@ impl Codex {
.map_err(|_| CodexErr::InternalAgentDied)?;
Ok(event)
}
pub(crate) async fn agent_status(&self) -> AgentStatus {
let status = self.agent_status.read().await;
status.clone()
}
}
/// Context for an initialized model agent
///
/// A session has at most 1 running task at a time, and can be interrupted by user input.
pub(crate) struct Session {
conversation_id: ConversationId,
conversation_id: ThreadId,
tx_event: Sender<Event>,
agent_status: Arc<RwLock<AgentStatus>>,
state: Mutex<SessionState>,
/// The set of enabled features should be invariant for the lifetime of the
/// session.
@@ -351,7 +374,7 @@ pub(crate) struct Session {
next_internal_sub_id: AtomicU64,
}
/// The context needed for a single turn of the conversation.
/// The context needed for a single turn of the thread.
#[derive(Debug)]
pub(crate) struct TurnContext {
pub(crate) sub_id: String,
@@ -485,24 +508,26 @@ impl Session {
auth_manager: Option<Arc<AuthManager>>,
otel_manager: &OtelManager,
provider: ModelProviderInfo,
responses_ws: Option<Arc<ResponsesWsManager>>,
session_configuration: &SessionConfiguration,
per_turn_config: Config,
model_family: ModelFamily,
conversation_id: ConversationId,
model_info: ModelInfo,
conversation_id: ThreadId,
sub_id: String,
) -> TurnContext {
let otel_manager = otel_manager.clone().with_model(
session_configuration.model.as_str(),
model_family.get_model_slug(),
model_info.slug.as_str(),
);
let per_turn_config = Arc::new(per_turn_config);
let client = ModelClient::new(
per_turn_config.clone(),
auth_manager,
model_family.clone(),
model_info.clone(),
otel_manager,
provider,
responses_ws,
session_configuration.model_reasoning_effort,
session_configuration.model_reasoning_summary,
conversation_id,
@@ -510,7 +535,7 @@ impl Session {
);
let tools_config = ToolsConfig::new(&ToolsConfigParams {
model_family: &model_family,
model_info: &model_info,
features: &per_turn_config.features,
});
@@ -532,7 +557,7 @@ impl Session {
tool_call_gate: Arc::new(ReadinessFlag::new()),
truncation_policy: TruncationPolicy::new(
per_turn_config.as_ref(),
model_family.truncation_policy,
model_info.truncation_policy.into(),
),
}
}
@@ -545,9 +570,11 @@ impl Session {
models_manager: Arc<ModelsManager>,
exec_policy: ExecPolicyManager,
tx_event: Sender<Event>,
agent_status: Arc<RwLock<AgentStatus>>,
initial_history: InitialHistory,
session_source: SessionSource,
skills_manager: Arc<SkillsManager>,
agent_control: AgentControl,
) -> anyhow::Result<Arc<Self>> {
debug!(
"Configuring session: model={}; provider={:?}",
@@ -562,7 +589,7 @@ impl Session {
let (conversation_id, rollout_params) = match &initial_history {
InitialHistory::New | InitialHistory::Forked(_) => {
let conversation_id = ConversationId::default();
let conversation_id = ThreadId::default();
(
conversation_id,
RolloutRecorderParams::new(
@@ -620,7 +647,6 @@ impl Session {
}
maybe_push_chat_wire_api_deprecation(&config, &mut post_session_configured_events);
// todo(aibrahim): why are we passing model here while it can change?
let otel_manager = OtelManager::new(
conversation_id,
session_configuration.model.as_str(),
@@ -654,11 +680,16 @@ impl Session {
.map(Arc::new);
}
let state = SessionState::new(session_configuration.clone());
let responses_ws = if *CODEX_RS_RESPONSES_WS {
Some(Arc::new(ResponsesWsManager::new()))
} else {
None
};
let services = SessionServices {
mcp_connection_manager: Arc::new(RwLock::new(McpConnectionManager::default())),
mcp_startup_cancellation_token: CancellationToken::new(),
unified_exec_manager: UnifiedExecSessionManager::default(),
unified_exec_manager: UnifiedExecProcessManager::default(),
notifier: UserNotifier::new(config.notify.clone()),
rollout: Mutex::new(Some(rollout_recorder)),
user_shell: Arc::new(default_shell),
@@ -669,11 +700,14 @@ impl Session {
models_manager: Arc::clone(&models_manager),
tool_approvals: Mutex::new(ApprovalStore::default()),
skills_manager,
agent_control,
responses_ws,
};
let sess = Arc::new(Session {
conversation_id,
tx_event: tx_event.clone(),
agent_status: Arc::clone(&agent_status),
state: Mutex::new(state),
features: config.features.clone(),
active_turn: Mutex::new(None),
@@ -919,18 +953,19 @@ impl Session {
}
}
let model_family = self
let model_info = self
.services
.models_manager
.construct_model_family(session_configuration.model.as_str(), &per_turn_config)
.construct_model_info(session_configuration.model.as_str(), &per_turn_config)
.await;
let mut turn_context: TurnContext = Self::make_turn_context(
Some(Arc::clone(&self.services.auth_manager)),
&self.services.otel_manager,
session_configuration.provider.clone(),
self.services.responses_ws.clone(),
&session_configuration,
per_turn_config,
model_family,
model_info,
self.conversation_id,
sub_id,
);
@@ -994,6 +1029,11 @@ impl Session {
}
pub(crate) async fn send_event_raw(&self, event: Event) {
// Record the last known agent status.
if let Some(status) = agent_status_from_event(&event.msg) {
let mut guard = self.agent_status.write().await;
*guard = status;
}
// Persist the event into rollout (recorder filters as needed)
let rollout_items = vec![RolloutItem::EventMsg(event.msg.clone())];
self.persist_rollout_items(&rollout_items).await;
@@ -1002,6 +1042,25 @@ impl Session {
}
}
/// Persist the event to the rollout file, flush it, and only then deliver it to clients.
///
/// Most events can be delivered immediately after queueing the rollout write, but some
/// clients (e.g. app-server thread/rollback) re-read the rollout file synchronously on
/// receipt of the event and depend on the marker already being visible on disk.
pub(crate) async fn send_event_raw_flushed(&self, event: Event) {
// Record the last known agent status.
if let Some(status) = agent_status_from_event(&event.msg) {
let mut guard = self.agent_status.write().await;
*guard = status;
}
self.persist_rollout_items(&[RolloutItem::EventMsg(event.msg.clone())])
.await;
self.flush_rollout().await;
if let Err(e) = self.tx_event.send(event).await {
error!("failed to send tool call event: {e}");
}
}
pub(crate) async fn emit_turn_item_started(&self, turn_context: &TurnContext, item: &TurnItem) {
self.send_event(
turn_context,
@@ -1219,6 +1278,9 @@ impl Session {
history.replace(rebuilt);
}
}
RolloutItem::EventMsg(EventMsg::ThreadRolledBack(rollback)) => {
history.drop_last_n_user_turns(rollback.num_turns);
}
_ => {}
}
}
@@ -1396,14 +1458,11 @@ impl Session {
}
pub(crate) async fn set_total_tokens_full(&self, turn_context: &TurnContext) {
let context_window = turn_context.client.get_model_context_window();
if let Some(context_window) = context_window {
{
let mut state = self.state.lock().await;
state.set_token_usage_full(context_window);
}
self.send_token_count_event(turn_context).await;
if let Some(context_window) = turn_context.client.get_model_context_window() {
let mut state = self.state.lock().await;
state.set_token_usage_full(context_window);
}
self.send_token_count_event(turn_context).await;
}
pub(crate) async fn record_response_item_and_emit_turn_item(
@@ -1658,6 +1717,9 @@ async fn submission_loop(sess: Arc<Session>, config: Arc<Config>, rx_sub: Receiv
Op::Compact => {
handlers::compact(&sess, sub.id.clone()).await;
}
Op::ThreadRollback { num_turns } => {
handlers::thread_rollback(&sess, sub.id.clone(), num_turns).await;
}
Op::RunUserShellCommand { command } => {
handlers::run_user_shell_command(
&sess,
@@ -1715,6 +1777,7 @@ mod handlers {
use codex_protocol::protocol::ReviewDecision;
use codex_protocol::protocol::ReviewRequest;
use codex_protocol::protocol::SkillsListEntry;
use codex_protocol::protocol::ThreadRolledBackEvent;
use codex_protocol::protocol::TurnAbortReason;
use codex_protocol::protocol::WarningEvent;
@@ -1987,18 +2050,18 @@ mod handlers {
};
let skills = if sess.enabled(Feature::Skills) {
let skills_manager = &sess.services.skills_manager;
cwds.into_iter()
.map(|cwd| {
let outcome = skills_manager.skills_for_cwd_with_options(&cwd, force_reload);
let errors = super::errors_to_info(&outcome.errors);
let skills = super::skills_to_info(&outcome.skills);
SkillsListEntry {
cwd,
skills,
errors,
}
})
.collect()
let mut entries = Vec::new();
for cwd in cwds {
let outcome = skills_manager.skills_for_cwd(&cwd, force_reload).await;
let errors = super::errors_to_info(&outcome.errors);
let skills = super::skills_to_info(&outcome.skills);
entries.push(SkillsListEntry {
cwd,
skills,
errors,
});
}
entries
} else {
cwds.into_iter()
.map(|cwd| SkillsListEntry {
@@ -2034,11 +2097,51 @@ mod handlers {
.await;
}
pub async fn thread_rollback(sess: &Arc<Session>, sub_id: String, num_turns: u32) {
if num_turns == 0 {
sess.send_event_raw(Event {
id: sub_id,
msg: EventMsg::Error(ErrorEvent {
message: "num_turns must be >= 1".to_string(),
codex_error_info: Some(CodexErrorInfo::ThreadRollbackFailed),
}),
})
.await;
return;
}
let has_active_turn = { sess.active_turn.lock().await.is_some() };
if has_active_turn {
sess.send_event_raw(Event {
id: sub_id,
msg: EventMsg::Error(ErrorEvent {
message: "Cannot rollback while a turn is in progress.".to_string(),
codex_error_info: Some(CodexErrorInfo::ThreadRollbackFailed),
}),
})
.await;
return;
}
let turn_context = sess.new_default_turn_with_sub_id(sub_id).await;
let mut history = sess.clone_history().await;
history.drop_last_n_user_turns(num_turns);
sess.replace_history(history.get_history()).await;
sess.recompute_token_usage(turn_context.as_ref()).await;
sess.send_event_raw_flushed(Event {
id: turn_context.sub_id.clone(),
msg: EventMsg::ThreadRolledBack(ThreadRolledBackEvent { num_turns }),
})
.await;
}
pub async fn shutdown(sess: &Arc<Session>, sub_id: String) -> bool {
sess.abort_all_tasks(TurnAbortReason::Interrupted).await;
sess.services
.unified_exec_manager
.terminate_all_sessions()
.terminate_all_processes()
.await;
info!("Shutting down Codex instance");
@@ -2111,18 +2214,19 @@ async fn spawn_review_thread(
resolved: crate::review_prompts::ResolvedReviewRequest,
) {
let model = config.review_model.clone();
let review_model_family = sess
let review_model_info = sess
.services
.models_manager
.construct_model_family(&model, &config)
.construct_model_info(&model, &config)
.await;
// For reviews, disable web_search and view_image regardless of global settings.
let mut review_features = sess.features.clone();
review_features
.disable(crate::features::Feature::WebSearchRequest)
.disable(crate::features::Feature::WebSearchCached)
.disable(crate::features::Feature::ViewImageTool);
let tools_config = ToolsConfig::new(&ToolsConfigParams {
model_family: &review_model_family,
model_info: &review_model_info,
features: &review_features,
});
@@ -2130,7 +2234,7 @@ async fn spawn_review_thread(
let review_prompt = resolved.prompt.clone();
let provider = parent_turn_context.client.get_provider();
let auth_manager = parent_turn_context.client.get_auth_manager();
let model_family = review_model_family.clone();
let model_info = review_model_info.clone();
// Build perturn client with the requested model/family.
let mut per_turn_config = (*config).clone();
@@ -2140,16 +2244,17 @@ async fn spawn_review_thread(
let otel_manager = parent_turn_context.client.get_otel_manager().with_model(
config.review_model.as_str(),
review_model_family.slug.as_str(),
review_model_info.slug.as_str(),
);
let per_turn_config = Arc::new(per_turn_config);
let client = ModelClient::new(
per_turn_config.clone(),
auth_manager,
model_family.clone(),
model_info.clone(),
otel_manager,
provider,
None,
per_turn_config.model_reasoning_effort,
per_turn_config.model_reasoning_summary,
sess.conversation_id,
@@ -2172,7 +2277,10 @@ async fn spawn_review_thread(
final_output_json_schema: None,
codex_linux_sandbox_exe: parent_turn_context.codex_linux_sandbox_exe.clone(),
tool_call_gate: Arc::new(ReadinessFlag::new()),
truncation_policy: TruncationPolicy::new(&per_turn_config, model_family.truncation_policy),
truncation_policy: TruncationPolicy::new(
&per_turn_config,
model_info.truncation_policy.into(),
),
};
// Seed the child task with the review prompt as the initial user message.
@@ -2238,11 +2346,8 @@ pub(crate) async fn run_task(
return None;
}
let auto_compact_limit = turn_context
.client
.get_model_family()
.auto_compact_token_limit()
.unwrap_or(i64::MAX);
let model_info = turn_context.client.get_model_info();
let auto_compact_limit = model_info.auto_compact_token_limit().unwrap_or(i64::MAX);
let total_usage_tokens = sess.get_total_token_usage().await;
if total_usage_tokens >= auto_compact_limit {
run_auto_compact(&sess, &turn_context).await;
@@ -2252,11 +2357,16 @@ pub(crate) async fn run_task(
});
sess.send_event(&turn_context, event).await;
let skills_outcome = sess.enabled(Feature::Skills).then(|| {
sess.services
.skills_manager
.skills_for_cwd(&turn_context.cwd)
});
let skills_outcome = if sess.enabled(Feature::Skills) {
Some(
sess.services
.skills_manager
.skills_for_cwd(&turn_context.cwd, false)
.await,
)
} else {
None
};
let SkillInjections {
items: skill_items,
@@ -2415,7 +2525,7 @@ async fn run_turn(
let model_supports_parallel = turn_context
.client
.get_model_family()
.get_model_info()
.supports_parallel_tool_calls;
let prompt = Prompt {
@@ -2842,7 +2952,7 @@ mod tests {
session
.record_initial_history(InitialHistory::Resumed(ResumedHistory {
conversation_id: ConversationId::default(),
conversation_id: ThreadId::default(),
history: rollout_items,
rollout_path: PathBuf::from("/tmp/resume.jsonl"),
}))
@@ -2919,7 +3029,7 @@ mod tests {
session
.record_initial_history(InitialHistory::Resumed(ResumedHistory {
conversation_id: ConversationId::default(),
conversation_id: ThreadId::default(),
history: rollout_items,
rollout_path: PathBuf::from("/tmp/resume.jsonl"),
}))
@@ -2942,6 +3052,131 @@ mod tests {
assert_eq!(expected, actual);
}
#[tokio::test]
async fn thread_rollback_drops_last_turn_from_history() {
let (sess, tc, rx) = make_session_and_context_with_rx().await;
let initial_context = sess.build_initial_context(tc.as_ref());
sess.record_into_history(&initial_context, tc.as_ref())
.await;
let turn_1 = vec![
ResponseItem::Message {
id: None,
role: "user".to_string(),
content: vec![ContentItem::InputText {
text: "turn 1 user".to_string(),
}],
},
ResponseItem::Message {
id: None,
role: "assistant".to_string(),
content: vec![ContentItem::OutputText {
text: "turn 1 assistant".to_string(),
}],
},
];
sess.record_into_history(&turn_1, tc.as_ref()).await;
let turn_2 = vec![
ResponseItem::Message {
id: None,
role: "user".to_string(),
content: vec![ContentItem::InputText {
text: "turn 2 user".to_string(),
}],
},
ResponseItem::Message {
id: None,
role: "assistant".to_string(),
content: vec![ContentItem::OutputText {
text: "turn 2 assistant".to_string(),
}],
},
];
sess.record_into_history(&turn_2, tc.as_ref()).await;
handlers::thread_rollback(&sess, "sub-1".to_string(), 1).await;
let rollback_event = wait_for_thread_rolled_back(&rx).await;
assert_eq!(rollback_event.num_turns, 1);
let mut expected = Vec::new();
expected.extend(initial_context);
expected.extend(turn_1);
let actual = sess.clone_history().await.get_history();
assert_eq!(expected, actual);
}
#[tokio::test]
async fn thread_rollback_clears_history_when_num_turns_exceeds_existing_turns() {
let (sess, tc, rx) = make_session_and_context_with_rx().await;
let initial_context = sess.build_initial_context(tc.as_ref());
sess.record_into_history(&initial_context, tc.as_ref())
.await;
let turn_1 = vec![ResponseItem::Message {
id: None,
role: "user".to_string(),
content: vec![ContentItem::InputText {
text: "turn 1 user".to_string(),
}],
}];
sess.record_into_history(&turn_1, tc.as_ref()).await;
handlers::thread_rollback(&sess, "sub-1".to_string(), 99).await;
let rollback_event = wait_for_thread_rolled_back(&rx).await;
assert_eq!(rollback_event.num_turns, 99);
let actual = sess.clone_history().await.get_history();
assert_eq!(initial_context, actual);
}
#[tokio::test]
async fn thread_rollback_fails_when_turn_in_progress() {
let (sess, tc, rx) = make_session_and_context_with_rx().await;
let initial_context = sess.build_initial_context(tc.as_ref());
sess.record_into_history(&initial_context, tc.as_ref())
.await;
*sess.active_turn.lock().await = Some(crate::state::ActiveTurn::default());
handlers::thread_rollback(&sess, "sub-1".to_string(), 1).await;
let error_event = wait_for_thread_rollback_failed(&rx).await;
assert_eq!(
error_event.codex_error_info,
Some(CodexErrorInfo::ThreadRollbackFailed)
);
let actual = sess.clone_history().await.get_history();
assert_eq!(initial_context, actual);
}
#[tokio::test]
async fn thread_rollback_fails_when_num_turns_is_zero() {
let (sess, tc, rx) = make_session_and_context_with_rx().await;
let initial_context = sess.build_initial_context(tc.as_ref());
sess.record_into_history(&initial_context, tc.as_ref())
.await;
handlers::thread_rollback(&sess, "sub-1".to_string(), 0).await;
let error_event = wait_for_thread_rollback_failed(&rx).await;
assert_eq!(error_event.message, "num_turns must be >= 1");
assert_eq!(
error_event.codex_error_info,
Some(CodexErrorInfo::ThreadRollbackFailed)
);
let actual = sess.clone_history().await.get_history();
assert_eq!(initial_context, actual);
}
#[tokio::test]
async fn set_rate_limits_retains_previous_credits() {
let codex_home = tempfile::tempdir().expect("create temp dir");
@@ -3175,6 +3410,44 @@ mod tests {
assert_eq!(expected, got);
}
async fn wait_for_thread_rolled_back(
rx: &async_channel::Receiver<Event>,
) -> crate::protocol::ThreadRolledBackEvent {
let deadline = StdDuration::from_secs(2);
let start = std::time::Instant::now();
loop {
let remaining = deadline.saturating_sub(start.elapsed());
let evt = tokio::time::timeout(remaining, rx.recv())
.await
.expect("timeout waiting for event")
.expect("event");
match evt.msg {
EventMsg::ThreadRolledBack(payload) => return payload,
_ => continue,
}
}
}
async fn wait_for_thread_rollback_failed(rx: &async_channel::Receiver<Event>) -> ErrorEvent {
let deadline = StdDuration::from_secs(2);
let start = std::time::Instant::now();
loop {
let remaining = deadline.saturating_sub(start.elapsed());
let evt = tokio::time::timeout(remaining, rx.recv())
.await
.expect("timeout waiting for event")
.expect("event");
match evt.msg {
EventMsg::Error(payload)
if payload.codex_error_info == Some(CodexErrorInfo::ThreadRollbackFailed) =>
{
return payload;
}
_ => continue,
}
}
}
fn text_block(s: &str) -> ContentBlock {
ContentBlock::TextContent(TextContent {
annotations: None,
@@ -3192,15 +3465,15 @@ mod tests {
}
fn otel_manager(
conversation_id: ConversationId,
conversation_id: ThreadId,
config: &Config,
model_family: &ModelFamily,
model_info: &ModelInfo,
session_source: SessionSource,
) -> OtelManager {
OtelManager::new(
conversation_id,
ModelsManager::get_model_offline(config.model.as_deref()).as_str(),
model_family.slug.as_str(),
model_info.slug.as_str(),
None,
Some("test@test.com".to_string()),
Some(AuthMode::ChatGPT),
@@ -3215,11 +3488,16 @@ mod tests {
let codex_home = tempfile::tempdir().expect("create temp dir");
let config = build_test_config(codex_home.path()).await;
let config = Arc::new(config);
let conversation_id = ConversationId::default();
let conversation_id = ThreadId::default();
let auth_manager =
AuthManager::from_auth_for_testing(CodexAuth::from_api_key("Test API Key"));
let models_manager = Arc::new(ModelsManager::new(auth_manager.clone()));
let models_manager = Arc::new(ModelsManager::new(
config.codex_home.clone(),
auth_manager.clone(),
));
let agent_control = AgentControl::default();
let exec_policy = ExecPolicyManager::default();
let agent_status = Arc::new(RwLock::new(AgentStatus::PendingInit));
let model = ModelsManager::get_model_offline(config.model.as_deref());
let session_configuration = SessionConfiguration {
provider: config.model_provider.clone(),
@@ -3237,14 +3515,14 @@ mod tests {
session_source: SessionSource::Exec,
};
let per_turn_config = Session::build_per_turn_config(&session_configuration);
let model_family = ModelsManager::construct_model_family_offline(
let model_info = ModelsManager::construct_model_info_offline(
session_configuration.model.as_str(),
&per_turn_config,
);
let otel_manager = otel_manager(
conversation_id,
config.as_ref(),
&model_family,
&model_info,
session_configuration.session_source.clone(),
);
@@ -3254,7 +3532,7 @@ mod tests {
let services = SessionServices {
mcp_connection_manager: Arc::new(RwLock::new(McpConnectionManager::default())),
mcp_startup_cancellation_token: CancellationToken::new(),
unified_exec_manager: UnifiedExecSessionManager::default(),
unified_exec_manager: UnifiedExecProcessManager::default(),
notifier: UserNotifier::new(None),
rollout: Mutex::new(None),
user_shell: Arc::new(default_user_shell()),
@@ -3265,15 +3543,18 @@ mod tests {
models_manager: Arc::clone(&models_manager),
tool_approvals: Mutex::new(ApprovalStore::default()),
skills_manager,
agent_control,
responses_ws: None,
};
let turn_context = Session::make_turn_context(
Some(Arc::clone(&auth_manager)),
&otel_manager,
session_configuration.provider.clone(),
None,
&session_configuration,
per_turn_config,
model_family,
model_info,
conversation_id,
"turn_id".to_string(),
);
@@ -3281,6 +3562,7 @@ mod tests {
let session = Session {
conversation_id,
tx_event,
agent_status: Arc::clone(&agent_status),
state: Mutex::new(state),
features: config.features.clone(),
active_turn: Mutex::new(None),
@@ -3302,11 +3584,16 @@ mod tests {
let codex_home = tempfile::tempdir().expect("create temp dir");
let config = build_test_config(codex_home.path()).await;
let config = Arc::new(config);
let conversation_id = ConversationId::default();
let conversation_id = ThreadId::default();
let auth_manager =
AuthManager::from_auth_for_testing(CodexAuth::from_api_key("Test API Key"));
let models_manager = Arc::new(ModelsManager::new(auth_manager.clone()));
let models_manager = Arc::new(ModelsManager::new(
config.codex_home.clone(),
auth_manager.clone(),
));
let agent_control = AgentControl::default();
let exec_policy = ExecPolicyManager::default();
let agent_status = Arc::new(RwLock::new(AgentStatus::PendingInit));
let model = ModelsManager::get_model_offline(config.model.as_deref());
let session_configuration = SessionConfiguration {
provider: config.model_provider.clone(),
@@ -3324,14 +3611,14 @@ mod tests {
session_source: SessionSource::Exec,
};
let per_turn_config = Session::build_per_turn_config(&session_configuration);
let model_family = ModelsManager::construct_model_family_offline(
let model_info = ModelsManager::construct_model_info_offline(
session_configuration.model.as_str(),
&per_turn_config,
);
let otel_manager = otel_manager(
conversation_id,
config.as_ref(),
&model_family,
&model_info,
session_configuration.session_source.clone(),
);
@@ -3341,7 +3628,7 @@ mod tests {
let services = SessionServices {
mcp_connection_manager: Arc::new(RwLock::new(McpConnectionManager::default())),
mcp_startup_cancellation_token: CancellationToken::new(),
unified_exec_manager: UnifiedExecSessionManager::default(),
unified_exec_manager: UnifiedExecProcessManager::default(),
notifier: UserNotifier::new(None),
rollout: Mutex::new(None),
user_shell: Arc::new(default_user_shell()),
@@ -3352,15 +3639,18 @@ mod tests {
models_manager: Arc::clone(&models_manager),
tool_approvals: Mutex::new(ApprovalStore::default()),
skills_manager,
agent_control,
responses_ws: None,
};
let turn_context = Arc::new(Session::make_turn_context(
Some(Arc::clone(&auth_manager)),
&otel_manager,
session_configuration.provider.clone(),
None,
&session_configuration,
per_turn_config,
model_family,
model_info,
conversation_id,
"turn_id".to_string(),
));
@@ -3368,6 +3658,7 @@ mod tests {
let session = Arc::new(Session {
conversation_id,
tx_event,
agent_status: Arc::clone(&agent_status),
state: Mutex::new(state),
features: config.features.clone(),
active_turn: Mutex::new(None),
@@ -3386,7 +3677,7 @@ mod tests {
session.features = features;
session
.record_model_warning("too many unified exec sessions", &turn_context)
.record_model_warning("too many unified exec processes", &turn_context)
.await;
let mut history = session.clone_history().await;
@@ -3399,7 +3690,7 @@ mod tests {
assert_eq!(
content,
&vec![ContentItem::InputText {
text: "Warning: too many unified exec sessions".to_string(),
text: "Warning: too many unified exec processes".to_string(),
}]
);
}

View File

@@ -28,12 +28,12 @@ use crate::error::CodexErr;
use crate::models_manager::manager::ModelsManager;
use codex_protocol::protocol::InitialHistory;
/// Start an interactive sub-Codex conversation and return IO channels.
/// Start an interactive sub-Codex thread and return IO channels.
///
/// The returned `events_rx` yields non-approval events emitted by the sub-agent.
/// Approval requests are handled via `parent_session` and are not surfaced.
/// The returned `ops_tx` allows the caller to submit additional `Op`s to the sub-agent.
pub(crate) async fn run_codex_conversation_interactive(
pub(crate) async fn run_codex_thread_interactive(
config: Config,
auth_manager: Arc<AuthManager>,
models_manager: Arc<ModelsManager>,
@@ -52,6 +52,7 @@ pub(crate) async fn run_codex_conversation_interactive(
Arc::clone(&parent_session.services.skills_manager),
initial_history.unwrap_or(InitialHistory::New),
SessionSource::SubAgent(SubAgentSource::Review),
parent_session.services.agent_control.clone(),
)
.await?;
let codex = Arc::new(codex);
@@ -86,6 +87,7 @@ pub(crate) async fn run_codex_conversation_interactive(
next_id: AtomicU64::new(0),
tx_sub: tx_ops,
rx_event: rx_sub,
agent_status: Arc::clone(&codex.agent_status),
})
}
@@ -93,7 +95,7 @@ pub(crate) async fn run_codex_conversation_interactive(
///
/// Internally calls the interactive variant, then immediately submits the provided input.
#[allow(clippy::too_many_arguments)]
pub(crate) async fn run_codex_conversation_one_shot(
pub(crate) async fn run_codex_thread_one_shot(
config: Config,
auth_manager: Arc<AuthManager>,
models_manager: Arc<ModelsManager>,
@@ -106,7 +108,7 @@ pub(crate) async fn run_codex_conversation_one_shot(
// Use a child token so we can stop the delegate after completion without
// requiring the caller to cancel the parent token.
let child_cancel = cancel_token.child_token();
let io = run_codex_conversation_interactive(
let io = run_codex_thread_interactive(
config,
auth_manager,
models_manager,
@@ -127,6 +129,7 @@ pub(crate) async fn run_codex_conversation_one_shot(
// Bridge events so we can observe completion and shut down automatically.
let (tx_bridge, rx_bridge) = async_channel::bounded(SUBMISSION_CHANNEL_CAPACITY);
let ops_tx = io.tx_sub.clone();
let agent_status = Arc::clone(&io.agent_status);
let io_for_bridge = io;
tokio::spawn(async move {
while let Ok(event) = io_for_bridge.next_event().await {
@@ -158,6 +161,7 @@ pub(crate) async fn run_codex_conversation_one_shot(
next_id: AtomicU64::new(0),
rx_event: rx_bridge,
tx_sub: tx_closed,
agent_status,
})
}
@@ -372,6 +376,7 @@ mod tests {
next_id: AtomicU64::new(0),
tx_sub,
rx_event: rx_events,
agent_status: Default::default(),
});
let (session, ctx, _rx_evt) = crate::codex::make_session_and_context_with_rx().await;

View File

@@ -1,3 +1,4 @@
use crate::agent::AgentStatus;
use crate::codex::Codex;
use crate::error::Result as CodexResult;
use crate::protocol::Event;
@@ -5,14 +6,14 @@ use crate::protocol::Op;
use crate::protocol::Submission;
use std::path::PathBuf;
pub struct CodexConversation {
pub struct CodexThread {
codex: Codex,
rollout_path: PathBuf,
}
/// Conduit for the bidirectional stream of messages that compose a conversation
/// in Codex.
impl CodexConversation {
/// Conduit for the bidirectional stream of messages that compose a thread
/// (formerly called a conversation) in Codex.
impl CodexThread {
pub(crate) fn new(codex: Codex, rollout_path: PathBuf) -> Self {
Self {
codex,
@@ -33,6 +34,10 @@ impl CodexConversation {
self.codex.next_event().await
}
pub async fn agent_status(&self) -> AgentStatus {
self.codex.agent_status().await
}
pub fn rollout_path(&self) -> PathBuf {
self.rollout_path.clone()
}

View File

@@ -108,7 +108,7 @@ async fn run_compact_task_inner(
sess.notify_background_event(
turn_context.as_ref(),
format!(
"Trimmed {truncated_count} older conversation item(s) before compacting so the prompt fits the model context window."
"Trimmed {truncated_count} older thread item(s) before compacting so the prompt fits the model context window."
),
)
.await;
@@ -182,7 +182,7 @@ async fn run_compact_task_inner(
sess.send_event(&turn_context, event).await;
let warning = EventMsg::Warning(WarningEvent {
message: "Heads up: Long conversations and multiple compactions can cause the model to be less accurate. Start a new conversation when possible to keep conversations small and targeted.".to_string(),
message: "Heads up: Long threads and multiple compactions can cause the model to be less accurate. Start a new thread when possible to keep threads small and targeted.".to_string(),
});
sess.send_event(&turn_context, warning).await;
}

View File

@@ -268,7 +268,7 @@ pub struct Config {
/// Additional filenames to try when looking for project-level docs.
pub project_doc_fallback_filenames: Vec<String>,
// todo(aibrahim): this should be used in the override model family
// todo(aibrahim): this should be used in the override model info
/// Token budget applied when storing tool/function outputs in the context manager.
pub tool_output_token_limit: Option<usize>,
@@ -316,7 +316,7 @@ pub struct Config {
/// Include the `apply_patch` tool for models that benefit from invoking
/// file edits as a structured tool call. When unset, this falls back to the
/// model family's default preference.
/// model info's default preference.
pub include_apply_patch_tool: bool,
pub tools_web_search_request: bool,
@@ -353,6 +353,10 @@ pub struct Config {
/// or placeholder replacement will occur for fast keypress bursts.
pub disable_paste_burst: bool,
/// When `false`, disables analytics across Codex product surfaces in this machine.
/// Defaults to `true`.
pub analytics: bool,
/// OTEL configuration (exporter type, endpoint, headers, etc.).
pub otel: crate::config::types::OtelConfig,
}
@@ -813,6 +817,10 @@ pub struct ConfigToml {
/// or placeholder replacement will occur for fast keypress bursts.
pub disable_paste_burst: Option<bool>,
/// When `false`, disables analytics across Codex product surfaces in this machine.
/// Defaults to `true`.
pub analytics: Option<crate::config::types::AnalyticsConfigToml>,
/// OTEL configuration.
pub otel: Option<crate::config::types::OtelConfigToml>,
@@ -1390,6 +1398,12 @@ impl Config {
notices: cfg.notice.unwrap_or_default(),
check_for_update_on_startup,
disable_paste_burst: cfg.disable_paste_burst.unwrap_or(false),
analytics: config_profile
.analytics
.as_ref()
.and_then(|a| a.enabled)
.or(cfg.analytics.as_ref().and_then(|a| a.enabled))
.unwrap_or(true),
tui_notifications: cfg
.tui
.as_ref()
@@ -3039,6 +3053,9 @@ approval_policy = "untrusted"
# `ConfigOverrides`.
profile = "gpt3"
[analytics]
enabled = true
[model_providers.openai-chat-completions]
name = "OpenAI using Chat Completions"
base_url = "https://api.openai.com/v1"
@@ -3064,6 +3081,9 @@ model = "o3"
model_provider = "openai"
approval_policy = "on-failure"
[profiles.zdr.analytics]
enabled = false
[profiles.gpt5]
model = "gpt-5.1"
model_provider = "openai"
@@ -3204,6 +3224,7 @@ model_verbosity = "high"
tui_notifications: Default::default(),
animations: true,
show_tooltips: true,
analytics: true,
tui_scroll_events_per_tick: None,
tui_scroll_wheel_lines: None,
tui_scroll_trackpad_lines: None,
@@ -3287,6 +3308,7 @@ model_verbosity = "high"
tui_notifications: Default::default(),
animations: true,
show_tooltips: true,
analytics: true,
tui_scroll_events_per_tick: None,
tui_scroll_wheel_lines: None,
tui_scroll_trackpad_lines: None,
@@ -3385,6 +3407,7 @@ model_verbosity = "high"
tui_notifications: Default::default(),
animations: true,
show_tooltips: true,
analytics: false,
tui_scroll_events_per_tick: None,
tui_scroll_wheel_lines: None,
tui_scroll_trackpad_lines: None,
@@ -3469,6 +3492,7 @@ model_verbosity = "high"
tui_notifications: Default::default(),
animations: true,
show_tooltips: true,
analytics: true,
tui_scroll_events_per_tick: None,
tui_scroll_wheel_lines: None,
tui_scroll_trackpad_lines: None,

View File

@@ -29,6 +29,7 @@ pub struct ConfigProfile {
pub experimental_use_freeform_apply_patch: Option<bool>,
pub tools_web_search: Option<bool>,
pub tools_view_image: Option<bool>,
pub analytics: Option<crate::config::types::AnalyticsConfigToml>,
/// Optional feature toggles scoped to this profile.
#[serde(default)]
pub features: Option<crate::features::FeaturesToml>,

View File

@@ -106,16 +106,7 @@ pub struct ConfigService {
}
impl ConfigService {
pub fn new(codex_home: PathBuf, cli_overrides: Vec<(String, TomlValue)>) -> Self {
Self {
codex_home,
cli_overrides,
loader_overrides: LoaderOverrides::default(),
}
}
#[cfg(test)]
fn with_overrides(
pub fn new(
codex_home: PathBuf,
cli_overrides: Vec<(String, TomlValue)>,
loader_overrides: LoaderOverrides,
@@ -127,6 +118,14 @@ impl ConfigService {
}
}
pub fn new_with_defaults(codex_home: PathBuf) -> Self {
Self {
codex_home,
cli_overrides: Vec::new(),
loader_overrides: LoaderOverrides::default(),
}
}
pub async fn read(
&self,
params: ConfigReadParams,
@@ -707,7 +706,7 @@ unified_exec = true
"#;
std::fs::write(tmp.path().join(CONFIG_TOML_FILE), original)?;
let service = ConfigService::new(tmp.path().to_path_buf(), vec![]);
let service = ConfigService::new_with_defaults(tmp.path().to_path_buf());
service
.write_value(ConfigValueWriteParams {
file_path: Some(tmp.path().join(CONFIG_TOML_FILE).display().to_string()),
@@ -748,7 +747,7 @@ remote_compaction = true
std::fs::write(&managed_path, "approval_policy = \"never\"").unwrap();
let managed_file = AbsolutePathBuf::try_from(managed_path.clone()).expect("managed file");
let service = ConfigService::with_overrides(
let service = ConfigService::new(
tmp.path().to_path_buf(),
vec![],
LoaderOverrides {
@@ -829,7 +828,7 @@ remote_compaction = true
std::fs::write(&managed_path, "approval_policy = \"never\"").unwrap();
let managed_file = AbsolutePathBuf::try_from(managed_path.clone()).expect("managed file");
let service = ConfigService::with_overrides(
let service = ConfigService::new(
tmp.path().to_path_buf(),
vec![],
LoaderOverrides {
@@ -881,7 +880,7 @@ remote_compaction = true
let user_path = tmp.path().join(CONFIG_TOML_FILE);
std::fs::write(&user_path, "model = \"user\"").unwrap();
let service = ConfigService::new(tmp.path().to_path_buf(), vec![]);
let service = ConfigService::new_with_defaults(tmp.path().to_path_buf());
let error = service
.write_value(ConfigValueWriteParams {
file_path: Some(tmp.path().join(CONFIG_TOML_FILE).display().to_string()),
@@ -904,7 +903,7 @@ remote_compaction = true
let tmp = tempdir().expect("tempdir");
std::fs::write(tmp.path().join(CONFIG_TOML_FILE), "").unwrap();
let service = ConfigService::new(tmp.path().to_path_buf(), vec![]);
let service = ConfigService::new_with_defaults(tmp.path().to_path_buf());
service
.write_value(ConfigValueWriteParams {
file_path: None,
@@ -932,7 +931,7 @@ remote_compaction = true
let managed_path = tmp.path().join("managed_config.toml");
std::fs::write(&managed_path, "approval_policy = \"never\"").unwrap();
let service = ConfigService::with_overrides(
let service = ConfigService::new(
tmp.path().to_path_buf(),
vec![],
LoaderOverrides {
@@ -980,7 +979,7 @@ remote_compaction = true
TomlValue::String("session".to_string()),
)];
let service = ConfigService::with_overrides(
let service = ConfigService::new(
tmp.path().to_path_buf(),
cli_overrides,
LoaderOverrides {
@@ -1026,7 +1025,7 @@ remote_compaction = true
std::fs::write(&managed_path, "approval_policy = \"never\"").unwrap();
let managed_file = AbsolutePathBuf::try_from(managed_path.clone()).expect("managed file");
let service = ConfigService::with_overrides(
let service = ConfigService::new(
tmp.path().to_path_buf(),
vec![],
LoaderOverrides {
@@ -1085,7 +1084,7 @@ alpha = "a"
std::fs::write(&path, base)?;
let service = ConfigService::new(tmp.path().to_path_buf(), vec![]);
let service = ConfigService::new_with_defaults(tmp.path().to_path_buf());
service
.write_value(ConfigValueWriteParams {
file_path: Some(path.display().to_string()),

View File

@@ -273,6 +273,15 @@ pub enum HistoryPersistence {
None,
}
// ===== Analytics configuration =====
/// Analytics settings loaded from config.toml. Fields are optional so we can apply defaults.
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Default)]
pub struct AnalyticsConfigToml {
/// When `false`, disables analytics across Codex product surfaces in this profile.
pub enabled: Option<bool>,
}
// ===== OTEL configuration =====
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq)]

View File

@@ -93,12 +93,8 @@ pub(super) async fn read_config_from_path(
}
}
/// Return the default managed config path (honoring `CODEX_MANAGED_CONFIG_PATH`).
/// Return the default managed config path.
pub(super) fn managed_config_default_path(codex_home: &Path) -> PathBuf {
if let Ok(path) = std::env::var("CODEX_MANAGED_CONFIG_PATH") {
return PathBuf::from(path);
}
#[cfg(unix)]
{
let _ = codex_home;

View File

@@ -5,6 +5,9 @@ use crate::truncate::approx_token_count;
use crate::truncate::approx_tokens_from_byte_count;
use crate::truncate::truncate_function_output_items_with_policy;
use crate::truncate::truncate_text;
use crate::user_instructions::SkillInstructions;
use crate::user_instructions::UserInstructions;
use crate::user_shell_command::is_user_shell_command_text;
use codex_protocol::models::ContentItem;
use codex_protocol::models::FunctionCallOutputContentItem;
use codex_protocol::models::FunctionCallOutputPayload;
@@ -13,7 +16,7 @@ use codex_protocol::protocol::TokenUsage;
use codex_protocol::protocol::TokenUsageInfo;
use std::ops::Deref;
/// Transcript of conversation history
/// Transcript of thread history
#[derive(Debug, Clone, Default)]
pub(crate) struct ContextManager {
/// The oldest items are at the beginning of the vector.
@@ -80,10 +83,9 @@ impl ContextManager {
// Estimate token usage using byte-based heuristics from the truncation helpers.
// This is a coarse lower bound, not a tokenizer-accurate count.
pub(crate) fn estimate_token_count(&self, turn_context: &TurnContext) -> Option<i64> {
let model_family = turn_context.client.get_model_family();
let base_tokens =
i64::try_from(approx_token_count(model_family.base_instructions.as_str()))
.unwrap_or(i64::MAX);
let model_info = turn_context.client.get_model_info();
let base_instructions = model_info.base_instructions.as_str();
let base_tokens = i64::try_from(approx_token_count(base_instructions)).unwrap_or(i64::MAX);
let items_tokens = self.items.iter().fold(0i64, |acc, item| {
acc + match item {
@@ -152,6 +154,39 @@ impl ContextManager {
}
}
/// Drop the last `num_turns` user turns from this history.
///
/// "User turns" are identified as `ResponseItem::Message` entries whose role is `"user"`.
///
/// This mirrors thread-rollback semantics:
/// - `num_turns == 0` is a no-op
/// - if there are no user turns, this is a no-op
/// - if `num_turns` exceeds the number of user turns, all user turns are dropped while
/// preserving any items that occurred before the first user message.
pub(crate) fn drop_last_n_user_turns(&mut self, num_turns: u32) {
if num_turns == 0 {
return;
}
// Keep behavior consistent with call sites that previously operated on `get_history()`:
// normalize first (call/output invariants), then truncate based on the normalized view.
let snapshot = self.get_history();
let user_positions = user_message_positions(&snapshot);
let Some(&first_user_idx) = user_positions.first() else {
self.replace(snapshot);
return;
};
let n_from_end = usize::try_from(num_turns).unwrap_or(usize::MAX);
let cut_idx = if n_from_end >= user_positions.len() {
first_user_idx
} else {
user_positions[user_positions.len() - n_from_end]
};
self.replace(snapshot[..cut_idx].to_vec());
}
pub(crate) fn update_token_info(
&mut self,
usage: &TokenUsage,
@@ -291,6 +326,56 @@ fn estimate_reasoning_length(encoded_len: usize) -> usize {
.saturating_sub(650)
}
fn is_session_prefix(text: &str) -> bool {
let trimmed = text.trim_start();
let lowered = trimmed.to_ascii_lowercase();
lowered.starts_with("<environment_context>")
}
fn is_user_turn_boundary(item: &ResponseItem) -> bool {
let ResponseItem::Message { role, content, .. } = item else {
return false;
};
if role != "user" {
return false;
}
if UserInstructions::is_user_instructions(content)
|| SkillInstructions::is_skill_instructions(content)
{
return false;
}
for content_item in content {
match content_item {
ContentItem::InputText { text } => {
if is_session_prefix(text) || is_user_shell_command_text(text) {
return false;
}
}
ContentItem::OutputText { text } => {
if is_session_prefix(text) {
return false;
}
}
ContentItem::InputImage { .. } => {}
}
}
true
}
fn user_message_positions(items: &[ResponseItem]) -> Vec<usize> {
let mut positions = Vec::new();
for (idx, item) in items.iter().enumerate() {
if is_user_turn_boundary(item) {
positions.push(idx);
}
}
positions
}
#[cfg(test)]
#[path = "history_tests.rs"]
mod tests;

View File

@@ -43,6 +43,16 @@ fn user_msg(text: &str) -> ResponseItem {
}
}
fn user_input_text_msg(text: &str) -> ResponseItem {
ResponseItem::Message {
id: None,
role: "user".to_string(),
content: vec![ContentItem::InputText {
text: text.to_string(),
}],
}
}
fn reasoning_msg(text: &str) -> ResponseItem {
ResponseItem::Reasoning {
id: String::new(),
@@ -227,6 +237,127 @@ fn remove_first_item_handles_local_shell_pair() {
assert_eq!(h.contents(), vec![]);
}
#[test]
fn drop_last_n_user_turns_preserves_prefix() {
let items = vec![
assistant_msg("session prefix item"),
user_msg("u1"),
assistant_msg("a1"),
user_msg("u2"),
assistant_msg("a2"),
];
let mut history = create_history_with_items(items);
history.drop_last_n_user_turns(1);
assert_eq!(
history.get_history(),
vec![
assistant_msg("session prefix item"),
user_msg("u1"),
assistant_msg("a1"),
]
);
let mut history = create_history_with_items(vec![
assistant_msg("session prefix item"),
user_msg("u1"),
assistant_msg("a1"),
user_msg("u2"),
assistant_msg("a2"),
]);
history.drop_last_n_user_turns(99);
assert_eq!(
history.get_history(),
vec![assistant_msg("session prefix item")]
);
}
#[test]
fn drop_last_n_user_turns_ignores_session_prefix_user_messages() {
let items = vec![
user_input_text_msg("<environment_context>ctx</environment_context>"),
user_input_text_msg("<user_instructions>do the thing</user_instructions>"),
user_input_text_msg(
"# AGENTS.md instructions for test_directory\n\n<INSTRUCTIONS>\ntest_text\n</INSTRUCTIONS>",
),
user_input_text_msg(
"<skill>\n<name>demo</name>\n<path>skills/demo/SKILL.md</path>\nbody\n</skill>",
),
user_input_text_msg("<user_shell_command>echo 42</user_shell_command>"),
user_input_text_msg("turn 1 user"),
assistant_msg("turn 1 assistant"),
user_input_text_msg("turn 2 user"),
assistant_msg("turn 2 assistant"),
];
let mut history = create_history_with_items(items);
history.drop_last_n_user_turns(1);
let expected_prefix_and_first_turn = vec![
user_input_text_msg("<environment_context>ctx</environment_context>"),
user_input_text_msg("<user_instructions>do the thing</user_instructions>"),
user_input_text_msg(
"# AGENTS.md instructions for test_directory\n\n<INSTRUCTIONS>\ntest_text\n</INSTRUCTIONS>",
),
user_input_text_msg(
"<skill>\n<name>demo</name>\n<path>skills/demo/SKILL.md</path>\nbody\n</skill>",
),
user_input_text_msg("<user_shell_command>echo 42</user_shell_command>"),
user_input_text_msg("turn 1 user"),
assistant_msg("turn 1 assistant"),
];
assert_eq!(history.get_history(), expected_prefix_and_first_turn);
let expected_prefix_only = vec![
user_input_text_msg("<environment_context>ctx</environment_context>"),
user_input_text_msg("<user_instructions>do the thing</user_instructions>"),
user_input_text_msg(
"# AGENTS.md instructions for test_directory\n\n<INSTRUCTIONS>\ntest_text\n</INSTRUCTIONS>",
),
user_input_text_msg(
"<skill>\n<name>demo</name>\n<path>skills/demo/SKILL.md</path>\nbody\n</skill>",
),
user_input_text_msg("<user_shell_command>echo 42</user_shell_command>"),
];
let mut history = create_history_with_items(vec![
user_input_text_msg("<environment_context>ctx</environment_context>"),
user_input_text_msg("<user_instructions>do the thing</user_instructions>"),
user_input_text_msg(
"# AGENTS.md instructions for test_directory\n\n<INSTRUCTIONS>\ntest_text\n</INSTRUCTIONS>",
),
user_input_text_msg(
"<skill>\n<name>demo</name>\n<path>skills/demo/SKILL.md</path>\nbody\n</skill>",
),
user_input_text_msg("<user_shell_command>echo 42</user_shell_command>"),
user_input_text_msg("turn 1 user"),
assistant_msg("turn 1 assistant"),
user_input_text_msg("turn 2 user"),
assistant_msg("turn 2 assistant"),
]);
history.drop_last_n_user_turns(2);
assert_eq!(history.get_history(), expected_prefix_only);
let mut history = create_history_with_items(vec![
user_input_text_msg("<environment_context>ctx</environment_context>"),
user_input_text_msg("<user_instructions>do the thing</user_instructions>"),
user_input_text_msg(
"# AGENTS.md instructions for test_directory\n\n<INSTRUCTIONS>\ntest_text\n</INSTRUCTIONS>",
),
user_input_text_msg(
"<skill>\n<name>demo</name>\n<path>skills/demo/SKILL.md</path>\nbody\n</skill>",
),
user_input_text_msg("<user_shell_command>echo 42</user_shell_command>"),
user_input_text_msg("turn 1 user"),
assistant_msg("turn 1 assistant"),
user_input_text_msg("turn 2 user"),
assistant_msg("turn 2 assistant"),
]);
history.drop_last_n_user_turns(3);
assert_eq!(history.get_history(), expected_prefix_only);
}
#[test]
fn remove_first_item_handles_custom_tool_pair() {
let items = vec![
@@ -462,7 +593,6 @@ fn format_exec_output_prefers_line_marker_when_both_limits_exceeded() {
assert_truncated_message_matches(&truncated, "line-0-", 17_423);
}
//TODO(aibrahim): run CI in release mode.
#[cfg(not(debug_assertions))]
#[test]
fn normalize_adds_missing_output_for_function_call() {

View File

@@ -8,7 +8,7 @@ use chrono::Datelike;
use chrono::Local;
use chrono::Utc;
use codex_async_utils::CancelErr;
use codex_protocol::ConversationId;
use codex_protocol::ThreadId;
use codex_protocol::protocol::CodexErrorInfo;
use codex_protocol::protocol::ErrorEvent;
use codex_protocol::protocol::RateLimitSnapshot;
@@ -71,12 +71,12 @@ pub enum CodexErr {
Stream(String, Option<Duration>),
#[error(
"Codex ran out of room in the model's context window. Start a new conversation or clear earlier history before retrying."
"Codex ran out of room in the model's context window. Start a new thread or clear earlier history before retrying."
)]
ContextWindowExceeded,
#[error("no conversation with id: {0}")]
ConversationNotFound(ConversationId),
#[error("no thread with id: {0}")]
ThreadNotFound(ThreadId),
#[error("session configured event was not the first event in the stream")]
SessionConfiguredNotFirstEvent,
@@ -455,7 +455,7 @@ impl CodexErr {
CodexErr::SessionConfiguredNotFirstEvent
| CodexErr::InternalServerError
| CodexErr::InternalAgentDied => CodexErrorInfo::InternalServerError,
CodexErr::UnsupportedOperation(_) | CodexErr::ConversationNotFound(_) => {
CodexErr::UnsupportedOperation(_) | CodexErr::ThreadNotFound(_) => {
CodexErrorInfo::BadRequest
}
CodexErr::Sandbox(_) => CodexErrorInfo::SandboxError,

View File

@@ -28,11 +28,10 @@ use crate::features::Feature;
use crate::features::Features;
use crate::sandboxing::SandboxPermissions;
use crate::tools::sandboxing::ExecApprovalRequirement;
use shlex::try_join as shlex_try_join;
const FORBIDDEN_REASON: &str = "execpolicy forbids this command";
const PROMPT_CONFLICT_REASON: &str =
"execpolicy requires approval for this command, but AskForApproval is set to Never";
const PROMPT_REASON: &str = "execpolicy requires approval for this command";
"approval required by policy, but AskForApproval is set to Never";
const RULES_DIR_NAME: &str = "rules";
const RULE_EXTENSION: &str = "rules";
const DEFAULT_POLICY_FILE: &str = "default.rules";
@@ -128,7 +127,7 @@ impl ExecPolicyManager {
match evaluation.decision {
Decision::Forbidden => ExecApprovalRequirement::Forbidden {
reason: FORBIDDEN_REASON.to_string(),
reason: derive_forbidden_reason(command, &evaluation),
},
Decision::Prompt => {
if matches!(approval_policy, AskForApproval::Never) {
@@ -137,7 +136,7 @@ impl ExecPolicyManager {
}
} else {
ExecApprovalRequirement::NeedsApproval {
reason: derive_prompt_reason(&evaluation),
reason: derive_prompt_reason(command, &evaluation),
proposed_execpolicy_amendment: if features.enabled(Feature::ExecPolicy) {
try_derive_execpolicy_amendment_for_prompt_rules(
&evaluation.matched_rules,
@@ -299,15 +298,69 @@ fn try_derive_execpolicy_amendment_for_allow_rules(
})
}
/// Only return PROMPT_REASON when an execpolicy rule drove the prompt decision.
fn derive_prompt_reason(evaluation: &Evaluation) -> Option<String> {
evaluation.matched_rules.iter().find_map(|rule_match| {
if is_policy_match(rule_match) && rule_match.decision() == Decision::Prompt {
Some(PROMPT_REASON.to_string())
} else {
None
/// Only return a reason when a policy rule drove the prompt decision.
fn derive_prompt_reason(command_args: &[String], evaluation: &Evaluation) -> Option<String> {
let command = render_shlex_command(command_args);
let most_specific_prompt = evaluation
.matched_rules
.iter()
.filter_map(|rule_match| match rule_match {
RuleMatch::PrefixRuleMatch {
matched_prefix,
decision: Decision::Prompt,
justification,
..
} => Some((matched_prefix.len(), justification.as_deref())),
_ => None,
})
.max_by_key(|(matched_prefix_len, _)| *matched_prefix_len);
match most_specific_prompt {
Some((_matched_prefix_len, Some(justification))) => {
Some(format!("`{command}` requires approval: {justification}"))
}
})
Some((_matched_prefix_len, None)) => {
Some(format!("`{command}` requires approval by policy"))
}
None => None,
}
}
fn render_shlex_command(args: &[String]) -> String {
shlex_try_join(args.iter().map(String::as_str)).unwrap_or_else(|_| args.join(" "))
}
/// Derive a string explaining why the command was forbidden. If `justification`
/// is set by the user, this can contain instructions with recommended
/// alternatives, for example.
fn derive_forbidden_reason(command_args: &[String], evaluation: &Evaluation) -> String {
let command = render_shlex_command(command_args);
let most_specific_forbidden = evaluation
.matched_rules
.iter()
.filter_map(|rule_match| match rule_match {
RuleMatch::PrefixRuleMatch {
matched_prefix,
decision: Decision::Forbidden,
justification,
..
} => Some((matched_prefix, justification.as_deref())),
_ => None,
})
.max_by_key(|(matched_prefix, _)| matched_prefix.len());
match most_specific_forbidden {
Some((_matched_prefix, Some(justification))) => {
format!("`{command}` rejected: {justification}")
}
Some((matched_prefix, None)) => {
let prefix = render_shlex_command(matched_prefix);
format!("`{command}` rejected: policy forbids commands starting with `{prefix}`")
}
None => format!("`{command}` rejected: blocked by policy"),
}
}
async fn collect_policy_files(dir: impl AsRef<Path>) -> Result<Vec<PathBuf>, ExecPolicyError> {
@@ -450,7 +503,8 @@ mod tests {
decision: Decision::Forbidden,
matched_rules: vec![RuleMatch::PrefixRuleMatch {
matched_prefix: vec!["rm".to_string()],
decision: Decision::Forbidden
decision: Decision::Forbidden,
justification: None,
}],
},
policy.check_multiple(command.iter(), &|_| Decision::Allow)
@@ -528,7 +582,8 @@ mod tests {
decision: Decision::Forbidden,
matched_rules: vec![RuleMatch::PrefixRuleMatch {
matched_prefix: vec!["rm".to_string()],
decision: Decision::Forbidden
decision: Decision::Forbidden,
justification: None,
}],
},
policy.check_multiple([vec!["rm".to_string()]].iter(), &|_| Decision::Allow)
@@ -538,7 +593,8 @@ mod tests {
decision: Decision::Prompt,
matched_rules: vec![RuleMatch::PrefixRuleMatch {
matched_prefix: vec!["ls".to_string()],
decision: Decision::Prompt
decision: Decision::Prompt,
justification: None,
}],
},
policy.check_multiple([vec!["ls".to_string()]].iter(), &|_| Decision::Allow)
@@ -560,7 +616,7 @@ prefix_rule(pattern=["rm"], decision="forbidden")
let forbidden_script = vec![
"bash".to_string(),
"-lc".to_string(),
"rm -rf /tmp".to_string(),
"rm -rf /some/important/folder".to_string(),
];
let manager = ExecPolicyManager::new(policy);
@@ -577,7 +633,45 @@ prefix_rule(pattern=["rm"], decision="forbidden")
assert_eq!(
requirement,
ExecApprovalRequirement::Forbidden {
reason: FORBIDDEN_REASON.to_string()
reason: "`bash -lc 'rm -rf /some/important/folder'` rejected: policy forbids commands starting with `rm`".to_string()
}
);
}
#[tokio::test]
async fn justification_is_included_in_forbidden_exec_approval_requirement() {
let policy_src = r#"
prefix_rule(
pattern=["rm"],
decision="forbidden",
justification="destructive command",
)
"#;
let mut parser = PolicyParser::new();
parser
.parse("test.rules", policy_src)
.expect("parse policy");
let policy = Arc::new(parser.build());
let manager = ExecPolicyManager::new(policy);
let requirement = manager
.create_exec_approval_requirement_for_command(
&Features::with_defaults(),
&[
"rm".to_string(),
"-rf".to_string(),
"/some/important/folder".to_string(),
],
AskForApproval::OnRequest,
&SandboxPolicy::DangerFullAccess,
SandboxPermissions::UseDefault,
)
.await;
assert_eq!(
requirement,
ExecApprovalRequirement::Forbidden {
reason: "`rm -rf /some/important/folder` rejected: destructive command".to_string()
}
);
}
@@ -606,7 +700,7 @@ prefix_rule(pattern=["rm"], decision="forbidden")
assert_eq!(
requirement,
ExecApprovalRequirement::NeedsApproval {
reason: Some(PROMPT_REASON.to_string()),
reason: Some("`rm` requires approval by policy".to_string()),
proposed_execpolicy_amendment: None,
}
);
@@ -824,7 +918,7 @@ prefix_rule(pattern=["rm"], decision="forbidden")
assert_eq!(
requirement,
ExecApprovalRequirement::NeedsApproval {
reason: Some(PROMPT_REASON.to_string()),
reason: Some("`rm` requires approval by policy".to_string()),
proposed_execpolicy_amendment: None,
}
);

View File

@@ -72,8 +72,11 @@ pub enum Feature {
UnifiedExec,
/// Include the freeform apply_patch tool.
ApplyPatchFreeform,
/// Allow the model to request web searches.
/// Allow the model to request web searches that fetch live content.
WebSearchRequest,
/// Allow the model to request web searches that fetch cached content.
/// Takes precedence over `WebSearchRequest`.
WebSearchCached,
/// Gate the execpolicy enforcement for shell/unified exec.
ExecPolicy,
/// Enable Windows sandbox (restricted token) on Windows.
@@ -330,6 +333,12 @@ pub const FEATURES: &[FeatureSpec] = &[
stage: Stage::Stable,
default_enabled: false,
},
FeatureSpec {
id: Feature::WebSearchCached,
key: "web_search_cached",
stage: Stage::Experimental,
default_enabled: false,
},
// Beta program. Rendered in the `/experimental` menu for users.
FeatureSpec {
id: Feature::UnifiedExec,
@@ -337,7 +346,7 @@ pub const FEATURES: &[FeatureSpec] = &[
stage: Stage::Beta {
name: "Background terminal",
menu_description: "Run long-running terminal commands in the background.",
announcement: "NEW! Try Background terminals for long running processes. Enable in /experimental!",
announcement: "NEW! Try Background terminals for long-running commands. Enable in /experimental!",
},
default_enabled: false,
},

View File

@@ -3,4 +3,5 @@ use env_flags::env_flags;
env_flags! {
/// Fixture path for offline tests (see client.rs).
pub CODEX_RS_SSE_FIXTURE: Option<&str> = None;
pub CODEX_RS_RESPONSES_WS: bool = false;
}

View File

@@ -12,9 +12,10 @@ pub mod bash;
mod client;
mod client_common;
pub mod codex;
mod codex_conversation;
mod codex_thread;
mod compact_remote;
pub use codex_conversation::CodexConversation;
pub use codex_thread::CodexThread;
mod agent;
mod codex_delegate;
mod command_safety;
pub mod config;
@@ -59,18 +60,25 @@ pub use model_provider_info::OLLAMA_OSS_PROVIDER_ID;
pub use model_provider_info::WireApi;
pub use model_provider_info::built_in_model_providers;
pub use model_provider_info::create_oss_provider_with_base_url;
mod conversation_manager;
mod event_mapping;
pub mod review_format;
pub mod review_prompts;
mod thread_manager;
pub use codex_protocol::protocol::InitialHistory;
pub use conversation_manager::ConversationManager;
pub use conversation_manager::NewConversation;
pub use thread_manager::NewThread;
pub use thread_manager::ThreadManager;
#[deprecated(note = "use ThreadManager")]
pub type ConversationManager = ThreadManager;
#[deprecated(note = "use NewThread")]
pub type NewConversation = NewThread;
#[deprecated(note = "use CodexThread")]
pub type CodexConversation = CodexThread;
// Re-export common auth types for workspace consumers
pub use auth::AuthManager;
pub use auth::CodexAuth;
pub mod default_client;
pub mod project_doc;
mod responses_ws;
mod rollout;
pub(crate) mod safety;
pub mod seatbelt;
@@ -86,10 +94,12 @@ pub use rollout::INTERACTIVE_SESSION_SOURCES;
pub use rollout::RolloutRecorder;
pub use rollout::SESSIONS_SUBDIR;
pub use rollout::SessionMeta;
#[deprecated(note = "use find_thread_path_by_id_str")]
pub use rollout::find_conversation_path_by_id_str;
pub use rollout::list::ConversationItem;
pub use rollout::list::ConversationsPage;
pub use rollout::find_thread_path_by_id_str;
pub use rollout::list::Cursor;
pub use rollout::list::ThreadItem;
pub use rollout::list::ThreadsPage;
pub use rollout::list::parse_cursor;
pub use rollout::list::read_head_for_summary;
mod function_tool;
@@ -125,5 +135,6 @@ pub use codex_protocol::models::LocalShellStatus;
pub use codex_protocol::models::ResponseItem;
pub use compact::content_items_to_text;
pub use event_mapping::parse_turn_item;
pub use responses_ws::ResponsesWsManager;
pub mod compact;
pub mod otel_init;

View File

@@ -13,6 +13,8 @@
//! trailing `\n`) and write it with a **single `write(2)` system call** while
//! the file descriptor is opened with the `O_APPEND` flag. POSIX guarantees
//! that writes up to `PIPE_BUF` bytes are atomic in that case.
//! Note: `conversation_id` stores the thread id; the field name is preserved for
//! backwards compatibility with existing history files.
use std::fs::File;
use std::fs::OpenOptions;
@@ -36,7 +38,7 @@ use tokio::io::AsyncReadExt;
use crate::config::Config;
use crate::config::types::HistoryPersistence;
use codex_protocol::ConversationId;
use codex_protocol::ThreadId;
#[cfg(unix)]
use std::os::unix::fs::OpenOptionsExt;
#[cfg(unix)]
@@ -69,7 +71,7 @@ fn history_filepath(config: &Config) -> PathBuf {
/// which entails a small amount of blocking I/O internally.
pub(crate) async fn append_entry(
text: &str,
conversation_id: &ConversationId,
conversation_id: &ThreadId,
config: &Config,
) -> Result<()> {
match config.history.persistence {
@@ -402,7 +404,7 @@ fn history_log_id(_metadata: &std::fs::Metadata) -> Option<u64> {
mod tests {
use super::*;
use crate::config::ConfigBuilder;
use codex_protocol::ConversationId;
use codex_protocol::ThreadId;
use pretty_assertions::assert_eq;
use std::fs::File;
use std::io::Write;
@@ -497,7 +499,7 @@ mod tests {
.await
.expect("load config");
let conversation_id = ConversationId::new();
let conversation_id = ThreadId::new();
let entry_one = "a".repeat(200);
let entry_two = "b".repeat(200);
@@ -544,7 +546,7 @@ mod tests {
.await
.expect("load config");
let conversation_id = ConversationId::new();
let conversation_id = ThreadId::new();
let short_entry = "a".repeat(200);
let long_entry = "b".repeat(400);

View File

@@ -24,7 +24,7 @@ use crate::default_client::build_reqwest_client;
use crate::error::Result as CoreResult;
use crate::features::Feature;
use crate::model_provider_info::ModelProviderInfo;
use crate::models_manager::model_family::ModelFamily;
use crate::models_manager::model_info;
use crate::models_manager::model_presets::builtin_model_presets;
const MODEL_CACHE_FILE: &str = "models_cache.json";
@@ -36,7 +36,6 @@ const CODEX_AUTO_BALANCED_MODEL: &str = "codex-auto-balanced";
/// Coordinates remote model discovery plus cached metadata on disk.
#[derive(Debug)]
pub struct ModelsManager {
// todo(aibrahim) merge available_models and model family creation into one struct
local_models: Vec<ModelPreset>,
remote_models: RwLock<Vec<ModelInfo>>,
auth_manager: Arc<AuthManager>,
@@ -48,8 +47,7 @@ pub struct ModelsManager {
impl ModelsManager {
/// Construct a manager scoped to the provided `AuthManager`.
pub fn new(auth_manager: Arc<AuthManager>) -> Self {
let codex_home = auth_manager.codex_home().to_path_buf();
pub fn new(codex_home: PathBuf, auth_manager: Arc<AuthManager>) -> Self {
Self {
local_models: builtin_model_presets(auth_manager.get_auth_mode()),
remote_models: RwLock::new(Self::load_remote_models_from_file().unwrap_or_default()),
@@ -63,8 +61,11 @@ impl ModelsManager {
#[cfg(any(test, feature = "test-support"))]
/// Construct a manager scoped to the provided `AuthManager` with a specific provider. Used for integration tests.
pub fn with_provider(auth_manager: Arc<AuthManager>, provider: ModelProviderInfo) -> Self {
let codex_home = auth_manager.codex_home().to_path_buf();
pub fn with_provider(
codex_home: PathBuf,
auth_manager: Arc<AuthManager>,
provider: ModelProviderInfo,
) -> Self {
Self {
local_models: builtin_model_presets(auth_manager.get_auth_mode()),
remote_models: RwLock::new(Self::load_remote_models_from_file().unwrap_or_default()),
@@ -128,15 +129,19 @@ impl ModelsManager {
Ok(self.build_available_models(remote_models))
}
fn find_family_for_model(slug: &str) -> ModelFamily {
super::model_family::find_family_for_model(slug)
}
/// Look up the requested model family while applying remote metadata overrides.
pub async fn construct_model_family(&self, model: &str, config: &Config) -> ModelFamily {
Self::find_family_for_model(model)
.with_remote_overrides(self.remote_models(config).await)
.with_config_overrides(config)
/// Look up the requested model metadata while applying remote metadata overrides.
pub async fn construct_model_info(&self, model: &str, config: &Config) -> ModelInfo {
let remote = self
.remote_models(config)
.await
.into_iter()
.find(|m| m.slug == model);
let model = if let Some(remote) = remote {
remote
} else {
model_info::find_model_info_for_slug(model)
};
model_info::with_config_overrides(model, config)
}
pub async fn get_model(&self, model: &Option<String>, config: &Config) -> String {
@@ -149,14 +154,14 @@ impl ModelsManager {
// if codex-auto-balanced exists & signed in with chatgpt mode, return it, otherwise return the default model
let auth_mode = self.auth_manager.get_auth_mode();
let remote_models = self.remote_models(config).await;
if auth_mode == Some(AuthMode::ChatGPT)
&& self
if auth_mode == Some(AuthMode::ChatGPT) {
let has_auto_balanced = self
.build_available_models(remote_models)
.iter()
.any(|m| m.model == CODEX_AUTO_BALANCED_MODEL)
{
return CODEX_AUTO_BALANCED_MODEL.to_string();
} else if auth_mode == Some(AuthMode::ChatGPT) {
.any(|model| model.model == CODEX_AUTO_BALANCED_MODEL && model.show_in_picker);
if has_auto_balanced {
return CODEX_AUTO_BALANCED_MODEL.to_string();
}
return OPENAI_DEFAULT_CHATGPT_MODEL.to_string();
}
OPENAI_DEFAULT_API_MODEL.to_string()
@@ -180,9 +185,9 @@ impl ModelsManager {
}
#[cfg(any(test, feature = "test-support"))]
/// Offline helper that builds a `ModelFamily` without consulting remote state.
pub fn construct_model_family_offline(model: &str, config: &Config) -> ModelFamily {
Self::find_family_for_model(model).with_config_overrides(config)
/// Offline helper that builds a `ModelInfo` without consulting remote state.
pub fn construct_model_info_offline(model: &str, config: &Config) -> ModelInfo {
model_info::with_config_overrides(model_info::find_model_info_for_slug(model), config)
}
async fn get_etag(&self) -> Option<String> {
@@ -247,10 +252,15 @@ impl ModelsManager {
merged_presets = self.filter_visible_models(merged_presets);
let has_default = merged_presets.iter().any(|preset| preset.is_default);
if let Some(default) = merged_presets.first_mut()
&& !has_default
{
default.is_default = true;
if !has_default {
if let Some(default) = merged_presets
.iter_mut()
.find(|preset| preset.show_in_picker)
{
default.is_default = true;
} else if let Some(default) = merged_presets.first_mut() {
default.is_default = true;
}
}
merged_presets
@@ -260,7 +270,7 @@ impl ModelsManager {
let chatgpt_mode = self.auth_manager.get_auth_mode() == Some(AuthMode::ChatGPT);
models
.into_iter()
.filter(|model| model.show_in_picker && (chatgpt_mode || model.supported_in_api))
.filter(|model| chatgpt_mode || model.supported_in_api)
.collect()
}
@@ -358,14 +368,14 @@ mod tests {
"supported_in_api": true,
"priority": priority,
"upgrade": null,
"base_instructions": null,
"base_instructions": "base instructions",
"supports_reasoning_summaries": false,
"support_verbosity": false,
"default_verbosity": null,
"apply_patch_tool_type": null,
"truncation_policy": {"mode": "bytes", "limit": 10_000},
"supports_parallel_tool_calls": false,
"context_window": null,
"context_window": 272_000,
"experimental_supported_tools": [],
}))
.expect("valid model")
@@ -414,7 +424,8 @@ mod tests {
let auth_manager =
AuthManager::from_auth_for_testing(CodexAuth::create_dummy_chatgpt_auth_for_testing());
let provider = provider_for(server.uri());
let manager = ModelsManager::with_provider(auth_manager, provider);
let manager =
ModelsManager::with_provider(codex_home.path().to_path_buf(), auth_manager, provider);
manager
.refresh_available_models_with_cache(&config)
@@ -473,7 +484,8 @@ mod tests {
AuthCredentialsStoreMode::File,
));
let provider = provider_for(server.uri());
let manager = ModelsManager::with_provider(auth_manager, provider);
let manager =
ModelsManager::with_provider(codex_home.path().to_path_buf(), auth_manager, provider);
manager
.refresh_available_models_with_cache(&config)
@@ -527,7 +539,8 @@ mod tests {
AuthCredentialsStoreMode::File,
));
let provider = provider_for(server.uri());
let manager = ModelsManager::with_provider(auth_manager, provider);
let manager =
ModelsManager::with_provider(codex_home.path().to_path_buf(), auth_manager, provider);
manager
.refresh_available_models_with_cache(&config)
@@ -597,7 +610,8 @@ mod tests {
let auth_manager =
AuthManager::from_auth_for_testing(CodexAuth::create_dummy_chatgpt_auth_for_testing());
let provider = provider_for(server.uri());
let mut manager = ModelsManager::with_provider(auth_manager, provider);
let mut manager =
ModelsManager::with_provider(codex_home.path().to_path_buf(), auth_manager, provider);
manager.cache_ttl = Duration::ZERO;
manager
@@ -645,21 +659,24 @@ mod tests {
#[test]
fn build_available_models_picks_default_after_hiding_hidden_models() {
let codex_home = tempdir().expect("temp dir");
let auth_manager =
AuthManager::from_auth_for_testing(CodexAuth::from_api_key("Test API Key"));
let provider = provider_for("http://example.test".to_string());
let mut manager = ModelsManager::with_provider(auth_manager, provider);
let mut manager =
ModelsManager::with_provider(codex_home.path().to_path_buf(), auth_manager, provider);
manager.local_models = Vec::new();
let hidden_model = remote_model_with_visibility("hidden", "Hidden", 0, "hide");
let visible_model = remote_model_with_visibility("visible", "Visible", 1, "list");
let mut expected = ModelPreset::from(visible_model.clone());
expected.is_default = true;
let expected_hidden = ModelPreset::from(hidden_model.clone());
let mut expected_visible = ModelPreset::from(visible_model.clone());
expected_visible.is_default = true;
let available = manager.build_available_models(vec![hidden_model, visible_model]);
assert_eq!(available, vec![expected]);
assert_eq!(available, vec![expected_hidden, expected_visible]);
}
#[test]

View File

@@ -1,4 +1,4 @@
pub mod cache;
pub mod manager;
pub mod model_family;
pub mod model_info;
pub mod model_presets;

View File

@@ -1,557 +0,0 @@
use codex_protocol::config_types::Verbosity;
use codex_protocol::openai_models::ApplyPatchToolType;
use codex_protocol::openai_models::ConfigShellToolType;
use codex_protocol::openai_models::ModelInfo;
use codex_protocol::openai_models::ReasoningEffort;
use crate::config::Config;
use crate::truncate::TruncationPolicy;
/// The `instructions` field in the payload sent to a model should always start
/// with this content.
const BASE_INSTRUCTIONS: &str = include_str!("../../prompt.md");
const GPT_5_CODEX_INSTRUCTIONS: &str = include_str!("../../gpt_5_codex_prompt.md");
const GPT_5_1_INSTRUCTIONS: &str = include_str!("../../gpt_5_1_prompt.md");
const GPT_5_2_INSTRUCTIONS: &str = include_str!("../../gpt_5_2_prompt.md");
const GPT_5_1_CODEX_MAX_INSTRUCTIONS: &str = include_str!("../../gpt-5.1-codex-max_prompt.md");
const GPT_5_2_CODEX_INSTRUCTIONS: &str = include_str!("../../gpt-5.2-codex_prompt.md");
pub(crate) const CONTEXT_WINDOW_272K: i64 = 272_000;
/// A model family is a group of models that share certain characteristics.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ModelFamily {
/// The full model slug used to derive this model family, e.g.
/// "gpt-4.1-2025-04-14".
pub slug: String,
/// The model family name, e.g. "gpt-4.1". This string is used when deriving
/// default metadata for the family, such as context windows.
pub family: String,
/// True if the model needs additional instructions on how to use the
/// "virtual" `apply_patch` CLI.
pub needs_special_apply_patch_instructions: bool,
/// Maximum supported context window, if known.
pub context_window: Option<i64>,
/// Token threshold for automatic compaction if config does not override it.
auto_compact_token_limit: Option<i64>,
// Whether the `reasoning` field can be set when making a request to this
// model family. Note it has `effort` and `summary` subfields (though
// `summary` is optional).
pub supports_reasoning_summaries: bool,
// The reasoning effort to use for this model family when none is explicitly chosen.
pub default_reasoning_effort: Option<ReasoningEffort>,
/// Whether this model supports parallel tool calls when using the
/// Responses API.
pub supports_parallel_tool_calls: bool,
/// Present if the model performs better when `apply_patch` is provided as
/// a tool call instead of just a bash command
pub apply_patch_tool_type: Option<ApplyPatchToolType>,
// Instructions to use for querying the model
pub base_instructions: String,
/// Names of beta tools that should be exposed to this model family.
pub experimental_supported_tools: Vec<String>,
/// Percentage of the context window considered usable for inputs, after
/// reserving headroom for system prompts, tool overhead, and model output.
/// This is applied when computing the effective context window seen by
/// consumers.
pub effective_context_window_percent: i64,
/// If the model family supports setting the verbosity level when using Responses API.
pub support_verbosity: bool,
// The default verbosity level for this model family when using Responses API.
pub default_verbosity: Option<Verbosity>,
/// Preferred shell tool type for this model family when features do not override it.
pub shell_type: ConfigShellToolType,
pub truncation_policy: TruncationPolicy,
}
impl ModelFamily {
pub(super) fn with_config_overrides(mut self, config: &Config) -> Self {
if let Some(supports_reasoning_summaries) = config.model_supports_reasoning_summaries {
self.supports_reasoning_summaries = supports_reasoning_summaries;
}
if let Some(context_window) = config.model_context_window {
self.context_window = Some(context_window);
}
if let Some(auto_compact_token_limit) = config.model_auto_compact_token_limit {
self.auto_compact_token_limit = Some(auto_compact_token_limit);
}
self
}
pub(super) fn with_remote_overrides(mut self, remote_models: Vec<ModelInfo>) -> Self {
for model in remote_models {
if model.slug == self.slug {
self.apply_remote_overrides(model);
}
}
self
}
fn apply_remote_overrides(&mut self, model: ModelInfo) {
let ModelInfo {
slug: _,
display_name: _,
description: _,
default_reasoning_level,
supported_reasoning_levels: _,
shell_type,
visibility: _,
supported_in_api: _,
priority: _,
upgrade: _,
base_instructions,
supports_reasoning_summaries,
support_verbosity,
default_verbosity,
apply_patch_tool_type,
truncation_policy,
supports_parallel_tool_calls,
context_window,
experimental_supported_tools,
} = model;
self.default_reasoning_effort = Some(default_reasoning_level);
self.shell_type = shell_type;
if let Some(base) = base_instructions {
self.base_instructions = base;
}
self.supports_reasoning_summaries = supports_reasoning_summaries;
self.support_verbosity = support_verbosity;
self.default_verbosity = default_verbosity;
self.apply_patch_tool_type = apply_patch_tool_type;
self.truncation_policy = truncation_policy.into();
self.supports_parallel_tool_calls = supports_parallel_tool_calls;
self.context_window = context_window;
self.experimental_supported_tools = experimental_supported_tools;
}
pub fn auto_compact_token_limit(&self) -> Option<i64> {
self.auto_compact_token_limit
.or(self.context_window.map(Self::default_auto_compact_limit))
}
const fn default_auto_compact_limit(context_window: i64) -> i64 {
(context_window * 9) / 10
}
pub fn get_model_slug(&self) -> &str {
&self.slug
}
}
macro_rules! model_family {
(
$slug:expr, $family:expr $(, $key:ident : $value:expr )* $(,)?
) => {{
// defaults
#[allow(unused_mut)]
let mut mf = ModelFamily {
slug: $slug.to_string(),
family: $family.to_string(),
needs_special_apply_patch_instructions: false,
context_window: Some(CONTEXT_WINDOW_272K),
auto_compact_token_limit: None,
supports_reasoning_summaries: false,
supports_parallel_tool_calls: false,
apply_patch_tool_type: None,
base_instructions: BASE_INSTRUCTIONS.to_string(),
experimental_supported_tools: Vec::new(),
effective_context_window_percent: 95,
support_verbosity: false,
shell_type: ConfigShellToolType::Default,
default_verbosity: None,
default_reasoning_effort: None,
truncation_policy: TruncationPolicy::Bytes(10_000),
};
// apply overrides
$(
mf.$key = $value;
)*
mf
}};
}
/// Internal offline helper for `ModelsManager` that returns a `ModelFamily` for the given
/// model slug.
#[allow(clippy::if_same_then_else)]
pub(super) fn find_family_for_model(slug: &str) -> ModelFamily {
if slug.starts_with("o3") {
model_family!(
slug, "o3",
supports_reasoning_summaries: true,
needs_special_apply_patch_instructions: true,
context_window: Some(200_000),
)
} else if slug.starts_with("o4-mini") {
model_family!(
slug, "o4-mini",
supports_reasoning_summaries: true,
needs_special_apply_patch_instructions: true,
context_window: Some(200_000),
)
} else if slug.starts_with("codex-mini-latest") {
model_family!(
slug, "codex-mini-latest",
supports_reasoning_summaries: true,
needs_special_apply_patch_instructions: true,
shell_type: ConfigShellToolType::Local,
context_window: Some(200_000),
)
} else if slug.starts_with("gpt-4.1") {
model_family!(
slug, "gpt-4.1",
needs_special_apply_patch_instructions: true,
context_window: Some(1_047_576),
)
} else if slug.starts_with("gpt-oss") || slug.starts_with("openai/gpt-oss") {
model_family!(
slug, "gpt-oss",
apply_patch_tool_type: Some(ApplyPatchToolType::Function),
context_window: Some(96_000),
)
} else if slug.starts_with("gpt-4o") {
model_family!(
slug, "gpt-4o",
needs_special_apply_patch_instructions: true,
context_window: Some(128_000),
)
} else if slug.starts_with("gpt-3.5") {
model_family!(
slug, "gpt-3.5",
needs_special_apply_patch_instructions: true,
context_window: Some(16_385),
)
} else if slug.starts_with("test-gpt-5") {
model_family!(
slug, slug,
supports_reasoning_summaries: true,
base_instructions: GPT_5_CODEX_INSTRUCTIONS.to_string(),
experimental_supported_tools: vec![
"grep_files".to_string(),
"list_dir".to_string(),
"read_file".to_string(),
"test_sync_tool".to_string(),
],
supports_parallel_tool_calls: true,
shell_type: ConfigShellToolType::ShellCommand,
support_verbosity: true,
truncation_policy: TruncationPolicy::Tokens(10_000),
)
// Experimental models.
} else if slug.starts_with("exp-codex") || slug.starts_with("codex-1p") {
// Same as gpt-5.1-codex-max.
model_family!(
slug, slug,
supports_reasoning_summaries: true,
base_instructions: GPT_5_2_CODEX_INSTRUCTIONS.to_string(),
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
shell_type: ConfigShellToolType::ShellCommand,
supports_parallel_tool_calls: true,
support_verbosity: false,
truncation_policy: TruncationPolicy::Tokens(10_000),
context_window: Some(CONTEXT_WINDOW_272K),
)
} else if slug.starts_with("exp-") {
model_family!(
slug, slug,
supports_reasoning_summaries: true,
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
support_verbosity: true,
default_verbosity: Some(Verbosity::Low),
base_instructions: BASE_INSTRUCTIONS.to_string(),
default_reasoning_effort: Some(ReasoningEffort::Medium),
truncation_policy: TruncationPolicy::Bytes(10_000),
shell_type: ConfigShellToolType::UnifiedExec,
supports_parallel_tool_calls: true,
context_window: Some(CONTEXT_WINDOW_272K),
)
// Production models.
} else if slug.starts_with("gpt-5.2-codex") {
model_family!(
slug, slug,
supports_reasoning_summaries: true,
base_instructions: GPT_5_2_CODEX_INSTRUCTIONS.to_string(),
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
shell_type: ConfigShellToolType::ShellCommand,
supports_parallel_tool_calls: true,
support_verbosity: false,
truncation_policy: TruncationPolicy::Tokens(10_000),
context_window: Some(CONTEXT_WINDOW_272K),
)
} else if slug.starts_with("bengalfox") {
model_family!(
slug, slug,
supports_reasoning_summaries: true,
base_instructions: GPT_5_2_CODEX_INSTRUCTIONS.to_string(),
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
shell_type: ConfigShellToolType::ShellCommand,
supports_parallel_tool_calls: true,
support_verbosity: false,
truncation_policy: TruncationPolicy::Tokens(10_000),
context_window: Some(CONTEXT_WINDOW_272K),
)
} else if slug.starts_with("gpt-5.1-codex-max") {
model_family!(
slug, slug,
supports_reasoning_summaries: true,
base_instructions: GPT_5_1_CODEX_MAX_INSTRUCTIONS.to_string(),
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
shell_type: ConfigShellToolType::ShellCommand,
supports_parallel_tool_calls: false,
support_verbosity: false,
truncation_policy: TruncationPolicy::Tokens(10_000),
context_window: Some(CONTEXT_WINDOW_272K),
)
} else if slug.starts_with("gpt-5-codex")
|| slug.starts_with("gpt-5.1-codex")
|| slug.starts_with("codex-")
{
model_family!(
slug, slug,
supports_reasoning_summaries: true,
base_instructions: GPT_5_CODEX_INSTRUCTIONS.to_string(),
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
shell_type: ConfigShellToolType::ShellCommand,
supports_parallel_tool_calls: false,
support_verbosity: false,
truncation_policy: TruncationPolicy::Tokens(10_000),
context_window: Some(CONTEXT_WINDOW_272K),
)
} else if slug.starts_with("gpt-5.2") {
model_family!(
slug, slug,
supports_reasoning_summaries: true,
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
support_verbosity: true,
default_verbosity: Some(Verbosity::Low),
base_instructions: GPT_5_2_INSTRUCTIONS.to_string(),
default_reasoning_effort: Some(ReasoningEffort::Medium),
truncation_policy: TruncationPolicy::Bytes(10_000),
shell_type: ConfigShellToolType::ShellCommand,
supports_parallel_tool_calls: true,
context_window: Some(CONTEXT_WINDOW_272K),
)
} else if slug.starts_with("boomslang") {
model_family!(
slug, slug,
supports_reasoning_summaries: true,
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
support_verbosity: true,
default_verbosity: Some(Verbosity::Low),
base_instructions: GPT_5_2_INSTRUCTIONS.to_string(),
default_reasoning_effort: Some(ReasoningEffort::Medium),
truncation_policy: TruncationPolicy::Bytes(10_000),
shell_type: ConfigShellToolType::ShellCommand,
supports_parallel_tool_calls: true,
context_window: Some(CONTEXT_WINDOW_272K),
)
} else if slug.starts_with("gpt-5.1") {
model_family!(
slug, "gpt-5.1",
supports_reasoning_summaries: true,
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
support_verbosity: true,
default_verbosity: Some(Verbosity::Low),
base_instructions: GPT_5_1_INSTRUCTIONS.to_string(),
default_reasoning_effort: Some(ReasoningEffort::Medium),
truncation_policy: TruncationPolicy::Bytes(10_000),
shell_type: ConfigShellToolType::ShellCommand,
supports_parallel_tool_calls: true,
context_window: Some(CONTEXT_WINDOW_272K),
)
} else if slug.starts_with("gpt-5") {
model_family!(
slug, "gpt-5",
supports_reasoning_summaries: true,
needs_special_apply_patch_instructions: true,
shell_type: ConfigShellToolType::Default,
support_verbosity: true,
truncation_policy: TruncationPolicy::Bytes(10_000),
context_window: Some(CONTEXT_WINDOW_272K),
)
} else {
derive_default_model_family(slug)
}
}
fn derive_default_model_family(model: &str) -> ModelFamily {
tracing::warn!("Unknown model {model} is used. This will degrade the performance of Codex.");
ModelFamily {
slug: model.to_string(),
family: model.to_string(),
needs_special_apply_patch_instructions: false,
context_window: None,
auto_compact_token_limit: None,
supports_reasoning_summaries: false,
supports_parallel_tool_calls: false,
apply_patch_tool_type: None,
base_instructions: BASE_INSTRUCTIONS.to_string(),
experimental_supported_tools: Vec::new(),
effective_context_window_percent: 95,
support_verbosity: false,
shell_type: ConfigShellToolType::Default,
default_verbosity: None,
default_reasoning_effort: None,
truncation_policy: TruncationPolicy::Bytes(10_000),
}
}
#[cfg(test)]
mod tests {
use super::*;
use codex_protocol::openai_models::ModelVisibility;
use codex_protocol::openai_models::ReasoningEffortPreset;
use codex_protocol::openai_models::TruncationPolicyConfig;
fn remote(slug: &str, effort: ReasoningEffort, shell: ConfigShellToolType) -> ModelInfo {
ModelInfo {
slug: slug.to_string(),
display_name: slug.to_string(),
description: Some(format!("{slug} desc")),
default_reasoning_level: effort,
supported_reasoning_levels: vec![ReasoningEffortPreset {
effort,
description: effort.to_string(),
}],
shell_type: shell,
visibility: ModelVisibility::List,
supported_in_api: true,
priority: 1,
upgrade: None,
base_instructions: None,
supports_reasoning_summaries: false,
support_verbosity: false,
default_verbosity: None,
apply_patch_tool_type: None,
truncation_policy: TruncationPolicyConfig::bytes(10_000),
supports_parallel_tool_calls: false,
context_window: None,
experimental_supported_tools: Vec::new(),
}
}
#[test]
fn remote_overrides_apply_when_slug_matches() {
let family = model_family!("gpt-4o-mini", "gpt-4o-mini");
assert_ne!(family.default_reasoning_effort, Some(ReasoningEffort::High));
let updated = family.with_remote_overrides(vec![
remote(
"gpt-4o-mini",
ReasoningEffort::High,
ConfigShellToolType::ShellCommand,
),
remote(
"other-model",
ReasoningEffort::Low,
ConfigShellToolType::UnifiedExec,
),
]);
assert_eq!(
updated.default_reasoning_effort,
Some(ReasoningEffort::High)
);
assert_eq!(updated.shell_type, ConfigShellToolType::ShellCommand);
}
#[test]
fn remote_overrides_skip_non_matching_models() {
let family = model_family!(
"codex-mini-latest",
"codex-mini-latest",
shell_type: ConfigShellToolType::Local
);
let updated = family.clone().with_remote_overrides(vec![remote(
"other",
ReasoningEffort::High,
ConfigShellToolType::ShellCommand,
)]);
assert_eq!(
updated.default_reasoning_effort,
family.default_reasoning_effort
);
assert_eq!(updated.shell_type, family.shell_type);
}
#[test]
fn remote_overrides_apply_extended_metadata() {
let family = model_family!(
"gpt-5.1",
"gpt-5.1",
supports_reasoning_summaries: false,
support_verbosity: false,
default_verbosity: None,
apply_patch_tool_type: Some(ApplyPatchToolType::Function),
supports_parallel_tool_calls: false,
experimental_supported_tools: vec!["local".to_string()],
truncation_policy: TruncationPolicy::Bytes(10_000),
context_window: Some(100),
);
let updated = family.with_remote_overrides(vec![ModelInfo {
slug: "gpt-5.1".to_string(),
display_name: "gpt-5.1".to_string(),
description: Some("desc".to_string()),
default_reasoning_level: ReasoningEffort::High,
supported_reasoning_levels: vec![ReasoningEffortPreset {
effort: ReasoningEffort::High,
description: "High".to_string(),
}],
shell_type: ConfigShellToolType::ShellCommand,
visibility: ModelVisibility::List,
supported_in_api: true,
priority: 10,
upgrade: None,
base_instructions: Some("Remote instructions".to_string()),
supports_reasoning_summaries: true,
support_verbosity: true,
default_verbosity: Some(Verbosity::High),
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
truncation_policy: TruncationPolicyConfig::tokens(2_000),
supports_parallel_tool_calls: true,
context_window: Some(400_000),
experimental_supported_tools: vec!["alpha".to_string(), "beta".to_string()],
}]);
assert_eq!(
updated.default_reasoning_effort,
Some(ReasoningEffort::High)
);
assert!(updated.supports_reasoning_summaries);
assert!(updated.support_verbosity);
assert_eq!(updated.default_verbosity, Some(Verbosity::High));
assert_eq!(updated.shell_type, ConfigShellToolType::ShellCommand);
assert_eq!(
updated.apply_patch_tool_type,
Some(ApplyPatchToolType::Freeform)
);
assert_eq!(updated.truncation_policy, TruncationPolicy::Tokens(2_000));
assert!(updated.supports_parallel_tool_calls);
assert_eq!(updated.context_window, Some(400_000));
assert_eq!(
updated.experimental_supported_tools,
vec!["alpha".to_string(), "beta".to_string()]
);
assert_eq!(updated.base_instructions, "Remote instructions");
}
}

View File

@@ -0,0 +1,348 @@
use codex_protocol::config_types::Verbosity;
use codex_protocol::openai_models::ApplyPatchToolType;
use codex_protocol::openai_models::ConfigShellToolType;
use codex_protocol::openai_models::ModelInfo;
use codex_protocol::openai_models::ModelVisibility;
use codex_protocol::openai_models::ReasoningEffort;
use codex_protocol::openai_models::ReasoningEffortPreset;
use codex_protocol::openai_models::TruncationPolicyConfig;
use crate::config::Config;
use tracing::warn;
const BASE_INSTRUCTIONS: &str = include_str!("../../prompt.md");
const BASE_INSTRUCTIONS_WITH_APPLY_PATCH: &str =
include_str!("../../prompt_with_apply_patch_instructions.md");
const GPT_5_CODEX_INSTRUCTIONS: &str = include_str!("../../gpt_5_codex_prompt.md");
const GPT_5_1_INSTRUCTIONS: &str = include_str!("../../gpt_5_1_prompt.md");
const GPT_5_2_INSTRUCTIONS: &str = include_str!("../../gpt_5_2_prompt.md");
const GPT_5_1_CODEX_MAX_INSTRUCTIONS: &str = include_str!("../../gpt-5.1-codex-max_prompt.md");
const GPT_5_2_CODEX_INSTRUCTIONS: &str = include_str!("../../gpt-5.2-codex_prompt.md");
pub(crate) const CONTEXT_WINDOW_272K: i64 = 272_000;
macro_rules! model_info {
(
$slug:expr $(, $key:ident : $value:expr )* $(,)?
) => {{
#[allow(unused_mut)]
let mut model = ModelInfo {
slug: $slug.to_string(),
display_name: $slug.to_string(),
description: None,
// This is primarily used when remote metadata is available. When running
// offline, core generally omits the effort field unless explicitly
// configured by the user.
default_reasoning_level: None,
supported_reasoning_levels: supported_reasoning_level_low_medium_high(),
shell_type: ConfigShellToolType::Default,
visibility: ModelVisibility::None,
supported_in_api: true,
priority: 99,
upgrade: None,
base_instructions: BASE_INSTRUCTIONS.to_string(),
supports_reasoning_summaries: false,
support_verbosity: false,
default_verbosity: None,
apply_patch_tool_type: None,
truncation_policy: TruncationPolicyConfig::bytes(10_000),
supports_parallel_tool_calls: false,
context_window: Some(CONTEXT_WINDOW_272K),
auto_compact_token_limit: None,
effective_context_window_percent: 95,
experimental_supported_tools: Vec::new(),
};
$(
model.$key = $value;
)*
model
}};
}
pub(crate) fn with_config_overrides(mut model: ModelInfo, config: &Config) -> ModelInfo {
if let Some(supports_reasoning_summaries) = config.model_supports_reasoning_summaries {
model.supports_reasoning_summaries = supports_reasoning_summaries;
}
if let Some(context_window) = config.model_context_window {
model.context_window = Some(context_window);
}
if let Some(auto_compact_token_limit) = config.model_auto_compact_token_limit {
model.auto_compact_token_limit = Some(auto_compact_token_limit);
}
model
}
// todo(aibrahim): remove most of the entries here when enabling models.json
pub(crate) fn find_model_info_for_slug(slug: &str) -> ModelInfo {
if slug.starts_with("o3") || slug.starts_with("o4-mini") {
model_info!(
slug,
base_instructions: BASE_INSTRUCTIONS_WITH_APPLY_PATCH.to_string(),
supports_reasoning_summaries: true,
context_window: Some(200_000),
)
} else if slug.starts_with("codex-mini-latest") {
model_info!(
slug,
base_instructions: BASE_INSTRUCTIONS_WITH_APPLY_PATCH.to_string(),
shell_type: ConfigShellToolType::Local,
supports_reasoning_summaries: true,
context_window: Some(200_000),
)
} else if slug.starts_with("gpt-4.1") {
model_info!(
slug,
base_instructions: BASE_INSTRUCTIONS_WITH_APPLY_PATCH.to_string(),
supports_reasoning_summaries: false,
context_window: Some(1_047_576),
)
} else if slug.starts_with("gpt-oss") || slug.starts_with("openai/gpt-oss") {
model_info!(
slug,
apply_patch_tool_type: Some(ApplyPatchToolType::Function),
context_window: Some(96_000),
)
} else if slug.starts_with("gpt-4o") {
model_info!(
slug,
base_instructions: BASE_INSTRUCTIONS_WITH_APPLY_PATCH.to_string(),
supports_reasoning_summaries: false,
context_window: Some(128_000),
)
} else if slug.starts_with("gpt-3.5") {
model_info!(
slug,
base_instructions: BASE_INSTRUCTIONS_WITH_APPLY_PATCH.to_string(),
supports_reasoning_summaries: false,
context_window: Some(16_385),
)
} else if slug.starts_with("test-gpt-5") {
model_info!(
slug,
base_instructions: GPT_5_CODEX_INSTRUCTIONS.to_string(),
experimental_supported_tools: vec![
"grep_files".to_string(),
"list_dir".to_string(),
"read_file".to_string(),
"test_sync_tool".to_string(),
],
supports_parallel_tool_calls: true,
supports_reasoning_summaries: true,
shell_type: ConfigShellToolType::ShellCommand,
support_verbosity: true,
truncation_policy: TruncationPolicyConfig::tokens(10_000),
)
} else if slug.starts_with("exp-codex") || slug.starts_with("codex-1p") {
model_info!(
slug,
base_instructions: GPT_5_2_CODEX_INSTRUCTIONS.to_string(),
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
shell_type: ConfigShellToolType::ShellCommand,
supports_parallel_tool_calls: true,
supports_reasoning_summaries: true,
support_verbosity: false,
truncation_policy: TruncationPolicyConfig::tokens(10_000),
context_window: Some(CONTEXT_WINDOW_272K),
)
} else if slug.starts_with("exp-") {
model_info!(
slug,
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
supports_reasoning_summaries: true,
support_verbosity: true,
default_verbosity: Some(Verbosity::Low),
base_instructions: BASE_INSTRUCTIONS.to_string(),
default_reasoning_level: Some(ReasoningEffort::Medium),
truncation_policy: TruncationPolicyConfig::bytes(10_000),
shell_type: ConfigShellToolType::UnifiedExec,
supports_parallel_tool_calls: true,
context_window: Some(CONTEXT_WINDOW_272K),
)
} else if slug.starts_with("gpt-5.2-codex") || slug.starts_with("bengalfox") {
model_info!(
slug,
base_instructions: GPT_5_2_CODEX_INSTRUCTIONS.to_string(),
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
shell_type: ConfigShellToolType::ShellCommand,
supports_parallel_tool_calls: true,
supports_reasoning_summaries: true,
support_verbosity: false,
truncation_policy: TruncationPolicyConfig::tokens(10_000),
context_window: Some(CONTEXT_WINDOW_272K),
supported_reasoning_levels: supported_reasoning_level_low_medium_high_xhigh(),
)
} else if slug.starts_with("gpt-5.1-codex-max") {
model_info!(
slug,
base_instructions: GPT_5_1_CODEX_MAX_INSTRUCTIONS.to_string(),
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
shell_type: ConfigShellToolType::ShellCommand,
supports_parallel_tool_calls: false,
supports_reasoning_summaries: true,
support_verbosity: false,
truncation_policy: TruncationPolicyConfig::tokens(10_000),
context_window: Some(CONTEXT_WINDOW_272K),
supported_reasoning_levels: supported_reasoning_level_low_medium_high_xhigh(),
)
} else if (slug.starts_with("gpt-5-codex")
|| slug.starts_with("gpt-5.1-codex")
|| slug.starts_with("codex-"))
&& !slug.contains("-mini")
{
model_info!(
slug,
base_instructions: GPT_5_CODEX_INSTRUCTIONS.to_string(),
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
shell_type: ConfigShellToolType::ShellCommand,
supports_parallel_tool_calls: false,
supports_reasoning_summaries: true,
support_verbosity: false,
truncation_policy: TruncationPolicyConfig::tokens(10_000),
context_window: Some(CONTEXT_WINDOW_272K),
supported_reasoning_levels: supported_reasoning_level_low_medium_high(),
)
} else if slug.starts_with("gpt-5-codex")
|| slug.starts_with("gpt-5.1-codex")
|| slug.starts_with("codex-")
{
model_info!(
slug,
base_instructions: GPT_5_CODEX_INSTRUCTIONS.to_string(),
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
shell_type: ConfigShellToolType::ShellCommand,
supports_parallel_tool_calls: false,
supports_reasoning_summaries: true,
support_verbosity: false,
truncation_policy: TruncationPolicyConfig::tokens(10_000),
context_window: Some(CONTEXT_WINDOW_272K),
)
} else if (slug.starts_with("gpt-5.2") || slug.starts_with("boomslang"))
&& !slug.contains("codex")
{
model_info!(
slug,
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
supports_reasoning_summaries: true,
support_verbosity: true,
default_verbosity: Some(Verbosity::Low),
base_instructions: GPT_5_2_INSTRUCTIONS.to_string(),
default_reasoning_level: Some(ReasoningEffort::Medium),
truncation_policy: TruncationPolicyConfig::bytes(10_000),
shell_type: ConfigShellToolType::ShellCommand,
supports_parallel_tool_calls: true,
context_window: Some(CONTEXT_WINDOW_272K),
supported_reasoning_levels: supported_reasoning_level_low_medium_high_xhigh_non_codex(),
)
} else if slug.starts_with("gpt-5.1") && !slug.contains("codex") {
model_info!(
slug,
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
supports_reasoning_summaries: true,
support_verbosity: true,
default_verbosity: Some(Verbosity::Low),
base_instructions: GPT_5_1_INSTRUCTIONS.to_string(),
default_reasoning_level: Some(ReasoningEffort::Medium),
truncation_policy: TruncationPolicyConfig::bytes(10_000),
shell_type: ConfigShellToolType::ShellCommand,
supports_parallel_tool_calls: true,
context_window: Some(CONTEXT_WINDOW_272K),
supported_reasoning_levels: supported_reasoning_level_low_medium_high_non_codex(),
)
} else if slug.starts_with("gpt-5") {
model_info!(
slug,
base_instructions: BASE_INSTRUCTIONS_WITH_APPLY_PATCH.to_string(),
shell_type: ConfigShellToolType::Default,
supports_reasoning_summaries: true,
support_verbosity: true,
truncation_policy: TruncationPolicyConfig::bytes(10_000),
context_window: Some(CONTEXT_WINDOW_272K),
)
} else {
warn!("Unknown model {slug} is used. This will degrade the performance of Codex.");
model_info!(
slug,
context_window: None,
supported_reasoning_levels: Vec::new(),
default_reasoning_level: None
)
}
}
fn supported_reasoning_level_low_medium_high() -> Vec<ReasoningEffortPreset> {
vec![
ReasoningEffortPreset {
effort: ReasoningEffort::Low,
description: "Fast responses with lighter reasoning".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::Medium,
description: "Balances speed and reasoning depth for everyday tasks".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::High,
description: "Greater reasoning depth for complex problems".to_string(),
},
]
}
fn supported_reasoning_level_low_medium_high_non_codex() -> Vec<ReasoningEffortPreset> {
vec![
ReasoningEffortPreset {
effort: ReasoningEffort::Low,
description: "Balances speed with some reasoning; useful for straightforward queries and short explanations".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::Medium,
description: "Provides a solid balance of reasoning depth and latency for general-purpose tasks".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::High,
description: "Maximizes reasoning depth for complex or ambiguous problems".to_string(),
},
]
}
fn supported_reasoning_level_low_medium_high_xhigh() -> Vec<ReasoningEffortPreset> {
vec![
ReasoningEffortPreset {
effort: ReasoningEffort::Low,
description: "Fast responses with lighter reasoning".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::Medium,
description: "Balances speed and reasoning depth for everyday tasks".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::High,
description: "Greater reasoning depth for complex problems".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::XHigh,
description: "Extra high reasoning depth for complex problems".to_string(),
},
]
}
fn supported_reasoning_level_low_medium_high_xhigh_non_codex() -> Vec<ReasoningEffortPreset> {
vec![
ReasoningEffortPreset {
effort: ReasoningEffort::Low,
description: "Balances speed with some reasoning; useful for straightforward queries and short explanations".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::Medium,
description: "Provides a solid balance of reasoning depth and latency for general-purpose tasks".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::High,
description: "Maximizes reasoning depth for complex or ambiguous problems".to_string(),
},
ReasoningEffortPreset {
effort: ReasoningEffort::XHigh,
description: "Extra high reasoning for complex problems".to_string(),
},
]
}

View File

@@ -112,7 +112,7 @@ static PRESETS: Lazy<Vec<ModelPreset>> = Lazy::new(|| {
},
ReasoningEffortPreset {
effort: ReasoningEffort::XHigh,
description: "Extra high reasoning for complex problems".to_string(),
description: "Extra high reasoning depth for complex problems".to_string(),
},
],
is_default: false,
@@ -170,7 +170,7 @@ static PRESETS: Lazy<Vec<ModelPreset>> = Lazy::new(|| {
},
ReasoningEffortPreset {
effort: ReasoningEffort::XHigh,
description: "Extra high reasoning for complex problems".to_string(),
description: "Extra high reasoning depth for complex problems".to_string(),
},
],
is_default: false,
@@ -322,11 +322,7 @@ fn gpt_52_codex_upgrade() -> ModelUpgrade {
}
pub(super) fn builtin_model_presets(_auth_mode: Option<AuthMode>) -> Vec<ModelPreset> {
PRESETS
.iter()
.filter(|preset| preset.show_in_picker)
.cloned()
.collect()
PRESETS.iter().cloned().collect()
}
#[cfg(any(test, feature = "test-support"))]

View File

@@ -513,9 +513,9 @@ mod tests {
)
.unwrap_or_else(|_| cfg.codex_home.join("skills/pdf-processing/SKILL.md"));
let expected_path_str = expected_path.to_string_lossy().replace('\\', "/");
let usage_rules = "- Discovery: Available skills are listed in project docs and may also appear in a runtime \"## Skills\" section (name + description + file path). These are the sources of truth; skill bodies live on disk at the listed paths.\n- Trigger rules: If the user names a skill (with `$SkillName` or plain text) OR the task clearly matches a skill's description, you must use that skill for that turn. Multiple mentions mean use them all. Do not carry skills across turns unless re-mentioned.\n- Missing/blocked: If a named skill isn't in the list or the path can't be read, say so briefly and continue with the best fallback.\n- How to use a skill (progressive disclosure):\n 1) After deciding to use a skill, open its `SKILL.md`. Read only enough to follow the workflow.\n 2) If `SKILL.md` points to extra folders such as `references/`, load only the specific files needed for the request; don't bulk-load everything.\n 3) If `scripts/` exist, prefer running or patching them instead of retyping large code blocks.\n 4) If `assets/` or templates exist, reuse them instead of recreating from scratch.\n- Description as trigger: The YAML `description` in `SKILL.md` is the primary trigger signal; rely on it to decide applicability. If unsure, ask a brief clarification before proceeding.\n- Coordination and sequencing:\n - If multiple skills apply, choose the minimal set that covers the request and state the order you'll use them.\n - Announce which skill(s) you're using and why (one short line). If you skip an obvious skill, say why.\n- Context hygiene:\n - Keep context small: summarize long sections instead of pasting them; only load extra files when needed.\n - Avoid deeply nested references; prefer one-hop files explicitly linked from `SKILL.md`.\n - When variants exist (frameworks, providers, domains), pick only the relevant reference file(s) and note that choice.\n- Safety and fallback: If a skill can't be applied cleanly (missing files, unclear instructions), state the issue, pick the next-best approach, and continue.";
let usage_rules = "- Discovery: The list above is the skills available in this session (name + description + file path). Skill bodies live on disk at the listed paths.\n- Trigger rules: If the user names a skill (with `$SkillName` or plain text) OR the task clearly matches a skill's description shown above, you must use that skill for that turn. Multiple mentions mean use them all. Do not carry skills across turns unless re-mentioned.\n- Missing/blocked: If a named skill isn't in the list or the path can't be read, say so briefly and continue with the best fallback.\n- How to use a skill (progressive disclosure):\n 1) After deciding to use a skill, open its `SKILL.md`. Read only enough to follow the workflow.\n 2) If `SKILL.md` points to extra folders such as `references/`, load only the specific files needed for the request; don't bulk-load everything.\n 3) If `scripts/` exist, prefer running or patching them instead of retyping large code blocks.\n 4) If `assets/` or templates exist, reuse them instead of recreating from scratch.\n- Coordination and sequencing:\n - If multiple skills apply, choose the minimal set that covers the request and state the order you'll use them.\n - Announce which skill(s) you're using and why (one short line). If you skip an obvious skill, say why.\n- Context hygiene:\n - Keep context small: summarize long sections instead of pasting them; only load extra files when needed.\n - Avoid deep reference-chasing: prefer opening only files directly linked from `SKILL.md` unless you're blocked.\n - When variants exist (frameworks, providers, domains), pick only the relevant reference file(s) and note that choice.\n- Safety and fallback: If a skill can't be applied cleanly (missing files, unclear instructions), state the issue, pick the next-best approach, and continue.";
let expected = format!(
"base doc\n\n## Skills\nThese skills are discovered at startup from multiple local sources. Each entry includes a name, description, and file path so you can open the source for full instructions.\n- pdf-processing: extract from pdfs (file: {expected_path_str})\n{usage_rules}"
"base doc\n\n## Skills\nA skill is a set of local instructions to follow that is stored in a `SKILL.md` file. Below is the list of skills that can be used. Each entry includes a name, description, and file path so you can open the source for full instructions when using a specific skill.\n### Available skills\n- pdf-processing: extract from pdfs (file: {expected_path_str})\n### How to use skills\n{usage_rules}"
);
assert_eq!(res, expected);
}
@@ -537,9 +537,9 @@ mod tests {
dunce::canonicalize(cfg.codex_home.join("skills/linting/SKILL.md").as_path())
.unwrap_or_else(|_| cfg.codex_home.join("skills/linting/SKILL.md"));
let expected_path_str = expected_path.to_string_lossy().replace('\\', "/");
let usage_rules = "- Discovery: Available skills are listed in project docs and may also appear in a runtime \"## Skills\" section (name + description + file path). These are the sources of truth; skill bodies live on disk at the listed paths.\n- Trigger rules: If the user names a skill (with `$SkillName` or plain text) OR the task clearly matches a skill's description, you must use that skill for that turn. Multiple mentions mean use them all. Do not carry skills across turns unless re-mentioned.\n- Missing/blocked: If a named skill isn't in the list or the path can't be read, say so briefly and continue with the best fallback.\n- How to use a skill (progressive disclosure):\n 1) After deciding to use a skill, open its `SKILL.md`. Read only enough to follow the workflow.\n 2) If `SKILL.md` points to extra folders such as `references/`, load only the specific files needed for the request; don't bulk-load everything.\n 3) If `scripts/` exist, prefer running or patching them instead of retyping large code blocks.\n 4) If `assets/` or templates exist, reuse them instead of recreating from scratch.\n- Description as trigger: The YAML `description` in `SKILL.md` is the primary trigger signal; rely on it to decide applicability. If unsure, ask a brief clarification before proceeding.\n- Coordination and sequencing:\n - If multiple skills apply, choose the minimal set that covers the request and state the order you'll use them.\n - Announce which skill(s) you're using and why (one short line). If you skip an obvious skill, say why.\n- Context hygiene:\n - Keep context small: summarize long sections instead of pasting them; only load extra files when needed.\n - Avoid deeply nested references; prefer one-hop files explicitly linked from `SKILL.md`.\n - When variants exist (frameworks, providers, domains), pick only the relevant reference file(s) and note that choice.\n- Safety and fallback: If a skill can't be applied cleanly (missing files, unclear instructions), state the issue, pick the next-best approach, and continue.";
let usage_rules = "- Discovery: The list above is the skills available in this session (name + description + file path). Skill bodies live on disk at the listed paths.\n- Trigger rules: If the user names a skill (with `$SkillName` or plain text) OR the task clearly matches a skill's description shown above, you must use that skill for that turn. Multiple mentions mean use them all. Do not carry skills across turns unless re-mentioned.\n- Missing/blocked: If a named skill isn't in the list or the path can't be read, say so briefly and continue with the best fallback.\n- How to use a skill (progressive disclosure):\n 1) After deciding to use a skill, open its `SKILL.md`. Read only enough to follow the workflow.\n 2) If `SKILL.md` points to extra folders such as `references/`, load only the specific files needed for the request; don't bulk-load everything.\n 3) If `scripts/` exist, prefer running or patching them instead of retyping large code blocks.\n 4) If `assets/` or templates exist, reuse them instead of recreating from scratch.\n- Coordination and sequencing:\n - If multiple skills apply, choose the minimal set that covers the request and state the order you'll use them.\n - Announce which skill(s) you're using and why (one short line). If you skip an obvious skill, say why.\n- Context hygiene:\n - Keep context small: summarize long sections instead of pasting them; only load extra files when needed.\n - Avoid deep reference-chasing: prefer opening only files directly linked from `SKILL.md` unless you're blocked.\n - When variants exist (frameworks, providers, domains), pick only the relevant reference file(s) and note that choice.\n- Safety and fallback: If a skill can't be applied cleanly (missing files, unclear instructions), state the issue, pick the next-best approach, and continue.";
let expected = format!(
"## Skills\nThese skills are discovered at startup from multiple local sources. Each entry includes a name, description, and file path so you can open the source for full instructions.\n- linting: run clippy (file: {expected_path_str})\n{usage_rules}"
"## Skills\nA skill is a set of local instructions to follow that is stored in a `SKILL.md` file. Below is the list of skills that can be used. Each entry includes a name, description, and file path so you can open the source for full instructions when using a specific skill.\n### Available skills\n- linting: run clippy (file: {expected_path_str})\n### How to use skills\n{usage_rules}"
);
assert_eq!(res, expected);
}

View File

@@ -0,0 +1,79 @@
use crate::api_bridge::CoreAuthProvider;
use codex_api::Prompt as ApiPrompt;
use codex_api::Provider;
use codex_api::ResponseStream;
use codex_api::ResponsesOptions;
use codex_api::ResponsesWsSession;
use codex_api::error::ApiError;
use tokio::sync::Mutex;
pub struct ResponsesWsManager {
session: Mutex<Option<ResponsesWsSession<CoreAuthProvider>>>,
base_url: Mutex<Option<String>>,
}
impl std::fmt::Debug for ResponsesWsManager {
fn fmt(&self, formatter: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
formatter.debug_struct("ResponsesWsManager").finish()
}
}
impl ResponsesWsManager {
pub(crate) fn new() -> Self {
Self {
session: Mutex::new(None),
base_url: Mutex::new(None),
}
}
pub(crate) async fn reset(&self) {
{
let mut guard = self.session.lock().await;
*guard = None;
}
let mut base_url = self.base_url.lock().await;
*base_url = None;
}
pub(crate) async fn stream_prompt(
&self,
provider: Provider,
auth: CoreAuthProvider,
model: &str,
prompt: &ApiPrompt,
options: ResponsesOptions,
) -> Result<ResponseStream, ApiError> {
let should_reset = self
.base_url
.lock()
.await
.as_ref()
.map(|url| url != &provider.base_url)
.unwrap_or(false);
if should_reset {
self.reset().await;
}
let existing = { self.session.lock().await.clone() };
let session = if let Some(session) = existing {
session
} else {
let session = ResponsesWsSession::new(provider.clone(), auth);
{
let mut guard = self.session.lock().await;
if guard.is_none() {
*guard = Some(session.clone());
let mut base_url = self.base_url.lock().await;
*base_url = Some(provider.base_url.clone());
}
}
session
};
let stream = session.stream_prompt(model, prompt, options).await;
if stream.is_err() {
self.reset().await;
}
stream
}
}

View File

@@ -33,7 +33,7 @@ fn map_rollout_io_error(io_err: &std::io::Error, codex_home: &Path) -> Option<Co
sessions_dir.display()
),
ErrorKind::InvalidData | ErrorKind::InvalidInput => format!(
"Session data under {} looks corrupt or unreadable. Clearing the sessions directory may help (this will remove saved conversations).",
"Session data under {} looks corrupt or unreadable. Clearing the sessions directory may help (this will remove saved threads).",
sessions_dir.display()
),
ErrorKind::IsADirectory | ErrorKind::NotADirectory => format!(

View File

@@ -20,11 +20,11 @@ use codex_protocol::protocol::RolloutItem;
use codex_protocol::protocol::RolloutLine;
use codex_protocol::protocol::SessionSource;
/// Returned page of conversation summaries.
/// Returned page of thread (thread) summaries.
#[derive(Debug, Default, PartialEq)]
pub struct ConversationsPage {
/// Conversation summaries ordered newest first.
pub items: Vec<ConversationItem>,
pub struct ThreadsPage {
/// Thread summaries ordered newest first.
pub items: Vec<ThreadItem>,
/// Opaque pagination token to resume after the last item, or `None` if end.
pub next_cursor: Option<Cursor>,
/// Total number of files touched while scanning this request.
@@ -33,9 +33,9 @@ pub struct ConversationsPage {
pub reached_scan_cap: bool,
}
/// Summary information for a conversation rollout file.
/// Summary information for a thread rollout file.
#[derive(Debug, PartialEq)]
pub struct ConversationItem {
pub struct ThreadItem {
/// Absolute path to the rollout file.
pub path: PathBuf,
/// First up to `HEAD_RECORD_LIMIT` JSONL records parsed as JSON (includes meta line).
@@ -46,6 +46,13 @@ pub struct ConversationItem {
pub updated_at: Option<String>,
}
#[allow(dead_code)]
#[deprecated(note = "use ThreadItem")]
pub type ConversationItem = ThreadItem;
#[allow(dead_code)]
#[deprecated(note = "use ThreadsPage")]
pub type ConversationsPage = ThreadsPage;
#[derive(Default)]
struct HeadTailSummary {
head: Vec<serde_json::Value>,
@@ -99,22 +106,22 @@ impl<'de> serde::Deserialize<'de> for Cursor {
}
}
/// Retrieve recorded conversation file paths with token pagination. The returned `next_cursor`
/// Retrieve recorded thread file paths with token pagination. The returned `next_cursor`
/// can be supplied on the next call to resume after the last returned item, resilient to
/// concurrent new sessions being appended. Ordering is stable by timestamp desc, then UUID desc.
pub(crate) async fn get_conversations(
pub(crate) async fn get_threads(
codex_home: &Path,
page_size: usize,
cursor: Option<&Cursor>,
allowed_sources: &[SessionSource],
model_providers: Option<&[String]>,
default_provider: &str,
) -> io::Result<ConversationsPage> {
) -> io::Result<ThreadsPage> {
let mut root = codex_home.to_path_buf();
root.push(SESSIONS_SUBDIR);
if !root.exists() {
return Ok(ConversationsPage {
return Ok(ThreadsPage {
items: Vec::new(),
next_cursor: None,
num_scanned_files: 0,
@@ -138,7 +145,7 @@ pub(crate) async fn get_conversations(
Ok(result)
}
/// Load conversation file paths from disk using directory traversal.
/// Load thread file paths from disk using directory traversal.
///
/// Directory layout: `~/.codex/sessions/YYYY/MM/DD/rollout-YYYY-MM-DDThh-mm-ss-<uuid>.jsonl`
/// Returned newest (latest) first.
@@ -148,8 +155,8 @@ async fn traverse_directories_for_paths(
anchor: Option<Cursor>,
allowed_sources: &[SessionSource],
provider_matcher: Option<&ProviderMatcher<'_>>,
) -> io::Result<ConversationsPage> {
let mut items: Vec<ConversationItem> = Vec::with_capacity(page_size);
) -> io::Result<ThreadsPage> {
let mut items: Vec<ThreadItem> = Vec::with_capacity(page_size);
let mut scanned_files = 0usize;
let mut anchor_passed = anchor.is_none();
let (anchor_ts, anchor_id) = match anchor {
@@ -232,7 +239,7 @@ async fn traverse_directories_for_paths(
.unwrap_or(None)
.or_else(|| created_at.clone());
}
items.push(ConversationItem {
items.push(ThreadItem {
path,
head,
created_at,
@@ -254,7 +261,7 @@ async fn traverse_directories_for_paths(
} else {
None
};
Ok(ConversationsPage {
Ok(ThreadsPage {
items,
next_cursor: next,
num_scanned_files: scanned_files,
@@ -279,7 +286,7 @@ pub fn parse_cursor(token: &str) -> Option<Cursor> {
Some(Cursor::new(ts, uuid))
}
fn build_next_cursor(items: &[ConversationItem]) -> Option<Cursor> {
fn build_next_cursor(items: &[ThreadItem]) -> Option<Cursor> {
let last = items.last()?;
let file_name = last.path.file_name()?.to_string_lossy();
let (ts, id) = parse_timestamp_uuid_from_filename(&file_name)?;
@@ -455,10 +462,10 @@ async fn file_modified_rfc3339(path: &Path) -> io::Result<Option<String>> {
Ok(dt.format(&Rfc3339).ok())
}
/// Locate a recorded conversation rollout file by its UUID string using the existing
/// Locate a recorded thread rollout file by its UUID string using the existing
/// paginated listing implementation. Returns `Ok(Some(path))` if found, `Ok(None)` if not present
/// or the id is invalid.
pub async fn find_conversation_path_by_id_str(
pub async fn find_thread_path_by_id_str(
codex_home: &Path,
id_str: &str,
) -> io::Result<Option<PathBuf>> {

View File

@@ -11,10 +11,13 @@ pub(crate) mod error;
pub mod list;
pub(crate) mod policy;
pub mod recorder;
pub(crate) mod truncation;
pub use codex_protocol::protocol::SessionMeta;
pub(crate) use error::map_session_init_error;
pub use list::find_conversation_path_by_id_str;
pub use list::find_thread_path_by_id_str;
#[deprecated(note = "use find_thread_path_by_id_str")]
pub use list::find_thread_path_by_id_str as find_conversation_path_by_id_str;
pub use recorder::RolloutRecorder;
pub use recorder::RolloutRecorderParams;

View File

@@ -45,6 +45,7 @@ pub(crate) fn should_persist_event_msg(ev: &EventMsg) -> bool {
| EventMsg::ContextCompacted(_)
| EventMsg::EnteredReviewMode(_)
| EventMsg::ExitedReviewMode(_)
| EventMsg::ThreadRolledBack(_)
| EventMsg::UndoCompleted(_)
| EventMsg::TurnAborted(_) => true,
EventMsg::Error(_)

View File

@@ -6,7 +6,7 @@ use std::io::Error as IoError;
use std::path::Path;
use std::path::PathBuf;
use codex_protocol::ConversationId;
use codex_protocol::ThreadId;
use serde_json::Value;
use time::OffsetDateTime;
use time::format_description::FormatItem;
@@ -19,9 +19,9 @@ use tracing::info;
use tracing::warn;
use super::SESSIONS_SUBDIR;
use super::list::ConversationsPage;
use super::list::Cursor;
use super::list::get_conversations;
use super::list::ThreadsPage;
use super::list::get_threads;
use super::policy::is_persisted_response_item;
use crate::config::Config;
use crate::default_client::originator;
@@ -52,7 +52,7 @@ pub struct RolloutRecorder {
#[derive(Clone)]
pub enum RolloutRecorderParams {
Create {
conversation_id: ConversationId,
conversation_id: ThreadId,
instructions: Option<String>,
source: SessionSource,
},
@@ -74,7 +74,7 @@ enum RolloutCmd {
impl RolloutRecorderParams {
pub fn new(
conversation_id: ConversationId,
conversation_id: ThreadId,
instructions: Option<String>,
source: SessionSource,
) -> Self {
@@ -91,16 +91,16 @@ impl RolloutRecorderParams {
}
impl RolloutRecorder {
/// List conversations (rollout files) under the provided Codex home directory.
pub async fn list_conversations(
/// List threads (rollout files) under the provided Codex home directory.
pub async fn list_threads(
codex_home: &Path,
page_size: usize,
cursor: Option<&Cursor>,
allowed_sources: &[SessionSource],
model_providers: Option<&[String]>,
default_provider: &str,
) -> std::io::Result<ConversationsPage> {
get_conversations(
) -> std::io::Result<ThreadsPage> {
get_threads(
codex_home,
page_size,
cursor,
@@ -215,7 +215,7 @@ impl RolloutRecorder {
}
let mut items: Vec<RolloutItem> = Vec::new();
let mut conversation_id: Option<ConversationId> = None;
let mut thread_id: Option<ThreadId> = None;
for line in text.lines() {
if line.trim().is_empty() {
continue;
@@ -233,9 +233,9 @@ impl RolloutRecorder {
Ok(rollout_line) => match rollout_line.item {
RolloutItem::SessionMeta(session_meta_line) => {
// Use the FIRST SessionMeta encountered in the file as the canonical
// conversation id and main session information. Keep all items intact.
if conversation_id.is_none() {
conversation_id = Some(session_meta_line.meta.id);
// thread id and main session information. Keep all items intact.
if thread_id.is_none() {
thread_id = Some(session_meta_line.meta.id);
}
items.push(RolloutItem::SessionMeta(session_meta_line));
}
@@ -259,12 +259,12 @@ impl RolloutRecorder {
}
info!(
"Resumed rollout with {} items, conversation ID: {:?}",
"Resumed rollout with {} items, thread ID: {:?}",
items.len(),
conversation_id
thread_id
);
let conversation_id = conversation_id
.ok_or_else(|| IoError::other("failed to parse conversation ID from rollout file"))?;
let conversation_id = thread_id
.ok_or_else(|| IoError::other("failed to parse thread ID from rollout file"))?;
if items.is_empty() {
return Ok(InitialHistory::New);
@@ -302,16 +302,13 @@ struct LogFileInfo {
path: PathBuf,
/// Session ID (also embedded in filename).
conversation_id: ConversationId,
conversation_id: ThreadId,
/// Timestamp for the start of the session.
timestamp: OffsetDateTime,
}
fn create_log_file(
config: &Config,
conversation_id: ConversationId,
) -> std::io::Result<LogFileInfo> {
fn create_log_file(config: &Config, conversation_id: ThreadId) -> std::io::Result<LogFileInfo> {
// Resolve ~/.codex/sessions/YYYY/MM/DD and create it if missing.
let timestamp = OffsetDateTime::now_local()
.map_err(|e| IoError::other(format!("failed to get local time: {e}")))?;

View File

@@ -13,12 +13,12 @@ use time::macros::format_description;
use uuid::Uuid;
use crate::rollout::INTERACTIVE_SESSION_SOURCES;
use crate::rollout::list::ConversationItem;
use crate::rollout::list::ConversationsPage;
use crate::rollout::list::Cursor;
use crate::rollout::list::get_conversations;
use crate::rollout::list::ThreadItem;
use crate::rollout::list::ThreadsPage;
use crate::rollout::list::get_threads;
use anyhow::Result;
use codex_protocol::ConversationId;
use codex_protocol::ThreadId;
use codex_protocol::models::ContentItem;
use codex_protocol::models::ResponseItem;
use codex_protocol::protocol::EventMsg;
@@ -162,7 +162,7 @@ async fn test_list_conversations_latest_first() {
.unwrap();
let provider_filter = provider_vec(&[TEST_PROVIDER]);
let page = get_conversations(
let page = get_threads(
home,
10,
None,
@@ -227,21 +227,21 @@ async fn test_list_conversations_latest_first() {
let updated_times: Vec<Option<String>> =
page.items.iter().map(|i| i.updated_at.clone()).collect();
let expected = ConversationsPage {
let expected = ThreadsPage {
items: vec![
ConversationItem {
ThreadItem {
path: p1,
head: head_3,
created_at: Some("2025-01-03T12-00-00".into()),
updated_at: updated_times.first().cloned().flatten(),
},
ConversationItem {
ThreadItem {
path: p2,
head: head_2,
created_at: Some("2025-01-02T12-00-00".into()),
updated_at: updated_times.get(1).cloned().flatten(),
},
ConversationItem {
ThreadItem {
path: p3,
head: head_1,
created_at: Some("2025-01-01T12-00-00".into()),
@@ -311,7 +311,7 @@ async fn test_pagination_cursor() {
.unwrap();
let provider_filter = provider_vec(&[TEST_PROVIDER]);
let page1 = get_conversations(
let page1 = get_threads(
home,
2,
None,
@@ -357,15 +357,15 @@ async fn test_pagination_cursor() {
page1.items.iter().map(|i| i.updated_at.clone()).collect();
let expected_cursor1: Cursor =
serde_json::from_str(&format!("\"2025-03-04T09-00-00|{u4}\"")).unwrap();
let expected_page1 = ConversationsPage {
let expected_page1 = ThreadsPage {
items: vec![
ConversationItem {
ThreadItem {
path: p5,
head: head_5,
created_at: Some("2025-03-05T09-00-00".into()),
updated_at: updated_page1.first().cloned().flatten(),
},
ConversationItem {
ThreadItem {
path: p4,
head: head_4,
created_at: Some("2025-03-04T09-00-00".into()),
@@ -378,7 +378,7 @@ async fn test_pagination_cursor() {
};
assert_eq!(page1, expected_page1);
let page2 = get_conversations(
let page2 = get_threads(
home,
2,
page1.next_cursor.as_ref(),
@@ -424,15 +424,15 @@ async fn test_pagination_cursor() {
page2.items.iter().map(|i| i.updated_at.clone()).collect();
let expected_cursor2: Cursor =
serde_json::from_str(&format!("\"2025-03-02T09-00-00|{u2}\"")).unwrap();
let expected_page2 = ConversationsPage {
let expected_page2 = ThreadsPage {
items: vec![
ConversationItem {
ThreadItem {
path: p3,
head: head_3,
created_at: Some("2025-03-03T09-00-00".into()),
updated_at: updated_page2.first().cloned().flatten(),
},
ConversationItem {
ThreadItem {
path: p2,
head: head_2,
created_at: Some("2025-03-02T09-00-00".into()),
@@ -445,7 +445,7 @@ async fn test_pagination_cursor() {
};
assert_eq!(page2, expected_page2);
let page3 = get_conversations(
let page3 = get_threads(
home,
2,
page2.next_cursor.as_ref(),
@@ -473,8 +473,8 @@ async fn test_pagination_cursor() {
})];
let updated_page3: Vec<Option<String>> =
page3.items.iter().map(|i| i.updated_at.clone()).collect();
let expected_page3 = ConversationsPage {
items: vec![ConversationItem {
let expected_page3 = ThreadsPage {
items: vec![ThreadItem {
path: p1,
head: head_1,
created_at: Some("2025-03-01T09-00-00".into()),
@@ -488,7 +488,7 @@ async fn test_pagination_cursor() {
}
#[tokio::test]
async fn test_get_conversation_contents() {
async fn test_get_thread_contents() {
let temp = TempDir::new().unwrap();
let home = temp.path();
@@ -497,7 +497,7 @@ async fn test_get_conversation_contents() {
write_session_file(home, ts, uuid, 2, Some(SessionSource::VSCode)).unwrap();
let provider_filter = provider_vec(&[TEST_PROVIDER]);
let page = get_conversations(
let page = get_threads(
home,
1,
None,
@@ -528,8 +528,8 @@ async fn test_get_conversation_contents() {
"source": "vscode",
"model_provider": "test-provider",
})];
let expected_page = ConversationsPage {
items: vec![ConversationItem {
let expected_page = ThreadsPage {
items: vec![ThreadItem {
path: expected_path,
head: expected_head,
created_at: Some(ts.into()),
@@ -579,7 +579,7 @@ async fn test_updated_at_uses_file_mtime() -> Result<()> {
let file_path = day_dir.join(format!("rollout-{ts}-{uuid}.jsonl"));
let mut file = File::create(&file_path)?;
let conversation_id = ConversationId::from_string(&uuid.to_string())?;
let conversation_id = ThreadId::from_string(&uuid.to_string())?;
let meta_line = RolloutLine {
timestamp: ts.to_string(),
item: RolloutItem::SessionMeta(SessionMetaLine {
@@ -624,7 +624,7 @@ async fn test_updated_at_uses_file_mtime() -> Result<()> {
drop(file);
let provider_filter = provider_vec(&[TEST_PROVIDER]);
let page = get_conversations(
let page = get_threads(
home,
1,
None,
@@ -663,7 +663,7 @@ async fn test_stable_ordering_same_second_pagination() {
write_session_file(home, ts, u3, 0, Some(SessionSource::VSCode)).unwrap();
let provider_filter = provider_vec(&[TEST_PROVIDER]);
let page1 = get_conversations(
let page1 = get_threads(
home,
2,
None,
@@ -701,15 +701,15 @@ async fn test_stable_ordering_same_second_pagination() {
let updated_page1: Vec<Option<String>> =
page1.items.iter().map(|i| i.updated_at.clone()).collect();
let expected_cursor1: Cursor = serde_json::from_str(&format!("\"{ts}|{u2}\"")).unwrap();
let expected_page1 = ConversationsPage {
let expected_page1 = ThreadsPage {
items: vec![
ConversationItem {
ThreadItem {
path: p3,
head: head(u3),
created_at: Some(ts.to_string()),
updated_at: updated_page1.first().cloned().flatten(),
},
ConversationItem {
ThreadItem {
path: p2,
head: head(u2),
created_at: Some(ts.to_string()),
@@ -722,7 +722,7 @@ async fn test_stable_ordering_same_second_pagination() {
};
assert_eq!(page1, expected_page1);
let page2 = get_conversations(
let page2 = get_threads(
home,
2,
page1.next_cursor.as_ref(),
@@ -740,8 +740,8 @@ async fn test_stable_ordering_same_second_pagination() {
.join(format!("rollout-2025-07-01T00-00-00-{u1}.jsonl"));
let updated_page2: Vec<Option<String>> =
page2.items.iter().map(|i| i.updated_at.clone()).collect();
let expected_page2 = ConversationsPage {
items: vec![ConversationItem {
let expected_page2 = ThreadsPage {
items: vec![ThreadItem {
path: p1,
head: head(u1),
created_at: Some(ts.to_string()),
@@ -780,7 +780,7 @@ async fn test_source_filter_excludes_non_matching_sessions() {
.unwrap();
let provider_filter = provider_vec(&[TEST_PROVIDER]);
let interactive_only = get_conversations(
let interactive_only = get_threads(
home,
10,
None,
@@ -801,7 +801,7 @@ async fn test_source_filter_excludes_non_matching_sessions() {
path.ends_with("rollout-2025-08-02T10-00-00-00000000-0000-0000-0000-00000000002a.jsonl")
}));
let all_sessions = get_conversations(home, 10, None, NO_SOURCE_FILTER, None, TEST_PROVIDER)
let all_sessions = get_threads(home, 10, None, NO_SOURCE_FILTER, None, TEST_PROVIDER)
.await
.unwrap();
let all_paths: Vec<_> = all_sessions
@@ -855,7 +855,7 @@ async fn test_model_provider_filter_selects_only_matching_sessions() -> Result<(
let openai_id_str = openai_id.to_string();
let none_id_str = none_id.to_string();
let openai_filter = provider_vec(&["openai"]);
let openai_sessions = get_conversations(
let openai_sessions = get_threads(
home,
10,
None,
@@ -880,7 +880,7 @@ async fn test_model_provider_filter_selects_only_matching_sessions() -> Result<(
assert!(openai_ids.contains(&none_id_str));
let beta_filter = provider_vec(&["beta"]);
let beta_sessions = get_conversations(
let beta_sessions = get_threads(
home,
10,
None,
@@ -900,7 +900,7 @@ async fn test_model_provider_filter_selects_only_matching_sessions() -> Result<(
assert_eq!(beta_head, Some(beta_id_str.as_str()));
let unknown_filter = provider_vec(&["unknown"]);
let unknown_sessions = get_conversations(
let unknown_sessions = get_threads(
home,
10,
None,
@@ -911,7 +911,7 @@ async fn test_model_provider_filter_selects_only_matching_sessions() -> Result<(
.await?;
assert!(unknown_sessions.items.is_empty());
let all_sessions = get_conversations(home, 10, None, NO_SOURCE_FILTER, None, "openai").await?;
let all_sessions = get_threads(home, 10, None, NO_SOURCE_FILTER, None, "openai").await?;
assert_eq!(all_sessions.items.len(), 3);
Ok(())

View File

@@ -0,0 +1,195 @@
//! Helpers for truncating rollouts based on "user turn" boundaries.
//!
//! In core, "user turns" are detected by scanning `ResponseItem::Message` items and
//! interpreting them via `event_mapping::parse_turn_item(...)`.
use crate::event_mapping;
use codex_protocol::items::TurnItem;
use codex_protocol::models::ResponseItem;
use codex_protocol::protocol::EventMsg;
use codex_protocol::protocol::RolloutItem;
/// Return the indices of user message boundaries in a rollout.
///
/// A user message boundary is a `RolloutItem::ResponseItem(ResponseItem::Message { .. })`
/// whose parsed turn item is `TurnItem::UserMessage`.
///
/// Rollouts can contain `ThreadRolledBack` markers. Those markers indicate that the
/// last N user turns were removed from the effective thread history; we apply them here so
/// indexing uses the post-rollback history rather than the raw stream.
pub(crate) fn user_message_positions_in_rollout(items: &[RolloutItem]) -> Vec<usize> {
let mut user_positions = Vec::new();
for (idx, item) in items.iter().enumerate() {
match item {
RolloutItem::ResponseItem(item @ ResponseItem::Message { .. })
if matches!(
event_mapping::parse_turn_item(item),
Some(TurnItem::UserMessage(_))
) =>
{
user_positions.push(idx);
}
RolloutItem::EventMsg(EventMsg::ThreadRolledBack(rollback)) => {
let num_turns = usize::try_from(rollback.num_turns).unwrap_or(usize::MAX);
let new_len = user_positions.len().saturating_sub(num_turns);
user_positions.truncate(new_len);
}
_ => {}
}
}
user_positions
}
/// Return a prefix of `items` obtained by cutting strictly before the nth user message.
///
/// The boundary index is 0-based from the start of `items` (so `n_from_start = 0` returns
/// a prefix that excludes the first user message and everything after it).
///
/// If fewer than or equal to `n_from_start` user messages exist, this returns an empty
/// vector (out of range).
pub(crate) fn truncate_rollout_before_nth_user_message_from_start(
items: &[RolloutItem],
n_from_start: usize,
) -> Vec<RolloutItem> {
let user_positions = user_message_positions_in_rollout(items);
// If fewer than or equal to n user messages exist, treat as empty (out of range).
if user_positions.len() <= n_from_start {
return Vec::new();
}
// Cut strictly before the nth user message (do not keep the nth itself).
let cut_idx = user_positions[n_from_start];
items[..cut_idx].to_vec()
}
#[cfg(test)]
mod tests {
use super::*;
use crate::codex::make_session_and_context;
use assert_matches::assert_matches;
use codex_protocol::models::ContentItem;
use codex_protocol::models::ReasoningItemReasoningSummary;
use codex_protocol::protocol::ThreadRolledBackEvent;
use pretty_assertions::assert_eq;
fn user_msg(text: &str) -> ResponseItem {
ResponseItem::Message {
id: None,
role: "user".to_string(),
content: vec![ContentItem::OutputText {
text: text.to_string(),
}],
}
}
fn assistant_msg(text: &str) -> ResponseItem {
ResponseItem::Message {
id: None,
role: "assistant".to_string(),
content: vec![ContentItem::OutputText {
text: text.to_string(),
}],
}
}
#[test]
fn truncates_rollout_from_start_before_nth_user_only() {
let items = [
user_msg("u1"),
assistant_msg("a1"),
assistant_msg("a2"),
user_msg("u2"),
assistant_msg("a3"),
ResponseItem::Reasoning {
id: "r1".to_string(),
summary: vec![ReasoningItemReasoningSummary::SummaryText {
text: "s".to_string(),
}],
content: None,
encrypted_content: None,
},
ResponseItem::FunctionCall {
id: None,
name: "tool".to_string(),
arguments: "{}".to_string(),
call_id: "c1".to_string(),
},
assistant_msg("a4"),
];
let rollout: Vec<RolloutItem> = items
.iter()
.cloned()
.map(RolloutItem::ResponseItem)
.collect();
let truncated = truncate_rollout_before_nth_user_message_from_start(&rollout, 1);
let expected = vec![
RolloutItem::ResponseItem(items[0].clone()),
RolloutItem::ResponseItem(items[1].clone()),
RolloutItem::ResponseItem(items[2].clone()),
];
assert_eq!(
serde_json::to_value(&truncated).unwrap(),
serde_json::to_value(&expected).unwrap()
);
let truncated2 = truncate_rollout_before_nth_user_message_from_start(&rollout, 2);
assert_matches!(truncated2.as_slice(), []);
}
#[test]
fn truncates_rollout_from_start_applies_thread_rollback_markers() {
let rollout_items = vec![
RolloutItem::ResponseItem(user_msg("u1")),
RolloutItem::ResponseItem(assistant_msg("a1")),
RolloutItem::ResponseItem(user_msg("u2")),
RolloutItem::ResponseItem(assistant_msg("a2")),
RolloutItem::EventMsg(EventMsg::ThreadRolledBack(ThreadRolledBackEvent {
num_turns: 1,
})),
RolloutItem::ResponseItem(user_msg("u3")),
RolloutItem::ResponseItem(assistant_msg("a3")),
RolloutItem::ResponseItem(user_msg("u4")),
RolloutItem::ResponseItem(assistant_msg("a4")),
];
// Effective user history after applying rollback(1) is: u1, u3, u4.
// So n_from_start=2 should cut before u4 (not u3).
let truncated = truncate_rollout_before_nth_user_message_from_start(&rollout_items, 2);
let expected = rollout_items[..7].to_vec();
assert_eq!(
serde_json::to_value(&truncated).unwrap(),
serde_json::to_value(&expected).unwrap()
);
}
#[tokio::test]
async fn ignores_session_prefix_messages_when_truncating_rollout_from_start() {
let (session, turn_context) = make_session_and_context().await;
let mut items = session.build_initial_context(&turn_context);
items.push(user_msg("feature request"));
items.push(assistant_msg("ack"));
items.push(user_msg("second question"));
items.push(assistant_msg("answer"));
let rollout_items: Vec<RolloutItem> = items
.iter()
.cloned()
.map(RolloutItem::ResponseItem)
.collect();
let truncated = truncate_rollout_before_nth_user_message_from_start(&rollout_items, 1);
let expected: Vec<RolloutItem> = vec![
RolloutItem::ResponseItem(items[0].clone()),
RolloutItem::ResponseItem(items[1].clone()),
RolloutItem::ResponseItem(items[2].clone()),
];
assert_eq!(
serde_json::to_value(&truncated).unwrap(),
serde_json::to_value(&expected).unwrap()
);
}
}

View File

@@ -1,9 +1,10 @@
use crate::config::Config;
use crate::git_info::resolve_root_git_project_for_trust;
use crate::config_loader::ConfigLayerStack;
use crate::skills::model::SkillError;
use crate::skills::model::SkillLoadOutcome;
use crate::skills::model::SkillMetadata;
use crate::skills::system::system_cache_root_dir;
use codex_app_server_protocol::ConfigLayerSource;
use codex_protocol::protocol::SkillScope;
use dunce::canonicalize as normalize_path;
use serde::Deserialize;
@@ -32,8 +33,6 @@ struct SkillFrontmatterMetadata {
const SKILLS_FILENAME: &str = "SKILL.md";
const SKILLS_DIR_NAME: &str = "skills";
const REPO_ROOT_CONFIG_DIR_NAME: &str = ".codex";
const ADMIN_SKILLS_ROOT: &str = "/etc/codex/skills";
const MAX_NAME_LEN: usize = 64;
const MAX_DESCRIPTION_LEN: usize = 1024;
const MAX_SHORT_DESCRIPTION_LEN: usize = MAX_DESCRIPTION_LEN;
@@ -88,86 +87,81 @@ where
.skills
.retain(|skill| seen.insert(skill.name.clone()));
outcome
.skills
.sort_by(|a, b| a.name.cmp(&b.name).then_with(|| a.path.cmp(&b.path)));
outcome
}
pub(crate) fn user_skills_root(codex_home: &Path) -> SkillRoot {
SkillRoot {
path: codex_home.join(SKILLS_DIR_NAME),
scope: SkillScope::User,
}
}
pub(crate) fn system_skills_root(codex_home: &Path) -> SkillRoot {
SkillRoot {
path: system_cache_root_dir(codex_home),
scope: SkillScope::System,
}
}
pub(crate) fn admin_skills_root() -> SkillRoot {
SkillRoot {
path: PathBuf::from(ADMIN_SKILLS_ROOT),
scope: SkillScope::Admin,
}
}
pub(crate) fn repo_skills_root(cwd: &Path) -> Option<SkillRoot> {
let base = if cwd.is_dir() { cwd } else { cwd.parent()? };
let base = normalize_path(base).unwrap_or_else(|_| base.to_path_buf());
let repo_root =
resolve_root_git_project_for_trust(&base).map(|root| normalize_path(&root).unwrap_or(root));
let scope = SkillScope::Repo;
if let Some(repo_root) = repo_root.as_deref() {
for dir in base.ancestors() {
let skills_root = dir.join(REPO_ROOT_CONFIG_DIR_NAME).join(SKILLS_DIR_NAME);
if skills_root.is_dir() {
return Some(SkillRoot {
path: skills_root,
scope,
});
}
if dir == repo_root {
break;
}
fn scope_rank(scope: SkillScope) -> u8 {
// Higher-priority scopes first (matches dedupe priority order).
match scope {
SkillScope::Repo => 0,
SkillScope::User => 1,
SkillScope::System => 2,
SkillScope::Admin => 3,
}
return None;
}
let skills_root = base.join(REPO_ROOT_CONFIG_DIR_NAME).join(SKILLS_DIR_NAME);
skills_root.is_dir().then_some(SkillRoot {
path: skills_root,
scope,
})
outcome.skills.sort_by(|a, b| {
scope_rank(a.scope)
.cmp(&scope_rank(b.scope))
.then_with(|| a.name.cmp(&b.name))
.then_with(|| a.path.cmp(&b.path))
});
outcome
}
pub(crate) fn skill_roots_for_cwd(codex_home: &Path, cwd: &Path) -> Vec<SkillRoot> {
fn skill_roots_from_layer_stack_inner(config_layer_stack: &ConfigLayerStack) -> Vec<SkillRoot> {
let mut roots = Vec::new();
if let Some(repo_root) = repo_skills_root(cwd) {
roots.push(repo_root);
}
for layer in config_layer_stack.layers_high_to_low() {
let Some(config_folder) = layer.config_folder() else {
continue;
};
// Load order matters: we dedupe by name, keeping the first occurrence.
// Priority order: repo, user, system, then admin.
roots.push(user_skills_root(codex_home));
roots.push(system_skills_root(codex_home));
if cfg!(unix) {
roots.push(admin_skills_root());
match &layer.name {
ConfigLayerSource::Project { .. } => {
roots.push(SkillRoot {
path: config_folder.as_path().join(SKILLS_DIR_NAME),
scope: SkillScope::Repo,
});
}
ConfigLayerSource::User { .. } => {
// `$CODEX_HOME/skills` (user-installed skills).
roots.push(SkillRoot {
path: config_folder.as_path().join(SKILLS_DIR_NAME),
scope: SkillScope::User,
});
// Embedded system skills are cached under `$CODEX_HOME/skills/.system` and are a
// special case (not a config layer).
roots.push(SkillRoot {
path: system_cache_root_dir(config_folder.as_path()),
scope: SkillScope::System,
});
}
ConfigLayerSource::System { .. } => {
// The system config layer lives under `/etc/codex/` on Unix, so treat
// `/etc/codex/skills` as admin-scoped skills.
roots.push(SkillRoot {
path: config_folder.as_path().join(SKILLS_DIR_NAME),
scope: SkillScope::Admin,
});
}
ConfigLayerSource::Mdm { .. }
| ConfigLayerSource::SessionFlags
| ConfigLayerSource::LegacyManagedConfigTomlFromFile { .. }
| ConfigLayerSource::LegacyManagedConfigTomlFromMdm => {}
}
}
roots
}
fn skill_roots(config: &Config) -> Vec<SkillRoot> {
skill_roots_for_cwd(&config.codex_home, &config.cwd)
skill_roots_from_layer_stack_inner(&config.config_layer_stack)
}
pub(crate) fn skill_roots_from_layer_stack(
config_layer_stack: &ConfigLayerStack,
) -> Vec<SkillRoot> {
skill_roots_from_layer_stack_inner(config_layer_stack)
}
fn discover_skills_under_root(root: &Path, scope: SkillScope, outcome: &mut SkillLoadOutcome) {
@@ -318,21 +312,91 @@ fn extract_frontmatter(contents: &str) -> Option<String> {
mod tests {
use super::*;
use crate::config::ConfigBuilder;
use crate::config::ConfigOverrides;
use crate::config_loader::ConfigLayerEntry;
use crate::config_loader::ConfigLayerStack;
use crate::config_loader::ConfigRequirements;
use codex_protocol::protocol::SkillScope;
use codex_utils_absolute_path::AbsolutePathBuf;
use pretty_assertions::assert_eq;
use std::path::Path;
use std::process::Command;
use tempfile::TempDir;
use toml::Value as TomlValue;
const REPO_ROOT_CONFIG_DIR_NAME: &str = ".codex";
async fn make_config(codex_home: &TempDir) -> Config {
let mut config = ConfigBuilder::default()
make_config_for_cwd(codex_home, codex_home.path().to_path_buf()).await
}
async fn make_config_for_cwd(codex_home: &TempDir, cwd: PathBuf) -> Config {
let harness_overrides = ConfigOverrides {
cwd: Some(cwd),
..Default::default()
};
ConfigBuilder::default()
.codex_home(codex_home.path().to_path_buf())
.harness_overrides(harness_overrides)
.build()
.await
.expect("defaults for test should always succeed");
.expect("defaults for test should always succeed")
}
config.cwd = codex_home.path().to_path_buf();
config
fn mark_as_git_repo(dir: &Path) {
// Config/project-root discovery only checks for the presence of `.git` (file or dir),
// so we can avoid shelling out to `git init` in tests.
fs::write(dir.join(".git"), "gitdir: fake\n").unwrap();
}
fn normalized(path: &Path) -> PathBuf {
normalize_path(path).unwrap_or_else(|_| path.to_path_buf())
}
#[test]
fn skill_roots_from_layer_stack_maps_user_to_user_and_system_cache_and_system_to_admin()
-> anyhow::Result<()> {
let tmp = tempfile::tempdir()?;
let system_folder = tmp.path().join("etc/codex");
let user_folder = tmp.path().join("home/codex");
fs::create_dir_all(&system_folder)?;
fs::create_dir_all(&user_folder)?;
// The file path doesn't need to exist; it's only used to derive the config folder.
let system_file = AbsolutePathBuf::from_absolute_path(system_folder.join("config.toml"))?;
let user_file = AbsolutePathBuf::from_absolute_path(user_folder.join("config.toml"))?;
let layers = vec![
ConfigLayerEntry::new(
ConfigLayerSource::System { file: system_file },
TomlValue::Table(toml::map::Map::new()),
),
ConfigLayerEntry::new(
ConfigLayerSource::User { file: user_file },
TomlValue::Table(toml::map::Map::new()),
),
];
let stack = ConfigLayerStack::new(layers, ConfigRequirements::default())?;
let got = skill_roots_from_layer_stack(&stack)
.into_iter()
.map(|root| (root.scope, root.path))
.collect::<Vec<_>>();
assert_eq!(
got,
vec![
(SkillScope::User, user_folder.join("skills")),
(
SkillScope::System,
user_folder.join("skills").join(".system")
),
(SkillScope::Admin, system_folder.join("skills")),
]
);
Ok(())
}
fn write_skill(codex_home: &TempDir, dir: &str, name: &str, description: &str) -> PathBuf {
@@ -368,7 +432,7 @@ mod tests {
#[tokio::test]
async fn loads_valid_skill() {
let codex_home = tempfile::tempdir().expect("tempdir");
write_skill(&codex_home, "demo", "demo-skill", "does things\ncarefully");
let skill_path = write_skill(&codex_home, "demo", "demo-skill", "does things\ncarefully");
let cfg = make_config(&codex_home).await;
let outcome = load_skills(&cfg);
@@ -377,15 +441,15 @@ mod tests {
"unexpected errors: {:?}",
outcome.errors
);
assert_eq!(outcome.skills.len(), 1);
let skill = &outcome.skills[0];
assert_eq!(skill.name, "demo-skill");
assert_eq!(skill.description, "does things carefully");
assert_eq!(skill.short_description, None);
let path_str = skill.path.to_string_lossy().replace('\\', "/");
assert!(
path_str.ends_with("skills/demo/SKILL.md"),
"unexpected path {path_str}"
assert_eq!(
outcome.skills,
vec![SkillMetadata {
name: "demo-skill".to_string(),
description: "does things carefully".to_string(),
short_description: None,
path: normalized(&skill_path),
scope: SkillScope::User,
}]
);
}
@@ -395,7 +459,8 @@ mod tests {
let skill_dir = codex_home.path().join("skills/demo");
fs::create_dir_all(&skill_dir).unwrap();
let contents = "---\nname: demo-skill\ndescription: long description\nmetadata:\n short-description: short summary\n---\n\n# Body\n";
fs::write(skill_dir.join(SKILLS_FILENAME), contents).unwrap();
let skill_path = skill_dir.join(SKILLS_FILENAME);
fs::write(&skill_path, contents).unwrap();
let cfg = make_config(&codex_home).await;
let outcome = load_skills(&cfg);
@@ -404,10 +469,15 @@ mod tests {
"unexpected errors: {:?}",
outcome.errors
);
assert_eq!(outcome.skills.len(), 1);
assert_eq!(
outcome.skills[0].short_description,
Some("short summary".to_string())
outcome.skills,
vec![SkillMetadata {
name: "demo-skill".to_string(),
description: "long description".to_string(),
short_description: Some("short summary".to_string()),
path: normalized(&skill_path),
scope: SkillScope::User,
}]
);
}
@@ -493,22 +563,14 @@ mod tests {
async fn loads_skills_from_repo_root() {
let codex_home = tempfile::tempdir().expect("tempdir");
let repo_dir = tempfile::tempdir().expect("tempdir");
let status = Command::new("git")
.arg("init")
.current_dir(repo_dir.path())
.status()
.expect("git init");
assert!(status.success(), "git init failed");
mark_as_git_repo(repo_dir.path());
let skills_root = repo_dir
.path()
.join(REPO_ROOT_CONFIG_DIR_NAME)
.join(SKILLS_DIR_NAME);
write_skill_at(&skills_root, "repo", "repo-skill", "from repo");
let mut cfg = make_config(&codex_home).await;
cfg.cwd = repo_dir.path().to_path_buf();
let repo_root = normalize_path(&skills_root).unwrap_or_else(|_| skills_root.clone());
let skill_path = write_skill_at(&skills_root, "repo", "repo-skill", "from repo");
let cfg = make_config_for_cwd(&codex_home, repo_dir.path().to_path_buf()).await;
let outcome = load_skills(&cfg);
assert!(
@@ -516,28 +578,28 @@ mod tests {
"unexpected errors: {:?}",
outcome.errors
);
assert_eq!(outcome.skills.len(), 1);
let skill = &outcome.skills[0];
assert_eq!(skill.name, "repo-skill");
assert!(skill.path.starts_with(&repo_root));
assert_eq!(
outcome.skills,
vec![SkillMetadata {
name: "repo-skill".to_string(),
description: "from repo".to_string(),
short_description: None,
path: normalized(&skill_path),
scope: SkillScope::Repo,
}]
);
}
#[tokio::test]
async fn loads_skills_from_nearest_codex_dir_under_repo_root() {
async fn loads_skills_from_all_codex_dirs_under_project_root() {
let codex_home = tempfile::tempdir().expect("tempdir");
let repo_dir = tempfile::tempdir().expect("tempdir");
let status = Command::new("git")
.arg("init")
.current_dir(repo_dir.path())
.status()
.expect("git init");
assert!(status.success(), "git init failed");
mark_as_git_repo(repo_dir.path());
let nested_dir = repo_dir.path().join("nested/inner");
fs::create_dir_all(&nested_dir).unwrap();
write_skill_at(
let root_skill_path = write_skill_at(
&repo_dir
.path()
.join(REPO_ROOT_CONFIG_DIR_NAME)
@@ -546,7 +608,7 @@ mod tests {
"root-skill",
"from root",
);
write_skill_at(
let nested_skill_path = write_skill_at(
&repo_dir
.path()
.join("nested")
@@ -557,8 +619,7 @@ mod tests {
"from nested",
);
let mut cfg = make_config(&codex_home).await;
cfg.cwd = nested_dir;
let cfg = make_config_for_cwd(&codex_home, nested_dir).await;
let outcome = load_skills(&cfg);
assert!(
@@ -566,8 +627,25 @@ mod tests {
"unexpected errors: {:?}",
outcome.errors
);
assert_eq!(outcome.skills.len(), 1);
assert_eq!(outcome.skills[0].name, "nested-skill");
assert_eq!(
outcome.skills,
vec![
SkillMetadata {
name: "nested-skill".to_string(),
description: "from nested".to_string(),
short_description: None,
path: normalized(&nested_skill_path),
scope: SkillScope::Repo,
},
SkillMetadata {
name: "root-skill".to_string(),
description: "from root".to_string(),
short_description: None,
path: normalized(&root_skill_path),
scope: SkillScope::Repo,
},
]
);
}
#[tokio::test]
@@ -575,7 +653,7 @@ mod tests {
let codex_home = tempfile::tempdir().expect("tempdir");
let work_dir = tempfile::tempdir().expect("tempdir");
write_skill_at(
let skill_path = write_skill_at(
&work_dir
.path()
.join(REPO_ROOT_CONFIG_DIR_NAME)
@@ -585,8 +663,7 @@ mod tests {
"from cwd",
);
let mut cfg = make_config(&codex_home).await;
cfg.cwd = work_dir.path().to_path_buf();
let cfg = make_config_for_cwd(&codex_home, work_dir.path().to_path_buf()).await;
let outcome = load_skills(&cfg);
assert!(
@@ -594,25 +671,26 @@ mod tests {
"unexpected errors: {:?}",
outcome.errors
);
assert_eq!(outcome.skills.len(), 1);
assert_eq!(outcome.skills[0].name, "local-skill");
assert_eq!(outcome.skills[0].scope, SkillScope::Repo);
assert_eq!(
outcome.skills,
vec![SkillMetadata {
name: "local-skill".to_string(),
description: "from cwd".to_string(),
short_description: None,
path: normalized(&skill_path),
scope: SkillScope::Repo,
}]
);
}
#[tokio::test]
async fn deduplicates_by_name_preferring_repo_over_user() {
let codex_home = tempfile::tempdir().expect("tempdir");
let repo_dir = tempfile::tempdir().expect("tempdir");
mark_as_git_repo(repo_dir.path());
let status = Command::new("git")
.arg("init")
.current_dir(repo_dir.path())
.status()
.expect("git init");
assert!(status.success(), "git init failed");
write_skill(&codex_home, "user", "dupe-skill", "from user");
write_skill_at(
let _user_skill_path = write_skill(&codex_home, "user", "dupe-skill", "from user");
let repo_skill_path = write_skill_at(
&repo_dir
.path()
.join(REPO_ROOT_CONFIG_DIR_NAME)
@@ -622,8 +700,7 @@ mod tests {
"from repo",
);
let mut cfg = make_config(&codex_home).await;
cfg.cwd = repo_dir.path().to_path_buf();
let cfg = make_config_for_cwd(&codex_home, repo_dir.path().to_path_buf()).await;
let outcome = load_skills(&cfg);
assert!(
@@ -631,17 +708,25 @@ mod tests {
"unexpected errors: {:?}",
outcome.errors
);
assert_eq!(outcome.skills.len(), 1);
assert_eq!(outcome.skills[0].name, "dupe-skill");
assert_eq!(outcome.skills[0].scope, SkillScope::Repo);
assert_eq!(
outcome.skills,
vec![SkillMetadata {
name: "dupe-skill".to_string(),
description: "from repo".to_string(),
short_description: None,
path: normalized(&repo_skill_path),
scope: SkillScope::Repo,
}]
);
}
#[tokio::test]
async fn loads_system_skills_when_present() {
let codex_home = tempfile::tempdir().expect("tempdir");
write_system_skill(&codex_home, "system", "dupe-skill", "from system");
write_skill(&codex_home, "user", "dupe-skill", "from user");
let _system_skill_path =
write_system_skill(&codex_home, "system", "dupe-skill", "from system");
let user_skill_path = write_skill(&codex_home, "user", "dupe-skill", "from user");
let cfg = make_config(&codex_home).await;
let outcome = load_skills(&cfg);
@@ -650,9 +735,16 @@ mod tests {
"unexpected errors: {:?}",
outcome.errors
);
assert_eq!(outcome.skills.len(), 1);
assert_eq!(outcome.skills[0].description, "from user");
assert_eq!(outcome.skills[0].scope, SkillScope::User);
assert_eq!(
outcome.skills,
vec![SkillMetadata {
name: "dupe-skill".to_string(),
description: "from user".to_string(),
short_description: None,
path: normalized(&user_skill_path),
scope: SkillScope::User,
}]
);
}
#[tokio::test]
@@ -672,15 +764,9 @@ mod tests {
"from outer",
);
let status = Command::new("git")
.arg("init")
.current_dir(&repo_dir)
.status()
.expect("git init");
assert!(status.success(), "git init failed");
mark_as_git_repo(&repo_dir);
let mut cfg = make_config(&codex_home).await;
cfg.cwd = repo_dir;
let cfg = make_config_for_cwd(&codex_home, repo_dir).await;
let outcome = load_skills(&cfg);
assert!(
@@ -695,15 +781,9 @@ mod tests {
async fn loads_skills_when_cwd_is_file_in_repo() {
let codex_home = tempfile::tempdir().expect("tempdir");
let repo_dir = tempfile::tempdir().expect("tempdir");
mark_as_git_repo(repo_dir.path());
let status = Command::new("git")
.arg("init")
.current_dir(repo_dir.path())
.status()
.expect("git init");
assert!(status.success(), "git init failed");
write_skill_at(
let skill_path = write_skill_at(
&repo_dir
.path()
.join(REPO_ROOT_CONFIG_DIR_NAME)
@@ -715,8 +795,7 @@ mod tests {
let file_path = repo_dir.path().join("some-file.txt");
fs::write(&file_path, "contents").unwrap();
let mut cfg = make_config(&codex_home).await;
cfg.cwd = file_path;
let cfg = make_config_for_cwd(&codex_home, file_path).await;
let outcome = load_skills(&cfg);
assert!(
@@ -724,9 +803,16 @@ mod tests {
"unexpected errors: {:?}",
outcome.errors
);
assert_eq!(outcome.skills.len(), 1);
assert_eq!(outcome.skills[0].name, "repo-skill");
assert_eq!(outcome.skills[0].scope, SkillScope::Repo);
assert_eq!(
outcome.skills,
vec![SkillMetadata {
name: "repo-skill".to_string(),
description: "from repo".to_string(),
short_description: None,
path: normalized(&skill_path),
scope: SkillScope::Repo,
}]
);
}
#[tokio::test]
@@ -746,8 +832,7 @@ mod tests {
"from outer",
);
let mut cfg = make_config(&codex_home).await;
cfg.cwd = nested_dir;
let cfg = make_config_for_cwd(&codex_home, nested_dir).await;
let outcome = load_skills(&cfg);
assert!(
@@ -763,10 +848,9 @@ mod tests {
let codex_home = tempfile::tempdir().expect("tempdir");
let work_dir = tempfile::tempdir().expect("tempdir");
write_system_skill(&codex_home, "system", "system-skill", "from system");
let skill_path = write_system_skill(&codex_home, "system", "system-skill", "from system");
let mut cfg = make_config(&codex_home).await;
cfg.cwd = work_dir.path().to_path_buf();
let cfg = make_config_for_cwd(&codex_home, work_dir.path().to_path_buf()).await;
let outcome = load_skills(&cfg);
assert!(
@@ -774,9 +858,16 @@ mod tests {
"unexpected errors: {:?}",
outcome.errors
);
assert_eq!(outcome.skills.len(), 1);
assert_eq!(outcome.skills[0].name, "system-skill");
assert_eq!(outcome.skills[0].scope, SkillScope::System);
assert_eq!(
outcome.skills,
vec![SkillMetadata {
name: "system-skill".to_string(),
description: "from system".to_string(),
short_description: None,
path: normalized(&skill_path),
scope: SkillScope::System,
}]
);
}
#[tokio::test]
@@ -800,8 +891,10 @@ mod tests {
let system_dir = tempfile::tempdir().expect("tempdir");
let admin_dir = tempfile::tempdir().expect("tempdir");
write_skill_at(system_dir.path(), "system", "dupe-skill", "from system");
write_skill_at(admin_dir.path(), "admin", "dupe-skill", "from admin");
let system_skill_path =
write_skill_at(system_dir.path(), "system", "dupe-skill", "from system");
let _admin_skill_path =
write_skill_at(admin_dir.path(), "admin", "dupe-skill", "from admin");
let outcome = load_skills_from_roots([
SkillRoot {
@@ -819,9 +912,16 @@ mod tests {
"unexpected errors: {:?}",
outcome.errors
);
assert_eq!(outcome.skills.len(), 1);
assert_eq!(outcome.skills[0].name, "dupe-skill");
assert_eq!(outcome.skills[0].scope, SkillScope::System);
assert_eq!(
outcome.skills,
vec![SkillMetadata {
name: "dupe-skill".to_string(),
description: "from system".to_string(),
short_description: None,
path: normalized(&system_skill_path),
scope: SkillScope::System,
}]
);
}
#[tokio::test]
@@ -829,11 +929,11 @@ mod tests {
let codex_home = tempfile::tempdir().expect("tempdir");
let work_dir = tempfile::tempdir().expect("tempdir");
write_skill(&codex_home, "user", "dupe-skill", "from user");
write_system_skill(&codex_home, "system", "dupe-skill", "from system");
let user_skill_path = write_skill(&codex_home, "user", "dupe-skill", "from user");
let _system_skill_path =
write_system_skill(&codex_home, "system", "dupe-skill", "from system");
let mut cfg = make_config(&codex_home).await;
cfg.cwd = work_dir.path().to_path_buf();
let cfg = make_config_for_cwd(&codex_home, work_dir.path().to_path_buf()).await;
let outcome = load_skills(&cfg);
assert!(
@@ -841,24 +941,25 @@ mod tests {
"unexpected errors: {:?}",
outcome.errors
);
assert_eq!(outcome.skills.len(), 1);
assert_eq!(outcome.skills[0].name, "dupe-skill");
assert_eq!(outcome.skills[0].scope, SkillScope::User);
assert_eq!(
outcome.skills,
vec![SkillMetadata {
name: "dupe-skill".to_string(),
description: "from user".to_string(),
short_description: None,
path: normalized(&user_skill_path),
scope: SkillScope::User,
}]
);
}
#[tokio::test]
async fn deduplicates_by_name_preferring_repo_over_system() {
let codex_home = tempfile::tempdir().expect("tempdir");
let repo_dir = tempfile::tempdir().expect("tempdir");
mark_as_git_repo(repo_dir.path());
let status = Command::new("git")
.arg("init")
.current_dir(repo_dir.path())
.status()
.expect("git init");
assert!(status.success(), "git init failed");
write_skill_at(
let repo_skill_path = write_skill_at(
&repo_dir
.path()
.join(REPO_ROOT_CONFIG_DIR_NAME)
@@ -867,10 +968,10 @@ mod tests {
"dupe-skill",
"from repo",
);
write_system_skill(&codex_home, "system", "dupe-skill", "from system");
let _system_skill_path =
write_system_skill(&codex_home, "system", "dupe-skill", "from system");
let mut cfg = make_config(&codex_home).await;
cfg.cwd = repo_dir.path().to_path_buf();
let cfg = make_config_for_cwd(&codex_home, repo_dir.path().to_path_buf()).await;
let outcome = load_skills(&cfg);
assert!(
@@ -878,8 +979,66 @@ mod tests {
"unexpected errors: {:?}",
outcome.errors
);
assert_eq!(outcome.skills.len(), 1);
assert_eq!(outcome.skills[0].name, "dupe-skill");
assert_eq!(outcome.skills[0].scope, SkillScope::Repo);
assert_eq!(
outcome.skills,
vec![SkillMetadata {
name: "dupe-skill".to_string(),
description: "from repo".to_string(),
short_description: None,
path: normalized(&repo_skill_path),
scope: SkillScope::Repo,
}]
);
}
#[tokio::test]
async fn deduplicates_by_name_preferring_nearest_project_codex_dir() {
let codex_home = tempfile::tempdir().expect("tempdir");
let repo_dir = tempfile::tempdir().expect("tempdir");
mark_as_git_repo(repo_dir.path());
let nested_dir = repo_dir.path().join("nested/inner");
fs::create_dir_all(&nested_dir).unwrap();
let _root_skill_path = write_skill_at(
&repo_dir
.path()
.join(REPO_ROOT_CONFIG_DIR_NAME)
.join(SKILLS_DIR_NAME),
"root",
"dupe-skill",
"from root",
);
let nested_skill_path = write_skill_at(
&repo_dir
.path()
.join("nested")
.join(REPO_ROOT_CONFIG_DIR_NAME)
.join(SKILLS_DIR_NAME),
"nested",
"dupe-skill",
"from nested",
);
let cfg = make_config_for_cwd(&codex_home, nested_dir).await;
let outcome = load_skills(&cfg);
assert!(
outcome.errors.is_empty(),
"unexpected errors: {:?}",
outcome.errors
);
let expected_path =
normalize_path(&nested_skill_path).unwrap_or_else(|_| nested_skill_path.clone());
assert_eq!(
vec![SkillMetadata {
name: "dupe-skill".to_string(),
description: "from nested".to_string(),
short_description: None,
path: expected_path,
scope: SkillScope::Repo,
}],
outcome.skills
);
}
}

View File

@@ -3,10 +3,17 @@ use std::path::Path;
use std::path::PathBuf;
use std::sync::RwLock;
use codex_utils_absolute_path::AbsolutePathBuf;
use toml::Value as TomlValue;
use crate::config::Config;
use crate::config_loader::LoaderOverrides;
use crate::config_loader::load_config_layers_state;
use crate::skills::SkillLoadOutcome;
use crate::skills::loader::load_skills_from_roots;
use crate::skills::loader::skill_roots_for_cwd;
use crate::skills::loader::skill_roots_from_layer_stack;
use crate::skills::system::install_system_skills;
pub struct SkillsManager {
codex_home: PathBuf,
cache_by_cwd: RwLock<HashMap<PathBuf, SkillLoadOutcome>>,
@@ -24,11 +31,32 @@ impl SkillsManager {
}
}
pub fn skills_for_cwd(&self, cwd: &Path) -> SkillLoadOutcome {
self.skills_for_cwd_with_options(cwd, false)
/// Load skills for an already-constructed [`Config`], avoiding any additional config-layer
/// loading. This also seeds the per-cwd cache for subsequent lookups.
pub fn skills_for_config(&self, config: &Config) -> SkillLoadOutcome {
let cwd = &config.cwd;
let cached = match self.cache_by_cwd.read() {
Ok(cache) => cache.get(cwd).cloned(),
Err(err) => err.into_inner().get(cwd).cloned(),
};
if let Some(outcome) = cached {
return outcome;
}
let roots = skill_roots_from_layer_stack(&config.config_layer_stack);
let outcome = load_skills_from_roots(roots);
match self.cache_by_cwd.write() {
Ok(mut cache) => {
cache.insert(cwd.to_path_buf(), outcome.clone());
}
Err(err) => {
err.into_inner().insert(cwd.to_path_buf(), outcome.clone());
}
}
outcome
}
pub fn skills_for_cwd_with_options(&self, cwd: &Path, force_reload: bool) -> SkillLoadOutcome {
pub async fn skills_for_cwd(&self, cwd: &Path, force_reload: bool) -> SkillLoadOutcome {
let cached = match self.cache_by_cwd.read() {
Ok(cache) => cache.get(cwd).cloned(),
Err(err) => err.into_inner().get(cwd).cloned(),
@@ -37,7 +65,41 @@ impl SkillsManager {
return outcome;
}
let roots = skill_roots_for_cwd(&self.codex_home, cwd);
let cwd_abs = match AbsolutePathBuf::try_from(cwd) {
Ok(cwd_abs) => cwd_abs,
Err(err) => {
return SkillLoadOutcome {
errors: vec![crate::skills::model::SkillError {
path: cwd.to_path_buf(),
message: err.to_string(),
}],
..Default::default()
};
}
};
let cli_overrides: Vec<(String, TomlValue)> = Vec::new();
let config_layer_stack = match load_config_layers_state(
&self.codex_home,
Some(cwd_abs),
&cli_overrides,
LoaderOverrides::default(),
)
.await
{
Ok(config_layer_stack) => config_layer_stack,
Err(err) => {
return SkillLoadOutcome {
errors: vec![crate::skills::model::SkillError {
path: cwd.to_path_buf(),
message: err.to_string(),
}],
..Default::default()
};
}
};
let roots = skill_roots_from_layer_stack(&config_layer_stack);
let outcome = load_skills_from_roots(roots);
match self.cache_by_cwd.write() {
Ok(mut cache) => {
@@ -50,3 +112,52 @@ impl SkillsManager {
outcome
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::config::ConfigBuilder;
use crate::config::ConfigOverrides;
use pretty_assertions::assert_eq;
use std::fs;
use tempfile::TempDir;
fn write_user_skill(codex_home: &TempDir, dir: &str, name: &str, description: &str) {
let skill_dir = codex_home.path().join("skills").join(dir);
fs::create_dir_all(&skill_dir).unwrap();
let content = format!("---\nname: {name}\ndescription: {description}\n---\n\n# Body\n");
fs::write(skill_dir.join("SKILL.md"), content).unwrap();
}
#[tokio::test]
async fn skills_for_config_seeds_cache_by_cwd() {
let codex_home = tempfile::tempdir().expect("tempdir");
let cwd = tempfile::tempdir().expect("tempdir");
let cfg = ConfigBuilder::default()
.codex_home(codex_home.path().to_path_buf())
.harness_overrides(ConfigOverrides {
cwd: Some(cwd.path().to_path_buf()),
..Default::default()
})
.build()
.await
.expect("defaults for test should always succeed");
let skills_manager = SkillsManager::new(codex_home.path().to_path_buf());
write_user_skill(&codex_home, "a", "skill-a", "from a");
let outcome1 = skills_manager.skills_for_config(&cfg);
assert!(
outcome1.skills.iter().any(|s| s.name == "skill-a"),
"expected skill-a to be discovered"
);
// Write a new skill after the first call; the second call should hit the cache and not
// reflect the new file.
write_user_skill(&codex_home, "b", "skill-b", "from b");
let outcome2 = skills_manager.skills_for_config(&cfg);
assert_eq!(outcome2.errors, outcome1.errors);
assert_eq!(outcome2.skills, outcome1.skills);
}
}

View File

@@ -7,7 +7,8 @@ pub fn render_skills_section(skills: &[SkillMetadata]) -> Option<String> {
let mut lines: Vec<String> = Vec::new();
lines.push("## Skills".to_string());
lines.push("These skills are discovered at startup from multiple local sources. Each entry includes a name, description, and file path so you can open the source for full instructions.".to_string());
lines.push("A skill is a set of local instructions to follow that is stored in a `SKILL.md` file. Below is the list of skills that can be used. Each entry includes a name, description, and file path so you can open the source for full instructions when using a specific skill.".to_string());
lines.push("### Available skills".to_string());
for skill in skills {
let path_str = skill.path.to_string_lossy().replace('\\', "/");
@@ -16,22 +17,22 @@ pub fn render_skills_section(skills: &[SkillMetadata]) -> Option<String> {
lines.push(format!("- {name}: {description} (file: {path_str})"));
}
lines.push("### How to use skills".to_string());
lines.push(
r###"- Discovery: Available skills are listed in project docs and may also appear in a runtime "## Skills" section (name + description + file path). These are the sources of truth; skill bodies live on disk at the listed paths.
- Trigger rules: If the user names a skill (with `$SkillName` or plain text) OR the task clearly matches a skill's description, you must use that skill for that turn. Multiple mentions mean use them all. Do not carry skills across turns unless re-mentioned.
r###"- Discovery: The list above is the skills available in this session (name + description + file path). Skill bodies live on disk at the listed paths.
- Trigger rules: If the user names a skill (with `$SkillName` or plain text) OR the task clearly matches a skill's description shown above, you must use that skill for that turn. Multiple mentions mean use them all. Do not carry skills across turns unless re-mentioned.
- Missing/blocked: If a named skill isn't in the list or the path can't be read, say so briefly and continue with the best fallback.
- How to use a skill (progressive disclosure):
1) After deciding to use a skill, open its `SKILL.md`. Read only enough to follow the workflow.
2) If `SKILL.md` points to extra folders such as `references/`, load only the specific files needed for the request; don't bulk-load everything.
3) If `scripts/` exist, prefer running or patching them instead of retyping large code blocks.
4) If `assets/` or templates exist, reuse them instead of recreating from scratch.
- Description as trigger: The YAML `description` in `SKILL.md` is the primary trigger signal; rely on it to decide applicability. If unsure, ask a brief clarification before proceeding.
- Coordination and sequencing:
- If multiple skills apply, choose the minimal set that covers the request and state the order you'll use them.
- Announce which skill(s) you're using and why (one short line). If you skip an obvious skill, say why.
- Context hygiene:
- Keep context small: summarize long sections instead of pasting them; only load extra files when needed.
- Avoid deeply nested references; prefer one-hop files explicitly linked from `SKILL.md`.
- Avoid deep reference-chasing: prefer opening only files directly linked from `SKILL.md` unless you're blocked.
- When variants exist (frameworks, providers, domains), pick only the relevant reference file(s) and note that choice.
- Safety and fallback: If a skill can't be applied cleanly (missing files, unclear instructions), state the issue, pick the next-best approach, and continue."###
.to_string(),

View File

@@ -2,12 +2,14 @@ use std::sync::Arc;
use crate::AuthManager;
use crate::RolloutRecorder;
use crate::agent::AgentControl;
use crate::exec_policy::ExecPolicyManager;
use crate::mcp_connection_manager::McpConnectionManager;
use crate::models_manager::manager::ModelsManager;
use crate::responses_ws::ResponsesWsManager;
use crate::skills::SkillsManager;
use crate::tools::sandboxing::ApprovalStore;
use crate::unified_exec::UnifiedExecSessionManager;
use crate::unified_exec::UnifiedExecProcessManager;
use crate::user_notification::UserNotifier;
use codex_otel::otel_manager::OtelManager;
use tokio::sync::Mutex;
@@ -17,7 +19,7 @@ use tokio_util::sync::CancellationToken;
pub(crate) struct SessionServices {
pub(crate) mcp_connection_manager: Arc<RwLock<McpConnectionManager>>,
pub(crate) mcp_startup_cancellation_token: CancellationToken,
pub(crate) unified_exec_manager: UnifiedExecSessionManager,
pub(crate) unified_exec_manager: UnifiedExecProcessManager,
pub(crate) notifier: UserNotifier,
pub(crate) rollout: Mutex<Option<RolloutRecorder>>,
pub(crate) user_shell: Arc<crate::shell::Shell>,
@@ -28,4 +30,6 @@ pub(crate) struct SessionServices {
pub(crate) otel_manager: OtelManager,
pub(crate) tool_approvals: Mutex<ApprovalStore>,
pub(crate) skills_manager: Arc<SkillsManager>,
pub(crate) agent_control: AgentControl,
pub(crate) responses_ws: Option<Arc<ResponsesWsManager>>,
}

View File

@@ -159,7 +159,7 @@ impl Session {
for task in self.take_all_running_tasks().await {
self.handle_task_abort(task, reason.clone()).await;
}
self.close_unified_exec_sessions().await;
self.close_unified_exec_processes().await;
}
pub async fn on_task_finished(
@@ -168,7 +168,7 @@ impl Session {
last_agent_message: Option<String>,
) {
let mut active = self.active_turn.lock().await;
let should_close_sessions = if let Some(at) = active.as_mut()
let should_close_processes = if let Some(at) = active.as_mut()
&& at.remove_task(&turn_context.sub_id)
{
*active = None;
@@ -177,8 +177,8 @@ impl Session {
false
};
drop(active);
if should_close_sessions {
self.close_unified_exec_sessions().await;
if should_close_processes {
self.close_unified_exec_processes().await;
}
let event = EventMsg::TaskComplete(TaskCompleteEvent { last_agent_message });
self.send_event(turn_context.as_ref(), event).await;
@@ -203,10 +203,10 @@ impl Session {
}
}
async fn close_unified_exec_sessions(&self) {
async fn close_unified_exec_processes(&self) {
self.services
.unified_exec_manager
.terminate_all_sessions()
.terminate_all_processes()
.await;
}

View File

@@ -15,7 +15,7 @@ use tokio_util::sync::CancellationToken;
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::codex_delegate::run_codex_conversation_one_shot;
use crate::codex_delegate::run_codex_thread_one_shot;
use crate::review_format::format_review_findings_block;
use crate::review_format::render_review_output_text;
use crate::state::TaskKind;
@@ -92,7 +92,7 @@ async fn start_review_conversation(
sub_agent_config.base_instructions = Some(crate::REVIEW_PROMPT.to_string());
sub_agent_config.model = Some(config.review_model.clone());
(run_codex_conversation_one_shot(
(run_codex_thread_one_shot(
sub_agent_config,
session.auth_manager(),
session.models_manager(),

View File

@@ -3,10 +3,11 @@ use crate::AuthManager;
use crate::CodexAuth;
#[cfg(any(test, feature = "test-support"))]
use crate::ModelProviderInfo;
use crate::agent::AgentControl;
use crate::codex::Codex;
use crate::codex::CodexSpawnOk;
use crate::codex::INITIAL_SUBMIT_ID;
use crate::codex_conversation::CodexConversation;
use crate::codex_thread::CodexThread;
use crate::config::Config;
use crate::error::CodexErr;
use crate::error::Result as CodexResult;
@@ -15,12 +16,12 @@ use crate::protocol::Event;
use crate::protocol::EventMsg;
use crate::protocol::SessionConfiguredEvent;
use crate::rollout::RolloutRecorder;
use crate::rollout::truncation;
use crate::skills::SkillsManager;
use codex_protocol::ConversationId;
use codex_protocol::items::TurnItem;
use codex_protocol::models::ResponseItem;
use codex_protocol::ThreadId;
use codex_protocol::openai_models::ModelPreset;
use codex_protocol::protocol::InitialHistory;
use codex_protocol::protocol::Op;
use codex_protocol::protocol::RolloutItem;
use codex_protocol::protocol::SessionSource;
use std::collections::HashMap;
@@ -30,35 +31,50 @@ use std::sync::Arc;
use tempfile::TempDir;
use tokio::sync::RwLock;
/// Represents a newly created Codex conversation, including the first event
/// Represents a newly created Codex thread (formerly called a conversation), including the first event
/// (which is [`EventMsg::SessionConfigured`]).
pub struct NewConversation {
pub conversation_id: ConversationId,
pub conversation: Arc<CodexConversation>,
pub struct NewThread {
pub thread_id: ThreadId,
pub thread: Arc<CodexThread>,
pub session_configured: SessionConfiguredEvent,
}
/// [`ConversationManager`] is responsible for creating conversations and
/// maintaining them in memory.
pub struct ConversationManager {
conversations: Arc<RwLock<HashMap<ConversationId, Arc<CodexConversation>>>>,
auth_manager: Arc<AuthManager>,
models_manager: Arc<ModelsManager>,
skills_manager: Arc<SkillsManager>,
session_source: SessionSource,
/// [`ThreadManager`] is responsible for creating threads and maintaining
/// them in memory.
pub struct ThreadManager {
state: Arc<ThreadManagerState>,
#[cfg(any(test, feature = "test-support"))]
_test_codex_home_guard: Option<TempDir>,
}
impl ConversationManager {
pub fn new(auth_manager: Arc<AuthManager>, session_source: SessionSource) -> Self {
let skills_manager = Arc::new(SkillsManager::new(auth_manager.codex_home().to_path_buf()));
/// Shared, `Arc`-owned state for [`ThreadManager`]. This `Arc` is required to have a single
/// `Arc` reference that can be downgraded to by `AgentControl` while preventing every single
/// function to require an `Arc<&Self>`.
pub(crate) struct ThreadManagerState {
threads: Arc<RwLock<HashMap<ThreadId, Arc<CodexThread>>>>,
auth_manager: Arc<AuthManager>,
models_manager: Arc<ModelsManager>,
skills_manager: Arc<SkillsManager>,
session_source: SessionSource,
}
impl ThreadManager {
pub fn new(
codex_home: PathBuf,
auth_manager: Arc<AuthManager>,
session_source: SessionSource,
) -> Self {
Self {
conversations: Arc::new(RwLock::new(HashMap::new())),
auth_manager: auth_manager.clone(),
session_source,
models_manager: Arc::new(ModelsManager::new(auth_manager)),
skills_manager,
state: Arc::new(ThreadManagerState {
threads: Arc::new(RwLock::new(HashMap::new())),
models_manager: Arc::new(ModelsManager::new(
codex_home.clone(),
auth_manager.clone(),
)),
skills_manager: Arc::new(SkillsManager::new(codex_home)),
auth_manager,
session_source,
}),
#[cfg(any(test, feature = "test-support"))]
_test_codex_home_guard: None,
}
@@ -83,64 +99,212 @@ impl ConversationManager {
provider: ModelProviderInfo,
codex_home: PathBuf,
) -> Self {
let auth_manager = crate::AuthManager::from_auth_for_testing_with_home(auth, codex_home);
let skills_manager = Arc::new(SkillsManager::new(auth_manager.codex_home().to_path_buf()));
let auth_manager = AuthManager::from_auth_for_testing(auth);
Self {
conversations: Arc::new(RwLock::new(HashMap::new())),
auth_manager: auth_manager.clone(),
session_source: SessionSource::Exec,
models_manager: Arc::new(ModelsManager::with_provider(auth_manager, provider)),
skills_manager,
state: Arc::new(ThreadManagerState {
threads: Arc::new(RwLock::new(HashMap::new())),
models_manager: Arc::new(ModelsManager::with_provider(
codex_home.clone(),
auth_manager.clone(),
provider,
)),
skills_manager: Arc::new(SkillsManager::new(codex_home)),
auth_manager,
session_source: SessionSource::Exec,
}),
_test_codex_home_guard: None,
}
}
pub fn session_source(&self) -> SessionSource {
self.session_source.clone()
self.state.session_source.clone()
}
pub fn skills_manager(&self) -> Arc<SkillsManager> {
self.skills_manager.clone()
self.state.skills_manager.clone()
}
pub async fn new_conversation(&self, config: Config) -> CodexResult<NewConversation> {
self.spawn_conversation(
pub fn get_models_manager(&self) -> Arc<ModelsManager> {
self.state.models_manager.clone()
}
pub async fn list_models(&self, config: &Config) -> Vec<ModelPreset> {
self.state.models_manager.list_models(config).await
}
pub async fn get_thread(&self, thread_id: ThreadId) -> CodexResult<Arc<CodexThread>> {
self.state.get_thread(thread_id).await
}
pub async fn start_thread(&self, config: Config) -> CodexResult<NewThread> {
self.state
.spawn_thread(
config,
InitialHistory::New,
Arc::clone(&self.state.auth_manager),
self.agent_control(),
)
.await
}
pub async fn resume_thread_from_rollout(
&self,
config: Config,
rollout_path: PathBuf,
auth_manager: Arc<AuthManager>,
) -> CodexResult<NewThread> {
let initial_history = RolloutRecorder::get_rollout_history(&rollout_path).await?;
self.resume_thread_with_history(config, initial_history, auth_manager)
.await
}
pub async fn resume_thread_with_history(
&self,
config: Config,
initial_history: InitialHistory,
auth_manager: Arc<AuthManager>,
) -> CodexResult<NewThread> {
self.state
.spawn_thread(config, initial_history, auth_manager, self.agent_control())
.await
}
#[deprecated(note = "use get_thread")]
pub async fn get_conversation(&self, thread_id: ThreadId) -> CodexResult<Arc<CodexThread>> {
self.get_thread(thread_id).await
}
#[deprecated(note = "use start_thread")]
pub async fn new_conversation(&self, config: Config) -> CodexResult<NewThread> {
self.start_thread(config).await
}
#[deprecated(note = "use resume_thread_from_rollout")]
pub async fn resume_conversation_from_rollout(
&self,
config: Config,
rollout_path: PathBuf,
auth_manager: Arc<AuthManager>,
) -> CodexResult<NewThread> {
self.resume_thread_from_rollout(config, rollout_path, auth_manager)
.await
}
#[deprecated(note = "use resume_thread_with_history")]
pub async fn resume_conversation_with_history(
&self,
config: Config,
initial_history: InitialHistory,
auth_manager: Arc<AuthManager>,
) -> CodexResult<NewThread> {
self.resume_thread_with_history(config, initial_history, auth_manager)
.await
}
#[deprecated(note = "use remove_thread")]
pub async fn remove_conversation(&self, thread_id: &ThreadId) -> Option<Arc<CodexThread>> {
self.remove_thread(thread_id).await
}
#[deprecated(note = "use fork_thread")]
pub async fn fork_conversation(
&self,
nth_user_message: usize,
config: Config,
path: PathBuf,
) -> CodexResult<NewThread> {
self.fork_thread(nth_user_message, config, path).await
}
/// Removes the thread from the manager's internal map, though the thread is stored
/// as `Arc<CodexThread>`, it is possible that other references to it exist elsewhere.
/// Returns the thread if the thread was found and removed.
pub async fn remove_thread(&self, thread_id: &ThreadId) -> Option<Arc<CodexThread>> {
self.state.threads.write().await.remove(thread_id)
}
/// Fork an existing thread by taking messages up to the given position (not including
/// the message at the given position) and starting a new thread with identical
/// configuration (unless overridden by the caller's `config`). The new thread will have
/// a fresh id.
pub async fn fork_thread(
&self,
nth_user_message: usize,
config: Config,
path: PathBuf,
) -> CodexResult<NewThread> {
let history = RolloutRecorder::get_rollout_history(&path).await?;
let history = truncate_before_nth_user_message(history, nth_user_message);
self.state
.spawn_thread(
config,
history,
Arc::clone(&self.state.auth_manager),
self.agent_control(),
)
.await
}
fn agent_control(&self) -> AgentControl {
AgentControl::new(Arc::downgrade(&self.state))
}
}
impl ThreadManagerState {
pub(crate) async fn get_thread(&self, thread_id: ThreadId) -> CodexResult<Arc<CodexThread>> {
let threads = self.threads.read().await;
threads
.get(&thread_id)
.cloned()
.ok_or_else(|| CodexErr::ThreadNotFound(thread_id))
}
pub(crate) async fn send_op(&self, thread_id: ThreadId, op: Op) -> CodexResult<String> {
self.get_thread(thread_id).await?.submit(op).await
}
#[allow(dead_code)] // Used by upcoming multi-agent tooling.
pub(crate) async fn spawn_new_thread(
&self,
config: Config,
agent_control: AgentControl,
) -> CodexResult<NewThread> {
self.spawn_thread(
config,
self.auth_manager.clone(),
self.models_manager.clone(),
InitialHistory::New,
Arc::clone(&self.auth_manager),
agent_control,
)
.await
}
async fn spawn_conversation(
pub(crate) async fn spawn_thread(
&self,
config: Config,
initial_history: InitialHistory,
auth_manager: Arc<AuthManager>,
models_manager: Arc<ModelsManager>,
) -> CodexResult<NewConversation> {
agent_control: AgentControl,
) -> CodexResult<NewThread> {
let CodexSpawnOk {
codex,
conversation_id,
codex, thread_id, ..
} = Codex::spawn(
config,
auth_manager,
models_manager,
self.skills_manager.clone(),
InitialHistory::New,
Arc::clone(&self.models_manager),
Arc::clone(&self.skills_manager),
initial_history,
self.session_source.clone(),
agent_control,
)
.await?;
self.finalize_spawn(codex, conversation_id).await
self.finalize_thread_spawn(codex, thread_id).await
}
async fn finalize_spawn(
async fn finalize_thread_spawn(
&self,
codex: Codex,
conversation_id: ConversationId,
) -> CodexResult<NewConversation> {
// The first event must be `SessionInitialized`. Validate and forward it
// to the caller so that they can display it in the conversation
// history.
thread_id: ThreadId,
) -> CodexResult<NewThread> {
let event = codex.next_event().await?;
let session_configured = match event {
Event {
@@ -152,144 +316,26 @@ impl ConversationManager {
}
};
let conversation = Arc::new(CodexConversation::new(
let thread = Arc::new(CodexThread::new(
codex,
session_configured.rollout_path.clone(),
));
self.conversations
.write()
.await
.insert(conversation_id, conversation.clone());
self.threads.write().await.insert(thread_id, thread.clone());
Ok(NewConversation {
conversation_id,
conversation,
#[allow(deprecated)]
Ok(NewThread {
thread_id,
thread,
session_configured,
})
}
pub async fn get_conversation(
&self,
conversation_id: ConversationId,
) -> CodexResult<Arc<CodexConversation>> {
let conversations = self.conversations.read().await;
conversations
.get(&conversation_id)
.cloned()
.ok_or_else(|| CodexErr::ConversationNotFound(conversation_id))
}
pub async fn resume_conversation_from_rollout(
&self,
config: Config,
rollout_path: PathBuf,
auth_manager: Arc<AuthManager>,
) -> CodexResult<NewConversation> {
let initial_history = RolloutRecorder::get_rollout_history(&rollout_path).await?;
self.resume_conversation_with_history(config, initial_history, auth_manager)
.await
}
pub async fn resume_conversation_with_history(
&self,
config: Config,
initial_history: InitialHistory,
auth_manager: Arc<AuthManager>,
) -> CodexResult<NewConversation> {
let CodexSpawnOk {
codex,
conversation_id,
} = Codex::spawn(
config,
auth_manager,
self.models_manager.clone(),
self.skills_manager.clone(),
initial_history,
self.session_source.clone(),
)
.await?;
self.finalize_spawn(codex, conversation_id).await
}
/// Removes the conversation from the manager's internal map, though the
/// conversation is stored as `Arc<CodexConversation>`, it is possible that
/// other references to it exist elsewhere. Returns the conversation if the
/// conversation was found and removed.
pub async fn remove_conversation(
&self,
conversation_id: &ConversationId,
) -> Option<Arc<CodexConversation>> {
self.conversations.write().await.remove(conversation_id)
}
/// Fork an existing conversation by taking messages up to the given position
/// (not including the message at the given position) and starting a new
/// conversation with identical configuration (unless overridden by the
/// caller's `config`). The new conversation will have a fresh id.
pub async fn fork_conversation(
&self,
nth_user_message: usize,
config: Config,
path: PathBuf,
) -> CodexResult<NewConversation> {
// Compute the prefix up to the cut point.
let history = RolloutRecorder::get_rollout_history(&path).await?;
let history = truncate_before_nth_user_message(history, nth_user_message);
// Spawn a new conversation with the computed initial history.
let auth_manager = self.auth_manager.clone();
let CodexSpawnOk {
codex,
conversation_id,
} = Codex::spawn(
config,
auth_manager,
self.models_manager.clone(),
self.skills_manager.clone(),
history,
self.session_source.clone(),
)
.await?;
self.finalize_spawn(codex, conversation_id).await
}
pub async fn list_models(&self, config: &Config) -> Vec<ModelPreset> {
self.models_manager.list_models(config).await
}
pub fn get_models_manager(&self) -> Arc<ModelsManager> {
self.models_manager.clone()
}
}
/// Return a prefix of `items` obtained by cutting strictly before the nth user message
/// (0-based) and all items that follow it.
fn truncate_before_nth_user_message(history: InitialHistory, n: usize) -> InitialHistory {
// Work directly on rollout items, and cut the vector at the nth user message input.
let items: Vec<RolloutItem> = history.get_rollout_items();
// Find indices of user message inputs in rollout order.
let mut user_positions: Vec<usize> = Vec::new();
for (idx, item) in items.iter().enumerate() {
if let RolloutItem::ResponseItem(item @ ResponseItem::Message { .. }) = item
&& matches!(
crate::event_mapping::parse_turn_item(item),
Some(TurnItem::UserMessage(_))
)
{
user_positions.push(idx);
}
}
// If fewer than or equal to n user messages exist, treat as empty (out of range).
if user_positions.len() <= n {
return InitialHistory::New;
}
// Cut strictly before the nth user message (do not keep the nth itself).
let cut_idx = user_positions[n];
let rolled: Vec<RolloutItem> = items.into_iter().take(cut_idx).collect();
let rolled = truncation::truncate_rollout_before_nth_user_message_from_start(&items, n);
if rolled.is_empty() {
InitialHistory::New
@@ -345,14 +391,13 @@ mod tests {
},
ResponseItem::FunctionCall {
id: None,
call_id: "c1".to_string(),
name: "tool".to_string(),
arguments: "{}".to_string(),
call_id: "c1".to_string(),
},
assistant_msg("a4"),
];
// Wrap as InitialHistory::Forked with response items only.
let initial: Vec<RolloutItem> = items
.iter()
.cloned()

View File

@@ -1,7 +1,6 @@
use crate::function_tool::FunctionCallError;
use crate::is_safe_command::is_known_safe_command;
use crate::protocol::EventMsg;
use crate::protocol::ExecCommandSource;
use crate::protocol::TerminalInteractionEvent;
use crate::sandboxing::SandboxPermissions;
use crate::shell::Shell;
@@ -9,16 +8,13 @@ use crate::shell::get_shell_by_model_provided_path;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
use crate::tools::context::ToolPayload;
use crate::tools::events::ToolEmitter;
use crate::tools::events::ToolEventCtx;
use crate::tools::events::ToolEventStage;
use crate::tools::handlers::apply_patch::intercept_apply_patch;
use crate::tools::registry::ToolHandler;
use crate::tools::registry::ToolKind;
use crate::unified_exec::ExecCommandRequest;
use crate::unified_exec::UnifiedExecContext;
use crate::unified_exec::UnifiedExecProcessManager;
use crate::unified_exec::UnifiedExecResponse;
use crate::unified_exec::UnifiedExecSessionManager;
use crate::unified_exec::WriteStdinRequest;
use async_trait::async_trait;
use serde::Deserialize;
@@ -116,7 +112,7 @@ impl ToolHandler for UnifiedExecHandler {
}
};
let manager: &UnifiedExecSessionManager = &session.services.unified_exec_manager;
let manager: &UnifiedExecProcessManager = &session.services.unified_exec_manager;
let context = UnifiedExecContext::new(session.clone(), turn.clone(), call_id.clone());
let response = match tool_name.as_str() {
@@ -172,20 +168,6 @@ impl ToolHandler for UnifiedExecHandler {
return Ok(output);
}
let event_ctx = ToolEventCtx::new(
context.session.as_ref(),
context.turn.as_ref(),
&context.call_id,
None,
);
let emitter = ToolEmitter::unified_exec(
&command,
cwd.clone(),
ExecCommandSource::UnifiedExecStartup,
Some(process_id.clone()),
);
emitter.emit(event_ctx, ToolEventStage::Begin).await;
manager
.exec_command(
ExecCommandRequest {

View File

@@ -2,7 +2,7 @@
Runtime: unified exec
Handles approval + sandbox orchestration for unified exec requests, delegating to
the session manager to spawn PTYs once an ExecEnv is prepared.
the process manager to spawn PTYs once an ExecEnv is prepared.
*/
use crate::error::CodexErr;
use crate::error::SandboxErr;
@@ -25,8 +25,8 @@ use crate::tools::sandboxing::ToolError;
use crate::tools::sandboxing::ToolRuntime;
use crate::tools::sandboxing::with_cached_approval;
use crate::unified_exec::UnifiedExecError;
use crate::unified_exec::UnifiedExecSession;
use crate::unified_exec::UnifiedExecSessionManager;
use crate::unified_exec::UnifiedExecProcess;
use crate::unified_exec::UnifiedExecProcessManager;
use codex_protocol::protocol::ReviewDecision;
use futures::future::BoxFuture;
use std::collections::HashMap;
@@ -50,7 +50,7 @@ pub struct UnifiedExecApprovalKey {
}
pub struct UnifiedExecRuntime<'a> {
manager: &'a UnifiedExecSessionManager,
manager: &'a UnifiedExecProcessManager,
}
impl UnifiedExecRequest {
@@ -74,7 +74,7 @@ impl UnifiedExecRequest {
}
impl<'a> UnifiedExecRuntime<'a> {
pub fn new(manager: &'a UnifiedExecSessionManager) -> Self {
pub fn new(manager: &'a UnifiedExecProcessManager) -> Self {
Self { manager }
}
}
@@ -158,13 +158,13 @@ impl Approvable<UnifiedExecRequest> for UnifiedExecRuntime<'_> {
}
}
impl<'a> ToolRuntime<UnifiedExecRequest, UnifiedExecSession> for UnifiedExecRuntime<'a> {
impl<'a> ToolRuntime<UnifiedExecRequest, UnifiedExecProcess> for UnifiedExecRuntime<'a> {
async fn run(
&mut self,
req: &UnifiedExecRequest,
attempt: &SandboxAttempt<'_>,
ctx: &ToolCtx<'_>,
) -> Result<UnifiedExecSession, ToolError> {
) -> Result<UnifiedExecProcess, ToolError> {
let base_command = &req.command;
let session_shell = ctx.session.user_shell();
let command = maybe_wrap_shell_lc_with_snapshot(base_command, session_shell.as_ref());

Some files were not shown because too many files have changed in this diff Show More