Compare commits

...

53 Commits

Author SHA1 Message Date
Ahmed Ibrahim
a9dd1b2e11 proto 2025-10-23 12:12:48 -07:00
Ahmed Ibrahim
54f6a292bb progress 2025-10-23 08:16:49 -07:00
Ahmed Ibrahim
282e696079 feedback 2025-10-23 07:53:03 -07:00
Ahmed Ibrahim
bb26652be1 feedback 2025-10-23 07:52:05 -07:00
Ahmed Ibrahim
d830ec9678 clean 2025-10-22 16:25:52 -07:00
Ahmed Ibrahim
c3ef52f442 clean 2025-10-22 16:25:27 -07:00
Ahmed Ibrahim
d7ad216217 clean 2025-10-22 16:24:19 -07:00
Ahmed Ibrahim
e380b297d0 clean 2025-10-22 16:21:52 -07:00
Ahmed Ibrahim
d63e920ea4 tests 2025-10-22 16:17:05 -07:00
Ahmed Ibrahim
838d82e19c clean 2025-10-22 16:14:20 -07:00
Ahmed Ibrahim
89efcac5a3 clean 2025-10-22 16:12:33 -07:00
Ahmed Ibrahim
6dcb9ff39d clippy 2025-10-22 16:05:32 -07:00
Ahmed Ibrahim
3bf76f79f0 progress 2025-10-22 15:39:34 -07:00
Ahmed Ibrahim
273819aaae Move changing turn input functionalities to ConversationHistory (#5473)
We are doing some ad-hoc logic while dealing with conversation history.
Ideally, we shouldn't mutate `vec[responseitem]` manually at all and
should depend on `ConversationHistory` for those changes.

Those changes are:
- Adding input to the history
- Removing items from the history
- Correcting history

I am also adding some `error` logs for cases we shouldn't ideally face.
For example, we shouldn't be missing `toolcalls` or `outputs`. We
shouldn't hit `ContextWindowExceeded` while performing `compact`

This refactor will give us granular control over our context management.
2025-10-22 13:08:46 -07:00
Gabriel Peal
4cd6b01494 [MCP] Remove the legacy stdio client in favor of rmcp (#5529)
I haven't heard of any issues with the studio rmcp client so let's
remove the legacy one and default to the new one.

Any code changes are moving code from the adapter inline but there
should be no meaningful functionality changes.
2025-10-22 12:06:59 -07:00
Thibault Sottiaux
dd59b16a17 docs: fix agents fallback example (#5396) 2025-10-22 11:32:35 -07:00
jif-oai
bac7acaa7c chore: clean spec tests (#5517) 2025-10-22 18:30:33 +01:00
pakrym-oai
3c90728a29 Add new thread items and rewire event parsing to use them (#5418)
1. Adds AgentMessage,  Reasoning,  WebSearch items.
2. Switches the ResponseItem parsing to use new items and then also emit
3. Removes user-item kind and filters out "special" (environment) user
items when returning to clients.
2025-10-22 10:14:50 -07:00
Gabriel Peal
34c5a9eaa9 [MCP] Add support for specifying scopes for MCP oauth (#5487)
```
codex mcp login server_name --scopes=scope1,scope2,scope3
```

Fixes #5480
2025-10-22 09:37:33 -07:00
jif-oai
f522aafb7f chore: drop approve all (#5503)
Not needed anymore
2025-10-22 16:55:06 +01:00
jif-oai
fd0673e457 feat: local tokenizer (#5508) 2025-10-22 16:01:02 +01:00
jif-oai
00b1e130b3 chore: align unified_exec (#5442)
Align `unified_exec` with b implementation
2025-10-22 11:50:18 +01:00
Naoya Yasuda
53cadb4df6 docs: Add --cask option to brew command to suggest (#5432)
## What
- Add the `--cask` flag to the Homebrew update command for Codex.

## Why
- `brew upgrade codex` alone does not update the cask, so users were not
getting the right upgrade instructions.

## How
- Update `UpdateAction::BrewUpgrade` in `codex-rs/tui/src/updates.rs` to
use `upgrade --cask codex`.

## Testing
- [x] cargo test -p codex-tui

Co-authored-by: Thibault Sottiaux <tibo@openai.com>
2025-10-21 19:10:30 -07:00
Javi
db7eb9a7ce feat: add text cleared with ctrl+c to the history so it can be recovered with up arrow (#5470)
https://github.com/user-attachments/assets/5eed882e-6a54-4f2c-8f21-14fa0d0ef347
2025-10-21 16:45:16 -07:00
pakrym-oai
cdd106b930 Log HTTP Version (#5475) 2025-10-21 23:29:18 +00:00
Michael Bolin
404cae7d40 feat: add experimental_bearer_token option to model provider definition (#5467)
While we do not want to encourage users to hardcode secrets in their
`config.toml` file, it should be possible to pass an API key
programmatically. For example, when using `codex app-server`, it is
possible to pass a "bag of configuration" as part of the
`NewConversationParams`:

682d05512f/codex-rs/app-server-protocol/src/protocol.rs (L248-L251)

When using `codex app-server`, it's not practical to change env vars of
the `codex app-server` process on the fly (which is how we usually read
API key values), so this helps with that.
2025-10-21 14:02:56 -07:00
Anton Panasenko
682d05512f [otel] init otel for app-server (#5469) 2025-10-21 12:34:27 -07:00
pakrym-oai
5cd8803998 Add a baseline test for resume initial messages (#5466) 2025-10-21 11:45:01 -07:00
Owen Lin
26f314904a [app-server] model/list API (#5382)
Adds a `model/list` paginated API that returns the list of models
supported by Codex.
2025-10-21 11:15:17 -07:00
jif-oai
da82153a8d fix: fix UI issue when 0 omitted lines (#5451) 2025-10-21 16:45:05 +00:00
jif-oai
4bd68e4d9e feat: emit events for unified_exec (#5448) 2025-10-21 17:32:39 +01:00
pakrym-oai
1b10a3a1b2 Enable plan tool by default (#5384)
## Summary
- make the plan tool available by default by removing the feature flag
and always registering the handler
- drop plan-tool CLI and API toggles across the exec, TUI, MCP server,
and app server code paths
- update tests and configs to reflect the always-on plan tool and guard
workspace restriction tests against env leakage

## Testing
Manually tested the extension. 
------
https://chatgpt.com/codex/tasks/task_i_68f67a3ff2d083209562a773f814c1f9
2025-10-21 16:25:05 +00:00
jif-oai
ad9a289951 chore: drop env var flag (#5462) 2025-10-21 16:11:12 +00:00
Gabriel Peal
a517f6f55b Fix flaky auth tests (#5461)
This #[serial] approach is not ideal. I am tracking a separate issue to
create an injectable env var provider but I want to fix these tests
first.

Fixes #5447
2025-10-21 09:08:34 -07:00
pakrym-oai
789e65b9d2 Pass TurnContext around instead of sub_id (#5421)
Today `sub_id` is an ID of a single incoming Codex Op submition. We then
associate all events triggered by this operation using the same
`sub_id`.

At the same time we are also creating a TurnContext per submission and
we'd like to start associating some events (item added/item completed)
with an entire turn instead of just the operation that started it.

Using turn context when sending events give us flexibility to change
notification scheme.
2025-10-21 08:04:16 -07:00
Gabriel Peal
42d5c35020 [MCP] Bump rmcp to 0.8.2 (#5423)
[Release
notes](https://github.com/modelcontextprotocol/rust-sdk/releases)

Notably, this picks up two of my PRs that have four separate fixes for
oauth dynamic client registration and auth
https://github.com/modelcontextprotocol/rust-sdk/pull/489
https://github.com/modelcontextprotocol/rust-sdk/pull/476
2025-10-20 21:19:05 -07:00
Dylan
ab95eaa356 fix(tui): Update WSL instructions (#5307)
## Summary
Clearer and more complete WSL instructions in our shell message.

## Testing
- [x] Tested locally

---------

Co-authored-by: Josh McKinney <joshka@openai.com>
2025-10-20 17:46:14 -07:00
Thibault Sottiaux
7fc01c6e9b feat: include cwd in notify payload (#5415)
Expose the session cwd in the notify payload and update docs so scripts
and extensions receive the real project path; users get accurate
project-aware notifications in CLI and VS Code.

Fixes #5387
2025-10-20 23:53:03 +00:00
Javi
df15a2f6ef chore(ci): Speed up macOS builds by using larger runner (#5234)
Saves about 2min per build

https://github.com/openai/codex/actions/runs/18544852356/job/52860637804
vs
https://github.com/openai/codex/actions/runs/18545106208/job/52861427485
2025-10-20 23:47:38 +00:00
Gabriel Peal
ef806456e4 [MCP] Dedicated error message for GitHub MCPs missing a personal access token (#5393)
Because the GitHub MCP is one of the most popular MCPs and it
confusingly doesn't support OAuth, we should make it more clear how to
make it work so people don't think Codex is broken.
2025-10-20 16:23:26 -07:00
Thibault Sottiaux
bd6ab8c665 docs: correct getting-started behaviors (#5407) 2025-10-20 16:17:07 -07:00
Thibault Sottiaux
d2bae07687 docs: document exec json events (#5399) 2025-10-20 16:11:36 -07:00
Thibault Sottiaux
9c09094583 docs: remove stale contribution reference (#5400) 2025-10-20 16:11:14 -07:00
Thibault Sottiaux
7e4ab31488 docs: clarify prompt metadata behavior (#5403) 2025-10-20 16:09:47 -07:00
Gabriel Peal
32d50bda94 Treat zsh -lc like bash -lc (#5411)
Without proper `zsh -lc` parsing, we lose some things like proper
command parsing, turn diff tracking, safe command checks, and other
things we expect from raw or `bash -lc` commands.
2025-10-20 15:52:25 -07:00
Gabriel Peal
740b4a95f4 [MCP] Add configuration options to enable or disable specific tools (#5367)
Some MCP servers expose a lot of tools. In those cases, it is reasonable
to allow/denylist tools for Codex to use so it doesn't get overwhelmed
with too many tools.

The new configuration options available in the `mcp_server` toml table
are:
* `enabled_tools`
* `disabled_tools`

Fixes #4796
2025-10-20 15:35:36 -07:00
Thibault Sottiaux
c37469b5ba docs: clarify responses proxy metadata (#5406) 2025-10-20 15:04:02 -07:00
Thibault Sottiaux
c782f8c68d docs: update advanced guide details (#5395) 2025-10-20 15:00:42 -07:00
pakrym-oai
7d6e318f87 Reduce symbol size for tests (#5389)
Test executables were huge because of detailed debugging symbols. Switch
to less rich debugging symbols.
2025-10-20 14:52:37 -07:00
Jeremy Rose
58159383c4 fix terminal corruption that could happen when onboarding and update banner (#5269)
Instead of printing characters before booting the app, make the upgrade
banner a history cell so it's well-behaved.

<img width="771" height="586" alt="Screenshot 2025-10-16 at 4 20 51 PM"
src="https://github.com/user-attachments/assets/90629d47-2c3d-4970-a826-283795ab34e5"
/>

---------

Co-authored-by: Josh McKinney <joshka@openai.com>
2025-10-20 21:40:14 +00:00
Owen Lin
5c680c6587 [app-server] read rate limits API (#5302)
Adds a `GET account/rateLimits/read` API to app-server. This calls the
codex backend to fetch the user's current rate limits.

This would be helpful in checking rate limits without having to send a
message.

For calling the codex backend usage API, I generated the types and
manually copied the relevant ones into `codex-backend-openapi-types`.
It'll be nice to extend our internal openapi generator to support Rust
so we don't have to run these manual steps.

# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.
2025-10-20 14:11:54 -07:00
Jeremy Rose
39a2446716 tui: drop citation rendering (#4855)
We don't instruct the model to use citations, so it never emits them.
Further, ratatui [doesn't currently support rendering links into the
terminal with OSC 8](https://github.com/ratatui/ratatui/issues/1028), so
even if we did parse citations, we can't correctly render them.

So, remove all the code related to rendering them.
2025-10-20 21:08:19 +00:00
pakrym-oai
9c903c4716 Add ItemStarted/ItemCompleted events for UserInputItem (#5306)
Adds a new ItemStarted event and delivers UserMessage as the first item
type (more to come).


Renames `InputItem` to `UserInput` considering we're using the `Item`
suffix for actual items.
2025-10-20 13:34:44 -07:00
168 changed files with 6638 additions and 3962 deletions

View File

@@ -201,7 +201,7 @@ jobs:
# Tests take too long for release builds to run them on every PR.
if: ${{ matrix.profile != 'release' }}
continue-on-error: true
run: cargo nextest run --all-features --no-fail-fast --target ${{ matrix.target }}
run: cargo nextest run --all-features --no-fail-fast --target ${{ matrix.target }} --cargo-profile ci-test
env:
RUST_BACKTRACE: 1

View File

@@ -58,9 +58,9 @@ jobs:
fail-fast: false
matrix:
include:
- runner: macos-14
- runner: macos-15-xlarge
target: aarch64-apple-darwin
- runner: macos-14
- runner: macos-15-xlarge
target: x86_64-apple-darwin
- runner: ubuntu-24.04
target: x86_64-unknown-linux-musl
@@ -100,7 +100,7 @@ jobs:
- name: Cargo build
run: cargo build --target ${{ matrix.target }} --release --bin codex --bin codex-responses-api-proxy
- if: ${{ matrix.runner == 'macos-14' }}
- if: ${{ matrix.runner == 'macos-15-xlarge' }}
name: Configure Apple code signing
shell: bash
env:
@@ -185,7 +185,7 @@ jobs:
echo "APPLE_CODESIGN_KEYCHAIN=$keychain_path" >> "$GITHUB_ENV"
echo "::add-mask::$APPLE_CODESIGN_IDENTITY"
- if: ${{ matrix.runner == 'macos-14' }}
- if: ${{ matrix.runner == 'macos-15-xlarge' }}
name: Sign macOS binaries
shell: bash
run: |
@@ -206,7 +206,7 @@ jobs:
codesign --force --options runtime --timestamp --sign "$APPLE_CODESIGN_IDENTITY" "${keychain_args[@]}" "$path"
done
- if: ${{ matrix.runner == 'macos-14' }}
- if: ${{ matrix.runner == 'macos-15-xlarge' }}
name: Notarize macOS binaries
shell: bash
env:
@@ -328,7 +328,7 @@ jobs:
done
- name: Remove signing keychain
if: ${{ always() && matrix.runner == 'macos-14' }}
if: ${{ always() && matrix.runner == 'macos-15-xlarge' }}
shell: bash
env:
APPLE_CODESIGN_KEYCHAIN: ${{ env.APPLE_CODESIGN_KEYCHAIN }}

94
codex-rs/Cargo.lock generated
View File

@@ -182,7 +182,10 @@ version = "0.0.0"
dependencies = [
"anyhow",
"assert_cmd",
"base64",
"chrono",
"codex-app-server-protocol",
"codex-core",
"serde",
"serde_json",
"tokio",
@@ -834,8 +837,10 @@ dependencies = [
"app_test_support",
"assert_cmd",
"base64",
"chrono",
"codex-app-server-protocol",
"codex-arg0",
"codex-backend-client",
"codex-common",
"codex-core",
"codex-file-search",
@@ -843,6 +848,7 @@ dependencies = [
"codex-protocol",
"codex-utils-json-to-toml",
"core_test_support",
"opentelemetry-appender-tracing",
"os_info",
"pretty_assertions",
"serde",
@@ -917,6 +923,8 @@ version = "0.0.0"
dependencies = [
"anyhow",
"codex-backend-openapi-models",
"codex-core",
"codex-protocol",
"pretty_assertions",
"reqwest",
"serde",
@@ -929,6 +937,7 @@ version = "0.0.0"
dependencies = [
"serde",
"serde_json",
"serde_with",
]
[[package]]
@@ -1052,7 +1061,6 @@ dependencies = [
"codex-apply-patch",
"codex-async-utils",
"codex-file-search",
"codex-mcp-client",
"codex-otel",
"codex-protocol",
"codex-rmcp-client",
@@ -1240,19 +1248,6 @@ dependencies = [
"wiremock",
]
[[package]]
name = "codex-mcp-client"
version = "0.0.0"
dependencies = [
"anyhow",
"mcp-types",
"serde",
"serde_json",
"tokio",
"tracing",
"tracing-subscriber",
]
[[package]]
name = "codex-mcp-server"
version = "0.0.0"
@@ -1446,12 +1441,12 @@ dependencies = [
"libc",
"mcp-types",
"opentelemetry-appender-tracing",
"path-clean",
"pathdiff",
"pretty_assertions",
"pulldown-cmark",
"rand 0.9.2",
"ratatui",
"ratatui-macros",
"regex-lite",
"serde",
"serde_json",
@@ -1508,6 +1503,16 @@ dependencies = [
name = "codex-utils-string"
version = "0.0.0"
[[package]]
name = "codex-utils-tokenizer"
version = "0.0.0"
dependencies = [
"anyhow",
"pretty_assertions",
"thiserror 2.0.16",
"tiktoken-rs",
]
[[package]]
name = "color-eyre"
version = "0.6.5"
@@ -2305,6 +2310,17 @@ dependencies = [
"once_cell",
]
[[package]]
name = "fancy-regex"
version = "0.13.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "531e46835a22af56d1e3b66f04844bed63158bc094a628bec1d321d9b4c44bf2"
dependencies = [
"bit-set",
"regex-automata",
"regex-syntax 0.8.5",
]
[[package]]
name = "fastrand"
version = "2.3.0"
@@ -4277,12 +4293,6 @@ dependencies = [
"path-dedot",
]
[[package]]
name = "path-clean"
version = "1.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "17359afc20d7ab31fdb42bb844c8b3bb1dabd7dcf7e68428492da7f16966fcef"
[[package]]
name = "path-dedot"
version = "3.1.1"
@@ -4628,7 +4638,7 @@ dependencies = [
"pin-project-lite",
"quinn-proto",
"quinn-udp",
"rustc-hash",
"rustc-hash 2.1.1",
"rustls",
"socket2 0.6.0",
"thiserror 2.0.16",
@@ -4648,7 +4658,7 @@ dependencies = [
"lru-slab",
"rand 0.9.2",
"ring",
"rustc-hash",
"rustc-hash 2.1.1",
"rustls",
"rustls-pki-types",
"slab",
@@ -4776,6 +4786,15 @@ dependencies = [
"unicode-width 0.2.1",
]
[[package]]
name = "ratatui-macros"
version = "0.6.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6fef540f80dbe8a0773266fa6077788ceb65ef624cdbf36e131aaf90b4a52df4"
dependencies = [
"ratatui",
]
[[package]]
name = "redox_syscall"
version = "0.5.15"
@@ -4933,9 +4952,9 @@ dependencies = [
[[package]]
name = "rmcp"
version = "0.8.1"
version = "0.8.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6f35acda8f89fca5fd8c96cae3c6d5b4c38ea0072df4c8030915f3b5ff469c1c"
checksum = "4e35d31f89beb59c83bc31363426da25b323ce0c2e5b53c7bf29867d16ee7898"
dependencies = [
"base64",
"bytes",
@@ -4967,9 +4986,9 @@ dependencies = [
[[package]]
name = "rmcp-macros"
version = "0.8.1"
version = "0.8.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c9f1d5220aaa23b79c3d02e18f7a554403b3ccea544bbb6c69d6bcb3e854a274"
checksum = "d88518b38110c439a03f0f4eee40e5105d648a530711cb87f98991e3f324a664"
dependencies = [
"darling 0.21.3",
"proc-macro2",
@@ -4984,6 +5003,12 @@ version = "0.1.25"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "989e6739f80c4ad5b13e0fd7fe89531180375b18520cc8c82080e4dc4035b84f"
[[package]]
name = "rustc-hash"
version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "08d43f7aa6b08d49f382cde6a7982047c3426db949b1424bc4b7ec9ae12c6ce2"
[[package]]
name = "rustc-hash"
version = "2.1.1"
@@ -6164,6 +6189,21 @@ dependencies = [
"zune-jpeg",
]
[[package]]
name = "tiktoken-rs"
version = "0.7.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "25563eeba904d770acf527e8b370fe9a5547bacd20ff84a0b6c3bc41288e5625"
dependencies = [
"anyhow",
"base64",
"bstr",
"fancy-regex",
"lazy_static",
"regex",
"rustc-hash 1.1.0",
]
[[package]]
name = "time"
version = "0.3.44"

View File

@@ -20,7 +20,6 @@ members = [
"git-tooling",
"linux-sandbox",
"login",
"mcp-client",
"mcp-server",
"mcp-types",
"ollama",
@@ -37,6 +36,7 @@ members = [
"utils/readiness",
"utils/pty",
"utils/string",
"utils/tokenizer",
]
resolver = "2"
@@ -57,6 +57,7 @@ codex-app-server-protocol = { path = "app-server-protocol" }
codex-apply-patch = { path = "apply-patch" }
codex-arg0 = { path = "arg0" }
codex-async-utils = { path = "async-utils" }
codex-backend-client = { path = "backend-client" }
codex-chatgpt = { path = "chatgpt" }
codex-common = { path = "common" }
codex-core = { path = "core" }
@@ -66,7 +67,6 @@ codex-file-search = { path = "file-search" }
codex-git-tooling = { path = "git-tooling" }
codex-linux-sandbox = { path = "linux-sandbox" }
codex-login = { path = "login" }
codex-mcp-client = { path = "mcp-client" }
codex-mcp-server = { path = "mcp-server" }
codex-ollama = { path = "ollama" }
codex-otel = { path = "otel" }
@@ -78,9 +78,10 @@ codex-rmcp-client = { path = "rmcp-client" }
codex-stdio-to-uds = { path = "stdio-to-uds" }
codex-tui = { path = "tui" }
codex-utils-json-to-toml = { path = "utils/json-to-toml" }
codex-utils-readiness = { path = "utils/readiness" }
codex-utils-pty = { path = "utils/pty" }
codex-utils-readiness = { path = "utils/readiness" }
codex-utils-string = { path = "utils/string" }
codex-utils-tokenizer = { path = "utils/tokenizer" }
core_test_support = { path = "core/tests/common" }
mcp-types = { path = "mcp-types" }
mcp_test_support = { path = "mcp-server/tests/common" }
@@ -142,7 +143,6 @@ os_info = "3.12.0"
owo-colors = "4.2.0"
paste = "1.0.15"
path-absolutize = "3.1.1"
path-clean = "1.0.1"
pathdiff = "0.2"
portable-pty = "0.9.0"
predicates = "3"
@@ -150,9 +150,10 @@ pretty_assertions = "1.4.1"
pulldown-cmark = "0.10"
rand = "0.9"
ratatui = "0.29.0"
ratatui-macros = "0.6.0"
regex-lite = "0.1.7"
reqwest = "0.12"
rmcp = { version = "0.8.0", default-features = false }
rmcp = { version = "0.8.2", default-features = false }
schemars = "0.8.22"
seccompiler = "0.5.0"
sentry = "0.34.0"
@@ -245,7 +246,7 @@ unwrap_used = "deny"
# cargo-shear cannot see the platform-specific openssl-sys usage, so we
# silence the false positive here instead of deleting a real dependency.
[workspace.metadata.cargo-shear]
ignored = ["openssl-sys", "codex-utils-readiness"]
ignored = ["openssl-sys", "codex-utils-readiness", "codex-utils-tokenizer"]
[profile.release]
lto = "fat"
@@ -256,6 +257,11 @@ strip = "symbols"
# See https://github.com/openai/codex/issues/1411 for details.
codegen-units = 1
[profile.ci-test]
debug = 1 # Reduce debug symbol size
inherits = "test"
opt-level = 0
[patch.crates-io]
# Uncomment to debug local changes.
# ratatui = { path = "../../ratatui" }

View File

@@ -14,6 +14,7 @@ use codex_protocol::parse_command::ParsedCommand;
use codex_protocol::protocol::AskForApproval;
use codex_protocol::protocol::EventMsg;
use codex_protocol::protocol::FileChange;
use codex_protocol::protocol::RateLimitSnapshot;
use codex_protocol::protocol::ReviewDecision;
use codex_protocol::protocol::SandboxPolicy;
use codex_protocol::protocol::TurnAbortReason;
@@ -105,6 +106,13 @@ client_request_definitions! {
params: ListConversationsParams,
response: ListConversationsResponse,
},
#[serde(rename = "model/list")]
#[ts(rename = "model/list")]
/// List available Codex models along with display metadata.
ListModels {
params: ListModelsParams,
response: ListModelsResponse,
},
/// Resume a recorded Codex conversation from a rollout file.
ResumeConversation {
params: ResumeConversationParams,
@@ -183,6 +191,12 @@ client_request_definitions! {
params: ExecOneOffCommandParams,
response: ExecOneOffCommandResponse,
},
#[serde(rename = "account/rateLimits/read")]
#[ts(rename = "account/rateLimits/read")]
GetAccountRateLimits {
params: #[ts(type = "undefined")] #[serde(skip_serializing_if = "Option::is_none")] Option<()>,
response: GetAccountRateLimitsResponse,
},
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Default, JsonSchema, TS)]
@@ -240,10 +254,6 @@ pub struct NewConversationParams {
#[serde(skip_serializing_if = "Option::is_none")]
pub base_instructions: Option<String>,
/// Whether to include the plan tool in the conversation.
#[serde(skip_serializing_if = "Option::is_none")]
pub include_plan_tool: Option<bool>,
/// Whether to include the apply patch tool in the conversation.
#[serde(skip_serializing_if = "Option::is_none")]
pub include_apply_patch_tool: Option<bool>,
@@ -301,6 +311,47 @@ pub struct ListConversationsResponse {
pub next_cursor: Option<String>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Default, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct ListModelsParams {
/// Optional page size; defaults to a reasonable server-side value.
#[serde(skip_serializing_if = "Option::is_none")]
pub page_size: Option<usize>,
/// Opaque pagination cursor returned by a previous call.
#[serde(skip_serializing_if = "Option::is_none")]
pub cursor: Option<String>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct Model {
pub id: String,
pub model: String,
pub display_name: String,
pub description: String,
pub supported_reasoning_efforts: Vec<ReasoningEffortOption>,
pub default_reasoning_effort: ReasoningEffort,
// Only one model should be marked as default.
pub is_default: bool,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct ReasoningEffortOption {
pub reasoning_effort: ReasoningEffort,
pub description: String,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct ListModelsResponse {
pub items: Vec<Model>,
/// Opaque cursor to pass to the next call to continue after the last item.
/// if None, there are no more items to return.
#[serde(skip_serializing_if = "Option::is_none")]
pub next_cursor: Option<String>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct ResumeConversationParams {
@@ -420,6 +471,12 @@ pub struct ExecOneOffCommandResponse {
pub stderr: String,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct GetAccountRateLimitsResponse {
pub rate_limits: RateLimitSnapshot,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct GetAuthStatusResponse {
@@ -873,7 +930,6 @@ mod tests {
sandbox: None,
config: None,
base_instructions: None,
include_plan_tool: None,
include_apply_patch_tool: None,
},
};
@@ -970,4 +1026,37 @@ mod tests {
assert_eq!(payload.request_with_id(RequestId::Integer(7)), request);
Ok(())
}
#[test]
fn serialize_get_account_rate_limits() -> Result<()> {
let request = ClientRequest::GetAccountRateLimits {
request_id: RequestId::Integer(1),
params: None,
};
assert_eq!(
json!({
"method": "account/rateLimits/read",
"id": 1,
}),
serde_json::to_value(&request)?,
);
Ok(())
}
#[test]
fn serialize_list_models() -> Result<()> {
let request = ClientRequest::ListModels {
request_id: RequestId::Integer(2),
params: ListModelsParams::default(),
};
assert_eq!(
json!({
"method": "model/list",
"id": 2,
"params": {}
}),
serde_json::to_value(&request)?,
);
Ok(())
}
}

View File

@@ -19,11 +19,13 @@ anyhow = { workspace = true }
codex-arg0 = { workspace = true }
codex-common = { workspace = true, features = ["cli"] }
codex-core = { workspace = true }
codex-backend-client = { workspace = true }
codex-file-search = { workspace = true }
codex-login = { workspace = true }
codex-protocol = { workspace = true }
codex-app-server-protocol = { workspace = true }
codex-utils-json-to-toml = { workspace = true }
chrono = { workspace = true }
serde = { workspace = true, features = ["derive"] }
serde_json = { workspace = true }
tokio = { workspace = true, features = [
@@ -35,6 +37,7 @@ tokio = { workspace = true, features = [
] }
tracing = { workspace = true, features = ["log"] }
tracing-subscriber = { workspace = true, features = ["env-filter", "fmt"] }
opentelemetry-appender-tracing = { workspace = true }
uuid = { workspace = true, features = ["serde", "v7"] }
[dev-dependencies]

View File

@@ -1,6 +1,7 @@
use crate::error_code::INTERNAL_ERROR_CODE;
use crate::error_code::INVALID_REQUEST_ERROR_CODE;
use crate::fuzzy_file_search::run_fuzzy_file_search;
use crate::models::supported_models;
use crate::outgoing_message::OutgoingMessageSender;
use crate::outgoing_message::OutgoingNotification;
use codex_app_server_protocol::AddConversationListenerParams;
@@ -9,6 +10,7 @@ use codex_app_server_protocol::ApplyPatchApprovalParams;
use codex_app_server_protocol::ApplyPatchApprovalResponse;
use codex_app_server_protocol::ArchiveConversationParams;
use codex_app_server_protocol::ArchiveConversationResponse;
use codex_app_server_protocol::AuthMode;
use codex_app_server_protocol::AuthStatusChangeNotification;
use codex_app_server_protocol::ClientRequest;
use codex_app_server_protocol::ConversationSummary;
@@ -18,6 +20,7 @@ use codex_app_server_protocol::ExecOneOffCommandParams;
use codex_app_server_protocol::ExecOneOffCommandResponse;
use codex_app_server_protocol::FuzzyFileSearchParams;
use codex_app_server_protocol::FuzzyFileSearchResponse;
use codex_app_server_protocol::GetAccountRateLimitsResponse;
use codex_app_server_protocol::GetUserAgentResponse;
use codex_app_server_protocol::GetUserSavedConfigResponse;
use codex_app_server_protocol::GitDiffToRemoteResponse;
@@ -27,6 +30,8 @@ use codex_app_server_protocol::InterruptConversationResponse;
use codex_app_server_protocol::JSONRPCErrorError;
use codex_app_server_protocol::ListConversationsParams;
use codex_app_server_protocol::ListConversationsResponse;
use codex_app_server_protocol::ListModelsParams;
use codex_app_server_protocol::ListModelsResponse;
use codex_app_server_protocol::LoginApiKeyParams;
use codex_app_server_protocol::LoginApiKeyResponse;
use codex_app_server_protocol::LoginChatGptCompleteNotification;
@@ -49,6 +54,7 @@ use codex_app_server_protocol::SetDefaultModelParams;
use codex_app_server_protocol::SetDefaultModelResponse;
use codex_app_server_protocol::UserInfoResponse;
use codex_app_server_protocol::UserSavedConfig;
use codex_backend_client::Client as BackendClient;
use codex_core::AuthManager;
use codex_core::CodexConversation;
use codex_core::ConversationManager;
@@ -77,7 +83,6 @@ use codex_core::protocol::ApplyPatchApprovalRequestEvent;
use codex_core::protocol::Event;
use codex_core::protocol::EventMsg;
use codex_core::protocol::ExecApprovalRequestEvent;
use codex_core::protocol::InputItem as CoreInputItem;
use codex_core::protocol::Op;
use codex_core::protocol::ReviewDecision;
use codex_login::ServerOptions as LoginServerOptions;
@@ -85,10 +90,11 @@ use codex_login::ShutdownHandle;
use codex_login::run_login_server;
use codex_protocol::ConversationId;
use codex_protocol::config_types::ForcedLoginMethod;
use codex_protocol::models::ContentItem;
use codex_protocol::items::TurnItem;
use codex_protocol::models::ResponseItem;
use codex_protocol::protocol::InputMessageKind;
use codex_protocol::protocol::RateLimitSnapshot;
use codex_protocol::protocol::USER_MESSAGE_BEGIN;
use codex_protocol::user_input::UserInput as CoreInputItem;
use codex_utils_json_to_toml::json_to_toml;
use std::collections::HashMap;
use std::ffi::OsStr;
@@ -107,7 +113,6 @@ use uuid::Uuid;
// Duration before a ChatGPT login attempt is abandoned.
const LOGIN_CHATGPT_TIMEOUT: Duration = Duration::from_secs(10 * 60);
struct ActiveLogin {
shutdown_handle: ShutdownHandle,
login_id: Uuid,
@@ -168,6 +173,9 @@ impl CodexMessageProcessor {
ClientRequest::ListConversations { request_id, params } => {
self.handle_list_conversations(request_id, params).await;
}
ClientRequest::ListModels { request_id, params } => {
self.list_models(request_id, params).await;
}
ClientRequest::ResumeConversation { request_id, params } => {
self.handle_resume_conversation(request_id, params).await;
}
@@ -240,6 +248,12 @@ impl CodexMessageProcessor {
ClientRequest::ExecOneOffCommand { request_id, params } => {
self.exec_one_off_command(request_id, params).await;
}
ClientRequest::GetAccountRateLimits {
request_id,
params: _,
} => {
self.get_account_rate_limits(request_id).await;
}
}
}
@@ -527,6 +541,53 @@ impl CodexMessageProcessor {
self.outgoing.send_response(request_id, response).await;
}
async fn get_account_rate_limits(&self, request_id: RequestId) {
match self.fetch_account_rate_limits().await {
Ok(rate_limits) => {
let response = GetAccountRateLimitsResponse { rate_limits };
self.outgoing.send_response(request_id, response).await;
}
Err(error) => {
self.outgoing.send_error(request_id, error).await;
}
}
}
async fn fetch_account_rate_limits(&self) -> Result<RateLimitSnapshot, JSONRPCErrorError> {
let Some(auth) = self.auth_manager.auth() else {
return Err(JSONRPCErrorError {
code: INVALID_REQUEST_ERROR_CODE,
message: "codex account authentication required to read rate limits".to_string(),
data: None,
});
};
if auth.mode != AuthMode::ChatGPT {
return Err(JSONRPCErrorError {
code: INVALID_REQUEST_ERROR_CODE,
message: "chatgpt authentication required to read rate limits".to_string(),
data: None,
});
}
let client = BackendClient::from_auth(self.config.chatgpt_base_url.clone(), &auth)
.await
.map_err(|err| JSONRPCErrorError {
code: INTERNAL_ERROR_CODE,
message: format!("failed to construct backend client: {err}"),
data: None,
})?;
client
.get_rate_limits()
.await
.map_err(|err| JSONRPCErrorError {
code: INTERNAL_ERROR_CODE,
message: format!("failed to fetch codex rate limits: {err}"),
data: None,
})
}
async fn get_user_saved_config(&self, request_id: RequestId) {
let toml_value = match load_config_as_toml(&self.config.codex_home).await {
Ok(val) => val,
@@ -774,6 +835,58 @@ impl CodexMessageProcessor {
self.outgoing.send_response(request_id, response).await;
}
async fn list_models(&self, request_id: RequestId, params: ListModelsParams) {
let ListModelsParams { page_size, cursor } = params;
let models = supported_models();
let total = models.len();
if total == 0 {
let response = ListModelsResponse {
items: Vec::new(),
next_cursor: None,
};
self.outgoing.send_response(request_id, response).await;
return;
}
let effective_page_size = page_size.unwrap_or(total).max(1).min(total);
let start = match cursor {
Some(cursor) => match cursor.parse::<usize>() {
Ok(idx) => idx,
Err(_) => {
let error = JSONRPCErrorError {
code: INVALID_REQUEST_ERROR_CODE,
message: format!("invalid cursor: {cursor}"),
data: None,
};
self.outgoing.send_error(request_id, error).await;
return;
}
},
None => 0,
};
if start > total {
let error = JSONRPCErrorError {
code: INVALID_REQUEST_ERROR_CODE,
message: format!("cursor {start} exceeds total models {total}"),
data: None,
};
self.outgoing.send_error(request_id, error).await;
return;
}
let end = start.saturating_add(effective_page_size).min(total);
let items = models[start..end].to_vec();
let next_cursor = if end < total {
Some(end.to_string())
} else {
None
};
let response = ListModelsResponse { items, next_cursor };
self.outgoing.send_response(request_id, response).await;
}
async fn handle_resume_conversation(
&self,
request_id: RequestId,
@@ -826,18 +939,9 @@ impl CodexMessageProcessor {
},
))
.await;
let initial_messages = session_configured.initial_messages.map(|msgs| {
msgs.into_iter()
.filter(|event| {
// Don't send non-plain user messages (like user instructions
// or environment context) back so they don't get rendered.
if let EventMsg::UserMessage(user_message) = event {
return matches!(user_message.kind, Some(InputMessageKind::Plain));
}
true
})
.collect()
});
let initial_messages = session_configured
.initial_messages
.map(|msgs| msgs.into_iter().collect());
// Reply with conversation id + model and initial messages (when present)
let response = codex_app_server_protocol::ResumeConversationResponse {
@@ -1364,7 +1468,6 @@ async fn derive_config_from_params(
sandbox: sandbox_mode,
config: cli_overrides,
base_instructions,
include_plan_tool,
include_apply_patch_tool,
} = params;
let overrides = ConfigOverrides {
@@ -1377,7 +1480,6 @@ async fn derive_config_from_params(
model_provider: None,
codex_linux_sandbox_exe,
base_instructions,
include_plan_tool,
include_apply_patch_tool,
include_view_image_tool: None,
show_raw_agent_reasoning: None,
@@ -1484,18 +1586,8 @@ fn extract_conversation_summary(
let preview = head
.iter()
.filter_map(|value| serde_json::from_value::<ResponseItem>(value.clone()).ok())
.find_map(|item| match item {
ResponseItem::Message { content, .. } => {
content.into_iter().find_map(|content| match content {
ContentItem::InputText { text } => {
match InputMessageKind::from(("user", &text)) {
InputMessageKind::Plain => Some(text),
_ => None,
}
}
_ => None,
})
}
.find_map(|item| match codex_core::parse_turn_item(&item) {
Some(TurnItem::UserMessage(user)) => Some(user.message()),
_ => None,
})?;

View File

@@ -1,13 +1,16 @@
#![deny(clippy::print_stdout, clippy::print_stderr)]
use std::io::ErrorKind;
use std::io::Result as IoResult;
use std::path::PathBuf;
use codex_common::CliConfigOverrides;
use codex_core::config::Config;
use codex_core::config::ConfigOverrides;
use opentelemetry_appender_tracing::layer::OpenTelemetryTracingBridge;
use std::io::ErrorKind;
use std::io::Result as IoResult;
use std::path::PathBuf;
use crate::message_processor::MessageProcessor;
use crate::outgoing_message::OutgoingMessage;
use crate::outgoing_message::OutgoingMessageSender;
use codex_app_server_protocol::JSONRPCMessage;
use tokio::io::AsyncBufReadExt;
use tokio::io::AsyncWriteExt;
@@ -18,15 +21,15 @@ use tracing::debug;
use tracing::error;
use tracing::info;
use tracing_subscriber::EnvFilter;
use crate::message_processor::MessageProcessor;
use crate::outgoing_message::OutgoingMessage;
use crate::outgoing_message::OutgoingMessageSender;
use tracing_subscriber::Layer;
use tracing_subscriber::layer::SubscriberExt;
use tracing_subscriber::util::SubscriberInitExt;
mod codex_message_processor;
mod error_code;
mod fuzzy_file_search;
mod message_processor;
mod models;
mod outgoing_message;
/// Size of the bounded channels used to communicate between tasks. The value
@@ -38,13 +41,6 @@ pub async fn run_main(
codex_linux_sandbox_exe: Option<PathBuf>,
cli_config_overrides: CliConfigOverrides,
) -> IoResult<()> {
// Install a simple subscriber so `tracing` output is visible. Users can
// control the log level with `RUST_LOG`.
tracing_subscriber::fmt()
.with_writer(std::io::stderr)
.with_env_filter(EnvFilter::from_default_env())
.init();
// Set up channels.
let (incoming_tx, mut incoming_rx) = mpsc::channel::<JSONRPCMessage>(CHANNEL_CAPACITY);
let (outgoing_tx, mut outgoing_rx) = mpsc::unbounded_channel::<OutgoingMessage>();
@@ -86,6 +82,29 @@ pub async fn run_main(
std::io::Error::new(ErrorKind::InvalidData, format!("error loading config: {e}"))
})?;
let otel =
codex_core::otel_init::build_provider(&config, env!("CARGO_PKG_VERSION")).map_err(|e| {
std::io::Error::new(
ErrorKind::InvalidData,
format!("error loading otel config: {e}"),
)
})?;
// Install a simple subscriber so `tracing` output is visible. Users can
// control the log level with `RUST_LOG`.
let stderr_fmt = tracing_subscriber::fmt::layer()
.with_writer(std::io::stderr)
.with_filter(EnvFilter::from_default_env());
let _ = tracing_subscriber::registry()
.with(stderr_fmt)
.with(otel.as_ref().map(|provider| {
OpenTelemetryTracingBridge::new(&provider.logger).with_filter(
tracing_subscriber::filter::filter_fn(codex_core::otel_init::codex_export_filter),
)
}))
.try_init();
// Task: process incoming messages.
let processor_handle = tokio::spawn({
let outgoing_message_sender = OutgoingMessageSender::new(outgoing_tx);

View File

@@ -0,0 +1,38 @@
use codex_app_server_protocol::Model;
use codex_app_server_protocol::ReasoningEffortOption;
use codex_common::model_presets::ModelPreset;
use codex_common::model_presets::ReasoningEffortPreset;
use codex_common::model_presets::builtin_model_presets;
pub fn supported_models() -> Vec<Model> {
builtin_model_presets(None)
.into_iter()
.map(model_from_preset)
.collect()
}
fn model_from_preset(preset: ModelPreset) -> Model {
Model {
id: preset.id.to_string(),
model: preset.model.to_string(),
display_name: preset.display_name.to_string(),
description: preset.description.to_string(),
supported_reasoning_efforts: reasoning_efforts_from_preset(
preset.supported_reasoning_efforts,
),
default_reasoning_effort: preset.default_reasoning_effort,
is_default: preset.is_default,
}
}
fn reasoning_efforts_from_preset(
efforts: &'static [ReasoningEffortPreset],
) -> Vec<ReasoningEffortOption> {
efforts
.iter()
.map(|preset| ReasoningEffortOption {
reasoning_effort: preset.effort,
description: preset.description.to_string(),
})
.collect()
}

View File

@@ -9,7 +9,10 @@ path = "lib.rs"
[dependencies]
anyhow = { workspace = true }
assert_cmd = { workspace = true }
base64 = { workspace = true }
chrono = { workspace = true }
codex-app-server-protocol = { workspace = true }
codex-core = { workspace = true }
serde = { workspace = true }
serde_json = { workspace = true }
tokio = { workspace = true, features = [

View File

@@ -0,0 +1,131 @@
use std::path::Path;
use anyhow::Context;
use anyhow::Result;
use base64::Engine;
use base64::engine::general_purpose::URL_SAFE_NO_PAD;
use chrono::DateTime;
use chrono::Utc;
use codex_core::auth::AuthDotJson;
use codex_core::auth::get_auth_file;
use codex_core::auth::write_auth_json;
use codex_core::token_data::TokenData;
use codex_core::token_data::parse_id_token;
use serde_json::json;
/// Builder for writing a fake ChatGPT auth.json in tests.
#[derive(Debug, Clone)]
pub struct ChatGptAuthFixture {
access_token: String,
refresh_token: String,
account_id: Option<String>,
claims: ChatGptIdTokenClaims,
last_refresh: Option<Option<DateTime<Utc>>>,
}
impl ChatGptAuthFixture {
pub fn new(access_token: impl Into<String>) -> Self {
Self {
access_token: access_token.into(),
refresh_token: "refresh-token".to_string(),
account_id: None,
claims: ChatGptIdTokenClaims::default(),
last_refresh: None,
}
}
pub fn refresh_token(mut self, refresh_token: impl Into<String>) -> Self {
self.refresh_token = refresh_token.into();
self
}
pub fn account_id(mut self, account_id: impl Into<String>) -> Self {
self.account_id = Some(account_id.into());
self
}
pub fn plan_type(mut self, plan_type: impl Into<String>) -> Self {
self.claims.plan_type = Some(plan_type.into());
self
}
pub fn email(mut self, email: impl Into<String>) -> Self {
self.claims.email = Some(email.into());
self
}
pub fn last_refresh(mut self, last_refresh: Option<DateTime<Utc>>) -> Self {
self.last_refresh = Some(last_refresh);
self
}
pub fn claims(mut self, claims: ChatGptIdTokenClaims) -> Self {
self.claims = claims;
self
}
}
#[derive(Debug, Clone, Default)]
pub struct ChatGptIdTokenClaims {
pub email: Option<String>,
pub plan_type: Option<String>,
}
impl ChatGptIdTokenClaims {
pub fn new() -> Self {
Self::default()
}
pub fn email(mut self, email: impl Into<String>) -> Self {
self.email = Some(email.into());
self
}
pub fn plan_type(mut self, plan_type: impl Into<String>) -> Self {
self.plan_type = Some(plan_type.into());
self
}
}
pub fn encode_id_token(claims: &ChatGptIdTokenClaims) -> Result<String> {
let header = json!({ "alg": "none", "typ": "JWT" });
let mut payload = serde_json::Map::new();
if let Some(email) = &claims.email {
payload.insert("email".to_string(), json!(email));
}
if let Some(plan_type) = &claims.plan_type {
payload.insert(
"https://api.openai.com/auth".to_string(),
json!({ "chatgpt_plan_type": plan_type }),
);
}
let payload = serde_json::Value::Object(payload);
let header_b64 =
URL_SAFE_NO_PAD.encode(serde_json::to_vec(&header).context("serialize jwt header")?);
let payload_b64 =
URL_SAFE_NO_PAD.encode(serde_json::to_vec(&payload).context("serialize jwt payload")?);
let signature_b64 = URL_SAFE_NO_PAD.encode(b"signature");
Ok(format!("{header_b64}.{payload_b64}.{signature_b64}"))
}
pub fn write_chatgpt_auth(codex_home: &Path, fixture: ChatGptAuthFixture) -> Result<()> {
let id_token_raw = encode_id_token(&fixture.claims)?;
let id_token = parse_id_token(&id_token_raw).context("parse id token")?;
let tokens = TokenData {
id_token,
access_token: fixture.access_token,
refresh_token: fixture.refresh_token,
account_id: fixture.account_id,
};
let last_refresh = fixture.last_refresh.unwrap_or_else(|| Some(Utc::now()));
let auth = AuthDotJson {
openai_api_key: None,
tokens: Some(tokens),
last_refresh,
};
write_auth_json(&get_auth_file(codex_home), &auth).context("write auth.json")
}

View File

@@ -1,7 +1,12 @@
mod auth_fixtures;
mod mcp_process;
mod mock_model_server;
mod responses;
pub use auth_fixtures::ChatGptAuthFixture;
pub use auth_fixtures::ChatGptIdTokenClaims;
pub use auth_fixtures::encode_id_token;
pub use auth_fixtures::write_chatgpt_auth;
use codex_app_server_protocol::JSONRPCResponse;
pub use mcp_process::McpProcess;
pub use mock_model_server::create_mock_chat_completions_server;

View File

@@ -21,6 +21,7 @@ use codex_app_server_protocol::GetAuthStatusParams;
use codex_app_server_protocol::InitializeParams;
use codex_app_server_protocol::InterruptConversationParams;
use codex_app_server_protocol::ListConversationsParams;
use codex_app_server_protocol::ListModelsParams;
use codex_app_server_protocol::LoginApiKeyParams;
use codex_app_server_protocol::NewConversationParams;
use codex_app_server_protocol::RemoveConversationListenerParams;
@@ -236,6 +237,11 @@ impl McpProcess {
self.send_request("getUserAgent", None).await
}
/// Send an `account/rateLimits/read` JSON-RPC request.
pub async fn send_get_account_rate_limits_request(&mut self) -> anyhow::Result<i64> {
self.send_request("account/rateLimits/read", None).await
}
/// Send a `userInfo` JSON-RPC request.
pub async fn send_user_info_request(&mut self) -> anyhow::Result<i64> {
self.send_request("userInfo", None).await
@@ -259,6 +265,15 @@ impl McpProcess {
self.send_request("listConversations", params).await
}
/// Send a `model/list` JSON-RPC request.
pub async fn send_list_models_request(
&mut self,
params: ListModelsParams,
) -> anyhow::Result<i64> {
let params = Some(serde_json::to_value(params)?);
self.send_request("model/list", params).await
}
/// Send a `resumeConversation` JSON-RPC request.
pub async fn send_resume_conversation_request(
&mut self,

View File

@@ -30,7 +30,6 @@ use codex_protocol::config_types::SandboxMode;
use codex_protocol::parse_command::ParsedCommand;
use codex_protocol::protocol::Event;
use codex_protocol::protocol::EventMsg;
use codex_protocol::protocol::InputMessageKind;
use pretty_assertions::assert_eq;
use std::env;
use tempfile::TempDir;
@@ -528,43 +527,6 @@ async fn test_send_user_turn_updates_sandbox_and_cwd_between_turns() {
.expect("sendUserTurn 2 timeout")
.expect("sendUserTurn 2 resp");
let mut env_message: Option<String> = None;
let second_cwd_str = second_cwd.to_string_lossy().into_owned();
for _ in 0..10 {
let notification = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("codex/event/user_message"),
)
.await
.expect("user_message timeout")
.expect("user_message notification");
let params = notification
.params
.clone()
.expect("user_message should include params");
let event: Event = serde_json::from_value(params).expect("deserialize user_message event");
if let EventMsg::UserMessage(user) = event.msg
&& matches!(user.kind, Some(InputMessageKind::EnvironmentContext))
&& user.message.contains(&second_cwd_str)
{
env_message = Some(user.message);
break;
}
}
let env_message = env_message.expect("expected environment context update");
assert!(
env_message.contains("<sandbox_mode>danger-full-access</sandbox_mode>"),
"env context should reflect new sandbox mode: {env_message}"
);
assert!(
env_message.contains("<network_access>enabled</network_access>"),
"env context should enable network access for danger-full-access policy: {env_message}"
);
assert!(
env_message.contains(&second_cwd_str),
"env context should include updated cwd: {env_message}"
);
let exec_begin_notification = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("codex/event/exec_command_begin"),

View File

@@ -7,6 +7,8 @@ mod fuzzy_file_search;
mod interrupt;
mod list_resume;
mod login;
mod model_list;
mod rate_limits;
mod send_message;
mod set_default_model;
mod user_agent;

View File

@@ -0,0 +1,183 @@
use std::time::Duration;
use anyhow::Result;
use anyhow::anyhow;
use app_test_support::McpProcess;
use app_test_support::to_response;
use codex_app_server_protocol::JSONRPCError;
use codex_app_server_protocol::JSONRPCResponse;
use codex_app_server_protocol::ListModelsParams;
use codex_app_server_protocol::ListModelsResponse;
use codex_app_server_protocol::Model;
use codex_app_server_protocol::ReasoningEffortOption;
use codex_app_server_protocol::RequestId;
use codex_protocol::config_types::ReasoningEffort;
use pretty_assertions::assert_eq;
use tempfile::TempDir;
use tokio::time::timeout;
const DEFAULT_TIMEOUT: Duration = Duration::from_secs(10);
const INVALID_REQUEST_ERROR_CODE: i64 = -32600;
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn list_models_returns_all_models_with_large_limit() -> Result<()> {
let codex_home = TempDir::new()?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
timeout(DEFAULT_TIMEOUT, mcp.initialize()).await??;
let request_id = mcp
.send_list_models_request(ListModelsParams {
page_size: Some(100),
cursor: None,
})
.await?;
let response: JSONRPCResponse = timeout(
DEFAULT_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(request_id)),
)
.await??;
let ListModelsResponse { items, next_cursor } = to_response::<ListModelsResponse>(response)?;
let expected_models = vec![
Model {
id: "gpt-5-codex".to_string(),
model: "gpt-5-codex".to_string(),
display_name: "gpt-5-codex".to_string(),
description: "Optimized for coding tasks with many tools.".to_string(),
supported_reasoning_efforts: vec![
ReasoningEffortOption {
reasoning_effort: ReasoningEffort::Low,
description: "Fastest responses with limited reasoning".to_string(),
},
ReasoningEffortOption {
reasoning_effort: ReasoningEffort::Medium,
description: "Dynamically adjusts reasoning based on the task".to_string(),
},
ReasoningEffortOption {
reasoning_effort: ReasoningEffort::High,
description: "Maximizes reasoning depth for complex or ambiguous problems"
.to_string(),
},
],
default_reasoning_effort: ReasoningEffort::Medium,
is_default: true,
},
Model {
id: "gpt-5".to_string(),
model: "gpt-5".to_string(),
display_name: "gpt-5".to_string(),
description: "Broad world knowledge with strong general reasoning.".to_string(),
supported_reasoning_efforts: vec![
ReasoningEffortOption {
reasoning_effort: ReasoningEffort::Minimal,
description: "Fastest responses with little reasoning".to_string(),
},
ReasoningEffortOption {
reasoning_effort: ReasoningEffort::Low,
description: "Balances speed with some reasoning; useful for straightforward \
queries and short explanations"
.to_string(),
},
ReasoningEffortOption {
reasoning_effort: ReasoningEffort::Medium,
description: "Provides a solid balance of reasoning depth and latency for \
general-purpose tasks"
.to_string(),
},
ReasoningEffortOption {
reasoning_effort: ReasoningEffort::High,
description: "Maximizes reasoning depth for complex or ambiguous problems"
.to_string(),
},
],
default_reasoning_effort: ReasoningEffort::Medium,
is_default: false,
},
];
assert_eq!(items, expected_models);
assert!(next_cursor.is_none());
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn list_models_pagination_works() -> Result<()> {
let codex_home = TempDir::new()?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
timeout(DEFAULT_TIMEOUT, mcp.initialize()).await??;
let first_request = mcp
.send_list_models_request(ListModelsParams {
page_size: Some(1),
cursor: None,
})
.await?;
let first_response: JSONRPCResponse = timeout(
DEFAULT_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(first_request)),
)
.await??;
let ListModelsResponse {
items: first_items,
next_cursor: first_cursor,
} = to_response::<ListModelsResponse>(first_response)?;
assert_eq!(first_items.len(), 1);
assert_eq!(first_items[0].id, "gpt-5-codex");
let next_cursor = first_cursor.ok_or_else(|| anyhow!("cursor for second page"))?;
let second_request = mcp
.send_list_models_request(ListModelsParams {
page_size: Some(1),
cursor: Some(next_cursor.clone()),
})
.await?;
let second_response: JSONRPCResponse = timeout(
DEFAULT_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(second_request)),
)
.await??;
let ListModelsResponse {
items: second_items,
next_cursor: second_cursor,
} = to_response::<ListModelsResponse>(second_response)?;
assert_eq!(second_items.len(), 1);
assert_eq!(second_items[0].id, "gpt-5");
assert!(second_cursor.is_none());
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn list_models_rejects_invalid_cursor() -> Result<()> {
let codex_home = TempDir::new()?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
timeout(DEFAULT_TIMEOUT, mcp.initialize()).await??;
let request_id = mcp
.send_list_models_request(ListModelsParams {
page_size: None,
cursor: Some("invalid".to_string()),
})
.await?;
let error: JSONRPCError = timeout(
DEFAULT_TIMEOUT,
mcp.read_stream_until_error_message(RequestId::Integer(request_id)),
)
.await??;
assert_eq!(error.id, RequestId::Integer(request_id));
assert_eq!(error.error.code, INVALID_REQUEST_ERROR_CODE);
assert_eq!(error.error.message, "invalid cursor: invalid");
Ok(())
}

View File

@@ -0,0 +1,215 @@
use anyhow::Context;
use anyhow::Result;
use app_test_support::ChatGptAuthFixture;
use app_test_support::McpProcess;
use app_test_support::to_response;
use app_test_support::write_chatgpt_auth;
use codex_app_server_protocol::GetAccountRateLimitsResponse;
use codex_app_server_protocol::JSONRPCError;
use codex_app_server_protocol::JSONRPCResponse;
use codex_app_server_protocol::LoginApiKeyParams;
use codex_app_server_protocol::RequestId;
use codex_protocol::protocol::RateLimitSnapshot;
use codex_protocol::protocol::RateLimitWindow;
use pretty_assertions::assert_eq;
use serde_json::json;
use std::path::Path;
use tempfile::TempDir;
use tokio::time::timeout;
use wiremock::Mock;
use wiremock::MockServer;
use wiremock::ResponseTemplate;
use wiremock::matchers::header;
use wiremock::matchers::method;
use wiremock::matchers::path;
const DEFAULT_READ_TIMEOUT: std::time::Duration = std::time::Duration::from_secs(10);
const INVALID_REQUEST_ERROR_CODE: i64 = -32600;
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn get_account_rate_limits_requires_auth() -> Result<()> {
let codex_home = TempDir::new().context("create codex home tempdir")?;
let mut mcp = McpProcess::new_with_env(codex_home.path(), &[("OPENAI_API_KEY", None)])
.await
.context("spawn mcp process")?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize())
.await
.context("initialize timeout")?
.context("initialize request")?;
let request_id = mcp
.send_get_account_rate_limits_request()
.await
.context("send account/rateLimits/read")?;
let error: JSONRPCError = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_error_message(RequestId::Integer(request_id)),
)
.await
.context("account/rateLimits/read timeout")?
.context("account/rateLimits/read error")?;
assert_eq!(error.id, RequestId::Integer(request_id));
assert_eq!(error.error.code, INVALID_REQUEST_ERROR_CODE);
assert_eq!(
error.error.message,
"codex account authentication required to read rate limits"
);
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn get_account_rate_limits_requires_chatgpt_auth() -> Result<()> {
let codex_home = TempDir::new().context("create codex home tempdir")?;
let mut mcp = McpProcess::new(codex_home.path())
.await
.context("spawn mcp process")?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize())
.await
.context("initialize timeout")?
.context("initialize request")?;
login_with_api_key(&mut mcp, "sk-test-key").await?;
let request_id = mcp
.send_get_account_rate_limits_request()
.await
.context("send account/rateLimits/read")?;
let error: JSONRPCError = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_error_message(RequestId::Integer(request_id)),
)
.await
.context("account/rateLimits/read timeout")?
.context("account/rateLimits/read error")?;
assert_eq!(error.id, RequestId::Integer(request_id));
assert_eq!(error.error.code, INVALID_REQUEST_ERROR_CODE);
assert_eq!(
error.error.message,
"chatgpt authentication required to read rate limits"
);
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn get_account_rate_limits_returns_snapshot() -> Result<()> {
let codex_home = TempDir::new().context("create codex home tempdir")?;
write_chatgpt_auth(
codex_home.path(),
ChatGptAuthFixture::new("chatgpt-token")
.account_id("account-123")
.plan_type("pro"),
)
.context("write chatgpt auth")?;
let server = MockServer::start().await;
let server_url = server.uri();
write_chatgpt_base_url(codex_home.path(), &server_url).context("write chatgpt base url")?;
let primary_reset_timestamp = chrono::DateTime::parse_from_rfc3339("2025-01-01T00:02:00Z")
.expect("parse primary reset timestamp")
.timestamp();
let secondary_reset_timestamp = chrono::DateTime::parse_from_rfc3339("2025-01-01T01:00:00Z")
.expect("parse secondary reset timestamp")
.timestamp();
let response_body = json!({
"plan_type": "pro",
"rate_limit": {
"allowed": true,
"limit_reached": false,
"primary_window": {
"used_percent": 42,
"limit_window_seconds": 3600,
"reset_after_seconds": 120,
"reset_at": primary_reset_timestamp,
},
"secondary_window": {
"used_percent": 5,
"limit_window_seconds": 86400,
"reset_after_seconds": 43200,
"reset_at": secondary_reset_timestamp,
}
}
});
Mock::given(method("GET"))
.and(path("/api/codex/usage"))
.and(header("authorization", "Bearer chatgpt-token"))
.and(header("chatgpt-account-id", "account-123"))
.respond_with(ResponseTemplate::new(200).set_body_json(response_body))
.mount(&server)
.await;
let mut mcp = McpProcess::new_with_env(codex_home.path(), &[("OPENAI_API_KEY", None)])
.await
.context("spawn mcp process")?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize())
.await
.context("initialize timeout")?
.context("initialize request")?;
let request_id = mcp
.send_get_account_rate_limits_request()
.await
.context("send account/rateLimits/read")?;
let response: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(request_id)),
)
.await
.context("account/rateLimits/read timeout")?
.context("account/rateLimits/read response")?;
let received: GetAccountRateLimitsResponse =
to_response(response).context("deserialize rate limit response")?;
let expected = GetAccountRateLimitsResponse {
rate_limits: RateLimitSnapshot {
primary: Some(RateLimitWindow {
used_percent: 42.0,
window_minutes: Some(60),
resets_at: Some(primary_reset_timestamp),
}),
secondary: Some(RateLimitWindow {
used_percent: 5.0,
window_minutes: Some(1440),
resets_at: Some(secondary_reset_timestamp),
}),
},
};
assert_eq!(received, expected);
Ok(())
}
async fn login_with_api_key(mcp: &mut McpProcess, api_key: &str) -> Result<()> {
let request_id = mcp
.send_login_api_key_request(LoginApiKeyParams {
api_key: api_key.to_string(),
})
.await
.context("send loginApiKey")?;
timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(request_id)),
)
.await
.context("loginApiKey timeout")?
.context("loginApiKey response")?;
Ok(())
}
fn write_chatgpt_base_url(codex_home: &Path, base_url: &str) -> std::io::Result<()> {
let config_toml = codex_home.join("config.toml");
std::fs::write(config_toml, format!("chatgpt_base_url = \"{base_url}\"\n"))
}

View File

@@ -1,20 +1,13 @@
use std::time::Duration;
use anyhow::Context;
use app_test_support::ChatGptAuthFixture;
use app_test_support::McpProcess;
use app_test_support::to_response;
use base64::Engine;
use base64::engine::general_purpose::URL_SAFE_NO_PAD;
use app_test_support::write_chatgpt_auth;
use codex_app_server_protocol::JSONRPCResponse;
use codex_app_server_protocol::RequestId;
use codex_app_server_protocol::UserInfoResponse;
use codex_core::auth::AuthDotJson;
use codex_core::auth::get_auth_file;
use codex_core::auth::write_auth_json;
use codex_core::token_data::IdTokenInfo;
use codex_core::token_data::TokenData;
use pretty_assertions::assert_eq;
use serde_json::json;
use tempfile::TempDir;
use tokio::time::timeout;
@@ -24,22 +17,13 @@ const DEFAULT_READ_TIMEOUT: Duration = Duration::from_secs(10);
async fn user_info_returns_email_from_auth_json() {
let codex_home = TempDir::new().expect("create tempdir");
let auth_path = get_auth_file(codex_home.path());
let mut id_token = IdTokenInfo::default();
id_token.email = Some("user@example.com".to_string());
id_token.raw_jwt = encode_id_token_with_email("user@example.com").expect("encode id token");
let auth = AuthDotJson {
openai_api_key: None,
tokens: Some(TokenData {
id_token,
access_token: "access".to_string(),
refresh_token: "refresh".to_string(),
account_id: None,
}),
last_refresh: None,
};
write_auth_json(&auth_path, &auth).expect("write auth.json");
write_chatgpt_auth(
codex_home.path(),
ChatGptAuthFixture::new("access")
.refresh_token("refresh")
.email("user@example.com"),
)
.expect("write chatgpt auth");
let mut mcp = McpProcess::new(codex_home.path())
.await
@@ -65,14 +49,3 @@ async fn user_info_returns_email_from_auth_json() {
assert_eq!(received, expected);
}
fn encode_id_token_with_email(email: &str) -> anyhow::Result<String> {
let header_b64 = URL_SAFE_NO_PAD.encode(
serde_json::to_vec(&json!({ "alg": "none", "typ": "JWT" }))
.context("serialize jwt header")?,
);
let payload =
serde_json::to_vec(&json!({ "email": email })).context("serialize jwt payload")?;
let payload_b64 = URL_SAFE_NO_PAD.encode(payload);
Ok(format!("{header_b64}.{payload_b64}.signature"))
}

View File

@@ -13,6 +13,8 @@ serde = { version = "1", features = ["derive"] }
serde_json = "1"
reqwest = { version = "0.12", default-features = false, features = ["json", "rustls-tls"] }
codex-backend-openapi-models = { path = "../codex-backend-openapi-models" }
codex-protocol = { workspace = true }
codex-core = { workspace = true }
[dev-dependencies]
pretty_assertions = "1"

View File

@@ -1,7 +1,13 @@
use crate::types::CodeTaskDetailsResponse;
use crate::types::PaginatedListTaskListItem;
use crate::types::RateLimitStatusPayload;
use crate::types::RateLimitWindowSnapshot;
use crate::types::TurnAttemptsSiblingTurnsResponse;
use anyhow::Result;
use codex_core::auth::CodexAuth;
use codex_core::default_client::get_codex_user_agent;
use codex_protocol::protocol::RateLimitSnapshot;
use codex_protocol::protocol::RateLimitWindow;
use reqwest::header::AUTHORIZATION;
use reqwest::header::CONTENT_TYPE;
use reqwest::header::HeaderMap;
@@ -64,6 +70,17 @@ impl Client {
})
}
pub async fn from_auth(base_url: impl Into<String>, auth: &CodexAuth) -> Result<Self> {
let token = auth.get_token().await.map_err(anyhow::Error::from)?;
let mut client = Self::new(base_url)?
.with_user_agent(get_codex_user_agent())
.with_bearer_token(token);
if let Some(account_id) = auth.get_account_id() {
client = client.with_chatgpt_account_id(account_id);
}
Ok(client)
}
pub fn with_bearer_token(mut self, token: impl Into<String>) -> Self {
self.bearer_token = Some(token.into());
self
@@ -138,6 +155,17 @@ impl Client {
}
}
pub async fn get_rate_limits(&self) -> Result<RateLimitSnapshot> {
let url = match self.path_style {
PathStyle::CodexApi => format!("{}/api/codex/usage", self.base_url),
PathStyle::ChatGptApi => format!("{}/wham/usage", self.base_url),
};
let req = self.http.get(&url).headers(self.headers());
let (body, ct) = self.exec_request(req, "GET", &url).await?;
let payload: RateLimitStatusPayload = self.decode_json(&url, &ct, &body)?;
Ok(Self::rate_limit_snapshot_from_payload(payload))
}
pub async fn list_tasks(
&self,
limit: Option<i32>,
@@ -241,4 +269,49 @@ impl Client {
Err(e) => anyhow::bail!("Decode error for {url}: {e}; content-type={ct}; body={body}"),
}
}
// rate limit helpers
fn rate_limit_snapshot_from_payload(payload: RateLimitStatusPayload) -> RateLimitSnapshot {
let Some(details) = payload
.rate_limit
.and_then(|inner| inner.map(|boxed| *boxed))
else {
return RateLimitSnapshot {
primary: None,
secondary: None,
};
};
RateLimitSnapshot {
primary: Self::map_rate_limit_window(details.primary_window),
secondary: Self::map_rate_limit_window(details.secondary_window),
}
}
fn map_rate_limit_window(
window: Option<Option<Box<RateLimitWindowSnapshot>>>,
) -> Option<RateLimitWindow> {
let snapshot = match window {
Some(Some(snapshot)) => *snapshot,
_ => return None,
};
let used_percent = f64::from(snapshot.used_percent);
let window_minutes = Self::window_minutes_from_seconds(snapshot.limit_window_seconds);
let resets_at = Some(i64::from(snapshot.reset_at));
Some(RateLimitWindow {
used_percent,
window_minutes,
resets_at,
})
}
fn window_minutes_from_seconds(seconds: i32) -> Option<i64> {
if seconds <= 0 {
return None;
}
let seconds_i64 = i64::from(seconds);
Some((seconds_i64 + 59) / 60)
}
}

View File

@@ -1,4 +1,8 @@
pub use codex_backend_openapi_models::models::PaginatedListTaskListItem;
pub use codex_backend_openapi_models::models::PlanType;
pub use codex_backend_openapi_models::models::RateLimitStatusDetails;
pub use codex_backend_openapi_models::models::RateLimitStatusPayload;
pub use codex_backend_openapi_models::models::RateLimitWindowSnapshot;
pub use codex_backend_openapi_models::models::TaskListItem;
use serde::Deserialize;

View File

@@ -19,7 +19,7 @@ use codex_exec::Cli as ExecCli;
use codex_responses_api_proxy::Args as ResponsesApiProxyArgs;
use codex_tui::AppExitInfo;
use codex_tui::Cli as TuiCli;
use codex_tui::UpdateAction;
use codex_tui::updates::UpdateAction;
use owo_colors::OwoColorize;
use std::path::PathBuf;
use supports_color::Stream;

View File

@@ -150,6 +150,10 @@ pub struct RemoveArgs {
pub struct LoginArgs {
/// Name of the MCP server to authenticate with oauth.
pub name: String,
/// Comma-separated list of OAuth scopes to request.
#[arg(long, value_delimiter = ',', value_name = "SCOPE,SCOPE")]
pub scopes: Vec<String>,
}
#[derive(Debug, clap::Parser)]
@@ -253,6 +257,8 @@ async fn run_add(config_overrides: &CliConfigOverrides, add_args: AddArgs) -> Re
enabled: true,
startup_timeout_sec: None,
tool_timeout_sec: None,
enabled_tools: None,
disabled_tools: None,
};
servers.insert(name.clone(), new_entry);
@@ -277,6 +283,7 @@ async fn run_add(config_overrides: &CliConfigOverrides, add_args: AddArgs) -> Re
config.mcp_oauth_credentials_store_mode,
http_headers.clone(),
env_http_headers.clone(),
&Vec::new(),
)
.await?;
println!("Successfully logged in.");
@@ -325,7 +332,7 @@ async fn run_login(config_overrides: &CliConfigOverrides, login_args: LoginArgs)
);
}
let LoginArgs { name } = login_args;
let LoginArgs { name, scopes } = login_args;
let Some(server) = config.mcp_servers.get(&name) else {
bail!("No MCP server named '{name}' found.");
@@ -347,6 +354,7 @@ async fn run_login(config_overrides: &CliConfigOverrides, login_args: LoginArgs)
config.mcp_oauth_credentials_store_mode,
http_headers,
env_http_headers,
&scopes,
)
.await?;
println!("Successfully logged in to MCP server '{name}'.");
@@ -400,7 +408,7 @@ async fn run_list(config_overrides: &CliConfigOverrides, list_args: ListArgs) ->
.map(|(name, cfg)| {
let auth_status = auth_statuses
.get(name.as_str())
.copied()
.map(|entry| entry.auth_status)
.unwrap_or(McpAuthStatus::Unsupported);
let transport = match &cfg.transport {
McpServerTransportConfig::Stdio {
@@ -487,7 +495,7 @@ async fn run_list(config_overrides: &CliConfigOverrides, list_args: ListArgs) ->
};
let auth_status = auth_statuses
.get(name.as_str())
.copied()
.map(|entry| entry.auth_status)
.unwrap_or(McpAuthStatus::Unsupported)
.to_string();
stdio_rows.push([
@@ -512,7 +520,7 @@ async fn run_list(config_overrides: &CliConfigOverrides, list_args: ListArgs) ->
};
let auth_status = auth_statuses
.get(name.as_str())
.copied()
.map(|entry| entry.auth_status)
.unwrap_or(McpAuthStatus::Unsupported)
.to_string();
http_rows.push([
@@ -676,6 +684,8 @@ async fn run_get(config_overrides: &CliConfigOverrides, get_args: GetArgs) -> Re
"name": get_args.name,
"enabled": server.enabled,
"transport": transport,
"enabled_tools": server.enabled_tools.clone(),
"disabled_tools": server.disabled_tools.clone(),
"startup_timeout_sec": server
.startup_timeout_sec
.map(|timeout| timeout.as_secs_f64()),
@@ -687,8 +697,28 @@ async fn run_get(config_overrides: &CliConfigOverrides, get_args: GetArgs) -> Re
return Ok(());
}
if !server.enabled {
println!("{} (disabled)", get_args.name);
return Ok(());
}
println!("{}", get_args.name);
println!(" enabled: {}", server.enabled);
let format_tool_list = |tools: &Option<Vec<String>>| -> String {
match tools {
Some(list) if list.is_empty() => "[]".to_string(),
Some(list) => list.join(", "),
None => "-".to_string(),
}
};
if server.enabled_tools.is_some() {
let enabled_tools_display = format_tool_list(&server.enabled_tools);
println!(" enabled_tools: {enabled_tools_display}");
}
if server.disabled_tools.is_some() {
let disabled_tools_display = format_tool_list(&server.disabled_tools);
println!(" disabled_tools: {disabled_tools_display}");
}
match &server.transport {
McpServerTransportConfig::Stdio {
command,

View File

@@ -134,3 +134,28 @@ async fn list_and_get_render_expected_output() -> Result<()> {
Ok(())
}
#[tokio::test]
async fn get_disabled_server_shows_single_line() -> Result<()> {
let codex_home = TempDir::new()?;
let mut add = codex_command(codex_home.path())?;
add.args(["mcp", "add", "docs", "--", "docs-server"])
.assert()
.success();
let mut servers = load_global_mcp_servers(codex_home.path()).await?;
let docs = servers
.get_mut("docs")
.expect("docs server should exist after add");
docs.enabled = false;
write_global_mcp_servers(codex_home.path(), &servers)?;
let mut get_cmd = codex_command(codex_home.path())?;
let get_output = get_cmd.args(["mcp", "get", "docs"]).output()?;
assert!(get_output.status.success());
let stdout = String::from_utf8(get_output.stdout)?;
assert_eq!(stdout.trim_end(), "docs (disabled)");
Ok(())
}

View File

@@ -15,3 +15,4 @@ path = "src/lib.rs"
[dependencies]
serde = { version = "1", features = ["derive"] }
serde_json = "1"
serde_with = "3"

View File

@@ -3,6 +3,7 @@
// Currently export only the types referenced by the workspace
// The process for this will change
// Cloud Tasks
pub mod code_task_details_response;
pub use self::code_task_details_response::CodeTaskDetailsResponse;
@@ -20,3 +21,14 @@ pub use self::task_list_item::TaskListItem;
pub mod paginated_list_task_list_item_;
pub use self::paginated_list_task_list_item_::PaginatedListTaskListItem;
// Rate Limits
pub mod rate_limit_status_payload;
pub use self::rate_limit_status_payload::PlanType;
pub use self::rate_limit_status_payload::RateLimitStatusPayload;
pub mod rate_limit_status_details;
pub use self::rate_limit_status_details::RateLimitStatusDetails;
pub mod rate_limit_window_snapshot;
pub use self::rate_limit_window_snapshot::RateLimitWindowSnapshot;

View File

@@ -0,0 +1,46 @@
/*
* codex-backend
*
* codex-backend
*
* The version of the OpenAPI document: 0.0.1
*
* Generated by: https://openapi-generator.tech
*/
use crate::models;
use serde::Deserialize;
use serde::Serialize;
#[derive(Clone, Default, Debug, PartialEq, Serialize, Deserialize)]
pub struct RateLimitStatusDetails {
#[serde(rename = "allowed")]
pub allowed: bool,
#[serde(rename = "limit_reached")]
pub limit_reached: bool,
#[serde(
rename = "primary_window",
default,
with = "::serde_with::rust::double_option",
skip_serializing_if = "Option::is_none"
)]
pub primary_window: Option<Option<Box<models::RateLimitWindowSnapshot>>>,
#[serde(
rename = "secondary_window",
default,
with = "::serde_with::rust::double_option",
skip_serializing_if = "Option::is_none"
)]
pub secondary_window: Option<Option<Box<models::RateLimitWindowSnapshot>>>,
}
impl RateLimitStatusDetails {
pub fn new(allowed: bool, limit_reached: bool) -> RateLimitStatusDetails {
RateLimitStatusDetails {
allowed,
limit_reached,
primary_window: None,
secondary_window: None,
}
}
}

View File

@@ -0,0 +1,65 @@
/*
* codex-backend
*
* codex-backend
*
* The version of the OpenAPI document: 0.0.1
*
* Generated by: https://openapi-generator.tech
*/
use crate::models;
use serde::Deserialize;
use serde::Serialize;
#[derive(Clone, Default, Debug, PartialEq, Serialize, Deserialize)]
pub struct RateLimitStatusPayload {
#[serde(rename = "plan_type")]
pub plan_type: PlanType,
#[serde(
rename = "rate_limit",
default,
with = "::serde_with::rust::double_option",
skip_serializing_if = "Option::is_none"
)]
pub rate_limit: Option<Option<Box<models::RateLimitStatusDetails>>>,
}
impl RateLimitStatusPayload {
pub fn new(plan_type: PlanType) -> RateLimitStatusPayload {
RateLimitStatusPayload {
plan_type,
rate_limit: None,
}
}
}
#[derive(Clone, Copy, Debug, Eq, PartialEq, Ord, PartialOrd, Hash, Serialize, Deserialize)]
pub enum PlanType {
#[serde(rename = "free")]
Free,
#[serde(rename = "go")]
Go,
#[serde(rename = "plus")]
Plus,
#[serde(rename = "pro")]
Pro,
#[serde(rename = "team")]
Team,
#[serde(rename = "business")]
Business,
#[serde(rename = "education")]
Education,
#[serde(rename = "quorum")]
Quorum,
#[serde(rename = "enterprise")]
Enterprise,
#[serde(rename = "edu")]
Edu,
}
impl Default for PlanType {
fn default() -> PlanType {
Self::Free
}
}

View File

@@ -0,0 +1,40 @@
/*
* codex-backend
*
* codex-backend
*
* The version of the OpenAPI document: 0.0.1
*
* Generated by: https://openapi-generator.tech
*/
use serde::Deserialize;
use serde::Serialize;
#[derive(Clone, Default, Debug, PartialEq, Serialize, Deserialize)]
pub struct RateLimitWindowSnapshot {
#[serde(rename = "used_percent")]
pub used_percent: i32,
#[serde(rename = "limit_window_seconds")]
pub limit_window_seconds: i32,
#[serde(rename = "reset_after_seconds")]
pub reset_after_seconds: i32,
#[serde(rename = "reset_at")]
pub reset_at: i32,
}
impl RateLimitWindowSnapshot {
pub fn new(
used_percent: i32,
limit_window_seconds: i32,
reset_after_seconds: i32,
reset_at: i32,
) -> RateLimitWindowSnapshot {
RateLimitWindowSnapshot {
used_percent,
limit_window_seconds,
reset_after_seconds,
reset_at,
}
}
}

View File

@@ -1,73 +1,96 @@
use codex_app_server_protocol::AuthMode;
use codex_core::protocol_config_types::ReasoningEffort;
/// A simple preset pairing a model slug with a reasoning effort.
/// A reasoning effort option that can be surfaced for a model.
#[derive(Debug, Clone, Copy)]
pub struct ReasoningEffortPreset {
/// Effort level that the model supports.
pub effort: ReasoningEffort,
/// Short human description shown next to the effort in UIs.
pub description: &'static str,
}
/// Metadata describing a Codex-supported model.
#[derive(Debug, Clone, Copy)]
pub struct ModelPreset {
/// Stable identifier for the preset.
pub id: &'static str,
/// Display label shown in UIs.
pub label: &'static str,
/// Short human description shown next to the label in UIs.
pub description: &'static str,
/// Model slug (e.g., "gpt-5").
pub model: &'static str,
/// Reasoning effort to apply for this preset.
pub effort: Option<ReasoningEffort>,
/// Display name shown in UIs.
pub display_name: &'static str,
/// Short human description shown in UIs.
pub description: &'static str,
/// Reasoning effort applied when none is explicitly chosen.
pub default_reasoning_effort: ReasoningEffort,
/// Supported reasoning effort options.
pub supported_reasoning_efforts: &'static [ReasoningEffortPreset],
/// Whether this is the default model for new users.
pub is_default: bool,
}
const PRESETS: &[ModelPreset] = &[
ModelPreset {
id: "gpt-5-codex-low",
label: "gpt-5-codex low",
description: "Fastest responses with limited reasoning",
id: "gpt-5-codex",
model: "gpt-5-codex",
effort: Some(ReasoningEffort::Low),
display_name: "gpt-5-codex",
description: "Optimized for coding tasks with many tools.",
default_reasoning_effort: ReasoningEffort::Medium,
supported_reasoning_efforts: &[
ReasoningEffortPreset {
effort: ReasoningEffort::Low,
description: "Fastest responses with limited reasoning",
},
ReasoningEffortPreset {
effort: ReasoningEffort::Medium,
description: "Dynamically adjusts reasoning based on the task",
},
ReasoningEffortPreset {
effort: ReasoningEffort::High,
description: "Maximizes reasoning depth for complex or ambiguous problems",
},
],
is_default: true,
},
ModelPreset {
id: "gpt-5-codex-medium",
label: "gpt-5-codex medium",
description: "Dynamically adjusts reasoning based on the task",
model: "gpt-5-codex",
effort: Some(ReasoningEffort::Medium),
},
ModelPreset {
id: "gpt-5-codex-high",
label: "gpt-5-codex high",
description: "Maximizes reasoning depth for complex or ambiguous problems",
model: "gpt-5-codex",
effort: Some(ReasoningEffort::High),
},
ModelPreset {
id: "gpt-5-minimal",
label: "gpt-5 minimal",
description: "Fastest responses with little reasoning",
id: "gpt-5",
model: "gpt-5",
effort: Some(ReasoningEffort::Minimal),
},
ModelPreset {
id: "gpt-5-low",
label: "gpt-5 low",
description: "Balances speed with some reasoning; useful for straightforward queries and short explanations",
model: "gpt-5",
effort: Some(ReasoningEffort::Low),
},
ModelPreset {
id: "gpt-5-medium",
label: "gpt-5 medium",
description: "Provides a solid balance of reasoning depth and latency for general-purpose tasks",
model: "gpt-5",
effort: Some(ReasoningEffort::Medium),
},
ModelPreset {
id: "gpt-5-high",
label: "gpt-5 high",
description: "Maximizes reasoning depth for complex or ambiguous problems",
model: "gpt-5",
effort: Some(ReasoningEffort::High),
display_name: "gpt-5",
description: "Broad world knowledge with strong general reasoning.",
default_reasoning_effort: ReasoningEffort::Medium,
supported_reasoning_efforts: &[
ReasoningEffortPreset {
effort: ReasoningEffort::Minimal,
description: "Fastest responses with little reasoning",
},
ReasoningEffortPreset {
effort: ReasoningEffort::Low,
description: "Balances speed with some reasoning; useful for straightforward queries and short explanations",
},
ReasoningEffortPreset {
effort: ReasoningEffort::Medium,
description: "Provides a solid balance of reasoning depth and latency for general-purpose tasks",
},
ReasoningEffortPreset {
effort: ReasoningEffort::High,
description: "Maximizes reasoning depth for complex or ambiguous problems",
},
],
is_default: false,
},
];
pub fn builtin_model_presets(_auth_mode: Option<AuthMode>) -> Vec<ModelPreset> {
PRESETS.to_vec()
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn only_one_default_model_is_configured() {
let default_models = PRESETS.iter().filter(|preset| preset.is_default).count();
assert!(default_models == 1);
}
}

View File

@@ -22,7 +22,6 @@ chrono = { workspace = true, features = ["serde"] }
codex-app-server-protocol = { workspace = true }
codex-apply-patch = { workspace = true }
codex-file-search = { workspace = true }
codex-mcp-client = { workspace = true }
codex-otel = { workspace = true, features = ["otel"] }
codex-protocol = { workspace = true }
codex-rmcp-client = { workspace = true }

View File

@@ -36,7 +36,6 @@ pub(crate) struct ApplyPatchExec {
pub(crate) async fn apply_patch(
sess: &Session,
turn_context: &TurnContext,
sub_id: &str,
call_id: &str,
action: ApplyPatchAction,
) -> InternalApplyPatchInvocation {
@@ -62,7 +61,7 @@ pub(crate) async fn apply_patch(
// that similar patches can be auto-approved in the future during
// this session.
let rx_approve = sess
.request_patch_approval(sub_id.to_owned(), call_id.to_owned(), &action, None, None)
.request_patch_approval(turn_context, call_id.to_owned(), &action, None, None)
.await;
match rx_approve.await.unwrap_or_default() {
ReviewDecision::Approved | ReviewDecision::ApprovedForSession => {

View File

@@ -542,6 +542,7 @@ mod tests {
}
#[tokio::test]
#[serial(codex_api_key)]
async fn pro_account_with_no_api_key_uses_chatgpt_auth() {
let codex_home = tempdir().unwrap();
let fake_jwt = write_auth_file(
@@ -591,6 +592,7 @@ mod tests {
}
#[tokio::test]
#[serial(codex_api_key)]
async fn loads_api_key_from_auth_json() {
let dir = tempdir().unwrap();
let auth_file = dir.path().join("auth.json");
@@ -742,6 +744,7 @@ mod tests {
}
#[tokio::test]
#[serial(codex_api_key)]
async fn enforce_login_restrictions_logs_out_for_workspace_mismatch() {
let codex_home = tempdir().unwrap();
let _jwt = write_auth_file(
@@ -767,6 +770,7 @@ mod tests {
}
#[tokio::test]
#[serial(codex_api_key)]
async fn enforce_login_restrictions_allows_matching_workspace() {
let codex_home = tempdir().unwrap();
let _jwt = write_auth_file(

View File

@@ -5,13 +5,13 @@ use tree_sitter_bash::LANGUAGE as BASH;
/// Parse the provided bash source using tree-sitter-bash, returning a Tree on
/// success or None if parsing failed.
pub fn try_parse_bash(bash_lc_arg: &str) -> Option<Tree> {
pub fn try_parse_shell(shell_lc_arg: &str) -> Option<Tree> {
let lang = BASH.into();
let mut parser = Parser::new();
#[expect(clippy::expect_used)]
parser.set_language(&lang).expect("load bash grammar");
let old_tree: Option<&Tree> = None;
parser.parse(bash_lc_arg, old_tree)
parser.parse(shell_lc_arg, old_tree)
}
/// Parse a script which may contain multiple simple commands joined only by
@@ -88,18 +88,19 @@ pub fn try_parse_word_only_commands_sequence(tree: &Tree, src: &str) -> Option<V
Some(commands)
}
/// Returns the sequence of plain commands within a `bash -lc "..."` invocation
/// when the script only contains word-only commands joined by safe operators.
pub fn parse_bash_lc_plain_commands(command: &[String]) -> Option<Vec<Vec<String>>> {
let [bash, flag, script] = command else {
/// Returns the sequence of plain commands within a `bash -lc "..."` or
/// `zsh -lc "..."` invocation when the script only contains word-only commands
/// joined by safe operators.
pub fn parse_shell_lc_plain_commands(command: &[String]) -> Option<Vec<Vec<String>>> {
let [shell, flag, script] = command else {
return None;
};
if bash != "bash" || flag != "-lc" {
if flag != "-lc" || !(shell == "bash" || shell == "zsh") {
return None;
}
let tree = try_parse_bash(script)?;
let tree = try_parse_shell(script)?;
try_parse_word_only_commands_sequence(&tree, script)
}
@@ -154,7 +155,7 @@ mod tests {
use super::*;
fn parse_seq(src: &str) -> Option<Vec<Vec<String>>> {
let tree = try_parse_bash(src)?;
let tree = try_parse_shell(src)?;
try_parse_word_only_commands_sequence(&tree, src)
}
@@ -234,4 +235,11 @@ mod tests {
fn rejects_trailing_operator_parse_error() {
assert!(parse_seq("ls &&").is_none());
}
#[test]
fn parse_zsh_lc_plain_commands() {
let command = vec!["zsh".to_string(), "-lc".to_string(), "ls".to_string()];
let parsed = parse_shell_lc_plain_commands(&command).unwrap();
assert_eq!(parsed, vec![vec!["ls".to_string()]]);
}
}

View File

@@ -104,10 +104,10 @@ pub(crate) async fn stream_chat_completions(
} = item
{
let mut text = String::new();
for c in items {
match c {
ReasoningItemContent::ReasoningText { text: t }
| ReasoningItemContent::Text { text: t } => text.push_str(t),
for entry in items {
match entry {
ReasoningItemContent::ReasoningText { text: segment }
| ReasoningItemContent::Text { text: segment } => text.push_str(segment),
}
}
if text.trim().is_empty() {

View File

@@ -1,17 +1,18 @@
use std::io::BufRead;
use std::path::Path;
use std::sync::Arc;
use std::sync::OnceLock;
use std::time::Duration;
use crate::AuthManager;
use crate::auth::CodexAuth;
use crate::error::ConnectionFailedError;
use crate::error::ResponseStreamFailed;
use crate::error::RetryLimitReachedError;
use crate::error::UnexpectedResponseError;
use bytes::Bytes;
use chrono::DateTime;
use chrono::Utc;
use codex_app_server_protocol::AuthMode;
use codex_otel::otel_event_manager::OtelEventManager;
use codex_protocol::ConversationId;
use codex_protocol::config_types::ReasoningEffort as ReasoningEffortConfig;
use codex_protocol::config_types::ReasoningSummary as ReasoningSummaryConfig;
use codex_protocol::models::ResponseItem;
use eventsource_stream::Eventsource;
use futures::prelude::*;
use regex_lite::Regex;
@@ -27,6 +28,8 @@ use tracing::debug;
use tracing::trace;
use tracing::warn;
use crate::AuthManager;
use crate::auth::CodexAuth;
use crate::chat_completions::AggregateStreamExt;
use crate::chat_completions::stream_chat_completions;
use crate::client_common::Prompt;
@@ -38,7 +41,11 @@ use crate::client_common::create_text_param_for_request;
use crate::config::Config;
use crate::default_client::create_client;
use crate::error::CodexErr;
use crate::error::ConnectionFailedError;
use crate::error::ResponseStreamFailed;
use crate::error::Result;
use crate::error::RetryLimitReachedError;
use crate::error::UnexpectedResponseError;
use crate::error::UsageLimitReachedError;
use crate::flags::CODEX_RS_SSE_FIXTURE;
use crate::model_family::ModelFamily;
@@ -52,13 +59,6 @@ use crate::state::TaskKind;
use crate::token_data::PlanType;
use crate::tools::spec::create_tools_json_for_responses_api;
use crate::util::backoff;
use chrono::DateTime;
use chrono::Utc;
use codex_otel::otel_event_manager::OtelEventManager;
use codex_protocol::config_types::ReasoningEffort as ReasoningEffortConfig;
use codex_protocol::config_types::ReasoningSummary as ReasoningSummaryConfig;
use codex_protocol::models::ResponseItem;
use std::sync::Arc;
#[derive(Debug, Deserialize)]
struct ErrorResponse {
@@ -336,10 +336,11 @@ impl ModelClient {
.get("cf-ray")
.map(|v| v.to_str().unwrap_or_default().to_string());
trace!(
"Response status: {}, cf-ray: {:?}",
debug!(
"Response status: {}, cf-ray: {:?}, version: {:?}",
resp.status(),
request_id
request_id,
resp.version()
);
}
@@ -628,13 +629,13 @@ fn parse_rate_limit_window(
headers: &HeaderMap,
used_percent_header: &str,
window_minutes_header: &str,
resets_header: &str,
resets_at_header: &str,
) -> Option<RateLimitWindow> {
let used_percent: Option<f64> = parse_header_f64(headers, used_percent_header);
used_percent.and_then(|used_percent| {
let window_minutes = parse_header_i64(headers, window_minutes_header);
let resets_at = parse_header_i64(headers, resets_header);
let resets_at = parse_header_i64(headers, resets_at_header);
let has_data = used_percent != 0.0
|| window_minutes.is_some_and(|minutes| minutes != 0)
@@ -1093,6 +1094,7 @@ mod tests {
base_url: Some("https://test.com".to_string()),
env_key: Some("TEST_API_KEY".to_string()),
env_key_instructions: None,
experimental_bearer_token: None,
wire_api: WireApi::Responses,
query_params: None,
http_headers: None,
@@ -1156,6 +1158,7 @@ mod tests {
base_url: Some("https://test.com".to_string()),
env_key: Some("TEST_API_KEY".to_string()),
env_key_instructions: None,
experimental_bearer_token: None,
wire_api: WireApi::Responses,
query_params: None,
http_headers: None,
@@ -1192,6 +1195,7 @@ mod tests {
base_url: Some("https://test.com".to_string()),
env_key: Some("TEST_API_KEY".to_string()),
env_key_instructions: None,
experimental_bearer_token: None,
wire_api: WireApi::Responses,
query_params: None,
http_headers: None,
@@ -1230,6 +1234,7 @@ mod tests {
base_url: Some("https://test.com".to_string()),
env_key: Some("TEST_API_KEY".to_string()),
env_key_instructions: None,
experimental_bearer_token: None,
wire_api: WireApi::Responses,
query_params: None,
http_headers: None,
@@ -1264,6 +1269,7 @@ mod tests {
base_url: Some("https://test.com".to_string()),
env_key: Some("TEST_API_KEY".to_string()),
env_key_instructions: None,
experimental_bearer_token: None,
wire_api: WireApi::Responses,
query_params: None,
http_headers: None,
@@ -1367,6 +1373,7 @@ mod tests {
base_url: Some("https://test.com".to_string()),
env_key: Some("TEST_API_KEY".to_string()),
env_key_instructions: None,
experimental_bearer_token: None,
wire_api: WireApi::Responses,
query_params: None,
http_headers: None,

File diff suppressed because it is too large Load Diff

View File

@@ -10,21 +10,21 @@ use crate::error::Result as CodexResult;
use crate::protocol::AgentMessageEvent;
use crate::protocol::CompactedItem;
use crate::protocol::ErrorEvent;
use crate::protocol::Event;
use crate::protocol::EventMsg;
use crate::protocol::InputItem;
use crate::protocol::InputMessageKind;
use crate::protocol::TaskStartedEvent;
use crate::protocol::TurnContextItem;
use crate::state::TaskKind;
use crate::truncate::truncate_middle;
use crate::util::backoff;
use askama::Template;
use codex_protocol::items::TurnItem;
use codex_protocol::models::ContentItem;
use codex_protocol::models::ResponseInputItem;
use codex_protocol::models::ResponseItem;
use codex_protocol::protocol::RolloutItem;
use codex_protocol::user_input::UserInput;
use futures::prelude::*;
use tracing::error;
pub const SUMMARIZATION_PROMPT: &str = include_str!("../../templates/compact/prompt.md");
const COMPACT_USER_MESSAGE_MAX_TOKENS: usize = 20_000;
@@ -40,40 +40,35 @@ pub(crate) async fn run_inline_auto_compact_task(
sess: Arc<Session>,
turn_context: Arc<TurnContext>,
) {
let sub_id = sess.next_internal_sub_id();
let input = vec![InputItem::Text {
let input = vec![UserInput::Text {
text: SUMMARIZATION_PROMPT.to_string(),
}];
run_compact_task_inner(sess, turn_context, sub_id, input).await;
run_compact_task_inner(sess, turn_context, input).await;
}
pub(crate) async fn run_compact_task(
sess: Arc<Session>,
turn_context: Arc<TurnContext>,
sub_id: String,
input: Vec<InputItem>,
input: Vec<UserInput>,
) -> Option<String> {
let start_event = Event {
id: sub_id.clone(),
msg: EventMsg::TaskStarted(TaskStartedEvent {
model_context_window: turn_context.client.get_model_context_window(),
}),
};
sess.send_event(start_event).await;
run_compact_task_inner(sess.clone(), turn_context, sub_id.clone(), input).await;
let start_event = EventMsg::TaskStarted(TaskStartedEvent {
model_context_window: turn_context.client.get_model_context_window(),
});
sess.send_event(&turn_context, start_event).await;
run_compact_task_inner(sess.clone(), turn_context, input).await;
None
}
async fn run_compact_task_inner(
sess: Arc<Session>,
turn_context: Arc<TurnContext>,
sub_id: String,
input: Vec<InputItem>,
input: Vec<UserInput>,
) {
let initial_input_for_turn: ResponseInputItem = ResponseInputItem::from(input);
let mut turn_input = sess
.turn_input_with_history(vec![initial_input_for_turn.clone().into()])
.await;
let mut history = sess.clone_history().await;
history.record_items(&[initial_input_for_turn.into()]);
let mut truncated_count = 0usize;
let max_retries = turn_context.client.get_provider().stream_max_retries();
@@ -90,18 +85,18 @@ async fn run_compact_task_inner(
sess.persist_rollout_items(&[rollout_item]).await;
loop {
let turn_input = history.get_history();
let prompt = Prompt {
input: turn_input.clone(),
..Default::default()
};
let attempt_result =
drain_to_completed(&sess, turn_context.as_ref(), &sub_id, &prompt).await;
let attempt_result = drain_to_completed(&sess, turn_context.as_ref(), &prompt).await;
match attempt_result {
Ok(()) => {
if truncated_count > 0 {
sess.notify_background_event(
&sub_id,
turn_context.as_ref(),
format!(
"Trimmed {truncated_count} older conversation item(s) before compacting so the prompt fits the model context window."
),
@@ -115,20 +110,20 @@ async fn run_compact_task_inner(
}
Err(e @ CodexErr::ContextWindowExceeded) => {
if turn_input.len() > 1 {
turn_input.remove(0);
// Trim from the beginning to preserve cache (prefix-based) and keep recent messages intact.
error!(
"Context window exceeded while compacting; removing oldest history item. Error: {e}"
);
history.remove_first_item();
truncated_count += 1;
retries = 0;
continue;
}
sess.set_total_tokens_full(&sub_id, turn_context.as_ref())
.await;
let event = Event {
id: sub_id.clone(),
msg: EventMsg::Error(ErrorEvent {
message: e.to_string(),
}),
};
sess.send_event(event).await;
sess.set_total_tokens_full(turn_context.as_ref()).await;
let event = EventMsg::Error(ErrorEvent {
message: e.to_string(),
});
sess.send_event(&turn_context, event).await;
return;
}
Err(e) => {
@@ -136,20 +131,17 @@ async fn run_compact_task_inner(
retries += 1;
let delay = backoff(retries);
sess.notify_stream_error(
&sub_id,
turn_context.as_ref(),
format!("Re-connecting... {retries}/{max_retries}"),
)
.await;
tokio::time::sleep(delay).await;
continue;
} else {
let event = Event {
id: sub_id.clone(),
msg: EventMsg::Error(ErrorEvent {
message: e.to_string(),
}),
};
sess.send_event(event).await;
let event = EventMsg::Error(ErrorEvent {
message: e.to_string(),
});
sess.send_event(&turn_context, event).await;
return;
}
}
@@ -168,13 +160,10 @@ async fn run_compact_task_inner(
});
sess.persist_rollout_items(&[rollout_item]).await;
let event = Event {
id: sub_id.clone(),
msg: EventMsg::AgentMessage(AgentMessageEvent {
message: "Compact task completed".to_string(),
}),
};
sess.send_event(event).await;
let event = EventMsg::AgentMessage(AgentMessageEvent {
message: "Compact task completed".to_string(),
});
sess.send_event(&turn_context, event).await;
}
pub fn content_items_to_text(content: &[ContentItem]) -> Option<String> {
@@ -199,23 +188,13 @@ pub fn content_items_to_text(content: &[ContentItem]) -> Option<String> {
pub(crate) fn collect_user_messages(items: &[ResponseItem]) -> Vec<String> {
items
.iter()
.filter_map(|item| match item {
ResponseItem::Message { role, content, .. } if role == "user" => {
content_items_to_text(content)
}
.filter_map(|item| match crate::event_mapping::parse_turn_item(item) {
Some(TurnItem::UserMessage(user)) => Some(user.message()),
_ => None,
})
.filter(|text| !is_session_prefix_message(text))
.collect()
}
pub fn is_session_prefix_message(text: &str) -> bool {
matches!(
InputMessageKind::from(("user", text)),
InputMessageKind::UserInstructions | InputMessageKind::EnvironmentContext
)
}
pub(crate) fn build_compacted_history(
initial_context: Vec<ResponseItem>,
user_messages: &[String],
@@ -256,7 +235,6 @@ pub(crate) fn build_compacted_history(
async fn drain_to_completed(
sess: &Session,
turn_context: &TurnContext,
sub_id: &str,
prompt: &Prompt,
) -> CodexResult<()> {
let mut stream = turn_context
@@ -277,10 +255,10 @@ async fn drain_to_completed(
sess.record_into_history(std::slice::from_ref(&item)).await;
}
Ok(ResponseEvent::RateLimits(snapshot)) => {
sess.update_rate_limits(sub_id, snapshot).await;
sess.update_rate_limits(turn_context, snapshot).await;
}
Ok(ResponseEvent::Completed { token_usage, .. }) => {
sess.update_token_usage_info(sub_id, turn_context, token_usage.as_ref())
sess.update_token_usage_info(turn_context, token_usage.as_ref())
.await;
return Ok(());
}
@@ -338,21 +316,16 @@ mod tests {
ResponseItem::Message {
id: Some("user".to_string()),
role: "user".to_string(),
content: vec![
ContentItem::InputText {
text: "first".to_string(),
},
ContentItem::OutputText {
text: "second".to_string(),
},
],
content: vec![ContentItem::InputText {
text: "first".to_string(),
}],
},
ResponseItem::Other,
];
let collected = collect_user_messages(&items);
assert_eq!(vec!["first\nsecond".to_string()], collected);
assert_eq!(vec!["first".to_string()], collected);
}
#[test]

View File

@@ -1,4 +1,4 @@
use crate::bash::parse_bash_lc_plain_commands;
use crate::bash::parse_shell_lc_plain_commands;
pub fn command_might_be_dangerous(command: &[String]) -> bool {
if is_dangerous_to_call_with_exec(command) {
@@ -6,7 +6,7 @@ pub fn command_might_be_dangerous(command: &[String]) -> bool {
}
// Support `bash -lc "<script>"` where the any part of the script might contain a dangerous command.
if let Some(all_commands) = parse_bash_lc_plain_commands(command)
if let Some(all_commands) = parse_shell_lc_plain_commands(command)
&& all_commands
.iter()
.any(|cmd| is_dangerous_to_call_with_exec(cmd))
@@ -57,6 +57,15 @@ mod tests {
])));
}
#[test]
fn zsh_git_reset_is_dangerous() {
assert!(command_might_be_dangerous(&vec_str(&[
"zsh",
"-lc",
"git reset --hard"
])));
}
#[test]
fn git_status_is_not_dangerous() {
assert!(!command_might_be_dangerous(&vec_str(&["git", "status"])));

View File

@@ -1,4 +1,4 @@
use crate::bash::parse_bash_lc_plain_commands;
use crate::bash::parse_shell_lc_plain_commands;
pub fn is_known_safe_command(command: &[String]) -> bool {
let command: Vec<String> = command
@@ -29,7 +29,7 @@ pub fn is_known_safe_command(command: &[String]) -> bool {
// introduce side effects ( "&&", "||", ";", and "|" ). If every
// individual command in the script is itself a knownsafe command, then
// the composite expression is considered safe.
if let Some(all_commands) = parse_bash_lc_plain_commands(&command)
if let Some(all_commands) = parse_shell_lc_plain_commands(&command)
&& !all_commands.is_empty()
&& all_commands
.iter()
@@ -201,6 +201,11 @@ mod tests {
])));
}
#[test]
fn zsh_lc_safe_command_sequence() {
assert!(is_known_safe_command(&vec_str(&["zsh", "-lc", "ls"])));
}
#[test]
fn unknown_or_partial() {
assert!(!is_safe_to_call_with_exec(&vec_str(&["foo"])));

View File

@@ -216,9 +216,6 @@ pub struct Config {
/// When set, restricts the login mechanism users may use.
pub forced_login_method: Option<ForcedLoginMethod>,
/// Include an experimental plan tool that the model can use to update its current plan and status of each step.
pub include_plan_tool: bool,
/// Include the `apply_patch` tool for models that benefit from invoking
/// file edits as a structured tool call. When unset, this falls back to the
/// model family's default preference.
@@ -484,6 +481,16 @@ pub fn write_global_mcp_servers(
entry["tool_timeout_sec"] = toml_edit::value(timeout.as_secs_f64());
}
if let Some(enabled_tools) = &config.enabled_tools {
entry["enabled_tools"] =
TomlItem::Value(enabled_tools.iter().collect::<TomlArray>().into());
}
if let Some(disabled_tools) = &config.disabled_tools {
entry["disabled_tools"] =
TomlItem::Value(disabled_tools.iter().collect::<TomlArray>().into());
}
doc["mcp_servers"][name.as_str()] = TomlItem::Table(entry);
}
}
@@ -1107,7 +1114,6 @@ pub struct ConfigOverrides {
pub config_profile: Option<String>,
pub codex_linux_sandbox_exe: Option<PathBuf>,
pub base_instructions: Option<String>,
pub include_plan_tool: Option<bool>,
pub include_apply_patch_tool: Option<bool>,
pub include_view_image_tool: Option<bool>,
pub show_raw_agent_reasoning: Option<bool>,
@@ -1137,7 +1143,6 @@ impl Config {
config_profile: config_profile_key,
codex_linux_sandbox_exe,
base_instructions,
include_plan_tool: include_plan_tool_override,
include_apply_patch_tool: include_apply_patch_tool_override,
include_view_image_tool: include_view_image_tool_override,
show_raw_agent_reasoning,
@@ -1164,7 +1169,6 @@ impl Config {
};
let feature_overrides = FeatureOverrides {
include_plan_tool: include_plan_tool_override,
include_apply_patch_tool: include_apply_patch_tool_override,
include_view_image_tool: include_view_image_tool_override,
web_search_request: override_tools_web_search_request,
@@ -1216,7 +1220,7 @@ impl Config {
}
}
}
let mut approval_policy = approval_policy_override
let approval_policy = approval_policy_override
.or(config_profile.approval_policy)
.or(cfg.approval_policy)
.unwrap_or_else(|| {
@@ -1259,7 +1263,6 @@ impl Config {
let history = cfg.history.unwrap_or_default();
let include_plan_tool_flag = features.enabled(Feature::PlanTool);
let include_apply_patch_tool_flag = features.enabled(Feature::ApplyPatchFreeform);
let include_view_image_tool_flag = features.enabled(Feature::ViewImageTool);
let tools_web_search_request = features.enabled(Feature::WebSearchRequest);
@@ -1325,10 +1328,6 @@ impl Config {
.or(cfg.review_model)
.unwrap_or_else(default_review_model);
if features.enabled(Feature::ApproveAll) {
approval_policy = AskForApproval::OnRequest;
}
let config = Self {
model,
review_model,
@@ -1389,7 +1388,6 @@ impl Config {
.unwrap_or("https://chatgpt.com/backend-api/".to_string()),
forced_chatgpt_workspace_id,
forced_login_method,
include_plan_tool: include_plan_tool_flag,
include_apply_patch_tool: include_apply_patch_tool_flag,
tools_web_search_request,
use_experimental_streamable_shell_tool,
@@ -1709,26 +1707,6 @@ trust_level = "trusted"
Ok(())
}
#[test]
fn approve_all_feature_forces_on_request_policy() -> std::io::Result<()> {
let cfg = r#"
[features]
approve_all = true
"#;
let parsed = toml::from_str::<ConfigToml>(cfg)
.expect("TOML deserialization should succeed for approve_all feature");
let temp_dir = TempDir::new()?;
let config = Config::load_from_base_config_with_overrides(
parsed,
ConfigOverrides::default(),
temp_dir.path().to_path_buf(),
)?;
assert!(config.features.enabled(Feature::ApproveAll));
assert_eq!(config.approval_policy, AskForApproval::OnRequest);
Ok(())
}
#[test]
fn config_defaults_to_auto_oauth_store_mode() -> std::io::Result<()> {
let codex_home = TempDir::new()?;
@@ -1755,7 +1733,6 @@ approve_all = true
profiles.insert(
"work".to_string(),
ConfigProfile {
include_plan_tool: Some(true),
include_view_image_tool: Some(false),
..Default::default()
},
@@ -1772,9 +1749,7 @@ approve_all = true
codex_home.path().to_path_buf(),
)?;
assert!(config.features.enabled(Feature::PlanTool));
assert!(!config.features.enabled(Feature::ViewImageTool));
assert!(config.include_plan_tool);
assert!(!config.include_view_image_tool);
Ok(())
@@ -1784,7 +1759,6 @@ approve_all = true
fn feature_table_overrides_legacy_flags() -> std::io::Result<()> {
let codex_home = TempDir::new()?;
let mut entries = BTreeMap::new();
entries.insert("plan_tool".to_string(), false);
entries.insert("apply_patch_freeform".to_string(), false);
let cfg = ConfigToml {
features: Some(crate::features::FeaturesToml { entries }),
@@ -1797,9 +1771,7 @@ approve_all = true
codex_home.path().to_path_buf(),
)?;
assert!(!config.features.enabled(Feature::PlanTool));
assert!(!config.features.enabled(Feature::ApplyPatchFreeform));
assert!(!config.include_plan_tool);
assert!(!config.include_apply_patch_tool);
Ok(())
@@ -1923,6 +1895,8 @@ approve_all = true
enabled: true,
startup_timeout_sec: Some(Duration::from_secs(3)),
tool_timeout_sec: Some(Duration::from_secs(5)),
enabled_tools: None,
disabled_tools: None,
},
);
@@ -2059,6 +2033,8 @@ bearer_token = "secret"
enabled: true,
startup_timeout_sec: None,
tool_timeout_sec: None,
enabled_tools: None,
disabled_tools: None,
},
)]);
@@ -2121,6 +2097,8 @@ ZIG_VAR = "3"
enabled: true,
startup_timeout_sec: None,
tool_timeout_sec: None,
enabled_tools: None,
disabled_tools: None,
},
)]);
@@ -2163,6 +2141,8 @@ ZIG_VAR = "3"
enabled: true,
startup_timeout_sec: None,
tool_timeout_sec: None,
enabled_tools: None,
disabled_tools: None,
},
)]);
@@ -2204,6 +2184,8 @@ ZIG_VAR = "3"
enabled: true,
startup_timeout_sec: Some(Duration::from_secs(2)),
tool_timeout_sec: None,
enabled_tools: None,
disabled_tools: None,
},
)]);
@@ -2261,6 +2243,8 @@ startup_timeout_sec = 2.0
enabled: true,
startup_timeout_sec: Some(Duration::from_secs(2)),
tool_timeout_sec: None,
enabled_tools: None,
disabled_tools: None,
},
)]);
write_global_mcp_servers(codex_home.path(), &servers)?;
@@ -2330,6 +2314,8 @@ X-Auth = "DOCS_AUTH"
enabled: true,
startup_timeout_sec: Some(Duration::from_secs(2)),
tool_timeout_sec: None,
enabled_tools: None,
disabled_tools: None,
},
)]);
@@ -2351,6 +2337,8 @@ X-Auth = "DOCS_AUTH"
enabled: true,
startup_timeout_sec: None,
tool_timeout_sec: None,
enabled_tools: None,
disabled_tools: None,
},
);
write_global_mcp_servers(codex_home.path(), &servers)?;
@@ -2410,6 +2398,8 @@ url = "https://example.com/mcp"
enabled: true,
startup_timeout_sec: Some(Duration::from_secs(2)),
tool_timeout_sec: None,
enabled_tools: None,
disabled_tools: None,
},
),
(
@@ -2425,6 +2415,8 @@ url = "https://example.com/mcp"
enabled: true,
startup_timeout_sec: None,
tool_timeout_sec: None,
enabled_tools: None,
disabled_tools: None,
},
),
]);
@@ -2499,6 +2491,8 @@ url = "https://example.com/mcp"
enabled: false,
startup_timeout_sec: None,
tool_timeout_sec: None,
enabled_tools: None,
disabled_tools: None,
},
)]);
@@ -2518,6 +2512,49 @@ url = "https://example.com/mcp"
Ok(())
}
#[tokio::test]
async fn write_global_mcp_servers_serializes_tool_filters() -> anyhow::Result<()> {
let codex_home = TempDir::new()?;
let servers = BTreeMap::from([(
"docs".to_string(),
McpServerConfig {
transport: McpServerTransportConfig::Stdio {
command: "docs-server".to_string(),
args: Vec::new(),
env: None,
env_vars: Vec::new(),
cwd: None,
},
enabled: true,
startup_timeout_sec: None,
tool_timeout_sec: None,
enabled_tools: Some(vec!["allowed".to_string()]),
disabled_tools: Some(vec!["blocked".to_string()]),
},
)]);
write_global_mcp_servers(codex_home.path(), &servers)?;
let config_path = codex_home.path().join(CONFIG_TOML_FILE);
let serialized = std::fs::read_to_string(&config_path)?;
assert!(serialized.contains(r#"enabled_tools = ["allowed"]"#));
assert!(serialized.contains(r#"disabled_tools = ["blocked"]"#));
let loaded = load_global_mcp_servers(codex_home.path()).await?;
let docs = loaded.get("docs").expect("docs entry");
assert_eq!(
docs.enabled_tools.as_ref(),
Some(&vec!["allowed".to_string()])
);
assert_eq!(
docs.disabled_tools.as_ref(),
Some(&vec!["blocked".to_string()])
);
Ok(())
}
#[tokio::test]
async fn persist_model_selection_updates_defaults() -> anyhow::Result<()> {
let codex_home = TempDir::new()?;
@@ -2740,6 +2777,7 @@ model_verbosity = "high"
env_key: Some("OPENAI_API_KEY".to_string()),
wire_api: crate::WireApi::Chat,
env_key_instructions: None,
experimental_bearer_token: None,
query_params: None,
http_headers: None,
env_http_headers: None,
@@ -2833,7 +2871,6 @@ model_verbosity = "high"
base_instructions: None,
forced_chatgpt_workspace_id: None,
forced_login_method: None,
include_plan_tool: false,
include_apply_patch_tool: false,
tools_web_search_request: false,
use_experimental_streamable_shell_tool: false,
@@ -2902,7 +2939,6 @@ model_verbosity = "high"
base_instructions: None,
forced_chatgpt_workspace_id: None,
forced_login_method: None,
include_plan_tool: false,
include_apply_patch_tool: false,
tools_web_search_request: false,
use_experimental_streamable_shell_tool: false,
@@ -2986,7 +3022,6 @@ model_verbosity = "high"
base_instructions: None,
forced_chatgpt_workspace_id: None,
forced_login_method: None,
include_plan_tool: false,
include_apply_patch_tool: false,
tools_web_search_request: false,
use_experimental_streamable_shell_tool: false,
@@ -3056,7 +3091,6 @@ model_verbosity = "high"
base_instructions: None,
forced_chatgpt_workspace_id: None,
forced_login_method: None,
include_plan_tool: false,
include_apply_patch_tool: false,
tools_web_search_request: false,
use_experimental_streamable_shell_tool: false,

View File

@@ -20,7 +20,6 @@ pub struct ConfigProfile {
pub model_verbosity: Option<Verbosity>,
pub chatgpt_base_url: Option<String>,
pub experimental_instructions_file: Option<PathBuf>,
pub include_plan_tool: Option<bool>,
pub include_apply_patch_tool: Option<bool>,
pub include_view_image_tool: Option<bool>,
pub experimental_use_unified_exec_tool: Option<bool>,

View File

@@ -35,6 +35,14 @@ pub struct McpServerConfig {
/// Default timeout for MCP tool calls initiated via this server.
#[serde(default, with = "option_duration_secs")]
pub tool_timeout_sec: Option<Duration>,
/// Explicit allow-list of tools exposed from this server. When set, only these tools will be registered.
#[serde(default, skip_serializing_if = "Option::is_none")]
pub enabled_tools: Option<Vec<String>>,
/// Explicit deny-list of tools. These tools will be removed after applying `enabled_tools`.
#[serde(default, skip_serializing_if = "Option::is_none")]
pub disabled_tools: Option<Vec<String>>,
}
impl<'de> Deserialize<'de> for McpServerConfig {
@@ -42,7 +50,7 @@ impl<'de> Deserialize<'de> for McpServerConfig {
where
D: Deserializer<'de>,
{
#[derive(Deserialize)]
#[derive(Deserialize, Clone)]
struct RawMcpServerConfig {
// stdio
command: Option<String>,
@@ -72,9 +80,13 @@ impl<'de> Deserialize<'de> for McpServerConfig {
tool_timeout_sec: Option<Duration>,
#[serde(default)]
enabled: Option<bool>,
#[serde(default)]
enabled_tools: Option<Vec<String>>,
#[serde(default)]
disabled_tools: Option<Vec<String>>,
}
let raw = RawMcpServerConfig::deserialize(deserializer)?;
let mut raw = RawMcpServerConfig::deserialize(deserializer)?;
let startup_timeout_sec = match (raw.startup_timeout_sec, raw.startup_timeout_ms) {
(Some(sec), _) => {
@@ -84,6 +96,10 @@ impl<'de> Deserialize<'de> for McpServerConfig {
(None, Some(ms)) => Some(Duration::from_millis(ms)),
(None, None) => None,
};
let tool_timeout_sec = raw.tool_timeout_sec;
let enabled = raw.enabled.unwrap_or_else(default_enabled);
let enabled_tools = raw.enabled_tools.clone();
let disabled_tools = raw.disabled_tools.clone();
fn throw_if_set<E, T>(transport: &str, field: &str, value: Option<&T>) -> Result<(), E>
where
@@ -97,72 +113,46 @@ impl<'de> Deserialize<'de> for McpServerConfig {
)))
}
let transport = match raw {
RawMcpServerConfig {
command: Some(command),
args,
env,
env_vars,
cwd,
url,
bearer_token_env_var,
http_headers,
env_http_headers,
..
} => {
throw_if_set("stdio", "url", url.as_ref())?;
throw_if_set(
"stdio",
"bearer_token_env_var",
bearer_token_env_var.as_ref(),
)?;
throw_if_set("stdio", "http_headers", http_headers.as_ref())?;
throw_if_set("stdio", "env_http_headers", env_http_headers.as_ref())?;
McpServerTransportConfig::Stdio {
command,
args: args.unwrap_or_default(),
env,
env_vars: env_vars.unwrap_or_default(),
cwd,
}
}
RawMcpServerConfig {
url: Some(url),
bearer_token,
bearer_token_env_var,
let transport = if let Some(command) = raw.command.clone() {
throw_if_set("stdio", "url", raw.url.as_ref())?;
throw_if_set(
"stdio",
"bearer_token_env_var",
raw.bearer_token_env_var.as_ref(),
)?;
throw_if_set("stdio", "bearer_token", raw.bearer_token.as_ref())?;
throw_if_set("stdio", "http_headers", raw.http_headers.as_ref())?;
throw_if_set("stdio", "env_http_headers", raw.env_http_headers.as_ref())?;
McpServerTransportConfig::Stdio {
command,
args,
env,
env_vars,
cwd,
http_headers,
env_http_headers,
startup_timeout_sec: _,
tool_timeout_sec: _,
startup_timeout_ms: _,
enabled: _,
} => {
throw_if_set("streamable_http", "command", command.as_ref())?;
throw_if_set("streamable_http", "args", args.as_ref())?;
throw_if_set("streamable_http", "env", env.as_ref())?;
throw_if_set("streamable_http", "env_vars", env_vars.as_ref())?;
throw_if_set("streamable_http", "cwd", cwd.as_ref())?;
throw_if_set("streamable_http", "bearer_token", bearer_token.as_ref())?;
McpServerTransportConfig::StreamableHttp {
url,
bearer_token_env_var,
http_headers,
env_http_headers,
}
args: raw.args.clone().unwrap_or_default(),
env: raw.env.clone(),
env_vars: raw.env_vars.clone().unwrap_or_default(),
cwd: raw.cwd.take(),
}
_ => return Err(SerdeError::custom("invalid transport")),
} else if let Some(url) = raw.url.clone() {
throw_if_set("streamable_http", "args", raw.args.as_ref())?;
throw_if_set("streamable_http", "env", raw.env.as_ref())?;
throw_if_set("streamable_http", "env_vars", raw.env_vars.as_ref())?;
throw_if_set("streamable_http", "cwd", raw.cwd.as_ref())?;
throw_if_set("streamable_http", "bearer_token", raw.bearer_token.as_ref())?;
McpServerTransportConfig::StreamableHttp {
url,
bearer_token_env_var: raw.bearer_token_env_var.clone(),
http_headers: raw.http_headers.clone(),
env_http_headers: raw.env_http_headers.take(),
}
} else {
return Err(SerdeError::custom("invalid transport"));
};
Ok(Self {
transport,
startup_timeout_sec,
tool_timeout_sec: raw.tool_timeout_sec,
enabled: raw.enabled.unwrap_or_else(default_enabled),
tool_timeout_sec,
enabled,
enabled_tools,
disabled_tools,
})
}
}
@@ -527,6 +517,8 @@ mod tests {
}
);
assert!(cfg.enabled);
assert!(cfg.enabled_tools.is_none());
assert!(cfg.disabled_tools.is_none());
}
#[test]
@@ -701,6 +693,21 @@ mod tests {
);
}
#[test]
fn deserialize_server_config_with_tool_filters() {
let cfg: McpServerConfig = toml::from_str(
r#"
command = "echo"
enabled_tools = ["allowed"]
disabled_tools = ["blocked"]
"#,
)
.expect("should deserialize tool filters");
assert_eq!(cfg.enabled_tools, Some(vec!["allowed".to_string()]));
assert_eq!(cfg.disabled_tools, Some(vec!["blocked".to_string()]));
}
#[test]
fn deserialize_rejects_command_and_url() {
toml::from_str::<McpServerConfig>(

View File

@@ -1,4 +1,6 @@
use codex_protocol::models::FunctionCallOutputPayload;
use codex_protocol::models::ResponseItem;
use tracing::error;
/// Transcript of conversation history
#[derive(Debug, Clone, Default)]
@@ -12,11 +14,6 @@ impl ConversationHistory {
Self { items: Vec::new() }
}
/// Returns a clone of the contents in the transcript.
pub(crate) fn contents(&self) -> Vec<ResponseItem> {
self.items.clone()
}
/// `items` is ordered from oldest to newest.
pub(crate) fn record_items<I>(&mut self, items: I)
where
@@ -32,9 +29,287 @@ impl ConversationHistory {
}
}
pub(crate) fn get_history(&mut self) -> Vec<ResponseItem> {
self.normalize_history();
self.contents()
}
pub(crate) fn remove_first_item(&mut self) {
if !self.items.is_empty() {
// Remove the oldest item (front of the list). Items are ordered from
// oldest → newest, so index 0 is the first entry recorded.
let removed = self.items.remove(0);
// If the removed item participates in a call/output pair, also remove
// its corresponding counterpart to keep the invariants intact without
// running a full normalization pass.
self.remove_corresponding_for(&removed);
}
}
/// This function enforces a couple of invariants on the in-memory history:
/// 1. every call (function/custom) has a corresponding output entry
/// 2. every output has a corresponding call entry
fn normalize_history(&mut self) {
// all function/tool calls must have a corresponding output
self.ensure_call_outputs_present();
// all outputs must have a corresponding function/tool call
self.remove_orphan_outputs();
}
/// Returns a clone of the contents in the transcript.
fn contents(&self) -> Vec<ResponseItem> {
self.items.clone()
}
fn ensure_call_outputs_present(&mut self) {
// Collect synthetic outputs to insert immediately after their calls.
// Store the insertion position (index of call) alongside the item so
// we can insert in reverse order and avoid index shifting.
let mut missing_outputs_to_insert: Vec<(usize, ResponseItem)> = Vec::new();
for (idx, item) in self.items.iter().enumerate() {
match item {
ResponseItem::FunctionCall { call_id, .. } => {
let has_output = self.items.iter().any(|i| match i {
ResponseItem::FunctionCallOutput {
call_id: existing, ..
} => existing == call_id,
_ => false,
});
if !has_output {
error_or_panic(format!(
"Function call output is missing for call id: {call_id}"
));
missing_outputs_to_insert.push((
idx,
ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: None,
},
},
));
}
}
ResponseItem::CustomToolCall { call_id, .. } => {
let has_output = self.items.iter().any(|i| match i {
ResponseItem::CustomToolCallOutput {
call_id: existing, ..
} => existing == call_id,
_ => false,
});
if !has_output {
error_or_panic(format!(
"Custom tool call output is missing for call id: {call_id}"
));
missing_outputs_to_insert.push((
idx,
ResponseItem::CustomToolCallOutput {
call_id: call_id.clone(),
output: "aborted".to_string(),
},
));
}
}
// LocalShellCall is represented in upstream streams by a FunctionCallOutput
ResponseItem::LocalShellCall { call_id, .. } => {
if let Some(call_id) = call_id.as_ref() {
let has_output = self.items.iter().any(|i| match i {
ResponseItem::FunctionCallOutput {
call_id: existing, ..
} => existing == call_id,
_ => false,
});
if !has_output {
error_or_panic(format!(
"Local shell call output is missing for call id: {call_id}"
));
missing_outputs_to_insert.push((
idx,
ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: None,
},
},
));
}
}
}
ResponseItem::Reasoning { .. }
| ResponseItem::WebSearchCall { .. }
| ResponseItem::FunctionCallOutput { .. }
| ResponseItem::CustomToolCallOutput { .. }
| ResponseItem::Other
| ResponseItem::Message { .. } => {
// nothing to do for these variants
}
}
}
if !missing_outputs_to_insert.is_empty() {
// Insert from the end to avoid shifting subsequent indices.
missing_outputs_to_insert.sort_by_key(|(i, _)| *i);
for (idx, item) in missing_outputs_to_insert.into_iter().rev() {
let insert_pos = idx + 1; // place immediately after the call
if insert_pos <= self.items.len() {
self.items.insert(insert_pos, item);
} else {
self.items.push(item);
}
}
}
}
fn remove_orphan_outputs(&mut self) {
// Work on a snapshot to avoid borrowing `self.items` while mutating it.
let snapshot = self.items.clone();
let mut orphan_output_call_ids: std::collections::HashSet<String> =
std::collections::HashSet::new();
for item in &snapshot {
match item {
ResponseItem::FunctionCallOutput { call_id, .. } => {
let has_call = snapshot.iter().any(|i| match i {
ResponseItem::FunctionCall {
call_id: existing, ..
} => existing == call_id,
ResponseItem::LocalShellCall {
call_id: Some(existing),
..
} => existing == call_id,
_ => false,
});
if !has_call {
error_or_panic(format!("Function call is missing for call id: {call_id}"));
orphan_output_call_ids.insert(call_id.clone());
}
}
ResponseItem::CustomToolCallOutput { call_id, .. } => {
let has_call = snapshot.iter().any(|i| match i {
ResponseItem::CustomToolCall {
call_id: existing, ..
} => existing == call_id,
_ => false,
});
if !has_call {
error_or_panic(format!(
"Custom tool call is missing for call id: {call_id}"
));
orphan_output_call_ids.insert(call_id.clone());
}
}
ResponseItem::FunctionCall { .. }
| ResponseItem::CustomToolCall { .. }
| ResponseItem::LocalShellCall { .. }
| ResponseItem::Reasoning { .. }
| ResponseItem::WebSearchCall { .. }
| ResponseItem::Other
| ResponseItem::Message { .. } => {
// nothing to do for these variants
}
}
}
if !orphan_output_call_ids.is_empty() {
let ids = orphan_output_call_ids;
self.items.retain(|i| match i {
ResponseItem::FunctionCallOutput { call_id, .. }
| ResponseItem::CustomToolCallOutput { call_id, .. } => !ids.contains(call_id),
_ => true,
});
}
}
pub(crate) fn replace(&mut self, items: Vec<ResponseItem>) {
self.items = items;
}
/// Removes the corresponding paired item for the provided `item`, if any.
///
/// Pairs:
/// - FunctionCall <-> FunctionCallOutput
/// - CustomToolCall <-> CustomToolCallOutput
/// - LocalShellCall(call_id: Some) <-> FunctionCallOutput
fn remove_corresponding_for(&mut self, item: &ResponseItem) {
match item {
ResponseItem::FunctionCall { call_id, .. } => {
self.remove_first_matching(|i| match i {
ResponseItem::FunctionCallOutput {
call_id: existing, ..
} => existing == call_id,
_ => false,
});
}
ResponseItem::CustomToolCall { call_id, .. } => {
self.remove_first_matching(|i| match i {
ResponseItem::CustomToolCallOutput {
call_id: existing, ..
} => existing == call_id,
_ => false,
});
}
ResponseItem::LocalShellCall {
call_id: Some(call_id),
..
} => {
self.remove_first_matching(|i| match i {
ResponseItem::FunctionCallOutput {
call_id: existing, ..
} => existing == call_id,
_ => false,
});
}
ResponseItem::FunctionCallOutput { call_id, .. } => {
self.remove_first_matching(|i| match i {
ResponseItem::FunctionCall {
call_id: existing, ..
} => existing == call_id,
ResponseItem::LocalShellCall {
call_id: Some(existing),
..
} => existing == call_id,
_ => false,
});
}
ResponseItem::CustomToolCallOutput { call_id, .. } => {
self.remove_first_matching(|i| match i {
ResponseItem::CustomToolCall {
call_id: existing, ..
} => existing == call_id,
_ => false,
});
}
_ => {}
}
}
/// Remove the first item matching the predicate.
fn remove_first_matching<F>(&mut self, predicate: F)
where
F: FnMut(&ResponseItem) -> bool,
{
if let Some(pos) = self.items.iter().position(predicate) {
self.items.remove(pos);
}
}
}
#[inline]
fn error_or_panic(message: String) {
if cfg!(debug_assertions) || env!("CARGO_PKG_VERSION").contains("alpha") {
panic!("{message}");
} else {
error!("{message}");
}
}
/// Anything that is not a system message or "reasoning" message is considered
@@ -57,6 +332,11 @@ fn is_api_message(message: &ResponseItem) -> bool {
mod tests {
use super::*;
use codex_protocol::models::ContentItem;
use codex_protocol::models::FunctionCallOutputPayload;
use codex_protocol::models::LocalShellAction;
use codex_protocol::models::LocalShellExecAction;
use codex_protocol::models::LocalShellStatus;
use pretty_assertions::assert_eq;
fn assistant_msg(text: &str) -> ResponseItem {
ResponseItem::Message {
@@ -68,6 +348,12 @@ mod tests {
}
}
fn create_history_with_items(items: Vec<ResponseItem>) -> ConversationHistory {
let mut h = ConversationHistory::new();
h.record_items(items.iter());
h
}
fn user_msg(text: &str) -> ResponseItem {
ResponseItem::Message {
id: None,
@@ -117,4 +403,452 @@ mod tests {
]
);
}
#[test]
fn remove_first_item_removes_matching_output_for_function_call() {
let items = vec![
ResponseItem::FunctionCall {
id: None,
name: "do_it".to_string(),
arguments: "{}".to_string(),
call_id: "call-1".to_string(),
},
ResponseItem::FunctionCallOutput {
call_id: "call-1".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
success: None,
},
},
];
let mut h = create_history_with_items(items);
h.remove_first_item();
assert_eq!(h.contents(), vec![]);
}
#[test]
fn remove_first_item_removes_matching_call_for_output() {
let items = vec![
ResponseItem::FunctionCallOutput {
call_id: "call-2".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
success: None,
},
},
ResponseItem::FunctionCall {
id: None,
name: "do_it".to_string(),
arguments: "{}".to_string(),
call_id: "call-2".to_string(),
},
];
let mut h = create_history_with_items(items);
h.remove_first_item();
assert_eq!(h.contents(), vec![]);
}
#[test]
fn remove_first_item_handles_local_shell_pair() {
let items = vec![
ResponseItem::LocalShellCall {
id: None,
call_id: Some("call-3".to_string()),
status: LocalShellStatus::Completed,
action: LocalShellAction::Exec(LocalShellExecAction {
command: vec!["echo".to_string(), "hi".to_string()],
timeout_ms: None,
working_directory: None,
env: None,
user: None,
}),
},
ResponseItem::FunctionCallOutput {
call_id: "call-3".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
success: None,
},
},
];
let mut h = create_history_with_items(items);
h.remove_first_item();
assert_eq!(h.contents(), vec![]);
}
#[test]
fn remove_first_item_handles_custom_tool_pair() {
let items = vec![
ResponseItem::CustomToolCall {
id: None,
status: None,
call_id: "tool-1".to_string(),
name: "my_tool".to_string(),
input: "{}".to_string(),
},
ResponseItem::CustomToolCallOutput {
call_id: "tool-1".to_string(),
output: "ok".to_string(),
},
];
let mut h = create_history_with_items(items);
h.remove_first_item();
assert_eq!(h.contents(), vec![]);
}
//TODO(aibrahim): run CI in release mode.
#[cfg(not(debug_assertions))]
#[test]
fn normalize_adds_missing_output_for_function_call() {
let items = vec![ResponseItem::FunctionCall {
id: None,
name: "do_it".to_string(),
arguments: "{}".to_string(),
call_id: "call-x".to_string(),
}];
let mut h = create_history_with_items(items);
h.normalize_history();
assert_eq!(
h.contents(),
vec![
ResponseItem::FunctionCall {
id: None,
name: "do_it".to_string(),
arguments: "{}".to_string(),
call_id: "call-x".to_string(),
},
ResponseItem::FunctionCallOutput {
call_id: "call-x".to_string(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: None,
},
},
]
);
}
#[cfg(not(debug_assertions))]
#[test]
fn normalize_adds_missing_output_for_custom_tool_call() {
let items = vec![ResponseItem::CustomToolCall {
id: None,
status: None,
call_id: "tool-x".to_string(),
name: "custom".to_string(),
input: "{}".to_string(),
}];
let mut h = create_history_with_items(items);
h.normalize_history();
assert_eq!(
h.contents(),
vec![
ResponseItem::CustomToolCall {
id: None,
status: None,
call_id: "tool-x".to_string(),
name: "custom".to_string(),
input: "{}".to_string(),
},
ResponseItem::CustomToolCallOutput {
call_id: "tool-x".to_string(),
output: "aborted".to_string(),
},
]
);
}
#[cfg(not(debug_assertions))]
#[test]
fn normalize_adds_missing_output_for_local_shell_call_with_id() {
let items = vec![ResponseItem::LocalShellCall {
id: None,
call_id: Some("shell-1".to_string()),
status: LocalShellStatus::Completed,
action: LocalShellAction::Exec(LocalShellExecAction {
command: vec!["echo".to_string(), "hi".to_string()],
timeout_ms: None,
working_directory: None,
env: None,
user: None,
}),
}];
let mut h = create_history_with_items(items);
h.normalize_history();
assert_eq!(
h.contents(),
vec![
ResponseItem::LocalShellCall {
id: None,
call_id: Some("shell-1".to_string()),
status: LocalShellStatus::Completed,
action: LocalShellAction::Exec(LocalShellExecAction {
command: vec!["echo".to_string(), "hi".to_string()],
timeout_ms: None,
working_directory: None,
env: None,
user: None,
}),
},
ResponseItem::FunctionCallOutput {
call_id: "shell-1".to_string(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: None,
},
},
]
);
}
#[cfg(not(debug_assertions))]
#[test]
fn normalize_removes_orphan_function_call_output() {
let items = vec![ResponseItem::FunctionCallOutput {
call_id: "orphan-1".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
success: None,
},
}];
let mut h = create_history_with_items(items);
h.normalize_history();
assert_eq!(h.contents(), vec![]);
}
#[cfg(not(debug_assertions))]
#[test]
fn normalize_removes_orphan_custom_tool_call_output() {
let items = vec![ResponseItem::CustomToolCallOutput {
call_id: "orphan-2".to_string(),
output: "ok".to_string(),
}];
let mut h = create_history_with_items(items);
h.normalize_history();
assert_eq!(h.contents(), vec![]);
}
#[cfg(not(debug_assertions))]
#[test]
fn normalize_mixed_inserts_and_removals() {
let items = vec![
// Will get an inserted output
ResponseItem::FunctionCall {
id: None,
name: "f1".to_string(),
arguments: "{}".to_string(),
call_id: "c1".to_string(),
},
// Orphan output that should be removed
ResponseItem::FunctionCallOutput {
call_id: "c2".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
success: None,
},
},
// Will get an inserted custom tool output
ResponseItem::CustomToolCall {
id: None,
status: None,
call_id: "t1".to_string(),
name: "tool".to_string(),
input: "{}".to_string(),
},
// Local shell call also gets an inserted function call output
ResponseItem::LocalShellCall {
id: None,
call_id: Some("s1".to_string()),
status: LocalShellStatus::Completed,
action: LocalShellAction::Exec(LocalShellExecAction {
command: vec!["echo".to_string()],
timeout_ms: None,
working_directory: None,
env: None,
user: None,
}),
},
];
let mut h = create_history_with_items(items);
h.normalize_history();
assert_eq!(
h.contents(),
vec![
ResponseItem::FunctionCall {
id: None,
name: "f1".to_string(),
arguments: "{}".to_string(),
call_id: "c1".to_string(),
},
ResponseItem::FunctionCallOutput {
call_id: "c1".to_string(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: None,
},
},
ResponseItem::CustomToolCall {
id: None,
status: None,
call_id: "t1".to_string(),
name: "tool".to_string(),
input: "{}".to_string(),
},
ResponseItem::CustomToolCallOutput {
call_id: "t1".to_string(),
output: "aborted".to_string(),
},
ResponseItem::LocalShellCall {
id: None,
call_id: Some("s1".to_string()),
status: LocalShellStatus::Completed,
action: LocalShellAction::Exec(LocalShellExecAction {
command: vec!["echo".to_string()],
timeout_ms: None,
working_directory: None,
env: None,
user: None,
}),
},
ResponseItem::FunctionCallOutput {
call_id: "s1".to_string(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: None,
},
},
]
);
}
// In debug builds we panic on normalization errors instead of silently fixing them.
#[cfg(debug_assertions)]
#[test]
#[should_panic]
fn normalize_adds_missing_output_for_function_call_panics_in_debug() {
let items = vec![ResponseItem::FunctionCall {
id: None,
name: "do_it".to_string(),
arguments: "{}".to_string(),
call_id: "call-x".to_string(),
}];
let mut h = create_history_with_items(items);
h.normalize_history();
}
#[cfg(debug_assertions)]
#[test]
#[should_panic]
fn normalize_adds_missing_output_for_custom_tool_call_panics_in_debug() {
let items = vec![ResponseItem::CustomToolCall {
id: None,
status: None,
call_id: "tool-x".to_string(),
name: "custom".to_string(),
input: "{}".to_string(),
}];
let mut h = create_history_with_items(items);
h.normalize_history();
}
#[cfg(debug_assertions)]
#[test]
#[should_panic]
fn normalize_adds_missing_output_for_local_shell_call_with_id_panics_in_debug() {
let items = vec![ResponseItem::LocalShellCall {
id: None,
call_id: Some("shell-1".to_string()),
status: LocalShellStatus::Completed,
action: LocalShellAction::Exec(LocalShellExecAction {
command: vec!["echo".to_string(), "hi".to_string()],
timeout_ms: None,
working_directory: None,
env: None,
user: None,
}),
}];
let mut h = create_history_with_items(items);
h.normalize_history();
}
#[cfg(debug_assertions)]
#[test]
#[should_panic]
fn normalize_removes_orphan_function_call_output_panics_in_debug() {
let items = vec![ResponseItem::FunctionCallOutput {
call_id: "orphan-1".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
success: None,
},
}];
let mut h = create_history_with_items(items);
h.normalize_history();
}
#[cfg(debug_assertions)]
#[test]
#[should_panic]
fn normalize_removes_orphan_custom_tool_call_output_panics_in_debug() {
let items = vec![ResponseItem::CustomToolCallOutput {
call_id: "orphan-2".to_string(),
output: "ok".to_string(),
}];
let mut h = create_history_with_items(items);
h.normalize_history();
}
#[cfg(debug_assertions)]
#[test]
#[should_panic]
fn normalize_mixed_inserts_and_removals_panics_in_debug() {
let items = vec![
ResponseItem::FunctionCall {
id: None,
name: "f1".to_string(),
arguments: "{}".to_string(),
call_id: "c1".to_string(),
},
ResponseItem::FunctionCallOutput {
call_id: "c2".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
success: None,
},
},
ResponseItem::CustomToolCall {
id: None,
status: None,
call_id: "t1".to_string(),
name: "tool".to_string(),
input: "{}".to_string(),
},
ResponseItem::LocalShellCall {
id: None,
call_id: Some("s1".to_string()),
status: LocalShellStatus::Completed,
action: LocalShellAction::Exec(LocalShellExecAction {
command: vec!["echo".to_string()],
timeout_ms: None,
working_directory: None,
env: None,
user: None,
}),
},
];
let mut h = create_history_with_items(items);
h.normalize_history();
}
}

View File

@@ -3,8 +3,6 @@ use crate::CodexAuth;
use crate::codex::Codex;
use crate::codex::CodexSpawnOk;
use crate::codex::INITIAL_SUBMIT_ID;
use crate::codex::compact::content_items_to_text;
use crate::codex::compact::is_session_prefix_message;
use crate::codex_conversation::CodexConversation;
use crate::config::Config;
use crate::error::CodexErr;
@@ -14,6 +12,7 @@ use crate::protocol::EventMsg;
use crate::protocol::SessionConfiguredEvent;
use crate::rollout::RolloutRecorder;
use codex_protocol::ConversationId;
use codex_protocol::items::TurnItem;
use codex_protocol::models::ResponseItem;
use codex_protocol::protocol::InitialHistory;
use codex_protocol::protocol::RolloutItem;
@@ -182,9 +181,11 @@ fn truncate_before_nth_user_message(history: InitialHistory, n: usize) -> Initia
// Find indices of user message inputs in rollout order.
let mut user_positions: Vec<usize> = Vec::new();
for (idx, item) in items.iter().enumerate() {
if let RolloutItem::ResponseItem(ResponseItem::Message { role, content, .. }) = item
&& role == "user"
&& content_items_to_text(content).is_some_and(|text| !is_session_prefix_message(&text))
if let RolloutItem::ResponseItem(item @ ResponseItem::Message { .. }) = item
&& matches!(
crate::event_mapping::parse_turn_item(item),
Some(TurnItem::UserMessage(_))
)
{
user_positions.push(idx);
}

View File

@@ -1,3 +1,4 @@
use crate::codex::ProcessedResponseItem;
use crate::exec::ExecToolCallOutput;
use crate::token_data::KnownPlan;
use crate::token_data::PlanType;
@@ -53,8 +54,11 @@ pub enum SandboxErr {
#[derive(Error, Debug)]
pub enum CodexErr {
// todo(aibrahim): git rid of this error carrying the dangling artifacts
#[error("turn aborted")]
TurnAborted,
TurnAborted {
dangling_artifacts: Vec<ProcessedResponseItem>,
},
/// Returned by ResponsesClient when the SSE stream disconnects or errors out **after** the HTTP
/// handshake has succeeded but **before** it finished emitting `response.completed`.
@@ -158,7 +162,9 @@ pub enum CodexErr {
impl From<CancelErr> for CodexErr {
fn from(_: CancelErr) -> Self {
CodexErr::TurnAborted
CodexErr::TurnAborted {
dangling_artifacts: Vec::new(),
}
}
}

View File

@@ -1,139 +1,131 @@
use crate::protocol::AgentMessageEvent;
use crate::protocol::AgentReasoningEvent;
use crate::protocol::AgentReasoningRawContentEvent;
use crate::protocol::EventMsg;
use crate::protocol::InputMessageKind;
use crate::protocol::UserMessageEvent;
use crate::protocol::WebSearchEndEvent;
use codex_protocol::items::AgentMessageContent;
use codex_protocol::items::AgentMessageItem;
use codex_protocol::items::ReasoningItem;
use codex_protocol::items::TurnItem;
use codex_protocol::items::UserMessageItem;
use codex_protocol::items::WebSearchItem;
use codex_protocol::models::ContentItem;
use codex_protocol::models::ReasoningItemContent;
use codex_protocol::models::ReasoningItemReasoningSummary;
use codex_protocol::models::ResponseItem;
use codex_protocol::models::WebSearchAction;
use codex_protocol::user_input::UserInput;
use tracing::warn;
/// Convert a `ResponseItem` into zero or more `EventMsg` values that the UI can render.
///
/// When `show_raw_agent_reasoning` is false, raw reasoning content events are omitted.
pub(crate) fn map_response_item_to_event_messages(
item: &ResponseItem,
show_raw_agent_reasoning: bool,
) -> Vec<EventMsg> {
fn is_session_prefix(text: &str) -> bool {
let trimmed = text.trim_start();
let lowered = trimmed.to_ascii_lowercase();
lowered.starts_with("<environment_context>") || lowered.starts_with("<user_instructions>")
}
fn parse_user_message(message: &[ContentItem]) -> Option<UserMessageItem> {
let mut content: Vec<UserInput> = Vec::new();
for content_item in message.iter() {
match content_item {
ContentItem::InputText { text } => {
if is_session_prefix(text) {
return None;
}
content.push(UserInput::Text { text: text.clone() });
}
ContentItem::InputImage { image_url } => {
content.push(UserInput::Image {
image_url: image_url.clone(),
});
}
ContentItem::OutputText { text } => {
if is_session_prefix(text) {
return None;
}
warn!("Output text in user message: {}", text);
}
}
}
Some(UserMessageItem::new(&content))
}
fn parse_agent_message(message: &[ContentItem]) -> AgentMessageItem {
let mut content: Vec<AgentMessageContent> = Vec::new();
for content_item in message.iter() {
match content_item {
ContentItem::OutputText { text } => {
content.push(AgentMessageContent::Text { text: text.clone() });
}
_ => {
warn!(
"Unexpected content item in agent message: {:?}",
content_item
);
}
}
}
AgentMessageItem::new(&content)
}
pub fn parse_turn_item(item: &ResponseItem) -> Option<TurnItem> {
match item {
ResponseItem::Message { role, content, .. } => {
// Do not surface system messages as user events.
if role == "system" {
return Vec::new();
}
let mut events: Vec<EventMsg> = Vec::new();
let mut message_parts: Vec<String> = Vec::new();
let mut images: Vec<String> = Vec::new();
let mut kind: Option<InputMessageKind> = None;
for content_item in content.iter() {
match content_item {
ContentItem::InputText { text } => {
if kind.is_none() {
let trimmed = text.trim_start();
kind = if trimmed.starts_with("<environment_context>") {
Some(InputMessageKind::EnvironmentContext)
} else if trimmed.starts_with("<user_instructions>") {
Some(InputMessageKind::UserInstructions)
} else {
Some(InputMessageKind::Plain)
};
}
message_parts.push(text.clone());
}
ContentItem::InputImage { image_url } => {
images.push(image_url.clone());
}
ContentItem::OutputText { text } => {
events.push(EventMsg::AgentMessage(AgentMessageEvent {
message: text.clone(),
}));
}
}
}
if !message_parts.is_empty() || !images.is_empty() {
let message = if message_parts.is_empty() {
String::new()
} else {
message_parts.join("")
};
let images = if images.is_empty() {
None
} else {
Some(images)
};
events.push(EventMsg::UserMessage(UserMessageEvent {
message,
kind,
images,
}));
}
events
}
ResponseItem::Reasoning {
summary, content, ..
} => {
let mut events = Vec::new();
for ReasoningItemReasoningSummary::SummaryText { text } in summary {
events.push(EventMsg::AgentReasoning(AgentReasoningEvent {
text: text.clone(),
}));
}
if let Some(items) = content.as_ref().filter(|_| show_raw_agent_reasoning) {
for c in items {
let text = match c {
ReasoningItemContent::ReasoningText { text }
| ReasoningItemContent::Text { text } => text,
};
events.push(EventMsg::AgentReasoningRawContent(
AgentReasoningRawContentEvent { text: text.clone() },
));
}
}
events
}
ResponseItem::WebSearchCall { id, action, .. } => match action {
WebSearchAction::Search { query } => {
let call_id = id.clone().unwrap_or_else(|| "".to_string());
vec![EventMsg::WebSearchEnd(WebSearchEndEvent {
call_id,
query: query.clone(),
})]
}
WebSearchAction::Other => Vec::new(),
ResponseItem::Message { role, content, .. } => match role.as_str() {
"user" => parse_user_message(content).map(TurnItem::UserMessage),
"assistant" => Some(TurnItem::AgentMessage(parse_agent_message(content))),
"system" => None,
_ => None,
},
// Variants that require side effects are handled by higher layers and do not emit events here.
ResponseItem::FunctionCall { .. }
| ResponseItem::FunctionCallOutput { .. }
| ResponseItem::LocalShellCall { .. }
| ResponseItem::CustomToolCall { .. }
| ResponseItem::CustomToolCallOutput { .. }
| ResponseItem::Other => Vec::new(),
ResponseItem::Reasoning {
id,
summary,
content,
..
} => {
let summary_text = summary
.iter()
.map(|entry| match entry {
ReasoningItemReasoningSummary::SummaryText { text } => text.clone(),
})
.collect();
let raw_content = content
.clone()
.unwrap_or_default()
.into_iter()
.map(|entry| match entry {
ReasoningItemContent::ReasoningText { text }
| ReasoningItemContent::Text { text } => text,
})
.collect();
Some(TurnItem::Reasoning(ReasoningItem {
id: id.clone(),
summary_text,
raw_content,
}))
}
ResponseItem::WebSearchCall {
id,
action: WebSearchAction::Search { query },
..
} => Some(TurnItem::WebSearch(WebSearchItem {
id: id.clone().unwrap_or_default(),
query: query.clone(),
})),
_ => None,
}
}
#[cfg(test)]
mod tests {
use super::map_response_item_to_event_messages;
use crate::protocol::EventMsg;
use crate::protocol::InputMessageKind;
use assert_matches::assert_matches;
use super::parse_turn_item;
use codex_protocol::items::AgentMessageContent;
use codex_protocol::items::TurnItem;
use codex_protocol::models::ContentItem;
use codex_protocol::models::ReasoningItemContent;
use codex_protocol::models::ReasoningItemReasoningSummary;
use codex_protocol::models::ResponseItem;
use codex_protocol::models::WebSearchAction;
use codex_protocol::user_input::UserInput;
use pretty_assertions::assert_eq;
#[test]
fn maps_user_message_with_text_and_two_images() {
fn parses_user_message_with_text_and_two_images() {
let img1 = "https://example.com/one.png".to_string();
let img2 = "https://example.com/two.jpg".to_string();
@@ -153,16 +145,128 @@ mod tests {
],
};
let events = map_response_item_to_event_messages(&item, false);
assert_eq!(events.len(), 1, "expected a single user message event");
let turn_item = parse_turn_item(&item).expect("expected user message turn item");
match &events[0] {
EventMsg::UserMessage(user) => {
assert_eq!(user.message, "Hello world");
assert_matches!(user.kind, Some(InputMessageKind::Plain));
assert_eq!(user.images, Some(vec![img1, img2]));
match turn_item {
TurnItem::UserMessage(user) => {
let expected_content = vec![
UserInput::Text {
text: "Hello world".to_string(),
},
UserInput::Image { image_url: img1 },
UserInput::Image { image_url: img2 },
];
assert_eq!(user.content, expected_content);
}
other => panic!("expected UserMessage, got {other:?}"),
other => panic!("expected TurnItem::UserMessage, got {other:?}"),
}
}
#[test]
fn parses_agent_message() {
let item = ResponseItem::Message {
id: Some("msg-1".to_string()),
role: "assistant".to_string(),
content: vec![ContentItem::OutputText {
text: "Hello from Codex".to_string(),
}],
};
let turn_item = parse_turn_item(&item).expect("expected agent message turn item");
match turn_item {
TurnItem::AgentMessage(message) => {
let Some(AgentMessageContent::Text { text }) = message.content.first() else {
panic!("expected agent message text content");
};
assert_eq!(text, "Hello from Codex");
}
other => panic!("expected TurnItem::AgentMessage, got {other:?}"),
}
}
#[test]
fn parses_reasoning_summary_and_raw_content() {
let item = ResponseItem::Reasoning {
id: "reasoning_1".to_string(),
summary: vec![
ReasoningItemReasoningSummary::SummaryText {
text: "Step 1".to_string(),
},
ReasoningItemReasoningSummary::SummaryText {
text: "Step 2".to_string(),
},
],
content: Some(vec![ReasoningItemContent::ReasoningText {
text: "raw details".to_string(),
}]),
encrypted_content: None,
};
let turn_item = parse_turn_item(&item).expect("expected reasoning turn item");
match turn_item {
TurnItem::Reasoning(reasoning) => {
assert_eq!(
reasoning.summary_text,
vec!["Step 1".to_string(), "Step 2".to_string()]
);
assert_eq!(reasoning.raw_content, vec!["raw details".to_string()]);
}
other => panic!("expected TurnItem::Reasoning, got {other:?}"),
}
}
#[test]
fn parses_reasoning_including_raw_content() {
let item = ResponseItem::Reasoning {
id: "reasoning_2".to_string(),
summary: vec![ReasoningItemReasoningSummary::SummaryText {
text: "Summarized step".to_string(),
}],
content: Some(vec![
ReasoningItemContent::ReasoningText {
text: "raw step".to_string(),
},
ReasoningItemContent::Text {
text: "final thought".to_string(),
},
]),
encrypted_content: None,
};
let turn_item = parse_turn_item(&item).expect("expected reasoning turn item");
match turn_item {
TurnItem::Reasoning(reasoning) => {
assert_eq!(reasoning.summary_text, vec!["Summarized step".to_string()]);
assert_eq!(
reasoning.raw_content,
vec!["raw step".to_string(), "final thought".to_string()]
);
}
other => panic!("expected TurnItem::Reasoning, got {other:?}"),
}
}
#[test]
fn parses_web_search_call() {
let item = ResponseItem::WebSearchCall {
id: Some("ws_1".to_string()),
status: Some("completed".to_string()),
action: WebSearchAction::Search {
query: "weather".to_string(),
},
};
let turn_item = parse_turn_item(&item).expect("expected web search turn item");
match turn_item {
TurnItem::WebSearch(search) => {
assert_eq!(search.id, "ws_1");
assert_eq!(search.query, "weather");
}
other => panic!("expected TurnItem::WebSearch, got {other:?}"),
}
}
}

View File

@@ -31,18 +31,14 @@ pub enum Feature {
UnifiedExec,
/// Use the streamable exec-command/write-stdin tool pair.
StreamableShell,
/// Use the official Rust MCP client (rmcp).
/// Enable experimental RMCP features such as OAuth login.
RmcpClient,
/// Include the plan tool.
PlanTool,
/// Include the freeform apply_patch tool.
ApplyPatchFreeform,
/// Include the view_image tool.
ViewImageTool,
/// Allow the model to request web searches.
WebSearchRequest,
/// Automatically approve all approval requests from the harness.
ApproveAll,
}
impl Feature {
@@ -74,7 +70,6 @@ pub struct Features {
#[derive(Debug, Clone, Default)]
pub struct FeatureOverrides {
pub include_plan_tool: Option<bool>,
pub include_apply_patch_tool: Option<bool>,
pub include_view_image_tool: Option<bool>,
pub web_search_request: Option<bool>,
@@ -83,7 +78,6 @@ pub struct FeatureOverrides {
impl FeatureOverrides {
fn apply(self, features: &mut Features) {
LegacyFeatureToggles {
include_plan_tool: self.include_plan_tool,
include_apply_patch_tool: self.include_apply_patch_tool,
include_view_image_tool: self.include_view_image_tool,
tools_web_search: self.web_search_request,
@@ -158,7 +152,6 @@ impl Features {
}
let profile_legacy = LegacyFeatureToggles {
include_plan_tool: config_profile.include_plan_tool,
include_apply_patch_tool: config_profile.include_apply_patch_tool,
include_view_image_tool: config_profile.include_view_image_tool,
experimental_use_freeform_apply_patch: config_profile
@@ -225,12 +218,6 @@ pub const FEATURES: &[FeatureSpec] = &[
stage: Stage::Experimental,
default_enabled: false,
},
FeatureSpec {
id: Feature::PlanTool,
key: "plan_tool",
stage: Stage::Stable,
default_enabled: false,
},
FeatureSpec {
id: Feature::ApplyPatchFreeform,
key: "apply_patch_freeform",
@@ -249,10 +236,4 @@ pub const FEATURES: &[FeatureSpec] = &[
stage: Stage::Stable,
default_enabled: false,
},
FeatureSpec {
id: Feature::ApproveAll,
key: "approve_all",
stage: Stage::Experimental,
default_enabled: false,
},
];

View File

@@ -29,10 +29,6 @@ const ALIASES: &[Alias] = &[
legacy_key: "include_apply_patch_tool",
feature: Feature::ApplyPatchFreeform,
},
Alias {
legacy_key: "include_plan_tool",
feature: Feature::PlanTool,
},
Alias {
legacy_key: "include_view_image_tool",
feature: Feature::ViewImageTool,
@@ -55,7 +51,6 @@ pub(crate) fn feature_for_key(key: &str) -> Option<Feature> {
#[derive(Debug, Default)]
pub struct LegacyFeatureToggles {
pub include_plan_tool: Option<bool>,
pub include_apply_patch_tool: Option<bool>,
pub include_view_image_tool: Option<bool>,
pub experimental_use_freeform_apply_patch: Option<bool>,
@@ -68,12 +63,6 @@ pub struct LegacyFeatureToggles {
impl LegacyFeatureToggles {
pub fn apply(self, features: &mut Features) {
set_if_some(
features,
Feature::PlanTool,
self.include_plan_tool,
"include_plan_tool",
);
set_if_some(
features,
Feature::ApplyPatchFreeform,

View File

@@ -14,6 +14,7 @@ mod client_common;
pub mod codex;
mod codex_conversation;
pub use codex_conversation::CodexConversation;
mod codex_delegate;
mod command_safety;
pub mod config;
pub mod config_edit;
@@ -36,6 +37,7 @@ mod mcp_tool_call;
mod message_history;
mod model_provider_info;
pub mod parse_command;
mod response_processing;
pub mod sandboxing;
pub mod token_data;
mod truncate;
@@ -98,11 +100,10 @@ pub use client_common::REVIEW_PROMPT;
pub use client_common::ResponseEvent;
pub use client_common::ResponseStream;
pub use codex::compact::content_items_to_text;
pub use codex::compact::is_session_prefix_message;
pub use codex_protocol::models::ContentItem;
pub use codex_protocol::models::LocalShellAction;
pub use codex_protocol::models::LocalShellExecAction;
pub use codex_protocol::models::LocalShellStatus;
pub use codex_protocol::models::ReasoningItemContent;
pub use codex_protocol::models::ResponseItem;
pub use event_mapping::parse_turn_item;
pub mod otel_init;

View File

@@ -10,10 +10,16 @@ use tracing::warn;
use crate::config_types::McpServerConfig;
use crate::config_types::McpServerTransportConfig;
#[derive(Debug, Clone)]
pub struct McpAuthStatusEntry {
pub config: McpServerConfig,
pub auth_status: McpAuthStatus,
}
pub async fn compute_auth_statuses<'a, I>(
servers: I,
store_mode: OAuthCredentialsStoreMode,
) -> HashMap<String, McpAuthStatus>
) -> HashMap<String, McpAuthStatusEntry>
where
I: IntoIterator<Item = (&'a String, &'a McpServerConfig)>,
{
@@ -21,14 +27,18 @@ where
let name = name.clone();
let config = config.clone();
async move {
let status = match compute_auth_status(&name, &config, store_mode).await {
let auth_status = match compute_auth_status(&name, &config, store_mode).await {
Ok(status) => status,
Err(error) => {
warn!("failed to determine auth status for MCP server `{name}`: {error:?}");
McpAuthStatus::Unsupported
}
};
(name, status)
let entry = McpAuthStatusEntry {
config,
auth_status,
};
(name, entry)
}
});

View File

@@ -1,6 +1,6 @@
//! Connection manager for Model Context Protocol (MCP) servers.
//!
//! The [`McpConnectionManager`] owns one [`codex_mcp_client::McpClient`] per
//! The [`McpConnectionManager`] owns one [`codex_rmcp_client::RmcpClient`] per
//! configured server (keyed by the *server name*). It offers convenience
//! helpers to query the available tools across *all* servers and returns them
//! in a single aggregated map using the fully-qualified tool name
@@ -10,14 +10,12 @@ use std::collections::HashMap;
use std::collections::HashSet;
use std::env;
use std::ffi::OsString;
use std::path::PathBuf;
use std::sync::Arc;
use std::time::Duration;
use anyhow::Context;
use anyhow::Result;
use anyhow::anyhow;
use codex_mcp_client::McpClient;
use codex_rmcp_client::OAuthCredentialsStoreMode;
use codex_rmcp_client::RmcpClient;
use mcp_types::ClientCapabilities;
@@ -99,134 +97,12 @@ struct ToolInfo {
}
struct ManagedClient {
client: McpClientAdapter,
client: Arc<RmcpClient>,
startup_timeout: Duration,
tool_timeout: Option<Duration>,
}
#[derive(Clone)]
enum McpClientAdapter {
Legacy(Arc<McpClient>),
Rmcp(Arc<RmcpClient>),
}
impl McpClientAdapter {
#[allow(clippy::too_many_arguments)]
async fn new_stdio_client(
use_rmcp_client: bool,
program: OsString,
args: Vec<OsString>,
env: Option<HashMap<String, String>>,
env_vars: Vec<String>,
cwd: Option<PathBuf>,
params: mcp_types::InitializeRequestParams,
startup_timeout: Duration,
) -> Result<Self> {
if use_rmcp_client {
let client =
Arc::new(RmcpClient::new_stdio_client(program, args, env, &env_vars, cwd).await?);
client.initialize(params, Some(startup_timeout)).await?;
Ok(McpClientAdapter::Rmcp(client))
} else {
let client =
Arc::new(McpClient::new_stdio_client(program, args, env, &env_vars, cwd).await?);
client.initialize(params, Some(startup_timeout)).await?;
Ok(McpClientAdapter::Legacy(client))
}
}
#[allow(clippy::too_many_arguments)]
async fn new_streamable_http_client(
server_name: String,
url: String,
bearer_token: Option<String>,
http_headers: Option<HashMap<String, String>>,
env_http_headers: Option<HashMap<String, String>>,
params: mcp_types::InitializeRequestParams,
startup_timeout: Duration,
store_mode: OAuthCredentialsStoreMode,
) -> Result<Self> {
let client = Arc::new(
RmcpClient::new_streamable_http_client(
&server_name,
&url,
bearer_token,
http_headers,
env_http_headers,
store_mode,
)
.await?,
);
client.initialize(params, Some(startup_timeout)).await?;
Ok(McpClientAdapter::Rmcp(client))
}
async fn list_tools(
&self,
params: Option<mcp_types::ListToolsRequestParams>,
timeout: Option<Duration>,
) -> Result<mcp_types::ListToolsResult> {
match self {
McpClientAdapter::Legacy(client) => client.list_tools(params, timeout).await,
McpClientAdapter::Rmcp(client) => client.list_tools(params, timeout).await,
}
}
async fn list_resources(
&self,
params: Option<mcp_types::ListResourcesRequestParams>,
timeout: Option<Duration>,
) -> Result<mcp_types::ListResourcesResult> {
match self {
McpClientAdapter::Legacy(_) => Ok(ListResourcesResult {
next_cursor: None,
resources: Vec::new(),
}),
McpClientAdapter::Rmcp(client) => client.list_resources(params, timeout).await,
}
}
async fn read_resource(
&self,
params: mcp_types::ReadResourceRequestParams,
timeout: Option<Duration>,
) -> Result<mcp_types::ReadResourceResult> {
match self {
McpClientAdapter::Legacy(_) => Err(anyhow!(
"resources/read is not supported by legacy MCP clients"
)),
McpClientAdapter::Rmcp(client) => client.read_resource(params, timeout).await,
}
}
async fn list_resource_templates(
&self,
params: Option<mcp_types::ListResourceTemplatesRequestParams>,
timeout: Option<Duration>,
) -> Result<mcp_types::ListResourceTemplatesResult> {
match self {
McpClientAdapter::Legacy(_) => Ok(ListResourceTemplatesResult {
next_cursor: None,
resource_templates: Vec::new(),
}),
McpClientAdapter::Rmcp(client) => client.list_resource_templates(params, timeout).await,
}
}
async fn call_tool(
&self,
name: String,
arguments: Option<serde_json::Value>,
timeout: Option<Duration>,
) -> Result<mcp_types::CallToolResult> {
match self {
McpClientAdapter::Legacy(client) => client.call_tool(name, arguments, timeout).await,
McpClientAdapter::Rmcp(client) => client.call_tool(name, arguments, timeout).await,
}
}
}
/// A thin wrapper around a set of running [`McpClient`] instances.
/// A thin wrapper around a set of running [`RmcpClient`] instances.
#[derive(Default)]
pub(crate) struct McpConnectionManager {
/// Server-name -> client instance.
@@ -237,10 +113,13 @@ pub(crate) struct McpConnectionManager {
/// Fully qualified tool name -> tool instance.
tools: HashMap<String, ToolInfo>,
/// Server-name -> configured tool filters.
tool_filters: HashMap<String, ToolFilter>,
}
impl McpConnectionManager {
/// Spawn a [`McpClient`] for each configured server.
/// Spawn a [`RmcpClient`] for each configured server.
///
/// * `mcp_servers` Map loaded from the user configuration where *keys*
/// are human-readable server identifiers and *values* are the spawn
@@ -250,7 +129,6 @@ impl McpConnectionManager {
/// user should be informed about these errors.
pub async fn new(
mcp_servers: HashMap<String, McpServerConfig>,
use_rmcp_client: bool,
store_mode: OAuthCredentialsStoreMode,
) -> Result<(Self, ClientStartErrors)> {
// Early exit if no servers are configured.
@@ -261,6 +139,7 @@ impl McpConnectionManager {
// Launch all configured servers concurrently.
let mut join_set = JoinSet::new();
let mut errors = ClientStartErrors::new();
let mut tool_filters: HashMap<String, ToolFilter> = HashMap::new();
for (server_name, cfg) in mcp_servers {
// Validate server name before spawning
@@ -273,11 +152,13 @@ impl McpConnectionManager {
}
if !cfg.enabled {
tool_filters.insert(server_name, ToolFilter::from_config(&cfg));
continue;
}
let startup_timeout = cfg.startup_timeout_sec.unwrap_or(DEFAULT_STARTUP_TIMEOUT);
let tool_timeout = cfg.tool_timeout_sec.unwrap_or(DEFAULT_TOOL_TIMEOUT);
tool_filters.insert(server_name.clone(), ToolFilter::from_config(&cfg));
let resolved_bearer_token = match &cfg.transport {
McpServerTransportConfig::StreamableHttp {
@@ -310,7 +191,8 @@ impl McpConnectionManager {
protocol_version: mcp_types::MCP_SCHEMA_VERSION.to_owned(),
};
let client = match transport {
let resolved_bearer_token = resolved_bearer_token.unwrap_or_default();
let client_result = match transport {
McpServerTransportConfig::Stdio {
command,
args,
@@ -320,17 +202,18 @@ impl McpConnectionManager {
} => {
let command_os: OsString = command.into();
let args_os: Vec<OsString> = args.into_iter().map(Into::into).collect();
McpClientAdapter::new_stdio_client(
use_rmcp_client,
command_os,
args_os,
env,
env_vars,
cwd,
params,
startup_timeout,
)
.await
match RmcpClient::new_stdio_client(command_os, args_os, env, &env_vars, cwd)
.await
{
Ok(client) => {
let client = Arc::new(client);
client
.initialize(params.clone(), Some(startup_timeout))
.await
.map(|_| client)
}
Err(err) => Err(err.into()),
}
}
McpServerTransportConfig::StreamableHttp {
url,
@@ -338,22 +221,32 @@ impl McpConnectionManager {
env_http_headers,
..
} => {
McpClientAdapter::new_streamable_http_client(
server_name.clone(),
url,
resolved_bearer_token.unwrap_or_default(),
match RmcpClient::new_streamable_http_client(
&server_name,
&url,
resolved_bearer_token.clone(),
http_headers,
env_http_headers,
params,
startup_timeout,
store_mode,
)
.await
{
Ok(client) => {
let client = Arc::new(client);
client
.initialize(params.clone(), Some(startup_timeout))
.await
.map(|_| client)
}
Err(err) => Err(err),
}
}
}
.map(|c| (c, startup_timeout));
};
((server_name, tool_timeout), client)
(
(server_name, tool_timeout),
client_result.map(|client| (client, startup_timeout)),
)
});
}
@@ -393,9 +286,17 @@ impl McpConnectionManager {
}
};
let tools = qualify_tools(all_tools);
let filtered_tools = filter_tools(all_tools, &tool_filters);
let tools = qualify_tools(filtered_tools);
Ok((Self { clients, tools }, errors))
Ok((
Self {
clients,
tools,
tool_filters,
},
errors,
))
}
/// Returns a single map that contains all tools. Each key is the
@@ -541,6 +442,13 @@ impl McpConnectionManager {
tool: &str,
arguments: Option<serde_json::Value>,
) -> Result<mcp_types::CallToolResult> {
if let Some(filter) = self.tool_filters.get(server)
&& !filter.allows(tool)
{
return Err(anyhow!(
"tool '{tool}' is disabled for MCP server '{server}'"
));
}
let managed = self
.clients
.get(server)
@@ -619,6 +527,52 @@ impl McpConnectionManager {
}
}
/// A tool is allowed to be used if both are true:
/// 1. enabled is None (no allowlist is set) or the tool is explicitly enabled.
/// 2. The tool is not explicitly disabled.
#[derive(Default, Clone)]
struct ToolFilter {
enabled: Option<HashSet<String>>,
disabled: HashSet<String>,
}
impl ToolFilter {
fn from_config(cfg: &McpServerConfig) -> Self {
let enabled = cfg
.enabled_tools
.as_ref()
.map(|tools| tools.iter().cloned().collect::<HashSet<_>>());
let disabled = cfg
.disabled_tools
.as_ref()
.map(|tools| tools.iter().cloned().collect::<HashSet<_>>())
.unwrap_or_default();
Self { enabled, disabled }
}
fn allows(&self, tool_name: &str) -> bool {
if let Some(enabled) = &self.enabled
&& !enabled.contains(tool_name)
{
return false;
}
!self.disabled.contains(tool_name)
}
}
fn filter_tools(tools: Vec<ToolInfo>, filters: &HashMap<String, ToolFilter>) -> Vec<ToolInfo> {
tools
.into_iter()
.filter(|tool| {
filters
.get(&tool.server_name)
.is_none_or(|filter| filter.allows(&tool.tool_name))
})
.collect()
}
fn resolve_bearer_token(
server_name: &str,
bearer_token_env_var: Option<&str>,
@@ -711,6 +665,7 @@ fn is_valid_mcp_server_name(server_name: &str) -> bool {
mod tests {
use super::*;
use mcp_types::ToolInputSchema;
use std::collections::HashSet;
fn create_test_tool(server_name: &str, tool_name: &str) -> ToolInfo {
ToolInfo {
@@ -793,4 +748,75 @@ mod tests {
"mcp__my_server__yet_anot419a82a89325c1b477274a41f8c65ea5f3a7f341"
);
}
#[test]
fn tool_filter_allows_by_default() {
let filter = ToolFilter::default();
assert!(filter.allows("any"));
}
#[test]
fn tool_filter_applies_enabled_list() {
let filter = ToolFilter {
enabled: Some(HashSet::from(["allowed".to_string()])),
disabled: HashSet::new(),
};
assert!(filter.allows("allowed"));
assert!(!filter.allows("denied"));
}
#[test]
fn tool_filter_applies_disabled_list() {
let filter = ToolFilter {
enabled: None,
disabled: HashSet::from(["blocked".to_string()]),
};
assert!(!filter.allows("blocked"));
assert!(filter.allows("open"));
}
#[test]
fn tool_filter_applies_enabled_then_disabled() {
let filter = ToolFilter {
enabled: Some(HashSet::from(["keep".to_string(), "remove".to_string()])),
disabled: HashSet::from(["remove".to_string()]),
};
assert!(filter.allows("keep"));
assert!(!filter.allows("remove"));
assert!(!filter.allows("unknown"));
}
#[test]
fn filter_tools_applies_per_server_filters() {
let tools = vec![
create_test_tool("server1", "tool_a"),
create_test_tool("server1", "tool_b"),
create_test_tool("server2", "tool_a"),
];
let mut filters = HashMap::new();
filters.insert(
"server1".to_string(),
ToolFilter {
enabled: Some(HashSet::from(["tool_a".to_string(), "tool_b".to_string()])),
disabled: HashSet::from(["tool_b".to_string()]),
},
);
filters.insert(
"server2".to_string(),
ToolFilter {
enabled: None,
disabled: HashSet::from(["tool_a".to_string()]),
},
);
let filtered = filter_tools(tools, &filters);
assert_eq!(filtered.len(), 1);
assert_eq!(filtered[0].server_name, "server1");
assert_eq!(filtered[0].tool_name, "tool_a");
}
}

View File

@@ -3,7 +3,7 @@ use std::time::Instant;
use tracing::error;
use crate::codex::Session;
use crate::protocol::Event;
use crate::codex::TurnContext;
use crate::protocol::EventMsg;
use crate::protocol::McpInvocation;
use crate::protocol::McpToolCallBeginEvent;
@@ -15,7 +15,7 @@ use codex_protocol::models::ResponseInputItem;
/// `McpToolCallBegin` and `McpToolCallEnd` events to the `Session`.
pub(crate) async fn handle_mcp_tool_call(
sess: &Session,
sub_id: &str,
turn_context: &TurnContext,
call_id: String,
server: String,
tool_name: String,
@@ -51,7 +51,7 @@ pub(crate) async fn handle_mcp_tool_call(
call_id: call_id.clone(),
invocation: invocation.clone(),
});
notify_mcp_tool_call_event(sess, sub_id, tool_call_begin_event).await;
notify_mcp_tool_call_event(sess, turn_context, tool_call_begin_event).await;
let start = Instant::now();
// Perform the tool call.
@@ -69,15 +69,11 @@ pub(crate) async fn handle_mcp_tool_call(
result: result.clone(),
});
notify_mcp_tool_call_event(sess, sub_id, tool_call_end_event.clone()).await;
notify_mcp_tool_call_event(sess, turn_context, tool_call_end_event.clone()).await;
ResponseInputItem::McpToolCallOutput { call_id, result }
}
async fn notify_mcp_tool_call_event(sess: &Session, sub_id: &str, event: EventMsg) {
sess.send_event(Event {
id: sub_id.to_string(),
msg: event,
})
.await;
async fn notify_mcp_tool_call_event(sess: &Session, turn_context: &TurnContext, event: EventMsg) {
sess.send_event(turn_context, event).await;
}

View File

@@ -84,11 +84,7 @@ macro_rules! model_family {
/// Returns a `ModelFamily` for the given model slug, or `None` if the slug
/// does not match any known model family.
pub fn find_family_for_model(mut slug: &str) -> Option<ModelFamily> {
// TODO(jif) clean once we have proper feature flags
if matches!(std::env::var("CODEX_EXPERIMENTAL").as_deref(), Ok("1")) {
slug = "codex-experimental";
}
pub fn find_family_for_model(slug: &str) -> Option<ModelFamily> {
if slug.starts_with("o3") {
model_family!(
slug, "o3",

View File

@@ -53,6 +53,11 @@ pub struct ModelProviderInfo {
/// variable and set it.
pub env_key_instructions: Option<String>,
/// Value to use with `Authorization: Bearer <token>` header. Use of this
/// config is discouraged in favor of `env_key` for security reasons, but
/// this may be necessary when using this programmatically.
pub experimental_bearer_token: Option<String>,
/// Which wire protocol this provider expects.
#[serde(default)]
pub wire_api: WireApi,
@@ -102,14 +107,18 @@ impl ModelProviderInfo {
client: &'a reqwest::Client,
auth: &Option<CodexAuth>,
) -> crate::error::Result<reqwest::RequestBuilder> {
let effective_auth = match self.api_key() {
Ok(Some(key)) => Some(CodexAuth::from_api_key(&key)),
Ok(None) => auth.clone(),
Err(err) => {
if auth.is_some() {
auth.clone()
} else {
return Err(err);
let effective_auth = if let Some(secret_key) = &self.experimental_bearer_token {
Some(CodexAuth::from_api_key(secret_key))
} else {
match self.api_key() {
Ok(Some(key)) => Some(CodexAuth::from_api_key(&key)),
Ok(None) => auth.clone(),
Err(err) => {
if auth.is_some() {
auth.clone()
} else {
return Err(err);
}
}
}
};
@@ -274,6 +283,7 @@ pub fn built_in_model_providers() -> HashMap<String, ModelProviderInfo> {
.filter(|v| !v.trim().is_empty()),
env_key: None,
env_key_instructions: None,
experimental_bearer_token: None,
wire_api: WireApi::Responses,
query_params: None,
http_headers: Some(
@@ -333,6 +343,7 @@ pub fn create_oss_provider_with_base_url(base_url: &str) -> ModelProviderInfo {
base_url: Some(base_url.into()),
env_key: None,
env_key_instructions: None,
experimental_bearer_token: None,
wire_api: WireApi::Chat,
query_params: None,
http_headers: None,
@@ -372,6 +383,7 @@ base_url = "http://localhost:11434/v1"
base_url: Some("http://localhost:11434/v1".into()),
env_key: None,
env_key_instructions: None,
experimental_bearer_token: None,
wire_api: WireApi::Chat,
query_params: None,
http_headers: None,
@@ -399,6 +411,7 @@ query_params = { api-version = "2025-04-01-preview" }
base_url: Some("https://xxxxx.openai.azure.com/openai".into()),
env_key: Some("AZURE_OPENAI_API_KEY".into()),
env_key_instructions: None,
experimental_bearer_token: None,
wire_api: WireApi::Chat,
query_params: Some(maplit::hashmap! {
"api-version".to_string() => "2025-04-01-preview".to_string(),
@@ -429,6 +442,7 @@ env_http_headers = { "X-Example-Env-Header" = "EXAMPLE_ENV_VAR" }
base_url: Some("https://example.com".into()),
env_key: Some("API_KEY".into()),
env_key_instructions: None,
experimental_bearer_token: None,
wire_api: WireApi::Chat,
query_params: None,
http_headers: Some(maplit::hashmap! {
@@ -455,6 +469,7 @@ env_http_headers = { "X-Example-Env-Header" = "EXAMPLE_ENV_VAR" }
base_url: Some(base_url.into()),
env_key: None,
env_key_instructions: None,
experimental_bearer_token: None,
wire_api: WireApi::Responses,
query_params: None,
http_headers: None,
@@ -487,6 +502,7 @@ env_http_headers = { "X-Example-Env-Header" = "EXAMPLE_ENV_VAR" }
base_url: Some("https://example.com".into()),
env_key: None,
env_key_instructions: None,
experimental_bearer_token: None,
wire_api: WireApi::Responses,
query_params: None,
http_headers: None,

View File

@@ -1,4 +1,4 @@
use crate::bash::try_parse_bash;
use crate::bash::try_parse_shell;
use crate::bash::try_parse_word_only_commands_sequence;
use codex_protocol::parse_command::ParsedCommand;
use shlex::split as shlex_split;
@@ -193,6 +193,19 @@ mod tests {
);
}
#[test]
fn zsh_lc_supports_cat() {
let inner = "cat README.md";
assert_parsed(
&vec_str(&["zsh", "-lc", inner]),
vec![ParsedCommand::Read {
cmd: inner.to_string(),
name: "README.md".to_string(),
path: PathBuf::from("README.md"),
}],
);
}
#[test]
fn cd_then_cat_is_single_read() {
assert_parsed(
@@ -843,7 +856,7 @@ mod tests {
}
pub fn parse_command_impl(command: &[String]) -> Vec<ParsedCommand> {
if let Some(commands) = parse_bash_lc_commands(command) {
if let Some(commands) = parse_shell_lc_commands(command) {
return commands;
}
@@ -981,7 +994,7 @@ fn is_valid_sed_n_arg(arg: Option<&str>) -> bool {
}
/// Normalize a command by:
/// - Removing `yes`/`no`/`bash -c`/`bash -lc` prefixes.
/// - Removing `yes`/`no`/`bash -c`/`bash -lc`/`zsh -c`/`zsh -lc` prefixes.
/// - Splitting on `|` and `&&`/`||`/`;
fn normalize_tokens(cmd: &[String]) -> Vec<String> {
match cmd {
@@ -993,9 +1006,10 @@ fn normalize_tokens(cmd: &[String]) -> Vec<String> {
// Do not re-shlex already-tokenized input; just drop the prefix.
rest.to_vec()
}
[bash, flag, script] if bash == "bash" && (flag == "-c" || flag == "-lc") => {
shlex_split(script)
.unwrap_or_else(|| vec!["bash".to_string(), flag.clone(), script.clone()])
[shell, flag, script]
if (shell == "bash" || shell == "zsh") && (flag == "-c" || flag == "-lc") =>
{
shlex_split(script).unwrap_or_else(|| vec![shell.clone(), flag.clone(), script.clone()])
}
_ => cmd.to_vec(),
}
@@ -1151,19 +1165,19 @@ fn parse_find_query_and_path(tail: &[String]) -> (Option<String>, Option<String>
(query, path)
}
fn parse_bash_lc_commands(original: &[String]) -> Option<Vec<ParsedCommand>> {
let [bash, flag, script] = original else {
fn parse_shell_lc_commands(original: &[String]) -> Option<Vec<ParsedCommand>> {
let [shell, flag, script] = original else {
return None;
};
if bash != "bash" || flag != "-lc" {
if flag != "-lc" || !(shell == "bash" || shell == "zsh") {
return None;
}
if let Some(tree) = try_parse_bash(script)
if let Some(tree) = try_parse_shell(script)
&& let Some(all_commands) = try_parse_word_only_commands_sequence(&tree, script)
&& !all_commands.is_empty()
{
let script_tokens = shlex_split(script)
.unwrap_or_else(|| vec!["bash".to_string(), flag.clone(), script.clone()]);
.unwrap_or_else(|| vec![shell.clone(), flag.clone(), script.clone()]);
// Strip small formatting helpers (e.g., head/tail/awk/wc/etc) so we
// bias toward the primary command when pipelines are present.
// First, drop obvious small formatting helpers (e.g., wc/awk/etc).

View File

@@ -0,0 +1,105 @@
use crate::codex::Session;
use codex_protocol::models::FunctionCallOutputPayload;
use codex_protocol::models::ResponseInputItem;
use codex_protocol::models::ResponseItem;
use tracing::warn;
/// Process streamed `ResponseItem`s from the model into the pair of:
/// - items we should record in conversation history; and
/// - `ResponseInputItem`s to send back to the model on the next turn.
pub(crate) async fn process_items(
processed_items: Vec<crate::codex::ProcessedResponseItem>,
sess: &Session,
) -> (Vec<ResponseInputItem>, Vec<ResponseItem>) {
let mut items_to_record_in_conversation_history = Vec::<ResponseItem>::new();
let mut responses = Vec::<ResponseInputItem>::new();
for processed_response_item in processed_items {
let crate::codex::ProcessedResponseItem { item, response } = processed_response_item;
match (&item, &response) {
(ResponseItem::Message { role, .. }, None) if role == "assistant" => {
// If the model returned a message, we need to record it.
items_to_record_in_conversation_history.push(item);
}
(
ResponseItem::LocalShellCall { .. },
Some(ResponseInputItem::FunctionCallOutput { call_id, output }),
) => {
items_to_record_in_conversation_history.push(item);
items_to_record_in_conversation_history.push(ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output: output.clone(),
});
}
(
ResponseItem::FunctionCall { .. },
Some(ResponseInputItem::FunctionCallOutput { call_id, output }),
) => {
items_to_record_in_conversation_history.push(item);
items_to_record_in_conversation_history.push(ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output: output.clone(),
});
}
(
ResponseItem::CustomToolCall { .. },
Some(ResponseInputItem::CustomToolCallOutput { call_id, output }),
) => {
items_to_record_in_conversation_history.push(item);
items_to_record_in_conversation_history.push(ResponseItem::CustomToolCallOutput {
call_id: call_id.clone(),
output: output.clone(),
});
}
(
ResponseItem::FunctionCall { .. },
Some(ResponseInputItem::McpToolCallOutput { call_id, result }),
) => {
items_to_record_in_conversation_history.push(item);
let output = match result {
Ok(call_tool_result) => {
crate::codex::convert_call_tool_result_to_function_call_output_payload(
call_tool_result,
)
}
Err(err) => FunctionCallOutputPayload {
content: err.clone(),
success: Some(false),
},
};
items_to_record_in_conversation_history.push(ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output,
});
}
(
ResponseItem::Reasoning {
id,
summary,
content,
encrypted_content,
},
None,
) => {
items_to_record_in_conversation_history.push(ResponseItem::Reasoning {
id: id.clone(),
summary: summary.clone(),
content: content.clone(),
encrypted_content: encrypted_content.clone(),
});
}
_ => {
warn!("Unexpected response item: {item:?} with response: {response:?}");
}
};
if let Some(response) = response {
responses.push(response);
}
}
// Only attempt to take the lock if there is something to record.
if !items_to_record_in_conversation_history.is_empty() {
sess.record_conversation_items(&items_to_record_in_conversation_history)
.await;
}
(responses, items_to_record_in_conversation_history)
}

View File

@@ -71,6 +71,8 @@ pub(crate) fn should_persist_event_msg(ev: &EventMsg) -> bool {
| EventMsg::PlanUpdate(_)
| EventMsg::ShutdownComplete
| EventMsg::ViewImageToolCall(_)
| EventMsg::ConversationPath(_) => false,
| EventMsg::ConversationPath(_)
| EventMsg::ItemStarted(_)
| EventMsg::ItemCompleted(_) => false,
}
}

View File

@@ -24,7 +24,6 @@ use codex_protocol::models::ContentItem;
use codex_protocol::models::ResponseItem;
use codex_protocol::protocol::CompactedItem;
use codex_protocol::protocol::EventMsg;
use codex_protocol::protocol::InputMessageKind;
use codex_protocol::protocol::RolloutItem;
use codex_protocol::protocol::RolloutLine;
use codex_protocol::protocol::SessionMeta;
@@ -543,7 +542,6 @@ async fn test_tail_includes_last_response_items() -> Result<()> {
timestamp: ts.to_string(),
item: RolloutItem::EventMsg(EventMsg::UserMessage(UserMessageEvent {
message: "hello".into(),
kind: Some(InputMessageKind::Plain),
images: None,
})),
};
@@ -627,7 +625,6 @@ async fn test_tail_handles_short_sessions() -> Result<()> {
timestamp: ts.to_string(),
item: RolloutItem::EventMsg(EventMsg::UserMessage(UserMessageEvent {
message: "hi".into(),
kind: Some(InputMessageKind::Plain),
images: None,
})),
};
@@ -712,7 +709,6 @@ async fn test_tail_skips_trailing_non_responses() -> Result<()> {
timestamp: ts.to_string(),
item: RolloutItem::EventMsg(EventMsg::UserMessage(UserMessageEvent {
message: "hello".into(),
kind: Some(InputMessageKind::Plain),
images: None,
})),
};

View File

@@ -36,8 +36,12 @@ impl SessionState {
self.history.record_items(items)
}
pub(crate) fn history_snapshot(&self) -> Vec<ResponseItem> {
self.history.contents()
pub(crate) fn history_snapshot(&mut self) -> Vec<ResponseItem> {
self.history.get_history()
}
pub(crate) fn clone_history(&self) -> ConversationHistory {
self.history.clone()
}
pub(crate) fn replace_history(&mut self, items: Vec<ResponseItem>) {

View File

@@ -11,6 +11,7 @@ use tokio_util::task::AbortOnDropHandle;
use codex_protocol::models::ResponseInputItem;
use tokio::sync::oneshot;
use crate::codex::TurnContext;
use crate::protocol::ReviewDecision;
use crate::tasks::SessionTask;
@@ -53,10 +54,12 @@ pub(crate) struct RunningTask {
pub(crate) task: Arc<dyn SessionTask>,
pub(crate) cancellation_token: CancellationToken,
pub(crate) handle: Arc<AbortOnDropHandle<()>>,
pub(crate) turn_context: Arc<TurnContext>,
}
impl ActiveTurn {
pub(crate) fn add_task(&mut self, sub_id: String, task: RunningTask) {
pub(crate) fn add_task(&mut self, task: RunningTask) {
let sub_id = task.turn_context.sub_id.clone();
self.tasks.insert(sub_id, task);
}
@@ -65,8 +68,8 @@ impl ActiveTurn {
self.tasks.is_empty()
}
pub(crate) fn drain_tasks(&mut self) -> IndexMap<String, RunningTask> {
std::mem::take(&mut self.tasks)
pub(crate) fn drain_tasks(&mut self) -> Vec<RunningTask> {
self.tasks.drain(..).map(|(_, task)| task).collect()
}
}

View File

@@ -5,8 +5,8 @@ use tokio_util::sync::CancellationToken;
use crate::codex::TurnContext;
use crate::codex::compact;
use crate::protocol::InputItem;
use crate::state::TaskKind;
use codex_protocol::user_input::UserInput;
use super::SessionTask;
use super::SessionTaskContext;
@@ -24,10 +24,9 @@ impl SessionTask for CompactTask {
self: Arc<Self>,
session: Arc<SessionTaskContext>,
ctx: Arc<TurnContext>,
sub_id: String,
input: Vec<InputItem>,
input: Vec<UserInput>,
_cancellation_token: CancellationToken,
) -> Option<String> {
compact::run_compact_task(session.clone_session(), ctx, sub_id, input).await
compact::run_compact_task(session.clone_session(), ctx, input).await
}
}

View File

@@ -13,17 +13,18 @@ use tokio_util::task::AbortOnDropHandle;
use tracing::trace;
use tracing::warn;
use crate::AuthManager;
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::protocol::Event;
use crate::config::Config;
use crate::protocol::EventMsg;
use crate::protocol::InputItem;
use crate::protocol::TaskCompleteEvent;
use crate::protocol::TurnAbortReason;
use crate::protocol::TurnAbortedEvent;
use crate::state::ActiveTurn;
use crate::state::RunningTask;
use crate::state::TaskKind;
use codex_protocol::user_input::UserInput;
pub(crate) use compact::CompactTask;
pub(crate) use regular::RegularTask;
@@ -45,6 +46,14 @@ impl SessionTaskContext {
pub(crate) fn clone_session(&self) -> Arc<Session> {
Arc::clone(&self.session)
}
pub(crate) fn auth_manager(&self) -> Arc<AuthManager> {
Arc::clone(&self.session.services.auth_manager)
}
pub(crate) async fn base_config(&self) -> Arc<Config> {
self.session.base_config().await
}
}
#[async_trait]
@@ -55,13 +64,12 @@ pub(crate) trait SessionTask: Send + Sync + 'static {
self: Arc<Self>,
session: Arc<SessionTaskContext>,
ctx: Arc<TurnContext>,
sub_id: String,
input: Vec<InputItem>,
input: Vec<UserInput>,
cancellation_token: CancellationToken,
) -> Option<String>;
async fn abort(&self, session: Arc<SessionTaskContext>, sub_id: &str) {
let _ = (session, sub_id);
async fn abort(&self, session: Arc<SessionTaskContext>, ctx: Arc<TurnContext>) {
let _ = (session, ctx);
}
}
@@ -69,8 +77,7 @@ impl Session {
pub async fn spawn_task<T: SessionTask>(
self: &Arc<Self>,
turn_context: Arc<TurnContext>,
sub_id: String,
input: Vec<InputItem>,
input: Vec<UserInput>,
task: T,
) {
self.abort_all_tasks(TurnAbortReason::Replaced).await;
@@ -86,14 +93,13 @@ impl Session {
let session_ctx = Arc::new(SessionTaskContext::new(Arc::clone(self)));
let ctx = Arc::clone(&turn_context);
let task_for_run = Arc::clone(&task);
let sub_clone = sub_id.clone();
let task_cancellation_token = cancellation_token.child_token();
tokio::spawn(async move {
let ctx_for_finish = Arc::clone(&ctx);
let last_agent_message = task_for_run
.run(
Arc::clone(&session_ctx),
ctx,
sub_clone.clone(),
input,
task_cancellation_token.child_token(),
)
@@ -102,7 +108,8 @@ impl Session {
if !task_cancellation_token.is_cancelled() {
// Emit completion uniformly from spawn site so all tasks share the same lifecycle.
let sess = session_ctx.clone_session();
sess.on_task_finished(sub_clone, last_agent_message).await;
sess.on_task_finished(ctx_for_finish, last_agent_message)
.await;
}
done_clone.notify_waiters();
})
@@ -114,60 +121,54 @@ impl Session {
kind: task_kind,
task,
cancellation_token,
turn_context: Arc::clone(&turn_context),
};
self.register_new_active_task(sub_id, running_task).await;
self.register_new_active_task(running_task).await;
}
pub async fn abort_all_tasks(self: &Arc<Self>, reason: TurnAbortReason) {
for (sub_id, task) in self.take_all_running_tasks().await {
self.handle_task_abort(sub_id, task, reason.clone()).await;
for task in self.take_all_running_tasks().await {
self.handle_task_abort(task, reason.clone()).await;
}
}
pub async fn on_task_finished(
self: &Arc<Self>,
sub_id: String,
turn_context: Arc<TurnContext>,
last_agent_message: Option<String>,
) {
let mut active = self.active_turn.lock().await;
if let Some(at) = active.as_mut()
&& at.remove_task(&sub_id)
&& at.remove_task(&turn_context.sub_id)
{
*active = None;
}
drop(active);
let event = Event {
id: sub_id,
msg: EventMsg::TaskComplete(TaskCompleteEvent { last_agent_message }),
};
self.send_event(event).await;
let event = EventMsg::TaskComplete(TaskCompleteEvent { last_agent_message });
self.send_event(turn_context.as_ref(), event).await;
}
async fn register_new_active_task(&self, sub_id: String, task: RunningTask) {
async fn register_new_active_task(&self, task: RunningTask) {
let mut active = self.active_turn.lock().await;
let mut turn = ActiveTurn::default();
turn.add_task(sub_id, task);
turn.add_task(task);
*active = Some(turn);
}
async fn take_all_running_tasks(&self) -> Vec<(String, RunningTask)> {
async fn take_all_running_tasks(&self) -> Vec<RunningTask> {
let mut active = self.active_turn.lock().await;
match active.take() {
Some(mut at) => {
at.clear_pending().await;
let tasks = at.drain_tasks();
tasks.into_iter().collect()
at.drain_tasks()
}
None => Vec::new(),
}
}
async fn handle_task_abort(
self: &Arc<Self>,
sub_id: String,
task: RunningTask,
reason: TurnAbortReason,
) {
async fn handle_task_abort(self: &Arc<Self>, task: RunningTask, reason: TurnAbortReason) {
let sub_id = task.turn_context.sub_id.clone();
if task.cancellation_token.is_cancelled() {
return;
}
@@ -187,13 +188,12 @@ impl Session {
task.handle.abort();
let session_ctx = Arc::new(SessionTaskContext::new(Arc::clone(self)));
session_task.abort(session_ctx, &sub_id).await;
session_task
.abort(session_ctx, Arc::clone(&task.turn_context))
.await;
let event = Event {
id: sub_id.clone(),
msg: EventMsg::TurnAborted(TurnAbortedEvent { reason }),
};
self.send_event(event).await;
let event = EventMsg::TurnAborted(TurnAbortedEvent { reason });
self.send_event(task.turn_context.as_ref(), event).await;
}
}

View File

@@ -5,8 +5,8 @@ use tokio_util::sync::CancellationToken;
use crate::codex::TurnContext;
use crate::codex::run_task;
use crate::protocol::InputItem;
use crate::state::TaskKind;
use codex_protocol::user_input::UserInput;
use super::SessionTask;
use super::SessionTaskContext;
@@ -24,19 +24,10 @@ impl SessionTask for RegularTask {
self: Arc<Self>,
session: Arc<SessionTaskContext>,
ctx: Arc<TurnContext>,
sub_id: String,
input: Vec<InputItem>,
input: Vec<UserInput>,
cancellation_token: CancellationToken,
) -> Option<String> {
let sess = session.clone_session();
run_task(
sess,
ctx,
sub_id,
input,
TaskKind::Regular,
cancellation_token,
)
.await
run_task(sess, ctx, input, TaskKind::Regular, cancellation_token).await
}
}

View File

@@ -1,13 +1,20 @@
use std::sync::Arc;
use async_trait::async_trait;
use codex_protocol::models::ContentItem;
use codex_protocol::models::ResponseItem;
use codex_protocol::protocol::EventMsg;
use codex_protocol::protocol::ReviewOutputEvent;
use codex_protocol::protocol::TaskCompleteEvent;
use tokio_util::sync::CancellationToken;
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::codex::exit_review_mode;
use crate::codex::run_task;
use crate::protocol::InputItem;
use crate::codex_delegate::run_codex_conversation;
// use crate::config::Config; // no longer needed directly; use session.base_config()
use crate::review_format::format_review_findings_block;
use crate::state::TaskKind;
use codex_protocol::user_input::UserInput;
use super::SessionTask;
use super::SessionTaskContext;
@@ -25,23 +32,111 @@ impl SessionTask for ReviewTask {
self: Arc<Self>,
session: Arc<SessionTaskContext>,
ctx: Arc<TurnContext>,
sub_id: String,
input: Vec<InputItem>,
input: Vec<UserInput>,
cancellation_token: CancellationToken,
) -> Option<String> {
let sess = session.clone_session();
run_task(
sess,
ctx,
sub_id,
input,
TaskKind::Review,
cancellation_token,
)
.await
// let sess = session.clone_session();
// run_task(sess, ctx, input, TaskKind::Review, cancellation_token).await
let config = session.base_config().await.as_ref().clone();
let receiver =
match run_codex_conversation(config, session.auth_manager(), input, cancellation_token)
.await
{
Ok(r) => r,
Err(_) => return None,
};
while let Ok(event) = receiver.recv().await {
session
.clone_session()
.send_event(ctx.as_ref(), event.clone())
.await;
if let EventMsg::TaskComplete(TaskCompleteEvent { last_agent_message }) = event {
exit_review_mode(
session.clone_session(),
last_agent_message.as_deref().map(parse_review_output_event),
)
.await;
}
}
Some("".to_string())
}
async fn abort(&self, session: Arc<SessionTaskContext>, sub_id: &str) {
exit_review_mode(session.clone_session(), sub_id.to_string(), None).await;
async fn abort(&self, session: Arc<SessionTaskContext>, _ctx: Arc<TurnContext>) {
exit_review_mode(session.clone_session(), None).await;
}
}
/// Emits an ExitedReviewMode Event with optional ReviewOutput,
/// and records a developer message with the review output.
pub(crate) async fn exit_review_mode(
session: Arc<Session>,
review_output: Option<ReviewOutputEvent>,
) {
// ExitedReviewMode event can be emitted by the caller if needed.
let mut user_message = String::new();
if let Some(out) = review_output {
let mut findings_str = String::new();
let text = out.overall_explanation.trim();
if !text.is_empty() {
findings_str.push_str(text);
}
if !out.findings.is_empty() {
let block = format_review_findings_block(&out.findings, None);
findings_str.push_str(&format!("\n{block}"));
}
user_message.push_str(&format!(
r#"<user_action>
<context>User initiated a review task. Here's the full review output from reviewer model. User may select one or more comments to resolve.</context>
<action>review</action>
<results>
{findings_str}
</results>
</user_action>
"#));
} else {
user_message.push_str(r#"<user_action>
<context>User initiated a review task, but was interrupted. If user asks about this, tell them to re-initiate a review with `/review` and wait for it to complete.</context>
<action>review</action>
<results>
None.
</results>
</user_action>
"#);
}
session
.record_conversation_items(&[ResponseItem::Message {
id: None,
role: "user".to_string(),
content: vec![ContentItem::InputText { text: user_message }],
}])
.await;
}
/// Parse the review output; when not valid JSON, build a structured
/// fallback that carries the plain text as the overall explanation.
///
/// Returns: a ReviewOutputEvent parsed from JSON or a fallback populated from text.
fn parse_review_output_event(text: &str) -> ReviewOutputEvent {
// Try direct parse first
if let Ok(ev) = serde_json::from_str::<ReviewOutputEvent>(text) {
return ev;
}
// If wrapped in markdown fences or extra prose, attempt to extract the first JSON object
if let (Some(start), Some(end)) = (text.find('{'), text.rfind('}'))
&& start < end
&& let Some(slice) = text.get(start..=end)
&& let Ok(ev) = serde_json::from_str::<ReviewOutputEvent>(slice)
{
return ev;
}
// Not JSON return a structured ReviewOutputEvent that carries
// the plain text as the overall explanation.
ReviewOutputEvent {
overall_explanation: text.to_string(),
..Default::default()
}
}

View File

@@ -24,7 +24,6 @@ pub struct ToolInvocation {
pub session: Arc<Session>,
pub turn: Arc<TurnContext>,
pub tracker: SharedTurnDiffTracker,
pub sub_id: String,
pub call_id: String,
pub tool_name: String,
pub payload: ToolPayload,
@@ -234,7 +233,7 @@ mod tests {
#[derive(Clone, Debug)]
#[allow(dead_code)]
pub(crate) struct ExecCommandContext {
pub(crate) sub_id: String,
pub(crate) turn: Arc<TurnContext>,
pub(crate) call_id: String,
pub(crate) command_for_display: Vec<String>,
pub(crate) cwd: PathBuf,

View File

@@ -1,7 +1,7 @@
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::exec::ExecToolCallOutput;
use crate::parse_command::parse_command;
use crate::protocol::Event;
use crate::protocol::EventMsg;
use crate::protocol::ExecCommandBeginEvent;
use crate::protocol::ExecCommandEndEvent;
@@ -11,6 +11,7 @@ use crate::protocol::PatchApplyEndEvent;
use crate::protocol::TurnDiffEvent;
use crate::tools::context::SharedTurnDiffTracker;
use std::collections::HashMap;
use std::path::Path;
use std::path::PathBuf;
use std::time::Duration;
@@ -20,7 +21,7 @@ use super::format_exec_output_str;
#[derive(Clone, Copy)]
pub(crate) struct ToolEventCtx<'a> {
pub session: &'a Session,
pub sub_id: &'a str,
pub turn: &'a TurnContext,
pub call_id: &'a str,
pub turn_diff_tracker: Option<&'a SharedTurnDiffTracker>,
}
@@ -28,13 +29,13 @@ pub(crate) struct ToolEventCtx<'a> {
impl<'a> ToolEventCtx<'a> {
pub fn new(
session: &'a Session,
sub_id: &'a str,
turn: &'a TurnContext,
call_id: &'a str,
turn_diff_tracker: Option<&'a SharedTurnDiffTracker>,
) -> Self {
Self {
session,
sub_id,
turn,
call_id,
turn_diff_tracker,
}
@@ -51,6 +52,20 @@ pub(crate) enum ToolEventFailure {
Output(ExecToolCallOutput),
Message(String),
}
pub(crate) async fn emit_exec_command_begin(ctx: ToolEventCtx<'_>, command: &[String], cwd: &Path) {
ctx.session
.send_event(
ctx.turn,
EventMsg::ExecCommandBegin(ExecCommandBeginEvent {
call_id: ctx.call_id.to_string(),
command: command.to_vec(),
cwd: cwd.to_path_buf(),
parsed_cmd: parse_command(command),
}),
)
.await;
}
// Concrete, allocation-free emitter: avoid trait objects and boxed futures.
pub(crate) enum ToolEmitter {
Shell {
@@ -61,6 +76,13 @@ pub(crate) enum ToolEmitter {
changes: HashMap<PathBuf, FileChange>,
auto_approved: bool,
},
UnifiedExec {
command: String,
cwd: PathBuf,
// True for `exec_command` and false for `write_stdin`.
#[allow(dead_code)]
is_startup_command: bool,
},
}
impl ToolEmitter {
@@ -75,20 +97,18 @@ impl ToolEmitter {
}
}
pub fn unified_exec(command: String, cwd: PathBuf, is_startup_command: bool) -> Self {
Self::UnifiedExec {
command,
cwd,
is_startup_command,
}
}
pub async fn emit(&self, ctx: ToolEventCtx<'_>, stage: ToolEventStage) {
match (self, stage) {
(Self::Shell { command, cwd }, ToolEventStage::Begin) => {
ctx.session
.send_event(Event {
id: ctx.sub_id.to_string(),
msg: EventMsg::ExecCommandBegin(ExecCommandBeginEvent {
call_id: ctx.call_id.to_string(),
command: command.clone(),
cwd: cwd.clone(),
parsed_cmd: parse_command(command),
}),
})
.await;
emit_exec_command_begin(ctx, command, cwd.as_path()).await;
}
(Self::Shell { .. }, ToolEventStage::Success(output)) => {
emit_exec_end(
@@ -139,14 +159,14 @@ impl ToolEmitter {
guard.on_patch_begin(changes);
}
ctx.session
.send_event(Event {
id: ctx.sub_id.to_string(),
msg: EventMsg::PatchApplyBegin(PatchApplyBeginEvent {
.send_event(
ctx.turn,
EventMsg::PatchApplyBegin(PatchApplyBeginEvent {
call_id: ctx.call_id.to_string(),
auto_approved: *auto_approved,
changes: changes.clone(),
}),
})
)
.await;
}
(Self::ApplyPatch { .. }, ToolEventStage::Success(output)) => {
@@ -176,6 +196,10 @@ impl ToolEmitter {
) => {
emit_patch_end(ctx, String::new(), (*message).to_string(), false).await;
}
(Self::UnifiedExec { command, cwd, .. }, _) => {
// TODO(jif) add end and failures.
emit_exec_command_begin(ctx, &[command.to_string()], cwd.as_path()).await;
}
}
}
}
@@ -190,9 +214,9 @@ async fn emit_exec_end(
formatted_output: String,
) {
ctx.session
.send_event(Event {
id: ctx.sub_id.to_string(),
msg: EventMsg::ExecCommandEnd(ExecCommandEndEvent {
.send_event(
ctx.turn,
EventMsg::ExecCommandEnd(ExecCommandEndEvent {
call_id: ctx.call_id.to_string(),
stdout,
stderr,
@@ -201,21 +225,21 @@ async fn emit_exec_end(
duration,
formatted_output,
}),
})
)
.await;
}
async fn emit_patch_end(ctx: ToolEventCtx<'_>, stdout: String, stderr: String, success: bool) {
ctx.session
.send_event(Event {
id: ctx.sub_id.to_string(),
msg: EventMsg::PatchApplyEnd(PatchApplyEndEvent {
.send_event(
ctx.turn,
EventMsg::PatchApplyEnd(PatchApplyEndEvent {
call_id: ctx.call_id.to_string(),
stdout,
stderr,
success,
}),
})
)
.await;
if let Some(tracker) = ctx.turn_diff_tracker {
@@ -225,10 +249,7 @@ async fn emit_patch_end(ctx: ToolEventCtx<'_>, stdout: String, stderr: String, s
};
if let Ok(Some(unified_diff)) = unified_diff {
ctx.session
.send_event(Event {
id: ctx.sub_id.to_string(),
msg: EventMsg::TurnDiff(TurnDiffEvent { unified_diff }),
})
.send_event(ctx.turn, EventMsg::TurnDiff(TurnDiffEvent { unified_diff }))
.await;
}
}

View File

@@ -42,7 +42,6 @@ impl ToolHandler for ApplyPatchHandler {
session,
turn,
tracker,
sub_id,
call_id,
tool_name,
payload,
@@ -81,7 +80,6 @@ impl ToolHandler for ApplyPatchHandler {
Arc::clone(&session),
Arc::clone(&turn),
Arc::clone(&tracker),
sub_id.clone(),
call_id.clone(),
)
.await?;

View File

@@ -19,7 +19,7 @@ impl ToolHandler for McpHandler {
async fn handle(&self, invocation: ToolInvocation) -> Result<ToolOutput, FunctionCallError> {
let ToolInvocation {
session,
sub_id,
turn,
call_id,
payload,
..
@@ -43,7 +43,7 @@ impl ToolHandler for McpHandler {
let response = handle_mcp_tool_call(
session.as_ref(),
&sub_id,
turn.as_ref(),
call_id.clone(),
server,
tool,

View File

@@ -21,8 +21,8 @@ use serde::de::DeserializeOwned;
use serde_json::Value;
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::function_tool::FunctionCallError;
use crate::protocol::Event;
use crate::protocol::EventMsg;
use crate::protocol::McpInvocation;
use crate::protocol::McpToolCallBeginEvent;
@@ -189,7 +189,7 @@ impl ToolHandler for McpResourceHandler {
async fn handle(&self, invocation: ToolInvocation) -> Result<ToolOutput, FunctionCallError> {
let ToolInvocation {
session,
sub_id,
turn,
call_id,
tool_name,
payload,
@@ -211,7 +211,7 @@ impl ToolHandler for McpResourceHandler {
"list_mcp_resources" => {
handle_list_resources(
Arc::clone(&session),
sub_id.clone(),
Arc::clone(&turn),
call_id.clone(),
arguments_value.clone(),
)
@@ -220,14 +220,20 @@ impl ToolHandler for McpResourceHandler {
"list_mcp_resource_templates" => {
handle_list_resource_templates(
Arc::clone(&session),
sub_id.clone(),
Arc::clone(&turn),
call_id.clone(),
arguments_value.clone(),
)
.await
}
"read_mcp_resource" => {
handle_read_resource(Arc::clone(&session), sub_id, call_id, arguments_value).await
handle_read_resource(
Arc::clone(&session),
Arc::clone(&turn),
call_id,
arguments_value,
)
.await
}
other => Err(FunctionCallError::RespondToModel(format!(
"unsupported MCP resource tool: {other}"
@@ -238,7 +244,7 @@ impl ToolHandler for McpResourceHandler {
async fn handle_list_resources(
session: Arc<Session>,
sub_id: String,
turn: Arc<TurnContext>,
call_id: String,
arguments: Option<Value>,
) -> Result<ToolOutput, FunctionCallError> {
@@ -253,7 +259,7 @@ async fn handle_list_resources(
arguments: arguments.clone(),
};
emit_tool_call_begin(&session, &sub_id, &call_id, invocation.clone()).await;
emit_tool_call_begin(&session, turn.as_ref(), &call_id, invocation.clone()).await;
let start = Instant::now();
let payload_result: Result<ListResourcesPayload, FunctionCallError> = async {
@@ -297,7 +303,7 @@ async fn handle_list_resources(
let duration = start.elapsed();
emit_tool_call_end(
&session,
&sub_id,
turn.as_ref(),
&call_id,
invocation,
duration,
@@ -311,7 +317,7 @@ async fn handle_list_resources(
let message = err.to_string();
emit_tool_call_end(
&session,
&sub_id,
turn.as_ref(),
&call_id,
invocation,
duration,
@@ -326,7 +332,7 @@ async fn handle_list_resources(
let message = err.to_string();
emit_tool_call_end(
&session,
&sub_id,
turn.as_ref(),
&call_id,
invocation,
duration,
@@ -340,7 +346,7 @@ async fn handle_list_resources(
async fn handle_list_resource_templates(
session: Arc<Session>,
sub_id: String,
turn: Arc<TurnContext>,
call_id: String,
arguments: Option<Value>,
) -> Result<ToolOutput, FunctionCallError> {
@@ -355,7 +361,7 @@ async fn handle_list_resource_templates(
arguments: arguments.clone(),
};
emit_tool_call_begin(&session, &sub_id, &call_id, invocation.clone()).await;
emit_tool_call_begin(&session, turn.as_ref(), &call_id, invocation.clone()).await;
let start = Instant::now();
let payload_result: Result<ListResourceTemplatesPayload, FunctionCallError> = async {
@@ -403,7 +409,7 @@ async fn handle_list_resource_templates(
let duration = start.elapsed();
emit_tool_call_end(
&session,
&sub_id,
turn.as_ref(),
&call_id,
invocation,
duration,
@@ -417,7 +423,7 @@ async fn handle_list_resource_templates(
let message = err.to_string();
emit_tool_call_end(
&session,
&sub_id,
turn.as_ref(),
&call_id,
invocation,
duration,
@@ -432,7 +438,7 @@ async fn handle_list_resource_templates(
let message = err.to_string();
emit_tool_call_end(
&session,
&sub_id,
turn.as_ref(),
&call_id,
invocation,
duration,
@@ -446,7 +452,7 @@ async fn handle_list_resource_templates(
async fn handle_read_resource(
session: Arc<Session>,
sub_id: String,
turn: Arc<TurnContext>,
call_id: String,
arguments: Option<Value>,
) -> Result<ToolOutput, FunctionCallError> {
@@ -461,7 +467,7 @@ async fn handle_read_resource(
arguments: arguments.clone(),
};
emit_tool_call_begin(&session, &sub_id, &call_id, invocation.clone()).await;
emit_tool_call_begin(&session, turn.as_ref(), &call_id, invocation.clone()).await;
let start = Instant::now();
let payload_result: Result<ReadResourcePayload, FunctionCallError> = async {
@@ -489,7 +495,7 @@ async fn handle_read_resource(
let duration = start.elapsed();
emit_tool_call_end(
&session,
&sub_id,
turn.as_ref(),
&call_id,
invocation,
duration,
@@ -503,7 +509,7 @@ async fn handle_read_resource(
let message = err.to_string();
emit_tool_call_end(
&session,
&sub_id,
turn.as_ref(),
&call_id,
invocation,
duration,
@@ -518,7 +524,7 @@ async fn handle_read_resource(
let message = err.to_string();
emit_tool_call_end(
&session,
&sub_id,
turn.as_ref(),
&call_id,
invocation,
duration,
@@ -544,39 +550,39 @@ fn call_tool_result_from_content(content: &str, success: Option<bool>) -> CallTo
async fn emit_tool_call_begin(
session: &Arc<Session>,
sub_id: &str,
turn: &TurnContext,
call_id: &str,
invocation: McpInvocation,
) {
session
.send_event(Event {
id: sub_id.to_string(),
msg: EventMsg::McpToolCallBegin(McpToolCallBeginEvent {
.send_event(
turn,
EventMsg::McpToolCallBegin(McpToolCallBeginEvent {
call_id: call_id.to_string(),
invocation,
}),
})
)
.await;
}
async fn emit_tool_call_end(
session: &Arc<Session>,
sub_id: &str,
turn: &TurnContext,
call_id: &str,
invocation: McpInvocation,
duration: Duration,
result: Result<CallToolResult, String>,
) {
session
.send_event(Event {
id: sub_id.to_string(),
msg: EventMsg::McpToolCallEnd(McpToolCallEndEvent {
.send_event(
turn,
EventMsg::McpToolCallEnd(McpToolCallEndEvent {
call_id: call_id.to_string(),
invocation,
duration,
result,
}),
})
)
.await;
}

View File

@@ -1,6 +1,7 @@
use crate::client_common::tools::ResponsesApiTool;
use crate::client_common::tools::ToolSpec;
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::function_tool::FunctionCallError;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
@@ -10,7 +11,6 @@ use crate::tools::registry::ToolKind;
use crate::tools::spec::JsonSchema;
use async_trait::async_trait;
use codex_protocol::plan_tool::UpdatePlanArgs;
use codex_protocol::protocol::Event;
use codex_protocol::protocol::EventMsg;
use std::collections::BTreeMap;
use std::sync::LazyLock;
@@ -68,7 +68,7 @@ impl ToolHandler for PlanHandler {
async fn handle(&self, invocation: ToolInvocation) -> Result<ToolOutput, FunctionCallError> {
let ToolInvocation {
session,
sub_id,
turn,
call_id,
payload,
..
@@ -84,7 +84,7 @@ impl ToolHandler for PlanHandler {
};
let content =
handle_update_plan(session.as_ref(), arguments, sub_id.clone(), call_id).await?;
handle_update_plan(session.as_ref(), turn.as_ref(), arguments, call_id).await?;
Ok(ToolOutput::Function {
content,
@@ -98,16 +98,13 @@ impl ToolHandler for PlanHandler {
/// than forcing it to come up and document a plan (TBD how that affects performance).
pub(crate) async fn handle_update_plan(
session: &Session,
turn_context: &TurnContext,
arguments: String,
sub_id: String,
_call_id: String,
) -> Result<String, FunctionCallError> {
let args = parse_update_plan_arguments(&arguments)?;
session
.send_event(Event {
id: sub_id.to_string(),
msg: EventMsg::PlanUpdate(args),
})
.send_event(turn_context, EventMsg::PlanUpdate(args))
.await;
Ok("Plan updated".to_string())
}

View File

@@ -47,7 +47,6 @@ impl ToolHandler for ShellHandler {
session,
turn,
tracker,
sub_id,
call_id,
tool_name,
payload,
@@ -68,7 +67,6 @@ impl ToolHandler for ShellHandler {
Arc::clone(&session),
Arc::clone(&turn),
Arc::clone(&tracker),
sub_id.clone(),
call_id.clone(),
)
.await?;
@@ -85,7 +83,6 @@ impl ToolHandler for ShellHandler {
Arc::clone(&session),
Arc::clone(&turn),
Arc::clone(&tracker),
sub_id.clone(),
call_id.clone(),
)
.await?;

View File

@@ -1,35 +1,68 @@
use std::time::Duration;
use async_trait::async_trait;
use serde::Deserialize;
use serde::Serialize;
use crate::function_tool::FunctionCallError;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
use crate::tools::context::ToolPayload;
use crate::tools::events::ToolEmitter;
use crate::tools::events::ToolEventCtx;
use crate::tools::events::ToolEventStage;
use crate::tools::registry::ToolHandler;
use crate::tools::registry::ToolKind;
use crate::unified_exec::UnifiedExecRequest;
use crate::unified_exec::ExecCommandRequest;
use crate::unified_exec::UnifiedExecContext;
use crate::unified_exec::UnifiedExecResponse;
use crate::unified_exec::UnifiedExecSessionManager;
use crate::unified_exec::WriteStdinRequest;
pub struct UnifiedExecHandler;
#[derive(Deserialize)]
struct UnifiedExecArgs {
input: Vec<String>,
#[derive(Debug, Deserialize)]
struct ExecCommandArgs {
cmd: String,
#[serde(default = "default_shell")]
shell: String,
#[serde(default = "default_login")]
login: bool,
#[serde(default)]
session_id: Option<String>,
yield_time_ms: Option<u64>,
#[serde(default)]
timeout_ms: Option<u64>,
max_output_tokens: Option<usize>,
}
#[derive(Debug, Deserialize)]
struct WriteStdinArgs {
session_id: i32,
#[serde(default)]
chars: String,
#[serde(default)]
yield_time_ms: Option<u64>,
#[serde(default)]
max_output_tokens: Option<usize>,
}
fn default_shell() -> String {
"/bin/bash".to_string()
}
fn default_login() -> bool {
true
}
#[async_trait]
impl ToolHandler for UnifiedExecHandler {
fn kind(&self) -> ToolKind {
ToolKind::UnifiedExec
ToolKind::Function
}
fn matches_kind(&self, payload: &ToolPayload) -> bool {
matches!(
payload,
ToolPayload::UnifiedExec { .. } | ToolPayload::Function { .. }
ToolPayload::Function { .. } | ToolPayload::UnifiedExec { .. }
)
}
@@ -37,21 +70,15 @@ impl ToolHandler for UnifiedExecHandler {
let ToolInvocation {
session,
turn,
sub_id,
call_id,
tool_name: _tool_name,
tool_name,
payload,
..
} = invocation;
let args = match payload {
ToolPayload::UnifiedExec { arguments } | ToolPayload::Function { arguments } => {
serde_json::from_str::<UnifiedExecArgs>(&arguments).map_err(|err| {
FunctionCallError::RespondToModel(format!(
"failed to parse function arguments: {err:?}"
))
})?
}
let arguments = match payload {
ToolPayload::Function { arguments } => arguments,
ToolPayload::UnifiedExec { arguments } => arguments,
_ => {
return Err(FunctionCallError::RespondToModel(
"unified_exec handler received unsupported payload".to_string(),
@@ -59,59 +86,69 @@ impl ToolHandler for UnifiedExecHandler {
}
};
let UnifiedExecArgs {
input,
session_id,
timeout_ms,
} = args;
let manager: &UnifiedExecSessionManager = &session.services.unified_exec_manager;
let context = UnifiedExecContext {
session: &session,
turn: turn.as_ref(),
call_id: &call_id,
};
let parsed_session_id = if let Some(session_id) = session_id {
match session_id.parse::<i32>() {
Ok(parsed) => Some(parsed),
Err(output) => {
return Err(FunctionCallError::RespondToModel(format!(
"invalid session_id: {session_id} due to error {output:?}"
)));
}
let response = match tool_name.as_str() {
"exec_command" => {
let args: ExecCommandArgs = serde_json::from_str(&arguments).map_err(|err| {
FunctionCallError::RespondToModel(format!(
"failed to parse exec_command arguments: {err:?}"
))
})?;
let event_ctx =
ToolEventCtx::new(context.session, context.turn, context.call_id, None);
let emitter =
ToolEmitter::unified_exec(args.cmd.clone(), context.turn.cwd.clone(), true);
emitter.emit(event_ctx, ToolEventStage::Begin).await;
manager
.exec_command(
ExecCommandRequest {
command: &args.cmd,
shell: &args.shell,
login: args.login,
yield_time_ms: args.yield_time_ms,
max_output_tokens: args.max_output_tokens,
},
&context,
)
.await
.map_err(|err| {
FunctionCallError::RespondToModel(format!("exec_command failed: {err:?}"))
})?
}
"write_stdin" => {
let args: WriteStdinArgs = serde_json::from_str(&arguments).map_err(|err| {
FunctionCallError::RespondToModel(format!(
"failed to parse write_stdin arguments: {err:?}"
))
})?;
manager
.write_stdin(WriteStdinRequest {
session_id: args.session_id,
input: &args.chars,
yield_time_ms: args.yield_time_ms,
max_output_tokens: args.max_output_tokens,
})
.await
.map_err(|err| {
FunctionCallError::RespondToModel(format!("write_stdin failed: {err:?}"))
})?
}
other => {
return Err(FunctionCallError::RespondToModel(format!(
"unsupported unified exec function {other}"
)));
}
} else {
None
};
let request = UnifiedExecRequest {
input_chunks: &input,
timeout_ms,
};
let value = session
.services
.unified_exec_manager
.handle_request(
request,
crate::unified_exec::UnifiedExecContext {
session: &session,
turn: turn.as_ref(),
sub_id: &sub_id,
call_id: &call_id,
session_id: parsed_session_id,
},
)
.await
.map_err(|err| {
FunctionCallError::RespondToModel(format!("unified exec failed: {err:?}"))
})?;
#[derive(serde::Serialize)]
struct SerializedUnifiedExecResult {
session_id: Option<String>,
output: String,
}
let content = serde_json::to_string(&SerializedUnifiedExecResult {
session_id: value.session_id.map(|id| id.to_string()),
output: value.output,
})
.map_err(|err| {
let content = serialize_response(&response).map_err(|err| {
FunctionCallError::RespondToModel(format!(
"failed to serialize unified exec output: {err:?}"
))
@@ -123,3 +160,33 @@ impl ToolHandler for UnifiedExecHandler {
})
}
}
#[derive(Serialize)]
struct SerializedUnifiedExecResponse<'a> {
chunk_id: &'a str,
wall_time_seconds: f64,
output: &'a str,
#[serde(skip_serializing_if = "Option::is_none")]
session_id: Option<i32>,
#[serde(skip_serializing_if = "Option::is_none")]
exit_code: Option<i32>,
#[serde(skip_serializing_if = "Option::is_none")]
original_token_count: Option<usize>,
}
fn serialize_response(response: &UnifiedExecResponse) -> Result<String, serde_json::Error> {
let payload = SerializedUnifiedExecResponse {
chunk_id: &response.chunk_id,
wall_time_seconds: duration_to_seconds(response.wall_time),
output: &response.output,
session_id: response.session_id,
exit_code: response.exit_code,
original_token_count: response.original_token_count,
};
serde_json::to_string(&payload)
}
fn duration_to_seconds(duration: Duration) -> f64 {
duration.as_secs_f64()
}

View File

@@ -3,15 +3,14 @@ use serde::Deserialize;
use tokio::fs;
use crate::function_tool::FunctionCallError;
use crate::protocol::Event;
use crate::protocol::EventMsg;
use crate::protocol::InputItem;
use crate::protocol::ViewImageToolCallEvent;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
use crate::tools::context::ToolPayload;
use crate::tools::registry::ToolHandler;
use crate::tools::registry::ToolKind;
use codex_protocol::user_input::UserInput;
pub struct ViewImageHandler;
@@ -31,7 +30,6 @@ impl ToolHandler for ViewImageHandler {
session,
turn,
payload,
sub_id,
call_id,
..
} = invocation;
@@ -67,7 +65,7 @@ impl ToolHandler for ViewImageHandler {
let event_path = abs_path.clone();
session
.inject_input(vec![InputItem::LocalImage { path: abs_path }])
.inject_input(vec![UserInput::LocalImage { path: abs_path }])
.await
.map_err(|_| {
FunctionCallError::RespondToModel(
@@ -76,13 +74,13 @@ impl ToolHandler for ViewImageHandler {
})?;
session
.send_event(Event {
id: sub_id.to_string(),
msg: EventMsg::ViewImageToolCall(ViewImageToolCallEvent {
.send_event(
turn.as_ref(),
EventMsg::ViewImageToolCall(ViewImageToolCallEvent {
call_id,
path: event_path,
}),
})
)
.await;
Ok(ToolOutput::Function {

View File

@@ -61,7 +61,6 @@ pub(crate) async fn handle_container_exec_with_params(
sess: Arc<Session>,
turn_context: Arc<TurnContext>,
turn_diff_tracker: SharedTurnDiffTracker,
sub_id: String,
call_id: String,
) -> Result<String, FunctionCallError> {
let _otel_event_manager = turn_context.client.get_otel_event_manager();
@@ -78,14 +77,8 @@ pub(crate) async fn handle_container_exec_with_params(
// check if this was a patch, and apply it if so
let apply_patch_exec = match maybe_parse_apply_patch_verified(&params.command, &params.cwd) {
MaybeApplyPatchVerified::Body(changes) => {
match apply_patch::apply_patch(
sess.as_ref(),
turn_context.as_ref(),
&sub_id,
&call_id,
changes,
)
.await
match apply_patch::apply_patch(sess.as_ref(), turn_context.as_ref(), &call_id, changes)
.await
{
InternalApplyPatchInvocation::Output(item) => return item,
InternalApplyPatchInvocation::DelegateToExec(apply_patch_exec) => {
@@ -122,7 +115,7 @@ pub(crate) async fn handle_container_exec_with_params(
),
};
let event_ctx = ToolEventCtx::new(sess.as_ref(), &sub_id, &call_id, diff_opt);
let event_ctx = ToolEventCtx::new(sess.as_ref(), turn_context.as_ref(), &call_id, diff_opt);
event_emitter.emit(event_ctx, ToolEventStage::Begin).await;
// Build runtime contexts only when needed (shell/apply_patch below).
@@ -141,7 +134,7 @@ pub(crate) async fn handle_container_exec_with_params(
let mut runtime = ApplyPatchRuntime::new();
let tool_ctx = ToolCtx {
session: sess.as_ref(),
sub_id: sub_id.clone(),
turn: turn_context.as_ref(),
call_id: call_id.clone(),
tool_name: tool_name.to_string(),
};
@@ -172,7 +165,7 @@ pub(crate) async fn handle_container_exec_with_params(
let mut runtime = ShellRuntime::new();
let tool_ctx = ToolCtx {
session: sess.as_ref(),
sub_id: sub_id.clone(),
turn: turn_context.as_ref(),
call_id: call_id.clone(),
tool_name: tool_name.to_string(),
};
@@ -330,33 +323,37 @@ fn truncate_formatted_exec_output(content: &str, total_lines: usize) -> String {
.map(|segment| segment.len())
.sum::<usize>()
};
let marker = format!("\n[... omitted {omitted} of {total_lines} lines ...]\n\n");
// Byte budgets for head/tail around the marker
let mut head_budget = MODEL_FORMAT_HEAD_BYTES.min(MODEL_FORMAT_MAX_BYTES);
let tail_budget = MODEL_FORMAT_MAX_BYTES.saturating_sub(head_budget + marker.len());
if tail_budget == 0 && marker.len() >= MODEL_FORMAT_MAX_BYTES {
// Degenerate case: marker alone exceeds budget; return a clipped marker
return take_bytes_at_char_boundary(&marker, MODEL_FORMAT_MAX_BYTES).to_string();
}
if tail_budget == 0 {
// Make room for the marker by shrinking head
head_budget = MODEL_FORMAT_MAX_BYTES.saturating_sub(marker.len());
}
let head_slice = &content[..head_slice_end];
let tail_slice = &content[tail_slice_start..];
let truncated_by_bytes = content.len() > MODEL_FORMAT_MAX_BYTES;
let marker = if omitted > 0 {
Some(format!(
"\n[... omitted {omitted} of {total_lines} lines ...]\n\n"
))
} else if truncated_by_bytes {
Some(format!(
"\n[... output truncated to fit {MODEL_FORMAT_MAX_BYTES} bytes ...]\n\n"
))
} else {
None
};
let marker_len = marker.as_ref().map_or(0, String::len);
let base_head_budget = MODEL_FORMAT_HEAD_BYTES.min(MODEL_FORMAT_MAX_BYTES);
let head_budget = base_head_budget.min(MODEL_FORMAT_MAX_BYTES.saturating_sub(marker_len));
let head_part = take_bytes_at_char_boundary(head_slice, head_budget);
let mut result = String::with_capacity(MODEL_FORMAT_MAX_BYTES.min(content.len()));
result.push_str(head_part);
result.push_str(&marker);
if let Some(marker_text) = marker.as_ref() {
result.push_str(marker_text);
}
let remaining = MODEL_FORMAT_MAX_BYTES.saturating_sub(result.len());
if remaining == 0 {
return result;
}
let tail_slice = &content[tail_slice_start..];
let tail_part = take_last_bytes_at_char_boundary(tail_slice, remaining);
result.push_str(tail_part);
@@ -403,6 +400,11 @@ mod tests {
let tail_take = MODEL_FORMAT_TAIL_LINES.min(total_lines.saturating_sub(head_take));
let omitted = total_lines.saturating_sub(head_take + tail_take);
let escaped_line = regex_lite::escape(line);
if omitted == 0 {
return format!(
r"(?s)^Total output lines: {total_lines}\n\n(?P<body>{escaped_line}.*\n\[\.{{3}} output truncated to fit {MODEL_FORMAT_MAX_BYTES} bytes \.{{3}}]\n\n.*)$",
);
}
format!(
r"(?s)^Total output lines: {total_lines}\n\n(?P<body>{escaped_line}.*\n\[\.{{3}} omitted {omitted} of {total_lines} lines \.{{3}}]\n\n.*)$",
)
@@ -449,4 +451,76 @@ mod tests {
other => panic!("unexpected error variant: {other:?}"),
}
}
#[test]
fn truncate_formatted_exec_output_marks_byte_truncation_without_omitted_lines() {
let long_line = "a".repeat(MODEL_FORMAT_MAX_BYTES + 50);
let truncated = format_exec_output(&long_line);
assert_ne!(truncated, long_line);
let marker_line =
format!("[... output truncated to fit {MODEL_FORMAT_MAX_BYTES} bytes ...]");
assert!(
truncated.contains(&marker_line),
"missing byte truncation marker: {truncated}"
);
assert!(
!truncated.contains("omitted"),
"line omission marker should not appear when no lines were dropped: {truncated}"
);
}
#[test]
fn truncate_formatted_exec_output_returns_original_when_within_limits() {
let content = "example output\n".repeat(10);
assert_eq!(format_exec_output(&content), content);
}
#[test]
fn truncate_formatted_exec_output_reports_omitted_lines_and_keeps_head_and_tail() {
let total_lines = MODEL_FORMAT_MAX_LINES + 100;
let content: String = (0..total_lines)
.map(|idx| format!("line-{idx}\n"))
.collect();
let truncated = format_exec_output(&content);
let omitted = total_lines - MODEL_FORMAT_MAX_LINES;
let expected_marker = format!("[... omitted {omitted} of {total_lines} lines ...]");
assert!(
truncated.contains(&expected_marker),
"missing omitted marker: {truncated}"
);
assert!(
truncated.contains("line-0\n"),
"expected head line to remain: {truncated}"
);
let last_line = format!("line-{}\n", total_lines - 1);
assert!(
truncated.contains(&last_line),
"expected tail line to remain: {truncated}"
);
}
#[test]
fn truncate_formatted_exec_output_prefers_line_marker_when_both_limits_exceeded() {
let total_lines = MODEL_FORMAT_MAX_LINES + 42;
let long_line = "x".repeat(256);
let content: String = (0..total_lines)
.map(|idx| format!("line-{idx}-{long_line}\n"))
.collect();
let truncated = format_exec_output(&content);
assert!(
truncated.contains("[... omitted 42 of 298 lines ...]"),
"expected omitted marker when line count exceeds limit: {truncated}"
);
assert!(
!truncated.contains("output truncated to fit"),
"line omission marker should take precedence over byte marker: {truncated}"
);
}
}

View File

@@ -53,7 +53,7 @@ impl ToolOrchestrator {
if needs_initial_approval {
let approval_ctx = ApprovalCtx {
session: tool_ctx.session,
sub_id: &tool_ctx.sub_id,
turn: turn_ctx,
call_id: &tool_ctx.call_id,
retry_reason: None,
};
@@ -110,7 +110,7 @@ impl ToolOrchestrator {
let reason_msg = build_denial_reason_from_output(output.as_ref());
let approval_ctx = ApprovalCtx {
session: tool_ctx.session,
sub_id: &tool_ctx.sub_id,
turn: turn_ctx,
call_id: &tool_ctx.call_id,
retry_reason: Some(reason_msg),
};

View File

@@ -2,6 +2,7 @@ use std::sync::Arc;
use tokio::sync::RwLock;
use tokio_util::either::Either;
use tokio_util::sync::CancellationToken;
use tokio_util::task::AbortOnDropHandle;
use crate::codex::Session;
@@ -9,8 +10,10 @@ use crate::codex::TurnContext;
use crate::error::CodexErr;
use crate::function_tool::FunctionCallError;
use crate::tools::context::SharedTurnDiffTracker;
use crate::tools::context::ToolPayload;
use crate::tools::router::ToolCall;
use crate::tools::router::ToolRouter;
use codex_protocol::models::FunctionCallOutputPayload;
use codex_protocol::models::ResponseInputItem;
pub(crate) struct ToolCallRuntime {
@@ -18,7 +21,6 @@ pub(crate) struct ToolCallRuntime {
session: Arc<Session>,
turn_context: Arc<TurnContext>,
tracker: SharedTurnDiffTracker,
sub_id: String,
parallel_execution: Arc<RwLock<()>>,
}
@@ -28,14 +30,12 @@ impl ToolCallRuntime {
session: Arc<Session>,
turn_context: Arc<TurnContext>,
tracker: SharedTurnDiffTracker,
sub_id: String,
) -> Self {
Self {
router,
session,
turn_context,
tracker,
sub_id,
parallel_execution: Arc::new(RwLock::new(())),
}
}
@@ -43,6 +43,7 @@ impl ToolCallRuntime {
pub(crate) fn handle_tool_call(
&self,
call: ToolCall,
cancellation_token: CancellationToken,
) -> impl std::future::Future<Output = Result<ResponseInputItem, CodexErr>> {
let supports_parallel = self.router.tool_supports_parallel(&call.tool_name);
@@ -50,20 +51,25 @@ impl ToolCallRuntime {
let session = Arc::clone(&self.session);
let turn = Arc::clone(&self.turn_context);
let tracker = Arc::clone(&self.tracker);
let sub_id = self.sub_id.clone();
let lock = Arc::clone(&self.parallel_execution);
let aborted_response = Self::aborted_response(&call);
let handle: AbortOnDropHandle<Result<ResponseInputItem, FunctionCallError>> =
AbortOnDropHandle::new(tokio::spawn(async move {
let _guard = if supports_parallel {
Either::Left(lock.read().await)
} else {
Either::Right(lock.write().await)
};
tokio::select! {
_ = cancellation_token.cancelled() => Ok(aborted_response),
res = async {
let _guard = if supports_parallel {
Either::Left(lock.read().await)
} else {
Either::Right(lock.write().await)
};
router
.dispatch_tool_call(session, turn, tracker, sub_id, call)
.await
router
.dispatch_tool_call(session, turn, tracker, call)
.await
} => res,
}
}));
async move {
@@ -78,3 +84,25 @@ impl ToolCallRuntime {
}
}
}
impl ToolCallRuntime {
fn aborted_response(call: &ToolCall) -> ResponseInputItem {
match &call.payload {
ToolPayload::Custom { .. } => ResponseInputItem::CustomToolCallOutput {
call_id: call.call_id.clone(),
output: "aborted".to_string(),
},
ToolPayload::Mcp { .. } => ResponseInputItem::McpToolCallOutput {
call_id: call.call_id.clone(),
result: Err("aborted".to_string()),
},
_ => ResponseInputItem::FunctionCallOutput {
call_id: call.call_id.clone(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: None,
},
},
}
}
}

View File

@@ -15,7 +15,6 @@ use crate::tools::context::ToolPayload;
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum ToolKind {
Function,
UnifiedExec,
Mcp,
}
@@ -27,7 +26,6 @@ pub trait ToolHandler: Send + Sync {
matches!(
(self.kind(), payload),
(ToolKind::Function, ToolPayload::Function { .. })
| (ToolKind::UnifiedExec, ToolPayload::UnifiedExec { .. })
| (ToolKind::Mcp, ToolPayload::Mcp { .. })
)
}

View File

@@ -134,7 +134,6 @@ impl ToolRouter {
session: Arc<Session>,
turn: Arc<TurnContext>,
tracker: SharedTurnDiffTracker,
sub_id: String,
call: ToolCall,
) -> Result<ResponseInputItem, FunctionCallError> {
let ToolCall {
@@ -149,7 +148,6 @@ impl ToolRouter {
session,
turn,
tracker,
sub_id,
call_id,
tool_name,
payload,

View File

@@ -68,7 +68,7 @@ impl ApplyPatchRuntime {
fn stdout_stream(ctx: &ToolCtx<'_>) -> Option<crate::exec::StdoutStream> {
Some(crate::exec::StdoutStream {
sub_id: ctx.sub_id.clone(),
sub_id: ctx.turn.sub_id.clone(),
call_id: ctx.call_id.clone(),
tx_event: ctx.session.get_tx_event(),
})
@@ -101,7 +101,7 @@ impl Approvable<ApplyPatchRequest> for ApplyPatchRuntime {
) -> BoxFuture<'a, ReviewDecision> {
let key = self.approval_key(req);
let session = ctx.session;
let sub_id = ctx.sub_id.to_string();
let turn = ctx.turn;
let call_id = ctx.call_id.to_string();
let cwd = req.cwd.clone();
let retry_reason = ctx.retry_reason.clone();
@@ -111,7 +111,7 @@ impl Approvable<ApplyPatchRequest> for ApplyPatchRuntime {
if let Some(reason) = retry_reason {
session
.request_command_approval(
sub_id,
turn,
call_id,
vec!["apply_patch".to_string()],
cwd,

View File

@@ -51,7 +51,7 @@ impl ShellRuntime {
fn stdout_stream(ctx: &ToolCtx<'_>) -> Option<crate::exec::StdoutStream> {
Some(crate::exec::StdoutStream {
sub_id: ctx.sub_id.clone(),
sub_id: ctx.turn.sub_id.clone(),
call_id: ctx.call_id.clone(),
tx_event: ctx.session.get_tx_event(),
})
@@ -91,12 +91,12 @@ impl Approvable<ShellRequest> for ShellRuntime {
.clone()
.or_else(|| req.justification.clone());
let session = ctx.session;
let sub_id = ctx.sub_id.to_string();
let turn = ctx.turn;
let call_id = ctx.call_id.to_string();
Box::pin(async move {
with_cached_approval(&session.services, key, || async move {
session
.request_command_approval(sub_id, call_id, command, cwd, reason)
.request_command_approval(turn, call_id, command, cwd, reason)
.await
})
.await

View File

@@ -80,7 +80,7 @@ impl Approvable<UnifiedExecRequest> for UnifiedExecRuntime<'_> {
) -> BoxFuture<'b, ReviewDecision> {
let key = self.approval_key(req);
let session = ctx.session;
let sub_id = ctx.sub_id.to_string();
let turn = ctx.turn;
let call_id = ctx.call_id.to_string();
let command = req.command.clone();
let cwd = req.cwd.clone();
@@ -88,7 +88,7 @@ impl Approvable<UnifiedExecRequest> for UnifiedExecRuntime<'_> {
Box::pin(async move {
with_cached_approval(&session.services, key, || async move {
session
.request_command_approval(sub_id, call_id, command, cwd, reason)
.request_command_approval(turn, call_id, command, cwd, reason)
.await
})
.await

View File

@@ -5,6 +5,7 @@
//! and helpers (`Sandboxable`, `ToolRuntime`, `SandboxAttempt`, etc.).
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::error::CodexErr;
use crate::protocol::SandboxPolicy;
use crate::sandboxing::CommandSpec;
@@ -77,7 +78,7 @@ where
#[derive(Clone)]
pub(crate) struct ApprovalCtx<'a> {
pub session: &'a Session,
pub sub_id: &'a str,
pub turn: &'a TurnContext,
pub call_id: &'a str,
pub retry_reason: Option<String>,
}
@@ -145,7 +146,7 @@ pub(crate) trait Sandboxable {
pub(crate) struct ToolCtx<'a> {
pub session: &'a Session,
pub sub_id: String,
pub turn: &'a TurnContext,
pub call_id: String,
pub tool_name: String,
}

View File

@@ -25,7 +25,6 @@ pub enum ConfigShellToolType {
#[derive(Debug, Clone)]
pub(crate) struct ToolsConfig {
pub shell_type: ConfigShellToolType,
pub plan_tool: bool,
pub apply_patch_tool_type: Option<ApplyPatchToolType>,
pub web_search_request: bool,
pub include_view_image_tool: bool,
@@ -46,7 +45,6 @@ impl ToolsConfig {
} = params;
let use_streamable_shell_tool = features.enabled(Feature::StreamableShell);
let experimental_unified_exec_tool = features.enabled(Feature::UnifiedExec);
let include_plan_tool = features.enabled(Feature::PlanTool);
let include_apply_patch_tool = features.enabled(Feature::ApplyPatchFreeform);
let include_web_search_request = features.enabled(Feature::WebSearchRequest);
let include_view_image_tool = features.enabled(Feature::ViewImageTool);
@@ -73,7 +71,6 @@ impl ToolsConfig {
Self {
shell_type,
plan_tool: include_plan_tool,
apply_patch_tool_type,
web_search_request: include_web_search_request,
include_view_image_tool,
@@ -139,48 +136,99 @@ impl From<JsonSchema> for AdditionalProperties {
}
}
fn create_unified_exec_tool() -> ToolSpec {
fn create_exec_command_tool() -> ToolSpec {
let mut properties = BTreeMap::new();
properties.insert(
"input".to_string(),
JsonSchema::Array {
items: Box::new(JsonSchema::String { description: None }),
description: Some(
"When no session_id is provided, treat the array as the command and arguments \
to launch. When session_id is set, concatenate the strings (in order) and write \
them to the session's stdin."
.to_string(),
),
},
);
properties.insert(
"session_id".to_string(),
"cmd".to_string(),
JsonSchema::String {
description: Some("Shell command to execute.".to_string()),
},
);
properties.insert(
"shell".to_string(),
JsonSchema::String {
description: Some("Shell binary to launch. Defaults to /bin/bash.".to_string()),
},
);
properties.insert(
"login".to_string(),
JsonSchema::Boolean {
description: Some(
"Identifier for an existing interactive session. If omitted, a new command \
is spawned."
.to_string(),
"Whether to run the shell with -l/-i semantics. Defaults to true.".to_string(),
),
},
);
properties.insert(
"timeout_ms".to_string(),
"yield_time_ms".to_string(),
JsonSchema::Number {
description: Some(
"Maximum time in milliseconds to wait for output after writing the input."
.to_string(),
"How long to wait (in milliseconds) for output before yielding.".to_string(),
),
},
);
properties.insert(
"max_output_tokens".to_string(),
JsonSchema::Number {
description: Some(
"Maximum number of tokens to return. Excess output will be truncated.".to_string(),
),
},
);
ToolSpec::Function(ResponsesApiTool {
name: "unified_exec".to_string(),
name: "exec_command".to_string(),
description:
"Runs a command in a PTY. Provide a session_id to reuse an existing interactive session.".to_string(),
"Runs a command in a PTY, returning output or a session ID for ongoing interaction."
.to_string(),
strict: false,
parameters: JsonSchema::Object {
properties,
required: Some(vec!["input".to_string()]),
required: Some(vec!["cmd".to_string()]),
additional_properties: Some(false.into()),
},
})
}
fn create_write_stdin_tool() -> ToolSpec {
let mut properties = BTreeMap::new();
properties.insert(
"session_id".to_string(),
JsonSchema::Number {
description: Some("Identifier of the running unified exec session.".to_string()),
},
);
properties.insert(
"chars".to_string(),
JsonSchema::String {
description: Some("Bytes to write to stdin (may be empty to poll).".to_string()),
},
);
properties.insert(
"yield_time_ms".to_string(),
JsonSchema::Number {
description: Some(
"How long to wait (in milliseconds) for output before yielding.".to_string(),
),
},
);
properties.insert(
"max_output_tokens".to_string(),
JsonSchema::Number {
description: Some(
"Maximum number of tokens to return. Excess output will be truncated.".to_string(),
),
},
);
ToolSpec::Function(ResponsesApiTool {
name: "write_stdin".to_string(),
description:
"Writes characters to an existing unified exec session and returns recent output."
.to_string(),
strict: false,
parameters: JsonSchema::Object {
properties,
required: Some(vec!["session_id".to_string()]),
additional_properties: Some(false.into()),
},
})
@@ -842,19 +890,20 @@ pub(crate) fn build_specs(
|| matches!(config.shell_type, ConfigShellToolType::Streamable);
if use_unified_exec {
builder.push_spec(create_unified_exec_tool());
builder.register_handler("unified_exec", unified_exec_handler);
} else {
match &config.shell_type {
ConfigShellToolType::Default => {
builder.push_spec(create_shell_tool());
}
ConfigShellToolType::Local => {
builder.push_spec(ToolSpec::LocalShell {});
}
ConfigShellToolType::Streamable => {
// Already handled by use_unified_exec.
}
builder.push_spec(create_exec_command_tool());
builder.push_spec(create_write_stdin_tool());
builder.register_handler("exec_command", unified_exec_handler.clone());
builder.register_handler("write_stdin", unified_exec_handler);
}
match &config.shell_type {
ConfigShellToolType::Default => {
builder.push_spec(create_shell_tool());
}
ConfigShellToolType::Local => {
builder.push_spec(ToolSpec::LocalShell {});
}
ConfigShellToolType::Streamable => {
// Already handled by use_unified_exec.
}
}
@@ -870,10 +919,8 @@ pub(crate) fn build_specs(
builder.register_handler("list_mcp_resource_templates", mcp_resource_handler.clone());
builder.register_handler("read_mcp_resource", mcp_resource_handler);
if config.plan_tool {
builder.push_spec(PLAN_TOOL.clone());
builder.register_handler("update_plan", plan_handler);
}
builder.push_spec(PLAN_TOOL.clone());
builder.register_handler("update_plan", plan_handler);
if let Some(apply_patch_tool_type) = &config.apply_patch_tool_type {
match apply_patch_tool_type {
@@ -972,25 +1019,36 @@ mod tests {
}
}
fn assert_eq_tool_names(tools: &[ConfiguredToolSpec], expected_names: &[&str]) {
let tool_names = tools
.iter()
.map(|tool| tool_name(&tool.spec))
.collect::<Vec<_>>();
assert_eq!(
tool_names.len(),
expected_names.len(),
"tool_name mismatch, {tool_names:?}, {expected_names:?}",
// Avoid order-based assertions; compare via set containment instead.
fn assert_contains_tool_names(tools: &[ConfiguredToolSpec], expected_subset: &[&str]) {
use std::collections::HashSet;
let mut names = HashSet::new();
let mut duplicates = Vec::new();
for name in tools.iter().map(|t| tool_name(&t.spec)) {
if !names.insert(name) {
duplicates.push(name);
}
}
assert!(
duplicates.is_empty(),
"duplicate tool entries detected: {duplicates:?}"
);
for (name, expected_name) in tool_names.iter().zip(expected_names.iter()) {
assert_eq!(
name, expected_name,
"tool_name mismatch, {name:?}, {expected_name:?}"
for expected in expected_subset {
assert!(
names.contains(expected),
"expected tool {expected} to be present; had: {names:?}"
);
}
}
fn shell_tool_name(config: &ToolsConfig) -> Option<&'static str> {
match config.shell_type {
ConfigShellToolType::Default => Some("shell"),
ConfigShellToolType::Local => Some("local_shell"),
ConfigShellToolType::Streamable => None,
}
}
fn find_tool<'a>(
tools: &'a [ConfiguredToolSpec],
expected_name: &str,
@@ -1001,12 +1059,108 @@ mod tests {
.unwrap_or_else(|| panic!("expected tool {expected_name}"))
}
fn strip_descriptions_schema(schema: &mut JsonSchema) {
match schema {
JsonSchema::Boolean { description }
| JsonSchema::String { description }
| JsonSchema::Number { description } => {
*description = None;
}
JsonSchema::Array { items, description } => {
strip_descriptions_schema(items);
*description = None;
}
JsonSchema::Object {
properties,
required: _,
additional_properties,
} => {
for v in properties.values_mut() {
strip_descriptions_schema(v);
}
if let Some(AdditionalProperties::Schema(s)) = additional_properties {
strip_descriptions_schema(s);
}
}
}
}
fn strip_descriptions_tool(spec: &mut ToolSpec) {
match spec {
ToolSpec::Function(ResponsesApiTool { parameters, .. }) => {
strip_descriptions_schema(parameters);
}
ToolSpec::Freeform(_) | ToolSpec::LocalShell {} | ToolSpec::WebSearch {} => {}
}
}
#[test]
fn test_build_specs() {
fn test_full_toolset_specs_for_gpt5_codex() {
let model_family = find_family_for_model("gpt-5-codex")
.expect("gpt-5-codex should be a valid model family");
let mut features = Features::with_defaults();
features.enable(Feature::UnifiedExec);
features.enable(Feature::WebSearchRequest);
features.enable(Feature::ViewImageTool);
let config = ToolsConfig::new(&ToolsConfigParams {
model_family: &model_family,
features: &features,
});
let (tools, _) = build_specs(&config, None).build();
// Build actual map name -> spec
use std::collections::BTreeMap;
use std::collections::HashSet;
let mut actual: BTreeMap<String, ToolSpec> = BTreeMap::new();
let mut duplicate_names = Vec::new();
for t in &tools {
let name = tool_name(&t.spec).to_string();
if actual.insert(name.clone(), t.spec.clone()).is_some() {
duplicate_names.push(name);
}
}
assert!(
duplicate_names.is_empty(),
"duplicate tool entries detected: {duplicate_names:?}"
);
// Build expected from the same helpers used by the builder.
let mut expected: BTreeMap<String, ToolSpec> = BTreeMap::new();
for spec in [
create_exec_command_tool(),
create_write_stdin_tool(),
create_shell_tool(),
create_list_mcp_resources_tool(),
create_list_mcp_resource_templates_tool(),
create_read_mcp_resource_tool(),
PLAN_TOOL.clone(),
create_apply_patch_freeform_tool(),
ToolSpec::WebSearch {},
create_view_image_tool(),
] {
expected.insert(tool_name(&spec).to_string(), spec);
}
// Exact name set match — this is the only test allowed to fail when tools change.
let actual_names: HashSet<_> = actual.keys().cloned().collect();
let expected_names: HashSet<_> = expected.keys().cloned().collect();
assert_eq!(actual_names, expected_names, "tool name set mismatch");
// Compare specs ignoring human-readable descriptions.
for name in expected.keys() {
let mut a = actual.get(name).expect("present").clone();
let mut e = expected.get(name).expect("present").clone();
strip_descriptions_tool(&mut a);
strip_descriptions_tool(&mut e);
assert_eq!(a, e, "spec mismatch for {name}");
}
}
#[test]
fn test_build_specs_contains_expected_basics() {
let model_family = find_family_for_model("codex-mini-latest")
.expect("codex-mini-latest should be a valid model family");
let mut features = Features::with_defaults();
features.enable(Feature::PlanTool);
features.enable(Feature::WebSearchRequest);
features.enable(Feature::UnifiedExec);
let config = ToolsConfig::new(&ToolsConfigParams {
@@ -1014,26 +1168,27 @@ mod tests {
features: &features,
});
let (tools, _) = build_specs(&config, Some(HashMap::new())).build();
assert_eq_tool_names(
&tools,
let tool_names = tools.iter().map(|t| t.spec.name()).collect::<Vec<_>>();
assert_eq!(
&tool_names,
&[
"unified_exec",
"exec_command",
"write_stdin",
"local_shell",
"list_mcp_resources",
"list_mcp_resource_templates",
"read_mcp_resource",
"update_plan",
"web_search",
"view_image",
],
]
);
}
#[test]
fn test_build_specs_default_shell() {
fn test_build_specs_default_shell_present() {
let model_family = find_family_for_model("o3").expect("o3 should be a valid model family");
let mut features = Features::with_defaults();
features.enable(Feature::PlanTool);
features.enable(Feature::WebSearchRequest);
features.enable(Feature::UnifiedExec);
let config = ToolsConfig::new(&ToolsConfigParams {
@@ -1042,18 +1197,12 @@ mod tests {
});
let (tools, _) = build_specs(&config, Some(HashMap::new())).build();
assert_eq_tool_names(
&tools,
&[
"unified_exec",
"list_mcp_resources",
"list_mcp_resource_templates",
"read_mcp_resource",
"update_plan",
"web_search",
"view_image",
],
);
// Only check the shell variant and a couple of core tools.
let mut subset = vec!["exec_command", "write_stdin", "update_plan"];
if let Some(shell_tool) = shell_tool_name(&config) {
subset.push(shell_tool);
}
assert_contains_tool_names(&tools, &subset);
}
#[test]
@@ -1070,7 +1219,8 @@ mod tests {
});
let (tools, _) = build_specs(&config, None).build();
assert!(!find_tool(&tools, "unified_exec").supports_parallel_tool_calls);
assert!(!find_tool(&tools, "exec_command").supports_parallel_tool_calls);
assert!(!find_tool(&tools, "write_stdin").supports_parallel_tool_calls);
assert!(find_tool(&tools, "grep_files").supports_parallel_tool_calls);
assert!(find_tool(&tools, "list_dir").supports_parallel_tool_calls);
assert!(find_tool(&tools, "read_file").supports_parallel_tool_calls);
@@ -1107,7 +1257,7 @@ mod tests {
}
#[test]
fn test_build_specs_mcp_tools() {
fn test_build_specs_mcp_tools_converted() {
let model_family = find_family_for_model("o3").expect("o3 should be a valid model family");
let mut features = Features::with_defaults();
features.enable(Feature::UnifiedExec);
@@ -1155,19 +1305,6 @@ mod tests {
)
.build();
assert_eq_tool_names(
&tools,
&[
"unified_exec",
"list_mcp_resources",
"list_mcp_resource_templates",
"read_mcp_resource",
"web_search",
"view_image",
"test_server/do_something_cool",
],
);
let tool = find_tool(&tools, "test_server/do_something_cool");
assert_eq!(
&tool.spec,
@@ -1273,20 +1410,19 @@ mod tests {
]);
let (tools, _) = build_specs(&config, Some(tools_map)).build();
// Expect unified_exec first, followed by MCP tools sorted by fully-qualified name.
assert_eq_tool_names(
&tools,
&[
"unified_exec",
"list_mcp_resources",
"list_mcp_resource_templates",
"read_mcp_resource",
"view_image",
"test_server/cool",
"test_server/do",
"test_server/something",
],
);
// Only assert that the MCP tools themselves are sorted by fully-qualified name.
let mcp_names: Vec<_> = tools
.iter()
.map(|t| tool_name(&t.spec).to_string())
.filter(|n| n.starts_with("test_server/"))
.collect();
let expected = vec![
"test_server/cool".to_string(),
"test_server/do".to_string(),
"test_server/something".to_string(),
];
assert_eq!(mcp_names, expected);
}
#[test]
@@ -1325,22 +1461,9 @@ mod tests {
)
.build();
assert_eq_tool_names(
&tools,
&[
"unified_exec",
"list_mcp_resources",
"list_mcp_resource_templates",
"read_mcp_resource",
"apply_patch",
"web_search",
"view_image",
"dash/search",
],
);
let tool = find_tool(&tools, "dash/search");
assert_eq!(
tools[7].spec,
tool.spec,
ToolSpec::Function(ResponsesApiTool {
name: "dash/search".to_string(),
parameters: JsonSchema::Object {
@@ -1393,21 +1516,9 @@ mod tests {
)
.build();
assert_eq_tool_names(
&tools,
&[
"unified_exec",
"list_mcp_resources",
"list_mcp_resource_templates",
"read_mcp_resource",
"apply_patch",
"web_search",
"view_image",
"dash/paginate",
],
);
let tool = find_tool(&tools, "dash/paginate");
assert_eq!(
tools[7].spec,
tool.spec,
ToolSpec::Function(ResponsesApiTool {
name: "dash/paginate".to_string(),
parameters: JsonSchema::Object {
@@ -1459,21 +1570,9 @@ mod tests {
)
.build();
assert_eq_tool_names(
&tools,
&[
"unified_exec",
"list_mcp_resources",
"list_mcp_resource_templates",
"read_mcp_resource",
"apply_patch",
"web_search",
"view_image",
"dash/tags",
],
);
let tool = find_tool(&tools, "dash/tags");
assert_eq!(
tools[7].spec,
tool.spec,
ToolSpec::Function(ResponsesApiTool {
name: "dash/tags".to_string(),
parameters: JsonSchema::Object {
@@ -1527,21 +1626,9 @@ mod tests {
)
.build();
assert_eq_tool_names(
&tools,
&[
"unified_exec",
"list_mcp_resources",
"list_mcp_resource_templates",
"read_mcp_resource",
"apply_patch",
"web_search",
"view_image",
"dash/value",
],
);
let tool = find_tool(&tools, "dash/value");
assert_eq!(
tools[7].spec,
tool.spec,
ToolSpec::Function(ResponsesApiTool {
name: "dash/value".to_string(),
parameters: JsonSchema::Object {
@@ -1632,22 +1719,9 @@ mod tests {
)
.build();
assert_eq_tool_names(
&tools,
&[
"unified_exec",
"list_mcp_resources",
"list_mcp_resource_templates",
"read_mcp_resource",
"apply_patch",
"web_search",
"view_image",
"test_server/do_something_cool",
],
);
let tool = find_tool(&tools, "test_server/do_something_cool");
assert_eq!(
tools[7].spec,
tool.spec,
ToolSpec::Function(ResponsesApiTool {
name: "test_server/do_something_cool".to_string(),
parameters: JsonSchema::Object {

View File

@@ -23,7 +23,10 @@
use std::collections::HashMap;
use std::sync::atomic::AtomicI32;
use std::time::Duration;
use rand::Rng;
use rand::rng;
use tokio::sync::Mutex;
use crate::codex::Session;
@@ -36,28 +39,43 @@ mod session_manager;
pub(crate) use errors::UnifiedExecError;
pub(crate) use session::UnifiedExecSession;
const DEFAULT_TIMEOUT_MS: u64 = 1_000;
const MAX_TIMEOUT_MS: u64 = 60_000;
const UNIFIED_EXEC_OUTPUT_MAX_BYTES: usize = 128 * 1024; // 128 KiB
pub(crate) const DEFAULT_YIELD_TIME_MS: u64 = 10_000;
pub(crate) const MIN_YIELD_TIME_MS: u64 = 250;
pub(crate) const MAX_YIELD_TIME_MS: u64 = 30_000;
pub(crate) const DEFAULT_MAX_OUTPUT_TOKENS: usize = 10_000;
pub(crate) const UNIFIED_EXEC_OUTPUT_MAX_BYTES: usize = 1024 * 1024; // 1 MiB
pub(crate) struct UnifiedExecContext<'a> {
pub session: &'a Session,
pub turn: &'a TurnContext,
pub sub_id: &'a str,
pub call_id: &'a str,
pub session_id: Option<i32>,
}
#[derive(Debug)]
pub(crate) struct UnifiedExecRequest<'a> {
pub input_chunks: &'a [String],
pub timeout_ms: Option<u64>,
pub(crate) struct ExecCommandRequest<'a> {
pub command: &'a str,
pub shell: &'a str,
pub login: bool,
pub yield_time_ms: Option<u64>,
pub max_output_tokens: Option<usize>,
}
#[derive(Debug)]
pub(crate) struct WriteStdinRequest<'a> {
pub session_id: i32,
pub input: &'a str,
pub yield_time_ms: Option<u64>,
pub max_output_tokens: Option<usize>,
}
#[derive(Debug, Clone, PartialEq)]
pub(crate) struct UnifiedExecResult {
pub session_id: Option<i32>,
pub(crate) struct UnifiedExecResponse {
pub chunk_id: String,
pub wall_time: Duration,
pub output: String,
pub session_id: Option<i32>,
pub exit_code: Option<i32>,
pub original_token_count: Option<usize>,
}
#[derive(Debug, Default)]
@@ -66,16 +84,66 @@ pub(crate) struct UnifiedExecSessionManager {
sessions: Mutex<HashMap<i32, session::UnifiedExecSession>>,
}
pub(crate) fn clamp_yield_time(yield_time_ms: Option<u64>) -> u64 {
match yield_time_ms {
Some(value) => value.clamp(MIN_YIELD_TIME_MS, MAX_YIELD_TIME_MS),
None => DEFAULT_YIELD_TIME_MS,
}
}
pub(crate) fn resolve_max_tokens(max_tokens: Option<usize>) -> usize {
max_tokens.unwrap_or(DEFAULT_MAX_OUTPUT_TOKENS)
}
pub(crate) fn generate_chunk_id() -> String {
let mut rng = rng();
(0..6)
.map(|_| format!("{:x}", rng.random_range(0..16)))
.collect()
}
pub(crate) fn truncate_output_to_tokens(
output: &str,
max_tokens: usize,
) -> (String, Option<usize>) {
if max_tokens == 0 {
let total_tokens = output.chars().count();
let message = format!("{total_tokens} tokens truncated…");
return (message, Some(total_tokens));
}
let tokens: Vec<char> = output.chars().collect();
let total_tokens = tokens.len();
if total_tokens <= max_tokens {
return (output.to_string(), None);
}
let half = max_tokens / 2;
if half == 0 {
let truncated = total_tokens.saturating_sub(max_tokens);
let message = format!("{truncated} tokens truncated…");
return (message, Some(total_tokens));
}
let truncated = total_tokens.saturating_sub(half * 2);
let mut truncated_output = String::new();
truncated_output.extend(&tokens[..half]);
truncated_output.push_str(&format!("{truncated} tokens truncated…"));
truncated_output.extend(&tokens[total_tokens - half..]);
(truncated_output, Some(total_tokens))
}
#[cfg(test)]
#[cfg(unix)]
mod tests {
use super::*;
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::codex::make_session_and_context;
use crate::protocol::AskForApproval;
use crate::protocol::SandboxPolicy;
use crate::unified_exec::ExecCommandRequest;
use crate::unified_exec::WriteStdinRequest;
use core_test_support::skip_if_sandbox;
use std::sync::Arc;
use tokio::time::Duration;
@@ -89,35 +157,52 @@ mod tests {
(Arc::new(session), Arc::new(turn))
}
async fn run_unified_exec_request(
async fn exec_command(
session: &Arc<Session>,
turn: &Arc<TurnContext>,
session_id: Option<i32>,
input: Vec<String>,
timeout_ms: Option<u64>,
) -> Result<UnifiedExecResult, UnifiedExecError> {
let request_input = input;
let request = UnifiedExecRequest {
input_chunks: &request_input,
timeout_ms,
cmd: &str,
yield_time_ms: Option<u64>,
) -> Result<UnifiedExecResponse, UnifiedExecError> {
let context = UnifiedExecContext {
session,
turn: turn.as_ref(),
call_id: "call",
};
session
.services
.unified_exec_manager
.handle_request(
request,
UnifiedExecContext {
session,
turn: turn.as_ref(),
sub_id: "sub",
call_id: "call",
session_id,
.exec_command(
ExecCommandRequest {
command: cmd,
shell: "/bin/bash",
login: true,
yield_time_ms,
max_output_tokens: None,
},
&context,
)
.await
}
async fn write_stdin(
session: &Arc<Session>,
session_id: i32,
input: &str,
yield_time_ms: Option<u64>,
) -> Result<UnifiedExecResponse, UnifiedExecError> {
session
.services
.unified_exec_manager
.write_stdin(WriteStdinRequest {
session_id,
input,
yield_time_ms,
max_output_tokens: None,
})
.await
}
#[test]
fn push_chunk_trims_only_excess_bytes() {
let mut buffer = OutputBufferState::default();
@@ -142,37 +227,28 @@ mod tests {
let (session, turn) = test_session_and_turn();
let open_shell = run_unified_exec_request(
&session,
&turn,
None,
vec!["bash".to_string(), "-i".to_string()],
Some(2_500),
)
.await?;
let open_shell = exec_command(&session, &turn, "bash -i", Some(2_500)).await?;
let session_id = open_shell.session_id.expect("expected session_id");
run_unified_exec_request(
write_stdin(
&session,
&turn,
Some(session_id),
vec![
"export".to_string(),
"CODEX_INTERACTIVE_SHELL_VAR=codex\n".to_string(),
],
session_id,
"export CODEX_INTERACTIVE_SHELL_VAR=codex\n",
Some(2_500),
)
.await?;
let out_2 = run_unified_exec_request(
let out_2 = write_stdin(
&session,
&turn,
Some(session_id),
vec!["echo $CODEX_INTERACTIVE_SHELL_VAR\n".to_string()],
session_id,
"echo $CODEX_INTERACTIVE_SHELL_VAR\n",
Some(2_500),
)
.await?;
assert!(out_2.output.contains("codex"));
assert!(
out_2.output.contains("codex"),
"expected environment variable output"
);
Ok(())
}
@@ -183,47 +259,44 @@ mod tests {
let (session, turn) = test_session_and_turn();
let shell_a = run_unified_exec_request(
&session,
&turn,
None,
vec!["/bin/bash".to_string(), "-i".to_string()],
Some(2_500),
)
.await?;
let shell_a = exec_command(&session, &turn, "bash -i", Some(2_500)).await?;
let session_a = shell_a.session_id.expect("expected session id");
run_unified_exec_request(
write_stdin(
&session,
&turn,
Some(session_a),
vec!["export CODEX_INTERACTIVE_SHELL_VAR=codex\n".to_string()],
session_a,
"export CODEX_INTERACTIVE_SHELL_VAR=codex\n",
Some(2_500),
)
.await?;
let out_2 = run_unified_exec_request(
let out_2 = exec_command(
&session,
&turn,
None,
vec![
"echo".to_string(),
"$CODEX_INTERACTIVE_SHELL_VAR\n".to_string(),
],
"echo $CODEX_INTERACTIVE_SHELL_VAR",
Some(2_500),
)
.await?;
assert!(!out_2.output.contains("codex"));
assert!(
out_2.session_id.is_none(),
"short command should not retain a session"
);
assert!(
!out_2.output.contains("codex"),
"short command should run in a fresh shell"
);
let out_3 = run_unified_exec_request(
let out_3 = write_stdin(
&session,
&turn,
Some(session_a),
vec!["echo $CODEX_INTERACTIVE_SHELL_VAR\n".to_string()],
session_a,
"echo $CODEX_INTERACTIVE_SHELL_VAR\n",
Some(2_500),
)
.await?;
assert!(out_3.output.contains("codex"));
assert!(
out_3.output.contains("codex"),
"session should preserve state"
);
Ok(())
}
@@ -234,45 +307,37 @@ mod tests {
let (session, turn) = test_session_and_turn();
let open_shell = run_unified_exec_request(
&session,
&turn,
None,
vec!["bash".to_string(), "-i".to_string()],
Some(2_500),
)
.await?;
let open_shell = exec_command(&session, &turn, "bash -i", Some(2_500)).await?;
let session_id = open_shell.session_id.expect("expected session id");
run_unified_exec_request(
write_stdin(
&session,
&turn,
Some(session_id),
vec![
"export".to_string(),
"CODEX_INTERACTIVE_SHELL_VAR=codex\n".to_string(),
],
session_id,
"export CODEX_INTERACTIVE_SHELL_VAR=codex\n",
Some(2_500),
)
.await?;
let out_2 = run_unified_exec_request(
let out_2 = write_stdin(
&session,
&turn,
Some(session_id),
vec!["sleep 5 && echo $CODEX_INTERACTIVE_SHELL_VAR\n".to_string()],
session_id,
"sleep 5 && echo $CODEX_INTERACTIVE_SHELL_VAR\n",
Some(10),
)
.await?;
assert!(!out_2.output.contains("codex"));
assert!(
!out_2.output.contains("codex"),
"timeout too short should yield incomplete output"
);
tokio::time::sleep(Duration::from_secs(7)).await;
let out_3 =
run_unified_exec_request(&session, &turn, Some(session_id), Vec::new(), Some(100))
.await?;
let out_3 = write_stdin(&session, session_id, "", Some(100)).await?;
assert!(out_3.output.contains("codex"));
assert!(
out_3.output.contains("codex"),
"subsequent poll should retrieve output"
);
Ok(())
}
@@ -282,18 +347,9 @@ mod tests {
async fn requests_with_large_timeout_are_capped() -> anyhow::Result<()> {
let (session, turn) = test_session_and_turn();
let result = run_unified_exec_request(
&session,
&turn,
None,
vec!["echo".to_string(), "codex".to_string()],
Some(120_000),
)
.await?;
let result = exec_command(&session, &turn, "echo codex", Some(120_000)).await?;
assert!(result.output.starts_with(
"Warning: requested timeout 120000ms exceeds maximum of 60000ms; clamping to 60000ms.\n"
));
assert!(result.session_id.is_none());
assert!(result.output.contains("codex"));
Ok(())
@@ -303,16 +359,12 @@ mod tests {
#[ignore] // Ignored while we have a better way to test this.
async fn completed_commands_do_not_persist_sessions() -> anyhow::Result<()> {
let (session, turn) = test_session_and_turn();
let result = run_unified_exec_request(
&session,
&turn,
None,
vec!["/bin/echo".to_string(), "codex".to_string()],
Some(2_500),
)
.await?;
let result = exec_command(&session, &turn, "echo codex", Some(2_500)).await?;
assert!(result.session_id.is_none());
assert!(
result.session_id.is_none(),
"completed command should not retain session"
);
assert!(result.output.contains("codex"));
assert!(
@@ -334,31 +386,16 @@ mod tests {
let (session, turn) = test_session_and_turn();
let open_shell = run_unified_exec_request(
&session,
&turn,
None,
vec!["/bin/bash".to_string(), "-i".to_string()],
Some(2_500),
)
.await?;
let open_shell = exec_command(&session, &turn, "bash -i", Some(2_500)).await?;
let session_id = open_shell.session_id.expect("expected session id");
run_unified_exec_request(
&session,
&turn,
Some(session_id),
vec!["exit\n".to_string()],
Some(2_500),
)
.await?;
write_stdin(&session, session_id, "exit\n", Some(2_500)).await?;
tokio::time::sleep(Duration::from_millis(200)).await;
let err =
run_unified_exec_request(&session, &turn, Some(session_id), Vec::new(), Some(100))
.await
.expect_err("expected unknown session error");
let err = write_stdin(&session, session_id, "", Some(100))
.await
.expect_err("expected unknown session error");
match err {
UnifiedExecError::UnknownSessionId { session_id: err_id } => {

View File

@@ -11,77 +11,159 @@ use crate::tools::orchestrator::ToolOrchestrator;
use crate::tools::runtimes::unified_exec::UnifiedExecRequest as UnifiedExecToolRequest;
use crate::tools::runtimes::unified_exec::UnifiedExecRuntime;
use crate::tools::sandboxing::ToolCtx;
use crate::truncate::truncate_middle;
use super::DEFAULT_TIMEOUT_MS;
use super::MAX_TIMEOUT_MS;
use super::UNIFIED_EXEC_OUTPUT_MAX_BYTES;
use super::ExecCommandRequest;
use super::MIN_YIELD_TIME_MS;
use super::UnifiedExecContext;
use super::UnifiedExecError;
use super::UnifiedExecRequest;
use super::UnifiedExecResult;
use super::UnifiedExecResponse;
use super::UnifiedExecSessionManager;
use super::WriteStdinRequest;
use super::clamp_yield_time;
use super::generate_chunk_id;
use super::resolve_max_tokens;
use super::session::OutputBuffer;
use super::session::UnifiedExecSession;
pub(super) struct SessionAcquisition {
pub(super) session_id: i32,
pub(super) writer_tx: mpsc::Sender<Vec<u8>>,
pub(super) output_buffer: OutputBuffer,
pub(super) output_notify: Arc<Notify>,
pub(super) new_session: Option<UnifiedExecSession>,
pub(super) reuse_requested: bool,
}
use super::truncate_output_to_tokens;
impl UnifiedExecSessionManager {
pub(super) async fn acquire_session(
pub(crate) async fn exec_command(
&self,
request: &UnifiedExecRequest<'_>,
request: ExecCommandRequest<'_>,
context: &UnifiedExecContext<'_>,
) -> Result<SessionAcquisition, UnifiedExecError> {
if let Some(existing_id) = context.session_id {
let mut sessions = self.sessions.lock().await;
match sessions.get(&existing_id) {
Some(session) => {
if session.has_exited() {
sessions.remove(&existing_id);
return Err(UnifiedExecError::UnknownSessionId {
session_id: existing_id,
});
}
let (buffer, notify) = session.output_handles();
let writer_tx = session.writer_sender();
Ok(SessionAcquisition {
session_id: existing_id,
writer_tx,
output_buffer: buffer,
output_notify: notify,
new_session: None,
reuse_requested: true,
})
}
None => Err(UnifiedExecError::UnknownSessionId {
session_id: existing_id,
}),
}
) -> Result<UnifiedExecResponse, UnifiedExecError> {
let shell_flag = if request.login { "-lc" } else { "-c" };
let command = vec![
request.shell.to_string(),
shell_flag.to_string(),
request.command.to_string(),
];
let session = self.open_session_with_sandbox(command, context).await?;
let max_tokens = resolve_max_tokens(request.max_output_tokens);
let yield_time_ms =
clamp_yield_time(Some(request.yield_time_ms.unwrap_or(MIN_YIELD_TIME_MS)));
let start = Instant::now();
let (output_buffer, output_notify) = session.output_handles();
let deadline = start + Duration::from_millis(yield_time_ms);
let collected =
Self::collect_output_until_deadline(&output_buffer, &output_notify, deadline).await;
let wall_time = Instant::now().saturating_duration_since(start);
let text = String::from_utf8_lossy(&collected).to_string();
let (output, original_token_count) = truncate_output_to_tokens(&text, max_tokens);
let chunk_id = generate_chunk_id();
let exit_code = session.exit_code();
let session_id = if session.has_exited() {
None
} else {
let new_id = self
.next_session_id
.fetch_add(1, std::sync::atomic::Ordering::SeqCst);
let managed_session = self
.open_session_with_sandbox(request.input_chunks.to_vec(), context)
.await?;
let (buffer, notify) = managed_session.output_handles();
let writer_tx = managed_session.writer_sender();
Ok(SessionAcquisition {
session_id: new_id,
writer_tx,
output_buffer: buffer,
output_notify: notify,
new_session: Some(managed_session),
reuse_requested: false,
})
Some(self.store_session(session).await)
};
Ok(UnifiedExecResponse {
chunk_id,
wall_time,
output,
session_id,
exit_code,
original_token_count,
})
}
pub(crate) async fn write_stdin(
&self,
request: WriteStdinRequest<'_>,
) -> Result<UnifiedExecResponse, UnifiedExecError> {
let session_id = request.session_id;
let (writer_tx, output_buffer, output_notify) =
self.prepare_session_handles(session_id).await?;
if !request.input.is_empty() {
Self::send_input(&writer_tx, request.input.as_bytes()).await?;
tokio::time::sleep(Duration::from_millis(100)).await;
}
let max_tokens = resolve_max_tokens(request.max_output_tokens);
let yield_time_ms = clamp_yield_time(request.yield_time_ms);
let start = Instant::now();
let deadline = start + Duration::from_millis(yield_time_ms);
let collected =
Self::collect_output_until_deadline(&output_buffer, &output_notify, deadline).await;
let wall_time = Instant::now().saturating_duration_since(start);
let text = String::from_utf8_lossy(&collected).to_string();
let (output, original_token_count) = truncate_output_to_tokens(&text, max_tokens);
let chunk_id = generate_chunk_id();
let (session_id, exit_code) = self.refresh_session_state(session_id).await;
Ok(UnifiedExecResponse {
chunk_id,
wall_time,
output,
session_id,
exit_code,
original_token_count,
})
}
async fn refresh_session_state(&self, session_id: i32) -> (Option<i32>, Option<i32>) {
let mut sessions = self.sessions.lock().await;
if !sessions.contains_key(&session_id) {
return (None, None);
}
let has_exited = sessions
.get(&session_id)
.map(UnifiedExecSession::has_exited)
.unwrap_or(false);
let exit_code = sessions
.get(&session_id)
.and_then(UnifiedExecSession::exit_code);
if has_exited {
sessions.remove(&session_id);
(None, exit_code)
} else {
(Some(session_id), exit_code)
}
}
async fn prepare_session_handles(
&self,
session_id: i32,
) -> Result<(mpsc::Sender<Vec<u8>>, OutputBuffer, Arc<Notify>), UnifiedExecError> {
let sessions = self.sessions.lock().await;
let (output_buffer, output_notify, writer_tx) =
if let Some(session) = sessions.get(&session_id) {
let (buffer, notify) = session.output_handles();
(buffer, notify, session.writer_sender())
} else {
return Err(UnifiedExecError::UnknownSessionId { session_id });
};
Ok((writer_tx, output_buffer, output_notify))
}
async fn send_input(
writer_tx: &mpsc::Sender<Vec<u8>>,
data: &[u8],
) -> Result<(), UnifiedExecError> {
writer_tx
.send(data.to_vec())
.await
.map_err(|_| UnifiedExecError::WriteToStdin)
}
async fn store_session(&self, session: UnifiedExecSession) -> i32 {
let session_id = self
.next_session_id
.fetch_add(1, std::sync::atomic::Ordering::SeqCst);
self.sessions.lock().await.insert(session_id, session);
session_id
}
pub(crate) async fn open_session_with_exec_env(
@@ -113,9 +195,9 @@ impl UnifiedExecSessionManager {
);
let tool_ctx = ToolCtx {
session: context.session,
sub_id: context.sub_id.to_string(),
turn: context.turn,
call_id: context.call_id.to_string(),
tool_name: "unified_exec".to_string(),
tool_name: "exec_command".to_string(),
};
orchestrator
.run(
@@ -172,117 +254,4 @@ impl UnifiedExecSessionManager {
collected
}
pub(super) async fn should_store_session(&self, acquisition: &SessionAcquisition) -> bool {
if let Some(session) = acquisition.new_session.as_ref() {
!session.has_exited()
} else if acquisition.reuse_requested {
let mut sessions = self.sessions.lock().await;
if let Some(existing) = sessions.get(&acquisition.session_id) {
if existing.has_exited() {
sessions.remove(&acquisition.session_id);
false
} else {
true
}
} else {
false
}
} else {
true
}
}
pub(super) async fn send_input_chunks(
writer_tx: &mpsc::Sender<Vec<u8>>,
chunks: &[String],
) -> Result<(), UnifiedExecError> {
let mut trailing_whitespace = true;
for chunk in chunks {
if chunk.is_empty() {
continue;
}
let leading_whitespace = chunk
.chars()
.next()
.map(char::is_whitespace)
.unwrap_or(true);
if !trailing_whitespace
&& !leading_whitespace
&& writer_tx.send(vec![b' ']).await.is_err()
{
return Err(UnifiedExecError::WriteToStdin);
}
if writer_tx.send(chunk.as_bytes().to_vec()).await.is_err() {
return Err(UnifiedExecError::WriteToStdin);
}
trailing_whitespace = chunk
.chars()
.next_back()
.map(char::is_whitespace)
.unwrap_or(trailing_whitespace);
}
Ok(())
}
pub async fn handle_request(
&self,
request: UnifiedExecRequest<'_>,
context: UnifiedExecContext<'_>,
) -> Result<UnifiedExecResult, UnifiedExecError> {
let (timeout_ms, timeout_warning) = match request.timeout_ms {
Some(requested) if requested > MAX_TIMEOUT_MS => (
MAX_TIMEOUT_MS,
Some(format!(
"Warning: requested timeout {requested}ms exceeds maximum of {MAX_TIMEOUT_MS}ms; clamping to {MAX_TIMEOUT_MS}ms.\n"
)),
),
Some(requested) => (requested, None),
None => (DEFAULT_TIMEOUT_MS, None),
};
let mut acquisition = self.acquire_session(&request, &context).await?;
if acquisition.reuse_requested {
Self::send_input_chunks(&acquisition.writer_tx, request.input_chunks).await?;
}
let deadline = Instant::now() + Duration::from_millis(timeout_ms);
let collected = Self::collect_output_until_deadline(
&acquisition.output_buffer,
&acquisition.output_notify,
deadline,
)
.await;
let (output, _maybe_tokens) = truncate_middle(
&String::from_utf8_lossy(&collected),
UNIFIED_EXEC_OUTPUT_MAX_BYTES,
);
let output = if let Some(warning) = timeout_warning {
format!("{warning}{output}")
} else {
output
};
let should_store_session = self.should_store_session(&acquisition).await;
let session_id = if should_store_session {
if let Some(session) = acquisition.new_session.take() {
self.sessions
.lock()
.await
.insert(acquisition.session_id, session);
}
Some(acquisition.session_id)
} else {
None
};
Ok(UnifiedExecResult { session_id, output })
}
}

View File

@@ -51,6 +51,7 @@ pub(crate) enum UserNotification {
AgentTurnComplete {
thread_id: String,
turn_id: String,
cwd: String,
/// Messages that the user sent to the agent to initiate the turn.
input_messages: Vec<String>,
@@ -70,6 +71,7 @@ mod tests {
let notification = UserNotification::AgentTurnComplete {
thread_id: "b5f6c1c2-1111-2222-3333-444455556666".to_string(),
turn_id: "12345".to_string(),
cwd: "/Users/example/project".to_string(),
input_messages: vec!["Rename `foo` to `bar` and update the callsites.".to_string()],
last_assistant_message: Some(
"Rename complete and verified `cargo build` succeeds.".to_string(),
@@ -78,7 +80,7 @@ mod tests {
let serialized = serde_json::to_string(&notification)?;
assert_eq!(
serialized,
r#"{"type":"agent-turn-complete","thread-id":"b5f6c1c2-1111-2222-3333-444455556666","turn-id":"12345","input-messages":["Rename `foo` to `bar` and update the callsites."],"last-assistant-message":"Rename complete and verified `cargo build` succeeds."}"#
r#"{"type":"agent-turn-complete","thread-id":"b5f6c1c2-1111-2222-3333-444455556666","turn-id":"12345","cwd":"/Users/example/project","input-messages":["Rename `foo` to `bar` and update the callsites."],"last-assistant-message":"Rename complete and verified `cargo build` succeeds."}"#
);
Ok(())
}

View File

@@ -8,12 +8,12 @@ use codex_core::LocalShellStatus;
use codex_core::ModelClient;
use codex_core::ModelProviderInfo;
use codex_core::Prompt;
use codex_core::ReasoningItemContent;
use codex_core::ResponseItem;
use codex_core::WireApi;
use codex_core::spawn::CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR;
use codex_otel::otel_event_manager::OtelEventManager;
use codex_protocol::ConversationId;
use codex_protocol::models::ReasoningItemContent;
use core_test_support::load_default_config_for_test;
use futures::StreamExt;
use serde_json::Value;
@@ -50,6 +50,7 @@ async fn run_request(input: Vec<ResponseItem>) -> Value {
base_url: Some(format!("{}/v1", server.uri())),
env_key: None,
env_key_instructions: None,
experimental_bearer_token: None,
wire_api: WireApi::Chat,
query_params: None,
http_headers: None,

View File

@@ -13,6 +13,7 @@ use codex_core::WireApi;
use codex_core::spawn::CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR;
use codex_otel::otel_event_manager::OtelEventManager;
use codex_protocol::ConversationId;
use codex_protocol::models::ReasoningItemContent;
use core_test_support::load_default_config_for_test;
use futures::StreamExt;
use tempfile::TempDir;
@@ -49,6 +50,7 @@ async fn run_stream_with_bytes(sse_body: &[u8]) -> Vec<ResponseEvent> {
base_url: Some(format!("{}/v1", server.uri())),
env_key: None,
env_key_instructions: None,
experimental_bearer_token: None,
wire_api: WireApi::Chat,
query_params: None,
http_headers: None,
@@ -142,8 +144,8 @@ fn assert_reasoning(item: &ResponseItem, expected: &str) {
let mut combined = String::new();
for part in parts {
match part {
codex_core::ReasoningItemContent::ReasoningText { text }
| codex_core::ReasoningItemContent::Text { text } => combined.push_str(text),
ReasoningItemContent::ReasoningText { text }
| ReasoningItemContent::Text { text } => combined.push_str(text),
}
}
assert_eq!(combined, expected);

View File

@@ -134,6 +134,14 @@ where
wait_for_event_with_timeout(codex, predicate, Duration::from_secs(1)).await
}
pub async fn wait_for_event_match<T, F>(codex: &CodexConversation, matcher: F) -> T
where
F: Fn(&codex_core::protocol::EventMsg) -> Option<T>,
{
let ev = wait_for_event(codex, |ev| matcher(ev).is_some()).await;
matcher(&ev).unwrap()
}
pub async fn wait_for_event_with_timeout<F>(
codex: &CodexConversation,
mut predicate: F,

View File

@@ -35,6 +35,22 @@ impl ResponseMock {
pub fn requests(&self) -> Vec<ResponsesRequest> {
self.requests.lock().unwrap().clone()
}
/// Returns true if any captured request contains a `function_call` with the
/// provided `call_id`.
pub fn saw_function_call(&self, call_id: &str) -> bool {
self.requests()
.iter()
.any(|req| req.has_function_call(call_id))
}
/// Returns the `output` string for a matching `function_call_output` with
/// the provided `call_id`, searching across all captured requests.
pub fn function_call_output_text(&self, call_id: &str) -> Option<String> {
self.requests()
.iter()
.find_map(|req| req.function_call_output_text(call_id))
}
}
#[derive(Debug, Clone)]
@@ -70,6 +86,28 @@ impl ResponsesRequest {
.unwrap_or_else(|| panic!("function call output {call_id} item not found in request"))
}
/// Returns true if this request's `input` contains a `function_call` with
/// the specified `call_id`.
pub fn has_function_call(&self, call_id: &str) -> bool {
self.input().iter().any(|item| {
item.get("type").and_then(Value::as_str) == Some("function_call")
&& item.get("call_id").and_then(Value::as_str) == Some(call_id)
})
}
/// If present, returns the `output` string of the `function_call_output`
/// entry matching `call_id` in this request's `input`.
pub fn function_call_output_text(&self, call_id: &str) -> Option<String> {
let binding = self.input();
let item = binding.iter().find(|item| {
item.get("type").and_then(Value::as_str) == Some("function_call_output")
&& item.get("call_id").and_then(Value::as_str) == Some(call_id)
})?;
item.get("output")
.and_then(Value::as_str)
.map(str::to_string)
}
pub fn header(&self, name: &str) -> Option<String> {
self.0
.headers
@@ -97,6 +135,10 @@ impl Match for ResponseMock {
.lock()
.unwrap()
.push(ResponsesRequest(request.clone()));
// Enforce invariant checks on every request body captured by the mock.
// Panic on orphan tool outputs or calls to catch regressions early.
validate_request_body_invariants(request);
true
}
}
@@ -167,6 +209,56 @@ pub fn ev_assistant_message(id: &str, text: &str) -> Value {
})
}
pub fn ev_reasoning_item(id: &str, summary: &[&str], raw_content: &[&str]) -> Value {
let summary_entries: Vec<Value> = summary
.iter()
.map(|text| serde_json::json!({"type": "summary_text", "text": text}))
.collect();
let mut event = serde_json::json!({
"type": "response.output_item.done",
"item": {
"type": "reasoning",
"id": id,
"summary": summary_entries,
}
});
if !raw_content.is_empty() {
let content_entries: Vec<Value> = raw_content
.iter()
.map(|text| serde_json::json!({"type": "reasoning_text", "text": text}))
.collect();
event["item"]["content"] = Value::Array(content_entries);
}
event
}
pub fn ev_web_search_call_added(id: &str, status: &str, query: &str) -> Value {
serde_json::json!({
"type": "response.output_item.added",
"item": {
"type": "web_search_call",
"id": id,
"status": status,
"action": {"type": "search", "query": query}
}
})
}
pub fn ev_web_search_call_done(id: &str, status: &str, query: &str) -> Value {
serde_json::json!({
"type": "response.output_item.done",
"item": {
"type": "web_search_call",
"id": id,
"status": status,
"action": {"type": "search", "query": query}
}
})
}
pub fn ev_function_call(call_id: &str, name: &str, arguments: &str) -> Value {
serde_json::json!({
"type": "response.output_item.done",
@@ -336,3 +428,90 @@ pub async fn mount_sse_sequence(server: &MockServer, bodies: Vec<String>) -> Res
response_mock
}
/// Validate invariants on the request body sent to `/v1/responses`.
///
/// - No `function_call_output`/`custom_tool_call_output` with missing/empty `call_id`.
/// - Every `function_call_output` must match a prior `function_call` or
/// `local_shell_call` with the same `call_id` in the same `input`.
/// - Every `custom_tool_call_output` must match a prior `custom_tool_call`.
/// - Additionally, enforce symmetry: every `function_call`/`custom_tool_call`
/// in the `input` must have a matching output entry.
fn validate_request_body_invariants(request: &wiremock::Request) {
let Ok(body): Result<Value, _> = request.body_json() else {
return;
};
let Some(items) = body.get("input").and_then(Value::as_array) else {
panic!("input array not found in request");
};
use std::collections::HashSet;
fn get_call_id(item: &Value) -> Option<&str> {
item.get("call_id")
.and_then(Value::as_str)
.filter(|id| !id.is_empty())
}
fn gather_ids(items: &[Value], kind: &str) -> HashSet<String> {
items
.iter()
.filter(|item| item.get("type").and_then(Value::as_str) == Some(kind))
.filter_map(get_call_id)
.map(str::to_string)
.collect()
}
fn gather_output_ids(items: &[Value], kind: &str, missing_msg: &str) -> HashSet<String> {
items
.iter()
.filter(|item| item.get("type").and_then(Value::as_str) == Some(kind))
.map(|item| {
let Some(id) = get_call_id(item) else {
panic!("{missing_msg}");
};
id.to_string()
})
.collect()
}
let function_calls = gather_ids(items, "function_call");
let custom_tool_calls = gather_ids(items, "custom_tool_call");
let local_shell_calls = gather_ids(items, "local_shell_call");
let function_call_outputs = gather_output_ids(
items,
"function_call_output",
"orphan function_call_output with empty call_id should be dropped",
);
let custom_tool_call_outputs = gather_output_ids(
items,
"custom_tool_call_output",
"orphan custom_tool_call_output with empty call_id should be dropped",
);
for cid in &function_call_outputs {
assert!(
function_calls.contains(cid) || local_shell_calls.contains(cid),
"function_call_output without matching call in input: {cid}",
);
}
for cid in &custom_tool_call_outputs {
assert!(
custom_tool_calls.contains(cid),
"custom_tool_call_output without matching call in input: {cid}",
);
}
for cid in &function_calls {
assert!(
function_call_outputs.contains(cid),
"Function call output is missing for call id: {cid}",
);
}
for cid in &custom_tool_calls {
assert!(
custom_tool_call_outputs.contains(cid),
"Custom tool call output is missing for call id: {cid}",
);
}
}

View File

@@ -6,7 +6,6 @@ use codex_core::CodexAuth;
use codex_core::CodexConversation;
use codex_core::ConversationManager;
use codex_core::ModelProviderInfo;
use codex_core::NewConversation;
use codex_core::built_in_model_providers;
use codex_core::config::Config;
use codex_core::protocol::SessionConfiguredEvent;
@@ -30,14 +29,59 @@ impl TestCodexBuilder {
}
pub async fn build(&mut self, server: &wiremock::MockServer) -> anyhow::Result<TestCodex> {
// Build config pointing to the mock server and spawn Codex.
let home = Arc::new(TempDir::new()?);
self.build_with_home(server, home, None).await
}
pub async fn resume(
&mut self,
server: &wiremock::MockServer,
home: Arc<TempDir>,
rollout_path: PathBuf,
) -> anyhow::Result<TestCodex> {
self.build_with_home(server, home, Some(rollout_path)).await
}
async fn build_with_home(
&mut self,
server: &wiremock::MockServer,
home: Arc<TempDir>,
resume_from: Option<PathBuf>,
) -> anyhow::Result<TestCodex> {
let (config, cwd) = self.prepare_config(server, &home).await?;
let conversation_manager = ConversationManager::with_auth(CodexAuth::from_api_key("dummy"));
let new_conversation = match resume_from {
Some(path) => {
let auth_manager = codex_core::AuthManager::from_auth_for_testing(
CodexAuth::from_api_key("dummy"),
);
conversation_manager
.resume_conversation_from_rollout(config, path, auth_manager)
.await?
}
None => conversation_manager.new_conversation(config).await?,
};
Ok(TestCodex {
home,
cwd,
codex: new_conversation.conversation,
session_configured: new_conversation.session_configured,
})
}
async fn prepare_config(
&mut self,
server: &wiremock::MockServer,
home: &TempDir,
) -> anyhow::Result<(Config, Arc<TempDir>)> {
let model_provider = ModelProviderInfo {
base_url: Some(format!("{}/v1", server.uri())),
..built_in_model_providers()["openai"].clone()
};
let home = TempDir::new()?;
let cwd = TempDir::new()?;
let mut config = load_default_config_for_test(&home);
let cwd = Arc::new(TempDir::new()?);
let mut config = load_default_config_for_test(home);
config.cwd = cwd.path().to_path_buf();
config.model_provider = model_provider;
config.codex_linux_sandbox_exe = Some(PathBuf::from(
@@ -48,29 +92,17 @@ impl TestCodexBuilder {
let mut mutators = vec![];
swap(&mut self.config_mutators, &mut mutators);
for mutator in mutators {
mutator(&mut config)
mutator(&mut config);
}
let conversation_manager = ConversationManager::with_auth(CodexAuth::from_api_key("dummy"));
let NewConversation {
conversation,
session_configured,
..
} = conversation_manager.new_conversation(config).await?;
Ok(TestCodex {
home,
cwd,
codex: conversation,
session_configured,
})
Ok((config, cwd))
}
}
pub struct TestCodex {
pub home: TempDir,
pub cwd: TempDir,
pub home: Arc<TempDir>,
pub cwd: Arc<TempDir>,
pub codex: Arc<CodexConversation>,
pub session_configured: SessionConfiguredEvent,
}

View File

@@ -38,6 +38,7 @@ async fn responses_stream_includes_task_type_header() {
base_url: Some(format!("{}/v1", server.uri())),
env_key: None,
env_key_instructions: None,
experimental_bearer_token: None,
wire_api: WireApi::Responses,
query_params: None,
http_headers: None,

View File

@@ -1,11 +1,14 @@
use std::sync::Arc;
use std::time::Duration;
use codex_core::protocol::EventMsg;
use codex_core::protocol::InputItem;
use codex_core::protocol::Op;
use codex_protocol::user_input::UserInput;
use core_test_support::responses::ev_completed;
use core_test_support::responses::ev_function_call;
use core_test_support::responses::ev_response_created;
use core_test_support::responses::mount_sse_once;
use core_test_support::responses::mount_sse_sequence;
use core_test_support::responses::sse;
use core_test_support::responses::start_mock_server;
use core_test_support::test_codex::test_codex;
@@ -42,7 +45,7 @@ async fn interrupt_long_running_tool_emits_turn_aborted() {
// Kick off a turn that triggers the function call.
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: "start sleep".into(),
}],
})
@@ -67,3 +70,98 @@ async fn interrupt_long_running_tool_emits_turn_aborted() {
)
.await;
}
/// After an interrupt we expect the next request to the model to include both
/// the original tool call and an `"aborted"` `function_call_output`. This test
/// exercises the follow-up flow: it sends another user turn, inspects the mock
/// responses server, and ensures the model receives the synthesized abort.
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn interrupt_tool_records_history_entries() {
let command = vec![
"bash".to_string(),
"-lc".to_string(),
"sleep 60".to_string(),
];
let call_id = "call-history";
let args = json!({
"command": command,
"timeout_ms": 60_000
})
.to_string();
let first_body = sse(vec![
ev_response_created("resp-history"),
ev_function_call(call_id, "shell", &args),
ev_completed("resp-history"),
]);
let follow_up_body = sse(vec![
ev_response_created("resp-followup"),
ev_completed("resp-followup"),
]);
let server = start_mock_server().await;
let response_mock = mount_sse_sequence(&server, vec![first_body, follow_up_body]).await;
let fixture = test_codex().build(&server).await.unwrap();
let codex = Arc::clone(&fixture.codex);
let wait_timeout = Duration::from_millis(100);
codex
.submit(Op::UserInput {
items: vec![UserInput::Text {
text: "start history recording".into(),
}],
})
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::ExecCommandBegin(_)),
wait_timeout,
)
.await;
codex.submit(Op::Interrupt).await.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TurnAborted(_)),
wait_timeout,
)
.await;
codex
.submit(Op::UserInput {
items: vec![UserInput::Text {
text: "follow up".into(),
}],
})
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TaskComplete(_)),
wait_timeout,
)
.await;
let requests = response_mock.requests();
assert!(
requests.len() == 2,
"expected two calls to the responses API, got {}",
requests.len()
);
assert!(
response_mock.saw_function_call(call_id),
"function call not recorded in responses payload"
);
assert_eq!(
response_mock.function_call_output_text(call_id).as_deref(),
Some("aborted"),
"aborted function call output not recorded in responses payload"
);
}

View File

@@ -6,11 +6,11 @@ use codex_core::protocol::ApplyPatchApprovalRequestEvent;
use codex_core::protocol::AskForApproval;
use codex_core::protocol::EventMsg;
use codex_core::protocol::ExecApprovalRequestEvent;
use codex_core::protocol::InputItem;
use codex_core::protocol::Op;
use codex_core::protocol::SandboxPolicy;
use codex_protocol::config_types::ReasoningSummary;
use codex_protocol::protocol::ReviewDecision;
use codex_protocol::user_input::UserInput;
use core_test_support::responses::ev_apply_patch_function_call;
use core_test_support::responses::ev_assistant_message;
use core_test_support::responses::ev_completed;
@@ -374,7 +374,7 @@ async fn submit_turn(
test.codex
.submit(Op::UserTurn {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: prompt.into(),
}],
final_output_json_schema: None,

View File

@@ -9,7 +9,6 @@ use codex_core::ModelClient;
use codex_core::ModelProviderInfo;
use codex_core::NewConversation;
use codex_core::Prompt;
use codex_core::ReasoningItemContent;
use codex_core::ResponseEvent;
use codex_core::ResponseItem;
use codex_core::WireApi;
@@ -17,13 +16,14 @@ use codex_core::built_in_model_providers;
use codex_core::error::CodexErr;
use codex_core::model_family::find_family_for_model;
use codex_core::protocol::EventMsg;
use codex_core::protocol::InputItem;
use codex_core::protocol::Op;
use codex_core::protocol::SessionSource;
use codex_otel::otel_event_manager::OtelEventManager;
use codex_protocol::ConversationId;
use codex_protocol::models::ReasoningItemContent;
use codex_protocol::models::ReasoningItemReasoningSummary;
use codex_protocol::models::WebSearchAction;
use codex_protocol::user_input::UserInput;
use core_test_support::load_default_config_for_test;
use core_test_support::load_sse_fixture_with_id;
use core_test_support::responses;
@@ -263,7 +263,7 @@ async fn resume_includes_initial_messages_and_sends_prior_items() {
// 2) Submit new input; the request body must include the prior item followed by the new user input.
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: "hello".into(),
}],
})
@@ -335,7 +335,7 @@ async fn includes_conversation_id_and_model_headers_in_request() {
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: "hello".into(),
}],
})
@@ -390,7 +390,7 @@ async fn includes_base_instructions_override_in_request() {
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: "hello".into(),
}],
})
@@ -450,7 +450,7 @@ async fn chatgpt_auth_sends_correct_request() {
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: "hello".into(),
}],
})
@@ -540,7 +540,7 @@ async fn prefers_apikey_when_config_prefers_apikey_even_with_chatgpt_tokens() {
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: "hello".into(),
}],
})
@@ -579,7 +579,7 @@ async fn includes_user_instructions_message_in_request() {
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: "hello".into(),
}],
})
@@ -632,6 +632,7 @@ async fn azure_responses_request_includes_store_and_reasoning_ids() {
base_url: Some(format!("{}/openai", server.uri())),
env_key: None,
env_key_instructions: None,
experimental_bearer_token: None,
wire_api: WireApi::Responses,
query_params: None,
http_headers: None,
@@ -795,7 +796,7 @@ async fn token_count_includes_rate_limits_snapshot() {
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: "hello".into(),
}],
})
@@ -946,7 +947,7 @@ async fn usage_limit_error_emits_rate_limit_event() -> anyhow::Result<()> {
let submission_id = codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: "hello".into(),
}],
})
@@ -1016,7 +1017,7 @@ async fn context_window_error_sets_total_tokens_to_model_window() -> anyhow::Res
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: "seed turn".into(),
}],
})
@@ -1026,7 +1027,7 @@ async fn context_window_error_sets_total_tokens_to_model_window() -> anyhow::Res
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: "trigger context window".into(),
}],
})
@@ -1115,6 +1116,7 @@ async fn azure_overrides_assign_properties_used_for_responses_url() {
base_url: Some(format!("{}/openai", server.uri())),
// Reuse the existing environment variable to avoid using unsafe code
env_key: Some(existing_env_var_with_random_value.to_string()),
experimental_bearer_token: None,
query_params: Some(std::collections::HashMap::from([(
"api-version".to_string(),
"2025-04-01-preview".to_string(),
@@ -1146,7 +1148,7 @@ async fn azure_overrides_assign_properties_used_for_responses_url() {
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: "hello".into(),
}],
})
@@ -1197,6 +1199,7 @@ async fn env_var_overrides_loaded_auth() {
"2025-04-01-preview".to_string(),
)])),
env_key_instructions: None,
experimental_bearer_token: None,
wire_api: WireApi::Responses,
http_headers: Some(std::collections::HashMap::from([(
"Custom-Header".to_string(),
@@ -1223,7 +1226,7 @@ async fn env_var_overrides_loaded_auth() {
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: "hello".into(),
}],
})
@@ -1301,7 +1304,7 @@ async fn history_dedupes_streamed_and_final_messages_across_turns() {
// Turn 1: user sends U1; wait for completion.
codex
.submit(Op::UserInput {
items: vec![InputItem::Text { text: "U1".into() }],
items: vec![UserInput::Text { text: "U1".into() }],
})
.await
.unwrap();
@@ -1310,7 +1313,7 @@ async fn history_dedupes_streamed_and_final_messages_across_turns() {
// Turn 2: user sends U2; wait for completion.
codex
.submit(Op::UserInput {
items: vec![InputItem::Text { text: "U2".into() }],
items: vec![UserInput::Text { text: "U2".into() }],
})
.await
.unwrap();
@@ -1319,7 +1322,7 @@ async fn history_dedupes_streamed_and_final_messages_across_turns() {
// Turn 3: user sends U3; wait for completion.
codex
.submit(Op::UserInput {
items: vec![InputItem::Text { text: "U3".into() }],
items: vec![UserInput::Text { text: "U3".into() }],
})
.await
.unwrap();

View File

@@ -5,10 +5,10 @@ use codex_core::NewConversation;
use codex_core::built_in_model_providers;
use codex_core::protocol::ErrorEvent;
use codex_core::protocol::EventMsg;
use codex_core::protocol::InputItem;
use codex_core::protocol::Op;
use codex_core::protocol::RolloutItem;
use codex_core::protocol::RolloutLine;
use codex_protocol::user_input::UserInput;
use core_test_support::load_default_config_for_test;
use core_test_support::skip_if_no_network;
use core_test_support::wait_for_event;
@@ -108,7 +108,7 @@ async fn summarize_context_three_requests_and_instructions() {
// 1) Normal user input should hit server once.
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: "hello world".into(),
}],
})
@@ -123,7 +123,7 @@ async fn summarize_context_three_requests_and_instructions() {
// 3) Next user input third hit; history should include only the summary.
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: THIRD_USER_MSG.into(),
}],
})
@@ -324,7 +324,7 @@ async fn auto_compact_runs_after_token_limit_hit() {
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: FIRST_AUTO_MSG.into(),
}],
})
@@ -335,7 +335,7 @@ async fn auto_compact_runs_after_token_limit_hit() {
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: SECOND_AUTO_MSG.into(),
}],
})
@@ -468,7 +468,7 @@ async fn auto_compact_persists_rollout_entries() {
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: FIRST_AUTO_MSG.into(),
}],
})
@@ -478,7 +478,7 @@ async fn auto_compact_persists_rollout_entries() {
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: SECOND_AUTO_MSG.into(),
}],
})
@@ -580,7 +580,7 @@ async fn auto_compact_stops_after_failed_attempt() {
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: FIRST_AUTO_MSG.into(),
}],
})
@@ -674,7 +674,7 @@ async fn manual_compact_retries_after_context_window_error() {
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: "first turn".into(),
}],
})
@@ -802,7 +802,7 @@ async fn auto_compact_allows_multiple_attempts_when_interleaved_with_other_turn_
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: MULTI_AUTO_MSG.into(),
}],
})
@@ -913,7 +913,7 @@ async fn auto_compact_triggers_after_function_call_over_95_percent_usage() {
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: FUNCTION_CALL_LIMIT_MSG.into(),
}],
})

View File

@@ -20,9 +20,9 @@ use codex_core::config::Config;
use codex_core::config::OPENAI_DEFAULT_MODEL;
use codex_core::protocol::ConversationPathResponseEvent;
use codex_core::protocol::EventMsg;
use codex_core::protocol::InputItem;
use codex_core::protocol::Op;
use codex_core::spawn::CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR;
use codex_protocol::user_input::UserInput;
use core_test_support::load_default_config_for_test;
use core_test_support::responses::ev_assistant_message;
use core_test_support::responses::ev_completed;
@@ -777,7 +777,7 @@ async fn start_test_conversation(
async fn user_turn(conversation: &Arc<CodexConversation>, text: &str) {
conversation
.submit(Op::UserInput {
items: vec![InputItem::Text { text: text.into() }],
items: vec![UserInput::Text { text: text.into() }],
})
.await
.expect("submit user turn");

View File

@@ -1,18 +1,16 @@
use codex_core::CodexAuth;
use codex_core::ContentItem;
use codex_core::ConversationManager;
use codex_core::ModelProviderInfo;
use codex_core::NewConversation;
use codex_core::ResponseItem;
use codex_core::built_in_model_providers;
use codex_core::content_items_to_text;
use codex_core::is_session_prefix_message;
use codex_core::parse_turn_item;
use codex_core::protocol::ConversationPathResponseEvent;
use codex_core::protocol::EventMsg;
use codex_core::protocol::InputItem;
use codex_core::protocol::Op;
use codex_core::protocol::RolloutItem;
use codex_core::protocol::RolloutLine;
use codex_protocol::items::TurnItem;
use codex_protocol::user_input::UserInput;
use core_test_support::load_default_config_for_test;
use core_test_support::skip_if_no_network;
use core_test_support::wait_for_event;
@@ -71,7 +69,7 @@ async fn fork_conversation_twice_drops_to_first_message() {
for text in ["first", "second", "third"] {
codex
.submit(Op::UserInput {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: text.to_string(),
}],
})
@@ -115,19 +113,12 @@ async fn fork_conversation_twice_drops_to_first_message() {
let find_user_input_positions = |items: &[RolloutItem]| -> Vec<usize> {
let mut pos = Vec::new();
for (i, it) in items.iter().enumerate() {
if let RolloutItem::ResponseItem(ResponseItem::Message { role, content, .. }) = it
&& role == "user"
&& content_items_to_text(content)
.is_some_and(|text| !is_session_prefix_message(&text))
if let RolloutItem::ResponseItem(response_item) = it
&& let Some(TurnItem::UserMessage(_)) = parse_turn_item(response_item)
{
// Consider any user message as an input boundary; recorder stores both EventMsg and ResponseItem.
// We specifically look for input items, which are represented as ContentItem::InputText.
if content
.iter()
.any(|c| matches!(c, ContentItem::InputText { .. }))
{
pos.push(i);
}
pos.push(i);
}
}
pos

View File

@@ -4,10 +4,10 @@ use anyhow::Result;
use codex_core::model_family::find_family_for_model;
use codex_core::protocol::AskForApproval;
use codex_core::protocol::EventMsg;
use codex_core::protocol::InputItem;
use codex_core::protocol::Op;
use codex_core::protocol::SandboxPolicy;
use codex_protocol::config_types::ReasoningSummary;
use codex_protocol::user_input::UserInput;
use core_test_support::responses;
use core_test_support::responses::ev_assistant_message;
use core_test_support::responses::ev_completed;
@@ -149,7 +149,7 @@ async fn submit_turn(test: &TestCodex, prompt: &str) -> Result<()> {
test.codex
.submit(Op::UserTurn {
items: vec![InputItem::Text {
items: vec![UserInput::Text {
text: prompt.into(),
}],
final_output_json_schema: None,

Some files were not shown because too many files have changed in this diff Show More