Compare commits


36 Commits

Author SHA1 Message Date
zhao-oai
eb77fc79ed Merge branch 'main' into token-usage-heuristic 2025-10-27 17:59:28 -07:00
Ahmed Ibrahim
d7b333be97 Truncate the content-item for mcp tools (#5835)
This PR truncates the text output of MCP tool
2025-10-28 00:39:35 +00:00
zhao-oai
4d6a42a622 fix image drag drop (#5794)
fixing drag/drop photos bug in codex

state of the world before:

sometimes, when you drag screenshots into codex, the image does not
properly render into context. instead, the file name is shown in
quotation marks.


https://github.com/user-attachments/assets/3c0e540a-505c-4ec0-b634-e9add6a73119

the screenshot is not actually included in agent context. the agent
needs to manually call the view_image tool to see the screenshot. this
can be unreliable especially if the image is part of a longer prompt and
is dependent on the agent going out of its way to view the image.

state of the world after:


https://github.com/user-attachments/assets/5f2b7bf7-8a3f-4708-85f3-d68a017bfd97

now, images will always be directly embedded into chat context

## Technical Details

- macOS sends screenshot paths with a narrow no‑break space right before
the “AM/PM” suffix, which used to trigger our non‑ASCII fallback in the
paste burst detector.
- That fallback flushed the partially buffered paste immediately, so the
path arrived in two separate `handle_paste` calls (quoted prefix +
`PM.png'`). The split string could not be normalized to a real path, so
we showed the quoted filename instead of embedding the image.
- We now append non‑ASCII characters into the burst buffer when a burst
is already active. Finder’s payload stays intact, the path normalizes,
and the image attaches automatically.
- When no burst is active (e.g. during IME typing), non‑ASCII characters
still bypass the buffer so text entry remains responsive.
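
The buffering rule described in the bullets above can be sketched as follows. This is a simplified illustration, not the actual Codex implementation: `PasteBurst`, `CharOutcome`, `begin`, `on_char`, and `flush` are hypothetical names, and the timing heuristics that decide when a burst starts are omitted.

```rust
/// Hypothetical, simplified paste-burst buffer (not the real Codex type).
#[derive(Default)]
struct PasteBurst {
    active: bool,
    buffer: String,
}

/// What the caller should do with a character offered to the burst.
#[derive(Debug, PartialEq)]
enum CharOutcome {
    /// The character was appended to the in-progress paste.
    Buffered,
    /// The character should be inserted into the text area right away.
    InsertNow(char),
}

impl PasteBurst {
    /// Mark the start of a burst. The real detector decides this from key
    /// timing; here the caller does it explicitly.
    fn begin(&mut self) {
        self.active = true;
    }

    fn on_char(&mut self, ch: char) -> CharOutcome {
        match (self.active, ch.is_ascii()) {
            // A burst is in progress: keep every character, ASCII or not,
            // so a path containing U+202F stays in one piece.
            (true, _) => {
                self.buffer.push(ch);
                CharOutcome::Buffered
            }
            // No burst and non-ASCII (for example IME input): bypass the
            // buffer so typing stays responsive.
            (false, false) => CharOutcome::InsertNow(ch),
            // No burst and ASCII: the real code would apply its timing
            // heuristics here; this sketch inserts the character directly.
            (false, true) => CharOutcome::InsertNow(ch),
        }
    }

    /// End the burst and hand back the complete pasted payload.
    fn flush(&mut self) -> String {
        self.active = false;
        std::mem::take(&mut self.buffer)
    }
}
```

With this rule, a quoted screenshot path that contains a narrow no-break space reaches the paste handler as one string instead of two fragments.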
2025-10-27 17:11:30 -07:00
kevin zhao
8835b955fb token usage heuristic 2025-10-27 16:24:45 -07:00
Gabriel Peal
b0bdc04c30 [MCP] Render MCP tool call result images to the model (#5600)
It's pretty amazing we have gotten here without the ability for the
model to see image content from MCP tool calls.

This PR builds off of 4391 and fixes #4819. I would like @KKcorps to get
adequate credit here, but I also want to get this fix in ASAP. I gave
him a week to update it and haven't gotten a response, so I'm going to
take it across the finish line.


This test highlights how absurd the current situation is. I asked the
model to read this image using the Chrome MCP
<img width="2378" height="674" alt="image"
src="https://github.com/user-attachments/assets/9ef52608-72a2-4423-9f5e-7ae36b2b56e0"
/>

After this change, it correctly outputs:
> Captured the page: image dhows a dark terminal-style UI labeled
`OpenAI Codex (v0.0.0)` with prompt `model: gpt-5-codex medium` and
working directory `/codex/codex-rs`
(and more)  

Before this change, it said:
> Took the full-page screenshot you asked for. It shows a long,
horizontally repeating pattern of stylized people in orange, light-blue,
and mustard clothing, holding hands in alternating poses against a white
background. No text or other graphics-just rows of flat illustration
stretching off to the right.

Without this change, the Figma, Playwright, Chrome, and other visual MCP
servers are pretty much entirely useless.

I tested this change with the OpenAI Responses API as well as a
third-party completions API
2025-10-27 17:55:57 -04:00
kevin zhao
470b13c26f normalizing model slug in get_model_info 2025-10-27 14:43:54 -07:00
Owen Lin
67a219ffc2 fix: move account struct to app-server-protocol and use camelCase (#5829)
Makes sense to move this struct to `app-server-protocol/` since we want
to serialize as camelCase, but we don't for structs defined in
`protocol/`

It was:
```
export type Account = { "type": "ApiKey", api_key: string, } | { "type": "chatgpt", email: string | null, plan_type: PlanType, };
```

But we want:
```
export type Account = { "type": "apiKey", apiKey: string, } | { "type": "chatgpt", email: string | null, planType: PlanType, };
```
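
As a rough illustration of the field-name mapping shown above (`api_key` to `apiKey`, `plan_type` to `planType`), here is a small hand-written converter. `snake_to_camel` is a hypothetical helper written for this example; it is not serde's implementation and is not part of this PR.

```rust
/// Convert a snake_case identifier to camelCase.
/// Hypothetical helper for illustration only.
fn snake_to_camel(name: &str) -> String {
    let mut out = String::with_capacity(name.len());
    let mut upper_next = false;
    for ch in name.chars() {
        if ch == '_' {
            // Drop the underscore and capitalize the next character.
            upper_next = true;
        } else if upper_next {
            out.extend(ch.to_uppercase());
            upper_next = false;
        } else {
            out.push(ch);
        }
    }
    out
}
```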
2025-10-27 14:06:13 -07:00
Ahmed Ibrahim
7226365397 Centralize truncation in conversation history (#5652)
Move the truncation logic to conversation history so it applies to any
tool output. This helps avoid edge cases when truncating tool calls and
MCP calls.
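
A minimal sketch of what a single shared truncation helper could look like. The function name, the byte-based limit, and the marker text are assumptions made for this example; the PR's actual limits and format may differ.

```rust
/// Truncate `text` to at most `max_bytes` bytes of content, then append a
/// marker. Hypothetical helper; the real limit and marker are not shown here.
fn truncate_output(text: &str, max_bytes: usize) -> String {
    if text.len() <= max_bytes {
        return text.to_string();
    }
    // Back up to a char boundary so a multi-byte UTF-8 sequence is never split.
    let mut end = max_bytes;
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    format!(
        "{}\n[... truncated {} bytes ...]",
        &text[..end],
        text.len() - end
    )
}
```

Keeping this in one place means every tool output path hits the same boundary handling.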
2025-10-27 14:05:35 -07:00
Celia Chen
0fc295d958 [Auth] Add keyring support for Codex CLI (#5591)
Follow-up PR to #5569. Add Keyring Support for Auth Storage in Codex CLI
as well as a hybrid mode (default to persisting in keychain but fall
back to file when unavailable.)
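
The hybrid mode can be pictured roughly as below. This is a sketch under stated assumptions: `AuthStore`, `FakeKeyring`, `FakeFile`, and `HybridStore` are hypothetical stand-ins that use in-memory maps, not the types added by this PR.

```rust
use std::collections::HashMap;

/// Hypothetical storage interface; the real abstraction may differ.
trait AuthStore {
    fn save(&mut self, key: &str, token: &str) -> Result<(), String>;
    fn load(&self, key: &str) -> Option<String>;
}

/// Stand-in for an OS keyring that may be unavailable.
struct FakeKeyring {
    available: bool,
    entries: HashMap<String, String>,
}

/// Stand-in for the on-disk auth file.
#[derive(Default)]
struct FakeFile {
    entries: HashMap<String, String>,
}

impl AuthStore for FakeKeyring {
    fn save(&mut self, key: &str, token: &str) -> Result<(), String> {
        if !self.available {
            return Err("keyring unavailable".to_string());
        }
        self.entries.insert(key.to_string(), token.to_string());
        Ok(())
    }

    fn load(&self, key: &str) -> Option<String> {
        if !self.available {
            return None;
        }
        self.entries.get(key).cloned()
    }
}

impl AuthStore for FakeFile {
    fn save(&mut self, key: &str, token: &str) -> Result<(), String> {
        self.entries.insert(key.to_string(), token.to_string());
        Ok(())
    }

    fn load(&self, key: &str) -> Option<String> {
        self.entries.get(key).cloned()
    }
}

/// Prefer the keyring; fall back to the file when the keyring fails.
struct HybridStore {
    keyring: FakeKeyring,
    file: FakeFile,
}

impl AuthStore for HybridStore {
    fn save(&mut self, key: &str, token: &str) -> Result<(), String> {
        match self.keyring.save(key, token) {
            Ok(()) => Ok(()),
            Err(_) => self.file.save(key, token),
        }
    }

    fn load(&self, key: &str) -> Option<String> {
        self.keyring.load(key).or_else(|| self.file.load(key))
    }
}
```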

It also refactors out the keyringstore implementation from rmcp-client
[here](https://github.com/openai/codex/blob/main/codex-rs/rmcp-client/src/oauth.rs)
to a new keyring-store crate.

There will be a follow-up that picks the right credential mode depending
on the config, instead of hardcoding `AuthCredentialsStoreMode::File`.
2025-10-27 12:10:11 -07:00
jif-oai
3e50f94d76 feat: support verbosity in model_family (#5821) 2025-10-27 18:46:30 +00:00
Celia Chen
eb5b1b627f [Auth] Introduce New Auth Storage Abstraction for Codex CLI (#5569)
This PR introduces a new `Auth Storage` abstraction layer that takes
care of read, write, and load of auth tokens based on the
AuthCredentialsStoreMode. It is similar to how we handle MCP client
oauth
[here](https://github.com/openai/codex/blob/main/codex-rs/rmcp-client/src/oauth.rs).
Rather than reading and writing auth tokens directly on disk, Codex CLI
workflows should now go through this auth storage using the public
helper functions.

This PR is just a refactor of the current code so the behavior stays the
same. We will add support for keyring and hybrid mode in follow-up PRs.

I have read the CLA Document and I hereby sign the CLA
2025-10-27 11:01:14 -07:00
Eric Traut
0c1ff1d3fd Made token refresh code resilient to missing id_token (#5782)
This PR does the following:
1. Changes `try_refresh_token` to handle the case where the endpoint
returns a response without an `id_token`. The OpenID spec indicates that
this field is optional and clients should not assume it's present.
2. Changes the `attempt_stream_responses` to propagate token refresh
errors rather than silently ignoring them.
3. Fixes a typo in a couple of error messages (unrelated to the above,
but something I noticed in passing) - "reconnect" should be spelled
without a hyphen.
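
Item 1 can be illustrated with a small sketch. The struct and function names are hypothetical, and keeping the previous value when a field is absent is one plausible handling, not necessarily what `try_refresh_token` does.

```rust
/// Tokens currently held by the client (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
struct Tokens {
    id_token: String,
    access_token: String,
    refresh_token: String,
}

/// Refresh response in which every field, including `id_token`, may be absent.
struct RefreshResponse {
    id_token: Option<String>,
    access_token: Option<String>,
    refresh_token: Option<String>,
}

/// Merge a refresh response into the current tokens, keeping the old value
/// for any field the endpoint did not return (an assumption of this sketch).
fn apply_refresh(current: &Tokens, resp: RefreshResponse) -> Tokens {
    Tokens {
        id_token: resp.id_token.unwrap_or_else(|| current.id_token.clone()),
        access_token: resp
            .access_token
            .unwrap_or_else(|| current.access_token.clone()),
        refresh_token: resp
            .refresh_token
            .unwrap_or_else(|| current.refresh_token.clone()),
    }
}
```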

This PR does not implement the additional suggestion from @pakrym-oai
that we should sign out when receiving `refresh_token_expired` from the
refresh endpoint. Leaving this as a follow-on because I'm undecided on
whether this should be implemented in `try_refresh_token` or its
callers.
2025-10-27 10:09:53 -07:00
jif-oai
aea7610c76 feat: image resizing (#5446)
Add image resizing on the client side to reduce load on the API
2025-10-27 16:58:10 +00:00
jif-oai
775fbba6e0 feat: return an error if unknown enabled/disabled feature (#5817) 2025-10-27 16:53:00 +00:00
Michael Bolin
5ee8a17b4e feat: introduce GetConversationSummary RPC (#5803)
This adds an RPC to the app server to get the `ConversationSummary` via
a rollout path. Now that the VS Code extension supports showing the
Codex UI in an editor panel where the URI of the panel maps to the
rollout file, we need to be able to get the `ConversationSummary` from
the rollout file directly.
2025-10-27 09:11:45 -07:00
jif-oai
81be54b229 fix: test yield time (#5811) 2025-10-27 11:57:29 +00:00
jif-oai
5e8659dcbc chore: undo nits (#5631) 2025-10-27 11:48:01 +00:00
jif-oai
2338294b39 nit: doc on session task (#5809) 2025-10-27 11:43:33 +00:00
jif-oai
afc4eaab8b feat: TUI undo op (#5629) 2025-10-27 10:55:29 +00:00
jif-oai
e92c4f6561 feat: async ghost commit (#5618) 2025-10-27 10:09:10 +00:00
Michael Bolin
15fa2283e7 feat: update NewConversationParams to take an optional model_provider (#5793)
An AppServer client should be able to use any (`model_provider`, `model`) in the user's config. `NewConversationParams` already supported specifying the `model`, but this PR expands it to support `model_provider`, as well.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/5793).
* #5803
* __->__ #5793
2025-10-27 09:33:30 +00:00
Michael Bolin
5907422d65 feat: annotate conversations with model_provider for filtering (#5658)
Because conversations that use the Responses API can have encrypted
reasoning messages, trying to resume a conversation with a different
provider could lead to confusing "failed to decrypt" errors. (This is
reproducible by starting a conversation using ChatGPT login and resuming
it as a conversation that uses OpenAI models via Azure.)

This changes `ListConversationsParams` to take a `model_providers:
Option<Vec<String>>` and adds `model_provider` on each
`ConversationSummary` it returns so these cases can be disambiguated.
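
The filter semantics documented on `model_providers` in the diff (`None` filters by the server's default provider, an empty list disables filtering, a non-empty list restricts to the listed providers) can be sketched as a predicate. `provider_matches` and its parameters are hypothetical names for this example.

```rust
/// Decide whether a session recorded with `session_provider` passes the filter.
/// Hypothetical helper mirroring the documented `model_providers` semantics.
fn provider_matches(
    filter: Option<&[String]>,
    default_provider: &str,
    session_provider: &str,
) -> bool {
    match filter {
        // No filter given: only sessions from the server's default provider.
        None => session_provider == default_provider,
        // Empty list: include sessions from every provider.
        Some([]) => true,
        // Otherwise: include sessions whose provider is in the list.
        Some(list) => list.iter().any(|p| p == session_provider),
    }
}
```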

Note this ended up making changes to
`codex-rs/core/src/rollout/tests.rs` because it had a number of cases
where it expected `Some` for the value of `next_cursor`, but the list of
rollouts was complete, so according to this docstring:


bcd64c7e72/codex-rs/app-server-protocol/src/protocol.rs (L334-L337)

If there are no more items to return, then `next_cursor` should be
`None`. This PR updates that logic.
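
A minimal sketch of the cursor rule, assuming a plain index-based cursor for simplicity. The real cursor is opaque, and `page` is a hypothetical function.

```rust
/// Return one page of `items` starting at `start`, plus the cursor for the
/// next page. The cursor is `None` once the list is exhausted.
fn page<T: Clone>(items: &[T], start: usize, page_size: usize) -> (Vec<T>, Option<usize>) {
    let end = (start + page_size).min(items.len());
    // Clamp so an out-of-range cursor yields an empty page instead of a panic.
    let start = start.min(end);
    let page = items[start..end].to_vec();
    let next_cursor = if end < items.len() { Some(end) } else { None };
    (page, next_cursor)
}
```

A page that happens to end exactly at the last item returns `None`, which is the case the updated tests cover.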






---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/5658).
* #5803
* #5793
* __->__ #5658
2025-10-27 02:03:30 -07:00
Ahmed Ibrahim
f178805252 Add feedback upload request handling (#5682) 2025-10-27 05:53:39 +00:00
Michael Bolin
a55b0c4bcc fix: revert "[app-server] fix account/read response annotation (#5642)" (#5796)
Revert #5642 because this generates:

```
// GENERATED CODE! DO NOT MODIFY BY HAND!

// This file was generated by [ts-rs](https://github.com/Aleph-Alpha/ts-rs). Do not edit this file manually.

export type GetAccountResponse = Account | null;
```

But `Account` is unknown.

The unique use of `#[ts(export)]` on `GetAccountResponse` is also
suspicious as are the changes to
`codex-rs/app-server-protocol/src/export.rs` since the existing system
has worked fine for quite some time.

Though a pure backout of #5642 puts things in a state where, as the PR
noted, the following does not work:

```
cargo run -p codex-app-server-protocol --bin export -- --out DIR
```

So in addition to the backout, this PR adds:

```rust
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct GetAccountResponse {
    pub account: Account,
}
```

and changes `GetAccount.response` as follows:

```diff
-        response: Option<Account>,
+        response: GetAccountResponse,
```

making it consistent with other types.

With this change, I verified that both of the following work:

```
just codex generate-ts --out /tmp/somewhere
cargo run -p codex-app-server-protocol --bin export -- --out /tmp/somewhere-else
```

The generated TypeScript is as follows:

```typescript
// GetAccountResponse.ts
import type { Account } from "./Account";

export type GetAccountResponse = { account: Account, };
```

and

```typescript
// Account.ts
import type { PlanType } from "./PlanType";

export type Account = { "type": "ApiKey", api_key: string, } | { "type": "chatgpt", email: string | null, plan_type: PlanType, };
```

The inconsistency between `"type": "ApiKey"` and `"type": "chatgpt"` is
quite concerning, though I'm not sure that format is ever written to
disk. @owenlin0, I would recommend looking into that.

Also, it appears that the types in `codex-rs/protocol/src/account.rs`
are used exclusively by the `app-server-protocol` crate, so perhaps they
should just be moved there?
2025-10-26 18:57:42 -07:00
Thibault Sottiaux
224222f09f fix: use codex-exp prefix for experimental models and consider codex- models to be production (#5797) 2025-10-27 01:55:12 +00:00
Gabriel Peal
7aab45e060 [MCP] Minor docs clarifications around stdio tokens (#5676)
Noticed
[here](https://github.com/openai/codex/issues/4707#issuecomment-3446547561)
2025-10-26 13:38:30 -04:00
Eric Traut
bcd64c7e72 Reduced runtime of unit test that was taking multiple minutes (#5688)
Modified `build_compacted_history_truncates_overlong_user_messages` test
to reduce runtime from minutes to tens of seconds
2025-10-25 23:46:08 -07:00
Eric Traut
c124f24354 Added support for sandbox_mode in profiles (#5686)
Currently, `approval_policy` is supported in profiles, but
`sandbox_mode` is not. This PR adds support for `sandbox_mode`.

Note: a fix for this was submitted in [this
PR](https://github.com/openai/codex/pull/2397), but the underlying code
has changed significantly since then.

This addresses issue #3034
2025-10-25 16:52:26 -07:00
pakrym-oai
c7e4e6d0ee Skip flaky test (#5680)
Did an investigation but couldn't find anything obvious. Let's skip for
now.
2025-10-25 12:11:16 -07:00
Ahmed Ibrahim
88abbf58ce Followup feedback (#5663)
- Added files to be uploaded
- Refactored
- Updated title
2025-10-25 06:07:40 +00:00
Ahmed Ibrahim
71f838389b Improve feedback (#5661)
<img width="1099" height="153" alt="image"
src="https://github.com/user-attachments/assets/2c901884-8baf-4b1b-b2c4-bcb61ff42be8"
/>

<img width="1082" height="125" alt="image"
src="https://github.com/user-attachments/assets/6336e6c9-9ace-46df-a383-a807ceffa524"
/>

<img width="1102" height="103" alt="image"
src="https://github.com/user-attachments/assets/78883682-7e44-4fa3-9e04-57f7df4766fd"
/>
2025-10-24 22:28:14 -07:00
Eric Traut
0533bd2e7c Fixed flaky unit test (#5654)
This PR fixes a test that is sporadically failing in CI.

The problem is that two unit tests (the older `login_and_cancel_chatgpt`
and a recently added
`login_chatgpt_includes_forced_workspace_query_param`) exercise code
paths that start the login server. The server binds to a hard-coded
localhost port number, so attempts to start more than one server at the
same time will fail. If these two tests happen to run concurrently, one
of them will fail.

To fix this, I've added a simple mutex. We can use this same mutex for
future tests that use the same pattern.
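
The pattern looks roughly like this. It is a sketch: `LOGIN_PORT_LOCK` and `lock_login_port` are hypothetical names, and the actual test code may be structured differently.

```rust
use std::sync::{Mutex, MutexGuard};

/// Process-wide lock shared by every test that starts the login server.
static LOGIN_PORT_LOCK: Mutex<()> = Mutex::new(());

/// Acquire the lock for the duration of a test. Hold the returned guard
/// until the server has shut down.
fn lock_login_port() -> MutexGuard<'static, ()> {
    // A panicking test poisons the mutex; recover the guard so that one
    // failure does not cascade into every later test.
    LOGIN_PORT_LOCK
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}
```

Each affected test would start with `let _guard = lock_login_port();`, so the two tests never bind the port at the same time.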
2025-10-24 16:31:24 -07:00
Anton Panasenko
6af83d86ff [codex][app-server] introduce codex/event/raw_item events (#5578) 2025-10-24 22:41:52 +00:00
Gabriel Peal
e2e1b65da6 [MCP] Properly gate login after mcp add with experimental_use_rmcp_client (#5653)
There was supposed to be a check here like in other places.
2025-10-24 18:32:15 -04:00
Gabriel Peal
817d1508bc [MCP] Redact environment variable values in /mcp and mcp get (#5648)
Fixes #5524
2025-10-24 18:30:20 -04:00
Eric Traut
f8af4f5c8d Added model summary and risk assessment for commands that violate sandbox policy (#5536)
This PR adds support for a model-based summary and risk assessment for
commands that violate the sandbox policy and require user approval. This
aids the user in evaluating whether the command should be approved.

The feature works by taking a failed command and passing it back to the
model and asking it to summarize the command, give it a risk level (low,
medium, high) and a risk category (e.g. "data deletion" or "data
exfiltration"). It uses a new conversation thread so the context in the
existing thread doesn't influence the answer. If the call to the model
fails or takes longer than 5 seconds, it falls back to the current
behavior.
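
The timeout-and-fallback behavior can be sketched as follows. To stay dependency-free, this example uses threads and a channel instead of the async runtime the real code presumably uses; `Assessment`, `RiskLevel`, and `assess_with_timeout` are hypothetical names.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

#[derive(Debug, Clone, PartialEq)]
enum RiskLevel {
    Low,
    Medium,
    High,
}

#[derive(Debug, Clone, PartialEq)]
struct Assessment {
    summary: String,
    level: RiskLevel,
    category: String,
}

/// Run `assess` on a worker thread and wait at most `timeout` for its result.
/// Returns `None` on error or timeout so the caller can fall back to the
/// plain approval prompt.
fn assess_with_timeout<F>(assess: F, timeout: Duration) -> Option<Assessment>
where
    F: FnOnce() -> Result<Assessment, String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Ignore send errors: the receiver may have given up after the timeout.
        let _ = tx.send(assess());
    });
    match rx.recv_timeout(timeout) {
        Ok(Ok(assessment)) => Some(assessment),
        // Model error, timeout, or a panicked worker: no assessment.
        _ => None,
    }
}
```

A caller would pass `Duration::from_secs(5)` to match the limit described above.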

For now, this is an experimental feature and is gated by a config key
`experimental_sandbox_command_assessment`.

Here is a screenshot of the approval prompt showing the risk assessment
and summary.

<img width="723" height="282" alt="image"
src="https://github.com/user-attachments/assets/4597dd7c-d5a0-4e9f-9d13-414bd082fd6b"
/>
2025-10-24 15:23:44 -07:00
144 changed files with 7389 additions and 1573 deletions

56
codex-rs/Cargo.lock generated

@@ -843,6 +843,7 @@ dependencies = [
"codex-backend-client",
"codex-common",
"codex-core",
"codex-feedback",
"codex-file-search",
"codex-login",
"codex-protocol",
@@ -853,6 +854,7 @@ dependencies = [
"pretty_assertions",
"serde",
"serde_json",
"serial_test",
"tempfile",
"tokio",
"toml",
@@ -1061,10 +1063,13 @@ dependencies = [
"codex-apply-patch",
"codex-async-utils",
"codex-file-search",
"codex-git-tooling",
"codex-keyring-store",
"codex-otel",
"codex-protocol",
"codex-rmcp-client",
"codex-utils-pty",
"codex-utils-readiness",
"codex-utils-string",
"codex-utils-tokenizer",
"core-foundation 0.9.4",
@@ -1076,7 +1081,9 @@ dependencies = [
"eventsource-stream",
"futures",
"http",
"image",
"indexmap 2.10.0",
"keyring",
"landlock",
"libc",
"maplit",
@@ -1093,6 +1100,7 @@ dependencies = [
"serde_json",
"serial_test",
"sha1",
"sha2",
"shlex",
"similar",
"strum_macros 0.27.2",
@@ -1208,11 +1216,22 @@ version = "0.0.0"
dependencies = [
"assert_matches",
"pretty_assertions",
"schemars 0.8.22",
"serde",
"tempfile",
"thiserror 2.0.16",
"ts-rs",
"walkdir",
]
[[package]]
name = "codex-keyring-store"
version = "0.0.0"
dependencies = [
"keyring",
"tracing",
]
[[package]]
name = "codex-linux-sandbox"
version = "0.0.0"
@@ -1327,6 +1346,8 @@ version = "0.0.0"
dependencies = [
"anyhow",
"base64",
"codex-git-tooling",
"codex-utils-image",
"icu_decimal",
"icu_locale_core",
"mcp-types",
@@ -1376,6 +1397,7 @@ version = "0.0.0"
dependencies = [
"anyhow",
"axum",
"codex-keyring-store",
"codex-protocol",
"dirs",
"escargot",
@@ -1427,7 +1449,6 @@ dependencies = [
"codex-core",
"codex-feedback",
"codex-file-search",
"codex-git-tooling",
"codex-login",
"codex-ollama",
"codex-protocol",
@@ -1472,6 +1493,27 @@ dependencies = [
"vt100",
]
[[package]]
name = "codex-utils-cache"
version = "0.0.0"
dependencies = [
"lru",
"sha1",
"tokio",
]
[[package]]
name = "codex-utils-image"
version = "0.0.0"
dependencies = [
"base64",
"codex-utils-cache",
"image",
"tempfile",
"thiserror 2.0.16",
"tokio",
]
[[package]]
name = "codex-utils-json-to-toml"
version = "0.0.0"
@@ -5457,9 +5499,9 @@ dependencies = [
[[package]]
name = "serde"
version = "1.0.226"
version = "1.0.228"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0dca6411025b24b60bfa7ec1fe1f8e710ac09782dca409ee8237ba74b51295fd"
checksum = "9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e"
dependencies = [
"serde_core",
"serde_derive",
@@ -5467,18 +5509,18 @@ dependencies = [
[[package]]
name = "serde_core"
version = "1.0.226"
version = "1.0.228"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ba2ba63999edb9dac981fb34b3e5c0d111a69b0924e253ed29d83f7c99e966a4"
checksum = "41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad"
dependencies = [
"serde_derive",
]
[[package]]
name = "serde_derive"
version = "1.0.226"
version = "1.0.228"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8db53ae22f34573731bafa1db20f04027b2d25e02d8205921b569171699cdb33"
checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79"
dependencies = [
"proc-macro2",
"quote",


@@ -16,6 +16,7 @@ members = [
"core",
"exec",
"execpolicy",
"keyring-store",
"file-search",
"git-tooling",
"linux-sandbox",
@@ -32,9 +33,11 @@ members = [
"otel",
"tui",
"git-apply",
"utils/cache",
"utils/image",
"utils/json-to-toml",
"utils/readiness",
"utils/pty",
"utils/readiness",
"utils/string",
"utils/tokenizer",
]
@@ -65,6 +68,7 @@ codex-exec = { path = "exec" }
codex-feedback = { path = "feedback" }
codex-file-search = { path = "file-search" }
codex-git-tooling = { path = "git-tooling" }
codex-keyring-store = { path = "keyring-store" }
codex-linux-sandbox = { path = "linux-sandbox" }
codex-login = { path = "login" }
codex-mcp-server = { path = "mcp-server" }
@@ -77,6 +81,8 @@ codex-responses-api-proxy = { path = "responses-api-proxy" }
codex-rmcp-client = { path = "rmcp-client" }
codex-stdio-to-uds = { path = "stdio-to-uds" }
codex-tui = { path = "tui" }
codex-utils-cache = { path = "utils/cache" }
codex-utils-image = { path = "utils/image" }
codex-utils-json-to-toml = { path = "utils/json-to-toml" }
codex-utils-pty = { path = "utils/pty" }
codex-utils-readiness = { path = "utils/readiness" }
@@ -129,6 +135,7 @@ landlock = "0.4.1"
lazy_static = "1"
libc = "0.2.175"
log = "0.4"
lru = "0.12.5"
maplit = "1.0.2"
mime_guess = "2.0.5"
multimap = "0.10.0"


@@ -23,7 +23,6 @@ use std::io::Write;
use std::path::Path;
use std::path::PathBuf;
use std::process::Command;
use ts_rs::ExportError;
use ts_rs::TS;
const HEADER: &str = "// GENERATED CODE! DO NOT MODIFY BY HAND!\n\n";
@@ -105,19 +104,6 @@ macro_rules! for_each_schema_type {
};
}
fn export_ts_with_context<F>(label: &str, export: F) -> Result<()>
where
F: FnOnce() -> std::result::Result<(), ExportError>,
{
match export() {
Ok(()) => Ok(()),
Err(ExportError::CannotBeExported(ty)) => Err(anyhow!(
"failed to export {label}: dependency {ty} cannot be exported"
)),
Err(err) => Err(err.into()),
}
}
pub fn generate_types(out_dir: &Path, prettier: Option<&Path>) -> Result<()> {
generate_ts(out_dir, prettier)?;
generate_json(out_dir)?;
@@ -127,17 +113,13 @@ pub fn generate_types(out_dir: &Path, prettier: Option<&Path>) -> Result<()> {
pub fn generate_ts(out_dir: &Path, prettier: Option<&Path>) -> Result<()> {
ensure_dir(out_dir)?;
export_ts_with_context("ClientRequest", || ClientRequest::export_all_to(out_dir))?;
export_ts_with_context("client responses", || export_client_responses(out_dir))?;
export_ts_with_context("ClientNotification", || {
ClientNotification::export_all_to(out_dir)
})?;
ClientRequest::export_all_to(out_dir)?;
export_client_responses(out_dir)?;
ClientNotification::export_all_to(out_dir)?;
export_ts_with_context("ServerRequest", || ServerRequest::export_all_to(out_dir))?;
export_ts_with_context("server responses", || export_server_responses(out_dir))?;
export_ts_with_context("ServerNotification", || {
ServerNotification::export_all_to(out_dir)
})?;
ServerRequest::export_all_to(out_dir)?;
export_server_responses(out_dir)?;
ServerNotification::export_all_to(out_dir)?;
generate_index_ts(out_dir)?;


@@ -5,7 +5,7 @@ use crate::JSONRPCNotification;
use crate::JSONRPCRequest;
use crate::RequestId;
use codex_protocol::ConversationId;
use codex_protocol::account::Account;
use codex_protocol::account::PlanType;
use codex_protocol::config_types::ForcedLoginMethod;
use codex_protocol::config_types::ReasoningEffort;
use codex_protocol::config_types::ReasoningSummary;
@@ -17,6 +17,7 @@ use codex_protocol::protocol::EventMsg;
use codex_protocol::protocol::FileChange;
use codex_protocol::protocol::RateLimitSnapshot;
use codex_protocol::protocol::ReviewDecision;
use codex_protocol::protocol::SandboxCommandAssessment;
use codex_protocol::protocol::SandboxPolicy;
use codex_protocol::protocol::TurnAbortReason;
use paste::paste;
@@ -123,6 +124,13 @@ client_request_definitions! {
response: GetAccountRateLimitsResponse,
},
#[serde(rename = "feedback/upload")]
#[ts(rename = "feedback/upload")]
UploadFeedback {
params: UploadFeedbackParams,
response: UploadFeedbackResponse,
},
#[serde(rename = "account/read")]
#[ts(rename = "account/read")]
GetAccount {
@@ -139,6 +147,10 @@ client_request_definitions! {
params: NewConversationParams,
response: NewConversationResponse,
},
GetConversationSummary {
params: GetConversationSummaryParams,
response: GetConversationSummaryResponse,
},
/// List recorded Codex conversations (rollouts) with optional pagination and search.
ListConversations {
params: ListConversationsParams,
@@ -224,6 +236,28 @@ client_request_definitions! {
},
}
#[derive(Debug, Clone, PartialEq, Deserialize, Serialize, JsonSchema, TS)]
#[serde(tag = "type", rename_all = "camelCase")]
#[ts(tag = "type")]
pub enum Account {
#[serde(rename = "apiKey", rename_all = "camelCase")]
#[ts(rename = "apiKey", rename_all = "camelCase")]
ApiKey { api_key: String },
#[serde(rename = "chatgpt", rename_all = "camelCase")]
#[ts(rename = "chatgpt", rename_all = "camelCase")]
ChatGpt {
email: Option<String>,
plan_type: PlanType,
},
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct GetAccountResponse {
pub account: Account,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Default, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct InitializeParams {
@@ -252,6 +286,10 @@ pub struct NewConversationParams {
#[serde(skip_serializing_if = "Option::is_none")]
pub model: Option<String>,
/// Override the model provider to use for this session.
#[serde(skip_serializing_if = "Option::is_none")]
pub model_provider: Option<String>,
/// Configuration profile from config.toml to specify default options.
#[serde(skip_serializing_if = "Option::is_none")]
pub profile: Option<String>,
@@ -304,6 +342,18 @@ pub struct ResumeConversationResponse {
pub initial_messages: Option<Vec<EventMsg>>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct GetConversationSummaryParams {
pub rollout_path: PathBuf,
}
#[derive(Serialize, Deserialize, Debug, Clone, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct GetConversationSummaryResponse {
pub summary: ConversationSummary,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Default, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct ListConversationsParams {
@@ -313,6 +363,12 @@ pub struct ListConversationsParams {
/// Opaque pagination cursor returned by a previous call.
#[serde(skip_serializing_if = "Option::is_none")]
pub cursor: Option<String>,
/// Optional model provider filter (matches against session metadata).
/// - None => filter by the server's default model provider
/// - Some([]) => no filtering, include all providers
/// - Some([...]) => only include sessions with one of the specified providers
#[serde(skip_serializing_if = "Option::is_none")]
pub model_providers: Option<Vec<String>>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
@@ -324,6 +380,8 @@ pub struct ConversationSummary {
/// RFC3339 timestamp string for the session start, if available.
#[serde(skip_serializing_if = "Option::is_none")]
pub timestamp: Option<String>,
/// Model provider recorded for the session (resolved when absent in metadata).
pub model_provider: String,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
@@ -377,6 +435,23 @@ pub struct ListModelsResponse {
pub next_cursor: Option<String>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct UploadFeedbackParams {
pub classification: String,
#[serde(skip_serializing_if = "Option::is_none")]
pub reason: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub conversation_id: Option<ConversationId>,
pub include_logs: bool,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct UploadFeedbackResponse {
pub thread_id: String,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(tag = "type")]
#[ts(tag = "type")]
@@ -534,12 +609,6 @@ pub struct GetAccountRateLimitsResponse {
pub rate_limits: RateLimitSnapshot,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(transparent)]
#[ts(export)]
#[ts(type = "Account | null")]
pub struct GetAccountResponse(#[ts(type = "Account | null")] pub Option<Account>);
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct GetAuthStatusResponse {
@@ -716,6 +785,8 @@ pub struct SendUserMessageResponse {}
#[serde(rename_all = "camelCase")]
pub struct AddConversationListenerParams {
pub conversation_id: ConversationId,
#[serde(default)]
pub experimental_raw_events: bool,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
@@ -847,6 +918,8 @@ pub struct ExecCommandApprovalParams {
pub cwd: PathBuf,
#[serde(skip_serializing_if = "Option::is_none")]
pub reason: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub risk: Option<SandboxCommandAssessment>,
pub parsed_cmd: Vec<ParsedCommand>,
}
@@ -995,6 +1068,7 @@ mod tests {
request_id: RequestId::Integer(42),
params: NewConversationParams {
model: Some("gpt-5-codex".to_string()),
model_provider: None,
profile: None,
cwd: None,
approval_policy: Some(AskForApproval::OnRequest),
@@ -1063,6 +1137,7 @@ mod tests {
command: vec!["echo".to_string(), "hello".to_string()],
cwd: PathBuf::from("/tmp"),
reason: Some("because tests".to_string()),
risk: None,
parsed_cmd: vec![ParsedCommand::Unknown {
cmd: "echo hello".to_string(),
}],
@@ -1187,6 +1262,35 @@ mod tests {
Ok(())
}
#[test]
fn account_serializes_fields_in_camel_case() -> Result<()> {
let api_key = Account::ApiKey {
api_key: "secret".to_string(),
};
assert_eq!(
json!({
"type": "apiKey",
"apiKey": "secret",
}),
serde_json::to_value(&api_key)?,
);
let chatgpt = Account::ChatGpt {
email: Some("user@example.com".to_string()),
plan_type: PlanType::Plus,
};
assert_eq!(
json!({
"type": "chatgpt",
"email": "user@example.com",
"planType": "plus",
}),
serde_json::to_value(&chatgpt)?,
);
Ok(())
}
#[test]
fn serialize_list_models() -> Result<()> {
let request = ClientRequest::ListModels {


@@ -24,6 +24,7 @@ codex-file-search = { workspace = true }
codex-login = { workspace = true }
codex-protocol = { workspace = true }
codex-app-server-protocol = { workspace = true }
codex-feedback = { workspace = true }
codex-utils-json-to-toml = { workspace = true }
chrono = { workspace = true }
serde = { workspace = true, features = ["derive"] }
@@ -47,6 +48,7 @@ base64 = { workspace = true }
core_test_support = { workspace = true }
os_info = { workspace = true }
pretty_assertions = { workspace = true }
serial_test = { workspace = true }
tempfile = { workspace = true }
toml = { workspace = true }
wiremock = { workspace = true }


@@ -21,6 +21,8 @@ use codex_app_server_protocol::ExecOneOffCommandResponse;
use codex_app_server_protocol::FuzzyFileSearchParams;
use codex_app_server_protocol::FuzzyFileSearchResponse;
use codex_app_server_protocol::GetAccountRateLimitsResponse;
use codex_app_server_protocol::GetConversationSummaryParams;
use codex_app_server_protocol::GetConversationSummaryResponse;
use codex_app_server_protocol::GetUserAgentResponse;
use codex_app_server_protocol::GetUserSavedConfigResponse;
use codex_app_server_protocol::GitDiffToRemoteResponse;
@@ -52,6 +54,8 @@ use codex_app_server_protocol::ServerRequestPayload;
use codex_app_server_protocol::SessionConfiguredNotification;
use codex_app_server_protocol::SetDefaultModelParams;
use codex_app_server_protocol::SetDefaultModelResponse;
use codex_app_server_protocol::UploadFeedbackParams;
use codex_app_server_protocol::UploadFeedbackResponse;
use codex_app_server_protocol::UserInfoResponse;
use codex_app_server_protocol::UserSavedConfig;
use codex_backend_client::Client as BackendClient;
@@ -64,9 +68,7 @@ use codex_core::NewConversation;
use codex_core::RolloutRecorder;
use codex_core::SessionMeta;
use codex_core::auth::CLIENT_ID;
use codex_core::auth::get_auth_file;
use codex_core::auth::login_with_api_key;
use codex_core::auth::try_read_auth_json;
use codex_core::config::Config;
use codex_core::config::ConfigOverrides;
use codex_core::config::ConfigToml;
@@ -85,6 +87,8 @@ use codex_core::protocol::EventMsg;
use codex_core::protocol::ExecApprovalRequestEvent;
use codex_core::protocol::Op;
use codex_core::protocol::ReviewDecision;
use codex_core::read_head_for_summary;
use codex_feedback::CodexFeedback;
use codex_login::ServerOptions as LoginServerOptions;
use codex_login::ShutdownHandle;
use codex_login::run_login_server;
@@ -98,6 +102,8 @@ use codex_protocol::user_input::UserInput as CoreInputItem;
use codex_utils_json_to_toml::json_to_toml;
use std::collections::HashMap;
use std::ffi::OsStr;
use std::io::Error as IoError;
use std::path::Path;
use std::path::PathBuf;
use std::sync::Arc;
use std::sync::atomic::AtomicBool;
@@ -136,6 +142,7 @@ pub(crate) struct CodexMessageProcessor {
// Queue of pending interrupt requests per conversation. We reply when TurnAborted arrives.
pending_interrupts: Arc<Mutex<HashMap<ConversationId, Vec<RequestId>>>>,
pending_fuzzy_searches: Arc<Mutex<HashMap<String, Arc<AtomicBool>>>>,
feedback: CodexFeedback,
}
impl CodexMessageProcessor {
@@ -145,6 +152,7 @@ impl CodexMessageProcessor {
outgoing: Arc<OutgoingMessageSender>,
codex_linux_sandbox_exe: Option<PathBuf>,
config: Arc<Config>,
feedback: CodexFeedback,
) -> Self {
Self {
auth_manager,
@@ -156,6 +164,7 @@ impl CodexMessageProcessor {
active_login: Arc::new(Mutex::new(None)),
pending_interrupts: Arc::new(Mutex::new(HashMap::new())),
pending_fuzzy_searches: Arc::new(Mutex::new(HashMap::new())),
feedback,
}
}
@@ -170,6 +179,9 @@ impl CodexMessageProcessor {
// created before processing any subsequent messages.
self.process_new_conversation(request_id, params).await;
}
ClientRequest::GetConversationSummary { request_id, params } => {
self.get_conversation_summary(request_id, params).await;
}
ClientRequest::ListConversations { request_id, params } => {
self.handle_list_conversations(request_id, params).await;
}
@@ -275,6 +287,9 @@ impl CodexMessageProcessor {
} => {
self.get_account_rate_limits(request_id).await;
}
ClientRequest::UploadFeedback { request_id, params } => {
self.upload_feedback(request_id, params).await;
}
}
}
@@ -654,12 +669,8 @@ impl CodexMessageProcessor {
}
async fn get_user_info(&self, request_id: RequestId) {
// Read alleged user email from auth.json (best-effort; not verified).
let auth_path = get_auth_file(&self.config.codex_home);
let alleged_user_email = match try_read_auth_json(&auth_path) {
Ok(auth) => auth.tokens.and_then(|t| t.id_token.email),
Err(_) => None,
};
// Read alleged user email from cached auth (best-effort; not verified).
let alleged_user_email = self.auth_manager.auth().and_then(|a| a.get_account_email());
let response = UserInfoResponse { alleged_user_email };
self.outgoing.send_response(request_id, response).await;
@@ -813,24 +824,76 @@ impl CodexMessageProcessor {
}
}
async fn get_conversation_summary(
&self,
request_id: RequestId,
params: GetConversationSummaryParams,
) {
let GetConversationSummaryParams { rollout_path } = params;
let path = if rollout_path.is_relative() {
self.config.codex_home.join(&rollout_path)
} else {
rollout_path.clone()
};
let fallback_provider = self.config.model_provider_id.as_str();
match read_summary_from_rollout(&path, fallback_provider).await {
Ok(summary) => {
let response = GetConversationSummaryResponse { summary };
self.outgoing.send_response(request_id, response).await;
}
Err(err) => {
let error = JSONRPCErrorError {
code: INTERNAL_ERROR_CODE,
message: format!(
"failed to load conversation summary from {}: {}",
path.display(),
err
),
data: None,
};
self.outgoing.send_error(request_id, error).await;
}
}
}
async fn handle_list_conversations(
&self,
request_id: RequestId,
params: ListConversationsParams,
) {
let page_size = params.page_size.unwrap_or(25);
let ListConversationsParams {
page_size,
cursor,
model_providers: model_provider,
} = params;
let page_size = page_size.unwrap_or(25);
// Decode the optional cursor string to a Cursor via serde (Cursor implements Deserialize from string)
let cursor_obj: Option<RolloutCursor> = match params.cursor {
let cursor_obj: Option<RolloutCursor> = match cursor {
Some(s) => serde_json::from_str::<RolloutCursor>(&format!("\"{s}\"")).ok(),
None => None,
};
let cursor_ref = cursor_obj.as_ref();
let model_provider_filter = match model_provider {
Some(providers) => {
if providers.is_empty() {
None
} else {
Some(providers)
}
}
None => Some(vec![self.config.model_provider_id.clone()]),
};
let model_provider_slice = model_provider_filter.as_deref();
let fallback_provider = self.config.model_provider_id.clone();
let page = match RolloutRecorder::list_conversations(
&self.config.codex_home,
page_size,
cursor_ref,
INTERACTIVE_SESSION_SOURCES,
model_provider_slice,
fallback_provider.as_str(),
)
.await
{
@@ -849,7 +912,7 @@ impl CodexMessageProcessor {
let items = page
.items
.into_iter()
.filter_map(|it| extract_conversation_summary(it.path, &it.head))
.filter_map(|it| extract_conversation_summary(it.path, &it.head, &fallback_provider))
.collect();
// Encode next_cursor as a plain string
@@ -1256,7 +1319,10 @@ impl CodexMessageProcessor {
request_id: RequestId,
params: AddConversationListenerParams,
) {
let AddConversationListenerParams { conversation_id } = params;
let AddConversationListenerParams {
conversation_id,
experimental_raw_events,
} = params;
let Ok(conversation) = self
.conversation_manager
.get_conversation(conversation_id)
@@ -1293,6 +1359,11 @@ impl CodexMessageProcessor {
}
};
if let EventMsg::RawResponseItem(_) = &event.msg
&& !experimental_raw_events {
continue;
}
// For now, we send a notification for every event,
// JSON-serializing the `Event` as-is, but these should
// be migrated to be variants of `ServerNotification`
@@ -1410,6 +1481,77 @@ impl CodexMessageProcessor {
let response = FuzzyFileSearchResponse { files: results };
self.outgoing.send_response(request_id, response).await;
}
async fn upload_feedback(&self, request_id: RequestId, params: UploadFeedbackParams) {
let UploadFeedbackParams {
classification,
reason,
conversation_id,
include_logs,
} = params;
let snapshot = self.feedback.snapshot(conversation_id);
let thread_id = snapshot.thread_id.clone();
let validated_rollout_path = if include_logs {
match conversation_id {
Some(conv_id) => self.resolve_rollout_path(conv_id).await,
None => None,
}
} else {
None
};
let upload_result = tokio::task::spawn_blocking(move || {
let rollout_path_ref = validated_rollout_path.as_deref();
snapshot.upload_feedback(
&classification,
reason.as_deref(),
include_logs,
rollout_path_ref,
)
})
.await;
let upload_result = match upload_result {
Ok(result) => result,
Err(join_err) => {
let error = JSONRPCErrorError {
code: INTERNAL_ERROR_CODE,
message: format!("failed to upload feedback: {join_err}"),
data: None,
};
self.outgoing.send_error(request_id, error).await;
return;
}
};
match upload_result {
Ok(()) => {
let response = UploadFeedbackResponse { thread_id };
self.outgoing.send_response(request_id, response).await;
}
Err(err) => {
let error = JSONRPCErrorError {
code: INTERNAL_ERROR_CODE,
message: format!("failed to upload feedback: {err}"),
data: None,
};
self.outgoing.send_error(request_id, error).await;
}
}
}
async fn resolve_rollout_path(&self, conversation_id: ConversationId) -> Option<PathBuf> {
match self
.conversation_manager
.get_conversation(conversation_id)
.await
{
Ok(conv) => Some(conv.rollout_path()),
Err(_) => None,
}
}
}
async fn apply_bespoke_event_handling(
@@ -1447,6 +1589,7 @@ async fn apply_bespoke_event_handling(
command,
cwd,
reason,
risk,
parsed_cmd,
}) => {
let params = ExecCommandApprovalParams {
@@ -1455,6 +1598,7 @@ async fn apply_bespoke_event_handling(
command,
cwd,
reason,
risk,
parsed_cmd,
};
let rx = outgoing
@@ -1501,6 +1645,7 @@ async fn derive_config_from_params(
) -> std::io::Result<Config> {
let NewConversationParams {
model,
model_provider,
profile,
cwd,
approval_policy,
@@ -1516,13 +1661,14 @@ async fn derive_config_from_params(
cwd: cwd.map(PathBuf::from),
approval_policy,
sandbox_mode,
model_provider: None,
model_provider,
codex_linux_sandbox_exe,
base_instructions,
include_apply_patch_tool,
include_view_image_tool: None,
show_raw_agent_reasoning: None,
tools_web_search_request: None,
experimental_sandbox_command_assessment: None,
additional_writable_roots: Vec::new(),
};
@@ -1613,9 +1759,54 @@ async fn on_exec_approval_response(
}
}
async fn read_summary_from_rollout(
path: &Path,
fallback_provider: &str,
) -> std::io::Result<ConversationSummary> {
let head = read_head_for_summary(path).await?;
let Some(first) = head.first() else {
return Err(IoError::other(format!(
"rollout at {} is empty",
path.display()
)));
};
let session_meta = serde_json::from_value::<SessionMeta>(first.clone()).map_err(|_| {
IoError::other(format!(
"rollout at {} does not start with session metadata",
path.display()
))
})?;
if let Some(summary) =
extract_conversation_summary(path.to_path_buf(), &head, fallback_provider)
{
return Ok(summary);
}
let timestamp = if session_meta.timestamp.is_empty() {
None
} else {
Some(session_meta.timestamp.clone())
};
let model_provider = session_meta
.model_provider
.unwrap_or_else(|| fallback_provider.to_string());
Ok(ConversationSummary {
conversation_id: session_meta.id,
timestamp,
path: path.to_path_buf(),
preview: String::new(),
model_provider,
})
}
fn extract_conversation_summary(
path: PathBuf,
head: &[serde_json::Value],
fallback_provider: &str,
) -> Option<ConversationSummary> {
let session_meta = match head.first() {
Some(first_line) => serde_json::from_value::<SessionMeta>(first_line.clone()).ok()?,
@@ -1640,12 +1831,17 @@ fn extract_conversation_summary(
} else {
Some(session_meta.timestamp.clone())
};
let conversation_id = session_meta.id;
let model_provider = session_meta
.model_provider
.unwrap_or_else(|| fallback_provider.to_string());
Some(ConversationSummary {
conversation_id: session_meta.id,
conversation_id,
timestamp,
path,
preview: preview.to_string(),
model_provider,
})
}
@@ -1655,6 +1851,7 @@ mod tests {
use anyhow::Result;
use pretty_assertions::assert_eq;
use serde_json::json;
use tempfile::TempDir;
#[test]
fn extract_conversation_summary_prefers_plain_user_messages() -> Result<()> {
@@ -1669,7 +1866,8 @@ mod tests {
"cwd": "/",
"originator": "codex",
"cli_version": "0.0.0",
"instructions": null
"instructions": null,
"model_provider": "test-provider"
}),
json!({
"type": "message",
@@ -1689,15 +1887,62 @@ mod tests {
}),
];
let summary = extract_conversation_summary(path.clone(), &head).expect("summary");
let summary =
extract_conversation_summary(path.clone(), &head, "test-provider").expect("summary");
assert_eq!(summary.conversation_id, conversation_id);
assert_eq!(
summary.timestamp,
Some("2025-09-05T16:53:11.850Z".to_string())
);
assert_eq!(summary.path, path);
assert_eq!(summary.preview, "Count to 5");
let expected = ConversationSummary {
conversation_id,
timestamp,
path,
preview: "Count to 5".to_string(),
model_provider: "test-provider".to_string(),
};
assert_eq!(summary, expected);
Ok(())
}
#[tokio::test]
async fn read_summary_from_rollout_returns_empty_preview_when_no_user_message() -> Result<()> {
use codex_protocol::protocol::RolloutItem;
use codex_protocol::protocol::RolloutLine;
use codex_protocol::protocol::SessionMetaLine;
use std::fs;
let temp_dir = TempDir::new()?;
let path = temp_dir.path().join("rollout.jsonl");
let conversation_id = ConversationId::from_string("bfd12a78-5900-467b-9bc5-d3d35df08191")?;
let timestamp = "2025-09-05T16:53:11.850Z".to_string();
let session_meta = SessionMeta {
id: conversation_id,
timestamp: timestamp.clone(),
model_provider: None,
..SessionMeta::default()
};
let line = RolloutLine {
timestamp: timestamp.clone(),
item: RolloutItem::SessionMeta(SessionMetaLine {
meta: session_meta.clone(),
git: None,
}),
};
fs::write(&path, format!("{}\n", serde_json::to_string(&line)?))?;
let summary = read_summary_from_rollout(path.as_path(), "fallback").await?;
let expected = ConversationSummary {
conversation_id,
timestamp: Some(timestamp),
path: path.clone(),
preview: String::new(),
model_provider: "fallback".to_string(),
};
assert_eq!(summary, expected);
Ok(())
}
}
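
The summary lookup in this file combines two small fallbacks: a relative rollout path is joined onto the Codex home directory, and a missing `model_provider` in the session metadata is replaced by the configured default. A minimal standalone sketch of both, using only the standard library; the helper names `absolutize_rollout_path` and `provider_or_fallback` are hypothetical and do not appear in the diff:

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: mirrors the `is_relative()` branch in
/// `get_conversation_summary`. Relative paths are resolved against
/// the Codex home directory; absolute paths are returned unchanged.
fn absolutize_rollout_path(codex_home: &Path, rollout_path: &Path) -> PathBuf {
    if rollout_path.is_relative() {
        codex_home.join(rollout_path)
    } else {
        rollout_path.to_path_buf()
    }
}

/// Hypothetical helper: mirrors the `unwrap_or_else` on
/// `session_meta.model_provider`. Uses the recorded provider when
/// present, otherwise the caller-supplied fallback.
fn provider_or_fallback(recorded: Option<String>, fallback: &str) -> String {
    recorded.unwrap_or_else(|| fallback.to_string())
}
```

This matches what the `read_summary_from_rollout_returns_empty_preview_when_no_user_message` test exercises: metadata with `model_provider: None` yields a summary whose provider is the fallback string.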

@@ -12,16 +12,19 @@ use crate::message_processor::MessageProcessor;
use crate::outgoing_message::OutgoingMessage;
use crate::outgoing_message::OutgoingMessageSender;
use codex_app_server_protocol::JSONRPCMessage;
use codex_feedback::CodexFeedback;
use tokio::io::AsyncBufReadExt;
use tokio::io::AsyncWriteExt;
use tokio::io::BufReader;
use tokio::io::{self};
use tokio::sync::mpsc;
use tracing::Level;
use tracing::debug;
use tracing::error;
use tracing::info;
use tracing_subscriber::EnvFilter;
use tracing_subscriber::Layer;
use tracing_subscriber::filter::Targets;
use tracing_subscriber::layer::SubscriberExt;
use tracing_subscriber::util::SubscriberInitExt;
@@ -82,6 +85,8 @@ pub async fn run_main(
std::io::Error::new(ErrorKind::InvalidData, format!("error loading config: {e}"))
})?;
let feedback = CodexFeedback::new();
let otel =
codex_core::otel_init::build_provider(&config, env!("CARGO_PKG_VERSION")).map_err(|e| {
std::io::Error::new(
@@ -96,8 +101,15 @@ pub async fn run_main(
.with_writer(std::io::stderr)
.with_filter(EnvFilter::from_default_env());
let feedback_layer = tracing_subscriber::fmt::layer()
.with_writer(feedback.make_writer())
.with_ansi(false)
.with_target(false)
.with_filter(Targets::new().with_default(Level::TRACE));
let _ = tracing_subscriber::registry()
.with(stderr_fmt)
.with(feedback_layer)
.with(otel.as_ref().map(|provider| {
OpenTelemetryTracingBridge::new(&provider.logger).with_filter(
tracing_subscriber::filter::filter_fn(codex_core::otel_init::codex_export_filter),
@@ -112,6 +124,7 @@ pub async fn run_main(
outgoing_message_sender,
codex_linux_sandbox_exe,
std::sync::Arc::new(config),
feedback.clone(),
);
async move {
while let Some(msg) = incoming_rx.recv().await {

@@ -17,6 +17,7 @@ use codex_core::ConversationManager;
use codex_core::config::Config;
use codex_core::default_client::USER_AGENT_SUFFIX;
use codex_core::default_client::get_codex_user_agent;
use codex_feedback::CodexFeedback;
use codex_protocol::protocol::SessionSource;
use std::sync::Arc;
@@ -33,6 +34,7 @@ impl MessageProcessor {
outgoing: OutgoingMessageSender,
codex_linux_sandbox_exe: Option<PathBuf>,
config: Arc<Config>,
feedback: CodexFeedback,
) -> Self {
let outgoing = Arc::new(outgoing);
let auth_manager = AuthManager::shared(config.codex_home.clone(), false);
@@ -46,6 +48,7 @@ impl MessageProcessor {
outgoing.clone(),
codex_linux_sandbox_exe,
config,
feedback,
);
Self {

@@ -7,8 +7,7 @@ use base64::engine::general_purpose::URL_SAFE_NO_PAD;
use chrono::DateTime;
use chrono::Utc;
use codex_core::auth::AuthDotJson;
use codex_core::auth::get_auth_file;
use codex_core::auth::write_auth_json;
use codex_core::auth::save_auth;
use codex_core::token_data::TokenData;
use codex_core::token_data::parse_id_token;
use serde_json::json;
@@ -127,5 +126,5 @@ pub fn write_chatgpt_auth(codex_home: &Path, fixture: ChatGptAuthFixture) -> Res
last_refresh,
};
write_auth_json(&get_auth_file(codex_home), &auth).context("write auth.json")
save_auth(codex_home, &auth).context("write auth.json")
}

@@ -30,6 +30,7 @@ use codex_app_server_protocol::SendUserMessageParams;
use codex_app_server_protocol::SendUserTurnParams;
use codex_app_server_protocol::ServerRequest;
use codex_app_server_protocol::SetDefaultModelParams;
use codex_app_server_protocol::UploadFeedbackParams;
use codex_app_server_protocol::JSONRPCError;
use codex_app_server_protocol::JSONRPCMessage;
@@ -242,6 +243,15 @@ impl McpProcess {
self.send_request("account/rateLimits/read", None).await
}
/// Send a `feedback/upload` JSON-RPC request.
pub async fn send_upload_feedback_request(
&mut self,
params: UploadFeedbackParams,
) -> anyhow::Result<i64> {
let params = Some(serde_json::to_value(params)?);
self.send_request("feedback/upload", params).await
}
/// Send a `userInfo` JSON-RPC request.
pub async fn send_user_info_request(&mut self) -> anyhow::Result<i64> {
self.send_request("userInfo", None).await

@@ -103,7 +103,10 @@ async fn test_codex_jsonrpc_conversation_flow() {
// 2) addConversationListener
let add_listener_id = mcp
.send_add_conversation_listener_request(AddConversationListenerParams { conversation_id })
.send_add_conversation_listener_request(AddConversationListenerParams {
conversation_id,
experimental_raw_events: false,
})
.await
.expect("send addConversationListener");
let add_listener_resp: JSONRPCResponse = timeout(
@@ -252,7 +255,10 @@ async fn test_send_user_turn_changes_approval_policy_behavior() {
// 2) addConversationListener
let add_listener_id = mcp
.send_add_conversation_listener_request(AddConversationListenerParams { conversation_id })
.send_add_conversation_listener_request(AddConversationListenerParams {
conversation_id,
experimental_raw_events: false,
})
.await
.expect("send addConversationListener");
let _: AddConversationSubscriptionResponse =
@@ -311,6 +317,7 @@ async fn test_send_user_turn_changes_approval_policy_behavior() {
],
cwd: working_directory.clone(),
reason: None,
risk: None,
parsed_cmd: vec![ParsedCommand::Unknown {
cmd: "python3 -c 'print(42)'".to_string()
}],
@@ -458,7 +465,10 @@ async fn test_send_user_turn_updates_sandbox_and_cwd_between_turns() {
.expect("deserialize newConversation response");
let add_listener_id = mcp
.send_add_conversation_listener_request(AddConversationListenerParams { conversation_id })
.send_add_conversation_listener_request(AddConversationListenerParams {
conversation_id,
experimental_raw_events: false,
})
.await
.expect("send addConversationListener");
timeout(

@@ -67,7 +67,10 @@ async fn test_conversation_create_and_send_message_ok() {
// Add a listener so we receive notifications for this conversation (not strictly required for this test).
let add_listener_id = mcp
.send_add_conversation_listener_request(AddConversationListenerParams { conversation_id })
.send_add_conversation_listener_request(AddConversationListenerParams {
conversation_id,
experimental_raw_events: false,
})
.await
.expect("send addConversationListener");
let _sub: AddConversationSubscriptionResponse =

@@ -88,7 +88,10 @@ async fn shell_command_interruption() -> anyhow::Result<()> {
// 2) addConversationListener
let add_listener_id = mcp
.send_add_conversation_listener_request(AddConversationListenerParams { conversation_id })
.send_add_conversation_listener_request(AddConversationListenerParams {
conversation_id,
experimental_raw_events: false,
})
.await?;
let _add_listener_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,

@@ -30,18 +30,21 @@ async fn test_list_and_resume_conversations() {
"2025-01-02T12-00-00",
"2025-01-02T12:00:00Z",
"Hello A",
Some("openai"),
);
create_fake_rollout(
codex_home.path(),
"2025-01-01T13-00-00",
"2025-01-01T13:00:00Z",
"Hello B",
Some("openai"),
);
create_fake_rollout(
codex_home.path(),
"2025-01-01T12-00-00",
"2025-01-01T12:00:00Z",
"Hello C",
None,
);
let mut mcp = McpProcess::new(codex_home.path())
@@ -57,6 +60,7 @@ async fn test_list_and_resume_conversations() {
.send_list_conversations_request(ListConversationsParams {
page_size: Some(2),
cursor: None,
model_providers: None,
})
.await
.expect("send listConversations");
@@ -74,6 +78,8 @@ async fn test_list_and_resume_conversations() {
// Newest first; preview text should match
assert_eq!(items[0].preview, "Hello A");
assert_eq!(items[1].preview, "Hello B");
assert_eq!(items[0].model_provider, "openai");
assert_eq!(items[1].model_provider, "openai");
assert!(items[0].path.is_absolute());
assert!(next_cursor.is_some());
@@ -82,6 +88,7 @@ async fn test_list_and_resume_conversations() {
.send_list_conversations_request(ListConversationsParams {
page_size: Some(2),
cursor: next_cursor,
model_providers: None,
})
.await
.expect("send listConversations page 2");
@@ -99,7 +106,88 @@ async fn test_list_and_resume_conversations() {
} = to_response::<ListConversationsResponse>(resp2).expect("deserialize response");
assert_eq!(items2.len(), 1);
assert_eq!(items2[0].preview, "Hello C");
assert!(next2.is_some());
assert_eq!(items2[0].model_provider, "openai");
assert_eq!(next2, None);
// Add a conversation with an explicit non-OpenAI provider for filter tests.
create_fake_rollout(
codex_home.path(),
"2025-01-01T11-30-00",
"2025-01-01T11:30:00Z",
"Hello TP",
Some("test-provider"),
);
// Filtering by model provider should return only matching sessions.
let filter_req_id = mcp
.send_list_conversations_request(ListConversationsParams {
page_size: Some(10),
cursor: None,
model_providers: Some(vec!["test-provider".to_string()]),
})
.await
.expect("send listConversations filtered");
let filter_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(filter_req_id)),
)
.await
.expect("listConversations filtered timeout")
.expect("listConversations filtered resp");
let ListConversationsResponse {
items: filtered_items,
next_cursor: filtered_next,
} = to_response::<ListConversationsResponse>(filter_resp).expect("deserialize filtered");
assert_eq!(filtered_items.len(), 1);
assert_eq!(filtered_next, None);
assert_eq!(filtered_items[0].preview, "Hello TP");
assert_eq!(filtered_items[0].model_provider, "test-provider");
// Empty filter should include every session regardless of provider metadata.
let unfiltered_req_id = mcp
.send_list_conversations_request(ListConversationsParams {
page_size: Some(10),
cursor: None,
model_providers: Some(Vec::new()),
})
.await
.expect("send listConversations unfiltered");
let unfiltered_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(unfiltered_req_id)),
)
.await
.expect("listConversations unfiltered timeout")
.expect("listConversations unfiltered resp");
let ListConversationsResponse {
items: unfiltered_items,
next_cursor: unfiltered_next,
} = to_response::<ListConversationsResponse>(unfiltered_resp)
.expect("deserialize unfiltered response");
assert_eq!(unfiltered_items.len(), 4);
assert!(unfiltered_next.is_none());
let empty_req_id = mcp
.send_list_conversations_request(ListConversationsParams {
page_size: Some(10),
cursor: None,
model_providers: Some(vec!["other".to_string()]),
})
.await
.expect("send listConversations filtered empty");
let empty_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(empty_req_id)),
)
.await
.expect("listConversations filtered empty timeout")
.expect("listConversations filtered empty resp");
let ListConversationsResponse {
items: empty_items,
next_cursor: empty_next,
} = to_response::<ListConversationsResponse>(empty_resp).expect("deserialize filtered empty");
assert!(empty_items.is_empty());
assert!(empty_next.is_none());
// Now resume one of the sessions and expect a SessionConfigured notification and response.
let resume_req_id = mcp
@@ -152,7 +240,13 @@ async fn test_list_and_resume_conversations() {
assert!(!conversation_id.to_string().is_empty());
}
fn create_fake_rollout(codex_home: &Path, filename_ts: &str, meta_rfc3339: &str, preview: &str) {
fn create_fake_rollout(
codex_home: &Path,
filename_ts: &str,
meta_rfc3339: &str,
preview: &str,
model_provider: Option<&str>,
) {
let uuid = Uuid::new_v4();
// sessions/YYYY/MM/DD/ derived from filename_ts (YYYY-MM-DDThh-mm-ss)
let year = &filename_ts[0..4];
@@ -164,18 +258,22 @@ fn create_fake_rollout(codex_home: &Path, filename_ts: &str, meta_rfc3339: &str,
let file_path = dir.join(format!("rollout-{filename_ts}-{uuid}.jsonl"));
let mut lines = Vec::new();
// Meta line with timestamp (flattened meta in payload for new schema)
let mut payload = json!({
"id": uuid,
"timestamp": meta_rfc3339,
"cwd": "/",
"originator": "codex",
"cli_version": "0.0.0",
"instructions": null,
});
if let Some(provider) = model_provider {
payload["model_provider"] = json!(provider);
}
lines.push(
json!({
"timestamp": meta_rfc3339,
"type": "session_meta",
"payload": {
"id": uuid,
"timestamp": meta_rfc3339,
"cwd": "/",
"originator": "codex",
"cli_version": "0.0.0",
"instructions": null
}
"payload": payload
})
.to_string(),
);
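
The list test above relies on three cases for `model_providers`: omitted means "filter to the configured default provider", an empty list means "no filter", and a non-empty list is used as given. A minimal sketch of that rule follows. `provider_filter` restates the `match model_provider` block from `handle_list_conversations`; `session_matches` is a hypothetical stand-in for whatever `RolloutRecorder::list_conversations` does internally, and its treatment of sessions without a recorded provider is an assumption inferred from the test expectations, not taken from the recorder's code.

```rust
/// Hypothetical helper restating the `match model_provider` block:
/// - `None`            -> filter to the default provider only
/// - `Some(empty)`     -> no filtering at all
/// - `Some(non-empty)` -> filter to exactly the requested providers
fn provider_filter(
    requested: Option<Vec<String>>,
    default_provider: &str,
) -> Option<Vec<String>> {
    match requested {
        Some(providers) if providers.is_empty() => None,
        Some(providers) => Some(providers),
        None => Some(vec![default_provider.to_string()]),
    }
}

/// Assumption: a session with no recorded provider is treated as if it
/// used the default provider. This is only a guess at the recorder's rule.
fn session_matches(
    filter: Option<&[String]>,
    session_provider: Option<&str>,
    default_provider: &str,
) -> bool {
    match filter {
        None => true,
        Some(allowed) => {
            let effective = session_provider.unwrap_or(default_provider);
            allowed.iter().any(|p| p.as_str() == effective)
        }
    }
}
```

Under these assumptions, a rollout written without `model_provider` (like "Hello C") is listed when the filter is omitted, while the `test-provider` rollout only appears when it is requested explicitly or when the filter is empty.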

@@ -13,6 +13,7 @@ use codex_app_server_protocol::LoginChatGptResponse;
use codex_app_server_protocol::LogoutChatGptResponse;
use codex_app_server_protocol::RequestId;
use codex_login::login_with_api_key;
use serial_test::serial;
use tempfile::TempDir;
use tokio::time::timeout;
@@ -94,6 +95,8 @@ async fn logout_chatgpt_removes_auth() {
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
// Serialize tests that launch the login server since it binds to a fixed port.
#[serial(login_port)]
async fn login_and_cancel_chatgpt() {
let codex_home = TempDir::new().unwrap_or_else(|e| panic!("create tempdir: {e}"));
create_config_toml(codex_home.path()).unwrap_or_else(|err| panic!("write config.toml: {err}"));
@@ -208,6 +211,8 @@ async fn login_chatgpt_rejected_when_forced_api() {
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
// Serialize tests that launch the login server since it binds to a fixed port.
#[serial(login_port)]
async fn login_chatgpt_includes_forced_workspace_query_param() {
let codex_home = TempDir::new().unwrap_or_else(|e| panic!("create tempdir: {e}"));
create_config_toml_forced_workspace(codex_home.path(), "ws-forced")

@@ -15,6 +15,8 @@ use codex_app_server_protocol::RequestId;
use codex_app_server_protocol::SendUserMessageParams;
use codex_app_server_protocol::SendUserMessageResponse;
use codex_protocol::ConversationId;
use codex_protocol::models::ContentItem;
use codex_protocol::models::ResponseItem;
use pretty_assertions::assert_eq;
use tempfile::TempDir;
use tokio::time::timeout;
@@ -62,7 +64,10 @@ async fn test_send_message_success() {
// 2) addConversationListener
let add_listener_id = mcp
.send_add_conversation_listener_request(AddConversationListenerParams { conversation_id })
.send_add_conversation_listener_request(AddConversationListenerParams {
conversation_id,
experimental_raw_events: false,
})
.await
.expect("send addConversationListener");
let add_listener_resp: JSONRPCResponse = timeout(
@@ -124,6 +129,105 @@ async fn send_message(message: &str, conversation_id: ConversationId, mcp: &mut
.expect("should have conversationId"),
&serde_json::Value::String(conversation_id.to_string())
);
let raw_attempt = tokio::time::timeout(
std::time::Duration::from_millis(200),
mcp.read_stream_until_notification_message("codex/event/raw_response_item"),
)
.await;
assert!(
raw_attempt.is_err(),
"unexpected raw item notification when not opted in"
);
}
#[tokio::test]
async fn test_send_message_raw_notifications_opt_in() {
let responses = vec![
create_final_assistant_message_sse_response("Done").expect("build mock assistant message"),
];
let server = create_mock_chat_completions_server(responses).await;
let codex_home = TempDir::new().expect("create temp dir");
create_config_toml(codex_home.path(), &server.uri()).expect("write config.toml");
let mut mcp = McpProcess::new(codex_home.path())
.await
.expect("spawn mcp process");
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize())
.await
.expect("init timed out")
.expect("init failed");
let new_conv_id = mcp
.send_new_conversation_request(NewConversationParams::default())
.await
.expect("send newConversation");
let new_conv_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(new_conv_id)),
)
.await
.expect("newConversation timeout")
.expect("newConversation resp");
let NewConversationResponse {
conversation_id, ..
} = to_response::<_>(new_conv_resp).expect("deserialize newConversation response");
let add_listener_id = mcp
.send_add_conversation_listener_request(AddConversationListenerParams {
conversation_id,
experimental_raw_events: true,
})
.await
.expect("send addConversationListener");
let add_listener_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(add_listener_id)),
)
.await
.expect("addConversationListener timeout")
.expect("addConversationListener resp");
let AddConversationSubscriptionResponse { subscription_id: _ } =
to_response::<_>(add_listener_resp).expect("deserialize addConversationListener response");
let send_id = mcp
.send_send_user_message_request(SendUserMessageParams {
conversation_id,
items: vec![InputItem::Text {
text: "Hello".to_string(),
}],
})
.await
.expect("send sendUserMessage");
let instructions = read_raw_response_item(&mut mcp, conversation_id).await;
assert_instructions_message(&instructions);
let environment = read_raw_response_item(&mut mcp, conversation_id).await;
assert_environment_message(&environment);
let response: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(send_id)),
)
.await
.expect("sendUserMessage response timeout")
.expect("sendUserMessage response error");
let _ok: SendUserMessageResponse = to_response::<SendUserMessageResponse>(response)
.expect("deserialize sendUserMessage response");
let user_message = read_raw_response_item(&mut mcp, conversation_id).await;
assert_user_message(&user_message, "Hello");
let assistant_message = read_raw_response_item(&mut mcp, conversation_id).await;
assert_assistant_message(&assistant_message, "Done");
let _ = tokio::time::timeout(
std::time::Duration::from_millis(250),
mcp.read_stream_until_notification_message("codex/event/task_complete"),
)
.await;
}
#[tokio::test]
@@ -184,3 +288,108 @@ stream_max_retries = 0
),
)
}
#[expect(clippy::expect_used)]
async fn read_raw_response_item(
mcp: &mut McpProcess,
conversation_id: ConversationId,
) -> ResponseItem {
let raw_notification: JSONRPCNotification = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("codex/event/raw_response_item"),
)
.await
.expect("codex/event/raw_response_item notification timeout")
.expect("codex/event/raw_response_item notification resp");
let serde_json::Value::Object(params) = raw_notification
.params
.expect("codex/event/raw_response_item should have params")
else {
panic!("codex/event/raw_response_item should have params");
};
let conversation_id_value = params
.get("conversationId")
.and_then(|value| value.as_str())
.expect("raw response item should include conversationId");
assert_eq!(
conversation_id_value,
conversation_id.to_string(),
"raw response item conversation mismatch"
);
let msg_value = params
.get("msg")
.cloned()
.expect("raw response item should include msg payload");
serde_json::from_value(msg_value).expect("deserialize raw response item")
}
fn assert_instructions_message(item: &ResponseItem) {
match item {
ResponseItem::Message { role, content, .. } => {
assert_eq!(role, "user");
let texts = content_texts(content);
assert!(
texts
.iter()
.any(|text| text.contains("<user_instructions>")),
"expected instructions message, got {texts:?}"
);
}
other => panic!("expected instructions message, got {other:?}"),
}
}
fn assert_environment_message(item: &ResponseItem) {
match item {
ResponseItem::Message { role, content, .. } => {
assert_eq!(role, "user");
let texts = content_texts(content);
assert!(
texts
.iter()
.any(|text| text.contains("<environment_context>")),
"expected environment context message, got {texts:?}"
);
}
other => panic!("expected environment message, got {other:?}"),
}
}
fn assert_user_message(item: &ResponseItem, expected_text: &str) {
match item {
ResponseItem::Message { role, content, .. } => {
assert_eq!(role, "user");
let texts = content_texts(content);
assert_eq!(texts, vec![expected_text]);
}
other => panic!("expected user message, got {other:?}"),
}
}
fn assert_assistant_message(item: &ResponseItem, expected_text: &str) {
match item {
ResponseItem::Message { role, content, .. } => {
assert_eq!(role, "assistant");
let texts = content_texts(content);
assert_eq!(texts, vec![expected_text]);
}
other => panic!("expected assistant message, got {other:?}"),
}
}
fn content_texts(content: &[ContentItem]) -> Vec<&str> {
content
.iter()
.filter_map(|item| match item {
ContentItem::InputText { text } | ContentItem::OutputText { text } => {
Some(text.as_str())
}
_ => None,
})
.collect()
}
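
The tests in this file check both sides of the `experimental_raw_events` opt-in: raw response items are dropped for listeners that did not ask for them and delivered to those that did. A minimal sketch of that gate, assuming a simplified stand-in `EventMsg` enum (the real type lives in `codex_core::protocol` and has many more variants); `should_forward` and `forwarded` are hypothetical names:

```rust
/// Simplified stand-in for the real event enum; only two variants are
/// modelled and the payload type is an assumption.
#[derive(Debug, Clone, PartialEq)]
enum EventMsg {
    RawResponseItem(String),
    TaskComplete,
}

/// Hypothetical helper: same condition as the `if let ... && !experimental_raw_events`
/// check in the listener loop, expressed as "should this event be sent?".
fn should_forward(msg: &EventMsg, experimental_raw_events: bool) -> bool {
    let is_raw = matches!(msg, EventMsg::RawResponseItem(_));
    !is_raw || experimental_raw_events
}

/// Apply the gate to a batch of events, keeping their order.
fn forwarded(events: &[EventMsg], experimental_raw_events: bool) -> Vec<EventMsg> {
    events
        .iter()
        .filter(|e| should_forward(e, experimental_raw_events))
        .cloned()
        .collect()
}
```

Non-raw events pass through regardless of the flag, which is why existing callers only needed `experimental_raw_events: false` added to keep their previous behaviour.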

@@ -19,7 +19,7 @@ pub fn set_chatgpt_token_data(value: TokenData) {
/// Initialize the ChatGPT token from auth.json file
pub async fn init_chatgpt_token_from_auth(codex_home: &Path) -> std::io::Result<()> {
let auth = CodexAuth::from_codex_home(codex_home)?;
let auth = CodexAuth::from_auth_storage(codex_home)?;
if let Some(auth) = auth {
let token_data = auth.get_token_data().await?;
set_chatgpt_token_data(token_data);

@@ -140,7 +140,7 @@ pub async fn run_login_with_device_code(
pub async fn run_login_status(cli_config_overrides: CliConfigOverrides) -> ! {
let config = load_config_or_exit(cli_config_overrides).await;
match CodexAuth::from_codex_home(&config.codex_home) {
match CodexAuth::from_auth_storage(&config.codex_home) {
Ok(Some(auth)) => match auth.mode {
AuthMode::ApiKey => match auth.get_token().await {
Ok(api_key) => {

@@ -29,6 +29,7 @@ mod mcp_cmd;
use crate::mcp_cmd::McpCli;
use codex_core::config::Config;
use codex_core::config::ConfigOverrides;
use codex_core::features::is_known_feature_key;
/// Codex CLI
///
@@ -286,15 +287,25 @@ struct FeatureToggles {
}
impl FeatureToggles {
fn to_overrides(&self) -> Vec<String> {
fn to_overrides(&self) -> anyhow::Result<Vec<String>> {
let mut v = Vec::new();
for k in &self.enable {
v.push(format!("features.{k}=true"));
for feature in &self.enable {
Self::validate_feature(feature)?;
v.push(format!("features.{feature}=true"));
}
for k in &self.disable {
v.push(format!("features.{k}=false"));
for feature in &self.disable {
Self::validate_feature(feature)?;
v.push(format!("features.{feature}=false"));
}
Ok(v)
}
fn validate_feature(feature: &str) -> anyhow::Result<()> {
if is_known_feature_key(feature) {
Ok(())
} else {
anyhow::bail!("Unknown feature flag: {feature}")
}
v
}
}
@@ -345,9 +356,8 @@ async fn cli_main(codex_linux_sandbox_exe: Option<PathBuf>) -> anyhow::Result<()
} = MultitoolCli::parse();
// Fold --enable/--disable into config overrides so they flow to all subcommands.
root_config_overrides
.raw_overrides
.extend(feature_toggles.to_overrides());
let toggle_overrides = feature_toggles.to_overrides()?;
root_config_overrides.raw_overrides.extend(toggle_overrides);
match subcommand {
None => {
@@ -605,6 +615,7 @@ mod tests {
use assert_matches::assert_matches;
use codex_core::protocol::TokenUsage;
use codex_protocol::ConversationId;
use pretty_assertions::assert_eq;
fn finalize_from_args(args: &[&str]) -> TuiCli {
let cli = MultitoolCli::try_parse_from(args).expect("parse");
@@ -781,4 +792,32 @@ mod tests {
assert!(!interactive.resume_last);
assert_eq!(interactive.resume_session_id, None);
}
#[test]
fn feature_toggles_known_features_generate_overrides() {
let toggles = FeatureToggles {
enable: vec!["web_search_request".to_string()],
disable: vec!["unified_exec".to_string()],
};
let overrides = toggles.to_overrides().expect("valid features");
assert_eq!(
overrides,
vec![
"features.web_search_request=true".to_string(),
"features.unified_exec=false".to_string(),
]
);
}
#[test]
fn feature_toggles_unknown_feature_errors() {
let toggles = FeatureToggles {
enable: vec!["does_not_exist".to_string()],
disable: Vec::new(),
};
let err = toggles
.to_overrides()
.expect_err("feature should be rejected");
assert_eq!(err.to_string(), "Unknown feature flag: does_not_exist");
}
}
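
The change to `FeatureToggles::to_overrides` above makes the CLI reject unknown `--enable`/`--disable` names instead of silently forwarding them. A minimal, std-only sketch of that validate-then-format flow follows. The `KNOWN_FEATURES` list and the `String` error type are assumptions made for illustration; the diff itself relies on `codex_core::features::is_known_feature_key` and `anyhow`.

```rust
/// Hypothetical allow-list standing in for the real feature registry.
const KNOWN_FEATURES: &[&str] = &["web_search_request", "unified_exec"];

/// Stand-in for `codex_core::features::is_known_feature_key` (assumption).
fn is_known_feature_key(key: &str) -> bool {
    KNOWN_FEATURES.contains(&key)
}

/// Turn enable/disable names into `features.<name>=<bool>` overrides,
/// returning an error for the first unknown name.
fn to_overrides(enable: &[&str], disable: &[&str]) -> Result<Vec<String>, String> {
    let mut overrides = Vec::new();
    for (names, value) in [(enable, true), (disable, false)] {
        for name in names {
            if !is_known_feature_key(name) {
                return Err(format!("Unknown feature flag: {name}"));
            }
            overrides.push(format!("features.{name}={value}"));
        }
    }
    Ok(overrides)
}
```

Returning a `Result` here means a typo in a flag name surfaces before any subcommand runs, which matches the `?` added at the call site in `cli_main`.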

@@ -274,19 +274,33 @@ async fn run_add(config_overrides: &CliConfigOverrides, add_args: AddArgs) -> Re
http_headers,
env_http_headers,
} = transport
&& matches!(supports_oauth_login(&url).await, Ok(true))
{
println!("Detected OAuth support. Starting OAuth flow…");
perform_oauth_login(
&name,
&url,
config.mcp_oauth_credentials_store_mode,
http_headers.clone(),
env_http_headers.clone(),
&Vec::new(),
)
.await?;
println!("Successfully logged in.");
match supports_oauth_login(&url).await {
Ok(true) => {
if !config.features.enabled(Feature::RmcpClient) {
println!(
"MCP server supports login. Add `experimental_use_rmcp_client = true` \
to your config.toml and run `codex mcp login {name}` to login."
);
} else {
println!("Detected OAuth support. Starting OAuth flow…");
perform_oauth_login(
&name,
&url,
config.mcp_oauth_credentials_store_mode,
http_headers.clone(),
env_http_headers.clone(),
&Vec::new(),
)
.await?;
println!("Successfully logged in.");
}
}
Ok(false) => {}
Err(_) => println!(
"MCP server may or may not require login. Run `codex mcp login {name}` to login."
),
}
}
Ok(())
@@ -523,10 +537,12 @@ async fn run_list(config_overrides: &CliConfigOverrides, list_args: ListArgs) ->
.map(|entry| entry.auth_status)
.unwrap_or(McpAuthStatus::Unsupported)
.to_string();
let bearer_token_display =
bearer_token_env_var.as_deref().unwrap_or("-").to_string();
http_rows.push([
name.clone(),
url.clone(),
bearer_token_env_var.clone().unwrap_or("-".to_string()),
bearer_token_display,
status,
auth_status,
]);
@@ -752,15 +768,15 @@ async fn run_get(config_overrides: &CliConfigOverrides, get_args: GetArgs) -> Re
} => {
println!(" transport: streamable_http");
println!(" url: {url}");
let env_var = bearer_token_env_var.as_deref().unwrap_or("-");
println!(" bearer_token_env_var: {env_var}");
let bearer_token_display = bearer_token_env_var.as_deref().unwrap_or("-");
println!(" bearer_token_env_var: {bearer_token_display}");
let headers_display = match http_headers {
Some(map) if !map.is_empty() => {
let mut pairs: Vec<_> = map.iter().collect();
pairs.sort_by(|(a, _), (b, _)| a.cmp(b));
pairs
.into_iter()
.map(|(k, v)| format!("{k}={v}"))
.map(|(k, _)| format!("{k}=*****"))
.collect::<Vec<_>>()
.join(", ")
}
@@ -773,7 +789,7 @@ async fn run_get(config_overrides: &CliConfigOverrides, get_args: GetArgs) -> Re
pairs.sort_by(|(a, _), (b, _)| a.cmp(b));
pairs
.into_iter()
.map(|(k, v)| format!("{k}={v}"))
.map(|(k, var)| format!("{k}={var}"))
.collect::<Vec<_>>()
.join(", ")
}
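
The `run_add` hunk in this file replaces a single `matches!(..., Ok(true))` guard with a three-way `match` on the OAuth probe result, so a failed probe now produces a hint rather than being ignored. The sketch below isolates that decision as a pure function. `LoginAction` and `decide_login_action` are hypothetical names introduced for illustration, and the probe error is simplified to a `String`.

```rust
/// What to do after probing a server for OAuth support (hypothetical type).
#[derive(Debug, PartialEq)]
enum LoginAction {
    /// Start the OAuth flow immediately.
    StartOAuth,
    /// Print a hint and leave the next step to the user.
    Hint(String),
    /// The server does not support OAuth; nothing to do.
    Nothing,
}

/// Map the probe result and the rmcp-client feature flag to an action.
fn decide_login_action(probe: Result<bool, String>, rmcp_enabled: bool, name: &str) -> LoginAction {
    match probe {
        Ok(true) if rmcp_enabled => LoginAction::StartOAuth,
        Ok(true) => LoginAction::Hint(format!(
            "MCP server supports login. Add `experimental_use_rmcp_client = true` \
             to your config.toml and run `codex mcp login {name}` to login."
        )),
        Ok(false) => LoginAction::Nothing,
        // The probe itself failed, so support is unknown: tell the user how to try.
        Err(_) => LoginAction::Hint(format!(
            "MCP server may or may not require login. Run `codex mcp login {name}` to login."
        )),
    }
}
```

Keeping the decision separate from the printing and the network call makes each branch easy to cover in a unit test.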

@@ -68,9 +68,9 @@ async fn list_and_get_render_expected_output() -> Result<()> {
assert!(stdout.contains("Name"));
assert!(stdout.contains("docs"));
assert!(stdout.contains("docs-server"));
assert!(stdout.contains("TOKEN=secret"));
assert!(stdout.contains("APP_TOKEN=$APP_TOKEN"));
assert!(stdout.contains("WORKSPACE_ID=$WORKSPACE_ID"));
assert!(stdout.contains("TOKEN=*****"));
assert!(stdout.contains("APP_TOKEN=*****"));
assert!(stdout.contains("WORKSPACE_ID=*****"));
assert!(stdout.contains("Status"));
assert!(stdout.contains("Auth"));
assert!(stdout.contains("enabled"));
@@ -119,9 +119,9 @@ async fn list_and_get_render_expected_output() -> Result<()> {
assert!(stdout.contains("transport: stdio"));
assert!(stdout.contains("command: docs-server"));
assert!(stdout.contains("args: --port 4000"));
assert!(stdout.contains("env: TOKEN=secret"));
assert!(stdout.contains("APP_TOKEN=$APP_TOKEN"));
assert!(stdout.contains("WORKSPACE_ID=$WORKSPACE_ID"));
assert!(stdout.contains("env: TOKEN=*****"));
assert!(stdout.contains("APP_TOKEN=*****"));
assert!(stdout.contains("WORKSPACE_ID=*****"));
assert!(stdout.contains("enabled: true"));
assert!(stdout.contains("remove: codex mcp remove docs"));

@@ -6,15 +6,11 @@ pub fn format_env_display(env: Option<&HashMap<String, String>>, env_vars: &[Str
if let Some(map) = env {
let mut pairs: Vec<_> = map.iter().collect();
pairs.sort_by(|(a, _), (b, _)| a.cmp(b));
parts.extend(
pairs
.into_iter()
.map(|(key, value)| format!("{key}={value}")),
);
parts.extend(pairs.into_iter().map(|(key, _)| format!("{key}=*****")));
}
if !env_vars.is_empty() {
parts.extend(env_vars.iter().map(|var| format!("{var}=${var}")));
parts.extend(env_vars.iter().map(|var| format!("{var}=*****")));
}
if parts.is_empty() {
@@ -42,14 +38,14 @@ mod tests {
env.insert("B".to_string(), "two".to_string());
env.insert("A".to_string(), "one".to_string());
assert_eq!(format_env_display(Some(&env), &[]), "A=one, B=two");
assert_eq!(format_env_display(Some(&env), &[]), "A=*****, B=*****");
}
#[test]
fn formats_env_vars_with_dollar_prefix() {
let vars = vec!["TOKEN".to_string(), "PATH".to_string()];
assert_eq!(format_env_display(None, &vars), "TOKEN=$TOKEN, PATH=$PATH");
assert_eq!(format_env_display(None, &vars), "TOKEN=*****, PATH=*****");
}
#[test]
@@ -60,7 +56,7 @@ mod tests {
assert_eq!(
format_env_display(Some(&env), &vars),
"HOME=/tmp, TOKEN=$TOKEN"
"HOME=*****, TOKEN=*****"
);
}
}
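
The `format_env_display` change above stops printing environment values and always shows `*****` instead, for both literal `env` entries and forwarded `env_vars`. A self-contained sketch of the masking behaviour is shown here under the hypothetical name `mask_env_display`. The `"-"` placeholder for the empty case is an assumption, since that branch is not visible in the diff.

```rust
use std::collections::HashMap;

/// Render env entries as `KEY=*****`, sorted by key, followed by forwarded
/// variable names in their given order. Values are never included.
fn mask_env_display(env: Option<&HashMap<String, String>>, env_vars: &[String]) -> String {
    let mut parts: Vec<String> = Vec::new();
    if let Some(map) = env {
        // HashMap iteration order is unspecified, so sort keys for stable output.
        let mut keys: Vec<&String> = map.keys().collect();
        keys.sort();
        parts.extend(keys.into_iter().map(|key| format!("{key}=*****")));
    }
    parts.extend(env_vars.iter().map(|var| format!("{var}=*****")));
    if parts.is_empty() {
        // Assumed placeholder for "nothing configured".
        "-".to_string()
    } else {
        parts.join(", ")
    }
}
```

Because only the keys are read, a secret placed in the map cannot leak through this display path.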

@@ -21,13 +21,16 @@ bytes = { workspace = true }
chrono = { workspace = true, features = ["serde"] }
codex-app-server-protocol = { workspace = true }
codex-apply-patch = { workspace = true }
codex-async-utils = { workspace = true }
codex-file-search = { workspace = true }
codex-git-tooling = { workspace = true }
codex-keyring-store = { workspace = true }
codex-otel = { workspace = true, features = ["otel"] }
codex-protocol = { workspace = true }
codex-rmcp-client = { workspace = true }
codex-async-utils = { workspace = true }
codex-utils-string = { workspace = true }
codex-utils-pty = { workspace = true }
codex-utils-readiness = { workspace = true }
codex-utils-string = { workspace = true }
codex-utils-tokenizer = { workspace = true }
dirs = { workspace = true }
dunce = { workspace = true }
@@ -36,6 +39,7 @@ eventsource-stream = { workspace = true }
futures = { workspace = true }
http = { workspace = true }
indexmap = { workspace = true }
keyring = { workspace = true }
libc = { workspace = true }
mcp-types = { workspace = true }
os_info = { workspace = true }
@@ -45,6 +49,7 @@ reqwest = { workspace = true, features = ["json", "stream"] }
serde = { workspace = true, features = ["derive"] }
serde_json = { workspace = true }
sha1 = { workspace = true }
sha2 = { workspace = true }
shlex = { workspace = true }
similar = { workspace = true }
strum_macros = { workspace = true }
@@ -95,6 +100,7 @@ assert_cmd = { workspace = true }
assert_matches = { workspace = true }
core_test_support = { workspace = true }
escargot = { workspace = true }
image = { workspace = true, features = ["jpeg", "png"] }
maplit = { workspace = true }
predicates = { workspace = true }
pretty_assertions = { workspace = true }

@@ -1,16 +1,12 @@
use chrono::DateTime;
mod storage;
use chrono::Utc;
use serde::Deserialize;
use serde::Serialize;
#[cfg(test)]
use serial_test::serial;
use std::env;
use std::fs::File;
use std::fs::OpenOptions;
use std::io::Read;
use std::io::Write;
#[cfg(unix)]
use std::os::unix::fs::OpenOptionsExt;
use std::fmt::Debug;
use std::path::Path;
use std::path::PathBuf;
use std::sync::Arc;
@@ -20,6 +16,10 @@ use std::time::Duration;
use codex_app_server_protocol::AuthMode;
use codex_protocol::config_types::ForcedLoginMethod;
pub use crate::auth::storage::AuthCredentialsStoreMode;
pub use crate::auth::storage::AuthDotJson;
use crate::auth::storage::AuthStorageBackend;
use crate::auth::storage::create_auth_storage;
use crate::config::Config;
use crate::default_client::CodexHttpClient;
use crate::token_data::PlanType;
@@ -32,7 +32,7 @@ pub struct CodexAuth {
pub(crate) api_key: Option<String>,
pub(crate) auth_dot_json: Arc<Mutex<Option<AuthDotJson>>>,
pub(crate) auth_file: PathBuf,
storage: Arc<dyn AuthStorageBackend>,
pub(crate) client: CodexHttpClient,
}
@@ -56,7 +56,7 @@ impl CodexAuth {
.map_err(std::io::Error::other)?;
let updated = update_tokens(
&self.auth_file,
&self.storage,
refresh_response.id_token,
refresh_response.access_token,
refresh_response.refresh_token,
@@ -78,8 +78,8 @@ impl CodexAuth {
Ok(access)
}
/// Loads the available auth information from the auth.json.
pub fn from_codex_home(codex_home: &Path) -> std::io::Result<Option<CodexAuth>> {
/// Loads the available auth information from auth storage.
pub fn from_auth_storage(codex_home: &Path) -> std::io::Result<Option<CodexAuth>> {
load_auth(codex_home, false)
}
@@ -103,7 +103,7 @@ impl CodexAuth {
.map_err(std::io::Error::other)?;
let updated_auth_dot_json = update_tokens(
&self.auth_file,
&self.storage,
refresh_response.id_token,
refresh_response.access_token,
refresh_response.refresh_token,
@@ -177,7 +177,7 @@ impl CodexAuth {
Self {
api_key: None,
mode: AuthMode::ChatGPT,
auth_file: PathBuf::new(),
storage: create_auth_storage(PathBuf::new(), AuthCredentialsStoreMode::File),
auth_dot_json,
client: crate::default_client::create_client(),
}
@@ -187,7 +187,7 @@ impl CodexAuth {
Self {
api_key: Some(api_key.to_owned()),
mode: AuthMode::ApiKey,
auth_file: PathBuf::new(),
storage: create_auth_storage(PathBuf::new(), AuthCredentialsStoreMode::File),
auth_dot_json: Arc::new(Mutex::new(None)),
client,
}
@@ -215,19 +215,11 @@ pub fn read_codex_api_key_from_env() -> Option<String> {
.filter(|value| !value.is_empty())
}
pub fn get_auth_file(codex_home: &Path) -> PathBuf {
codex_home.join("auth.json")
}
/// Delete the auth.json file inside `codex_home` if it exists. Returns `Ok(true)`
/// if a file was removed, `Ok(false)` if no auth file was present.
pub fn logout(codex_home: &Path) -> std::io::Result<bool> {
let auth_file = get_auth_file(codex_home);
match std::fs::remove_file(&auth_file) {
Ok(_) => Ok(true),
Err(err) if err.kind() == std::io::ErrorKind::NotFound => Ok(false),
Err(err) => Err(err),
}
let storage = create_auth_storage(codex_home.to_path_buf(), AuthCredentialsStoreMode::File);
storage.delete()
}
/// Writes an `auth.json` that contains only the API key.
@@ -237,7 +229,20 @@ pub fn login_with_api_key(codex_home: &Path, api_key: &str) -> std::io::Result<(
tokens: None,
last_refresh: None,
};
write_auth_json(&get_auth_file(codex_home), &auth_dot_json)
save_auth(codex_home, &auth_dot_json)
}
/// Persist the provided auth payload to file-backed storage (`AuthCredentialsStoreMode::File`).
pub fn save_auth(codex_home: &Path, auth: &AuthDotJson) -> std::io::Result<()> {
let storage = create_auth_storage(codex_home.to_path_buf(), AuthCredentialsStoreMode::File);
storage.save(auth)
}
/// Load CLI auth data from file-backed storage (`AuthCredentialsStoreMode::File`).
/// Returns `None` when no credentials are stored.
pub fn load_auth_dot_json(codex_home: &Path) -> std::io::Result<Option<AuthDotJson>> {
let storage = create_auth_storage(codex_home.to_path_buf(), AuthCredentialsStoreMode::File);
storage.load()
}
pub async fn enforce_login_restrictions(config: &Config) -> std::io::Result<()> {
@@ -320,12 +325,12 @@ fn load_auth(
)));
}
let auth_file = get_auth_file(codex_home);
let storage = create_auth_storage(codex_home.to_path_buf(), AuthCredentialsStoreMode::File);
let client = crate::default_client::create_client();
let auth_dot_json = match try_read_auth_json(&auth_file) {
Ok(auth) => auth,
Err(err) if err.kind() == std::io::ErrorKind::NotFound => return Ok(None),
Err(err) => return Err(err),
let auth_dot_json = match storage.load()? {
Some(auth) => auth,
None => return Ok(None),
};
let AuthDotJson {
@@ -342,7 +347,7 @@ fn load_auth(
Ok(Some(CodexAuth {
api_key: None,
mode: AuthMode::ChatGPT,
auth_file,
storage: storage.clone(),
auth_dot_json: Arc::new(Mutex::new(Some(AuthDotJson {
openai_api_key: None,
tokens,
@@ -352,44 +357,20 @@ fn load_auth(
}))
}
/// Attempt to read and refresh the `auth.json` file in the given `CODEX_HOME` directory.
/// Returns the full AuthDotJson structure after refreshing if necessary.
pub fn try_read_auth_json(auth_file: &Path) -> std::io::Result<AuthDotJson> {
let mut file = File::open(auth_file)?;
let mut contents = String::new();
file.read_to_string(&mut contents)?;
let auth_dot_json: AuthDotJson = serde_json::from_str(&contents)?;
Ok(auth_dot_json)
}
pub fn write_auth_json(auth_file: &Path, auth_dot_json: &AuthDotJson) -> std::io::Result<()> {
if let Some(parent) = auth_file.parent() {
std::fs::create_dir_all(parent)?;
}
let json_data = serde_json::to_string_pretty(auth_dot_json)?;
let mut options = OpenOptions::new();
options.truncate(true).write(true).create(true);
#[cfg(unix)]
{
options.mode(0o600);
}
let mut file = options.open(auth_file)?;
file.write_all(json_data.as_bytes())?;
file.flush()?;
Ok(())
}
async fn update_tokens(
auth_file: &Path,
id_token: String,
storage: &Arc<dyn AuthStorageBackend>,
id_token: Option<String>,
access_token: Option<String>,
refresh_token: Option<String>,
) -> std::io::Result<AuthDotJson> {
let mut auth_dot_json = try_read_auth_json(auth_file)?;
let mut auth_dot_json = storage
.load()?
.ok_or(std::io::Error::other("Token data is not available."))?;
let tokens = auth_dot_json.tokens.get_or_insert_with(TokenData::default);
tokens.id_token = parse_id_token(&id_token).map_err(std::io::Error::other)?;
if let Some(id_token) = id_token {
tokens.id_token = parse_id_token(&id_token).map_err(std::io::Error::other)?;
}
if let Some(access_token) = access_token {
tokens.access_token = access_token;
}
@@ -397,7 +378,7 @@ async fn update_tokens(
tokens.refresh_token = refresh_token;
}
auth_dot_json.last_refresh = Some(Utc::now());
write_auth_json(auth_file, &auth_dot_json)?;
storage.save(&auth_dot_json)?;
Ok(auth_dot_json)
}
@@ -445,24 +426,11 @@ struct RefreshRequest {
#[derive(Deserialize, Clone)]
struct RefreshResponse {
id_token: String,
id_token: Option<String>,
access_token: Option<String>,
refresh_token: Option<String>,
}
/// Expected structure for $CODEX_HOME/auth.json.
#[derive(Deserialize, Serialize, Clone, Debug, PartialEq)]
pub struct AuthDotJson {
#[serde(rename = "OPENAI_API_KEY")]
pub openai_api_key: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub tokens: Option<TokenData>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub last_refresh: Option<DateTime<Utc>>,
}
// Shared constant for token refresh (client id used for oauth token refresh flow)
pub const CLIENT_ID: &str = "app_EMoamEEZ73f0CkXaXp7hrann";
@@ -477,12 +445,15 @@ struct CachedAuth {
#[cfg(test)]
mod tests {
use super::*;
use crate::auth::storage::FileAuthStorage;
use crate::auth::storage::get_auth_file;
use crate::config::Config;
use crate::config::ConfigOverrides;
use crate::config::ConfigToml;
use crate::token_data::IdTokenInfo;
use crate::token_data::KnownPlan;
use crate::token_data::PlanType;
use base64::Engine;
use codex_protocol::config_types::ForcedLoginMethod;
use pretty_assertions::assert_eq;
@@ -491,9 +462,9 @@ mod tests {
use tempfile::tempdir;
#[tokio::test]
async fn roundtrip_auth_dot_json() {
async fn refresh_without_id_token() {
let codex_home = tempdir().unwrap();
let _ = write_auth_file(
let fake_jwt = write_auth_file(
AuthFileParams {
openai_api_key: None,
chatgpt_plan_type: "pro".to_string(),
@@ -503,12 +474,23 @@ mod tests {
)
.expect("failed to write auth file");
let file = get_auth_file(codex_home.path());
let auth_dot_json = try_read_auth_json(&file).unwrap();
write_auth_json(&file, &auth_dot_json).unwrap();
let storage = create_auth_storage(
codex_home.path().to_path_buf(),
AuthCredentialsStoreMode::File,
);
let updated = super::update_tokens(
&storage,
None,
Some("new-access-token".to_string()),
Some("new-refresh-token".to_string()),
)
.await
.expect("update_tokens should succeed");
let same_auth_dot_json = try_read_auth_json(&file).unwrap();
assert_eq!(auth_dot_json, same_auth_dot_json);
let tokens = updated.tokens.expect("tokens should exist");
assert_eq!(tokens.id_token.raw_jwt, fake_jwt);
assert_eq!(tokens.access_token, "new-access-token");
assert_eq!(tokens.refresh_token, "new-refresh-token");
}
#[test]
@@ -532,7 +514,10 @@ mod tests {
super::login_with_api_key(dir.path(), "sk-new").expect("login_with_api_key should succeed");
let auth = super::try_read_auth_json(&auth_path).expect("auth.json should parse");
let storage = FileAuthStorage::new(dir.path().to_path_buf());
let auth = storage
.try_read_auth_json(&auth_path)
.expect("auth.json should parse");
assert_eq!(auth.openai_api_key.as_deref(), Some("sk-new"));
assert!(auth.tokens.is_none(), "tokens should be cleared");
}
@@ -540,7 +525,7 @@ mod tests {
#[test]
fn missing_auth_json_returns_none() {
let dir = tempdir().unwrap();
let auth = CodexAuth::from_codex_home(dir.path()).expect("call should succeed");
let auth = CodexAuth::from_auth_storage(dir.path()).expect("call should succeed");
assert_eq!(auth, None);
}
@@ -562,7 +547,7 @@ mod tests {
api_key,
mode,
auth_dot_json,
auth_file: _,
storage: _,
..
} = super::load_auth(codex_home.path(), false).unwrap().unwrap();
assert_eq!(None, api_key);
@@ -620,11 +605,11 @@ mod tests {
tokens: None,
last_refresh: None,
};
write_auth_json(&get_auth_file(dir.path()), &auth_dot_json)?;
assert!(dir.path().join("auth.json").exists());
let removed = logout(dir.path())?;
assert!(removed);
assert!(!dir.path().join("auth.json").exists());
super::save_auth(dir.path(), &auth_dot_json)?;
let auth_file = get_auth_file(dir.path());
assert!(auth_file.exists());
assert!(logout(dir.path())?);
assert!(!auth_file.exists());
Ok(())
}
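
In this file's diff, `RefreshResponse::id_token` becomes `Option<String>` and `update_tokens` only overwrites a field when the refresh response actually carries it. The sketch below shows that merge rule in isolation. `Tokens` and `apply_refresh` are hypothetical names, and the ID token is kept as a raw string here, whereas the real code parses it with `parse_id_token`.

```rust
/// Simplified stand-in for the stored token data (assumption: raw strings only).
#[derive(Debug, Clone, PartialEq)]
struct Tokens {
    id_token: String,
    access_token: String,
    refresh_token: String,
}

/// Apply a refresh response, keeping every field the server omitted.
fn apply_refresh(
    tokens: &mut Tokens,
    id_token: Option<String>,
    access_token: Option<String>,
    refresh_token: Option<String>,
) {
    if let Some(id_token) = id_token {
        tokens.id_token = id_token;
    }
    if let Some(access_token) = access_token {
        tokens.access_token = access_token;
    }
    if let Some(refresh_token) = refresh_token {
        tokens.refresh_token = refresh_token;
    }
}
```

This mirrors what the new `refresh_without_id_token` test checks: the previous ID token survives a refresh that returns only access and refresh tokens.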

@@ -0,0 +1,672 @@
use chrono::DateTime;
use chrono::Utc;
use serde::Deserialize;
use serde::Serialize;
use sha2::Digest;
use sha2::Sha256;
use std::fmt::Debug;
use std::fs::File;
use std::fs::OpenOptions;
use std::io::Read;
use std::io::Write;
#[cfg(unix)]
use std::os::unix::fs::OpenOptionsExt;
use std::path::Path;
use std::path::PathBuf;
use std::sync::Arc;
use tracing::warn;
use crate::token_data::TokenData;
use codex_keyring_store::DefaultKeyringStore;
use codex_keyring_store::KeyringStore;
/// Determine where Codex should store CLI auth credentials.
#[derive(Debug, Default, Copy, Clone, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "lowercase")]
pub enum AuthCredentialsStoreMode {
#[default]
/// Persist credentials in CODEX_HOME/auth.json.
File,
/// Persist credentials in the keyring. Fail if unavailable.
Keyring,
/// Use keyring when available; otherwise, fall back to a file in CODEX_HOME.
Auto,
}
/// Expected structure for $CODEX_HOME/auth.json.
#[derive(Deserialize, Serialize, Clone, Debug, PartialEq)]
pub struct AuthDotJson {
#[serde(rename = "OPENAI_API_KEY")]
pub openai_api_key: Option<String>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub tokens: Option<TokenData>,
#[serde(default, skip_serializing_if = "Option::is_none")]
pub last_refresh: Option<DateTime<Utc>>,
}
pub(super) fn get_auth_file(codex_home: &Path) -> PathBuf {
codex_home.join("auth.json")
}
pub(super) fn delete_file_if_exists(codex_home: &Path) -> std::io::Result<bool> {
let auth_file = get_auth_file(codex_home);
match std::fs::remove_file(&auth_file) {
Ok(()) => Ok(true),
Err(err) if err.kind() == std::io::ErrorKind::NotFound => Ok(false),
Err(err) => Err(err),
}
}
pub(super) trait AuthStorageBackend: Debug + Send + Sync {
fn load(&self) -> std::io::Result<Option<AuthDotJson>>;
fn save(&self, auth: &AuthDotJson) -> std::io::Result<()>;
fn delete(&self) -> std::io::Result<bool>;
}
#[derive(Clone, Debug)]
pub(super) struct FileAuthStorage {
codex_home: PathBuf,
}
impl FileAuthStorage {
pub(super) fn new(codex_home: PathBuf) -> Self {
Self { codex_home }
}
/// Read and parse the `auth.json` file at `auth_file`.
/// Returns the deserialized `AuthDotJson`; this function does not refresh tokens.
pub(super) fn try_read_auth_json(&self, auth_file: &Path) -> std::io::Result<AuthDotJson> {
let mut file = File::open(auth_file)?;
let mut contents = String::new();
file.read_to_string(&mut contents)?;
let auth_dot_json: AuthDotJson = serde_json::from_str(&contents)?;
Ok(auth_dot_json)
}
}
impl AuthStorageBackend for FileAuthStorage {
fn load(&self) -> std::io::Result<Option<AuthDotJson>> {
let auth_file = get_auth_file(&self.codex_home);
let auth_dot_json = match self.try_read_auth_json(&auth_file) {
Ok(auth) => auth,
Err(err) if err.kind() == std::io::ErrorKind::NotFound => return Ok(None),
Err(err) => return Err(err),
};
Ok(Some(auth_dot_json))
}
fn save(&self, auth_dot_json: &AuthDotJson) -> std::io::Result<()> {
let auth_file = get_auth_file(&self.codex_home);
if let Some(parent) = auth_file.parent() {
std::fs::create_dir_all(parent)?;
}
let json_data = serde_json::to_string_pretty(auth_dot_json)?;
let mut options = OpenOptions::new();
options.truncate(true).write(true).create(true);
#[cfg(unix)]
{
options.mode(0o600);
}
let mut file = options.open(auth_file)?;
file.write_all(json_data.as_bytes())?;
file.flush()?;
Ok(())
}
fn delete(&self) -> std::io::Result<bool> {
delete_file_if_exists(&self.codex_home)
}
}
const KEYRING_SERVICE: &str = "Codex Auth";
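// Derives a short, stable keyring key from the canonicalized `codex_home` path
// (the first 16 hex characters of its SHA-256 digest, prefixed with `cli|`).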
fn compute_store_key(codex_home: &Path) -> std::io::Result<String> {
let canonical = codex_home
.canonicalize()
.unwrap_or_else(|_| codex_home.to_path_buf());
let path_str = canonical.to_string_lossy();
let mut hasher = Sha256::new();
hasher.update(path_str.as_bytes());
let digest = hasher.finalize();
let hex = format!("{digest:x}");
let truncated = hex.get(..16).unwrap_or(&hex);
Ok(format!("cli|{truncated}"))
}
#[derive(Clone, Debug)]
struct KeyringAuthStorage {
codex_home: PathBuf,
keyring_store: Arc<dyn KeyringStore>,
}
impl KeyringAuthStorage {
fn new(codex_home: PathBuf, keyring_store: Arc<dyn KeyringStore>) -> Self {
Self {
codex_home,
keyring_store,
}
}
fn load_from_keyring(&self, key: &str) -> std::io::Result<Option<AuthDotJson>> {
match self.keyring_store.load(KEYRING_SERVICE, key) {
Ok(Some(serialized)) => serde_json::from_str(&serialized).map(Some).map_err(|err| {
std::io::Error::other(format!(
"failed to deserialize CLI auth from keyring: {err}"
))
}),
Ok(None) => Ok(None),
Err(error) => Err(std::io::Error::other(format!(
"failed to load CLI auth from keyring: {}",
error.message()
))),
}
}
fn save_to_keyring(&self, key: &str, value: &str) -> std::io::Result<()> {
match self.keyring_store.save(KEYRING_SERVICE, key, value) {
Ok(()) => Ok(()),
Err(error) => {
let message = format!(
"failed to write OAuth tokens to keyring: {}",
error.message()
);
warn!("{message}");
Err(std::io::Error::other(message))
}
}
}
}
impl AuthStorageBackend for KeyringAuthStorage {
fn load(&self) -> std::io::Result<Option<AuthDotJson>> {
let key = compute_store_key(&self.codex_home)?;
self.load_from_keyring(&key)
}
fn save(&self, auth: &AuthDotJson) -> std::io::Result<()> {
let key = compute_store_key(&self.codex_home)?;
// Simpler error mapping per style: prefer method reference over closure
let serialized = serde_json::to_string(auth).map_err(std::io::Error::other)?;
self.save_to_keyring(&key, &serialized)?;
if let Err(err) = delete_file_if_exists(&self.codex_home) {
warn!("failed to remove CLI auth fallback file: {err}");
}
Ok(())
}
fn delete(&self) -> std::io::Result<bool> {
let key = compute_store_key(&self.codex_home)?;
let keyring_removed = self
.keyring_store
.delete(KEYRING_SERVICE, &key)
.map_err(|err| {
std::io::Error::other(format!("failed to delete auth from keyring: {err}"))
})?;
let file_removed = delete_file_if_exists(&self.codex_home)?;
Ok(keyring_removed || file_removed)
}
}
#[derive(Clone, Debug)]
struct AutoAuthStorage {
keyring_storage: Arc<KeyringAuthStorage>,
file_storage: Arc<FileAuthStorage>,
}
impl AutoAuthStorage {
fn new(codex_home: PathBuf, keyring_store: Arc<dyn KeyringStore>) -> Self {
Self {
keyring_storage: Arc::new(KeyringAuthStorage::new(codex_home.clone(), keyring_store)),
file_storage: Arc::new(FileAuthStorage::new(codex_home)),
}
}
}
impl AuthStorageBackend for AutoAuthStorage {
fn load(&self) -> std::io::Result<Option<AuthDotJson>> {
match self.keyring_storage.load() {
Ok(Some(auth)) => Ok(Some(auth)),
Ok(None) => self.file_storage.load(),
Err(err) => {
warn!("failed to load CLI auth from keyring, falling back to file storage: {err}");
self.file_storage.load()
}
}
}
fn save(&self, auth: &AuthDotJson) -> std::io::Result<()> {
match self.keyring_storage.save(auth) {
Ok(()) => Ok(()),
Err(err) => {
warn!("failed to save auth to keyring, falling back to file storage: {err}");
self.file_storage.save(auth)
}
}
}
fn delete(&self) -> std::io::Result<bool> {
// Keyring storage will delete from disk as well
self.keyring_storage.delete()
}
}
pub(super) fn create_auth_storage(
codex_home: PathBuf,
mode: AuthCredentialsStoreMode,
) -> Arc<dyn AuthStorageBackend> {
let keyring_store: Arc<dyn KeyringStore> = Arc::new(DefaultKeyringStore);
create_auth_storage_with_keyring_store(codex_home, mode, keyring_store)
}
fn create_auth_storage_with_keyring_store(
codex_home: PathBuf,
mode: AuthCredentialsStoreMode,
keyring_store: Arc<dyn KeyringStore>,
) -> Arc<dyn AuthStorageBackend> {
match mode {
AuthCredentialsStoreMode::File => Arc::new(FileAuthStorage::new(codex_home)),
AuthCredentialsStoreMode::Keyring => {
Arc::new(KeyringAuthStorage::new(codex_home, keyring_store))
}
AuthCredentialsStoreMode::Auto => Arc::new(AutoAuthStorage::new(codex_home, keyring_store)),
}
}
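
`AutoAuthStorage` above prefers the keyring and falls back to the file backend when the keyring has no entry or returns an error. The fallback rule can be sketched with in-memory backends as below. `Storage`, `MemoryStorage`, and `AutoStorage` are hypothetical names for illustration, the payload is a plain string rather than `AuthDotJson`, and the warning logs are omitted.

```rust
use std::io;
use std::sync::Mutex;

/// Minimal storage interface (assumption: string payloads instead of `AuthDotJson`).
trait Storage {
    fn load(&self) -> io::Result<Option<String>>;
    fn save(&self, value: &str) -> io::Result<()>;
}

/// In-memory backend; `fail = true` simulates an unavailable keyring.
struct MemoryStorage {
    slot: Mutex<Option<String>>,
    fail: bool,
}

impl MemoryStorage {
    fn new(fail: bool) -> Self {
        Self { slot: Mutex::new(None), fail }
    }
}

impl Storage for MemoryStorage {
    fn load(&self) -> io::Result<Option<String>> {
        if self.fail {
            return Err(io::Error::new(io::ErrorKind::Other, "backend unavailable"));
        }
        Ok(self.slot.lock().unwrap().clone())
    }

    fn save(&self, value: &str) -> io::Result<()> {
        if self.fail {
            return Err(io::Error::new(io::ErrorKind::Other, "backend unavailable"));
        }
        *self.slot.lock().unwrap() = Some(value.to_string());
        Ok(())
    }
}

/// Try `primary` first; use `fallback` when it is empty or fails.
struct AutoStorage<P: Storage, F: Storage> {
    primary: P,
    fallback: F,
}

impl<P: Storage, F: Storage> Storage for AutoStorage<P, F> {
    fn load(&self) -> io::Result<Option<String>> {
        match self.primary.load() {
            Ok(Some(value)) => Ok(Some(value)),
            // Nothing stored, or the primary failed: consult the fallback.
            Ok(None) | Err(_) => self.fallback.load(),
        }
    }

    fn save(&self, value: &str) -> io::Result<()> {
        // Only write to the fallback when the primary rejects the write.
        self.primary.save(value).or_else(|_| self.fallback.save(value))
    }
}
```

One consequence of this design is that a healthy primary never writes to the fallback, so credentials do not end up duplicated on disk.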
#[cfg(test)]
mod tests {
use super::*;
use crate::token_data::IdTokenInfo;
use anyhow::Context;
use base64::Engine;
use pretty_assertions::assert_eq;
use serde_json::json;
use tempfile::tempdir;
use codex_keyring_store::tests::MockKeyringStore;
use keyring::Error as KeyringError;
#[tokio::test]
async fn file_storage_load_returns_auth_dot_json() -> anyhow::Result<()> {
let codex_home = tempdir()?;
let storage = FileAuthStorage::new(codex_home.path().to_path_buf());
let auth_dot_json = AuthDotJson {
openai_api_key: Some("test-key".to_string()),
tokens: None,
last_refresh: Some(Utc::now()),
};
storage
.save(&auth_dot_json)
.context("failed to save auth file")?;
let loaded = storage.load().context("failed to load auth file")?;
assert_eq!(Some(auth_dot_json), loaded);
Ok(())
}
#[tokio::test]
async fn file_storage_save_persists_auth_dot_json() -> anyhow::Result<()> {
let codex_home = tempdir()?;
let storage = FileAuthStorage::new(codex_home.path().to_path_buf());
let auth_dot_json = AuthDotJson {
openai_api_key: Some("test-key".to_string()),
tokens: None,
last_refresh: Some(Utc::now()),
};
let file = get_auth_file(codex_home.path());
storage
.save(&auth_dot_json)
.context("failed to save auth file")?;
let same_auth_dot_json = storage
.try_read_auth_json(&file)
.context("failed to read auth file after save")?;
assert_eq!(auth_dot_json, same_auth_dot_json);
Ok(())
}
#[test]
fn file_storage_delete_removes_auth_file() -> anyhow::Result<()> {
let dir = tempdir()?;
let auth_dot_json = AuthDotJson {
openai_api_key: Some("sk-test-key".to_string()),
tokens: None,
last_refresh: None,
};
let storage = create_auth_storage(dir.path().to_path_buf(), AuthCredentialsStoreMode::File);
storage.save(&auth_dot_json)?;
assert!(dir.path().join("auth.json").exists());
let storage = FileAuthStorage::new(dir.path().to_path_buf());
let removed = storage.delete()?;
assert!(removed);
assert!(!dir.path().join("auth.json").exists());
Ok(())
}
fn seed_keyring_and_fallback_auth_file_for_delete<F>(
mock_keyring: &MockKeyringStore,
codex_home: &Path,
compute_key: F,
) -> anyhow::Result<(String, PathBuf)>
where
F: FnOnce() -> std::io::Result<String>,
{
let key = compute_key()?;
mock_keyring.save(KEYRING_SERVICE, &key, "{}")?;
let auth_file = get_auth_file(codex_home);
std::fs::write(&auth_file, "stale")?;
Ok((key, auth_file))
}
fn seed_keyring_with_auth<F>(
mock_keyring: &MockKeyringStore,
compute_key: F,
auth: &AuthDotJson,
) -> anyhow::Result<()>
where
F: FnOnce() -> std::io::Result<String>,
{
let key = compute_key()?;
let serialized = serde_json::to_string(auth)?;
mock_keyring.save(KEYRING_SERVICE, &key, &serialized)?;
Ok(())
}
fn assert_keyring_saved_auth_and_removed_fallback(
mock_keyring: &MockKeyringStore,
key: &str,
codex_home: &Path,
expected: &AuthDotJson,
) {
let saved_value = mock_keyring
.saved_value(key)
.expect("keyring entry should exist");
let expected_serialized = serde_json::to_string(expected).expect("serialize expected auth");
assert_eq!(saved_value, expected_serialized);
let auth_file = get_auth_file(codex_home);
assert!(
!auth_file.exists(),
"fallback auth.json should be removed after keyring save"
);
}
fn id_token_with_prefix(prefix: &str) -> IdTokenInfo {
#[derive(Serialize)]
struct Header {
alg: &'static str,
typ: &'static str,
}
let header = Header {
alg: "none",
typ: "JWT",
};
let payload = json!({
"email": format!("{prefix}@example.com"),
"https://api.openai.com/auth": {
"chatgpt_account_id": format!("{prefix}-account"),
},
});
let encode = |bytes: &[u8]| base64::engine::general_purpose::URL_SAFE_NO_PAD.encode(bytes);
let header_b64 = encode(&serde_json::to_vec(&header).expect("serialize header"));
let payload_b64 = encode(&serde_json::to_vec(&payload).expect("serialize payload"));
let signature_b64 = encode(b"sig");
let fake_jwt = format!("{header_b64}.{payload_b64}.{signature_b64}");
crate::token_data::parse_id_token(&fake_jwt).expect("fake JWT should parse")
}
fn auth_with_prefix(prefix: &str) -> AuthDotJson {
AuthDotJson {
openai_api_key: Some(format!("{prefix}-api-key")),
tokens: Some(TokenData {
id_token: id_token_with_prefix(prefix),
access_token: format!("{prefix}-access"),
refresh_token: format!("{prefix}-refresh"),
account_id: Some(format!("{prefix}-account-id")),
}),
last_refresh: None,
}
}
#[test]
fn keyring_auth_storage_load_returns_deserialized_auth() -> anyhow::Result<()> {
let codex_home = tempdir()?;
let mock_keyring = MockKeyringStore::default();
let storage = KeyringAuthStorage::new(
codex_home.path().to_path_buf(),
Arc::new(mock_keyring.clone()),
);
let expected = AuthDotJson {
openai_api_key: Some("sk-test".to_string()),
tokens: None,
last_refresh: None,
};
seed_keyring_with_auth(
&mock_keyring,
|| compute_store_key(codex_home.path()),
&expected,
)?;
let loaded = storage.load()?;
assert_eq!(Some(expected), loaded);
Ok(())
}
#[test]
fn keyring_auth_storage_compute_store_key_for_home_directory() -> anyhow::Result<()> {
let codex_home = PathBuf::from("~/.codex");
let key = compute_store_key(codex_home.as_path())?;
assert_eq!(key, "cli|940db7b1d0e4eb40");
Ok(())
}
#[test]
fn keyring_auth_storage_save_persists_and_removes_fallback_file() -> anyhow::Result<()> {
let codex_home = tempdir()?;
let mock_keyring = MockKeyringStore::default();
let storage = KeyringAuthStorage::new(
codex_home.path().to_path_buf(),
Arc::new(mock_keyring.clone()),
);
let auth_file = get_auth_file(codex_home.path());
std::fs::write(&auth_file, "stale")?;
let auth = AuthDotJson {
openai_api_key: None,
tokens: Some(TokenData {
id_token: Default::default(),
access_token: "access".to_string(),
refresh_token: "refresh".to_string(),
account_id: Some("account".to_string()),
}),
last_refresh: Some(Utc::now()),
};
storage.save(&auth)?;
let key = compute_store_key(codex_home.path())?;
assert_keyring_saved_auth_and_removed_fallback(
&mock_keyring,
&key,
codex_home.path(),
&auth,
);
Ok(())
}
#[test]
fn keyring_auth_storage_delete_removes_keyring_and_file() -> anyhow::Result<()> {
let codex_home = tempdir()?;
let mock_keyring = MockKeyringStore::default();
let storage = KeyringAuthStorage::new(
codex_home.path().to_path_buf(),
Arc::new(mock_keyring.clone()),
);
let (key, auth_file) = seed_keyring_and_fallback_auth_file_for_delete(
&mock_keyring,
codex_home.path(),
|| compute_store_key(codex_home.path()),
)?;
let removed = storage.delete()?;
assert!(removed, "delete should report removal");
assert!(
!mock_keyring.contains(&key),
"keyring entry should be removed"
);
assert!(
!auth_file.exists(),
"fallback auth.json should be removed after keyring delete"
);
Ok(())
}
#[test]
fn auto_auth_storage_load_prefers_keyring_value() -> anyhow::Result<()> {
let codex_home = tempdir()?;
let mock_keyring = MockKeyringStore::default();
let storage = AutoAuthStorage::new(
codex_home.path().to_path_buf(),
Arc::new(mock_keyring.clone()),
);
let keyring_auth = auth_with_prefix("keyring");
seed_keyring_with_auth(
&mock_keyring,
|| compute_store_key(codex_home.path()),
&keyring_auth,
)?;
let file_auth = auth_with_prefix("file");
storage.file_storage.save(&file_auth)?;
let loaded = storage.load()?;
assert_eq!(loaded, Some(keyring_auth));
Ok(())
}
#[test]
fn auto_auth_storage_load_uses_file_when_keyring_empty() -> anyhow::Result<()> {
let codex_home = tempdir()?;
let mock_keyring = MockKeyringStore::default();
let storage = AutoAuthStorage::new(codex_home.path().to_path_buf(), Arc::new(mock_keyring));
let expected = auth_with_prefix("file-only");
storage.file_storage.save(&expected)?;
let loaded = storage.load()?;
assert_eq!(loaded, Some(expected));
Ok(())
}
#[test]
fn auto_auth_storage_load_falls_back_when_keyring_errors() -> anyhow::Result<()> {
let codex_home = tempdir()?;
let mock_keyring = MockKeyringStore::default();
let storage = AutoAuthStorage::new(
codex_home.path().to_path_buf(),
Arc::new(mock_keyring.clone()),
);
let key = compute_store_key(codex_home.path())?;
mock_keyring.set_error(&key, KeyringError::Invalid("error".into(), "load".into()));
let expected = auth_with_prefix("fallback");
storage.file_storage.save(&expected)?;
let loaded = storage.load()?;
assert_eq!(loaded, Some(expected));
Ok(())
}
#[test]
fn auto_auth_storage_save_prefers_keyring() -> anyhow::Result<()> {
let codex_home = tempdir()?;
let mock_keyring = MockKeyringStore::default();
let storage = AutoAuthStorage::new(
codex_home.path().to_path_buf(),
Arc::new(mock_keyring.clone()),
);
let key = compute_store_key(codex_home.path())?;
let stale = auth_with_prefix("stale");
storage.file_storage.save(&stale)?;
let expected = auth_with_prefix("to-save");
storage.save(&expected)?;
assert_keyring_saved_auth_and_removed_fallback(
&mock_keyring,
&key,
codex_home.path(),
&expected,
);
Ok(())
}
#[test]
fn auto_auth_storage_save_falls_back_when_keyring_errors() -> anyhow::Result<()> {
let codex_home = tempdir()?;
let mock_keyring = MockKeyringStore::default();
let storage = AutoAuthStorage::new(
codex_home.path().to_path_buf(),
Arc::new(mock_keyring.clone()),
);
let key = compute_store_key(codex_home.path())?;
mock_keyring.set_error(&key, KeyringError::Invalid("error".into(), "save".into()));
let auth = auth_with_prefix("fallback");
storage.save(&auth)?;
let auth_file = get_auth_file(codex_home.path());
assert!(
auth_file.exists(),
"fallback auth.json should be created when keyring save fails"
);
let saved = storage
.file_storage
.load()?
.context("fallback auth should exist")?;
assert_eq!(saved, auth);
assert!(
mock_keyring.saved_value(&key).is_none(),
"keyring should not contain value when save fails"
);
Ok(())
}
#[test]
fn auto_auth_storage_delete_removes_keyring_and_file() -> anyhow::Result<()> {
let codex_home = tempdir()?;
let mock_keyring = MockKeyringStore::default();
let storage = AutoAuthStorage::new(
codex_home.path().to_path_buf(),
Arc::new(mock_keyring.clone()),
);
let (key, auth_file) = seed_keyring_and_fallback_auth_file_for_delete(
&mock_keyring,
codex_home.path(),
|| compute_store_key(codex_home.path()),
)?;
let removed = storage.delete()?;
assert!(removed, "delete should report removal");
assert!(
!mock_keyring.contains(&key),
"keyring entry should be removed"
);
assert!(
!auth_file.exists(),
"fallback auth.json should be removed after delete"
);
Ok(())
}
}
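
The `auto_auth_storage_*` tests above pin down a keyring-first policy with a file fallback. A minimal, self-contained sketch of that policy follows. `MemoryBackend` and `AutoStorage` are hypothetical stand-ins invented for illustration, since the bodies of the real `AutoAuthStorage` and `KeyringAuthStorage` are not shown in this diff. Credentials are reduced to plain strings.

```rust
/// Hypothetical in-memory credential backend (an assumption for this sketch,
/// not a type from the diff). It stands in for both the keyring and auth.json.
#[derive(Default)]
struct MemoryBackend {
    value: Option<String>,
    /// When true, `load` and `save` fail, mimicking an unavailable keyring.
    fail: bool,
}

impl MemoryBackend {
    fn load(&self) -> Result<Option<String>, String> {
        if self.fail {
            return Err("backend unavailable".to_string());
        }
        Ok(self.value.clone())
    }

    fn save(&mut self, value: &str) -> Result<(), String> {
        if self.fail {
            return Err("backend unavailable".to_string());
        }
        self.value = Some(value.to_string());
        Ok(())
    }

    /// Removes the stored value and reports whether anything was removed.
    fn delete(&mut self) -> bool {
        self.value.take().is_some()
    }
}

/// Hypothetical keyring-first storage with a file fallback.
#[derive(Default)]
struct AutoStorage {
    keyring: MemoryBackend,
    file: MemoryBackend,
}

impl AutoStorage {
    /// Prefer the keyring; use the file when the keyring is empty or errors.
    fn load(&self) -> Result<Option<String>, String> {
        match self.keyring.load() {
            Ok(Some(value)) => Ok(Some(value)),
            Ok(None) | Err(_) => self.file.load(),
        }
    }

    /// Save to the keyring and drop the stale file copy; if the keyring
    /// rejects the write, persist to the file instead.
    fn save(&mut self, value: &str) -> Result<(), String> {
        match self.keyring.save(value) {
            Ok(()) => {
                self.file.delete();
                Ok(())
            }
            Err(_) => self.file.save(value),
        }
    }

    /// Clear both locations and report whether either held a value.
    fn delete(&mut self) -> bool {
        let removed_keyring = self.keyring.delete();
        let removed_file = self.file.delete();
        removed_keyring || removed_file
    }
}
```

Under these assumptions, a successful `save` leaves exactly one copy of the credentials (in the keyring), which is the property `assert_keyring_saved_auth_and_removed_fallback` checks in the tests.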


@@ -1,3 +1,4 @@
use std::sync::Arc;
use std::time::Duration;
use crate::ModelProviderInfo;
@@ -12,13 +13,16 @@ use crate::error::Result;
use crate::error::RetryLimitReachedError;
use crate::error::UnexpectedResponseError;
use crate::model_family::ModelFamily;
use crate::protocol::TokenUsage;
use crate::tools::spec::create_tools_json_for_chat_completions_api;
use crate::util::backoff;
use bytes::Bytes;
use codex_otel::otel_event_manager::OtelEventManager;
use codex_protocol::models::ContentItem;
use codex_protocol::models::FunctionCallOutputContentItem;
use codex_protocol::models::ReasoningItemContent;
use codex_protocol::models::ResponseItem;
use codex_utils_tokenizer::Tokenizer;
use eventsource_stream::Eventsource;
use futures::Stream;
use futures::StreamExt;
@@ -33,6 +37,102 @@ use tokio::time::timeout;
use tracing::debug;
use tracing::trace;
struct ChatUsageHeuristic {
tokenizer: Arc<Tokenizer>,
input_tokens: i64,
output_tokens: i64,
reasoning_tokens: i64,
}
impl ChatUsageHeuristic {
fn new(model: &str, messages: &[serde_json::Value]) -> Option<Self> {
let tokenizer = match Tokenizer::for_model(model) {
Ok(tok) => tok,
Err(err) => {
debug!(
"failed to build tokenizer for model {model}; falling back to default: {err:?}"
);
match Tokenizer::try_default() {
Ok(tok) => tok,
Err(fallback_err) => {
debug!(
"failed to fall back to default tokenizer for model {model}: {fallback_err:?}"
);
return None;
}
}
}
};
let tokenizer = Arc::new(tokenizer);
let mut input_tokens =
4_i64.saturating_mul(i64::try_from(messages.len()).unwrap_or(i64::MAX));
for message in messages {
input_tokens =
input_tokens.saturating_add(Self::count_value_tokens(tokenizer.as_ref(), message));
if let Some(tool_calls) = message.get("tool_calls").and_then(|v| v.as_array()) {
input_tokens = input_tokens.saturating_add(
8_i64.saturating_mul(i64::try_from(tool_calls.len()).unwrap_or(i64::MAX)),
);
}
}
Some(Self {
tokenizer,
input_tokens,
output_tokens: 0,
reasoning_tokens: 0,
})
}
fn record_output(&mut self, text: &str) {
if text.is_empty() {
return;
}
self.output_tokens = self
.output_tokens
.saturating_add(self.tokenizer.count(text));
}
fn record_reasoning(&mut self, text: &str) {
if text.is_empty() {
return;
}
self.reasoning_tokens = self
.reasoning_tokens
.saturating_add(self.tokenizer.count(text));
}
fn to_usage(&self) -> TokenUsage {
let total = self
.input_tokens
.saturating_add(self.output_tokens)
.saturating_add(self.reasoning_tokens);
TokenUsage {
input_tokens: self.input_tokens,
cached_input_tokens: 0,
output_tokens: self.output_tokens,
reasoning_output_tokens: self.reasoning_tokens,
total_tokens: total,
}
}
fn count_value_tokens(tokenizer: &Tokenizer, value: &serde_json::Value) -> i64 {
match value {
serde_json::Value::String(s) => tokenizer.count(s),
serde_json::Value::Array(items) => items.iter().fold(0_i64, |acc, item| {
acc.saturating_add(Self::count_value_tokens(tokenizer, item))
}),
serde_json::Value::Object(map) => map.values().fold(0_i64, |acc, item| {
acc.saturating_add(Self::count_value_tokens(tokenizer, item))
}),
_ => 0,
}
}
}
/// Implementation for the classic Chat Completions API.
pub(crate) async fn stream_chat_completions(
prompt: &Prompt,
@@ -76,6 +176,7 @@ pub(crate) async fn stream_chat_completions(
ResponseItem::CustomToolCall { .. } => {}
ResponseItem::CustomToolCallOutput { .. } => {}
ResponseItem::WebSearchCall { .. } => {}
ResponseItem::GhostSnapshot { .. } => {}
}
}
@@ -158,16 +259,26 @@ pub(crate) async fn stream_chat_completions(
for (idx, item) in input.iter().enumerate() {
match item {
ResponseItem::Message { role, content, .. } => {
// Build content either as a plain string (typical for assistant text)
// or as an array of content items when images are present (user/tool multimodal).
let mut text = String::new();
let mut items: Vec<serde_json::Value> = Vec::new();
let mut saw_image = false;
for c in content {
match c {
ContentItem::InputText { text: t }
| ContentItem::OutputText { text: t } => {
text.push_str(t);
items.push(json!({"type":"text","text": t}));
}
ContentItem::InputImage { image_url } => {
saw_image = true;
items.push(json!({"type":"image_url","image_url": {"url": image_url}}));
}
_ => {}
}
}
// Skip exact-duplicate assistant messages.
if role == "assistant" {
if let Some(prev) = &last_assistant_text
@@ -178,7 +289,17 @@ pub(crate) async fn stream_chat_completions(
last_assistant_text = Some(text.clone());
}
let mut msg = json!({"role": role, "content": text});
// For assistant messages, always send a plain string for compatibility.
// For user messages, if an image is present, send an array of content items.
let content_value = if role == "assistant" {
json!(text)
} else if saw_image {
json!(items)
} else {
json!(text)
};
let mut msg = json!({"role": role, "content": content_value});
if role == "assistant"
&& let Some(reasoning) = reasoning_by_anchor_index.get(&idx)
&& let Some(obj) = msg.as_object_mut()
@@ -237,10 +358,29 @@ pub(crate) async fn stream_chat_completions(
messages.push(msg);
}
ResponseItem::FunctionCallOutput { call_id, output } => {
// Prefer structured content items when available (e.g., images)
// otherwise fall back to the legacy plain-string content.
let content_value = if let Some(items) = &output.content_items {
let mapped: Vec<serde_json::Value> = items
.iter()
.map(|it| match it {
FunctionCallOutputContentItem::InputText { text } => {
json!({"type":"text","text": text})
}
FunctionCallOutputContentItem::InputImage { image_url } => {
json!({"type":"image_url","image_url": {"url": image_url}})
}
})
.collect();
json!(mapped)
} else {
json!(output.content)
};
messages.push(json!({
"role": "tool",
"tool_call_id": call_id,
"content": output.content,
"content": content_value,
}));
}
ResponseItem::CustomToolCall {
@@ -270,6 +410,10 @@ pub(crate) async fn stream_chat_completions(
"content": output,
}));
}
ResponseItem::GhostSnapshot { .. } => {
// Ghost snapshots annotate history but are not sent to the model.
continue;
}
ResponseItem::Reasoning { .. }
| ResponseItem::WebSearchCall { .. }
| ResponseItem::Other => {
@@ -280,6 +424,8 @@ pub(crate) async fn stream_chat_completions(
}
let tools_json = create_tools_json_for_chat_completions_api(&prompt.tools)?;
let usage_heuristic = ChatUsageHeuristic::new(model_family.slug.as_str(), &messages);
let payload = json!({
"model": model_family.slug,
"messages": messages,
@@ -323,6 +469,7 @@ pub(crate) async fn stream_chat_completions(
tx_event,
provider.stream_idle_timeout(),
otel_event_manager.clone(),
usage_heuristic,
));
return Ok(ResponseStream { rx_event });
}
@@ -376,6 +523,7 @@ async fn process_chat_sse<S>(
tx_event: mpsc::Sender<Result<ResponseEvent>>,
idle_timeout: Duration,
otel_event_manager: OtelEventManager,
mut usage_heuristic: Option<ChatUsageHeuristic>,
) where
S: Stream<Item = Result<Bytes>> + Unpin,
{
@@ -414,10 +562,11 @@ async fn process_chat_sse<S>(
}
Ok(None) => {
// Stream closed gracefully; emit Completed with a dummy id.
let token_usage = usage_heuristic.as_ref().map(ChatUsageHeuristic::to_usage);
let _ = tx_event
.send(Ok(ResponseEvent::Completed {
response_id: String::new(),
token_usage: None,
token_usage,
}))
.await;
return;
@@ -460,10 +609,11 @@ async fn process_chat_sse<S>(
let _ = tx_event.send(Ok(ResponseEvent::OutputItemDone(item))).await;
}
let token_usage = usage_heuristic.as_ref().map(ChatUsageHeuristic::to_usage);
let _ = tx_event
.send(Ok(ResponseEvent::Completed {
response_id: String::new(),
token_usage: None,
token_usage,
}))
.await;
return;
@@ -487,6 +637,9 @@ async fn process_chat_sse<S>(
&& !content.is_empty()
{
assistant_text.push_str(content);
if let Some(usage) = usage_heuristic.as_mut() {
usage.record_output(content);
}
let _ = tx_event
.send(Ok(ResponseEvent::OutputTextDelta(content.to_string())))
.await;
@@ -520,6 +673,9 @@ async fn process_chat_sse<S>(
if let Some(reasoning) = maybe_text {
// Accumulate so we can emit a terminal Reasoning item at the end.
reasoning_text.push_str(&reasoning);
if let Some(usage) = usage_heuristic.as_mut() {
usage.record_reasoning(&reasoning);
}
let _ = tx_event
.send(Ok(ResponseEvent::ReasoningContentDelta(reasoning)))
.await;
@@ -533,6 +689,9 @@ async fn process_chat_sse<S>(
if let Some(s) = message_reasoning.as_str() {
if !s.is_empty() {
reasoning_text.push_str(s);
if let Some(usage) = usage_heuristic.as_mut() {
usage.record_reasoning(s);
}
let _ = tx_event
.send(Ok(ResponseEvent::ReasoningContentDelta(s.to_string())))
.await;
@@ -545,6 +704,9 @@ async fn process_chat_sse<S>(
&& !s.is_empty()
{
reasoning_text.push_str(s);
if let Some(usage) = usage_heuristic.as_mut() {
usage.record_reasoning(s);
}
let _ = tx_event
.send(Ok(ResponseEvent::ReasoningContentDelta(s.to_string())))
.await;
@@ -563,18 +725,31 @@ async fn process_chat_sse<S>(
// Extract call_id if present.
if let Some(id) = tool_call.get("id").and_then(|v| v.as_str()) {
fn_call_state.call_id.get_or_insert_with(|| id.to_string());
if fn_call_state.call_id.is_none() {
if let Some(usage) = usage_heuristic.as_mut() {
usage.record_output(id);
}
fn_call_state.call_id = Some(id.to_string());
}
}
// Extract function details if present.
if let Some(function) = tool_call.get("function") {
if let Some(name) = function.get("name").and_then(|n| n.as_str()) {
fn_call_state.name.get_or_insert_with(|| name.to_string());
if fn_call_state.name.is_none() {
if let Some(usage) = usage_heuristic.as_mut() {
usage.record_output(name);
}
fn_call_state.name = Some(name.to_string());
}
}
if let Some(args_fragment) = function.get("arguments").and_then(|a| a.as_str())
{
fn_call_state.arguments.push_str(args_fragment);
if let Some(usage) = usage_heuristic.as_mut() {
usage.record_output(args_fragment);
}
}
}
}
@@ -637,10 +812,11 @@ async fn process_chat_sse<S>(
}
// Emit Completed regardless of reason so the agent can advance.
let token_usage = usage_heuristic.as_ref().map(ChatUsageHeuristic::to_usage);
let _ = tx_event
.send(Ok(ResponseEvent::Completed {
response_id: String::new(),
token_usage: None,
token_usage,
}))
.await;
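
The input-side estimate in `ChatUsageHeuristic::new` can be sketched on its own: a flat 4 tokens per message, 8 per tool call, plus the token count of every string value found anywhere in the message JSON, all with saturating arithmetic. To keep the sketch free of external crates, the `Value` enum below is a hypothetical stand-in for `serde_json::Value`, and `count_tokens` is a placeholder that counts whitespace-separated words. The diff itself uses `codex_utils_tokenizer::Tokenizer`, whose counts will differ.

```rust
use std::convert::TryFrom;

/// Stand-in for `serde_json::Value` (assumption: only the shapes needed here).
enum Value {
    Null,
    Str(String),
    Array(Vec<Value>),
    Object(Vec<(String, Value)>),
}

/// Convenience constructor for object values (hypothetical helper).
fn obj(fields: Vec<(&str, Value)>) -> Value {
    Value::Object(
        fields
            .into_iter()
            .map(|(key, value)| (key.to_string(), value))
            .collect(),
    )
}

/// Placeholder tokenizer (assumption): one token per whitespace-separated word.
fn count_tokens(text: &str) -> i64 {
    i64::try_from(text.split_whitespace().count()).unwrap_or(i64::MAX)
}

/// Recursively sum the token counts of all string values; keys are ignored,
/// mirroring `count_value_tokens` in the diff.
fn count_value_tokens(value: &Value) -> i64 {
    match value {
        Value::Str(s) => count_tokens(s),
        Value::Array(items) => items
            .iter()
            .fold(0_i64, |acc, item| acc.saturating_add(count_value_tokens(item))),
        Value::Object(fields) => fields
            .iter()
            .fold(0_i64, |acc, field| acc.saturating_add(count_value_tokens(&field.1))),
        Value::Null => 0,
    }
}

/// Look up a key on an object value.
fn get<'a>(value: &'a Value, key: &str) -> Option<&'a Value> {
    match value {
        Value::Object(fields) => fields
            .iter()
            .find(|field| field.0 == key)
            .map(|field| &field.1),
        _ => None,
    }
}

/// Estimate input tokens: 4 per message, 8 per tool call, plus string content.
fn estimate_input_tokens(messages: &[Value]) -> i64 {
    let mut total = 4_i64.saturating_mul(i64::try_from(messages.len()).unwrap_or(i64::MAX));
    for message in messages {
        total = total.saturating_add(count_value_tokens(message));
        if let Some(Value::Array(calls)) = get(message, "tool_calls") {
            let per_call = i64::try_from(calls.len()).unwrap_or(i64::MAX);
            total = total.saturating_add(8_i64.saturating_mul(per_call));
        }
    }
    total
}
```

Note that the string contents of `tool_calls` are counted twice over in a sense: once through the recursive walk of the message, and again through the fixed per-call overhead. That matches the structure of the loop in the diff, where the overhead is added on top of the walk rather than replacing it.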


@@ -134,6 +134,14 @@ impl ModelClient {
self.stream_with_task_kind(prompt, TaskKind::Regular).await
}
pub fn config(&self) -> Arc<Config> {
Arc::clone(&self.config)
}
pub fn provider(&self) -> &ModelProviderInfo {
&self.provider
}
pub(crate) async fn stream_with_task_kind(
&self,
prompt: &Prompt,
@@ -215,18 +223,14 @@ impl ModelClient {
let input_with_instructions = prompt.get_formatted_input();
let verbosity = match &self.config.model_family.family {
family if family == "gpt-5" => self.config.model_verbosity,
_ => {
if self.config.model_verbosity.is_some() {
warn!(
"model_verbosity is set but ignored for non-gpt-5 model family: {}",
self.config.model_family.family
);
}
None
}
let verbosity = if self.config.model_family.support_verbosity {
self.config.model_verbosity
} else {
warn!(
"model_verbosity is set but ignored as the model does not support verbosity: {}",
self.config.model_family.family
);
None
};
// Only include `text.verbosity` for GPT-5 family models
@@ -381,9 +385,14 @@ impl ModelClient {
if status == StatusCode::UNAUTHORIZED
&& let Some(manager) = auth_manager.as_ref()
&& manager.auth().is_some()
&& let Some(auth) = auth.as_ref()
&& auth.mode == AuthMode::ChatGPT
{
let _ = manager.refresh_token().await;
manager.refresh_token().await.map_err(|err| {
StreamAttemptError::Fatal(CodexErr::Fatal(format!(
"Failed to refresh ChatGPT credentials: {err}"
)))
})?;
}
// The OpenAI Responses endpoint returns structured JSON bodies even for 4xx/5xx
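
The verbosity hunk in this file replaces the `gpt-5` family match with a `support_verbosity` flag. As written, the new `else` branch logs "model_verbosity is set but ignored" without checking whether a value was actually configured, whereas the removed code checked `is_some()` first. The sketch below is one possible variation that keeps that check; it is not what the diff does. `Verbosity`, `ModelFamily`, and `resolve_verbosity` are hypothetical simplifications, and the warning is returned as a string instead of being logged through `tracing::warn!`.

```rust
/// Hypothetical stand-in for the configured verbosity level.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Verbosity {
    Low,
    Medium,
    High,
}

/// Hypothetical, reduced view of the model family used in the diff.
struct ModelFamily {
    family: String,
    support_verbosity: bool,
}

/// Returns the verbosity to send plus an optional warning message.
/// The warning is produced only when a value was set and then ignored.
fn resolve_verbosity(
    family: &ModelFamily,
    requested: Option<Verbosity>,
) -> (Option<Verbosity>, Option<String>) {
    if family.support_verbosity {
        (requested, None)
    } else if requested.is_some() {
        let warning = format!(
            "model_verbosity is set but ignored as the model does not support verbosity: {}",
            family.family
        );
        (None, Some(warning))
    } else {
        // Nothing was configured, so there is nothing to warn about.
        (None, None)
    }
}
```

Returning the warning instead of logging it is purely a convenience for the sketch: it makes the three cases easy to check without a logging backend.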


@@ -20,7 +20,6 @@ use async_channel::Sender;
use codex_apply_patch::ApplyPatchAction;
use codex_protocol::ConversationId;
use codex_protocol::items::TurnItem;
use codex_protocol::protocol::ConversationPathResponseEvent;
use codex_protocol::protocol::ExitedReviewModeEvent;
use codex_protocol::protocol::ItemCompletedEvent;
use codex_protocol::protocol::ItemStartedEvent;
@@ -88,6 +87,7 @@ use crate::protocol::Op;
use crate::protocol::RateLimitSnapshot;
use crate::protocol::ReviewDecision;
use crate::protocol::ReviewOutputEvent;
use crate::protocol::SandboxCommandAssessment;
use crate::protocol::SandboxPolicy;
use crate::protocol::SessionConfiguredEvent;
use crate::protocol::StreamErrorEvent;
@@ -104,8 +104,12 @@ use crate::state::SessionServices;
use crate::state::SessionState;
use crate::state::TaskKind;
use crate::tasks::CompactTask;
use crate::tasks::GhostSnapshotTask;
use crate::tasks::RegularTask;
use crate::tasks::ReviewTask;
use crate::tasks::SessionTask;
use crate::tasks::SessionTaskContext;
use crate::tasks::UndoTask;
use crate::tools::ToolRouter;
use crate::tools::context::SharedTurnDiffTracker;
use crate::tools::parallel::ToolCallRuntime;
@@ -128,6 +132,8 @@ use codex_protocol::models::ResponseInputItem;
use codex_protocol::models::ResponseItem;
use codex_protocol::protocol::InitialHistory;
use codex_protocol::user_input::UserInput;
use codex_utils_readiness::Readiness;
use codex_utils_readiness::ReadinessFlag;
pub mod compact;
use self::compact::build_compacted_history;
@@ -178,6 +184,7 @@ impl Codex {
sandbox_policy: config.sandbox_policy.clone(),
cwd: config.cwd.clone(),
original_config_do_not_use: Arc::clone(&config),
features: config.features.clone(),
};
// Generate a unique ID for the lifetime of this Codex session.
@@ -271,6 +278,7 @@ pub(crate) struct TurnContext {
pub(crate) is_review_mode: bool,
pub(crate) final_output_json_schema: Option<Value>,
pub(crate) codex_linux_sandbox_exe: Option<PathBuf>,
pub(crate) tool_call_gate: Arc<ReadinessFlag>,
}
impl TurnContext {
@@ -312,6 +320,9 @@ pub(crate) struct SessionConfiguration {
/// operate deterministically.
cwd: PathBuf,
/// Set of feature flags for this session
features: Features,
// TODO(pakrym): Remove config from here
original_config_do_not_use: Arc<Config>,
}
@@ -406,6 +417,7 @@ impl Session {
is_review_mode: false,
final_output_json_schema: None,
codex_linux_sandbox_exe: config.codex_linux_sandbox_exe.clone(),
tool_call_gate: Arc::new(ReadinessFlag::new()),
}
}
@@ -569,7 +581,6 @@ impl Session {
// Dispatch the SessionConfiguredEvent first and then report any errors.
// If resuming, include converted initial messages in the payload so UIs can render them immediately.
let initial_messages = initial_history.get_event_msgs();
sess.record_initial_history(initial_history).await;
let events = std::iter::once(Event {
id: INITIAL_SUBMIT_ID.to_owned(),
@@ -588,6 +599,9 @@ impl Session {
sess.send_event_raw(event).await;
}
// record_initial_history can emit events. We record only after the SessionConfiguredEvent is emitted.
sess.record_initial_history(initial_history).await;
Ok(sess)
}
@@ -595,6 +609,19 @@ impl Session {
self.tx_event.clone()
}
/// Ensure all rollout writes are durably flushed.
pub(crate) async fn flush_rollout(&self) {
let recorder = {
let guard = self.services.rollout.lock().await;
guard.clone()
};
if let Some(rec) = recorder
&& let Err(e) = rec.flush().await
{
warn!("failed to flush rollout recorder: {e}");
}
}
fn next_internal_sub_id(&self) -> String {
let id = self
.next_internal_sub_id
@@ -608,7 +635,9 @@ impl Session {
InitialHistory::New => {
// Build and record initial items (user instructions + environment context)
let items = self.build_initial_context(&turn_context);
self.record_conversation_items(&items).await;
self.record_conversation_items(&turn_context, &items).await;
// Ensure initial items are visible to immediate readers (e.g., tests, forks).
self.flush_rollout().await;
}
InitialHistory::Resumed(_) | InitialHistory::Forked(_) => {
let rollout_items = conversation_history.get_rollout_items();
@@ -625,6 +654,8 @@ impl Session {
if persist && !rollout_items.is_empty() {
self.persist_rollout_items(&rollout_items).await;
}
// Flush after seeding history and any persisted rollout copy.
self.flush_rollout().await;
}
}
}
@@ -755,6 +786,32 @@ impl Session {
}
}
pub(crate) async fn assess_sandbox_command(
&self,
turn_context: &TurnContext,
call_id: &str,
command: &[String],
failure_message: Option<&str>,
) -> Option<SandboxCommandAssessment> {
let config = turn_context.client.config();
let provider = turn_context.client.provider().clone();
let auth_manager = Arc::clone(&self.services.auth_manager);
let otel = self.services.otel_event_manager.clone();
crate::sandboxing::assessment::assess_command(
config,
provider,
auth_manager,
&otel,
self.conversation_id,
call_id,
command,
&turn_context.sandbox_policy,
&turn_context.cwd,
failure_message,
)
.await
}
/// Emit an exec approval request event and await the user's decision.
///
/// The request is keyed by `sub_id`/`call_id` so matching responses are delivered
@@ -767,6 +824,7 @@ impl Session {
command: Vec<String>,
cwd: PathBuf,
reason: Option<String>,
risk: Option<SandboxCommandAssessment>,
) -> ReviewDecision {
let sub_id = turn_context.sub_id.clone();
// Add the tx_approve callback to the map before sending the request.
@@ -792,6 +850,7 @@ impl Session {
command,
cwd,
reason,
risk,
parsed_cmd,
});
self.send_event(turn_context, event).await;
@@ -857,9 +916,14 @@ impl Session {
/// Records input items: always append to conversation history and
/// persist these response items to rollout.
pub(crate) async fn record_conversation_items(&self, items: &[ResponseItem]) {
pub(crate) async fn record_conversation_items(
&self,
turn_context: &TurnContext,
items: &[ResponseItem],
) {
self.record_into_history(items).await;
self.persist_rollout_response_items(items).await;
self.send_raw_response_items(turn_context, items).await;
}
fn reconstruct_history_from_rollout(
@@ -895,7 +959,7 @@ impl Session {
state.record_items(items.iter());
}
async fn replace_history(&self, items: Vec<ResponseItem>) {
pub(crate) async fn replace_history(&self, items: Vec<ResponseItem>) {
let mut state = self.state.lock().await;
state.replace_history(items);
}
@@ -909,6 +973,13 @@ impl Session {
self.persist_rollout_items(&rollout_items).await;
}
async fn send_raw_response_items(&self, turn_context: &TurnContext, items: &[ResponseItem]) {
for item in items {
self.send_event(turn_context, EventMsg::RawResponseItem(item.clone()))
.await;
}
}
pub(crate) fn build_initial_context(&self, turn_context: &TurnContext) -> Vec<ResponseItem> {
let mut items = Vec::<ResponseItem>::with_capacity(2);
if let Some(user_instructions) = turn_context.user_instructions.as_deref() {
@@ -1004,7 +1075,7 @@ impl Session {
) {
let response_item: ResponseItem = response_input.clone().into();
// Add to conversation history and persist response item to rollout
self.record_conversation_items(std::slice::from_ref(&response_item))
self.record_conversation_items(turn_context, std::slice::from_ref(&response_item))
.await;
// Derive user message events and persist only UserMessage to rollout
@@ -1037,6 +1108,43 @@ impl Session {
self.send_event(turn_context, event).await;
}
async fn maybe_start_ghost_snapshot(
self: &Arc<Self>,
turn_context: Arc<TurnContext>,
cancellation_token: CancellationToken,
) {
if turn_context.is_review_mode
|| !self
.state
.lock()
.await
.session_configuration
.features
.enabled(Feature::GhostCommit)
{
return;
}
let token = match turn_context.tool_call_gate.subscribe().await {
Ok(token) => token,
Err(err) => {
warn!("failed to subscribe to ghost snapshot readiness: {err}");
return;
}
};
info!("spawning ghost snapshot task");
let task = GhostSnapshotTask::new(token);
Arc::new(task)
.run(
Arc::new(SessionTaskContext::new(self.clone())),
turn_context.clone(),
Vec::new(),
cancellation_token,
)
.await;
}
/// Returns the input if there was no task running to inject into
pub async fn inject_input(&self, input: Vec<UserInput>) -> Result<(), Vec<UserInput>> {
let mut active = self.active_turn.lock().await;
@@ -1195,8 +1303,11 @@ async fn submission_loop(sess: Arc<Session>, config: Arc<Config>, rx_sub: Receiv
if let Some(env_item) = sess
.build_environment_update_item(previous_context.as_ref(), &current_context)
{
sess.record_conversation_items(std::slice::from_ref(&env_item))
.await;
sess.record_conversation_items(
&current_context,
std::slice::from_ref(&env_item),
)
.await;
}
sess.spawn_task(Arc::clone(&current_context), items, RegularTask)
@@ -1310,6 +1421,13 @@ async fn submission_loop(sess: Arc<Session>, config: Arc<Config>, rx_sub: Receiv
};
sess.send_event_raw(event).await;
}
Op::Undo => {
let turn_context = sess
.new_turn_with_sub_id(sub.id.clone(), SessionSettingsUpdate::default())
.await;
sess.spawn_task(turn_context, Vec::new(), UndoTask::new())
.await;
}
Op::Compact => {
let turn_context = sess
.new_turn_with_sub_id(sub.id.clone(), SessionSettingsUpdate::default())
@@ -1355,33 +1473,7 @@ async fn submission_loop(sess: Arc<Session>, config: Arc<Config>, rx_sub: Receiv
sess.send_event_raw(event).await;
break;
}
Op::GetPath => {
let sub_id = sub.id.clone();
// Flush rollout writes before returning the path so readers observe a consistent file.
let (path, rec_opt) = {
let guard = sess.services.rollout.lock().await;
match guard.as_ref() {
Some(rec) => (rec.get_rollout_path(), Some(rec.clone())),
None => {
error!("rollout recorder not found");
continue;
}
}
};
if let Some(rec) = rec_opt
&& let Err(e) = rec.flush().await
{
warn!("failed to flush rollout recorder before GetHistory: {e}");
}
let event = Event {
id: sub_id.clone(),
msg: EventMsg::ConversationPath(ConversationPathResponseEvent {
conversation_id: sess.conversation_id,
path,
}),
};
sess.send_event_raw(event).await;
}
Op::Review { review_request } => {
let turn_context = sess
.new_turn_with_sub_id(sub.id.clone(), SessionSettingsUpdate::default())
@@ -1472,6 +1564,7 @@ async fn spawn_review_thread(
is_review_mode: true,
final_output_json_schema: None,
codex_linux_sandbox_exe: parent_turn_context.codex_linux_sandbox_exe.clone(),
tool_call_gate: Arc::new(ReadinessFlag::new()),
};
// Seed the child task with the review prompt as the initial user message.
@@ -1535,6 +1628,8 @@ pub(crate) async fn run_task(
.await;
}
sess.maybe_start_ghost_snapshot(Arc::clone(&turn_context), cancellation_token.child_token())
.await;
let mut last_agent_message: Option<String> = None;
// Although from the perspective of codex.rs, TurnDiffTracker has the lifecycle of a Task which contains
// many turns, from the perspective of the user, it is a single turn.
@@ -1568,7 +1663,8 @@ pub(crate) async fn run_task(
}
review_thread_history.get_history()
} else {
sess.record_conversation_items(&pending_input).await;
sess.record_conversation_items(&turn_context, &pending_input)
.await;
sess.history_snapshot().await
};
@@ -1615,6 +1711,7 @@ pub(crate) async fn run_task(
is_review_mode,
&mut review_thread_history,
&sess,
&turn_context,
)
.await;
@@ -1663,6 +1760,7 @@ pub(crate) async fn run_task(
is_review_mode,
&mut review_thread_history,
&sess,
&turn_context,
)
.await;
// Aborted turn is reported via a different event.
@@ -1724,6 +1822,13 @@ fn parse_review_output_event(text: &str) -> ReviewOutputEvent {
}
}
fn filter_model_visible_history(input: Vec<ResponseItem>) -> Vec<ResponseItem> {
input
.into_iter()
.filter(|item| !matches!(item, ResponseItem::GhostSnapshot { .. }))
.collect()
}
async fn run_turn(
sess: Arc<Session>,
turn_context: Arc<TurnContext>,
@@ -1744,7 +1849,7 @@ async fn run_turn(
.supports_parallel_tool_calls;
let parallel_tool_calls = model_supports_parallel;
let prompt = Prompt {
input,
input: filter_model_visible_history(input),
tools: router.specs(),
parallel_tool_calls,
base_instructions_override: turn_context.base_instructions.clone(),
@@ -1806,7 +1911,7 @@ async fn run_turn(
// at a seemingly frozen screen.
sess.notify_stream_error(
turn_context.as_ref(),
format!("Re-connecting... {retries}/{max_retries}"),
format!("Reconnecting... {retries}/{max_retries}"),
)
.await;
@@ -1942,7 +2047,7 @@ async fn try_run_turn(
call_id: String::new(),
output: FunctionCallOutputPayload {
content: msg.to_string(),
success: None,
..Default::default()
},
};
add_completed(ProcessedResponseItem {
@@ -1956,7 +2061,7 @@ async fn try_run_turn(
call_id: String::new(),
output: FunctionCallOutputPayload {
content: message,
success: None,
..Default::default()
},
};
add_completed(ProcessedResponseItem {
@@ -2094,41 +2199,6 @@ pub(super) fn get_last_assistant_message_from_turn(responses: &[ResponseItem]) -
}
})
}
pub(crate) fn convert_call_tool_result_to_function_call_output_payload(
call_tool_result: &CallToolResult,
) -> FunctionCallOutputPayload {
let CallToolResult {
content,
is_error,
structured_content,
} = call_tool_result;
// In terms of what to send back to the model, we prefer structured_content,
// if available, and fallback to content, otherwise.
let mut is_success = is_error != &Some(true);
let content = if let Some(structured_content) = structured_content
&& structured_content != &serde_json::Value::Null
&& let Ok(serialized_structured_content) = serde_json::to_string(&structured_content)
{
serialized_structured_content
} else {
match serde_json::to_string(&content) {
Ok(serialized_content) => serialized_content,
Err(err) => {
// If we could not serialize either content or structured_content to
// JSON, flag this as an error.
is_success = false;
err.to_string()
}
}
};
FunctionCallOutputPayload {
content,
success: Some(is_success),
}
}
/// Emits an ExitedReviewMode Event with optional ReviewOutput,
/// and records a developer message with the review output.
pub(crate) async fn exit_review_mode(
@@ -2173,12 +2243,17 @@ pub(crate) async fn exit_review_mode(
}
session
.record_conversation_items(&[ResponseItem::Message {
id: None,
role: "user".to_string(),
content: vec![ContentItem::InputText { text: user_message }],
}])
.record_conversation_items(
&turn_context,
&[ResponseItem::Message {
id: None,
role: "user".to_string(),
content: vec![ContentItem::InputText { text: user_message }],
}],
)
.await;
// Make the recorded review note visible immediately for readers.
session.flush_rollout().await;
}
fn mcp_init_error_display(
@@ -2234,6 +2309,8 @@ fn is_mcp_client_startup_timeout_error(error: &anyhow::Error) -> bool {
|| error_message.contains("timed out handshaking with MCP server")
}
use crate::features::Feature;
use crate::features::Features;
#[cfg(test)]
pub(crate) use tests::make_session_and_context;
@@ -2254,10 +2331,6 @@ mod tests {
use crate::state::TaskKind;
use crate::tasks::SessionTask;
use crate::tasks::SessionTaskContext;
use crate::tools::MODEL_FORMAT_HEAD_LINES;
use crate::tools::MODEL_FORMAT_MAX_BYTES;
use crate::tools::MODEL_FORMAT_MAX_LINES;
use crate::tools::MODEL_FORMAT_TAIL_LINES;
use crate::tools::ToolRouter;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
@@ -2331,7 +2404,7 @@ mod tests {
})),
};
let got = convert_call_tool_result_to_function_call_output_payload(&ctr);
let got = FunctionCallOutputPayload::from(&ctr);
let expected = FunctionCallOutputPayload {
content: serde_json::to_string(&json!({
"ok": true,
@@ -2339,100 +2412,12 @@ mod tests {
}))
.unwrap(),
success: Some(true),
..Default::default()
};
assert_eq!(expected, got);
}
#[test]
fn model_truncation_head_tail_by_lines() {
// Build 400 short lines so line-count limit, not byte budget, triggers truncation
let lines: Vec<String> = (1..=400).map(|i| format!("line{i}")).collect();
let full = lines.join("\n");
let exec = ExecToolCallOutput {
exit_code: 0,
stdout: StreamOutput::new(String::new()),
stderr: StreamOutput::new(String::new()),
aggregated_output: StreamOutput::new(full),
duration: StdDuration::from_secs(1),
timed_out: false,
};
let out = format_exec_output_str(&exec);
// Strip truncation header if present for subsequent assertions
let body = out
.strip_prefix("Total output lines: ")
.and_then(|rest| rest.split_once("\n\n").map(|x| x.1))
.unwrap_or(out.as_str());
// Expect elision marker with correct counts
let omitted = 400 - MODEL_FORMAT_MAX_LINES; // 144
let marker = format!("\n[... omitted {omitted} of 400 lines ...]\n\n");
assert!(out.contains(&marker), "missing marker: {out}");
// Validate head and tail
let parts: Vec<&str> = body.split(&marker).collect();
assert_eq!(parts.len(), 2, "expected one marker split");
let head = parts[0];
let tail = parts[1];
let expected_head: String = (1..=MODEL_FORMAT_HEAD_LINES)
.map(|i| format!("line{i}"))
.collect::<Vec<_>>()
.join("\n");
assert!(head.starts_with(&expected_head), "head mismatch");
let expected_tail: String = ((400 - MODEL_FORMAT_TAIL_LINES + 1)..=400)
.map(|i| format!("line{i}"))
.collect::<Vec<_>>()
.join("\n");
assert!(tail.ends_with(&expected_tail), "tail mismatch");
}
#[test]
fn model_truncation_respects_byte_budget() {
// Construct a large output (about 100kB) so byte budget dominates
let big_line = "x".repeat(100);
let full = std::iter::repeat_n(big_line, 1000)
.collect::<Vec<_>>()
.join("\n");
let exec = ExecToolCallOutput {
exit_code: 0,
stdout: StreamOutput::new(String::new()),
stderr: StreamOutput::new(String::new()),
aggregated_output: StreamOutput::new(full.clone()),
duration: StdDuration::from_secs(1),
timed_out: false,
};
let out = format_exec_output_str(&exec);
// Keep strict budget on the truncated body (excluding header)
let body = out
.strip_prefix("Total output lines: ")
.and_then(|rest| rest.split_once("\n\n").map(|x| x.1))
.unwrap_or(out.as_str());
assert!(body.len() <= MODEL_FORMAT_MAX_BYTES, "exceeds byte budget");
assert!(out.contains("omitted"), "should contain elision marker");
// Ensure head and tail are drawn from the original
assert!(full.starts_with(body.chars().take(8).collect::<String>().as_str()));
assert!(
full.ends_with(
body.chars()
.rev()
.take(8)
.collect::<String>()
.chars()
.rev()
.collect::<String>()
.as_str()
)
);
}
#[test]
fn includes_timed_out_message() {
let exec = ExecToolCallOutput {
@@ -2460,11 +2445,12 @@ mod tests {
structured_content: Some(serde_json::Value::Null),
};
let got = convert_call_tool_result_to_function_call_output_payload(&ctr);
let got = FunctionCallOutputPayload::from(&ctr);
let expected = FunctionCallOutputPayload {
content: serde_json::to_string(&vec![text_block("hello"), text_block("world")])
.unwrap(),
success: Some(true),
..Default::default()
};
assert_eq!(expected, got);
@@ -2478,10 +2464,11 @@ mod tests {
structured_content: Some(json!({ "message": "bad" })),
};
let got = convert_call_tool_result_to_function_call_output_payload(&ctr);
let got = FunctionCallOutputPayload::from(&ctr);
let expected = FunctionCallOutputPayload {
content: serde_json::to_string(&json!({ "message": "bad" })).unwrap(),
success: Some(false),
..Default::default()
};
assert_eq!(expected, got);
@@ -2495,10 +2482,11 @@ mod tests {
structured_content: None,
};
let got = convert_call_tool_result_to_function_call_output_payload(&ctr);
let got = FunctionCallOutputPayload::from(&ctr);
let expected = FunctionCallOutputPayload {
content: serde_json::to_string(&vec![text_block("alpha")]).unwrap(),
success: Some(true),
..Default::default()
};
assert_eq!(expected, got);
@@ -2550,6 +2538,7 @@ mod tests {
sandbox_policy: config.sandbox_policy.clone(),
cwd: config.cwd.clone(),
original_config_do_not_use: Arc::clone(&config),
features: Features::default(),
};
let state = SessionState::new(session_configuration.clone());
@@ -2618,6 +2607,7 @@ mod tests {
sandbox_policy: config.sandbox_policy.clone(),
cwd: config.cwd.clone(),
original_config_do_not_use: Arc::clone(&config),
features: Features::default(),
};
let state = SessionState::new(session_configuration.clone());
@@ -2772,13 +2762,19 @@ mod tests {
EventMsg::ExitedReviewMode(ev) => assert!(ev.review_output.is_none()),
other => panic!("unexpected first event: {other:?}"),
}
let second = tokio::time::timeout(std::time::Duration::from_secs(2), rx.recv())
.await
.expect("timeout waiting for second event")
.expect("second event");
match second.msg {
EventMsg::TurnAborted(e) => assert_eq!(TurnAbortReason::Interrupted, e.reason),
other => panic!("unexpected second event: {other:?}"),
loop {
let evt = tokio::time::timeout(std::time::Duration::from_secs(2), rx.recv())
.await
.expect("timeout waiting for next event")
.expect("event");
match evt.msg {
EventMsg::RawResponseItem(_) => continue,
EventMsg::TurnAborted(e) => {
assert_eq!(TurnAbortReason::Interrupted, e.reason);
break;
}
other => panic!("unexpected second event: {other:?}"),
}
}
let history = sess.history_snapshot().await;


@@ -2,6 +2,7 @@ use std::sync::Arc;
use super::Session;
use super::TurnContext;
use super::filter_model_visible_history;
use super::get_last_assistant_message_from_turn;
use crate::Prompt;
use crate::client_common::ResponseEvent;
@@ -86,8 +87,9 @@ async fn run_compact_task_inner(
loop {
let turn_input = history.get_history();
let prompt_input = filter_model_visible_history(turn_input.clone());
let prompt = Prompt {
input: turn_input.clone(),
input: prompt_input.clone(),
..Default::default()
};
let attempt_result = drain_to_completed(&sess, turn_context.as_ref(), &prompt).await;
@@ -109,7 +111,7 @@ async fn run_compact_task_inner(
return;
}
Err(e @ CodexErr::ContextWindowExceeded) => {
if turn_input.len() > 1 {
if prompt_input.len() > 1 {
// Trim from the beginning to preserve cache (prefix-based) and keep recent messages intact.
error!(
"Context window exceeded while compacting; removing oldest history item. Error: {e}"
@@ -132,7 +134,7 @@ async fn run_compact_task_inner(
let delay = backoff(retries);
sess.notify_stream_error(
turn_context.as_ref(),
format!("Re-connecting... {retries}/{max_retries}"),
format!("Reconnecting... {retries}/{max_retries}"),
)
.await;
tokio::time::sleep(delay).await;
@@ -152,7 +154,13 @@ async fn run_compact_task_inner(
let summary_text = get_last_assistant_message_from_turn(&history_snapshot).unwrap_or_default();
let user_messages = collect_user_messages(&history_snapshot);
let initial_context = sess.build_initial_context(turn_context.as_ref());
let new_history = build_compacted_history(initial_context, &user_messages, &summary_text);
let mut new_history = build_compacted_history(initial_context, &user_messages, &summary_text);
let ghost_snapshots: Vec<ResponseItem> = history_snapshot
.iter()
.filter(|item| matches!(item, ResponseItem::GhostSnapshot { .. }))
.cloned()
.collect();
new_history.extend(ghost_snapshots);
sess.replace_history(new_history).await;
let rollout_item = RolloutItem::Compacted(CompactedItem {
@@ -200,7 +208,20 @@ pub(crate) fn build_compacted_history(
user_messages: &[String],
summary_text: &str,
) -> Vec<ResponseItem> {
let mut history = initial_context;
build_compacted_history_with_limit(
initial_context,
user_messages,
summary_text,
COMPACT_USER_MESSAGE_MAX_TOKENS * 4,
)
}
fn build_compacted_history_with_limit(
mut history: Vec<ResponseItem>,
user_messages: &[String],
summary_text: &str,
max_bytes: usize,
) -> Vec<ResponseItem> {
let mut user_messages_text = if user_messages.is_empty() {
"(none)".to_string()
} else {
@@ -208,7 +229,6 @@ pub(crate) fn build_compacted_history(
};
// Truncate the concatenated prior user messages so the bridge message
// stays well under the context window (approx. 4 bytes/token).
let max_bytes = COMPACT_USER_MESSAGE_MAX_TOKENS * 4;
if user_messages_text.len() > max_bytes {
user_messages_text = truncate_middle(&user_messages_text, max_bytes).0;
}
@@ -361,11 +381,16 @@ mod tests {
#[test]
fn build_compacted_history_truncates_overlong_user_messages() {
// Prepare a very large prior user message so the aggregated
// `user_messages_text` exceeds the truncation threshold used by
// `build_compacted_history` (80k bytes).
let big = "X".repeat(200_000);
let history = build_compacted_history(Vec::new(), std::slice::from_ref(&big), "SUMMARY");
// Use a small truncation limit so the test remains fast while still validating
// that oversized user content is truncated.
let max_bytes = 128;
let big = "X".repeat(max_bytes + 50);
let history = super::build_compacted_history_with_limit(
Vec::new(),
std::slice::from_ref(&big),
"SUMMARY",
max_bytes,
);
// Expect exactly one bridge message added to history (plus any initial context we provided, which is none).
assert_eq!(history.len(), 1);
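The refactor above threads the byte limit through as a parameter so the test can use a tiny budget instead of a 200 kB string. A minimal sketch of that pattern follows, under several assumptions: `MAX_TOKENS`, `bridge_text`, `bridge_text_with_limit`, and `truncate_middle_sketch` are hypothetical names, the 20,000-token value is inferred from the removed "80k bytes" comment, the `"\n\n"` join separator is a guess, and the stand-in truncation helper is not the repository's `truncate_middle`.

```rust
// Assumed value: 20_000 tokens * 4 bytes/token matches the removed "80k bytes" comment.
const MAX_TOKENS: usize = 20_000;

/// Hypothetical stand-in for `truncate_middle`: keeps roughly half the budget
/// from each end, cutting only at UTF-8 character boundaries.
/// The "[...]" marker is added on top of the budget.
fn truncate_middle_sketch(s: &str, max_bytes: usize) -> String {
    if s.len() <= max_bytes {
        return s.to_string();
    }
    let half = max_bytes / 2;
    let mut head_end = half;
    while !s.is_char_boundary(head_end) {
        head_end -= 1;
    }
    let mut tail_start = s.len() - half;
    while !s.is_char_boundary(tail_start) {
        tail_start += 1;
    }
    format!("{}[...]{}", &s[..head_end], &s[tail_start..])
}

/// Testable core: the caller picks the byte limit.
fn bridge_text_with_limit(user_messages: &[String], max_bytes: usize) -> String {
    if user_messages.is_empty() {
        return "(none)".to_string();
    }
    truncate_middle_sketch(&user_messages.join("\n\n"), max_bytes)
}

/// Production wrapper: approximately 4 bytes per token.
fn bridge_text(user_messages: &[String]) -> String {
    bridge_text_with_limit(user_messages, MAX_TOKENS * 4)
}

fn main() {
    let big = "X".repeat(128 + 50);
    let out = bridge_text_with_limit(std::slice::from_ref(&big), 128);
    println!("{} -> {} bytes", big.len(), out.len());
    println!("{}", bridge_text(&["hello".to_string()]));
}
```

Keeping the public function as a thin wrapper leaves existing call sites unchanged while the limit-taking variant stays private to the module and its tests.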


@@ -3,16 +3,21 @@ use crate::error::Result as CodexResult;
use crate::protocol::Event;
use crate::protocol::Op;
use crate::protocol::Submission;
use std::path::PathBuf;
pub struct CodexConversation {
codex: Codex,
rollout_path: PathBuf,
}
/// Conduit for the bidirectional stream of messages that compose a conversation
/// in Codex.
impl CodexConversation {
pub(crate) fn new(codex: Codex) -> Self {
Self { codex }
pub(crate) fn new(codex: Codex, rollout_path: PathBuf) -> Self {
Self {
codex,
rollout_path,
}
}
pub async fn submit(&self, op: Op) -> CodexResult<String> {
@@ -27,4 +32,8 @@ impl CodexConversation {
pub async fn next_event(&self) -> CodexResult<Event> {
self.codex.next_event().await
}
pub fn rollout_path(&self) -> PathBuf {
self.rollout_path.clone()
}
}


@@ -223,6 +223,9 @@ pub struct Config {
pub tools_web_search_request: bool,
/// When `true`, run a model-based assessment for commands denied by the sandbox.
pub experimental_sandbox_command_assessment: bool,
pub use_experimental_streamable_shell_tool: bool,
/// If set to `true`, use only the experimental unified exec tool.
@@ -958,6 +961,7 @@ pub struct ConfigToml {
pub experimental_use_unified_exec_tool: Option<bool>,
pub experimental_use_rmcp_client: Option<bool>,
pub experimental_use_freeform_apply_patch: Option<bool>,
pub experimental_sandbox_command_assessment: Option<bool>,
}
impl From<ConfigToml> for UserSavedConfig {
@@ -1023,9 +1027,11 @@ impl ConfigToml {
fn derive_sandbox_policy(
&self,
sandbox_mode_override: Option<SandboxMode>,
profile_sandbox_mode: Option<SandboxMode>,
resolved_cwd: &Path,
) -> SandboxPolicy {
let resolved_sandbox_mode = sandbox_mode_override
.or(profile_sandbox_mode)
.or(self.sandbox_mode)
.or_else(|| {
// if no sandbox_mode is set, but user has marked directory as trusted, use WorkspaceWrite
@@ -1118,6 +1124,7 @@ pub struct ConfigOverrides {
pub include_view_image_tool: Option<bool>,
pub show_raw_agent_reasoning: Option<bool>,
pub tools_web_search_request: Option<bool>,
pub experimental_sandbox_command_assessment: Option<bool>,
/// Additional directories that should be treated as writable roots for this session.
pub additional_writable_roots: Vec<PathBuf>,
}
@@ -1147,6 +1154,7 @@ impl Config {
include_view_image_tool: include_view_image_tool_override,
show_raw_agent_reasoning,
tools_web_search_request: override_tools_web_search_request,
experimental_sandbox_command_assessment: sandbox_command_assessment_override,
additional_writable_roots,
} = overrides;
@@ -1172,6 +1180,7 @@ impl Config {
include_apply_patch_tool: include_apply_patch_tool_override,
include_view_image_tool: include_view_image_tool_override,
web_search_request: override_tools_web_search_request,
experimental_sandbox_command_assessment: sandbox_command_assessment_override,
};
let features = Features::from_config(&cfg, &config_profile, feature_overrides);
@@ -1212,7 +1221,8 @@ impl Config {
.get_active_project(&resolved_cwd)
.unwrap_or(ProjectConfig { trust_level: None });
let mut sandbox_policy = cfg.derive_sandbox_policy(sandbox_mode, &resolved_cwd);
let mut sandbox_policy =
cfg.derive_sandbox_policy(sandbox_mode, config_profile.sandbox_mode, &resolved_cwd);
if let SandboxPolicy::WorkspaceWrite { writable_roots, .. } = &mut sandbox_policy {
for path in additional_writable_roots {
if !writable_roots.iter().any(|existing| existing == &path) {
@@ -1235,8 +1245,8 @@ impl Config {
.is_some()
|| config_profile.approval_policy.is_some()
|| cfg.approval_policy.is_some()
// TODO(#3034): profile.sandbox_mode is not implemented
|| sandbox_mode.is_some()
|| config_profile.sandbox_mode.is_some()
|| cfg.sandbox_mode.is_some();
let mut model_providers = built_in_model_providers();
@@ -1269,6 +1279,8 @@ impl Config {
let use_experimental_streamable_shell_tool = features.enabled(Feature::StreamableShell);
let use_experimental_unified_exec_tool = features.enabled(Feature::UnifiedExec);
let use_experimental_use_rmcp_client = features.enabled(Feature::RmcpClient);
let experimental_sandbox_command_assessment =
features.enabled(Feature::SandboxCommandAssessment);
let forced_chatgpt_workspace_id =
cfg.forced_chatgpt_workspace_id.as_ref().and_then(|value| {
@@ -1390,6 +1402,7 @@ impl Config {
forced_login_method,
include_apply_patch_tool: include_apply_patch_tool_flag,
tools_web_search_request,
experimental_sandbox_command_assessment,
use_experimental_streamable_shell_tool,
use_experimental_unified_exec_tool,
use_experimental_use_rmcp_client,
@@ -1593,8 +1606,11 @@ network_access = false # This should be ignored.
let sandbox_mode_override = None;
assert_eq!(
SandboxPolicy::DangerFullAccess,
sandbox_full_access_cfg
.derive_sandbox_policy(sandbox_mode_override, &PathBuf::from("/tmp/test"))
sandbox_full_access_cfg.derive_sandbox_policy(
sandbox_mode_override,
None,
&PathBuf::from("/tmp/test")
)
);
let sandbox_read_only = r#"
@@ -1609,8 +1625,11 @@ network_access = true # This should be ignored.
let sandbox_mode_override = None;
assert_eq!(
SandboxPolicy::ReadOnly,
sandbox_read_only_cfg
.derive_sandbox_policy(sandbox_mode_override, &PathBuf::from("/tmp/test"))
sandbox_read_only_cfg.derive_sandbox_policy(
sandbox_mode_override,
None,
&PathBuf::from("/tmp/test")
)
);
let sandbox_workspace_write = r#"
@@ -1634,8 +1653,11 @@ exclude_slash_tmp = true
exclude_tmpdir_env_var: true,
exclude_slash_tmp: true,
},
sandbox_workspace_write_cfg
.derive_sandbox_policy(sandbox_mode_override, &PathBuf::from("/tmp/test"))
sandbox_workspace_write_cfg.derive_sandbox_policy(
sandbox_mode_override,
None,
&PathBuf::from("/tmp/test")
)
);
let sandbox_workspace_write = r#"
@@ -1662,8 +1684,11 @@ trust_level = "trusted"
exclude_tmpdir_env_var: true,
exclude_slash_tmp: true,
},
sandbox_workspace_write_cfg
.derive_sandbox_policy(sandbox_mode_override, &PathBuf::from("/tmp/test"))
sandbox_workspace_write_cfg.derive_sandbox_policy(
sandbox_mode_override,
None,
&PathBuf::from("/tmp/test")
)
);
}
@@ -1755,6 +1780,75 @@ trust_level = "trusted"
Ok(())
}
#[test]
fn profile_sandbox_mode_overrides_base() -> std::io::Result<()> {
let codex_home = TempDir::new()?;
let mut profiles = HashMap::new();
profiles.insert(
"work".to_string(),
ConfigProfile {
sandbox_mode: Some(SandboxMode::DangerFullAccess),
..Default::default()
},
);
let cfg = ConfigToml {
profiles,
profile: Some("work".to_string()),
sandbox_mode: Some(SandboxMode::ReadOnly),
..Default::default()
};
let config = Config::load_from_base_config_with_overrides(
cfg,
ConfigOverrides::default(),
codex_home.path().to_path_buf(),
)?;
assert!(matches!(
config.sandbox_policy,
SandboxPolicy::DangerFullAccess
));
assert!(config.did_user_set_custom_approval_policy_or_sandbox_mode);
Ok(())
}
#[test]
fn cli_override_takes_precedence_over_profile_sandbox_mode() -> std::io::Result<()> {
let codex_home = TempDir::new()?;
let mut profiles = HashMap::new();
profiles.insert(
"work".to_string(),
ConfigProfile {
sandbox_mode: Some(SandboxMode::DangerFullAccess),
..Default::default()
},
);
let cfg = ConfigToml {
profiles,
profile: Some("work".to_string()),
..Default::default()
};
let overrides = ConfigOverrides {
sandbox_mode: Some(SandboxMode::WorkspaceWrite),
..Default::default()
};
let config = Config::load_from_base_config_with_overrides(
cfg,
overrides,
codex_home.path().to_path_buf(),
)?;
assert!(matches!(
config.sandbox_policy,
SandboxPolicy::WorkspaceWrite { .. }
));
Ok(())
}
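The two tests above pin down a precedence order: CLI override first, then the active profile, then the base config. A minimal sketch of that chain with `Option::or` follows. The `SandboxMode` enum is redeclared locally, `resolve_sandbox_mode` is a hypothetical name, and the `ReadOnly` fallback is an assumption; the real `derive_sandbox_policy` also consults the project trust level, which is left out here.

```rust
// Local stand-in for codex_protocol::config_types::SandboxMode.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SandboxMode {
    ReadOnly,
    WorkspaceWrite,
    DangerFullAccess,
}

/// First `Some` wins: CLI override, then profile, then base config.
/// The final fallback is an assumption made for this sketch.
fn resolve_sandbox_mode(
    cli_override: Option<SandboxMode>,
    profile: Option<SandboxMode>,
    base: Option<SandboxMode>,
) -> SandboxMode {
    cli_override
        .or(profile)
        .or(base)
        .unwrap_or(SandboxMode::ReadOnly)
}

fn main() {
    // Profile overrides base.
    let a = resolve_sandbox_mode(
        None,
        Some(SandboxMode::DangerFullAccess),
        Some(SandboxMode::ReadOnly),
    );
    // CLI override beats the profile.
    let b = resolve_sandbox_mode(
        Some(SandboxMode::WorkspaceWrite),
        Some(SandboxMode::DangerFullAccess),
        None,
    );
    println!("{a:?} {b:?}");
}
```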
#[test]
fn feature_table_overrides_legacy_flags() -> std::io::Result<()> {
let codex_home = TempDir::new()?;
@@ -2873,6 +2967,7 @@ model_verbosity = "high"
forced_login_method: None,
include_apply_patch_tool: false,
tools_web_search_request: false,
experimental_sandbox_command_assessment: false,
use_experimental_streamable_shell_tool: false,
use_experimental_unified_exec_tool: false,
use_experimental_use_rmcp_client: false,
@@ -2941,6 +3036,7 @@ model_verbosity = "high"
forced_login_method: None,
include_apply_patch_tool: false,
tools_web_search_request: false,
experimental_sandbox_command_assessment: false,
use_experimental_streamable_shell_tool: false,
use_experimental_unified_exec_tool: false,
use_experimental_use_rmcp_client: false,
@@ -3024,6 +3120,7 @@ model_verbosity = "high"
forced_login_method: None,
include_apply_patch_tool: false,
tools_web_search_request: false,
experimental_sandbox_command_assessment: false,
use_experimental_streamable_shell_tool: false,
use_experimental_unified_exec_tool: false,
use_experimental_use_rmcp_client: false,
@@ -3093,6 +3190,7 @@ model_verbosity = "high"
forced_login_method: None,
include_apply_patch_tool: false,
tools_web_search_request: false,
experimental_sandbox_command_assessment: false,
use_experimental_streamable_shell_tool: false,
use_experimental_unified_exec_tool: false,
use_experimental_use_rmcp_client: false,


@@ -4,6 +4,7 @@ use std::path::PathBuf;
use crate::protocol::AskForApproval;
use codex_protocol::config_types::ReasoningEffort;
use codex_protocol::config_types::ReasoningSummary;
use codex_protocol::config_types::SandboxMode;
use codex_protocol::config_types::Verbosity;
/// Collection of common configuration options that a user can define as a unit
@@ -15,6 +16,7 @@ pub struct ConfigProfile {
/// [`ModelProviderInfo`] to use.
pub model_provider: Option<String>,
pub approval_policy: Option<AskForApproval>,
pub sandbox_mode: Option<SandboxMode>,
pub model_reasoning_effort: Option<ReasoningEffort>,
pub model_reasoning_summary: Option<ReasoningSummary>,
pub model_verbosity: Option<Verbosity>,
@@ -26,6 +28,7 @@ pub struct ConfigProfile {
pub experimental_use_exec_command_tool: Option<bool>,
pub experimental_use_rmcp_client: Option<bool>,
pub experimental_use_freeform_apply_patch: Option<bool>,
pub experimental_sandbox_command_assessment: Option<bool>,
pub tools_web_search: Option<bool>,
pub tools_view_image: Option<bool>,
/// Optional feature toggles scoped to this profile.


@@ -1,9 +1,20 @@
use codex_protocol::models::FunctionCallOutputContentItem;
use codex_protocol::models::FunctionCallOutputPayload;
use codex_protocol::models::ResponseItem;
use codex_protocol::protocol::TokenUsage;
use codex_protocol::protocol::TokenUsageInfo;
use codex_utils_string::take_bytes_at_char_boundary;
use codex_utils_string::take_last_bytes_at_char_boundary;
use std::ops::Deref;
use tracing::error;
// Model-formatting limits: clients get full streams; only content sent to the model is truncated.
pub(crate) const MODEL_FORMAT_MAX_BYTES: usize = 10 * 1024; // 10 KiB
pub(crate) const MODEL_FORMAT_MAX_LINES: usize = 256; // lines
pub(crate) const MODEL_FORMAT_HEAD_LINES: usize = MODEL_FORMAT_MAX_LINES / 2;
pub(crate) const MODEL_FORMAT_TAIL_LINES: usize = MODEL_FORMAT_MAX_LINES - MODEL_FORMAT_HEAD_LINES; // 128
pub(crate) const MODEL_FORMAT_HEAD_BYTES: usize = MODEL_FORMAT_MAX_BYTES / 2;
/// Transcript of conversation history
#[derive(Debug, Clone, Default)]
pub(crate) struct ConversationHistory {
@@ -40,11 +51,14 @@ impl ConversationHistory {
I::Item: std::ops::Deref<Target = ResponseItem>,
{
for item in items {
if !is_api_message(&item) {
let item_ref = item.deref();
let is_ghost_snapshot = matches!(item_ref, ResponseItem::GhostSnapshot { .. });
if !is_api_message(item_ref) && !is_ghost_snapshot {
continue;
}
self.items.push(item.clone());
let processed = Self::process_item(&item);
self.items.push(processed);
}
}
@@ -65,6 +79,22 @@ impl ConversationHistory {
}
}
pub(crate) fn replace(&mut self, items: Vec<ResponseItem>) {
self.items = items;
}
pub(crate) fn update_token_info(
&mut self,
usage: &TokenUsage,
model_context_window: Option<i64>,
) {
self.token_info = TokenUsageInfo::new_or_append(
&self.token_info,
&Some(usage.clone()),
model_context_window,
);
}
/// This function enforces a couple of invariants on the in-memory history:
/// 1. every call (function/custom) has a corresponding output entry
/// 2. every output has a corresponding call entry
@@ -107,7 +137,7 @@ impl ConversationHistory {
call_id: call_id.clone(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: None,
..Default::default()
},
},
));
@@ -154,7 +184,7 @@ impl ConversationHistory {
call_id: call_id.clone(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: None,
..Default::default()
},
},
));
@@ -165,6 +195,7 @@ impl ConversationHistory {
| ResponseItem::WebSearchCall { .. }
| ResponseItem::FunctionCallOutput { .. }
| ResponseItem::CustomToolCallOutput { .. }
| ResponseItem::GhostSnapshot { .. }
| ResponseItem::Other
| ResponseItem::Message { .. } => {
// nothing to do for these variants
@@ -231,6 +262,7 @@ impl ConversationHistory {
| ResponseItem::LocalShellCall { .. }
| ResponseItem::Reasoning { .. }
| ResponseItem::WebSearchCall { .. }
| ResponseItem::GhostSnapshot { .. }
| ResponseItem::Other
| ResponseItem::Message { .. } => {
// nothing to do for these variants
@@ -248,10 +280,6 @@ impl ConversationHistory {
}
}
pub(crate) fn replace(&mut self, items: Vec<ResponseItem>) {
self.items = items;
}
/// Removes the corresponding paired item for the provided `item`, if any.
///
/// Pairs:
@@ -321,19 +349,126 @@ impl ConversationHistory {
}
}
pub(crate) fn update_token_info(
&mut self,
usage: &TokenUsage,
model_context_window: Option<i64>,
) {
self.token_info = TokenUsageInfo::new_or_append(
&self.token_info,
&Some(usage.clone()),
model_context_window,
);
fn process_item(item: &ResponseItem) -> ResponseItem {
match item {
ResponseItem::FunctionCallOutput { call_id, output } => {
let truncated = format_output_for_model_body(output.content.as_str());
let truncated_items = output.content_items.as_ref().map(|items| {
items
.iter()
.map(|it| match it {
FunctionCallOutputContentItem::InputText { text } => {
FunctionCallOutputContentItem::InputText {
text: format_output_for_model_body(text),
}
}
FunctionCallOutputContentItem::InputImage { image_url } => {
FunctionCallOutputContentItem::InputImage {
image_url: image_url.clone(),
}
}
})
.collect()
});
ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output: FunctionCallOutputPayload {
content: truncated,
content_items: truncated_items,
success: output.success,
},
}
}
ResponseItem::CustomToolCallOutput { call_id, output } => {
let truncated = format_output_for_model_body(output);
ResponseItem::CustomToolCallOutput {
call_id: call_id.clone(),
output: truncated,
}
}
ResponseItem::Message { .. }
| ResponseItem::Reasoning { .. }
| ResponseItem::LocalShellCall { .. }
| ResponseItem::FunctionCall { .. }
| ResponseItem::WebSearchCall { .. }
| ResponseItem::CustomToolCall { .. }
| ResponseItem::GhostSnapshot { .. }
| ResponseItem::Other => item.clone(),
}
}
}
pub(crate) fn format_output_for_model_body(content: &str) -> String {
// Head+tail truncation for the model: show the beginning and end with an elision.
// Clients still receive full streams; only this formatted summary is capped.
let total_lines = content.lines().count();
if content.len() <= MODEL_FORMAT_MAX_BYTES && total_lines <= MODEL_FORMAT_MAX_LINES {
return content.to_string();
}
let output = truncate_formatted_exec_output(content, total_lines);
format!("Total output lines: {total_lines}\n\n{output}")
}
fn truncate_formatted_exec_output(content: &str, total_lines: usize) -> String {
let segments: Vec<&str> = content.split_inclusive('\n').collect();
let head_take = MODEL_FORMAT_HEAD_LINES.min(segments.len());
let tail_take = MODEL_FORMAT_TAIL_LINES.min(segments.len().saturating_sub(head_take));
let omitted = segments.len().saturating_sub(head_take + tail_take);
let head_slice_end: usize = segments
.iter()
.take(head_take)
.map(|segment| segment.len())
.sum();
let tail_slice_start: usize = if tail_take == 0 {
content.len()
} else {
content.len()
- segments
.iter()
.rev()
.take(tail_take)
.map(|segment| segment.len())
.sum::<usize>()
};
let head_slice = &content[..head_slice_end];
let tail_slice = &content[tail_slice_start..];
let truncated_by_bytes = content.len() > MODEL_FORMAT_MAX_BYTES;
// Note: this count is slightly off. It includes metadata lines, not just shell output lines.
let marker = if omitted > 0 {
Some(format!(
"\n[... omitted {omitted} of {total_lines} lines ...]\n\n"
))
} else if truncated_by_bytes {
Some(format!(
"\n[... output truncated to fit {MODEL_FORMAT_MAX_BYTES} bytes ...]\n\n"
))
} else {
None
};
let marker_len = marker.as_ref().map_or(0, String::len);
let base_head_budget = MODEL_FORMAT_HEAD_BYTES.min(MODEL_FORMAT_MAX_BYTES);
let head_budget = base_head_budget.min(MODEL_FORMAT_MAX_BYTES.saturating_sub(marker_len));
let head_part = take_bytes_at_char_boundary(head_slice, head_budget);
let mut result = String::with_capacity(MODEL_FORMAT_MAX_BYTES.min(content.len()));
result.push_str(head_part);
if let Some(marker_text) = marker.as_ref() {
result.push_str(marker_text);
}
let remaining = MODEL_FORMAT_MAX_BYTES.saturating_sub(result.len());
if remaining == 0 {
return result;
}
let tail_part = take_last_bytes_at_char_boundary(tail_slice, remaining);
result.push_str(tail_part);
result
}
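The line-based half of the head+tail truncation above can be sketched in isolation. This is a simplified illustration, not the function from the diff: the constants are shrunk so the effect is visible on ten lines, `head_tail_by_lines` is a hypothetical name, and the byte budget (with its char-boundary handling and separate marker) is deliberately omitted.

```rust
// Shrunk, hypothetical limits; the diff uses 256 lines and a 10 KiB byte cap.
const MAX_LINES: usize = 6;
const HEAD_LINES: usize = MAX_LINES / 2;
const TAIL_LINES: usize = MAX_LINES - HEAD_LINES;

/// Keep the first HEAD_LINES and last TAIL_LINES lines and replace the
/// middle with an elision marker that reports how many lines were dropped.
fn head_tail_by_lines(content: &str) -> String {
    // split_inclusive keeps each trailing '\n', so concatenating segments
    // reproduces the original text byte for byte.
    let segments: Vec<&str> = content.split_inclusive('\n').collect();
    let total = segments.len();
    if total <= MAX_LINES {
        return content.to_string();
    }
    let omitted = total - MAX_LINES;
    let head: String = segments[..HEAD_LINES].concat();
    let tail: String = segments[total - TAIL_LINES..].concat();
    format!("{head}\n[... omitted {omitted} of {total} lines ...]\n\n{tail}")
}

fn main() {
    let content: String = (0..10).map(|i| format!("line-{i}\n")).collect();
    println!("{}", head_tail_by_lines(&content));
}
```

Keeping both ends reflects the usual shape of command output, where the invocation context sits at the top and the error or summary at the bottom.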
#[inline]
fn error_or_panic(message: String) {
if cfg!(debug_assertions) || env!("CARGO_PKG_VERSION").contains("alpha") {
@@ -355,6 +490,7 @@ fn is_api_message(message: &ResponseItem) -> bool {
| ResponseItem::LocalShellCall { .. }
| ResponseItem::Reasoning { .. }
| ResponseItem::WebSearchCall { .. } => true,
ResponseItem::GhostSnapshot { .. } => false,
ResponseItem::Other => false,
}
}
@@ -448,7 +584,7 @@ mod tests {
call_id: "call-1".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
success: None,
..Default::default()
},
},
];
@@ -464,7 +600,7 @@ mod tests {
call_id: "call-2".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
success: None,
..Default::default()
},
},
ResponseItem::FunctionCall {
@@ -498,7 +634,7 @@ mod tests {
call_id: "call-3".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
success: None,
..Default::default()
},
},
];
@@ -527,6 +663,184 @@ mod tests {
assert_eq!(h.contents(), vec![]);
}
#[test]
fn record_items_truncates_function_call_output_content() {
let mut history = ConversationHistory::new();
let long_line = "a very long line to trigger truncation\n";
let long_output = long_line.repeat(2_500);
let item = ResponseItem::FunctionCallOutput {
call_id: "call-100".to_string(),
output: FunctionCallOutputPayload {
content: long_output.clone(),
success: Some(true),
..Default::default()
},
};
history.record_items([&item]);
assert_eq!(history.items.len(), 1);
match &history.items[0] {
ResponseItem::FunctionCallOutput { output, .. } => {
assert_ne!(output.content, long_output);
assert!(
output.content.starts_with("Total output lines:"),
"expected truncated summary, got {}",
output.content
);
}
other => panic!("unexpected history item: {other:?}"),
}
}
#[test]
fn record_items_truncates_custom_tool_call_output_content() {
let mut history = ConversationHistory::new();
let line = "custom output that is very long\n";
let long_output = line.repeat(2_500);
let item = ResponseItem::CustomToolCallOutput {
call_id: "tool-200".to_string(),
output: long_output.clone(),
};
history.record_items([&item]);
assert_eq!(history.items.len(), 1);
match &history.items[0] {
ResponseItem::CustomToolCallOutput { output, .. } => {
assert_ne!(output, &long_output);
assert!(
output.starts_with("Total output lines:"),
"expected truncated summary, got {output}"
);
}
other => panic!("unexpected history item: {other:?}"),
}
}
// The following tests were adapted from tools::mod truncation tests to
// target the new truncation functions in conversation_history.
use regex_lite::Regex;
fn assert_truncated_message_matches(message: &str, line: &str, total_lines: usize) {
let pattern = truncated_message_pattern(line, total_lines);
let regex = Regex::new(&pattern).unwrap_or_else(|err| {
panic!("failed to compile regex {pattern}: {err}");
});
let captures = regex
.captures(message)
.unwrap_or_else(|| panic!("message failed to match pattern {pattern}: {message}"));
let body = captures
.name("body")
.expect("missing body capture")
.as_str();
assert!(
body.len() <= MODEL_FORMAT_MAX_BYTES,
"body exceeds byte limit: {} bytes",
body.len()
);
}
fn truncated_message_pattern(line: &str, total_lines: usize) -> String {
let head_take = MODEL_FORMAT_HEAD_LINES.min(total_lines);
let tail_take = MODEL_FORMAT_TAIL_LINES.min(total_lines.saturating_sub(head_take));
let omitted = total_lines.saturating_sub(head_take + tail_take);
let escaped_line = regex_lite::escape(line);
if omitted == 0 {
return format!(
r"(?s)^Total output lines: {total_lines}\n\n(?P<body>{escaped_line}.*\n\[\.{{3}} output truncated to fit {MODEL_FORMAT_MAX_BYTES} bytes \.{{3}}]\n\n.*)$",
);
}
format!(
r"(?s)^Total output lines: {total_lines}\n\n(?P<body>{escaped_line}.*\n\[\.{{3}} omitted {omitted} of {total_lines} lines \.{{3}}]\n\n.*)$",
)
}
#[test]
fn format_exec_output_truncates_large_error() {
let line = "very long execution error line that should trigger truncation\n";
let large_error = line.repeat(2_500); // way beyond both byte and line limits
let truncated = format_output_for_model_body(&large_error);
let total_lines = large_error.lines().count();
assert_truncated_message_matches(&truncated, line, total_lines);
assert_ne!(truncated, large_error);
}
#[test]
fn format_exec_output_marks_byte_truncation_without_omitted_lines() {
let long_line = "a".repeat(MODEL_FORMAT_MAX_BYTES + 50);
let truncated = format_output_for_model_body(&long_line);
assert_ne!(truncated, long_line);
let marker_line =
format!("[... output truncated to fit {MODEL_FORMAT_MAX_BYTES} bytes ...]");
assert!(
truncated.contains(&marker_line),
"missing byte truncation marker: {truncated}"
);
assert!(
!truncated.contains("omitted"),
"line omission marker should not appear when no lines were dropped: {truncated}"
);
}
#[test]
fn format_exec_output_returns_original_when_within_limits() {
let content = "example output\n".repeat(10);
assert_eq!(format_output_for_model_body(&content), content);
}
#[test]
fn format_exec_output_reports_omitted_lines_and_keeps_head_and_tail() {
let total_lines = MODEL_FORMAT_MAX_LINES + 100;
let content: String = (0..total_lines)
.map(|idx| format!("line-{idx}\n"))
.collect();
let truncated = format_output_for_model_body(&content);
let omitted = total_lines - MODEL_FORMAT_MAX_LINES;
let expected_marker = format!("[... omitted {omitted} of {total_lines} lines ...]");
assert!(
truncated.contains(&expected_marker),
"missing omitted marker: {truncated}"
);
assert!(
truncated.contains("line-0\n"),
"expected head line to remain: {truncated}"
);
let last_line = format!("line-{}\n", total_lines - 1);
assert!(
truncated.contains(&last_line),
"expected tail line to remain: {truncated}"
);
}
#[test]
fn format_exec_output_prefers_line_marker_when_both_limits_exceeded() {
let total_lines = MODEL_FORMAT_MAX_LINES + 42;
let long_line = "x".repeat(256);
let content: String = (0..total_lines)
.map(|idx| format!("line-{idx}-{long_line}\n"))
.collect();
let truncated = format_output_for_model_body(&content);
assert!(
truncated.contains("[... omitted 42 of 298 lines ...]"),
"expected omitted marker when line count exceeds limit: {truncated}"
);
assert!(
!truncated.contains("output truncated to fit"),
"line omission marker should take precedence over byte marker: {truncated}"
);
}
//TODO(aibrahim): run CI in release mode.
#[cfg(not(debug_assertions))]
#[test]
@@ -554,7 +868,7 @@ mod tests {
call_id: "call-x".to_string(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: None,
..Default::default()
},
},
]
@@ -631,7 +945,7 @@ mod tests {
call_id: "shell-1".to_string(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: None,
..Default::default()
},
},
]
@@ -645,7 +959,7 @@ mod tests {
call_id: "orphan-1".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
success: None,
..Default::default()
},
}];
let mut h = create_history_with_items(items);
@@ -685,7 +999,7 @@ mod tests {
call_id: "c2".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
success: None,
..Default::default()
},
},
// Will get an inserted custom tool output
@@ -727,7 +1041,7 @@ mod tests {
call_id: "c1".to_string(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: None,
..Default::default()
},
},
ResponseItem::CustomToolCall {
@@ -757,7 +1071,7 @@ mod tests {
call_id: "s1".to_string(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: None,
..Default::default()
},
},
]
@@ -822,7 +1136,7 @@ mod tests {
call_id: "orphan-1".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
success: None,
..Default::default()
},
}];
let mut h = create_history_with_items(items);
@@ -856,7 +1170,7 @@ mod tests {
call_id: "c2".to_string(),
output: FunctionCallOutputPayload {
content: "ok".to_string(),
success: None,
..Default::default()
},
},
ResponseItem::CustomToolCall {
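The truncation tests earlier in this file's diff expect a head-and-tail cut with an `[... omitted N of M lines ...]` marker. The following is a minimal sketch of that line-based behavior only, not the actual `format_output_for_model_body`. The 256-line cap is inferred from the test that expects `omitted 42 of 298 lines`; the 128/128 head/tail split and the names `MAX_LINES`, `HEAD_LINES`, `TAIL_LINES`, and `truncate_lines` are assumptions, and the byte limit is not modeled.

```rust
// Hypothetical limits: the 256-line cap is inferred from the tests in the diff,
// the even head/tail split is an assumption made for this sketch.
const MAX_LINES: usize = 256;
const HEAD_LINES: usize = 128;
const TAIL_LINES: usize = 128;

/// Keep the first `HEAD_LINES` and the last `TAIL_LINES` lines and replace
/// the middle with a marker. Input within the limit is returned unchanged.
fn truncate_lines(content: &str) -> String {
    // `split_inclusive` keeps each trailing '\n' attached to its line.
    let lines: Vec<&str> = content.split_inclusive('\n').collect();
    let total = lines.len();
    if total <= MAX_LINES {
        return content.to_string();
    }
    let head_take = HEAD_LINES.min(total);
    let tail_take = TAIL_LINES.min(total.saturating_sub(head_take));
    let omitted = total.saturating_sub(head_take + tail_take);

    let mut out = String::new();
    out.extend(lines[..head_take].iter().copied());
    out.push_str(&format!("[... omitted {omitted} of {total} lines ...]\n\n"));
    out.extend(lines[total - tail_take..].iter().copied());
    out
}
```

Taking the tail with `saturating_sub` mirrors the arithmetic in `truncated_message_pattern`, so the head and tail can never overlap even if the limits are changed.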

View File

@@ -98,7 +98,10 @@ impl ConversationManager {
}
};
let conversation = Arc::new(CodexConversation::new(codex));
let conversation = Arc::new(CodexConversation::new(
codex,
session_configured.rollout_path.clone(),
));
self.conversations
.write()
.await

View File

@@ -55,7 +55,7 @@ pub enum SandboxErr {
#[derive(Error, Debug)]
pub enum CodexErr {
// todo(aibrahim): get rid of this error carrying the dangling artifacts
#[error("turn aborted")]
#[error("turn aborted. Something went wrong? Hit `/feedback` to report the issue.")]
TurnAborted {
dangling_artifacts: Vec<ProcessedResponseItem>,
},
@@ -91,7 +91,7 @@ pub enum CodexErr {
/// Returned by run_command_stream when the user pressed Ctrl-C (SIGINT). Session uses this to
/// surface a polite FunctionCallOutput back to the model instead of crashing the CLI.
#[error("interrupted (Ctrl-C)")]
#[error("interrupted (Ctrl-C). Something went wrong? Hit `/feedback` to report the issue.")]
Interrupted,
/// Unexpected HTTP status code.

View File

@@ -39,6 +39,10 @@ pub enum Feature {
ViewImageTool,
/// Allow the model to request web searches.
WebSearchRequest,
/// Enable the model-based risk assessments for sandboxed commands.
SandboxCommandAssessment,
/// Create a ghost commit at each turn.
GhostCommit,
}
impl Feature {
@@ -73,6 +77,7 @@ pub struct FeatureOverrides {
pub include_apply_patch_tool: Option<bool>,
pub include_view_image_tool: Option<bool>,
pub web_search_request: Option<bool>,
pub experimental_sandbox_command_assessment: Option<bool>,
}
impl FeatureOverrides {
@@ -137,6 +142,7 @@ impl Features {
let mut features = Features::with_defaults();
let base_legacy = LegacyFeatureToggles {
experimental_sandbox_command_assessment: cfg.experimental_sandbox_command_assessment,
experimental_use_freeform_apply_patch: cfg.experimental_use_freeform_apply_patch,
experimental_use_exec_command_tool: cfg.experimental_use_exec_command_tool,
experimental_use_unified_exec_tool: cfg.experimental_use_unified_exec_tool,
@@ -154,6 +160,8 @@ impl Features {
let profile_legacy = LegacyFeatureToggles {
include_apply_patch_tool: config_profile.include_apply_patch_tool,
include_view_image_tool: config_profile.include_view_image_tool,
experimental_sandbox_command_assessment: config_profile
.experimental_sandbox_command_assessment,
experimental_use_freeform_apply_patch: config_profile
.experimental_use_freeform_apply_patch,
experimental_use_exec_command_tool: config_profile.experimental_use_exec_command_tool,
@@ -183,6 +191,11 @@ fn feature_for_key(key: &str) -> Option<Feature> {
legacy::feature_for_key(key)
}
/// Returns `true` if the provided string matches a known feature toggle key.
pub fn is_known_feature_key(key: &str) -> bool {
feature_for_key(key).is_some()
}
/// Deserializable features table for TOML.
#[derive(Deserialize, Debug, Clone, Default, PartialEq)]
pub struct FeaturesToml {
@@ -236,4 +249,16 @@ pub const FEATURES: &[FeatureSpec] = &[
stage: Stage::Stable,
default_enabled: false,
},
FeatureSpec {
id: Feature::SandboxCommandAssessment,
key: "experimental_sandbox_command_assessment",
stage: Stage::Experimental,
default_enabled: false,
},
FeatureSpec {
id: Feature::GhostCommit,
key: "ghost_commit",
stage: Stage::Experimental,
default_enabled: false,
},
];
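The registry entries above pair each `Feature` with a string key, and `feature_for_key` ends by delegating to `legacy::feature_for_key`. A minimal sketch of that two-step lookup follows, assuming the registry is consulted first and the legacy alias table second (only the final delegation is visible in the diff). The trimmed `Feature`, `FeatureSpec`, and `Alias` types are simplified stand-ins, not the real definitions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Feature {
    SandboxCommandAssessment,
    GhostCommit,
    UnifiedExec,
}

// Simplified stand-in for the real `FeatureSpec` (the `stage` field is omitted).
#[allow(dead_code)]
struct FeatureSpec {
    id: Feature,
    key: &'static str,
    default_enabled: bool,
}

const FEATURES: &[FeatureSpec] = &[
    FeatureSpec {
        id: Feature::SandboxCommandAssessment,
        key: "experimental_sandbox_command_assessment",
        default_enabled: false,
    },
    FeatureSpec {
        id: Feature::GhostCommit,
        key: "ghost_commit",
        default_enabled: false,
    },
];

struct Alias {
    legacy_key: &'static str,
    feature: Feature,
}

const ALIASES: &[Alias] = &[Alias {
    legacy_key: "experimental_use_unified_exec_tool",
    feature: Feature::UnifiedExec,
}];

/// Look the key up in the registry first, then fall back to legacy aliases.
/// The lookup order is an assumption of this sketch.
fn feature_for_key(key: &str) -> Option<Feature> {
    FEATURES
        .iter()
        .find(|spec| spec.key == key)
        .map(|spec| spec.id)
        .or_else(|| {
            ALIASES
                .iter()
                .find(|alias| alias.legacy_key == key)
                .map(|alias| alias.feature)
        })
}

/// Same shape as the helper added in the diff.
fn is_known_feature_key(key: &str) -> bool {
    feature_for_key(key).is_some()
}
```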

View File

@@ -9,6 +9,10 @@ struct Alias {
}
const ALIASES: &[Alias] = &[
Alias {
legacy_key: "experimental_sandbox_command_assessment",
feature: Feature::SandboxCommandAssessment,
},
Alias {
legacy_key: "experimental_use_unified_exec_tool",
feature: Feature::UnifiedExec,
@@ -53,6 +57,7 @@ pub(crate) fn feature_for_key(key: &str) -> Option<Feature> {
pub struct LegacyFeatureToggles {
pub include_apply_patch_tool: Option<bool>,
pub include_view_image_tool: Option<bool>,
pub experimental_sandbox_command_assessment: Option<bool>,
pub experimental_use_freeform_apply_patch: Option<bool>,
pub experimental_use_exec_command_tool: Option<bool>,
pub experimental_use_unified_exec_tool: Option<bool>,
@@ -69,6 +74,12 @@ impl LegacyFeatureToggles {
self.include_apply_patch_tool,
"include_apply_patch_tool",
);
set_if_some(
features,
Feature::SandboxCommandAssessment,
self.experimental_sandbox_command_assessment,
"experimental_sandbox_command_assessment",
);
set_if_some(
features,
Feature::ApplyPatchFreeform,

View File

@@ -77,6 +77,7 @@ pub use rollout::find_conversation_path_by_id_str;
pub use rollout::list::ConversationItem;
pub use rollout::list::ConversationsPage;
pub use rollout::list::Cursor;
pub use rollout::list::read_head_for_summary;
mod function_tool;
mod state;
mod tasks;

View File

@@ -35,6 +35,7 @@ pub(crate) async fn handle_mcp_tool_call(
output: FunctionCallOutputPayload {
content: format!("err: {e}"),
success: Some(false),
..Default::default()
},
};
}

View File

@@ -54,6 +54,9 @@ pub struct ModelFamily {
/// This is applied when computing the effective context window seen by
/// consumers.
pub effective_context_window_percent: i64,
/// Whether the model family supports setting the verbosity level when using the Responses API.
pub support_verbosity: bool,
}
macro_rules! model_family {
@@ -73,6 +76,7 @@ macro_rules! model_family {
base_instructions: BASE_INSTRUCTIONS.to_string(),
experimental_supported_tools: Vec::new(),
effective_context_window_percent: 95,
support_verbosity: false,
};
// apply overrides
$(
@@ -128,10 +132,11 @@ pub fn find_family_for_model(slug: &str) -> Option<ModelFamily> {
"test_sync_tool".to_string(),
],
supports_parallel_tool_calls: true,
support_verbosity: true,
)
// Internal models.
} else if slug.starts_with("codex-") {
} else if slug.starts_with("codex-exp-") {
model_family!(
slug, slug,
supports_reasoning_summaries: true,
@@ -144,22 +149,25 @@ pub fn find_family_for_model(slug: &str) -> Option<ModelFamily> {
"read_file".to_string(),
],
supports_parallel_tool_calls: true,
support_verbosity: true,
)
// Production models.
} else if slug.starts_with("gpt-5-codex") {
} else if slug.starts_with("gpt-5-codex") || slug.starts_with("codex-") {
model_family!(
slug, slug,
supports_reasoning_summaries: true,
reasoning_summary_format: ReasoningSummaryFormat::Experimental,
base_instructions: GPT_5_CODEX_INSTRUCTIONS.to_string(),
apply_patch_tool_type: Some(ApplyPatchToolType::Freeform),
support_verbosity: true,
)
} else if slug.starts_with("gpt-5") {
model_family!(
slug, "gpt-5",
supports_reasoning_summaries: true,
needs_special_apply_patch_instructions: true,
support_verbosity: true,
)
} else {
None
@@ -179,5 +187,6 @@ pub fn derive_default_model_family(model: &str) -> ModelFamily {
base_instructions: BASE_INSTRUCTIONS.to_string(),
experimental_supported_tools: Vec::new(),
effective_context_window_percent: 95,
support_verbosity: false,
}
}
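The reordered prefix checks in `find_family_for_model` rely on branch order: `codex-exp-` has to be tested before the broader `codex-` prefix, or the experimental branch would never be reached. A small sketch of just that ordering, using a hypothetical `FamilyKind` enum in place of the real `ModelFamily` construction and ignoring the earlier branches of the real function:

```rust
#[derive(Debug, PartialEq, Eq)]
enum FamilyKind {
    Experimental,
    Codex,
    Gpt5,
}

/// Classify a slug by prefix. More specific prefixes are checked first.
fn classify(slug: &str) -> Option<FamilyKind> {
    if slug.starts_with("codex-exp-") {
        // Must precede the generic `codex-` check below.
        Some(FamilyKind::Experimental)
    } else if slug.starts_with("gpt-5-codex") || slug.starts_with("codex-") {
        // Must precede the generic `gpt-5` check below.
        Some(FamilyKind::Codex)
    } else if slug.starts_with("gpt-5") {
        Some(FamilyKind::Gpt5)
    } else {
        None
    }
}
```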

View File

@@ -37,8 +37,10 @@ impl ModelInfo {
}
pub(crate) fn get_model_info(model_family: &ModelFamily) -> Option<ModelInfo> {
let slug = model_family.slug.as_str();
match slug {
let raw_slug = model_family.slug.as_str();
let slug = raw_slug.strip_prefix("openai/").unwrap_or(raw_slug);
let normalized_slug = slug.replace(':', "-");
match normalized_slug.as_str() {
// OSS models have a 128k shared token pool.
// Arbitrarily splitting it: 3/4 input context, 1/4 output.
// https://openai.com/index/gpt-oss-model-card/
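The slug handling added to `get_model_info` strips an optional `openai/` prefix and replaces `:` with `-` before matching. Pulled out into a standalone helper (the name `normalize_slug` is hypothetical, and the sample slug in the usage note is only an illustration), the two steps look like this:

```rust
/// Strip an optional `openai/` prefix, then replace ':' with '-'.
fn normalize_slug(raw_slug: &str) -> String {
    let slug = raw_slug.strip_prefix("openai/").unwrap_or(raw_slug);
    slug.replace(':', "-")
}
```

Under this sketch, a provider-qualified slug such as `openai/gpt-oss:20b` and a plain `gpt-oss-20b` both reach the same `match` arm.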

View File

@@ -1,4 +1,5 @@
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::conversation_history::ConversationHistory;
use codex_protocol::models::FunctionCallOutputPayload;
use codex_protocol::models::ResponseInputItem;
@@ -13,6 +14,7 @@ pub(crate) async fn process_items(
is_review_mode: bool,
review_thread_history: &mut ConversationHistory,
sess: &Session,
turn_context: &TurnContext,
) -> (Vec<ResponseInputItem>, Vec<ResponseItem>) {
let mut items_to_record_in_conversation_history = Vec::<ResponseItem>::new();
let mut responses = Vec::<ResponseInputItem>::new();
@@ -59,14 +61,11 @@ pub(crate) async fn process_items(
) => {
items_to_record_in_conversation_history.push(item);
let output = match result {
Ok(call_tool_result) => {
crate::codex::convert_call_tool_result_to_function_call_output_payload(
call_tool_result,
)
}
Ok(call_tool_result) => FunctionCallOutputPayload::from(call_tool_result),
Err(err) => FunctionCallOutputPayload {
content: err.clone(),
success: Some(false),
..Default::default()
},
};
items_to_record_in_conversation_history.push(ResponseItem::FunctionCallOutput {
@@ -104,7 +103,7 @@ pub(crate) async fn process_items(
if is_review_mode {
review_thread_history.record_items(items_to_record_in_conversation_history.iter());
} else {
sess.record_conversation_items(&items_to_record_in_conversation_history)
sess.record_conversation_items(turn_context, &items_to_record_in_conversation_history)
.await;
}
}

View File

@@ -54,6 +54,7 @@ struct HeadTailSummary {
saw_session_meta: bool,
saw_user_event: bool,
source: Option<SessionSource>,
model_provider: Option<String>,
created_at: Option<String>,
updated_at: Option<String>,
}
@@ -109,6 +110,8 @@ pub(crate) async fn get_conversations(
page_size: usize,
cursor: Option<&Cursor>,
allowed_sources: &[SessionSource],
model_providers: Option<&[String]>,
default_provider: &str,
) -> io::Result<ConversationsPage> {
let mut root = codex_home.to_path_buf();
root.push(SESSIONS_SUBDIR);
@@ -124,8 +127,17 @@ pub(crate) async fn get_conversations(
let anchor = cursor.cloned();
let result =
traverse_directories_for_paths(root.clone(), page_size, anchor, allowed_sources).await?;
let provider_matcher =
model_providers.and_then(|filters| ProviderMatcher::new(filters, default_provider));
let result = traverse_directories_for_paths(
root.clone(),
page_size,
anchor,
allowed_sources,
provider_matcher.as_ref(),
)
.await?;
Ok(result)
}
@@ -145,6 +157,7 @@ async fn traverse_directories_for_paths(
page_size: usize,
anchor: Option<Cursor>,
allowed_sources: &[SessionSource],
provider_matcher: Option<&ProviderMatcher<'_>>,
) -> io::Result<ConversationsPage> {
let mut items: Vec<ConversationItem> = Vec::with_capacity(page_size);
let mut scanned_files = 0usize;
@@ -153,6 +166,7 @@ async fn traverse_directories_for_paths(
Some(c) => (c.ts, c.id),
None => (OffsetDateTime::UNIX_EPOCH, Uuid::nil()),
};
let mut more_matches_available = false;
let year_dirs = collect_dirs_desc(&root, |s| s.parse::<u16>().ok()).await?;
@@ -184,6 +198,7 @@ async fn traverse_directories_for_paths(
for (ts, sid, _name_str, path) in day_files.into_iter() {
scanned_files += 1;
if scanned_files >= MAX_SCAN_FILES && items.len() >= page_size {
more_matches_available = true;
break 'outer;
}
if !anchor_passed {
@@ -194,6 +209,7 @@ async fn traverse_directories_for_paths(
}
}
if items.len() == page_size {
more_matches_available = true;
break 'outer;
}
// Read head and simultaneously detect message events within the same
@@ -208,6 +224,11 @@ async fn traverse_directories_for_paths(
{
continue;
}
if let Some(matcher) = provider_matcher
&& !matcher.matches(summary.model_provider.as_deref())
{
continue;
}
// Apply filters: must have session meta and at least one user message event
if summary.saw_session_meta && summary.saw_user_event {
let HeadTailSummary {
@@ -231,12 +252,21 @@ async fn traverse_directories_for_paths(
}
}
let next = build_next_cursor(&items);
let reached_scan_cap = scanned_files >= MAX_SCAN_FILES;
if reached_scan_cap && !items.is_empty() {
more_matches_available = true;
}
let next = if more_matches_available {
build_next_cursor(&items)
} else {
None
};
Ok(ConversationsPage {
items,
next_cursor: next,
num_scanned_files: scanned_files,
reached_scan_cap: scanned_files >= MAX_SCAN_FILES,
reached_scan_cap,
})
}
@@ -328,6 +358,32 @@ fn parse_timestamp_uuid_from_filename(name: &str) -> Option<(OffsetDateTime, Uui
Some((ts, uuid))
}
struct ProviderMatcher<'a> {
filters: &'a [String],
matches_default_provider: bool,
}
impl<'a> ProviderMatcher<'a> {
fn new(filters: &'a [String], default_provider: &'a str) -> Option<Self> {
if filters.is_empty() {
return None;
}
let matches_default_provider = filters.iter().any(|provider| provider == default_provider);
Some(Self {
filters,
matches_default_provider,
})
}
fn matches(&self, session_provider: Option<&str>) -> bool {
match session_provider {
Some(provider) => self.filters.iter().any(|candidate| candidate == provider),
None => self.matches_default_provider,
}
}
}
async fn read_head_and_tail(
path: &Path,
head_limit: usize,
@@ -354,6 +410,7 @@ async fn read_head_and_tail(
match rollout_line.item {
RolloutItem::SessionMeta(session_meta_line) => {
summary.source = Some(session_meta_line.meta.source);
summary.model_provider = session_meta_line.meta.model_provider.clone();
summary.created_at = summary
.created_at
.clone()
@@ -394,6 +451,13 @@ async fn read_head_and_tail(
Ok(summary)
}
/// Read up to `HEAD_RECORD_LIMIT` records from the start of the rollout file at `path`.
/// This should be enough to produce a summary including the session meta line.
pub async fn read_head_for_summary(path: &Path) -> io::Result<Vec<serde_json::Value>> {
let summary = read_head_and_tail(path, HEAD_RECORD_LIMIT, 0).await?;
Ok(summary.head)
}
async fn read_tail_records(
path: &Path,
max_records: usize,
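The `ProviderMatcher` added in this file treats a session with no recorded `model_provider` as if it belonged to the default provider, and an empty filter list as "no filtering". A compact restatement of that rule as a single free function; `provider_matches` is a hypothetical name, and the real code keeps the struct so the default-provider check is computed once:

```rust
/// Returns true when a session should be listed under the given filters.
/// A missing session provider is treated as the default provider.
fn provider_matches(
    filters: &[String],
    default_provider: &str,
    session_provider: Option<&str>,
) -> bool {
    if filters.is_empty() {
        // No filters: every session passes, as when no matcher is built.
        return true;
    }
    let wanted = session_provider.unwrap_or(default_provider);
    filters.iter().any(|candidate| candidate == wanted)
}
```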

View File

@@ -26,7 +26,8 @@ pub(crate) fn should_persist_response_item(item: &ResponseItem) -> bool {
| ResponseItem::FunctionCallOutput { .. }
| ResponseItem::CustomToolCall { .. }
| ResponseItem::CustomToolCallOutput { .. }
| ResponseItem::WebSearchCall { .. } => true,
| ResponseItem::WebSearchCall { .. }
| ResponseItem::GhostSnapshot { .. } => true,
ResponseItem::Other => false,
}
}
@@ -42,6 +43,7 @@ pub(crate) fn should_persist_event_msg(ev: &EventMsg) -> bool {
| EventMsg::TokenCount(_)
| EventMsg::EnteredReviewMode(_)
| EventMsg::ExitedReviewMode(_)
| EventMsg::UndoCompleted(_)
| EventMsg::TurnAborted(_) => true,
EventMsg::Error(_)
| EventMsg::TaskStarted(_)
@@ -50,6 +52,7 @@ pub(crate) fn should_persist_event_msg(ev: &EventMsg) -> bool {
| EventMsg::AgentReasoningDelta(_)
| EventMsg::AgentReasoningRawContentDelta(_)
| EventMsg::AgentReasoningSectionBreak(_)
| EventMsg::RawResponseItem(_)
| EventMsg::SessionConfigured(_)
| EventMsg::McpToolCallBegin(_)
| EventMsg::McpToolCallEnd(_)
@@ -66,12 +69,12 @@ pub(crate) fn should_persist_event_msg(ev: &EventMsg) -> bool {
| EventMsg::PatchApplyEnd(_)
| EventMsg::TurnDiff(_)
| EventMsg::GetHistoryEntryResponse(_)
| EventMsg::UndoStarted(_)
| EventMsg::McpListToolsResponse(_)
| EventMsg::ListCustomPromptsResponse(_)
| EventMsg::PlanUpdate(_)
| EventMsg::ShutdownComplete
| EventMsg::ViewImageToolCall(_)
| EventMsg::ConversationPath(_)
| EventMsg::ItemStarted(_)
| EventMsg::ItemCompleted(_) => false,
}

View File

@@ -97,8 +97,18 @@ impl RolloutRecorder {
page_size: usize,
cursor: Option<&Cursor>,
allowed_sources: &[SessionSource],
model_providers: Option<&[String]>,
default_provider: &str,
) -> std::io::Result<ConversationsPage> {
get_conversations(codex_home, page_size, cursor, allowed_sources).await
get_conversations(
codex_home,
page_size,
cursor,
allowed_sources,
model_providers,
default_provider,
)
.await
}
/// Attempt to create a new [`RolloutRecorder`]. If the sessions directory
@@ -137,6 +147,7 @@ impl RolloutRecorder {
cli_version: env!("CARGO_PKG_VERSION").to_string(),
instructions,
source,
model_provider: Some(config.model_provider_id.clone()),
}),
)
}
@@ -267,10 +278,6 @@ impl RolloutRecorder {
}))
}
pub(crate) fn get_rollout_path(&self) -> PathBuf {
self.rollout_path.clone()
}
pub async fn shutdown(&self) -> std::io::Result<()> {
let (tx_done, rx_done) = oneshot::channel();
match self.tx.send(RolloutCmd::Shutdown { ack: tx_done }).await {

View File

@@ -32,6 +32,14 @@ use codex_protocol::protocol::SessionSource;
use codex_protocol::protocol::UserMessageEvent;
const NO_SOURCE_FILTER: &[SessionSource] = &[];
const TEST_PROVIDER: &str = "test-provider";
fn provider_vec(providers: &[&str]) -> Vec<String> {
providers
.iter()
.map(std::string::ToString::to_string)
.collect()
}
fn write_session_file(
root: &Path,
@@ -39,6 +47,24 @@ fn write_session_file(
uuid: Uuid,
num_records: usize,
source: Option<SessionSource>,
) -> std::io::Result<(OffsetDateTime, Uuid)> {
write_session_file_with_provider(
root,
ts_str,
uuid,
num_records,
source,
Some("test-provider"),
)
}
fn write_session_file_with_provider(
root: &Path,
ts_str: &str,
uuid: Uuid,
num_records: usize,
source: Option<SessionSource>,
model_provider: Option<&str>,
) -> std::io::Result<(OffsetDateTime, Uuid)> {
let format: &[FormatItem] =
format_description!("[year]-[month]-[day]T[hour]-[minute]-[second]");
@@ -68,6 +94,9 @@ fn write_session_file(
if let Some(source) = source {
payload["source"] = serde_json::to_value(source).unwrap();
}
if let Some(provider) = model_provider {
payload["model_provider"] = serde_json::Value::String(provider.to_string());
}
let meta = serde_json::json!({
"timestamp": ts_str,
@@ -134,9 +163,17 @@ async fn test_list_conversations_latest_first() {
)
.unwrap();
let page = get_conversations(home, 10, None, INTERACTIVE_SESSION_SOURCES)
.await
.unwrap();
let provider_filter = provider_vec(&[TEST_PROVIDER]);
let page = get_conversations(
home,
10,
None,
INTERACTIVE_SESSION_SOURCES,
Some(provider_filter.as_slice()),
TEST_PROVIDER,
)
.await
.unwrap();
// Build expected objects
let p1 = home
@@ -166,6 +203,7 @@ async fn test_list_conversations_latest_first() {
"originator": "test_originator",
"cli_version": "test_version",
"source": "vscode",
"model_provider": "test-provider",
})];
let head_2 = vec![serde_json::json!({
"id": u2,
@@ -175,6 +213,7 @@ async fn test_list_conversations_latest_first() {
"originator": "test_originator",
"cli_version": "test_version",
"source": "vscode",
"model_provider": "test-provider",
})];
let head_1 = vec![serde_json::json!({
"id": u1,
@@ -184,11 +223,9 @@ async fn test_list_conversations_latest_first() {
"originator": "test_originator",
"cli_version": "test_version",
"source": "vscode",
"model_provider": "test-provider",
})];
let expected_cursor: Cursor =
serde_json::from_str(&format!("\"2025-01-01T12-00-00|{u1}\"")).unwrap();
let expected = ConversationsPage {
items: vec![
ConversationItem {
@@ -213,7 +250,7 @@ async fn test_list_conversations_latest_first() {
updated_at: Some("2025-01-01T12-00-00".into()),
},
],
next_cursor: Some(expected_cursor),
next_cursor: None,
num_scanned_files: 3,
reached_scan_cap: false,
};
@@ -275,9 +312,17 @@ async fn test_pagination_cursor() {
)
.unwrap();
let page1 = get_conversations(home, 2, None, INTERACTIVE_SESSION_SOURCES)
.await
.unwrap();
let provider_filter = provider_vec(&[TEST_PROVIDER]);
let page1 = get_conversations(
home,
2,
None,
INTERACTIVE_SESSION_SOURCES,
Some(provider_filter.as_slice()),
TEST_PROVIDER,
)
.await
.unwrap();
let p5 = home
.join("sessions")
.join("2025")
@@ -298,6 +343,7 @@ async fn test_pagination_cursor() {
"originator": "test_originator",
"cli_version": "test_version",
"source": "vscode",
"model_provider": "test-provider",
})];
let head_4 = vec![serde_json::json!({
"id": u4,
@@ -307,6 +353,7 @@ async fn test_pagination_cursor() {
"originator": "test_originator",
"cli_version": "test_version",
"source": "vscode",
"model_provider": "test-provider",
})];
let expected_cursor1: Cursor =
serde_json::from_str(&format!("\"2025-03-04T09-00-00|{u4}\"")).unwrap();
@@ -338,6 +385,8 @@ async fn test_pagination_cursor() {
2,
page1.next_cursor.as_ref(),
INTERACTIVE_SESSION_SOURCES,
Some(provider_filter.as_slice()),
TEST_PROVIDER,
)
.await
.unwrap();
@@ -361,6 +410,7 @@ async fn test_pagination_cursor() {
"originator": "test_originator",
"cli_version": "test_version",
"source": "vscode",
"model_provider": "test-provider",
})];
let head_2 = vec![serde_json::json!({
"id": u2,
@@ -370,6 +420,7 @@ async fn test_pagination_cursor() {
"originator": "test_originator",
"cli_version": "test_version",
"source": "vscode",
"model_provider": "test-provider",
})];
let expected_cursor2: Cursor =
serde_json::from_str(&format!("\"2025-03-02T09-00-00|{u2}\"")).unwrap();
@@ -401,6 +452,8 @@ async fn test_pagination_cursor() {
2,
page2.next_cursor.as_ref(),
INTERACTIVE_SESSION_SOURCES,
Some(provider_filter.as_slice()),
TEST_PROVIDER,
)
.await
.unwrap();
@@ -418,9 +471,8 @@ async fn test_pagination_cursor() {
"originator": "test_originator",
"cli_version": "test_version",
"source": "vscode",
"model_provider": "test-provider",
})];
let expected_cursor3: Cursor =
serde_json::from_str(&format!("\"2025-03-01T09-00-00|{u1}\"")).unwrap();
let expected_page3 = ConversationsPage {
items: vec![ConversationItem {
path: p1,
@@ -429,7 +481,7 @@ async fn test_pagination_cursor() {
created_at: Some("2025-03-01T09-00-00".into()),
updated_at: Some("2025-03-01T09-00-00".into()),
}],
next_cursor: Some(expected_cursor3),
next_cursor: None,
num_scanned_files: 5, // scanned 05, 04 (anchor), 03, 02 (anchor), 01
reached_scan_cap: false,
};
@@ -445,9 +497,17 @@ async fn test_get_conversation_contents() {
let ts = "2025-04-01T10-30-00";
write_session_file(home, ts, uuid, 2, Some(SessionSource::VSCode)).unwrap();
let page = get_conversations(home, 1, None, INTERACTIVE_SESSION_SOURCES)
.await
.unwrap();
let provider_filter = provider_vec(&[TEST_PROVIDER]);
let page = get_conversations(
home,
1,
None,
INTERACTIVE_SESSION_SOURCES,
Some(provider_filter.as_slice()),
TEST_PROVIDER,
)
.await
.unwrap();
let path = &page.items[0].path;
let content = get_conversation(path).await.unwrap();
@@ -467,8 +527,8 @@ async fn test_get_conversation_contents() {
"originator": "test_originator",
"cli_version": "test_version",
"source": "vscode",
"model_provider": "test-provider",
})];
let expected_cursor: Cursor = serde_json::from_str(&format!("\"{ts}|{uuid}\"")).unwrap();
let expected_page = ConversationsPage {
items: vec![ConversationItem {
path: expected_path,
@@ -477,7 +537,7 @@ async fn test_get_conversation_contents() {
created_at: Some(ts.into()),
updated_at: Some(ts.into()),
}],
next_cursor: Some(expected_cursor),
next_cursor: None,
num_scanned_files: 1,
reached_scan_cap: false,
};
@@ -495,6 +555,7 @@ async fn test_get_conversation_contents() {
"originator": "test_originator",
"cli_version": "test_version",
"source": "vscode",
"model_provider": "test-provider",
}
});
let user_event = serde_json::json!({
@@ -532,6 +593,7 @@ async fn test_tail_includes_last_response_items() -> Result<()> {
originator: "test_originator".into(),
cli_version: "test_version".into(),
source: SessionSource::VSCode,
model_provider: Some("test-provider".into()),
},
git: None,
}),
@@ -563,7 +625,16 @@ async fn test_tail_includes_last_response_items() -> Result<()> {
}
drop(file);
let page = get_conversations(home, 1, None, INTERACTIVE_SESSION_SOURCES).await?;
let provider_filter = provider_vec(&[TEST_PROVIDER]);
let page = get_conversations(
home,
1,
None,
INTERACTIVE_SESSION_SOURCES,
Some(provider_filter.as_slice()),
TEST_PROVIDER,
)
.await?;
let item = page.items.first().expect("conversation item");
let tail_len = item.tail.len();
assert_eq!(tail_len, 10usize.min(total_messages));
@@ -615,6 +686,7 @@ async fn test_tail_handles_short_sessions() -> Result<()> {
originator: "test_originator".into(),
cli_version: "test_version".into(),
source: SessionSource::VSCode,
model_provider: Some("test-provider".into()),
},
git: None,
}),
@@ -645,7 +717,16 @@ async fn test_tail_handles_short_sessions() -> Result<()> {
}
drop(file);
let page = get_conversations(home, 1, None, INTERACTIVE_SESSION_SOURCES).await?;
let provider_filter = provider_vec(&[TEST_PROVIDER]);
let page = get_conversations(
home,
1,
None,
INTERACTIVE_SESSION_SOURCES,
Some(provider_filter.as_slice()),
TEST_PROVIDER,
)
.await?;
let tail = &page.items.first().expect("conversation item").tail;
assert_eq!(tail.len(), 3);
@@ -699,6 +780,7 @@ async fn test_tail_skips_trailing_non_responses() -> Result<()> {
originator: "test_originator".into(),
cli_version: "test_version".into(),
source: SessionSource::VSCode,
model_provider: Some("test-provider".into()),
},
git: None,
}),
@@ -743,7 +825,16 @@ async fn test_tail_skips_trailing_non_responses() -> Result<()> {
writeln!(file, "{}", serde_json::to_string(&shutdown_event)?)?;
drop(file);
let page = get_conversations(home, 1, None, INTERACTIVE_SESSION_SOURCES).await?;
let provider_filter = provider_vec(&[TEST_PROVIDER]);
let page = get_conversations(
home,
1,
None,
INTERACTIVE_SESSION_SOURCES,
Some(provider_filter.as_slice()),
TEST_PROVIDER,
)
.await?;
let tail = &page.items.first().expect("conversation item").tail;
let expected: Vec<serde_json::Value> = (0..4)
@@ -785,9 +876,17 @@ async fn test_stable_ordering_same_second_pagination() {
write_session_file(home, ts, u2, 0, Some(SessionSource::VSCode)).unwrap();
write_session_file(home, ts, u3, 0, Some(SessionSource::VSCode)).unwrap();
let page1 = get_conversations(home, 2, None, INTERACTIVE_SESSION_SOURCES)
.await
.unwrap();
let provider_filter = provider_vec(&[TEST_PROVIDER]);
let page1 = get_conversations(
home,
2,
None,
INTERACTIVE_SESSION_SOURCES,
Some(provider_filter.as_slice()),
TEST_PROVIDER,
)
.await
.unwrap();
let p3 = home
.join("sessions")
@@ -810,6 +909,7 @@ async fn test_stable_ordering_same_second_pagination() {
"originator": "test_originator",
"cli_version": "test_version",
"source": "vscode",
"model_provider": "test-provider",
})]
};
let expected_cursor1: Cursor = serde_json::from_str(&format!("\"{ts}|{u2}\"")).unwrap();
@@ -841,6 +941,8 @@ async fn test_stable_ordering_same_second_pagination() {
2,
page1.next_cursor.as_ref(),
INTERACTIVE_SESSION_SOURCES,
Some(provider_filter.as_slice()),
TEST_PROVIDER,
)
.await
.unwrap();
@@ -850,7 +952,6 @@ async fn test_stable_ordering_same_second_pagination() {
.join("07")
.join("01")
.join(format!("rollout-2025-07-01T00-00-00-{u1}.jsonl"));
let expected_cursor2: Cursor = serde_json::from_str(&format!("\"{ts}|{u1}\"")).unwrap();
let expected_page2 = ConversationsPage {
items: vec![ConversationItem {
path: p1,
@@ -859,7 +960,7 @@ async fn test_stable_ordering_same_second_pagination() {
created_at: Some(ts.to_string()),
updated_at: Some(ts.to_string()),
}],
next_cursor: Some(expected_cursor2),
next_cursor: None,
num_scanned_files: 3, // scanned u3, u2 (anchor), u1
reached_scan_cap: false,
};
@@ -891,9 +992,17 @@ async fn test_source_filter_excludes_non_matching_sessions() {
)
.unwrap();
let interactive_only = get_conversations(home, 10, None, INTERACTIVE_SESSION_SOURCES)
.await
.unwrap();
let provider_filter = provider_vec(&[TEST_PROVIDER]);
let interactive_only = get_conversations(
home,
10,
None,
INTERACTIVE_SESSION_SOURCES,
Some(provider_filter.as_slice()),
TEST_PROVIDER,
)
.await
.unwrap();
let paths: Vec<_> = interactive_only
.items
.iter()
@@ -905,7 +1014,7 @@ async fn test_source_filter_excludes_non_matching_sessions() {
path.ends_with("rollout-2025-08-02T10-00-00-00000000-0000-0000-0000-00000000002a.jsonl")
}));
let all_sessions = get_conversations(home, 10, None, NO_SOURCE_FILTER)
let all_sessions = get_conversations(home, 10, None, NO_SOURCE_FILTER, None, TEST_PROVIDER)
.await
.unwrap();
let all_paths: Vec<_> = all_sessions
@@ -921,3 +1030,102 @@ async fn test_source_filter_excludes_non_matching_sessions() {
path.ends_with("rollout-2025-08-01T10-00-00-00000000-0000-0000-0000-00000000004d.jsonl")
}));
}
#[tokio::test]
async fn test_model_provider_filter_selects_only_matching_sessions() -> Result<()> {
let temp = TempDir::new().unwrap();
let home = temp.path();
let openai_id = Uuid::from_u128(1);
let beta_id = Uuid::from_u128(2);
let none_id = Uuid::from_u128(3);
write_session_file_with_provider(
home,
"2025-09-01T12-00-00",
openai_id,
1,
Some(SessionSource::VSCode),
Some("openai"),
)?;
write_session_file_with_provider(
home,
"2025-09-01T11-00-00",
beta_id,
1,
Some(SessionSource::VSCode),
Some("beta"),
)?;
write_session_file_with_provider(
home,
"2025-09-01T10-00-00",
none_id,
1,
Some(SessionSource::VSCode),
None,
)?;
let openai_id_str = openai_id.to_string();
let none_id_str = none_id.to_string();
let openai_filter = provider_vec(&["openai"]);
let openai_sessions = get_conversations(
home,
10,
None,
NO_SOURCE_FILTER,
Some(openai_filter.as_slice()),
"openai",
)
.await?;
assert_eq!(openai_sessions.items.len(), 2);
let openai_ids: Vec<_> = openai_sessions
.items
.iter()
.filter_map(|item| {
item.head
.first()
.and_then(|value| value.get("id"))
.and_then(serde_json::Value::as_str)
.map(str::to_string)
})
.collect();
assert!(openai_ids.contains(&openai_id_str));
assert!(openai_ids.contains(&none_id_str));
let beta_filter = provider_vec(&["beta"]);
let beta_sessions = get_conversations(
home,
10,
None,
NO_SOURCE_FILTER,
Some(beta_filter.as_slice()),
"openai",
)
.await?;
assert_eq!(beta_sessions.items.len(), 1);
let beta_id_str = beta_id.to_string();
let beta_head = beta_sessions
.items
.first()
.and_then(|item| item.head.first())
.and_then(|value| value.get("id"))
.and_then(serde_json::Value::as_str);
assert_eq!(beta_head, Some(beta_id_str.as_str()));
let unknown_filter = provider_vec(&["unknown"]);
let unknown_sessions = get_conversations(
home,
10,
None,
NO_SOURCE_FILTER,
Some(unknown_filter.as_slice()),
"openai",
)
.await?;
assert!(unknown_sessions.items.is_empty());
let all_sessions = get_conversations(home, 10, None, NO_SOURCE_FILTER, None, "openai").await?;
assert_eq!(all_sessions.items.len(), 3);
Ok(())
}
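Several expectations in these tests change from `next_cursor: Some(..)` to `next_cursor: None` because a cursor is now returned only when the scan stopped with more candidates remaining. A minimal sketch of that rule over a plain descending list of ids; `Page` and `page_after` are hypothetical stand-ins for `ConversationsPage` and the directory traversal, and the scan cap is not modeled:

```rust
struct Page {
    items: Vec<u32>,
    next_cursor: Option<u32>,
}

/// `all` is sorted newest-first; `cursor` is the last id of the previous page.
fn page_after(all: &[u32], cursor: Option<u32>, page_size: usize) -> Page {
    let mut items = Vec::with_capacity(page_size);
    let mut more_matches_available = false;
    for &id in all.iter().filter(|&&id| cursor.map_or(true, |c| id < c)) {
        if items.len() == page_size {
            // Another candidate exists beyond the full page.
            more_matches_available = true;
            break;
        }
        items.push(id);
    }
    // Only hand out a cursor when there is something left to fetch.
    let next_cursor = if more_matches_available {
        items.last().copied()
    } else {
        None
    };
    Page { items, next_cursor }
}
```

With five sessions and a page size of two, the first two pages carry a cursor and the third returns `None`, matching the updated `test_pagination_cursor` expectations.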

View File

@@ -0,0 +1,275 @@
use std::path::Path;
use std::path::PathBuf;
use std::sync::Arc;
use std::time::Duration;
use std::time::Instant;
use crate::AuthManager;
use crate::ModelProviderInfo;
use crate::client::ModelClient;
use crate::client_common::Prompt;
use crate::client_common::ResponseEvent;
use crate::config::Config;
use crate::protocol::SandboxPolicy;
use askama::Template;
use codex_otel::otel_event_manager::OtelEventManager;
use codex_protocol::ConversationId;
use codex_protocol::models::ContentItem;
use codex_protocol::models::ResponseItem;
use codex_protocol::protocol::SandboxCommandAssessment;
use futures::StreamExt;
use serde_json::json;
use tokio::time::timeout;
use tracing::warn;
const SANDBOX_ASSESSMENT_TIMEOUT: Duration = Duration::from_secs(5);
const SANDBOX_RISK_CATEGORY_VALUES: &[&str] = &[
"data_deletion",
"data_exfiltration",
"privilege_escalation",
"system_modification",
"network_access",
"resource_exhaustion",
"compliance",
];
#[derive(Template)]
#[template(path = "sandboxing/assessment_prompt.md", escape = "none")]
struct SandboxAssessmentPromptTemplate<'a> {
platform: &'a str,
sandbox_policy: &'a str,
filesystem_roots: Option<&'a str>,
working_directory: &'a str,
command_argv: &'a str,
command_joined: &'a str,
sandbox_failure_message: Option<&'a str>,
}
#[allow(clippy::too_many_arguments)]
pub(crate) async fn assess_command(
config: Arc<Config>,
provider: ModelProviderInfo,
auth_manager: Arc<AuthManager>,
parent_otel: &OtelEventManager,
conversation_id: ConversationId,
call_id: &str,
command: &[String],
sandbox_policy: &SandboxPolicy,
cwd: &Path,
failure_message: Option<&str>,
) -> Option<SandboxCommandAssessment> {
if !config.experimental_sandbox_command_assessment || command.is_empty() {
return None;
}
let command_json = serde_json::to_string(command).unwrap_or_else(|_| "[]".to_string());
let command_joined =
shlex::try_join(command.iter().map(String::as_str)).unwrap_or_else(|_| command.join(" "));
let failure = failure_message
.map(str::trim)
.filter(|msg| !msg.is_empty())
.map(str::to_string);
let cwd_str = cwd.to_string_lossy().to_string();
let sandbox_summary = summarize_sandbox_policy(sandbox_policy);
let mut roots = sandbox_roots_for_prompt(sandbox_policy, cwd);
roots.sort();
roots.dedup();
let platform = std::env::consts::OS;
let roots_formatted = roots.iter().map(|root| root.to_string_lossy().to_string());
let filesystem_roots = match roots_formatted.collect::<Vec<_>>() {
collected if collected.is_empty() => None,
collected => Some(collected.join(", ")),
};
let prompt_template = SandboxAssessmentPromptTemplate {
platform,
sandbox_policy: sandbox_summary.as_str(),
filesystem_roots: filesystem_roots.as_deref(),
working_directory: cwd_str.as_str(),
command_argv: command_json.as_str(),
command_joined: command_joined.as_str(),
sandbox_failure_message: failure.as_deref(),
};
let rendered_prompt = match prompt_template.render() {
Ok(rendered) => rendered,
Err(err) => {
warn!("failed to render sandbox assessment prompt: {err}");
return None;
}
};
let (system_prompt_section, user_prompt_section) = match rendered_prompt.split_once("\n---\n") {
Some(split) => split,
None => {
warn!("rendered sandbox assessment prompt missing separator");
return None;
}
};
let system_prompt = system_prompt_section
.strip_prefix("System Prompt:\n")
.unwrap_or(system_prompt_section)
.trim()
.to_string();
let user_prompt = user_prompt_section
.strip_prefix("User Prompt:\n")
.unwrap_or(user_prompt_section)
.trim()
.to_string();
let prompt = Prompt {
input: vec![ResponseItem::Message {
id: None,
role: "user".to_string(),
content: vec![ContentItem::InputText { text: user_prompt }],
}],
tools: Vec::new(),
parallel_tool_calls: false,
base_instructions_override: Some(system_prompt),
output_schema: Some(sandbox_assessment_schema()),
};
let child_otel =
parent_otel.with_model(config.model.as_str(), config.model_family.slug.as_str());
let client = ModelClient::new(
Arc::clone(&config),
Some(auth_manager),
child_otel,
provider,
config.model_reasoning_effort,
config.model_reasoning_summary,
conversation_id,
);
let start = Instant::now();
let assessment_result = timeout(SANDBOX_ASSESSMENT_TIMEOUT, async move {
let mut stream = client.stream(&prompt).await?;
let mut last_json: Option<String> = None;
while let Some(event) = stream.next().await {
match event {
Ok(ResponseEvent::OutputItemDone(item)) => {
if let Some(text) = response_item_text(&item) {
last_json = Some(text);
}
}
Ok(ResponseEvent::RateLimits(_)) => {}
Ok(ResponseEvent::Completed { .. }) => break,
Ok(_) => continue,
Err(err) => return Err(err),
}
}
Ok(last_json)
})
.await;
let duration = start.elapsed();
parent_otel.sandbox_assessment_latency(call_id, duration);
match assessment_result {
Ok(Ok(Some(raw))) => match serde_json::from_str::<SandboxCommandAssessment>(raw.trim()) {
Ok(assessment) => {
parent_otel.sandbox_assessment(
call_id,
"success",
Some(assessment.risk_level),
&assessment.risk_categories,
duration,
);
return Some(assessment);
}
Err(err) => {
warn!("failed to parse sandbox assessment JSON: {err}");
parent_otel.sandbox_assessment(call_id, "parse_error", None, &[], duration);
}
},
Ok(Ok(None)) => {
warn!("sandbox assessment response did not include any message");
parent_otel.sandbox_assessment(call_id, "no_output", None, &[], duration);
}
Ok(Err(err)) => {
warn!("sandbox assessment failed: {err}");
parent_otel.sandbox_assessment(call_id, "model_error", None, &[], duration);
}
Err(_) => {
warn!("sandbox assessment timed out");
parent_otel.sandbox_assessment(call_id, "timeout", None, &[], duration);
}
}
None
}
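Reviewer note: `assess_command` relies on the rendered template containing a `\n---\n` separator and optional `System Prompt:` / `User Prompt:` labels. The sketch below isolates that step with the standard library only. The helper name `split_rendered_prompt` is hypothetical; the separator and prefixes are copied from the function above.

```rust
/// Hypothetical helper isolating the prompt-splitting step.
/// Returns `None` when the separator is missing, which is the case
/// where `assess_command` logs a warning and gives up.
fn split_rendered_prompt(rendered: &str) -> Option<(String, String)> {
    let (system_section, user_section) = rendered.split_once("\n---\n")?;
    // The labels are optional: if a prefix is absent, keep the section as-is.
    let system = system_section
        .strip_prefix("System Prompt:\n")
        .unwrap_or(system_section)
        .trim()
        .to_string();
    let user = user_section
        .strip_prefix("User Prompt:\n")
        .unwrap_or(user_section)
        .trim()
        .to_string();
    Some((system, user))
}
```

Note that `split_once` splits at the first separator only, so a later `---` line inside the user section stays in the user prompt.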
fn summarize_sandbox_policy(policy: &SandboxPolicy) -> String {
match policy {
SandboxPolicy::DangerFullAccess => "danger-full-access".to_string(),
SandboxPolicy::ReadOnly => "read-only".to_string(),
SandboxPolicy::WorkspaceWrite { network_access, .. } => {
let network = if *network_access {
"network"
} else {
"no-network"
};
format!("workspace-write (network_access={network})")
}
}
}
fn sandbox_roots_for_prompt(policy: &SandboxPolicy, cwd: &Path) -> Vec<PathBuf> {
let mut roots = vec![cwd.to_path_buf()];
if let SandboxPolicy::WorkspaceWrite { writable_roots, .. } = policy {
roots.extend(writable_roots.iter().cloned());
}
roots
}
fn sandbox_assessment_schema() -> serde_json::Value {
json!({
"type": "object",
"required": ["description", "risk_level", "risk_categories"],
"properties": {
"description": {
"type": "string",
"minLength": 1,
"maxLength": 500
},
"risk_level": {
"type": "string",
"enum": ["low", "medium", "high"]
},
"risk_categories": {
"type": "array",
"items": {
"type": "string",
"enum": SANDBOX_RISK_CATEGORY_VALUES
}
}
},
"additionalProperties": false
})
}
fn response_item_text(item: &ResponseItem) -> Option<String> {
match item {
ResponseItem::Message { content, .. } => {
let mut buffers: Vec<&str> = Vec::new();
for segment in content {
match segment {
ContentItem::InputText { text } | ContentItem::OutputText { text } => {
if !text.is_empty() {
buffers.push(text);
}
}
ContentItem::InputImage { .. } => {}
}
}
if buffers.is_empty() {
None
} else {
Some(buffers.join("\n"))
}
}
ResponseItem::FunctionCallOutput { output, .. } => Some(output.content.clone()),
_ => None,
}
}
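Reviewer note: the final `match` in `assess_command` unwraps three layers (timeout, stream error, missing output) before attempting to parse JSON, and reports a distinct telemetry label for each. The sketch below reproduces only that mapping. The types `TimedOut` and `ModelError` and the `parses` callback are hypothetical stand-ins for `tokio`'s timeout error, the client error, and `serde_json::from_str`. The label strings are the ones used in the diff.

```rust
/// Hypothetical stand-in for the timeout error.
#[derive(Debug)]
struct TimedOut;

/// Hypothetical stand-in for a model/stream error.
#[derive(Debug)]
struct ModelError(String);

/// Maps the nested result to the telemetry label used in the diff.
/// `parses` stands in for the JSON deserialization step.
fn classify(
    result: &Result<Result<Option<String>, ModelError>, TimedOut>,
    parses: impl Fn(&str) -> bool,
) -> &'static str {
    match result {
        Ok(Ok(Some(raw))) if parses(raw.trim()) => "success",
        Ok(Ok(Some(_))) => "parse_error",
        Ok(Ok(None)) => "no_output",
        Ok(Err(_)) => "model_error",
        Err(_) => "timeout",
    }
}
```

Only the `"success"` arm yields an assessment; every other arm falls through to `None`, so a failed assessment never blocks the approval flow.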


@@ -5,6 +5,9 @@ Build platform wrappers and produce ExecEnv for execution. Owns low-level
sandbox placement and transformation of portable CommandSpec into a
ready-to-spawn environment.
*/
pub mod assessment;
use crate::exec::ExecToolCallOutput;
use crate::exec::SandboxType;
use crate::exec::StdoutStream;


@@ -0,0 +1,110 @@
use crate::codex::TurnContext;
use crate::state::TaskKind;
use crate::tasks::SessionTask;
use crate::tasks::SessionTaskContext;
use async_trait::async_trait;
use codex_git_tooling::CreateGhostCommitOptions;
use codex_git_tooling::GitToolingError;
use codex_git_tooling::create_ghost_commit;
use codex_protocol::models::ResponseItem;
use codex_protocol::user_input::UserInput;
use codex_utils_readiness::Readiness;
use codex_utils_readiness::Token;
use std::sync::Arc;
use tokio_util::sync::CancellationToken;
use tracing::info;
use tracing::warn;
pub(crate) struct GhostSnapshotTask {
token: Token,
}
#[async_trait]
impl SessionTask for GhostSnapshotTask {
fn kind(&self) -> TaskKind {
TaskKind::Regular
}
async fn run(
self: Arc<Self>,
session: Arc<SessionTaskContext>,
ctx: Arc<TurnContext>,
_input: Vec<UserInput>,
cancellation_token: CancellationToken,
) -> Option<String> {
tokio::task::spawn(async move {
let token = self.token;
let ctx_for_task = Arc::clone(&ctx);
let cancelled = tokio::select! {
_ = cancellation_token.cancelled() => true,
_ = async {
let repo_path = ctx_for_task.cwd.clone();
// Creating the ghost commit is blocking work, so run it on Tokio's blocking pool.
match tokio::task::spawn_blocking(move || {
let options = CreateGhostCommitOptions::new(&repo_path);
create_ghost_commit(&options)
})
.await
{
Ok(Ok(ghost_commit)) => {
info!("ghost snapshot blocking task finished");
session
.session
.record_conversation_items(&ctx, &[ResponseItem::GhostSnapshot {
ghost_commit: ghost_commit.clone(),
}])
.await;
info!("ghost commit captured: {}", ghost_commit.id());
}
Ok(Err(err)) => {
warn!(
sub_id = ctx_for_task.sub_id.as_str(),
"failed to capture ghost snapshot: {err}"
);
let message = match err {
GitToolingError::NotAGitRepository { .. } => {
"Snapshots disabled: current directory is not a Git repository."
.to_string()
}
_ => format!("Snapshots disabled after ghost snapshot error: {err}."),
};
session
.session
.notify_background_event(&ctx_for_task, message)
.await;
}
Err(err) => {
warn!(
sub_id = ctx_for_task.sub_id.as_str(),
"ghost snapshot task panicked: {err}"
);
let message =
format!("Snapshots disabled after ghost snapshot panic: {err}.");
session
.session
.notify_background_event(&ctx_for_task, message)
.await;
}
}
} => false,
};
if cancelled {
info!("ghost snapshot task cancelled");
}
match ctx.tool_call_gate.mark_ready(token).await {
Ok(true) => info!("ghost snapshot gate marked ready"),
Ok(false) => warn!("ghost snapshot gate already ready"),
Err(err) => warn!("failed to mark ghost snapshot ready: {err}"),
}
});
None
}
}
impl GhostSnapshotTask {
pub(crate) fn new(token: Token) -> Self {
Self { token }
}
}
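Reviewer note: the task above marks `tool_call_gate` ready whether the snapshot succeeds, fails, or is cancelled, and tool calls elsewhere in this diff wait on that gate before running. The sketch below illustrates the ordering guarantee with a one-shot gate built from `std::sync` primitives and threads. The `Gate` type is hypothetical; the real `codex_utils_readiness` API is async and token-based, and is not reproduced here.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical one-shot gate: waiters block until `mark_ready` is called.
struct Gate {
    ready: Mutex<bool>,
    cv: Condvar,
}

impl Gate {
    fn new() -> Self {
        Gate { ready: Mutex::new(false), cv: Condvar::new() }
    }

    /// Opens the gate. Returns true if this call opened it,
    /// false if it was already open.
    fn mark_ready(&self) -> bool {
        let mut ready = self.ready.lock().unwrap();
        let flipped = !*ready;
        *ready = true;
        self.cv.notify_all();
        flipped
    }

    /// Blocks until the gate is open.
    fn wait_ready(&self) {
        let mut ready = self.ready.lock().unwrap();
        while !*ready {
            ready = self.cv.wait(ready).unwrap();
        }
    }
}

/// Runs a "tool call" thread that waits on the gate while the caller plays
/// the snapshot task. The gate opens regardless of the snapshot outcome.
fn run_with_gate(snapshot_ok: bool) -> Vec<&'static str> {
    let gate = Arc::new(Gate::new());
    let log: Arc<Mutex<Vec<&'static str>>> = Arc::new(Mutex::new(Vec::new()));

    let tool = {
        let (gate, log) = (Arc::clone(&gate), Arc::clone(&log));
        thread::spawn(move || {
            gate.wait_ready();
            log.lock().unwrap().push("tool ran");
        })
    };

    let outcome = if snapshot_ok { "snapshot ok" } else { "snapshot failed" };
    log.lock().unwrap().push(outcome);
    gate.mark_ready();
    tool.join().unwrap();

    let entries = log.lock().unwrap().clone();
    entries
}
```

The point of opening the gate on every path is that a failed or cancelled snapshot only disables undo; it should never leave tool calls waiting forever.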


@@ -1,6 +1,8 @@
mod compact;
mod ghost_snapshot;
mod regular;
mod review;
mod undo;
use std::sync::Arc;
use std::time::Duration;
@@ -25,8 +27,10 @@ use crate::state::TaskKind;
use codex_protocol::user_input::UserInput;
pub(crate) use compact::CompactTask;
pub(crate) use ghost_snapshot::GhostSnapshotTask;
pub(crate) use regular::RegularTask;
pub(crate) use review::ReviewTask;
pub(crate) use undo::UndoTask;
const GRACEFULL_INTERRUPTION_TIMEOUT_MS: u64 = 100;
@@ -46,10 +50,28 @@ impl SessionTaskContext {
}
}
/// Async task that drives a [`Session`] turn.
///
/// Implementations encapsulate a specific Codex workflow (regular chat,
/// reviews, ghost snapshots, etc.). Each task instance is owned by a
/// [`Session`] and executed on a background Tokio task. The trait is
/// intentionally small: implementers identify themselves via
/// [`SessionTask::kind`], perform their work in [`SessionTask::run`], and may
/// release resources in [`SessionTask::abort`].
#[async_trait]
pub(crate) trait SessionTask: Send + Sync + 'static {
/// Describes the type of work the task performs so the session can
/// surface it in telemetry and UI.
fn kind(&self) -> TaskKind;
/// Executes the task until completion or cancellation.
///
/// Implementations typically stream protocol events using `session` and
/// `ctx`, returning an optional final agent message when finished. The
/// provided `cancellation_token` is cancelled when the session requests an
/// abort; implementers should watch for it and terminate quickly once it
/// fires. Returning [`Some`] yields a final message that
/// [`Session::on_task_finished`] will emit to the client.
async fn run(
self: Arc<Self>,
session: Arc<SessionTaskContext>,
@@ -58,6 +80,11 @@ pub(crate) trait SessionTask: Send + Sync + 'static {
cancellation_token: CancellationToken,
) -> Option<String>;
/// Gives the task a chance to perform cleanup after an abort.
///
/// The default implementation is a no-op; override this if additional
/// teardown or notifications are required once
/// [`Session::abort_all_tasks`] cancels the task.
async fn abort(&self, session: Arc<SessionTaskContext>, ctx: Arc<TurnContext>) {
let _ = (session, ctx);
}


@@ -0,0 +1,117 @@
use std::sync::Arc;
use crate::codex::TurnContext;
use crate::protocol::EventMsg;
use crate::protocol::UndoCompletedEvent;
use crate::protocol::UndoStartedEvent;
use crate::state::TaskKind;
use crate::tasks::SessionTask;
use crate::tasks::SessionTaskContext;
use async_trait::async_trait;
use codex_git_tooling::restore_ghost_commit;
use codex_protocol::models::ResponseItem;
use codex_protocol::user_input::UserInput;
use tokio_util::sync::CancellationToken;
use tracing::error;
use tracing::info;
use tracing::warn;
pub(crate) struct UndoTask;
impl UndoTask {
pub(crate) fn new() -> Self {
Self
}
}
#[async_trait]
impl SessionTask for UndoTask {
fn kind(&self) -> TaskKind {
TaskKind::Regular
}
async fn run(
self: Arc<Self>,
session: Arc<SessionTaskContext>,
ctx: Arc<TurnContext>,
_input: Vec<UserInput>,
cancellation_token: CancellationToken,
) -> Option<String> {
let sess = session.clone_session();
sess.send_event(
ctx.as_ref(),
EventMsg::UndoStarted(UndoStartedEvent {
message: Some("Undo in progress...".to_string()),
}),
)
.await;
if cancellation_token.is_cancelled() {
sess.send_event(
ctx.as_ref(),
EventMsg::UndoCompleted(UndoCompletedEvent {
success: false,
message: Some("Undo cancelled.".to_string()),
}),
)
.await;
return None;
}
let mut history = sess.clone_history().await;
let mut items = history.get_history();
let mut completed = UndoCompletedEvent {
success: false,
message: None,
};
let Some((idx, ghost_commit)) =
items
.iter()
.enumerate()
.rev()
.find_map(|(idx, item)| match item {
ResponseItem::GhostSnapshot { ghost_commit } => {
Some((idx, ghost_commit.clone()))
}
_ => None,
})
else {
completed.message = Some("No ghost snapshot available to undo.".to_string());
sess.send_event(ctx.as_ref(), EventMsg::UndoCompleted(completed))
.await;
return None;
};
let commit_id = ghost_commit.id().to_string();
let repo_path = ctx.cwd.clone();
let restore_result =
tokio::task::spawn_blocking(move || restore_ghost_commit(&repo_path, &ghost_commit))
.await;
match restore_result {
Ok(Ok(())) => {
items.remove(idx);
sess.replace_history(items).await;
let short_id: String = commit_id.chars().take(7).collect();
info!(commit_id = commit_id, "Undo restored ghost snapshot");
completed.success = true;
completed.message = Some(format!("Undo restored snapshot {short_id}."));
}
Ok(Err(err)) => {
let message = format!("Failed to restore snapshot {commit_id}: {err}");
warn!("{message}");
completed.message = Some(message);
}
Err(err) => {
let message = format!("Failed to restore snapshot {commit_id}: {err}");
error!("{message}");
completed.message = Some(message);
}
}
sess.send_event(ctx.as_ref(), EventMsg::UndoCompleted(completed))
.await;
None
}
}
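Reviewer note: `UndoTask` scans the history from the end to find the most recent ghost snapshot and removes that entry only after the restore succeeds. The sketch below shows the lookup on a simplified history. The `Item` enum and both helper names are hypothetical stand-ins for `ResponseItem` and the inline code above.

```rust
/// Hypothetical, simplified history item.
#[derive(Debug, Clone, PartialEq)]
enum Item {
    Message(String),
    GhostSnapshot { commit_id: String },
}

/// Returns the index and commit id of the most recent snapshot, if any.
fn find_latest_snapshot(items: &[Item]) -> Option<(usize, String)> {
    items
        .iter()
        .enumerate()
        .rev() // walk from the newest entry backwards
        .find_map(|(idx, item)| match item {
            Item::GhostSnapshot { commit_id } => Some((idx, commit_id.clone())),
            _ => None,
        })
}

/// Shortens a commit id to at most seven characters for display.
fn short_id(commit_id: &str) -> String {
    commit_id.chars().take(7).collect()
}
```

Because only the restored snapshot is removed from history, a second undo would pick up the next older snapshot, if one exists.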


@@ -5,6 +5,7 @@ use crate::tools::TELEMETRY_PREVIEW_MAX_LINES;
use crate::tools::TELEMETRY_PREVIEW_TRUNCATION_NOTICE;
use crate::turn_diff_tracker::TurnDiffTracker;
use codex_otel::otel_event_manager::OtelEventManager;
use codex_protocol::models::FunctionCallOutputContentItem;
use codex_protocol::models::FunctionCallOutputPayload;
use codex_protocol::models::ResponseInputItem;
use codex_protocol::models::ShellToolCallParams;
@@ -65,7 +66,10 @@ impl ToolPayload {
#[derive(Clone)]
pub enum ToolOutput {
Function {
// Plain text representation of the tool output.
content: String,
// Some tool calls such as MCP calls may return structured content that can get parsed into an array of polymorphic content items.
content_items: Option<Vec<FunctionCallOutputContentItem>>,
success: Option<bool>,
},
Mcp {
@@ -90,7 +94,11 @@ impl ToolOutput {
pub fn into_response(self, call_id: &str, payload: &ToolPayload) -> ResponseInputItem {
match self {
ToolOutput::Function { content, success } => {
ToolOutput::Function {
content,
content_items,
success,
} => {
if matches!(payload, ToolPayload::Custom { .. }) {
ResponseInputItem::CustomToolCallOutput {
call_id: call_id.to_string(),
@@ -99,7 +107,11 @@ impl ToolOutput {
} else {
ResponseInputItem::FunctionCallOutput {
call_id: call_id.to_string(),
output: FunctionCallOutputPayload { content, success },
output: FunctionCallOutputPayload {
content,
content_items,
success,
},
}
}
}
@@ -163,6 +175,7 @@ mod tests {
};
let response = ToolOutput::Function {
content: "patched".to_string(),
content_items: None,
success: Some(true),
}
.into_response("call-42", &payload);
@@ -183,6 +196,7 @@ mod tests {
};
let response = ToolOutput::Function {
content: "ok".to_string(),
content_items: None,
success: Some(true),
}
.into_response("fn-1", &payload);
@@ -191,6 +205,7 @@ mod tests {
ResponseInputItem::FunctionCallOutput { call_id, output } => {
assert_eq!(call_id, "fn-1");
assert_eq!(output.content, "ok");
assert!(output.content_items.is_none());
assert_eq!(output.success, Some(true));
}
other => panic!("expected FunctionCallOutput, got {other:?}"),
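Reviewer note: the change threads the new `content_items` field from `ToolOutput::Function` into `FunctionCallOutputPayload`. The sketch below models that conversion with simplified types. All type names are hypothetical stand-ins, the payload is flattened into the response variant, and the custom-tool branch is an assumption because the hunk elides its body.

```rust
/// Hypothetical stand-in for `FunctionCallOutputContentItem`.
#[derive(Debug, Clone, PartialEq)]
enum ContentItem {
    Text(String),
    Image(String),
}

/// Simplified `ToolOutput` with only the `Function` variant.
#[derive(Debug, Clone, PartialEq)]
enum ToolOutput {
    Function {
        content: String,
        content_items: Option<Vec<ContentItem>>,
        success: Option<bool>,
    },
}

/// Simplified response type; the real code nests a payload struct.
#[derive(Debug, PartialEq)]
enum Response {
    FunctionCallOutput {
        call_id: String,
        content: String,
        content_items: Option<Vec<ContentItem>>,
        success: Option<bool>,
    },
    CustomToolCallOutput { call_id: String, output: String },
}

fn into_response(output: ToolOutput, call_id: &str, is_custom: bool) -> Response {
    let ToolOutput::Function { content, content_items, success } = output;
    if is_custom {
        // Assumption: custom tool calls carry only the plain-text content.
        Response::CustomToolCallOutput { call_id: call_id.to_string(), output: content }
    } else {
        Response::FunctionCallOutput {
            call_id: call_id.to_string(),
            content,
            content_items,
            success,
        }
    }
}
```

Handlers that only produce text pass `content_items: None`, which is why most of the remaining hunks in this diff are one-line additions.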


@@ -19,7 +19,6 @@ use std::path::Path;
use std::path::PathBuf;
use std::time::Duration;
use super::format_exec_output;
use super::format_exec_output_str;
#[derive(Clone, Copy)]
@@ -146,7 +145,7 @@ impl ToolEmitter {
(*message).to_string(),
-1,
Duration::ZERO,
format_exec_output(&message),
message.clone(),
)
.await;
}
@@ -241,7 +240,7 @@ impl ToolEmitter {
(*message).to_string(),
-1,
Duration::ZERO,
format_exec_output(&message),
message.clone(),
)
.await;
}
@@ -277,7 +276,7 @@ impl ToolEmitter {
}
Err(ToolError::Codex(err)) => {
let message = format!("execution error: {err:?}");
let response = super::format_exec_output(&message);
let response = message.clone();
event = ToolEventStage::Failure(ToolEventFailure::Message(message));
Err(FunctionCallError::RespondToModel(response))
}
@@ -289,9 +288,9 @@ impl ToolEmitter {
} else {
msg
};
let response = super::format_exec_output(&normalized);
event = ToolEventStage::Failure(ToolEventFailure::Message(normalized));
Err(FunctionCallError::RespondToModel(response))
let response = &normalized;
event = ToolEventStage::Failure(ToolEventFailure::Message(normalized.clone()));
Err(FunctionCallError::RespondToModel(response.clone()))
}
};
self.emit(ctx, event).await;


@@ -82,6 +82,7 @@ impl ToolHandler for ApplyPatchHandler {
let content = item?;
Ok(ToolOutput::Function {
content,
content_items: None,
success: Some(true),
})
}
@@ -126,6 +127,7 @@ impl ToolHandler for ApplyPatchHandler {
let content = emitter.finish(event_ctx, out).await?;
Ok(ToolOutput::Function {
content,
content_items: None,
success: Some(true),
})
}


@@ -90,11 +90,13 @@ impl ToolHandler for GrepFilesHandler {
if search_results.is_empty() {
Ok(ToolOutput::Function {
content: "No matches found.".to_string(),
content_items: None,
success: Some(false),
})
} else {
Ok(ToolOutput::Function {
content: search_results.join("\n"),
content_items: None,
success: Some(true),
})
}


@@ -106,6 +106,7 @@ impl ToolHandler for ListDirHandler {
output.extend(entries);
Ok(ToolOutput::Function {
content: output.join("\n"),
content_items: None,
success: Some(true),
})
}


@@ -56,8 +56,16 @@ impl ToolHandler for McpHandler {
Ok(ToolOutput::Mcp { result })
}
codex_protocol::models::ResponseInputItem::FunctionCallOutput { output, .. } => {
let codex_protocol::models::FunctionCallOutputPayload { content, success } = output;
Ok(ToolOutput::Function { content, success })
let codex_protocol::models::FunctionCallOutputPayload {
content,
content_items,
success,
} = output;
Ok(ToolOutput::Function {
content,
content_items,
success,
})
}
_ => Err(FunctionCallError::RespondToModel(
"mcp handler received unexpected response variant".to_string(),


@@ -297,7 +297,10 @@ async fn handle_list_resources(
match payload_result {
Ok(payload) => match serialize_function_output(payload) {
Ok(output) => {
let ToolOutput::Function { content, success } = &output else {
let ToolOutput::Function {
content, success, ..
} = &output
else {
unreachable!("MCP resource handler should return function output");
};
let duration = start.elapsed();
@@ -403,7 +406,10 @@ async fn handle_list_resource_templates(
match payload_result {
Ok(payload) => match serialize_function_output(payload) {
Ok(output) => {
let ToolOutput::Function { content, success } = &output else {
let ToolOutput::Function {
content, success, ..
} = &output
else {
unreachable!("MCP resource handler should return function output");
};
let duration = start.elapsed();
@@ -489,7 +495,10 @@ async fn handle_read_resource(
match payload_result {
Ok(payload) => match serialize_function_output(payload) {
Ok(output) => {
let ToolOutput::Function { content, success } = &output else {
let ToolOutput::Function {
content, success, ..
} = &output
else {
unreachable!("MCP resource handler should return function output");
};
let duration = start.elapsed();
@@ -618,6 +627,7 @@ where
Ok(ToolOutput::Function {
content,
content_items: None,
success: Some(true),
})
}


@@ -88,6 +88,7 @@ impl ToolHandler for PlanHandler {
Ok(ToolOutput::Function {
content,
content_items: None,
success: Some(true),
})
}


@@ -149,6 +149,7 @@ impl ToolHandler for ReadFileHandler {
};
Ok(ToolOutput::Function {
content: collected.join("\n"),
content_items: None,
success: Some(true),
})
}


@@ -136,6 +136,7 @@ impl ShellHandler {
let content = item?;
return Ok(ToolOutput::Function {
content,
content_items: None,
success: Some(true),
});
}
@@ -179,6 +180,7 @@ impl ShellHandler {
let content = emitter.finish(event_ctx, out).await?;
return Ok(ToolOutput::Function {
content,
content_items: None,
success: Some(true),
});
}
@@ -226,6 +228,7 @@ impl ShellHandler {
let content = emitter.finish(event_ctx, out).await?;
Ok(ToolOutput::Function {
content,
content_items: None,
success: Some(true),
})
}


@@ -95,6 +95,7 @@ impl ToolHandler for TestSyncHandler {
Ok(ToolOutput::Function {
content: "ok".to_string(),
content_items: None,
success: Some(true),
})
}


@@ -171,6 +171,7 @@ impl ToolHandler for UnifiedExecHandler {
Ok(ToolOutput::Function {
content,
content_items: None,
success: Some(true),
})
}


@@ -85,6 +85,7 @@ impl ToolHandler for ViewImageHandler {
Ok(ToolOutput::Function {
content: "attached local image path".to_string(),
content_items: None,
success: Some(true),
})
}


@@ -9,19 +9,11 @@ pub mod runtimes;
pub mod sandboxing;
pub mod spec;
use crate::conversation_history::format_output_for_model_body;
use crate::exec::ExecToolCallOutput;
use codex_utils_string::take_bytes_at_char_boundary;
use codex_utils_string::take_last_bytes_at_char_boundary;
pub use router::ToolRouter;
use serde::Serialize;
// Model-formatting limits: clients get full streams; only content sent to the model is truncated.
pub(crate) const MODEL_FORMAT_MAX_BYTES: usize = 10 * 1024; // 10 KiB
pub(crate) const MODEL_FORMAT_MAX_LINES: usize = 256; // lines
pub(crate) const MODEL_FORMAT_HEAD_LINES: usize = MODEL_FORMAT_MAX_LINES / 2;
pub(crate) const MODEL_FORMAT_TAIL_LINES: usize = MODEL_FORMAT_MAX_LINES - MODEL_FORMAT_HEAD_LINES; // 128
pub(crate) const MODEL_FORMAT_HEAD_BYTES: usize = MODEL_FORMAT_MAX_BYTES / 2;
// Telemetry preview limits: keep log events smaller than model budgets.
pub(crate) const TELEMETRY_PREVIEW_MAX_BYTES: usize = 2 * 1024; // 2 KiB
pub(crate) const TELEMETRY_PREVIEW_MAX_LINES: usize = 64; // lines
@@ -73,249 +65,15 @@ pub fn format_exec_output_str(exec_output: &ExecToolCallOutput) -> String {
let content = aggregated_output.text.as_str();
if exec_output.timed_out {
let prefixed = format!(
let body = if exec_output.timed_out {
format!(
"command timed out after {} milliseconds\n{content}",
exec_output.duration.as_millis()
);
return format_exec_output(&prefixed);
}
format_exec_output(content)
}
pub(super) fn format_exec_output(content: &str) -> String {
// Head+tail truncation for the model: show the beginning and end with an elision.
// Clients still receive full streams; only this formatted summary is capped.
let total_lines = content.lines().count();
if content.len() <= MODEL_FORMAT_MAX_BYTES && total_lines <= MODEL_FORMAT_MAX_LINES {
return content.to_string();
}
let output = truncate_formatted_exec_output(content, total_lines);
format!("Total output lines: {total_lines}\n\n{output}")
}
fn truncate_formatted_exec_output(content: &str, total_lines: usize) -> String {
let segments: Vec<&str> = content.split_inclusive('\n').collect();
let head_take = MODEL_FORMAT_HEAD_LINES.min(segments.len());
let tail_take = MODEL_FORMAT_TAIL_LINES.min(segments.len().saturating_sub(head_take));
let omitted = segments.len().saturating_sub(head_take + tail_take);
let head_slice_end: usize = segments
.iter()
.take(head_take)
.map(|segment| segment.len())
.sum();
let tail_slice_start: usize = if tail_take == 0 {
content.len()
} else {
content.len()
- segments
.iter()
.rev()
.take(tail_take)
.map(|segment| segment.len())
.sum::<usize>()
};
let head_slice = &content[..head_slice_end];
let tail_slice = &content[tail_slice_start..];
let truncated_by_bytes = content.len() > MODEL_FORMAT_MAX_BYTES;
let marker = if omitted > 0 {
Some(format!(
"\n[... omitted {omitted} of {total_lines} lines ...]\n\n"
))
} else if truncated_by_bytes {
Some(format!(
"\n[... output truncated to fit {MODEL_FORMAT_MAX_BYTES} bytes ...]\n\n"
))
} else {
None
};
let marker_len = marker.as_ref().map_or(0, String::len);
let base_head_budget = MODEL_FORMAT_HEAD_BYTES.min(MODEL_FORMAT_MAX_BYTES);
let head_budget = base_head_budget.min(MODEL_FORMAT_MAX_BYTES.saturating_sub(marker_len));
let head_part = take_bytes_at_char_boundary(head_slice, head_budget);
let mut result = String::with_capacity(MODEL_FORMAT_MAX_BYTES.min(content.len()));
result.push_str(head_part);
if let Some(marker_text) = marker.as_ref() {
result.push_str(marker_text);
}
let remaining = MODEL_FORMAT_MAX_BYTES.saturating_sub(result.len());
if remaining == 0 {
return result;
}
let tail_part = take_last_bytes_at_char_boundary(tail_slice, remaining);
result.push_str(tail_part);
result
}
#[cfg(test)]
mod tests {
use super::*;
use crate::function_tool::FunctionCallError;
use regex_lite::Regex;
fn truncate_function_error(err: FunctionCallError) -> FunctionCallError {
match err {
FunctionCallError::RespondToModel(msg) => {
FunctionCallError::RespondToModel(format_exec_output(&msg))
}
FunctionCallError::Denied(msg) => FunctionCallError::Denied(format_exec_output(&msg)),
FunctionCallError::Fatal(msg) => FunctionCallError::Fatal(format_exec_output(&msg)),
other => other,
}
}
fn assert_truncated_message_matches(message: &str, line: &str, total_lines: usize) {
let pattern = truncated_message_pattern(line, total_lines);
let regex = Regex::new(&pattern).unwrap_or_else(|err| {
panic!("failed to compile regex {pattern}: {err}");
});
let captures = regex
.captures(message)
.unwrap_or_else(|| panic!("message failed to match pattern {pattern}: {message}"));
let body = captures
.name("body")
.expect("missing body capture")
.as_str();
assert!(
body.len() <= MODEL_FORMAT_MAX_BYTES,
"body exceeds byte limit: {} bytes",
body.len()
);
}
fn truncated_message_pattern(line: &str, total_lines: usize) -> String {
let head_take = MODEL_FORMAT_HEAD_LINES.min(total_lines);
let tail_take = MODEL_FORMAT_TAIL_LINES.min(total_lines.saturating_sub(head_take));
let omitted = total_lines.saturating_sub(head_take + tail_take);
let escaped_line = regex_lite::escape(line);
if omitted == 0 {
return format!(
r"(?s)^Total output lines: {total_lines}\n\n(?P<body>{escaped_line}.*\n\[\.{{3}} output truncated to fit {MODEL_FORMAT_MAX_BYTES} bytes \.{{3}}]\n\n.*)$",
);
}
format!(
r"(?s)^Total output lines: {total_lines}\n\n(?P<body>{escaped_line}.*\n\[\.{{3}} omitted {omitted} of {total_lines} lines \.{{3}}]\n\n.*)$",
)
}
} else {
content.to_string()
};
#[test]
fn truncate_formatted_exec_output_truncates_large_error() {
let line = "very long execution error line that should trigger truncation\n";
let large_error = line.repeat(2_500); // way beyond both byte and line limits
let truncated = format_exec_output(&large_error);
let total_lines = large_error.lines().count();
assert_truncated_message_matches(&truncated, line, total_lines);
assert_ne!(truncated, large_error);
}
#[test]
fn truncate_function_error_trims_respond_to_model() {
let line = "respond-to-model error that should be truncated\n";
let huge = line.repeat(3_000);
let total_lines = huge.lines().count();
let err = truncate_function_error(FunctionCallError::RespondToModel(huge));
match err {
FunctionCallError::RespondToModel(message) => {
assert_truncated_message_matches(&message, line, total_lines);
}
other => panic!("unexpected error variant: {other:?}"),
}
}
#[test]
fn truncate_function_error_trims_fatal() {
let line = "fatal error output that should be truncated\n";
let huge = line.repeat(3_000);
let total_lines = huge.lines().count();
let err = truncate_function_error(FunctionCallError::Fatal(huge));
match err {
FunctionCallError::Fatal(message) => {
assert_truncated_message_matches(&message, line, total_lines);
}
other => panic!("unexpected error variant: {other:?}"),
}
}
#[test]
fn truncate_formatted_exec_output_marks_byte_truncation_without_omitted_lines() {
let long_line = "a".repeat(MODEL_FORMAT_MAX_BYTES + 50);
let truncated = format_exec_output(&long_line);
assert_ne!(truncated, long_line);
let marker_line =
format!("[... output truncated to fit {MODEL_FORMAT_MAX_BYTES} bytes ...]");
assert!(
truncated.contains(&marker_line),
"missing byte truncation marker: {truncated}"
);
assert!(
!truncated.contains("omitted"),
"line omission marker should not appear when no lines were dropped: {truncated}"
);
}
#[test]
fn truncate_formatted_exec_output_returns_original_when_within_limits() {
let content = "example output\n".repeat(10);
assert_eq!(format_exec_output(&content), content);
}
#[test]
fn truncate_formatted_exec_output_reports_omitted_lines_and_keeps_head_and_tail() {
let total_lines = MODEL_FORMAT_MAX_LINES + 100;
let content: String = (0..total_lines)
.map(|idx| format!("line-{idx}\n"))
.collect();
let truncated = format_exec_output(&content);
let omitted = total_lines - MODEL_FORMAT_MAX_LINES;
let expected_marker = format!("[... omitted {omitted} of {total_lines} lines ...]");
assert!(
truncated.contains(&expected_marker),
"missing omitted marker: {truncated}"
);
assert!(
truncated.contains("line-0\n"),
"expected head line to remain: {truncated}"
);
let last_line = format!("line-{}\n", total_lines - 1);
assert!(
truncated.contains(&last_line),
"expected tail line to remain: {truncated}"
);
}
#[test]
fn truncate_formatted_exec_output_prefers_line_marker_when_both_limits_exceeded() {
let total_lines = MODEL_FORMAT_MAX_LINES + 42;
let long_line = "x".repeat(256);
let content: String = (0..total_lines)
.map(|idx| format!("line-{idx}-{long_line}\n"))
.collect();
let truncated = format_exec_output(&content);
assert!(
truncated.contains("[... omitted 42 of 298 lines ...]"),
"expected omitted marker when line count exceeds limit: {truncated}"
);
assert!(
!truncated.contains("output truncated to fit"),
"line omission marker should take precedence over byte marker: {truncated}"
);
}
// Truncate for model consumption before serialization.
format_output_for_model_body(&body)
}
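Reviewer note: this hunk removes the local `format_exec_output` helper and routes model-facing truncation through `format_output_for_model_body`, whose body is not shown in this diff. For readers unfamiliar with the head+tail approach the removed code used, the sketch below shows the line-based part only. It ignores the byte budget and the `Total output lines` header, and the function name `truncate_head_tail` is hypothetical.

```rust
/// Hypothetical line-only sketch of head+tail truncation.
/// Keeps the first and last halves of `max_lines` and replaces the
/// middle with an omission marker in the style of the removed code.
fn truncate_head_tail(content: &str, max_lines: usize) -> String {
    // `split_inclusive` keeps each line's trailing newline attached.
    let segments: Vec<&str> = content.split_inclusive('\n').collect();
    let total = segments.len();
    if total <= max_lines {
        return content.to_string();
    }
    let head = max_lines / 2;
    let tail = max_lines - head;
    let omitted = total - head - tail;

    let mut out = String::new();
    out.extend(segments[..head].iter().copied());
    out.push_str(&format!("\n[... omitted {omitted} of {total} lines ...]\n\n"));
    out.extend(segments[total - tail..].iter().copied());
    out
}
```

Keeping both ends matters for command output: the head usually shows what was invoked, and the tail usually carries the error or summary.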


@@ -7,9 +7,11 @@ retry without sandbox on denial (no re-approval thanks to caching).
*/
use crate::error::CodexErr;
use crate::error::SandboxErr;
use crate::error::get_error_message_ui;
use crate::exec::ExecToolCallOutput;
use crate::sandboxing::SandboxManager;
use crate::tools::sandboxing::ApprovalCtx;
use crate::tools::sandboxing::ProvidesSandboxRetryData;
use crate::tools::sandboxing::SandboxAttempt;
use crate::tools::sandboxing::ToolCtx;
use crate::tools::sandboxing::ToolError;
@@ -38,6 +40,7 @@ impl ToolOrchestrator {
) -> Result<Out, ToolError>
where
T: ToolRuntime<Rq, Out>,
Rq: ProvidesSandboxRetryData,
{
let otel = turn_ctx.client.get_otel_event_manager();
let otel_tn = &tool_ctx.tool_name;
@@ -56,6 +59,7 @@ impl ToolOrchestrator {
turn: turn_ctx,
call_id: &tool_ctx.call_id,
retry_reason: None,
risk: None,
};
let decision = tool.start_approval_async(req, approval_ctx).await;
@@ -107,12 +111,33 @@ impl ToolOrchestrator {
// Ask for approval before retrying without sandbox.
if !tool.should_bypass_approval(approval_policy, already_approved) {
let mut risk = None;
if let Some(metadata) = req.sandbox_retry_data() {
let err = SandboxErr::Denied {
output: output.clone(),
};
let friendly = get_error_message_ui(&CodexErr::Sandbox(err));
let failure_summary = format!("failed in sandbox: {friendly}");
risk = tool_ctx
.session
.assess_sandbox_command(
turn_ctx,
&tool_ctx.call_id,
&metadata.command,
Some(failure_summary.as_str()),
)
.await;
}
let reason_msg = build_denial_reason_from_output(output.as_ref());
let approval_ctx = ApprovalCtx {
session: tool_ctx.session,
turn: turn_ctx,
call_id: &tool_ctx.call_id,
retry_reason: Some(reason_msg),
risk,
};
let decision = tool.start_approval_async(req, approval_ctx).await;

@@ -15,6 +15,7 @@ use crate::tools::router::ToolCall;
use crate::tools::router::ToolRouter;
use codex_protocol::models::FunctionCallOutputPayload;
use codex_protocol::models::ResponseInputItem;
use codex_utils_readiness::Readiness;
pub(crate) struct ToolCallRuntime {
router: Arc<ToolRouter>,
@@ -53,12 +54,16 @@ impl ToolCallRuntime {
let tracker = Arc::clone(&self.tracker);
let lock = Arc::clone(&self.parallel_execution);
let aborted_response = Self::aborted_response(&call);
let readiness = self.turn_context.tool_call_gate.clone();
let handle: AbortOnDropHandle<Result<ResponseInputItem, FunctionCallError>> =
AbortOnDropHandle::new(tokio::spawn(async move {
tokio::select! {
_ = cancellation_token.cancelled() => Ok(aborted_response),
res = async {
tracing::info!("waiting for tool gate");
readiness.wait_ready().await;
tracing::info!("tool gate released");
let _guard = if supports_parallel {
Either::Left(lock.read().await)
} else {
@@ -100,7 +105,7 @@ impl ToolCallRuntime {
call_id: call.call_id.clone(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: None,
..Default::default()
},
},
}

@@ -181,6 +181,7 @@ impl ToolRouter {
output: codex_protocol::models::FunctionCallOutputPayload {
content: message,
success: Some(false),
..Default::default()
},
}
}

@@ -10,7 +10,9 @@ use crate::sandboxing::CommandSpec;
use crate::sandboxing::execute_env;
use crate::tools::sandboxing::Approvable;
use crate::tools::sandboxing::ApprovalCtx;
use crate::tools::sandboxing::ProvidesSandboxRetryData;
use crate::tools::sandboxing::SandboxAttempt;
use crate::tools::sandboxing::SandboxRetryData;
use crate::tools::sandboxing::Sandboxable;
use crate::tools::sandboxing::SandboxablePreference;
use crate::tools::sandboxing::ToolCtx;
@@ -32,6 +34,12 @@ pub struct ApplyPatchRequest {
pub codex_exe: Option<PathBuf>,
}
impl ProvidesSandboxRetryData for ApplyPatchRequest {
fn sandbox_retry_data(&self) -> Option<SandboxRetryData> {
None
}
}
#[derive(Default)]
pub struct ApplyPatchRuntime;
@@ -106,9 +114,10 @@ impl Approvable<ApplyPatchRequest> for ApplyPatchRuntime {
let call_id = ctx.call_id.to_string();
let cwd = req.cwd.clone();
let retry_reason = ctx.retry_reason.clone();
let risk = ctx.risk.clone();
let user_explicitly_approved = req.user_explicitly_approved;
Box::pin(async move {
with_cached_approval(&session.services, key, || async move {
with_cached_approval(&session.services, key, move || async move {
if let Some(reason) = retry_reason {
session
.request_command_approval(
@@ -117,6 +126,7 @@ impl Approvable<ApplyPatchRequest> for ApplyPatchRuntime {
vec!["apply_patch".to_string()],
cwd,
Some(reason),
risk,
)
.await
} else if user_explicitly_approved {

@@ -12,7 +12,9 @@ use crate::sandboxing::execute_env;
use crate::tools::runtimes::build_command_spec;
use crate::tools::sandboxing::Approvable;
use crate::tools::sandboxing::ApprovalCtx;
use crate::tools::sandboxing::ProvidesSandboxRetryData;
use crate::tools::sandboxing::SandboxAttempt;
use crate::tools::sandboxing::SandboxRetryData;
use crate::tools::sandboxing::Sandboxable;
use crate::tools::sandboxing::SandboxablePreference;
use crate::tools::sandboxing::ToolCtx;
@@ -34,6 +36,15 @@ pub struct ShellRequest {
pub justification: Option<String>,
}
impl ProvidesSandboxRetryData for ShellRequest {
fn sandbox_retry_data(&self) -> Option<SandboxRetryData> {
Some(SandboxRetryData {
command: self.command.clone(),
cwd: self.cwd.clone(),
})
}
}
#[derive(Default)]
pub struct ShellRuntime;
@@ -90,13 +101,14 @@ impl Approvable<ShellRequest> for ShellRuntime {
.retry_reason
.clone()
.or_else(|| req.justification.clone());
let risk = ctx.risk.clone();
let session = ctx.session;
let turn = ctx.turn;
let call_id = ctx.call_id.to_string();
Box::pin(async move {
with_cached_approval(&session.services, key, || async move {
with_cached_approval(&session.services, key, move || async move {
session
.request_command_approval(turn, call_id, command, cwd, reason)
.request_command_approval(turn, call_id, command, cwd, reason, risk)
.await
})
.await

@@ -9,7 +9,9 @@ use crate::error::SandboxErr;
use crate::tools::runtimes::build_command_spec;
use crate::tools::sandboxing::Approvable;
use crate::tools::sandboxing::ApprovalCtx;
use crate::tools::sandboxing::ProvidesSandboxRetryData;
use crate::tools::sandboxing::SandboxAttempt;
use crate::tools::sandboxing::SandboxRetryData;
use crate::tools::sandboxing::Sandboxable;
use crate::tools::sandboxing::SandboxablePreference;
use crate::tools::sandboxing::ToolCtx;
@@ -31,6 +33,15 @@ pub struct UnifiedExecRequest {
pub env: HashMap<String, String>,
}
impl ProvidesSandboxRetryData for UnifiedExecRequest {
fn sandbox_retry_data(&self) -> Option<SandboxRetryData> {
Some(SandboxRetryData {
command: self.command.clone(),
cwd: self.cwd.clone(),
})
}
}
#[derive(serde::Serialize, Clone, Debug, Eq, PartialEq, Hash)]
pub struct UnifiedExecApprovalKey {
pub command: Vec<String>,
@@ -85,10 +96,11 @@ impl Approvable<UnifiedExecRequest> for UnifiedExecRuntime<'_> {
let command = req.command.clone();
let cwd = req.cwd.clone();
let reason = ctx.retry_reason.clone();
let risk = ctx.risk.clone();
Box::pin(async move {
with_cached_approval(&session.services, key, || async move {
session
.request_command_approval(turn, call_id, command, cwd, reason)
.request_command_approval(turn, call_id, command, cwd, reason, risk)
.await
})
.await

@@ -7,6 +7,7 @@
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::error::CodexErr;
use crate::protocol::SandboxCommandAssessment;
use crate::protocol::SandboxPolicy;
use crate::sandboxing::CommandSpec;
use crate::sandboxing::SandboxManager;
@@ -18,6 +19,7 @@ use std::collections::HashMap;
use std::fmt::Debug;
use std::hash::Hash;
use std::path::Path;
use std::path::PathBuf;
use futures::Future;
use futures::future::BoxFuture;
@@ -81,6 +83,7 @@ pub(crate) struct ApprovalCtx<'a> {
pub turn: &'a TurnContext,
pub call_id: &'a str,
pub retry_reason: Option<String>,
pub risk: Option<SandboxCommandAssessment>,
}
pub(crate) trait Approvable<Req> {
@@ -156,6 +159,17 @@ pub(crate) struct ToolCtx<'a> {
pub tool_name: String,
}
/// Captures the command metadata needed to re-run a tool request without sandboxing.
#[derive(Clone, Debug, PartialEq, Eq)]
pub(crate) struct SandboxRetryData {
pub command: Vec<String>,
pub cwd: PathBuf,
}
pub(crate) trait ProvidesSandboxRetryData {
fn sandbox_retry_data(&self) -> Option<SandboxRetryData>;
}
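
The trait above lets the orchestrator ask any request type whether it has a command worth assessing before a no-sandbox retry. The sketch below illustrates that pattern in isolation. `ShellLikeRequest`, `PatchLikeRequest`, and `describe_retry` are hypothetical names introduced for illustration; `describe_retry` only stands in for the real assessment call, which is asynchronous and not reproduced here.

```rust
use std::path::PathBuf;

/// Mirrors the struct added in this diff.
#[derive(Clone, Debug, PartialEq, Eq)]
struct SandboxRetryData {
    command: Vec<String>,
    cwd: PathBuf,
}

trait ProvidesSandboxRetryData {
    fn sandbox_retry_data(&self) -> Option<SandboxRetryData>;
}

/// Hypothetical request that carries a command line.
struct ShellLikeRequest {
    command: Vec<String>,
    cwd: PathBuf,
}

impl ProvidesSandboxRetryData for ShellLikeRequest {
    fn sandbox_retry_data(&self) -> Option<SandboxRetryData> {
        Some(SandboxRetryData {
            command: self.command.clone(),
            cwd: self.cwd.clone(),
        })
    }
}

/// Hypothetical request with no command to assess.
struct PatchLikeRequest;

impl ProvidesSandboxRetryData for PatchLikeRequest {
    fn sandbox_retry_data(&self) -> Option<SandboxRetryData> {
        None
    }
}

/// Stand-in for the assessment step: yields a summary only when
/// the request exposes retry data.
fn describe_retry<R: ProvidesSandboxRetryData>(req: &R) -> Option<String> {
    req.sandbox_retry_data().map(|data| {
        format!("retry `{}` in {}", data.command.join(" "), data.cwd.display())
    })
}
```

Returning `Option` keeps the generic bound on the orchestrator simple: request types that opt out return `None` and skip the assessment branch entirely.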
#[derive(Debug)]
pub(crate) enum ToolError {
Rejected(String),

@@ -0,0 +1,27 @@
You are a security analyst evaluating shell commands that were blocked by a sandbox. Given the provided metadata, summarize the command's likely intent and assess the risk. Return strictly valid JSON with the keys:
- description (concise summary, at most two sentences)
- risk_level ("low", "medium", or "high")
- risk_categories (optional array of zero or more category strings)
Risk level examples:
- low: read-only inspections, listing files, printing configuration
- medium: modifying project files, installing dependencies, fetching artifacts from trusted sources
- high: deleting or overwriting data, exfiltrating secrets, escalating privileges, or disabling security controls
Recognized risk_categories: data_deletion, data_exfiltration, privilege_escalation, system_modification, network_access, resource_exhaustion, compliance.
Use multiple categories when appropriate.
If information is insufficient, choose the most cautious risk level supported by the evidence.
Respond with JSON only, without markdown code fences or extra commentary.
---
Command metadata:
Platform: {{ platform }}
Sandbox policy: {{ sandbox_policy }}
{% if let Some(roots) = filesystem_roots %}
Filesystem roots: {{ roots }}
{% endif %}
Working directory: {{ working_directory }}
Command argv: {{ command_argv }}
Command (joined): {{ command_joined }}
{% if let Some(message) = sandbox_failure_message %}
Sandbox failure message: {{ message }}
{% endif %}
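
The prompt above constrains the reply to three keys and a fixed vocabulary. One way a caller might validate the decoded values is sketched below, using only the standard library. `Assessment` and `validate_assessment` are hypothetical and are not the real `SandboxCommandAssessment` type, whose fields this diff does not show. JSON decoding is assumed to have happened already, and rejecting unrecognized categories is a design choice of this sketch rather than something the prompt requires.

```rust
use std::str::FromStr;

/// Risk levels named in the prompt, ordered from least to most severe.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum RiskLevel {
    Low,
    Medium,
    High,
}

impl FromStr for RiskLevel {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "low" => Ok(RiskLevel::Low),
            "medium" => Ok(RiskLevel::Medium),
            "high" => Ok(RiskLevel::High),
            other => Err(format!("unknown risk_level: {other}")),
        }
    }
}

/// Category strings the prompt lists as recognized.
const RECOGNIZED_CATEGORIES: [&str; 7] = [
    "data_deletion",
    "data_exfiltration",
    "privilege_escalation",
    "system_modification",
    "network_access",
    "resource_exhaustion",
    "compliance",
];

/// Hypothetical validated form of the model's reply.
#[derive(Debug, PartialEq)]
struct Assessment {
    description: String,
    risk_level: RiskLevel,
    risk_categories: Vec<String>,
}

/// Validate already-decoded field values against the prompt's vocabulary.
fn validate_assessment(
    description: &str,
    risk_level: &str,
    risk_categories: &[&str],
) -> Result<Assessment, String> {
    let risk_level = risk_level.parse::<RiskLevel>()?;
    for category in risk_categories {
        if !RECOGNIZED_CATEGORIES.contains(category) {
            return Err(format!("unrecognized risk category: {category}"));
        }
    }
    Ok(Assessment {
        description: description.to_string(),
        risk_level,
        risk_categories: risk_categories.iter().map(|c| c.to_string()).collect(),
    })
}
```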

@@ -185,6 +185,49 @@ async fn streams_text_without_reasoning() {
assert_matches!(events[2], ResponseEvent::Completed { .. });
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn completed_event_includes_usage_estimate() {
if network_disabled() {
println!(
"Skipping test because it cannot execute when network is disabled in a Codex sandbox."
);
return;
}
let sse = concat!(
"data: {\"choices\":[{\"delta\":{\"content\":\"hi\"}}]}\n\n",
"data: {\"choices\":[{\"delta\":{}}]}\n\n",
"data: [DONE]\n\n",
);
let events = run_stream(sse).await;
assert_eq!(events.len(), 3, "unexpected events: {events:?}");
let usage = events
.iter()
.find_map(|event| match event {
ResponseEvent::Completed {
token_usage: Some(usage),
..
} => Some(usage.clone()),
_ => None,
})
.expect("missing usage estimate on Completed event");
assert!(
usage.input_tokens > 0,
"expected input tokens > 0, got {usage:?}"
);
assert!(
usage.output_tokens > 0,
"expected output tokens > 0, got {usage:?}"
);
assert!(
usage.total_tokens >= usage.input_tokens + usage.output_tokens,
"expected total tokens to cover input + output, got {usage:?}"
);
}
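
The test above expects a usage estimate even though the mocked stream reports no usage. The heuristic itself is not shown in this part of the diff, so the sketch below is only an illustration of one plausible approach: it assumes roughly four bytes of text per token, rounded up. `TokenUsageEstimate`, `estimate_tokens`, and `estimate_usage` are hypothetical names; only the three field names come from the assertions above.

```rust
/// Hypothetical usage record; field names follow the test's assertions.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
struct TokenUsageEstimate {
    input_tokens: u64,
    output_tokens: u64,
    total_tokens: u64,
}

/// Assumed rule of thumb, not taken from the diff.
const BYTES_PER_TOKEN: u64 = 4;

/// Estimate the token count of a text as ceil(bytes / BYTES_PER_TOKEN).
fn estimate_tokens(text: &str) -> u64 {
    let bytes = text.len() as u64;
    (bytes + BYTES_PER_TOKEN - 1) / BYTES_PER_TOKEN
}

/// Combine prompt and completion estimates into one usage record.
fn estimate_usage(prompt: &str, completion: &str) -> TokenUsageEstimate {
    let input_tokens = estimate_tokens(prompt);
    let output_tokens = estimate_tokens(completion);
    TokenUsageEstimate {
        input_tokens,
        output_tokens,
        total_tokens: input_tokens + output_tokens,
    }
}
```

Rounding up means any non-empty prompt or completion yields at least one token, which is what lets the `> 0` assertions hold for a reply as short as "hi".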
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn streams_reasoning_from_string_delta() {
if network_disabled() {

@@ -75,9 +75,17 @@ async fn chat_mode_stream_cli() {
server.verify().await;
// Verify a new session rollout was created and is discoverable via list_conversations
let page = RolloutRecorder::list_conversations(home.path(), 10, None, &[])
.await
.expect("list conversations");
let provider_filter = vec!["mock".to_string()];
let page = RolloutRecorder::list_conversations(
home.path(),
10,
None,
&[],
Some(provider_filter.as_slice()),
"mock",
)
.await
.expect("list conversations");
assert!(
!page.items.is_empty(),
"expected at least one session to be listed"

@@ -154,7 +154,8 @@ async fn resume_includes_initial_messages_and_sends_prior_items() {
"instructions": "be nice",
"cwd": ".",
"originator": "test_originator",
"cli_version": "test_version"
"cli_version": "test_version",
"model_provider": "test-provider"
}
})
)
@@ -524,7 +525,7 @@ async fn prefers_apikey_when_config_prefers_apikey_even_with_chatgpt_tokens() {
let mut config = load_default_config_for_test(&codex_home);
config.model_provider = model_provider;
let auth_manager = match CodexAuth::from_codex_home(codex_home.path()) {
let auth_manager = match CodexAuth::from_auth_storage(codex_home.path()) {
Ok(Some(auth)) => codex_core::AuthManager::from_auth_for_testing(auth),
Ok(None) => panic!("No CodexAuth found in codex_home"),
Err(e) => panic!("Failed to load CodexAuth: {e}"),

@@ -18,7 +18,6 @@ use codex_core::built_in_model_providers;
use codex_core::codex::compact::SUMMARIZATION_PROMPT;
use codex_core::config::Config;
use codex_core::config::OPENAI_DEFAULT_MODEL;
use codex_core::protocol::ConversationPathResponseEvent;
use codex_core::protocol::EventMsg;
use codex_core::protocol::Op;
use codex_core::spawn::CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR;
@@ -42,6 +41,29 @@ fn network_disabled() -> bool {
std::env::var(CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR).is_ok()
}
fn filter_out_ghost_snapshot_entries(items: &[Value]) -> Vec<Value> {
items
.iter()
.filter(|item| !is_ghost_snapshot_message(item))
.cloned()
.collect()
}
fn is_ghost_snapshot_message(item: &Value) -> bool {
if item.get("type").and_then(Value::as_str) != Some("message") {
return false;
}
if item.get("role").and_then(Value::as_str) != Some("user") {
return false;
}
item.get("content")
.and_then(Value::as_array)
.and_then(|content| content.first())
.and_then(|entry| entry.get("text"))
.and_then(Value::as_str)
.is_some_and(|text| text.trim_start().starts_with("<ghost_snapshot>"))
}
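
The two helpers above exist so that the history comparison later in this file ignores `<ghost_snapshot>` user messages. The idea can be sketched without `serde_json`, modelling each history item as a hypothetical `(role, text)` pair. `Item`, `is_ghost_snapshot`, `filter_out_ghost_snapshots`, and `is_history_prefix` are illustrative names, not part of the change.

```rust
/// Hypothetical simplified history item: (role, text).
type Item = (&'static str, &'static str);

/// A ghost snapshot is a user message whose text starts with the tag,
/// ignoring leading whitespace.
fn is_ghost_snapshot(item: &Item) -> bool {
    item.0 == "user" && item.1.trim_start().starts_with("<ghost_snapshot>")
}

/// Drop ghost snapshot entries, preserving the order of everything else.
fn filter_out_ghost_snapshots(items: &[Item]) -> Vec<Item> {
    items
        .iter()
        .filter(|item| !is_ghost_snapshot(item))
        .copied()
        .collect()
}

/// True when the filtered `earlier` history is a prefix of the filtered `later` one.
fn is_history_prefix(earlier: &[Item], later: &[Item]) -> bool {
    let earlier = filter_out_ghost_snapshots(earlier);
    let later = filter_out_ghost_snapshots(later);
    later.starts_with(&earlier)
}
```

Filtering both sides before comparing keeps the prefix check stable even when a snapshot entry appears in one request but not the other.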
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
/// Scenario: compact an initial conversation, resume it, fork one turn back, and
/// ensure the model-visible history matches expectations at each request.
@@ -61,7 +83,7 @@ async fn compact_resume_and_fork_preserve_model_history_view() {
user_turn(&base, "hello world").await;
compact_conversation(&base).await;
user_turn(&base, "AFTER_COMPACT").await;
let base_path = fetch_conversation_path(&base, "base conversation").await;
let base_path = fetch_conversation_path(&base).await;
assert!(
base_path.exists(),
"compact+resume test expects base path {base_path:?} to exist",
@@ -69,7 +91,7 @@ async fn compact_resume_and_fork_preserve_model_history_view() {
let resumed = resume_conversation(&manager, &config, base_path).await;
user_turn(&resumed, "AFTER_RESUME").await;
let resumed_path = fetch_conversation_path(&resumed, "resumed conversation").await;
let resumed_path = fetch_conversation_path(&resumed).await;
assert!(
resumed_path.exists(),
"compact+resume test expects resumed path {resumed_path:?} to exist",
@@ -518,7 +540,7 @@ async fn compact_resume_after_second_compaction_preserves_history() {
user_turn(&base, "hello world").await;
compact_conversation(&base).await;
user_turn(&base, "AFTER_COMPACT").await;
let base_path = fetch_conversation_path(&base, "base conversation").await;
let base_path = fetch_conversation_path(&base).await;
assert!(
base_path.exists(),
"second compact test expects base path {base_path:?} to exist",
@@ -526,7 +548,7 @@ async fn compact_resume_after_second_compaction_preserves_history() {
let resumed = resume_conversation(&manager, &config, base_path).await;
user_turn(&resumed, "AFTER_RESUME").await;
let resumed_path = fetch_conversation_path(&resumed, "resumed conversation").await;
let resumed_path = fetch_conversation_path(&resumed).await;
assert!(
resumed_path.exists(),
"second compact test expects resumed path {resumed_path:?} to exist",
@@ -537,7 +559,7 @@ async fn compact_resume_after_second_compaction_preserves_history() {
compact_conversation(&forked).await;
user_turn(&forked, "AFTER_COMPACT_2").await;
let forked_path = fetch_conversation_path(&forked, "forked conversation").await;
let forked_path = fetch_conversation_path(&forked).await;
assert!(
forked_path.exists(),
"second compact test expects forked path {forked_path:?} to exist",
@@ -557,13 +579,15 @@ async fn compact_resume_after_second_compaction_preserves_history() {
let resume_input_array = input_after_resume
.as_array()
.expect("input after resume should be an array");
let compact_filtered = filter_out_ghost_snapshot_entries(compact_input_array);
let resume_filtered = filter_out_ghost_snapshot_entries(resume_input_array);
assert!(
compact_input_array.len() <= resume_input_array.len(),
compact_filtered.len() <= resume_filtered.len(),
"after-resume input should have at least as many items as after-compact"
);
assert_eq!(
compact_input_array.as_slice(),
&resume_input_array[..compact_input_array.len()]
compact_filtered.as_slice(),
&resume_filtered[..compact_filtered.len()]
);
// hard coded test
let prompt = requests[0]["instructions"]
@@ -792,22 +816,8 @@ async fn compact_conversation(conversation: &Arc<CodexConversation>) {
wait_for_event(conversation, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
}
async fn fetch_conversation_path(
conversation: &Arc<CodexConversation>,
context: &str,
) -> std::path::PathBuf {
conversation
.submit(Op::GetPath)
.await
.expect("request conversation path");
match wait_for_event(conversation, |ev| {
matches!(ev, EventMsg::ConversationPath(_))
})
.await
{
EventMsg::ConversationPath(ConversationPathResponseEvent { path, .. }) => path,
_ => panic!("expected ConversationPath event for {context}"),
}
async fn fetch_conversation_path(conversation: &Arc<CodexConversation>) -> std::path::PathBuf {
conversation.rollout_path()
}
async fn resume_conversation(

@@ -4,7 +4,6 @@ use codex_core::ModelProviderInfo;
use codex_core::NewConversation;
use codex_core::built_in_model_providers;
use codex_core::parse_turn_item;
use codex_core::protocol::ConversationPathResponseEvent;
use codex_core::protocol::EventMsg;
use codex_core::protocol::Op;
use codex_core::protocol::RolloutItem;
@@ -79,13 +78,7 @@ async fn fork_conversation_twice_drops_to_first_message() {
}
// Request history from the base conversation to obtain rollout path.
codex.submit(Op::GetPath).await.unwrap();
let base_history =
wait_for_event(&codex, |ev| matches!(ev, EventMsg::ConversationPath(_))).await;
let base_path = match &base_history {
EventMsg::ConversationPath(ConversationPathResponseEvent { path, .. }) => path.clone(),
_ => panic!("expected ConversationHistory event"),
};
let base_path = codex.rollout_path();
// GetHistory flushes before returning the path; no wait needed.
@@ -140,15 +133,7 @@ async fn fork_conversation_twice_drops_to_first_message() {
.await
.expect("fork 1");
codex_fork1.submit(Op::GetPath).await.unwrap();
let fork1_history = wait_for_event(&codex_fork1, |ev| {
matches!(ev, EventMsg::ConversationPath(_))
})
.await;
let fork1_path = match &fork1_history {
EventMsg::ConversationPath(ConversationPathResponseEvent { path, .. }) => path.clone(),
_ => panic!("expected ConversationHistory event after first fork"),
};
let fork1_path = codex_fork1.rollout_path();
// GetHistory on fork1 flushed; the file is ready.
let fork1_items = read_items(&fork1_path);
@@ -166,15 +151,7 @@ async fn fork_conversation_twice_drops_to_first_message() {
.await
.expect("fork 2");
codex_fork2.submit(Op::GetPath).await.unwrap();
let fork2_history = wait_for_event(&codex_fork2, |ev| {
matches!(ev, EventMsg::ConversationPath(_))
})
.await;
let fork2_path = match &fork2_history {
EventMsg::ConversationPath(ConversationPathResponseEvent { path, .. }) => path.clone(),
_ => panic!("expected ConversationHistory event after second fork"),
};
let fork2_path = codex_fork2.rollout_path();
// GetHistory on fork2 flushed; the file is ready.
let fork1_items = read_items(&fork1_path);
let fork1_user_inputs = find_user_input_positions(&fork1_items);

@@ -33,6 +33,7 @@ mod stream_no_completed;
mod tool_harness;
mod tool_parallelism;
mod tools;
mod truncation;
mod unified_exec;
mod user_notification;
mod view_image;

@@ -7,7 +7,6 @@ use codex_core::REVIEW_PROMPT;
use codex_core::ResponseItem;
use codex_core::built_in_model_providers;
use codex_core::config::Config;
use codex_core::protocol::ConversationPathResponseEvent;
use codex_core::protocol::ENVIRONMENT_CONTEXT_OPEN_TAG;
use codex_core::protocol::EventMsg;
use codex_core::protocol::ExitedReviewModeEvent;
@@ -120,13 +119,7 @@ async fn review_op_emits_lifecycle_and_review_output() {
// Also verify that a user message with the header and a formatted finding
// was recorded back in the parent session's rollout.
codex.submit(Op::GetPath).await.unwrap();
let history_event =
wait_for_event(&codex, |ev| matches!(ev, EventMsg::ConversationPath(_))).await;
let path = match history_event {
EventMsg::ConversationPath(ConversationPathResponseEvent { path, .. }) => path,
other => panic!("expected ConversationPath event, got {other:?}"),
};
let path = codex.rollout_path();
let text = std::fs::read_to_string(&path).expect("read rollout file");
let mut saw_header = false;
@@ -375,7 +368,8 @@ async fn review_input_isolated_from_parent_history() {
"instructions": null,
"cwd": ".",
"originator": "test_originator",
"cli_version": "test_version"
"cli_version": "test_version",
"model_provider": "test-provider"
}
});
f.write_all(format!("{meta_line}\n").as_bytes())
@@ -482,13 +476,7 @@ async fn review_input_isolated_from_parent_history() {
assert_eq!(instructions, REVIEW_PROMPT);
// Also verify that a user interruption note was recorded in the rollout.
codex.submit(Op::GetPath).await.unwrap();
let history_event =
wait_for_event(&codex, |ev| matches!(ev, EventMsg::ConversationPath(_))).await;
let path = match history_event {
EventMsg::ConversationPath(ConversationPathResponseEvent { path, .. }) => path,
other => panic!("expected ConversationPath event, got {other:?}"),
};
let path = codex.rollout_path();
let text = std::fs::read_to_string(&path).expect("read rollout file");
let mut saw_interruption_message = false;
for line in text.lines() {

@@ -14,6 +14,8 @@ use codex_core::features::Feature;
use codex_core::protocol::AskForApproval;
use codex_core::protocol::EventMsg;
use codex_core::protocol::McpInvocation;
use codex_core::protocol::McpToolCallBeginEvent;
use codex_core::protocol::Op;
use codex_core::protocol::SandboxPolicy;
use codex_protocol::config_types::ReasoningSummary;
@@ -25,7 +27,9 @@ use core_test_support::test_codex::test_codex;
use core_test_support::wait_for_event;
use core_test_support::wait_for_event_with_timeout;
use escargot::CargoBuild;
use mcp_types::ContentBlock;
use serde_json::Value;
use serde_json::json;
use serial_test::serial;
use tempfile::tempdir;
use tokio::net::TcpStream;
@@ -35,6 +39,8 @@ use tokio::time::Instant;
use tokio::time::sleep;
use wiremock::matchers::any;
static OPENAI_PNG: &str = "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAD0AAAA9CAYAAAAeYmHpAAAE6klEQVR4Aeyau44UVxCGx1fZsmRLlm3Zoe0XcGQ5cUiCCIgJeS9CHgAhMkISQnIuGQgJEkBcxLW+nqnZ6uqqc+nuWRC7q/P3qetf9e+MtOwyX25O4Nep6JPyop++0qev9HrfgZ+F6r2DuB/vHOrt/UIkqdDHYvujOW6fO7h/CNEI+a5jc+pBR8uy0jVFsziYu5HtfSUk+Io34q921hLNctFSX0gwww+S8wce8K1LfCU+cYW4888aov8NxqvQILUPPReLOrm6zyLxa4i+6VZuFbJo8d1MOHZm+7VUtB/aIvhPWc/3SWg49JcwFLlHxuXKjtyloo+YNhuW3VS+WPBuUEMvCFKjEDVgFBQHXrnazpqiSxNZCkQ1kYiozsbm9Oz7l4i2Il7vGccGNWAc3XosDrZe/9P3ZnMmzHNEQw4smf8RQ87XEAMsC7Az0Au+dgXerfH4+sHvEc0SYGic8WBBUGqFH2gN7yDrazy7m2pbRTeRmU3+MjZmr1h6LJgPbGy23SI6GlYT0brQ71IY8Us4PNQCm+zepSbaD2BY9xCaAsD9IIj/IzFmKMSdHHonwdZATbTnYREf6/VZGER98N9yCWIvXQwXDoDdhZJoT8jwLnJXDB9w4Sb3e6nK5ndzlkTLnP3JBu4LKkbrYrU69gCVceV0JvpyuW1xlsUVngzhwMetn/XamtTORF9IO5YnWNiyeF9zCAfqR3fUW+vZZKLtgP+ts8BmQRBREAdRDhH3o8QuRh/YucNFz2BEjxbRN6LGzphfKmvP6v6QhqIQyZ8XNJ0W0X83MR1PEcJBNO2KC2Z1TW/v244scp9FwRViZxIOBF0Lctk7ZVSavdLvRlV1hz/ysUi9sr8CIcB3nvWBwA93ykTz18eAYxQ6N/K2DkPA1lv3iXCwmDUT7YkjIby9siXueIJj9H+pzSqJ9oIuJWTUgSSt4WO7o/9GGg0viR4VinNRUDoIj34xoCd6pxD3aK3zfdbnx5v1J3ZNNEJsE0sBG7N27ReDrJc4sFxz7dI/ZAbOmmiKvHBitQXpAdR6+F7v+/ol/tOouUV01EeMZQF2BoQDn6dP4XNr+j9GZEtEK1/L8pFw7bd3a53tsTa7WD+054jOFmPg1XBKPQgnqFfmFcy32ZRvjmiIIQTYFvyDxQ8nH8WIwwGwlyDjDznnilYyFr6njrlZwsKkBpO59A7OwgdzPEWRm+G+oeb7IfyNuzjEEVLrOVxJsxvxwF8kmCM6I2QYmJunz4u4TrADpfl7mlbRTWQ7VmrBzh3+C9f6Grc3YoGN9dg/SXFthpRsT6vobfXRs2VBlgBHXVMLHjDNbIZv1sZ9+X3hB09cXdH1JKViyG0+W9bWZDa/r2f9zAFR71sTzGpMSWz2iI4YssWjWo3REy1MDGjdwe5e0dFSiAC1JakBvu4/CUS8Eh6dqHdU0Or0ioY3W5ClSqDXAy7/6SRfgw8vt4I+tbvvNtFT2kVDhY5+IGb1rCqYaXNF08vSALsXCPmt0kQNqJT1p5eI1mkIV/BxCY1z85lOzeFbPBQHURkkPTlwTYK9gTVE25l84IbFFN+YJDHjdpn0gq6mrHht0dkcjbM4UL9283O5p77GN+SPW/QwVB4IUYg7Or+Kp7naR6qktP98LNF2UxWo9yObPIT9KYg+hK4i56no4rfnM0qeyFf6AwAAAP//trwR3wAAAAZJREFUAwBZ0sR75itw5gAAAABJRU5ErkJggg==";
#[tokio::test(flavor = "multi_thread", worker_threads = 1)]
#[serial(mcp_test_value)]
async fn stdio_server_round_trip() -> anyhow::Result<()> {
@@ -175,6 +181,352 @@ async fn stdio_server_round_trip() -> anyhow::Result<()> {
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 1)]
#[serial(mcp_test_value)]
async fn stdio_image_responses_round_trip() -> anyhow::Result<()> {
skip_if_no_network!(Ok(()));
let server = responses::start_mock_server().await;
let call_id = "img-1";
let server_name = "rmcp";
let tool_name = format!("mcp__{server_name}__image");
// First stream: model decides to call the image tool.
mount_sse_once_match(
&server,
any(),
responses::sse(vec![
responses::ev_response_created("resp-1"),
responses::ev_function_call(call_id, &tool_name, "{}"),
responses::ev_completed("resp-1"),
]),
)
.await;
// Second stream: after tool execution, assistant emits a message and completes.
let final_mock = mount_sse_once_match(
&server,
any(),
responses::sse(vec![
responses::ev_assistant_message("msg-1", "rmcp image tool completed successfully."),
responses::ev_completed("resp-2"),
]),
)
.await;
// Build the stdio rmcp server and pass the image as data URL so it can construct ImageContent.
let rmcp_test_server_bin = CargoBuild::new()
.package("codex-rmcp-client")
.bin("test_stdio_server")
.run()?
.path()
.to_string_lossy()
.into_owned();
let fixture = test_codex()
.with_config(move |config| {
config.features.enable(Feature::RmcpClient);
config.mcp_servers.insert(
server_name.to_string(),
McpServerConfig {
transport: McpServerTransportConfig::Stdio {
command: rmcp_test_server_bin,
args: Vec::new(),
env: Some(HashMap::from([(
"MCP_TEST_IMAGE_DATA_URL".to_string(),
OPENAI_PNG.to_string(),
)])),
env_vars: Vec::new(),
cwd: None,
},
enabled: true,
startup_timeout_sec: Some(Duration::from_secs(10)),
tool_timeout_sec: None,
enabled_tools: None,
disabled_tools: None,
},
);
})
.build(&server)
.await?;
let session_model = fixture.session_configured.model.clone();
fixture
.codex
.submit(Op::UserTurn {
items: vec![UserInput::Text {
text: "call the rmcp image tool".into(),
}],
final_output_json_schema: None,
cwd: fixture.cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::ReadOnly,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
// Wait for tool begin/end and final completion.
let begin_event = wait_for_event_with_timeout(
&fixture.codex,
|ev| matches!(ev, EventMsg::McpToolCallBegin(_)),
Duration::from_secs(10),
)
.await;
let EventMsg::McpToolCallBegin(begin) = begin_event else {
unreachable!("begin");
};
assert_eq!(
begin,
McpToolCallBeginEvent {
call_id: call_id.to_string(),
invocation: McpInvocation {
server: server_name.to_string(),
tool: "image".to_string(),
arguments: Some(json!({})),
},
},
);
let end_event = wait_for_event(&fixture.codex, |ev| {
matches!(ev, EventMsg::McpToolCallEnd(_))
})
.await;
let EventMsg::McpToolCallEnd(end) = end_event else {
unreachable!("end");
};
assert_eq!(end.call_id, call_id);
assert_eq!(
end.invocation,
McpInvocation {
server: server_name.to_string(),
tool: "image".to_string(),
arguments: Some(json!({})),
}
);
let result = end.result.expect("rmcp image tool should return success");
assert_eq!(result.is_error, Some(false));
assert_eq!(result.content.len(), 1);
let base64_only = OPENAI_PNG
.strip_prefix("data:image/png;base64,")
.expect("data url prefix");
match &result.content[0] {
ContentBlock::ImageContent(img) => {
assert_eq!(img.mime_type, "image/png");
assert_eq!(img.r#type, "image");
assert_eq!(img.data, base64_only);
}
other => panic!("expected image content, got {other:?}"),
}
wait_for_event(&fixture.codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
let output_item = final_mock.single_request().function_call_output(call_id);
assert_eq!(
output_item,
json!({
"type": "function_call_output",
"call_id": call_id,
"output": [{
"type": "input_image",
"image_url": OPENAI_PNG
}]
})
);
server.verify().await;
Ok(())
}
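
The test above strips the `data:image/png;base64,` prefix by hand to compare the raw payload. A slightly more general helper is sketched below using only standard library string methods. `split_base64_data_url` is a hypothetical name, and the sketch handles only the `data:<mime>;base64,<payload>` shape, not data URLs with extra parameters or non-base64 bodies.

```rust
/// Split a `data:<mime>;base64,<payload>` URL into (mime type, base64 payload).
/// Returns None for anything that does not match that exact shape.
fn split_base64_data_url(url: &str) -> Option<(&str, &str)> {
    let rest = url.strip_prefix("data:")?;
    // Everything before the first comma is the header, the rest is the payload.
    let (header, payload) = rest.split_once(',')?;
    let mime = header.strip_suffix(";base64")?;
    Some((mime, payload))
}
```

With a helper like this, the MIME type and payload assertions could both be derived from the same constant instead of hard-coding `"image/png"` separately.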
#[tokio::test(flavor = "multi_thread", worker_threads = 1)]
#[serial(mcp_test_value)]
async fn stdio_image_completions_round_trip() -> anyhow::Result<()> {
skip_if_no_network!(Ok(()));
let server = responses::start_mock_server().await;
let call_id = "img-cc-1";
let server_name = "rmcp";
let tool_name = format!("mcp__{server_name}__image");
let tool_call = json!({
"choices": [
{
"delta": {
"tool_calls": [
{
"id": call_id,
"type": "function",
"function": {"name": tool_name, "arguments": "{}"}
}
]
},
"finish_reason": "tool_calls"
}
]
});
let sse_tool_call = format!(
"data: {}\n\ndata: [DONE]\n\n",
serde_json::to_string(&tool_call)?
);
let final_assistant = json!({
"choices": [
{
"delta": {"content": "rmcp image tool completed successfully."},
"finish_reason": "stop"
}
]
});
let sse_final = format!(
"data: {}\n\ndata: [DONE]\n\n",
serde_json::to_string(&final_assistant)?
);
use std::sync::atomic::AtomicUsize;
use std::sync::atomic::Ordering;
struct ChatSeqResponder {
num_calls: AtomicUsize,
bodies: Vec<String>,
}
impl wiremock::Respond for ChatSeqResponder {
fn respond(&self, _: &wiremock::Request) -> wiremock::ResponseTemplate {
let idx = self.num_calls.fetch_add(1, Ordering::SeqCst);
match self.bodies.get(idx) {
Some(body) => wiremock::ResponseTemplate::new(200)
.insert_header("content-type", "text/event-stream")
.set_body_string(body.clone()),
None => panic!("no chat completion response for index {idx}"),
}
}
}
let chat_seq = ChatSeqResponder {
num_calls: AtomicUsize::new(0),
bodies: vec![sse_tool_call, sse_final],
};
wiremock::Mock::given(wiremock::matchers::method("POST"))
.and(wiremock::matchers::path("/v1/chat/completions"))
.respond_with(chat_seq)
.expect(2)
.mount(&server)
.await;
let rmcp_test_server_bin = CargoBuild::new()
.package("codex-rmcp-client")
.bin("test_stdio_server")
.run()?
.path()
.to_string_lossy()
.into_owned();
let fixture = test_codex()
.with_config(move |config| {
config.model_provider.wire_api = codex_core::WireApi::Chat;
config.features.enable(Feature::RmcpClient);
config.mcp_servers.insert(
server_name.to_string(),
McpServerConfig {
transport: McpServerTransportConfig::Stdio {
command: rmcp_test_server_bin,
args: Vec::new(),
env: Some(HashMap::from([(
"MCP_TEST_IMAGE_DATA_URL".to_string(),
OPENAI_PNG.to_string(),
)])),
env_vars: Vec::new(),
cwd: None,
},
enabled: true,
startup_timeout_sec: Some(Duration::from_secs(10)),
tool_timeout_sec: None,
enabled_tools: None,
disabled_tools: None,
},
);
})
.build(&server)
.await?;
let session_model = fixture.session_configured.model.clone();
fixture
.codex
.submit(Op::UserTurn {
items: vec![UserInput::Text {
text: "call the rmcp image tool".into(),
}],
final_output_json_schema: None,
cwd: fixture.cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::ReadOnly,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
let begin_event = wait_for_event_with_timeout(
&fixture.codex,
|ev| matches!(ev, EventMsg::McpToolCallBegin(_)),
Duration::from_secs(10),
)
.await;
let EventMsg::McpToolCallBegin(begin) = begin_event else {
unreachable!("begin");
};
assert_eq!(
begin,
McpToolCallBeginEvent {
call_id: call_id.to_string(),
invocation: McpInvocation {
server: server_name.to_string(),
tool: "image".to_string(),
arguments: Some(json!({})),
},
},
);
let end_event = wait_for_event(&fixture.codex, |ev| {
matches!(ev, EventMsg::McpToolCallEnd(_))
})
.await;
let EventMsg::McpToolCallEnd(end) = end_event else {
unreachable!("end");
};
assert!(end.result.as_ref().is_ok(), "tool call should succeed");
wait_for_event(&fixture.codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
// Chat Completions assertion: the second POST should include a tool role message
// with an array `content` containing an item with the expected data URL.
let requests = server.received_requests().await.expect("requests captured");
assert!(requests.len() >= 2, "expected two chat completion calls");
let second = &requests[1];
let body: Value = serde_json::from_slice(&second.body)?;
let messages = body
.get("messages")
.and_then(Value::as_array)
.cloned()
.expect("messages array");
let tool_msg = messages
.iter()
.find(|m| {
m.get("role") == Some(&json!("tool")) && m.get("tool_call_id") == Some(&json!(call_id))
})
.cloned()
.expect("tool message present");
assert_eq!(
tool_msg,
json!({
"role": "tool",
"tool_call_id": call_id,
"content": [{"type": "image_url", "image_url": {"url": OPENAI_PNG}}]
})
);
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 1)]
#[serial(mcp_test_value)]
async fn stdio_server_propagates_whitelisted_env_vars() -> anyhow::Result<()> {


@@ -28,7 +28,8 @@ fn write_minimal_rollout_with_id(codex_home: &Path, id: Uuid) -> PathBuf {
"instructions": null,
"cwd": ".",
"originator": "test",
"cli_version": "test"
"cli_version": "test",
"model_provider": "test-provider"
}
})
)


@@ -0,0 +1,270 @@
#![cfg(not(target_os = "windows"))]
#![allow(clippy::unwrap_used, clippy::expect_used)]
use anyhow::Context;
use anyhow::Result;
use codex_core::features::Feature;
use codex_core::model_family::find_family_for_model;
use codex_core::protocol::SandboxPolicy;
use core_test_support::assert_regex_match;
use core_test_support::responses;
use core_test_support::responses::ev_assistant_message;
use core_test_support::responses::ev_completed;
use core_test_support::responses::ev_function_call;
use core_test_support::responses::ev_response_created;
use core_test_support::responses::mount_sse_once_match;
use core_test_support::responses::mount_sse_sequence;
use core_test_support::responses::sse;
use core_test_support::responses::start_mock_server;
use core_test_support::skip_if_no_network;
use core_test_support::test_codex::test_codex;
use escargot::CargoBuild;
use regex_lite::Regex;
use serde_json::Value;
use serde_json::json;
use wiremock::matchers::any;
// Verifies byte-truncation formatting for function error output (RespondToModel errors)
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn truncate_function_error_trims_respond_to_model() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let mut builder = test_codex().with_config(|config| {
// Use the test model that wires function tools like grep_files
config.model = "test-gpt-5-codex".to_string();
config.model_family =
find_family_for_model("test-gpt-5-codex").expect("model family for test model");
});
let test = builder.build(&server).await?;
// Construct a very long, non-existent path to force a RespondToModel error with a large message
let long_path = "a".repeat(20_000);
let call_id = "grep-huge-error";
let args = json!({
"pattern": "alpha",
"path": long_path,
"limit": 10
});
let responses = vec![
sse(vec![
ev_response_created("resp-1"),
ev_function_call(call_id, "grep_files", &serde_json::to_string(&args)?),
ev_completed("resp-1"),
]),
sse(vec![
ev_assistant_message("msg-1", "done"),
ev_completed("resp-2"),
]),
];
let mock = mount_sse_sequence(&server, responses).await;
test.submit_turn_with_policy(
"trigger grep_files with long path to test truncation",
SandboxPolicy::DangerFullAccess,
)
.await?;
let output = mock
.function_call_output_text(call_id)
.context("function error output present")?;
tracing::debug!(output = %output, "truncated function error output");
// Expect plaintext with byte-truncation marker and no omitted-lines marker
assert!(
serde_json::from_str::<serde_json::Value>(&output).is_err(),
"expected error output to be plain text",
);
let truncated_pattern = r#"(?s)^Total output lines: 1\s+.*\[\.\.\. output truncated to fit 10240 bytes \.\.\.\]\s*$"#;
assert_regex_match(truncated_pattern, &output);
assert!(
!output.contains("omitted"),
"line omission marker should not appear when no lines were dropped: {output}"
);
Ok(())
}
// Verifies that a standard tool call (shell) exceeding the model formatting
// limits is truncated before being sent back to the model.
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn tool_call_output_exceeds_limit_truncated_for_model() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
// Use a model that exposes the generic shell tool.
let mut builder = test_codex().with_config(|config| {
config.model = "gpt-5-codex".to_string();
config.model_family =
find_family_for_model("gpt-5-codex").expect("gpt-5-codex is a model family");
});
let fixture = builder.build(&server).await?;
let call_id = "shell-too-large";
let args = serde_json::json!({
"command": ["/bin/sh", "-c", "seq 1 400"],
"timeout_ms": 5_000,
});
// First response: model tells us to run the tool; second: complete the turn.
mount_sse_once_match(
&server,
any(),
sse(vec![
responses::ev_response_created("resp-1"),
responses::ev_function_call(call_id, "shell", &serde_json::to_string(&args)?),
responses::ev_completed("resp-1"),
]),
)
.await;
let mock2 = mount_sse_once_match(
&server,
any(),
sse(vec![
responses::ev_assistant_message("msg-1", "done"),
responses::ev_completed("resp-2"),
]),
)
.await;
fixture
.submit_turn_with_policy("trigger big shell output", SandboxPolicy::DangerFullAccess)
.await?;
// Inspect what we sent back to the model; it should contain a truncated
// function_call_output for the shell call.
let output = mock2
.single_request()
.function_call_output_text(call_id)
.context("function_call_output present for shell call")?;
// Expect plain text (not JSON) with truncation markers and line elision.
assert!(
serde_json::from_str::<Value>(&output).is_err(),
"expected truncated shell output to be plain text"
);
let truncated_pattern = r#"(?s)^Exit code: 0
Wall time: .* seconds
Total output lines: 400
Output:
1
2
3
4
5
6
.*
\[\.{3} omitted 144 of 400 lines \.{3}\]
.*
396
397
398
399
400
$"#;
assert_regex_match(truncated_pattern, &output);
Ok(())
}
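The expected pattern above ("omitted 144 of 400 lines") is consistent with keeping 256 lines and dropping the middle. The following is a minimal sketch of that head/tail elision, not the implementation under test: the function name `elide_middle_lines` and the 128/128 split used in the usage note are assumptions made for illustration.

```rust
/// Hypothetical helper: keep the first `head` and last `tail` lines of `text`
/// and replace everything in between with a single marker line.
fn elide_middle_lines(text: &str, head: usize, tail: usize) -> String {
    let lines: Vec<&str> = text.lines().collect();
    let total = lines.len();
    // Nothing to drop when the whole text fits in the head + tail budget.
    if total <= head + tail {
        return text.to_string();
    }
    let omitted = total - head - tail;
    let mut out: Vec<String> = Vec::with_capacity(head + tail + 1);
    out.extend(lines[..head].iter().map(|l| l.to_string()));
    // Marker wording mirrors the regex in the test; the real formatter may differ.
    out.push(format!("[... omitted {omitted} of {total} lines ...]"));
    out.extend(lines[total - tail..].iter().map(|l| l.to_string()));
    out.join("\n")
}
```

With 400 input lines and an assumed 128-line head and tail, this sketch reports 144 omitted lines, which matches the count the test expects.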
// Verifies that an MCP tool call result exceeding the model formatting limits
// is truncated before being sent back to the model.
#[tokio::test(flavor = "multi_thread", worker_threads = 1)]
async fn mcp_tool_call_output_exceeds_limit_truncated_for_model() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let call_id = "rmcp-truncated";
let server_name = "rmcp";
let tool_name = format!("mcp__{server_name}__echo");
// Build a very large message (27 bytes x 600 = 16,200 bytes) to exceed 10 KiB once serialized.
let large_msg = "long-message-with-newlines-".repeat(600);
let args_json = serde_json::json!({ "message": large_msg });
mount_sse_once_match(
&server,
any(),
sse(vec![
responses::ev_response_created("resp-1"),
responses::ev_function_call(call_id, &tool_name, &args_json.to_string()),
responses::ev_completed("resp-1"),
]),
)
.await;
let mock2 = mount_sse_once_match(
&server,
any(),
sse(vec![
responses::ev_assistant_message("msg-1", "rmcp echo tool completed."),
responses::ev_completed("resp-2"),
]),
)
.await;
// Compile the rmcp stdio test server and configure it.
let rmcp_test_server_bin = CargoBuild::new()
.package("codex-rmcp-client")
.bin("test_stdio_server")
.run()?
.path()
.to_string_lossy()
.into_owned();
let mut builder = test_codex().with_config(move |config| {
config.features.enable(Feature::RmcpClient);
config.mcp_servers.insert(
server_name.to_string(),
codex_core::config_types::McpServerConfig {
transport: codex_core::config_types::McpServerTransportConfig::Stdio {
command: rmcp_test_server_bin,
args: Vec::new(),
env: None,
env_vars: Vec::new(),
cwd: None,
},
enabled: true,
startup_timeout_sec: Some(std::time::Duration::from_secs(10)),
tool_timeout_sec: None,
enabled_tools: None,
disabled_tools: None,
},
);
});
let fixture = builder.build(&server).await?;
fixture
.submit_turn_with_policy(
"call the rmcp echo tool with a very large message",
SandboxPolicy::ReadOnly,
)
.await?;
// The MCP tool call output is converted to a function_call_output for the model.
let output = mock2
.single_request()
.function_call_output_text(call_id)
.context("function_call_output present for rmcp call")?;
// Expect plain text with byte-based truncation marker.
assert!(
serde_json::from_str::<Value>(&output).is_err(),
"expected truncated MCP output to be plain text"
);
assert!(
output.starts_with("Total output lines: 1\n\n{"),
"expected total line header and JSON head, got: {output}"
);
let byte_marker = Regex::new(r"\[\.\.\. output truncated to fit 10240 bytes \.\.\.\]")
.expect("compile regex");
assert!(
byte_marker.is_match(&output),
"expected byte truncation marker, got: {output}"
);
Ok(())
}
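The byte-based marker these tests look for can be illustrated with a small sketch. This is an assumption-laden illustration, not the code under test: the helper name `truncate_to_bytes` is hypothetical, and the header and marker layout are inferred only from the strings the assertions match.

```rust
/// Hypothetical helper: keep at most `max_bytes` of `text`, cutting on a UTF-8
/// character boundary, and wrap the result with a line-count header and a
/// truncation marker when anything was dropped.
fn truncate_to_bytes(text: &str, max_bytes: usize) -> String {
    if text.len() <= max_bytes {
        return text.to_string();
    }
    // Walk back from the budget until we land on a char boundary so the
    // slice below never splits a multi-byte character.
    let mut end = max_bytes;
    while end > 0 && !text.is_char_boundary(end) {
        end -= 1;
    }
    let total_lines = text.lines().count();
    format!(
        "Total output lines: {total_lines}\n\n{}\n[... output truncated to fit {max_bytes} bytes ...]",
        &text[..end]
    )
}
```

Cutting on a character boundary matters because slicing a `&str` in the middle of a multi-byte character panics at runtime.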


@@ -240,7 +240,7 @@ async fn unified_exec_emits_output_delta_for_exec_command() -> Result<()> {
let call_id = "uexec-delta-1";
let args = json!({
"cmd": "printf 'HELLO-UEXEC'",
"yield_time_ms": 250,
"yield_time_ms": 1000,
});
let responses = vec![


@@ -19,6 +19,10 @@ use core_test_support::skip_if_no_network;
use core_test_support::test_codex::TestCodex;
use core_test_support::test_codex::test_codex;
use core_test_support::wait_for_event;
use image::GenericImageView;
use image::ImageBuffer;
use image::Rgba;
use image::load_from_memory;
use serde_json::Value;
use wiremock::matchers::any;
@@ -49,6 +53,88 @@ fn extract_output_text(item: &Value) -> Option<&str> {
})
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn user_turn_with_local_image_attaches_image() -> anyhow::Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let TestCodex {
codex,
cwd,
session_configured,
..
} = test_codex().build(&server).await?;
let rel_path = "user-turn/example.png";
let abs_path = cwd.path().join(rel_path);
if let Some(parent) = abs_path.parent() {
std::fs::create_dir_all(parent)?;
}
let image = ImageBuffer::from_pixel(4096, 1024, Rgba([20u8, 40, 60, 255]));
image.save(&abs_path)?;
let response = sse(vec![
ev_response_created("resp-1"),
ev_assistant_message("msg-1", "done"),
ev_completed("resp-1"),
]);
let mock = responses::mount_sse_once_match(&server, any(), response).await;
let session_model = session_configured.model.clone();
codex
.submit(Op::UserTurn {
items: vec![UserInput::LocalImage {
path: abs_path.clone(),
}],
final_output_json_schema: None,
cwd: cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
wait_for_event(&codex, |event| matches!(event, EventMsg::TaskComplete(_))).await;
let body = mock.single_request().body_json();
let image_message =
find_image_message(&body).expect("pending input image message not included in request");
let image_url = image_message
.get("content")
.and_then(Value::as_array)
.and_then(|content| {
content.iter().find_map(|span| {
if span.get("type").and_then(Value::as_str) == Some("input_image") {
span.get("image_url").and_then(Value::as_str)
} else {
None
}
})
})
.expect("image_url present");
let (prefix, encoded) = image_url
.split_once(',')
.expect("image url contains data prefix");
assert_eq!(prefix, "data:image/png;base64");
let decoded = BASE64_STANDARD
.decode(encoded)
.expect("image data decodes from base64 for request");
let resized = load_from_memory(&decoded).expect("load resized image");
let (width, height) = resized.dimensions();
assert!(width <= 2048);
assert!(height <= 768);
assert!(width < 4096);
assert!(height < 1024);
Ok(())
}
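The assertions above only check that a 4096x1024 image comes back no larger than 2048x768. One way such a bound could be met while preserving aspect ratio is sketched below; the function name `fit_within` and the rounding behaviour are assumptions, and the actual resizing code is not shown in this diff.

```rust
/// Hypothetical helper: scale `(width, height)` down so it fits inside
/// `(max_w, max_h)` while keeping the aspect ratio. Images that already fit
/// are returned unchanged.
fn fit_within(width: u32, height: u32, max_w: u32, max_h: u32) -> (u32, u32) {
    if width <= max_w && height <= max_h {
        return (width, height);
    }
    // Use u64 so the cross-multiplications below cannot overflow.
    let (w, h, mw, mh) = (width as u64, height as u64, max_w as u64, max_h as u64);
    if w * mh >= h * mw {
        // Width is the binding constraint: pin it to max_w and scale height.
        let new_h = (h * mw / w).max(1);
        (max_w, new_h as u32)
    } else {
        // Height is the binding constraint: pin it to max_h and scale width.
        let new_w = (w * mh / h).max(1);
        (new_w as u32, max_h)
    }
}
```

For the 4096x1024 fixture used in the test, this sketch yields 2048x512, which satisfies all four assertions.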
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn view_image_tool_attaches_local_image() -> anyhow::Result<()> {
skip_if_no_network!(Ok(()));
@@ -67,8 +153,8 @@ async fn view_image_tool_attaches_local_image() -> anyhow::Result<()> {
if let Some(parent) = abs_path.parent() {
std::fs::create_dir_all(parent)?;
}
let image_bytes = b"fake_png_bytes".to_vec();
std::fs::write(&abs_path, &image_bytes)?;
let image = ImageBuffer::from_pixel(4096, 1024, Rgba([255u8, 0, 0, 255]));
image.save(&abs_path)?;
let call_id = "view-image-call";
let arguments = serde_json::json!({ "path": rel_path }).to_string();
@@ -143,11 +229,20 @@ async fn view_image_tool_attaches_local_image() -> anyhow::Result<()> {
})
.expect("image_url present");
let expected_image_url = format!(
"data:image/png;base64,{}",
BASE64_STANDARD.encode(&image_bytes)
);
assert_eq!(image_url, expected_image_url);
let (prefix, encoded) = image_url
.split_once(',')
.expect("image url contains data prefix");
assert_eq!(prefix, "data:image/png;base64");
let decoded = BASE64_STANDARD
.decode(encoded)
.expect("image data decodes from base64 for request");
let resized = load_from_memory(&decoded).expect("load resized image");
let (resized_width, resized_height) = resized.dimensions();
assert!(resized_width <= 2048);
assert!(resized_height <= 768);
assert!(resized_width < 4096);
assert!(resized_height < 1024);
Ok(())
}


@@ -20,7 +20,6 @@ use codex_core::protocol::StreamErrorEvent;
use codex_core::protocol::TaskCompleteEvent;
use codex_core::protocol::TurnAbortReason;
use codex_core::protocol::TurnDiffEvent;
use codex_core::protocol::WebSearchBeginEvent;
use codex_core::protocol::WebSearchEndEvent;
use codex_protocol::num_format::format_with_separators;
use owo_colors::OwoColorize;
@@ -216,7 +215,6 @@ impl EventProcessor for EventProcessorWithHumanOutput {
cwd.to_string_lossy(),
);
}
EventMsg::ExecCommandOutputDelta(_) => {}
EventMsg::ExecCommandEnd(ExecCommandEndEvent {
aggregated_output,
duration,
@@ -283,7 +281,6 @@ impl EventProcessor for EventProcessorWithHumanOutput {
}
}
}
EventMsg::WebSearchBegin(WebSearchBeginEvent { call_id: _ }) => {}
EventMsg::WebSearchEnd(WebSearchEndEvent { call_id: _, query }) => {
ts_msg!(self, "🌐 Searched: {query}");
}
@@ -411,12 +408,6 @@ impl EventProcessor for EventProcessorWithHumanOutput {
);
eprintln!("{unified_diff}");
}
EventMsg::ExecApprovalRequest(_) => {
// Should we exit?
}
EventMsg::ApplyPatchApprovalRequest(_) => {
// Should we exit?
}
EventMsg::AgentReasoning(agent_reasoning_event) => {
if self.show_agent_reasoning {
ts_msg!(
@@ -481,15 +472,6 @@ impl EventProcessor for EventProcessorWithHumanOutput {
}
}
}
EventMsg::GetHistoryEntryResponse(_) => {
// Currently ignored in exec output.
}
EventMsg::McpListToolsResponse(_) => {
// Currently ignored in exec output.
}
EventMsg::ListCustomPromptsResponse(_) => {
// Currently ignored in exec output.
}
EventMsg::ViewImageToolCall(view) => {
ts_msg!(
self,
@@ -510,15 +492,24 @@ impl EventProcessor for EventProcessorWithHumanOutput {
}
},
EventMsg::ShutdownComplete => return CodexStatus::Shutdown,
EventMsg::ConversationPath(_) => {}
EventMsg::UserMessage(_) => {}
EventMsg::EnteredReviewMode(_) => {}
EventMsg::ExitedReviewMode(_) => {}
EventMsg::AgentMessageDelta(_) => {}
EventMsg::AgentReasoningDelta(_) => {}
EventMsg::AgentReasoningRawContentDelta(_) => {}
EventMsg::ItemStarted(_) => {}
EventMsg::ItemCompleted(_) => {}
EventMsg::WebSearchBegin(_)
| EventMsg::ExecApprovalRequest(_)
| EventMsg::ApplyPatchApprovalRequest(_)
| EventMsg::ExecCommandOutputDelta(_)
| EventMsg::GetHistoryEntryResponse(_)
| EventMsg::McpListToolsResponse(_)
| EventMsg::ListCustomPromptsResponse(_)
| EventMsg::RawResponseItem(_)
| EventMsg::UserMessage(_)
| EventMsg::EnteredReviewMode(_)
| EventMsg::ExitedReviewMode(_)
| EventMsg::AgentMessageDelta(_)
| EventMsg::AgentReasoningDelta(_)
| EventMsg::AgentReasoningRawContentDelta(_)
| EventMsg::ItemStarted(_)
| EventMsg::ItemCompleted(_)
| EventMsg::UndoCompleted(_)
| EventMsg::UndoStarted(_) => {}
}
CodexStatus::Running
}


@@ -179,6 +179,7 @@ pub async fn run_main(cli: Cli, codex_linux_sandbox_exe: Option<PathBuf>) -> any
include_view_image_tool: None,
show_raw_agent_reasoning: oss.then_some(true),
tools_web_search_request: None,
experimental_sandbox_command_assessment: None,
additional_writable_roots: Vec::new(),
};
// Parse `-c` overrides.
@@ -388,8 +389,16 @@ async fn resolve_resume_path(
args: &crate::cli::ResumeArgs,
) -> anyhow::Result<Option<PathBuf>> {
if args.last {
match codex_core::RolloutRecorder::list_conversations(&config.codex_home, 1, None, &[])
.await
let default_provider_filter = vec![config.model_provider_id.clone()];
match codex_core::RolloutRecorder::list_conversations(
&config.codex_home,
1,
None,
&[],
Some(default_provider_filter.as_slice()),
&config.model_provider_id,
)
.await
{
Ok(page) => Ok(page.items.first().map(|it| it.path.clone())),
Err(e) => {


@@ -167,8 +167,16 @@ impl CodexLogSnapshot {
Ok(path)
}
pub fn upload_to_sentry(&self) -> Result<()> {
/// Upload feedback to Sentry with optional attachments.
pub fn upload_feedback(
&self,
classification: &str,
reason: Option<&str>,
include_logs: bool,
rollout_path: Option<&std::path::Path>,
) -> Result<()> {
use std::collections::BTreeMap;
use std::fs;
use std::str::FromStr;
use std::sync::Arc;
@@ -182,36 +190,91 @@ impl CodexLogSnapshot {
use sentry::transports::DefaultTransportFactory;
use sentry::types::Dsn;
// Build Sentry client
let client = Client::from_config(ClientOptions {
dsn: Some(Dsn::from_str(SENTRY_DSN).map_err(|e| anyhow!("invalid DSN: {}", e))?),
dsn: Some(Dsn::from_str(SENTRY_DSN).map_err(|e| anyhow!("invalid DSN: {e}"))?),
transport: Some(Arc::new(DefaultTransportFactory {})),
..Default::default()
});
let tags = BTreeMap::from([(String::from("thread_id"), self.thread_id.to_string())]);
let cli_version = env!("CARGO_PKG_VERSION");
let mut tags = BTreeMap::from([
(String::from("thread_id"), self.thread_id.to_string()),
(String::from("classification"), classification.to_string()),
(String::from("cli_version"), cli_version.to_string()),
]);
if let Some(r) = reason {
tags.insert(String::from("reason"), r.to_string());
}
let event = Event {
level: Level::Error,
message: Some("Codex Log Upload ".to_string() + &self.thread_id),
let level = match classification {
"bug" | "bad_result" => Level::Error,
_ => Level::Info,
};
let mut envelope = Envelope::new();
let title = format!(
"[{}]: Codex session {}",
display_classification(classification),
self.thread_id
);
let mut event = Event {
level,
message: Some(title.clone()),
tags,
..Default::default()
};
let mut envelope = Envelope::new();
if let Some(r) = reason {
use sentry::protocol::Exception;
use sentry::protocol::Values;
event.exception = Values::from(vec![Exception {
ty: title.clone(),
value: Some(r.to_string()),
..Default::default()
}]);
}
envelope.add_item(EnvelopeItem::Event(event));
envelope.add_item(EnvelopeItem::Attachment(Attachment {
buffer: self.bytes.clone(),
filename: String::from("codex-logs.log"),
content_type: Some("text/plain".to_string()),
ty: None,
}));
if include_logs {
envelope.add_item(EnvelopeItem::Attachment(Attachment {
buffer: self.bytes.clone(),
filename: String::from("codex-logs.log"),
content_type: Some("text/plain".to_string()),
ty: None,
}));
}
if let Some((path, data)) = rollout_path.and_then(|p| fs::read(p).ok().map(|d| (p, d))) {
let fname = path
.file_name()
.map(|s| s.to_string_lossy().to_string())
.unwrap_or_else(|| "rollout.jsonl".to_string());
let content_type = "text/plain".to_string();
envelope.add_item(EnvelopeItem::Attachment(Attachment {
buffer: data,
filename: fname,
content_type: Some(content_type),
ty: None,
}));
}
client.send_envelope(envelope);
client.flush(Some(Duration::from_secs(UPLOAD_TIMEOUT_SECS)));
Ok(())
}
}
fn display_classification(classification: &str) -> String {
match classification {
"bug" => "Bug".to_string(),
"bad_result" => "Bad result".to_string(),
"good_result" => "Good result".to_string(),
_ => "Other".to_string(),
}
}
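The classification handling in `upload_feedback` can be read in isolation as a pure mapping from a classification string to a severity and a title. The sketch below restates that mapping without the Sentry dependency; the local `Level` enum is a stand-in for `sentry::Level`, and `level_for` and `feedback_title` are hypothetical names introduced only for this illustration.

```rust
/// Stand-in for `sentry::Level`, so the sketch compiles on its own.
#[derive(Debug, PartialEq)]
enum Level {
    Error,
    Info,
}

/// Mirrors the `match` in the diff: bugs and bad results are errors,
/// everything else is informational.
fn level_for(classification: &str) -> Level {
    match classification {
        "bug" | "bad_result" => Level::Error,
        _ => Level::Info,
    }
}

/// Same mapping as `display_classification` in the diff.
fn display_classification(classification: &str) -> &'static str {
    match classification {
        "bug" => "Bug",
        "bad_result" => "Bad result",
        "good_result" => "Good result",
        _ => "Other",
    }
}

/// Hypothetical helper that builds the event title the way the diff does.
fn feedback_title(classification: &str, thread_id: &str) -> String {
    format!(
        "[{}]: Codex session {}",
        display_classification(classification),
        thread_id
    )
}
```

Keeping the mapping free of I/O like this would make it straightforward to unit test separately from the upload itself.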
#[cfg(test)]
mod tests {
use super::*;


@@ -9,13 +9,20 @@ name = "codex_git_tooling"
path = "src/lib.rs"
[dependencies]
tempfile = "3"
thiserror = "2"
walkdir = "2"
tempfile = { workspace = true }
thiserror = { workspace = true }
walkdir = { workspace = true }
schemars = { workspace = true }
serde = { workspace = true, features = ["derive"] }
ts-rs = { workspace = true, features = [
"uuid-impl",
"serde-json-impl",
"no-serde-warnings",
] }
[lints]
workspace = true
[dev-dependencies]
assert_matches = { workspace = true }
pretty_assertions = "1.4.1"
pretty_assertions = { workspace = true }


@@ -1,4 +1,7 @@
use std::collections::HashSet;
use std::ffi::OsString;
use std::fs;
use std::io;
use std::path::Path;
use std::path::PathBuf;
@@ -14,6 +17,7 @@ use crate::operations::resolve_head;
use crate::operations::resolve_repository_root;
use crate::operations::run_git_for_status;
use crate::operations::run_git_for_stdout;
use crate::operations::run_git_for_stdout_all;
/// Default commit message used for ghost commits when none is provided.
const DEFAULT_COMMIT_MESSAGE: &str = "codex snapshot";
@@ -69,6 +73,8 @@ pub fn create_ghost_commit(
let repo_root = resolve_repository_root(options.repo_path)?;
let repo_prefix = repo_subdir(repo_root.as_path(), options.repo_path);
let parent = resolve_head(repo_root.as_path())?;
let existing_untracked =
capture_existing_untracked(repo_root.as_path(), repo_prefix.as_deref())?;
let normalized_force = options
.force_include
@@ -84,6 +90,16 @@ pub fn create_ghost_commit(
OsString::from(index_path.as_os_str()),
)];
// Pre-populate the temporary index with HEAD so unchanged tracked files
// are included in the snapshot tree.
if let Some(parent_sha) = parent.as_deref() {
run_git_for_status(
repo_root.as_path(),
vec![OsString::from("read-tree"), OsString::from(parent_sha)],
Some(base_env.as_slice()),
)?;
}
let mut add_args = vec![OsString::from("add"), OsString::from("--all")];
if let Some(prefix) = repo_prefix.as_deref() {
add_args.extend([OsString::from("--"), prefix.as_os_str().to_os_string()]);
@@ -127,12 +143,29 @@ pub fn create_ghost_commit(
Some(commit_env.as_slice()),
)?;
Ok(GhostCommit::new(commit_id, parent))
Ok(GhostCommit::new(
commit_id,
parent,
existing_untracked.files,
existing_untracked.dirs,
))
}
/// Restore the working tree to match the provided ghost commit.
pub fn restore_ghost_commit(repo_path: &Path, commit: &GhostCommit) -> Result<(), GitToolingError> {
restore_to_commit(repo_path, commit.id())
ensure_git_repository(repo_path)?;
let repo_root = resolve_repository_root(repo_path)?;
let repo_prefix = repo_subdir(repo_root.as_path(), repo_path);
let current_untracked =
capture_existing_untracked(repo_root.as_path(), repo_prefix.as_deref())?;
restore_to_commit_inner(repo_root.as_path(), repo_prefix.as_deref(), commit.id())?;
remove_new_untracked(
repo_root.as_path(),
commit.preexisting_untracked_files(),
commit.preexisting_untracked_dirs(),
current_untracked,
)
}
/// Restore the working tree to match the given commit ID.
@@ -141,7 +174,16 @@ pub fn restore_to_commit(repo_path: &Path, commit_id: &str) -> Result<(), GitToo
let repo_root = resolve_repository_root(repo_path)?;
let repo_prefix = repo_subdir(repo_root.as_path(), repo_path);
restore_to_commit_inner(repo_root.as_path(), repo_prefix.as_deref(), commit_id)
}
/// Restores the working tree and index to the given commit using `git restore`.
/// The repository root and optional repository-relative prefix limit the restore scope.
fn restore_to_commit_inner(
repo_root: &Path,
repo_prefix: Option<&Path>,
commit_id: &str,
) -> Result<(), GitToolingError> {
let mut restore_args = vec![
OsString::from("restore"),
OsString::from("--source"),
@@ -150,13 +192,143 @@ pub fn restore_to_commit(repo_path: &Path, commit_id: &str) -> Result<(), GitToo
OsString::from("--staged"),
OsString::from("--"),
];
if let Some(prefix) = repo_prefix.as_deref() {
if let Some(prefix) = repo_prefix {
restore_args.push(prefix.as_os_str().to_os_string());
} else {
restore_args.push(OsString::from("."));
}
run_git_for_status(repo_root.as_path(), restore_args, None)?;
run_git_for_status(repo_root, restore_args, None)?;
Ok(())
}
#[derive(Default)]
struct UntrackedSnapshot {
files: Vec<PathBuf>,
dirs: Vec<PathBuf>,
}
/// Captures the untracked and ignored entries under `repo_root`, optionally limited by `repo_prefix`.
/// Returns the result as an `UntrackedSnapshot`.
fn capture_existing_untracked(
repo_root: &Path,
repo_prefix: Option<&Path>,
) -> Result<UntrackedSnapshot, GitToolingError> {
// Ask git for NUL-delimited porcelain v2 status so we can enumerate every
// untracked or ignored path (restricted to the prefix when one is given).
let mut args = vec![
OsString::from("status"),
OsString::from("--porcelain=2"),
OsString::from("-z"),
OsString::from("--ignored=matching"),
OsString::from("--untracked-files=all"),
];
if let Some(prefix) = repo_prefix {
args.push(OsString::from("--"));
args.push(prefix.as_os_str().to_os_string());
}
let output = run_git_for_stdout_all(repo_root, args, None)?;
if output.is_empty() {
return Ok(UntrackedSnapshot::default());
}
let mut snapshot = UntrackedSnapshot::default();
// Each entry is of the form "<code> <path>" where code is '?' (untracked)
// or '!' (ignored); everything else is irrelevant to this snapshot.
for entry in output.split('\0') {
if entry.is_empty() {
continue;
}
let mut parts = entry.splitn(2, ' ');
let code = parts.next();
let path_part = parts.next();
let (Some(code), Some(path_part)) = (code, path_part) else {
continue;
};
if code != "?" && code != "!" {
continue;
}
if path_part.is_empty() {
continue;
}
let normalized = normalize_relative_path(Path::new(path_part))?;
let absolute = repo_root.join(&normalized);
let is_dir = absolute.is_dir();
if is_dir {
snapshot.dirs.push(normalized);
} else {
snapshot.files.push(normalized);
}
}
Ok(snapshot)
}
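The parsing step in `capture_existing_untracked` can be exercised without a repository by feeding it a literal status string. The sketch below is a simplified, filesystem-free variant under stated assumptions: `EntryKind` and `parse_untracked` are hypothetical names, and unlike the function in the diff it neither normalizes paths nor checks whether each path is a directory.

```rust
/// Hypothetical classification of the two entry codes the snapshot cares about.
#[derive(Debug, PartialEq)]
enum EntryKind {
    Untracked,
    Ignored,
}

/// Parse NUL-delimited `git status --porcelain=2 -z` output, keeping only
/// `?` (untracked) and `!` (ignored) entries.
fn parse_untracked(output: &str) -> Vec<(EntryKind, String)> {
    output
        .split('\0')
        .filter_map(|entry| {
            // Each relevant entry looks like "<code> <path>"; entries without
            // a space (including the trailing empty one) are skipped.
            let (code, path) = entry.split_once(' ')?;
            if path.is_empty() {
                return None;
            }
            match code {
                "?" => Some((EntryKind::Untracked, path.to_string())),
                "!" => Some((EntryKind::Ignored, path.to_string())),
                // Tracked-change records ("1 ...", "2 ...", "u ...") are ignored.
                _ => None,
            }
        })
        .collect()
}
```

Splitting on NUL rather than newlines is what allows paths containing spaces or newlines to survive intact, which is why the output must not be trimmed beforehand.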
/// Removes untracked files and directories that were not present when the snapshot was captured.
fn remove_new_untracked(
repo_root: &Path,
preserved_files: &[PathBuf],
preserved_dirs: &[PathBuf],
current: UntrackedSnapshot,
) -> Result<(), GitToolingError> {
if current.files.is_empty() && current.dirs.is_empty() {
return Ok(());
}
let preserved_file_set: HashSet<PathBuf> = preserved_files.iter().cloned().collect();
let preserved_dirs_vec: Vec<PathBuf> = preserved_dirs.to_vec();
for path in current.files {
if should_preserve(&path, &preserved_file_set, &preserved_dirs_vec) {
continue;
}
remove_path(&repo_root.join(&path))?;
}
for dir in current.dirs {
if should_preserve(&dir, &preserved_file_set, &preserved_dirs_vec) {
continue;
}
remove_path(&repo_root.join(&dir))?;
}
Ok(())
}
/// Determines whether an untracked path should be kept because it existed in the snapshot.
fn should_preserve(
path: &Path,
preserved_files: &HashSet<PathBuf>,
preserved_dirs: &[PathBuf],
) -> bool {
if preserved_files.contains(path) {
return true;
}
preserved_dirs
.iter()
.any(|dir| path.starts_with(dir.as_path()))
}
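The preservation rule relies on `Path::starts_with` comparing whole path components rather than string prefixes. A minimal sketch of the same filter as a pure function follows; `paths_to_remove` is a hypothetical name, and the real code deletes paths as it goes instead of returning a list.

```rust
use std::collections::HashSet;
use std::path::PathBuf;

/// Hypothetical helper: return the paths in `current` that were neither
/// recorded as pre-existing files nor located under a pre-existing directory.
fn paths_to_remove(
    current: &[PathBuf],
    preserved_files: &HashSet<PathBuf>,
    preserved_dirs: &[PathBuf],
) -> Vec<PathBuf> {
    current
        .iter()
        .filter(|path| {
            let keep = preserved_files.contains(path.as_path())
                // Component-wise match: "targets.txt" is NOT under "target".
                || preserved_dirs.iter().any(|dir| path.starts_with(dir));
            !keep
        })
        .cloned()
        .collect()
}
```

Because the comparison is component-wise, a preserved directory named `target` protects `target/debug/app` but not a sibling file such as `targets.txt`.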
/// Deletes the file or directory at the provided path, treating an already-absent path as success.
fn remove_path(path: &Path) -> Result<(), GitToolingError> {
match fs::symlink_metadata(path) {
Ok(metadata) => {
if metadata.is_dir() {
fs::remove_dir_all(path)?;
} else {
fs::remove_file(path)?;
}
}
Err(err) => {
if err.kind() == io::ErrorKind::NotFound {
return Ok(());
}
return Err(err.into());
}
}
Ok(())
}
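The idempotent-delete pattern in `remove_path` can be written compactly with match guards. The sketch below is an illustrative variant, not a proposed replacement: `remove_if_present` is a hypothetical name, it returns `io::Result` rather than `GitToolingError`, and it additionally reports whether anything was removed.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: delete a file or directory tree, returning `Ok(true)`
/// if something was removed and `Ok(false)` if the path did not exist.
fn remove_if_present(path: &Path) -> io::Result<bool> {
    // `symlink_metadata` does not follow symlinks, so a symlink to a
    // directory is removed as a file instead of deleting its target.
    match fs::symlink_metadata(path) {
        Ok(meta) if meta.is_dir() => fs::remove_dir_all(path).map(|_| true),
        Ok(_) => fs::remove_file(path).map(|_| true),
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(false),
        Err(err) => Err(err),
    }
}
```

Treating `NotFound` as success makes the restore safe to repeat, since a second pass over the same paths is a no-op.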
@@ -239,6 +411,9 @@ mod tests {
],
);
let preexisting_untracked = repo.join("notes.txt");
std::fs::write(&preexisting_untracked, "notes before\n")?;
let tracked_contents = "modified contents\n";
std::fs::write(repo.join("tracked.txt"), tracked_contents)?;
std::fs::remove_file(repo.join("delete-me.txt"))?;
@@ -267,6 +442,7 @@ mod tests {
std::fs::write(repo.join("ignored.txt"), "changed\n")?;
std::fs::remove_file(repo.join("new-file.txt"))?;
std::fs::write(repo.join("ephemeral.txt"), "temp data\n")?;
std::fs::write(&preexisting_untracked, "notes after\n")?;
restore_ghost_commit(repo, &ghost)?;
@@ -277,7 +453,9 @@ mod tests {
let new_file_after = std::fs::read_to_string(repo.join("new-file.txt"))?;
assert_eq!(new_file_after, new_file_contents);
assert_eq!(repo.join("delete-me.txt").exists(), false);
assert!(repo.join("ephemeral.txt").exists());
assert!(!repo.join("ephemeral.txt").exists());
let notes_after = std::fs::read_to_string(&preexisting_untracked)?;
assert_eq!(notes_after, "notes before\n");
Ok(())
}
@@ -488,7 +666,43 @@ mod tests {
assert!(vscode.join("settings.json").exists());
let settings_after = std::fs::read_to_string(vscode.join("settings.json"))?;
assert_eq!(settings_after, "{\n \"after\": true\n}\n");
assert!(repo.join("temp.txt").exists());
assert!(!repo.join("temp.txt").exists());
Ok(())
}
#[test]
/// Restoring removes ignored directories created after the snapshot.
fn restore_removes_new_ignored_directory() -> Result<(), GitToolingError> {
let temp = tempfile::tempdir()?;
let repo = temp.path();
init_test_repo(repo);
std::fs::write(repo.join(".gitignore"), ".vscode/\n")?;
std::fs::write(repo.join("tracked.txt"), "snapshot version\n")?;
run_git_in(repo, &["add", ".gitignore", "tracked.txt"]);
run_git_in(
repo,
&[
"-c",
"user.name=Tester",
"-c",
"user.email=test@example.com",
"commit",
"-m",
"initial",
],
);
let ghost = create_ghost_commit(&CreateGhostCommitOptions::new(repo))?;
let vscode = repo.join(".vscode");
std::fs::create_dir_all(&vscode)?;
std::fs::write(vscode.join("settings.json"), "{\n \"after\": true\n}\n")?;
restore_ghost_commit(repo, &ghost)?;
assert!(!vscode.exists());
Ok(())
}


@@ -1,4 +1,5 @@
use std::fmt;
use std::path::PathBuf;
mod errors;
mod ghost_commits;
@@ -11,18 +12,36 @@ pub use ghost_commits::create_ghost_commit;
pub use ghost_commits::restore_ghost_commit;
pub use ghost_commits::restore_to_commit;
pub use platform::create_symlink;
use schemars::JsonSchema;
use serde::Deserialize;
use serde::Serialize;
use ts_rs::TS;
type CommitID = String;
/// Details of a ghost commit created from a repository state.
#[derive(Debug, Clone, PartialEq, Eq)]
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize, JsonSchema, TS)]
pub struct GhostCommit {
id: String,
parent: Option<String>,
id: CommitID,
parent: Option<CommitID>,
preexisting_untracked_files: Vec<PathBuf>,
preexisting_untracked_dirs: Vec<PathBuf>,
}
impl GhostCommit {
/// Create a new ghost commit wrapper from a raw commit ID, an optional parent,
/// and the untracked or ignored paths that existed when the snapshot was taken.
pub fn new(id: String, parent: Option<String>) -> Self {
Self { id, parent }
pub fn new(
id: CommitID,
parent: Option<CommitID>,
preexisting_untracked_files: Vec<PathBuf>,
preexisting_untracked_dirs: Vec<PathBuf>,
) -> Self {
Self {
id,
parent,
preexisting_untracked_files,
preexisting_untracked_dirs,
}
}
/// Commit ID for the snapshot.
@@ -34,6 +53,16 @@ impl GhostCommit {
pub fn parent(&self) -> Option<&str> {
self.parent.as_deref()
}
/// Untracked or ignored files that already existed when the snapshot was captured.
pub fn preexisting_untracked_files(&self) -> &[PathBuf] {
&self.preexisting_untracked_files
}
/// Untracked or ignored directories that already existed when the snapshot was captured.
pub fn preexisting_untracked_dirs(&self) -> &[PathBuf] {
&self.preexisting_untracked_dirs
}
}
impl fmt::Display for GhostCommit {


@@ -161,6 +161,27 @@ where
})
}
/// Executes `git` and returns the full stdout without trimming so callers
/// can parse delimiter-sensitive output, propagating UTF-8 errors with context.
pub(crate) fn run_git_for_stdout_all<I, S>(
dir: &Path,
args: I,
env: Option<&[(OsString, OsString)]>,
) -> Result<String, GitToolingError>
where
I: IntoIterator<Item = S>,
S: AsRef<OsStr>,
{
// Keep the raw stdout untouched so callers can parse delimiter-sensitive
// output (e.g. NUL-separated paths) without trimming artefacts.
let run = run_git(dir, args, env)?;
// Propagate UTF-8 conversion failures with the command context for debugging.
String::from_utf8(run.output.stdout).map_err(|source| GitToolingError::GitOutputUtf8 {
command: run.command,
source,
})
}
fn run_git<I, S>(
dir: &Path,
args: I,


@@ -0,0 +1,11 @@
[package]
edition = "2024"
name = "codex-keyring-store"
version = { workspace = true }
[lints]
workspace = true
[dependencies]
keyring = { workspace = true }
tracing = { workspace = true }


@@ -0,0 +1,226 @@
use keyring::Entry;
use keyring::Error as KeyringError;
use std::error::Error;
use std::fmt;
use std::fmt::Debug;
use tracing::trace;
#[derive(Debug)]
pub enum CredentialStoreError {
Other(KeyringError),
}
impl CredentialStoreError {
pub fn new(error: KeyringError) -> Self {
Self::Other(error)
}
pub fn message(&self) -> String {
match self {
Self::Other(error) => error.to_string(),
}
}
pub fn into_error(self) -> KeyringError {
match self {
Self::Other(error) => error,
}
}
}
impl fmt::Display for CredentialStoreError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
Self::Other(error) => write!(f, "{error}"),
}
}
}
impl Error for CredentialStoreError {}
/// Shared credential store abstraction for keyring-backed implementations.
pub trait KeyringStore: Debug + Send + Sync {
fn load(&self, service: &str, account: &str) -> Result<Option<String>, CredentialStoreError>;
fn save(&self, service: &str, account: &str, value: &str) -> Result<(), CredentialStoreError>;
fn delete(&self, service: &str, account: &str) -> Result<bool, CredentialStoreError>;
}
#[derive(Debug)]
pub struct DefaultKeyringStore;
impl KeyringStore for DefaultKeyringStore {
fn load(&self, service: &str, account: &str) -> Result<Option<String>, CredentialStoreError> {
trace!("keyring.load start, service={service}, account={account}");
let entry = Entry::new(service, account).map_err(CredentialStoreError::new)?;
match entry.get_password() {
Ok(password) => {
trace!("keyring.load success, service={service}, account={account}");
Ok(Some(password))
}
Err(keyring::Error::NoEntry) => {
trace!("keyring.load no entry, service={service}, account={account}");
Ok(None)
}
Err(error) => {
trace!("keyring.load error, service={service}, account={account}, error={error}");
Err(CredentialStoreError::new(error))
}
}
}
fn save(&self, service: &str, account: &str, value: &str) -> Result<(), CredentialStoreError> {
trace!(
"keyring.save start, service={service}, account={account}, value_len={}",
value.len()
);
let entry = Entry::new(service, account).map_err(CredentialStoreError::new)?;
match entry.set_password(value) {
Ok(()) => {
trace!("keyring.save success, service={service}, account={account}");
Ok(())
}
Err(error) => {
trace!("keyring.save error, service={service}, account={account}, error={error}");
Err(CredentialStoreError::new(error))
}
}
}
fn delete(&self, service: &str, account: &str) -> Result<bool, CredentialStoreError> {
trace!("keyring.delete start, service={service}, account={account}");
let entry = Entry::new(service, account).map_err(CredentialStoreError::new)?;
match entry.delete_credential() {
Ok(()) => {
trace!("keyring.delete success, service={service}, account={account}");
Ok(true)
}
Err(keyring::Error::NoEntry) => {
trace!("keyring.delete no entry, service={service}, account={account}");
Ok(false)
}
Err(error) => {
trace!("keyring.delete error, service={service}, account={account}, error={error}");
Err(CredentialStoreError::new(error))
}
}
}
}
pub mod tests {
use super::CredentialStoreError;
use super::KeyringStore;
use keyring::Error as KeyringError;
use keyring::credential::CredentialApi as _;
use keyring::mock::MockCredential;
use std::collections::HashMap;
use std::sync::Arc;
use std::sync::Mutex;
use std::sync::PoisonError;
#[derive(Default, Clone, Debug)]
pub struct MockKeyringStore {
credentials: Arc<Mutex<HashMap<String, Arc<MockCredential>>>>,
}
impl MockKeyringStore {
pub fn credential(&self, account: &str) -> Arc<MockCredential> {
let mut guard = self
.credentials
.lock()
.unwrap_or_else(PoisonError::into_inner);
guard
.entry(account.to_string())
.or_insert_with(|| Arc::new(MockCredential::default()))
.clone()
}
pub fn saved_value(&self, account: &str) -> Option<String> {
let credential = {
let guard = self
.credentials
.lock()
.unwrap_or_else(PoisonError::into_inner);
guard.get(account).cloned()
}?;
credential.get_password().ok()
}
pub fn set_error(&self, account: &str, error: KeyringError) {
let credential = self.credential(account);
credential.set_error(error);
}
pub fn contains(&self, account: &str) -> bool {
let guard = self
.credentials
.lock()
.unwrap_or_else(PoisonError::into_inner);
guard.contains_key(account)
}
}
impl KeyringStore for MockKeyringStore {
fn load(
&self,
_service: &str,
account: &str,
) -> Result<Option<String>, CredentialStoreError> {
let credential = {
let guard = self
.credentials
.lock()
.unwrap_or_else(PoisonError::into_inner);
guard.get(account).cloned()
};
let Some(credential) = credential else {
return Ok(None);
};
match credential.get_password() {
Ok(password) => Ok(Some(password)),
Err(KeyringError::NoEntry) => Ok(None),
Err(error) => Err(CredentialStoreError::new(error)),
}
}
fn save(
&self,
_service: &str,
account: &str,
value: &str,
) -> Result<(), CredentialStoreError> {
let credential = self.credential(account);
credential
.set_password(value)
.map_err(CredentialStoreError::new)
}
fn delete(&self, _service: &str, account: &str) -> Result<bool, CredentialStoreError> {
let credential = {
let guard = self
.credentials
.lock()
.unwrap_or_else(PoisonError::into_inner);
guard.get(account).cloned()
};
let Some(credential) = credential else {
return Ok(false);
};
let removed = match credential.delete_credential() {
Ok(()) => Ok(true),
Err(KeyringError::NoEntry) => Ok(false),
Err(error) => Err(CredentialStoreError::new(error)),
}?;
let mut guard = self
.credentials
.lock()
.unwrap_or_else(PoisonError::into_inner);
guard.remove(account);
Ok(removed)
}
}
}
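
The `KeyringStore` trait above encodes two conventions: `load` returns `Ok(None)` for a missing entry rather than an error, and `delete` returns whether anything was actually removed. A std-only sketch of those semantics follows. `StoreError` and `InMemoryStore` are hypothetical stand-ins (the real error type wraps `keyring::Error`), and unlike `MockKeyringStore`, which ignores the service name, this sketch keys entries by `(service, account)`:

```rust
use std::collections::HashMap;
use std::fmt::Debug;
use std::sync::Mutex;
use std::sync::PoisonError;

/// Hypothetical stand-in for `CredentialStoreError`.
#[derive(Debug, PartialEq)]
pub struct StoreError(pub String);

/// Same method shapes as the trait in the diff, using the stand-in error.
pub trait KeyringStore: Debug + Send + Sync {
    fn load(&self, service: &str, account: &str) -> Result<Option<String>, StoreError>;
    fn save(&self, service: &str, account: &str, value: &str) -> Result<(), StoreError>;
    fn delete(&self, service: &str, account: &str) -> Result<bool, StoreError>;
}

/// Hypothetical in-memory store keyed by (service, account).
#[derive(Debug, Default)]
pub struct InMemoryStore {
    entries: Mutex<HashMap<(String, String), String>>,
}

impl KeyringStore for InMemoryStore {
    fn load(&self, service: &str, account: &str) -> Result<Option<String>, StoreError> {
        // Recover the map even if another thread panicked while holding the lock.
        let guard = self.entries.lock().unwrap_or_else(PoisonError::into_inner);
        let key = (service.to_string(), account.to_string());
        // A missing entry is reported as Ok(None), not as an error.
        Ok(guard.get(&key).cloned())
    }

    fn save(&self, service: &str, account: &str, value: &str) -> Result<(), StoreError> {
        let mut guard = self.entries.lock().unwrap_or_else(PoisonError::into_inner);
        guard.insert((service.to_string(), account.to_string()), value.to_string());
        Ok(())
    }

    fn delete(&self, service: &str, account: &str) -> Result<bool, StoreError> {
        let mut guard = self.entries.lock().unwrap_or_else(PoisonError::into_inner);
        let key = (service.to_string(), account.to_string());
        // true only when an entry existed and was removed.
        Ok(guard.remove(&key).is_some())
    }
}
```

Callers can then treat "nothing stored yet" as ordinary control flow and reserve the `Err` branch for real backend failures.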


@@ -16,9 +16,7 @@ pub use codex_core::auth::AuthDotJson;
pub use codex_core::auth::CLIENT_ID;
pub use codex_core::auth::CODEX_API_KEY_ENV_VAR;
pub use codex_core::auth::OPENAI_API_KEY_ENV_VAR;
-pub use codex_core::auth::get_auth_file;
pub use codex_core::auth::login_with_api_key;
pub use codex_core::auth::logout;
-pub use codex_core::auth::try_read_auth_json;
-pub use codex_core::auth::write_auth_json;
+pub use codex_core::auth::save_auth;
pub use codex_core::token_data::TokenData;


@@ -15,7 +15,7 @@ use crate::pkce::generate_pkce;
use base64::Engine;
use chrono::Utc;
use codex_core::auth::AuthDotJson;
-use codex_core::auth::get_auth_file;
+use codex_core::auth::save_auth;
use codex_core::default_client::originator;
use codex_core::token_data::TokenData;
use codex_core::token_data::parse_id_token;
@@ -540,13 +540,6 @@ pub(crate) async fn persist_tokens_async(
// Reuse existing synchronous logic but run it off the async runtime.
let codex_home = codex_home.to_path_buf();
tokio::task::spawn_blocking(move || {
-let auth_file = get_auth_file(&codex_home);
-if let Some(parent) = auth_file.parent()
-&& !parent.exists()
-{
-std::fs::create_dir_all(parent).map_err(io::Error::other)?;
-}
let mut tokens = TokenData {
id_token: parse_id_token(&id_token).map_err(io::Error::other)?,
access_token,
@@ -564,7 +557,7 @@ pub(crate) async fn persist_tokens_async(
tokens: Some(tokens),
last_refresh: Some(Utc::now()),
};
-codex_core::auth::write_auth_json(&auth_file, &auth)
+save_auth(&codex_home, &auth)
})
.await
.map_err(|e| io::Error::other(format!("persist task failed: {e}")))?


@@ -1,9 +1,9 @@
#![allow(clippy::unwrap_used)]
+use anyhow::Context;
use base64::Engine;
use base64::engine::general_purpose::URL_SAFE_NO_PAD;
-use codex_core::auth::get_auth_file;
-use codex_core::auth::try_read_auth_json;
+use codex_core::auth::load_auth_dot_json;
use codex_login::ServerOptions;
use codex_login::run_device_code_login;
use serde_json::json;
@@ -108,8 +108,8 @@ fn server_opts(codex_home: &tempfile::TempDir, issuer: String) -> ServerOptions
}
#[tokio::test]
-async fn device_code_login_integration_succeeds() {
-skip_if_no_network!();
+async fn device_code_login_integration_succeeds() -> anyhow::Result<()> {
+skip_if_no_network!(Ok(()));
let codex_home = tempdir().unwrap();
let mock_server = MockServer::start().await;
@@ -133,19 +133,21 @@ async fn device_code_login_integration_succeeds() {
.await
.expect("device code login integration should succeed");
-let auth_path = get_auth_file(codex_home.path());
-let auth = try_read_auth_json(&auth_path).expect("auth.json written");
+let auth = load_auth_dot_json(codex_home.path())
+.context("auth.json should load after login succeeds")?
+.context("auth.json written")?;
// assert_eq!(auth.openai_api_key.as_deref(), Some("api-key-321"));
let tokens = auth.tokens.expect("tokens persisted");
assert_eq!(tokens.access_token, "access-token-123");
assert_eq!(tokens.refresh_token, "refresh-token-123");
assert_eq!(tokens.id_token.raw_jwt, jwt);
assert_eq!(tokens.account_id.as_deref(), Some("acct_321"));
+Ok(())
}
#[tokio::test]
-async fn device_code_login_rejects_workspace_mismatch() {
-skip_if_no_network!();
+async fn device_code_login_rejects_workspace_mismatch() -> anyhow::Result<()> {
+skip_if_no_network!(Ok(()));
let codex_home = tempdir().unwrap();
let mock_server = MockServer::start().await;
@@ -172,16 +174,18 @@ async fn device_code_login_rejects_workspace_mismatch() {
.expect_err("device code login should fail when workspace mismatches");
assert_eq!(err.kind(), std::io::ErrorKind::PermissionDenied);
-let auth_path = get_auth_file(codex_home.path());
+let auth =
+load_auth_dot_json(codex_home.path()).context("auth.json should load after login fails")?;
assert!(
-!auth_path.exists(),
+auth.is_none(),
"auth.json should not be created when workspace validation fails"
);
+Ok(())
}
#[tokio::test]
-async fn device_code_login_integration_handles_usercode_http_failure() {
-skip_if_no_network!();
+async fn device_code_login_integration_handles_usercode_http_failure() -> anyhow::Result<()> {
+skip_if_no_network!(Ok(()));
let codex_home = tempdir().unwrap();
let mock_server = MockServer::start().await;
@@ -201,13 +205,19 @@ async fn device_code_login_integration_handles_usercode_http_failure() {
"unexpected error: {err:?}"
);
-let auth_path = get_auth_file(codex_home.path());
-assert!(!auth_path.exists());
+let auth =
+load_auth_dot_json(codex_home.path()).context("auth.json should load after login fails")?;
+assert!(
+auth.is_none(),
+"auth.json should not be created when login fails"
+);
+Ok(())
}
#[tokio::test]
-async fn device_code_login_integration_persists_without_api_key_on_exchange_failure() {
-skip_if_no_network!();
+async fn device_code_login_integration_persists_without_api_key_on_exchange_failure()
+-> anyhow::Result<()> {
+skip_if_no_network!(Ok(()));
let codex_home = tempdir().unwrap();
@@ -235,18 +245,20 @@ async fn device_code_login_integration_persists_without_api_key_on_exchange_fail
.await
.expect("device login should succeed without API key exchange");
-let auth_path = get_auth_file(codex_home.path());
-let auth = try_read_auth_json(&auth_path).expect("auth.json written");
+let auth = load_auth_dot_json(codex_home.path())
+.context("auth.json should load after login succeeds")?
+.context("auth.json written")?;
assert!(auth.openai_api_key.is_none());
let tokens = auth.tokens.expect("tokens persisted");
assert_eq!(tokens.access_token, "access-token-123");
assert_eq!(tokens.refresh_token, "refresh-token-123");
assert_eq!(tokens.id_token.raw_jwt, jwt);
+Ok(())
}
#[tokio::test]
-async fn device_code_login_integration_handles_error_payload() {
-skip_if_no_network!();
+async fn device_code_login_integration_handles_error_payload() -> anyhow::Result<()> {
+skip_if_no_network!(Ok(()));
let codex_home = tempdir().unwrap();
@@ -288,9 +300,11 @@ async fn device_code_login_integration_handles_error_payload() {
"Expected an authorization_declined / 400 / 404 error, got {err:?}"
);
-let auth_path = get_auth_file(codex_home.path());
+let auth =
+load_auth_dot_json(codex_home.path()).context("auth.json should load after login fails")?;
assert!(
-!auth_path.exists(),
+auth.is_none(),
"auth.json should not be created when device auth fails"
);
+Ok(())
}
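
The updated tests no longer check whether a file path exists. They call a loader and distinguish "failed to load" (the `?` on the outer `Result`) from "nothing stored" (`is_none()` on the inner `Option`). From the call sites, `load_auth_dot_json` appears to return a `Result<Option<_>>`, though its actual signature is not shown in this diff. A std-only sketch of that shape, with the hypothetical `load_optional` standing in for the real loader:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical loader: `Ok(None)` when the file is absent,
/// `Err` for any other I/O failure, `Ok(Some(..))` on success.
fn load_optional(path: &Path) -> io::Result<Option<String>> {
    match fs::read_to_string(path) {
        Ok(contents) => Ok(Some(contents)),
        // Absence is an expected state, not an error.
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(err) => Err(err),
    }
}
```

A test written against this shape keeps passing if the storage backend changes, because it asserts on what the loader reports rather than on where the data lives.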


@@ -158,6 +158,7 @@ impl CodexToolCallParam {
include_view_image_tool: None,
show_raw_agent_reasoning: None,
tools_web_search_request: None,
experimental_sandbox_command_assessment: None,
additional_writable_roots: Vec::new(),
};


@@ -178,6 +178,7 @@ async fn run_codex_tool_session_inner(
cwd,
call_id,
reason: _,
risk,
parsed_cmd,
}) => {
handle_exec_approval_request(
@@ -190,6 +191,7 @@ async fn run_codex_tool_session_inner(
event.id.clone(),
call_id,
parsed_cmd,
risk,
)
.await;
continue;
@@ -279,13 +281,15 @@ async fn run_codex_tool_session_inner(
| EventMsg::GetHistoryEntryResponse(_)
| EventMsg::PlanUpdate(_)
| EventMsg::TurnAborted(_)
| EventMsg::ConversationPath(_)
| EventMsg::UserMessage(_)
| EventMsg::ShutdownComplete
| EventMsg::ViewImageToolCall(_)
| EventMsg::RawResponseItem(_)
| EventMsg::EnteredReviewMode(_)
| EventMsg::ItemStarted(_)
| EventMsg::ItemCompleted(_)
| EventMsg::UndoStarted(_)
| EventMsg::UndoCompleted(_)
| EventMsg::ExitedReviewMode(_) => {
// For now, we do not do anything extra for these
// events. Note that


@@ -4,6 +4,7 @@ use std::sync::Arc;
use codex_core::CodexConversation;
use codex_core::protocol::Op;
use codex_core::protocol::ReviewDecision;
use codex_core::protocol::SandboxCommandAssessment;
use codex_protocol::parse_command::ParsedCommand;
use mcp_types::ElicitRequest;
use mcp_types::ElicitRequestParamsRequestedSchema;
@@ -37,6 +38,8 @@ pub struct ExecApprovalElicitRequestParams {
pub codex_command: Vec<String>,
pub codex_cwd: PathBuf,
pub codex_parsed_cmd: Vec<ParsedCommand>,
#[serde(skip_serializing_if = "Option::is_none")]
pub codex_risk: Option<SandboxCommandAssessment>,
}
// TODO(mbolin): ExecApprovalResponse does not conform to ElicitResult. See:
@@ -59,6 +62,7 @@ pub(crate) async fn handle_exec_approval_request(
event_id: String,
call_id: String,
codex_parsed_cmd: Vec<ParsedCommand>,
codex_risk: Option<SandboxCommandAssessment>,
) {
let escaped_command =
shlex::try_join(command.iter().map(String::as_str)).unwrap_or_else(|_| command.join(" "));
@@ -81,6 +85,7 @@ pub(crate) async fn handle_exec_approval_request(
codex_command: command,
codex_cwd: cwd,
codex_parsed_cmd,
codex_risk,
};
let params_json = match serde_json::to_value(&params) {
Ok(value) => value,

Some files were not shown because too many files have changed in this diff