Compare commits

...

29 Commits

Author SHA1 Message Date
Ahmed Ibrahim
257d5eceff use shared tokenizer 2025-10-24 12:31:08 -07:00
Ahmed Ibrahim
56dfa60801 use shared tokenizer 2025-10-24 12:05:06 -07:00
Ahmed Ibrahim
50055c7ed5 use shared tokenizer 2025-10-24 12:02:48 -07:00
Ahmed Ibrahim
89e9bb1d42 comment 2025-10-24 11:44:05 -07:00
Ahmed Ibrahim
81d0003f0f tests 2025-10-24 11:36:36 -07:00
Ahmed Ibrahim
5d7983520b tests 2025-10-24 11:23:25 -07:00
Ahmed Ibrahim
4e7089a8ab default 2025-10-24 11:16:45 -07:00
Ahmed Ibrahim
697367cd3f bug 2025-10-24 10:12:36 -07:00
Ahmed Ibrahim
6ffbd0d4e3 bug 2025-10-24 10:08:47 -07:00
Ahmed Ibrahim
79c628a823 tests 2025-10-24 09:55:08 -07:00
Ahmed Ibrahim
9446de0923 tests 2025-10-23 23:26:02 -07:00
Ahmed Ibrahim
0f02954edb Merge branch 'main' into input-validation 2025-10-23 22:36:12 -07:00
Gabriel Peal
abccd3e367 [MCP] Update rmcp to 0.8.3 (#5542)
Picks up modelcontextprotocol/rust-sdk#497 which fixes #5208 by allowing 204 response to MCP initialize notifications instead of just 202.
2025-10-23 20:45:29 -07:00
Ahmed Ibrahim
0f4fd33ddd Moving token_info to ConversationHistory (#5581)
I want to centralize input processing and management to
`ConversationHistory`. This would need `ConversationHistory` to have
access to `token_info` (i.e. preventing adding a big input to the
history). Besides, it makes more sense to have it on
`ConversationHistory` than `state`.
2025-10-23 20:30:58 -07:00
Josh McKinney
e258f0f044 Use Option symbol for mac key hints (#5582)
## Summary
- show the Option (⌥) symbol in key hints when the TUI is built for
macOS so the shortcut text matches the platform terminology

## Testing
- cargo test -p codex-tui

------
https://chatgpt.com/codex/tasks/task_i_68fab7505530832992780a9e13fb707b
2025-10-23 20:04:15 -07:00
Ahmed Ibrahim
7a8da22d7e add file 2025-10-23 19:03:24 -07:00
Ahmed Ibrahim
8f8ca17da0 input-validation 2025-10-23 18:58:08 -07:00
Ahmed Ibrahim
8b095d3cf1 add-input-validation 2025-10-23 16:17:10 -07:00
jif-oai
a6b9471548 feat: end events on unified exec (#5551) 2025-10-23 18:51:34 +01:00
Thibault Sottiaux
3059373e06 fix: resume lookup for gitignored CODEX_HOME (#5311)
Walk the sessions tree instead of using file_search so gitignored
CODEX_HOME directories can resume sessions. Add a regression test that
covers a .gitignore'd sessions directory.

Fixes #5247
Fixes #5412

---------

Co-authored-by: Owen Lin <owen@openai.com>
2025-10-23 17:04:40 +00:00
jif-oai
0b4527146e feat: use actual tokenizer for unified_exec truncation (#5514) 2025-10-23 17:08:06 +01:00
jif-oai
6745b12427 chore: testing on apply_path (#5557) 2025-10-23 17:00:48 +01:00
Ahmed Ibrahim
f59978ed3d Handle cancelling/aborting while processing a turn (#5543)
Currently we collect all all turn items in a vector, then we add it to
the history on success. This result in losing those items on errors
including aborting `ctrl+c`.

This PR:
- Adds the ability for the tool call to handle cancellation
- bubble the turn items up to where we are recording this info

Admittedly, this logic is an ad-hoc logic that doesn't handle a lot of
error edge cases. The right thing to do is recording to the history on
the spot as `items`/`tool calls output` come. However, this isn't
possible because of having different `task_kind` that has different
`conversation_histories`. The `try_run_turn` has no idea what thread are
we using. We cannot also pass an `arc` to the `conversation_histories`
because it's a private element of `state`.

That's said, `abort` is the most common case and we should cover it
until we remove `task kind`
2025-10-23 08:47:10 -07:00
Jeremy Rose
3ab6028e80 tui: show aggregated output in display (#5539)
This shows the aggregated (stdout + stderr) buffer regardless of exit
code.

Many commands output useful / relevant info on stdout when returning a
non-zero exit code, or the same on stderr when returning an exit code of
0. Often, useful info is present on both stdout AND stderr. Also, the
model sees both. So it is confusing to see commands listed as "(no
output)" that in fact do have output, just on the stream that doesn't
match the exit status, or to see some sort of trivial output like "Tests
failed" but lacking any information about the actual failure.

As such, always display the aggregated output in the display. Transcript
mode remains unchanged as it was already displaying the text that the
model sees, which seems correct for transcript mode.
2025-10-23 08:05:08 -07:00
jif-oai
892eaff46d fix: approval issue (#5525) 2025-10-23 11:13:53 +01:00
jif-oai
8e291a1706 chore: clean handle_container_exec_with_params (#5516)
Drop `handle_container_exec_with_params` to have simpler and more
straight forward execution path
2025-10-23 09:24:01 +01:00
Owen Lin
aee321f62b [app-server] add new account method API stubs (#5527)
These are the schema definitions for the new JSON-RPC APIs associated
with accounts. These are not wired up to business logic yet and will
currently throw an internal error indicating these are unimplemented.
2025-10-22 15:36:11 -07:00
Genki Takiuchi
ed32da04d7 Fix IME submissions dropping leading digits (#4359)
- ensure paste burst flush preserves ASCII characters before IME commits
- add regression test covering digit followed by Japanese text
submission

Fixes openai/codex#4356

Co-authored-by: Josh McKinney <joshka@openai.com>
2025-10-22 22:18:17 +00:00
Owen Lin
8ae3949072 [app-server] send account/rateLimits/updated notifications (#5477)
Codex will now send an `account/rateLimits/updated` notification
whenever the user's rate limits are updated.

This is implemented by just transforming the existing TokenCount event.
2025-10-22 20:12:40 +00:00
60 changed files with 3625 additions and 829 deletions

10
codex-rs/Cargo.lock generated
View File

@@ -1066,6 +1066,7 @@ dependencies = [
"codex-rmcp-client",
"codex-utils-pty",
"codex-utils-string",
"codex-utils-tokenizer",
"core-foundation 0.9.4",
"core_test_support",
"dirs",
@@ -1633,6 +1634,7 @@ dependencies = [
"anyhow",
"assert_cmd",
"codex-core",
"codex-protocol",
"notify",
"regex-lite",
"serde_json",
@@ -4952,9 +4954,9 @@ dependencies = [
[[package]]
name = "rmcp"
version = "0.8.2"
version = "0.8.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4e35d31f89beb59c83bc31363426da25b323ce0c2e5b53c7bf29867d16ee7898"
checksum = "1fdad1258f7259fdc0f2dfc266939c82c3b5d1fd72bcde274d600cdc27e60243"
dependencies = [
"base64",
"bytes",
@@ -4986,9 +4988,9 @@ dependencies = [
[[package]]
name = "rmcp-macros"
version = "0.8.2"
version = "0.8.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d88518b38110c439a03f0f4eee40e5105d648a530711cb87f98991e3f324a664"
checksum = "ede0589a208cc7ce81d1be68aa7e74b917fcd03c81528408bab0457e187dcd9b"
dependencies = [
"darling 0.21.3",
"proc-macro2",

View File

@@ -153,7 +153,7 @@ ratatui = "0.29.0"
ratatui-macros = "0.6.0"
regex-lite = "0.1.7"
reqwest = "0.12"
rmcp = { version = "0.8.2", default-features = false }
rmcp = { version = "0.8.3", default-features = false }
schemars = "0.8.22"
seccompiler = "0.5.0"
sentry = "0.34.0"

View File

@@ -5,6 +5,7 @@ use crate::JSONRPCNotification;
use crate::JSONRPCRequest;
use crate::RequestId;
use codex_protocol::ConversationId;
use codex_protocol::account::Account;
use codex_protocol::config_types::ForcedLoginMethod;
use codex_protocol::config_types::ReasoningEffort;
use codex_protocol::config_types::ReasoningSummary;
@@ -93,6 +94,43 @@ macro_rules! client_request_definitions {
}
client_request_definitions! {
/// NEW APIs
#[serde(rename = "model/list")]
#[ts(rename = "model/list")]
ListModels {
params: ListModelsParams,
response: ListModelsResponse,
},
#[serde(rename = "account/login")]
#[ts(rename = "account/login")]
LoginAccount {
params: LoginAccountParams,
response: LoginAccountResponse,
},
#[serde(rename = "account/logout")]
#[ts(rename = "account/logout")]
LogoutAccount {
params: #[ts(type = "undefined")] #[serde(skip_serializing_if = "Option::is_none")] Option<()>,
response: LogoutAccountResponse,
},
#[serde(rename = "account/rateLimits/read")]
#[ts(rename = "account/rateLimits/read")]
GetAccountRateLimits {
params: #[ts(type = "undefined")] #[serde(skip_serializing_if = "Option::is_none")] Option<()>,
response: GetAccountRateLimitsResponse,
},
#[serde(rename = "account/read")]
#[ts(rename = "account/read")]
GetAccount {
params: #[ts(type = "undefined")] #[serde(skip_serializing_if = "Option::is_none")] Option<()>,
response: Option<Account>,
},
/// DEPRECATED APIs below
Initialize {
params: InitializeParams,
response: InitializeResponse,
@@ -106,13 +144,6 @@ client_request_definitions! {
params: ListConversationsParams,
response: ListConversationsResponse,
},
#[serde(rename = "model/list")]
#[ts(rename = "model/list")]
/// List available Codex models along with display metadata.
ListModels {
params: ListModelsParams,
response: ListModelsResponse,
},
/// Resume a recorded Codex conversation from a rollout file.
ResumeConversation {
params: ResumeConversationParams,
@@ -191,12 +222,6 @@ client_request_definitions! {
params: ExecOneOffCommandParams,
response: ExecOneOffCommandResponse,
},
#[serde(rename = "account/rateLimits/read")]
#[ts(rename = "account/rateLimits/read")]
GetAccountRateLimits {
params: #[ts(type = "undefined")] #[serde(skip_serializing_if = "Option::is_none")] Option<()>,
response: GetAccountRateLimitsResponse,
},
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, Default, JsonSchema, TS)]
@@ -352,6 +377,38 @@ pub struct ListModelsResponse {
pub next_cursor: Option<String>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(tag = "type")]
#[ts(tag = "type")]
pub enum LoginAccountParams {
#[serde(rename = "apiKey")]
#[ts(rename = "apiKey")]
ApiKey {
#[serde(rename = "apiKey")]
#[ts(rename = "apiKey")]
api_key: String,
},
#[serde(rename = "chatgpt")]
#[ts(rename = "chatgpt")]
ChatGpt,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct LoginAccountResponse {
/// Only set if the login method is ChatGPT.
#[schemars(with = "String")]
pub login_id: Option<Uuid>,
/// URL the client should open in a browser to initiate the OAuth flow.
/// Only set if the login method is ChatGPT.
pub auth_url: Option<String>,
}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct LogoutAccountResponse {}
#[derive(Serialize, Deserialize, Debug, Clone, PartialEq, JsonSchema, TS)]
#[serde(rename_all = "camelCase")]
pub struct ResumeConversationParams {
@@ -875,6 +932,13 @@ pub struct AuthStatusChangeNotification {
#[serde(tag = "method", content = "params", rename_all = "camelCase")]
#[strum(serialize_all = "camelCase")]
pub enum ServerNotification {
/// NEW NOTIFICATIONS
#[serde(rename = "account/rateLimits/updated")]
#[ts(rename = "account/rateLimits/updated")]
#[strum(serialize = "account/rateLimits/updated")]
AccountRateLimitsUpdated(RateLimitSnapshot),
/// DEPRECATED NOTIFICATIONS below
/// Authentication status changed
AuthStatusChange(AuthStatusChangeNotification),
@@ -888,6 +952,7 @@ pub enum ServerNotification {
impl ServerNotification {
pub fn to_params(self) -> Result<serde_json::Value, serde_json::Error> {
match self {
ServerNotification::AccountRateLimitsUpdated(params) => serde_json::to_value(params),
ServerNotification::AuthStatusChange(params) => serde_json::to_value(params),
ServerNotification::LoginChatGptComplete(params) => serde_json::to_value(params),
ServerNotification::SessionConfigured(params) => serde_json::to_value(params),
@@ -1043,16 +1108,89 @@ mod tests {
Ok(())
}
#[test]
fn serialize_account_login_api_key() -> Result<()> {
let request = ClientRequest::LoginAccount {
request_id: RequestId::Integer(2),
params: LoginAccountParams::ApiKey {
api_key: "secret".to_string(),
},
};
assert_eq!(
json!({
"method": "account/login",
"id": 2,
"params": {
"type": "apiKey",
"apiKey": "secret"
}
}),
serde_json::to_value(&request)?,
);
Ok(())
}
#[test]
fn serialize_account_login_chatgpt() -> Result<()> {
let request = ClientRequest::LoginAccount {
request_id: RequestId::Integer(3),
params: LoginAccountParams::ChatGpt,
};
assert_eq!(
json!({
"method": "account/login",
"id": 3,
"params": {
"type": "chatgpt"
}
}),
serde_json::to_value(&request)?,
);
Ok(())
}
#[test]
fn serialize_account_logout() -> Result<()> {
let request = ClientRequest::LogoutAccount {
request_id: RequestId::Integer(4),
params: None,
};
assert_eq!(
json!({
"method": "account/logout",
"id": 4,
}),
serde_json::to_value(&request)?,
);
Ok(())
}
#[test]
fn serialize_get_account() -> Result<()> {
let request = ClientRequest::GetAccount {
request_id: RequestId::Integer(5),
params: None,
};
assert_eq!(
json!({
"method": "account/read",
"id": 5,
}),
serde_json::to_value(&request)?,
);
Ok(())
}
#[test]
fn serialize_list_models() -> Result<()> {
let request = ClientRequest::ListModels {
request_id: RequestId::Integer(2),
request_id: RequestId::Integer(6),
params: ListModelsParams::default(),
};
assert_eq!(
json!({
"method": "model/list",
"id": 2,
"id": 6,
"params": {}
}),
serde_json::to_value(&request)?,

View File

@@ -176,6 +176,27 @@ impl CodexMessageProcessor {
ClientRequest::ListModels { request_id, params } => {
self.list_models(request_id, params).await;
}
ClientRequest::LoginAccount {
request_id,
params: _,
} => {
self.send_unimplemented_error(request_id, "account/login")
.await;
}
ClientRequest::LogoutAccount {
request_id,
params: _,
} => {
self.send_unimplemented_error(request_id, "account/logout")
.await;
}
ClientRequest::GetAccount {
request_id,
params: _,
} => {
self.send_unimplemented_error(request_id, "account/read")
.await;
}
ClientRequest::ResumeConversation { request_id, params } => {
self.handle_resume_conversation(request_id, params).await;
}
@@ -257,6 +278,15 @@ impl CodexMessageProcessor {
}
}
async fn send_unimplemented_error(&self, request_id: RequestId, method: &str) {
let error = JSONRPCErrorError {
code: INTERNAL_ERROR_CODE,
message: format!("{method} is not implemented yet"),
data: None,
};
self.outgoing.send_error(request_id, error).await;
}
async fn login_api_key(&mut self, request_id: RequestId, params: LoginApiKeyParams) {
if matches!(
self.config.forced_login_method,
@@ -1436,6 +1466,15 @@ async fn apply_bespoke_event_handling(
on_exec_approval_response(event_id, rx, conversation).await;
});
}
EventMsg::TokenCount(token_count_event) => {
if let Some(rate_limits) = token_count_event.rate_limits {
outgoing
.send_server_notification(ServerNotification::AccountRateLimitsUpdated(
rate_limits,
))
.await;
}
}
// If this is a TurnAborted, reply to any pending interrupt requests.
EventMsg::TurnAborted(turn_aborted_event) => {
let pending = {

View File

@@ -46,6 +46,7 @@ pub(crate) async fn run_fuzzy_file_search(
threads,
cancel_flag,
COMPUTE_INDICES,
true,
) {
Ok(res) => Ok((root, res)),
Err(err) => Err((root, err)),

View File

@@ -142,6 +142,8 @@ pub(crate) struct OutgoingError {
#[cfg(test)]
mod tests {
use codex_app_server_protocol::LoginChatGptCompleteNotification;
use codex_protocol::protocol::RateLimitSnapshot;
use codex_protocol::protocol::RateLimitWindow;
use pretty_assertions::assert_eq;
use serde_json::json;
use uuid::Uuid;
@@ -171,4 +173,34 @@ mod tests {
"ensure the strum macros serialize the method field correctly"
);
}
#[test]
fn verify_account_rate_limits_notification_serialization() {
let notification = ServerNotification::AccountRateLimitsUpdated(RateLimitSnapshot {
primary: Some(RateLimitWindow {
used_percent: 25.0,
window_minutes: Some(15),
resets_at: Some(123),
}),
secondary: None,
});
let jsonrpc_notification = OutgoingMessage::AppServerNotification(notification);
assert_eq!(
json!({
"method": "account/rateLimits/updated",
"params": {
"primary": {
"used_percent": 25.0,
"window_minutes": 15,
"resets_at": 123,
},
"secondary": null,
},
}),
serde_json::to_value(jsonrpc_notification)
.expect("ensure the notification serializes correctly"),
"ensure the notification serializes correctly"
);
}
}

View File

@@ -1 +1,3 @@
mod cli;
#[cfg(not(target_os = "windows"))]
mod tool;

View File

@@ -0,0 +1,257 @@
use assert_cmd::Command;
use pretty_assertions::assert_eq;
use std::fs;
use std::path::Path;
use tempfile::tempdir;
fn run_apply_patch_in_dir(dir: &Path, patch: &str) -> anyhow::Result<assert_cmd::assert::Assert> {
let mut cmd = Command::cargo_bin("apply_patch")?;
cmd.current_dir(dir);
Ok(cmd.arg(patch).assert())
}
fn apply_patch_command(dir: &Path) -> anyhow::Result<Command> {
let mut cmd = Command::cargo_bin("apply_patch")?;
cmd.current_dir(dir);
Ok(cmd)
}
#[test]
fn test_apply_patch_cli_applies_multiple_operations() -> anyhow::Result<()> {
let tmp = tempdir()?;
let modify_path = tmp.path().join("modify.txt");
let delete_path = tmp.path().join("delete.txt");
fs::write(&modify_path, "line1\nline2\n")?;
fs::write(&delete_path, "obsolete\n")?;
let patch = "*** Begin Patch\n*** Add File: nested/new.txt\n+created\n*** Delete File: delete.txt\n*** Update File: modify.txt\n@@\n-line2\n+changed\n*** End Patch";
run_apply_patch_in_dir(tmp.path(), patch)?.success().stdout(
"Success. Updated the following files:\nA nested/new.txt\nM modify.txt\nD delete.txt\n",
);
assert_eq!(
fs::read_to_string(tmp.path().join("nested/new.txt"))?,
"created\n"
);
assert_eq!(fs::read_to_string(&modify_path)?, "line1\nchanged\n");
assert!(!delete_path.exists());
Ok(())
}
#[test]
fn test_apply_patch_cli_applies_multiple_chunks() -> anyhow::Result<()> {
let tmp = tempdir()?;
let target_path = tmp.path().join("multi.txt");
fs::write(&target_path, "line1\nline2\nline3\nline4\n")?;
let patch = "*** Begin Patch\n*** Update File: multi.txt\n@@\n-line2\n+changed2\n@@\n-line4\n+changed4\n*** End Patch";
run_apply_patch_in_dir(tmp.path(), patch)?
.success()
.stdout("Success. Updated the following files:\nM multi.txt\n");
assert_eq!(
fs::read_to_string(&target_path)?,
"line1\nchanged2\nline3\nchanged4\n"
);
Ok(())
}
#[test]
fn test_apply_patch_cli_moves_file_to_new_directory() -> anyhow::Result<()> {
let tmp = tempdir()?;
let original_path = tmp.path().join("old/name.txt");
let new_path = tmp.path().join("renamed/dir/name.txt");
fs::create_dir_all(original_path.parent().expect("parent should exist"))?;
fs::write(&original_path, "old content\n")?;
let patch = "*** Begin Patch\n*** Update File: old/name.txt\n*** Move to: renamed/dir/name.txt\n@@\n-old content\n+new content\n*** End Patch";
run_apply_patch_in_dir(tmp.path(), patch)?
.success()
.stdout("Success. Updated the following files:\nM renamed/dir/name.txt\n");
assert!(!original_path.exists());
assert_eq!(fs::read_to_string(&new_path)?, "new content\n");
Ok(())
}
#[test]
fn test_apply_patch_cli_rejects_empty_patch() -> anyhow::Result<()> {
let tmp = tempdir()?;
apply_patch_command(tmp.path())?
.arg("*** Begin Patch\n*** End Patch")
.assert()
.failure()
.stderr("No files were modified.\n");
Ok(())
}
#[test]
fn test_apply_patch_cli_reports_missing_context() -> anyhow::Result<()> {
let tmp = tempdir()?;
let target_path = tmp.path().join("modify.txt");
fs::write(&target_path, "line1\nline2\n")?;
apply_patch_command(tmp.path())?
.arg("*** Begin Patch\n*** Update File: modify.txt\n@@\n-missing\n+changed\n*** End Patch")
.assert()
.failure()
.stderr("Failed to find expected lines in modify.txt:\nmissing\n");
assert_eq!(fs::read_to_string(&target_path)?, "line1\nline2\n");
Ok(())
}
#[test]
fn test_apply_patch_cli_rejects_missing_file_delete() -> anyhow::Result<()> {
let tmp = tempdir()?;
apply_patch_command(tmp.path())?
.arg("*** Begin Patch\n*** Delete File: missing.txt\n*** End Patch")
.assert()
.failure()
.stderr("Failed to delete file missing.txt\n");
Ok(())
}
#[test]
fn test_apply_patch_cli_rejects_empty_update_hunk() -> anyhow::Result<()> {
let tmp = tempdir()?;
apply_patch_command(tmp.path())?
.arg("*** Begin Patch\n*** Update File: foo.txt\n*** End Patch")
.assert()
.failure()
.stderr("Invalid patch hunk on line 2: Update file hunk for path 'foo.txt' is empty\n");
Ok(())
}
#[test]
fn test_apply_patch_cli_requires_existing_file_for_update() -> anyhow::Result<()> {
let tmp = tempdir()?;
apply_patch_command(tmp.path())?
.arg("*** Begin Patch\n*** Update File: missing.txt\n@@\n-old\n+new\n*** End Patch")
.assert()
.failure()
.stderr(
"Failed to read file to update missing.txt: No such file or directory (os error 2)\n",
);
Ok(())
}
#[test]
fn test_apply_patch_cli_move_overwrites_existing_destination() -> anyhow::Result<()> {
let tmp = tempdir()?;
let original_path = tmp.path().join("old/name.txt");
let destination = tmp.path().join("renamed/dir/name.txt");
fs::create_dir_all(original_path.parent().expect("parent should exist"))?;
fs::create_dir_all(destination.parent().expect("parent should exist"))?;
fs::write(&original_path, "from\n")?;
fs::write(&destination, "existing\n")?;
run_apply_patch_in_dir(
tmp.path(),
"*** Begin Patch\n*** Update File: old/name.txt\n*** Move to: renamed/dir/name.txt\n@@\n-from\n+new\n*** End Patch",
)?
.success()
.stdout("Success. Updated the following files:\nM renamed/dir/name.txt\n");
assert!(!original_path.exists());
assert_eq!(fs::read_to_string(&destination)?, "new\n");
Ok(())
}
#[test]
fn test_apply_patch_cli_add_overwrites_existing_file() -> anyhow::Result<()> {
let tmp = tempdir()?;
let path = tmp.path().join("duplicate.txt");
fs::write(&path, "old content\n")?;
run_apply_patch_in_dir(
tmp.path(),
"*** Begin Patch\n*** Add File: duplicate.txt\n+new content\n*** End Patch",
)?
.success()
.stdout("Success. Updated the following files:\nA duplicate.txt\n");
assert_eq!(fs::read_to_string(&path)?, "new content\n");
Ok(())
}
#[test]
fn test_apply_patch_cli_delete_directory_fails() -> anyhow::Result<()> {
let tmp = tempdir()?;
fs::create_dir(tmp.path().join("dir"))?;
apply_patch_command(tmp.path())?
.arg("*** Begin Patch\n*** Delete File: dir\n*** End Patch")
.assert()
.failure()
.stderr("Failed to delete file dir\n");
Ok(())
}
#[test]
fn test_apply_patch_cli_rejects_invalid_hunk_header() -> anyhow::Result<()> {
let tmp = tempdir()?;
apply_patch_command(tmp.path())?
.arg("*** Begin Patch\n*** Frobnicate File: foo\n*** End Patch")
.assert()
.failure()
.stderr("Invalid patch hunk on line 2: '*** Frobnicate File: foo' is not a valid hunk header. Valid hunk headers: '*** Add File: {path}', '*** Delete File: {path}', '*** Update File: {path}'\n");
Ok(())
}
#[test]
fn test_apply_patch_cli_updates_file_appends_trailing_newline() -> anyhow::Result<()> {
let tmp = tempdir()?;
let target_path = tmp.path().join("no_newline.txt");
fs::write(&target_path, "no newline at end")?;
run_apply_patch_in_dir(
tmp.path(),
"*** Begin Patch\n*** Update File: no_newline.txt\n@@\n-no newline at end\n+first line\n+second line\n*** End Patch",
)?
.success()
.stdout("Success. Updated the following files:\nM no_newline.txt\n");
let contents = fs::read_to_string(&target_path)?;
assert!(contents.ends_with('\n'));
assert_eq!(contents, "first line\nsecond line\n");
Ok(())
}
#[test]
fn test_apply_patch_cli_failure_after_partial_success_leaves_changes() -> anyhow::Result<()> {
let tmp = tempdir()?;
let new_file = tmp.path().join("created.txt");
apply_patch_command(tmp.path())?
.arg("*** Begin Patch\n*** Add File: created.txt\n+hello\n*** Update File: missing.txt\n@@\n-old\n+new\n*** End Patch")
.assert()
.failure()
.stdout("")
.stderr("Failed to read file to update missing.txt: No such file or directory (os error 2)\n");
assert_eq!(fs::read_to_string(&new_file)?, "hello\n");
Ok(())
}

View File

@@ -28,6 +28,7 @@ codex-rmcp-client = { workspace = true }
codex-async-utils = { workspace = true }
codex-utils-string = { workspace = true }
codex-utils-pty = { workspace = true }
codex-utils-tokenizer = { workspace = true }
dirs = { workspace = true }
dunce = { workspace = true }
env-flags = { workspace = true }

View File

@@ -10,6 +10,7 @@ use crate::function_tool::FunctionCallError;
use crate::mcp::auth::McpAuthStatusEntry;
use crate::parse_command::parse_command;
use crate::parse_turn_item;
use crate::response_processing::process_items;
use crate::review_format::format_review_findings_block;
use crate::terminal;
use crate::user_notification::UserNotifier;
@@ -58,6 +59,7 @@ use crate::config::Config;
use crate::config_types::McpServerTransportConfig;
use crate::config_types::ShellEnvironmentPolicy;
use crate::conversation_history::ConversationHistory;
use crate::conversation_history::prefetch_tokenizer_in_background;
use crate::environment_context::EnvironmentContext;
use crate::error::CodexErr;
use crate::error::Result as CodexResult;
@@ -158,6 +160,8 @@ impl Codex {
conversation_history: InitialHistory,
session_source: SessionSource,
) -> CodexResult<CodexSpawnOk> {
// Start loading the tokenizer in the background so we don't block later.
prefetch_tokenizer_in_background();
let (tx_sub, rx_sub) = async_channel::bounded(SUBMISSION_CHANNEL_CAPACITY);
let (tx_event, rx_event) = async_channel::unbounded();
@@ -567,7 +571,9 @@ impl Session {
// Dispatch the SessionConfiguredEvent first and then report any errors.
// If resuming, include converted initial messages in the payload so UIs can render them immediately.
let initial_messages = initial_history.get_event_msgs();
sess.record_initial_history(initial_history).await;
sess.record_initial_history(initial_history)
.await
.map_err(anyhow::Error::new)?;
let events = std::iter::once(Event {
id: INITIAL_SUBMIT_ID.to_owned(),
@@ -600,13 +606,16 @@ impl Session {
format!("auto-compact-{id}")
}
async fn record_initial_history(&self, conversation_history: InitialHistory) {
async fn record_initial_history(
&self,
conversation_history: InitialHistory,
) -> CodexResult<()> {
let turn_context = self.new_turn(SessionSettingsUpdate::default()).await;
match conversation_history {
InitialHistory::New => {
// Build and record initial items (user instructions + environment context)
let items = self.build_initial_context(&turn_context);
self.record_conversation_items(&items).await;
self.record_conversation_items(&items).await?;
}
InitialHistory::Resumed(_) | InitialHistory::Forked(_) => {
let rollout_items = conversation_history.get_rollout_items();
@@ -614,9 +623,9 @@ impl Session {
// Always add response items to conversation history
let reconstructed_history =
self.reconstruct_history_from_rollout(&turn_context, &rollout_items);
self.reconstruct_history_from_rollout(&turn_context, &rollout_items)?;
if !reconstructed_history.is_empty() {
self.record_into_history(&reconstructed_history).await;
self.record_into_history(&reconstructed_history).await?;
}
// If persisting, persist all rollout items as-is (recorder filters)
@@ -625,6 +634,7 @@ impl Session {
}
}
}
Ok(())
}
pub(crate) async fn update_settings(&self, updates: SessionSettingsUpdate) {
@@ -855,21 +865,25 @@ impl Session {
/// Records input items: always append to conversation history and
/// persist these response items to rollout.
async fn record_conversation_items(&self, items: &[ResponseItem]) {
self.record_into_history(items).await;
pub(crate) async fn record_conversation_items(
&self,
items: &[ResponseItem],
) -> CodexResult<()> {
self.record_into_history(items).await?;
self.persist_rollout_response_items(items).await;
Ok(())
}
fn reconstruct_history_from_rollout(
&self,
turn_context: &TurnContext,
rollout_items: &[RolloutItem],
) -> Vec<ResponseItem> {
) -> CodexResult<Vec<ResponseItem>> {
let mut history = ConversationHistory::new();
for item in rollout_items {
match item {
RolloutItem::ResponseItem(response_item) => {
history.record_items(std::iter::once(response_item));
history.record_items(std::iter::once(response_item))?;
}
RolloutItem::Compacted(compacted) => {
let snapshot = history.get_history();
@@ -884,13 +898,14 @@ impl Session {
_ => {}
}
}
history.get_history()
Ok(history.get_history())
}
/// Append ResponseItems to the in-memory conversation history only.
async fn record_into_history(&self, items: &[ResponseItem]) {
async fn record_into_history(&self, items: &[ResponseItem]) -> CodexResult<()> {
let mut state = self.state.lock().await;
state.record_items(items.iter());
state.record_items(items.iter())?;
Ok(())
}
async fn replace_history(&self, items: Vec<ResponseItem>) {
@@ -999,11 +1014,11 @@ impl Session {
&self,
turn_context: &TurnContext,
response_input: &ResponseInputItem,
) {
) -> CodexResult<()> {
let response_item: ResponseItem = response_input.clone().into();
// Add to conversation history and persist response item to rollout
self.record_conversation_items(std::slice::from_ref(&response_item))
.await;
.await?;
// Derive user message events and persist only UserMessage to rollout
let turn_item = parse_turn_item(&response_item);
@@ -1012,6 +1027,7 @@ impl Session {
self.emit_turn_item_started_completed(turn_context, item, false)
.await;
}
Ok(())
}
/// Helper that emits a BackgroundEvent with the given message. This keeps
@@ -1192,9 +1208,17 @@ async fn submission_loop(sess: Arc<Session>, config: Arc<Config>, rx_sub: Receiv
if let Err(items) = sess.inject_input(items).await {
if let Some(env_item) = sess
.build_environment_update_item(previous_context.as_ref(), &current_context)
&& let Err(err) = sess
.record_conversation_items(std::slice::from_ref(&env_item))
.await
{
sess.record_conversation_items(std::slice::from_ref(&env_item))
.await;
sess.send_event(
current_context.as_ref(),
EventMsg::Error(ErrorEvent {
message: err.to_string(),
}),
)
.await;
}
sess.spawn_task(Arc::clone(&current_context), items, RegularTask)
@@ -1507,9 +1531,9 @@ pub(crate) async fn run_task(
input: Vec<UserInput>,
task_kind: TaskKind,
cancellation_token: CancellationToken,
) -> Option<String> {
) -> CodexResult<Option<String>> {
if input.is_empty() {
return None;
return Ok(None);
}
let event = EventMsg::TaskStarted(TaskStartedEvent {
model_context_window: turn_context.client.get_model_context_window(),
@@ -1526,11 +1550,11 @@ pub(crate) async fn run_task(
if is_review_mode {
// Seed review threads with environment context so the model knows the working directory.
review_thread_history
.record_items(sess.build_initial_context(turn_context.as_ref()).iter());
review_thread_history.record_items(std::iter::once(&initial_input_for_turn.into()));
.record_items(sess.build_initial_context(turn_context.as_ref()).iter())?;
review_thread_history.record_items(std::iter::once(&initial_input_for_turn.into()))?;
} else {
sess.record_input_and_rollout_usermsg(turn_context.as_ref(), &initial_input_for_turn)
.await;
.await?;
}
let mut last_agent_message: Option<String> = None;
@@ -1562,11 +1586,11 @@ pub(crate) async fn run_task(
// represents an append-only log without duplicates.
let turn_input: Vec<ResponseItem> = if is_review_mode {
if !pending_input.is_empty() {
review_thread_history.record_items(&pending_input);
review_thread_history.record_items(&pending_input)?;
}
review_thread_history.get_history()
} else {
sess.record_conversation_items(&pending_input).await;
sess.record_conversation_items(&pending_input).await?;
sess.history_snapshot().await
};
@@ -1608,109 +1632,13 @@ pub(crate) async fn run_task(
let token_limit_reached = total_usage_tokens
.map(|tokens| tokens >= limit)
.unwrap_or(false);
let mut items_to_record_in_conversation_history = Vec::<ResponseItem>::new();
let mut responses = Vec::<ResponseInputItem>::new();
for processed_response_item in processed_items {
let ProcessedResponseItem { item, response } = processed_response_item;
match (&item, &response) {
(ResponseItem::Message { role, .. }, None) if role == "assistant" => {
// If the model returned a message, we need to record it.
items_to_record_in_conversation_history.push(item);
}
(
ResponseItem::LocalShellCall { .. },
Some(ResponseInputItem::FunctionCallOutput { call_id, output }),
) => {
items_to_record_in_conversation_history.push(item);
items_to_record_in_conversation_history.push(
ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output: output.clone(),
},
);
}
(
ResponseItem::FunctionCall { .. },
Some(ResponseInputItem::FunctionCallOutput { call_id, output }),
) => {
items_to_record_in_conversation_history.push(item);
items_to_record_in_conversation_history.push(
ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output: output.clone(),
},
);
}
(
ResponseItem::CustomToolCall { .. },
Some(ResponseInputItem::CustomToolCallOutput { call_id, output }),
) => {
items_to_record_in_conversation_history.push(item);
items_to_record_in_conversation_history.push(
ResponseItem::CustomToolCallOutput {
call_id: call_id.clone(),
output: output.clone(),
},
);
}
(
ResponseItem::FunctionCall { .. },
Some(ResponseInputItem::McpToolCallOutput { call_id, result }),
) => {
items_to_record_in_conversation_history.push(item);
let output = match result {
Ok(call_tool_result) => {
convert_call_tool_result_to_function_call_output_payload(
call_tool_result,
)
}
Err(err) => FunctionCallOutputPayload {
content: err.clone(),
success: Some(false),
},
};
items_to_record_in_conversation_history.push(
ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output,
},
);
}
(
ResponseItem::Reasoning {
id,
summary,
content,
encrypted_content,
},
None,
) => {
items_to_record_in_conversation_history.push(ResponseItem::Reasoning {
id: id.clone(),
summary: summary.clone(),
content: content.clone(),
encrypted_content: encrypted_content.clone(),
});
}
_ => {
warn!("Unexpected response item: {item:?} with response: {response:?}");
}
};
if let Some(response) = response {
responses.push(response);
}
}
// Only attempt to take the lock if there is something to record.
if !items_to_record_in_conversation_history.is_empty() {
if is_review_mode {
review_thread_history
.record_items(items_to_record_in_conversation_history.iter());
} else {
sess.record_conversation_items(&items_to_record_in_conversation_history)
.await;
}
}
let (responses, items_to_record_in_conversation_history) = process_items(
processed_items,
is_review_mode,
&mut review_thread_history,
&sess,
)
.await?;
if token_limit_reached {
if auto_compact_recently_attempted {
@@ -1727,7 +1655,8 @@ pub(crate) async fn run_task(
break;
}
auto_compact_recently_attempted = true;
compact::run_inline_auto_compact_task(sess.clone(), turn_context.clone()).await;
compact::run_inline_auto_compact_task(sess.clone(), turn_context.clone())
.await?;
continue;
}
@@ -1749,7 +1678,16 @@ pub(crate) async fn run_task(
}
continue;
}
Err(CodexErr::TurnAborted) => {
Err(CodexErr::TurnAborted {
dangling_artifacts: processed_items,
}) => {
process_items(
processed_items,
is_review_mode,
&mut review_thread_history,
&sess,
)
.await?;
// Aborted turn is reported via a different event.
break;
}
@@ -1778,10 +1716,10 @@ pub(crate) async fn run_task(
Arc::clone(&turn_context),
last_agent_message.as_deref().map(parse_review_output_event),
)
.await;
.await?;
}
last_agent_message
Ok(last_agent_message)
}
/// Parse the review output; when not valid JSON, build a structured
@@ -1850,7 +1788,13 @@ async fn run_turn(
.await
{
Ok(output) => return Ok(output),
Err(CodexErr::TurnAborted) => return Err(CodexErr::TurnAborted),
Err(CodexErr::TurnAborted {
dangling_artifacts: processed_items,
}) => {
return Err(CodexErr::TurnAborted {
dangling_artifacts: processed_items,
});
}
Err(CodexErr::Interrupted) => return Err(CodexErr::Interrupted),
Err(CodexErr::EnvVar(var)) => return Err(CodexErr::EnvVar(var)),
Err(e @ CodexErr::Fatal(_)) => return Err(e),
@@ -1903,9 +1847,9 @@ async fn run_turn(
/// "handled" such that it produces a `ResponseInputItem` that needs to be
/// sent back to the model on the next turn.
#[derive(Debug)]
pub(crate) struct ProcessedResponseItem {
pub(crate) item: ResponseItem,
pub(crate) response: Option<ResponseInputItem>,
pub struct ProcessedResponseItem {
pub item: ResponseItem,
pub response: Option<ResponseInputItem>,
}
#[derive(Debug)]
@@ -1954,7 +1898,15 @@ async fn try_run_turn(
// Poll the next item from the model stream. We must inspect *both* Ok and Err
// cases so that transient stream failures (e.g., dropped SSE connection before
// `response.completed`) bubble up and trigger the caller's retry logic.
let event = stream.next().or_cancel(&cancellation_token).await?;
let event = match stream.next().or_cancel(&cancellation_token).await {
Ok(event) => event,
Err(codex_async_utils::CancelErr::Cancelled) => {
let processed_items = output.try_collect().await?;
return Err(CodexErr::TurnAborted {
dangling_artifacts: processed_items,
});
}
};
let event = match event {
Some(res) => res?,
@@ -1978,7 +1930,8 @@ async fn try_run_turn(
let payload_preview = call.payload.log_payload().into_owned();
tracing::info!("ToolCall: {} {}", call.tool_name, payload_preview);
let response = tool_runtime.handle_tool_call(call);
let response =
tool_runtime.handle_tool_call(call, cancellation_token.child_token());
output.push_back(
async move {
@@ -2060,12 +2013,7 @@ async fn try_run_turn(
} => {
sess.update_token_usage_info(turn_context.as_ref(), token_usage.as_ref())
.await;
let processed_items = output
.try_collect()
.or_cancel(&cancellation_token)
.await??;
let processed_items = output.try_collect().await?;
let unified_diff = {
let mut tracker = turn_diff_tracker.lock().await;
tracker.get_unified_diff()
@@ -2169,7 +2117,7 @@ pub(super) fn get_last_assistant_message_from_turn(responses: &[ResponseItem]) -
}
})
}
fn convert_call_tool_result_to_function_call_output_payload(
pub(crate) fn convert_call_tool_result_to_function_call_output_payload(
call_tool_result: &CallToolResult,
) -> FunctionCallOutputPayload {
let CallToolResult {
@@ -2210,7 +2158,7 @@ pub(crate) async fn exit_review_mode(
session: Arc<Session>,
turn_context: Arc<TurnContext>,
review_output: Option<ReviewOutputEvent>,
) {
) -> CodexResult<()> {
let event = EventMsg::ExitedReviewMode(ExitedReviewModeEvent {
review_output: review_output.clone(),
});
@@ -2253,7 +2201,8 @@ pub(crate) async fn exit_review_mode(
role: "user".to_string(),
content: vec![ContentItem::InputText { text: user_message }],
}])
.await;
.await?;
Ok(())
}
fn mcp_init_error_display(
@@ -2301,6 +2250,7 @@ mod tests {
use crate::config::ConfigToml;
use crate::config_types::McpServerConfig;
use crate::config_types::McpServerTransportConfig;
use crate::error::Result as CodexResult;
use crate::exec::ExecToolCallOutput;
use crate::mcp::auth::McpAuthStatusEntry;
use crate::tools::format_exec_output_str;
@@ -2316,7 +2266,11 @@ mod tests {
use crate::tools::MODEL_FORMAT_MAX_LINES;
use crate::tools::MODEL_FORMAT_TAIL_LINES;
use crate::tools::ToolRouter;
use crate::tools::handle_container_exec_with_params;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
use crate::tools::context::ToolPayload;
use crate::tools::handlers::ShellHandler;
use crate::tools::registry::ToolHandler;
use crate::turn_diff_tracker::TurnDiffTracker;
use codex_app_server_protocol::AuthMode;
use codex_protocol::models::ContentItem;
@@ -2337,9 +2291,12 @@ mod tests {
#[test]
fn reconstruct_history_matches_live_compactions() {
let (session, turn_context) = make_session_and_context();
let (rollout_items, expected) = sample_rollout(&session, &turn_context);
let (rollout_items, expected) =
sample_rollout(&session, &turn_context).expect("sample rollout");
let reconstructed = session.reconstruct_history_from_rollout(&turn_context, &rollout_items);
let reconstructed = session
.reconstruct_history_from_rollout(&turn_context, &rollout_items)
.expect("reconstruct history");
assert_eq!(expected, reconstructed);
}
@@ -2347,15 +2304,19 @@ mod tests {
#[test]
fn record_initial_history_reconstructs_resumed_transcript() {
let (session, turn_context) = make_session_and_context();
let (rollout_items, expected) = sample_rollout(&session, &turn_context);
let (rollout_items, expected) =
sample_rollout(&session, &turn_context).expect("sample rollout");
tokio_test::block_on(session.record_initial_history(InitialHistory::Resumed(
ResumedHistory {
conversation_id: ConversationId::default(),
history: rollout_items,
rollout_path: PathBuf::from("/tmp/resume.jsonl"),
},
)));
tokio_test::block_on(async {
session
.record_initial_history(InitialHistory::Resumed(ResumedHistory {
conversation_id: ConversationId::default(),
history: rollout_items,
rollout_path: PathBuf::from("/tmp/resume.jsonl"),
}))
.await
.expect("record resumed history");
});
let actual = tokio_test::block_on(async { session.state.lock().await.history_snapshot() });
assert_eq!(expected, actual);
@@ -2364,9 +2325,15 @@ mod tests {
#[test]
fn record_initial_history_reconstructs_forked_transcript() {
let (session, turn_context) = make_session_and_context();
let (rollout_items, expected) = sample_rollout(&session, &turn_context);
let (rollout_items, expected) =
sample_rollout(&session, &turn_context).expect("sample rollout");
tokio_test::block_on(session.record_initial_history(InitialHistory::Forked(rollout_items)));
tokio_test::block_on(async {
session
.record_initial_history(InitialHistory::Forked(rollout_items))
.await
.expect("record forked history");
});
let actual = tokio_test::block_on(async { session.state.lock().await.history_snapshot() });
assert_eq!(expected, actual);
@@ -2726,10 +2693,10 @@ mod tests {
_ctx: Arc<TurnContext>,
_input: Vec<UserInput>,
cancellation_token: CancellationToken,
) -> Option<String> {
) -> CodexResult<Option<String>> {
if self.listen_to_cancellation_token {
cancellation_token.cancelled().await;
return None;
return Ok(None);
}
loop {
sleep(Duration::from_secs(60)).await;
@@ -2738,7 +2705,7 @@ mod tests {
async fn abort(&self, session: Arc<SessionTaskContext>, ctx: Arc<TurnContext>) {
if let TaskKind::Review = self.kind {
exit_review_mode(session.clone_session(), ctx, None).await;
let _ = exit_review_mode(session.clone_session(), ctx, None).await;
}
}
}
@@ -2894,7 +2861,7 @@ mod tests {
fn sample_rollout(
session: &Session,
turn_context: &TurnContext,
) -> (Vec<RolloutItem>, Vec<ResponseItem>) {
) -> CodexResult<(Vec<RolloutItem>, Vec<ResponseItem>)> {
let mut rollout_items = Vec::new();
let mut live_history = ConversationHistory::new();
@@ -2902,7 +2869,7 @@ mod tests {
for item in &initial_context {
rollout_items.push(RolloutItem::ResponseItem(item.clone()));
}
live_history.record_items(initial_context.iter());
live_history.record_items(initial_context.iter())?;
let user1 = ResponseItem::Message {
id: None,
@@ -2911,7 +2878,7 @@ mod tests {
text: "first user".to_string(),
}],
};
live_history.record_items(std::iter::once(&user1));
live_history.record_items(std::iter::once(&user1))?;
rollout_items.push(RolloutItem::ResponseItem(user1.clone()));
let assistant1 = ResponseItem::Message {
@@ -2921,7 +2888,7 @@ mod tests {
text: "assistant reply one".to_string(),
}],
};
live_history.record_items(std::iter::once(&assistant1));
live_history.record_items(std::iter::once(&assistant1))?;
rollout_items.push(RolloutItem::ResponseItem(assistant1.clone()));
let summary1 = "summary one";
@@ -2944,7 +2911,7 @@ mod tests {
text: "second user".to_string(),
}],
};
live_history.record_items(std::iter::once(&user2));
live_history.record_items(std::iter::once(&user2))?;
rollout_items.push(RolloutItem::ResponseItem(user2.clone()));
let assistant2 = ResponseItem::Message {
@@ -2954,7 +2921,7 @@ mod tests {
text: "assistant reply two".to_string(),
}],
};
live_history.record_items(std::iter::once(&assistant2));
live_history.record_items(std::iter::once(&assistant2))?;
rollout_items.push(RolloutItem::ResponseItem(assistant2.clone()));
let summary2 = "summary two";
@@ -2977,7 +2944,7 @@ mod tests {
text: "third user".to_string(),
}],
};
live_history.record_items(std::iter::once(&user3));
live_history.record_items(std::iter::once(&user3))?;
rollout_items.push(RolloutItem::ResponseItem(user3.clone()));
let assistant3 = ResponseItem::Message {
@@ -2987,10 +2954,10 @@ mod tests {
text: "assistant reply three".to_string(),
}],
};
live_history.record_items(std::iter::once(&assistant3));
live_history.record_items(std::iter::once(&assistant3))?;
rollout_items.push(RolloutItem::ResponseItem(assistant3.clone()));
(rollout_items, live_history.get_history())
Ok((rollout_items, live_history.get_history()))
}
#[tokio::test]
@@ -3039,15 +3006,26 @@ mod tests {
let tool_name = "shell";
let call_id = "test-call".to_string();
let resp = handle_container_exec_with_params(
tool_name,
params,
Arc::clone(&session),
Arc::clone(&turn_context),
Arc::clone(&turn_diff_tracker),
call_id,
)
.await;
let handler = ShellHandler;
let resp = handler
.handle(ToolInvocation {
session: Arc::clone(&session),
turn: Arc::clone(&turn_context),
tracker: Arc::clone(&turn_diff_tracker),
call_id,
tool_name: tool_name.to_string(),
payload: ToolPayload::Function {
arguments: serde_json::json!({
"command": params.command.clone(),
"workdir": Some(turn_context.cwd.to_string_lossy().to_string()),
"timeout_ms": params.timeout_ms,
"with_escalated_permissions": params.with_escalated_permissions,
"justification": params.justification.clone(),
})
.to_string(),
},
})
.await;
let Err(FunctionCallError::RespondToModel(output)) = resp else {
panic!("expected error result");
@@ -3066,17 +3044,30 @@ mod tests {
.expect("unique turn context Arc")
.sandbox_policy = SandboxPolicy::DangerFullAccess;
let resp2 = handle_container_exec_with_params(
tool_name,
params2,
Arc::clone(&session),
Arc::clone(&turn_context),
Arc::clone(&turn_diff_tracker),
"test-call-2".to_string(),
)
.await;
let resp2 = handler
.handle(ToolInvocation {
session: Arc::clone(&session),
turn: Arc::clone(&turn_context),
tracker: Arc::clone(&turn_diff_tracker),
call_id: "test-call-2".to_string(),
tool_name: tool_name.to_string(),
payload: ToolPayload::Function {
arguments: serde_json::json!({
"command": params2.command.clone(),
"workdir": Some(turn_context.cwd.to_string_lossy().to_string()),
"timeout_ms": params2.timeout_ms,
"with_escalated_permissions": params2.with_escalated_permissions,
"justification": params2.justification.clone(),
})
.to_string(),
},
})
.await;
let output = resp2.expect("expected Ok result");
let output = match resp2.expect("expected Ok result") {
ToolOutput::Function { content, .. } => content,
_ => panic!("unexpected tool output"),
};
#[derive(Deserialize, PartialEq, Eq, Debug)]
struct ResponseExecMetadata {

View File

@@ -39,35 +39,35 @@ struct HistoryBridgeTemplate<'a> {
pub(crate) async fn run_inline_auto_compact_task(
sess: Arc<Session>,
turn_context: Arc<TurnContext>,
) {
) -> CodexResult<()> {
let input = vec![UserInput::Text {
text: SUMMARIZATION_PROMPT.to_string(),
}];
run_compact_task_inner(sess, turn_context, input).await;
run_compact_task_inner(sess, turn_context, input).await
}
pub(crate) async fn run_compact_task(
sess: Arc<Session>,
turn_context: Arc<TurnContext>,
input: Vec<UserInput>,
) -> Option<String> {
) -> CodexResult<Option<String>> {
let start_event = EventMsg::TaskStarted(TaskStartedEvent {
model_context_window: turn_context.client.get_model_context_window(),
});
sess.send_event(&turn_context, start_event).await;
run_compact_task_inner(sess.clone(), turn_context, input).await;
None
run_compact_task_inner(sess.clone(), turn_context, input).await?;
Ok(None)
}
async fn run_compact_task_inner(
sess: Arc<Session>,
turn_context: Arc<TurnContext>,
input: Vec<UserInput>,
) {
) -> CodexResult<()> {
let initial_input_for_turn: ResponseInputItem = ResponseInputItem::from(input);
let mut history = sess.clone_history().await;
history.record_items(&[initial_input_for_turn.into()]);
history.record_items(&[initial_input_for_turn.into()])?;
let mut truncated_count = 0usize;
@@ -106,7 +106,7 @@ async fn run_compact_task_inner(
break;
}
Err(CodexErr::Interrupted) => {
return;
return Ok(());
}
Err(e @ CodexErr::ContextWindowExceeded) => {
if turn_input.len() > 1 {
@@ -124,7 +124,7 @@ async fn run_compact_task_inner(
message: e.to_string(),
});
sess.send_event(&turn_context, event).await;
return;
return Ok(());
}
Err(e) => {
if retries < max_retries {
@@ -142,7 +142,7 @@ async fn run_compact_task_inner(
message: e.to_string(),
});
sess.send_event(&turn_context, event).await;
return;
return Ok(());
}
}
}
@@ -164,6 +164,7 @@ async fn run_compact_task_inner(
message: "Compact task completed".to_string(),
});
sess.send_event(&turn_context, event).await;
Ok(())
}
pub fn content_items_to_text(content: &[ContentItem]) -> Option<String> {
@@ -252,7 +253,8 @@ async fn drain_to_completed(
};
match event {
Ok(ResponseEvent::OutputItemDone(item)) => {
sess.record_into_history(std::slice::from_ref(&item)).await;
sess.record_into_history(std::slice::from_ref(&item))
.await?;
}
Ok(ResponseEvent::RateLimits(snapshot)) => {
sess.update_rate_limits(turn_context, snapshot).await;

View File

@@ -1,21 +1,49 @@
use std::sync::Arc;
use std::sync::OnceLock;
use codex_protocol::models::FunctionCallOutputPayload;
use codex_protocol::models::ResponseItem;
use codex_protocol::protocol::TokenUsage;
use codex_protocol::protocol::TokenUsageInfo;
use codex_utils_tokenizer::Tokenizer;
use tokio::task;
use tracing::error;
static TOKENIZER: OnceLock<Option<Arc<Tokenizer>>> = OnceLock::new();
use crate::error::CodexErr;
/// Transcript of conversation history
#[derive(Debug, Clone, Default)]
#[derive(Debug, Clone)]
pub(crate) struct ConversationHistory {
/// The oldest items are at the beginning of the vector.
items: Vec<ResponseItem>,
token_info: Option<TokenUsageInfo>,
}
impl ConversationHistory {
pub(crate) fn new() -> Self {
Self { items: Vec::new() }
Self {
items: Vec::new(),
token_info: None,
}
}
pub(crate) fn token_info(&self) -> Option<TokenUsageInfo> {
self.token_info.clone()
}
pub(crate) fn set_token_usage_full(&mut self, context_window: i64) {
match &mut self.token_info {
Some(info) => info.fill_to_context_window(context_window),
None => {
self.token_info = Some(TokenUsageInfo::full_context_window(context_window));
}
}
}
/// `items` is ordered from oldest to newest.
pub(crate) fn record_items<I>(&mut self, items: I)
pub(crate) fn record_items<I>(&mut self, items: I) -> Result<(), CodexErr>
where
I: IntoIterator,
I::Item: std::ops::Deref<Target = ResponseItem>,
@@ -24,9 +52,10 @@ impl ConversationHistory {
if !is_api_message(&item) {
continue;
}
self.validate_input(&item)?;
self.items.push(item.clone());
}
Ok(())
}
pub(crate) fn get_history(&mut self) -> Vec<ResponseItem> {
@@ -62,6 +91,65 @@ impl ConversationHistory {
self.items.clone()
}
fn validate_input(&self, item: &ResponseItem) -> Result<(), CodexErr> {
match item {
ResponseItem::Message { content, .. } => {
self.validate_input_content_item(content)?;
Ok(())
}
ResponseItem::FunctionCall { .. }
| ResponseItem::FunctionCallOutput { .. }
| ResponseItem::CustomToolCall { .. }
| ResponseItem::CustomToolCallOutput { .. }
| ResponseItem::LocalShellCall { .. }
| ResponseItem::Reasoning { .. }
| ResponseItem::WebSearchCall { .. } => Ok(()),
ResponseItem::Other => Err(CodexErr::InvalidInput(format!("invalid input: {item:?}"))),
}
}
fn validate_input_content_item(
&self,
content: &[codex_protocol::models::ContentItem],
) -> Result<(), CodexErr> {
let Some(info) = &self.token_info else {
return Ok(());
};
// this will intentionally not check the context for the first turn before getting this information.
// it's acceptable tradeoff.
let Some(context_window) = info.model_context_window else {
return Ok(());
};
let tokenizer = match shared_tokenizer() {
Some(t) => t,
None => return Ok(()),
};
let mut input_tokens: i64 = 0;
for item in content {
match item {
codex_protocol::models::ContentItem::InputText { text } => {
input_tokens += tokenizer.count(text);
}
codex_protocol::models::ContentItem::InputImage { .. } => {
// no validation currently
}
codex_protocol::models::ContentItem::OutputText { .. } => {
// no validation currently
}
}
}
let prior_total = info.last_token_usage.total_tokens;
let combined_tokens = prior_total.saturating_add(input_tokens);
let threshold = context_window * 95 / 100;
if combined_tokens > threshold {
return Err(CodexErr::InvalidInput("input too large".to_string()));
}
Ok(())
}
fn ensure_call_outputs_present(&mut self) {
// Collect synthetic outputs to insert immediately after their calls.
// Store the insertion position (index of call) alongside the item so
@@ -301,6 +389,18 @@ impl ConversationHistory {
self.items.remove(pos);
}
}
pub(crate) fn update_token_info(
&mut self,
usage: &TokenUsage,
model_context_window: Option<i64>,
) {
self.token_info = TokenUsageInfo::new_or_append(
&self.token_info,
&Some(usage.clone()),
model_context_window,
);
}
}
#[inline]
@@ -312,6 +412,36 @@ fn error_or_panic(message: String) {
}
}
fn shared_tokenizer() -> Option<Arc<Tokenizer>> {
TOKENIZER.get().and_then(|opt| opt.as_ref().map(Arc::clone))
}
/// Kick off background initialization of the shared tokenizer without blocking the caller.
pub(crate) fn prefetch_tokenizer_in_background() {
if TOKENIZER.get().is_some() {
return;
}
// Spawn a background task to initialize the tokenizer. Use spawn_blocking in case
// initialization performs CPU-heavy work or file I/O.
tokio::spawn(async {
let result = task::spawn_blocking(Tokenizer::try_default).await;
match result {
Ok(Ok(tokenizer)) => {
let _ = TOKENIZER.set(Some(Arc::new(tokenizer)));
}
Ok(Err(error)) => {
error!("failed to create tokenizer: {error}");
let _ = TOKENIZER.set(None);
}
Err(join_error) => {
error!("failed to join tokenizer init task: {join_error}");
let _ = TOKENIZER.set(None);
}
}
});
}
/// Anything that is not a system message or "reasoning" message is considered
/// an API message.
fn is_api_message(message: &ResponseItem) -> bool {
@@ -350,7 +480,7 @@ mod tests {
fn create_history_with_items(items: Vec<ResponseItem>) -> ConversationHistory {
let mut h = ConversationHistory::new();
h.record_items(items.iter());
h.record_items(items.iter()).unwrap();
h
}
@@ -366,7 +496,7 @@ mod tests {
#[test]
fn filters_non_api_messages() {
let mut h = ConversationHistory::default();
let mut h = ConversationHistory::new();
// System message is not an API message; Other is ignored.
let system = ResponseItem::Message {
id: None,
@@ -375,12 +505,12 @@ mod tests {
text: "ignored".to_string(),
}],
};
h.record_items([&system, &ResponseItem::Other]);
h.record_items([&system, &ResponseItem::Other]).unwrap();
// User and assistant should be retained.
let u = user_msg("hi");
let a = assistant_msg("hello");
h.record_items([&u, &a]);
h.record_items([&u, &a]).unwrap();
let items = h.contents();
assert_eq!(

View File

@@ -1,3 +1,4 @@
use crate::codex::ProcessedResponseItem;
use crate::exec::ExecToolCallOutput;
use crate::token_data::KnownPlan;
use crate::token_data::PlanType;
@@ -53,8 +54,11 @@ pub enum SandboxErr {
#[derive(Error, Debug)]
pub enum CodexErr {
// todo(aibrahim): git rid of this error carrying the dangling artifacts
#[error("turn aborted")]
TurnAborted,
TurnAborted {
dangling_artifacts: Vec<ProcessedResponseItem>,
},
/// Returned by ResponsesClient when the SSE stream disconnects or errors out **after** the HTTP
/// handshake has succeeded but **before** it finished emitting `response.completed`.
@@ -154,11 +158,16 @@ pub enum CodexErr {
#[error("{0}")]
EnvVar(EnvVarError),
#[error("invalid input: {0}")]
InvalidInput(String),
}
impl From<CancelErr> for CodexErr {
fn from(_: CancelErr) -> Self {
CodexErr::TurnAborted
CodexErr::TurnAborted {
dangling_artifacts: Vec::new(),
}
}
}

View File

@@ -36,6 +36,7 @@ mod mcp_tool_call;
mod message_history;
mod model_provider_info;
pub mod parse_command;
mod response_processing;
pub mod sandboxing;
pub mod token_data;
mod truncate;

View File

@@ -0,0 +1,113 @@
use crate::codex::Session;
use crate::conversation_history::ConversationHistory;
use crate::error::Result as CodexResult;
use codex_protocol::models::FunctionCallOutputPayload;
use codex_protocol::models::ResponseInputItem;
use codex_protocol::models::ResponseItem;
use tracing::warn;
/// Process streamed `ResponseItem`s from the model into the pair of:
/// - items we should record in conversation history; and
/// - `ResponseInputItem`s to send back to the model on the next turn.
pub(crate) async fn process_items(
processed_items: Vec<crate::codex::ProcessedResponseItem>,
is_review_mode: bool,
review_thread_history: &mut ConversationHistory,
sess: &Session,
) -> CodexResult<(Vec<ResponseInputItem>, Vec<ResponseItem>)> {
let mut items_to_record_in_conversation_history = Vec::<ResponseItem>::new();
let mut responses = Vec::<ResponseInputItem>::new();
for processed_response_item in processed_items {
let crate::codex::ProcessedResponseItem { item, response } = processed_response_item;
match (&item, &response) {
(ResponseItem::Message { role, .. }, None) if role == "assistant" => {
// If the model returned a message, we need to record it.
items_to_record_in_conversation_history.push(item);
}
(
ResponseItem::LocalShellCall { .. },
Some(ResponseInputItem::FunctionCallOutput { call_id, output }),
) => {
items_to_record_in_conversation_history.push(item);
items_to_record_in_conversation_history.push(ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output: output.clone(),
});
}
(
ResponseItem::FunctionCall { .. },
Some(ResponseInputItem::FunctionCallOutput { call_id, output }),
) => {
items_to_record_in_conversation_history.push(item);
items_to_record_in_conversation_history.push(ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output: output.clone(),
});
}
(
ResponseItem::CustomToolCall { .. },
Some(ResponseInputItem::CustomToolCallOutput { call_id, output }),
) => {
items_to_record_in_conversation_history.push(item);
items_to_record_in_conversation_history.push(ResponseItem::CustomToolCallOutput {
call_id: call_id.clone(),
output: output.clone(),
});
}
(
ResponseItem::FunctionCall { .. },
Some(ResponseInputItem::McpToolCallOutput { call_id, result }),
) => {
items_to_record_in_conversation_history.push(item);
let output = match result {
Ok(call_tool_result) => {
crate::codex::convert_call_tool_result_to_function_call_output_payload(
call_tool_result,
)
}
Err(err) => FunctionCallOutputPayload {
content: err.clone(),
success: Some(false),
},
};
items_to_record_in_conversation_history.push(ResponseItem::FunctionCallOutput {
call_id: call_id.clone(),
output,
});
}
(
ResponseItem::Reasoning {
id,
summary,
content,
encrypted_content,
},
None,
) => {
items_to_record_in_conversation_history.push(ResponseItem::Reasoning {
id: id.clone(),
summary: summary.clone(),
content: content.clone(),
encrypted_content: encrypted_content.clone(),
});
}
_ => {
warn!("Unexpected response item: {item:?} with response: {response:?}");
}
};
if let Some(response) = response {
responses.push(response);
}
}
// Only attempt to take the lock if there is something to record.
if !items_to_record_in_conversation_history.is_empty() {
if is_review_mode {
review_thread_history.record_items(items_to_record_in_conversation_history.iter())?;
} else {
sess.record_conversation_items(&items_to_record_in_conversation_history)
.await?;
}
}
Ok((responses, items_to_record_in_conversation_history))
}

View File

@@ -1,12 +1,11 @@
use std::cmp::Reverse;
use std::io::{self};
use std::num::NonZero;
use std::path::Path;
use std::path::PathBuf;
use codex_file_search as file_search;
use std::num::NonZero;
use std::sync::Arc;
use std::sync::atomic::AtomicBool;
use time::OffsetDateTime;
use time::PrimitiveDateTime;
use time::format_description::FormatItem;
@@ -15,6 +14,7 @@ use uuid::Uuid;
use super::SESSIONS_SUBDIR;
use crate::protocol::EventMsg;
use codex_file_search as file_search;
use codex_protocol::protocol::RolloutItem;
use codex_protocol::protocol::RolloutLine;
use codex_protocol::protocol::SessionSource;
@@ -515,6 +515,7 @@ pub async fn find_conversation_path_by_id_str(
threads,
cancel,
compute_indices,
false,
)
.map_err(|e| io::Error::other(format!("file search failed: {e}")))?;

View File

@@ -4,6 +4,7 @@ use codex_protocol::models::ResponseItem;
use crate::codex::SessionConfiguration;
use crate::conversation_history::ConversationHistory;
use crate::error::CodexErr;
use crate::protocol::RateLimitSnapshot;
use crate::protocol::TokenUsage;
use crate::protocol::TokenUsageInfo;
@@ -12,7 +13,6 @@ use crate::protocol::TokenUsageInfo;
pub(crate) struct SessionState {
pub(crate) session_configuration: SessionConfiguration,
pub(crate) history: ConversationHistory,
pub(crate) token_info: Option<TokenUsageInfo>,
pub(crate) latest_rate_limits: Option<RateLimitSnapshot>,
}
@@ -22,18 +22,18 @@ impl SessionState {
Self {
session_configuration,
history: ConversationHistory::new(),
token_info: None,
latest_rate_limits: None,
}
}
// History helpers
pub(crate) fn record_items<I>(&mut self, items: I)
pub(crate) fn record_items<I>(&mut self, items: I) -> Result<(), CodexErr>
where
I: IntoIterator,
I::Item: std::ops::Deref<Target = ResponseItem>,
{
self.history.record_items(items)
self.history.record_items(items)?;
Ok(())
}
pub(crate) fn history_snapshot(&mut self) -> Vec<ResponseItem> {
@@ -54,11 +54,11 @@ impl SessionState {
usage: &TokenUsage,
model_context_window: Option<i64>,
) {
self.token_info = TokenUsageInfo::new_or_append(
&self.token_info,
&Some(usage.clone()),
model_context_window,
);
self.history.update_token_info(usage, model_context_window);
}
pub(crate) fn token_info(&self) -> Option<TokenUsageInfo> {
self.history.token_info()
}
pub(crate) fn set_rate_limits(&mut self, snapshot: RateLimitSnapshot) {
@@ -68,17 +68,17 @@ impl SessionState {
pub(crate) fn token_info_and_rate_limits(
&self,
) -> (Option<TokenUsageInfo>, Option<RateLimitSnapshot>) {
(self.token_info.clone(), self.latest_rate_limits.clone())
let info = self.token_info().and_then(|info| {
if info.total_token_usage.is_zero() && info.last_token_usage.is_zero() {
None
} else {
Some(info)
}
});
(info, self.latest_rate_limits.clone())
}
pub(crate) fn set_token_usage_full(&mut self, context_window: i64) {
match &mut self.token_info {
Some(info) => info.fill_to_context_window(context_window),
None => {
self.token_info = Some(TokenUsageInfo::full_context_window(context_window));
}
}
self.history.set_token_usage_full(context_window);
}
// Pending input/approval moved to TurnState.
}

View File

@@ -5,6 +5,7 @@ use tokio_util::sync::CancellationToken;
use crate::codex::TurnContext;
use crate::codex::compact;
use crate::error::Result as CodexResult;
use crate::state::TaskKind;
use codex_protocol::user_input::UserInput;
@@ -26,7 +27,7 @@ impl SessionTask for CompactTask {
ctx: Arc<TurnContext>,
input: Vec<UserInput>,
_cancellation_token: CancellationToken,
) -> Option<String> {
) -> CodexResult<Option<String>> {
compact::run_compact_task(session.clone_session(), ctx, input).await
}
}

View File

@@ -15,6 +15,8 @@ use tracing::warn;
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::error::Result as CodexResult;
use crate::protocol::ErrorEvent;
use crate::protocol::EventMsg;
use crate::protocol::TaskCompleteEvent;
use crate::protocol::TurnAbortReason;
@@ -56,7 +58,7 @@ pub(crate) trait SessionTask: Send + Sync + 'static {
ctx: Arc<TurnContext>,
input: Vec<UserInput>,
cancellation_token: CancellationToken,
) -> Option<String>;
) -> CodexResult<Option<String>>;
async fn abort(&self, session: Arc<SessionTaskContext>, ctx: Arc<TurnContext>) {
let _ = (session, ctx);
@@ -86,7 +88,7 @@ impl Session {
let task_cancellation_token = cancellation_token.child_token();
tokio::spawn(async move {
let ctx_for_finish = Arc::clone(&ctx);
let last_agent_message = task_for_run
let run_result = task_for_run
.run(
Arc::clone(&session_ctx),
ctx,
@@ -98,8 +100,21 @@ impl Session {
if !task_cancellation_token.is_cancelled() {
// Emit completion uniformly from spawn site so all tasks share the same lifecycle.
let sess = session_ctx.clone_session();
sess.on_task_finished(ctx_for_finish, last_agent_message)
.await;
match run_result {
Ok(last_agent_message) => {
sess.on_task_finished(ctx_for_finish, last_agent_message)
.await;
}
Err(err) => {
let message = err.to_string();
sess.send_event(
ctx_for_finish.as_ref(),
EventMsg::Error(ErrorEvent { message }),
)
.await;
sess.on_task_finished(ctx_for_finish, None).await;
}
}
}
done_clone.notify_waiters();
})

View File

@@ -5,6 +5,7 @@ use tokio_util::sync::CancellationToken;
use crate::codex::TurnContext;
use crate::codex::run_task;
use crate::error::Result as CodexResult;
use crate::state::TaskKind;
use codex_protocol::user_input::UserInput;
@@ -26,7 +27,7 @@ impl SessionTask for RegularTask {
ctx: Arc<TurnContext>,
input: Vec<UserInput>,
cancellation_token: CancellationToken,
) -> Option<String> {
) -> CodexResult<Option<String>> {
let sess = session.clone_session();
run_task(sess, ctx, input, TaskKind::Regular, cancellation_token).await
}

View File

@@ -6,6 +6,7 @@ use tokio_util::sync::CancellationToken;
use crate::codex::TurnContext;
use crate::codex::exit_review_mode;
use crate::codex::run_task;
use crate::error::Result as CodexResult;
use crate::state::TaskKind;
use codex_protocol::user_input::UserInput;
@@ -27,12 +28,12 @@ impl SessionTask for ReviewTask {
ctx: Arc<TurnContext>,
input: Vec<UserInput>,
cancellation_token: CancellationToken,
) -> Option<String> {
) -> CodexResult<Option<String>> {
let sess = session.clone_session();
run_task(sess, ctx, input, TaskKind::Review, cancellation_token).await
}
async fn abort(&self, session: Arc<SessionTaskContext>, ctx: Arc<TurnContext>) {
exit_review_mode(session.clone_session(), ctx, None).await;
let _ = exit_review_mode(session.clone_session(), ctx, None).await;
}
}

View File

@@ -1,6 +1,9 @@
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::error::CodexErr;
use crate::error::SandboxErr;
use crate::exec::ExecToolCallOutput;
use crate::function_tool::FunctionCallError;
use crate::parse_command::parse_command;
use crate::protocol::EventMsg;
use crate::protocol::ExecCommandBeginEvent;
@@ -10,6 +13,7 @@ use crate::protocol::PatchApplyBeginEvent;
use crate::protocol::PatchApplyEndEvent;
use crate::protocol::TurnDiffEvent;
use crate::tools::context::SharedTurnDiffTracker;
use crate::tools::sandboxing::ToolError;
use std::collections::HashMap;
use std::path::Path;
use std::path::PathBuf;
@@ -196,12 +200,103 @@ impl ToolEmitter {
) => {
emit_patch_end(ctx, String::new(), (*message).to_string(), false).await;
}
(Self::UnifiedExec { command, cwd, .. }, _) => {
// TODO(jif) add end and failures.
(Self::UnifiedExec { command, cwd, .. }, ToolEventStage::Begin) => {
emit_exec_command_begin(ctx, &[command.to_string()], cwd.as_path()).await;
}
(Self::UnifiedExec { .. }, ToolEventStage::Success(output)) => {
emit_exec_end(
ctx,
output.stdout.text.clone(),
output.stderr.text.clone(),
output.aggregated_output.text.clone(),
output.exit_code,
output.duration,
format_exec_output_str(&output),
)
.await;
}
(
Self::UnifiedExec { .. },
ToolEventStage::Failure(ToolEventFailure::Output(output)),
) => {
emit_exec_end(
ctx,
output.stdout.text.clone(),
output.stderr.text.clone(),
output.aggregated_output.text.clone(),
output.exit_code,
output.duration,
format_exec_output_str(&output),
)
.await;
}
(
Self::UnifiedExec { .. },
ToolEventStage::Failure(ToolEventFailure::Message(message)),
) => {
emit_exec_end(
ctx,
String::new(),
(*message).to_string(),
(*message).to_string(),
-1,
Duration::ZERO,
format_exec_output(&message),
)
.await;
}
}
}
pub async fn begin(&self, ctx: ToolEventCtx<'_>) {
self.emit(ctx, ToolEventStage::Begin).await;
}
pub async fn finish(
&self,
ctx: ToolEventCtx<'_>,
out: Result<ExecToolCallOutput, ToolError>,
) -> Result<String, FunctionCallError> {
let event;
let result = match out {
Ok(output) => {
let content = super::format_exec_output_for_model(&output);
let exit_code = output.exit_code;
event = ToolEventStage::Success(output);
if exit_code == 0 {
Ok(content)
} else {
Err(FunctionCallError::RespondToModel(content))
}
}
Err(ToolError::Codex(CodexErr::Sandbox(SandboxErr::Timeout { output })))
| Err(ToolError::Codex(CodexErr::Sandbox(SandboxErr::Denied { output }))) => {
let response = super::format_exec_output_for_model(&output);
event = ToolEventStage::Failure(ToolEventFailure::Output(*output));
Err(FunctionCallError::RespondToModel(response))
}
Err(ToolError::Codex(err)) => {
let message = format!("execution error: {err:?}");
let response = super::format_exec_output(&message);
event = ToolEventStage::Failure(ToolEventFailure::Message(message));
Err(FunctionCallError::RespondToModel(response))
}
Err(ToolError::Rejected(msg)) | Err(ToolError::SandboxDenied(msg)) => {
// Normalize common rejection messages for exec tools so tests and
// users see a clear, consistent phrase.
let normalized = if msg == "rejected by user" {
"exec command rejected by user".to_string()
} else {
msg
};
let response = super::format_exec_output(&normalized);
event = ToolEventStage::Failure(ToolEventFailure::Message(normalized));
Err(FunctionCallError::RespondToModel(response))
}
};
self.emit(ctx, event).await;
result
}
}
async fn emit_exec_end(

View File

@@ -1,19 +1,24 @@
use std::collections::BTreeMap;
use std::collections::HashMap;
use std::sync::Arc;
use crate::apply_patch;
use crate::apply_patch::InternalApplyPatchInvocation;
use crate::apply_patch::convert_apply_patch_to_protocol;
use crate::client_common::tools::FreeformTool;
use crate::client_common::tools::FreeformToolFormat;
use crate::client_common::tools::ResponsesApiTool;
use crate::client_common::tools::ToolSpec;
use crate::exec::ExecParams;
use crate::function_tool::FunctionCallError;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
use crate::tools::context::ToolPayload;
use crate::tools::handle_container_exec_with_params;
use crate::tools::events::ToolEmitter;
use crate::tools::events::ToolEventCtx;
use crate::tools::orchestrator::ToolOrchestrator;
use crate::tools::registry::ToolHandler;
use crate::tools::registry::ToolKind;
use crate::tools::runtimes::apply_patch::ApplyPatchRequest;
use crate::tools::runtimes::apply_patch::ApplyPatchRuntime;
use crate::tools::sandboxing::ToolCtx;
use crate::tools::spec::ApplyPatchToolArgs;
use crate::tools::spec::JsonSchema;
use async_trait::async_trait;
@@ -64,30 +69,85 @@ impl ToolHandler for ApplyPatchHandler {
}
};
let exec_params = ExecParams {
command: vec!["apply_patch".to_string(), patch_input.clone()],
cwd: turn.cwd.clone(),
timeout_ms: None,
env: HashMap::new(),
with_escalated_permissions: None,
justification: None,
arg0: None,
};
// Re-parse and verify the patch so we can compute changes and approval.
// Avoid building temporary ExecParams/command vectors; derive directly from inputs.
let cwd = turn.cwd.clone();
let command = vec!["apply_patch".to_string(), patch_input.clone()];
match codex_apply_patch::maybe_parse_apply_patch_verified(&command, &cwd) {
codex_apply_patch::MaybeApplyPatchVerified::Body(changes) => {
match apply_patch::apply_patch(session.as_ref(), turn.as_ref(), &call_id, changes)
.await
{
InternalApplyPatchInvocation::Output(item) => {
let content = item?;
Ok(ToolOutput::Function {
content,
success: Some(true),
})
}
InternalApplyPatchInvocation::DelegateToExec(apply) => {
let emitter = ToolEmitter::apply_patch(
convert_apply_patch_to_protocol(&apply.action),
!apply.user_explicitly_approved_this_action,
);
let event_ctx = ToolEventCtx::new(
session.as_ref(),
turn.as_ref(),
&call_id,
Some(&tracker),
);
emitter.begin(event_ctx).await;
let content = handle_container_exec_with_params(
tool_name.as_str(),
exec_params,
Arc::clone(&session),
Arc::clone(&turn),
Arc::clone(&tracker),
call_id.clone(),
)
.await?;
let req = ApplyPatchRequest {
patch: apply.action.patch.clone(),
cwd: apply.action.cwd.clone(),
timeout_ms: None,
user_explicitly_approved: apply.user_explicitly_approved_this_action,
codex_exe: turn.codex_linux_sandbox_exe.clone(),
};
Ok(ToolOutput::Function {
content,
success: Some(true),
})
let mut orchestrator = ToolOrchestrator::new();
let mut runtime = ApplyPatchRuntime::new();
let tool_ctx = ToolCtx {
session: session.as_ref(),
turn: turn.as_ref(),
call_id: call_id.clone(),
tool_name: tool_name.to_string(),
};
let out = orchestrator
.run(&mut runtime, &req, &tool_ctx, &turn, turn.approval_policy)
.await;
let event_ctx = ToolEventCtx::new(
session.as_ref(),
turn.as_ref(),
&call_id,
Some(&tracker),
);
let content = emitter.finish(event_ctx, out).await?;
Ok(ToolOutput::Function {
content,
success: Some(true),
})
}
}
}
codex_apply_patch::MaybeApplyPatchVerified::CorrectnessError(parse_error) => {
Err(FunctionCallError::RespondToModel(format!(
"apply_patch verification failed: {parse_error}"
)))
}
codex_apply_patch::MaybeApplyPatchVerified::ShellParseError(error) => {
tracing::trace!("Failed to parse apply_patch input, {error:?}");
Err(FunctionCallError::RespondToModel(
"apply_patch handler received invalid patch input".to_string(),
))
}
codex_apply_patch::MaybeApplyPatchVerified::NotApplyPatch => {
Err(FunctionCallError::RespondToModel(
"apply_patch handler received non-apply_patch input".to_string(),
))
}
}
}
}

View File

@@ -2,6 +2,9 @@ use async_trait::async_trait;
use codex_protocol::models::ShellToolCallParams;
use std::sync::Arc;
use crate::apply_patch;
use crate::apply_patch::InternalApplyPatchInvocation;
use crate::apply_patch::convert_apply_patch_to_protocol;
use crate::codex::TurnContext;
use crate::exec::ExecParams;
use crate::exec_env::create_env;
@@ -9,9 +12,16 @@ use crate::function_tool::FunctionCallError;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
use crate::tools::context::ToolPayload;
use crate::tools::handle_container_exec_with_params;
use crate::tools::events::ToolEmitter;
use crate::tools::events::ToolEventCtx;
use crate::tools::orchestrator::ToolOrchestrator;
use crate::tools::registry::ToolHandler;
use crate::tools::registry::ToolKind;
use crate::tools::runtimes::apply_patch::ApplyPatchRequest;
use crate::tools::runtimes::apply_patch::ApplyPatchRuntime;
use crate::tools::runtimes::shell::ShellRequest;
use crate::tools::runtimes::shell::ShellRuntime;
use crate::tools::sandboxing::ToolCtx;
pub struct ShellHandler;
@@ -61,35 +71,27 @@ impl ToolHandler for ShellHandler {
))
})?;
let exec_params = Self::to_exec_params(params, turn.as_ref());
let content = handle_container_exec_with_params(
Self::run_exec_like(
tool_name.as_str(),
exec_params,
Arc::clone(&session),
Arc::clone(&turn),
Arc::clone(&tracker),
call_id.clone(),
session,
turn,
tracker,
call_id,
)
.await?;
Ok(ToolOutput::Function {
content,
success: Some(true),
})
.await
}
ToolPayload::LocalShell { params } => {
let exec_params = Self::to_exec_params(params, turn.as_ref());
let content = handle_container_exec_with_params(
Self::run_exec_like(
tool_name.as_str(),
exec_params,
Arc::clone(&session),
Arc::clone(&turn),
Arc::clone(&tracker),
call_id.clone(),
session,
turn,
tracker,
call_id,
)
.await?;
Ok(ToolOutput::Function {
content,
success: Some(true),
})
.await
}
_ => Err(FunctionCallError::RespondToModel(format!(
"unsupported payload for shell handler: {tool_name}"
@@ -97,3 +99,134 @@ impl ToolHandler for ShellHandler {
}
}
}
impl ShellHandler {
async fn run_exec_like(
tool_name: &str,
exec_params: ExecParams,
session: Arc<crate::codex::Session>,
turn: Arc<TurnContext>,
tracker: crate::tools::context::SharedTurnDiffTracker,
call_id: String,
) -> Result<ToolOutput, FunctionCallError> {
// Approval policy guard for explicit escalation in non-OnRequest modes.
if exec_params.with_escalated_permissions.unwrap_or(false)
&& !matches!(
turn.approval_policy,
codex_protocol::protocol::AskForApproval::OnRequest
)
{
return Err(FunctionCallError::RespondToModel(format!(
"approval policy is {policy:?}; reject command — you should not ask for escalated permissions if the approval policy is {policy:?}",
policy = turn.approval_policy
)));
}
// Intercept apply_patch if present.
match codex_apply_patch::maybe_parse_apply_patch_verified(
&exec_params.command,
&exec_params.cwd,
) {
codex_apply_patch::MaybeApplyPatchVerified::Body(changes) => {
match apply_patch::apply_patch(session.as_ref(), turn.as_ref(), &call_id, changes)
.await
{
InternalApplyPatchInvocation::Output(item) => {
// Programmatic apply_patch path; return its result.
let content = item?;
return Ok(ToolOutput::Function {
content,
success: Some(true),
});
}
InternalApplyPatchInvocation::DelegateToExec(apply) => {
let emitter = ToolEmitter::apply_patch(
convert_apply_patch_to_protocol(&apply.action),
!apply.user_explicitly_approved_this_action,
);
let event_ctx = ToolEventCtx::new(
session.as_ref(),
turn.as_ref(),
&call_id,
Some(&tracker),
);
emitter.begin(event_ctx).await;
let req = ApplyPatchRequest {
patch: apply.action.patch.clone(),
cwd: apply.action.cwd.clone(),
timeout_ms: exec_params.timeout_ms,
user_explicitly_approved: apply.user_explicitly_approved_this_action,
codex_exe: turn.codex_linux_sandbox_exe.clone(),
};
let mut orchestrator = ToolOrchestrator::new();
let mut runtime = ApplyPatchRuntime::new();
let tool_ctx = ToolCtx {
session: session.as_ref(),
turn: turn.as_ref(),
call_id: call_id.clone(),
tool_name: tool_name.to_string(),
};
let out = orchestrator
.run(&mut runtime, &req, &tool_ctx, &turn, turn.approval_policy)
.await;
let event_ctx = ToolEventCtx::new(
session.as_ref(),
turn.as_ref(),
&call_id,
Some(&tracker),
);
let content = emitter.finish(event_ctx, out).await?;
return Ok(ToolOutput::Function {
content,
success: Some(true),
});
}
}
}
codex_apply_patch::MaybeApplyPatchVerified::CorrectnessError(parse_error) => {
return Err(FunctionCallError::RespondToModel(format!(
"apply_patch verification failed: {parse_error}"
)));
}
codex_apply_patch::MaybeApplyPatchVerified::ShellParseError(error) => {
tracing::trace!("Failed to parse shell command, {error:?}");
// Fall through to regular shell execution.
}
codex_apply_patch::MaybeApplyPatchVerified::NotApplyPatch => {
// Fall through to regular shell execution.
}
}
// Regular shell execution path.
let emitter = ToolEmitter::shell(exec_params.command.clone(), exec_params.cwd.clone());
let event_ctx = ToolEventCtx::new(session.as_ref(), turn.as_ref(), &call_id, None);
emitter.begin(event_ctx).await;
let req = ShellRequest {
command: exec_params.command.clone(),
cwd: exec_params.cwd.clone(),
timeout_ms: exec_params.timeout_ms,
env: exec_params.env.clone(),
with_escalated_permissions: exec_params.with_escalated_permissions,
justification: exec_params.justification.clone(),
};
let mut orchestrator = ToolOrchestrator::new();
let mut runtime = ShellRuntime::new();
let tool_ctx = ToolCtx {
session: session.as_ref(),
turn: turn.as_ref(),
call_id: call_id.clone(),
tool_name: tool_name.to_string(),
};
let out = orchestrator
.run(&mut runtime, &req, &tool_ctx, &turn, turn.approval_policy)
.await;
let event_ctx = ToolEventCtx::new(session.as_ref(), turn.as_ref(), &call_id, None);
let content = emitter.finish(event_ctx, out).await?;
Ok(ToolOutput::Function {
content,
success: Some(true),
})
}
}

View File

@@ -5,6 +5,9 @@ use serde::Deserialize;
use serde::Serialize;
use crate::function_tool::FunctionCallError;
use crate::protocol::EventMsg;
use crate::protocol::ExecCommandOutputDeltaEvent;
use crate::protocol::ExecOutputStream;
use crate::tools::context::ToolInvocation;
use crate::tools::context::ToolOutput;
use crate::tools::context::ToolPayload;
@@ -87,11 +90,7 @@ impl ToolHandler for UnifiedExecHandler {
};
let manager: &UnifiedExecSessionManager = &session.services.unified_exec_manager;
let context = UnifiedExecContext {
session: &session,
turn: turn.as_ref(),
call_id: &call_id,
};
let context = UnifiedExecContext::new(session.clone(), turn.clone(), call_id.clone());
let response = match tool_name.as_str() {
"exec_command" => {
@@ -101,8 +100,12 @@ impl ToolHandler for UnifiedExecHandler {
))
})?;
let event_ctx =
ToolEventCtx::new(context.session, context.turn, context.call_id, None);
let event_ctx = ToolEventCtx::new(
context.session.as_ref(),
context.turn.as_ref(),
&context.call_id,
None,
);
let emitter =
ToolEmitter::unified_exec(args.cmd.clone(), context.turn.cwd.clone(), true);
emitter.emit(event_ctx, ToolEventStage::Begin).await;
@@ -148,6 +151,18 @@ impl ToolHandler for UnifiedExecHandler {
}
};
// Emit a delta event with the chunk of output we just produced, if any.
if !response.output.is_empty() {
let delta = ExecCommandOutputDeltaEvent {
call_id: response.event_call_id.clone(),
stream: ExecOutputStream::Stdout,
chunk: response.output.as_bytes().to_vec(),
};
session
.send_event(turn.as_ref(), EventMsg::ExecCommandOutputDelta(delta))
.await;
}
let content = serialize_response(&response).map_err(|err| {
FunctionCallError::RespondToModel(format!(
"failed to serialize unified exec output: {err:?}"

View File

@@ -9,37 +9,11 @@ pub mod runtimes;
pub mod sandboxing;
pub mod spec;
use crate::apply_patch;
use crate::apply_patch::InternalApplyPatchInvocation;
use crate::apply_patch::convert_apply_patch_to_protocol;
use crate::codex::Session;
use crate::codex::TurnContext;
use crate::error::CodexErr;
use crate::error::SandboxErr;
use crate::exec::ExecParams;
use crate::exec::ExecToolCallOutput;
use crate::function_tool::FunctionCallError;
use crate::tools::context::SharedTurnDiffTracker;
use crate::tools::events::ToolEmitter;
use crate::tools::events::ToolEventCtx;
use crate::tools::events::ToolEventFailure;
use crate::tools::events::ToolEventStage;
use crate::tools::orchestrator::ToolOrchestrator;
use crate::tools::runtimes::apply_patch::ApplyPatchRequest;
use crate::tools::runtimes::apply_patch::ApplyPatchRuntime;
use crate::tools::runtimes::shell::ShellRequest;
use crate::tools::runtimes::shell::ShellRuntime;
use crate::tools::sandboxing::ToolCtx;
use crate::tools::sandboxing::ToolError;
use codex_apply_patch::MaybeApplyPatchVerified;
use codex_apply_patch::maybe_parse_apply_patch_verified;
use codex_protocol::protocol::AskForApproval;
use codex_utils_string::take_bytes_at_char_boundary;
use codex_utils_string::take_last_bytes_at_char_boundary;
pub use router::ToolRouter;
use serde::Serialize;
use std::sync::Arc;
use tracing::trace;
// Model-formatting limits: clients get full streams; only content sent to the model is truncated.
pub(crate) const MODEL_FORMAT_MAX_BYTES: usize = 10 * 1024; // 10 KiB
@@ -54,186 +28,6 @@ pub(crate) const TELEMETRY_PREVIEW_MAX_LINES: usize = 64; // lines
pub(crate) const TELEMETRY_PREVIEW_TRUNCATION_NOTICE: &str =
"[... telemetry preview truncated ...]";
// TODO(jif) break this down
pub(crate) async fn handle_container_exec_with_params(
tool_name: &str,
params: ExecParams,
sess: Arc<Session>,
turn_context: Arc<TurnContext>,
turn_diff_tracker: SharedTurnDiffTracker,
call_id: String,
) -> Result<String, FunctionCallError> {
let _otel_event_manager = turn_context.client.get_otel_event_manager();
if params.with_escalated_permissions.unwrap_or(false)
&& !matches!(turn_context.approval_policy, AskForApproval::OnRequest)
{
return Err(FunctionCallError::RespondToModel(format!(
"approval policy is {policy:?}; reject command — you should not ask for escalated permissions if the approval policy is {policy:?}",
policy = turn_context.approval_policy
)));
}
// check if this was a patch, and apply it if so
let apply_patch_exec = match maybe_parse_apply_patch_verified(&params.command, &params.cwd) {
MaybeApplyPatchVerified::Body(changes) => {
match apply_patch::apply_patch(sess.as_ref(), turn_context.as_ref(), &call_id, changes)
.await
{
InternalApplyPatchInvocation::Output(item) => return item,
InternalApplyPatchInvocation::DelegateToExec(apply_patch_exec) => {
Some(apply_patch_exec)
}
}
}
MaybeApplyPatchVerified::CorrectnessError(parse_error) => {
// It looks like an invocation of `apply_patch`, but we
// could not resolve it into a patch that would apply
// cleanly. Return to model for resample.
return Err(FunctionCallError::RespondToModel(format!(
"apply_patch verification failed: {parse_error}"
)));
}
MaybeApplyPatchVerified::ShellParseError(error) => {
trace!("Failed to parse shell command, {error:?}");
None
}
MaybeApplyPatchVerified::NotApplyPatch => None,
};
let (event_emitter, diff_opt) = match apply_patch_exec.as_ref() {
Some(exec) => (
ToolEmitter::apply_patch(
convert_apply_patch_to_protocol(&exec.action),
!exec.user_explicitly_approved_this_action,
),
Some(&turn_diff_tracker),
),
None => (
ToolEmitter::shell(params.command.clone(), params.cwd.clone()),
None,
),
};
let event_ctx = ToolEventCtx::new(sess.as_ref(), turn_context.as_ref(), &call_id, diff_opt);
event_emitter.emit(event_ctx, ToolEventStage::Begin).await;
// Build runtime contexts only when needed (shell/apply_patch below).
if let Some(exec) = apply_patch_exec {
// Route apply_patch execution through the new orchestrator/runtime.
let req = ApplyPatchRequest {
patch: exec.action.patch.clone(),
cwd: params.cwd.clone(),
timeout_ms: params.timeout_ms,
user_explicitly_approved: exec.user_explicitly_approved_this_action,
codex_exe: turn_context.codex_linux_sandbox_exe.clone(),
};
let mut orchestrator = ToolOrchestrator::new();
let mut runtime = ApplyPatchRuntime::new();
let tool_ctx = ToolCtx {
session: sess.as_ref(),
turn: turn_context.as_ref(),
call_id: call_id.clone(),
tool_name: tool_name.to_string(),
};
let out = orchestrator
.run(
&mut runtime,
&req,
&tool_ctx,
&turn_context,
turn_context.approval_policy,
)
.await;
handle_exec_outcome(&event_emitter, event_ctx, out).await
} else {
// Route shell execution through the new orchestrator/runtime.
let req = ShellRequest {
command: params.command.clone(),
cwd: params.cwd.clone(),
timeout_ms: params.timeout_ms,
env: params.env.clone(),
with_escalated_permissions: params.with_escalated_permissions,
justification: params.justification.clone(),
};
let mut orchestrator = ToolOrchestrator::new();
let mut runtime = ShellRuntime::new();
let tool_ctx = ToolCtx {
session: sess.as_ref(),
turn: turn_context.as_ref(),
call_id: call_id.clone(),
tool_name: tool_name.to_string(),
};
let out = orchestrator
.run(
&mut runtime,
&req,
&tool_ctx,
&turn_context,
turn_context.approval_policy,
)
.await;
handle_exec_outcome(&event_emitter, event_ctx, out).await
}
}
async fn handle_exec_outcome(
event_emitter: &ToolEmitter,
event_ctx: ToolEventCtx<'_>,
out: Result<ExecToolCallOutput, ToolError>,
) -> Result<String, FunctionCallError> {
let event;
let result = match out {
Ok(output) => {
let content = format_exec_output_for_model(&output);
let exit_code = output.exit_code;
event = ToolEventStage::Success(output);
if exit_code == 0 {
Ok(content)
} else {
Err(FunctionCallError::RespondToModel(content))
}
}
Err(ToolError::Codex(CodexErr::Sandbox(SandboxErr::Timeout { output })))
| Err(ToolError::Codex(CodexErr::Sandbox(SandboxErr::Denied { output }))) => {
let response = format_exec_output_for_model(&output);
event = ToolEventStage::Failure(ToolEventFailure::Output(*output));
Err(FunctionCallError::RespondToModel(response))
}
Err(ToolError::Codex(err)) => {
let message = format!("execution error: {err:?}");
let response = format_exec_output(&message);
event = ToolEventStage::Failure(ToolEventFailure::Message(message));
Err(FunctionCallError::RespondToModel(format_exec_output(
&response,
)))
}
Err(ToolError::Rejected(msg)) | Err(ToolError::SandboxDenied(msg)) => {
// Normalize common rejection messages for exec tools so tests and
// users see a clear, consistent phrase.
let normalized = if msg == "rejected by user" {
"exec command rejected by user".to_string()
} else {
msg
};
let response = format_exec_output(&normalized);
event = ToolEventStage::Failure(ToolEventFailure::Message(normalized));
Err(FunctionCallError::RespondToModel(format_exec_output(
&response,
)))
}
};
event_emitter.emit(event_ctx, event).await;
result
}
/// Format the combined exec output for sending back to the model.
/// Includes exit code and duration metadata; truncates large bodies safely.
pub fn format_exec_output_for_model(exec_output: &ExecToolCallOutput) -> String {
@@ -363,6 +157,7 @@ fn truncate_formatted_exec_output(content: &str, total_lines: usize) -> String {
#[cfg(test)]
mod tests {
use super::*;
use crate::function_tool::FunctionCallError;
use regex_lite::Regex;
fn truncate_function_error(err: FunctionCallError) -> FunctionCallError {

View File

@@ -98,9 +98,9 @@ impl ToolOrchestrator {
"sandbox denied and no retry".to_string(),
));
}
// Under `Never`, do not retry without sandbox; surface a concise message
// Under `Never` or `OnRequest`, do not retry without sandbox; surface a concise message
// derived from the actual output (platform-agnostic).
if matches!(approval_policy, AskForApproval::Never) {
if !tool.wants_no_sandbox_approval(approval_policy) {
let msg = build_never_denied_message_from_output(output.as_ref());
return Err(ToolError::SandboxDenied(msg));
}

View File

@@ -2,6 +2,7 @@ use std::sync::Arc;
use tokio::sync::RwLock;
use tokio_util::either::Either;
use tokio_util::sync::CancellationToken;
use tokio_util::task::AbortOnDropHandle;
use crate::codex::Session;
@@ -9,8 +10,10 @@ use crate::codex::TurnContext;
use crate::error::CodexErr;
use crate::function_tool::FunctionCallError;
use crate::tools::context::SharedTurnDiffTracker;
use crate::tools::context::ToolPayload;
use crate::tools::router::ToolCall;
use crate::tools::router::ToolRouter;
use codex_protocol::models::FunctionCallOutputPayload;
use codex_protocol::models::ResponseInputItem;
pub(crate) struct ToolCallRuntime {
@@ -40,6 +43,7 @@ impl ToolCallRuntime {
pub(crate) fn handle_tool_call(
&self,
call: ToolCall,
cancellation_token: CancellationToken,
) -> impl std::future::Future<Output = Result<ResponseInputItem, CodexErr>> {
let supports_parallel = self.router.tool_supports_parallel(&call.tool_name);
@@ -48,18 +52,24 @@ impl ToolCallRuntime {
let turn = Arc::clone(&self.turn_context);
let tracker = Arc::clone(&self.tracker);
let lock = Arc::clone(&self.parallel_execution);
let aborted_response = Self::aborted_response(&call);
let handle: AbortOnDropHandle<Result<ResponseInputItem, FunctionCallError>> =
AbortOnDropHandle::new(tokio::spawn(async move {
let _guard = if supports_parallel {
Either::Left(lock.read().await)
} else {
Either::Right(lock.write().await)
};
tokio::select! {
_ = cancellation_token.cancelled() => Ok(aborted_response),
res = async {
let _guard = if supports_parallel {
Either::Left(lock.read().await)
} else {
Either::Right(lock.write().await)
};
router
.dispatch_tool_call(session, turn, tracker, call)
.await
router
.dispatch_tool_call(session, turn, tracker, call)
.await
} => res,
}
}));
async move {
@@ -74,3 +84,25 @@ impl ToolCallRuntime {
}
}
}
impl ToolCallRuntime {
fn aborted_response(call: &ToolCall) -> ResponseInputItem {
match &call.payload {
ToolPayload::Custom { .. } => ResponseInputItem::CustomToolCallOutput {
call_id: call.call_id.clone(),
output: "aborted".to_string(),
},
ToolPayload::Mcp { .. } => ResponseInputItem::McpToolCallOutput {
call_id: call.call_id.clone(),
result: Err("aborted".to_string()),
},
_ => ResponseInputItem::FunctionCallOutput {
call_id: call.call_id.clone(),
output: FunctionCallOutputPayload {
content: "aborted".to_string(),
success: None,
},
},
}
}
}

View File

@@ -17,6 +17,7 @@ use crate::tools::sandboxing::ToolCtx;
use crate::tools::sandboxing::ToolError;
use crate::tools::sandboxing::ToolRuntime;
use crate::tools::sandboxing::with_cached_approval;
use codex_protocol::protocol::AskForApproval;
use codex_protocol::protocol::ReviewDecision;
use futures::future::BoxFuture;
use std::collections::HashMap;
@@ -127,6 +128,10 @@ impl Approvable<ApplyPatchRequest> for ApplyPatchRuntime {
.await
})
}
fn wants_no_sandbox_approval(&self, policy: AskForApproval) -> bool {
!matches!(policy, AskForApproval::Never)
}
}
impl ToolRuntime<ApplyPatchRequest, ExecToolCallOutput> for ApplyPatchRuntime {

View File

@@ -121,6 +121,11 @@ pub(crate) trait Approvable<Req> {
}
}
/// Decide we can request an approval for no-sandbox execution.
fn wants_no_sandbox_approval(&self, policy: AskForApproval) -> bool {
!matches!(policy, AskForApproval::Never | AskForApproval::OnRequest)
}
fn start_approval_async<'a>(
&'a mut self,
req: &'a Req,

View File

@@ -1,18 +1,35 @@
//! Utilities for truncating large chunks of output while preserving a prefix
//! and suffix on UTF-8 boundaries.
use codex_utils_tokenizer::Tokenizer;
/// Truncate the middle of a UTF-8 string to at most `max_bytes` bytes,
/// preserving the beginning and the end. Returns the possibly truncated
/// string and `Some(original_token_count)` (estimated at 4 bytes/token)
/// string and `Some(original_token_count)` (counted with the local tokenizer;
/// falls back to a 4-bytes-per-token estimate if the tokenizer cannot load)
/// if truncation occurred; otherwise returns the original string and `None`.
pub(crate) fn truncate_middle(s: &str, max_bytes: usize) -> (String, Option<u64>) {
if s.len() <= max_bytes {
return (s.to_string(), None);
}
let est_tokens = (s.len() as u64).div_ceil(4);
// Build a tokenizer for counting (default to o200k_base; fall back to cl100k_base).
// If both fail, fall back to a 4-bytes-per-token estimate.
let tok = Tokenizer::try_default().ok();
let token_count = |text: &str| -> u64 {
if let Some(ref t) = tok {
t.count(text) as u64
} else {
(text.len() as u64).div_ceil(4)
}
};
let total_tokens = token_count(s);
if max_bytes == 0 {
return (format!("{est_tokens} tokens truncated…"), Some(est_tokens));
return (
format!("{total_tokens} tokens truncated…"),
Some(total_tokens),
);
}
fn truncate_on_boundary(input: &str, max_len: usize) -> &str {
@@ -50,13 +67,17 @@ pub(crate) fn truncate_middle(s: &str, max_bytes: usize) -> (String, Option<u64>
idx
}
let mut guess_tokens = est_tokens;
// Iterate to stabilize marker length → keep budget → boundaries.
let mut guess_tokens: u64 = 1;
for _ in 0..4 {
let marker = format!("{guess_tokens} tokens truncated…");
let marker_len = marker.len();
let keep_budget = max_bytes.saturating_sub(marker_len);
if keep_budget == 0 {
return (format!("{est_tokens} tokens truncated…"), Some(est_tokens));
return (
format!("{total_tokens} tokens truncated…"),
Some(total_tokens),
);
}
let left_budget = keep_budget / 2;
@@ -67,59 +88,72 @@ pub(crate) fn truncate_middle(s: &str, max_bytes: usize) -> (String, Option<u64>
suffix_start = prefix_end;
}
let kept_content_bytes = prefix_end + (s.len() - suffix_start);
let truncated_content_bytes = s.len().saturating_sub(kept_content_bytes);
let new_tokens = (truncated_content_bytes as u64).div_ceil(4);
// Tokens actually removed (middle slice) using the real tokenizer.
let removed_tokens = token_count(&s[prefix_end..suffix_start]);
if new_tokens == guess_tokens {
let mut out = String::with_capacity(marker_len + kept_content_bytes + 1);
// If the number of digits in the token count does not change the marker length,
// we can finalize output.
let final_marker = format!("{removed_tokens} tokens truncated…");
if final_marker.len() == marker_len {
let kept_content_bytes = prefix_end + (s.len() - suffix_start);
let mut out = String::with_capacity(final_marker.len() + kept_content_bytes + 1);
out.push_str(&s[..prefix_end]);
out.push_str(&marker);
out.push_str(&final_marker);
out.push('\n');
out.push_str(&s[suffix_start..]);
return (out, Some(est_tokens));
return (out, Some(total_tokens));
}
guess_tokens = new_tokens;
guess_tokens = removed_tokens;
}
// Fallback build after iterations: compute with the last guess.
let marker = format!("{guess_tokens} tokens truncated…");
let marker_len = marker.len();
let keep_budget = max_bytes.saturating_sub(marker_len);
if keep_budget == 0 {
return (format!("{est_tokens} tokens truncated…"), Some(est_tokens));
return (
format!("{total_tokens} tokens truncated…"),
Some(total_tokens),
);
}
let left_budget = keep_budget / 2;
let right_budget = keep_budget - left_budget;
let prefix_end = pick_prefix_end(s, left_budget);
let suffix_start = pick_suffix_start(s, right_budget);
let mut suffix_start = pick_suffix_start(s, right_budget);
if suffix_start < prefix_end {
suffix_start = prefix_end;
}
let mut out = String::with_capacity(marker_len + prefix_end + (s.len() - suffix_start) + 1);
out.push_str(&s[..prefix_end]);
out.push_str(&marker);
out.push('\n');
out.push_str(&s[suffix_start..]);
(out, Some(est_tokens))
(out, Some(total_tokens))
}
#[cfg(test)]
mod tests {
use super::truncate_middle;
use codex_utils_tokenizer::Tokenizer;
#[test]
fn truncate_middle_no_newlines_fallback() {
let tok = Tokenizer::try_default().expect("load tokenizer");
let s = "abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ*";
let max_bytes = 32;
let (out, original) = truncate_middle(s, max_bytes);
assert!(out.starts_with("abc"));
assert!(out.contains("tokens truncated"));
assert!(out.ends_with("XYZ*"));
assert_eq!(original, Some((s.len() as u64).div_ceil(4)));
assert_eq!(original, Some(tok.count(s) as u64));
}
#[test]
fn truncate_middle_prefers_newline_boundaries() {
let tok = Tokenizer::try_default().expect("load tokenizer");
let mut s = String::new();
for i in 1..=20 {
s.push_str(&format!("{i:03}\n"));
@@ -131,50 +165,36 @@ mod tests {
assert!(out.starts_with("001\n002\n003\n004\n"));
assert!(out.contains("tokens truncated"));
assert!(out.ends_with("017\n018\n019\n020\n"));
assert_eq!(tokens, Some(20));
assert_eq!(tokens, Some(tok.count(&s) as u64));
}
#[test]
fn truncate_middle_handles_utf8_content() {
let tok = Tokenizer::try_default().expect("load tokenizer");
let s = "😀😀😀😀😀😀😀😀😀😀\nsecond line with ascii text\n";
let max_bytes = 32;
let (out, tokens) = truncate_middle(s, max_bytes);
assert!(out.contains("tokens truncated"));
assert!(!out.contains('\u{fffd}'));
assert_eq!(tokens, Some((s.len() as u64).div_ceil(4)));
assert_eq!(tokens, Some(tok.count(s) as u64));
}
#[test]
fn truncate_middle_prefers_newline_boundaries_2() {
let tok = Tokenizer::try_default().expect("load tokenizer");
// Build a multi-line string of 20 numbered lines (each "NNN\n").
let mut s = String::new();
for i in 1..=20 {
s.push_str(&format!("{i:03}\n"));
}
// Total length: 20 lines * 4 bytes per line = 80 bytes.
assert_eq!(s.len(), 80);
// Choose a cap that forces truncation while leaving room for
// a few lines on each side after accounting for the marker.
let max_bytes = 64;
// Expect exact output: first 4 lines, marker, last 4 lines, and correct token estimate (80/4 = 20).
assert_eq!(
truncate_middle(&s, max_bytes),
(
r#"001
002
003
004
…12 tokens truncated…
017
018
019
020
"#
.to_string(),
Some(20)
)
);
let (out, total) = truncate_middle(&s, max_bytes);
assert!(out.starts_with("001\n002\n003\n004\n"));
assert!(out.contains("tokens truncated"));
assert!(out.ends_with("017\n018\n019\n020\n"));
assert_eq!(total, Some(tok.count(&s) as u64));
}
}

View File

@@ -22,6 +22,8 @@
//! - `session_manager.rs`: orchestration (approvals, sandboxing, reuse) and request handling.
use std::collections::HashMap;
use std::path::PathBuf;
use std::sync::Arc;
use std::sync::atomic::AtomicI32;
use std::time::Duration;
@@ -45,10 +47,20 @@ pub(crate) const MAX_YIELD_TIME_MS: u64 = 30_000;
pub(crate) const DEFAULT_MAX_OUTPUT_TOKENS: usize = 10_000;
pub(crate) const UNIFIED_EXEC_OUTPUT_MAX_BYTES: usize = 1024 * 1024; // 1 MiB
pub(crate) struct UnifiedExecContext<'a> {
pub session: &'a Session,
pub turn: &'a TurnContext,
pub call_id: &'a str,
pub(crate) struct UnifiedExecContext {
pub session: Arc<Session>,
pub turn: Arc<TurnContext>,
pub call_id: String,
}
impl UnifiedExecContext {
pub fn new(session: Arc<Session>, turn: Arc<TurnContext>, call_id: String) -> Self {
Self {
session,
turn,
call_id,
}
}
}
#[derive(Debug)]
@@ -70,6 +82,7 @@ pub(crate) struct WriteStdinRequest<'a> {
#[derive(Debug, Clone, PartialEq)]
pub(crate) struct UnifiedExecResponse {
pub event_call_id: String,
pub chunk_id: String,
pub wall_time: Duration,
pub output: String,
@@ -78,10 +91,20 @@ pub(crate) struct UnifiedExecResponse {
pub original_token_count: Option<usize>,
}
#[derive(Debug, Default)]
#[derive(Default)]
pub(crate) struct UnifiedExecSessionManager {
next_session_id: AtomicI32,
sessions: Mutex<HashMap<i32, session::UnifiedExecSession>>,
sessions: Mutex<HashMap<i32, SessionEntry>>,
}
struct SessionEntry {
session: session::UnifiedExecSession,
session_ref: Arc<Session>,
turn_ref: Arc<TurnContext>,
call_id: String,
command: String,
cwd: PathBuf,
started_at: tokio::time::Instant,
}
pub(crate) fn clamp_yield_time(yield_time_ms: Option<u64>) -> u64 {
@@ -163,11 +186,8 @@ mod tests {
cmd: &str,
yield_time_ms: Option<u64>,
) -> Result<UnifiedExecResponse, UnifiedExecError> {
let context = UnifiedExecContext {
session,
turn: turn.as_ref(),
call_id: "call",
};
let context =
UnifiedExecContext::new(Arc::clone(session), Arc::clone(turn), "call".to_string());
session
.services

View File

@@ -5,8 +5,13 @@ use tokio::sync::mpsc;
use tokio::time::Duration;
use tokio::time::Instant;
use crate::exec::ExecToolCallOutput;
use crate::exec::StreamOutput;
use crate::exec_env::create_env;
use crate::sandboxing::ExecEnv;
use crate::tools::events::ToolEmitter;
use crate::tools::events::ToolEventCtx;
use crate::tools::events::ToolEventStage;
use crate::tools::orchestrator::ToolOrchestrator;
use crate::tools::runtimes::unified_exec::UnifiedExecRequest as UnifiedExecToolRequest;
use crate::tools::runtimes::unified_exec::UnifiedExecRuntime;
@@ -14,6 +19,7 @@ use crate::tools::sandboxing::ToolCtx;
use super::ExecCommandRequest;
use super::MIN_YIELD_TIME_MS;
use super::SessionEntry;
use super::UnifiedExecContext;
use super::UnifiedExecError;
use super::UnifiedExecResponse;
@@ -30,7 +36,7 @@ impl UnifiedExecSessionManager {
pub(crate) async fn exec_command(
&self,
request: ExecCommandRequest<'_>,
context: &UnifiedExecContext<'_>,
context: &UnifiedExecContext,
) -> Result<UnifiedExecResponse, UnifiedExecError> {
let shell_flag = if request.login { "-lc" } else { "-c" };
let command = vec![
@@ -59,17 +65,36 @@ impl UnifiedExecSessionManager {
let session_id = if session.has_exited() {
None
} else {
Some(self.store_session(session).await)
Some(
self.store_session(session, context, request.command, start)
.await,
)
};
Ok(UnifiedExecResponse {
let response = UnifiedExecResponse {
event_call_id: context.call_id.clone(),
chunk_id,
wall_time,
output,
session_id,
exit_code,
original_token_count,
})
};
// If the command completed during this call, emit an ExecCommandEnd via the emitter.
if response.session_id.is_none() {
let exit = response.exit_code.unwrap_or(-1);
Self::emit_exec_end_from_context(
context,
request.command.to_string(),
response.output.clone(),
exit,
response.wall_time,
)
.await;
}
Ok(response)
}
pub(crate) async fn write_stdin(
@@ -98,37 +123,60 @@ impl UnifiedExecSessionManager {
let (output, original_token_count) = truncate_output_to_tokens(&text, max_tokens);
let chunk_id = generate_chunk_id();
let (session_id, exit_code) = self.refresh_session_state(session_id).await;
let status = self.refresh_session_state(session_id).await;
let (session_id, exit_code, completion_entry, event_call_id) = match status {
SessionStatus::Alive { exit_code, call_id } => {
(Some(session_id), exit_code, None, call_id)
}
SessionStatus::Exited { exit_code, entry } => {
let call_id = entry.call_id.clone();
(None, exit_code, Some(*entry), call_id)
}
SessionStatus::Unknown => {
return Err(UnifiedExecError::UnknownSessionId { session_id });
}
};
Ok(UnifiedExecResponse {
let response = UnifiedExecResponse {
event_call_id,
chunk_id,
wall_time,
output,
session_id,
exit_code,
original_token_count,
})
}
};
async fn refresh_session_state(&self, session_id: i32) -> (Option<i32>, Option<i32>) {
let mut sessions = self.sessions.lock().await;
if !sessions.contains_key(&session_id) {
return (None, None);
if let (Some(exit), Some(entry)) = (response.exit_code, completion_entry) {
let total_duration = Instant::now().saturating_duration_since(entry.started_at);
Self::emit_exec_end_from_entry(entry, response.output.clone(), exit, total_duration)
.await;
}
let has_exited = sessions
.get(&session_id)
.map(UnifiedExecSession::has_exited)
.unwrap_or(false);
let exit_code = sessions
.get(&session_id)
.and_then(UnifiedExecSession::exit_code);
Ok(response)
}
if has_exited {
sessions.remove(&session_id);
(None, exit_code)
async fn refresh_session_state(&self, session_id: i32) -> SessionStatus {
let mut sessions = self.sessions.lock().await;
let Some(entry) = sessions.get(&session_id) else {
return SessionStatus::Unknown;
};
let exit_code = entry.session.exit_code();
if entry.session.has_exited() {
let Some(entry) = sessions.remove(&session_id) else {
return SessionStatus::Unknown;
};
SessionStatus::Exited {
exit_code,
entry: Box::new(entry),
}
} else {
(Some(session_id), exit_code)
SessionStatus::Alive {
exit_code,
call_id: entry.call_id.clone(),
}
}
}
@@ -138,9 +186,9 @@ impl UnifiedExecSessionManager {
) -> Result<(mpsc::Sender<Vec<u8>>, OutputBuffer, Arc<Notify>), UnifiedExecError> {
let sessions = self.sessions.lock().await;
let (output_buffer, output_notify, writer_tx) =
if let Some(session) = sessions.get(&session_id) {
let (buffer, notify) = session.output_handles();
(buffer, notify, session.writer_sender())
if let Some(entry) = sessions.get(&session_id) {
let (buffer, notify) = entry.session.output_handles();
(buffer, notify, entry.session.writer_sender())
} else {
return Err(UnifiedExecError::UnknownSessionId { session_id });
};
@@ -158,14 +206,82 @@ impl UnifiedExecSessionManager {
.map_err(|_| UnifiedExecError::WriteToStdin)
}
async fn store_session(&self, session: UnifiedExecSession) -> i32 {
async fn store_session(
&self,
session: UnifiedExecSession,
context: &UnifiedExecContext,
command: &str,
started_at: Instant,
) -> i32 {
let session_id = self
.next_session_id
.fetch_add(1, std::sync::atomic::Ordering::SeqCst);
self.sessions.lock().await.insert(session_id, session);
let entry = SessionEntry {
session,
session_ref: Arc::clone(&context.session),
turn_ref: Arc::clone(&context.turn),
call_id: context.call_id.clone(),
command: command.to_string(),
cwd: context.turn.cwd.clone(),
started_at,
};
self.sessions.lock().await.insert(session_id, entry);
session_id
}
async fn emit_exec_end_from_entry(
entry: SessionEntry,
aggregated_output: String,
exit_code: i32,
duration: Duration,
) {
let output = ExecToolCallOutput {
exit_code,
stdout: StreamOutput::new(aggregated_output.clone()),
stderr: StreamOutput::new(String::new()),
aggregated_output: StreamOutput::new(aggregated_output),
duration,
timed_out: false,
};
let event_ctx = ToolEventCtx::new(
entry.session_ref.as_ref(),
entry.turn_ref.as_ref(),
&entry.call_id,
None,
);
let emitter = ToolEmitter::unified_exec(entry.command, entry.cwd, true);
emitter
.emit(event_ctx, ToolEventStage::Success(output))
.await;
}
async fn emit_exec_end_from_context(
context: &UnifiedExecContext,
command: String,
aggregated_output: String,
exit_code: i32,
duration: Duration,
) {
let output = ExecToolCallOutput {
exit_code,
stdout: StreamOutput::new(aggregated_output.clone()),
stderr: StreamOutput::new(String::new()),
aggregated_output: StreamOutput::new(aggregated_output),
duration,
timed_out: false,
};
let event_ctx = ToolEventCtx::new(
context.session.as_ref(),
context.turn.as_ref(),
&context.call_id,
None,
);
let emitter = ToolEmitter::unified_exec(command, context.turn.cwd.clone(), true);
emitter
.emit(event_ctx, ToolEventStage::Success(output))
.await;
}
pub(crate) async fn open_session_with_exec_env(
&self,
env: &ExecEnv,
@@ -184,7 +300,7 @@ impl UnifiedExecSessionManager {
pub(super) async fn open_session_with_sandbox(
&self,
command: Vec<String>,
context: &UnifiedExecContext<'_>,
context: &UnifiedExecContext,
) -> Result<UnifiedExecSession, UnifiedExecError> {
let mut orchestrator = ToolOrchestrator::new();
let mut runtime = UnifiedExecRuntime::new(self);
@@ -194,9 +310,9 @@ impl UnifiedExecSessionManager {
create_env(&context.turn.shell_environment_policy),
);
let tool_ctx = ToolCtx {
session: context.session,
turn: context.turn,
call_id: context.call_id.to_string(),
session: context.session.as_ref(),
turn: context.turn.as_ref(),
call_id: context.call_id.clone(),
tool_name: "exec_command".to_string(),
};
orchestrator
@@ -204,7 +320,7 @@ impl UnifiedExecSessionManager {
&mut runtime,
&req,
&tool_ctx,
context.turn,
context.turn.as_ref(),
context.turn.approval_policy,
)
.await
@@ -255,3 +371,15 @@ impl UnifiedExecSessionManager {
collected
}
}
enum SessionStatus {
Alive {
exit_code: Option<i32>,
call_id: String,
},
Exited {
exit_code: Option<i32>,
entry: Box<SessionEntry>,
},
Unknown,
}

View File

@@ -10,6 +10,7 @@ path = "lib.rs"
anyhow = { workspace = true }
assert_cmd = { workspace = true }
codex-core = { workspace = true }
codex-protocol = { workspace = true }
notify = { workspace = true }
regex-lite = { workspace = true }
serde_json = { workspace = true }

View File

@@ -35,6 +35,22 @@ impl ResponseMock {
pub fn requests(&self) -> Vec<ResponsesRequest> {
self.requests.lock().unwrap().clone()
}
/// Returns true if any captured request contains a `function_call` with the
/// provided `call_id`.
pub fn saw_function_call(&self, call_id: &str) -> bool {
self.requests()
.iter()
.any(|req| req.has_function_call(call_id))
}
/// Returns the `output` string for a matching `function_call_output` with
/// the provided `call_id`, searching across all captured requests.
pub fn function_call_output_text(&self, call_id: &str) -> Option<String> {
self.requests()
.iter()
.find_map(|req| req.function_call_output_text(call_id))
}
}
#[derive(Debug, Clone)]
@@ -70,6 +86,28 @@ impl ResponsesRequest {
.unwrap_or_else(|| panic!("function call output {call_id} item not found in request"))
}
/// Returns true if this request's `input` contains a `function_call` with
/// the specified `call_id`.
pub fn has_function_call(&self, call_id: &str) -> bool {
self.input().iter().any(|item| {
item.get("type").and_then(Value::as_str) == Some("function_call")
&& item.get("call_id").and_then(Value::as_str) == Some(call_id)
})
}
/// If present, returns the `output` string of the `function_call_output`
/// entry matching `call_id` in this request's `input`.
pub fn function_call_output_text(&self, call_id: &str) -> Option<String> {
let binding = self.input();
let item = binding.iter().find(|item| {
item.get("type").and_then(Value::as_str) == Some("function_call_output")
&& item.get("call_id").and_then(Value::as_str) == Some(call_id)
})?;
item.get("output")
.and_then(Value::as_str)
.map(str::to_string)
}
pub fn header(&self, name: &str) -> Option<String> {
self.0
.headers

View File

@@ -1,17 +1,30 @@
use std::mem::swap;
use std::path::Path;
use std::path::PathBuf;
use std::sync::Arc;
use anyhow::Result;
use codex_core::CodexAuth;
use codex_core::CodexConversation;
use codex_core::ConversationManager;
use codex_core::ModelProviderInfo;
use codex_core::built_in_model_providers;
use codex_core::config::Config;
use codex_core::features::Feature;
use codex_core::protocol::AskForApproval;
use codex_core::protocol::EventMsg;
use codex_core::protocol::Op;
use codex_core::protocol::SandboxPolicy;
use codex_core::protocol::SessionConfiguredEvent;
use codex_protocol::config_types::ReasoningSummary;
use codex_protocol::user_input::UserInput;
use serde_json::Value;
use tempfile::TempDir;
use wiremock::MockServer;
use crate::load_default_config_for_test;
use crate::responses::start_mock_server;
use crate::wait_for_event;
type ConfigMutator = dyn FnOnce(&mut Config) + Send;
@@ -96,6 +109,12 @@ impl TestCodexBuilder {
mutator(&mut config);
}
if config.include_apply_patch_tool {
config.features.enable(Feature::ApplyPatchFreeform);
} else {
config.features.disable(Feature::ApplyPatchFreeform);
}
Ok((config, cwd))
}
}
@@ -107,6 +126,139 @@ pub struct TestCodex {
pub session_configured: SessionConfiguredEvent,
}
impl TestCodex {
pub fn cwd_path(&self) -> &Path {
self.cwd.path()
}
pub fn workspace_path(&self, rel: impl AsRef<Path>) -> PathBuf {
self.cwd_path().join(rel)
}
pub async fn submit_turn(&self, prompt: &str) -> Result<()> {
self.submit_turn_with_policy(prompt, SandboxPolicy::DangerFullAccess)
.await
}
pub async fn submit_turn_with_policy(
&self,
prompt: &str,
sandbox_policy: SandboxPolicy,
) -> Result<()> {
let session_model = self.session_configured.model.clone();
self.codex
.submit(Op::UserTurn {
items: vec![UserInput::Text {
text: prompt.into(),
}],
final_output_json_schema: None,
cwd: self.cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
wait_for_event(&self.codex, |event| {
matches!(event, EventMsg::TaskComplete(_))
})
.await;
Ok(())
}
}
pub struct TestCodexHarness {
server: MockServer,
test: TestCodex,
}
impl TestCodexHarness {
pub async fn new() -> Result<Self> {
Self::with_builder(test_codex()).await
}
pub async fn with_config(mutator: impl FnOnce(&mut Config) + Send + 'static) -> Result<Self> {
Self::with_builder(test_codex().with_config(mutator)).await
}
pub async fn with_builder(mut builder: TestCodexBuilder) -> Result<Self> {
let server = start_mock_server().await;
let test = builder.build(&server).await?;
Ok(Self { server, test })
}
pub fn server(&self) -> &MockServer {
&self.server
}
pub fn test(&self) -> &TestCodex {
&self.test
}
pub fn cwd(&self) -> &Path {
self.test.cwd_path()
}
pub fn path(&self, rel: impl AsRef<Path>) -> PathBuf {
self.test.workspace_path(rel)
}
pub async fn submit(&self, prompt: &str) -> Result<()> {
self.test.submit_turn(prompt).await
}
pub async fn submit_with_policy(
&self,
prompt: &str,
sandbox_policy: SandboxPolicy,
) -> Result<()> {
self.test
.submit_turn_with_policy(prompt, sandbox_policy)
.await
}
pub async fn request_bodies(&self) -> Vec<Value> {
self.server
.received_requests()
.await
.expect("requests")
.into_iter()
.map(|req| serde_json::from_slice(&req.body).expect("request body json"))
.collect()
}
pub async fn function_call_output_value(&self, call_id: &str) -> Value {
let bodies = self.request_bodies().await;
function_call_output(&bodies, call_id).clone()
}
pub async fn function_call_stdout(&self, call_id: &str) -> String {
self.function_call_output_value(call_id)
.await
.get("output")
.and_then(Value::as_str)
.expect("output string")
.to_string()
}
}
fn function_call_output<'a>(bodies: &'a [Value], call_id: &str) -> &'a Value {
for body in bodies {
if let Some(items) = body.get("input").and_then(Value::as_array) {
for item in items {
if item.get("type").and_then(Value::as_str) == Some("function_call_output")
&& item.get("call_id").and_then(Value::as_str) == Some(call_id)
{
return item;
}
}
}
}
panic!("function_call_output {call_id} not found");
}
pub fn test_codex() -> TestCodexBuilder {
TestCodexBuilder {
config_mutators: vec![],

View File

@@ -1,3 +1,4 @@
use std::sync::Arc;
use std::time::Duration;
use codex_core::protocol::EventMsg;
@@ -5,7 +6,9 @@ use codex_core::protocol::Op;
use codex_protocol::user_input::UserInput;
use core_test_support::responses::ev_completed;
use core_test_support::responses::ev_function_call;
use core_test_support::responses::ev_response_created;
use core_test_support::responses::mount_sse_once;
use core_test_support::responses::mount_sse_sequence;
use core_test_support::responses::sse;
use core_test_support::responses::start_mock_server;
use core_test_support::test_codex::test_codex;
@@ -67,3 +70,98 @@ async fn interrupt_long_running_tool_emits_turn_aborted() {
)
.await;
}
/// After an interrupt we expect the next request to the model to include both
/// the original tool call and an `"aborted"` `function_call_output`. This test
/// exercises the follow-up flow: it sends another user turn, inspects the mock
/// responses server, and ensures the model receives the synthesized abort.
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn interrupt_tool_records_history_entries() {
let command = vec![
"bash".to_string(),
"-lc".to_string(),
"sleep 60".to_string(),
];
let call_id = "call-history";
let args = json!({
"command": command,
"timeout_ms": 60_000
})
.to_string();
let first_body = sse(vec![
ev_response_created("resp-history"),
ev_function_call(call_id, "shell", &args),
ev_completed("resp-history"),
]);
let follow_up_body = sse(vec![
ev_response_created("resp-followup"),
ev_completed("resp-followup"),
]);
let server = start_mock_server().await;
let response_mock = mount_sse_sequence(&server, vec![first_body, follow_up_body]).await;
let fixture = test_codex().build(&server).await.unwrap();
let codex = Arc::clone(&fixture.codex);
let wait_timeout = Duration::from_millis(100);
codex
.submit(Op::UserInput {
items: vec![UserInput::Text {
text: "start history recording".into(),
}],
})
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::ExecCommandBegin(_)),
wait_timeout,
)
.await;
codex.submit(Op::Interrupt).await.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TurnAborted(_)),
wait_timeout,
)
.await;
codex
.submit(Op::UserInput {
items: vec![UserInput::Text {
text: "follow up".into(),
}],
})
.await
.unwrap();
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TaskComplete(_)),
wait_timeout,
)
.await;
let requests = response_mock.requests();
assert!(
requests.len() == 2,
"expected two calls to the responses API, got {}",
requests.len()
);
assert!(
response_mock.saw_function_call(call_id),
"function call not recorded in responses payload"
);
assert_eq!(
response_mock.function_call_output_text(call_id).as_deref(),
Some("aborted"),
"aborted function call output not recorded in responses payload"
);
}

File diff suppressed because it is too large Load Diff

View File

@@ -279,11 +279,6 @@ async fn auto_compact_runs_after_token_limit_hit() {
ev_completed_with_tokens("r2", 330_000),
]);
let sse3 = sse(vec![
ev_assistant_message("m3", AUTO_SUMMARY_TEXT),
ev_completed_with_tokens("r3", 200),
]);
let first_matcher = |req: &wiremock::Request| {
let body = std::str::from_utf8(&req.body).unwrap_or("");
body.contains(FIRST_AUTO_MSG)
@@ -300,12 +295,6 @@ async fn auto_compact_runs_after_token_limit_hit() {
};
mount_sse_once_match(&server, second_matcher, sse2).await;
let third_matcher = |req: &wiremock::Request| {
let body = std::str::from_utf8(&req.body).unwrap_or("");
body.contains("You have exceeded the maximum number of tokens")
};
mount_sse_once_match(&server, third_matcher, sse3).await;
let model_provider = ModelProviderInfo {
base_url: Some(format!("{}/v1", server.uri())),
..built_in_model_providers()["openai"].clone()
@@ -342,69 +331,28 @@ async fn auto_compact_runs_after_token_limit_hit() {
.await
.unwrap();
let error_event = wait_for_event(&codex, |ev| matches!(ev, EventMsg::Error(_))).await;
let EventMsg::Error(error_event) = error_event else {
unreachable!("wait_for_event returned unexpected payload");
};
assert_eq!(error_event.message, "invalid input: input too large");
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
// wait_for_event(&codex, |ev| matches!(ev, EventMsg::TaskComplete(_))).await;
let requests = server.received_requests().await.unwrap();
assert!(
requests.len() >= 3,
"auto compact should add at least a third request, got {}",
requests.len()
assert_eq!(
requests.len(),
2,
"auto compact should reject oversize prompts before issuing another request"
);
let is_auto_compact = |req: &wiremock::Request| {
let saw_compact_prompt = requests.iter().any(|req| {
std::str::from_utf8(&req.body)
.unwrap_or("")
.contains("You have exceeded the maximum number of tokens")
};
let auto_compact_count = requests.iter().filter(|req| is_auto_compact(req)).count();
assert_eq!(
auto_compact_count, 1,
"expected exactly one auto compact request"
);
let auto_compact_index = requests
.iter()
.enumerate()
.find_map(|(idx, req)| is_auto_compact(req).then_some(idx))
.expect("auto compact request missing");
assert_eq!(
auto_compact_index, 2,
"auto compact should add a third request"
);
let body_first = requests[0].body_json::<serde_json::Value>().unwrap();
let body3 = requests[auto_compact_index]
.body_json::<serde_json::Value>()
.unwrap();
let instructions = body3
.get("instructions")
.and_then(|v| v.as_str())
.unwrap_or_default();
let baseline_instructions = body_first
.get("instructions")
.and_then(|v| v.as_str())
.unwrap_or_default()
.to_string();
assert_eq!(
instructions, baseline_instructions,
"auto compact should keep the standard developer instructions",
);
let input3 = body3.get("input").and_then(|v| v.as_array()).unwrap();
let last3 = input3
.last()
.expect("auto compact request should append a user message");
assert_eq!(last3.get("type").and_then(|v| v.as_str()), Some("message"));
assert_eq!(last3.get("role").and_then(|v| v.as_str()), Some("user"));
let last_text = last3
.get("content")
.and_then(|v| v.as_array())
.and_then(|items| items.first())
.and_then(|item| item.get("text"))
.and_then(|text| text.as_str())
.unwrap_or_default();
assert_eq!(
last_text, SUMMARIZATION_PROMPT,
"auto compact should send the summarization prompt as a user message",
});
assert!(
!saw_compact_prompt,
"no auto compact request should be sent when the summarization prompt exceeds the limit"
);
}
@@ -869,7 +817,7 @@ async fn auto_compact_triggers_after_function_call_over_95_percent_usage() {
let server = start_mock_server().await;
let context_window = 100;
let context_window = 20_000;
let limit = context_window * 90 / 100;
let over_limit_tokens = context_window * 95 / 100 + 1;

View File

@@ -0,0 +1,69 @@
use codex_core::protocol::EventMsg;
use codex_core::protocol::Op;
use codex_protocol::user_input::UserInput;
use core_test_support::responses;
use core_test_support::responses::ev_assistant_message;
use core_test_support::responses::ev_completed;
use core_test_support::responses::ev_response_created;
use core_test_support::responses::sse;
use core_test_support::responses::start_mock_server;
use core_test_support::test_codex::test_codex;
use core_test_support::wait_for_event_with_timeout;
use std::sync::Arc;
use std::time::Duration;
use wiremock::matchers::any;
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn input_validation_should_fail_for_too_large_input() {
let server = start_mock_server().await;
let fixture = test_codex().build(&server).await.unwrap();
let codex = Arc::clone(&fixture.codex);
// First: normal message with a mocked assistant response
let first_response = sse(vec![
ev_response_created("resp-1"),
ev_assistant_message("msg-1", "ok"),
ev_completed("resp-1"),
]);
responses::mount_sse_once_match(&server, any(), first_response).await;
codex
.submit(Op::UserInput {
items: vec![UserInput::Text {
text: "hello world".into(),
}],
})
.await
.unwrap();
// Wait for the normal turn to complete before sending the oversized input
let turn_timeout = Duration::from_secs(1);
wait_for_event_with_timeout(
&codex,
|ev| matches!(ev, EventMsg::TaskComplete(_)),
turn_timeout,
)
.await;
// Then: 300k-token message should trigger validation error
let wait_timeout = Duration::from_millis(100);
let input_300_tokens = "token ".repeat(300_000);
codex
.submit(Op::UserInput {
items: vec![UserInput::Text {
text: input_300_tokens,
}],
})
.await
.unwrap();
let error_event =
wait_for_event_with_timeout(&codex, |ev| matches!(ev, EventMsg::Error(_)), wait_timeout)
.await;
let EventMsg::Error(error_event) = error_event else {
unreachable!("wait_for_event_with_timeout returned unexpected payload");
};
assert_eq!(error_event.message, "invalid input: input too large");
}

View File

@@ -3,6 +3,8 @@
#[cfg(not(target_os = "windows"))]
mod abort_tasks;
#[cfg(not(target_os = "windows"))]
mod apply_patch_cli;
#[cfg(not(target_os = "windows"))]
mod approvals;
mod cli_stream;
mod client;
@@ -11,6 +13,7 @@ mod compact_resume_fork;
mod exec;
mod fork_conversation;
mod grep_files;
mod input_validation;
mod items;
mod json_result;
mod list_dir;

View File

@@ -1,5 +1,6 @@
#![allow(clippy::unwrap_used, clippy::expect_used)]
use std::io::Write;
use std::path::Path;
use std::path::PathBuf;
use codex_core::find_conversation_path_by_id_str;
@@ -8,8 +9,8 @@ use uuid::Uuid;
/// Create sessions/YYYY/MM/DD and write a minimal rollout file containing the
/// provided conversation id in the SessionMeta line. Returns the absolute path.
fn write_minimal_rollout_with_id(codex_home: &TempDir, id: Uuid) -> PathBuf {
let sessions = codex_home.path().join("sessions/2024/01/01");
fn write_minimal_rollout_with_id(codex_home: &Path, id: Uuid) -> PathBuf {
let sessions = codex_home.join("sessions/2024/01/01");
std::fs::create_dir_all(&sessions).unwrap();
let file = sessions.join(format!("rollout-2024-01-01T00-00-00-{id}.jsonl"));
@@ -40,7 +41,7 @@ fn write_minimal_rollout_with_id(codex_home: &TempDir, id: Uuid) -> PathBuf {
async fn find_locates_rollout_file_by_id() {
let home = TempDir::new().unwrap();
let id = Uuid::new_v4();
let expected = write_minimal_rollout_with_id(&home, id);
let expected = write_minimal_rollout_with_id(home.path(), id);
let found = find_conversation_path_by_id_str(home.path(), &id.to_string())
.await
@@ -48,3 +49,33 @@ async fn find_locates_rollout_file_by_id() {
assert_eq!(found.unwrap(), expected);
}
#[tokio::test]
async fn find_handles_gitignore_covering_codex_home_directory() {
let repo = TempDir::new().unwrap();
let codex_home = repo.path().join(".codex");
std::fs::create_dir_all(&codex_home).unwrap();
std::fs::write(repo.path().join(".gitignore"), ".codex/**\n").unwrap();
let id = Uuid::new_v4();
let expected = write_minimal_rollout_with_id(&codex_home, id);
let found = find_conversation_path_by_id_str(&codex_home, &id.to_string())
.await
.unwrap();
assert_eq!(found, Some(expected));
}
#[tokio::test]
async fn find_ignores_granular_gitignore_rules() {
let home = TempDir::new().unwrap();
let id = Uuid::new_v4();
let expected = write_minimal_rollout_with_id(home.path(), id);
std::fs::write(home.path().join("sessions/.gitignore"), "*.jsonl\n").unwrap();
let found = find_conversation_path_by_id_str(home.path(), &id.to_string())
.await
.unwrap();
assert_eq!(found, Some(expected));
}

View File

@@ -133,6 +133,247 @@ async fn unified_exec_emits_exec_command_begin_event() -> Result<()> {
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn unified_exec_emits_exec_command_end_event() -> Result<()> {
skip_if_no_network!(Ok(()));
skip_if_sandbox!(Ok(()));
let server = start_mock_server().await;
let mut builder = test_codex().with_config(|config| {
config.use_experimental_unified_exec_tool = true;
config.features.enable(Feature::UnifiedExec);
});
let TestCodex {
codex,
cwd,
session_configured,
..
} = builder.build(&server).await?;
let call_id = "uexec-end-event";
let args = json!({
"cmd": "/bin/echo END-EVENT".to_string(),
"yield_time_ms": 250,
});
let responses = vec![
sse(vec![
ev_response_created("resp-1"),
ev_function_call(call_id, "exec_command", &serde_json::to_string(&args)?),
ev_completed("resp-1"),
]),
sse(vec![
ev_response_created("resp-2"),
ev_assistant_message("msg-1", "finished"),
ev_completed("resp-2"),
]),
];
mount_sse_sequence(&server, responses).await;
let session_model = session_configured.model.clone();
codex
.submit(Op::UserTurn {
items: vec![UserInput::Text {
text: "emit end event".into(),
}],
final_output_json_schema: None,
cwd: cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
let end_event = wait_for_event_match(&codex, |msg| match msg {
EventMsg::ExecCommandEnd(ev) if ev.call_id == call_id => Some(ev.clone()),
_ => None,
})
.await;
assert_eq!(end_event.exit_code, 0);
assert!(
end_event.aggregated_output.contains("END-EVENT"),
"expected aggregated output to contain marker"
);
wait_for_event(&codex, |event| matches!(event, EventMsg::TaskComplete(_))).await;
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn unified_exec_emits_output_delta_for_exec_command() -> Result<()> {
skip_if_no_network!(Ok(()));
skip_if_sandbox!(Ok(()));
let server = start_mock_server().await;
let mut builder = test_codex().with_config(|config| {
config.use_experimental_unified_exec_tool = true;
config.features.enable(Feature::UnifiedExec);
});
let TestCodex {
codex,
cwd,
session_configured,
..
} = builder.build(&server).await?;
let call_id = "uexec-delta-1";
let args = json!({
"cmd": "printf 'HELLO-UEXEC'",
"yield_time_ms": 250,
});
let responses = vec![
sse(vec![
ev_response_created("resp-1"),
ev_function_call(call_id, "exec_command", &serde_json::to_string(&args)?),
ev_completed("resp-1"),
]),
sse(vec![
ev_response_created("resp-2"),
ev_assistant_message("msg-1", "finished"),
ev_completed("resp-2"),
]),
];
mount_sse_sequence(&server, responses).await;
let session_model = session_configured.model.clone();
codex
.submit(Op::UserTurn {
items: vec![UserInput::Text {
text: "emit delta".into(),
}],
final_output_json_schema: None,
cwd: cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
let delta = wait_for_event_match(&codex, |msg| match msg {
EventMsg::ExecCommandOutputDelta(ev) if ev.call_id == call_id => Some(ev.clone()),
_ => None,
})
.await;
let text = String::from_utf8_lossy(&delta.chunk).to_string();
assert!(
text.contains("HELLO-UEXEC"),
"delta chunk missing expected text: {text:?}"
);
wait_for_event(&codex, |event| matches!(event, EventMsg::TaskComplete(_))).await;
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn unified_exec_emits_output_delta_for_write_stdin() -> Result<()> {
skip_if_no_network!(Ok(()));
skip_if_sandbox!(Ok(()));
let server = start_mock_server().await;
let mut builder = test_codex().with_config(|config| {
config.use_experimental_unified_exec_tool = true;
config.features.enable(Feature::UnifiedExec);
});
let TestCodex {
codex,
cwd,
session_configured,
..
} = builder.build(&server).await?;
let open_call_id = "uexec-open";
let open_args = json!({
"cmd": "/bin/bash -i",
"yield_time_ms": 200,
});
let stdin_call_id = "uexec-stdin-delta";
let stdin_args = json!({
"chars": "echo WSTDIN-MARK\\n",
"session_id": 0,
"yield_time_ms": 800,
});
let responses = vec![
sse(vec![
ev_response_created("resp-1"),
ev_function_call(
open_call_id,
"exec_command",
&serde_json::to_string(&open_args)?,
),
ev_completed("resp-1"),
]),
sse(vec![
ev_response_created("resp-2"),
ev_function_call(
stdin_call_id,
"write_stdin",
&serde_json::to_string(&stdin_args)?,
),
ev_completed("resp-2"),
]),
sse(vec![
ev_response_created("resp-3"),
ev_assistant_message("msg-1", "done"),
ev_completed("resp-3"),
]),
];
mount_sse_sequence(&server, responses).await;
let session_model = session_configured.model.clone();
codex
.submit(Op::UserTurn {
items: vec![UserInput::Text {
text: "stdin delta".into(),
}],
final_output_json_schema: None,
cwd: cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
// Expect a delta event corresponding to the write_stdin call.
let delta = wait_for_event_match(&codex, |msg| match msg {
EventMsg::ExecCommandOutputDelta(ev) if ev.call_id == open_call_id => {
let text = String::from_utf8_lossy(&ev.chunk);
if text.contains("WSTDIN-MARK") {
Some(ev.clone())
} else {
None
}
}
_ => None,
})
.await;
let text = String::from_utf8_lossy(&delta.chunk).to_string();
assert!(
text.contains("WSTDIN-MARK"),
"stdin delta chunk missing expected text: {text:?}"
);
wait_for_event(&codex, |event| matches!(event, EventMsg::TaskComplete(_))).await;
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn unified_exec_skips_begin_event_for_empty_input() -> Result<()> {
use tokio::time::Duration;
@@ -516,6 +757,110 @@ async fn write_stdin_returns_exit_metadata_and_clears_session() -> Result<()> {
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn unified_exec_emits_end_event_when_session_dies_via_stdin() -> Result<()> {
skip_if_no_network!(Ok(()));
skip_if_sandbox!(Ok(()));
let server = start_mock_server().await;
let mut builder = test_codex().with_config(|config| {
config.use_experimental_unified_exec_tool = true;
config.features.enable(Feature::UnifiedExec);
});
let TestCodex {
codex,
cwd,
session_configured,
..
} = builder.build(&server).await?;
let start_call_id = "uexec-end-on-exit-start";
let start_args = serde_json::json!({
"cmd": "/bin/cat",
"yield_time_ms": 200,
});
let echo_call_id = "uexec-end-on-exit-echo";
let echo_args = serde_json::json!({
"chars": "bye-END\n",
"session_id": 0,
"yield_time_ms": 300,
});
let exit_call_id = "uexec-end-on-exit";
let exit_args = serde_json::json!({
"chars": "\u{0004}",
"session_id": 0,
"yield_time_ms": 500,
});
let responses = vec![
sse(vec![
ev_response_created("resp-1"),
ev_function_call(
start_call_id,
"exec_command",
&serde_json::to_string(&start_args)?,
),
ev_completed("resp-1"),
]),
sse(vec![
ev_response_created("resp-2"),
ev_function_call(
echo_call_id,
"write_stdin",
&serde_json::to_string(&echo_args)?,
),
ev_completed("resp-2"),
]),
sse(vec![
ev_response_created("resp-3"),
ev_function_call(
exit_call_id,
"write_stdin",
&serde_json::to_string(&exit_args)?,
),
ev_completed("resp-3"),
]),
sse(vec![
ev_response_created("resp-4"),
ev_assistant_message("msg-1", "done"),
ev_completed("resp-4"),
]),
];
mount_sse_sequence(&server, responses).await;
let session_model = session_configured.model.clone();
codex
.submit(Op::UserTurn {
items: vec![UserInput::Text {
text: "end on exit".into(),
}],
final_output_json_schema: None,
cwd: cwd.path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: session_model,
effort: None,
summary: ReasoningSummary::Auto,
})
.await?;
// We expect the ExecCommandEnd event to match the initial exec_command call_id.
let end_event = wait_for_event_match(&codex, |msg| match msg {
EventMsg::ExecCommandEnd(ev) if ev.call_id == start_call_id => Some(ev.clone()),
_ => None,
})
.await;
assert_eq!(end_event.exit_code, 0);
wait_for_event(&codex, |event| matches!(event, EventMsg::TaskComplete(_))).await;
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn unified_exec_reuses_session_via_stdin() -> Result<()> {
skip_if_no_network!(Ok(()));

View File

@@ -105,6 +105,7 @@ pub async fn run_main<T: Reporter>(
threads,
cancel_flag,
compute_indices,
true,
)?;
let match_count = matches.len();
let matches_truncated = total_match_count > match_count;
@@ -121,6 +122,7 @@ pub async fn run_main<T: Reporter>(
/// The worker threads will periodically check `cancel_flag` to see if they
/// should stop processing files.
#[allow(clippy::too_many_arguments)]
pub fn run(
pattern_text: &str,
limit: NonZero<usize>,
@@ -129,6 +131,7 @@ pub fn run(
threads: NonZero<usize>,
cancel_flag: Arc<AtomicBool>,
compute_indices: bool,
respect_gitignore: bool,
) -> anyhow::Result<FileSearchResults> {
let pattern = create_pattern(pattern_text);
// Create one BestMatchesList per worker thread so that each worker can
@@ -157,6 +160,14 @@ pub fn run(
.hidden(false)
// Don't require git to be present to apply to apply git-related ignore rules.
.require_git(false);
if !respect_gitignore {
walk_builder
.git_ignore(false)
.git_global(false)
.git_exclude(false)
.ignore(false)
.parents(false);
}
if !exclude.is_empty() {
let mut override_builder = OverrideBuilder::new(search_directory);

View File

@@ -0,0 +1,35 @@
use schemars::JsonSchema;
use serde::Deserialize;
use serde::Serialize;
use ts_rs::TS;
#[derive(Serialize, Deserialize, Copy, Clone, Debug, PartialEq, Eq, JsonSchema, TS, Default)]
#[serde(rename_all = "lowercase")]
#[ts(rename_all = "lowercase")]
pub enum PlanType {
#[default]
Free,
Plus,
Pro,
Team,
Business,
Enterprise,
Edu,
#[serde(other)]
Unknown,
}
#[derive(Debug, Clone, PartialEq, Deserialize, Serialize, JsonSchema, TS)]
#[serde(tag = "type")]
#[ts(tag = "type")]
pub enum Account {
ApiKey {
api_key: String,
},
#[serde(rename = "chatgpt")]
#[ts(rename = "chatgpt")]
ChatGpt {
email: Option<String>,
plan_type: PlanType,
},
}

View File

@@ -1,3 +1,4 @@
pub mod account;
mod conversation_id;
pub use conversation_id::ConversationId;
pub mod config_types;

View File

@@ -575,9 +575,11 @@ pub struct TokenUsage {
pub total_tokens: i64,
}
#[derive(Debug, Clone, Deserialize, Serialize, JsonSchema, TS)]
#[derive(Debug, Clone, Deserialize, Serialize, JsonSchema, TS, Default)]
pub struct TokenUsageInfo {
/// The total token usage for the session. accumulated from all turns.
pub total_token_usage: TokenUsage,
/// The token usage for the last turn. Received from the API. It's total tokens is the whole window size.
pub last_token_usage: TokenUsage,
#[ts(type = "number | null")]
pub model_context_window: Option<i64>,

View File

@@ -2069,6 +2069,35 @@ mod tests {
}
}
#[test]
fn ascii_prefix_survives_non_ascii_followup() {
use crossterm::event::KeyCode;
use crossterm::event::KeyEvent;
use crossterm::event::KeyModifiers;
let (tx, _rx) = unbounded_channel::<AppEvent>();
let sender = AppEventSender::new(tx);
let mut composer = ChatComposer::new(
true,
sender,
false,
"Ask Codex to do anything".to_string(),
false,
);
let _ = composer.handle_key_event(KeyEvent::new(KeyCode::Char('1'), KeyModifiers::NONE));
assert!(composer.is_in_paste_burst());
let _ = composer.handle_key_event(KeyEvent::new(KeyCode::Char('あ'), KeyModifiers::NONE));
let (result, _) =
composer.handle_key_event(KeyEvent::new(KeyCode::Enter, KeyModifiers::NONE));
match result {
InputResult::Submitted(text) => assert_eq!(text, "1あ"),
_ => panic!("expected Submitted"),
}
}
#[test]
fn handle_paste_small_inserts_text() {
use crossterm::event::KeyCode;

View File

@@ -198,12 +198,15 @@ impl PasteBurst {
/// Before applying modified/non-char input: flush buffered burst immediately.
pub fn flush_before_modified_input(&mut self) -> Option<String> {
if self.is_active() {
self.active = false;
Some(std::mem::take(&mut self.buffer))
} else {
None
if !self.is_active() {
return None;
}
self.active = false;
let mut out = std::mem::take(&mut self.buffer);
if let Some((ch, _at)) = self.pending_first_char.take() {
out.push(ch);
}
Some(out)
}
/// Clear only the timing window and any pending first-char.

View File

@@ -745,9 +745,8 @@ impl ChatWidget {
&ev.call_id,
CommandOutput {
exit_code: ev.exit_code,
stdout: ev.stdout.clone(),
stderr: ev.stderr.clone(),
formatted_output: ev.formatted_output.clone(),
aggregated_output: ev.aggregated_output.clone(),
},
ev.duration,
);

View File

@@ -71,18 +71,26 @@ fn upgrade_event_payload_for_tests(mut payload: serde_json::Value) -> serde_json
&& let Some(m) = msg.as_object_mut()
{
let ty = m.get("type").and_then(|v| v.as_str()).unwrap_or("");
if ty == "exec_command_end" && !m.contains_key("formatted_output") {
if ty == "exec_command_end" {
let stdout = m.get("stdout").and_then(|v| v.as_str()).unwrap_or("");
let stderr = m.get("stderr").and_then(|v| v.as_str()).unwrap_or("");
let formatted = if stderr.is_empty() {
let aggregated = if stderr.is_empty() {
stdout.to_string()
} else {
format!("{stdout}{stderr}")
};
m.insert(
"formatted_output".to_string(),
serde_json::Value::String(formatted),
);
if !m.contains_key("formatted_output") {
m.insert(
"formatted_output".to_string(),
serde_json::Value::String(aggregated.clone()),
);
}
if !m.contains_key("aggregated_output") {
m.insert(
"aggregated_output".to_string(),
serde_json::Value::String(aggregated),
);
}
}
}
payload

View File

@@ -3,11 +3,12 @@ use std::time::Instant;
use codex_protocol::parse_command::ParsedCommand;
#[derive(Clone, Debug)]
#[derive(Clone, Debug, Default)]
pub(crate) struct CommandOutput {
pub(crate) exit_code: i32,
pub(crate) stdout: String,
pub(crate) stderr: String,
/// The aggregated stderr + stdout interleaved.
pub(crate) aggregated_output: String,
/// The formatted output of the command, as seen by the model.
pub(crate) formatted_output: String,
}
@@ -82,9 +83,8 @@ impl ExecCell {
call.duration = Some(elapsed);
call.output = Some(CommandOutput {
exit_code: 1,
stdout: String::new(),
stderr: String::new(),
formatted_output: String::new(),
aggregated_output: String::new(),
});
}
}

View File

@@ -28,7 +28,6 @@ use unicode_width::UnicodeWidthStr;
pub(crate) const TOOL_CALL_MAX_LINES: usize = 5;
pub(crate) struct OutputLinesParams {
pub(crate) only_err: bool,
pub(crate) include_angle_pipe: bool,
pub(crate) include_prefix: bool,
}
@@ -59,22 +58,12 @@ pub(crate) fn output_lines(
params: OutputLinesParams,
) -> OutputLines {
let OutputLinesParams {
only_err,
include_angle_pipe,
include_prefix,
} = params;
let CommandOutput {
exit_code,
stdout,
stderr,
..
aggregated_output, ..
} = match output {
Some(output) if only_err && output.exit_code == 0 => {
return OutputLines {
lines: Vec::new(),
omitted: None,
};
}
Some(output) => output,
None => {
return OutputLines {
@@ -84,7 +73,7 @@ pub(crate) fn output_lines(
}
};
let src = if *exit_code == 0 { stdout } else { stderr };
let src = aggregated_output;
let lines: Vec<&str> = src.lines().collect();
let total = lines.len();
let limit = TOOL_CALL_MAX_LINES;
@@ -398,7 +387,6 @@ impl ExecCell {
let raw_output = output_lines(
Some(output),
OutputLinesParams {
only_err: false,
include_angle_pipe: false,
include_prefix: false,
},

View File

@@ -172,6 +172,7 @@ impl FileSearchManager {
NUM_FILE_SEARCH_THREADS,
cancellation_token.clone(),
compute_indices,
true,
)
.map(|res| res.matches)
.unwrap_or_default();

View File

@@ -1293,12 +1293,10 @@ pub(crate) fn new_patch_apply_failure(stderr: String) -> PlainHistoryCell {
let output = output_lines(
Some(&CommandOutput {
exit_code: 1,
stdout: String::new(),
stderr,
formatted_output: String::new(),
aggregated_output: stderr,
}),
OutputLinesParams {
only_err: true,
include_angle_pipe: true,
include_prefix: true,
},
@@ -1739,16 +1737,7 @@ mod tests {
duration: None,
});
// Mark call complete so markers are ✓
cell.complete_call(
&call_id,
CommandOutput {
exit_code: 0,
stdout: String::new(),
stderr: String::new(),
formatted_output: String::new(),
},
Duration::from_millis(1),
);
cell.complete_call(&call_id, CommandOutput::default(), Duration::from_millis(1));
let lines = cell.display_lines(80);
let rendered = render_lines(&lines).join("\n");
@@ -1770,16 +1759,7 @@ mod tests {
duration: None,
});
// Call 1: Search only
cell.complete_call(
"c1",
CommandOutput {
exit_code: 0,
stdout: String::new(),
stderr: String::new(),
formatted_output: String::new(),
},
Duration::from_millis(1),
);
cell.complete_call("c1", CommandOutput::default(), Duration::from_millis(1));
// Call 2: Read A
cell = cell
.with_added_call(
@@ -1792,16 +1772,7 @@ mod tests {
}],
)
.unwrap();
cell.complete_call(
"c2",
CommandOutput {
exit_code: 0,
stdout: String::new(),
stderr: String::new(),
formatted_output: String::new(),
},
Duration::from_millis(1),
);
cell.complete_call("c2", CommandOutput::default(), Duration::from_millis(1));
// Call 3: Read B
cell = cell
.with_added_call(
@@ -1814,16 +1785,7 @@ mod tests {
}],
)
.unwrap();
cell.complete_call(
"c3",
CommandOutput {
exit_code: 0,
stdout: String::new(),
stderr: String::new(),
formatted_output: String::new(),
},
Duration::from_millis(1),
);
cell.complete_call("c3", CommandOutput::default(), Duration::from_millis(1));
let lines = cell.display_lines(80);
let rendered = render_lines(&lines).join("\n");
@@ -1856,16 +1818,7 @@ mod tests {
start_time: Some(Instant::now()),
duration: None,
});
cell.complete_call(
"c1",
CommandOutput {
exit_code: 0,
stdout: String::new(),
stderr: String::new(),
formatted_output: String::new(),
},
Duration::from_millis(1),
);
cell.complete_call("c1", CommandOutput::default(), Duration::from_millis(1));
let lines = cell.display_lines(80);
let rendered = render_lines(&lines).join("\n");
insta::assert_snapshot!(rendered);
@@ -1885,16 +1838,7 @@ mod tests {
duration: None,
});
// Mark call complete so it renders as "Ran"
cell.complete_call(
&call_id,
CommandOutput {
exit_code: 0,
stdout: String::new(),
stderr: String::new(),
formatted_output: String::new(),
},
Duration::from_millis(1),
);
cell.complete_call(&call_id, CommandOutput::default(), Duration::from_millis(1));
// Small width to force wrapping on both lines
let width: u16 = 28;
@@ -1914,16 +1858,7 @@ mod tests {
start_time: Some(Instant::now()),
duration: None,
});
cell.complete_call(
&call_id,
CommandOutput {
exit_code: 0,
stdout: String::new(),
stderr: String::new(),
formatted_output: String::new(),
},
Duration::from_millis(1),
);
cell.complete_call(&call_id, CommandOutput::default(), Duration::from_millis(1));
// Wide enough that it fits inline
let lines = cell.display_lines(80);
let rendered = render_lines(&lines).join("\n");
@@ -1942,16 +1877,7 @@ mod tests {
start_time: Some(Instant::now()),
duration: None,
});
cell.complete_call(
&call_id,
CommandOutput {
exit_code: 0,
stdout: String::new(),
stderr: String::new(),
formatted_output: String::new(),
},
Duration::from_millis(1),
);
cell.complete_call(&call_id, CommandOutput::default(), Duration::from_millis(1));
let lines = cell.display_lines(24);
let rendered = render_lines(&lines).join("\n");
insta::assert_snapshot!(rendered);
@@ -1969,16 +1895,7 @@ mod tests {
start_time: Some(Instant::now()),
duration: None,
});
cell.complete_call(
&call_id,
CommandOutput {
exit_code: 0,
stdout: String::new(),
stderr: String::new(),
formatted_output: String::new(),
},
Duration::from_millis(1),
);
cell.complete_call(&call_id, CommandOutput::default(), Duration::from_millis(1));
let lines = cell.display_lines(80);
let rendered = render_lines(&lines).join("\n");
insta::assert_snapshot!(rendered);
@@ -1997,16 +1914,7 @@ mod tests {
start_time: Some(Instant::now()),
duration: None,
});
cell.complete_call(
&call_id,
CommandOutput {
exit_code: 0,
stdout: String::new(),
stderr: String::new(),
formatted_output: String::new(),
},
Duration::from_millis(1),
);
cell.complete_call(&call_id, CommandOutput::default(), Duration::from_millis(1));
let lines = cell.display_lines(28);
let rendered = render_lines(&lines).join("\n");
insta::assert_snapshot!(rendered);
@@ -2033,9 +1941,8 @@ mod tests {
&call_id,
CommandOutput {
exit_code: 1,
stdout: String::new(),
stderr,
formatted_output: String::new(),
aggregated_output: stderr,
},
Duration::from_millis(1),
);
@@ -2077,9 +1984,8 @@ mod tests {
&call_id,
CommandOutput {
exit_code: 1,
stdout: String::new(),
stderr,
formatted_output: String::new(),
aggregated_output: stderr,
},
Duration::from_millis(5),
);

View File

@@ -6,6 +6,9 @@ use ratatui::style::Style;
use ratatui::style::Stylize;
use ratatui::text::Span;
#[cfg(target_os = "macos")]
const ALT_PREFIX: &str = "⌥ + ";
#[cfg(not(target_os = "macos"))]
const ALT_PREFIX: &str = "alt + ";
const CTRL_PREFIX: &str = "ctrl + ";
const SHIFT_PREFIX: &str = "shift + ";

View File

@@ -724,8 +724,7 @@ mod tests {
"exec-1",
CommandOutput {
exit_code: 0,
stdout: "src\nREADME.md\n".into(),
stderr: String::new(),
aggregated_output: "src\nREADME.md\n".into(),
formatted_output: "src\nREADME.md\n".into(),
},
Duration::from_millis(420),

View File

@@ -0,0 +1,13 @@
---
source: tui/src/status_indicator_widget.rs
assertion_line: 289
expression: terminal.backend()
---
"• Working (0s • esc to interrupt) "
" "
" ↳ first "
" ↳ second "
" ⌥ + ↑ edit "
" "
" "
" "

View File

@@ -284,6 +284,11 @@ mod tests {
terminal
.draw(|f| w.render_ref(f.area(), f.buffer_mut()))
.expect("draw");
#[cfg(target_os = "macos")]
insta::with_settings!({ snapshot_suffix => "macos" }, {
insta::assert_snapshot!(terminal.backend());
});
#[cfg(not(target_os = "macos"))]
insta::assert_snapshot!(terminal.backend());
}

View File

@@ -55,8 +55,13 @@ impl Tokenizer {
Ok(Self { inner })
}
/// Default to `O200kBase`
pub fn try_default() -> Result<Self, TokenizerError> {
Self::new(EncodingKind::O200kBase)
}
/// Build a tokenizer using an `OpenAI` model name (maps to an encoding).
/// Falls back to the `o200k_base` encoding when the model is unknown.
/// Falls back to the `O200kBase` encoding when the model is unknown.
pub fn for_model(model: &str) -> Result<Self, TokenizerError> {
match tiktoken_rs::get_bpe_from_model(model) {
Ok(inner) => Ok(Self { inner }),
@@ -102,6 +107,12 @@ impl Tokenizer {
}
}
impl fmt::Debug for Tokenizer {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
f.debug_struct("Tokenizer").finish()
}
}
#[cfg(test)]
mod tests {
use super::*;