Compare commits

...

18 Commits

Author SHA1 Message Date
Ahmed Ibrahim
aa28fcfb5c Remove /collab command and make /plan toggle 2026-02-17 09:51:54 -08:00
Charley Cunningham
cab607befb Centralize context update diffing logic (#11807)
## Summary
This PR centralizes model-visible state diffing for turn context updates
into one module, while keeping existing behavior and call sites stable.

### What changed
- Added `core/src/context_manager/updates.rs` with the consolidated diffing
logic for:
  - environment context updates
  - permissions/policy updates
  - collaboration mode updates
  - model-instruction switch updates
  - personality updates
- Added `BuildSettingsUpdateItemsParams` so required dependencies are
passed explicitly.
- Updated `Session::build_settings_update_items` in `core/src/codex.rs`
to delegate to the centralized module.
- Reused the same centralized `personality_message_for` helper from
initial-context assembly to avoid duplicated logic.
- Registered the new module in `core/src/context_manager`.

## Why
This is a minimal, shippable step toward the model-visible-state design:
all state diff decisions for turn-context update items now live in one
place, improving reviewability and reducing drift risk without expanding
scope.

## Behavior
- Intended to be behavior-preserving.
- No protocol/schema changes.
- No call-site behavior changes beyond routing through the new
centralized logic.

## Testing
Ran targeted tests in this worktree:
- `cargo test -p codex-core build_settings_update_items_emits_environment_item_for_network_changes`
- `cargo test -p codex-core collaboration_instructions --test all`

Both passed.

## Codex author
`codex resume 019c540f-3951-7352-a3fa-6f07b834d4ce`
2026-02-17 09:21:44 -08:00
Eric Traut
281b0eae8b Don't allow model_supports_reasoning_summaries to disable reasoning (#11833)
The `model_supports_reasoning_summaries` config option was originally
added so users could enable reasoning for custom models (models that
codex doesn't know about). This is how it was documented in the source,
but its implementation didn't match. It was implemented such that it can
also be used to disable reasoning for models that otherwise support
reasoning. This leads to bad behavior for some reasoning models like
`gpt-5.3-codex`. Diagnosing this is difficult, and it has led to many
support issues.

This PR changes the handling of `model_supports_reasoning_summaries` so
it matches its original documented behavior. If it is set to false, it
is a no-op. That is, it never disables reasoning for models that are
known to support reasoning. It can still be used for its intended
purpose -- to enable reasoning for unknown models.
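
In code, the new semantics reduce to a one-way flag. A minimal sketch with simplified `ModelInfo`/`Config` types (the real ones carry many more fields):

```rust
struct ModelInfo {
    supports_reasoning_summaries: bool,
}

struct Config {
    model_supports_reasoning_summaries: Option<bool>,
}

fn with_config_overrides(mut model: ModelInfo, config: &Config) -> ModelInfo {
    // One-way: Some(true) enables summaries for unknown models;
    // Some(false) and None leave the model's own value untouched.
    if config.model_supports_reasoning_summaries == Some(true) {
        model.supports_reasoning_summaries = true;
    }
    model
}
```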
2026-02-17 07:19:28 -08:00
jif-oai
4ab44e2c5c feat: add --compact mode to just log (#11994)
Summary:
- add a `--compact` flag to the logs client to suppress thread/target
info
- format rows and timestamps differently when compact mode is enabled, so
only the time of day, level, and message remain
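
A rough sketch of the split, with a hypothetical `LogRow` type standing in for the client's actual row format:

```rust
struct LogRow {
    timestamp: String, // e.g. "2026-02-17T14:21:26Z"
    level: String,
    target: String,
    thread: String,
    message: String,
}

fn format_row(row: &LogRow, compact: bool) -> String {
    if compact {
        // Keep only the time-of-day part of the timestamp, the level,
        // and the message; drop thread/target entirely.
        let time = row.timestamp.split('T').nth(1).unwrap_or(&row.timestamp);
        format!("{time} {} {}", row.level, row.message)
    } else {
        format!(
            "{} {} [{}/{}] {}",
            row.timestamp, row.level, row.target, row.thread, row.message
        )
    }
}
```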
2026-02-17 14:21:26 +00:00
jif-oai
31d4bfdde0 feat: add --search to just log (#11995)
Summary
- extend the log client to accept an optional `--search` substring
filter when querying codex-state logs
- propagate the filter through `LogQuery` and apply it in
`push_log_filters` via `INSTR(message, ...)`
- add an integration test that exercises the new search filtering
behavior
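
A minimal sketch of how such a filter could be appended to the query; the surrounding SQL plumbing here is assumed, only the `INSTR(message, ...)` predicate comes from the change itself:

```rust
struct LogQuery {
    // Optional substring filter (assumed field name).
    search: Option<String>,
}

fn push_log_filters(query: &LogQuery, sql: &mut String, params: &mut Vec<String>) {
    if let Some(needle) = &query.search {
        // SQLite's INSTR returns the 1-based position of the substring,
        // or 0 when it is absent.
        sql.push_str(" AND INSTR(message, ?) > 0");
        params.push(needle.clone());
    }
}
```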

Testing
- Not run (not requested)
2026-02-17 14:19:52 +00:00
jif-oai
56cd85cd4b nit: wording multi-agent (#11986) 2026-02-17 11:45:59 +00:00
jif-oai
5ae84197b2 Exit early when session initialization fails (#11908)
Summary
- wait for the initial session startup loop to finish and handle exit
before waiting for the first message in fresh sessions
- propagate AppRunControl::Exit to return immediately when
initialization fails
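
A control-flow sketch of the fix; everything except `AppRunControl::Exit` is an invented stand-in:

```rust
enum AppRunControl {
    Continue,
    Exit,
}

fn run_fresh_session() -> AppRunControl {
    // Wait for the initial session startup loop to finish first...
    if matches!(wait_for_session_init(), AppRunControl::Exit) {
        // ...and propagate Exit immediately instead of blocking on the
        // first user message when initialization failed.
        return AppRunControl::Exit;
    }
    wait_for_first_message()
}

// Stubs standing in for the real startup and message loops.
fn wait_for_session_init() -> AppRunControl { AppRunControl::Continue }
fn wait_for_first_message() -> AppRunControl { AppRunControl::Continue }
```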
2026-02-17 11:22:30 +00:00
Dylan Hurd
fcf16e97a6 fix(ci) Fix shell-tool-mcp.yml (#11969)
## Summary
We're seeing failures for shell-tool-mcp.yml during git checkouts. This
is a quick attempt to unblock releases - we should revisit this build
pipeline since we've hit a number of errors.
2026-02-17 11:13:18 +00:00
jif-oai
77f74a5c17 fix: race in js repl (#11922)
`js_repl_reset` previously raced with in-flight/new `js_repl` executions
because `reset()` could clear `exec_tool_calls` without synchronizing with
`execute()`. In that window, a running exec could lose its per-exec
tool-call context, and subsequent kernel `RunTool` messages would fail
with `js_repl exec context not found`. The fix serializes `reset` and
`execute` on the same `exec_lock`, so `reset` cannot run concurrently with
exec setup/teardown. We also keep the timeout path safe by performing
reset steps inline while `execute()` already holds the lock, avoiding
re-entrant lock acquisition. A regression test now verifies that `reset`
waits for the exec lock and does not clear tool-call state early.
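
A minimal model of the locking fix, with a simplified state type; the real handler holds far more per-exec state:

```rust
use std::collections::HashMap;
use std::sync::Arc;
use tokio::sync::Mutex;

#[derive(Default)]
struct JsReplState {
    // Per-exec tool-call context, keyed by exec id.
    exec_tool_calls: HashMap<u64, String>,
}

#[derive(Default, Clone)]
struct JsRepl {
    // reset() and execute() both take this lock, so a reset can never
    // clear exec_tool_calls while an exec is mid-setup or mid-teardown.
    exec_lock: Arc<Mutex<JsReplState>>,
}

impl JsRepl {
    async fn execute(&self, exec_id: u64) {
        let mut state = self.exec_lock.lock().await;
        state.exec_tool_calls.insert(exec_id, format!("ctx-{exec_id}"));
        // ... drive the exec; on timeout, reset steps run inline here
        // rather than re-acquiring the lock ...
        state.exec_tool_calls.remove(&exec_id);
    }

    async fn reset(&self) {
        // Serialized with execute(): waits for any in-flight exec.
        self.exec_lock.lock().await.exec_tool_calls.clear();
    }
}
```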
2026-02-17 11:06:14 +00:00
jif-oai
b994b52994 Hide /debug slash commands from popup menu (#11974)
Summary
- filter command popup builtins to remove any `/debug*` entries so they
stay usable but are not listed
- added regression tests to ensure the popup hides debug commands while
dispatch still resolves them
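
The hiding itself is a simple prefix filter; a sketch with an assumed command type:

```rust
struct SlashCommand {
    name: &'static str, // e.g. "/plan", "/debug-logs"
}

// /debug* commands stay dispatchable; they are only hidden from the
// popup listing.
fn popup_commands(builtins: &[SlashCommand]) -> Vec<&SlashCommand> {
    builtins
        .iter()
        .filter(|cmd| !cmd.name.starts_with("/debug"))
        .collect()
}
```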
2026-02-17 10:30:17 +00:00
jif-oai
846464e869 fix: js_repl reset hang by clearing exec tool calls without waiting (#11932)
Remove the waiting loop in `reset` so it no longer blocks on potentially
hanging exec tool calls, and add `clear_all_exec_tool_calls_map` to drain
the map and notify waiters so `reset` completes immediately.
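
A sketch of the drain-and-notify shape, using `tokio::sync::Notify` as a stand-in for however waiters are actually woken:

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use tokio::sync::Notify;

struct ExecToolCalls {
    map: Mutex<HashMap<u64, String>>,
    notify: Notify,
}

impl ExecToolCalls {
    /// Drain every pending exec tool call and wake all waiters, so a
    /// reset completes immediately instead of blocking on hung execs.
    fn clear_all_exec_tool_calls_map(&self) {
        self.map.lock().unwrap().clear();
        self.notify.notify_waiters();
    }
}
```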
2026-02-17 08:40:54 +00:00
Dylan Hurd
0fbe10a807 fix(core) exec_policy parsing fixes (#11951)
## Summary
Fixes a few things in our exec_policy handling of prefix_rules:
1. Correctly match redirects during exec_policy parsing. For example, if
you have `prefix_rule(["echo"], decision="allow")`, then `echo hello >
output.txt` should match (see the sketch after this list) - this should
fix #10321.
2. If any existing rule would already match our prefix rule (not just a
prompt rule), drop the suggested rule, since it won't do anything.
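
A toy illustration of the redirect matching in point 1, assuming whitespace tokenization (the real parser is much richer):

```rust
fn prefix_matches(prefix: &[&str], command: &str) -> bool {
    let tokens: Vec<&str> = command.split_whitespace().collect();
    // Only tokens before a redirect operator count toward the prefix.
    let end = tokens
        .iter()
        .position(|t| matches!(*t, ">" | ">>" | "<"))
        .unwrap_or(tokens.len());
    tokens[..end].starts_with(prefix)
}

fn main() {
    // The redirect no longer defeats the `echo` allow rule.
    assert!(prefix_matches(&["echo"], "echo hello > output.txt"));
}
```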


## Testing
- [x] Updated unit tests, added approvals ScenarioSpecs
2026-02-16 23:11:59 -08:00
Fouad Matin
02e9006547 add(core): safety check downgrade warning (#11964)
Add per-turn notice when a request is downgraded to a fallback model due
to cyber safety checks.

**Changes**

- codex-api: Emit a `ServerModel` event based on the `openai-model` response
header and/or response payload (SSE + WebSocket), including when the
model changes mid-stream.
- core: When the server-reported model differs from the requested model,
emit a single per-turn warning explaining the reroute to gpt-5.2 and
directing users to Trusted Access verification and the cyber safety
explainer.
- app-server (v2): Surface these cyber model-routing warnings as
synthetic `userMessage` items with text prefixed by `Warning:` (and
document this behavior).
2026-02-16 22:13:36 -08:00
Eric Traut
08f689843f Fixed screen reader regression in CLI (#11860)
The `tui.animations` switch should gate all animations in the TUI, but a
recent change introduced an animation that bypassed the gate. This makes
it difficult to use the TUI with a screen reader.

This fix addresses #11856
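
The gate reduces to a single config check before choosing animated output; a hedged sketch:

```rust
struct TuiConfig {
    animations: bool, // the `tui.animations` switch
}

// With animations off (e.g. when a screen reader is in use), fall back
// to a single static frame instead of a redraw loop.
fn spinner_frames(config: &TuiConfig) -> &'static [&'static str] {
    if config.animations {
        &["|", "/", "-", "\\"]
    } else {
        &["*"]
    }
}
```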
2026-02-16 18:17:52 -08:00
Fouad Matin
b37555dd75 add(feedback): over-refusal / safety check (#11948)
Add new feedback option for "Over-refusal / safety check"
2026-02-16 16:24:47 -08:00
Dylan Hurd
19afbc35c1 chore(core) rm Feature::RequestRule (#11866)
## Summary
This feature is now reasonably stable; let's remove the flag so we can
simplify our upcoming iterations here.

## Testing 
- [x] Existing tests pass
2026-02-16 22:30:23 +00:00
Matthew Zeng
5b421bba34 [apps] Fix app mention syntax. (#11894)
- [x] Fix app mention syntax.
2026-02-16 22:01:49 +00:00
jif-oai
beb5cb4f48 Rename collab modules to multi agents (#11939)
Summary
- rename the `collab` handlers and UI files to `multi_agents` to match
the new naming
- update module references and specs so the handlers and TUI widgets
consistently use the renamed files
- keep the existing functionality while aligning file and module names
with the multi-agent terminology
2026-02-16 19:05:13 +00:00
49 changed files with 1730 additions and 474 deletions

View File

@@ -408,9 +408,8 @@ jobs:
shell: bash
run: |
set -euo pipefail
git clone --depth 1 https://git.code.sf.net/p/zsh/code /tmp/zsh
git clone https://git.code.sf.net/p/zsh/code /tmp/zsh
cd /tmp/zsh
git fetch --depth 1 origin 77045ef899e53b9598bebc5a41db93a548a40ca6
git checkout 77045ef899e53b9598bebc5a41db93a548a40ca6
git apply "${GITHUB_WORKSPACE}/shell-tool-mcp/patches/zsh-exec-wrapper.patch"
./Util/preconfig
@@ -487,9 +486,8 @@ jobs:
shell: bash
run: |
set -euo pipefail
git clone --depth 1 https://git.code.sf.net/p/zsh/code /tmp/zsh
git clone https://git.code.sf.net/p/zsh/code /tmp/zsh
cd /tmp/zsh
git fetch --depth 1 origin 77045ef899e53b9598bebc5a41db93a548a40ca6
git checkout 77045ef899e53b9598bebc5a41db93a548a40ca6
git apply "${GITHUB_WORKSPACE}/shell-tool-mcp/patches/zsh-exec-wrapper.patch"
./Util/preconfig

View File

@@ -557,7 +557,7 @@ Today both notifications carry an empty `items` array even when item events were
`ThreadItem` is the tagged union carried in turn responses and `item/*` notifications. Currently we support events for the following items:
- `userMessage` `{id, content}` where `content` is a list of user inputs (`text`, `image`, or `localImage`).
- `userMessage` `{id, content}` where `content` is a list of user inputs (`text`, `image`, or `localImage`). Cyber model-routing warnings are surfaced as synthetic `userMessage` items with `text` prefixed by `Warning:`.
- `agentMessage` `{id, text}` containing the accumulated agent reply.
- `plan` `{id, text}` emitted for plan-mode turns; plan text can stream via `item/plan/delta` (experimental).
- `reasoning` `{id, summary, content}` where `summary` holds streamed reasoning summaries (applicable for most OpenAI models) and `content` holds raw reasoning blocks (applicable for e.g. open source models).

View File

@@ -68,6 +68,7 @@ use codex_app_server_protocol::TurnInterruptResponse;
use codex_app_server_protocol::TurnPlanStep;
use codex_app_server_protocol::TurnPlanUpdatedNotification;
use codex_app_server_protocol::TurnStatus;
use codex_app_server_protocol::UserInput as V2UserInput;
use codex_app_server_protocol::build_turns_from_rollout_items;
use codex_core::CodexThread;
use codex_core::parse_command::shlex_join;
@@ -95,6 +96,8 @@ use codex_protocol::request_user_input::RequestUserInputAnswer as CoreRequestUse
use codex_protocol::request_user_input::RequestUserInputResponse as CoreRequestUserInputResponse;
use std::collections::HashMap;
use std::convert::TryFrom;
use std::hash::Hash;
use std::hash::Hasher;
use std::path::PathBuf;
use std::sync::Arc;
use tokio::sync::Mutex;
@@ -122,6 +125,35 @@ pub(crate) async fn apply_bespoke_event_handling(
EventMsg::TurnComplete(_ev) => {
handle_turn_complete(conversation_id, event_turn_id, &outgoing, &thread_state).await;
}
EventMsg::Warning(warning_event) => {
if matches!(api_version, ApiVersion::V2)
&& is_safety_check_downgrade_warning(&warning_event.message)
{
let item = ThreadItem::UserMessage {
id: warning_item_id(&event_turn_id, &warning_event.message),
content: vec![V2UserInput::Text {
text: format!("Warning: {}", warning_event.message),
text_elements: Vec::new(),
}],
};
let started = ItemStartedNotification {
thread_id: conversation_id.to_string(),
turn_id: event_turn_id.clone(),
item: item.clone(),
};
outgoing
.send_server_notification(ServerNotification::ItemStarted(started))
.await;
let completed = ItemCompletedNotification {
thread_id: conversation_id.to_string(),
turn_id: event_turn_id.clone(),
item,
};
outgoing
.send_server_notification(ServerNotification::ItemCompleted(completed))
.await;
}
}
EventMsg::ApplyPatchApprovalRequest(ApplyPatchApprovalRequestEvent {
call_id,
turn_id,
@@ -1286,6 +1318,18 @@ async fn complete_command_execution_item(
.await;
}
fn is_safety_check_downgrade_warning(message: &str) -> bool {
message.contains("Your account was flagged for potentially high-risk cyber activity")
&& message.contains("apply for trusted access: https://chatgpt.com/cyber")
}
fn warning_item_id(turn_id: &str, message: &str) -> String {
let mut hasher = std::collections::hash_map::DefaultHasher::new();
message.hash(&mut hasher);
let digest = hasher.finish();
format!("{turn_id}-warning-{digest:x}")
}
async fn maybe_emit_raw_response_item_completed(
api_version: ApiVersion,
conversation_id: ThreadId,
@@ -2016,6 +2060,18 @@ mod tests {
assert_eq!(item, expected);
}
#[test]
fn safety_check_downgrade_warning_detection_matches_expected_message() {
let warning = "Your account was flagged for potentially high-risk cyber activity and this request was routed to gpt-5.2 as a fallback. To regain access to gpt-5.3-codex, apply for trusted access: https://chatgpt.com/cyber\nLearn more: https://developers.openai.com/codex/concepts/cyber-safety";
assert!(is_safety_check_downgrade_warning(warning));
}
#[test]
fn safety_check_downgrade_warning_detection_ignores_other_warnings() {
let warning = "Model metadata for `mock-model` not found. Defaulting to fallback metadata; this can degrade performance and cause issues.";
assert!(!is_safety_check_downgrade_warning(warning));
}
#[tokio::test]
async fn test_handle_error_records_message() -> Result<()> {
let conversation_id = ThreadId::new();

View File

@@ -477,7 +477,6 @@ fn assert_permissions_message(item: &ResponseItem) {
&SandboxPolicy::DangerFullAccess,
AskForApproval::Never,
&Policy::empty(),
false,
&PathBuf::from("/tmp"),
)
.into_text();

View File

@@ -15,6 +15,7 @@ mod plan_item;
mod rate_limits;
mod request_user_input;
mod review;
mod safety_check_downgrade;
mod skills_list;
mod thread_archive;
mod thread_fork;

View File

@@ -0,0 +1,266 @@
use anyhow::Result;
use app_test_support::McpProcess;
use app_test_support::to_response;
use codex_app_server_protocol::ItemCompletedNotification;
use codex_app_server_protocol::ItemStartedNotification;
use codex_app_server_protocol::JSONRPCNotification;
use codex_app_server_protocol::JSONRPCResponse;
use codex_app_server_protocol::RequestId;
use codex_app_server_protocol::ThreadItem;
use codex_app_server_protocol::ThreadStartParams;
use codex_app_server_protocol::ThreadStartResponse;
use codex_app_server_protocol::TurnStartParams;
use codex_app_server_protocol::TurnStartResponse;
use codex_app_server_protocol::UserInput;
use core_test_support::responses;
use core_test_support::skip_if_no_network;
use pretty_assertions::assert_eq;
use tempfile::TempDir;
use tokio::time::timeout;
const DEFAULT_READ_TIMEOUT: std::time::Duration = std::time::Duration::from_secs(10);
const REQUESTED_MODEL: &str = "gpt-5.1-codex-max";
const SERVER_MODEL: &str = "gpt-5.2-codex";
#[tokio::test]
async fn openai_model_header_mismatch_emits_warning_item_v2() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = responses::start_mock_server().await;
let body = responses::sse(vec![
responses::ev_response_created("resp-1"),
responses::ev_assistant_message("msg-1", "Done"),
responses::ev_completed("resp-1"),
]);
let response = responses::sse_response(body).insert_header("OpenAI-Model", SERVER_MODEL);
let _response_mock = responses::mount_response_once(&server, response).await;
let codex_home = TempDir::new()?;
create_config_toml(codex_home.path(), &server.uri())?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
let thread_req = mcp
.send_thread_start_request(ThreadStartParams {
model: Some(REQUESTED_MODEL.to_string()),
..Default::default()
})
.await?;
let thread_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(thread_req)),
)
.await??;
let ThreadStartResponse { thread, .. } = to_response::<ThreadStartResponse>(thread_resp)?;
let turn_req = mcp
.send_turn_start_request(TurnStartParams {
thread_id: thread.id.clone(),
input: vec![UserInput::Text {
text: "trigger safeguard".to_string(),
text_elements: Vec::new(),
}],
..Default::default()
})
.await?;
let _turn_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(turn_req)),
)
.await??;
let _turn_start: TurnStartResponse = to_response(_turn_resp)?;
let warning_started = timeout(DEFAULT_READ_TIMEOUT, async {
loop {
let notification: JSONRPCNotification = mcp
.read_stream_until_notification_message("item/started")
.await?;
let params = notification.params.expect("item/started params");
let started: ItemStartedNotification =
serde_json::from_value(params).expect("deserialize item/started");
if warning_text_from_item(&started.item).is_some_and(is_cyber_model_warning_text) {
return Ok::<ItemStartedNotification, anyhow::Error>(started);
}
}
})
.await??;
let warning_text =
warning_text_from_item(&warning_started.item).expect("expected warning user message item");
assert!(warning_text.contains("Warning:"));
assert!(warning_text.contains("gpt-5.2 as a fallback"));
assert!(warning_text.contains("regain access to gpt-5.3-codex"));
let warning_completed = timeout(DEFAULT_READ_TIMEOUT, async {
loop {
let notification: JSONRPCNotification = mcp
.read_stream_until_notification_message("item/completed")
.await?;
let params = notification.params.expect("item/completed params");
let completed: ItemCompletedNotification =
serde_json::from_value(params).expect("deserialize item/completed");
if warning_text_from_item(&completed.item).is_some_and(is_cyber_model_warning_text) {
return Ok::<ItemCompletedNotification, anyhow::Error>(completed);
}
}
})
.await??;
assert_eq!(
warning_text_from_item(&warning_completed.item),
warning_text_from_item(&warning_started.item)
);
timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("turn/completed"),
)
.await??;
Ok(())
}
#[tokio::test]
async fn response_model_field_mismatch_emits_warning_item_v2_when_header_matches_requested()
-> Result<()> {
skip_if_no_network!(Ok(()));
let server = responses::start_mock_server().await;
let body = responses::sse(vec![
serde_json::json!({
"type": "response.created",
"response": {
"id": "resp-1",
"model": SERVER_MODEL,
}
}),
responses::ev_assistant_message("msg-1", "Done"),
responses::ev_completed("resp-1"),
]);
let response = responses::sse_response(body).insert_header("OpenAI-Model", REQUESTED_MODEL);
let _response_mock = responses::mount_response_once(&server, response).await;
let codex_home = TempDir::new()?;
create_config_toml(codex_home.path(), &server.uri())?;
let mut mcp = McpProcess::new(codex_home.path()).await?;
timeout(DEFAULT_READ_TIMEOUT, mcp.initialize()).await??;
let thread_req = mcp
.send_thread_start_request(ThreadStartParams {
model: Some(REQUESTED_MODEL.to_string()),
..Default::default()
})
.await?;
let thread_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(thread_req)),
)
.await??;
let ThreadStartResponse { thread, .. } = to_response::<ThreadStartResponse>(thread_resp)?;
let turn_req = mcp
.send_turn_start_request(TurnStartParams {
thread_id: thread.id.clone(),
input: vec![UserInput::Text {
text: "trigger response model check".to_string(),
text_elements: Vec::new(),
}],
..Default::default()
})
.await?;
let turn_resp: JSONRPCResponse = timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_response_message(RequestId::Integer(turn_req)),
)
.await??;
let _turn_start: TurnStartResponse = to_response(turn_resp)?;
let warning_started = timeout(DEFAULT_READ_TIMEOUT, async {
loop {
let notification: JSONRPCNotification = mcp
.read_stream_until_notification_message("item/started")
.await?;
let params = notification.params.expect("item/started params");
let started: ItemStartedNotification =
serde_json::from_value(params).expect("deserialize item/started");
if warning_text_from_item(&started.item).is_some_and(is_cyber_model_warning_text) {
return Ok::<ItemStartedNotification, anyhow::Error>(started);
}
}
})
.await??;
let warning_text =
warning_text_from_item(&warning_started.item).expect("expected warning user message item");
assert!(warning_text.contains("gpt-5.2 as a fallback"));
let warning_completed = timeout(DEFAULT_READ_TIMEOUT, async {
loop {
let notification: JSONRPCNotification = mcp
.read_stream_until_notification_message("item/completed")
.await?;
let params = notification.params.expect("item/completed params");
let completed: ItemCompletedNotification =
serde_json::from_value(params).expect("deserialize item/completed");
if warning_text_from_item(&completed.item).is_some_and(is_cyber_model_warning_text) {
return Ok::<ItemCompletedNotification, anyhow::Error>(completed);
}
}
})
.await??;
assert_eq!(
warning_text_from_item(&warning_completed.item),
warning_text_from_item(&warning_started.item)
);
timeout(
DEFAULT_READ_TIMEOUT,
mcp.read_stream_until_notification_message("turn/completed"),
)
.await??;
Ok(())
}
fn warning_text_from_item(item: &ThreadItem) -> Option<&str> {
let ThreadItem::UserMessage { content, .. } = item else {
return None;
};
content.iter().find_map(|input| match input {
UserInput::Text { text, .. } if text.starts_with("Warning: ") => Some(text.as_str()),
_ => None,
})
}
fn is_cyber_model_warning_text(text: &str) -> bool {
text.contains("flagged for potentially high-risk cyber activity")
&& text.contains("apply for trusted access: https://chatgpt.com/cyber")
}
fn create_config_toml(codex_home: &std::path::Path, server_uri: &str) -> std::io::Result<()> {
let config_toml = codex_home.join("config.toml");
std::fs::write(
config_toml,
format!(
r#"
model = "{REQUESTED_MODEL}"
approval_policy = "never"
sandbox_mode = "read-only"
model_provider = "mock_provider"
[features]
remote_models = false
personality = true
[model_providers.mock_provider]
name = "Mock provider for test"
base_url = "{server_uri}/v1"
wire_api = "responses"
request_max_retries = 0
stream_max_retries = 0
"#
),
)
}

View File

@@ -56,6 +56,9 @@ pub enum ResponseEvent {
Created,
OutputItemDone(ResponseItem),
OutputItemAdded(ResponseItem),
/// Emitted when the server includes `OpenAI-Model` on the stream response.
/// This can differ from the requested model when backend safety routing applies.
ServerModel(String),
/// Emitted when `X-Reasoning-Included: true` is present on the response,
/// meaning the server already accounted for past reasoning tokens and the
/// client should not re-estimate them.

View File

@@ -63,6 +63,9 @@ impl Stream for AggregatedStream {
Poll::Ready(Some(Ok(ResponseEvent::ModelsEtag(etag)))) => {
return Poll::Ready(Some(Ok(ResponseEvent::ModelsEtag(etag))));
}
Poll::Ready(Some(Ok(ResponseEvent::ServerModel(model)))) => {
return Poll::Ready(Some(Ok(ResponseEvent::ServerModel(model))));
}
Poll::Ready(Some(Ok(ResponseEvent::Completed {
response_id,
token_usage,

View File

@@ -163,6 +163,7 @@ impl Drop for WsStream {
const X_CODEX_TURN_STATE_HEADER: &str = "x-codex-turn-state";
const X_MODELS_ETAG_HEADER: &str = "x-models-etag";
const X_REASONING_INCLUDED_HEADER: &str = "x-reasoning-included";
const OPENAI_MODEL_HEADER: &str = "openai-model";
pub struct ResponsesWebsocketConnection {
stream: Arc<Mutex<Option<WsStream>>>,
@@ -170,6 +171,7 @@ pub struct ResponsesWebsocketConnection {
idle_timeout: Duration,
server_reasoning_included: bool,
models_etag: Option<String>,
server_model: Option<String>,
telemetry: Option<Arc<dyn WebsocketTelemetry>>,
}
@@ -179,6 +181,7 @@ impl ResponsesWebsocketConnection {
idle_timeout: Duration,
server_reasoning_included: bool,
models_etag: Option<String>,
server_model: Option<String>,
telemetry: Option<Arc<dyn WebsocketTelemetry>>,
) -> Self {
Self {
@@ -186,6 +189,7 @@ impl ResponsesWebsocketConnection {
idle_timeout,
server_reasoning_included,
models_etag,
server_model,
telemetry,
}
}
@@ -204,12 +208,16 @@ impl ResponsesWebsocketConnection {
let idle_timeout = self.idle_timeout;
let server_reasoning_included = self.server_reasoning_included;
let models_etag = self.models_etag.clone();
let server_model = self.server_model.clone();
let telemetry = self.telemetry.clone();
let request_body = serde_json::to_value(&request).map_err(|err| {
ApiError::Stream(format!("failed to encode websocket request: {err}"))
})?;
tokio::spawn(async move {
if let Some(model) = server_model {
let _ = tx_event.send(Ok(ResponseEvent::ServerModel(model))).await;
}
if let Some(etag) = models_etag {
let _ = tx_event.send(Ok(ResponseEvent::ModelsEtag(etag))).await;
}
@@ -273,13 +281,14 @@ impl<A: AuthProvider> ResponsesWebsocketClient<A> {
merge_request_headers(&self.provider.headers, extra_headers, default_headers);
add_auth_headers_to_header_map(&self.auth, &mut headers);
let (stream, server_reasoning_included, models_etag) =
let (stream, server_reasoning_included, models_etag, server_model) =
connect_websocket(ws_url, headers, turn_state.clone()).await?;
Ok(ResponsesWebsocketConnection::new(
stream,
self.provider.stream_idle_timeout,
server_reasoning_included,
models_etag,
server_model,
telemetry,
))
}
@@ -304,7 +313,7 @@ async fn connect_websocket(
url: Url,
headers: HeaderMap,
turn_state: Option<Arc<OnceLock<String>>>,
) -> Result<(WsStream, bool, Option<String>), ApiError> {
) -> Result<(WsStream, bool, Option<String>, Option<String>), ApiError> {
ensure_rustls_crypto_provider();
info!("connecting to websocket: {url}");
@@ -341,6 +350,11 @@ async fn connect_websocket(
.get(X_MODELS_ETAG_HEADER)
.and_then(|value| value.to_str().ok())
.map(ToString::to_string);
let server_model = response
.headers()
.get(OPENAI_MODEL_HEADER)
.and_then(|value| value.to_str().ok())
.map(ToString::to_string);
if let Some(turn_state) = turn_state
&& let Some(header_value) = response
.headers()
@@ -349,7 +363,12 @@ async fn connect_websocket(
{
let _ = turn_state.set(header_value.to_string());
}
Ok((WsStream::new(stream), reasoning_included, models_etag))
Ok((
WsStream::new(stream),
reasoning_included,
models_etag,
server_model,
))
}
fn websocket_config() -> WebSocketConfig {
@@ -469,6 +488,7 @@ async fn run_websocket_response_stream(
idle_timeout: Duration,
telemetry: Option<Arc<dyn WebsocketTelemetry>>,
) -> Result<(), ApiError> {
let mut last_server_model: Option<String> = None;
let request_text = match serde_json::to_string(&request_body) {
Ok(text) => text,
Err(err) => {
@@ -536,6 +556,14 @@ async fn run_websocket_response_stream(
}
continue;
}
if let Some(model) = event.response_model()
&& last_server_model.as_deref() != Some(model.as_str())
{
let _ = tx_event
.send(Ok(ResponseEvent::ServerModel(model.clone())))
.await;
last_server_model = Some(model);
}
match process_responses_event(event) {
Ok(Some(event)) => {
let is_completed = matches!(event, ResponseEvent::Completed { .. });

View File

@@ -26,6 +26,7 @@ use tracing::debug;
use tracing::trace;
const X_REASONING_INCLUDED_HEADER: &str = "x-reasoning-included";
const OPENAI_MODEL_HEADER: &str = "openai-model";
/// Streams SSE events from an on-disk fixture for tests.
pub fn stream_from_fixture(
@@ -60,6 +61,11 @@ pub fn spawn_response_stream(
.get("X-Models-Etag")
.and_then(|v| v.to_str().ok())
.map(ToString::to_string);
let server_model = stream_response
.headers
.get(OPENAI_MODEL_HEADER)
.and_then(|v| v.to_str().ok())
.map(ToString::to_string);
let reasoning_included = stream_response
.headers
.get(X_REASONING_INCLUDED_HEADER)
@@ -74,6 +80,9 @@ pub fn spawn_response_stream(
}
let (tx_event, rx_event) = mpsc::channel::<Result<ResponseEvent, ApiError>>(1600);
tokio::spawn(async move {
if let Some(model) = server_model {
let _ = tx_event.send(Ok(ResponseEvent::ServerModel(model))).await;
}
for snapshot in rate_limit_snapshots {
let _ = tx_event.send(Ok(ResponseEvent::RateLimits(snapshot))).await;
}
@@ -169,6 +178,41 @@ impl ResponsesStreamEvent {
pub fn kind(&self) -> &str {
&self.kind
}
pub fn response_model(&self) -> Option<String> {
self.response.as_ref().and_then(extract_server_model)
}
}
fn extract_server_model(value: &Value) -> Option<String> {
value
.get("model")
.and_then(json_value_as_string)
.or_else(|| {
value
.get("headers")
.and_then(header_openai_model_value_from_json)
})
}
fn header_openai_model_value_from_json(value: &Value) -> Option<String> {
let headers = value.as_object()?;
headers.iter().find_map(|(name, value)| {
if name.eq_ignore_ascii_case("openai-model") || name.eq_ignore_ascii_case("x-openai-model")
{
json_value_as_string(value)
} else {
None
}
})
}
fn json_value_as_string(value: &Value) -> Option<String> {
match value {
Value::String(value) => Some(value.clone()),
Value::Array(items) => items.first().and_then(json_value_as_string),
_ => None,
}
}
#[derive(Debug)]
@@ -339,6 +383,7 @@ pub async fn process_sse(
) {
let mut stream = stream.eventsource();
let mut response_error: Option<ApiError> = None;
let mut last_server_model: Option<String> = None;
loop {
let start = Instant::now();
@@ -378,6 +423,19 @@ pub async fn process_sse(
}
};
if let Some(model) = event.response_model()
&& last_server_model.as_deref() != Some(model.as_str())
{
if tx_event
.send(Ok(ResponseEvent::ServerModel(model.clone())))
.await
.is_err()
{
return;
}
last_server_model = Some(model);
}
match process_responses_event(event) {
Ok(Some(event)) => {
let is_completed = matches!(event, ResponseEvent::Completed { .. });
@@ -456,9 +514,13 @@ mod tests {
use super::*;
use assert_matches::assert_matches;
use bytes::Bytes;
use codex_client::StreamResponse;
use codex_protocol::models::MessagePhase;
use codex_protocol::models::ResponseItem;
use futures::stream;
use http::HeaderMap;
use http::HeaderValue;
use http::StatusCode;
use pretty_assertions::assert_eq;
use serde_json::json;
use tokio::sync::mpsc;
@@ -870,6 +932,149 @@ mod tests {
}
}
#[tokio::test]
async fn spawn_response_stream_emits_server_model_header() {
let mut headers = HeaderMap::new();
headers.insert(
OPENAI_MODEL_HEADER,
HeaderValue::from_static(CYBER_RESTRICTED_MODEL_FOR_TESTS),
);
let bytes = stream::iter(Vec::<Result<Bytes, TransportError>>::new());
let stream_response = StreamResponse {
status: StatusCode::OK,
headers,
bytes: Box::pin(bytes),
};
let mut stream = spawn_response_stream(stream_response, idle_timeout(), None, None);
let event = stream
.rx_event
.recv()
.await
.expect("expected server model event")
.expect("expected ok event");
match event {
ResponseEvent::ServerModel(model) => {
assert_eq!(model, CYBER_RESTRICTED_MODEL_FOR_TESTS);
}
other => panic!("expected server model event, got {other:?}"),
}
}
#[tokio::test]
async fn process_sse_emits_server_model_from_response_payload() {
let events = run_sse(vec![
json!({
"type": "response.created",
"response": {
"id": "resp-1",
"model": CYBER_RESTRICTED_MODEL_FOR_TESTS
}
}),
json!({
"type": "response.completed",
"response": {
"id": "resp-1",
"model": CYBER_RESTRICTED_MODEL_FOR_TESTS
}
}),
])
.await;
assert_eq!(events.len(), 3);
assert_matches!(
&events[0],
ResponseEvent::ServerModel(model) if model == CYBER_RESTRICTED_MODEL_FOR_TESTS
);
assert_matches!(&events[1], ResponseEvent::Created);
assert_matches!(
&events[2],
ResponseEvent::Completed {
response_id,
token_usage: None,
can_append: false
} if response_id == "resp-1"
);
}
#[tokio::test]
async fn process_sse_emits_server_model_from_response_headers_payload() {
let events = run_sse(vec![
json!({
"type": "response.created",
"response": {
"id": "resp-1",
"headers": {
"OpenAI-Model": CYBER_RESTRICTED_MODEL_FOR_TESTS
}
}
}),
json!({
"type": "response.completed",
"response": {
"id": "resp-1"
}
}),
])
.await;
assert_eq!(events.len(), 3);
assert_matches!(
&events[0],
ResponseEvent::ServerModel(model) if model == CYBER_RESTRICTED_MODEL_FOR_TESTS
);
assert_matches!(&events[1], ResponseEvent::Created);
assert_matches!(
&events[2],
ResponseEvent::Completed {
response_id,
token_usage: None,
can_append: false
} if response_id == "resp-1"
);
}
#[tokio::test]
async fn process_sse_emits_server_model_again_when_response_model_changes() {
let events = run_sse(vec![
json!({
"type": "response.created",
"response": {
"id": "resp-1",
"model": "gpt-5.2-codex"
}
}),
json!({
"type": "response.completed",
"response": {
"id": "resp-1",
"model": "gpt-5.3-codex"
}
}),
])
.await;
assert_eq!(events.len(), 4);
assert_matches!(
&events[0],
ResponseEvent::ServerModel(model) if model == "gpt-5.2-codex"
);
assert_matches!(&events[1], ResponseEvent::Created);
assert_matches!(
&events[2],
ResponseEvent::ServerModel(model) if model == "gpt-5.3-codex"
);
assert_matches!(
&events[3],
ResponseEvent::Completed {
response_id,
token_usage: None,
can_append: false
} if response_id == "resp-1"
);
}
#[test]
fn test_try_parse_retry_after() {
let err = Error {
@@ -909,4 +1114,6 @@ mod tests {
let delay = try_parse_retry_after(&err);
assert_eq!(delay, Some(Duration::from_secs(35)));
}
const CYBER_RESTRICTED_MODEL_FOR_TESTS: &str = "gpt-5.3-codex";
}

View File

@@ -2,6 +2,6 @@ use crate::mcp::CODEX_APPS_MCP_SERVER_NAME;
pub(crate) fn render_apps_section() -> String {
format!(
"## Apps\nApps are mentioned in the prompt in the format `[$app-name](apps://{{connector_id}})`.\nAn app is equivalent to a set of MCP tools within the `{CODEX_APPS_MCP_SERVER_NAME}` MCP.\nWhen you see an app mention, the app's MCP tools are either already provided in `{CODEX_APPS_MCP_SERVER_NAME}`, or do not exist because the user did not install it.\nDo not additionally call list_mcp_resources for apps that are already mentioned."
"## Apps\nApps are mentioned in the prompt in the format `[$app-name](app://{{connector_id}})`.\nAn app is equivalent to a set of MCP tools within the `{CODEX_APPS_MCP_SERVER_NAME}` MCP.\nWhen you see an app mention, the app's MCP tools are either already provided in `{CODEX_APPS_MCP_SERVER_NAME}`, or do not exist because the user did not install it.\nDo not additionally call list_mcp_resources for apps that are already mentioned."
)
}

View File

@@ -279,6 +279,8 @@ pub struct CodexSpawnOk {
pub(crate) const INITIAL_SUBMIT_ID: &str = "";
pub(crate) const SUBMISSION_CHANNEL_CAPACITY: usize = 64;
const CYBER_VERIFY_URL: &str = "https://chatgpt.com/cyber";
const CYBER_SAFETY_URL: &str = "https://developers.openai.com/codex/concepts/cyber-safety";
impl Codex {
/// Spawn a new [`Codex`] and initialize the session.
@@ -1967,120 +1969,6 @@ impl Session {
state.session_configuration.collaboration_mode.clone()
}
fn build_environment_update_item(
&self,
previous: Option<&Arc<TurnContext>>,
next: &TurnContext,
) -> Option<ResponseItem> {
let prev = previous?;
let shell = self.user_shell();
let prev_context = EnvironmentContext::from_turn_context(prev.as_ref(), shell.as_ref());
let next_context = EnvironmentContext::from_turn_context(next, shell.as_ref());
if prev_context.equals_except_shell(&next_context) {
return None;
}
Some(ResponseItem::from(EnvironmentContext::diff(
prev.as_ref(),
next,
shell.as_ref(),
)))
}
fn build_permissions_update_item(
&self,
previous: Option<&Arc<TurnContext>>,
next: &TurnContext,
) -> Option<ResponseItem> {
let prev = previous?;
if prev.sandbox_policy == next.sandbox_policy
&& prev.approval_policy == next.approval_policy
{
return None;
}
Some(
DeveloperInstructions::from_policy(
&next.sandbox_policy,
next.approval_policy,
self.services.exec_policy.current().as_ref(),
self.features.enabled(Feature::RequestRule),
&next.cwd,
)
.into(),
)
}
fn build_personality_update_item(
&self,
previous: Option<&Arc<TurnContext>>,
next: &TurnContext,
) -> Option<ResponseItem> {
if !self.features.enabled(Feature::Personality) {
return None;
}
let previous = previous?;
if next.model_info.slug != previous.model_info.slug {
return None;
}
// if a personality is specified and it's different from the previous one, build a personality update item
if let Some(personality) = next.personality
&& next.personality != previous.personality
{
let model_info = &next.model_info;
let personality_message = Self::personality_message_for(model_info, personality);
personality_message.map(|personality_message| {
DeveloperInstructions::personality_spec_message(personality_message).into()
})
} else {
None
}
}
fn personality_message_for(model_info: &ModelInfo, personality: Personality) -> Option<String> {
model_info
.model_messages
.as_ref()
.and_then(|spec| spec.get_personality_message(Some(personality)))
.filter(|message| !message.is_empty())
}
fn build_collaboration_mode_update_item(
&self,
previous: Option<&Arc<TurnContext>>,
next: &TurnContext,
) -> Option<ResponseItem> {
let prev = previous?;
if prev.collaboration_mode != next.collaboration_mode {
// If the next mode has empty developer instructions, this returns None and we emit no
// update, so prior collaboration instructions remain in the prompt history.
Some(DeveloperInstructions::from_collaboration_mode(&next.collaboration_mode)?.into())
} else {
None
}
}
fn build_model_instructions_update_item(
&self,
previous: Option<&Arc<TurnContext>>,
resumed_model: Option<&str>,
next: &TurnContext,
) -> Option<ResponseItem> {
let previous_model =
resumed_model.or_else(|| previous.map(|prev| prev.model_info.slug.as_str()))?;
if previous_model == next.model_info.slug {
return None;
}
let model_instructions = next.model_info.get_model_instructions(next.personality);
if model_instructions.is_empty() {
return None;
}
Some(DeveloperInstructions::model_switch_message(model_instructions).into())
}
pub(crate) fn is_model_switch_developer_message(item: &ResponseItem) -> bool {
let ResponseItem::Message { role, content, .. } = item else {
return false;
@@ -2093,42 +1981,26 @@ impl Session {
)
})
}
fn build_settings_update_items(
&self,
previous_context: Option<&Arc<TurnContext>>,
resumed_model: Option<&str>,
current_context: &TurnContext,
) -> Vec<ResponseItem> {
let mut update_items = Vec::new();
if let Some(env_item) =
self.build_environment_update_item(previous_context, current_context)
{
update_items.push(env_item);
}
if let Some(permissions_item) =
self.build_permissions_update_item(previous_context, current_context)
{
update_items.push(permissions_item);
}
if let Some(collaboration_mode_item) =
self.build_collaboration_mode_update_item(previous_context, current_context)
{
update_items.push(collaboration_mode_item);
}
if let Some(model_instructions_item) = self.build_model_instructions_update_item(
previous_context,
// TODO: Make context updates a pure diff of persisted previous/current TurnContextItem
// state so replay/backtracking is deterministic. Runtime inputs that affect model-visible
// context (shell, exec policy, feature gates, resumed model bridge) should be persisted
// state or explicit non-state replay events.
let shell = self.user_shell();
let exec_policy = self.services.exec_policy.current();
crate::context_manager::updates::build_settings_update_items(
previous_context.map(Arc::as_ref),
resumed_model,
current_context,
) {
update_items.push(model_instructions_item);
}
if let Some(personality_item) =
self.build_personality_update_item(previous_context, current_context)
{
update_items.push(personality_item);
}
update_items
shell.as_ref(),
exec_policy.as_ref(),
self.features.enabled(Feature::Personality),
)
}
/// Persist the event to rollout and send it to clients.
@@ -2561,6 +2433,35 @@ impl Session {
self.record_conversation_items(ctx, &[item]).await;
}
async fn maybe_warn_on_server_model_mismatch(
self: &Arc<Self>,
turn_context: &Arc<TurnContext>,
server_model: String,
) -> bool {
let requested_model = turn_context.model_info.slug.as_str();
if server_model == requested_model {
info!("server reported model {server_model} (matches requested model)");
return false;
}
warn!("server reported model {server_model} while requested model was {requested_model}");
let warning_message = format!(
"Your account was flagged for potentially high-risk cyber activity and this request was routed to gpt-5.2 as a fallback. To regain access to gpt-5.3-codex, apply for trusted access: {CYBER_VERIFY_URL} or learn more: {CYBER_SAFETY_URL}"
);
self.send_event(
turn_context,
EventMsg::Warning(WarningEvent {
message: warning_message.clone(),
}),
)
.await;
self.record_model_warning(warning_message, turn_context)
.await;
true
}
pub(crate) async fn replace_history(&self, items: Vec<ResponseItem>) {
let mut state = self.state.lock().await;
state.replace_history(items);
@@ -2624,7 +2525,6 @@ impl Session {
&turn_context.sandbox_policy,
turn_context.approval_policy,
self.services.exec_policy.current().as_ref(),
self.features.enabled(Feature::RequestRule),
&turn_context.cwd,
)
.into(),
@@ -2660,7 +2560,10 @@ impl Session {
&& base_instructions == model_info.get_model_instructions(Some(personality));
if !has_baked_personality
&& let Some(personality_message) =
Self::personality_message_for(&model_info, personality)
crate::context_manager::updates::personality_message_for(
&model_info,
personality,
)
{
items.push(
DeveloperInstructions::personality_spec_message(personality_message).into(),
@@ -4437,6 +4340,7 @@ pub(crate) async fn run_turn(
// Although from the perspective of codex.rs, TurnDiffTracker has the lifecycle of a Task which contains
// many turns, from the perspective of the user, it is a single turn.
let turn_diff_tracker = Arc::new(tokio::sync::Mutex::new(TurnDiffTracker::new()));
let mut server_model_warning_emitted_for_turn = false;
// `ModelClientSession` is turn-scoped and caches WebSocket + sticky routing state, so we reuse
// one instance across retries within this turn.
@@ -4499,6 +4403,7 @@ pub(crate) async fn run_turn(
sampling_request_input,
&explicitly_enabled_connectors,
skills_outcome.as_ref(),
&mut server_model_warning_emitted_for_turn,
cancellation_token.child_token(),
)
.await
@@ -4872,6 +4777,7 @@ async fn run_sampling_request(
input: Vec<ResponseItem>,
explicitly_enabled_connectors: &HashSet<String>,
skills_outcome: Option<&SkillLoadOutcome>,
server_model_warning_emitted_for_turn: &mut bool,
cancellation_token: CancellationToken,
) -> CodexResult<SamplingRequestResult> {
let router = built_tools(
@@ -4908,6 +4814,7 @@ async fn run_sampling_request(
client_session,
turn_metadata_header,
Arc::clone(&turn_diff_tracker),
server_model_warning_emitted_for_turn,
&prompt,
cancellation_token.child_token(),
)
@@ -5476,6 +5383,7 @@ async fn try_run_sampling_request(
client_session: &mut ModelClientSession,
turn_metadata_header: Option<&str>,
turn_diff_tracker: SharedTurnDiffTracker,
server_model_warning_emitted_for_turn: &mut bool,
prompt: &Prompt,
cancellation_token: CancellationToken,
) -> CodexResult<SamplingRequestResult> {
@@ -5618,6 +5526,15 @@ async fn try_run_sampling_request(
active_item = Some(turn_item);
}
}
ResponseEvent::ServerModel(server_model) => {
if !*server_model_warning_emitted_for_turn
&& sess
.maybe_warn_on_server_model_mismatch(&turn_context, server_model)
.await
{
*server_model_warning_emitted_for_turn = true;
}
}
ResponseEvent::ServerReasoningIncluded(included) => {
sess.set_server_reasoning_included(included).await;
}

View File

@@ -1,5 +1,6 @@
mod history;
mod normalize;
pub(crate) mod updates;
pub(crate) use history::ContextManager;
pub(crate) use history::TotalTokenUsageBreakdown;

View File

@@ -0,0 +1,148 @@
use crate::codex::TurnContext;
use crate::environment_context::EnvironmentContext;
use crate::shell::Shell;
use codex_execpolicy::Policy;
use codex_protocol::config_types::Personality;
use codex_protocol::models::DeveloperInstructions;
use codex_protocol::models::ResponseItem;
use codex_protocol::openai_models::ModelInfo;
fn build_environment_update_item(
previous: Option<&TurnContext>,
next: &TurnContext,
shell: &Shell,
) -> Option<ResponseItem> {
let prev = previous?;
let prev_context = EnvironmentContext::from_turn_context(prev, shell);
let next_context = EnvironmentContext::from_turn_context(next, shell);
if prev_context.equals_except_shell(&next_context) {
return None;
}
Some(ResponseItem::from(EnvironmentContext::diff(
prev, next, shell,
)))
}
fn build_permissions_update_item(
previous: Option<&TurnContext>,
next: &TurnContext,
exec_policy: &Policy,
) -> Option<ResponseItem> {
let prev = previous?;
if prev.sandbox_policy == next.sandbox_policy && prev.approval_policy == next.approval_policy {
return None;
}
Some(
DeveloperInstructions::from_policy(
&next.sandbox_policy,
next.approval_policy,
exec_policy,
&next.cwd,
)
.into(),
)
}
fn build_collaboration_mode_update_item(
previous: Option<&TurnContext>,
next: &TurnContext,
) -> Option<ResponseItem> {
let prev = previous?;
if prev.collaboration_mode != next.collaboration_mode {
// If the next mode has empty developer instructions, this returns None and we emit no
// update, so prior collaboration instructions remain in the prompt history.
Some(DeveloperInstructions::from_collaboration_mode(&next.collaboration_mode)?.into())
} else {
None
}
}
fn build_personality_update_item(
previous: Option<&TurnContext>,
next: &TurnContext,
personality_feature_enabled: bool,
) -> Option<ResponseItem> {
if !personality_feature_enabled {
return None;
}
let previous = previous?;
if next.model_info.slug != previous.model_info.slug {
return None;
}
if let Some(personality) = next.personality
&& next.personality != previous.personality
{
let model_info = &next.model_info;
let personality_message = personality_message_for(model_info, personality);
personality_message
.map(|message| DeveloperInstructions::personality_spec_message(message).into())
} else {
None
}
}
pub(crate) fn personality_message_for(
model_info: &ModelInfo,
personality: Personality,
) -> Option<String> {
model_info
.model_messages
.as_ref()
.and_then(|spec| spec.get_personality_message(Some(personality)))
.filter(|message| !message.is_empty())
}
pub(crate) fn build_model_instructions_update_item(
previous: Option<&TurnContext>,
resumed_model: Option<&str>,
next: &TurnContext,
) -> Option<ResponseItem> {
let previous_model =
resumed_model.or_else(|| previous.map(|prev| prev.model_info.slug.as_str()))?;
if previous_model == next.model_info.slug {
return None;
}
let model_instructions = next.model_info.get_model_instructions(next.personality);
if model_instructions.is_empty() {
return None;
}
Some(DeveloperInstructions::model_switch_message(model_instructions).into())
}
pub(crate) fn build_settings_update_items(
previous: Option<&TurnContext>,
resumed_model: Option<&str>,
next: &TurnContext,
shell: &Shell,
exec_policy: &Policy,
personality_feature_enabled: bool,
) -> Vec<ResponseItem> {
let mut update_items = Vec::new();
if let Some(env_item) = build_environment_update_item(previous, next, shell) {
update_items.push(env_item);
}
if let Some(permissions_item) = build_permissions_update_item(previous, next, exec_policy) {
update_items.push(permissions_item);
}
if let Some(collaboration_mode_item) = build_collaboration_mode_update_item(previous, next) {
update_items.push(collaboration_mode_item);
}
if let Some(model_instructions_item) =
build_model_instructions_update_item(previous, resumed_model, next)
{
update_items.push(model_instructions_item);
}
if let Some(personality_item) =
build_personality_update_item(previous, next, personality_feature_enabled)
{
update_items.push(personality_item);
}
update_items
}

View File

@@ -170,23 +170,26 @@ impl ExecPolicyManager {
prefix_rule,
} = req;
let exec_policy = self.current();
let (commands, used_heredoc_fallback) = commands_for_exec_policy(command);
let (commands, used_complex_parsing) = commands_for_exec_policy(command);
// Keep heredoc prefix parsing for rule evaluation so existing
// allow/prompt/forbidden rules still apply, but avoid auto-derived
// amendments when only the heredoc fallback parser matched.
let auto_amendment_allowed = !used_heredoc_fallback;
let auto_amendment_allowed = !used_complex_parsing;
let exec_policy_fallback = |cmd: &[String]| {
render_decision_for_unmatched_command(
approval_policy,
sandbox_policy,
cmd,
sandbox_permissions,
used_complex_parsing,
)
};
let evaluation = exec_policy.check_multiple(commands.iter(), &exec_policy_fallback);
let requested_amendment =
derive_requested_execpolicy_amendment(prefix_rule.as_ref(), &evaluation.matched_rules);
let requested_amendment = derive_requested_execpolicy_amendment_from_prefix_rule(
prefix_rule.as_ref(),
&evaluation.matched_rules,
);
match evaluation.decision {
Decision::Forbidden => ExecApprovalRequirement::Forbidden {
@@ -406,8 +409,9 @@ pub fn render_decision_for_unmatched_command(
sandbox_policy: &SandboxPolicy,
command: &[String],
sandbox_permissions: SandboxPermissions,
used_complex_parsing: bool,
) -> Decision {
if is_known_safe_command(command) {
if is_known_safe_command(command) && !used_complex_parsing {
return Decision::Allow;
}
@@ -534,7 +538,7 @@ fn try_derive_execpolicy_amendment_for_allow_rules(
})
}
fn derive_requested_execpolicy_amendment(
fn derive_requested_execpolicy_amendment_from_prefix_rule(
prefix_rule: Option<&Vec<String>>,
matched_rules: &[RuleMatch],
) -> Option<ExecPolicyAmendment> {
@@ -552,10 +556,8 @@ fn derive_requested_execpolicy_amendment(
return None;
}
if matched_rules
.iter()
.any(|rule_match| is_policy_match(rule_match) && rule_match.decision() == Decision::Prompt)
{
// if any policy rule already matches, don't suggest an additional rule that might conflict or not apply
if matched_rules.iter().any(is_policy_match) {
return None;
}
@@ -686,8 +688,6 @@ mod tests {
use crate::config_loader::ConfigLayerStack;
use crate::config_loader::ConfigRequirements;
use crate::config_loader::ConfigRequirementsToml;
use crate::features::Feature;
use crate::features::Features;
use codex_app_server_protocol::ConfigLayerSource;
use codex_protocol::protocol::AskForApproval;
use codex_protocol::protocol::SandboxPolicy;
@@ -1281,8 +1281,6 @@ prefix_rule(
"cargo-insta".to_string(),
];
let manager = ExecPolicyManager::default();
let mut features = Features::with_defaults();
features.enable(Feature::RequestRule);
let requirement = manager
.create_exec_approval_requirement_for_command(ExecApprovalRequest {
@@ -1564,14 +1562,17 @@ prefix_rule(
#[test]
fn derive_requested_execpolicy_amendment_returns_none_for_missing_prefix_rule() {
assert_eq!(None, derive_requested_execpolicy_amendment(None, &[]));
assert_eq!(
None,
derive_requested_execpolicy_amendment_from_prefix_rule(None, &[])
);
}
#[test]
fn derive_requested_execpolicy_amendment_returns_none_for_empty_prefix_rule() {
assert_eq!(
None,
derive_requested_execpolicy_amendment(Some(&Vec::new()), &[])
derive_requested_execpolicy_amendment_from_prefix_rule(Some(&Vec::new()), &[])
);
}
@@ -1579,7 +1580,7 @@ prefix_rule(
fn derive_requested_execpolicy_amendment_returns_none_for_exact_banned_prefix_rule() {
assert_eq!(
None,
derive_requested_execpolicy_amendment(
derive_requested_execpolicy_amendment_from_prefix_rule(
Some(&vec!["python".to_string(), "-c".to_string()]),
&[],
)
@@ -1598,7 +1599,7 @@ prefix_rule(
] {
assert_eq!(
None,
derive_requested_execpolicy_amendment(Some(&prefix_rule), &[])
derive_requested_execpolicy_amendment_from_prefix_rule(Some(&prefix_rule), &[])
);
}
}
@@ -1624,7 +1625,7 @@ prefix_rule(
] {
assert_eq!(
None,
derive_requested_execpolicy_amendment(Some(&prefix_rule), &[])
derive_requested_execpolicy_amendment_from_prefix_rule(Some(&prefix_rule), &[])
);
}
}
@@ -1639,22 +1640,52 @@ prefix_rule(
assert_eq!(
Some(ExecPolicyAmendment::new(prefix_rule.clone())),
derive_requested_execpolicy_amendment(Some(&prefix_rule), &[])
derive_requested_execpolicy_amendment_from_prefix_rule(Some(&prefix_rule), &[])
);
}
#[test]
fn derive_requested_execpolicy_amendment_returns_none_when_policy_prompt_matches() {
fn derive_requested_execpolicy_amendment_returns_none_when_policy_matches() {
let prefix_rule = vec!["cargo".to_string(), "build".to_string()];
let matched_rules = vec![RuleMatch::PrefixRuleMatch {
let matched_rules_prompt = vec![RuleMatch::PrefixRuleMatch {
matched_prefix: vec!["cargo".to_string()],
decision: Decision::Prompt,
justification: None,
}];
assert_eq!(
None,
derive_requested_execpolicy_amendment(Some(&prefix_rule), &matched_rules)
derive_requested_execpolicy_amendment_from_prefix_rule(
Some(&prefix_rule),
&matched_rules_prompt
),
"should return none when prompt policy matches"
);
let matched_rules_allow = vec![RuleMatch::PrefixRuleMatch {
matched_prefix: vec!["cargo".to_string()],
decision: Decision::Allow,
justification: None,
}];
assert_eq!(
None,
derive_requested_execpolicy_amendment_from_prefix_rule(
Some(&prefix_rule),
&matched_rules_allow
),
"should return none when prompt policy matches"
);
let matched_rules_forbidden = vec![RuleMatch::PrefixRuleMatch {
matched_prefix: vec!["cargo".to_string()],
decision: Decision::Forbidden,
justification: None,
}];
assert_eq!(
None,
derive_requested_execpolicy_amendment_from_prefix_rule(
Some(&prefix_rule),
&matched_rules_forbidden
),
"should return none when prompt policy matches"
);
}

View File

@@ -511,8 +511,8 @@ pub const FEATURES: &[FeatureSpec] = &[
FeatureSpec {
id: Feature::RequestRule,
key: "request_rule",
stage: Stage::Stable,
default_enabled: true,
stage: Stage::Removed,
default_enabled: false,
},
FeatureSpec {
id: Feature::WindowsSandbox,
@@ -554,9 +554,9 @@ pub const FEATURES: &[FeatureSpec] = &[
id: Feature::Collab,
key: "multi_agent",
stage: Stage::Experimental {
name: "Sub-agents",
name: "Multi-agents",
menu_description: "Ask Codex to spawn multiple agents to parallelize the work and win in efficiency.",
announcement: "NEW: Sub-agents can now be spawned by Codex. Enable in /experimental and restart Codex!",
announcement: "NEW: Multi-agents can now be spawned by Codex. Enable in /experimental and restart Codex!",
},
default_enabled: false,
},

View File

@@ -20,8 +20,10 @@ const LOCAL_PRAGMATIC_TEMPLATE: &str = "You are a deeply pragmatic, effective so
const PERSONALITY_PLACEHOLDER: &str = "{{ personality }}";
pub(crate) fn with_config_overrides(mut model: ModelInfo, config: &Config) -> ModelInfo {
if let Some(supports_reasoning_summaries) = config.model_supports_reasoning_summaries {
model.supports_reasoning_summaries = supports_reasoning_summaries;
if let Some(supports_reasoning_summaries) = config.model_supports_reasoning_summaries
&& supports_reasoning_summaries
{
model.supports_reasoning_summaries = true;
}
if let Some(context_window) = config.model_context_window {
model.context_window = Some(context_window);
@@ -100,3 +102,46 @@ fn local_personality_messages_for_slug(slug: &str) -> Option<ModelMessages> {
_ => None,
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::config::test_config;
use pretty_assertions::assert_eq;
#[test]
fn reasoning_summaries_override_true_enables_support() {
let model = model_info_from_slug("unknown-model");
let mut config = test_config();
config.model_supports_reasoning_summaries = Some(true);
let updated = with_config_overrides(model.clone(), &config);
let mut expected = model;
expected.supports_reasoning_summaries = true;
assert_eq!(updated, expected);
}
#[test]
fn reasoning_summaries_override_false_does_not_disable_support() {
let mut model = model_info_from_slug("unknown-model");
model.supports_reasoning_summaries = true;
let mut config = test_config();
config.model_supports_reasoning_summaries = Some(false);
let updated = with_config_overrides(model.clone(), &config);
assert_eq!(updated, model);
}
#[test]
fn reasoning_summaries_override_false_is_noop_when_model_is_false() {
let model = model_info_from_slug("unknown-model");
let mut config = test_config();
config.model_supports_reasoning_summaries = Some(false);
let updated = with_config_overrides(model.clone(), &config);
assert_eq!(updated, model);
}
}

View File

@@ -1,11 +1,11 @@
pub mod apply_patch;
pub(crate) mod collab;
mod dynamic;
mod grep_files;
mod js_repl;
mod list_dir;
mod mcp;
mod mcp_resource;
pub(crate) mod multi_agents;
mod plan;
mod read_file;
mod request_user_input;
@@ -20,7 +20,6 @@ use serde::Deserialize;
use crate::function_tool::FunctionCallError;
pub use apply_patch::ApplyPatchHandler;
pub use collab::CollabHandler;
pub use dynamic::DynamicToolHandler;
pub use grep_files::GrepFilesHandler;
pub use js_repl::JsReplHandler;
@@ -28,6 +27,7 @@ pub use js_repl::JsReplResetHandler;
pub use list_dir::ListDirHandler;
pub use mcp::McpHandler;
pub use mcp_resource::McpResourceHandler;
pub use multi_agents::MultiAgentHandler;
pub use plan::PlanHandler;
pub use read_file::ReadFileHandler;
pub use request_user_input::RequestUserInputHandler;

View File

@@ -34,7 +34,7 @@ use codex_protocol::user_input::UserInput;
use serde::Deserialize;
use serde::Serialize;
pub struct CollabHandler;
pub struct MultiAgentHandler;
/// Minimum wait timeout to prevent tight polling loops from burning CPU.
pub(crate) const MIN_WAIT_TIMEOUT_MS: i64 = 10_000;
@@ -47,7 +47,7 @@ struct CloseAgentArgs {
}
#[async_trait]
impl ToolHandler for CollabHandler {
impl ToolHandler for MultiAgentHandler {
fn kind(&self) -> ToolKind {
ToolKind::Function
}
@@ -917,7 +917,7 @@ mod tests {
input: "hello".to_string(),
},
);
let Err(err) = CollabHandler.handle(invocation).await else {
let Err(err) = MultiAgentHandler.handle(invocation).await else {
panic!("payload should be rejected");
};
assert_eq!(
@@ -937,7 +937,7 @@ mod tests {
"unknown_tool",
function_payload(json!({})),
);
let Err(err) = CollabHandler.handle(invocation).await else {
let Err(err) = MultiAgentHandler.handle(invocation).await else {
panic!("tool should be rejected");
};
assert_eq!(
@@ -955,7 +955,7 @@ mod tests {
"spawn_agent",
function_payload(json!({"message": " "})),
);
let Err(err) = CollabHandler.handle(invocation).await else {
let Err(err) = MultiAgentHandler.handle(invocation).await else {
panic!("empty message should be rejected");
};
assert_eq!(
@@ -978,7 +978,7 @@ mod tests {
"items": [{"type": "mention", "name": "drive", "path": "app://drive"}]
})),
);
let Err(err) = CollabHandler.handle(invocation).await else {
let Err(err) = MultiAgentHandler.handle(invocation).await else {
panic!("message+items should be rejected");
};
assert_eq!(
@@ -1016,7 +1016,7 @@ mod tests {
"agent_type": "explorer"
})),
);
let output = CollabHandler
let output = MultiAgentHandler
.handle(invocation)
.await
.expect("spawn_agent should succeed");
@@ -1048,7 +1048,7 @@ mod tests {
"spawn_agent",
function_payload(json!({"message": "hello"})),
);
let Err(err) = CollabHandler.handle(invocation).await else {
let Err(err) = MultiAgentHandler.handle(invocation).await else {
panic!("spawn should fail without a manager");
};
assert_eq!(
@@ -1074,7 +1074,7 @@ mod tests {
"spawn_agent",
function_payload(json!({"message": "hello"})),
);
let Err(err) = CollabHandler.handle(invocation).await else {
let Err(err) = MultiAgentHandler.handle(invocation).await else {
panic!("spawn should fail when depth limit exceeded");
};
assert_eq!(
@@ -1094,7 +1094,7 @@ mod tests {
"send_input",
function_payload(json!({"id": ThreadId::new().to_string(), "message": ""})),
);
let Err(err) = CollabHandler.handle(invocation).await else {
let Err(err) = MultiAgentHandler.handle(invocation).await else {
panic!("empty message should be rejected");
};
assert_eq!(
@@ -1118,7 +1118,7 @@ mod tests {
"items": [{"type": "mention", "name": "drive", "path": "app://drive"}]
})),
);
let Err(err) = CollabHandler.handle(invocation).await else {
let Err(err) = MultiAgentHandler.handle(invocation).await else {
panic!("message+items should be rejected");
};
assert_eq!(
@@ -1138,7 +1138,7 @@ mod tests {
"send_input",
function_payload(json!({"id": "not-a-uuid", "message": "hi"})),
);
let Err(err) = CollabHandler.handle(invocation).await else {
let Err(err) = MultiAgentHandler.handle(invocation).await else {
panic!("invalid id should be rejected");
};
let FunctionCallError::RespondToModel(msg) = err else {
@@ -1159,7 +1159,7 @@ mod tests {
"send_input",
function_payload(json!({"id": agent_id.to_string(), "message": "hi"})),
);
let Err(err) = CollabHandler.handle(invocation).await else {
let Err(err) = MultiAgentHandler.handle(invocation).await else {
panic!("missing agent should be reported");
};
assert_eq!(
@@ -1186,7 +1186,7 @@ mod tests {
"interrupt": true
})),
);
CollabHandler
MultiAgentHandler
.handle(invocation)
.await
.expect("send_input should succeed");
@@ -1227,7 +1227,7 @@ mod tests {
]
})),
);
CollabHandler
MultiAgentHandler
.handle(invocation)
.await
.expect("send_input should succeed");
@@ -1267,7 +1267,7 @@ mod tests {
"resume_agent",
function_payload(json!({"id": "not-a-uuid"})),
);
let Err(err) = CollabHandler.handle(invocation).await else {
let Err(err) = MultiAgentHandler.handle(invocation).await else {
panic!("invalid id should be rejected");
};
let FunctionCallError::RespondToModel(msg) = err else {
@@ -1288,7 +1288,7 @@ mod tests {
"resume_agent",
function_payload(json!({"id": agent_id.to_string()})),
);
let Err(err) = CollabHandler.handle(invocation).await else {
let Err(err) = MultiAgentHandler.handle(invocation).await else {
panic!("missing agent should be reported");
};
assert_eq!(
@@ -1313,7 +1313,7 @@ mod tests {
function_payload(json!({"id": agent_id.to_string()})),
);
let output = CollabHandler
let output = MultiAgentHandler
.handle(invocation)
.await
.expect("resume_agent should succeed");
@@ -1382,7 +1382,7 @@ mod tests {
"resume_agent",
function_payload(json!({"id": agent_id.to_string()})),
);
let output = CollabHandler
let output = MultiAgentHandler
.handle(resume_invocation)
.await
.expect("resume_agent should succeed");
@@ -1405,7 +1405,7 @@ mod tests {
"send_input",
function_payload(json!({"id": agent_id.to_string(), "message": "hello"})),
);
let output = CollabHandler
let output = MultiAgentHandler
.handle(send_invocation)
.await
.expect("send_input should succeed after resume");
@@ -1450,7 +1450,7 @@ mod tests {
"resume_agent",
function_payload(json!({"id": ThreadId::new().to_string()})),
);
let Err(err) = CollabHandler.handle(invocation).await else {
let Err(err) = MultiAgentHandler.handle(invocation).await else {
panic!("resume should fail when depth limit exceeded");
};
assert_eq!(
@@ -1479,7 +1479,7 @@ mod tests {
"timeout_ms": 0
})),
);
let Err(err) = CollabHandler.handle(invocation).await else {
let Err(err) = MultiAgentHandler.handle(invocation).await else {
panic!("non-positive timeout should be rejected");
};
assert_eq!(
@@ -1497,7 +1497,7 @@ mod tests {
"wait",
function_payload(json!({"ids": ["invalid"]})),
);
let Err(err) = CollabHandler.handle(invocation).await else {
let Err(err) = MultiAgentHandler.handle(invocation).await else {
panic!("invalid id should be rejected");
};
let FunctionCallError::RespondToModel(msg) = err else {
@@ -1515,7 +1515,7 @@ mod tests {
"wait",
function_payload(json!({"ids": []})),
);
let Err(err) = CollabHandler.handle(invocation).await else {
let Err(err) = MultiAgentHandler.handle(invocation).await else {
panic!("empty ids should be rejected");
};
assert_eq!(
@@ -1540,7 +1540,7 @@ mod tests {
"timeout_ms": 1000
})),
);
let output = CollabHandler
let output = MultiAgentHandler
.handle(invocation)
.await
.expect("wait should succeed");
@@ -1584,7 +1584,7 @@ mod tests {
"timeout_ms": MIN_WAIT_TIMEOUT_MS
})),
);
let output = CollabHandler
let output = MultiAgentHandler
.handle(invocation)
.await
.expect("wait should succeed");
@@ -1632,7 +1632,11 @@ mod tests {
})),
);
let early = timeout(Duration::from_millis(50), CollabHandler.handle(invocation)).await;
let early = timeout(
Duration::from_millis(50),
MultiAgentHandler.handle(invocation),
)
.await;
assert!(
early.is_err(),
"wait should not return before the minimum timeout clamp"
@@ -1677,7 +1681,7 @@ mod tests {
"timeout_ms": 1000
})),
);
let output = CollabHandler
let output = MultiAgentHandler
.handle(invocation)
.await
.expect("wait should succeed");
@@ -1717,7 +1721,7 @@ mod tests {
"close_agent",
function_payload(json!({"id": agent_id.to_string()})),
);
let output = CollabHandler
let output = MultiAgentHandler
.handle(invocation)
.await
.expect("close_agent should succeed");

View File

@@ -243,14 +243,6 @@ impl ShellHandler {
freeform,
} = args;
let features = session.features();
let request_rule_enabled = features.enabled(crate::features::Feature::RequestRule);
let prefix_rule = if request_rule_enabled {
prefix_rule
} else {
None
};
let mut exec_params = exec_params;
let dependency_env = session.dependency_env().await;
if !dependency_env.is_empty() {

View File

@@ -142,14 +142,6 @@ impl ToolHandler for UnifiedExecHandler {
..
} = args;
let features = session.features();
let request_rule_enabled = features.enabled(crate::features::Feature::RequestRule);
let prefix_rule = if request_rule_enabled {
prefix_rule
} else {
None
};
if sandbox_permissions.requires_escalated_permissions()
&& !matches!(
context.turn.approval_policy,
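
Both the shell and unified-exec handlers drop the same gate: `prefix_rule` from the model's tool call used to be discarded unless `Feature::RequestRule` was enabled, and now always flows through to approval handling. For reference, a prefix rule is just the leading words of a command:

```rust
// Illustrative value only: the suggestion the model attaches to an
// escalated call, previously nulled out whenever RequestRule was disabled.
let prefix_rule: Option<Vec<String>> = Some(vec!["git".into(), "pull".into()]);
```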

View File

@@ -114,6 +114,7 @@ struct ExecContext {
struct ExecToolCalls {
in_flight: usize,
notify: Arc<Notify>,
cancel: CancellationToken,
}
enum KernelStreamEnd {
@@ -300,6 +301,7 @@ impl JsReplManager {
async fn clear_exec_tool_calls(&self, exec_id: &str) {
if let Some(state) = self.exec_tool_calls.lock().await.remove(exec_id) {
state.cancel.cancel();
state.notify.notify_waiters();
}
}
@@ -320,32 +322,14 @@ impl JsReplManager {
}
}
async fn wait_for_all_exec_tool_calls(&self) {
loop {
let notified = {
let calls = self.exec_tool_calls.lock().await;
calls
.values()
.find(|state| state.in_flight > 0)
.map(|state| Arc::clone(&state.notify).notified_owned())
};
match notified {
Some(notified) => notified.await,
None => return,
}
}
}
async fn begin_exec_tool_call(
exec_tool_calls: &Arc<Mutex<HashMap<String, ExecToolCalls>>>,
exec_id: &str,
) -> bool {
) -> Option<CancellationToken> {
let mut calls = exec_tool_calls.lock().await;
let Some(state) = calls.get_mut(exec_id) else {
return false;
};
let state = calls.get_mut(exec_id)?;
state.in_flight += 1;
true
Some(state.cancel.clone())
}
async fn finish_exec_tool_call(
@@ -396,14 +380,30 @@ impl JsReplManager {
exec_id: &str,
) {
if let Some(state) = exec_tool_calls.lock().await.remove(exec_id) {
state.cancel.cancel();
state.notify.notify_waiters();
}
}
async fn clear_all_exec_tool_calls_map(
exec_tool_calls: &Arc<Mutex<HashMap<String, ExecToolCalls>>>,
) {
let states = {
let mut calls = exec_tool_calls.lock().await;
calls.drain().map(|(_, state)| state).collect::<Vec<_>>()
};
for state in states {
state.cancel.cancel();
state.notify.notify_waiters();
}
}
pub async fn reset(&self) -> Result<(), FunctionCallError> {
let _permit = self.exec_lock.clone().acquire_owned().await.map_err(|_| {
FunctionCallError::RespondToModel("js_repl execution unavailable".to_string())
})?;
self.reset_kernel().await;
self.wait_for_all_exec_tool_calls().await;
self.exec_tool_calls.lock().await.clear();
Self::clear_all_exec_tool_calls_map(&self.exec_tool_calls).await;
Ok(())
}
@@ -530,7 +530,9 @@ impl JsReplManager {
return Err(FunctionCallError::RespondToModel(message));
}
Err(_) => {
self.reset().await?;
self.reset_kernel().await;
self.wait_for_exec_tool_calls(&req_id).await;
self.exec_tool_calls.lock().await.clear();
return Err(FunctionCallError::RespondToModel(
"js_repl execution timed out; kernel reset, rerun your request".to_string(),
));
@@ -854,7 +856,9 @@ impl JsReplManager {
JsReplManager::clear_exec_tool_calls_map(&exec_tool_calls, &id).await;
}
KernelToHost::RunTool(req) => {
if !JsReplManager::begin_exec_tool_call(&exec_tool_calls, &req.exec_id).await {
let Some(reset_cancel) =
JsReplManager::begin_exec_tool_call(&exec_tool_calls, &req.exec_id).await
else {
let exec_id = req.exec_id.clone();
let tool_call_id = req.id.clone();
let payload = HostToKernel::RunToolResult(RunToolResult {
@@ -877,10 +881,10 @@ impl JsReplManager {
);
}
continue;
}
};
let stdin_clone = Arc::clone(&stdin);
let exec_contexts = Arc::clone(&exec_contexts);
let exec_tool_calls = Arc::clone(&exec_tool_calls);
let exec_tool_calls_for_task = Arc::clone(&exec_tool_calls);
let recent_stderr = Arc::clone(&recent_stderr);
tokio::spawn(async move {
let exec_id = req.exec_id.clone();
@@ -888,15 +892,26 @@ impl JsReplManager {
let tool_name = req.tool_name.clone();
let context = { exec_contexts.lock().await.get(&exec_id).cloned() };
let result = match context {
Some(ctx) => JsReplManager::run_tool_request(ctx, req).await,
Some(ctx) => {
tokio::select! {
_ = reset_cancel.cancelled() => RunToolResult {
id: tool_call_id.clone(),
ok: false,
response: None,
error: Some("js_repl execution reset".to_string()),
},
result = JsReplManager::run_tool_request(ctx, req) => result,
}
}
None => RunToolResult {
id: req.id.clone(),
id: tool_call_id.clone(),
ok: false,
response: None,
error: Some("js_repl exec context not found".to_string()),
},
};
JsReplManager::finish_exec_tool_call(&exec_tool_calls, &exec_id).await;
JsReplManager::finish_exec_tool_call(&exec_tool_calls_for_task, &exec_id)
.await;
let payload = HostToKernel::RunToolResult(result);
if let Err(err) = JsReplManager::write_message(&stdin_clone, &payload).await
{
@@ -1417,7 +1432,11 @@ mod tests {
.lock()
.await
.insert(exec_id.clone(), ExecToolCalls::default());
assert!(JsReplManager::begin_exec_tool_call(&exec_tool_calls, &exec_id).await);
assert!(
JsReplManager::begin_exec_tool_call(&exec_tool_calls, &exec_id)
.await
.is_some()
);
let wait_map = Arc::clone(&exec_tool_calls);
let wait_exec_id = exec_id.clone();
@@ -1441,6 +1460,110 @@ mod tests {
JsReplManager::clear_exec_tool_calls_map(&exec_tool_calls, &exec_id).await;
}
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn reset_waits_for_exec_lock_before_clearing_exec_tool_calls() {
let manager = JsReplManager::new(None, PathBuf::from("."))
.await
.expect("manager should initialize");
let permit = manager
.exec_lock
.clone()
.acquire_owned()
.await
.expect("lock should be acquirable");
let exec_id = Uuid::new_v4().to_string();
manager.register_exec_tool_calls(&exec_id).await;
let reset_manager = Arc::clone(&manager);
let mut reset_task = tokio::spawn(async move { reset_manager.reset().await });
tokio::time::sleep(Duration::from_millis(50)).await;
assert!(
!reset_task.is_finished(),
"reset should wait until execute lock is released"
);
assert!(
manager.exec_tool_calls.lock().await.contains_key(&exec_id),
"reset must not clear tool-call contexts while execute lock is held"
);
drop(permit);
tokio::time::timeout(Duration::from_secs(1), &mut reset_task)
.await
.expect("reset should complete after execute lock release")
.expect("reset task should not panic")
.expect("reset should succeed");
assert!(
!manager.exec_tool_calls.lock().await.contains_key(&exec_id),
"reset should clear tool-call contexts after lock acquisition"
);
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn reset_clears_inflight_exec_tool_calls_without_waiting() {
let manager = JsReplManager::new(None, std::env::temp_dir())
.await
.expect("manager should initialize");
let exec_id = Uuid::new_v4().to_string();
manager.register_exec_tool_calls(&exec_id).await;
assert!(
JsReplManager::begin_exec_tool_call(&manager.exec_tool_calls, &exec_id)
.await
.is_some()
);
let wait_manager = Arc::clone(&manager);
let wait_exec_id = exec_id.clone();
let waiter = tokio::spawn(async move {
wait_manager.wait_for_exec_tool_calls(&wait_exec_id).await;
});
tokio::task::yield_now().await;
tokio::time::timeout(Duration::from_secs(1), manager.reset())
.await
.expect("reset should not hang")
.expect("reset should succeed");
tokio::time::timeout(Duration::from_secs(1), waiter)
.await
.expect("waiter should be released")
.expect("wait task should not panic");
assert!(manager.exec_tool_calls.lock().await.is_empty());
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn reset_aborts_inflight_exec_tool_tasks() {
let manager = JsReplManager::new(None, std::env::temp_dir())
.await
.expect("manager should initialize");
let exec_id = Uuid::new_v4().to_string();
manager.register_exec_tool_calls(&exec_id).await;
let reset_cancel = JsReplManager::begin_exec_tool_call(&manager.exec_tool_calls, &exec_id)
.await
.expect("exec should be registered");
let task = tokio::spawn(async move {
tokio::select! {
_ = reset_cancel.cancelled() => "cancelled",
_ = tokio::time::sleep(Duration::from_secs(60)) => "timed_out",
}
});
tokio::time::timeout(Duration::from_secs(1), manager.reset())
.await
.expect("reset should not hang")
.expect("reset should succeed");
let outcome = tokio::time::timeout(Duration::from_secs(1), task)
.await
.expect("cancelled task should resolve promptly")
.expect("task should not panic");
assert_eq!(outcome, "cancelled");
}
async fn can_run_js_repl_runtime_tests() -> bool {
if std::env::var_os("CODEX_SANDBOX").is_some() {
return false;
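
The thread running through this file: every registered exec now owns a `CancellationToken`, `begin_exec_tool_call` hands a clone to the spawned `RunTool` task, and every clear/reset path cancels it, so reset no longer blocks behind in-flight tool calls. A self-contained sketch of that pattern using `tokio` and `tokio-util` directly (not the manager's real types):

```rust
use std::time::Duration;
use tokio_util::sync::CancellationToken;

async fn run_tool_with_reset(cancel: CancellationToken) -> Result<String, String> {
    tokio::select! {
        // Reset wins: return a synthetic error instead of waiting out the tool.
        _ = cancel.cancelled() => Err("js_repl execution reset".to_string()),
        // Stand-in for the real run_tool_request future.
        _ = tokio::time::sleep(Duration::from_secs(60)) => Ok("tool result".to_string()),
    }
}

#[tokio::main]
async fn main() {
    let cancel = CancellationToken::new();
    let task = tokio::spawn(run_tool_with_reset(cancel.clone()));
    cancel.cancel(); // what reset()/clear_* now do for every in-flight call
    assert_eq!(task.await.unwrap(), Err("js_repl execution reset".to_string()));
}
```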

View File

@@ -10,9 +10,9 @@ use crate::tools::handlers::SEARCH_TOOL_BM25_DEFAULT_LIMIT;
use crate::tools::handlers::SEARCH_TOOL_BM25_TOOL_NAME;
use crate::tools::handlers::apply_patch::create_apply_patch_freeform_tool;
use crate::tools::handlers::apply_patch::create_apply_patch_json_tool;
use crate::tools::handlers::collab::DEFAULT_WAIT_TIMEOUT_MS;
use crate::tools::handlers::collab::MAX_WAIT_TIMEOUT_MS;
use crate::tools::handlers::collab::MIN_WAIT_TIMEOUT_MS;
use crate::tools::handlers::multi_agents::DEFAULT_WAIT_TIMEOUT_MS;
use crate::tools::handlers::multi_agents::MAX_WAIT_TIMEOUT_MS;
use crate::tools::handlers::multi_agents::MIN_WAIT_TIMEOUT_MS;
use crate::tools::handlers::request_user_input_tool_description;
use crate::tools::registry::ToolRegistryBuilder;
use codex_protocol::config_types::WebSearchMode;
@@ -41,7 +41,6 @@ pub(crate) struct ToolsConfig {
pub js_repl_tools_only: bool,
pub collab_tools: bool,
pub collaboration_modes_tools: bool,
pub request_rule_enabled: bool,
pub experimental_supported_tools: Vec<String>,
}
@@ -64,7 +63,6 @@ impl ToolsConfig {
include_js_repl && features.enabled(Feature::JsReplToolsOnly);
let include_collab_tools = features.enabled(Feature::Collab);
let include_collaboration_modes_tools = features.enabled(Feature::CollaborationModes);
let request_rule_enabled = features.enabled(Feature::RequestRule);
let include_search_tool = features.enabled(Feature::Apps);
let shell_type = if !features.enabled(Feature::ShellTool) {
@@ -101,7 +99,6 @@ impl ToolsConfig {
js_repl_tools_only: include_js_repl_tools_only,
collab_tools: include_collab_tools,
collaboration_modes_tools: include_collaboration_modes_tools,
request_rule_enabled,
experimental_supported_tools: model_info.experimental_supported_tools.clone(),
}
}
@@ -174,7 +171,7 @@ impl From<JsonSchema> for AdditionalProperties {
}
}
fn create_approval_parameters(include_prefix_rule: bool) -> BTreeMap<String, JsonSchema> {
fn create_approval_parameters() -> BTreeMap<String, JsonSchema> {
let mut properties = BTreeMap::from([
(
"sandbox_permissions".to_string(),
@@ -200,23 +197,22 @@ fn create_approval_parameters(include_prefix_rule: bool) -> BTreeMap<String, Jso
),
]);
if include_prefix_rule {
properties.insert(
"prefix_rule".to_string(),
JsonSchema::Array {
items: Box::new(JsonSchema::String { description: None }),
description: Some(
r#"Only specify when sandbox_permissions is `require_escalated`.
properties.insert(
"prefix_rule".to_string(),
JsonSchema::Array {
items: Box::new(JsonSchema::String { description: None }),
description: Some(
r#"Only specify when sandbox_permissions is `require_escalated`.
Suggest a prefix command pattern that will allow you to fulfill similar requests from the user in the future.
Should be a short but reasonable prefix, e.g. [\"git\", \"pull\"] or [\"uv\", \"run\"] or [\"pytest\"]."#.to_string(),
),
});
}
),
},
);
properties
}
fn create_exec_command_tool(include_prefix_rule: bool) -> ToolSpec {
fn create_exec_command_tool() -> ToolSpec {
let mut properties = BTreeMap::from([
(
"cmd".to_string(),
@@ -274,7 +270,7 @@ fn create_exec_command_tool(include_prefix_rule: bool) -> ToolSpec {
},
),
]);
properties.extend(create_approval_parameters(include_prefix_rule));
properties.extend(create_approval_parameters());
ToolSpec::Function(ResponsesApiTool {
name: "exec_command".to_string(),
@@ -337,7 +333,7 @@ fn create_write_stdin_tool() -> ToolSpec {
})
}
fn create_shell_tool(include_prefix_rule: bool) -> ToolSpec {
fn create_shell_tool() -> ToolSpec {
let mut properties = BTreeMap::from([
(
"command".to_string(),
@@ -359,7 +355,7 @@ fn create_shell_tool(include_prefix_rule: bool) -> ToolSpec {
},
),
]);
properties.extend(create_approval_parameters(include_prefix_rule));
properties.extend(create_approval_parameters());
let description = if cfg!(windows) {
r#"Runs a Powershell command (Windows) and returns its output. Arguments to `shell` will be passed to CreateProcessW(). Most commands should be prefixed with ["powershell.exe", "-Command"].
@@ -390,7 +386,7 @@ Examples of valid command strings:
})
}
fn create_shell_command_tool(include_prefix_rule: bool) -> ToolSpec {
fn create_shell_command_tool() -> ToolSpec {
let mut properties = BTreeMap::from([
(
"command".to_string(),
@@ -422,7 +418,7 @@ fn create_shell_command_tool(include_prefix_rule: bool) -> ToolSpec {
},
),
]);
properties.extend(create_approval_parameters(include_prefix_rule));
properties.extend(create_approval_parameters());
let description = if cfg!(windows) {
r#"Runs a Powershell command (Windows) and returns its output.
@@ -1405,7 +1401,6 @@ pub(crate) fn build_specs(
dynamic_tools: &[DynamicToolSpec],
) -> ToolRegistryBuilder {
use crate::tools::handlers::ApplyPatchHandler;
use crate::tools::handlers::CollabHandler;
use crate::tools::handlers::DynamicToolHandler;
use crate::tools::handlers::GrepFilesHandler;
use crate::tools::handlers::JsReplHandler;
@@ -1413,6 +1408,7 @@ pub(crate) fn build_specs(
use crate::tools::handlers::ListDirHandler;
use crate::tools::handlers::McpHandler;
use crate::tools::handlers::McpResourceHandler;
use crate::tools::handlers::MultiAgentHandler;
use crate::tools::handlers::PlanHandler;
use crate::tools::handlers::ReadFileHandler;
use crate::tools::handlers::RequestUserInputHandler;
@@ -1442,19 +1438,13 @@ pub(crate) fn build_specs(
match &config.shell_type {
ConfigShellToolType::Default => {
builder.push_spec_with_parallel_support(
create_shell_tool(config.request_rule_enabled),
true,
);
builder.push_spec_with_parallel_support(create_shell_tool(), true);
}
ConfigShellToolType::Local => {
builder.push_spec_with_parallel_support(ToolSpec::LocalShell {}, true);
}
ConfigShellToolType::UnifiedExec => {
builder.push_spec_with_parallel_support(
create_exec_command_tool(config.request_rule_enabled),
true,
);
builder.push_spec_with_parallel_support(create_exec_command_tool(), true);
builder.push_spec(create_write_stdin_tool());
builder.register_handler("exec_command", unified_exec_handler.clone());
builder.register_handler("write_stdin", unified_exec_handler);
@@ -1463,10 +1453,7 @@ pub(crate) fn build_specs(
// Do nothing.
}
ConfigShellToolType::ShellCommand => {
builder.push_spec_with_parallel_support(
create_shell_command_tool(config.request_rule_enabled),
true,
);
builder.push_spec_with_parallel_support(create_shell_command_tool(), true);
}
}
@@ -1576,17 +1563,17 @@ pub(crate) fn build_specs(
builder.register_handler("view_image", view_image_handler);
if config.collab_tools {
let collab_handler = Arc::new(CollabHandler);
let multi_agent_handler = Arc::new(MultiAgentHandler);
builder.push_spec(create_spawn_agent_tool());
builder.push_spec(create_send_input_tool());
builder.push_spec(create_resume_agent_tool());
builder.push_spec(create_wait_tool());
builder.push_spec(create_close_agent_tool());
builder.register_handler("spawn_agent", collab_handler.clone());
builder.register_handler("send_input", collab_handler.clone());
builder.register_handler("resume_agent", collab_handler.clone());
builder.register_handler("wait", collab_handler.clone());
builder.register_handler("close_agent", collab_handler);
builder.register_handler("spawn_agent", multi_agent_handler.clone());
builder.register_handler("send_input", multi_agent_handler.clone());
builder.register_handler("resume_agent", multi_agent_handler.clone());
builder.register_handler("wait", multi_agent_handler.clone());
builder.register_handler("close_agent", multi_agent_handler);
}
if let Some(mcp_tools) = mcp_tools {
@@ -1832,7 +1819,7 @@ mod tests {
// Build expected from the same helpers used by the builder.
let mut expected: BTreeMap<String, ToolSpec> = BTreeMap::from([]);
for spec in [
create_exec_command_tool(true),
create_exec_command_tool(),
create_write_stdin_tool(),
PLAN_TOOL.clone(),
create_request_user_input_tool(),
@@ -2815,7 +2802,7 @@ mod tests {
#[test]
fn test_shell_tool() {
let tool = super::create_shell_tool(true);
let tool = super::create_shell_tool();
let ToolSpec::Function(ResponsesApiTool {
description, name, ..
}) = &tool
@@ -2845,7 +2832,7 @@ Examples of valid command strings:
#[test]
fn test_shell_command_tool() {
let tool = super::create_shell_command_tool(true);
let tool = super::create_shell_command_tool();
let ToolSpec::Function(ResponsesApiTool {
description, name, ..
}) = &tool

View File

@@ -24,5 +24,5 @@ Notes:
- `description`
- `connector_name`
- input schema property keys (`input_keys`)
- If the needed app is already explicit in the prompt (for example an `apps://...` mention) or already present in the current `tools` list, you can call that tool directly.
- If the needed app is already explicit in the prompt (for example `[$app-name](app://{connector_id})`) or already present in the current `tools` list, you can call that tool directly.
- Do not use `search_tool_bm25` for non-apps/local tasks (filesystem, repo search, or shell-only workflows) or anything not related to {{app_names}}.

View File

@@ -1479,6 +1479,44 @@ fn scenarios() -> Vec<ScenarioSpec> {
output_contains: "rejected by user",
},
},
ScenarioSpec {
name: "safe command with heredoc and redirect still requires approval",
approval_policy: AskForApproval::OnRequest,
sandbox_policy: workspace_write(false),
action: ActionKind::RunUnifiedExecCommand {
command: "cat <<'EOF' > /tmp/out.txt \nhello\nEOF",
justification: None,
},
sandbox_permissions: SandboxPermissions::RequireEscalated,
features: vec![Feature::UnifiedExec],
model_override: None,
outcome: Outcome::ExecApproval {
decision: ReviewDecision::Denied,
expected_reason: None,
},
expectation: Expectation::CommandFailure {
output_contains: "rejected by user",
},
},
ScenarioSpec {
name: "compound command with one safe command still requires approval",
approval_policy: AskForApproval::OnRequest,
sandbox_policy: workspace_write(false),
action: ActionKind::RunUnifiedExecCommand {
command: "cat ./one.txt && touch ./two.txt",
justification: None,
},
sandbox_permissions: SandboxPermissions::RequireEscalated,
features: vec![Feature::UnifiedExec],
model_override: None,
outcome: Outcome::ExecApproval {
decision: ReviewDecision::Denied,
expected_reason: None,
},
expectation: Expectation::CommandFailure {
output_contains: "rejected by user",
},
},
]
}
@@ -1890,14 +1928,15 @@ async fn approving_execpolicy_amendment_persists_policy_and_skips_future_prompts
Ok(())
}
// todo(dylan) add ScenarioSpec support for rules
#[tokio::test(flavor = "current_thread")]
#[cfg(unix)]
async fn heredoc_with_chained_allowed_prefix_still_requires_approval() -> Result<()> {
async fn compound_command_with_one_safe_command_still_requires_approval() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let approval_policy = AskForApproval::UnlessTrusted;
let sandbox_policy = SandboxPolicy::new_read_only_policy();
let sandbox_policy = SandboxPolicy::new_workspace_write_policy();
let sandbox_policy_for_config = sandbox_policy.clone();
let mut builder = test_codex().with_config(move |config| {
config.permissions.approval_policy = Constrained::allow_any(approval_policy);
@@ -1913,12 +1952,12 @@ async fn heredoc_with_chained_allowed_prefix_still_requires_approval() -> Result
)?;
let call_id = "heredoc-with-chained-prefix";
let command = "cat <<'EOF' > /tmp/test.txt && touch allow-prefix.txt\nhello\nEOF";
let command = "touch ./test.txt && rm ./test.txt";
let (event, expected_command) = ActionKind::RunCommand { command }
.prepare(&test, &server, call_id, SandboxPermissions::UseDefault)
.await?;
let expected_command =
expected_command.expect("heredoc chained command scenario should produce a shell command");
expected_command.expect("compound command should produce a shell command");
let _ = mount_sse_once(
&server,
@@ -1940,7 +1979,7 @@ async fn heredoc_with_chained_allowed_prefix_still_requires_approval() -> Result
submit_turn(
&test,
"heredoc chained prefix",
"compound command",
approval_policy,
sandbox_policy.clone(),
)
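
Both new scenarios pin the conservative path: a script only bypasses approval when it reduces to a single command with an approved prefix, and chaining or heredoc redirects never do. That matches the parser change later in this range, e.g.:

```rust
// Chained commands yield no single-command prefix, so they can never
// match an approved rule and must go through the approval flow.
let chained = vec![
    "bash".to_string(),
    "-lc".to_string(),
    "touch ./test.txt && rm ./test.txt".to_string(),
];
assert_eq!(parse_shell_lc_single_command_prefix(&chained), None);
```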

View File

@@ -103,6 +103,7 @@ mod resume_warning;
mod review;
mod rmcp_client;
mod rollout_list_find;
mod safety_check_downgrade;
mod search_tool;
mod seatbelt;
mod shell_command;

View File

@@ -488,7 +488,6 @@ async fn permissions_message_includes_writable_roots() -> Result<()> {
&sandbox_policy,
AskForApproval::OnRequest,
&Policy::empty(),
true,
test.config.cwd.as_path(),
)
.into_text();

View File

@@ -0,0 +1,228 @@
use anyhow::Result;
use codex_core::protocol::AskForApproval;
use codex_core::protocol::EventMsg;
use codex_core::protocol::Op;
use codex_core::protocol::SandboxPolicy;
use codex_protocol::config_types::ReasoningSummary;
use codex_protocol::models::ContentItem;
use codex_protocol::models::ResponseItem;
use codex_protocol::user_input::UserInput;
use core_test_support::responses::ev_assistant_message;
use core_test_support::responses::ev_function_call;
use core_test_support::responses::ev_response_created;
use core_test_support::responses::mount_response_once;
use core_test_support::responses::mount_response_sequence;
use core_test_support::responses::sse;
use core_test_support::responses::sse_completed;
use core_test_support::responses::sse_response;
use core_test_support::responses::start_mock_server;
use core_test_support::skip_if_no_network;
use core_test_support::test_codex::test_codex;
use core_test_support::wait_for_event;
use pretty_assertions::assert_eq;
const SERVER_MODEL: &str = "gpt-5.2";
const REQUESTED_MODEL: &str = "gpt-5.3-codex";
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn openai_model_header_mismatch_emits_warning_event_and_warning_item() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let response =
sse_response(sse_completed("resp-1")).insert_header("OpenAI-Model", SERVER_MODEL);
let _mock = mount_response_once(&server, response).await;
let mut builder = test_codex().with_model(REQUESTED_MODEL);
let test = builder.build(&server).await?;
test.codex
.submit(Op::UserTurn {
items: vec![UserInput::Text {
text: "trigger safety check".to_string(),
text_elements: Vec::new(),
}],
final_output_json_schema: None,
cwd: test.cwd_path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: REQUESTED_MODEL.to_string(),
effort: test.config.model_reasoning_effort,
summary: ReasoningSummary::Auto,
collaboration_mode: None,
personality: None,
})
.await?;
let warning = wait_for_event(&test.codex, |event| matches!(event, EventMsg::Warning(_))).await;
let EventMsg::Warning(warning) = warning else {
panic!("expected warning event");
};
assert!(warning.message.contains(REQUESTED_MODEL));
assert!(warning.message.contains(SERVER_MODEL));
let warning_item = wait_for_event(&test.codex, |event| {
matches!(
event,
EventMsg::RawResponseItem(raw)
if matches!(
&raw.item,
ResponseItem::Message { content, .. }
if content.iter().any(|item| matches!(
item,
ContentItem::InputText { text } if text.starts_with("Warning: ")
))
)
)
})
.await;
let EventMsg::RawResponseItem(raw) = warning_item else {
panic!("expected raw response item event");
};
let ResponseItem::Message { role, content, .. } = raw.item else {
panic!("expected warning to be recorded as a message item");
};
assert_eq!(role, "user");
let warning_text = content.iter().find_map(|item| match item {
ContentItem::InputText { text } => Some(text.as_str()),
_ => None,
});
let warning_text = warning_text.expect("warning message should include input_text content");
assert!(warning_text.contains(REQUESTED_MODEL));
assert!(warning_text.contains(SERVER_MODEL));
let _ = wait_for_event(&test.codex, |event| {
matches!(event, EventMsg::TurnComplete(_))
})
.await;
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn response_model_field_mismatch_emits_warning_when_header_matches_requested() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let response = sse_response(sse(vec![
serde_json::json!({
"type": "response.created",
"response": {
"id": "resp-1",
"model": SERVER_MODEL,
}
}),
core_test_support::responses::ev_completed("resp-1"),
]))
.insert_header("OpenAI-Model", REQUESTED_MODEL);
let _mock = mount_response_once(&server, response).await;
let mut builder = test_codex().with_model(REQUESTED_MODEL);
let test = builder.build(&server).await?;
test.codex
.submit(Op::UserTurn {
items: vec![UserInput::Text {
text: "trigger response model check".to_string(),
text_elements: Vec::new(),
}],
final_output_json_schema: None,
cwd: test.cwd_path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: REQUESTED_MODEL.to_string(),
effort: test.config.model_reasoning_effort,
summary: ReasoningSummary::Auto,
collaboration_mode: None,
personality: None,
})
.await?;
let warning = wait_for_event(&test.codex, |event| {
matches!(
event,
EventMsg::Warning(warning)
if warning
.message
.contains("flagged for potentially high-risk cyber activity")
)
})
.await;
let EventMsg::Warning(warning) = warning else {
panic!("expected warning event");
};
assert!(warning.message.contains("gpt-5.2 as a fallback"));
let _ = wait_for_event(&test.codex, |event| {
matches!(event, EventMsg::TurnComplete(_))
})
.await;
Ok(())
}
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn openai_model_header_mismatch_only_emits_one_warning_per_turn() -> Result<()> {
skip_if_no_network!(Ok(()));
let server = start_mock_server().await;
let tool_args = serde_json::json!({
"command": "echo hello",
"timeout_ms": 1_000
});
let first_response = sse_response(sse(vec![
ev_response_created("resp-1"),
ev_function_call(
"call-1",
"shell_command",
&serde_json::to_string(&tool_args)?,
),
core_test_support::responses::ev_completed("resp-1"),
]))
.insert_header("OpenAI-Model", SERVER_MODEL);
let second_response = sse_response(sse(vec![
ev_response_created("resp-2"),
ev_assistant_message("msg-1", "done"),
core_test_support::responses::ev_completed("resp-2"),
]))
.insert_header("OpenAI-Model", SERVER_MODEL);
let _mock = mount_response_sequence(&server, vec![first_response, second_response]).await;
let mut builder = test_codex().with_model(REQUESTED_MODEL);
let test = builder.build(&server).await?;
test.codex
.submit(Op::UserTurn {
items: vec![UserInput::Text {
text: "trigger follow-up turn".to_string(),
text_elements: Vec::new(),
}],
final_output_json_schema: None,
cwd: test.cwd_path().to_path_buf(),
approval_policy: AskForApproval::Never,
sandbox_policy: SandboxPolicy::DangerFullAccess,
model: REQUESTED_MODEL.to_string(),
effort: test.config.model_reasoning_effort,
summary: ReasoningSummary::Auto,
collaboration_mode: None,
personality: None,
})
.await?;
let mut warning_count = 0;
loop {
let event = wait_for_event(&test.codex, |_| true).await;
match event {
EventMsg::Warning(warning) if warning.message.contains(REQUESTED_MODEL) => {
warning_count += 1;
}
EventMsg::TurnComplete(_) => break,
_ => {}
}
}
assert_eq!(warning_count, 1);
Ok(())
}
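
Together the three tests pin down the contract: a mismatch between the requested model and the server-reported one (the `OpenAI-Model` header or the `response.model` field) yields exactly one warning per turn, surfaced both as a `Warning` event and as a recorded user message. A hedged sketch of that dedup, with a hypothetical helper name:

```rust
// Hypothetical helper, not the crate's real API: emit at most one
// mismatch warning per turn.
fn mismatch_warning(
    requested: &str,
    served: &str,
    warned_this_turn: &mut bool,
) -> Option<String> {
    if requested == served || *warned_this_turn {
        return None;
    }
    *warned_this_turn = true;
    Some(format!(
        "Warning: requested model `{requested}` but the server responded with `{served}`."
    ))
}
```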

View File

@@ -279,7 +279,7 @@ impl CodexLogSnapshot {
}
let level = match classification {
"bug" | "bad_result" => Level::Error,
"bug" | "bad_result" | "safety_check" => Level::Error,
_ => Level::Info,
};
@@ -342,6 +342,7 @@ fn display_classification(classification: &str) -> String {
"bug" => "Bug".to_string(),
"bad_result" => "Bad result".to_string(),
"good_result" => "Good result".to_string(),
"safety_check" => "Safety check".to_string(),
_ => "Other".to_string(),
}
}

View File

@@ -748,6 +748,7 @@ impl OtelManager {
ResponseEvent::ReasoningSummaryPartAdded { .. } => {
"reasoning_summary_part_added".into()
}
ResponseEvent::ServerModel(_) => "server_model".into(),
ResponseEvent::ServerReasoningIncluded(_) => "server_reasoning_included".into(),
ResponseEvent::RateLimits(_) => "rate_limits".into(),
ResponseEvent::ModelsEtag(_) => "models_etag".into(),

View File

@@ -225,8 +225,6 @@ const APPROVAL_POLICY_UNLESS_TRUSTED: &str =
include_str!("prompts/permissions/approval_policy/unless_trusted.md");
const APPROVAL_POLICY_ON_FAILURE: &str =
include_str!("prompts/permissions/approval_policy/on_failure.md");
const APPROVAL_POLICY_ON_REQUEST: &str =
include_str!("prompts/permissions/approval_policy/on_request.md");
const APPROVAL_POLICY_ON_REQUEST_RULE: &str =
include_str!("prompts/permissions/approval_policy/on_request_rule.md");
@@ -241,29 +239,20 @@ impl DeveloperInstructions {
Self { text: text.into() }
}
pub fn from(
approval_policy: AskForApproval,
exec_policy: &Policy,
request_rule_enabled: bool,
) -> DeveloperInstructions {
pub fn from(approval_policy: AskForApproval, exec_policy: &Policy) -> DeveloperInstructions {
let text = match approval_policy {
AskForApproval::Never => APPROVAL_POLICY_NEVER.to_string(),
AskForApproval::UnlessTrusted => APPROVAL_POLICY_UNLESS_TRUSTED.to_string(),
AskForApproval::OnFailure => APPROVAL_POLICY_ON_FAILURE.to_string(),
AskForApproval::OnRequest => {
if !request_rule_enabled {
APPROVAL_POLICY_ON_REQUEST.to_string()
} else {
let command_prefixes =
format_allow_prefixes(exec_policy.get_allowed_prefixes());
match command_prefixes {
Some(prefixes) => {
format!(
"{APPROVAL_POLICY_ON_REQUEST_RULE}\n## Approved command prefixes\nThe following prefix rules have already been approved: {prefixes}"
)
}
None => APPROVAL_POLICY_ON_REQUEST_RULE.to_string(),
let command_prefixes = format_allow_prefixes(exec_policy.get_allowed_prefixes());
match command_prefixes {
Some(prefixes) => {
format!(
"{APPROVAL_POLICY_ON_REQUEST_RULE}\n## Approved command prefixes\nThe following prefix rules have already been approved: {prefixes}"
)
}
None => APPROVAL_POLICY_ON_REQUEST_RULE.to_string(),
}
}
};
@@ -301,7 +290,6 @@ impl DeveloperInstructions {
sandbox_policy: &SandboxPolicy,
approval_policy: AskForApproval,
exec_policy: &Policy,
request_rule_enabled: bool,
cwd: &Path,
) -> Self {
let network_access = if sandbox_policy.has_full_network_access() {
@@ -325,7 +313,6 @@ impl DeveloperInstructions {
network_access,
approval_policy,
exec_policy,
request_rule_enabled,
writable_roots,
)
}
@@ -349,7 +336,6 @@ impl DeveloperInstructions {
network_access: NetworkAccess,
approval_policy: AskForApproval,
exec_policy: &Policy,
request_rule_enabled: bool,
writable_roots: Option<Vec<WritableRoot>>,
) -> Self {
let start_tag = DeveloperInstructions::new("<permissions instructions>");
@@ -359,11 +345,7 @@ impl DeveloperInstructions {
sandbox_mode,
network_access,
))
.concat(DeveloperInstructions::from(
approval_policy,
exec_policy,
request_rule_enabled,
))
.concat(DeveloperInstructions::from(approval_policy, exec_policy))
.concat(DeveloperInstructions::from_writable_roots(writable_roots))
.concat(end_tag)
}
@@ -1207,7 +1189,6 @@ mod tests {
NetworkAccess::Enabled,
AskForApproval::OnRequest,
&Policy::empty(),
false,
None,
);
@@ -1217,7 +1198,7 @@ mod tests {
"expected network access to be enabled in message"
);
assert!(
text.contains("`approval_policy` is `on-request`"),
text.contains("How to request escalation"),
"expected approval guidance to be included"
);
}
@@ -1236,7 +1217,6 @@ mod tests {
&policy,
AskForApproval::UnlessTrusted,
&Policy::empty(),
false,
&PathBuf::from("/tmp"),
);
let text = instructions.into_text();
@@ -1245,7 +1225,7 @@ mod tests {
}
#[test]
fn includes_request_rule_instructions_when_enabled() {
fn includes_request_rule_instructions_for_on_request() {
let mut exec_policy = Policy::empty();
exec_policy
.add_prefix_rule(
@@ -1258,7 +1238,6 @@ mod tests {
NetworkAccess::Enabled,
AskForApproval::OnRequest,
&exec_policy,
true,
None,
);

View File

@@ -1,12 +0,0 @@
Approvals are your mechanism to get user consent to run shell commands without the sandbox. `approval_policy` is `on-request`: commands run in the sandbox by default, and you can specify in your tool call if you want to escalate a command to run without sandboxing. If completing the task requires escalated permissions, do not let these settings or the sandbox deter you from attempting to accomplish the user's task.
Here are scenarios where you'll need to request approval:
- You need to run a command that writes to a directory that requires escalated permissions (e.g. running tests that write to /var)
- You need to run a GUI app (e.g., open/xdg-open/osascript) to open browsers or files.
- You are running sandboxed and need to run a command that requires network access (e.g. installing packages)
- If you run a command that is important to solving the user's query, but it fails because of sandboxing, rerun the command with approval. ALWAYS proceed to use the `sandbox_permissions` and `justification` parameters - do not message the user before requesting approval for the command.
- You are about to take a potentially destructive action such as an `rm` or `git reset` that the user did not explicitly ask for.
When requesting approval to execute a command that will require escalated privileges:
- Provide the `sandbox_permissions` parameter with the value `"require_escalated"`
- Include a short, 1 sentence explanation for why you need escalated permissions in the justification parameter
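
For illustration, an escalated call carrying the two parameters this (now-removed) prompt describes might look like the following; the command and wording are hypothetical:

```rust
use serde_json::json;

// Hypothetical tool-call arguments requesting escalated permissions.
let args = json!({
    "command": ["pip", "install", "requests"],
    "sandbox_permissions": "require_escalated",
    "justification": "Package installation needs network access outside the sandbox.",
});
```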

View File

@@ -128,9 +128,6 @@ pub fn parse_shell_lc_single_command_prefix(command: &[String]) -> Option<Vec<St
if root.has_error() {
return None;
}
if !has_named_descendant_kind(root, "heredoc_redirect") {
return None;
}
let command_node = find_single_command_node(root)?;
parse_heredoc_command_words(command_node, script)
@@ -237,20 +234,6 @@ fn is_literal_word_or_number(node: Node<'_>) -> bool {
node.named_children(&mut cursor).next().is_none()
}
fn has_named_descendant_kind(node: Node<'_>, kind: &str) -> bool {
let mut stack = vec![node];
while let Some(current) = stack.pop() {
if current.kind() == kind {
return true;
}
let mut cursor = current.walk();
for child in current.named_children(&mut cursor) {
stack.push(child);
}
}
false
}
fn is_allowed_heredoc_attachment_kind(kind: &str) -> bool {
matches!(
kind,
@@ -532,7 +515,10 @@ mod tests {
"-lc".to_string(),
"echo hello > /tmp/out.txt".to_string(),
];
assert_eq!(parse_shell_lc_single_command_prefix(&command), None);
assert_eq!(
parse_shell_lc_single_command_prefix(&command),
Some(vec!["echo".to_string(), "hello".to_string()])
);
}
#[test]
@@ -548,6 +534,16 @@ mod tests {
);
}
#[test]
fn parse_shell_lc_single_command_prefix_rejects_redirect_with_chaining() {
let command = vec![
"bash".to_string(),
"-lc".to_string(),
r#"echo hello > /tmp/out.txt && cat /tmp/out.txt"#.to_string(),
];
assert_eq!(parse_shell_lc_single_command_prefix(&command), None);
}
#[test]
fn parse_shell_lc_single_command_prefix_rejects_herestring_with_substitution() {
let command = vec![

View File

@@ -46,6 +46,10 @@ struct Args {
#[arg(long = "thread-id")]
thread_id: Vec<String>,
/// Substring match against the log message.
#[arg(long)]
search: Option<String>,
/// Include logs that do not have a thread id.
#[arg(long)]
threadless: bool,
@@ -57,6 +61,10 @@ struct Args {
/// Poll interval in milliseconds.
#[arg(long, default_value_t = 500)]
poll_ms: u64,
/// Show compact output with only time, level, and message.
#[arg(long)]
compact: bool,
}
#[derive(Debug, Clone)]
@@ -67,6 +75,7 @@ struct LogFilter {
module_like: Vec<String>,
file_like: Vec<String>,
thread_ids: Vec<String>,
search: Option<String>,
include_threadless: bool,
}
@@ -81,7 +90,8 @@ async fn main() -> anyhow::Result<()> {
.unwrap_or_else(|| PathBuf::from("."));
let runtime = StateRuntime::init(codex_home, "logs-client".to_string(), None).await?;
let mut last_id = print_backfill(runtime.as_ref(), &filter, args.backfill).await?;
let mut last_id =
print_backfill(runtime.as_ref(), &filter, args.backfill, args.compact).await?;
if last_id == 0 {
last_id = fetch_max_id(runtime.as_ref(), &filter).await?;
}
@@ -91,7 +101,7 @@ async fn main() -> anyhow::Result<()> {
let rows = fetch_new_rows(runtime.as_ref(), &filter, last_id).await?;
for row in rows {
last_id = last_id.max(row.id);
println!("{}", format_row(&row));
println!("{}", format_row(&row, args.compact));
}
tokio::time::sleep(poll_interval).await;
}
@@ -154,6 +164,7 @@ fn build_filter(args: &Args) -> anyhow::Result<LogFilter> {
module_like,
file_like,
thread_ids,
search: args.search.clone(),
include_threadless: args.threadless,
})
}
@@ -172,6 +183,7 @@ async fn print_backfill(
runtime: &StateRuntime,
filter: &LogFilter,
backfill: usize,
compact: bool,
) -> anyhow::Result<i64> {
if backfill == 0 {
return Ok(0);
@@ -183,7 +195,7 @@ async fn print_backfill(
let mut last_id = 0;
for row in rows {
last_id = last_id.max(row.id);
println!("{}", format_row(&row));
println!("{}", format_row(&row, compact));
}
Ok(last_id)
}
@@ -233,6 +245,7 @@ fn to_log_query(
module_like: filter.module_like.clone(),
file_like: filter.file_like.clone(),
thread_ids: filter.thread_ids.clone(),
search: filter.search.clone(),
include_threadless: filter.include_threadless,
after_id,
limit,
@@ -240,8 +253,8 @@ fn to_log_query(
}
}
fn format_row(row: &LogRow) -> String {
let timestamp = formatter::ts(row.ts, row.ts_nanos);
fn format_row(row: &LogRow, compact: bool) -> String {
let timestamp = formatter::ts(row.ts, row.ts_nanos, compact);
let level = row.level.as_str();
let target = row.target.as_str();
let message = row.message.as_deref().unwrap_or("");
@@ -251,9 +264,13 @@ fn format_row(row: &LogRow) -> String {
let thread_id_colored = thread_id.blue().dimmed().to_string();
let target_colored = target.dimmed().to_string();
let message_colored = heuristic_formatting(message);
format!(
"{timestamp_colored} {level_colored} [{thread_id_colored}] {target_colored} - {message_colored}"
)
if compact {
format!("{timestamp_colored} {level_colored} {message_colored}")
} else {
format!(
"{timestamp_colored} {level_colored} [{thread_id_colored}] {target_colored} - {message_colored}"
)
}
}
fn heuristic_formatting(message: &str) -> String {
@@ -292,9 +309,10 @@ mod formatter {
.join("\n")
}
pub(super) fn ts(ts: i64, ts_nanos: i64) -> String {
pub(super) fn ts(ts: i64, ts_nanos: i64, compact: bool) -> String {
let nanos = u32::try_from(ts_nanos).unwrap_or(0);
match DateTime::<Utc>::from_timestamp(ts, nanos) {
Some(dt) if compact => dt.format("%H:%M:%S").to_string(),
Some(dt) => dt.to_rfc3339_opts(SecondsFormat::Millis, true),
None => format!("{ts}.{ts_nanos:09}Z"),
}
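
For reference, the two timestamp renderings diverge like this (1_700_000_000 seconds after the epoch is 2023-11-14 22:13:20 UTC):

```rust
assert_eq!(formatter::ts(1_700_000_000, 0, /*compact=*/ true), "22:13:20");
assert_eq!(
    formatter::ts(1_700_000_000, 0, /*compact=*/ false),
    "2023-11-14T22:13:20.000Z"
);
```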

View File

@@ -37,6 +37,7 @@ pub struct LogQuery {
pub module_like: Vec<String>,
pub file_like: Vec<String>,
pub thread_ids: Vec<String>,
pub search: Option<String>,
pub include_threadless: bool,
pub after_id: Option<i64>,
pub limit: Option<usize>,
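
The new `search` field flows into `push_log_filters` (next file), where it becomes a bound `INSTR` predicate; SQLite's `INSTR` is a case-sensitive substring match returning a 1-based position or 0, so the effective filter is roughly `AND INSTR(message, ?) > 0`. For example:

```rust
// With the rows inserted in the test below, "alphab" matches "alphabet"
// but not "alpha".
let query = LogQuery {
    search: Some("alphab".to_string()),
    ..Default::default()
};
```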

View File

@@ -711,6 +711,11 @@ fn push_log_filters<'a>(builder: &mut QueryBuilder<'a, Sqlite>, query: &'a LogQu
if let Some(after_id) = query.after_id {
builder.push(" AND id > ").push_bind(after_id);
}
if let Some(search) = query.search.as_ref() {
builder.push(" AND INSTR(message, ");
builder.push_bind(search.as_str());
builder.push(") > 0");
}
}
fn push_like_filters<'a>(
@@ -906,6 +911,8 @@ mod tests {
use super::StateRuntime;
use super::ThreadMetadata;
use super::state_db_filename;
use crate::LogEntry;
use crate::LogQuery;
use crate::STATE_DB_FILENAME;
use crate::STATE_DB_VERSION;
use crate::model::Phase2JobClaimOutcome;
@@ -2495,6 +2502,57 @@ VALUES (?, ?, ?, ?, ?)
let _ = tokio::fs::remove_dir_all(codex_home).await;
}
#[tokio::test]
async fn query_logs_with_search_matches_substring() {
let codex_home = unique_temp_dir();
let runtime = StateRuntime::init(codex_home.clone(), "test-provider".to_string(), None)
.await
.expect("initialize runtime");
runtime
.insert_logs(&[
LogEntry {
ts: 1_700_000_001,
ts_nanos: 0,
level: "INFO".to_string(),
target: "cli".to_string(),
message: Some("alpha".to_string()),
thread_id: Some("thread-1".to_string()),
process_uuid: None,
file: Some("main.rs".to_string()),
line: Some(42),
module_path: None,
},
LogEntry {
ts: 1_700_000_002,
ts_nanos: 0,
level: "INFO".to_string(),
target: "cli".to_string(),
message: Some("alphabet".to_string()),
thread_id: Some("thread-1".to_string()),
process_uuid: None,
file: Some("main.rs".to_string()),
line: Some(43),
module_path: None,
},
])
.await
.expect("insert test logs");
let rows = runtime
.query_logs(&LogQuery {
search: Some("alphab".to_string()),
..Default::default()
})
.await
.expect("query matching logs");
assert_eq!(rows.len(), 1);
assert_eq!(rows[0].message.as_deref(), Some("alphabet"));
let _ = tokio::fs::remove_dir_all(codex_home).await;
}
fn test_thread_metadata(
codex_home: &Path,
thread_id: ThreadId,

View File

@@ -978,6 +978,27 @@ impl App {
self.refresh_status_line();
}
fn should_wait_for_initial_session(session_selection: &SessionSelection) -> bool {
matches!(
session_selection,
SessionSelection::StartFresh | SessionSelection::Exit
)
}
fn should_handle_active_thread_events(
waiting_for_initial_session_configured: bool,
has_active_thread_receiver: bool,
) -> bool {
has_active_thread_receiver && !waiting_for_initial_session_configured
}
fn should_stop_waiting_for_initial_session(
waiting_for_initial_session_configured: bool,
primary_thread_id: Option<ThreadId>,
) -> bool {
waiting_for_initial_session_configured && primary_thread_id.is_some()
}
#[allow(clippy::too_many_arguments)]
pub async fn run(
tui: &mut tui::Tui,
@@ -1068,6 +1089,8 @@ impl App {
let status_line_invalid_items_warned = Arc::new(AtomicBool::new(false));
let enhanced_keys_supported = tui.enhanced_keys_supported();
let wait_for_initial_session_configured =
Self::should_wait_for_initial_session(&session_selection);
let mut chat_widget = match session_selection {
SessionSelection::StartFresh | SessionSelection::Exit => {
let init = crate::chatwidget::ChatWidgetInit {
@@ -1252,6 +1275,7 @@ impl App {
let mut thread_created_rx = thread_manager.subscribe_thread_created();
let mut listen_for_threads = true;
let mut waiting_for_initial_session_configured = wait_for_initial_session_configured;
let exit_reason = loop {
let control = select! {
@@ -1264,7 +1288,10 @@ impl App {
} else {
None
}
}, if app.active_thread_rx.is_some() => {
}, if App::should_handle_active_thread_events(
waiting_for_initial_session_configured,
app.active_thread_rx.is_some()
) => {
if let Some(event) = active {
app.handle_active_thread_event(tui, event).await?;
} else {
@@ -1291,6 +1318,12 @@ impl App {
AppRunControl::Continue
}
};
if App::should_stop_waiting_for_initial_session(
waiting_for_initial_session_configured,
app.primary_thread_id,
) {
waiting_for_initial_session_configured = false;
}
match control {
AppRunControl::Continue => {}
AppRunControl::Exit(reason) => break reason,
@@ -2915,6 +2948,76 @@ mod tests {
Ok(())
}
#[test]
fn startup_waiting_gate_is_only_for_fresh_or_exit_session_selection() {
assert_eq!(
App::should_wait_for_initial_session(&SessionSelection::StartFresh),
true
);
assert_eq!(
App::should_wait_for_initial_session(&SessionSelection::Exit),
true
);
assert_eq!(
App::should_wait_for_initial_session(&SessionSelection::Resume(PathBuf::from(
"/tmp/restore"
))),
false
);
assert_eq!(
App::should_wait_for_initial_session(&SessionSelection::Fork(PathBuf::from(
"/tmp/fork"
))),
false
);
}
#[test]
fn startup_waiting_gate_holds_active_thread_events_until_primary_thread_configured() {
let mut wait_for_initial_session =
App::should_wait_for_initial_session(&SessionSelection::StartFresh);
assert_eq!(wait_for_initial_session, true);
assert_eq!(
App::should_handle_active_thread_events(wait_for_initial_session, true),
false
);
assert_eq!(
App::should_stop_waiting_for_initial_session(wait_for_initial_session, None),
false
);
if App::should_stop_waiting_for_initial_session(
wait_for_initial_session,
Some(ThreadId::new()),
) {
wait_for_initial_session = false;
}
assert_eq!(wait_for_initial_session, false);
assert_eq!(
App::should_handle_active_thread_events(wait_for_initial_session, true),
true
);
}
#[test]
fn startup_waiting_gate_not_applied_for_resume_or_fork_session_selection() {
let wait_for_resume = App::should_wait_for_initial_session(&SessionSelection::Resume(
PathBuf::from("/tmp/restore"),
));
assert_eq!(
App::should_handle_active_thread_events(wait_for_resume, true),
true
);
let wait_for_fork = App::should_wait_for_initial_session(&SessionSelection::Fork(
PathBuf::from("/tmp/fork"),
));
assert_eq!(
App::should_handle_active_thread_events(wait_for_fork, true),
true
);
}
#[tokio::test]
async fn enqueue_thread_event_does_not_block_when_channel_full() -> Result<()> {
let mut app = make_test_app().await;

View File

@@ -371,5 +371,6 @@ pub(crate) enum FeedbackCategory {
BadResult,
GoodResult,
Bug,
SafetyCheck,
Other,
}

View File

@@ -45,12 +45,15 @@ pub(crate) struct CommandPopupFlags {
impl CommandPopup {
pub(crate) fn new(mut prompts: Vec<CustomPrompt>, flags: CommandPopupFlags) -> Self {
// Keep built-in availability in sync with the composer.
let builtins = slash_commands::builtins_for_input(
let builtins: Vec<(&'static str, SlashCommand)> = slash_commands::builtins_for_input(
flags.collaboration_modes_enabled,
flags.connectors_enabled,
flags.personality_command_enabled,
flags.windows_degraded_sandbox_active,
);
)
.into_iter()
.filter(|(name, _)| !name.starts_with("debug"))
.collect();
// Exclude prompts that collide with builtin command names and sort by name.
let exclude: HashSet<String> = builtins.iter().map(|(n, _)| (*n).to_string()).collect();
prompts.retain(|p| !exclude.contains(&p.name));
@@ -461,7 +464,7 @@ mod tests {
}
#[test]
fn collab_command_hidden_when_collaboration_modes_disabled() {
fn plan_command_hidden_when_collaboration_modes_disabled() {
let mut popup = CommandPopup::new(Vec::new(), CommandPopupFlags::default());
popup.on_composer_text_change("/".to_string());
@@ -473,35 +476,12 @@ mod tests {
CommandItem::UserPrompt(_) => None,
})
.collect();
assert!(
!cmds.contains(&"collab"),
"expected '/collab' to be hidden when collaboration modes are disabled, got {cmds:?}"
);
assert!(
!cmds.contains(&"plan"),
"expected '/plan' to be hidden when collaboration modes are disabled, got {cmds:?}"
);
}
#[test]
fn collab_command_visible_when_collaboration_modes_enabled() {
let mut popup = CommandPopup::new(
Vec::new(),
CommandPopupFlags {
collaboration_modes_enabled: true,
connectors_enabled: false,
personality_command_enabled: true,
windows_degraded_sandbox_active: false,
},
);
popup.on_composer_text_change("/collab".to_string());
match popup.selected_item() {
Some(CommandItem::Builtin(cmd)) => assert_eq!(cmd.command(), "collab"),
other => panic!("expected collab to be selected for exact match, got {other:?}"),
}
}
#[test]
fn plan_command_visible_when_collaboration_modes_enabled() {
let mut popup = CommandPopup::new(
@@ -566,4 +546,22 @@ mod tests {
other => panic!("expected personality to be selected for exact match, got {other:?}"),
}
}
#[test]
fn debug_commands_are_hidden_from_popup() {
let popup = CommandPopup::new(Vec::new(), CommandPopupFlags::default());
let cmds: Vec<&str> = popup
.filtered_items()
.into_iter()
.filter_map(|item| match item {
CommandItem::Builtin(cmd) => Some(cmd.command()),
CommandItem::UserPrompt(_) => None,
})
.collect();
assert!(
!cmds.iter().any(|name| name.starts_with("debug")),
"expected no /debug* command in popup menu, got {cmds:?}"
);
}
}

View File

@@ -353,6 +353,10 @@ fn feedback_title_and_placeholder(category: FeedbackCategory) -> (String, String
"Tell us more (bug)".to_string(),
"(optional) Write a short description to help us further".to_string(),
),
FeedbackCategory::SafetyCheck => (
"Tell us more (safety check)".to_string(),
"(optional) Share what was refused and why it should have been allowed".to_string(),
),
FeedbackCategory::Other => (
"Tell us more (other)".to_string(),
"(optional) Write a short description to help us further".to_string(),
@@ -365,6 +369,7 @@ fn feedback_classification(category: FeedbackCategory) -> &'static str {
FeedbackCategory::BadResult => "bad_result",
FeedbackCategory::GoodResult => "good_result",
FeedbackCategory::Bug => "bug",
FeedbackCategory::SafetyCheck => "safety_check",
FeedbackCategory::Other => "other",
}
}
@@ -378,14 +383,15 @@ fn issue_url_for_category(
// the external GitHub behavior identical while routing internal users to
// the internal go link.
match category {
FeedbackCategory::Bug | FeedbackCategory::BadResult | FeedbackCategory::Other => {
Some(match feedback_audience {
FeedbackAudience::OpenAiEmployee => slack_feedback_url(thread_id),
FeedbackAudience::External => {
format!("{BASE_BUG_ISSUE_URL}&steps=Uploaded%20thread:%20{thread_id}")
}
})
}
FeedbackCategory::Bug
| FeedbackCategory::BadResult
| FeedbackCategory::SafetyCheck
| FeedbackCategory::Other => Some(match feedback_audience {
FeedbackAudience::OpenAiEmployee => slack_feedback_url(thread_id),
FeedbackAudience::External => {
format!("{BASE_BUG_ISSUE_URL}&steps=Uploaded%20thread:%20{thread_id}")
}
}),
FeedbackCategory::GoodResult => None,
}
}
@@ -423,6 +429,12 @@ pub(crate) fn feedback_selection_params(
"Helpful, correct, highquality, or delightful result worth celebrating.",
FeedbackCategory::GoodResult,
),
make_feedback_item(
app_event_tx.clone(),
"safety check",
"Benign usage blocked due to safety checks or refusals.",
FeedbackCategory::SafetyCheck,
),
make_feedback_item(
app_event_tx,
"other",
@@ -616,7 +628,14 @@ mod tests {
}
#[test]
fn issue_url_available_for_bug_bad_result_and_other() {
fn feedback_view_safety_check() {
let view = make_view(FeedbackCategory::SafetyCheck);
let rendered = render(&view, 60);
insta::assert_snapshot!("feedback_view_safety_check", rendered);
}
#[test]
fn issue_url_available_for_bug_bad_result_safety_check_and_other() {
let bug_url = issue_url_for_category(
FeedbackCategory::Bug,
"thread-1",
@@ -639,6 +658,13 @@ mod tests {
);
assert!(other_url.is_some());
let safety_check_url = issue_url_for_category(
FeedbackCategory::SafetyCheck,
"thread-4",
FeedbackAudience::OpenAiEmployee,
);
assert!(safety_check_url.is_some());
assert!(
issue_url_for_category(
FeedbackCategory::GoodResult,

View File

@@ -20,7 +20,7 @@ pub(crate) fn builtins_for_input(
.filter(|(_, cmd)| allow_elevate_sandbox || *cmd != SlashCommand::ElevateSandbox)
.filter(|(_, cmd)| {
collaboration_modes_enabled
|| !matches!(*cmd, SlashCommand::Collab | SlashCommand::Plan)
|| *cmd != SlashCommand::Plan
})
.filter(|(_, cmd)| connectors_enabled || *cmd != SlashCommand::Apps)
.filter(|(_, cmd)| personality_command_enabled || *cmd != SlashCommand::Personality)
@@ -63,3 +63,15 @@ pub(crate) fn has_builtin_prefix(
.into_iter()
.any(|(command_name, _)| fuzzy_match(command_name, name).is_some())
}
#[cfg(test)]
mod tests {
use super::*;
use pretty_assertions::assert_eq;
#[test]
fn debug_command_still_resolves_for_dispatch() {
let cmd = find_builtin_command("debug-config", true, true, true, false);
assert_eq!(cmd, Some(SlashCommand::DebugConfig));
}
}
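
The filter change above only hides `/plan` from the popup when collaboration modes are off, and the new test pins down that `debug-*` commands stay resolvable for dispatch even though the popup never lists them. A minimal sketch of that "hidden from the menu, still dispatchable" split, using a toy command enum rather than the real `SlashCommand`:

```rust
// Toy command set; the real SlashCommand has many more variants and flags.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Cmd { Plan, DebugConfig, Quit }

impl Cmd {
    fn name(self) -> &'static str {
        match self {
            Cmd::Plan => "plan",
            Cmd::DebugConfig => "debug-config",
            Cmd::Quit => "quit",
        }
    }

    fn hidden_from_popup(self) -> bool {
        // Debug commands never show in the menu.
        matches!(self, Cmd::DebugConfig)
    }
}

const ALL: [Cmd; 3] = [Cmd::Plan, Cmd::DebugConfig, Cmd::Quit];

// Popup listing filters by visibility and feature flags...
fn popup_items(collaboration_modes_enabled: bool) -> Vec<Cmd> {
    ALL.into_iter()
        .filter(|c| !c.hidden_from_popup())
        .filter(|c| collaboration_modes_enabled || *c != Cmd::Plan)
        .collect()
}

// ...while dispatch resolves by name against the full set.
fn find_command(name: &str) -> Option<Cmd> {
    ALL.into_iter().find(|c| c.name() == name)
}
```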

View File

@@ -0,0 +1,9 @@
---
source: tui/src/bottom_pane/feedback_view.rs
expression: rendered
---
▌ Tell us more (safety check)
▌ (optional) Share what was refused and why it should have b
Press enter to confirm or esc to go back

View File

@@ -184,7 +184,6 @@ use crate::bottom_pane::SelectionViewParams;
use crate::bottom_pane::custom_prompt_view::CustomPromptView;
use crate::bottom_pane::popup_consts::standard_popup_hint_line;
use crate::clipboard_paste::paste_image_to_temp_png;
use crate::collab;
use crate::collaboration_modes;
use crate::diff_render::display_path_for;
use crate::exec_cell::CommandOutput;
@@ -201,6 +200,7 @@ use crate::history_cell::WebSearchCell;
use crate::key_hint;
use crate::key_hint::KeyBinding;
use crate::markdown::append_markdown;
use crate::multi_agents;
use crate::render::Insets;
use crate::render::renderable::ColumnRenderable;
use crate::render::renderable::FlexRenderable;
@@ -3299,22 +3299,18 @@ impl ChatWidget {
);
return;
}
if let Some(mask) = collaboration_modes::plan_mask(self.models_manager.as_ref()) {
if self.active_mode_kind() == ModeKind::Plan {
let Some(mask) = collaboration_modes::default_mask(self.models_manager.as_ref()) else {
self.add_info_message("Default mode unavailable right now.".to_string(), None);
return;
};
self.set_collaboration_mask(mask);
} else if let Some(mask) = collaboration_modes::plan_mask(self.models_manager.as_ref()) {
self.set_collaboration_mask(mask);
} else {
self.add_info_message("Plan mode unavailable right now.".to_string(), None);
}
}
SlashCommand::Collab => {
if !self.collaboration_modes_enabled() {
self.add_info_message(
"Collaboration modes are disabled.".to_string(),
Some("Enable collaboration modes to use /collab.".to_string()),
);
return;
}
self.open_collaboration_modes_popup();
}
SlashCommand::Agent => {
self.app_event_tx.send(AppEvent::OpenAgentPicker);
}
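
With `/collab` gone, `/plan` acts as a toggle: if Plan is already active it falls back to the default mask, otherwise it applies the plan mask, and either path reports when the mask is unavailable. A minimal sketch of the same toggle, assuming simplified `Mask`/`ModeKind` types in place of the real `collaboration_modes` API:

```rust
#[derive(Clone, Copy, PartialEq)]
enum ModeKind { Default, Plan }

#[derive(Clone, Copy)]
struct Mask { mode: ModeKind }

// Simplified stand-ins for collaboration_modes::{default_mask, plan_mask},
// which in the real code consult the models manager and may return None.
fn default_mask() -> Option<Mask> { Some(Mask { mode: ModeKind::Default }) }
fn plan_mask() -> Option<Mask> { Some(Mask { mode: ModeKind::Plan }) }

struct Widget { active: ModeKind }

impl Widget {
    fn toggle_plan(&mut self) {
        if self.active == ModeKind::Plan {
            // Already in Plan: fall back to the default mode if available.
            match default_mask() {
                Some(mask) => self.set_mask(mask),
                None => self.info("Default mode unavailable right now."),
            }
        } else if let Some(mask) = plan_mask() {
            self.set_mask(mask);
        } else {
            self.info("Plan mode unavailable right now.");
        }
    }

    fn set_mask(&mut self, mask: Mask) { self.active = mask.mode; }
    fn info(&self, msg: &str) { eprintln!("{msg}"); }
}
```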
@@ -4113,17 +4109,19 @@ impl ChatWidget {
EventMsg::ExitedReviewMode(review) => self.on_exited_review_mode(review),
EventMsg::ContextCompacted(_) => self.on_agent_message("Context compacted".to_owned()),
EventMsg::CollabAgentSpawnBegin(_) => {}
EventMsg::CollabAgentSpawnEnd(ev) => self.on_collab_event(collab::spawn_end(ev)),
EventMsg::CollabAgentSpawnEnd(ev) => self.on_collab_event(multi_agents::spawn_end(ev)),
EventMsg::CollabAgentInteractionBegin(_) => {}
EventMsg::CollabAgentInteractionEnd(ev) => {
self.on_collab_event(collab::interaction_end(ev))
self.on_collab_event(multi_agents::interaction_end(ev))
}
EventMsg::CollabWaitingBegin(ev) => self.on_collab_event(collab::waiting_begin(ev)),
EventMsg::CollabWaitingEnd(ev) => self.on_collab_event(collab::waiting_end(ev)),
EventMsg::CollabWaitingBegin(ev) => {
self.on_collab_event(multi_agents::waiting_begin(ev))
}
EventMsg::CollabWaitingEnd(ev) => self.on_collab_event(multi_agents::waiting_end(ev)),
EventMsg::CollabCloseBegin(_) => {}
EventMsg::CollabCloseEnd(ev) => self.on_collab_event(collab::close_end(ev)),
EventMsg::CollabResumeBegin(ev) => self.on_collab_event(collab::resume_begin(ev)),
EventMsg::CollabResumeEnd(ev) => self.on_collab_event(collab::resume_end(ev)),
EventMsg::CollabCloseEnd(ev) => self.on_collab_event(multi_agents::close_end(ev)),
EventMsg::CollabResumeBegin(ev) => self.on_collab_event(multi_agents::resume_begin(ev)),
EventMsg::CollabResumeEnd(ev) => self.on_collab_event(multi_agents::resume_end(ev)),
EventMsg::ThreadRolledBack(rollback) => {
if from_replay {
self.app_event_tx.send(AppEvent::ApplyThreadRollback {

View File

@@ -4,8 +4,10 @@ expression: popup
---
How was this?
1. bug Crash, error message, hang, or broken UI/behavior.
2. bad result Output was off-target, incorrect, incomplete, or unhelpful.
3. good result Helpful, correct, high-quality, or delightful result worth
celebrating.
4. other Slowness, feature suggestion, UX feedback, or anything else.
1. bug Crash, error message, hang, or broken UI/behavior.
2. bad result Output was off-target, incorrect, incomplete, or unhelpful.
3. good result Helpful, correct, high-quality, or delightful result worth
celebrating.
4. safety check Benign usage blocked due to safety checks or refusals.
5. other Slowness, feature suggestion, UX feedback, or anything
else.

View File

@@ -3578,35 +3578,32 @@ async fn collab_mode_shift_tab_cycles_only_when_enabled_and_idle() {
}
#[tokio::test]
async fn collab_slash_command_opens_picker_and_updates_mode() {
async fn plan_slash_command_toggles_plan_mode() {
let (mut chat, mut rx, mut op_rx) = make_chatwidget_manual(None).await;
chat.thread_id = Some(ThreadId::new());
chat.set_feature_enabled(Feature::CollaborationModes, true);
let initial = chat.current_collaboration_mode().clone();
let default = initial.clone();
chat.dispatch_command(SlashCommand::Collab);
let popup = render_bottom_popup(&chat, 80);
assert!(
popup.contains("Select Collaboration Mode"),
"expected collaboration picker: {popup}"
);
chat.dispatch_command(SlashCommand::Plan);
chat.handle_key_event(KeyEvent::from(KeyCode::Enter));
let selected_mask = match rx.try_recv() {
Ok(AppEvent::UpdateCollaborationMode(mask)) => mask,
other => panic!("expected UpdateCollaborationMode event, got {other:?}"),
};
chat.set_collaboration_mask(selected_mask);
assert!(rx.try_recv().is_err(), "plan should not emit an app event");
assert_eq!(chat.active_collaboration_mode_kind(), ModeKind::Plan);
assert_eq!(chat.current_collaboration_mode(), &default);
chat.dispatch_command(SlashCommand::Plan);
assert_eq!(chat.active_collaboration_mode_kind(), ModeKind::Default);
assert_eq!(chat.current_collaboration_mode(), &initial);
chat.bottom_pane
.set_composer_text("hello".to_string(), Vec::new(), Vec::new());
chat.handle_key_event(KeyEvent::from(KeyCode::Enter));
match next_submit_op(&mut op_rx) {
Op::UserTurn {
collaboration_mode:
Some(CollaborationMode {
mode: ModeKind::Default,
..
}),
collaboration_mode: Some(CollaborationMode {
mode: ModeKind::Default,
..
}),
personality: Some(Personality::Pragmatic),
..
} => {}
@@ -3614,7 +3611,6 @@ async fn collab_slash_command_opens_picker_and_updates_mode() {
panic!("expected Op::UserTurn with code collab mode, got {other:?}")
}
}
chat.bottom_pane
.set_composer_text("follow up".to_string(), Vec::new(), Vec::new());
chat.handle_key_event(KeyEvent::from(KeyCode::Enter));
@@ -3632,6 +3628,8 @@ async fn collab_slash_command_opens_picker_and_updates_mode() {
panic!("expected Op::UserTurn with code collab mode, got {other:?}")
}
}
assert_eq!(chat.current_collaboration_mode(), &initial);
assert_eq!(chat.active_mode_kind(), ModeKind::Default);
}
#[tokio::test]

View File

@@ -64,7 +64,6 @@ mod bottom_pane;
mod chatwidget;
mod cli;
mod clipboard_paste;
mod collab;
mod collaboration_modes;
mod color;
pub mod custom_terminal;
@@ -86,6 +85,7 @@ mod markdown_render;
mod markdown_stream;
mod mention_codec;
mod model_migration;
mod multi_agents;
mod notifications;
pub mod onboarding;
mod oss_selection;

View File

@@ -29,7 +29,6 @@ pub enum SlashCommand {
Init,
Compact,
Plan,
Collab,
Agent,
// Undo,
Diff,
@@ -82,7 +81,6 @@ impl SlashCommand {
SlashCommand::Model => "choose what model and reasoning effort to use",
SlashCommand::Personality => "choose a communication style for Codex",
SlashCommand::Plan => "switch to Plan mode",
SlashCommand::Collab => "change collaboration mode (experimental)",
SlashCommand::Agent => "switch the active agent thread",
SlashCommand::Approvals => "choose what Codex is allowed to do",
SlashCommand::Permissions => "choose what Codex is allowed to do",
@@ -152,7 +150,6 @@ impl SlashCommand {
| SlashCommand::Exit => true,
SlashCommand::Rollout => true,
SlashCommand::TestApproval => true,
SlashCommand::Collab => true,
SlashCommand::Agent => true,
SlashCommand::Statusline => false,
}

View File

@@ -218,9 +218,11 @@ impl Renderable for StatusIndicatorWidget {
return;
}
// Schedule next animation frame.
self.frame_requester
.schedule_frame_in(Duration::from_millis(32));
if self.animations_enabled {
// Schedule next animation frame.
self.frame_requester
.schedule_frame_in(Duration::from_millis(32));
}
let now = Instant::now();
let elapsed_duration = self.elapsed_duration_at(now);
let pretty_elapsed = fmt_elapsed_compact(elapsed_duration.as_secs());
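
The render path now re-queues the next frame only when animations are enabled, so a static status line renders once instead of spinning a ~32 ms redraw loop. A minimal sketch of that gate, with a hypothetical `FrameRequester` standing in for the real scheduler:

```rust
use std::time::Duration;

// Hypothetical stand-in for the real frame scheduler.
struct FrameRequester;

impl FrameRequester {
    fn schedule_frame_in(&self, delay: Duration) {
        // A real implementation would wake the event loop after `delay`.
        let _ = delay;
    }
}

struct StatusIndicator {
    frame_requester: FrameRequester,
    animations_enabled: bool,
}

impl StatusIndicator {
    fn render(&self) {
        if self.animations_enabled {
            // Schedule the next animation frame (~30 fps); skipped entirely
            // when animations are off, so no redraw loop is kept alive.
            self.frame_requester
                .schedule_frame_in(Duration::from_millis(32));
        }
        // ...draw the current frame either way...
    }
}
```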