mirror of https://github.com/openai/codex.git, synced 2026-04-26 23:55:25 +00:00
## Why

Enterprises can already constrain approvals, sandboxing, and web search through `requirements.toml` and MDM, but feature flags were still only configurable as managed defaults. That meant an enterprise could suggest feature values, but it could not actually pin them.

This change closes that gap and makes enterprise feature requirements behave like the other constrained settings. The effective feature set now stays consistent with enterprise requirements during config load, when config writes are validated, and when runtime code mutates feature flags later in the session.

It also tightens the runtime API for managed features. `ManagedFeatures` now follows the same constraint-oriented shape as `Constrained<T>` instead of exposing panic-prone mutation helpers, and production code can no longer construct it through an unconstrained `From<Features>` path.

The PR also hardens the `compact_resume_fork` integration coverage on Windows. After the feature-management changes, `compact_resume_after_second_compaction_preserves_history` was overflowing the libtest/Tokio thread stacks on Windows, so the test now uses an explicit larger-stack harness as a pragmatic mitigation. That may not be the ideal root-cause fix, and it merits a parallel investigation into whether part of the async future chain should be boxed to reduce stack pressure instead.

## What Changed

Enterprises can now pin feature values in `requirements.toml` with the requirements-side `features` table:

```toml
[features]
personality = true
unified_exec = false
```

Only canonical feature keys are allowed in the requirements `features` table; omitted keys remain unconstrained.

- Added a requirements-side pinned feature map to `ConfigRequirementsToml`, threaded it through source-preserving requirements merge and normalization in `codex-config`, and made the TOML surface use `[features]` (while still accepting legacy `[feature_requirements]` for compatibility).
- Exposed `featureRequirements` from `configRequirements/read`, regenerated the JSON/TypeScript schema artifacts, and updated the app-server README.
- Wrapped the effective feature set in `ManagedFeatures`, backed by `ConstrainedWithSource<Features>`, and changed its API to mirror `Constrained<T>`: `can_set(...)`, `set(...) -> ConstraintResult<()>`, and result-returning `enable` / `disable` / `set_enabled` helpers.
- Removed the legacy-usage and bulk-map passthroughs from `ManagedFeatures`; callers that need those behaviors now mutate a plain `Features` value and reapply it through `set(...)`, so the constrained wrapper remains the enforcement boundary.
- Removed the production loophole for constructing unconstrained `ManagedFeatures`. Non-test code now creates it through the configured feature-loading path, and `impl From<Features> for ManagedFeatures` is restricted to `#[cfg(test)]`.
- Rejected legacy feature aliases in enterprise feature requirements, and returned a load error when a pinned combination cannot survive dependency normalization.
- Validated config writes against enterprise feature requirements before persisting changes, including explicit conflicting writes and profile-specific feature states that normalize into invalid combinations.
- Updated runtime and TUI feature-toggle paths to use the constrained setter API and to persist or apply the effective post-constraint value rather than the requested value.
- Updated the `core_test_support` Bazel target to include the bundled core model-catalog fixtures in its runtime data, so helper code that resolves `core/models.json` through runfiles works in remote Bazel test environments.
- Renamed the core config test coverage to emphasize that effective feature values are normalized at runtime, while conflicting persisted config writes are rejected.
- Ran `compact_resume_after_second_compaction_preserves_history` inside an explicit 8 MiB test thread and Tokio runtime worker stack, following the existing larger-stack integration-test pattern, to keep the Windows `compact_resume_fork` test slice from aborting while a parallel investigation continues into whether some of the underlying async futures should be boxed.

## Verification

- `cargo test -p codex-config`
- `cargo test -p codex-core feature_requirements_ -- --nocapture`
- `cargo test -p codex-core load_requirements_toml_produces_expected_constraints -- --nocapture`
- `cargo test -p codex-core compact_resume_after_second_compaction_preserves_history -- --nocapture`
- `cargo test -p codex-core compact_resume_fork -- --nocapture`
- Re-ran the built `codex-core` `tests/all` binary with `RUST_MIN_STACK=262144` for `compact_resume_after_second_compaction_preserves_history` to confirm the explicit-stack harness fixes the deterministic low-stack repro.
- `cargo test -p codex-core`
  - This still fails locally in unrelated integration areas that expect the `codex` / `test_stdio_server` binaries or hit existing `search_tool` wiremock mismatches.

## Docs

`developers.openai.com/codex` should document the requirements-side `[features]` table for enterprise and MDM-managed configuration, including that it only accepts canonical feature keys and that conflicting config writes are rejected.
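The constraint-shaped `ManagedFeatures` API described above can be sketched roughly as follows. This is a simplified, hypothetical model for illustration only: the real `codex-core` types (`ConstrainedWithSource<Features>`, the actual `Feature` enum, dependency normalization) differ in detail, and the field names here are assumptions.

```rust
use std::collections::BTreeMap;

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Debug)]
enum Feature {
    Personality,
    UnifiedExec,
}

#[derive(Clone, Debug, PartialEq)]
struct Features(BTreeMap<Feature, bool>);

#[derive(Debug, PartialEq)]
enum ConstraintError {
    PinnedByRequirements(Feature),
}

type ConstraintResult<T> = Result<T, ConstraintError>;

/// Sketch of a wrapper that enforces enterprise-pinned feature values:
/// every mutation funnels through `set(...)`, which is the enforcement
/// boundary, so there is no panic-prone direct-mutation path.
struct ManagedFeatures {
    effective: Features,
    /// Pinned values, conceptually sourced from a requirements-side
    /// `[features]` table.
    pinned: BTreeMap<Feature, bool>,
}

impl ManagedFeatures {
    /// Check a candidate feature set against the pinned requirements.
    fn can_set(&self, candidate: &Features) -> ConstraintResult<()> {
        for (feature, required) in &self.pinned {
            if candidate.0.get(feature) != Some(required) {
                return Err(ConstraintError::PinnedByRequirements(*feature));
            }
        }
        Ok(())
    }

    /// Apply a whole candidate set; rejected candidates leave the
    /// effective value unchanged.
    fn set(&mut self, candidate: Features) -> ConstraintResult<()> {
        self.can_set(&candidate)?;
        self.effective = candidate;
        Ok(())
    }

    fn set_enabled(&mut self, feature: Feature, enabled: bool) -> ConstraintResult<()> {
        let mut next = self.effective.clone();
        next.0.insert(feature, enabled);
        self.set(next)
    }

    fn enable(&mut self, feature: Feature) -> ConstraintResult<()> {
        self.set_enabled(feature, true)
    }

    fn disable(&mut self, feature: Feature) -> ConstraintResult<()> {
        self.set_enabled(feature, false)
    }
}

fn main() {
    // Requirements pin: personality = true, unified_exec = false.
    let pinned = BTreeMap::from([(Feature::Personality, true), (Feature::UnifiedExec, false)]);
    let mut features = ManagedFeatures {
        effective: Features(pinned.clone()),
        pinned,
    };
    // A toggle that conflicts with a pinned value is rejected as a
    // constraint result rather than panicking...
    assert_eq!(
        features.disable(Feature::Personality),
        Err(ConstraintError::PinnedByRequirements(Feature::Personality))
    );
    // ...and the effective value stays consistent with the requirements.
    assert!(features.effective.0[&Feature::Personality]);
}
```

Callers that previously used bulk-map passthroughs would, under this shape, mutate a plain `Features` value and reapply it through `set(...)`, which is why the wrapper can remain the single enforcement point.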
1024 lines
34 KiB
Rust
#![allow(clippy::unwrap_used)]

use codex_apply_patch::APPLY_PATCH_TOOL_INSTRUCTIONS;
use codex_core::features::Feature;
use codex_core::shell::Shell;
use codex_core::shell::default_user_shell;
use codex_protocol::config_types::CollaborationMode;
use codex_protocol::config_types::ModeKind;
use codex_protocol::config_types::ReasoningSummary;
use codex_protocol::config_types::Settings;
use codex_protocol::config_types::WebSearchMode;
use codex_protocol::openai_models::ReasoningEffort;
use codex_protocol::protocol::AskForApproval;
use codex_protocol::protocol::ENVIRONMENT_CONTEXT_OPEN_TAG;
use codex_protocol::protocol::EventMsg;
use codex_protocol::protocol::Op;
use codex_protocol::protocol::SandboxPolicy;
use codex_protocol::user_input::UserInput;
use codex_utils_absolute_path::AbsolutePathBuf;
use core_test_support::responses::ev_completed;
use core_test_support::responses::ev_response_created;
use core_test_support::responses::mount_sse_once;
use core_test_support::responses::sse;
use core_test_support::responses::start_mock_server;
use core_test_support::skip_if_no_network;
use core_test_support::test_codex::TestCodex;
use core_test_support::test_codex::test_codex;
use core_test_support::wait_for_event;
use pretty_assertions::assert_eq;
use tempfile::TempDir;

fn text_user_input(text: String) -> serde_json::Value {
    text_user_input_parts(vec![text])
}

fn text_user_input_parts(texts: Vec<String>) -> serde_json::Value {
    serde_json::json!({
        "type": "message",
        "role": "user",
        "content": texts
            .into_iter()
            .map(|text| serde_json::json!({ "type": "input_text", "text": text }))
            .collect::<Vec<_>>()
    })
}

fn assert_default_env_context(text: &str, cwd: &str, shell: &Shell) {
    let shell_name = shell.name();
    assert!(
        text.starts_with(ENVIRONMENT_CONTEXT_OPEN_TAG),
        "expected environment context fragment: {text}"
    );
    assert!(
        text.contains(&format!("<cwd>{cwd}</cwd>")),
        "expected cwd in environment context: {text}"
    );
    assert!(
        text.contains(&format!("<shell>{shell_name}</shell>")),
        "expected shell in environment context: {text}"
    );
    assert!(
        text.contains("<current_date>") && text.contains("</current_date>"),
        "expected current_date in environment context: {text}"
    );
    assert!(
        text.contains("<timezone>") && text.contains("</timezone>"),
        "expected timezone in environment context: {text}"
    );
    assert!(
        text.ends_with("</environment_context>"),
        "expected closing environment_context tag: {text}"
    );
}

fn assert_tool_names(body: &serde_json::Value, expected_names: &[&str]) {
    assert_eq!(
        body["tools"]
            .as_array()
            .unwrap()
            .iter()
            .map(|t| {
                t.get("name")
                    .and_then(|value| value.as_str())
                    .or_else(|| t.get("type").and_then(|value| value.as_str()))
                    .unwrap()
                    .to_string()
            })
            .collect::<Vec<_>>(),
        expected_names
    );
}

fn normalize_newlines(text: &str) -> String {
    text.replace("\r\n", "\n")
}

#[tokio::test(flavor = "multi_thread", worker_threads = 4)]
async fn prompt_tools_are_consistent_across_requests() -> anyhow::Result<()> {
    skip_if_no_network!(Ok(()));
    use pretty_assertions::assert_eq;

    let server = start_mock_server().await;
    let req1 = mount_sse_once(
        &server,
        sse(vec![ev_response_created("resp-1"), ev_completed("resp-1")]),
    )
    .await;
    let req2 = mount_sse_once(
        &server,
        sse(vec![ev_response_created("resp-2"), ev_completed("resp-2")]),
    )
    .await;

    let TestCodex {
        codex,
        config,
        thread_manager,
        ..
    } = test_codex()
        .with_config(|config| {
            config.user_instructions = Some("be consistent and helpful".to_string());
            config.model = Some("gpt-5.1-codex-max".to_string());
            // Keep tool expectations stable when the default web_search mode changes.
            config
                .web_search_mode
                .set(WebSearchMode::Cached)
                .expect("test web_search_mode should satisfy constraints");
            config
                .features
                .enable(Feature::CollaborationModes)
                .expect("test config should allow feature update");
        })
        .build(&server)
        .await?;
    let base_instructions = thread_manager
        .get_models_manager()
        .get_model_info(
            config
                .model
                .as_deref()
                .expect("test config should have a model"),
            &config,
        )
        .await
        .base_instructions;

    codex
        .submit(Op::UserInput {
            items: vec![UserInput::Text {
                text: "hello 1".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
        })
        .await?;
    wait_for_event(&codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    codex
        .submit(Op::UserInput {
            items: vec![UserInput::Text {
                text: "hello 2".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
        })
        .await?;
    wait_for_event(&codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    let mut expected_tools_names = if cfg!(windows) {
        vec!["shell_command"]
    } else {
        vec!["exec_command", "write_stdin"]
    };
    expected_tools_names.extend([
        "update_plan",
        "request_user_input",
        "apply_patch",
        "web_search",
        "view_image",
    ]);
    let body0 = req1.single_request().body_json();

    let expected_instructions = if expected_tools_names.contains(&"apply_patch") {
        base_instructions
    } else {
        [base_instructions, APPLY_PATCH_TOOL_INSTRUCTIONS.to_string()].join("\n")
    };

    assert_eq!(
        body0["instructions"],
        serde_json::json!(expected_instructions),
    );
    assert_tool_names(&body0, &expected_tools_names);

    let body1 = req2.single_request().body_json();
    assert_eq!(
        body1["instructions"],
        serde_json::json!(expected_instructions),
    );
    assert_tool_names(&body1, &expected_tools_names);

    Ok(())
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn gpt_5_tools_without_apply_patch_append_apply_patch_instructions() -> anyhow::Result<()> {
    skip_if_no_network!(Ok(()));
    use pretty_assertions::assert_eq;

    let server = start_mock_server().await;
    let req1 = mount_sse_once(
        &server,
        sse(vec![ev_response_created("resp-1"), ev_completed("resp-1")]),
    )
    .await;
    let req2 = mount_sse_once(
        &server,
        sse(vec![ev_response_created("resp-2"), ev_completed("resp-2")]),
    )
    .await;

    let TestCodex { codex, .. } = test_codex()
        .with_config(|config| {
            config.user_instructions = Some("be consistent and helpful".to_string());
            config
                .features
                .disable(Feature::ApplyPatchFreeform)
                .expect("test config should allow feature update");
            config
                .features
                .enable(Feature::CollaborationModes)
                .expect("test config should allow feature update");
            config.model = Some("gpt-5".to_string());
        })
        .build(&server)
        .await?;

    codex
        .submit(Op::UserInput {
            items: vec![UserInput::Text {
                text: "hello 1".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
        })
        .await?;

    wait_for_event(&codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;
    codex
        .submit(Op::UserInput {
            items: vec![UserInput::Text {
                text: "hello 2".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
        })
        .await?;

    wait_for_event(&codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    let body0 = req1.single_request().body_json();
    let instructions0 = body0["instructions"]
        .as_str()
        .expect("instructions should be a string");
    assert!(
        instructions0.contains("You are"),
        "expected non-empty instructions"
    );

    let body1 = req2.single_request().body_json();
    let instructions1 = body1["instructions"]
        .as_str()
        .expect("instructions should be a string");
    assert_eq!(
        normalize_newlines(instructions1),
        normalize_newlines(instructions0)
    );

    Ok(())
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn prefixes_context_and_instructions_once_and_consistently_across_requests()
-> anyhow::Result<()> {
    skip_if_no_network!(Ok(()));
    use pretty_assertions::assert_eq;

    let server = start_mock_server().await;
    let req1 = mount_sse_once(
        &server,
        sse(vec![ev_response_created("resp-1"), ev_completed("resp-1")]),
    )
    .await;
    let req2 = mount_sse_once(
        &server,
        sse(vec![ev_response_created("resp-2"), ev_completed("resp-2")]),
    )
    .await;

    let TestCodex { codex, config, .. } = test_codex()
        .with_config(|config| {
            config.user_instructions = Some("be consistent and helpful".to_string());
            config
                .features
                .enable(Feature::CollaborationModes)
                .expect("test config should allow feature update");
        })
        .build(&server)
        .await?;

    codex
        .submit(Op::UserInput {
            items: vec![UserInput::Text {
                text: "hello 1".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
        })
        .await?;
    wait_for_event(&codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    codex
        .submit(Op::UserInput {
            items: vec![UserInput::Text {
                text: "hello 2".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
        })
        .await?;
    wait_for_event(&codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    let body1 = req1.single_request().body_json();
    let input1 = body1["input"].as_array().expect("input array");
    assert_eq!(
        input1.len(),
        3,
        "expected permissions + cached contextual user prefix + user msg"
    );

    let ui_text = input1[1]["content"][0]["text"]
        .as_str()
        .expect("ui message text");
    assert!(
        ui_text.contains("be consistent and helpful"),
        "expected user instructions in UI message: {ui_text}"
    );

    let shell = default_user_shell();
    let cwd_str = config.cwd.to_string_lossy();
    let env_text = input1[1]["content"][1]["text"]
        .as_str()
        .expect("environment context text");
    assert_default_env_context(env_text, &cwd_str, &shell);
    assert_eq!(
        input1[1]["content"][1]["type"].as_str(),
        Some("input_text"),
        "expected environment context bundled after UI message in cached contextual message"
    );
    assert_eq!(input1[2], text_user_input("hello 1".to_string()));

    let body2 = req2.single_request().body_json();
    let input2 = body2["input"].as_array().expect("input array");
    assert_eq!(
        &input2[..input1.len()],
        input1.as_slice(),
        "expected cached prefix to be reused"
    );
    assert_eq!(input2[input1.len()], text_user_input("hello 2".to_string()));

    Ok(())
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn overrides_turn_context_but_keeps_cached_prefix_and_key_constant() -> anyhow::Result<()> {
    skip_if_no_network!(Ok(()));
    use pretty_assertions::assert_eq;

    let server = start_mock_server().await;
    let req1 = mount_sse_once(
        &server,
        sse(vec![ev_response_created("resp-1"), ev_completed("resp-1")]),
    )
    .await;
    let req2 = mount_sse_once(
        &server,
        sse(vec![ev_response_created("resp-2"), ev_completed("resp-2")]),
    )
    .await;

    let TestCodex { codex, .. } = test_codex()
        .with_config(|config| {
            config.user_instructions = Some("be consistent and helpful".to_string());
            config
                .features
                .enable(Feature::CollaborationModes)
                .expect("test config should allow feature update");
        })
        .build(&server)
        .await?;

    // First turn
    codex
        .submit(Op::UserInput {
            items: vec![UserInput::Text {
                text: "hello 1".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
        })
        .await?;
    wait_for_event(&codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    let writable = TempDir::new().unwrap();
    let new_policy = SandboxPolicy::WorkspaceWrite {
        writable_roots: vec![writable.path().try_into().unwrap()],
        read_only_access: Default::default(),
        network_access: true,
        exclude_tmpdir_env_var: true,
        exclude_slash_tmp: true,
    };
    codex
        .submit(Op::OverrideTurnContext {
            cwd: None,
            approval_policy: Some(AskForApproval::Never),
            sandbox_policy: Some(new_policy.clone()),
            windows_sandbox_level: None,
            model: None,
            effort: Some(Some(ReasoningEffort::High)),
            summary: Some(ReasoningSummary::Detailed),
            service_tier: None,
            collaboration_mode: None,
            personality: None,
        })
        .await?;

    // Second turn after overrides
    codex
        .submit(Op::UserInput {
            items: vec![UserInput::Text {
                text: "hello 2".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
        })
        .await?;
    wait_for_event(&codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    let request1 = req1.single_request();
    let request2 = req2.single_request();
    let body1 = request1.body_json();
    let body2 = request2.body_json();
    // prompt_cache_key should remain constant across overrides
    assert_eq!(
        body1["prompt_cache_key"], body2["prompt_cache_key"],
        "prompt_cache_key should not change across overrides"
    );

    // The entire prefix from the first request should be identical and reused
    // as the prefix of the second request, ensuring cache hit potential.
    let expected_user_message_2 = serde_json::json!({
        "type": "message",
        "role": "user",
        "content": [ { "type": "input_text", "text": "hello 2" } ]
    });
    let expected_permissions_msg = body1["input"][0].clone();
    let body1_input = body1["input"].as_array().expect("input array");
    // After overriding the turn context, emit one updated permissions message.
    let expected_permissions_msg_2 = body2["input"][body1_input.len()].clone();
    assert_ne!(
        expected_permissions_msg_2, expected_permissions_msg,
        "expected updated permissions message after override"
    );
    let mut expected_body2 = body1_input.to_vec();
    expected_body2.push(expected_permissions_msg_2);
    expected_body2.push(expected_user_message_2);
    assert_eq!(body2["input"], serde_json::Value::Array(expected_body2));

    Ok(())
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn override_before_first_turn_emits_environment_context() -> anyhow::Result<()> {
    skip_if_no_network!(Ok(()));

    let server = start_mock_server().await;
    let req = mount_sse_once(
        &server,
        sse(vec![ev_response_created("resp-1"), ev_completed("resp-1")]),
    )
    .await;

    let TestCodex { codex, .. } = test_codex().build(&server).await?;

    let collaboration_mode = CollaborationMode {
        mode: ModeKind::Default,
        settings: Settings {
            model: "gpt-5.1".to_string(),
            reasoning_effort: Some(ReasoningEffort::High),
            developer_instructions: None,
        },
    };

    codex
        .submit(Op::OverrideTurnContext {
            cwd: None,
            approval_policy: Some(AskForApproval::Never),
            sandbox_policy: None,
            windows_sandbox_level: None,
            model: Some("gpt-5.1-codex".to_string()),
            effort: Some(Some(ReasoningEffort::Low)),
            summary: None,
            service_tier: None,
            collaboration_mode: Some(collaboration_mode),
            personality: None,
        })
        .await?;

    codex
        .submit(Op::UserInput {
            items: vec![UserInput::Text {
                text: "first message".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
        })
        .await?;

    wait_for_event(&codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    let body = req.single_request().body_json();
    assert_eq!(body["model"].as_str(), Some("gpt-5.1"));
    assert_eq!(
        body.get("reasoning")
            .and_then(|reasoning| reasoning.get("effort"))
            .and_then(|value| value.as_str()),
        Some("high")
    );
    let input = body["input"]
        .as_array()
        .expect("input array must be present");
    assert!(
        !input.is_empty(),
        "expected at least environment context and user message"
    );

    let env_texts: Vec<&str> = input
        .iter()
        .filter_map(|msg| {
            msg["content"].as_array().map(|content| {
                content
                    .iter()
                    .filter_map(|item| item["text"].as_str())
                    .collect::<Vec<_>>()
            })
        })
        .flatten()
        .filter(|text| text.starts_with(ENVIRONMENT_CONTEXT_OPEN_TAG))
        .collect();
    assert!(
        !env_texts.is_empty(),
        "expected environment context to be emitted: {env_texts:?}"
    );
    assert!(
        env_texts
            .iter()
            .any(|text| text.contains("<current_date>") && text.contains("</current_date>")),
        "expected current_date in environment context: {env_texts:?}"
    );
    assert!(
        env_texts
            .iter()
            .any(|text| text.contains("<timezone>") && text.contains("</timezone>")),
        "expected timezone in environment context: {env_texts:?}"
    );

    let env_count = input
        .iter()
        .filter(|msg| {
            msg["content"]
                .as_array()
                .and_then(|content| {
                    content.iter().find(|item| {
                        item["type"].as_str() == Some("input_text")
                            && item["text"]
                                .as_str()
                                .map(|text| text.starts_with(ENVIRONMENT_CONTEXT_OPEN_TAG))
                                .unwrap_or(false)
                    })
                })
                .is_some()
        })
        .count();
    assert!(
        env_count >= 1,
        "environment context should appear at least once, found {env_count}"
    );

    let permissions_texts: Vec<&str> = input
        .iter()
        .filter_map(|msg| {
            let role = msg["role"].as_str()?;
            if role != "developer" {
                return None;
            }
            msg["content"]
                .as_array()
                .and_then(|content| content.first())
                .and_then(|item| item["text"].as_str())
        })
        .collect();
    assert!(
        permissions_texts.iter().any(|text| {
            let lower = text.to_ascii_lowercase();
            (lower.contains("approval policy") || lower.contains("approval_policy"))
                && lower.contains("never")
        }),
        "permissions message should reflect overridden approval policy: {permissions_texts:?}"
    );

    let user_texts: Vec<&str> = input
        .iter()
        .filter_map(|msg| {
            msg["content"].as_array().map(|content| {
                content
                    .iter()
                    .filter_map(|item| item["text"].as_str())
                    .collect::<Vec<_>>()
            })
        })
        .flatten()
        .collect();
    assert!(
        user_texts.contains(&"first message"),
        "expected user message text, got {user_texts:?}"
    );

    Ok(())
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn per_turn_overrides_keep_cached_prefix_and_key_constant() -> anyhow::Result<()> {
    skip_if_no_network!(Ok(()));
    use pretty_assertions::assert_eq;

    let server = start_mock_server().await;
    let req1 = mount_sse_once(
        &server,
        sse(vec![ev_response_created("resp-1"), ev_completed("resp-1")]),
    )
    .await;
    let req2 = mount_sse_once(
        &server,
        sse(vec![ev_response_created("resp-2"), ev_completed("resp-2")]),
    )
    .await;

    let TestCodex { codex, .. } = test_codex()
        .with_config(|config| {
            config.user_instructions = Some("be consistent and helpful".to_string());
            config
                .features
                .enable(Feature::CollaborationModes)
                .expect("test config should allow feature update");
        })
        .build(&server)
        .await?;

    // First turn
    codex
        .submit(Op::UserInput {
            items: vec![UserInput::Text {
                text: "hello 1".into(),
                text_elements: Vec::new(),
            }],
            final_output_json_schema: None,
        })
        .await?;
    wait_for_event(&codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    // Second turn using per-turn overrides via UserTurn
    let new_cwd = TempDir::new().unwrap();
    let writable = TempDir::new().unwrap();
    let new_policy = SandboxPolicy::WorkspaceWrite {
        writable_roots: vec![AbsolutePathBuf::try_from(writable.path()).unwrap()],
        read_only_access: Default::default(),
        network_access: true,
        exclude_tmpdir_env_var: true,
        exclude_slash_tmp: true,
    };
    codex
        .submit(Op::UserTurn {
            items: vec![UserInput::Text {
                text: "hello 2".into(),
                text_elements: Vec::new(),
            }],
            cwd: new_cwd.path().to_path_buf(),
            approval_policy: AskForApproval::Never,
            sandbox_policy: new_policy.clone(),
            model: "o3".to_string(),
            effort: Some(ReasoningEffort::High),
            summary: Some(ReasoningSummary::Detailed),
            service_tier: None,
            collaboration_mode: None,
            final_output_json_schema: None,
            personality: None,
        })
        .await?;
    wait_for_event(&codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    let request1 = req1.single_request();
    let request2 = req2.single_request();
    let body1 = request1.body_json();
    let body2 = request2.body_json();

    // prompt_cache_key should remain constant across per-turn overrides
    assert_eq!(
        body1["prompt_cache_key"], body2["prompt_cache_key"],
        "prompt_cache_key should not change across per-turn overrides"
    );

    // The entire prefix from the first request should be identical and reused
    // as the prefix of the second request.
    let expected_user_message_2 = serde_json::json!({
        "type": "message",
        "role": "user",
        "content": [ { "type": "input_text", "text": "hello 2" } ]
    });
    let expected_permissions_msg = body1["input"][0].clone();
    let body1_input = body1["input"].as_array().expect("input array");
    let expected_settings_update_msg = body2["input"][body1_input.len()].clone();
    assert_ne!(
        expected_settings_update_msg, expected_permissions_msg,
        "expected updated permissions message after per-turn override"
    );
    assert_eq!(
        expected_settings_update_msg["role"].as_str(),
        Some("developer")
    );
    assert!(
        request2.has_message_with_input_texts("developer", |texts| {
            texts.iter().any(|text| text.contains("<model_switch>"))
        }),
        "expected model switch section after model override: {expected_settings_update_msg:?}"
    );
    let expected_env_msg_2 = body2["input"][body1_input.len() + 1].clone();
    assert_eq!(expected_env_msg_2["role"].as_str(), Some("user"));
    let env_text = expected_env_msg_2["content"][0]["text"]
        .as_str()
        .expect("environment context text");
    let shell = default_user_shell();
    let expected_cwd = new_cwd.path().display().to_string();
    assert_default_env_context(env_text, &expected_cwd, &shell);
    let mut expected_body2 = body1_input.to_vec();
    expected_body2.push(expected_settings_update_msg);
    expected_body2.push(expected_env_msg_2);
    expected_body2.push(expected_user_message_2);
    assert_eq!(body2["input"], serde_json::Value::Array(expected_body2));

    Ok(())
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn send_user_turn_with_no_changes_does_not_send_environment_context() -> anyhow::Result<()> {
    skip_if_no_network!(Ok(()));
    use pretty_assertions::assert_eq;

    let server = start_mock_server().await;
    let req1 = mount_sse_once(
        &server,
        sse(vec![ev_response_created("resp-1"), ev_completed("resp-1")]),
    )
    .await;
    let req2 = mount_sse_once(
        &server,
        sse(vec![ev_response_created("resp-2"), ev_completed("resp-2")]),
    )
    .await;

    let TestCodex {
        codex,
        config,
        session_configured,
        ..
    } = test_codex()
        .with_config(|config| {
            config.user_instructions = Some("be consistent and helpful".to_string());
            config
                .features
                .enable(Feature::CollaborationModes)
                .expect("test config should allow feature update");
        })
        .build(&server)
        .await?;

    let default_cwd = config.cwd.clone();
    let default_approval_policy = config.permissions.approval_policy.value();
    let default_sandbox_policy = config.permissions.sandbox_policy.get();
    let default_model = session_configured.model;
    let default_effort = config.model_reasoning_effort;
    let default_summary = config.model_reasoning_summary;

    codex
        .submit(Op::UserTurn {
            items: vec![UserInput::Text {
                text: "hello 1".into(),
                text_elements: Vec::new(),
            }],
            cwd: default_cwd.clone(),
            approval_policy: default_approval_policy,
            sandbox_policy: default_sandbox_policy.clone(),
            model: default_model.clone(),
            effort: default_effort,
            summary: Some(default_summary.unwrap_or(ReasoningSummary::Auto)),
            service_tier: None,
            collaboration_mode: None,
            final_output_json_schema: None,
            personality: None,
        })
        .await?;
    wait_for_event(&codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    codex
        .submit(Op::UserTurn {
            items: vec![UserInput::Text {
                text: "hello 2".into(),
                text_elements: Vec::new(),
            }],
            cwd: default_cwd.clone(),
            approval_policy: default_approval_policy,
            sandbox_policy: default_sandbox_policy.clone(),
            model: default_model.clone(),
            effort: default_effort,
            summary: Some(default_summary.unwrap_or(ReasoningSummary::Auto)),
            service_tier: None,
            collaboration_mode: None,
            final_output_json_schema: None,
            personality: None,
        })
        .await?;
    wait_for_event(&codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;

    let request1 = req1.single_request();
    let request2 = req2.single_request();
    let body1 = request1.body_json();
    let body2 = request2.body_json();

    let expected_permissions_msg = body1["input"][0].clone();
    let expected_ui_msg = body1["input"][1].clone();

    let shell = default_user_shell();
    let default_cwd_lossy = default_cwd.to_string_lossy();
    let expected_env_text_1 = expected_ui_msg["content"][1]["text"]
        .as_str()
        .expect("cached environment context text")
        .to_string();
    assert_default_env_context(&expected_env_text_1, &default_cwd_lossy, &shell);

    let expected_contextual_user_msg_1 = text_user_input_parts(vec![
        expected_ui_msg["content"][0]["text"]
            .as_str()
            .expect("cached user instructions text")
            .to_string(),
        expected_env_text_1,
    ]);
    let expected_user_message_1 = text_user_input("hello 1".to_string());

    let expected_input_1 = serde_json::Value::Array(vec![
        expected_permissions_msg.clone(),
        expected_contextual_user_msg_1.clone(),
        expected_user_message_1.clone(),
    ]);
    assert_eq!(body1["input"], expected_input_1);

    let expected_user_message_2 = text_user_input("hello 2".to_string());
    let expected_input_2 = serde_json::Value::Array(vec![
        expected_permissions_msg,
        expected_contextual_user_msg_1,
        expected_user_message_1,
        expected_user_message_2,
    ]);
    assert_eq!(body2["input"], expected_input_2);

    Ok(())
}

#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
|
|
async fn send_user_turn_with_changes_sends_environment_context() -> anyhow::Result<()> {
|
|
skip_if_no_network!(Ok(()));
|
|
use pretty_assertions::assert_eq;
|
|
|
|
let server = start_mock_server().await;
|
|
|
|
let req1 = mount_sse_once(
|
|
&server,
|
|
sse(vec![ev_response_created("resp-1"), ev_completed("resp-1")]),
|
|
)
|
|
.await;
|
|
let req2 = mount_sse_once(
|
|
&server,
|
|
sse(vec![ev_response_created("resp-2"), ev_completed("resp-2")]),
|
|
)
|
|
.await;
|
|
let TestCodex {
|
|
codex,
|
|
config,
|
|
session_configured,
|
|
..
|
|
} = test_codex()
|
|
.with_config(|config| {
|
|
config.user_instructions = Some("be consistent and helpful".to_string());
|
|
config
|
|
.features
|
|
.enable(Feature::CollaborationModes)
|
|
.expect("test config should allow feature update");
|
|
})
|
|
.build(&server)
|
|
.await?;
|
|
|
|
let default_cwd = config.cwd.clone();
|
|
let default_approval_policy = config.permissions.approval_policy.value();
|
|
let default_sandbox_policy = config.permissions.sandbox_policy.get();
|
|
let default_model = session_configured.model;
|
|
let default_effort = config.model_reasoning_effort;
|
|
let default_summary = config.model_reasoning_summary;
|
|
|
|
codex
|
|
.submit(Op::UserTurn {
|
|
items: vec![UserInput::Text {
|
|
text: "hello 1".into(),
|
|
text_elements: Vec::new(),
|
|
}],
|
|
cwd: default_cwd.clone(),
|
|
approval_policy: default_approval_policy,
|
|
sandbox_policy: default_sandbox_policy.clone(),
|
|
model: default_model,
|
|
effort: default_effort,
|
|
summary: Some(default_summary.unwrap_or(ReasoningSummary::Auto)),
|
|
service_tier: None,
|
|
collaboration_mode: None,
|
|
final_output_json_schema: None,
|
|
personality: None,
|
|
})
|
|
.await?;
|
|
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;
|
|
|
|
codex
|
|
.submit(Op::UserTurn {
|
|
items: vec![UserInput::Text {
|
|
text: "hello 2".into(),
|
|
text_elements: Vec::new(),
|
|
}],
|
|
cwd: default_cwd.clone(),
|
|
approval_policy: AskForApproval::Never,
|
|
sandbox_policy: SandboxPolicy::DangerFullAccess,
|
|
model: "o3".to_string(),
|
|
effort: Some(ReasoningEffort::High),
|
|
summary: Some(ReasoningSummary::Detailed),
|
|
service_tier: None,
|
|
collaboration_mode: None,
|
|
final_output_json_schema: None,
|
|
personality: None,
|
|
})
|
|
.await?;
|
|
wait_for_event(&codex, |ev| matches!(ev, EventMsg::TurnComplete(_))).await;
|
|
|
|
let request1 = req1.single_request();
|
|
let request2 = req2.single_request();
|
|
let body1 = request1.body_json();
|
|
let body2 = request2.body_json();
|
|
|
|
let expected_permissions_msg = body1["input"][0].clone();
|
|
let expected_ui_msg = body1["input"][1].clone();
|
|
|
|
let shell = default_user_shell();
|
|
let expected_env_text_1 = expected_ui_msg["content"][1]["text"]
|
|
.as_str()
|
|
.expect("cached environment context text")
|
|
.to_string();
|
|
assert_default_env_context(&expected_env_text_1, &default_cwd.to_string_lossy(), &shell);
|
|
let expected_contextual_user_msg_1 = text_user_input_parts(vec![
|
|
expected_ui_msg["content"][0]["text"]
|
|
.as_str()
|
|
.expect("cached user instructions text")
|
|
.to_string(),
|
|
expected_env_text_1,
|
|
]);
|
|
let expected_user_message_1 = text_user_input("hello 1".to_string());
|
|
let expected_input_1 = serde_json::Value::Array(vec![
|
|
expected_permissions_msg.clone(),
|
|
expected_contextual_user_msg_1.clone(),
|
|
expected_user_message_1.clone(),
|
|
]);
|
|
assert_eq!(body1["input"], expected_input_1);
|
|
|
|
let body1_input = body1["input"].as_array().expect("input array");
|
|
let expected_settings_update_msg = body2["input"][body1_input.len()].clone();
|
|
assert_ne!(
|
|
expected_settings_update_msg, expected_permissions_msg,
|
|
"expected updated permissions message after policy change"
|
|
);
|
|
assert_eq!(
|
|
expected_settings_update_msg["role"].as_str(),
|
|
Some("developer")
|
|
);
|
|
assert!(
|
|
request2.has_message_with_input_texts("developer", |texts| {
|
|
texts.iter().any(|text| text.contains("<model_switch>"))
|
|
}),
|
|
"expected model switch section after model override: {expected_settings_update_msg:?}"
|
|
);
|
|
let expected_user_message_2 = text_user_input("hello 2".to_string());
|
|
let expected_input_2 = serde_json::Value::Array(vec![
|
|
expected_permissions_msg,
|
|
expected_contextual_user_msg_1,
|
|
expected_user_message_1,
|
|
expected_settings_update_msg,
|
|
expected_user_message_2,
|
|
]);
|
|
assert_eq!(body2["input"], expected_input_2);
|
|
|
|
Ok(())
|
|
}
|