Compare commits

1 Commit

Author: kevin zhao
SHA1: e7d263dce2
Message: reverting model api changes using aliases
Date: 2025-12-11 09:46:15 -08:00
7 changed files with 79 additions and 30 deletions

@@ -48,7 +48,7 @@ When you are running with `approval_policy == on-request`, and sandboxing enable
- You need to run a command that writes to a directory that requires it (e.g. running tests that write to /var)
- You need to run a GUI app (e.g., open/xdg-open/osascript) to open browsers or files.
- You are running sandboxed and need to run a command that requires network access (e.g. installing packages)
- If you run a command that is important to solving the user's query, but it fails because of sandboxing, rerun the command with approval. ALWAYS proceed to use the `sandbox_permissions` and `justification` parameters - do not message the user before requesting approval for the command.
- If you run a command that is important to solving the user's query, but it fails because of sandboxing, rerun the command with approval. ALWAYS proceed to use the `with_escalated_permissions` and `justification` parameters - do not message the user before requesting approval for the command.
- You are about to take a potentially destructive action such as an `rm` or `git reset` that the user did not explicitly ask for
- (for all of these, you should weigh alternative paths that do not require approval)
@@ -59,8 +59,8 @@ You will be told what filesystem sandboxing, network sandboxing, and approval mo
Although they introduce friction to the user because your work is paused until the user responds, you should leverage them when necessary to accomplish important work. If the completing the task requires escalated permissions, Do not let these settings or the sandbox deter you from attempting to accomplish the user's task unless it is set to "never", in which case never ask for approvals.
When requesting approval to execute a command that will require escalated privileges:
- Provide the `sandbox_permissions` parameter with the value `"require_escalated"`
- Include a short, 1 sentence explanation for why you need escalated permissions in the justification parameter
- Provide the `with_escalated_permissions` parameter with the boolean value true
- Include a short, 1 sentence explanation for why you need to enable `with_escalated_permissions` in the justification parameter
## Special user requests

@@ -182,7 +182,7 @@ When you are running with `approval_policy == on-request`, and sandboxing enable
- You need to run a command that writes to a directory that requires it (e.g. running tests that write to /var)
- You need to run a GUI app (e.g., open/xdg-open/osascript) to open browsers or files.
- You are running sandboxed and need to run a command that requires network access (e.g. installing packages)
- If you run a command that is important to solving the user's query, but it fails because of sandboxing, rerun the command with approval. ALWAYS proceed to use the `sandbox_permissions` and `justification` parameters. Within this harness, prefer requesting approval via the tool over asking in natural language.
- If you run a command that is important to solving the user's query, but it fails because of sandboxing, rerun the command with approval. ALWAYS proceed to use the `with_escalated_permissions` and `justification` parameters. Within this harness, prefer requesting approval via the tool over asking in natural language.
- You are about to take a potentially destructive action such as an `rm` or `git reset` that the user did not explicitly ask for
- (for all of these, you should weigh alternative paths that do not require approval)
@@ -193,8 +193,8 @@ You will be told what filesystem sandboxing, network sandboxing, and approval mo
Although they introduce friction to the user because your work is paused until the user responds, you should leverage them when necessary to accomplish important work. If the completing the task requires escalated permissions, Do not let these settings or the sandbox deter you from attempting to accomplish the user's task unless it is set to "never", in which case never ask for approvals.
When requesting approval to execute a command that will require escalated privileges:
- Provide the `sandbox_permissions` parameter with the value `"require_escalated"`
- Include a short, 1 sentence explanation for why you need escalated permissions in the justification parameter
- Provide the `with_escalated_permissions` parameter with the boolean value true
- Include a short, 1 sentence explanation for why you need to enable `with_escalated_permissions` in the justification parameter
## Validating your work

@@ -183,7 +183,7 @@ When you are running with `approval_policy == on-request`, and sandboxing enable
- You need to run a command that writes to a directory that requires it (e.g. running tests that write to /var)
- You need to run a GUI app (e.g., open/xdg-open/osascript) to open browsers or files.
- You are running sandboxed and need to run a command that requires network access (e.g. installing packages)
- If you run a command that is important to solving the user's query, but it fails because of sandboxing, rerun the command with approval. ALWAYS proceed to use the `sandbox_permissions` and `justification` parameters - do not message the user before requesting approval for the command.
- If you run a command that is important to solving the user's query, but it fails because of sandboxing, rerun the command with approval. ALWAYS proceed to use the `with_escalated_permissions` and `justification` parameters - do not message the user before requesting approval for the command.
- You are about to take a potentially destructive action such as an `rm` or `git reset` that the user did not explicitly ask for
- (for all of these, you should weigh alternative paths that do not require approval)
@@ -194,8 +194,8 @@ You will be told what filesystem sandboxing, network sandboxing, and approval mo
Although they introduce friction to the user because your work is paused until the user responds, you should leverage them when necessary to accomplish important work. If the completing the task requires escalated permissions, Do not let these settings or the sandbox deter you from attempting to accomplish the user's task unless it is set to "never", in which case never ask for approvals.
When requesting approval to execute a command that will require escalated privileges:
- Provide the `sandbox_permissions` parameter with the value `"require_escalated"`
- Include a short, 1 sentence explanation for why you need escalated permissions in the justification parameter
- Provide the `with_escalated_permissions` parameter with the boolean value true
- Include a short, 1 sentence explanation for why you need to enable `with_escalated_permissions` in the justification parameter
## Validating your work

@@ -48,7 +48,7 @@ When you are running with `approval_policy == on-request`, and sandboxing enable
- You need to run a command that writes to a directory that requires it (e.g. running tests that write to /var)
- You need to run a GUI app (e.g., open/xdg-open/osascript) to open browsers or files.
- You are running sandboxed and need to run a command that requires network access (e.g. installing packages)
- If you run a command that is important to solving the user's query, but it fails because of sandboxing, rerun the command with approval. ALWAYS proceed to use the `sandbox_permissions` and `justification` parameters - do not message the user before requesting approval for the command.
- If you run a command that is important to solving the user's query, but it fails because of sandboxing, rerun the command with approval. ALWAYS proceed to use the `with_escalated_permissions` and `justification` parameters - do not message the user before requesting approval for the command.
- You are about to take a potentially destructive action such as an `rm` or `git reset` that the user did not explicitly ask for
- (for all of these, you should weigh alternative paths that do not require approval)
@@ -59,8 +59,8 @@ You will be told what filesystem sandboxing, network sandboxing, and approval mo
Although they introduce friction to the user because your work is paused until the user responds, you should leverage them when necessary to accomplish important work. If the completing the task requires escalated permissions, Do not let these settings or the sandbox deter you from attempting to accomplish the user's task unless it is set to "never", in which case never ask for approvals.
When requesting approval to execute a command that will require escalated privileges:
- Provide the `sandbox_permissions` parameter with the value `"require_escalated"`
- Include a short, 1 sentence explanation for why you need escalated permissions in the justification parameter
- Provide the `with_escalated_permissions` parameter with the boolean value true
- Include a short, 1 sentence explanation for why you need to enable `with_escalated_permissions` in the justification parameter
## Special user requests

@@ -40,7 +40,7 @@ struct ExecCommandArgs {
yield_time_ms: u64,
#[serde(default)]
max_output_tokens: Option<usize>,
#[serde(default)]
#[serde(default, alias = "with_escalated_permissions")]
sandbox_permissions: SandboxPermissions,
#[serde(default)]
justification: Option<String>,
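
Note on the hunk above: the added `alias = "with_escalated_permissions"` lets the `sandbox_permissions` field be filled from either key. The following is a minimal, standard-library-only sketch of that idea, not the serde implementation. The helper `resolve_aliased` and the string-valued map are hypothetical, and rejecting payloads that supply both names is an assumption of this sketch.

```rust
use std::collections::HashMap;

/// Hypothetical helper: look up a field that may arrive under its primary
/// name or under a legacy alias. Supplying both is treated as an error
/// (an assumption made for this sketch).
fn resolve_aliased<'a>(
    args: &'a HashMap<String, String>,
    primary: &str,
    alias: &str,
) -> Result<Option<&'a str>, String> {
    match (args.get(primary), args.get(alias)) {
        // Both spellings present: ambiguous, so refuse.
        (Some(_), Some(_)) => Err(format!("duplicate field `{primary}`")),
        // Exactly one spelling present: use its value.
        (Some(v), None) | (None, Some(v)) => Ok(Some(v.as_str())),
        // Neither present: the caller falls back to its default.
        (None, None) => Ok(None),
    }
}
```

With this shape, a caller that only knows the old key and a caller that only knows the new key both resolve to the same field, which is the behavior the alias is meant to preserve.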

@@ -175,10 +175,10 @@ fn create_exec_command_tool() -> ToolSpec {
},
);
properties.insert(
"sandbox_permissions".to_string(),
JsonSchema::String {
"with_escalated_permissions".to_string(),
JsonSchema::Boolean {
description: Some(
"Sandbox permissions for the command. Set to \"require_escalated\" to request running without sandbox restrictions; defaults to \"use_default\"."
"Whether to request escalated permissions. Set to true if command needs to be run without sandbox restrictions"
.to_string(),
),
},
@@ -187,7 +187,7 @@ fn create_exec_command_tool() -> ToolSpec {
"justification".to_string(),
JsonSchema::String {
description: Some(
"Only set if sandbox_permissions is \"require_escalated\". 1-sentence explanation of why we want to run this command."
"Only set if with_escalated_permissions is true. 1-sentence explanation of why we want to run this command."
.to_string(),
),
},
@@ -275,15 +275,15 @@ fn create_shell_tool() -> ToolSpec {
);
properties.insert(
"sandbox_permissions".to_string(),
JsonSchema::String {
description: Some("Sandbox permissions for the command. Set to \"require_escalated\" to request running without sandbox restrictions; defaults to \"use_default\".".to_string()),
"with_escalated_permissions".to_string(),
JsonSchema::Boolean {
description: Some("Whether to request escalated permissions. Set to true if command needs to be run without sandbox restrictions".to_string()),
},
);
properties.insert(
"justification".to_string(),
JsonSchema::String {
description: Some("Only set if sandbox_permissions is \"require_escalated\". 1-sentence explanation of why we want to run this command.".to_string()),
description: Some("Only set if with_escalated_permissions is true. 1-sentence explanation of why we want to run this command.".to_string()),
},
);
@@ -348,15 +348,15 @@ fn create_shell_command_tool() -> ToolSpec {
},
);
properties.insert(
"sandbox_permissions".to_string(),
JsonSchema::String {
description: Some("Sandbox permissions for the command. Set to \"require_escalated\" to request running without sandbox restrictions; defaults to \"use_default\".".to_string()),
"with_escalated_permissions".to_string(),
JsonSchema::Boolean {
description: Some("Whether to request escalated permissions. Set to true if command needs to be run without sandbox restrictions".to_string()),
},
);
properties.insert(
"justification".to_string(),
JsonSchema::String {
description: Some("Only set if sandbox_permissions is \"require_escalated\". 1-sentence explanation of why we want to run this command.".to_string()),
description: Some("Only set if with_escalated_permissions is true. 1-sentence explanation of why we want to run this command.".to_string()),
},
);
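
Note on the schema text above: the `justification` description says it should only be set when `with_escalated_permissions` is true, and the prompt text asks for a one-sentence justification with every escalated request. A rough sketch of that pairing as a check is shown below. `EscalationArgs` and `check_escalation_args` are hypothetical names, and nothing in the diff indicates that the harness enforces this rule.

```rust
/// Hypothetical view of the two escalation-related tool arguments.
struct EscalationArgs {
    with_escalated_permissions: bool,
    justification: Option<String>,
}

/// Checks the pairing described in the schema text: a non-empty
/// justification accompanies an escalated request and is omitted otherwise.
fn check_escalation_args(args: &EscalationArgs) -> Result<(), String> {
    match (args.with_escalated_permissions, args.justification.as_deref()) {
        (true, Some(text)) if !text.trim().is_empty() => Ok(()),
        (true, _) => Err("escalated request is missing a justification".to_string()),
        (false, Some(_)) => {
            Err("justification set without with_escalated_permissions".to_string())
        }
        (false, None) => Ok(()),
    }
}
```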

@@ -6,6 +6,7 @@ use mcp_types::ContentBlock;
use serde::Deserialize;
use serde::Deserializer;
use serde::Serialize;
use serde::de::Error as DeError;
use serde::ser::Serializer;
use ts_rs::TS;
@@ -15,9 +16,7 @@ use codex_utils_image::error::ImageProcessingError;
use schemars::JsonSchema;
/// Controls whether a command should use the session sandbox or bypass it.
#[derive(
Debug, Clone, Copy, Default, Eq, Hash, PartialEq, Serialize, Deserialize, JsonSchema, TS,
)]
#[derive(Debug, Clone, Copy, Default, Eq, Hash, PartialEq, Serialize, JsonSchema, TS)]
#[serde(rename_all = "snake_case")]
pub enum SandboxPermissions {
/// Run with the configured sandbox
@@ -33,6 +32,33 @@ impl SandboxPermissions {
}
}
impl<'de> Deserialize<'de> for SandboxPermissions {
fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where
D: Deserializer<'de>,
{
#[derive(Deserialize)]
#[serde(untagged)]
enum Input {
Bool(bool),
String(String),
}
match Input::deserialize(deserializer)? {
Input::Bool(true) => Ok(SandboxPermissions::RequireEscalated),
Input::Bool(false) => Ok(SandboxPermissions::UseDefault),
Input::String(value) => match value.as_str() {
"use_default" => Ok(SandboxPermissions::UseDefault),
"require_escalated" => Ok(SandboxPermissions::RequireEscalated),
other => Err(DeError::unknown_variant(
other,
&["use_default", "require_escalated", "true", "false"],
)),
},
}
}
}
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, JsonSchema, TS)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum ResponseInputItem {
@@ -347,7 +373,11 @@ pub struct ShellToolCallParams {
/// This is the maximum time in milliseconds that the command is allowed to run.
#[serde(alias = "timeout")]
pub timeout_ms: Option<u64>,
#[serde(default, skip_serializing_if = "Option::is_none")]
#[serde(
default,
alias = "with_escalated_permissions",
skip_serializing_if = "Option::is_none"
)]
#[ts(optional)]
pub sandbox_permissions: Option<SandboxPermissions>,
#[serde(skip_serializing_if = "Option::is_none")]
@@ -367,7 +397,11 @@ pub struct ShellCommandToolCallParams {
/// This is the maximum time in milliseconds that the command is allowed to run.
#[serde(alias = "timeout")]
pub timeout_ms: Option<u64>,
#[serde(default, skip_serializing_if = "Option::is_none")]
#[serde(
default,
alias = "with_escalated_permissions",
skip_serializing_if = "Option::is_none"
)]
#[ts(optional)]
pub sandbox_permissions: Option<SandboxPermissions>,
#[serde(skip_serializing_if = "Option::is_none")]
@@ -772,6 +806,21 @@ mod tests {
Ok(())
}
#[test]
fn deserialize_shell_tool_call_params_with_alias() -> Result<()> {
let json = r#"{
"command": ["ls"],
"with_escalated_permissions": true
}"#;
let params: ShellToolCallParams = serde_json::from_str(json)?;
assert_eq!(
Some(SandboxPermissions::RequireEscalated),
params.sandbox_permissions
);
Ok(())
}
#[test]
fn local_image_read_error_adds_placeholder() -> Result<()> {
let dir = tempdir()?;
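
Note on the custom `Deserialize` implementation for `SandboxPermissions` in this diff: it accepts either a boolean or one of two strings and maps both onto the same enum. The sketch below reproduces only that mapping with the standard library, so it can be read without serde. The free function `from_input` and the `String` error type are hypothetical, and marking `UseDefault` as the default variant is an assumption based on the old schema text ("defaults to \"use_default\"").

```rust
/// Mirrors the two variants named in the diff.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
enum SandboxPermissions {
    /// Assumed default, per the schema description.
    #[default]
    UseDefault,
    RequireEscalated,
}

/// Stand-in for the untagged `Input` enum in the diff: the value is either
/// a boolean (legacy `with_escalated_permissions`) or a string.
enum Input {
    Bool(bool),
    String(String),
}

/// Hypothetical free function with the same match arms as the diff.
fn from_input(input: Input) -> Result<SandboxPermissions, String> {
    match input {
        Input::Bool(true) => Ok(SandboxPermissions::RequireEscalated),
        Input::Bool(false) => Ok(SandboxPermissions::UseDefault),
        Input::String(value) => match value.as_str() {
            "use_default" => Ok(SandboxPermissions::UseDefault),
            "require_escalated" => Ok(SandboxPermissions::RequireEscalated),
            other => Err(format!("unknown variant `{other}`")),
        },
    }
}
```

One detail worth noticing: in this sketch, as in the match arms of the diff, the string `"true"` is rejected and only the boolean `true` maps to `RequireEscalated`, even though the diff's error message lists `"true"` and `"false"` among the expected values.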