PR #1705: fix: run apply_patch calls through the sandbox

URL: https://github.com/openai/codex/pull/1705
Author: bolinfest
Created: 2025-07-28 07:04:56 UTC
Updated: 2025-07-30 23:45:17 UTC
Changes: +255/-95, Files changed: 8, Commits: 1

Description

Building on the work of https://github.com/openai/codex/pull/1702, this changes how a shell call to apply_patch is handled.

Previously, a shell call to apply_patch was always handled in-process, never leveraging a sandbox. To determine whether the apply_patch operation could be auto-approved, the is_write_patch_constrained_to_writable_paths() function would check if all the paths listed in the paths were writable. If so, the agent would apply the changes listed in the patch.

Unfortunately, this approach afforded a loophole: symlinks!

For a soft link, we could fix this issue by tracing the link and checking whether the target is in the set of writable paths, however...
...For a hard link, things are not as simple. We can run stat FILE to see if the number of links is greater than 1, but then we would have to do something potentially expensive like find . -inum <inode_number> to find the other paths for FILE. Further, even if this worked, this approach runs the risk of a TOCTOU race condition, so it is not robust.

The solution, implemented in this PR, is to take the virtual execution of the apply_patch CLI into an actual execution using codex --codex-run-as-apply-patch PATCH, which we can run under the sandbox the user specified, just like any other shell call.

This, of course, assumes that the sandbox prevents writing through symlinks as a mechanism to write to folders that are not in the writable set configured by the sandbox. I verified this by testing the following on both Mac and Linux:

#!/usr/bin/env bash
set -euo pipefail

# Can running a command in SANDBOX_DIR write a file in EXPLOIT_DIR?

# Codex is run in SANDBOX_DIR, so writes should be constrianed to this directory.
SANDBOX_DIR=$(mktemp -d -p "$HOME" sandboxtesttemp.XXXXXX)
# EXPLOIT_DIR is outside of SANDBOX_DIR, so let's see if we can write to it.
EXPLOIT_DIR=$(mktemp -d -p "$HOME" sandboxtesttemp.XXXXXX)

echo "SANDBOX_DIR: $SANDBOX_DIR"
echo "EXPLOIT_DIR: $EXPLOIT_DIR"

cleanup() {
  # Only remove if it looks sane and still exists
  [[ -n "${SANDBOX_DIR:-}" && -d "$SANDBOX_DIR" ]] && rm -rf -- "$SANDBOX_DIR"
  [[ -n "${EXPLOIT_DIR:-}" && -d "$EXPLOIT_DIR" ]] && rm -rf -- "$EXPLOIT_DIR"
}

trap cleanup EXIT

echo "I am the original content" > "${EXPLOIT_DIR}/original.txt"

# Drop the -s to test hard links.
ln -s "${EXPLOIT_DIR}/original.txt" "${SANDBOX_DIR}/link-to-original.txt"

cat "${SANDBOX_DIR}/link-to-original.txt"

if [[ "$(uname)" == "Linux" ]]; then
    SANDBOX_SUBCOMMAND=landlock
else
    SANDBOX_SUBCOMMAND=seatbelt
fi

# Attempt the exploit
cd "${SANDBOX_DIR}"

codex debug "${SANDBOX_SUBCOMMAND}" bash -lc "echo pwned > ./link-to-original.txt" || true

cat "${EXPLOIT_DIR}/original.txt"

Admittedly, this change merits a proper integration test, but I think I will have to do that in a follow-up PR.

Full Diff

diff --git a/codex-rs/apply-patch/src/lib.rs b/codex-rs/apply-patch/src/lib.rs
index c81241d0da..8d42be9a92 100644
--- a/codex-rs/apply-patch/src/lib.rs
+++ b/codex-rs/apply-patch/src/lib.rs
@@ -58,16 +58,24 @@ impl PartialEq for IoError {
 
 #[derive(Debug, PartialEq)]
 pub enum MaybeApplyPatch {
-    Body(Vec<Hunk>),
+    Body(ApplyPatchArgs),
     ShellParseError(ExtractHeredocError),
     PatchParseError(ParseError),
     NotApplyPatch,
 }
 
+/// Both the raw PATCH argument to `apply_patch` as well as the PATCH argument
+/// parsed into hunks.
+#[derive(Debug, PartialEq)]
+pub struct ApplyPatchArgs {
+    pub patch: String,
+    pub hunks: Vec<Hunk>,
+}
+
 pub fn maybe_parse_apply_patch(argv: &[String]) -> MaybeApplyPatch {
     match argv {
         [cmd, body] if cmd == "apply_patch" => match parse_patch(body) {
-            Ok(hunks) => MaybeApplyPatch::Body(hunks),
+            Ok(source) => MaybeApplyPatch::Body(source),
             Err(e) => MaybeApplyPatch::PatchParseError(e),
         },
         [bash, flag, script]
@@ -77,7 +85,7 @@ pub fn maybe_parse_apply_patch(argv: &[String]) -> MaybeApplyPatch {
         {
             match extract_heredoc_body_from_apply_patch_command(script) {
                 Ok(body) => match parse_patch(&body) {
-                    Ok(hunks) => MaybeApplyPatch::Body(hunks),
+                    Ok(source) => MaybeApplyPatch::Body(source),
                     Err(e) => MaybeApplyPatch::PatchParseError(e),
                 },
                 Err(e) => MaybeApplyPatch::ShellParseError(e),
@@ -116,11 +124,19 @@ pub enum MaybeApplyPatchVerified {
     NotApplyPatch,
 }
 
-#[derive(Debug, PartialEq)]
 /// ApplyPatchAction is the result of parsing an `apply_patch` command. By
 /// construction, all paths should be absolute paths.
+#[derive(Debug, PartialEq)]
 pub struct ApplyPatchAction {
     changes: HashMap<PathBuf, ApplyPatchFileChange>,
+
+    /// The raw patch argument that can be used with `apply_patch` as an exec
+    /// call. i.e., if the original arg was parsed in "lenient" mode with a
+    /// heredoc, this should be the value without the heredoc wrapper.
+    pub patch: String,
+
+    /// The working directory that was used to resolve relative paths in the patch.
+    pub cwd: PathBuf,
 }
 
 impl ApplyPatchAction {
@@ -140,8 +156,28 @@ impl ApplyPatchAction {
             panic!("path must be absolute");
         }
 
+        #[allow(clippy::expect_used)]
+        let filename = path
+            .file_name()
+            .expect("path should not be empty")
+            .to_string_lossy();
+        let patch = format!(
+            r#"*** Begin Patch
+*** Update File: {filename}
+@@
++ {content}
+*** End Patch"#,
+        );
         let changes = HashMap::from([(path.to_path_buf(), ApplyPatchFileChange::Add { content })]);
-        Self { changes }
+        #[allow(clippy::expect_used)]
+        Self {
+            changes,
+            cwd: path
+                .parent()
+                .expect("path should have parent")
+                .to_path_buf(),
+            patch,
+        }
     }
 }
 
@@ -149,7 +185,7 @@ impl ApplyPatchAction {
 /// patch.
 pub fn maybe_parse_apply_patch_verified(argv: &[String], cwd: &Path) -> MaybeApplyPatchVerified {
     match maybe_parse_apply_patch(argv) {
-        MaybeApplyPatch::Body(hunks) => {
+        MaybeApplyPatch::Body(ApplyPatchArgs { patch, hunks }) => {
             let mut changes = HashMap::new();
             for hunk in hunks {
                 let path = hunk.resolve_path(cwd);
@@ -183,7 +219,11 @@ pub fn maybe_parse_apply_patch_verified(argv: &[String], cwd: &Path) -> MaybeApp
                     }
                 }
             }
-            MaybeApplyPatchVerified::Body(ApplyPatchAction { changes })
+            MaybeApplyPatchVerified::Body(ApplyPatchAction {
+                changes,
+                patch,
+                cwd: cwd.to_path_buf(),
+            })
         }
         MaybeApplyPatch::ShellParseError(e) => MaybeApplyPatchVerified::ShellParseError(e),
         MaybeApplyPatch::PatchParseError(e) => MaybeApplyPatchVerified::CorrectnessError(e.into()),
@@ -264,7 +304,7 @@ pub fn apply_patch(
     stderr: &mut impl std::io::Write,
 ) -> Result<(), ApplyPatchError> {
     let hunks = match parse_patch(patch) {
-        Ok(hunks) => hunks,
+        Ok(source) => source.hunks,
         Err(e) => {
             match &e {
                 InvalidPatchError(message) => {
@@ -652,7 +692,7 @@ mod tests {
         ]);
 
         match maybe_parse_apply_patch(&args) {
-            MaybeApplyPatch::Body(hunks) => {
+            MaybeApplyPatch::Body(ApplyPatchArgs { hunks, patch: _ }) => {
                 assert_eq!(
                     hunks,
                     vec![Hunk::AddFile {
@@ -679,7 +719,7 @@ PATCH"#,
         ]);
 
         match maybe_parse_apply_patch(&args) {
-            MaybeApplyPatch::Body(hunks) => {
+            MaybeApplyPatch::Body(ApplyPatchArgs { hunks, patch: _ }) => {
                 assert_eq!(
                     hunks,
                     vec![Hunk::AddFile {
@@ -954,7 +994,7 @@ PATCH"#,
         ));
         let patch = parse_patch(&patch).unwrap();
 
-        let update_file_chunks = match patch.as_slice() {
+        let update_file_chunks = match patch.hunks.as_slice() {
             [Hunk::UpdateFile { chunks, .. }] => chunks,
             _ => panic!("Expected a single UpdateFile hunk"),
         };
@@ -992,7 +1032,7 @@ PATCH"#,
         ));
 
         let patch = parse_patch(&patch).unwrap();
-        let chunks = match patch.as_slice() {
+        let chunks = match patch.hunks.as_slice() {
             [Hunk::UpdateFile { chunks, .. }] => chunks,
             _ => panic!("Expected a single UpdateFile hunk"),
         };
@@ -1029,7 +1069,7 @@ PATCH"#,
         ));
 
         let patch = parse_patch(&patch).unwrap();
-        let chunks = match patch.as_slice() {
+        let chunks = match patch.hunks.as_slice() {
             [Hunk::UpdateFile { chunks, .. }] => chunks,
             _ => panic!("Expected a single UpdateFile hunk"),
         };
@@ -1064,7 +1104,7 @@ PATCH"#,
         ));
 
         let patch = parse_patch(&patch).unwrap();
-        let chunks = match patch.as_slice() {
+        let chunks = match patch.hunks.as_slice() {
             [Hunk::UpdateFile { chunks, .. }] => chunks,
             _ => panic!("Expected a single UpdateFile hunk"),
         };
@@ -1110,7 +1150,7 @@ PATCH"#,
 
         // Extract chunks then build the unified diff.
         let parsed = parse_patch(&patch).unwrap();
-        let chunks = match parsed.as_slice() {
+        let chunks = match parsed.hunks.as_slice() {
             [Hunk::UpdateFile { chunks, .. }] => chunks,
             _ => panic!("Expected a single UpdateFile hunk"),
         };
@@ -1193,6 +1233,8 @@ g
                         new_content: "updated session directory content\n".to_string(),
                     },
                 )]),
+                patch: argv[1].clone(),
+                cwd: session_dir.path().to_path_buf(),
             })
         );
     }
diff --git a/codex-rs/apply-patch/src/parser.rs b/codex-rs/apply-patch/src/parser.rs
index d07691a49d..44c5b14619 100644
--- a/codex-rs/apply-patch/src/parser.rs
+++ b/codex-rs/apply-patch/src/parser.rs
@@ -22,6 +22,7 @@
 //!
 //! The parser below is a little more lenient than the explicit spec and allows for
 //! leading/trailing whitespace around patch markers.
+use crate::ApplyPatchArgs;
 use std::path::Path;
 use std::path::PathBuf;
 
@@ -102,7 +103,7 @@ pub struct UpdateFileChunk {
     pub is_end_of_file: bool,
 }
 
-pub fn parse_patch(patch: &str) -> Result<Vec<Hunk>, ParseError> {
+pub fn parse_patch(patch: &str) -> Result<ApplyPatchArgs, ParseError> {
     let mode = if PARSE_IN_STRICT_MODE {
         ParseMode::Strict
     } else {
@@ -150,7 +151,7 @@ enum ParseMode {
     Lenient,
 }
 
-fn parse_patch_text(patch: &str, mode: ParseMode) -> Result<Vec<Hunk>, ParseError> {
+fn parse_patch_text(patch: &str, mode: ParseMode) -> Result<ApplyPatchArgs, ParseError> {
     let lines: Vec<&str> = patch.trim().lines().collect();
     let lines: &[&str] = match check_patch_boundaries_strict(&lines) {
         Ok(()) => &lines,
@@ -173,7 +174,8 @@ fn parse_patch_text(patch: &str, mode: ParseMode) -> Result<Vec<Hunk>, ParseErro
         line_number += hunk_lines;
         remaining_lines = &remaining_lines[hunk_lines..]
     }
-    Ok(hunks)
+    let patch = lines.join("\n");
+    Ok(ApplyPatchArgs { hunks, patch })
 }
 
 /// Checks the start and end lines of the patch text for `apply_patch`,
@@ -425,6 +427,7 @@ fn parse_update_file_chunk(
 }
 
 #[test]
+#[allow(clippy::unwrap_used)]
 fn test_parse_patch() {
     assert_eq!(
         parse_patch_text("bad", ParseMode::Strict),
@@ -455,8 +458,10 @@ fn test_parse_patch() {
             "*** Begin Patch\n\
              *** End Patch",
             ParseMode::Strict
-        ),
-        Ok(Vec::new())
+        )
+        .unwrap()
+        .hunks,
+        Vec::new()
     );
     assert_eq!(
         parse_patch_text(
@@ -472,8 +477,10 @@ fn test_parse_patch() {
              +    return 123\n\
              *** End Patch",
             ParseMode::Strict
-        ),
-        Ok(vec![
+        )
+        .unwrap()
+        .hunks,
+        vec![
             AddFile {
                 path: PathBuf::from("path/add.py"),
                 contents: "abc\ndef\n".to_string()
@@ -491,7 +498,7 @@ fn test_parse_patch() {
                     is_end_of_file: false
                 }]
             }
-        ])
+        ]
     );
     // Update hunk followed by another hunk (Add File).
     assert_eq!(
@@ -504,8 +511,10 @@ fn test_parse_patch() {
              +content\n\
              *** End Patch",
             ParseMode::Strict
-        ),
-        Ok(vec![
+        )
+        .unwrap()
+        .hunks,
+        vec![
             UpdateFile {
                 path: PathBuf::from("file.py"),
                 move_path: None,
@@ -520,7 +529,7 @@ fn test_parse_patch() {
                 path: PathBuf::from("other.py"),
                 contents: "content\n".to_string()
             }
-        ])
+        ]
     );
 
     // Update hunk without an explicit @@ header for the first chunk should parse.
@@ -533,8 +542,10 @@ fn test_parse_patch() {
 +bar
 *** End Patch"#,
             ParseMode::Strict
-        ),
-        Ok(vec![UpdateFile {
+        )
+        .unwrap()
+        .hunks,
+        vec![UpdateFile {
             path: PathBuf::from("file2.py"),
             move_path: None,
             chunks: vec![UpdateFileChunk {
@@ -543,7 +554,7 @@ fn test_parse_patch() {
                 new_lines: vec!["import foo".to_string(), "bar".to_string()],
                 is_end_of_file: false,
             }],
-        }])
+        }]
     );
 }
 
@@ -574,7 +585,10 @@ fn test_parse_patch_lenient() {
     );
     assert_eq!(
         parse_patch_text(&patch_text_in_heredoc, ParseMode::Lenient),
-        Ok(expected_patch.clone())
+        Ok(ApplyPatchArgs {
+            hunks: expected_patch.clone(),
+            patch: patch_text.to_string()
+        })
     );
 
     let patch_text_in_single_quoted_heredoc = format!("<<'EOF'\n{patch_text}\nEOF\n");
@@ -584,7 +598,10 @@ fn test_parse_patch_lenient() {
     );
     assert_eq!(
         parse_patch_text(&patch_text_in_single_quoted_heredoc, ParseMode::Lenient),
-        Ok(expected_patch.clone())
+        Ok(ApplyPatchArgs {
+            hunks: expected_patch.clone(),
+            patch: patch_text.to_string()
+        })
     );
 
     let patch_text_in_double_quoted_heredoc = format!("<<\"EOF\"\n{patch_text}\nEOF\n");
@@ -594,7 +611,10 @@ fn test_parse_patch_lenient() {
     );
     assert_eq!(
         parse_patch_text(&patch_text_in_double_quoted_heredoc, ParseMode::Lenient),
-        Ok(expected_patch.clone())
+        Ok(ApplyPatchArgs {
+            hunks: expected_patch.clone(),
+            patch: patch_text.to_string()
+        })
     );
 
     let patch_text_in_mismatched_quotes_heredoc = format!("<<\"EOF'\n{patch_text}\nEOF\n");
diff --git a/codex-rs/arg0/src/lib.rs b/codex-rs/arg0/src/lib.rs
index d7109176a5..c097ebc11c 100644
--- a/codex-rs/arg0/src/lib.rs
+++ b/codex-rs/arg0/src/lib.rs
@@ -2,6 +2,8 @@ use std::future::Future;
 use std::path::Path;
 use std::path::PathBuf;
 
+use codex_core::CODEX_APPLY_PATCH_ARG1;
+
 /// While we want to deploy the Codex CLI as a single executable for simplicity,
 /// we also want to expose some of its functionality as distinct CLIs, so we use
 /// the "arg0 trick" to determine which CLI to dispatch. This effectively allows
@@ -43,7 +45,7 @@ where
     }
 
     let argv1 = args.next().unwrap_or_default();
-    if argv1 == "--codex-run-as-apply-patch" {
+    if argv1 == CODEX_APPLY_PATCH_ARG1 {
         let patch_arg = args.next().and_then(|s| s.to_str().map(|s| s.to_owned()));
         let exit_code = match patch_arg {
             Some(patch_arg) => {
@@ -55,7 +57,7 @@ where
                 }
             }
             None => {
-                eprintln!("Error: --codex-run-as-apply-patch requires a UTF-8 PATCH argument.");
+                eprintln!("Error: {CODEX_APPLY_PATCH_ARG1} requires a UTF-8 PATCH argument.");
                 1
             }
         };
diff --git a/codex-rs/core/src/apply_patch.rs b/codex-rs/core/src/apply_patch.rs
index 44af72c746..f116c790ab 100644
--- a/codex-rs/core/src/apply_patch.rs
+++ b/codex-rs/core/src/apply_patch.rs
@@ -18,12 +18,34 @@ use std::collections::HashMap;
 use std::path::Path;
 use std::path::PathBuf;
 
+pub const CODEX_APPLY_PATCH_ARG1: &str = "--codex-run-as-apply-patch";
+
+pub(crate) enum InternalApplyPatchInvocation {
+    /// The `apply_patch` call was handled programmatically, without any sort
+    /// of sandbox, because the user explicitly approved it. This is the
+    /// result to use with the `shell` function call that contained `apply_patch`.
+    Output(ResponseInputItem),
+
+    /// The `apply_patch` call was auto-approved, which means that, on the
+    /// surface, it appears to be safe, but it should be run in a sandbox if the
+    /// user has configured one because a path being written could be a hard
+    /// link to a file outside the writable folders, so only the sandbox can
+    /// faithfully prevent the write in that case.
+    DelegateToExec(ApplyPatchAction),
+}
+
+impl From<ResponseInputItem> for InternalApplyPatchInvocation {
+    fn from(item: ResponseInputItem) -> Self {
+        InternalApplyPatchInvocation::Output(item)
+    }
+}
+
 pub(crate) async fn apply_patch(
     sess: &Session,
-    sub_id: String,
-    call_id: String,
+    sub_id: &str,
+    call_id: &str,
     action: ApplyPatchAction,
-) -> ResponseInputItem {
+) -> InternalApplyPatchInvocation {
     let writable_roots_snapshot = {
         #[allow(clippy::unwrap_used)]
         let guard = sess.writable_roots.lock().unwrap();
@@ -36,34 +58,38 @@ pub(crate) async fn apply_patch(
         &writable_roots_snapshot,
         &sess.cwd,
     ) {
-        SafetyCheck::AutoApprove { .. } => true,
+        SafetyCheck::AutoApprove { .. } => {
+            return InternalApplyPatchInvocation::DelegateToExec(action);
+        }
         SafetyCheck::AskUser => {
             // Compute a readable summary of path changes to include in the
             // approval request so the user can make an informed decision.
             let rx_approve = sess
-                .request_patch_approval(sub_id.clone(), call_id.clone(), &action, None, None)
+                .request_patch_approval(sub_id.to_owned(), call_id.to_owned(), &action, None, None)
                 .await;
             match rx_approve.await.unwrap_or_default() {
                 ReviewDecision::Approved | ReviewDecision::ApprovedForSession => false,
                 ReviewDecision::Denied | ReviewDecision::Abort => {
                     return ResponseInputItem::FunctionCallOutput {
-                        call_id,
+                        call_id: call_id.to_owned(),
                         output: FunctionCallOutputPayload {
                             content: "patch rejected by user".to_string(),
                             success: Some(false),
                         },
-                    };
+                    }
+                    .into();
                 }
             }
         }
         SafetyCheck::Reject { reason } => {
             return ResponseInputItem::FunctionCallOutput {
-                call_id,
+                call_id: call_id.to_owned(),
                 output: FunctionCallOutputPayload {
                     content: format!("patch rejected: {reason}"),
                     success: Some(false),
                 },
-            };
+            }
+            .into();
         }
     };
 
@@ -83,8 +109,8 @@ pub(crate) async fn apply_patch(
 
         let rx = sess
             .request_patch_approval(
-                sub_id.clone(),
-                call_id.clone(),
+                sub_id.to_owned(),
+                call_id.to_owned(),
                 &action,
                 reason.clone(),
                 Some(root.clone()),
@@ -96,12 +122,13 @@ pub(crate) async fn apply_patch(
             ReviewDecision::Approved | ReviewDecision::ApprovedForSession
         ) {
             return ResponseInputItem::FunctionCallOutput {
-                call_id,
+                call_id: call_id.to_owned(),
                 output: FunctionCallOutputPayload {
                     content: "patch rejected by user".to_string(),
                     success: Some(false),
                 },
-            };
+            }
+            .into();
         }
 
         // user approved, extend writable roots for this session
@@ -112,9 +139,9 @@ pub(crate) async fn apply_patch(
     let _ = sess
         .tx_event
         .send(Event {
-            id: sub_id.clone(),
+            id: sub_id.to_owned(),
             msg: EventMsg::PatchApplyBegin(PatchApplyBeginEvent {
-                call_id: call_id.clone(),
+                call_id: call_id.to_owned(),
                 auto_approved,
                 changes: convert_apply_patch_to_protocol(&action),
             }),
@@ -173,8 +200,8 @@ pub(crate) async fn apply_patch(
                 ));
                 let rx = sess
                     .request_patch_approval(
-                        sub_id.clone(),
-                        call_id.clone(),
+                        sub_id.to_owned(),
+                        call_id.to_owned(),
                         &action,
                         reason.clone(),
                         Some(root.clone()),
@@ -204,9 +231,9 @@ pub(crate) async fn apply_patch(
     let _ = sess
         .tx_event
         .send(Event {
-            id: sub_id.clone(),
+            id: sub_id.to_owned(),
             msg: EventMsg::PatchApplyEnd(PatchApplyEndEvent {
-                call_id: call_id.clone(),
+                call_id: call_id.to_owned(),
                 stdout: String::from_utf8_lossy(&stdout).to_string(),
                 stderr: String::from_utf8_lossy(&stderr).to_string(),
                 success: success_flag,
@@ -214,22 +241,23 @@ pub(crate) async fn apply_patch(
         })
         .await;
 
-    match result {
+    let item = match result {
         Ok(_) => ResponseInputItem::FunctionCallOutput {
-            call_id,
+            call_id: call_id.to_owned(),
             output: FunctionCallOutputPayload {
                 content: String::from_utf8_lossy(&stdout).to_string(),
                 success: None,
             },
         },
         Err(e) => ResponseInputItem::FunctionCallOutput {
-            call_id,
+            call_id: call_id.to_owned(),
             output: FunctionCallOutputPayload {
                 content: format!("error: {e:#}, stderr: {}", String::from_utf8_lossy(&stderr)),
                 success: Some(false),
             },
         },
-    }
+    };
+    InternalApplyPatchInvocation::Output(item)
 }
 
 /// Return the first path in `hunks` that is NOT under any of the
diff --git a/codex-rs/core/src/codex.rs b/codex-rs/core/src/codex.rs
index 92ca7bf88b..cd92739c72 100644
--- a/codex-rs/core/src/codex.rs
+++ b/codex-rs/core/src/codex.rs
@@ -4,6 +4,7 @@
 use std::borrow::Cow;
 use std::collections::HashMap;
 use std::collections::HashSet;
+use std::path::Path;
 use std::path::PathBuf;
 use std::sync::Arc;
 use std::sync::Mutex;
@@ -30,6 +31,8 @@ use tracing::trace;
 use tracing::warn;
 use uuid::Uuid;
 
+use crate::apply_patch::CODEX_APPLY_PATCH_ARG1;
+use crate::apply_patch::InternalApplyPatchInvocation;
 use crate::apply_patch::convert_apply_patch_to_protocol;
 use crate::apply_patch::get_writable_roots;
 use crate::apply_patch::{self};
@@ -81,6 +84,7 @@ use crate::protocol::TaskCompleteEvent;
 use crate::rollout::RolloutRecorder;
 use crate::safety::SafetyCheck;
 use crate::safety::assess_command_safety;
+use crate::safety::assess_safety_for_untrusted_command;
 use crate::shell;
 use crate::user_notification::UserNotification;
 use crate::util::backoff;
@@ -354,13 +358,19 @@ impl Session {
         }
     }
 
-    async fn notify_exec_command_begin(&self, sub_id: &str, call_id: &str, params: &ExecParams) {
+    async fn notify_exec_command_begin(
+        &self,
+        sub_id: &str,
+        call_id: &str,
+        command_for_display: Vec<String>,
+        command_cwd: &Path,
+    ) {
         let event = Event {
             id: sub_id.to_string(),
             msg: EventMsg::ExecCommandBegin(ExecCommandBeginEvent {
                 call_id: call_id.to_string(),
-                command: params.command.clone(),
-                cwd: params.cwd.clone(),
+                command: command_for_display,
+                cwd: command_cwd.to_path_buf(),
             }),
         };
         let _ = self.tx_event.send(event).await;
@@ -1420,38 +1430,77 @@ async fn handle_container_exec_with_params(
     call_id: String,
 ) -> ResponseInputItem {
     // check if this was a patch, and apply it if so
-    match maybe_parse_apply_patch_verified(&params.command, &params.cwd) {
-        MaybeApplyPatchVerified::Body(changes) => {
-            return apply_patch::apply_patch(sess, sub_id, call_id, changes).await;
-        }
-        MaybeApplyPatchVerified::CorrectnessError(parse_error) => {
-            // It looks like an invocation of `apply_patch`, but we
-            // could not resolve it into a patch that would apply
-            // cleanly. Return to model for resample.
-            return ResponseInputItem::FunctionCallOutput {
-                call_id,
-                output: FunctionCallOutputPayload {
-                    content: format!("error: {parse_error:#}"),
-                    success: None,
-                },
+    let apply_patch_action_for_exec =
+        match maybe_parse_apply_patch_verified(&params.command, &params.cwd) {
+            MaybeApplyPatchVerified::Body(changes) => {
+                match apply_patch::apply_patch(sess, &sub_id, &call_id, changes).await {
+                    InternalApplyPatchInvocation::Output(item) => return item,
+                    InternalApplyPatchInvocation::DelegateToExec(action) => Some(action),
+                }
+            }
+            MaybeApplyPatchVerified::CorrectnessError(parse_error) => {
+                // It looks like an invocation of `apply_patch`, but we
+                // could not resolve it into a patch that would apply
+                // cleanly. Return to model for resample.
+                return ResponseInputItem::FunctionCallOutput {
+                    call_id,
+                    output: FunctionCallOutputPayload {
+                        content: format!("error: {parse_error:#}"),
+                        success: None,
+                    },
+                };
+            }
+            MaybeApplyPatchVerified::ShellParseError(error) => {
+                trace!("Failed to parse shell command, {error:?}");
+                None
+            }
+            MaybeApplyPatchVerified::NotApplyPatch => None,
+        };
+
+    let (params, safety, command_for_display) = match apply_patch_action_for_exec {
+        Some(ApplyPatchAction { patch, cwd, .. }) => {
+            let path_to_codex = std::env::current_exe()
+                .ok()
+                .map(|p| p.to_string_lossy().to_string());
+            let Some(path_to_codex) = path_to_codex else {
+                return ResponseInputItem::FunctionCallOutput {
+                    call_id,
+                    output: FunctionCallOutputPayload {
+                        content: "failed to determine path to codex executable".to_string(),
+                        success: None,
+                    },
+                };
             };
+
+            let params = ExecParams {
+                command: vec![
+                    path_to_codex,
+                    CODEX_APPLY_PATCH_ARG1.to_string(),
+                    patch.clone(),
+                ],
+                cwd,
+                timeout_ms: params.timeout_ms,
+                env: HashMap::new(),
+            };
+            let safety =
+                assess_safety_for_untrusted_command(sess.approval_policy, &sess.sandbox_policy);
+            (params, safety, vec!["apply_patch".to_string(), patch])
         }
-        MaybeApplyPatchVerified::ShellParseError(error) => {
-            trace!("Failed to parse shell command, {error:?}");
+        None => {
+            let safety = {
+                let state = sess.state.lock().unwrap();
+                assess_command_safety(
+                    &params.command,
+                    sess.approval_policy,
+                    &sess.sandbox_policy,
+                    &state.approved_commands,
+                )
+            };
+            let command_for_display = params.command.clone();
+            (params, safety, command_for_display)
         }
-        MaybeApplyPatchVerified::NotApplyPatch => (),
-    }
-
-    // safety checks
-    let safety = {
-        let state = sess.state.lock().unwrap();
-        assess_command_safety(
-            &params.command,
-            sess.approval_policy,
-            &sess.sandbox_policy,
-            &state.approved_commands,
-        )
     };
+
     let sandbox_type = match safety {
         SafetyCheck::AutoApprove { sandbox_type } => sandbox_type,
         SafetyCheck::AskUser => {
@@ -1496,7 +1545,7 @@ async fn handle_container_exec_with_params(
         }
     };
 
-    sess.notify_exec_command_begin(&sub_id, &call_id, &params)
+    sess.notify_exec_command_begin(&sub_id, &call_id, command_for_display.clone(), &params.cwd)
         .await;
 
     let params = maybe_run_with_user_profile(params, sess);
@@ -1537,7 +1586,16 @@ async fn handle_container_exec_with_params(
             }
         }
         Err(CodexErr::Sandbox(error)) => {
-            handle_sandbox_error(error, sandbox_type, params, sess, sub_id, call_id).await
+            handle_sandbox_error(
+                error,
+                sandbox_type,
+                params,
+                command_for_display,
+                sess,
+                sub_id,
+                call_id,
+            )
+            .await
         }
         Err(e) => {
             // Handle non-sandbox errors
@@ -1556,6 +1614,7 @@ async fn handle_sandbox_error(
     error: SandboxErr,
     sandbox_type: SandboxType,
     params: ExecParams,
+    command_for_display: Vec<String>,
     sess: &Session,
     sub_id: String,
     call_id: String,
@@ -1605,7 +1664,7 @@ async fn handle_sandbox_error(
             sess.notify_background_event(&sub_id, "retrying command without sandbox")
                 .await;
 
-            sess.notify_exec_command_begin(&sub_id, &call_id, &params)
+            sess.notify_exec_command_begin(&sub_id, &call_id, command_for_display, &params.cwd)
                 .await;
 
             // This is an escalated retry; the policy will not be
diff --git a/codex-rs/core/src/lib.rs b/codex-rs/core/src/lib.rs
index ffe64d7c90..054abd742a 100644
--- a/codex-rs/core/src/lib.rs
+++ b/codex-rs/core/src/lib.rs
@@ -43,4 +43,5 @@ pub mod shell;
 mod user_notification;
 pub mod util;
 
+pub use apply_patch::CODEX_APPLY_PATCH_ARG1;
 pub use client_common::model_supports_reasoning_summaries;
diff --git a/codex-rs/core/src/safety.rs b/codex-rs/core/src/safety.rs
index 6a3ff29901..f9bc27e058 100644
--- a/codex-rs/core/src/safety.rs
+++ b/codex-rs/core/src/safety.rs
@@ -75,9 +75,6 @@ pub fn assess_command_safety(
     sandbox_policy: &SandboxPolicy,
     approved: &HashSet<Vec<String>>,
 ) -> SafetyCheck {
-    use AskForApproval::*;
-    use SandboxPolicy::*;
-
     // A command is "trusted" because either:
     // - it belongs to a set of commands we consider "safe" by default, or
     // - the user has explicitly approved the command for this session
@@ -97,6 +94,16 @@ pub fn assess_command_safety(
         };
     }
 
+    assess_safety_for_untrusted_command(approval_policy, sandbox_policy)
+}
+
+pub(crate) fn assess_safety_for_untrusted_command(
+    approval_policy: AskForApproval,
+    sandbox_policy: &SandboxPolicy,
+) -> SafetyCheck {
+    use AskForApproval::*;
+    use SandboxPolicy::*;
+
     match (approval_policy, sandbox_policy) {
         (UnlessTrusted, _) => {
             // Even though the user may have opted into DangerFullAccess,
diff --git a/codex-rs/exec/tests/apply_patch.rs b/codex-rs/exec/tests/apply_patch.rs
index 69ac1b8c0a..f65d32e1c8 100644
--- a/codex-rs/exec/tests/apply_patch.rs
+++ b/codex-rs/exec/tests/apply_patch.rs
@@ -1,5 +1,6 @@
 use anyhow::Context;
 use assert_cmd::prelude::*;
+use codex_core::CODEX_APPLY_PATCH_ARG1;
 use std::fs;
 use std::process::Command;
 use tempfile::tempdir;
@@ -16,7 +17,7 @@ fn test_standalone_exec_cli_can_use_apply_patch() -> anyhow::Result<()> {
 
     Command::cargo_bin("codex-exec")
         .context("should find binary for codex-exec")?
-        .arg("--codex-run-as-apply-patch")
+        .arg(CODEX_APPLY_PATCH_ARG1)
         .arg(
             r#"*** Begin Patch
 *** Update File: source.txt

Review Comments

codex-rs/core/src/codex.rs

Created: 2025-07-30 23:35:30 UTC | Link: https://github.com/openai/codex/pull/1705#discussion_r2244057205

@@ -1411,38 +1421,77 @@ async fn handle_container_exec_with_params(
     call_id: String,
 ) -> ResponseInputItem {
     // check if this was a patch, and apply it if so
-    match maybe_parse_apply_patch_verified(&params.command, &params.cwd) {
-        MaybeApplyPatchVerified::Body(changes) => {
-            return apply_patch::apply_patch(sess, sub_id, call_id, changes).await;
-        }
-        MaybeApplyPatchVerified::CorrectnessError(parse_error) => {
-            // It looks like an invocation of `apply_patch`, but we
-            // could not resolve it into a patch that would apply
-            // cleanly. Return to model for resample.
-            return ResponseInputItem::FunctionCallOutput {
-                call_id,
-                output: FunctionCallOutputPayload {
-                    content: format!("error: {parse_error:#}"),
-                    success: None,
-                },
+    let apply_patch_action_for_exec =
+        match maybe_parse_apply_patch_verified(&params.command, &params.cwd) {
+            MaybeApplyPatchVerified::Body(changes) => {
+                match apply_patch::apply_patch(sess, &sub_id, &call_id, changes).await {
+                    InternalApplyPatchInvocation::Output(item) => return item,
+                    InternalApplyPatchInvocation::DelegateToExec(action) => Some(action),
+                }
+            }
+            MaybeApplyPatchVerified::CorrectnessError(parse_error) => {
+                // It looks like an invocation of `apply_patch`, but we
+                // could not resolve it into a patch that would apply
+                // cleanly. Return to model for resample.
+                return ResponseInputItem::FunctionCallOutput {
+                    call_id,
+                    output: FunctionCallOutputPayload {
+                        content: format!("error: {parse_error:#}"),
+                        success: None,
+                    },
+                };
+            }
+            MaybeApplyPatchVerified::ShellParseError(error) => {
+                trace!("Failed to parse shell command, {error:?}");
+                None
+            }
+            MaybeApplyPatchVerified::NotApplyPatch => None,
+        };
+
+    let (params, safety, command_for_display) = match apply_patch_action_for_exec {
+        Some(ApplyPatchAction { patch, cwd, .. }) => {
+            let path_to_codex = std::env::current_exe()
+                .ok()
+                .map(|p| p.to_string_lossy().to_string());
+            let Some(path_to_codex) = path_to_codex else {

I mean, it is possible the user has a non-UTF8 path to codex, so we would get it in our server logs to find out? I think that's better than just panic'ing.

Created: 2025-07-30 23:40:32 UTC | Link: https://github.com/openai/codex/pull/1705#discussion_r2244062283

@@ -1411,38 +1421,77 @@ async fn handle_container_exec_with_params(
     call_id: String,
 ) -> ResponseInputItem {
     // check if this was a patch, and apply it if so
-    match maybe_parse_apply_patch_verified(&params.command, &params.cwd) {
-        MaybeApplyPatchVerified::Body(changes) => {
-            return apply_patch::apply_patch(sess, sub_id, call_id, changes).await;
-        }
-        MaybeApplyPatchVerified::CorrectnessError(parse_error) => {
-            // It looks like an invocation of `apply_patch`, but we
-            // could not resolve it into a patch that would apply
-            // cleanly. Return to model for resample.
-            return ResponseInputItem::FunctionCallOutput {
-                call_id,
-                output: FunctionCallOutputPayload {
-                    content: format!("error: {parse_error:#}"),
-                    success: None,
-                },
+    let apply_patch_action_for_exec =
+        match maybe_parse_apply_patch_verified(&params.command, &params.cwd) {
+            MaybeApplyPatchVerified::Body(changes) => {
+                match apply_patch::apply_patch(sess, &sub_id, &call_id, changes).await {
+                    InternalApplyPatchInvocation::Output(item) => return item,
+                    InternalApplyPatchInvocation::DelegateToExec(action) => Some(action),
+                }
+            }
+            MaybeApplyPatchVerified::CorrectnessError(parse_error) => {
+                // It looks like an invocation of `apply_patch`, but we
+                // could not resolve it into a patch that would apply
+                // cleanly. Return to model for resample.
+                return ResponseInputItem::FunctionCallOutput {
+                    call_id,
+                    output: FunctionCallOutputPayload {
+                        content: format!("error: {parse_error:#}"),
+                        success: None,
+                    },
+                };
+            }
+            MaybeApplyPatchVerified::ShellParseError(error) => {
+                trace!("Failed to parse shell command, {error:?}");
+                None
+            }
+            MaybeApplyPatchVerified::NotApplyPatch => None,
+        };
+
+    let (params, safety, command_for_display) = match apply_patch_action_for_exec {
+        Some(ApplyPatchAction { patch, cwd, .. }) => {
+            let path_to_codex = std::env::current_exe()
+                .ok()
+                .map(|p| p.to_string_lossy().to_string());
+            let Some(path_to_codex) = path_to_codex else {
+                return ResponseInputItem::FunctionCallOutput {
+                    call_id,
+                    output: FunctionCallOutputPayload {
+                        content: "failed to determine path to codex executable".to_string(),
+                        success: None,
+                    },
+                };
             };
+
+            let params = ExecParams {

here, the goal is to generate (params, safety, command_for_display) so that everything else is the same as a normal exec, though

I can look at refactoring this large function in a follow-up though

39 KiB Raw Blame History

PR #1705: fix: run apply_patch calls through the sandbox

Description

Full Diff

Review Comments

codex-rs/core/src/codex.rs

39 KiB

Raw Blame History