fix: change codex/sandbox-state/update from a notification to a request (#8142)

Historically, `accept_elicitation_for_prompt_rule()` was flaky because we used a notification to update the sandbox, followed by a `shell` tool request that we expected to be subject to the new sandbox config. Because [rmcp](https://crates.io/crates/rmcp) MCP servers delegate each incoming message to a new Tokio task, messages are not guaranteed to be processed in order, so the `shell` tool call would sometimes run before the notification was processed.

Prior to this PR, we relied on a generous `sleep()` between the notification and the request to reduce the chance of the test flaking out.

This PR implements a proper fix: use a _request_ instead of a notification for the sandbox update, so that we can wait for the response to the sandbox request before sending the `shell` tool call. Previously, `rmcp` did not support custom requests, but I fixed that in https://github.com/modelcontextprotocol/rust-sdk/pull/590, which made it into the `0.12.0` release (see #8288).

This PR updates `shell-tool-mcp` to expect `"codex/sandbox-state/update"` as a _request_ instead of a notification and to send the appropriate ack. Note that this behavior is tied to our custom `codex/sandbox-state` capability, which Codex honors as an MCP client; that is why `core/src/mcp_connection_manager.rs` had to be updated as part of this PR as well. This PR also updates the docs at `shell-tool-mcp/README.md`.
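
The ordering argument above can be illustrated with a minimal, std-only sketch. It does not use `rmcp` or Tokio: `Sandbox`, `Message`, `dispatch`, and `update_then_shell` are hypothetical names invented for this example, and plain threads stand in for the per-message Tokio tasks. The only point it shows is that when the update carries a reply channel (a request), the client can block on the ack, so the later `shell` message is not even sent until the update has been applied.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the server-side sandbox state.
#[derive(Clone, Debug, PartialEq)]
enum Sandbox {
    Default,
    Readable,
}

/// Hypothetical message type. The update carries an `ack` channel, which is
/// what makes it a request rather than a fire-and-forget notification.
enum Message {
    UpdateSandbox { state: Sandbox, ack: mpsc::Sender<()> },
    Shell { reply: mpsc::Sender<Sandbox> },
}

/// Handles each message on its own thread, mimicking a server that dispatches
/// messages concurrently, so completion order across messages is not guaranteed.
fn dispatch(state: &Arc<Mutex<Sandbox>>, msg: Message) {
    let state = Arc::clone(state);
    thread::spawn(move || match msg {
        Message::UpdateSandbox { state: new_state, ack } => {
            *state.lock().unwrap() = new_state;
            // Acknowledge only after the new state has been stored.
            let _ = ack.send(());
        }
        Message::Shell { reply } => {
            // Report which sandbox state this "tool call" observed.
            let observed = state.lock().unwrap().clone();
            let _ = reply.send(observed);
        }
    });
}

/// Sends the update as a request, waits for the ack, then sends the shell call.
fn update_then_shell() -> Sandbox {
    let state = Arc::new(Mutex::new(Sandbox::Default));

    let (ack_tx, ack_rx) = mpsc::channel();
    dispatch(
        &state,
        Message::UpdateSandbox { state: Sandbox::Readable, ack: ack_tx },
    );
    // Blocking here is the fix: the shell message is not sent until the
    // server has confirmed that the update was processed.
    ack_rx.recv().expect("server dropped the ack channel");

    let (reply_tx, reply_rx) = mpsc::channel();
    dispatch(&state, Message::Shell { reply: reply_tx });
    reply_rx.recv().expect("server dropped the reply channel")
}

fn main() {
    assert_eq!(update_then_shell(), Sandbox::Readable);
    println!("shell observed the updated sandbox");
}
```

If the `ack_rx.recv()` line were removed, the two handler threads would race and the shell handler could observe `Sandbox::Default`, which is the same kind of flakiness the `sleep()` workaround was papering over.
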
2026-04-25 23:24:55 +00:00 · 2025-12-18 15:32:01 -08:00
parent 358a5baba0
commit 46baedd7cb
7 changed files with 98 additions and 62 deletions
--- a/codex-rs/exec-server/tests/suite/accept_elicitation.rs
+++ b/codex-rs/exec-server/tests/suite/accept_elicitation.rs
@@ -3,7 +3,6 @@ use std::borrow::Cow;
 use std::path::PathBuf;
 use std::sync::Arc;
 use std::sync::Mutex;
-use std::time::Duration;

 use anyhow::Context;
 use anyhow::Result;
@@ -19,6 +18,8 @@ use rmcp::ServiceExt;
 use rmcp::model::CallToolRequestParam;
 use rmcp::model::CallToolResult;
 use rmcp::model::CreateElicitationRequestParam;
+use rmcp::model::EmptyResult;
+use rmcp::model::ServerResult;
 use rmcp::model::object;
 use serde_json::json;
 use std::os::unix::fs::PermissionsExt;
@@ -82,19 +83,11 @@ prefix_rule(
    } else {
        None
    };
-    notify_readable_sandbox(&project_root_path, codex_linux_sandbox_exe, &service).await?;
-
-    // TODO(mbolin): Remove this hack to remove flakiness when possible.
-    // As noted in the commentary on https://github.com/openai/codex/pull/7832,
-    // an rmcp server does not process messages serially: it takes messages off
-    // the queue and immediately dispatches them to handlers, which may complete
-    // out of order. The proper fix is to replace our custom notification with a
-    // custom request where we wait for the response before proceeding. However,
-    // rmcp does not currently support custom requests, so as a temporary
-    // workaround we just wait a bit to increase the probability the server has
-    // processed the notification. Assuming we can upstream rmcp support for
-    // custom requests, we will remove this once the functionality is available.
-    tokio::time::sleep(Duration::from_secs(4)).await;
+    let response =
+        notify_readable_sandbox(&project_root_path, codex_linux_sandbox_exe, &service).await?;
+    let ServerResult::EmptyResult(EmptyResult {}) = response else {
+        panic!("expected EmptyResult from sandbox state notification but found: {response:?}");
+    };

    // Call the shell tool and verify that an elicitation was created and
    // auto-approved.