Historically, `accept_elicitation_for_prompt_rule()` was flaky. The test used a notification to update the sandbox, followed by a `shell` tool request that we expected to be subject to the new sandbox config. Because [rmcp](https://crates.io/crates/rmcp) MCP servers delegate each incoming message to a new Tokio task, messages are not guaranteed to be processed in order, so the `shell` tool call would sometimes run before the notification was processed.

Prior to this PR, we relied on a generous `sleep()` between the notification and the request to reduce the chance of the test flaking out.

This PR implements a proper fix: use a _request_ instead of a notification for the sandbox update, so that we can wait for the response to the sandbox request before sending the `shell` tool call. Previously, `rmcp` did not support custom requests, but I fixed that in https://github.com/modelcontextprotocol/rust-sdk/pull/590, which made it into the `0.12.0` release (see #8288).

This PR updates `shell-tool-mcp` to expect `"codex/sandbox-state/update"` as a _request_ instead of a notification and to send the appropriate ack. Note that this behavior is tied to our custom `codex/sandbox-state` capability, which Codex honors as an MCP client; that is why `core/src/mcp_connection_manager.rs` had to be updated as part of this PR as well. This PR also updates the docs at `shell-tool-mcp/README.md`.
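To make the race concrete, here is a minimal `asyncio` sketch. It is not the `rmcp` or Codex code: `handle()`, `notify_then_call()`, and `request_then_call()` are hypothetical names, the `"shell"` message shape is simplified, and the short sleep merely stands in for the time the update takes to apply. It only illustrates why awaiting a response restores ordering when each message is handled in its own task.

```python
import asyncio


async def handle(message, state, log):
    """Handle one message in its own task (a stand-in for a task-per-message server)."""
    if message["method"] == "codex/sandbox-state/update":
        await asyncio.sleep(0.01)  # assumption: applying the update takes a moment
        state["policy"] = message["policy"]
    elif message["method"] == "shell":
        log.append(state["policy"])  # record which policy the command ran under
    return {}


async def notify_then_call():
    """Notification style: nothing to await, so the tool call can overtake the update."""
    state, log = {"policy": "old"}, []
    update = asyncio.create_task(
        handle({"method": "codex/sandbox-state/update", "policy": "new"}, state, log)
    )
    shell = asyncio.create_task(handle({"method": "shell"}, state, log))
    await asyncio.gather(update, shell)
    return log


async def request_then_call():
    """Request style: wait for the response before sending the tool call."""
    state, log = {"policy": "old"}, []
    await asyncio.create_task(
        handle({"method": "codex/sandbox-state/update", "policy": "new"}, state, log)
    )
    await asyncio.create_task(handle({"method": "shell"}, state, log))
    return log
```

In this sketch, `asyncio.run(notify_then_call())` records the old policy, while `asyncio.run(request_then_call())` records the new one.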
@openai/codex-shell-tool-mcp
Note: This MCP server is still experimental. When using it with Codex CLI, ensure the CLI version matches the MCP server version.
@openai/codex-shell-tool-mcp is an MCP server that provides a tool named shell that runs a shell command inside a sandboxed instance of Bash. This special instance of Bash intercepts requests to spawn new processes (specifically, execve(2) calls). For each call, it makes a request back to the MCP server to determine whether to allow the proposed command to execute. It also has the option of escalating the command to run unprivileged outside of the sandbox governing the Bash process.
The user can use Codex .rules files to define how a command should be handled. The action to take is determined by the decision parameter of a matching rule as follows:
- allow: the command will be escalated and run outside the sandbox
- prompt: the command will be subject to human approval via an MCP elicitation (it will run escalated if approved)
- forbidden: the command will fail with exit code 1 and an error message will be written to stderr
Commands that do not match an explicit rule in .rules will be allowed to run as-is, though they will still be subject to the sandbox applied to the parent Bash process.
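The decision handling described above can be summarized as a small lookup. This is an illustrative sketch only: `Outcome` and `outcome_for()` are hypothetical names, not part of the MCP server, and the sketch does not parse .rules files or perform any matching.

```python
from enum import Enum


class Outcome(Enum):
    ESCALATE = "run outside the sandbox"
    ASK_USER = "ask for approval via an MCP elicitation; run escalated if approved"
    DENY = "fail with exit code 1 and write an error message to stderr"
    RUN_SANDBOXED = "run as-is under the sandbox applied to the parent Bash process"


# Decision values as they appear in a matching rule.
_BY_DECISION = {
    "allow": Outcome.ESCALATE,
    "prompt": Outcome.ASK_USER,
    "forbidden": Outcome.DENY,
}


def outcome_for(decision):
    """Map a matching rule's decision to an outcome; None means no rule matched."""
    if decision is None:
        return Outcome.RUN_SANDBOXED
    return _BY_DECISION[decision]  # raises KeyError for an unknown decision
```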
Motivation
When a software agent asks if it is safe to run a command like ls, without more context, it is unclear whether it will result in executing /bin/ls. Consider:
- There could be another executable named ls that appears before /bin/ls on the $PATH.
- ls could be mapped to a shell alias or function.
Because @openai/codex-shell-tool-mcp intercepts execve(2) calls directly, it always knows the full path to the program being executed. In turn, this makes it possible to provide stronger guarantees on how Codex .rules are enforced.
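As a small illustration of the $PATH ambiguity, the following sketch creates a throwaway directory containing an executable named ls and shows that a PATH-based lookup picks it over /bin/ls. The function name `demo_shadowing()` is hypothetical, and the sketch uses Python's `shutil.which` purely to mimic PATH lookup; it does not involve the MCP server or execve(2) interception.

```python
import os
import shutil
import tempfile


def demo_shadowing():
    """Return True if a fake ls placed earlier on the PATH wins the lookup."""
    with tempfile.TemporaryDirectory() as fake_bin:
        fake_ls = os.path.join(fake_bin, "ls")
        with open(fake_ls, "w") as f:
            f.write("#!/bin/sh\necho 'not the real ls'\n")
        os.chmod(fake_ls, 0o755)  # mark the file as executable

        # The fake directory comes first, so it shadows /bin/ls.
        search_path = os.pathsep.join([fake_bin, "/bin", "/usr/bin"])
        return shutil.which("ls", path=search_path) == fake_ls
```

Seeing only the string ls, a reviewer cannot tell which of these programs would run; the full path observed at exec time removes that ambiguity.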
Usage
First, verify that you can download and run the MCP executable:
npx -y @openai/codex-shell-tool-mcp --version
To test out the MCP with a one-off invocation of Codex CLI, it is important to disable the default shell tool in addition to enabling the MCP so Codex has exactly one shell-like tool available to it:
codex --disable shell_tool \
--config 'mcp_servers.bash={command = "npx", args = ["-y", "@openai/codex-shell-tool-mcp"]}'
To configure this permanently so you can use the MCP while running codex without additional command-line flags, add the following to your ~/.codex/config.toml:
[features]
shell_tool = false
[mcp_servers.shell-tool]
command = "npx"
args = ["-y", "@openai/codex-shell-tool-mcp"]
Note that when the @openai/codex-shell-tool-mcp launcher runs, it selects the appropriate native binary to run based on the host OS/architecture. For the Bash wrapper, it inspects /etc/os-release on Linux or the Darwin major version on macOS to try to find the best match it has available. See bashSelection.ts for details.
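For readers unfamiliar with /etc/os-release, the sketch below shows how its KEY=value lines can be read to obtain the distribution ID and version. It is an assumption-laden illustration in Python, not a port of bashSelection.ts: `parse_os_release()` is a hypothetical helper, and the actual matching logic lives in that file.

```python
import shlex


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, raw = line.partition("=")
        parts = shlex.split(raw)  # removes shell-style quoting around the value
        fields[key] = parts[0] if parts else ""
    return fields


# Sample input for illustration; real files contain more fields.
SAMPLE = 'NAME="Ubuntu"\nID=ubuntu\nVERSION_ID="24.04"\n'
info = parse_os_release(SAMPLE)
print(info["ID"], info["VERSION_ID"])
```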
MCP Client Requirements
This MCP server is designed to be used with Codex, as it declares the following capability that Codex supports when acting as an MCP client:
{
"capabilities": {
"experimental": {
"codex/sandbox-state": {
"version": "1.0.0"
}
}
}
}
This capability means the MCP server honors requests like the following to update the sandbox policy the MCP server uses when spawning Bash:
{
"id": "req-42",
"method": "codex/sandbox-state/update",
"params": {
"sandboxPolicy": {
"type": "workspace-write",
"writable_roots": ["/home/user/code/codex"],
"network_access": false,
"exclude_tmpdir_env_var": false,
"exclude_slash_tmp": false
}
}
}
Once the server has processed the update, it sends an empty response to acknowledge the request:
{
"id": "req-42",
"result": {}
}
The Codex harness (used by the CLI and the VS Code extension) sends such requests to MCP servers that declare the codex/sandbox-state capability.
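The exchange above can be sketched from the client's side as follows. This is a hedged illustration, not the Codex implementation: `declares_sandbox_state()`, `build_update_request()`, and `is_ack()` are hypothetical helpers, the message shapes mirror the abbreviated examples above, and transport concerns are left out entirely.

```python
CAPABILITY = "codex/sandbox-state"
UPDATE_METHOD = "codex/sandbox-state/update"


def declares_sandbox_state(server_info):
    """Return True if the server's declared capabilities include codex/sandbox-state."""
    experimental = server_info.get("capabilities", {}).get("experimental", {})
    return CAPABILITY in experimental


def build_update_request(request_id, sandbox_policy):
    """Build the update request in the same shape as the example above."""
    return {
        "id": request_id,
        "method": UPDATE_METHOD,
        "params": {"sandboxPolicy": sandbox_policy},
    }


def is_ack(request, response):
    """Return True if the response acknowledges this particular request."""
    return (
        response.get("id") == request["id"]
        and "result" in response
        and "error" not in response
    )


server_info = {"capabilities": {"experimental": {CAPABILITY: {"version": "1.0.0"}}}}
policy = {
    "type": "workspace-write",
    "writable_roots": ["/home/user/code/codex"],
    "network_access": False,
    "exclude_tmpdir_env_var": False,
    "exclude_slash_tmp": False,
}
request = build_update_request("req-42", policy)
```

A client following this pattern would send the `shell` tool call only after `is_ack()` holds for the matching response, which is the ordering guarantee the request-based design is meant to provide.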
Package Contents
This package wraps the codex-exec-mcp-server binary and its helpers so that the shell MCP can be invoked via npx -y @openai/codex-shell-tool-mcp. It bundles:
- codex-exec-mcp-server and codex-execve-wrapper built for macOS (arm64, x64) and Linux (musl arm64, musl x64).
- A patched Bash that honors BASH_EXEC_WRAPPER, built for multiple glibc baselines (Ubuntu 24.04/22.04/20.04, Debian 12/11, CentOS-like 9) and macOS (15/14/13).
- A launcher (bin/mcp-server.js) that picks the correct binaries for the current process.platform / process.arch, specifying --execve and --bash for the MCP, as appropriate.
See the README in the Codex repo for details.