codex

mirror of https://github.com/openai/codex.git synced 2026-05-16 17:23:57 +00:00

Author	SHA1	Message	Date
sayan-oai	b9e8df47da	Use MCP server instructions in deferred namespace descriptions (#21053 ) ## Why MCP servers can provide `instructions` that explain what their tools are for. Directly exposed MCP namespaces already use those instructions when a connector description is not available, but deferred `tool_search` results did not preserve that fallback. The direct path falls back from connector metadata to server instructions, while the deferred path only carried `connector_description` and otherwise fell back to generic namespace text. That meant a plain MCP server could provide useful model-facing guidance and still appear as `Tools in the X namespace.` whenever it was discovered lazily through `tool_search`. ## What changed - Store one model-facing `namespace_description` on `ToolInfo`, using connector descriptions for connector-backed tools and server instructions for plain MCP servers. - Thread that namespace description through the `tool_search` source list, search indexing, and returned namespace metadata. - Add an end-to-end regression test for deferred non-app MCP search results exposing server instructions as the namespace description. ## Verification - `cargo test -p codex-tools search_tool_description_lists_each_mcp_source_once --lib` - `cargo test -p codex-core --test all tool_search_uses_non_app_mcp_server_instructions_as_namespace_description`	2026-05-04 19:36:07 +00:00
Eric Traut	4e0cf945b7	Terminate stdio MCP servers on shutdown to avoid process leaks (#19753 ) ## Why Several bug reports describe thread shutdown (including subagent threads) leaving stdio MCP server processes behind. These reports all point at the same lifecycle gap: Codex launches stdio MCP servers, but the session-level shutdown path does not explicitly close MCP clients or terminate the server process tree. Fixes #12491 Fixes #12976 Fixes #18881 Fixes #19469 ## History This is best understood as a regression/coverage gap in MCP session lifecycle management, not as stdio MCP cleanup being absent all along. #10710 added process-group cleanup for stdio MCP servers, but that cleanup only runs when the `RmcpClient`/transport is dropped. The older reports (#12491 and #12976) came after that cleanup existed, which suggests the remaining problem was that some higher-level shutdown paths kept the MCP manager alive or replaced it without explicitly draining clients. The newer reports (#18881 and #19469) exposed the same family around manager replacement and shutdown. ## What changed - Added an explicit stdio MCP process handle in `codex-rmcp-client` so local MCP servers terminate their process group and executor-backed MCP servers call the executor process terminator. - Added `RmcpClient::shutdown()` and manager-level MCP shutdown draining so session shutdown, channel-close fallback, MCP refresh, and connector probing stop owned MCP clients. - Added regression coverage that starts a stdio MCP server, begins an in-flight blocking tool call, shuts down the client, and asserts the server process exits. ## Verification - `cargo test -p codex-rmcp-client` - `cargo test -p codex-mcp` - `just fix -p codex-rmcp-client` - `just fix -p codex-mcp` - `just fix -p codex-core` - Manual before/after validation with a temporary repro script: - Pre-fix binary from `HEAD^` (`fed0a8f4fa`): reproduced the leak with surviving MCP server and child PIDs, `survivors=[77583, 77592]`, `leaked=true`. - Post-fix binary from this branch (`67e318148b`): verified both MCP processes were gone after interrupting `codex exec`, `survivors=[]`, `leaked=false`.	2026-04-28 09:29:57 -07:00
Ahmed Ibrahim	6de6eaa0c1	[4/4] Honor Streamable HTTP MCP placement (#18584 )	2026-04-24 15:03:55 -07:00
Ahmed Ibrahim	996aa23e4c	[5/6] Wire executor-backed MCP stdio (#18212 ) ## Summary - Add the executor-backed RMCP stdio transport. - Wire MCP stdio placement through the executor environment config. - Cover local and executor-backed stdio paths with the existing MCP test helpers. ## Stack ```text o #18027 [6/6] Fail exec client operations after disconnect │ @ #18212 [5/6] Wire executor-backed MCP stdio │ o #18087 [4/6] Abstract MCP stdio server launching │ o #18020 [3/6] Add pushed exec process events │ o #18086 [2/6] Support piped stdin in exec process API │ o #18085 [1/6] Add MCP server environment config │ o main ``` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-18 21:47:43 -07:00
Michael Bolin	1265df0ec2	refactor: narrow async lock guard lifetimes (#18211 ) Follow-up to https://github.com/openai/codex/pull/18178, where we called out enabling the await-holding lint as a follow-up. The long-term goal is to enable Clippy coverage for async guards held across awaits. This PR is intentionally only the first, low-risk cleanup pass: it narrows obvious lock guard lifetimes and leaves `codex-rs/Cargo.toml` unchanged so the lint is not enabled until the remaining cases are fixed or explicitly justified. It intentionally leaves the active-turn/turn-state locking pattern alone because those checks and mutations need to stay atomic. ## Common fixes used here These are the main patterns reviewers should expect in this PR, and they are also the patterns to reach for when fixing future `await_holding_` findings: - Scope the guard to the synchronous work.* If the code only needs data from a locked value, move the lock into a small block, clone or compute the needed values, and do the later `.await` after the block. - Use direct one-line mutations when there is no later await. Cases like `map.lock().await.remove(&id)` are acceptable when the guard is only needed for that single mutation and the statement ends before any async work. - Drain or clone work out of the lock before notifying or awaiting. For example, the JS REPL drains pending exec senders into a local vector and the websocket writer clones buffered envelopes before it serializes or sends them. - Use a `Semaphore` only when serialization is intentional across async work. The test serialization guards intentionally span awaited setup or execution, so using a semaphore communicates "one at a time" without holding a mutex guard. - Remove the mutex when there is only one owner. The PTY stdin writer task owns `stdin` directly; the old `Arc<Mutex<_>>` did not protect shared access because nothing else had access to the writer. - Do not split locks that protect an atomic invariant. This PR deliberately leaves active-turn/turn-state paths alone because those checks and mutations need to stay atomic. Those cases should be fixed separately with a design change or documented with `#[expect]`. ## What changed - Narrow scoped async mutex guards in app-server, JS REPL, network approval, remote-control websocket, and the RMCP test server. - Replace test-only async mutex serialization guards with semaphores where the guard intentionally lives across async work. - Let the PTY pipe writer task own stdin directly instead of wrapping it in an async mutex. ## Verification - `just fix -p codex-core -p codex-app-server -p codex-rmcp-client -p codex-shell-escalation -p codex-utils-pty -p codex-utils-readiness` - `just clippy -p codex-core` - `cargo test -p codex-core -p codex-app-server -p codex-rmcp-client -p codex-shell-escalation -p codex-utils-pty -p codex-utils-readiness` was run; the app-server suite passed, and `codex-core` failed in the local sandbox on six otel approval tests plus `suite::user_shell_cmd::user_shell_command_does_not_set_network_sandbox_env_var`, which appear to depend on local command approval/default rules and `CODEX_SANDBOX_NETWORK_DISABLED=1` in this environment.	2026-04-17 14:06:50 -07:00
Curtis 'Fjord' Hawthorne	9e2fc31854	Support original-detail metadata on MCP image outputs (#17714 ) ## Summary - honor `_meta["codex/imageDetail"] == "original"` on MCP image content and map it to `detail: "original"` where supported - strip that detail back out when the active model does not support original-detail image inputs - update code-mode `image(...)` to accept individual MCP image blocks - teach `js_repl` / `codex.emitImage(...)` to preserve the same hint from raw MCP image outputs - document the new `_meta` contract and add generic RMCP-backed coverage across protocol, core, code-mode, and js_repl paths	2026-04-15 14:43:33 -07:00
aaronl-openai	42528a905d	Send sandbox state through MCP tool metadata (#17763 ) ## Changes Allows MCPs to opt in to receiving sandbox config info through `_meta` on model-initiated tool calls. This lets MCPs adhere to the thread's sandbox if they choose to. ## Details - Adds the `codex/sandbox-state-meta` experimental MCP capability. - Tracks whether each MCP server advertises that capability. - When a server opts in, `codex-core` injects the current `SandboxState` into model-initiated MCP tool-call request `_meta`. ## Verification - added an integration test for the capability	2026-04-15 00:49:15 -07:00
josiah-openai	937dd3812d	Add `supports_parallel_tool_calls` flag to included mcps (#17667 ) ## Why For more advanced MCP usage, we want the model to be able to emit parallel MCP tool calls and have Codex execute eligible ones concurrently, instead of forcing all MCP calls through the serial block. The main design choice was where to thread the config. I made this server-level because parallel safety depends on the MCP server implementation. Codex reads the flag from `mcp_servers`, threads the opted-in server names into `ToolRouter`, and checks the parsed `ToolPayload::Mcp { server, .. }` at execution time. That avoids relying on model-visible tool names, which can be incomplete in deferred/search-tool paths or ambiguous for similarly named servers/tools. ## What was added Added `supports_parallel_tool_calls` for MCP servers. Before: ```toml [mcp_servers.docs] command = "docs-server" ``` After: ```toml [mcp_servers.docs] command = "docs-server" supports_parallel_tool_calls = true ``` MCP calls remain serial by default. Only tools from opted-in servers are eligible to run in parallel. Docs also now warn to enable this only when the server’s tools are safe to run concurrently, especially around shared state or read/write races. ## Testing Tested with a local stdio MCP server exposing real delay tools. The model/Responses side was mocked only to deterministically emit two MCP calls in the same turn. Each test called `query_with_delay` and `query_with_delay_2` with `{ "seconds": 25 }`. \| Build/config \| Observed \| Wall time \| \| --- \| --- \| --- \| \| main with flag enabled \| serial \| `58.79s` \| \| PR with flag enabled \| parallel \| `31.73s` \| \| PR without flag \| serial \| `56.70s` \| PR with flag enabled showed both tools start before either completed; main and PR-without-flag completed the first delay before starting the second. Also added an integration test. Additional checks: - `cargo test -p codex-tools` passed - `cargo test -p codex-core mcp_parallel_support_uses_exact_payload_server` passed - `git diff --check` passed	2026-04-13 15:16:34 -07:00
Vivian Fang	7bbe3b6011	Add output_schema to code mode render (#17210 ) This updates code-mode tool rendering so MCP tools can surface structured output types from their `outputSchema`. What changed: - Detect MCP tool-call result wrappers from the output schema shape instead of relying on tool-name parsing or provenance flags. - Render shared TypeScript aliases once for MCP tool results (`CallToolResult`, `ContentBlock`, etc.) so multiple MCP tool declarations stay compact. - Type `structuredContent` from the tool definition's `outputSchema` instead of rendering it as `unknown`. - Update the shared MCP aliases to match the MCP draft `CallToolResult` schema more closely. Example: - Before: `declare const tools: { mcp__rmcp__echo(args: { env_var?: string; message: string; }): Promise<{ _meta?: unknown; content: Array<unknown>; isError?: boolean; structuredContent?: unknown; }>; };` - After: `declare const tools: { mcp__rmcp__echo(args: { env_var?: string; message: string; }): Promise<CallToolResult<{ echo: string; env: string \| null; }>>; };`	2026-04-10 11:41:44 +00:00
Fouad Matin	32c4993c8a	fix(core): default approval behavior for mcp missing annotations (#15519 ) - Changed `requires_mcp_tool_approval` to apply MCP spec defaults when annotations are missing. - Unannotated tools now default to: - `readOnlyHint = false` - `destructiveHint = true` - `openWorldHint = true` - This means unannotated MCP tools now go through approval/ARC monitoring instead of silently bypassing it. - Explicitly read-only tools still skip approval unless they are also explicitly marked destructive. Previous behavior Failed open for missing annotations, which was unsafe for custom MCP tools that omitted or forgot annotations. --------- Co-authored-by: colby-oai <228809017+colby-oai@users.noreply.github.com>	2026-03-25 07:55:41 -07:00
pakrym-oai	dadffd27d4	Fix MCP tool calling (#14491 ) Properly escape mcp tool names and make tools only available via imports.	2026-03-12 13:38:52 -07:00
Ahmed Ibrahim	44ecc527cb	Stabilize RMCP streamable HTTP readiness tests (#13880 ) ## What changed - The RMCP streamable HTTP tests now wait for metadata and tool readiness before issuing tool calls. - OAuth state is isolated per test home. - The helper server startup path now uses bounded bind retries so transient `AddrInUse` collisions do not fail the test immediately. ## Why this fixes the flake - The old tests could begin issuing tool requests before the helper server had finished advertising its metadata and tools, so the first request sometimes raced the server startup sequence. - On top of that, shared OAuth state and occasional bind collisions on CI runners introduced cross-test environmental noise unrelated to the functionality under test. - Readiness polling makes the client wait for an observable “server is ready” signal, while isolated state and bounded bind retries remove external contention that was causing intermittent failures. ## Scope - Test-only change.	2026-03-09 19:52:55 +00:00
Casey Chow	b3765a07e8	[rmcp-client] Recover from streamable HTTP 404 sessions (#13514 ) ## Summary - add one-time session recovery in `RmcpClient` for streamable HTTP MCP `404` session expiry - rebuild the transport and retry the failed operation once after reinitializing the client state - extend the test server and integration coverage for `404`, `401`, single-retry, and non-session failure scenarios ## Testing - just fmt - cargo test -p codex-rmcp-client (the post-rebase run lost its final summary in the terminal; the suite had passed earlier before the rebase) - just fix -p codex-rmcp-client	2026-03-06 10:02:42 -05:00
Matthew Zeng	9f1009540b	Upgrade rmcp to 0.14 (#10718 ) - [x] Upgrade rmcp to 0.14	2026-02-08 15:07:53 -08:00
K Bediako	337643b00a	Fix: Render MCP image outputs regardless of ordering (#9815 ) ## What? - Render an MCP image output cell whenever a decodable image block exists in `CallToolResult.content` (including text-before-image or malformed image before valid image). ## Why? - Tool results that include caption text before the image currently drop the image output cell. - A malformed image block can also suppress later valid image output. ## How? - Iterate `content` and return the first successfully decoded image instead of only checking the first block. - Add unit tests that cover text-before-image ordering and invalid-image-before-valid. ## Before ```rust let image = match result { Ok(mcp_types::CallToolResult { content, .. }) => { if let Some(mcp_types::ContentBlock::ImageContent(image)) = content.first() { // decode image (fails -> None) } else { None } } _ => None, }?; ``` ## After ```rust let image = result .as_ref() .ok()? .content .iter() .find_map(decode_mcp_image)?; ``` ## Risk / Impact - Low: only affects image cell creation for MCP tool results; no change for non-image outputs. ## Tests - [x] `just fmt` - [x] `cargo test -p codex-tui` - [x] Rerun after branch update (2026-01-27): `just fmt`, `cargo test -p codex-tui` Manual testing # Manual testing: MCP image tool result rendering (Codex TUI) # Build the rmcp stdio test server binary: cd codex-rs cargo build -p codex-rmcp-client --bin test_stdio_server # Register the server as an MCP server (absolute path to the built binary): codex mcp add mcpimg -- /Users/joshka/code/codex-pr-review/codex-rs/target/debug/test_stdio_server # Then in Codex TUI, ask it to call: - mcpimg.image_scenario({"scenario":"image_only"}) - mcpimg.image_scenario({"scenario":"text_then_image","caption":"Here is the image:"}) - mcpimg.image_scenario({"scenario":"invalid_base64_then_image"}) - mcpimg.image_scenario({"scenario":"invalid_image_bytes_then_image"}) - mcpimg.image_scenario({"scenario":"multiple_valid_images"}) - mcpimg.image_scenario({"scenario":"image_then_text","caption":"Here is the image:"}) - mcpimg.image_scenario({"scenario":"text_only","caption":"Here is the image:"}) # Expected: # - You should see an extra history cell: "tool result (image output)" when the # tool result contains at least one decodable image block (even if earlier # blocks are text or invalid images). Fixes #9814 --------- Co-authored-by: Josh McKinney <joshka@openai.com>	2026-01-27 21:14:08 +00:00
Michael Bolin	53f53173a8	chore: upgrade rmcp crate from 0.10.0 to 0.12.0 (#8288 ) Version `0.12.0` includes https://github.com/modelcontextprotocol/rust-sdk/pull/590, which I will use in https://github.com/openai/codex/pull/8142. Changes: - `rmcp::model::CustomClientNotification` was renamed to `rmcp::model::CustomNotification` - a bunch of types have a `meta` field now, but it is `Option`, so I added `meta: None` to a bunch of things	2025-12-18 14:28:46 -08:00
Gabriel Peal	9b538a8672	Upgrade rmcp to 0.8.4 (#6234 ) Picks up https://github.com/modelcontextprotocol/rust-sdk/pull/509 which fixes https://github.com/openai/codex/issues/6164	2025-11-05 00:23:24 -05:00
Gabriel Peal	b0bdc04c30	[MCP] Render MCP tool call result images to the model (#5600 ) It's pretty amazing we have gotten here without the ability for the model to see image content from MCP tool calls. This PR builds off of 4391 and fixes #4819. I would like @KKcorps to get adequete credit here but I also want to get this fix in ASAP so I gave him a week to update it and haven't gotten a response so I'm going to take it across the finish line. This test highlights how absured the current situation is. I asked the model to read this image using the Chrome MCP <img width="2378" height="674" alt="image" src="https://github.com/user-attachments/assets/9ef52608-72a2-4423-9f5e-7ae36b2b56e0" /> After this change, it correctly outputs: > Captured the page: image dhows a dark terminal-style UI labeled `OpenAI Codex (v0.0.0)` with prompt `model: gpt-5-codex medium` and working directory `/codex/codex-rs` (and more) Before this change, it said: > Took the full-page screenshot you asked for. It shows a long, horizontally repeating pattern of stylized people in orange, light-blue, and mustard clothing, holding hands in alternating poses against a white background. No text or other graphics-just rows of flat illustration stretching off to the right. Without this change, the Figma, Playwright, Chrome, and other visual MCP servers are pretty much entirely useless. I tested this change with the openai respones api as well as a third party completions api	2025-10-27 17:55:57 -04:00
Gabriel Peal	40fba1bb4c	[MCP] Add support for resources (#5239 ) This PR adds support for [MCP resources](https://modelcontextprotocol.io/specification/2025-06-18/server/resources) by adding three new tools for the model: 1. `list_resources` 2. `list_resource_templates` 3. `read_resource` These 3 tools correspond to the [three primary MCP resource protocol messages](https://modelcontextprotocol.io/specification/2025-06-18/server/resources#protocol-messages). Example of listing and reading a GitHub resource tempalte <img width="2984" height="804" alt="CleanShot 2025-10-15 at 17 31 10" src="https://github.com/user-attachments/assets/89b7f215-2e2a-41c5-90dd-b932ac84a585" /> `/mcp` with Figma configured <img width="2984" height="442" alt="CleanShot 2025-10-15 at 18 29 35" src="https://github.com/user-attachments/assets/a7578080-2ed2-4c59-b9b4-d8461f90d8ee" /> Fixes #4956	2025-10-17 01:05:15 -04:00
Gabriel Peal	1d17ca1fa3	[MCP] Add support for MCP Oauth credentials (#4517 ) This PR adds oauth login support to streamable http servers when `experimental_use_rmcp_client` is enabled. This PR is large but represents the minimal amount of work required for this to work. To keep this PR smaller, login can only be done with `codex mcp login` and `codex mcp logout` but it doesn't appear in `/mcp` or `codex mcp list` yet. Fingers crossed that this is the last large MCP PR and that subsequent PRs can be smaller. Under the hood, credentials are stored using platform credential managers using the [keyring crate](https://crates.io/crates/keyring). When the keyring isn't available, it falls back to storing credentials in `CODEX_HOME/.credentials.json` which is consistent with how other coding agents handle authentication. I tested this on macOS, Windows, WSL (ubuntu), and Linux. I wasn't able to test the dbus store on linux but did verify that the fallback works. One quirk is that if you have credentials, during development, every build will have its own ad-hoc binary so the keyring won't recognize the reader as being the same as the write so it may ask for the user's password. I may add an override to disable this or allow users/enterprises to opt-out of the keyring storage if it causes issues. <img width="5064" height="686" alt="CleanShot 2025-09-30 at 19 31 40" src="https://github.com/user-attachments/assets/9573f9b4-07f1-4160-83b8-2920db287e2d" /> <img width="745" height="486" alt="image" src="https://github.com/user-attachments/assets/9562649b-ea5f-4f22-ace2-d0cb438b143e" />	2025-10-03 13:43:12 -04:00
Gabriel Peal	3a1be084f9	[MCP] Add experimental support for streamable HTTP MCP servers (#4317 ) This PR adds support for streamable HTTP MCP servers when the `experimental_use_rmcp_client` is enabled. To set one up, simply add a new mcp server config with the url: ``` [mcp_servers.figma] url = "http://127.0.0.1:3845/mcp" ``` It also supports an optional `bearer_token` which will be provided in an authorization header. The full oauth flow is not supported yet. The config parsing will throw if it detects that the user mixed and matched config fields (like command + bearer token or url + env). The best way to review it is to review `core/src` and then `rmcp-client/src/rmcp_client.rs` first. The rest is tests and propagating the `Transport` struct around the codebase. Example with the Figma MCP: <img width="5084" height="1614" alt="CleanShot 2025-09-26 at 13 35 40" src="https://github.com/user-attachments/assets/eaf2771e-df3e-4300-816b-184d7dec5a28" />	2025-09-26 21:24:01 -04:00
Gabriel Peal	e555a36c6a	[MCP] Introduce an experimental official rust sdk based mcp client (#4252 ) The [official Rust SDK](`57fc428c57`) has come a long way since we first started our mcp client implementation 5 months ago and, today, it is much more complete than our own stdio-only implementation. This PR introduces a new config flag `experimental_use_rmcp_client` which will use a new mcp client powered by the sdk instead of our own. To keep this PR simple, I've only implemented the same stdio MCP functionality that we had but will expand on it with future PRs. --------- Co-authored-by: pakrym-oai <pakrym@openai.com>	2025-09-26 13:13:37 -04:00

22 Commits