## What?
- Render an MCP image output cell whenever a decodable image block
exists in `CallToolResult.content` (including text-before-image or
malformed image before valid image).
## Why?
- Tool results that include caption text before the image currently drop
the image output cell.
- A malformed image block can also suppress later valid image output.
## How?
- Iterate `content` and return the first successfully decoded image
instead of only checking the first block.
- Add unit tests that cover text-before-image ordering and
invalid-image-before-valid.
## Before
```rust
let image = match result {
Ok(mcp_types::CallToolResult { content, .. }) => {
if let Some(mcp_types::ContentBlock::ImageContent(image)) = content.first() {
// decode image (fails -> None)
} else {
None
}
}
_ => None,
}?;
```
## After
```rust
let image = result
.as_ref()
.ok()?
.content
.iter()
.find_map(decode_mcp_image)?;
```
## Risk / Impact
- Low: only affects image cell creation for MCP tool results; no change
for non-image outputs.
## Tests
- [x] `just fmt`
- [x] `cargo test -p codex-tui`
- [x] Rerun after branch update (2026-01-27): `just fmt`, `cargo test -p
codex-tui`
Manual testing
# Manual testing: MCP image tool result rendering (Codex TUI)
# Build the rmcp stdio test server binary:
cd codex-rs
cargo build -p codex-rmcp-client --bin test_stdio_server
# Register the server as an MCP server (absolute path to the built binary):
codex mcp add mcpimg -- /Users/joshka/code/codex-pr-review/codex-rs/target/debug/test_stdio_server
# Then in Codex TUI, ask it to call:
- mcpimg.image_scenario({"scenario":"image_only"})
- mcpimg.image_scenario({"scenario":"text_then_image","caption":"Here is the image:"})
- mcpimg.image_scenario({"scenario":"invalid_base64_then_image"})
- mcpimg.image_scenario({"scenario":"invalid_image_bytes_then_image"})
- mcpimg.image_scenario({"scenario":"multiple_valid_images"})
- mcpimg.image_scenario({"scenario":"image_then_text","caption":"Here is the image:"})
- mcpimg.image_scenario({"scenario":"text_only","caption":"Here is the image:"})
# Expected:
# - You should see an extra history cell: "tool result (image output)" when the
# tool result contains at least one decodable image block (even if earlier
# blocks are text or invalid images).
Fixes#9814
---------
Co-authored-by: Josh McKinney <joshka@openai.com>
It's pretty amazing we have gotten here without the ability for the
model to see image content from MCP tool calls.
This PR builds off of 4391 and fixes#4819. I would like @KKcorps to get
adequete credit here but I also want to get this fix in ASAP so I gave
him a week to update it and haven't gotten a response so I'm going to
take it across the finish line.
This test highlights how absured the current situation is. I asked the
model to read this image using the Chrome MCP
<img width="2378" height="674" alt="image"
src="https://github.com/user-attachments/assets/9ef52608-72a2-4423-9f5e-7ae36b2b56e0"
/>
After this change, it correctly outputs:
> Captured the page: image dhows a dark terminal-style UI labeled
`OpenAI Codex (v0.0.0)` with prompt `model: gpt-5-codex medium` and
working directory `/codex/codex-rs`
(and more)
Before this change, it said:
> Took the full-page screenshot you asked for. It shows a long,
horizontally repeating pattern of stylized people in orange, light-blue,
and mustard clothing, holding hands in alternating poses against a white
background. No text or other graphics-just rows of flat illustration
stretching off to the right.
Without this change, the Figma, Playwright, Chrome, and other visual MCP
servers are pretty much entirely useless.
I tested this change with the openai respones api as well as a third
party completions api
This PR adds support for streamable HTTP MCP servers when the
`experimental_use_rmcp_client` is enabled.
To set one up, simply add a new mcp server config with the url:
```
[mcp_servers.figma]
url = "http://127.0.0.1:3845/mcp"
```
It also supports an optional `bearer_token` which will be provided in an
authorization header. The full oauth flow is not supported yet.
The config parsing will throw if it detects that the user mixed and
matched config fields (like command + bearer token or url + env).
The best way to review it is to review `core/src` and then
`rmcp-client/src/rmcp_client.rs` first. The rest is tests and
propagating the `Transport` struct around the codebase.
Example with the Figma MCP:
<img width="5084" height="1614" alt="CleanShot 2025-09-26 at 13 35 40"
src="https://github.com/user-attachments/assets/eaf2771e-df3e-4300-816b-184d7dec5a28"
/>