Support multimodal custom tool outputs (#12948)

## Summary

This changes `custom_tool_call_output` to use the same output payload
shape as `function_call_output`, so freeform tools can return either
plain text or structured content items.

The main goal is to let `js_repl` return image content from nested
`view_image` calls in its own `custom_tool_call_output`, instead of
relying on a separate injected message.

## What changed

- Changed `custom_tool_call_output.output` from `string` to
`FunctionCallOutputPayload`
- Updated freeform tool plumbing to preserve structured output bodies
- Updated `js_repl` to aggregate nested tool content items and attach
them to the outer `js_repl` result
- Removed the old `js_repl` special case that injected `view_image`
results as a separate pending user image message
- Updated normalization/history/truncation paths to handle multimodal
`custom_tool_call_output`
- Regenerated app-server protocol schema artifacts

## Behavior

Direct `view_image` calls still return a `function_call_output` with
image content.

When `view_image` is called inside `js_repl`, the outer `js_repl`
`custom_tool_call_output` now carries:
- an `input_text` item if the JS produced text output
- one or more `input_image` items from nested tool results

So the nested image result now stays inside the `js_repl` tool output
instead of being injected as a separate message.

## Compatibility

This is intended to be backward-compatible for resumed conversations.

Older histories that stored `custom_tool_call_output.output` as a plain
string still deserialize correctly, and older histories that used the
previous injected-image-message flow also continue to resume.

Added regression coverage for resuming a pre-change rollout containing:
- string-valued `custom_tool_call_output`
- legacy injected image message history


#### [git stack](https://github.com/magus/git-stack-cli)
- 👉 `1` https://github.com/openai/codex/pull/12948
This commit is contained in:
Curtis 'Fjord' Hawthorne
2026-02-26 18:17:46 -08:00
committed by GitHub
parent f90e97e414
commit 7e980d7db6
20 changed files with 688 additions and 177 deletions

View File

@@ -23,6 +23,7 @@ use wiremock::MockServer;
use crate::load_default_config_for_test;
use crate::responses::WebSocketTestServer;
use crate::responses::output_value_to_text;
use crate::responses::start_mock_server;
use crate::streaming_sse::StreamingSseServer;
use crate::wait_for_event;
@@ -394,11 +395,7 @@ impl TestCodexHarness {
pub async fn custom_tool_call_output(&self, call_id: &str) -> String {
let bodies = self.request_bodies().await;
custom_tool_call_output(&bodies, call_id)
.get("output")
.and_then(Value::as_str)
.expect("output string")
.to_string()
custom_tool_call_output_text(&bodies, call_id)
}
pub async fn apply_patch_output(
@@ -433,6 +430,14 @@ fn custom_tool_call_output<'a>(bodies: &'a [Value], call_id: &str) -> &'a Value
panic!("custom_tool_call_output {call_id} not found");
}
fn custom_tool_call_output_text(bodies: &[Value], call_id: &str) -> String {
let output = custom_tool_call_output(bodies, call_id)
.get("output")
.unwrap_or_else(|| panic!("custom_tool_call_output {call_id} missing output"));
output_value_to_text(output)
.unwrap_or_else(|| panic!("custom_tool_call_output {call_id} missing text output"))
}
fn function_call_output<'a>(bodies: &'a [Value], call_id: &str) -> &'a Value {
for body in bodies {
if let Some(items) = body.get("input").and_then(Value::as_array) {
@@ -456,3 +461,36 @@ pub fn test_codex() -> TestCodexBuilder {
home: None,
}
}
#[cfg(test)]
mod tests {
use super::*;
use pretty_assertions::assert_eq;
use serde_json::json;
#[test]
fn custom_tool_call_output_text_returns_output_text() {
let bodies = vec![json!({
"input": [{
"type": "custom_tool_call_output",
"call_id": "call-1",
"output": "hello"
}]
})];
assert_eq!(custom_tool_call_output_text(&bodies, "call-1"), "hello");
}
#[test]
#[should_panic(expected = "custom_tool_call_output call-2 missing output")]
fn custom_tool_call_output_text_panics_when_output_is_missing() {
let bodies = vec![json!({
"input": [{
"type": "custom_tool_call_output",
"call_id": "call-2"
}]
})];
let _ = custom_tool_call_output_text(&bodies, "call-2");
}
}