Code mode on v8 (#15276)

Moves Code Mode to a new crate with no dependencies on codex. This
create encodes the code mode semantics that we want for lifetime,
mounting, tool calling.

The model-facing surface is mostly unchanged. `exec` still runs raw
JavaScript, `wait` still resumes or terminates a `cell_id`, nested tools
are still available through `tools.*`, and helpers like `text`, `image`,
`store`, `load`, `notify`, `yield_control`, and `exit` still exist.

The major change is underneath that surface:

- Old code mode was an external Node runtime.
- New code mode is an in-process V8 runtime embedded directly in Rust.
- Old code mode managed cells inside a long-lived Node runner process.
- New code mode manages cells in Rust, with one V8 runtime thread per
active `exec`.
- Old code mode used JSON protocol messages over child stdin/stdout plus
Node worker-thread messages.
- New code mode uses Rust channels and direct V8 callbacks/events.

This PR also fixes the two migration regressions that fell out of that
substrate change:

- `wait { terminate: true }` now waits for the V8 runtime to actually
stop before reporting termination.
- synchronous top-level `exit()` now succeeds again instead of surfacing
as a script error.

---

- `core/src/tools/code_mode/*` is now mostly an adapter layer for the
public `exec` / `wait` tools.
- `code-mode/src/service.rs` owns cell sessions and async control flow
in Rust.
- `code-mode/src/runtime/*.rs` owns the embedded V8 isolate and
JavaScript execution.
- each `exec` spawns a dedicated runtime thread plus a Rust
session-control task.
- helper globals are installed directly into the V8 context instead of
being injected through a source prelude.
- helper modules like `tools.js` and `@openai/code_mode` are synthesized
through V8 module resolution callbacks in Rust.

---

Also added a benchmark for showing the speed of init and use of a code
mode env:
```
$ cargo bench -p codex-code-mode --bench exec_overhead -- --samples 30 --warm-iterations 25 --tool-counts 0,32,128
Finished [`bench` profile [optimized]](https://doc.rust-lang.org/cargo/reference/profiles.html#default-profiles) target(s) in 0.18s
     Running benches/exec_overhead.rs (target/release/deps/exec_overhead-008c440d800545ae)
exec_overhead: samples=30, warm_iterations=25, tool_counts=[0, 32, 128]
scenario       tools samples    warmups      iters      mean/exec       p95/exec       rssΔ p50       rssΔ max
cold_exec          0      30          0          1         1.13ms         1.20ms        8.05MiB        8.06MiB
warm_exec          0      30          1         25       473.43us       512.49us      912.00KiB        1.33MiB
cold_exec         32      30          0          1         1.03ms         1.15ms        8.08MiB        8.11MiB
warm_exec         32      30          1         25       509.73us       545.76us      960.00KiB        1.30MiB
cold_exec        128      30          0          1         1.14ms         1.19ms        8.30MiB        8.34MiB
warm_exec        128      30          1         25       575.08us       591.03us      736.00KiB      864.00KiB
memory uses a fresh-process max RSS delta for each scenario
```

---------

Co-authored-by: Codex <noreply@openai.com>
This commit is contained in:
Channing Conger
2026-03-20 23:36:58 -07:00
committed by GitHub
parent ec32866c37
commit e4eedd6170
31 changed files with 2730 additions and 2265 deletions

View File

@@ -0,0 +1,555 @@
use serde::Deserialize;
use serde::Serialize;
use serde_json::Value as JsonValue;
use crate::PUBLIC_TOOL_NAME;
const MAX_JS_SAFE_INTEGER: u64 = (1_u64 << 53) - 1;
const CODE_MODE_ONLY_PREFACE: &str =
"Use `exec/wait` tool to run all other tools, do not attempt to use any other tools directly";
const EXEC_DESCRIPTION_TEMPLATE: &str = r#"## exec
- Runs raw JavaScript in an isolated context (no Node, no file system, or network access, no console).
- Send raw JavaScript source text, not JSON, quoted strings, or markdown code fences.
- You may optionally start the tool input with a first-line pragma like `// @exec: {"yield_time_ms": 10000, "max_output_tokens": 1000}`.
- `yield_time_ms` asks `exec` to yield early after that many milliseconds if the script is still running.
- `max_output_tokens` sets the token budget for direct `exec` results. By default the result is truncated to 10000 tokens.
- All nested tools are available on the global `tools` object, for example `await tools.exec_command(...)`. Tool names are exposed as normalized JavaScript identifiers, for example `await tools.mcp__ologs__get_profile(...)`.
- Tool methods take either string or object as parameter.
- They return either a structured value or a string based on the description above.
- Global helpers:
- `exit()`: Immediately ends the current script successfully (like an early return from the top level).
- `text(value: string | number | boolean | undefined | null)`: Appends a text item. Non-string values are stringified with `JSON.stringify(...)` when possible.
- `image(imageUrlOrItem: string | { image_url: string; detail?: "auto" | "low" | "high" | "original" | null })`: Appends an image item. `image_url` can be an HTTPS URL or a base64-encoded `data:` URL.
- `store(key: string, value: any)`: stores a serializable value under a string key for later `exec` calls in the same session.
- `load(key: string)`: returns the stored value for a string key, or `undefined` if it is missing.
- `notify(value: string | number | boolean | undefined | null)`: immediately injects an extra `custom_tool_call_output` for the current `exec` call. Values are stringified like `text(...)`.
- `ALL_TOOLS`: metadata for the enabled nested tools as `{ name, description }` entries.
- `yield_control()`: yields the accumulated output to the model immediately while the script keeps running."#;
const WAIT_DESCRIPTION_TEMPLATE: &str = r#"- Use `wait` only after `exec` returns `Script running with cell ID ...`.
- `cell_id` identifies the running `exec` cell to resume.
- `yield_time_ms` controls how long to wait for more output before yielding again. If omitted, `wait` uses its default wait timeout.
- `max_tokens` limits how much new output this wait call returns.
- `terminate: true` stops the running cell instead of waiting for more output.
- `wait` returns only the new output since the last yield, or the final completion or termination result for that cell.
- If the cell is still running, `wait` may yield again with the same `cell_id`.
- If the cell has already finished, `wait` returns the completed result and closes the cell."#;
pub const CODE_MODE_PRAGMA_PREFIX: &str = "// @exec:";
#[derive(Clone, Copy, Debug, Deserialize, Eq, PartialEq, Serialize)]
#[serde(rename_all = "snake_case")]
pub enum CodeModeToolKind {
Function,
Freeform,
}
#[derive(Clone, Debug, PartialEq)]
pub struct ToolDefinition {
pub name: String,
pub description: String,
pub kind: CodeModeToolKind,
pub input_schema: Option<JsonValue>,
pub output_schema: Option<JsonValue>,
}
#[derive(Debug, Default, Deserialize, PartialEq, Eq)]
#[serde(deny_unknown_fields)]
struct CodeModeExecPragma {
#[serde(default)]
yield_time_ms: Option<u64>,
#[serde(default)]
max_output_tokens: Option<usize>,
}
#[derive(Debug, PartialEq, Eq)]
pub struct ParsedExecSource {
pub code: String,
pub yield_time_ms: Option<u64>,
pub max_output_tokens: Option<usize>,
}
pub fn parse_exec_source(input: &str) -> Result<ParsedExecSource, String> {
if input.trim().is_empty() {
return Err(
"exec expects raw JavaScript source text (non-empty). Provide JS only, optionally with first-line `// @exec: {\"yield_time_ms\": 10000, \"max_output_tokens\": 1000}`.".to_string(),
);
}
let mut args = ParsedExecSource {
code: input.to_string(),
yield_time_ms: None,
max_output_tokens: None,
};
let mut lines = input.splitn(2, '\n');
let first_line = lines.next().unwrap_or_default();
let rest = lines.next().unwrap_or_default();
let trimmed = first_line.trim_start();
let Some(pragma) = trimmed.strip_prefix(CODE_MODE_PRAGMA_PREFIX) else {
return Ok(args);
};
if rest.trim().is_empty() {
return Err(
"exec pragma must be followed by JavaScript source on subsequent lines".to_string(),
);
}
let directive = pragma.trim();
if directive.is_empty() {
return Err(
"exec pragma must be a JSON object with supported fields `yield_time_ms` and `max_output_tokens`"
.to_string(),
);
}
let value: serde_json::Value = serde_json::from_str(directive).map_err(|err| {
format!(
"exec pragma must be valid JSON with supported fields `yield_time_ms` and `max_output_tokens`: {err}"
)
})?;
let object = value.as_object().ok_or_else(|| {
"exec pragma must be a JSON object with supported fields `yield_time_ms` and `max_output_tokens`"
.to_string()
})?;
for key in object.keys() {
match key.as_str() {
"yield_time_ms" | "max_output_tokens" => {}
_ => {
return Err(format!(
"exec pragma only supports `yield_time_ms` and `max_output_tokens`; got `{key}`"
));
}
}
}
let pragma: CodeModeExecPragma = serde_json::from_value(value).map_err(|err| {
format!(
"exec pragma fields `yield_time_ms` and `max_output_tokens` must be non-negative safe integers: {err}"
)
})?;
if pragma
.yield_time_ms
.is_some_and(|yield_time_ms| yield_time_ms > MAX_JS_SAFE_INTEGER)
{
return Err(
"exec pragma field `yield_time_ms` must be a non-negative safe integer".to_string(),
);
}
if pragma.max_output_tokens.is_some_and(|max_output_tokens| {
u64::try_from(max_output_tokens)
.map(|max_output_tokens| max_output_tokens > MAX_JS_SAFE_INTEGER)
.unwrap_or(true)
}) {
return Err(
"exec pragma field `max_output_tokens` must be a non-negative safe integer".to_string(),
);
}
args.code = rest.to_string();
args.yield_time_ms = pragma.yield_time_ms;
args.max_output_tokens = pragma.max_output_tokens;
Ok(args)
}
pub fn is_code_mode_nested_tool(tool_name: &str) -> bool {
tool_name != crate::PUBLIC_TOOL_NAME && tool_name != crate::WAIT_TOOL_NAME
}
pub fn build_exec_tool_description(
enabled_tools: &[(String, String)],
code_mode_only: bool,
) -> String {
if !code_mode_only {
return EXEC_DESCRIPTION_TEMPLATE.to_string();
}
let mut sections = vec![
CODE_MODE_ONLY_PREFACE.to_string(),
EXEC_DESCRIPTION_TEMPLATE.to_string(),
];
if !enabled_tools.is_empty() {
let nested_tool_reference = enabled_tools
.iter()
.map(|(name, nested_description)| {
let global_name = normalize_code_mode_identifier(name);
format!(
"### `{global_name}` (`{name}`)\n{}",
nested_description.trim()
)
})
.collect::<Vec<_>>()
.join("\n\n");
sections.push(nested_tool_reference);
}
sections.join("\n\n")
}
pub fn build_wait_tool_description() -> &'static str {
WAIT_DESCRIPTION_TEMPLATE
}
pub fn normalize_code_mode_identifier(tool_key: &str) -> String {
let mut identifier = String::new();
for (index, ch) in tool_key.chars().enumerate() {
let is_valid = if index == 0 {
ch == '_' || ch == '$' || ch.is_ascii_alphabetic()
} else {
ch == '_' || ch == '$' || ch.is_ascii_alphanumeric()
};
if is_valid {
identifier.push(ch);
} else {
identifier.push('_');
}
}
if identifier.is_empty() {
"_".to_string()
} else {
identifier
}
}
pub fn augment_tool_definition(mut definition: ToolDefinition) -> ToolDefinition {
if definition.name != PUBLIC_TOOL_NAME {
definition.description = append_code_mode_sample_for_definition(&definition);
}
definition
}
pub fn enabled_tool_metadata(definition: &ToolDefinition) -> EnabledToolMetadata {
EnabledToolMetadata {
tool_name: definition.name.clone(),
global_name: normalize_code_mode_identifier(&definition.name),
description: definition.description.clone(),
kind: definition.kind,
}
}
#[derive(Clone, Debug, Eq, PartialEq, Serialize)]
pub struct EnabledToolMetadata {
pub tool_name: String,
pub global_name: String,
pub description: String,
pub kind: CodeModeToolKind,
}
pub fn append_code_mode_sample(
description: &str,
tool_name: &str,
input_name: &str,
input_type: String,
output_type: String,
) -> String {
let declaration = format!(
"declare const tools: {{ {} }};",
render_code_mode_tool_declaration(tool_name, input_name, input_type, output_type)
);
format!("{description}\n\nexec tool declaration:\n```ts\n{declaration}\n```")
}
fn append_code_mode_sample_for_definition(definition: &ToolDefinition) -> String {
let input_name = match definition.kind {
CodeModeToolKind::Function => "args",
CodeModeToolKind::Freeform => "input",
};
let input_type = match definition.kind {
CodeModeToolKind::Function => definition
.input_schema
.as_ref()
.map(render_json_schema_to_typescript)
.unwrap_or_else(|| "unknown".to_string()),
CodeModeToolKind::Freeform => "string".to_string(),
};
let output_type = definition
.output_schema
.as_ref()
.map(render_json_schema_to_typescript)
.unwrap_or_else(|| "unknown".to_string());
append_code_mode_sample(
&definition.description,
&definition.name,
input_name,
input_type,
output_type,
)
}
fn render_code_mode_tool_declaration(
tool_name: &str,
input_name: &str,
input_type: String,
output_type: String,
) -> String {
let tool_name = normalize_code_mode_identifier(tool_name);
format!("{tool_name}({input_name}: {input_type}): Promise<{output_type}>;")
}
pub fn render_json_schema_to_typescript(schema: &JsonValue) -> String {
render_json_schema_to_typescript_inner(schema)
}
fn render_json_schema_to_typescript_inner(schema: &JsonValue) -> String {
match schema {
JsonValue::Bool(true) => "unknown".to_string(),
JsonValue::Bool(false) => "never".to_string(),
JsonValue::Object(map) => {
if let Some(value) = map.get("const") {
return render_json_schema_literal(value);
}
if let Some(values) = map.get("enum").and_then(JsonValue::as_array) {
let rendered = values
.iter()
.map(render_json_schema_literal)
.collect::<Vec<_>>();
if !rendered.is_empty() {
return rendered.join(" | ");
}
}
for key in ["anyOf", "oneOf"] {
if let Some(variants) = map.get(key).and_then(JsonValue::as_array) {
let rendered = variants
.iter()
.map(render_json_schema_to_typescript_inner)
.collect::<Vec<_>>();
if !rendered.is_empty() {
return rendered.join(" | ");
}
}
}
if let Some(variants) = map.get("allOf").and_then(JsonValue::as_array) {
let rendered = variants
.iter()
.map(render_json_schema_to_typescript_inner)
.collect::<Vec<_>>();
if !rendered.is_empty() {
return rendered.join(" & ");
}
}
if let Some(schema_type) = map.get("type") {
if let Some(types) = schema_type.as_array() {
let rendered = types
.iter()
.filter_map(JsonValue::as_str)
.map(|schema_type| render_json_schema_type_keyword(map, schema_type))
.collect::<Vec<_>>();
if !rendered.is_empty() {
return rendered.join(" | ");
}
}
if let Some(schema_type) = schema_type.as_str() {
return render_json_schema_type_keyword(map, schema_type);
}
}
if map.contains_key("properties")
|| map.contains_key("additionalProperties")
|| map.contains_key("required")
{
return render_json_schema_object(map);
}
if map.contains_key("items") || map.contains_key("prefixItems") {
return render_json_schema_array(map);
}
"unknown".to_string()
}
_ => "unknown".to_string(),
}
}
fn render_json_schema_type_keyword(
map: &serde_json::Map<String, JsonValue>,
schema_type: &str,
) -> String {
match schema_type {
"string" => "string".to_string(),
"number" | "integer" => "number".to_string(),
"boolean" => "boolean".to_string(),
"null" => "null".to_string(),
"array" => render_json_schema_array(map),
"object" => render_json_schema_object(map),
_ => "unknown".to_string(),
}
}
fn render_json_schema_array(map: &serde_json::Map<String, JsonValue>) -> String {
if let Some(items) = map.get("items") {
let item_type = render_json_schema_to_typescript_inner(items);
return format!("Array<{item_type}>");
}
if let Some(items) = map.get("prefixItems").and_then(JsonValue::as_array) {
let item_types = items
.iter()
.map(render_json_schema_to_typescript_inner)
.collect::<Vec<_>>();
if !item_types.is_empty() {
return format!("[{}]", item_types.join(", "));
}
}
"unknown[]".to_string()
}
fn render_json_schema_object(map: &serde_json::Map<String, JsonValue>) -> String {
let required = map
.get("required")
.and_then(JsonValue::as_array)
.map(|items| {
items
.iter()
.filter_map(JsonValue::as_str)
.collect::<Vec<_>>()
})
.unwrap_or_default();
let properties = map
.get("properties")
.and_then(JsonValue::as_object)
.cloned()
.unwrap_or_default();
let mut sorted_properties = properties.iter().collect::<Vec<_>>();
sorted_properties.sort_unstable_by(|(name_a, _), (name_b, _)| name_a.cmp(name_b));
let mut lines = sorted_properties
.into_iter()
.map(|(name, value)| {
let optional = if required.iter().any(|required_name| required_name == name) {
""
} else {
"?"
};
let property_name = render_json_schema_property_name(name);
let property_type = render_json_schema_to_typescript_inner(value);
format!("{property_name}{optional}: {property_type};")
})
.collect::<Vec<_>>();
if let Some(additional_properties) = map.get("additionalProperties") {
let property_type = match additional_properties {
JsonValue::Bool(true) => Some("unknown".to_string()),
JsonValue::Bool(false) => None,
value => Some(render_json_schema_to_typescript_inner(value)),
};
if let Some(property_type) = property_type {
lines.push(format!("[key: string]: {property_type};"));
}
} else if properties.is_empty() {
lines.push("[key: string]: unknown;".to_string());
}
if lines.is_empty() {
return "{}".to_string();
}
format!("{{ {} }}", lines.join(" "))
}
fn render_json_schema_property_name(name: &str) -> String {
if normalize_code_mode_identifier(name) == name {
name.to_string()
} else {
serde_json::to_string(name).unwrap_or_else(|_| format!("\"{}\"", name.replace('"', "\\\"")))
}
}
fn render_json_schema_literal(value: &JsonValue) -> String {
serde_json::to_string(value).unwrap_or_else(|_| "unknown".to_string())
}
#[cfg(test)]
mod tests {
use super::CodeModeToolKind;
use super::ParsedExecSource;
use super::ToolDefinition;
use super::augment_tool_definition;
use super::build_exec_tool_description;
use super::normalize_code_mode_identifier;
use super::parse_exec_source;
use pretty_assertions::assert_eq;
use serde_json::json;
#[test]
fn parse_exec_source_without_pragma() {
assert_eq!(
parse_exec_source("text('hi')").unwrap(),
ParsedExecSource {
code: "text('hi')".to_string(),
yield_time_ms: None,
max_output_tokens: None,
}
);
}
#[test]
fn parse_exec_source_with_pragma() {
assert_eq!(
parse_exec_source("// @exec: {\"yield_time_ms\": 10}\ntext('hi')").unwrap(),
ParsedExecSource {
code: "text('hi')".to_string(),
yield_time_ms: Some(10),
max_output_tokens: None,
}
);
}
#[test]
fn normalize_identifier_rewrites_invalid_characters() {
assert_eq!(
"mcp__ologs__get_profile",
normalize_code_mode_identifier("mcp__ologs__get_profile")
);
assert_eq!(
"hidden_dynamic_tool",
normalize_code_mode_identifier("hidden-dynamic-tool")
);
}
#[test]
fn augment_tool_definition_appends_typed_declaration() {
let definition = ToolDefinition {
name: "hidden_dynamic_tool".to_string(),
description: "Test tool".to_string(),
kind: CodeModeToolKind::Function,
input_schema: Some(json!({
"type": "object",
"properties": { "city": { "type": "string" } },
"required": ["city"],
"additionalProperties": false
})),
output_schema: Some(json!({
"type": "object",
"properties": { "ok": { "type": "boolean" } },
"required": ["ok"]
})),
};
let description = augment_tool_definition(definition).description;
assert!(description.contains("declare const tools"));
assert!(
description.contains(
"hidden_dynamic_tool(args: { city: string; }): Promise<{ ok: boolean; }>;"
)
);
}
#[test]
fn code_mode_only_description_includes_nested_tools() {
let description =
build_exec_tool_description(&[("foo".to_string(), "bar".to_string())], true);
assert!(description.contains("### `foo` (`foo`)"));
}
}