Commit Graph

87 Commits

Author SHA1 Message Date
Charlie Marsh
54ef99a365 Disable empty Cargo test targets (#21584)
## Summary

`cargo test` has entails both running standard Rust tests and doctests.
It turns out that the doctest discovery is fairly slow, and it's a cost
you pay even for crates that don't include any doctests.

This PR disables doctests with `doctest = false` for crates that lack
any doctests.

For the collection of crates below, this speeds up test execution by
>4x.

E.g., before this PR:

```
Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
  Time (mean ± σ):      1.849 s ±  4.455 s    [User: 0.752 s, System: 1.367 s]
  Range (min … max):    0.418 s … 14.529 s    10 runs
```

And after:

```
Benchmark 1: cargo test     -p codex-utils-absolute-path     -p codex-utils-cache     -p codex-utils-cli     -p codex-utils-home-dir     -p codex-utils-output-truncation     -p codex-utils-path     -p codex-utils-string     -p codex-utils-template     -p codex-utils-elapsed     -p codex-utils-json-to-toml
  Time (mean ± σ):     428.6 ms ±   6.9 ms    [User: 187.7 ms, System: 219.7 ms]
  Range (min … max):   418.0 ms … 436.8 ms    10 runs
```

For a single crate, with >2x speedup, before:

```
Benchmark 1: cargo test -p codex-utils-string
  Time (mean ± σ):     491.1 ms ±   9.0 ms    [User: 229.8 ms, System: 234.9 ms]
  Range (min … max):   480.9 ms … 512.0 ms    10 runs
```

And after:

```
Benchmark 1: cargo test -p codex-utils-string
  Time (mean ± σ):     213.9 ms ±   4.3 ms    [User: 112.8 ms, System: 84.0 ms]
  Range (min … max):   206.8 ms … 221.0 ms    13 runs
```

Co-authored-by: Codex <noreply@openai.com>
2026-05-07 15:44:17 -07:00
pakrym-oai
9417cf9696 [codex] Move tool specs into core handlers (#21416)
## Why

This is the first mechanical slice of moving tool spec ownership toward
the handlers. `codex-tools` should keep shared primitives and conversion
helpers, while builtin tool specs and registration planning live in
`codex-core` with the handlers that own those tools.

Keeping this PR to relocation and import updates isolates the copy/move
review from the later logic change that wires specs through registered
handlers.

## What changed

- Moved builtin tool spec constructors from `codex-rs/tools/src` into
`codex-rs/core/src/tools/handlers/*_spec.rs` or nearby core tool
modules.
- Moved the registry planning code into
`codex-rs/core/src/tools/spec_plan.rs` and its associated types/tests
into core.
- Kept shared primitives in `codex-tools`, including `ToolSpec`,
schema/types, discovery/config primitives, dynamic/MCP conversion
helpers, and code-mode collection helpers.
- Updated handlers that referenced moved argument types or tool-name
constants to use the core spec modules.
- Moved spec tests next to the moved spec modules.

## Verification

- `cargo check -p codex-tools`
- `cargo check -p codex-core`
- `cargo test -p codex-tools`
- `cargo test -p codex-core _spec::tests`
- `cargo test -p codex-core tools::spec_plan::tests`
- `just fix -p codex-tools`
- `just fix -p codex-core`

Note: I also tried the broader `cargo test -p codex-core tools::`; it
reached the moved spec-plan/spec tests successfully, then aborted with a
stack overflow in
`tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`,
which is outside this spec relocation.
2026-05-06 15:40:50 -07:00
pakrym-oai
f593323ef1 [codex] Split tool handlers by tool name (#20687)
## Why

Tool registration used to bind a tool name to a handler externally,
which left ownership split between the registry plan and the handler
implementation. Some built-in handlers also multiplexed multiple in-core
tools by switching on the invoked tool name internally.

This moves the registry identity onto the handler itself and makes
built-in multi-tool areas use separate concrete handlers, so each
registered handler instance owns exactly one tool name and one dispatch
path.

## What Changed

- Added `ToolHandler::tool_name()` and changed
`ToolRegistryBuilder::register_handler` to derive the registry key from
the handler.
- Split built-in multiplexed handlers into concrete per-tool handlers
for unified exec, shell/local shell/container exec, MCP resources, goal
tools, and agent job tools.
- Kept name-carrying handler instances only where the runtime target is
inherently external or dynamic, such as MCP tools, dynamic tools, and
unavailable placeholders.
- Updated `ToolHandlerKind` and registry-plan construction so plan
entries map directly to concrete handler registrations.

## Verification

- `cargo test -p codex-tools tool_registry_plan`
- `cargo test -p codex-core --lib tools::registry_tests`
- `just fix -p codex-tools`
- `just fix -p codex-core`
2026-05-05 13:46:45 -07:00
starr-openai
78421face0 Route process tools to selected environments (#20647)
## Why
When a turn exposes multiple selected environments, shell-style tools
need a model-facing way to identify the intended target environment and
handlers need to resolve that target before parsing cwd-relative
permission fields or launching processes.

This PR scopes that rollout to process tools. Filesystem-oriented tools
such as `apply_patch`, `view_image`, and `list_dir` are intentionally
left for follow-up slices.

## What Changed
- Adds an `include_environment_id` option to shell-style tool schema
builders.
- Exposes optional `environment_id` on `shell`, `shell_command`, and
`exec_command` only when `ToolEnvironmentMode::Multiple` is active.
- Adds a shared handler helper that parses `environment_id` and
`workdir` from JSON function-call arguments and returns the selected
`Environment` plus effective absolute cwd.
- Uses that helper in `shell`, `shell_command`, and `exec_command`
handling so process execution uses the selected environment filesystem
and cwd.
- Changes `ExecCommandRequest` to carry a required resolved `cwd`,
removing the process-manager fallback to the primary turn cwd for new
exec commands.
- Leaves `write_stdin` unchanged because it targets an existing process
id, not a new environment.

## Testing
- Added unit coverage for process-tool schema exposure, selected
environment resolution, primary fallback, no-environment handling,
unknown environment ids, and resolving cwd-relative permission paths
against the selected environment cwd.
- Added a remote-suite e2e coverage case for `exec_command` routing
across explicit zero environments, one local environment, and
local+remote environments.
- Ran `just fmt` and `git diff --check`.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-05 12:12:03 -07:00
jif-oai
70807730f5 tools: remove unused experimental list_dir tool (#21170)
## Why
`list_dir` still carries a full spec/handler/test path, but nothing in
the current model catalog advertises it via
`experimental_supported_tools`. That leaves us maintaining an
environment-backed tool surface that is effectively unused.

## What changed
- delete the `list_dir` handler and its tests from `codex-core`
- remove the `list_dir` spec builder, handler kind, and registry wiring
from `codex-tools`
- clean up the remaining internal README and registry tests so they no
longer mention the removed tool
2026-05-05 13:11:07 +02:00
Ahmed Ibrahim
9d579813bb 1- Add model service tiers metadata (#20969)
## Why

The model list needs to carry display-ready service tier metadata so
clients can render tier choices with stable IDs, names, and
descriptions. A raw speed-tier string list is not enough for richer UI
copy or future tier labels.

## What changed

- Added `ModelServiceTier` to shared model metadata with string `id`,
`name`, and `description` fields.
- Added `service_tiers` to `ModelInfo` and `ModelPreset`, preserving
empty defaults for older cached model payloads.
- Exposed `serviceTiers` on app-server v2 `Model` responses and threaded
it through TUI app-server model conversion.
- Marked legacy `additional_speed_tiers` / `additionalSpeedTiers`
metadata as deprecated in source and generated schema output.
- Regenerated app-server protocol JSON schema and TypeScript fixtures,
including `ModelServiceTier.ts`.

## Verification

- Ran `just write-app-server-schema`.
- Did not run local tests per repo instruction; relying on PR CI.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-05 09:51:18 +03:00
sayan-oai
b9e8df47da Use MCP server instructions in deferred namespace descriptions (#21053)
## Why

MCP servers can provide `instructions` that explain what their tools are
for. Directly exposed MCP namespaces already use those instructions when
a connector description is not available, but deferred `tool_search`
results did not preserve that fallback. The direct path falls back from
connector metadata to server instructions, while the deferred path only
carried `connector_description` and otherwise fell back to generic
namespace text.

That meant a plain MCP server could provide useful model-facing guidance
and still appear as `Tools in the X namespace.` whenever it was
discovered lazily through `tool_search`.

## What changed

- Store one model-facing `namespace_description` on `ToolInfo`, using
connector descriptions for connector-backed tools and server
instructions for plain MCP servers.
- Thread that namespace description through the `tool_search` source
list, search indexing, and returned namespace metadata.
- Add an end-to-end regression test for deferred non-app MCP search
results exposing server instructions as the namespace description.

## Verification

- `cargo test -p codex-tools
search_tool_description_lists_each_mcp_source_once --lib`
- `cargo test -p codex-core --test all
tool_search_uses_non_app_mcp_server_instructions_as_namespace_description`
2026-05-04 19:36:07 +00:00
starr-openai
905987c08f Prepare selected environment plumbing (#20669)
## Why
This is a prep PR in the multi-environment process-tool stack. It
separates ownership/config cleanup from the behavior change that teaches
process tools to route by selected environment, so the follow-up PR can
focus on model-facing `environment_id` behavior.

## Stack
1. https://github.com/openai/codex/pull/20646 - `EnvironmentContext`
rendering for selected environments
2. https://github.com/openai/codex/pull/20669 - selected-environment
ownership and tool config prep (this PR)
3. https://github.com/openai/codex/pull/20647 - process-tool
`environment_id` routing

## What Changed
- keep the resolved turn environment list wrapped in
`ResolvedTurnEnvironments` through `TurnContext` instead of unwrapping
it back to a raw `Vec`
- add `TurnContext::resolve_path_against` so cwd-relative path
resolution has one shared helper
- replace the old tool config boolean with `ToolEnvironmentMode::{None,
Single, Multiple}`

## Testing
- Tests not run locally; this prep refactor is covered by GitHub CI for
the stack.

Co-authored-by: Codex <noreply@openai.com>
2026-05-04 17:55:49 +00:00
Matthew Zeng
f88701f5c8 [tool_suggest] More prompt polishes. (#20566)
Tool suggest still misfires when model needs tool_search, updating the
prompts to further disambiguate it:

- [x] rename it from `tool_suggest` to `request_plugin_install`
- [x] rephrase "suggestion" to "install" in the tool descriptions.
- [x] disambiguate "the tool" vs "the plugin/connector". 

Tested with the Codex App and verified it still works.
2026-05-02 04:22:12 +00:00
jif-oai
c37f7434ba Gate multi-agent v2 tools independently of collab (#20246)
## Why

`multi_agents_v2` is meant to be independently gated from the older
`collab` feature. The tool registry still treated the
collaboration-style agent tools as `collab`-only, so enabling
`multi_agents_v2` without `collab` omitted the v2 agent tools. Review
and guardian sub-sessions also need to keep agent spawning disabled even
when the outer session has `multi_agents_v2` enabled.

## What changed

- Include the collab-backed agent tools when either `multi_agents_v2` or
`collab` is enabled.
- Explicitly disable `multi_agents_v2` for review and guardian review
sub-sessions, matching the existing `spawn_csv` and `collab`
restrictions.
- Add a registry test that enables `multi_agents_v2`, disables `collab`,
and verifies the v2 agent tools are present while legacy `send_input`
and `resume_agent` remain hidden.

## Testing

- Added
`test_build_specs_multi_agent_v2_does_not_require_collab_feature`.
2026-04-30 10:23:31 +02:00
pakrym-oai
fedcefe9da Reduce the surface of collaboration modes (#20149)
Collaboration modes were slightly invasive both into ThreadManager
construction and ModelProvider
2026-04-29 17:22:41 -07:00
iceweasel-oai
13dbcda28f stop blocking unified_exec on Windows (#19435)
## Summary
- remove the Windows-specific unified-exec environment block from tool
selection
- keep `unified_exec` default-off on Windows unless the feature is
explicitly enabled
- normalize model-provided `shell_type = unified_exec` to
`shell_command` when the feature is disabled
- drop obsolete tests tied to the removed environment gate and keep the
feature-flag regression coverage

## Why
Now that the session/long-lived process backend is implemented for the
Windows sandbox, we don't need to hard disable it anymore. We will be
rolling out slowly using a feature gate.

## Impact
This allows manual Windows opt-in in CLI and app-backed flows while
preserving the existing default-off behavior for Windows users.

---------

Co-authored-by: canvrno-oai <kbond@openai.com>
Co-authored-by: Codex <noreply@openai.com>
2026-04-29 16:06:33 -07:00
Matthew Zeng
8ce48f9968 [tool_suggest] Improve tool_suggest triggering conditions. (#20091)
## Summary
- Tighten `tool_suggest` guidance so it prefers explicit plugin install
requests, while still allowing a connector install when the relevant
plugin is already installed and a needed connector from that plugin is
missing.
- Tell the model not to call `tool_suggest` in parallel with other
tools.

## Testing
- `cargo test -p codex-tools tool_suggest`
- `cargo test -p codex-core tool_suggest`
2026-04-29 13:41:12 -07:00
Andrey Mishchenko
857146b328 Delete multi_agent_v2 followup_task interrupt parameter (#20139)
Messages sent with `followup_task` already arrive at their target
recipient promptly (at message boundaries while sampling, or after the
pending tool call completes) -- having `interrupt` is not worth the
added complexity.
2026-04-28 23:19:48 -07:00
Celia Chen
f8fe96d548 feat: disable capabilities by model provider (#19442)
## Why

Unsupported features must fail closed and Codex must not expose
OpenAI-hosted fallback paths when the active provider cannot support
them. In practice, Bedrock should not surface app connectors, MCP
servers, tool search/suggestions, image generation, web search, or JS
REPL until those paths are explicitly supported for that provider.

This PR moves that decision into provider-owned capability metadata
instead of scattering Bedrock-specific checks across callers.

## What changed

- Adds `ProviderCapabilities` to `codex-model-provider`, with default
support for existing providers and a Bedrock override that disables
unsupported launch surfaces.
- Adds `ToolCapabilityBounds` to `codex-tools` so provider capability
limits can clamp otherwise-enabled tool config.
- Applies capability bounds when building session and review-thread tool
config.
- Routes MCP/app connector configuration through
`McpManager::mcp_config`, which filters configured MCP servers and app
connectors based on the active provider.
- Updates app-server MCP list/read paths to use the filtered MCP config.
- Adds coverage for default provider capabilities, Bedrock disabled
capabilities, and optional tool-surface clamping.

## Testing

built locally and verified that bedrock responses api now return without
errors calling unsupported tools.
2026-04-28 17:51:30 -07:00
Matthew Zeng
ebdf3a878c Support disabling tool suggest for specific tools. (#20072)
## Summary
- Add `disable_tool_suggest` to app and plugin config, schema, and
TypeScript output
- Exclude disabled connectors and plugins from tool suggestion discovery
- Persist "never show again" tool-suggestion choices back into
`config.toml`
- Update config docs and add coverage for connector and plugin
suppression

## Testing
- Added and updated unit tests for config persistence and tool-suggest
filtering
- Not run (not requested)
2026-04-29 00:19:34 +00:00
jif-oai
34d71d43eb Make MultiAgentV2 wait minimum configurable (#20052)
## Why

MultiAgentV2 `wait_agent` currently clamps short waits to a fixed 10
second minimum. That default is still useful for preventing tight
polling loops, but it is too rigid for environments that need faster
mailbox wake-up checks or a larger minimum to discourage frequent
polling.

This PR makes the minimum wait timeout configurable from the existing
MultiAgentV2 feature config section, so operators can tune the behavior
without changing the legacy multi-agent tool surface.

## What Changed

- Added `features.multi_agent_v2.min_wait_timeout_ms`.
- Defaulted the new setting to the existing 10 second floor.
- Validated the configured value as `1..=3600000`, matching the existing
one hour maximum wait bound.
- Applied the configured minimum to MultiAgentV2 `wait_agent` runtime
clamping.
- Plumbed the configured minimum into the `wait_agent` tool schema,
including the effective default when the minimum is above the normal 30
second default.
- Regenerated `core/config.schema.json`.

## Verification

- `cargo test -p codex-features`
- `cargo test -p codex-tools`
- `cargo test -p codex-core --lib multi_agent_v2`
- `just fix -p codex-core`
2026-04-28 22:36:44 +02:00
Michael Bolin
4d7ce3447d permissions: make runtime config profile-backed (#19606)
## Why

This supersedes #19391. During stack repair, GitHub marked #19391 as
merged into a temporary stack branch rather than into `main`, so the
runtime-config change needed a fresh PR.

`PermissionProfile` is now the canonical permissions shape after #19231
because it can distinguish `Managed`, `Disabled`, and `External`
enforcement while also carrying filesystem rules that legacy
`SandboxPolicy` cannot represent cleanly. Core config and session state
still needed to accept profile-backed permissions without forcing every
profile through the strict legacy bridge, which rejected valid runtime
profiles such as direct write roots.

The unrelated CI/test hardening that previously rode along with this PR
has been split into #19683 so this PR stays focused on the permissions
model migration.

## What Changed

- Adds `Permissions.permission_profile` and
`SessionConfiguration.permission_profile` as constrained runtime state,
while keeping `sandbox_policy` as a legacy compatibility projection.
- Introduces profile setters that keep `PermissionProfile`, split
filesystem/network policies, and legacy `SandboxPolicy` projections
synchronized.
- Uses a compatibility projection for requirement checks and legacy
consumers instead of rejecting profiles that cannot round-trip through
`SandboxPolicy` exactly.
- Updates config loading, config overrides, session updates, turn
context plumbing, prompt permission text, sandbox tags, and exec request
construction to carry profile-backed runtime permissions.
- Preserves configured deny-read entries and `glob_scan_max_depth` when
command/session profiles are narrowed.
- Adds `PermissionProfile::read_only()` and
`PermissionProfile::workspace_write()` presets that match legacy
defaults.

## Verification

- `cargo test -p codex-core direct_write_roots`
- `cargo test -p codex-core runtime_roots_to_legacy_projection`
- `cargo test -p codex-app-server
requested_permissions_trust_project_uses_permission_profile_intent`




---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19606).
* #19395
* #19394
* #19393
* #19392
* __->__ #19606
2026-04-26 13:29:54 -07:00
Eric Traut
32ace07ac5 Add goal model tools (3 / 5) (#18075)
Adds the model-facing goal tools on top of the app-server API from PR 2.

## Why

Once goals are persisted and exposed to clients, the model needs a
small, constrained tool surface for goal workflows. The tool contract
should let the model inspect goals, create them only when explicitly
requested, and mark them complete without giving it broad control over
user/runtime-owned state.

## What changed

- Added `get_goal`, `create_goal`, and `update_goal` tool specs behind
the `goals` feature flag.
- Added core goal tool handlers that validate objectives and token
budgets before mutating persisted state.
- Constrained `create_goal` to create only when no goal exists, with
optional `token_budget` only when a budget is explicitly provided.
- Tightened the `create_goal` instructions so the model does not infer
goals from ordinary task requests.
- Constrained `update_goal` to expose only goal completion; pause,
resume, clear, and budget-limited transitions remain user- or
runtime-controlled.
- Registered the goal tools in the tool registry and kept them out of
review contexts where they should not appear.

## Verification

- Added tool-registry coverage for feature gating and tool availability.
- Added core session tests for create/get/update behavior, duplicate
goal rejection, budget validation, and completion-only updates.
2026-04-24 20:54:40 -07:00
Curtis 'Fjord' Hawthorne
8a559e7938 Remove js_repl feature (#19410) 2026-04-24 17:49:29 -07:00
jif-oai
deb4509302 feat: surface multi-agent thread limit in spawn description (#19360)
## Summary
- Thread `agent_max_threads` into `ToolsConfig` and
`SpawnAgentToolOptions`.
- Render the configured `max_concurrent_threads_per_session` value in
the MultiAgentV2 `spawn_agent` description.
- Cover the description text in `codex-tools` unit tests and
`codex-core` tool spec tests.

## Validation
- `just fmt`
- `cargo test -p codex-tools`
- `cargo test -p codex-core spawn_agent_description`
- `git diff --check`

## Notes
- `cargo test -p codex-core` was also attempted, but unrelated
environment-sensitive tests failed with the active local environment.
Examples: approvals reviewer defaults observed `AutoReview` instead of
`User`, request-permissions event tests did not emit events, and
proxy-env tests saw `http://127.0.0.1:50604` from the active proxy
environment.

Co-authored-by: Codex <noreply@openai.com>
2026-04-24 15:13:54 +02:00
iceweasel-oai
2e228969be guide Windows to use -WindowStyle Hidden for Start-Process calls (#19044)
Sometimes codex runs `Start-Process` to start up a service or something
similar, which launches a user-visible powershell window that probably
doesn't get cleaned up. This instruction change encourages it to do so
using a hidden window.

This was reported in
https://openai.slack.com/archives/C09K6H5DZC4/p1776741272870519

One caveat is that this change won't do anything to cleanup these
processes, but it will stop them from polluting the user's visible
workspace

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-23 21:36:17 +00:00
pash-openai
dc1a8f2190 [tool search] support namespaced deferred dynamic tools (#18413)
Deferred dynamic tools need to round-trip a namespace so a tool returned
by `tool_search` can be called through the same registry key that core
uses for dispatch.

This change adds namespace support for dynamic tool specs/calls,
persists it through app-server thread state, and routes dynamic tool
calls by full `ToolName` while still sending the app the leaf tool name.
Deferred dynamic tools must provide a namespace; non-deferred dynamic
tools may remain top-level.

It also introduces `LoadableToolSpec` as the shared
function-or-namespace Responses shape used by both `tool_search` output
and dynamic tool registration, so dynamic tools use the same wrapping
logic in both paths.

Validation:
- `cargo test -p codex-tools`
- `cargo test -p codex-core tool_search`

---------

Co-authored-by: Sayan Sisodiya <sayan@openai.com>
2026-04-21 14:13:08 +08:00
Thibault Sottiaux
54bd07d28c [codex] prefer inherited spawn agent model (#18701)
This updates the spawn-agent tool contract so subagents are presented as
inheriting the parent model by default. The visible model list is now
framed as optional overrides, the model parameter tells callers to leave
it unset and the delegation guidance no longer nudges models toward
picking a smaller/mini override.

Fixes reports that 5.4 would occasionally pick 5.2 or lower as
sub-agents.
2026-04-20 22:34:08 +00:00
pakrym-oai
53b1570367 Update image outputs to default to high detail (#18386)
Do not assume the default `detail`.
2026-04-18 11:01:12 -07:00
sayan-oai
6991be7ead enable tool search over dynamic tools (#18263)
## Summary

- Normalize deferred MCP and dynamic tools into `ToolSearchEntry` values
before constructing `ToolSearchHandler`.
- Move the tool-search entry adapter out of `tools/handlers` and into
`tools/tool_search_entry.rs` so the handlers directory stays focused on
handlers.
- Keep `ToolSearchHandler` operating over one generic entry list for
BM25 search, namespace grouping, and per-bucket default limits.

## Why

Follow-up cleanup for #17849. The dynamic tool-search support made the
handler juggle source-specific MCP and dynamic tool lists, index
arithmetic, output conversion, and namespace emission. This keeps source
adaptation outside the handler so the search loop itself is smaller and
source-agnostic.

## Validation

- `just fmt`
- `cargo test -p codex-core tools::handlers::tool_search::tests`
- `git diff --check`
- `cargo test -p codex-core` currently fails in unrelated
`plugins::manager::tests::list_marketplaces_ignores_installed_roots_missing_from_config`;
rerunning that single test fails the same way at
`core/src/plugins/manager_tests.rs:1692`.

---------

Co-authored-by: pash <pash@openai.com>
2026-04-18 02:07:59 +08:00
Won Park
85203d8872 Launch image generation by default (#17153)
## Summary
- Promote `image_generation` from under-development to stable
- Enable image generation by default in the feature registry
- Update feature coverage for the new launch-state expectation
- Add the missing image-generation auth fixture field in a tool registry
test

## Testing
- `just fmt`
- `cargo test -p codex-features`
- `cargo test -p codex-tools` currently fails:
`test_full_toolset_specs_for_gpt5_codex_unified_exec_web_search` needs
its expected default tool list updated for `image_generation`
2026-04-16 09:05:38 -07:00
sayan-oai
9c6d038622 [code mode] defer mcp tools from exec description (#17287)
## Summary
- hide deferred MCP/app nested tool descriptions from the `exec` prompt
in code mode
- add short guidance that omitted nested tools are still available
through `ALL_TOOLS`
- cover the code_mode_only path with an integration test that discovers
and calls a deferred app tool

## Motivation
`code_mode_only` exposes only top-level `exec`/`wait`, but the `exec`
description could still include a large nested-tool reference. This
keeps deferred nested tools callable while avoiding that prompt bloat.

## Tests
- `just fmt`
- `just fix -p codex-code-mode`
- `just fix -p codex-tools`
- `cargo test -p codex-code-mode
exec_description_mentions_deferred_nested_tools_when_available`
- `cargo test -p codex-tools
create_code_mode_tool_matches_expected_spec`
- `cargo test -p codex-core
code_mode_only_guides_all_tools_search_and_calls_deferred_app_tools`
2026-04-17 00:01:14 +08:00
Matthew Zeng
77fe33bf72 Update ToolSearch to be enabled by default (#17854)
## Summary
- Promote `Feature::ToolSearch` to `Stable` and enable it in the default
feature set
- Update feature tests and tool registry coverage to match the new
default
- Adjust the search-tool integration test to assert the default-on path
and explicit disable fallback

## Testing
- `just fmt`
- `cargo test -p codex-features`
- `cargo test -p codex-core --test all search_tool`
- `cargo test -p codex-tools`
2026-04-15 22:01:05 -07:00
Curtis 'Fjord' Hawthorne
9e2fc31854 Support original-detail metadata on MCP image outputs (#17714)
## Summary
- honor `_meta["codex/imageDetail"] == "original"` on MCP image content
and map it to `detail: "original"` where supported
- strip that detail back out when the active model does not support
original-detail image inputs
- update code-mode `image(...)` to accept individual MCP image blocks
- teach `js_repl` / `codex.emitImage(...)` to preserve the same hint
from raw MCP image outputs
- document the new `_meta` contract and add generic RMCP-backed coverage
across protocol, core, code-mode, and js_repl paths
2026-04-15 14:43:33 -07:00
Shijie Rao
78ce61c78e Fix empty tool descriptions (#17946)
## Summary
- Ensure direct namespaced MCP tool groups are emitted with a non-empty
namespace description even when namespace metadata is missing or blank.
- Add regression coverage for missing MCP namespace descriptions.

## Cause
Latest `main` can serialize a direct namespaced MCP tool group with an
empty top-level `description`. The namespace description path used
`unwrap_or_default()` when `tool_namespaces` did not include metadata
for that namespace, so the outbound Responses API payload could contain
a tool like `{"type":"namespace","description":""}`. The Responses API
rejects that because namespace tool descriptions must be a non-empty
string.

## Fix
- Add a fallback namespace description: `Tools in the <namespace>
namespace.`
- Preserve provided namespace descriptions after trimming, but treat
blank descriptions as missing.

### Issue I am seeing
This is what I am seeing on the local build.
<img width="1593" height="488" alt="Screenshot 2026-04-15 at 10 55 55
AM"
src="https://github.com/user-attachments/assets/bab668ba-bf17-4c71-be4e-b102202fce57"
/>

---------

Co-authored-by: Sayan Sisodiya <sayan@openai.com>
2026-04-15 18:14:43 +00:00
sayan-oai
0df7e9a820 register all mcp tools with namespace (#17404)
stacked on #17402.

MCP tools returned by `tool_search` (deferred tools) get registered in
our `ToolRegistry` with a different format than directly available
tools. this leads to two different ways of accessing MCP tools from our
tool catalog, only one of which works for each. fix this by registering
all MCP tools with the namespace format, since this info is already
available.

also, direct MCP tools are registered to responsesapi without a
namespace, while deferred MCP tools have a namespace. this means we can
receive MCP `FunctionCall`s in both formats from namespaces. fix this by
always registering MCP tools with namespace, regardless of deferral
status.

make code mode track `ToolName` provenance of tools so it can map the
literal JS function name string to the correct `ToolName` for
invocation, rather than supporting both in core.

this lets us unify to a single canonical `ToolName` representation for
each MCP tool and force everywhere to use that one, without supporting
fallbacks.
2026-04-15 21:02:59 +08:00
Curtis 'Fjord' Hawthorne
f030ab62eb Always enable original image detail on supported models (#17665)
## Summary

This PR removes `image_detail_original` as a runtime experiment and
makes original image detail available whenever the selected model
supports it.

Concretely, this change:
- drops the `image_detail_original` feature flag from the feature
registry and generated config schema
- makes tool-emitted image detail depend only on
`ModelInfo.supports_image_detail_original`
- updates `view_image` and `code_mode`/`js_repl` image emission to use
that capability check directly
- removes now-redundant experiment-specific tests and instruction
coverage
- keeps backward compatibility for existing configs by silently ignoring
a stale `features.image_detail_original` entry

The net effect is that `detail: "original"` is always available on
supported models, without requiring an experiment toggle.
2026-04-14 08:15:56 -07:00
sayan-oai
1325bcd3f6 chore: refactor name and namespace to single type (#17402)
avoid passing them both around, unify on a type. this now also keys
`ToolRegistry`.

tests pass
2026-04-11 23:06:22 +00:00
jif-oai
d39a722865 feat: description multi-agent v2 (#17338) 2026-04-10 15:31:32 +01:00
Vivian Fang
7bbe3b6011 Add output_schema to code mode render (#17210)
This updates code-mode tool rendering so MCP tools can surface
structured output types from their `outputSchema`.

What changed:
- Detect MCP tool-call result wrappers from the output schema shape
instead of relying on tool-name parsing or provenance flags.
- Render shared TypeScript aliases once for MCP tool results
(`CallToolResult`, `ContentBlock`, etc.) so multiple MCP tool
declarations stay compact.
- Type `structuredContent` from the tool definition's `outputSchema`
instead of rendering it as `unknown`.
- Update the shared MCP aliases to match the MCP draft `CallToolResult`
schema more closely.

Example:
- Before: `declare const tools: { mcp__rmcp__echo(args: { env_var?:
string; message: string; }): Promise<{ _meta?: unknown; content:
Array<unknown>; isError?: boolean; structuredContent?: unknown; }>; };`
- After: `declare const tools: { mcp__rmcp__echo(args: { env_var?:
string; message: string; }): Promise<CallToolResult<{ echo: string; env:
string | null; }>>; };`
2026-04-10 11:41:44 +00:00
sayan-oai
04fc208b6d preserve search results order in tool_search_output (#17263)
we used to alpha-sort tool search results because we were using
`BTreeMap`, which threw away the actual search result ordering.

Now we use a vec to preserve it.

### Tests
Updated tests
2026-04-09 18:15:10 -07:00
Matthew Zeng
d7f99b0fa6 [mcp] Expand tool search to custom MCPs. (#16944)
- [x] Expand tool search to custom MCPs.
- [x] Rename several variables/fields to be more generic.

Updated tool & server name lifecycles:

**Raw Identity**

ToolInfo.server_name is raw MCP server name.
ToolInfo.tool.name is raw MCP tool name.
MCP calls route back to raw via parse_tool_name() returning
(tool.server_name, tool.tool.name).
mcpServerStatus/list now groups by raw server and keys tools by
Tool.name: mod.rs:599
App-server just forwards that grouped raw snapshot:
codex_message_processor.rs:5245

**Callable Names**

On list-tools, we create provisional callable_namespace / callable_name:
mcp_connection_manager.rs:1556
For non-app MCP, provisional callable name starts as raw tool name.
For codex-apps, provisional callable name is sanitized and strips
connector name/id prefix; namespace includes connector name.
Then qualify_tools() sanitizes callable namespace + name to ASCII alnum
/ _ only: mcp_tool_names.rs:128
Note: this is stricter than Responses API. Hyphen is currently replaced
with _ for code-mode compatibility.

**Collision Handling**

We do initially collapse example-server and example_server to the same
base.
Then qualify_tools() detects distinct raw namespace identities behind
the same sanitized namespace and appends a hash to the callable
namespace: mcp_tool_names.rs:137
Same idea for tool-name collisions: hash suffix goes on callable tool
name.
Final list_all_tools() map key is callable_namespace + callable_name:
mcp_connection_manager.rs:769

**Direct Model Tools**

Direct MCP tool declarations use the full qualified sanitized key as the
Responses function name.
The raw rmcp Tool is converted but renamed for model exposure.

**Tool Search / Deferred**

Tool search result namespace = final ToolInfo.callable_namespace:
tool_search.rs:85
Tool search result nested name = final ToolInfo.callable_name:
tool_search.rs:86
Deferred tool handler is registered as "{namespace}:{name}":
tool_registry_plan.rs:248
When a function call comes back, core recombines namespace + name, looks
up the full qualified key, and gets the raw server/tool for MCP
execution: codex.rs:4353

**Separate Legacy Snapshot**

collect_mcp_snapshot_from_manager_with_detail() still returns a map
keyed by qualified callable name.
mcpServerStatus/list no longer uses that; it uses
McpServerStatusSnapshot, which is raw-inventory shaped.
2026-04-09 13:34:52 -07:00
canvrno-oai
58ad79b60e Fix missing fields (#17149)
Fix missing `image_generation_tool_auth_allowed` in two locations.
2026-04-08 13:53:53 -07:00
Won Park
e003f84e1e release ready, enabling only for siwc users (#17046)
**Disabling Image-Gen for Non-SIWC Codex Users**

We are only enabling image-gen feature for SIWC Codex users until there
comes a fix in ResponsesAPI to omit output from responses.completed, to
prevent the following issues:

1. websocket blows up due to heavier load (images) than before (text) 
2. http parser streams through n^2 of n-base64 bytes (sum of base64s of
all images generated in turn) that causes long delays in
turn_completion.
2026-04-08 11:22:39 -07:00
pakrym-oai
4c07dd4d25 Configure multi_agent_v2 spawn agent hints (#17071)
Allow multi_agent_v2 features to have its own temporary configuration
under `[features.multi_agent_v2]`

```
[features.multi_agent_v2]
enabled = true
usage_hint_enabled = false
usage_hint_text = "Custom delegation guidance."
hide_spawn_agent_metadata = true
```

Absent `usage_hint_text` means use the default hint.

```
[features]
multi_agent_v2 = true
```

still works as the boolean shorthand.
2026-04-08 08:42:18 -07:00
Vivian Fang
d47b755aa2 Render namespace description for tools (#16879) 2026-04-08 02:39:40 -07:00
Vivian Fang
9091999c83 Render function attribute descriptions (#16880) 2026-04-08 02:10:45 -07:00
Vivian Fang
ea516f9a40 Support anyOf and enum in JsonSchema (#16875)
This brings us into better alignment with the JSON schema subset that is
supported in
<https://developers.openai.com/api/docs/guides/structured-outputs#supported-schemas>,
and also allows us to render richer function signatures in code mode
(e.g., anyOf{null, OtherObjectType})
2026-04-08 01:07:55 -07:00
Vivian Fang
fa5119a8a6 Add regression tests for JsonSchema (#17052)
Tests added for existing JsonSchema in
`codex-rs/tools/src/json_schema_tests.rs`:

- `parse_tool_input_schema_coerces_boolean_schemas`
- `parse_tool_input_schema_infers_object_shape_and_defaults_properties`
- `parse_tool_input_schema_normalizes_integer_and_missing_array_items`
- `parse_tool_input_schema_sanitizes_additional_properties_schema`
-
`parse_tool_input_schema_infers_object_shape_from_boolean_additional_properties_only`
- `parse_tool_input_schema_infers_number_from_numeric_keywords`
- `parse_tool_input_schema_infers_number_from_multiple_of`
-
`parse_tool_input_schema_infers_string_from_enum_const_and_format_keywords`
- `parse_tool_input_schema_defaults_empty_schema_to_string`
- `parse_tool_input_schema_infers_array_from_prefix_items`
-
`parse_tool_input_schema_preserves_boolean_additional_properties_on_inferred_object`
-
`parse_tool_input_schema_infers_object_shape_from_schema_additional_properties_only`

Tests that we expect to fail on the baseline normalizer, but pass with
the new JsonSchema:

- `parse_tool_input_schema_preserves_nested_nullable_type_union`
- `parse_tool_input_schema_preserves_nested_any_of_property`
2026-04-07 18:18:54 -07:00
pash-openai
80ebc80be5 Use model metadata for Fast Mode status (#16949)
Fast Mode status was still tied to one model name in the TUI and
model-list plumbing. This changes the model metadata shape so a model
can advertise additional speed tiers, carries that field through the
app-server model list, and uses it to decide when to show Fast Mode
status.

For people using Codex, the behavior is intended to stay the same for
existing models. Fast Mode still requires the existing signed-in /
feature-gated path; the difference is that the UI can now recognize any
model the model list marks as Fast-capable, instead of requiring a new
client-side slug check.
2026-04-07 17:55:40 -07:00
jif-oai
4cc6818996 chore: keep request_user_input tool to persist cache on multi-agents (#17009) 2026-04-07 16:53:31 +01:00
jif-oai
99f167e6bf chore: hide nickname for debug flag (#17007) 2026-04-07 11:31:13 +01:00
jif-oai
68e16baabe chore: send_message and followup_task do not return anything (#17008) 2026-04-07 11:26:36 +01:00
jif-oai
2a8c3a2a52 feat: drop agent ID from v2 (#17005) 2026-04-07 10:56:01 +01:00