codex

mirror of https://github.com/openai/codex.git synced 2026-04-27 16:15:09 +00:00

Author	SHA1	Message	Date
Matthew Zeng	d7f99b0fa6	[mcp] Expand tool search to custom MCPs. (#16944 ) - [x] Expand tool search to custom MCPs. - [x] Rename several variables/fields to be more generic. Updated tool & server name lifecycles: Raw Identity ToolInfo.server_name is raw MCP server name. ToolInfo.tool.name is raw MCP tool name. MCP calls route back to raw via parse_tool_name() returning (tool.server_name, tool.tool.name). mcpServerStatus/list now groups by raw server and keys tools by Tool.name: mod.rs:599 App-server just forwards that grouped raw snapshot: codex_message_processor.rs:5245 Callable Names On list-tools, we create provisional callable_namespace / callable_name: mcp_connection_manager.rs:1556 For non-app MCP, provisional callable name starts as raw tool name. For codex-apps, provisional callable name is sanitized and strips connector name/id prefix; namespace includes connector name. Then qualify_tools() sanitizes callable namespace + name to ASCII alnum / _ only: mcp_tool_names.rs:128 Note: this is stricter than Responses API. Hyphen is currently replaced with _ for code-mode compatibility. Collision Handling We do initially collapse example-server and example_server to the same base. Then qualify_tools() detects distinct raw namespace identities behind the same sanitized namespace and appends a hash to the callable namespace: mcp_tool_names.rs:137 Same idea for tool-name collisions: hash suffix goes on callable tool name. Final list_all_tools() map key is callable_namespace + callable_name: mcp_connection_manager.rs:769 Direct Model Tools Direct MCP tool declarations use the full qualified sanitized key as the Responses function name. The raw rmcp Tool is converted but renamed for model exposure. 
Tool Search / Deferred Tool search result namespace = final ToolInfo.callable_namespace: tool_search.rs:85 Tool search result nested name = final ToolInfo.callable_name: tool_search.rs:86 Deferred tool handler is registered as "{namespace}:{name}": tool_registry_plan.rs:248 When a function call comes back, core recombines namespace + name, looks up the full qualified key, and gets the raw server/tool for MCP execution: codex.rs:4353 Separate Legacy Snapshot collect_mcp_snapshot_from_manager_with_detail() still returns a map keyed by qualified callable name. mcpServerStatus/list no longer uses that; it uses McpServerStatusSnapshot, which is raw-inventory shaped.	2026-04-09 13:34:52 -07:00
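The sanitization and collision rules described in d7f99b0fa6 can be sketched roughly as follows. This is a minimal illustration, not the code in `mcp_tool_names.rs`: the function names `sanitize`, `short_hash`, and `qualify_namespaces`, the 8-hex-digit suffix, and the use of `DefaultHasher` are all assumptions made for the example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Replace every character that is not ASCII alphanumeric or `_` with `_`.
/// This mirrors the "ASCII alnum / _ only" rule from the commit message.
fn sanitize(raw: &str) -> String {
    raw.chars()
        .map(|c| if c.is_ascii_alphanumeric() || c == '_' { c } else { '_' })
        .collect()
}

/// Hypothetical suffix scheme: 8 hex digits derived from the raw name.
fn short_hash(raw: &str) -> String {
    let mut hasher = DefaultHasher::new();
    raw.hash(&mut hasher);
    format!("{:08x}", hasher.finish() as u32)
}

/// Map each raw namespace to a callable namespace. A hash suffix is appended
/// only when distinct raw names collapse to the same sanitized base.
fn qualify_namespaces(raw_names: &[&str]) -> HashMap<String, String> {
    // Group distinct raw names by their sanitized base.
    let mut by_base: HashMap<String, Vec<&str>> = HashMap::new();
    for &raw in raw_names {
        let group = by_base.entry(sanitize(raw)).or_default();
        if !group.contains(&raw) {
            group.push(raw);
        }
    }

    let mut callable_by_raw = HashMap::new();
    for (base, raws) in by_base {
        for raw in &raws {
            let callable = if raws.len() == 1 {
                base.clone()
            } else {
                format!("{}_{}", base, short_hash(raw))
            };
            callable_by_raw.insert(raw.to_string(), callable);
        }
    }
    callable_by_raw
}
```

Under these assumptions, `example-server` and `example_server` both sanitize to `example_server`, so each receives a distinct suffixed callable namespace, while a name with no collision keeps its plain sanitized form.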
neil-oai	a92a5085bd	Forward app-server turn clientMetadata to Responses (#16009 ) ## Summary App-server v2 already receives turn-scoped `clientMetadata`, but the Rust app-server was dropping it before the outbound Responses request. This change keeps the fix lightweight by threading that metadata through the existing turn-metadata path rather than inventing a new transport. ## What we're trying to do and why We want turn-scoped metadata from the app-server protocol layer, especially fields like Hermes/GAAS run IDs, to survive all the way to the actual Responses API request so it is visible in downstream websocket request logging and analytics. The specific bug was: - app-server protocol uses camelCase `clientMetadata` - Responses transport already has an existing turn metadata carrier: `x-codex-turn-metadata` - websocket transport already rewrites that header into `request.request_body.client_metadata["x-codex-turn-metadata"]` - but the Rust app-server never parsed or stored `clientMetadata`, so nothing from the app-server request was making it into that existing path This PR fixes that without adding a new header or a second metadata channel. 
## How we did it ### Protocol surface - Add optional `clientMetadata` to v2 `TurnStartParams` and `TurnSteerParams` - Regenerate the JSON schema / TypeScript fixtures - Update app-server docs to describe the field and its behavior ### Runtime plumbing - Add a dedicated core op for app-server user input carrying turn-scoped metadata: `Op::UserInputWithClientMetadata` - Wire `turn/start` and `turn/steer` through that op / signature path instead of dropping the metadata at the message-processor boundary - Store the metadata in `TurnMetadataState` ### Transport behavior - Reuse the existing serialized `x-codex-turn-metadata` payload - Merge the new app-server `clientMetadata` into that JSON additively - Do not replace built-in reserved fields already present in the turn metadata payload - Keep websocket behavior unchanged at the outer shape level: it still sends only `client_metadata["x-codex-turn-metadata"]`, but that JSON string now contains the merged fields - Keep HTTP fallback behavior unchanged except that the existing `x-codex-turn-metadata` header now includes the merged fields too ### Request shape before / after Before, a websocket `response.create` looked like: ```json { "type": "response.create", "client_metadata": { "x-codex-turn-metadata": "{\"session_id\":\"...\",\"turn_id\":\"...\"}" } } ``` Even if the app-server caller supplied `clientMetadata`, it was not represented there. 
After, the same request shape is preserved, but the serialized payload now includes the new turn-scoped fields: ```json { "type": "response.create", "client_metadata": { "x-codex-turn-metadata": "{\"session_id\":\"...\",\"turn_id\":\"...\",\"fiber_run_id\":\"fiber-start-123\",\"origin\":\"gaas\"}" } } ``` ## Validation ### Targeted tests added / updated - protocol round-trip coverage for `clientMetadata` on `turn/start` and `turn/steer` - protocol round-trip coverage for `Op::UserInputWithClientMetadata` - `TurnMetadataState` merge test proving client metadata is added without overwriting reserved built-in fields - websocket request-shape test proving outbound `response.create` contains merged metadata inside `client_metadata["x-codex-turn-metadata"]` - app-server integration tests proving: - `turn/start` forwards `clientMetadata` into the outbound Responses request path - websocket warmup + real turn request both behave correctly - `turn/steer` updates the follow-up request metadata ### Commands run - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-protocol` - `cargo test -p codex-core turn_metadata_state_merges_client_metadata_without_replacing_reserved_fields --lib` - `cargo test -p codex-core --test all responses_websocket_preserves_custom_turn_metadata_fields` - `cargo test -p codex-app-server --test all client_metadata` - `cargo test -p codex-app-server --test all turn_start_forwards_client_metadata_to_responses_websocket_request_body_v2 -- --nocapture` - `just fmt` - `just fix -p codex-core -p codex-protocol -p codex-app-server-protocol -p codex-app-server` - `just fix -p codex-exec -p codex-tui-app-server` - `just argument-comment-lint` ### Full suite note `cargo test` in `codex-rs` still fails in: - `suite::v2::turn_interrupt::turn_interrupt_resolves_pending_command_approval_request` I verified that same failure on a clean detached `HEAD` worktree with an isolated `CARGO_TARGET_DIR`, so it is not caused by 
this patch.	2026-04-09 11:52:37 -07:00
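The additive merge described in a92a5085bd (client metadata is added, reserved built-in fields are never replaced) might look roughly like this. It is a sketch under simplifying assumptions: the real payload is a serialized JSON string, whereas this example uses a flat string map, and `merge_client_metadata` is a hypothetical name.

```rust
use std::collections::BTreeMap;

/// Merge caller-supplied metadata into the built-in turn metadata.
/// Keys already present in `built_in` (for example `session_id` or
/// `turn_id`) are treated as reserved and keep their original values.
fn merge_client_metadata(
    built_in: &BTreeMap<String, String>,
    client: &BTreeMap<String, String>,
) -> BTreeMap<String, String> {
    let mut merged = built_in.clone();
    for (key, value) in client {
        // `or_insert_with` only writes when the key is absent, so reserved
        // fields survive a conflicting client value.
        merged.entry(key.clone()).or_insert_with(|| value.clone());
    }
    merged
}
```

With this shape, a client that sends `origin = "gaas"` sees it added to the payload, while a client that sends its own `session_id` does not override the built-in one.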
Ahmed Ibrahim	af8a9d2d2b	remove temporary ownership re-exports (#16626 ) Stacked on #16508. This removes the temporary `codex-core` / `codex-login` re-export shims from the ownership split and rewrites callsites to import directly from `codex-model-provider-info`, `codex-models-manager`, `codex-api`, `codex-protocol`, `codex-feedback`, and `codex-response-debug-context`. No behavior change intended; this is the mechanical import cleanup layer split out from the ownership move. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-03 00:33:34 -07:00
Ahmed Ibrahim	6fff9955f1	extract models manager and related ownership from core (#16508 ) ## Summary - split `models-manager` out of `core` and add `ModelsManagerConfig` plus `Config::to_models_manager_config()` so model metadata paths stop depending on `core::Config` - move login-owned/auth-owned code out of `core` into `codex-login`, move model provider config into `codex-model-provider-info`, move API bridge mapping into `codex-api`, move protocol-owned types/impls into `codex-protocol`, and move response debug helpers into a dedicated `response-debug-context` crate - move feedback tag emission into `codex-feedback`, relocate tests to the crates that now own the code, and keep broad temporary re-exports so this PR avoids a giant import-only rewrite ## Major moves and decisions - created `codex-models-manager` as the owner for model cache/catalog/config/model info logic, including the new `ModelsManagerConfig` struct - created `codex-model-provider-info` as the owner for provider config parsing/defaults and kept temporary `codex-login`/`codex-core` re-exports for old import paths - moved `api_bridge` error mapping + `CoreAuthProvider` into `codex-api`, while `codex-login::api_bridge` temporarily re-exports those symbols and keeps the `auth_provider_from_auth` wrapper - moved `auth_env_telemetry` and `provider_auth` ownership to `codex-login` - moved `CodexErr` ownership to `codex-protocol::error`, plus `StreamOutput`, `bytes_to_string_smart`, and network policy helpers to protocol-owned modules - created `codex-response-debug-context` for `extract_response_debug_context`, `telemetry_transport_error_message`, and related response-debug plumbing instead of leaving that behavior in `core` - moved `FeedbackRequestTags`, `emit_feedback_request_tags`, and `emit_feedback_request_tags_with_auth_env` to `codex-feedback` - deferred removal of temporary re-exports and the mechanical import rewrites to a stacked follow-up PR so this PR stays reviewable ## Test moves - moved 
auth refresh coverage from `core/tests/suite/auth_refresh.rs` to `login/tests/suite/auth_refresh.rs` - moved text encoding coverage from `core/tests/suite/text_encoding_fix.rs` to `protocol/src/exec_output_tests.rs` - moved model info override coverage from `core/tests/suite/model_info_overrides.rs` to `models-manager/src/model_info_overrides_tests.rs` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-02 23:00:02 -07:00
Michael Bolin	aa2403e2eb	core: remove cross-crate re-exports from lib.rs (#16512 ) ## Why `codex-core` was re-exporting APIs owned by sibling `codex-` crates, which made downstream crates depend on `codex-core` as a proxy module instead of the actual owner crate. Removing those forwards makes crate boundaries explicit and lets leaf crates drop unnecessary `codex-core` dependencies. In this PR, this reduces the dependency on `codex-core` to `codex-login` in the following files: ``` codex-rs/backend-client/Cargo.toml codex-rs/mcp-server/tests/common/Cargo.toml ``` ## What - Remove `codex-rs/core/src/lib.rs` re-exports for symbols owned by `codex-login`, `codex-mcp`, `codex-rollout`, `codex-analytics`, `codex-protocol`, `codex-shell-command`, `codex-sandboxing`, `codex-tools`, and `codex-utils-path`. - Delete the `default_client` forwarding shim in `codex-rs/core`. - Update in-crate and downstream callsites to import directly from the owning `codex-` crate. - Add direct Cargo dependencies where callsites now target the owner crate, and remove `codex-core` from `codex-rs/backend-client`.	2026-04-01 23:06:24 -07:00
Matthew Zeng	e590fad50b	[plugins] Add a flag for tool search. (#15722 ) - [x] Add a flag for tool search.	2026-03-25 07:00:25 +00:00
Ahmed Ibrahim	2e22885e79	Split features into codex-features crate (#15253 ) - Split the feature system into a new `codex-features` crate. - Cut `codex-core` and workspace consumers over to the new config and warning APIs. Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com> Co-authored-by: Codex <noreply@openai.com>	2026-03-19 20:12:07 -07:00
nicholasclark-openai	2bee37fe69	Plumb MCP turn metadata through _meta (#15190 ) ## Summary Some background. We're looking to instrument GA turns end to end. Right now a big gap is grouping MCP tool calls with their Codex sessions. We send session ID and turn ID headers to the Responses call but not to the mcp/wham calls. Ideally we could pass the args as headers, as we do with Responses, but given the setup of the rmcp client, we can't send them as headers without either changing the rmcp package upstream to allow per-request headers or introducing a mutex, which would break concurrency. An earlier attempt assumed that we had 1 client per thread, which allowed us to set headers at the start of a turn. @pakrym mentioned that this assumption might break in the near future. So the solution now is to package the turn metadata/session ID into the _meta field in the POST body and pull it out in codex-backend. - send turn metadata to MCP servers via `tools/call` `_meta` instead of assuming per-thread request headers on shared clients - preserve the existing `_codex_apps` metadata while adding `x-codex-turn-metadata` for all MCP tool calls - extend tests to cover both custom MCP servers and the codex apps search flow --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-19 22:05:13 +00:00
Matthew Zeng	d4af6053e2	[apps] Improve search tool fallback. (#14732 ) - [x] Bypass tool search and stuff tool specs directly into model context when either a. Tool search is not available for the model or b. There are not that many tools to search for.	2026-03-15 21:41:55 -07:00
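The fallback rule in d4af6053e2 amounts to a two-condition decision, sketched below. The commit does not state what counts as "not that many tools", so the `direct_threshold` parameter and the `ToolExposure` and `choose_tool_exposure` names are assumptions for illustration only.

```rust
/// How tool specs reach the model in this sketch.
#[derive(Debug, PartialEq)]
enum ToolExposure {
    /// Put tool specs directly into the model context.
    Direct,
    /// Hide tools behind a tool-search step.
    Search,
}

/// Bypass tool search when the model cannot use it, or when the tool list
/// is small enough (hypothetical threshold) to expose directly.
fn choose_tool_exposure(
    model_supports_search: bool,
    tool_count: usize,
    direct_threshold: usize,
) -> ToolExposure {
    if !model_supports_search || tool_count <= direct_threshold {
        ToolExposure::Direct
    } else {
        ToolExposure::Search
    }
}
```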
Matthew Zeng	49edf311ac	[apps] Add tool call meta. (#14647 ) - [x] Add resource_uri and other things to _meta to shortcut resource lookup and speed things up.	2026-03-14 22:24:13 -07:00
pakrym-oai	cb7d8f45a1	Normalize MCP tool names to code-mode safe form (#14605 ) Code mode doesn't allow `-` in names and it's better if function names and code-mode names are the same.	2026-03-13 14:50:16 -07:00
Anton Panasenko	651717323c	feat(search_tool): gate search_tool on model supports_search_tool field (#14502 )	2026-03-12 16:03:50 -07:00
Anton Panasenko	77b0c75267	feat: search_tool migrate to bring you own tool of Responses API (#14274 ) ## Why To support the new bring-your-own search tool in the Responses API (https://developers.openai.com/api/docs/guides/tools-tool-search#client-executed-tool-search), we are migrating our bm25 search tool to the official way of executing search on the client and communicating additional tools to the model. ## What - replace the legacy `search_tool_bm25` flow with client-executed `tool_search` - add protocol, SSE, history, and normalization support for `tool_search_call` and `tool_search_output` - return namespaced Codex Apps search results and wire namespaced follow-up tool calls back into MCP dispatch	2026-03-11 17:51:51 -07:00
Matthew Zeng	566e4cee4b	[apps] Fix apps enablement condition. (#14011 ) - [x] Fix apps enablement condition to check both the feature flag and that the user is not an API key user.	2026-03-09 22:25:43 -07:00
sayan-oai	4e77ea0ec7	add @plugin mentions (#13510 ) ## Note: plugin mentions are added via `@`, which conflicts with file mentions. Depends on and builds upon #13433. - introduces explicit `@plugin` mentions. This injects the plugin's MCP servers, app names, and skill name format into turn context as a dev message. - we do not yet have UI for these mentions, so we currently parse raw text (as opposed to skills and apps, which have UI chips, autocomplete, etc.). The UI depends on a `plugins/list` app-server endpoint we can feed it with, which is upcoming - also annotate MCP and app tool descriptions with the plugin(s) they come from. This gives the model a first-class way of understanding which tools come from which plugins, which will help implicit invocation. ### Tests Added and updated tests, unit and integration. Also confirmed locally that a raw `@plugin` injects the dev message, and the model knows about its apps, MCPs, and skills.	2026-03-06 00:03:39 +00:00
Michael Bolin	bfff0c729f	config: enforce enterprise feature requirements (#13388 ) ## Why Enterprises can already constrain approvals, sandboxing, and web search through `requirements.toml` and MDM, but feature flags were still only configurable as managed defaults. That meant an enterprise could suggest feature values, but it could not actually pin them. This change closes that gap and makes enterprise feature requirements behave like the other constrained settings. The effective feature set now stays consistent with enterprise requirements during config load, when config writes are validated, and when runtime code mutates feature flags later in the session. It also tightens the runtime API for managed features. `ManagedFeatures` now follows the same constraint-oriented shape as `Constrained<T>` instead of exposing panic-prone mutation helpers, and production code can no longer construct it through an unconstrained `From<Features>` path. The PR also hardens the `compact_resume_fork` integration coverage on Windows. After the feature-management changes, `compact_resume_after_second_compaction_preserves_history` was overflowing the libtest/Tokio thread stacks on Windows, so the test now uses an explicit larger-stack harness as a pragmatic mitigation. That may not be the ideal root-cause fix, and it merits a parallel investigation into whether part of the async future chain should be boxed to reduce stack pressure instead. ## What Changed Enterprises can now pin feature values in `requirements.toml` with the requirements-side `features` table: ```toml [features] personality = true unified_exec = false ``` Only canonical feature keys are allowed in the requirements `features` table; omitted keys remain unconstrained. 
- Added a requirements-side pinned feature map to `ConfigRequirementsToml`, threaded it through source-preserving requirements merge and normalization in `codex-config`, and made the TOML surface use `[features]` (while still accepting legacy `[feature_requirements]` for compatibility). - Exposed `featureRequirements` from `configRequirements/read`, regenerated the JSON/TypeScript schema artifacts, and updated the app-server README. - Wrapped the effective feature set in `ManagedFeatures`, backed by `ConstrainedWithSource<Features>`, and changed its API to mirror `Constrained<T>`: `can_set(...)`, `set(...) -> ConstraintResult<()>`, and result-returning `enable` / `disable` / `set_enabled` helpers. - Removed the legacy-usage and bulk-map passthroughs from `ManagedFeatures`; callers that need those behaviors now mutate a plain `Features` value and reapply it through `set(...)`, so the constrained wrapper remains the enforcement boundary. - Removed the production loophole for constructing unconstrained `ManagedFeatures`. Non-test code now creates it through the configured feature-loading path, and `impl From<Features> for ManagedFeatures` is restricted to `#[cfg(test)]`. - Rejected legacy feature aliases in enterprise feature requirements, and return a load error when a pinned combination cannot survive dependency normalization. - Validated config writes against enterprise feature requirements before persisting changes, including explicit conflicting writes and profile-specific feature states that normalize into invalid combinations. - Updated runtime and TUI feature-toggle paths to use the constrained setter API and to persist or apply the effective post-constraint value rather than the requested value. - Updated the `core_test_support` Bazel target to include the bundled core model-catalog fixtures in its runtime data, so helper code that resolves `core/models.json` through runfiles works in remote Bazel test environments. 
- Renamed the core config test coverage to emphasize that effective feature values are normalized at runtime, while conflicting persisted config writes are rejected. - Ran `compact_resume_after_second_compaction_preserves_history` inside an explicit 8 MiB test thread and Tokio runtime worker stack, following the existing larger-stack integration-test pattern, to keep the Windows `compact_resume_fork` test slice from aborting while a parallel investigation continues into whether some of the underlying async futures should be boxed. ## Verification - `cargo test -p codex-config` - `cargo test -p codex-core feature_requirements_ -- --nocapture` - `cargo test -p codex-core load_requirements_toml_produces_expected_constraints -- --nocapture` - `cargo test -p codex-core compact_resume_after_second_compaction_preserves_history -- --nocapture` - `cargo test -p codex-core compact_resume_fork -- --nocapture` - Re-ran the built `codex-core` `tests/all` binary with `RUST_MIN_STACK=262144` for `compact_resume_after_second_compaction_preserves_history` to confirm the explicit-stack harness fixes the deterministic low-stack repro. - `cargo test -p codex-core` - This still fails locally in unrelated integration areas that expect the `codex` / `test_stdio_server` binaries or hit existing `search_tool` wiremock mismatches. ## Docs `developers.openai.com/codex` should document the requirements-side `[features]` table for enterprise and MDM-managed configuration, including that it only accepts canonical feature keys and that conflicting config writes are rejected.	2026-03-04 04:40:22 +00:00
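The constraint-oriented API shape described in bfff0c729f (`can_set(...)` plus result-returning setters) can be illustrated with a small stand-in. This is not the real `ManagedFeatures` or `Constrained<T>`: the type `ManagedFeaturesSketch`, its string-keyed maps, and the `String` error type are assumptions chosen to keep the example self-contained.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a constrained feature set. `pinned` holds the
/// values required by enterprise configuration; keys that are not pinned
/// remain unconstrained.
struct ManagedFeaturesSketch {
    pinned: BTreeMap<String, bool>,
    effective: BTreeMap<String, bool>,
}

impl ManagedFeaturesSketch {
    /// Start from the pinned values so the effective set already satisfies them.
    fn new(pinned: BTreeMap<String, bool>) -> Self {
        let effective = pinned.clone();
        Self { pinned, effective }
    }

    /// A write is allowed if the key is unpinned or matches the pinned value.
    fn can_set(&self, key: &str, value: bool) -> bool {
        match self.pinned.get(key) {
            Some(required) => *required == value,
            None => true,
        }
    }

    /// Return an error instead of panicking when a pinned value is violated.
    fn set_enabled(&mut self, key: &str, value: bool) -> Result<(), String> {
        if !self.can_set(key, value) {
            return Err(format!("feature `{}` is pinned by requirements", key));
        }
        self.effective.insert(key.to_string(), value);
        Ok(())
    }

    /// Features never mentioned default to disabled in this sketch.
    fn is_enabled(&self, key: &str) -> bool {
        self.effective.get(key).copied().unwrap_or(false)
    }
}
```

Because every mutation goes through `set_enabled`, a caller that tries to enable a feature pinned to `false` gets an `Err` and the effective value stays unchanged, which is the "enforcement boundary" idea the commit message describes.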
Eric Traut	cee009d117	Add oauth_resource handling for MCP login flows (#12866 ) Addresses bug https://github.com/openai/codex/issues/12589 Builds on community PR #12763. This adds `oauth_resource` support for MCP `streamable_http` servers and wires it through the relevant config and login paths. It fixes the bug where the configured OAuth resource was not reliably included in the authorization request, causing MCP login to omit the expected `resource` parameter.	2026-02-26 20:10:12 -08:00
Michael Bolin	1af2a37ada	chore: remove codex-core public protocol/shell re-exports (#12432 ) ## Why `codex-rs/core/src/lib.rs` re-exported a broad set of types and modules from `codex-protocol` and `codex-shell-command`. That made it easy for workspace crates to import those APIs through `codex-core`, which in turn hides dependency edges and makes it harder to reduce compile-time coupling over time. This change removes those public re-exports so call sites must import from the source crates directly. Even when a crate still depends on `codex-core` today, this makes dependency boundaries explicit and unblocks future work to drop `codex-core` dependencies where possible. ## What Changed - Removed public re-exports from `codex-rs/core/src/lib.rs` for: - `codex_protocol::protocol` and related protocol/model types (including `InitialHistory`) - `codex_protocol::config_types` (`protocol_config_types`) - `codex_shell_command::{bash, is_dangerous_command, is_safe_command, parse_command, powershell}` - Migrated workspace Rust call sites to import directly from: - `codex_protocol::protocol` - `codex_protocol::config_types` - `codex_protocol::models` - `codex_shell_command` - Added explicit `Cargo.toml` dependencies (`codex-protocol` / `codex-shell-command`) in crates that now import those crates directly. - Kept `codex-core` internal modules compiling by using `pub(crate)` aliases in `core/src/lib.rs` (internal-only, not part of the public API). - Updated the two utility crates that can already drop a `codex-core` dependency edge entirely: - `codex-utils-approval-presets` - `codex-utils-cli` ## Verification - `cargo test -p codex-utils-approval-presets` - `cargo test -p codex-utils-cli` - `cargo check --workspace --all-targets` - `just clippy`	2026-02-20 23:45:35 -08:00
Anton Panasenko	02abd9a8ea	feat: persist and restore codex app's tools after search (#11780 ) ### What changed 1. Removed per-turn MCP selection reset in `core/src/tasks/mod.rs`. 2. Added `SessionState::set_mcp_tool_selection(Vec<String>)` in `core/src/state/session.rs` for authoritative restore behavior (deduped, order-preserving, empty clears). 3. Added rollout parsing in `core/src/codex.rs` to recover `active_selected_tools` from prior `search_tool_bm25` outputs: - tracks matching `call_id`s - parses function output text JSON - extracts `active_selected_tools` - latest valid payload wins - malformed/non-matching payloads are ignored 4. Applied restore logic to resumed and forked startup paths in `core/src/codex.rs`. 5. Updated instruction text to session/thread scope in `core/templates/search_tool/tool_description.md`. 6. Expanded tests in `core/tests/suite/search_tool.rs`, plus unit coverage in: - `core/src/codex.rs` - `core/src/state/session.rs` ### Behavior after change 1. Search activates matched tools. 2. Additional searches union into active selection. 3. Selection survives new turns in the same thread. 4. Resume/fork restores selection from rollout history. 5. Separate threads do not inherit selection unless forked.	2026-02-15 19:18:41 -08:00
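The selection semantics listed in 02abd9a8ea (deduped, order-preserving, empty clears, later searches union into the active selection) can be sketched as below. The `SelectionSketch` type and its `merge` method are hypothetical; only the behavior of `set_mcp_tool_selection` is taken from the commit message.

```rust
use std::collections::HashSet;

/// Hypothetical holder for the active MCP tool selection of one thread.
#[derive(Default)]
struct SelectionSketch {
    active: Option<Vec<String>>,
}

impl SelectionSketch {
    /// Authoritative restore: replace the selection, dropping duplicates
    /// while keeping first-seen order. An empty input clears the selection.
    fn set_mcp_tool_selection(&mut self, tools: Vec<String>) {
        if tools.is_empty() {
            self.active = None;
            return;
        }
        let mut seen = HashSet::new();
        let deduped: Vec<String> = tools
            .into_iter()
            .filter(|tool| seen.insert(tool.clone()))
            .collect();
        self.active = Some(deduped);
    }

    /// Union additional search results into the current selection.
    fn merge(&mut self, tools: Vec<String>) {
        let mut combined = self.active.take().unwrap_or_default();
        combined.extend(tools);
        self.set_mcp_tool_selection(combined);
    }
}
```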
Anton Panasenko	38c442ca7f	core: limit search_tool_bm25 to Apps and clarify discovery guidance (#11669 ) ## Summary - Limit `search_tool_bm25` indexing to `codex_apps` tools only, so non-Apps MCP servers are no longer discoverable through this search path. - Move search-tool discovery guidance into the `search_tool_bm25` tool description (via template include) instead of injecting it as a separate developer message. - Update Apps discovery guidance wording to clarify when to use `search_tool_bm25` for Apps-backed systems (for example Slack, Google Drive, Jira, Notion) and when to call tools directly. - Remove dead `core` helper code (`filter_codex_apps_mcp_tools` and `codex_apps_connector_id`) that is no longer used after the tool-selection refactor. - Update `core` search-tool tests to assert codex-apps-only behavior and to validate guidance from the tool description. ## Validation - ✅ `just fmt` - ✅ `cargo test -p codex-core search_tool` - ⚠️ `cargo test -p codex-core` was attempted, but the run repeatedly stalled on `tools::js_repl::tests::js_repl_can_attach_image_via_view_image_tool`. ## Tickets - None	2026-02-13 09:32:46 -08:00
Anton Panasenko	d3b078c282	Consolidate search_tool feature into apps (#11509 ) ## Summary - Remove `Feature::SearchTool` and the `search_tool` config key from the feature registry/schema. - Gate `search_tool_bm25` exposure via `Feature::Apps` in `core/src/tools/spec.rs`. - Update MCP selection logic in `core/src/codex.rs` to use `Feature::Apps` for search-tool behavior. - Update `core/tests/suite/search_tool.rs` to enable `Feature::Apps`. - Regenerate `core/config.schema.json` via `just write-config-schema`. ## Testing - `just fmt` - `cargo test -p codex-core --test all suite::search_tool::` ## Tickets - None	2026-02-11 16:52:42 -08:00
Anton Panasenko	becc3a0424	feat: search_tool (#10657 ) Why We Did This - The goal is to reduce MCP tool context pollution by not exposing the full MCP tool list up front - It forces an explicit discovery step (`search_tool_bm25`) so the model narrows tool scope before making MCP calls, which helps relevance and lowers prompt/tool clutter. What It Changed - Added a new experimental feature flag `search_tool` in `core/src/features.rs:90` and `core/src/features.rs:430`. - Added config/schema support for that flag in `core/config.schema.json:214` and `core/config.schema.json:1235`. - Added BM25 dependency (`bm25`) in `Cargo.toml:129` and `core/Cargo.toml:23`. - Added new tool handler `search_tool_bm25` in `core/src/tools/handlers/search_tool_bm25.rs:18`. - Registered the handler and tool spec in `core/src/tools/handlers/mod.rs:11` and `core/src/tools/spec.rs:780` and `core/src/tools/spec.rs:1344`. - Extended `ToolsConfig` to carry `search_tool` enablement in `core/src/tools/spec.rs:32` and `core/src/tools/spec.rs:56`. - Injected dedicated developer instructions for tool-discovery workflow in `core/src/codex.rs:483` and `core/src/codex.rs:1976`, using `core/templates/search_tool/developer_instructions.md:1`. - Added session state to store one-shot selected MCP tools in `core/src/state/session.rs:27` and `core/src/state/session.rs:131`. - Added filtering so when feature is enabled, only selected MCP tools are exposed on the next request (then consumed) in `core/src/codex.rs:3800` and `core/src/codex.rs:3843`. - Added E2E suite coverage for enablement/instructions/hide-until-search/one-turn-selection in `core/tests/suite/search_tool.rs:72`, `core/tests/suite/search_tool.rs:109`, `core/tests/suite/search_tool.rs:147`, and `core/tests/suite/search_tool.rs:218`. - Refactored test helper utilities to support config-driven tool collection in `core/tests/suite/tools.rs:281`. Net Behavioral Effect - With `search_tool` off: existing MCP behavior (tools exposed normally). 
- With `search_tool` on: MCP tools start hidden, model must call `search_tool_bm25`, and only returned `selected_tools` are available for the next model call.	2026-02-09 12:53:50 -08:00
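The original one-shot behavior in becc3a0424 (tools hidden until a search, selected tools exposed on the next request, then consumed) could be modeled as follows. `OneShotSelection` and `tools_for_next_request` are hypothetical names, and plain tool-name strings stand in for real tool specs.

```rust
/// Hypothetical session state holding the tools selected by the last search.
#[derive(Default)]
struct OneShotSelection {
    selected: Option<Vec<String>>,
}

impl OneShotSelection {
    /// Record the tools returned by a search call.
    fn select(&mut self, tools: Vec<String>) {
        self.selected = Some(tools);
    }

    /// Return the MCP tools to expose on the next request. The selection is
    /// consumed by `take()`, so a later request without a new search sees
    /// no MCP tools again.
    fn tools_for_next_request(&mut self, all_tools: &[String]) -> Vec<String> {
        match self.selected.take() {
            Some(selected) => all_tools
                .iter()
                .filter(|tool| selected.contains(*tool))
                .cloned()
                .collect(),
            None => Vec::new(),
        }
    }
}
```

Later commits in this log (02abd9a8ea) replace this one-turn behavior with a selection that persists across turns in the same thread.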

22 Commits