codex

mirror of https://github.com/openai/codex.git synced 2026-06-01 19:02:59 +00:00

Author	SHA1	Message	Date
pash-openai	6c0a924203	turn metadata: per-turn non-blocking (#11677 )	2026-02-13 12:48:29 -08:00
alexsong-oai	e71760fc64	support app usage analytics (#11687 ) Emit app mentioned and app used events. Dedup by (turn_id, connector_id) Example event params: { "event_type": "codex_app_used", "connector_id": "asdk_app_xxx", "thread_id": "019c5527-36d4-xxx", "turn_id": "019c552c-cd17-xxx", "app_name": "Slack (OpenAI Internal)", "product_client_id": "codex_cli_rs", "invoke_type": "explicit", "model_slug": "gpt-5.3-codex" }	2026-02-13 12:00:16 -08:00
Anton Panasenko	38c442ca7f	core: limit search_tool_bm25 to Apps and clarify discovery guidance (#11669 ) ## Summary - Limit `search_tool_bm25` indexing to `codex_apps` tools only, so non-Apps MCP servers are no longer discoverable through this search path. - Move search-tool discovery guidance into the `search_tool_bm25` tool description (via template include) instead of injecting it as a separate developer message. - Update Apps discovery guidance wording to clarify when to use `search_tool_bm25` for Apps-backed systems (for example Slack, Google Drive, Jira, Notion) and when to call tools directly. - Remove dead `core` helper code (`filter_codex_apps_mcp_tools` and `codex_apps_connector_id`) that is no longer used after the tool-selection refactor. - Update `core` search-tool tests to assert codex-apps-only behavior and to validate guidance from the tool description. ## Validation - ✅ `just fmt` - ✅ `cargo test -p codex-core search_tool` - ⚠️ `cargo test -p codex-core` was attempted, but the run repeatedly stalled on `tools::js_repl::tests::js_repl_can_attach_image_via_view_image_tool`. ## Tickets - None	2026-02-13 09:32:46 -08:00
jif-oai	561fc14045	chore: move explorer to spark (#11745 )	2026-02-13 16:13:24 +00:00
Charley Cunningham	f24669d444	Persist complete TurnContextItem state via canonical conversion (#11656 ) ## Summary This PR delivers the first small, shippable step toward model-visible state diffing by making `TurnContextItem` more complete and standardizing how it is built. Specifically, it: - Adds persisted network context to `TurnContextItem`. - Introduces a single canonical `TurnContext -> TurnContextItem` conversion path. - Routes existing rollout write sites through that canonical conversion helper. No context injection/diff behavior changes are included in this PR. ## Why this change The design goal is to make `TurnContextItem` the canonical source of truth for context-diff decisions. Before this PR: - `TurnContextItem` did not include all TurnContext-derived environment inputs needed for v1 completeness. - Construction was duplicated at multiple write sites. This PR addresses both with a minimal, reviewable change. ## Changes ### 1) Extend `TurnContextItem` with network state - Added `TurnContextNetworkItem { allowed_domains, denied_domains }`. - Added `network: Option<TurnContextNetworkItem>` to `TurnContextItem`. - Kept backward compatibility by making the new field optional and skipped when absent. Files: - `codex-rs/protocol/src/protocol.rs` ### 2) Canonical conversion helper - Added `TurnContext::to_turn_context_item(collaboration_mode)` in core. - Added internal helper to derive network fields from `config_layer_stack.requirements().network`. Files: - `codex-rs/core/src/codex.rs` ### 3) Use canonical conversion at rollout write sites - Replaced ad hoc `TurnContextItem { ... }` construction with `to_turn_context_item(...)` in: - sampling request path - compaction path Files: - `codex-rs/core/src/codex.rs` - `codex-rs/core/src/compact.rs` ### 4) Update fixtures/tests for new optional field - Updated existing `TurnContextItem` literals in tests to include `network: None`. - Added protocol tests for: - deserializing old payloads with no `network` - serializing when `network` is present Files: - `codex-rs/core/tests/suite/resume_warning.rs` - No replay/diff logic changes. - Persisted rollout `TurnContextItem` now carries additional network context when available. - Older rollout lines without `network` remain readable.	2026-02-12 17:22:44 -08:00
Matthew Zeng	c37560069a	[apps] Add is_enabled to app info. (#11417 ) - [x] Add is_enabled to app info and the response of `app/list`. - [x] Update TUI to have Enable/Disable button on the app detail page.	2026-02-13 00:30:52 +00:00
Curtis 'Fjord' Hawthorne	0dcfc59171	Add js_repl_tools_only model and routing restrictions (#10671 ) # External (non-OpenAI) Pull Request Requirements Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed: https://github.com/openai/codex/blob/main/docs/contributing.md If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes. Include a link to a bug report or enhancement request. #### [git stack](https://github.com/magus/git-stack-cli) - ✅ `1` https://github.com/openai/codex/pull/10674 - ✅ `2` https://github.com/openai/codex/pull/10672 - 👉 `3` https://github.com/openai/codex/pull/10671 - ⏳ `4` https://github.com/openai/codex/pull/10673 - ⏳ `5` https://github.com/openai/codex/pull/10670	2026-02-12 15:41:05 -08:00
Michael Bolin	a4cc1a4a85	feat: introduce Permissions (#11633 ) ## Why We currently carry multiple permission-related concepts directly on `Config` for shell/unified-exec behavior (`approval_policy`, `sandbox_policy`, `network`, `shell_environment_policy`, `windows_sandbox_mode`). Consolidating these into one in-memory struct makes permission handling easier to reason about and sets up the next step: supporting named permission profiles (`[permissions.PROFILE_NAME]`) without changing behavior now. This change is mostly mechanical: it updates existing callsites to go through `config.permissions`, but it does not yet refactor those callsites to take a single `Permissions` value in places where multiple permission fields are still threaded separately. This PR intentionally does not change the on-disk `config.toml` format yet and keeps compatibility with legacy config keys. ## What Changed - Introduced `Permissions` in `core/src/config/mod.rs`. - Added `Config::permissions` and moved effective runtime permission fields under it: - `approval_policy` - `sandbox_policy` - `network` - `shell_environment_policy` - `windows_sandbox_mode` - Updated config loading/building so these effective values are still derived from the same existing config inputs and constraints. - Updated Windows sandbox helpers/resolution to read/write via `permissions`. - Threaded the new field through all permission consumers across core runtime, app-server, CLI/exec, TUI, and sandbox summary code. - Updated affected tests to reference `config.permissions.*`. - Renamed the struct/field from `EffectivePermissions`/`effective_permissions` to `Permissions`/`permissions` and aligned variable naming accordingly. ## Verification - `just fix -p codex-core -p codex-tui -p codex-cli -p codex-app-server -p codex-exec -p codex-utils-sandbox-summary` - `cargo build -p codex-core -p codex-tui -p codex-cli -p codex-app-server -p codex-exec -p codex-utils-sandbox-summary`	2026-02-12 14:42:54 -08:00
Dylan Hurd	4668feb43a	chore(core) Deprecate approval_policy: on-failure (#11631 ) ## Summary In an effort to start simplifying our sandbox setup, we're announcing this approval_policy as deprecated. In general, it performs worse than `on-request`, and we're focusing on making fewer sandbox configurations perform much better. ## Testing - [x] Tested locally - [x] Existing tests pass	2026-02-12 13:23:30 -08:00
Owen Lin	efc8d45750	feat(app-server): experimental flag to persist extended history (#11227 ) This PR adds an experimental `persist_extended_history` bool flag to app-server thread APIs so rollout logs can retain a richer set of EventMsgs for non-lossy Thread > Turn > ThreadItems reconstruction (i.e. on `thread/resume`). ### Motivation Today, our rollout recorder only persists a small subset (e.g. user message, reasoning, assistant message) of `EventMsg` types, dropping a good number (like command exec, file change, etc.) that are important for reconstructing full item history for `thread/resume`, `thread/read`, and `thread/fork`. Some clients want to be able to resume a thread without lossiness. This lossiness is primarily a UI thing, since what the model sees are `ResponseItem` and not `EventMsg`. ### Approach This change introduces an opt-in `persist_full_history` flag to preserve those events when you start/resume/fork a thread (defaults to `false`). This is done by adding an `EventPersistenceMode` to the rollout recorder: - `Limited` (existing behavior, default) - `Extended` (new opt-in behavior) In `Extended` mode, persist additional `EventMsg` variants needed for non-lossy app-server `ThreadItem` reconstruction. We now store the following ThreadItems that we didn't before: - web search - command execution - patch/file changes - MCP tool calls - image view calls - collab tool outcomes - context compaction - review mode enter/exit For command executions in particular, we truncate the output using the existing `truncate_text` from core to store an upper bound of 10,000 bytes, which is also the default value for truncating tool outputs shown to the model. This keeps the size of the rollout file and command execution items returned over the wire reasonable. And we also persist `EventMsg::Error` which we can now map back to the Turn's status and populates the Turn's error metadata. #### Updates to EventMsgs To truly make `thread/resume` non-lossy, we also needed to persist the `status` on `EventMsg::CommandExecutionEndEvent` and `EventMsg::PatchApplyEndEvent`. Previously it was not obvious whether a command failed or was declined (similar for apply_patch). These EventMsgs were never persisted before so I made it a required field.	2026-02-12 19:34:22 +00:00
canvrno-oai	22fa283511	Parse first order skill/connector mentions (#11547 ) This PR introduces a skill-expansion mechanism for mentions so nested or skill or connection mentions are expanded if present in skills invoked by the user. This keeps behavior aligned with existing mention handling while extending coverage to deeper scenarios. With these changes, users can create skills that invoke connectors, and skills that invoke other skills. Replaces #10863, which is not needed with the addition of [search_tool_bm25](https://github.com/openai/codex/issues/10657)	2026-02-12 10:55:22 -08:00
jif-oai	a0dab25c68	feat: mem slash commands (#11569 ) Add 2 slash commands for memories: * `/m_drop` delete all the memories * `/m_update` update the memories with phase 1 and 2	2026-02-12 10:39:43 +00:00
pakrym-oai	d391f3e2f9	Hide the first websocket retry (#11548 ) Sometimes connection needs to be quickly reestablished, don't produce an error for that.	2026-02-11 22:48:13 -08:00
Gabriel Peal	bd3ce98190	Bump rmcp to 0.15 (#11539 ) https://github.com/modelcontextprotocol/rust-sdk/pull/598 in 0.14 broke some MCP oauth (like Linear) and https://github.com/modelcontextprotocol/rust-sdk/pull/641 fixed it in 0.15	2026-02-11 22:04:17 -08:00
Ahmed Ibrahim	95fb86810f	Update context window after model switch (#11520 ) - Update token usage aggregation to refresh model context window after a model change. - Add protocol/core tests, including an e2e model-switch test that validates switching to a smaller model updates telemetry.	2026-02-11 17:41:23 -08:00
Ahmed Ibrahim	6938150c5e	Pre-sampling compact with previous model context (#11504 ) - Run pre-sampling compact through a single helper that builds previous-model turn context and compacts before the follow-up request when switching to a smaller context window. - Keep compaction events on the parent turn id and add compact suite coverage for switch-in-session and resume+switch flows.	2026-02-11 17:24:06 -08:00
Anton Panasenko	d3b078c282	Consolidate search_tool feature into apps (#11509 ) ## Summary - Remove `Feature::SearchTool` and the `search_tool` config key from the feature registry/schema. - Gate `search_tool_bm25` exposure via `Feature::Apps` in `core/src/tools/spec.rs`. - Update MCP selection logic in `core/src/codex.rs` to use `Feature::Apps` for search-tool behavior. - Update `core/tests/suite/search_tool.rs` to enable `Feature::Apps`. - Regenerate `core/config.schema.json` via `just write-config-schema`. ## Testing - `just fmt` - `cargo test -p codex-core --test all suite::search_tool::` ## Tickets - None	2026-02-11 16:52:42 -08:00
Ahmed Ibrahim	bb5dfd037a	Hydrate previous model across resume/fork/rollback/task start (#11497 ) - Replace pending resume model state with persistent previous_model and hydrate it on resume, fork, rollback, and task end in spawn_task	2026-02-11 16:45:18 -08:00
gt-oai	7112e16809	Add AfterToolUse hook (#11335 ) Not wired up to config yet. (So we can change the name if we want) An example payload: ``` { "session_id": "019c48b7-7098-7b61-bc48-32e82585d451", "cwd": "/Users/gt/code/codex/codex-rs", "triggered_at": "2026-02-10T18:02:31Z", "hook_event": { "event_type": "after_tool_use", "turn_id": "4", "call_id": "call_iuo4DqWgjE7OxQywnL2UzJUE", "tool_name": "apply_patch", "tool_kind": "custom", "tool_input": { "input_type": "custom", "input": "* Begin Patch\n* Update File: README.md\n@@\n-# Codex CLI hello (Rust Implementation)\n+# Codex CLI (Rust Implementation)\n*** End Patch\n" }, "executed": true, "success": true, "duration_ms": 37, "mutating": true, "sandbox": "none", "sandbox_policy": "danger-full-access", "output_preview": "{\"output\":\"Success. Updated the following files:\\nM README.md\\n\",\"metadata\":{\"exit_code\":0,\"duration_seconds\":0.0}}" } } ```	2026-02-11 22:25:04 +00:00
jif-oai	de6f2ef746	nit: memory truncation (#11479 ) Use existing truncation for memories	2026-02-11 21:11:57 +00:00
Curtis 'Fjord' Hawthorne	42e22f3bde	Add feature-gated freeform js_repl core runtime (#10674 ) ## Summary This PR adds an experimental, feature-gated `js_repl` core runtime so models can execute JavaScript in a persistent REPL context across tool calls. The implementation integrates with existing feature gating, tool registration, prompt composition, config/schema docs, and tests. ## What changed - Added new experimental feature flag: `features.js_repl`. - Added freeform `js_repl` tool and companion `js_repl_reset` tool. - Gated tool availability behind `Feature::JsRepl`. - Added conditional prompt-section injection for JS REPL instructions via marker-based prompt processing. - Implemented JS REPL handlers, including freeform parsing and pragma support (timeout/reset controls). - Added runtime resolution order for Node: 1. `CODEX_JS_REPL_NODE_PATH` 2. `js_repl_node_path` in config 3. `PATH` - Added JS runtime assets/version files and updated docs/schema. ## Why This enables richer agent workflows that require incremental JavaScript execution with preserved state, while keeping rollout safe behind an explicit feature flag. ## Testing Coverage includes: - Feature-flag gating behavior for tool exposure. - Freeform parser/pragma handling edge cases. - Runtime behavior (state persistence across calls and top-level `await` support). ## Usage ```toml [features] js_repl = true ``` Optional runtime override: - `CODEX_JS_REPL_NODE_PATH`, or - `js_repl_node_path` in config. #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/10674 - ⏳ `2` https://github.com/openai/codex/pull/10672 - ⏳ `3` https://github.com/openai/codex/pull/10671 - ⏳ `4` https://github.com/openai/codex/pull/10673 - ⏳ `5` https://github.com/openai/codex/pull/10670	2026-02-11 12:05:02 -08:00
jif-oai	d4b2c230f1	feat: memory read path (#11459 )	2026-02-11 18:22:45 +00:00
pakrym-oai	eac5473114	Do not attempt to append after response.completed (#11402 ) Completed responses are fully done, and new response must be created.	2026-02-11 07:45:17 -08:00
Michael Bolin	476c1a7160	Remove `test-support` feature from `codex-core` and replace it with explicit test toggles (#11405 ) ## Why `codex-core` was being built in multiple feature-resolved permutations because test-only behavior was modeled as crate features. For a large crate, those permutations increase compile cost and reduce cache reuse. ## Net Change - Removed the `test-support` crate feature and related feature wiring so `codex-core` no longer needs separate feature shapes for test consumers. - Standardized cross-crate test-only access behind `codex_core::test_support`. - External test code now imports helpers from `codex_core::test_support`. - Underlying implementation hooks are kept internal (`pub(crate)`) instead of broadly public. ## Outcome - Fewer `codex-core` build permutations. - Better incremental cache reuse across test targets. - No intended production behavior change.	2026-02-10 22:44:02 -08:00
xl-openai	fdd0cd1de9	feat: support multiple rate limits (#11260 ) Added multi-limit support end-to-end by carrying limit_name in rate-limit snapshots and handling multiple buckets instead of only codex. Extended /usage client parsing to consume additional_rate_limits Updated TUI /status and in-memory state to store/render per-limit snapshots Extended app-server rate-limit read response: kept rate_limits and added rate_limits_by_name. Adjusted usage-limit error messaging for non-default codex limit buckets	2026-02-10 20:09:31 -08:00
Celia Chen	641d5268fa	chore: persist turn_id in rollout session and make turn_id uuid based (#11246 ) Problem: 1. turn id is constructed in-memory; 2. on resuming threads, turn_id might not be unique; 3. client cannot no the boundary of a turn from rollout files easily. This PR does three things: 1. persist `task_started` and `task_complete` events; 1. persist `turn_id` in rollout turn events; 5. generate turn_id as unique uuids instead of incrementing it in memory. This helps us resolve the issue of clients wanting to have unique turn ids for resuming a thread, and knowing the boundry of each turn in rollout files. example debug logs ``` 2026-02-11T00:32:10.746876Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=8 turn=Turn { id: "019c4a07-d809-74c3-bc4b-fd9618487b4b", items: [UserMessage { id: "item-24", content: [Text { text: "hi", text_elements: [] }] }, AgentMessage { id: "item-25", text: "Hi. I’m in the workspace with your current changes loaded and ready. Send the next task and I’ll execute it end-to-end." }], status: Completed, error: None } 2026-02-11T00:32:10.746888Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=9 turn=Turn { id: "019c4a18-1004-76c0-a0fb-a77610f6a9b8", items: [UserMessage { id: "item-26", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-27", text: "Hello. Ready for the next change in `codex-rs`; I can continue from the current in-progress diff or start a new task." }], status: Completed, error: None } 2026-02-11T00:32:10.746899Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=10 turn=Turn { id: "019c4a19-41f0-7db0-ad78-74f1503baeb8", items: [UserMessage { id: "item-28", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-29", text: "Hello. Send the specific change you want in `codex-rs`, and I’ll implement it and run the required checks." }], status: Completed, error: None } ``` backward compatibility: if you try to resume an old session without task_started and task_complete event populated, the following happens: - If you resume and do nothing: those reconstructed historical IDs can differ next time you resume. - If you resume and send a new turn: the new turn gets a fresh UUID from live submission flow and is persisted, so that new turn’s ID is stable on later resumes. I think this behavior is fine, because we only care about deterministic turn id once a turn is triggered.	2026-02-11 03:56:01 +00:00
pakrym-oai	c68999ee6d	Prefer websocket transport when model opts in (#11386 ) Summary - add a `prefer_websockets` field to `ModelInfo`, defaulting to `false` in all fixtures and constructors - wire the new flag into websocket selection so models that opt in always use websocket transport even when the feature gate is off Testing - Not run (not requested)	2026-02-10 18:50:48 -08:00
jif-oai	a6e9469fa4	chore: unify memory job flow (#11334 )	2026-02-10 20:26:39 +00:00
Ahmed Ibrahim	5e01450963	Strip unsupported images from prompt history to guard against model switch (#11349 ) - Make `ContextManager::for_prompt` modality-aware and strip input_image content when the active model is text-only. - Added a test for multi-model -> text-only model switch	2026-02-10 11:58:00 -08:00
iceweasel-oai	82f93a13b2	include sandbox (seatbelt, elevated, etc.) as in turn metadata header (#10946 ) This will help us understand retention/usage for folks who use the Windows (or any other) sandboxes	2026-02-10 19:50:07 +00:00
pakrym-oai	e4b5384539	Extract tool building (#11337 ) Make it clear what input go into building tools and allow for easy reuse for pre-warm request	2026-02-10 11:45:23 -08:00
jif-oai	847a6092e6	fix: reduce usage of `open_if_present` (#11344 )	2026-02-10 19:25:07 +00:00
Shijie Rao	c4b771a16f	Fix: update parallel tool call exec approval to approve on request id (#11162 ) ### Summary In parallel tool call, exec command approvals were not approved at request level but at a turn level. i.e. when a single request is approved, the system currently treats all requests in turn as approved. ### Before https://github.com/user-attachments/assets/d50ed129-b3d2-4b2f-97fa-8601eb11f6a8 ### After https://github.com/user-attachments/assets/36528a43-a4aa-4775-9e12-f13287ef19fc	2026-02-10 09:38:00 -08:00
jif-oai	d735df1f50	Extract hooks into dedicated crate (#11311 ) Summary - move `core/src/hooks` implementation into a new `codex-hooks` crate with its own manifest - update `codex-rs` workspace and `codex-core` crate to depend on the extracted `hooks` crate and wire up the shared APIs - ensure references, modules, and lockfile reflect the new crate layout Testing - Not run (not requested)	2026-02-10 13:42:17 +00:00
jif-oai	6049ff02a0	memories: add extraction and prompt module foundation (#11200 ) ## Summary - add the new `core/src/memories` module (phase-one parsing, rollout filtering, storage, selection, prompts) - add Askama-backed memory templates for stage-one input/system and consolidation prompts - add module tests for parsing, filtering, path bucketing, and summary maintenance ## Testing - just fmt - cargo test -p codex-core --lib memories::	2026-02-10 10:10:24 +00:00
Michael Bolin	44ebf4588f	feat: retain NetworkProxy, when appropriate (#11207 ) As of this PR, `SessionServices` retains a `Option<StartedNetworkProxy>`, if appropriate. Now the `network` field on `Config` is `Option<NetworkProxySpec>` instead of `Option<NetworkProxy>`. Over in `Session::new()`, we invoke `NetworkProxySpec::start_proxy()` to create the `StartedNetworkProxy`, which is a new struct that retains the `NetworkProxy` as well as the `NetworkProxyHandle`. (Note that `Drop` is implemented for `NetworkProxyHandle` to ensure the proxies are shutdown when it is dropped.) The `NetworkProxy` from the `StartedNetworkProxy` is threaded through to the appropriate places. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/11207). * #11285 * __->__ #11207	2026-02-10 02:09:23 -08:00
alexsong-oai	91704c5672	feat: add SkillPolicy to skill metadata and support allow_implicit_invocation (#11244 ) Tested by setting the policy in agents/openai.yaml to true, false, and leaving it unset (default). ``` policy: allow_implicit_invocation: false ``` <img width="847" height="289" alt="Screenshot 2026-02-09 at 3 42 41 PM" src="https://github.com/user-attachments/assets/d3476264-3355-47cf-894a-4ffba53e3481" />	2026-02-09 23:13:27 -08:00
Matthew Zeng	005e040f97	[apps] Add thread_id param to optionally load thread config for apps feature check. (#11279 ) - [x] Add thread_id param to optionally load thread config for apps feature check	2026-02-09 23:10:26 -08:00
Ahmed Ibrahim	d1df3bd63b	Revert "Revert "Update models.json"" (#11256 ) Reverts openai/codex#11255	2026-02-09 19:22:41 -08:00
Ahmed Ibrahim	a1abd53b6a	Remove offline fallback for models (#11238 ) # External (non-OpenAI) Pull Request Requirements Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed: https://github.com/openai/codex/blob/main/docs/contributing.md If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes. Include a link to a bug report or enhancement request.	2026-02-09 16:58:54 -08:00
Matthew Zeng	d90df4761b	[apps] Add gated instructions for Apps. (#10924 ) - [x] Add gated instructions for Apps.	2026-02-09 14:48:09 -08:00
jif-oai	ffd4bd345c	feat: tie shell snapshot to cwd (#11231 ) Fix for this: https://github.com/openai/codex/issues/11223 Basically we tie the shell snapshot to a `cwd` to handle `cwd`-based env setups	2026-02-09 22:14:39 +00:00
Anton Panasenko	becc3a0424	feat: search_tool (#10657 ) Why We Did This - The goal is to reduce MCP tool context pollution by not exposing the full MCP tool list up front - It forces an explicit discovery step (`search_tool_bm25`) so the model narrows tool scope before making MCP calls, which helps relevance and lowers prompt/tool clutter. What It Changed - Added a new experimental feature flag `search_tool` in `core/src/features.rs:90` and `core/src/features.rs:430`. - Added config/schema support for that flag in `core/config.schema.json:214` and `core/config.schema.json:1235`. - Added BM25 dependency (`bm25`) in `Cargo.toml:129` and `core/Cargo.toml:23`. - Added new tool handler `search_tool_bm25` in `core/src/tools/handlers/search_tool_bm25.rs:18`. - Registered the handler and tool spec in `core/src/tools/handlers/mod.rs:11` and `core/src/tools/spec.rs:780` and `core/src/tools/spec.rs:1344`. - Extended `ToolsConfig` to carry `search_tool` enablement in `core/src/tools/spec.rs:32` and `core/src/tools/spec.rs:56`. - Injected dedicated developer instructions for tool-discovery workflow in `core/src/codex.rs:483` and `core/src/codex.rs:1976`, using `core/templates/search_tool/developer_instructions.md:1`. - Added session state to store one-shot selected MCP tools in `core/src/state/session.rs:27` and `core/src/state/session.rs:131`. - Added filtering so when feature is enabled, only selected MCP tools are exposed on the next request (then consumed) in `core/src/codex.rs:3800` and `core/src/codex.rs:3843`. - Added E2E suite coverage for enablement/instructions/hide-until-search/one-turn-selection in `core/tests/suite/search_tool.rs:72`, `core/tests/suite/search_tool.rs:109`, `core/tests/suite/search_tool.rs:147`, and `core/tests/suite/search_tool.rs:218`. - Refactored test helper utilities to support config-driven tool collection in `core/tests/suite/tools.rs:281`. Net Behavioral Effect - With `search_tool` off: existing MCP behavior (tools exposed normally). - With `search_tool` on: MCP tools start hidden, model must call `search_tool_bm25`, and only returned `selected_tools` are available for the next model call.	2026-02-09 12:53:50 -08:00
Charley Cunningham	9450cd9ce5	core: add focused diagnostics for remote compaction overflows (#11133 ) ## Summary - add targeted remote-compaction failure diagnostics in compact_remote logging - log the specific values needed to explain overflow timing: - last_api_response_total_tokens - estimated_tokens_of_items_added_since_last_successful_api_response - estimated_bytes_of_items_added_since_last_successful_api_response - failing_compaction_request_body_bytes - simplify breakdown naming and remove last_api_response_total_bytes_estimate (it was an approximation and not useful for debugging) ## Why When compaction fails with context_length_exceeded, we need concrete, low-ambiguity numbers that map directly to: 1) what the API most recently reported, and 2) what local history added since then. This keeps the failure logs actionable without adding broad, noisy metrics. ## Testing - just fmt - cargo test -p codex-core	2026-02-09 12:42:20 -08:00
pakrym-oai	ccd17374cb	Move warmup to the task level (#11216 ) Instead of storing a special connection on the client level make the regular task responsible for establishing a normal client session and open a connection on it. Then when the turn is started we pass in a pre-established session.	2026-02-09 10:57:52 -08:00
jif-oai	6cf61725d0	feat: do not close unified exec processes across turns (#10799 ) With this PR we do not close the unified exec processes (i.e. background terminals) at the end of a turn unless: * The user interrupt the turn * The user decide to clean the processes through `app-server` or `/clean` I made sure that `codex exec` correctly kill all the processes	2026-02-09 10:27:46 +00:00
Michael Bolin	383b45279e	feat: include NetworkConfig through ExecParams (#11105 ) This PR adds the following field to `Config`: ```rust pub network: Option<NetworkProxy>, ``` Though for the moment, it will always be initialized as `None` (this will be addressed in a subsequent PR). This PR does the work to thread `network` through to `execute_exec_env()`, `process_exec_tool_call()`, and `UnifiedExecRuntime.run()` to ensure it is available whenever we span a process.	2026-02-09 03:32:17 +00:00
Michael Bolin	181b721ba5	feat: include [experimental_network] in <environment_context> (#11044 ) If `NetworkConstraints` is set, then include the relevant settings on `<environment_context>`. Example: ```xml <environment_context> <cwd>/repo</cwd> <shell>bash</shell> <network enabled="true"> <allowed>api.example.com</allowed> <allowed>*.openai.com</allowed> <denied>blocked.example.com</denied> </network> </environment_context> ```	2026-02-08 15:16:50 -08:00
Matthew Zeng	9f1009540b	Upgrade rmcp to 0.14 (#10718 ) - [x] Upgrade rmcp to 0.14	2026-02-08 15:07:53 -08:00
Eric Traut	b3de6c7f2b	Defer persistence of rollout file (#11028 ) - Defer rollout persistence for fresh threads (`InitialHistory::New`): keep rollout events in memory and only materialize rollout file + state DB row on first `EventMsg::UserMessage`. - Keep precomputed rollout path available before materialization. - Change `thread/start` to build thread response from live config snapshot and optional precomputed path. - Improve pre-materialization behavior in app-server/TUI: clearer invalid-request errors for file-backed ops and a friendlier `/fork` “not ready yet” UX. - Update tests to match deferred semantics across start/read/archive/unarchive/fork/resume/review flows. - Improved resilience of user_shell test, which should be unrelated to this change but must be affected by timing changes For Reviewers: * The primary change is in recorder.rs * Most of the other changes were to fix up broken assumptions in existing tests Testing: * Manually tested CLI * Exercised app server paths by manually running IDE Extension with rebuilt CLI binary * Only user-visible change is that `/fork` in TUI generates visible error if used prior to first turn	2026-02-07 23:05:03 -08:00

1 2 3 4 5 ...

584 Commits