codex

mirror of https://github.com/openai/codex.git synced 2026-06-01 19:02:59 +00:00

Author	SHA1	Message	Date
jif-oai	77f74a5c17	fix: race in js repl (#11922 ) js_repl_reset previously raced with in-flight/new js_repl executions because reset() could clear exec_tool_calls without synchronizing with execute(). In that window, a running exec could lose its per-exec tool-call context, and subsequent kernel RunTool messages would fail with js_repl exec context not found. The fix serializes reset and execute on the same exec_lock, so reset cannot run concurrently with exec setup/teardown. We also keep the timeout path safe by performing reset steps inline while execute() already holds the lock, avoiding re-entrant lock acquisition. A regression test now verifies that reset waits for the exec lock and does not clear tool-call state early.	2026-02-17 11:06:14 +00:00
jif-oai	846464e869	fix: js_repl reset hang by clearing exec tool calls without waiting (#11932 ) Remove the waiting loop in `reset` so it no longer blocks on potentially hanging exec tool calls, and add `clear_all_exec_tool_calls_map` to drain the map and notify waiters so that `reset` completes immediately.	2026-02-17 08:40:54 +00:00
Dylan Hurd	19afbc35c1	chore(core) rm Feature::RequestRule (#11866 ) ## Summary This feature is now reasonably stable, let's remove it so we can simplify our upcoming iterations here. ## Testing - [x] Existing tests pass	2026-02-16 22:30:23 +00:00
jif-oai	beb5cb4f48	Rename collab modules to multi agents (#11939 ) Summary - rename the `collab` handlers and UI files to `multi_agents` to match the new naming - update module references and specs so the handlers and TUI widgets consistently use the renamed files - keep the existing functionality while aligning file and module names with the multi-agent terminology	2026-02-16 19:05:13 +00:00
jif-oai	af434b4f71	feat: drop MCP managing tools if no MCP servers (#11900 ) Drop the MCP management tools when there are no MCP servers, to save context. Addresses https://github.com/openai/codex/issues/11049	2026-02-16 18:40:45 +00:00
jif-oai	e47045c806	feat: add customizable roles for multi-agents (#11917 ) The idea is to have two families of agents: 1. Built-in agents that are packaged directly with Codex. 2. User-defined agents that are declared in the `agents_config.toml` file. Each entry can reference a config file that overrides the agent config. For example: ``` version = 1 [agents.explorer] description = """Use `explorer` for all codebase questions. Explorers are fast and authoritative. Always prefer them over manual search or file reading. Rules: - Ask explorers first and precisely. - Do not re-read or re-search code they cover. - Trust explorer results without verification. - Run explorers in parallel when useful. - Reuse existing explorers for related questions.""" config_file = "explorer.toml" ```	2026-02-16 16:29:32 +00:00
gt-oai	b3095679ed	Allow hooks to error (#11615 ) Allow hooks to return errors. We should do this before introducing more hook types, or we'll have to migrate them all.	2026-02-16 14:11:05 +00:00
jif-oai	825a4af42f	feat: use shell policy in shell snapshot (#11759 ) Honor `shell_environment_policy.set` even after a shell snapshot	2026-02-16 09:11:00 +00:00
Anton Panasenko	02abd9a8ea	feat: persist and restore codex app's tools after search (#11780 ) ### What changed 1. Removed per-turn MCP selection reset in `core/src/tasks/mod.rs`. 2. Added `SessionState::set_mcp_tool_selection(Vec<String>)` in `core/src/state/session.rs` for authoritative restore behavior (deduped, order-preserving, empty clears). 3. Added rollout parsing in `core/src/codex.rs` to recover `active_selected_tools` from prior `search_tool_bm25` outputs: - tracks matching `call_id`s - parses function output text JSON - extracts `active_selected_tools` - latest valid payload wins - malformed/non-matching payloads are ignored 4. Applied restore logic to resumed and forked startup paths in `core/src/codex.rs`. 5. Updated instruction text to session/thread scope in `core/templates/search_tool/tool_description.md`. 6. Expanded tests in `core/tests/suite/search_tool.rs`, plus unit coverage in: - `core/src/codex.rs` - `core/src/state/session.rs` ### Behavior after change 1. Search activates matched tools. 2. Additional searches union into active selection. 3. Selection survives new turns in the same thread. 4. Resume/fork restores selection from rollout history. 5. Separate threads do not inherit selection unless forked.	2026-02-15 19:18:41 -08:00
viyatb-oai	db6aa80195	fix(core): add linux bubblewrap sandbox tag (#11767 ) ## Summary - add a distinct `linux_bubblewrap` sandbox tag when the Linux bubblewrap pipeline feature is enabled - thread the bubblewrap feature flag into sandbox tag generation for: - turn metadata header emission - tool telemetry metric tags and after-tool-use hooks - add focused unit tests for `sandbox_tag` precedence and Linux bubblewrap behavior ## Validation - `just fmt` - `cargo clippy -p codex-core --all-targets` - `cargo test -p codex-core sandbox_tags::tests` - started `cargo test -p codex-core` and stopped it per request Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>	2026-02-14 19:00:01 +00:00
viyatb-oai	b527ee2890	feat(core): add structured network approval plumbing and policy decision model (#11672 ) ### Description #### Summary Introduces the core plumbing required for structured network approvals #### What changed - Added structured network policy decision modeling in core. - Added approval payload/context types needed for network approval semantics. - Wired shell/unified-exec runtime plumbing to consume structured decisions. - Updated related core error/event surfaces for structured handling. - Updated protocol plumbing used by core approval flow. - Included small CLI debug sandbox compatibility updates needed by this layer. #### Why establishes the minimal backend foundation for network approvals without yet changing high-level orchestration or TUI behavior. #### Notes - Behavior remains constrained by existing requirements/config gating. - Follow-up PRs in the stack handle orchestration, UX, and app-server integration. --------- Co-authored-by: Codex <199175422+chatgpt-codex-connector[bot]@users.noreply.github.com>	2026-02-14 04:18:12 +00:00
Curtis 'Fjord' Hawthorne	0d76d029b7	Fix js_repl in-flight tool-call waiter race (#11800 ) ## Summary This PR fixes a race in `js_repl` tool-call draining that could leave an exec waiting indefinitely for in-flight tool calls to finish. The fix is in: - `/Users/fjord/code/codex-jsrepl-seq/codex-rs/core/src/tools/js_repl/mod.rs` ## Problem `js_repl` tracks in-flight tool calls per exec and waits for them to drain on completion/timeout/cancel paths. The previous wait logic used a check-then-wait pattern with `Notify` that could miss a wakeup: 1. Observe `in_flight > 0` 2. Drop lock 3. Register wait (`notified().await`) If `notify_waiters()` happened between (2) and (3), the waiter could sleep until another notification that never comes. ## What changed - Updated all exec-tool-call wait loops to create an owned notification future while holding the lock: - use `Arc<Notify>::notified_owned()` instead of cloning notify and awaiting later. - Applied this consistently to: - `wait_for_exec_tool_calls` - `wait_for_all_exec_tool_calls` - `wait_for_exec_tool_calls_map` This preserves existing behavior while eliminating the lost-wakeup window. ## Test coverage Added a regression test: - `wait_for_exec_tool_calls_map_drains_inflight_calls_without_hanging` The test repeatedly races waiter/finisher tasks and asserts bounded completion to catch hangs. ## Impact - No API changes. - No user-facing behavior changes intended. - Improves reliability of exec lifecycle boundaries when tool calls are still in flight. #### [git stack](https://github.com/magus/git-stack-cli) - ✅ `1` https://github.com/openai/codex/pull/11796 - 👉 `2` https://github.com/openai/codex/pull/11800 - ⏳ `3` https://github.com/openai/codex/pull/10673 - ⏳ `4` https://github.com/openai/codex/pull/10670	2026-02-14 01:24:52 +00:00
Curtis 'Fjord' Hawthorne	6cbb489e6e	Fix js_repl view_image test runtime panic (#11796 ) ## Summary Fixes a flaky/panicking `js_repl` image-path test by running it on a multi-thread Tokio runtime and tightening assertions to focus on real behavior. ## Problem `js_repl_can_attach_image_via_view_image_tool` in `/Users/fjord/code/codex-jsrepl-seq/codex-rs/core/src/tools/js_repl/mod.rs` can panic under single-thread test runtime with: `can call blocking only when running on the multi-threaded runtime` It also asserted a brittle user-facing text string. ## Changes 1. Updated the test runtime to: `#[tokio::test(flavor = "multi_thread", worker_threads = 2)]` 2. Removed the brittle `"attached local image path"` string assertion. 3. Kept the concrete side-effect assertions: - tool call succeeds - image is actually injected into pending input (`InputImage` with `data:image/png;base64,...`) ## Why this is safe This is test-only behavior. No production runtime code paths are changed. ## Validation - Ran: `cargo test -p codex-core tools::js_repl::tests::js_repl_can_attach_image_via_view_image_tool -- --nocapture` - Result: pass #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/11796 - ⏳ `2` https://github.com/openai/codex/pull/11800 - ⏳ `3` https://github.com/openai/codex/pull/10673 - ⏳ `4` https://github.com/openai/codex/pull/10670	2026-02-14 01:11:13 +00:00
Curtis 'Fjord' Hawthorne	a02342c9e1	Add js_repl kernel crash diagnostics (#11666 ) ## Summary This PR improves `js_repl` crash diagnostics so kernel failures are debuggable without weakening timeout/reset guarantees. ## What Changed - Added bounded kernel stderr capture and truncation logic (line + byte caps). - Added structured kernel snapshots (`pid`, exit status, stderr tail) for failure paths. - Enriched model-visible kernel-failure errors with a structured diagnostics payload: - `js_repl diagnostics: {...}` - Included only for likely kernel-failure write/EOF cases. - Improved logging around kernel write failures, unexpected exits, and kill/wait paths. - Added/updated unit tests for: - UTF-8-safe truncation - stderr tail bounds - structured diagnostics shape/truncation - conditional diagnostics emission - timeout kill behavior - forced kernel-failure diagnostics ## Why Before this, failures like broken pipe / unexpected kernel exit often surfaced as generic errors with little context. This change preserves existing behavior but adds actionable diagnostics while keeping output bounded. ## Scope - Code changes are limited to: - `/Users/fjord/code/codex-jsrepl-seq/codex-rs/core/src/tools/js_repl/mod.rs` ## Validation - `cargo clippy -p codex-core --all-targets -- -D warnings` - Targeted `codex-core` js_repl unit tests (including new diagnostics/timeout coverage) - Tried starting a long running js_repl command (sleep for 10 minutes), verified error output was as expected after killing the node process. #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/11666 - ⏳ `2` https://github.com/openai/codex/pull/10673 - ⏳ `3` https://github.com/openai/codex/pull/10670	2026-02-13 11:57:11 -08:00
Anton Panasenko	38c442ca7f	core: limit search_tool_bm25 to Apps and clarify discovery guidance (#11669 ) ## Summary - Limit `search_tool_bm25` indexing to `codex_apps` tools only, so non-Apps MCP servers are no longer discoverable through this search path. - Move search-tool discovery guidance into the `search_tool_bm25` tool description (via template include) instead of injecting it as a separate developer message. - Update Apps discovery guidance wording to clarify when to use `search_tool_bm25` for Apps-backed systems (for example Slack, Google Drive, Jira, Notion) and when to call tools directly. - Remove dead `core` helper code (`filter_codex_apps_mcp_tools` and `codex_apps_connector_id`) that is no longer used after the tool-selection refactor. - Update `core` search-tool tests to assert codex-apps-only behavior and to validate guidance from the tool description. ## Validation - ✅ `just fmt` - ✅ `cargo test -p codex-core search_tool` - ⚠️ `cargo test -p codex-core` was attempted, but the run repeatedly stalled on `tools::js_repl::tests::js_repl_can_attach_image_via_view_image_tool`. ## Tickets - None	2026-02-13 09:32:46 -08:00
Matthew Zeng	c37560069a	[apps] Add is_enabled to app info. (#11417 ) - [x] Add is_enabled to app info and the response of `app/list`. - [x] Update TUI to have Enable/Disable button on the app detail page.	2026-02-13 00:30:52 +00:00
Curtis 'Fjord' Hawthorne	0dcfc59171	Add js_repl_tools_only model and routing restrictions (#10671 ) #### [git stack](https://github.com/magus/git-stack-cli) - ✅ `1` https://github.com/openai/codex/pull/10674 - ✅ `2` https://github.com/openai/codex/pull/10672 - 👉 `3` https://github.com/openai/codex/pull/10671 - ⏳ `4` https://github.com/openai/codex/pull/10673 - ⏳ `5` https://github.com/openai/codex/pull/10670	2026-02-12 15:41:05 -08:00
Michael Bolin	a4cc1a4a85	feat: introduce Permissions (#11633 ) ## Why We currently carry multiple permission-related concepts directly on `Config` for shell/unified-exec behavior (`approval_policy`, `sandbox_policy`, `network`, `shell_environment_policy`, `windows_sandbox_mode`). Consolidating these into one in-memory struct makes permission handling easier to reason about and sets up the next step: supporting named permission profiles (`[permissions.PROFILE_NAME]`) without changing behavior now. This change is mostly mechanical: it updates existing callsites to go through `config.permissions`, but it does not yet refactor those callsites to take a single `Permissions` value in places where multiple permission fields are still threaded separately. This PR intentionally does not change the on-disk `config.toml` format yet and keeps compatibility with legacy config keys. ## What Changed - Introduced `Permissions` in `core/src/config/mod.rs`. - Added `Config::permissions` and moved effective runtime permission fields under it: - `approval_policy` - `sandbox_policy` - `network` - `shell_environment_policy` - `windows_sandbox_mode` - Updated config loading/building so these effective values are still derived from the same existing config inputs and constraints. - Updated Windows sandbox helpers/resolution to read/write via `permissions`. - Threaded the new field through all permission consumers across core runtime, app-server, CLI/exec, TUI, and sandbox summary code. - Updated affected tests to reference `config.permissions.*`. - Renamed the struct/field from `EffectivePermissions`/`effective_permissions` to `Permissions`/`permissions` and aligned variable naming accordingly. ## Verification - `just fix -p codex-core -p codex-tui -p codex-cli -p codex-app-server -p codex-exec -p codex-utils-sandbox-summary` - `cargo build -p codex-core -p codex-tui -p codex-cli -p codex-app-server -p codex-exec -p codex-utils-sandbox-summary`	2026-02-12 14:42:54 -08:00
Curtis 'Fjord' Hawthorne	466be55abc	Add js_repl host helpers and exec end events (#10672 ) ## Summary This PR adds host-integrated helper APIs for `js_repl` and updates model guidance so the agent can use them reliably. ### What’s included - Add `codex.tool(name, args?)` in the JS kernel so `js_repl` can call normal Codex tools. - Keep persistent JS state and scratch-path helpers available: - `codex.state` - `codex.tmpDir` - Wire `js_repl` tool calls through the standard tool router path. - Add/align `js_repl` execution completion/end event behavior with existing tool logging patterns. - Update dynamic prompt injection (`project_doc`) to document: - how to call `codex.tool(...)` - raw output behavior - image flow via `view_image` (`codex.tmpDir` + `codex.tool("view_image", ...)`) - stdio safety guidance (`console.log` / `codex.tool`, avoid direct `process.std*`) ## Why - Standardize JS-side tool usage on `codex.tool(...)` - Make `js_repl` behavior more consistent with existing tool execution and event/logging patterns. - Give the model enough runtime guidance to use `js_repl` safely and effectively. ## Testing - Added/updated unit and runtime tests for: - `codex.tool` calls from `js_repl` (including shell/MCP paths) - image handoff flow via `view_image` - prompt-injection text for `js_repl` guidance - execution/end event behavior and related regression coverage #### [git stack](https://github.com/magus/git-stack-cli) - ✅ `1` https://github.com/openai/codex/pull/10674 - 👉 `2` https://github.com/openai/codex/pull/10672 - ⏳ `3` https://github.com/openai/codex/pull/10671 - ⏳ `4` https://github.com/openai/codex/pull/10673 - ⏳ `5` https://github.com/openai/codex/pull/10670	2026-02-12 12:10:25 -08:00
Owen Lin	efc8d45750	feat(app-server): experimental flag to persist extended history (#11227 ) This PR adds an experimental `persist_extended_history` bool flag to app-server thread APIs so rollout logs can retain a richer set of EventMsgs for non-lossy Thread > Turn > ThreadItems reconstruction (i.e. on `thread/resume`). ### Motivation Today, our rollout recorder only persists a small subset (e.g. user message, reasoning, assistant message) of `EventMsg` types, dropping a good number (like command exec, file change, etc.) that are important for reconstructing full item history for `thread/resume`, `thread/read`, and `thread/fork`. Some clients want to be able to resume a thread without lossiness. This lossiness is primarily a UI thing, since what the model sees are `ResponseItem` and not `EventMsg`. ### Approach This change introduces an opt-in `persist_extended_history` flag to preserve those events when you start/resume/fork a thread (defaults to `false`). This is done by adding an `EventPersistenceMode` to the rollout recorder: - `Limited` (existing behavior, default) - `Extended` (new opt-in behavior) In `Extended` mode, persist additional `EventMsg` variants needed for non-lossy app-server `ThreadItem` reconstruction. We now store the following ThreadItems that we didn't before: - web search - command execution - patch/file changes - MCP tool calls - image view calls - collab tool outcomes - context compaction - review mode enter/exit For command executions in particular, we truncate the output using the existing `truncate_text` from core to store an upper bound of 10,000 bytes, which is also the default value for truncating tool outputs shown to the model. This keeps the size of the rollout file and command execution items returned over the wire reasonable. We also persist `EventMsg::Error`, which we can now map back to the Turn's status and use to populate the Turn's error metadata. 
#### Updates to EventMsgs To truly make `thread/resume` non-lossy, we also needed to persist the `status` on `EventMsg::CommandExecutionEndEvent` and `EventMsg::PatchApplyEndEvent`. Previously it was not obvious whether a command failed or was declined (similar for apply_patch). These EventMsgs were never persisted before so I made it a required field.	2026-02-12 19:34:22 +00:00
Gabriel Peal	bd3ce98190	Bump rmcp to 0.15 (#11539 ) https://github.com/modelcontextprotocol/rust-sdk/pull/598 in 0.14 broke some MCP oauth (like Linear) and https://github.com/modelcontextprotocol/rust-sdk/pull/641 fixed it in 0.15	2026-02-11 22:04:17 -08:00
Michael Bolin	abbd74e2be	feat: make sandbox read access configurable with `ReadOnlyAccess` (#11387 ) `SandboxPolicy::ReadOnly` previously implied broad read access and could not express a narrower read surface. This change introduces an explicit read-access model so we can support user-configurable read restrictions in follow-up work, while preserving current behavior today. It also ensures unsupported backends fail closed for restricted-read policies instead of silently granting broader access than intended. ## What - Added `ReadOnlyAccess` in protocol with: - `Restricted { include_platform_defaults, readable_roots }` - `FullAccess` - Updated `SandboxPolicy` to carry read-access configuration: - `ReadOnly { access: ReadOnlyAccess }` - `WorkspaceWrite { ..., read_only_access: ReadOnlyAccess }` - Preserved existing behavior by defaulting current construction paths to `ReadOnlyAccess::FullAccess`. - Threaded the new fields through sandbox policy consumers and call sites across `core`, `tui`, `linux-sandbox`, `windows-sandbox`, and related tests. - Updated Seatbelt policy generation to honor restricted read roots by emitting scoped read rules when full read access is not granted. - Added fail-closed behavior on Linux and Windows backends when restricted read access is requested but not yet implemented there (`UnsupportedOperation`). - Regenerated app-server protocol schema and TypeScript artifacts, including `ReadOnlyAccess`. ## Compatibility / rollout - Runtime behavior remains unchanged by default (`FullAccess`). - API/schema changes are in place so future config wiring can enable restricted read access without another policy-shape migration.	2026-02-11 18:31:14 -08:00
Anton Panasenko	d3b078c282	Consolidate search_tool feature into apps (#11509 ) ## Summary - Remove `Feature::SearchTool` and the `search_tool` config key from the feature registry/schema. - Gate `search_tool_bm25` exposure via `Feature::Apps` in `core/src/tools/spec.rs`. - Update MCP selection logic in `core/src/codex.rs` to use `Feature::Apps` for search-tool behavior. - Update `core/tests/suite/search_tool.rs` to enable `Feature::Apps`. - Regenerate `core/config.schema.json` via `just write-config-schema`. ## Testing - `just fmt` - `cargo test -p codex-core --test all suite::search_tool::` ## Tickets - None	2026-02-11 16:52:42 -08:00
gt-oai	7112e16809	Add AfterToolUse hook (#11335 ) Not wired up to config yet. (So we can change the name if we want) An example payload: ``` { "session_id": "019c48b7-7098-7b61-bc48-32e82585d451", "cwd": "/Users/gt/code/codex/codex-rs", "triggered_at": "2026-02-10T18:02:31Z", "hook_event": { "event_type": "after_tool_use", "turn_id": "4", "call_id": "call_iuo4DqWgjE7OxQywnL2UzJUE", "tool_name": "apply_patch", "tool_kind": "custom", "tool_input": { "input_type": "custom", "input": "* Begin Patch\n* Update File: README.md\n@@\n-# Codex CLI hello (Rust Implementation)\n+# Codex CLI (Rust Implementation)\n*** End Patch\n" }, "executed": true, "success": true, "duration_ms": 37, "mutating": true, "sandbox": "none", "sandbox_policy": "danger-full-access", "output_preview": "{\"output\":\"Success. Updated the following files:\\nM README.md\\n\",\"metadata\":{\"exit_code\":0,\"duration_seconds\":0.0}}" } } ```	2026-02-11 22:25:04 +00:00
Curtis 'Fjord' Hawthorne	42e22f3bde	Add feature-gated freeform js_repl core runtime (#10674 ) ## Summary This PR adds an experimental, feature-gated `js_repl` core runtime so models can execute JavaScript in a persistent REPL context across tool calls. The implementation integrates with existing feature gating, tool registration, prompt composition, config/schema docs, and tests. ## What changed - Added new experimental feature flag: `features.js_repl`. - Added freeform `js_repl` tool and companion `js_repl_reset` tool. - Gated tool availability behind `Feature::JsRepl`. - Added conditional prompt-section injection for JS REPL instructions via marker-based prompt processing. - Implemented JS REPL handlers, including freeform parsing and pragma support (timeout/reset controls). - Added runtime resolution order for Node: 1. `CODEX_JS_REPL_NODE_PATH` 2. `js_repl_node_path` in config 3. `PATH` - Added JS runtime assets/version files and updated docs/schema. ## Why This enables richer agent workflows that require incremental JavaScript execution with preserved state, while keeping rollout safe behind an explicit feature flag. ## Testing Coverage includes: - Feature-flag gating behavior for tool exposure. - Freeform parser/pragma handling edge cases. - Runtime behavior (state persistence across calls and top-level `await` support). ## Usage ```toml [features] js_repl = true ``` Optional runtime override: - `CODEX_JS_REPL_NODE_PATH`, or - `js_repl_node_path` in config. #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/10674 - ⏳ `2` https://github.com/openai/codex/pull/10672 - ⏳ `3` https://github.com/openai/codex/pull/10671 - ⏳ `4` https://github.com/openai/codex/pull/10673 - ⏳ `5` https://github.com/openai/codex/pull/10670	2026-02-11 12:05:02 -08:00
jif-oai	2fac9cc8cd	chore: sub-agent never ask for approval (#11464 )	2026-02-11 19:19:37 +00:00
Michael Bolin	476c1a7160	Remove `test-support` feature from `codex-core` and replace it with explicit test toggles (#11405 ) ## Why `codex-core` was being built in multiple feature-resolved permutations because test-only behavior was modeled as crate features. For a large crate, those permutations increase compile cost and reduce cache reuse. ## Net Change - Removed the `test-support` crate feature and related feature wiring so `codex-core` no longer needs separate feature shapes for test consumers. - Standardized cross-crate test-only access behind `codex_core::test_support`. - External test code now imports helpers from `codex_core::test_support`. - Underlying implementation hooks are kept internal (`pub(crate)`) instead of broadly public. ## Outcome - Fewer `codex-core` build permutations. - Better incremental cache reuse across test targets. - No intended production behavior change.	2026-02-10 22:44:02 -08:00
iceweasel-oai	82f93a13b2	include sandbox (seatbelt, elevated, etc.) in turn metadata header (#10946 ) This will help us understand retention/usage for folks who use the Windows (or any other) sandboxes	2026-02-10 19:50:07 +00:00
viyatb-oai	62d0f302fd	fix(core): canonicalize wrapper approvals and support heredoc prefix … (#10941 ) ## Summary - Reduced repeated approvals for equivalent wrapper commands and fixed execpolicy matching for heredoc-style shell invocations, with minimal behavior change and fail-closed defaults. ## Fixes 1. Canonicalized approval matching for wrappers so equivalent commands map to the same approval intent. 2. Added heredoc-aware prefix extraction for execpolicy so commands like `python3 <<'PY' ... PY` match rules such as `prefix_rule(["python3"], ...)`. 3. Kept fallback behavior conservative: if parsing is ambiguous, existing prompt behavior is preserved. ## Edge Cases Covered - Wrapper path/name differences: `/bin/bash` vs `bash`, `/bin/zsh` vs `zsh`. - Shell modes: `-c` and `-lc`. - Heredoc forms: quoted delimiter (`<<'PY'`) and unquoted delimiter (`<< PY`). - Multi-command heredoc scripts are rejected by the fallback. - Non-heredoc redirections (`>`, etc.) are not treated as heredoc prefix matches. - Complex scripts still fall back to prior behavior rather than expanding permissions. --------- Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>	2026-02-10 11:46:40 -08:00
Ahmed Ibrahim	6e96e4837e	Always expose view_image and return unsupported image-input error (#11336 ) - Keep `view_image` in the advertised tool list for all models. - Return a clear error when the current model does not support image inputs, and cover it with a unit test.	2026-02-10 11:25:12 -08:00
jif-oai	87ccc5bbae	feat: add connector capabilities to sub-agents (#11191 )	2026-02-10 11:53:01 +00:00
Michael Bolin	44ebf4588f	feat: retain NetworkProxy, when appropriate (#11207 ) As of this PR, `SessionServices` retains an `Option<StartedNetworkProxy>`, if appropriate. Now the `network` field on `Config` is `Option<NetworkProxySpec>` instead of `Option<NetworkProxy>`. Over in `Session::new()`, we invoke `NetworkProxySpec::start_proxy()` to create the `StartedNetworkProxy`, which is a new struct that retains the `NetworkProxy` as well as the `NetworkProxyHandle`. (Note that `Drop` is implemented for `NetworkProxyHandle` to ensure the proxies are shut down when it is dropped.) The `NetworkProxy` from the `StartedNetworkProxy` is threaded through to the appropriate places. --- Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/11207). * #11285 * __->__ #11207	2026-02-10 02:09:23 -08:00
viyatb-oai	3391e5ea86	feat(sandbox): enforce proxy-aware network routing in sandbox (#11113 ) ## Summary - expand proxy env injection to cover common tool env vars (`HTTP_PROXY`/`HTTPS_PROXY`/`ALL_PROXY`/`NO_PROXY` families + tool-specific variants) - harden macOS Seatbelt network policy generation to route through inferred loopback proxy endpoints and fail closed when proxy env is malformed - thread proxy-aware Linux sandbox flags and add minimal bwrap netns isolation hook for restricted non-proxy runs - add/refresh tests for proxy env wiring, Seatbelt policy generation, and Linux sandbox argument wiring	2026-02-10 07:44:21 +00:00
Ahmed Ibrahim	a1abd53b6a	Remove offline fallback for models (#11238 )	2026-02-09 16:58:54 -08:00
jif-oai	ffd4bd345c	feat: tie shell snapshot to cwd (#11231 ) Fix for this: https://github.com/openai/codex/issues/11223 Basically we tie the shell snapshot to a `cwd` to handle `cwd`-based env setups	2026-02-09 22:14:39 +00:00
Anton Panasenko	becc3a0424	feat: search_tool (#10657 ) Why We Did This - The goal is to reduce MCP tool context pollution by not exposing the full MCP tool list up front - It forces an explicit discovery step (`search_tool_bm25`) so the model narrows tool scope before making MCP calls, which helps relevance and lowers prompt/tool clutter. What It Changed - Added a new experimental feature flag `search_tool` in `core/src/features.rs:90` and `core/src/features.rs:430`. - Added config/schema support for that flag in `core/config.schema.json:214` and `core/config.schema.json:1235`. - Added BM25 dependency (`bm25`) in `Cargo.toml:129` and `core/Cargo.toml:23`. - Added new tool handler `search_tool_bm25` in `core/src/tools/handlers/search_tool_bm25.rs:18`. - Registered the handler and tool spec in `core/src/tools/handlers/mod.rs:11` and `core/src/tools/spec.rs:780` and `core/src/tools/spec.rs:1344`. - Extended `ToolsConfig` to carry `search_tool` enablement in `core/src/tools/spec.rs:32` and `core/src/tools/spec.rs:56`. - Injected dedicated developer instructions for tool-discovery workflow in `core/src/codex.rs:483` and `core/src/codex.rs:1976`, using `core/templates/search_tool/developer_instructions.md:1`. - Added session state to store one-shot selected MCP tools in `core/src/state/session.rs:27` and `core/src/state/session.rs:131`. - Added filtering so when feature is enabled, only selected MCP tools are exposed on the next request (then consumed) in `core/src/codex.rs:3800` and `core/src/codex.rs:3843`. - Added E2E suite coverage for enablement/instructions/hide-until-search/one-turn-selection in `core/tests/suite/search_tool.rs:72`, `core/tests/suite/search_tool.rs:109`, `core/tests/suite/search_tool.rs:147`, and `core/tests/suite/search_tool.rs:218`. - Refactored test helper utilities to support config-driven tool collection in `core/tests/suite/tools.rs:281`. Net Behavioral Effect - With `search_tool` off: existing MCP behavior (tools exposed normally). 
- With `search_tool` on: MCP tools start hidden, model must call `search_tool_bm25`, and only returned `selected_tools` are available for the next model call.	2026-02-09 12:53:50 -08:00
jif-oai	c2bfd1e473	Revert "chore: enable sub agents" (#11230 ) Reverts openai/codex#11173	2026-02-09 20:22:38 +00:00
jif-oai	cfce286459	tools: remove get_memory tool and tests (#11198 ) Drop this memory tool as the design changed	2026-02-09 17:47:36 +00:00
jif-oai	284c03ceab	chore: enable sub agents (#11173 )	2026-02-09 11:25:37 +00:00
Michael Bolin	383b45279e	feat: include NetworkConfig through ExecParams (#11105 ) This PR adds the following field to `Config`: ```rust pub network: Option<NetworkProxy>, ``` Though for the moment, it will always be initialized as `None` (this will be addressed in a subsequent PR). This PR does the work to thread `network` through to `execute_exec_env()`, `process_exec_tool_call()`, and `UnifiedExecRuntime.run()` to ensure it is available whenever we spawn a process.	2026-02-09 03:32:17 +00:00
Matthew Zeng	9f1009540b	Upgrade rmcp to 0.14 (#10718 ) - [x] Upgrade rmcp to 0.14	2026-02-08 15:07:53 -08:00
Tom	409ec76fbc	Gate view_image tool by model input_modalities (#11051 ) - Plumb input modalities from model catalog through the openai model protocol. Default to text and image. - Conditionally add the view_image tool only if input modalities support image.	2026-02-08 10:45:26 -08:00
Eric Traut	b3de6c7f2b	Defer persistence of rollout file (#11028 ) - Defer rollout persistence for fresh threads (`InitialHistory::New`): keep rollout events in memory and only materialize rollout file + state DB row on first `EventMsg::UserMessage`. - Keep precomputed rollout path available before materialization. - Change `thread/start` to build thread response from live config snapshot and optional precomputed path. - Improve pre-materialization behavior in app-server/TUI: clearer invalid-request errors for file-backed ops and a friendlier `/fork` “not ready yet” UX. - Update tests to match deferred semantics across start/read/archive/unarchive/fork/resume/review flows. - Improved resilience of user_shell test, which should be unrelated to this change but must be affected by timing changes For Reviewers: * The primary change is in recorder.rs * Most of the other changes were to fix up broken assumptions in existing tests Testing: * Manually tested CLI * Exercised app server paths by manually running IDE Extension with rebuilt CLI binary * Only user-visible change is that `/fork` in TUI generates visible error if used prior to first turn	2026-02-07 23:05:03 -08:00
jif-oai	83c74125bc	Bootstrap shell commands via user shell snapshot (#10909 ) Summary - wrap `shell -lc` executions that use a snapshot with the session shell so the saved environment is sourced before delegating to the original shell - escape single quotes in the generated wrapper and add tests covering Bash/Zsh/sh session bootstrapping Testing - Not run (not requested)	2026-02-07 17:36:44 +01:00
jif-oai	62605fa471	Add resume_agent collab tool (#10903 ) Summary - add the new resume_agent collab tool path through core, protocol, and the app server API, including the resume events - update the schema/TypeScript definitions plus docs so resume_agent appears in generated artifacts and README - note that resumed agents rehydrate rollout history without overwriting their base instructions Testing - Not run (not requested)	2026-02-07 17:31:45 +01:00
Eric Traut	4521a6e852	Removed "exec_policy" feature flag (#10851 ) This is no longer needed because it's on by default	2026-02-06 08:59:47 -08:00
gt-oai	d74fa8edd1	Print warning when config does not meet requirements (#10792 ) Currently, if a config value is set that fails the requirements, we exit Codex. Now, instead of this, we print a warning and default to a requirements-permitting value.	2026-02-06 01:12:44 +00:00
iceweasel-oai	901d5b8fd6	add sandbox policy and sandbox name to codex.tool.call metrics (#10711 ) This will give visibility into the comparative success rate of the Windows sandbox implementations compared to other platforms.	2026-02-05 11:42:12 -08:00
jif-oai	41f3b1ba0b	feat: add memory tool (#10637 ) Add a tool for memory to retrieve a full memory based on the memory ID	2026-02-05 16:16:31 +00:00
Charley Cunningham	41b4962b0a	Sync collaboration mode naming across Default prompt, tools, and TUI (#10666 ) ## Summary - add shared `ModeKind` helpers for display names, TUI visibility, and `request_user_input` availability - derive TUI mode filtering/labels from shared `ModeKind` metadata instead of local hardcoded matches - derive `request_user_input` availability text and unavailable error mode names from shared mode metadata - replace hardcoded known mode names in the Default collaboration-mode template with `{{KNOWN_MODE_NAMES}}` and fill it from `TUI_VISIBLE_COLLABORATION_MODES` - add regression tests for mode metadata sync and placeholder replacement ## Notes - `cargo test -p codex-core` integration target (`tests/all`) still shows pre-existing env-specific failures in this environment due to missing `test_stdio_server` binary resolution; core unit tests are green. ## Codex author `codex resume 019c26ff-dfe7-7173-bc04-c9e1fff1e447`	2026-02-04 23:03:28 -08:00

776 Commits