codex

mirror of https://github.com/openai/codex.git synced 2026-06-01 19:02:59 +00:00

Author	SHA1	Message	Date
Eric Traut	d8b91f5fa1	Attribute automated PR Babysitter review replies (#18379 ) ## Summary PR Babysitter can reply directly to GitHub code review comments when feedback is non-actionable, already addressed, or not valid. Those replies should be visibly attributed so reviewers do not mistake an automated Codex response for a message from the human operator. This updates the skill instructions to require GitHub code review replies from the babysitter to start with `[codex]`. ## Changes - Adds the `[codex]` prefix requirement to the core PR Babysitter workflow. - Repeats the requirement in the review comment handling guidance where agents decide whether to reply to a review thread.	2026-04-17 12:27:48 -07:00
Ahmed Ibrahim	0f0ef094b6	Show default reasoning in /status (#18373 ) - Shows the model catalog default reasoning effort when no reasoning override is configured. - Adds /status coverage for the empty-config fallback.	2026-04-17 12:21:09 -07:00
github-actions[bot]	a801b999ff	Update models.json (#12640 ) Automated update of models.json. Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com> Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>	2026-04-17 12:16:07 -07:00
Ahmed Ibrahim	9d3a5cf05e	[3/6] Add pushed exec process events (#18020 ) ## Summary - Add a pushed `ExecProcessEvent` stream alongside retained `process/read` output. - Publish local and remote output, exit, close, and failure events. - Cover the event stream with shared local/remote exec process tests. ## Testing - `cargo check -p codex-exec-server` - `cargo check -p codex-rmcp-client` - Not run: `cargo test` per repo instruction; CI will cover. ## Stack ```text o #18027 [6/6] Fail exec client operations after disconnect │ o #18212 [5/6] Wire executor-backed MCP stdio │ o #18087 [4/6] Abstract MCP stdio server launching │ @ #18020 [3/6] Add pushed exec process events │ o #18086 [2/6] Support piped stdin in exec process API │ o #18085 [1/6] Add MCP server environment config │ o main ``` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-17 19:07:43 +00:00
David de Regt	eaf78e43f2	Add sorting/backwardsCursor to thread/list and new thread/turns/list api (#17305 ) To improve performance of UI loads from the app, add two main improvements: 1. The `thread/list` api now gets a `sortDirection` request field and a `backwardsCursor` to the response, which lets you paginate forwards and backwards from a window. This lets you fetch the first few items to display immediately while you paginate to fill in history, then can paginate "backwards" on future loads to catch up with any changes since the last UI load without a full reload of the entire data set. 2. Added a new `thread/turns/list` api which also has sortDirection and backwardsCursor for the same behavior as `thread/list`, allowing you the same small-fetch for immediate display followed by background fill-in and resync catchup.	2026-04-17 11:49:02 -07:00
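The forward/backward pagination described in #17305 can be sketched as follows. This is a minimal illustration under stated assumptions, not the app-server implementation: `list_threads`, the in-memory `THREADS` list, the `updated` counter, and the `next_cursor` field are all hypothetical. Only `sortDirection` and `backwardsCursor` are named in the commit message, and the real wire format may differ.

```python
from typing import Optional

# Hypothetical in-memory stand-in for the server's thread index, keyed by
# a monotonically increasing update counter. Not the real storage layer.
THREADS = [{"id": f"t{i}", "updated": i} for i in range(1, 8)]


def list_threads(limit: int, sort_direction: str = "desc",
                 cursor: Optional[int] = None) -> dict:
    """Return one page plus cursors for paging in either direction."""
    newest_first = sort_direction == "desc"
    ordered = sorted(THREADS, key=lambda t: t["updated"], reverse=newest_first)
    if cursor is not None:
        # Keep only items strictly past the cursor in the requested direction.
        if newest_first:
            ordered = [t for t in ordered if t["updated"] < cursor]
        else:
            ordered = [t for t in ordered if t["updated"] > cursor]
    page = ordered[:limit]
    return {
        "data": page,
        # Assumed name: continues in the same direction as this request.
        "next_cursor": page[-1]["updated"] if page else cursor,
        # Marks the first item of the page, so a later request in the
        # opposite direction returns only what sorts before it.
        "backwardsCursor": page[0]["updated"] if page else cursor,
    }
```

A client would fetch a small first page for immediate display, follow `next_cursor` to fill in history, and on a later load pass the saved `backwardsCursor` with the opposite sort direction to pick up only newer threads.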
Michael Bolin	29bc2ad2f4	ci: scope Bazel repository cache by job (#18366 ) ## Why The Bazel workflow has multiple jobs that run concurrently for the same target triple. In particular, the Windows `test`, `clippy`, and `verify-release-build` jobs could all miss and then attempt to save the same Bazel repository cache key: ```text bazel-cache-${target}-${lockhash} ``` Because `actions/cache` entries are immutable, only one job can reserve that key. The others can report failures such as: ```text Failed to save: Unable to reserve cache with key bazel-cache-x86_64-pc-windows-gnullvm-..., another job may be creating this cache. ``` Adding only the workflow name would not separate these jobs because they all run inside the same `Bazel` workflow. The key needs a job-level namespace as well. ## What Changed - Added a required `cache-scope` input to `.github/actions/prepare-bazel-ci/action.yml`. - Moved Bazel repository cache key construction into the shared action and exposed the computed key as `repository-cache-key`. - Exposed the exact restore result as `repository-cache-hit` so save steps can skip exact cache hits. - Updated `.github/workflows/bazel.yml` to pass `cache-scope: bazel-${{ github.job }}` for the `test`, `clippy`, and `verify-release-build` jobs. - The scoped restore key is now the only fallback. This avoids carrying a temporary restore path for the old unscoped cache namespace. ## Verification - Parsed `.github/actions/prepare-bazel-ci/action.yml` and `.github/workflows/bazel.yml` with Ruby's YAML parser. - `actionlint` is not installed in this workspace, so I could not run a GitHub Actions semantic lint locally.	2026-04-17 11:39:38 -07:00
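To illustrate why the job-level scope in #18366 removes the collision, here is a small sketch. The legacy format is the one quoted in the commit message; how the shared action combines `cache-scope` with the target and lock hash is an assumption made for this example, and the lock hash value is a placeholder.

```python
def legacy_cache_key(target: str, lockhash: str) -> str:
    # The unscoped key format quoted in the commit message.
    return f"bazel-cache-{target}-{lockhash}"


def scoped_cache_key(job: str, target: str, lockhash: str) -> str:
    # cache-scope is passed as "bazel-<job>"; the exact way the action
    # folds it into the key is assumed here, not taken from the source.
    cache_scope = f"bazel-{job}"
    return f"bazel-cache-{cache_scope}-{target}-{lockhash}"


jobs = ["test", "clippy", "verify-release-build"]
target, lockhash = "x86_64-pc-windows-gnullvm", "abc123"  # placeholder values

# All three jobs compute the same legacy key, so only one can reserve it.
legacy = {legacy_cache_key(target, lockhash) for _ in jobs}
# With a job-level scope, each job reserves its own key.
scoped = {scoped_cache_key(job, target, lockhash) for job in jobs}
```

Here `legacy` collapses to a single key while `scoped` holds one key per job, which is the property the change relies on.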
Ahmed Ibrahim	481ba014a7	Add core CODEOWNERS (#18362 ) Adds @openai/codex-core-agent-team as the owner for codex-rs/core/ and protects .github/CODEOWNERS with the same owner.	2026-04-17 11:29:46 -07:00
Michael Bolin	2c2ed51876	ci: make Windows Bazel clippy catch core test imports (#18350 ) ## Why Unused imports in `core/tests/suite/unified_exec.rs` in the Windows build were not caught by Bazel CI on https://github.com/openai/codex/pull/18096. I spot-checked https://github.com/openai/codex/actions/workflows/rust-ci-full.yml?query=branch%3Amain and noticed that builds were consistently red. This revealed that our Cargo builds _were_ properly catching these issues, identifying a Windows-specific coverage hole in the Bazel clippy job. The Windows Bazel clippy job uses `--skip_incompatible_explicit_targets` so it can lint a broad target set without failing immediately on targets that are genuinely incompatible with Windows. However, with the default Windows host platform, `rust_test` targets such as `//codex-rs/core:core-all-test` could be skipped before the clippy aspect reached their integration-test modules. As a result, the imports in `core/tests/suite/unified_exec.rs` were not being linted by the Windows Bazel clippy job at all. 
The clippy diagnostic that Windows Bazel should have surfaced was: ```text error: unused import: `codex_config::Constrained` --> core\tests\suite\unified_exec.rs:8:5 \| 8 \| use codex_config::Constrained; \| ^^^^^^^^^^^^^^^^^^^^^^^^^ \| = note: `-D unused-imports` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(unused_imports)]` error: unused import: `codex_protocol::permissions::FileSystemAccessMode` --> core\tests\suite\unified_exec.rs:11:5 \| 11 \| use codex_protocol::permissions::FileSystemAccessMode; \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ error: unused import: `codex_protocol::permissions::FileSystemPath` --> core\tests\suite\unified_exec.rs:12:5 \| 12 \| use codex_protocol::permissions::FileSystemPath; \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ error: unused import: `codex_protocol::permissions::FileSystemSandboxEntry` --> core\tests\suite\unified_exec.rs:13:5 \| 13 \| use codex_protocol::permissions::FileSystemSandboxEntry; \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ error: unused import: `codex_protocol::permissions::FileSystemSandboxPolicy` --> core\tests\suite\unified_exec.rs:14:5 \| 14 \| use codex_protocol::permissions::FileSystemSandboxPolicy; \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``` ## What changed - Run the Windows Bazel clippy job with the MSVC host platform via `--windows-msvc-host-platform`, matching the Windows Bazel test job. This keeps `--skip_incompatible_explicit_targets` while ensuring Windows `rust_test` targets such as `//codex-rs/core:core-all-test` are still linted. - Remove the unused imports from `core/tests/suite/unified_exec.rs`. - Add `--print-failed-action-summary` to `.github/scripts/run-bazel-ci.sh` so Bazel action failures can be summarized after the build exits. ## Failure reporting Once the coverage issue was fixed, an intentionally reintroduced unused import made the Windows Bazel clippy job fail as expected. 
That exposed a separate usability problem: because the job keeps `--keep_going`, the top-level Bazel output could still end with: ```text ERROR: Build did NOT complete successfully FAILED: ``` without the underlying rustc/clippy diagnostic being visible in the obvious part of the GitHub Actions log. To keep `--keep_going` while making failures actionable, the wrapper now scans the captured Bazel console output for failed actions and prints the matching rustc/clippy diagnostic block. When a diagnostic block is found, it is emitted both as a GitHub `::error` annotation and as plain expanded log output, rather than being hidden in a collapsed group. ## Verification To validate the CI path, I intentionally introduced an unused import in `core/tests/suite/unified_exec.rs`. The Windows Bazel clippy job failed as expected, confirming that the integration-test module is now covered by Bazel clippy. The same failure also verified that the wrapper surfaces the matching clippy diagnostics directly in the Actions output.	2026-04-17 18:19:58 +00:00
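The failure-reporting step in #18350 can be sketched roughly as below. The real wrapper is `.github/scripts/run-bazel-ci.sh`; this Python version is only an illustration, and the rule used to delimit a diagnostic block (an `error:` line up to the next blank line) is an assumption. The `%`, CR, and LF escaping follows the documented GitHub Actions workflow-command convention.

```python
def extract_diagnostics(console_output: str) -> list[str]:
    """Collect blocks that start with an `error:` line and run to a blank line."""
    blocks, current = [], None
    for line in console_output.splitlines():
        if line.startswith("error:"):
            if current:
                blocks.append("\n".join(current))
            current = [line]
        elif current is not None:
            if line.strip() == "":
                blocks.append("\n".join(current))
                current = None
            else:
                current.append(line)
    if current:
        blocks.append("\n".join(current))
    return blocks


def annotate(block: str) -> str:
    # Workflow commands are single-line, so newlines must be escaped.
    escaped = block.replace("%", "%25").replace("\r", "%0D").replace("\n", "%0A")
    return f"::error::{escaped}"
```

A wrapper along these lines would print both `annotate(block)` and the plain block for each match, so the diagnostic appears as an annotation and in the expanded log.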
sayan-oai	6991be7ead	enable tool search over dynamic tools (#18263 ) ## Summary - Normalize deferred MCP and dynamic tools into `ToolSearchEntry` values before constructing `ToolSearchHandler`. - Move the tool-search entry adapter out of `tools/handlers` and into `tools/tool_search_entry.rs` so the handlers directory stays focused on handlers. - Keep `ToolSearchHandler` operating over one generic entry list for BM25 search, namespace grouping, and per-bucket default limits. ## Why Follow-up cleanup for #17849. The dynamic tool-search support made the handler juggle source-specific MCP and dynamic tool lists, index arithmetic, output conversion, and namespace emission. This keeps source adaptation outside the handler so the search loop itself is smaller and source-agnostic. ## Validation - `just fmt` - `cargo test -p codex-core tools::handlers::tool_search::tests` - `git diff --check` - `cargo test -p codex-core` currently fails in unrelated `plugins::manager::tests::list_marketplaces_ignores_installed_roots_missing_from_config`; rerunning that single test fails the same way at `core/src/plugins/manager_tests.rs:1692`. --------- Co-authored-by: pash <pash@openai.com>	2026-04-18 02:07:59 +08:00
Tom	fad3d0f1d0	codex: route thread/read persistence through thread store (#18352 ) Summary - replace the thread/read persisted-load helper with ThreadStore::read_thread - move SQLite/rollout summary, name, fork metadata, and history loading for persisted reads into LocalThreadStore - leave getConversationSummary unchanged for a later PR Context - Replaces closed stacked PR #18232 after PR #18231 merged and its base branch was deleted.	2026-04-17 10:31:30 -07:00
Felipe Coury	d3692b14c9	feat(tui): add clear-context plan implementation (#17499 ) ## TL;DR - Adds a second Plan Mode handoff: implement the approved plan after clearing context. - Keeps the existing same-thread `Yes, implement this plan` action unchanged. - Reuses the `/clear` thread-start path and submits the approved plan as the fresh thread's first prompt. - Covers the new popup option, event plumbing, initial-message behavior, and disabled states in TUI tests. ## Problem Plan Mode already asks whether to implement an approved plan, but the only affirmative path continues in the same thread. That is useful when the planning conversation itself is still valuable, but it does not support the workflow where exploratory planning context is discarded and implementation starts from the final approved plan as the only model-visible handoff. <img width="1253" height="869" alt="image" src="https://github.com/user-attachments/assets/90023d75-c330-4919-bed8-518671c3474b" /> ## Mental model There are now two implementation choices after a proposed plan. The existing choice, `Yes, implement this plan`, is unchanged: it switches to Default mode and submits `Implement the plan.` in the current thread. The new choice, `Yes, clear context and implement`, treats the proposed plan as a handoff artifact. It clears the UI/session context through the same thread-start source used by `/clear`, then submits an initial prompt containing the approved plan after the fresh thread is configured. The important distinction is that the new path is not compaction. The model receives a deliberate implementation prompt built from the approved plan markdown, not a summary of the previous planning transcript. Both implementation choices require the Default collaboration preset to be available, so the popup does not offer a coding handoff when the fresh thread would fall back to another mode. 
## Non-goals This change does not alter `/clear`, `/compact`, or the existing same-context Plan Mode implementation option. It does not add protocol surface area or app-server schema changes. It also does not carry the previous transcript path or a generated planning summary into the new model context. ## Tradeoffs The fresh-context option relies on the approved plan being sufficiently complete. That matches the Plan Mode contract, but it means vague plans will produce weaker implementation starts than a compacted transcript would. The upside is that rejected ideas, exploratory dead ends, and planning corrections do not leak into the implementation turn. The current implementation stores the latest proposed plan in `ChatWidget` rather than deriving it from history cells at selection time. This keeps the popup action simple and deterministic, but it makes the cache lifecycle important: it must be reset when a new task starts so an old plan cannot be submitted later. ## Architecture The TUI stores the most recent completed proposed-plan markdown when a plan item completes. The Plan Mode approval popup uses that cache to enable the fresh-context option and to build a first-turn prompt that instructs the model to implement the approved plan in a fresh context. Selecting the new option emits a TUI-internal `ClearUiAndSubmitUserMessage` event. `App` handles that event by reusing the existing clear flow: clear terminal state, reset app UI state, start a new app-server thread with `ThreadStartSource::Clear`, and attach a replacement `ChatWidget` with an initial user message. The existing initial-message suppression in `enqueue_primary_thread_session` ensures the prompt is submitted only after the new session is configured and any startup replay is rendered. ## Observability The previous thread remains resumable through the existing clear-session summary hint. 
There is no new telemetry or protocol event for this path, so debugging should start at the TUI event boundary: confirm the popup emitted `ClearUiAndSubmitUserMessage`, confirm the app-server thread start used `ThreadStartSource::Clear`, then confirm the fresh widget submitted the initial user message after `SessionConfigured`. ## Tests The Plan Mode popup snapshots cover the new option and preserve the original option as the first/default action. Unit coverage verifies the original same-context option still emits `SubmitUserMessageWithMode`, the new option emits `ClearUiAndSubmitUserMessage` with the approved plan embedded verbatim, and the clear-context option is disabled when Default mode is unavailable or no approved plan exists. The broader `codex-tui` test package passes with the updated fresh-thread initial-message plumbing.	2026-04-17 14:30:09 -03:00
colby-oai	ea84537369	Make app tool hint defaults pessimistic for app policies (#17232 ) ## Summary - default missing app tool destructive/open-world hints to true for app policies - add regression tests for missing MCP annotations under restrictive app config	2026-04-17 13:27:49 -04:00
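A minimal sketch of the pessimistic default in #17232, assuming tool annotations arrive as an optional dictionary. The function name and the `destructive` / `open_world` keys are illustrative; the source only says that missing destructive and open-world hints default to true for app policies.

```python
from typing import Optional


def effective_hints(annotations: Optional[dict]) -> dict:
    """Treat a missing hint as the riskier value (True) for app policies."""
    annotations = annotations or {}
    return {
        # Key names are illustrative, not the actual field names.
        "destructive": annotations.get("destructive", True),
        "open_world": annotations.get("open_world", True),
    }
```

With this shape, a tool that ships no annotations is handled as both destructive and open-world, while an explicit `False` from the tool is still honored.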
jif-oai	cfc23eee3d	feat: config aliases (#18140 ) Rename `no_memories_if_mcp_or_web_search` → `disable_on_external_context` with backward compatibility. While doing so, we add a key alias system to our layer-merging system. The case we want to avoid is a company-managed config using the old name while the user has the new name in their local config (which would make deserialization fail).	2026-04-17 18:26:09 +01:00
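One way to picture the alias handling in #18140 is to canonicalize keys in each layer before merging. This is a sketch under assumptions: the flat dictionaries, the `ALIASES` table, and the lowest-precedence-first ordering are all hypothetical simplifications of the real layer-merging system.

```python
# Old key name -> new key name, as renamed in the commit.
ALIASES = {"no_memories_if_mcp_or_web_search": "disable_on_external_context"}


def normalize(layer: dict) -> dict:
    """Rewrite aliased keys to their canonical names within one layer."""
    out = {}
    for key, value in layer.items():
        canonical = ALIASES.get(key, key)
        # If a layer sets both names, the canonical spelling wins.
        if canonical in out and key != canonical:
            continue
        out[canonical] = value
    return out


def merge_layers(layers: list) -> dict:
    """Merge layers given lowest precedence first (an assumed ordering)."""
    merged = {}
    for layer in layers:
        merged.update(normalize(layer))
    return merged
```

Because both layers are normalized first, a managed layer that still uses the old name and a user layer that uses the new one collapse into a single key, and the higher-precedence value wins instead of producing two conflicting fields.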
Won Park	af7b8d551c	Guardian -> Auto-Review (#18021 ) This PR is a user-facing change for our rebranding of guardian to auto-review.	2026-04-17 09:56:24 -07:00
Michael Bolin	d0eff70383	Fix config-loader tests after filesystem abstraction race (#18351 ) ## Why `origin/main` picked up two changes that crossed in flight: - #18209 refactored config loading to read through `ExecutorFileSystem`, changing `load_requirements_toml` to take a filesystem handle and an `AbsolutePathBuf`. - #17740 added managed `deny_read` requirements tests that still called `load_requirements_toml` with the previous two-argument signature. Once both landed, `just clippy` failed because the new tests no longer matched the current helper API. ## What - Updates the two managed `deny_read` requirements tests to convert the fixture path to `AbsolutePathBuf` before loading. - Passes `LOCAL_FS.as_ref()` into `load_requirements_toml` so these tests follow the filesystem abstraction introduced by #18209. ## Verification - `just clippy` - `cargo test -p codex-core load_requirements_toml_resolves_deny_read` - `cargo test -p codex-core --test all unified_exec_enforces_glob_deny_read_policy`	2026-04-17 09:20:39 -07:00
pakrym-oai	71e4c6fa17	Move codex module under session (#18249 ) ## Summary - rename the core codex module root to session/mod.rs without using #[path] - move the codex module directory and tests under core/src/session - remove session/mod.rs reexports so call sites use explicit child module paths ## Testing - cargo test -p codex-core --lib - cargo check -p codex-core --tests - just fmt - just fix -p codex-core - git diff --check	2026-04-17 16:18:53 +00:00
viyatb-oai	dae0608c06	feat(config): support managed deny-read requirements (#17740 ) ## Summary - adds managed requirements support for deny-read filesystem entries - constrains config layers so managed deny-read requirements cannot be widened by user-controlled config - surfaces managed deny-read requirements through debug/config plumbing This PR lets managed requirements inject deny-read filesystem constraints into the effective filesystem sandbox policy. User-controlled config can still choose the surrounding permission profile, but it cannot remove or weaken the managed deny-read entries. ## Managed deny-read shape A managed requirements file can declare exact paths and glob patterns under `[permissions.filesystem]`: ```toml # /etc/codex/requirements.toml [permissions.filesystem] deny_read = [ "/Users/alice/.gitconfig", "/Users/alice/.ssh", "./managed-private/*/.env", ] ``` Those entries are compiled into the effective filesystem policy as `access = none` rules, equivalent in shape to filesystem permission entries like: ```toml [permissions.workspace.filesystem] "/Users/alice/.gitconfig" = "none" "/Users/alice/.ssh" = "none" "/absolute/path/to/managed-private/*/.env" = "none" ``` The important difference is that the managed entries come from requirements, so lower-precedence user config cannot remove them or make those paths readable again. Relative managed `deny_read` entries are resolved relative to the directory containing the managed requirements file. Glob entries keep their glob suffix after the non-glob prefix is normalized. ## Runtime behavior - Managed `deny_read` entries are appended to the effective `FileSystemSandboxPolicy` after the selected permission profile is resolved. - Exact paths become `FileSystemPath::Path { access: None }`; glob patterns become `FileSystemPath::GlobPattern { access: None }`. 
- When managed deny-read entries are present, `sandbox_mode` is constrained to `read-only` or `workspace-write`; `danger-full-access` and `external-sandbox` cannot silently bypass the managed read-deny policy. - On Windows, the managed deny-read policy is enforced for direct file tools, but shell subprocess reads are not sandboxed yet, so startup emits a warning for that platform. - `/debug-config` shows the effective managed requirement as `permissions.filesystem.deny_read` with its source. ## Stack 1. #15979 - glob deny-read policy/config/direct-tool support 2. #18096 - macOS and Linux sandbox enforcement 3. This PR - managed deny-read requirements --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-17 08:40:09 -07:00
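The path-resolution rules for managed `deny_read` entries in #17740 can be sketched as follows. This is an illustration only: `resolve_deny_read` and `sandbox_mode_allowed` are hypothetical helpers, POSIX paths are assumed, and the set of glob metacharacters is a guess rather than the project's actual definition.

```python
import posixpath

GLOB_CHARS = set("*?[")  # assumed metacharacters
ALLOWED_WITH_DENY_READ = {"read-only", "workspace-write"}


def resolve_deny_read(entry: str, requirements_file: str) -> tuple[str, str]:
    """Resolve one entry relative to the directory of the requirements file."""
    base = posixpath.dirname(requirements_file)
    parts = entry.split("/")
    # Split at the first path component that contains a glob metacharacter.
    split = next((i for i, p in enumerate(parts) if GLOB_CHARS & set(p)),
                 len(parts))
    prefix, suffix = "/".join(parts[:split]), "/".join(parts[split:])
    # Only the non-glob prefix is normalized; the glob suffix is kept as-is.
    resolved = posixpath.normpath(posixpath.join(base, prefix))
    if suffix:
        return ("glob", f"{resolved}/{suffix}")
    return ("path", resolved)


def sandbox_mode_allowed(mode: str, deny_read: list) -> bool:
    # With managed deny-read entries present, only these modes are permitted.
    return not deny_read or mode in ALLOWED_WITH_DENY_READ
```

For a requirements file at `/etc/codex/requirements.toml`, the entry `./managed-private/*/.env` resolves to a glob rooted at `/etc/codex/managed-private`, while an absolute entry such as `/Users/alice/.ssh` is returned unchanged as an exact path.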
Eric Traut	2dd6734dd3	fix(tui): use BEL for terminal title updates (#18261 ) ## Summary Fixes #18160. iTerm2 can append the current foreground process to tab titles, and Codex's terminal-title updates were causing that decoration to appear as `(codex")` with a stray trailing quote. Codex was writing OSC 0 title sequences terminated with ST (`ESC \`). Some terminal title integrations appear to accept that title update but still expose the ST terminator in their own process/title decoration. ## Changes - Update `codex-rs/tui/src/terminal_title.rs` to terminate OSC 0 title updates with BEL instead of ST. - Update the focused terminal-title encoding test to assert the BEL-terminated sequence. ## Compatibility This should be low risk: the title payload and update timing are unchanged, and BEL is the form already emitted by `crossterm::terminal::SetTitle` in the crossterm version used by this repository. BEL is also the widely supported xterm-family title terminator used by common terminals and multiplexers. The main theoretical risk would be a very old or unusual terminal that accepted only ST and not BEL for OSC title termination, but that is unlikely compared with the observed iTerm2 issue. ## Verification - `cargo test -p codex-tui terminal_title` - `cargo test -p codex-tui`	2026-04-17 08:39:37 -07:00
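For reference, the two terminators discussed in #18261 differ only in the trailing bytes of the OSC 0 sequence. The helper below is a hypothetical illustration, not the code in `codex-rs/tui/src/terminal_title.rs`, and it does not sanitize control characters in the title.

```python
ESC = "\x1b"
BEL = "\x07"
ST = ESC + "\\"  # String Terminator: ESC followed by a backslash


def osc_title(title: str, terminator: str = BEL) -> str:
    # OSC 0 sets both the icon name and the window title.
    return f"{ESC}]0;{title}{terminator}"
```

`osc_title("codex")` yields the BEL-terminated form the commit switches to, and `osc_title("codex", ST)` yields the previous ST-terminated form.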
Eric Traut	c3ecb557d3	Support Ctrl+P/Ctrl+N in resume picker (#18267 ) Fixes #18179. ## Why The fullscreen `/resume` picker accepted Up/Down navigation but ignored Ctrl+P/Ctrl+N, which made it inconsistent with other TUI selection flows such as `ListSelectionView`-backed pickers and composer navigation. ## What Changed Updated `codex-rs/tui/src/resume_picker.rs` so the resume picker treats Ctrl+P/Ctrl+N as aliases for Up/Down, including the raw `^P`/`^N` control-character events some terminals emit without a CONTROL modifier.	2026-04-17 08:38:47 -07:00
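The key aliasing in #18267 amounts to mapping three spellings of each direction onto one action. A rough sketch, assuming a simplified key representation (a name or single character plus a `ctrl` flag) rather than the TUI's actual key event type:

```python
from typing import Optional

CTRL_P_RAW = "\x10"  # ^P as a raw control character
CTRL_N_RAW = "\x0e"  # ^N as a raw control character


def navigation_for(key: str, ctrl: bool = False) -> Optional[str]:
    """Map a key press to "up", "down", or None when it is not navigation."""
    if key in ("up", CTRL_P_RAW) or (ctrl and key == "p"):
        return "up"
    if key in ("down", CTRL_N_RAW) or (ctrl and key == "n"):
        return "down"
    return None
```

The raw-character branch covers terminals that deliver `^P`/`^N` without a CONTROL modifier, which is the case the commit message calls out.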
jif-oai	3421a107e0	nit: phase 2 ephemeral (#18338 )	2026-04-17 16:10:58 +01:00
Abhinav	8494e5bd7b	Add PermissionRequest hooks support (#17563 ) ## Why We need `PermissionRequest` hook support! Also addresses: - https://github.com/openai/codex/issues/16301 - run a script on Hook to do things like play a sound to draw attention but actually no-op so the user can still approve - can omit the `decision` object from output or just have the script exit 0 and print nothing - https://github.com/openai/codex/issues/15311 - let the script approve/deny on its own - external UI that will run on Hook and relay the decision back to codex ## Reviewer Note There's a lot of plumbing for the new hook; the key files to review are: - New hook added in `codex-rs/hooks/src/events/permission_request.rs` - Wiring for network approvals `codex-rs/core/src/tools/network_approval.rs` - Wiring for tool orchestrator `codex-rs/core/src/tools/orchestrator.rs` - Wiring for execve `codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs` ## What - Wires shell, unified exec, and network approval prompts into the `PermissionRequest` hook flow. - Lets hooks allow or deny approval prompts; quiet or invalid hooks fall back to the normal approval path. - Uses `tool_input.description` for user-facing context when it helps: - shell / `exec_command`: the request justification, when present - network approvals: `network-access <domain>` - Uses `tool_name: Bash` for shell, unified exec, and network approval permission-request hooks. - For network approvals, passes the originating command in `tool_input.command` when there is a single owning call; otherwise falls back to the synthetic `network-access ...` command. 
<details> <summary>Example `PermissionRequest` hook input for a shell approval</summary> ```json { "session_id": "<session-id>", "turn_id": "<turn-id>", "transcript_path": "/path/to/transcript.jsonl", "cwd": "/path/to/cwd", "hook_event_name": "PermissionRequest", "model": "gpt-5", "permission_mode": "default", "tool_name": "Bash", "tool_input": { "command": "rm -f /tmp/example" } } ``` </details> <details> <summary>Example `PermissionRequest` hook input for an escalated `exec_command` request</summary> ```json { "session_id": "<session-id>", "turn_id": "<turn-id>", "transcript_path": "/path/to/transcript.jsonl", "cwd": "/path/to/cwd", "hook_event_name": "PermissionRequest", "model": "gpt-5", "permission_mode": "default", "tool_name": "Bash", "tool_input": { "command": "cp /tmp/source.json /Users/alice/export/source.json", "description": "Need to copy a generated file outside the workspace" } } ``` </details> <details> <summary>Example `PermissionRequest` hook input for a network approval</summary> ```json { "session_id": "<session-id>", "turn_id": "<turn-id>", "transcript_path": "/path/to/transcript.jsonl", "cwd": "/path/to/cwd", "hook_event_name": "PermissionRequest", "model": "gpt-5", "permission_mode": "default", "tool_name": "Bash", "tool_input": { "command": "curl http://codex-network-test.invalid", "description": "network-access http://codex-network-test.invalid" } } ``` </details> ## Follow-ups - Implement the `PermissionRequest` semantics for `updatedInput`, `updatedPermissions`, `interrupt`, and suggestions / `permission_suggestions` - Add `PermissionRequest` support for the `request_permissions` tool path --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-17 14:45:47 +00:00
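As an illustration of the notify-only use case from #17563, the sketch below parses a `PermissionRequest` hook input and deliberately produces no output, which per the description leaves the decision to the normal approval path. `handle_permission_request`, `notify`, and `NOTIFICATIONS` are hypothetical names; the shape of an approve/deny `decision` object is not shown because the source does not spell it out.

```python
import json

NOTIFICATIONS: list = []


def notify(message: str) -> None:
    # Hypothetical stand-in for a side effect such as playing a sound.
    NOTIFICATIONS.append(message)


def handle_permission_request(raw_input: str) -> str:
    """Return the hook's stdout; an empty string defers to the user."""
    event = json.loads(raw_input)
    if event.get("hook_event_name") != "PermissionRequest":
        return ""
    command = event.get("tool_input", {}).get("command", "")
    notify(f"approval requested for: {command}")
    # Printing nothing (and exiting 0) keeps the normal approval prompt.
    return ""
```

A real hook script would read the JSON from standard input, call something like this, write the returned string to standard output, and exit 0.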
sayan-oai	d0047de7cb	add token-based tool deferral behind feature flag (#18097 ) Add a new `tool_search_always_defer_mcp_tools` feature flag that always defers all MCP tools, rather than deferring only once there are more than 100 deferrable tools. Add new tests, and move the `mcp_exposure` tests into a dedicated file rather than polluting `codex_tests`.	2026-04-17 18:34:06 +08:00
alexsong-oai	20b4b80426	Sync local plugin imports, async remote imports, refresh caches after… (#18246 ) … import ## Why `externalAgentConfig/import` used to spawn plugin imports in the background and return immediately. That meant local marketplace imports could still be in flight when the caller refreshed plugin state, so newly imported plugins would not show up right away. This change makes local marketplace imports complete before the RPC returns, while keeping remote marketplace imports asynchronous so we do not block on remote fetches. ## What changed - split plugin migration details into local and remote marketplace imports based on the external config source - import local marketplaces synchronously during `externalAgentConfig/import` - return pending remote plugin imports to the app-server so it can finish them in the background - clear the plugin and skills caches before responding to plugin imports, and again after background remote imports complete, so the next `plugin/list` reloads fresh state - keep marketplace source parsing encapsulated behind `is_local_marketplace_source(...)` instead of re-exporting the internal enum - add core and app-server coverage for the synchronous local import path and the pending remote import path ## Verification - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-core` (currently fails an existing unrelated test: `config_loader::tests::cli_override_can_update_project_local_mcp_server_when_project_is_trusted`) - `cargo test` (currently fails existing `codex-app-server` integration tests in MCP/skills/thread-start areas, plus the unrelated `codex-core` failure above)	2026-04-17 09:34:55 +00:00
jif-oai	64177aaa22	fix: reduce writable root (#17947 )	2026-04-17 09:33:12 +01:00
Eric Traut	2e038e6d38	Fix Windows exec policy test flake (#18304 ) ## Summary This fixes a Windows-only failure in the exec policy multi-segment shell test. The test was meant to verify that a compound shell command only bypasses sandboxing when every parsed segment has an explicit exec policy allow rule. On Windows, the read-only sandbox setup is intentionally treated as lacking sandbox protection, so the old fixture could take the approval path before reaching the intended bypass assertion. The test now uses the workspace-write sandbox policy, keeping the focus on the per-segment bypass rule while preserving the expected bypass_sandbox false result when only cat is explicitly allowed.	2026-04-17 00:43:49 -07:00
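The rule the test in #18304 exercises (a compound command bypasses the sandbox only when every segment is explicitly allowed) can be sketched as below. This is a deliberately naive illustration: the regex-based splitting and the program-name allow set are assumptions, and the real exec policy parses shell commands properly rather than splitting on operators.

```python
import re
import shlex


def bypass_sandbox(command: str, allowed_programs: set[str]) -> bool:
    """True only if every segment's program has an explicit allow rule."""
    # Naive split on &&, ||, ; and |. Quoting and subshells are ignored.
    segments = [s.strip() for s in re.split(r"&&|\|\||;|\|", command)
                if s.strip()]
    return bool(segments) and all(
        shlex.split(segment)[0] in allowed_programs for segment in segments
    )
```

With only `cat` allowed, `cat a.txt && rm b.txt` does not bypass the sandbox because the `rm` segment has no allow rule, which matches the `bypass_sandbox` false expectation described above.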
sashank-oai	22f7ef1cb7	[codex] Revoke ChatGPT tokens on logout (#17825 ) ## Summary This changes Codex logout so managed ChatGPT auth is revoked against AuthAPI before local auth state is removed. CLI logout, TUI `/logout`, and the app-server account logout path now use the token-revoking logout flow instead of only deleting `auth.json` / credential store state. ## Root Cause Logout previously cleared only local auth storage. That removed Codex's local credentials but did not ask the backend to invalidate the refresh/access token state associated with a managed ChatGPT login. ## Behavior For managed ChatGPT auth, logout sends the stored refresh token to `https://auth.openai.com/oauth/revoke` with `token_type_hint: refresh_token` and the Codex OAuth client id, then deletes all local auth stores after revocation succeeds. If only an access token is available, it falls back to revoking that access token. API key auth and externally supplied `chatgptAuthTokens` are still only cleared locally because Codex does not own a refresh token for those modes. Revocation failures are fail-closed: if Codex cannot load stored auth or the backend revoke call fails, logout returns an error and leaves local auth in place so the user can retry instead of silently clearing local state while backend tokens remain valid. ## Validation ran local version of `codex-cli` with staging overrides/harness for auth ran `codex login` then `codex logout`: saw auth.json clear and backend revocation endpoints were called ``` POST /oauth/revoke status: 200 revoking access token should clear auth session clearing auth session due to token revocation successfully revoked session and access token CANONICAL-API-LINE Response: status='200' method='POST' path='/oauth/revoke ``` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-16 22:51:21 -07:00
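The fail-closed ordering described in #17825 (revoke first, clear local state only on success) can be sketched like this. Everything here is hypothetical scaffolding: the `auth` dictionary shape, the `mode` values, and the injected `revoke` and `clear_local` callables stand in for the real auth manager and HTTP call. The `access_token` hint value and the error raised when a managed login has no token at all are also assumptions.

```python
from typing import Callable, Optional


class LogoutError(Exception):
    pass


def logout(auth: Optional[dict],
           revoke: Callable[[str, str], bool],
           clear_local: Callable[[], None]) -> None:
    if auth is None:
        raise LogoutError("could not load stored auth")
    if auth.get("mode") == "managed_chatgpt":
        # Prefer the refresh token; fall back to the access token.
        if auth.get("refresh_token"):
            token, hint = auth["refresh_token"], "refresh_token"
        else:
            token, hint = auth.get("access_token"), "access_token"
        if not token or not revoke(token, hint):
            # Fail closed: local auth stays in place so the user can retry.
            raise LogoutError("revocation failed; local auth left in place")
    # API key and externally supplied tokens are only cleared locally.
    clear_local()
```

The key design point is that `clear_local` is unreachable when revocation fails, so a failed backend call never leaves valid tokens behind with no local record of them.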
Dylan Hurd	fe7c959e90	fix(exec-policy) rules parsing (#18126 ) ## Summary See scenarios - rules must always be enforced on all commands in the string ## Testing - [x] Added ExecApprovalRequirementScenario tests	2026-04-16 21:18:39 -07:00
Tom	9d6f4f2e2e	codex: split thread/read view loading (#18231 ) Summary - refactor thread/read into explicit persisted-load, live-load, and merge steps - preserve existing SQLite/filesystem/live-thread behavior exactly - keep ThreadStore migration out of this PR so the next PR is easier to review Validation - this one's a pure reorganization that relies on existing test coverage	2026-04-16 21:06:03 -07:00
Leo Shimonaka	dd00efe781	Move Computer Use tool suggestion to core (#18219 ) ## Summary Move the Computer Use tool suggestion into core Codex plugin discovery. Also search `openai-bundled` when listing suggested plugins, with test coverage for overlap between baked-in suggestions and `tool_suggest.discoverables`. ## Test plan Tested locally: - `cargo test -p codex-core list_tool_suggest_discoverable_plugins`	2026-04-16 19:55:23 -07:00
xl-openai	37161bc76e	feat: Handle alternate plugin manifest paths (#18182 ) Load plugin manifests through a shared discoverable-path helper so manifest reads, installs, and skill names all see the same alternate manifest location.	2026-04-16 19:43:19 -07:00
Celia Chen	a803790a10	feat: add opt-in provider runtime abstraction (#17713 ) ## Summary - Add `codex-model-provider` as the runtime home for model-provider behavior that does not belong in `codex-core`, `codex-login`, or `codex-api`. - The new crate wraps configured `ModelProviderInfo` in a `ModelProvider` trait object that can resolve the API provider config, provider-scoped auth manager, and request auth provider for each call. - This centralizes provider auth behavior in one place today, and gives us an extension point for future provider-specific auth, model listing, request setup, and related runtime behavior. ## Tests Ran tests manually to make sure that provider auth under different configs still works as expected. --------- Co-authored-by: pakrym-oai <pakrym@openai.com>	2026-04-17 02:27:45 +00:00
pakrym-oai	91e8eebd03	Split codex session modules (#18244 ) ## Summary - split `codex.rs` session definitions and constructor into `codex/session.rs` - move MCP session methods into `codex/mcp.rs` - move turn-context types/helpers into `codex/turn_context.rs` - move review thread spawning into `codex/review.rs` ## Testing - `cargo check -p codex-core` - `just fmt` - `just fix -p codex-core` - `cargo test -p codex-core` (unit tests passed; integration run failed locally with 45 failures, including missing helper binaries such as `test_stdio_server`/`codex` plus approval/web-search/MCP-related cases)	2026-04-16 18:15:19 -07:00
Akshay Nathan	7995c66032	Stream apply_patch changes (#17862 ) Adds new events for streaming apply_patch changes from the Responses API, so that clients can show progress during file writes. Caveat: this does not work with apply_patch in function-call mode, since that would require adding streaming JSON parsing.	2026-04-16 18:12:19 -07:00
pakrym-oai	9effa0509f	Refactor config loading to use filesystem abstraction (#18209 ) Initial pass propagating FileSystem through config loading.	2026-04-17 00:51:21 +00:00
viyatb-oai	2967900d81	fix: deprecate use_legacy_landlock feature flag (#17971 ) ## Summary - mark `features.use_legacy_landlock` as a deprecated feature flag - emit a startup deprecation notice when the flag is configured - add feature- and core-level regression coverage for the notice <img width="1288" height="93" alt="Screenshot 2026-04-15 at 11 14 00 PM" src="https://github.com/user-attachments/assets/fffc628b-614c-4521-9374-64e50a269252" /> --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-16 17:37:15 -07:00
viyatb-oai	0d0abe839a	feat(sandbox): add glob deny-read platform enforcement (#18096 ) ## Summary - adds macOS Seatbelt deny rules for unreadable glob patterns - expands unreadable glob matches on Linux and masks them in bwrap, including canonical symlink targets - keeps Linux glob expansion robust when `rg` is unavailable in minimal or Bazel test environments - adds sandbox integration coverage that runs `shell` and `exec_command` with a `*/.env = none` policy and verifies the secret contents do not reach the model ## Linux glob expansion ```text Prefer: rg --files --hidden --no-ignore --glob <pattern> -- <search-root> Fallback: internal globset walker when rg is not installed Failure: any other rg failure aborts sandbox construction ``` ``` [permissions.workspace.filesystem] glob_scan_max_depth = 2 [permissions.workspace.filesystem.":project_roots"] "*/.env" = "none" ``` This keeps the common path fast without making sandbox construction depend on an ambient `rg` binary. If `rg` is present but fails for another reason, the sandbox setup fails closed instead of silently omitting deny-read masks. ## Platform support - macOS: subprocess sandbox enforcement is handled by Seatbelt regex deny rules - Linux: subprocess sandbox enforcement is handled by expanding existing glob matches and masking them in bwrap - Windows: policy/config/direct-tool glob support is already on `main` from #15979; Windows subprocess sandbox paths continue to fail closed when unreadable split filesystem carveouts require runtime enforcement, rather than silently running unsandboxed ## Stack 1. #15979 - merged: cross-platform glob deny-read policy/config/direct-tool support for macOS, Linux, and Windows 2. This PR - macOS/Linux subprocess sandbox enforcement plus Windows fail-closed clarification 3. #17740 - managed deny-read requirements ## Verification - Added integration coverage for `shell` and `exec_command` glob deny-read enforcement - `cargo check -p codex-sandboxing -p codex-linux-sandbox --tests` - `cargo check -p codex-core --test all` - `cargo clippy -p codex-linux-sandbox -p codex-sandboxing --tests` - `just bazel-lock-check` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-16 17:35:16 -07:00
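The prefer/fallback/fail-closed order described under "Linux glob expansion" can be sketched roughly as follows. This is an illustrative Python sketch, not the Rust code from the PR: the function names are hypothetical, the fallback uses `fnmatch` (whose `*` also crosses `/`, unlike globset), and treating `rg` exit code 1 with empty stderr as "no matches" is an assumption of this sketch.

```python
import fnmatch
import os
import subprocess


def walk_fallback(pattern, search_root):
    """Hypothetical stand-in for the internal walker used when rg is absent."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(search_root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Match against the path relative to the search root.
            rel = os.path.relpath(full, search_root).replace(os.sep, "/")
            if fnmatch.fnmatchcase(rel, pattern):
                matches.append(full)
    return sorted(matches)


def expand_deny_read_glob(pattern, search_root, rg_path):
    """Expand a deny-read glob; rg_path is None when rg is not installed."""
    if rg_path is None:
        # Fallback: rg is not installed, so walk the tree ourselves.
        return walk_fallback(pattern, search_root)
    result = subprocess.run(
        [rg_path, "--files", "--hidden", "--no-ignore",
         "--glob", pattern, "--", search_root],
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        return sorted(result.stdout.splitlines())
    # Assumption of this sketch: exit code 1 with no stderr means no matches.
    if result.returncode == 1 and not result.stderr.strip():
        return []
    # Fail closed: any other rg failure aborts instead of omitting masks.
    raise RuntimeError(f"rg failed ({result.returncode}): {result.stderr.strip()}")
```

A caller would typically pass `shutil.which("rg")` as `rg_path`, so a missing binary selects the fallback while a broken one raises.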
xli-oai	5818ed6660	Move marketplace add under plugin command (#18116 ) ## Summary - move the marketplace add CLI from `codex marketplace add` to `codex plugin marketplace add` - keep marketplace config overrides working through the nested plugin command - reject `--sparse` for local marketplace directory sources before the local-source install path bypasses git-source validation ## Validation - `just fmt` - `git diff --check` - `cargo test -p codex-cli` - `cargo test -p codex-core marketplace_add -- --nocapture` - `cargo test -p codex-core install_plugin_updates_config_with_relative_path_and_plugin_key -- --nocapture` - `xli-test-marketplace-cli` local isolated matrix: `T1`, `L1`-`L10`	2026-04-16 17:06:34 -07:00
Matthew Zeng	bf6e7e12aa	Use in-process app-server for unknown-thread MCP read test (#18196 ) ## Summary - Switch the unknown-thread MCP resource read test from the stdio subprocess to the in-process app-server path. - Keep the assertion focused on the returned error message while avoiding child-process teardown timing issues in nextest. ## Testing - Not run (not requested)	2026-04-16 23:46:15 +00:00
Jeff Harris	65cc12d72e	Use codex-auto-review for guardian reviews (#18169 ) ## Summary This is the minimal client-side follow-up for the Codex Auto Review model slug rollout. It updates the guardian reviewer preferred model from `gpt-5.4` to `codex-auto-review`, so the client can rely on the backend catalog + Statsig mapping instead of hardcoding the GPT-5.4 slug. Context: https://openai.slack.com/archives/C0AF9328RL0/p1775777479388369?thread_ts=1775773094.071629&cid=C0AF9328RL0 ## Testing - `cargo fmt --package codex-core --check` - `cargo test -p codex-core guardian::` - `bazel test --experimental_remote_downloader= --test_output=errors //codex-rs/core:core-unit-tests --test_arg=guardian`	2026-04-16 15:43:51 -07:00
pakrym-oai	a1736fcd20	[codex] Split codex turn logic (#18206 ) ## Summary - Move Codex turn execution logic from `codex.rs` into `codex/turn.rs`. - Keep the existing crate-visible `run_turn`, `build_prompt`, `built_tools`, and `get_last_assistant_message_from_turn` surface re-exported from `codex.rs`. - Preserve test access for moved turn helpers while reducing the main `codex.rs` orchestration footprint. ## Stack - Base: #18200 (`pakrym/split-codex-handlers`) ## Testing - `CARGO_INCREMENTAL=0 cargo test -p codex-core --lib` - `just fix -p codex-core` - `just fmt` - `git diff --check`	2026-04-16 15:28:59 -07:00
canvrno-oai	fa5d14e276	Add tabbed lists, single line rendering, col width changes (#18188 ) This PR adds shared bottom-pane selection-list for future `/plugins` menu work and wires the existing `/plugins` menu into the new list-rendering path without changing it to tabs yet. The main user-visible effect is that the current plugin list now renders as a denser single-line list with shared name-column sizing, while the tabbed selection support remains available for follow-up PRs but is currently unused in production menus. - Add generic tabbed selection-list support to the bottom pane, including per-tab headers/items and tab-aware list state - Add single-line row rendering with ellipsis truncation for dense list UIs - Add shared name-column width support so descriptions align consistently across rows - Wire the current /plugins menu to the new single-line and shared column-width behavior only - Keep tabbed menu adoption deferred; no existing menu is switched to tabs in this PR --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-16 15:27:59 -07:00
bxie-openai	6a1ddfc366	[codex] Update realtime V2 VAD silence delay and 1.5 prompt (#18092 ) ## Summary - set the realtime v2 server VAD silence delay to 500ms - update the default realtime 1.5 backend prompt to the v4 text - keep the session payload and prompt rendering tests aligned with those changes ## Why - the VAD change gives the voice path a longer pause before ending the user's turn - the prompt change makes the default bundled realtime prompt match the current v4 content ## Validation - `cargo +1.93.0 test -p codex-core realtime_prompt --manifest-path /tmp/codex-realtime-v2-vad-prompt-v4/codex-rs/Cargo.toml` - `CARGO_TARGET_DIR=/tmp/codex-pr-v4-target cargo +1.93.0 test -p codex-api realtime_v2_session_update_includes_background_agent_tool_and_handoff_output_item --manifest-path /tmp/codex-realtime-v2-vad-prompt-v4/codex-rs/Cargo.toml` - `CARGO_TARGET_DIR=/tmp/codex-pr-v4-target cargo +1.93.0 test -p codex-app-server --test all 'suite::v2::realtime_conversation::realtime_webrtc_start_emits_sdp_notification' --manifest-path /tmp/codex-realtime-v2-vad-prompt-v4/codex-rs/Cargo.toml -- --exact`	2026-04-16 14:30:57 -07:00
Abhinav	d9c71d41a9	Add OTEL metrics for hook runs (#18026 ) # Why We already emit analytics for completed hook runs, but we don't have matching OTEL metrics to track hook volume and latency. # What - add `codex.hooks.run` and `codex.hooks.run.duration_ms` - tag both metrics with `hook_name`, `source`, and `status` - emit the metrics from the completed hook path Verified locally against a dummy OTLP collector --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-16 21:30:38 +00:00
Adrian	55c3de75cb	Register agent tasks behind use_agent_identity (#17387 ) ## Summary Stack PR3 for feature-gated agent identity support. This PR adds per-thread agent task registration behind `features.use_agent_identity`. Tasks are minted on the first real user turn and cached in thread runtime state for later turns. ## Stack - PR1: https://github.com/openai/codex/pull/17385 - add `features.use_agent_identity` - PR2: https://github.com/openai/codex/pull/17386 - register agent identities when enabled - PR3: https://github.com/openai/codex/pull/17387 - this PR, original task registration slice - PR3.1: https://github.com/openai/codex/pull/17978 - persist and prewarm registered tasks per thread - PR4: https://github.com/openai/codex/pull/17980 - use `AgentAssertion` downstream when enabled ## Validation Covered as part of the local stack validation pass: - `just fmt` - `cargo test -p codex-core --lib agent_identity` - `cargo test -p codex-core --lib agent_assertion` - `cargo test -p codex-core --lib websocket_agent_task` - `cargo test -p codex-api api_bridge` - `cargo build -p codex-cli --bin codex` ## Notes The full local app-server E2E path is still being debugged after PR creation. The current branch stack is directionally ready for review while that follow-up continues.	2026-04-16 14:30:02 -07:00
pakrym-oai	0708cc78cb	[codex] Split codex op handlers (#18200 ) Start splitting the codex.rs	2026-04-16 14:21:29 -07:00
starr-openai	3905f72891	Throttle Windows Bazel test concurrency (#18192 ) ## Summary - cap the Windows Bazel test lane at `--jobs=8` to reduce local runner pressure - keep Linux and macOS Bazel test concurrency unchanged - make failed-test log tailing resolve `bazel-testlogs` with the same CI config and Windows host-platform context as the failed invocation - prefer Bazel-reported `test.log` paths and normalize Windows path separators before tailing ## Context The Windows Bazel workflow currently uses `ci-windows`, which does not inherit the remote executor config. This means the lane runs the `//...` test suite locally and otherwise falls back to the repo-wide `common --jobs=30`. The new Windows-only override is intended to reduce local executor pressure without changing coverage. ## Validation Not run locally; this is a CI workflow change and the draft PR is intended to exercise the GitHub Actions lane directly. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-16 14:16:15 -07:00
bxie-openai	37bf42d5d5	[codex] Make realtime startup context truncation deterministic (#18172 ) ## Summary - remove the final whole-blob truncation pass from realtime startup-context assembly - enforce fixed per-section budgets, including each section heading - keep the existing per-section caps and raise the overall realtime startup-context budget to `5300`, matching the sum of those section budgets - add focused tests for the new wrapping and section-budget behavior ## Why The previous flow truncated each section and then middle-truncated the final combined startup-context blob again. Small input changes could shift that combined cut point, which made retained context unstable and caused nondeterministic tests. ## Impact Startup context now preserves section boundaries and ordering deterministically. Each section is still budgeted independently, but the final assembled blob is no longer truncated again as a single opaque string. To match that design, the overall startup-context token budget is updated to the sum of the existing section budgets rather than lowering the section caps. ## Validation - `cargo +1.93.0 test -p codex-core realtime_context` - `cargo +1.93.0 test -p codex-core --test all suite::realtime_conversation::conversation_start_injects_startup_context_from_thread_history -- --exact` - `cargo +1.93.0 test -p codex-core --test all suite::realtime_conversation::conversation_startup_context_current_thread_selects_many_turns_by_budget -- --exact` - `cargo +1.93.0 test -p codex-core --test all suite::realtime_conversation::conversation_startup_context_falls_back_to_workspace_map -- --exact` - `cargo +1.93.0 test -p codex-core --test all suite::realtime_conversation::conversation_startup_context_is_truncated_and_sent_once_per_start -- --exact`	2026-04-16 13:51:43 -07:00
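The per-section budgeting described above can be illustrated with a small Python sketch. It is not the code from the PR: the section names and budget values are made up, and budgets are counted in characters here rather than tokens. The point it shows is that each section (heading included) is truncated on its own and the assembled result is never truncated again, so a change in one section cannot move the cut point in another.

```python
def truncate_middle(text, budget):
    """Keep the head and tail of text so the result fits within budget."""
    if len(text) <= budget:
        return text
    marker = "..."
    if budget <= len(marker):
        return text[:budget]
    keep = budget - len(marker)
    head = (keep + 1) // 2
    tail = keep - head
    return text[:head] + marker + text[len(text) - tail:]


def build_startup_context(sections, budgets):
    """sections: list of (name, body); budgets: name -> max characters."""
    parts = []
    for name, body in sections:
        # The heading counts against the section's own budget.
        rendered = f"## {name}\n{body}"
        parts.append(truncate_middle(rendered, budgets[name]))
    # No final whole-blob truncation: the total is bounded by the sum of
    # the section budgets plus the separators.
    return "\n\n".join(parts)
```

With this shape, the overall budget is naturally the sum of the section budgets, which matches the reasoning the commit gives for raising the total rather than lowering the caps.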
Felipe Coury	ec8d4bfc77	fix(app-server): replay token usage after resume and fork (#18023 ) ## Problem When a user resumed or forked a session, the TUI could render the restored thread history immediately, but it did not receive token usage until a later model turn emitted a fresh usage event. That left the context/status UI blank or stale during the exact window where the user expects resumed state to look complete. Core already reconstructed token usage from the rollout; the missing behavior was app-server lifecycle replay to the client that just attached. ## Mental model Token usage has two representations. The rollout is the durable source of historical `TokenCount` events, and the core session cache is the in-memory snapshot reconstructed from that rollout on resume or fork. App-server v2 clients do not read core state directly; they learn about usage through `thread/tokenUsage/updated`. The fix keeps those roles separate: core exposes the restored `TokenUsageInfo`, and app-server sends one targeted notification after a successful `thread/resume` or `thread/fork` response when that restored snapshot exists. This notification is not a new model event. It is a replay of already-persisted state for the client that just attached. That distinction matters because using the normal core event path here would risk duplicating `TokenCount` entries in the rollout and making future resumes count historical usage twice. ## Non-goals This change does not add a new protocol method or payload shape. It reuses the existing v2 `thread/tokenUsage/updated` notification and the TUI’s existing handler for that notification. This change does not alter how token usage is computed, accumulated, compacted, or written during turns. It only exposes the token usage that resume and fork reconstruction already restored. This change does not broadcast historical usage replay to every subscribed client. 
The replay is intentionally scoped to the connection that requested resume or fork so already-attached clients are not surprised by an old usage update while they may be rendering live activity. ## Tradeoffs Sending the usage notification after the JSON-RPC response preserves a clear lifecycle order: the client first receives the thread object, then receives restored usage for that thread. The tradeoff is that usage is still a notification rather than part of the `thread/resume` or `thread/fork` response. That keeps the protocol shape stable and avoids duplicating usage fields across response types, but clients must continue listening for notifications after receiving the response. The helper selects the latest non-in-progress turn id for the replayed usage notification. This is conservative because restored usage belongs to completed persisted accounting, not to newly attached in-flight work. The fallback to the last turn preserves a stable wire payload for unusual histories, but histories with no meaningful completed turn still have a weak attribution story. ## Architecture Core already seeds `Session` token state from the last persisted rollout `TokenCount` during `InitialHistory::Resumed` and `InitialHistory::Forked`. The new core accessor exposes the complete `TokenUsageInfo` through `CodexThread` without giving app-server direct session mutation authority. App-server calls that accessor from three lifecycle paths: cold `thread/resume`, running-thread resume/rejoin, and `thread/fork`. In each path, the server sends the normal response first, then calls a shared helper that converts core usage into `ThreadTokenUsageUpdatedNotification` and sends it only to the requesting connection. The tests build fake rollouts with a user turn plus a persisted token usage event. They then exercise `thread/resume` and `thread/fork` without starting another model turn, proving that restored usage arrives before any next-turn token event could be produced. 
## Observability The primary debug path is the app-server JSON-RPC stream. After `thread/resume` or `thread/fork`, a client should see the response followed by `thread/tokenUsage/updated` when the source rollout includes token usage. If the notification is absent, check whether the rollout contains an `event_msg` payload of type `token_count`, whether core reconstruction seeded `Session::token_usage_info`, and whether the connection stayed attached long enough to receive the targeted notification. The notification is sent through the existing `OutgoingMessageSender::send_server_notification_to_connections` path, so existing app-server tracing around server notifications still applies. Because this is a replay, not a model turn event, debugging should start at the resume/fork handlers rather than the turn event translation in `bespoke_event_handling`. ## Tests The focused regression coverage is `cargo test -p codex-app-server emits_restored_token_usage`, which covers both resume and fork. The core reconstruction guard is `cargo test -p codex-core record_initial_history_seeds_token_info_from_rollout`. Formatting and lint/fix passes were run with `just fmt`, `just fix -p codex-core`, and `just fix -p codex-app-server`. Full crate test runs surfaced pre-existing unrelated failures in command execution and plugin marketplace tests; the new token usage tests passed in focused runs and within the app-server suite before the unrelated command execution failure.	2026-04-16 17:29:34 -03:00
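The replay ordering and turn attribution described in this commit can be sketched in Python as below. This is only an illustration: the status string `"inProgress"`, the payload field names, and the `outbox` list standing in for the outgoing message sender are all assumptions of the sketch, not the app-server's actual types.

```python
def replay_turn_id(turns):
    """Pick the latest turn that is not in progress, else the last turn."""
    for turn in reversed(turns):
        if turn["status"] != "inProgress":
            return turn["id"]
    # Fallback keeps the payload stable for unusual histories.
    return turns[-1]["id"] if turns else None


def handle_resume(outbox, connection_id, thread, restored_usage):
    """Send the response first, then replay usage to the requester only."""
    outbox.append((connection_id, "response", {"thread": thread}))
    if restored_usage is not None:
        outbox.append((
            connection_id,
            "thread/tokenUsage/updated",
            {
                "threadId": thread["id"],
                "turnId": replay_turn_id(thread["turns"]),
                "tokenUsage": restored_usage,
            },
        ))
```

Because the notification is appended directly for one connection and never goes through the turn event path, nothing new is written to the rollout, which is the duplication risk the commit message calls out.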
Michael Bolin	ea34c6ed8d	fix: fix clippy issue in examples/ folder (#18184 ) I believe this use of `expect()` was introduced in https://github.com/openai/codex/pull/17826 but was not flagged by CI, though I did see it in the diagnostics panel in VS Code, so it's worth cleaning up. I guess our current CI does not include `examples/` when running Clippy?	2026-04-16 12:48:31 -07:00
Abhinav	8720b7bdce	Add codex_hook_run analytics event (#17996 ) # Why Add product analytics for hook handler executions so we can understand which hooks are running, where they came from, and whether they completed, failed, stopped, or blocked work. # What - add the new `codex_hook_run` analytics event and payload plumbing in `codex-rs/analytics` - emit hook-run analytics from the shared hook completion path in `codex-rs/core` - classify hook source from the loaded hook path as `system`, `user`, `project`, or `unknown` ``` { "event_type": "codex_hook_run", "event_params": { "thread_id": "string", "turn_id": "string", "model_slug": "string", "hook_name": "string (any HookEventName)", "hook_source": "system | user | project | unknown", "status": "completed | failed | stopped | blocked" } } ``` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-16 19:43:16 +00:00
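As a rough illustration of the `codex_hook_run` payload shape listed in the commit message, the Python sketch below builds such an event and rejects values outside the listed enumerations. The builder function and its validation behaviour are hypothetical; only the field names and the allowed `hook_source` and `status` values come from the commit message.

```python
import json

HOOK_SOURCES = {"system", "user", "project", "unknown"}
HOOK_STATUSES = {"completed", "failed", "stopped", "blocked"}


def build_hook_run_event(thread_id, turn_id, model_slug,
                         hook_name, hook_source, status):
    """Build a codex_hook_run payload (hypothetical helper)."""
    if hook_source not in HOOK_SOURCES:
        raise ValueError(f"unexpected hook_source: {hook_source!r}")
    if status not in HOOK_STATUSES:
        raise ValueError(f"unexpected status: {status!r}")
    return {
        "event_type": "codex_hook_run",
        "event_params": {
            "thread_id": thread_id,
            "turn_id": turn_id,
            "model_slug": model_slug,
            "hook_name": hook_name,
            "hook_source": hook_source,
            "status": status,
        },
    }


if __name__ == "__main__":
    # Placeholder identifiers, for illustration only.
    event = build_hook_run_event(
        "thread-1", "turn-1", "example-model", "ExampleHook", "project", "completed"
    )
    print(json.dumps(event, indent=2))
```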

5798 Commits