codex

mirror of https://github.com/openai/codex.git synced 2026-05-27 06:25:48 +00:00

Author	SHA1	Message	Date
Won Park	83ec1eb5d6	Rename approvals reviewer variant to auto-review (#19056 ) ## Why `approvals_reviewer` now uses `auto_review` as the canonical config/API value after #18504, but the Rust enum variant and nearby helper/test names still used `GuardianSubagent` / guardian approval wording. That made follow-up code and reviews confusing even though the external value had already moved to Auto-review. ## What changed - Renamed `ApprovalsReviewer::GuardianSubagent` to `ApprovalsReviewer::AutoReview`. - Updated protocol, app-server, config, core, TUI, exec, and analytics test callsites. - Renamed nearby helper/test names from guardian approval wording to Auto-review wording where they refer to the approvals reviewer mode. - Preserved wire compatibility: - `auto_review` remains the canonical serialized value. - `guardian_subagent` remains accepted as a legacy alias. This intentionally does not rename the `[features].guardian_approval` key, `Feature::GuardianApproval`, `core/src/guardian`, analytics event names, or app-server Guardian review event types. ## Verification - `cargo test -p codex-protocol approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent` - `cargo test -p codex-app-server-protocol approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent` - `cargo test -p codex-config approvals_reviewer` - `cargo test -p codex-tui update_feature_flags` - `cargo test -p codex-core permissions_instructions` - `cargo test -p codex-tui permissions_selection`	2026-04-22 17:22:35 -07:00
Andrei Eternal	eed0e07825	hooks: emit Bash PostToolUse when exec_command completes via write_stdin (#18888 ) Fixes #16246. ## Why `exec_command` already emits `PreToolUse`, but long-running unified exec commands that finish on a later `write_stdin` poll could miss the matching `PostToolUse`. That left the Bash hook lifecycle inconsistent, broke expectations around `tool_use_id` and `tool_input.command`, and meant `PostToolUse` block/replacement feedback could fail to replace the final session output before it reached model context. This keeps the fix scoped to the `exec_command` / `write_stdin` lifecycle. Broader non-Bash hook expansion is still out of scope here and remains tracked separately in #16732. ## What changed - Compute and store `PostToolUsePayload` while handlers still have access to their concrete output type, and carry `tool_use_id` through that payload. - Preserve the original hook-facing `exec_command` string through unified exec state (`ExecCommandRequest`, `ProcessEntry`, `PreparedProcessHandles`, and `ExecCommandToolOutput`) via `hook_command`, and remove the now-unused `session_command` output metadata. - Emit exactly one Bash `PostToolUse` for long-running `exec_command` sessions when a later `write_stdin` poll observes final completion, using the original `exec_command` call id and hook-facing command. - Keep one-shot `exec_command` behavior aligned with the same payload construction, including interactive completions that return a final result directly. - Apply `PostToolUse` block/replacement feedback before the final `write_stdin` completion output is sent back to the model. - Keep `write_stdin` itself out of `PreToolUse` matching so it continues to act as transport/polling for the original Bash tool call. - Restore plain matcher behavior for tool-name matchers such as `Bash` and `Edit\|Write`, while still treating patterns with regex characters (for example `mcp__.*`) as regexes. 
- Add unit coverage for unified exec payload construction and parallel session separation, plus a core integration regression that verifies a blocked `PostToolUse` replaces the final `write_stdin` output in model context. ## Testing - `cargo test -p codex-hooks` - `cargo test -p codex-core post_tool_use_payload` - `cargo test -p codex-core post_tool_use_blocks_when_exec_session_completes_via_write_stdin`	2026-04-22 17:14:22 -07:00
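The matcher rule described in this commit (plain tool names such as `Bash` and `Edit|Write` match literally, while patterns with regex characters such as `mcp__.*` are treated as regexes) can be sketched as below. This is a std-only illustration, not the `codex-hooks` implementation: `MatchOutcome`, `is_plain_matcher`, and `match_tool_name` are hypothetical names, and the rule for what counts as "plain" (ASCII alphanumerics, `_`, and `|`) is an assumption. Regex evaluation is left to a real regex engine.

```rust
/// Outcome of the plain-name check. `NeedsRegex` means the pattern should be
/// handed to a real regex engine, which this sketch does not include.
#[derive(Debug, PartialEq, Eq)]
pub enum MatchOutcome {
    Matched,
    NotMatched,
    NeedsRegex,
}

/// Assumption: a matcher is "plain" when it contains only tool-name
/// characters (ASCII alphanumerics and `_`) and `|` separators.
fn is_plain_matcher(pattern: &str) -> bool {
    !pattern.is_empty()
        && pattern
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '|')
}

/// Plain matchers compare each `|`-separated alternative to the tool name
/// exactly; anything else is deferred to regex matching.
pub fn match_tool_name(pattern: &str, tool_name: &str) -> MatchOutcome {
    if !is_plain_matcher(pattern) {
        return MatchOutcome::NeedsRegex;
    }
    if pattern.split('|').any(|name| name == tool_name) {
        MatchOutcome::Matched
    } else {
        MatchOutcome::NotMatched
    }
}
```

Under these assumptions, `Bash` matches only a tool named exactly `Bash`, `Edit|Write` matches either name, and `mcp__.*` is reported as needing a regex.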
Michael Bolin	6ca038bbd1	rollout: persist turn permission profiles (#18281 ) ## Why Resume and reconstruction need to preserve the permissions that were active for each user turn. If rollouts only keep legacy sandbox fields, replay cannot faithfully represent profile-shaped overrides introduced earlier in the stack. ## What changed This records `permission_profile` on user-turn rollout events, reconstructs it through history/state extraction, and updates rollout reconstruction and related fixtures to keep the field explicit. ## Verification - `cargo test -p codex-core --test all permissions_messages -- --nocapture` - `cargo test -p codex-core --test all request_permissions -- --nocapture` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18281). * #18288 * #18287 * #18286 * #18285 * #18284 * #18283 * #18282 * __->__ #18281	2026-04-22 17:00:29 -07:00
Michael Bolin	bc083e4713	clients: send permission profiles to app-server (#18280 ) ## Why After app-server can accept `PermissionProfile`, first-party clients should stop preferring legacy sandbox fields when canonical permission information is available. This keeps the migration moving without removing legacy compatibility yet. The client side still has mixed surfaces during the stack: embedded thread start/resume/fork and exec initial turns can derive a profile directly from local config, while TUI remote sessions and some turn-start paths only have a legacy/server-context-safe sandbox projection. Those paths keep sending legacy sandbox fields rather than synthesizing or sending lossy/local-only profiles. ## What changed - Sends `permissionProfile` from exec and embedded TUI thread start/resume/fork requests when config has a representable profile. - Keeps legacy sandbox fallback for external sandbox policies, TUI remote thread lifecycle requests, and TUI turn-start requests that do not yet carry the active profile. - Sends the actual config-derived `permissionProfile` for exec initial turns instead of rebuilding one from the legacy sandbox projection. - Stores response `permissionProfile` as optional in TUI session state so external sandbox responses and compatibility payloads preserve `null`. - Updates tests for request construction and response mapping. ## Verification - `cargo check --tests -p codex-tui -p codex-exec` - `cargo test -p codex-tui app_server_session -- --nocapture` - `cargo test -p codex-exec thread_start_params -- --nocapture` - `cargo test -p codex-tui app_server_session::tests::thread_lifecycle_params -- --nocapture` - `just fix -p codex-tui -p codex-exec` - `just fix -p codex-tui` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18280). 
* #18288 * #18287 * #18286 * #18285 * #18284 * #18283 * #18282 * #18281 * __->__ #18280	2026-04-22 16:34:13 -07:00
Michael Bolin	44dbd9e48a	exec-server: require explicit filesystem sandbox cwd (#19046 ) ## Why This is a cleanup PR for the `PermissionProfile` migration stack. #19016 fixed remote exec-server sandbox contexts so Docker-backed filesystem requests use a request/container `cwd` instead of leaking the local test runner `cwd`. That exposed the broader API problem: `FileSystemSandboxContext::new(SandboxPolicy)` could still reconstruct filesystem permissions by reading the exec-server process cwd with `AbsolutePathBuf::current_dir()`. That made `cwd`-dependent legacy entries, such as `:cwd`, `:project_roots`, and relative deny globs, depend on ambient process state instead of the request sandbox `cwd`. As later PRs make `PermissionProfile` the primary permissions abstraction, sandbox contexts should be explicit about whether they carry a request `cwd` or are profile-only. Removing the implicit constructor prevents new call sites from accidentally rebuilding permissions against the wrong `cwd`. ## What changed - Removed `FileSystemSandboxContext::new(SandboxPolicy)`. - Kept production callers on explicit constructors: `from_legacy_sandbox_policy(..., cwd)`, `from_permission_profile(...)`, and `from_permission_profile_with_cwd(...)`. - Updated exec-server test helpers to construct `PermissionProfile` values directly instead of routing through legacy `SandboxPolicy` projections. - Updated the environment regression test to use an explicit restricted profile with no synthetic `cwd`. ## Verification - `cargo test -p codex-exec-server` - `just fix -p codex-exec-server` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19046). * #18288 * #18287 * #18286 * #18285 * #18284 * #18283 * #18282 * #18281 * #18280 * __->__ #19046	2026-04-22 23:05:12 +00:00
Won Park	46142c3cb0	Rebrand approvals reviewer config to auto-review (#18504) ### Why Auto-review is the user-facing name for the approvals reviewer, but the config/API value still exposed the old `guardian_subagent` name. That made new configs and generated schemas point users at Guardian terminology even though the intended product surface is Auto-review. This PR updates the external `approvals_reviewer` value while preserving compatibility for existing configs and clients. ### What changed - Makes `auto_review` the canonical serialized value for `approvals_reviewer`. - Keeps `guardian_subagent` accepted as a legacy alias. - Keeps `user` accepted and serialized as `user`. - Updates generated config and app-server schemas so `approvals_reviewer` includes: - `user` - `auto_review` - `guardian_subagent` - Updates app-server README docs for the reviewer value. - Updates analytics and config requirements tests for the canonical `auto_review` value. ### Compatibility Existing configs and API payloads using: ```toml approvals_reviewer = "guardian_subagent" ``` continue to load and map to the Auto-review reviewer behavior. New serialization emits: ```toml approvals_reviewer = "auto_review" ``` This PR intentionally does not rename the `[features].guardian_approval` key or broad internal Guardian symbols. Those are split out for a follow-up PR to keep this migration small and avoid touching large TUI/internal surfaces. ### Verification - `cargo test -p codex-protocol approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent` - `cargo test -p codex-app-server-protocol approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent`	2026-04-22 15:45:35 -07:00
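The compatibility rule in this commit (emit `auto_review`, still accept `guardian_subagent`) can be illustrated with a small std-only sketch. The enum below is a hypothetical stand-in and not the `codex-protocol` type; the actual crate may express the alias differently, for example through its serialization attributes.

```rust
use std::str::FromStr;

/// Hypothetical stand-in for the reviewer setting; not the codex-protocol type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApprovalsReviewer {
    User,
    AutoReview,
}

impl ApprovalsReviewer {
    /// Canonical serialized value: the legacy name is never emitted.
    pub fn as_str(self) -> &'static str {
        match self {
            ApprovalsReviewer::User => "user",
            ApprovalsReviewer::AutoReview => "auto_review",
        }
    }
}

impl FromStr for ApprovalsReviewer {
    type Err = String;

    fn from_str(value: &str) -> Result<Self, Self::Err> {
        match value {
            "user" => Ok(ApprovalsReviewer::User),
            // `guardian_subagent` is accepted as a legacy alias on input.
            "auto_review" | "guardian_subagent" => Ok(ApprovalsReviewer::AutoReview),
            other => Err(format!("unknown approvals_reviewer value: {other}")),
        }
    }
}
```

Parsing `"guardian_subagent"` and writing the result back with `as_str()` yields `"auto_review"`, which is the round-trip behavior the commit message describes.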
Konstantine Kahadze	0e25c5ff42	Update bundled OpenAI Docs skill freshness check (#19043 ) ## Summary Sync the bundled `openai-docs` system skill with the already-merged `openai/skills` update from https://github.com/openai/skills/pull/360. Codex bundles system skills from `codex-rs/skills/src/assets/samples`, so this PR copies the same GPT-5.4 OpenAI Docs skill update into the Codex app/CLI bundle path. ## Changes - Add the latest-model resolver script to the bundled `openai-docs` skill. - Route model upgrade and prompt-upgrade requests through remote latest-model metadata when current guidance is needed. - Rename bundled fallback references to `upgrade-guide.md` and `prompting-guide.md`. - Keep the bundled fallback guidance GPT-5.4-only. ## Validation - Verified this bundled skill is byte-for-byte identical to `openai/skills@origin/main` `skills/.system/openai-docs`. - Ran the resolver locally and confirmed it returns `gpt-5.4` / `gpt-5p4`.	2026-04-22 22:31:04 +00:00
khoi	568cdacc7e	[Codex] Register browser requirements feature keys (#18956) ## Summary - register `in_app_browser` and `browser_use` as stable feature keys - allow requirements/MDM feature requirements to pin those desktop browser controls - add coverage for browser requirements being accepted by config loading ## Testing - `cargo fmt --all` (`just fmt` unavailable locally; rustfmt warned about nightly-only `imports_granularity` config) - `cargo test -p codex-features` - `cargo test -p codex-core browser_feature_requirements_are_valid` - Tested manually by setting them in `requirements.toml` and confirming after an app restart that the state reflected the setting (at the time, this hid the `Browser Use` setting when the enterprise setting was set to false)	2026-04-22 15:27:15 -07:00
joeytrasatti-openai	ee70b365ab	Overlay state DB git metadata for filtered thread lists (#19036 ) ## Summary - Factor the state DB `ThreadMetadata` to rollout `ThreadItem` mapping into a shared helper used by both DB pages and filesystem overlays - Generalize filtered filesystem list overlays to fill missing thread list metadata from the state-derived `ThreadItem`, while preserving filesystem `path` and `thread_id` - Add coverage for the merge behavior so existing filesystem values are not overwritten and future `ThreadItem` fields require an explicit decision ## Testing - `just fmt` from `codex-rs` - `git diff --check -- codex-rs/rollout/src/recorder.rs codex-rs/rollout/src/recorder_tests.rs` - Attempted `cargo test -p codex-rollout thread_item_metadata` from `codex-rs`; blocked in dependency fetch/setup after updating crates.io and git submodules `https://github.com/livekit/protocol` and `https://chromium.googlesource.com/libyuv/libyuv`, so the focused tests did not run	2026-04-22 14:59:20 -07:00
Michael Bolin	d3dd0d759b	exec-server: expose arg0 alias root to fs sandbox (#19016 ) ## Why The post-merge `rust-ci-full` run for #18999 still failed the Ubuntu remote `suite::remote_env` sandboxed filesystem tests. That run checked out merge commit `ddde50c611e4800cb805f243ed3c50bbafe7d011`, so the arg0 guard lifetime fix was present. The Docker-backed failure had two remaining pieces: - The sandboxed filesystem helper needs to execute Codex through the `codex-linux-sandbox` arg0 alias path. The helper sandbox was only granting read access to the real Codex executable parent, so the alias parent also has to be visible inside the helper sandbox. - The remote-env tests were building sandbox contexts with `FileSystemSandboxContext::new()`, which captures the local test runner cwd. In the Docker remote exec-server, that host checkout path does not exist, so spawning the filesystem helper failed with `No such file or directory` before the helper could process the request. ## What Changed - Track all helper runtime read roots instead of a single root. - Add both the real Codex executable parent and the `codex-linux-sandbox` alias parent to sandbox readable roots. - Avoid sending an unused local cwd in remote filesystem sandbox contexts when the permission profile has no cwd-dependent entries. - Build the Docker remote-env test sandbox contexts with a cwd path that exists inside the container. - Add unit coverage for the alias-parent root and remote sandbox cwd handling. ## Verification - `cargo test -p codex-exec-server` - `cargo test -p codex-core remote_test_env_sandboxed_read_allows_readable_root` - `just fix -p codex-exec-server` - `just fix -p codex-core`	2026-04-22 21:34:22 +00:00
Leo Shimonaka	16eeeb534a	Fix MCP permission policy sync (#19033 ) ###### Why/Context/Summary Repro: start a session outside Full Access, switch permissions to Full Access, then submit a new turn that triggers MCP/CUA permission handling. The turn used the live Full Access `SessionConfiguration`, but the MCP coordinator was still synced from the stale `original_config_do_not_use` / per-turn config copy. That left the coordinator with an old sandbox policy, so empty MCP permission elicitations could be denied instead of auto-accepted. Fix: update/rebuild the MCP connection manager from the live turn/session approval and sandbox policy fields. ###### Test plan ```sh just fmt cargo test -p codex-core --lib cargo test -p codex-core --lib mcp_tool_call::tests ```	2026-04-22 14:30:29 -07:00
viyatb-oai	2d73bac45f	feat: add guardian network approval trigger context (#18197 ) ## Summary Give guardian network-access reviews the command context that triggered a managed-network approval. The prompt JSON now includes the originating tool call id, tool name, command argv, cwd, sandbox permissions, additional permissions, justification, and tty state when a single active tool call can be attributed. The implementation keeps the trigger shape canonical by serializing `GuardianNetworkAccessTrigger` directly and lets each runtime build that trigger from its `ToolCtx`. Non-guardian approval prompts avoid cloning the full trigger payload. ## UX changes Guardian network-access reviews now include a `trigger` object that explains what command caused the network approval. Instead of seeing only the requested host, the guardian reviewer can also see the originating tool call, argv, working directory, sandbox mode, justification, and tty state. Example payload the guardian reviewer can see: ```json { "tool": "network_access", "target": "https://api.github.com:443", "host": "api.github.com", "protocol": "https", "port": 443, "trigger": { "callId": "call_abc123", "toolName": "shell", "command": ["gh", "api", "/repos/openai/codex/pulls/18197"], "cwd": "/workspace/codex", "sandboxPermissions": "require_escalated", "justification": "Fetch PR metadata from GitHub.", "tty": false } } ``` The network review itself remains scoped to the network decision: `target_item_id` stays `null`. `trigger.callId` is attribution context only, so clients can still distinguish network reviews from item-targeted command reviews. ## Verification - Added coverage for serializing network trigger context in guardian approval JSON. - Added regression coverage that network guardian reviews do not reuse `trigger.callId` as `target_item_id`. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-22 14:00:53 -07:00
Ahmed Ibrahim	9360f267f3	[2/4] Implement executor HTTP request runner (#18582 ) ### Why Remote streamable HTTP MCP needs the executor to perform ordinary HTTP requests on the executor side. This keeps network placement aligned with `experimental_environment = "remote"` without adding MCP-specific executor APIs. ### What - Add an executor-side `http/request` runner backed by `reqwest`. - Validate request method and URL scheme, preserving the transport boundary at plain HTTP. - Return buffered responses for ordinary calls and emit ordered `http/request/bodyDelta` notifications for streaming responses. - Register the request handler in the exec-server router. - Document the runner entrypoint, conversion helpers, body-stream bridge, notification sender, timeout behavior, and new integration-test helpers. - Add exec-server integration tests with the existing websocket harness and a local TCP HTTP peer for buffered and streamed responses, with comments spelling out what each test proves and its setup/exercise/assert phases. ### Stack 1. #18581 protocol 2. #18582 runner 3. #18583 RMCP client 4. #18584 manager wiring and local/remote coverage ### Verification - `just fmt` - `cargo check -p codex-exec-server -p codex-rmcp-client --tests` - `cargo check -p codex-core --test all` compile-only - `git diff --check` - Online full CI is running from the `full-ci` branch, including the remote Rust test job. Co-authored-by: Codex <noreply@openai.com> --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-22 20:36:34 +00:00
Michael Bolin	18a26d7bbc	app-server: accept permission profile overrides (#18279 ) ## Why `PermissionProfile` is becoming the canonical permissions shape shared by core and app-server. After app-server responses expose the active profile, clients need to be able to send that same shape back when starting, resuming, forking, or overriding a turn instead of translating through the legacy `sandbox`/`sandboxPolicy` shorthands. This still needs to preserve the existing requirements/platform enforcement model. A profile-shaped request can be downgraded or rejected by constraints, but the server should keep the user's elevated-access intent for project trust decisions. Turn-level profile overrides also need to retain existing read protections, including deny-read entries and bounded glob-scan metadata, so a permission override cannot accidentally drop configured protections such as `*/.env = deny`. ## What changed - Adds optional `permissionProfile` request fields to `thread/start`, `thread/resume`, `thread/fork`, and `turn/start`. - Rejects ambiguous requests that specify both `permissionProfile` and the legacy `sandbox`/`sandboxPolicy` fields, including running-thread resume requests. - Converts profile-shaped overrides into core runtime filesystem/network permissions while continuing to derive the constrained legacy sandbox projection used by existing execution paths. - Preserves project-trust intent for profile overrides that are equivalent to workspace-write or full-access sandbox requests. - Preserves existing deny-read entries and `globScanMaxDepth` when applying turn-level `permissionProfile` overrides. - Updates app-server docs plus generated JSON/TypeScript schema fixtures and regression coverage. 
## Verification - `cargo test -p codex-app-server-protocol schema_fixtures` - `cargo test -p codex-core session_configuration_apply_permission_profile_preserves_existing_deny_read_entries` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18279). * #18288 * #18287 * #18286 * #18285 * #18284 * #18283 * #18282 * #18281 * #18280 * __->__ #18279	2026-04-22 13:34:33 -07:00
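The rejection of ambiguous requests described in this commit can be sketched as a small validation step. Everything here is hypothetical: `ThreadStartParams`, `PermissionSource`, and `permission_source` are illustrative names, the fields are plain strings rather than the real protocol types, and the error text is invented.

```rust
/// Hypothetical request shape; the fields mirror the JSON names
/// `permissionProfile`, `sandbox`, and `sandboxPolicy` from the commit message.
#[derive(Debug, Default)]
pub struct ThreadStartParams {
    pub permission_profile: Option<String>,
    pub sandbox: Option<String>,
    pub sandbox_policy: Option<String>,
}

/// Which input the server would derive permissions from.
#[derive(Debug, PartialEq, Eq)]
pub enum PermissionSource {
    Profile,
    LegacySandbox,
    ServerDefault,
}

/// Rejects requests that set both the profile and a legacy sandbox field.
pub fn permission_source(params: &ThreadStartParams) -> Result<PermissionSource, String> {
    let has_legacy = params.sandbox.is_some() || params.sandbox_policy.is_some();
    match (params.permission_profile.is_some(), has_legacy) {
        (true, true) => Err(
            "specify either permissionProfile or sandbox/sandboxPolicy, not both".to_string(),
        ),
        (true, false) => Ok(PermissionSource::Profile),
        (false, true) => Ok(PermissionSource::LegacySandbox),
        (false, false) => Ok(PermissionSource::ServerDefault),
    }
}
```

Failing early on the ambiguous case avoids silently picking one of two conflicting permission descriptions.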
Dylan Hurd	ed4def8286	feat(auto-review) short-circuit (#18890) ## Summary Short-circuit the conversation if auto-review hits too many denials. ## Testing - [x] Added unit tests --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-22 20:34:15 +00:00
xl-openai	b77791c228	feat: Fairly trim skill descriptions within context budget (#18925 ) Preserve skill name/path entries whenever possible and trim descriptions first, using round-robin character allocation so short descriptions do not waste budget.	2026-04-22 12:33:29 -07:00
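Round-robin character allocation can be sketched as follows: each description receives one character per round until the budget runs out or every description is fully covered, so short descriptions stop consuming budget once they are complete. This is a minimal std-only sketch; `allocate_description_budget` and `trim_descriptions` are hypothetical names, and the real change may count budget in different units or handle names and paths separately.

```rust
/// Returns how many characters of each description to keep, handing out one
/// character per description per round until `budget` is exhausted.
pub fn allocate_description_budget(lengths: &[usize], budget: usize) -> Vec<usize> {
    let mut allocated = vec![0usize; lengths.len()];
    let mut remaining = budget;
    loop {
        let mut progressed = false;
        for (slot, &len) in allocated.iter_mut().zip(lengths) {
            if remaining == 0 {
                return allocated;
            }
            if *slot < len {
                *slot += 1;
                remaining -= 1;
                progressed = true;
            }
        }
        // Every description is already fully covered; leftover budget is unused.
        if !progressed {
            return allocated;
        }
    }
}

/// Trims each description to its allocated character count.
pub fn trim_descriptions(descriptions: &[&str], budget: usize) -> Vec<String> {
    let lengths: Vec<usize> = descriptions.iter().map(|d| d.chars().count()).collect();
    let allocation = allocate_description_budget(&lengths, budget);
    descriptions
        .iter()
        .zip(allocation)
        .map(|(description, keep)| description.chars().take(keep).collect())
        .collect()
}
```

With lengths 2, 10, and 10 and a budget of 12, the short description keeps both characters and the other two keep 5 each. The loop runs in time proportional to the budget, which keeps the sketch simple at the cost of efficiency.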
Michael Bolin	ddde50c611	arg0: keep dispatch aliases alive during async main (#18999 ) ## Why The Ubuntu GNU remote Cargo run has been regularly failing sandboxed `suite::remote_env` filesystem tests with `No such file or directory`, while the same cases pass under Bazel. The Cargo remote-env setup starts `target/debug/codex exec-server` inside Docker via `scripts/test-remote-env.sh`. That CLI builds `codex-linux-sandbox` and other arg0 helper aliases in a temporary directory, then passes those alias paths into the exec-server runtime. `arg0_dispatch_or_else` constructed `Arg0DispatchPaths` from that temporary alias guard, but then awaited the async CLI entry point without otherwise keeping the guard live. That allowed the guard to be dropped while the exec-server was still running, removing the helper alias directory. Later sandboxed filesystem calls tried to spawn the now-deleted `codex-linux-sandbox` path and surfaced as `ENOENT`. The relevant distinction I found is that `core/tests/common` stores the result of `arg0_dispatch()` in a process-lifetime `OnceLock<Option<Arg0PathEntryGuard>>` for test binaries. The Cargo remote-env setup exercises a real `codex exec-server` process instead, so it depends on the normal CLI lifetime behavior fixed here. ## What Changed - Keep the arg0 tempdir guard alive until `main_fn(paths).await` completes. - Keep the helper on the real `arg0_dispatch()` shape, where alias setup can fail and return `None` in production. - Add a regression test that uses an explicit guard, yields once, and verifies the generated helper alias path still exists while the async entry point is running. ## Verification - `cargo test -p codex-arg0` - `just argument-comment-lint -p codex-arg0` - `just fix -p codex-arg0`	2026-04-22 11:06:34 -07:00
Won Park	11e5af53c4	Add plumbing to approve stored Auto-Review denials (#18955) ## Summary This adds the structural plumbing needed for an app-server client to approve a previously denied Guardian review and carry that approval context into the next model turn. This PR does not add the actual `/auto-review-denials` tool. ## What Changed - Added app-server v2 RPC `thread/approveGuardianDeniedAction`. - Added generated JSON schema and TypeScript fixtures for `ThreadApproveGuardianDeniedAction*`. - Added core `Op::ApproveGuardianDeniedAction`. - Added a core handler that validates the event is a denied Guardian assessment and injects a developer message containing the stored denial event JSON. - Queues the approval context for the next turn if there is no active turn yet. - Added the TUI app-server bridge so `Op::ApproveGuardianDeniedAction { event }` is routed to the app-server request. ## What This Does Not Do - Does not add `/auto-review-denials`. - Does not add chat widget recent-denial state. - Does not add popup/list UI. - Does not add a product-facing denial lookup/store. - Does not change where Guardian denials are originally emitted or persisted. ## Verification - `cargo test -p codex-tui thread_approve_guardian_denied_action`	2026-04-22 10:38:19 -07:00
Dylan Hurd	78593d72ea	feat(auto-review) policy config (#18959 ) ## Summary Allow users to customize their own auto-review policy config. ## Testing - [x] added config_tests	2026-04-22 10:33:02 -07:00
cassirer-openai	f67383bcba	[rollout_trace] Record core session rollout traces (#18877 ) ## Summary Wires rollout trace recording into `codex-core` session and turn execution. This records the core model request/response, compaction, and session lifecycle boundaries needed for replay without yet tracing every nested runtime/tool boundary. ## Stack This is PR 2/5 in the rollout trace stack. - [#18876](https://github.com/openai/codex/pull/18876): Add rollout trace crate - [#18877](https://github.com/openai/codex/pull/18877): Record core session rollout traces - [#18878](https://github.com/openai/codex/pull/18878): Trace tool and code-mode boundaries - [#18879](https://github.com/openai/codex/pull/18879): Trace sessions and multi-agent edges - [#18880](https://github.com/openai/codex/pull/18880): Add debug trace reduction command ## Review Notes This layer is the first live integration point. The important review question is whether trace recording is isolated from normal session behavior: trace failures should not become user-visible execution failures, and recording should preserve the existing turn/session lifecycle semantics. The PR depends on the reducer/data model from the first stack entry and only introduces the core recorder surface that later PRs use for richer runtime and relationship events.	2026-04-22 17:00:48 +00:00
Eric Traut	79ea577156	TUI: Keep remote app-server events draining (#18932 ) Addresses #18860 Problem: Remote app-server clients could stop draining websocket events when their bounded local event channel filled, leaving clients stuck on stale in-progress turns after a disconnect. Solution: Use an unbounded local event channel for the remote client so the websocket reader can keep forwarding disconnect and progress events instead of blocking or dropping them. Why this is reasonable: This does not make the remote websocket itself unbounded. The changed queue lives inside the remote client, between the task that reads the remote websocket and the API consumer in the same client process. Once an event has been received from the remote server, preserving it is preferable to blocking websocket reads or dropping disconnect/lifecycle events; network-level backpressure still happens at the websocket boundary if the remote side outpaces the client.	2026-04-22 09:29:34 -07:00
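The trade-off between a bounded and an unbounded local queue can be illustrated with the standard library's channels. This is an analogy under stated assumptions, not the TUI code: the real client presumably uses async channels, and `forward_bounded`, `forward_unbounded`, and the event names in the usage note are hypothetical.

```rust
use std::sync::mpsc::{channel, sync_channel, TrySendError};

/// Forwards events into a bounded queue without blocking the reader.
/// Returns the delivered events and how many were dropped because it was full.
pub fn forward_bounded(events: &[&str], capacity: usize) -> (Vec<String>, usize) {
    let (tx, rx) = sync_channel::<String>(capacity);
    let mut dropped = 0;
    for event in events {
        match tx.try_send(event.to_string()) {
            Ok(()) => {}
            Err(TrySendError::Full(_)) => dropped += 1,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    drop(tx); // Close the channel so the receiver iterator terminates.
    (rx.iter().collect(), dropped)
}

/// Forwards events into an unbounded queue: `send` never blocks and only
/// fails if the receiver has gone away.
pub fn forward_unbounded(events: &[&str]) -> Vec<String> {
    let (tx, rx) = channel::<String>();
    for event in events {
        if tx.send(event.to_string()).is_err() {
            break;
        }
    }
    drop(tx); // Close the channel so the receiver iterator terminates.
    rx.iter().collect()
}
```

With a capacity of 2 and a slow consumer, a trailing `disconnect` event after two progress events is lost in the bounded variant, while the unbounded variant delivers all three. The alternative for a bounded queue, blocking the reader until space frees up, corresponds to the stalled websocket reads described in the commit.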
Vaibhav Srivastav	0ebe69a8c3	[codex] Update imagegen system skill (#18852 ) ## Summary This updates the embedded `imagegen` system skill in `codex-rs/skills` with the ImageGen 2 skill changes from `openai/skills-internal#87`. The bundled skill now keeps normal image generation/editing on the built-in `image_gen` path, updates the CLI fallback defaults to `gpt-image-2`, and routes explicit transparent-output requests through `gpt-image-1.5` with clear guidance that `gpt-image-2` does not support transparent backgrounds. ## Details - Update `SKILL.md` routing guidance for built-in vs CLI fallback behavior. - Update CLI/API references for `gpt-image-2` size constraints, quality options, near-4K sizes, and unsupported options. - Update `scripts/image_gen.py` defaults and validation: - default model `gpt-image-2` - default size `auto` - default quality `medium` - reject transparent backgrounds on `gpt-image-2` - reject `input_fidelity` on `gpt-image-2` - validate flexible `gpt-image-2` sizes and suggest `3824x2160` / `2160x3824` for near-4K requests - Update prompt/reference docs with the new model and routing guidance. ## Validation - `cargo test -p codex-skills` - `git diff --check` - Manual CLI dry-runs for: - default `gpt-image-2` payload - `3824x2160` near-4K size acceptance - `3840x2160` rejection with near-4K guidance - transparent background rejection on `gpt-image-2` - transparent background acceptance on `gpt-image-1.5` - `input_fidelity` rejection on `gpt-image-2` Bazel target check was not run locally because `bazel` is not installed in this environment.	2026-04-22 15:08:10 +00:00
jif-oai	65420737e8	chore: prep memories for AB (#18973 )	2026-04-22 11:46:15 +01:00
jif-oai	ddf65c9647	fix: cargo deny (#18971 )	2026-04-22 11:46:11 +01:00
jif-oai	639382609f	fix: wait_agent timeout for queued mailbox mail (#18968 ) ## Why `wait_agent` can be called while mailbox mail is already pending. The previous implementation subscribed for future mailbox sequence changes and then waited for the next notification. If the mail was queued before that wait started, no new notification arrived, so the tool could sit until `timeout_ms` even though mail was ready to deliver. ## What Changed - Added `Session::has_pending_mailbox_items()` for checking pending mailbox mail through the session API. - Updated `multi_agents_v2::wait` to return immediately when pending mailbox mail already exists before sleeping on a new mailbox sequence update. - Reworked the regression coverage in `multi_agents_tests.rs` so already queued mailbox mail must wake `wait_agent` promptly. Relevant code: - [`wait_agent` pending-mail check](`aa8ca06e83/codex-rs/core/src/tools/handlers/multi_agents_v2/wait.rs (L55-L60)`) - [`Session::has_pending_mailbox_items`](`aa8ca06e83/codex-rs/core/src/session/mod.rs (L2979-L2981)`) - [`multi_agent_v2_wait_agent_returns_for_already_queued_mail`](`aa8ca06e83/codex-rs/core/src/tools/handlers/multi_agents_tests.rs (L2854)`) ## Verification - `cargo test -p codex-core multi_agent_v2_wait_agent_returns_for_already_queued_mail`	2026-04-22 11:16:17 +01:00
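The fix pattern here, checking for already queued mail before sleeping on a change notification, is the classic "test the predicate before waiting" rule. Below is a minimal synchronous sketch using a mutex and condition variable; `Mailbox` and its methods are hypothetical and do not reflect the `Session` API or the async implementation in `multi_agents_v2::wait`.

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Hypothetical mailbox: a queue of pending mail plus a change notification.
#[derive(Default)]
pub struct Mailbox {
    pending: Mutex<Vec<String>>,
    changed: Condvar,
}

impl Mailbox {
    /// Queues mail and wakes any waiter.
    pub fn push(&self, mail: &str) {
        self.pending.lock().unwrap().push(mail.to_string());
        self.changed.notify_all();
    }

    /// Reports whether mail is already queued.
    pub fn has_pending_items(&self) -> bool {
        !self.pending.lock().unwrap().is_empty()
    }

    /// Returns true if mail is available before `timeout` elapses.
    pub fn wait_for_mail(&self, timeout: Duration) -> bool {
        let guard = self.pending.lock().unwrap();
        // `wait_timeout_while` evaluates the predicate before sleeping, so mail
        // queued before this call returns immediately instead of waiting for a
        // notification that will never arrive.
        let (guard, _timed_out) = self
            .changed
            .wait_timeout_while(guard, timeout, |pending| pending.is_empty())
            .unwrap();
        !guard.is_empty()
    }
}
```

A waiter that only subscribed to future notifications would, in the already-queued case, sleep for the full timeout, which is the behavior the commit removes.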
acrognale-oai	4f8c58f737	Support multiple cwd filters for thread list (#18502 ) ## Summary - Teach app-server `thread/list` to accept either a single `cwd` or an array of cwd filters, returning threads whose recorded session cwd matches any requested path - Add `useStateDbOnly` as an explicit opt-in fast path for callers that want to answer `thread/list` from SQLite without scanning JSONL rollout files - Preserve backwards compatibility: by default, `thread/list` still scans JSONL rollouts and repairs SQLite state - Wire the new cwd array and SQLite-only options through app-server, local/remote thread-store, rollout listing, generated TypeScript/schema fixtures, proto output, and docs ## Test Plan - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-rollout` - `cargo test -p codex-thread-store` - `cargo test -p codex-app-server thread_list` - `just fmt` - `just fix -p codex-app-server-protocol -p codex-rollout -p codex-thread-store -p codex-app-server` - `cargo build -p codex-cli --bin codex`	2026-04-22 06:10:09 -04:00
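Accepting either a single `cwd` or an array of cwd filters maps naturally onto a two-variant enum with "matches any" semantics. The sketch below is illustrative only: `CwdFilter` and `filter_threads` are hypothetical names, threads are reduced to `(thread_id, cwd)` pairs, and paths are compared as exact strings, whereas the real implementation may normalize them.

```rust
/// Hypothetical filter shape: one cwd, or several where any match suffices.
pub enum CwdFilter {
    Single(String),
    Many(Vec<String>),
}

impl CwdFilter {
    /// Returns true when the recorded session cwd equals any requested path.
    pub fn matches(&self, session_cwd: &str) -> bool {
        match self {
            CwdFilter::Single(path) => path == session_cwd,
            CwdFilter::Many(paths) => paths.iter().any(|path| path == session_cwd),
        }
    }
}

/// Returns the ids of threads whose cwd passes the filter. Threads are given
/// as `(thread_id, cwd)` pairs; no filter means every thread is returned.
pub fn filter_threads<'a>(
    threads: &[(&'a str, &'a str)],
    filter: Option<&CwdFilter>,
) -> Vec<&'a str> {
    let mut ids = Vec::new();
    for &(id, cwd) in threads {
        if filter.map_or(true, |f| f.matches(cwd)) {
            ids.push(id);
        }
    }
    ids
}
```

Treating the single-path form as a separate variant keeps older callers working unchanged while the array form adds the new behavior.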
jif-oai	b04ffeee4c	nit: expose lib (#18962 ) As a follow-up	2026-04-22 10:06:53 +01:00
rhan-oai	213b17b7a3	[codex-analytics] guardian review TTFT plumbing and emission (#17696 ) ## Why Guardian analytics includes time-to-first-token, but the Guardian reviewer runs as a normal Codex session and `TurnCompleteEvent` did not expose TTFT. The timing needs to flow through the standard turn-completion protocol so Guardian review analytics can consume the same value as the rest of the session machinery. ## What changed Adds optional `time_to_first_token_ms` to `TurnCompleteEvent` and populates it from `TurnTiming`. The value is carried through app-server thread history, rollout reconstruction, TUI/app-server adapters, and Guardian review session handling. Guardian review analytics now captures TTFT from the reviewer turn-complete event when available. Existing tests and fixtures are updated to set the new optional field to `None` where TTFT is not relevant. ## Verification - `cargo clippy -p codex-tui --tests -- -D warnings` - `cargo clippy -p codex-core --lib --tests -- -D warnings` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17696). * __->__ #17696 * #17695 * #17693 * #18278 * #18953	2026-04-22 01:52:48 -07:00
rhan-oai	37aadeaa13	[codex-analytics] guardian review truncation (#17695 ) ## Why The Guardian review event needs to report whether the action shown to Guardian was truncated. That field should come from the same truncation path used to build the Guardian prompt, rather than being inferred after the fact. ## What changed Plumbs truncation metadata through Guardian action formatting, prompt construction, review session execution, and analytics emission. `guardian_truncate_text` now reports both the rendered text and whether it inserted the truncation marker, and `reviewed_action_truncated` is set from that prompt-building result. This keeps the analytics field aligned with the model-visible reviewed action while preserving the existing Guardian prompt behavior. ## Verification - Guardian truncation tests cover both truncated and non-truncated action payloads. - Guardian review tests assert the review session metadata and truncation field are propagated. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17695). * #17696 * __->__ #17695 * #17693 * #18278 * #18953	2026-04-22 08:35:29 +00:00
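The idea in #17695 of reporting truncation from the same path that builds the prompt, rather than inferring it afterwards, can be sketched like this. The function name, the character-based limit, and the marker string are assumptions for illustration; the real `guardian_truncate_text` may differ in all three.

```rust
/// Hypothetical marker text; the actual Guardian marker is not shown in the commit message.
const TRUNCATION_MARKER: &str = "...[truncated]";

/// Returns the rendered text together with a flag saying whether the
/// truncation marker was inserted. The limit is counted in characters
/// so multi-byte text is never split mid-character.
fn truncate_with_flag(text: &str, max_chars: usize) -> (String, bool) {
    if text.chars().count() <= max_chars {
        return (text.to_string(), false);
    }
    let mut rendered: String = text.chars().take(max_chars).collect();
    rendered.push_str(TRUNCATION_MARKER);
    (rendered, true)
}
```

Because the flag is produced by the function that renders the text, an analytics field set from it cannot disagree with what the reviewer actually saw.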
rhan-oai	4e7399c6b9	[codex-analytics] guardian review analytics events emission (#17693 ) ## Why Guardian approvals now run as review sessions, but Codex analytics did not have a terminal event for those reviews. That made it hard to measure approval outcomes, failure modes, Guardian session reuse, model metadata, token usage, and timing separately from the parent turn. ## What changed Adds `codex_guardian_review` analytics emission for Guardian approval reviews. The event is emitted from the Guardian review path with review identity, target item id, approval request source, a PII-minimized reviewed-action shape, terminal decision/status, failure reason, Guardian assessment fields, Guardian session metadata, token usage, and timing metadata. The reviewed-action payload intentionally omits high-risk fields such as shell commands, working directories, argv, file paths, network targets/hosts, rationale, retry reason, and permission justifications. It also classifies prompt-build failures separately from Guardian session/runtime failures so fail-closed cases are distinguishable in analytics. ## Verification - Guardian review analytics tests cover terminal success, timeout/cancel/fail-closed paths, session metadata, and token usage plumbing. - `cargo clippy -p codex-core --lib --tests -- -D warnings` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17693). * #17696 * #17695 * __->__ #17693	2026-04-22 01:02:47 -07:00
Michael Bolin	5eab9ff8ca	app-server: expose thread permission profiles (#18278 ) ## Why The `PermissionProfile` migration needs app-server clients to see the same constrained permission model that core is using at runtime. Before this PR, thread lifecycle responses only exposed the legacy `SandboxPolicy` shape, so clients still had to infer active permissions from sandbox fields. That makes downstream resume, fork, and override flows harder to make `PermissionProfile`-first. External sandbox policies are intentionally excluded from this canonical view. External enforcement cannot be round-tripped as a `PermissionProfile`, and exposing a lossy root-write profile would let clients accidentally change sandbox semantics if they echo the profile back later. ## What changed - Adds the app-server v2 `PermissionProfile` wire shape, including filesystem permissions and glob scan depth metadata. - Adds `PermissionProfileNetworkPermissions` so the profile response does not expose active network state through the older additional-permissions naming. - Returns `permissionProfile` from thread start, resume, and fork responses when the active sandbox can be represented as a `PermissionProfile`. - Keeps legacy `sandbox` in those responses for compatibility and documents `permissionProfile` as canonical when present. - Makes lifecycle `permissionProfile` nullable and returns `null` for `ExternalSandbox` to avoid exposing a lossy profile. - Regenerates the app-server JSON schema and TypeScript fixtures. ## Verification - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-app-server thread_response_permission_profile_omits_external_sandbox -- --nocapture` - `cargo check --tests -p codex-analytics -p codex-exec -p codex-tui` - `just fix -p codex-app-server-protocol -p codex-app-server -p codex-analytics -p codex-exec -p codex-tui` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). 
Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18278). * #18279 * __->__ #18278	2026-04-21 23:52:56 -07:00
iceweasel-oai	3a451b6321	use long-lived sessions for codex sandbox windows (#18953 ) `codex sandbox windows` previously did a one-shot spawn for all commands. This change uses the `unified_exec` session to spawn long-lived processes instead, and implements a simple bridge to forward stdin to the spawned session and stdout/stderr from the spawned session back to the caller. It also fixes a bug with the new shared spawn context code where the "no-network env" was being applied to both elevated and unelevated sandbox spawns. It should only be applied for the unelevated sandbox because the elevated one uses firewall rules instead of an env-based network suppression strategy.	2026-04-22 06:39:29 +00:00
efrazer-oai	69c8913e24	feat: add explicit AgentIdentity auth mode (#18785 ) ## Summary This PR adds `CodexAuth::AgentIdentity` as an explicit auth mode. An AgentIdentity auth record is a standalone `auth.json` mode. When `AuthManager::auth().await` loads that mode, it registers one process-scoped task and stores it in runtime-only state on the auth value. Header creation stays synchronous after that because the task is initialized before callers receive the auth object. This PR also removes the old feature flag path. AgentIdentity is selected by explicit auth mode, not by a hidden flag or lazy mutation of ChatGPT auth records. Reference old stack: https://github.com/openai/codex/pull/17387/changes ## Design Decisions - AgentIdentity is a real auth enum variant because it can be the only credential in `auth.json`. - The process task is ephemeral runtime state. It is not serialized and is not stored in rollout/session data. - Account/user metadata needed by existing Codex backend checks lives on the AgentIdentity record for now. - `is_chatgpt_auth()` remains token-specific. - `uses_codex_backend()` is the broader predicate for ChatGPT-token auth and AgentIdentity auth. ## Stack 1. https://github.com/openai/codex/pull/18757: full revert 2. https://github.com/openai/codex/pull/18871: isolated Agent Identity crate 3. This PR: explicit AgentIdentity auth mode and startup task allocation 4. https://github.com/openai/codex/pull/18811: migrate Codex backend auth callsites through AuthProvider 5. https://github.com/openai/codex/pull/18904: accept AgentIdentity JWTs and load `CODEX_AGENT_IDENTITY` ## Testing Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.	2026-04-21 22:33:24 -07:00
Michael Bolin	0fef35dc3a	core: derive active permission profiles (#18277 ) ## Why `Permissions` should not store a separate `PermissionProfile` that can drift from the constrained `SandboxPolicy` and network settings. The active profile needs to be derived from the same constrained values that already honor `requirements.toml`. ## What changed This adds derivation of the active `PermissionProfile` from the constrained runtime permission settings and exposes that derived value through config snapshots and thread state. The app-server can then report the active profile without introducing a second source of truth. ## Verification - `cargo test -p codex-core --test all permissions_messages -- --nocapture` - `cargo test -p codex-core --test all request_permissions -- --nocapture` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18277). * #18288 * #18287 * #18286 * #18285 * #18284 * #18283 * #18282 * #18281 * #18280 * #18279 * #18278 * __->__ #18277	2026-04-21 22:11:40 -07:00
Celia Chen	51fdc35945	chore: remove unused Bedrock auth lazy loading (#18948 ) ## Summary The Bedrock Mantle SigV4 auth provider currently looks like it can lazily load `AwsAuthContext`, but the provider is only constructed after `resolve_auth_method` has already loaded that context. Because `with_context` always pre-populates the `OnceCell`, the `get_or_try_init` fallback is unused in normal operation and makes the provider lifecycle harder to reason about. This change removes that dead lazy-loading path and makes the actual behavior explicit: - `BedrockAuthMethod::AwsSdkAuth` carries only the resolved `AwsAuthContext`. - `BedrockMantleSigV4AuthProvider` stores the resolved context directly. - request signing uses the stored context without going through `OnceCell`. The existing eager AWS auth resolution behavior is unchanged; this is a simplification of the provider state, not a behavior change. ## Testing - `cargo shear` - `cargo test -p codex-model-provider` - `just bazel-lock-check`	2026-04-22 05:01:22 +00:00
Dylan Hurd	34800d717e	[codex] Clean guardian instructions (#18934 ) ## Summary - Keep the guardian policy installed as guardian base instructions. - Clear inherited parent `developer_instructions` for guardian review sessions. - Update guardian config tests to assert developer instructions are cleared and policy text is sourced from base instructions. ## Why Guardian review sessions are intended to run under an isolated guardian policy. Because the guardian config is cloned from the parent config, inherited custom or managed developer instructions could otherwise remain active and conflict with guardian review behavior. ## Validation - `just fmt` - `cargo test -p codex-core guardian_review_session_config` Co-authored-by: Codex <noreply@openai.com>	2026-04-21 21:47:58 -07:00
Michael Bolin	faed6d5c07	tests: serialize process-heavy Windows CI suites (#18943 ) ## Why A [Windows Cargo build](https://github.com/openai/codex/actions/runs/24754807756/job/72425641062) on `main` timed out in several unrelated-looking suites at the same time: - `codex-app-server` account tests failed before account logic, while `mcp.initialize()` was waiting for the first JSON-RPC response. - `codex-core` `apply_patch_cli` tests timed out while running full Codex/apply_patch turns. - `codex-windows-sandbox` legacy session tests timed out while creating restricted-token child processes and private desktops. The app-server log reached the test harness write path in [`McpProcess::initialize_with_params`](`731b54d08f/codex-rs/app-server/tests/common/mcp_process.rs (L244-L263)`), but never printed the matching stdout read from [`read_jsonrpc_message`](`731b54d08f/codex-rs/app-server/tests/common/mcp_process.rs (L1123-L1128)`). The server initialize handler is a small bookkeeping/response path ([`message_processor.rs`](`731b54d08f/codex-rs/app-server/src/message_processor.rs (L601-L728)`)), so the failure looks like Windows runner process/pipe scheduling starvation rather than account-specific behavior. ## What Changed This updates `.config/nextest.toml` to serialize two process-heavy sets: - `codex-core` tests matching `package(codex-core) & kind(test) & test(apply_patch_cli)` - `codex-windows-sandbox` tests matching `package(codex-windows-sandbox) & test(legacy_)` `codex-app-server` integration tests were already serialized inside their own package; this change reduces overlap with the other suites that were saturating the runner at the same time. ## Verification - `cargo nextest list --filterset "package(codex-core) & kind(test) & test(apply_patch_cli)"` - `cargo nextest list --filterset "package(codex-windows-sandbox) & test(legacy_)"` The Windows sandbox filter naturally lists no tests on macOS, but it validates the nextest filter/config syntax locally.	
2026-04-21 21:14:45 -07:00
Dylan Hurd	0e39614d87	chore(tui) debug-config guardian_policy_config (#18923 ) ## Summary List guardian_policy_config_source in `/debug-config` output ## Testing - [x] Ran locally	2026-04-21 21:00:23 -07:00
Eric Traut	c7e5a9d95e	Keep TUI status surfaces in sync (#18935 )	2026-04-21 20:39:23 -07:00
Michael Bolin	36f8bb4ffa	exec-server: carry filesystem sandbox profiles (#18276 ) ## Why The exec-server still needs platform sandbox inputs, but the migration should preserve the `PermissionProfile` that produced them. Keeping only the derived legacy sandbox map would keep `SandboxPolicy` as the effective abstraction and would make full-disk vs. restricted profiles harder to preserve as the permissions stack starts round-tripping profiles. `PermissionProfile` entries can also be cwd-sensitive (`:cwd`, `:project_roots`, relative globs), so the exec-server must carry the request sandbox cwd instead of resolving those entries against the long-lived exec-server process cwd. ## What changed `FileSystemSandboxContext` now carries `permissions: PermissionProfile` plus an optional `cwd`: - removed `sandboxPolicy`, `sandboxPolicyCwd`, `fileSystemSandboxPolicy`, and `additionalPermissions` - added `permissions` and `cwd` - kept the platform knobs `windowsSandboxLevel`, `windowsSandboxPrivateDesktop`, and `useLegacyLandlock` Core turn and apply-patch paths populate the context from the active runtime permissions and request cwd. Exec-server derives platform `SandboxPolicy`/`FileSystemSandboxPolicy` at the filesystem boundary, adds helper runtime reads there, and rejects cwd-dependent profiles that arrive without a cwd. The legacy `FileSystemSandboxContext::new(SandboxPolicy)` constructor now preserves the old workspace-write conversion semantics for compatibility tests/callers. ## Verification - `cargo test -p codex-exec-server` - `cargo test -p codex-exec-server sandbox_cwd -- --nocapture` - `cargo test -p codex-exec-server sandbox_context_new_preserves_legacy_workspace_write_read_only_subpaths -- --nocapture` - `cargo test -p codex-core --lib file_system_sandbox_context_uses_active_attempt -- --nocapture`	2026-04-21 20:22:28 -07:00
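The rule in #18276 that cwd-dependent profiles are rejected when they arrive without a cwd can be sketched as below. This is a simplified sketch under stated assumptions: profile entries are modeled as plain strings, other `:`-prefixed special paths are assumed not to depend on the cwd, and `is_cwd_sensitive` and `check_profile` are hypothetical names, not the exec-server API.

```rust
use std::path::Path;

/// Assumption for this sketch: an entry depends on the cwd when it is
/// `:cwd`, `:project_roots`, or a relative (non-special) path or glob.
fn is_cwd_sensitive(entry: &str) -> bool {
    entry == ":cwd"
        || entry == ":project_roots"
        || (!entry.starts_with(':') && !Path::new(entry).is_absolute())
}

/// Rejects a profile that contains a cwd-sensitive entry but carries no cwd.
/// On failure the offending entry is named in the error message.
fn check_profile(entries: &[&str], cwd: Option<&Path>) -> Result<(), String> {
    if cwd.is_some() {
        return Ok(());
    }
    match entries.iter().find(|e| is_cwd_sensitive(e)) {
        Some(entry) => Err(format!("entry `{entry}` requires a sandbox cwd")),
        None => Ok(()),
    }
}
```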
efrazer-oai	564860e8bd	refactor: add agent identity crate (#18871 ) ## Summary This PR adds `codex-agent-identity` as an isolated crate for Agent Identity business logic. The crate owns: - AgentAssertion construction. - Agent task registration. - private-key assertion signing. - bounded blocking HTTP for task registration. It does not wire AgentIdentity into `auth.json`, `AuthManager`, rollout state, or request callsites. That integration happens in later PRs. Reference old stack: https://github.com/openai/codex/pull/17387/changes ## Stack 1. https://github.com/openai/codex/pull/18757: full revert 2. This PR: isolated Agent Identity crate 3. https://github.com/openai/codex/pull/18785: explicit AgentIdentity auth mode and startup task allocation 4. https://github.com/openai/codex/pull/18811: migrate Codex backend auth callsites through AuthProvider 5. https://github.com/openai/codex/pull/18904: accept AgentIdentity JWTs and load `CODEX_AGENT_IDENTITY` ## Testing Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.	2026-04-21 19:57:49 -07:00
Michael Bolin	8fea372c77	Fix remote app-server shutdown race (#18936 ) ## Why A Mac Bazel CI run saw `remote_notifications_arrive_over_websocket` fail during shutdown with `remote app-server shutdown channel is closed` (https://app.buildbuddy.io/invocation/9dac05d6-ae20-40f9-b627-fca6e91cf127). The remote websocket worker can legitimately finish while `shutdown()` is waiting for the shutdown acknowledgement: after the test server sends a notification and exits, the worker may deliver the required disconnect event, observe that the caller has dropped the event receiver, and exit before it sends the shutdown one-shot. That state is already terminal cleanup, not a failed shutdown, so callers should not see a `BrokenPipe` from the acknowledgement channel. ## What Changed - Treat a closed remote shutdown acknowledgement as an already-exited worker while still propagating websocket close errors when the worker returns them. - Added a deterministic regression test for the interleaving where the shutdown command is received and the worker exits before replying. ## Verification - `cargo test -p codex-app-server-client` - New test: `remote::tests::shutdown_tolerates_worker_exit_after_command_is_queued`	2026-04-22 02:41:19 +00:00
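The fix described in #18936 treats a closed acknowledgement channel as "the worker already exited" while still surfacing real close errors. A minimal sketch of that decision, using `std::sync::mpsc` as a stand-in for whatever one-shot channel the client actually uses; `ShutdownError` and `await_shutdown_ack` are hypothetical names.

```rust
use std::sync::mpsc::Receiver;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum ShutdownError {
    WebsocketClose(String),
}

/// Waits for the worker's shutdown acknowledgement.
/// - If the worker replies, its result (including a close error) is returned.
/// - If the worker dropped the sender without replying, it has already
///   finished its cleanup, so that is reported as a successful shutdown.
fn await_shutdown_ack(ack: Receiver<Result<(), ShutdownError>>) -> Result<(), ShutdownError> {
    match ack.recv() {
        Ok(result) => result,
        Err(_sender_dropped) => Ok(()),
    }
}
```

The key design choice is that only errors the worker explicitly reports are propagated; the absence of a reply is not itself treated as a failure.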
xl-openai	a978e411f6	feat: Support remote plugin list/read. (#18452 ) Add a temporary internal remote_plugin feature flag that merges remote marketplaces into plugin/list and routes plugin/read through the remote APIs when needed, while keeping pure local marketplaces working as before. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-21 18:39:07 -07:00
Celia Chen	1cd3ad1f49	feat: add AWS SigV4 auth for OpenAI-compatible model providers (#17820 ) ## Summary Add first-class Amazon Bedrock Mantle provider support so Codex can keep using its existing Responses API transport with OpenAI-compatible AWS-hosted endpoints such as AOA/Mantle. This is needed for the AWS launch path, where provider traffic should authenticate with AWS credentials instead of OpenAI bearer credentials. Requests are authenticated immediately before transport send, so SigV4 signs the final method, URL, headers, and body bytes that `reqwest` will send. ## What Changed - Added a new `codex-aws-auth` crate for loading AWS SDK config, resolving credentials, and signing finalized HTTP requests with AWS SigV4. - Added a built-in `amazon-bedrock` provider that targets Bedrock Mantle Responses endpoints, defaults to `us-east-1`, supports region/profile overrides, disables WebSockets, and does not require OpenAI auth. - Added Amazon Bedrock auth resolution in `codex-model-provider`: prefer `AWS_BEARER_TOKEN_BEDROCK` when set, otherwise use AWS SDK credentials and SigV4 signing. - Added `AuthProvider::apply_auth` and `Request::prepare_body_for_send` so request-signing providers can sign the exact outbound request after JSON serialization/compression. - Determine the region by taking the `aws.region` config first (required for bearer token codepath), and fallback to SDK default region. ## Testing Amazon Bedrock Mantle Responses paths: - Built the local Codex binary with `cargo build`. - Verified the custom proxy-backed `aws` provider using `env_key = "AWS_BEARER_TOKEN_BEDROCK"` streamed raw `responses` output with `response.output_text.delta`, `response.completed`, and `mantle-env-ok`. - Verified a full `codex exec --profile aws` turn returned `mantle-env-ok`. 
- Confirmed the custom provider used the bearer env var, not AWS profile auth: bogus `AWS_PROFILE` still passed, empty env var failed locally, and malformed env var reached Mantle and failed with `401 invalid_api_key`. - Verified built-in `amazon-bedrock` with `AWS_BEARER_TOKEN_BEDROCK` set passed despite bogus AWS profiles, returning `amazon-bedrock-env-ok`. - Verified built-in `amazon-bedrock` SDK/SigV4 auth passed with `AWS_BEARER_TOKEN_BEDROCK` unset and temporary AWS session env credentials, returning `amazon-bedrock-sdk-env-ok`.	2026-04-22 01:11:17 +00:00
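The auth and region precedence described in #17820 (prefer `AWS_BEARER_TOKEN_BEDROCK` when set, otherwise SigV4; `aws.region` config first, then the SDK default) can be sketched as two small pure functions. The type and function names are hypothetical, and how an empty token value is handled is deliberately left out because the commit message does not specify it for the built-in provider.

```rust
/// Hypothetical result type for this sketch.
#[derive(Debug, PartialEq)]
enum BedrockAuth {
    /// Use the bearer token taken from the environment.
    BearerToken(String),
    /// Fall back to AWS SDK credentials and SigV4 request signing.
    SigV4,
}

/// `bearer_env` is the value of `AWS_BEARER_TOKEN_BEDROCK`, if the variable is set.
fn resolve_auth(bearer_env: Option<&str>) -> BedrockAuth {
    match bearer_env {
        Some(token) => BedrockAuth::BearerToken(token.to_string()),
        None => BedrockAuth::SigV4,
    }
}

/// The configured region wins; the SDK default is only consulted when it is absent.
fn resolve_region(config_region: Option<&str>, sdk_default: Option<&str>) -> Option<String> {
    config_region.or(sdk_default).map(str::to_string)
}
```

In a real program the first argument would come from something like `std::env::var("AWS_BEARER_TOKEN_BEDROCK").ok()`; keeping the functions pure makes the precedence easy to test.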
Michael Bolin	e18fe7a07f	test(core): move prompt debug coverage to integration suite (#18916 ) ## Why `build_prompt_input` now initializes `ExecServerRuntimePaths`, which requires a configured Codex executable path. The previous inline unit test in `core/src/prompt_debug.rs` built a bare `test_config()` and then failed before it could assert anything useful: ```text Codex executable path is not configured ``` This coverage is also integration-shaped: it drives the public `build_prompt_input` entry point through config, thread, and session setup rather than testing a small internal helper in isolation. Bazel CI did not catch this earlier because the affected test was behind the same wrapped Rust unit-test path fixed by #18913. Before that launcher/sharding fix, the outer `workspace_root_test` changed the working directory for Insta compatibility while the inner `rules_rust` sharding wrapper still expected its runfiles working directory. In practice, Bazel could report success without executing the Rust test cases in that shard. Once #18913 makes the wrapper run the Rust test binary directly and shard with libtest arguments, this stale unit test actually runs and exposes the missing `codex_self_exe` setup. ## What Changed - Moved `build_prompt_input_includes_context_and_user_message` out of `core/src/prompt_debug.rs`. - Added `core/tests/suite/prompt_debug_tests.rs` and registered it from `core/tests/suite/mod.rs`. - Builds the test config with `ConfigBuilder` and provides `codex_self_exe` using the current test executable, matching the runtime-path invariant required by prompt debug setup. - Preserves the existing assertions that the generated prompt input includes both the debug user message and project-specific user instructions. 
## Verification - `cargo test -p codex-core --test all prompt_debug_tests::build_prompt_input_includes_context_and_user_message` - `bazel test //codex-rs/core:core-all-test --test_arg=prompt_debug_tests::build_prompt_input_includes_context_and_user_message --test_output=errors` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18916). * #18913 * __->__ #18916	2026-04-22 01:08:25 +00:00
Felipe Coury	09ebc34f17	fix(core): emit hooks for apply_patch edits (#18391 ) Fixes https://github.com/openai/codex/issues/16732. ## Why `apply_patch` is Codex's primary file edit path, but it was not emitting `PreToolUse` or `PostToolUse` hook events. That meant hook-based policy, auditing, and write coordination could observe shell commands while missing the actual file mutation performed by `apply_patch`. The issue also exposed that the hook runtime serialized command hook payloads with `tool_name: "Bash"` unconditionally. Even if `apply_patch` supplied hook payloads, hooks would either fail to match it directly or receive misleading stdin that identified the edit as a Bash tool call. ## What Changed - Added `PreToolUse` and `PostToolUse` payload support to `ApplyPatchHandler`. - Exposed the raw patch body as `tool_input.command` for both JSON/function and freeform `apply_patch` calls. - Taught tool hook payloads to carry a handler-supplied hook-facing `tool_name`. - Preserved existing shell compatibility by continuing to emit `Bash` for shell-like tools. - Serialized the selected hook `tool_name` into hook stdin instead of hardcoding `Bash`. - Relaxed the generated hook command input schema so `tool_name` can represent tools other than `Bash`. ## Verification Added focused handler coverage for: - JSON/function `apply_patch` calls producing a `PreToolUse` payload. - Freeform `apply_patch` calls producing a `PreToolUse` payload. - Successful `apply_patch` output producing a `PostToolUse` payload. - Shell and `exec_command` handlers continuing to expose `Bash`. Added end-to-end hook coverage for: - A `PreToolUse` hook matching `^apply_patch$` blocking the patch before the target file is created. - A `PostToolUse` hook matching `^apply_patch$` receiving the patch input and tool response, then adding context to the follow-up model request. - Non-participating tools such as the plan tool continuing not to emit `PreToolUse`/`PostToolUse` hook events. 
Also validated manually with a live `codex exec` smoke test using an isolated temp workspace and temp `CODEX_HOME`. The smoke test confirmed that a real `apply_patch` edit emits `PreToolUse`/`PostToolUse` with `tool_name: "apply_patch"`, a shell command still emits `tool_name: "Bash"`, and a denying `PreToolUse` hook prevents the blocked patch file from being created.	2026-04-21 22:00:40 -03:00
starr-openai	1d4cc494c9	Add turn-scoped environment selections (#18416 ) ## Summary - add experimental turn/start.environments params for per-turn environment id + cwd selections - pass selections through core protocol ops and resolve them with EnvironmentManager before TurnContext creation - treat omitted selections as the default behavior, an empty selection list as no environment, and a non-empty list as selecting its first environment/cwd as the turn primary ## Testing - ran `just fmt` - ran `just write-app-server-schema` - not run: unit tests for this stacked PR --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-21 17:48:33 -07:00
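The three cases in #18416 (omitted, empty, non-empty selections) map naturally onto an optional slice. A minimal sketch, assuming hypothetical `EnvironmentSelection` and `TurnEnvironment` types; the real protocol types and the `EnvironmentManager` resolution step are not reproduced here.

```rust
/// Hypothetical selection shape: an environment id plus a cwd.
#[derive(Debug, Clone, PartialEq)]
struct EnvironmentSelection {
    environment_id: String,
    cwd: String,
}

/// Hypothetical outcome of resolving the per-turn selections.
#[derive(Debug, PartialEq)]
enum TurnEnvironment {
    /// Selections were omitted: keep the default behavior.
    Default,
    /// An empty list was sent: the turn runs with no environment.
    NoEnvironment,
    /// A non-empty list was sent: its first entry is the turn primary.
    Primary(EnvironmentSelection),
}

fn resolve_turn_environment(selections: Option<&[EnvironmentSelection]>) -> TurnEnvironment {
    match selections {
        None => TurnEnvironment::Default,
        Some([]) => TurnEnvironment::NoEnvironment,
        Some([first, ..]) => TurnEnvironment::Primary(first.clone()),
    }
}
```

Distinguishing `None` from an empty slice is what lets a client say "no environment" explicitly without changing behavior for clients that never send the field.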
Michael Bolin	6368f506b7	fix: windows snapshot for external_agent_config_migration::tests::prompt_snapshot did not match windows output (#18915 ) Fix a snapshot test that is failing on Windows, but is currently missed by Bazel due to https://github.com/openai/codex/pull/18913. We see this failing on Cargo builds on Windows, though. This Bazel vs. Cargo inconsistency explains why https://github.com/openai/codex/pull/18768 did not fix the Cargo Windows build.	2026-04-22 00:32:46 +00:00
Michael Bolin	799e50412e	sandboxing: materialize cwd-relative permission globs (#18867 ) ## Why #18275 anchors session-scoped `:cwd` and `:project_roots` grants to the request cwd before recording them for reuse. Relative deny glob entries need the same treatment. Without anchoring, a stored session permission can keep a pattern such as `*/.env` relative, then reinterpret that deny against a later turn cwd. That makes the persisted profile depend on the cwd at reuse time instead of the cwd that was reviewed and approved. ## What changed `intersect_permission_profiles` now materializes retained `FileSystemPath::GlobPattern` entries against the request cwd, matching the existing materialization for cwd-sensitive special paths. Materialized accepted grants are now deduplicated before deny retention runs. This keeps the sticky-grant preapproval shape stable when a repeated request is merged with the stored grant and both `:cwd = write` and the materialized absolute cwd write are present. The preapproval check compares against the same materialized form, so a later request for the same cwd-relative deny glob still matches the stored anchored grant instead of re-prompting or rejecting. Tests cover both the storage path and the preapproval path: a session-scoped `:cwd = write` grant with `*/.env = none` is stored with both the cwd write and deny glob anchored to the original request cwd, cannot be reused from a later cwd, and remains preapproved when re-requested from the original cwd after merging with the stored grant. ## Verification - `cargo test -p codex-sandboxing policy_transforms` - `cargo test -p codex-core --lib relative_deny_glob_grants_remain_preapproved_after_materialization` - `cargo clippy -p codex-sandboxing --tests -- -D clippy::redundant_clone` - `cargo clippy -p codex-core --lib -- -D clippy::redundant_clone` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). 
Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18867). * #18288 * #18287 * #18286 * #18285 * #18284 * #18283 * #18282 * #18281 * #18280 * #18279 * #18278 * #18277 * #18276 * __->__ #18867	2026-04-21 17:28:58 -07:00
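The anchoring step in #18867 (materialize relative glob entries against the request cwd, then deduplicate) can be sketched as below. This is a sketch under stated assumptions: patterns are plain strings, paths are Unix-style, and `materialize_glob` and `materialize_and_dedup` are hypothetical names rather than the `intersect_permission_profiles` internals.

```rust
use std::path::Path;

/// Anchors a relative glob pattern to `request_cwd`; absolute patterns are kept as-is.
fn materialize_glob(pattern: &str, request_cwd: &Path) -> String {
    if Path::new(pattern).is_absolute() {
        pattern.to_string()
    } else {
        request_cwd.join(pattern).to_string_lossy().into_owned()
    }
}

/// Materializes every pattern and drops duplicates while preserving order,
/// so a relative pattern and its already-anchored form collapse to one entry.
fn materialize_and_dedup(patterns: &[&str], request_cwd: &Path) -> Vec<String> {
    let mut out: Vec<String> = Vec::new();
    for pattern in patterns {
        let anchored = materialize_glob(pattern, request_cwd);
        if !out.contains(&anchored) {
            out.push(anchored);
        }
    }
    out
}
```

Once stored in anchored form, the pattern no longer changes meaning when a later turn runs from a different cwd, which is the drift the commit message describes.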
canvrno-oai	37701d4654	Update /statusline and /title snapshots (#18909 ) Update `/statusline` and `/title` snapshots	2026-04-21 17:16:50 -07:00


5468 Commits