codex

mirror of https://github.com/openai/codex.git synced 2026-05-17 01:32:32 +00:00

Author	SHA1	Message	Date
Dylan Hurd	af089fb21d	fix(exec_policy) heredoc parsing file_redirect (#20113 ) ## Summary Fixes a regression introduced in #10941 so that heredocs do not permit file redirects to be approved by rules, and adds scenario tests to cover this behavior. Previously, heredoc command parsing would allow redirects and environment variables: ```bash # commands_for_exec_policy() would parse this via parse_shell_lc_single_command_prefix PATH=/tmp/bad:$PATH cat <<'EOF' > /tmp/bad/hello.txt hello EOF ``` This conflicts with the Codex Rules documentation; heredoc parsing logic should abide by the same strictness of parsing. ## Tests - [x] Updated unit tests accordingly - [x] Added scenario tests for these cases --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-01 01:05:02 +00:00
iceweasel-oai	4f96001fa7	execpolicy: unwrap PowerShell -Command wrappers on Windows (#20336 ) ## Why On Windows, Codex runs shell commands through a top-level `powershell.exe -NoProfile -Command ...` wrapper. `execpolicy` was matching that wrapper instead of the inner command, so prefix rules like `["git", "push"]` did not fire for PowerShell-wrapped commands even though the same normalization already happens for `bash -lc` on Unix. This change makes the Windows shell wrapper transparent to rule matching while preserving the existing Windows unmatched-command safelist and dangerous-command heuristics. ## What changed - add `parse_powershell_command_plain_commands()` in `shell-command/src/powershell.rs` to unwrap the top-level PowerShell `-Command` body with `extract_powershell_command()` and parse it with the existing PowerShell AST parser - update `core/src/exec_policy.rs` so `commands_for_exec_policy()` treats top-level PowerShell wrappers like `bash -lc` and evaluates rules against the parsed inner commands - carry a small `ExecPolicyCommandOrigin` through unmatched-command evaluation and expose `is_safe_powershell_words()` / `is_dangerous_powershell_words()` so Windows safelist and dangerous-command checks still work after unwrap - add Windows-focused tests for wrapped PowerShell prompt/allow matches, wrapper parsing, and unmatched safe/dangerous inner commands, and re-enable the end-to-end `execpolicy_blocks_shell_invocation` test on Windows ## Testing - `cargo test -p codex-shell-command`	2026-05-01 00:56:20 +00:00
Michael Bolin	2cb8746457	permissions: remove core legacy policy round trips (#19394 ) ## Why Several execution paths still converted profile-backed permissions into `SandboxPolicy` and then rebuilt runtime permissions from that legacy shape. Those round trips are unnecessary after the preceding PRs and can lose split filesystem semantics. Core approval and escalation should carry the resolved profile directly. ## What Changed - Removes `sandbox_policy` from `ResolvedPermissionProfile`; the resolved permission object now carries the canonical `PermissionProfile` directly. - Updates exec-policy fallback, shell/unified-exec interception, escalation reruns, and related tests to pass profiles instead of legacy policies. - Removes legacy additional-permission merge helpers that built an effective `SandboxPolicy` before rebuilding runtime permissions. - Keeps legacy projections only at compatibility boundaries that still require `SandboxPolicy`, not in core permission computation. ## Verification - `cargo test -p codex-core direct_write_roots` - `cargo test -p codex-core runtime_roots_to_legacy_projection` - `cargo test -p codex-app-server requested_permissions_trust_project_uses_permission_profile_intent` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19394). * #19737 * #19736 * #19735 * #19734 * #19395 * __->__ #19394	2026-04-26 17:43:32 -07:00
pakrym-oai	9c3abcd46c	[codex] Move config loading into codex-config (#19487 ) ## Why Config loading had become split across crates: `codex-config` owned the config types and merge logic, while `codex-core` still owned the loader that assembled the layer stack. This change consolidates that responsibility in `codex-config`, so the crate that defines config behavior also owns how configs are discovered and loaded. To make that move possible without reintroducing the old dependency cycle, the shell-environment policy types and helpers that `codex-exec-server` needs now live in `codex-protocol` instead of flowing through `codex-config`. This also makes the migrated loader tests more deterministic on machines that already have managed or system Codex config installed by letting tests override the system config and requirements paths instead of reading the host's `/etc/codex`. ## What Changed - moved the config loader implementation from `codex-core` into `codex-config::loader` and deleted the old `core::config_loader` module instead of leaving a compatibility shim - moved shell-environment policy types and helpers into `codex-protocol`, then updated `codex-exec-server` and other downstream crates to import them from their new home - updated downstream callers to use loader/config APIs from `codex-config` - added test-only loader overrides for system config and requirements paths so loader-focused tests do not depend on host-managed config state - cleaned up now-unused dependency entries and platform-specific cfgs that were surfaced by post-push CI ## Testing - `cargo test -p codex-config` - `cargo test -p codex-core config_loader_tests::` - `cargo test -p codex-protocol -p codex-exec-server -p codex-cloud-requirements -p codex-rmcp-client --lib` - `cargo test --lib -p codex-app-server-client -p codex-exec` - `cargo test --no-run --lib -p codex-app-server` - `cargo test -p codex-linux-sandbox --lib` - `cargo shear` - `just bazel-lock-check` ## Notes - I did not chase 
unrelated full-suite failures outside the migrated loader surface. - `cargo test -p codex-core --lib` still hits unrelated proxy-sensitive failures on this machine, and Windows CI still shows unrelated long-running/timeouting test noise outside the loader migration itself.	2026-04-26 15:10:53 -07:00
Michael Bolin	5d5d610740	refactor: use semaphores for async serialization gates (#18403 ) This is the second cleanup in the await-holding lint stack. The higher-level goal, following https://github.com/openai/codex/pull/18178 and https://github.com/openai/codex/pull/18398, is to enable Clippy coverage for guards held across `.await` points without carrying broad suppressions. The stack is working toward enabling Clippy's [`await_holding_lock`](https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_lock) lint and the configurable [`await_holding_invalid_type`](https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_invalid_type) lint for Tokio guard types. Several existing fields used `tokio::sync::Mutex<()>` only as one-at-a-time async gates. Those guards intentionally lived across `.await` while an operation was serialized. A mutex over `()` suggests protected data and trips the await-holding lint shape; a single-permit `tokio::sync::Semaphore` expresses the intended serialization directly. ## What changed - Replace `Mutex<()>` serialization gates with `Semaphore::new(1)` for agent identity ensure, exec policy updates, guardian review session reuse, plugin remote sync, managed network proxy refresh, auth token refresh, and RMCP session recovery. - Update call sites from `lock().await` / `try_lock()` to `acquire().await` / `try_acquire()`. - Map closed-semaphore errors into the existing local error types, even though these semaphores are owned for the lifetime of their managers. - Update session test builders for the new `managed_network_proxy_refresh_lock` type. ## Verification - The split stack was verified at the final lint-enabling head with `just clippy`. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18403). * #18698 * #18423 * #18418 * __->__ #18403	2026-04-20 17:21:29 +00:00
jif-oai	fc758af9eb	fix: exec policy loading for sub-agents (#18654)	2026-04-20 11:51:58 +01:00
jif-oai	be4fe9f9b2	feat: add `--ignore-user-config` and `--ignore-rules` (#18646) Add these two flags so that a run of `codex exec` can be fully isolated from any rules or tools. This will be used by Chronicle.	2026-04-20 11:27:47 +01:00
viyatb-oai	370bed4bf4	fix: trust-gate project hooks and exec policies (#14718 ) ## Summary - trust-gate project `.codex` layers consistently, including repos that have `.codex/hooks.json` or `.codex/execpolicy/*.rules` but no `.codex/config.toml` - keep disabled project layers in the config stack so nested trusted project layers still resolve correctly, while preventing hooks and exec policies from loading until the project is trusted - update app-server/TUI onboarding copy to make the trust boundary explicit and add regressions for loader, hooks, exec-policy, and onboarding coverage ## Security Before this change, an untrusted repo could auto-load project hooks or exec policies from `.codex/` as long as `config.toml` was absent. This makes trust the single gate for project-local config, hooks, and exec policies. ## Stack - Parent of #15936 ## Test - cargo test -p codex-core without_config_toml --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-17 17:56:58 -07:00
Dylan Hurd	fe7c959e90	fix(exec-policy) rules parsing (#18126) ## Summary See scenarios: rules must always be enforced on every command in the string. ## Testing - [x] Added ExecApprovalRequirementScenario tests	2026-04-16 21:18:39 -07:00
pakrym-oai	f1a2b920f9	[codex] Make AbsolutePathBuf joins infallible (#16981) Having to check for errors every time `join` is called is painful and unnecessary.	2026-04-07 10:52:08 -07:00
Michael Bolin	aa2403e2eb	core: remove cross-crate re-exports from lib.rs (#16512 ) ## Why `codex-core` was re-exporting APIs owned by sibling `codex-` crates, which made downstream crates depend on `codex-core` as a proxy module instead of the actual owner crate. Removing those forwards makes crate boundaries explicit and lets leaf crates drop unnecessary `codex-core` dependencies. In this PR, this reduces the dependency on `codex-core` to `codex-login` in the following files: ``` codex-rs/backend-client/Cargo.toml codex-rs/mcp-server/tests/common/Cargo.toml ``` ## What - Remove `codex-rs/core/src/lib.rs` re-exports for symbols owned by `codex-login`, `codex-mcp`, `codex-rollout`, `codex-analytics`, `codex-protocol`, `codex-shell-command`, `codex-sandboxing`, `codex-tools`, and `codex-utils-path`. - Delete the `default_client` forwarding shim in `codex-rs/core`. - Update in-crate and downstream callsites to import directly from the owning `codex-` crate. - Add direct Cargo dependencies where callsites now target the owner crate, and remove `codex-core` from `codex-rs/backend-client`.	2026-04-01 23:06:24 -07:00
Dylan Hurd	60c59a7799	fix(core) disable command_might_be_dangerous when unsandboxed (#15036 ) ## Summary If we are in a mode that is already explicitly un-sandboxed, then `ApprovalPolicy::Never` should not block dangerous commands. ## Testing - [x] Existing unit test covers old behavior - [x] Added a unit test for this new case	2026-03-21 01:28:25 +00:00
Dylan Hurd	84f4e7b39d	fix(subagents) share execpolicy by default (#13702 ) ## Summary If a subagent requests approval, and the user persists that approval to the execpolicy, it should (by default) propagate. We'll need to rethink this a bit in light of coming Permissions changes, though I think this is closer to the end state that we'd want, which is that execpolicy changes to one permissions profile should be synced across threads. ## Testing - [x] Added integration test --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-18 06:42:26 +00:00
Michael Bolin	b77fe8fefe	Apply argument comment lint across codex-rs (#14652 ) ## Why Once the repo-local lint exists, `codex-rs` needs to follow the checked-in convention and CI needs to keep it from drifting. This commit applies the fallback `/param/` style consistently across existing positional literal call sites without changing those APIs. The longer-term preference is still to avoid APIs that require comments by choosing clearer parameter types and call shapes. This PR is intentionally the mechanical follow-through for the places where the existing signatures stay in place. After rebasing onto newer `main`, the rollout also had to cover newly introduced `tui_app_server` call sites. That made it clear the first cut of the CI job was too expensive for the common path: it was spending almost as much time installing `cargo-dylint` and re-testing the lint crate as a representative test job spends running product tests. The CI update keeps the full workspace enforcement but trims that extra overhead from ordinary `codex-rs` PRs. ## What changed - keep a dedicated `argument_comment_lint` job in `rust-ci` - mechanically annotate remaining opaque positional literals across `codex-rs` with exact `/param/` comments, including the rebased `tui_app_server` call sites that now fall under the lint - keep the checked-in style aligned with the lint policy by using `/param/` and leaving string and char literals uncommented - cache `cargo-dylint`, `dylint-link`, and the relevant Cargo registry/git metadata in the lint job - split changed-path detection so the lint crate's own `cargo test` step runs only when `tools/argument-comment-lint/` or `rust-ci.yml` changes - continue to run the repo wrapper over the `codex-rs` workspace, so product-code enforcement is unchanged Most of the code changes in this commit are intentionally mechanical comment rewrites or insertions driven by the lint itself. 
## Verification - `./tools/argument-comment-lint/run.sh --workspace` - `cargo test -p codex-tui-app-server -p codex-tui` - parsed `.github/workflows/rust-ci.yml` locally with PyYAML --- -> #14652 * #14651	2026-03-16 16:48:15 -07:00
Owen Lin	014e19510d	feat(app-server, core): add more spans (#14479) ## Description This PR expands tracing coverage across app-server thread startup, core session initialization, and the Responses transport layer. It also gives core dispatch spans stable operation-specific names so traces are easier to follow than the old generic `submission_dispatch` spans. It also uses `fmt::Display` for types that we serialize in traces, so we send strings instead of Rust types.	2026-03-13 13:16:33 -07:00
Jack Mousseau	b7dba72dbd	Rename reject approval policy to granular (#14516)	2026-03-12 16:38:04 -07:00
Michael Bolin	0c8a36676a	fix: move inline codex-rs/core unit tests into sibling files (#14444 ) ## Why PR #13783 moved the `codex.rs` unit tests into `codex_tests.rs`. This applies the same extraction pattern across the rest of `codex-rs/core` so the production modules stay focused on runtime code instead of large inline test blocks. Keeping the tests in sibling files also makes follow-up edits easier to review because product changes no longer have to share a file with hundreds or thousands of lines of test scaffolding. ## What changed - replaced each inline `mod tests { ... }` in `codex-rs/core/src/*` with a path-based module declaration - moved each extracted unit test module into a sibling `_tests.rs` file, using `mod_tests.rs` for `mod.rs` modules - preserved the existing `cfg(...)` guards and module-local structure so the refactor remains structural rather than behavioral ## Testing - `cargo test -p codex-core --lib` (`1653 passed; 0 failed; 5 ignored`) - `just fix -p codex-core` - `cargo fmt --check` - `cargo shear`	2026-03-12 08:16:36 -07:00
viyatb-oai	c2d5458d67	fix: align core approvals with split sandbox policies (#14171 ) ## Stack fix: fail closed for unsupported split windows sandboxing #14172 fix: preserve split filesystem semantics in linux sandbox #14173 -> fix: align core approvals with split sandbox policies #14171 refactor: centralize filesystem permissions precedence #14174 ## Why This PR Exists This PR is intentionally narrower than the title may suggest. Most of the original split-permissions migration already landed in the earlier `#13434 -> #13453` stack. In particular: - `#13439` already did the broad runtime plumbing for split filesystem and network policies. - `#13445` already moved `apply_patch` safety onto filesystem-policy semantics. - `#13448` already switched macOS Seatbelt generation to split policies. - `#13449` and `#13453` already handled Linux helper and bubblewrap enforcement. - `#13440` already introduced the first protocol-side helpers for deriving effective filesystem access. The reason this PR still exists is that after the follow-on `[permissions]` work and the new shared precedence helper in `#14174`, a few core approval paths were still deciding behavior from the legacy `SandboxPolicy` projection instead of the split filesystem policy that actually carries the carveouts. That means this PR is mostly a cleanup and alignment pass over the remaining core consumers, not a fresh sandbox backend migration. 
## What Is Actually New Here - make unmatched-command fallback decisions consult `FileSystemSandboxPolicy` instead of only legacy `DangerFullAccess` / `ReadOnly` / `WorkspaceWrite` categories - thread `file_system_sandbox_policy` into the shell, unified-exec, and intercepted-exec approval paths so they all use the same split-policy semantics - keep `apply_patch` safety on the same effective-access rules as the shared protocol helper, rather than letting it drift through compatibility projections - add loader-level regression coverage proving legacy `sandbox_mode` config still builds split policies and round-trips back without semantic drift ## What This PR Does Not Do This PR does not introduce new platform backend enforcement on its own. - Linux backend parity remains in `#14173`. - Windows fail-closed handling remains in `#14172`. - The shared precedence/model changes live in `#14174`. ## Files To Focus On - `core/src/exec_policy.rs`: unmatched-command fallback and approval rendering now read the split filesystem policy directly - `core/src/tools/sandboxing.rs`: default exec-approval requirement keys off `FileSystemSandboxPolicy.kind` - `core/src/tools/handlers/shell.rs`: shell approval requests now carry the split filesystem policy - `core/src/unified_exec/process_manager.rs`: unified-exec approval requests now carry the split filesystem policy - `core/src/tools/runtimes/shell/unix_escalation.rs`: intercepted exec fallback now uses the same split-policy approval semantics - `core/src/safety.rs`: `apply_patch` safety keeps using effective filesystem access rather than legacy sandbox categories - `core/src/config/config_tests.rs`: new regression coverage for legacy `sandbox_mode` no-drift behavior through the split-policy loader ## Notes - `core/src/codex.rs` and `core/src/codex_tests.rs` are just small fallout updates for `RequestPermissionsResponse.scope`; they are not the point of the PR. 
- If you reviewed the earlier `#13439` / `#13445` stack, the main review question here is simply: “are there any remaining approval or patch-safety paths that still reconstruct semantics from legacy `SandboxPolicy` instead of consuming the split filesystem policy directly?” ## Testing - cargo test -p codex-core legacy_sandbox_mode_config_builds_split_policies_without_drift - cargo test -p codex-core request_permissions - cargo test -p codex-core intercepted_exec_policy - cargo test -p codex-core restricted_sandbox_requires_exec_approval_on_request - cargo test -p codex-core unmatched_on_request_uses_split_filesystem_policy_for_escalation_prompts - cargo test -p codex-core explicit_ - cargo clippy -p codex-core --tests -- -D warnings	2026-03-12 02:23:22 +00:00
Celia Chen	c1a424691f	chore: add a separate reject-policy flag for skill approvals (#14271 ) ## Summary - add `skill_approval` to `RejectConfig` and the app-server v2 `AskForApproval::Reject` payload so skill-script prompts can be configured independently from sandbox and rule-based prompts - update Unix shell escalation to reject prompts based on the actual decision source, keeping prefix rules tied to `rules`, unmatched command fallbacks tied to `sandbox_approval`, and skill scripts tied to `skill_approval` - regenerate the affected protocol/config schemas and expand unit/integration coverage for the new flag and skill approval behavior	2026-03-11 12:33:09 -07:00
Dylan Hurd	6da84efed8	feat(approvals) RejectConfig for request_permissions (#14118) ## Summary We need to support allowing `request_permissions` calls when using the `Reject` policy. Note that this is a backwards-incompatible change for the `Reject` policy. I'm not sure whether we need to add a default based on our current use/setup. ## Testing - [x] Added tests - [x] Tested locally	2026-03-09 18:16:54 -07:00
Charley Cunningham	e84ee33cc0	Add guardian approval MVP (#13692 ) ## Summary - add the guardian reviewer flow for `on-request` approvals in command, patch, sandbox-retry, and managed-network approval paths - keep guardian behind `features.guardian_approval` instead of exposing a public `approval_policy = guardian` mode - route ordinary `OnRequest` approvals to the guardian subagent when the feature is enabled, without changing the public approval-mode surface ## Public model - public approval modes stay unchanged - guardian is enabled via `features.guardian_approval` - when that feature is on, `approval_policy = on-request` keeps the same approval boundaries but sends those approval requests to the guardian reviewer instead of the user - `/experimental` only persists the feature flag; it does not rewrite `approval_policy` - CLI and app-server no longer expose a separate `guardian` approval mode in this PR ## Guardian reviewer - the reviewer runs as a normal subagent and reuses the existing subagent/thread machinery - it is locked to a read-only sandbox and `approval_policy = never` - it does not inherit user/project exec-policy rules - it prefers `gpt-5.4` when the current provider exposes it, otherwise falls back to the parent turn's active model - it fail-closes on timeout, startup failure, malformed output, or any other review error - it currently auto-approves only when `risk_score < 80` ## Review context and policy - guardian mirrors `OnRequest` approval semantics rather than introducing a separate approval policy - explicit `require_escalated` requests follow the same approval surface as `OnRequest`; the difference is only who reviews them - managed-network allowlist misses that enter the approval flow are also reviewed by guardian - the review prompt includes bounded recent transcript history plus recent tool call/result evidence - transcript entries and planned-action strings are truncated with explicit `<guardian_truncated ... 
/>` markers so large payloads stay bounded - apply-patch reviews include the full patch content (without duplicating the structured `changes` payload) - the guardian request layout is snapshot-tested using the same model-visible Responses request formatter used elsewhere in core ## Guardian network behavior - the guardian subagent inherits the parent session's managed-network allowlist when one exists, so it can use the same approved network surface while reviewing - exact session-scoped network approvals are copied into the guardian session with protocol/port scope preserved - those copied approvals are now seeded before the guardian's first turn is submitted, so inherited approvals are available during any immediate review-time checks ## Out of scope / follow-ups - the sandbox-permission validation split was pulled into a separate PR and is not part of this diff - a future follow-up can enable `serde_json` preserve-order in `codex-core` and then simplify the guardian action rendering further --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-07 05:40:10 -08:00
Celia Chen	b0ce16c47a	fix(core): respect reject policy by approval source for skill scripts (#13816 ) ## Summary - distinguish reject-policy handling for prefix-rule approvals versus sandbox approvals in Unix shell escalation - keep prompting for skill-script execution when `rules=true` but `sandbox_approval=false`, instead of denying the command up front - add regression coverage for both skill-script reject-policy paths in `codex-rs/core/tests/suite/skill_approval.rs`	2026-03-06 21:43:14 -08:00
Charley Cunningham	cb1a182bbe	Clarify sandbox permission override helper semantics (#13703 ) ## Summary Today `SandboxPermissions::requires_additional_permissions()` does not actually mean "is `WithAdditionalPermissions`". It returns `true` for any non-default sandbox override, including `RequireEscalated`. That broad behavior is relied on in multiple `main` callsites. The naming is security-sensitive because `SandboxPermissions` is used on shell-like tool calls to tell the executor how a single command should relate to the turn sandbox: - `UseDefault`: run with the turn sandbox unchanged - `RequireEscalated`: request execution outside the sandbox - `WithAdditionalPermissions`: stay sandboxed but widen permissions for that command only ## Problem The old helper name reads as if it only applies to the `WithAdditionalPermissions` variant. In practice it means "this command requested any explicit sandbox override." That ambiguity made it easy to read production checks incorrectly and made the guardian change look like a standalone `main` fix when it is not. On `main` today: - `shell` and `unified_exec` intentionally reject any explicit `sandbox_permissions` request unless approval policy is `OnRequest` - `exec_policy` intentionally treats any explicit sandbox override as prompt-worthy in restricted sandboxes - tests intentionally serialize both `RequireEscalated` and `WithAdditionalPermissions` as explicit sandbox override requests So changing those callsites from the broad helper to a narrow `WithAdditionalPermissions` check would be a behavior change, not a pure cleanup. 
## What This PR Does - documents `SandboxPermissions` as a per-command sandbox override, not a generic permissions bag - adds `requests_sandbox_override()` for the broad meaning: anything except `UseDefault` - adds `uses_additional_permissions()` for the narrow meaning: only `WithAdditionalPermissions` - keeps `requires_additional_permissions()` as a compatibility alias to the broad meaning for now - updates the current broad callsites to use the accurately named broad helper - adds unit coverage that locks in the semantics of all three helpers ## What This PR Does Not Do This PR does not change runtime behavior. That is intentional. --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-06 09:57:48 -08:00
Michael Bolin	6a673e7339	core: resolve host_executable() rules during preflight (#13065 ) ## Why [#12964](https://github.com/openai/codex/pull/12964) added `host_executable()` support to `codex-execpolicy`, and [#13046](https://github.com/openai/codex/pull/13046) adopted it in the zsh-fork interception path. The remaining gap was the preflight execpolicy check in `core/src/exec_policy.rs`. That path derives approval requirements before execution for `shell`, `shell_command`, and `unified_exec`, but it was still using the default exact-token matcher. As a result, a command that already included an absolute executable path, such as `/usr/bin/git status`, could still miss a basename rule like `prefix_rule(pattern = ["git"], ...)` during preflight even when the policy also defined a matching `host_executable(name = "git", ...)` entry. This PR brings the same opt-in `host_executable()` resolution to the preflight approval path when an absolute program path is already present in the parsed command. ## What Changed - updated `ExecPolicyManager::create_exec_approval_requirement_for_command()` in `core/src/exec_policy.rs` to use `check_multiple_with_options(...)` with `MatchOptions { resolve_host_executables: true }` - kept the existing shell parsing flow for approval derivation, but now allow basename rules to match absolute executable paths during preflight when `host_executable()` permits it - updated requested-prefix amendment evaluation to use the same host-executable-aware matching mode, so suggested `prefix_rule()` amendments are checked consistently for absolute-path commands - added preflight coverage for: - absolute-path commands that should match basename rules through `host_executable()` - absolute-path commands whose paths are not in the allowed `host_executable()` mapping - requested prefix-rule amendments for absolute-path commands ## Verification - `just fix -p codex-core` - `cargo test -p codex-core --lib exec_policy::tests::`	2026-02-28 17:25:30 +00:00
Michael Bolin	b148d98e0e	execpolicy: add host_executable() path mappings (#12964 ) ## Why `execpolicy` currently keys `prefix_rule()` matching off the literal first token. That works for rules like `["/usr/bin/git"]`, but it means shared basename rules such as `["git"]` do not help when a caller passes an absolute executable path like `/usr/bin/git`. This PR lays the groundwork for basename-aware matching without changing existing callers yet. It adds typed host-executable metadata and an opt-in resolution path in `codex-execpolicy`, so a follow-up PR can adopt the new behavior in `unix_escalation.rs` and other call sites without having to redesign the policy layer first. ## What Changed - added `host_executable(name = ..., paths = [...])` to the execpolicy parser and validated it with `AbsolutePathBuf` - stored host executable mappings separately from prefix rules inside `Policy` - added `MatchOptions` and opt-in `*_with_options()` APIs that preserve existing behavior by default - implemented exact-first matching with optional basename fallback, gated by `host_executable()` allowlists when present - normalized executable names for cross-platform matching so Windows paths like `git.exe` can satisfy `host_executable(name = "git", ...)` - updated `match` / `not_match` example validation to exercise the host-executable resolution path instead of only raw prefix-rule matching - preserved source locations for deferred example-validation errors so policy load failures still point at the right file and line - surfaced `resolvedProgram` on `RuleMatch` so callers can tell when a basename rule matched an absolute executable path - preserved host executable metadata when requirements policies overlay file-based policies in `core/src/exec_policy.rs` - documented the new rule shape and CLI behavior in `execpolicy/README.md` ## Verification - `cargo test -p codex-execpolicy` - added coverage in `execpolicy/tests/basic.rs` for parsing, precedence, empty allowlists, basename 
fallback, exact-match precedence, and host-executable-backed `match` / `not_match` examples - added a regression test in `core/src/exec_policy.rs` to verify requirements overlays preserve `host_executable()` metadata - verified `cargo test -p codex-core --lib`, including source-rendering coverage for deferred validation errors	2026-02-27 12:59:24 -08:00
Dylan Hurd	f6053fdfb3	feat(core) Introduce Feature::RequestPermissions (#11871 ) ## Summary Introduces the initial implementation of Feature::RequestPermissions. RequestPermissions allows the model to request that a command be run inside the sandbox, with additional permissions, like writing to a specific folder. Eventually this will include other rules as well, and the ability to persist these permissions, but this PR is already quite large - let's get the core flow working and go from there! <img width="1279" height="541" alt="Screenshot 2026-02-15 at 2 26 22 PM" src="https://github.com/user-attachments/assets/0ee3ec0f-02ec-4509-91a2-809ac80be368" /> ## Testing - [x] Added tests - [x] Tested locally - [x] Feature	2026-02-24 09:48:57 -08:00
viyatb-oai	c3048ff90a	feat(core): persist network approvals in execpolicy (#12357 ) ## Summary Persist network approval allow/deny decisions as `network_rule(...)` entries in execpolicy (not proxy config). It adds `network_rule` parsing + append support in `codex-execpolicy`, including `decision="prompt"` (parse-only; not compiled into proxy allow/deny lists) - compile execpolicy network rules into proxy allow/deny lists and update the live proxy state on approval - preserve requirements execpolicy `network_rule(...)` entries when merging with file-based execpolicy - reject broad wildcard hosts (for example `*`) for persisted `network_rule(...)`	2026-02-23 21:37:46 -08:00
Dylan Hurd	a8b4b569fb	fix(core) Filter non-matching prefix rules (#12314 ) ## Summary `gpt-5.3-codex` really likes to write complicated shell scripts, and suggests a partial prefix_rule that wouldn't actually approve the command. We should only show the `prefix_rule` suggestion from the model if it would actually fully approve the command the user is seeing. This will technically cause more instances of overly-specific suggestions when we fall back, but I think the UX is clearer, particularly when the model doesn't necessarily understand the current limitations of execpolicy parsing. ## Testing - [x] Add unit tests - [x] Add integration tests	2026-02-20 22:02:35 -08:00
Michael Bolin	425fff7ad6	feat: add Reject approval policy with granular prompt rejection controls (#12087 ) ## Why We need a way to auto-reject specific approval prompt categories without switching all approvals off. The goal is to let users independently control: - sandbox escalation approvals, - execpolicy `prompt` rule approvals, - MCP elicitation prompts. ## What changed - Added a new primary approval mode in `protocol/src/protocol.rs`: ```rust pub enum AskForApproval { // ... Reject(RejectConfig), // ... } pub struct RejectConfig { pub sandbox_approval: bool, pub rules: bool, pub mcp_elicitations: bool, } ``` - Wired `RejectConfig` semantics through approval paths in `core`: - `core/src/exec_policy.rs` - rejects rule-driven prompts when `rules = true` - rejects sandbox/escalation prompts when `sandbox_approval = true` - preserves rule priority when both rule and sandbox prompt conditions are present - `core/src/tools/sandboxing.rs` - applies `sandbox_approval` to default exec approval decisions and sandbox-failure retry gating - `core/src/safety.rs` - keeps `Reject { all false }` behavior aligned with `OnRequest` for patch safety - rejects out-of-root patch approvals when `sandbox_approval = true` - `core/src/mcp_connection_manager.rs` - auto-declines MCP elicitations when `mcp_elicitations = true` - Ensured approval policy used by MCP elicitation flow stays in sync with constrained session policy updates. - Updated app-server v2 conversions and generated schema/TypeScript artifacts for the new `Reject` shape. ## Verification Added focused unit coverage for the new behavior in: - `core/src/exec_policy.rs` - `core/src/tools/sandboxing.rs` - `core/src/mcp_connection_manager.rs` - `core/src/safety.rs` - `core/src/tools/runtimes/apply_patch.rs` Key cases covered include rule-vs-sandbox prompt precedence, MCP auto-decline behavior, and patch/sandbox retry behavior under `RejectConfig`.	2026-02-19 11:41:49 -08:00
Dylan Hurd	0fbe10a807	fix(core) exec_policy parsing fixes (#11951 ) ## Summary Fixes a few things in our exec_policy handling of prefix_rules: 1. Correctly match redirects specifically for exec_policy parsing. i.e. if you have `prefix_rule(["echo"], decision="allow")` then `echo hello > output.txt` should match - this should fix #10321 2. If there already exists any rule that would match our prefix rule (not just a prompt), then drop it, since it won't do anything. ## Testing - [x] Updated unit tests, added approvals ScenarioSpecs	2026-02-16 23:11:59 -08:00
Dylan Hurd	19afbc35c1	chore(core) rm Feature::RequestRule (#11866 ) ## Summary This feature is now reasonably stable, let's remove it so we can simplify our upcoming iterations here. ## Testing - [x] Existing tests pass	2026-02-16 22:30:23 +00:00
Eric Traut	b98c810328	Report syntax errors in rules file (#11686 ) Currently, if there are syntax errors detected in the starlark rules file, the entire policy is silently ignored by the CLI. The app server correctly emits a message that can be displayed in a GUI. This PR changes the CLI (both the TUI and non-interactive exec) to fail when the rules file can't be parsed. It then prints out an error message and exits with a non-zero exit code. This is consistent with the handling of errors in the config file. This addresses #11603	2026-02-13 10:33:40 -08:00
Dylan Hurd	e6e4c5fa3a	chore(core) Restrict model-suggested rules (#11671 ) ## Summary If the model suggests a bad rule, don't show it to the user. This does not impact the parsing of existing rules, just the ones we show. ## Testing - [x] Added unit tests - [x] Ran locally	2026-02-12 23:57:53 -08:00
Josh McKinney	fc073c9c5b	Remove git commands from dangerous command checks (#11510 ) ### Motivation - Git subcommand matching was being classified as "dangerous" and caused benign developer workflows (for example `git push --force-with-lease`) to be blocked by the preflight policy. - The change aligns behavior with the intent to reserve the dangerous checklist for truly destructive shell ops (e.g. `rm -rf`) and avoid surprising developer-facing blocks. ### Description - Remove git-specific subcommand checks from `is_dangerous_to_call_with_exec` in `codex-rs/shell-command/src/command_safety/is_dangerous_command.rs`, leaving only explicit `rm` and `sudo` passthrough checks. - Deleted the git-specific helper logic that classified `reset`, `branch`-delete, `push` (force/delete/refspec) and `clean --force` as dangerous. - Updated unit tests in the same file to assert that various `git reset`/`git branch`/`git push`/`git clean` variants are no longer classified as dangerous. - Kept `find_git_subcommand` (used by safe-command classification) intact so safe/unsafe parsing elsewhere remains functional. ### Testing - Ran formatter with `just fmt` successfully. - Ran unit tests with `cargo test -p codex-shell-command` and all tests passed (`144 passed; 0 failed`). ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_698d19dedb4883299c3ceb5bbc6a0dcf)	2026-02-13 01:33:02 +00:00
Michael Bolin	abbd74e2be	feat: make sandbox read access configurable with `ReadOnlyAccess` (#11387 ) `SandboxPolicy::ReadOnly` previously implied broad read access and could not express a narrower read surface. This change introduces an explicit read-access model so we can support user-configurable read restrictions in follow-up work, while preserving current behavior today. It also ensures unsupported backends fail closed for restricted-read policies instead of silently granting broader access than intended. ## What - Added `ReadOnlyAccess` in protocol with: - `Restricted { include_platform_defaults, readable_roots }` - `FullAccess` - Updated `SandboxPolicy` to carry read-access configuration: - `ReadOnly { access: ReadOnlyAccess }` - `WorkspaceWrite { ..., read_only_access: ReadOnlyAccess }` - Preserved existing behavior by defaulting current construction paths to `ReadOnlyAccess::FullAccess`. - Threaded the new fields through sandbox policy consumers and call sites across `core`, `tui`, `linux-sandbox`, `windows-sandbox`, and related tests. - Updated Seatbelt policy generation to honor restricted read roots by emitting scoped read rules when full read access is not granted. - Added fail-closed behavior on Linux and Windows backends when restricted read access is requested but not yet implemented there (`UnsupportedOperation`). - Regenerated app-server protocol schema and TypeScript artifacts, including `ReadOnlyAccess`. ## Compatibility / rollout - Runtime behavior remains unchanged by default (`FullAccess`). - API/schema changes are in place so future config wiring can enable restricted read access without another policy-shape migration.	2026-02-11 18:31:14 -08:00
Dylan Hurd	cc8c293378	fix(exec-policy) No empty command lists (#11397 ) ## Summary This should rarely, if ever, happen in practice. But regardless, we should never provide an empty list of `commands` to ExecPolicy. This PR is almost entirely adding tests around these cases. ## Testing - [x] Adds a bunch of unit tests for this	2026-02-10 19:22:23 -08:00
viyatb-oai	62d0f302fd	fix(core): canonicalize wrapper approvals and support heredoc prefix … (#10941 ) ## Summary - Reduced repeated approvals for equivalent wrapper commands and fixed execpolicy matching for heredoc-style shell invocations, with minimal behavior change and fail-closed defaults. ## Fixes 1. Canonicalized approval matching for wrappers so equivalent commands map to the same approval intent. 2. Added heredoc-aware prefix extraction for execpolicy so commands like `python3 <<'PY' ... PY` match rules such as `prefix_rule(["python3"], ...)`. 3. Kept fallback behavior conservative: if parsing is ambiguous, existing prompt behavior is preserved. ## Edge Cases Covered - Wrapper path/name differences: `/bin/bash` vs `bash`, `/bin/zsh` vs `zsh`. - Shell modes: `-c` and `-lc`. - Heredoc forms: quoted delimiter (`<<'PY'`) and unquoted delimiter (`<< PY`). - Multi-command heredoc scripts are rejected by the fallback. - Non-heredoc redirections (`>`, etc.) are not treated as heredoc prefix matches. - Complex scripts still fall back to prior behavior rather than expanding permissions. --------- Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>	2026-02-10 11:46:40 -08:00
Eric Traut	4521a6e852	Removed "exec_policy" feature flag (#10851 ) This is no longer needed because it's on by default	2026-02-06 08:59:47 -08:00
viyatb-oai	1dcce204fc	Revert "Load untrusted rules" (#10536 ) Reverts openai/codex#9791	2026-02-03 19:38:44 +00:00
viyatb-oai	f50c8b2f81	fix: unsafe auto-approval of git commands (#10258 ) Fixes https://github.com/openai/codex/issues/10160 and some more. ## Description Hardens Git command safety to prevent approval bypasses for destructive or write-capable invocations (branch delete, risky push forms, output/config-override flags), so these commands no longer auto-run as “safe.” - `git branch -d` variants (especially in worktrees / with global options like -C / -c) - `git show|diff|log --output` ... style file-write flags - risky Git config override flags (-c, --config-env) that can trigger external execution - dangerous push forms that weren’t fully caught (`--force*`, `--delete`, `+refspec`, `:refspec`) - grouped short-flag delete forms (e.g. stacked branch flags containing `d/D`) Will fast-follow with a common git policy to bring Windows to parity. --------- Co-authored-by: Eric Traut <etraut@openai.com>	2026-02-02 12:30:17 -08:00
gt-oai	5662eb8b75	Load exec policy rules from requirements (#10190 ) `requirements.toml` should be able to specify rules which always run. My intention here was that these rules could only ever be restrictive, which means the decision can be "prompt" or "forbidden" but never "allow". A requirement of "you must always allow this command" didn't make sense to me, but happy to be gaveled otherwise. Rules already apply the most restrictive decision, so we can safely merge these with rules found in other config folders.	2026-01-30 18:04:09 +00:00
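The reason merging is safe under #10190 can be sketched in Python: if evaluation always takes the most restrictive matching decision, then adding "prompt" or "forbidden" rules from requirements can only tighten the outcome. The `Decision` ordering, `check_requirements_rules`, and `evaluate` below are hypothetical names for illustration, not the actual execpolicy API.

```python
from enum import IntEnum


class Decision(IntEnum):
    # Ordered from least to most restrictive so max() picks the strictest.
    ALLOW = 0
    PROMPT = 1
    FORBIDDEN = 2


def check_requirements_rules(rules):
    """Assumption: requirements rules may restrict but never allow."""
    for pattern, decision in rules:
        if decision == Decision.ALLOW:
            raise ValueError(f"requirements rule {pattern} may not use 'allow'")
    return rules


def evaluate(command, rules):
    """Return the most restrictive decision among matching prefix rules."""
    matched = [d for pattern, d in rules if command[: len(pattern)] == pattern]
    return max(matched) if matched else None


user_rules = [(["git"], Decision.ALLOW)]
requirements_rules = check_requirements_rules([(["git", "push"], Decision.PROMPT)])
merged = user_rules + requirements_rules

print(evaluate(["git", "status"], merged).name)          # ALLOW
print(evaluate(["git", "push", "origin"], merged).name)  # PROMPT
```

Because the requirements rule can only raise the decision for commands it matches, the user's broader `allow` rule for `git` still applies everywhere else.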
Dylan Hurd	996e09ca24	feat(core) RequestRule (#9489 ) ## Summary Instead of trying to derive the prefix_rule for a command mechanically, let's let the model decide for us. ## Testing - [x] tested locally	2026-01-28 08:43:17 +00:00
gt-oai	b9deb57689	Load untrusted rules (#9791 )	2026-01-23 21:52:27 +00:00
gt-oai	7938c170d9	Print warning if we skip config loading (#9611 ) https://github.com/openai/codex/pull/9533 silently ignored config if untrusted. Instead, we still load it but disable it. Maybe we shouldn't try to parse it either... [Screenshot 2026-01-21 at 14 56 38]	2026-01-23 20:06:37 +00:00
Eric Traut	31d9b6f4d2	Improve handling of config and rules errors for app server clients (#9182 ) When an invalid config.toml key or value is detected, the CLI currently just quits. This leaves the VSCE in a dead state. This PR changes the behavior to not quit and bubble up the config error to users to make it actionable. It also surfaces errors related to "rules" parsing. This allows us to surface these errors to users in the VSCE, like this: [Screenshot 2026-01-13 at 4 29 22 PM] [Screenshot 2026-01-13 at 4 45 06 PM]	2026-01-13 17:57:09 -08:00
Michael Bolin	ddae70bd62	fix: prompt for unsafe commands on Windows (#9117 )	2026-01-12 21:30:09 -08:00
Shijie Rao	efd0c21b9b	Feat: appServer.requirementList for requirement.toml (#8800 ) ### Summary We are exposing requirements via the `requirement/list` method from app-server so that we can conditionally disable the agent mode dropdown selection in VSCE and correctly set the default value. ### Sample output #### `etc/codex/requirements.toml` [Screenshot 2026-01-06 at 11 32 06 PM] #### App server response [Screenshot 2026-01-06 at 11 30 18 PM]	2026-01-07 13:57:44 -08:00
Michael Bolin	cafb07fe6e	feat: add justification arg to prefix_rule() in *.rules (#8751 ) Adds an optional `justification` parameter to the `prefix_rule()` execpolicy DSL so policy authors can attach human-readable rationale to a rule. That justification is propagated through parsing/matching and can be surfaced to the model (or approval UI) when a command is blocked or requires approval. When a command is rejected (or gated behind approval) due to policy, a generic message makes it hard for the model/user to understand what went wrong and what to do instead. Allowing policy authors to supply a short justification improves debuggability and helps guide the model toward compliant alternatives. Example: ```python prefix_rule( pattern = ["git", "push"], decision = "forbidden", justification = "pushing is blocked in this repo", ) ``` If Codex tried to run `git push origin main`, now the failure would include: ``` `git push origin main` rejected: pushing is blocked in this repo ``` whereas previously, all it was told was: ``` execpolicy forbids this command ```	2026-01-05 21:24:48 +00:00
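How a rule's optional `justification` changes the message shown for a blocked command (#8751) can be sketched as follows. The two message formats are taken from the commit message above; the function name `rejection_message` and its signature are hypothetical, not the actual Codex code.

```python
import shlex


def rejection_message(command, justification=None):
    """Build the failure text for a command blocked by a forbidden rule."""
    if justification:
        # With a justification, name the command and the author's rationale.
        return f"`{shlex.join(command)}` rejected: {justification}"
    # Without one, only the generic message is available.
    return "execpolicy forbids this command"


print(rejection_message(["git", "push", "origin", "main"],
                        "pushing is blocked in this repo"))
print(rejection_message(["git", "push", "origin", "main"]))
```

The first call prints the command-specific message and the second prints the generic fallback, which is the before/after contrast the commit describes.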
Michael Bolin	277babba79	feat: load ExecPolicyManager from ConfigLayerStack (#8453 ) https://github.com/openai/codex/pull/8354 added support for in-repo `.config/` files, so this PR updates the logic for loading `.rules` files to load `.rules` files from all relevant layers. The main change to the business logic is `load_exec_policy()` in `codex-rs/core/src/exec_policy.rs`. Note this adds a `config_folder()` method to `ConfigLayerSource` that returns `Option<AbsolutePathBuf>` so that it is straightforward to iterate over the sources and get the associated config folder, if any.	2025-12-22 17:24:17 -08:00
pakrym-oai	96fdbdd434	Add ExecPolicyManager (#8349 ) Move exec policy management into services to keep turn context immutable.	2025-12-22 09:59:32 -08:00


60 Commits