Commit Graph

3085 Commits

Author SHA1 Message Date
Michael Bolin
fa4fad57f0 otel: report conversation permissions from profiles 2026-04-30 04:05:23 -07:00
Michael Bolin
37aa2f8157 protocol: drop cwd-less legacy profile constructor 2026-04-30 04:00:16 -07:00
Michael Bolin
3c5890a1ce session tests: configure runtime permissions directly 2026-04-30 03:47:14 -07:00
Michael Bolin
7364013fe0 tests: mutate spawn-agent permission profile directly 2026-04-30 03:43:57 -07:00
Michael Bolin
2876493bae tests: use disabled profile in exec capture check 2026-04-30 03:28:48 -07:00
Michael Bolin
e894ac76f7 tests: use profile constructors in config checks 2026-04-30 03:26:51 -07:00
Michael Bolin
4cf7855a99 tests: use permission profiles in multi-agent config checks 2026-04-30 03:23:18 -07:00
Michael Bolin
c48043f4e4 tests: use permission profiles in session network checks 2026-04-30 03:18:22 -07:00
Michael Bolin
8a2144d700 tests: use permission profiles in config loader checks 2026-04-30 03:18:22 -07:00
Michael Bolin
0fc2a7b068 tests: submit websocket turns with permission profiles 2026-04-30 03:08:22 -07:00
Michael Bolin
4f646e0aca tests: use permission profiles in exec policy checks 2026-04-30 03:04:35 -07:00
Michael Bolin
e28bb5c396 tests: use permission profiles in request permission suite 2026-04-30 03:01:06 -07:00
Michael Bolin
521cf5bdd4 tests: use permission profiles in unified exec suite 2026-04-30 03:01:06 -07:00
Michael Bolin
57094ee86d core: use permission profiles in small read-only contexts 2026-04-30 03:01:06 -07:00
Michael Bolin
200c83f7d7 tests: use permission profiles in suite turn submits 2026-04-30 02:36:30 -07:00
Michael Bolin
cfeaa5aab1 guardian: configure review session permissions directly 2026-04-30 02:36:30 -07:00
Michael Bolin
75c9c98aed tests: use permission profiles in small core fixtures 2026-04-30 02:36:30 -07:00
Michael Bolin
05d341f0d4 tests: use permission profiles in guardian config checks 2026-04-30 02:36:30 -07:00
Michael Bolin
d53c86e0da tests: use permission profiles in unix escalation checks 2026-04-30 02:36:30 -07:00
Michael Bolin
44ec706a44 tests: use permission profiles in patch safety checks 2026-04-30 02:36:30 -07:00
Michael Bolin
a3880e937b tests: use permission profiles in tool sandbox tests 2026-04-30 02:36:30 -07:00
Michael Bolin
ee05c896f7 tests: use permission profile fixtures in config checks 2026-04-30 02:36:30 -07:00
Michael Bolin
ada7881352 core: build permission instructions from profiles only 2026-04-30 02:36:30 -07:00
Michael Bolin
97aaf4cea4 tests: copy plugin stdio server before launch 2026-04-30 02:36:21 -07:00
jif-oai
c37f7434ba Gate multi-agent v2 tools independently of collab (#20246)
## Why

`multi_agents_v2` is meant to be independently gated from the older
`collab` feature. The tool registry still treated the
collaboration-style agent tools as `collab`-only, so enabling
`multi_agents_v2` without `collab` omitted the v2 agent tools. Review
and guardian sub-sessions also need to keep agent spawning disabled even
when the outer session has `multi_agents_v2` enabled.

## What changed

- Include the collab-backed agent tools when either `multi_agents_v2` or
`collab` is enabled.
- Explicitly disable `multi_agents_v2` for review and guardian review
sub-sessions, matching the existing `spawn_csv` and `collab`
restrictions.
- Add a registry test that enables `multi_agents_v2`, disables `collab`,
and verifies the v2 agent tools are present while legacy `send_input`
and `resume_agent` remain hidden.

## Testing

- Added
`test_build_specs_multi_agent_v2_does_not_require_collab_feature`.
2026-04-30 10:23:31 +02:00
Abhinav
8f3c06cc97 Add persisted hook enablement state (#19840)
## Why

After `hooks/list` exposes the hook inventory, clients need a way to
persist user hook preferences, make those changes effective in
already-open sessions, and distinguish user-controllable hooks from
managed requirements without adding another bespoke app-server write
API.

## What

- Extends `hooks/list` entries with effective `enabled` state.
- Persists user-level hook state under `hooks.state.<hook-id>` so the
model can grow beyond a single boolean over time.
- Uses the existing `config/batchWrite` path for hook state updates
instead of introducing a dedicated hook write RPC.
- Refreshes live session hook engines after config writes so
already-open threads observe updated enablement without a restart.

## Stack

1. openai/codex#19705
2. openai/codex#19778
3. This PR - openai/codex#19840
4. openai/codex#19882

## Reviewer Notes

The generated schema files account for much of the raw diff. The core
behavior is in:

- `hooks/src/config_rules.rs`, which resolves per-hook user state from
the config layer stack.
- `hooks/src/engine/discovery.rs`, which projects effective enablement
into `hooks/list` from source-derived managedness.
- `config/src/hook_config.rs`, which defines the new `hooks.state`
representation.
- `core/src/session/mod.rs`, which rebuilds live hook state after user
config reloads.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-30 04:46:32 +00:00
Michael Bolin
ac4332c05b permissions: expose active profile metadata (#20095) 2026-04-29 20:54:59 -07:00
Matthew Zeng
ebe602d005 [plugins] Allow MSFT curated plugins in tool_suggest (#20304)
## Summary
- [x] Move the allowlist out of core crate
- [x] Add Teams, SharePoint, Outlook Email, and Outlook Calendar to the
tool_suggest discoverable plugin allowlist
- [x] Add focused coverage for Microsoft curated plugin discovery

## Testing
- just fmt
- cargo test -p codex-core-plugins
- cargo test -p codex-core
list_tool_suggest_discoverable_plugins_returns_
2026-04-29 19:45:52 -07:00
rhan-oai
bb536d65bd [codex-analytics] prevent stale guardian events from satisfying reused reviews (#20080)
## Why

Reused Guardian review trunks can still have older child-turn events
queued when a later review starts. The review waiter currently accepts
the first terminal event it sees from the shared child session, so a
stale `TurnComplete` can be attributed to the new review. That produces
impossible analytics combinations such as non-null TTFT with sub-10 ms
completion latency and zero token deltas on `trunk_reused` reviews.

## What changed

- Preserve the child turn id returned by the Guardian review
`Op::UserTurn` submission.
- Restrict Guardian review waiting to events correlated with that
submitted child turn.
- Restrict timeout/abort draining to terminal events for the same child
turn.
- Add regression coverage for stale prior-turn completions, stale
prior-turn errors, and interrupt draining in
`codex-rs/core/src/guardian/review_session.rs`.

## Verification

- `cargo test -p codex-core guardian::review_session::tests::`
- `cargo clippy -p codex-core --tests -- -D warnings`
2026-04-29 18:26:39 -07:00
pakrym-oai
fedcefe9da Reduce the surface of collaboration modes (#20149)
Collaboration modes were slightly invasive both into ThreadManager
construction and ModelProvider
2026-04-29 17:22:41 -07:00
Abhinav
8774229a89 Add hooks/list app-server RPC (#19778)
## Why

We need a way to list the available hooks to expose via the TUI and App
so users can view and manage their hooks

## What

- Adds `hooks/list` for one or more `cwd` values that returns discovered
hook metadata

## Stack

1. openai/codex#19705
2. This PR - openai/codex#19778
3. openai/codex#19840
4. openai/codex#19882

## Review Notes

The generated schema files account for most of the raw diff, these files
have the core change:

- `hooks/src/engine/discovery.rs` builds the inventory entries during
hook discovery while leaving runtime handlers focused on execution.
- `app-server/src/codex_message_processor.rs` wires `hooks/list` into
the app-server flow for each requested `cwd`.
- `app-server-protocol/src/protocol/v2.rs` defines the new v2
request/response payloads exposed on the wire.

### Core Changes

`core/src/plugins/manager.rs` adds `plugins_for_layer_stack(...)` so
`skills/list` and `hooks/list`can resolve plugin state for each
requested `cwd`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-29 23:39:57 +00:00
iceweasel-oai
13dbcda28f stop blocking unified_exec on Windows (#19435)
## Summary
- remove the Windows-specific unified-exec environment block from tool
selection
- keep `unified_exec` default-off on Windows unless the feature is
explicitly enabled
- normalize model-provided `shell_type = unified_exec` to
`shell_command` when the feature is disabled
- drop obsolete tests tied to the removed environment gate and keep the
feature-flag regression coverage

## Why
Now that the session/long-lived process backend is implemented for the
Windows sandbox, we don't need to hard disable it anymore. We will be
rolling out slowly using a feature gate.

## Impact
This allows manual Windows opt-in in CLI and app-backed flows while
preserving the existing default-off behavior for Windows users.

---------

Co-authored-by: canvrno-oai <kbond@openai.com>
Co-authored-by: Codex <noreply@openai.com>
2026-04-29 16:06:33 -07:00
pakrym-oai
8de2a7a16d Add codex-core public API listing (#20243)
Summary:
- Add a checked-in codex-core public API listing generated by
cargo-public-api.
- Add scripts/regen-public-api.sh with an embedded crate list,
auto-install for cargo-public-api 0.51.0, pinned nightly, and --check
mode.
- Add Rust CI jobs on the codex Linux x64 runner pool to verify the
listing stays up to date.

Testing:
- bash -n scripts/regen-public-api.sh
- just regen-public-api --check
- yq '.' .github/workflows/rust-ci.yml
.github/workflows/rust-ci-full.yml
- git diff --check
2026-04-29 22:58:08 +00:00
Matthew Zeng
e20391e567 [mcp] Fix plugin MCP approval policy. (#19537)
Plugin MCP servers are loaded from plugin manifests rather than
top-level `[mcp_servers]`, so their tool approval preferences need to be
stored and applied through the owning plugin config. Without this,
choosing "Always allow" for a plugin MCP tool could write a preference
that was not reliably used on later tool calls.

## Summary
- Add plugin-scoped MCP policy config under
`plugins.<plugin>.mcp_servers`, including server enablement, tool
allow/deny lists, server defaults, and per-tool approval modes.
- Overlay plugin MCP policy onto manifest-provided server configs when
plugins are loaded.
- Route persistent "Always allow" writes for plugin MCP tools back to
the owning `plugins.<plugin>.mcp_servers.<server>.tools.<tool>` config
entry.
- Reload user config after persisting an approval and make the plugin
load cache config-aware so stale plugin MCP policy is not reused after
`config.toml` changes.
- Regenerate the config schema and add coverage for plugin MCP policy
loading, approval lookup, persistence, and stale-cache prevention.

## Testing
- `cargo test -p codex-config`
- `cargo test -p codex-core-plugins`
- `cargo test -p codex-core --lib plugin_mcp`
2026-04-29 15:40:03 -07:00
Eric Traut
4241df4d79 Escape turn metadata headers as ASCII JSON (#19620)
## Why

`x-codex-turn-metadata` is sent as an HTTP/WebSocket header, but Codex
was serializing the metadata JSON with raw UTF-8 string contents. When a
workspace path contains non-ASCII characters, common HTTP stacks can
reject or corrupt that header before the request reaches the provider.

Fixes #17468. Also addresses the duplicate WebSocket report in #19581.

## What changed

- Added `codex_utils_string::to_ascii_json_string`, a shared helper that
serializes JSON normally while escaping non-ASCII string content as
`\uXXXX`.
- Switched turn metadata header serialization, including merged
Responses API client metadata, to use the ASCII-safe JSON helper.
- Added coverage for non-ASCII workspace paths and non-ASCII client
metadata while preserving the same parsed JSON values.

## Verification

- `cargo test -p codex-utils-string`
- `cargo test -p codex-core turn_metadata`
- `just bazel-lock-check`
2026-04-29 15:35:33 -07:00
Alex Daley
f63b19bedd [apps] Add apps MCP path override (#20231)
Summary

- Add `[features.apps_mcp_path_override]` config with a `path` field for
overriding only the built-in apps MCP path.
- Keep existing host/base URL derivation unchanged and append the
configured path after that base.
- Regenerate the config schema with the custom feature-config case.

Test Plan

- Not run for latest revision; only `just fmt` and `just
write-config-schema` were run.
- Earlier revision: `cargo test -p codex-features`
- Earlier revision: `cargo test -p codex-mcp`
2026-04-29 18:08:06 -04:00
Matthew Zeng
8ce48f9968 [tool_suggest] Improve tool_suggest triggering conditions. (#20091)
## Summary
- Tighten `tool_suggest` guidance so it prefers explicit plugin install
requests, while still allowing a connector install when the relevant
plugin is already installed and a needed connector from that plugin is
missing.
- Tell the model not to call `tool_suggest` in parallel with other
tools.

## Testing
- `cargo test -p codex-tools tool_suggest`
- `cargo test -p codex-core tool_suggest`
2026-04-29 13:41:12 -07:00
viyatb-oai
07c8b8c77c fix: handle deferred network proxy denials (#19184)
## Why

This bug is exposed by Guardian/auto-review approvals. With the managed
network proxy enabled, a blocked network request can be reported back
through the network approval service as an approval denial after the
command has already started. Before this change, the shell and unified
exec runtimes registered those network approval calls, but did not have
a way to observe an async proxy denial as a cancellation/failure signal
for the running process.

The result was confusing: Guardian/auto-review could correctly deny
network access, but the command path could keep running or unregister
the approval without surfacing the denial as the command failure.

## What Changed

- `NetworkApprovalService` now attaches a cancellation token to active
and deferred network approvals.
- Proxy-denial outcomes are recorded only for active registrations,
cancel the owning token, and are consumed when the approval is
finalized.
- The shell runtime combines the normal command timeout with the
network-denial cancellation token.
- Unified exec stores the deferred network approval object, terminates
tracked processes when the proxy denial arrives, and returns the denial
as a process failure while polling or completing the process.
- Tool orchestration passes the active network approval cancellation
token into the sandbox attempt and preserves deferred approval errors
instead of silently unregistering them.
- App-server `command/exec` now handles the combined
timeout-or-cancellation expiration variant used by the runtime.

## Verification

- `cargo test -p codex-core network_approval --lib`
- `cargo clippy -p codex-app-server --all-targets -- -D warnings`
- `cargo clippy -p codex-core --all-targets -- -D warnings`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-29 19:13:57 +00:00
xl-openai
73cd831952 feat: Use remote installed plugin cache for skills and MCP (#20096)
- Fetches and caches remote /installed plugin state
- Lets skills/list load skills from remote-installed cached plugins
without requiring a local marketplace entry
- Routes plugin list/startup/install/uninstall changes through async
plugin cache invalidation and MCP refresh
2026-04-29 12:09:49 -07:00
Won Park
5cf0adba93 Include auto-review rollout in feedback uploads (#20064)
## Summary

- include the live auto-review trunk rollout when `/feedback` uploads
logs
- upload that attachment as
`auto-review-rollout-<parent-thread-id>.jsonl` so it is distinguishable
from the parent rollout
- show the same auto-review attachment name in the TUI consent popup

## Scope

- this only covers the live cached auto-review trunk for the current
parent thread
- it does not add durable historical parent->auto-review lookup
- it does not add persisted rollout support for ephemeral parallel
review forks

## UI 

<img width="599" height="185" alt="Screenshot 2026-04-28 at 1 17 18 PM"
src="https://github.com/user-attachments/assets/6a0e79c2-5d21-4702-8a89-f765778bc9e9"
/>

## Validation

- `cargo test -p codex-core
cached_guardian_subagent_exposes_its_rollout_path`
- `cargo test -p codex-feedback`
- `cargo test -p codex-app-server`
- `cargo test -p codex-tui feedback_upload_consent_popup_snapshot`
- `cargo test -p codex-tui
feedback_good_result_consent_popup_includes_connectivity_diagnostics_filename`

## Known unrelated local failures

- `cargo test -p codex-core` currently fails in the pre-existing proxy
env snapshot test
`tools::runtimes::tests::maybe_wrap_shell_lc_with_snapshot_keeps_user_proxy_env_when_proxy_inactive`
- `cargo test -p codex-tui` currently hits pre-existing `status::*`
snapshot drift unrelated to this change

## Follow-Up 
- persist parallel auto-review fork sessions so /feedback can include
their rollout history too
- attach each persisted fork as its own clearly named file, for example
auto-review-rollout-<parent-thread-id>-fork <n>.jsonl, instead of
merging multiple Guardian sessions into one attachment
- keep the same live-session-only scope initially; durable historical
parent -> auto-review lookup can remain a separate decision if we later
need feedback from resumed sessions
2026-04-29 11:44:55 -07:00
pakrym-oai
8356806fc9 Add ThreadManager sample crate (#20141)
Summary:
- Add codex-thread-manager-sample, a one-shot binary that starts a
ThreadManager thread, submits a prompt, and prints the final assistant
output.
- Pass ThreadStore into ThreadManager::new and expose
thread_store_from_config for existing callsites.
- Build the sample Config directly with only --model and prompt inputs.

Verification:
- just fmt
- cargo check -p codex-thread-manager-sample -p codex-app-server -p
codex-mcp-server
- git diff --check

Tests: Not run per request.
2026-04-29 11:21:06 -07:00
jif-oai
70ac0f123c Make multi-agent v2 ignore agents.max_depth (#20180)
## Why

`agents.max_depth` is a legacy multi-agent v1 guard. Multi-agent v2 uses
task-path routing and its own session/thread limits, so v2 should not
reject nested `spawn_agent` calls just because the thread-spawn depth
has reached the v1 maximum.

Keeping the v1 depth guard active in v2 prevents deeper task trees even
though the v2 path still needs the depth value only for lineage and
task-path metadata.

## What Changed

- Removed the depth-limit rejection from the multi-agent v2
`spawn_agent` handler while still computing child depth for lineage/path
metadata.
- Made the depth-based disabling of legacy `SpawnCsv`/`Collab` tools
apply only when `Feature::MultiAgentV2` is disabled.
- Added `multi_agent_v2_spawn_agent_ignores_configured_max_depth` to
cover a v2 child spawning another agent when `agent_max_depth = 1`,
while the existing v1 depth-limit tests continue to enforce the legacy
behavior.

## Verification

- `cargo test -p codex-core
multi_agent_v2_spawn_agent_ignores_configured_max_depth -- --nocapture`
- `cargo test -p codex-core depth_limit -- --nocapture`
- `cargo test -p codex-core tools::handlers::multi_agents::tests --
--nocapture`
2026-04-29 12:23:00 +02:00
jif-oai
c41b74c453 nit: drop old memories things (#20186)
Drop legacy code
2026-04-29 12:19:50 +02:00
viyatb-oai
5597925155 feat(cli): add sandbox profile config controls (#20118)
## Why

The explicit profile path from #20117 is meant for standalone testing,
but it still inherited the
shell cwd and all managed requirements implicitly. The pre-existing
launcher path even called out
that it did not support a separate cwd yet in

[`debug_sandbox.rs`](509453f688/codex-rs/cli/src/debug_sandbox.rs (L174-L179)).

For a standalone command, the useful default is to let the caller choose
the project directory being
tested and to avoid administrator-provided constraints unless the caller
explicitly wants to test
those too.

## What changed

- Add explicit-profile-only `-C/--cd DIR`, and use that cwd for both
profile resolution and command
  execution.
- Add explicit-profile-only `--include-managed-config`.
- Make explicit profile mode skip managed requirement sources by
default, including cloud
requirements, MDM requirements, `/etc/codex/requirements.toml`, and the
legacy managed-config
  requirements projection.
- Preserve all existing invocations outside the explicit-profile path.

## Stack

1. #20117 `sandbox-ui-profile`
2. #20118 `sandbox-ui-config` --> this PR

Both PRs are additive. Replay JSON is intentionally deferred to a
follow-up design pass.

## Tests ran

- `cargo test -p codex-cli debug_sandbox`
- `cargo test -p codex-cli sandbox_macos_`
- `cargo test -p codex-core
load_config_layers_can_ignore_managed_requirements`
- `cargo test -p codex-core
load_config_layers_includes_cloud_requirements`
- macOS branch-binary smoke on the rebased top of stack: `-C` changed
execution cwd, explicit
profile mode omitted managed proxy env under `env -i`, and
`--include-managed-config` restored it.
- Linux devbox branch-binary smoke on the rebased top of stack: `-C`
changed execution cwd for
  built-in and user-defined explicit profiles.
2026-04-29 06:55:51 +00:00
Andrey Mishchenko
857146b328 Delete multi_agent_v2 followup_task interrupt parameter (#20139)
Messages sent with `followup_task` already arrive at their target
recipient promptly (at message boundaries while sampling, or after the
pending tool call completes) -- having `interrupt` is not worth the
added complexity.
2026-04-28 23:19:48 -07:00
viyatb-oai
6ed0440611 feat(cli): add explicit sandbox permission profiles (#20117)
## Why

`codex sandbox` is useful for exercising sandbox behavior directly, but
before this stack the CLI
only picked up permission profiles indirectly from the active config.
The existing debug-sandbox path
already compiled `[permissions]` profiles through normal config loading,
as covered by the existing
profile tests in
[`debug_sandbox.rs`](de2ccf9473/codex-rs/cli/src/debug_sandbox.rs (L715-L760)).

This adds the smallest stable entry point first: an explicit profile
selector that reuses the same
config machinery as normal Codex config, so standalone testing becomes
possible without changing
current no-selector behavior.

## What changed

- Add additive `--permissions-profile NAME` support to `codex sandbox
macos|linux|windows`.
- Resolve built-in and user-defined profile names by feeding
`default_permissions` through the
existing config compilation path instead of inventing a sandbox-only
parser.
- Make an explicit selector win over an ambient active profile's legacy
`sandbox_mode`.
- Keep the existing no-selector behavior unchanged.

## Stack

1. #20117 `sandbox-ui-profile` --> this PR
2. #20118 `sandbox-ui-config`

Both PRs are additive. Replay JSON is intentionally deferred to a
follow-up design pass.

## Tests ran

- `cargo test -p codex-cli debug_sandbox`
- `cargo test -p codex-cli sandbox_macos_parses_permissions_profile`
- `cargo test -p codex-core
cli_override_takes_precedence_over_profile_sandbox_mode`
- macOS branch-binary smoke on the rebased top of stack: built-in
`:workspace` and user-defined
  profiles both executed successfully through `--permissions-profile`.
- Linux devbox branch-binary smoke on the rebased top of stack: built-in
`:workspace` and
user-defined profiles both executed successfully through
`--permissions-profile`.
2026-04-29 06:18:16 +00:00
starr-openai
e1ec9e63a0 Add environment provider snapshot (#20058)
## Summary
- Change `EnvironmentProvider` to return concrete `Environment`
instances instead of `EnvironmentConfigurations`.
- Make `DefaultEnvironmentProvider` provide the provider-visible `local`
environment plus optional `remote` environment from
`CODEX_EXEC_SERVER_URL`.
- Keep `EnvironmentManager` as the concrete cache while exposing its own
explicit local environment for `local_environment()` fallback paths.

## Validation
- `just fmt`
- `git diff --check`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-28 20:05:18 -07:00
xl-openai
6f328d5e02 Soften skill description budget warnings (#20112)
Updates skill description budget messaging to be less alarming
2026-04-28 19:56:25 -07:00
Michael Bolin
e6db1a9442 linux-sandbox: switch helper plumbing to PermissionProfile (#20106)
## Why

`PermissionProfile` is the canonical runtime permission model in the
Rust workspace, but the Linux sandbox helper still accepted a legacy
`SandboxPolicy` plus separate filesystem and network policy flags. That
translation layer made the helper interface harder to reason about and
left `linux-sandbox`-specific callers and tests coupled to the legacy
policy representation.

This change moves the helper onto `PermissionProfile` directly so the
Linux sandbox plumbing matches the rest of the permission stack.

## What changed

- changed `codex-linux-sandbox` to accept `--permission-profile` and
derive the runtime filesystem and network policies internally
- updated the in-process seccomp and legacy Landlock path in
`codex-rs/linux-sandbox` to operate on `PermissionProfile`
- updated Linux sandbox argv construction in `codex-rs/sandboxing`,
`codex-rs/core`, and the CLI debug sandbox path to pass the canonical
profile instead of serializing compatibility policy projections
- simplified the Linux sandbox tests to build the exact permission
profile under test, including the managed-proxy path and
direct-runtime-enforcement carveout coverage
- removed helper-local `SandboxPolicy` usage from `bwrap` tests where
`FileSystemSandboxPolicy` is already the value being exercised

## Testing

- `cargo test -p codex-sandboxing`
- `cargo test -p codex-linux-sandbox` (on this macOS host, the crate
compiled cleanly and its Linux-only tests were cfg-gated)
- `cargo test -p codex-core --no-run`
- `cargo test -p codex-cli --no-run`
2026-04-28 19:43:44 -07:00
Michael Bolin
c9f7c88f3d fix: restore live event submit path for apply patch tests (#20108)
## Summary

This fixes the CI regression introduced by
[#20040](https://github.com/openai/codex/pull/20040).

That PR migrated several `apply_patch_cli` tests from direct
`codex.submit(Op::UserTurn { ... })` calls to `harness.submit(...)`.
`harness.submit()` waits for `TurnComplete` before returning, which
drains the same event stream that these tests use to assert `TurnDiff`,
`PatchApplyUpdated`, and related live events. The regressed tests then
timed out waiting for events that had already been consumed.

This change restores a no-wait submit path for the event-observing
`apply_patch_cli` tests so they can watch the turn stream directly
again.

## What Changed

- added a local `submit_without_wait(...)` helper in
`codex-rs/core/tests/suite/apply_patch_cli.rs`
- switched the `apply_patch_cli` tests that assert live turn events back
to that helper
- left the profile-backed `harness.submit(...)` migration in place for
tests that only care about final filesystem or tool output state

## Why macOS Looked Green

In the failing run
[25084487331](https://github.com/openai/codex/actions/runs/25084487331),
`//codex-rs/core:core-all-test` was cached on macOS, so the regressed
tests were not rerun there. The Linux GNU, Linux MUSL, and Windows Bazel
jobs reran the target and exposed the failure.

## Verification

- `cargo test -p codex-core apply_patch_ -- --nocapture`
- previously failing local cases now pass again:
  - `apply_patch_cli_move_without_content_change_has_no_turn_diff`
  - `apply_patch_turn_diff_for_rename_with_content_change`
  - `apply_patch_aggregates_diff_across_multiple_tool_calls`
2026-04-28 18:09:20 -07:00