Commit Graph

6285 Commits

Author SHA1 Message Date
iceweasel-oai
cecca5ae06 Improve Windows process management edge cases (#19211)
## Summary

Some improvements to Windows process-management issues from
https://github.com/openai/codex/pull/15578

- bound the elevated runner pipe-connect handshake instead of waiting
forever on blocking pipe connects
- terminate the spawned runner if that handshake fails, so timeout/error
paths do not leave a stray `codex-command-runner.exe`
- loop on partial `WriteFile` results when forwarding stdin in the
elevated runner, so input is not silently truncated
- fix the concrete HANDLE/SID cleanup paths in the runner setup code
- keep draining driver-backed stdout/stderr after exit until the backend
closes, instead of dropping the tail after a fixed 200ms grace period
- reuse `LocalSid` for SID ownership and add more explanatory comments
around the ownership/concurrency-sensitive code paths

## Why

The original PR fixed a lot of Windows session plumbing, but there were
still a few sharp process-lifecycle edges:

- some elevated runner handshakes could block forever
- the new timeout path could still orphan the spawned runner process
- stdin forwarding still assumed a single `WriteFile` consumed the whole
buffer
- a few raw HANDLE/SID error paths still leaked
- driver-backed output could still lose the last chunk of stdout/stderr
on slower backends

## Validation

- `cargo fmt -p codex-windows-sandbox -p codex-utils-pty`
- `cargo test -p codex-utils-pty`
- `cargo test -p codex-windows-sandbox finish_driver_spawn`
- `cargo test -p codex-windows-sandbox runner_`

Ran a local test matrix of unified-exec and shell_tool tests, all
passing
2026-04-29 10:00:01 -07:00
Eric Traut
1c420a90cd TUI: Remove core protocol dependency [1/7] (#20172)
## Why

This is part 1 of a 7-PR stack to remove direct
`codex_protocol::protocol` usage from `codex-tui` while keeping each
layer reviewable and shippable.

This first layer reduces the size of the later `chatwidget` diff by
mechanically moving MCP startup bookkeeping out of the central widget
file without changing the event shapes or behavior.

## What changed

- Extracted MCP startup status handling into
`tui/src/chatwidget/mcp_startup.rs`.
- Kept the existing core event types in place for this purely mechanical
move.
- Updated the MCP startup tests to import the moved test-only event
types directly.

## Verification

- `cargo test -p codex-tui chatwidget::tests::mcp_startup`
2026-04-29 09:10:22 -07:00
Eric Traut
91ca551df8 Use /goal resume for paused goals (#20082)
## Why

The paused goal statusline currently points users at `/goal` to unpause
a goal, but bare `/goal` is the summary command and does not change the
goal state. Instead of making `/goal` mutate state only when a goal is
paused, this gives the action an explicit command that reads naturally
in the UI.

## What Changed

- Replace `/goal unpause` with `/goal resume` for reactivating a paused
goal.
- Update the paused goal statusline and `/goal` summary copy to point at
`/goal resume`.
2026-04-29 08:56:02 -07:00
jif-oai
70ac0f123c Make multi-agent v2 ignore agents.max_depth (#20180)
## Why

`agents.max_depth` is a legacy multi-agent v1 guard. Multi-agent v2 uses
task-path routing and its own session/thread limits, so v2 should not
reject nested `spawn_agent` calls just because the thread-spawn depth
has reached the v1 maximum.

Keeping the v1 depth guard active in v2 prevents deeper task trees even
though the v2 path still needs the depth value only for lineage and
task-path metadata.

## What Changed

- Removed the depth-limit rejection from the multi-agent v2
`spawn_agent` handler while still computing child depth for lineage/path
metadata.
- Made the depth-based disabling of legacy `SpawnCsv`/`Collab` tools
apply only when `Feature::MultiAgentV2` is disabled.
- Added `multi_agent_v2_spawn_agent_ignores_configured_max_depth` to
cover a v2 child spawning another agent when `agent_max_depth = 1`,
while the existing v1 depth-limit tests continue to enforce the legacy
behavior.

## Verification

- `cargo test -p codex-core
multi_agent_v2_spawn_agent_ignores_configured_max_depth -- --nocapture`
- `cargo test -p codex-core depth_limit -- --nocapture`
- `cargo test -p codex-core tools::handlers::multi_agents::tests --
--nocapture`
2026-04-29 12:23:00 +02:00
jif-oai
c41b74c453 nit: drop old memories things (#20186)
Drop legacy code
2026-04-29 12:19:50 +02:00
iceweasel-oai
5cac3f896d Fix Windows pseudoconsole attribute handling for sandboxed PTY sessions (#20042)
## Summary
Fix the Windows sandbox PTY spawn path to pass the pseudoconsole handle
value directly into `UpdateProcThreadAttribute`.

## Why
Sandboxed `unified_exec` PTY sessions on Windows were failing during
child process startup with `0xc0000142` (`STATUS_DLL_INIT_FAILED`). In
practice this showed up as PowerShell DLL init popups when the sandboxed
background-terminal path tried to launch an interactive shell.

The root cause was that we were passing a pointer to a local `isize`
variable instead of the pseudoconsole handle value in the form Windows
expects for `PROC_THREAD_ATTRIBUTE_PSEUDOCONSOLE`.

## Validation
- `cargo build -p codex-windows-sandbox --bins`
- Reproduced the real sandboxed `codex exec` flow with
`windows.sandbox_private_desktop=true`
- Verified a `tty=true` interactive session launched through the normal
PowerShell wrapper, printed `READY`, accepted follow-up stdin, and
exited cleanly
- Confirmed no new `0xc0000142` / `Application Popup` events appeared
after the successful repro
2026-04-29 11:59:45 +02:00
alexsong-oai
d92c909ee4 Fix migrated hook path rewriting (#20144)
## Summary
- Rewrite migrated external-agent hook commands by replacing the full
hook script path token instead of only the `.claude/hooks/` segment.
- Preserve quoting around the full rewritten target path so script names
with spaces, absolute paths, and shell operators/redirection continue to
work.
- Apply `.claude/settings.local.json` over `.claude/settings.json` for
config, MCP, and plugin migration so local scope matches Claude settings
precedence.
- Skip legacy command markdown without `description` frontmatter,
including README-style docs under `.claude/commands`.

## Root Cause
The previous hook rewrite handled `.claude/hooks/` as a substring
replacement. For absolute source commands, that left the original
project-root prefix before the newly quoted `.codex/hooks` directory,
producing invalid commands like
`project/'project/.codex/hooks'/script.sh`.

The migration also only used project `settings.json` for
config/MCP/plugin decisions, so local settings such as
`disabledMcpjsonServers` could be ignored even though Claude gives local
settings higher precedence than project settings.

## Validation
- `just fmt`
- `cargo test -p codex-external-agent-migration`
- `cargo test -p codex-app-server external_agent_config`
- `just fix -p codex-external-agent-migration`
- `just fix -p codex-app-server`
- `git diff --check`
2026-04-29 00:46:11 -07:00
viyatb-oai
5597925155 feat(cli): add sandbox profile config controls (#20118)
## Why

The explicit profile path from #20117 is meant for standalone testing,
but it still inherited the
shell cwd and all managed requirements implicitly. The pre-existing
launcher path even called out
that it did not support a separate cwd yet in

[`debug_sandbox.rs`](509453f688/codex-rs/cli/src/debug_sandbox.rs (L174-L179)).

For a standalone command, the useful default is to let the caller choose
the project directory being
tested and to avoid administrator-provided constraints unless the caller
explicitly wants to test
those too.

## What changed

- Add explicit-profile-only `-C/--cd DIR`, and use that cwd for both
profile resolution and command
  execution.
- Add explicit-profile-only `--include-managed-config`.
- Make explicit profile mode skip managed requirement sources by
default, including cloud
requirements, MDM requirements, `/etc/codex/requirements.toml`, and the
legacy managed-config
  requirements projection.
- Preserve all existing invocations outside the explicit-profile path.

## Stack

1. #20117 `sandbox-ui-profile`
2. #20118 `sandbox-ui-config` --> this PR

Both PRs are additive. Replay JSON is intentionally deferred to a
follow-up design pass.

## Tests ran

- `cargo test -p codex-cli debug_sandbox`
- `cargo test -p codex-cli sandbox_macos_`
- `cargo test -p codex-core
load_config_layers_can_ignore_managed_requirements`
- `cargo test -p codex-core
load_config_layers_includes_cloud_requirements`
- macOS branch-binary smoke on the rebased top of stack: `-C` changed
execution cwd, explicit
profile mode omitted managed proxy env under `env -i`, and
`--include-managed-config` restored it.
- Linux devbox branch-binary smoke on the rebased top of stack: `-C`
changed execution cwd for
  built-in and user-defined explicit profiles.
2026-04-29 06:55:51 +00:00
Andrey Mishchenko
857146b328 Delete multi_agent_v2 followup_task interrupt parameter (#20139)
Messages sent with `followup_task` already arrive at their target
recipient promptly (at message boundaries while sampling, or after the
pending tool call completes) -- having `interrupt` is not worth the
added complexity.
2026-04-28 23:19:48 -07:00
viyatb-oai
6ed0440611 feat(cli): add explicit sandbox permission profiles (#20117)
## Why

`codex sandbox` is useful for exercising sandbox behavior directly, but
before this stack the CLI
only picked up permission profiles indirectly from the active config.
The existing debug-sandbox path
already compiled `[permissions]` profiles through normal config loading,
as covered by the existing
profile tests in
[`debug_sandbox.rs`](de2ccf9473/codex-rs/cli/src/debug_sandbox.rs (L715-L760)).

This adds the smallest stable entry point first: an explicit profile
selector that reuses the same
config machinery as normal Codex config, so standalone testing becomes
possible without changing
current no-selector behavior.

## What changed

- Add additive `--permissions-profile NAME` support to `codex sandbox
macos|linux|windows`.
- Resolve built-in and user-defined profile names by feeding
`default_permissions` through the
existing config compilation path instead of inventing a sandbox-only
parser.
- Make an explicit selector win over an ambient active profile's legacy
`sandbox_mode`.
- Keep the existing no-selector behavior unchanged.

## Stack

1. #20117 `sandbox-ui-profile` --> this PR
2. #20118 `sandbox-ui-config`

Both PRs are additive. Replay JSON is intentionally deferred to a
follow-up design pass.

## Tests ran

- `cargo test -p codex-cli debug_sandbox`
- `cargo test -p codex-cli sandbox_macos_parses_permissions_profile`
- `cargo test -p codex-core
cli_override_takes_precedence_over_profile_sandbox_mode`
- macOS branch-binary smoke on the rebased top of stack: built-in
`:workspace` and user-defined
  profiles both executed successfully through `--permissions-profile`.
- Linux devbox branch-binary smoke on the rebased top of stack: built-in
`:workspace` and
user-defined profiles both executed successfully through
`--permissions-profile`.
2026-04-29 06:18:16 +00:00
Dylan Hurd
3d10ba9f36 chore(cli) deprecate --full-auto (#20133)
## Summary
Starts the process of getting rid of `--full-auto`, with some
concessions:
1. Fully removes the command from the tui, since it just resolves to the
default permissions there, and encourages users to use the one-time
trust flow if they're not in a trusted repo.
2. Marks the command as deprecated in `codex exec`, in case users are
actively relying on this. We'll remove in an upcoming n+X release.
3. Cleans up some of the `codex sandbox` cli logic, to keep supporting
legacy sandbox policies for now.

This isn't the cleanest setup, but I think it is worthwhile to warn
users for one release before hard-removing it.

## Testing 
- [x] Updated unit tests
2026-04-29 04:41:30 +00:00
starr-openai
e1ec9e63a0 Add environment provider snapshot (#20058)
## Summary
- Change `EnvironmentProvider` to return concrete `Environment`
instances instead of `EnvironmentConfigurations`.
- Make `DefaultEnvironmentProvider` provide the provider-visible `local`
environment plus optional `remote` environment from
`CODEX_EXEC_SERVER_URL`.
- Keep `EnvironmentManager` as the concrete cache while exposing its own
explicit local environment for `local_environment()` fallback paths.

## Validation
- `just fmt`
- `git diff --check`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-28 20:05:18 -07:00
xl-openai
6f328d5e02 Soften skill description budget warnings (#20112)
Updates skill description budget messaging to be less alarming
2026-04-28 19:56:25 -07:00
Michael Bolin
e6db1a9442 linux-sandbox: switch helper plumbing to PermissionProfile (#20106)
## Why

`PermissionProfile` is the canonical runtime permission model in the
Rust workspace, but the Linux sandbox helper still accepted a legacy
`SandboxPolicy` plus separate filesystem and network policy flags. That
translation layer made the helper interface harder to reason about and
left `linux-sandbox`-specific callers and tests coupled to the legacy
policy representation.

This change moves the helper onto `PermissionProfile` directly so the
Linux sandbox plumbing matches the rest of the permission stack.

## What changed

- changed `codex-linux-sandbox` to accept `--permission-profile` and
derive the runtime filesystem and network policies internally
- updated the in-process seccomp and legacy Landlock path in
`codex-rs/linux-sandbox` to operate on `PermissionProfile`
- updated Linux sandbox argv construction in `codex-rs/sandboxing`,
`codex-rs/core`, and the CLI debug sandbox path to pass the canonical
profile instead of serializing compatibility policy projections
- simplified the Linux sandbox tests to build the exact permission
profile under test, including the managed-proxy path and
direct-runtime-enforcement carveout coverage
- removed helper-local `SandboxPolicy` usage from `bwrap` tests where
`FileSystemSandboxPolicy` is already the value being exercised

## Testing

- `cargo test -p codex-sandboxing`
- `cargo test -p codex-linux-sandbox` (on this macOS host, the crate
compiled cleanly and its Linux-only tests were cfg-gated)
- `cargo test -p codex-core --no-run`
- `cargo test -p codex-cli --no-run`
2026-04-28 19:43:44 -07:00
Celia Chen
80fb0704ee feat: update Bedrock Mantle endpoint and GPT-5.4 model ID (#20109)
## Summary

Amazon Bedrock Mantle's OpenAI-compatible endpoint now lives under
`/openai/v1`, and the GPT-5.4 Mantle model ID no longer uses the `-cmb`
suffix. This updates Codex's built-in Bedrock provider configuration so
generated providers and the static Bedrock catalog use the current
endpoint and model ID.

## Changes

- Update the Bedrock Mantle base URL from
`https://bedrock-mantle.{region}.api.aws/v1` to
`https://bedrock-mantle.{region}.api.aws/openai/v1`.
- Update the Amazon Bedrock default base URL in
`codex-model-provider-info`.
- Change the Bedrock GPT-5.4 catalog slug from `openai.gpt-5.4-cmb` to
`openai.gpt-5.4`.
- Align provider and catalog tests with the new URL and model ID.

## Test Plan

- Manual smoke test:

  ```shell
  target/debug/codex \
      -m openai.gpt-5.4 \
      -c 'model_provider="amazon-bedrock"' \
      -c 'model_providers.amazon-bedrock.aws.region="us-west-2"'
  ```
2026-04-29 01:37:21 +00:00
Celia Chen
8c47e36504 feat: expose provider capability bounds to app server clients (#20049)
follow up of #19442. The app server now exposes provider-derived bounds
through a new v2 `modelProvider/read` method. The response reports the
configured provider map key as `modelProvider` and returns the effective
capability booleans so clients can align their UI with the same
provider-owned limits used by core.
2026-04-29 01:36:19 +00:00
canvrno-oai
4c39ad33cb Fix plugin list workspace settings test isolation (#20086)
Fixes test that often fails locally when running `cargo test`
- Add an app-server test helper that combines managed-config isolation
with custom env overrides.
- Isolate `HOME` / `USERPROFILE` in plugin-list workspace settings tests
so host home marketplaces do not affect results.
2026-04-28 18:34:38 -07:00
canvrno-oai
24be9ac0a4 Restore TUI working status after steer message is set (#19939)
Fix for #19925

Restore the `Working` indicator after a streamed final answer finishes
when a user steer message is sent.
Add regression coverage for long output plus a mid-stream steer:
`cargo test -p codex-tui
final_answer_completion_restores_status_indicator_for_pending_steer`

Duplication/testing steps:
1. Start a new thread and ask for a long response.
2. While the response is streaming, submit a steer message.
3. When the first response finishes, observe whether `Working...` is
shown while waiting for the steer message response.
2026-04-28 18:10:40 -07:00
Michael Bolin
c9f7c88f3d fix: restore live event submit path for apply patch tests (#20108)
## Summary

This fixes the CI regression introduced by
[#20040](https://github.com/openai/codex/pull/20040).

That PR migrated several `apply_patch_cli` tests from direct
`codex.submit(Op::UserTurn { ... })` calls to `harness.submit(...)`.
`harness.submit()` waits for `TurnComplete` before returning, which
drains the same event stream that these tests use to assert `TurnDiff`,
`PatchApplyUpdated`, and related live events. The regressed tests then
timed out waiting for events that had already been consumed.

This change restores a no-wait submit path for the event-observing
`apply_patch_cli` tests so they can watch the turn stream directly
again.

## What Changed

- added a local `submit_without_wait(...)` helper in
`codex-rs/core/tests/suite/apply_patch_cli.rs`
- switched the `apply_patch_cli` tests that assert live turn events back
to that helper
- left the profile-backed `harness.submit(...)` migration in place for
tests that only care about final filesystem or tool output state

## Why macOS Looked Green

In the failing run
[25084487331](https://github.com/openai/codex/actions/runs/25084487331),
`//codex-rs/core:core-all-test` was cached on macOS, so the regressed
tests were not rerun there. The Linux GNU, Linux MUSL, and Windows Bazel
jobs reran the target and exposed the failure.

## Verification

- `cargo test -p codex-core apply_patch_ -- --nocapture`
- previously failing local cases now pass again:
  - `apply_patch_cli_move_without_content_change_has_no_turn_diff`
  - `apply_patch_turn_diff_for_rename_with_content_change`
  - `apply_patch_aggregates_diff_across_multiple_tool_calls`
2026-04-28 18:09:20 -07:00
Celia Chen
f8fe96d548 feat: disable capabilities by model provider (#19442)
## Why

Unsupported features must fail closed and Codex must not expose
OpenAI-hosted fallback paths when the active provider cannot support
them. In practice, Bedrock should not surface app connectors, MCP
servers, tool search/suggestions, image generation, web search, or JS
REPL until those paths are explicitly supported for that provider.

This PR moves that decision into provider-owned capability metadata
instead of scattering Bedrock-specific checks across callers.

## What changed

- Adds `ProviderCapabilities` to `codex-model-provider`, with default
support for existing providers and a Bedrock override that disables
unsupported launch surfaces.
- Adds `ToolCapabilityBounds` to `codex-tools` so provider capability
limits can clamp otherwise-enabled tool config.
- Applies capability bounds when building session and review-thread tool
config.
- Routes MCP/app connector configuration through
`McpManager::mcp_config`, which filters configured MCP servers and app
connectors based on the active provider.
- Updates app-server MCP list/read paths to use the filtered MCP config.
- Adds coverage for default provider capabilities, Bedrock disabled
capabilities, and optional tool-surface clamping.

## Testing

built locally and verified that bedrock responses api now return without
errors calling unsupported tools.
2026-04-28 17:51:30 -07:00
alexsong-oai
cb8b1bbcd6 Support detect and import MCP, Subagents, hooks, commands from external (#19949)
## Why
This PR expands the migration path so Codex can detect and import MCP
server config, hooks, commands, and subagents configs in a Codex-native
shape.

## What changed

- Added a `codex-external-agent-migration` crate that owns conversion
logic for external-agent MCP servers, hooks, commands, and subagents.
- Extended the app-server external-agent config detection/import API
with migration item types for MCP server config, hooks, commands, and
subagents.

## Migration strategy

The migration is intentionally conservative: Codex only imports
external-agent config that can be represented safely in Codex today.
Unsupported or ambiguous config is skipped instead of being partially
translated into behavior that may not match the source system.

- **MCP servers**: import supported stdio and HTTP MCP server
definitions into `mcp_servers`. Disabled servers and servers filtered
out by source `enabledMcpjsonServers` / `disabledMcpjsonServers` are
skipped. Project-scoped MCP entries from `.claude.json` are included
when they match the repo path.
- **Hooks**: import only supported command hooks into
`.codex/hooks.json`. Unsupported hook features such as conditional
groups, async handlers, prompt/http hooks, or unknown fields are
skipped. Referenced hook scripts are copied into `.codex/hooks/`,
preserving any existing target scripts.
- **Commands**: import supported external commands as Codex skills under
`.agents/skills/source-command-*`. Commands that rely on source runtime
expansion such as `$ARGUMENTS`, `$1`, `@file` references, shell
interpolation, or colliding generated names are skipped.
- **Subagents**: import valid subagent Markdown files into
`.codex/agents/*.toml` when they have the minimum Codex agent fields.
Source model names are not migrated, so imported agents keep the user’s
Codex default model; compatible reasoning effort and sandbox mode are
migrated when present.
- **Skills and project guidance**: copy missing skill directories into
`.agents/skills` and migrate `CLAUDE.md` guidance into `AGENTS.md`,
rewriting source-agent terminology to Codex terminology where
appropriate.
- **Detection details**: detected migration items include lightweight
details for UI preview, such as MCP server names, hook event names,
generated command skill names, and subagent names. Import still
recomputes from disk instead of trusting details as the source of truth.

- Adds focused coverage for the new migration behavior and app-server
import flow.

## Verification

- `cargo test -p codex-external-agent-migration`
- `cargo test -p codex-hooks`
- `cargo test -p codex-app-server external_agent_config`
- `just bazel-lock-check`
2026-04-29 00:45:24 +00:00
Matthew Zeng
ebdf3a878c Support disabling tool suggest for specific tools. (#20072)
## Summary
- Add `disable_tool_suggest` to app and plugin config, schema, and
TypeScript output
- Exclude disabled connectors and plugins from tool suggestion discovery
- Persist "never show again" tool-suggestion choices back into
`config.toml`
- Update config docs and add coverage for connector and plugin
suppression

## Testing
- Added and updated unit tests for config persistence and tool-suggest
filtering
- Not run (not requested)
2026-04-29 00:19:34 +00:00
Michael Bolin
1211a90a35 core tests: migrate hook turns to profiles (#20041)
## Summary
- Removes `SandboxPolicy` from the hooks test suite.
- Submits hook-related turns with explicit `PermissionProfile` values
for disabled, read-only, and workspace-write cases.
- Preserves the managed-network hook test by configuring and submitting
a workspace-write profile with enabled network, allowing the existing
requirements-backed proxy path to remain covered.

## Verification
- `cargo check -p codex-core --tests`
- `just fmt`
2026-04-28 17:18:45 -07:00
Michael Bolin
1fed948c66 core tests: migrate apply patch turns to profiles (#20040)
## Summary
- Removes `SandboxPolicy` from the apply-patch CLI test suite.
- Uses the harness' profile-backed submit helper for danger/no-sandbox
turns instead of constructing `Op::UserTurn` manually with legacy
fields.
- Converts the workspace-write traversal cases to submit
`PermissionProfile::workspace_write_with(...)` directly.

## Verification
- `cargo check -p codex-core --tests`
- `just fmt`
2026-04-28 17:18:19 -07:00
Michael Bolin
1dae5788e1 core tests: migrate rmcp turns to profiles (#20037)
## Summary
- Removes `SandboxPolicy` from the RMCP client test suite.
- Adds shared read-only user-turn helpers that submit
`PermissionProfile::read_only()` plus the legacy compatibility
projection required by the current `Op::UserTurn` shape.
- Keeps sandbox metadata assertions intact by deriving the expected
legacy `sandboxPolicy` value from the same read-only profile used for
the turn.

## Verification
- `cargo check -p codex-core --tests`
- `just fmt`
2026-04-28 17:17:47 -07:00
Michael Bolin
6662c0f312 core tests: migrate compact turns to profiles (#20035)
## Summary
- Removes the remaining `SandboxPolicy` usage from the compaction test
suite.
- Adds a small local helper for direct `Op::UserTurn` construction so
these tests send `PermissionProfile::Disabled` plus the legacy
compatibility projection required by the protocol field.
- Keeps the existing danger/full-access behavior while exercising the
canonical permission profile path.

## Verification
- `cargo check -p codex-core --tests`
- `just fmt`
2026-04-28 17:17:12 -07:00
Michael Bolin
026df712cc core tests: migrate zsh-fork permissions to profiles (#20034)
## Summary
- Updates the zsh-fork test helper to configure `PermissionProfile`
directly instead of constructing a legacy `SandboxPolicy`.
- Sends permission-profile-backed turns from the skill approval zsh-fork
tests so the runtime and request path exercise the canonical permissions
model.
- Leaves the broader approvals suite on legacy policies for now, except
for the zsh-fork test that shares this helper.

## Verification
- `cargo check -p codex-core --tests`
- `just fmt`
2026-04-28 17:15:58 -07:00
Michael Bolin
1ea90410e1 core tests: migrate request permissions tool turns to profiles (#20033)
## Summary

This migrates the macOS request-permissions tool tests from legacy
`SandboxPolicy` setup to `PermissionProfile` setup. The tests still
exercise the same workspace-write baseline and request-permission
grants, but the canonical permissions value is now the profile.

## Changes

- Replaces the `workspace_write_excluding_tmp()` helper with a
`PermissionProfile::workspace_write_with()` helper.
- Applies test config through `Permissions::set_permission_profile()`.
- Uses `turn_permission_fields()` for `Op::UserTurn` compatibility
fields.
- Removes the `SandboxPolicy` import from `request_permissions_tool.rs`.

## Verification

- `cargo check -p codex-core --tests`
2026-04-28 17:15:13 -07:00
Michael Bolin
af39e488bc core tests: migrate prompt caching turns to profiles (#20032)
## Summary

This removes the explicit `SandboxPolicy` constructors from
`core/tests/suite/prompt_caching.rs`. The tests still exercise the same
prompt-cache invariants across permission and turn-context changes, but
the permission source is now `PermissionProfile`.

## Changes

- Uses `PermissionProfile::workspace_write_with()` for workspace-write
override scenarios.
- Uses `PermissionProfile::Disabled` for the no-sandbox per-turn
override.
- Projects profiles through `turn_permission_fields()` or
`to_legacy_sandbox_policy()` only to populate compatibility fields on
existing ops.
- Removes the `SandboxPolicy` import from `prompt_caching.rs`.

## Verification

- `cargo check -p codex-core --tests`
2026-04-28 17:13:53 -07:00
Michael Bolin
5d08315c00 core tests: migrate exec policy turns to profiles (#20030)
## Summary

This migrates `core/tests/suite/exec_policy.rs` away from legacy
`SandboxPolicy` turn construction. These tests all use no-sandbox turns
to exercise exec-policy behavior, so `PermissionProfile::Disabled` is
the canonical representation.

## Changes

- Replaces direct `SandboxPolicy::DangerFullAccess` turn fields with
`PermissionProfile::Disabled`.
- Uses `turn_permission_fields()` to populate the compatibility
`sandbox_policy` field required by `Op::UserTurn`.
- Removes the `SandboxPolicy` import from `exec_policy.rs`.

## Verification

- `cargo check -p codex-core --tests`
2026-04-28 17:12:48 -07:00
Michael Bolin
b599849d86 core tests: migrate permissions message tests to profiles (#20028)
## Summary

This removes another test-only `SandboxPolicy` dependency by configuring
`permissions_messages.rs` with a `PermissionProfile` directly. The test
still verifies the rendered compatibility permissions text, but now
obtains the legacy projection from the loaded `Config` rather than using
`SandboxPolicy` as the source of truth.

## Changes

- Builds the workspace-write test setup with
`PermissionProfile::workspace_write_with()`.
- Applies that profile through `Permissions::set_permission_profile()`.
- Uses `Config::legacy_sandbox_policy()` only for the expected
`PermissionsInstructions` compatibility rendering.

## Verification

- `cargo check -p codex-core --tests`
2026-04-28 17:12:10 -07:00
Michael Bolin
3ef09c71d3 core tests: migrate tools tests to permission profiles (#20027)
## Summary

This continues the test-side migration away from `SandboxPolicy` by
removing the remaining legacy policy setup in
`core/tests/suite/tools.rs`. The affected test was already modeling a
profile-backed filesystem policy with a deny-read glob, so configuring
the test through `Permissions::set_permission_profile()` is a better
match for the behavior being exercised.

## Changes

- Drops the `SandboxPolicy` import from `core/tests/suite/tools.rs`.
- Configures the glob deny-read shell test directly with a
`PermissionProfile` instead of creating a legacy read-only policy first.
- Submits the test turn with the session permission profile so the
deny-read glob remains active for the command under test.

## Verification

- `cargo check -p codex-core --tests`
2026-04-28 17:11:43 -07:00
Michael Bolin
8d3992d830 core tests: migrate plan item turns to profiles (#20026)
## Why

The core item tests still had a cluster of plan-mode `Op::UserTurn`
literals that used `SandboxPolicy::DangerFullAccess` and omitted
`permission_profile`. These tests are validating emitted item lifecycle
events, so keeping them on the legacy sandbox-only turn shape adds noise
to the broader permissions migration without testing legacy behavior.

## What Changed

- Adds a local `disabled_plan_turn()` helper that preserves the existing
`std::env::current_dir()` turn cwd behavior.
- Uses `turn_permission_fields(PermissionProfile::Disabled, cwd)` to
populate both the compatibility `sandbox_policy` and canonical
`permission_profile` fields.
- Replaces the plan-mode hand-built turns in
`codex-rs/core/tests/suite/items.rs`, removing all `SandboxPolicy`
references from that file and reducing remaining `codex-rs/core/tests`
`SandboxPolicy` files from 16 to 15.

## Verification

- `cargo check -p codex-core --tests`
2026-04-28 17:11:17 -07:00
Michael Bolin
162f4e3183 core tests: migrate safety check turns to profiles (#20024)
## Why

This stack is retiring direct `SandboxPolicy` construction from tests so
core coverage exercises the same `PermissionProfile` turn path used by
runtime code. `safety_check_downgrade.rs` still submitted each test turn
as `SandboxPolicy::DangerFullAccess` with no permission profile, even
though the tests are about model verification/reroute behavior rather
than legacy sandbox conversion.

## What Changed

- Adds a local `disabled_text_turn()` helper that derives both the
compatibility `sandbox_policy` and canonical `permission_profile` from
`PermissionProfile::Disabled`.
- Replaces repeated hand-built `Op::UserTurn` literals in
`codex-rs/core/tests/suite/safety_check_downgrade.rs` with that helper.
- Removes all `SandboxPolicy` references from the safety-check suite,
reducing the remaining `codex-rs/core/tests` files that mention
`SandboxPolicy` from 17 to 16.

## Verification

- `cargo check -p codex-core --tests`
2026-04-28 17:10:42 -07:00
Michael Bolin
2a8ce9b319 core tests: migrate view image turns to profiles (#20021)
## Why

This stack is removing direct `SandboxPolicy` usage from test code so
new tests exercise the same `PermissionProfile` path that runtime code
now treats as canonical. `view_image.rs` still built `Op::UserTurn`
requests with `SandboxPolicy::DangerFullAccess` and no permission
profile, which kept another core test module on the legacy turn shape.

## What Changed

- Adds a small `disabled_user_turn()` helper for the view-image suite
that derives the compatibility `sandbox_policy` and canonical
`permission_profile` from `PermissionProfile::Disabled`.
- Replaces repeated direct `Op::UserTurn` literals in
`codex-rs/core/tests/suite/view_image.rs` with that helper.
- Removes all `SandboxPolicy` references from `view_image.rs`, reducing
the remaining `codex-rs/core/tests` files that mention `SandboxPolicy`
from 18 to 17.

## Verification

- `cargo check -p codex-core --tests`
2026-04-28 17:09:48 -07:00
Michael Bolin
d77d23da2e core tests: migrate model/personality turns to profiles (#20018)
## Summary

- Migrates `model_switching.rs` and `personality.rs` direct
`Op::UserTurn` construction from legacy `SandboxPolicy` literals to
`PermissionProfile`-backed turn fields.
- Adds small local helpers in each file so tests keep asserting
model/personality behavior without repeating permission plumbing.
- Reduces `rg -l '\bSandboxPolicy\b' codex-rs/core/tests` from 20 files
to 18; `codex-rs/tui` remains at zero `SandboxPolicy` references.

## Testing

- `cargo check -p codex-core --tests`
- `just fmt`
2026-04-28 17:09:12 -07:00
Abhinav
5b0d9df1d0 Increase plugin hook env test timeout (#20100)
# Why

`plugin_hook_sources_run_with_plugin_env_and_plugin_source` can still
fail on Windows after the earlier file-based assertion cleanup because
the hook process itself occasionally exceeds the old 5s timeout under CI
load. When that happens, the hook run ends as `Failed` before the test
can inspect its structured output.

The Windows Bazel failure showed the hook run itself failing after
nearly 8 seconds:

```text
---- engine::tests::plugin_hook_sources_run_with_plugin_env_and_plugin_source stdout ----
thread 'engine::tests::plugin_hook_sources_run_with_plugin_env_and_plugin_source' panicked at hooks/src\engine\mod_tests.rs:428:5:
assertion failed: `(left == right)`
Diff < left / right > :
<Failed
>Completed
...
test result: FAILED. 78 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 7.96s
```

# What

- raise the flaky plugin hook env test timeout from 5s to 10s so it
matches the other executed hook tests in this module

# Validation

- `cargo test -p codex-hooks`
2026-04-28 17:08:12 -07:00
Michael Bolin
d6d79ffcc7 core tests: send model turns with permission profiles (#20016)
## Summary
- Migrate direct `Op::UserTurn` construction in remote-model tests from
legacy `SandboxPolicy::DangerFullAccess` to
`PermissionProfile::Disabled` via `turn_permission_fields()`.
- Migrate the Responses API proxy header helper from an inline
workspace-write `SandboxPolicy` to
`PermissionProfile::workspace_write()`.
- Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 22
files after #20015 to 20 files.

## Testing
- `cargo check -p codex-core --tests`
- `just fmt`





























---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/20016).
* #20041
* #20040
* #20037
* #20035
* #20034
* #20033
* #20032
* #20030
* #20028
* #20027
* #20026
* #20024
* #20021
* #20018
* __->__ #20016
2026-04-28 17:08:04 -07:00
Michael Bolin
158b2a4201 core tests: configure profiles directly (#20015)
## Summary
- Replace legacy sandbox config setup in delegate and telemetry tests
with direct `PermissionProfile` configuration.
- Move no-sandbox and read-only test turns in `tools.rs`,
`code_mode.rs`, `user_shell_cmd.rs`, and `model_visible_layout.rs` from
legacy `SandboxPolicy` values to `PermissionProfile` helpers, while
leaving the deny-glob read-only compatibility case for a later targeted
cleanup.
- Use `PermissionProfile::read_only()` where tests need managed
read-only behavior and `PermissionProfile::Disabled` where they
intentionally need no sandbox.
- Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 27
files after #20013 to 22 files.

## Testing
- `cargo check -p codex-core --tests`
- `just fmt`
2026-04-28 17:06:59 -07:00
Michael Bolin
52e79ee49a core tests: migrate more turns to permission profiles (#20013)
## Summary
- Migrate another batch of direct `Op::UserTurn` test construction from
legacy `SandboxPolicy` values to `PermissionProfile` inputs via
`turn_permission_fields()`.
- Replace a one-off read-only `SandboxPolicy` bridge in the macOS exec
test with `PermissionProfile::read_only()`.
- Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 32
files at the start of the cleanup stack to 27 files.

## Testing
- `cargo check -p codex-core --tests`
- `just fmt`
- `just fix -p codex-core`
2026-04-28 17:05:53 -07:00
Michael Bolin
7d15936e69 core tests: build user turns from permission profiles (#20011)
## Summary
- Add `turn_permission_fields()` so tests that construct `Op::UserTurn`
directly can provide a canonical `PermissionProfile` while still filling
the required legacy `sandbox_policy` compatibility field.
- Migrate direct user-turn construction in core integration tests from
`SandboxPolicy::DangerFullAccess` to `PermissionProfile::Disabled`.
- Continue reducing direct `SandboxPolicy` usage in
`codex-rs/core/tests`, from 41 files after #20010 to 32 files in this
PR.

## Testing
- `cargo check -p codex-core --tests`
- `just fmt`
- `just fix -p core_test_support`
- `just fix -p codex-core`
2026-04-28 17:03:20 -07:00
Eric Traut
2223b31c06 Refine Codex issue digest summaries (#20097)
## Why

The `codex-issue-digest` skill was producing more detail than the daily
digest needed, and broad all-area digests could miss active issues. In
particular, issue #16088 had substantial recent comments and reactions
but did not appear in the weekly all-areas output because GitHub search
was using default relevance ranking and the collector could exhaust its
candidate cap before later search queries got a fair sample.

That made the digest look quieter than the underlying user activity and
made threshold tuning misleading.

## What changed

- Make the digest summary headline-first and summary-only by default.
- Add an explicit opt-in flow for `## Details`, so the issue table is
shown only when requested or when the prompt asks for details upfront.
- Update the collector to request GitHub issue search results with
`sort=updated` and `order=desc`.
- Apply the search candidate cap per query instead of globally across
all queries.
- Bump the collector script version to `3`.
- Add tests that cover updated sorting and per-query candidate limits.

## Verification

- `pytest
.codex/skills/codex-issue-digest/scripts/test_collect_issue_digest.py`
- `ruff check
.codex/skills/codex-issue-digest/scripts/collect_issue_digest.py
.codex/skills/codex-issue-digest/scripts/test_collect_issue_digest.py`
- `git diff --check`
- Reran the all-areas weekly collector and confirmed #16088 is now
included with `55` interactions.
2026-04-28 16:53:59 -07:00
Ruslan Nigmatullin
c6465c1ec2 app-server: notify clients of remote-control status changes (#19919)
## Why

Remote-control app-server enrollments have both an internal server id
and the environment id exposed to remote-control clients. App-server
clients need one current status snapshot that says whether remote
control is usable and which environment id, if any, is exposed.

A temporary websocket disconnect is not itself an identity change.
Account changes, stale enrollment invalidation, successful
re-enrollment, and missing ChatGPT auth are meaningful status changes.
Disabled remote control remains `disabled` regardless of auth or SQLite
state. SQLite startup failure disablement and enrollment persistence
failures are handled in #20068; this PR reports the resulting effective
status to clients.

## What changed

- Adds v2 `remoteControl/status/changed` carrying `state` and
`environmentId`.
- Adds `RemoteControlConnectionState` values: `disabled`, `connecting`,
`connected`, and `errored`.
- Exposes remote-control status updates through `RemoteControlHandle`
using a Tokio watch channel.
- Always sends the current remote-control status snapshot to newly
initialized app-server clients.
- Broadcasts status changes to initialized app-server clients when state
or environment id changes.
- Treats missing ChatGPT auth as an `errored` status while leaving it
retryable because auth can change at runtime.
- Clears `environmentId` when enrollment is cleared for account changes,
auth loss, stale backend invalidation, or disabled remote control.
- Updates app-server protocol schema fixtures, generated TypeScript,
app-server README, remote-control tests, and TUI exhaustive notification
matches.

## Stack

- Builds on #20068.

## Verification

- `just write-app-server-schema`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server transport::remote_control --lib`
- `cargo check -p codex-tui`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-app-server`
- `just fix -p codex-tui`
2026-04-28 23:52:14 +00:00
Gabriel Peal
5e6cbbadf7 Return None when auth refresh fails (#20092)
Right now, if Codex winds up in a state with auth but it can't refresh
the token, the user is left with an unhelpful message that says to log
out and log back in again.

Ultimately, we should prevent that from happening but if it does,
returning None will allow the caller to redirect the user back to the
login page
2026-04-28 16:15:47 -07:00
Michael Bolin
891722849d core tests: submit turns with permission profiles (#20010)
## Summary

- Add `PermissionProfile`-based turn submission helpers to
`core_test_support`, while keeping the legacy `SandboxPolicy` helper for
tests that intentionally exercise legacy fallback behavior.
- Switch the default `TestCodex::submit_turn()` path to send a real
`PermissionProfile` plus the required legacy compatibility projection in
`Op::UserTurn`.
- Migrate straightforward app/search/shell/truncation tests from
`SandboxPolicy::{DangerFullAccess, ReadOnly}` to
`PermissionProfile::{Disabled, read_only}`.
- Add a TUI compatibility projection helper for legacy app-server fields
so non-legacy writable roots are preserved instead of being downgraded
to read-only.
- Fix remote start/resume/fork sandbox-mode projection to classify any
managed profile with writable roots as workspace-write, not only
profiles that can write `cwd`.
- Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 47
files to 41 files without changing production behavior.

## Testing

- `cargo check -p codex-core --tests`
- `cargo test -p codex-tui
compatibility_profile_preserves_unbridgeable_write_roots`
- `cargo test -p codex-tui
sandbox_mode_preserves_non_cwd_write_roots_for_remote_sessions`
- `just fmt`
- `just fix -p core_test_support`
- `just fix -p codex-core`
2026-04-28 23:01:40 +00:00
viyatb-oai
2dbde94aa9 fix(network-proxy): normalize network proxy host matching (#19995)
## Why
The proxy matches allow and deny rules against normalized host strings.
Scoped IPv6 literals can arrive in equivalent forms, such as
`fd00::1%eth0`, `[fd00::1%eth0]`, or `[fd00::1%25eth0]`. Policy should
canonicalize those spellings without erasing scope granularity: an
unscoped rule like `fd00::1` should still cover scoped requests for that
address, while a scoped rule like `fd00::1%eth0` should remain exact to
that scope.

## What changed
- preserve IPv6 scope IDs during host normalization and canonicalize
`%25scope` to `%scope`
- match policy against the exact normalized host plus the unscoped IP
base for scoped literals
- keep local-address explicit allow checks aligned with the same
scoped/unscoped semantics
- add focused coverage for scoped IPv6 normalization, scoped allow
rules, and scoped deny rules in `network-proxy`

## Security impact
A request cannot bypass a broad deny rule by adding an IPv6 scope
suffix. At the same time, scoped policy remains precise:
`deny=fd00::1%eth0` affects that scoped spelling without collapsing
`fd00::1%eth1` onto the same key, and `allow=fe80::1%eth0` does not
implicitly allow other scopes.

## Verification
- `just fmt`
- `cargo test -p codex-network-proxy`
- `just fix -p codex-network-proxy`
- `git diff --check`

---------

Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: evawong-oai <evawong@openai.com>
2026-04-28 15:50:00 -07:00
Abhinav
3291463ff1 Fix flaky plugin hook env test (#20088)
The test was flaky because it was checking the right thing in a
roundabout way.

What it wanted to prove:
- plugin hooks receive the right environment variables.

What it actually did:
1. Run a plugin hook.
2. Have that hook write those env vars into a temporary `env.json` file.
3. After the hook finished, read `env.json` back from disk.

On Windows, that last file was sometimes not there when the test tried
to read it, so the test failed with `read env log: file not found`. The
hook system itself was not what the test failure was directly proving;
the test was failing on the extra filesystem side effect it introduced.

The fix is to stop using a temp file as the proof mechanism. The hook
now prints the env values in its normal structured output, and the test
asserts on the output that the hook engine already captures. So we still
verify the same behavior, but without depending on a separate file being
created and read back correctly on Windows.
2026-04-28 15:45:26 -07:00
Owen Lin
2e598df6fc fix: don't auto approve git -C ... (#20085)
It's safer to make sure these commands go through approval flows.
2026-04-28 22:06:55 +00:00
canvrno-oai
66b0781502 /plugins: add marketplace install flow (#18704)
This PR adds a new feature to the `/plugins` menu that gives users the
ability to add new plugin marketplaces. It introduces an Add Marketplace
tab to the right of installed marketplaces, a source prompt, loading and
error states, and the app-server request flow needed to perform the
install. After a successful `marketplace/add`, the popup refreshes back
into the newly added marketplace tab so the new plugins are immediately
visible.

- Add an Add Marketplace tab to the `/plugins` menu
- Prompt for marketplace source input from git repo, URL, or local path
- Show loading and error states during `marketplace/add`
- Refresh plugin data after success and switch into the newly added
marketplace tab
- Add tests and snapshot updates
2026-04-28 14:22:39 -07:00
Abhinav
c6e7d564c3 Discover hooks bundled with plugins (#19705)
## Why

Plugins can bundle lifecycle hooks, but Codex previously only discovered
hooks from user, project, and managed config layers. This adds the
plugin discovery and runtime plumbing needed for plugin-bundled hooks
while keeping execution behind the `plugin_hooks` feature flag.

## What

- Discovers plugin hook sources from each plugin's default
`hooks/hooks.json`.
- Supports `plugin.json` manifest `hooks` entries as either relative
paths or inline hook objects.
- Plumbs discovered plugin hook sources through plugin loading into the
hook runtime when `plugin_hooks` is enabled.
- Marks plugin-originated hook runs as `HookSource::Plugin`.
- Injects `PLUGIN_ROOT` and `CLAUDE_PLUGIN_ROOT` into plugin hook
command environments.
- Updates generated schemas and hook source metadata for the plugin hook
source.

## Stack

1. This PR - openai/codex#19705
2. openai/codex#19778
3. openai/codex#19840
4. openai/codex#19882

## Reviewer Notes

- Core logic is in `codex-rs/core-plugins/src/loader.rs` and
`codex-rs/hooks/src/engine/discovery.rs`
- Moved existing / adding new tests to
`codex-rs/core-plugins/src/loader_tests.rs` hence the large diff there
- Otherwise mostly plumbing and minor schema updates

### Core Changes

The `codex-rs/core` changes are limited to wiring plugin hook support
into existing core flows:

- `core/src/session/session.rs` conditionally pulls effective plugin
hook sources and plugin hook load warnings from `PluginsManager` when
`plugin_hooks` is enabled, then passes them into `HooksConfig`.
- `core/src/hook_runtime.rs` adds the `plugin` metric tag for
`HookSource::Plugin`.
- `core/config.schema.json` picks up the new `plugin_hooks` feature
flag, and `core/src/plugins/manager_tests.rs` updates fixtures for the
added plugin hook fields.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-28 14:17:18 -07:00