Commit Graph

1150 Commits

Author SHA1 Message Date
Andrei Eternal
c98b0d1925 codex: add dynamic tool hook support 2026-05-01 19:39:56 -07:00
Channing Conger
a5fbcf1ab4 Prune unused code-mode globals (#20542)
Hide Atomics, SharedArrayBuffer, and WebAssembly from the code-mode
runtime since the harness does not expose worker support or need those
APIs.
2026-05-01 15:11:22 -07:00
starr-openai
2952beb009 Surface multi-environment choices in environment context (#20646)
## Why
The model needs a way to see which environments are available during a
multi-environment turn without changing the legacy single-environment
prompt surface or pulling replay/persistence changes into the same
review.

## Stack
1. https://github.com/openai/codex/pull/20646 - `EnvironmentContext`
rendering for selected environments (this PR)
2. https://github.com/openai/codex/pull/20669 - selected-environment
ownership and tool config prep
3. https://github.com/openai/codex/pull/20647 - process-tool
`environment_id` routing

## What Changed
- extend `environment_context` so multi-environment turns render an
`<environments>` block with the selected environment ids and cwd values
- keep zero- and single-environment turns on the existing cwd-only
render path
- keep replay and persistence paths on the legacy surface for now so
this PR stays scoped to live prompt rendering
- add focused coverage in
`codex-rs/core/src/context/environment_context_tests.rs`

## Testing
- CI

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-01 22:11:06 +00:00
pakrym-oai
aed74e5ee4 [codex] Emit image view as core item (#20512)
## Why

Image-view results should be represented as a core-produced turn item
instead of being reconstructed by app-server. At the same time, existing
rollout/history paths still understand the legacy `ViewImageToolCall`
event, so this keeps that event as compatibility output generated from
the new item lifecycle.

## What changed

- Added `TurnItem::ImageView` to `codex-protocol`.
- Emitted image-view item start/completion directly from the core
`view_image` handler.
- Kept `ViewImageToolCall` as a legacy event and generate it from
completed `TurnItem::ImageView` items.
- Kept `thread_history.rs` on the legacy `ViewImageToolCall` replay
path, with `ImageView` item lifecycle events ignored there.
- Updated app-server protocol conversion, rollout persistence, and
affected exhaustive event matches for the new item plus legacy fan-out
shape.

## Verification

- `cargo test -p codex-protocol -p codex-app-server-protocol -p
codex-rollout -p codex-rollout-trace -p codex-mcp-server -p
codex-app-server --lib`
- `cargo test -p codex-core --test all
view_image_tool_attaches_local_image`
- `just fix -p codex-protocol -p codex-core -p codex-app-server-protocol
-p codex-app-server -p codex-rollout -p codex-rollout-trace -p
codex-mcp-server`
- `git diff --check`
2026-05-01 11:28:30 -07:00
Abhinav
78baa20780 deprecate legacy notify (#20524)
# Why

`notify` is the remaining compatibility surface from the legacy hook
implementation. The newer lifecycle hook engine now owns the active hook
system, so we should start steering users away from adding new `notify`
configs before removing the old path entirely. This also adds a
lightweight watchpoint for the deprecation so we can see how much legacy
usage remains before the clean drop.

# What

- emit a startup deprecation notice when a non-empty `notify` command is
configured
- emit `codex.notify.configured` when a session starts with legacy
`notify` configured
- emit `codex.notify.run` when the legacy notify path fires after a
completed turn
- mark `notify` as deprecated in the config schema and repo docs
- remove the orphaned `codex-rs/hooks/src/user_notification.rs` file
that is no longer compiled
- add regression coverage for the new deprecation notice

# Next steps

A follow-up PR can remove the legacy notify path entirely once we are
ready for the clean drop. Before then, we can watch
`codex.notify.configured` and `codex.notify.run` to understand the
deprecation impact and remaining active usage. The cleanup PR should
then delete the `notify` config field, the `legacy_notify`
implementation, the old compatibility dispatch types and callsites that
only exist for the legacy path, and the remaining compatibility
docs/tests.

# Testing

- `cargo test -p codex-hooks`
- `cargo test -p codex-config`
- `cargo test -p codex-core emits_deprecation_notice_for_notify`
2026-05-01 17:35:21 +00:00
pakrym-oai
f476338f93 Move apply-patch file changes into turn items (#20540)
## Why

Apply-patch file changes are now part of the core turn item stream, so
v2 clients can consume the same first-class item lifecycle path used by
other turn items instead of relying on app-server-specific remapping
from legacy patch events.

## What changed

- Added a core `TurnItem::FileChange` carrying apply-patch changes and
completion metadata.
- Updated the apply-patch tool emitter to send `ItemStarted` /
`ItemCompleted` with the new `FileChange` item while preserving legacy
`PatchApplyBegin` / `PatchApplyEnd` fan-out.
- Updated app-server v2 conversion to render the new core item directly
and stopped `event_mapping` from remapping old patch begin/end events
into item notifications.
- Kept thread history reconstruction based on the existing old
apply-patch events for rollout compatibility.

## Verification

- `cargo test -p codex-protocol -p codex-app-server-protocol`
- `cargo test -p codex-core --test all
apply_patch_tool_executes_and_emits_patch_events`
- `cargo test -p codex-app-server bespoke_event_handling`
2026-05-01 08:47:18 -07:00
Tom
fe05acad23 Make thread store process-scoped (#19474)
- Build one app-server process ThreadStore from startup config and share
it with ThreadManager and CodexMessageProcessor.
- Remove per-thread/fork store reconstruction so effective thread config
cannot switch the persistence backend.
- Add params to ThreadStore create/resume for specifying thread
metadata, since otherwise the metadata from store creation would be used
(incorrectly).
2026-04-30 21:24:59 -07:00
pakrym-oai
f50c02d7bc [codex] Remove unused event messages (#20511)
## Why

Several legacy `EventMsg` variants were still emitted or mapped even
though clients either ignored them or had moved to item/lifecycle
events. `Op::Undo` had also degraded to an unavailable shim, so this
removes that dead task path instead of preserving a command that cannot
do useful work.

`McpStartupComplete`, `WebSearchBegin`, and `ImageGenerationBegin` are
intentionally kept because useful consumers still depend on them: MCP
startup completion drives readiness behavior, and the begin events let
app-server/core consumers surface in-progress web-search and
image-generation items before the final payload arrives.

## What Changed

- Removed weak legacy event variants and payloads from `codex-protocol`,
including legacy agent deltas, background events, and undo lifecycle
events.
- Kept/restored `EventMsg::McpStartupComplete`,
`EventMsg::WebSearchBegin`, and `EventMsg::ImageGenerationBegin` with
serializer and emission coverage.
- Updated core, rollout, MCP server, app-server thread history,
review/delegate filtering, and tests to rely on the useful replacement
events that remain.
- Removed `Op::Undo`, `UndoTask`, the undo test module, and stale TUI
slash-command comments.
- Stopped agent job/background progress and compaction retry notices
from emitting `BackgroundEvent` payloads.

## Verification

- `cargo check -p codex-protocol -p codex-app-server-protocol -p
codex-core -p codex-rollout -p codex-rollout-trace -p codex-mcp-server`
- `cargo test -p codex-protocol -p codex-app-server-protocol -p
codex-rollout -p codex-rollout-trace -p codex-mcp-server`
- `cargo test -p codex-core --test all suite::items`
- `just fix -p codex-protocol -p codex-app-server-protocol -p codex-core
-p codex-rollout -p codex-rollout-trace -p codex-mcp-server`
- Earlier coverage on this PR also included `codex-mcp`, `codex-tui`,
core library tests, MCP/plugin/delegate/review/agent job tests, and MCP
startup TUI tests.
2026-04-30 20:03:26 -07:00
Dylan Hurd
af089fb21d fix(exec_policy) heredoc parsing file_redirect (#20113)
## Summary
Fixes a regression introduced in #10941 so that heredocs do not permit
file redirects to be approved by rules, and adds scenario tests to cover
this behavior.


Previously, heredoc command parsing would allow redirects and
environment variables:
```bash
# commands_for_exec_policy() would parse this via parse_shell_lc_single_command_prefix
PATH=/tmp/bad:$PATH cat <<'EOF' > /tmp/bad/hello.txt
hello
EOF
```
This conflicts with the Codex Rules documentation; heredoc parsing logic
should abide by the same strictness of parsing.


## Tests
- [x] Updated unit tests accordingly
- [x] Added scenario tests for these cases

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-01 01:05:02 +00:00
iceweasel-oai
4f96001fa7 execpolicy: unwrap PowerShell -Command wrappers on Windows (#20336)
## Why
On Windows, Codex runs shell commands through a top-level
`powershell.exe -NoProfile -Command ...` wrapper. `execpolicy` was
matching that wrapper instead of the inner command, so prefix rules like
`["git", "push"]` did not fire for PowerShell-wrapped commands even
though the same normalization already happens for `bash -lc` on Unix.

This change makes the Windows shell wrapper transparent to rule matching
while preserving the existing Windows unmatched-command safelist and
dangerous-command heuristics.

## What changed
- add `parse_powershell_command_plain_commands()` in
`shell-command/src/powershell.rs` to unwrap the top-level PowerShell
`-Command` body with `extract_powershell_command()` and parse it with
the existing PowerShell AST parser
- update `core/src/exec_policy.rs` so `commands_for_exec_policy()`
treats top-level PowerShell wrappers like `bash -lc` and evaluates rules
against the parsed inner commands
- carry a small `ExecPolicyCommandOrigin` through unmatched-command
evaluation and expose `is_safe_powershell_words()` /
`is_dangerous_powershell_words()` so Windows safelist and
dangerous-command checks still work after unwrap
- add Windows-focused tests for wrapped PowerShell prompt/allow matches,
wrapper parsing, and unmatched safe/dangerous inner commands, and
re-enable the end-to-end `execpolicy_blocks_shell_invocation` test on
Windows

## Testing
- `cargo test -p codex-shell-command`
2026-05-01 00:56:20 +00:00
Abhinav
0d9a5d20ec Alias codex_hooks feature as hooks (#20522)
# Why

The hooks feature flag should use the concise canonical name `hooks`,
while existing configs that still use `codex_hooks` continue to work
during the rename.

# What

- change the canonical `Feature::CodexHooks` key from `codex_hooks` to
`hooks`
- register `codex_hooks` through the existing legacy-alias path
- update the config schema and canonical config fixtures to prefer
`hooks`
- add regression coverage that both `hooks` and `codex_hooks` resolve to
`Feature::CodexHooks`

# Verification

- `cargo test -p codex-features`
- `cargo test -p codex-core config::schema_tests`
- `cargo test -p codex-core
pre_tool_use_blocks_shell_when_defined_in_config_toml`
- `cargo test -p codex-app-server
hooks_list_uses_each_cwds_effective_feature_enablement`
2026-05-01 00:46:33 +00:00
Akshay Nathan
8426edf71e Stateful streaming apply_patch parser 2026-04-30 21:41:15 +00:00
Ahmed Ibrahim
8a97f3cf03 realtime: rename provider session ids (#20361)
## Summary

Codex is repurposing `session` to mean a thread group, so the realtime
provider session id should no longer use `session_id` / `sessionId` in
Codex-facing protocol payloads. This PR renames that provider-specific
field to `realtime_session_id` / `realtimeSessionId` and intentionally
breaks clients that still send the old field names.

## What Changed

- Renamed realtime provider session fields in `ConversationStartParams`,
`RealtimeConversationStartedEvent`, and `RealtimeEvent::SessionUpdated`.
- Renamed app-server v2 realtime request and notification fields to
`realtimeSessionId`.
- Removed legacy serde aliases for `session_id` / `sessionId`; clients
must send the new names.
- Propagated the rename through core realtime startup, app-server
adapters, codex-api websocket handling, and TUI realtime state.
- Regenerated app-server protocol schema/TypeScript outputs and updated
app-server README examples.
- Kept upstream Realtime API concepts unchanged: provider `session.id`
parsing and `x-session-id` headers still use the upstream wire names.

## Testing

- CI is running on the latest pushed commit.
- Earlier local verification on this PR:
  - `cargo test -p codex-protocol`
- `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-core
realtime_conversation`
  - `cargo test -p codex-app-server-protocol`
- `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-app-server
realtime_conversation`
- attempted `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-tui` (local
linker bus error while linking the test binary)

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-30 13:39:48 +03:00
pakrym-oai
fedcefe9da Reduce the surface of collaboration modes (#20149)
Collaboration modes were slightly invasive both into ThreadManager
construction and ModelProvider
2026-04-29 17:22:41 -07:00
Matthew Zeng
8ce48f9968 [tool_suggest] Improve tool_suggest triggering conditions. (#20091)
## Summary
- Tighten `tool_suggest` guidance so it prefers explicit plugin install
requests, while still allowing a connector install when the relevant
plugin is already installed and a needed connector from that plugin is
missing.
- Tell the model not to call `tool_suggest` in parallel with other
tools.

## Testing
- `cargo test -p codex-tools tool_suggest`
- `cargo test -p codex-core tool_suggest`
2026-04-29 13:41:12 -07:00
viyatb-oai
07c8b8c77c fix: handle deferred network proxy denials (#19184)
## Why

This bug is exposed by Guardian/auto-review approvals. With the managed
network proxy enabled, a blocked network request can be reported back
through the network approval service as an approval denial after the
command has already started. Before this change, the shell and unified
exec runtimes registered those network approval calls, but did not have
a way to observe an async proxy denial as a cancellation/failure signal
for the running process.

The result was confusing: Guardian/auto-review could correctly deny
network access, but the command path could keep running or unregister
the approval without surfacing the denial as the command failure.

## What Changed

- `NetworkApprovalService` now attaches a cancellation token to active
and deferred network approvals.
- Proxy-denial outcomes are recorded only for active registrations,
cancel the owning token, and are consumed when the approval is
finalized.
- The shell runtime combines the normal command timeout with the
network-denial cancellation token.
- Unified exec stores the deferred network approval object, terminates
tracked processes when the proxy denial arrives, and returns the denial
as a process failure while polling or completing the process.
- Tool orchestration passes the active network approval cancellation
token into the sandbox attempt and preserves deferred approval errors
instead of silently unregistering them.
- App-server `command/exec` now handles the combined
timeout-or-cancellation expiration variant used by the runtime.

## Verification

- `cargo test -p codex-core network_approval --lib`
- `cargo clippy -p codex-app-server --all-targets -- -D warnings`
- `cargo clippy -p codex-core --all-targets -- -D warnings`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-29 19:13:57 +00:00
pakrym-oai
8356806fc9 Add ThreadManager sample crate (#20141)
Summary:
- Add codex-thread-manager-sample, a one-shot binary that starts a
ThreadManager thread, submits a prompt, and prints the final assistant
output.
- Pass ThreadStore into ThreadManager::new and expose
thread_store_from_config for existing callsites.
- Build the sample Config directly with only --model and prompt inputs.

Verification:
- just fmt
- cargo check -p codex-thread-manager-sample -p codex-app-server -p
codex-mcp-server
- git diff --check

Tests: Not run per request.
2026-04-29 11:21:06 -07:00
starr-openai
e1ec9e63a0 Add environment provider snapshot (#20058)
## Summary
- Change `EnvironmentProvider` to return concrete `Environment`
instances instead of `EnvironmentConfigurations`.
- Make `DefaultEnvironmentProvider` provide the provider-visible `local`
environment plus optional `remote` environment from
`CODEX_EXEC_SERVER_URL`.
- Keep `EnvironmentManager` as the concrete cache while exposing its own
explicit local environment for `local_environment()` fallback paths.

## Validation
- `just fmt`
- `git diff --check`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-28 20:05:18 -07:00
Michael Bolin
c9f7c88f3d fix: restore live event submit path for apply patch tests (#20108)
## Summary

This fixes the CI regression introduced by
[#20040](https://github.com/openai/codex/pull/20040).

That PR migrated several `apply_patch_cli` tests from direct
`codex.submit(Op::UserTurn { ... })` calls to `harness.submit(...)`.
`harness.submit()` waits for `TurnComplete` before returning, which
drains the same event stream that these tests use to assert `TurnDiff`,
`PatchApplyUpdated`, and related live events. The regressed tests then
timed out waiting for events that had already been consumed.

This change restores a no-wait submit path for the event-observing
`apply_patch_cli` tests so they can watch the turn stream directly
again.

## What Changed

- added a local `submit_without_wait(...)` helper in
`codex-rs/core/tests/suite/apply_patch_cli.rs`
- switched the `apply_patch_cli` tests that assert live turn events back
to that helper
- left the profile-backed `harness.submit(...)` migration in place for
tests that only care about final filesystem or tool output state

## Why macOS Looked Green

In the failing run
[25084487331](https://github.com/openai/codex/actions/runs/25084487331),
`//codex-rs/core:core-all-test` was cached on macOS, so the regressed
tests were not rerun there. The Linux GNU, Linux MUSL, and Windows Bazel
jobs reran the target and exposed the failure.

## Verification

- `cargo test -p codex-core apply_patch_ -- --nocapture`
- previously failing local cases now pass again:
  - `apply_patch_cli_move_without_content_change_has_no_turn_diff`
  - `apply_patch_turn_diff_for_rename_with_content_change`
  - `apply_patch_aggregates_diff_across_multiple_tool_calls`
2026-04-28 18:09:20 -07:00
Michael Bolin
1211a90a35 core tests: migrate hook turns to profiles (#20041)
## Summary
- Removes `SandboxPolicy` from the hooks test suite.
- Submits hook-related turns with explicit `PermissionProfile` values
for disabled, read-only, and workspace-write cases.
- Preserves the managed-network hook test by configuring and submitting
a workspace-write profile with enabled network, allowing the existing
requirements-backed proxy path to remain covered.

## Verification
- `cargo check -p codex-core --tests`
- `just fmt`
2026-04-28 17:18:45 -07:00
Michael Bolin
1fed948c66 core tests: migrate apply patch turns to profiles (#20040)
## Summary
- Removes `SandboxPolicy` from the apply-patch CLI test suite.
- Uses the harness' profile-backed submit helper for danger/no-sandbox
turns instead of constructing `Op::UserTurn` manually with legacy
fields.
- Converts the workspace-write traversal cases to submit
`PermissionProfile::workspace_write_with(...)` directly.

## Verification
- `cargo check -p codex-core --tests`
- `just fmt`
2026-04-28 17:18:19 -07:00
Michael Bolin
1dae5788e1 core tests: migrate rmcp turns to profiles (#20037)
## Summary
- Removes `SandboxPolicy` from the RMCP client test suite.
- Adds shared read-only user-turn helpers that submit
`PermissionProfile::read_only()` plus the legacy compatibility
projection required by the current `Op::UserTurn` shape.
- Keeps sandbox metadata assertions intact by deriving the expected
legacy `sandboxPolicy` value from the same read-only profile used for
the turn.

## Verification
- `cargo check -p codex-core --tests`
- `just fmt`
2026-04-28 17:17:47 -07:00
Michael Bolin
6662c0f312 core tests: migrate compact turns to profiles (#20035)
## Summary
- Removes the remaining `SandboxPolicy` usage from the compaction test
suite.
- Adds a small local helper for direct `Op::UserTurn` construction so
these tests send `PermissionProfile::Disabled` plus the legacy
compatibility projection required by the protocol field.
- Keeps the existing danger/full-access behavior while exercising the
canonical permission profile path.

## Verification
- `cargo check -p codex-core --tests`
- `just fmt`
2026-04-28 17:17:12 -07:00
Michael Bolin
026df712cc core tests: migrate zsh-fork permissions to profiles (#20034)
## Summary
- Updates the zsh-fork test helper to configure `PermissionProfile`
directly instead of constructing a legacy `SandboxPolicy`.
- Sends permission-profile-backed turns from the skill approval zsh-fork
tests so the runtime and request path exercise the canonical permissions
model.
- Leaves the broader approvals suite on legacy policies for now, except
for the zsh-fork test that shares this helper.

## Verification
- `cargo check -p codex-core --tests`
- `just fmt`
2026-04-28 17:15:58 -07:00
Michael Bolin
1ea90410e1 core tests: migrate request permissions tool turns to profiles (#20033)
## Summary

This migrates the macOS request-permissions tool tests from legacy
`SandboxPolicy` setup to `PermissionProfile` setup. The tests still
exercise the same workspace-write baseline and request-permission
grants, but the canonical permissions value is now the profile.

## Changes

- Replaces the `workspace_write_excluding_tmp()` helper with a
`PermissionProfile::workspace_write_with()` helper.
- Applies test config through `Permissions::set_permission_profile()`.
- Uses `turn_permission_fields()` for `Op::UserTurn` compatibility
fields.
- Removes the `SandboxPolicy` import from `request_permissions_tool.rs`.

## Verification

- `cargo check -p codex-core --tests`
2026-04-28 17:15:13 -07:00
Michael Bolin
af39e488bc core tests: migrate prompt caching turns to profiles (#20032)
## Summary

This removes the explicit `SandboxPolicy` constructors from
`core/tests/suite/prompt_caching.rs`. The tests still exercise the same
prompt-cache invariants across permission and turn-context changes, but
the permission source is now `PermissionProfile`.

## Changes

- Uses `PermissionProfile::workspace_write_with()` for workspace-write
override scenarios.
- Uses `PermissionProfile::Disabled` for the no-sandbox per-turn
override.
- Projects profiles through `turn_permission_fields()` or
`to_legacy_sandbox_policy()` only to populate compatibility fields on
existing ops.
- Removes the `SandboxPolicy` import from `prompt_caching.rs`.

## Verification

- `cargo check -p codex-core --tests`
2026-04-28 17:13:53 -07:00
Michael Bolin
5d08315c00 core tests: migrate exec policy turns to profiles (#20030)
## Summary

This migrates `core/tests/suite/exec_policy.rs` away from legacy
`SandboxPolicy` turn construction. These tests all use no-sandbox turns
to exercise exec-policy behavior, so `PermissionProfile::Disabled` is
the canonical representation.

## Changes

- Replaces direct `SandboxPolicy::DangerFullAccess` turn fields with
`PermissionProfile::Disabled`.
- Uses `turn_permission_fields()` to populate the compatibility
`sandbox_policy` field required by `Op::UserTurn`.
- Removes the `SandboxPolicy` import from `exec_policy.rs`.

## Verification

- `cargo check -p codex-core --tests`
2026-04-28 17:12:48 -07:00
Michael Bolin
b599849d86 core tests: migrate permissions message tests to profiles (#20028)
## Summary

This removes another test-only `SandboxPolicy` dependency by configuring
`permissions_messages.rs` with a `PermissionProfile` directly. The test
still verifies the rendered compatibility permissions text, but now
obtains the legacy projection from the loaded `Config` rather than using
`SandboxPolicy` as the source of truth.

## Changes

- Builds the workspace-write test setup with
`PermissionProfile::workspace_write_with()`.
- Applies that profile through `Permissions::set_permission_profile()`.
- Uses `Config::legacy_sandbox_policy()` only for the expected
`PermissionsInstructions` compatibility rendering.

## Verification

- `cargo check -p codex-core --tests`
2026-04-28 17:12:10 -07:00
Michael Bolin
3ef09c71d3 core tests: migrate tools tests to permission profiles (#20027)
## Summary

This continues the test-side migration away from `SandboxPolicy` by
removing the remaining legacy policy setup in
`core/tests/suite/tools.rs`. The affected test was already modeling a
profile-backed filesystem policy with a deny-read glob, so configuring
the test through `Permissions::set_permission_profile()` is a better
match for the behavior being exercised.

## Changes

- Drops the `SandboxPolicy` import from `core/tests/suite/tools.rs`.
- Configures the glob deny-read shell test directly with a
`PermissionProfile` instead of creating a legacy read-only policy first.
- Submits the test turn with the session permission profile so the
deny-read glob remains active for the command under test.

## Verification

- `cargo check -p codex-core --tests`
2026-04-28 17:11:43 -07:00
Michael Bolin
8d3992d830 core tests: migrate plan item turns to profiles (#20026)
## Why

The core item tests still had a cluster of plan-mode `Op::UserTurn`
literals that used `SandboxPolicy::DangerFullAccess` and omitted
`permission_profile`. These tests are validating emitted item lifecycle
events, so keeping them on the legacy sandbox-only turn shape adds noise
to the broader permissions migration without testing legacy behavior.

## What Changed

- Adds a local `disabled_plan_turn()` helper that preserves the existing
`std::env::current_dir()` turn cwd behavior.
- Uses `turn_permission_fields(PermissionProfile::Disabled, cwd)` to
populate both the compatibility `sandbox_policy` and canonical
`permission_profile` fields.
- Replaces the plan-mode hand-built turns in
`codex-rs/core/tests/suite/items.rs`, removing all `SandboxPolicy`
references from that file and reducing remaining `codex-rs/core/tests`
`SandboxPolicy` files from 16 to 15.

## Verification

- `cargo check -p codex-core --tests`
2026-04-28 17:11:17 -07:00
Michael Bolin
162f4e3183 core tests: migrate safety check turns to profiles (#20024)
## Why

This stack is retiring direct `SandboxPolicy` construction from tests so
core coverage exercises the same `PermissionProfile` turn path used by
runtime code. `safety_check_downgrade.rs` still submitted each test turn
as `SandboxPolicy::DangerFullAccess` with no permission profile, even
though the tests are about model verification/reroute behavior rather
than legacy sandbox conversion.

## What Changed

- Adds a local `disabled_text_turn()` helper that derives both the
compatibility `sandbox_policy` and canonical `permission_profile` from
`PermissionProfile::Disabled`.
- Replaces repeated hand-built `Op::UserTurn` literals in
`codex-rs/core/tests/suite/safety_check_downgrade.rs` with that helper.
- Removes all `SandboxPolicy` references from the safety-check suite,
reducing the remaining `codex-rs/core/tests` files that mention
`SandboxPolicy` from 17 to 16.

## Verification

- `cargo check -p codex-core --tests`
2026-04-28 17:10:42 -07:00
Michael Bolin
2a8ce9b319 core tests: migrate view image turns to profiles (#20021)
## Why

This stack is removing direct `SandboxPolicy` usage from test code so
new tests exercise the same `PermissionProfile` path that runtime code
now treats as canonical. `view_image.rs` still built `Op::UserTurn`
requests with `SandboxPolicy::DangerFullAccess` and no permission
profile, which kept another core test module on the legacy turn shape.

## What Changed

- Adds a small `disabled_user_turn()` helper for the view-image suite
that derives the compatibility `sandbox_policy` and canonical
`permission_profile` from `PermissionProfile::Disabled`.
- Replaces repeated direct `Op::UserTurn` literals in
`codex-rs/core/tests/suite/view_image.rs` with that helper.
- Removes all `SandboxPolicy` references from `view_image.rs`, reducing
the remaining `codex-rs/core/tests` files that mention `SandboxPolicy`
from 18 to 17.

## Verification

- `cargo check -p codex-core --tests`
2026-04-28 17:09:48 -07:00
Michael Bolin
d77d23da2e core tests: migrate model/personality turns to profiles (#20018)
## Summary

- Migrates `model_switching.rs` and `personality.rs` direct
`Op::UserTurn` construction from legacy `SandboxPolicy` literals to
`PermissionProfile`-backed turn fields.
- Adds small local helpers in each file so tests keep asserting
model/personality behavior without repeating permission plumbing.
- Reduces `rg -l '\bSandboxPolicy\b' codex-rs/core/tests` from 20 files
to 18; `codex-rs/tui` remains at zero `SandboxPolicy` references.

## Testing

- `cargo check -p codex-core --tests`
- `just fmt`
2026-04-28 17:09:12 -07:00
Michael Bolin
d6d79ffcc7 core tests: send model turns with permission profiles (#20016)
## Summary
- Migrate direct `Op::UserTurn` construction in remote-model tests from
legacy `SandboxPolicy::DangerFullAccess` to
`PermissionProfile::Disabled` via `turn_permission_fields()`.
- Migrate the Responses API proxy header helper from an inline
workspace-write `SandboxPolicy` to
`PermissionProfile::workspace_write()`.
- Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 22
files after #20015 to 20 files.

## Testing
- `cargo check -p codex-core --tests`
- `just fmt`





























---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/20016).
* #20041
* #20040
* #20037
* #20035
* #20034
* #20033
* #20032
* #20030
* #20028
* #20027
* #20026
* #20024
* #20021
* #20018
* __->__ #20016
2026-04-28 17:08:04 -07:00
Michael Bolin
158b2a4201 core tests: configure profiles directly (#20015)
## Summary
- Replace legacy sandbox config setup in delegate and telemetry tests
with direct `PermissionProfile` configuration.
- Move no-sandbox and read-only test turns in `tools.rs`,
`code_mode.rs`, `user_shell_cmd.rs`, and `model_visible_layout.rs` from
legacy `SandboxPolicy` values to `PermissionProfile` helpers, while
leaving the deny-glob read-only compatibility case for a later targeted
cleanup.
- Use `PermissionProfile::read_only()` where tests need managed
read-only behavior and `PermissionProfile::Disabled` where they
intentionally need no sandbox.
- Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 27
files after #20013 to 22 files.

## Testing
- `cargo check -p codex-core --tests`
- `just fmt`
2026-04-28 17:06:59 -07:00
Michael Bolin
52e79ee49a core tests: migrate more turns to permission profiles (#20013)
## Summary
- Migrate another batch of direct `Op::UserTurn` test construction from
legacy `SandboxPolicy` values to `PermissionProfile` inputs via
`turn_permission_fields()`.
- Replace a one-off read-only `SandboxPolicy` bridge in the macOS exec
test with `PermissionProfile::read_only()`.
- Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 32
files at the start of the cleanup stack to 27 files.

## Testing
- `cargo check -p codex-core --tests`
- `just fmt`
- `just fix -p codex-core`
2026-04-28 17:05:53 -07:00
Michael Bolin
7d15936e69 core tests: build user turns from permission profiles (#20011)
## Summary
- Add `turn_permission_fields()` so tests that construct `Op::UserTurn`
directly can provide a canonical `PermissionProfile` while still filling
the required legacy `sandbox_policy` compatibility field.
- Migrate direct user-turn construction in core integration tests from
`SandboxPolicy::DangerFullAccess` to `PermissionProfile::Disabled`.
- Continue reducing direct `SandboxPolicy` usage in
`codex-rs/core/tests`, from 41 files after #20010 to 32 files in this
PR.

## Testing
- `cargo check -p codex-core --tests`
- `just fmt`
- `just fix -p core_test_support`
- `just fix -p codex-core`
2026-04-28 17:03:20 -07:00
Michael Bolin
891722849d core tests: submit turns with permission profiles (#20010)
## Summary

- Add `PermissionProfile`-based turn submission helpers to
`core_test_support`, while keeping the legacy `SandboxPolicy` helper for
tests that intentionally exercise legacy fallback behavior.
- Switch the default `TestCodex::submit_turn()` path to send a real
`PermissionProfile` plus the required legacy compatibility projection in
`Op::UserTurn`.
- Migrate straightforward app/search/shell/truncation tests from
`SandboxPolicy::{DangerFullAccess, ReadOnly}` to
`PermissionProfile::{Disabled, read_only}`.
- Add a TUI compatibility projection helper for legacy app-server fields
so non-legacy writable roots are preserved instead of being downgraded
to read-only.
- Fix remote start/resume/fork sandbox-mode projection to classify any
managed profile with writable roots as workspace-write, not only
profiles that can write `cwd`.
- Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 47
files to 41 files without changing production behavior.

## Testing

- `cargo check -p codex-core --tests`
- `cargo test -p codex-tui
compatibility_profile_preserves_unbridgeable_write_roots`
- `cargo test -p codex-tui
sandbox_mode_preserves_non_cwd_write_roots_for_remote_sessions`
- `just fmt`
- `just fix -p core_test_support`
- `just fix -p codex-core`
2026-04-28 23:01:40 +00:00
Abhinav
c6e7d564c3 Discover hooks bundled with plugins (#19705)
## Why

Plugins can bundle lifecycle hooks, but Codex previously only discovered
hooks from user, project, and managed config layers. This adds the
plugin discovery and runtime plumbing needed for plugin-bundled hooks
while keeping execution behind the `plugin_hooks` feature flag.

## What

- Discovers plugin hook sources from each plugin's default
`hooks/hooks.json`.
- Supports `plugin.json` manifest `hooks` entries as either relative
paths or inline hook objects.
- Plumbs discovered plugin hook sources through plugin loading into the
hook runtime when `plugin_hooks` is enabled.
- Marks plugin-originated hook runs as `HookSource::Plugin`.
- Injects `PLUGIN_ROOT` and `CLAUDE_PLUGIN_ROOT` into plugin hook
command environments.
- Updates generated schemas and hook source metadata for the plugin hook
source.

## Stack

1. This PR - openai/codex#19705
2. openai/codex#19778
3. openai/codex#19840
4. openai/codex#19882

## Reviewer Notes

- Core logic is in `codex-rs/core-plugins/src/loader.rs` and
`codex-rs/hooks/src/engine/discovery.rs`
- Moved existing / adding new tests to
`codex-rs/core-plugins/src/loader_tests.rs` hence the large diff there
- Otherwise mostly plumbing and minor schema updates

### Core Changes

The `codex-rs/core` changes are limited to wiring plugin hook support
into existing core flows:

- `core/src/session/session.rs` conditionally pulls effective plugin
hook sources and plugin hook load warnings from `PluginsManager` when
`plugin_hooks` is enabled, then passes them into `HooksConfig`.
- `core/src/hook_runtime.rs` adds the `plugin` metric tag for
`HookSource::Plugin`.
- `core/config.schema.json` picks up the new `plugin_hooks` feature
flag, and `core/src/plugins/manager_tests.rs` updates fixtures for the
added plugin hook fields.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-28 14:17:18 -07:00
charley-openai
de2ccf9473 [codex] Add token usage to turn tracing spans (#19432)
## Why

Slow Codex turns are easier to debug when token usage is visible in the
trace itself, without joining against separate analytics. This adds
token usage to existing turn-handling spans for regular user turns only.

[Example
turn](https://openai.datadoghq.com/apm/trace/9d353efa2cb5de1f4c5b93dc33c3df04?colorBy=service&graphType=flamegraph&shouldShowLegend=true&sort=time&spanID=3555541504891512675&spanViewType=metadata&traceQuery=)
<img width="1447" height="967" alt="Screenshot 2026-04-24 at 3 03 07 PM"
src="https://github.com/user-attachments/assets/ab7bb187-e7fc-41f0-a366-6c44610b2b2c"
/>

## What Changed

Added response-level token fields on completed handle_responses spans:

gen_ai.usage.input_tokens
gen_ai.usage.cache_read.input_tokens
gen_ai.usage.output_tokens
codex.usage.reasoning_output_tokens
codex.usage.total_tokens
Added aggregate token fields on regular turn spans:

codex.turn.token_usage.*
Added an explicit regular-turn opt-in via
SessionTask::records_turn_token_usage_on_span() so this is not coupled
to span-name strings.

## Testing

- `cargo test -p codex-otel`
- `cargo test -p codex-core
turn_and_completed_response_spans_record_token_usage`
- `just fmt`
- `just fix -p codex-core`
- `just fix -p codex-otel`
- Manual local Electron/app-server smoke test: regular user turn emits
the new span fields

Known status: `cargo test -p codex-core` was attempted and failed in
unrelated existing areas: config approvals, request-permissions,
git-info ordering, and subagent metadata persistence.
2026-04-28 11:41:32 -07:00
Michael Bolin
9e26613657 permissions: add built-in default profiles (#19900)
## Why

The migration away from `SandboxPolicy` needs new configs to start from
permissions profiles instead of deriving profiles from legacy sandbox
modes. Existing users can have empty `config.toml` files, and we should
not rewrite user-owned config files that may live in shared
repositories.

This PR introduces built-in profile names so an empty config can resolve
to a canonical `PermissionProfile`, while explicit named `[permissions]`
profiles still behave predictably.

## What changed

- Adds built-in `default_permissions` profile names:
  - `:read-only` maps to `PermissionProfile::read_only()`.
- `:workspace` maps to the workspace-write profile, including
project-root metadata carveouts.
- `:danger-no-sandbox` maps to `PermissionProfile::Disabled`, preserving
the distinction between no sandbox and a broad managed sandbox.
- Reserves the `:` prefix for built-in profiles so user-defined
`[permissions]` profiles cannot collide with future built-ins.
- Allows `default_permissions` to reference a built-in profile without
requiring a `[permissions]` table.
- Makes an otherwise empty config choose a built-in profile by
trust/platform context: trusted or untrusted project roots use
`:workspace` when the platform supports that sandbox, while roots
without a trust decision use `:read-only`.
- Keeps legacy `sandbox_mode` configs on the legacy path, and still
rejects user-defined `[permissions]` profiles that omit
`default_permissions` so we do not silently guess among custom profiles.
- Preserves compatibility behavior for implicit defaults: bare
`network.enabled = true` allows runtime network without starting the
managed proxy, explicit profile proxy policy still starts the proxy, and
implicit workspace/add-dir roots keep legacy metadata carveouts.

## Verification

- `cargo test -p codex-core builtin --lib`
- `cargo test -p codex-core profile_network_proxy_config`
- `cargo test -p codex-core
implicit_builtin_workspace_profile_preserves_add_dir_metadata_carveouts`
- `cargo test -p codex-core
permissions_profiles_network_enabled_allows_runtime_network_without_proxy`
- `cargo test -p codex-core
permissions_profiles_proxy_policy_starts_managed_network_proxy`

## Documentation

Public Codex config docs should mention these built-in names when the
`[permissions]` config format is ready to document as stable.









---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19900).
* #20041
* #20040
* #20037
* #20035
* #20034
* #20033
* #20032
* #20030
* #20028
* #20027
* #20026
* #20024
* #20021
* #20018
* #20016
* #20015
* #20013
* #20011
* #20010
* #20008
* __->__ #19900
2026-04-28 11:21:39 -07:00
efrazer-oai
f6797c3ac6 feat: verify agent identity JWTs with JWKS (#19764) 2026-04-28 09:56:20 -07:00
mchen-oai
ccec84b148 Add turn start timestamp to turn metadata (#19473)
## Why
- Without change: MCP tool calls receive
`_meta["x-codex-turn-metadata"]` with `session_id` and `turn_id`.
- Issue: MCP servers may want the turn start timestamp to measure
internal latency relative to turn start.

## What Changed
- With change: turn metadata now includes `turn_started_at_unix_ms`,
which is propagated to MCP tool calls in
`_meta["x-codex-turn-metadata"]`.

## Verification
- `codex-rs/core/src/mcp_tool_call_tests.rs`
- `codex-rs/core/src/turn_metadata_tests.rs`
- `codex-rs/core/src/turn_timing_tests.rs`
- `codex-rs/core/tests/responses_headers.rs`
- `codex-rs/core/tests/suite/search_tool.rs`
2026-04-28 16:36:59 +00:00
jif-oai
431ebeaef7 feat: split memories part 2 (#19860)
Keep extracting memories out of core and moving the write trigger in the
app-server
This is temporary and it should move at the client level as a follow-up
This makes core fully independant from `codex-memories-write`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-28 13:03:28 +02:00
marksteinbrick-oai
6a8df2b61d [codex-analytics] include user agent in default headers (#17689)
## Summary
Adds the standard Codex `User-Agent` to shared default headers so the
responses-api WS handshake carries the same client OS and version
context as HTTP requests.

## Testing
- `cargo test -p codex-core
build_ws_client_metadata_includes_window_lineage_and_turn_metadata`
- `cargo test -p codex-core --test all responses_websocket`
2026-04-27 21:32:10 -07:00
pakrym-oai
4e05f3053c Remove ghost snapshots (#19481)
## Summary
- Remove `ghost_snapshot` / `GhostCommit` from the Responses API surface
and generated SDK/schema artifacts.
- Keep legacy config loading compatible, but make undo a no-op that
reports the feature is unavailable.
- Clean up core history, compaction, telemetry, rollout, and tests to
stop carrying ghost snapshot items.

## Testing
- Unit tests passed for `codex-protocol`, `codex-core` targeted undo and
compaction flows, `codex-rollout`, and `codex-app-server-protocol`.
- Regenerated config and app-server schemas plus Python SDK artifacts
and verified they match the checked-in outputs.
2026-04-27 18:48:57 -07:00
Dylan Hurd
7e8594fc19 Stabilize plugin MCP fixture tests (#19452)
## Why

Recent `main` CI had repeated flakes in the plugin fixture tests:

- `codex-core::all
suite::plugins::explicit_plugin_mentions_inject_plugin_guidance` failed
in runs
[24909500958](https://github.com/openai/codex/actions/runs/24909500958),
[24908076251](https://github.com/openai/codex/actions/runs/24908076251),
[24906197645](https://github.com/openai/codex/actions/runs/24906197645),
and
[24898949647](https://github.com/openai/codex/actions/runs/24898949647).
- `codex-core::all suite::plugins::plugin_mcp_tools_are_listed` failed
in runs
[24909500958](https://github.com/openai/codex/actions/runs/24909500958),
[24908076251](https://github.com/openai/codex/actions/runs/24908076251),
and
[24898949647](https://github.com/openai/codex/actions/runs/24898949647).

The failures were in the same plugin/MCP fixture family: assertions
expected sample plugin guidance or tool inventory, but the test could
observe the session before the sample MCP server had finished startup.

## Root Cause

`explicit_plugin_mentions_inject_plugin_guidance` submitted the user
turn immediately after constructing the session. MCP startup is
asynchronous, so on a slower or busier CI runner the prompt could be
built before the sample plugin MCP server had reported its tools. That
made the test depend on scheduler timing rather than the fixture being
ready.

`plugin_mcp_tools_are_listed` already needed the same readiness
condition, but its wait logic was local to that test.

## What Changed

- Added a shared `wait_for_sample_mcp_ready` helper for the plugin
fixture tests.
- Wait for `McpStartupComplete` before submitting the explicit plugin
mention turn.
- Reuse the same readiness helper in the MCP tool-listing test.

## Why This Should Be Reliable

The tests now wait for the explicit readiness signal from the sample MCP
server before asserting guidance or tools derived from that server. This
removes the startup race while still exercising the real fixture path,
so the assertions should only run after the plugin inventory is
deterministic.

## Verification

- `cargo test -p codex-core --test all plugins::`
- GitHub CI for this PR is passing.
2026-04-28 01:14:44 +00:00
sayan-oai
85c1500569 fix: filter dynamic deferred tools from model_visible_specs (#19771)
fixes #19486

### Problem
Right now dynamic deferred tools are filtered at normal-turn prompt
building time, rather than upstream while building the `ToolRouter`
itself. This causes issues because dynamic deferred tools are then
wrongly included in the router's `model_visible_specs`, which is what
the compaction request-building flow relies on.

### Fix
Move the dynamic deferred tool filtering to `ToolRouter` creation time
to solve this problem for every request that relies on `ToolRouter` for
`model_visible_specs`, which solves the issue generically.

### Tests
Added unit + integration tests to ensure dynamic deferred tools are
omitted from `model_visible_specs` and compaction request respectively.

Tested against live `/compact` endpoint; raw deferred dynamic tools
without `tool_search` returned `400` (current bug), while the filtered
payload (this fix) returns `200`.
2026-04-27 19:09:02 +00:00
efrazer-oai
2009f6e894 refactor: make auth loading async (#19762)
## Summary

Auth loading used to expose synchronous construction helpers in several
places even though some auth sources now need async work. This PR makes
the auth-loading surface async and updates the callers to await it.

This is intentionally only plumbing. It does not change how
AgentIdentity tokens are decoded, how task runtime ids are allocated, or
how JWT signatures are verified.

## Stack

1. **This PR:** [refactor: make auth loading
async](https://github.com/openai/codex/pull/19762)
2. [refactor: load AgentIdentity runtime
eagerly](https://github.com/openai/codex/pull/19763)
3. [feat: verify AgentIdentity JWTs with
JWKS](https://github.com/openai/codex/pull/19764)

## Important call sites

| Area | Change |
| --- | --- |
| `codex-login` auth loading | `CodexAuth` and `AuthManager`
construction paths now await auth loading. |
| app-server startup | Auth manager construction is awaited during
initialization. |
| CLI/TUI/exec/MCP/chatgpt callers | Existing auth-loading calls now
await the same behavior. |
| cloud requirements storage loader | The loader becomes async so it can
share the same auth construction path. |
| auth tests | Tests that load auth now run in async contexts. |

## Testing

Tests: targeted Rust auth test compilation, formatter, scoped Clippy
fix, and Bazel lock check.
2026-04-27 11:00:27 -07:00
jif-oai
01ab25dbb5 feat: use git-backed workspace diffs for memory consolidation (#18982)
## Why

This PR make the `morpheus` agent (memory phase 2) use a git diff to
start it's consolidation. The workflow is the following:
1. The agent acquire a lock
2. If `.codex/memories` does not exist or is not a git root, initialize
everything (and make a first empty commit)
3. Update `raw_memories.md` and `rollout_summaries/` as before.
Basically we select max N phase 1 memories based on a given policy
4. We use git (`gix`) to get a diff between the current state of
`.codex/memories` and the last commit.
5. Dump the diff in `phase2_workspace_diff.md`
6. Spawn `morpheus` and point it to `phase2_workspace_diff.md`
7. Wait for `morpheus` to be done
8. Re-create a new `.git` and make one single commit on it. We do this
because we don't want to preserve history through `.git` and this is
cheap anyway
9. We release the lock
On top of this, we keep the retry policies etc etc

The goals of this new workflow are:
* Better support of any memory extensions such as `chronicle`
* Allow the user to manually edit memories and this will be considered
by the phase 2 agent
 
As a follow-up we will need to add support for user's edition while
`morpheus` is running

## What Changed

- Added memory workspace helpers that prepare the git baseline, compute
the diff, write `phase2_workspace_diff.md`, and reset the baseline after
successful consolidation.
- Updated Phase 2 to sync current inputs into `raw_memories.md` and
`rollout_summaries/`, prune old extension resources, skip clean
workspaces, and run the consolidation subagent only when the workspace
has changes.
- Tightened Phase 2 job ownership around long-running consolidation with
heartbeats and an ownership check before resetting the baseline.
- Simplified the prompt and state APIs so DB watermarks are bookkeeping,
while workspace dirtiness decides whether consolidation work exists.
- Updated the memory pipeline README and tests for workspace diffs,
extension-resource cleanup, pollution-driven forgetting, selection
ranking, and baseline persistence.

## Verification

- Added/updated coverage in `core/src/memories/tests.rs`,
`core/src/memories/workspace_tests.rs`, `state/src/runtime/memories.rs`,
and `core/tests/suite/memories.rs`.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-27 14:32:44 +02:00