Commit Graph

535 Commits

Author SHA1 Message Date
Abhinav
c6e7d564c3 Discover hooks bundled with plugins (#19705)
## Why

Plugins can bundle lifecycle hooks, but Codex previously only discovered
hooks from user, project, and managed config layers. This adds the
plugin discovery and runtime plumbing needed for plugin-bundled hooks
while keeping execution behind the `plugin_hooks` feature flag.

## What

- Discovers plugin hook sources from each plugin's default
`hooks/hooks.json`.
- Supports `plugin.json` manifest `hooks` entries as either relative
paths or inline hook objects.
- Plumbs discovered plugin hook sources through plugin loading into the
hook runtime when `plugin_hooks` is enabled.
- Marks plugin-originated hook runs as `HookSource::Plugin`.
- Injects `PLUGIN_ROOT` and `CLAUDE_PLUGIN_ROOT` into plugin hook
command environments.
- Updates generated schemas and hook source metadata for the plugin hook
source.

## Stack

1. This PR - openai/codex#19705
2. openai/codex#19778
3. openai/codex#19840
4. openai/codex#19882

## Reviewer Notes

- Core logic is in `codex-rs/core-plugins/src/loader.rs` and
`codex-rs/hooks/src/engine/discovery.rs`
- Moved existing / adding new tests to
`codex-rs/core-plugins/src/loader_tests.rs` hence the large diff there
- Otherwise mostly plumbing and minor schema updates

### Core Changes

The `codex-rs/core` changes are limited to wiring plugin hook support
into existing core flows:

- `core/src/session/session.rs` conditionally pulls effective plugin
hook sources and plugin hook load warnings from `PluginsManager` when
`plugin_hooks` is enabled, then passes them into `HooksConfig`.
- `core/src/hook_runtime.rs` adds the `plugin` metric tag for
`HookSource::Plugin`.
- `core/config.schema.json` picks up the new `plugin_hooks` feature
flag, and `core/src/plugins/manager_tests.rs` updates fixtures for the
added plugin hook fields.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-28 14:17:18 -07:00
Shijie Rao
25ac0e4527 Load cloud requirements for agent identity (#19708)
## Why

Agent Identity sessions can represent Business and Enterprise ChatGPT
workspaces, but cloud requirements were skipped before fetch. That meant
workspace-managed requirements were not loaded for Agent Identity even
when the JWT carried the same account identity and plan information that
normal ChatGPT token auth exposes.

This PR now sits on top of the Agent Identity stack through
[#19764](https://github.com/openai/codex/pull/19764). Because
[#19763](https://github.com/openai/codex/pull/19763) moved task
registration into Agent Identity auth loading, cloud requirements no
longer needs a separate runtime-initialization step before building the
backend client.

## What changed

- Stop skipping `CodexAuth::AgentIdentity` in the cloud requirements
loader.
- Share the cloud requirements eligibility check between startup load
and background cache refresh.
- Rely on eagerly loaded Agent Identity auth so backend requests can
attach task-scoped `AgentAssertion` headers.
- Decode Agent Identity JWT `plan_type` as the auth-layer plan type,
then convert it through a shared `auth::PlanType` -> `account::PlanType`
mapping.
- Add the missing serde alias for the `education` plan string and add
coverage for raw Agent Identity plan aliases such as `hc` and
`education`.

## Testing

- `cargo test -p codex-agent-identity -p codex-login -p
codex-cloud-requirements -p codex-protocol`
2026-04-28 12:35:00 -07:00
evawong-oai
0156b1e61f [sandbox] Enforce protected workspace metadata paths (#19846)
## Summary

Make FileSystemSandboxPolicy the semantic source of truth for project
root metadata protection. Under writable roots, `.git`, `.codex`, and
`.agents` stay protected unless user policy grants an explicit write
rule for that metadata path.

## Scope

1. Add `protected_metadata_names` to `WritableRoot`.
2. Teach `FileSystemSandboxPolicy::can_write_path_with_cwd` to reject
protected metadata writes under writable roots unless explicitly
allowed.
3. Default workspace write profiles to protect `.git`, `.codex`, and
`.agents`.
4. Add the Linux fallback setup needed before Linux enforcement lands
later in the stack.

## Reviewer Focus

1. The policy decision belongs in FileSystemSandboxPolicy, not shell
command parsing.
2. Legacy SandboxPolicy remains a compatibility projection, not the
source of the new rule.
3. Explicit user write rules can still opt into these metadata paths.

## Stack

1. Policy primitive: this PR
2. macOS Seatbelt adapter: #19847
3. Shell preflight UX: #19848
4. Runtime profile propagation: #19849
5. Linux bubblewrap adapter: #19852

## Validation

1. codex protocol permissions tests
2. formatting for codex protocol and codex linux sandbox
3. diff whitespace check
2026-04-28 09:10:41 -07:00
friel-openai
598bbcdb58 Preserve assistant phase for replayed messages (#19832) 2026-04-28 08:46:13 -07:00
jif-oai
431ebeaef7 feat: split memories part 2 (#19860)
Keep extracting memories out of core and moving the write trigger in the
app-server
This is temporary and it should move at the client level as a follow-up
This makes core fully independant from `codex-memories-write`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-28 13:03:28 +02:00
Michael Bolin
bf38def44e permissions: make SessionConfigured profile-only (#19774)
## Why

`SessionConfiguredEvent` is the internal event that tells clients what
permissions are active for a session. Emitting both `sandbox_policy` and
`permission_profile` leaves two possible authorities and forces every
consumer to decide which one to honor. At this point in the migration,
the profile is expressive enough to represent managed, disabled, and
external sandbox enforcement, so the internal event can be profile-only.

The wire compatibility concern is older serialized events or rollout
data that only contain `sandbox_policy`; those still need to
deserialize.

## What Changed

- Removes `sandbox_policy` from `SessionConfiguredEvent` and makes
`permission_profile` required.
- Adds custom deserialization so old payloads with only `sandbox_policy`
are upgraded to a cwd-anchored `PermissionProfile`.
- Updates core event emission and TUI session handling to sync
permissions from the profile directly.
- Updates app-server response construction to derive the legacy
`sandbox` response field from the active thread snapshot instead of from
`SessionConfiguredEvent`.
- Updates yolo-mode display logic to treat both
`PermissionProfile::Disabled` and managed unrestricted filesystem plus
enabled network as full-access, while still preserving the distinction
between no sandbox and external sandboxing.

## Verification

- `cargo test -p codex-protocol session_configured_event --lib`
- `cargo test -p codex-protocol serialize_event --lib`
- `cargo test -p codex-exec session_configured --lib`
- `cargo test -p codex-app-server
thread_response_permission_profile_preserves_enforcement --lib`
- `cargo test -p codex-core
session_configured_reports_permission_profile_for_external_sandbox
--lib`
- `cargo test -p codex-tui session_configured --lib`
- `cargo test -p codex-tui
yolo_mode_includes_managed_full_access_profiles --lib`


































---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19774).
* #19900
* #19899
* #19776
* #19775
* __->__ #19774
2026-04-27 22:06:47 -07:00
pakrym-oai
4e05f3053c Remove ghost snapshots (#19481)
## Summary
- Remove `ghost_snapshot` / `GhostCommit` from the Responses API surface
and generated SDK/schema artifacts.
- Keep legacy config loading compatible, but make undo a no-op that
reports the feature is unavailable.
- Clean up core history, compaction, telemetry, rollout, and tests to
stop carrying ghost snapshot items.

## Testing
- Unit tests passed for `codex-protocol`, `codex-core` targeted undo and
compaction flows, `codex-rollout`, and `codex-app-server-protocol`.
- Regenerated config and app-server schemas plus Python SDK artifacts
and verified they match the checked-in outputs.
2026-04-27 18:48:57 -07:00
Michael Bolin
755880ef9c permissions: derive config defaults as profiles (#19772)
## Why

This continues the permissions migration by making legacy config default
resolution produce the canonical `PermissionProfile` first. The legacy
`SandboxPolicy` projection should stay available at compatibility
boundaries, but config loading should not create a legacy policy just to
immediately convert it back into a profile.

Specifically, when `default_permissions` is not specified in
`config.toml`, instead of creating a `SandboxPolicy` in
`codex-rs/core/src/config/mod.rs` and then trying to derive a
`PermissionProfile` from it, we use `derive_permission_profile()` to
create a more faithful `PermissionProfile` using the values of
`ConfigToml` directly.

This also keeps the existing behavior of `sandbox_workspace_write` and
extra writable roots after #19841 replaced `:cwd` with `:project_roots`.
Legacy workspace-write defaults are represented as symbolic
`:project_roots` write access plus symbolic project-root metadata
carveouts. Extra absolute writable roots are still added directly and
continue to get concrete metadata protections for paths that exist under
those roots.

The platform sandboxes differ when a symbolic project-root subpath does
not exist yet.

* **Seatbelt** can encode literal/subpath exclusions directly, so macOS
emits project-root metadata subpath policies even if `.git`, `.agents`,
or `.codex` do not exist.
* **bwrap** has to materialize bind-mount targets. Binding `/dev/null`
to a missing `.git` can create a host-visible placeholder that changes
Git repo discovery. Binding missing `.agents` would not affect Git
discovery, but it would still create a host-visible project metadata
placeholder from an automatic compatibility carveout. Linux therefore
skips only missing automatic `.git` and `.agents` read-only metadata
masks; missing `.codex` remains protected so first-time project config
creation goes through the protected-path approval flow. User-authored
`read` and `none` subpath rules keep normal bwrap behavior, and `none`
can still mask the first missing component to prevent creation under
writable roots.

## What Changed

- Adds profile-native helpers for legacy workspace-write semantics,
including `PermissionProfile::workspace_write_with()`,
`FileSystemSandboxPolicy::workspace_write()`, and
`FileSystemSandboxPolicy::with_additional_legacy_workspace_writable_roots()`.
- Makes `FileSystemSandboxPolicy::workspace_write()` the single legacy
workspace-write constructor so both `from_legacy_sandbox_policy()` and
`From<&SandboxPolicy>` include the project-root metadata carveouts.
- Removes the no-carveout `legacy_workspace_write_base_policy()` path
and the `prune_read_entries_under_writable_roots()` cleanup that was
only needed by that split construction.
- Adds `ConfigToml::derive_permission_profile()` for legacy sandbox-mode
fallback resolution; named `default_permissions` profiles continue
through the permissions profile pipeline instead of being reconstructed
from `sandbox_mode`.
- Updates `Config::load()` to start from the derived profile, validate
that it still has a legacy compatibility projection, and apply
additional writable roots directly to managed workspace-write filesystem
policies.
- Updates Linux bwrap argument construction so missing automatic
`.git`/`.agents` symbolic project-root read-only carveouts are skipped
before emitting bind args; missing `.codex`, user-authored `read`/`none`
subpath rules, and existing missing writable-root behavior are
preserved.
- Adds coverage that legacy workspace-write config produces symbolic
project-root metadata carveouts, extra legacy workspace writable roots
still protect existing metadata paths such as `.git`, and bwrap skips
missing `.git`/`.agents` project-root carveouts while preserving missing
`.codex` and user-authored missing subpath rules.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19772).
* #19776
* #19775
* #19774
* #19773
* __->__ #19772
2026-04-27 16:50:10 -07:00
Michael Bolin
4b55979755 permissions: remove cwd special path (#19841)
## Why

The experimental `PermissionProfile` API had both `:cwd` and
`:project_roots` special filesystem paths, which made the permission
root ambiguous. This PR removes the unstable `current_working_directory`
special path before the permissions API is stabilized, so callers use
`:project_roots` for symbolic project-root access.

## What changed

- Removes `FileSystemSpecialPath::CurrentWorkingDirectory` from protocol
and app-server protocol models, plus regenerated app-server
JSON/TypeScript schemas.
- Replaces internal `:cwd` permission entries with `:project_roots`
entries.
- Keeps the existing cwd-update behavior for legacy-shaped
workspace-write profiles, while removing the deleted
`CurrentWorkingDirectory` case from that compatibility path.
- Keeps `PermissionProfile::workspace_write()` as the reusable symbolic
workspace-write helper, with docs noting that `:project_roots` entries
resolve at enforcement time.
- Updates app-server docs/examples and approval UI labeling to stop
advertising `:cwd` as a permission token.

## Compatibility

Persisted rollout items may contain the old
`{"kind":"current_working_directory"}` tag from earlier experimental
`permissionProfile` snapshots. This PR keeps that tag as a
deserialize-only alias for `ProjectRoots { subpath: None }`, while
continuing to serialize only the new `project_roots` tag.

## Follow-up

This PR intentionally does not introduce an explicit project-root set on
`SessionConfiguration` or runtime sandbox resolution. Today, the
resolver still uses the active cwd as the single implicit project root.
A follow-up should model project roots separately from tool cwd so
`:project_roots` entries can resolve against the configured project
roots, and resolve to no entries when there are no project roots.

## Verification

- `cargo test -p codex-protocol permissions:: --lib`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-sandboxing -p codex-exec-server --lib`
- `cargo test -p codex-core session_configuration_apply_ --lib`
- `cargo test -p codex-app-server
command_exec_permission_profile_project_roots_use_command_cwd --test
all`
- `cargo test -p codex-tui
thread_read_session_state_does_not_reuse_primary_permission_profile
--lib`
- `cargo test -p codex-tui
preset_matching_accepts_workspace_write_with_extra_roots --lib`
- `cargo test -p codex-config --lib`
2026-04-27 13:41:27 -07:00
jif-oai
f8c527e529 multi_agent_v2: move thread cap into feature config (#19792)
## Why

`features.multi_agent_v2.max_concurrent_threads_per_session` is meant to
be the MultiAgentV2-specific session thread cap: it counts the root
thread and all open subagent threads. The previous implementation kept
this surface tied to `agents.max_threads`, which made it a global
subagent-only cap and allowed the legacy setting to coexist with
MultiAgentV2.

## What Changed

- Added `max_concurrent_threads_per_session` to
`[features.multi_agent_v2]` with default `4`.
- Removed the `[agents] max_concurrent_threads_per_session` alias to
`agents.max_threads`.
- When MultiAgentV2 is enabled, reject `agents.max_threads` and derive
the existing internal subagent slot limit as
`max_concurrent_threads_per_session - 1`.
- Regenerated `core/config.schema.json` and added coverage for the new
config semantics.

## Result
```
➜  codex git:(jif/clean-multi-agent-v2-config) codex -c features.multi_agent_v2.enabled=true -c features.multi_agent_v2.max_concurrent_threads_per_session=3
╭────────────────────────────────────────────────────╮
│ >_ OpenAI Codex (v0.0.0)                           │
│                                                    │
│ model:     gpt-5.5 xhigh   fast   /model to change │
│ directory: ~/code/codex                            │
╰────────────────────────────────────────────────────╯

  Tip: Update Required - This version will no longer be supported starting May 8th. Please upgrade to the latest version (https://github.com/openai/codex/releases/latest) using your preferred package manager.

› Can you try to spawn 4 agents


• I’ll try to start four lightweight agents at once and report exactly what the runtime accepts.

• Spawned Russell [no-apps] (gpt-5.5 xhigh)
  └ Spawn probe 1: reply briefly that you started, then wait for further instructions. Do not do any repo work.

• Spawned Descartes [no-apps] (gpt-5.5 xhigh)
  └ Spawn probe 2: reply briefly that you started, then wait for further instructions. Do not do any repo work.

• Agent spawn failed
  └ Spawn probe 3: reply briefly that you started, then wait for further instructions. Do not do any repo work.

• Agent spawn failed
  └ Spawn probe 4: reply briefly that you started, then wait for further instructions. Do not do any repo work.

• The runtime accepted the first two and rejected the next two with agent thread limit reached. I’m checking whether the two accepted probes have returned cleanly, then I’ll close them if needed.
```

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-27 13:31:56 +02:00
Michael Bolin
2cb8746457 permissions: remove core legacy policy round trips (#19394)
## Why

Several execution paths still converted profile-backed permissions into
`SandboxPolicy` and then rebuilt runtime permissions from that legacy
shape. Those round trips are unnecessary after the preceding PRs and can
lose split filesystem semantics. Core approval and escalation should
carry the resolved profile directly.

## What Changed

- Removes `sandbox_policy` from `ResolvedPermissionProfile`; the
resolved permission object now carries the canonical `PermissionProfile`
directly.
- Updates exec-policy fallback, shell/unified-exec interception,
escalation reruns, and related tests to pass profiles instead of legacy
policies.
- Removes legacy additional-permission merge helpers that built an
effective `SandboxPolicy` before rebuilding runtime permissions.
- Keeps legacy projections only at compatibility boundaries that still
require `SandboxPolicy`, not in core permission computation.

## Verification

- `cargo test -p codex-core direct_write_roots`
- `cargo test -p codex-core runtime_roots_to_legacy_projection`
- `cargo test -p codex-app-server
requested_permissions_trust_project_uses_permission_profile_intent`







































































---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19394).
* #19737
* #19736
* #19735
* #19734
* #19395
* __->__ #19394
2026-04-26 17:43:32 -07:00
Andrey Mishchenko
35bc6e3d01 Delete unused ResponseItem::Message.end_turn (#19605)
This field is unused. Delete it.
2026-04-26 17:18:09 -07:00
pakrym-oai
9c3abcd46c [codex] Move config loading into codex-config (#19487)
## Why

Config loading had become split across crates: `codex-config` owned the
config types and merge logic, while `codex-core` still owned the loader
that assembled the layer stack. This change consolidates that
responsibility in `codex-config`, so the crate that defines config
behavior also owns how configs are discovered and loaded.

To make that move possible without reintroducing the old dependency
cycle, the shell-environment policy types and helpers that
`codex-exec-server` needs now live in `codex-protocol` instead of
flowing through `codex-config`.

This also makes the migrated loader tests more deterministic on machines
that already have managed or system Codex config installed by letting
tests override the system config and requirements paths instead of
reading the host's `/etc/codex`.

## What Changed

- moved the config loader implementation from `codex-core` into
`codex-config::loader` and deleted the old `core::config_loader` module
instead of leaving a compatibility shim
- moved shell-environment policy types and helpers into
`codex-protocol`, then updated `codex-exec-server` and other downstream
crates to import them from their new home
- updated downstream callers to use loader/config APIs from
`codex-config`
- added test-only loader overrides for system config and requirements
paths so loader-focused tests do not depend on host-managed config state
- cleaned up now-unused dependency entries and platform-specific cfgs
that were surfaced by post-push CI

## Testing

- `cargo test -p codex-config`
- `cargo test -p codex-core config_loader_tests::`
- `cargo test -p codex-protocol -p codex-exec-server -p
codex-cloud-requirements -p codex-rmcp-client --lib`
- `cargo test --lib -p codex-app-server-client -p codex-exec`
- `cargo test --no-run --lib -p codex-app-server`
- `cargo test -p codex-linux-sandbox --lib`
- `cargo shear`
- `just bazel-lock-check`

## Notes

- I did not chase unrelated full-suite failures outside the migrated
loader surface.
- `cargo test -p codex-core --lib` still hits unrelated proxy-sensitive
failures on this machine, and Windows CI still shows unrelated
long-running/timeouting test noise outside the loader migration itself.
2026-04-26 15:10:53 -07:00
Michael Bolin
deaa307fb2 permissions: derive compatibility policies from profiles (#19392)
## Why

After #19391, `PermissionProfile` and the split filesystem/network
policies could still be stored in parallel. That creates drift risk: a
profile can preserve deny globs, external enforcement, or split
filesystem entries while a cached projection silently loses those
details. This PR makes the profile the runtime source and derives
compatibility views from it.

## What Changed

- Removes stored filesystem/network sandbox projections from
`Permissions` and `SessionConfiguration`; their accessors now derive
from the canonical `PermissionProfile`.
- Derives legacy `SandboxPolicy` snapshots from profiles only where an
older API still needs that field.
- Updates MCP connection and elicitation state to track
`PermissionProfile` instead of `SandboxPolicy` for auto-approval
decisions.
- Adds semantic filesystem-policy comparison so cwd changes can preserve
richer profiles while still recognizing equivalent legacy projections
independent of entry ordering.
- Updates config/session tests to assert profile-derived projections
instead of parallel stored fields.

## Verification

- `cargo test -p codex-core direct_write_roots`
- `cargo test -p codex-core runtime_roots_to_legacy_projection`
- `cargo test -p codex-app-server
requested_permissions_trust_project_uses_permission_profile_intent`



































































---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19392).
* #19395
* #19394
* #19393
* __->__ #19392
2026-04-26 15:06:42 -07:00
Michael Bolin
4d7ce3447d permissions: make runtime config profile-backed (#19606)
## Why

This supersedes #19391. During stack repair, GitHub marked #19391 as
merged into a temporary stack branch rather than into `main`, so the
runtime-config change needed a fresh PR.

`PermissionProfile` is now the canonical permissions shape after #19231
because it can distinguish `Managed`, `Disabled`, and `External`
enforcement while also carrying filesystem rules that legacy
`SandboxPolicy` cannot represent cleanly. Core config and session state
still needed to accept profile-backed permissions without forcing every
profile through the strict legacy bridge, which rejected valid runtime
profiles such as direct write roots.

The unrelated CI/test hardening that previously rode along with this PR
has been split into #19683 so this PR stays focused on the permissions
model migration.

## What Changed

- Adds `Permissions.permission_profile` and
`SessionConfiguration.permission_profile` as constrained runtime state,
while keeping `sandbox_policy` as a legacy compatibility projection.
- Introduces profile setters that keep `PermissionProfile`, split
filesystem/network policies, and legacy `SandboxPolicy` projections
synchronized.
- Uses a compatibility projection for requirement checks and legacy
consumers instead of rejecting profiles that cannot round-trip through
`SandboxPolicy` exactly.
- Updates config loading, config overrides, session updates, turn
context plumbing, prompt permission text, sandbox tags, and exec request
construction to carry profile-backed runtime permissions.
- Preserves configured deny-read entries and `glob_scan_max_depth` when
command/session profiles are narrowed.
- Adds `PermissionProfile::read_only()` and
`PermissionProfile::workspace_write()` presets that match legacy
defaults.

## Verification

- `cargo test -p codex-core direct_write_roots`
- `cargo test -p codex-core runtime_roots_to_legacy_projection`
- `cargo test -p codex-app-server
requested_permissions_trust_project_uses_permission_profile_intent`




---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19606).
* #19395
* #19394
* #19393
* #19392
* __->__ #19606
2026-04-26 13:29:54 -07:00
Eric Traut
4167628622 Add goal core runtime (4 / 5) (#18076)
Adds the core runtime behavior for active goals on top of the model
tools from PR 3.

## Why

A long-running goal should be a core runtime concern, not something
every client has to implement. Core owns the turn lifecycle, tool
completion boundaries, interruptions, resume behavior, and token usage,
so it is the right place to account progress, enforce budgets, and
decide when to continue work.

## What changed

- Centralized goal lifecycle side effects behind
`Session::goal_runtime_apply(GoalRuntimeEvent::...)`.
- Starts goal continuation turns only when the session is idle; pending
user input and mailbox work take priority.
- Accounts token and wall-clock usage at turn, tool, mutation,
interrupt, and resume boundaries; `get_thread_goal` remains read-only.
- Preserves sub-second wall-clock remainder across accounting boundaries
so long-running goals do not drift downward over time.
- Treats token budget exhaustion as a soft stop by marking the goal
`budget_limited` and injecting wrap-up steering instead of aborting the
active turn.
- Suppresses budget steering when `update_goal` marks a goal complete.
- Pauses active goals on interrupt and auto-reactivates paused goals
when a thread resumes outside plan mode.
- Suppresses repeated automatic continuation when a continuation turn
makes no tool calls.
- Added continuation and budget-limit prompt templates.

## Verification

- Added focused core coverage for continuation scheduling, accounting
boundaries, budget-limit steering, completion accounting, interrupt
pause behavior, resume auto-activation, and wall-clock remainder
accounting.
2026-04-24 21:16:00 -07:00
Eric Traut
6c874f9b34 Add goal app-server API (2 / 5) (#18074)
Adds the app-server v2 goal API on top of the persisted goal state from
PR 1.

## Why

Clients need a stable app-server surface for reading and controlling
materialized thread goals before the model tools and TUI can use them.
Goal changes also need to be observable by app-server clients, including
clients that resume an existing thread.

## What changed

- Added v2 `thread/goal/get`, `thread/goal/set`, and `thread/goal/clear`
RPCs for materialized threads.
- Added `thread/goal/updated` and `thread/goal/cleared` notifications so
clients can keep local goal state in sync.
- Added resume/snapshot wiring so reconnecting clients see the current
goal state for a thread.
- Added app-server handlers that reconcile persisted rollout state
before direct goal mutations.
- Updated the app-server README plus generated JSON and TypeScript
schema fixtures for the new API surface.

## Verification

- Added app-server v2 coverage for goal get/set/clear behavior,
notification emission, resume snapshots, and non-local thread-store
interactions.
2026-04-24 20:53:41 -07:00
Michael Bolin
789f387982 permissions: remove legacy read-only access modes (#19449)
## Why

`ReadOnlyAccess` was a transitional legacy shape on `SandboxPolicy`:
`FullAccess` meant the historical read-only/workspace-write modes could
read the full filesystem, while `Restricted` tried to carry partial
readable roots. The partial-read model now belongs in
`FileSystemSandboxPolicy` and `PermissionProfile`, so keeping it on
`SandboxPolicy` makes every legacy projection reintroduce lossy
read-root bookkeeping and creates unnecessary noise in the rest of the
permissions migration.

This PR makes the legacy policy model narrower and explicit:
`SandboxPolicy::ReadOnly` and `SandboxPolicy::WorkspaceWrite` represent
the old full-read sandbox modes only. Split readable roots, deny-read
globs, and platform-default/minimal read behavior stay in the runtime
permissions model.

## What changed

- Removes `ReadOnlyAccess` from
`codex_protocol::protocol::SandboxPolicy`, including the generated
`access` and `readOnlyAccess` API fields.
- Updates legacy policy/profile conversions so restricted filesystem
reads are represented only by `FileSystemSandboxPolicy` /
`PermissionProfile` entries.
- Keeps app-server v2 compatible with legacy `fullAccess` read-access
payloads by accepting and ignoring that no-op shape, while rejecting
legacy `restricted` read-access payloads instead of silently widening
them to full-read legacy policies.
- Carries Windows sandbox platform-default read behavior with an
explicit override flag instead of depending on
`ReadOnlyAccess::Restricted`.
- Refreshes generated app-server schema/types and updates tests/docs for
the simplified legacy policy shape.

## Verification

- `cargo check -p codex-app-server-protocol --tests`
- `cargo check -p codex-windows-sandbox --tests`
- `cargo test -p codex-app-server-protocol sandbox_policy_`


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19449).
* #19395
* #19394
* #19393
* #19392
* #19391
* __->__ #19449
2026-04-24 17:16:58 -07:00
Tom
0a9b559c0b Migrate fork and resume reads to thread store (#18900)
- Route cold thread/resume and thread/fork source loading through
ThreadStore reads instead of direct rollout path operations
- Keep lookups that explicitly specify a rollout-path using the local
thread store methods but return an invalid-request error for remote
ThreadStore configurations
- Add some additional unit tests for code path coverage
2026-04-24 13:51:37 -07:00
Michael Bolin
13e0ec1614 permissions: make legacy profile conversion cwd-free (#19414)
## Why

The profile conversion path still required a `cwd` even when it was only
translating a legacy `SandboxPolicy` into a `PermissionProfile`. That
made profile producers invent an ambient `cwd`, which is exactly the
anchoring we are trying to remove from permission-profile data. A legacy
workspace-write policy can be represented symbolically instead: `:cwd =
write` plus read-only `:project_roots` metadata subpaths.

This PR creates that cwd-free base so the rest of the stack can stop
threading cwd through profile construction. Callers that actually need a
concrete runtime filesystem policy for a specific cwd still have an
explicitly named cwd-bound conversion.

## What Changed

- `PermissionProfile::from_legacy_sandbox_policy` now takes only
`&SandboxPolicy`.
- `FileSystemSandboxPolicy::from_legacy_sandbox_policy` is now the
symbolic, cwd-free projection for profiles.
- The old concrete projection is retained as
`FileSystemSandboxPolicy::from_legacy_sandbox_policy_for_cwd` for
runtime/boundary code that must materialize legacy cwd behavior.
- Workspace-write profiles preserve `CurrentWorkingDirectory` and
`ProjectRoots` special entries instead of materializing cwd into
absolute paths.

## Verification

- `cargo check -p codex-protocol -p codex-core -p
codex-app-server-protocol -p codex-app-server -p codex-exec -p
codex-exec-server -p codex-tui -p codex-sandboxing -p
codex-linux-sandbox -p codex-analytics --tests`
- `just fix -p codex-protocol -p codex-core -p codex-app-server-protocol
-p codex-app-server -p codex-exec -p codex-exec-server -p codex-tui -p
codex-sandboxing -p codex-linux-sandbox -p codex-analytics`




---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19414).
* #19395
* #19394
* #19393
* #19392
* #19391
* __->__ #19414
2026-04-24 13:42:05 -07:00
Michael Bolin
4816b89204 permissions: make profiles represent enforcement (#19231)
## Why

`PermissionProfile` is becoming the canonical permissions abstraction,
but the old shape only carried optional filesystem and network fields.
It could describe allowed access, but not who is responsible for
enforcing it. That made `DangerFullAccess` and `ExternalSandbox` lossy
when profiles were exported, cached, or round-tripped through app-server
APIs.

The important model change is that active permissions are now a disjoint
union over the enforcement mode. Conceptually:

```rust
pub enum PermissionProfile {
    Managed {
        file_system: FileSystemSandboxPolicy,
        network: NetworkSandboxPolicy,
    },
    Disabled,
    External {
        network: NetworkSandboxPolicy,
    },
}
```

This distinction matters because `Disabled` means Codex should apply no
outer sandbox at all, while `External` means filesystem isolation is
owned by an outside caller. Those are not equivalent to a broad managed
sandbox. For example, macOS cannot nest Seatbelt inside Seatbelt, so an
inner sandbox may require the outer Codex layer to use no sandbox rather
than a permissive one.

## How Existing Modeling Maps

Legacy `SandboxPolicy` remains a boundary projection, but it now maps
into the higher-fidelity profile model:

- `ReadOnly` and `WorkspaceWrite` map to `PermissionProfile::Managed`
with restricted filesystem entries plus the corresponding network
policy.
- `DangerFullAccess` maps to `PermissionProfile::Disabled`, preserving
the “no outer sandbox” intent instead of treating it as a lax managed
sandbox.
- `ExternalSandbox { network_access }` maps to
`PermissionProfile::External { network }`, preserving external
filesystem enforcement while still carrying the active network policy.
- Split runtime policies that legacy `SandboxPolicy` cannot faithfully
express, such as managed unrestricted filesystem plus restricted
network, stay `Managed` instead of being collapsed into
`ExternalSandbox`.
- Per-command/session/turn grants remain partial overlays via
`AdditionalPermissionProfile`; full `PermissionProfile` is reserved for
complete active runtime permissions.

## What Changed

- Change active `PermissionProfile` into a tagged union: `managed`,
`disabled`, and `external`.
- Keep partial permission grants separate with
`AdditionalPermissionProfile` for command/session/turn overlays.
- Represent managed filesystem permissions as either `restricted`
entries or `unrestricted`; `glob_scan_max_depth` is non-zero when
present.
- Preserve old rollout compatibility by accepting the pre-tagged `{
network, file_system }` profile shape during deserialization.
- Preserve fidelity for important edge cases: `DangerFullAccess`
round-trips as `disabled`, `ExternalSandbox` round-trips as `external`,
and managed unrestricted filesystem + restricted network stays managed
instead of being mistaken for external enforcement.
- Preserve configured deny-read entries and bounded glob scan depth when
full profiles are projected back into runtime policies, including
unrestricted replacements that now become `:root = write` plus deny
entries.
- Regenerate the experimental app-server v2 JSON/TypeScript schema and
update the `command/exec` README example for the tagged
`permissionProfile` shape.

## Compatibility

Legacy `SandboxPolicy` remains available at config/API boundaries as the
compatibility projection. Existing rollout lines with the old
`PermissionProfile` shape continue to load. The app-server
`permissionProfile` field is experimental, so its v2 wire shape is
intentionally updated to match the higher-fidelity model.

## Verification

- `just write-app-server-schema`
- `cargo check --tests`
- `cargo test -p codex-protocol permission_profile`
- `cargo test -p codex-protocol
preserving_deny_entries_keeps_unrestricted_policy_enforceable`
- `cargo test -p codex-app-server-protocol
permission_profile_file_system_permissions`
- `cargo test -p codex-app-server-protocol serialize_client_response`
- `cargo test -p codex-core
session_configured_reports_permission_profile_for_external_sandbox`
- `just fix`
- `just fix -p codex-protocol`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-core`
- `just fix -p codex-app-server`
2026-04-23 23:02:18 -07:00
starr-openai
49fb25997f Add sticky environment API and thread state (#18897)
## Summary
- add sticky environment selections to app-server v2 thread/start and
turn/start request flow
- carry thread-level selections through core session/thread state
- add app-server coverage for sticky selections and turn overrides

## Stack
1. This PR: API and thread persistence
2. #18898: config.toml named environment loading
3. #18899: downstream tool/runtime consumers

## Validation
- Not run locally; split only.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-23 18:57:13 -07:00
Celia Chen
432771c5fd feat: expose AWS account state from account/read (#19048)
## Why

AWS/Bedrock mode currently reports `account: null` with
`requiresOpenaiAuth: false` from `account/read`. That suppresses the
OpenAI-auth requirement, but it does not let app clients distinguish AWS
auth from any other non-OpenAI custom provider. For the prototype AWS
provider UX, clients need a simple provider-derived signal so they can
suppress ChatGPT/API-key login and token-refresh paths without
hardcoding Bedrock checks.

## What changed

- Adds an `aws` variant to the v2 `Account` protocol union.
- Adds `ProviderAccountKind` to `codex-model-provider` so the runtime
provider owns the app-visible account classification.
- Makes Amazon Bedrock return `ProviderAccountKind::Aws` from the
model-provider layer.
- Updates app-server `account/read` to map `ProviderAccountKind` to the
existing `GetAccountResponse` wire shape.
- Preserves the existing `account: null, requiresOpenaiAuth: false`
behavior for other non-OpenAI providers.
- Regenerates the app-server protocol schema fixtures.
- Adds coverage for provider account classification and for the Amazon
Bedrock `account/read` response.

## Testing

- `cargo test -p codex-model-provider`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-app-server get_account_with_aws_provider`

## Notes

I attempted `just bazel-lock-update` and `just bazel-lock-check`, but
both are blocked in my local environment because `bazel` is not
installed.
2026-04-24 01:53:13 +00:00
efrazer-oai
5882f3f95e refactor: route Codex auth through AuthProvider (#18811)
## Summary

This PR moves Codex backend request authentication from direct
bearer-token handling to `AuthProvider`.

The new `codex-auth-provider` crate defines the shared request-auth
trait. `CodexAuth::provider()` returns a provider that can apply all
headers needed for the selected auth mode.

This lets ChatGPT token auth and AgentIdentity auth share the same
callsite path:
- ChatGPT token auth applies bearer auth plus account/FedRAMP headers
where needed.
- AgentIdentity auth applies AgentAssertion plus account/FedRAMP headers
where needed.

Reference old stack: https://github.com/openai/codex/pull/17387/changes

## Callsite Migration

| Area | Change |
| --- | --- |
| backend-client | accepts an `AuthProvider` instead of a raw
token/header |
| chatgpt client/connectors | applies auth through
`CodexAuth::provider()` |
| cloud tasks | keeps Codex-backend gating, applies auth through
provider |
| cloud requirements | uses Codex-backend auth checks and provider
headers |
| app-server remote control | applies provider headers for backend calls
|
| MCP Apps/connectors | gates on `uses_codex_backend()` and keys caches
from generic account getters |
| model refresh | treats AgentIdentity as Codex-backend auth |
| OpenAI file upload path | rejects non-Codex-backend auth before
applying headers |
| core client setup | keeps model-provider auth flow and allows
AgentIdentity through provider-backed OpenAI auth |

## Stack

1. https://github.com/openai/codex/pull/18757: full revert
2. https://github.com/openai/codex/pull/18871: isolated Agent Identity
crate
3. https://github.com/openai/codex/pull/18785: explicit AgentIdentity
auth mode and startup task allocation
4. This PR: migrate Codex backend auth callsites through AuthProvider
5. https://github.com/openai/codex/pull/18904: accept AgentIdentity JWTs
and load `CODEX_AGENT_IDENTITY`

## Testing

Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.
2026-04-23 17:14:02 -07:00
Michael Bolin
9c0eced391 shell-escalation: carry resolved permission profiles (#18287)
## Why

Shell escalation still has adapter code that expects a legacy sandbox
policy, but command approvals should carry the resolved
`PermissionProfile` so callers can reason about the granted permissions
canonically.

## What changed

This introduces profile-shaped resolved escalation permissions while
retaining the derived legacy sandbox policy for the Unix escalation
adapter. It updates approval types, the escalation server protocol, and
tests that inspect escalated command permissions.

## Verification

- `cargo test -p codex-core --test all handle_container_exec_ --
--nocapture`
- `cargo test -p codex-core --test all handle_sandbox_ -- --nocapture`

























































---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18287).
* #18288
* __->__ #18287
2026-04-23 12:46:19 -07:00
Michael Bolin
f90cc0ee64 tui: carry permission profiles on user turns (#18285)
## Why

Per-turn permission overrides should use the same canonical profile
abstraction as session configuration. That lets TUI submissions preserve
exact configured permissions without round-tripping through legacy
sandbox fields.

## What changed

This adds `permission_profile` to user-turn operations, threads it
through TUI/app-server submission paths, fills the new field in existing
test fixtures, and adds coverage that composer submission includes the
configured profile.

## Verification

- `cargo test -p codex-tui permissions -- --nocapture`
- `cargo test -p codex-core --test all permissions_messages --
--nocapture`























































---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18285).
* #18288
* #18287
* #18286
* __->__ #18285
2026-04-23 11:54:17 -07:00
Won Park
17ae906048 Fix auto-review config compatibility across protocol and SDK (#19113)
## Why

This keeps the partial Guardian subagent -> Auto-review rename
forward-compatible across mixed Codex installations. Newer binaries need
to understand the new `auto_review` spelling, but they cannot write it
to shared `~/.codex/config.toml` yet because older CLI/app-server
bundles only know `user` and `guardian_subagent` and can fail during
config load before recovering.

The Python SDK had the opposite compatibility gap: app-server responses
can contain `approvalsReviewer: "auto_review"`, but the checked-in
generated SDK enum did not accept that value.

## What Changed

- Keep `ApprovalsReviewer::AutoReview` readable from both
`guardian_subagent` and `auto_review`, while serializing it as
`guardian_subagent` in both protocol crates.
- Update TUI Auto-review persistence tests so enabling Auto-review
writes `approvals_reviewer = "guardian_subagent"` while UI copy still
says Auto-review.
- Map managed/cloud `feature_requirements.auto_review` to the existing
`Feature::GuardianApproval` gate without adding a broad local
`[features].auto_review` key or changing config writes.
- Add `auto_review` to the Python SDK `ApprovalsReviewer` enum and cover
`ThreadResumeResponse` validation.

## Testing

- `cargo test -p codex-protocol approvals_reviewer`
- `cargo test -p codex-app-server-protocol approvals_reviewer`
- `cargo test -p codex-tui
update_feature_flags_enabling_guardian_selects_auto_review`
- `cargo test -p codex-tui
update_feature_flags_enabling_guardian_in_profile_sets_profile_auto_review_policy`
- `cargo test -p codex-core
feature_requirements_auto_review_disables_guardian_approval`
- `pytest
sdk/python/tests/test_client_rpc_methods.py::test_thread_resume_response_accepts_auto_review_reviewer`
- `git diff --check`
2026-04-23 03:12:56 -07:00
Eric Traut
bbff4ee61a Add safety check notification and error handling (#19055)
Adds a new app-server notification that fires when a user account has
been flagged for potential safety reasons.
2026-04-22 22:24:12 -07:00
Michael Bolin
082fc4f632 protocol: report session permission profiles (#18282)
## Why

Clients that observe `SessionConfigured` need the same canonical
permission view that app-server thread responses provide. Reporting the
profile in protocol events lets clients keep their local state
synchronized without reinterpreting legacy sandbox fields.

## What changed

This adds `permission_profile` to `SessionConfigured` and propagates it
through core, exec JSON output, MCP server messages, and TUI
history/widget handling.

## Verification

- `cargo test -p codex-tui permissions -- --nocapture`
- `cargo test -p codex-core --test all permissions_messages --
--nocapture`















































---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18282).
* #18288
* #18287
* #18286
* #18285
* #18284
* #18283
* __->__ #18282
2026-04-22 21:29:32 -07:00
Dylan Hurd
5e71da1424 feat(request-permissions) approve with strict review (#19050)
## Summary
Allow the user to approve a request_permissions_tool request with the
condition that all commands in the rest of the turn are reviewed by
guardian, regardless of sandbox status.

## Testing
- [x] Added unit tests
- [x] Ran locally
2026-04-23 01:56:32 +00:00
Won Park
83ec1eb5d6 Rename approvals reviewer variant to auto-review (#19056)
## Why

`approvals_reviewer` now uses `auto_review` as the canonical config/API
value after #18504, but the Rust enum variant and nearby helper/test
names still used `GuardianSubagent` / guardian approval wording. That
made follow-up code and reviews confusing even though the external value
had already moved to Auto-review.

## What changed

- Renamed `ApprovalsReviewer::GuardianSubagent` to
`ApprovalsReviewer::AutoReview`.
- Updated protocol, app-server, config, core, TUI, exec, and analytics
test callsites.
- Renamed nearby helper/test names from guardian approval wording to
Auto-review wording where they refer to the approvals reviewer mode.
- Preserved wire compatibility:
  - `auto_review` remains the canonical serialized value.
  - `guardian_subagent` remains accepted as a legacy alias.

This intentionally does not rename the `[features].guardian_approval`
key, `Feature::GuardianApproval`, `core/src/guardian`, analytics event
names, or app-server Guardian review event types.

## Verification

- `cargo test -p codex-protocol
approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent`
- `cargo test -p codex-app-server-protocol
approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent`
- `cargo test -p codex-config approvals_reviewer`
- `cargo test -p codex-tui update_feature_flags`
- `cargo test -p codex-core permissions_instructions`
- `cargo test -p codex-tui permissions_selection`
2026-04-22 17:22:35 -07:00
Michael Bolin
6ca038bbd1 rollout: persist turn permission profiles (#18281)
## Why

Resume and reconstruction need to preserve the permissions that were
active for each user turn. If rollouts only keep legacy sandbox fields,
replay cannot faithfully represent profile-shaped overrides introduced
earlier in the stack.

## What changed

This records `permission_profile` on user-turn rollout events,
reconstructs it through history/state extraction, and updates rollout
reconstruction and related fixtures to keep the field explicit.

## Verification

- `cargo test -p codex-core --test all permissions_messages --
--nocapture`
- `cargo test -p codex-core --test all request_permissions --
--nocapture`











































---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18281).
* #18288
* #18287
* #18286
* #18285
* #18284
* #18283
* #18282
* __->__ #18281
2026-04-22 17:00:29 -07:00
Won Park
46142c3cb0 Rebrand approvals reviewer config to auto-review (#18504)
### Why

Auto-review is the user-facing name for the approvals reviewer, but the
config/API value still exposed the old `guardian_subagent` name. That
made new configs and generated schemas point users at Guardian
terminology even though the intended product surface is Auto-review.

This PR updates the external `approvals_reviewer` value while preserving
compatibility for existing configs and clients.

### What changed

- Makes `auto_review` the canonical serialized value for
`approvals_reviewer`.
- Keeps `guardian_subagent` accepted as a legacy alias.
- Keeps `user` accepted and serialized as `user`.
- Updates generated config and app-server schemas so
`approvals_reviewer` includes:
  - `user`
  - `auto_review`
  - `guardian_subagent`
- Updates app-server README docs for the reviewer value.
- Updates analytics and config requirements tests for the canonical
auto_review value.


### Compatibility

Existing configs and API payloads using:

```toml
approvals_reviewer = "guardian_subagent"
```

continue to load and map to the Auto-review reviewer behavior. 

New serialization emits: 
```toml
approvals_reviewer = "auto_review" 
```

This PR intentionally does not rename the [features].guardian_approval
key or broad internal Guardian symbols. Those are split out for a
follow-up PR to keep this migration small and avoid touching large
TUI/internal surfaces.

**Verification**
cargo test -p codex-protocol
approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent
cargo test -p codex-app-server-protocol
approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent
2026-04-22 15:45:35 -07:00
Michael Bolin
18a26d7bbc app-server: accept permission profile overrides (#18279)
## Why

`PermissionProfile` is becoming the canonical permissions shape shared
by core and app-server. After app-server responses expose the active
profile, clients need to be able to send that same shape back when
starting, resuming, forking, or overriding a turn instead of translating
through the legacy `sandbox`/`sandboxPolicy` shorthands.

This still needs to preserve the existing requirements/platform
enforcement model. A profile-shaped request can be downgraded or
rejected by constraints, but the server should keep the user's
elevated-access intent for project trust decisions. Turn-level profile
overrides also need to retain existing read protections, including
deny-read entries and bounded glob-scan metadata, so a permission
override cannot accidentally drop configured protections such as
`**/*.env = deny`.

## What changed

- Adds optional `permissionProfile` request fields to `thread/start`,
`thread/resume`, `thread/fork`, and `turn/start`.
- Rejects ambiguous requests that specify both `permissionProfile` and
the legacy `sandbox`/`sandboxPolicy` fields, including running-thread
resume requests.
- Converts profile-shaped overrides into core runtime filesystem/network
permissions while continuing to derive the constrained legacy sandbox
projection used by existing execution paths.
- Preserves project-trust intent for profile overrides that are
equivalent to workspace-write or full-access sandbox requests.
- Preserves existing deny-read entries and `globScanMaxDepth` when
applying turn-level `permissionProfile` overrides.
- Updates app-server docs plus generated JSON/TypeScript schema fixtures
and regression coverage.

## Verification

- `cargo test -p codex-app-server-protocol schema_fixtures`
- `cargo test -p codex-core
session_configuration_apply_permission_profile_preserves_existing_deny_read_entries`







---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18279).
* #18288
* #18287
* #18286
* #18285
* #18284
* #18283
* #18282
* #18281
* #18280
* __->__ #18279
2026-04-22 13:34:33 -07:00
Dylan Hurd
ed4def8286 feat(auto-review) short-circuit (#18890)
## Summary
Short circuit the convo if auto-review hits too many denials

## Testing
- [x] Added unit tests

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-22 20:34:15 +00:00
Won Park
11e5af53c4 Add plumbing to approve stored Auto-Review denials (#18955)
## Summary

This adds the structural plumbing needed for an app-server client to
approve a previously denied Guardian review and carry that approval
context into the next model turn.

This PR does not add the actual `/auto-review-denials` tool 

## What Changed

- Added app-server v2 RPC `thread/approveGuardianDeniedAction`.
- Added generated JSON schema and TypeScript fixtures for
`ThreadApproveGuardianDeniedAction*`.
- Added core `Op::ApproveGuardianDeniedAction`.
- Added a core handler that validates the event is a denied Guardian
assessment and injects a developer message containing the stored denial
event JSON.
- Queues the approval context for the next turn if there is no active
turn yet.
- Added the TUI app-server bridge so `Op::ApproveGuardianDeniedAction {
event }` is routed to the app-server request.

## What This Does Not Do

- Does not add `/auto-review-denials`.
- Does not add chat widget recent-denial state.
- Does not add popup/list UI.
- Does not add a product-facing denial lookup/store.
- Does not change where Guardian denials are originally emitted or
persisted.

## Verification

- `cargo test -p codex-tui thread_approve_guardian_denied_action`
2026-04-22 10:38:19 -07:00
rhan-oai
213b17b7a3 [codex-analytics] guardian review TTFT plumbing and emission (#17696)
## Why

Guardian analytics includes time-to-first-token, but the Guardian
reviewer runs as a normal Codex session and `TurnCompleteEvent` did not
expose TTFT. The timing needs to flow through the standard
turn-completion protocol so Guardian review analytics can consume the
same value as the rest of the session machinery.

## What changed

Adds optional `time_to_first_token_ms` to `TurnCompleteEvent` and
populates it from `TurnTiming`. The value is carried through app-server
thread history, rollout reconstruction, TUI/app-server adapters, and
Guardian review session handling.

Guardian review analytics now captures TTFT from the reviewer
turn-complete event when available. Existing tests and fixtures are
updated to set the new optional field to `None` where TTFT is not
relevant.

## Verification

- `cargo clippy -p codex-tui --tests -- -D warnings`
- `cargo clippy -p codex-core --lib --tests -- -D warnings`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17696).
* __->__ #17696
* #17695
* #17693
* #18278
* #18953
2026-04-22 01:52:48 -07:00
starr-openai
1d4cc494c9 Add turn-scoped environment selections (#18416)
## Summary
- add experimental turn/start.environments params for per-turn
environment id + cwd selections
- pass selections through core protocol ops and resolve them with
EnvironmentManager before TurnContext creation
- treat omitted selections as default behavior, empty selections as no
environment, and non-empty selections as first environment/cwd as the
turn primary

## Testing
- ran `just fmt`
- ran `just write-app-server-schema`
- not run: unit tests for this stacked PR

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-21 17:48:33 -07:00
efrazer-oai
be75785504 fix: fully revert agent identity runtime wiring (#18757)
## Summary

This PR fully reverts the previously merged Agent Identity runtime
integration from the old stack:
https://github.com/openai/codex/pull/17387/changes

It removes the Codex-side task lifecycle wiring, rollout/session
persistence, feature flag plumbing, lazy `auth.json` mutation,
background task auth paths, and request callsite changes introduced by
that stack.

This leaves the repo in a clean pre-AgentIdentity integration state so
the follow-up PRs can reintroduce the pieces in smaller reviewable
layers.

## Stack

1. This PR: full revert
2. https://github.com/openai/codex/pull/18871: move Agent Identity
business logic into a crate
3. https://github.com/openai/codex/pull/18785: add explicit
AgentIdentity auth mode and startup task allocation
4. https://github.com/openai/codex/pull/18811: migrate auth callsites
through AuthProvider

## Testing

Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.
2026-04-21 14:30:55 -07:00
Michael Bolin
f8562bd47b sandboxing: intersect permission profiles semantically (#18275)
## Why

Permission approval responses must not be able to grant more access than
the tool requested. Moving this flow to `PermissionProfile` means the
comparison must be profile-shaped instead of `SandboxPolicy`-shaped, and
cwd-relative special paths such as `:cwd` and `:project_roots` must stay
anchored to the turn that produced the request.

## What changed

This implements semantic `PermissionProfile` intersection in
`codex-sandboxing` for file-system and network permissions. The
intersection accepts narrower path grants, rejects broader grants,
preserves deny-read carve-outs and glob scan depth, and materializes
cwd-dependent special-path grants to absolute paths before they can be
recorded for reuse.

The request-permissions response paths now use that intersection
consistently. App-server captures the request turn cwd before waiting
for the client response, includes that cwd in the v2 approval params,
and core stores the requested profile plus cwd for direct TUI/client
responses and Guardian decisions before recording turn- or
session-scoped grants. The TUI app-server bridge now preserves the
app-server request cwd when converting permission approval params into
core events.

## Verification

- `cargo test -p codex-sandboxing intersect_permission_profiles --
--nocapture`
- `cargo test -p codex-app-server request_permissions_response --
--nocapture`
- `cargo test -p codex-core
request_permissions_response_materializes_session_cwd_grants_before_recording
-- --nocapture`
- `cargo check -p codex-tui --tests`
- `cargo check --tests`
- `cargo test -p codex-tui
app_server_request_permissions_preserves_file_system_permissions`
2026-04-21 10:23:01 -07:00
pakrym-oai
2a226096f6 Split DeveloperInstructions into individual fragments. (#18813)
Split DeveloperInstructions into individual fragments.
2026-04-21 10:22:36 -07:00
pash-openai
dc1a8f2190 [tool search] support namespaced deferred dynamic tools (#18413)
Deferred dynamic tools need to round-trip a namespace so a tool returned
by `tool_search` can be called through the same registry key that core
uses for dispatch.

This change adds namespace support for dynamic tool specs/calls,
persists it through app-server thread state, and routes dynamic tool
calls by full `ToolName` while still sending the app the leaf tool name.
Deferred dynamic tools must provide a namespace; non-deferred dynamic
tools may remain top-level.

It also introduces `LoadableToolSpec` as the shared
function-or-namespace Responses shape used by both `tool_search` output
and dynamic tool registration, so dynamic tools use the same wrapping
logic in both paths.

Validation:
- `cargo test -p codex-tools`
- `cargo test -p codex-core tool_search`

---------

Co-authored-by: Sayan Sisodiya <sayan@openai.com>
2026-04-21 14:13:08 +08:00
Dylan Hurd
86535c9901 feat(auto-review) Handle request_permissions calls (#18393)
## Summary
When auto-review is enabled, it should handle request_permissions tool.
We'll need to clean up the UX but I'm planning to do that in a separate
pass

## Testing
- [x] Ran locally
<img width="893" height="396" alt="Screenshot 2026-04-17 at 1 16 13 PM"
src="https://github.com/user-attachments/assets/4c045c5f-1138-4c6c-ac6e-2cb6be4514d8"
/>

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-20 21:48:57 -07:00
Michael Bolin
3d2f123895 protocol: preserve glob scan depth in permission profiles (#18713)
## Why

#18274 made `PermissionProfile` the canonical file-system permissions
shape, but the round-trip from `FileSystemSandboxPolicy` to
`PermissionProfile` still dropped one piece of policy metadata:
`glob_scan_max_depth`.

That field is security-relevant for deny-read globs such as `**/*.env`.
On Linux, bubblewrap sandbox construction uses it to bound unreadable
glob expansion. If a profile copied from active runtime permissions
loses this value and is submitted back as an override, the resulting
`FileSystemSandboxPolicy` can behave differently even though the visible
permission entries look equivalent.

## What changed

- Add `glob_scan_max_depth` to protocol `FileSystemPermissions` and
preserve it when converting to/from `FileSystemSandboxPolicy`.
- Keep legacy `read`/`write` JSON for simple path-only permissions, but
force canonical JSON when glob scan depth is present so the metadata is
not silently dropped.
- Carry `globScanMaxDepth` through app-server
`AdditionalFileSystemPermissions`, generated JSON/TypeScript schemas,
and app-server/TUI conversion call sites.
- Preserve the metadata through sandboxing permission normalization,
merging, and intersection.
- Carry the merged scan depth into the effective
`FileSystemSandboxPolicy` used for command execution, so bounded
deny-read globs reach Linux bubblewrap materialization.

## Verification

- `cargo test -p codex-sandboxing glob_scan -- --nocapture`
- `cargo test -p codex-sandboxing policy_transforms -- --nocapture`
- `just fix -p codex-sandboxing`





---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18713).
* #18288
* #18287
* #18286
* #18285
* #18284
* #18283
* #18282
* #18281
* #18280
* #18279
* #18278
* #18277
* #18276
* #18275
* __->__ #18713
2026-04-20 19:42:45 -07:00
guinness-oai
1029742cf7 Add realtime silence tool (#18635)
## Summary

Adds a second realtime v2 function tool, `remain_silent`, so the
realtime model has an explicit non-speaking action when the
collaboration mode or latest context says it should not answer aloud.
This is stacked on #18597.

## Design

- Advertise `remain_silent` alongside `background_agent` in realtime v2
conversational sessions.
- Parse `remain_silent` function calls into a typed
`RealtimeEvent::NoopRequested` event.
- Have core answer that function call with an empty
`function_call_output` and deliberately avoid `response.create`, so no
follow-up realtime response is requested.
- Keep the event hidden from app-server/TUI surfaces; it is operational
plumbing, not user-visible conversation content.
2026-04-20 15:43:20 -07:00
rhan-oai
7f53e47250 [codex-analytics] guardian review analytics schema polishing (#17692)
## Why

Guardian review analytics needs a Rust event shape that matches the
backend schema while avoiding unnecessary PII exposure from reviewed
tool calls. This PR narrows the analytics payload to the fields we
intend to emit and keeps shared Guardian assessment enums in protocol
instead of duplicating equivalent analytics-only enums.

## What changed

- Uses protocol Guardian enums directly for `risk_level`,
`user_authorization`, `outcome`, and command source values.
- Removes high-risk reviewed-action fields from the analytics payload,
including raw commands, display strings, working directories, file
paths, network targets/hosts, justification text, retry reason, and
rationale text.
- Makes `target_item_id` and `tool_call_count` nullable so the Codex
event can represent cases where the app-server protocol or producer does
not have those values.
- Keeps lower-risk structured reviewed-action metadata such as sandbox
permissions, permission profile, `tty`, `execve` source/program, network
protocol/port, and MCP connector/tool labels.
- Adds an analytics reducer/client test covering `codex_guardian_review`
serialization with an optional `target_item_id` and absent removed
fields.

## Verification

- `cargo test -p codex-analytics
guardian_review_event_ingests_custom_fact_with_optional_target_item`
- `cargo fmt --check`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17692).
* #17696
* #17695
* #17693
* __->__ #17692
2026-04-20 13:08:17 -07:00
Michael Bolin
dcec516313 protocol: canonicalize file system permissions (#18274)
## Why

`PermissionProfile` needs stable, canonical file-system semantics before
it can become the primary runtime permissions abstraction. Without a
canonical form, callers have to keep re-deriving legacy sandbox maps and
profile comparisons remain lossy or order-dependent.

## What changed

This adds canonicalization helpers for `FileSystemPermissions` and
`PermissionProfile`, expands special paths into explicit sandbox
entries, and updates permission request/conversion paths to consume
those canonical entries. It also tightens the legacy bridge so root-wide
write profiles with narrower carveouts are not silently projected as
full-disk legacy access.

## Verification

- `cargo test -p codex-protocol
root_write_with_read_only_child_is_not_full_disk_write -- --nocapture`
- `cargo test -p codex-sandboxing permission -- --nocapture`
- `cargo test -p codex-tui permissions -- --nocapture`
2026-04-20 09:57:03 -07:00
jif-oai
b528ff02b6 chore: morpheus to path (#18353)
Make the morpheus agent (which is the phase 2 memories agent) follow the
agent-v2 path system by naming it `/morpheus`. To maintain the path
primitive this means moving it to a dedicated `AgentControl`

Co-authored-by: Codex <noreply@openai.com>
2026-04-20 10:32:20 +01:00
Adrian
e5b52a3caa Persist and prewarm agent tasks per thread (#17978)
## Summary
- persist registered agent tasks in the session state update stream so
the thread can reuse them
- prewarm task registration once identity registration succeeds, while
keeping startup failures best-effort
- isolate the session-side task lifecycle into a dedicated module so
AgentIdentityManager and RegisteredAgentTask do not leak across as many
core layers

## Testing
- cargo test -p codex-core startup_agent_task_prewarm
- cargo test -p codex-core
cached_agent_task_for_current_identity_clears_stale_task
- cargo test -p codex-core record_initial_history_
2026-04-19 15:45:28 -07:00
pakrym-oai
53b1570367 Update image outputs to default to high detail (#18386)
Do not assume the default `detail`.
2026-04-18 11:01:12 -07:00