- Split MCP runtime/server code out of `codex-core` into the new
`codex-mcp` crate. New/moved public structs/types include `McpConfig`,
`McpConnectionManager`, `ToolInfo`, `ToolPluginProvenance`,
`CodexAppsToolsCacheKey`, and the `McpManager` API
(`codex_mcp::mcp::McpManager` plus the `codex_core::mcp::McpManager`
wrapper/shim). New/moved functions include `with_codex_apps_mcp`,
`configured_mcp_servers`, `effective_mcp_servers`,
`collect_mcp_snapshot`, `collect_mcp_snapshot_from_manager`,
`qualified_mcp_tool_name_prefix`, and the MCP auth/skill-dependency
helpers. Why: this creates a focused MCP crate boundary and shrinks
`codex-core` without forcing every consumer to migrate in the same PR.
- Move MCP server config schema and persistence into `codex-config`.
New/moved structs/enums include `AppToolApproval`,
`McpServerToolConfig`, `McpServerConfig`, `RawMcpServerConfig`,
`McpServerTransportConfig`, `McpServerDisabledReason`, and
`codex_config::ConfigEditsBuilder`. New/moved functions include
`load_global_mcp_servers` and
`ConfigEditsBuilder::replace_mcp_servers`/`apply`. Why: MCP TOML
parsing/editing is config ownership, and this keeps config
validation/round-tripping (including per-tool approval overrides and
inline bearer-token rejection) in the config crate instead of
`codex-core`.
- Rewire `codex-core`, app-server, and plugin call sites onto the new
crates. Updated `Config::to_mcp_config(&self, plugins_manager)`,
`codex-rs/core/src/mcp.rs`, `codex-rs/core/src/connectors.rs`,
`codex-rs/core/src/codex.rs`,
`CodexMessageProcessor::list_mcp_server_status_task`, and
`utils/plugins/src/mcp_connector.rs` to build/pass the new MCP
config/runtime types. Why: plugin-provided MCP servers still merge with
user-configured servers, and runtime auth (`CodexAuth`) is threaded into
`with_codex_apps_mcp` / `collect_mcp_snapshot` explicitly so `McpConfig`
stays config-only.
## Why
Follow-up to #16288: the new dynamic provider auth token flow currently
defaults `refresh_interval_ms` to a non-zero value and rejects `0`
entirely.
For command-backed bearer auth, `0` should mean "never auto-refresh".
That lets callers keep using the cached token until the backend actually
returns `401 Unauthorized`, at which point Codex can rerun the auth
command as part of the existing retry path.
## What changed
- changed `ModelProviderAuthInfo.refresh_interval_ms` to accept `0` and
documented that value as disabling proactive refresh
- updated the external bearer token refresher to treat
`refresh_interval_ms = 0` as an indefinitely reusable cached token,
while still rerunning the auth command during unauthorized recovery
- regenerated `core/config.schema.json` so the schema minimum is `0` and
the new behavior is described in the field docs
- added coverage for both config deserialization and the no-auto-refresh
plus `401` recovery behavior
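As a sketch of the new behavior, reusing the `[model_providers.<id>.auth]` block documented in the provider-auth PR below (provider id and command are illustrative):

```toml
[model_providers.corp-openai.auth]
command = "gcloud"
args = ["auth", "print-access-token"]
timeout_ms = 5000
refresh_interval_ms = 0  # never auto-refresh: reuse the cached token until a 401 triggers a rerun of the auth command
```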
## How tested
- `cargo test -p codex-protocol`
- `cargo test -p codex-login`
- `cargo test -p codex-core test_deserialize_provider_auth_config_`
## Summary
Fixes #15189.
Custom model providers that set `requires_openai_auth = false` could
only use static credentials via `env_key` or
`experimental_bearer_token`. That is not enough for providers that mint
short-lived bearer tokens, because Codex had no way to run a command to
obtain a bearer token, cache it briefly in memory, and retry with a
refreshed token after a `401`.
This PR adds that provider config and wires it through the existing auth
design: request paths still go through `AuthManager.auth()` and
`UnauthorizedRecovery`, with `core` only choosing when to use a
provider-backed bearer-only `AuthManager`.
## Scope
To keep this PR reviewable, `/models` only uses provider auth for the
initial request in this change. It does **not** add a dedicated `401`
retry path for `/models`; that can be follow-up work if we still need it
after landing the main provider-token support.
## Example Usage
```toml
model_provider = "corp-openai"
[model_providers.corp-openai]
name = "Corp OpenAI"
base_url = "https://gateway.example.com/openai"
requires_openai_auth = false
[model_providers.corp-openai.auth]
command = "gcloud"
args = ["auth", "print-access-token"]
timeout_ms = 5000
refresh_interval_ms = 300000
```
The command contract is intentionally small:
- write the bearer token to `stdout`
- exit `0`
- any leading or trailing whitespace is trimmed before the token is used
## What Changed
- add `model_providers.<id>.auth` to the config model and generated
schema
- validate that command-backed provider auth is mutually exclusive with
`env_key`, `experimental_bearer_token`, and `requires_openai_auth`
- build a bearer-only `AuthManager` for `ModelClient` and
`ModelsManager` when a provider configures `auth`
- let normal Responses requests and realtime websocket connects use the
provider-backed bearer source through the same `AuthManager.auth()` path
- allow `/models` online refresh for command-auth providers and attach
the provider token to the initial `/models` request
- keep `auth.cwd` available as an advanced escape hatch and include it
in the generated config schema
## Testing
- `cargo test -p codex-core provider_auth_command`
- `cargo test -p codex-core
refresh_available_models_uses_provider_auth_token`
- `cargo test -p codex-core
test_deserialize_provider_auth_config_defaults`
## Docs
- `developers.openai.com/codex` should document the new
`[model_providers.<id>.auth]` block and the token-command contract
## Summary
This PR replaces the legacy network allow/deny list model with explicit
rule maps for domains and unix sockets across managed requirements,
permissions profiles, the network proxy config, and the app server
protocol.
Concretely, it:
- introduces typed domain (`allow` / `deny`) and unix socket permission
(`allow` / `none`) entries instead of separate `allowed_domains`,
`denied_domains`, and `allow_unix_sockets` lists
- updates config loading, managed requirements merging, and exec-policy
overlays to read and upsert rule entries consistently
- exposes the new shape through protocol/schema outputs, debug surfaces,
and app-server config APIs
- rejects the legacy list-based keys and updates docs/tests to reflect
the new config format
## Why
The previous representation split related network policy across multiple
parallel lists, which made merging and overriding rules harder to reason
about. Moving to explicit keyed permission maps gives us a single source
of truth per host/socket entry, makes allow/deny precedence clearer, and
gives protocol consumers access to the full rule state instead of
derived projections only.
## Backward Compatibility
### Backward compatible
- Managed requirements still accept the legacy
`experimental_network.allowed_domains`,
`experimental_network.denied_domains`, and
`experimental_network.allow_unix_sockets` fields. They are normalized
into the new canonical `domains` and `unix_sockets` maps internally.
- App-server v2 still deserializes legacy `allowedDomains`,
`deniedDomains`, and `allowUnixSockets` payloads, so older clients can
continue reading managed network requirements.
- App-server v2 responses still populate `allowedDomains`,
`deniedDomains`, and `allowUnixSockets` as legacy compatibility views
derived from the canonical maps.
- `managed_allowed_domains_only` keeps the same behavior after
normalization. Legacy managed allowlists still participate in the same
enforcement path as canonical `domains` entries.
### Not backward compatible
- Permissions profiles under `[permissions.<profile>.network]` no longer
accept the legacy list-based keys. Those configs must use the canonical
`[domains]` and `[unix_sockets]` tables instead of `allowed_domains`,
`denied_domains`, or `allow_unix_sockets`.
- Managed `experimental_network` config cannot mix canonical and legacy
forms in the same block. For example, `domains` cannot be combined with
`allowed_domains` or `denied_domains`, and `unix_sockets` cannot be
combined with `allow_unix_sockets`.
- The canonical format can express explicit `"none"` entries for unix
sockets, but those entries do not round-trip through the legacy
compatibility fields because the legacy fields only represent allow/deny
lists.
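A hedged sketch of the canonical shape, extending the `domains` example from the Testing section below with a `unix_sockets` entry (the socket path is illustrative):

```toml
[permissions.workspace.network.domains]
"www.example.com" = "allow"
"tracking.example.com" = "deny"

[permissions.workspace.network.unix_sockets]
"/var/run/docker.sock" = "none"
```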
## Testing
`/target/debug/codex sandbox macos --log-denials /bin/zsh -c 'curl https://www.example.com'` returns a 200 with this config:
```
[permissions.workspace.network.domains]
"www.example.com" = "allow"
```
and fails when set to deny: `curl: (56) CONNECT tunnel failed, response
403`.
Also tested backward compatibility path by verifying that adding the
following to `/etc/codex/requirements.toml` works:
```
[experimental_network]
allowed_domains = ["www.example.com"]
```
Add an environment manager that is a singleton and is created early in app-server (before the skill manager and before config loading).
Use an environment variable to point to a running exec server.
This PR adds a URI-based system to reference agents within a tree. This comes from a sync between research and engineering.
The main agent (the one manually spawned by a user) is always called `/root`. Any sub-agent it spawns is named, for example, `/root/agent_1`, where `agent_1` is chosen by the model.
Any agent can contact any other agent using its path.
Paths can be absolute or relative to the calling agent.
Resume is not supported on these new paths for now.
## Problem
When multiple Codex sessions are open at once, terminal tabs and windows
are hard to distinguish from each other. The existing status line only
helps once the TUI is already focused, so it does not solve the "which
tab is this?" problem.
This PR adds a first-class `/title` command so the terminal window or
tab title can carry a short, configurable summary of the current
session.
## Screenshot
<img width="849" height="320" alt="image"
src="https://github.com/user-attachments/assets/8b112927-7890-45ed-bb1e-adf2f584663d"
/>
## Mental model
`/statusline` and `/title` are separate status surfaces with different
constraints. The status line is an in-app footer that can be denser and
more detailed. The terminal title is external terminal metadata, so it
needs short, stable segments that still make multiple sessions easy to
tell apart.
The `/title` configuration is an ordered list of compact items. By
default it renders `spinner,project`, so active sessions show
lightweight progress first while idle sessions still stay easy to
disambiguate. Each configured item is omitted when its value is not
currently available rather than forcing a placeholder.
## Non-goals
This does not merge `/title` into `/statusline`, and it does not add an
arbitrary free-form title string. The feature is intentionally limited
to a small set of structured items so the title stays short and
reviewable.
This also does not attempt to restore whatever title the terminal or
shell had before Codex started. When Codex clears the title, it clears
the title Codex last wrote.
## Tradeoffs
A separate `/title` command adds some conceptual overlap with
`/statusline`, but it keeps title-specific constraints explicit instead
of forcing the status line model to cover two different surfaces.
Title refresh can happen frequently, so the implementation now shares
parsing and git-branch orchestration between the status line and title
paths, and caches the derived project-root name by cwd. That keeps the
hot path cheap without introducing background polling.
## Architecture
The TUI gets a new `/title` slash command and a dedicated picker UI for
selecting and ordering terminal-title items. The chosen ids are
persisted in `tui.terminal_title`, with `spinner` and `project` as the
default when the config is unset. `status` remains available as a
separate text item, so configurations like `spinner,status` render
compact progress like `⠋ Working`.
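A hedged sketch of the persisted config, assuming the chosen ids serialize as a string array under `[tui]`:

```toml
[tui]
terminal_title = ["spinner", "project"]   # the default when the key is unset
# terminal_title = ["spinner", "status"]  # renders compact progress like "⠋ Working"
```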
`ChatWidget` now refreshes both status surfaces through a shared
`refresh_status_surfaces()` path. That shared path parses configured
items once, warns on invalid ids once, synchronizes shared cached state
such as git-branch lookup, then renders the footer status line and
terminal title from the same snapshot.
Low-level OSC title writes live in `codex-rs/tui/src/terminal_title.rs`,
which owns the terminal write path and last-mile sanitization before
emitting OSC 0.
## Security
Terminal-title text is treated as untrusted display content before Codex
emits it. The write path strips control characters, removes invisible
and bidi formatting characters that can make the title visually
misleading, normalizes whitespace, and caps the emitted length.
References used while implementing this:
- [xterm control
sequences](https://invisible-island.net/xterm/ctlseqs/ctlseqs.html)
- [WezTerm escape sequences](https://wezterm.org/escape-sequences.html)
- [CWE-150: Improper Neutralization of Escape, Meta, or Control
Sequences](https://cwe.mitre.org/data/definitions/150.html)
- [CERT VU#999008 (Trojan Source)](https://kb.cert.org/vuls/id/999008)
- [Trojan Source disclosure site](https://trojansource.codes/)
- [Unicode Bidirectional Algorithm (UAX
#9)](https://www.unicode.org/reports/tr9/)
- [Unicode Security Considerations (UTR
#36)](https://www.unicode.org/reports/tr36/)
## Observability
Unknown configured title item ids are warned about once instead of
repeatedly spamming the transcript. Live preview applies immediately
while the `/title` picker is open, and cancel rolls the in-memory title
selection back to the pre-picker value.
If terminal title writes fail, the TUI emits debug logs around set and
clear attempts. The rendered status label intentionally collapses richer
internal states into compact title text such as `Starting...`, `Ready`,
`Thinking...`, `Working...`, `Waiting...`, and `Undoing...` when
`status` is configured.
## Tests
Ran:
- `just fmt`
- `cargo test -p codex-tui`
At the moment, the red Windows `rust-ci` failures are due to existing
`codex-core` `apply_patch_cli` stack-overflow tests that also reproduce
on `main`. The `/title`-specific `codex-tui` suite is green.
- thread the realtime version into conversation start and app-server
notifications
- keep playback-aware mic gating and playback interruption behavior on
v2 only, leaving v1 on the legacy path
## Description
This PR fixes a bad first-turn failure mode in app-server when the
startup websocket prewarm hangs. Before this change, `initialize ->
thread/start -> turn/start` could sit behind the prewarm for up to five
minutes, so the client would not see `turn/started`, and even
`turn/interrupt` would block because the turn had not actually started
yet.
Now:
- websocket startup has a configurable 15s timeout, exposed as
`websocket_startup_timeout_ms` in config.toml
- `turn/started` is sent immediately on `turn/start`, even if the
websocket is still connecting
- `turn/interrupt` can cancel a turn that is still waiting on the
websocket warmup
- the turn task waits for the full 15s websocket warmup timeout before
falling back
## Why
The old behavior made app-server feel stuck at exactly the moment the
client expects turn lifecycle events to start flowing. That was
especially painful for external clients, because from their point of
view the server had accepted the request but then went silent for
minutes.
## Configuring the websocket startup timeout
You can set it in `config.toml` like this:
```
[model_providers.openai]
supports_websockets = true
websocket_connect_timeout_ms = 15000
```
This PR replicates the `tui` code directory and creates a temporary
parallel `tui_app_server` directory. It also implements a new feature
flag `tui_app_server` to select between the two tui implementations.
Once the new app-server-based TUI is stabilized, we'll delete the old
`tui` directory and feature flag.
## Summary
- reuse a guardian subagent session across approvals so reviews keep a
stable prompt cache key and avoid one-shot startup overhead
- clear the guardian child history before each review so prior guardian
decisions do not leak into later approvals
- include the `smart_approvals` -> `guardian_approval` feature flag
rename in the same PR to minimize release latency on a very tight
timeline
- add regression coverage for prompt-cache-key reuse without
prior-review prompt bleed
## Request
- Bug/enhancement request: internal guardian prompt-cache and latency
improvement request
---------
Co-authored-by: Codex <noreply@openai.com>
We receive bug reports from users who attempt to override one of the
three built-in model providers (openai, ollama, or lmstudio). Currently,
these overrides are silently ignored. This PR makes it an error to
override them.
## Summary
- add validation for `model_providers` so `openai`, `ollama`, and
`lmstudio` keys now produce clear configuration errors instead of being
silently ignored
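For example, a config like the following (keys illustrative) now fails config loading with a clear error instead of being silently ignored:

```toml
[model_providers.openai]
name = "My proxy"
base_url = "https://proxy.example.com/v1"
```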
We regularly get bug reports from users who mistakenly have the
`OPENAI_BASE_URL` environment variable set. This PR deprecates this
environment variable in favor of a top-level config key
`openai_base_url` that is used for the same purpose. By making it a
config key, it will be more visible to users. It will also participate
in all of the infrastructure we've added for layered and managed
configs.
## Summary
- introduce the `openai_base_url` top-level config key (example below), update
schema/tests, and route the built-in openai provider through it
- fall back to the deprecated `OPENAI_BASE_URL` env var, warning about the
deprecation, when no `openai_base_url` config key is present
- update CLI, SDK, and TUI code to prefer the new config path (with a
deprecated env-var fallback) and document the SDK behavior change
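A minimal sketch of the new key (URL illustrative):

```toml
# Replaces the deprecated OPENAI_BASE_URL environment variable.
openai_base_url = "https://gateway.example.com/v1"
```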
## Summary
- add `approvals_reviewer = "user" | "guardian_subagent"` as the runtime
control for who reviews approval requests
- route Smart Approvals guardian review through core for command
execution, file changes, managed-network approvals, MCP approvals, and
delegated/subagent approval flows
- expose guardian review in app-server with temporary unstable
`item/autoApprovalReview/{started,completed}` notifications carrying
`targetItemId`, `review`, and `action`
- update the TUI so Smart Approvals can be enabled from `/experimental`,
aligned with the matching `/approvals` mode, and surfaced clearly while
reviews are pending or resolved
## Runtime model
This PR does not introduce a new `approval_policy`.
Instead:
- `approval_policy` still controls when approval is needed
- `approvals_reviewer` controls who reviewable approval requests are
routed to:
- `user`
- `guardian_subagent`
`guardian_subagent` is a carefully prompted reviewer subagent that
gathers relevant context and applies a risk-based decision framework
before approving or denying the request.
The `smart_approvals` feature flag is a rollout/UI gate. Core runtime
behavior keys off `approvals_reviewer`.
When Smart Approvals is enabled from the TUI, it also switches the
current `/approvals` settings to the matching Smart Approvals mode so
users immediately see guardian review in the active thread:
- `approval_policy = on-request`
- `approvals_reviewer = guardian_subagent`
- `sandbox_mode = workspace-write`
Users can still change `/approvals` afterward.
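In `config.toml` terms, the effective state the TUI switches to corresponds roughly to this hedged sketch (values taken from the list above):

```toml
smart_approvals = true                     # rollout/UI gate only
approval_policy = "on-request"             # still controls when approval is needed
approvals_reviewer = "guardian_subagent"   # controls who reviews those requests
sandbox_mode = "workspace-write"
```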
Config-load behavior stays intentionally narrow:
- plain `smart_approvals = true` in `config.toml` remains just the
rollout/UI gate and does not auto-set `approvals_reviewer`
- the deprecated `guardian_approval = true` alias migration does
backfill `approvals_reviewer = "guardian_subagent"` in the same scope
when that reviewer is not already configured there, so old configs
preserve their original guardian-enabled behavior
ARC remains a separate safety check. For MCP tool approvals, ARC
escalations now flow into the configured reviewer instead of always
bypassing guardian and forcing manual review.
## Config stability
The runtime reviewer override is stable, but the config-backed
app-server protocol shape is still settling.
- `thread/start`, `thread/resume`, and `turn/start` keep stable
`approvalsReviewer` overrides
- the config-backed `approvals_reviewer` exposure returned via
`config/read` (including profile-level config) is now marked
`[UNSTABLE]` / experimental in the app-server protocol until we are more
confident in that config surface
## App-server surface
This PR intentionally keeps the guardian app-server shape narrow and
temporary.
It adds generic unstable lifecycle notifications:
- `item/autoApprovalReview/started`
- `item/autoApprovalReview/completed`
with payloads of the form:
- `{ threadId, turnId, targetItemId, review, action? }`
`review` is currently:
- `{ status, riskScore?, riskLevel?, rationale? }`
- where `status` is one of `inProgress`, `approved`, `denied`, or
`aborted`
`action` carries the guardian action summary payload from core when
available. This lets clients render temporary standalone pending-review
UI, including parallel reviews, even when the underlying tool item has
not been emitted yet.
These notifications are explicitly documented as `[UNSTABLE]` and
expected to change soon.
This PR does **not** persist guardian review state onto `thread/read`
tool items. The intended follow-up is to attach guardian review state to
the reviewed tool item lifecycle instead, which would improve
consistency with manual approvals and allow thread history / reconnect
flows to replay guardian review state directly.
## TUI behavior
- `/experimental` exposes the rollout gate as `Smart Approvals`
- enabling it in the TUI enables the feature and switches the current
session to the matching Smart Approvals `/approvals` mode
- disabling it in the TUI clears the persisted `approvals_reviewer`
override when appropriate and returns the session to default manual
review when the effective reviewer changes
- `/approvals` still exposes the reviewer choice directly
- the TUI renders:
- pending guardian review state in the live status footer, including
parallel review aggregation
- resolved approval/denial state in history
## Scope notes
This PR includes the supporting core/runtime work needed to make Smart
Approvals usable end-to-end:
- shell / unified-exec / apply_patch / managed-network / MCP guardian
review
- delegated/subagent approval routing into guardian review
- guardian review risk metadata and action summaries for app-server/TUI
- config/profile/TUI handling for `smart_approvals`, `guardian_approval`
alias migration, and `approvals_reviewer`
- a small internal cleanup of delegated approval forwarding to dedupe
fallback paths and simplify guardian-vs-parent approval waiting (no
intended behavior change)
Out of scope for this PR:
- redesigning the existing manual approval protocol shapes
- persisting guardian review state onto app-server `ThreadItem`s
- delegated MCP elicitation auto-review (the current delegated MCP
guardian shim only covers the legacy `RequestUserInput` path)
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- add the code_mode_only feature flag/config schema and wire its
dependency on code_mode
- update code mode tool descriptions to list nested tools with detailed
headers
- restrict available tools for prompt and exec descriptions when
code_mode_only is enabled and test the behavior
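A hedged sketch of enabling the flag, assuming it lives alongside `code_mode` under the same `[features]` table used for other flags in this changelog:

```toml
[features]
code_mode = true
code_mode_only = true  # described above as depending on code_mode
```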
## Testing
- Not run (not requested)
## Summary
- launch Windows sandboxed children on a private desktop instead of
`Winsta0\Default`
- make private desktop the default while keeping
`windows.sandbox_private_desktop=false` as the escape hatch
- centralize process launch through the shared
`create_process_as_user(...)` path
- scope the private desktop ACL to the launching logon SID
## Why
Today sandboxed Windows commands run on the visible shared desktop. That
leaves an avoidable same-desktop attack surface for window interaction,
spoofing, and related UI/input issues. This change moves sandboxed
commands onto a dedicated per-launch desktop by default so the sandbox
no longer shares `Winsta0\Default` with the user session.
The implementation stays conservative on security, with no silent fallback to `Winsta0\Default`. If private-desktop setup fails on a machine, users can still opt out explicitly with `windows.sandbox_private_desktop=false`.
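A sketch of that escape hatch in `config.toml`:

```toml
[windows]
sandbox_private_desktop = false  # opt out of the private desktop; sandboxed commands run on the shared desktop again
```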
## Validation
- `cargo build -p codex-cli`
- elevated-path `codex exec` desktop-name probe returned
`CodexSandboxDesktop-*`
- elevated-path `codex exec` smoke sweep for shell commands, nested
`pwsh`, jobs, and hidden `notepad` launch
- unelevated-path full private-desktop compatibility sweep via `codex
exec` with `-c windows.sandbox=unelevated`
- add experimental_realtime_ws_mode (conversational/transcription) and
plumb it into realtime conversation session config
- switch realtime websocket intent and session.update payload shape
based on mode
- update config schema and realtime/config tests
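A hedged sketch of the new knob; the exact key location in `config.toml` is an assumption:

```toml
# One of the two documented modes: "conversational" or "transcription".
experimental_realtime_ws_mode = "transcription"
```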
---------
Co-authored-by: Codex <noreply@openai.com>
- Add a feature-flagged realtime v2 parser on the existing
websocket/session pipeline.
- Wire parser selection from core feature flags and map the codex
handoff tool-call path into existing handoff events.
---------
Co-authored-by: Codex <noreply@openai.com>
## Summary
- rename the public feature flag for `spawn_agents_on_csv()` from
`spawn_csv` to `enable_fanout`
- regenerate the config schema so only `enable_fanout` is advertised
- keep the behavior the same: enabling `enable_fanout` still pulls in
`multi_agent`
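A hedged sketch of the renamed flag, assuming it is set under the same `[features]` table as the other flags in this changelog:

```toml
[features]
enable_fanout = true  # still pulls in multi_agent; the old spawn_csv key is no longer accepted
```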
## Notes
- this is a hard rename with no `spawn_csv` compatibility alias
- the internal enum remains `Feature::SpawnCsv` to keep the patch small
## Testing
- `cd codex-rs && just fmt`
- `cd codex-rs && cargo test -p codex-core` (running locally;
`suite::agent_jobs::*` and rename-specific coverage passed so far)
## Summary
- restore `use_linux_sandbox_bwrap` as a removed feature key so older
`--enable` callers parse again
- keep it as a no-op by leaving runtime behavior unchanged
- add regression coverage for the legacy `--enable` path
## Testing
- Not run (updated and pushed quickly)
## Summary
- make bubblewrap the default Linux sandbox and keep
`use_legacy_landlock` as the only override
- remove `use_linux_sandbox_bwrap` from feature, config, schema, and
docs surfaces
- update Linux sandbox selection, CLI/config plumbing, and related
tests/docs to match the new default
- fold in the follow-up CI fixes for request-permissions responses and
Linux read-only sandbox error text
## Summary
- add `skill_approval` to `RejectConfig` and the app-server v2
`AskForApproval::Reject` payload so skill-script prompts can be
configured independently from sandbox and rule-based prompts
- update Unix shell escalation to reject prompts based on the actual
decision source, keeping prefix rules tied to `rules`, unmatched command
fallbacks tied to `sandbox_approval`, and skill scripts tied to
`skill_approval`
- regenerate the affected protocol/config schemas and expand
unit/integration coverage for the new flag and skill approval behavior
(Experimental)
This PR adds a first MVP for hooks, with `SessionStart` and `Stop` hooks.
The core design is:
- hooks live in a dedicated engine under codex-rs/hooks
- each hook type has its own event-specific file
- hook execution is synchronous and blocks normal turn progression while
running
- matching hooks run in parallel, then their results are aggregated into
a normalized HookRunSummary
On the AppServer side, hooks are exposed as operational metadata rather
than transcript-native items:
- new live notifications: `hook/started`, `hook/completed`
- persisted/replayed hook results live on `Turn.hookRuns`
- we intentionally did not add hook-specific `ThreadItem` variants
Hook messages themselves are not persisted; they remain ephemeral. The context changes they add are persisted, since they get appended to the user's prompt.
## Summary
We need to support allowing `request_permissions` calls when using the `Reject` policy.
<img width="1133" height="588" alt="Screenshot 2026-03-09 at 12 06
40 PM"
src="https://github.com/user-attachments/assets/a8df987f-c225-4866-b8ab-5590960daec5"
/>
Note that this is a backwards-incompatible change for the `Reject` policy. I'm not sure if we need to add a default based on our current use/setup.
## Testing
- [x] Added tests
- [x] Tested locally
Adds a built-in `request_permissions` tool and wires it through the
Codex core, protocol, and app-server layers so a running turn can ask
the client for additional permissions instead of relying on a static
session policy.
The new flow emits a `RequestPermissions` event from core, tracks the
pending request by call ID, forwards it through app-server v2 as an
`item/permissions/requestApproval` request, and resumes the tool call
once the client returns an approved subset of the requested permission
profile.
## Summary
- add the guardian reviewer flow for `on-request` approvals in command,
patch, sandbox-retry, and managed-network approval paths
- keep guardian behind `features.guardian_approval` instead of exposing
a public `approval_policy = guardian` mode
- route ordinary `OnRequest` approvals to the guardian subagent when the
feature is enabled, without changing the public approval-mode surface
## Public model
- public approval modes stay unchanged
- guardian is enabled via `features.guardian_approval`
- when that feature is on, `approval_policy = on-request` keeps the same
approval boundaries but sends those approval requests to the guardian
reviewer instead of the user
- `/experimental` only persists the feature flag; it does not rewrite
`approval_policy`
- CLI and app-server no longer expose a separate `guardian` approval
mode in this PR
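A hedged sketch of enabling guardian review without changing the public approval modes:

```toml
approval_policy = "on-request"  # same approval boundaries as before

[features]
guardian_approval = true  # routes on-request approvals to the guardian reviewer
```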
## Guardian reviewer
- the reviewer runs as a normal subagent and reuses the existing
subagent/thread machinery
- it is locked to a read-only sandbox and `approval_policy = never`
- it does not inherit user/project exec-policy rules
- it prefers `gpt-5.4` when the current provider exposes it, otherwise
falls back to the parent turn's active model
- it fail-closes on timeout, startup failure, malformed output, or any
other review error
- it currently auto-approves only when `risk_score < 80`
## Review context and policy
- guardian mirrors `OnRequest` approval semantics rather than
introducing a separate approval policy
- explicit `require_escalated` requests follow the same approval surface
as `OnRequest`; the difference is only who reviews them
- managed-network allowlist misses that enter the approval flow are also
reviewed by guardian
- the review prompt includes bounded recent transcript history plus
recent tool call/result evidence
- transcript entries and planned-action strings are truncated with
explicit `<guardian_truncated ... />` markers so large payloads stay
bounded
- apply-patch reviews include the full patch content (without
duplicating the structured `changes` payload)
- the guardian request layout is snapshot-tested using the same
model-visible Responses request formatter used elsewhere in core
## Guardian network behavior
- the guardian subagent inherits the parent session's managed-network
allowlist when one exists, so it can use the same approved network
surface while reviewing
- exact session-scoped network approvals are copied into the guardian
session with protocol/port scope preserved
- those copied approvals are now seeded before the guardian's first turn
is submitted, so inherited approvals are available during any immediate
review-time checks
## Out of scope / follow-ups
- the sandbox-permission validation split was pulled into a separate PR
and is not part of this diff
- a future follow-up can enable `serde_json` preserve-order in
`codex-core` and then simplify the guardian action rendering further
---------
Co-authored-by: Codex <noreply@openai.com>