Commit Graph

3093 Commits

Author SHA1 Message Date
Michael Bolin
6058961865 app-server: compare resume sandbox from permission profile 2026-04-30 06:26:47 -07:00
Michael Bolin
4a90f76298 mcp: send sandbox metadata as permission profile only 2026-04-30 06:20:52 -07:00
Michael Bolin
513a439b8f thread-store: stop exposing legacy sandbox policy 2026-04-30 06:16:53 -07:00
Michael Bolin
5d82aeefcc test support: derive turn sandbox from permission profiles 2026-04-30 05:32:17 -07:00
Michael Bolin
0d0d3d8d12 windows setup: derive sandbox from permission profile 2026-04-30 05:26:50 -07:00
Michael Bolin
df12b83033 windows read grants: accept permission profiles 2026-04-30 05:21:34 -07:00
Michael Bolin
53f8983bf5 session tests: configure permissions with profiles 2026-04-30 05:21:34 -07:00
Michael Bolin
3a0e5391bb approval tests: configure scenarios with permission profiles 2026-04-30 04:54:18 -07:00
Michael Bolin
fa4fad57f0 otel: report conversation permissions from profiles 2026-04-30 04:05:23 -07:00
Michael Bolin
37aa2f8157 protocol: drop cwd-less legacy profile constructor 2026-04-30 04:00:16 -07:00
Michael Bolin
3c5890a1ce session tests: configure runtime permissions directly 2026-04-30 03:47:14 -07:00
Michael Bolin
7364013fe0 tests: mutate spawn-agent permission profile directly 2026-04-30 03:43:57 -07:00
Michael Bolin
2876493bae tests: use disabled profile in exec capture check 2026-04-30 03:28:48 -07:00
Michael Bolin
e894ac76f7 tests: use profile constructors in config checks 2026-04-30 03:26:51 -07:00
Michael Bolin
4cf7855a99 tests: use permission profiles in multi-agent config checks 2026-04-30 03:23:18 -07:00
Michael Bolin
c48043f4e4 tests: use permission profiles in session network checks 2026-04-30 03:18:22 -07:00
Michael Bolin
8a2144d700 tests: use permission profiles in config loader checks 2026-04-30 03:18:22 -07:00
Michael Bolin
0fc2a7b068 tests: submit websocket turns with permission profiles 2026-04-30 03:08:22 -07:00
Michael Bolin
4f646e0aca tests: use permission profiles in exec policy checks 2026-04-30 03:04:35 -07:00
Michael Bolin
e28bb5c396 tests: use permission profiles in request permission suite 2026-04-30 03:01:06 -07:00
Michael Bolin
521cf5bdd4 tests: use permission profiles in unified exec suite 2026-04-30 03:01:06 -07:00
Michael Bolin
57094ee86d core: use permission profiles in small read-only contexts 2026-04-30 03:01:06 -07:00
Michael Bolin
200c83f7d7 tests: use permission profiles in suite turn submits 2026-04-30 02:36:30 -07:00
Michael Bolin
cfeaa5aab1 guardian: configure review session permissions directly 2026-04-30 02:36:30 -07:00
Michael Bolin
75c9c98aed tests: use permission profiles in small core fixtures 2026-04-30 02:36:30 -07:00
Michael Bolin
05d341f0d4 tests: use permission profiles in guardian config checks 2026-04-30 02:36:30 -07:00
Michael Bolin
d53c86e0da tests: use permission profiles in unix escalation checks 2026-04-30 02:36:30 -07:00
Michael Bolin
44ec706a44 tests: use permission profiles in patch safety checks 2026-04-30 02:36:30 -07:00
Michael Bolin
a3880e937b tests: use permission profiles in tool sandbox tests 2026-04-30 02:36:30 -07:00
Michael Bolin
ee05c896f7 tests: use permission profile fixtures in config checks 2026-04-30 02:36:30 -07:00
Michael Bolin
ada7881352 core: build permission instructions from profiles only 2026-04-30 02:36:30 -07:00
Michael Bolin
97aaf4cea4 tests: copy plugin stdio server before launch 2026-04-30 02:36:21 -07:00
jif-oai
c37f7434ba Gate multi-agent v2 tools independently of collab (#20246)
## Why

`multi_agents_v2` is meant to be independently gated from the older
`collab` feature. The tool registry still treated the
collaboration-style agent tools as `collab`-only, so enabling
`multi_agents_v2` without `collab` omitted the v2 agent tools. Review
and guardian sub-sessions also need to keep agent spawning disabled even
when the outer session has `multi_agents_v2` enabled.

## What changed

- Include the collab-backed agent tools when either `multi_agents_v2` or
`collab` is enabled.
- Explicitly disable `multi_agents_v2` for review and guardian review
sub-sessions, matching the existing `spawn_csv` and `collab`
restrictions.
- Add a registry test that enables `multi_agents_v2`, disables `collab`,
and verifies the v2 agent tools are present while legacy `send_input`
and `resume_agent` remain hidden.

## Testing

- Added
`test_build_specs_multi_agent_v2_does_not_require_collab_feature`.
2026-04-30 10:23:31 +02:00
Abhinav
8f3c06cc97 Add persisted hook enablement state (#19840)
## Why

After `hooks/list` exposes the hook inventory, clients need a way to
persist user hook preferences, make those changes effective in
already-open sessions, and distinguish user-controllable hooks from
managed requirements without adding another bespoke app-server write
API.

## What

- Extends `hooks/list` entries with effective `enabled` state.
- Persists user-level hook state under `hooks.state.<hook-id>` so the
model can grow beyond a single boolean over time.
- Uses the existing `config/batchWrite` path for hook state updates
instead of introducing a dedicated hook write RPC.
- Refreshes live session hook engines after config writes so
already-open threads observe updated enablement without a restart.

## Stack

1. openai/codex#19705
2. openai/codex#19778
3. This PR - openai/codex#19840
4. openai/codex#19882

## Reviewer Notes

The generated schema files account for much of the raw diff. The core
behavior is in:

- `hooks/src/config_rules.rs`, which resolves per-hook user state from
the config layer stack.
- `hooks/src/engine/discovery.rs`, which projects effective enablement
into `hooks/list` from source-derived managedness.
- `config/src/hook_config.rs`, which defines the new `hooks.state`
representation.
- `core/src/session/mod.rs`, which rebuilds live hook state after user
config reloads.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-30 04:46:32 +00:00
Michael Bolin
ac4332c05b permissions: expose active profile metadata (#20095) 2026-04-29 20:54:59 -07:00
Matthew Zeng
ebe602d005 [plugins] Allow MSFT curated plugins in tool_suggest (#20304)
## Summary
- [x] Move the allowlist out of core crate
- [x] Add Teams, SharePoint, Outlook Email, and Outlook Calendar to the
tool_suggest discoverable plugin allowlist
- [x] Add focused coverage for Microsoft curated plugin discovery

## Testing
- just fmt
- cargo test -p codex-core-plugins
- cargo test -p codex-core
list_tool_suggest_discoverable_plugins_returns_
2026-04-29 19:45:52 -07:00
rhan-oai
bb536d65bd [codex-analytics] prevent stale guardian events from satisfying reused reviews (#20080)
## Why

Reused Guardian review trunks can still have older child-turn events
queued when a later review starts. The review waiter currently accepts
the first terminal event it sees from the shared child session, so a
stale `TurnComplete` can be attributed to the new review. That produces
impossible analytics combinations such as non-null TTFT with sub-10 ms
completion latency and zero token deltas on `trunk_reused` reviews.

## What changed

- Preserve the child turn id returned by the Guardian review
`Op::UserTurn` submission.
- Restrict Guardian review waiting to events correlated with that
submitted child turn.
- Restrict timeout/abort draining to terminal events for the same child
turn.
- Add regression coverage for stale prior-turn completions, stale
prior-turn errors, and interrupt draining in
`codex-rs/core/src/guardian/review_session.rs`.

## Verification

- `cargo test -p codex-core guardian::review_session::tests::`
- `cargo clippy -p codex-core --tests -- -D warnings`
2026-04-29 18:26:39 -07:00
pakrym-oai
fedcefe9da Reduce the surface of collaboration modes (#20149)
Collaboration modes were slightly invasive both into ThreadManager
construction and ModelProvider
2026-04-29 17:22:41 -07:00
Abhinav
8774229a89 Add hooks/list app-server RPC (#19778)
## Why

We need a way to list the available hooks to expose via the TUI and App
so users can view and manage their hooks

## What

- Adds `hooks/list` for one or more `cwd` values that returns discovered
hook metadata

## Stack

1. openai/codex#19705
2. This PR - openai/codex#19778
3. openai/codex#19840
4. openai/codex#19882

## Review Notes

The generated schema files account for most of the raw diff, these files
have the core change:

- `hooks/src/engine/discovery.rs` builds the inventory entries during
hook discovery while leaving runtime handlers focused on execution.
- `app-server/src/codex_message_processor.rs` wires `hooks/list` into
the app-server flow for each requested `cwd`.
- `app-server-protocol/src/protocol/v2.rs` defines the new v2
request/response payloads exposed on the wire.

### Core Changes

`core/src/plugins/manager.rs` adds `plugins_for_layer_stack(...)` so
`skills/list` and `hooks/list`can resolve plugin state for each
requested `cwd`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-29 23:39:57 +00:00
iceweasel-oai
13dbcda28f stop blocking unified_exec on Windows (#19435)
## Summary
- remove the Windows-specific unified-exec environment block from tool
selection
- keep `unified_exec` default-off on Windows unless the feature is
explicitly enabled
- normalize model-provided `shell_type = unified_exec` to
`shell_command` when the feature is disabled
- drop obsolete tests tied to the removed environment gate and keep the
feature-flag regression coverage

## Why
Now that the session/long-lived process backend is implemented for the
Windows sandbox, we don't need to hard disable it anymore. We will be
rolling out slowly using a feature gate.

## Impact
This allows manual Windows opt-in in CLI and app-backed flows while
preserving the existing default-off behavior for Windows users.

---------

Co-authored-by: canvrno-oai <kbond@openai.com>
Co-authored-by: Codex <noreply@openai.com>
2026-04-29 16:06:33 -07:00
pakrym-oai
8de2a7a16d Add codex-core public API listing (#20243)
Summary:
- Add a checked-in codex-core public API listing generated by
cargo-public-api.
- Add scripts/regen-public-api.sh with an embedded crate list,
auto-install for cargo-public-api 0.51.0, pinned nightly, and --check
mode.
- Add Rust CI jobs on the codex Linux x64 runner pool to verify the
listing stays up to date.

Testing:
- bash -n scripts/regen-public-api.sh
- just regen-public-api --check
- yq '.' .github/workflows/rust-ci.yml
.github/workflows/rust-ci-full.yml
- git diff --check
2026-04-29 22:58:08 +00:00
Matthew Zeng
e20391e567 [mcp] Fix plugin MCP approval policy. (#19537)
Plugin MCP servers are loaded from plugin manifests rather than
top-level `[mcp_servers]`, so their tool approval preferences need to be
stored and applied through the owning plugin config. Without this,
choosing "Always allow" for a plugin MCP tool could write a preference
that was not reliably used on later tool calls.

## Summary
- Add plugin-scoped MCP policy config under
`plugins.<plugin>.mcp_servers`, including server enablement, tool
allow/deny lists, server defaults, and per-tool approval modes.
- Overlay plugin MCP policy onto manifest-provided server configs when
plugins are loaded.
- Route persistent "Always allow" writes for plugin MCP tools back to
the owning `plugins.<plugin>.mcp_servers.<server>.tools.<tool>` config
entry.
- Reload user config after persisting an approval and make the plugin
load cache config-aware so stale plugin MCP policy is not reused after
`config.toml` changes.
- Regenerate the config schema and add coverage for plugin MCP policy
loading, approval lookup, persistence, and stale-cache prevention.

## Testing
- `cargo test -p codex-config`
- `cargo test -p codex-core-plugins`
- `cargo test -p codex-core --lib plugin_mcp`
2026-04-29 15:40:03 -07:00
Eric Traut
4241df4d79 Escape turn metadata headers as ASCII JSON (#19620)
## Why

`x-codex-turn-metadata` is sent as an HTTP/WebSocket header, but Codex
was serializing the metadata JSON with raw UTF-8 string contents. When a
workspace path contains non-ASCII characters, common HTTP stacks can
reject or corrupt that header before the request reaches the provider.

Fixes #17468. Also addresses the duplicate WebSocket report in #19581.

## What changed

- Added `codex_utils_string::to_ascii_json_string`, a shared helper that
serializes JSON normally while escaping non-ASCII string content as
`\uXXXX`.
- Switched turn metadata header serialization, including merged
Responses API client metadata, to use the ASCII-safe JSON helper.
- Added coverage for non-ASCII workspace paths and non-ASCII client
metadata while preserving the same parsed JSON values.

## Verification

- `cargo test -p codex-utils-string`
- `cargo test -p codex-core turn_metadata`
- `just bazel-lock-check`
2026-04-29 15:35:33 -07:00
Alex Daley
f63b19bedd [apps] Add apps MCP path override (#20231)
Summary

- Add `[features.apps_mcp_path_override]` config with a `path` field for
overriding only the built-in apps MCP path.
- Keep existing host/base URL derivation unchanged and append the
configured path after that base.
- Regenerate the config schema with the custom feature-config case.

Test Plan

- Not run for latest revision; only `just fmt` and `just
write-config-schema` were run.
- Earlier revision: `cargo test -p codex-features`
- Earlier revision: `cargo test -p codex-mcp`
2026-04-29 18:08:06 -04:00
Matthew Zeng
8ce48f9968 [tool_suggest] Improve tool_suggest triggering conditions. (#20091)
## Summary
- Tighten `tool_suggest` guidance so it prefers explicit plugin install
requests, while still allowing a connector install when the relevant
plugin is already installed and a needed connector from that plugin is
missing.
- Tell the model not to call `tool_suggest` in parallel with other
tools.

## Testing
- `cargo test -p codex-tools tool_suggest`
- `cargo test -p codex-core tool_suggest`
2026-04-29 13:41:12 -07:00
viyatb-oai
07c8b8c77c fix: handle deferred network proxy denials (#19184)
## Why

This bug is exposed by Guardian/auto-review approvals. With the managed
network proxy enabled, a blocked network request can be reported back
through the network approval service as an approval denial after the
command has already started. Before this change, the shell and unified
exec runtimes registered those network approval calls, but did not have
a way to observe an async proxy denial as a cancellation/failure signal
for the running process.

The result was confusing: Guardian/auto-review could correctly deny
network access, but the command path could keep running or unregister
the approval without surfacing the denial as the command failure.

## What Changed

- `NetworkApprovalService` now attaches a cancellation token to active
and deferred network approvals.
- Proxy-denial outcomes are recorded only for active registrations,
cancel the owning token, and are consumed when the approval is
finalized.
- The shell runtime combines the normal command timeout with the
network-denial cancellation token.
- Unified exec stores the deferred network approval object, terminates
tracked processes when the proxy denial arrives, and returns the denial
as a process failure while polling or completing the process.
- Tool orchestration passes the active network approval cancellation
token into the sandbox attempt and preserves deferred approval errors
instead of silently unregistering them.
- App-server `command/exec` now handles the combined
timeout-or-cancellation expiration variant used by the runtime.

## Verification

- `cargo test -p codex-core network_approval --lib`
- `cargo clippy -p codex-app-server --all-targets -- -D warnings`
- `cargo clippy -p codex-core --all-targets -- -D warnings`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-29 19:13:57 +00:00
xl-openai
73cd831952 feat: Use remote installed plugin cache for skills and MCP (#20096)
- Fetches and caches remote /installed plugin state
- Lets skills/list load skills from remote-installed cached plugins
without requiring a local marketplace entry
- Routes plugin list/startup/install/uninstall changes through async
plugin cache invalidation and MCP refresh
2026-04-29 12:09:49 -07:00
Won Park
5cf0adba93 Include auto-review rollout in feedback uploads (#20064)
## Summary

- include the live auto-review trunk rollout when `/feedback` uploads
logs
- upload that attachment as
`auto-review-rollout-<parent-thread-id>.jsonl` so it is distinguishable
from the parent rollout
- show the same auto-review attachment name in the TUI consent popup

## Scope

- this only covers the live cached auto-review trunk for the current
parent thread
- it does not add durable historical parent->auto-review lookup
- it does not add persisted rollout support for ephemeral parallel
review forks

## UI 

<img width="599" height="185" alt="Screenshot 2026-04-28 at 1 17 18 PM"
src="https://github.com/user-attachments/assets/6a0e79c2-5d21-4702-8a89-f765778bc9e9"
/>

## Validation

- `cargo test -p codex-core
cached_guardian_subagent_exposes_its_rollout_path`
- `cargo test -p codex-feedback`
- `cargo test -p codex-app-server`
- `cargo test -p codex-tui feedback_upload_consent_popup_snapshot`
- `cargo test -p codex-tui
feedback_good_result_consent_popup_includes_connectivity_diagnostics_filename`

## Known unrelated local failures

- `cargo test -p codex-core` currently fails in the pre-existing proxy
env snapshot test
`tools::runtimes::tests::maybe_wrap_shell_lc_with_snapshot_keeps_user_proxy_env_when_proxy_inactive`
- `cargo test -p codex-tui` currently hits pre-existing `status::*`
snapshot drift unrelated to this change

## Follow-Up 
- persist parallel auto-review fork sessions so /feedback can include
their rollout history too
- attach each persisted fork as its own clearly named file, for example
auto-review-rollout-<parent-thread-id>-fork <n>.jsonl, instead of
merging multiple Guardian sessions into one attachment
- keep the same live-session-only scope initially; durable historical
parent -> auto-review lookup can remain a separate decision if we later
need feedback from resumed sessions
2026-04-29 11:44:55 -07:00
pakrym-oai
8356806fc9 Add ThreadManager sample crate (#20141)
Summary:
- Add codex-thread-manager-sample, a one-shot binary that starts a
ThreadManager thread, submits a prompt, and prints the final assistant
output.
- Pass ThreadStore into ThreadManager::new and expose
thread_store_from_config for existing callsites.
- Build the sample Config directly with only --model and prompt inputs.

Verification:
- just fmt
- cargo check -p codex-thread-manager-sample -p codex-app-server -p
codex-mcp-server
- git diff --check

Tests: Not run per request.
2026-04-29 11:21:06 -07:00
jif-oai
70ac0f123c Make multi-agent v2 ignore agents.max_depth (#20180)
## Why

`agents.max_depth` is a legacy multi-agent v1 guard. Multi-agent v2 uses
task-path routing and its own session/thread limits, so v2 should not
reject nested `spawn_agent` calls just because the thread-spawn depth
has reached the v1 maximum.

Keeping the v1 depth guard active in v2 prevents deeper task trees even
though the v2 path still needs the depth value only for lineage and
task-path metadata.

## What Changed

- Removed the depth-limit rejection from the multi-agent v2
`spawn_agent` handler while still computing child depth for lineage/path
metadata.
- Made the depth-based disabling of legacy `SpawnCsv`/`Collab` tools
apply only when `Feature::MultiAgentV2` is disabled.
- Added `multi_agent_v2_spawn_agent_ignores_configured_max_depth` to
cover a v2 child spawning another agent when `agent_max_depth = 1`,
while the existing v1 depth-limit tests continue to enforce the legacy
behavior.

## Verification

- `cargo test -p codex-core
multi_agent_v2_spawn_agent_ignores_configured_max_depth -- --nocapture`
- `cargo test -p codex-core depth_limit -- --nocapture`
- `cargo test -p codex-core tools::handlers::multi_agents::tests --
--nocapture`
2026-04-29 12:23:00 +02:00