Commit Graph

6009 Commits

Author SHA1 Message Date
Anton Panasenko
191dc00a71 feat: add remote control enrollment read 2026-04-29 16:07:30 -07:00
pakrym-oai
8de2a7a16d Add codex-core public API listing (#20243)
Summary:
- Add a checked-in codex-core public API listing generated by
cargo-public-api.
- Add scripts/regen-public-api.sh with an embedded crate list,
auto-install for cargo-public-api 0.51.0, pinned nightly, and --check
mode.
- Add Rust CI jobs on the codex Linux x64 runner pool to verify the
listing stays up to date.

Testing:
- bash -n scripts/regen-public-api.sh
- just regen-public-api --check
- yq '.' .github/workflows/rust-ci.yml
.github/workflows/rust-ci-full.yml
- git diff --check
2026-04-29 22:58:08 +00:00
Rasmus Rygaard
782191547c Add agent graph store interface (#19229)
## Summary

Persisted subagent parent/child topology currently leaks through
`StateRuntime`'s SQLite-specific thread-spawn helpers. This PR
introduces a narrow `AgentGraphStore` boundary so follow-up work can
route graph operations through a local or remote store without coupling
orchestration code directly to the state DB graph API.

## Changes

- Adds the new `codex-agent-graph-store` crate.
- Defines a flat `AgentGraphStore` trait for the v1 graph surface:
upsert edge, set edge status, list direct children, and list
descendants.
- Adds public graph types for `ThreadSpawnEdgeStatus`,
`AgentGraphStoreError`, and `AgentGraphStoreResult`.
- Implements `LocalAgentGraphStore` on top of an existing
`codex_state::StateRuntime`, preserving today's SQLite-backed
`thread_spawn_edges` behavior.
- Registers the crate in Cargo/Bazel metadata.

This PR only adds the local contract and implementation; call-site
migration and the remote gRPC store are left to the follow-up PRs in the
stack.
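
As a rough illustration of the flat graph surface listed above, the sketch below uses an in-memory map in place of the SQLite-backed store. The trait name `GraphStore`, the method names, and the string thread ids are assumptions made for this example; the real `AgentGraphStore` signatures are not shown in this message.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Status of a parent -> child spawn edge (mirrors the `Open`/`Closed` filtering).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum EdgeStatus {
    Open,
    Closed,
}

/// Hypothetical flat graph surface; names are illustrative only.
pub trait GraphStore {
    fn upsert_edge(&mut self, parent: &str, child: &str, status: EdgeStatus);
    fn set_edge_status(&mut self, child: &str, status: EdgeStatus);
    fn list_children(&self, parent: &str, filter: Option<EdgeStatus>) -> Vec<String>;
    fn list_descendants(&self, root: &str, filter: Option<EdgeStatus>) -> Vec<String>;
}

/// In-memory stand-in for the local store.
#[derive(Default)]
pub struct InMemoryGraphStore {
    // child -> (parent, status); each child has at most one parent.
    edges: BTreeMap<String, (String, EdgeStatus)>,
}

impl GraphStore for InMemoryGraphStore {
    fn upsert_edge(&mut self, parent: &str, child: &str, status: EdgeStatus) {
        self.edges.insert(child.to_string(), (parent.to_string(), status));
    }

    fn set_edge_status(&mut self, child: &str, status: EdgeStatus) {
        if let Some(edge) = self.edges.get_mut(child) {
            edge.1 = status;
        }
    }

    fn list_children(&self, parent: &str, filter: Option<EdgeStatus>) -> Vec<String> {
        // BTreeMap iteration is sorted by child id, so the order is stable.
        self.edges
            .iter()
            .filter(|(_, (p, s))| p.as_str() == parent && filter.map_or(true, |f| f == *s))
            .map(|(child, _)| child.clone())
            .collect()
    }

    fn list_descendants(&self, root: &str, filter: Option<EdgeStatus>) -> Vec<String> {
        // Breadth-first walk: direct children first, then grandchildren, and so on.
        let mut out = Vec::new();
        let mut seen = BTreeSet::from([root.to_string()]);
        let mut queue = VecDeque::from([root.to_string()]);
        while let Some(node) = queue.pop_front() {
            for child in self.list_children(&node, filter) {
                // Skip already-visited nodes so a malformed cycle cannot loop forever.
                if seen.insert(child.clone()) {
                    out.push(child.clone());
                    queue.push_back(child);
                }
            }
        }
        out
    }
}
```

Keeping the trait flat (four methods, no graph object) is what would let a remote store implement the same contract later without exposing storage details.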

## Testing

- `cargo test -p codex-agent-graph-store`

The new unit tests cover local parity with the existing `StateRuntime`
graph methods, `Open`/`Closed` filtering, status updates, and stable
breadth-first descendant ordering.
2026-04-29 22:48:26 +00:00
Matthew Zeng
e20391e567 [mcp] Fix plugin MCP approval policy. (#19537)
Plugin MCP servers are loaded from plugin manifests rather than
top-level `[mcp_servers]`, so their tool approval preferences need to be
stored and applied through the owning plugin config. Without this,
choosing "Always allow" for a plugin MCP tool could write a preference
that was not reliably used on later tool calls.

## Summary
- Add plugin-scoped MCP policy config under
`plugins.<plugin>.mcp_servers`, including server enablement, tool
allow/deny lists, server defaults, and per-tool approval modes.
- Overlay plugin MCP policy onto manifest-provided server configs when
plugins are loaded.
- Route persistent "Always allow" writes for plugin MCP tools back to
the owning `plugins.<plugin>.mcp_servers.<server>.tools.<tool>` config
entry.
- Reload user config after persisting an approval and make the plugin
load cache config-aware so stale plugin MCP policy is not reused after
`config.toml` changes.
- Regenerate the config schema and add coverage for plugin MCP policy
loading, approval lookup, persistence, and stale-cache prevention.

## Testing
- `cargo test -p codex-config`
- `cargo test -p codex-core-plugins`
- `cargo test -p codex-core --lib plugin_mcp`
2026-04-29 15:40:03 -07:00
Eric Traut
4241df4d79 Escape turn metadata headers as ASCII JSON (#19620)
## Why

`x-codex-turn-metadata` is sent as an HTTP/WebSocket header, but Codex
was serializing the metadata JSON with raw UTF-8 string contents. When a
workspace path contains non-ASCII characters, common HTTP stacks can
reject or corrupt that header before the request reaches the provider.

Fixes #17468. Also addresses the duplicate WebSocket report in #19581.

## What changed

- Added `codex_utils_string::to_ascii_json_string`, a shared helper that
serializes JSON normally while escaping non-ASCII string content as
`\uXXXX`.
- Switched turn metadata header serialization, including merged
Responses API client metadata, to use the ASCII-safe JSON helper.
- Added coverage for non-ASCII workspace paths and non-ASCII client
metadata while preserving the same parsed JSON values.
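
A minimal std-only sketch of the escaping idea follows. It is not the actual `to_ascii_json_string` helper, whose signature is not shown here; the function name `escape_non_ascii_json` is hypothetical, and it assumes the input is already valid serialized JSON, where non-ASCII characters can only occur inside string literals.

```rust
/// Escape every non-ASCII character in already-serialized JSON text as `\uXXXX`.
/// Because non-ASCII characters only occur inside JSON string literals,
/// the parsed value is unchanged.
pub fn escape_non_ascii_json(json: &str) -> String {
    let mut out = String::with_capacity(json.len());
    for ch in json.chars() {
        if ch.is_ascii() {
            out.push(ch);
        } else {
            // Characters outside the BMP become a UTF-16 surrogate pair.
            let mut units = [0u16; 2];
            for unit in ch.encode_utf16(&mut units).iter() {
                out.push_str(&format!("\\u{:04x}", unit));
            }
        }
    }
    out
}
```

The result contains only ASCII bytes, so it should be safe to place in a header value that rejects raw UTF-8.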

## Verification

- `cargo test -p codex-utils-string`
- `cargo test -p codex-core turn_metadata`
- `just bazel-lock-check`
2026-04-29 15:35:33 -07:00
Michael Bolin
b1546008fc docs: discourage #[async_trait] and #[allow(async_fn_in_trait)] (#20242)
## Why

We have run into two avoidable problems when introducing async trait
APIs in Rust:

- `#[async_trait]` has caused materially worse build times in this
repository.
- `#[allow(async_fn_in_trait)]` makes it too easy to ship a public trait
without spelling out whether the returned future is `Send`, which hides
an important part of the trait contract.

We already have a good example of the preferred alternative in
[#16630](https://github.com/openai/codex/pull/16630) /
[`3c7f013f9735`](https://github.com/openai/codex/commit/3c7f013f9735),
but that guidance currently lives only as prior art in the codebase.
This PR documents the rule in `AGENTS.md` so contributors are more
likely to follow the native RPITIT pattern before these two shortcuts
spread further.

## What Changed

- added Rust guidance in `AGENTS.md` discouraging both `#[async_trait]`
and `#[allow(async_fn_in_trait)]`
- pointed contributors to the native RPITIT pattern with explicit `Send`
bounds on the returned future
- clarified that implementations may still use `async fn` when they
satisfy that trait contract
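
For illustration, the pattern described above might look like the following sketch (requires Rust 1.75 or later). `Greeter`, `English`, `require_send`, and the tiny `block_on` executor are hypothetical names written for this example; they are not taken from `AGENTS.md` or the referenced commit.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Trait in the native RPITIT style: the `Send` bound on the returned
/// future is spelled out as part of the public contract.
pub trait Greeter {
    fn greet(&self, name: &str) -> impl Future<Output = String> + Send;
}

pub struct English;

impl Greeter for English {
    // An implementation may still use `async fn`, provided the resulting
    // future satisfies the `Send` bound declared by the trait.
    async fn greet(&self, name: &str) -> String {
        format!("hello, {name}")
    }
}

/// Compiles only when `T` is `Send`; used to check the contract.
pub fn require_send<T: Send>(value: T) -> T {
    value
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor so the example needs no async runtime.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        std::thread::yield_now();
    }
}
```

Calling `block_on(require_send(English.greet("codex")))` compiles only because the trait promises a `Send` future, which is the part of the contract that `#[allow(async_fn_in_trait)]` would leave unstated.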

## Verification

- docs-only change; no tests run
2026-04-29 15:29:29 -07:00
Alex Daley
f63b19bedd [apps] Add apps MCP path override (#20231)
Summary

- Add `[features.apps_mcp_path_override]` config with a `path` field for
overriding only the built-in apps MCP path.
- Keep existing host/base URL derivation unchanged and append the
configured path after that base.
- Regenerate the config schema with the custom feature-config case.

Test Plan

- Not run for latest revision; only `just fmt` and `just
write-config-schema` were run.
- Earlier revision: `cargo test -p codex-features`
- Earlier revision: `cargo test -p codex-mcp`
2026-04-29 18:08:06 -04:00
xli-oai
8d5da3ffe5 Fallback login callback port when default is busy (#19334)
## Summary
- Keep the preferred ChatGPT login callback port `1455` first.
- Preserve the existing `/cancel` recovery for stale Codex login
servers.
- Fall back to the registered localhost callback port `1457` when `1455`
remains unavailable.

## Why
Cursor and Codex Desktop both use the ChatGPT account login callback
server. On Windows, Cursor can already be listening on `127.0.0.1:1455`
/ `[::1]:1455`, causing Codex Desktop sign-in to fail with:

`Local callback port 1455 is already in use on this machine.`

Codex already attempted to cancel a stale Codex login server on that
port, but if the listener does not release the port, the old behavior
was to fail. The new behavior falls back to `1457`, which matches the
fixed redirect URI being registered server-side in
`openai/openai#863817`. This keeps the OAuth `redirect_uri` inside
Hydra's exact allow-list instead of choosing an arbitrary ephemeral
port.
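
The fallback order can be sketched as below. This is a simplified illustration, not the `codex-login` implementation: the function name `bind_first_available` is hypothetical, it binds IPv4 localhost only, and it omits the `/cancel` recovery step.

```rust
use std::io;
use std::net::{Ipv4Addr, TcpListener};

/// Preferred port first, registered fallback second (values from the summary).
pub const CALLBACK_PORTS: [u16; 2] = [1455, 1457];

/// Bind the first port in `ports` that is free on localhost.
/// Returns the last bind error if every candidate is busy.
pub fn bind_first_available(ports: &[u16]) -> io::Result<TcpListener> {
    let mut last_err = io::Error::new(io::ErrorKind::InvalidInput, "no candidate ports");
    for &port in ports {
        match TcpListener::bind((Ipv4Addr::LOCALHOST, port)) {
            Ok(listener) => return Ok(listener),
            Err(err) => last_err = err,
        }
    }
    Err(last_err)
}
```

A caller would pass `&CALLBACK_PORTS`; restricting the candidates to a fixed list is what keeps the resulting `redirect_uri` inside an exact allow-list.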

## Validation
- `just fmt`
- `cargo test -p codex-login`
- `git diff --check HEAD~1..HEAD`
2026-04-29 14:45:27 -07:00
rhan-oai
72a39e3a96 [app-server] centralize client response analytics (#20059)
## Why

The precursor PR keeps successful client responses typed until
app-server's outgoing response seam. This follow-up uses that seam to
move successful client-response analytics out of individual handlers and
into the shared sender path, while keeping filtering decisions inside
`codex-analytics`.

## What changed

- Emit successful client-response analytics centrally from
`OutgoingMessageSender::send_response`.
- Remove duplicate handler-local response tracking for the current
thread/turn lifecycle responses.
- Keep analytics ingestion selective inside `AnalyticsEventsClient`, so
unrelated client traffic is ignored before cloning or boxing.
- Collapse client-response analytics facts onto one typed path and
normalize payloads in the reducer.
- Add direct client-filter coverage plus sender-level coverage for the
centralized forwarding path.

## Verification

- `cargo test -p codex-analytics`
- `cargo test -p codex-app-server outgoing_message::tests --lib`
2026-04-29 21:22:39 +00:00
xli-oai
afbddabc8b Require remote plugin detail before uninstall (#19966)
## Summary
- Fetch remote plugin detail before sending the uninstall request.
- Use the detail response to derive the marketplace namespace and plugin
name for cache cleanup.
- Stop the uninstall before the backend POST if detail lookup fails, so
backend state and local cache state do not diverge.

## Testing
- `just fmt`
- `cargo test -p codex-app-server plugin_uninstall`
- `cargo test -p codex-core-plugins`
- `git diff --check`
2026-04-29 14:01:11 -07:00
rhan-oai
973c5c823e [app-server] type client response payloads (#20050)
## Why

`pr17088` adds typed server-originated request/response plumbing, but
successful client responses are still erased into bare JSON-RPC `result`
values before app-server can make any typed decision about them.

This precursor PR keeps successful client responses typed until the
outgoing response seam. It is intentionally limited to
protocol/app-server plumbing so the analytics behavior change can be
reviewed separately on top.

## What changed

- Add `ClientResponsePayload` as the pre-serialization client response
body type.
- Route app-server successful response paths through the typed payload
seam while preserving existing handler-local analytics behavior.
- Keep `InterruptConversation` JSON-RPC-only because it has no
`ClientResponse` variant.
- Move the new payload conversion tests into a dedicated protocol test
module.

## Verification

- `cargo check -p codex-app-server`
- `cargo test -p codex-app-server-protocol`
2026-04-29 20:50:47 +00:00
sayan-oai
b15074d0a4 app-server: fix outgoing sender test setup (#20258)
## Why

[#17088](https://github.com/openai/codex/pull/17088) changed
`OutgoingMessageSender::new` to require an `AnalyticsEventsClient`, but
one `command_exec` test added earlier on `main` still called the old
one-argument constructor. That leaves current `main` failing to compile
in Bazel and argument-comment-lint jobs.

## What changed

- Pass `AnalyticsEventsClient::disabled()` to the missed
`OutgoingMessageSender::new` test call site in `command_exec.rs`.

## Verification

- `cargo test -p codex-app-server
timeout_or_cancellation_reports_cancellation_without_timeout_exit_code`
2026-04-29 20:47:20 +00:00
Matthew Zeng
8ce48f9968 [tool_suggest] Improve tool_suggest triggering conditions. (#20091)
## Summary
- Tighten `tool_suggest` guidance so it prefers explicit plugin install
requests, while still allowing a connector install when the relevant
plugin is already installed and a needed connector from that plugin is
missing.
- Tell the model not to call `tool_suggest` in parallel with other
tools.

## Testing
- `cargo test -p codex-tools tool_suggest`
- `cargo test -p codex-core tool_suggest`
2026-04-29 13:41:12 -07:00
rhan-oai
0690ab0842 [codex-analytics] ingest server requests and responses (#17088)
## Why

Codex analytics needs a typed seam for app-server-originated
request/response traffic so future tool-approval analytics can consume
those facts without adding bespoke callsite tracking each time. Server
responses arrive as JSON-RPC `id + result` payloads, so analytics has to
reconstruct the matching typed response from the original typed request
while that request context still exists in app-server.

This also puts analytics on the app-server outbound path, which needs to
avoid keeping the runtime alive during shutdown. The final ownership fix
keeps the normal strong auth-manager retention in analytics and makes
the external-auth refresh bridge hold a weak back-reference to
`OutgoingMessageSender`, breaking the runtime cycle at the bridge
boundary instead of exposing retention policy through the analytics
client API.

## What changed

- Adds typed `ServerRequest` and `ServerResponse` analytics facts, plus
`AnalyticsEventsClient::track_server_request` and
`track_server_response`.
- Renames the existing client-side facts to `ClientRequest` and
`ClientResponse` so reducers can distinguish client-to-server traffic
from server-to-client traffic.
- Adds `ServerRequest::response_from_result`, allowing a stored typed
request to decode the matching typed server response from a raw JSON-RPC
result payload.
- Threads `AnalyticsEventsClient` through `OutgoingMessageSender` and
records targeted server requests, replayed targeted requests, and
matching targeted responses with the responding connection id needed for
correlation.
- Intentionally leaves broadcast server requests/responses out of
analytics for now because the current model is per connection, while
broadcasts fan one logical request out across multiple connections.
- Breaks the app-server shutdown cycle by storing
`Weak<OutgoingMessageSender>` in `ExternalAuthRefreshBridge` and
upgrading it only when an external-auth refresh is actually requested.
- Keeps reducer ingestion of the new server-side facts as no-ops for
now; this PR is plumbing for later tool-approval analytics work.
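
The weak back-reference described above can be illustrated with a small std-only sketch. `Sender` and `Bridge` are simplified stand-ins for `OutgoingMessageSender` and `ExternalAuthRefreshBridge`, and the ownership layout (the sender holding the bridge directly) is an assumption made to keep the cycle visible in a few lines.

```rust
use std::sync::{Arc, Mutex, Weak};

/// Stand-in for the sender; the real type carries much more state.
pub struct Sender {
    pub sent: Mutex<Vec<String>>,
    pub bridge: Mutex<Option<Arc<Bridge>>>,
}

/// Stand-in for the refresh bridge: it holds only a weak back-reference,
/// so it never keeps the sender alive on its own.
pub struct Bridge {
    sender: Weak<Sender>,
}

impl Bridge {
    /// Upgrade only when a refresh is requested; returns false once the
    /// sender has been dropped.
    pub fn request_refresh(&self) -> bool {
        match self.sender.upgrade() {
            Some(sender) => {
                sender.sent.lock().unwrap().push("refresh".to_string());
                true
            }
            None => false,
        }
    }
}

/// Wire the two together: sender -> bridge is strong, bridge -> sender is weak.
pub fn wire() -> (Arc<Sender>, Arc<Bridge>) {
    let sender = Arc::new(Sender {
        sent: Mutex::new(Vec::new()),
        bridge: Mutex::new(None),
    });
    let bridge = Arc::new(Bridge {
        sender: Arc::downgrade(&sender),
    });
    *sender.bridge.lock().unwrap() = Some(Arc::clone(&bridge));
    (sender, bridge)
}
```

With a strong `Arc` in both directions, neither value would ever be freed; with the weak link, dropping the last external handle to the sender releases it.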

## Verification

- `cargo test -p codex-analytics`
- `cargo test -p codex-app-server outgoing_message::tests::`
- Covers typed-response reconstruction plus the targeted, replayed,
broadcast-exclusion, and response-attribution analytics paths.

## Follow-up

This PR intentionally stops at ingestion plumbing, so `ServerRequest`
and `ServerResponse` facts are still reducer no-ops. Once a follow-up PR
adds real downstream analytics output for those facts:

- replace the temporary pre-reducer observation seam with reducer tests
for the emitted event shape;
- add end-to-end coverage in `app-server/tests/suite/v2/analytics.rs`
for the real app-server workflow and captured analytics payload;
- remove the temporary sender-level observer tests added here in favor
of the real-output coverage above.

---

[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17088).
* #18748
* #18747
* #17090
* #17089
* #20241
* #20239
* __->__ #17088
2026-04-29 19:56:41 +00:00
iceweasel-oai
9d1e5df4b2 expand the set of core shell env vars for Windows. (#20089)
https://github.com/openai/codex/issues/13917 and
https://github.com/openai/codex/issues/18248 correctly identify that

```
[shell_environment_policy]
inherit = "core"
```
does not work on Windows because it carries too few env vars.
This PR expands that set to match the more functional set from the MCP
client.
2026-04-29 19:23:46 +00:00
viyatb-oai
07c8b8c77c fix: handle deferred network proxy denials (#19184)
## Why

This bug is exposed by Guardian/auto-review approvals. With the managed
network proxy enabled, a blocked network request can be reported back
through the network approval service as an approval denial after the
command has already started. Before this change, the shell and unified
exec runtimes registered those network approval calls, but did not have
a way to observe an async proxy denial as a cancellation/failure signal
for the running process.

The result was confusing: Guardian/auto-review could correctly deny
network access, but the command path could keep running or unregister
the approval without surfacing the denial as the command failure.

## What Changed

- `NetworkApprovalService` now attaches a cancellation token to active
and deferred network approvals.
- Proxy-denial outcomes are recorded only for active registrations,
cancel the owning token, and are consumed when the approval is
finalized.
- The shell runtime combines the normal command timeout with the
network-denial cancellation token.
- Unified exec stores the deferred network approval object, terminates
tracked processes when the proxy denial arrives, and returns the denial
as a process failure while polling or completing the process.
- Tool orchestration passes the active network approval cancellation
token into the sandbox attempt and preserves deferred approval errors
instead of silently unregistering them.
- App-server `command/exec` now handles the combined
timeout-or-cancellation expiration variant used by the runtime.

## Verification

- `cargo test -p codex-core network_approval --lib`
- `cargo clippy -p codex-app-server --all-targets -- -D warnings`
- `cargo clippy -p codex-core --all-targets -- -D warnings`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-29 19:13:57 +00:00
xl-openai
73cd831952 feat: Use remote installed plugin cache for skills and MCP (#20096)
- Fetches and caches remote /installed plugin state
- Lets skills/list load skills from remote-installed cached plugins
without requiring a local marketplace entry
- Routes plugin list/startup/install/uninstall changes through async
plugin cache invalidation and MCP refresh
2026-04-29 12:09:49 -07:00
Won Park
5cf0adba93 Include auto-review rollout in feedback uploads (#20064)
## Summary

- include the live auto-review trunk rollout when `/feedback` uploads
logs
- upload that attachment as
`auto-review-rollout-<parent-thread-id>.jsonl` so it is distinguishable
from the parent rollout
- show the same auto-review attachment name in the TUI consent popup

## Scope

- this only covers the live cached auto-review trunk for the current
parent thread
- it does not add durable historical parent->auto-review lookup
- it does not add persisted rollout support for ephemeral parallel
review forks

## UI 

<img width="599" height="185" alt="Screenshot 2026-04-28 at 1 17 18 PM"
src="https://github.com/user-attachments/assets/6a0e79c2-5d21-4702-8a89-f765778bc9e9"
/>

## Validation

- `cargo test -p codex-core
cached_guardian_subagent_exposes_its_rollout_path`
- `cargo test -p codex-feedback`
- `cargo test -p codex-app-server`
- `cargo test -p codex-tui feedback_upload_consent_popup_snapshot`
- `cargo test -p codex-tui
feedback_good_result_consent_popup_includes_connectivity_diagnostics_filename`

## Known unrelated local failures

- `cargo test -p codex-core` currently fails in the pre-existing proxy
env snapshot test
`tools::runtimes::tests::maybe_wrap_shell_lc_with_snapshot_keeps_user_proxy_env_when_proxy_inactive`
- `cargo test -p codex-tui` currently hits pre-existing `status::*`
snapshot drift unrelated to this change

## Follow-Up 
- persist parallel auto-review fork sessions so /feedback can include
their rollout history too
- attach each persisted fork as its own clearly named file, for example
auto-review-rollout-<parent-thread-id>-fork <n>.jsonl, instead of
merging multiple Guardian sessions into one attachment
- keep the same live-session-only scope initially; durable historical
parent -> auto-review lookup can remain a separate decision if we later
need feedback from resumed sessions
2026-04-29 11:44:55 -07:00
friel-openai
05fd904572 test protocol: lock inter-agent commentary phase (#20046)
## Summary
- add a regression test for
`InterAgentCommunication::to_response_input_item`
- assert replayed inter-agent messages keep `phase:
Some(MessagePhase::Commentary)`

## Test plan
- `cargo test -p codex-protocol`
- `just argument-comment-lint`
2026-04-29 11:24:17 -07:00
pakrym-oai
8356806fc9 Add ThreadManager sample crate (#20141)
Summary:
- Add codex-thread-manager-sample, a one-shot binary that starts a
ThreadManager thread, submits a prompt, and prints the final assistant
output.
- Pass ThreadStore into ThreadManager::new and expose
thread_store_from_config for existing callsites.
- Build the sample Config directly with only --model and prompt inputs.

Verification:
- just fmt
- cargo check -p codex-thread-manager-sample -p codex-app-server -p
codex-mcp-server
- git diff --check

Tests: Not run per request.
2026-04-29 11:21:06 -07:00
joeytrasatti-openai
47fba5df4a [codex-backend] Prefer sqlite git info for rollout-path reads (#20228)
### Summary

- Path-based local thread reads currently return rollout/session git
metadata directly, so `thread/resume` can disagree with persisted SQLite
metadata for the same thread.
- Merge non-null SQLite git fields over rollout-path reads while keeping
rollout values as fallbacks for fields SQLite does not know.
- Add focused regression coverage for rollout-path reads so persisted
branch updates are preserved during resume.

### Testing

- `cargo test -p codex-thread-store`
2026-04-29 17:54:37 +00:00
Eric Traut
d0204c3dcc TUI: Remove core protocol dependency [3/7] (#20174)
## Why

This is part 3 of a 7-PR stack to remove direct
`codex_protocol::protocol` usage from `codex-tui` while keeping each
layer reviewable and shippable.

With `AppCommand` now explicit, the internal app event bus can carry TUI
commands directly instead of bouncing through core `Op` values.

## What changed

- Changed `AppEvent::CodexOp` and `AppEvent::SubmitThreadOp` to carry
`AppCommand`.
- Updated app-event senders and direct emitters to submit `AppCommand`
values.
- Adjusted tests to match `AppCommand` or convert back through
`into_core()` where they intentionally assert legacy payload equality.

## Verification

- `cargo test -p codex-tui --no-run`
2026-04-29 10:52:10 -07:00
Eric Traut
445629815c TUI: Remove core protocol dependency [2/7] (#20173)
## Why

This is part 2 of a 7-PR stack to remove direct
`codex_protocol::protocol` usage from `codex-tui` while keeping each
layer reviewable and shippable.

Before the TUI event bus can stop carrying core `Op` values,
`AppCommand` needs to be an owned TUI command shape rather than a thin
wrapper around `Op`.

## What changed

- Replaced the opaque `AppCommand(Op)` wrapper with explicit owned
variants for the commands the TUI submits.
- Preserved `into_core()` so this layer does not yet change the
app/thread submission boundary.
- Kept existing core leaf types for now so this remains a mechanical
command-shape refactor.

## Verification

- `cargo check -p codex-tui`
2026-04-29 10:28:04 -07:00
cassirer-openai
df966996a7 [rollout-tracer] Match analysis messages on encrypted id. (#20123)
In some setups the summary or raw content can be dropped between
requests. This trips a reducer check that expects messages to remain
identical between requests.

This PR relaxes the check to compare only the encrypted ID. It also
changes the reducer to keep the richest version of the message observed
during the rollout, so that we don't accidentally lose the CoT or the
summary when available.
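
One possible reading of this merge rule is sketched below: messages are keyed by encrypted id, and a field that is already present is never overwritten by a later, poorer observation. The `AnalysisMessage` struct, its fields, and the `observe` function are hypothetical; the actual reducer may define richness differently.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct AnalysisMessage {
    pub encrypted_id: String,
    pub summary: Option<String>,
    pub raw_content: Option<String>,
}

/// Merge an observation into the stored message with the same encrypted id,
/// keeping any field that either version carries.
pub fn observe(seen: &mut HashMap<String, AnalysisMessage>, msg: AnalysisMessage) {
    match seen.entry(msg.encrypted_id.clone()) {
        Entry::Occupied(mut slot) => {
            let stored = slot.get_mut();
            if stored.summary.is_none() {
                stored.summary = msg.summary;
            }
            if stored.raw_content.is_none() {
                stored.raw_content = msg.raw_content;
            }
        }
        Entry::Vacant(slot) => {
            slot.insert(msg);
        }
    }
}
```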
2026-04-29 17:22:24 +00:00
iceweasel-oai
cecca5ae06 Improve Windows process management edge cases (#19211)
## Summary

Follow-up fixes for Windows process-management issues from
https://github.com/openai/codex/pull/15578

- bound the elevated runner pipe-connect handshake instead of waiting
forever on blocking pipe connects
- terminate the spawned runner if that handshake fails, so timeout/error
paths do not leave a stray `codex-command-runner.exe`
- loop on partial `WriteFile` results when forwarding stdin in the
elevated runner, so input is not silently truncated
- fix the concrete HANDLE/SID cleanup paths in the runner setup code
- keep draining driver-backed stdout/stderr after exit until the backend
closes, instead of dropping the tail after a fixed 200ms grace period
- reuse `LocalSid` for SID ownership and add more explanatory comments
around the ownership/concurrency-sensitive code paths
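
The partial-write loop can be illustrated portably with `std::io::Write`, whose `write` method, like `WriteFile`, may accept fewer bytes than requested. This is a sketch of the technique rather than the runner's Windows code; `write_fully` and `ChunkedWriter` are names invented for the example.

```rust
use std::io::{self, Write};

/// Keep writing until the whole buffer is consumed; a single `write` call
/// may accept only part of it.
pub fn write_fully<W: Write>(writer: &mut W, mut buf: &[u8]) -> io::Result<()> {
    while !buf.is_empty() {
        match writer.write(buf) {
            Ok(0) => {
                return Err(io::Error::new(
                    io::ErrorKind::WriteZero,
                    "writer accepted no bytes",
                ));
            }
            // Advance past the bytes that were actually written.
            Ok(n) => buf = &buf[n..],
            Err(err) if err.kind() == io::ErrorKind::Interrupted => continue,
            Err(err) => return Err(err),
        }
    }
    Ok(())
}

/// Test double that accepts at most `chunk` bytes per call.
pub struct ChunkedWriter {
    pub chunk: usize,
    pub data: Vec<u8>,
}

impl Write for ChunkedWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = buf.len().min(self.chunk);
        self.data.extend_from_slice(&buf[..n]);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

The standard library's `Write::write_all` implements the same loop; it is spelled out here only to show why assuming one call consumes the whole buffer truncates input.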

## Why

The original PR fixed a lot of Windows session plumbing, but there were
still a few sharp process-lifecycle edges:

- some elevated runner handshakes could block forever
- the new timeout path could still orphan the spawned runner process
- stdin forwarding still assumed a single `WriteFile` consumed the whole
buffer
- a few raw HANDLE/SID error paths still leaked
- driver-backed output could still lose the last chunk of stdout/stderr
on slower backends

## Validation

- `cargo fmt -p codex-windows-sandbox -p codex-utils-pty`
- `cargo test -p codex-utils-pty`
- `cargo test -p codex-windows-sandbox finish_driver_spawn`
- `cargo test -p codex-windows-sandbox runner_`

Ran a local test matrix of unified-exec and shell_tool tests; all
passed.
2026-04-29 10:00:01 -07:00
Eric Traut
1c420a90cd TUI: Remove core protocol dependency [1/7] (#20172)
## Why

This is part 1 of a 7-PR stack to remove direct
`codex_protocol::protocol` usage from `codex-tui` while keeping each
layer reviewable and shippable.

This first layer reduces the size of the later `chatwidget` diff by
mechanically moving MCP startup bookkeeping out of the central widget
file without changing the event shapes or behavior.

## What changed

- Extracted MCP startup status handling into
`tui/src/chatwidget/mcp_startup.rs`.
- Kept the existing core event types in place for this purely mechanical
move.
- Updated the MCP startup tests to import the moved test-only event
types directly.

## Verification

- `cargo test -p codex-tui chatwidget::tests::mcp_startup`
2026-04-29 09:10:22 -07:00
Eric Traut
91ca551df8 Use /goal resume for paused goals (#20082)
## Why

The paused goal statusline currently points users at `/goal` to unpause
a goal, but bare `/goal` is the summary command and does not change the
goal state. Instead of making `/goal` mutate state only when a goal is
paused, this gives the action an explicit command that reads naturally
in the UI.

## What Changed

- Replace `/goal unpause` with `/goal resume` for reactivating a paused
goal.
- Update the paused goal statusline and `/goal` summary copy to point at
`/goal resume`.
2026-04-29 08:56:02 -07:00
jif-oai
70ac0f123c Make multi-agent v2 ignore agents.max_depth (#20180)
## Why

`agents.max_depth` is a legacy multi-agent v1 guard. Multi-agent v2 uses
task-path routing and its own session/thread limits, so v2 should not
reject nested `spawn_agent` calls just because the thread-spawn depth
has reached the v1 maximum.

Keeping the v1 depth guard active in v2 prevents deeper task trees even
though the v2 path still needs the depth value only for lineage and
task-path metadata.

## What Changed

- Removed the depth-limit rejection from the multi-agent v2
`spawn_agent` handler while still computing child depth for lineage/path
metadata.
- Made the depth-based disabling of legacy `SpawnCsv`/`Collab` tools
apply only when `Feature::MultiAgentV2` is disabled.
- Added `multi_agent_v2_spawn_agent_ignores_configured_max_depth` to
cover a v2 child spawning another agent when `agent_max_depth = 1`,
while the existing v1 depth-limit tests continue to enforce the legacy
behavior.

## Verification

- `cargo test -p codex-core
multi_agent_v2_spawn_agent_ignores_configured_max_depth -- --nocapture`
- `cargo test -p codex-core depth_limit -- --nocapture`
- `cargo test -p codex-core tools::handlers::multi_agents::tests --
--nocapture`
2026-04-29 12:23:00 +02:00
jif-oai
c41b74c453 nit: drop old memories things (#20186)
Drop legacy code
2026-04-29 12:19:50 +02:00
iceweasel-oai
5cac3f896d Fix Windows pseudoconsole attribute handling for sandboxed PTY sessions (#20042)
## Summary
Fix the Windows sandbox PTY spawn path to pass the pseudoconsole handle
value directly into `UpdateProcThreadAttribute`.

## Why
Sandboxed `unified_exec` PTY sessions on Windows were failing during
child process startup with `0xc0000142` (`STATUS_DLL_INIT_FAILED`). In
practice this showed up as PowerShell DLL init popups when the sandboxed
background-terminal path tried to launch an interactive shell.

The root cause was that we were passing a pointer to a local `isize`
variable instead of the pseudoconsole handle value in the form Windows
expects for `PROC_THREAD_ATTRIBUTE_PSEUDOCONSOLE`.

## Validation
- `cargo build -p codex-windows-sandbox --bins`
- Reproduced the real sandboxed `codex exec` flow with
`windows.sandbox_private_desktop=true`
- Verified a `tty=true` interactive session launched through the normal
PowerShell wrapper, printed `READY`, accepted follow-up stdin, and
exited cleanly
- Confirmed no new `0xc0000142` / `Application Popup` events appeared
after the successful repro
2026-04-29 11:59:45 +02:00
alexsong-oai
d92c909ee4 Fix migrated hook path rewriting (#20144)
## Summary
- Rewrite migrated external-agent hook commands by replacing the full
hook script path token instead of only the `.claude/hooks/` segment.
- Preserve quoting around the full rewritten target path so script names
with spaces, absolute paths, and shell operators/redirection continue to
work.
- Apply `.claude/settings.local.json` over `.claude/settings.json` for
config, MCP, and plugin migration so local scope matches Claude settings
precedence.
- Skip legacy command markdown without `description` frontmatter,
including README-style docs under `.claude/commands`.

## Root Cause
The previous hook rewrite handled `.claude/hooks/` as a substring
replacement. For absolute source commands, that left the original
project-root prefix before the newly quoted `.codex/hooks` directory,
producing invalid commands like
`project/'project/.codex/hooks'/script.sh`.

The migration also only used project `settings.json` for
config/MCP/plugin decisions, so local settings such as
`disabledMcpjsonServers` could be ignored even though Claude gives local
settings higher precedence than project settings.

## Validation
- `just fmt`
- `cargo test -p codex-external-agent-migration`
- `cargo test -p codex-app-server external_agent_config`
- `just fix -p codex-external-agent-migration`
- `just fix -p codex-app-server`
- `git diff --check`
2026-04-29 00:46:11 -07:00
viyatb-oai
5597925155 feat(cli): add sandbox profile config controls (#20118)
## Why

The explicit profile path from #20117 is meant for standalone testing,
but it still inherited the shell cwd and all managed requirements
implicitly. The pre-existing launcher path even called out that it did
not support a separate cwd yet in

[`debug_sandbox.rs`](509453f688/codex-rs/cli/src/debug_sandbox.rs (L174-L179)).

For a standalone command, the useful default is to let the caller choose
the project directory being tested and to avoid administrator-provided
constraints unless the caller explicitly wants to test those too.

## What changed

- Add explicit-profile-only `-C/--cd DIR`, and use that cwd for both
profile resolution and command execution.
- Add explicit-profile-only `--include-managed-config`.
- Make explicit profile mode skip managed requirement sources by
default, including cloud requirements, MDM requirements,
`/etc/codex/requirements.toml`, and the legacy managed-config
requirements projection.
- Preserve all existing invocations outside the explicit-profile path.

## Stack

1. #20117 `sandbox-ui-profile`
2. #20118 `sandbox-ui-config` --> this PR

Both PRs are additive. Replay JSON is intentionally deferred to a
follow-up design pass.

## Tests ran

- `cargo test -p codex-cli debug_sandbox`
- `cargo test -p codex-cli sandbox_macos_`
- `cargo test -p codex-core
load_config_layers_can_ignore_managed_requirements`
- `cargo test -p codex-core
load_config_layers_includes_cloud_requirements`
- macOS branch-binary smoke on the rebased top of stack: `-C` changed
execution cwd, explicit profile mode omitted managed proxy env under
`env -i`, and `--include-managed-config` restored it.
- Linux devbox branch-binary smoke on the rebased top of stack: `-C`
changed execution cwd for built-in and user-defined explicit profiles.
2026-04-29 06:55:51 +00:00
Andrey Mishchenko
857146b328 Delete multi_agent_v2 followup_task interrupt parameter (#20139)
Messages sent with `followup_task` already arrive at their target
recipient promptly (at message boundaries while sampling, or after the
pending tool call completes) -- having `interrupt` is not worth the
added complexity.
2026-04-28 23:19:48 -07:00
viyatb-oai
6ed0440611 feat(cli): add explicit sandbox permission profiles (#20117)
## Why

`codex sandbox` is useful for exercising sandbox behavior directly, but
before this stack the CLI only picked up permission profiles indirectly
from the active config. The existing debug-sandbox path already compiled
`[permissions]` profiles through normal config loading, as covered by
the existing profile tests in
[`debug_sandbox.rs`](de2ccf9473/codex-rs/cli/src/debug_sandbox.rs (L715-L760)).

This adds the smallest stable entry point first: an explicit profile selector
that reuses the same config machinery as normal Codex config, so standalone
testing becomes possible without changing current no-selector behavior.

## What changed

- Add additive `--permissions-profile NAME` support to
  `codex sandbox macos|linux|windows`.
- Resolve built-in and user-defined profile names by feeding
  `default_permissions` through the existing config compilation path instead
  of inventing a sandbox-only parser.
- Make an explicit selector win over an ambient active profile's legacy
  `sandbox_mode`.
- Keep the existing no-selector behavior unchanged.
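
The precedence rule in the list above can be sketched as a small match. This is a simplified model under stated assumptions, not the CLI's code: `PermissionSource`, `resolve`, and the string values are hypothetical, and the real path goes through config compilation rather than a direct match.

```rust
/// Hypothetical model of which setting decides the sandbox permissions.
#[derive(Debug, PartialEq, Eq)]
enum PermissionSource {
    ExplicitProfile(String),
    AmbientSandboxMode(String),
    ConfigDefault,
}

/// An explicit `--permissions-profile` wins over an ambient `sandbox_mode`;
/// without a selector, the pre-existing resolution is left alone.
fn resolve(
    explicit_profile: Option<&str>,
    ambient_sandbox_mode: Option<&str>,
) -> PermissionSource {
    match (explicit_profile, ambient_sandbox_mode) {
        (Some(name), _) => PermissionSource::ExplicitProfile(name.to_string()),
        (None, Some(mode)) => PermissionSource::AmbientSandboxMode(mode.to_string()),
        (None, None) => PermissionSource::ConfigDefault,
    }
}

fn main() {
    // Selector present: it wins even though an ambient mode is set.
    println!("{:?}", resolve(Some(":workspace"), Some("read-only")));
    // No selector: the ambient mode still applies, as before.
    println!("{:?}", resolve(None, Some("read-only")));
    println!("{:?}", resolve(None, None));
}
```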

## Stack

1. #20117 `sandbox-ui-profile` --> this PR
2. #20118 `sandbox-ui-config`

Both PRs are additive. Replay JSON is intentionally deferred to a
follow-up design pass.

## Tests ran

- `cargo test -p codex-cli debug_sandbox`
- `cargo test -p codex-cli sandbox_macos_parses_permissions_profile`
- `cargo test -p codex-core cli_override_takes_precedence_over_profile_sandbox_mode`
- macOS branch-binary smoke on the rebased top of stack: built-in `:workspace`
  and user-defined profiles both executed successfully through
  `--permissions-profile`.
- Linux devbox branch-binary smoke on the rebased top of stack: built-in
  `:workspace` and user-defined profiles both executed successfully through
  `--permissions-profile`.
2026-04-29 06:18:16 +00:00
Dylan Hurd
3d10ba9f36 chore(cli) deprecate --full-auto (#20133)
## Summary
Starts the process of getting rid of `--full-auto`, with some
concessions:
1. Fully removes the flag from the TUI, since it just resolves to the
default permissions there, and encourages users to use the one-time
trust flow if they're not in a trusted repo.
2. Marks the flag as deprecated in `codex exec`, in case users are
actively relying on it. We'll remove it in an upcoming n+X release.
3. Cleans up some of the `codex sandbox` CLI logic, to keep supporting
legacy sandbox policies for now.

This isn't the cleanest setup, but I think it is worthwhile to warn
users for one release before hard-removing it.

## Testing 
- [x] Updated unit tests
2026-04-29 04:41:30 +00:00
starr-openai
e1ec9e63a0 Add environment provider snapshot (#20058)
## Summary
- Change `EnvironmentProvider` to return concrete `Environment`
instances instead of `EnvironmentConfigurations`.
- Make `DefaultEnvironmentProvider` provide the provider-visible `local`
environment plus optional `remote` environment from
`CODEX_EXEC_SERVER_URL`.
- Keep `EnvironmentManager` as the concrete cache while exposing its own
explicit local environment for `local_environment()` fallback paths.
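
A minimal sketch of the provider shape described above, assuming a simplified `Environment` type: the enum, its `url` field, and `default_environments` are hypothetical names, and only the `CODEX_EXEC_SERVER_URL` variable comes from the summary. The idea shown is that the provider always yields a `local` environment and adds a `remote` one only when the variable is set.

```rust
use std::env;

/// Hypothetical, simplified stand-in for a concrete environment instance.
#[derive(Debug, PartialEq, Eq)]
enum Environment {
    Local,
    Remote { url: String },
}

/// Builds the provider-visible environments from an optional server URL.
/// Taking the URL as a parameter keeps the function easy to test.
fn default_environments(exec_server_url: Option<String>) -> Vec<Environment> {
    let mut envs = vec![Environment::Local];
    if let Some(url) = exec_server_url {
        envs.push(Environment::Remote { url });
    }
    envs
}

fn main() {
    // Reads the variable named in the summary; unset means local only.
    let envs = default_environments(env::var("CODEX_EXEC_SERVER_URL").ok());
    println!("{envs:?}");
}
```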

## Validation
- `just fmt`
- `git diff --check`

---------

Co-authored-by: Codex <noreply@openai.com>
2026-04-28 20:05:18 -07:00
xl-openai
6f328d5e02 Soften skill description budget warnings (#20112)
Updates skill description budget messaging to be less alarming
2026-04-28 19:56:25 -07:00
Michael Bolin
e6db1a9442 linux-sandbox: switch helper plumbing to PermissionProfile (#20106)
## Why

`PermissionProfile` is the canonical runtime permission model in the
Rust workspace, but the Linux sandbox helper still accepted a legacy
`SandboxPolicy` plus separate filesystem and network policy flags. That
translation layer made the helper interface harder to reason about and
left `linux-sandbox`-specific callers and tests coupled to the legacy
policy representation.

This change moves the helper onto `PermissionProfile` directly so the
Linux sandbox plumbing matches the rest of the permission stack.

## What changed

- changed `codex-linux-sandbox` to accept `--permission-profile` and
derive the runtime filesystem and network policies internally
- updated the in-process seccomp and legacy Landlock path in
`codex-rs/linux-sandbox` to operate on `PermissionProfile`
- updated Linux sandbox argv construction in `codex-rs/sandboxing`,
`codex-rs/core`, and the CLI debug sandbox path to pass the canonical
profile instead of serializing compatibility policy projections
- simplified the Linux sandbox tests to build the exact permission
profile under test, including the managed-proxy path and
direct-runtime-enforcement carveout coverage
- removed helper-local `SandboxPolicy` usage from `bwrap` tests where
`FileSystemSandboxPolicy` is already the value being exercised

## Testing

- `cargo test -p codex-sandboxing`
- `cargo test -p codex-linux-sandbox` (on this macOS host, the crate
compiled cleanly and its Linux-only tests were cfg-gated)
- `cargo test -p codex-core --no-run`
- `cargo test -p codex-cli --no-run`
2026-04-28 19:43:44 -07:00
Celia Chen
80fb0704ee feat: update Bedrock Mantle endpoint and GPT-5.4 model ID (#20109)
## Summary

Amazon Bedrock Mantle's OpenAI-compatible endpoint now lives under
`/openai/v1`, and the GPT-5.4 Mantle model ID no longer uses the `-cmb`
suffix. This updates Codex's built-in Bedrock provider configuration so
generated providers and the static Bedrock catalog use the current
endpoint and model ID.

## Changes

- Update the Bedrock Mantle base URL from
`https://bedrock-mantle.{region}.api.aws/v1` to
`https://bedrock-mantle.{region}.api.aws/openai/v1`.
- Update the Amazon Bedrock default base URL in
`codex-model-provider-info`.
- Change the Bedrock GPT-5.4 catalog slug from `openai.gpt-5.4-cmb` to
`openai.gpt-5.4`.
- Align provider and catalog tests with the new URL and model ID.
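
For illustration, the new URL template can be expanded per region as below. `bedrock_mantle_base_url` is a hypothetical helper written for this sketch; the provider code may build the URL differently.

```rust
/// Hypothetical helper that fills the `{region}` placeholder of the new
/// Bedrock Mantle base URL template.
fn bedrock_mantle_base_url(region: &str) -> String {
    format!("https://bedrock-mantle.{region}.api.aws/openai/v1")
}

fn main() {
    // Same region as the manual smoke test in the test plan.
    println!("{}", bedrock_mantle_base_url("us-west-2"));
}
```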

## Test Plan

- Manual smoke test:

  ```shell
  target/debug/codex \
      -m openai.gpt-5.4 \
      -c 'model_provider="amazon-bedrock"' \
      -c 'model_providers.amazon-bedrock.aws.region="us-west-2"'
  ```
2026-04-29 01:37:21 +00:00
Celia Chen
8c47e36504 feat: expose provider capability bounds to app server clients (#20049)
Follow-up to #19442. The app server now exposes provider-derived bounds
through a new v2 `modelProvider/read` method. The response reports the
configured provider map key as `modelProvider` and returns the effective
capability booleans so clients can align their UI with the same
provider-owned limits used by core.
2026-04-29 01:36:19 +00:00
canvrno-oai
4c39ad33cb Fix plugin list workspace settings test isolation (#20086)
Fixes a test that often fails locally when running `cargo test`:
- Add an app-server test helper that combines managed-config isolation
with custom env overrides.
- Isolate `HOME` / `USERPROFILE` in plugin-list workspace settings tests
so host home marketplaces do not affect results.
2026-04-28 18:34:38 -07:00
canvrno-oai
24be9ac0a4 Restore TUI working status after steer message is set (#19939)
Fix for #19925

Restore the `Working` indicator after a streamed final answer finishes
while a user steer message is still pending.
Add regression coverage for long output plus a mid-stream steer:
`cargo test -p codex-tui
final_answer_completion_restores_status_indicator_for_pending_steer`

Reproduction/testing steps:
1. Start a new thread and ask for a long response.
2. While the response is streaming, submit a steer message.
3. When the first response finishes, observe whether `Working...` is
shown while waiting for the steer message response.
2026-04-28 18:10:40 -07:00
Michael Bolin
c9f7c88f3d fix: restore live event submit path for apply patch tests (#20108)
## Summary

This fixes the CI regression introduced by
[#20040](https://github.com/openai/codex/pull/20040).

That PR migrated several `apply_patch_cli` tests from direct
`codex.submit(Op::UserTurn { ... })` calls to `harness.submit(...)`.
`harness.submit()` waits for `TurnComplete` before returning, which
drains the same event stream that these tests use to assert `TurnDiff`,
`PatchApplyUpdated`, and related live events. The regressed tests then
timed out waiting for events that had already been consumed.

This change restores a no-wait submit path for the event-observing
`apply_patch_cli` tests so they can watch the turn stream directly
again.
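
The failure mode can be reproduced in miniature with a standard-library channel standing in for the event stream. This is a sketch under stated assumptions: the `Event` enum, `run_turn`, `submit_and_wait`, and `remaining` are hypothetical, and `submit_without_wait` here only borrows the helper's name, not its real signature.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical subset of the live events the tests assert on.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Event {
    TurnDiff,
    PatchApplyUpdated,
    TurnComplete,
}

/// Simulates a turn by queueing its events in order.
fn run_turn(tx: &Sender<Event>) {
    for ev in [Event::TurnDiff, Event::PatchApplyUpdated, Event::TurnComplete] {
        tx.send(ev).expect("receiver is alive");
    }
}

/// Waiting submit: consumes events until `TurnComplete`, discarding them.
fn submit_and_wait(tx: &Sender<Event>, rx: &Receiver<Event>) {
    run_turn(tx);
    while let Ok(ev) = rx.recv() {
        if ev == Event::TurnComplete {
            break;
        }
    }
}

/// No-wait submit: leaves every event on the stream for the test to read.
fn submit_without_wait(tx: &Sender<Event>) {
    run_turn(tx);
}

/// Collects whatever is still queued on the stream without blocking.
fn remaining(rx: &Receiver<Event>) -> Vec<Event> {
    rx.try_iter().collect()
}

fn main() {
    let (tx, rx) = channel();
    submit_and_wait(&tx, &rx);
    println!("after waiting submit: {:?}", remaining(&rx));

    submit_without_wait(&tx);
    println!("after no-wait submit: {:?}", remaining(&rx));
}
```

After the waiting submit nothing is left to observe, which is why a test that then waits for `TurnDiff` would time out; the no-wait path leaves all three events available.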

## What Changed

- added a local `submit_without_wait(...)` helper in
`codex-rs/core/tests/suite/apply_patch_cli.rs`
- switched the `apply_patch_cli` tests that assert live turn events back
to that helper
- left the profile-backed `harness.submit(...)` migration in place for
tests that only care about final filesystem or tool output state

## Why macOS Looked Green

In the failing run
[25084487331](https://github.com/openai/codex/actions/runs/25084487331),
`//codex-rs/core:core-all-test` was cached on macOS, so the regressed
tests were not rerun there. The Linux GNU, Linux MUSL, and Windows Bazel
jobs reran the target and exposed the failure.

## Verification

- `cargo test -p codex-core apply_patch_ -- --nocapture`
- previously failing local cases now pass again:
  - `apply_patch_cli_move_without_content_change_has_no_turn_diff`
  - `apply_patch_turn_diff_for_rename_with_content_change`
  - `apply_patch_aggregates_diff_across_multiple_tool_calls`
2026-04-28 18:09:20 -07:00
Celia Chen
f8fe96d548 feat: disable capabilities by model provider (#19442)
## Why

Unsupported features must fail closed and Codex must not expose
OpenAI-hosted fallback paths when the active provider cannot support
them. In practice, Bedrock should not surface app connectors, MCP
servers, tool search/suggestions, image generation, web search, or JS
REPL until those paths are explicitly supported for that provider.

This PR moves that decision into provider-owned capability metadata
instead of scattering Bedrock-specific checks across callers.

## What changed

- Adds `ProviderCapabilities` to `codex-model-provider`, with default
support for existing providers and a Bedrock override that disables
unsupported launch surfaces.
- Adds `ToolCapabilityBounds` to `codex-tools` so provider capability
limits can clamp otherwise-enabled tool config.
- Applies capability bounds when building session and review-thread tool
config.
- Routes MCP/app connector configuration through
`McpManager::mcp_config`, which filters configured MCP servers and app
connectors based on the active provider.
- Updates app-server MCP list/read paths to use the filtered MCP config.
- Adds coverage for default provider capabilities, Bedrock disabled
capabilities, and optional tool-surface clamping.
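
One way to picture the clamping step is a field-wise logical AND between the configured tool surfaces and the provider's bounds. The struct and field names below are hypothetical and cover only three of the surfaces listed in this PR; the real `ProviderCapabilities` and `ToolCapabilityBounds` types may look different.

```rust
/// Hypothetical subset of tool surfaces; field names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ToolSurfaces {
    mcp_servers: bool,
    web_search: bool,
    image_generation: bool,
}

impl ToolSurfaces {
    /// A surface stays enabled only if both the config and the provider
    /// bounds allow it, so an unsupported surface fails closed.
    fn clamp(self, bounds: ToolSurfaces) -> ToolSurfaces {
        ToolSurfaces {
            mcp_servers: self.mcp_servers && bounds.mcp_servers,
            web_search: self.web_search && bounds.web_search,
            image_generation: self.image_generation && bounds.image_generation,
        }
    }
}

/// Assumed default: existing providers support every surface.
const DEFAULT_BOUNDS: ToolSurfaces = ToolSurfaces {
    mcp_servers: true,
    web_search: true,
    image_generation: true,
};

/// Assumed Bedrock override: the listed surfaces are disabled.
const BEDROCK_BOUNDS: ToolSurfaces = ToolSurfaces {
    mcp_servers: false,
    web_search: false,
    image_generation: false,
};

fn main() {
    let configured = ToolSurfaces {
        mcp_servers: true,
        web_search: false,
        image_generation: true,
    };
    println!("default: {:?}", configured.clamp(DEFAULT_BOUNDS));
    println!("bedrock: {:?}", configured.clamp(BEDROCK_BOUNDS));
}
```

Clamping can only turn surfaces off, never on, so a surface the user disabled stays disabled under any provider.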

## Testing

Built locally and verified that Bedrock Responses API requests now return
without errors from calling unsupported tools.
2026-04-28 17:51:30 -07:00
alexsong-oai
cb8b1bbcd6 Support detect and import MCP, Subagents, hooks, commands from external (#19949)
## Why
This PR expands the external-agent migration path so Codex can detect and
import MCP server config, hooks, commands, and subagent configs in a
Codex-native shape.

## What changed

- Added a `codex-external-agent-migration` crate that owns conversion
logic for external-agent MCP servers, hooks, commands, and subagents.
- Extended the app-server external-agent config detection/import API
with migration item types for MCP server config, hooks, commands, and
subagents.

## Migration strategy

The migration is intentionally conservative: Codex only imports
external-agent config that can be represented safely in Codex today.
Unsupported or ambiguous config is skipped instead of being partially
translated into behavior that may not match the source system.

- **MCP servers**: import supported stdio and HTTP MCP server
definitions into `mcp_servers`. Disabled servers and servers filtered
out by source `enabledMcpjsonServers` / `disabledMcpjsonServers` are
skipped. Project-scoped MCP entries from `.claude.json` are included
when they match the repo path.
- **Hooks**: import only supported command hooks into
`.codex/hooks.json`. Unsupported hook features such as conditional
groups, async handlers, prompt/http hooks, or unknown fields are
skipped. Referenced hook scripts are copied into `.codex/hooks/`,
preserving any existing target scripts.
- **Commands**: import supported external commands as Codex skills under
`.agents/skills/source-command-*`. Commands that rely on source runtime
expansion such as `$ARGUMENTS`, `$1`, `@file` references, shell
interpolation, or colliding generated names are skipped.
- **Subagents**: import valid subagent Markdown files into
`.codex/agents/*.toml` when they have the minimum Codex agent fields.
Source model names are not migrated, so imported agents keep the user’s
Codex default model; compatible reasoning effort and sandbox mode are
migrated when present.
- **Skills and project guidance**: copy missing skill directories into
`.agents/skills` and migrate `CLAUDE.md` guidance into `AGENTS.md`,
rewriting source-agent terminology to Codex terminology where
appropriate.
- **Detection details**: detected migration items include lightweight
details for UI preview, such as MCP server names, hook event names,
generated command skill names, and subagent names. Import still
recomputes from disk instead of trusting details as the source of truth.

- Adds focused coverage for the new migration behavior and app-server
import flow.
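
To make the conservative rule for commands concrete, here is a rough sketch of a skip check for source runtime expansion. `uses_runtime_expansion` is a hypothetical function, the matching rules are assumptions, and shell interpolation and name collisions are deliberately left out because the summary does not describe how they are detected.

```rust
/// Returns true when a command body appears to rely on source runtime
/// expansion (assumed heuristics: `$ARGUMENTS`, `$<digit>`, or an `@token`).
fn uses_runtime_expansion(body: &str) -> bool {
    if body.contains("$ARGUMENTS") {
        return true;
    }
    // Positional parameters such as `$1`: a '$' directly followed by a digit.
    let has_positional = body
        .as_bytes()
        .windows(2)
        .any(|w| w[0] == b'$' && w[1].is_ascii_digit());
    // File references: whitespace-separated tokens that start with '@'.
    let has_file_ref = body
        .split_whitespace()
        .any(|tok| tok.len() > 1 && tok.starts_with('@'));
    has_positional || has_file_ref
}

fn main() {
    let commands = [
        "Review $ARGUMENTS for bugs",
        "Fix issue $1",
        "Summarize @src/main.rs",
        "Run the test suite and report failures.",
    ];
    for body in commands {
        let action = if uses_runtime_expansion(body) { "skip" } else { "import" };
        println!("{action}: {body}");
    }
}
```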

## Verification

- `cargo test -p codex-external-agent-migration`
- `cargo test -p codex-hooks`
- `cargo test -p codex-app-server external_agent_config`
- `just bazel-lock-check`
2026-04-29 00:45:24 +00:00
Matthew Zeng
ebdf3a878c Support disabling tool suggest for specific tools. (#20072)
## Summary
- Add `disable_tool_suggest` to app and plugin config, schema, and
TypeScript output
- Exclude disabled connectors and plugins from tool suggestion discovery
- Persist "never show again" tool-suggestion choices back into
`config.toml`
- Update config docs and add coverage for connector and plugin
suppression
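
The discovery-side filtering can be sketched as follows, assuming each connector or plugin carries its own `disable_tool_suggest` flag. `ToolCandidate` and `suggestable` are hypothetical names for this illustration; only the `disable_tool_suggest` key comes from the summary.

```rust
/// Hypothetical view of a connector or plugin during suggestion discovery.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ToolCandidate {
    name: String,
    disable_tool_suggest: bool,
}

/// Keeps only the candidates that may still be suggested.
fn suggestable(candidates: &[ToolCandidate]) -> Vec<&str> {
    candidates
        .iter()
        .filter(|c| !c.disable_tool_suggest)
        .map(|c| c.name.as_str())
        .collect()
}

fn main() {
    let candidates = vec![
        ToolCandidate { name: "calendar".to_string(), disable_tool_suggest: false },
        // A "never show again" choice would persist this flag as true.
        ToolCandidate { name: "drive".to_string(), disable_tool_suggest: true },
    ];
    println!("{:?}", suggestable(&candidates));
}
```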

## Testing
- Added and updated unit tests for config persistence and tool-suggest
filtering
- Not run (not requested)
2026-04-29 00:19:34 +00:00
Michael Bolin
1211a90a35 core tests: migrate hook turns to profiles (#20041)
## Summary
- Removes `SandboxPolicy` from the hooks test suite.
- Submits hook-related turns with explicit `PermissionProfile` values
for disabled, read-only, and workspace-write cases.
- Preserves the managed-network hook test by configuring and submitting
a workspace-write profile with enabled network, allowing the existing
requirements-backed proxy path to remain covered.

## Verification
- `cargo check -p codex-core --tests`
- `just fmt`
2026-04-28 17:18:45 -07:00
Michael Bolin
1fed948c66 core tests: migrate apply patch turns to profiles (#20040)
## Summary
- Removes `SandboxPolicy` from the apply-patch CLI test suite.
- Uses the harness' profile-backed submit helper for danger/no-sandbox
turns instead of constructing `Op::UserTurn` manually with legacy
fields.
- Converts the workspace-write traversal cases to submit
`PermissionProfile::workspace_write_with(...)` directly.

## Verification
- `cargo check -p codex-core --tests`
- `just fmt`
2026-04-28 17:18:19 -07:00
Michael Bolin
1dae5788e1 core tests: migrate rmcp turns to profiles (#20037)
## Summary
- Removes `SandboxPolicy` from the RMCP client test suite.
- Adds shared read-only user-turn helpers that submit
`PermissionProfile::read_only()` plus the legacy compatibility
projection required by the current `Op::UserTurn` shape.
- Keeps sandbox metadata assertions intact by deriving the expected
legacy `sandboxPolicy` value from the same read-only profile used for
the turn.

## Verification
- `cargo check -p codex-core --tests`
- `just fmt`
2026-04-28 17:17:47 -07:00
Michael Bolin
6662c0f312 core tests: migrate compact turns to profiles (#20035)
## Summary
- Removes the remaining `SandboxPolicy` usage from the compaction test
suite.
- Adds a small local helper for direct `Op::UserTurn` construction so
these tests send `PermissionProfile::Disabled` plus the legacy
compatibility projection required by the protocol field.
- Keeps the existing danger/full-access behavior while exercising the
canonical permission profile path.

## Verification
- `cargo check -p codex-core --tests`
- `just fmt`
2026-04-28 17:17:12 -07:00