codex

mirror of https://github.com/openai/codex.git synced 2026-05-18 02:02:30 +00:00

Author	SHA1	Message	Date
Matthew Zeng	ebe602d005	[plugins] Allow MSFT curated plugins in tool_suggest (#20304 ) ## Summary - [x] Move the allowlist out of core crate - [x] Add Teams, SharePoint, Outlook Email, and Outlook Calendar to the tool_suggest discoverable plugin allowlist - [x] Add focused coverage for Microsoft curated plugin discovery ## Testing - just fmt - cargo test -p codex-core-plugins - cargo test -p codex-core list_tool_suggest_discoverable_plugins_returns_	2026-04-29 19:45:52 -07:00
pakrym-oai	4e677d62da	app-server: remove dead api version handling from bespoke events (#20291 ) Remove ApiVersion::V1	2026-04-30 01:55:44 +00:00
rhan-oai	bb536d65bd	[codex-analytics] prevent stale guardian events from satisfying reused reviews (#20080 ) ## Why Reused Guardian review trunks can still have older child-turn events queued when a later review starts. The review waiter currently accepts the first terminal event it sees from the shared child session, so a stale `TurnComplete` can be attributed to the new review. That produces impossible analytics combinations such as non-null TTFT with sub-10 ms completion latency and zero token deltas on `trunk_reused` reviews. ## What changed - Preserve the child turn id returned by the Guardian review `Op::UserTurn` submission. - Restrict Guardian review waiting to events correlated with that submitted child turn. - Restrict timeout/abort draining to terminal events for the same child turn. - Add regression coverage for stale prior-turn completions, stale prior-turn errors, and interrupt draining in `codex-rs/core/src/guardian/review_session.rs`. ## Verification - `cargo test -p codex-core guardian::review_session::tests::` - `cargo clippy -p codex-core --tests -- -D warnings`	2026-04-29 18:26:39 -07:00
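The turn-id correlation described in bb536d65bd can be sketched roughly as follows. This is a minimal illustration, not the code in `review_session.rs`: the `ChildEvent` enum, its variants, and `wait_for_terminal` are hypothetical names invented here.

```rust
/// Hypothetical event shape; the real child-session events are richer.
#[derive(Debug, PartialEq)]
enum ChildEvent {
    Delta { turn_id: String },
    TurnComplete { turn_id: String },
    TurnError { turn_id: String },
}

impl ChildEvent {
    fn turn_id(&self) -> &str {
        match self {
            ChildEvent::Delta { turn_id }
            | ChildEvent::TurnComplete { turn_id }
            | ChildEvent::TurnError { turn_id } => turn_id.as_str(),
        }
    }

    fn is_terminal(&self) -> bool {
        !matches!(self, ChildEvent::Delta { .. })
    }
}

/// Return the first terminal event that belongs to the submitted turn.
/// Terminal events from earlier turns on the shared session are skipped
/// instead of being attributed to the new review.
fn wait_for_terminal<'a>(events: &'a [ChildEvent], submitted_turn: &str) -> Option<&'a ChildEvent> {
    events
        .iter()
        .find(|event| event.is_terminal() && event.turn_id() == submitted_turn)
}

fn main() {
    let queued = vec![
        ChildEvent::TurnComplete { turn_id: "turn-1".into() }, // stale, from a prior review
        ChildEvent::Delta { turn_id: "turn-2".into() },
        ChildEvent::TurnComplete { turn_id: "turn-2".into() },
    ];
    println!("{:?}", wait_for_terminal(&queued, "turn-2"));
}
```

Accepting the first terminal event without the `turn_id` comparison would return the stale `turn-1` completion, which is the misattribution the commit describes.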
Alex Zamoshchin	8b07132e09	update codex_plugins_beta_setting (from workspace settings) (#20250) Update the name after the internal rename; see https://github.com/openai/openai/pull/871006	2026-04-30 00:40:25 +00:00
Eric Traut	515aa9a4fb	tui: return from side chat on Ctrl-D (#20282 ) ## Why Fixes #20264. Side conversations are an ephemeral layer on top of the main chat. Pressing `Ctrl+D` from an empty side-chat composer should unwind back to the parent thread, matching the existing side-return behavior, instead of falling through to the global quit shortcut and exiting Codex. ## What changed The side-return shortcut matcher now treats `Ctrl+D` the same way it already treats `Esc` and `Ctrl+C`. Because app-level side-return handling runs before the chat widget's global quit handling, this returns from `/side` while preserving normal `Ctrl+D` quit behavior outside side conversations. The existing shortcut coverage was updated to include lowercase and uppercase `Ctrl+D` key events. ## Verification - `cargo test -p codex-tui side_return_shortcuts_match_esc_ctrl_c_and_ctrl_d` - `cargo test -p codex-tui` starts successfully and the new shortcut test passes, but the broader suite later aborts in the unrelated existing test `app::tests::attach_live_thread_for_selection_rejects_unmaterialized_fallback_threads` with a stack overflow.	2026-04-29 17:26:11 -07:00
pakrym-oai	fedcefe9da	Reduce the surface of collaboration modes (#20149) Collaboration modes were slightly invasive in both ThreadManager construction and ModelProvider.	2026-04-29 17:22:41 -07:00
stefanstokic-oai	c8abcbf925	Import external agent sessions in background (#20284 ) Summary: - Return from external agent import before session history import finishes - Run session import work in the background and emit the existing completion notification when it is done - Serialize session imports so duplicate requests do not create duplicate imported threads Verification: - cargo test -p codex-app-server external_agent_config_ - cargo test -p codex-external-agent-sessions - just fix -p codex-app-server - just fix -p codex-external-agent-sessions - git diff --check	2026-04-30 00:00:41 +00:00
alexsong-oai	7bcd4626c4	Consume ai-title from external sessions and add end marker (#20261 ) ## Summary - Support Claude Code `ai-title` / `aiTitle` records when detecting and importing external agent sessions. - Preserve existing `custom-title` / `customTitle` precedence; only fall back to `aiTitle` when no custom title is present. - Add coverage for both detection and import title selection, including the custom-title-over-ai-title case. ## Testing - `cargo test -p codex-external-agent-sessions` - `just fix -p codex-external-agent-sessions`	2026-04-30 00:00:13 +00:00
Abhinav	8774229a89	Add hooks/list app-server RPC (#19778) ## Why We need a way to list the available hooks to expose via the TUI and App so users can view and manage their hooks. ## What - Adds `hooks/list` for one or more `cwd` values that returns discovered hook metadata ## Stack 1. openai/codex#19705 2. This PR - openai/codex#19778 3. openai/codex#19840 4. openai/codex#19882 ## Review Notes The generated schema files account for most of the raw diff; these files have the core change: - `hooks/src/engine/discovery.rs` builds the inventory entries during hook discovery while leaving runtime handlers focused on execution. - `app-server/src/codex_message_processor.rs` wires `hooks/list` into the app-server flow for each requested `cwd`. - `app-server-protocol/src/protocol/v2.rs` defines the new v2 request/response payloads exposed on the wire. ### Core Changes `core/src/plugins/manager.rs` adds `plugins_for_layer_stack(...)` so `skills/list` and `hooks/list` can resolve plugin state for each requested `cwd` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-29 23:39:57 +00:00
Michael Bolin	6eab7519b4	chore: increase release build timeout from 60 min to 90 (#20271 ) Build times are creeping up, so increase the timeout as a precaution.	2026-04-29 16:19:59 -07:00
rafael-jac	98f67b15d3	Update Codex login success page UX (#20136) ## Summary - update the local login success page to match the Codex desktop auth UX - use theme-aware colors and an inline 20px Codex mark - keep the actual localhost success page aligned with the browser auth UX PR ## Tests Screenshot 2026-04-29 at 12 00 34 PM (image not included)	2026-04-29 19:14:53 -04:00
evawong-oai	74f06dcdfb	Enforce workspace metadata protections in Linux sandbox (#19852 ) ## Summary Enforce FileSystemSandboxPolicy protected metadata names in the Linux bubblewrap adapter so `.git`, `.agents`, and `.codex` remain read only inside writable workspace roots unless the policy grants an explicit write carveout. ## Scope 1. Translate protected metadata names from FileSystemSandboxPolicy into bubblewrap masks for existing metadata paths. 2. Represent missing protected metadata paths as guarded mount targets so agents cannot create `.git`, `.agents`, or `.codex` under writable roots. 3. Preserve normal git discovery for existing repos, worktrees, and parent repos. 4. Keep explicit user write grants working when policy allows a protected metadata path directly. ## Not in scope 1. No shell preflight UX. 2. No TUI runtime profile propagation. 3. No macOS Seatbelt changes in this PR. ## Reviewer focus 1. This should be reviewed as the Linux enforcement adapter for the policy primitive from PR 19846. 2. macOS enforcement already landed in PR 19847. 3. The important invariant is that `FileSystemSandboxPolicy` is the source of truth for `.git`, `.agents`, and `.codex`. ## Validation 1. `git diff` whitespace check passed. 2. `cargo fmt` check passed with the existing stable rustfmt warning about `imports_granularity`. 3. Full Linux sandbox Cargo test suite passed on the devbox. 4. Devbox forty six case suite passed at head `012accb703c13bd28df5b40079a9bf183036336a`. 5. Devbox summary: pass 46, fail 0. 6. The devbox suite was run through `just c sandbox linux`. 7. Focused repo test for Viyat parent repo case passed on the devbox.	2026-04-29 16:14:14 -07:00
iceweasel-oai	13dbcda28f	stop blocking unified_exec on Windows (#19435 ) ## Summary - remove the Windows-specific unified-exec environment block from tool selection - keep `unified_exec` default-off on Windows unless the feature is explicitly enabled - normalize model-provided `shell_type = unified_exec` to `shell_command` when the feature is disabled - drop obsolete tests tied to the removed environment gate and keep the feature-flag regression coverage ## Why Now that the session/long-lived process backend is implemented for the Windows sandbox, we don't need to hard disable it anymore. We will be rolling out slowly using a feature gate. ## Impact This allows manual Windows opt-in in CLI and app-backed flows while preserving the existing default-off behavior for Windows users. --------- Co-authored-by: canvrno-oai <kbond@openai.com> Co-authored-by: Codex <noreply@openai.com>	2026-04-29 16:06:33 -07:00
pakrym-oai	8de2a7a16d	Add codex-core public API listing (#20243 ) Summary: - Add a checked-in codex-core public API listing generated by cargo-public-api. - Add scripts/regen-public-api.sh with an embedded crate list, auto-install for cargo-public-api 0.51.0, pinned nightly, and --check mode. - Add Rust CI jobs on the codex Linux x64 runner pool to verify the listing stays up to date. Testing: - bash -n scripts/regen-public-api.sh - just regen-public-api --check - yq '.' .github/workflows/rust-ci.yml .github/workflows/rust-ci-full.yml - git diff --check	2026-04-29 22:58:08 +00:00
Rasmus Rygaard	782191547c	Add agent graph store interface (#19229 ) ## Summary Persisted subagent parent/child topology currently leaks through `StateRuntime`'s SQLite-specific thread-spawn helpers. This PR introduces a narrow `AgentGraphStore` boundary so follow-up work can route graph operations through a local or remote store without coupling orchestration code directly to the state DB graph API. ## Changes - Adds the new `codex-agent-graph-store` crate. - Defines a flat `AgentGraphStore` trait for the v1 graph surface: upsert edge, set edge status, list direct children, and list descendants. - Adds public graph types for `ThreadSpawnEdgeStatus`, `AgentGraphStoreError`, and `AgentGraphStoreResult`. - Implements `LocalAgentGraphStore` on top of an existing `codex_state::StateRuntime`, preserving today's SQLite-backed `thread_spawn_edges` behavior. - Registers the crate in Cargo/Bazel metadata. This PR only adds the local contract and implementation; call-site migration and the remote gRPC store are left to the follow-up PRs in the stack. ## Testing - `cargo test -p codex-agent-graph-store` The new unit tests cover local parity with the existing `StateRuntime` graph methods, `Open`/`Closed` filtering, status updates, and stable breadth-first descendant ordering.	2026-04-29 22:48:26 +00:00
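The four graph operations listed in 782191547c (upsert edge, set edge status, list direct children, list descendants) can be illustrated with a small in-memory sketch. It does not reproduce the `AgentGraphStore` trait or `LocalAgentGraphStore`: `MemoryGraph`, `EdgeStatus`, and every method signature below are assumptions made for illustration, and setting a status is folded into the upsert.

```rust
use std::collections::{BTreeMap, VecDeque};

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum EdgeStatus {
    Open,
    Closed,
}

/// Hypothetical in-memory store keyed by parent thread id.
#[derive(Default)]
struct MemoryGraph {
    children: BTreeMap<String, Vec<(String, EdgeStatus)>>,
}

impl MemoryGraph {
    /// Insert the edge, or update its status if it already exists.
    fn upsert_edge(&mut self, parent: &str, child: &str, status: EdgeStatus) {
        let edges = self.children.entry(parent.to_string()).or_default();
        let existing = edges.iter().position(|(id, _)| id == child);
        match existing {
            Some(index) => edges[index].1 = status,
            None => edges.push((child.to_string(), status)),
        }
    }

    /// Direct children in insertion order, optionally filtered by status.
    fn list_children(&self, parent: &str, filter: Option<EdgeStatus>) -> Vec<String> {
        self.children
            .get(parent)
            .into_iter()
            .flatten()
            .filter(|(_, status)| filter.map_or(true, |want| *status == want))
            .map(|(id, _)| id.clone())
            .collect()
    }

    /// All descendants in breadth-first order. Assumes the spawn graph is a
    /// tree; a cycle would make this loop forever.
    fn list_descendants(&self, root: &str, filter: Option<EdgeStatus>) -> Vec<String> {
        let mut out = Vec::new();
        let mut queue = VecDeque::from([root.to_string()]);
        while let Some(node) = queue.pop_front() {
            for child in self.list_children(&node, filter) {
                out.push(child.clone());
                queue.push_back(child);
            }
        }
        out
    }
}

fn main() {
    let mut graph = MemoryGraph::default();
    graph.upsert_edge("root", "a", EdgeStatus::Open);
    graph.upsert_edge("root", "b", EdgeStatus::Open);
    graph.upsert_edge("a", "c", EdgeStatus::Open);
    println!("{:?}", graph.list_descendants("root", None));
}
```

A queue rather than recursion keeps the ordering breadth-first, so every direct child is listed before any grandchild.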
Matthew Zeng	e20391e567	[mcp] Fix plugin MCP approval policy. (#19537 ) Plugin MCP servers are loaded from plugin manifests rather than top-level `[mcp_servers]`, so their tool approval preferences need to be stored and applied through the owning plugin config. Without this, choosing "Always allow" for a plugin MCP tool could write a preference that was not reliably used on later tool calls. ## Summary - Add plugin-scoped MCP policy config under `plugins.<plugin>.mcp_servers`, including server enablement, tool allow/deny lists, server defaults, and per-tool approval modes. - Overlay plugin MCP policy onto manifest-provided server configs when plugins are loaded. - Route persistent "Always allow" writes for plugin MCP tools back to the owning `plugins.<plugin>.mcp_servers.<server>.tools.<tool>` config entry. - Reload user config after persisting an approval and make the plugin load cache config-aware so stale plugin MCP policy is not reused after `config.toml` changes. - Regenerate the config schema and add coverage for plugin MCP policy loading, approval lookup, persistence, and stale-cache prevention. ## Testing - `cargo test -p codex-config` - `cargo test -p codex-core-plugins` - `cargo test -p codex-core --lib plugin_mcp`	2026-04-29 15:40:03 -07:00
Eric Traut	4241df4d79	Escape turn metadata headers as ASCII JSON (#19620 ) ## Why `x-codex-turn-metadata` is sent as an HTTP/WebSocket header, but Codex was serializing the metadata JSON with raw UTF-8 string contents. When a workspace path contains non-ASCII characters, common HTTP stacks can reject or corrupt that header before the request reaches the provider. Fixes #17468. Also addresses the duplicate WebSocket report in #19581. ## What changed - Added `codex_utils_string::to_ascii_json_string`, a shared helper that serializes JSON normally while escaping non-ASCII string content as `\uXXXX`. - Switched turn metadata header serialization, including merged Responses API client metadata, to use the ASCII-safe JSON helper. - Added coverage for non-ASCII workspace paths and non-ASCII client metadata while preserving the same parsed JSON values. ## Verification - `cargo test -p codex-utils-string` - `cargo test -p codex-core turn_metadata` - `just bazel-lock-check`	2026-04-29 15:35:33 -07:00
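The escaping idea in 4241df4d79 can be sketched with the standard library alone. This is not the `codex_utils_string::to_ascii_json_string` helper, whose signature is not shown here; `escape_non_ascii` is a hypothetical function that post-processes text that has already been serialized as JSON.

```rust
/// Escape every non-ASCII character in already-serialized JSON as `\uXXXX`.
/// In valid JSON, non-ASCII characters can only occur inside string
/// literals, so rewriting them in place keeps the parsed value identical.
fn escape_non_ascii(json: &str) -> String {
    let mut out = String::with_capacity(json.len());
    let mut units = [0u16; 2];
    for ch in json.chars() {
        if ch.is_ascii() {
            out.push(ch);
        } else {
            // Characters outside the BMP become a UTF-16 surrogate pair.
            for unit in ch.encode_utf16(&mut units).iter() {
                out.push_str(&format!("\\u{:04x}", unit));
            }
        }
    }
    out
}

fn main() {
    // A workspace path with non-ASCII characters, as in the bug report.
    let header = escape_non_ascii(r#"{"cwd":"/home/josé/проект"}"#);
    assert!(header.is_ascii());
    println!("{header}");
}
```

Because the output is pure ASCII, it is safe to place in an HTTP or WebSocket header, and a JSON parser on the receiving side decodes the escapes back to the original strings.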
Michael Bolin	b1546008fc	docs: discourage `#[async_trait]` and `#[allow(async_fn_in_trait)]` (#20242 ) ## Why We have run into two avoidable problems when introducing async trait APIs in Rust: - `#[async_trait]` has caused materially worse build times in this repository. - `#[allow(async_fn_in_trait)]` makes it too easy to ship a public trait without spelling out whether the returned future is `Send`, which hides an important part of the trait contract. We already have a good example of the preferred alternative in [#16630](https://github.com/openai/codex/pull/16630) / [`3c7f013f9735`](https://github.com/openai/codex/commit/3c7f013f9735), but that guidance currently lives only as prior art in the codebase. This PR documents the rule in `AGENTS.md` so contributors are more likely to follow the native RPITIT pattern before these two shortcuts spread further. ## What Changed - added Rust guidance in `AGENTS.md` discouraging both `#[async_trait]` and `#[allow(async_fn_in_trait)]` - pointed contributors to the native RPITIT pattern with explicit `Send` bounds on the returned future - clarified that implementations may still use `async fn` when they satisfy that trait contract ## Verification - docs-only change; no tests run	2026-04-29 15:29:29 -07:00
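A minimal sketch of the pattern recommended in b1546008fc: the trait method returns `impl Future` with an explicit `Send` bound, and the implementation still uses `async fn`. The `TitleSource` trait and the tiny `block_on` executor are hypothetical and exist only so the example runs without a runtime crate; the wording in `AGENTS.md` is not reproduced here.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical trait: the `Send` bound on the returned future is part of
/// the signature instead of being hidden behind `async fn`.
trait TitleSource {
    fn title(&self, id: u32) -> impl Future<Output = String> + Send;
}

struct StaticTitles;

impl TitleSource for StaticTitles {
    // Implementations may still use `async fn` as long as the resulting
    // future satisfies the trait's `Send` bound.
    async fn title(&self, id: u32) -> String {
        format!("thread-{id}")
    }
}

/// Waker that does nothing; enough for futures that never actually suspend.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor for this sketch only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

/// Compiles only because the trait promises a `Send` future.
fn assert_send<T: Send>(value: T) -> T {
    value
}

fn main() {
    let source = StaticTitles;
    let fut = assert_send(source.title(7));
    println!("{}", block_on(fut));
}
```

Callers that are generic over `TitleSource` can rely on the `Send` bound, for example to move the future to another thread, without inspecting each implementation.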
Alex Daley	f63b19bedd	[apps] Add apps MCP path override (#20231 ) Summary - Add `[features.apps_mcp_path_override]` config with a `path` field for overriding only the built-in apps MCP path. - Keep existing host/base URL derivation unchanged and append the configured path after that base. - Regenerate the config schema with the custom feature-config case. Test Plan - Not run for latest revision; only `just fmt` and `just write-config-schema` were run. - Earlier revision: `cargo test -p codex-features` - Earlier revision: `cargo test -p codex-mcp`	2026-04-29 18:08:06 -04:00
xli-oai	8d5da3ffe5	Fallback login callback port when default is busy (#19334 ) ## Summary - Keep the preferred ChatGPT login callback port `1455` first. - Preserve the existing `/cancel` recovery for stale Codex login servers. - Fall back to the registered localhost callback port `1457` when `1455` remains unavailable. ## Why Cursor and Codex Desktop both use the ChatGPT account login callback server. On Windows, Cursor can already be listening on `127.0.0.1:1455` / `[::1]:1455`, causing Codex Desktop sign-in to fail with: `Local callback port 1455 is already in use on this machine.` Codex already attempted to cancel a stale Codex login server on that port, but if the listener does not release the port, the old behavior was to fail. The new behavior falls back to `1457`, which matches the fixed redirect URI being registered server-side in `openai/openai#863817`. This keeps the OAuth `redirect_uri` inside Hydra's exact allow-list instead of choosing an arbitrary ephemeral port. ## Validation - `just fmt` - `cargo test -p codex-login` - `git diff --check HEAD~1..HEAD`	2026-04-29 14:45:27 -07:00
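The fallback order in 8d5da3ffe5 (try `1455`, then `1457`) can be sketched with `std::net`. `bind_first_available` is a hypothetical helper, and the sketch leaves out the `/cancel` recovery step that the real login server performs before falling back.

```rust
use std::io;
use std::net::{Ipv4Addr, TcpListener};

/// Try each candidate port in order and return the first listener that binds.
/// The candidate list is fixed, so the resulting redirect URI stays within a
/// known set instead of using an arbitrary ephemeral port.
fn bind_first_available(ports: &[u16]) -> io::Result<TcpListener> {
    let mut last_err = io::Error::new(io::ErrorKind::InvalidInput, "no candidate ports");
    for &port in ports {
        match TcpListener::bind((Ipv4Addr::LOCALHOST, port)) {
            Ok(listener) => return Ok(listener),
            Err(err) => last_err = err,
        }
    }
    Err(last_err)
}

fn main() -> io::Result<()> {
    // 1455 is preferred; 1457 is the registered fallback named in the commit.
    let listener = bind_first_available(&[1455, 1457])?;
    println!("listening on {}", listener.local_addr()?);
    Ok(())
}
```

If every candidate is busy, the error from the last attempt is returned, so the caller can still report which port was unavailable.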
rhan-oai	72a39e3a96	[app-server] centralize client response analytics (#20059 ) ## Why The precursor PR keeps successful client responses typed until app-server's outgoing response seam. This follow-up uses that seam to move successful client-response analytics out of individual handlers and into the shared sender path, while keeping filtering decisions inside `codex-analytics`. ## What changed - Emit successful client-response analytics centrally from `OutgoingMessageSender::send_response`. - Remove duplicate handler-local response tracking for the current thread/turn lifecycle responses. - Keep analytics ingestion selective inside `AnalyticsEventsClient`, so unrelated client traffic is ignored before cloning or boxing. - Collapse client-response analytics facts onto one typed path and normalize payloads in the reducer. - Add direct client-filter coverage plus sender-level coverage for the centralized forwarding path. ## Verification - `cargo test -p codex-analytics` - `cargo test -p codex-app-server outgoing_message::tests --lib`	2026-04-29 21:22:39 +00:00
xli-oai	afbddabc8b	Require remote plugin detail before uninstall (#19966 ) ## Summary - Fetch remote plugin detail before sending the uninstall request. - Use the detail response to derive the marketplace namespace and plugin name for cache cleanup. - Stop the uninstall before the backend POST if detail lookup fails, so backend state and local cache state do not diverge. ## Testing - `just fmt` - `cargo test -p codex-app-server plugin_uninstall` - `cargo test -p codex-core-plugins` - `git diff --check`	2026-04-29 14:01:11 -07:00
rhan-oai	973c5c823e	[app-server] type client response payloads (#20050) ## Why `pr17088` adds typed server-originated request/response plumbing, but successful client responses are still erased into bare JSON-RPC `result` values before app-server can make any typed decision about them. This precursor PR keeps successful client responses typed until the outgoing response seam. It is intentionally limited to protocol/app-server plumbing so the analytics behavior change can be reviewed separately on top. ## What changed - Add `ClientResponsePayload` as the pre-serialization client response body type. - Route app-server successful response paths through the typed payload seam while preserving existing handler-local analytics behavior. - Keep `InterruptConversation` JSON-RPC-only because it has no `ClientResponse` variant. - Move the new payload conversion tests into a dedicated protocol test module. ## Verification - `cargo check -p codex-app-server` - `cargo test -p codex-app-server-protocol`	2026-04-29 20:50:47 +00:00
sayan-oai	b15074d0a4	app-server: fix outgoing sender test setup (#20258 ) ## Why [#17088](https://github.com/openai/codex/pull/17088) changed `OutgoingMessageSender::new` to require an `AnalyticsEventsClient`, but one `command_exec` test added earlier on `main` still called the old one-argument constructor. That leaves current `main` failing to compile in Bazel and argument-comment-lint jobs. ## What changed - Pass `AnalyticsEventsClient::disabled()` to the missed `OutgoingMessageSender::new` test call site in `command_exec.rs`. ## Verification - `cargo test -p codex-app-server timeout_or_cancellation_reports_cancellation_without_timeout_exit_code`	2026-04-29 20:47:20 +00:00
Matthew Zeng	8ce48f9968	[tool_suggest] Improve tool_suggest triggering conditions. (#20091 ) ## Summary - Tighten `tool_suggest` guidance so it prefers explicit plugin install requests, while still allowing a connector install when the relevant plugin is already installed and a needed connector from that plugin is missing. - Tell the model not to call `tool_suggest` in parallel with other tools. ## Testing - `cargo test -p codex-tools tool_suggest` - `cargo test -p codex-core tool_suggest`	2026-04-29 13:41:12 -07:00
rhan-oai	0690ab0842	[codex-analytics] ingest server requests and responses (#17088 ) ## Why Codex analytics needs a typed seam for app-server-originated request/response traffic so future tool-approval analytics can consume those facts without adding bespoke callsite tracking each time. Server responses arrive as JSON-RPC `id + result` payloads, so analytics has to reconstruct the matching typed response from the original typed request while that request context still exists in app-server. This also puts analytics on the app-server outbound path, which needs to avoid keeping the runtime alive during shutdown. The final ownership fix keeps the normal strong auth-manager retention in analytics and makes the external-auth refresh bridge hold a weak back-reference to `OutgoingMessageSender`, breaking the runtime cycle at the bridge boundary instead of exposing retention policy through the analytics client API. ## What changed - Adds typed `ServerRequest` and `ServerResponse` analytics facts, plus `AnalyticsEventsClient::track_server_request` and `track_server_response`. - Renames the existing client-side facts to `ClientRequest` and `ClientResponse` so reducers can distinguish client-to-server traffic from server-to-client traffic. - Adds `ServerRequest::response_from_result`, allowing a stored typed request to decode the matching typed server response from a raw JSON-RPC result payload. - Threads `AnalyticsEventsClient` through `OutgoingMessageSender` and records targeted server requests, replayed targeted requests, and matching targeted responses with the responding connection id needed for correlation. - Intentionally leaves broadcast server requests/responses out of analytics for now because the current model is per connection, while broadcasts fan one logical request out across multiple connections. 
- Breaks the app-server shutdown cycle by storing `Weak<OutgoingMessageSender>` in `ExternalAuthRefreshBridge` and upgrading it only when an external-auth refresh is actually requested. - Keeps reducer ingestion of the new server-side facts as no-ops for now; this PR is plumbing for later tool-approval analytics work. ## Verification - `cargo test -p codex-analytics` - `cargo test -p codex-app-server outgoing_message::tests::` - Covers typed-response reconstruction plus the targeted, replayed, broadcast-exclusion, and response-attribution analytics paths. ## Follow-up This PR intentionally stops at ingestion plumbing, so `ServerRequest` and `ServerResponse` facts are still reducer no-ops. Once a follow-up PR adds real downstream analytics output for those facts: - replace the temporary pre-reducer observation seam with reducer tests for the emitted event shape; - add end-to-end coverage in `app-server/tests/suite/v2/analytics.rs` for the real app-server workflow and captured analytics payload; - remove the temporary sender-level observer tests added here in favor of the real-output coverage above. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17088). * #18748 * #18747 * #17090 * #17089 * #20241 * #20239 * __->__ #17088	2026-04-29 19:56:41 +00:00
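The ownership fix in 0690ab0842 relies on a weak back-reference that is upgraded only on demand. A minimal sketch of that pattern follows; `Sender` and `Bridge` are hypothetical stand-ins, not the real `OutgoingMessageSender` or `ExternalAuthRefreshBridge`, and the direction of the strong reference is an assumption.

```rust
use std::sync::{Arc, Weak};

/// Hypothetical stand-in for the sender, which owns the bridge strongly.
struct Sender {
    bridge: Arc<Bridge>,
}

/// Hypothetical stand-in for the bridge, which only points back weakly.
struct Bridge {
    sender: Weak<Sender>,
}

impl Bridge {
    /// Upgrade the weak reference only when a refresh is requested.
    /// Returns false once the sender has been dropped.
    fn refresh(&self) -> bool {
        match self.sender.upgrade() {
            Some(_sender) => true, // a real bridge would send a request here
            None => false,
        }
    }
}

fn build() -> Arc<Sender> {
    // `new_cyclic` hands out the weak pointer before the Arc is complete.
    Arc::new_cyclic(|weak| Sender {
        bridge: Arc::new(Bridge { sender: weak.clone() }),
    })
}

fn main() {
    let sender = build();
    let bridge = Arc::clone(&sender.bridge);
    assert!(bridge.refresh());
    drop(sender); // no strong cycle, so the sender is actually freed
    assert!(!bridge.refresh());
}
```

Had `Bridge` held an `Arc<Sender>` instead, the two values would keep each other alive after the last external handle was dropped.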
iceweasel-oai	9d1e5df4b2	expand the set of core shell env vars for Windows. (#20089) https://github.com/openai/codex/issues/13917 and https://github.com/openai/codex/issues/18248 correctly identify that `[shell_environment_policy] inherit = "core"` is not functional on Windows because it carries an insufficient set of env vars. This PR expands that set to match the more functional one from the MCP client.	2026-04-29 19:23:46 +00:00
viyatb-oai	07c8b8c77c	fix: handle deferred network proxy denials (#19184 ) ## Why This bug is exposed by Guardian/auto-review approvals. With the managed network proxy enabled, a blocked network request can be reported back through the network approval service as an approval denial after the command has already started. Before this change, the shell and unified exec runtimes registered those network approval calls, but did not have a way to observe an async proxy denial as a cancellation/failure signal for the running process. The result was confusing: Guardian/auto-review could correctly deny network access, but the command path could keep running or unregister the approval without surfacing the denial as the command failure. ## What Changed - `NetworkApprovalService` now attaches a cancellation token to active and deferred network approvals. - Proxy-denial outcomes are recorded only for active registrations, cancel the owning token, and are consumed when the approval is finalized. - The shell runtime combines the normal command timeout with the network-denial cancellation token. - Unified exec stores the deferred network approval object, terminates tracked processes when the proxy denial arrives, and returns the denial as a process failure while polling or completing the process. - Tool orchestration passes the active network approval cancellation token into the sandbox attempt and preserves deferred approval errors instead of silently unregistering them. - App-server `command/exec` now handles the combined timeout-or-cancellation expiration variant used by the runtime. ## Verification - `cargo test -p codex-core network_approval --lib` - `cargo clippy -p codex-app-server --all-targets -- -D warnings` - `cargo clippy -p codex-core --all-targets -- -D warnings` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-29 19:13:57 +00:00
xl-openai	73cd831952	feat: Use remote installed plugin cache for skills and MCP (#20096 ) - Fetches and caches remote /installed plugin state - Lets skills/list load skills from remote-installed cached plugins without requiring a local marketplace entry - Routes plugin list/startup/install/uninstall changes through async plugin cache invalidation and MCP refresh	2026-04-29 12:09:49 -07:00
Won Park	5cf0adba93	Include auto-review rollout in feedback uploads (#20064) ## Summary - include the live auto-review trunk rollout when `/feedback` uploads logs - upload that attachment as `auto-review-rollout-<parent-thread-id>.jsonl` so it is distinguishable from the parent rollout - show the same auto-review attachment name in the TUI consent popup ## Scope - this only covers the live cached auto-review trunk for the current parent thread - it does not add durable historical parent->auto-review lookup - it does not add persisted rollout support for ephemeral parallel review forks ## UI Screenshot 2026-04-28 at 1 17 18 PM (image not included) ## Validation - `cargo test -p codex-core cached_guardian_subagent_exposes_its_rollout_path` - `cargo test -p codex-feedback` - `cargo test -p codex-app-server` - `cargo test -p codex-tui feedback_upload_consent_popup_snapshot` - `cargo test -p codex-tui feedback_good_result_consent_popup_includes_connectivity_diagnostics_filename` ## Known unrelated local failures - `cargo test -p codex-core` currently fails in the pre-existing proxy env snapshot test `tools::runtimes::tests::maybe_wrap_shell_lc_with_snapshot_keeps_user_proxy_env_when_proxy_inactive` - `cargo test -p codex-tui` currently hits pre-existing `status::*` snapshot drift unrelated to this change ## Follow-Up - persist parallel auto-review fork sessions so /feedback can include their rollout history too - attach each persisted fork as its own clearly named file, for example auto-review-rollout-<parent-thread-id>-fork <n>.jsonl, instead of merging multiple Guardian sessions into one attachment - keep the same live-session-only scope initially; durable historical parent -> auto-review lookup can remain a separate decision if we later need feedback from resumed sessions	2026-04-29 11:44:55 -07:00
friel-openai	05fd904572	test protocol: lock inter-agent commentary phase (#20046 ) ## Summary - add a regression test for `InterAgentCommunication::to_response_input_item` - assert replayed inter-agent messages keep `phase: Some(MessagePhase::Commentary)` ## Test plan - `cargo test -p codex-protocol` - `just argument-comment-lint`	2026-04-29 11:24:17 -07:00
pakrym-oai	8356806fc9	Add ThreadManager sample crate (#20141 ) Summary: - Add codex-thread-manager-sample, a one-shot binary that starts a ThreadManager thread, submits a prompt, and prints the final assistant output. - Pass ThreadStore into ThreadManager::new and expose thread_store_from_config for existing callsites. - Build the sample Config directly with only --model and prompt inputs. Verification: - just fmt - cargo check -p codex-thread-manager-sample -p codex-app-server -p codex-mcp-server - git diff --check Tests: Not run per request.	2026-04-29 11:21:06 -07:00
joeytrasatti-openai	47fba5df4a	[codex-backend] Prefer sqlite git info for rollout-path reads (#20228 ) ### Summary - Path-based local thread reads currently return rollout/session git metadata directly, so `thread/resume` can disagree with persisted SQLite metadata for the same thread. - Merge non-null SQLite git fields over rollout-path reads while keeping rollout values as fallbacks for fields SQLite does not know. - Add focused regression coverage for rollout-path reads so persisted branch updates are preserved during resume. ### Testing - `cargo test -p codex-thread-store`	2026-04-29 17:54:37 +00:00
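The merge rule in 47fba5df4a (non-null SQLite fields win, rollout values act as fallbacks) maps naturally onto `Option::or`. The `GitInfo` struct and its field names below are hypothetical; the real thread-store types are not shown in the commit message.

```rust
/// Hypothetical subset of per-thread git metadata.
#[derive(Debug, Default, PartialEq)]
struct GitInfo {
    branch: Option<String>,
    sha: Option<String>,
    origin_url: Option<String>,
}

/// Prefer each non-null SQLite field; fall back to the rollout value for
/// fields SQLite does not know.
fn merge_git_info(sqlite: GitInfo, rollout: GitInfo) -> GitInfo {
    GitInfo {
        branch: sqlite.branch.or(rollout.branch),
        sha: sqlite.sha.or(rollout.sha),
        origin_url: sqlite.origin_url.or(rollout.origin_url),
    }
}

fn main() {
    // SQLite has a newer branch name; the rollout file still knows the sha.
    let sqlite = GitInfo {
        branch: Some("feature/resume".into()),
        ..Default::default()
    };
    let rollout = GitInfo {
        branch: Some("main".into()),
        sha: Some("abc123".into()),
        origin_url: None,
    };
    println!("{:?}", merge_git_info(sqlite, rollout));
}
```

Merging field by field, rather than choosing one source wholesale, is what lets a persisted branch update survive a resume without losing data that only the rollout file carries.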
Eric Traut	d0204c3dcc	TUI: Remove core protocol dependency [3/7] (#20174 ) ## Why This is part 3 of a 7-PR stack to remove direct `codex_protocol::protocol` usage from `codex-tui` while keeping each layer reviewable and shippable. With `AppCommand` now explicit, the internal app event bus can carry TUI commands directly instead of bouncing through core `Op` values. ## What changed - Changed `AppEvent::CodexOp` and `AppEvent::SubmitThreadOp` to carry `AppCommand`. - Updated app-event senders and direct emitters to submit `AppCommand` values. - Adjusted tests to match `AppCommand` or convert back through `into_core()` where they intentionally assert legacy payload equality. ## Verification - `cargo test -p codex-tui --no-run`	2026-04-29 10:52:10 -07:00
Eric Traut	445629815c	TUI: Remove core protocol dependency [2/7] (#20173 ) ## Why This is part 2 of a 7-PR stack to remove direct `codex_protocol::protocol` usage from `codex-tui` while keeping each layer reviewable and shippable. Before the TUI event bus can stop carrying core `Op` values, `AppCommand` needs to be an owned TUI command shape rather than a thin wrapper around `Op`. ## What changed - Replaced the opaque `AppCommand(Op)` wrapper with explicit owned variants for the commands the TUI submits. - Preserved `into_core()` so this layer does not yet change the app/thread submission boundary. - Kept existing core leaf types for now so this remains a mechanical command-shape refactor. ## Verification - `cargo check -p codex-tui`	2026-04-29 10:28:04 -07:00
cassirer-openai	df966996a7	[rollout-tracer] Match analysis messages on encrypted id. (#20123) In some setups the summary or raw content can be dropped between requests. This trips a check in the reducer, which expects messages to remain identical between requests. This PR relaxes the check to compare only the encrypted ID. It also changes the reducer to keep the richest version of the message observed during the rollout, so the CoT or summary is not accidentally lost when available.	2026-04-29 17:22:24 +00:00
iceweasel-oai	cecca5ae06	Improve Windows process management edge cases (#19211 ) ## Summary Some improvements to Windows process-management issues from https://github.com/openai/codex/pull/15578 - bound the elevated runner pipe-connect handshake instead of waiting forever on blocking pipe connects - terminate the spawned runner if that handshake fails, so timeout/error paths do not leave a stray `codex-command-runner.exe` - loop on partial `WriteFile` results when forwarding stdin in the elevated runner, so input is not silently truncated - fix the concrete HANDLE/SID cleanup paths in the runner setup code - keep draining driver-backed stdout/stderr after exit until the backend closes, instead of dropping the tail after a fixed 200ms grace period - reuse `LocalSid` for SID ownership and add more explanatory comments around the ownership/concurrency-sensitive code paths ## Why The original PR fixed a lot of Windows session plumbing, but there were still a few sharp process-lifecycle edges: - some elevated runner handshakes could block forever - the new timeout path could still orphan the spawned runner process - stdin forwarding still assumed a single `WriteFile` consumed the whole buffer - a few raw HANDLE/SID error paths still leaked - driver-backed output could still lose the last chunk of stdout/stderr on slower backends ## Validation - `cargo fmt -p codex-windows-sandbox -p codex-utils-pty` - `cargo test -p codex-utils-pty` - `cargo test -p codex-windows-sandbox finish_driver_spawn` - `cargo test -p codex-windows-sandbox runner_` Ran a local test matrix of unified-exec and shell_tool tests, all passing	2026-04-29 10:00:01 -07:00
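The partial-write fix in cecca5ae06 concerns Win32 `WriteFile`, but the same loop can be shown portably with `std::io::Write`. This is an analogue under that assumption, not the runner code: `write_fully` and the `Trickle` test double are hypothetical, and the standard library's `write_all` already implements the same idea.

```rust
use std::io::{self, Write};

/// Keep writing until the whole buffer is consumed; a single `write` call
/// is allowed to accept only part of it.
fn write_fully<W: Write>(dest: &mut W, mut buf: &[u8]) -> io::Result<()> {
    while !buf.is_empty() {
        match dest.write(buf) {
            Ok(0) => return Err(io::ErrorKind::WriteZero.into()),
            Ok(n) => buf = &buf[n..],
            Err(err) if err.kind() == io::ErrorKind::Interrupted => continue,
            Err(err) => return Err(err),
        }
    }
    Ok(())
}

/// Test double that accepts at most three bytes per call, like a pipe
/// that is nearly full.
struct Trickle(Vec<u8>);

impl Write for Trickle {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = buf.len().min(3);
        self.0.extend_from_slice(&buf[..n]);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

fn main() -> io::Result<()> {
    let mut sink = Trickle(Vec::new());
    write_fully(&mut sink, b"stdin forwarded in pieces")?;
    assert_eq!(sink.0, b"stdin forwarded in pieces".to_vec());
    Ok(())
}
```

Calling `write` once and ignoring the returned count would leave only the first three bytes in `sink`, which is the silent truncation the commit removes.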
Eric Traut	1c420a90cd	TUI: Remove core protocol dependency [1/7] (#20172 ) ## Why This is part 1 of a 7-PR stack to remove direct `codex_protocol::protocol` usage from `codex-tui` while keeping each layer reviewable and shippable. This first layer reduces the size of the later `chatwidget` diff by mechanically moving MCP startup bookkeeping out of the central widget file without changing the event shapes or behavior. ## What changed - Extracted MCP startup status handling into `tui/src/chatwidget/mcp_startup.rs`. - Kept the existing core event types in place for this purely mechanical move. - Updated the MCP startup tests to import the moved test-only event types directly. ## Verification - `cargo test -p codex-tui chatwidget::tests::mcp_startup`	2026-04-29 09:10:22 -07:00
Eric Traut	91ca551df8	Use /goal resume for paused goals (#20082 ) ## Why The paused goal statusline currently points users at `/goal` to unpause a goal, but bare `/goal` is the summary command and does not change the goal state. Instead of making `/goal` mutate state only when a goal is paused, this gives the action an explicit command that reads naturally in the UI. ## What Changed - Replace `/goal unpause` with `/goal resume` for reactivating a paused goal. - Update the paused goal statusline and `/goal` summary copy to point at `/goal resume`.	2026-04-29 08:56:02 -07:00
jif-oai	70ac0f123c	Make multi-agent v2 ignore agents.max_depth (#20180 ) ## Why `agents.max_depth` is a legacy multi-agent v1 guard. Multi-agent v2 uses task-path routing and its own session/thread limits, so v2 should not reject nested `spawn_agent` calls just because the thread-spawn depth has reached the v1 maximum. Keeping the v1 depth guard active in v2 prevents deeper task trees even though the v2 path still needs the depth value only for lineage and task-path metadata. ## What Changed - Removed the depth-limit rejection from the multi-agent v2 `spawn_agent` handler while still computing child depth for lineage/path metadata. - Made the depth-based disabling of legacy `SpawnCsv`/`Collab` tools apply only when `Feature::MultiAgentV2` is disabled. - Added `multi_agent_v2_spawn_agent_ignores_configured_max_depth` to cover a v2 child spawning another agent when `agent_max_depth = 1`, while the existing v1 depth-limit tests continue to enforce the legacy behavior. ## Verification - `cargo test -p codex-core multi_agent_v2_spawn_agent_ignores_configured_max_depth -- --nocapture` - `cargo test -p codex-core depth_limit -- --nocapture` - `cargo test -p codex-core tools::handlers::multi_agents::tests -- --nocapture`	2026-04-29 12:23:00 +02:00
jif-oai	c41b74c453	nit: drop old memories things (#20186 ) Drop legacy code	2026-04-29 12:19:50 +02:00
iceweasel-oai	5cac3f896d	Fix Windows pseudoconsole attribute handling for sandboxed PTY sessions (#20042 ) ## Summary Fix the Windows sandbox PTY spawn path to pass the pseudoconsole handle value directly into `UpdateProcThreadAttribute`. ## Why Sandboxed `unified_exec` PTY sessions on Windows were failing during child process startup with `0xc0000142` (`STATUS_DLL_INIT_FAILED`). In practice this showed up as PowerShell DLL init popups when the sandboxed background-terminal path tried to launch an interactive shell. The root cause was that we were passing a pointer to a local `isize` variable instead of the pseudoconsole handle value in the form Windows expects for `PROC_THREAD_ATTRIBUTE_PSEUDOCONSOLE`. ## Validation - `cargo build -p codex-windows-sandbox --bins` - Reproduced the real sandboxed `codex exec` flow with `windows.sandbox_private_desktop=true` - Verified a `tty=true` interactive session launched through the normal PowerShell wrapper, printed `READY`, accepted follow-up stdin, and exited cleanly - Confirmed no new `0xc0000142` / `Application Popup` events appeared after the successful repro	2026-04-29 11:59:45 +02:00
alexsong-oai	d92c909ee4	Fix migrated hook path rewriting (#20144 ) ## Summary - Rewrite migrated external-agent hook commands by replacing the full hook script path token instead of only the `.claude/hooks/` segment. - Preserve quoting around the full rewritten target path so script names with spaces, absolute paths, and shell operators/redirection continue to work. - Apply `.claude/settings.local.json` over `.claude/settings.json` for config, MCP, and plugin migration so local scope matches Claude settings precedence. - Skip legacy command markdown without `description` frontmatter, including README-style docs under `.claude/commands`. ## Root Cause The previous hook rewrite handled `.claude/hooks/` as a substring replacement. For absolute source commands, that left the original project-root prefix before the newly quoted `.codex/hooks` directory, producing invalid commands like `project/'project/.codex/hooks'/script.sh`. The migration also only used project `settings.json` for config/MCP/plugin decisions, so local settings such as `disabledMcpjsonServers` could be ignored even though Claude gives local settings higher precedence than project settings. ## Validation - `just fmt` - `cargo test -p codex-external-agent-migration` - `cargo test -p codex-app-server external_agent_config` - `just fix -p codex-external-agent-migration` - `just fix -p codex-app-server` - `git diff --check`	2026-04-29 00:46:11 -07:00
viyatb-oai	5597925155	feat(cli): add sandbox profile config controls (#20118 ) ## Why The explicit profile path from #20117 is meant for standalone testing, but it still inherited the shell cwd and all managed requirements implicitly. The pre-existing launcher path even called out that it did not support a separate cwd yet in [`debug_sandbox.rs`](`509453f688/codex-rs/cli/src/debug_sandbox.rs (L174-L179)`). For a standalone command, the useful default is to let the caller choose the project directory being tested and to avoid administrator-provided constraints unless the caller explicitly wants to test those too. ## What changed - Add explicit-profile-only `-C/--cd DIR`, and use that cwd for both profile resolution and command execution. - Add explicit-profile-only `--include-managed-config`. - Make explicit profile mode skip managed requirement sources by default, including cloud requirements, MDM requirements, `/etc/codex/requirements.toml`, and the legacy managed-config requirements projection. - Preserve all existing invocations outside the explicit-profile path. ## Stack 1. #20117 `sandbox-ui-profile` 2. #20118 `sandbox-ui-config` --> this PR Both PRs are additive. Replay JSON is intentionally deferred to a follow-up design pass. ## Tests ran - `cargo test -p codex-cli debug_sandbox` - `cargo test -p codex-cli sandbox_macos_` - `cargo test -p codex-core load_config_layers_can_ignore_managed_requirements` - `cargo test -p codex-core load_config_layers_includes_cloud_requirements` - macOS branch-binary smoke on the rebased top of stack: `-C` changed execution cwd, explicit profile mode omitted managed proxy env under `env -i`, and `--include-managed-config` restored it. - Linux devbox branch-binary smoke on the rebased top of stack: `-C` changed execution cwd for built-in and user-defined explicit profiles.	2026-04-29 06:55:51 +00:00
Andrey Mishchenko	857146b328	Delete multi_agent_v2 followup_task interrupt parameter (#20139 ) Messages sent with `followup_task` already arrive at their target recipient promptly (at message boundaries while sampling, or after the pending tool call completes) -- having `interrupt` is not worth the added complexity.	2026-04-28 23:19:48 -07:00
viyatb-oai	6ed0440611	feat(cli): add explicit sandbox permission profiles (#20117 ) ## Why `codex sandbox` is useful for exercising sandbox behavior directly, but before this stack the CLI only picked up permission profiles indirectly from the active config. The existing debug-sandbox path already compiled `[permissions]` profiles through normal config loading, as covered by the existing profile tests in [`debug_sandbox.rs`](`de2ccf9473/codex-rs/cli/src/debug_sandbox.rs (L715-L760)`). This adds the smallest stable entry point first: an explicit profile selector that reuses the same config machinery as normal Codex config, so standalone testing becomes possible without changing current no-selector behavior. ## What changed - Add additive `--permissions-profile NAME` support to `codex sandbox macos|linux|windows`. - Resolve built-in and user-defined profile names by feeding `default_permissions` through the existing config compilation path instead of inventing a sandbox-only parser. - Make an explicit selector win over an ambient active profile's legacy `sandbox_mode`. - Keep the existing no-selector behavior unchanged. ## Stack 1. #20117 `sandbox-ui-profile` --> this PR 2. #20118 `sandbox-ui-config` Both PRs are additive. Replay JSON is intentionally deferred to a follow-up design pass. ## Tests ran - `cargo test -p codex-cli debug_sandbox` - `cargo test -p codex-cli sandbox_macos_parses_permissions_profile` - `cargo test -p codex-core cli_override_takes_precedence_over_profile_sandbox_mode` - macOS branch-binary smoke on the rebased top of stack: built-in `:workspace` and user-defined profiles both executed successfully through `--permissions-profile`. - Linux devbox branch-binary smoke on the rebased top of stack: built-in `:workspace` and user-defined profiles both executed successfully through `--permissions-profile`.	2026-04-29 06:18:16 +00:00
Dylan Hurd	3d10ba9f36	chore(cli) deprecate --full-auto (#20133 ) ## Summary Starts the process of getting rid of `--full-auto`, with some concessions: 1. Fully removes the command from the tui, since it just resolves to the default permissions there, and encourages users to use the one-time trust flow if they're not in a trusted repo. 2. Marks the command as deprecated in `codex exec`, in case users are actively relying on this. We'll remove in an upcoming n+X release. 3. Cleans up some of the `codex sandbox` cli logic, to keep supporting legacy sandbox policies for now. This isn't the cleanest setup, but I think it is worthwhile to warn users for one release before hard-removing it. ## Testing - [x] Updated unit tests	2026-04-29 04:41:30 +00:00
starr-openai	e1ec9e63a0	Add environment provider snapshot (#20058 ) ## Summary - Change `EnvironmentProvider` to return concrete `Environment` instances instead of `EnvironmentConfigurations`. - Make `DefaultEnvironmentProvider` provide the provider-visible `local` environment plus optional `remote` environment from `CODEX_EXEC_SERVER_URL`. - Keep `EnvironmentManager` as the concrete cache while exposing its own explicit local environment for `local_environment()` fallback paths. ## Validation - `just fmt` - `git diff --check` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-28 20:05:18 -07:00
xl-openai	6f328d5e02	Soften skill description budget warnings (#20112 ) Updates skill description budget messaging to be less alarming	2026-04-28 19:56:25 -07:00
Michael Bolin	e6db1a9442	linux-sandbox: switch helper plumbing to PermissionProfile (#20106 ) ## Why `PermissionProfile` is the canonical runtime permission model in the Rust workspace, but the Linux sandbox helper still accepted a legacy `SandboxPolicy` plus separate filesystem and network policy flags. That translation layer made the helper interface harder to reason about and left `linux-sandbox`-specific callers and tests coupled to the legacy policy representation. This change moves the helper onto `PermissionProfile` directly so the Linux sandbox plumbing matches the rest of the permission stack. ## What changed - changed `codex-linux-sandbox` to accept `--permission-profile` and derive the runtime filesystem and network policies internally - updated the in-process seccomp and legacy Landlock path in `codex-rs/linux-sandbox` to operate on `PermissionProfile` - updated Linux sandbox argv construction in `codex-rs/sandboxing`, `codex-rs/core`, and the CLI debug sandbox path to pass the canonical profile instead of serializing compatibility policy projections - simplified the Linux sandbox tests to build the exact permission profile under test, including the managed-proxy path and direct-runtime-enforcement carveout coverage - removed helper-local `SandboxPolicy` usage from `bwrap` tests where `FileSystemSandboxPolicy` is already the value being exercised ## Testing - `cargo test -p codex-sandboxing` - `cargo test -p codex-linux-sandbox` (on this macOS host, the crate compiled cleanly and its Linux-only tests were cfg-gated) - `cargo test -p codex-core --no-run` - `cargo test -p codex-cli --no-run`	2026-04-28 19:43:44 -07:00
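
The stdin-forwarding fix in cecca5ae06 (#19211) above loops on partial `WriteFile` results so that input is not silently truncated. A minimal, portable sketch of that idea follows, assuming Rust's `std::io::Write` trait in place of the Win32 call; `write_fully` and `ChunkedWriter` are hypothetical names for illustration and are not taken from the codex source.

```rust
use std::io::{self, Write};

/// Writes all of `buf`, looping when a single `write` call accepts only part of it.
/// Hypothetical helper: an illustration of the partial-write loop, not the runner's code.
fn write_fully<W: Write>(writer: &mut W, mut buf: &[u8]) -> io::Result<()> {
    while !buf.is_empty() {
        match writer.write(buf) {
            // A zero-length write on a non-empty buffer would loop forever, so fail instead.
            Ok(0) => {
                return Err(io::Error::new(
                    io::ErrorKind::WriteZero,
                    "writer accepted no bytes",
                ))
            }
            // Advance past the bytes that were actually consumed and try again.
            Ok(n) => buf = &buf[n..],
            // Interrupted calls are retried with the same remaining buffer.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(())
}

/// Hypothetical writer that accepts at most `chunk` bytes per call,
/// simulating a pipe that returns partial writes.
struct ChunkedWriter {
    chunk: usize,
    data: Vec<u8>,
}

impl Write for ChunkedWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = buf.len().min(self.chunk);
        self.data.extend_from_slice(&buf[..n]);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

In ordinary Rust code `Write::write_all` already provides this loop; it is spelled out here only to show why a single write call cannot be assumed to consume the whole buffer.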

6021 Commits