codex

mirror of https://github.com/openai/codex.git synced 2026-06-02 19:31:59 +00:00

Author	SHA1	Message	Date
Adam Perry @ OpenAI	a29a5b0861	[codex] document out-of-line test module convention (#25682) ## Why New unit test modules should follow one consistent layout so implementation files stay focused and test suites remain easy to locate, without creating cleanup churn in existing inline test modules. ## What changed - Added `AGENTS.md` guidance requiring new test modules to use separate sibling `*_tests.rs` files with an explicit `#[path = "..._tests.rs"]` attribute. - Clarified that existing inline `#[cfg(test)] mod tests { ... }` modules should not be moved solely to follow the new convention. ## Validation - Ran `git diff --check`.	2026-06-01 13:36:16 -07:00
jif-oai	9f4fac8ec4	Add rollout compression counters (#25679 ) ## Summary Add counter telemetry for the local rollout compression worker so we can see when it runs, why it skips, and how individual file/materialization paths resolve. ## Changes - Emit `codex.rollout_compression.run` with statuses for start, completion, failure, duplicate-run skip, and missing runtime skip. - Emit `codex.rollout_compression.file` outcomes for scanned, compressed, skipped, and failed compression candidates. - Emit `codex.rollout_compression.temp_cleanup` and `codex.rollout_compression.materialize` counters for cleanup and decompression paths. ## Validation - `just fmt` - `just test -p codex-rollout` - `just fix -p codex-rollout`	2026-06-01 22:26:32 +02:00
Michael Bolin	feb9eddc51	refactor: hide shell override for zsh fork unified exec (#24980 ) ## Why When unified exec is configured to launch through the zsh fork, local commands should not let the model override the shell binary with the `shell` parameter. The configured zsh fork is the mechanism that makes `execv(2)` interception reliable, so exposing `shell` for local zsh-fork execution would create a confusing API surface and undermine the composition. Remote environments are different: zsh-fork interception is local-only, so remote unified-exec calls must keep direct unified-exec behavior and still expose `shell` when a remote environment can be selected. ## What Changed - Taught the `exec_command` schema builder to omit the `shell` parameter when requested. - Hid `shell` from the unified-exec tool schema only when zsh-fork unified exec applies to all selectable environments. - Kept `shell` visible when any remote environment can be targeted, because those calls run through direct unified exec. - Made unified exec choose the effective shell mode per selected environment: local environments keep zsh-fork mode, remote environments use direct mode. - Left direct unified-exec behavior unchanged, including support for model-specified shells there. ## Verification - Added schema coverage showing `exec_command` can hide `shell`. - Added planner coverage showing zsh-fork unified exec hides `shell` for local-only execution while direct unified exec still exposes it. - Added planner coverage showing `shell` remains visible when a remote environment is available. - Added handler coverage showing remote environments use direct unified-exec shell mode instead of zsh-fork mode. - Ran the focused `codex-core` shell-parameter and zsh-fork tests. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/24980). * #24982 * #24981 * __->__ #24980	2026-06-01 20:22:28 +00:00
Michael Bolin	d6748f741a	feat: gate unified exec zsh fork composition (#24979 ) ## Why `shell_zsh_fork` and unified exec need to remain independently controllable for enterprise rollouts, but we also need a third mode that composes them. That composed mode is intended to preserve unified exec command lifecycle support while letting the zsh fork provide more accurate `execv(2)` interception. Enabling `unified_exec_zsh_fork` by itself is intentionally not sufficient. It is a composition gate, not a dependency-enabling shortcut: - `unified_exec` selects the PTY-backed unified exec tool. - `shell_zsh_fork` opts into the zsh fork backend. - `unified_exec_zsh_fork` only allows those two already-enabled modes to be composed so local zsh unified exec commands can launch through the zsh fork. This separation is deliberate. Enterprises and staged rollouts must be able to enable or disable unified exec and zsh-fork independently. If `unified_exec_zsh_fork` implied either dependency, then enabling one under-development composition flag would silently activate a shell backend that the configured feature set left disabled. This PR introduces only the configuration and planning gate for that composition. Existing `shell_zsh_fork` behavior continues to use the standalone shell tool unless the new composition feature is explicitly enabled alongside both dependencies. ## What Changed - Added the under-development feature flag `unified_exec_zsh_fork`. - Added `UnifiedExecFeatureMode` so the three input feature flags collapse into `Disabled`, `Direct`, or `ZshFork` mode before tool planning. - Updated tool selection so zsh-fork composition requires `unified_exec`, `shell_zsh_fork`, and `unified_exec_zsh_fork`. - Kept the existing standalone zsh-fork shell tool behavior when only `shell_zsh_fork` is enabled. - Updated config schema output for the new feature flag. ## Verification - Added feature and tool-config coverage for the new gate. 
- Added planner coverage proving `shell_zsh_fork` remains standalone until composition is explicitly enabled. - Ran focused tests for `codex-features`, `codex-tools`, and the affected `codex-core` planner case. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/24979). * #24982 * #24981 * #24980 * __->__ #24979	2026-06-01 13:01:36 -07:00
jif-oai	009e6c4817	fix: deflake zsh-fork approval test (#25669) Fixes this flake: https://github.com/openai/codex/actions/runs/26773809591/job/78919970410?pr=25659 This test is about zsh-fork subcommand approval behavior, not workspace sandboxing, so it now runs with `DangerFullAccess` to avoid macOS sandbox setup failures before the second subcommand approval.	2026-06-01 21:55:44 +02:00
starr-openai	53ac02356e	exec-server: canonicalize bound filesystem paths (#25149 ) ## Summary - add executor filesystem canonicalization as a bound-path operation - route remote canonicalization through the exec-server filesystem RPC surface - keep path normalization attached to the filesystem that owns the path ## Stack - 2/5 in the skills path authority stack extracted from https://github.com/openai/codex/pull/25098 - follows merged https://github.com/openai/codex/pull/25121 ## Validation - `cd /Users/starr/code/codex-worktrees/pr-25098-restack-review-pr1b/codex-rs && just fmt` - Not run: tests/checks (not requested) - GitHub CI pending on rewritten head	2026-06-01 11:53:31 -07:00
Won Park	f1609d9fb6	[codex-rs] auto-review model override (#23767 ) ## Why Guardian auto-review normally uses the provider-preferred review model when one is available. Some parent models need model-catalog metadata to select a different review model while keeping older `/models` payloads compatible when that metadata is absent. ## What changed - Added optional `ModelInfo::auto_review_model_override` metadata to the public model payload as a review-model slug. - Updated Guardian review model selection to prefer the catalog override when present, while preserving the existing provider preferred-model path and parent-model fallback when it is omitted. - Added focused Guardian coverage for override and no-override model selection. - Added an `auto_review` core integration suite test that loads override metadata from a remote model catalog path and asserts the strict auto-review `/responses` request uses the catalog-selected review model. - Updated existing `ModelInfo` fixtures and local catalog constructors for the new optional field. ## Validation - `cargo test -p codex-protocol model_info_defaults_availability_nux_to_none_when_omitted` - `cargo test -p codex-core guardian_review_uses_` - `cargo test -p codex-core remote_model_override_uses_catalog_model_for_strict_auto_review --test all` - `just fix -p codex-protocol` - `just fix -p codex-core` - `just fmt` - `git diff --check`	2026-06-01 11:51:15 -07:00
Adam Perry @ OpenAI	281b416c44	Check root Python script formatting in CI (#25165 ) ## Why Python files under `scripts/` were not covered by the repository formatting recipe or the CI formatting job, so formatting drift could merge unnoticed. ## What - Add a dedicated `scripts/pyproject.toml` and `scripts/uv.lock` so root-script formatting uses a locked Ruff version. - Extend `just fmt` to format root Python scripts and add `fmt-scripts-check` for CI. - Run `just fmt-scripts-check` from `.github/workflows/ci.yml`, installing `uv` through SHA-pinned `astral-sh/setup-uv` while retaining the `uv` `0.11.3` pin. - Apply Ruff formatting to the root Python scripts, including `scripts/just-shell.py`, and extend `sdk/python/tests/test_artifact_workflow_and_binaries.py` to cover the root formatting recipe. - Update `AGENTS.md` so agents run `just fmt` after code changes anywhere in the repository. ## Validation - Extended the existing Python SDK workflow test to assert that `just fmt` includes root Python scripts.	2026-06-01 18:50:23 +00:00
jif-oai	c3cdf3c007	Throttle repeated rollout compression runs (#25659 ) ## Why [#25089](https://github.com/openai/codex/pull/25089) introduced the background worker that compresses cold archived rollouts, and [#25654](https://github.com/openai/codex/pull/25654) made that pass faster once it starts. But the worker still deleted `rollout-compression.lock` on successful exit, so the existing six-hour staleness window only helped with overlapping or crashed workers. Each new local thread-store initialization could immediately rescan archived rollouts even if a full pass had just finished. This change keeps the existing marker around long enough to throttle redundant reruns. The worker is still best-effort, but it no longer does repeated startup scans when nothing new is eligible for compression. ## What Changed - Replace the drop-scoped `CompressionLock` with a `CompressionRunMarker` that claims the existing `.tmp/rollout-compression.lock` path and leaves it in place after success. - Reuse the existing six-hour staleness window to block both overlapping starts and immediate reruns, while still letting a stale marker be reclaimed. - Update the worker docs and debug logging to describe the new "already running or recently ran" behavior. - Extend the rollout compression tests to assert that a successful run leaves the marker behind and that a fresh marker suppresses a new run. ## Validation - `just test -p codex-rollout`	2026-06-01 20:46:54 +02:00
Adam Perry @ OpenAI	ba2b67f9cd	[codex] Consolidate shared prompts in codex-prompts (#25151 ) ## Why `codex_core` is consistently a bottleneck for incremental builds during iteration. The simplest fix is to make the crate smaller. ## Summary `codex-core` owns several reusable prompt renderers and static prompt assets, which makes the crate harder to split apart. Rename `codex-review-prompts` to `codex-prompts` and move shared review, goal, permissions, compaction, realtime, hierarchical AGENTS.md, and `apply_patch` prompts into it. Move prompt-only tests and update consumers and `CODEOWNERS`. ## Validation - `just test -p codex-prompts -p codex-apply-patch` - `just test -p codex-core prompt_caching` - Bazel builds for the affected crates	2026-06-01 18:45:07 +00:00
iceweasel-oai	88c7a4ff07	[codex] Make justfile recipes Windows-aware (#24983 ) ## Summary Make the root `justfile` usable from Windows without maintaining a separate Windows copy of most recipes. The repo recipes previously assumed POSIX shell behavior for things like variadic argument forwarding (`"$@"`) and stderr redirection (`2>/dev/null`). That made common workflows such as `just fmt`, `just test`, and `just log` unreliable from Windows. This PR introduces a small cross-platform shell adapter so recipes can stay mostly unified while still expanding the few shell-specific constructs correctly on macOS/Linux and Windows. ## What Changed - Add `scripts/just-shell.py` as the configured `just` shell adapter. - On Unix it invokes `sh -cu`. - On Windows it invokes `pwsh -CommandWithArgs` so arguments containing spaces are preserved. - Add portable recipe placeholders: - `{args}` expands to `"$@"` on Unix and the equivalent PowerShell forwarded-args expression on Windows. - `{stderr-null}` expands to the platform-specific stderr suppression used by `fmt`. - Convert most variadic one-line recipes to the unified `{args}` form, including `codex`, `exec`, `file-search`, `app-server-test-client`, `fix`, `clippy`, `bench`, `mcp-server-run`, `write-app-server-schema`, and `argument-comment-lint-from-source`. - Keep genuinely shell-specific recipes split or Unix-only for now, including recipes backed by `.sh` scripts or recipes whose bodies are more than simple command forwarding. - Add a Windows `just install` path that installs PowerShell via `winget` when `pwsh` is not available, then runs the same basic Rust setup steps. - Update the SDK test that validates the root `fmt` recipe so it recognizes the new portable stderr placeholder. 
## Validation - `just --summary` - `just --dry-run fmt` - `just --dry-run bench-smoke` - `just --dry-run codex foo "bar binky" baz` - `just --dry-run write-hooks-schema` - `just --dry-run bazel-lock-update` - `just --dry-run argument-comment-lint-from-source -- "foo bar"` - `git diff --check -- justfile scripts/just-shell.py sdk/python/tests/test_artifact_workflow_and_binaries.py` - Verified Windows argv preservation through `scripts/just-shell.py` with arguments containing spaces. - `uv run --frozen --project sdk/python --extra dev pytest sdk/python/tests/test_artifact_workflow_and_binaries.py::test_root_fmt_recipe_formats_rust_and_python_sdk`	2026-06-01 11:26:36 -07:00
charlesgong-openai	9756316d89	Preserve plugin app manifest order (#25491 ) ## Summary - Preserve app declaration order when loading plugin .app.json files. - Keep plugin connector summaries in plugin app order after connector metadata is merged and filtered. - Add regression coverage for .app.json order and connector summary order. ## Validation - just fmt - just test -p codex-chatgpt connectors_for_plugin_apps_returns_only_requested_plugin_apps - just test -p codex-core-plugins effective_apps_preserves_app_config_order - just fix -p codex-core-plugins (passes with existing clippy large_enum_variant warning in core-plugins/src/manifest.rs) - just fix -p codex-chatgpt - just bazel-lock-update - just bazel-lock-check	2026-06-01 11:04:21 -07:00
jif-oai	6ddb747e76	[codex] Rename multi-agent v2 assign_task to followup_task (#25636 ) ## Summary Renames the MultiAgentV2 turn-triggering tool from `assign_task` to `followup_task` so the exposed tool name better describes sending an additional task to an existing agent. This updates the tool spec, handler/module names, registry wiring, default multi-agent v2 usage hints, and tests. Rollout trace classification keeps accepting legacy `assign_task` events so older traces still reduce correctly, while docs show the new tool name. ## Test plan - `just test -p codex-core followup_task` - `just test -p codex-core -E 'test(multi_agent_feature_selects_one_agent_tool_family) \| test(multi_agent_v2_can_use_configured_tool_namespace) \| test(code_mode_only_can_expose_namespaced_multi_agent_v2_as_normal_tools)'` - `just test -p codex-rollout-trace` - `just fix -p codex-core` - `just fix -p codex-rollout-trace` Notes: `just fmt` ran `cargo fmt` but failed in the Python ruff phase because the local environment could not resolve `hatchling>=1.27.0` from the configured internal registry. A full `just test -p codex-core` also hit unrelated environment-sensitive integration failures involving missing spawned test binaries/sandbox behavior; the changed multi-agent spec/handler tests passed in the filtered runs above.	2026-06-01 19:57:11 +02:00
starr-openai	fb94703b21	exec-server: add environment path refs (#25121) ## Summary - add public `codex_exec_server::EnvironmentPathRef` - bind an absolute path to its owning executor filesystem - keep path operations in the next review slice ## Stack - 1/5 in the skills path authority stack extracted from https://github.com/openai/codex/pull/25098 ## Validation - `cd /Users/starr/code/codex-worktrees/pr-25098-restack4/codex-rs && just fmt` - GitHub CI pending on rewritten head	2026-06-01 10:55:52 -07:00
jif-oai	917a9a41a3	Parallelize cold rollout compression (#25654 ) ## Why [#25089](https://github.com/openai/codex/pull/25089) added the background worker for compressing cold archived rollouts, but the worker still processed files effectively one at a time: each compression job was sent to `spawn_blocking` and then awaited before the next file started. On machines with a backlog of archived rollouts, that makes catch-up slower than it needs to be even though the actual compression work already runs off the async runtime. ## What Changed - Queue rollout compression work in a `JoinSet` while directory traversal continues. - Cap the worker at two in-flight compression jobs so it can overlap compression without turning the background task into unbounded blocking work. - Drain pending jobs before returning, including the `read_dir.next_entry()` error path, so every launched job still contributes to the final `compressed`, `skipped`, and `failed` stats. - Treat task join failures the same way as compression failures in the worker's warning and failure accounting.	2026-06-01 19:54:52 +02:00
jif-oai	e6eb462f07	nit: drop todo (#25655)	2026-06-01 19:48:29 +02:00
Shijie Rao	795031621d	[codex] Use git CLI for release Cargo fetches (#25644 ) ## Summary - Configure the rust-release build job with `CARGO_NET_GIT_FETCH_WITH_CLI=true` - Document the macOS SecureTransport/libgit2 failure mode that hit the `libwebrtc`/`libyuv` git submodule fetch ## Root cause The release run at https://github.com/openai/codex/actions/runs/26717498860/job/78745156683 repeatedly failed before compilation because Cargo's libgit2 fetch path could not clone the nested `yuv-sys/libyuv` submodule from `chromium.googlesource.com`, ending with `SecureTransport error: connection closed via error`. ## Validation - `git diff --check` This is a workflow-only change, so I did not run Rust package tests.	2026-06-01 10:34:12 -07:00
Vivian Fang	2bf1c986f9	[codex] Inherit raw events for spawned child listeners (#25603)	2026-06-01 10:13:56 -07:00
Eric Traut	8b759b9c18	Disable SQLite intrinsics for Windows x64 releases (#25490 ) ## Why Codex 0.135.0 started shipping bundled SQLite 3.51.x via SQLx 0.9.0 to avoid the older WAL corruption bug fixed by #24728. On Windows x64, #25367 reports an immediate `STATUS_ILLEGAL_INSTRUCTION` crash on a Haswell CPU when starting normal Codex paths. Rather than downgrading SQLite, this keeps the newer bundled SQLite source and removes SQLite compiler-intrinsic code paths from the Windows x64 release build. ## What changed For `x86_64-pc-windows-msvc` release builds, export `LIBSQLITE3_FLAGS=SQLITE_DISABLE_INTRINSIC` before `cargo build` in: - `.github/workflows/rust-release.yml` - `.github/workflows/rust-release-windows.yml` Other targets keep their current SQLite build flags. ## Verification - `git diff --check`	2026-06-01 09:49:55 -07:00
jif-oai	01cb97851b	Compress cold local rollouts (#25089 ) ## Rollout compression stack This stack splits #24941 into reviewable steps for local rollout compression. The design is intentionally staged: 1. Teach readers, listing, search, and lookup to understand compressed rollouts. 2. Make append and resume paths materialize compressed rollouts back to plain JSONL before writing. 3. Add a disabled-by-default worker that can compress cold archived rollouts behind `local_thread_store_compression`. The key invariant is that writers append to plain `.jsonl`. A `.jsonl.zst` file is a cold/read representation; if a write is needed, the compressed file is materialized back to plain JSONL first. Readers prefer plain `.jsonl` when both forms exist and can fall back to the compressed sibling during transitions. The worker is deliberately the last PR and remains behind an under-development feature flag. It currently scans only `archived_sessions`, not active `sessions`, because active sessions have the highest resume/append race risk. That means this stack does not yet compress most unarchived local history. ## Known race / follow-up The remaining unresolved design question is writer/compressor coordination. Even for archived rollouts, a resume or metadata update can append while the worker is replacing the plain file with `.jsonl.zst`; the current double-stat checks narrow but do not fully eliminate the window where a writer has opened the plain file before unlink. Do not treat the worker PR as production-ready until we either: - prevent append/resume paths from racing archived compression, or - introduce a shared representation/append lock or equivalent coordination. The first two PRs are useful independently: they make compressed rollouts readable and make append paths safely recover back to plain JSONL. The third PR isolates the worker behavior so that coordination issue is reviewable separately. 
## Validation Focused local validation for the stack includes: - `just test -p codex-rollout` - `just test -p codex-thread-store` where thread-store paths were touched - `just test -p codex-features` for the feature flag slice - `just bazel-lock-check` after dependency graph changes - scoped `just fix -p ...` passes for changed crates CI is still the source of truth for the full platform matrix. ## This PR in the stack This is PR 3/3, based on #25088. It adds the under-development feature flag and starts the best-effort background worker when enabled. The worker currently compresses only cold archived rollouts, skips active sessions, verifies compressed output, preserves mtime and permissions, keeps a store-level lock heartbeat, and cleans stale temp files. Stack order: 1. #25087: read compressed local rollouts. 2. #25088: materialize compressed rollouts before append. 3. This PR: add the disabled local compression worker.	2026-06-01 18:35:58 +02:00
jif-oai	3cdce52865	Preserve renamed thread titles during reconciliation (#25624) ## Summary - preserve existing explicit SQLite thread titles during rollout reconciliation/backfill when the incoming rollout title is only first-message-derived - keep stale inferred-title repair behavior while avoiding session-index scans during startup backfill - add a regression test for renamed titles surviving reconcile ## Testing - just fmt - just test -p codex-rollout - just test -p codex-state	2026-06-01 18:33:05 +02:00
Eric Traut	f1d029cf75	Add reasoning-only status surface item (#25504 ) Closes #24886. ## Why Users can configure the TUI status line and terminal title with `model-with-reasoning`, but issue #24886 asks for a compact reasoning-only item. That lets a setup show just `default`, `low`, `medium`, `high`, or `xhigh` without repeating the model name. ## What changed - Added a `reasoning` item for `/statusline` and `/title` setup flows. - Rendered the item from the effective reasoning effort, including collaboration-mode overrides. - Registered `reasoning` with `codex doctor` so Codex-generated terminal-title config is not reported as invalid. - Updated TUI setup snapshots so the picker previews include the new item.	2026-06-01 09:30:20 -07:00
Eric Traut	6681446477	Reset slash popup selection when filter changes (#25492) ## Summary Fixes #25295. The slash-command popup reused its previous `ScrollState` when the composer filter token changed. After scrolling the full `/` command list, typing a narrower filter such as `/st` could clamp the stale selection into the filtered results and highlight the wrong command. This resets the popup selection and viewport only when the parsed filter token changes, so normal arrow navigation is preserved while new filters start at the first match.	2026-06-01 09:17:19 -07:00
Eric Traut	f94c49cf46	Use deep links for macOS codex app paths (#25485 ) ## Why `codex app [PATH]` is the documented CLI entry point for opening Codex Desktop on a workspace. Recent desktop builds can focus the app while failing to honor paths passed as macOS document-open arguments via `open -a Codex.app <workspace>`, which broke `codex app .` for users. See #25333; related report: #25166. The desktop app still supports the explicit `codex://threads/new?path=...` route, so the CLI should use that app-owned launch surface instead of depending on folder-open event delivery. ## What Changed - Build a `codex://threads/new?path=<workspace>` URL in the macOS app launcher. - Pass that URL to `open -a <Codex.app>` instead of passing the workspace path as a document argument. - Add coverage that workspace paths needing escaping round-trip through URL query encoding. ## Verification - `just test -p codex-cli codex_new_thread_url_encodes_workspace_path`	2026-06-01 09:17:08 -07:00
Charlie Marsh	12c37a6b5c	Allow paste in searchable selection menus (#25400) ## Summary I frequently want to be able to paste into the searchable menu -- the most common use-case here is when specifying an upstream for a `/review`, where I copy the upstream from an open terminal.	2026-06-01 18:01:52 +02:00
Won Park	13edafb6ed	Preserve auto-review approval policy in codex exec (#23763 ) ## Why `codex exec` was forcing headless runs to `approval_policy = "never"` even when the resolved reviewer was `auto_review`. That prevented unattended exec workflows from reaching the reviewed MCP write path they were configured to use. ## What changed - Keep the existing headless `never` default for ordinary exec runs. - Re-resolve exec config without that synthetic override when the final reviewer resolves to `AutoReview`, so configured or requirements-driven approval policy is preserved. - Add regression coverage for: - `auto_review` plus `on-request` from user config - requirements-driven `AutoReview`, asserting exec’s final approval policy matches the no-override control config exactly ## Validation - `just fmt` - `cargo test -p codex-exec`	2026-06-01 08:53:25 -07:00
Felipe Coury	c0ea566bb5	feat(tui): restore output-free cancelled prompts (#25316) ## TL;DR When you press Esc or Ctrl+C after sending a prompt but before any output has rendered, the TUI restores the submitted message to the composer. ## Summary Cancelling a prompt immediately after submission should behave like returning to edit that prompt, not like discarding the user's draft. Today, pressing `Esc` or `Ctrl+C` before Codex responds leaves the submitted prompt in the transcript and returns an empty composer, forcing the user to recall or retype it. When an interrupted turn has not produced substantive visible output, restore its submitted prompt directly into the composer and roll back that latest turn. This also covers the first prompt in a fresh thread, before the TUI has retained a local user-history cell. The restored draft keeps its text, image attachments, and active collaboration mode so it can be edited and resubmitted in place. Restoration is intentionally suppressed once the turn has produced user-visible activity such as assistant output, tool work, hooks, or patches. A transient thinking status does not make the prompt ineligible. Rollback also rebuilds terminal scrollback from the retained transcript cells so repeated cancellations and terminal resizes do not duplicate history. ## How to Test 1. Start the TUI with `cargo run -p codex-cli --bin codex`. 2. In a fresh thread, submit the first prompt and press `Esc` before Codex emits substantive output. Confirm that the prompt returns to the composer for editing and its submitted transcript row is removed. 3. Repeat with `Ctrl+C`, then repeat after at least one completed turn. Confirm the same behavior. 4. Submit a prompt, wait for assistant output or tool activity, then cancel. Confirm that the transcript remains intact and the prompt is not restored into the composer. 5. Cancel several output-free prompts and resize the terminal between attempts. 
Confirm that the startup banner, tip, and transcript history do not duplicate in scrollback. Targeted tests: - `just test -p codex-tui cancelled_turn_edit_restores_prompt` - `just test -p codex-tui output_free_interrupted_turn_requests_prompt_restore` - `just test -p codex-tui visible_output_prevents_cancelled_turn_prompt_restore` - `just test -p codex-tui thinking_status_keeps_cancelled_turn_prompt_restore_eligible` - `just test -p codex-tui patch_activity_prevents_cancelled_turn_prompt_restore` The full `just test -p codex-tui` run completed with `2746` passing tests and two unrelated existing guardian feature-flag failures. `just argument-comment-lint` remains blocked locally by the existing Bazel LLVM `compiler-rt` sanitizer-header glob failure; the touched Rust diff was manually audited for positional literal comments.	2026-06-01 11:49:14 -03:00
Felipe Coury	4eded02f52	[codex] fix compressed rollout fixture SessionMeta initialization (#25628 ) ## Summary - initialize `parent_thread_id` in the compressed rollout test fixture's `SessionMeta` - restore rollout test compilation across Bazel test, clippy, release-build, and argument-comment-lint jobs ## Root cause PR #25087 (`Read compressed rollouts and materialize before append`) added `codex-rs/rollout/src/compression_tests.rs` in merge commit `a8a6071279b6f3112fcc5fc3fee69c48473d7149`. Its `write_rollout` fixture constructs `SessionMeta` without the required `parent_thread_id` field, causing `error[E0063]` when Bazel compiles `rollout-unit-tests-bin` on `main` and downstream PRs. ## Validation - `UV_CACHE_DIR=/private/tmp/codex-uv-cache just fmt` - `just test -p codex-rollout` (`59` tests passed; bench smoke passed) - `git diff --check` - manually audited the touched Rust diff for positional literal argument comments; the change adds no positional callsite ## Local lint blocker - `just argument-comment-lint` could not reach source inspection locally because Bazel's LLVM dependency fails analysis: `compiler-rt/BUILD.bazel` glob `include/sanitizer/*.h` matched no files.	2026-06-01 16:43:21 +02:00
jif-oai	a8a6071279	Read compressed rollouts and materialize before append (#25087 ) ## Why Local rollout compression needs a cold `.jsonl.zst` representation without letting compressed physical paths leak into append-mode writers. The unsafe case is resume or metadata update code successfully reading a compressed rollout and then appending raw JSONL bytes to the zstd file. This PR folds the former #25088 materialization slice into the read-support PR so the reader changes and append-safety invariant land together. ## What Changed - Teach rollout readers, discovery, listing, search, and ID lookup to understand compressed `.jsonl.zst` rollouts. - Keep `.jsonl` as the logical/stored rollout path while allowing read paths to open either plain or compressed storage. - Materialize compressed rollouts back to plain `.jsonl` before append-mode writes, including resume and direct metadata append paths. - Preserve compressed-file permissions when materializing back to plain JSONL. - Refresh thread-store resolved rollout paths after compatibility metadata writes so reconciliation follows the materialized file. - Avoid treating transient compression temp files as real rollout lookup results. ## Remaining Stack #25089 remains the separate worker PR. It is based directly on this PR and stays behind the disabled `local_thread_store_compression` feature flag. The worker still has a broader coordination question: a resume or metadata update can race with background compression while a plain file is being replaced by `.jsonl.zst`. This PR handles the read and materialize-before-append primitives; it does not make the worker production-ready. ## Validation - `just test -p codex-rollout` - `just test -p codex-thread-store` - `just fix -p codex-rollout` - `just fix -p codex-thread-store` - `just bazel-lock-check`	2026-06-01 15:14:19 +02:00
jif-oai	f27bbbd49c	Add goal extension GoalApi (#25096 ) ## Summary - add an extension-owned `GoalApi` for thread goal get/set/clear operations - register live goal runtimes with the API from the goal extension backend - cover the API and runtime-effect paths in goal extension tests ## Stack Follow-up app-server wiring PR: #25108 ## Validation - `just fmt` - `just fix -p codex-goal-extension` - `just test -p codex-goal-extension`	2026-06-01 11:32:13 +02:00
jif-oai	48c16b8bcb	Remove Plan-mode gate from idle turn injection (#25577 ) ## Why `try_start_turn_if_idle` is the core helper for starting injected input only when the session is actually idle. It should stay focused on generic turn-lifecycle safety. The previous `ModeKind::Plan` guard mixed caller policy into that helper: Plan mode may choose not to auto-start some extension work, but that decision belongs at the extension or caller boundary rather than in the session injection primitive. ## What changed - Removed the `ModeKind::Plan` early return from `Session::try_start_turn_if_idle`. - Removed the now-unused `ModeKind` import from `core/src/session/inject.rs`. ## Testing Not run locally.	2026-06-01 11:06:38 +02:00
jif-oai	c875bc8a33	Use templates for goal steering prompts (#25576 ) ## Why Goal steering prompts have grown into long inline Rust strings, which makes the authored prompt text hard to review and easy to damage while changing the surrounding plumbing. Moving those prompts into embedded Markdown templates keeps the policy text in the shape reviewers actually read, while preserving the existing runtime substitution and objective escaping behavior. ## What changed - Added `ext/goal/templates/goals/continuation.md`, `budget_limit.md`, and `objective_updated.md` for the three goal steering prompts. - Updated `ext/goal/src/steering.rs` to parse those embedded templates once with `codex-utils-template` and render the existing goal values into them. - Kept user objectives XML-escaped before rendering and converted budget counters into template variables. - Added the template directory to `ext/goal/BUILD.bazel` `compile_data` so Bazel has the same embedded prompt inputs as Cargo. ## Testing - Not run locally.	2026-06-01 10:55:14 +02:00
jif-oai	f1b1b64005	Add goal extension idle continuation (#25060 ) ## Why The goal extension needs a way to resume an active goal after the thread becomes idle, but the old core goal runtime should not be refactored as part of this step. The missing piece is a small core-owned turn-start primitive: let an extension ask for a normal model turn only when the thread is idle, and otherwise fail without injecting into whatever is currently active. ## What Changed - Adds `CodexThread::try_start_turn_if_idle(...)` as the narrow extension-facing primitive for synthetic idle work. - Implements the session side so it refuses to start when: - the provided input is empty, - the session is in plan mode, - a turn is already active, or - trigger-turn mailbox work is pending. - Gives trigger-turn mailbox work priority if it appears while the idle turn is being prepared. - Wires `GoalExtension::on_thread_idle` to read the active persisted goal and submit the continuation prompt through this idle-only primitive. - Keeps the legacy core goal continuation implementation in place instead of folding it into this PR. ## Behavior This is intentionally best-effort. If `try_start_turn_if_idle` observes that the thread is not idle, or that higher-priority mailbox work should run first, it returns the input to the caller. The goal extension drops that continuation prompt and waits for a future idle opportunity instead of injecting stale synthetic goal text into an active turn. ## Validation - `just test -p codex-core try_start_turn_if_idle_rejects_active_turn_without_injecting` - `just test -p codex-goal-extension`	2026-06-01 10:42:01 +02:00
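The "return the input to the caller" contract described in this commit can be sketched as follows. This is a minimal illustration, not the code in `codex-core`: the `Session` struct, its fields, and the `Vec<String>` input type are hypothetical stand-ins. The plan-mode check is left out because #25577 (listed above) later removed it from this helper.

```rust
/// Hypothetical stand-in for the session state consulted by the idle check.
#[derive(Debug, Default)]
struct Session {
    turn_active: bool,
    mailbox_pending: bool,
    current_input: Vec<String>,
}

impl Session {
    /// Starts a turn only when the session is idle. On refusal the input is
    /// handed back in `Err` so the caller decides whether to drop or retry it.
    fn try_start_turn_if_idle(&mut self, input: Vec<String>) -> Result<(), Vec<String>> {
        // Refuse on empty input, an active turn, or pending trigger-turn mailbox work.
        if input.is_empty() || self.turn_active || self.mailbox_pending {
            return Err(input);
        }
        self.turn_active = true;
        self.current_input = input;
        Ok(())
    }
}
```

Under this shape, a caller such as the goal extension would simply discard the `Err` payload and wait for the next idle notification, which matches the best-effort behavior the commit describes.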
jif-oai	8d49394feb	Set multi-agent v2 dogfood defaults (#25266 ) ## Summary - default multi-agent v2 to direct-model-only tools so code mode does not wrap subagent tools - add default root/subagent team prompts aligned with dogfood training assumptions - tighten spawn-agent model override wording to prefer the inherited model by default ## Tests - just fmt - just test -p codex-core spawn_agent_description_lists_visible_models_and_reasoning_efforts - just test -p codex-core multi_agent_v2_default_session_thread_cap_counts_root - just test -p codex-rollout-trace - just fix -p codex-core - just fix -p codex-rollout-trace Note: a broad just test -p codex-core run was attempted locally, but this sandbox produced unrelated environment failures around sandbox-exec, missing test_stdio_server, and realtime timeouts.	2026-06-01 10:24:46 +02:00
Owen Lin	cf0911076f	store and expose parent_thread_id on Threads (#25113 ) ## Why This PR discussion https://github.com/openai/codex/pull/24161#discussion_r3325692763 revealed a subagent data-modeling issue: we overloaded `forked_from_id` to also mean `parent_thread_id`. That is incorrect, because guardian and review threads can be subagents without forking the main thread's history. The fix is to store a new `parent_thread_id` explicitly on `SessionMeta`, alongside the existing `forked_from_id`, and to expose it in the app-server protocol on the `Thread` object. A thread-to-subagent relationship and a fork of thread history are orthogonal concepts. ## What Changed - Added top-level `parent_thread_id` persistence on `SessionMeta` and runtime/session plumbing through `SessionConfiguredEvent`, `CodexSpawnArgs`, `SessionConfiguration`, `ThreadConfigSnapshot`, `TurnContext`, and `ModelClient`. - Made turn metadata, request headers, analytics, and subagent-start events read the separate runtime/top-level parent field instead of deriving general parent lineage from `SessionSource` or `forked_from_thread_id`. - Passed parent lineage separately at delegated subagent, review, guardian, agent-job, and multi-agent spawn construction sites; copied-history fork lineage remains derived only from `InitialHistory`. - Persisted and exposed parent lineage through rollout/thread-store projections and app-server v2 `Thread.parentThreadId`. - Updated app-server README text and regenerated app-server schema fixtures for the additive `parentThreadId` response field.	2026-06-01 04:33:20 +00:00
Shijie Rao	3b7334d099	Revert "Add build_unsigned_archive release mode" (#25462 ) Reverts openai/codex#25435	2026-05-31 16:05:33 -07:00
joeflorencio-openai	8a556296f0	Add cloud-managed config layer support (#24620 ) ## Summary PR 3 of 5 in the cloud-managed config client stack. Adds enterprise-managed cloud config as a first-class config layer source. The layer metadata is preserved through config loading, diagnostics, debug output, hook attribution, and app-server protocol surfaces. ## Details - Enterprise-managed config becomes a normal config layer source with backend-supplied `id` and display `name` attached for provenance. - These layers are designed to behave like non-file managed config: they can surface syntax/type diagnostics by layer name even though there is no physical config file. - Relative path settings are resolved from a stored config base so cloud-delivered config remains consistent with existing MDM-delivered config semantics. - Hook attribution distinguishes config-delivered hooks from requirements-delivered hooks via `HookSource::CloudManagedConfig`. - This remains pull-based and snapshot-oriented; the PR adds layer identity/diagnostics, not dynamic reload behavior. ## Validation Validated through the targeted stack checks after rebasing onto current `main`: - Rust crate tests for config/hooks/cloud-config/backend-client/app-server-protocol - Filtered `codex-core` and `codex-app-server` `cloud_config_bundle` tests - Python generated-file contract test - `cargo shear --deny-warnings` - Targeted `argument-comment-lint` for config/hooks	2026-05-31 15:54:31 -07:00
joeflorencio-openai	20debf746b	Compose requirements layers (#24619 ) ## Summary PR 2 of 5 in the cloud-managed config client stack. Adds a shared requirements-layer composition engine. The composer defines how ordered requirements layers combine, with focused tests for the merge semantics and provenance behavior. The final PR in the stack wires runtime requirements sources into this path. ## Details - Mental model: requirements layers are ordered lowest priority first, matching `ConfigLayerStack`; lower-priority layers provide defaults while higher-priority layers win scalar/list conflicts. - Regular fields use config-style TOML merging, including recursive table merging, so requirements layering follows the same broad model as `config.toml` layering. - Domain-specific fields keep explicit semantics: `rules.prefix_rules` and hooks preserve high-priority-first output, hooks fail closed on active managed-dir conflicts, and `permissions.filesystem.deny_read` dedupes as a stable high-priority-first union. - `remote_sandbox_config` is evaluated within each layer before the regular TOML merge, so host-specific sandbox constraints do not leak across layers. - Provenance points at the exact source when one layer owns a value and uses composite provenance when a table field is assembled from multiple layers. ## Validation Local validation: - `just fmt` - `cargo check -p codex-config` - `just test -p codex-config requirements_composition` - `git diff --check` CI will run the broader test matrix.	2026-05-31 15:14:06 -07:00
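The "lowest priority first, higher layers win, tables merge recursively" model for regular fields can be sketched as below. This is an illustration only: the `Value` enum, `merge`, `compose`, and `table` are hypothetical names, and the domain-specific rules the commit mentions (`rules.prefix_rules`, hooks, `permissions.filesystem.deny_read`, `remote_sandbox_config`) and provenance tracking are deliberately not modeled.

```rust
use std::collections::btree_map::Entry;
use std::collections::BTreeMap;

/// Hypothetical, simplified stand-in for a TOML value.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Scalar(String),
    List(Vec<String>),
    Table(BTreeMap<String, Value>),
}

/// Merges `overlay` (higher priority) into `base` (lower priority).
/// Tables merge key by key; scalars and lists are replaced wholesale.
fn merge(base: &mut Value, overlay: Value) {
    match (base, overlay) {
        (Value::Table(lo), Value::Table(hi)) => {
            for (key, value) in hi {
                match lo.entry(key) {
                    Entry::Occupied(mut slot) => merge(slot.get_mut(), value),
                    Entry::Vacant(slot) => {
                        slot.insert(value);
                    }
                }
            }
        }
        // Any non-table conflict: the higher-priority value wins.
        (slot, hi) => *slot = hi,
    }
}

/// Composes layers ordered lowest priority first.
fn compose(layers: Vec<Value>) -> Value {
    let mut merged = Value::Table(BTreeMap::new());
    for layer in layers {
        merge(&mut merged, layer);
    }
    merged
}

/// Small helper for building tables in examples.
fn table(entries: Vec<(&str, Value)>) -> Value {
    Value::Table(entries.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}
```

With two layers that both define a `sandbox` table (a made-up key), the lower layer's untouched keys survive as defaults while the higher layer's keys override on conflict.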
Shijie Rao	5f60b01352	Add build_unsigned_archive release mode (#25435 ) ## Why We want a manual mode that produces the full packaged unsigned macOS Codex archive, including bundled resources like `rg`, without mixing those archives into the signing and publishing flow. The existing `build_unsigned` mode is the handoff used by external signing and `promote_signed`, so archive-only inspection and local packaging should live in a separate mode and artifact namespace. ## What Changed - added `build_unsigned_archive` as a new manual `release_mode` - kept the existing `build` matrix running for that mode instead of introducing a separate archive-only job - wrote unsigned macOS package archives to `codex-rs/unsigned-archive-dist/...` instead of the normal `dist/...` tree - uploaded those packaged macOS outputs as dedicated `*-unsigned-archive` workflow artifacts - kept `build_unsigned` and `promote_signed` on their existing raw unsigned binary path ## Validation - parsed `.github/workflows/rust-release.yml` with `ruby -e 'require "yaml"; YAML.load_file(".github/workflows/rust-release.yml")'` - ran `git diff --check -- .github/workflows/rust-release.yml` - reviewed the workflow diff to confirm `build_unsigned_archive` now reuses the existing `build` job while isolating the unsigned macOS package archives under dedicated artifact names - locally verified the package builder layout against unsigned macOS binaries to confirm the packaged archive contains `bin/codex`, `codex-path/rg`, and `codex-resources/zsh/bin/zsh`	2026-05-31 14:56:06 -07:00
joeflorencio-openai	e93dc98a48	Add config bundle transport types (#24617 ) ## Summary PR 1 of 5 in the cloud-managed config client stack. Adds the generated backend models and client transport surface for the config bundle endpoint. This bundle endpoint is the replacement backend surface for legacy cloud requirements; the final PR in the stack switches runtime consumers over to it. ## Details - This is transport-only plumbing: no runtime config behavior changes in this PR. - The bundle endpoint is the new shared backend surface for cloud-delivered config and requirements data. - Both supported path styles are wired here: `/api/codex/config/bundle` and `/wham/config/bundle`. - The response types come from generated backend models so later PRs consume the backend contract directly instead of maintaining hand-written mirror structs. ## Validation Validated through the targeted stack checks after rebasing onto current `main`: - Rust crate tests for config/hooks/cloud-config/backend-client/app-server-protocol - Filtered `codex-core` and `codex-app-server` `cloud_config_bundle` tests - Python generated-file contract test - `cargo shear --deny-warnings` - Targeted `argument-comment-lint` for config/hooks	2026-05-31 11:52:18 -07:00
Felipe Coury	2f0726ad6d	feat(tui): allow function keys through f24 in keymaps (#25329 ) ## Why Closes #25006. `tui.keymap` currently rejects `F13` even though Codex's terminal event layer can report higher function keys. This prevents users from using common remappings such as Caps Lock to `F13`. ## What Changed - Define a shared portable upper bound of `F24` for stored TUI keybindings. - Accept `f13` through `f24` in config normalization and runtime parsing. - Allow `/keymap` capture to persist `F13` through `F24`. - Update the unsupported-function-key error and add boundary tests for `F13`, `F24`, and `F25`. ## How to Test 1. Add a binding such as: ```toml [tui.keymap.global] open_transcript = "f13" ``` 2. Start Codex and press the remapped `F13` key. 3. Confirm Codex loads the config without the previous `F1 through F12` error and the action runs. 4. Open `/keymap`, capture `F13` for an action, and confirm the saved binding is `f13`. 5. As a regression check, try to capture `F25` and confirm Codex reports that only `F1` through `F24` can be stored. Targeted tests: - `just test -p codex-config` - `just test -p codex-tui function_keys` Full `just test -p codex-tui` completed with 2,752 passing tests, 4 skipped tests, and two unrelated guardian feature-flag failures: - `app::tests::update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default` - `app::tests::update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`	2026-05-31 15:42:39 -03:00
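The shared `F24` upper bound can be sketched as a single constant used by one parser. The names `MAX_FUNCTION_KEY` and `parse_function_key` are hypothetical and the error text is illustrative; the actual normalization and capture code in `codex-config` and `codex-tui` is not shown in this log.

```rust
/// Assumed shared upper bound for stored function-key bindings.
const MAX_FUNCTION_KEY: u8 = 24;

/// Parses names such as "f13" or "F24" into the function-key number.
fn parse_function_key(name: &str) -> Result<u8, String> {
    let lower = name.trim().to_ascii_lowercase();
    let digits = lower
        .strip_prefix('f')
        .ok_or_else(|| format!("`{}` is not a function key", name))?;
    // Require plain decimal digits so inputs like "f+3" are rejected.
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return Err(format!("`{}` is not a function key", name));
    }
    // Values that overflow u8 (for example "f999") fall into the range error.
    let number: u8 = digits
        .parse()
        .map_err(|_| format!("only F1 through F{} can be stored", MAX_FUNCTION_KEY))?;
    if (1..=MAX_FUNCTION_KEY).contains(&number) {
        Ok(number)
    } else {
        Err(format!("only F1 through F{} can be stored", MAX_FUNCTION_KEY))
    }
}
```

Keeping the bound in one constant means config normalization, runtime parsing, and `/keymap` capture cannot drift apart on which keys are accepted.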
xl-openai	cdde711fac	[codex] Avoid forced directory refresh during plugin install auth checks (#25381 ) ## Summary - Use normal directory loading for plugin install app metadata so install avoids forced directory refresh while still loading metadata on cold cache. - Continue force-refreshing codex_apps tools for auth state. - Add regression coverage that pre-warms the directory cache and asserts install returns cached app metadata without extra directory requests. ## Validation - just fmt - git diff --check - just test -p codex-app-server plugin_install_returns_apps_needing_auth plugin_install_filters_disallowed_apps_needing_auth (blocked locally: cargo-nextest is not installed)	2026-05-31 02:14:15 -07:00
Owen Lin	966932124c	fix: Limit Bedrock GPT models to default service tier (#25318 ) ## Description Bedrock currently only supports the implicit `default` service tier for GPT models. This PR strips non-default service tier metadata from Bedrock model catalogs so Codex does not advertise or send unsupported tiers. ## What changed - Normalize both built-in and configured Bedrock catalogs to default-only service tier behavior. - Add regression coverage for built-in and configured Bedrock catalogs. ## Validation - `just fmt` - `just test -p codex-model-provider`	2026-05-30 11:54:58 -07:00
jif-oai	8acaec73b6	Rename multi-agent v2 assignment tool (#25267 ) ## Summary - rename the multi-agent v2 follow-up task tool surface to assign_task - update core tests and spec-plan expectations - keep rollout-trace classification backward-compatible with legacy followup_task ## Tests - just fmt - just test -p codex-core multi_agents_spec::tests::assign_task_tool_requires_message_and_has_no_output_schema - just test -p codex-rollout-trace - just fix -p codex-core - just fix -p codex-rollout-trace Note: a broad just test -p codex-core run was attempted locally, but this sandbox produced unrelated environment failures around sandbox-exec, missing test_stdio_server, and realtime timeouts.	2026-05-30 14:13:05 +02:00
Eric Traut	3e7baa00e4	Add thread archive CLI commands (#25021 ) ## Problem Saved threads can already be archived through app-server RPCs, but the command line did not expose direct archive or unarchive commands. ## Solution Add `codex archive <thread>` and `codex unarchive <thread>`, resolving UUIDs or exact thread names before calling the existing `thread/archive` and `thread/unarchive` RPCs. The commands support scoped remote flags so callers can target remote app-server endpoints when archiving or unarchiving threads. This also fixes a long-standing bug in `codex resume <thread id>` and `codex fork <thread id>` that I found when testing the new commands. These operations shouldn't be allowed on archived sessions. They now fail with an error that tells the user to run `codex unarchive <thread id>` first. ## Verification Added app-server coverage for rejecting archived thread resume by id and checking that the error includes the matching `codex unarchive <thread id>` command.	2026-05-29 23:37:26 -07:00
Dylan Hurd	e0435afb72	feat(config) experimental_request_user_input toggle (#24541 ) ## Summary Experimental flag to allow toggling `request_user_input`: ``` tools.experimental_request_user_input = false ``` ## Testing - [x] Added unit tests	2026-05-29 21:35:53 -07:00
Celia Chen	00ca857d3f	fix: Bedrock API key region fallback (#25171 ) ## Why Users following the Amazon Bedrock API-key setup can export `AWS_BEARER_TOKEN_BEDROCK` and `AWS_REGION`, but Codex's bearer-token auth path only accepted `model_providers.amazon-bedrock.aws.region`. That made the documented env-based setup fail with a missing-region error even though the standard AWS region environment variable was present. ## What Changed - Updates Bedrock bearer-token region resolution to use `model_providers.amazon-bedrock.aws.region` first, then fall back to `AWS_REGION`, then `AWS_DEFAULT_REGION`. - Updates the missing-region error to list all supported region sources. - Adds focused coverage for config precedence, `AWS_REGION`, `AWS_DEFAULT_REGION`, and the missing-region failure.	2026-05-30 01:17:38 +00:00
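The region precedence described above (config value, then `AWS_REGION`, then `AWS_DEFAULT_REGION`) can be sketched as a short fallback chain. `resolve_bedrock_region` is a hypothetical helper, the environment is passed in as a lookup function so the sketch can be exercised without touching the process environment, and treating blank values as unset is an assumption rather than something the commit states.

```rust
/// Treats empty or whitespace-only values as unset (an assumption of this sketch).
fn non_empty(value: String) -> Option<String> {
    let trimmed = value.trim();
    if trimmed.is_empty() {
        None
    } else {
        Some(trimmed.to_string())
    }
}

/// `configured` stands in for `model_providers.amazon-bedrock.aws.region`;
/// `env` is any lookup, for example `|key| std::env::var(key).ok()`.
fn resolve_bedrock_region<F>(configured: Option<&str>, env: F) -> Result<String, String>
where
    F: Fn(&str) -> Option<String>,
{
    configured
        .map(str::to_string)
        .and_then(non_empty)
        .or_else(|| env("AWS_REGION").and_then(non_empty))
        .or_else(|| env("AWS_DEFAULT_REGION").and_then(non_empty))
        .ok_or_else(|| {
            // List every supported source, as the updated error is said to do.
            "missing region: set model_providers.amazon-bedrock.aws.region, \
             AWS_REGION, or AWS_DEFAULT_REGION"
                .to_string()
        })
}
```

In production code the lookup would typically be `|key| std::env::var(key).ok()`; injecting it keeps precedence tests independent of the machine they run on.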
Eric Ning	e929bb5c88	[codex] Update remote connector suggestions (#25172 ) ## Summary - Use the session-loaded plugin app IDs as the source of connector suggestion candidates. - Remove the redundant plugin reload from `tool_suggest_connector_ids()`. - Add regression coverage for connectors declared by a loaded remote plugin, using the Databricks app case. ## Context Loaded remote plugins can declare app connector IDs in `.app.json`. The session-owned `PluginsManager` already loads those plugins and exposes their effective app IDs. The connector suggestion path was creating a separate `PluginsManager` and recomputing plugin app IDs. That new manager does not share the session manager’s remote installed plugin cache, so app IDs from loaded remote plugins were missing from connector suggestions. ## Fix Pass the already-loaded effective app IDs into connector suggestion generation and use them directly as the plugin-derived connector candidate set. Connector candidates are now built from: - App IDs declared by loaded plugins - Explicitly configured connector discoverables - Existing disabled-suggestion filtering This avoids a second plugin-manager lookup and keeps connector suggestions aligned with the plugins actually loaded for the turn. ## Behavior For example, when a plugin is loaded and its `.app.json` declares data apps, `list_available_plugins_to_install` can now return those data connectors. This does not create plugin suggestions from the plugin itself. Plugin suggestions still come from eligible uninstalled entries in the marketplace catalog and require existing matching/filtering rules. ## Validation - `just fmt` - Added regression coverage for a loaded-plugin connector ID appearing in discoverable tools - Attempted `just test -p codex-core`; the command exited unsuccessfully in the local test environment without useful failure detail captured in the run output	2026-05-29 17:57:34 -07:00
Abhinav	a5a94ee5a7	Constrain Windows sandbox requirements (#23766 ) # Why Managed requirements can already constrain sandbox policy choices, but Windows sandbox implementation selection was still resolved independently from those requirements. That left the TUI able to continue through the unelevated fallback even when an organization wants to require the elevated Windows sandbox implementation. # What - Add `[windows].allowed_sandbox_implementations` requirements support for the Windows `elevated` and `unelevated` implementations. - Apply that allowlist during core config resolution so disallowed configured or feature-selected Windows sandbox implementations fall back to an allowed implementation with the existing requirements warning path. - Reuse the existing TUI Windows setup prompts to block disallowed unelevated continuation, keep required elevated setup in front of the user, and refuse to persist a TUI-selected Windows sandbox mode that requirements disallow. # Semantics \| Allowed \| Selected \| Effective \| \| --- \| --- \| --- \| \| `["elevated"]` \| `unelevated` / unset \| `elevated` \| \| `["unelevated"]` \| `elevated` / unset \| `unelevated` \| \| `["elevated", "unelevated"]` \| `elevated` \| `elevated` \| \| `["elevated", "unelevated"]` \| `unelevated` \| `unelevated` \| \| `["elevated", "unelevated"]` \| unset \| `elevated` \| Availability is handled by interactive setup surfaces after allowlist resolution. If the effective elevated implementation is not ready, elevated-only requirements block on setup. When unelevated is also allowed, the UI may offer the existing unelevated fallback. ## TUI Screens If elevated setup is not already complete: ``` Your organization requires the default Codex agent sandbox to continue. Set it up to protect your files and control network access. Learn more <https://developers.openai.com/codex/windows> › 1. Set up default sandbox (requires Administrator permissions) 2. 
Quit ``` If admin setup fails under `["elevated"]`: ``` Couldn't set up your sandbox with Administrator permissions Your organization requires the default sandbox before Codex can continue. Learn more <https://developers.openai.com/codex/windows> › 1. Try setting up admin sandbox again 2. Quit ``` # Next Steps - extend the requirements/readout surface, such as `configRequirements/read`, so clients can inspect the loaded `[windows].allowed_sandbox_implementations` requirement instead of inferring it from Windows setup state - consider extending `windowsSandbox/readiness` as well - update the App startup guide, setup flow, and banner surfaces so an elevated-only requirement omits any continue-unelevated escape hatch and blocks startup until a permitted implementation is ready; - preserve the existing unelevated fallback path when requirements allow it, including the `["unelevated"]` case where elevated is disallowed	2026-05-29 16:31:33 -07:00
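The Semantics table in this commit can be read as "keep the selection if it is allowed, otherwise prefer `elevated`, then `unelevated`". The sketch below encodes that reading; `SandboxImpl` and `effective_impl` are hypothetical names, and returning `None` for an empty allowlist is an assumption, since the table does not cover that case. Availability and setup prompts are out of scope here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SandboxImpl {
    Elevated,
    Unelevated,
}

/// Resolves the effective Windows sandbox implementation from the
/// requirements allowlist and the configured (or unset) selection.
fn effective_impl(allowed: &[SandboxImpl], selected: Option<SandboxImpl>) -> Option<SandboxImpl> {
    // An allowed selection is kept as-is.
    if let Some(choice) = selected {
        if allowed.contains(&choice) {
            return Some(choice);
        }
    }
    // Otherwise fall back in preference order: elevated first, then unelevated.
    [SandboxImpl::Elevated, SandboxImpl::Unelevated]
        .iter()
        .copied()
        .find(|candidate| allowed.contains(candidate))
}
```

Each row of the table corresponds to one call, for example `effective_impl(&[SandboxImpl::Elevated], Some(SandboxImpl::Unelevated))` resolving to `Some(SandboxImpl::Elevated)`.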
Noah MacCallum	8e5f561697	Filter plugin install suggestions by installed apps (#24996 ) ## Summary - Keep the original `TOOL_SUGGEST_DISCOVERABLE_PLUGIN_ALLOWLIST` as a fallback seed list, so users with no installed plugins still get initial install suggestions. - Allow additional install suggestions from trusted marketplaces: `openai-curated` and `openai-bundled`. - Require non-fallback, non-configured marketplace candidates to share `.app.json` connector IDs with already installed plugins. - Preserve explicit configured plugin discoverables as an override, while still omitting installed, disabled, and `NOT_AVAILABLE` plugins. ## Context `list_available_plugins_to_install` controls which plugins the model can trigger via `request_plugin_install`. We want a small starter set for empty/new users, but we also want installed workflow plugins to unlock relevant source plugins without maintaining every source plugin ID by hand. This keeps the legacy plugin ID allowlist only as the starter fallback. For everything else, the trusted marketplace is the candidate boundary, and installed app connector overlap is the relevance filter. For example, an installed Sales plugin can make HubSpot and Granola suggestible when those source plugins are in `openai-curated` and share Sales app connector IDs, while an unrelated test-source plugin with an app connector not declared by Sales stays hidden. ## Test Coverage - Empty/no-installed-plugin case: returns the fallback seed plugins from the original allowlist. - Installed-app expansion: returns non-fallback marketplace plugins only when their app connector IDs overlap with an installed plugin. - Sales workflow case: installed Sales declares HubSpot and Granola apps, so `hubspot@openai-curated` and `granola@openai-curated` are returned. - Sales negative case: `test-source@openai-curated` has an app connector not declared by Sales, so it is not returned. 
- Existing guardrails: installed plugins, disabled suggestions, and `NOT_AVAILABLE` plugins remain omitted; explicit configured discoverables still work as an override. ## Validation - `just fmt` - `just test -p codex-core plugins::discoverable::tests` - `just test -p codex-core` was attempted earlier, but current `main` / local env failed with unrelated existing failures around missing `test_stdio_server`, CLI/code-mode MCP tool setup, and unified_exec/shell snapshot flakes/timeouts. The touched discoverable tests pass.	2026-05-29 15:32:04 -07:00
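The candidate rules above (fallback seeds always eligible, trusted-marketplace plugins eligible only when they share app connector IDs with an installed plugin) can be sketched as a filter. Everything here is illustrative: `Plugin`, `plugin`, and `suggestible` are hypothetical, plugin ID and marketplace are modeled as separate fields, and the configured-override, disabled-suggestion, and `NOT_AVAILABLE` guardrails are omitted.

```rust
use std::collections::BTreeSet;

/// Trusted marketplaces named in the commit message.
const TRUSTED_MARKETPLACES: [&str; 2] = ["openai-curated", "openai-bundled"];

/// Hypothetical, simplified plugin record.
struct Plugin<'a> {
    id: &'a str,
    marketplace: &'a str,
    app_ids: BTreeSet<&'a str>,
}

/// Convenience constructor for examples.
fn plugin<'a>(id: &'a str, marketplace: &'a str, apps: &[&'a str]) -> Plugin<'a> {
    Plugin { id, marketplace, app_ids: apps.iter().copied().collect() }
}

/// Returns the IDs of candidates that may be suggested for install.
fn suggestible<'a>(
    candidates: &'a [Plugin<'a>],
    installed: &[Plugin<'_>],
    fallback_ids: &[&str],
) -> Vec<&'a str> {
    // Union of app connector IDs declared by installed plugins.
    let installed_apps: BTreeSet<&str> =
        installed.iter().flat_map(|p| p.app_ids.iter().copied()).collect();
    candidates
        .iter()
        // Never suggest something that is already installed.
        .filter(|c| !installed.iter().any(|p| p.id == c.id))
        .filter(|c| {
            fallback_ids.contains(&c.id)
                || (TRUSTED_MARKETPLACES.contains(&c.marketplace)
                    && c.app_ids.iter().any(|app| installed_apps.contains(app)))
        })
        .map(|c| c.id)
        .collect()
}
```

With an installed Sales plugin declaring HubSpot and Granola apps, the HubSpot and Granola source plugins pass the overlap check while an unrelated source plugin does not; with nothing installed, only the fallback seeds remain.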

7026 Commits