codex

mirror of https://github.com/openai/codex.git synced 2026-06-02 19:31:59 +00:00

Author	SHA1	Message	Date
jif-oai	f1b1b64005	Add goal extension idle continuation (#25060 ) ## Why The goal extension needs a way to resume an active goal after the thread becomes idle, but the old core goal runtime should not be refactored as part of this step. The missing piece is a small core-owned turn-start primitive: let an extension ask for a normal model turn only when the thread is idle, and otherwise fail without injecting into whatever is currently active. ## What Changed - Adds `CodexThread::try_start_turn_if_idle(...)` as the narrow extension-facing primitive for synthetic idle work. - Implements the session side so it refuses to start when: - the provided input is empty, - the session is in plan mode, - a turn is already active, or - trigger-turn mailbox work is pending. - Gives trigger-turn mailbox work priority if it appears while the idle turn is being prepared. - Wires `GoalExtension::on_thread_idle` to read the active persisted goal and submit the continuation prompt through this idle-only primitive. - Keeps the legacy core goal continuation implementation in place instead of folding it into this PR. ## Behavior This is intentionally best-effort. If `try_start_turn_if_idle` observes that the thread is not idle, or that higher-priority mailbox work should run first, it returns the input to the caller. The goal extension drops that continuation prompt and waits for a future idle opportunity instead of injecting stale synthetic goal text into an active turn. ## Validation - `just test -p codex-core try_start_turn_if_idle_rejects_active_turn_without_injecting` - `just test -p codex-goal-extension`	2026-06-01 10:42:01 +02:00
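The refusal rules listed in the entry above can be sketched as a small state check. This is a toy model, not the Codex implementation: the `Session` struct, its field names, and the `Vec<String>` input type are all assumptions made for illustration. Only the method name and the four refusal conditions come from the commit message.

```rust
/// Toy session state; the field names are illustrative, not Codex's.
#[derive(Default)]
struct Session {
    plan_mode: bool,
    turn_active: bool,
    mailbox_pending: bool,
    started_turns: Vec<Vec<String>>,
}

impl Session {
    /// Starts a turn only when the session is idle. On refusal the input is
    /// handed back, so the caller decides whether to drop it or retry later.
    fn try_start_turn_if_idle(&mut self, input: Vec<String>) -> Result<(), Vec<String>> {
        let idle = !self.plan_mode && !self.turn_active && !self.mailbox_pending;
        if input.is_empty() || !idle {
            return Err(input);
        }
        self.turn_active = true;
        self.started_turns.push(input);
        Ok(())
    }
}
```

A caller that mirrors the described goal-extension behavior would simply discard the `Err` payload and wait for the next idle notification.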
jif-oai	8d49394feb	Set multi-agent v2 dogfood defaults (#25266 ) ## Summary - default multi-agent v2 to direct-model-only tools so code mode does not wrap subagent tools - add default root/subagent team prompts aligned with dogfood training assumptions - tighten spawn-agent model override wording to prefer the inherited model by default ## Tests - just fmt - just test -p codex-core spawn_agent_description_lists_visible_models_and_reasoning_efforts - just test -p codex-core multi_agent_v2_default_session_thread_cap_counts_root - just test -p codex-rollout-trace - just fix -p codex-core - just fix -p codex-rollout-trace Note: a broad just test -p codex-core run was attempted locally, but this sandbox produced unrelated environment failures around sandbox-exec, missing test_stdio_server, and realtime timeouts.	2026-06-01 10:24:46 +02:00
Owen Lin	cf0911076f	store and expose parent_thread_id on Threads (#25113 ) ## Why This PR https://github.com/openai/codex/pull/24161#discussion_r3325692763 revealed a subagent data modeling issue, where we overloaded `forked_from_id` to also mean `parent_thread_id`. That's incorrect since guardian and review subagents can be a subagent and NOT fork the main thread's history. The solution here is to explicitly store a new `parent_thread_id` on `SessionMeta`, alongside `forked_from_id` which already exists. While we're at it, also expose it in the app-server protocol on the `Thread` object. A thread->subagent relationship and a fork of thread history are orthogonal concepts. ## What Changed - Added top-level `parent_thread_id` persistence on `SessionMeta` and runtime/session plumbing through `SessionConfiguredEvent`, `CodexSpawnArgs`, `SessionConfiguration`, `ThreadConfigSnapshot`, `TurnContext`, and `ModelClient`. - Made turn metadata, request headers, analytics, and subagent-start events read the separate runtime/top-level parent field instead of deriving general parent lineage from `SessionSource` or `forked_from_thread_id`. - Passed parent lineage separately at delegated subagent, review, guardian, agent-job, and multi-agent spawn construction sites; copied-history fork lineage remains derived only from `InitialHistory`. - Persisted and exposed parent lineage through rollout/thread-store projections and app-server v2 `Thread.parentThreadId`. - Updated app-server README text and regenerated app-server schema fixtures for the additive `parentThreadId` response field.	2026-06-01 04:33:20 +00:00
Shijie Rao	3b7334d099	Revert "Add build_unsigned_archive release mode" (#25462) Reverts openai/codex#25435	2026-05-31 16:05:33 -07:00
joeflorencio-openai	8a556296f0	Add cloud-managed config layer support (#24620 ) ## Summary PR 3 of 5 in the cloud-managed config client stack. Adds enterprise-managed cloud config as a first-class config layer source. The layer metadata is preserved through config loading, diagnostics, debug output, hook attribution, and app-server protocol surfaces. ## Details - Enterprise-managed config becomes a normal config layer source with backend-supplied `id` and display `name` attached for provenance. - These layers are designed to behave like non-file managed config: they can surface syntax/type diagnostics by layer name even though there is no physical config file. - Relative path settings are resolved from a stored config base so cloud-delivered config remains consistent with existing MDM-delivered config semantics. - Hook attribution distinguishes config-delivered hooks from requirements-delivered hooks via `HookSource::CloudManagedConfig`. - This remains pull-based and snapshot-oriented; the PR adds layer identity/diagnostics, not dynamic reload behavior. ## Validation Validated through the targeted stack checks after rebasing onto current `main`: - Rust crate tests for config/hooks/cloud-config/backend-client/app-server-protocol - Filtered `codex-core` and `codex-app-server` `cloud_config_bundle` tests - Python generated-file contract test - `cargo shear --deny-warnings` - Targeted `argument-comment-lint` for config/hooks	2026-05-31 15:54:31 -07:00
joeflorencio-openai	20debf746b	Compose requirements layers (#24619 ) ## Summary PR 2 of 5 in the cloud-managed config client stack. Adds a shared requirements-layer composition engine. The composer defines how ordered requirements layers combine, with focused tests for the merge semantics and provenance behavior. The final PR in the stack wires runtime requirements sources into this path. ## Details - Mental model: requirements layers are ordered lowest priority first, matching `ConfigLayerStack`; lower-priority layers provide defaults while higher-priority layers win scalar/list conflicts. - Regular fields use config-style TOML merging, including recursive table merging, so requirements layering follows the same broad model as `config.toml` layering. - Domain-specific fields keep explicit semantics: `rules.prefix_rules` and hooks preserve high-priority-first output, hooks fail closed on active managed-dir conflicts, and `permissions.filesystem.deny_read` dedupes as a stable high-priority-first union. - `remote_sandbox_config` is evaluated within each layer before the regular TOML merge, so host-specific sandbox constraints do not leak across layers. - Provenance points at the exact source when one layer owns a value and uses composite provenance when a table field is assembled from multiple layers. ## Validation Local validation: - `just fmt` - `cargo check -p codex-config` - `just test -p codex-config requirements_composition` - `git diff --check` CI will run the broader test matrix.	2026-05-31 15:14:06 -07:00
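The "regular fields" rule described above (layers ordered lowest priority first, recursive table merging, higher-priority layers winning scalar conflicts) can be sketched as follows. This is a minimal illustration under stated assumptions: the `Value` enum stands in for TOML values, and `merge`, `compose`, `scalar`, and `table` are hypothetical names. The domain-specific fields the commit mentions (hooks, `rules.prefix_rules`, `permissions.filesystem.deny_read`, `remote_sandbox_config`) are deliberately not modeled.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for a TOML value: either a leaf or a nested table.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Scalar(String),
    Table(BTreeMap<String, Value>),
}

/// Merges `overlay` (higher priority) onto `base` (lower priority).
/// Tables merge key by key; anything else is replaced wholesale.
fn merge(base: &mut Value, overlay: Value) {
    match (base, overlay) {
        (Value::Table(base_table), Value::Table(overlay_table)) => {
            for (key, value) in overlay_table {
                if let Some(existing) = base_table.get_mut(&key) {
                    merge(existing, value);
                } else {
                    base_table.insert(key, value);
                }
            }
        }
        (slot, other) => *slot = other,
    }
}

/// Folds layers that are ordered lowest priority first.
fn compose(layers: Vec<Value>) -> Value {
    let mut merged = Value::Table(BTreeMap::new());
    for layer in layers {
        merge(&mut merged, layer);
    }
    merged
}

/// Small constructors to keep examples readable.
fn scalar(s: &str) -> Value {
    Value::Scalar(s.to_string())
}

fn table(pairs: Vec<(&str, Value)>) -> Value {
    Value::Table(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}
```

With a low layer that sets `sandbox.mode` and `sandbox.network` and a high layer that sets only `sandbox.mode`, the composed result keeps the low layer's `network` value and takes the high layer's `mode`.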
Shijie Rao	5f60b01352	Add build_unsigned_archive release mode (#25435 ) ## Why We want a manual mode that produces the full packaged unsigned macOS Codex archive, including bundled resources like `rg`, without mixing those archives into the signing and publishing flow. The existing `build_unsigned` mode is the handoff used by external signing and `promote_signed`, so archive-only inspection and local packaging should live in a separate mode and artifact namespace. ## What Changed - added `build_unsigned_archive` as a new manual `release_mode` - kept the existing `build` matrix running for that mode instead of introducing a separate archive-only job - wrote unsigned macOS package archives to `codex-rs/unsigned-archive-dist/...` instead of the normal `dist/...` tree - uploaded those packaged macOS outputs as dedicated `*-unsigned-archive` workflow artifacts - kept `build_unsigned` and `promote_signed` on their existing raw unsigned binary path ## Validation - parsed `.github/workflows/rust-release.yml` with `ruby -e 'require "yaml"; YAML.load_file(".github/workflows/rust-release.yml")'` - ran `git diff --check -- .github/workflows/rust-release.yml` - reviewed the workflow diff to confirm `build_unsigned_archive` now reuses the existing `build` job while isolating the unsigned macOS package archives under dedicated artifact names - locally verified the package builder layout against unsigned macOS binaries to confirm the packaged archive contains `bin/codex`, `codex-path/rg`, and `codex-resources/zsh/bin/zsh`	2026-05-31 14:56:06 -07:00
joeflorencio-openai	e93dc98a48	Add config bundle transport types (#24617 ) ## Summary PR 1 of 5 in the cloud-managed config client stack. Adds the generated backend models and client transport surface for the config bundle endpoint. This bundle endpoint is the replacement backend surface for legacy cloud requirements; the final PR in the stack switches runtime consumers over to it. ## Details - This is transport-only plumbing: no runtime config behavior changes in this PR. - The bundle endpoint is the new shared backend surface for cloud-delivered config and requirements data. - Both supported path styles are wired here: `/api/codex/config/bundle` and `/wham/config/bundle`. - The response types come from generated backend models so later PRs consume the backend contract directly instead of maintaining hand-written mirror structs. ## Validation Validated through the targeted stack checks after rebasing onto current `main`: - Rust crate tests for config/hooks/cloud-config/backend-client/app-server-protocol - Filtered `codex-core` and `codex-app-server` `cloud_config_bundle` tests - Python generated-file contract test - `cargo shear --deny-warnings` - Targeted `argument-comment-lint` for config/hooks	2026-05-31 11:52:18 -07:00
Felipe Coury	2f0726ad6d	feat(tui): allow function keys through f24 in keymaps (#25329 ) ## Why Closes #25006. `tui.keymap` currently rejects `F13` even though Codex's terminal event layer can report higher function keys. This prevents users from using common remappings such as Caps Lock to `F13`. ## What Changed - Define a shared portable upper bound of `F24` for stored TUI keybindings. - Accept `f13` through `f24` in config normalization and runtime parsing. - Allow `/keymap` capture to persist `F13` through `F24`. - Update the unsupported-function-key error and add boundary tests for `F13`, `F24`, and `F25`. ## How to Test 1. Add a binding such as: ```toml [tui.keymap.global] open_transcript = "f13" ``` 2. Start Codex and press the remapped `F13` key. 3. Confirm Codex loads the config without the previous `F1 through F12` error and the action runs. 4. Open `/keymap`, capture `F13` for an action, and confirm the saved binding is `f13`. 5. As a regression check, try to capture `F25` and confirm Codex reports that only `F1` through `F24` can be stored. Targeted tests: - `just test -p codex-config` - `just test -p codex-tui function_keys` Full `just test -p codex-tui` completed with 2,752 passing tests, 4 skipped tests, and two unrelated guardian feature-flag failures: - `app::tests::update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default` - `app::tests::update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`	2026-05-31 15:42:39 -03:00
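The boundary behavior described above (accept `F1` through `F24`, reject `F25`) can be sketched as a small parser. This is an illustrative sketch only: `parse_function_key` and `MAX_FUNCTION_KEY` are hypothetical names, and accepting both `f` and `F` prefixes is an assumption rather than something the commit states.

```rust
/// Assumed upper bound, taken from the commit's "portable upper bound of F24".
const MAX_FUNCTION_KEY: u8 = 24;

/// Parses names such as "f13" into the function-key number.
/// Rejects non-function keys, leading zeros, signs, and out-of-range values.
fn parse_function_key(name: &str) -> Result<u8, String> {
    let digits = name
        .strip_prefix('f')
        .or_else(|| name.strip_prefix('F'))
        .ok_or_else(|| format!("`{}` is not a function key", name))?;
    let well_formed = !digits.is_empty()
        && !digits.starts_with('0')
        && digits.bytes().all(|b| b.is_ascii_digit());
    match digits.parse::<u8>() {
        Ok(n) if well_formed && n <= MAX_FUNCTION_KEY => Ok(n),
        _ => Err(format!(
            "only F1 through F{} can be stored, got `{}`",
            MAX_FUNCTION_KEY, name
        )),
    }
}
```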
xl-openai	cdde711fac	[codex] Avoid forced directory refresh during plugin install auth checks (#25381 ) ## Summary - Use normal directory loading for plugin install app metadata so install avoids forced directory refresh while still loading metadata on cold cache. - Continue force-refreshing codex_apps tools for auth state. - Add regression coverage that pre-warms the directory cache and asserts install returns cached app metadata without extra directory requests. ## Validation - just fmt - git diff --check - just test -p codex-app-server plugin_install_returns_apps_needing_auth plugin_install_filters_disallowed_apps_needing_auth (blocked locally: cargo-nextest is not installed)	2026-05-31 02:14:15 -07:00
Owen Lin	966932124c	fix: Limit Bedrock GPT models to default service tier (#25318 ) ## Description Bedrock currently only supports the implicit `default` service tier for GPT models. This PR strips non-default service tier metadata from Bedrock model catalogs so Codex does not advertise or send unsupported tiers. ## What changed - Normalize both built-in and configured Bedrock catalogs to default-only service tier behavior. - Add regression coverage for built-in and configured Bedrock catalogs. ## Validation - `just fmt` - `just test -p codex-model-provider`	2026-05-30 11:54:58 -07:00
jif-oai	8acaec73b6	Rename multi-agent v2 assignment tool (#25267 ) ## Summary - rename the multi-agent v2 follow-up task tool surface to assign_task - update core tests and spec-plan expectations - keep rollout-trace classification backward-compatible with legacy followup_task ## Tests - just fmt - just test -p codex-core multi_agents_spec::tests::assign_task_tool_requires_message_and_has_no_output_schema - just test -p codex-rollout-trace - just fix -p codex-core - just fix -p codex-rollout-trace Note: a broad just test -p codex-core run was attempted locally, but this sandbox produced unrelated environment failures around sandbox-exec, missing test_stdio_server, and realtime timeouts.	2026-05-30 14:13:05 +02:00
Eric Traut	3e7baa00e4	Add thread archive CLI commands (#25021 ) ## Problem Saved threads can already be archived through app-server RPCs, but the command line did not expose direct archive or unarchive commands. ## Solution Add `codex archive <thread>` and `codex unarchive <thread>`, resolving UUIDs or exact thread names before calling the existing `thread/archive` and `thread/unarchive` RPCs. The commands support scoped remote flags so callers can target remote app-server endpoints when archiving or unarchiving threads. This also fixes a long-standing bug in `codex resume <thread id>` and `codex fork <thread id>` that I found when testing the new commands. These operations shouldn't be allowed on archived sessions. They now fail with an error that tells the user to run `codex unarchive <thread id>` first. ## Verification Added app-server coverage for rejecting archived thread resume by id and checking that the error includes the matching `codex unarchive <thread id>` command.	2026-05-29 23:37:26 -07:00
Dylan Hurd	e0435afb72	feat(config) experimental_request_user_input toggle (#24541) ## Summary Adds an experimental flag for toggling `request_user_input`: `tools.experimental_request_user_input = false` ## Testing - [x] Added unit tests	2026-05-29 21:35:53 -07:00
Celia Chen	00ca857d3f	fix: Bedrock API key region fallback (#25171 ) ## Why Users following the Amazon Bedrock API-key setup can export `AWS_BEARER_TOKEN_BEDROCK` and `AWS_REGION`, but Codex's bearer-token auth path only accepted `model_providers.amazon-bedrock.aws.region`. That made the documented env-based setup fail with a missing-region error even though the standard AWS region environment variable was present. ## What Changed - Updates Bedrock bearer-token region resolution to use `model_providers.amazon-bedrock.aws.region` first, then fall back to `AWS_REGION`, then `AWS_DEFAULT_REGION`. - Updates the missing-region error to list all supported region sources. - Adds focused coverage for config precedence, `AWS_REGION`, `AWS_DEFAULT_REGION`, and the missing-region failure.	2026-05-30 01:17:38 +00:00
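The lookup order described above (configured region first, then `AWS_REGION`, then `AWS_DEFAULT_REGION`) can be sketched like this. The function name, its signature, and the error text are assumptions for illustration. The environment is passed in as a closure so the precedence can be exercised without touching the real process environment.

```rust
/// Resolves a region using the precedence from the commit message:
/// explicit config, then AWS_REGION, then AWS_DEFAULT_REGION.
/// `env` is any lookup function, e.g. `|k| std::env::var(k).ok()`.
fn resolve_bedrock_region(
    configured: Option<&str>,
    env: impl Fn(&str) -> Option<String>,
) -> Result<String, String> {
    configured
        .map(str::to_owned)
        .or_else(|| env("AWS_REGION"))
        .or_else(|| env("AWS_DEFAULT_REGION"))
        .ok_or_else(|| {
            // Hypothetical wording; the real error lists the same sources.
            "missing region: set model_providers.amazon-bedrock.aws.region, \
             AWS_REGION, or AWS_DEFAULT_REGION"
                .to_string()
        })
}
```

In a real process the call would look like `resolve_bedrock_region(None, |k| std::env::var(k).ok())`.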
Eric Ning	e929bb5c88	[codex] Update remote connector suggestions (#25172 ) ## Summary - Use the session-loaded plugin app IDs as the source of connector suggestion candidates. - Remove the redundant plugin reload from `tool_suggest_connector_ids()`. - Add regression coverage for connectors declared by a loaded remote plugin, using the Databricks app case. ## Context Loaded remote plugins can declare app connector IDs in `.app.json`. The session-owned `PluginsManager` already loads those plugins and exposes their effective app IDs. The connector suggestion path was creating a separate `PluginsManager` and recomputing plugin app IDs. That new manager does not share the session manager’s remote installed plugin cache, so app IDs from loaded remote plugins were missing from connector suggestions. ## Fix Pass the already-loaded effective app IDs into connector suggestion generation and use them directly as the plugin-derived connector candidate set. Connector candidates are now built from: - App IDs declared by loaded plugins - Explicitly configured connector discoverables - Existing disabled-suggestion filtering This avoids a second plugin-manager lookup and keeps connector suggestions aligned with the plugins actually loaded for the turn. ## Behavior For example, when a plugin is loaded and its `.app.json` declares data apps, `list_available_plugins_to_install` can now return those data connectors. This does not create plugin suggestions from the plugin itself. Plugin suggestions still come from eligible uninstalled entries in the marketplace catalog and require existing matching/filtering rules. ## Validation - `just fmt` - Added regression coverage for a loaded-plugin connector ID appearing in discoverable tools - Attempted `just test -p codex-core`; the command exited unsuccessfully in the local test environment without useful failure detail captured in the run output	2026-05-29 17:57:34 -07:00
Abhinav	a5a94ee5a7	Constrain Windows sandbox requirements (#23766)
# Why
Managed requirements can already constrain sandbox policy choices, but Windows sandbox implementation selection was still resolved independently from those requirements. That left the TUI able to continue through the unelevated fallback even when an organization wants to require the elevated Windows sandbox implementation.
# What
- Add `[windows].allowed_sandbox_implementations` requirements support for the Windows `elevated` and `unelevated` implementations.
- Apply that allowlist during core config resolution so disallowed configured or feature-selected Windows sandbox implementations fall back to an allowed implementation with the existing requirements warning path.
- Reuse the existing TUI Windows setup prompts to block disallowed unelevated continuation, keep required elevated setup in front of the user, and refuse to persist a TUI-selected Windows sandbox mode that requirements disallow.
# Semantics
| Allowed | Selected | Effective |
| --- | --- | --- |
| `["elevated"]` | `unelevated` / unset | `elevated` |
| `["unelevated"]` | `elevated` / unset | `unelevated` |
| `["elevated", "unelevated"]` | `elevated` | `elevated` |
| `["elevated", "unelevated"]` | `unelevated` | `unelevated` |
| `["elevated", "unelevated"]` | unset | `elevated` |

Availability is handled by interactive setup surfaces after allowlist resolution. If the effective elevated implementation is not ready, elevated-only requirements block on setup. When unelevated is also allowed, the UI may offer the existing unelevated fallback.
## TUI Screens
If elevated setup is not already complete:
```
Your organization requires the default Codex agent sandbox to continue. Set it up to protect your files and control network access. Learn more <https://developers.openai.com/codex/windows>
› 1. Set up default sandbox (requires Administrator permissions)
  2. Quit
```
If admin setup fails under `["elevated"]`:
```
Couldn't set up your sandbox with Administrator permissions
Your organization requires the default sandbox before Codex can continue. Learn more <https://developers.openai.com/codex/windows>
› 1. Try setting up admin sandbox again
  2. Quit
```
# Next Steps
- extend the requirements/readout surface, such as `configRequirements/read`, so clients can inspect the loaded `[windows].allowed_sandbox_implementations` requirement instead of inferring it from Windows setup state
- consider extending `windowsSandbox/readiness` as well
- update the App startup guide, setup flow, and banner surfaces so an elevated-only requirement omits any continue-unelevated escape hatch and blocks startup until a permitted implementation is ready
- preserve the existing unelevated fallback path when requirements allow it, including the `["unelevated"]` case where elevated is disallowed	2026-05-29 16:31:33 -07:00
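The Semantics table in the entry above can be expressed as a small resolution function. This is a sketch under assumptions: `SandboxImpl` and `effective_impl` are hypothetical names, and returning `None` for an empty allowlist is an illustrative choice, since the commit message does not say what happens in that case.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SandboxImpl {
    Elevated,
    Unelevated,
}

/// Applies the allowlist to the configured or feature-selected implementation.
/// A permitted selection is kept; otherwise the result falls back to an
/// allowed implementation, preferring `Elevated` as the table's last row does.
fn effective_impl(allowed: &[SandboxImpl], selected: Option<SandboxImpl>) -> Option<SandboxImpl> {
    if let Some(choice) = selected {
        if allowed.contains(&choice) {
            return Some(choice);
        }
    }
    if allowed.contains(&SandboxImpl::Elevated) {
        Some(SandboxImpl::Elevated)
    } else if allowed.contains(&SandboxImpl::Unelevated) {
        Some(SandboxImpl::Unelevated)
    } else {
        // Assumption: an empty allowlist yields no effective implementation.
        None
    }
}
```

Readiness of the elevated implementation is a separate concern. Per the commit, it is handled by the interactive setup surfaces after this resolution step.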
Noah MacCallum	8e5f561697	Filter plugin install suggestions by installed apps (#24996) ## Summary - Keep the original `TOOL_SUGGEST_DISCOVERABLE_PLUGIN_ALLOWLIST` as a fallback seed list, so users with no installed plugins still get initial install suggestions. - Allow additional install suggestions from trusted marketplaces: `openai-curated` and `openai-bundled`. - Require non-fallback, non-configured marketplace candidates to share `.app.json` connector IDs with already installed plugins. - Preserve explicit configured plugin discoverables as an override, while still omitting installed, disabled, and `NOT_AVAILABLE` plugins. ## Context `list_available_plugins_to_install` controls which plugins the model can trigger via `request_plugin_install`. We want a small starter set for empty/new users, but we also want installed workflow plugins to unlock relevant source plugins without maintaining every source plugin ID by hand. This keeps the legacy plugin ID allowlist only as the starter fallback. For everything else, the trusted marketplace is the candidate boundary, and installed app connector overlap is the relevance filter. For example, an installed Sales plugin can make HubSpot and Granola suggestible when those source plugins are in `openai-curated` and share Sales app connector IDs, while an unrelated test-source plugin with an app connector not declared by Sales stays hidden. ## Test Coverage - Empty/no-installed-plugin case: returns the fallback seed plugins from the original allowlist. - Installed-app expansion: returns non-fallback marketplace plugins only when their app connector IDs overlap with an installed plugin. - Sales workflow case: installed Sales declares HubSpot and Granola apps, so `hubspot@openai-curated` and `granola@openai-curated` are returned. - Sales negative case: `test-source@openai-curated` has an app connector not declared by Sales, so it is not returned. - Existing guardrails: installed plugins, disabled suggestions, and `NOT_AVAILABLE` plugins remain omitted; explicit configured discoverables still work as an override. ## Validation - `just fmt` - `just test -p codex-core plugins::discoverable::tests` - `just test -p codex-core` was attempted earlier, but current `main` / local env failed with unrelated existing failures around missing `test_stdio_server`, CLI/code-mode MCP tool setup, and unified_exec/shell snapshot flakes/timeouts. The touched discoverable tests pass.	2026-05-29 15:32:04 -07:00
Adam Perry @ OpenAI	a076b21730	Recommend Bazel VSCode extension. (#25161) Provides Starlark syntax highlighting and editor formatting.	2026-05-29 15:24:41 -07:00
Jinghan Xu	f2e7b462a9	[codex] Fix Vim normal mode editing (#25022 ) ## Summary - add Vim normal-mode `s` support to substitute the character under the cursor and enter insert mode - fix Vim normal-mode `o` so opening below the final line moves the cursor onto the new blank line - update keymap config/schema and keymap picker snapshots for the new action ## Validation - `just fmt` - `just write-config-schema` - `just test -p codex-config` - focused `just test -p codex-tui` coverage for the Vim `s` and `o` behavior, keymap conflict handling, and keymap picker snapshots - `cargo insta pending-snapshots --manifest-path tui/Cargo.toml` - `git diff --check` ## Notes A full `just test -p codex-tui` run still has two unrelated Guardian feature-flag failures in this checkout: - `app::tests::update_feature_flags_disabling_guardian_clears_review_policy_and_restores_default` - `app::tests::update_feature_flags_disabling_guardian_clears_manual_review_policy_without_history`	2026-05-29 14:01:27 -07:00
starr-openai	a717e4ef31	exec-server: preserve fs helper CoreFoundation env (#25118 ) ## Summary - preserve macOS `__CF_USER_TEXT_ENCODING` when launching the sandboxed fs helper - keep the fs-helper env narrow; this adds only the CoreFoundation startup var instead of copying the broader MCP stdio baseline - add focused coverage that the helper keeps that var without admitting `HOME` ## Diagnosis The sandboxed fs helper is not launched like a normal child process. Exec-server rebuilds its environment from an allowlist, then calls `env_clear()` before re-execing Codex with `--codex-run-as-fs-helper`. That helper dispatches before the normal Codex startup path and only needs to boot a small Tokio runtime, read one JSON request from stdin, perform the direct filesystem operation, and write one JSON response. The reported macOS hang sampled the helper before Rust main, in CoreFoundation initialization while resolving the default text encoding: `_CFStringGetUserDefaultEncoding -> getpwuid_r -> notify_register_check -> bootstrap_look_up3 -> mach_msg2_trap`. The fs-helper allowlist kept `PATH` and temp vars for runtime needs, but it dropped macOS `__CF_USER_TEXT_ENCODING`. Other Codex subprocess launchers that intentionally build a minimal Unix baseline, such as MCP stdio, already preserve that variable. My read is that stripping `__CF_USER_TEXT_ENCODING` forced this internal helper down CoreFoundation's fallback user-lookup path, and that lookup intermittently wedged on the affected machine before the helper could read stdin or touch the target file. Preserving only this macOS startup variable avoids that fallback without broadening the fs-helper environment to shell-like vars such as `HOME`, `USER`, locale settings, terminal settings, or proxy credentials. Internal Slack thread omitted from the public PR body. ## Validation - `cd codex-rs && just fmt` - `git diff --check`	2026-05-29 12:20:17 -07:00
Eric Traut	20da4c37c5	ci: use issue triage environment for issue workflows (#25134) ## Summary This adds `environment: issue-triage` to the Codex-calling issue workflow jobs so they can read the GitHub Environment Secret while staying on GitHub-hosted runners for public issue-triggered workflows.	2026-05-29 12:06:55 -07:00
sayan-oai	1f93706e99	[codex] Require model for standalone web search (#25131 ) ## Why The standalone `/v1/alpha/search` request now requires a `model`, but the `web.run` extension currently omits it. Adds `model` to extension `ToolCall` invocation. Follow-up to #23823. ## What changed - Make `SearchRequest.model` required. - Expose the effective per-turn model on extension tool calls and pass it in standalone web-search requests. - Assert the model is forwarded in the app-server round-trip test. ## Testing - `just test -p codex-api -p codex-tools -p codex-web-search-extension -p codex-memories-extension -p codex-goal-extension` - `just test -p codex-core -E 'test(passes_turn_fields_and_scoped_turn_item_emitter_to_extension_call)'` - `just test -p codex-app-server -E 'test(standalone_web_search_round_trips_encrypted_output)'`	2026-05-29 12:03:04 -07:00
Michael Bolin	a1ecf0cf1c	thread-store: store permission profiles (#23165 ) ## Why `SandboxPolicy` is the legacy compatibility shape, but `codex-thread-store` still exposed it through `StoredThread`, `ThreadMetadataPatch`, and live metadata sync. That kept thread-store consumers tied to the legacy representation and meant richer permission profile data could not round-trip through thread metadata or cold rollout reconciliation. ## What Changed - Replaced thread-store `sandbox_policy` API fields with canonical `PermissionProfile` fields. - Persist new permission-profile metadata as canonical JSON in the existing SQLite metadata slot while continuing to read older legacy sandbox policy values. - Updated local, in-memory, live metadata sync, and rollout extraction paths to propagate `TurnContextItem::permission_profile()`. - Re-materialize legacy permission metadata against the final rollout cwd when rollout-derived metadata replaces stale SQLite summaries. - Updated affected app-server and core test constructors to build `PermissionProfile` values directly. ## Test Plan - `cargo test -p codex-state` - `cargo test -p codex-thread-store` - `cargo test -p codex-app-server summary_from_stored_thread_preserves_millisecond_precision --lib` - `cargo test -p codex-core realtime_context --lib`	2026-05-29 11:55:31 -07:00
Channing Conger	c9dc0f6338	code-mode: introduce durable session interface (#24180) ## Summary Introduce a `CodeModeSession` interface for executing and managing code-mode cells. This moves cell lifecycle, callback delegation, termination, and shutdown behind a session abstraction. The existing in-process implementation stays in use, and the interface leaves room for an external-process implementation. A Codex session owns one `CodeModeSession`, which in turn owns its running cells and stored code-mode state. Each cell is represented to the caller as a `StartedCell`, exposing its cell ID and initial response. This PR also introduces a `CodeModeSessionDelegate` callback interface. A session uses the delegate to invoke nested host tools and emit notifications while a cell is running, allowing the runtime to communicate with its owning Codex session without depending directly on core turn handling.	2026-05-29 11:42:52 -07:00
Eric Horacek	451b386442	[exec-server] Kill dropped filesystem helpers (#25116 ) ## Summary - terminate sandbox filesystem helpers when the Tokio child handle is dropped ## Why A sandbox filesystem helper can stall during process startup before reading stdin. If the owning async operation is cancelled or torn down, the spawned helper should not remain running as an orphaned process. Setting `kill_on_drop(true)` gives the filesystem helper the cleanup behavior that Tokio child processes otherwise do not enable by default. This intentionally does not add a timeout. It does not detect or recover an active hung file edit while the owning future remains alive. A more precise startup-health mechanism can be handled separately. ## Validation - `just test -p codex-exec-server` (186 tests passed; benchmark smoke passed) - `just fmt` - `just fix -p codex-exec-server` - `git diff --check`	2026-05-29 11:40:44 -07:00
Owen Lin	fc9cf62efb	Add subagent lineage metadata for responsesapi (#24161 ) ## Why We recently added `forked_from_thread_id` which lets us trace where a thread's _context_ comes from, but we also want to understand subagent lineage (e.g. which parent thread spawned this subagent? what kind of subagent is it?) which is orthogonal. This PR adds `parent_thread_id` and `subagent_kind` to the `x-codex-turn-metadata` header sent to ResponsesAPI. ## What changed - Adds `parent_thread_id` and `subagent_kind` to core-owned `x-codex-turn-metadata`. - Restores persisted `SessionSource` and `ThreadSource` from resumed session metadata so cold-resumed subagent threads keep their lineage on later Responses API requests. - Centralizes parent-thread extraction on `SessionSource` / `SubAgentSource` and reuses it in the Responses client, analytics, agent control, and state parsing paths. - Extends reserved-key, git-enrichment, thread-spawn, and app-server v2 metadata coverage for the new lineage fields. ## Verification - Not run locally per request. - Added focused coverage in `core/src/turn_metadata_tests.rs` and `app-server/tests/suite/v2/client_metadata.rs`.	2026-05-29 11:28:12 -07:00
Eric Traut	62039e8d35	Use session wording in `/rename` confirmation (#25035) ## Why The TUI `/rename` confirmation should use the term "session" for consistency.	2026-05-29 11:09:40 -07:00
Eric Traut	36cd36626d	Add `/archive` slash command (#25027 ) ## Why TUI users can archive saved sessions from other surfaces, but there is no in-session command for archiving the active session. Since archiving the active session also exits the TUI, the command should ask for explicit confirmation instead of firing immediately. I'm also working on [a companion PR](https://github.com/openai/codex/pull/25021) that adds `codex archive` and `codex unarchive` top-level CLI commands. ## What changed - Adds a new `/archive` slash command described as `archive this session and exit`. - Shows a confirmation dialog with `No, don't archive` selected first and `Yes, archive and exit` as the explicit action. - On confirmation, calls the existing `thread/archive` app-server RPC for the active main session and exits after success. - Keeps `/archive` disabled while a task is running and unavailable in side conversations. ## Verification Added focused TUI coverage for the `/archive` confirmation flow, disabled-while-task-running behavior, and the `/ar` slash-command popup snapshot.	2026-05-29 11:07:19 -07:00
Eric Traut	1333f4a689	Align TUI permissions labels with app (#25017 ) ## Summary The desktop app now presents the on-request permissions mode as `Ask for approval` and the manual-review-backed mode as `Approve for me`. The TUI still exposed older/internal labels like `Default` and `Auto-review`, which made the same underlying settings look different across clients. This updates the TUI UX copy to match the app without changing the underlying default behavior. Fresh threads continue to use the existing on-request approval mode, now displayed as `Ask for approval`. The label changes cover `/permissions`, explicit profile permissions menus, status surfaces, config persistence history/error text, and the corresponding TUI snapshots.	2026-05-29 11:06:40 -07:00
iceweasel-oai	cb9178e8b3	Add Windows sandbox provisioning setup command (#24831 ) ## Why Some Windows users do not have local admin access, so they cannot complete the elevated portion of the Windows sandbox setup when Codex first needs it. This adds an alpha provisioning path that an admin or IT deployment script can run ahead of time for the Codex user. The intended managed-deployment shape is: ```powershell codex sandbox setup --elevated --user "$env:COMPUTERNAME\Alice" --codex-home "C:\Users\Alice\.codex" ``` `--elevated` is treated as the requested sandbox setup level, not as proof that the process is elevated. The Windows sandbox setup orchestration still checks that the caller is actually elevated before launching the helper without a UAC prompt. ## What changed - Added `codex sandbox setup --elevated` with explicit user selection via either `--current-user` or `--user ... --codex-home ...`. - Moved the CLI implementation into `cli/src/sandbox_setup.rs` instead of growing `cli/src/main.rs`. - Added a Windows sandbox `ProvisionOnly` helper mode that runs the elevation-required provisioning work without requiring a workspace cwd or runtime sandbox policy. - Reused the existing elevated helper path for creating/updating sandbox users, configuring firewall/WFP rules, and applying sandbox directory ACLs. - Persisted `windows.sandbox = "elevated"` into the target `CODEX_HOME` so the desktop app does not show the initial sandbox setup banner after pre-provisioning succeeds. 
## Validation - `cargo fmt -p codex-windows-sandbox -p codex-core -p codex-cli` - `cargo test -p codex-cli sandbox_setup --target-dir target\sandbox-setup-check` - `cargo test -p codex-windows-sandbox payload_accepts_provision_only_mode --target-dir target\sandbox-setup-check` - `git diff --check` - Manual Windows alpha flow with a standard local user (`Mandi Lavida`): ran the new setup command from an admin shell, verified the target `.codex` contents, sandbox marker/secrets, ACLs, firewall rules, and desktop startup without the sandbox setup banner once experimental network proxy requirements were disabled. ## Notes This intentionally does not solve later elevated update coordination for IT-managed deployments. The setup command can still apply provisioning updates when run again, but a broader coordination/process story is out of scope for this alpha.	2026-05-29 11:01:44 -07:00
Won Park	10b0399034	Route extension image generation through the native image completion pipeline (#24972 ) ## Why The standalone `image_gen.imagegen` extension should behave like native image generation for artifact persistence and UI completion, while returning its save-location guidance as part of the tool result instead of injecting a developer message. ## What Changed - Added an image-generation completion hook for extension tools so core can persist generated images and emit the existing `ImageGeneration` lifecycle events. - Reused core image artifact persistence for extension output and removed extension-local save-path/file-writing logic. - Split shared image persistence from built-in finalization so native image generation keeps its existing developer-message instruction behavior. - Returned the generated image save-location instruction through the extension `FunctionCallOutput`, alongside the generated image input for model follow-up. - Preserved the existing image-generation event shape for current UI and replay compatibility. - Avoided cloning the full generated-image base64 payload when emitting the in-progress image item. - Removed dependencies no longer needed after moving persistence out of the extension crate. ## Fast Follow - Adjust the existing Extension API and add a general `TurnItem` finalization path for re-usability of code ## Validation - Ran `just fmt`. - Ran `just bazel-lock-update`. - Ran `just bazel-lock-check`. - Ran `just test -p codex-tools -p codex-extension-api -p codex-image-generation-extension`. - Ran `just test -p codex-core image_generation_publication_is_finalized_by_core`. - Ran `just test -p codex-core handle_output_item_done_records_image_save_history_message`. - Ran `just fix -p codex-tools -p codex-extension-api -p codex-core -p codex-image-generation-extension`.	2026-05-29 17:33:13 +00:00
Adam Perry @ OpenAI	3e666dd32a	[codex] Wait for MCP readiness in core integration tests (#24964 ) Ensures MCP-backed `codex-core` integration tests exercise initialized servers instead of racing server startup. I've been idly investigating a few flakes and the failure modes are much more confusing when a tool call fails because of a failed server start than when the failed server start causes the test to fail directly.	2026-05-29 10:22:27 -07:00
xl-openai	e29bbb5368	feat: Add focused diagnostics for MCP HTTP send failures (#25013 ) Adds failure-only logging for MCP streamable HTTP post_message calls and the underlying reqwest send path, capturing the MCP method/request id, endpoint shape, auth-header presence, timeout/connect classification, and sanitized error source chain without logging headers, bodies, tokens, or full URLs.	2026-05-29 10:09:33 -07:00
jif-oai	f4e9d2caac	Move config document helpers into their own module (#25110 ) ## Why `core/src/config/edit.rs` owns the config edit state machine, but it also carried the TOML document helper code inline as a nested module. Moving those helpers into their own file keeps the edit orchestration easier to scan without changing the config persistence behavior. ## What changed - Moved the existing `document_helpers` module from `core/src/config/edit.rs` into `core/src/config/edit/document_helpers.rs`. - Added `mod document_helpers;` so the existing `pub(super)` helper API remains available to the rest of `config::edit`. ## Testing Not run; this is a refactor-only module extraction with no intended behavior change.	2026-05-29 18:49:21 +02:00
sayan-oai	96f1347fa3	Show activity for standalone web search calls (#24693 ) ## Why Standalone `web.run` calls run in the extension, so they need normal web-search progress activity while a request is in flight and durable completed activity after a thread is reloaded. Follow-up to #23823; uses the extension turn-item emission path added in #24813. ## What changed - Emit standalone `web.run` start/completion items through the host turn-item emitter, preserving standard client delivery and rollout persistence. - Include useful completion detail for queries, image queries, and literal-URL `open`/`find` commands. - Render completed searches as `Searched the web` or `Searched the web for <detail>`, with snapshot coverage for the detail-free case. - Extend the app-server round-trip test to verify completed search activity is reconstructed by `thread/read` after a fresh-process reload. ## Testing - `just test -p codex-web-search-extension` - `just test -p codex-app-server -E "test(standalone_web_search_round_trips_encrypted_output)"`	2026-05-29 16:12:58 +00:00
Ahmed Ibrahim	5577a9e148	[codex] Add model tool mode selector (#25031 ) ## Why Some models need to select their code-execution behavior through model catalog metadata. Models without that metadata must continue to follow the existing `CodeMode` and `CodeModeOnly` feature flags, including when a newer server sends an enum value this client does not recognize. ## What changed - add optional `ModelInfo.tool_mode` metadata with `direct`, `code_mode`, and `code_mode_only` - treat omitted and unknown wire values as `None` - resolve `None` from the existing feature flags - carry the resolved `ToolMode` directly on `TurnContext`, outside `Config` - use the resolved value for turn creation, model switches, review turns, tool planning, and code execution ## Coverage - add protocol coverage for omitted, known, and unknown enum values - add focused coverage for flag fallback and explicit metadata overriding feature flags - add core integration coverage that fetches remote model metadata through `/v1/models` and verifies the outbound `/responses` tools for explicit `direct` and `code_mode_only` selectors ## Stack - followed by #25032	2026-05-29 09:05:05 -07:00
Abhinav	251b2412b2	Render multiline hook output in TUI (#24965 ) # Why Fixes #24529. Completed hook output in the TUI rendered each `HookOutputEntry` as one ratatui line, so explicit newlines inside hook output were not shown as separate transcript rows. That made multiline `SessionStart.additionalContext` hard to inspect even though the model-facing context path preserved the original text. # What - Split completed hook output entries on explicit newlines before rendering them in `codex-rs/tui/src/history_cell/hook_cell.rs`. - Keep the hook output prefix, such as `hook context:` or `warning:`, on the first physical line only. - Preserve explicit blank lines and render continuation lines with the hook body indent. - Add unit coverage for multiline context and warning output, plus a chatwidget snapshot regression for `SessionStart` history output. # Testing - `cargo nextest run -p codex-tui completed_hook_multiline hook_completed_before_reveal_renders_completed_without_running_flash` - `just argument-comment-lint -p codex-tui -- --ignore-rust-version --lib --tests`	2026-05-29 15:12:40 +00:00
jif-oai	b40ad0d84d	Remove stale rollout TODO tests (#25106 ) ## Summary Remove a stale `TODO(jif)` block of commented-out rollout listing tests that still referenced an older listing API. The current rollout listing behavior is covered by the active state DB and filesystem fallback tests, so keeping the dead commented tests just adds noise. ## Validation - `just fmt` - `just test -p codex-rollout`	2026-05-29 17:09:00 +02:00
jif-oai	27e256bc40	Handle goal usage limits from turn errors (#25095 ) ## Summary - handle goal usage-limit turn errors in the goal extension - exercise the extension path in the goal backend test ## Tests - just fmt - just test -p codex-goal-extension - just fix -p codex-goal-extension	2026-05-29 15:39:05 +02:00
jif-oai	1c55bb2702	[codex] Improve built-in tool schema docs (#24794 ) ## Summary - Clarify default, omission, and bounded behavior across built-in tool schemas, including unified exec, classic shell, Code Mode exec/wait, multi-agent, agent job, MCP resource, image, goal, plan, tool_search, and test-sync fields. - Convert update_plan status to an enum and add short field descriptions where the schema previously relied on surrounding context. - Remove the dedicated permission-approval schema test and keep only updates to existing expected-spec tests. ## Validation - Ran `just fmt`. - Ran `git diff --check`. - Did not run clippy or tests, per request. Regressions were evaluated [here](https://openai.slack.com/archives/C09GDSP1J9X/p1779905065496949) and none were found.	2026-05-29 13:32:19 +02:00
jif-oai	3deda3116c	fix: main (#25075 )	2026-05-29 12:53:31 +02:00
jif-oai	191c39aa75	Drop debug-client prompt state tracking (#25070 ) Deletes `codex-rs/debug-client/src/state.rs` as one step in removing the stale app-server debug client. This intentionally leaves Cargo workspace and lockfile cleanup for a later follow-up PR.	2026-05-29 12:51:23 +02:00
jif-oai	43fa4e5d25	Remove debug-client server event reader (#25069 ) Deletes `codex-rs/debug-client/src/reader.rs` as one step in removing the stale app-server debug client. This intentionally leaves Cargo workspace and lockfile cleanup for a later follow-up PR.	2026-05-29 12:51:19 +02:00
jif-oai	5c1387846d	Delete debug-client JSONL output helper (#25068 ) Deletes `codex-rs/debug-client/src/output.rs` as one step in removing the stale app-server debug client. This intentionally leaves Cargo workspace and lockfile cleanup for a later follow-up PR.	2026-05-29 12:51:16 +02:00
jif-oai	e2b8ec616a	Remove the debug-client CLI entrypoint (#25067 ) Deletes `codex-rs/debug-client/src/main.rs` as one step in removing the stale app-server debug client. This intentionally leaves Cargo workspace and lockfile cleanup for a later follow-up PR.	2026-05-29 12:51:12 +02:00
jif-oai	3d3cc5a953	Retire debug-client interactive command parsing (#25066 ) Deletes `codex-rs/debug-client/src/commands.rs` as one step in removing the stale app-server debug client. This intentionally leaves Cargo workspace and lockfile cleanup for a later follow-up PR.	2026-05-29 12:51:09 +02:00
jif-oai	1197c7d654	Delete debug-client app-server process plumbing (#25065 ) Deletes `codex-rs/debug-client/src/client.rs` as one step in removing the stale app-server debug client. This intentionally leaves Cargo workspace and lockfile cleanup for a later follow-up PR.	2026-05-29 12:51:05 +02:00
jif-oai	a9a92cbb0a	Remove the generated debug-client README (#25064 ) Deletes `codex-rs/debug-client/README.md` as one step in removing the stale app-server debug client. This intentionally leaves Cargo workspace and lockfile cleanup for a later follow-up PR.	2026-05-29 12:51:01 +02:00
jif-oai	fc8c723553	Drop the stale debug-client manifest (#25063 ) Deletes `codex-rs/debug-client/Cargo.toml` as one step in removing the stale app-server debug client. This intentionally leaves Cargo workspace and lockfile cleanup for a later follow-up PR.	2026-05-29 12:50:58 +02:00

6994 Commits