codex

mirror of https://github.com/openai/codex.git synced 2026-05-19 10:43:38 +00:00

Author	SHA1	Message	Date
Eric Traut	0d344aca9b	goal: pause continuation loops on usage limits and blockers (#23094 ) Addresses #22833, #22245, #23067 ## Why `/goal` can keep synthesizing turns even when the next turn cannot make meaningful progress. Hard usage exhaustion can replay failing turns, and repeated permission or external-resource blockers can keep burning tokens while waiting for user or system intervention. ## What changed - Add resumable `blocked` and `usageLimited` goal states. As with `paused`, goal continuation stops with these states. - Move to `usageLimited` after usage-limit failures. - Allow the built-in `update_goal` tool to set `blocked` only under explicit repeated-impasse guidance. Updated the goal continuation prompt to specify that the agent should use `blocked` only when it has made at least three attempts to get past an impasse. Most of the files touched by this PR changed because of the small app-server protocol update. ## Validation I manually reproduced a number of situations where an agent can run into a true impasse and verified that it properly enters `blocked` state. I then resumed and verified that it once again entered `blocked` state several turns later if the impasse still exists. I also manually reproduced the usage-limit condition by creating a simulated responses API endpoint that returns 429 errors with the appropriate error message. Verified that the goal runtime properly moves the goal into `usageLimited` state and that the TUI updates appropriately. Verified that `/goal resume` resumes (and immediately goes back into `usageLimited` state if appropriate). ## Follow-up PRs Small changes will be needed to the GUI clients to properly handle the two new states.	2026-05-18 11:28:53 -07:00
efrazer-oai	d32cb2c6ac	fix: harden plugin creator sharing validation (#22893 ) # Summary Before this change, the sample plugin creator could emit placeholder-heavy manifests that fail workspace sharing, and it chose a repo-local marketplace implicitly whenever it ran from inside a git checkout. This PR makes generated plugins share-ready by default. It switches creation to the personal marketplace unless the caller explicitly opts into repo-local paths, adds a validator that mirrors the workspace plugin ingestion contract, and updates the skill prompt and docs to describe the real flow. The goal is to stop malformed generated plugins before they reach sharing and to make the default placement match the personal marketplace behavior users expect. ## Changes - Generate share-safe plugin manifests instead of `[TODO: ...]` placeholder payloads. - Default plugin and marketplace creation to `~/plugins` and `~/.agents/plugins/marketplace.json`. - Keep repo-local marketplace creation available through explicit `--path` and `--marketplace-path` arguments. - Add `validate_plugin.py` to check manifests, companion files, skill frontmatter, skill agent YAML, asset paths, and backend-shaped contracts before sharing. - Refresh the plugin creator skill text, reference docs, and default prompt to describe validation and the personal default. ## Design decisions - The validator tracks the workspace ingestion schema directly, including the required `defaultPrompt` alias handling and skill `agents/openai.yaml` checks. - The validator keeps one intentional extra preflight rule: leftover `[TODO: ...]` placeholders are rejected before sharing even when a single placeholder would not independently violate backend type validation. - Repo-local creation stays possible, but it is now explicit instead of cwd-sensitive. ## Testing Tests: targeted Python syntax checks, plugin skill validation, staged diff whitespace validation, 15 generated plugin smoke runs, backend manifest-schema acceptance for all 15 generated bundles, and a git-repo cwd regression proving the creator still writes to the personal marketplace by default.	2026-05-18 11:22:42 -07:00
starr-openai	8c14b08dd1	Upload rust full CI JUnit reports (#23273 ) ## Why `rust-ci-full` failures currently leave downstream investigation reconstructing basic test facts from raw logs. `cargo nextest` can emit standard JUnit XML for each lane, which gives us a small structured artifact for post-run failure analysis without changing the test execution model. ## What changed - enable nextest JUnit output in `codex-rs/.config/nextest.toml` - upload the lane-scoped JUnit XML artifact from each `rust-ci-full` test lane ## Verification - `rust-ci-full` run `26018931531` on head `52d77c60e79b36859d944ef28a36b014055c5c48` produced JUnit artifacts for macOS, Linux x64 remote, Windows x64, and Windows ARM64 test lanes - `rust-ci-full` run `26021241006` on the same head produced the missing Linux ARM JUnit artifact after the first run lost that runner before export - downloaded all five lane JUnit artifacts and verified each contains non-empty test counters and failure data	2026-05-18 11:10:37 -07:00
iceweasel-oai	b1c13b6fe5	Simplify legacy Windows sandbox ACL persistence (#22569 ) ## Why The legacy Windows sandbox still carried a `persist_aces` mode switch, even though the only path that meaningfully applies filesystem ACEs today is `workspace-write`, which already uses the persistent behavior. Legacy read-only sessions rely on the read-only capability SID rather than per-command filesystem ACE mutation, so the temporary cleanup branch had become conceptual overhead without a corresponding behavioral need. Removing that split makes the ACL lifecycle match the current sandbox model more directly and trims the guard/revocation plumbing from the legacy launcher paths. ## What changed - Removed the `persist_aces` parameter from legacy ACL preparation. - Made legacy deny-read handling always use the persistent reconciliation path. - Dropped guard tracking and post-exit ACE revocation from both capture and unified-exec legacy flows. - Kept workspace `.codex` / `.agents` protection tied directly to `WorkspaceWrite` instead of an intermediate persistence flag. ## Verification - `cargo fmt -p codex-windows-sandbox` - `git diff --check` - `cargo test -p codex-windows-sandbox` - 85 passed, 2 ignored, 2 (unrelated) failed locally.	2026-05-18 11:00:03 -07:00
starr-openai	9286ff2805	Fix remote turn diff display roots (#23261 ) ## Why `TurnDiffTracker` computes a display root so turn diffs can be rendered repo-relative. For remote exec-server turns, the selected turn `cwd` may exist only inside the selected environment, but `run_turn` was discovering the git root through the local host filesystem. When that lookup failed, nested remote-session diffs fell back to the nested `cwd` and showed `/tmp/...`-prefixed paths instead of repo-relative paths. ## What changed - Resolve the diff display root from the primary selected turn environment when one exists, using that environment's filesystem and `cwd`. - Add `codex_git_utils::get_git_repo_root_with_fs(...)` so git-root discovery can run against an `ExecutorFileSystem`, including remote environments. - Reuse that helper from `resolve_root_git_project_for_trust(...)` and add coverage for `.git` gitdir-pointer detection. ## Validation - Devbox Bazel: `//codex-rs/core:core-unit-tests --test_filter=get_git_repo_root_with_fs_detects_gitdir_pointer` - Devbox Docker-backed remote-env repro: `//codex-rs/core:core-all-test --test_filter=apply_patch_turn_diff_paths_stay_repo_relative_when_session_cwd_is_nested`	2026-05-18 10:53:49 -07:00
Felipe Coury	bb43044cba	fix(tui): show shutdown feedback on exit (#23323 ) ## Why Ctrl+C can take a noticeable amount of time to finish when the TUI is waiting for the app-server thread shutdown path to complete. Before this change, the UI could look like it had not accepted the shutdown request because the composer and cursor remained in their normal interactive state during that wait. This PR makes the accepted shutdown visible immediately. It does not add an artificial sleep or change the shutdown timeout; it only draws one final feedback frame before continuing through the existing shutdown flow. ## What Changed - On `ExitMode::ShutdownFirst`, the TUI now renders shutdown feedback before awaiting the existing thread shutdown future. - The bottom pane disables composer input, which hides the cursor through the existing disabled-input cursor path. - The composer shows `Shutting down...` as the disabled input hint and suppresses footer content so the shutdown acknowledgement is not competing with shortcut/status text. - The logout path uses the same feedback path before shutting down. ## How to Test 1. Start Codex from this branch. 2. Press `Ctrl+C` to request shutdown. 3. If shutdown takes long enough to observe, confirm the composer changes to `› Shutting down...`, the cursor disappears, and no footer hint is rendered below it. 4. Regression check: repeat with text already typed in the composer and confirm the visible row still switches to `Shutting down...` while the draft remains preserved internally until the process exits. Targeted tests: - `cargo test -p codex-tui shutdown_in_progress_disables_input_and_uses_hint_without_footer` - `cargo test -p codex-tui bottom_pane::footer::tests::` ## Local Validation Note `cargo test -p codex-tui` still aborts in `app::tests::discard_side_thread_removes_agent_navigation_entry` with a stack overflow. That same test also failed when run alone locally, and the failure appears unrelated to this shutdown feedback path.	2026-05-18 14:41:14 -03:00
iceweasel-oai	d335b00212	windows: link MSVC release binaries with static CRT (#22905 ) ## Why Windows release artifacts currently import `VCRUNTIME140.dll` and `VCRUNTIME140_1.dll`. That becomes observable on clean Windows machines that do not already have the VC++ runtime available globally: - Desktop Store launches can fail after the app relocates `codex.exe` out of `WindowsApps`, which means an MSIX-level VCLibs dependency does not protect the relocated CLI/app-server process. - The npm CLI path reproduces the same missing-DLL startup failure when `System32\vcruntime140_1.dll` is hidden and `PATH` is stripped of incidental fallback copies. In that setup, the existing Windows binary exits with `0xC0000135` / `-1073741515` before Codex code runs. ## What changed - Add `-C target-feature=+crt-static` to the existing MSVC-only Cargo rustflags in `codex-rs/.cargo/config.toml`. - Preserve the existing `/STACK:8388608` linker setting in the same target block. This keeps the change scoped to Windows MSVC builds and avoids altering non-Windows or GNU target behavior. ## Verification I built an x64 Windows release probe with static CRT linkage and the normal 8 MiB stack reserve, then verified: - `dumpbin /dependents codex.exe` no longer reports `VCRUNTIME140.dll` or `VCRUNTIME140_1.dll`. - `dumpbin /headers codex.exe` reports `800000 size of stack reserve`. - With `System32\vcruntime140_1.dll` hidden and `PATH` stripped to Windows system directories only: - the old npm CLI path exits `-1073741515` - the rebuilt static-CRT `codex.exe --version` succeeds with exit code `0` - the rebuilt TUI starts successfully I also confirmed `codex.exe app-server --listen ws://127.0.0.1:0` starts and binds normally with the static-CRT artifact.	2026-05-18 10:32:33 -07:00
jif-oai	3f2b7ede0b	nit: read prompt (#23332 )	2026-05-18 19:25:27 +02:00
pakrym-oai	82061660ae	[codex] Remove legacy shell output formatting paths (#22706 ) ## Why The client and tool pipeline still carried compatibility code for legacy structured shell output. Current shell and apply_patch responses are already plain text for model consumption, so keeping a JSON-serialization path plus shell-item rewrite logic makes the request formatter and tests preserve a format we do not need anymore. ## What Changed - Removed the client-side shell output rewrite from `core/src/client_common.rs`. - Removed the structured exec-output formatter and the shell `freeform` switch so tool emitters use one model-facing formatter. - Collapsed apply_patch/shell serialization tests around the remaining plain-text output expectations and removed duplicate one-variant parameterized cases. - Kept the `ApplyPatchModelOutput::ShellCommandViaHeredoc` compatibility input shape, but no longer treats it as a separate output-format mode. ## Validation - `cargo test -p codex-core client_common` - `cargo test -p codex-core shell_serialization` - `cargo test -p codex-core apply_patch_cli` - `just fix -p codex-core` ## Documentation No external Codex documentation update is needed.	2026-05-18 09:57:54 -07:00
Eric Traut	adca1b643f	[1 of 2] Optimize TUI startup terminal probes (#23175 ) ## Why Codex TUI startup still feels slower than 0.117.0 after the app-server move in 0.118.0. A visible chunk of launch-to-input latency comes from serial terminal startup probes: cursor position, keyboard enhancement support, and default foreground/background color queries can each wait on terminal responses before the first usable frame. Refs #16335. ## What This PR batches the terminal startup probes into one bounded probe. It also reuses the probed cursor position and default colors during TUI setup, fast-paths the primary-device-attributes fallback as keyboard enhancement unsupported, and keeps lightweight startup timing logs for future tuning. The startup telemetry is intentionally left in production: it records phase timings for terminal probes and initial-frame scheduling so future startup regressions can be diagnosed from normal logs rather than re-adding one-off debug instrumentation. ## Benchmark In the local pty startup benchmark, the pre-optimization `main` baseline was about 250.5ms median from launch to accepted chat input. This probe-only branch measured about 152ms median, for an approximate savings of 95-100ms. ## Stack 1. [#23175: [1 of 2] Optimize TUI startup terminal probes](https://github.com/openai/codex/pull/23175) — this PR 2. [#23176: [2 of 2] Start fresh TUI thread in background](https://github.com/openai/codex/pull/23176) — layered on this PR ## Verification - `cargo test -p codex-tui`	2026-05-18 09:04:02 -07:00
Eric Traut	e734cb5713	Hide ChatGPT usage link for non-OpenAI status (#23127 ) Addresses #22778 ## Summary Provider deployments such as Bedrock manage rate limits and billing outside ChatGPT, so the `/status` link to the ChatGPT usage page is irrelevant and confusing for those users. Custom providers that are explicitly configured to use OpenAI/ChatGPT auth still point at OpenAI-backed usage, so they should keep the link. ## Changes - Render the ChatGPT usage note only when the configured provider uses OpenAI auth. - Keep the note hidden when `/status` displays a provider such as Bedrock that manages limits elsewhere. - Add regression coverage for both Bedrock and a custom OpenAI-auth proxy provider. ## Manual Repro 1. Configure Codex with a non-OpenAI-auth provider, for example `model_provider = "amazon-bedrock"`. 2. Start the TUI and run `/status`. 3. Confirm the status card shows the custom provider, for example `Model provider: Amazon Bedrock`, and does not show `https://chatgpt.com/codex/settings/usage`. 4. Configure a custom provider that proxies to OpenAI and has OpenAI/ChatGPT auth enabled. 5. Run `/status` again and confirm the ChatGPT usage link appears for that OpenAI-auth provider.	2026-05-18 09:02:38 -07:00
Eric Traut	deb159d9ff	Fix TUI stream cleanup after turn errors (#23128 ) ## Summary Fixes #22726. After a Responses stream disconnect, the live TUI could keep accepting prompts while leaving partially streamed assistant output in its transient streaming-cell form. That made fenced diffs or SVG/XML-like content appear as raw transcript text until the user closed the TUI and resumed the same session, which rebuilt the transcript from saved history. This change finalizes the active answer stream before generic failed-turn cleanup clears the stream controller, so the live transcript takes the same source-backed markdown consolidation path as a successful turn. ## Reviewer repro 1. Start a local Codex TUI session. 2. Trigger an assistant turn that streams markdown content, especially a fenced diff or SVG/XML-like block. 3. Force or encounter a non-retry stream disconnect before the turn completes. 4. Continue using the same still-open TUI session. 5. Before this fix, the live history can stay raw/plain even though `codex resume` renders the same session normally. 6. After this fix, the failed-turn path consolidates the partial stream before rendering the error, so the live TUI keeps normal transcript rendering.	2026-05-18 09:00:57 -07:00
Eric Traut	af6ffb6ebb	Support --output-schema for exec resume (#23123 ) ## Why `codex exec resume` should have the same structured-output support as top-level `codex exec`. Without `--output-schema`, multi-turn automation has to choose between resumed session context and schema-validated JSON output. Fixes #22998. ## What changed - Marked `--output-schema` as a global `codex exec` flag so it can be passed after `resume`. - Reused the existing output schema plumbing so resumed turns attach the schema to the final response request while preserving session context.	2026-05-18 08:55:22 -07:00
Eric Traut	fce10e009d	tui: keep cleared Fast tier from reappearing after side-thread resume (#23121 ) ## Why After turning Fast mode off in the TUI, returning from a side thread could make `Fast` appear again in the main chat widget. The opt-out itself was still persisted; the display was being rebuilt from stale cached `ThreadSessionState` data, which made it look like Fast had been re-enabled. Fixes #23104. ## What changed - Keep the active thread's cached `service_tier` in sync whenever the user persists a service-tier selection. - Update both the primary-thread snapshot and the thread event store so restored TUI state reflects the current tier. - Add a focused regression test for clearing a cached Fast tier. ## Manual repro 1. Start a TUI session where `Fast` is enabled by default. 2. Run `/fast` and turn Fast mode off. Confirm `Fast` disappears from the chat widget display. 3. Re-enter thread navigation via either path: - Run `/side test`, then return to the main thread. - Run `/agent`, enter a child thread, then return to the main thread. 4. Before this fix, `Fast` reappears in the main chat widget display even though the opt-out was already persisted. 5. After this fix, `Fast` stays cleared. ## Verification - `cargo test -p codex-tui app::thread_session_state::tests::service_tier_sync_updates_active_cached_session -- --exact`	2026-05-18 08:52:18 -07:00
jif-oai	4ca60ef9ff	Emit goal update events from goal extension tools (#23306 ) ## Why Goal creation and completion are moving through the goal extension, but the rest of Codex still observes goal state through `ThreadGoalUpdated` events. Without an event from the extension-owned tool path, a model-initiated `create_goal` or `update_goal` can mutate the backend and return a tool result while app-server and TUI listeners miss the goal state transition. ## What changed - Added `GoalEventEmitter` as a small wrapper around the host `ExtensionEventSink` to build `EventMsg::ThreadGoalUpdated` events for goal updates. - Threaded the registry event sink into `GoalExtension` and the `GoalToolExecutor`s created by the extension. The public `GoalExtension::new` constructor keeps a `NoopExtensionEventSink` fallback for standalone use. - Emitted a goal update after successful `create_goal` and `update_goal` tool calls. Until `ToolCall` exposes the current turn submission id, these events use the tool call id as the event id and leave `turn_id` unset. Relevant code: - `GoalEventEmitter::thread_goal_updated`: `1fe2d73890/codex-rs/ext/goal/src/events.rs` (L19-L32) - `GoalToolExecutor` emission points: `1fe2d73890/codex-rs/ext/goal/src/tool.rs` (L161-L190) ## Testing - `cargo test -p codex-goal-extension`	2026-05-18 16:14:37 +02:00
jif-oai	b631d92170	chore: make token usage async (#23305 ) Make the `TokenUsageContributor` async. This will be required for future extension work and is basically free.	2026-05-18 15:59:06 +02:00
jif-oai	500ef67ed1	chore: goal resumed metrics (#23301 ) Add metrics for goal resume	2026-05-18 15:19:23 +02:00
jif-oai	7ee7fe239f	chore: isolate thread goal storage behind GoalStore (#23295 ) ## Why Thread goal persistence is being prepared for a dedicated storage boundary. Before that split, goal-specific reads, writes, accounting, and cleanup were exposed directly on `StateRuntime`, so core and app-server callsites stayed coupled to the full runtime instead of a goal-specific store. This PR introduces that boundary without changing the goal wire API or current persistence behavior. Callers now go through `StateRuntime::thread_goals()` and the new `GoalStore`, while `GoalStore` still uses the existing state DB pool underneath. ## What changed - Added `GoalStore` in `state/src/runtime/goals.rs` and exposed it from `StateRuntime` via `thread_goals()`. - Moved thread-goal reads, writes, status updates, pause, delete, and usage accounting onto `GoalStore`. - Updated core session goal handling, app-server goal RPCs, resume snapshots, and goal tests to use the store boundary. - Kept thread deletion responsible for cascading goal cleanup by deleting the goal through the store only after a thread row is removed. ## Testing - Existing goal persistence, resume, and accounting tests were updated to exercise the new `GoalStore` access path.	2026-05-18 14:47:05 +02:00
jif-oai	6a8173588c	feat: add extension event sink capability (#23293 ) ## Why Extensions can already expose typed contributions and receive host capabilities such as `AgentSpawner`, but they do not have a typed way to send protocol events back through the host. Extensions that need to surface progress or status should not have to own persistence, ordering, transport fanout, or logging decisions themselves. ## What - Add `ExtensionEventSink`, a host-provided fire-and-forget sink for `codex_protocol::protocol::Event`. - Add `NoopExtensionEventSink` so hosts that do not expose extension event emission keep the existing empty-registry behavior. - Store the sink on `ExtensionRegistryBuilder` / `ExtensionRegistry`, with `with_event_sink(...)` and `event_sink()` accessors, and re-export the new capability from `codex-extension-api`. ## Testing - Not run locally; PR metadata/body update only.	2026-05-18 14:08:56 +02:00
jif-oai	9531e932ef	Make extension lifecycle hooks async (#23291 ) ## Why Extension lifecycle hooks sit on the host/extension boundary, but the current trait surface only allows synchronous callbacks. That forces extensions that need to seed, rehydrate, observe, or flush extension-owned state during thread and turn transitions to either block inside the callback or move async work into separate host plumbing. This PR makes those lifecycle callbacks awaitable so extension implementations can perform async work directly at the lifecycle point where the host already has the relevant session, thread, or turn stores available. ## What changed - Makes `ThreadLifecycleContributor` and `TurnLifecycleContributor` async in `codex-extension-api`. - Awaits thread start/resume/stop and turn start/stop/abort lifecycle callbacks from `codex-core`. - Updates the guardian and memories extensions to implement the async lifecycle trait surface. - Updates the existing lifecycle tests to use async contributor implementations. - Adds `async-trait` to the crates that now expose or implement these async object-safe lifecycle traits. ## Testing - Existing `codex-core` lifecycle tests were updated to cover async implementations for thread stop and turn abort ordering.	2026-05-18 13:53:58 +02:00
jif-oai	a80f07ec4a	chore: goal ext skeleton (#23288 ) Skeleton of `/goal` in extension. Lots of follow-ups coming.	2026-05-18 13:32:21 +02:00
xli-oai	da14dd2add	[codex] Add installed-plugin mention API (#22448 ) ## Summary - add app-server `plugin/installed` for mention-oriented plugin loading - return installed plugins plus explicitly requested install-suggestion rows - keep remote handling on installed-state data instead of the broad catalog listing path ## Why The `@` mention surface only needs plugins that are usable now, plus a small product-approved set of install suggestions. It does not need the full catalog-shaped `plugin/list` payload that the Plugins page uses. ## Validation - `just write-app-server-schema` - `just fmt` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-core-plugins` - `cargo test -p codex-app-server --test all plugin_installed_` ## Notes - The package-wide `cargo test -p codex-app-server` run still hits an existing unrelated stack overflow in `in_process::tests::in_process_start_clamps_zero_channel_capacity`. - Companion webview PR: https://github.com/openai/openai/pull/915672	2026-05-18 03:11:54 -07:00
jif-oai	22dd9ad392	Densify and version memory summaries (#23148 ) ## Why `memory_summary.md` is injected into every session, so its value depends on staying compact, navigational, and easy to regenerate when the expected shape changes. The previous consolidation prompt encouraged a broad actionable inventory and allowed older summary structures to be patched in place, which makes it easier for stale or overly verbose summaries to keep accumulating. This change makes the summary format explicitly versioned and biases Phase 2 memory consolidation toward denser prompt-loaded context. ## What changed - Require `memory_summary.md` to begin with an exact `v1` header. - Teach consolidation to regenerate `memory_summary.md` from scratch when the header is missing or incompatible, while still allowing incremental updates to `MEMORY.md`. - Tighten the `memory_summary.md` instructions so it acts as a compact routing/index layer instead of a second handbook. - Lower `MEMORY_TOOL_DEVELOPER_INSTRUCTIONS_SUMMARY_TOKEN_LIMIT` from `5_000` to `2_500` so the runtime prompt budget matches the denser summary target. ## Verification Not run; this is a prompt/template update plus a prompt budget constant change.	2026-05-18 09:59:34 +02:00
starr-openai	64ead6a83a	Add exec-server websocket keepalive (#23226 ) ## Summary - send periodic websocket Ping frames from outbound exec-server websocket clients - cover direct exec-server websocket clients plus rendezvous harness/executor websocket connections - keep inbound axum-accepted exec-server websocket connections passive - add focused keepalive coverage for direct and relay websocket paths ## Validation - /Users/starr/code/openai/project/dotslash-gen/bin/bazel test //codex-rs/exec-server:exec-server-unit-tests --test_filter='websocket_connection_sends_keepalive_ping\|harness_connection_sends_keepalive_ping\|multiplexed_executor_sends_keepalive_ping' - /Users/starr/code/openai/project/dotslash-gen/bin/bazel test //codex-rs/exec-server:exec-server-relay-test --test_filter=multiplexed_remote_executor_routes_independent_virtual_streams	2026-05-18 03:07:32 +00:00
Ahmed Ibrahim	e7bffc5a20	[codex] Accept string input for Python turns (#23162 ) ## Summary - Allow thread.turn and turn.steer, including async variants, to accept RunInput so plain strings work alongside typed input objects. - Export RunInput and update the SDK artifact generator so regenerated turn methods keep the same signature and normalization. - Update docs, examples, notebook cells, and tests to use string shorthand for text-only turns while keeping typed inputs for multimodal input. ## Validation - uv run --extra dev ruff format . - uv run --extra dev ruff check --output-format=github . - python3 -m py_compile sdk/python/src/openai_codex/__init__.py sdk/python/src/openai_codex/api.py sdk/python/src/openai_codex/_inputs.py sdk/python/scripts/update_sdk_artifacts.py sdk/python/tests/test_public_api_signatures.py sdk/python/tests/test_app_server_streaming.py sdk/python/tests/test_app_server_turn_controls.py sdk/python/tests/test_real_app_server_integration.py - python3 -c "import json; json.load(open('sdk/python/notebooks/sdk_walkthrough.ipynb'))" - sdk/python/.venv/bin/python -c "import inspect, openai_codex; from openai_codex import Thread, AsyncThread, TurnHandle, AsyncTurnHandle, RunInput; funcs=[Thread.run, Thread.turn, AsyncThread.run, AsyncThread.turn, TurnHandle.steer, AsyncTurnHandle.steer]; assert all(inspect.signature(fn).parameters['input'].annotation == 'RunInput' for fn in funcs); assert RunInput is openai_codex.RunInput"	2026-05-17 09:05:44 -07:00
Michael Bolin	0a83353ca3	test: reduce core sandbox policy test setup (#23036 ) ## Why `SandboxPolicy` is a legacy compatibility shape, but several core tests still used it for ordinary turn setup even when the runtime path now carries `PermissionProfile`. With the first cleanup PR merged, this follow-up trims more core test scaffolding so remaining `SandboxPolicy` matches are easier to classify as production compatibility, legacy-boundary coverage, or explicit conversion tests. ## What Changed - Updated apply-patch handler and runtime tests to pass `PermissionProfile` directly. - Changed sandboxing test helpers to build permission profiles without first creating `SandboxPolicy` values. - Converted request-permissions integration turns to pass `PermissionProfile` through the test helper, leaving legacy sandbox projection at the `Op::UserTurn` boundary. - Converted unified exec integration helpers and direct turn submissions to use `PermissionProfile` values instead of `SandboxPolicy` setup. - Removed now-unused `SandboxPolicy` imports from the touched core tests. ## Test Plan - `just fmt` - `cargo test -p codex-core --lib tools::sandboxing::tests` - `cargo test -p codex-core --lib tools::runtimes::apply_patch::tests` - `cargo test -p codex-core --lib tools::handlers::apply_patch::tests` - `cargo test -p codex-core --lib unified_exec::process_manager::tests` - `cargo test -p codex-core --test all request_permissions::` - `cargo test -p codex-core --test all unified_exec::` - `just fix -p codex-core`	2026-05-17 08:39:41 -07:00
jif-oai	545ede569c	Make multi-agent v2 tool namespace configurable (#23147 ) ## Summary - Add `features.multi_agent_v2.tool_namespace` with config/schema validation for Responses-compatible namespace values. - Thread the resolved namespace into `ToolsConfig` for normal turns and review turns. - Wrap MultiAgentV2 tool specs and registry names in the configured namespace when namespace tools are supported, while falling back to the plain tool names when they are not. ## Validation - `just fmt` - `just write-config-schema` - `cargo test -p codex-features multi_agent_v2_feature_config -- --nocapture` - `cargo test -p codex-core test_build_specs_multi_agent_v2 -- --nocapture` - `cargo test -p codex-core multi_agent_v2_config -- --nocapture` - `cargo test -p codex-core multi_agent_v2_rejects_invalid_tool_namespace -- --nocapture` - `cargo test -p codex-tools` - `git diff --check`	2026-05-17 15:27:43 +02:00
Ahmed Ibrahim	f0166cadbb	[codex] Return TurnResult from Python turn handles (#23151 ) ## Why `TurnHandle.run()` returned the raw app-server `Turn`, whose live start/completed payloads do not include loaded `items`, so users saw empty `items` after starting a turn. That made the handle-based path behave differently from `Thread.run(...)`, and pushed examples toward persisted-thread reads plus helper extraction. This PR makes the run APIs standalone: starting a turn and running it returns collected turn data directly, or fails visibly when required stream events are missing. ## What Changed - Replaces the public `RunResult` export with `TurnResult`. - Adds turn metadata to `TurnResult`: `id`, `status`, `error`, `started_at`, `completed_at`, and `duration_ms`, alongside `final_response`, `items`, and `usage`. - Changes `TurnHandle.run()` and `AsyncTurnHandle.run()` to consume stream events with the same collector used by `Thread.run(...)`. - Exports `TurnError` from `openai_codex.types` for the new result shape. - Updates tests, examples, docs, and the walkthrough notebook to use `result.final_response` and `result.items` directly. - Removes persisted-thread helper paths and placeholder/skipped control flows from the public examples and notebook. ## Verification - `python3 -m py_compile ...` over changed SDK, example, and test Python files. - `python3 -c "import json; json.load(open('sdk/python/notebooks/sdk_walkthrough.ipynb'))"` - `git diff --check` - `PYTHONPATH=sdk/python/src python3 -c ...` import/signature smoke for `TurnResult`, `TurnHandle.run`, and `AsyncTurnHandle.run`.	2026-05-17 06:17:22 -07:00
Ahmed Ibrahim	4c89772314	sdk/python: add first-class login support (#23093 ) ## Why The Python SDK can already create threads and run turns, but authentication still has to be arranged outside the SDK. App-server already exposes account login, account inspection, logout, and `account/login/completed` notifications, so SDK users currently have to work around a missing public client layer for a core setup step. This change makes authentication a normal SDK workflow while preserving the backend flow shape: API-key login completes immediately, and interactive ChatGPT flows return live handles that complete later through app-server notifications. ## What changed - Added public sync and async auth methods on `Codex` / `AsyncCodex`: - `login_api_key(...)` - `login_chatgpt()` - `login_chatgpt_device_code()` - `account(...)` - `logout()` - Added public browser-login and device-code handle types with attempt-local `wait()` and `cancel()` helpers. Cancellation stays on the handle instead of a root-level SDK method. - Extended the Python app-server client and notification router so login completion events are routed by `login_id` without consuming unrelated global notifications. - Kept login request/handle logic in a focused internal `_login.py` module so `api.py` remains the public facade instead of absorbing more auth plumbing. - Exported the new handle types plus curated account/login response types from the SDK surfaces. - Updated SDK docs, added sync/async login walkthrough examples, and added a notebook login walkthrough cell. ## Verification Added SDK coverage for: - API-key login, account readback, and logout through the app-server harness in both sync and async clients. - Browser login cancellation plus `handle.wait()` completion through the real app-server boundary used by the Python SDK harness. - Waiter routing that stays scoped across replaced interactive login attempts, plus async handle cancellation coverage. 
- Login notification demuxing, replay of early completion events, and async client delegation. - Public export/signature assertions. - Real integration-suite smoke coverage for the new examples and notebook login cell.	2026-05-16 19:49:28 -07:00
Eric Traut	0445b290fe	[1 of 4] tui: route primary settings writes through app server (#22913 ) ## Why The TUI can run against a remote app server, but several high-traffic settings still persisted by editing the local config file. That sends remote sessions' preference writes to the wrong machine and lets local disk state drift from the app-server-owned config. This is [1 of 4] in a stacked series that moves TUI-owned config mutations onto app-server APIs. ## What changed - Added a small TUI helper for typed app-server config writes. - Routed primary interactive preference writes through `config/batchWrite`. - Preserved existing profile scoping for settings that already support `profiles.<profile>.` overrides. ## Config keys affected - `model` - `model_reasoning_effort` - `personality` - `service_tier` - `plan_mode_reasoning_effort` - `approvals_reviewer` - `notice.fast_default_opt_out` - Profile-scoped equivalents under `profiles.<profile>.` ## Suggested manual validation - Connect the TUI to a remote app server, change `model` and `model_reasoning_effort`, reconnect, and confirm the remote config retained both values while the local `config.toml` did not change. - Change `personality`, `plan_mode_reasoning_effort`, and the explicit auto-review selection, then reconnect and confirm those choices persist through the app server. - Clear the service tier back to default and confirm `service_tier` is cleared while `notice.fast_default_opt_out = true` is persisted remotely. - Repeat one setting change with an active profile and confirm the write lands under `profiles.<profile>.*`. ## Stack 1. [#22913](https://github.com/openai/codex/pull/22913) `[1 of 4]` primary settings writes 2. [#22914](https://github.com/openai/codex/pull/22914) `[2 of 4]` app and skill enablement 3. [#22915](https://github.com/openai/codex/pull/22915) `[3 of 4]` feature and memory toggles 4. 
[#22916](https://github.com/openai/codex/pull/22916) `[4 of 4]` startup and onboarding bookkeeping	2026-05-16 14:27:02 -07:00
sayan-oai	061a614d85	multiagent: trim model-visible description, cap to 5 models (#23069 ) ## Why The `spawn_agent` model override guidance is uncapped and bloating context. We need to trim down each entry and cap the total number of entries. Picked 5 as the cap; we can change it. ## What changed - Cap the model override summaries shown in `spawn_agent` to the first 5 picker-visible models, preserving the existing priority ordering from the models manager. - Condense each rendered entry to the actionable pieces the model needs: - use the model slug as the label - render compact reasoning effort lists with the default marked inline - render only service tier IDs, and omit the clause when no tiers are available - Update coverage so the compact formatter shape and the top-5 cap are exercised, and keep the end-to-end request assertion aligned with real model metadata. ## Example Before: `- gpt-5.4 ('gpt-5.4\'): Strong model for everyday coding. Default reasoning effort: medium. Supported reasoning efforts: low (Fast responses with lighter reasoning), medium (Balances speed and reasoning depth for everyday tasks), high (Greater reasoning depth for complex problems), xhigh (Extra high reasoning depth for complex problems). Supported service tiers: priority (Fast: 1.5x speed, increased usage).` After: `- 'gpt-5.4': Strong model for everyday coding. Reasoning efforts: low, medium (default), high, xhigh. Service tiers: priority.`	2026-05-16 13:43:30 -07:00
Miaolin Min	6941f5c2c5	[codex] preserve MCP result meta in McpToolCallItemResult (#22946 ) ## Summary https://openai.slack.com/archives/C0ARA9UAQEA/p1778890981647319?thread_ts=1778888537.934319&cid=C0ARA9UAQEA - Add `_meta` to exec JSONL MCP tool call result events. - Copy MCP result metadata through the JSONL event conversion. - Add a focused test that verifies `_meta` is serialized as `_meta` and not `meta`. ## Verification https://www.notion.so/openai/Miaolin-0516-_meta-population-debug-3628e50b62b08074b365e0ce1ffb8f74	2026-05-16 13:27:44 -07:00
Michael Zeng	b200dd1b6f	exec-server: support auth-backed remote executor registration (#22769 ) This updates remote `exec-server` registration to use normal Codex auth instead of a registry-issued credential. The registry request is built from the existing auth-provider path, which preserves the biscuit-only registry contract introduced in [openai/openai#924101](https://github.com/openai/openai/pull/924101) while removing the old remote registry bearer env var and its direct transport assumptions. The default remote flow uses persisted ChatGPT auth from the normal Codex config/storage path. This PR also includes the containerized Agent Identity path needed by [openai/openai#924260](https://github.com/openai/openai/pull/924260): remote `exec-server` accepts `--allow-agent-identity-auth`, permits Agent Identity auth loaded from `CODEX_ACCESS_TOKEN` only when that flag is present, and reuses the existing Agent task registration plus derived `AgentAssertion` header generation. API-key auth remains unsupported, and Agent Identity stays opt-in. Validation performed beyond normal presubmit coverage: - `cargo fmt --all --check` - `cargo check -p codex-cli` - `cargo test -p codex-exec-server` - `cargo test -p codex-cli exec_server_agent_identity_auth_flag_` - `cargo test -p codex-cli remote_exec_server_auth_mode_` I also attempted `cargo test -p codex-cli`. The new CLI tests passed inside that run, but the suite ended on an unrelated local marketplace-state failure in `plugin_list_excludes_unconfigured_repo_local_marketplaces`.	2026-05-16 12:48:28 -07:00
Michael Bolin	d91bc15618	test: construct permission profiles directly (#23030 ) ## Why `SandboxPolicy` is now a legacy compatibility shape, but several tests still built a `SandboxPolicy` only to immediately convert it into `PermissionProfile` for APIs that already accept canonical runtime permissions. Those detours make it harder to audit where legacy sandbox policy is still required, because boundary-only usages are mixed together with ordinary test setup. ## What Changed - Updated tests in `codex-core`, `codex-exec`, `codex-analytics`, and `codex-config` to construct `PermissionProfile` values directly when the code under test takes a permission profile. - Changed exec-policy, request-permissions, session, and sandbox test helpers to pass `PermissionProfile` through instead of converting from `SandboxPolicy` internally. - Left `SandboxPolicy` in place where tests are explicitly exercising legacy compatibility or request/response boundaries. ## Test Plan - `cargo test -p codex-analytics -p codex-config` - `cargo test -p codex-core --lib safety::tests` - `cargo test -p codex-core --lib exec_policy::tests::` - `cargo test -p codex-core --lib exec::tests` - `cargo test -p codex-core --lib guardian_review_session_config` - `cargo test -p codex-core --lib tools::network_approval::tests` - `cargo test -p codex-core --lib tools::runtimes::shell::unix_escalation::tests` - `cargo test -p codex-core --lib managed_network` - `cargo test -p codex-core --test all request_permissions::` - `cargo test -p codex-exec sandbox` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/23030). * #23036 * __->__ #23030	2026-05-16 12:12:37 -07:00
Eric Traut	941e7f825e	Improve goal completion usage reporting (#22907 ) ## Why Goal completion follow-up turns currently receive a preformatted English usage sentence such as `time used: 2586 seconds`. That nudges the model to echo an awkward raw seconds count in the final reply, even though the tool result already exposes structured usage fields like `goal.timeUsedSeconds`, `goal.tokensUsed`, and `goal.tokenBudget`. ## What changed - Replace the preformatted completion usage sentence with guidance to read the structured goal fields from the tool result. - Preserve token-budget reporting while allowing the model to phrase elapsed time in a concise, human-friendly way that fits the response language. - Update core coverage for both the generated completion guidance and the session flow that forwards it back to the model. ## Verification Previously, it would have output a final message indicating that it "worked for 303 seconds". Now it shows the following: <img width="286" height="35" alt="image" src="https://github.com/user-attachments/assets/d7011880-9449-46a7-856f-4e50ae00eb45" />	2026-05-16 11:49:40 -07:00
Ahmed Ibrahim	a280248021	[codex] Split Python SDK helper logic (#22939 ) ## Summary - Move approval-mode mapping into `sdk/python/src/openai_codex/_approval_mode.py`. - Move initialize metadata parsing and normalization into `sdk/python/src/openai_codex/_initialize_metadata.py`. - Keep the public `ApprovalMode` export stable and retarget direct metadata helper coverage. ## Integration coverage - Add an app-server harness smoke that exercises sync and async SDK initialization plus thread creation. ## Validation - Local tests were not run per repo guidance. CI should validate this branch once the PR is online.	2026-05-16 09:47:51 -07:00
Michael Bolin	108234b5eb	core: set permission profiles from snapshots (#22920 ) ## Why #22891 moved the TUI turn-command path to pass `ActivePermissionProfile` instead of the full `PermissionProfile`, but the remaining config/session bridge still accepted the concrete `PermissionProfile` and active profile id as separate arguments. That shape made it too easy for future callers to update the concrete profile and active profile id out of sync. This PR makes the trusted session snapshot path pass one coherent value into `Permissions`, while keeping `requirements.toml` enforcement owned by the existing constrained permission state. ## What Changed - Added `PermissionProfileSnapshot` as the public snapshot value for trusted session/config synchronization. - Changed `Permissions::set_permission_profile_from_session_snapshot()` and `replace_permission_profile_from_session_snapshot()` to take a `PermissionProfileSnapshot`. - Updated the replacement path to derive its constrained `PermissionProfile` from the snapshot, so callers cannot pass a separate profile that disagrees with the snapshot. - Removed the internal tuple-style `PermissionProfileState::set_active_permission_profile()` mutation path. - Updated core session projection and TUI call sites to construct explicit legacy or active snapshots. - Documented the snapshot constructors so legacy use and id/profile mismatch hazards are called out at the API boundary. - Added a focused config test that verifies snapshot updates still respect existing permission constraints. ## How To Review 1. Start with `codex-rs/core/src/config/resolved_permission_profile.rs`; `PermissionProfileSnapshot` is the public wrapper, while `ResolvedPermissionProfile` stays internal. 2. Check `codex-rs/core/src/config/mod.rs` to confirm both session-snapshot setters validate through `PermissionProfileState` and no longer accept loose profile/id pairs. 3. 
Skim `codex-rs/core/src/session/session.rs` for the session projection path; it now builds the snapshot before installing it. 4. Skim the TUI changes as call-site migration from loose argument pairs to explicit snapshot construction. ## Verification - `cargo test -p codex-core permission_snapshot_setter_preserves_permission_constraints` - `cargo test -p codex-tui status_permissions_` - `cargo test -p codex-tui session_configured_preserves_profile_workspace_roots` - `just fix -p codex-core -p codex-tui`	2026-05-16 07:26:18 -07:00
Eric Traut	de9c5c0226	Fix Windows doctor npm root probe (#22967 ) ## Why On Windows, npm-managed installs expose the working shim as `npm.cmd`. `codex doctor` probed bare `npm`, which could incorrectly report that npm global-root inspection was unavailable even when the install was healthy. Fixes #22964. ## What changed - Use `npm.cmd` for the doctor npm-root probe on Windows. - Keep the existing `npm` probe on non-Windows platforms.	2026-05-16 00:39:27 -07:00
Ahmed Ibrahim	326e31ab65	[codex] Refine Python SDK user-facing docs (#22941 ) ## Summary - Remove maintainer and release-process wording from the Python SDK README and docs. - Rewrite SDK-facing comments/docstrings so they read as standalone product documentation. - Add a real app-server integration smoke that follows the public quickstart-style `Codex() -> thread_start() -> run()` path. ## Integration coverage - Add `test_real_quickstart_style_flow_smoke` in the real app-server integration suite. ## Validation - Local tests were not run per repo guidance. CI should validate this branch once the PR is online.	2026-05-15 19:55:05 -07:00
Michael Bolin	9025550709	app-server-protocol: remove PermissionProfile from API (#22924 ) ## Why The app server API should expose permission profile identity, not the lower-level runtime permission model. `PermissionProfile` is the compiled sandbox/network representation that the server uses internally; exposing it through app-server-protocol forces clients to understand details that should remain implementation-level. The API boundary should prefer `ActivePermissionProfile`: a stable profile id, plus future parent-profile metadata, that clients can pass back when they want to select the same active permissions. This also avoids schema generation collisions between the app-server v2 API type space and the core protocol model. Incidentally, while this PR makes a number of changes to `command/exec`, note that we are hoping to deprecate this API in favor of `process/spawn`, so we don't need to be too finicky about these changes. ## What Changed - Removed `PermissionProfile` from the app-server-protocol API surface, including generated schema and TypeScript exports. - Changed `CommandExecParams.permissionProfile` to `ActivePermissionProfile`. - Resolve command exec profile ids through `ConfigManager` for the command cwd, matching turn override selection semantics. - Updated downstream TUI tests/helpers to use core permission types directly instead of app-server-protocol `PermissionProfile` shims.	2026-05-15 17:10:15 -07:00
Michael Bolin	bbb5c2811d	tui: pass active permission profiles through app commands (#22891 ) ## Why This continues the permissions migration by keeping the TUI command boundary aligned with the app-server protocol direction from #22795: callers should select a permission profile by id instead of passing a concrete `PermissionProfile` value around as the turn configuration. `AppCommand` is internal to the TUI, but it is the path that eventually becomes `thread/turn/start`, so carrying concrete profile details there made it too easy for UI code to keep relying on the old whole-profile replacement model. ## What changed - `AppCommand::UserTurn` and `AppCommand::OverrideTurnContext` now carry `Option<ActivePermissionProfile>` instead of `PermissionProfile`. - Composer submissions copy the active permission profile id from the current session snapshot; legacy snapshots intentionally submit no active profile id. - Permission preset UI events now carry only the active built-in profile id. The app derives the concrete built-in `PermissionProfile` internally only when updating its local config/status snapshot. - Permission presets expose their built-in active profile id, and preset selection preserves that id in both the immediate turn override and the local TUI config snapshot. - Turn routing sends `TurnPermissionsOverride::ActiveProfile` when an active id is present, and only falls back to the legacy sandbox projection for the remaining runtime override path. ## How to review Start with `codex-rs/tui/src/app_command.rs` to verify the command shape no longer exposes `PermissionProfile`. Then read `codex-rs/tui/src/app/thread_routing.rs` to verify the app-server turn-start conversion: active ids go through as ids, while the legacy sandbox fallback is still constrained to the existing runtime override case. 
Finally, check `codex-rs/tui/src/chatwidget/permission_popups.rs`, `codex-rs/tui/src/app/event_dispatch.rs`, `codex-rs/tui/src/app/config_persistence.rs`, and `codex-rs/utils/approval-presets/src/lib.rs` to see how preset selections stay id-only across TUI events while the local display/config mirror still gets a concrete built-in profile. ## Verification Latest local verification after the id-only `AppEvent` cleanup: - `cargo check -p codex-tui --tests` - `cargo test -p codex-tui permissions_selection_sends_approvals_reviewer_in_override_turn_context` - `cargo test -p codex-tui update_feature_flags_enabling_guardian` - `cargo test -p codex-utils-approval-presets` - `just fmt` - `just fix -p codex-tui -p codex-utils-approval-presets` Earlier in the same PR, before the final event-shape cleanup: - `cargo test -p codex-tui turn_permissions_` - `cargo test -p codex-tui submission_` - `cargo test -p codex-tui session_configured_syncs_widget_config_permissions_and_cwd` - `RUST_MIN_STACK=16777216 cargo test -p codex-tui`	2026-05-15 22:42:35 +00:00
Curtis 'Fjord' Hawthorne	8543e39885	Preserve image detail in app-server inputs (#20693 ) ## Summary - Add optional image detail to user image inputs across core, app-server v2, thread history/event mapping, and the generated app-server schemas/types. - Preserve requested detail when serializing Responses image inputs: omitted detail stays on the existing `high` default, while explicit `original` keeps local images on the original-resolution path. - Support `high`/`original` consistently for tool image outputs, including MCP `codex/imageDetail`, code-mode image helpers, and `view_image`.	2026-05-15 15:04:04 -07:00
Tom	249d50aafc	[codex] Soften SQLite metadata sync failures (#22899 ) ## Summary - keep transcript-derived local thread metadata SQLite failures best-effort - preserve hard failures for explicit git-only metadata updates that still require SQLite state - add regression coverage for the soft-vs-hard metadata update policy ## Root cause The live thread metadata sync introduced after v0.131.0-alpha.8 moved append-derived metadata writes above the rollout writer. Those SQLite writes now propagated through the live thread flush path, so a corrupted optional state DB could surface as a transcript persistence warning even when JSONL writes still succeeded. The hard failures were introduced in #22236.	2026-05-15 21:37:27 +00:00
Owen Lin	6a331a66eb	feat(app-server): update remote control APIs for better UX (#22877 ) ## Why To help improve the `codex remote-control` CLI UX, which I plan to do in a follow-up, this PR adds `server-name` to the following remote control APIs: - `remoteControl/enable` - `remoteControl/disable` - `remoteControl/status/changed` It also adds a `remoteControl/status/read` API, which will be helpful in the Codex App.	2026-05-15 14:33:24 -07:00
Shijie Rao	98129fb9c5	Disable DMG staging for signed macOS promotion (#22900 ) ## Why `promote_signed` is now used to finish a release from an externally signed macOS handoff, but this release path (temporarily) no longer distributes DMGs. Keeping DMG staging enabled made the handoff unnecessarily require DMG assets and notarization/stapling validation even though the promoted release only needs the signed macOS binaries. ## What changed - Set every `stage-signed-macos` matrix entry to `build_dmg: "false"`, including the primary macOS bundles. - Kept the existing DMG staging branch in place behind `matrix.build_dmg` so it can be re-enabled deliberately later. - Updated the workflow header comment so the signed handoff contract asks for signed binaries, not signed DMGs. The regular signed build path that creates, signs, notarizes, and stages DMGs is unchanged; this only affects the `promote_signed` handoff path.	2026-05-15 14:19:06 -07:00
Michael Bolin	8df2d96860	core: construct test permission profiles directly (#22795 ) ## Why The core migration is trying to make `PermissionProfile` the shape tests and runtime code reason about, leaving `SandboxPolicy` only where legacy behavior is explicitly under test. The local `permission_profile_for_sandbox_policy()` test helpers kept new permission-profile tests mentally tied to the old sandbox model even when the equivalent profile is straightforward. ## What Changed - Removed the `permission_profile_for_sandbox_policy()` helper from the network proxy spec tests and session tests. - Replaced legacy conversions for read-only, workspace-write, and full-access cases with `PermissionProfile::read_only()`, `PermissionProfile::workspace_write()`, and `PermissionProfile::Disabled`. - Constructed the external-sandbox session test's `PermissionProfile::External` directly, while preserving the legacy `SandboxPolicy` only where the test still exercises legacy config update behavior. ## How To Review This PR is intentionally test-only. Review the two touched files and check that each replacement preserves the old legacy mapping: - `SandboxPolicy::new_read_only_policy()` -> `PermissionProfile::read_only()` - `SandboxPolicy::new_workspace_write_policy()` -> `PermissionProfile::workspace_write()` - `SandboxPolicy::DangerFullAccess` -> `PermissionProfile::Disabled` - `SandboxPolicy::ExternalSandbox { network_access: Restricted }` -> `PermissionProfile::External { network: Restricted }` ## Verification - `cargo test -p codex-core requirements_allowed_domains_are_a_baseline_for_user_allowlist` - `cargo test -p codex-core start_managed_network_proxy_applies_execpolicy_network_rules` - `cargo test -p codex-core session_configured_reports_permission_profile_for_external_sandbox` - `cargo test -p codex-core managed_network_proxy_decider_survives_full_access_start` - `just fix -p codex-core` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). 
Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22795). * #22891 * __->__ #22795	2026-05-15 13:09:25 -07:00
Michael Bolin	83bbb4f326	app-server: stop returning thread permission profiles (#22792 ) ## Why The app-server thread lifecycle API should no longer expose the full `PermissionProfile` value. After the permissions-profile migration, clients should round-trip only the active profile identity through `activePermissionProfile` and `permissions` when that identity is known. The full profile is server-side config. Treating a response-derived legacy sandbox projection as a new local profile can lose named-profile restrictions and accidentally widen permissions on the next turn. The legacy `sandbox` response field remains only as the compatibility/display fallback. ## What Changed - Removed `permissionProfile` from `ThreadStartResponse`, `ThreadResumeResponse`, and `ThreadForkResponse`. - Stopped populating that field in app-server thread start/resume/fork responses. - Updated embedded exec/TUI response mapping to derive display permission state from local config or the legacy sandbox fallback instead of a response profile value. - Added a TUI turn override shape that distinguishes preserving server permissions, selecting an active profile id, and sending a legacy sandbox for an explicit local override. - Preserved remote app-server permissions across turns by sending `permissions` only when an `activePermissionProfile` id is known, and otherwise sending no sandbox override unless the user selected a local override. - Kept embedded `thread/resume` hydration server-authored when `activePermissionProfile` is absent, which matches the live-thread attach path where the server ignores requested overrides. - Updated the app-server README to remove the obsolete lifecycle response `permissionProfile` reference. The remaining `permissionProfile` README references are request-side permission overrides. - Regenerated app-server JSON schema and TypeScript fixtures. 
- Kept the generated typed response enum exempt from `large_enum_variant`, matching the existing payload enum exemption after the lifecycle response variants shrank. ## How To Review Start with `codex-rs/app-server-protocol/src/protocol/v2/thread.rs` to confirm the response shape, then check the response construction in `codex-rs/app-server/src/request_processors`. The generated schema and TypeScript fixture changes are mechanical follow-through from the protocol removal. The TUI behavior is the delicate part: review `codex-rs/tui/src/app_server_session.rs` for response hydration and turn-start override projection, then `codex-rs/tui/src/app/thread_routing.rs` for the decision about whether the next turn should preserve the server snapshot, send an active profile id, or send a legacy sandbox for an explicit local override. ## Verification - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol thread_lifecycle_responses_default_missing_optional_fields` - `cargo test -p codex-exec session_configured_from_thread_response_uses_permission_profile_from_config` - `cargo test -p codex-tui --lib thread_response` - `cargo test -p codex-tui turn_permissions_` - `cargo test -p codex-tui resume_response_restores_turns_from_thread_items` - `cargo test -p codex-analytics track_response_only_enqueues_analytics_relevant_responses` - `just fix -p codex-analytics` - `just fix -p codex-app-server-protocol` - `just fix -p codex-tui` - `just argument-comment-lint` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22792). * #22795 * __->__ #22792	2026-05-15 12:45:48 -07:00
viyatb-oai	6afe00efda	Workflow updates (#22582 )	2026-05-15 12:41:18 -07:00
Boyang Niu	c15613f2b6	Forward apps MCP product SKU from Codex config (#22872 ) This adds `apps_mcp_product_sku` as a top-level config.toml key. We pass the given value as a header when listing MCPs for the client, allowing connectors to be filtered per product entry point. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-15 11:52:14 -07:00
Michael Bolin	4c80435eba	telemetry: tag sandboxes from permission profiles (#22791 ) ## Why Sandbox telemetry tags should be derived from the active permission profile, not from a legacy `SandboxPolicy`, so the tagging code stays aligned with the permissions migration and does not preserve a policy-shaped production helper only for tests. ## What Changed - Removed the production `sandbox_tag(&SandboxPolicy, ...)` helper. - Updated sandbox tag tests to construct the relevant `PermissionProfile` values directly. - Kept the platform-specific sandbox tag behavior under the existing `permission_profile_sandbox_tag` path. ## How To Review The production change is in `codex-rs/core/src/sandbox_tags.rs`. Most of the diff is test cleanup that replaces legacy policy setup with permission profiles, so review the expected tag assertions rather than the old helper mechanics. ## Verification - `cargo test -p codex-core sandbox_tag` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22791). * #22795 * #22792 * __->__ #22791	2026-05-15 10:58:50 -07:00
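
As a rough illustration of the compact `spawn_agent` entry format described in commit 061a614d85 above, the sketch below renders one line per model and caps the list at five. It is not the Codex implementation (which lives in the Rust codebase): the `ModelInfo` class, its field names, and the `format_entry` / `format_summaries` functions are hypothetical names chosen for this example. Only the output shape and the cap of 5 are taken from the commit message.

```python
from dataclasses import dataclass, field
from typing import List

# Cap stated in the commit message; the name MAX_MODELS is an assumption.
MAX_MODELS = 5


@dataclass
class ModelInfo:
    """Hypothetical stand-in for the model metadata used by the formatter."""
    slug: str
    description: str
    efforts: List[str]
    default_effort: str
    service_tiers: List[str] = field(default_factory=list)


def format_entry(model: ModelInfo) -> str:
    """Render one compact entry: slug label, efforts with the default marked inline."""
    efforts = ", ".join(
        f"{effort} (default)" if effort == model.default_effort else effort
        for effort in model.efforts
    )
    line = f"- '{model.slug}': {model.description} Reasoning efforts: {efforts}."
    # Omit the service tier clause entirely when no tiers are available.
    if model.service_tiers:
        line += f" Service tiers: {', '.join(model.service_tiers)}."
    return line


def format_summaries(models: List[ModelInfo]) -> str:
    """Keep the caller's priority order and render only the first MAX_MODELS entries."""
    return "\n".join(format_entry(m) for m in models[:MAX_MODELS])


example = ModelInfo(
    slug="gpt-5.4",
    description="Strong model for everyday coding.",
    efforts=["low", "medium", "high", "xhigh"],
    default_effort="medium",
    service_tiers=["priority"],
)
print(format_entry(example))
```

With the metadata from the commit's own example, `format_entry` produces the "After" line quoted in that commit message. Slicing before formatting keeps the existing ordering intact, which matches the commit's note about preserving priority order from the models manager.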

6604 Commits