codex

mirror of https://github.com/openai/codex.git synced 2026-05-23 12:34:25 +00:00

Author	SHA1	Message	Date
jif-oai	20fedafff8	Trace logical websocket request after untraced warmup (#23581 ) ## Why `prewarm_websocket` intentionally stays out of rollout inference tracing, but the next traced websocket request can still reuse the warmup `response_id` and send an empty `input` delta. If tracing records that wire payload verbatim, replay sees an incremental request whose parent was never traced and cannot reconstruct the conversation. This fixes that at the producer boundary instead of relaxing `rollout-trace` replay semantics around unresolved `previous_response_id` values. ## What - track whether the last websocket response came from an untraced warmup and clear that state when the websocket session is reset or reconnected - when a traced websocket request reuses that warmup parent, keep sending the compressed websocket request on the wire but record the logical `ResponsesApiRequest` in the rollout trace - add a regression test that proves replay reconstructs the logical user message even though the websocket follow-up carries `previous_response_id = warm-1` with empty `input` - update `InferenceTraceAttempt::record_started` docs to reflect that callers may record a logical request rather than the exact transport payload ## Testing - `cargo test -p codex-core --test all responses_websocket_request_prewarm_traces_logical_request`	2026-05-21 11:13:23 +02:00
Michael Bolin	0b4f86095c	sdk: launch packaged Codex runtimes (#23786 ) ## Why The Python and TypeScript SDKs launch the native Codex runtime directly, so they need to consume the same package artifact shape that release jobs now produce. The runtime wheel should be built from the canonical Codex package archive rather than reconstructing a parallel layout from loose binaries. ## What Changed - Stage `openai-codex-cli-bin` by extracting `codex-package-<target>.tar.gz` into `src/codex_cli_bin` and validating the expected package layout. - Update release workflows to pass the generated package archive into `stage-runtime` instead of the temporary package directory. - Update Python runtime setup to download `codex-package-*.tar.gz` release assets directly. - Expose Python runtime helpers for the bundled package directory and `codex-path`, and prepend that path when `openai_codex` launches the installed runtime without duplicating Windows `Path`/`PATH` keys. - Teach the TypeScript SDK to resolve package-layout optional dependencies while keeping the existing npm fallback layout, and preserve the existing Windows path variable casing when prepending `codex-path`. ## Test Plan - `python3 -m py_compile sdk/python/scripts/update_sdk_artifacts.py sdk/python/_runtime_setup.py sdk/python/src/openai_codex/client.py sdk/python-runtime/src/codex_cli_bin/__init__.py` - `uv run --frozen --project sdk/python --extra dev ruff check sdk/python/scripts/update_sdk_artifacts.py sdk/python/_runtime_setup.py sdk/python/src/openai_codex/client.py sdk/python/tests/test_artifact_workflow_and_binaries.py sdk/python-runtime/src/codex_cli_bin/__init__.py` - `uv run --frozen --project sdk/python --extra dev pytest sdk/python/tests/test_artifact_workflow_and_binaries.py` - `pnpm eslint src/exec.ts tests/exec.test.ts` - `pnpm test --runInBand tests/exec.test.ts`	2026-05-20 18:01:22 -07:00
Michael Bolin	63a72e6b78	core: pass permission profiles to Windows runner (#23715 ) ## Why This is the functional handoff PR for the Windows sandbox `PermissionProfile` migration. After #23714, the Windows elevated backend can accept a profile-native request, but core still sent a compatibility `SandboxPolicy` into the elevated command-runner path. That meant profile-only details such as deny globs had to be translated through side channels instead of being preserved in the runner `SpawnRequest`. Passing the real `PermissionProfile` completes the command-runner handoff while leaving the unelevated restricted-token fallback on the legacy policy-string API. ## What - Updates one-shot Windows elevated execution in `core/src/exec.rs` to call `run_windows_sandbox_capture_for_permission_profile_elevated`. - Updates unified exec in `core/src/unified_exec/process_manager.rs` to call `spawn_windows_sandbox_session_elevated_for_permission_profile`. - Passes `request.permission_profile` / `exec_request.permission_profile` and the stored Windows sandbox policy cwd to the elevated backend. - Keeps compatibility `SandboxPolicy` serialization only for the non-elevated restricted-token fallback. ## Verification - `cargo test -p codex-core --test all --no-run`	2026-05-20 17:57:36 -07:00
viyatb-oai	713a5b1b00	feat: support managed permission profiles in requirements.toml (#23433 ) ## Why Cloud-managed `requirements.toml` should be able to define the managed permission profiles a client may select and constrain that selectable set without requiring local user config to recreate the profile catalog. This keeps requirements focused on restrictions. The selected default remains a config or session choice, while requirements contribute the managed profile bodies and `allowed_permissions` allowlist that the config-loading boundary validates before a resolved runtime `PermissionProfile` is installed. ## What changed - Add `requirements.toml` support for a managed permission-profile catalog plus its allowlist: ```toml allowed_permissions = ["review", "build"] [permissions.review] extends = ":read-only" [permissions.build] extends = ":workspace" ``` - Merge requirements-defined profile bodies into the effective permission catalog and reject profile ids that collide with config-defined profiles. - Validate that every `allowed_permissions` entry resolves to a built-in or catalog profile before selection uses it. - Preserve allowed configured named-profile selections. When a configured named profile is disallowed, fall back to the first allowed requirements profile with a startup warning. - Keep built-in selections and the stock trust-based `:read-only` / `:workspace` fallback path intact when no permission profile is explicitly selected. - Centralize the managed catalog and allowlist selection path in `EffectivePermissionSelection` so the requirements boundary is visible in config loading. - Surface `allowedPermissions` through `configRequirements/read`, and update the generated app-server schema fixtures plus the app-server README. 
## Validation - `cargo test -p codex-config` - `cargo test -p codex-core system_requirements_` - `cargo test -p codex-core system_allowed_permissions_` - `cargo test -p codex-app-server-protocol` - `just write-app-server-schema` ## Related work - Uses merged permission-profile inheritance support from #22270 and #23705. - Kept separate from the in-flight permission profile listing API in #23412.	2026-05-20 17:33:01 -07:00
Michael Bolin	c9ff067e31	windows-sandbox: add profile-native elevated APIs (#23714 ) ## Why This is the next step after #23167 in the Windows sandbox `PermissionProfile` migration. The elevated Windows backend still exposed policy-string entry points, which forced callers to pass a compatibility `SandboxPolicy` before the command-runner IPC could receive a profile. Adding profile-native APIs first keeps the core switch in the next PR small: reviewers can see that the Windows crate can prepare elevated setup, capability SIDs, and runner IPC from a resolved `PermissionProfile` without changing core behavior yet. ## What - Adds `ElevatedSandboxProfileCaptureRequest` and `run_windows_sandbox_capture_for_permission_profile_elevated` for one-shot elevated capture. - Adds `spawn_windows_sandbox_session_elevated_for_permission_profile` for unified exec sessions. - Factors elevated spawn prep through `prepare_elevated_spawn_context_for_permissions`, so both new APIs operate from `ResolvedWindowsSandboxPermissions` directly. - Keeps the existing legacy policy-string APIs as adapters for callers that have not moved yet. ## Verification - `cargo test -p codex-windows-sandbox` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/23714). * #23715 * __->__ #23714	2026-05-21 00:25:31 +00:00
viyatb-oai	a27d3847b5	[codex] Reject read-only fallback with approvals disabled (#23774 ) ## Why If a user configures `approval_policy = "never"` with `sandbox_mode = "danger-full-access"`, managed requirements can reject full access and force the existing permission fallback to read-only. That leaves Codex in a dead-end session: writes are blocked by the sandbox, while approvals are disabled so the session cannot ask to proceed. This PR rejects that constrained configuration during startup instead of letting the TUI enter a read-only session that cannot make progress. The rejection is attached to the requirement-constrained permission path in [`Config`](`39f0abc0a7/codex-rs/core/src/config/mod.rs (L3301-L3318)`). ## What changed - Reject the `danger-full-access` to read-only managed-requirements fallback when the effective approval policy is `never`. - Explain in the startup config error why the fallback is invalid and how to fix it. - Add a regression test for the managed requirements path.	2026-05-20 17:17:59 -07:00
evawong-oai	3cae84009a	Use named MITM permissions config (#18240 ) ## Stack 1. Parent PR: #18868 adds MITM hook config and model only. 2. Parent PR: #20659 wires hook enforcement into the proxy request path. 3. This PR changes the user facing PermissionProfile TOML shape. ## Why 1. The broader goal is to make MITM clamping usable from the same permission profile that already controls network behavior. 2. This PR is the config UX layer for the stack. It moves MITM policy into `[permissions.<profile>.network.mitm]` instead of exposing the flat runtime shape to users. 3. The named hook and action tables belong here because users need reusable policy blocks that are easy to review, while the proxy runtime only needs a flat hook list. 4. This PR validates action refs during config parsing so mistakes in the user facing policy fail before a proxy session starts. 5. Keeping the lowering here lets the proxy keep its simpler runtime model and lets PermissionProfile remain the single source of network permission policy. ## Summary 1. Keep MITM policy inside `[permissions.<profile>.network.mitm]` so the selected PermissionProfile owns network proxy policy. 2. Use named MITM hooks under `[permissions.<profile>.network.mitm.hooks.<name>]`. 3. Put host, methods, path prefixes, query, headers, body, and action refs on the hook table. 4. Define reusable action blocks under `[permissions.<profile>.network.mitm.actions.<name>]`. 5. Represent action blocks with `NetworkMitmActionToml`, then lower them into the proxy runtime action config. 6. Reject unknown refs, empty refs, and empty action blocks during config parsing. 7. Keep the runtime hook model unchanged by lowering config into the existing proxy hook list. 8. Preserve the #20659 activation fix for nested MITM policy. 
## Example ```toml [permissions.workspace.network.mitm] enabled = true [permissions.workspace.network.mitm.hooks.github_write] host = "api.github.com" methods = ["POST", "PUT"] path_prefixes = ["/repos/openai/"] action = ["strip_auth"] [permissions.workspace.network.mitm.actions.strip_auth] strip_request_headers = ["authorization"] ``` ## Validation 1. Regenerated the config schema. 2. Ran the core MITM config parsing and validation tests. 3. Ran the core PermissionProfile MITM proxy activation tests. 4. Ran the core config schema fixture test. 5. Ran the network proxy MITM policy tests. 6. Ran the scoped Clippy fixer for the network proxy crate. 7. Ran the scoped Clippy fixer for the core crate. --------- Co-authored-by: Winston Howes <winston@openai.com>	2026-05-20 17:10:37 -07:00
Matthew Zeng	0a4179bb19	[codex] Add plugin id to MCP tool call items (#23737 ) Add owning plugin id to MCP tool call items so we can better filter them at plugin level. ## Summary - add optional `plugin_id` to MCP tool-call items and legacy begin/end events - propagate plugin metadata into emitted core items and app-server v2 `ThreadItem::McpToolCall` - preserve plugin ids through app-server replay/redaction paths and regenerate v2 schema fixtures ## Testing - `just write-app-server-schema` - `just fmt` - `just fix -p codex-core` - `cargo test -p codex-protocol -p codex-app-server-protocol` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-core mcp_tool_call_item_includes_plugin_id --lib` - `cargo check -p codex-tui --tests` - `cargo check -p codex-app-server --tests` - `git diff --check` ## Notes - `just fix -p codex-core` completed with two non-fatal `too_many_arguments` warnings on the touched MCP notification helpers. - A broader `cargo test -p codex-core` run passed core unit tests, then hit shell/sandbox/snapshot failures in the integration target. - A broader app-server downstream run hit the existing `in_process::tests::in_process_start_clamps_zero_channel_capacity` stack overflow; `cargo test -p codex-exec` also hit the existing sandbox expectation mismatch in `thread_lifecycle_params_include_legacy_sandbox_when_no_active_profile`.	2026-05-20 17:02:10 -07:00
Michael Bolin	0b5cf85b64	ci: run Codex package builder tests (#23760 ) ## Why #23752 and #23759 add Python unit tests for the Codex package builder, but the root CI workflow did not run tests under `scripts/codex_package`. That left the `zstd` resolution and prebuilt-resource packaging behavior covered locally without a CI check. ## What changed - Add a root CI step in `.github/workflows/ci.yml` that runs `python3 -m unittest discover -s scripts/codex_package -p "test_*.py"`. - Keep the step with the existing Python verification checks before Node/pnpm setup. ## Verification - `python3 -m unittest discover -s scripts/codex_package -p "test_*.py"` - `python3 -m py_compile scripts/codex_package/*.py`	2026-05-20 17:00:55 -07:00
Casey Chow	60b45d92d9	[codex] List marketplaces considered by plugin discovery Co-authored-by: Codex <noreply@openai.com>	2026-05-20 19:17:46 -04:00
iceweasel-oai	8253ae4e5c	Remove Windows sandbox resource stamping (#23764 ) ## Why The `codex-windows-sandbox` crate was embedding Windows resource metadata through a package-level `build.rs`. Because that package also exposes the `codex_windows_sandbox` library, downstream binaries that link the library could inherit `FileDescription` / `ProductName` values of `codex-windows-sandbox`. That made ordinary Codex binaries, including the long-lived `codex.exe` app-server sidecar, appear as `codex-windows-sandbox` in Windows UI surfaces such as Task Manager / file properties. We do not rely on this metadata enough to justify a larger bin-only resource split, so this removes the resource stamping entirely. ## What changed - Removed the `windows-sandbox-rs` build script that invoked `winres`. - Removed the setup manifest that was only consumed by that build script. - Removed the `winres` build dependency and corresponding `Cargo.lock` / `MODULE.bazel.lock` entries. - Removed the now-unused Bazel build-script data. ## Verification - `cargo build -p codex-windows-sandbox --bins` - `cargo build -p codex-cli --bin codex` - `bazel mod deps --lockfile_mode=update` via Bazelisk, with local remote-cache-disabling flags because `bazel` is not installed on PATH here - `bazel mod deps --lockfile_mode=error` via Bazelisk, with the same local flags - Verified rebuilt `codex.exe`, `codex-command-runner.exe`, and `codex-windows-sandbox-setup.exe` now have blank `FileDescription` / `ProductName` fields. - `cargo test -p codex-windows-sandbox` still fails on two legacy Windows sandbox tests with `CreateRestrictedToken failed: 87` and the follow-on poisoned test lock; 85 passed, 2 ignored.	2026-05-20 16:15:21 -07:00
guinness-oai	d6d03d42ea	[codex] Fix realtime v1 websocket compatibility (#23771 ) ## Why Realtime v1 websocket sessions now expect a slightly different boundary shape for text input, completed input transcripts, and connection headers. Codex was still using the older shape, so some v1 text appends could be rejected before the existing conversation flow could handle them. ## What changed - Send v1 user text items with `input_text` content - Accept v1 turn-marked input transcript events as completed transcripts - Add the v1 alpha header only for v1 realtime sessions - Cover the outbound text shape, transcript parsing, and versioned headers ## Test plan - `cargo test -p codex-api endpoint::realtime_websocket::methods::tests` - `cargo test -p codex-core quicksilver_alpha_header`	2026-05-20 16:03:51 -07:00
Shijie Rao	370b13afc9	Honor client-resolved service tier defaults (#23537 ) ## Why Model catalog responses can now advertise a nullable `default_service_tier` for each model. Codex needs to preserve three distinct states all the way from config/app-server inputs to inference: - no explicit service tier, so the client may apply the current model catalog default when FastMode is enabled - explicit `default`, meaning the user intentionally wants standard routing - explicit catalog tier ids such as `priority`, `flex`, or future tiers Keeping those states distinct prevents the UI from showing one tier while core sends another, especially after model switches or app-server `thread/start` / `turn/start` updates. ## What Changed - Plumbed `default_service_tier` through model catalog protocol types, app-server model responses, generated schemas, model cache fixtures, and provider/model-manager conversions. - Added the request-only `default` service tier sentinel and normalized legacy config spelling so `fast` in `config.toml` still materializes as the runtime/request id `priority`. - Moved catalog default resolution to the TUI/client side, including recomputing the effective service tier when model/FastMode-dependent surfaces change. - Updated app-server thread lifecycle config construction so `serviceTier: null` preserves explicit standard-routing intent by mapping to `default` instead of internal `None`. - Kept core responsible for validating explicit tiers against the current model and stripping `default` before `/v1/responses`, without applying catalog defaults itself. ## Validation - `CARGO_INCREMENTAL=0 cargo build -p codex-cli` - `CARGO_INCREMENTAL=0 cargo test -p codex-app-server model_list` - `cargo test -p codex-tui service_tier` - `cargo test -p codex-protocol service_tier_for_request` - `cargo test -p codex-core get_service_tier` - `RUST_MIN_STACK=8388608 CARGO_INCREMENTAL=0 cargo test -p codex-core service_tier`	2026-05-20 15:57:50 -07:00
Eric Traut	0e9d222178	Make goals feature on by default and no longer experimental (#23732 ) ## Why The `goals` feature is ready to be available without requiring users to opt into experimental features. Keeping it behind the beta flag leaves persisted thread goals and automatic goal continuation disabled by default. This PR also marks the goal-related app server APIs and events as no longer experimental. ## What changed - Mark `goals` as `Stage::Stable`. - Enable `goals` by default in `codex-rs/features/src/lib.rs`.	2026-05-20 15:07:35 -07:00
Casey Chow	3075061bdd	feat(plugins): tabulate plugin list output (#23727 ) ## Summary - render `codex plugin list` as one table per marketplace with the marketplace manifest path shown above each table - surface the installed plugin version in the CLI output by threading `installed_version` through marketplace listing state - narrow the system-root exemption so only known bundled/runtime marketplaces skip missing-manifest failures, and keep `VERSION` empty for cached-but-unconfigured plugins ## Rationale The plugin list UX was hard to scan as a flat list and did not show which installed version was active. This change makes the CLI output easier to read in the real multi-marketplace case, keeps the plugin path visible, fixes the Sapphire regression where bundled/runtime marketplace roots were blocking `plugin list`, and addresses the two review findings that came out of the follow-up deep review. ## Key Decisions - kept the CLI output grouped per marketplace instead of one global table so the marketplace path can live with the rows it owns - kept `VERSION` as the installed version, which means it is empty until a plugin is actually installed - handled the bundled/runtime regression in the CLI snapshot validation path rather than widening app-server protocol or changing marketplace loading behavior - narrowed the exemption to known system marketplace names plus expected system paths, so user-configured marketplaces under those directories still fail loudly - gated `installed_version` on actual installed state so `VERSION` cannot show stale cache state for `not installed` rows ## Validation - `just fmt` - Sapphire: `cargo test -p codex-cli --test plugin_cli` (`14 passed; 0 failed`) - Sapphire smoke test: bundled/runtime roots still work - `cargo run -q -p codex-cli -- plugin add sample@debug` - `cargo run -q -p codex-cli -- plugin list` - verified the bundled/runtime-root scenario no longer errors and shows the expected marketplace table output - Sapphire smoke test: 
custom marketplace under bundled path still errors - verified `failed to load configured marketplace snapshot(s)` for `custom-marketplace` - Sapphire smoke test: cached-but-unconfigured plugin hides version - verified `sample@debug not installed` renders with an empty `VERSION` column ## Sample Output ```text /tmp/custom-marketplace/plugin.json NAME VERSION STATUS DESCRIPTION sample@debug 1.0.0 enabled Debug sample plugin other@local not installed Local development plugin ```	2026-05-20 18:04:49 -04:00
Abhinav	eee3e60db3	Add SubagentStop hook (#22873 ) # What `SubagentStop` runs when a thread-spawned subagent turn is about to finish. Thread-spawned subagents use `SubagentStop` instead of the normal root-agent `Stop` hook. Configured handlers match on `agent_type`. Hook input includes the normal stop fields plus: - `agent_id`: the child thread id. - `agent_type`: the resolved subagent type. - `agent_transcript_path`: the child subagent transcript path. - `transcript_path`: the parent thread transcript path. - `last_assistant_message`: the final assistant message from the child turn, when available. - `stop_hook_active`: `true` when the child is already continuing because an earlier stop-like hook blocked completion. `SubagentStop` shares the same completion-control semantics as `Stop`, scoped to the child turn: - No decision allows the child turn to finish. - `decision: "block"` with a non-empty `reason` records that reason as hook feedback and continues the child with that prompt. - `continue: false` stops the child turn. If `stopReason` is present, Codex surfaces it as the stop reason. # Lifecycle Scope Only thread-spawned subagents run `SubagentStop`. Internal/system subagents such as Review, Compact, MemoryConsolidation, and Other do not run normal `Stop` hooks and do not run `SubagentStop`. This avoids exposing synthetic matcher labels for internal implementation paths. # Stack 1. #22782: add `SubagentStart`. 2. This PR: add `SubagentStop`. 3. #22882: add subagent identity to normal hook inputs.	2026-05-20 14:59:41 -07:00
viyatb-oai	40ad7be2b5	core: refresh active permission profiles at runtime (#22931 ) ## Why Once a named permission profile is selected, runtime state has to keep that profile identity intact instead of collapsing back to anonymous effective permissions. The session refresh path also needs to rebuild profile-derived network proxy state so active profile switches take effect consistently. ## What changed - Preserve the active permission profile through session updates. - Rebuild profile-derived runtime/network configuration when the active profile changes. - Keep the runtime path aligned with the current session configuration APIs. - Tighten the affected tests, including the Windows delete-pending memory-file case that was intermittently tripping CI. ## Stack 1. This PR: runtime/session/network propagation for active permission profiles. 2. [#23708](https://github.com/openai/codex/pull/23708): TUI selection plumbing and guardrail flow. 3. [#21559](https://github.com/openai/codex/pull/21559): profile-aware `/permissions` menu and custom profile display.	2026-05-20 21:55:21 +00:00
Michael Bolin	896ee672cc	windows-sandbox: feed setup from resolved permissions (#23167 ) ## Why This is the next step in the Windows sandbox migration away from the legacy `SandboxPolicy` abstraction. #22923 moved write-root and token decisions onto `ResolvedWindowsSandboxPermissions`, but setup and identity still accepted `SandboxPolicy` and converted internally. This PR pushes that conversion outward so the setup path consumes the resolved Windows permission view directly. ## What Changed - Changed `SandboxSetupRequest` to carry `ResolvedWindowsSandboxPermissions` instead of `SandboxPolicy` plus policy cwd. - Updated setup refresh/elevation and identity credential preparation to use resolved permissions for read roots, write roots, network identity, and deny-write payload planning. - Removed the production `allow.rs` legacy wrapper; allow-path computation now takes resolved permissions directly. - Added a permissions-based world-writable audit entry point while keeping the existing legacy wrapper for compatibility. - Updated legacy ACL setup and the core Windows setup bridge to construct resolved permissions at the boundary. - Hardened the Windows sandbox integration test helper staging so Bazel retries can reuse an already-staged helper if a prior sandbox helper process still has the executable open. ## Verification - `cargo test -p codex-windows-sandbox` - `cargo test -p codex-core --test all --no-run` - `just fix -p codex-windows-sandbox` - `just fix -p codex-core` - Attempted `cargo check -p codex-windows-sandbox --target x86_64-pc-windows-gnullvm`, but the local machine is missing `x86_64-w64-mingw32-clang`; Windows CI should cover that target. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/23167). * #23715 * #23714 * __->__ #23167	2026-05-20 14:52:38 -07:00
Michael Bolin	80c4a978f8	release: package prebuilt resource binaries (#23759 ) ## Why Release packaging should be a staging step once release binaries have already been built and signed. The Windows release job was downloading and signing `codex-command-runner.exe` and `codex-windows-sandbox-setup.exe`, but `scripts/build_codex_package.py` still rebuilt those helpers while creating the package archives. That makes the package step slower and, more importantly, risks putting helper binaries in the archive that were produced after the signing step. Linux had the same shape for package resources: `bwrap` could be rebuilt by the package builder instead of being passed in as a prebuilt release artifact. This builds on #23752, which fixes `.tar.zst` creation when Windows runners rely on the repository DotSlash `zstd` wrapper. ## What changed - Add explicit prebuilt resource inputs to the Codex package builder: - `--bwrap-bin` - `--codex-command-runner-bin` - `--codex-windows-sandbox-setup-bin` - Make `.github/scripts/build-codex-package-archive.sh` pass resource binaries from the release output directory when they are already present. - Build Linux `bwrap` for app-server release jobs too, so app-server package creation does not invoke Cargo just to supply the package resource. - Keep macOS package creation as a no-Cargo path when `--entrypoint-bin` is provided, since macOS packages have no resource binaries. - Add unit coverage showing prebuilt macOS, Linux, and Windows package inputs result in no source-built binaries. ## Verification - `python3 -m unittest discover -s scripts/codex_package -p 'test_*.py'` - `python3 -m py_compile scripts/codex_package/*.py` - `bash -n .github/scripts/build-codex-package-archive.sh` - Dry-ran Linux and Windows package builds with fake prebuilt resources and a nonexistent Cargo path to verify the package builder did not invoke Cargo. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). 
Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/23759). * #23760 * __->__ #23759	2026-05-20 14:51:46 -07:00
Michael Bolin	96aa389c79	chore: use Codex Linux runners for Rust releases (#23761 ) ## Why Linux release jobs build the MUSL artifacts that ship in Codex releases, including both the primary CLI bundle and the app-server bundle. Those builds should run on the Codex Linux runner pools instead of generic Ubuntu-hosted runners so release builds use the x64 and arm64 capacity intended for Codex artifacts. ## What Changed - Moves the `x86_64-unknown-linux-musl` release matrix entries in `.github/workflows/rust-release.yml` from `ubuntu-24.04` to `codex-linux-x64-xl`. - Moves the `aarch64-unknown-linux-musl` release matrix entries from `ubuntu-24.04-arm` to `codex-linux-arm64`. - Leaves macOS release jobs, target triples, bundle names, and artifact names unchanged. ## Verification - Reviewed the workflow matrix diff for `.github/workflows/rust-release.yml`. - Not run locally; this is a GitHub Actions runner configuration change.	2026-05-20 14:45:19 -07:00
Michael Bolin	e1ec0eee5f	windows-sandbox: drive write roots from resolved permissions (#22923 ) ## Why This is the third PR in the Windows sandbox `SandboxPolicy` -> `PermissionProfile` migration stack. #22896 introduced `ResolvedWindowsSandboxPermissions`, and #22918 moved elevated runner IPC to carry `PermissionProfile`. This PR starts moving the remaining setup/spawn helpers away from asking legacy enum questions like “is this `WorkspaceWrite`?” and toward resolved runtime permission questions like “does this profile require write capability roots?” ## What changed - Added resolved-permissions helpers for network identity and write-capability detection. - Moved setup write-root gathering to operate on `ResolvedWindowsSandboxPermissions`, with the legacy `SandboxPolicy` wrapper left in place for existing call sites. - Updated identity setup, elevated capture setup, and world-writable audit denies to use resolved write roots. - Updated spawn preparation to carry resolved permissions in `SpawnContext` and use them for network blocking, setup write roots, elevated capability SID selection, and legacy capability roots. - Removed a now-unused legacy write-root helper. ## Verification - `cargo test -p codex-windows-sandbox` - `just fix -p codex-windows-sandbox` - Existing stack checks are green on #22896 and #22918; CI has started for this PR. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22923). * #23715 * #23714 * #23167 * __->__ #22923	2026-05-20 14:30:42 -07:00
Michael Bolin	f48be015d6	release: use DotSlash zstd for package archives (#23752 ) ## Why The Windows release job installed DotSlash successfully, but package archive creation still failed while writing `codex-package-*.tar.zst`. The Python archiver used `shutil.which("zstd")`, which does not reliably find the extensionless DotSlash manifest at `.github/workflows/zstd` from native Windows Python. That left release packaging dependent on a command named exactly `zstd` being discoverable on `PATH`, even though the repository already carries a DotSlash wrapper for Windows runners. ## What changed - Add `resolve_zstd_command()` to prefer a real `zstd` binary when present. - Fall back to invoking `dotslash .github/workflows/zstd` when `zstd` is not on `PATH`. - Keep the error explicit when neither `zstd` nor the DotSlash fallback is available. - Add unit coverage for direct `zstd`, DotSlash fallback, and missing-tool error paths. ## Verification - `python3 -m unittest discover -s scripts/codex_package -p 'test_*.py'` - `python3 -m py_compile scripts/codex_package/*.py`	2026-05-20 14:28:11 -07:00
evawong-oai	f6970214d2	Wire MITM hooks into runtime enforcement (#20659 ) ## Stack 1. Parent PR: #18868 adds MITM hook config and model only. 2. This PR wires runtime enforcement. 3. User facing config follow up: #18240 moves MITM policy into the PermissionProfile network tree. ## Why 1. After the hook model exists, the proxy needs a separate behavior change that can be tested at the request path. 2. This PR makes hooked HTTPS hosts require MITM, evaluates inner requests after CONNECT, mutates headers for matching hooks, and blocks hooked hosts when no hook matches. 3. It also fixes the activation path so a permission profile with MITM hook policy starts the managed proxy. 4. Keeping this separate from #18868 lets reviewers focus on runtime effects, telemetry, and request mutation. ## Summary 1. Store compiled MITM hooks in network proxy state. 2. Require MITM for hooked hosts even when network mode is full. 3. Evaluate inner HTTPS requests against host specific hooks. 4. Apply hook actions by replacing request headers before forwarding. 5. Block hooked hosts when no hook matches and record block telemetry. 6. Treat profile MITM hook policy as managed proxy policy so the proxy starts when needed. 7. Keep the duplicate authorization header replacement and query preserving request rebuild in this runtime PR. 8. Add runtime tests and README guidance for hook enforcement. ## Validation 1. Ran the network proxy MITM policy tests. 2. Ran the hooked host CONNECT test. 3. Ran the authorization header replacement test. 4. Ran the core permission profile proxy activation test for MITM hooks. 5. Ran the scoped Clippy fixer for the network proxy crate. 6. Ran the scoped Clippy fixer for the core crate.	2026-05-20 14:08:14 -07:00
Abhinav	af49d38373	Support compact SessionStart hooks (#21272 ) # Why Compaction replaces the live conversation history, so hooks that use `SessionStart` to re-inject durable model context need a way to run again after that rewrite. Related - #19905 adds dedicated compact lifecycle hooks # What - add `compact` as a supported `SessionStart` source and matcher value - change pending `SessionStart` state from a single slot to a small FIFO queue so `resume` / `startup` / `clear` can be preserved alongside a later `compact` - drain all queued `SessionStart` sources before the next model request, preserving their original order # Testing The new integration coverage verifies both the basic `compact` matcher path and the stacked `resume` -> `compact` case where both hooks contribute `additionalContext` to the next model turn.	2026-05-20 20:46:19 +00:00
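The move from a single pending slot to a small FIFO queue can be pictured with the sketch below. It is illustrative only and written in Python rather than the project's Rust; the class and method names are hypothetical, and only the source values (`resume`, `startup`, `clear`, `compact`) and the drain-in-original-order behavior come from the commit message.

```python
from collections import deque
from typing import Deque, List


class PendingSessionStarts:
    """Hypothetical holder for SessionStart sources awaiting the next model request."""

    def __init__(self) -> None:
        self._queue: Deque[str] = deque()

    def push(self, source: str) -> None:
        # A single slot would overwrite an earlier source here; a queue keeps both.
        self._queue.append(source)

    def drain(self) -> List[str]:
        # Hand back every queued source in arrival order and leave the queue empty.
        drained = list(self._queue)
        self._queue.clear()
        return drained


pending = PendingSessionStarts()
pending.push("resume")
pending.push("compact")
sources = pending.drain()  # both sources survive, "resume" before "compact"
```

With a single slot, the later `compact` would have replaced `resume`, which is the stacked case the new integration coverage exercises.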
Casey Chow	9265701b7f	[skills] Create a personal update flow for plugin creator (#23542 ) ## Summary Creates a personal-marketplace update flow for the plugin-creator skill when iterating on an existing local plugin. ## Context Plugin creation already had a scaffold path, but the follow-up story for updating an existing local plugin during development was not explicit. The goal of this change is to make that default personal-marketplace update loop legible at the point of use instead of leaving it implied or hidden behind a larger helper. ## Decision Keep the scaffold flow intact, add a dedicated update/reinstall reference centered on the personal marketplace, document the actual `codex plugin add` and marketplace-check commands directly, and keep helper automation narrowly scoped to the repetitive local-update steps. ## Changes - update plugin-creator to point existing-plugin iteration at a personal-marketplace update flow - add `references/installing-and-updating.md` with the explicit marketplace check and reinstall sequence - add small helper scripts for reading marketplace names and updating plugin versions during local iteration ## Tests - `python3 codex-rs/skills/src/assets/samples/skill-creator/scripts/quick_validate.py codex-rs/skills/src/assets/samples/plugin-creator` - `python3 -m py_compile codex-rs/skills/src/assets/samples/plugin-creator/scripts/create_basic_plugin.py codex-rs/skills/src/assets/samples/plugin-creator/scripts/read_marketplace_name.py codex-rs/skills/src/assets/samples/plugin-creator/scripts/update_plugin_cachebuster.py`	2026-05-20 16:44:41 -04:00
Michael Bolin	d1e3d54192	cli: add strict config to exec-server (#23719 ) ## Why PR #20559 added opt-in strict config parsing to the config-loading command surfaces, but `codex exec-server` was left out. That meant `codex exec-server --strict-config` was rejected even though the command can load config for remote registration, and local server startup had no way to fail fast on misspelled config keys. ## What Changed - Added `--strict-config` to `codex exec-server`. - Allowed root-level inheritance from `codex --strict-config exec-server`. - Validated config before local exec-server startup when strict mode is requested. - Reused the loaded strict-config-aware config for remote exec-server registration auth. - Added CLI coverage showing `codex exec-server --strict-config` rejects unknown config fields. ## Verification - `cargo test -p codex-cli` - New integration test: `strict_config_rejects_unknown_config_fields_for_exec_server` ## Documentation Any strict-config command list on developers.openai.com/codex should include `codex exec-server` with the other supported config-loading entry points.	2026-05-20 13:12:31 -07:00
viyatb-oai	fe7c069fe6	feat(permissions): resolve permission profile inheritance (#22270 ) ## Stack This is the foundation PR for the permission-profile inheritance stack. - This PR adds config-level `extends` resolution and merge semantics. - Follow-up: #23705 applies resolved profiles at runtime and updates the active-profile protocol surfaces. ## Why Permission profiles are starting to carry enough policy that copy-pasting near-identical definitions becomes hard to review and easy to drift. Before the runtime can consume inherited profiles, the config layer needs one explicit resolver that can merge parent chains and reject unsafe or invalid inheritance shapes. ## What changed - Add `extends` to permission-profile TOML and resolve parent chains in inheritance order. - Merge inherited profile TOML with the existing config merge behavior while preserving the permission-specific normalization needed for network domain keys. - Keep parent descriptions out of resolved child profiles and record inherited profile names separately for downstream consumers. - Reject undefined parents, unsupported built-in parents, and inheritance cycles with targeted errors. - Cover resolver behavior with TOML fixture tests and refresh the generated config schema. ## Validation - `cargo test -p codex-config` - `cargo test -p codex-core permissions_profiles_`	2026-05-20 20:12:07 +00:00
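A rough sketch of the resolver behavior listed above: walk the `extends` chain, reject undefined parents and cycles, merge from the root parent down, and keep parent descriptions out of the child. The function names, the dictionary layout, and the recursive merge are assumptions for illustration; the actual resolver reuses the existing config merge behavior and also normalizes network domain keys and rejects unsupported built-in parents, none of which is modeled here.

```python
from typing import Any, Dict, List, Tuple

Profile = Dict[str, Any]


def resolve_chain(name: str, profiles: Dict[str, Profile]) -> List[str]:
    """Return profile names from the root parent down to `name`."""
    chain: List[str] = []
    seen = set()
    current = name
    while current is not None:
        if current not in profiles:
            raise ValueError(f"undefined parent profile: {current}")
        if current in seen:
            raise ValueError(f"inheritance cycle through profile: {current}")
        seen.add(current)
        chain.append(current)
        current = profiles[current].get("extends")
    chain.reverse()  # parents first, child last
    return chain


def merge(base: Profile, overlay: Profile) -> Profile:
    """Recursive table merge where the overlay wins (assumed semantics)."""
    result = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = value
    return result


def resolve_profile(name: str, profiles: Dict[str, Profile]) -> Tuple[Profile, List[str]]:
    """Return the merged profile and the inherited profile names, recorded separately."""
    chain = resolve_chain(name, profiles)
    merged: Profile = {}
    for layer_name in chain:
        layer = {k: v for k, v in profiles[layer_name].items() if k != "extends"}
        if layer_name != name:
            layer.pop("description", None)  # parent descriptions are not inherited
        merged = merge(merged, layer)
    return merged, chain[:-1]
```

For example, a `dev` profile that extends `base` and overrides one network setting would resolve to the parent's filesystem table plus the child's network value, with `["base"]` reported as the inherited names.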
evawong-oai	3d94e24a3d	Add MITM hook config model (#18868 ) ## Stack 1. This PR adds MITM hook config and model only. 2. Runtime follow up: #20659 wires hook enforcement into the proxy request path. 3. User facing config follow up: #18240 moves MITM policy into the PermissionProfile network tree. ## Why 1. Viyat asked for the original parent PR to be split so reviewers can inspect the policy model before request behavior changes. 2. This PR gives the proxy a typed MITM hook model, validation, matcher compilation, permissions TOML plumbing, schema support, and config tests. 3. This PR deliberately does not change CONNECT or MITM request handling. 4. Keeping runtime behavior out of this PR makes the review boundary simple: does the policy model parse, validate, compile, and lower correctly. ## Summary 1. Add the MITM hook config model and matcher compilation. 2. Validate hosts, methods, paths, query matchers, header matchers, secret sources, and reserved body matching. 3. Add wildcard matcher support for path, query value, and header value matching. 4. Add permissions TOML and schema support for flat runtime hook config. 5. Add config loader tests for MITM hook overlay behavior. ## Validation 1. Regenerated the config schema. 2. Ran the network proxy MITM hook unit tests. 3. Ran the core permission profile MITM hook parsing tests. 4. Ran the core config schema fixture test. 5. Ran the scoped Clippy fixer for the network proxy crate. 6. Ran the scoped Clippy fixer for the core crate. ## Notes 1. Runtime enforcement moved to #20659. 2. User facing PermissionProfile TOML shape remains in #18240.	2026-05-20 12:51:12 -07:00
Michael Bolin	61aae56571	windows-sandbox: share bundled helper lookup (#23735 ) ## Summary Follow-up to #23636 review feedback: the Windows sandbox had two copies of the same bundled-helper lookup order, one for `codex-command-runner.exe` in `helper_materialization.rs` and one for `codex-windows-sandbox-setup.exe` in `setup.rs`. This PR centralizes that lookup in `helper_materialization::bundled_executable_path_for_exe()` and has setup reuse it for `codex-windows-sandbox-setup.exe`. The lookup behavior is unchanged: direct sibling first, package-root `codex-resources/` when running from `bin/`, then legacy sibling `codex-resources/`. ## Test plan - `cargo test -p codex-windows-sandbox` ## Notes I also attempted `cargo check -p codex-windows-sandbox --target x86_64-pc-windows-gnullvm`, but this local host is missing `x86_64-w64-mingw32-clang`.	2026-05-20 19:50:38 +00:00
Michael Bolin	729bdf3c8d	windows-sandbox: send permission profiles to elevated runner (#22918 ) ## Why This is the next PR in the Windows sandbox migration stack after #22896. The bottom PR introduces a Windows-local resolved permissions helper while existing callers still start from legacy `SandboxPolicy`. This PR moves the elevated runner IPC boundary to `PermissionProfile`, which makes the direction of the stack visible without changing the public core call sites yet. Because that changes the CLI-to-command-runner message shape, the framed IPC protocol version is bumped in the same PR so the boundary change is explicit. ## What changed - Replaced elevated IPC `policy_json_or_preset`/`sandbox_policy_cwd` fields with `permission_profile`/`permission_profile_cwd`. - Bumped the elevated command-runner IPC protocol to `IPC_PROTOCOL_VERSION = 2` and switched parent/runner frames to use the shared constant. - Converted the parent elevated paths from the parsed legacy policy into a materialized `PermissionProfile` before sending the runner request. - Added `WindowsSandboxTokenMode` resolution for managed `PermissionProfile` values and made the runner choose read-only vs writable-root capability tokens from that resolved profile. - Rejected disabled, external, unrestricted, and full-disk-write profiles before token selection. - Added IPC JSON coverage for tagged `PermissionProfile` payloads and token-mode unit coverage for the resolved permission helper. ## Verification - `cargo test -p codex-windows-sandbox` - `just fix -p codex-windows-sandbox` - `cargo check -p codex-windows-sandbox --target x86_64-pc-windows-msvc --tests` was attempted locally but blocked before crate type-checking because the macOS compiler environment lacks Windows C headers such as `windows.h` and `assert.h`; GitHub Windows CI is the required verification for the runner path. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). 
Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22918). * #23715 * #23714 * #23167 * #22923 * __->__ #22918	2026-05-20 12:41:06 -07:00
Michael Bolin	cb05de6724	dotslash: publish Codex entrypoints from package archives (#23638 ) ## Summary DotSlash should resolve the same canonical package archives used by standalone installers and npm platform packages, rather than continuing to point at single-binary zstd artifacts or the older Linux bundle archive. This updates the Codex CLI and `codex-app-server` DotSlash release config entries to match `codex-package-<target>.tar.gz` and `codex-app-server-package-<target>.tar.gz`, with paths that select `bin/codex` or `bin/codex-app-server` inside the extracted package. The other helper outputs stay on their existing per-binary artifacts for now. ## Test plan - `python3 -m json.tool .github/dotslash-config.json > /dev/null` - Ran a Python regex smoke test that checked every updated `codex` and `codex-app-server` platform entry against the archive names emitted by `.github/scripts/build-codex-package-archive.sh`.	2026-05-20 12:18:10 -07:00
viyatb-oai	0edcc4b94e	fix(config): resolve cloud requirements deny-read globs (#23729 ) ## Why Cloud-managed `requirements.toml` contents were deserialized without an `AbsolutePathBuf` base directory. Relative managed `permissions.filesystem.deny_read` glob entries therefore failed while the equivalent local system requirements path succeeded under its `AbsolutePathBufGuard`. This follows the `codex_home` base path convention clarified in https://github.com/openai/codex/pull/15707. ## What changed - Resolve cloud requirements TOML under an `AbsolutePathBufGuard` rooted at `codex_home`. - Reuse the same base for cloud requirements loaded from the signed cache. - Add a regression test for a relative cloud-managed `deny_read` glob. ## Validation - `just fmt` - `cargo test -p codex-cloud-requirements` - `cargo clippy -p codex-cloud-requirements --all-targets --no-deps` - `just bazel-lock-update` - `just bazel-lock-check` - `git diff --check`	2026-05-20 12:15:44 -07:00
Michael Bolin	e389e01f83	npm: ship platform packages in Codex package layout (#23637 ) ## Summary The npm platform packages should stop carrying a bespoke native layout now that the release workflow builds canonical Codex package archives. Keeping npm on the same `bin/`, `codex-resources/`, and `codex-path/` structure lets the Rust package-layout detection behave consistently across standalone, npm, and future DotSlash installs. This changes platform npm packages to stage the `codex-package` artifact for each target under `vendor/<target>`. The Node launcher now resolves `bin/codex` and prepends `codex-path`, while retaining legacy `vendor/<target>/codex` and `vendor/<target>/path` fallback support for local development and migration. The npm staging helper downloads `codex-package` archives instead of rebuilding the CLI payload from individual `codex`, `rg`, `bwrap`, and sandbox helper artifacts. CI still needs to stage npm packages from historical rust-release workflow artifacts that predate package archives, so the staging scripts expose an explicit `--allow-legacy-codex-package` fallback. That fallback synthesizes the canonical package layout from legacy per-binary artifacts and is wired only into the CI smoke path; release staging remains strict and continues to require real package archives. For direct local use, `install_native_deps.py` now points its built-in default workflow at the same recent artifact run used by CI and automatically enables legacy package synthesis only when `--workflow-url` is omitted. Explicit workflow URLs remain strict unless callers opt in with `--allow-legacy-codex-package`. 
## Test plan - `python3 -m py_compile codex-cli/scripts/build_npm_package.py codex-cli/scripts/install_native_deps.py scripts/stage_npm_packages.py scripts/codex_package/cli.py` - `node --check codex-cli/bin/codex.js` - `ruby -e 'require "yaml"; YAML.load_file(".github/workflows/rust-release.yml"); YAML.load_file(".github/workflows/ci.yml"); puts "ok"'` - Staged a synthetic `codex-linux-x64` platform package from a canonical vendor tree and verified it copied only `bin/`, `codex-path/`, `codex-resources/`, and `codex-package.json`. - Imported `install_native_deps.py` and extracted a synthetic `codex-package-x86_64-unknown-linux-musl.tar.gz` into `vendor/<target>`. - Ran legacy-layout conversion smokes for Linux, Windows, and unsigned macOS artifact naming. - Ran a synthetic `install_native_deps.py` default-workflow smoke that verifies legacy package synthesis is automatic only when `--workflow-url` is omitted. - `NPM_CONFIG_CACHE="$tmp_dir/npm-cache" python3 ./scripts/stage_npm_packages.py --release-version 0.125.0 --workflow-url https://github.com/openai/codex/actions/runs/26131514935 --package codex --allow-legacy-codex-package --output-dir "$tmp_dir"` - `node codex-cli/bin/codex.js --version` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/23637). * #23638 * __->__ #23637	2026-05-20 12:02:32 -07:00
Eric Traut	7c3cc1db81	Fix thread settings clippy failure (#23724 ) ## Why `main` picked up two small Rust build failures after nearby merges: - #23507 added a real handler for `ServerNotification::ThreadSettingsUpdated`, but the same variant was still listed in the ignored-notification match arm. Full Clippy runs treat the resulting unreachable-pattern warning as an error. - #23666 added `turn_id` and `truncation_policy` to `codex_tools::ToolCall`, while the goal extension backend test fixtures from the goal-extension work still used the old shape. That left `codex-goal-extension` tests unable to compile once the branches met on `main`. ## What changed Removed the duplicate `ThreadSettingsUpdated` match pattern from `tui/src/chatwidget/protocol.rs`. Updated the goal extension test `tool_call` helper to populate the new `ToolCall` fields, and reused that helper for the one direct literal that still had the old field list. ## Verification - `just fix -p codex-tui` - `cargo test -p codex-goal-extension`	2026-05-20 11:58:23 -07:00
sayan-oai	ed6d73b3b9	add standalone websearch api client (#23655 ) Add standalone web search request types and a `codex-api` client ahead of the extension-contributed search tool. This adds typed commands/settings and opaque encrypted output handling for the new standalone search flow. The endpoint types are close to final but may still shift slightly as that API settles.	2026-05-20 11:38:21 -07:00
jif-oai	d84b824d53	[codex] Preserve failed goal accounting flushes (#23717 ) ## What - Preserve database accounting failures from the goal extension instead of collapsing them into `None` - Warn with turn/tool context when a flush fails - Keep stop/abort accounting snapshots alive when the final flush did not persist ## Why PR #23696 can finish and discard a turn snapshot after `account_thread_goal_usage` fails. That loses the final accumulated accounting state silently. This follow-up keeps that failure explicit and avoids deleting the local snapshot in the failing path. ## Testing - `just fmt` - `cargo test -p codex-goal-extension`	2026-05-20 20:37:27 +02:00
Michael Bolin	110b30d545	install: consume Codex package archives (#23636 ) ## Summary Standalone installs should exercise the same canonical package archive layout that release builds produce, rather than unpacking npm platform packages and reconstructing a parallel install tree. This updates `install.sh` and `install.ps1` to prefer `codex-package-<target>.tar.gz` plus `codex-package_SHA256SUMS` introduced in https://github.com/openai/codex/pull/23635, authenticate the checksum manifest against GitHub release metadata, verify the selected package archive against the authenticated manifest, and install the package archive directly. ## Compatibility Notes Package installs still leave a compatibility command at `current/codex` for managed daemon flows, while visible command shims point at `bin/codex` inside the package layout. Recent releases that predate package archives still publish per-platform npm artifacts, so both installers keep a legacy platform npm fallback for those versions and verify those archives against release metadata directly. Releases old enough to publish only the single root `codex-npm-<version>.tgz` archive are intentionally out of scope. The installers fail clearly when neither package archives nor per-platform npm archives are present. On Windows, the runtime helper lookups now recognize package-layout installs where `codex.exe` runs from `bin/`, so `codex-command-runner.exe` and `codex-windows-sandbox-setup.exe` resolve from the top-level `codex-resources/` directory. The direct-sibling and older sibling-resource fallbacks are preserved. 
## Test plan - `sh -n scripts/install/install.sh` - `bash -n scripts/install/install.sh` - `pwsh -NoProfile -Command '$tokens=$null; $errors=$null; $null = [System.Management.Automation.Language.Parser]::ParseFile("scripts/install/install.ps1", [ref]$tokens, [ref]$errors); if ($errors.Count) { $errors | Format-List; exit 1 }'` - `HOME="$home_dir" CODEX_HOME="$tmp_dir/codex-home" CODEX_INSTALL_DIR="$bin_dir" PATH="$bin_dir:$PATH" sh scripts/install/install.sh --release 0.125.0` - Verified the 0.125.0 isolated install leaves the visible command pointed at `current/codex` and includes the legacy `codex-resources/rg` payload. - `cargo test -p codex-windows-sandbox` - `just fix -p codex-windows-sandbox` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/23636). * #23638 * #23637 * __->__ #23636	2026-05-20 11:20:11 -07:00
jif-oai	c5bd131567	feat: add turn_id and truncation_policy to extension tool calls (#23666 ) ## Why Extension-owned tools currently receive a stripped `ToolCall` with only `call_id`, `tool_name`, and `payload`. That makes extension work that needs turn-local execution context awkward, especially web-search extension work that needs the active `truncation_policy` at tool invocation time. Reconstructing that value from config or `ExtensionData` would be indirect and could drift from the actual turn context, so the cleaner fix is to pass the needed turn metadata directly on the extension-facing invocation type. ## What changed - added `turn_id` and `truncation_policy` to `codex_tools::ToolCall` - populated those fields when core adapts `ToolInvocation` into an extension tool call - added a focused adapter test that verifies extension executors receive the forwarded turn metadata - updated the memories extension tests to construct the richer `ToolCall` - added the `codex-utils-output-truncation` dependency to `codex-tools` and refreshed lockfiles ## Testing - `cargo test -p codex-tools` - `cargo test -p codex-memories-extension` - `cargo test -p codex-core passes_turn_fields_to_extension_call` - `just bazel-lock-update` - `just bazel-lock-check`	2026-05-20 20:14:41 +02:00
Eric Traut	edc48e4612	Sync TUI thread settings through app server (#23507 ) Builds on #23502. ## Why #23502 adds the app-server `thread/settings/update` API and matching `thread/settings/updated` notification. The TUI already lets users change thread-scoped settings such as model, reasoning effort, service tier, approvals, permissions, personality, and collaboration mode, but those updates need to flow through the app server so embedded and connected clients observe the same thread state. This is a rework (simplification) of PR https://github.com/openai/codex/pull/22510. It has the same functionality, but the underlying `thread/settings/update` API is now simpler in that it no longer returns the effective settings as a response. Now, clients receive the effective settings only through the `thread/settings/updated` notification. ## What Changed This updates the TUI to send `thread/settings/update` whenever those thread-scoped settings change and to treat the RPC response as the authoritative acknowledgement. It also routes `thread/settings/updated` notifications back into cached session state and the visible chat widget so active and inactive threads stay in sync after app-server-originated changes. The implementation is kept to the TUI layer: settings conversion and merge logic live under `codex-rs/tui/src/app/thread_settings.rs`, with dispatch/routing hooks in the existing app and chat widget paths. ## Verification I manually tested using `codex app-server --listen unix://` and then launching two copies of the TUI that use the same local app server. I then resumed the same thread on both and verified that changes like plan mode, fast mode, model, reasoning effort, etc. are reflected "live" in the second client when modified in the first and vice versa.	2026-05-20 11:05:14 -07:00
Eric Traut	771a4e74ac	Add thread/settings/update app-server API (#23502 ) ## Why App-server clients need a way to update a thread's next-turn settings without starting a turn, adding transcript content, or waiting for turn lifecycle events. This gives settings UI a direct path for durable thread settings while clients observe the eventual effective state through a notification. This is a simplified rework of PR https://github.com/openai/codex/pull/22509. In particular, it changes the `thread/settings/update` API to return immediately rather than waiting and returning the effective (updated) thread settings. This makes the new API consistent with `turn/start` and greatly reduces the complexity of the implementation relative to the earlier attempt. ## What Changed - Adds experimental `thread/settings/update` with partial-update request fields and an empty acknowledgment response. - Adds experimental `thread/settings/updated`, carrying full effective `ThreadSettings` and scoped by `threadId` to subscribed clients for the affected thread. - Shares durable settings validation with `turn/start`, including `sandboxPolicy` plus `permissions` rejection and `serviceTier: null` clearing. - Emits the same settings notification when `turn/start` overrides change the stored effective thread settings. - Regenerates app-server protocol schema fixtures and updates `app-server/README.md`.	2026-05-20 11:03:20 -07:00
Michael Bolin	2b4898cc47	windows-sandbox: add resolved permissions helper (#22896 ) ## Why The Windows sandbox migration away from the legacy `SandboxPolicy` abstraction needs a small local bridge before IPC and core wiring can move to `PermissionProfile`. Leaf helpers currently branch directly on `WorkspaceWrite`, which spreads legacy assumptions through path planning and token setup code. This PR introduces a Windows-local resolved permissions view so those helpers can ask Windows-specific questions about runtime filesystem/network permissions without matching on the legacy policy enum everywhere. ## What changed - Added `ResolvedWindowsSandboxPermissions` in `windows-sandbox-rs/src/resolved_permissions.rs`, with legacy `SandboxPolicy` constructors for the current call sites. - Moved `allow.rs` writable-root and read-only-subpath planning onto the resolved permissions type. - Preserved Windows `TEMP`/`TMP` writable-root behavior when the effective policy includes writable tmpdir access. - Avoided resolving Unix `:slash_tmp` or parent-process `TMPDIR` while computing Windows writable roots. - Reused the shared allow-path result for setup write-root gathering and routed network-block selection through the resolved abstraction. ## Verification - `cargo test -p codex-windows-sandbox` - `just fix -p codex-windows-sandbox` - GitHub CI restarted on the amended commit; Windows Bazel is the required signal for the Windows-only code paths. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22896). * #23715 * #23714 * #23167 * #22923 * #22918 * __->__ #22896	2026-05-20 17:30:46 +00:00
Felipe Coury	050a2e2668	fix(app-server): speed up shutdown (#23578 ) ## Why Pressing `Ctrl+C` or `Ctrl+D` in the TUI could make Codex pause during shutdown when app-server background work still held outbound sender clones. Shutdown tracing against the current `~/.codex` path found three relevant holders: - `SkillsWatcher` kept its event-loop task alive until the shutdown timeout path. - `AppServerAttestationProvider` retained a strong `Arc<OutgoingMessageSender>`, which could keep outbound teardown waiting after the processor task had exited. - A background `apps/list` task could still own an outbound sender when shutdown began, causing the in-process app-server runtime to wait for its outbound channel to close. ## What Changed - Give `SkillsWatcher` an explicit shutdown `CancellationToken` and cancel it from app-server teardown so its event loop drops the outbound sender promptly. - Change `AppServerAttestationProvider` to keep a `Weak<OutgoingMessageSender>` and return immediately when it can no longer be upgraded. - Give `AppsRequestProcessor` a shutdown `CancellationToken` and cancel in-flight background `apps/list` work during teardown. ## How to Test 1. Start Codex TUI from a real home configuration. 2. Press `Ctrl+C`. 3. Confirm Codex exits promptly instead of pausing during shutdown. 4. Repeat with `Ctrl+D` and confirm the same prompt exit path. Focused manual trace validation from the investigation: - Before the full fix, reproduced shutdown traces showed outbound teardown waiting on lingering owners, including `attestation.provider=1` and later `apps.list.task=1`. - After the fix, fresh real-home `Ctrl+D` traces showed `app_server.runtime.outbound_state_after_processor_join` with `owners=none`, `app_server.runtime.wait_outbound_handle = 0ms`, and total TUI app-server shutdown around `18ms`. Targeted validation: - `RUST_MIN_STACK=8388608 cargo test -p codex-app-server`	2026-05-20 17:30:19 +00:00
Eric Traut	c0f7e1b99f	[2 of 2] Start fresh TUI thread in background (#23176 ) ## Why After the terminal-probe work in #23175, fresh-session startup still waits for `thread/start` before the chat input can become usable. The chat widget already has the machinery to hold early submissions until a session is configured, so fresh `thread/start` does not need to stay on the input-ready hot path. Refs #16335. ## What This PR starts fresh app-server threads in a background task, reports completion through a startup app event, and attaches the primary session once `thread/start` returns. Resume and fork startup paths remain synchronous. ## Benchmark In the local pty startup benchmark, this PR's pre-optimization base branch, #23175, measured about 152ms median from launch to accepted chat input. The stacked result measured about 66ms median, for an approximate additional savings of 85-95ms. For broader context, the original `main` baseline before either startup optimization was about 250.5ms median. We also measured Codex 0.117.0 on the same machine at about 64.6ms median, so the stacked branch is back in the old-startup-time range. ## Stack 1. [#23175: [1 of 2] Optimize TUI startup terminal probes](https://github.com/openai/codex/pull/23175) — base PR 2. [#23176: [2 of 2] Start fresh TUI thread in background](https://github.com/openai/codex/pull/23176) — this PR ## Verification - `cargo test -p codex-tui`	2026-05-20 10:00:33 -07:00
jif-oai	d4f842f3b3	feat: account active goal progress in the goal extension (#23696 ) ## Why The goal extension can create and surface goals, but the live turn-accounting path still stopped short of persisting active-goal progress. That leaves token and wall-clock usage, plus `ThreadGoalUpdated` events, out of sync with the extension boundary once work actually advances or a goal transitions out of active state. ## What changed - Teach `GoalAccountingState` to track the current turn, active goal, token deltas, and wall-clock progress snapshots against the persisted goal id. - Flush active-goal accounting from tool-finish, turn-stop, and turn-abort lifecycle hooks, and emit `ThreadGoalUpdated` events when persisted progress changes. - Route `create_goal` and `update_goal` through the same accounting state so new goals start from the right baseline, final progress is flushed before status changes, and `update_goal` can mark a goal `blocked` as well as `complete`. - Keep budget-limited goals accruing through the end of the turn while clearing local active-goal state once a turn or explicit update is finished. - Expand backend and lifecycle coverage around store ids, baseline reset, tool-finish accounting, budget-limited carry-through, and blocked-goal updates. ## Testing - Added focused backend coverage in `codex-rs/ext/goal/tests/goal_extension_backend.rs` for baseline reset, tool-finish accounting, budget-limited turns, and blocked-goal updates. - Extended `codex-rs/core/src/session/tests.rs` to assert that lifecycle inputs expose the expected session, thread, and turn store ids.	2026-05-20 18:36:37 +02:00
anp-oai	f198ca115b	feat: Add btw alias for side slash command (#23592)	2026-05-20 15:49:35 +00:00
Michael Bolin	e9f59e30d9	release: publish Codex package archive checksums (#23635) ## Summary Standalone installers and other downstream package consumers need a stable checksum source for the canonical package archives. Relying on per-asset metadata makes that harder to consume uniformly, especially when several package archives are produced in the same release. This keeps the `codex-package-.tar.gz` and `codex-app-server-package-.tar.gz` assets in the GitHub Release upload set and adds `codex-package_SHA256SUMS` to `dist/` before the release is created. The manifest contains one SHA-256 line per package archive and fails the release job if no package archives are present. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/23635). * #23638 * #23637 * #23636 * __->__ #23635	2026-05-20 08:48:04 -07:00
Michael Bolin	b0b383bea3	runtime: use install context for bundled bwrap (#23634) ## Summary The Linux sandbox should find bundled `bwrap` through the same package-layout abstraction as the rest of the runtime, instead of maintaining a separate standalone-specific lookup path. This adds an `InstallContext` helper for bundled resources and updates `codex-linux-sandbox` to ask the current install context for `codex-resources/bwrap` before falling back to the old executable-relative probes. The tests cover npm-style, standalone, and canonical package layouts so `bwrap` lookup follows the package structure introduced earlier in the stack. ## Test plan - `cargo test -p codex-install-context` - `cargo test -p codex-linux-sandbox --lib` - `just fix -p codex-install-context -p codex-linux-sandbox` - `just bazel-lock-check` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/23634). * #23638 * #23637 * #23636 * #23635 * __->__ #23634	2026-05-20 08:24:43 -07:00
pakrym-oai	a52c91d8b5	[codex] Hide deferred tools from code mode prompt (#23605) ## Why `code_mode_only_guides_all_tools_search_and_calls_deferred_app_tools` was failing because code-mode prompt generation used the same nested tool spec list for both the model-visible `exec` guide and the runtime `ALL_TOOLS` surface. That allowed deferred MCP/app tools, such as `calendar_timezone_option_99`, to leak into the `exec` description even though they should only be discoverable through `ALL_TOOLS` at runtime. ## What changed Split code-mode nested tool planning into two sets in `core/src/tools/spec_plan.rs`: - runtime nested tool specs still include deferred tools, so `tools[...]` and `ALL_TOOLS` can call them - `exec` prompt docs only render non-deferred tools, so deferred app tools stay out of the model-visible guide ## Validation - `cargo test -p codex-core --test all code_mode_only_guides_all_tools_search_and_calls_deferred_app_tools -- --nocapture` - looped the same focused test 5 additional times with `cargo test -q -p codex-core --test all code_mode_only_guides_all_tools_search_and_calls_deferred_app_tools`	2026-05-20 08:09:45 -07:00
jif-oai	59507b8491	feat: expose turn-start metadata to extensions (#23688) ## Why The goal extension needs more context when a turn starts than `turn_store` alone provides. In particular, goal accounting needs the stable turn id, the effective collaboration mode, and the cumulative token-usage baseline captured at turn start so it can: - suppress goal accounting for plan-mode turns - compute exact per-turn deltas from cumulative `total_token_usage` snapshots instead of relying on the most recent usage event alone - keep the extension-owned accounting path aligned with the host turn lifecycle ## What - extend `codex_extension_api::TurnStartInput` to expose `turn_id`, `collaboration_mode`, and `token_usage_at_turn_start` - pass the full `TurnContext` plus the captured token-usage baseline through the turn-start lifecycle emission path - initialize goal turn accounting from the turn-start baseline and collaboration mode - switch goal token accounting to compute deltas from cumulative `total_token_usage` snapshots - add coverage for the new turn-start lifecycle fields and for goal-accounting baseline behavior ## Testing - added `turn_start_lifecycle_exposes_turn_metadata_and_token_baseline` in `codex-rs/core/src/session/tests.rs` - added `ext/goal/tests/accounting.rs` coverage for baseline-aware goal accounting and plan-mode suppression	2026-05-20 15:54:29 +02:00
jif-oai	1392a2a770	feat: async turn item process (#23692) Mechanical change	2026-05-20 15:30:01 +02:00
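
Commit 59507b8491 in the list above describes computing per-turn token deltas from cumulative `total_token_usage` snapshots against a baseline captured at turn start, with accounting suppressed for plan-mode turns. The arithmetic can be sketched roughly as follows. This is a minimal Python illustration, not the Codex implementation (which lives in the Rust goal extension); `TurnAccounting`, `record_snapshot`, and all field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TurnAccounting:
    """Hypothetical per-turn accounting state (illustrative, not a Codex type)."""

    baseline_tokens: int       # cumulative usage captured at turn start
    plan_mode: bool = False    # assumed: plan-mode turns are not accounted
    accounted_tokens: int = 0  # tokens already attributed during this turn

    def record_snapshot(self, total_token_usage: int) -> int:
        """Return the tokens newly attributable given a cumulative snapshot."""
        if self.plan_mode:
            return 0
        # Usage for this turn is measured against the turn-start baseline,
        # so repeated or out-of-order snapshots never double-count.
        used_this_turn = max(0, total_token_usage - self.baseline_tokens)
        delta = max(0, used_this_turn - self.accounted_tokens)
        self.accounted_tokens += delta
        return delta


turn = TurnAccounting(baseline_tokens=1_000)
deltas = [turn.record_snapshot(t) for t in (1_200, 1_200, 1_750)]
print(deltas)  # [200, 0, 550]
```

Working from cumulative snapshots rather than individual usage events means a repeated snapshot contributes a delta of zero, which is the property the commit message points to when it contrasts this with relying on the most recent usage event alone.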


6733 Commits