codex

mirror of https://github.com/openai/codex.git synced 2026-05-19 02:33:10 +00:00

Author	SHA1	Message	Date
Michael Bolin	8eee4e699f	fix: document cargo deny advisory exceptions	2026-04-22 09:43:18 -07:00
Steve Coffey	0127cef5db	Stage publishable Python runtime wheels (#18865) This is PR 2 of the Python SDK PyPI publishing split. [PR 1](https://github.com/openai/codex/pull/18862) refreshed the generated SDK bindings; this PR makes the runtime package itself publishable, and PR 3 will wire the SDK package/version pinning to this runtime package. ## Summary - Rename the runtime distribution to `openai-codex-cli-bin` while keeping the import package as `codex_cli_bin`. - Make the runtime package wheel-only and build `py3-none-<platform>` wheels instead of interpreter-specific wheels. - Add `stage-runtime --codex-version` and `--platform-tag` so release staging can produce the platform wheel matrix from Codex release tags. - Add focused artifact workflow tests for version normalization, platform tag injection, and runtime wheel metadata. ## Why Rename There is already an unofficial PyPI package, [`codex-bin`](https://pypi.org/project/codex-bin/), distributing OpenAI Codex binaries. Publishing the official SDK runtime dependency as `openai-codex-cli-bin` makes the ownership clear, avoids confusing the SDK-pinned runtime wheel with that unowned wrapper, and keeps the import package unchanged as `codex_cli_bin`. ## Tests - `uv run --extra dev pytest tests/test_artifact_workflow_and_binaries.py` -> 21 passed - `uv run --extra dev python scripts/update_sdk_artifacts.py stage-runtime /tmp/codex-python-pr2-rebased/runtime-stage /tmp/codex-python-pr2-rebased/codex --codex-version rust-v0.116.0-alpha.1 --platform-tag macosx_11_0_arm64` - `uv run --with build --extra dev python -m build --wheel /tmp/codex-python-pr2-rebased/runtime-stage` - `uv run --with twine --extra dev twine check /tmp/codex-python-pr2-rebased/runtime-stage/dist/openai_codex_cli_bin-0.116.0a1-py3-none-macosx_11_0_arm64.whl` ## Note - Full `uv run --extra dev pytest` currently fails because regenerating from schemas already on `main` adds new DeviceKey Python types. I left that generated catch-up out of this runtime-only PR.	2026-04-22 08:14:48 -07:00
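The version normalization mentioned in this commit can be illustrated from its own example, where the tag `rust-v0.116.0-alpha.1` ends up as `0.116.0a1` in the wheel filename. The following is a minimal sketch only: `normalize_release_tag` is a hypothetical name, the real logic lives in a Python script not shown here, and the handling of `beta` and `rc` labels is an assumption based on common Python version conventions rather than anything stated in the commit.

```rust
/// Hypothetical helper: turn a release tag such as `rust-v0.116.0-alpha.1`
/// into a Python-style version such as `0.116.0a1`.
/// Returns `None` for tags that do not match the assumed shape.
fn normalize_release_tag(tag: &str) -> Option<String> {
    // Assumed tag prefix, taken from the example in the commit message.
    let version = tag.strip_prefix("rust-v")?;
    match version.split_once('-') {
        // No pre-release suffix: the version is used as-is.
        None => Some(version.to_string()),
        Some((base, pre)) => {
            let (label, number) = pre.split_once('.')?;
            // Only `alpha` appears in the source; `beta` and `rc` are assumptions.
            let short = match label {
                "alpha" => "a",
                "beta" => "b",
                "rc" => "rc",
                _ => return None,
            };
            if number.is_empty() || !number.chars().all(|c| c.is_ascii_digit()) {
                return None;
            }
            Some(format!("{base}{short}{number}"))
        }
    }
}

fn main() {
    let tag = "rust-v0.116.0-alpha.1";
    println!("{tag} -> {:?}", normalize_release_tag(tag));
}
```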
Vaibhav Srivastav	0ebe69a8c3	[codex] Update imagegen system skill (#18852 ) ## Summary This updates the embedded `imagegen` system skill in `codex-rs/skills` with the ImageGen 2 skill changes from `openai/skills-internal#87`. The bundled skill now keeps normal image generation/editing on the built-in `image_gen` path, updates the CLI fallback defaults to `gpt-image-2`, and routes explicit transparent-output requests through `gpt-image-1.5` with clear guidance that `gpt-image-2` does not support transparent backgrounds. ## Details - Update `SKILL.md` routing guidance for built-in vs CLI fallback behavior. - Update CLI/API references for `gpt-image-2` size constraints, quality options, near-4K sizes, and unsupported options. - Update `scripts/image_gen.py` defaults and validation: - default model `gpt-image-2` - default size `auto` - default quality `medium` - reject transparent backgrounds on `gpt-image-2` - reject `input_fidelity` on `gpt-image-2` - validate flexible `gpt-image-2` sizes and suggest `3824x2160` / `2160x3824` for near-4K requests - Update prompt/reference docs with the new model and routing guidance. ## Validation - `cargo test -p codex-skills` - `git diff --check` - Manual CLI dry-runs for: - default `gpt-image-2` payload - `3824x2160` near-4K size acceptance - `3840x2160` rejection with near-4K guidance - transparent background rejection on `gpt-image-2` - transparent background acceptance on `gpt-image-1.5` - `input_fidelity` rejection on `gpt-image-2` Bazel target check was not run locally because `bazel` is not installed in this environment.	2026-04-22 15:08:10 +00:00
jif-oai	65420737e8	chore: prep memories for AB (#18973)	2026-04-22 11:46:15 +01:00
jif-oai	ddf65c9647	fix: cargo deny (#18971)	2026-04-22 11:46:11 +01:00
jif-oai	639382609f	fix: wait_agent timeout for queued mailbox mail (#18968 ) ## Why `wait_agent` can be called while mailbox mail is already pending. The previous implementation subscribed for future mailbox sequence changes and then waited for the next notification. If the mail was queued before that wait started, no new notification arrived, so the tool could sit until `timeout_ms` even though mail was ready to deliver. ## What Changed - Added `Session::has_pending_mailbox_items()` for checking pending mailbox mail through the session API. - Updated `multi_agents_v2::wait` to return immediately when pending mailbox mail already exists before sleeping on a new mailbox sequence update. - Reworked the regression coverage in `multi_agents_tests.rs` so already queued mailbox mail must wake `wait_agent` promptly. Relevant code: - [`wait_agent` pending-mail check](`aa8ca06e83/codex-rs/core/src/tools/handlers/multi_agents_v2/wait.rs (L55-L60)`) - [`Session::has_pending_mailbox_items`](`aa8ca06e83/codex-rs/core/src/session/mod.rs (L2979-L2981)`) - [`multi_agent_v2_wait_agent_returns_for_already_queued_mail`](`aa8ca06e83/codex-rs/core/src/tools/handlers/multi_agents_tests.rs (L2854)`) ## Verification - `cargo test -p codex-core multi_agent_v2_wait_agent_returns_for_already_queued_mail`	2026-04-22 11:16:17 +01:00
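The bug described here is a lost wakeup: the waiter subscribes to future changes without first checking the current state. The sketch below shows the general fix with the standard library only. `Mailbox`, `push`, and `wait_for_mail` are hypothetical names, and the actual Codex code is asynchronous, so this is an illustration of the pattern rather than the real implementation.

```rust
use std::sync::{Condvar, Mutex};
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a session mailbox; not the Codex type.
struct Mailbox {
    pending: Mutex<usize>,
    changed: Condvar,
}

impl Mailbox {
    fn new() -> Self {
        Mailbox { pending: Mutex::new(0), changed: Condvar::new() }
    }

    /// Queue one item and notify any current waiter.
    fn push(&self) {
        *self.pending.lock().unwrap() += 1;
        self.changed.notify_all();
    }

    /// Returns true if mail is available before `timeout` elapses.
    /// The predicate is evaluated before sleeping, so mail that was queued
    /// before the call returns immediately instead of waiting for a new
    /// notification.
    fn wait_for_mail(&self, timeout: Duration) -> bool {
        let guard = self.pending.lock().unwrap();
        let (guard, _timed_out) = self
            .changed
            .wait_timeout_while(guard, timeout, |pending| *pending == 0)
            .unwrap();
        *guard > 0
    }
}

fn main() {
    let mailbox = Mailbox::new();
    mailbox.push(); // mail arrives before anyone waits
    let start = Instant::now();
    let got_mail = mailbox.wait_for_mail(Duration::from_secs(5));
    println!("got_mail={got_mail} after {:?}", start.elapsed());
}
```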
acrognale-oai	4f8c58f737	Support multiple cwd filters for thread list (#18502 ) ## Summary - Teach app-server `thread/list` to accept either a single `cwd` or an array of cwd filters, returning threads whose recorded session cwd matches any requested path - Add `useStateDbOnly` as an explicit opt-in fast path for callers that want to answer `thread/list` from SQLite without scanning JSONL rollout files - Preserve backwards compatibility: by default, `thread/list` still scans JSONL rollouts and repairs SQLite state - Wire the new cwd array and SQLite-only options through app-server, local/remote thread-store, rollout listing, generated TypeScript/schema fixtures, proto output, and docs ## Test Plan - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-rollout` - `cargo test -p codex-thread-store` - `cargo test -p codex-app-server thread_list` - `just fmt` - `just fix -p codex-app-server-protocol -p codex-rollout -p codex-thread-store -p codex-app-server` - `cargo build -p codex-cli --bin codex`	2026-04-22 06:10:09 -04:00
jif-oai	b04ffeee4c	nit: expose lib (#18962) As a follow-up	2026-04-22 10:06:53 +01:00
rhan-oai	213b17b7a3	[codex-analytics] guardian review TTFT plumbing and emission (#17696 ) ## Why Guardian analytics includes time-to-first-token, but the Guardian reviewer runs as a normal Codex session and `TurnCompleteEvent` did not expose TTFT. The timing needs to flow through the standard turn-completion protocol so Guardian review analytics can consume the same value as the rest of the session machinery. ## What changed Adds optional `time_to_first_token_ms` to `TurnCompleteEvent` and populates it from `TurnTiming`. The value is carried through app-server thread history, rollout reconstruction, TUI/app-server adapters, and Guardian review session handling. Guardian review analytics now captures TTFT from the reviewer turn-complete event when available. Existing tests and fixtures are updated to set the new optional field to `None` where TTFT is not relevant. ## Verification - `cargo clippy -p codex-tui --tests -- -D warnings` - `cargo clippy -p codex-core --lib --tests -- -D warnings` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17696). * __->__ #17696 * #17695 * #17693 * #18278 * #18953	2026-04-22 01:52:48 -07:00
rhan-oai	37aadeaa13	[codex-analytics] guardian review truncation (#17695 ) ## Why The Guardian review event needs to report whether the action shown to Guardian was truncated. That field should come from the same truncation path used to build the Guardian prompt, rather than being inferred after the fact. ## What changed Plumbs truncation metadata through Guardian action formatting, prompt construction, review session execution, and analytics emission. `guardian_truncate_text` now reports both the rendered text and whether it inserted the truncation marker, and `reviewed_action_truncated` is set from that prompt-building result. This keeps the analytics field aligned with the model-visible reviewed action while preserving the existing Guardian prompt behavior. ## Verification - Guardian truncation tests cover both truncated and non-truncated action payloads. - Guardian review tests assert the review session metadata and truncation field are propagated. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17695). * #17696 * __->__ #17695 * #17693 * #18278 * #18953	2026-04-22 08:35:29 +00:00
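The idea of reporting truncation from the same function that performs it can be sketched as follows. `truncate_with_marker`, the marker text, and the character-based limit are all assumptions made for illustration; the commit does not show the signature or limits of `guardian_truncate_text`.

```rust
/// Hypothetical helper: returns the rendered text together with a flag that
/// is true only when the truncation marker was actually inserted.
fn truncate_with_marker(text: &str, max_chars: usize) -> (String, bool) {
    // Assumed marker text; the real marker is not given in the commit.
    const MARKER: &str = "...[truncated]";
    if text.chars().count() <= max_chars {
        return (text.to_string(), false);
    }
    // Cut on character boundaries so multi-byte text is never split mid-char.
    let mut rendered: String = text.chars().take(max_chars).collect();
    rendered.push_str(MARKER);
    (rendered, true)
}

fn main() {
    let (rendered, truncated) = truncate_with_marker("rm -rf ./build ./dist", 6);
    println!("rendered={rendered:?} truncated={truncated}");
}
```

Because the flag comes from the code path that builds the text, a caller can copy it straight into an analytics field without re-deriving it from the rendered string.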
rhan-oai	4e7399c6b9	[codex-analytics] guardian review analytics events emission (#17693 ) ## Why Guardian approvals now run as review sessions, but Codex analytics did not have a terminal event for those reviews. That made it hard to measure approval outcomes, failure modes, Guardian session reuse, model metadata, token usage, and timing separately from the parent turn. ## What changed Adds `codex_guardian_review` analytics emission for Guardian approval reviews. The event is emitted from the Guardian review path with review identity, target item id, approval request source, a PII-minimized reviewed-action shape, terminal decision/status, failure reason, Guardian assessment fields, Guardian session metadata, token usage, and timing metadata. The reviewed-action payload intentionally omits high-risk fields such as shell commands, working directories, argv, file paths, network targets/hosts, rationale, retry reason, and permission justifications. It also classifies prompt-build failures separately from Guardian session/runtime failures so fail-closed cases are distinguishable in analytics. ## Verification - Guardian review analytics tests cover terminal success, timeout/cancel/fail-closed paths, session metadata, and token usage plumbing. - `cargo clippy -p codex-core --lib --tests -- -D warnings` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17693). * #17696 * #17695 * __->__ #17693	2026-04-22 01:02:47 -07:00
Michael Bolin	5eab9ff8ca	app-server: expose thread permission profiles (#18278 ) ## Why The `PermissionProfile` migration needs app-server clients to see the same constrained permission model that core is using at runtime. Before this PR, thread lifecycle responses only exposed the legacy `SandboxPolicy` shape, so clients still had to infer active permissions from sandbox fields. That makes downstream resume, fork, and override flows harder to make `PermissionProfile`-first. External sandbox policies are intentionally excluded from this canonical view. External enforcement cannot be round-tripped as a `PermissionProfile`, and exposing a lossy root-write profile would let clients accidentally change sandbox semantics if they echo the profile back later. ## What changed - Adds the app-server v2 `PermissionProfile` wire shape, including filesystem permissions and glob scan depth metadata. - Adds `PermissionProfileNetworkPermissions` so the profile response does not expose active network state through the older additional-permissions naming. - Returns `permissionProfile` from thread start, resume, and fork responses when the active sandbox can be represented as a `PermissionProfile`. - Keeps legacy `sandbox` in those responses for compatibility and documents `permissionProfile` as canonical when present. - Makes lifecycle `permissionProfile` nullable and returns `null` for `ExternalSandbox` to avoid exposing a lossy profile. - Regenerates the app-server JSON schema and TypeScript fixtures. ## Verification - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-app-server thread_response_permission_profile_omits_external_sandbox -- --nocapture` - `cargo check --tests -p codex-analytics -p codex-exec -p codex-tui` - `just fix -p codex-app-server-protocol -p codex-app-server -p codex-analytics -p codex-exec -p codex-tui` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). 
Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18278). * #18279 * __->__ #18278	2026-04-21 23:52:56 -07:00
iceweasel-oai	3a451b6321	use long-lived sessions for codex sandbox windows (#18953) `codex sandbox windows` previously did a one-shot spawn for all commands. This change uses the `unified_exec` session to spawn long-lived processes instead, and implements a simple bridge to forward stdin to the spawned session and stdout/stderr from the spawned session back to the caller. It also fixes a bug with the new shared spawn context code where the "no-network env" was being applied to both elevated and unelevated sandbox spawns. It should only be applied for the unelevated sandbox because the elevated one uses firewall rules instead of an env-based network suppression strategy.	2026-04-22 06:39:29 +00:00
efrazer-oai	69c8913e24	feat: add explicit AgentIdentity auth mode (#18785 ) ## Summary This PR adds `CodexAuth::AgentIdentity` as an explicit auth mode. An AgentIdentity auth record is a standalone `auth.json` mode. When `AuthManager::auth().await` loads that mode, it registers one process-scoped task and stores it in runtime-only state on the auth value. Header creation stays synchronous after that because the task is initialized before callers receive the auth object. This PR also removes the old feature flag path. AgentIdentity is selected by explicit auth mode, not by a hidden flag or lazy mutation of ChatGPT auth records. Reference old stack: https://github.com/openai/codex/pull/17387/changes ## Design Decisions - AgentIdentity is a real auth enum variant because it can be the only credential in `auth.json`. - The process task is ephemeral runtime state. It is not serialized and is not stored in rollout/session data. - Account/user metadata needed by existing Codex backend checks lives on the AgentIdentity record for now. - `is_chatgpt_auth()` remains token-specific. - `uses_codex_backend()` is the broader predicate for ChatGPT-token auth and AgentIdentity auth. ## Stack 1. https://github.com/openai/codex/pull/18757: full revert 2. https://github.com/openai/codex/pull/18871: isolated Agent Identity crate 3. This PR: explicit AgentIdentity auth mode and startup task allocation 4. https://github.com/openai/codex/pull/18811: migrate Codex backend auth callsites through AuthProvider 5. https://github.com/openai/codex/pull/18904: accept AgentIdentity JWTs and load `CODEX_AGENT_IDENTITY` ## Testing Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.	2026-04-21 22:33:24 -07:00
Michael Bolin	0fef35dc3a	core: derive active permission profiles (#18277 ) ## Why `Permissions` should not store a separate `PermissionProfile` that can drift from the constrained `SandboxPolicy` and network settings. The active profile needs to be derived from the same constrained values that already honor `requirements.toml`. ## What changed This adds derivation of the active `PermissionProfile` from the constrained runtime permission settings and exposes that derived value through config snapshots and thread state. The app-server can then report the active profile without introducing a second source of truth. ## Verification - `cargo test -p codex-core --test all permissions_messages -- --nocapture` - `cargo test -p codex-core --test all request_permissions -- --nocapture` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18277). * #18288 * #18287 * #18286 * #18285 * #18284 * #18283 * #18282 * #18281 * #18280 * #18279 * #18278 * __->__ #18277	2026-04-21 22:11:40 -07:00
Celia Chen	51fdc35945	chore: remove unused Bedrock auth lazy loading (#18948 ) ## Summary The Bedrock Mantle SigV4 auth provider currently looks like it can lazily load `AwsAuthContext`, but the provider is only constructed after `resolve_auth_method` has already loaded that context. Because `with_context` always pre-populates the `OnceCell`, the `get_or_try_init` fallback is unused in normal operation and makes the provider lifecycle harder to reason about. This change removes that dead lazy-loading path and makes the actual behavior explicit: - `BedrockAuthMethod::AwsSdkAuth` carries only the resolved `AwsAuthContext`. - `BedrockMantleSigV4AuthProvider` stores the resolved context directly. - request signing uses the stored context without going through `OnceCell`. The existing eager AWS auth resolution behavior is unchanged; this is a simplification of the provider state, not a behavior change. ## Testing - `cargo shear` - `cargo test -p codex-model-provider` - `just bazel-lock-check`	2026-04-22 05:01:22 +00:00
Dylan Hurd	34800d717e	[codex] Clean guardian instructions (#18934) ## Summary - Keep the guardian policy installed as guardian base instructions. - Clear inherited parent `developer_instructions` for guardian review sessions. - Update guardian config tests to assert developer instructions are cleared and policy text is sourced from base instructions. ## Why Guardian review sessions are intended to run under an isolated guardian policy. Because the guardian config is cloned from the parent config, inherited custom or managed developer instructions could otherwise remain active and conflict with guardian review behavior. ## Validation - `just fmt` - `cargo test -p codex-core guardian_review_session_config` Co-authored-by: Codex <noreply@openai.com>	2026-04-21 21:47:58 -07:00
Michael Bolin	faed6d5c07	tests: serialize process-heavy Windows CI suites (#18943) ## Why A [Windows Cargo build](https://github.com/openai/codex/actions/runs/24754807756/job/72425641062) on `main` timed out in several unrelated-looking suites at the same time: - `codex-app-server` account tests failed before account logic, while `mcp.initialize()` was waiting for the first JSON-RPC response. - `codex-core` `apply_patch_cli` tests timed out while running full Codex/apply_patch turns. - `codex-windows-sandbox` legacy session tests timed out while creating restricted-token child processes and private desktops. The app-server log reached the test harness write path in [`McpProcess::initialize_with_params`](`731b54d08f/codex-rs/app-server/tests/common/mcp_process.rs (L244-L263)`), but never printed the matching stdout read from [`read_jsonrpc_message`](`731b54d08f/codex-rs/app-server/tests/common/mcp_process.rs (L1123-L1128)`). The server initialize handler is a small bookkeeping/response path ([`message_processor.rs`](`731b54d08f/codex-rs/app-server/src/message_processor.rs (L601-L728)`)), so the failure looks like Windows runner process/pipe scheduling starvation rather than account-specific behavior. ## What Changed This updates `.config/nextest.toml` to serialize two process-heavy sets: - `codex-core` tests matching `package(codex-core) & kind(test) & test(apply_patch_cli)` - `codex-windows-sandbox` tests matching `package(codex-windows-sandbox) & test(legacy_)` `codex-app-server` integration tests were already serialized inside their own package; this change reduces overlap with the other suites that were saturating the runner at the same time. ## Verification - `cargo nextest list --filterset "package(codex-core) & kind(test) & test(apply_patch_cli)"` - `cargo nextest list --filterset "package(codex-windows-sandbox) & test(legacy_)"` The Windows sandbox filter naturally lists no tests on macOS, but it validates the nextest filter/config syntax locally.	2026-04-21 21:14:45 -07:00
Dylan Hurd	0e39614d87	chore(tui) debug-config guardian_policy_config (#18923) ## Summary List guardian_policy_config_source in `/debug-config` output ## Testing - [x] Ran locally	2026-04-21 21:00:23 -07:00
Eric Traut	c7e5a9d95e	Keep TUI status surfaces in sync (#18935)	2026-04-21 20:39:23 -07:00
Michael Bolin	03ae4db0f4	ci: keep argument comment lint checks materialized (#18926 ) ## Why The fast `rust-ci` workflow decides whether to run the cross-platform `argument-comment-lint` job based on changed paths. PRs that touch Rust-adjacent Bazel wrapper files, such as `defs.bzl` or `workspace_root_test_launcher.*.tpl`, can change how Rust tests and lint targets behave without changing any `.rs` files. When that detector returned false, GitHub skipped the matrix job before expanding it. That produced a single skipped check named `Argument comment lint - ${{ matrix.name }}` instead of the Linux, macOS, and Windows check names that branch protection expects, leaving the PR unable to go green when those matrix checks are required. ## What Changed - Treat root Bazel wrapper files as `argument-comment-lint` relevant changes. - Keep the `argument_comment_lint_prebuilt` matrix job materialized for every PR so the per-platform check names always exist. - Add a single gate step that decides whether the real lint work should run. - Move the checkout-adjacent Bazel setup and OS-specific lint commands into `.github/actions/run-argument-comment-lint/action.yml` so the workflow does not repeat the same path-detection condition on each step. ## Verification - Parsed `.github/workflows/rust-ci.yml` and `.github/actions/run-argument-comment-lint/action.yml` with Python YAML loading. - Simulated the workflow path-matching shell conditions for the root Bazel wrapper files and confirmed they set `argument_comment_lint=true`.	2026-04-22 03:36:46 +00:00
Michael Bolin	36f8bb4ffa	exec-server: carry filesystem sandbox profiles (#18276 ) ## Why The exec-server still needs platform sandbox inputs, but the migration should preserve the `PermissionProfile` that produced them. Keeping only the derived legacy sandbox map would keep `SandboxPolicy` as the effective abstraction and would make full-disk vs. restricted profiles harder to preserve as the permissions stack starts round-tripping profiles. `PermissionProfile` entries can also be cwd-sensitive (`:cwd`, `:project_roots`, relative globs), so the exec-server must carry the request sandbox cwd instead of resolving those entries against the long-lived exec-server process cwd. ## What changed `FileSystemSandboxContext` now carries `permissions: PermissionProfile` plus an optional `cwd`: - removed `sandboxPolicy`, `sandboxPolicyCwd`, `fileSystemSandboxPolicy`, and `additionalPermissions` - added `permissions` and `cwd` - kept the platform knobs `windowsSandboxLevel`, `windowsSandboxPrivateDesktop`, and `useLegacyLandlock` Core turn and apply-patch paths populate the context from the active runtime permissions and request cwd. Exec-server derives platform `SandboxPolicy`/`FileSystemSandboxPolicy` at the filesystem boundary, adds helper runtime reads there, and rejects cwd-dependent profiles that arrive without a cwd. The legacy `FileSystemSandboxContext::new(SandboxPolicy)` constructor now preserves the old workspace-write conversion semantics for compatibility tests/callers. ## Verification - `cargo test -p codex-exec-server` - `cargo test -p codex-exec-server sandbox_cwd -- --nocapture` - `cargo test -p codex-exec-server sandbox_context_new_preserves_legacy_workspace_write_read_only_subpaths -- --nocapture` - `cargo test -p codex-core --lib file_system_sandbox_context_uses_active_attempt -- --nocapture`	2026-04-21 20:22:28 -07:00
efrazer-oai	564860e8bd	refactor: add agent identity crate (#18871 ) ## Summary This PR adds `codex-agent-identity` as an isolated crate for Agent Identity business logic. The crate owns: - AgentAssertion construction. - Agent task registration. - private-key assertion signing. - bounded blocking HTTP for task registration. It does not wire AgentIdentity into `auth.json`, `AuthManager`, rollout state, or request callsites. That integration happens in later PRs. Reference old stack: https://github.com/openai/codex/pull/17387/changes ## Stack 1. https://github.com/openai/codex/pull/18757: full revert 2. This PR: isolated Agent Identity crate 3. https://github.com/openai/codex/pull/18785: explicit AgentIdentity auth mode and startup task allocation 4. https://github.com/openai/codex/pull/18811: migrate Codex backend auth callsites through AuthProvider 5. https://github.com/openai/codex/pull/18904: accept AgentIdentity JWTs and load `CODEX_AGENT_IDENTITY` ## Testing Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.	2026-04-21 19:57:49 -07:00
Michael Bolin	8fea372c77	Fix remote app-server shutdown race (#18936 ) ## Why A Mac Bazel CI run saw `remote_notifications_arrive_over_websocket` fail during shutdown with `remote app-server shutdown channel is closed` (https://app.buildbuddy.io/invocation/9dac05d6-ae20-40f9-b627-fca6e91cf127). The remote websocket worker can legitimately finish while `shutdown()` is waiting for the shutdown acknowledgement: after the test server sends a notification and exits, the worker may deliver the required disconnect event, observe that the caller has dropped the event receiver, and exit before it sends the shutdown one-shot. That state is already terminal cleanup, not a failed shutdown, so callers should not see a `BrokenPipe` from the acknowledgement channel. ## What Changed - Treat a closed remote shutdown acknowledgement as an already-exited worker while still propagating websocket close errors when the worker returns them. - Added a deterministic regression test for the interleaving where the shutdown command is received and the worker exits before replying. ## Verification - `cargo test -p codex-app-server-client` - New test: `remote::tests::shutdown_tolerates_worker_exit_after_command_is_queued`	2026-04-22 02:41:19 +00:00
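The shutdown rule described above (a closed acknowledgement channel means the worker already exited, while an explicit error from the worker is still propagated) can be sketched with a standard-library channel. `wait_for_shutdown_ack` and the `String` error type are hypothetical; the real client is asynchronous and uses its own error types.

```rust
use std::sync::mpsc::{channel, Receiver};

/// Hypothetical helper: wait for the worker's shutdown acknowledgement.
fn wait_for_shutdown_ack(ack: Receiver<Result<(), String>>) -> Result<(), String> {
    match ack.recv() {
        // The worker replied: propagate its result, including close errors.
        Ok(result) => result,
        // The sender was dropped without a reply: the worker has already
        // exited, which is terminal cleanup rather than a failed shutdown.
        Err(_) => Ok(()),
    }
}

fn main() {
    // Worker exits before acknowledging: treated as a clean shutdown.
    let (tx, rx) = channel::<Result<(), String>>();
    drop(tx);
    println!("dropped sender -> {:?}", wait_for_shutdown_ack(rx));

    // Worker reports a close error: the error still reaches the caller.
    let (tx, rx) = channel();
    tx.send(Err("websocket close failed".to_string())).unwrap();
    println!("worker error   -> {:?}", wait_for_shutdown_ack(rx));
}
```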
xl-openai	a978e411f6	feat: Support remote plugin list/read. (#18452) Add a temporary internal remote_plugin feature flag that merges remote marketplaces into plugin/list and routes plugin/read through the remote APIs when needed, while keeping pure local marketplaces working as before. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-21 18:39:07 -07:00
Michael Bolin	536952eeee	bazel: run wrapped Rust unit test shards (#18913 ) ## Why The `codex-tui` Cargo test suite was catching stale snapshot expectations, but the matching Bazel unit-test target was still green. The TUI unit target is wrapped by `workspace_root_test` so tests run from the repository root and Insta can resolve Cargo-like snapshot paths. After native Bazel sharding was enabled for that wrapped target, rules_rust also inserted its own sharding wrapper around the Rust test binary. Those two wrappers did not compose: rules_rust's sharding wrapper expects to run from its own runfiles cwd, while `workspace_root_test` deliberately changes cwd to the repo root before invoking the test. In that configuration, the inner wrapper could fail to enumerate the Rust tests and exit successfully with empty shards, so snapshot regressions were not being exercised by Bazel. ## What Changed - Stop enabling rules_rust's inner `experimental_enable_sharding` for unit-test binaries created by `codex_rust_crate`. - Keep the configured `shard_count` on the outer `workspace_root_test` target. - Add libtest sharding directly to `workspace_root_test_launcher.sh.tpl` and `workspace_root_test_launcher.bat.tpl` after the launcher has resolved the actual test binary and established the intended repository-root cwd. - Partition tests by a stable FNV-1a hash of each libtest test name, matching the stable-shard behavior we wanted without depending on the inner rules_rust wrapper. - Preserve ad-hoc local test filters by running the resolved test binary directly when explicit test args are supplied. - On Windows, run selected libtest names from the shard list in bounded PowerShell batches instead of concatenating every selected test into one `cmd.exe` command line. This PR is stacked on top of #18912, which contains only the snapshot expectation updates exposed once the Bazel target actually runs the TUI unit tests. 
It is also the reason #18916 becomes visible: once this wrapper fix makes Bazel execute the affected `codex-core` test, that test needs its own executable-path setup fixed. ## Verification - `cargo test -p codex-tui` - `bazel test //codex-rs/tui:tui-unit-tests --test_output=errors` - `bazel test //codex-rs/tui:all --test_output=errors` - `bash -n workspace_root_test_launcher.sh.tpl` - Exercised the Windows PowerShell batching fragment locally with a fake test binary and shard-list file.	2026-04-21 18:35:47 -07:00
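Partitioning tests by a stable FNV-1a hash of the test name can be sketched as below. The real implementation lives in shell and batch launcher templates that are not shown here; the 64-bit hash width, the modulo assignment, and the names `fnv1a_64`, `shard_for`, and `select_shard` are assumptions for illustration.

```rust
// Standard 64-bit FNV-1a parameters.
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// 64-bit FNV-1a: XOR each byte into the hash, then multiply by the prime.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash = FNV_OFFSET_BASIS;
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Hypothetical assignment rule: a test belongs to shard `hash % shard_count`.
fn shard_for(test_name: &str, shard_count: u64) -> u64 {
    fnv1a_64(test_name.as_bytes()) % shard_count
}

/// Select the test names that belong to `shard_index`.
fn select_shard<'a>(names: &[&'a str], shard_index: u64, shard_count: u64) -> Vec<&'a str> {
    names
        .iter()
        .copied()
        .filter(|name| shard_for(name, shard_count) == shard_index)
        .collect()
}

fn main() {
    // Made-up test names, used only to show the partitioning.
    let names = ["app::tests::renders_status", "app::tests::handles_resize", "chat::tests::snapshot"];
    for shard in 0..2 {
        println!("shard {shard}: {:?}", select_shard(&names, shard, 2));
    }
}
```

Because the assignment depends only on the test name, adding or removing one test does not move the others between shards, and every test lands in exactly one shard.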
Celia Chen	1cd3ad1f49	feat: add AWS SigV4 auth for OpenAI-compatible model providers (#17820 ) ## Summary Add first-class Amazon Bedrock Mantle provider support so Codex can keep using its existing Responses API transport with OpenAI-compatible AWS-hosted endpoints such as AOA/Mantle. This is needed for the AWS launch path, where provider traffic should authenticate with AWS credentials instead of OpenAI bearer credentials. Requests are authenticated immediately before transport send, so SigV4 signs the final method, URL, headers, and body bytes that `reqwest` will send. ## What Changed - Added a new `codex-aws-auth` crate for loading AWS SDK config, resolving credentials, and signing finalized HTTP requests with AWS SigV4. - Added a built-in `amazon-bedrock` provider that targets Bedrock Mantle Responses endpoints, defaults to `us-east-1`, supports region/profile overrides, disables WebSockets, and does not require OpenAI auth. - Added Amazon Bedrock auth resolution in `codex-model-provider`: prefer `AWS_BEARER_TOKEN_BEDROCK` when set, otherwise use AWS SDK credentials and SigV4 signing. - Added `AuthProvider::apply_auth` and `Request::prepare_body_for_send` so request-signing providers can sign the exact outbound request after JSON serialization/compression. - Determine the region by taking the `aws.region` config first (required for bearer token codepath), and fallback to SDK default region. ## Testing Amazon Bedrock Mantle Responses paths: - Built the local Codex binary with `cargo build`. - Verified the custom proxy-backed `aws` provider using `env_key = "AWS_BEARER_TOKEN_BEDROCK"` streamed raw `responses` output with `response.output_text.delta`, `response.completed`, and `mantle-env-ok`. - Verified a full `codex exec --profile aws` turn returned `mantle-env-ok`. 
- Confirmed the custom provider used the bearer env var, not AWS profile auth: bogus `AWS_PROFILE` still passed, empty env var failed locally, and malformed env var reached Mantle and failed with `401 invalid_api_key`. - Verified built-in `amazon-bedrock` with `AWS_BEARER_TOKEN_BEDROCK` set passed despite bogus AWS profiles, returning `amazon-bedrock-env-ok`. - Verified built-in `amazon-bedrock` SDK/SigV4 auth passed with `AWS_BEARER_TOKEN_BEDROCK` unset and temporary AWS session env credentials, returning `amazon-bedrock-sdk-env-ok`.	2026-04-22 01:11:17 +00:00
Michael Bolin	e18fe7a07f	test(core): move prompt debug coverage to integration suite (#18916 ) ## Why `build_prompt_input` now initializes `ExecServerRuntimePaths`, which requires a configured Codex executable path. The previous inline unit test in `core/src/prompt_debug.rs` built a bare `test_config()` and then failed before it could assert anything useful: ```text Codex executable path is not configured ``` This coverage is also integration-shaped: it drives the public `build_prompt_input` entry point through config, thread, and session setup rather than testing a small internal helper in isolation. Bazel CI did not catch this earlier because the affected test was behind the same wrapped Rust unit-test path fixed by #18913. Before that launcher/sharding fix, the outer `workspace_root_test` changed the working directory for Insta compatibility while the inner `rules_rust` sharding wrapper still expected its runfiles working directory. In practice, Bazel could report success without executing the Rust test cases in that shard. Once #18913 makes the wrapper run the Rust test binary directly and shard with libtest arguments, this stale unit test actually runs and exposes the missing `codex_self_exe` setup. ## What Changed - Moved `build_prompt_input_includes_context_and_user_message` out of `core/src/prompt_debug.rs`. - Added `core/tests/suite/prompt_debug_tests.rs` and registered it from `core/tests/suite/mod.rs`. - Builds the test config with `ConfigBuilder` and provides `codex_self_exe` using the current test executable, matching the runtime-path invariant required by prompt debug setup. - Preserves the existing assertions that the generated prompt input includes both the debug user message and project-specific user instructions. 
## Verification - `cargo test -p codex-core --test all prompt_debug_tests::build_prompt_input_includes_context_and_user_message` - `bazel test //codex-rs/core:core-all-test --test_arg=prompt_debug_tests::build_prompt_input_includes_context_and_user_message --test_output=errors` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18916). * #18913 * __->__ #18916	2026-04-22 01:08:25 +00:00
Felipe Coury	09ebc34f17	fix(core): emit hooks for apply_patch edits (#18391 ) Fixes https://github.com/openai/codex/issues/16732. ## Why `apply_patch` is Codex's primary file edit path, but it was not emitting `PreToolUse` or `PostToolUse` hook events. That meant hook-based policy, auditing, and write coordination could observe shell commands while missing the actual file mutation performed by `apply_patch`. The issue also exposed that the hook runtime serialized command hook payloads with `tool_name: "Bash"` unconditionally. Even if `apply_patch` supplied hook payloads, hooks would either fail to match it directly or receive misleading stdin that identified the edit as a Bash tool call. ## What Changed - Added `PreToolUse` and `PostToolUse` payload support to `ApplyPatchHandler`. - Exposed the raw patch body as `tool_input.command` for both JSON/function and freeform `apply_patch` calls. - Taught tool hook payloads to carry a handler-supplied hook-facing `tool_name`. - Preserved existing shell compatibility by continuing to emit `Bash` for shell-like tools. - Serialized the selected hook `tool_name` into hook stdin instead of hardcoding `Bash`. - Relaxed the generated hook command input schema so `tool_name` can represent tools other than `Bash`. ## Verification Added focused handler coverage for: - JSON/function `apply_patch` calls producing a `PreToolUse` payload. - Freeform `apply_patch` calls producing a `PreToolUse` payload. - Successful `apply_patch` output producing a `PostToolUse` payload. - Shell and `exec_command` handlers continuing to expose `Bash`. Added end-to-end hook coverage for: - A `PreToolUse` hook matching `^apply_patch$` blocking the patch before the target file is created. - A `PostToolUse` hook matching `^apply_patch$` receiving the patch input and tool response, then adding context to the follow-up model request. - Non-participating tools such as the plan tool continuing not to emit `PreToolUse`/`PostToolUse` hook events. 
Also validated manually with a live `codex exec` smoke test using an isolated temp workspace and temp `CODEX_HOME`. The smoke test confirmed that a real `apply_patch` edit emits `PreToolUse`/`PostToolUse` with `tool_name: "apply_patch"`, a shell command still emits `tool_name: "Bash"`, and a denying `PreToolUse` hook prevents the blocked patch file from being created.	2026-04-21 22:00:40 -03:00
starr-openai	1d4cc494c9	Add turn-scoped environment selections (#18416 ) ## Summary - add experimental turn/start.environments params for per-turn environment id + cwd selections - pass selections through core protocol ops and resolve them with EnvironmentManager before TurnContext creation - treat omitted selections as default behavior, empty selections as no environment, and non-empty selections as first environment/cwd as the turn primary ## Testing - ran `just fmt` - ran `just write-app-server-schema` - not run: unit tests for this stacked PR --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-21 17:48:33 -07:00
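The three cases described for `turn/start.environments` (omitted, empty, non-empty) can be sketched as a small resolver. This is only an illustration: `EnvironmentSelection`, `TurnEnvironments`, and `resolve_turn_environments` are hypothetical names, not the types used in the PR, and the real resolution goes through `EnvironmentManager`.

```rust
/// Hypothetical shape of one per-turn selection: an environment id plus a cwd.
#[derive(Debug, Clone, PartialEq)]
struct EnvironmentSelection {
    environment_id: String,
    cwd: String,
}

/// Hypothetical result of resolving the optional selection list.
#[derive(Debug, PartialEq)]
enum TurnEnvironments {
    /// Param omitted: keep the default behavior.
    Default,
    /// Param present but empty: the turn runs with no environment.
    NoEnvironment,
    /// Param non-empty: the first entry is the turn primary.
    Selected {
        primary: EnvironmentSelection,
        rest: Vec<EnvironmentSelection>,
    },
}

fn resolve_turn_environments(
    selections: Option<Vec<EnvironmentSelection>>,
) -> TurnEnvironments {
    match selections {
        None => TurnEnvironments::Default,
        Some(list) => {
            let mut iter = list.into_iter();
            match iter.next() {
                None => TurnEnvironments::NoEnvironment,
                Some(primary) => TurnEnvironments::Selected {
                    primary,
                    rest: iter.collect(),
                },
            }
        }
    }
}
```

Modeling the parameter as `Option<Vec<_>>` keeps "omitted" and "empty" distinct, which is the distinction the summary calls out.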
Michael Bolin	6368f506b7	fix: windows snapshot for external_agent_config_migration::tests::prompt_snapshot did not match windows output (#18915 ) Fix a snapshot test that is failing on Windows, but is currently missed by Bazel due to https://github.com/openai/codex/pull/18913. We see this failing on Cargo builds on Windows, though. This Bazel vs. Cargo inconsistency explains why https://github.com/openai/codex/pull/18768 did not fix the Cargo Windows build.	2026-04-22 00:32:46 +00:00
Michael Bolin	799e50412e	sandboxing: materialize cwd-relative permission globs (#18867 ) ## Why #18275 anchors session-scoped `:cwd` and `:project_roots` grants to the request cwd before recording them for reuse. Relative deny glob entries need the same treatment. Without anchoring, a stored session permission can keep a pattern such as `*/.env` relative, then reinterpret that deny against a later turn cwd. That makes the persisted profile depend on the cwd at reuse time instead of the cwd that was reviewed and approved. ## What changed `intersect_permission_profiles` now materializes retained `FileSystemPath::GlobPattern` entries against the request cwd, matching the existing materialization for cwd-sensitive special paths. Materialized accepted grants are now deduplicated before deny retention runs. This keeps the sticky-grant preapproval shape stable when a repeated request is merged with the stored grant and both `:cwd = write` and the materialized absolute cwd write are present. The preapproval check compares against the same materialized form, so a later request for the same cwd-relative deny glob still matches the stored anchored grant instead of re-prompting or rejecting. Tests cover both the storage path and the preapproval path: a session-scoped `:cwd = write` grant with `*/.env = none` is stored with both the cwd write and deny glob anchored to the original request cwd, cannot be reused from a later cwd, and remains preapproved when re-requested from the original cwd after merging with the stored grant. ## Verification - `cargo test -p codex-sandboxing policy_transforms` - `cargo test -p codex-core --lib relative_deny_glob_grants_remain_preapproved_after_materialization` - `cargo clippy -p codex-sandboxing --tests -- -D clippy::redundant_clone` - `cargo clippy -p codex-core --lib -- -D clippy::redundant_clone` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). 
Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18867). * #18288 * #18287 * #18286 * #18285 * #18284 * #18283 * #18282 * #18281 * #18280 * #18279 * #18278 * #18277 * #18276 * __->__ #18867	2026-04-21 17:28:58 -07:00
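To illustrate why anchoring matters for #18867, here is a minimal sketch of materializing a relative deny glob against the request cwd. `materialize_glob` is a hypothetical helper, not the function in `codex-sandboxing`; the actual change operates on `FileSystemPath::GlobPattern` entries inside `intersect_permission_profiles`. The sketch assumes Unix-style paths.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: anchor a relative glob pattern to the cwd of the
/// request that was reviewed. Absolute patterns are returned unchanged.
fn materialize_glob(pattern: &str, request_cwd: &Path) -> PathBuf {
    let pattern_path = Path::new(pattern);
    if pattern_path.is_absolute() {
        pattern_path.to_path_buf()
    } else {
        // Joining keeps the glob metacharacters intact; only the anchor changes.
        request_cwd.join(pattern_path)
    }
}
```

Under this sketch, `*/.env` approved from `/repo/a` is stored as `/repo/a/*/.env`, so it no longer matches anything when a later turn runs from `/repo/b`.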
canvrno-oai	37701d4654	Update /statusline and /title snapshots (#18909 ) Update `/statusline` and `/title` snapshots	2026-04-21 17:16:50 -07:00
alexsong-oai	6bbd710496	[codex] Tighten external migration prompt tests (#18768 ) ## Summary - tighten the external migration prompt snapshot around stable synthetic fixture text - add focused display_description tests for relative path rewriting and plugin summaries - split the path-format assertions into smaller, easier-to-read unit tests ## Why The previous prompt snapshot was coupled to path text that came from detected migration items, which made it noisier and more brittle than necessary. This change keeps the snapshot focused on stable UI structure and moves dynamic path formatting checks into targeted unit tests. ## Validation - cargo test -p codex-tui external_agent_config_migration::tests:: - cargo test -p codex-tui external_agent_config_migration::tests::display_description_ - just fmt ## Notes Per the repo instructions, I did not rerun tests after the final `just fmt` pass.	2026-04-21 16:20:15 -07:00
canvrno-oai	2202675632	Normalize /statusline & /title items (#18886 ) This change aligns the `/statusline` and `/title` UIs around the same normalized item model so both surfaces use consistent ids, labels, and preview semantics. It keeps the shared preview work from #18435 , tightens the remaining mismatches by standardizing item naming, expands title/status item coverage where appropriate, and makes `/title` preview use the same title-specific formatting path as the real rendered terminal title. - Normalizes persisted item ids and keeps legacy aliases for compatibility - Aligns `status-line` and `terminal-title` items with the shared preview model - Routes `terminal-title` preview through title-specific formatting and truncation - Updates the affected status/title setup snapshots Added to `/statusline`: - status - task-progress Normalized in `/statusline`: - model-name -> model - project-root -> project-name Added to `/title`: - current-dir - context-remaining - context-used - five-hour-limit - weekly-limit - codex-version - used-tokens - total-input-tokens - total-output-tokens - session-id - fast-mode - model-with-reasoning Normalized in `/title`: - project -> project-name - thread -> thread-title - model-name -> model	2026-04-21 16:13:09 -07:00
maja-openai	ef00014a46	Allow guardian bare allow output (#18797 ) ## Summary Allow guardian to skip other fields and output only `{"outcome":"allow"}` when the command is low risk. This change lets guardian reviews use a non-strict text format while keeping the JSON schema itself as plain user-visible schema data, so transport strictness is carried out-of-band instead of through a schema marker key. ## What changed - Add an explicit `output_schema_strict` flag to model prompts and pass it into `codex-api` text formatting. - Set guardian reviewer prompts to non-strict schema validation while preserving strict-by-default behavior for normal callers. - Update the guardian output contract so definitely-low-risk decisions may return only `{"outcome":"allow"}`. - Treat bare allow responses as low-risk approvals in the guardian parser. - Add tests and snapshots covering the non-strict guardian request and optional guardian output fields. ## Verification - `cargo test -p codex-core guardian::tests::guardian` - `cargo test -p codex-core guardian::tests::` - `cargo test -p codex-core client_common::tests::` - `cargo test -p codex-protocol user_input_serialization_includes_final_output_json_schema` - `cargo test -p codex-api` - `git diff --check` Note: `cargo test -p codex-core` was also attempted, but this desktop environment injects ambient config/proxy state that causes unrelated config/session tests expecting pristine defaults to fail. --------- Co-authored-by: Dylan Hurd <dylan.hurd@openai.com> Co-authored-by: Codex <noreply@openai.com>	2026-04-21 15:37:12 -07:00
starr-openai	ddbe2536be	Support multiple managed environments (#18401 ) ## Summary - refactor EnvironmentManager to own keyed environments with default/local lookup helpers - keep remote exec-server client creation lazy until exec/fs use - preserve disabled agent environment access separately from internal local environment access ## Validation - not run (per Codex worktree instruction to avoid tests/builds unless requested) --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-21 15:29:35 -07:00
cassirer-openai	27d9673273	[rollout_trace] Add rollout trace crate (#18876 ) ## Summary Adds the standalone `codex-rollout-trace` crate, which defines the raw trace event format, replay/reduction model, writer, and reducer logic for reconstructing model-visible conversation/runtime state from recorded rollout data. The crate-level design is documented in [`codex-rs/rollout-trace/README.md`](https://github.com/openai/codex/blob/codex/rollout-trace-crate/codex-rs/rollout-trace/README.md). ## Stack This is PR 1/5 in the rollout trace stack. - [#18876](https://github.com/openai/codex/pull/18876): Add rollout trace crate - [#18877](https://github.com/openai/codex/pull/18877): Record core session rollout traces - [#18878](https://github.com/openai/codex/pull/18878): Trace tool and code-mode boundaries - [#18879](https://github.com/openai/codex/pull/18879): Trace sessions and multi-agent edges - [#18880](https://github.com/openai/codex/pull/18880): Add debug trace reduction command ## Review Notes This PR intentionally does not wire tracing into live Codex execution. It establishes the data model and reducer contract first, with crate-local tests covering conversation reconstruction, compaction boundaries, tool/session edges, and code-cell lifecycle reduction. Later PRs emit into this model. The README is the best entry point for reviewing the intended trace format and reduction semantics before diving into the reducer modules.	2026-04-21 21:54:05 +00:00
Shijie Rao	c5e9c6f71f	Preserve Cloudflare HTTP cookies in codex (#17783 ) ## Summary - Adds a process-local, in-memory cookie store for ChatGPT HTTP clients. - Limits cookie storage and replay to a shared ChatGPT host allowlist. - Wires the shared store into the default Codex reqwest client and backend client. - Shares the ChatGPT host allowlist with remote-control URL validation to avoid drift. - Enables reqwest cookie support and updates lockfiles.	2026-04-21 14:40:15 -07:00
efrazer-oai	be75785504	fix: fully revert agent identity runtime wiring (#18757 ) ## Summary This PR fully reverts the previously merged Agent Identity runtime integration from the old stack: https://github.com/openai/codex/pull/17387/changes It removes the Codex-side task lifecycle wiring, rollout/session persistence, feature flag plumbing, lazy `auth.json` mutation, background task auth paths, and request callsite changes introduced by that stack. This leaves the repo in a clean pre-AgentIdentity integration state so the follow-up PRs can reintroduce the pieces in smaller reviewable layers. ## Stack 1. This PR: full revert 2. https://github.com/openai/codex/pull/18871: move Agent Identity business logic into a crate 3. https://github.com/openai/codex/pull/18785: add explicit AgentIdentity auth mode and startup task allocation 4. https://github.com/openai/codex/pull/18811: migrate auth callsites through AuthProvider ## Testing Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.	2026-04-21 14:30:55 -07:00
Ruslan Nigmatullin	69c3d12274	app-server: implement device key v2 methods (#18430 ) ## Why The device-key protocol needs an app-server implementation that keeps local key operations behind the same request-processing boundary as other v2 APIs. app-server owns request dispatch, transport policy, documentation, and JSON-RPC error shaping. `codex-device-key` owns key binding, validation, platform provider selection, and signing mechanics. Keeping the adapter thin makes the boundary easier to review and avoids moving local key-management details into thread orchestration code. ## What changed - Added `DeviceKeyApi` as the app-server adapter around `DeviceKeyStore`. - Converted protocol protection policies, payload variants, algorithms, and protection classes to and from the device-key crate types. - Encoded SPKI public keys and DER signatures as base64 protocol fields. - Routed `device/key/create`, `device/key/public`, and `device/key/sign` through `MessageProcessor`. - Rejected remote transports before provider access while allowing local `stdio` and in-process callers to reach the device-key API. - Added stdio, in-process, and websocket tests for device-key validation and transport policy. - Documented the device-key methods in the app-server v2 method list. ## Test coverage - `device_key_create_rejects_empty_account_user_id` - `in_process_allows_device_key_requests_to_reach_device_key_api` - `device_key_methods_are_rejected_over_websocket` ## Stack This is PR 3 of 4 in the device-key app-server stack. It is stacked on #18429. ## Validation - `cargo test -p codex-app-server device_key` - `just fix -p codex-app-server`	2026-04-21 14:07:08 -07:00
Felipe Coury	e502f0b52d	feat(tui): shortcuts to change reasoning level temporarily (#18866 ) ## Summary Adds main-chat shortcuts for changing reasoning effort one step at a time: - `Alt+,` lowers reasoning (has the `<` arrow on the key) - `Alt+.` raises reasoning (similarly, has the `>` arrow) The shortcut updates the active session only. It does not persist the selected reasoning level as the default for future sessions. In Plan mode, it applies temporarily to Plan mode without opening the global-vs-Plan scope prompt. ## Details The shortcut uses the active model preset to decide which reasoning levels are valid. If the current session has no explicit reasoning effort, it starts from the model default. Each keypress moves to the next supported level in the requested direction. The shortcut only runs from the main chat surface. If a popup or modal is open, input remains owned by that UI. In Plan mode, the shortcut updates the in-memory Plan reasoning override directly. The model/reasoning picker still keeps the existing scope prompt for explicit picker changes. ## Notes Ctrl-plus and Ctrl-minus were considered, but terminals do not deliver those combinations consistently, so this PR uses Alt shortcuts instead. If the current effort is unsupported by the selected model, the shortcut skips to the nearest supported level in the requested direction. If there is no valid step, it shows the existing boundary message. ## Tests - `cargo test -p codex-tui reasoning_shortcuts` - `cargo test -p codex-tui reasoning_effort` - `cargo test -p codex-tui reasoning_shortcut` - `cargo test -p codex-tui footer_snapshots` - `cargo test -p codex-tui` - `just fix -p codex-tui` - `./tools/argument-comment-lint/run.py -p codex-tui -- --tests` --------- Co-authored-by: Eric Traut <etraut@openai.com>	2026-04-21 18:04:03 -03:00
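The stepping rule described for #18866 (move one supported level in the requested direction, skip to the nearest supported level if the current one is unsupported, report a boundary otherwise) can be sketched as follows. The `Effort` variants, `Direction`, and `step_effort` are illustrative assumptions, not the TUI's actual types.

```rust
/// Hypothetical reasoning levels, ordered from lowest to highest.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Effort {
    Minimal,
    Low,
    Medium,
    High,
}

#[derive(Debug, Clone, Copy)]
enum Direction {
    Lower,
    Raise,
}

/// Returns the nearest supported level strictly beyond `current` in the
/// requested direction, or `None` when there is no valid step (the caller
/// would then show the boundary message).
fn step_effort(current: Effort, supported: &[Effort], dir: Direction) -> Option<Effort> {
    match dir {
        Direction::Raise => supported.iter().copied().filter(|e| *e > current).min(),
        Direction::Lower => supported.iter().copied().filter(|e| *e < current).max(),
    }
}
```

Because the search only looks at supported levels strictly above or below `current`, the same function covers both the normal case and the case where `current` is not in the model's preset.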
pakrym-oai	ffa6944587	Load app-server config through ConfigManager (#18870 ) ## Summary - Load app-server startup config through `ConfigManager` instead of direct `ConfigBuilder` calls. - Move `ConfigManager` constructor-owned state (`cli_overrides`, runtime feature map, cloud requirements loader) behind internal manager fields. - Pass `ConfigManager` into `MessageProcessor` directly instead of reconstructing it from raw args. ## Tests - `cargo check -p codex-app-server` - `cargo test -p codex-app-server` - `just fix -p codex-app-server` - `just fmt`	2026-04-21 14:01:02 -07:00
jif-oai	15b8cde2a4	chore: default multi-agent v2 fork to all (#18873 ) Default sub-agents v2 to `all` for the fork mode	2026-04-21 21:54:58 +01:00
iceweasel-oai	6f6997758a	skip busted tests while I fix them (#18885 )	2026-04-21 13:40:34 -07:00
Ruslan Nigmatullin	56375712e3	app-server: fix Bazel clippy in tracing tests (#18872 ) ## Why PR #18431 exposed a Bazel clippy failure in the app-server unit-test target across Linux, macOS, and Windows. The failing lint was `clippy::await_holding_invalid_type`: two tracing tests serialized access to global tracing state by holding a `tokio::sync::MutexGuard` across awaited test work. That serialization is still needed because the tests share process-global tracing setup and exporter state, but it should not require holding an async mutex guard through the whole test body. ## What changed - Replaced the bespoke async `tracing_test_guard` helper with `serial_test` on the two tracing tests that need global tracing serialization. - Removed the `#[expect(clippy::await_holding_invalid_type)]` annotations and the lock guard callsites that Bazel clippy rejected. ## Validation - `cargo test -p codex-app-server jsonrpc_span` - `just fix -p codex-app-server` - `git diff --check` I also attempted the exact failing Bazel clippy target locally with BuildBuddy disabled: `bazel --noexperimental_remote_repo_contents_cache build --config=clippy --bes_backend= --remote_cache= --experimental_remote_downloader= -- //codex-rs/app-server:app-server-unit-tests-bin`. That run did not reach clippy because Bazel timed out downloading `libcap-2.27.tar.gz` from `kernel.org`.	2026-04-21 13:10:36 -07:00
Ruslan Nigmatullin	5bab04dcd7	app-server: add codex-device-key crate (#18429 ) ## Why Device-key storage and signing are local security-sensitive operations with platform-specific behavior. Keeping the core API in `codex-device-key` keeps app-server focused on routing and business logic instead of owning key-management details. The crate keeps the signing surface intentionally narrow: callers can create a bound key, fetch its public key, or sign one of the structured payloads accepted by the crate. It does not expose a generic arbitrary-byte signing API. Key IDs cross into platform-specific labels, tags, and metadata paths, so externally supplied IDs are constrained to the same auditable namespace created by the crate: `dk_` followed by unpadded base64url for 32 bytes. Remote-control target paths are also tied to each signed payload shape so connection proofs cannot be reused for enrollment endpoints, or vice versa. ## What changed - Added the `codex-device-key` workspace crate. - Added account/client-bound key creation with stable `dk_` key IDs. - Added strict `key_id` validation before public-key lookup or signing reaches a provider. - Added public-key lookup and structured signing APIs. - Split remote-control client endpoint allowlists by connection vs enrollment payload shape. - Added validation for key bindings, accepted payload fields, token expiration, and payload/key binding mismatches. - Added flow-oriented docs on the validation helpers that gate provider signing. - Added protection policy and protection-class types without wiring a platform provider yet. - Added an unsupported default provider so platforms without an implementation fail explicitly instead of silently falling back to software-backed keys. - Updated Cargo and Bazel lock metadata for the new crate and non-platform-specific dependencies. ## Stack This is stacked on #18428. 
## Validation - `cargo test -p codex-device-key` - Added unit coverage for strict `key_id` validation before provider use. - Added unit coverage that rejects remote-control paths from the wrong signed payload shape. - `just bazel-lock-update` - `just bazel-lock-check`	2026-04-21 17:57:00 +00:00
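The key ID namespace described for #18429 (`dk_` followed by unpadded base64url for 32 bytes) implies a fixed shape: 32 bytes encode to 43 base64url characters with no `=` padding. A minimal, dependency-free sketch of such a check is below. `is_valid_key_id` and `sextet` are hypothetical names, and the canonical trailing-bits check is an assumption about strictness rather than something the commit message states.

```rust
/// Map one base64url character to its 6-bit value, if it is in the alphabet.
fn sextet(byte: u8) -> Option<u8> {
    match byte {
        b'A'..=b'Z' => Some(byte - b'A'),
        b'a'..=b'z' => Some(byte - b'a' + 26),
        b'0'..=b'9' => Some(byte - b'0' + 52),
        b'-' => Some(62),
        b'_' => Some(63),
        _ => None,
    }
}

/// Hypothetical validator: accepts only `dk_` + 43 base64url characters.
fn is_valid_key_id(key_id: &str) -> bool {
    let Some(encoded) = key_id.strip_prefix("dk_") else {
        return false;
    };
    let bytes = encoded.as_bytes();
    // 32 bytes = 256 bits; ceil(256 / 6) = 43 characters without padding.
    if bytes.len() != 43 {
        return false;
    }
    let Some(values) = bytes.iter().map(|b| sextet(*b)).collect::<Option<Vec<u8>>>() else {
        return false;
    };
    // Assumption: require canonical encoding. The last character carries
    // 4 data bits, so its 2 low bits must be zero.
    values[42] & 0b11 == 0
}
```

Rejecting malformed IDs up front matches the stated goal of validating `key_id` before public-key lookup or signing reaches a provider.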
iceweasel-oai	8612714aa6	Add Windows sandbox unified exec runtime support (#15578 ) ## Summary This is the runtime/foundation half of the Windows sandbox unified-exec work. - add Windows sandbox `unified_exec` session support in `windows-sandbox-rs` for both: - the legacy restricted-token backend - the elevated runner backend - extend the PTY/process runtime so driver-backed sessions can support: - stdin streaming - stdout/stderr separation - exit propagation - PTY resize hooks - add Windows sandbox runtime coverage in `codex-windows-sandbox` / `codex-utils-pty` This PR does not enable Windows sandbox `UnifiedExec` for product callers yet because hooking this up to app-server comes in the next PR. Windows sandbox advertising is intentionally kept aligned with `main`, so sandboxed Windows callers still fall back to `ShellCommand`. This PR isolates the runtime/session layer so it can be reviewed independently from product-surface enablement. --------- Co-authored-by: jif-oai <jif@openai.com> Co-authored-by: Codex <noreply@openai.com>	2026-04-21 10:44:49 -07:00
Steve Coffey	38ba876ea9	Refresh generated Python app-server SDK types (#18862 ) This is the first step in splitting the Python SDK PyPI publish work into reviewable layers: land the generated SDK refresh by itself before changing packaging mechanics. The next PRs will make the runtime wheel publishable, then wire the SDK package/version pinning to that runtime. ## Summary - Refresh generated Python app-server v2 models and notification registry from the current schema. - Update the public API signature expectations for the newly generated kwargs. ## Stack - PR 1 of 3 for the Python SDK PyPI publishing split. - Follow-up PRs will handle runtime wheel publishing mechanics, then SDK/package version pinning. ## Tests - `uv run --extra dev pytest` in `sdk/python` -> 51 passed, 37 skipped.	2026-04-21 10:23:27 -07:00
Michael Bolin	f8562bd47b	sandboxing: intersect permission profiles semantically (#18275 ) ## Why Permission approval responses must not be able to grant more access than the tool requested. Moving this flow to `PermissionProfile` means the comparison must be profile-shaped instead of `SandboxPolicy`-shaped, and cwd-relative special paths such as `:cwd` and `:project_roots` must stay anchored to the turn that produced the request. ## What changed This implements semantic `PermissionProfile` intersection in `codex-sandboxing` for file-system and network permissions. The intersection accepts narrower path grants, rejects broader grants, preserves deny-read carve-outs and glob scan depth, and materializes cwd-dependent special-path grants to absolute paths before they can be recorded for reuse. The request-permissions response paths now use that intersection consistently. App-server captures the request turn cwd before waiting for the client response, includes that cwd in the v2 approval params, and core stores the requested profile plus cwd for direct TUI/client responses and Guardian decisions before recording turn- or session-scoped grants. The TUI app-server bridge now preserves the app-server request cwd when converting permission approval params into core events. ## Verification - `cargo test -p codex-sandboxing intersect_permission_profiles -- --nocapture` - `cargo test -p codex-app-server request_permissions_response -- --nocapture` - `cargo test -p codex-core request_permissions_response_materializes_session_cwd_grants_before_recording -- --nocapture` - `cargo check -p codex-tui --tests` - `cargo check --tests` - `cargo test -p codex-tui app_server_request_permissions_preserves_file_system_permissions`	2026-04-21 10:23:01 -07:00
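The core rule in #18275, that an approval response must not grant more than the tool requested, can be illustrated with a simplified path-only intersection. This is a sketch under strong assumptions: `intersect_write_grants` is a hypothetical function, it models only write roots as plain paths, and it silently drops broader grants, whereas the real `PermissionProfile` intersection also handles network permissions, deny-read carve-outs, glob scan depth, and special paths such as `:cwd`.

```rust
use std::path::PathBuf;

/// Hypothetical simplification: keep a granted path only if it is equal to
/// or nested under some requested path. Broader or unrelated grants are dropped.
fn intersect_write_grants(requested: &[PathBuf], granted: &[PathBuf]) -> Vec<PathBuf> {
    granted
        .iter()
        // `Path::starts_with` compares whole components, so `/repo2`
        // is not treated as being under `/repo`.
        .filter(|grant| requested.iter().any(|req| grant.starts_with(req)))
        .cloned()
        .collect()
}
```

With a request for `/repo`, a response granting `/repo/src` survives (narrower), while a response granting `/` or `/repo2` is discarded (broader or unrelated).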


5677 Commits