codex

mirror of https://github.com/openai/codex.git synced 2026-06-01 19:02:59 +00:00

Author	SHA1	Message	Date
pakrym-oai	ed5944ba1d	Simplify MCP tool handler plumbing (#21595)

## Why

The MCP tool path had accumulated a few core-owned special cases: a dedicated payload variant, resolver plumbing, a legacy `AfterToolUse` translation path, and a side channel for parallel-call metadata. That made `ToolRegistry` and the spec builder know more about MCP than they needed to.

This change moves MCP-specific execution details back onto `ToolInfo` and `McpHandler` so `codex-core` can treat MCP calls like normal function calls while still preserving MCP-specific dispatch and telemetry behavior where it belongs.

## What changed

- removed `resolve_mcp_tool_info`, `ToolPayload::Mcp`, `ToolKind`, and the remaining registry-side MCP resolver path
- stored MCP routing metadata directly on `McpHandler` and `ToolInfo`, including `supports_parallel_tool_calls`
- deleted the legacy `AfterToolUse` consumer in `core`, which removes the need for handler-specific `after_tool_use_payload` implementations
- switched tool-result telemetry to handler-provided tags and kept MCP-specific dispatch payload construction inside the handler
- simplified tool spec planning/building by passing `ToolInfo` directly and dropping the direct/deferred MCP wrapper structs and the parallel-server side table

## Testing

- `cargo check -p codex-core -p codex-mcp -p codex-otel`
- `cargo test -p codex-core mcp_parallel_support_uses_exact_payload_server`
- `cargo test -p codex-core direct_mcp_tools_register_namespaced_handlers`
- `cargo test -p codex-core search_tool_description_lists_each_mcp_source_once`
- `cargo test -p codex-mcp list_all_tools_uses_startup_snapshot_while_client_is_pending`
- `just fix -p codex-core -p codex-mcp -p codex-otel`	2026-05-12 00:11:31 +00:00
Felipe Coury	e16b4e46d4	fix(tui): handle hidden app git directives (#21946)	2026-05-11 21:08:40 -03:00
Ruslan Nigmatullin	95d8669ab2	[exec-server] serve websocket listener via HTTP upgrade (#21963)

## Why

`codex exec-server` should keep the existing public `ws://IP:PORT` URL shape while serving that websocket connection through an HTTP upgrade path internally. That keeps the client-facing configuration simple and allows the listener to work through intermediate HTTP-aware infrastructure.

## What changed

- keep the emitted and configured exec-server URL as `ws://IP:PORT`
- serve that websocket endpoint through Axum HTTP upgrade handling on `/`
- expose `GET /readyz` from the same listener for readiness checks
- route upgraded Axum websocket streams through the shared JSON-RPC connection machinery
- initialize the rustls crypto provider before websocket client connections
- preserve inbound binary websocket JSON-RPC parsing for compatibility with the prior transport behavior

## Verification

- `cargo test -p codex-exec-server --test health --test process --test websocket --test initialize --test exec_process`	2026-05-11 17:04:21 -07:00
Matthew Zeng	e15ecc9c35	Add production startup and TTFT telemetry (#22198)

## Why

While investigating `codex exec hi` startup latency, the useful questions were not "is startup slow?" but "which durable bucket is slow in production?"

The path we observed has a few distinct stages:

1. `thread/start` creates the session
2. startup prewarm builds the turn context, tools, and prompt
3. startup prewarm warms the websocket
4. the first real turn resolves the prewarm
5. the model produces the first token

Before this PR, production telemetry had some of the raw measurements already:

- aggregate startup-prewarm duration / age-at-first-turn metrics
- TTFT as a metric
- websocket request telemetry

But there was no coherent production event stream for the startup breakdown itself, and TTFT was metric-only. That made it hard to answer the same latency questions from OpenTelemetry-backed logs without adding one-off local instrumentation.

## What changed

Add durable production telemetry on the existing `SessionTelemetry` path:

- new `codex.startup_phase` OTel log/trace events plus `codex.startup.phase.duration_ms`
- new `codex.turn_ttft` OTel log/trace events while preserving the existing TTFT metric

The startup phase event is emitted for the coarse buckets we actually observed while running `exec hi`:

- `thread_start_create_thread`
- `startup_prewarm_total`
- `startup_prewarm_create_turn_context`
- `startup_prewarm_build_tools`
- `startup_prewarm_build_prompt`
- `startup_prewarm_websocket_warmup`
- `startup_prewarm_resolve`

These phases are intentionally low-cardinality so they remain safe as production telemetry tags.

## Why this shape

This keeps the instrumentation on the same production path as the rest of the session telemetry instead of adding a local debug-only trace mode.

It also avoids changing startup behavior:

- prewarm still runs
- no control flow changes
- no extra remote calls
- no user-visible behavior changes

One boundary is intentional: very early process bootstrap that happens before a session exists is not included here, because this PR uses session-scoped production telemetry. The expensive buckets we were trying to understand after `thread/start` are now covered durably.

## Verification

- `cargo test -p codex-otel`
- `cargo test -p codex-core turn_timing`
- `cargo test -p codex-core regular_turn_emits_turn_started_without_waiting_for_startup_prewarm`
- `cargo test -p codex-core interrupting_regular_turn_waiting_on_startup_prewarm_emits_turn_aborted`
- `cargo test -p codex-app-server thread_start`
- `just fix -p codex-otel -p codex-core -p codex-app-server`

I also ran `cargo test -p codex-core`; it built successfully and then hit an existing unrelated stack overflow in `tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`.	2026-05-11 23:58:36 +00:00
Michael Bolin	a9dc65e802	merge commit for archive created by Sapling	2026-05-11 16:50:06 -07:00
Michael Bolin	014c5898ce	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 16:49:49 -07:00
starr-openai	22e84c49d0	Support multi-environment apply_patch selection (#21617)

## Summary

- add multi-environment apply_patch routing for both freeform and function-call tool flows
- parse and reconcile the optional environment selector in the main apply_patch parser, then verify against the selected environment in the handler
- carry environment_id through runtime and approval surfaces so remote-targeted patches stay explicit end to end

## Testing

- just fmt
- remote exec-server e2e: `cargo test -p codex-core --test all apply_patch_multi_environment_uses_remote_executor -- --nocapture` on dev via `scripts/test-remote-env.sh`

---------

Co-authored-by: Codex <noreply@openai.com>	2026-05-11 16:33:44 -07:00
Michael Bolin	53d3023da1	merge commit for archive created by Sapling	2026-05-11 16:29:03 -07:00
Michael Bolin	ba3a40bc3b	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 16:28:52 -07:00
Michael Bolin	1c2f0d38d3	merge commit for archive created by Sapling	2026-05-11 16:05:29 -07:00
Michael Bolin	4081253747	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 16:05:17 -07:00
Michael Bolin	f921703092	merge commit for archive created by Sapling	2026-05-11 15:44:54 -07:00
Michael Bolin	435b7ab8c5	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 15:44:41 -07:00
alexsong-oai	bb6134c028	Stop uploading accepted line fingerprints (#22180)

## Summary

- keep accepted-line diff parsing and fingerprint hashing logic locally
- stop uploading path/line hash fingerprints in the accepted-line analytics event payload
- keep aggregate accepted added/deleted line counts in the event

## Testing

- just fmt
- cargo test -p codex-analytics
- just fix -p codex-analytics	2026-05-11 15:41:38 -07:00
Owen Lin	4859d80ffe	Update codex remote-control to start the daemon (#22218)

## Why

Update `codex remote-control` to use the new app server daemon commands instead.

- if the updater loop is not running, bootstrap the daemon with remote control enabled (`codex app-server daemon bootstrap --remote-control`)
- otherwise, enable the persisted remote-control setting and start the daemon normally	2026-05-11 15:38:30 -07:00
Michael Bolin	5fa4a8c994	merge commit for archive created by Sapling	2026-05-11 15:29:31 -07:00
Michael Bolin	2d23f3ad7b	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 15:24:36 -07:00
Michael Bolin	c50d5fdbb7	Merge `6579ec2f9d` into sapling-pr-archive-bolinfest	2026-05-11 15:23:23 -07:00
Michael Bolin	6579ec2f9d	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 15:23:16 -07:00
Abhinav	9ab7f4e6ac	Add Windows hook command overrides (#22159)

# Why

Managed hook configs need a shared cross-platform shape without making the existing `command` field polymorphic. The common case is still one command string, with Windows needing a different entrypoint only when the runtime is actually Windows.

Keeping `command` as the portable/default path and adding an optional Windows override keeps the config easier to read, preserves the existing scalar shape for non-Windows users, and avoids forcing every caller into a `{ unix, windows }` object when only one platform needs special handling.

# What

- Add optional `command_windows` / `commandWindows` alongside the existing hook `command` field.
- Resolve `command_windows` only on Windows during hook discovery; other platforms continue to use `command` unchanged.
- Keep trust hashing aligned to the effective command selected for the current runtime.

# Docs

The Codex hooks/config reference should document `command_windows` as the Windows-only override for command hooks.	2026-05-11 22:22:29 +00:00
Michael Bolin	84307c03ee	merge commit for archive created by Sapling	2026-05-11 15:19:58 -07:00
Michael Bolin	162da66557	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 15:19:34 -07:00
rhan-oai	a175ddacc0	[codex-analytics] emit terminal review events (#18748)

## Why

Review telemetry should describe reviews as first-class events, not only as counters denormalized onto terminal tool-item events. That lets us analyze guardian and user reviews consistently across command execution, file changes, permissions, and network access, while still preserving the terminal item summaries that existing tool analytics need.

To make those review events accurate, analytics also needs the observed completion time for each review and enough command metadata to distinguish `shell` from `unified_exec` reviews.

## What changed

- emit generic `codex_review_event` rows for completed user and guardian reviews, with review subjects, reviewer, trigger, terminal status, resolution, and observed duration
- reduce approval request / response / abort facts into review events for command execution, file change, and permissions flows
- keep denormalized review counts, final approval outcome, and permission-request flags on terminal tool-item events for item-associated reviews
- plumb review completion timing so user-review responses and aborts use app-server-observed completion times, while guardian analytics reuse the same terminal timestamps emitted on guardian assessment events
- carry command approval `source` through the protocol and app-server layers so review analytics can distinguish `shell` from `unified_exec`
- add analytics coverage for user-review emission, guardian-review emission, permission reviews that should not denormalize onto tool items, item-summary isolation across threads, and the serialized review-event shape

## Verification

- `cargo test -p codex-analytics`

---

[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18748).

* __->__ #18748
* #21434
* #18747
* #17090
* #17089
* #20514	2026-05-11 22:13:32 +00:00
Ahmed Ibrahim	aa9e8f0262	[8/8] Add Python SDK Ruff formatting (#22021)

## Why

The Python SDK needs the same tight formatter/lint loop as the rest of the repo: a safe Ruff autofix pass, Ruff formatting, editor save behavior, and CI checks that catch drift. Without that loop, SDK changes can land with formatting or import ordering that differs from what reviewers and CI expect.

## What

- Add Ruff configuration to `sdk/python/pyproject.toml`, excluding generated protocol code and notebooks from the normal lint/format pass.
- Update `just fmt` so it still formats Rust and also runs Python SDK Ruff autofix and formatting.
- Add Python SDK CI steps for `ruff check` and `ruff format --check` before pytest.
- Recommend the Ruff VS Code extension and enable Python format/fix/organize-on-save so Cmd+S uses the same tooling.
- Apply the resulting Ruff formatting to SDK Python files, examples, and the checked-in generated `v2_all.py` output emitted by the pinned generator.
- Add a guard test for the `just fmt` recipe so it keeps working from both Rust and Python SDK working directories.

## Stack

1. #21891 `[1/8]` Pin Python SDK runtime dependency
2. #21893 `[2/8]` Generate Python SDK types from pinned runtime
3. #21895 `[3/8]` Run Python SDK tests in CI
4. #21896 `[4/8]` Define Python SDK public API surface
5. #21905 `[5/8]` Rename Python SDK package to `openai-codex`
6. #21910 `[6/8]` Add high-level Python SDK approval mode
7. #22014 `[7/8]` Add Python SDK app-server integration harness
8. This PR `[8/8]` Add Python SDK Ruff formatting

## Verification

- Added `test_root_fmt_recipe_formats_rust_and_python_sdk` for the shared format recipe.
- Ran `just fmt` after the recipe update.

---------

Co-authored-by: Codex <noreply@openai.com>	2026-05-12 01:10:29 +03:00
Michael Bolin	f7a99e5a29	Merge `ea13af2fa5` into sapling-pr-archive-bolinfest	2026-05-11 15:06:46 -07:00
Ahmed Ibrahim	3e10e09e24	[7/8] Add Python SDK app-server integration harness (#22014)

## Why

The SDK had behavioral tests that replaced SDK client internals. Those tests could catch wrapper mistakes, but they did not prove the pinned app-server runtime, generated notification models, request routing, and sync/async public clients worked together.

This PR adds deterministic integration coverage that starts the pinned `codex app-server` process and mocks only the upstream Responses HTTP boundary.

## What

- Add `AppServerHarness` and `MockResponsesServer` helpers for isolated `CODEX_HOME`, mock-provider config, queued SSE responses, and captured `/v1/responses` requests.
- Add shared helpers for SSE construction, stream assertions, approval-policy inspection, and image fixtures.
- Split integration coverage into focused modules for run behavior, inputs, streaming, turn controls, approvals, and thread lifecycle.
- Cover sync and async `Thread.run`, `TurnHandle.stream`, interleaved streams, approval-mode persistence, lifecycle helpers, final-answer phase handling, image inputs, loaded skill input injection, steering, interruption, listing, history reads, run overrides, and token usage mapping.
- Replace public-wrapper tests that duplicated integration-test behavior with lower-level client tests only where direct client behavior is the thing under test.

## Stack

1. #21891 `[1/8]` Pin Python SDK runtime dependency
2. #21893 `[2/8]` Generate Python SDK types from pinned runtime
3. #21895 `[3/8]` Run Python SDK tests in CI
4. #21896 `[4/8]` Define Python SDK public API surface
5. #21905 `[5/8]` Rename Python SDK package to `openai-codex`
6. #21910 `[6/8]` Add high-level Python SDK approval mode
7. This PR `[7/8]` Add Python SDK app-server integration harness
8. #22021 `[8/8]` Add Python SDK Ruff formatting

## Verification

- Added pinned app-server integration tests under `sdk/python/tests/test_app_server_*.py` and `test_real_app_server_integration.py`.

---------

Co-authored-by: Codex <noreply@openai.com>	2026-05-12 01:06:41 +03:00
Michael Bolin	ea13af2fa5	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 15:06:35 -07:00
Ahmed Ibrahim	2b90c37069	[6/8] Add high-level Python SDK approval mode (#21910)

## Why

The high-level SDK should expose the approval behavior it actually supports instead of leaking generated app-server routing fields. New work should have two clear choices: default auto review, or explicitly deny escalated permission requests. Existing threads and subsequent turns should preserve their current approval behavior unless the caller passes an override.

## What

- Add the public `ApprovalMode` enum with `auto_review` and `deny_all`.
- Default new thread creation to `ApprovalMode.auto_review`.
- Preserve existing approval settings by default for resume, fork, run, and turn helpers.
- Remove raw `approval_policy` / `approvals_reviewer` kwargs from high-level SDK wrappers.
- Update generated wrapper output, docs, examples, notebooks, and tests for the high-level approval mode API.

## Stack

1. #21891 `[1/8]` Pin Python SDK runtime dependency
2. #21893 `[2/8]` Generate Python SDK types from pinned runtime
3. #21895 `[3/8]` Run Python SDK tests in CI
4. #21896 `[4/8]` Define Python SDK public API surface
5. #21905 `[5/8]` Rename Python SDK package to `openai-codex`
6. This PR `[6/8]` Add high-level Python SDK approval mode
7. #22014 `[7/8]` Add Python SDK app-server integration harness
8. #22021 `[8/8]` Add Python SDK Ruff formatting

## Verification

- Added approval-mode mapping/default tests for new threads, existing threads, forks, resumes, and subsequent turns.

---------

Co-authored-by: Codex <noreply@openai.com>	2026-05-12 01:02:43 +03:00
Ahmed Ibrahim	f1b84fac63	[5/8] Rename Python SDK package to openai-codex (#21905)

## Why

The SDK should publish under the reserved public distribution name `openai-codex`, and its import module should match that name in the Python style. Since package names can contain hyphens but import modules cannot, the public import path becomes `openai_codex`.

Keeping the rename separate from the public API surface change makes the naming change easy to review and avoids mixing it with API curation.

## What

- Rename the SDK distribution from `openai-codex-app-server-sdk` to `openai-codex`.
- Rename the import package from `codex_app_server` to `openai_codex`.
- Keep the runtime wheel as the separate `openai-codex-cli-bin` dependency.
- Update docs, examples, notebooks, artifact scripts, lockfile metadata, and tests for the new distribution/module names.

## Stack

1. #21891 `[1/8]` Pin Python SDK runtime dependency
2. #21893 `[2/8]` Generate Python SDK types from pinned runtime
3. #21895 `[3/8]` Run Python SDK tests in CI
4. #21896 `[4/8]` Define Python SDK public API surface
5. This PR `[5/8]` Rename Python SDK package to `openai-codex`
6. #21910 `[6/8]` Add high-level Python SDK approval mode
7. #22014 `[7/8]` Add Python SDK app-server integration harness
8. #22021 `[8/8]` Add Python SDK Ruff formatting

## Verification

- Updated package metadata and public API tests to assert the distribution and import names.

Co-authored-by: Codex <noreply@openai.com>	2026-05-12 00:59:25 +03:00
Michael Bolin	49a7b9f625	Merge `3eace94625` into sapling-pr-archive-bolinfest	2026-05-11 14:58:29 -07:00
Michael Bolin	3eace94625	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 14:58:24 -07:00
Ahmed Ibrahim	b4bc02439f	[4/8] Define Python SDK public API surface (#21896)

## Why

The SDK package root should be the ergonomic public client API, not a dump of every generated app-server schema type. Generated models still need a supported import path, but callers should be able to tell which names are high-level SDK entrypoints and which names are protocol value models.

## What

- Define a curated root `__all__` for clients, handles, input helpers, retry helpers, config, and public errors.
- Add a `types` module as the supported home for generated app-server response, event, enum, and helper models.
- Update docs and examples to import protocol/value models from the type module.
- Add tests that lock root exports, type-module exports, star-import behavior, and example import hygiene.

## Stack

1. #21891 `[1/8]` Pin Python SDK runtime dependency
2. #21893 `[2/8]` Generate Python SDK types from pinned runtime
3. #21895 `[3/8]` Run Python SDK tests in CI
4. This PR `[4/8]` Define Python SDK public API surface
5. #21905 `[5/8]` Rename Python SDK package to `openai-codex`
6. #21910 `[6/8]` Add high-level Python SDK approval mode
7. #22014 `[7/8]` Add Python SDK app-server integration harness
8. #22021 `[8/8]` Add Python SDK Ruff formatting

## Verification

- Added public API signature tests for root exports, `types` exports, and example imports.

---------

Co-authored-by: Codex <noreply@openai.com>	2026-05-12 00:57:44 +03:00
Michael Bolin	c04e59e045	merge commit for archive created by Sapling	2026-05-11 14:54:04 -07:00
Michael Bolin	34416d6a9e	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 14:53:58 -07:00
Ahmed Ibrahim	3e2936dd0e	[3/8] Run Python SDK tests in CI (#21895)

## Why

The Python SDK stack now depends on packaging metadata, pinned runtime wheels, generated artifacts, async behavior, and stream interleaving. Those checks need to run in CI so future changes cannot bypass the SDK test suite.

## What

- Add a dedicated `python-sdk` job to `.github/workflows/sdk.yml`.
- Run the job in `python:3.12-alpine` so dependency resolution exercises the pinned musl runtime wheel.
- Keep the Python SDK test job parallel to the existing SDK job instead of serializing the full workflow.

## Stack

1. #21891 `[1/8]` Pin Python SDK runtime dependency
2. #21893 `[2/8]` Generate Python SDK types from pinned runtime
3. This PR `[3/8]` Run Python SDK tests in CI
4. #21896 `[4/8]` Define Python SDK public API surface
5. #21905 `[5/8]` Rename Python SDK package to `openai-codex`
6. #21910 `[6/8]` Add high-level Python SDK approval mode
7. #22014 `[7/8]` Add Python SDK app-server integration harness
8. #22021 `[8/8]` Add Python SDK Ruff formatting

## Verification

- The added workflow job installs the SDK with `uv sync --extra dev --frozen` and runs the Python SDK pytest suite.

---------

Co-authored-by: Codex <noreply@openai.com>	2026-05-12 00:53:36 +03:00
Ahmed Ibrahim	6a4653efc8	[2/8] Generate Python SDK types from pinned runtime (#21893)

## Why

Once the SDK declares its runtime package, generated Python artifacts should come from that pinned runtime rather than whatever app-server schema happens to be in the current checkout. That keeps the generated API and model surface aligned with the runtime users install.

## What

- Teach `scripts/update_sdk_artifacts.py generate-types` to invoke the pinned runtime package for schema generation.
- Regenerate `v2_all.py`, `notification_registry.py`, and generated public wrapper methods from that schema.
- Add freshness coverage so regenerating from the pinned runtime must leave checked-in artifacts unchanged.

## Stack

1. #21891 `[1/8]` Pin Python SDK runtime dependency
2. This PR `[2/8]` Generate Python SDK types from pinned runtime
3. #21895 `[3/8]` Run Python SDK tests in CI
4. #21896 `[4/8]` Define Python SDK public API surface
5. #21905 `[5/8]` Rename Python SDK package to `openai-codex`
6. #21910 `[6/8]` Add high-level Python SDK approval mode
7. #22014 `[7/8]` Add Python SDK app-server integration harness
8. #22021 `[8/8]` Add Python SDK Ruff formatting

## Verification

- Added `test_generated_files_are_up_to_date` for pinned-runtime generation drift.
- Added generator-structure tests for schema annotation and notification metadata generation.

---------

Co-authored-by: Codex <noreply@openai.com>	2026-05-12 00:53:21 +03:00
Ahmed Ibrahim	5fe33443b0	[1/8] Pin Python SDK runtime dependency (#21891)

## Why

The Python SDK depends on the app-server runtime package for the bundled `codex` binary and schema source of truth. That relationship should be explicit in package metadata instead of inferred from matching version numbers, so installers, lockfiles, and reviewers can see exactly which runtime the SDK expects.

## What

- Declare `openai-codex-cli-bin==0.131.0a4` as a Python SDK dependency.
- Update runtime setup helpers to resolve the runtime version from the declared dependency pin.
- Refresh the SDK lockfile for the pinned runtime wheel.
- Update package/runtime tests and docs that describe where the runtime version comes from.

## Stack

1. This PR `[1/8]` Pin Python SDK runtime dependency
2. #21893 `[2/8]` Generate Python SDK types from pinned runtime
3. #21895 `[3/8]` Run Python SDK tests in CI
4. #21896 `[4/8]` Define Python SDK public API surface
5. #21905 `[5/8]` Rename Python SDK package to `openai-codex`
6. #21910 `[6/8]` Add high-level Python SDK approval mode
7. #22014 `[7/8]` Add Python SDK app-server integration harness
8. #22021 `[8/8]` Add Python SDK Ruff formatting

## Verification

- Added coverage for the SDK runtime dependency pin and runtime distribution naming.

---------

Co-authored-by: Codex <noreply@openai.com>	2026-05-12 00:42:26 +03:00
viyatb-oai	c7b55cdc46	feat: add network proxy feature flag (#20147 ) ## Why The permissions migration is making `permissions.<profile>.network.enabled` the canonical sandbox network bit, while proxy startup is a separate concern. Enabling network access should not implicitly start the proxy, and users who are still on legacy sandbox modes need a separate place to opt into proxy startup and provide proxy-specific settings. This follow-up to #19900 gives the network proxy its own feature surface instead of overloading permission-profile network semantics. ## What changed - Add an experimental `network_proxy` feature with a configurable `[features.network_proxy]` table. - Overlay `features.network_proxy` settings onto the configured proxy state after permission-profile selection, so the proxy only starts when the active `NetworkSandboxPolicy` already allows network access. - Preserve `[experimental_network]` startup behavior independently of the new feature flag. ## Behavior and examples There are now three related knobs: - `permissions.<profile>.network.enabled` controls whether the active permission profile has network access at all. - `features.network_proxy` enables proxy restrictions for an already-network-enabled profile. - Legacy `sandbox_mode` plus `[sandbox_workspace_write].network_access` still control whether legacy `workspace-write` has network access at all. 
The rule is: - network off + proxy flag on -> network stays off, proxy is a no-op - network on + proxy flag off -> unrestricted direct network - network on + proxy flag on -> network stays on, with proxy restrictions applied For permission profiles, the feature toggle adds proxy restrictions only when network access is already enabled: ```toml default_permissions = "workspace" [permissions.workspace.filesystem] ":minimal" = "read" [permissions.workspace.network] enabled = true [features] network_proxy = true ``` If `network.enabled = false`, the same feature flag is a no-op: network remains off and the proxy does not start. For legacy sandbox config, `network_access` remains the master switch: ```toml sandbox_mode = "workspace-write" [sandbox_workspace_write] network_access = true [features] network_proxy = true ``` That keeps legacy `workspace-write` network access on, but routes it through the proxy policy. If `network_access = false`, the proxy feature is a no-op and legacy `workspace-write` remains offline. 
The same proxy opt-in can be supplied from the CLI: ```bash codex -c 'features.network_proxy=true' ``` Additional proxy settings can be supplied when a table is needed: ```bash codex \ -c 'features.network_proxy.enabled=true' \ -c 'features.network_proxy.enable_socks5=false' ``` The intended behavior matrix is: \| Config surface \| Network setting \| `features.network_proxy` \| Direct sandbox network \| Proxy \| \| --- \| --- \| --- \| --- \| --- \| \| Permission profile \| `network.enabled = false` \| off \| restricted \| off \| \| Permission profile \| `network.enabled = false` \| on \| restricted \| off \| \| Permission profile \| `network.enabled = true` \| off \| enabled \| off \| \| Permission profile \| `network.enabled = true` \| on \| enabled \| on \| \| Legacy `workspace-write` \| `network_access = false` \| off \| restricted \| off \| \| Legacy `workspace-write` \| `network_access = false` \| on \| restricted \| off \| \| Legacy `workspace-write` \| `network_access = true` \| off \| enabled \| off \| \| Legacy `workspace-write` \| `network_access = true` \| on \| enabled \| on \| `[experimental_network]` requirements remain separate from the user feature toggle and still start the proxy on their own. Relevant code: - [`features/src/feature_configs.rs`](https://github.com/openai/codex/blob/43785aff47/codex-rs/features/src/feature_configs.rs#L58-L117) defines the feature-specific proxy config. - [`core/src/config/mod.rs`](https://github.com/openai/codex/blob/43785aff47/codex-rs/core/src/config/mod.rs#L1959-L1964) reads the feature table, and [later applies it only when network access is already enabled](https://github.com/openai/codex/blob/43785aff47/codex-rs/core/src/config/mod.rs#L2448-L2458). 
## Verification

Added focused coverage for:

- keeping the proxy off when `features.network_proxy` is enabled but sandbox network access is disabled
- the full permission-profile and legacy `workspace-write` matrix above
- preserving `[experimental_network]` startup without the feature
- reusing profile-supplied proxy settings when the feature is enabled

Ran:

- `cargo test -p codex-features`
- `cargo test -p codex-core network_proxy_feature`
- `cargo test -p codex-core experimental_network_requirements_enable_proxy_without_feature`	2026-05-11 14:12:00 -07:00
cooper-oai	54ec99cb54	[login] revoke superseded auth tokens on relogin (#21747) ## Summary - revoke previously stored managed ChatGPT tokens after a successful re-login - keep the new login successful even when revocation is unavailable or fails - cover the shared persistence path used by browser and device-code login flows ## Why A new `codex login` currently overwrites existing managed ChatGPT credentials without attempting to revoke the superseded tokens, leaving old credentials valid longer than necessary. ## Validation - `just fmt` - `CARGO_HOME=/tmp/cargo-home cargo test -p codex-login` ## Notes - Initial local Cargo validation hit a corrupt existing crate cache in the default `CARGO_HOME`; rerunning with a clean temporary `CARGO_HOME` passed. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-11 13:36:46 -07:00
Michael Bolin	a5d2fe5104	merge commit for archive created by Sapling	2026-05-11 13:31:53 -07:00
Michael Bolin	50719c6d17	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 13:31:36 -07:00
Michael Bolin	2799c3c082	Merge `3de8082c79` into sapling-pr-archive-bolinfest	2026-05-11 12:56:42 -07:00
Michael Bolin	3de8082c79	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 12:56:36 -07:00
Michael Bolin	de130c5844	merge commit for archive created by Sapling	2026-05-11 12:46:04 -07:00
Michael Bolin	f70b13c7ff	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 12:45:56 -07:00
Ruslan Nigmatullin	e3f481da98	daemon: refresh updater after validated binary rollout (#21853 ) ## Why `bootstrap` starts a detached pid-backed updater loop, but before this change that updater could keep running an old executable image even after `install.sh` replaced the managed standalone binary under `CODEX_HOME`. That left the updater itself behind the binary it had just rolled out, especially when the app-server was stopped or when the managed binary changed without a version-string change. ## What changed - Track updater identity from the executable contents rather than only the reported CLI version. - Force the managed app-server restart path when the managed binary contents differ from the running updater image, then re-exec the updater from the managed binary once the rollout is in a safe state. - Distinguish a genuinely absent managed app-server from a managed process that exists but is not yet probeable, so self-refresh does not skip a required restart. - Keep the restart/re-exec decision under the daemon operation lock so `bootstrap` cannot race the handoff. - Update `app-server-daemon/README.md` to document the resulting standalone and out-of-band update behavior. ## Verification - `cargo test -p codex-app-server-daemon` - `just fix -p codex-app-server-daemon` Added focused unit coverage for: - content-based updater refresh decisions - safe updater re-exec outcomes across restart states	2026-05-11 12:37:10 -07:00
Michael Bolin	e54b1ab023	merge commit for archive created by Sapling	2026-05-11 12:35:52 -07:00
Michael Bolin	4df845f153	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 12:35:34 -07:00
Felipe Coury	99b98aece6	config: accept `minus` in TUI keymap config (#22192 ) ## Summary Fixes #22128. The `/keymap` flow already persists the `-` key as `minus`, and the runtime keymap parser already accepts that spelling. `codex-config` was the missing leg: it rejected `minus` during config deserialization, so a binding saved by Codex could fail on the next startup or config reload. ## What Changed - Accept `minus` as a valid canonical key name in `tui.keymap` config normalization. - Update the config validation message so its supported-key list includes `minus`. - Add regression coverage that deserializes both `minus` and `alt-minus` under `[tui.keymap.global]` and verifies the normalized config shape. ## How to Test 1. Start Codex TUI. 2. Run `/keymap`. 3. Assign the `-` key to an action and save the change. 4. Restart Codex or reload the config. 5. Confirm the config loads normally and the saved binding remains usable instead of failing on `minus`. 6. As a focused regression check, repeat with a modifier form such as `alt--` captured through `/keymap`, which persists as `alt-minus` and should also reload successfully. Targeted tests: - `cargo test -p codex-config`	2026-05-11 16:34:33 -03:00
Michael Bolin	c9d222f803	Merge `64435bee22` into sapling-pr-archive-bolinfest	2026-05-11 12:24:43 -07:00


15342 Commits