codex

mirror of https://github.com/openai/codex.git synced 2026-05-15 16:53:05 +00:00

Author	SHA1	Message	Date
Michael Bolin	84307c03ee	merge commit for archive created by Sapling	2026-05-11 15:19:58 -07:00
Michael Bolin	162da66557	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 15:19:34 -07:00
Michael Bolin	f7a99e5a29	Merge `ea13af2fa5` into sapling-pr-archive-bolinfest	2026-05-11 15:06:46 -07:00
Ahmed Ibrahim	3e10e09e24	[7/8] Add Python SDK app-server integration harness (#22014 ) ## Why The SDK had behavioral tests that replaced SDK client internals. Those tests could catch wrapper mistakes, but they did not prove the pinned app-server runtime, generated notification models, request routing, and sync/async public clients worked together. This PR adds deterministic integration coverage that starts the pinned `codex app-server` process and mocks only the upstream Responses HTTP boundary. ## What - Add `AppServerHarness` and `MockResponsesServer` helpers for isolated `CODEX_HOME`, mock-provider config, queued SSE responses, and captured `/v1/responses` requests. - Add shared helpers for SSE construction, stream assertions, approval-policy inspection, and image fixtures. - Split integration coverage into focused modules for run behavior, inputs, streaming, turn controls, approvals, and thread lifecycle. - Cover sync and async `Thread.run`, `TurnHandle.stream`, interleaved streams, approval-mode persistence, lifecycle helpers, final-answer phase handling, image inputs, loaded skill input injection, steering, interruption, listing, history reads, run overrides, and token usage mapping. - Replace public-wrapper tests that duplicated integration-test behavior with lower-level client tests only where direct client behavior is the thing under test. ## Stack 1. #21891 `[1/8]` Pin Python SDK runtime dependency 2. #21893 `[2/8]` Generate Python SDK types from pinned runtime 3. #21895 `[3/8]` Run Python SDK tests in CI 4. #21896 `[4/8]` Define Python SDK public API surface 5. #21905 `[5/8]` Rename Python SDK package to `openai-codex` 6. #21910 `[6/8]` Add high-level Python SDK approval mode 7. This PR `[7/8]` Add Python SDK app-server integration harness 8. #22021 `[8/8]` Add Python SDK Ruff formatting ## Verification - Added pinned app-server integration tests under `sdk/python/tests/test_app_server_*.py` and `test_real_app_server_integration.py`. 
--------- Co-authored-by: Codex <noreply@openai.com>	2026-05-12 01:06:41 +03:00
Michael Bolin	ea13af2fa5	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 15:06:35 -07:00
Ahmed Ibrahim	2b90c37069	[6/8] Add high-level Python SDK approval mode (#21910 ) ## Why The high-level SDK should expose the approval behavior it actually supports instead of leaking generated app-server routing fields. New work should have two clear choices: default auto review, or explicitly deny escalated permission requests. Existing threads and subsequent turns should preserve their current approval behavior unless the caller passes an override. ## What - Add the public `ApprovalMode` enum with `auto_review` and `deny_all`. - Default new thread creation to `ApprovalMode.auto_review`. - Preserve existing approval settings by default for resume, fork, run, and turn helpers. - Remove raw `approval_policy` / `approvals_reviewer` kwargs from high-level SDK wrappers. - Update generated wrapper output, docs, examples, notebooks, and tests for the high-level approval mode API. ## Stack 1. #21891 `[1/8]` Pin Python SDK runtime dependency 2. #21893 `[2/8]` Generate Python SDK types from pinned runtime 3. #21895 `[3/8]` Run Python SDK tests in CI 4. #21896 `[4/8]` Define Python SDK public API surface 5. #21905 `[5/8]` Rename Python SDK package to `openai-codex` 6. This PR `[6/8]` Add high-level Python SDK approval mode 7. #22014 `[7/8]` Add Python SDK app-server integration harness 8. #22021 `[8/8]` Add Python SDK Ruff formatting ## Verification - Added approval-mode mapping/default tests for new threads, existing threads, forks, resumes, and subsequent turns. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-12 01:02:43 +03:00
Ahmed Ibrahim	f1b84fac63	[5/8] Rename Python SDK package to openai-codex (#21905 ) ## Why The SDK should publish under the reserved public distribution name `openai-codex`, and its import module should match that name in the Python style. Since package names can contain hyphens but import modules cannot, the public import path becomes `openai_codex`. Keeping the rename separate from the public API surface change makes the naming change easy to review and avoids mixing it with API curation. ## What - Rename the SDK distribution from `openai-codex-app-server-sdk` to `openai-codex`. - Rename the import package from `codex_app_server` to `openai_codex`. - Keep the runtime wheel as the separate `openai-codex-cli-bin` dependency. - Update docs, examples, notebooks, artifact scripts, lockfile metadata, and tests for the new distribution/module names. ## Stack 1. #21891 `[1/8]` Pin Python SDK runtime dependency 2. #21893 `[2/8]` Generate Python SDK types from pinned runtime 3. #21895 `[3/8]` Run Python SDK tests in CI 4. #21896 `[4/8]` Define Python SDK public API surface 5. This PR `[5/8]` Rename Python SDK package to `openai-codex` 6. #21910 `[6/8]` Add high-level Python SDK approval mode 7. #22014 `[7/8]` Add Python SDK app-server integration harness 8. #22021 `[8/8]` Add Python SDK Ruff formatting ## Verification - Updated package metadata and public API tests to assert the distribution and import names. Co-authored-by: Codex <noreply@openai.com>	2026-05-12 00:59:25 +03:00
Michael Bolin	49a7b9f625	Merge `3eace94625` into sapling-pr-archive-bolinfest	2026-05-11 14:58:29 -07:00
Michael Bolin	3eace94625	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 14:58:24 -07:00
Ahmed Ibrahim	b4bc02439f	[4/8] Define Python SDK public API surface (#21896 ) ## Why The SDK package root should be the ergonomic public client API, not a dump of every generated app-server schema type. Generated models still need a supported import path, but callers should be able to tell which names are high-level SDK entrypoints and which names are protocol value models. ## What - Define a curated root `__all__` for clients, handles, input helpers, retry helpers, config, and public errors. - Add a `types` module as the supported home for generated app-server response, event, enum, and helper models. - Update docs and examples to import protocol/value models from the type module. - Add tests that lock root exports, type-module exports, star-import behavior, and example import hygiene. ## Stack 1. #21891 `[1/8]` Pin Python SDK runtime dependency 2. #21893 `[2/8]` Generate Python SDK types from pinned runtime 3. #21895 `[3/8]` Run Python SDK tests in CI 4. This PR `[4/8]` Define Python SDK public API surface 5. #21905 `[5/8]` Rename Python SDK package to `openai-codex` 6. #21910 `[6/8]` Add high-level Python SDK approval mode 7. #22014 `[7/8]` Add Python SDK app-server integration harness 8. #22021 `[8/8]` Add Python SDK Ruff formatting ## Verification - Added public API signature tests for root exports, `types` exports, and example imports. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-12 00:57:44 +03:00
Michael Bolin	c04e59e045	merge commit for archive created by Sapling	2026-05-11 14:54:04 -07:00
Michael Bolin	34416d6a9e	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 14:53:58 -07:00
Ahmed Ibrahim	3e2936dd0e	[3/8] Run Python SDK tests in CI (#21895 ) ## Why The Python SDK stack now depends on packaging metadata, pinned runtime wheels, generated artifacts, async behavior, and stream interleaving. Those checks need to run in CI so future changes cannot bypass the SDK test suite. ## What - Add a dedicated `python-sdk` job to `.github/workflows/sdk.yml`. - Run the job in `python:3.12-alpine` so dependency resolution exercises the pinned musl runtime wheel. - Keep the Python SDK test job parallel to the existing SDK job instead of serializing the full workflow. ## Stack 1. #21891 `[1/8]` Pin Python SDK runtime dependency 2. #21893 `[2/8]` Generate Python SDK types from pinned runtime 3. This PR `[3/8]` Run Python SDK tests in CI 4. #21896 `[4/8]` Define Python SDK public API surface 5. #21905 `[5/8]` Rename Python SDK package to `openai-codex` 6. #21910 `[6/8]` Add high-level Python SDK approval mode 7. #22014 `[7/8]` Add Python SDK app-server integration harness 8. #22021 `[8/8]` Add Python SDK Ruff formatting ## Verification - The added workflow job installs the SDK with `uv sync --extra dev --frozen` and runs the Python SDK pytest suite. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-12 00:53:36 +03:00
Ahmed Ibrahim	6a4653efc8	[2/8] Generate Python SDK types from pinned runtime (#21893 ) ## Why Once the SDK declares its runtime package, generated Python artifacts should come from that pinned runtime rather than whatever app-server schema happens to be in the current checkout. That keeps the generated API and model surface aligned with the runtime users install. ## What - Teach `scripts/update_sdk_artifacts.py generate-types` to invoke the pinned runtime package for schema generation. - Regenerate `v2_all.py`, `notification_registry.py`, and generated public wrapper methods from that schema. - Add freshness coverage so regenerating from the pinned runtime must leave checked-in artifacts unchanged. ## Stack 1. #21891 `[1/8]` Pin Python SDK runtime dependency 2. This PR `[2/8]` Generate Python SDK types from pinned runtime 3. #21895 `[3/8]` Run Python SDK tests in CI 4. #21896 `[4/8]` Define Python SDK public API surface 5. #21905 `[5/8]` Rename Python SDK package to `openai-codex` 6. #21910 `[6/8]` Add high-level Python SDK approval mode 7. #22014 `[7/8]` Add Python SDK app-server integration harness 8. #22021 `[8/8]` Add Python SDK Ruff formatting ## Verification - Added `test_generated_files_are_up_to_date` for pinned-runtime generation drift. - Added generator-structure tests for schema annotation and notification metadata generation. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-12 00:53:21 +03:00
Ahmed Ibrahim	5fe33443b0	[1/8] Pin Python SDK runtime dependency (#21891 ) ## Why The Python SDK depends on the app-server runtime package for the bundled `codex` binary and schema source of truth. That relationship should be explicit in package metadata instead of inferred from matching version numbers, so installers, lockfiles, and reviewers can see exactly which runtime the SDK expects. ## What - Declare `openai-codex-cli-bin==0.131.0a4` as a Python SDK dependency. - Update runtime setup helpers to resolve the runtime version from the declared dependency pin. - Refresh the SDK lockfile for the pinned runtime wheel. - Update package/runtime tests and docs that describe where the runtime version comes from. ## Stack 1. This PR `[1/8]` Pin Python SDK runtime dependency 2. #21893 `[2/8]` Generate Python SDK types from pinned runtime 3. #21895 `[3/8]` Run Python SDK tests in CI 4. #21896 `[4/8]` Define Python SDK public API surface 5. #21905 `[5/8]` Rename Python SDK package to `openai-codex` 6. #21910 `[6/8]` Add high-level Python SDK approval mode 7. #22014 `[7/8]` Add Python SDK app-server integration harness 8. #22021 `[8/8]` Add Python SDK Ruff formatting ## Verification - Added coverage for the SDK runtime dependency pin and runtime distribution naming. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-12 00:42:26 +03:00
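A minimal sketch of resolving a runtime version from a declared dependency pin, as the commit above describes. The helper name `pinned_runtime_version` and the plain list-of-strings input are assumptions for illustration; only the pin `openai-codex-cli-bin==0.131.0a4` comes from the commit message, and the SDK's real helper may differ.

```python
def pinned_runtime_version(dependencies, runtime_name="openai-codex-cli-bin"):
    """Return the version from an exact `name==version` pin (hypothetical helper)."""
    for dep in dependencies:
        name, sep, version = dep.partition("==")
        # Only an exact pin counts; ranges such as ">=" are ignored here.
        if sep and name.strip() == runtime_name:
            return version.strip()
    raise LookupError(f"no exact pin found for {runtime_name}")


deps = ["pydantic>=2", "openai-codex-cli-bin==0.131.0a4"]
print(pinned_runtime_version(deps))
```

Reading the version from the pin rather than from a matching version number keeps a single source of truth in the package metadata.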
viyatb-oai	c7b55cdc46	feat: add network proxy feature flag (#20147)

## Why

The permissions migration is making `permissions.<profile>.network.enabled` the canonical sandbox network bit, while proxy startup is a separate concern. Enabling network access should not implicitly start the proxy, and users who are still on legacy sandbox modes need a separate place to opt into proxy startup and provide proxy-specific settings.

This follow-up to #19900 gives the network proxy its own feature surface instead of overloading permission-profile network semantics.

## What changed

- Add an experimental `network_proxy` feature with a configurable `[features.network_proxy]` table.
- Overlay `features.network_proxy` settings onto the configured proxy state after permission-profile selection, so the proxy only starts when the active `NetworkSandboxPolicy` already allows network access.
- Preserve `[experimental_network]` startup behavior independently of the new feature flag.

## Behavior and examples

There are now three related knobs:

- `permissions.<profile>.network.enabled` controls whether the active permission profile has network access at all.
- `features.network_proxy` enables proxy restrictions for an already-network-enabled profile.
- Legacy `sandbox_mode` plus `[sandbox_workspace_write].network_access` still control whether legacy `workspace-write` has network access at all.

The rule is:

- network off + proxy flag on -> network stays off, proxy is a no-op
- network on + proxy flag off -> unrestricted direct network
- network on + proxy flag on -> network stays on, with proxy restrictions applied

For permission profiles, the feature toggle adds proxy restrictions only when network access is already enabled:

```toml
default_permissions = "workspace"

[permissions.workspace.filesystem]
":minimal" = "read"

[permissions.workspace.network]
enabled = true

[features]
network_proxy = true
```

If `network.enabled = false`, the same feature flag is a no-op: network remains off and the proxy does not start.

For legacy sandbox config, `network_access` remains the master switch:

```toml
sandbox_mode = "workspace-write"

[sandbox_workspace_write]
network_access = true

[features]
network_proxy = true
```

That keeps legacy `workspace-write` network access on, but routes it through the proxy policy. If `network_access = false`, the proxy feature is a no-op and legacy `workspace-write` remains offline.

The same proxy opt-in can be supplied from the CLI:

```bash
codex -c 'features.network_proxy=true'
```

Additional proxy settings can be supplied when a table is needed:

```bash
codex \
  -c 'features.network_proxy.enabled=true' \
  -c 'features.network_proxy.enable_socks5=false'
```

The intended behavior matrix is:

| Config surface | Network setting | `features.network_proxy` | Direct sandbox network | Proxy |
| --- | --- | --- | --- | --- |
| Permission profile | `network.enabled = false` | off | restricted | off |
| Permission profile | `network.enabled = false` | on | restricted | off |
| Permission profile | `network.enabled = true` | off | enabled | off |
| Permission profile | `network.enabled = true` | on | enabled | on |
| Legacy `workspace-write` | `network_access = false` | off | restricted | off |
| Legacy `workspace-write` | `network_access = false` | on | restricted | off |
| Legacy `workspace-write` | `network_access = true` | off | enabled | off |
| Legacy `workspace-write` | `network_access = true` | on | enabled | on |

`[experimental_network]` requirements remain separate from the user feature toggle and still start the proxy on their own.

Relevant code:

- [`features/src/feature_configs.rs`](https://github.com/openai/codex/blob/43785aff47/codex-rs/features/src/feature_configs.rs#L58-L117) defines the feature-specific proxy config.
- [`core/src/config/mod.rs`](https://github.com/openai/codex/blob/43785aff47/codex-rs/core/src/config/mod.rs#L1959-L1964) reads the feature table, and [later applies it only when network access is already enabled](https://github.com/openai/codex/blob/43785aff47/codex-rs/core/src/config/mod.rs#L2448-L2458).

## Verification

Added focused coverage for:

- keeping the proxy off when `features.network_proxy` is enabled but sandbox network access is disabled
- the full permission-profile and legacy `workspace-write` matrix above
- preserving `[experimental_network]` startup without the feature
- reusing profile-supplied proxy settings when the feature is enabled

Ran:

- `cargo test -p codex-features`
- `cargo test -p codex-core network_proxy_feature`
- `cargo test -p codex-core experimental_network_requirements_enable_proxy_without_feature`	2026-05-11 14:12:00 -07:00
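The behavior matrix in the commit above can be sketched as a small decision function. This is an illustration only: `NetworkOutcome` and `resolve_network` are hypothetical names, the actual logic lives in the Rust config code linked in the commit, and `[experimental_network]` requirements (which start the proxy on their own) are not modeled here.

```python
from typing import NamedTuple


class NetworkOutcome(NamedTuple):
    direct_network: str  # "enabled" or "restricted"
    proxy: str           # "on" or "off"


def resolve_network(network_enabled: bool, proxy_feature: bool) -> NetworkOutcome:
    """Model the matrix: the proxy flag only matters once network is on.

    `network_enabled` stands for either `network.enabled` on a permission
    profile or legacy `network_access`; both rows of the matrix behave alike.
    """
    if not network_enabled:
        # Proxy flag is a no-op when the sandbox has no network access.
        return NetworkOutcome("restricted", "off")
    return NetworkOutcome("enabled", "on" if proxy_feature else "off")


for net in (False, True):
    for flag in (False, True):
        print(net, flag, resolve_network(net, flag))
```

Keeping the network bit as the master switch means enabling the feature flag can never widen access, only add proxy restrictions to access that already exists.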
cooper-oai	54ec99cb54	[login] revoke superseded auth tokens on relogin (#21747 ) ## Summary - revoke previously stored managed ChatGPT tokens after a successful re-login - keep the new login successful even when revocation is unavailable or fails - cover the shared persistence path used by browser and device-code login flows ## Why A new `codex login` currently overwrites existing managed ChatGPT credentials without attempting to revoke the superseded tokens, leaving old credentials valid longer than necessary. ## Validation - `just fmt` - `CARGO_HOME=/tmp/cargo-home cargo test -p codex-login` ## Notes - Initial local Cargo validation hit a corrupt existing crate cache in the default `CARGO_HOME`; rerunning with a clean temporary `CARGO_HOME` passed. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-11 13:36:46 -07:00
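A rough sketch of the best-effort revocation described in the commit above: the new credentials are persisted first, and a failed revocation of the superseded tokens does not fail the login. The function name and the injected `load_stored`, `save`, and `revoke` callables are hypothetical stand-ins, not the `codex-login` API.

```python
from typing import Callable, Optional


def persist_login(
    new_tokens: dict,
    load_stored: Callable[[], Optional[dict]],
    save: Callable[[dict], None],
    revoke: Callable[[dict], None],
) -> bool:
    """Store new tokens, then try to revoke the old ones.

    Returns True only when superseded tokens existed and were revoked.
    """
    old_tokens = load_stored()
    save(new_tokens)  # the login succeeds regardless of what follows
    if old_tokens is None or old_tokens == new_tokens:
        return False
    try:
        revoke(old_tokens)
    except Exception:
        # Revocation unavailable or failed: keep the new login intact.
        return False
    return True
```

Because both browser and device-code flows share one persistence path in the commit's description, a single function like this is where the revocation step would naturally sit.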
Michael Bolin	a5d2fe5104	merge commit for archive created by Sapling	2026-05-11 13:31:53 -07:00
Michael Bolin	50719c6d17	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 13:31:36 -07:00
Michael Bolin	2799c3c082	Merge `3de8082c79` into sapling-pr-archive-bolinfest	2026-05-11 12:56:42 -07:00
Michael Bolin	3de8082c79	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 12:56:36 -07:00
Michael Bolin	de130c5844	merge commit for archive created by Sapling	2026-05-11 12:46:04 -07:00
Michael Bolin	f70b13c7ff	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 12:45:56 -07:00
Ruslan Nigmatullin	e3f481da98	daemon: refresh updater after validated binary rollout (#21853 ) ## Why `bootstrap` starts a detached pid-backed updater loop, but before this change that updater could keep running an old executable image even after `install.sh` replaced the managed standalone binary under `CODEX_HOME`. That left the updater itself behind the binary it had just rolled out, especially when the app-server was stopped or when the managed binary changed without a version-string change. ## What changed - Track updater identity from the executable contents rather than only the reported CLI version. - Force the managed app-server restart path when the managed binary contents differ from the running updater image, then re-exec the updater from the managed binary once the rollout is in a safe state. - Distinguish a genuinely absent managed app-server from a managed process that exists but is not yet probeable, so self-refresh does not skip a required restart. - Keep the restart/re-exec decision under the daemon operation lock so `bootstrap` cannot race the handoff. - Update `app-server-daemon/README.md` to document the resulting standalone and out-of-band update behavior. ## Verification - `cargo test -p codex-app-server-daemon` - `just fix -p codex-app-server-daemon` Added focused unit coverage for: - content-based updater refresh decisions - safe updater re-exec outcomes across restart states	2026-05-11 12:37:10 -07:00
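One way to picture "updater identity from the executable contents" is to compare content digests instead of version strings. The sketch below assumes SHA-256 hashing and uses hypothetical function names; the commit does not say how the daemon actually compares contents.

```python
import hashlib
from pathlib import Path


def executable_digest(path: Path) -> str:
    """Hash a file's bytes in chunks so large binaries are not read at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def updater_needs_refresh(running_image: Path, managed_binary: Path) -> bool:
    """True when the managed binary differs from the running updater image,
    even if both report the same CLI version string."""
    return executable_digest(running_image) != executable_digest(managed_binary)
```

A content comparison catches the case the commit calls out, where the managed binary changes without a version-string change.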
Michael Bolin	e54b1ab023	merge commit for archive created by Sapling	2026-05-11 12:35:52 -07:00
Michael Bolin	4df845f153	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 12:35:34 -07:00
Felipe Coury	99b98aece6	config: accept `minus` in TUI keymap config (#22192 ) ## Summary Fixes #22128. The `/keymap` flow already persists the `-` key as `minus`, and the runtime keymap parser already accepts that spelling. `codex-config` was the missing leg: it rejected `minus` during config deserialization, so a binding saved by Codex could fail on the next startup or config reload. ## What Changed - Accept `minus` as a valid canonical key name in `tui.keymap` config normalization. - Update the config validation message so its supported-key list includes `minus`. - Add regression coverage that deserializes both `minus` and `alt-minus` under `[tui.keymap.global]` and verifies the normalized config shape. ## How to Test 1. Start Codex TUI. 2. Run `/keymap`. 3. Assign the `-` key to an action and save the change. 4. Restart Codex or reload the config. 5. Confirm the config loads normally and the saved binding remains usable instead of failing on `minus`. 6. As a focused regression check, repeat with a modifier form such as `alt--` captured through `/keymap`, which persists as `alt-minus` and should also reload successfully. Targeted tests: - `cargo test -p codex-config`	2026-05-11 16:34:33 -03:00
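The sketch below illustrates why a named spelling such as `minus` is useful when `-` also separates modifiers. The parser, the key set, and the modifier set are hypothetical and much smaller than whatever `codex-config` actually accepts.

```python
SUPPORTED_KEYS = {"enter", "tab", "esc", "minus"}  # hypothetical subset
MODIFIERS = {"ctrl", "alt", "shift"}               # hypothetical subset


def normalize_binding(spec: str) -> tuple[frozenset[str], str]:
    """Split `mod-mod-key` into (modifiers, key), rejecting unknown parts."""
    *mods, key = spec.lower().split("-")
    unknown = [m for m in mods if m not in MODIFIERS]
    if unknown:
        raise ValueError(f"unknown modifier(s) in {spec!r}: {unknown}")
    if key not in SUPPORTED_KEYS and len(key) != 1:
        raise ValueError(f"unsupported key in {spec!r}: {key!r}")
    return frozenset(mods), key


print(normalize_binding("alt-minus"))
```

With this scheme a literal `alt--` splits into empty pieces and is rejected, whereas `alt-minus` parses cleanly, which is why the persisted form needs to be accepted at config load time.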
Michael Bolin	c9d222f803	Merge `64435bee22` into sapling-pr-archive-bolinfest	2026-05-11 12:24:43 -07:00
Michael Bolin	64435bee22	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 12:24:36 -07:00
Matthew Zeng	192481d1a1	[elicitation] Advertise new url elicitation capability when auth_elicitation is enabled. (#22188) ## Why We've added support for auth elicitation behind the `auth_elicitation` flag, but to stay backward compatible, servers need to explicitly check the capability before they decide to send elicitations. This PR advertises the capability, conditioned on the flag. ## What changed - Build `client_elicitation_capability` from the `AuthElicitation` feature state. - Thread that capability through MCP config, session startup, and `McpConnectionManager` so RMCP initialization advertises the correct elicitation support. - Advertise both `form` and `url` elicitation when the feature is enabled, and preserve the empty default capability when it is disabled. - Add coverage for the feature-derived config shape and the advertised initialization payload. ## Testing - `cargo test -p codex-mcp` - `cargo test -p codex-core to_mcp_config_preserves_auth_elicitation_feature_from_config` - `cargo test -p codex-core` (currently fails outside this change in `tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed` with a stack overflow after unrelated tests have started running)	2026-05-11 12:23:55 -07:00
viyatb-oai	d0fa2d81d8	feat(connectors): support managed app tool approval requirements (#21061)

## Why

Managed requirements can already centrally disable apps, but they could not express the per-tool app approval rules that normal config already supports. That left admins without a way to enforce connector tool approvals through `/etc/codex/requirements.toml` or cloud requirements.

## What changed

- Extend app requirements with per-tool `approval_mode` entries.
- Merge managed app tool requirements across managed sources while preserving higher-precedence exact tool settings.
- Apply managed tool approvals separately from user app config so managed policy is matched only on raw MCP `tool.name`, while user config keeps the existing raw-name-then-title convenience fallback.
- Add coverage for local requirements, cloud requirements parsing, managed-over-user precedence, and a title-collision case that must not widen managed auto-approval.

## Configuration shape

Local `/etc/codex/requirements.toml` and cloud requirements use the same TOML shape:

```toml
[apps.connector_123123.tools."calendar/list_events"]
approval_mode = "approve"
```

This is a per-tool approval rule keyed by app ID and raw MCP tool name, not an app-level boolean such as `apps.connector_123123.approve = true`.	2026-05-11 19:08:26 +00:00
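The matching rule in the commit above (managed policy on raw tool name only, user config with a title fallback) can be sketched as follows. `lookup_approval` and the flat dictionaries are hypothetical simplifications; the real configuration is keyed by app ID as well as tool name.

```python
from typing import Optional


def lookup_approval(
    tool_name: str,
    tool_title: Optional[str],
    managed: dict[str, str],
    user: dict[str, str],
) -> Optional[str]:
    """Resolve an approval mode for one tool of one app."""
    # Managed policy wins, and is matched on the raw MCP tool name only.
    if tool_name in managed:
        return managed[tool_name]
    # User config keeps the raw-name-then-title convenience fallback.
    if tool_name in user:
        return user[tool_name]
    if tool_title is not None and tool_title in user:
        return user[tool_title]
    return None


managed = {"calendar/list_events": "approve"}
# A different tool whose *title* collides with a managed key gets no match.
print(lookup_approval("other/tool", "calendar/list_events", managed, {}))
```

Restricting managed matches to the raw name is what prevents a tool from inheriting auto-approval just by presenting a colliding title.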
Michael Bolin	59add06f0c	merge commit for archive created by Sapling	2026-05-11 11:55:32 -07:00
Michael Bolin	cea7dd8caa	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 11:55:23 -07:00
viyatb-oai	6506765168	fix(permissions): preserve managed deny-read during escalation (#15977)

## Why

Managed filesystem `deny_read` requirements are administrator-enforced restrictions on specific paths. Once those requirements are active, Codex should not drop them just because an execution path would otherwise leave the sandbox. Before this change, an explicit escalation, a prefix-rule allow, a sandbox-denial retry, or an app-server legacy sandbox override could rebuild the runtime policy without those managed read-deny entries and expose a path the administrator had marked unreadable.

This is narrower than general sandbox-mode constraints. If an enterprise only sets `allowed_sandbox_modes`, a trusted `prefix_rule(..., decision = "allow")` can still run its matching command unsandboxed; this PR only preserves managed filesystem `deny_read` restrictions across those paths.

## What Changed

- Mark filesystem policies built from managed `deny_read` requirements so callers can tell when those deny entries must survive escalation.
- Preserve managed deny-read entries when runtime permission profiles are rebuilt through protocol, app-server, or legacy sandbox-policy compatibility paths.
- Keep managed deny-read attempts inside the selected sandbox on the first attempt and after sandbox-denial retries.
- Preserve the same behavior in the zsh-fork escalation path, including prefix-rule-driven escalation.
- Add a regression test showing the opposite case too: without managed deny-read, a prefix-rule allow still chooses unsandboxed execution.

## Verification

Targeted automated verification:

```shell
cargo test -p codex-core shell_request_escalation_execution_is_explicit -- --nocapture
cargo test -p codex-core prefix_rule_uses_unsandboxed_execution_without_managed_deny_read -- --nocapture
cargo test -p codex-core prefix_rule_preserves_managed_deny_read_escalation -- --nocapture
cargo test -p codex-protocol permission_profile_round_trip_preserves_filesystem_policy_metadata -- --nocapture
cargo test -p codex-protocol preserving_deny_entries_keeps_unrestricted_policy_enforceable -- --nocapture
cargo test -p codex-app-server-protocol permission_profile_file_system_permissions_preserves_policy_metadata -- --nocapture
cargo check -p codex-app-server -p codex-tui
```

Smoke-test invocations:

```shell
# macOS exact deny + allowed control
codex exec --skip-git-repo-check -C "$ROOT" \
  -c 'default_permissions="deny_read_smoke"' \
  -c 'permissions.deny_read_smoke.filesystem={":minimal"="read",":project_roots"={"."="write","secrets"="none","future-secret"="none","*/.env"="none"}}' \
  'Run shell commands only. Print the contents of allowed.txt. Then test whether reading secrets/exact-secret.txt succeeds without printing that file if it does. End with exactly two lines: allowed=<contents> and exact_secret=<BLOCKED or READABLE>.'

# Linux exact deny + allowed control
codex exec --skip-git-repo-check -C "$ROOT" \
  -c 'default_permissions="deny_read_smoke"' \
  -c 'permissions.deny_read_smoke.filesystem={":minimal"="read",glob_scan_max_depth=3,":project_roots"={"."="write","secrets"="none","future-secret"="none","*/.env"="none"}}' \
  'Run shell commands only. Print the contents of allowed.txt. Then test whether reading secrets/exact-secret.txt succeeds without printing that file if it does. End with exactly two lines: allowed=<contents> and exact_secret=<BLOCKED or READABLE>.'
```

Observed manual smoke matrix:

| Case | macOS Seatbelt | Linux bubblewrap |
| --- | --- | --- |
| `cat allowed.txt` | Pass | Pass |
| `cat secrets/exact-secret.txt` | Blocked | Blocked |
| `cat envs/root.env` | Blocked | Blocked |
| `cat envs/nested/one.env` | Blocked | Blocked |
| `cat envs/nested/two.env` | Blocked | Blocked |
| `cat alias-to-secrets/exact-secret.txt` | Blocked | Blocked |
| Missing denied path | A file created after sandbox setup remained unreadable | Creation was blocked by the reserved missing-path placeholder, and the placeholder was cleaned up after exit |
| Real `codex exec` shell turn | Pass | Pass |

Notes:

- The Linux smoke run used the fallback glob walker because the devbox did not have `rg` installed.
- The smoke matrix verifies the end-to-end filesystem behavior on macOS and Linux; the escalation-specific behavior is covered by the focused tests above.

---------

Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: Charlie Marsh <charliemarsh@openai.com>	2026-05-11 11:49:44 -07:00
Michael Bolin	78f29f853d	merge commit for archive created by Sapling	2026-05-11 11:45:28 -07:00
Owen Lin	7bddb3083d	fix(app-server): thread history redaction for remote clients (#22178 ) ## Summary Remote clients can still receive large `thread/resume` histories when prior turns include MCP tool call payloads or image-generation results. This adds a temporary response-only redaction path for the known remote client names. Longer term we will move towards fully paginated APIs backed by SQLite. ## Changes - Redact MCP tool call payload-bearing fields in `thread/resume` responses for `codex_chatgpt_android_remote` and `codex_chatgpt_ios_remote`. - Drop `imageGeneration` items from those `thread/resume` responses. - Keep redaction out of persisted rollout files, `thread/read`, `thread/turns/list`, live notifications, and token usage replay. - Cover the behavior with app-server helper tests and a v2 resume integration test that checks both remote clients plus a non-target control client. ## Testing - `cargo test -p codex-app-server thread_resume_redaction` - `cargo test -p codex-app-server thread_resume_redacts_payloads_for_chatgpt_remote_clients`	2026-05-11 11:45:25 -07:00
Michael Bolin	cdfe9c443c	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 11:45:09 -07:00
Michael Bolin	0726b39509	Merge `c326cb1025` into sapling-pr-archive-bolinfest	2026-05-11 11:42:01 -07:00
Michael Bolin	c326cb1025	config: add strict config parsing	2026-05-11 11:41:50 -07:00
Felipe Coury	90bd445e7f	fix(exec-server): suppress Windows taskkill output (#22058 ) ## Summary This is the `exec-server` follow-up to #21759. #21759 fixed the Windows `taskkill` output leak for the `rmcp-client` MCP teardown path, but #22050 showed that `exec-server` still had a parallel `taskkill /T /F` cleanup path in `exec-server/src/connection.rs`. Because that command inherited the parent stdio handles, Windows could still print `SUCCESS:` lines into the user's terminal during stdio child cleanup. This change silences that remaining `exec-server` callsite by redirecting `taskkill` stdin, stdout, and stderr to `Stdio::null()`. ## What Changed - add a Windows-only `Stdio` import in `exec-server/src/connection.rs` - redirect the `taskkill` command in `kill_windows_process_tree` to `Stdio::null()` for stdin, stdout, and stderr - keep the existing kill semantics unchanged by still checking `.status()` and preserving the existing fallback/logging behavior ## How to Test Manual validation is Windows-only, so I did not run the UI repro path locally here. 1. On Windows, use a Codex build from this branch. 2. Exercise an `exec-server` stdio flow that spawns a child process tree and then triggers transport cleanup. 3. Confirm the child process tree is still torn down. 4. Confirm the terminal no longer shows `SUCCESS: The process with PID ... has been terminated.` lines during cleanup. Targeted tests: - `cargo test -p codex-exec-server client::tests::dropping_stdio_client_terminates_spawned_process -- --exact` - `cargo test -p codex-exec-server client::tests::malformed_stdio_message_terminates_spawned_process -- --exact` Notes: - `cargo test -p codex-exec-server` still hits unrelated local macOS `sandbox-exec: sandbox_apply: Operation not permitted` failures in `tests/file_system.rs`. ## References - Fixes the remaining callsite discussed in #22050 - Related earlier fix: #21759	2026-05-11 15:40:56 -03:00
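The `taskkill` fix described in the commit above can be sketched as follows. This is a minimal illustration, not the code in `exec-server/src/connection.rs`: the function name `silent_taskkill_command` is hypothetical, and passing the PID through `/PID` is an assumption (the commit message only names `taskkill /T /F`).

```rust
use std::process::{Command, Stdio};

/// Builds a `taskkill /T /F /PID <pid>` command whose stdin, stdout, and
/// stderr are all redirected to the null device, so the child cannot print
/// `SUCCESS:` lines into the parent's terminal.
/// (Hypothetical helper; the real code lives in `kill_windows_process_tree`.)
pub fn silent_taskkill_command(pid: u32) -> Command {
    let mut cmd = Command::new("taskkill");
    cmd.args(["/T", "/F", "/PID"])
        .arg(pid.to_string())
        // Without these three calls the child inherits the parent's handles.
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null());
    cmd
}
```

A caller would still invoke `.status()` on the returned command to check the exit code, which matches the commit's note that the kill semantics stay unchanged.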
Dylan Hurd	e783dab44c	fix(exec-policy) use is_known_safe_command less (#20305) ## Summary Restricts `is_known_safe_command` to the modes where it is explicitly part of the documented behavior: - when `environment_lacks_sandbox_protections` - in `AskForApproval::UnlessTrusted` As a result, escalations for commands that pass `is_known_safe_command` are no longer auto-approved in `AskForApproval::OnRequest` or `AskForApproval::Granular`. ## Testing - [x] Updated unit tests - [x] Updated approvals scenario tests. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-11 11:37:53 -07:00
Michael Bolin	a2d6b53765	Merge `6b894fde91` into sapling-pr-archive-bolinfest	2026-05-11 11:35:54 -07:00
canvrno-oai	eaf05c9002	Unified mentions in TUI (#19068 ) This PR replaces the TUI’s file-only `@mention` popup with a unified mentions experience. Typing `@...` now searches across filesystem matches, installed plugins, and skills in one popup, with result types clearly labeled and selectable from the same flow. - Adds a unified `@mentions` popup that returns: - plugins - skills - files - directories - Adds search modes so users can narrow the popup without changing their query: - All Results _(default/same as Codex App)_ - Filesystem Only - Plugins _(...and skills)_ - Preserves existing insertion behavior: - selected file paths are inserted into the prompt - paths with spaces are quoted - image file selections still attach as images when possible - selecting a plugin or skill inserts the corresponding `$name` - the composer records the canonical mention binding, such as `plugin://...` or the skill path - Expanded `@mentions` rendering: - type tags for Plugin, Skill, File, and Dir - distinct plugin/filesystem colors - stable fixed-height layout (8 rows) - truncation behavior for narrow terminals Note: - The unified mentions popup does not display app connectors under `@mention` results for Codex App parity. Connector mentions remain available through the existing `$mention` path. https://github.com/user-attachments/assets/f93781ed-57d3-4cb5-9972-675bc5f3ef3f	2026-05-11 11:34:52 -07:00
jif-oai	b401666ca5	Add process-scoped SQLite telemetry (#22154 ) ## Summary - add SQLite init, backfill-gate, and fallback telemetry without introducing a cross-cutting state-db access wrapper - install one process-scoped telemetry sink after OTEL startup and let low-level state/rollout paths emit through it directly - add process-start metrics for the process owners that initialize SQLite --------- Co-authored-by: Owen Lin <owen@openai.com>	2026-05-11 11:32:40 -07:00
Michael Bolin	6b894fde91	Move workspace roots onto thread/session state and stop using active permission profile modifications as an overlay for writable roots. Existing app-server threads now preserve their persisted PermissionProfile value across resume, fork, and turn updates; permissions requests on existing threads only update the active named profile after validating it exists. Workspace roots can be updated independently, and SandboxPolicy::WorkspaceWrite no longer stores its own writable_roots.	2026-05-11 11:23:30 -07:00
rhan-oai	cf6342b75b	[codex-analytics] add turn tool counts to turn events (#21431 ) ## Summary - accumulate completed tool-item counts per turn from the item lifecycle - populate the reserved count fields on `codex_turn_event` - add reducer coverage for zero-count turns and mixed completed tool items ## Why PR #17090 moved tool-item analytics onto the item lifecycle, so the turn reducer can now derive the per-turn tool counts from the same completed items instead of leaving the reserved fields null. ## Validation - `just fmt` - `cargo test -p codex-analytics`	2026-05-11 18:18:02 +00:00
Won Park	0dbad2a348	Make auto-review denial short-circuit use a rolling review window (#22110 ) ## Why Long-running turns can accumulate enough denied auto-review decisions to trip the global short-circuit even when those denials are spread far apart. The breaker should still stop genuinely bad loops, but it should judge recent behavior instead of lifetime turn history. ## What changed - Replaced the lifetime `10 total denials` threshold with `10 denials in the last 50 reviews`. - Kept the existing `3 consecutive denials` interrupt behavior unchanged. - Tracked recent auto-review outcomes in the circuit breaker and updated the warning copy to report the rolling-window count. - Renamed the new rolling-window coverage to `auto_review_*` test names. - Added coverage that confirms older denials fall out of the 50-review window and no longer trigger the breaker. ## Validation - `just fmt` - `cargo test -p codex-core guardian_rejection_circuit_breaker --lib` - `cargo test -p codex-core auto_review_rejection_circuit_breaker --lib`	2026-05-11 11:03:11 -07:00
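The rolling-window rule in the commit above (10 denials within the last 50 reviews, plus the unchanged 3-consecutive-denials rule) could look roughly like this. The thresholds come from the commit message; the type and method names are hypothetical and do not reflect the actual `codex-core` implementation.

```rust
use std::collections::VecDeque;

// Thresholds quoted from the commit message; every name here is hypothetical.
const WINDOW: usize = 50;
const MAX_DENIALS_IN_WINDOW: usize = 10;
const MAX_CONSECUTIVE_DENIALS: usize = 3;

#[derive(Debug, Default)]
pub struct RollingDenialBreaker {
    /// Outcomes of the most recent reviews, oldest first (`true` = denied).
    recent: VecDeque<bool>,
    /// Length of the current run of back-to-back denials.
    consecutive: usize,
}

impl RollingDenialBreaker {
    /// Records one review outcome and returns `true` if the breaker trips.
    pub fn record(&mut self, denied: bool) -> bool {
        // Evict the oldest outcome so only the last WINDOW reviews count.
        if self.recent.len() == WINDOW {
            self.recent.pop_front();
        }
        self.recent.push_back(denied);
        self.consecutive = if denied { self.consecutive + 1 } else { 0 };

        self.consecutive >= MAX_CONSECUTIVE_DENIALS
            || self.denials_in_window() >= MAX_DENIALS_IN_WINDOW
    }

    /// Number of denials among the reviews currently in the window.
    pub fn denials_in_window(&self) -> usize {
        self.recent.iter().filter(|&&denied| denied).count()
    }
}
```

With this shape, a long turn that is denied once every six reviews never holds more than nine denials in any 50-review window, so it keeps running, whereas a lifetime counter would eventually trip.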
Eric Traut	1e65b3e0af	Fix goal update and add `/goal edit` command in TUI (#21954) ## Why Users have requested the ability to edit a goal's objective after a goal has been created. This PR exposes a new `/goal edit` command in the TUI to address this request. In the process of implementing this, I also noticed an existing bug in the goal runtime. When a goal's objective is updated through the `thread/goal/set` app server API, the goal runtime didn't emit a new steering prompt to tell the agent about the new objective. This PR also fixes that gap. ## What Changed - Adds `/goal edit` in the TUI, opening an edit box prefilled with the current goal objective. - Keeps active and paused goals in their current state, resets completed goals to active, keeps budget-limited goals budget-limited, and preserves the existing token budget. - Changes the existing `thread/goal/set` behavior so editing an objective preserves goal accounting instead of resetting it. The older reset-on-new-objective behavior was left over from before `thread/goal/clear`; clients that need to reset accounting can now clear the existing goal and create a new one. - Reuses the existing goal set API path; this does not add or change app-server protocol surface area. - Adds a dedicated goal runtime steering prompt when an externally persisted goal mutation changes the objective, so active turns receive the updated objective. ## Validation - Make sure `/goal edit` returns an error if no goal currently exists - Make sure `/goal edit` displays an edit box that can be canceled with no side effects - Make sure that an edited goal results in a steer so the agent starts pursuing the new objective - Make sure the new objective is reflected in the goal if you use `/goal` to display the goal summary - Make sure that `/goal edit` doesn't reset the token budget or the time/token accounting on the updated goal	2026-05-11 10:49:19 -07:00
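The status rules listed for `/goal edit` in the commit above reduce to a small mapping. The sketch below is illustrative only: `GoalStatus` and `status_after_objective_edit` are hypothetical names, and the four variants are inferred from the states the commit message mentions.

```rust
/// Goal states named in the commit message (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GoalStatus {
    Active,
    Paused,
    Completed,
    BudgetLimited,
}

/// Status a goal should have after its objective is edited:
/// completed goals become active again; every other state is preserved.
pub fn status_after_objective_edit(current: GoalStatus) -> GoalStatus {
    match current {
        GoalStatus::Completed => GoalStatus::Active,
        other => other,
    }
}
```

Token budget and time/token accounting are deliberately absent from the function, since the commit states that an edit leaves them untouched.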
jif-oai	32b1ae7099	chore: drop built-in MCPs (#22173 ) Drop something that was never used	2026-05-11 19:45:08 +02:00
Ruslan Nigmatullin	a124ddb854	app-server: remove TCP websocket listener (#21843 ) ## Why The app-server no longer needs to expose a TCP websocket listener. Keeping that transport also kept around a separate listener/auth surface that is unnecessary now that local clients can use stdio or the Unix-domain control socket, while remote connectivity is handled by `remote_control`. ## What Changed - Removed `ws://IP:PORT` parsing and the `AppServerTransport::WebSocket` startup path. - Deleted the app-server websocket listener auth module and removed related CLI flags/dependencies. - Kept websocket framing only where it is still needed: over the Unix-domain control socket and in the outbound `remote_control` connection. - Updated app-server CLI/help text and `app-server/README.md` to document only `stdio://`, `unix://`, `unix://PATH`, and `off` for local transports. - Converted affected app-server integration coverage from TCP websocket listeners to UDS-backed websocket connections, and added a parse test that rejects `ws://` listen URLs. - Removed the now-unused workspace `constant_time_eq` dependency and refreshed `Cargo.lock` after `cargo shear` caught the drift. - Moved test app-server UDS socket paths to short Unix temp paths so macOS Bazel test sandboxes do not exceed Unix socket path limits. ## Verification - Added/updated tests around UDS websocket transport behavior and `ws://` listen URL rejection. - `cargo shear` - `cargo metadata --no-deps --format-version 1` - `cargo test -p codex-app-server unix_socket_transport` - `cargo test -p codex-app-server unix_socket_disconnect` - `just fix -p codex-app-server` - `git diff --check` Local full Rust test execution was blocked before compilation by an external fetch failure for the pinned `nornagon/crossterm` git dependency. `just bazel-lock-update` and `just bazel-lock-check` were retried after the manifest cleanup but remain blocked by external BuildBuddy/V8 fetch timeouts.	2026-05-11 10:17:26 -07:00
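After the change described above, the documented local listen values are `stdio://`, `unix://`, `unix://PATH`, and `off`, and `ws://` URLs are rejected. A minimal parser with that behavior might look like the following. `ListenTransport` and `parse_listen_url` are hypothetical names chosen for illustration and are not the app-server's actual types.

```rust
use std::path::PathBuf;

/// Local transports documented after the TCP websocket listener was removed.
/// (Hypothetical enum, not the real `AppServerTransport`.)
#[derive(Debug, PartialEq, Eq)]
pub enum ListenTransport {
    Stdio,
    /// `unix://` with no path: use the default control socket location.
    UnixDefault,
    /// `unix://PATH`: use an explicit socket path.
    UnixPath(PathBuf),
    Off,
}

/// Parses a listen value, rejecting anything outside the documented set,
/// including the removed `ws://IP:PORT` form.
pub fn parse_listen_url(value: &str) -> Result<ListenTransport, String> {
    match value {
        "off" => Ok(ListenTransport::Off),
        "stdio://" => Ok(ListenTransport::Stdio),
        "unix://" => Ok(ListenTransport::UnixDefault),
        other => match other.strip_prefix("unix://") {
            Some(path) => Ok(ListenTransport::UnixPath(PathBuf::from(path))),
            None => Err(format!("unsupported listen URL: {other}")),
        },
    }
}
```

Returning an error for `ws://` mirrors the parse test the commit mentions, which checks that such URLs are no longer accepted.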


15220 Commits