codex

mirror of https://github.com/openai/codex.git synced 2026-05-16 01:02:48 +00:00

Author	SHA1	Message	Date
Michael Bolin	9f75b476b1	app-server: test persisted active permission profile	2026-05-12 16:50:48 -07:00
Michael Bolin	41199009ea	permissions: move workspace roots onto thread state	2026-05-12 16:50:48 -07:00
Michael Bolin	1b233d3714	core: box multi-agent handler futures	2026-05-12 16:50:48 -07:00
Anton Panasenko	ac466c0dbd	feat(exec-server): use protobuf relay frames (#22343 ) ## Why Remote exec-server now needs one executor websocket to serve multiple harness JSON-RPC sessions. Rendezvous routes by `stream_id`, and the exec-server side needs to use the same stable relay frame contract instead of a hand-rolled JSON shape. The relay protocol also needs to make ownership boundaries clear: harness and executor endpoints own sequencing, acks, retries, duplicate suppression, segmentation, and reassembly; rendezvous only routes frames. ## What Changed - Add the checked-in `codex.exec_server.relay.v1.RelayMessageFrame` proto plus generated prost bindings for `codex-exec-server`. - Encode remote harness/executor relay traffic as binary protobuf websocket frames while keeping local websocket JSON-RPC unchanged. - Demux executor-side relay streams into independent `ConnectionProcessor` sessions keyed by `stream_id`. - Add a programmatic `RemoteExecutorConfig::with_bearer_token(...)` constructor for non-CLI callers and integration tests. - Add an integration test that starts the remote executor against a fake registry/rendezvous websocket and verifies two virtual streams share one executor websocket without cross-talk, including per-stream reset behavior. - Document the remote relay envelope, sequence ranges, `ack`/`ack_bits`, and endpoint responsibilities in `exec-server/README.md`. ## Verification - `cargo test -p codex-exec-server --test relay multiplexed_remote_executor_routes_independent_virtual_streams -- --exact` - `cargo test -p codex-exec-server --test relay` - `cargo test -p codex-exec-server` passed outside the sandbox. The sandboxed run hit macOS `sandbox-exec: sandbox_apply: Operation not permitted` in filesystem sandbox tests.	2026-05-12 16:50:45 -07:00
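The executor-side demultiplexing described in #22343 can be pictured with a small sketch. This is an illustration in Python, not the `codex-exec-server` implementation: only `stream_id` comes from the commit message, while `Frame`, `Session`, `Demux`, and the `reset` flag are hypothetical stand-ins for the protobuf `RelayMessageFrame` and the per-stream `ConnectionProcessor` sessions.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """Hypothetical stand-in for a relay frame; only stream_id is named in the commit."""
    stream_id: str
    payload: bytes
    reset: bool = False  # assumed flag modelling a per-stream reset


@dataclass
class Session:
    """Hypothetical per-stream session state."""
    received: list = field(default_factory=list)


class Demux:
    """Routes frames from one shared websocket to independent sessions."""

    def __init__(self):
        self.sessions = {}

    def dispatch(self, frame):
        if frame.reset:
            # A reset tears down only the named stream; others are untouched.
            self.sessions.pop(frame.stream_id, None)
            return
        # Create the session lazily on the first frame for a stream_id.
        session = self.sessions.setdefault(frame.stream_id, Session())
        session.received.append(frame.payload)


demux = Demux()
for f in [Frame("a", b"1"), Frame("b", b"2"), Frame("a", b"3"), Frame("b", b"", reset=True)]:
    demux.dispatch(f)
```

After the loop, stream `a` still holds both of its payloads and stream `b` has been dropped, which is the no-cross-talk property the integration test in the commit checks for.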
Felipe Coury	6dc3b3d7c8	test(tui): relax configured pet load timeout (#22392 ) ## Why Windows CI has been timing out in `configured_pet_load_is_deferred_until_after_construction` while waiting for the deferred configured-pet load event. The test still needs to prove construction returns before the pet image is available, but the background load slices the built-in pet spritesheet into frame cache files. That work can exceed the old 2 second deadline on slower or more contended CI machines. ## What Changed - Increased the test wait for `ConfiguredPetLoaded` from 2 seconds to 30 seconds. - Kept the post-construction assertion intact so the test still verifies that the pet is not loaded synchronously during `ChatWidget` construction. ## How to Test Targeted tests: - `cargo test -p codex-tui configured_pet_load_is_deferred_until_after_construction` - `just argument-comment-lint` Additional check: - `cargo test -p codex-tui` was run, but the broader crate suite did not complete successfully due to unrelated existing failures: - `status::tests::status_permissions_full_disk_managed_without_network_is_external_sandbox` - `status::tests::status_permissions_full_disk_managed_with_network_is_danger_full_access` - later abort in `tests::fork_last_filters_latest_session_by_cwd_unless_show_all` from stack overflow	2026-05-12 16:50:35 -07:00
pakrym-oai	960d42ddae	code-mode: carry nested tool kind through runtime (#22377 ) ## Why Code mode only used nested spec lookup at execution time to rediscover whether a nested tool should be invoked as a function tool or a freeform tool. That information is already present in the enabled tool metadata that code mode builds to expose `tools.*` and `ALL_TOOLS`, so re-looking it up from the router was redundant and kept execution coupled to a separate spec lookup path. ## What Changed - thread `CodeModeToolKind` through the code-mode runtime `ToolCall` event and `CodeModeNestedToolCall` - emit the nested tool kind directly from the V8 callback using the already-enabled tool metadata - build nested tool payloads from the propagated kind instead of calling `find_spec` - remove the now-unused `find_spec` plumbing from the router and parallel runtime helpers - add unit coverage for function vs freeform payload shaping and update affected router tests ## Testing - `cargo test -p codex-code-mode` - `cargo test -p codex-core code_mode::tests` - `cargo test -p codex-core extension_tool_bundles_are_model_visible_and_dispatchable` - `cargo test -p codex-core model_visible_specs_filter_deferred_dynamic_tools`	2026-05-12 23:34:37 +00:00
Dylan Hurd	8123bddb16	chore(config) include_collaboration_mode_instructions (#22383) ## Summary Adds `include_collaboration_mode_instructions`, the config equivalent of `include_permissions_instructions` for collaboration modes. Useful when we want to keep this instruction out of the context. ## Testing - [x] Added unit test	2026-05-12 15:50:10 -07:00
pakrym-oai	862b2122ee	tools: remove is_mutating dispatch gating (#22382 ) ## Why Tool dispatch had two serialization mechanisms: - `supports_parallel_tool_calls` decides whether a tool participates in the shared parallel-execution lock. - `is_mutating` separately gated some calls inside dispatch. That second hook no longer carried its weight. The remaining parallel-support flag is already the per-tool concurrency policy, so keeping a second mutating gate made dispatch harder to follow and left behind extra session plumbing that only existed for that path. ## What changed - Removed `is_mutating` from tool handlers and deleted the `tool_call_gate` path that existed only to support it. - Simplified dispatch and routing to rely on the existing per-tool `supports_parallel_tool_calls` boolean. - Dropped the now-unused handler overrides and related session/test scaffolding. - Kept the router/parallel tests focused on the surviving per-tool behavior. - Removed the unused `codex-utils-readiness` dependency from `codex-core` as a follow-up fix for `cargo shear`. ## Testing - `cargo test -p codex-core parallel_support_does_not_match_namespaced_local_tool_names` - `cargo test -p codex-core mcp_parallel_support_uses_handler_data` - `cargo test -p codex-core tools_without_handlers_do_not_support_parallel`	2026-05-12 22:44:54 +00:00
Chris Bookholt	5e3ee5eddf	[codex] Tighten unified exec sandbox setup (#22207 ) ## Summary - tighten unified exec sandbox initialization - preserve the requested process workdir independently from sandbox setup - add regression coverage for the updated invariant ## Validation - Ran `/tmp/cargo-tools/bin/just fmt`. - Ran the targeted `codex-core` regression test successfully. - Ran `cargo test -p codex-core`; it did not complete cleanly because unrelated existing agent/config-loader tests failed and the run later aborted on a stack overflow in `tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`. Co-authored-by: Codex <noreply@openai.com>	2026-05-12 08:41:00 -07:00
jif-oai	89c8e9a4db	fix: uv lock (#22323) Update the uv lock file.	2026-05-12 16:24:54 +02:00
Felipe Coury	95b332c820	feat(tui): add ambient terminal pets (#21206) ## Why The Codex App has animated pets, but the TUI had no equivalent ambient companion surface. This brings that experience into terminal Codex while keeping the main chat flow usable: the pet should feel present, but it cannot cover transcript text, composer input, approvals, or picker content. The feature also needs to be terminal-aware. Different terminals support different image protocols, tmux can interfere with image rendering, and some users will want pets disabled entirely or anchored differently depending on their layout. Screenshots (images omitted): macOS - Ghostty, iTerm2 and WezTerm with Custom Pet; Windows Terminal; Linux - WezTerm and Ghostty. ## What Changed - Add a TUI ambient pet renderer in `codex-rs/tui/src/pets/`. - Port the app-style pet animation states so the sprite changes with task status, waiting-for-input states, review/ready states, and failures. - Add `/pets` selection UI with a preview pane, loading state, built-in pet choices, and a first-row `Disable terminal pets` option. - Download built-in pet spritesheets on demand from the same public CDN path already used by Android, under `https://persistent.oaistatic.com/codex/pets/v1/...`, and cache them locally under `~/.codex/cache/tui-pets/`. - Keep custom pets local. - Add config support for pet selection, disabling pets, and choosing whether the pet follows the composer bottom or anchors to the terminal bottom. 
- Reserve layout space around the pet so transcript wrapping, live responses, and composer input do not render underneath the sprite. - Gate image rendering by terminal capability, disable image pets under tmux, and support both Kitty Graphics and SIXEL terminals. - Add redraw cleanup for terminal image artifacts, including sixel cell clearing. ## Current Scope - This is an initial TUI version of ambient pets, not full App parity. - It focuses on ambient sprite rendering, `/pets` selection, custom pets, terminal capability gating, and on-demand CDN-backed built-in assets. - The ambient text overlay is currently disabled, so the TUI renders the pet sprite without extra status text beside it. ## How to Test 1. Start Codex TUI in a terminal with image support. 2. Run `/pets`. 3. Confirm the picker shows built-in pets plus custom pets, and the first item is `Disable terminal pets`. 4. On a fresh `~/.codex/cache/tui-pets/`, move onto a built-in pet and confirm the first preview downloads the spritesheet from the shared Codex pets CDN and renders successfully. 5. Move through the pet list and confirm subsequent built-in previews use the local cache. 6. Select a pet, then send and receive messages. Confirm transcript and composer text wrap before the pet instead of rendering underneath the sprite. 7. Change the pet anchor setting and confirm the pet can either follow the composer bottom or sit at the terminal bottom. 8. Return to `/pets`, choose `Disable terminal pets`, and confirm the sprite disappears cleanly. Targeted tests: - `cargo test -p codex-tui ambient_pet_` - `cargo test -p codex-tui resize_reflow_wraps_transcript_early_when_pet_is_enabled` - `cargo insta pending-snapshots`	2026-05-12 10:43:17 -03:00
cassirer-openai	cb55b769d1	[rollout-trace] Add x-codex-inference-call-id header to inference calls. (#22311 ) This allows us to attach call logs to inference requests in traces.	2026-05-12 05:55:11 -07:00
jif-oai	d996f5366f	feat: guardian as an extension (contributors part) (#22216) Part 1 of guardian as an extension. This binds all the logic needed to spawn another agent from an extension and adds `ThreadId` to the start-thread collaborator.	2026-05-12 14:41:45 +02:00
xl-openai	5b1a4c2fa7	feat: Normalize remote plugin summary identities. (#22265) Makes plugin summaries use config-style `plugin@marketplace` IDs while exposing backend remote IDs separately as `remotePluginId`. Also fixes an inconsistency in `REMOTE_SHARED_WITH_ME_MARKETPLACE_NAME`.	2026-05-12 00:58:37 -07:00
viyatb-oai	46f30d0282	feat(sandbox): add Windows deny-read parity (#18202) ## Why The split filesystem policy stack already supports exact and glob `access = none` read restrictions on macOS and Linux. Windows still needed subprocess handling for those deny-read policies without claiming enforcement from a backend that cannot provide it. ## Key finding The unelevated restricted-token backend cannot safely enforce deny-read overlays. Its `WRITE_RESTRICTED` token model is authoritative for write checks, not read denials, so this PR intentionally fails that backend closed when deny-read overrides are present instead of claiming unsupported enforcement. ## What changed This PR adds the Windows deny-read enforcement layer and makes the backend split explicit: - Resolves Windows deny-read filesystem policy entries into concrete ACL targets. - Preserves exact missing paths so they can be materialized and denied before an enforceable sandboxed process starts. - Snapshot-expands existing glob matches into ACL targets for Windows subprocess enforcement. - Honors `glob_scan_max_depth` when expanding Windows deny-read globs. - Plans both the configured lexical path and the canonical target for existing paths so reparse-point aliases are covered. - Threads deny-read overrides through the elevated/logon-user Windows sandbox backend and unified exec. - Applies elevated deny-read ACLs synchronously before command launch rather than delegating them to the background read-grant helper. - Reconciles persistent deny-read ACEs per sandbox principal so policy changes do not leave stale deny-read ACLs behind. - Fails closed on the unelevated restricted-token backend when deny-read overrides are present, because its `WRITE_RESTRICTED` token model is not authoritative for read denials. ## Landed prerequisites These prerequisite PRs are already on `main`: 1. #15979 `feat(permissions): add glob deny-read policy support` 2. #18096 `feat(sandbox): add glob deny-read platform enforcement` 3. #17740 `feat(config): support managed deny-read requirements` This PR targets `main` directly and contains only the Windows deny-read enforcement layer. ## Implementation notes - Exact deny-read paths remain enforceable on the elevated path even when they do not exist yet: Windows materializes the missing path before applying the deny ACE, so the sandboxed command cannot create and read it during the same run. - Existing exact deny paths are preserved lexically until the ACL planner, which then adds the canonical target as a second ACL target when needed. That keeps both the configured alias and the resolved object covered. - Windows ACLs do not consume Codex glob syntax directly, so glob deny-read entries are expanded to the concrete matches that exist before process launch. - Glob traversal deduplicates directory visits within each pattern walk to avoid cycles, without collapsing distinct lexical roots that happen to resolve to the same target. - Persistent deny-read ACL state is keyed by sandbox principal SID, so cleanup only removes ACEs owned by the same backend principal. - Deny-read ACEs are fail-closed on the elevated path: setup aborts if mandatory deny-read ACL application fails. - Unelevated restricted-token sessions reject deny-read overrides early instead of running with a silently unenforceable read policy. ## Verification - `cargo test -p codex-core windows_restricted_token_rejects_unreadable_split_carveouts` - `just fmt` - `just fix -p codex-core` - `just fix -p codex-windows-sandbox` - GitHub Actions rerun is in progress on the pushed head. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-11 23:04:28 -07:00
pakrym-oai	c9e46ed639	[codex] Make handlers own parallel tool support (#22254 ) ## Why `ToolRouter::tool_supports_parallel()` was still consulting configured specs when a handler lookup missed, even though parallel schedulability is really a property of the executable handler. Keeping that metadata on `ConfiguredToolSpec` duplicated state between the model-visible spec layer and the runtime handler layer. This change makes handlers the sole source of truth for parallel tool support and removes the extra spec wrapper that only existed to carry duplicated metadata. ## What changed - removed `ConfiguredToolSpec` and store plain `ToolSpec` values in the registry/router builder path - changed `ToolRouter::tool_supports_parallel()` to consult only the handler registry and fall back to `false` - simplified spec collection and test helpers to operate directly on `ToolSpec` - updated router/spec tests to cover handler-owned parallel behavior and the no-handler fallback ## Validation - `cargo test -p codex-tools` - `cargo test -p codex-core mcp_parallel_support_uses_handler_data` - `cargo test -p codex-core deferred_responses_api_tool_serializes_with_defer_loading` - `cargo test -p codex-core tools_without_handlers_do_not_support_parallel` - `cargo test -p codex-core request_plugin_install_can_be_registered_without_search_tool` ## Docs No documentation updates needed.	2026-05-11 22:26:33 -07:00
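The handler-owned lookup with a `false` fallback from #22254 can be sketched in a few lines. This is a hedged Python illustration rather than the Rust code in `codex-core`: `tool_supports_parallel` and `supports_parallel_tool_calls` mirror names from the commit messages, while the `Handler` class, the dictionary registry, and the tool names `read_file` and `shell` as used here are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Handler:
    """Hypothetical handler carrying its own concurrency policy."""
    supports_parallel_tool_calls: bool


class ToolRouter:
    def __init__(self, handlers):
        # The handler registry is the only source of truth; no spec metadata is consulted.
        self._handlers = dict(handlers)

    def tool_supports_parallel(self, name):
        handler = self._handlers.get(name)
        # A tool without a registered handler never runs in parallel.
        return handler.supports_parallel_tool_calls if handler is not None else False


router = ToolRouter({"read_file": Handler(True), "shell": Handler(False)})
```

Keeping the flag on the handler means there is one place to change a tool's concurrency policy, and an unregistered name cannot inherit parallel support from a spec entry.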
pakrym-oai	79c65f816c	[codex] Filter legacy warning messages during compaction (#22243 ) ## Why Older sessions can contain model-warning records persisted as `user` messages, including the unified exec process-limit warning, the `apply_patch`-via-`exec_command` warning, and the model-mismatch high-risk cyber fallback warning. Those warnings are no longer produced as conversation history items, but when old sessions compact they should still be recognized as injected context rather than preserved as real user turns. ## What changed - Removed `record_model_warning` and the production paths that emitted these warning messages into conversation history. - Added `LegacyUnifiedExecProcessLimitWarning`, `LegacyApplyPatchExecCommandWarning`, and `LegacyModelMismatchWarning` contextual fragments that are used only for matching old persisted messages. - Registered the legacy fragments with contextual user message detection so compaction filters them through the existing fragment path. - Added focused compaction coverage for old warning messages being dropped during compacted-history processing. ## Testing - `cargo test -p codex-core warning` - `just fix -p codex-core`	2026-05-11 19:51:51 -07:00
Abhinav	d08906a944	Support PreToolUse updatedInput rewrites (#20527 ) ## Why `PreToolUse` already exposes `updatedInput` in its hook output schema, but Codex currently rejects it instead of applying the rewrite. That leaves hook authors unable to make the documented pre-execution adjustment to a tool call before it runs. ## What - Accept `updatedInput` from `PreToolUse` hooks when paired with `permissionDecision: "allow"`. - Apply the rewritten input before dispatch so the tool executes the updated payload, not the original one. - Preserve the stable hook-facing compatibility shapes that participating tool handlers expose: - Bash-like tools (`shell`, `container.exec`, `local_shell`, `shell_command`, `exec_command`) use `{ "command": ... }`. - `apply_patch` exposes its patch body through the same command-shaped hook contract. - MCP tools expose their JSON argument object directly. - Keep each participating tool handler responsible for translating hook-facing `updatedInput` back into its concrete invocation shape. ## Verification Direct Bash-like rewrite coverage: - `pre_tool_use_rewrites_shell_before_execution` - `pre_tool_use_rewrites_container_exec_before_execution` - `pre_tool_use_rewrites_local_shell_before_execution` - `pre_tool_use_rewrites_shell_command_before_execution` - `pre_tool_use_rewrites_exec_command_before_execution` These cases assert that each supported Bash-like surface runs only the rewritten command while the hook still observes the original `{ "command": ... }` input. `pre_tool_use_rewrites_apply_patch_before_execution` - Model emits one patch. - Hook swaps in a different patch. - Asserts only the rewritten file is created, and the hook saw the original patch. `pre_tool_use_rewrites_code_mode_nested_exec_command_before_execution` - Model runs one nested shell command from code mode. - Hook rewrites it. - Asserts only the rewritten command runs, and the hook saw the original nested input. 
`pre_tool_use_rewrites_mcp_tool_before_execution` - Model calls the RMCP echo tool. - Hook rewrites the MCP arguments. - Asserts the MCP server receives and returns the rewritten message, not the original one.	2026-05-11 22:27:24 -04:00
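The rewrite rule from #20527 (accept `updatedInput` only when paired with `permissionDecision: "allow"`) can be sketched as below. This is a minimal Python illustration, not the Codex hook runtime: `effective_input` is a hypothetical helper, and ignoring `updatedInput` for any other decision is an assumption of this sketch, since the commit message does not say how such output is handled.

```python
def effective_input(original, hook_output):
    """Return the tool input that should actually be dispatched.

    `original` is the hook-facing input (for Bash-like tools, {"command": ...}).
    `hook_output` is the parsed PreToolUse hook output.
    """
    decision = hook_output.get("permissionDecision")
    updated = hook_output.get("updatedInput")
    if decision == "allow" and updated is not None:
        # The tool runs the rewritten payload, not the original one.
        return updated
    # Assumption: without an explicit "allow", the rewrite is not applied.
    return original


original = {"command": "rm -rf build"}
rewritten = effective_input(
    original,
    {"permissionDecision": "allow", "updatedInput": {"command": "rm -rf build/tmp"}},
)
```

The hook still observes `original`; only the value returned here would be handed to the tool handler, which then translates it back into its concrete invocation shape.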
starr-openai	17ed5ad0b0	Apply sandbox context to local view_image reads (#21861 ) ## Summary - create a selected-cwd filesystem sandbox context for view_image metadata and file reads in both local and remote environments - add a local restricted-profile regression test for the previously unsandboxed read path ## Validation - just fmt - bazel test --bes_backend= --bes_results_url= --test_output=errors --test_filter=view_image::tests::handle_passes_sandbox_context_for_local_filesystem_reads //codex-rs/core:core-unit-tests --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-11 18:48:43 -07:00
efrazer-oai	fd24c00b0b	feat(skills): default plugin creator to personal share flow (#22221 ) ## Summary Plugin creation now defaults to the personal marketplace path and ends with a readable handoff back into Codex after a marketplace-backed scaffold. Before this change, `plugin-creator` centered repo-local marketplace updates and did not clearly guide the agent to return the user to the created plugin afterward. This PR updates the bundled system skill so marketplace-backed scaffolds default to `~/plugins/<plugin-name>` plus `~/.agents/plugins/marketplace.json`, ask for user intent only when an existing repo marketplace makes personal vs team scope ambiguous, and end with named Markdown deeplinks labeled `View <plugin-name>` and `Share <plugin-name>`. ## What changed - default marketplace-backed creation to the personal plugin location - document the explicit repo/team override path for codebases that should own the plugin entry - ask personal vs team only when the current Git repo already has `.agents/plugins/marketplace.json` and the user has not stated scope - require named Markdown deeplinks after marketplace-backed creation so the final response returns the user to the exact plugin cleanly - keep the deeplink targets precise with real absolute `marketplacePath` and normalized `pluginName` values - align the bundled prompt, scaffold help text, and marketplace reference spec with the new default ## Testing Tests: targeted skill validation, Python compile checks, personal-default scaffold smoke, repo-override scaffold smoke, and whitespace checks.	2026-05-11 17:58:48 -07:00
pakrym-oai	ed5944ba1d	Simplify MCP tool handler plumbing (#21595 ) ## Why The MCP tool path had accumulated a few core-owned special cases: a dedicated payload variant, resolver plumbing, a legacy `AfterToolUse` translation path, and a side channel for parallel-call metadata. That made `ToolRegistry` and the spec builder know more about MCP than they needed to. This change moves MCP-specific execution details back onto `ToolInfo` and `McpHandler` so `codex-core` can treat MCP calls like normal function calls while still preserving MCP-specific dispatch and telemetry behavior where it belongs. ## What changed - removed `resolve_mcp_tool_info`, `ToolPayload::Mcp`, `ToolKind`, and the remaining registry-side MCP resolver path - stored MCP routing metadata directly on `McpHandler` and `ToolInfo`, including `supports_parallel_tool_calls` - deleted the legacy `AfterToolUse` consumer in `core`, which removes the need for handler-specific `after_tool_use_payload` implementations - switched tool-result telemetry to handler-provided tags and kept MCP-specific dispatch payload construction inside the handler - simplified tool spec planning/building by passing `ToolInfo` directly and dropping the direct/deferred MCP wrapper structs and the parallel-server side table ## Testing - `cargo check -p codex-core -p codex-mcp -p codex-otel` - `cargo test -p codex-core mcp_parallel_support_uses_exact_payload_server` - `cargo test -p codex-core direct_mcp_tools_register_namespaced_handlers` - `cargo test -p codex-core search_tool_description_lists_each_mcp_source_once` - `cargo test -p codex-mcp list_all_tools_uses_startup_snapshot_while_client_is_pending` - `just fix -p codex-core -p codex-mcp -p codex-otel`	2026-05-12 00:11:31 +00:00
Felipe Coury	e16b4e46d4	fix(tui): handle hidden app git directives (#21946 )	2026-05-11 21:08:40 -03:00
Ruslan Nigmatullin	95d8669ab2	[exec-server] serve websocket listener via HTTP upgrade (#21963 ) ## Why `codex exec-server` should keep the existing public `ws://IP:PORT` URL shape while serving that websocket connection through an HTTP upgrade path internally. That keeps the client-facing configuration simple and allows the listener to work through intermediate HTTP-aware infrastructure. ## What changed - keep the emitted and configured exec-server URL as `ws://IP:PORT` - serve that websocket endpoint through Axum HTTP upgrade handling on `/` - expose `GET /readyz` from the same listener for readiness checks - route upgraded Axum websocket streams through the shared JSON-RPC connection machinery - initialize the rustls crypto provider before websocket client connections - preserve inbound binary websocket JSON-RPC parsing for compatibility with the prior transport behavior ## Verification - `cargo test -p codex-exec-server --test health --test process --test websocket --test initialize --test exec_process`	2026-05-11 17:04:21 -07:00
Matthew Zeng	e15ecc9c35	Add production startup and TTFT telemetry (#22198 ) ## Why While investigating `codex exec hi` startup latency, the useful questions were not "is startup slow?" but "which durable bucket is slow in production?" The path we observed has a few distinct stages: 1. `thread/start` creates the session 2. startup prewarm builds the turn context, tools, and prompt 3. startup prewarm warms the websocket 4. the first real turn resolves the prewarm 5. the model produces the first token Before this PR, production telemetry had some of the raw measurements already: - aggregate startup-prewarm duration / age-at-first-turn metrics - TTFT as a metric - websocket request telemetry But there was no coherent production event stream for the startup breakdown itself, and TTFT was metric-only. That made it hard to answer the same latency questions from OpenTelemetry-backed logs without adding one-off local instrumentation. ## What changed Add durable production telemetry on the existing `SessionTelemetry` path: - new `codex.startup_phase` OTel log/trace events plus `codex.startup.phase.duration_ms` - new `codex.turn_ttft` OTel log/trace events while preserving the existing TTFT metric The startup phase event is emitted for the coarse buckets we actually observed while running `exec hi`: - `thread_start_create_thread` - `startup_prewarm_total` - `startup_prewarm_create_turn_context` - `startup_prewarm_build_tools` - `startup_prewarm_build_prompt` - `startup_prewarm_websocket_warmup` - `startup_prewarm_resolve` These phases are intentionally low-cardinality so they remain safe as production telemetry tags. ## Why this shape This keeps the instrumentation on the same production path as the rest of the session telemetry instead of adding a local debug-only trace mode. 
It also avoids changing startup behavior: - prewarm still runs - no control flow changes - no extra remote calls - no user-visible behavior changes One boundary is intentional: very early process bootstrap that happens before a session exists is not included here, because this PR uses session-scoped production telemetry. The expensive buckets we were trying to understand after `thread/start` are now covered durably. ## Verification - `cargo test -p codex-otel` - `cargo test -p codex-core turn_timing` - `cargo test -p codex-core regular_turn_emits_turn_started_without_waiting_for_startup_prewarm` - `cargo test -p codex-core interrupting_regular_turn_waiting_on_startup_prewarm_emits_turn_aborted` - `cargo test -p codex-app-server thread_start` - `just fix -p codex-otel -p codex-core -p codex-app-server` I also ran `cargo test -p codex-core`; it built successfully and then hit an existing unrelated stack overflow in `tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`.	2026-05-11 23:58:36 +00:00
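The low-cardinality constraint on startup phases in #22198 can be illustrated with a short sketch. The phase names and the `codex.startup_phase` and `codex.startup.phase.duration_ms` identifiers are taken from the commit message; the `startup_phase_event` helper and the dictionary layout of the event are hypothetical and say nothing about how `SessionTelemetry` actually encodes events.

```python
# Phase names listed in the commit message; a closed set keeps tag cardinality bounded.
STARTUP_PHASES = frozenset({
    "thread_start_create_thread",
    "startup_prewarm_total",
    "startup_prewarm_create_turn_context",
    "startup_prewarm_build_tools",
    "startup_prewarm_build_prompt",
    "startup_prewarm_websocket_warmup",
    "startup_prewarm_resolve",
})


def startup_phase_event(phase, started_s, finished_s):
    """Build a hypothetical event record for one startup phase (times in seconds)."""
    if phase not in STARTUP_PHASES:
        # Rejecting unknown names prevents free-form strings from becoming tags.
        raise ValueError(f"unknown startup phase: {phase}")
    return {
        "event": "codex.startup_phase",
        "phase": phase,
        "codex.startup.phase.duration_ms": round((finished_s - started_s) * 1000),
    }


event = startup_phase_event("startup_prewarm_total", 1.0, 1.25)
```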
starr-openai	22e84c49d0	Support multi-environment apply_patch selection (#21617 ) ## Summary - add multi-environment apply_patch routing for both freeform and function-call tool flows - parse and reconcile the optional environment selector in the main apply_patch parser, then verify against the selected environment in the handler - carry environment_id through runtime and approval surfaces so remote-targeted patches stay explicit end to end ## Testing - just fmt - remote exec-server e2e: `cargo test -p codex-core --test all apply_patch_multi_environment_uses_remote_executor -- --nocapture` on dev via `scripts/test-remote-env.sh` --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-11 16:33:44 -07:00
alexsong-oai	bb6134c028	Stop uploading accepted line fingerprints (#22180 ) ## Summary - keep accepted-line diff parsing and fingerprint hashing logic locally - stop uploading path/line hash fingerprints in the accepted-line analytics event payload - keep aggregate accepted added/deleted line counts in the event ## Testing - just fmt - cargo test -p codex-analytics - just fix -p codex-analytics	2026-05-11 15:41:38 -07:00
Owen Lin	4859d80ffe	Update codex remote-control to start the daemon (#22218 ) ## Why Update `codex remote-control` to use the new app server daemon commands instead. - if the updater loop is not running, bootstrap the daemon with remote control enabled (`codex app-server daemon bootstrap --remote-control`) - otherwise, enable the persisted remote-control setting and start the daemon normally	2026-05-11 15:38:30 -07:00
Abhinav	9ab7f4e6ac	Add Windows hook command overrides (#22159 ) # Why Managed hook configs need a shared cross-platform shape without making the existing `command` field polymorphic. The common case is still one command string, with Windows needing a different entrypoint only when the runtime is actually Windows. Keeping `command` as the portable/default path and adding an optional Windows override keeps the config easier to read, preserves the existing scalar shape for non-Windows users, and avoids forcing every caller into a `{ unix, windows }` object when only one platform needs special handling. # What - Add optional `command_windows` / `commandWindows` alongside the existing hook `command` field. - Resolve `command_windows` only on Windows during hook discovery; other platforms continue to use `command` unchanged. - Keep trust hashing aligned to the effective command selected for the current runtime. # Docs The Codex hooks/config reference should document `command_windows` as the Windows-only override for command hooks.	2026-05-11 22:22:29 +00:00
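The resolution rule from #22159 can be sketched as follows. This is a hedged Python illustration, not the Codex hook discovery code: `command` and `command_windows` are the field names from the commit message, while `effective_command`, `trust_hash`, the example script names, and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
import sys


def effective_command(hook, is_windows=None):
    """Pick the command for the current runtime from a hook config mapping."""
    if is_windows is None:
        is_windows = sys.platform == "win32"
    override = hook.get("command_windows")
    if is_windows and override:
        # The override applies only when the runtime is actually Windows.
        return override
    # Every other platform keeps using the portable default.
    return hook["command"]


def trust_hash(hook, is_windows=None):
    """Hash the effective command (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(effective_command(hook, is_windows).encode("utf-8")).hexdigest()


hook = {"command": "./check.sh", "command_windows": "check.cmd"}
```

Hashing the effective command rather than the whole config means the trust decision follows what would actually run on the current platform.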
rhan-oai	a175ddacc0	[codex-analytics] emit terminal review events (#18748 ) ## Why Review telemetry should describe reviews as first-class events, not only as counters denormalized onto terminal tool-item events. That lets us analyze guardian and user reviews consistently across command execution, file changes, permissions, and network access, while still preserving the terminal item summaries that existing tool analytics need. To make those review events accurate, analytics also needs the observed completion time for each review and enough command metadata to distinguish `shell` from `unified_exec` reviews. ## What changed - emit generic `codex_review_event` rows for completed user and guardian reviews, with review subjects, reviewer, trigger, terminal status, resolution, and observed duration - reduce approval request / response / abort facts into review events for command execution, file change, and permissions flows - keep denormalized review counts, final approval outcome, and permission-request flags on terminal tool-item events for item-associated reviews - plumb review completion timing so user-review responses and aborts use app-server-observed completion times, while guardian analytics reuse the same terminal timestamps emitted on guardian assessment events - carry command approval `source` through the protocol and app-server layers so review analytics can distinguish `shell` from `unified_exec` - add analytics coverage for user-review emission, guardian-review emission, permission reviews that should not denormalize onto tool items, item-summary isolation across threads, and the serialized review-event shape ## Verification - `cargo test -p codex-analytics` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18748). * __->__ #18748 * #21434 * #18747 * #17090 * #17089 * #20514	2026-05-11 22:13:32 +00:00
Ahmed Ibrahim	aa9e8f0262	[8/8] Add Python SDK Ruff formatting (#22021 ) ## Why The Python SDK needs the same tight formatter/lint loop as the rest of the repo: a safe Ruff autofix pass, Ruff formatting, editor save behavior, and CI checks that catch drift. Without that loop, SDK changes can land with formatting or import ordering that differs from what reviewers and CI expect. ## What - Add Ruff configuration to `sdk/python/pyproject.toml`, excluding generated protocol code and notebooks from the normal lint/format pass. - Update `just fmt` so it still formats Rust and also runs Python SDK Ruff autofix and formatting. - Add Python SDK CI steps for `ruff check` and `ruff format --check` before pytest. - Recommend the Ruff VS Code extension and enable Python format/fix/organize-on-save so Cmd+S uses the same tooling. - Apply the resulting Ruff formatting to SDK Python files, examples, and the checked-in generated `v2_all.py` output emitted by the pinned generator. - Add a guard test for the `just fmt` recipe so it keeps working from both Rust and Python SDK working directories. ## Stack 1. #21891 `[1/8]` Pin Python SDK runtime dependency 2. #21893 `[2/8]` Generate Python SDK types from pinned runtime 3. #21895 `[3/8]` Run Python SDK tests in CI 4. #21896 `[4/8]` Define Python SDK public API surface 5. #21905 `[5/8]` Rename Python SDK package to `openai-codex` 6. #21910 `[6/8]` Add high-level Python SDK approval mode 7. #22014 `[7/8]` Add Python SDK app-server integration harness 8. This PR `[8/8]` Add Python SDK Ruff formatting ## Verification - Added `test_root_fmt_recipe_formats_rust_and_python_sdk` for the shared format recipe. - Ran `just fmt` after the recipe update. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-12 01:10:29 +03:00
Ahmed Ibrahim	3e10e09e24	[7/8] Add Python SDK app-server integration harness (#22014 ) ## Why The SDK had behavioral tests that replaced SDK client internals. Those tests could catch wrapper mistakes, but they did not prove the pinned app-server runtime, generated notification models, request routing, and sync/async public clients worked together. This PR adds deterministic integration coverage that starts the pinned `codex app-server` process and mocks only the upstream Responses HTTP boundary. ## What - Add `AppServerHarness` and `MockResponsesServer` helpers for isolated `CODEX_HOME`, mock-provider config, queued SSE responses, and captured `/v1/responses` requests. - Add shared helpers for SSE construction, stream assertions, approval-policy inspection, and image fixtures. - Split integration coverage into focused modules for run behavior, inputs, streaming, turn controls, approvals, and thread lifecycle. - Cover sync and async `Thread.run`, `TurnHandle.stream`, interleaved streams, approval-mode persistence, lifecycle helpers, final-answer phase handling, image inputs, loaded skill input injection, steering, interruption, listing, history reads, run overrides, and token usage mapping. - Replace public-wrapper tests that duplicated integration-test behavior with lower-level client tests only where direct client behavior is the thing under test. ## Stack 1. #21891 `[1/8]` Pin Python SDK runtime dependency 2. #21893 `[2/8]` Generate Python SDK types from pinned runtime 3. #21895 `[3/8]` Run Python SDK tests in CI 4. #21896 `[4/8]` Define Python SDK public API surface 5. #21905 `[5/8]` Rename Python SDK package to `openai-codex` 6. #21910 `[6/8]` Add high-level Python SDK approval mode 7. This PR `[7/8]` Add Python SDK app-server integration harness 8. #22021 `[8/8]` Add Python SDK Ruff formatting ## Verification - Added pinned app-server integration tests under `sdk/python/tests/test_app_server_*.py` and `test_real_app_server_integration.py`. 
--------- Co-authored-by: Codex <noreply@openai.com>	2026-05-12 01:06:41 +03:00
Ahmed Ibrahim	2b90c37069	[6/8] Add high-level Python SDK approval mode (#21910 ) ## Why The high-level SDK should expose the approval behavior it actually supports instead of leaking generated app-server routing fields. New work should have two clear choices: default auto review, or explicitly deny escalated permission requests. Existing threads and subsequent turns should preserve their current approval behavior unless the caller passes an override. ## What - Add the public `ApprovalMode` enum with `auto_review` and `deny_all`. - Default new thread creation to `ApprovalMode.auto_review`. - Preserve existing approval settings by default for resume, fork, run, and turn helpers. - Remove raw `approval_policy` / `approvals_reviewer` kwargs from high-level SDK wrappers. - Update generated wrapper output, docs, examples, notebooks, and tests for the high-level approval mode API. ## Stack 1. #21891 `[1/8]` Pin Python SDK runtime dependency 2. #21893 `[2/8]` Generate Python SDK types from pinned runtime 3. #21895 `[3/8]` Run Python SDK tests in CI 4. #21896 `[4/8]` Define Python SDK public API surface 5. #21905 `[5/8]` Rename Python SDK package to `openai-codex` 6. This PR `[6/8]` Add high-level Python SDK approval mode 7. #22014 `[7/8]` Add Python SDK app-server integration harness 8. #22021 `[8/8]` Add Python SDK Ruff formatting ## Verification - Added approval-mode mapping/default tests for new threads, existing threads, forks, resumes, and subsequent turns. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-12 01:02:43 +03:00
Ahmed Ibrahim	f1b84fac63	[5/8] Rename Python SDK package to openai-codex (#21905 ) ## Why The SDK should publish under the reserved public distribution name `openai-codex`, and its import module should match that name in the Python style. Since package names can contain hyphens but import modules cannot, the public import path becomes `openai_codex`. Keeping the rename separate from the public API surface change makes the naming change easy to review and avoids mixing it with API curation. ## What - Rename the SDK distribution from `openai-codex-app-server-sdk` to `openai-codex`. - Rename the import package from `codex_app_server` to `openai_codex`. - Keep the runtime wheel as the separate `openai-codex-cli-bin` dependency. - Update docs, examples, notebooks, artifact scripts, lockfile metadata, and tests for the new distribution/module names. ## Stack 1. #21891 `[1/8]` Pin Python SDK runtime dependency 2. #21893 `[2/8]` Generate Python SDK types from pinned runtime 3. #21895 `[3/8]` Run Python SDK tests in CI 4. #21896 `[4/8]` Define Python SDK public API surface 5. This PR `[5/8]` Rename Python SDK package to `openai-codex` 6. #21910 `[6/8]` Add high-level Python SDK approval mode 7. #22014 `[7/8]` Add Python SDK app-server integration harness 8. #22021 `[8/8]` Add Python SDK Ruff formatting ## Verification - Updated package metadata and public API tests to assert the distribution and import names. Co-authored-by: Codex <noreply@openai.com>	2026-05-12 00:59:25 +03:00
Ahmed Ibrahim	b4bc02439f	[4/8] Define Python SDK public API surface (#21896 ) ## Why The SDK package root should be the ergonomic public client API, not a dump of every generated app-server schema type. Generated models still need a supported import path, but callers should be able to tell which names are high-level SDK entrypoints and which names are protocol value models. ## What - Define a curated root `__all__` for clients, handles, input helpers, retry helpers, config, and public errors. - Add a `types` module as the supported home for generated app-server response, event, enum, and helper models. - Update docs and examples to import protocol/value models from the type module. - Add tests that lock root exports, type-module exports, star-import behavior, and example import hygiene. ## Stack 1. #21891 `[1/8]` Pin Python SDK runtime dependency 2. #21893 `[2/8]` Generate Python SDK types from pinned runtime 3. #21895 `[3/8]` Run Python SDK tests in CI 4. This PR `[4/8]` Define Python SDK public API surface 5. #21905 `[5/8]` Rename Python SDK package to `openai-codex` 6. #21910 `[6/8]` Add high-level Python SDK approval mode 7. #22014 `[7/8]` Add Python SDK app-server integration harness 8. #22021 `[8/8]` Add Python SDK Ruff formatting ## Verification - Added public API signature tests for root exports, `types` exports, and example imports. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-12 00:57:44 +03:00
Ahmed Ibrahim	3e2936dd0e	[3/8] Run Python SDK tests in CI (#21895 ) ## Why The Python SDK stack now depends on packaging metadata, pinned runtime wheels, generated artifacts, async behavior, and stream interleaving. Those checks need to run in CI so future changes cannot bypass the SDK test suite. ## What - Add a dedicated `python-sdk` job to `.github/workflows/sdk.yml`. - Run the job in `python:3.12-alpine` so dependency resolution exercises the pinned musl runtime wheel. - Keep the Python SDK test job parallel to the existing SDK job instead of serializing the full workflow. ## Stack 1. #21891 `[1/8]` Pin Python SDK runtime dependency 2. #21893 `[2/8]` Generate Python SDK types from pinned runtime 3. This PR `[3/8]` Run Python SDK tests in CI 4. #21896 `[4/8]` Define Python SDK public API surface 5. #21905 `[5/8]` Rename Python SDK package to `openai-codex` 6. #21910 `[6/8]` Add high-level Python SDK approval mode 7. #22014 `[7/8]` Add Python SDK app-server integration harness 8. #22021 `[8/8]` Add Python SDK Ruff formatting ## Verification - The added workflow job installs the SDK with `uv sync --extra dev --frozen` and runs the Python SDK pytest suite. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-12 00:53:36 +03:00
Ahmed Ibrahim	6a4653efc8	[2/8] Generate Python SDK types from pinned runtime (#21893 ) ## Why Once the SDK declares its runtime package, generated Python artifacts should come from that pinned runtime rather than whatever app-server schema happens to be in the current checkout. That keeps the generated API and model surface aligned with the runtime users install. ## What - Teach `scripts/update_sdk_artifacts.py generate-types` to invoke the pinned runtime package for schema generation. - Regenerate `v2_all.py`, `notification_registry.py`, and generated public wrapper methods from that schema. - Add freshness coverage so regenerating from the pinned runtime must leave checked-in artifacts unchanged. ## Stack 1. #21891 `[1/8]` Pin Python SDK runtime dependency 2. This PR `[2/8]` Generate Python SDK types from pinned runtime 3. #21895 `[3/8]` Run Python SDK tests in CI 4. #21896 `[4/8]` Define Python SDK public API surface 5. #21905 `[5/8]` Rename Python SDK package to `openai-codex` 6. #21910 `[6/8]` Add high-level Python SDK approval mode 7. #22014 `[7/8]` Add Python SDK app-server integration harness 8. #22021 `[8/8]` Add Python SDK Ruff formatting ## Verification - Added `test_generated_files_are_up_to_date` for pinned-runtime generation drift. - Added generator-structure tests for schema annotation and notification metadata generation. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-12 00:53:21 +03:00
Ahmed Ibrahim	5fe33443b0	[1/8] Pin Python SDK runtime dependency (#21891 ) ## Why The Python SDK depends on the app-server runtime package for the bundled `codex` binary and schema source of truth. That relationship should be explicit in package metadata instead of inferred from matching version numbers, so installers, lockfiles, and reviewers can see exactly which runtime the SDK expects. ## What - Declare `openai-codex-cli-bin==0.131.0a4` as a Python SDK dependency. - Update runtime setup helpers to resolve the runtime version from the declared dependency pin. - Refresh the SDK lockfile for the pinned runtime wheel. - Update package/runtime tests and docs that describe where the runtime version comes from. ## Stack 1. This PR `[1/8]` Pin Python SDK runtime dependency 2. #21893 `[2/8]` Generate Python SDK types from pinned runtime 3. #21895 `[3/8]` Run Python SDK tests in CI 4. #21896 `[4/8]` Define Python SDK public API surface 5. #21905 `[5/8]` Rename Python SDK package to `openai-codex` 6. #21910 `[6/8]` Add high-level Python SDK approval mode 7. #22014 `[7/8]` Add Python SDK app-server integration harness 8. #22021 `[8/8]` Add Python SDK Ruff formatting ## Verification - Added coverage for the SDK runtime dependency pin and runtime distribution naming. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-12 00:42:26 +03:00
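As a rough illustration of "resolve the runtime version from the declared dependency pin", the sketch below extracts the version from an exact `==` pin in a list of requirement strings. The helper name `pinned_version` and the simplified regular expression are assumptions; the SDK's actual setup helpers may parse package metadata differently.

```python
import re
from typing import Iterable

# Matches only exact pins such as "name==1.2.3"; ranges like ">=" are ignored.
PIN_PATTERN = re.compile(r"^\s*(?P<name>[A-Za-z0-9._-]+)\s*==\s*(?P<version>[^\s;]+)")


def pinned_version(dependencies: Iterable[str], package: str) -> str:
    """Return the exactly pinned version of `package` (hypothetical helper)."""
    for spec in dependencies:
        match = PIN_PATTERN.match(spec)
        if match and match.group("name").lower() == package.lower():
            return match.group("version")
    raise LookupError(f"no exact pin found for {package!r}")
```

For example, `pinned_version(["openai-codex-cli-bin==0.131.0a4"], "openai-codex-cli-bin")` yields the version string from the pin quoted in the commit message.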
viyatb-oai	c7b55cdc46	feat: add network proxy feature flag (#20147 ) ## Why The permissions migration is making `permissions.<profile>.network.enabled` the canonical sandbox network bit, while proxy startup is a separate concern. Enabling network access should not implicitly start the proxy, and users who are still on legacy sandbox modes need a separate place to opt into proxy startup and provide proxy-specific settings. This follow-up to #19900 gives the network proxy its own feature surface instead of overloading permission-profile network semantics. ## What changed - Add an experimental `network_proxy` feature with a configurable `[features.network_proxy]` table. - Overlay `features.network_proxy` settings onto the configured proxy state after permission-profile selection, so the proxy only starts when the active `NetworkSandboxPolicy` already allows network access. - Preserve `[experimental_network]` startup behavior independently of the new feature flag. ## Behavior and examples There are now three related knobs: - `permissions.<profile>.network.enabled` controls whether the active permission profile has network access at all. - `features.network_proxy` enables proxy restrictions for an already-network-enabled profile. - Legacy `sandbox_mode` plus `[sandbox_workspace_write].network_access` still control whether legacy `workspace-write` has network access at all. 
The rule is:

- network off + proxy flag on -> network stays off, proxy is a no-op
- network on + proxy flag off -> unrestricted direct network
- network on + proxy flag on -> network stays on, with proxy restrictions applied

For permission profiles, the feature toggle adds proxy restrictions only when network access is already enabled:

```toml
default_permissions = "workspace"

[permissions.workspace.filesystem]
":minimal" = "read"

[permissions.workspace.network]
enabled = true

[features]
network_proxy = true
```

If `network.enabled = false`, the same feature flag is a no-op: network remains off and the proxy does not start.

For legacy sandbox config, `network_access` remains the master switch:

```toml
sandbox_mode = "workspace-write"

[sandbox_workspace_write]
network_access = true

[features]
network_proxy = true
```

That keeps legacy `workspace-write` network access on, but routes it through the proxy policy. If `network_access = false`, the proxy feature is a no-op and legacy `workspace-write` remains offline.
The same proxy opt-in can be supplied from the CLI:

```bash
codex -c 'features.network_proxy=true'
```

Additional proxy settings can be supplied when a table is needed:

```bash
codex \
  -c 'features.network_proxy.enabled=true' \
  -c 'features.network_proxy.enable_socks5=false'
```

The intended behavior matrix is:

| Config surface | Network setting | `features.network_proxy` | Direct sandbox network | Proxy |
| --- | --- | --- | --- | --- |
| Permission profile | `network.enabled = false` | off | restricted | off |
| Permission profile | `network.enabled = false` | on | restricted | off |
| Permission profile | `network.enabled = true` | off | enabled | off |
| Permission profile | `network.enabled = true` | on | enabled | on |
| Legacy `workspace-write` | `network_access = false` | off | restricted | off |
| Legacy `workspace-write` | `network_access = false` | on | restricted | off |
| Legacy `workspace-write` | `network_access = true` | off | enabled | off |
| Legacy `workspace-write` | `network_access = true` | on | enabled | on |

`[experimental_network]` requirements remain separate from the user feature toggle and still start the proxy on their own.

Relevant code:

- [`features/src/feature_configs.rs`](https://github.com/openai/codex/blob/43785aff47/codex-rs/features/src/feature_configs.rs#L58-L117) defines the feature-specific proxy config.
- [`core/src/config/mod.rs`](https://github.com/openai/codex/blob/43785aff47/codex-rs/core/src/config/mod.rs#L1959-L1964) reads the feature table, and [later applies it only when network access is already enabled](https://github.com/openai/codex/blob/43785aff47/codex-rs/core/src/config/mod.rs#L2448-L2458).
## Verification Added focused coverage for: - keeping the proxy off when `features.network_proxy` is enabled but sandbox network access is disabled - the full permission-profile and legacy `workspace-write` matrix above - preserving `[experimental_network]` startup without the feature - reusing profile-supplied proxy settings when the feature is enabled Ran: - `cargo test -p codex-features` - `cargo test -p codex-core network_proxy_feature` - `cargo test -p codex-core experimental_network_requirements_enable_proxy_without_feature`	2026-05-11 14:12:00 -07:00
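The behavior matrix in this commit message reduces to a two-input rule, which the following Python sketch encodes. `NetworkState` and `effective_network` are hypothetical names; the sketch deliberately ignores `[experimental_network]`, which the message says starts the proxy independently of the feature flag.

```python
from typing import NamedTuple


class NetworkState(NamedTuple):
    direct_network: str  # "enabled" or "restricted", as in the matrix
    proxy: bool          # whether the proxy is started


def effective_network(network_enabled: bool, proxy_feature: bool) -> NetworkState:
    """Combine the network setting with the `features.network_proxy` flag.

    `network_enabled` stands for either `network.enabled` (permission
    profile) or `network_access` (legacy workspace-write); both rows of
    the matrix behave the same way.
    """
    if not network_enabled:
        # The proxy flag is a no-op when the sandbox has no network access.
        return NetworkState("restricted", False)
    return NetworkState("enabled", proxy_feature)
```

The network setting acts as the master switch in this model: the proxy flag can only add restrictions to a network that is already on, never turn the network on.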
cooper-oai	54ec99cb54	[login] revoke superseded auth tokens on relogin (#21747 ) ## Summary - revoke previously stored managed ChatGPT tokens after a successful re-login - keep the new login successful even when revocation is unavailable or fails - cover the shared persistence path used by browser and device-code login flows ## Why A new `codex login` currently overwrites existing managed ChatGPT credentials without attempting to revoke the superseded tokens, leaving old credentials valid longer than necessary. ## Validation - `just fmt` - `CARGO_HOME=/tmp/cargo-home cargo test -p codex-login` ## Notes - Initial local Cargo validation hit a corrupt existing crate cache in the default `CARGO_HOME`; rerunning with a clean temporary `CARGO_HOME` passed. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-11 13:36:46 -07:00
Ruslan Nigmatullin	e3f481da98	daemon: refresh updater after validated binary rollout (#21853 ) ## Why `bootstrap` starts a detached pid-backed updater loop, but before this change that updater could keep running an old executable image even after `install.sh` replaced the managed standalone binary under `CODEX_HOME`. That left the updater itself behind the binary it had just rolled out, especially when the app-server was stopped or when the managed binary changed without a version-string change. ## What changed - Track updater identity from the executable contents rather than only the reported CLI version. - Force the managed app-server restart path when the managed binary contents differ from the running updater image, then re-exec the updater from the managed binary once the rollout is in a safe state. - Distinguish a genuinely absent managed app-server from a managed process that exists but is not yet probeable, so self-refresh does not skip a required restart. - Keep the restart/re-exec decision under the daemon operation lock so `bootstrap` cannot race the handoff. - Update `app-server-daemon/README.md` to document the resulting standalone and out-of-band update behavior. ## Verification - `cargo test -p codex-app-server-daemon` - `just fix -p codex-app-server-daemon` Added focused unit coverage for: - content-based updater refresh decisions - safe updater re-exec outcomes across restart states	2026-05-11 12:37:10 -07:00
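To illustrate why content-based identity catches a rollout that a version-string comparison would miss, here is a minimal Python sketch that compares two executables by digest. The use of SHA-256 and the names `file_digest` and `updater_needs_refresh` are assumptions; the commit message only states that identity is tracked from the executable contents.

```python
import hashlib
from pathlib import Path
from typing import Union


def file_digest(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Hash a file's bytes in chunks (SHA-256 chosen for illustration)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def updater_needs_refresh(running_image: Union[str, Path], managed_binary: Union[str, Path]) -> bool:
    """True when the managed binary's contents differ from the running image."""
    return file_digest(running_image) != file_digest(managed_binary)
```

Two builds that report the same CLI version but differ in bytes would compare as different here, which is the case the commit calls out.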
Felipe Coury	99b98aece6	config: accept `minus` in TUI keymap config (#22192 ) ## Summary Fixes #22128. The `/keymap` flow already persists the `-` key as `minus`, and the runtime keymap parser already accepts that spelling. `codex-config` was the missing leg: it rejected `minus` during config deserialization, so a binding saved by Codex could fail on the next startup or config reload. ## What Changed - Accept `minus` as a valid canonical key name in `tui.keymap` config normalization. - Update the config validation message so its supported-key list includes `minus`. - Add regression coverage that deserializes both `minus` and `alt-minus` under `[tui.keymap.global]` and verifies the normalized config shape. ## How to Test 1. Start Codex TUI. 2. Run `/keymap`. 3. Assign the `-` key to an action and save the change. 4. Restart Codex or reload the config. 5. Confirm the config loads normally and the saved binding remains usable instead of failing on `minus`. 6. As a focused regression check, repeat with a modifier form such as `alt--` captured through `/keymap`, which persists as `alt-minus` and should also reload successfully. Targeted tests: - `cargo test -p codex-config`	2026-05-11 16:34:33 -03:00
Matthew Zeng	192481d1a1	[elicitation] Advertise new url elicitation capability when auth_elicitation is enabled. (#22188) ## Why We've added support for auth elicitation behind the `auth_elicitation` flag, but servers need to explicitly check the capability before they decide to send elicitations, so that they remain backward compatible. This PR adds capability advertising conditioned on the flag. ## What changed - Build `client_elicitation_capability` from the `AuthElicitation` feature state. - Thread that capability through MCP config, session startup, and `McpConnectionManager` so RMCP initialization advertises the correct elicitation support. - Advertise both `form` and `url` elicitation when the feature is enabled, and preserve the empty default capability when it is disabled. - Add coverage for the feature-derived config shape and the advertised initialization payload. ## Testing - `cargo test -p codex-mcp` - `cargo test -p codex-core to_mcp_config_preserves_auth_elicitation_feature_from_config` - `cargo test -p codex-core` (currently fails outside this change in `tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed` with a stack overflow after unrelated tests have started running)	2026-05-11 12:23:55 -07:00
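A minimal Python sketch of the flag-to-capability mapping described above. The dict shape (`{"form": {}, "url": {}}`) is an assumption about how the capability might look on the wire; the commit message only states that both `form` and `url` are advertised when the feature is on and that the empty default is preserved when it is off.

```python
from typing import Any, Dict


def client_elicitation_capability(auth_elicitation_enabled: bool) -> Dict[str, Any]:
    """Derive the advertised elicitation capability from the feature state."""
    if not auth_elicitation_enabled:
        # Empty default capability: servers see no `url` support.
        return {}
    return {"form": {}, "url": {}}


def supports_url_elicitation(capability: Dict[str, Any]) -> bool:
    """Server-side check (hypothetical) before sending a URL elicitation."""
    return "url" in capability
```

The second helper shows the backward-compatibility point: a server that checks for `url` before sending will not send URL elicitations to a client that never advertised them.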
viyatb-oai	d0fa2d81d8	feat(connectors): support managed app tool approval requirements (#21061 ) ## Why Managed requirements can already centrally disable apps, but they could not express the per-tool app approval rules that normal config already supports. That left admins without a way to enforce connector tool approvals through `/etc/codex/requirements.toml` or cloud requirements. ## What changed - Extend app requirements with per-tool `approval_mode` entries. - Merge managed app tool requirements across managed sources while preserving higher-precedence exact tool settings. - Apply managed tool approvals separately from user app config so managed policy is matched only on raw MCP `tool.name`, while user config keeps the existing raw-name-then-title convenience fallback. - Add coverage for local requirements, cloud requirements parsing, managed-over-user precedence, and a title-collision case that must not widen managed auto-approval. ## Configuration shape Local `/etc/codex/requirements.toml` and cloud requirements use the same TOML shape: ```toml [apps.connector_123123.tools."calendar/list_events"] approval_mode = "approve" ``` This is a per-tool approval rule keyed by app ID and raw MCP tool name, not an app-level boolean such as `apps.connector_123123.approve = true`.	2026-05-11 19:08:26 +00:00
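The difference between managed and user matching can be sketched in Python as follows. `Tool`, `managed_approval`, and `user_approval` are hypothetical names, and the rules are modeled as plain dicts keyed by tool name; the point is only that managed policy never falls back to the tool title.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Tool:
    name: str                    # raw MCP `tool.name`
    title: Optional[str] = None  # display title, may collide with another tool's name


def managed_approval(rules: Mapping[str, str], tool: Tool) -> Optional[str]:
    """Managed policy: match on the raw tool name only."""
    return rules.get(tool.name)


def user_approval(rules: Mapping[str, str], tool: Tool) -> Optional[str]:
    """User config: raw name first, then the title as a convenience fallback."""
    if tool.name in rules:
        return rules[tool.name]
    if tool.title is not None:
        return rules.get(tool.title)
    return None
```

With a rule for `calendar/list_events`, a different tool whose title happens to be `calendar/list_events` gets no managed approval in this sketch, which is the title-collision case the commit says must not widen managed auto-approval.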
viyatb-oai	6506765168	fix(permissions): preserve managed deny-read during escalation (#15977 ) ## Why Managed filesystem `deny_read` requirements are administrator-enforced restrictions on specific paths. Once those requirements are active, Codex should not drop them just because an execution path would otherwise leave the sandbox. Before this change, an explicit escalation, a prefix-rule allow, a sandbox-denial retry, or an app-server legacy sandbox override could rebuild the runtime policy without those managed read-deny entries and expose a path the administrator had marked unreadable. This is narrower than general sandbox-mode constraints. If an enterprise only sets `allowed_sandbox_modes`, a trusted `prefix_rule(..., decision = "allow")` can still run its matching command unsandboxed; this PR only preserves managed filesystem `deny_read` restrictions across those paths. ## What Changed - Mark filesystem policies built from managed `deny_read` requirements so callers can tell when those deny entries must survive escalation. - Preserve managed deny-read entries when runtime permission profiles are rebuilt through protocol, app-server, or legacy sandbox-policy compatibility paths. - Keep managed deny-read attempts inside the selected sandbox on the first attempt and after sandbox-denial retries. - Preserve the same behavior in the zsh-fork escalation path, including prefix-rule-driven escalation. - Add a regression test showing the opposite case too: without managed deny-read, a prefix-rule allow still chooses unsandboxed execution. 
## Verification

Targeted automated verification:

```shell
cargo test -p codex-core shell_request_escalation_execution_is_explicit -- --nocapture
cargo test -p codex-core prefix_rule_uses_unsandboxed_execution_without_managed_deny_read -- --nocapture
cargo test -p codex-core prefix_rule_preserves_managed_deny_read_escalation -- --nocapture
cargo test -p codex-protocol permission_profile_round_trip_preserves_filesystem_policy_metadata -- --nocapture
cargo test -p codex-protocol preserving_deny_entries_keeps_unrestricted_policy_enforceable -- --nocapture
cargo test -p codex-app-server-protocol permission_profile_file_system_permissions_preserves_policy_metadata -- --nocapture
cargo check -p codex-app-server -p codex-tui
```

Smoke-test invocations:

```shell
# macOS exact deny + allowed control
codex exec --skip-git-repo-check -C "$ROOT" \
  -c 'default_permissions="deny_read_smoke"' \
  -c 'permissions.deny_read_smoke.filesystem={":minimal"="read",":project_roots"={"."="write","secrets"="none","future-secret"="none","*/.env"="none"}}' \
  'Run shell commands only. Print the contents of allowed.txt. Then test whether reading secrets/exact-secret.txt succeeds without printing that file if it does. End with exactly two lines: allowed=<contents> and exact_secret=<BLOCKED or READABLE>.'

# Linux exact deny + allowed control
codex exec --skip-git-repo-check -C "$ROOT" \
  -c 'default_permissions="deny_read_smoke"' \
  -c 'permissions.deny_read_smoke.filesystem={":minimal"="read",glob_scan_max_depth=3,":project_roots"={"."="write","secrets"="none","future-secret"="none","*/.env"="none"}}' \
  'Run shell commands only. Print the contents of allowed.txt. Then test whether reading secrets/exact-secret.txt succeeds without printing that file if it does. End with exactly two lines: allowed=<contents> and exact_secret=<BLOCKED or READABLE>.'
```

Observed manual smoke matrix:

| Case | macOS Seatbelt | Linux bubblewrap |
| --- | --- | --- |
| `cat allowed.txt` | Pass | Pass |
| `cat secrets/exact-secret.txt` | Blocked | Blocked |
| `cat envs/root.env` | Blocked | Blocked |
| `cat envs/nested/one.env` | Blocked | Blocked |
| `cat envs/nested/two.env` | Blocked | Blocked |
| `cat alias-to-secrets/exact-secret.txt` | Blocked | Blocked |
| Missing denied path | A file created after sandbox setup remained unreadable | Creation was blocked by the reserved missing-path placeholder, and the placeholder was cleaned up after exit |
| Real `codex exec` shell turn | Pass | Pass |

Notes:

- The Linux smoke run used the fallback glob walker because the devbox did not have `rg` installed.
- The smoke matrix verifies the end-to-end filesystem behavior on macOS and Linux; the escalation-specific behavior is covered by the focused tests above.

--------- Co-authored-by: Codex <noreply@openai.com> Co-authored-by: Charlie Marsh <charliemarsh@openai.com>	2026-05-11 11:49:44 -07:00
Owen Lin	7bddb3083d	fix(app-server): thread history redaction for remote clients (#22178 ) ## Summary Remote clients can still receive large `thread/resume` histories when prior turns include MCP tool call payloads or image-generation results. This adds a temporary response-only redaction path for the known remote client names. Longer term we will move towards fully paginated APIs backed by SQLite. ## Changes - Redact MCP tool call payload-bearing fields in `thread/resume` responses for `codex_chatgpt_android_remote` and `codex_chatgpt_ios_remote`. - Drop `imageGeneration` items from those `thread/resume` responses. - Keep redaction out of persisted rollout files, `thread/read`, `thread/turns/list`, live notifications, and token usage replay. - Cover the behavior with app-server helper tests and a v2 resume integration test that checks both remote clients plus a non-target control client. ## Testing - `cargo test -p codex-app-server thread_resume_redaction` - `cargo test -p codex-app-server thread_resume_redacts_payloads_for_chatgpt_remote_clients`	2026-05-11 11:45:25 -07:00
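The response-only redaction described here can be sketched in Python as a pure function over the items of a `thread/resume` response. The client names and the `imageGeneration` item type come from the commit message; the `mcpToolCall` type tag, the `arguments`/`result` field names, and the function name are hypothetical placeholders for whatever the app-server actually uses.

```python
from typing import Any, Dict, Iterable, List

REDACTED_CLIENTS = frozenset({"codex_chatgpt_android_remote", "codex_chatgpt_ios_remote"})
PAYLOAD_FIELDS = ("arguments", "result")  # hypothetical payload-bearing fields


def redact_resume_items(client_name: str, items: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return the items to send in a resume response for `client_name`.

    Inputs are never mutated, so persisted history stays intact.
    """
    if client_name not in REDACTED_CLIENTS:
        return [dict(item) for item in items]
    redacted = []
    for item in items:
        if item.get("type") == "imageGeneration":
            continue  # dropped entirely for the remote clients
        if item.get("type") == "mcpToolCall":
            item = {k: (None if k in PAYLOAD_FIELDS else v) for k, v in item.items()}
        redacted.append(dict(item))
    return redacted
```

Because the function builds new dicts instead of editing its input, the same stored history can still be served unredacted through other paths, matching the commit's note that rollout files and other endpoints are unaffected.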
Felipe Coury	90bd445e7f	fix(exec-server): suppress Windows taskkill output (#22058 ) ## Summary This is the `exec-server` follow-up to #21759. #21759 fixed the Windows `taskkill` output leak for the `rmcp-client` MCP teardown path, but #22050 showed that `exec-server` still had a parallel `taskkill /T /F` cleanup path in `exec-server/src/connection.rs`. Because that command inherited the parent stdio handles, Windows could still print `SUCCESS:` lines into the user's terminal during stdio child cleanup. This change silences that remaining `exec-server` callsite by redirecting `taskkill` stdin, stdout, and stderr to `Stdio::null()`. ## What Changed - add a Windows-only `Stdio` import in `exec-server/src/connection.rs` - redirect the `taskkill` command in `kill_windows_process_tree` to `Stdio::null()` for stdin, stdout, and stderr - keep the existing kill semantics unchanged by still checking `.status()` and preserving the existing fallback/logging behavior ## How to Test Manual validation is Windows-only, so I did not run the UI repro path locally here. 1. On Windows, use a Codex build from this branch. 2. Exercise an `exec-server` stdio flow that spawns a child process tree and then triggers transport cleanup. 3. Confirm the child process tree is still torn down. 4. Confirm the terminal no longer shows `SUCCESS: The process with PID ... has been terminated.` lines during cleanup. Targeted tests: - `cargo test -p codex-exec-server client::tests::dropping_stdio_client_terminates_spawned_process -- --exact` - `cargo test -p codex-exec-server client::tests::malformed_stdio_message_terminates_spawned_process -- --exact` Notes: - `cargo test -p codex-exec-server` still hits unrelated local macOS `sandbox-exec: sandbox_apply: Operation not permitted` failures in `tests/file_system.rs`. ## References - Fixes the remaining callsite discussed in #22050 - Related earlier fix: #21759	2026-05-11 15:40:56 -03:00
Dylan Hurd	e783dab44c	fix(exec-policy) use is_known_safe_command less (#20305) ## Summary Restricts `is_known_safe_command` to the modes where it is explicitly part of the documented behavior: - when `environment_lacks_sandbox_protections` - in `AskForApproval::UnlessTrusted` Notably, as a result, escalations for commands that pass `is_known_safe_command` are no longer auto-approved in `AskForApproval::OnRequest` or `AskForApproval::Granular`. ## Testing - [x] Updated unit tests - [x] Updated approvals scenario tests. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-11 11:37:53 -07:00
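The narrowed rule can be summarized as a small predicate. This Python sketch mirrors the two conditions listed in the commit message; the enum and function names are illustrative stand-ins for the Rust types, not their actual definitions.

```python
from enum import Enum


class AskForApproval(Enum):
    UNLESS_TRUSTED = "unless-trusted"
    ON_REQUEST = "on-request"
    GRANULAR = "granular"


def known_safe_shortcut_applies(
    policy: AskForApproval, environment_lacks_sandbox_protections: bool
) -> bool:
    """Whether a known-safe command check may influence approval at all."""
    return (
        environment_lacks_sandbox_protections
        or policy is AskForApproval.UNLESS_TRUSTED
    )
```

In this model, `ON_REQUEST` and `GRANULAR` with a working sandbox return `False`, so an escalation request goes through the normal approval flow even when the command itself would be classified as known safe.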
canvrno-oai	eaf05c9002	Unified mentions in TUI (#19068 ) This PR replaces the TUI’s file-only `@mention` popup with a unified mentions experience. Typing `@...` now searches across filesystem matches, installed plugins, and skills in one popup, with result types clearly labeled and selectable from the same flow. - Adds a unified `@mentions` popup that returns: - plugins - skills - files - directories - Adds search modes so users can narrow the popup without changing their query: - All Results _(default/same as Codex App)_ - Filesystem Only - Plugins _(...and skills)_ - Preserves existing insertion behavior: - selected file paths are inserted into the prompt - paths with spaces are quoted - image file selections still attach as images when possible - selecting a plugin or skill inserts the corresponding `$name` - the composer records the canonical mention binding, such as `plugin://...` or the skill path - Expanded `@mentions` rendering: - type tags for Plugin, Skill, File, and Dir - distinct plugin/filesystem colors - stable fixed-height layout (8 rows) - truncation behavior for narrow terminals Note: - The unified mentions popup does not display app connectors under `@mention` results for Codex App parity. Connector mentions remain available through the existing `$mention` path. https://github.com/user-attachments/assets/f93781ed-57d3-4cb5-9972-675bc5f3ef3f	2026-05-11 11:34:52 -07:00
jif-oai	b401666ca5	Add process-scoped SQLite telemetry (#22154 ) ## Summary - add SQLite init, backfill-gate, and fallback telemetry without introducing a cross-cutting state-db access wrapper - install one process-scoped telemetry sink after OTEL startup and let low-level state/rollout paths emit through it directly - add process-start metrics for the process owners that initialize SQLite --------- Co-authored-by: Owen Lin <owen@openai.com>	2026-05-11 11:32:40 -07:00
rhan-oai	cf6342b75b	[codex-analytics] add turn tool counts to turn events (#21431 ) ## Summary - accumulate completed tool-item counts per turn from the item lifecycle - populate the reserved count fields on `codex_turn_event` - add reducer coverage for zero-count turns and mixed completed tool items ## Why PR #17090 moved tool-item analytics onto the item lifecycle, so the turn reducer can now derive the per-turn tool counts from the same completed items instead of leaving the reserved fields null. ## Validation - `just fmt` - `cargo test -p codex-analytics`	2026-05-11 18:18:02 +00:00
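A rough Python sketch of deriving per-turn tool counts from completed items. The item shape (`kind`, `status`) and the `TOOL_KINDS` tuple are hypothetical; the commit message only says that completed tool items are accumulated per turn and that zero-count turns are covered.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Mapping

TOOL_KINDS = ("shell", "file_change", "mcp")  # hypothetical tool-item kinds


def tool_counts_for_turn(items: Iterable[Mapping[str, Any]]) -> Dict[str, int]:
    """Count completed tool items for one turn, with explicit zeros."""
    # Seed every kind with 0 so a turn without tools reports zeros, not nulls.
    counts = Counter({kind: 0 for kind in TOOL_KINDS})
    for item in items:
        if item.get("status") == "completed" and item.get("kind") in TOOL_KINDS:
            counts[item["kind"]] += 1
    return dict(counts)
```

Seeding the counter is the design choice worth noting: it makes "no tools ran" distinguishable from "counts were never computed", which is the gap the reserved null fields left open.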

6416 Commits