codex

mirror of https://github.com/openai/codex.git synced 2026-05-15 08:42:34 +00:00

Author	SHA1	Message	Date
Ahmed Ibrahim	9d579813bb	1- Add model service tiers metadata (#20969 ) ## Why The model list needs to carry display-ready service tier metadata so clients can render tier choices with stable IDs, names, and descriptions. A raw speed-tier string list is not enough for richer UI copy or future tier labels. ## What changed - Added `ModelServiceTier` to shared model metadata with string `id`, `name`, and `description` fields. - Added `service_tiers` to `ModelInfo` and `ModelPreset`, preserving empty defaults for older cached model payloads. - Exposed `serviceTiers` on app-server v2 `Model` responses and threaded it through TUI app-server model conversion. - Marked legacy `additional_speed_tiers` / `additionalSpeedTiers` metadata as deprecated in source and generated schema output. - Regenerated app-server protocol JSON schema and TypeScript fixtures, including `ModelServiceTier.ts`. ## Verification - Ran `just write-app-server-schema`. - Did not run local tests per repo instruction; relying on PR CI. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-05 09:51:18 +03:00
Abhinav	dca105cf99	Spill large hook outputs from context (#21069 ) ## Why Large hook outputs can enter model-visible context through hook-specific paths such as `additionalContext` and `Stop` continuation prompts. Without a dedicated cap, one hook can inject a large blob directly into conversation history instead of leaving a bounded preview for the model and preserving the full text elsewhere. ## What - spill hook text once it exceeds a fixed `2_500`-token budget, preserving the full output on disk and leaving a head/tail preview plus saved path in context - add shared hook-output spilling under `CODEX_HOME/hook_outputs/<thread_id>/<uuid>.txt` - apply the cap to both `additionalContext`, `feedback_message`, and `Stop` continuation fragments	2026-05-05 05:03:18 +00:00
rhan-oai	aee1fe2659	[codex-analytics] add item lifecycle timing (#20514 ) ## Why Tool families already disagree on what their existing `duration` fields mean, so lifecycle latency should live on the shared item envelope instead of being inferred from per-tool execution fields. Carrying that envelope through app-server notifications gives downstream consumers one reusable timing signal without pretending every tool has the same execution semantics. ## What changed - Adds `started_at_ms` to core `ItemStartedEvent` values and `completed_at_ms` to core `ItemCompletedEvent` values. - Populates those timestamps in the shared session lifecycle emitters, so protocol-native items get timing without each producer tracking its own clock state. - Exposes `startedAtMs` on app-server `item/started` notifications and `completedAtMs` on `item/completed` notifications. - Maps the lifecycle timestamps through the app-server boundary while leaving legacy-converted notifications nullable when no lifecycle timestamp exists. - Regenerates the app-server JSON schema and TypeScript fixtures for the notification-envelope change and updates downstream fixtures that construct those notifications directly. - Extends the existing web-search and image-generation integration flows to assert the new lifecycle timestamps on the native item events. ## Verification - `cargo check -p codex-protocol -p codex-core -p codex-app-server-protocol -p codex-app-server -p codex-tui -p codex-exec -p codex-app-server-client` - `cargo test -p codex-core --test all web_search_item_is_emitted` - `cargo test -p codex-core --test all image_generation_call_event_is_emitted` - `cargo test -p codex-app-server-protocol` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/20514). * #18748 * #18747 * #17090 * #17089 * __->__ #20514	2026-05-04 22:33:20 +00:00
kmeelu-oai	e7e6267ab3	Make realtime sideband startup async (#20715 ) ## Summary Moves the WebRTC realtime sideband websocket join out of the voice start critical path. Call creation still posts the SDP offer and session config synchronously so the client gets the SDP answer, but the sideband websocket now connects in the input task async and doesn't block conversation state installation. This lets the normal realtime input channels buffer text, handoff output, and audio while the WebRTC sideband websocket is connecting. If the sideband join fails while the conversation is still active, the task sends a RealtimeEvent::Error through the existing events_tx / fanout path. To rephrase this: * No longer blocked on sideband: the client can receive the SDP answer earlier, set up the WebRTC peer connection, and let the media leg progress while the sideband websocket joins. * Still blocked on sideband: queued text, handoff output, and sideband server events cannot flow until connect_webrtc_sideband(...).await finishes and then run_realtime_input_task(...) starts ## Validation - `env CODEX_SKIP_VENDORED_BWRAP=1 cargo test --manifest-path codex-rs/Cargo.toml -p codex-core --test all conversation_webrtc_start_posts_generated_session` `CODEX_SKIP_VENDORED_BWRAP=1` is needed in this local environment because `libcap.pc` is not installed for the vendored bubblewrap build. ## Testing I tested this locally by running `cargo run -p codex-cli --bin codex -- --enable realtime_conversation` and invoking `/realtime`. Then, we get logs emitted in `~/.codex/log/codex-tui.log`. ### Before the Change Logging commit (`c0299e6edf`) ``` 2026-05-04T16:06:09.251956Z INFO session_loop{thread_id=019df3b9-e3d8-7271-b13a-b880119aa4c2}:submission_dispatch{otel.name="op.dispatch.realtime_conversation_start" submission.id="019df3bd-65df-7ee2-8125-1d6701fe39d2" codex.op="realtime_conversation_start"}: codex_core::realtime_conversation: starting realtime conversation 2026-05-04T16:06:09.251980Z INFO session_loop{thread_id=019df3b9-e3d8-7271-b13a-b880119aa4c2}:submission_dispatch{otel.name="op.dispatch.realtime_conversation_start" submission.id="019df3bd-65df-7ee2-8125-1d6701fe39d2" codex.op="realtime_conversation_start"}: codex_core::realtime_conversation: creating realtime call transport="webrtc" 2026-05-04T16:06:10.365722Z INFO session_loop{thread_id=019df3b9-e3d8-7271-b13a-b880119aa4c2}:submission_dispatch{otel.name="op.dispatch.realtime_conversation_start" submission.id="019df3bd-65df-7ee2-8125-1d6701fe39d2" codex.op="realtime_conversation_start"}: codex_core::realtime_conversation: realtime call created; sdp answer ready transport="webrtc" call_id=rtc_u0_Dbq65nhak5eLjQZ73yhAy elapsed_ms=1113 total_elapsed_ms=1113 2026-05-04T16:06:10.365843Z INFO session_loop{thread_id=019df3b9-e3d8-7271-b13a-b880119aa4c2}:submission_dispatch{otel.name="op.dispatch.realtime_conversation_start" submission.id="019df3bd-65df-7ee2-8125-1d6701fe39d2" codex.op="realtime_conversation_start"}: codex_core::realtime_conversation: connecting realtime sideband websocket call_id=rtc_u0_Dbq65nhak5eLjQZ73yhAy 2026-05-04T16:06:10.784528Z INFO session_loop{thread_id=019df3b9-e3d8-7271-b13a-b880119aa4c2}:submission_dispatch{otel.name="op.dispatch.realtime_conversation_start" submission.id="019df3bd-65df-7ee2-8125-1d6701fe39d2" codex.op="realtime_conversation_start"}: codex_core::realtime_conversation: connected realtime sideband websocket call_id=rtc_u0_Dbq65nhak5eLjQZ73yhAy elapsed_ms=418 total_elapsed_ms=1532 2026-05-04T16:06:10.784665Z INFO session_loop{thread_id=019df3b9-e3d8-7271-b13a-b880119aa4c2}:submission_dispatch{otel.name="op.dispatch.realtime_conversation_start" submission.id="019df3bd-65df-7ee2-8125-1d6701fe39d2" codex.op="realtime_conversation_start"}: codex_core::realtime_conversation: realtime conversation started ``` ### After the Change Logging commit (`c8b00ac21a`) ``` 2026-05-04T15:41:24.080363Z INFO ... codex_core::realtime_conversation: starting realtime conversation 2026-05-04T15:41:24.080434Z INFO ... codex_core::realtime_conversation: creating realtime call transport="webrtc" 2026-05-04T15:41:25.106906Z INFO ... codex_core::realtime_conversation: realtime call created; sdp answer ready transport="webrtc" call_id=rtc_u0_Dbpi8nhak5eLjQZ73yhAy elapsed_ms=1026 total_elapsed_ms=1026 2026-05-04T15:41:25.107067Z INFO ... codex_core::realtime_conversation: spawned realtime sideband connection task transport="webrtc" total_elapsed_ms=1026 2026-05-04T15:41:25.107160Z INFO ... codex_core::realtime_conversation: realtime conversation started 2026-05-04T15:41:25.107185Z INFO codex_core::realtime_conversation: connecting realtime sideband websocket call_id=rtc_u0_Dbpi8nhak5eLjQZ73yhAy 2026-05-04T15:41:25.107352Z INFO ... codex_core::realtime_conversation: sent realtime sdp answer to client 2026-05-04T15:41:26.076685Z INFO codex_core::realtime_conversation: connected realtime sideband websocket call_id=rtc_u0_Dbpi8nhak5eLjQZ73yhAy elapsed_ms=969 total_elapsed_ms=1996 2026-05-04T15:41:26.573893Z INFO codex_core::realtime_conversation: realtime session updated realtime_session_id=sess_u0_Dbpi8nhak5eLjQZ73yhAy 2026-05-04T15:41:26.573970Z INFO codex_core::realtime_conversation: received realtime conversation event event=SessionUpdated { ... } ``` ### Conclusion Here we see that we saved about a half a second in conversation startup (1532ms -> 969ms). This also checks out with my sanity tests; I was seeing at most a second of saving. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-04 22:28:14 +00:00
charley-openai	a6599b8202	Add reasoning effort to turn tracing spans (#20060 ) Why #19432 added token usage to the turn and response spans. This follow-up adds the configured reasoning effort so performance traces can be filtered by model effort. [example trace](https://openai.datadoghq.com/apm/trace/1ff708a87159ff4898bdc8bd6091ec18?graphType=waterfall&shouldShowLegend=true&spanID=6596351544047485652&traceQuery=) <img width="533" height="434" alt="Screenshot 2026-04-28 at 3 52 12 PM" src="https://github.com/user-attachments/assets/77ef32fc-d7cd-4eec-87b4-26c6798f1af8" /> What Changed - Adds `codex.turn.reasoning_effort` to the turn span. - Adds `codex.request.reasoning_effort` to `handle_responses`. - Extends the span test to cover explicit `high` effort with token usage. Testing - `cargo test -p codex-core turn_and_completed_response_spans_record_token_usage` - `cargo test -p codex-otel` - `just fmt` - `just fix -p codex-core` - `just fix -p codex-otel`	2026-05-04 12:57:05 -07:00
Michael Bolin	229b40aa21	core: fix apply_patch request permissions test (#21060 ) ## Why The Bazel test coverage change exposed `approved_folder_write_request_permissions_unblocks_later_apply_patch`, and `rust-ci-full.yml` showed the same test failing on `main` on macOS. There were two separate classes of problems here. ### Clean CI failure The test emits an `apply_patch` tool call, but its config did not enable the `apply_patch` tool, so the mocked response completed without an `apply-patch-call` output. After enabling the tool, the same path also needs the aggregate `codex-core` test binary to dispatch `--codex-run-as-fs-helper`; sandboxed `apply_patch` uses that helper under macOS Seatbelt. The test now also canonicalizes the temporary patch target before building the patch payload so the path matches normalized grants on macOS, where `/var` paths often normalize to `/private/var`. ### Local/enterprise config isolation The core test harness now builds its default test config with managed config disabled, so host-managed enterprise config cannot alter these tests. The request-permissions turns in this test also explicitly use the user reviewer path, keeping the assertions focused on `request_permissions` behavior rather than reviewer defaults from the host. ## What Changed - Enable `apply_patch` in `approved_folder_write_request_permissions_unblocks_later_apply_patch`. - Teach the core integration test binary to dispatch `CODEX_FS_HELPER_ARG1`, matching the existing apply-patch and linux-sandbox dispatch paths. - Canonicalize the tempdir-backed patch target before creating the patch. - Ignore managed config in default core test configs and explicitly pin this test to `ApprovalsReviewer::User`. ## Verification Run outside the Codex app sandbox because these macOS tests intentionally spawn Seatbelt: - `cargo test -p codex-core approved_folder_write_request_permissions_unblocks_later_apply_patch` - `cargo test -p codex-core approved_folder_write_request_permissions_unblocks_later_exec_without_sandbox_args`	2026-05-04 12:48:59 -07:00
sayan-oai	b9e8df47da	Use MCP server instructions in deferred namespace descriptions (#21053 ) ## Why MCP servers can provide `instructions` that explain what their tools are for. Directly exposed MCP namespaces already use those instructions when a connector description is not available, but deferred `tool_search` results did not preserve that fallback. The direct path falls back from connector metadata to server instructions, while the deferred path only carried `connector_description` and otherwise fell back to generic namespace text. That meant a plain MCP server could provide useful model-facing guidance and still appear as `Tools in the X namespace.` whenever it was discovered lazily through `tool_search`. ## What changed - Store one model-facing `namespace_description` on `ToolInfo`, using connector descriptions for connector-backed tools and server instructions for plain MCP servers. - Thread that namespace description through the `tool_search` source list, search indexing, and returned namespace metadata. - Add an end-to-end regression test for deferred non-app MCP search results exposing server instructions as the namespace description. ## Verification - `cargo test -p codex-tools search_tool_description_lists_each_mcp_source_once --lib` - `cargo test -p codex-core --test all tool_search_uses_non_app_mcp_server_instructions_as_namespace_description`	2026-05-04 19:36:07 +00:00
Ruslan Nigmatullin	4d201e340e	state: pass state db handles through consumers (#20561 ) ## Why SQLite state was still being opened from consumer paths, including lazy `OnceCell`-backed thread-store call sites. That let one process construct multiple state DB connections for the same Codex home, which makes SQLite lock contention and `database is locked` failures much easier to hit. State DB lifetime should be chosen by main-like entrypoints and tests, then passed through explicitly. Consumers should use the supplied `Option<StateDbHandle>` or `StateDbHandle` and keep their existing filesystem fallback or error behavior when no handle is available. The startup path also needs to keep the rollout crate in charge of SQLite state initialization. Opening `codex_state::StateRuntime` directly bypasses rollout metadata backfill, so entrypoints should initialize through `codex_rollout::state_db` and receive a handle only after required rollout backfills have completed. ## What Changed - Initialize the state DB in main-like entrypoints for CLI, TUI, app-server, exec, MCP server, and the thread-manager sample. - Pass `Option<StateDbHandle>` through `ThreadManager`, `LocalThreadStore`, app-server processors, TUI app wiring, rollout listing/recording, personality migration, shell snapshot cleanup, session-name lookup, and memory/device-key consumers. - Remove the lazy local state DB wrapper from the thread store so non-test consumers use only the supplied handle or their existing fallback path. - Make `codex_rollout::state_db::init` the local state startup path: it opens/migrates SQLite, runs rollout metadata backfill when needed, waits for concurrent backfill workers up to a bounded timeout, verifies completion, and then returns the initialized handle. - Keep optional/non-owning SQLite helpers, such as remote TUI local reads, as open-only paths that do not run startup backfill. - Switch app-server startup from direct `codex_state::StateRuntime::init` to the rollout state initializer so app-server cannot skip rollout backfill. - Collapse split rollout lookup/list APIs so callers use the normal methods with an optional state handle instead of `_with_state_db` variants. - Restore `getConversationSummary(ThreadId)` to delegate through `ThreadStore::read_thread` instead of a LocalThreadStore-specific rollout path special case. - Keep DB-backed rollout path lookup keyed on the DB row and file existence, without imposing the filesystem filename convention on existing DB rows. - Verify readable DB-backed rollout paths against `session_meta.id` before returning them, so a stale SQLite row that points at another thread's JSONL falls back to filesystem search and read-repairs the DB row. - Keep `debug prompt-input` filesystem-only so a one-off debug command does not initialize or backfill SQLite state just to print prompt input. - Keep goal-session test Codex homes alive only in the goal-specific helper, rather than leaking tempdirs from the shared session test helper. - Update tests and call sites to pass explicit state handles where DB behavior is expected and explicit `None` where filesystem-only behavior is intended. ## Validation - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout -p codex-thread-store -p codex-app-server -p codex-core -p codex-tui -p codex-exec -p codex-cli --tests` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout state_db_` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout find_thread_path` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout find_thread_path -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout try_init_ -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-rollout` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo clippy -p codex-rollout --lib -- -D warnings` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-thread-store read_thread_falls_back_when_sqlite_path_points_to_another_thread -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-thread-store` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core shell_snapshot` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all personality_migration` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all rollout_list_find` - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all rollout_list_find::find_prefers_sqlite_path_by_id -- --nocapture` - `RUST_MIN_STACK=8388608 CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core --test all rollout_list_find -- --nocapture` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-core interrupt_accounts_active_goal_before_pausing` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-app-server get_auth_status -- --test-threads=1` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo test -p codex-app-server --lib` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db cargo check -p codex-rollout -p codex-app-server --tests` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p codex-thread-store -p codex-core -p codex-app-server -p codex-tui -p codex-exec -p codex-cli` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout -p codex-app-server` - `CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-rollout` - `CODEX_SKIP_VENDORED_BWRAP=1 CARGO_TARGET_DIR=/tmp/codex-target-state-db just fix -p codex-core` - `just argument-comment-lint -p codex-core` - `just argument-comment-lint -p codex-rollout` Focused coverage added in `codex-rollout`: - `recorder::tests::state_db_init_backfills_before_returning` verifies the rollout metadata row exists before startup init returns. - `state_db::tests::try_init_waits_for_concurrent_startup_backfill` verifies startup waits for another worker to finish backfill instead of disabling the handle for the process. - `state_db::tests::try_init_times_out_waiting_for_stuck_startup_backfill` verifies startup does not hang indefinitely on a stuck backfill lease. - `tests::find_thread_path_accepts_existing_state_db_path_without_canonical_filename` verifies DB-backed lookup accepts valid existing rollout paths even when the filename does not include the thread UUID. - `tests::find_thread_path_falls_back_when_db_path_points_to_another_thread` verifies DB-backed lookup ignores a stale row whose existing path belongs to another thread and read-repairs the row after filesystem fallback. Focused coverage updated in `codex-core`: - `rollout_list_find::find_prefers_sqlite_path_by_id` now uses a DB-preferred rollout file with matching `session_meta.id`, so it still verifies that valid SQLite paths win without depending on stale/empty rollout contents. `cargo test -p codex-app-server thread_list_respects_search_term_filter -- --test-threads=1 --nocapture` was attempted locally but timed out waiting for the app-server test harness `initialize` response before reaching the changed thread-list code path. `bazel test //codex-rs/thread-store:thread-store-unit-tests --test_output=errors` was attempted locally after the thread-store fix, but this container failed before target analysis while fetching `v8+` through BuildBuddy/direct GitHub. The equivalent local crate coverage, including `cargo test -p codex-thread-store`, passes. A plain local `cargo check -p codex-rollout -p codex-app-server --tests` also requires system `libcap.pc` for `codex-linux-sandbox`; the follow-up app-server check above used `CODEX_SKIP_VENDORED_BWRAP=1` in this container.	2026-05-04 11:46:03 -07:00
jif-oai	e3451ce6be	core: share responses request builder with compact requests (#20989 ) ## Why `ModelClientSession` and `compact_conversation_history()` were still rebuilding the same `ResponsesApiRequest` fields separately. That duplication makes it easy for normal `/responses` turns and compact requests to drift when request-shape changes land later, which is exactly the kind of cache-affecting divergence we want to avoid. This follow-up keeps the scope small by extracting the shared request-construction logic into one helper and using it from both paths. ## What changed - move `ResponsesApiRequest` construction into a shared `ModelClient::build_responses_request(...)` helper in `core/src/client.rs` - update the normal `/responses` streaming path to call that helper instead of the old `ModelClientSession`-local implementation - update `compact_conversation_history()` to derive its compact payload from the same helper so `model`, `instructions`, `input`, `tools`, `parallel_tool_calls`, `reasoning`, and `text` stay aligned with normal request building - add a unit test covering the shared helper's prompt cache key, installation metadata, and `service_tier` behavior ## Verification - `cargo test -p codex-core build_responses_request_sets_shared_cache_and_metadata_fields` - `cargo test -p codex-core --test all remote_compact_v2_reuses_context_compaction_for_followups` ## Docs No docs update needed.	2026-05-04 17:18:38 +00:00
jif-oai	d927f61208	feat: add remote compaction v2 Responses client path (#20773 ) ## Why This adds the `remote_compaction_v2` client path so remote compaction can run through the normal Responses stream and install a `context_compaction` item that trigger a compaction. The goal is to migrate some of the compaction logic on the client side We keeps the v2 transport behind a feature flag while letting follow-up requests reuse the compacted context instead of falling back to the legacy compaction item shape. ## What changed - add `ResponseItem::ContextCompaction` and refresh the generated app-server / schema / TypeScript fixtures that expose response items on the wire - add `core/src/compact_remote_v2.rs` to send compaction through the standard streamed Responses client, require exactly one `context_compaction` output item, and install that item into compacted history - route manual compact and auto-compaction through the v2 path when `remote_compaction_v2` is enabled, while keeping the existing remote compaction path as the fallback - preserve the new item type across history retention, follow-up request construction, telemetry, rollout persistence, and rollout-trace normalization - add targeted coverage for the feature flag, `context_compaction` serialization, rollout-trace normalization, and remote-compaction follow-up behavior ## Verification - added protocol tests for `context_compaction` serialization/deserialization in `protocol/src/models.rs` - added rollout-trace coverage for `context_compaction` normalization in `rollout-trace/src/reducer/conversation_tests.rs` - added remote compaction integration coverage for v2 follow-up reuse and mixed compaction output streams in `core/tests/suite/compact_remote.rs` --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-04 14:15:01 +02:00
Matthew Zeng	f88701f5c8	[tool_suggest] More prompt polishes. (#20566 ) Tool suggest still misfires when model needs tool_search, updating the prompts to further disambiguate it: - [x] rename it from `tool_suggest` to `request_plugin_install` - [x] rephrase "suggestion" to "install" in the tool descriptions. - [x] disambiguate "the tool" vs "the plugin/connector". Tested with the Codex App and verified it still works.	2026-05-02 04:22:12 +00:00
Channing Conger	a5fbcf1ab4	Prune unused code-mode globals (#20542 ) Hide Atomics, SharedArrayBuffer, and WebAssembly from the code-mode runtime since the harness does not expose worker support or need those APIs.	2026-05-01 15:11:22 -07:00
starr-openai	2952beb009	Surface multi-environment choices in environment context (#20646 ) ## Why The model needs a way to see which environments are available during a multi-environment turn without changing the legacy single-environment prompt surface or pulling replay/persistence changes into the same review. ## Stack 1. https://github.com/openai/codex/pull/20646 - `EnvironmentContext` rendering for selected environments (this PR) 2. https://github.com/openai/codex/pull/20669 - selected-environment ownership and tool config prep 3. https://github.com/openai/codex/pull/20647 - process-tool `environment_id` routing ## What Changed - extend `environment_context` so multi-environment turns render an `<environments>` block with the selected environment ids and cwd values - keep zero- and single-environment turns on the existing cwd-only render path - keep replay and persistence paths on the legacy surface for now so this PR stays scoped to live prompt rendering - add focused coverage in `codex-rs/core/src/context/environment_context_tests.rs` ## Testing - CI --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-01 22:11:06 +00:00
pakrym-oai	aed74e5ee4	[codex] Emit image view as core item (#20512 ) ## Why Image-view results should be represented as a core-produced turn item instead of being reconstructed by app-server. At the same time, existing rollout/history paths still understand the legacy `ViewImageToolCall` event, so this keeps that event as compatibility output generated from the new item lifecycle. ## What changed - Added `TurnItem::ImageView` to `codex-protocol`. - Emitted image-view item start/completion directly from the core `view_image` handler. - Kept `ViewImageToolCall` as a legacy event and generate it from completed `TurnItem::ImageView` items. - Kept `thread_history.rs` on the legacy `ViewImageToolCall` replay path, with `ImageView` item lifecycle events ignored there. - Updated app-server protocol conversion, rollout persistence, and affected exhaustive event matches for the new item plus legacy fan-out shape. ## Verification - `cargo test -p codex-protocol -p codex-app-server-protocol -p codex-rollout -p codex-rollout-trace -p codex-mcp-server -p codex-app-server --lib` - `cargo test -p codex-core --test all view_image_tool_attaches_local_image` - `just fix -p codex-protocol -p codex-core -p codex-app-server-protocol -p codex-app-server -p codex-rollout -p codex-rollout-trace -p codex-mcp-server` - `git diff --check`	2026-05-01 11:28:30 -07:00
Abhinav	78baa20780	deprecate legacy notify (#20524 ) # Why `notify` is the remaining compatibility surface from the legacy hook implementation. The newer lifecycle hook engine now owns the active hook system, so we should start steering users away from adding new `notify` configs before removing the old path entirely. This also adds a lightweight watchpoint for the deprecation so we can see how much legacy usage remains before the clean drop. # What - emit a startup deprecation notice when a non-empty `notify` command is configured - emit `codex.notify.configured` when a session starts with legacy `notify` configured - emit `codex.notify.run` when the legacy notify path fires after a completed turn - mark `notify` as deprecated in the config schema and repo docs - remove the orphaned `codex-rs/hooks/src/user_notification.rs` file that is no longer compiled - add regression coverage for the new deprecation notice # Next steps A follow-up PR can remove the legacy notify path entirely once we are ready for the clean drop. Before then, we can watch `codex.notify.configured` and `codex.notify.run` to understand the deprecation impact and remaining active usage. The cleanup PR should then delete the `notify` config field, the `legacy_notify` implementation, the old compatibility dispatch types and callsites that only exist for the legacy path, and the remaining compatibility docs/tests. # Testing - `cargo test -p codex-hooks` - `cargo test -p codex-config` - `cargo test -p codex-core emits_deprecation_notice_for_notify`	2026-05-01 17:35:21 +00:00
pakrym-oai	f476338f93	Move apply-patch file changes into turn items (#20540 ) ## Why Apply-patch file changes are now part of the core turn item stream, so v2 clients can consume the same first-class item lifecycle path used by other turn items instead of relying on app-server-specific remapping from legacy patch events. ## What changed - Added a core `TurnItem::FileChange` carrying apply-patch changes and completion metadata. - Updated the apply-patch tool emitter to send `ItemStarted` / `ItemCompleted` with the new `FileChange` item while preserving legacy `PatchApplyBegin` / `PatchApplyEnd` fan-out. - Updated app-server v2 conversion to render the new core item directly and stopped `event_mapping` from remapping old patch begin/end events into item notifications. - Kept thread history reconstruction based on the existing old apply-patch events for rollout compatibility. ## Verification - `cargo test -p codex-protocol -p codex-app-server-protocol` - `cargo test -p codex-core --test all apply_patch_tool_executes_and_emits_patch_events` - `cargo test -p codex-app-server bespoke_event_handling`	2026-05-01 08:47:18 -07:00
Tom	fe05acad23	Make thread store process-scoped (#19474 ) - Build one app-server process ThreadStore from startup config and share it with ThreadManager and CodexMessageProcessor. - Remove per-thread/fork store reconstruction so effective thread config cannot switch the persistence backend. - Add params to ThreadStore create/resume for specifying thread metadata, since otherwise the metadata from store creation would be used (incorrectly).	2026-04-30 21:24:59 -07:00
pakrym-oai	f50c02d7bc	[codex] Remove unused event messages (#20511 ) ## Why Several legacy `EventMsg` variants were still emitted or mapped even though clients either ignored them or had moved to item/lifecycle events. `Op::Undo` had also degraded to an unavailable shim, so this removes that dead task path instead of preserving a command that cannot do useful work. `McpStartupComplete`, `WebSearchBegin`, and `ImageGenerationBegin` are intentionally kept because useful consumers still depend on them: MCP startup completion drives readiness behavior, and the begin events let app-server/core consumers surface in-progress web-search and image-generation items before the final payload arrives. ## What Changed - Removed weak legacy event variants and payloads from `codex-protocol`, including legacy agent deltas, background events, and undo lifecycle events. - Kept/restored `EventMsg::McpStartupComplete`, `EventMsg::WebSearchBegin`, and `EventMsg::ImageGenerationBegin` with serializer and emission coverage. - Updated core, rollout, MCP server, app-server thread history, review/delegate filtering, and tests to rely on the useful replacement events that remain. - Removed `Op::Undo`, `UndoTask`, the undo test module, and stale TUI slash-command comments. - Stopped agent job/background progress and compaction retry notices from emitting `BackgroundEvent` payloads. ## Verification - `cargo check -p codex-protocol -p codex-app-server-protocol -p codex-core -p codex-rollout -p codex-rollout-trace -p codex-mcp-server` - `cargo test -p codex-protocol -p codex-app-server-protocol -p codex-rollout -p codex-rollout-trace -p codex-mcp-server` - `cargo test -p codex-core --test all suite::items` - `just fix -p codex-protocol -p codex-app-server-protocol -p codex-core -p codex-rollout -p codex-rollout-trace -p codex-mcp-server` - Earlier coverage on this PR also included `codex-mcp`, `codex-tui`, core library tests, MCP/plugin/delegate/review/agent job tests, and MCP startup TUI tests.	2026-04-30 20:03:26 -07:00
Dylan Hurd	af089fb21d	fix(exec_policy) heredoc parsing file_redirect (#20113 ) ## Summary Fixes a regression introduced in #10941 so that heredocs do not permit file redirects to be approved by rules, and adds scenario tests to cover this behavior. Previously, heredoc command parsing would allow redirects and environment variables: ```bash # commands_for_exec_policy() would parse this via parse_shell_lc_single_command_prefix PATH=/tmp/bad:$PATH cat <<'EOF' > /tmp/bad/hello.txt hello EOF ``` This conflicts with the Codex Rules documentation; heredoc parsing logic should abide by the same strictness of parsing. ## Tests - [x] Updated unit tests accordingly - [x] Added scenario tests for these cases --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-01 01:05:02 +00:00
iceweasel-oai	4f96001fa7	execpolicy: unwrap PowerShell -Command wrappers on Windows (#20336 ) ## Why On Windows, Codex runs shell commands through a top-level `powershell.exe -NoProfile -Command ...` wrapper. `execpolicy` was matching that wrapper instead of the inner command, so prefix rules like `["git", "push"]` did not fire for PowerShell-wrapped commands even though the same normalization already happens for `bash -lc` on Unix. This change makes the Windows shell wrapper transparent to rule matching while preserving the existing Windows unmatched-command safelist and dangerous-command heuristics. ## What changed - add `parse_powershell_command_plain_commands()` in `shell-command/src/powershell.rs` to unwrap the top-level PowerShell `-Command` body with `extract_powershell_command()` and parse it with the existing PowerShell AST parser - update `core/src/exec_policy.rs` so `commands_for_exec_policy()` treats top-level PowerShell wrappers like `bash -lc` and evaluates rules against the parsed inner commands - carry a small `ExecPolicyCommandOrigin` through unmatched-command evaluation and expose `is_safe_powershell_words()` / `is_dangerous_powershell_words()` so Windows safelist and dangerous-command checks still work after unwrap - add Windows-focused tests for wrapped PowerShell prompt/allow matches, wrapper parsing, and unmatched safe/dangerous inner commands, and re-enable the end-to-end `execpolicy_blocks_shell_invocation` test on Windows ## Testing - `cargo test -p codex-shell-command`	2026-05-01 00:56:20 +00:00
Abhinav	0d9a5d20ec	Alias codex_hooks feature as hooks (#20522 ) # Why The hooks feature flag should use the concise canonical name `hooks`, while existing configs that still use `codex_hooks` continue to work during the rename. # What - change the canonical `Feature::CodexHooks` key from `codex_hooks` to `hooks` - register `codex_hooks` through the existing legacy-alias path - update the config schema and canonical config fixtures to prefer `hooks` - add regression coverage that both `hooks` and `codex_hooks` resolve to `Feature::CodexHooks` # Verification - `cargo test -p codex-features` - `cargo test -p codex-core config::schema_tests` - `cargo test -p codex-core pre_tool_use_blocks_shell_when_defined_in_config_toml` - `cargo test -p codex-app-server hooks_list_uses_each_cwds_effective_feature_enablement`	2026-05-01 00:46:33 +00:00
Akshay Nathan	8426edf71e	Stateful streaming apply_patch parser	2026-04-30 21:41:15 +00:00
Ahmed Ibrahim	8a97f3cf03	realtime: rename provider session ids (#20361 ) ## Summary Codex is repurposing `session` to mean a thread group, so the realtime provider session id should no longer use `session_id` / `sessionId` in Codex-facing protocol payloads. This PR renames that provider-specific field to `realtime_session_id` / `realtimeSessionId` and intentionally breaks clients that still send the old field names. ## What Changed - Renamed realtime provider session fields in `ConversationStartParams`, `RealtimeConversationStartedEvent`, and `RealtimeEvent::SessionUpdated`. - Renamed app-server v2 realtime request and notification fields to `realtimeSessionId`. - Removed legacy serde aliases for `session_id` / `sessionId`; clients must send the new names. - Propagated the rename through core realtime startup, app-server adapters, codex-api websocket handling, and TUI realtime state. - Regenerated app-server protocol schema/TypeScript outputs and updated app-server README examples. - Kept upstream Realtime API concepts unchanged: provider `session.id` parsing and `x-session-id` headers still use the upstream wire names. ## Testing - CI is running on the latest pushed commit. - Earlier local verification on this PR: - `cargo test -p codex-protocol` - `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-core realtime_conversation` - `cargo test -p codex-app-server-protocol` - `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-app-server realtime_conversation` - attempted `CODEX_SKIP_VENDORED_BWRAP=1 cargo test -p codex-tui` (local linker bus error while linking the test binary) --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-30 13:39:48 +03:00
pakrym-oai	fedcefe9da	Reduce the surface of collaboration modes (#20149 ) Collaboration modes were slightly invasive both into ThreadManager construction and ModelProvider	2026-04-29 17:22:41 -07:00
Matthew Zeng	8ce48f9968	[tool_suggest] Improve tool_suggest triggering conditions. (#20091 ) ## Summary - Tighten `tool_suggest` guidance so it prefers explicit plugin install requests, while still allowing a connector install when the relevant plugin is already installed and a needed connector from that plugin is missing. - Tell the model not to call `tool_suggest` in parallel with other tools. ## Testing - `cargo test -p codex-tools tool_suggest` - `cargo test -p codex-core tool_suggest`	2026-04-29 13:41:12 -07:00
viyatb-oai	07c8b8c77c	fix: handle deferred network proxy denials (#19184 ) ## Why This bug is exposed by Guardian/auto-review approvals. With the managed network proxy enabled, a blocked network request can be reported back through the network approval service as an approval denial after the command has already started. Before this change, the shell and unified exec runtimes registered those network approval calls, but did not have a way to observe an async proxy denial as a cancellation/failure signal for the running process. The result was confusing: Guardian/auto-review could correctly deny network access, but the command path could keep running or unregister the approval without surfacing the denial as the command failure. ## What Changed - `NetworkApprovalService` now attaches a cancellation token to active and deferred network approvals. - Proxy-denial outcomes are recorded only for active registrations, cancel the owning token, and are consumed when the approval is finalized. - The shell runtime combines the normal command timeout with the network-denial cancellation token. - Unified exec stores the deferred network approval object, terminates tracked processes when the proxy denial arrives, and returns the denial as a process failure while polling or completing the process. - Tool orchestration passes the active network approval cancellation token into the sandbox attempt and preserves deferred approval errors instead of silently unregistering them. - App-server `command/exec` now handles the combined timeout-or-cancellation expiration variant used by the runtime. ## Verification - `cargo test -p codex-core network_approval --lib` - `cargo clippy -p codex-app-server --all-targets -- -D warnings` - `cargo clippy -p codex-core --all-targets -- -D warnings` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-29 19:13:57 +00:00
pakrym-oai	8356806fc9	Add ThreadManager sample crate (#20141 ) Summary: - Add codex-thread-manager-sample, a one-shot binary that starts a ThreadManager thread, submits a prompt, and prints the final assistant output. - Pass ThreadStore into ThreadManager::new and expose thread_store_from_config for existing callsites. - Build the sample Config directly with only --model and prompt inputs. Verification: - just fmt - cargo check -p codex-thread-manager-sample -p codex-app-server -p codex-mcp-server - git diff --check Tests: Not run per request.	2026-04-29 11:21:06 -07:00
starr-openai	e1ec9e63a0	Add environment provider snapshot (#20058 ) ## Summary - Change `EnvironmentProvider` to return concrete `Environment` instances instead of `EnvironmentConfigurations`. - Make `DefaultEnvironmentProvider` provide the provider-visible `local` environment plus optional `remote` environment from `CODEX_EXEC_SERVER_URL`. - Keep `EnvironmentManager` as the concrete cache while exposing its own explicit local environment for `local_environment()` fallback paths. ## Validation - `just fmt` - `git diff --check` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-28 20:05:18 -07:00
Michael Bolin	c9f7c88f3d	fix: restore live event submit path for apply patch tests (#20108 ) ## Summary This fixes the CI regression introduced by [#20040](https://github.com/openai/codex/pull/20040). That PR migrated several `apply_patch_cli` tests from direct `codex.submit(Op::UserTurn { ... })` calls to `harness.submit(...)`. `harness.submit()` waits for `TurnComplete` before returning, which drains the same event stream that these tests use to assert `TurnDiff`, `PatchApplyUpdated`, and related live events. The regressed tests then timed out waiting for events that had already been consumed. This change restores a no-wait submit path for the event-observing `apply_patch_cli` tests so they can watch the turn stream directly again. ## What Changed - added a local `submit_without_wait(...)` helper in `codex-rs/core/tests/suite/apply_patch_cli.rs` - switched the `apply_patch_cli` tests that assert live turn events back to that helper - left the profile-backed `harness.submit(...)` migration in place for tests that only care about final filesystem or tool output state ## Why macOS Looked Green In the failing run [25084487331](https://github.com/openai/codex/actions/runs/25084487331), `//codex-rs/core:core-all-test` was cached on macOS, so the regressed tests were not rerun there. The Linux GNU, Linux MUSL, and Windows Bazel jobs reran the target and exposed the failure. ## Verification - `cargo test -p codex-core apply_patch_ -- --nocapture` - previously failing local cases now pass again: - `apply_patch_cli_move_without_content_change_has_no_turn_diff` - `apply_patch_turn_diff_for_rename_with_content_change` - `apply_patch_aggregates_diff_across_multiple_tool_calls`	2026-04-28 18:09:20 -07:00
Michael Bolin	1211a90a35	core tests: migrate hook turns to profiles (#20041 ) ## Summary - Removes `SandboxPolicy` from the hooks test suite. - Submits hook-related turns with explicit `PermissionProfile` values for disabled, read-only, and workspace-write cases. - Preserves the managed-network hook test by configuring and submitting a workspace-write profile with enabled network, allowing the existing requirements-backed proxy path to remain covered. ## Verification - `cargo check -p codex-core --tests` - `just fmt`	2026-04-28 17:18:45 -07:00
Michael Bolin	1fed948c66	core tests: migrate apply patch turns to profiles (#20040 ) ## Summary - Removes `SandboxPolicy` from the apply-patch CLI test suite. - Uses the harness' profile-backed submit helper for danger/no-sandbox turns instead of constructing `Op::UserTurn` manually with legacy fields. - Converts the workspace-write traversal cases to submit `PermissionProfile::workspace_write_with(...)` directly. ## Verification - `cargo check -p codex-core --tests` - `just fmt`	2026-04-28 17:18:19 -07:00
Michael Bolin	1dae5788e1	core tests: migrate rmcp turns to profiles (#20037 ) ## Summary - Removes `SandboxPolicy` from the RMCP client test suite. - Adds shared read-only user-turn helpers that submit `PermissionProfile::read_only()` plus the legacy compatibility projection required by the current `Op::UserTurn` shape. - Keeps sandbox metadata assertions intact by deriving the expected legacy `sandboxPolicy` value from the same read-only profile used for the turn. ## Verification - `cargo check -p codex-core --tests` - `just fmt`	2026-04-28 17:17:47 -07:00
Michael Bolin	6662c0f312	core tests: migrate compact turns to profiles (#20035 ) ## Summary - Removes the remaining `SandboxPolicy` usage from the compaction test suite. - Adds a small local helper for direct `Op::UserTurn` construction so these tests send `PermissionProfile::Disabled` plus the legacy compatibility projection required by the protocol field. - Keeps the existing danger/full-access behavior while exercising the canonical permission profile path. ## Verification - `cargo check -p codex-core --tests` - `just fmt`	2026-04-28 17:17:12 -07:00
Michael Bolin	026df712cc	core tests: migrate zsh-fork permissions to profiles (#20034 ) ## Summary - Updates the zsh-fork test helper to configure `PermissionProfile` directly instead of constructing a legacy `SandboxPolicy`. - Sends permission-profile-backed turns from the skill approval zsh-fork tests so the runtime and request path exercise the canonical permissions model. - Leaves the broader approvals suite on legacy policies for now, except for the zsh-fork test that shares this helper. ## Verification - `cargo check -p codex-core --tests` - `just fmt`	2026-04-28 17:15:58 -07:00
Michael Bolin	1ea90410e1	core tests: migrate request permissions tool turns to profiles (#20033 ) ## Summary This migrates the macOS request-permissions tool tests from legacy `SandboxPolicy` setup to `PermissionProfile` setup. The tests still exercise the same workspace-write baseline and request-permission grants, but the canonical permissions value is now the profile. ## Changes - Replaces the `workspace_write_excluding_tmp()` helper with a `PermissionProfile::workspace_write_with()` helper. - Applies test config through `Permissions::set_permission_profile()`. - Uses `turn_permission_fields()` for `Op::UserTurn` compatibility fields. - Removes the `SandboxPolicy` import from `request_permissions_tool.rs`. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:15:13 -07:00
Michael Bolin	af39e488bc	core tests: migrate prompt caching turns to profiles (#20032 ) ## Summary This removes the explicit `SandboxPolicy` constructors from `core/tests/suite/prompt_caching.rs`. The tests still exercise the same prompt-cache invariants across permission and turn-context changes, but the permission source is now `PermissionProfile`. ## Changes - Uses `PermissionProfile::workspace_write_with()` for workspace-write override scenarios. - Uses `PermissionProfile::Disabled` for the no-sandbox per-turn override. - Projects profiles through `turn_permission_fields()` or `to_legacy_sandbox_policy()` only to populate compatibility fields on existing ops. - Removes the `SandboxPolicy` import from `prompt_caching.rs`. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:13:53 -07:00
Michael Bolin	5d08315c00	core tests: migrate exec policy turns to profiles (#20030 ) ## Summary This migrates `core/tests/suite/exec_policy.rs` away from legacy `SandboxPolicy` turn construction. These tests all use no-sandbox turns to exercise exec-policy behavior, so `PermissionProfile::Disabled` is the canonical representation. ## Changes - Replaces direct `SandboxPolicy::DangerFullAccess` turn fields with `PermissionProfile::Disabled`. - Uses `turn_permission_fields()` to populate the compatibility `sandbox_policy` field required by `Op::UserTurn`. - Removes the `SandboxPolicy` import from `exec_policy.rs`. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:12:48 -07:00
Michael Bolin	b599849d86	core tests: migrate permissions message tests to profiles (#20028 ) ## Summary This removes another test-only `SandboxPolicy` dependency by configuring `permissions_messages.rs` with a `PermissionProfile` directly. The test still verifies the rendered compatibility permissions text, but now obtains the legacy projection from the loaded `Config` rather than using `SandboxPolicy` as the source of truth. ## Changes - Builds the workspace-write test setup with `PermissionProfile::workspace_write_with()`. - Applies that profile through `Permissions::set_permission_profile()`. - Uses `Config::legacy_sandbox_policy()` only for the expected `PermissionsInstructions` compatibility rendering. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:12:10 -07:00
Michael Bolin	3ef09c71d3	core tests: migrate tools tests to permission profiles (#20027 ) ## Summary This continues the test-side migration away from `SandboxPolicy` by removing the remaining legacy policy setup in `core/tests/suite/tools.rs`. The affected test was already modeling a profile-backed filesystem policy with a deny-read glob, so configuring the test through `Permissions::set_permission_profile()` is a better match for the behavior being exercised. ## Changes - Drops the `SandboxPolicy` import from `core/tests/suite/tools.rs`. - Configures the glob deny-read shell test directly with a `PermissionProfile` instead of creating a legacy read-only policy first. - Submits the test turn with the session permission profile so the deny-read glob remains active for the command under test. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:11:43 -07:00
Michael Bolin	8d3992d830	core tests: migrate plan item turns to profiles (#20026 ) ## Why The core item tests still had a cluster of plan-mode `Op::UserTurn` literals that used `SandboxPolicy::DangerFullAccess` and omitted `permission_profile`. These tests are validating emitted item lifecycle events, so keeping them on the legacy sandbox-only turn shape adds noise to the broader permissions migration without testing legacy behavior. ## What Changed - Adds a local `disabled_plan_turn()` helper that preserves the existing `std::env::current_dir()` turn cwd behavior. - Uses `turn_permission_fields(PermissionProfile::Disabled, cwd)` to populate both the compatibility `sandbox_policy` and canonical `permission_profile` fields. - Replaces the plan-mode hand-built turns in `codex-rs/core/tests/suite/items.rs`, removing all `SandboxPolicy` references from that file and reducing remaining `codex-rs/core/tests` `SandboxPolicy` files from 16 to 15. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:11:17 -07:00
Michael Bolin	162f4e3183	core tests: migrate safety check turns to profiles (#20024 ) ## Why This stack is retiring direct `SandboxPolicy` construction from tests so core coverage exercises the same `PermissionProfile` turn path used by runtime code. `safety_check_downgrade.rs` still submitted each test turn as `SandboxPolicy::DangerFullAccess` with no permission profile, even though the tests are about model verification/reroute behavior rather than legacy sandbox conversion. ## What Changed - Adds a local `disabled_text_turn()` helper that derives both the compatibility `sandbox_policy` and canonical `permission_profile` from `PermissionProfile::Disabled`. - Replaces repeated hand-built `Op::UserTurn` literals in `codex-rs/core/tests/suite/safety_check_downgrade.rs` with that helper. - Removes all `SandboxPolicy` references from the safety-check suite, reducing the remaining `codex-rs/core/tests` files that mention `SandboxPolicy` from 17 to 16. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:10:42 -07:00
Michael Bolin	2a8ce9b319	core tests: migrate view image turns to profiles (#20021 ) ## Why This stack is removing direct `SandboxPolicy` usage from test code so new tests exercise the same `PermissionProfile` path that runtime code now treats as canonical. `view_image.rs` still built `Op::UserTurn` requests with `SandboxPolicy::DangerFullAccess` and no permission profile, which kept another core test module on the legacy turn shape. ## What Changed - Adds a small `disabled_user_turn()` helper for the view-image suite that derives the compatibility `sandbox_policy` and canonical `permission_profile` from `PermissionProfile::Disabled`. - Replaces repeated direct `Op::UserTurn` literals in `codex-rs/core/tests/suite/view_image.rs` with that helper. - Removes all `SandboxPolicy` references from `view_image.rs`, reducing the remaining `codex-rs/core/tests` files that mention `SandboxPolicy` from 18 to 17. ## Verification - `cargo check -p codex-core --tests`	2026-04-28 17:09:48 -07:00
Michael Bolin	d77d23da2e	core tests: migrate model/personality turns to profiles (#20018 ) ## Summary - Migrates `model_switching.rs` and `personality.rs` direct `Op::UserTurn` construction from legacy `SandboxPolicy` literals to `PermissionProfile`-backed turn fields. - Adds small local helpers in each file so tests keep asserting model/personality behavior without repeating permission plumbing. - Reduces `rg -l '\bSandboxPolicy\b' codex-rs/core/tests` from 20 files to 18; `codex-rs/tui` remains at zero `SandboxPolicy` references. ## Testing - `cargo check -p codex-core --tests` - `just fmt`	2026-04-28 17:09:12 -07:00
Michael Bolin	d6d79ffcc7	core tests: send model turns with permission profiles (#20016 ) ## Summary - Migrate direct `Op::UserTurn` construction in remote-model tests from legacy `SandboxPolicy::DangerFullAccess` to `PermissionProfile::Disabled` via `turn_permission_fields()`. - Migrate the Responses API proxy header helper from an inline workspace-write `SandboxPolicy` to `PermissionProfile::workspace_write()`. - Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 22 files after #20015 to 20 files. ## Testing - `cargo check -p codex-core --tests` - `just fmt` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/20016). * #20041 * #20040 * #20037 * #20035 * #20034 * #20033 * #20032 * #20030 * #20028 * #20027 * #20026 * #20024 * #20021 * #20018 * __->__ #20016	2026-04-28 17:08:04 -07:00
Michael Bolin	158b2a4201	core tests: configure profiles directly (#20015 ) ## Summary - Replace legacy sandbox config setup in delegate and telemetry tests with direct `PermissionProfile` configuration. - Move no-sandbox and read-only test turns in `tools.rs`, `code_mode.rs`, `user_shell_cmd.rs`, and `model_visible_layout.rs` from legacy `SandboxPolicy` values to `PermissionProfile` helpers, while leaving the deny-glob read-only compatibility case for a later targeted cleanup. - Use `PermissionProfile::read_only()` where tests need managed read-only behavior and `PermissionProfile::Disabled` where they intentionally need no sandbox. - Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 27 files after #20013 to 22 files. ## Testing - `cargo check -p codex-core --tests` - `just fmt`	2026-04-28 17:06:59 -07:00
Michael Bolin	52e79ee49a	core tests: migrate more turns to permission profiles (#20013 ) ## Summary - Migrate another batch of direct `Op::UserTurn` test construction from legacy `SandboxPolicy` values to `PermissionProfile` inputs via `turn_permission_fields()`. - Replace a one-off read-only `SandboxPolicy` bridge in the macOS exec test with `PermissionProfile::read_only()`. - Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 32 files at the start of the cleanup stack to 27 files. ## Testing - `cargo check -p codex-core --tests` - `just fmt` - `just fix -p codex-core`	2026-04-28 17:05:53 -07:00
Michael Bolin	7d15936e69	core tests: build user turns from permission profiles (#20011 ) ## Summary - Add `turn_permission_fields()` so tests that construct `Op::UserTurn` directly can provide a canonical `PermissionProfile` while still filling the required legacy `sandbox_policy` compatibility field. - Migrate direct user-turn construction in core integration tests from `SandboxPolicy::DangerFullAccess` to `PermissionProfile::Disabled`. - Continue reducing direct `SandboxPolicy` usage in `codex-rs/core/tests`, from 41 files after #20010 to 32 files in this PR. ## Testing - `cargo check -p codex-core --tests` - `just fmt` - `just fix -p core_test_support` - `just fix -p codex-core`	2026-04-28 17:03:20 -07:00
Michael Bolin	891722849d	core tests: submit turns with permission profiles (#20010 ) ## Summary - Add `PermissionProfile`-based turn submission helpers to `core_test_support`, while keeping the legacy `SandboxPolicy` helper for tests that intentionally exercise legacy fallback behavior. - Switch the default `TestCodex::submit_turn()` path to send a real `PermissionProfile` plus the required legacy compatibility projection in `Op::UserTurn`. - Migrate straightforward app/search/shell/truncation tests from `SandboxPolicy::{DangerFullAccess, ReadOnly}` to `PermissionProfile::{Disabled, read_only}`. - Add a TUI compatibility projection helper for legacy app-server fields so non-legacy writable roots are preserved instead of being downgraded to read-only. - Fix remote start/resume/fork sandbox-mode projection to classify any managed profile with writable roots as workspace-write, not only profiles that can write `cwd`. - Reduce `SandboxPolicy` references in `codex-rs/core/tests` from 47 files to 41 files without changing production behavior. ## Testing - `cargo check -p codex-core --tests` - `cargo test -p codex-tui compatibility_profile_preserves_unbridgeable_write_roots` - `cargo test -p codex-tui sandbox_mode_preserves_non_cwd_write_roots_for_remote_sessions` - `just fmt` - `just fix -p core_test_support` - `just fix -p codex-core`	2026-04-28 23:01:40 +00:00
Abhinav	c6e7d564c3	Discover hooks bundled with plugins (#19705 ) ## Why Plugins can bundle lifecycle hooks, but Codex previously only discovered hooks from user, project, and managed config layers. This adds the plugin discovery and runtime plumbing needed for plugin-bundled hooks while keeping execution behind the `plugin_hooks` feature flag. ## What - Discovers plugin hook sources from each plugin's default `hooks/hooks.json`. - Supports `plugin.json` manifest `hooks` entries as either relative paths or inline hook objects. - Plumbs discovered plugin hook sources through plugin loading into the hook runtime when `plugin_hooks` is enabled. - Marks plugin-originated hook runs as `HookSource::Plugin`. - Injects `PLUGIN_ROOT` and `CLAUDE_PLUGIN_ROOT` into plugin hook command environments. - Updates generated schemas and hook source metadata for the plugin hook source. ## Stack 1. This PR - openai/codex#19705 2. openai/codex#19778 3. openai/codex#19840 4. openai/codex#19882 ## Reviewer Notes - Core logic is in `codex-rs/core-plugins/src/loader.rs` and `codex-rs/hooks/src/engine/discovery.rs` - Moved existing / adding new tests to `codex-rs/core-plugins/src/loader_tests.rs` hence the large diff there - Otherwise mostly plumbing and minor schema updates ### Core Changes The `codex-rs/core` changes are limited to wiring plugin hook support into existing core flows: - `core/src/session/session.rs` conditionally pulls effective plugin hook sources and plugin hook load warnings from `PluginsManager` when `plugin_hooks` is enabled, then passes them into `HooksConfig`. - `core/src/hook_runtime.rs` adds the `plugin` metric tag for `HookSource::Plugin`. - `core/config.schema.json` picks up the new `plugin_hooks` feature flag, and `core/src/plugins/manager_tests.rs` updates fixtures for the added plugin hook fields. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-28 14:17:18 -07:00
charley-openai	de2ccf9473	[codex] Add token usage to turn tracing spans (#19432 ) ## Why Slow Codex turns are easier to debug when token usage is visible in the trace itself, without joining against separate analytics. This adds token usage to existing turn-handling spans for regular user turns only. [Example turn](https://openai.datadoghq.com/apm/trace/9d353efa2cb5de1f4c5b93dc33c3df04?colorBy=service&graphType=flamegraph&shouldShowLegend=true&sort=time&spanID=3555541504891512675&spanViewType=metadata&traceQuery=) <img width="1447" height="967" alt="Screenshot 2026-04-24 at 3 03 07 PM" src="https://github.com/user-attachments/assets/ab7bb187-e7fc-41f0-a366-6c44610b2b2c" /> ## What Changed Added response-level token fields on completed handle_responses spans: gen_ai.usage.input_tokens gen_ai.usage.cache_read.input_tokens gen_ai.usage.output_tokens codex.usage.reasoning_output_tokens codex.usage.total_tokens Added aggregate token fields on regular turn spans: codex.turn.token_usage.* Added an explicit regular-turn opt-in via SessionTask::records_turn_token_usage_on_span() so this is not coupled to span-name strings. ## Testing - `cargo test -p codex-otel` - `cargo test -p codex-core turn_and_completed_response_spans_record_token_usage` - `just fmt` - `just fix -p codex-core` - `just fix -p codex-otel` - Manual local Electron/app-server smoke test: regular user turn emits the new span fields Known status: `cargo test -p codex-core` was attempted and failed in unrelated existing areas: config approvals, request-permissions, git-info ordering, and subagent metadata persistence.	2026-04-28 11:41:32 -07:00

1 2 3 4 5 ...

1160 Commits