codex

mirror of https://github.com/openai/codex.git synced 2026-05-24 21:14:51 +00:00

Author	SHA1	Message	Date
Ahmed Ibrahim	2f6fc7c137	Add realtime output modality and transcript events (#17701 ) - Add outputModality to thread/realtime/start and wire text/audio output selection through app-server, core, API, and TUI.\n- Rename the realtime transcript delta notification and add a separate transcript done notification that forwards final text from item done without correlating it with deltas.	2026-04-14 00:13:13 -07:00
rhan-oai	b704df85b8	[codex-analytics] feature plumbing and emittance (#16640 ) --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16640). * #16870 * #16706 * #16641 * __->__ #16640	2026-04-13 23:11:49 -07:00
Thibault Sottiaux	05c5829923	[codex] drain mailbox only at request boundaries (#17749 ) This changes multi-agent v2 mailbox handling so incoming inter-agent messages no longer preempt an in-flight sampling stream at reasoning or commentary output-item boundaries.	2026-04-13 22:09:51 -07:00
pakrym-oai	3b24a9a532	Refactor plugin loading to async (#17747 ) Simplifies skills migration.	2026-04-13 21:52:56 -07:00
xli-oai	ff584c5a4b	[codex] Refactor marketplace add into shared core flow (#17717 ) ## Summary Move `codex marketplace add` onto a shared core implementation so the CLI and app-server path can use one source of truth. This change: - adds shared marketplace-add orchestration in `codex-core` - switches the CLI command to call that shared implementation - removes duplicated CLI-only marketplace add helpers - preserves focused parser and add-path coverage while moving the shared behavior into core tests ## Why The new `marketplace/add` RPC should reuse the same underlying marketplace-add flow as the CLI. This refactor lands that consolidation first so the follow-up app-server PR can be mostly protocol and handler wiring. ## Validation - `cargo test -p codex-core marketplace_add` - `cargo test -p codex-cli marketplace_cmd` - `just fix -p codex-core` - `just fix -p codex-cli` - `just fmt`	2026-04-13 20:37:11 -07:00
pakrym-oai	d4be06adea	Add turn item injection API (#17703 ) ## Summary - Add `turn/inject_items` app-server v2 request support for appending raw Responses API items to a loaded thread history without starting a turn. - Generate JSON schema and TypeScript protocol artifacts for the new params and empty response. - Document the new endpoint and include a request/response example. - Preserve compatibility with the typo alias `turn/injet_items` while returning the canonical method name. ## Testing - Not run (not requested)	2026-04-13 16:11:05 -07:00
josiah-openai	937dd3812d	Add `supports_parallel_tool_calls` flag to included mcps (#17667 ) ## Why For more advanced MCP usage, we want the model to be able to emit parallel MCP tool calls and have Codex execute eligible ones concurrently, instead of forcing all MCP calls through the serial block. The main design choice was where to thread the config. I made this server-level because parallel safety depends on the MCP server implementation. Codex reads the flag from `mcp_servers`, threads the opted-in server names into `ToolRouter`, and checks the parsed `ToolPayload::Mcp { server, .. }` at execution time. That avoids relying on model-visible tool names, which can be incomplete in deferred/search-tool paths or ambiguous for similarly named servers/tools. ## What was added Added `supports_parallel_tool_calls` for MCP servers. Before: ```toml [mcp_servers.docs] command = "docs-server" ``` After: ```toml [mcp_servers.docs] command = "docs-server" supports_parallel_tool_calls = true ``` MCP calls remain serial by default. Only tools from opted-in servers are eligible to run in parallel. Docs also now warn to enable this only when the server’s tools are safe to run concurrently, especially around shared state or read/write races. ## Testing Tested with a local stdio MCP server exposing real delay tools. The model/Responses side was mocked only to deterministically emit two MCP calls in the same turn. Each test called `query_with_delay` and `query_with_delay_2` with `{ "seconds": 25 }`. \| Build/config \| Observed \| Wall time \| \| --- \| --- \| --- \| \| main with flag enabled \| serial \| `58.79s` \| \| PR with flag enabled \| parallel \| `31.73s` \| \| PR without flag \| serial \| `56.70s` \| PR with flag enabled showed both tools start before either completed; main and PR-without-flag completed the first delay before starting the second. Also added an integration test. 
Additional checks: - `cargo test -p codex-tools` passed - `cargo test -p codex-core mcp_parallel_support_uses_exact_payload_server` passed - `git diff --check` passed	2026-04-13 15:16:34 -07:00
Ahmed Ibrahim	ec0133f5f8	Cap realtime mirrored user turns (#17685 ) Cap mirrored user text sent to realtime with the existing 300-token turn budget while preserving the full model turn. Adds integration coverage for capped realtime mirror payloads. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-13 14:31:18 -07:00
Kevin Liu	ecdd733a48	Remove unnecessary tests (#17395 )	2026-04-13 21:02:12 +00:00
jif-oai	d905376628	feat: Avoid reloading curated marketplaces for tool-suggest discovera… (#17638 ) - stop `list_tool_suggest_discoverable_plugins()` from reloading the curated marketplace for each discoverable plugin - reuse a direct plugin-detail loader against the already-resolved marketplace entry The trigger was to stop those logs spamming: ``` d=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/life-science-research/.codex-plugin/plugin.json 2026-04-13T12:27:30.402Z WARN [019d81cf-6f69-7230-98aa-74294ff2dc5a] codex_core::plugins::manifest - session_loop{thread_id=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/build-ios-apps/.codex-plugin/plugin.json 2026-04-13T12:27:30.402Z WARN [019d81cf-6f69-7230-98aa-74294ff2dc5a] codex_core::plugins::manifest - session_loop{thread_id=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters 
path=/Users/jif/.codex/.tmp/plugins/plugins/life-science-research/.codex-plugin/plugin.json 2026-04-13T12:27:30.405Z WARN [019d81cf-6f69-7230-98aa-74294ff2dc5a] codex_core::plugins::manifest - session_loop{thread_id=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/build-ios-apps/.codex-plugin/plugin.json 2026-04-13T12:27:30.406Z WARN [019d81cf-6f69-7230-98aa-74294ff2dc5a] codex_core::plugins::manifest - session_loop{thread_id=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/life-science-research/.codex-plugin/plugin.json 2026-04-13T12:27:30.408Z WARN [019d81cf-6f69-7230-98aa-74294ff2dc5a] codex_core::plugins::manifest - session_loop{thread_id=019d81cf-6f69-7230-98aa-74294ff2dc5a}:submission_dispatch{otel.name="op.dispatch.user_input" submission.id="019d86c8-0a8e-7013-b442-109aabbf75c9" codex.op="user_input"}:turn{otel.name="session_task.turn" thread.id=019d81cf-6f69-7230-98aa-74294ff2dc5a turn.id=019d86c8-0a8e-7013-b442-109aabbf75c9 model=gpt-5.4}: ignoring interface.defaultPrompt: prompt must be at most 128 characters path=/Users/jif/.codex/.tmp/plugins/plugins/build-ios-apps/.codex-plugin/plugin.json ```	2026-04-13 19:08:43 +00:00
jif-oai	46a266cd6a	feat: disable memory endpoint (#17626 )	2026-04-13 18:29:49 +01:00
pakrym-oai	ac82443d07	Use AbsolutePathBuf in skill loading and codex_home (#17407 ) Helps with FS migration later	2026-04-13 10:26:51 -07:00
Eric Traut	a5783f90c9	Fix custom tool output cleanup on stream failure (#17470 ) Addresses #16255 Problem: Incomplete Responses streams could leave completed custom tool outputs out of cleanup and retry prompts, making persisted history inconsistent and retries stale. Solution: Route stream and output-item errors through shared cleanup, and rebuild retry prompts from fresh session history after the first attempt.	2026-04-13 08:35:17 -07:00
friel-openai	776246c3f5	Make forked agent spawns keep parent model config (#17247 ) ## Summary When a `spawn_agent` call does a full-history fork, keep the parent's effective agent type and model configuration instead of applying child role/model overrides. This is the minimal config-inheritance slice of #16055. Prompt-cache key inheritance and MCP tool-surface stability are split into follow-up PRs. ## Design - Reject `agent_type`, `model`, and `reasoning_effort` for v1 `fork_context` spawns. - Reject `agent_type`, `model`, and `reasoning_effort` for v2 `fork_turns = "all"` spawns. - Keep v2 partial-history forks (`fork_turns = "N"`) configurable; requested model/reasoning overrides and role config still apply there. - Keep non-forked spawn behavior unchanged. ## Tests - `cargo +1.93.1 test -p codex-core spawn_agent_fork_context --lib` - `cargo +1.93.1 test -p codex-core multi_agent_v2_spawn_fork_turns --lib` - `cargo +1.93.1 test -p codex-core multi_agent_v2_spawn_partial_fork_turns_allows_agent_type_override --lib`	2026-04-13 15:28:40 +00:00
jif-oai	86bd0bc95c	nit: change consolidation model (#17633 )	2026-04-13 13:02:07 +01:00
jif-oai	bacb92b1d7	Build remote exec env from exec-server policy (#17216 ) ## Summary - add an exec-server `envPolicy` field; when present, the server starts from its own process env and applies the shell environment policy there - keep `env` as the exact environment for local/embedded starts, but make it an overlay for remote unified-exec starts - move the shell-environment-policy builder into `codex-config` so Core and exec-server share the inherit/filter/set/include behavior - overlay only runtime/sandbox/network deltas from Core onto the exec-server-derived env ## Why Remote unified exec was materializing the shell env inside Core and forwarding the whole map to exec-server, so remote processes could inherit the orchestrator machine's `HOME`, `PATH`, etc. This keeps the base env on the executor while preserving Core-owned runtime additions like `CODEX_THREAD_ID`, unified-exec defaults, network proxy env, and sandbox marker env. ## Validation - `just fmt` - `git diff --check` - `cargo test -p codex-exec-server --lib` - `cargo test -p codex-core --lib unified_exec::process_manager::tests` - `cargo test -p codex-core --lib exec_env::tests` - `cargo test -p codex-core --lib exec_env_tests` (compile-only; filter matched 0 tests) - `cargo test -p codex-config --lib shell_environment` (compile-only; filter matched 0 tests) - `just bazel-lock-update` ## Known local validation issue - `just bazel-lock-check` is not runnable in this checkout: it invokes `./scripts/check-module-bazel-lock.sh`, which is missing. --------- Co-authored-by: Codex <noreply@openai.com> Co-authored-by: pakrym-oai <pakrym@openai.com>	2026-04-13 09:59:08 +01:00
jif-oai	4ffe6c2ce6	feat: ignore keyring on 0.0.0 (#17221 ) To prevent the spammy: <img width="424" height="172" alt="Screenshot 2026-04-09 at 13 36 16" src="https://github.com/user-attachments/assets/b5ece9e3-c561-422f-87ec-041e7bd6813d" />	2026-04-13 09:58:47 +01:00
starr-openai	d626dc3895	Run exec-server fs operations through sandbox helper (#17294 ) ## Summary - run exec-server filesystem RPCs requiring sandboxing through a `codex-fs` arg0 helper over stdin/stdout - keep direct local filesystem execution for `DangerFullAccess` and external sandbox policies - remove the standalone exec-server binary path in favor of top-level arg0 dispatch/runtime paths - add sandbox escape regression coverage for local and remote filesystem paths ## Validation - `just fmt` - `git diff --check` - remote devbox: `cd codex-rs && bazel test --bes_backend= --bes_results_url= //codex-rs/exec-server:all` (6/6 passed) --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-12 18:36:03 -07:00
pakrym-oai	7c1e41c8b6	Add MCP tool wall time to model output (#17406 ) Include MCP wall time in the output so the model is aware of how long its calls are taking.	2026-04-12 18:26:15 -07:00
Eric Traut	470510174b	Remove context status-line meter (#17420 ) Addresses #17313 Problem: The visual context meter in the status line was confusing and continued to draw negative feedback, and context reporting should remain an explicit opt-in rather than part of the default footer. Solution: Remove the visual meter, restore opt-in context remaining/used percentage items that explicitly say "Context", keep existing context-usage configs working as a hidden alias, and update the setup text and snapshots.	2026-04-12 15:42:09 -07:00
Ahmed Ibrahim	d840b247d7	Mirror user text into realtime (#17520 ) - Let typed user messages submit while realtime is active and mirror accepted text into the realtime text stream. - Add integration coverage and snapshot for outbound realtime text.	2026-04-12 15:03:14 -07:00
Francis Chalissery	720932ca3d	[codex] Support flattened deferred MCP tool calls (#17556 ) ## Summary - register flattened handler aliases for deferred MCP tools - cover the node_repl-shaped deferred MCP call path in tool registry tests ## Root Cause Deferred MCP tools were registered only under their namespaced handler key, e.g. `mcp__node_repl__:js`. If the model/bridge emitted the flattened qualified name `mcp__node_repl__js`, core parsed it as an MCP payload but dispatch looked up the flattened handler key and returned `unsupported call` before reaching the MCP handler. ## Validation - `just fmt` - `cargo test -p codex-tools search_tool_registers_deferred_mcp_flattened_handlers` - `cargo test -p codex-core search_tool_registers_namespaced_mcp_tool_aliases` - `git diff --check`	2026-04-12 13:19:36 -07:00
Ahmed Ibrahim	4db60d5d8b	Budget realtime current thread context (#17519 ) Select Current Thread startup context by budget from newest turns, cap each rendered turn at 300 approximate tokens, and add formatter plus integration snapshot coverage.	2026-04-12 11:59:09 -07:00
Won Park	3895ddd6b1	Clarify guardian timeout guidance (#17521 ) ## Summary - update the guardian timeout guidance to say permission approval review timed out - simplify the retry guidance to say retry once or ask the user for guidance or explicit approval ## Testing - cargo test -p codex-core guardian_timeout_message_distinguishes_timeout_from_policy_denial - cargo test -p codex-core guardian_review_decision_maps_to_mcp_tool_decision	2026-04-12 02:03:53 -07:00
Won Park	ba839c23f3	changing decision semantics after guardian timeout (#17486 ) Summary This PR treats Guardian timeouts as distinct from explicit denials in the core approval paths. Timeouts now return timeout-specific guidance instead of Guardian policy-rejection messaging. It updates the command, shell, network, and MCP approval flows and adds focused test coverage.	2026-04-12 00:00:50 -07:00
sayan-oai	1325bcd3f6	chore: refactor name and namespace to single type (#17402 ) avoid passing them both around, unify on a type. this now also keys `ToolRegistry`. tests pass	2026-04-11 23:06:22 +00:00
Ahmed Ibrahim	163ae7d3e6	fix (#17493 )	2026-04-11 13:52:17 -07:00
Eric Traut	e9e7ef3d36	Fix thread/list cwd filtering for Windows verbatim paths (#17414 ) Addresses #17302 Problem: `thread/list` compared cwd filters with raw path equality, so `resume --last` could miss Windows sessions when the saved cwd used a verbatim path form and the current cwd did not. Solution: Normalize cwd comparisons through the existing path comparison utilities before falling back to direct equality, and add Windows regression coverage for verbatim paths. I made this a general utility function and replaced all of the duplicated instances of it across the codebase.	2026-04-10 23:08:02 -07:00
Matthew Zeng	b7139a7e8f	[mcp] Support MCP Apps part 3 - Add mcp tool call support. (#17364 ) - [x] Add a new app-server method so that MCP Apps can call their own MCP server directly.	2026-04-11 04:39:19 +00:00
Won Park	37aac89a6d	representing guardian review timeouts in protocol types (#17381 ) ## Summary - Add `TimedOut` to Guardian/review carrier types: - `ReviewDecision::TimedOut` - `GuardianAssessmentStatus::TimedOut` - app-server v2 `GuardianApprovalReviewStatus::TimedOut` - Regenerate app-server JSON/TypeScript schemas for the new wire shape. - Wire the new status through core/app-server/TUI mappings with conservative fail-closed handling. - Keep `TimedOut` non-user-selectable in the approval UI. Does not change runtime behavior yet; emitting `TimedOut` and parent-model timeout messaging will come in followup PRs.	2026-04-10 20:02:33 -07:00
xli-oai	f9a8d1870f	Add marketplace command (#17087 ) Added a new top-level `codex marketplace add` command for installing plugin marketplaces into Codex’s local marketplace cache. This change adds source parsing for local directories, GitHub shorthand, and git URLs, supports optional `--ref` and git-only `--sparse` checkout paths, stages the source in a temp directory, validates the marketplace manifest, and installs it under `$CODEX_HOME/marketplaces/<marketplace-name>` Included tests cover local install behavior in the CLI and marketplace discovery from installed roots in core. Scoped formatting and fix passes were run, and targeted CLI/core tests passed.	2026-04-10 19:18:37 -07:00
Owen Lin	58933237cd	feat(analytics): add guardian review event schema (#17055 ) Just the analytics schema definition for guardian evaluations. No wiring done yet.	2026-04-10 17:33:58 -07:00
viyatb-oai	b114781495	fix(permissions): fix symlinked writable roots in sandbox permissions (#15981 ) ## Summary - preserve logical symlink paths during permission normalization and config cwd handling - bind real targets for symlinked readable/writable roots in bwrap and remap carveouts and unreadable roots there - add regressions for symlinked carveouts and nested symlink escape masking ## Root cause Permission normalization canonicalized symlinked writable roots and cwd to their real targets too early. That drifted policy checks away from the logical paths the sandboxed process can actually address, while bwrap still needed the real targets for mounts. The mismatch caused shell and apply_patch failures on symlinked writable roots. ## Impact Fixes #15781. Also fixes #17079: - #17079 is the protected symlinked carveout side: bwrap now binds the real symlinked writable-root target and remaps carveouts before masking. Related to #15157: - #15157 is the broader permission-check side of this path-identity problem. This PR addresses the shared logical-vs-canonical normalization issue, but the reported Darwin prompt behavior should be validated separately before auto-closing it. This should also fix #14672, #14694, #14715, and #15725: - #14672, #14694, and #14715 are the same Linux symlinked-writable-root/bwrap family as #15781. - #15725 is the protected symlinked workspace path variant; the PR preserves the protected logical path in policy space while bwrap applies read-only or unreadable treatment to the resolved target so file-vs-directory bind mismatches do not abort sandbox setup. ## Notes - Added Linux-only regressions for symlinked writable ancestors and protected symlinked directory targets, including nested symlink escape masking without rebinding the escape target writable. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-10 17:00:58 -07:00
Shijie Rao	930e5adb7e	Revert "Option to Notify Workspace Owner When Usage Limit is Reached" (#17391 ) Reverts openai/codex#16969 #sev3-2026-04-10-accountscheckversion-500s-for-openai-workspace-7300	2026-04-10 23:33:13 +00:00
Owen Lin	a3be74143a	fix(guardian, app-server): introduce guardian review ids (#17298 ) ## Description This PR introduces `review_id` as the stable identifier for guardian reviews and exposes it in app-server `item/autoApprovalReview/started` and `item/autoApprovalReview/completed` events. Internally, guardian rejection state is now keyed by `review_id` instead of the reviewed tool item ID. `target_item_id` is still included when a review maps to a concrete thread item, but it is no longer overloaded as the review lifecycle identifier. ## Motivation We'd like to give users the ability to preempt a guardian review while it's running (approve or decline). However, we can't implement the API that allows the user to override a running guardian review because we didn't have a unique `review_id` per guardian review. Using `target_item_id` is not correct since: - with execve reviews, there can be multiple execve calls (and therefore guardian reviews) per shell command - with network policy reviews, there is no target item ID The PR that actually implements user overrides will use `review_id` as the stable identifier.	2026-04-10 16:21:02 -07:00
Abhinav	7999b0f60f	Support clear SessionStart source (#17073 ) ## Motivation The `SessionStart` hook already receives `startup` and `resume` sources, but sessions created from `/clear` previously looked like normal startup sessions. This makes it impossible for hook authors to distinguish between these with the matcher. ## Summary - Add `InitialHistory::Cleared` so `/clear`-created sessions can be distinguished from ordinary startup sessions. - Add `SessionStartSource::Clear` and wire it through core, app-server thread start params, and TUI clear-session flow. - Update app-server protocol schemas, generated TypeScript, docs, and related tests. https://github.com/user-attachments/assets/9cae3cb4-41c7-4d06-b34f-966252442e5c	2026-04-10 16:05:21 -07:00
Won Park	147cb84112	add parent-id to guardian context (#17194 ) adding parent codex session id to guardian prompt	2026-04-10 13:57:56 -07:00
rhan-oai	5779be314a	[codex-analytics] add compaction analytics event (#17155 ) - event for compaction analytics - introduces thread-connection and thread metadata caches for data denormalization, expected to be useful for denormalization onto core emitted events in general - threads analytics event client into core (mirrors approved implementation in #16640) - denormalizes key thread metadata: thread_source, subagent_source, parent_thread_id, as well as app-server client and runtime metadata) - compaction strategy defaults to memento, forward compatible with expected prefill_compaction strategy 1. Manual standalone compact, local `INFO \| 2026-04-09 17:35:50 \| codex_backend.routers.analytics_events \| analytics_events.track_analytics_events:526 \| Tracked codex_compaction_event event params={'thread_id': '019d74d0-5cfb-70c0-bef9-165c3bf9b2df', 'turn_id': '019d74d0-d7f6-7c81-acc6-aae2030243d6', 'product_surface': 'codex', 'app_server_client': {'product_client_id': 'CODEX_CLI', 'client_name': 'codex-tui', 'client_version': '0.0.0', 'rpc_transport': 'in_process', 'experimental_api_enabled': True}, 'runtime': {'codex_rs_version': '0.0.0', 'runtime_os': 'macos', 'runtime_os_version': '26.4.0', 'runtime_arch': 'aarch64'}, 'trigger': 'manual', 'reason': 'user_requested', 'implementation': 'responses', 'phase': 'standalone_turn', 'strategy': 'memento', 'status': 'completed', 'active_context_tokens_before': 20170, 'active_context_tokens_after': 4830, 'started_at': 1775781337, 'completed_at': 1775781350, 'thread_source': 'user', 'subagent_source': None, 'parent_thread_id': None, 'error': None, 'duration_ms': 13524} \| ` 2. 
Auto pre-turn compact, local `INFO \| 2026-04-09 17:37:30 \| codex_backend.routers.analytics_events \| analytics_events.track_analytics_events:526 \| Tracked codex_compaction_event event params={'thread_id': '019d74d2-45ef-71d1-9c93-23cc0c13d988', 'turn_id': '019d74d2-7b42-7372-9f0e-c0da3f352328', 'product_surface': 'codex', 'app_server_client': {'product_client_id': 'CODEX_CLI', 'client_name': 'codex-tui', 'client_version': '0.0.0', 'rpc_transport': 'in_process', 'experimental_api_enabled': True}, 'runtime': {'codex_rs_version': '0.0.0', 'runtime_os': 'macos', 'runtime_os_version': '26.4.0', 'runtime_arch': 'aarch64'}, 'trigger': 'auto', 'reason': 'context_limit', 'implementation': 'responses', 'phase': 'pre_turn', 'strategy': 'memento', 'status': 'completed', 'active_context_tokens_before': 20063, 'active_context_tokens_after': 4822, 'started_at': 1775781444, 'completed_at': 1775781449, 'thread_source': 'user', 'subagent_source': None, 'parent_thread_id': None, 'error': None, 'duration_ms': 5497} \| ` 3. 
Auto mid-turn compact, local `INFO \| 2026-04-09 17:38:28 \| codex_backend.routers.analytics_events \| analytics_events.track_analytics_events:526 \| Tracked codex_compaction_event event params={'thread_id': '019d74d3-212f-7a20-8c0a-4816a978675e', 'turn_id': '019d74d3-3ee1-7462-89f6-2ffbeefcd5e3', 'product_surface': 'codex', 'app_server_client': {'product_client_id': 'CODEX_CLI', 'client_name': 'codex-tui', 'client_version': '0.0.0', 'rpc_transport': 'in_process', 'experimental_api_enabled': True}, 'runtime': {'codex_rs_version': '0.0.0', 'runtime_os': 'macos', 'runtime_os_version': '26.4.0', 'runtime_arch': 'aarch64'}, 'trigger': 'auto', 'reason': 'context_limit', 'implementation': 'responses', 'phase': 'mid_turn', 'strategy': 'memento', 'status': 'completed', 'active_context_tokens_before': 20325, 'active_context_tokens_after': 14641, 'started_at': 1775781500, 'completed_at': 1775781508, 'thread_source': 'user', 'subagent_source': None, 'parent_thread_id': None, 'error': None, 'duration_ms': 7507} \| ` 4. 
Remote /responses/compact, manual standalone `INFO \| 2026-04-09 17:40:20 \| codex_backend.routers.analytics_events \| analytics_events.track_analytics_events:526 \| Tracked codex_compaction_event event params={'thread_id': '019d74d4-7a11-78a1-89f7-0535a1149416', 'turn_id': '019d74d4-e087-7183-9c20-b1e40b7578c0', 'product_surface': 'codex', 'app_server_client': {'product_client_id': 'CODEX_CLI', 'client_name': 'codex-tui', 'client_version': '0.0.0', 'rpc_transport': 'in_process', 'experimental_api_enabled': True}, 'runtime': {'codex_rs_version': '0.0.0', 'runtime_os': 'macos', 'runtime_os_version': '26.4.0', 'runtime_arch': 'aarch64'}, 'trigger': 'manual', 'reason': 'user_requested', 'implementation': 'responses_compact', 'phase': 'standalone_turn', 'strategy': 'memento', 'status': 'completed', 'active_context_tokens_before': 23461, 'active_context_tokens_after': 6171, 'started_at': 1775781601, 'completed_at': 1775781620, 'thread_source': 'user', 'subagent_source': None, 'parent_thread_id': None, 'error': None, 'duration_ms': 18971} \| `	2026-04-10 13:03:54 -07:00
Ahmed Ibrahim	2e81eac004	Queue Realtime V2 response.create while active (#17306 ) Builds on #17264. - queues Realtime V2 `response.create` while an active response is open, then flushes it after `response.done` or `response.cancelled` - requests `response.create` after background agent final output and steering acknowledgements - adds app-server integration coverage for all `response.create` paths Validation: - `just fmt` - `cargo check -p codex-app-server --tests` - `git diff --check` - CI green --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-10 09:09:13 -07:00
Owen Lin	88165e179a	feat(guardian): send only transcript deltas on guardian followups (#17269 ) ## Description We reuse a guardian thread for a given user thread when we can. However, we had always sent the full transcript history every time we made a followup review request to an existing guardian thread. This is especially bad for long guardian threads since we keep re-appending old transcript entries instead of just what has changed. The fix is to just send what's new. Caveat: Whenever a thread is compacted or rolled back, we fall back to sending the full transcript to guardian again since the thread's history has been modified. However in the happy path we get a nice optimization. ## Before Initial guardian review sends the full parent transcript: ``` The following is the Codex agent history whose request action you are assessing... >>> TRANSCRIPT START [1] user: Please check the repo visibility and push the docs fix if needed. [2] tool gh_repo_view call: {"repo":"openai/codex"} [3] tool gh_repo_view result: repo visibility: public [4] assistant: The repo is public; I now need approval to push the docs fix. >>> TRANSCRIPT END The Codex agent has requested the following action: >>> APPROVAL REQUEST START ... >>> APPROVAL REQUEST END ``` And a followup to the same guardian thread would send the full transcript again (including items 1-4 we already sent): ``` The following is the Codex agent history whose request action you are assessing... >>> TRANSCRIPT START [1] user: Please check the repo visibility and push the docs fix if needed. [2] tool gh_repo_view call: {"repo":"openai/codex"} [3] tool gh_repo_view result: repo visibility: public [4] assistant: The repo is public; I now need approval to push the docs fix. [5] user: Please push the second docs fix too. [6] assistant: I need approval for the second docs fix. >>> TRANSCRIPT END The Codex agent has requested the following action: >>> APPROVAL REQUEST START ... 
>>> APPROVAL REQUEST END ``` ## After Initial guardian review sends the full parent transcript (this is unchanged): ``` The following is the Codex agent history whose request action you are assessing... >>> TRANSCRIPT START [1] user: Please check the repo visibility and push the docs fix if needed. [2] tool gh_repo_view call: {"repo":"openai/codex"} [3] tool gh_repo_view result: repo visibility: public [4] assistant: The repo is public; I now need approval to push the docs fix. >>> TRANSCRIPT END The Codex agent has requested the following action: >>> APPROVAL REQUEST START ... >>> APPROVAL REQUEST END ``` But a followup now sends: ``` The following is the Codex agent history added since your last approval assessment. Continue the same review conversation... >>> TRANSCRIPT DELTA START [5] user: Please push the second docs fix too. [6] assistant: I need approval for the second docs fix. >>> TRANSCRIPT DELTA END The Codex agent has requested the following next action: >>> APPROVAL REQUEST START ... >>> APPROVAL REQUEST END ```	2026-04-10 07:48:44 -07:00
jif-oai	d39a722865	feat: description multi-agent v2 (#17338 )	2026-04-10 15:31:32 +01:00
jif-oai	8035cb03f1	feat: make rollout recorder reliable against errors (#17214 ) The rollout writer now keeps an owned/monitored task handle, returns real Result acks for flush/persist/shutdown, retries failed flushes by reopening the rollout file, and keeps buffered items until they are successfully written. Session flushes are now real durability barriers for fork/rollback/read-after-write paths, while turn completion surfaces a warning if the rollout still cannot be saved after recovery.	2026-04-10 14:12:33 +01:00
Ahmed Ibrahim	1de0085418	Stream Realtime V2 background agent progress (#17264 ) Stream Realtime V2 background agent updates while the background agent task is still running, then send the final tool output when it completes. User input during an active V2 handoff is acknowledged back to realtime as a steering update. Stack: - Depends on #17278 for the background_agent rename. - Depends on #17280 for the input task handler refactor. Coverage: - Adds an app-server integration regression test that verifies V2 progress is sent before the final function-call output. Validation: - just fmt - cargo check -p codex-core - cargo check -p codex-app-server --tests - git diff --check --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-10 00:06:00 -07:00
Won Park	4e910bf151	adding parent_thread_id in guardian (#17249 ) ## Summary This PR adds the parent conversation/session id to the subagent-start analytics event for Guardian subagents. Previously, Guardian sessions were emitted as subagent thread-initialized events, but their `parent_thread_id` was serialized as `null`. After this change, the `codex_thread_initialized` analytics event for a Guardian child session includes the parent user conversation id.	2026-04-10 06:25:05 +00:00
Ahmed Ibrahim	26a28afc6d	Extract realtime input task handlers (#17280 ) Refactor the realtime input task select loop into named handlers for user text, background agent output, realtime server events, and user audio without changing the V2 behavior. Stack: - Depends on #17278 for the background_agent rename. Validation: - just fmt - cargo check -p codex-core - git diff --check --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-09 22:35:18 -07:00
richardopenai	9f2a585153	Option to Notify Workspace Owner When Usage Limit is Reached (#16969 ) ## Summary - Replace the manual `/notify-owner` flow with an inline confirmation prompt when a usage-based workspace member hits a credits-depleted limit. - Fetch the current workspace role from the live ChatGPT `accounts/check/v4-2023-04-27` endpoint so owner/member behavior matches the desktop and web clients. - Keep owner, member, and spend-cap messaging distinct so we only offer the owner nudge when the workspace is actually out of credits. ## What Changed - `backend-client` - Added a typed fetch for the current account role from `accounts/check`. - Mapped backend role values into a Rust workspace-role enum. - `app-server` and protocol - Added `workspaceRole` to `account/read` and `account/updated`. - Derived `isWorkspaceOwner` from the live role, with a fallback to the cached token claim when the role fetch is unavailable. - `tui` - Removed the explicit `/notify-owner` slash command. - When a member is blocked because the workspace is out of credits, the error now prompts: - `Your workspace is out of credits. Request more from your workspace owner? [y/N]` - Choosing `y` sends the existing owner-notification request. - Choosing `n`, pressing `Esc`, or accepting the default selection dismisses the prompt without sending anything. - Selection popups now honor explicit item shortcuts, which is how the `y` / `n` interaction is wired. ## Reviewer Notes - The main behavior change is scoped to usage-based workspace members whose workspace credits are depleted. - Spend-cap reached should not show the owner-notification prompt. - Owners and admins should continue to see `/usage` guidance instead of the member prompt. - The live role fetch is best-effort; if it fails, we fall back to the existing token-derived ownership signal. ## Testing - Manual verification - Workspace owner does not see the member prompt. 
- Workspace member with depleted credits sees the confirmation prompt and can send the nudge with `y`. - Workspace member with spend cap reached does not see the owner-notification prompt. ### Workspace member out of usage https://github.com/user-attachments/assets/341ac396-eff4-4a7f-bf0c-60660becbea1 ### Workspace owner <img width="1728" height="1086" alt="Screenshot 2026-04-09 at 11 48 22 AM" src="https://github.com/user-attachments/assets/06262a45-e3fc-4cc4-8326-1cbedad46ed6" />	2026-04-09 21:15:17 -07:00
viyatb-oai	b976e701a8	fix: support split carveouts in windows elevated sandbox (#14568 ) ## Summary - preserve legacy Windows elevated sandbox behavior for existing policies - add elevated-only support for split filesystem policies that can be represented as readable-root overrides, writable-root overrides, and extra deny-write carveouts - resolve those elevated filesystem overrides during sandbox transform and thread them through setup and policy refresh - keep failing closed for explicit unreadable (`none`) carveouts and reopened writable descendants under read-only carveouts - for explicit read-only-under-writable-root carveouts, materialize missing carveout directories during elevated setup before applying the deny-write ACL - document the elevated vs restricted-token support split in the core README ## Example Given a split filesystem policy like: ```toml ":root" = "read" ":cwd" = "write" "./docs" = "read" "C:/scratch" = "write" ``` the elevated backend now provisions the readable-root overrides, writable-root overrides, and extra deny-write carveouts during setup and refresh instead of collapsing back to the legacy workspace-only shape. If a read-only carveout under a writable root is missing at setup time, elevated setup creates that carveout as an empty directory before applying its deny-write ACE; otherwise the sandboxed command could create it later and bypass the carveout. This is only for explicit policy carveouts. Best-effort workspace protections like `.codex/` and `.agents/` still skip missing directories. A policy like: ```toml "/workspace" = "write" "/workspace/docs" = "read" "/workspace/docs/tmp" = "write" ``` still fails closed, because the elevated backend does not reopen writable descendants under read-only carveouts yet. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-09 17:34:52 -07:00
Ahmed Ibrahim	32224878b3	Stop Realtime V2 response.done delegation (#17267 ) Stop parsing Realtime V2 response completion as a Codex handoff; delegation stays tied to item completion. Validation: just fmt; git diff --check Co-authored-by: Codex <noreply@openai.com>	2026-04-09 17:17:49 -07:00
Ahmed Ibrahim	ecca34209d	Omit empty app-server instruction overrides (#17258 ) ## Summary - omit serialized Responses instructions when an app-server base instruction override is empty - skip empty developer instruction messages and add v2 coverage for the empty-override request shape ## Validation - just fmt - git diff --check	2026-04-09 15:29:35 -07:00
Matthew Zeng	d7f99b0fa6	[mcp] Expand tool search to custom MCPs. (#16944 ) - [x] Expand tool search to custom MCPs. - [x] Rename several variables/fields to be more generic. Updated tool & server name lifecycles: Raw Identity ToolInfo.server_name is raw MCP server name. ToolInfo.tool.name is raw MCP tool name. MCP calls route back to raw via parse_tool_name() returning (tool.server_name, tool.tool.name). mcpServerStatus/list now groups by raw server and keys tools by Tool.name: mod.rs:599 App-server just forwards that grouped raw snapshot: codex_message_processor.rs:5245 Callable Names On list-tools, we create provisional callable_namespace / callable_name: mcp_connection_manager.rs:1556 For non-app MCP, provisional callable name starts as raw tool name. For codex-apps, provisional callable name is sanitized and strips connector name/id prefix; namespace includes connector name. Then qualify_tools() sanitizes callable namespace + name to ASCII alnum / _ only: mcp_tool_names.rs:128 Note: this is stricter than Responses API. Hyphen is currently replaced with _ for code-mode compatibility. Collision Handling We do initially collapse example-server and example_server to the same base. Then qualify_tools() detects distinct raw namespace identities behind the same sanitized namespace and appends a hash to the callable namespace: mcp_tool_names.rs:137 Same idea for tool-name collisions: hash suffix goes on callable tool name. Final list_all_tools() map key is callable_namespace + callable_name: mcp_connection_manager.rs:769 Direct Model Tools Direct MCP tool declarations use the full qualified sanitized key as the Responses function name. The raw rmcp Tool is converted but renamed for model exposure. 
Tool Search / Deferred Tool search result namespace = final ToolInfo.callable_namespace: tool_search.rs:85 Tool search result nested name = final ToolInfo.callable_name: tool_search.rs:86 Deferred tool handler is registered as "{namespace}:{name}": tool_registry_plan.rs:248 When a function call comes back, core recombines namespace + name, looks up the full qualified key, and gets the raw server/tool for MCP execution: codex.rs:4353 Separate Legacy Snapshot collect_mcp_snapshot_from_manager_with_detail() still returns a map keyed by qualified callable name. mcpServerStatus/list no longer uses that; it uses McpServerStatusSnapshot, which is raw-inventory shaped.	2026-04-09 13:34:52 -07:00
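
The callable-name rules described in d7f99b0fa6 (sanitize to ASCII alphanumerics and `_`, then append a hash when distinct raw names collapse to the same sanitized name) can be sketched roughly as follows. This is an illustration only, not the code in `mcp_tool_names.rs`: the function names `sanitize`, `short_hash`, and `qualify_namespaces` are hypothetical, the FNV-1a hash and the `_<hash>` suffix format are assumptions (the commit message does not say which hash is used), and suffixing every member of a colliding group is also an assumption.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Replace every character outside ASCII alphanumerics and `_` with `_`,
/// so a hyphen becomes `_` as the commit message describes.
fn sanitize(raw: &str) -> String {
    raw.chars()
        .map(|c| if c.is_ascii_alphanumeric() || c == '_' { c } else { '_' })
        .collect()
}

/// Hypothetical stable suffix: 32-bit FNV-1a rendered as 8 hex digits.
/// The real suffix scheme is not given in the commit message.
fn short_hash(raw: &str) -> String {
    let mut h: u32 = 0x811c_9dc5;
    for b in raw.bytes() {
        h ^= b as u32;
        h = h.wrapping_mul(0x0100_0193);
    }
    format!("{:08x}", h)
}

/// Map each raw namespace to a callable namespace. When several distinct raw
/// names share one sanitized form, each gets a hash of its raw name appended
/// (assumption: all members of the colliding group are suffixed).
fn qualify_namespaces(raw_names: &[&str]) -> BTreeMap<String, String> {
    // Group the distinct raw names by their sanitized form.
    let mut groups: BTreeMap<String, BTreeSet<&str>> = BTreeMap::new();
    for &raw in raw_names {
        groups.entry(sanitize(raw)).or_default().insert(raw);
    }

    let mut out = BTreeMap::new();
    for (base, raws) in &groups {
        for &raw in raws {
            let callable = if raws.len() > 1 {
                format!("{}_{}", base, short_hash(raw))
            } else {
                base.clone()
            };
            out.insert(raw.to_string(), callable);
        }
    }
    out
}
```

Calling `qualify_namespaces(&["example-server", "example_server", "github"])` leaves `github` unchanged and gives the two `example` servers distinct names that both start with `example_server_`. Keeping the raw name as the map key mirrors the commit's point that MCP calls are routed back to the raw server and tool names.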


2397 Commits
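
Commit 88165e179a in the list above describes sending only new transcript entries on guardian followups, with a fallback to the full transcript after compaction or rollback. A minimal sketch of that decision, under stated assumptions: `TranscriptPayload`, `GuardianCursor`, `next_payload`, and the `history_version` counter are all hypothetical names for illustration, and how the real code detects a modified history is not stated in the commit message.

```rust
/// What to send to the guardian thread for one review request.
#[derive(Debug, PartialEq)]
enum TranscriptPayload<'a> {
    /// Entire parent transcript (initial review, or history was modified).
    Full(&'a [String]),
    /// Only the entries added since the last review; `first_index` is 1-based.
    Delta { first_index: usize, entries: &'a [String] },
}

/// Hypothetical bookkeeping kept per reused guardian thread.
struct GuardianCursor {
    /// Number of transcript entries already sent.
    sent: usize,
    /// Assumed counter that changes whenever the parent thread is compacted
    /// or rolled back.
    history_version: u64,
}

/// Choose between a full transcript and a delta, then record what was sent.
fn next_payload<'a>(
    cursor: &mut Option<GuardianCursor>,
    transcript: &'a [String],
    history_version: u64,
) -> TranscriptPayload<'a> {
    let payload = match cursor.as_ref() {
        // Happy path: same history, so send only the entries not yet seen.
        Some(c) if c.history_version == history_version && c.sent <= transcript.len() => {
            TranscriptPayload::Delta {
                first_index: c.sent + 1,
                entries: &transcript[c.sent..],
            }
        }
        // First review, or history changed: fall back to the full transcript.
        _ => TranscriptPayload::Full(transcript),
    };
    *cursor = Some(GuardianCursor { sent: transcript.len(), history_version });
    payload
}
```

With the six-entry example from the commit message, the first call returns `Full` for entries 1 to 4, a followup with the same `history_version` returns a `Delta` starting at index 5, and any call with a changed `history_version` returns `Full` again.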