codex

mirror of https://github.com/openai/codex.git synced 2026-05-15 08:42:34 +00:00

Author	SHA1	Message	Date
Liang-Ting Jiang	4fb52feece	Delay auth fetch for file download materialization	2026-04-28 11:59:29 -07:00
Liang-Ting Jiang	963420d166	Simplify Codex Apps file download materialization	2026-04-28 11:59:29 -07:00
Liang-Ting Jiang	c53d84690f	Drop stale Codex Apps provider gate	2026-04-28 11:59:29 -07:00
Liang-Ting Jiang	7cb38ed2e7	Require absolute file download URLs	2026-04-28 11:59:28 -07:00
Liang-Ting Jiang	1887585b8c	Trim file API regression tests	2026-04-28 11:59:28 -07:00
Liang-Ting Jiang	9110f98cad	Tighten file upload auth and failure checks	2026-04-28 11:59:28 -07:00
Liang-Ting Jiang	0084a82a78	Avoid panic in uploaded file payload cleanup	2026-04-28 11:59:28 -07:00
Liang-Ting Jiang	25e8c2caf7	Fix stale Codex Apps test meta key	2026-04-28 11:59:28 -07:00
Liang-Ting Jiang	e8a4b40df7	Add OpenAI file download materialization and library upload	2026-04-28 11:59:27 -07:00
viyatb-oai	3377afd84a	fix(network-proxy): harden linux proxy bridge helpers (#20001 ) ## Why The Linux managed-proxy bridge helpers are long-lived child processes in the sandbox networking path. Before this change they stayed dumpable and the network seccomp profile did not block cross-process memory syscalls, so another same-user process could potentially inspect or modify bridge memory instead of interacting only through the intended proxy interface. ## What changed - reuse the shared `codex-process-hardening` helper to mark bridge helper children non-dumpable before they begin serving - deny `process_vm_readv` and `process_vm_writev` in the existing network seccomp filter ## Security impact Bridge helpers are less exposed to same-user cross-process inspection or memory writes, which reduces the chance that sandboxed code can interfere with proxy support processes outside the intended IPC path. ## Verification - `cargo test -p codex-process-hardening` - `cargo test -p codex-linux-sandbox` - attempted `cargo check -p codex-linux-sandbox --target x86_64-unknown-linux-gnu`; blocked on missing `x86_64-linux-gnu-gcc` on this macOS host --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-28 11:52:50 -07:00
charley-openai	de2ccf9473	[codex] Add token usage to turn tracing spans (#19432 ) ## Why Slow Codex turns are easier to debug when token usage is visible in the trace itself, without joining against separate analytics. This adds token usage to existing turn-handling spans for regular user turns only. [Example turn](https://openai.datadoghq.com/apm/trace/9d353efa2cb5de1f4c5b93dc33c3df04?colorBy=service&graphType=flamegraph&shouldShowLegend=true&sort=time&spanID=3555541504891512675&spanViewType=metadata&traceQuery=) <img width="1447" height="967" alt="Screenshot 2026-04-24 at 3 03 07 PM" src="https://github.com/user-attachments/assets/ab7bb187-e7fc-41f0-a366-6c44610b2b2c" /> ## What Changed Added response-level token fields on completed handle_responses spans: gen_ai.usage.input_tokens gen_ai.usage.cache_read.input_tokens gen_ai.usage.output_tokens codex.usage.reasoning_output_tokens codex.usage.total_tokens Added aggregate token fields on regular turn spans: codex.turn.token_usage.* Added an explicit regular-turn opt-in via SessionTask::records_turn_token_usage_on_span() so this is not coupled to span-name strings. ## Testing - `cargo test -p codex-otel` - `cargo test -p codex-core turn_and_completed_response_spans_record_token_usage` - `just fmt` - `just fix -p codex-core` - `just fix -p codex-otel` - Manual local Electron/app-server smoke test: regular user turn emits the new span fields Known status: `cargo test -p codex-core` was attempted and failed in unrelated existing areas: config approvals, request-permissions, git-info ordering, and subagent metadata persistence.	2026-04-28 11:41:32 -07:00
canvrno-oai	640a1b23ea	Fix plan mode nudge test after task completion signature change (#20045 ) Updates the plan mode nudge test to pass the new `duration_ms` argument to task completion. Co-authored-by: Codex <noreply@openai.com>	2026-04-28 11:24:22 -07:00
Michael Bolin	9e26613657	permissions: add built-in default profiles (#19900 ) ## Why The migration away from `SandboxPolicy` needs new configs to start from permissions profiles instead of deriving profiles from legacy sandbox modes. Existing users can have empty `config.toml` files, and we should not rewrite user-owned config files that may live in shared repositories. This PR introduces built-in profile names so an empty config can resolve to a canonical `PermissionProfile`, while explicit named `[permissions]` profiles still behave predictably. ## What changed - Adds built-in `default_permissions` profile names: - `:read-only` maps to `PermissionProfile::read_only()`. - `:workspace` maps to the workspace-write profile, including project-root metadata carveouts. - `:danger-no-sandbox` maps to `PermissionProfile::Disabled`, preserving the distinction between no sandbox and a broad managed sandbox. - Reserves the `:` prefix for built-in profiles so user-defined `[permissions]` profiles cannot collide with future built-ins. - Allows `default_permissions` to reference a built-in profile without requiring a `[permissions]` table. - Makes an otherwise empty config choose a built-in profile by trust/platform context: trusted or untrusted project roots use `:workspace` when the platform supports that sandbox, while roots without a trust decision use `:read-only`. - Keeps legacy `sandbox_mode` configs on the legacy path, and still rejects user-defined `[permissions]` profiles that omit `default_permissions` so we do not silently guess among custom profiles. - Preserves compatibility behavior for implicit defaults: bare `network.enabled = true` allows runtime network without starting the managed proxy, explicit profile proxy policy still starts the proxy, and implicit workspace/add-dir roots keep legacy metadata carveouts. 
## Verification - `cargo test -p codex-core builtin --lib` - `cargo test -p codex-core profile_network_proxy_config` - `cargo test -p codex-core implicit_builtin_workspace_profile_preserves_add_dir_metadata_carveouts` - `cargo test -p codex-core permissions_profiles_network_enabled_allows_runtime_network_without_proxy` - `cargo test -p codex-core permissions_profiles_proxy_policy_starts_managed_network_proxy` ## Documentation Public Codex config docs should mention these built-in names when the `[permissions]` config format is ready to document as stable. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19900). * #20041 * #20040 * #20037 * #20035 * #20034 * #20033 * #20032 * #20030 * #20028 * #20027 * #20026 * #20024 * #20021 * #20018 * #20016 * #20015 * #20013 * #20011 * #20010 * #20008 * __->__ #19900	2026-04-28 11:21:39 -07:00
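The built-in name mapping described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `Profile` and `ProfileError` types and the `resolve_builtin` function are hypothetical stand-ins, and only the three profile names and the reserved `:` prefix come from the commit message.

```rust
/// Hypothetical stand-in for the real `PermissionProfile` type.
#[derive(Debug, PartialEq)]
enum Profile {
    ReadOnly,
    Workspace,
    Disabled,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum ProfileError {
    /// A `:`-prefixed name that is not one of the known built-ins.
    UnknownBuiltin(String),
    /// A name without the `:` prefix; it would have to be looked up
    /// in the user's `[permissions]` table instead.
    NotBuiltin(String),
}

/// Resolve a `default_permissions` value against the built-in names.
fn resolve_builtin(name: &str) -> Result<Profile, ProfileError> {
    match name {
        ":read-only" => Ok(Profile::ReadOnly),
        ":workspace" => Ok(Profile::Workspace),
        ":danger-no-sandbox" => Ok(Profile::Disabled),
        // The `:` prefix is reserved, so an unknown `:`-name is an error
        // rather than a candidate user-defined profile.
        other if other.starts_with(':') => Err(ProfileError::UnknownBuiltin(other.to_string())),
        other => Err(ProfileError::NotBuiltin(other.to_string())),
    }
}
```

Keeping unknown `:`-prefixed names as a distinct error is one way to make sure a future built-in cannot be shadowed by a user-defined profile.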
viyatb-oai	3afb185a4f	fix(network-proxy): tighten network proxy bypass defaults (#20002 ) ## Why Managed sessions use `NO_PROXY` to keep a small set of destinations on the direct path by default. The old default also bypassed all IPv4 link-local addresses in `169.254.0.0/16`, which includes metadata endpoints such as `169.254.169.254`. Because `NO_PROXY` is evaluated by the client before the request reaches the managed proxy, requests to that range could skip proxy-side allowlist and local-binding checks entirely. On hosts where a link-local metadata service is reachable, that creates a path to sensitive environment metadata or credentials outside the intended enforcement point. ## What changed - remove the default IPv4 link-local `169.254.0.0/16` bypass from the managed proxy environment - keep the existing loopback and private-network defaults unchanged - update the regression assertion to lock in the narrower default ## Security impact Link-local requests now stay on the managed-proxy path by default, so the proxy can apply configured policy before they reach metadata-style endpoints or other link-local services. ## Verification - `cargo test -p codex-network-proxy` Co-authored-by: Codex <noreply@openai.com>	2026-04-28 10:51:43 -07:00
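To illustrate why the narrower default matters, here is a small sketch of a bypass check that keeps loopback and private ranges direct but no longer treats `169.254.0.0/16` as direct. The function names are hypothetical, and the assumption that "private-network defaults" means exactly the RFC 1918 ranges is mine, not the commit's.

```rust
use std::net::Ipv4Addr;

/// True when the address is in the IPv4 link-local range 169.254.0.0/16,
/// which includes metadata endpoints such as 169.254.169.254.
fn is_link_local(octets: [u8; 4]) -> bool {
    Ipv4Addr::from(octets).is_link_local()
}

/// Hypothetical post-change default: only loopback and private (RFC 1918)
/// addresses skip the managed proxy. Link-local is deliberately absent.
fn stays_direct(octets: [u8; 4]) -> bool {
    let addr = Ipv4Addr::from(octets);
    addr.is_loopback() || addr.is_private()
}
```

Under this sketch, a request to `169.254.169.254` is link-local but not direct, so it would reach the proxy and be subject to its policy.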
stefanstokic-oai	4c68bd728f	External agent session support (#19895 ) ## Summary This extends external agent detection/import beyond config artifacts so Codex can detect recent sessions files from the external agent home and import them into Codex rollout history. ## What changed - Added a focused `external_agent_sessions` module for: - session discovery - source-record parsing - rollout construction - import ledger tracking - Wired session detection/import into the app-server external agent config API. - Added compaction handling so large imported sessions can be resumed safely before the first follow-up turn. ## Testing Added coverage for: - recent-session detection - custom-title handling - recency filtering - dedupe and re-detect-after-source-change behavior - visible imported turn construction - backward-compatible import payload deserialization - end-to-end RPC import flow - rejection of undetected session paths - repeat-import behavior - large-session compaction before first follow-up Ran: - `cargo test -p codex-app-server external_agent_config_import_ --test all`	2026-04-28 17:42:36 +00:00
Felipe Coury	a036584104	fix(tui): let esc exit empty shell mode (#19986 ) ## Summary - exit shell mode when `Esc` is pressed while the absorbed `!` is the only input - add direct regression coverage plus a composer snapshot for the restored normal prompt state ## Root cause Shell mode stores the leading `!` outside the editable textarea. After typing only `!`, the textarea is empty but the composer is still in bash mode, so the existing empty-composer `Esc` handling never runs. ## Validation - `just fmt` - `cargo test -p codex-tui bottom_pane::chat_composer::tests::esc_exits_empty_shell_mode` - `cargo test -p codex-tui bottom_pane::chat_composer::tests::footer_mode_snapshots` - `cargo insta pending-snapshots` `cargo test -p codex-tui` still reports unrelated existing `/status` snapshot drift in this local environment because the rendered permissions text is `workspace-write with network access` instead of the older `read-only` fixture text.	2026-04-28 14:35:24 -03:00
canvrno-oai	bc5a1b961e	Move local /resume cwd filtering into thread/list (#19931 ) Move local resume and fork cwd filtering to `thread/list` instead of filtering in the TUI. This makes the `/resume` menu feel slightly faster to load when working in repos with many historical threads, and centralizes the cwd filtering in app-server. Affected: - /resume from inside the TUI. - codex resume with no session ID and without --last - codex resume --all - codex fork with no session ID and without --last - codex fork --all Not affected: - codex resume <id> - codex fork <id> - codex resume --last - codex fork --last Steps to test performance improvement in a real Codex environment: - Launch `codex resume` using compiled binary in a directory that has seen many threads. - Launch `codex resume` using release binary in same directory. - Observe difference in time-to-full-page as threads load.	2026-04-28 10:35:10 -07:00
Felipe Coury	c6bcd27832	feat(tui): suggest plan mode from composer drafts (#19901 ) ## Summary - suggest Plan mode when the current composer draft contains the standalone word `plan` - shares the Codex App heuristics for detection - excludes things like `/plan` and the word `plan` in shell mode - reuse the existing `Shift+Tab` mode cycle and add thread-scoped dismissal with `Esc` - replace the normal footer hint while the reminder is visible so the statusline stays anchored https://github.com/user-attachments/assets/01123ae8-cee6-4e95-b563-44655c071cde ## Why The desktop app already nudges users toward Plan mode when their draft clearly signals planning intent. The TUI had the underlying `/plan` and `Shift+Tab` flows, but no equivalent reminder at the moment the user was most likely to benefit from them. ## Details The reminder is shown only when Plan mode is available, the draft contains standalone `plan`, the user is not already in Plan mode, the composer is actionable, and the current thread has not dismissed the reminder. Slash-command and shell-command drafts are excluded. The first implementation used an extra composer row, but that moved the statusline whenever the heuristic fired. This version keeps the layout stable by rendering the reminder in the existing footer row instead. ## Validation - `INSTA_UPDATE=always cargo test -p codex-tui chatwidget::tests::plan_mode::plan_mode_nudge -- --nocapture` - `just fmt` - `just fix -p codex-tui` - `./tools/argument-comment-lint/run.py -p codex-tui` - `cargo insta pending-snapshots` - `git diff --check`	2026-04-28 14:34:10 -03:00
maja-openai	273c2e21a9	Clarify network approval auto-review prompts (#19907 ) ## Why Network access approval prompts were showing the generic retry reason, which made auto-review focus on the blocked connection instead of the command that caused it. This makes network approvals easier to assess by telling the reviewer to evaluate whether the triggering command was authorised by the user and within policy, and to treat the network call as acceptable when it is a reasonable consequence of that command. ## What changed - Split guardian approval request prompt rendering so `NetworkAccess` has a dedicated branch. - For network requests, show `Network approval context` and `Network access JSON` instead of `Retry reason` / `Planned action JSON`. - Added regression coverage for the network approval prompt wording and for omitting retry reason in this case. ## Verification - `cargo test -p codex-core guardian::tests::build_guardian_prompt_items_explains_network_access_review_scope`	2026-04-28 10:25:37 -07:00
mchen-oai	01de13b7e6	Record MCP result telemetry on mcp.tools.call spans (#19509 ) ## Why - Without change: MCP tool call spans include request-side details such as server, tool, call ID, connector, session, and turn. - Issue: Some useful telemetry is only known by the MCP server after it handles the tool call, such as target identity or whether the call triggered a user-facing flow. ## What Changed - With change: Codex reads allowlisted telemetry from `_meta["codex/telemetry"]["span"]` and records it on the `mcp.tools.call` span. - Adds span fields for `codex.mcp.target.id` and `codex.mcp.user_flow.triggered`, with strict type checks and bounded target ID length. ## Verification `codex-rs/core/src/mcp_tool_call_tests.rs`	2026-04-28 17:20:38 +00:00
evawong-oai	0670d8971a	Enforce workspace metadata protections in Seatbelt (#19847 ) ## Summary Translate FileSystemSandboxPolicy project root metadata carveouts into macOS Seatbelt rules. ## Scope 1. Thread protected metadata names into Seatbelt access roots. 2. Ask FileSystemSandboxPolicy whether each metadata carveout is writable. 3. Emit Seatbelt deny rules that block creating or replacing protected metadata names under writable roots. 4. Add coverage for first time metadata creation and read only carveouts. ## Reviewer Focus 1. This PR only covers the macOS sandbox adapter. 2. The policy decision comes from FileSystemSandboxPolicy. 3. Read only subpath carveouts and metadata protection checks should compose cleanly. ## Stack 1. Policy primitive: #19846 2. macOS Seatbelt adapter: this PR 3. Shell preflight UX: #19848 4. Runtime profile propagation: #19849 5. Linux bubblewrap adapter: #19852 ## Validation 1. formatting for codex sandboxing 2. codex sandboxing package tests	2026-04-28 10:13:00 -07:00
efrazer-oai	f6797c3ac6	feat: verify agent identity JWTs with JWKS (#19764 )	2026-04-28 09:56:20 -07:00
colby-oai	6138063656	Strip connector provenance metadata from custom MCP tools (#19875 ) # Summary This prevents non-codex_apps MCP servers from spoofing connector provenance metadata.	2026-04-28 12:43:26 -04:00
mchen-oai	ccec84b148	Add turn start timestamp to turn metadata (#19473 ) ## Why - Without change: MCP tool calls receive `_meta["x-codex-turn-metadata"]` with `session_id` and `turn_id`. - Issue: MCP servers may want the turn start timestamp to measure internal latency relative to turn start. ## What Changed - With change: turn metadata now includes `turn_started_at_unix_ms`, which is propagated to MCP tool calls in `_meta["x-codex-turn-metadata"]`. ## Verification - `codex-rs/core/src/mcp_tool_call_tests.rs` - `codex-rs/core/src/turn_metadata_tests.rs` - `codex-rs/core/src/turn_timing_tests.rs` - `codex-rs/core/tests/responses_headers.rs` - `codex-rs/core/tests/suite/search_tool.rs`	2026-04-28 16:36:59 +00:00
Eric Traut	4e0cf945b7	Terminate stdio MCP servers on shutdown to avoid process leaks (#19753 ) ## Why Several bug reports describe thread shutdown (including subagent threads) leaving stdio MCP server processes behind. These reports all point at the same lifecycle gap: Codex launches stdio MCP servers, but the session-level shutdown path does not explicitly close MCP clients or terminate the server process tree. Fixes #12491 Fixes #12976 Fixes #18881 Fixes #19469 ## History This is best understood as a regression/coverage gap in MCP session lifecycle management, not as stdio MCP cleanup being absent all along. #10710 added process-group cleanup for stdio MCP servers, but that cleanup only runs when the `RmcpClient`/transport is dropped. The older reports (#12491 and #12976) came after that cleanup existed, which suggests the remaining problem was that some higher-level shutdown paths kept the MCP manager alive or replaced it without explicitly draining clients. The newer reports (#18881 and #19469) exposed the same family around manager replacement and shutdown. ## What changed - Added an explicit stdio MCP process handle in `codex-rmcp-client` so local MCP servers terminate their process group and executor-backed MCP servers call the executor process terminator. - Added `RmcpClient::shutdown()` and manager-level MCP shutdown draining so session shutdown, channel-close fallback, MCP refresh, and connector probing stop owned MCP clients. - Added regression coverage that starts a stdio MCP server, begins an in-flight blocking tool call, shuts down the client, and asserts the server process exits. 
## Verification - `cargo test -p codex-rmcp-client` - `cargo test -p codex-mcp` - `just fix -p codex-rmcp-client` - `just fix -p codex-mcp` - `just fix -p codex-core` - Manual before/after validation with a temporary repro script: - Pre-fix binary from `HEAD^` (`fed0a8f4fa`): reproduced the leak with surviving MCP server and child PIDs, `survivors=[77583, 77592]`, `leaked=true`. - Post-fix binary from this branch (`67e318148b`): verified both MCP processes were gone after interrupting `codex exec`, `survivors=[]`, `leaked=false`.	2026-04-28 09:29:57 -07:00
Eric Traut	087c9c1f1f	TUI: use cumulative turn duration for worked-for separator (#19929 ) ## Why Fixes #19814. The TUI's current `Worked for ...` timing behavior is a leftover from #9599. At that point, models could emit multiple assistant messages in one turn for preambles/commentary, but the TUI did not yet have a reliable signal that an assistant message was the final answer when it started streaming. To avoid showing an ever-growing elapsed time on each preamble separator, #9599 made the separator timer incremental by tracking elapsed time since the previous separator. That workaround is no longer the right model for the final completed-turn display. Since then, #16638 added protocol-native turn timing, including `duration_ms` on turn completion. With that cumulative duration available at the point where the TUI renders the completed-turn separator, the UI can show the actual turn duration directly instead of carrying per-separator timing state. ## What Changed - Thread `duration_ms` into `ChatWidget::on_task_complete` from both legacy `TurnCompleteEvent` handling and app-server `TurnCompleted` notifications. - Use `duration_ms` for the final `Worked for ...` separator, falling back to the status indicator timer only when the protocol duration is unavailable. - Keep mid-turn separators before later assistant text as plain visual dividers instead of clocked `Worked for ...` separators. - Remove the old incremental separator timer state and helper (`last_separator_elapsed_secs` / `worked_elapsed_from`). - Add a snapshot regression test for a turn that runs a command and then completes with a final answer, verifying the final separator uses the cumulative turn duration. ## Verification - `cargo test -p codex-tui final_worked_for_uses_cumulative_turn_duration_snapshot` - `just fix -p codex-tui` Manual repro prompt: ```text Manual timing repro. First send a short preamble/commentary sentence before using tools. 
Then run exactly this shell command: sleep 75; echo MANUAL_TIMING_DONE. After the command finishes, give a final answer that says "done". Do not skip the preamble. ``` After this change, the mid-turn break before the final answer should be a plain divider, and the final completed-turn separator should show `Worked for ...` using the cumulative turn duration. Before: <img width="414" height="102" alt="Screenshot 2026-04-27 at 10 09 01 PM" src="https://github.com/user-attachments/assets/b9e2ce01-2460-40e4-a5c4-c9ba8add2557" /> After: <img width="485" height="149" alt="Screenshot 2026-04-27 at 10 09 07 PM" src="https://github.com/user-attachments/assets/d24089ae-d4e2-41b6-b966-07c98706ead4" />	2026-04-28 09:24:29 -07:00
jif-oai	5b7d6f5c4f	feat: house-keeping memories 3 (#20005 ) Move stuff in memories, no behavioural change expected	2026-04-28 18:13:35 +02:00
evawong-oai	0156b1e61f	[sandbox] Enforce protected workspace metadata paths (#19846 ) ## Summary Make FileSystemSandboxPolicy the semantic source of truth for project root metadata protection. Under writable roots, `.git`, `.codex`, and `.agents` stay protected unless user policy grants an explicit write rule for that metadata path. ## Scope 1. Add `protected_metadata_names` to `WritableRoot`. 2. Teach `FileSystemSandboxPolicy::can_write_path_with_cwd` to reject protected metadata writes under writable roots unless explicitly allowed. 3. Default workspace write profiles to protect `.git`, `.codex`, and `.agents`. 4. Add the Linux fallback setup needed before Linux enforcement lands later in the stack. ## Reviewer Focus 1. The policy decision belongs in FileSystemSandboxPolicy, not shell command parsing. 2. Legacy SandboxPolicy remains a compatibility projection, not the source of the new rule. 3. Explicit user write rules can still opt into these metadata paths. ## Stack 1. Policy primitive: this PR 2. macOS Seatbelt adapter: #19847 3. Shell preflight UX: #19848 4. Runtime profile propagation: #19849 5. Linux bubblewrap adapter: #19852 ## Validation 1. codex protocol permissions tests 2. formatting for codex protocol and codex linux sandbox 3. diff whitespace check	2026-04-28 09:10:41 -07:00
Felipe Coury	5e737372ee	feat(tui): add configurable keymap support (#18593 ) ## Why The TUI currently handles keyboard shortcuts as hard-coded event matches spread across app, composer, pager, list, approval, and navigation code. That makes shortcuts hard to customize, makes displayed hints easy to drift from actual behavior, and makes future keymap work riskier because there is no central action inventory. This PR adds the foundation for configurable, action-based keymaps without adding the interactive remapping UI yet. Onboarding intentionally stays on fixed startup shortcuts because users cannot reasonably configure keymaps before completing onboarding. This is PR1 in the keymap stack: - PR1: #18593: configurable keymap foundation - PR2: #18594: `/keymap` picker and guided remapping UI - PR3: #18595: Vim composer mode and the remap option ## Design Notes The new model resolves named actions into concrete runtime bindings once from config, then passes those bindings to the UI surfaces that handle input or render shortcut hints. The main concepts are: - Context: a scope where an action is active, such as `global`, `chat`, `composer`, `editor`, `pager`, `list`, or `approval`. - Action: a named operation inside a context, such as `global.open_transcript`, `composer.submit`, or `pager.close`. - Binding: one or more single-key shortcuts assigned to an action, written as config strings such as `ctrl-t`, `alt-backspace`, or `page-down`. Multi-step sequences such as `ctrl-x ctrl-s`, `g g`, or leader-key flows are not part of this PR. - Resolution order: context-specific config wins first, supported global fallbacks come next, and built-in defaults fill in anything unset. - Explicit unbinding: an empty array removes an action binding in that scope and does not fall through to a fallback binding. - Conflict validation: a resolved keymap rejects duplicate active bindings inside the same scope so one keypress cannot dispatch two actions. 
## What Changed - Added `TuiKeymap` config support under `[tui.keymap]`, including typed contexts/actions, key alias normalization, generated schema coverage, and user-facing config errors. - Added `RuntimeKeymap` resolution in `codex-rs/tui/src/keymap.rs`, including fallback precedence, built-in defaults, explicit unbinding, and per-context conflict validation. - Rewired existing TUI handlers to consume resolved keymap actions instead of directly matching hard-coded keys in each component. - Updated key hint rendering and footer/pager/list surfaces so displayed shortcuts follow the resolved keymap. - Kept onboarding shortcuts fixed in `codex-rs/tui/src/onboarding/keys.rs` instead of exposing them through `[tui.keymap]`. ## Validation The branch includes focused coverage for config parsing, key normalization, runtime fallback resolution, explicit unbinding, duplicate-key conflict validation, default keymap consistency, onboarding startup key behavior, and UI hint snapshots affected by resolved key bindings.	2026-04-28 12:52:25 -03:00
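The resolution order, explicit unbinding, and conflict validation described in the design notes can be sketched as below. This is a simplified model under assumptions: the `Binding` alias and both function names are hypothetical and do not reflect the real `RuntimeKeymap` API in `codex-rs/tui/src/keymap.rs`.

```rust
use std::collections::HashSet;

/// `None` means "unset, fall through"; `Some(vec![])` means "explicitly unbound".
type Binding = Option<Vec<String>>;

/// Context-specific config wins, then the global fallback, then built-in defaults.
/// An explicit empty list at any level stops the fall-through.
fn resolve_action(context: &Binding, global: &Binding, default: &[&str]) -> Vec<String> {
    match (context, global) {
        (Some(keys), _) => keys.clone(),
        (None, Some(keys)) => keys.clone(),
        (None, None) => default.iter().map(|k| k.to_string()).collect(),
    }
}

/// Returns the first key that appears more than once among the resolved
/// bindings of a single context, if any.
fn find_conflict(resolved: &[(&str, Vec<String>)]) -> Option<String> {
    let mut seen = HashSet::new();
    for (_action, keys) in resolved {
        for key in keys {
            if !seen.insert(key.clone()) {
                return Some(key.clone());
            }
        }
    }
    None
}
```

Modelling "unset" and "unbound" as different values is what lets an empty array remove a binding without silently inheriting the fallback.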
Eric Traut	a61c785040	Reset TUI keyboard reporting on exit (#19625 ) ## Why Codex enables enhanced keyboard reporting while the TUI owns the terminal. In iTerm2, exiting the TUI with Ctrl+C can intermittently leave the parent shell receiving raw CSI-u / `modifyOtherKeys` fragments instead of normal key input. Final terminal cleanup should put the parent shell back into normal keyboard reporting even if the terminal misses the usual stack pop. Fixes #19553. ## What Changed - Move TUI keyboard enhancement setup and detection into `tui/src/tui/keyboard_modes.rs`. - Add an exit-only `restore_after_exit()` path that performs the normal keyboard enhancement pop plus unconditional keyboard enhancement and `modifyOtherKeys` resets. - Keep temporary restore paths, such as external-editor handoff, using the balanced stack pop behavior. ## Confidence Medium. This is a speculative fix: I was not able to reproduce the reported iTerm2 behavior manually, but the symptoms line up with terminal keyboard reporting state surviving Codex exit. The added reset sequences are scoped to final TUI shutdown and should be harmless when the terminal is already clean.	2026-04-28 08:51:44 -07:00
friel-openai	598bbcdb58	Preserve assistant phase for replayed messages (#19832 )	2026-04-28 08:46:13 -07:00
jif-oai	21e19912e0	feat: house-keeping memories 2 (#20000 ) Just move metrics in a dedicated file	2026-04-28 17:26:44 +02:00
jif-oai	5a79dfab7c	feat: house-keeping memories 1 (#19998 ) Just move metrics in a dedicated file	2026-04-28 17:11:49 +02:00
jif-oai	1b74360365	feat: skip memory startup when Codex rate limits are low (#19990 ) ## Why Memory startup runs in the background after an eligible turn, but it can consume Codex backend quota at exactly the wrong time: when the user is already near a rate-limit boundary. This PR adds a guard so the memory pipeline backs off when the Codex rate-limit snapshot says the remaining budget is too low. ## What Changed - Added `memories.min_rate_limit_remaining_percent` with a default of `25`, clamped to `0..=100`, and regenerated `core/config.schema.json`. - Added `codex-rs/memories/write/src/guard.rs`, which fetches Codex backend rate limits before memory startup and skips phase 1 / phase 2 when the Codex limit is reached or either tracked window is above the configured usage ceiling. - Keeps startup best-effort: non-Codex auth or rate-limit fetch/client failures preserve the existing memory startup behavior. - Records a `codex.memory.startup` counter with `status=skipped_rate_limit` when startup is skipped. - Added config parsing/clamping coverage and guard unit tests. ## Verification - Added `codex-rs/memories/write/src/guard_tests.rs` for threshold, primary/secondary window, and reached-limit behavior. - Added config tests for TOML parsing and clamping.	2026-04-28 17:07:16 +02:00
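A rough sketch of the guard logic described above, assuming that the "usage ceiling" is `100 - min_rate_limit_remaining_percent` and that each window reports a used percentage. The `Snapshot` struct, its field names, and both function names are hypothetical; the real implementation lives in `codex-rs/memories/write/src/guard.rs` and may differ.

```rust
/// Default for `memories.min_rate_limit_remaining_percent`, per the commit message.
const DEFAULT_MIN_REMAINING_PERCENT: i64 = 25;

/// Clamp a raw config value into `0..=100`.
fn clamp_min_remaining(raw: i64) -> u8 {
    raw.clamp(0, 100) as u8
}

/// Hypothetical rate-limit snapshot; a window is `None` when it is not tracked.
struct Snapshot {
    limit_reached: bool,
    primary_used_percent: Option<f64>,
    secondary_used_percent: Option<f64>,
}

/// Skip memory startup when the limit is reached or either tracked
/// window has used more than the allowed ceiling.
fn should_skip(snapshot: &Snapshot, min_remaining: u8) -> bool {
    let ceiling = 100.0 - f64::from(min_remaining);
    snapshot.limit_reached
        || [snapshot.primary_used_percent, snapshot.secondary_used_percent]
            .iter()
            .flatten()
            .any(|used| *used > ceiling)
}
```

With the default of 25, a window at 80% used would exceed the 75% ceiling and cause startup to be skipped, while a failed snapshot fetch is handled elsewhere and keeps the existing behavior.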
efrazer-oai	0e8d6b8765	fix: configure AgentIdentity AuthAPI base URL (#19904 ) ## Summary AgentIdentity runtime loading currently registers tasks against a single hardcoded AuthAPI base URL. That works for production, but local and staging validation may need registration to target a different authapi-login-provider without baking internal staging service URLs into the OSS binary. This PR adds a small config surface for `agent_identity_authapi_base_url` and threads it through the existing auth-loading path as a direct argument. Explicit config wins. Without config, task registration keeps using the production AuthAPI URL, matching the current default behavior. ## Stack 1. openai/codex#19762 - `refactor: make auth loading async` (merged) 2. openai/codex#19763 - `refactor: load agent identity runtime eagerly` 3. This PR - `fix: configure AgentIdentity AuthAPI base URL` 4. openai/codex#19764 - `feat: verify agent identity JWTs with JWKS` ## Design decisions - Keep the existing auth-loading shape and pass the new value as an argument. This avoids another wrapper loader and keeps the call path readable. - Add config instead of embedding internal staging URLs. Environments that need a non-production AuthAPI can configure it explicitly. - Keep the default AuthAPI registration URL as production. `chatgpt_base_url` remains separate and is used by the follow-up JWKS verification PR for fetching public keys from the ChatGPT backend route. - Resolve the AuthAPI base URL inside AgentIdentity loading, because task registration is the only consumer of this value. ## Testing Tests: targeted Rust checks, AgentIdentity auth tests, config schema regeneration, formatter/fix pass, and whitespace diff check.	2026-04-28 08:06:45 -07:00
jif-oai	a9e5c34083	feat: trigger memories from user turns with cooldown (#19970 ) ## Why Memory startup was tied to thread lifecycle events such as create, load, and fork. That can run memory work before a thread receives real user input, and it makes startup cost scale with thread management instead of actual turns. Moving the trigger to `thread/sendInput` keeps memory startup aligned with the first real user turn and lets it use the current thread config at turn time. The idea is to prevent ghost cost due to pre-warm triggered by the app. Turn-based startup can also make global phase-2 consolidation easier to request repeatedly, so this adds a success cooldown and tightens the default startup scan window. ## What Changed - Start `codex_memories_write::start_memories_startup_task` after a non-empty `thread/sendInput` turn is submitted, instead of from thread create/load/fork paths: `d4a6885b78/codex-rs/app-server/src/codex_message_processor.rs (L6477-L6487)` - Expose `CodexThread::config()` so app-server can pass the live config into memory startup at turn time. - Add a six-hour successful-run cooldown for global phase-2 consolidation via `SkippedCooldown`: `d4a6885b78/codex-rs/state/src/runtime/memories.rs (L963-L966)` - Reduce memory startup defaults to at most 2 rollouts over 10 days: `d4a6885b78/codex-rs/config/src/types.rs (L31-L34)` ## Verification Updated the memory runtime coverage around phase-2 reclaim behavior, including `phase2_global_lock_respects_success_cooldown`. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-28 16:23:13 +02:00
jif-oai	fa127be25f	Stabilize memory Phase 2 input ordering (#19967 ) ## Why Phase 2 still needs to choose the most relevant stage-1 memory outputs by usage and recency, but exposing that ranking as the rendered `raw_memories.md` order creates unnecessarily large diffs. Usage-count or timestamp changes can reshuffle otherwise unchanged memories, making the workspace diff noisy and giving the consolidation prompt a misleading recency signal from file position. This fix will reduce token consumption. ## What Changed - Keep the existing top-N Phase 2 selection ranking by `usage_count`, `last_usage`, `source_updated_at`, and `thread_id`. - Return the selected rows in stable ascending `thread_id` order before syncing Phase 2 filesystem inputs. - Update the memory README, raw memories header, and consolidation prompt so they describe the stable order and tell the prompt to use metadata and workspace diffs instead of file order as the recency signal. - Adjust the memory runtime tests to use deterministic thread IDs and assert the stable return order separately from the ranked selection semantics. ## Test Coverage - Existing memory runtime tests in `codex-rs/state/src/runtime/memories.rs` now cover the stable returned ordering for Phase 2 inputs. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-28 13:32:05 +02:00
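The split between ranked selection and stable output order in #19967 can be sketched as follows. The field names come from the commit message; the `Stage1Output` struct, the `select_phase2_inputs` function, the integer timestamps, and the sort directions (higher usage and newer timestamps rank first) are assumptions for illustration, not the repository's query.

```rust
/// Hypothetical row type; field names follow the commit message.
#[derive(Debug, Clone, PartialEq)]
struct Stage1Output {
    thread_id: String,
    usage_count: u64,
    last_usage: u64,        // assumed: Unix seconds
    source_updated_at: u64, // assumed: Unix seconds
}

/// Pick the top `n` rows by rank, then return them in ascending `thread_id` order.
fn select_phase2_inputs(mut rows: Vec<Stage1Output>, n: usize) -> Vec<Stage1Output> {
    // Ranking (assumed directions): most used, then most recent; thread_id breaks ties.
    rows.sort_by(|a, b| {
        b.usage_count
            .cmp(&a.usage_count)
            .then(b.last_usage.cmp(&a.last_usage))
            .then(b.source_updated_at.cmp(&a.source_updated_at))
            .then(a.thread_id.cmp(&b.thread_id))
    });
    rows.truncate(n);
    // Stable presentation order: rank changes among the selected rows no longer reshuffle output.
    rows.sort_by(|a, b| a.thread_id.cmp(&b.thread_id));
    rows
}

/// Small constructor used by the demo below.
fn row(thread_id: &str, usage_count: u64) -> Stage1Output {
    Stage1Output {
        thread_id: thread_id.to_string(),
        usage_count,
        last_usage: 0,
        source_updated_at: 0,
    }
}

fn main() {
    let picked = select_phase2_inputs(vec![row("a", 1), row("b", 5), row("c", 3)], 2);
    let ids: Vec<&str> = picked.iter().map(|r| r.thread_id.as_str()).collect();
    println!("{:?}", ids);
}
```

With this shape, a usage-count bump on an already selected row changes nothing in the returned order, which is the property the commit relies on to keep `raw_memories.md` diffs small.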
jif-oai	54d1401170	feat: fix hinting 3 (#19963 ) Fix https://github.com/openai/codex/pull/19805#discussion_r3153265562	2026-04-28 13:12:51 +02:00
jif-oai	b7c0f26910	feat: fix hinting 2 (#19961 ) Fix this: https://github.com/openai/codex/pull/19805#discussion_r3153265562	2026-04-28 13:06:41 +02:00
jif-oai	431ebeaef7	feat: split memories part 2 (#19860 ) Keep extracting memories out of core and move the write trigger into the app-server. This is temporary; the trigger should move to the client level as a follow-up. This makes core fully independent of `codex-memories-write`. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-28 13:03:28 +02:00
jif-oai	fd36838cf3	Add MultiAgentV2 root and subagent context hints (#19805 ) ## Why MultiAgentV2 sessions need startup guidance that matches the role of the thread that is actually being created. Root agents and subagents have different responsibilities, and forked subagents can inherit parent rollout history. If the parent hint is carried into the child context, the child can see stale or conflicting developer guidance before its own session-specific context is added. ## What changed - Added `features.multi_agent_v2.root_agent_usage_hint_text` and `features.multi_agent_v2.subagent_usage_hint_text` config fields, including schema/config parsing support. - Injected the matching root or subagent hint into the initial context as its own developer message when `multi_agent_v2` is enabled. - Filtered configured MultiAgentV2 usage-hint developer messages out of forked parent history so a child thread receives fresh guidance for its own session source/config. - Added targeted coverage for config parsing, initial-context rendering, feature-config deserialization, and forked-history filtering. ## Context examples With this config: ```toml [features.multi_agent_v2] enabled = true root_agent_usage_hint_text = "Root guidance." subagent_usage_hint_text = "Subagent guidance." ``` A root thread initial context renders the root hint as a standalone developer message: ```text [developer] <existing developer context, when present> [developer] Root guidance. ``` A subagent thread initial context renders the subagent hint instead: ```text [developer] <existing developer context, when present> [developer] Subagent guidance. ``` When a subagent forks parent history, any parent developer message whose text exactly matches the configured MultiAgentV2 root or subagent hint is omitted from the forked history before the child receives its fresh subagent hint.	2026-04-28 12:31:45 +02:00
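The forked-history filtering described in #19805 amounts to dropping developer messages whose text exactly matches a configured hint. A minimal sketch follows; the `Role` and `Message` types and the `filter_usage_hints` function are hypothetical stand-ins and do not reflect the actual types in the codebase.

```rust
/// Hypothetical message roles for illustration.
#[derive(Debug, Clone, PartialEq)]
enum Role {
    Developer,
    User,
}

/// Hypothetical history item.
#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: Role,
    text: String,
}

/// Remove developer messages whose text exactly equals one of the configured hints.
/// Other roles, and developer messages with different text, are kept unchanged.
fn filter_usage_hints(history: Vec<Message>, hints: &[&str]) -> Vec<Message> {
    history
        .into_iter()
        .filter(|m| !(m.role == Role::Developer && hints.iter().any(|h| m.text == *h)))
        .collect()
}

fn main() {
    let parent = vec![
        Message { role: Role::Developer, text: "Root guidance.".to_string() },
        Message { role: Role::User, text: "Root guidance.".to_string() },
        Message { role: Role::Developer, text: "Other context.".to_string() },
    ];
    let forked = filter_usage_hints(parent, &["Root guidance.", "Subagent guidance."]);
    println!("{} messages kept", forked.len());
}
```

Exact matching means a user message that happens to repeat the hint text is preserved, since only developer messages are candidates for removal in this sketch.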
xli-oai	803705f795	Add remote plugin uninstall API (#19456 ) ## Summary - Adds the remote `plugin/uninstall` request form using required `pluginId` plus optional `remoteMarketplaceName`, while preserving local `pluginId` uninstall. - Adds `codex_core_plugins::remote::uninstall_remote_plugin` for the deployed ChatGPT plugin backend uninstall path and validates the backend returns the same id with `enabled: false`. - Routes app-server remote uninstall through feature checks, remote plugin id validation, backend mutation, local downloaded cache deletion, cache clearing, docs, and regenerated protocol schemas. ## Tests - `just write-app-server-schema` - `just fmt` - `cargo test -p codex-app-server-protocol plugin_uninstall_params_serialization_omits_force_remote_sync` - `cargo test -p codex-app-server plugin_uninstall --test all` - `cargo test -p codex-app-server plugin_uninstall` - `cargo build -p codex-cli` - `CODEX_BIN=/Users/xli/code/codex/codex-rs/target/debug/codex python3 /Users/xli/.codex/skills/xli-test-marketplace-api/scripts/run_marketplace_api_matrix.py` (44 pass / 0 fail) - `just fix -p codex-app-server-protocol -p codex-app-server -p codex-tui` - `just fix -p codex-app-server`	2026-04-28 03:27:53 -07:00
xl-openai	7d72fc8f53	feat: Cache remote plugin bundles on install (#19914 ) Remote installs now fetch, validate, download, and cache the plugin bundle locally	2026-04-28 00:53:27 -07:00
Eric Traut	b985768dc1	Add `codex update` command (#19933 ) ## Why Addresses #9274 Running `codex update` currently starts an interactive Codex session with `update` as the prompt. That is a rough edge for users who expect a direct self-update command after seeing the existing update notice, and it forces them to copy the suggested package-manager command manually. ## What changed - Added a top-level `codex update` subcommand. - Reused the existing install-channel detection and update command runner that the TUI already uses for update prompts. - Exposed the update-action lookup from `codex-tui` so the CLI can invoke the same behavior. - Added CLI coverage to ensure `codex update` is parsed as a subcommand instead of becoming an interactive prompt. ## Verification - `cargo test -p codex-cli` - `cargo test -p codex-tui update_action::tests`	2026-04-27 23:33:59 -07:00
Michael Bolin	0a32c8b396	app-server-protocol: mark permission profiles experimental (#19899 ) ## Why `PermissionProfile` is now the canonical internal permissions representation, but the app-server wire shape is still intentionally unstable while the migration continues. Stable app-server clients should not see or generate code for these fields until the wire format settles. ## What changed - Marks every app-server v2 field that sends `PermissionProfile` as experimental, including `command/exec`, `thread/start`, `thread/resume`, `thread/fork`, and `turn/start` request/response payloads. - Enables per-field experimental inspection for `command/exec`, so `permissionProfile` is gated without making the entire method experimental. - Fixes the generated TypeScript schema filter to be comment-aware. The previous scanner treated apostrophes inside doc comments as string delimiters, so some experimental fields leaked into stable TypeScript even though stable JSON was filtered correctly. ## Verification - `cargo test -p codex-app-server-protocol` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19899). * #19900 * __->__ #19899	2026-04-28 06:08:34 +00:00
Michael Bolin	341550c275	permissions: store thread sessions as profiles (#19776 ) ## Why After thread sessions have a required `PermissionProfile`, the TUI no longer needs to cache a separate legacy `SandboxPolicy` in `ThreadSessionState`. Keeping the legacy field would reintroduce two permission authorities in the session cache and make later replay/switching logic easier to get wrong. This PR keeps legacy app-server compatibility at the ingestion boundary: old `sandbox` response values are still accepted, but they are immediately converted to a cwd-anchored profile. ## What Changed - Removes `ThreadSessionState.sandbox_policy`. - Updates active-session permission syncing to write only the current `PermissionProfile`. - Updates thread-read/replay/test fixtures to use profiles as the cached session permission source. - Leaves legacy `sandbox` fields in app-server request/response protocol paths unchanged; those are compatibility boundaries and are converted before entering cached TUI state. ## Verification - `cargo test -p codex-tui thread_session_state::tests --lib` - `cargo test -p codex-tui inactive_thread_started_notification_initializes_replay_session --lib` - `cargo test -p codex-tui thread_events --lib` - `just fix -p codex-tui` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19776). * #19900 * #19899 * __->__ #19776	2026-04-28 05:49:58 +00:00
Eric Traut	92fb848065	Allow large remote app-server resume responses (#19920 ) ## Why Remote TUI resume uses the app-server websocket client. That client inherited tungstenite's default `16 MiB` frame limit, so a large saved session could make `thread/resume` return a single JSON-RPC response frame that the client rejected before the TUI could deserialize or render it. Fixes #19837 ## What Changed - Configure the remote app-server websocket client with a bounded `128 MiB` max frame/message size. - Preserve the concrete remote worker exit reason when completing pending requests after a transport/read failure instead of replacing it with a generic channel-closed error. - Add a regression test that sends a single `>16 MiB` JSON-RPC response frame and verifies the typed request succeeds. Note: This isn't a perfect fix. It really just moves the limit to a much larger value. I looked at a bunch of other potential fixes (both server-side and client-side), and they all involved significant complexity, had backward-compatibility impact, or impacted performance of common use cases. This simple fix should address the vast majority of remote use cases. ## Verification I reproed the problem locally using a long rollout. Verified that fix addresses connection drop.	2026-04-27 22:44:10 -07:00
Michael Bolin	fc2a69107c	permissions: derive snapshot sandbox projections (#19775 ) ## Why `ThreadConfigSnapshot` is used by app-server and thread metadata code as a stable view of active runtime settings. Keeping both `sandbox_policy` and `permission_profile` in the snapshot duplicates permission state and makes it possible for the legacy projection to drift from the canonical profile. The legacy `sandbox` value is still needed at app-server compatibility boundaries, so this PR derives it on demand from the snapshot profile and cwd instead of storing it. ## What Changed - Removes `ThreadConfigSnapshot.sandbox_policy`. - Adds `ThreadConfigSnapshot::sandbox_policy()` as a compatibility projection from `permission_profile` plus `cwd`. - Updates app-server response/metadata code and tests to call the projection only where legacy fields still exist. - Keeps snapshot construction profile-only so split filesystem rules, disabled enforcement, and external enforcement remain represented by the canonical profile. ## Verification - `cargo test -p codex-app-server thread_response_permission_profile_preserves_enforcement --lib` - `cargo test -p codex-core dispatch_reclaims_stale_global_lock_and_starts_consolidation --lib` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19775). * #19900 * #19899 * #19776 * __->__ #19775	2026-04-27 22:30:47 -07:00
Michael Bolin	bf38def44e	permissions: make SessionConfigured profile-only (#19774 ) ## Why `SessionConfiguredEvent` is the internal event that tells clients what permissions are active for a session. Emitting both `sandbox_policy` and `permission_profile` leaves two possible authorities and forces every consumer to decide which one to honor. At this point in the migration, the profile is expressive enough to represent managed, disabled, and external sandbox enforcement, so the internal event can be profile-only. The wire compatibility concern is older serialized events or rollout data that only contain `sandbox_policy`; those still need to deserialize. ## What Changed - Removes `sandbox_policy` from `SessionConfiguredEvent` and makes `permission_profile` required. - Adds custom deserialization so old payloads with only `sandbox_policy` are upgraded to a cwd-anchored `PermissionProfile`. - Updates core event emission and TUI session handling to sync permissions from the profile directly. - Updates app-server response construction to derive the legacy `sandbox` response field from the active thread snapshot instead of from `SessionConfiguredEvent`. - Updates yolo-mode display logic to treat both `PermissionProfile::Disabled` and managed unrestricted filesystem plus enabled network as full-access, while still preserving the distinction between no sandbox and external sandboxing. 
## Verification - `cargo test -p codex-protocol session_configured_event --lib` - `cargo test -p codex-protocol serialize_event --lib` - `cargo test -p codex-exec session_configured --lib` - `cargo test -p codex-app-server thread_response_permission_profile_preserves_enforcement --lib` - `cargo test -p codex-core session_configured_reports_permission_profile_for_external_sandbox --lib` - `cargo test -p codex-tui session_configured --lib` - `cargo test -p codex-tui yolo_mode_includes_managed_full_access_profiles --lib` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19774). * #19900 * #19899 * #19776 * #19775 * __->__ #19774	2026-04-27 22:06:47 -07:00
Eric Traut	5ba908d179	Avoid persisting ShutdownComplete after thread shutdown (#19630 ) ## Why Fixes #19475. `codex exec` can finish successfully and then emit an `ERROR` on stderr: ```text failed to record rollout items: thread <id> not found ``` That happens because shutdown closes the live thread writer before emitting `ShutdownComplete`. The terminal event was still using the normal `send_event_raw` path, so it tried to append rollout items through a recorder that had already been removed. The answer is correct, but wrappers that treat stderr as failure can retry completed exec runs. This looks like a likely recent regression from [#18882](https://github.com/openai/codex/pull/18882), which routed live thread writes through `ThreadStore` and added the shutdown-time live writer close. I have not bisected this, so the PR treats #18882 as the likely source based on the affected shutdown code path rather than a proven first-bad commit. ## What Changed `ShutdownComplete` now bypasses rollout persistence after thread shutdown and is delivered directly to clients. The shutdown path still records the protocol event in the rollout trace before delivery, preserving trace visibility without attempting a post-shutdown thread-store append. The change also adds a regression test with the in-memory thread store to assert that shutdown creates and shuts down the live thread without appending another item after shutdown.	2026-04-27 22:02:08 -07:00

5935 Commits