codex

mirror of https://github.com/openai/codex.git synced 2026-05-15 08:42:34 +00:00

Author	SHA1	Message	Date
xli-oai	4ecb4497b2	Remove workspace message created_at field	2026-05-05 23:12:40 -07:00
xli-oai	f9cd5bd631	Gate workspace headline polling	2026-05-05 23:12:40 -07:00
xli-oai	926c68e4f4	Add workspace headline polling client	2026-05-05 23:12:40 -07:00
viyatb-oai	9766d3d51c	fix(bwrap): emit libcap after standalone archive (#21285) ## Why #21255 added the standalone `codex-bwrap` binary. In the Cargo build, `pkg_config::probe("libcap")` (`codex-rs/bwrap/build.rs`, L37-L39 at `a736cb55a2`) emits `-lcap` before `cc::Build::compile("standalone_bwrap")` (`codex-rs/bwrap/build.rs`, L50-L67 at `a736cb55a2`) adds the static bwrap archive. The Linux musl link then sees `-lcap -lstandalone_bwrap`; because static archives are resolved left-to-right, `cap_from_name` is still undefined once `standalone_bwrap` introduces that reference. The musl setup already builds `libcap.a` and exposes it through `libcap.pc` (`.github/scripts/install-musl-build-tools.sh`, L78-L88 at `a736cb55a2`), so the failure is link ordering rather than a missing dependency. ## What changed - probe `libcap` with `cargo_metadata(false)` so `pkg-config` does not emit its link flags early - emit the discovered `libcap` search paths and libraries after `standalone_bwrap` is compiled, preserving the needed static-link order ## Verification - `cargo test -p codex-bwrap` - `cargo clippy -p codex-bwrap --all-targets` The affected Linux musl release link is exercised by CI, which is the path this fix targets.	2026-05-05 22:22:01 -07:00
Matthew Zeng	41505bcea2	[mcp] Return Accept early per feedback. (#21277) - [x] Return Accept early when auto_deny is enabled per feedback.	2026-05-05 21:23:42 -07:00
aaronl-openai	9f06d171e2	Preserve session MCP config on refresh (#21055 ) # Overview MCP refreshes were rebuilding active threads from fresh disk-backed config only, which dropped thread-start session overlays such as app-injected MCP servers. This keeps refreshes current with disk config while preserving the thread-local config that only the active thread knows about. # Changes - Rebuild refreshed config per active thread using that thread's current `cwd`, rather than fanning out one app-server config to every thread. - Preserve each thread's `SessionFlags` layer while replacing reloadable config layers with freshly loaded config, then derive the MCP refresh payload from the rebuilt result. - Move MCP refresh orchestration into app-server so manual refreshes fail loudly while background refreshes remain best-effort, and route plugin-triggered refreshes through the same per-thread reload path. - Add regression coverage for session overlays, fresh project config, plugin-derived MCP config, current requirements, and strict vs best-effort refresh behavior. # Verification - Passed focused Rust coverage for the thread-config rebuild behavior and deferred MCP refresh flow, plus `cargo test -p codex-app-server --lib`. - Verified end to end in the Codex dev app against the locally built CLI: registered an MCP via thread config, verified that it could be used successfully before refresh, manually triggered MCP refresh, and verified that it continued to be available afterward.	2026-05-05 21:09:28 -07:00
Andrei Eternal	8ef31894dc	app-server: align dynamic tool identifiers with Responses API (#20724 ) ## Why Codex currently accepts dynamic tool names and namespaces that the upstream Responses function-tool path does not actually support. In practice, that means app-server can register a dynamic tool successfully and only discover later that the LLM-facing tool contract will reject or mishandle it. This PR tightens the app-server-side dynamic tool contract to match the Responses API before we stack dynamic tool hook support on top of it. ## What changed - validate dynamic tool `name` against the Responses function-tool identifier contract: `^[a-zA-Z0-9_-]+$`, length `1..128` - validate dynamic tool `namespace` the same way, with the Responses namespace length limit `1..64` - reject namespaces that collide with the always-reserved Responses runtime namespaces such as `functions`, `multi_tool_use`, `file_search`, `web`, `browser`, `image_gen`, `computer`, `container`, `terminal`, `python`, `python_user_visible`, `api_tool`, `tool_search`, and `submodel_delegator` - escape invalid identifiers in error messages so control characters do not spill raw into logs or client-visible error text - document the tightened dynamic tool identifier contract in `codex-rs/app-server/README.md` - add both unit coverage for the validator and an app-server integration test that rejects a `thread/start` request with Responses-incompatible dynamic tool identifiers ## Verification - `cargo test -p codex-app-server validate_dynamic_tools_` - `cargo test -p codex-app-server --test all thread_start_rejects_dynamic_tools_not_supported_by_responses`	2026-05-05 21:05:00 -07:00
xl-openai	5119680f85	feat: Add plugin share access controls (#21124) Extends `plugin/share/save` to accept optional `discoverability` and `shareTargets` while uploading plugin contents, and adds `plugin/share/updateTargets` for share-only target updates without re-uploading.	2026-05-05 20:14:18 -07:00
rhan-oai	b3d4f1a9f0	[codex-analytics] rework thread_source for thread analytics (#20949 ) ## Summary - make `thread_source` an explicit optional thread-level field on `thread/start`, `thread/fork`, and returned thread payloads - persist `thread_source` in rollout/session metadata so resumed live threads retain the original value - replace the old best-effort `session_source` -> `thread_source` mapping with an explicit caller-supplied analytics classification ## Why Before this change, analytics `thread_source` was populated by a best-effort mapping from `session_source`. `session_source` describes the runtime/client surface, not the actual thread-level origin, so that projection was not accurate enough to distinguish cases such as `user`, `subagent`, `memory_consolidation`, and future thread origins reliably. Making `thread_source` explicit keeps one thread-level analytics field while letting callers provide the real classification directly instead of recovering it indirectly from `session_source`. ## Impact For new analytics events, `thread_source` now reflects the explicit thread-level classification supplied by the caller rather than an inferred value derived from `session_source`. Existing protocol fields remain optional; callers that omit `threadSource` now produce `null` instead of a best-effort inferred value. ## Validation - `just write-app-server-schema` - `cargo test -p codex-analytics -p codex-core -p codex-app-server-protocol --no-run` - `cargo test -p codex-app-server-protocol generated_ts_optional_nullable_fields_only_in_params` - `cargo test -p codex-analytics thread_initialized_event_serializes_expected_shape` - `cargo test -p codex-core resume_stopped_thread_from_rollout_preserves_thread_source`	2026-05-06 02:12:31 +00:00
Abdulrahman Alfozan	94db03d5af	Expose plugin manifest keywords in app server (#21271 ) ## Summary - Add plugin manifest keywords to core plugin marketplace/detail models - Expose keywords on app-server v2 PluginSummary and generated schema/types - Populate keywords in plugin/list and plugin/read responses for local plugins Depends on https://github.com/openai/openai/pull/891087 ## Validation - just fmt - just write-app-server-schema - cargo test -p codex-app-server-protocol - cargo test -p codex-core-plugins - cargo test -p codex-app-server plugin_list_keeps_valid_marketplaces_when_another_marketplace_fails_to_load - cargo test -p codex-app-server plugin_read_returns_plugin_details_with_bundle_contents	2026-05-06 02:09:05 +00:00
pakrym-oai	136e442e95	[codex] Remove legacy ListSkills op (#21282 ) ## Why `skills/list` is already exposed through app-server v2 and covered by the app-server test suite. Keeping the separate core `Op::ListSkills` path leaves a duplicate legacy protocol surface that no longer needs to be maintained. ## What Changed - Removed `Op::ListSkills` and `EventMsg::ListSkillsResponse` from the core protocol. - Deleted the corresponding core session handler and stale core integration tests. - Removed rollout/MCP ignore branches and protocol v1 docs references for the deleted event/op. - Left app-server `skills/list` and its existing coverage intact. ## Validation - `cargo test -p codex-protocol` - `cargo test -p codex-core --test all suite::skills` - `cargo check -p codex-mcp-server -p codex-rollout -p codex-rollout-trace` - `just fix -p codex-core`	2026-05-05 18:58:18 -07:00
pakrym-oai	024118625e	[codex] Remove unused ListModels op (#21276 ) ## Why The core protocol still exposed a `ListModels` submission op even though no client sends it and the core submission loop treated it as an ignored unknown op. Keeping the dead variant made the protocol surface look supported while the active model listing API is the app-server `model/list` JSON-RPC request. ## What Changed - Removed the unused `Op::ListModels` variant from `codex-rs/protocol`. - Removed its `Op::kind()` mapping. The existing app-server `model/list` endpoint is unchanged. ## Verification - `cargo test -p codex-protocol`	2026-05-06 01:57:17 +00:00
Michael Bolin	a736cb55a2	release/npm: bundle standalone bwrap on Linux (#21257)	2026-05-05 18:21:52 -07:00
iceweasel-oai	db22c91e61	Share Git safe-command logic on Windows (#21275 ) ## Why BUGB-15601 showed that the Windows safe-command path had drifted from the generic Git classifier. The Windows-specific Git parser could classify a PowerShell-wrapped `git` command as safe as soon as it found a safelisted subcommand, without applying the generic checks for unsafe subcommand options such as `--output`, `--ext-diff`, `--textconv`, `--paginate`, or `cat-file --filters`. The generic classifier already models the Git command boundary and the read-only argument checks more carefully, so Windows should reuse that logic instead of maintaining a smaller parallel parser. ## What Changed - Extracted the existing generic Git classification logic into `is_safe_git_command`. - Updated `windows_safe_commands.rs` to call that shared helper for parsed PowerShell `git` commands. - Removed the Windows-only Git subcommand safelist, including the `cat-file` allowance that was part of the reported bypass. - Added a Windows regression test that keeps PowerShell-wrapped Git commands with side-effecting options classified unsafe. - Made the full-path PowerShell test discover the installed PowerShell executable instead of depending on one hard-coded `pwsh.exe` path. ## Verification - `cargo test -p codex-shell-command rejects_git_subcommand_options_with_side_effects` - `cargo test -p codex-shell-command git_global_override_flags_are_not_safe` - `cargo test -p codex-shell-command windows_powershell_full_path_is_safe -- --nocapture` Co-authored-by: Codex <codex@openai.com>	2026-05-05 17:49:42 -07:00
mchen-oai	794c240f25	Add model and reasoning effort to MCP turn metadata (#21219 ) ## Why - Similar change as https://github.com/openai/codex/pull/19473. - Without change: MCP tool calls receive `_meta["x-codex-turn-metadata"]` with `session_id`, `turn_id`, and `turn_started_at_unix_ms`. - Issue: MCP servers may want the model and reasoning effort to better understand tool-call behavior and latency relative to turn start. ## What Changed - With change: MCP turn metadata now includes `model` and `reasoning_effort`, propagated in `_meta["x-codex-turn-metadata"]`. - Normal `/responses` turn metadata headers are unchanged. ## Verification - `codex-rs/core/src/mcp_tool_call_tests.rs` - `codex-rs/core/src/turn_metadata_tests.rs` - `codex-rs/core/tests/suite/search_tool.rs`	2026-05-05 17:37:48 -07:00
pakrym-oai	2c1a361a2e	[codex] Move thread naming to app server (#21260 ) ## Why Thread names are app-server metadata now, backed by the thread store and sqlite state database. Keeping a core `SetThreadName` op plus a rollout `thread_name_updated` event made rename persistence live in the wrong layer and required historical replay support for an event that new app-server flows should not write. ## What changed - Removed `Op::SetThreadName` and `EventMsg::ThreadNameUpdated` from the core protocol and deleted the core handler path that appended rename events to rollouts. - Updated app-server `thread/name/set` so both loaded and unloaded threads write through thread-store metadata and app-server emits `thread/name/updated` notifications. - Updated local thread-store name metadata updates to write sqlite title metadata and the legacy thread-name index without appending rollout events. - Removed state extraction and rollout handling for the deleted thread-name event. ## Validation - `cargo test -p codex-app-server thread_name_updated_broadcasts` - `cargo test -p codex-app-server thread_name_set_is_reflected_in_read_list_and_resume` - `cargo test -p codex-thread-store update_thread_metadata_sets_name_on_active_rollout_and_indexes_name` - `cargo test -p codex-state` - `cargo check -p codex-mcp-server -p codex-rollout-trace` - `just fix -p codex-app-server -p codex-thread-store -p codex-state -p codex-mcp-server -p codex-rollout-trace` ## Docs No external documentation update is expected for this internal ownership change.	2026-05-05 17:16:06 -07:00
Michael Bolin	3ec18a2c0a	release: publish standalone bwrap artifacts (#21256) Summary - Build Linux `bwrap` before the main release binaries. - Export the release `bwrap` SHA-256 as `CODEX_BWRAP_SHA256` so the Codex binary can verify the bundled fallback. - Sign, stage, and upload `bwrap` alongside the primary Linux release artifacts. Verification - YAML parse check for `.github/workflows/rust-release.yml` --- Stack created with Sapling (https://sapling-scm.com). Best reviewed with ReviewStack (https://reviewstack.dev/openai/codex/pull/21256). Stack: #21257, -> #21256 (this PR)	2026-05-05 17:15:46 -07:00
Michael Bolin	26f355b67b	linux-sandbox: use standalone bundled bwrap (#21255 ) Summary - Add `codex-bwrap`, a standalone `bwrap` binary built from the existing vendored bubblewrap sources. - Remove the linked vendored bwrap path from `codex-linux-sandbox`; runtime now prefers system `bwrap` and falls back to bundled `codex-resources/bwrap`. - Add bundled SHA-256 verification with missing/all-zero digest as the dev-mode skip value, then exec the verified file through `/proc/self/fd`. - Keep `launcher.rs` focused on choosing and dispatching the preferred launcher. Bundled lookup, digest verification, and bundled exec now live in `linux-sandbox/src/bundled_bwrap.rs`; Bazel runfiles lookup lives in `linux-sandbox/src/bazel_bwrap.rs`; shared argv/fd exec helpers live in `linux-sandbox/src/exec_util.rs`. - Teach Bazel tests to surface the Bazel-built `//codex-rs/bwrap:bwrap` through `CARGO_BIN_EXE_bwrap`; `codex-linux-sandbox` only honors that fallback in debug Bazel runfiles environments so release/user runtime lookup stays tied to `codex-resources/bwrap`. - Allow `codex-exec-server` filesystem helpers to preserve just the Bazel bwrap/runfiles variables they need in debug Bazel builds, since those helpers intentionally rebuild a small environment before spawning `codex-linux-sandbox`. - Verify the Bazel bwrap target in Linux release CI with a build-only check. Running `bwrap --version` is too strong for GitHub runners because bubblewrap still attempts namespace setup there. Verification - Latest update: `cargo test -p codex-linux-sandbox` - Latest update: `just fix -p codex-linux-sandbox` - `cargo check --target x86_64-unknown-linux-gnu -p codex-linux-sandbox` could not run locally because this macOS machine does not have `x86_64-linux-gnu-gcc`; GitHub Linux Bazel CI is expected to cover the Linux-only modules. 
- Earlier in this PR: `cargo test -p codex-bwrap` - Earlier in this PR: `cargo test -p codex-exec-server` - Earlier in this PR: `cargo check --release -p codex-exec-server` - Earlier in this PR: `just fix -p codex-linux-sandbox -p codex-exec-server` - Earlier in this PR: `bazel test --nobuild //codex-rs/linux-sandbox:linux-sandbox-all-test //codex-rs/core:core-all-test //codex-rs/exec-server:exec-server-file_system-test //codex-rs/app-server:app-server-all-test` (analysis completed; Bazel then refuses to run tests under `--nobuild`) - Earlier in this PR: `bazel build --nobuild //codex-rs/bwrap:bwrap` - Prior to this update: `just bazel-lock-update`, `just bazel-lock-check`, and YAML parse check for `.github/workflows/bazel.yml` --- Stack created with Sapling (https://sapling-scm.com). Best reviewed with ReviewStack (https://reviewstack.dev/openai/codex/pull/21255). Stack: #21257, #21256, -> #21255 (this PR)	2026-05-05 17:14:29 -07:00
Channing Conger	03d3403a41	ci: trigger rusty-v8 releases from tags (#21259) Swap to tag-based releasing and allow tags of type `rusty-v8-v..*`	2026-05-05 16:56:43 -07:00
Owen Lin	d7de4dd3ac	chore(app-server-protocol): split v2 API definitions into modules (#21251 ) ## Why `codex-rs/app-server-protocol/src/protocol/v2.rs` had grown into a single ~12k-line definition file for the entire app-server v2 API. This is purely a mechanical refactor to break up the monolithic `v2.rs` file that contains all app-server API v2 types into more modular files, grouped by resource (e.g. account, thread, turn, etc.). `just write-app-server-schema` shows no real changes, so we can be sure that this is purely an internal organizational change. ## What changed - Replaced the monolithic `protocol/v2.rs` with a `protocol/v2/` module tree and a small `mod.rs` that only declares and reexports modules. - Grouped v2 API definitions by conceptual owner, including `account`, `apps`, `collaboration_mode`, `command_exec`, `config`, `device_key`, `experimental_feature`, `feedback`, `fs`, `hook`, `item`, `mcp`, `model`, `notification`, `permissions`, `plugin`, `process`, `realtime`, `review`, `thread`, `thread_data`, `turn`, and `windows_sandbox`. - Moved v2 tests into `protocol/v2/tests.rs` so `mod.rs` stays small. - Kept shared protocol helpers in `protocol/v2/shared.rs`, including the enum mirroring macro and common cross-resource types. - Co-located resource-specific notifications and server-request payloads with the modules that own those resources. - Regenerated app-server protocol schema fixtures. The schema diffs are non-semantic newline-only changes after the refactor. ## Verification - `cargo check -p codex-app-server-protocol` - `cargo test -p codex-app-server-protocol` - `just write-app-server-schema`	2026-05-05 16:46:51 -07:00
Michael Bolin	332b8b2c74	fix build (#21261) I believe a merge race in https://github.com/openai/codex/pull/20689 broke the build, so this is a quick fix. `cargo check --tests` passed locally.	2026-05-05 16:02:06 -07:00
Tom	ee02cf26d6	codex: use ThreadStore history for core review forks (#20577) - fork loaded parent threads from `ThreadStore` history in core agent control paths - migrate guardian review fork history to loaded session history instead of rereading rollout files ## Verification - `cargo test -p codex-core spawn_agent_fork`	2026-05-05 15:25:19 -07:00
Michael Zeng	d0f9d5eba2	Add cloud executor registration to exec-server (#19575 ) ## Summary This PR adds the first `codex-rs` milestone for remote-exec e2e: a local `codex exec-server` can now register itself with `codex-cloud-environments` and attach to the returned rendezvous websocket. At a high level, `codex exec-server --cloud ...` now: - loads ChatGPT auth from normal Codex config - registers an executor with `codex-cloud-environments` - receives a signed rendezvous websocket URL - serves the existing exec-server JSON-RPC protocol over that websocket ## What Changed - Added `--cloud`, `--cloud-base-url`, `--cloud-environment-id`, and `--cloud-name` to `codex exec-server` - Added a new `exec-server/src/cloud.rs` module that handles: - registration requests - auth/header setup - bounded auth retry on `401/403` - reconnect/backoff after websocket disconnects - Reused the existing `ConnectionProcessor` / `ExecServerHandler` path so cloud mode serves the same exec/filesystem RPC surface as local websocket mode - Added cloud-specific error variants and minimal docs for the new mode ## Testing Manual e2e test that fully goes through exec server flow with our codex cloud agent as orchestrator	2026-05-05 22:01:48 +00:00
Rasmus Rygaard	7e310bc7f3	Inject state DB, agent graph store (#20689 ) ## Why We want the agent graph store to be passed down the stack as a real dependency, the same way we already treat the thread store. This will let us inject the agent graph store as a real dependency and support implementations other than the local SQLite-backed one. Right now most code instantiates a state DB and an agent graph store just-in-time. Ideally, we would not depend on the state DB directly but only read through the higher-level interfaces. This change makes the dependency boundaries explicit and moves state DB initialization to process bootstrap instead of hiding it inside local store implementations. ## What changed - `ThreadManager` now requires a `StateDbHandle` and an `AgentGraphStore` at construction time instead of treating them as optional internals. - The local store constructors no longer lazily initialize SQLite. Callers now initialize the state DB once per process and use that shared handle to build: - `LocalThreadStore` - `LocalAgentGraphStore` - App bootstraps (`app-server`, `mcp-server`, `prompt_debug`, and the thread-manager sample) now initialize the state DB up front and inject the resulting handle down the stack. - `app-server` now consistently uses its process-scoped state DB handle instead of reopening SQLite or trying to recover it from loaded threads. - Device-key storage now reuses the shared state DB handle instead of maintaining its own lazy opener. - The thread archive / descendant traversal paths now use the injected `AgentGraphStore` instead of reaching through local thread-store-specific state. 
## Verification - `cargo check -p codex-core -p codex-thread-store -p codex-app-server -p codex-mcp-server -p codex-thread-manager-sample --tests` - `cargo test -p codex-thread-store` - `cargo test -p codex-core thread_manager_accepts_separate_agent_graph_store_and_thread_store -- --nocapture` - `cargo test -p codex-app-server thread_archive_archives_spawned_descendants -- --nocapture`	2026-05-05 21:45:29 +00:00
Channing Conger	36460387ec	Enable V8 sandboxing for source-built builds (#21146 ) ## Summary This is the first PR in the V8 in-process sandboxing rollout. It adds the build-system and Rust feature plumbing needed to support sandboxed V8 builds, then enables sandboxing by default for the source-built Bazel V8 path that we control directly. It deliberately keeps the published `rusty_v8` artifact workflows on their current non-sandboxed contract so this PR can land and ship independently before we change any released artifacts. ## Rollout plan - [x] PR 1: land sandbox plumbing and default source-built Bazel V8 to sandboxed mode - [ ] PR 2: publish sandbox-enabled release artifacts and add compatibility validation - Produce sandboxed artifact pairs for every released Cargo target that does not already use the source-built Bazel path. - Add CI coverage that consumes those sandboxed artifacts and verifies: - `codex-v8-poc` reports sandbox enabled - `codex-code-mode` builds/tests against the sandboxed path - [ ] PR 3: switch release consumers to sandboxed artifacts by default - Update released artifact selectors/checksums. - Enable the Rust `v8_enable_sandbox` feature in the default release path. - Make the sandboxed artifact family the normal path for published builds. - [ ] PR 4: remove rollout-only compatibility paths - Remove the temporary non-sandbox release compatibility config once the new default has shipped and baked. - Keep the invariant tests permanently.	2026-05-05 14:36:37 -07:00
Felipe Coury	bb2257e3f5	[codex] fix TUI turn items view fixtures (#21243 ) ## Summary Adds the required `items_view` field to the three session picker `Turn` test fixtures that populate full turn item lists. ## Root Cause `#21063` added `Turn.items_view` to the app-server protocol type. The later session picker merge added three test-only `codex_app_server_protocol::Turn` literals without the new field, which broke Bazel compilation on `main` with `E0063: missing field items_view`. ## Validation - `just fmt` - `cargo test -p codex-tui resume_picker --no-fail-fast` - `just argument-comment-lint` I also ran `cargo test -p codex-tui`; it compiled and ran the suite, but this local machine failed two pre-existing status permission-profile tests because `/etc/codex/requirements.toml` disallows `DangerFullAccess`.	2026-05-05 14:24:28 -07:00
Eric Traut	8c88f9a304	Auto-deny MCP elicitations for Xcode 26.4 clients (#21113 ) ## Summary Xcode 26.4 was built against app-server behavior from before MCP elicitation requests became client-visible in CLI 0.120.0 via #17043. That client line does not expect the new events/messages, so this PR restores the old behavior for exactly that client/version combination. The compatibility handling stays in the app-server layer: when the initialized client is `Xcode` and its version starts with `26.4`, the app server marks the live Codex thread so MCP elicitations are auto-denied. The flag is applied on thread start/resume/fork/turn attachment, carried through `Codex`/`CodexThread`, and stored on `McpConnectionManager` so refreshed MCP managers preserve the behavior. ## Notes This is intentionally narrow and includes a TODO to remove the compatibility path once Xcode 26.4 ages out.	2026-05-05 14:05:42 -07:00
pakrym-oai	f593323ef1	[codex] Split tool handlers by tool name (#20687 ) ## Why Tool registration used to bind a tool name to a handler externally, which left ownership split between the registry plan and the handler implementation. Some built-in handlers also multiplexed multiple in-core tools by switching on the invoked tool name internally. This moves the registry identity onto the handler itself and makes built-in multi-tool areas use separate concrete handlers, so each registered handler instance owns exactly one tool name and one dispatch path. ## What Changed - Added `ToolHandler::tool_name()` and changed `ToolRegistryBuilder::register_handler` to derive the registry key from the handler. - Split built-in multiplexed handlers into concrete per-tool handlers for unified exec, shell/local shell/container exec, MCP resources, goal tools, and agent job tools. - Kept name-carrying handler instances only where the runtime target is inherently external or dynamic, such as MCP tools, dynamic tools, and unavailable placeholders. - Updated `ToolHandlerKind` and registry-plan construction so plan entries map directly to concrete handler registrations. ## Verification - `cargo test -p codex-tools tool_registry_plan` - `cargo test -p codex-core --lib tools::registry_tests` - `just fix -p codex-tools` - `just fix -p codex-core`	2026-05-05 13:46:45 -07:00
viyatb-oai	9cbef243b5	fix(linux-sandbox): isolate Linux sandbox synthetic mount registry per user for shared codex use case (#21234 ) ## Summary - make the Linux sandbox synthetic mount registry path unique per effective UID - keep same-user coordination intact while avoiding collisions between users sharing `/tmp` - add a regression test for the registry path contract ## Why Issue #21192 reports that the Linux sandbox currently uses one global temp path at `/tmp/codex-bwrap-synthetic-mount-targets`. If another user creates that directory first, later users can fail to open the shared lock file with `Permission denied`. ## Validation - `just fmt` - `cargo test -p codex-linux-sandbox` - `cargo clippy -p codex-linux-sandbox --all-targets` Fixes #21192	2026-05-05 20:43:37 +00:00
viyatb-oai	8b95d5467e	fix(linux-sandbox): avoid panic on bwrap build failures (#21127 ) ## Summary - Propagate Linux bubblewrap argument-construction failures instead of panicking in the helper - Keep mutable-symlink carveouts fail-closed while reporting them as ordinary sandbox build failures - Add regression coverage for a protected `.codex` symlink inside a writable workspace root ## Root cause Linux bubblewrap intentionally rejects read-only carveouts that cross a symlink the sandboxed process can still rewrite. That is the correct security behavior for protected metadata paths such as `.codex`. The bug was one layer higher: `linux_run_main` treated the expected build failure as impossible and panicked while constructing the bubblewrap argv. For issue #20716, that turned a normal fail-closed sandbox outcome into a noisy panic in the transcript. ## User impact Users with a project-local `.codex` symlink inside a writable workspace still get the conservative sandbox decision, but they no longer see a Rust panic for that condition. The helper now exits with the concise sandbox-build error so the normal denial / escalation path can handle it. Fixes #20716	2026-05-05 13:34:08 -07:00
Felipe Coury	3b2ebb368e	feat(tui): redesign session picker (#20065) ## Why The resume/fork picker is becoming the main way users recover previous work, but the old fixed table made sessions hard to scan once thread names, branches, working directories, and timestamps all mattered. This redesign makes the picker denser by default, easier to search, and safer to inspect before resuming or forking. [Four screenshots omitted.] ## What Changed - Replaces the old session table with responsive session rows that prioritize the session name or preview, then show timestamp, cwd, and branch metadata. - Makes dense view the default while keeping comfortable view available through `Ctrl+O`. - Persists the picker view preference in `[tui].session_picker_view`, including active profile-scoped config. - Adds sort/filter controls for updated time, created time, cwd, and all sessions. - Expands search matching across session name, preview, thread id, branch, and cwd. - Makes `Esc` safer in search mode: it clears an active query before starting a new session. - Adds lazy transcript inspection: - `Space` expands recent transcript context inline. - `Ctrl+T` opens a transcript overlay. - raw reasoning visibility follows `show_raw_agent_reasoning`.
- Keeps remote cwd filtering server-side for remote app-server sessions so local path normalization does not incorrectly hide remote results. - Updates snapshots and config schema for the new picker states and config option. ## How to Test 1. Start Codex in a repo with several saved sessions. 2. Press `Ctrl+R` / resume picker entry point. 3. Confirm the picker opens in dense mode and shows session name or preview, timestamp, cwd, and branch metadata. 4. Press `Ctrl+O` and confirm it switches between dense and comfortable views. 5. Restart Codex and confirm the selected view persists. 6. Type a query that matches a branch, cwd, thread id, or session name; confirm matching sessions appear. 7. Press `Esc` while the query is non-empty and confirm it clears search instead of starting a new session. 8. Select a session and press `Space`; confirm recent transcript context expands inline. 9. Press `Ctrl+T`; confirm the transcript overlay opens and respects raw-reasoning visibility settings. Targeted tests: - `cargo test -p codex-tui resume_picker --no-fail-fast` - `cargo test -p codex-core runtime_config_resolves_session_picker_view_default_and_override` - `cargo test -p codex-core profile_tui_rejects_unsupported_settings` - `cargo check -p codex-thread-manager-sample` - `cargo insta pending-snapshots`	2026-05-05 13:32:54 -07:00
Felipe Coury	52fbbe7cdd	feat(tui): route /diff through workspace commands (#21001 ) Stacked on #20892. ## Why #20892 adds the TUI workspace command abstraction so branch status metadata can run through app-server instead of assuming the CLI process has the active workspace locally. `/diff` still used direct local process execution, which means remote app-server sessions could compute the diff against the wrong machine or fail to see the active workspace at all. This PR moves `/diff` onto that same app-server-backed command path so Git runs wherever the active workspace lives. ## What Changed - Route `/diff` through the TUI `WorkspaceCommandExecutor` using the active chat cwd. - Replace direct `tokio::process::Command` usage in `get_git_diff` with argv-based workspace command requests. - Preserve the existing `/diff` behavior: tracked diff output, untracked file diffs, treating Git diff exit code `1` as success, and showing the existing non-git-repository message. - Extend `WorkspaceCommand` with caller-set timeouts and an explicit uncapped-output opt-out. Metadata probes remain capped by default; `/diff` opts out because its full output is the user-visible payload. ## How to Test Manual reviewer path: 1. Start the Codex TUI from a Git worktree with one tracked file change and one untracked file. 2. Run `/diff`. 3. Confirm the rendered diff includes both the tracked diff and the untracked file diff. 4. Start the TUI outside a Git worktree, or switch to a non-git cwd, then run `/diff`. 5. Confirm it shows the existing `/diff` not-inside-a-git-repository message. Targeted tests run: - `cargo test -p codex-tui get_git_diff -- --nocapture` - `cargo test -p codex-tui branch_summary -- --nocapture` - `cargo test -p codex-tui`	2026-05-05 17:09:25 -03:00
rhan-oai	9e0c191c13	add turn items view to app-server turns (#21063 ) ## Why `Turn.items` currently overloads an empty array to mean either that no items exist or that the server intentionally did not load them for this response. That ambiguity blocks future lazy-loading work where clients need to distinguish unloaded, summary, and fully hydrated turn payloads. ## What changed - add a new `TurnItemsView` enum with `notLoaded`, `summary`, and `full` variants - add required `itemsView` metadata to app-server `Turn` payloads - mark reconstructed persisted history as `full` and live shell-style turn payloads as `notLoaded` - keep current `thread/turns/list` behavior unchanged and document that it still returns `full` turns today - regenerate the JSON and TypeScript protocol fixtures ## Verification - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-app-server thread_read_can_include_turns` - `cargo test -p codex-app-server thread_turns_list_can_page_backward_and_forward` - `cargo test -p codex-app-server thread_resume_rejects_history_when_thread_is_running` - `just fix -p codex-app-server-protocol` - `just fix -p codex-app-server` - `just fmt`	2026-05-05 19:17:16 +00:00
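A minimal Rust sketch of the three-state view described in #21063 above. The variant names and camelCase spellings come from the commit message; the `Turn` stand-in, `wire_name`, and `is_known_empty` are hypothetical names used only to show why the extra metadata removes the empty-array ambiguity.

```rust
/// How much of a turn's item list the server chose to include.
/// Variant names mirror the commit message; the rest is illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TurnItemsView {
    NotLoaded,
    Summary,
    Full,
}

impl TurnItemsView {
    /// camelCase spelling quoted in the commit message (assumed wire form).
    pub fn wire_name(self) -> &'static str {
        match self {
            TurnItemsView::NotLoaded => "notLoaded",
            TurnItemsView::Summary => "summary",
            TurnItemsView::Full => "full",
        }
    }
}

/// Hypothetical stand-in for an app-server turn payload.
pub struct Turn {
    pub items: Vec<String>,
    pub items_view: TurnItemsView,
}

/// An empty `items` list only proves "no items exist" when the view is `Full`.
pub fn is_known_empty(turn: &Turn) -> bool {
    turn.items.is_empty() && turn.items_view == TurnItemsView::Full
}
```

A client would check `items_view` first and treat an empty list under `NotLoaded` as "fetch before rendering" rather than "nothing happened".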
pakrym-oai	b6d4c4ea6b	[codex] Use shared app-server JSON-RPC error helpers (#21221 ) ## Why App-server had repeated hand-built JSON-RPC error objects for standard error shapes. Using the shared helpers keeps the common `invalid_request`, `invalid_params`, and `internal_error` construction in one place and reduces the chance of new call sites drifting from the common error payload shape. ## What changed - Replaced manual standard JSON-RPC error object creation with `internal_error(...)`, `invalid_request(...)`, and `invalid_params(...)` across app-server request processors and runtime paths. - Removed local duplicate helper definitions from search and review request handling. - Preserved existing structured `data` payloads by creating the shared helper error first and then attaching the existing metadata. - Left custom non-standard errors and raw error-code assertions intact. ## Validation - `cargo test -p codex-app-server`	2026-05-05 12:13:59 -07:00
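As a rough illustration of the shared-helper pattern in #21221 above: the helper names `invalid_request`, `invalid_params`, and `internal_error` appear in the commit message, and the numeric codes are the standard JSON-RPC 2.0 values. The struct layout, the `with_data` method, and the string-typed `data` field are assumptions made for this sketch, not the app-server's actual types.

```rust
/// Illustrative error object; the real type likely carries structured JSON in `data`.
#[derive(Debug, Clone, PartialEq)]
pub struct JsonRpcError {
    pub code: i64,
    pub message: String,
    pub data: Option<String>,
}

// Standard JSON-RPC 2.0 error codes.
pub const INVALID_REQUEST: i64 = -32600;
pub const INVALID_PARAMS: i64 = -32602;
pub const INTERNAL_ERROR: i64 = -32603;

fn error(code: i64, message: impl Into<String>) -> JsonRpcError {
    JsonRpcError { code, message: message.into(), data: None }
}

pub fn invalid_request(message: impl Into<String>) -> JsonRpcError {
    error(INVALID_REQUEST, message)
}

pub fn invalid_params(message: impl Into<String>) -> JsonRpcError {
    error(INVALID_PARAMS, message)
}

pub fn internal_error(message: impl Into<String>) -> JsonRpcError {
    error(INTERNAL_ERROR, message)
}

impl JsonRpcError {
    /// Hypothetical builder: create the shared error first, then attach metadata.
    pub fn with_data(mut self, data: impl Into<String>) -> Self {
        self.data = Some(data.into());
        self
    }
}
```

Building the error through the helper and attaching `data` afterwards mirrors the "create the shared helper error first and then attach the existing metadata" step in the commit.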
Abhinav	0452dca986	hook trust metadata and enforcement (#20321 ) # Why We want shared hook trust that both the app and the TUI can build on, but the metadata is only useful if runtime behavior agrees with it. This PR adds a single backend trust model for hooks so unmanaged hooks cannot run until the current definition has been reviewed, while managed hooks remain runnable and non-configurable. # What - persist `trusted_hash` alongside hook state in `config.toml` - expose `currentHash` and derived `trustStatus` through `hooks/list` - derive trust from normalized hook definitions so equivalent hooks from `config.toml` and `hooks.json` share the same trust identity - gate unmanaged hooks on trust before they enter the runnable handler set # Reviewer Notes - key file to review is `codex-rs/hooks/src/engine/discovery.rs` - the only core change is schema related	2026-05-05 19:13:55 +00:00
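One way to picture the trust gate from #20321 above, as a sketch only. `trusted_hash` and the idea of a current hash come from the commit message; the `Hook` struct, the `TrustStatus` variant names, and both functions are hypothetical, and the hash normalization step is left out.

```rust
/// Hypothetical status values; the commit does not list the real variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TrustStatus {
    Managed,
    Trusted,
    Untrusted,
}

/// Illustrative view of a discovered hook.
pub struct Hook<'a> {
    pub managed: bool,
    /// Hash of the normalized current definition.
    pub current_hash: &'a str,
    /// Hash the user last reviewed, if any (persisted as `trusted_hash`).
    pub trusted_hash: Option<&'a str>,
}

pub fn trust_status(hook: &Hook<'_>) -> TrustStatus {
    if hook.managed {
        // Managed hooks stay runnable and are not user-configurable.
        return TrustStatus::Managed;
    }
    match hook.trusted_hash {
        Some(h) if h == hook.current_hash => TrustStatus::Trusted,
        // Never reviewed, or the definition changed since the last review.
        _ => TrustStatus::Untrusted,
    }
}

/// Only managed or currently trusted hooks enter the runnable set.
pub fn is_runnable(hook: &Hook<'_>) -> bool {
    trust_status(hook) != TrustStatus::Untrusted
}
```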
starr-openai	78421face0	Route process tools to selected environments (#20647 ) ## Why When a turn exposes multiple selected environments, shell-style tools need a model-facing way to identify the intended target environment and handlers need to resolve that target before parsing cwd-relative permission fields or launching processes. This PR scopes that rollout to process tools. Filesystem-oriented tools such as `apply_patch`, `view_image`, and `list_dir` are intentionally left for follow-up slices. ## What Changed - Adds an `include_environment_id` option to shell-style tool schema builders. - Exposes optional `environment_id` on `shell`, `shell_command`, and `exec_command` only when `ToolEnvironmentMode::Multiple` is active. - Adds a shared handler helper that parses `environment_id` and `workdir` from JSON function-call arguments and returns the selected `Environment` plus effective absolute cwd. - Uses that helper in `shell`, `shell_command`, and `exec_command` handling so process execution uses the selected environment filesystem and cwd. - Changes `ExecCommandRequest` to carry a required resolved `cwd`, removing the process-manager fallback to the primary turn cwd for new exec commands. - Leaves `write_stdin` unchanged because it targets an existing process id, not a new environment. ## Testing - Added unit coverage for process-tool schema exposure, selected environment resolution, primary fallback, no-environment handling, unknown environment ids, and resolving cwd-relative permission paths against the selected environment cwd. - Added a remote-suite e2e coverage case for `exec_command` routing across explicit zero environments, one local environment, and local+remote environments. - Ran `just fmt` and `git diff --check`. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-05 12:12:03 -07:00
rhan-oai	fb7e1eb6fc	[codex-analytics] add tool item event schemas (#17089 ) ## Why Tool analytics need stable, typed payloads before the later lifecycle reducer starts emitting them. Keeping the event schema definitions isolated in their own PR makes the emitted surface reviewable separately from the reducer logic that produces those events. ## What changed - Adds the common tool-item analytics event base plus event payload types for command execution, file changes, MCP calls, dynamic tools, collaboration tools, web search, and image generation. - Extends `TrackEventRequest` with the corresponding tool-item variants. - Adds serialization coverage for the command-execution event shape. ## Verification - `cargo test -p codex-analytics` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17089). * #18748 * #18747 * #17090 * __->__ #17089 * #20514	2026-05-05 11:49:30 -07:00
Owen Lin	6075b77001	app-server: ignore persist_extended_history param (#21225 ) ## Why This is a step toward removing the `persistExtendedHistory` field. Persisting so much data in the rollout file and returning it in the thread history does not scale. When a client explicitly sends `true`, the server now tells that client, via the `deprecationNotice` notification, that the parameter is deprecated and ignored, so the caller has a clear migration signal. ## What changed - Keep the `persist_extended_history` / `persistExtendedHistory` field in the v2 protocol for compatibility, but document it as deprecated and ignored. - Ignore the parameter in app-server `thread/start`, `thread/resume`, and `thread/fork`; those paths always use limited history persistence now. - Stop treating `persistExtendedHistory` as a running-thread resume override mismatch. - Emit a connection-scoped `deprecationNotice` when a request explicitly sets `persist_extended_history: true`. ## Verification - Added `thread_start_deprecates_persist_extended_history_true` to cover the deprecation notice. - `cargo test -p codex-app-server` - `cargo test -p codex-app-server-protocol`	2026-05-05 18:36:13 +00:00
Felipe Coury	5e0a4adbe5	feat(tui): add raw scrollback mode (#20819 ) ## Why Granular copy is particularly difficult with the current output. The `/copy` command solved part of the problem, but when you only need to copy parts of a response, you still encounter some issues: - When you copy a paragraph, the result is a sequence of separate lines instead of one correctly joined paragraph. - When a word wraps, part of it stays on the original line and the rest appears at the start of the next line. - When you copy a long command, extra line breaks are often inserted, and command arguments can be split across multiple lines. https://github.com/user-attachments/assets/0ef85c84-9363-4aad-b43a-15fce062a443 ## Solution Now that we own the scrollback and re-create it on resize, we can toggle between the raw text and the rich text we see today. - Add TUI raw scrollback mode with `tui.raw_output_mode`, `/raw [on|off]`, and the configurable `tui.keymap.global.toggle_raw_output` action. - Render transcript cells through rich/raw-aware paths so raw mode preserves source text and lets the terminal soft-wrap selection-friendly output. - Bind raw-mode toggle to `alt-r` by default, with the keybinding path toggling silently while `/raw` continues to emit confirmation messages. ## Related Issues Likely addressed by raw mode: - #12200: clean copy for multiline and soft-wrapped output. Raw mode removes Codex-inserted wrapping/indentation and lets the terminal soft-wrap logical lines. - #9252: command suggestions gain unwanted leading spaces when copied. Raw mode renders transcript text without the rich-mode left padding/gutter. - #8258: prompt output is hard to copy because of leading indentation. Raw mode renders user/source-backed transcript text without that decorative indentation. Partially or conditionally addressed: - #2880: copy/export message as Markdown.
Raw mode exposes raw Markdown for terminal selection, but this PR does not add a dedicated export/copy-message command. - #19820: mouse drag selection + copy in the TUI. Raw mode improves terminal-native selection of output/history text, but this PR does not implement in-TUI mouse selection, highlighting, auto-copy, or composer selection. - #18979: copied content is divided into two parts. This should improve cases caused by Codex-inserted wraps/padding in rendered output; if the report is about pasting into the composer/input path, that remains outside this PR. ## Validation - `just write-config-schema` - `just fmt` - `cargo test -p codex-config` - `cargo test -p codex-tui` - `just fix -p codex-tui` - `just argument-comment-lint` - `cargo test -p codex-tui raw_output_mode_can_change_without_inserting_notice -- --nocapture` - `cargo test -p codex-tui raw_slash_command_toggles_and_accepts_on_off_args -- --nocapture` - `cargo test -p codex-tui raw_output_toggle -- --nocapture` - `git diff --check` - `cargo insta pending-snapshots`	2026-05-05 11:17:47 -07:00
viyatb-oai	172303bbfa	chore: add minimal proxy egress diagnostics (#21220 ) ## Why Recent Auto Review reports show Git traffic hanging through the local proxy on both SSH and HTTPS paths. Today the support bundle does not make it obvious whether a request is stuck before upstream dialing, during the proxy hop, or after the upstream response begins, which slows down root-cause triage. This adds a small amount of runtime visibility at the existing proxy boundaries without changing routing or policy behavior. ## What changed - log whether HTTP and CONNECT traffic take the direct or upstream-proxy route - log start / success / failure timings for CONNECT, HTTP, and SOCKS5 upstream dials - log CONNECT forwarding lifecycle events - describe HTTP success at the response-header boundary that is actually observed, rather than implying the full body finished ## Verification - `cargo test -p codex-network-proxy` - `cargo clippy -p codex-network-proxy --all-targets -- -D warnings`	2026-05-05 17:50:59 +00:00
viyatb-oai	ed6082c9f9	fix(sandboxing): Bound advisory system bwrap startup probe (#20111 ) ## Why Linux startup runs an advisory system `bwrap` warning probe on each launch. On hosts with NFS or autofs mounts, its `--ro-bind / /` probe can take tens of seconds before Codex prints anything, matching #19828. Because this probe only decides whether to surface a warning, it should not be allowed to stall startup. Relevant pre-change path: [`codex-rs/sandboxing/src/bwrap.rs`](`de2ccf9473/codex-rs/sandboxing/src/bwrap.rs (L64-L80)`) ## What changed - Bound the advisory system `bwrap` probe to 500 ms. - Preserve the existing warning behavior when `bwrap` promptly reports a known user-namespace failure. - Kill and reap the probe child on timeout, then suppress the advisory warning instead of blocking startup. - Read probe stderr with a bounded nonblocking drain so descendants that inherit the pipe cannot extend startup after the probe child exits. - Add regression coverage for both a deliberately slow fake `bwrap` process and a fake probe whose descendant keeps stderr open. ## Security This only bounds the advisory startup probe. It does not change the command execution path or add a fail-open sandbox fallback. The related command-side hang in #20017 remains separate from this PR. ## Verification - Added `system_bwrap_probe_times_out_without_reporting_a_warning`. - Added `system_bwrap_probe_does_not_wait_for_descendants_holding_stderr_open`. - `cargo test -p codex-sandboxing` - `cargo clippy -p codex-sandboxing --all-targets -- -D warnings` Fixes #19828 Related: #20017	2026-05-05 10:45:35 -07:00
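The "bound, then kill and reap" behavior described in #20111 above can be sketched with the standard library alone. This is an illustration under assumptions, not the code in `codex-rs/sandboxing`: the function name `wait_with_deadline` and the 5 ms polling interval are invented here, and the bounded nonblocking stderr drain from the commit is not shown.

```rust
use std::io;
use std::process::{Child, ExitStatus};
use std::thread;
use std::time::{Duration, Instant};

/// Waits for `child` up to `timeout`.
/// Returns `Ok(Some(status))` if it exited in time, or `Ok(None)` after
/// killing and reaping it on timeout (the caller would then skip the warning).
pub fn wait_with_deadline(child: &mut Child, timeout: Duration) -> io::Result<Option<ExitStatus>> {
    let deadline = Instant::now() + timeout;
    loop {
        // Non-blocking check: has the child already exited?
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            // Ignore the kill error: the child may have exited in the meantime.
            let _ = child.kill();
            // Reap so no zombie process is left behind.
            child.wait()?;
            return Ok(None);
        }
        thread::sleep(Duration::from_millis(5));
    }
}
```

With a 500 ms budget as in the commit, a caller would pass `Duration::from_millis(500)` and treat `None` as "suppress the advisory warning and continue startup".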
Felipe Coury	a3a09dfc9b	fix(tui): external editor expansion for same-size large pastes (#21190 ) ## Why We found this while reviewing #21091, but confirmed it is not introduced by that PR: the order-sensitive `current_text_with_pending()` replacement loop already existed, and `main` already allowed active same-size large pastes to use prefix-overlapping labels such as `[Pasted Content N chars]` and `[Pasted Content N chars] #2`. #21091 fixes placeholder numbering after a draft is cleared, so a fresh same-size paste can reuse the base label. This PR fixes a different path: when a draft already contains multiple active same-size large pastes, the placeholders can overlap by prefix, for example `[Pasted Content N chars]` and `[Pasted Content N chars] #2`. That overlap breaks `current_text_with_pending()` when the composer materializes the draft text for the external editor. Replacing the base placeholder first can partially rewrite the `#2` placeholder, leaving the external editor seeded with corrupted text instead of both paste payloads. \| Before \| After \| \|---\|---\| \| <img width="1230" height="1008" alt="CleanShot 2026-05-05 at 10 18 09" src="https://github.com/user-attachments/assets/88a2936c-cf00-4adc-8567-8fd8f398b4a8" /> \| <img width="1230" height="1008" alt="CleanShot 2026-05-05 at 10 20 31" src="https://github.com/user-attachments/assets/119cff52-43c8-432a-9367-418d82f4ed82" /> \| \| <img width="1230" height="1008" alt="CleanShot 2026-05-05 at 10 18 57" src="https://github.com/user-attachments/assets/026031bb-839b-4252-a0fd-9ba9616435fe" /> \| <img width="1230" height="1008" alt="CleanShot 2026-05-05 at 10 21 31" src="https://github.com/user-attachments/assets/8cb6f2c8-3a5d-411b-8623-dca666ee3c08" /> \| ## What Changed - Changed `current_text_with_pending()` to expand pending pastes through the existing element-range based `expand_pending_pastes()` helper instead of global string replacement. 
- Added a regression test with two different same-length large pastes to ensure both overlapping placeholders expand to their original payloads. ## How to Test 1. Start Codex TUI. 2. Paste a large string, for example 1004 `A` characters. ```shell perl -e 'print "A" x 1004' | pbcopy ``` 3. Paste a second large string with the same length, for example 1004 `B` characters. ```shell perl -e 'print "B" x 1004' | pbcopy ``` 4. Open the external editor from the composer. 5. Confirm the editor is seeded with the full `A...` payload followed by the full `B...` payload, with no literal `#2` left behind. Targeted tests: - `cargo test -p codex-tui current_text_with_pending_expands_overlapping_placeholders` - `just argument-comment-lint-from-source -p codex-tui` I also ran `cargo test -p codex-tui`; it reached the full crate suite but failed two unrelated local status tests because this machine's `/etc/codex/requirements.toml` rejects `DangerFullAccess`.	2026-05-05 14:41:43 -03:00
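To make the prefix-overlap failure in #21190 above concrete, here is a small Rust sketch contrasting global string replacement with range-based expansion. The placeholder text follows the `[Pasted Content N chars]` pattern from the commit; `PendingPaste`, `expand_by_replace`, and `expand_by_range` are hypothetical names and do not reproduce the real `expand_pending_pastes()` helper.

```rust
use std::ops::Range;

/// Illustrative pending paste: where its placeholder sits in the draft, plus the payload.
pub struct PendingPaste {
    pub range: Range<usize>,
    pub payload: String,
}

/// Order-sensitive global replacement: the base label also matches the
/// prefix of the "#2" label, so the second placeholder gets corrupted.
pub fn expand_by_replace(draft: &str, pastes: &[(&str, &str)]) -> String {
    let mut out = draft.to_string();
    for (placeholder, payload) in pastes {
        out = out.replace(placeholder, payload);
    }
    out
}

/// Range-based expansion: each placeholder is identified by its byte span,
/// so overlapping label text cannot interfere.
pub fn expand_by_range(draft: &str, pastes: &[PendingPaste]) -> String {
    let mut sorted: Vec<&PendingPaste> = pastes.iter().collect();
    sorted.sort_by_key(|p| p.range.start);
    let mut out = String::new();
    let mut cursor = 0;
    for paste in sorted {
        out.push_str(&draft[cursor..paste.range.start]);
        out.push_str(&paste.payload);
        cursor = paste.range.end;
    }
    out.push_str(&draft[cursor..]);
    out
}
```

The sketch assumes the ranges are non-overlapping byte offsets on character boundaries, which holds for ASCII placeholders like the ones above.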
Abhinav	13be504063	revert legacy notify deprecation (#21152 ) # Why Revert #20524 for now because the computer use plugin has not migrated off legacy `notify` yet. Keeping the deprecation in place today would show users a warning before the plugin path is ready to move, so this rolls the change back until that migration is complete. # What - revert the legacy `notify` deprecation change from #20524 - restore the prior `notify` behavior and remove the temporary deprecation metrics/docs from that change Once the computer use plugin has migrated, we can land the same deprecation again.	2026-05-05 10:34:44 -07:00
canvrno-oai	394242e95b	[codex] Fix fork --last cwd filtering (#21089 ) Fixes #20945. This keeps `codex fork --last` aligned with the neighboring latest-session lookup flows. The local fork path now uses the same cwd-scope helper as `resume --last`, which is also a small code cleanup around how this selection logic is shared. Credit to @chanwooyang1 for the report and for pointing out the narrow fix direction. What changed: - Route `fork --last` through the shared latest-session cwd filter. - Preserve `--all` as the explicit opt-in for global latest-session selection. - Keep remote cwd override behavior unchanged. - Add focused coverage for local default, `--all`, and remote override filter semantics. Validation: - Ran `just fmt`. - Ran `git diff --check`. - Reviewed the `fork --last`, `resume --last`, and fork picker selection paths against the issue report.	2026-05-05 10:33:40 -07:00
canvrno-oai	1feaa7d85b	[codex] Fix TUI large paste placeholder numbering after Ctrl+C (#21091 ) Fixes #19940. Large-paste placeholder numbering was backed by a per-size counter, so clearing a draft with `Ctrl+C` left numbering state behind even though the active pending paste state was gone. This updates the composer to derive the next placeholder suffix from active pending pastes instead, which keeps simultaneous same-size pastes distinct while letting fresh drafts reuse the base label. This is also a small code cleanup: pending paste state is now the source of truth instead of maintaining a separate counter. Credit to @Sungyoun-Kim for the issue report, root-cause notes, and fork with the proposed fix, and to @charley-oai for the earlier related #10032 proposal. Changes: - Remove the monotonic large-paste counter from the composer. - Compute suffixes from currently active pending paste placeholders. - Document large-paste placeholder behavior in the composer module docs. - Add regression coverage for `Ctrl+C` clearing and deletion/reset behavior. Testing: - `just fmt` - `git diff --check`	2026-05-05 10:33:37 -07:00
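The idea in #21091 above, deriving the next suffix from the placeholders that are currently active instead of from a monotonic counter, might look roughly like this. The function name `next_placeholder` is hypothetical, and picking the first free suffix is an assumption; the commit does not say how the composer chooses among free suffixes.

```rust
/// Returns the label for a new large paste of `char_count` characters,
/// given the placeholders that are still active in the draft.
pub fn next_placeholder(char_count: usize, active: &[String]) -> String {
    let base = format!("[Pasted Content {char_count} chars]");
    // A fresh draft (or one with no same-size paste) reuses the base label.
    if !active.iter().any(|p| p == &base) {
        return base;
    }
    // Otherwise take the first unused "#n" suffix, starting at 2.
    let mut n = 2;
    loop {
        let candidate = format!("{base} #{n}");
        if !active.iter().any(|p| p == &candidate) {
            return candidate;
        }
        n += 1;
    }
}
```

Because no counter outlives the draft, clearing the composer with `Ctrl+C` empties `active` and the next same-size paste gets the base label again.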
Abhinav	af86be529c	Support PreToolUse additionalContext (#20692 ) # Why `PreToolUse` already exposes `hookSpecificOutput.additionalContext` in the generated hook schema, but the runtime still rejected it as unsupported. That leaves `PreToolUse` out of step with the other context-injecting hooks and prevents hook authors from attaching model-visible guidance to a pending tool call before it runs. # What - Parse `PreToolUse.additionalContext` and carry it through the hook event pipeline. - Record `PreToolUse` context at the hook boundary so successful context is preserved for both allowed and blocked calls without widening the tool registry surface. - Preserve existing deny behavior when context is combined with either `permissionDecision: "deny"` or the legacy `decision: "block"` shape.	2026-05-05 10:29:30 -07:00
iceweasel-oai	f35285dc78	Add Windows sandbox readiness RPC (#20708 ) ## Why The desktop app on Windows needs a read-only way to tell, before the next tool call, whether the local Windows sandbox setup is in a state that should block the user and ask for setup again. The main case we want to cover is the elevated sandbox setup version bump. Today, if the app is configured for elevated Windows sandboxing and the installed setup is stale, the next sandboxed shell/exec path can end up triggering the elevated setup flow directly. That means the user can see an unexpected UAC prompt with no UI explanation. This change adds a small app-server preflight so the desktop app can ask “is Windows sandbox ready, not configured, or update-required?” during startup and show the appropriate blocking UI before the user hits a tool call. ## What changed - Added a new read-only app-server RPC: `windowsSandbox/readiness` - Added a new protocol enum and response type: - `WindowsSandboxReadiness` - `WindowsSandboxReadinessResponse` - Added core readiness logic in `core/src/windows_sandbox.rs`: - `ready` - `notConfigured` - `updateRequired` - Wired the new request through `codex_message_processor` - Regenerated the vendored app-server schema fixtures ## Readiness semantics This is intentionally a coarse startup/version-bump readiness check, not a full predictor of every runtime repair case. For now, readiness is determined from: - the configured Windows sandbox level - `sandbox_setup_is_complete()` for elevated mode That means: - `disabled` maps to `notConfigured` - `restricted token` maps to `ready` - `elevated` maps to `ready` or `updateRequired` depending on `sandbox_setup_is_complete()` This is deliberate for the first UI integration because the common case we want to catch is “the app updated, the elevated setup version bumped, and the user should see an update-required blocker instead of a surprise UAC prompt”. 
It does not attempt to model every case where the deeper runtime path might decide to repair or re-run setup. ## Testing - Ran `cargo fmt --all -- app-server-protocol/src/protocol/common.rs app-server-protocol/src/protocol/v2.rs app-server/src/codex_message_processor.rs core/src/windows_sandbox.rs core/src/windows_sandbox_tests.rs` - Added unit tests for the pure readiness mapping in `core/src/windows_sandbox_tests.rs` - Regenerated vendored schema fixtures with `cargo run -p codex-app-server-protocol --bin write_schema_fixtures -- --schema-root app-server-protocol/schema` - Did not run the full cargo test suite	2026-05-05 09:58:23 -07:00
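The readiness semantics listed in #20708 above reduce to a small pure mapping. In this sketch, `WindowsSandboxReadiness` and its three states are named in the commit message; the `SandboxLevel` enum and the `readiness` function signature are assumptions standing in for whatever `core/src/windows_sandbox.rs` actually defines.

```rust
/// Hypothetical representation of the configured Windows sandbox level.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SandboxLevel {
    Disabled,
    RestrictedToken,
    Elevated,
}

/// States named in the commit: `ready`, `notConfigured`, `updateRequired`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WindowsSandboxReadiness {
    Ready,
    NotConfigured,
    UpdateRequired,
}

/// `setup_is_complete` stands in for the result of `sandbox_setup_is_complete()`;
/// it only matters for the elevated level.
pub fn readiness(level: SandboxLevel, setup_is_complete: bool) -> WindowsSandboxReadiness {
    match level {
        SandboxLevel::Disabled => WindowsSandboxReadiness::NotConfigured,
        SandboxLevel::RestrictedToken => WindowsSandboxReadiness::Ready,
        SandboxLevel::Elevated if setup_is_complete => WindowsSandboxReadiness::Ready,
        SandboxLevel::Elevated => WindowsSandboxReadiness::UpdateRequired,
    }
}
```

Keeping the mapping free of I/O is what makes it easy to cover with unit tests, as the commit's testing notes describe.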
Eric Traut	f09e1936e0	Validate /goal objective length in TUI (#20746 ) ## Why Long `/goal` definitions currently reach lower-level goal validation and can produce an opaque failure. This bug was reported by a user. Pasted instruction blocks are especially confusing because the composer can still contain a paste placeholder before expansion, which may otherwise fall into the generic prompt-size error path. There was also a related paste edge case where `/goal ` followed by a multiline block whose first pasted line was blank looked like a bare `/goal` command. That showed the goal usage/summary instead of setting the pasted objective. ## What Changed This adds TUI-side preflight validation for `/goal <objective>` using the shared `MAX_THREAD_GOAL_OBJECTIVE_CHARS` limit. Oversized typed, queued, and pasted goal objectives now fail locally with a goal-specific message that recommends putting longer instructions in a file and referencing that file from the goal. The TUI now also lets inline-argument slash commands consume later-line arguments before treating the first line as a bare command, so `/goal ` followed by blank lines and then objective text sets the goal instead of opening the bare `/goal` flow. ## Manual Testing 1. Start the TUI with goals enabled and an active session. 2. Submit `/goal ` followed by exactly 4,000 objective characters. It should continue through the normal goal-setting path. 3. Submit `/goal ` followed by 4,001 objective characters. It should not set a goal, and should show `Goal objective is too long: 4,001 characters. Limit: 4,000 characters.` followed by the guidance to put longer instructions in a file and reference that file from the goal. 4. Type `/goal `, paste a large block that becomes a `[Pasted Content ... chars]` placeholder, then submit. It should validate the expanded pasted text and show the goal-specific file guidance rather than the generic prompt-size error. 5. 
Type `/goal `, paste a multiline block whose first line is blank, then submit. It should set the objective from the non-blank pasted content instead of showing `Usage: /goal <objective>` or the bare goal summary. 6. While a turn is running, queue an oversized `/goal` command. When the queue drains, it should show the same goal-specific error and should not emit a goal-setting request.	2026-05-05 09:55:07 -07:00
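The preflight check from #20746 above can be sketched as follows. The constant name and the error wording are taken from the commit message and its manual testing steps; the 4,000 value is inferred from those steps, and counting Unicode scalar values with `chars().count()`, the `with_thousands` helper, and the function signature are assumptions.

```rust
/// Limit inferred from the manual testing steps (4,000 passes, 4,001 fails).
pub const MAX_THREAD_GOAL_OBJECTIVE_CHARS: usize = 4_000;

/// Formats `n` with comma thousands separators, e.g. 4001 -> "4,001".
fn with_thousands(n: usize) -> String {
    let digits = n.to_string();
    let mut out = String::new();
    for (i, ch) in digits.chars().enumerate() {
        if i > 0 && (digits.len() - i) % 3 == 0 {
            out.push(',');
        }
        out.push(ch);
    }
    out
}

/// Rejects oversized objectives locally, before any goal-setting request is sent.
pub fn validate_goal_objective(objective: &str) -> Result<(), String> {
    let count = objective.chars().count();
    if count > MAX_THREAD_GOAL_OBJECTIVE_CHARS {
        return Err(format!(
            "Goal objective is too long: {} characters. Limit: {} characters.",
            with_thousands(count),
            with_thousands(MAX_THREAD_GOAL_OBJECTIVE_CHARS)
        ));
    }
    Ok(())
}
```

In the flow the commit describes, this check would run on the expanded text, after any `[Pasted Content ... chars]` placeholder has been replaced by its payload.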
Eric Traut	91b7350187	Add goal lifecycle metrics (#20799 ) ## Why Adding goal metrics makes it possible to track how often goals are created, completed, and stopped by budget limits, plus the final token and wall-clock usage for terminal outcomes. ## What Changed - Added OpenTelemetry metric constants for goal lifecycle tracking: - `codex.goal.created`: increments each time a new persisted goal is created or an existing goal is replaced with a new objective. - `codex.goal.completed`: increments when a goal transitions to `complete`. - `codex.goal.budget_limited`: increments when a goal transitions to `budget_limited` because its token budget has been reached. - `codex.goal.token_count`: records the final persisted token count when a goal transitions to `complete` or `budget_limited`. - `codex.goal.duration_s`: records the final persisted elapsed wall-clock time, in seconds, when a goal transitions to `complete` or `budget_limited`. - Emitted creation metrics when a goal is created or replaced. - Emitted terminal outcome counters and final usage histograms when a goal transitions to `complete` or `budget_limited`, avoiding double-counting later in-flight accounting for already budget-limited goals. - Added focused `codex-core` tests for create/complete metrics and one-time budget-limit metrics.	2026-05-05 09:21:54 -07:00
Felipe Coury	69283aa1c0	fix(tui): make /copy work inside tmux without passthrough (#20207 ) ## Summary - prefer tmux's native clipboard integration for `/copy` when running inside tmux - fall back to OSC 52 when tmux clipboard copy is unavailable - add coverage for tmux-preferred, fallback, and combined-failure paths ## Why Inside tmux, `/copy` previously relied on DCS-wrapped OSC 52 when `TMUX` was set. That only reaches the outer terminal when tmux passthrough is enabled, so Codex could report success even though the system clipboard never changed. ## User impact `/copy` now works inside tmux even when `allow-passthrough` is off, as long as tmux clipboard integration is available. If tmux cannot handle the copy, Codex still keeps the existing OSC 52 fallback path. ## Validation - `cargo test -p codex-tui` - `just fmt` - `just fix -p codex-tui` - `just argument-comment-lint` - manually verified `/copy` inside tmux with `allow-passthrough off` Fixes #19926	2026-05-05 16:18:02 +00:00

6208 Commits