codex

mirror of https://github.com/openai/codex.git synced 2026-05-15 08:42:34 +00:00

Author	SHA1	Message	Date
starr-openai	2bfcc88340	Compile archive lld shim with Roslyn	2026-05-13 17:10:35 -07:00
starr-openai	134893adf3	Fix MSVC lld wrapper compilation	2026-05-13 17:06:07 -07:00
starr-openai	bb20435949	Fix archive lld wrapper compilation	2026-05-13 17:05:58 -07:00
starr-openai	c974d3f5cd	Wrap lld for ARM64 MSVC setup action	2026-05-13 17:03:50 -07:00
starr-openai	6d357d4fe3	Filter unsupported ARM64 lld flag in archive probe	2026-05-13 17:03:43 -07:00
starr-openai	11a10df438	Prefer rust-lld in MSVC setup action	2026-05-13 16:56:26 -07:00
starr-openai	2116527ae0	Probe ARM64 archive with rust-lld	2026-05-13 16:56:19 -07:00
starr-openai	96b02724a0	Prefer lld-link in MSVC setup action	2026-05-13 16:47:23 -07:00
starr-openai	04132fdfbf	Probe ARM64 archive with lld-link	2026-05-13 16:47:22 -07:00
starr-openai	fe3230ad4a	Add archive cargo process probe	2026-05-13 16:34:37 -07:00
starr-openai	ff0141d713	Set explicit ARM64 Cargo linker for archive probe	2026-05-13 16:14:45 -07:00
starr-openai	d5ebb31383	Set explicit Cargo linker in MSVC setup action	2026-05-13 16:14:35 -07:00
starr-openai	5440bbfaaa	Normalize MSVC PATH export	2026-05-13 16:06:56 -07:00
starr-openai	9c6ce80d08	Normalize MSVC PATH export for archive probe	2026-05-13 16:06:53 -07:00
starr-openai	202487bd63	Export ARM64 MSVC env for archive probe	2026-05-13 16:00:38 -07:00
starr-openai	858c744081	Add MSVC env helper for ARM64 archive build	2026-05-13 15:59:13 -07:00
starr-openai	7f3f228a60	Try Windows arm64 nextest archive Add an opt-in rust-ci-full path that builds the Windows arm64 nextest archive on Windows x64, uploads it, and runs Windows arm64 shard jobs from that archive instead of recompiling in every shard. Co-authored-by: Codex <noreply@openai.com>	2026-05-07 18:11:30 -07:00
starr-openai	755d128add	Shard Windows arm64 nextest runs Add a dynamic rust-ci-full test matrix so workflow_dispatch or shard-specific full-ci branch names can split the Windows arm64 nextest lane across 2 or 4 hosts while leaving the default push behavior unchanged. Co-authored-by: Codex <noreply@openai.com>	2026-05-07 17:11:16 -07:00
starr-openai	cd8ea2f36b	Keep sccache stats alive through CI jobs Disable the sccache daemon idle timeout in rust-ci-full so long test phases can still report the compile-cache stats collected during the build phase. Co-authored-by: Codex <noreply@openai.com>	2026-05-07 15:59:45 -07:00
starr-openai	fcb1fb8ec6	Re-enable Windows sccache in Rust CI Let Windows rust-ci-full jobs use sccache again, store the fallback cache on the configured work drive, and set Cargo's rustc wrapper to an absolute sccache path so Windows subprocesses resolve it consistently. Co-authored-by: Codex <noreply@openai.com>	2026-05-07 15:20:40 -07:00
starr-openai	077a3970d7	Use Dev Drive for Windows CI Configure Windows Rust CI jobs and the shared Bazel CI setup to put temp, repository-cache, and output-root paths on the runner's fast work drive when available. Fall back to C: if no secondary drive or Dev Drive provisioning path is available. Co-authored-by: Codex <noreply@openai.com>	2026-05-07 15:20:40 -07:00
starr-openai	5815dd6a4b	Give Windows arm64 tests enough CI time Let the Windows arm64 test matrix use a longer timeout after CI showed the lane spending most of the default 45 minutes compiling before nextest could finish. Also pin nextest through taiki-e/install-action's supported tool version syntax so the requested version is not ignored. Co-authored-by: Codex <noreply@openai.com>	2026-05-07 15:20:39 -07:00
starr-openai	296fa6df0c	Serialize Windows process-heavy nextest cases Windows rust-ci-full repeatedly times out in subprocess-heavy tests even when the global nextest thread count is capped. Isolate the recurring Windows-only families with nextest overrides so the rest of the suite can keep normal parallelism. Co-authored-by: Codex <noreply@openai.com>	2026-05-07 15:20:39 -07:00
starr-openai	64c684bd57	Add Windows nextest thread override for rust-ci-full Co-authored-by: Codex <noreply@openai.com>	2026-05-07 15:20:39 -07:00
starr-openai	ce5d84e43a	Make pending sideband close test deterministic Replace the realtime websocket accept-delay race with an explicit test-server gate so close is issued while the sideband connection is pending, then prove the closed conversation does not emit stale events or send sideband websocket requests. Co-authored-by: Codex <noreply@openai.com>	2026-05-07 15:20:35 -07:00
starr-openai	926b8d77cd	Tolerate transient Windows metadata denial in memory startup test Keep polling when Windows temporarily denies metadata reads while the phase 2 memory workspace is being cleaned up, so the test still verifies the file is removed and the baseline becomes clean. Co-authored-by: Codex <noreply@openai.com>	2026-05-07 14:48:09 -07:00
starr-openai	7cd5127421	Wait for agent shutdown before resume tests reopen IDs Subscribe before test shutdown and close operations, then wait for the Shutdown status before resuming the same thread IDs. This removes the Windows live-writer race exposed by the full nextest run. Co-authored-by: Codex <noreply@openai.com>	2026-05-07 14:48:09 -07:00
starr-openai	6a2ce743f1	Make Windows realtime shell test use successful cmd echo Use a Windows command form that exits successfully in constrained CI shells and trim the expected newline in the delegated realtime shell-tool assertion. Co-authored-by: Codex <noreply@openai.com>	2026-05-07 14:48:08 -07:00
starr-openai	32deb67fc6	Harden Windows realtime and agent resume tests Avoid PowerShell command forms that depend on method invocation for the delegated realtime shell-tool test, and wait for a shutdown status before resuming the same subagent thread in the nickname/role restore test. Co-authored-by: Codex <noreply@openai.com>	2026-05-07 14:48:08 -07:00
starr-openai	59d9e96d66	Use PowerShell literal output in sandbox tests The legacy sandbox runs PowerShell in constrained language mode, so method calls fail and module-backed cmdlets may not autoload. Use literal string expressions for the PowerShell I/O smoke tests so they exercise process output without depending on cmdlets or method invocation. Co-authored-by: Codex <noreply@openai.com>	2026-05-07 14:48:08 -07:00
starr-openai	097e3ef949	Avoid PowerShell module autoload in sandbox tests Windows arm64 can launch pwsh in the legacy sandbox while still failing Write-Output because Microsoft.PowerShell.Utility cannot autoload. Use Console output in the legacy PowerShell smoke tests so they continue to verify sandbox process I/O without depending on module autoload. Co-authored-by: Codex <noreply@openai.com>	2026-05-07 14:48:07 -07:00
starr-openai	f3afa1132d	Fix rollout cwd fixture import Import the Windows-aware test_path_buf helper from core_test_support where it is defined. Co-authored-by: Codex <noreply@openai.com>	2026-05-07 14:48:07 -07:00
starr-openai	a666109389	Make rollout cwd fixtures drive-stable on Windows Dev Drive setup can put temporary Codex homes on D:, which exposed test fixtures that wrote root-relative '/' rollout cwd values while assertions expected the Windows-aware C:\ root helper. Use the same test_path_buf helper when creating and expecting fake rollout cwd values so the tests remain independent of the process temp drive. Co-authored-by: Codex <noreply@openai.com>	2026-05-07 14:48:07 -07:00
starr-openai	16648c8d1c	Make realtime sideband failure test deterministic Use the existing mock server as the sideband failure endpoint instead of relying on an OS-level connection refusal from 127.0.0.1:1. Disable retries in this failure-path test so Windows CI does not spend the default retry budget before emitting the expected error/close events. Co-authored-by: Codex <noreply@openai.com>	2026-05-07 14:48:06 -07:00
starr-openai	7d2c8dbec4	Fix agent job worker assignment race Claim job items before spawning workers and allow reports to complete unassigned running items, so fast workers cannot lose stop=true reports before the parent records their thread id. Co-authored-by: Codex <noreply@openai.com>	2026-05-07 14:48:06 -07:00
starr-openai	bfe33e5a7a	Make agent job stop cancellation atomic A worker stop request used to record the item result and job cancellation in separate updates, so the job runner could observe the item completion first and continue spawning pending work. Commit both state updates together and prevent completion from overwriting a final cancellation. Co-authored-by: Codex <noreply@openai.com>	2026-05-07 14:48:05 -07:00
William Woodruff	8abcc5357d	[codex] Fully qualify hash-pins in GitHub Actions (#21436 ) This builds on top of https://github.com/openai/codex/pull/15828 by ensuring that hash-pinned actions with version comments are fully qualified, rather than referencing floating/mutable comments like "v7". This makes actions management tools behave more consistently. This shouldn't break anything, since it's comment only. But if it does, ping ww@ 🙂	2026-05-07 14:31:20 -07:00
Zanie Blue	27ec488ad5	Add a Cargo build profile for benchmarking (#21574 ) A clean release build takes ~18m and an incremental build takes ~12m. This is far too slow to iterate on performance-related changes, and the build time is dominated by LTO. This pull request adds a `profiling` profile for Cargo which takes ~13m clean and ~6m incremental; the primary change is that LTO is disabled. This matches a profile used in uv and follows the great work at https://github.com/astral-sh/uv/pull/5955 . There's a bit of commentary there about the trade-offs this implies. We've found that this does not inhibit the ability to accurately benchmark, as measurements with LTO disabled are generally consistent with the results with LTO enabled, and it makes it much faster (~2x) to rebuild after making a change. This is motivated by my interest in improving Codex TUI performance, which is blocked by the tragically slow builds right now. I tested incremental build times by making a no-op change to the `codex-cli` crate.	2026-05-07 14:30:35 -07:00
Zanie Blue	8367ef4522	Use descriptive names for Cargo profile options (#21582 ) These are equivalent and their intent is clearer, e.g., I was confused if `debug = 1` meant the same thing as `debug = true` (it does not).	2026-05-07 14:19:32 -07:00
iceweasel-oai	163eac9306	Grant sandbox users access to desktop runtime bin (#21564 ) ## Why Codex desktop copies bundled Windows binaries out of `WindowsApps` into a LocalAppData runtime cache before launching `codex.exe`. Sandboxed commands can then need to execute helpers from that cache, but the sandbox user group may not have read/execute access to the runtime bin directory. This makes the Windows sandbox refresh path repair that access directly so the packaged desktop runtime remains usable from sandboxed sessions. ## What changed - Added `setup_runtime_bin` to locate `%LOCALAPPDATA%\OpenAI\Codex\bin`, matching the desktop bundled-binaries destination path, with the same `USERPROFILE\AppData\Local` fallback shape. - During refresh setup, check whether `CodexSandboxUsers` already has read/execute access to the runtime bin directory. - If access is missing, grant `CodexSandboxUsers` `OI/CI/RX` inheritance on that directory. - If the runtime bin directory does not exist, no-op cleanly. ## Verification - `cargo build -p codex-windows-sandbox --bin codex-windows-sandbox-setup` - `cargo test -p codex-windows-sandbox --bin codex-windows-sandbox-setup` - Manual Windows ACL exercise against the installed packaged runtime bin: - existing inherited `CodexSandboxUsers:(I)(OI)(CI)(RX)` no-ops without changing SDDL - after disabling inheritance and removing the group ACE, setup adds `CodexSandboxUsers:(OI)(CI)(RX)` - with `LOCALAPPDATA` pointed at a fake location without `OpenAI\Codex\bin`, setup exits successfully and does not create the directory - restored the real runtime bin with inherited ACLs and confirmed the final SDDL matched the baseline exactly	2026-05-07 11:38:10 -07:00
Tom	4242bba2eb	Route ThreadManager rollout path reads through thread store (#21265 ) - Route ThreadManager rollout-path resume/fork through ThreadStore history reads. - Add in-memory store coverage proving path-addressed reads are used. This isn't strictly necessary for the ThreadStore migration, since these ThreadManager methods _only_ work for path-based lookups, but I'm trying to migrate all the rollout recorder callsites to use the ThreadStore where possible for consistency.	2026-05-07 11:25:25 -07:00
Tom	0274398901	[codex] Fix pathless thread summaries (#21266 ) ## Summary Fix `getConversationSummary` so thread-id summaries work for stored threads that do not have a local rollout path, such as remote thread stores. The root cause was that `summary_from_stored_thread` returned `None` when `StoredThread.rollout_path` was absent, and `get_thread_summary_response_inner` treated that as an internal error. This made conversation-id lookups depend on a local-only field even though the thread store can address the thread by id.	2026-05-07 11:18:16 -07:00
Tom	56823ec46b	Move thread name edits to ThreadStore (#21264 ) - Route live thread renames through `ThreadStore` metadata updates. - Read resumed thread names from store metadata with legacy local fallback preserved in the store.	2026-05-07 11:12:22 -07:00
Charlie Marsh	0dc1885a5c	Upgrade `cargo-shear` to 1.11.2 (#21547 ) ## Summary Catches a few additional dependencies (`sha2`, `url`) that should be in `dev-dependencies`.	2026-05-07 11:07:18 -07:00
pakrym-oai	566f2cb612	[codex] Move tool specs onto handlers (#21461 ) ## Why This is the next stacked step after deleting the tool-handler kind indirection. Specs should come from the registered handlers themselves so registry construction has a single source of truth for handler behavior and exposed tool definitions. ## What changed - Added `ToolHandler::spec()` plus handler-provided parallel/code-mode metadata, and made `ToolRegistryBuilder::register_handler` automatically collect specs from registered handlers. - Moved builtin tool spec construction into the corresponding handlers and their adjacent `_spec` modules, including shell, unified exec, apply patch, view image, request plugin install, tool search, MCP resource, goals, planning, permissions, agent jobs, and multi-agent tools. - Reworked configurable handlers to receive their tool-building options through constructors, with non-optional handler options where the handler is always spec-backed. Shell fallback handlers keep an explicit no-spec mode because they are also registered as hidden dispatch aliases. - Kept `CodeModeExecuteHandler` on the explicit configured wrapper so the code-mode exec spec can still be built from the nested registry. ## Verification - `cargo check -p codex-core` - `cargo test -p codex-core tools::spec_plan::tests` - `cargo test -p codex-core tools::spec::tests` - `cargo test -p codex-core tools::handlers::multi_agents_spec::tests` - `RUST_MIN_STACK=16777216 cargo test -p codex-core tools::handlers::multi_agents::tests` - `cargo test -p codex-core tools::handlers::apply_patch::tests` - `cargo test -p codex-core tools::handlers::unified_exec::tests` - `just fix -p codex-core` - `git diff --check`	2026-05-07 10:48:36 -07:00
jif-oai	eb0462f2af	app-server: refresh live threads from latest config snapshot (#21187 ) ## Why App-server config writes were leaving existing threads partially stale. After a config mutation, the app-server told each live thread to run `Op::ReloadUserConfig`, but that path only re-read the user `config.toml` layer. Settings that came from the app-server's materialized config snapshot did not propagate to existing threads until restart. This change prevents an FS access from `core` for CCA. ## What changed - add `CodexThread::refresh_runtime_config()` and `Session::refresh_runtime_config()` so the app-server can push a freshly rebuilt config snapshot into a live thread - rebuild the latest config with each thread's `cwd` after config mutations, then refresh the thread from that snapshot instead of asking it to reload only `config.toml` - keep session-static settings unchanged during refresh, while updating runtime-refreshable state such as the config layer stack, `tool_suggest`, and derived hook/plugin/skill state - keep `reload_user_config_layer()` as the file-backed fallback for legacy local reload flows, but route the shared refresh logic through the new runtime refresh path ## Testing - add a session test that verifies `refresh_runtime_config()` rebuilds hooks from refreshed config - add a session test that verifies runtime-refreshable fields update while session-static settings like `model` and `notify` stay unchanged --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-07 19:22:04 +02:00
Owen Lin	129401df43	add top-level remote-control command (#21424 ) ## Summary `codex --enable remote_control app-server --listen off` is the current way to start a headless, remote-controllable app-server, but it is hard to remember and exposes implementation details. This adds `codex remote-control` as a friendly top-level wrapper for that flow. The command starts a foreground app-server with local transports disabled and enables `remote_control` only for that invocation. ## Changes - Add a visible `codex remote-control` CLI subcommand. - Launch app-server with `AppServerTransport::Off`. - Append `features.remote_control=true` after root feature toggles so the explicit command wins over `--disable remote_control`. - Reject root `--remote` / `--remote-auth-token-env`, matching other non-TUI subcommands. - Add tests for parsing, launch defaults, override ordering, and remote flag rejection. ## Verification - `cargo test -p codex-cli` - `just fix -p codex-cli`	2026-05-07 10:17:07 -07:00
pakrym-oai	857e731478	[codex] Remove string-keyed MCP tool maps (#21454 ) ## Summary This PR removes the synthetic `HashMap<String, ToolInfo>` keys from MCP tool discovery. `McpConnectionManager::list_all_tools()` now returns normalized `Vec<ToolInfo>`, and downstream code derives identity from `ToolInfo::canonical_tool_name()`. The motivation is to keep model-visible tool identity on `ToolName`/`ToolInfo` instead of parallel string map keys, so future namespace changes do not have to preserve otherwise-unused lookup keys. ## Changes - Rename the MCP normalization path from `qualify_tools` to `normalize_tools_for_model` and return tool values directly. - Flow MCP tool lists through connectors, plugin injection, router/spec building, code mode, and tool search as vectors/slices. - Keep direct/deferred subtraction local to `mcp_tool_exposure`, using `ToolName` values. - Update tests to compare `ToolName` instances where MCP identity matters. ## Validation - `cargo test -p codex-mcp test_normalize_tools` - `cargo test -p codex-core mcp_tool_exposure` - `cargo test -p codex-core direct_mcp_tools_register_namespaced_handlers` - `cargo test -p codex-core search_tool_registers_namespaced_mcp_tool_aliases` - `just fix -p codex-mcp` - `just fix -p codex-core`	2026-05-07 10:16:10 -07:00
xl-openai	114bac1409	feat: Expose plugin share metadata in shareContext (#21495 ) Extends PluginSummary.shareContext with shareUrl and reader shareTargets	2026-05-07 10:07:03 -07:00
rhan-oai	3444b0d60a	[codex-analytics] add tool review event schema (#18747 ) ## Why We want to emit terminal review analytics for tool-related approval flows, but the event contract needs to exist before the reducer can publish anything. This PR is the schema-only slice for the Codex review event family. ## What changed - add the `ReviewEvent` analytics envelope in `codex-rs/analytics/src/events.rs` - define the review subject kind, reviewer, trigger, terminal status, and post-review resolution enums - define the review event payload with thread, turn, item, lineage, tool, and timing fields that the emitter stack will populate ## Verification - stacked verification in dependent PRs: `cargo test -p codex-analytics analytics_client_tests --manifest-path codex-rs/Cargo.toml` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18747). * #18748 * #21434 * __->__ #18747 * #17090 * #17089 * #20514	2026-05-07 09:46:46 -07:00
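
Commits 27ec488ad5 and 8367ef4522 above describe a `profiling` Cargo profile with LTO disabled and a switch to descriptive profile option names. The fragment below is a minimal sketch of what such a profile could look like, not the repository's actual `Cargo.toml`: the `inherits`, `debug`, and `lto` values chosen here are assumptions for illustration, and only the profile name `profiling` comes from the commit message.

```toml
# Hypothetical sketch; the real values in the codex workspace may differ.
[profile.profiling]
# Assumption: start from the optimized release settings.
inherits = "release"
# Descriptive spelling of the debug level. In Cargo, `debug = 1` is the
# same as "limited", while `debug = true` is the same as 2 / "full",
# which is why the numeric and boolean forms are easy to confuse.
debug = "limited"
# "off" disables LTO entirely. Note that `lto = false` is not the same
# thing: it still performs thin local LTO within each crate.
lto = "off"
```

A custom profile like this is selected with `cargo build --profile profiling`, and its artifacts land in `target/profiling/`.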

6305 Commits