codex

mirror of https://github.com/openai/codex.git synced 2026-04-27 16:15:09 +00:00

Author	SHA1	Message	Date
Michael Bolin	bfff0c729f	config: enforce enterprise feature requirements (#13388 ) ## Why Enterprises can already constrain approvals, sandboxing, and web search through `requirements.toml` and MDM, but feature flags were still only configurable as managed defaults. That meant an enterprise could suggest feature values, but it could not actually pin them. This change closes that gap and makes enterprise feature requirements behave like the other constrained settings. The effective feature set now stays consistent with enterprise requirements during config load, when config writes are validated, and when runtime code mutates feature flags later in the session. It also tightens the runtime API for managed features. `ManagedFeatures` now follows the same constraint-oriented shape as `Constrained<T>` instead of exposing panic-prone mutation helpers, and production code can no longer construct it through an unconstrained `From<Features>` path. The PR also hardens the `compact_resume_fork` integration coverage on Windows. After the feature-management changes, `compact_resume_after_second_compaction_preserves_history` was overflowing the libtest/Tokio thread stacks on Windows, so the test now uses an explicit larger-stack harness as a pragmatic mitigation. That may not be the ideal root-cause fix, and it merits a parallel investigation into whether part of the async future chain should be boxed to reduce stack pressure instead. ## What Changed Enterprises can now pin feature values in `requirements.toml` with the requirements-side `features` table: ```toml [features] personality = true unified_exec = false ``` Only canonical feature keys are allowed in the requirements `features` table; omitted keys remain unconstrained. - Added a requirements-side pinned feature map to `ConfigRequirementsToml`, threaded it through source-preserving requirements merge and normalization in `codex-config`, and made the TOML surface use `[features]` (while still accepting legacy `[feature_requirements]` for compatibility). - Exposed `featureRequirements` from `configRequirements/read`, regenerated the JSON/TypeScript schema artifacts, and updated the app-server README. - Wrapped the effective feature set in `ManagedFeatures`, backed by `ConstrainedWithSource<Features>`, and changed its API to mirror `Constrained<T>`: `can_set(...)`, `set(...) -> ConstraintResult<()>`, and result-returning `enable` / `disable` / `set_enabled` helpers. - Removed the legacy-usage and bulk-map passthroughs from `ManagedFeatures`; callers that need those behaviors now mutate a plain `Features` value and reapply it through `set(...)`, so the constrained wrapper remains the enforcement boundary. - Removed the production loophole for constructing unconstrained `ManagedFeatures`. Non-test code now creates it through the configured feature-loading path, and `impl From<Features> for ManagedFeatures` is restricted to `#[cfg(test)]`. - Rejected legacy feature aliases in enterprise feature requirements, and return a load error when a pinned combination cannot survive dependency normalization. - Validated config writes against enterprise feature requirements before persisting changes, including explicit conflicting writes and profile-specific feature states that normalize into invalid combinations. - Updated runtime and TUI feature-toggle paths to use the constrained setter API and to persist or apply the effective post-constraint value rather than the requested value. - Updated the `core_test_support` Bazel target to include the bundled core model-catalog fixtures in its runtime data, so helper code that resolves `core/models.json` through runfiles works in remote Bazel test environments. - Renamed the core config test coverage to emphasize that effective feature values are normalized at runtime, while conflicting persisted config writes are rejected. - Ran `compact_resume_after_second_compaction_preserves_history` inside an explicit 8 MiB test thread and Tokio runtime worker stack, following the existing larger-stack integration-test pattern, to keep the Windows `compact_resume_fork` test slice from aborting while a parallel investigation continues into whether some of the underlying async futures should be boxed. ## Verification - `cargo test -p codex-config` - `cargo test -p codex-core feature_requirements_ -- --nocapture` - `cargo test -p codex-core load_requirements_toml_produces_expected_constraints -- --nocapture` - `cargo test -p codex-core compact_resume_after_second_compaction_preserves_history -- --nocapture` - `cargo test -p codex-core compact_resume_fork -- --nocapture` - Re-ran the built `codex-core` `tests/all` binary with `RUST_MIN_STACK=262144` for `compact_resume_after_second_compaction_preserves_history` to confirm the explicit-stack harness fixes the deterministic low-stack repro. - `cargo test -p codex-core` - This still fails locally in unrelated integration areas that expect the `codex` / `test_stdio_server` binaries or hit existing `search_tool` wiremock mismatches. ## Docs `developers.openai.com/codex` should document the requirements-side `[features]` table for enterprise and MDM-managed configuration, including that it only accepts canonical feature keys and that conflicting config writes are rejected.	2026-03-04 04:40:22 +00:00
sayan-oai	082682a628	feat: load plugin apps (#13401 ) load plugin-apps from `.app.json`. make apps runtime-mentionable iff `codex_apps` MCP actually exposes tools for that `connector_id`. if the app isn't available, it's filtered out of runtime connector set, so no tools are added and no app-mentions resolve. right now we don't have a clean cli-side error for an app not being installed. can look at this after. ### Tests Added tests, tested locally that using a plugin that bundles an app picks up the app.	2026-03-03 16:29:15 -08:00
pakrym-oai	69df12efb3	Remove Responses V1 websocket implementation (#13364 ) V2 is the way to go!	2026-03-03 11:32:53 -07:00
pash-openai	2f5b01abd6	add fast mode toggle (#13212 ) - add a local Fast mode setting in codex-core (similar to how model id is currently stored on disk locally) - send `service_tier=priority` on requests when Fast is enabled - add `/fast` in the TUI and persist it locally - feature flag	2026-03-02 20:29:33 -08:00
Ahmed Ibrahim	b20b6aa46f	Update realtime websocket API (#13265 ) - migrate the realtime websocket transport to the new session and handoff flow - make the realtime model configurable in config.toml and use API-key auth for the websocket --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-02 16:05:40 -08:00
Ahmed Ibrahim	0aeb55bf08	Record realtime close marker on replacement (#13058 ) ## Summary - record a realtime close developer message when a new realtime session replaces an active one - assert the replacement marker through the mocked responses request path --------- Co-authored-by: Codex <noreply@openai.com> Co-authored-by: Charles Cunningham <ccunningham@openai.com>	2026-03-01 13:54:12 -08:00
Michael Bolin	1a8d930267	core: adopt host_executable() rules in zsh-fork (#13046 ) ## Why [#12964](https://github.com/openai/codex/pull/12964) added `host_executable()` support to `codex-execpolicy`, but the zsh-fork interception path in `unix_escalation.rs` was still evaluating commands with the default exact-token matcher. That meant an intercepted absolute executable such as `/usr/bin/git status` could still miss basename rules like `prefix_rule(pattern = ["git", "status"])`, even when the policy also defined a matching `host_executable(name = "git", ...)` entry. This PR adopts the new matching behavior in the zsh-fork runtime only. That keeps the rollout intentionally narrow: zsh-fork already requires explicit user opt-in, so it is a safer first caller to exercise the new `host_executable()` scheme before expanding it to other execpolicy call sites. It also brings zsh-fork back in line with the current `prefix_rule()` execution model. Until prefix rules can carry their own permission profiles, a matched `prefix_rule()` is expected to rerun the intercepted command unsandboxed on `allow`, or after the user accepts `prompt`, instead of merely continuing inside the inherited shell sandbox. ## What Changed - added `evaluate_intercepted_exec_policy()` in `core/src/tools/runtimes/shell/unix_escalation.rs` to centralize execpolicy evaluation for intercepted commands - switched intercepted direct execs in the zsh-fork path to `check_multiple_with_options(...)` with `MatchOptions { resolve_host_executables: true }` - added `commands_for_intercepted_exec_policy()` so zsh-fork policy evaluation works from intercepted `(program, argv)` data instead of reconstructing a synthetic command before matching - left shell-wrapper parsing intentionally disabled by default behind `ENABLE_INTERCEPTED_EXEC_POLICY_SHELL_WRAPPER_PARSING`, so path-sensitive matching relies on later direct exec interception rather than shell-script parsing - made matched `prefix_rule()` decisions rerun intercepted commands with `EscalationExecution::Unsandboxed`, while unmatched-command fallback keeps the existing sandbox-preserving behavior - extracted the zsh-fork test harness into `core/tests/common/zsh_fork.rs` so both the skill-focused and approval-focused integration suites can exercise the same runtime setup - limited this change to the intercepted zsh-fork path rather than changing every execpolicy caller at once - added runtime coverage in `core/src/tools/runtimes/shell/unix_escalation_tests.rs` for allowed and disallowed `host_executable()` mappings and the wrapper-parsing modes - added integration coverage in `core/tests/suite/approvals.rs` to verify a saved `prefix_rule(pattern=["touch"], decision="allow")` reruns under zsh-fork outside a restrictive `WorkspaceWrite` sandbox --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/13046). * #13065 * __->__ #13046	2026-02-28 01:41:23 +00:00
Curtis 'Fjord' Hawthorne	7e980d7db6	Support multimodal custom tool outputs (#12948 ) ## Summary This changes `custom_tool_call_output` to use the same output payload shape as `function_call_output`, so freeform tools can return either plain text or structured content items. The main goal is to let `js_repl` return image content from nested `view_image` calls in its own `custom_tool_call_output`, instead of relying on a separate injected message. ## What changed - Changed `custom_tool_call_output.output` from `string` to `FunctionCallOutputPayload` - Updated freeform tool plumbing to preserve structured output bodies - Updated `js_repl` to aggregate nested tool content items and attach them to the outer `js_repl` result - Removed the old `js_repl` special case that injected `view_image` results as a separate pending user image message - Updated normalization/history/truncation paths to handle multimodal `custom_tool_call_output` - Regenerated app-server protocol schema artifacts ## Behavior Direct `view_image` calls still return a `function_call_output` with image content. When `view_image` is called inside `js_repl`, the outer `js_repl` `custom_tool_call_output` now carries: - an `input_text` item if the JS produced text output - one or more `input_image` items from nested tool results So the nested image result now stays inside the `js_repl` tool output instead of being injected as a separate message. ## Compatibility This is intended to be backward-compatible for resumed conversations. Older histories that stored `custom_tool_call_output.output` as a plain string still deserialize correctly, and older histories that used the previous injected-image-message flow also continue to resume. Added regression coverage for resuming a pre-change rollout containing: - string-valued `custom_tool_call_output` - legacy injected image message history #### [git stack](https://github.com/magus/git-stack-cli) - 👉 `1` https://github.com/openai/codex/pull/12948	2026-02-26 18:17:46 -08:00
Ahmed Ibrahim	a11da86b37	Make realtime audio test deterministic (#12959 ) ## Summary\n- add a websocket test-server request waiter so tests can synchronize on recorded client messages\n- use that waiter in the realtime delegation test instead of a fixed audio timeout\n- add temporary timing logs in the test and websocket mock to inspect where the flake stalls	2026-02-26 16:09:00 -08:00
pakrym-oai	951a389654	Allow clients not to send summary as an option (#12950 ) Summary is a required parameter on UserTurn. Ideally we'd like the core to decide the appropriate summary level. Make the summary optional and don't send it when not needed.	2026-02-26 14:37:38 -08:00
jif-oai	3404ecff15	feat: add post-compaction sub-agent infos (#12774 ) Co-authored-by: Codex <noreply@openai.com>	2026-02-26 18:55:34 +00:00
Charley Cunningham	07aefffb1f	core: bundle settings diff updates into one dev/user envelope (#12417 ) ## Summary - bundle contextual prompt injection into at most one developer message plus one contextual user message in both: - per-turn settings updates - initial context insertion - preserve `<model_switch>` across compaction by rebuilding it through canonical initial-context injection, instead of relying on strip/reattach hacks - centralize contextual user fragment detection in one shared definition table and reuse it for parsing/compaction logic - keep `AGENTS.md` in its natural serialized format: - `# AGENTS.md instructions for {dirname}` - `<INSTRUCTIONS>...</INSTRUCTIONS>` - simplify related tests/helpers and accept the expected snapshot/layout updates from bundled multi-part messages ## Why The goal is to converge toward a simpler, more intentional prompt shape where contextual updates are consistently represented as one developer envelope plus one contextual user envelope, while keeping parsing and compaction behavior aligned with that representation. ## Notable details - the temporary `SettingsUpdateEnvelope` wrapper was removed; these paths now return `Vec<ResponseItem>` directly - local/remote compaction no longer rely on model-switch strip/restore helpers - contextual user detection is now driven by shared fragment definitions instead of ad hoc matcher assembly - AGENTS/user instructions are still the same logical context; only the synthetic `<user_instructions>` wrapper was replaced by the natural AGENTS text format ## Testing - `just fmt` - `cargo test -p codex-app-server codex_message_processor::tests::extract_conversation_summary_prefers_plain_user_messages -- --exact` - `cargo test -p codex-core compact::tests::collect_user_messages_filters_session_prefix_entries --lib -- --exact` - `cargo test -p codex-core --test all 'suite::compact::snapshot_request_shape_pre_turn_compaction_strips_incoming_model_switch' -- --exact` - `cargo test -p codex-core --test all 'suite::compact_remote::snapshot_request_shape_remote_pre_turn_compaction_strips_incoming_model_switch' -- --exact` - `cargo test -p codex-core --test all 'suite::client::includes_apps_guidance_as_developer_message_when_enabled' -- --exact` - `cargo test -p codex-core --test all 'suite::client::includes_developer_instructions_message_in_request' -- --exact` - `cargo test -p codex-core --test all 'suite::client::includes_user_instructions_message_in_request' -- --exact` - `cargo test -p codex-core --test all 'suite::client::resume_includes_initial_messages_and_sends_prior_items' -- --exact` - `cargo test -p codex-core --test all 'suite::review::review_input_isolated_from_parent_history' -- --exact` - `cargo test -p codex-exec --test all 'suite::resume::exec_resume_last_respects_cwd_filter_and_all_flag' -- --exact` - `cargo test -p core_test_support context_snapshot::tests::full_text_mode_preserves_unredacted_text -- --exact` ## Notes - I also ran several targeted `compact`, `compact_remote`, `prompt_caching`, `model_visible_layout`, and `event_mapping` tests while iterating on prompt-shape changes. - I have not claimed a clean full-workspace `cargo test` from this environment because local sandbox/resource conditions have previously produced unrelated failures in large workspace runs.	2026-02-26 00:12:08 -08:00
daveaitel-openai	dcab40123f	Agent jobs (spawn_agents_on_csv) + progress UI (#10935 ) ## Summary - Add agent job support: spawn a batch of sub-agents from CSV, auto-run, auto-export, and store results in SQLite. - Simplify workflow: remove run/resume/get-status/export tools; spawn is deterministic and completes in one call. - Improve exec UX: stable, single-line progress bar with ETA; suppress sub-agent chatter in exec. ## Why Enables map-reduce style workflows over arbitrarily large repos using the existing Codex orchestrator. This addresses review feedback about overly complex job controls and non-deterministic monitoring. ## Demo (progress bar) ``` ./codex-rs/target/debug/codex exec \ --enable collab \ --enable sqlite \ --full-auto \ --progress-cursor \ -c agents.max_threads=16 \ -C /Users/daveaitel/code/codex \ - <<'PROMPT' Create /tmp/agent_job_progress_demo.csv with columns: path,area and 30 rows: path = item-01..item-30, area = test. Then call spawn_agents_on_csv with: - csv_path: /tmp/agent_job_progress_demo.csv - instruction: "Run `python - <<'PY'` to sleep a random 0.3–1.2s, then output JSON with keys: path, score (int). Set score = 1." - output_csv_path: /tmp/agent_job_progress_demo_out.csv PROMPT ``` ## Review feedback addressed - Auto-start jobs on spawn; removed run/resume/status/export tools. - Auto-export on success. - More descriptive tool spec + clearer prompts. - Avoid deadlocks on spawn failure; pending/running handled safely. - Progress bar no longer scrolls; stable single-line redraw. ## Tests - `cd codex-rs && cargo test -p codex-exec` - `cd codex-rs && cargo build -p codex-cli`	2026-02-24 21:00:19 +00:00
pakrym-oai	97d0068658	Send warmup request (#11258 ) Send a request with `generate: falls` but a full set of tools and instructions to pre-warm inference. --------- Co-authored-by: Codex <noreply@openai.com>	2026-02-24 08:15:47 -08:00
Michael Bolin	e8949f4507	test: vendor zsh fork via DotSlash and stabilize zsh-fork tests (#12518 ) ## Why The zsh integration tests were still brittle in two ways: - they relied on `CODEX_TEST_ZSH_PATH` / environment-specific setup, so they often did not exercise the patched zsh fork that `shell-tool-mcp` ships - once the tests consistently used the vendored zsh fork, they exposed real Linux-specific zsh-fork issues in CI In particular, the Linux failures were not just test noise: - the zsh-fork launch path was dropping `ExecRequest.arg0`, so Linux `codex-linux-sandbox` arg0 dispatch did not run and zsh wrapper-mode could receive malformed arguments - the `turn_start_shell_zsh_fork_subcommand_decline_marks_parent_declined_v2` test uses the zsh exec bridge (which talks to the parent over a Unix socket), but Linux restricted sandbox seccomp denies `connect(2)`, causing timeouts on `ubuntu-24.04` x86/arm This PR makes the zsh tests consistently run against the intended vendored zsh fork and fixes/hardens the zsh-fork path so the Linux CI signal is meaningful. ## What Changed - Added a single shared test-only DotSlash file for the patched zsh fork at `codex-rs/exec-server/tests/suite/zsh` (analogous to the existing `bash` test resource). - Updated both app-server and exec-server zsh tests to use that shared DotSlash zsh (no duplicate zsh DotSlash file, no `CODEX_TEST_ZSH_PATH` dependency). - Updated the app-server zsh-fork test helper to resolve the shared DotSlash zsh and avoid silently falling back to host zsh. - Kept the app-server zsh-fork tests configured via `config.toml`, using a test wrapper path where needed to force `zsh -df` (and rewrite `-lc` to `-c`) for the subcommand-decline test. - Hardened the app-server subcommand-decline zsh-fork test for CI variability: - tolerate an extra `/responses` POST with a no-op mock response - tolerate non-target approval ordering while remaining strict on the two `/usr/bin/true` approvals and decline behavior - use `DangerFullAccess` on Linux for this one test because it validates zsh approval flow, not Linux sandbox socket restrictions - Fixed zsh-fork process launching on Linux by preserving `req.arg0` in `ZshExecBridge::execute_shell_request(...)` so `codex-linux-sandbox` arg0 dispatch continues to work. - Moved `maybe_run_zsh_exec_wrapper_mode()` under `arg0_dispatch_or_else(...)` in `app-server` and `cli` so wrapper-mode handling coexists correctly with arg0-dispatched helper modes. - Consolidated duplicated `dotslash -- fetch` resolution logic into shared test support (`core/tests/common/lib.rs`). - Updated `codex-rs/exec-server/tests/suite/accept_elicitation.rs` to use DotSlash zsh and hardened the zsh elicitation test for Bazel/zsh differences by: - resolving an absolute `git` path - running `git init --quiet .` - asserting success / `.git` creation instead of relying on banner text ## Verification - `cargo test -p codex-app-server turn_start_zsh_fork -- --nocapture` - `cargo test -p codex-exec-server accept_elicitation -- --nocapture` - `bazel test //codex-rs/exec-server:exec-server-all-test --test_output=streamed --test_arg=--nocapture --test_arg=accept_elicitation_for_prompt_rule_with_zsh` - CI (`rust-ci`) on the final cleaned commit: `Tests — ubuntu-24.04 - x86_64-unknown-linux-gnu` and `Tests — ubuntu-24.04-arm - aarch64-unknown-linux-gnu` passed in [run 22291424358](https://github.com/openai/codex/actions/runs/22291424358)	2026-02-22 19:39:56 -08:00
Michael Bolin	1af2a37ada	chore: remove codex-core public protocol/shell re-exports (#12432 ) ## Why `codex-rs/core/src/lib.rs` re-exported a broad set of types and modules from `codex-protocol` and `codex-shell-command`. That made it easy for workspace crates to import those APIs through `codex-core`, which in turn hides dependency edges and makes it harder to reduce compile-time coupling over time. This change removes those public re-exports so call sites must import from the source crates directly. Even when a crate still depends on `codex-core` today, this makes dependency boundaries explicit and unblocks future work to drop `codex-core` dependencies where possible. ## What Changed - Removed public re-exports from `codex-rs/core/src/lib.rs` for: - `codex_protocol::protocol` and related protocol/model types (including `InitialHistory`) - `codex_protocol::config_types` (`protocol_config_types`) - `codex_shell_command::{bash, is_dangerous_command, is_safe_command, parse_command, powershell}` - Migrated workspace Rust call sites to import directly from: - `codex_protocol::protocol` - `codex_protocol::config_types` - `codex_protocol::models` - `codex_shell_command` - Added explicit `Cargo.toml` dependencies (`codex-protocol` / `codex-shell-command`) in crates that now import those crates directly. - Kept `codex-core` internal modules compiling by using `pub(crate)` aliases in `core/src/lib.rs` (internal-only, not part of the public API). - Updated the two utility crates that can already drop a `codex-core` dependency edge entirely: - `codex-utils-approval-presets` - `codex-utils-cli` ## Verification - `cargo test -p codex-utils-approval-presets` - `cargo test -p codex-utils-cli` - `cargo check --workspace --all-targets` - `just clippy`	2026-02-20 23:45:35 -08:00
Charley Cunningham	bb0ac5be70	Fix compaction context reinjection and model baselines (#12252 ) ## Summary - move regular-turn context diff/full-context persistence into `run_turn` so pre-turn compaction runs before incoming context updates are recorded - after successful pre-turn compaction, rely on a cleared `reference_context_item` to trigger full context reinjection on the follow-up regular turn (manual `/compact` keeps replacement history summary-only and also clears the baseline) - preserve `<model_switch>` when full context is reinjected, and inject it before the rest of the full-context items - scope `reference_context_item` and `previous_model` to regular user turns only so standalone tasks (`/compact`, shell, review, undo) cannot suppress future reinjection or `<model_switch>` behavior - make context-diff persistence + `reference_context_item` updates explicit in the regular-turn path, with clearer docs/comments around the invariant - stop persisting local `/compact` `RolloutItem::TurnContext` snapshots (only regular turns persist `TurnContextItem` now) - simplify resume/fork previous-model/reference-baseline hydration by looking up the last surviving turn context from rollout lifecycle events, including rollback and compaction-crossing handling - remove the legacy fallback that guessed from bare `TurnContext` rollouts without lifecycle events - update compaction/remote-compaction/model-visible snapshots and compact test assertions (including remote compaction mock response shape) ## Why We were persisting incoming context items before spawning the regular turn task, which let pre-turn compaction requests accidentally include incoming context diffs without the new user message. Fixing that exposed follow-on baseline issues around `/compact`, resume/fork, and standalone tasks that could cause duplicate context injection or suppress `<model_switch>` instructions. This PR re-centers the invariants around regular turns: - regular turns persist model-visible context diffs/full reinjection and update the `reference_context_item` - standalone tasks do not advance those regular-turn baselines - compaction clears the baseline when replacement history may have stripped the referenced context diffs ## Follow-ups (TODOs left in code) - `TODO(ccunningham)`: fix rollback/backtracking baseline handling more comprehensively - `TODO(ccunningham)`: include pending incoming context items in pre-turn compaction threshold estimation - `TODO(ccunningham)`: inject updated personality spec alongside `<model_switch>` so some model-switch paths can avoid forced full reinjection - `TODO(ccunningham)`: review task turn lifecycle (`TurnStarted`/`TurnComplete`) behavior and emit task-start context diffs for task types that should have them (excluding `/compact`) ## Validation - `just fmt` - CI should cover the updated compaction/resume/model-visible snapshot expectations and rollout-hydration behavior - I did not rerun the full local test suite after the latest resume-lookup / rollout-persistence simplifications	2026-02-20 23:13:08 -08:00
Charley Cunningham	eb68767f2f	Unify remote compaction snapshot mocks around default endpoint behavior (#12050 ) ## Summary - standardize remote compaction test mocking around one default behavior in shared helpers - make default remote compact mocks mirror production shape: keep `message/user` + `message/developer`, drop assistant/tool artifacts, then append a summary user message - switch non-special `compact_remote` tests to the shared default mock instead of ad-hoc JSON payloads ## Special-case tests that still use explicit mocks - remote compaction error payload / HTTP failure behavior - summary-only compact output behavior - manual `/compact` with no prior user messages - stale developer-instruction injection coverage ## Why This removes inconsistent manual remote compaction fixtures and gives us one source of truth for normal remote compact behavior, while preserving explicit mocks only where tests intentionally cover non-default behavior.	2026-02-17 18:18:47 -08:00
Anton Panasenko	1d95656149	bazel: fix snapshot parity for tests/.rs rust_test targets (#11893 ) ## Summary - make `rust_test` targets generated from `tests/.rs` use Cargo-style crate names (file stem) so snapshot names match Cargo (`all__...` instead of Bazel-derived names) - split lib vs `tests/.rs` test env wiring in `codex_rust_crate` to keep existing lib snapshot behavior while applying Bazel runfiles-compatible workspace root for `tests/.rs` - compute the `tests/*.rs` snapshot workspace root from package depth so `insta` resolves committed snapshots under Bazel `--noenable_runfiles` ## Validation - `bazelisk test //codex-rs/core:core-all-test --test_arg=suite::compact:: --cache_test_results=no` - `bazelisk test //codex-rs/core:core-all-test --test_arg=suite::compact_remote:: --cache_test_results=no`	2026-02-16 07:11:59 +00:00
Anton Panasenko	02abd9a8ea	feat: persist and restore codex app's tools after search (#11780 ) ### What changed 1. Removed per-turn MCP selection reset in `core/src/tasks/mod.rs`. 2. Added `SessionState::set_mcp_tool_selection(Vec<String>)` in `core/src/state/session.rs` for authoritative restore behavior (deduped, order-preserving, empty clears). 3. Added rollout parsing in `core/src/codex.rs` to recover `active_selected_tools` from prior `search_tool_bm25` outputs: - tracks matching `call_id`s - parses function output text JSON - extracts `active_selected_tools` - latest valid payload wins - malformed/non-matching payloads are ignored 4. Applied restore logic to resumed and forked startup paths in `core/src/codex.rs`. 5. Updated instruction text to session/thread scope in `core/templates/search_tool/tool_description.md`. 6. Expanded tests in `core/tests/suite/search_tool.rs`, plus unit coverage in: - `core/src/codex.rs` - `core/src/state/session.rs` ### Behavior after change 1. Search activates matched tools. 2. Additional searches union into active selection. 3. Selection survives new turns in the same thread. 4. Resume/fork restores selection from rollout history. 5. Separate threads do not inherit selection unless forked.	2026-02-15 19:18:41 -08:00
Charley Cunningham	85034b189e	core: snapshot tests for compaction requests, post-compaction layout, some additional compaction tests (#11487 ) This PR keeps compaction context-layout test coverage separate from runtime compaction behavior changes, so runtime logic review can stay focused. ## Included - Adds reusable context snapshot helpers in `core/tests/common/context_snapshot.rs` for rendering model-visible request/history shapes. - Standardizes helper naming for readability: - `format_request_input_snapshot` - `format_response_items_snapshot` - `format_labeled_requests_snapshot` - `format_labeled_items_snapshot` - Expands snapshot coverage for both local and remote compaction flows: - pre-turn auto-compaction - pre-turn failure/context-window-exceeded paths - mid-turn continuation compaction - manual `/compact` with and without prior user turns - Captures both sides where relevant: - compaction request shape - post-compaction history layout shape - Adds/uses shared request-inspection helpers so assertions target structured request content instead of ad-hoc JSON string parsing. - Aligns snapshots/assertions to current behavior and leaves explicit `TODO(ccunningham)` notes where behavior is known and intentionally deferred. ## Not Included - No runtime compaction logic changes. - No model-visible context/state behavior changes.	2026-02-14 19:57:10 -08:00
pakrym-oai	eac5473114	Do not attempt to append after response.completed (#11402 ) Completed responses are fully done, and new response must be created.	2026-02-11 07:45:17 -08:00
Michael Bolin	476c1a7160	Remove `test-support` feature from `codex-core` and replace it with explicit test toggles (#11405 ) ## Why `codex-core` was being built in multiple feature-resolved permutations because test-only behavior was modeled as crate features. For a large crate, those permutations increase compile cost and reduce cache reuse. ## Net Change - Removed the `test-support` crate feature and related feature wiring so `codex-core` no longer needs separate feature shapes for test consumers. - Standardized cross-crate test-only access behind `codex_core::test_support`. - External test code now imports helpers from `codex_core::test_support`. - Underlying implementation hooks are kept internal (`pub(crate)`) instead of broadly public. ## Outcome - Fewer `codex-core` build permutations. - Better incremental cache reuse across test targets. - No intended production behavior change.	2026-02-10 22:44:02 -08:00
Michael Bolin	b68a84ee8e	Remove `deterministic_process_ids` feature to avoid duplicate `codex-core` builds (#11393 ) ## Why `codex-core` enabled `deterministic_process_ids` through a self dev-dependency. That forced a second feature-resolved build of the same crate, which increased compile time and test latency. ## What Changed - Removed the `deterministic_process_ids` feature from `codex-rs/core/Cargo.toml`. - Removed the self dev-dependency on `codex-core` that enabled that feature. - Removed the Bazel `deterministic_process_ids` crate feature for `codex-core`. - Added a test-only `AtomicBool` override in unified exec process-id allocation. - Added a test-support setter for that override and re-exported it from `codex-core`. - Enabled deterministic process IDs in integration tests via `core_test_support` ctor. ## Behavior - Production behavior remains random process IDs. - Unit tests remain deterministic via `cfg(test)`. - Integration tests remain deterministic via explicit test-support initialization. ## Validation - `just fmt` - `cargo test -p codex-core unified_exec::` - `cargo test -p codex-core --test all unified_exec -- --test-threads=1` - `cargo tree -p codex-core -e features` (verified the removed feature path)	2026-02-10 19:07:01 -08:00
Anton Panasenko	a94505a92a	feat: enable premessage-deflate for websockets (#10966 ) note: unfortunately, tokio-tungstenite / tungstenite upgrade triggers some problems with linker of rama-tls-boring with openssl: ``` error: linking with `/Users/apanasenko/Library/Caches/cargo-zigbuild/0.20.1/zigcc-x86_64-unknown-linux-musl-ff6a.sh` failed: exit status: 1 \| = note: "/Users/apanasenko/Library/Caches/cargo-zigbuild/0.20.1/zigcc-x86_64-unknown-linux-musl-ff6a.sh" "-m64" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/rcrt1.o" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/crti.o" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/crtbeginS.o" "<1 object files omitted>" "-Wl,--as-needed" "-Wl,-Bstatic" "/var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/{liblzma_sys-662a82316f96ec30,libbzip2_sys-bf78a2d58d5cbce6,liblibsqlite3_sys-6c004987fd67a36a,libtree_sitter_bash-220b99a97d331ab7,libtree_sitter-858f0a1dbfea58bd,libzstd_sys-6eb237deec748c5b,libring-2a87376483bf916f,libopenssl_sys-7c189e68b37fe2bb,liblibz_sys-4344eef4345520b1,librama_boring_sys-0414e98115015ee0}.rlib" "-lc++" "-lc++abi" "-lunwind" "-lc" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/libcompiler_builtins-*.rlib" "-L" "/var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/raw-dylibs" "-Wl,-Bdynamic" "-Wl,--eh-frame-hdr" "-Wl,-z,noexecstack" "-nostartfiles" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/libz-sys-ff5ea50d88c28ffb/out/lib" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/ring-bdec3dddc19f5a5e/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/openssl-sys-96e0870de3ca22bc/out/openssl-build/install/lib" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/zstd-sys-0cc37a5da1481740/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/tree-sitter-72d2418073317c0f/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/tree-sitter-bash-bfd293a9f333ce6a/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/libsqlite3-sys-b78b2cfb81a330fc/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/bzip2-sys-69a145cc859ef275/out/lib" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/lzma-sys-07e92d0b6baa6fd4/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build/crypto/" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build/ssl/" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build/" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build" "-L" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained" "-L" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib" "-o" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/deps/codex_network_proxy-d08268b863517761" "-Wl,--gc-sections" "-static-pie" "-Wl,-z,relro,-z,now" "-Wl,-O1" "-Wl,--strip-all" "-nodefaultlibs" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/crtendS.o" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/crtn.o" = note: some arguments are omitted. use `--verbose` to show all linker arguments = note: warning: ignoring deprecated linker optimization setting '1' warning: unable to open library directory '/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build/crypto/': FileNotFound ld.lld: error: duplicate symbol: SSL_export_keying_material >>> defined at ssl_lib.c:3816 (ssl/ssl_lib.c:3816) >>> libssl-lib-ssl_lib.o:(SSL_export_keying_material) in archive /var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/libopenssl_sys-7c189e68b37fe2bb.rlib >>> defined at t1_enc.cc:205 (/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/boringssl/ssl/t1_enc.cc:205) >>> t1_enc.cc.o:(.text.SSL_export_keying_material+0x0) in archive /var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/librama_boring_sys-0414e98115015ee0.rlib ld.lld: error: duplicate symbol: d2i_ASN1_TIME >>> defined at a_time.c:27 (crypto/asn1/a_time.c:27) >>> libcrypto-lib-a_time.o:(d2i_ASN1_TIME) in archive /var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/libopenssl_sys-7c189e68b37fe2bb.rlib >>> defined at a_time.cc:34 (/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/boringssl/crypto/asn1/a_time.cc:34) >>> a_time.cc.o:(.text.d2i_ASN1_TIME+0x0) in archive /var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/librama_boring_sys-0414e98115015ee0.rlib ``` that force me to migrate away from rama-tls-boring to rama-tls-rustls and pin `ring` for rustls.	2026-02-07 17:59:34 -08:00
Josh McKinney	e416e578bb	core: preconnect Responses websocket for first turn (#10698 ) ## Problem The first user turn can pay websocket handshake latency even when a session has already started. We want to reduce that initial delay while preserving turn semantics and avoiding any prompt send during startup. Reviewer feedback also called out duplicated connect/setup paths and unnecessary preconnect state complexity. ## Mental model `ModelClient` owns session-scoped transport state. During session startup, it can opportunistically warm one websocket handshake slot. A turn-scoped `ModelClientSession` adopts that slot once if available, restores captured sticky turn-state, and otherwise opens a websocket through the same shared connect path. If startup preconnect is still in flight, first turn setup awaits that task and treats it as the first connection attempt for the turn. Preconnect is handshake-only. The first `response.create` is still sent only when a turn starts. ## Non-goals This change does not make preconnect required for correctness and does not change prompt/turn payload semantics. It also does not expand fallback behavior beyond clearing preconnect state when fallback activates. ## Tradeoffs The implementation prioritizes simpler ownership and shared connection code over header-match gating for reuse. The single-slot cache keeps lifecycle straightforward but only benefits the immediate next turn. Awaiting in-flight preconnect has the same app-level connect-timeout semantics as existing websocket connect behavior (no new timeout class introduced by this PR). ## Architecture `core/src/client.rs`: - Added session-level preconnect lifecycle state (`Idle` / `InFlight` / `Ready`) carrying one warmed websocket plus optional captured turn-state. - Added `pre_establish_connection()` startup warmup and `preconnect()` handshake-only setup. - Deduped auth/provider resolution into `current_client_setup()` and websocket handshake wiring into `connect_websocket()` / `build_websocket_headers()`. - Updated turn websocket path to adopt preconnect first, await in-flight preconnect when present, then create a new websocket only when needed. - Ensured fallback activation clears warmed preconnect state. - Added documentation for lifecycle, ownership, sticky-routing invariants, and timeout semantics. `core/src/codex.rs`: - Session startup invokes `model_client.pre_establish_connection(...)`. - Turn metadata resolution uses the shared timeout helper. `core/src/turn_metadata.rs`: - Centralized shared timeout helper used by both turn-time metadata resolution and startup preconnect metadata building. `core/tests/common/responses.rs` + websocket test suites: - Added deterministic handshake waiting helper (`wait_for_handshakes`) with bounded polling. - Added startup preconnect and in-flight preconnect reuse coverage. - Fallback expectations now assert exactly two websocket attempts in covered scenarios (startup preconnect + turn attempt before fallback sticks). ## Observability Preconnect remains best-effort and non-fatal. Existing websocket/fallback telemetry remains in place, and debug logs now make preconnect-await behavior and preconnect failures easier to reason about. ## Tests Validated with: 1. `just fmt` 2. `cargo test -p codex-core websocket_preconnect -- --nocapture` 3. `cargo test -p codex-core websocket_fallback -- --nocapture` 4. `cargo test -p codex-core websocket_first_turn_waits_for_inflight_preconnect -- --nocapture`	2026-02-06 19:08:24 +00:00
Ahmed Ibrahim	f9c38f531c	add none personality option (#10688 ) - add none personality enum value and empty placeholder behavior\n- add docs/schema updates and e2e coverage	2026-02-04 15:40:33 -08:00
pakrym-oai	0efd33f7f4	Update tests to stop using sse_completed fixture (#10638 ) Summary: - replace the `sse_completed` fixture and related JSON template with direct `responses::ev_completed` payload builders - cascade the new SSE helpers through all affected core tests for consistency and clarity - remove legacy fixtures that were no longer needed once the helpers are in place Testing: - Not run (not requested)	2026-02-04 08:38:06 -08:00
Michael Bolin	66447d5d2c	feat: replace custom mcp-types crate with equivalents from rmcp (#10349 ) We started working with MCP in Codex before https://crates.io/crates/rmcp was mature, so we had our own crate for MCP types that was generated from the MCP schema: `8b95d3e082/codex-rs/mcp-types/README.md` Now that `rmcp` is more mature, it makes more sense to use their MCP types in Rust, as they handle details (like the `_meta` field) that our custom version ignored. Though one advantage that our custom types had is that our generated types implemented `JsonSchema` and `ts_rs::TS`, whereas the types in `rmcp` do not. As such, part of the work of this PR is leveraging the adapters between `rmcp` types and the serializable types that are API for us (app server and MCP) introduced in #10356. Note this PR results in a number of changes to `codex-rs/app-server-protocol/schema`, which merit special attention during review. We must ensure that these changes are still backwards-compatible, which is possible because we have: ```diff - export type CallToolResult = { content: Array<ContentBlock>, isError?: boolean, structuredContent?: JsonValue, }; + export type CallToolResult = { content: Array<JsonValue>, structuredContent?: JsonValue, isError?: boolean, _meta?: JsonValue, }; ``` so `ContentBlock` has been replaced with the more general `JsonValue`. Note that `ContentBlock` was defined as: ```typescript export type ContentBlock = TextContent \| ImageContent \| AudioContent \| ResourceLink \| EmbeddedResource; ``` so the deletion of those individual variants should not be a cause of great concern. Similarly, we have the following change in `codex-rs/app-server-protocol/schema/typescript/Tool.ts`: ``` - export type Tool = { annotations?: ToolAnnotations, description?: string, inputSchema: ToolInputSchema, name: string, outputSchema?: ToolOutputSchema, title?: string, }; + export type Tool = { name: string, title?: string, description?: string, inputSchema: JsonValue, outputSchema?: JsonValue, annotations?: JsonValue, icons?: Array<JsonValue>, _meta?: JsonValue, }; ``` so: - `annotations?: ToolAnnotations` ➡️ `JsonValue` - `inputSchema: ToolInputSchema` ➡️ `JsonValue` - `outputSchema?: ToolOutputSchema` ➡️ `JsonValue` and two new fields: `icons?: Array<JsonValue>, _meta?: JsonValue` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10349). * #10357 * __->__ #10349 * #10356	2026-02-02 17:41:55 -08:00
pakrym-oai	fbb3a30953	Remove WebSocket wire format (#10179 ) I'd like WireApi to go away (when chat is removed) and WebSockets is still responses API just over a different transport.	2026-01-29 13:50:53 -08:00
sayan-oai	ff9fa56368	default enable compression, update test helpers (#10102 ) set `enable_request_compression` flag to default-enabled. update integration test helpers to decompress `zstd` if flag set.	2026-01-28 12:25:40 -08:00
sayan-oai	86adf53235	fix: handle all web_search actions and in progress invocations (#9960 ) ### Summary - Parse all `web_search` tool actions (`search`, `find_in_page`, `open_page`). - Previously we only parsed + displayed `search`, which made the TUI appear to pause when the other actions were being used. - Show in progress `web_search` calls as `Searching the web` - Previously we only showed completed tool calls <img width="308" height="149" alt="image" src="https://github.com/user-attachments/assets/90a4e8ff-b06a-48ff-a282-b57b31121845" /> ### Tests Added + updated tests, tested locally ### Follow ups Update VSCode extension to display these as well	2026-01-27 03:33:48 +00:00
pakrym-oai	998e88b12a	Use test_codex more (#9961 ) Reduces boilderplate.	2026-01-26 18:52:10 -08:00
Anton Panasenko	e117a3ff33	feat: support proxy for ws connection (#9719 ) reapply websocket changes without changing tls lib.	2026-01-22 15:23:15 -08:00
Dylan Hurd	8b3521ee77	feat(core) update Personality on turn (#9644 ) ## Summary Support updating Personality mid-Thread via UserTurn/OverwriteTurn. This is explicitly unused by the clients so far, to simplify PRs - app-server and tui implementations will be follow-ups. ## Testing - [x] added integration tests	2026-01-22 12:04:23 -08:00
pakrym-oai	4d48d4e0c2	Revert "feat: support proxy for ws connection" (#9693 ) Reverts openai/codex#9409	2026-01-22 15:57:18 +00:00
Anton Panasenko	7b27aa7707	feat: support proxy for ws connection (#9409 ) unfortunately tokio-tungstenite doesn't support proxy configuration outbox, while https://github.com/snapview/tokio-tungstenite/pull/370 is in review, we can depend on source code for now.	2026-01-20 09:36:30 -08:00
Ahmed Ibrahim	146d54cede	Add collaboration_mode override to turns (#9408 )	2026-01-16 21:51:25 -08:00
Ahmed Ibrahim	ebdd8795e9	Turn-state sticky routing per turn (#9332 ) - capture the header from SSE/WS handshakes, store it per ModelClientSession using `Oncelock`, echo it on turn-scoped requests, and add SSE+WS integration tests for within-turn persistence + cross-turn reset. - keep `x-codex-turn-state` sticky within a user turn to maintain routing continuity for retries/tool follow-ups.	2026-01-16 09:30:11 -08:00
charley-oai	4a9c2bcc5a	Add text element metadata to types (#9235 ) Initial type tweaking PR to make the diff of https://github.com/openai/codex/pull/9116 smaller This should not change any behavior, just adds some fields to types	2026-01-14 16:41:50 -08:00
pakrym-oai	9f8d3c14ce	Fix flakiness in WebSocket tests (#9169 ) The connection was being added to the list after the WebSocket response was sent. So the test can sometimes race and observe connections before the list was updated. After this change, connection and request is added to the list before the response is sent.	2026-01-13 15:09:59 -08:00
pakrym-oai	2d56519ecd	Support response.done and add integration tests (#9129 ) The agent loop using a persistent incremental web socket connection.	2026-01-13 16:12:30 +00:00
Ahmed Ibrahim	cbca43d57a	Send message by default mid turn. queue messages by tab (#9077 ) https://github.com/user-attachments/assets/03838730-4ddc-44df-a2c7-cb8ecda78660	2026-01-12 23:06:35 -08:00
pakrym-oai	490c1c1fdd	Add model client sessions (#9102 ) Maintain a long-running session.	2026-01-13 01:15:56 +00:00
zbarsky-openai	2a06d64bc9	feat: add support for building with Bazel (#8875 ) This PR configures Codex CLI so it can be built with [Bazel](https://bazel.build) in addition to Cargo. The `.bazelrc` includes configuration so that remote builds can be done using [BuildBuddy](https://www.buildbuddy.io). If you are familiar with Bazel, things should work as you expect, e.g., run `bazel test //... --keep-going` to run all the tests in the repo, but we have also added some new aliases in the `justfile` for convenience: - `just bazel-test` to run tests locally - `just bazel-remote-test` to run tests remotely (currently, the remote build is for x86_64 Linux regardless of your host platform). Note we are currently seeing the following test failures in the remote build, so we still need to figure out what is happening here: ``` failures: suite::compact::manual_compact_twice_preserves_latest_user_messages suite::compact_resume_fork::compact_resume_after_second_compaction_preserves_history suite::compact_resume_fork::compact_resume_and_fork_preserve_model_history_view ``` - `just build-for-release` to build release binaries for all platforms/architectures remotely To setup remote execution: - [Create a buildbuddy account](https://app.buildbuddy.io/) (OpenAI employees should also request org access at https://openai.buildbuddy.io/join/ with their `@openai.com` email address.) - [Copy your API key](https://app.buildbuddy.io/docs/setup/) to `~/.bazelrc` (add the line `build --remote_header=x-buildbuddy-api-key=YOUR_KEY`) - Use `--config=remote` in your `bazel` invocations (or add `common --config=remote` to your `~/.bazelrc`, or use the `just` commands) ## CI In terms of CI, this PR introduces `.github/workflows/bazel.yml`, which uses Bazel to run the tests _locally_ on Mac and Linux GitHub runners (we are working on supporting Windows, but that is not ready yet). Note that the failures we are seeing in `just bazel-remote-test` do not occur on these GitHub CI jobs, so everything in `.github/workflows/bazel.yml` is green right now. The `bazel.yml` uses extra config in `.github/workflows/ci.bazelrc` so that macOS CI jobs build _remotely_ on Linux hosts (using the `docker://docker.io/mbolin491/codex-bazel` Docker image declared in the root `BUILD.bazel`) using cross-compilation to build the macOS artifacts. Then these artifacts are downloaded locally to GitHub's macOS runner so the tests can be executed natively. This is the relevant config that enables this: ``` common:macos --config=remote common:macos --strategy=remote common:macos --strategy=TestRunner=darwin-sandbox,local ``` Because of the remote caching benefits we get from BuildBuddy, these new CI jobs can be extremely fast! For example, consider these two jobs that ran all the tests on Linux x86_64: - Bazel 1m37s https://github.com/openai/codex/actions/runs/20861063212/job/59940545209?pr=8875 - Cargo 9m20s https://github.com/openai/codex/actions/runs/20861063192/job/59940559592?pr=8875 For now, we will continue to run both the Bazel and Cargo jobs for PRs, but once we add support for Windows and running Clippy, we should be able to cutover to using Bazel exclusively for PRs, which should still speed things up considerably. We will probably continue to run the Cargo jobs post-merge for commits that land on `main` as a sanity check. Release builds will also continue to be done by Cargo for now. Earlier attempt at this PR: https://github.com/openai/codex/pull/8832 Earlier attempt to add support for Buck2, now abandoned: https://github.com/openai/codex/pull/8504 --------- Co-authored-by: David Zbarsky <dzbarsky@gmail.com> Co-authored-by: Michael Bolin <mbolin@openai.com>	2026-01-09 11:09:43 -08:00
jif-oai	1aed01e99f	renaming: task to turn (#8963 )	2026-01-09 17:31:17 +00:00
Ahmed Ibrahim	81caee3400	Add 5s timeout to models list call + integration test (#8942 ) - Enforce a 5s timeout around the remote models refresh to avoid hanging /models calls.	2026-01-08 18:06:10 -08:00
Ahmed Ibrahim	0d3e673019	remove `get_responses_requests` and `get_responses_request_bodies` to use in-place matcher (#8858 )	2026-01-08 13:57:48 -08:00
Michael Bolin	1e29774fce	fix: leverage codex_utils_cargo_bin() in codex-rs/core/tests/suite (#8887 ) This eliminates our dependency on the `escargot` crate and better prepares us for Bazel builds: https://github.com/openai/codex/pull/8875.	2026-01-08 14:56:16 +00:00
Michael Bolin	7520d8ba58	fix: leverage find_resource! macro in load_sse_fixture_with_id (#8888 ) This helps prepare us for Bazel builds: https://github.com/openai/codex/pull/8875.	2026-01-08 09:34:05 -05:00

1 2 3

122 Commits