Commit Graph

161 Commits

Author SHA1 Message Date
Charley Cunningham
07aefffb1f core: bundle settings diff updates into one dev/user envelope (#12417)
## Summary
- bundle contextual prompt injection into at most one developer message
plus one contextual user message in both:
  - per-turn settings updates
  - initial context insertion
- preserve `<model_switch>` across compaction by rebuilding it through
canonical initial-context injection, instead of relying on
strip/reattach hacks
- centralize contextual user fragment detection in one shared definition
table and reuse it for parsing/compaction logic
- keep `AGENTS.md` in its natural serialized format:
  - `# AGENTS.md instructions for {dirname}`
  - `<INSTRUCTIONS>...</INSTRUCTIONS>`
- simplify related tests/helpers and accept the expected snapshot/layout
updates from bundled multi-part messages

## Why
The goal is to converge toward a simpler, more intentional prompt shape
where contextual updates are consistently represented as one developer
envelope plus one contextual user envelope, while keeping parsing and
compaction behavior aligned with that representation.

## Notable details
- the temporary `SettingsUpdateEnvelope` wrapper was removed; these
paths now return `Vec<ResponseItem>` directly
- local/remote compaction no longer rely on model-switch strip/restore
helpers
- contextual user detection is now driven by shared fragment definitions
instead of ad hoc matcher assembly
- AGENTS/user instructions are still the same logical context; only the
synthetic `<user_instructions>` wrapper was replaced by the natural
AGENTS text format

## Testing
- `just fmt`
- `cargo test -p codex-app-server
codex_message_processor::tests::extract_conversation_summary_prefers_plain_user_messages
-- --exact`
- `cargo test -p codex-core
compact::tests::collect_user_messages_filters_session_prefix_entries
--lib -- --exact`
- `cargo test -p codex-core --test all
'suite::compact::snapshot_request_shape_pre_turn_compaction_strips_incoming_model_switch'
-- --exact`
- `cargo test -p codex-core --test all
'suite::compact_remote::snapshot_request_shape_remote_pre_turn_compaction_strips_incoming_model_switch'
-- --exact`
- `cargo test -p codex-core --test all
'suite::client::includes_apps_guidance_as_developer_message_when_enabled'
-- --exact`
- `cargo test -p codex-core --test all
'suite::client::includes_developer_instructions_message_in_request' --
--exact`
- `cargo test -p codex-core --test all
'suite::client::includes_user_instructions_message_in_request' --
--exact`
- `cargo test -p codex-core --test all
'suite::client::resume_includes_initial_messages_and_sends_prior_items'
-- --exact`
- `cargo test -p codex-core --test all
'suite::review::review_input_isolated_from_parent_history' -- --exact`
- `cargo test -p codex-exec --test all
'suite::resume::exec_resume_last_respects_cwd_filter_and_all_flag' --
--exact`
- `cargo test -p core_test_support
context_snapshot::tests::full_text_mode_preserves_unredacted_text --
--exact`

## Notes
- I also ran several targeted `compact`, `compact_remote`,
`prompt_caching`, `model_visible_layout`, and `event_mapping` tests
while iterating on prompt-shape changes.
- I have not claimed a clean full-workspace `cargo test` from this
environment because local sandbox/resource conditions have previously
produced unrelated failures in large workspace runs.
2026-02-26 00:12:08 -08:00
daveaitel-openai
dcab40123f Agent jobs (spawn_agents_on_csv) + progress UI (#10935)
## Summary
- Add agent job support: spawn a batch of sub-agents from CSV, auto-run,
auto-export, and store results in SQLite.
- Simplify workflow: remove run/resume/get-status/export tools; spawn is
deterministic and completes in one call.
- Improve exec UX: stable, single-line progress bar with ETA; suppress
sub-agent chatter in exec.

## Why
Enables map-reduce style workflows over arbitrarily large repos using
the existing Codex orchestrator. This addresses review feedback about
overly complex job controls and non-deterministic monitoring.

## Demo (progress bar)
```
./codex-rs/target/debug/codex exec \
  --enable collab \
  --enable sqlite \
  --full-auto \
  --progress-cursor \
  -c agents.max_threads=16 \
  -C /Users/daveaitel/code/codex \
  - <<'PROMPT'
Create /tmp/agent_job_progress_demo.csv with columns: path,area and 30 rows:
path = item-01..item-30, area = test.

Then call spawn_agents_on_csv with:
- csv_path: /tmp/agent_job_progress_demo.csv
- instruction: "Run `python - <<'PY'` to sleep a random 0.3–1.2s, then output JSON with keys: path, score (int). Set score = 1."
- output_csv_path: /tmp/agent_job_progress_demo_out.csv
PROMPT
```

## Review feedback addressed
- Auto-start jobs on spawn; removed run/resume/status/export tools.
- Auto-export on success.
- More descriptive tool spec + clearer prompts.
- Avoid deadlocks on spawn failure; pending/running handled safely.
- Progress bar no longer scrolls; stable single-line redraw.

## Tests
- `cd codex-rs && cargo test -p codex-exec`
- `cd codex-rs && cargo build -p codex-cli`
2026-02-24 21:00:19 +00:00
pakrym-oai
97d0068658 Send warmup request (#11258)
Send a request with `generate: falls` but a full set of tools and
instructions to pre-warm inference.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-02-24 08:15:47 -08:00
Michael Bolin
e8949f4507 test: vendor zsh fork via DotSlash and stabilize zsh-fork tests (#12518)
## Why

The zsh integration tests were still brittle in two ways:

- they relied on `CODEX_TEST_ZSH_PATH` / environment-specific setup, so
they often did not exercise the patched zsh fork that `shell-tool-mcp`
ships
- once the tests consistently used the vendored zsh fork, they exposed
real Linux-specific zsh-fork issues in CI

In particular, the Linux failures were not just test noise:

- the zsh-fork launch path was dropping `ExecRequest.arg0`, so Linux
`codex-linux-sandbox` arg0 dispatch did not run and zsh wrapper-mode
could receive malformed arguments
- the
`turn_start_shell_zsh_fork_subcommand_decline_marks_parent_declined_v2`
test uses the zsh exec bridge (which talks to the parent over a Unix
socket), but Linux restricted sandbox seccomp denies `connect(2)`,
causing timeouts on `ubuntu-24.04` x86/arm

This PR makes the zsh tests consistently run against the intended
vendored zsh fork and fixes/hardens the zsh-fork path so the Linux CI
signal is meaningful.

## What Changed

- Added a single shared test-only DotSlash file for the patched zsh fork
at `codex-rs/exec-server/tests/suite/zsh` (analogous to the existing
`bash` test resource).
- Updated both app-server and exec-server zsh tests to use that shared
DotSlash zsh (no duplicate zsh DotSlash file, no `CODEX_TEST_ZSH_PATH`
dependency).
- Updated the app-server zsh-fork test helper to resolve the shared
DotSlash zsh and avoid silently falling back to host zsh.
- Kept the app-server zsh-fork tests configured via `config.toml`, using
a test wrapper path where needed to force `zsh -df` (and rewrite `-lc`
to `-c`) for the subcommand-decline test.
- Hardened the app-server subcommand-decline zsh-fork test for CI
variability:
  - tolerate an extra `/responses` POST with a no-op mock response
- tolerate non-target approval ordering while remaining strict on the
two `/usr/bin/true` approvals and decline behavior
- use `DangerFullAccess` on Linux for this one test because it validates
zsh approval flow, not Linux sandbox socket restrictions
- Fixed zsh-fork process launching on Linux by preserving `req.arg0` in
`ZshExecBridge::execute_shell_request(...)` so `codex-linux-sandbox`
arg0 dispatch continues to work.
- Moved `maybe_run_zsh_exec_wrapper_mode()` under
`arg0_dispatch_or_else(...)` in `app-server` and `cli` so wrapper-mode
handling coexists correctly with arg0-dispatched helper modes.
- Consolidated duplicated `dotslash -- fetch` resolution logic into
shared test support (`core/tests/common/lib.rs`).
- Updated `codex-rs/exec-server/tests/suite/accept_elicitation.rs` to
use DotSlash zsh and hardened the zsh elicitation test for Bazel/zsh
differences by:
  - resolving an absolute `git` path
  - running `git init --quiet .`
- asserting success / `.git` creation instead of relying on banner text

## Verification

- `cargo test -p codex-app-server turn_start_zsh_fork -- --nocapture`
- `cargo test -p codex-exec-server accept_elicitation -- --nocapture`
- `bazel test //codex-rs/exec-server:exec-server-all-test
--test_output=streamed --test_arg=--nocapture
--test_arg=accept_elicitation_for_prompt_rule_with_zsh`
- CI (`rust-ci`) on the final cleaned commit: `Tests — ubuntu-24.04 -
x86_64-unknown-linux-gnu` and `Tests — ubuntu-24.04-arm -
aarch64-unknown-linux-gnu` passed in [run
22291424358](https://github.com/openai/codex/actions/runs/22291424358)
2026-02-22 19:39:56 -08:00
Michael Bolin
1af2a37ada chore: remove codex-core public protocol/shell re-exports (#12432)
## Why

`codex-rs/core/src/lib.rs` re-exported a broad set of types and modules
from `codex-protocol` and `codex-shell-command`. That made it easy for
workspace crates to import those APIs through `codex-core`, which in
turn hides dependency edges and makes it harder to reduce compile-time
coupling over time.

This change removes those public re-exports so call sites must import
from the source crates directly. Even when a crate still depends on
`codex-core` today, this makes dependency boundaries explicit and
unblocks future work to drop `codex-core` dependencies where possible.

## What Changed

- Removed public re-exports from `codex-rs/core/src/lib.rs` for:
- `codex_protocol::protocol` and related protocol/model types (including
`InitialHistory`)
  - `codex_protocol::config_types` (`protocol_config_types`)
- `codex_shell_command::{bash, is_dangerous_command, is_safe_command,
parse_command, powershell}`
- Migrated workspace Rust call sites to import directly from:
  - `codex_protocol::protocol`
  - `codex_protocol::config_types`
  - `codex_protocol::models`
  - `codex_shell_command`
- Added explicit `Cargo.toml` dependencies (`codex-protocol` /
`codex-shell-command`) in crates that now import those crates directly.
- Kept `codex-core` internal modules compiling by using `pub(crate)`
aliases in `core/src/lib.rs` (internal-only, not part of the public
API).
- Updated the two utility crates that can already drop a `codex-core`
dependency edge entirely:
  - `codex-utils-approval-presets`
  - `codex-utils-cli`

## Verification

- `cargo test -p codex-utils-approval-presets`
- `cargo test -p codex-utils-cli`
- `cargo check --workspace --all-targets`
- `just clippy`
2026-02-20 23:45:35 -08:00
Charley Cunningham
bb0ac5be70 Fix compaction context reinjection and model baselines (#12252)
## Summary
- move regular-turn context diff/full-context persistence into
`run_turn` so pre-turn compaction runs before incoming context updates
are recorded
- after successful pre-turn compaction, rely on a cleared
`reference_context_item` to trigger full context reinjection on the
follow-up regular turn (manual `/compact` keeps replacement history
summary-only and also clears the baseline)
- preserve `<model_switch>` when full context is reinjected, and inject
it *before* the rest of the full-context items
- scope `reference_context_item` and `previous_model` to regular user
turns only so standalone tasks (`/compact`, shell, review, undo) cannot
suppress future reinjection or `<model_switch>` behavior
- make context-diff persistence + `reference_context_item` updates
explicit in the regular-turn path, with clearer docs/comments around the
invariant
- stop persisting local `/compact` `RolloutItem::TurnContext` snapshots
(only regular turns persist `TurnContextItem` now)
- simplify resume/fork previous-model/reference-baseline hydration by
looking up the last surviving turn context from rollout lifecycle
events, including rollback and compaction-crossing handling
- remove the legacy fallback that guessed from bare `TurnContext`
rollouts without lifecycle events
- update compaction/remote-compaction/model-visible snapshots and
compact test assertions (including remote compaction mock response
shape)

## Why
We were persisting incoming context items before spawning the regular
turn task, which let pre-turn compaction requests accidentally include
incoming context diffs without the new user message. Fixing that exposed
follow-on baseline issues around `/compact`, resume/fork, and standalone
tasks that could cause duplicate context injection or suppress
`<model_switch>` instructions.

This PR re-centers the invariants around regular turns:
- regular turns persist model-visible context diffs/full reinjection and
update the `reference_context_item`
- standalone tasks do not advance those regular-turn baselines
- compaction clears the baseline when replacement history may have
stripped the referenced context diffs

## Follow-ups (TODOs left in code)
- `TODO(ccunningham)`: fix rollback/backtracking baseline handling more
comprehensively
- `TODO(ccunningham)`: include pending incoming context items in
pre-turn compaction threshold estimation
- `TODO(ccunningham)`: inject updated personality spec alongside
`<model_switch>` so some model-switch paths can avoid forced full
reinjection
- `TODO(ccunningham)`: review task turn lifecycle
(`TurnStarted`/`TurnComplete`) behavior and emit task-start context
diffs for task types that should have them (excluding `/compact`)

## Validation
- `just fmt`
- CI should cover the updated compaction/resume/model-visible snapshot
expectations and rollout-hydration behavior
- I did **not** rerun the full local test suite after the latest
resume-lookup / rollout-persistence simplifications
2026-02-20 23:13:08 -08:00
Charley Cunningham
eb68767f2f Unify remote compaction snapshot mocks around default endpoint behavior (#12050)
## Summary
- standardize remote compaction test mocking around one default behavior
in shared helpers
- make default remote compact mocks mirror production shape: keep
`message/user` + `message/developer`, drop assistant/tool artifacts,
then append a summary user message
- switch non-special `compact_remote` tests to the shared default mock
instead of ad-hoc JSON payloads

## Special-case tests that still use explicit mocks
- remote compaction error payload / HTTP failure behavior
- summary-only compact output behavior
- manual `/compact` with no prior user messages
- stale developer-instruction injection coverage

## Why
This removes inconsistent manual remote compaction fixtures and gives us
one source of truth for normal remote compact behavior, while preserving
explicit mocks only where tests intentionally cover non-default
behavior.
2026-02-17 18:18:47 -08:00
Anton Panasenko
1d95656149 bazel: fix snapshot parity for tests/*.rs rust_test targets (#11893)
## Summary
- make `rust_test` targets generated from `tests/*.rs` use Cargo-style
crate names (file stem) so snapshot names match Cargo (`all__...`
instead of Bazel-derived names)
- split lib vs `tests/*.rs` test env wiring in `codex_rust_crate` to
keep existing lib snapshot behavior while applying Bazel
runfiles-compatible workspace root for `tests/*.rs`
- compute the `tests/*.rs` snapshot workspace root from package depth so
`insta` resolves committed snapshots under Bazel `--noenable_runfiles`

## Validation
- `bazelisk test //codex-rs/core:core-all-test
--test_arg=suite::compact:: --cache_test_results=no`
- `bazelisk test //codex-rs/core:core-all-test
--test_arg=suite::compact_remote:: --cache_test_results=no`
2026-02-16 07:11:59 +00:00
Anton Panasenko
02abd9a8ea feat: persist and restore codex app's tools after search (#11780)
### What changed
1. Removed per-turn MCP selection reset in `core/src/tasks/mod.rs`.
2. Added `SessionState::set_mcp_tool_selection(Vec<String>)` in
`core/src/state/session.rs` for authoritative restore behavior (deduped,
order-preserving, empty clears).
3. Added rollout parsing in `core/src/codex.rs` to recover
`active_selected_tools` from prior `search_tool_bm25` outputs:
   - tracks matching `call_id`s
   - parses function output text JSON
   - extracts `active_selected_tools`
   - latest valid payload wins
   - malformed/non-matching payloads are ignored
4. Applied restore logic to resumed and forked startup paths in
`core/src/codex.rs`.
5. Updated instruction text to session/thread scope in
`core/templates/search_tool/tool_description.md`.
6. Expanded tests in `core/tests/suite/search_tool.rs`, plus unit
coverage in:
   - `core/src/codex.rs`
   - `core/src/state/session.rs`

### Behavior after change
1. Search activates matched tools.
2. Additional searches union into active selection.
3. Selection survives new turns in the same thread.
4. Resume/fork restores selection from rollout history.
5. Separate threads do not inherit selection unless forked.
2026-02-15 19:18:41 -08:00
Charley Cunningham
85034b189e core: snapshot tests for compaction requests, post-compaction layout, some additional compaction tests (#11487)
This PR keeps compaction context-layout test coverage separate from
runtime compaction behavior changes, so runtime logic review can stay
focused.

## Included
- Adds reusable context snapshot helpers in
`core/tests/common/context_snapshot.rs` for rendering model-visible
request/history shapes.
- Standardizes helper naming for readability:
  - `format_request_input_snapshot`
  - `format_response_items_snapshot`
  - `format_labeled_requests_snapshot`
  - `format_labeled_items_snapshot`
- Expands snapshot coverage for both local and remote compaction flows:
  - pre-turn auto-compaction
  - pre-turn failure/context-window-exceeded paths
  - mid-turn continuation compaction
  - manual `/compact` with and without prior user turns
- Captures both sides where relevant:
  - compaction request shape
  - post-compaction history layout shape
- Adds/uses shared request-inspection helpers so assertions target
structured request content instead of ad-hoc JSON string parsing.
- Aligns snapshots/assertions to current behavior and leaves explicit
`TODO(ccunningham)` notes where behavior is known and intentionally
deferred.

## Not Included
- No runtime compaction logic changes.
- No model-visible context/state behavior changes.
2026-02-14 19:57:10 -08:00
pakrym-oai
eac5473114 Do not attempt to append after response.completed (#11402)
Completed responses are fully done, and new response must be created.
2026-02-11 07:45:17 -08:00
Michael Bolin
476c1a7160 Remove test-support feature from codex-core and replace it with explicit test toggles (#11405)
## Why

`codex-core` was being built in multiple feature-resolved permutations
because test-only behavior was modeled as crate features. For a large
crate, those permutations increase compile cost and reduce cache reuse.

## Net Change

- Removed the `test-support` crate feature and related feature wiring so
`codex-core` no longer needs separate feature shapes for test consumers.
- Standardized cross-crate test-only access behind
`codex_core::test_support`.
- External test code now imports helpers from
`codex_core::test_support`.
- Underlying implementation hooks are kept internal (`pub(crate)`)
instead of broadly public.

## Outcome

- Fewer `codex-core` build permutations.
- Better incremental cache reuse across test targets.
- No intended production behavior change.
2026-02-10 22:44:02 -08:00
Michael Bolin
b68a84ee8e Remove deterministic_process_ids feature to avoid duplicate codex-core builds (#11393)
## Why

`codex-core` enabled `deterministic_process_ids` through a self
dev-dependency.
That forced a second feature-resolved build of the same crate, which
increased
compile time and test latency.

## What Changed

- Removed the `deterministic_process_ids` feature from
`codex-rs/core/Cargo.toml`.
- Removed the self dev-dependency on `codex-core` that enabled that
feature.
- Removed the Bazel `deterministic_process_ids` crate feature for
`codex-core`.
- Added a test-only `AtomicBool` override in unified exec process-id
allocation.
- Added a test-support setter for that override and re-exported it from
`codex-core`.
- Enabled deterministic process IDs in integration tests via
`core_test_support` ctor.

## Behavior

- Production behavior remains random process IDs.
- Unit tests remain deterministic via `cfg(test)`.
- Integration tests remain deterministic via explicit test-support
initialization.

## Validation

- `just fmt`
- `cargo test -p codex-core unified_exec::`
- `cargo test -p codex-core --test all unified_exec -- --test-threads=1`
- `cargo tree -p codex-core -e features` (verified the removed feature
path)
2026-02-10 19:07:01 -08:00
Anton Panasenko
a94505a92a feat: enable premessage-deflate for websockets (#10966)
note:
unfortunately, tokio-tungstenite / tungstenite upgrade triggers some
problems with linker of rama-tls-boring with openssl:
```
error: linking with `/Users/apanasenko/Library/Caches/cargo-zigbuild/0.20.1/zigcc-x86_64-unknown-linux-musl-ff6a.sh` failed: exit status: 1
  |
  = note:  "/Users/apanasenko/Library/Caches/cargo-zigbuild/0.20.1/zigcc-x86_64-unknown-linux-musl-ff6a.sh" "-m64" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/rcrt1.o" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/crti.o" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/crtbeginS.o" "<1 object files omitted>" "-Wl,--as-needed" "-Wl,-Bstatic" "/var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/{liblzma_sys-662a82316f96ec30,libbzip2_sys-bf78a2d58d5cbce6,liblibsqlite3_sys-6c004987fd67a36a,libtree_sitter_bash-220b99a97d331ab7,libtree_sitter-858f0a1dbfea58bd,libzstd_sys-6eb237deec748c5b,libring-2a87376483bf916f,libopenssl_sys-7c189e68b37fe2bb,liblibz_sys-4344eef4345520b1,librama_boring_sys-0414e98115015ee0}.rlib" "-lc++" "-lc++abi" "-lunwind" "-lc" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/libcompiler_builtins-*.rlib" "-L" "/var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/raw-dylibs" "-Wl,-Bdynamic" "-Wl,--eh-frame-hdr" "-Wl,-z,noexecstack" "-nostartfiles" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/libz-sys-ff5ea50d88c28ffb/out/lib" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/ring-bdec3dddc19f5a5e/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/openssl-sys-96e0870de3ca22bc/out/openssl-build/install/lib" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/zstd-sys-0cc37a5da1481740/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/tree-sitter-72d2418073317c0f/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/tree-sitter-bash-bfd293a9f333ce6a/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/libsqlite3-sys-b78b2cfb81a330fc/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/bzip2-sys-69a145cc859ef275/out/lib" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/lzma-sys-07e92d0b6baa6fd4/out" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build/crypto/" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build/ssl/" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build/" "-L" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build" "-L" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained" "-L" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib" "-o" "/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/deps/codex_network_proxy-d08268b863517761" "-Wl,--gc-sections" "-static-pie" "-Wl,-z,relro,-z,now" "-Wl,-O1" "-Wl,--strip-all" "-nodefaultlibs" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/crtendS.o" "<sysroot>/lib/rustlib/x86_64-unknown-linux-musl/lib/self-contained/crtn.o"
  = note: some arguments are omitted. use `--verbose` to show all linker arguments
  = note: warning: ignoring deprecated linker optimization setting '1'
          warning: unable to open library directory '/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/build/crypto/': FileNotFound
          ld.lld: error: duplicate symbol: SSL_export_keying_material
          >>> defined at ssl_lib.c:3816 (ssl/ssl_lib.c:3816)
          >>>            libssl-lib-ssl_lib.o:(SSL_export_keying_material) in archive /var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/libopenssl_sys-7c189e68b37fe2bb.rlib
          >>> defined at t1_enc.cc:205 (/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/boringssl/ssl/t1_enc.cc:205)
          >>>            t1_enc.cc.o:(.text.SSL_export_keying_material+0x0) in archive /var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/librama_boring_sys-0414e98115015ee0.rlib

          ld.lld: error: duplicate symbol: d2i_ASN1_TIME
          >>> defined at a_time.c:27 (crypto/asn1/a_time.c:27)
          >>>            libcrypto-lib-a_time.o:(d2i_ASN1_TIME) in archive /var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/libopenssl_sys-7c189e68b37fe2bb.rlib
          >>> defined at a_time.cc:34 (/Users/apanasenko/code/codex/codex-rs/target/x86_64-unknown-linux-musl/release/build/rama-boring-sys-0bc2dfbf669addc4/out/boringssl/crypto/asn1/a_time.cc:34)
          >>>            a_time.cc.o:(.text.d2i_ASN1_TIME+0x0) in archive /var/folders/kt/52y_g75x3ng8ktvk3rfwm6400000gp/T/rustcyGQdYm/librama_boring_sys-0414e98115015ee0.rlib
``` 

that force me to migrate away from rama-tls-boring to rama-tls-rustls
and pin `ring` for rustls.
2026-02-07 17:59:34 -08:00
Josh McKinney
e416e578bb core: preconnect Responses websocket for first turn (#10698)
## Problem
The first user turn can pay websocket handshake latency even when a
session has already started. We want to reduce that initial delay while
preserving turn semantics and avoiding any prompt send during startup.

Reviewer feedback also called out duplicated connect/setup paths and
unnecessary preconnect state complexity.

## Mental model
`ModelClient` owns session-scoped transport state. During session
startup, it can opportunistically warm one websocket handshake slot. A
turn-scoped `ModelClientSession` adopts that slot once if available,
restores captured sticky turn-state, and otherwise opens a websocket
through the same shared connect path.

If startup preconnect is still in flight, first turn setup awaits that
task and treats it as the first connection attempt for the turn.

Preconnect is handshake-only. The first `response.create` is still sent
only when a turn starts.

## Non-goals
This change does not make preconnect required for correctness and does
not change prompt/turn payload semantics. It also does not expand
fallback behavior beyond clearing preconnect state when fallback
activates.

## Tradeoffs
The implementation prioritizes simpler ownership and shared connection
code over header-match gating for reuse. The single-slot cache keeps
lifecycle straightforward but only benefits the immediate next turn.

Awaiting in-flight preconnect has the same app-level connect-timeout
semantics as existing websocket connect behavior (no new timeout class
introduced by this PR).

## Architecture
`core/src/client.rs`:
- Added session-level preconnect lifecycle state (`Idle` / `InFlight` /
`Ready`) carrying one warmed websocket plus optional captured
turn-state.
- Added `pre_establish_connection()` startup warmup and `preconnect()`
handshake-only setup.
- Deduped auth/provider resolution into `current_client_setup()` and
websocket handshake wiring into `connect_websocket()` /
`build_websocket_headers()`.
- Updated turn websocket path to adopt preconnect first, await in-flight
preconnect when present, then create a new websocket only when needed.
- Ensured fallback activation clears warmed preconnect state.
- Added documentation for lifecycle, ownership, sticky-routing
invariants, and timeout semantics.

`core/src/codex.rs`:
- Session startup invokes `model_client.pre_establish_connection(...)`.
- Turn metadata resolution uses the shared timeout helper.

`core/src/turn_metadata.rs`:
- Centralized shared timeout helper used by both turn-time metadata
resolution and startup preconnect metadata building.

`core/tests/common/responses.rs` + websocket test suites:
- Added deterministic handshake waiting helper (`wait_for_handshakes`)
with bounded polling.
- Added startup preconnect and in-flight preconnect reuse coverage.
- Fallback expectations now assert exactly two websocket attempts in
covered scenarios (startup preconnect + turn attempt before fallback
sticks).

## Observability
Preconnect remains best-effort and non-fatal. Existing
websocket/fallback telemetry remains in place, and debug logs now make
preconnect-await behavior and preconnect failures easier to reason
about.

## Tests
Validated with:
1. `just fmt`
2. `cargo test -p codex-core websocket_preconnect -- --nocapture`
3. `cargo test -p codex-core websocket_fallback -- --nocapture`
4. `cargo test -p codex-core
websocket_first_turn_waits_for_inflight_preconnect -- --nocapture`
2026-02-06 19:08:24 +00:00
Ahmed Ibrahim
f9c38f531c add none personality option (#10688)
- add none personality enum value and empty placeholder behavior\n- add
docs/schema updates and e2e coverage
2026-02-04 15:40:33 -08:00
pakrym-oai
0efd33f7f4 Update tests to stop using sse_completed fixture (#10638)
Summary:
- replace the `sse_completed` fixture and related JSON template with
direct `responses::ev_completed` payload builders
- cascade the new SSE helpers through all affected core tests for
consistency and clarity
- remove legacy fixtures that were no longer needed once the helpers are
in place

Testing:
- Not run (not requested)
2026-02-04 08:38:06 -08:00
Michael Bolin
66447d5d2c feat: replace custom mcp-types crate with equivalents from rmcp (#10349)
We started working with MCP in Codex before
https://crates.io/crates/rmcp was mature, so we had our own crate for
MCP types that was generated from the MCP schema:


8b95d3e082/codex-rs/mcp-types/README.md

Now that `rmcp` is more mature, it makes more sense to use their MCP
types in Rust, as they handle details (like the `_meta` field) that our
custom version ignored. Though one advantage that our custom types had
is that our generated types implemented `JsonSchema` and `ts_rs::TS`,
whereas the types in `rmcp` do not. As such, part of the work of this PR
is leveraging the adapters between `rmcp` types and the serializable
types that are API for us (app server and MCP) introduced in #10356.

Note this PR results in a number of changes to
`codex-rs/app-server-protocol/schema`, which merit special attention
during review. We must ensure that these changes are still
backwards-compatible, which is possible because we have:

```diff
- export type CallToolResult = { content: Array<ContentBlock>, isError?: boolean, structuredContent?: JsonValue, };
+ export type CallToolResult = { content: Array<JsonValue>, structuredContent?: JsonValue, isError?: boolean, _meta?: JsonValue, };
```

so `ContentBlock` has been replaced with the more general `JsonValue`.
Note that `ContentBlock` was defined as:

```typescript
export type ContentBlock = TextContent | ImageContent | AudioContent | ResourceLink | EmbeddedResource;
```

so the deletion of those individual variants should not be a cause of
great concern.

Similarly, we have the following change in
`codex-rs/app-server-protocol/schema/typescript/Tool.ts`:

```
- export type Tool = { annotations?: ToolAnnotations, description?: string, inputSchema: ToolInputSchema, name: string, outputSchema?: ToolOutputSchema, title?: string, };
+ export type Tool = { name: string, title?: string, description?: string, inputSchema: JsonValue, outputSchema?: JsonValue, annotations?: JsonValue, icons?: Array<JsonValue>, _meta?: JsonValue, };
```

so:

- `annotations?: ToolAnnotations` ➡️ `JsonValue`
- `inputSchema: ToolInputSchema` ➡️ `JsonValue`
- `outputSchema?: ToolOutputSchema` ➡️ `JsonValue`

and two new fields: `icons?: Array<JsonValue>, _meta?: JsonValue`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10349).
* #10357
* __->__ #10349
* #10356
2026-02-02 17:41:55 -08:00
pakrym-oai
fbb3a30953 Remove WebSocket wire format (#10179)
I'd like WireApi to go away (when chat is removed) and WebSockets is
still responses API just over a different transport.
2026-01-29 13:50:53 -08:00
sayan-oai
ff9fa56368 default enable compression, update test helpers (#10102)
set `enable_request_compression` flag to default-enabled.

update integration test helpers to decompress `zstd` if flag set.
2026-01-28 12:25:40 -08:00
sayan-oai
86adf53235 fix: handle all web_search actions and in progress invocations (#9960)
### Summary
- Parse all `web_search` tool actions (`search`, `find_in_page`,
`open_page`).
- Previously we only parsed + displayed `search`, which made the TUI
appear to pause when the other actions were being used.
- Show in progress `web_search` calls as `Searching the web`
  - Previously we only showed completed tool calls

<img width="308" height="149" alt="image"
src="https://github.com/user-attachments/assets/90a4e8ff-b06a-48ff-a282-b57b31121845"
/>

### Tests
Added + updated tests, tested locally

### Follow ups
Update VSCode extension to display these as well
2026-01-27 03:33:48 +00:00
pakrym-oai
998e88b12a Use test_codex more (#9961)
Reduces boilderplate.
2026-01-26 18:52:10 -08:00
Anton Panasenko
e117a3ff33 feat: support proxy for ws connection (#9719)
reapply websocket changes without changing tls lib.
2026-01-22 15:23:15 -08:00
Dylan Hurd
8b3521ee77 feat(core) update Personality on turn (#9644)
## Summary
Support updating Personality mid-Thread via UserTurn/OverwriteTurn. This
is explicitly unused by the clients so far, to simplify PRs - app-server
and tui implementations will be follow-ups.

## Testing
- [x] added integration tests
2026-01-22 12:04:23 -08:00
pakrym-oai
4d48d4e0c2 Revert "feat: support proxy for ws connection" (#9693)
Reverts openai/codex#9409
2026-01-22 15:57:18 +00:00
Anton Panasenko
7b27aa7707 feat: support proxy for ws connection (#9409)
unfortunately tokio-tungstenite doesn't support proxy configuration
outbox, while https://github.com/snapview/tokio-tungstenite/pull/370 is
in review, we can depend on source code for now.
2026-01-20 09:36:30 -08:00
Ahmed Ibrahim
146d54cede Add collaboration_mode override to turns (#9408) 2026-01-16 21:51:25 -08:00
Ahmed Ibrahim
ebdd8795e9 Turn-state sticky routing per turn (#9332)
- capture the header from SSE/WS handshakes, store it per
ModelClientSession using `Oncelock`, echo it on turn-scoped requests,
and add SSE+WS integration tests for within-turn persistence +
cross-turn reset.

- keep `x-codex-turn-state` sticky within a user turn to maintain
routing continuity for retries/tool follow-ups.
2026-01-16 09:30:11 -08:00
charley-oai
4a9c2bcc5a Add text element metadata to types (#9235)
Initial type tweaking PR to make the diff of
https://github.com/openai/codex/pull/9116 smaller

This should not change any behavior, just adds some fields to types
2026-01-14 16:41:50 -08:00
pakrym-oai
9f8d3c14ce Fix flakiness in WebSocket tests (#9169)
The connection was being added to the list after the WebSocket response
was sent.

So the test can sometimes race and observe connections before the list
was updated.

After this change, connection and request is added to the list before
the response is sent.
2026-01-13 15:09:59 -08:00
pakrym-oai
2d56519ecd Support response.done and add integration tests (#9129)
The agent loop using a persistent incremental web socket connection.
2026-01-13 16:12:30 +00:00
Ahmed Ibrahim
cbca43d57a Send message by default mid turn. queue messages by tab (#9077)
https://github.com/user-attachments/assets/03838730-4ddc-44df-a2c7-cb8ecda78660
2026-01-12 23:06:35 -08:00
pakrym-oai
490c1c1fdd Add model client sessions (#9102)
Maintain a long-running session.
2026-01-13 01:15:56 +00:00
zbarsky-openai
2a06d64bc9 feat: add support for building with Bazel (#8875)
This PR configures Codex CLI so it can be built with
[Bazel](https://bazel.build) in addition to Cargo. The `.bazelrc`
includes configuration so that remote builds can be done using
[BuildBuddy](https://www.buildbuddy.io).

If you are familiar with Bazel, things should work as you expect, e.g.,
run `bazel test //... --keep-going` to run all the tests in the repo,
but we have also added some new aliases in the `justfile` for
convenience:

- `just bazel-test` to run tests locally
- `just bazel-remote-test` to run tests remotely (currently, the remote
build is for x86_64 Linux regardless of your host platform). Note we are
currently seeing the following test failures in the remote build, so we
still need to figure out what is happening here:

```
failures:
    suite::compact::manual_compact_twice_preserves_latest_user_messages
    suite::compact_resume_fork::compact_resume_after_second_compaction_preserves_history
    suite::compact_resume_fork::compact_resume_and_fork_preserve_model_history_view
```

- `just build-for-release` to build release binaries for all
platforms/architectures remotely

To setup remote execution:
- [Create a buildbuddy account](https://app.buildbuddy.io/) (OpenAI
employees should also request org access at
https://openai.buildbuddy.io/join/ with their `@openai.com` email
address.)
- [Copy your API key](https://app.buildbuddy.io/docs/setup/) to
`~/.bazelrc` (add the line `build
--remote_header=x-buildbuddy-api-key=YOUR_KEY`)
- Use `--config=remote` in your `bazel` invocations (or add `common
--config=remote` to your `~/.bazelrc`, or use the `just` commands)

## CI

In terms of CI, this PR introduces `.github/workflows/bazel.yml`, which
uses Bazel to run the tests _locally_ on Mac and Linux GitHub runners
(we are working on supporting Windows, but that is not ready yet). Note
that the failures we are seeing in `just bazel-remote-test` do not occur
on these GitHub CI jobs, so everything in `.github/workflows/bazel.yml`
is green right now.

The `bazel.yml` uses extra config in `.github/workflows/ci.bazelrc` so
that macOS CI jobs build _remotely_ on Linux hosts (using the
`docker://docker.io/mbolin491/codex-bazel` Docker image declared in the
root `BUILD.bazel`) using cross-compilation to build the macOS
artifacts. Then these artifacts are downloaded locally to GitHub's macOS
runner so the tests can be executed natively. This is the relevant
config that enables this:

```
common:macos --config=remote
common:macos --strategy=remote
common:macos --strategy=TestRunner=darwin-sandbox,local
```

Because of the remote caching benefits we get from BuildBuddy, these new
CI jobs can be extremely fast! For example, consider these two jobs that
ran all the tests on Linux x86_64:

- Bazel 1m37s
https://github.com/openai/codex/actions/runs/20861063212/job/59940545209?pr=8875
- Cargo 9m20s
https://github.com/openai/codex/actions/runs/20861063192/job/59940559592?pr=8875

For now, we will continue to run both the Bazel and Cargo jobs for PRs,
but once we add support for Windows and running Clippy, we should be
able to cutover to using Bazel exclusively for PRs, which should still
speed things up considerably. We will probably continue to run the Cargo
jobs post-merge for commits that land on `main` as a sanity check.

Release builds will also continue to be done by Cargo for now.

Earlier attempt at this PR: https://github.com/openai/codex/pull/8832
Earlier attempt to add support for Buck2, now abandoned:
https://github.com/openai/codex/pull/8504

---------

Co-authored-by: David Zbarsky <dzbarsky@gmail.com>
Co-authored-by: Michael Bolin <mbolin@openai.com>
2026-01-09 11:09:43 -08:00
jif-oai
1aed01e99f renaming: task to turn (#8963) 2026-01-09 17:31:17 +00:00
Ahmed Ibrahim
81caee3400 Add 5s timeout to models list call + integration test (#8942)
- Enforce a 5s timeout around the remote models refresh to avoid hanging
/models calls.
2026-01-08 18:06:10 -08:00
Ahmed Ibrahim
0d3e673019 remove get_responses_requests and get_responses_request_bodies to use in-place matcher (#8858) 2026-01-08 13:57:48 -08:00
Michael Bolin
1e29774fce fix: leverage codex_utils_cargo_bin() in codex-rs/core/tests/suite (#8887)
This eliminates our dependency on the `escargot` crate and better
prepares us for Bazel builds: https://github.com/openai/codex/pull/8875.
2026-01-08 14:56:16 +00:00
Michael Bolin
7520d8ba58 fix: leverage find_resource! macro in load_sse_fixture_with_id (#8888)
This helps prepare us for Bazel builds:
https://github.com/openai/codex/pull/8875.
2026-01-08 09:34:05 -05:00
jif-oai
116059c3a0 chore: unify conversation with thread name (#8830)
Done and verified by Codex + refactor feature of RustRover
2026-01-07 17:04:53 +00:00
jif-oai
1dd1355df3 feat: agent controller (#8783)
Added an agent control plane that lets sessions spawn or message other
conversations via `AgentControl`.

`AgentBus` (core/src/agent/bus.rs) keeps track of the last known status
of a conversation.

ConversationManager now holds shared state behind an Arc so AgentControl
keeps only a weak back-reference, the goal is just to avoid explicit
cycle reference.

Follow-ups:
* Build a small tool in the TUI to be able to see every agent and send
manual message to each of them
* Handle approval requests in this TUI
* Add tools to spawn/communicate between agents (see related design)
* Define agent types
2026-01-06 19:08:02 +00:00
Ahmed Ibrahim
66b7c673e9 Refresh on models etag mismatch (#8491)
- Send models etag
- Refresh models on 412
- This wires `ModelsManager` to `ModelFamily` so we don't mutate it
mid-turn
2026-01-01 11:41:16 -08:00
Michael Bolin
e61bae12e3 feat: introduce codex-utils-cargo-bin as an alternative to assert_cmd::Command (#8496)
This PR introduces a `codex-utils-cargo-bin` utility crate that
wraps/replaces our use of `assert_cmd::Command` and
`escargot::CargoBuild`.

As you can infer from the introduction of `buck_project_root()` in this
PR, I am attempting to make it possible to build Codex under
[Buck2](https://buck2.build) as well as `cargo`. With Buck2, I hope to
achieve faster incremental local builds (largely due to Buck2's
[dice](https://buck2.build/docs/insights_and_knowledge/modern_dice/)
build strategy, as well as benefits from its local build daemon) as well
as faster CI builds if we invest in remote execution and caching.

See
https://buck2.build/docs/getting_started/what_is_buck2/#why-use-buck2-key-advantages
for more details about the performance advantages of Buck2.

Buck2 enforces stronger requirements in terms of build and test
isolation. It discourages assumptions about absolute paths (which is key
to enabling remote execution). Because the `CARGO_BIN_EXE_*` environment
variables that Cargo provides are absolute paths (which
`assert_cmd::Command` reads), this is a problem for Buck2, which is why
we need this `codex-utils-cargo-bin` utility.

My WIP-Buck2 setup sets the `CARGO_BIN_EXE_*` environment variables
passed to a `rust_test()` build rule as relative paths.
`codex-utils-cargo-bin` will resolve these values to absolute paths,
when necessary.


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/8496).
* #8498
* __->__ #8496
2025-12-23 19:29:32 -08:00
Michael Bolin
3d4ced3ff5 chore: migrate from Config::load_from_base_config_with_overrides to ConfigBuilder (#8276)
https://github.com/openai/codex/pull/8235 introduced `ConfigBuilder` and
this PR updates all call non-test call sites to use it instead of
`Config::load_from_base_config_with_overrides()`.

This is important because `load_from_base_config_with_overrides()` uses
an empty `ConfigRequirements`, which is a reasonable default for testing
so the tests are not influenced by the settings on the host. This method
is now guarded by `#[cfg(test)]` so it cannot be used by business logic.

Because `ConfigBuilder::build()` is `async`, many of the test methods
had to be migrated to be `async`, as well. On the bright side, this made
it possible to eliminate a bunch of `block_on_future()` stuff.
2025-12-18 16:12:52 -08:00
jif-oai
ae57e18947 feat: close unified_exec at end of turn (#8052) 2025-12-16 12:16:43 +00:00
Ahmed Ibrahim
d802b18716 fix parallel tool calls (#7956) 2025-12-16 01:28:27 +00:00
xl-openai
5d77d4db6b Reimplement skills loading using SkillsManager + skills/list op. (#7914)
refactor the way we load and manage skills:
1. Move skill discovery/caching into SkillsManager and reuse it across
sessions.
2. Add the skills/list API (Op::ListSkills/SkillsListResponse) to fetch
skills for one or more cwds. Also update app-server for VSCE/App;
3. Trigger skills/list during session startup so UIs preload skills and
handle errors immediately.
2025-12-14 09:58:17 -08:00
Michael Bolin
b1905d3754 fix: added test helpers for platform-specific paths (#7954)
This addresses post-merge feedback from
https://github.com/openai/codex/pull/7856.
2025-12-13 00:14:12 +00:00
jif-oai
29381ba5c2 feat: add shell snapshot for shell command (#7786) 2025-12-11 13:46:43 +00:00
xl-openai
b36ecb6c32 Inject SKILL.md when it's explicitly mentioned. (#7763)
1. Skills load once in core at session start; the cached outcome is
reused across core and surfaced to TUI via SessionConfigured.
2. TUI detects explicit skill selections, and core injects the matching
SKILL.md content into the turn when a selected skill is present.
2025-12-10 13:59:17 -08:00