codex

mirror of https://github.com/openai/codex.git synced 2026-04-28 00:25:56 +00:00

Author	SHA1	Message	Date
Ahmed Ibrahim	5e01450963	Strip unsupported images from prompt history to guard against model switch (#11349 ) - Make `ContextManager::for_prompt` modality-aware and strip input_image content when the active model is text-only. - Added a test for multi-model -> text-only model switch	2026-02-10 11:58:00 -08:00
Ahmed Ibrahim	9c4656000f	Sanitize MCP image output for text-only models (#11346 ) - Replace image blocks in MCP tool results with a text placeholder when the active model does not accept image input. - Add an e2e rmcp test to verify sanitized tool output is what gets sent back to the model.	2026-02-10 11:25:32 -08:00
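The placeholder substitution described in #11346 can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `ContentBlock`, `sanitize_for_model`, and the placeholder wording are all hypothetical names chosen for the example.

```rust
/// Hypothetical, simplified stand-in for a tool-result content block.
#[derive(Debug, Clone, PartialEq)]
enum ContentBlock {
    Text(String),
    Image { url: String },
}

/// Replace image blocks with a text placeholder when the active model
/// does not accept image input; otherwise return the blocks unchanged.
fn sanitize_for_model(blocks: Vec<ContentBlock>, supports_images: bool) -> Vec<ContentBlock> {
    if supports_images {
        return blocks;
    }
    blocks
        .into_iter()
        .map(|block| match block {
            // Placeholder text is an assumption made for this sketch.
            ContentBlock::Image { .. } => {
                ContentBlock::Text("<image omitted: model does not support image input>".to_string())
            }
            other => other,
        })
        .collect()
}
```

Text blocks pass through untouched, so the number and order of blocks in the tool result are preserved.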
Ahmed Ibrahim	6e96e4837e	Always expose view_image and return unsupported image-input error (#11336 ) - Keep `view_image` in the advertised tool list for all models. - Return a clear error when the current model does not support image inputs, and cover it with a unit test.	2026-02-10 11:25:12 -08:00
jif-oai	847a6092e6	fix: reduce usage of `open_if_present` (#11344 )	2026-02-10 19:25:07 +00:00
pakrym-oai	0639c33892	Compare full request for websockets incrementality (#11343 ) Tools can dynamically change mid-turn now. We need to be more thorough about reusing incremental connections.	2026-02-10 19:14:36 +00:00
Dylan Hurd	f3bbcc987d	test(core): stabilize ARM bazel remote-model and parallelism tests (#11330 ) ## Summary - keep wiremock MockServer handles alive through async assertions in remote model suite tests - assert /models request count in remote_models_hide_picker_only_models - use a slightly higher parallel timing threshold on aarch64 while keeping existing x86 threshold ## Validation - just fmt - targeted tests: - cargo test -p codex-core --test all suite::remote_models::remote_models_merge_replaces_overlapping_model -- --exact - cargo test -p codex-core --test all suite::remote_models::remote_models_hide_picker_only_models -- --exact - cargo test -p codex-core --test all suite::tool_parallelism::shell_tools_run_in_parallel -- --exact - soak loop: 40 iterations of all three targeted tests ## Notes - cargo test -p codex-core has one unrelated local-env failure in shell_snapshot::tests::try_new_creates_and_deletes_snapshot_file from exported certificate env content in this workspace. - local bazel test //codex-rs/core:core-all-test failed to build due to missing rust-objcopy in this host toolchain.	2026-02-10 10:57:50 -08:00
Shijie Rao	c4b771a16f	Fix: update parallel tool call exec approval to approve on request id (#11162 ) ### Summary In parallel tool call, exec command approvals were not approved at request level but at a turn level. i.e. when a single request is approved, the system currently treats all requests in turn as approved. ### Before https://github.com/user-attachments/assets/d50ed129-b3d2-4b2f-97fa-8601eb11f6a8 ### After https://github.com/user-attachments/assets/36528a43-a4aa-4775-9e12-f13287ef19fc	2026-02-10 09:38:00 -08:00
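The difference between turn-level and request-level approval in #11162 can be illustrated with a small sketch. The `ExecApprovals` type and its methods are hypothetical and only show the idea of keying approvals by request id.

```rust
use std::collections::HashSet;

/// Hypothetical tracker: approvals are recorded per request id,
/// so approving one request does not approve its siblings in the same turn.
#[derive(Default)]
struct ExecApprovals {
    approved: HashSet<String>,
}

impl ExecApprovals {
    /// Record approval for a single request.
    fn approve(&mut self, request_id: &str) {
        self.approved.insert(request_id.to_string());
    }

    /// A request may run only if its own id was approved.
    fn is_approved(&self, request_id: &str) -> bool {
        self.approved.contains(request_id)
    }
}
```

With a turn-level flag, approving `req-1` would also have let `req-2` run; with the per-id set it does not.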
Max Johnson	47356ff83c	Revert "Add app-server transport layer with websocket support (#10693 )" (#11323 ) Suspected cause of deadlocking bug	2026-02-10 17:37:49 +00:00
Fouad Matin	693bac1851	fix(protocol): approval policy never prompt (#11288 ) This removes overly directed language about how the model should behave when it's in `approval_policy=never` mode. --------- Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>	2026-02-10 09:27:46 -08:00
jif-oai	59c625458b	Fix pending input test waiting logic (#11322 ) ## Summary - remove redundant user message wait that could time out and cause flakiness - rely on the existing turn-complete wait to ensure the follow-up request is observed ## Testing - Not run (not requested)	2026-02-10 15:40:53 +00:00
viyatb-oai	3391e5ea86	feat(sandbox): enforce proxy-aware network routing in sandbox (#11113 ) ## Summary - expand proxy env injection to cover common tool env vars (`HTTP_PROXY`/`HTTPS_PROXY`/`ALL_PROXY`/`NO_PROXY` families + tool-specific variants) - harden macOS Seatbelt network policy generation to route through inferred loopback proxy endpoints and fail closed when proxy env is malformed - thread proxy-aware Linux sandbox flags and add minimal bwrap netns isolation hook for restricted non-proxy runs - add/refresh tests for proxy env wiring, Seatbelt policy generation, and Linux sandbox argument wiring	2026-02-10 07:44:21 +00:00
Dylan Hurd	168c359b71	Adjust shell command timeouts for Windows (#11247 ) Summary - add platform-aware defaults for shell command timeouts so Windows tests get longer waits - keep medium timeout longer on Windows to ensure flakiness is reduced Testing - Not run (not requested)	2026-02-09 20:03:32 -08:00
Ahmed Ibrahim	d1df3bd63b	Revert "Revert "Update models.json"" (#11256 ) Reverts openai/codex#11255	2026-02-09 19:22:41 -08:00
Ahmed Ibrahim	a1abd53b6a	Remove offline fallback for models (#11238 ) # External (non-OpenAI) Pull Request Requirements Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed: https://github.com/openai/codex/blob/main/docs/contributing.md If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes. Include a link to a bug report or enhancement request.	2026-02-09 16:58:54 -08:00
Ahmed Ibrahim	481145e959	Use longest remote model prefix matching (#11228 ) Match model metadata by longest matching remote slug prefix before local fallback. - Update `get_model_info` to prefer the most specific remote slug prefix for the requested model. - Add an integration test to assert `gpt-5.3-codex-test` resolves to `gpt-5.3-codex` over `gpt-5.3`.	2026-02-09 15:05:56 -08:00
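Longest-prefix matching as described in #11228 can be sketched as below. `longest_prefix_match` is a hypothetical helper, not the body of `get_model_info`, and the local fallback step is left out.

```rust
/// Hypothetical helper: return the longest slug in `remote_slugs`
/// that is a prefix of the requested model name, if any.
fn longest_prefix_match<'a>(requested: &str, remote_slugs: &[&'a str]) -> Option<&'a str> {
    remote_slugs
        .iter()
        .copied()
        // Keep only slugs that the requested name starts with.
        .filter(|slug| requested.starts_with(*slug))
        // Among the candidates, the longest slug is the most specific.
        .max_by_key(|slug| slug.len())
}
```

For the case named in the commit message, `longest_prefix_match("gpt-5.3-codex-test", &["gpt-5.3", "gpt-5.3-codex"])` picks `gpt-5.3-codex`, because both slugs are prefixes and it is the longer one.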
Matthew Zeng	d90df4761b	[apps] Add gated instructions for Apps. (#10924 ) - [x] Add gated instructions for Apps.	2026-02-09 14:48:09 -08:00
Anton Panasenko	becc3a0424	feat: search_tool (#10657 ) Why We Did This - The goal is to reduce MCP tool context pollution by not exposing the full MCP tool list up front - It forces an explicit discovery step (`search_tool_bm25`) so the model narrows tool scope before making MCP calls, which helps relevance and lowers prompt/tool clutter. What It Changed - Added a new experimental feature flag `search_tool` in `core/src/features.rs:90` and `core/src/features.rs:430`. - Added config/schema support for that flag in `core/config.schema.json:214` and `core/config.schema.json:1235`. - Added BM25 dependency (`bm25`) in `Cargo.toml:129` and `core/Cargo.toml:23`. - Added new tool handler `search_tool_bm25` in `core/src/tools/handlers/search_tool_bm25.rs:18`. - Registered the handler and tool spec in `core/src/tools/handlers/mod.rs:11` and `core/src/tools/spec.rs:780` and `core/src/tools/spec.rs:1344`. - Extended `ToolsConfig` to carry `search_tool` enablement in `core/src/tools/spec.rs:32` and `core/src/tools/spec.rs:56`. - Injected dedicated developer instructions for tool-discovery workflow in `core/src/codex.rs:483` and `core/src/codex.rs:1976`, using `core/templates/search_tool/developer_instructions.md:1`. - Added session state to store one-shot selected MCP tools in `core/src/state/session.rs:27` and `core/src/state/session.rs:131`. - Added filtering so when feature is enabled, only selected MCP tools are exposed on the next request (then consumed) in `core/src/codex.rs:3800` and `core/src/codex.rs:3843`. - Added E2E suite coverage for enablement/instructions/hide-until-search/one-turn-selection in `core/tests/suite/search_tool.rs:72`, `core/tests/suite/search_tool.rs:109`, `core/tests/suite/search_tool.rs:147`, and `core/tests/suite/search_tool.rs:218`. - Refactored test helper utilities to support config-driven tool collection in `core/tests/suite/tools.rs:281`. Net Behavioral Effect - With `search_tool` off: existing MCP behavior (tools exposed normally). 
- With `search_tool` on: MCP tools start hidden, model must call `search_tool_bm25`, and only returned `selected_tools` are available for the next model call.	2026-02-09 12:53:50 -08:00
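The one-shot selection behavior described for `search_tool` can be sketched as follows. `SessionState`, `set_selected`, and `tools_for_next_request` are hypothetical names for this illustration; the real session state and filtering code referenced above may differ.

```rust
/// Hypothetical session state holding the tools picked by the last search.
#[derive(Default)]
struct SessionState {
    selected_mcp_tools: Option<Vec<String>>,
}

impl SessionState {
    /// Store the tools returned by a search call.
    fn set_selected(&mut self, tools: Vec<String>) {
        self.selected_mcp_tools = Some(tools);
    }

    /// Tools to expose on the next request. With the feature on, only the
    /// selected tools are exposed, and the selection is consumed by this call.
    fn tools_for_next_request(&mut self, all_mcp_tools: &[String], search_tool_enabled: bool) -> Vec<String> {
        if !search_tool_enabled {
            return all_mcp_tools.to_vec();
        }
        let selected = self.selected_mcp_tools.take().unwrap_or_default();
        all_mcp_tools
            .iter()
            .filter(|tool| selected.contains(*tool))
            .cloned()
            .collect()
    }
}
```

Using `Option::take` is what makes the selection one-shot: the request after the next one starts with MCP tools hidden again.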
jif-oai	c2bfd1e473	Revert "chore: enable sub agents" (#11230 ) Reverts openai/codex#11173	2026-02-09 20:22:38 +00:00
pakrym-oai	ccd17374cb	Move warmup to the task level (#11216 ) Instead of storing a special connection on the client level make the regular task responsible for establishing a normal client session and open a connection on it. Then when the turn is started we pass in a pre-established session.	2026-02-09 10:57:52 -08:00
Rasmus Rygaard	b2d3843109	Translate websocket errors (#10937 ) When getting errors over a websocket connection, translate the error into our regular API error format	2026-02-09 17:53:09 +00:00
jif-oai	cfce286459	tools: remove get_memory tool and tests (#11198 ) Drop this memory tool as the design changed	2026-02-09 17:47:36 +00:00
gt-oai	54b401aa5f	Deflake mixed parallel tools timing test (#11193 ) ``` FAIL [ 1.903s] (1926/3311) codex-core::all suite::tool_parallelism::mixed_parallel_tools_run_in_parallel stdout ─── running 1 test test suite::tool_parallelism::mixed_parallel_tools_run_in_parallel ... FAILED failures: failures: suite::tool_parallelism::mixed_parallel_tools_run_in_parallel test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 684 filtered out; finished in 1.86s stderr ─── thread 'suite::tool_parallelism::mixed_parallel_tools_run_in_parallel' (205083) panicked at core/tests/suite/tool_parallelism.rs:74:5: expected parallel execution to finish quickly, got 1.406255993s stack backtrace: 0: __rustc::rust_begin_unwind at /rustc/254b59607d4417e9dffbc307138ae5c86280fe4c/library/std/src/panicking.rs:689:5 1: core::panicking::panic_fmt at /rustc/254b59607d4417e9dffbc307138ae5c86280fe4c/library/core/src/panicking.rs:80:14 2: all::suite::tool_parallelism::assert_parallel_duration at ./tests/suite/tool_parallelism.rs:74:5 3: all::suite::tool_parallelism::mixed_parallel_tools_run_in_parallel::{{closure}} at ./tests/suite/tool_parallelism.rs:206:5 4: <core::pin::Pin<P> as core::future::future::Future>::poll at /home/runner/.rustup/toolchains/1.93.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/future.rs:133:9 5: tokio::runtime::park::CachedParkThread::block_on::{{closure}} at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/park.rs:284:71 6: tokio::task::coop::with_budget at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/task/coop/mod.rs:167:5 7: tokio::task::coop::budget at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/task/coop/mod.rs:133:5 8: tokio::runtime::park::CachedParkThread::block_on at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/park.rs:284:31 9: 
tokio::runtime::context::blocking::BlockingRegionGuard::block_on at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/context/blocking.rs:66:14 10: tokio::runtime::scheduler::multi_thread::MultiThread::block_on::{{closure}} at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/scheduler/multi_thread/mod.rs:89:22 11: tokio::runtime::context::runtime::enter_runtime at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/context/runtime.rs:65:16 12: tokio::runtime::scheduler::multi_thread::MultiThread::block_on at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/scheduler/multi_thread/mod.rs:88:9 13: tokio::runtime::runtime::Runtime::block_on_inner at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/runtime.rs:370:50 14: tokio::runtime::runtime::Runtime::block_on at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/runtime.rs:342:18 15: all::suite::tool_parallelism::mixed_parallel_tools_run_in_parallel at ./tests/suite/tool_parallelism.rs:208:7 16: all::suite::tool_parallelism::mixed_parallel_tools_run_in_parallel::{{closure}} at ./tests/suite/tool_parallelism.rs:178:52 17: core::ops::function::FnOnce::call_once at /home/runner/.rustup/toolchains/1.93.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:250:5 18: core::ops::function::FnOnce::call_once at /rustc/254b59607d4417e9dffbc307138ae5c86280fe4c/library/core/src/ops/function.rs:250:5 note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace. ```	2026-02-09 15:16:54 +00:00
jif-oai	284c03ceab	chore: enable sub agents (#11173 )	2026-02-09 11:25:37 +00:00
jif-oai	6cf61725d0	feat: do not close unified exec processes across turns (#10799 ) With this PR we do not close the unified exec processes (i.e. background terminals) at the end of a turn unless: * The user interrupts the turn * The user decides to clean the processes through `app-server` or `/clean` I made sure that `codex exec` correctly kills all the processes	2026-02-09 10:27:46 +00:00
Michael Bolin	383b45279e	feat: include NetworkConfig through ExecParams (#11105 ) This PR adds the following field to `Config`: ```rust pub network: Option<NetworkProxy>, ``` Though for the moment, it will always be initialized as `None` (this will be addressed in a subsequent PR). This PR does the work to thread `network` through to `execute_exec_env()`, `process_exec_tool_call()`, and `UnifiedExecRuntime.run()` to ensure it is available whenever we spawn a process.	2026-02-09 03:32:17 +00:00
Eric Traut	b3de6c7f2b	Defer persistence of rollout file (#11028 ) - Defer rollout persistence for fresh threads (`InitialHistory::New`): keep rollout events in memory and only materialize rollout file + state DB row on first `EventMsg::UserMessage`. - Keep precomputed rollout path available before materialization. - Change `thread/start` to build thread response from live config snapshot and optional precomputed path. - Improve pre-materialization behavior in app-server/TUI: clearer invalid-request errors for file-backed ops and a friendlier `/fork` “not ready yet” UX. - Update tests to match deferred semantics across start/read/archive/unarchive/fork/resume/review flows. - Improved resilience of user_shell test, which should be unrelated to this change but must be affected by timing changes For Reviewers: * The primary change is in recorder.rs * Most of the other changes were to fix up broken assumptions in existing tests Testing: * Manually tested CLI * Exercised app server paths by manually running IDE Extension with rebuilt CLI binary * Only user-visible change is that `/fork` in TUI generates visible error if used prior to first turn	2026-02-07 23:05:03 -08:00
pakrym-oai	6d08298f4e	Fallback to HTTP on UPGRADE_REQUIRED (#10824 ) Allow the server to trigger a connection downgrade in case the protocol changes in incompatible ways.	2026-02-08 05:06:33 +00:00
pakrym-oai	8fe5066bcc	Simplify pre-connect (#11040 )	2026-02-07 15:52:03 -08:00
Michael Bolin	a118494323	feat: add support for allowed_web_search_modes in requirements.toml (#10964 ) This PR makes it possible to disable live web search via an enterprise config even if the user is running in `--yolo` mode (though cached web search will still be available). To do this, create `/etc/codex/requirements.toml` as follows: ```toml # "live" is not allowed; "disabled" is allowed even though not listed explicitly. allowed_web_search_modes = ["cached"] ``` Or set `requirements_toml_base64` MDM as explained on https://developers.openai.com/codex/security/#locations. ### Why - Enforce admin/MDM/`requirements.toml` constraints on web-search behavior, independent of user config and per-turn sandbox defaults. - Ensure per-turn config resolution and review-mode overrides never crash when constraints are present. ### What - Add `allowed_web_search_modes` to requirements parsing and surface it in app-server v2 `ConfigRequirements` (`allowedWebSearchModes`), with fixtures updated. - Define a requirements allowlist type (`WebSearchModeRequirement`) and normalize semantics: - `disabled` is always implicitly allowed (even if not listed). - An empty list is treated as `["disabled"]`. - Make `Config.web_search_mode` a `Constrained<WebSearchMode>` and apply requirements via `ConstrainedWithSource<WebSearchMode>`. - Update per-turn resolution (`resolve_web_search_mode_for_turn`) to: - Prefer `Live → Cached → Disabled` when `SandboxPolicy::DangerFullAccess` is active (subject to requirements), unless the user preference is explicitly `Disabled`. - Otherwise, honor the user’s preferred mode, falling back to an allowed mode when necessary. - Update TUI `/debug-config` and app-server mapping to display normalized `allowed_web_search_modes` (including implicit `disabled`). - Fix web-search integration tests to assert cached behavior under `SandboxPolicy::ReadOnly` (since `DangerFullAccess` legitimately prefers `live` when allowed).	2026-02-07 05:55:15 +00:00
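The allowlist normalization and per-turn resolution rules listed in #10964 can be sketched as below. This is an illustration under assumptions: the enum is a simplified stand-in for `WebSearchMode`, `normalize_allowed` and `resolve_mode` are hypothetical names, and the fallback order outside full-access mode (preferred mode, then the next less permissive one) is a guess, since the message only says "falling back to an allowed mode".

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum WebSearchMode {
    Disabled,
    Cached,
    Live,
}

/// Normalize a requirements allowlist: `Disabled` is always allowed,
/// so an empty list becomes `[Disabled]`.
fn normalize_allowed(listed: &[WebSearchMode]) -> Vec<WebSearchMode> {
    let mut allowed = vec![WebSearchMode::Disabled];
    for mode in listed {
        if !allowed.contains(mode) {
            allowed.push(*mode);
        }
    }
    allowed
}

/// Pick the mode for a turn. Under full access, prefer Live, then Cached,
/// then Disabled, unless the user explicitly chose Disabled.
fn resolve_mode(preferred: WebSearchMode, full_access: bool, allowed: &[WebSearchMode]) -> WebSearchMode {
    use WebSearchMode::{Cached, Disabled, Live};
    let candidates: &[WebSearchMode] = match (preferred, full_access) {
        (Disabled, _) => &[Disabled],
        (_, true) => &[Live, Cached, Disabled],
        // Assumed fallback order when not in full-access mode.
        (Live, false) => &[Live, Cached, Disabled],
        (Cached, false) => &[Cached, Disabled],
    };
    candidates
        .iter()
        .copied()
        .find(|mode| allowed.contains(mode))
        .unwrap_or(Disabled)
}
```

With `allowed_web_search_modes = ["cached"]`, the normalized allowlist is `Disabled` plus `Cached`, so a full-access turn that would otherwise prefer `Live` resolves to `Cached`.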
daniel-oai	84bce2b8e6	TUI/Core: preserve duplicate skill/app mention selection across submit + resume (#10855 ) ## What changed - In `codex-rs/core/src/skills/injection.rs`, we now honor explicit `UserInput::Skill { name, path }` first, then fall back to text mentions only when safe. - In `codex-rs/tui/src/bottom_pane/chat_composer.rs`, mention selection is now token-bound (selected mention is tied to the specific inserted `$token`), and we snapshot bindings at submit time so selection is not lost. - In `codex-rs/tui/src/chatwidget.rs` and `codex-rs/tui/src/bottom_pane/mod.rs`, submit/queue paths now consume the submit-time mention snapshot (instead of rereading cleared composer state). - In `codex-rs/tui/src/mention_codec.rs` and `codex-rs/tui/src/bottom_pane/chat_composer_history.rs`, history now round-trips mention targets so resume restores the same selected duplicate. - In `codex-rs/tui/src/bottom_pane/skill_popup.rs` and `codex-rs/tui/src/bottom_pane/chat_composer.rs`, duplicate labels are normalized to `[Repo]` / `[App]`, app rows no longer show `Connected -`, and description space is a bit wider. <img width="550" height="163" alt="Screenshot 2026-02-05 at 9 56 56 PM" src="https://github.com/user-attachments/assets/346a7eb2-a342-4a49-aec8-68dfec0c7d89" /> <img width="550" height="163" alt="Screenshot 2026-02-05 at 9 57 09 PM" src="https://github.com/user-attachments/assets/5e04d9af-cccf-4932-98b3-c37183e445ed" /> ## Before vs now - Before: selecting a duplicate could still submit the default/repo match, and resume could lose which duplicate was originally selected. - Now: the exact selected target (skill path or app id) is preserved through submit, queue/restore, and resume. ## Manual test 1. Build and run this branch locally: - `cd /Users/daniels/code/codex/codex-rs` - `cargo build -p codex-cli --bin codex` - `./target/debug/codex` 2. Open mention picker with `$` and pick a duplicate entry (not the first one). 3. 
Confirm duplicate UI: - repo duplicate rows show `[Repo]` - app duplicate rows show `[App]` - app description does not start with `Connected -` 4. Submit the prompt, then press Up to restore draft and submit again. Expected: it keeps the same selected duplicate target. 5. Use `/resume` to reopen the session and send again. Expected: restored mention still resolves to the same duplicate target.	2026-02-06 15:59:00 -08:00
alexsong-oai	daeef06bec	add originator to otel (#10826 )	2026-02-06 15:13:56 -08:00
Brian Yu	1fbf5ed06f	Support alternative websocket API (#10861 ) Test plan ``` cargo build -p codex-cli && RUST_LOG='codex_api::endpoint::responses_websocket=trace,codex_core::client=debug,codex_core::codex=debug' \ ./target/debug/codex \ --enable responses_websockets_v2 \ --profile byok \ --full-auto ```	2026-02-06 14:40:50 -08:00
Ahmed Ibrahim	ba8b5d9018	Treat compaction failure as failure state (#10927 ) - Return compaction errors from local and remote compaction flows. - Stop turns/tasks when auto-compaction fails instead of continuing execution.	2026-02-06 13:51:46 -08:00
Charley Cunningham	143daadb31	core: refresh developer instructions after compaction replacement history (#10574 ) ## Summary When replaying compacted history (especially `replacement_history` from remote compaction), we should not keep stale developer messages from older session state. This PR trims developer- role messages from compacted replacement history and reinjects fresh developer instructions derived from current turn/session state. This aligns compaction replay behavior with the intended "fresh instructions after summary" model. ## Problem Compaction replay had two paths: - `Compacted { replacement_history: None }`: rebuilt with fresh initial context - `Compacted { replacement_history: Some(...) }`: previously used raw replacement history as-is The second path could carry stale developer instructions (permissions/personality/collab-mode guidance) across session changes. ## What Changed ### 1) Added helper to refresh compacted developer instructions - File: `codex-rs/core/src/compact.rs` - Function: `refresh_compacted_developer_instructions(...)` Behavior: - remove all `ResponseItem::Message { role: "developer", .. }` from compacted history - append fresh developer messages from current `build_initial_context(...)` ### 2) Applied helper in remote compaction flow - File: `codex-rs/core/src/compact_remote.rs` - After receiving compact endpoint output, refresh developer instructions before replacing history and persisting `replacement_history`. ### 3) Applied helper while reconstructing history from rollout - File: `codex-rs/core/src/codex.rs` - In `reconstruct_history_from_rollout(...)`, when processing `Compacted` entries with `replacement_history`, refresh developer instructions instead of directly replacing with raw history. ## Non-Goals / Follow-up This PR does not address the existing first-turn-after-resume double-injection behavior. A follow-up PR will handle resume-time dedup/idempotence separately. 
## Codex author `codex fork 019c25e6-706e-75d1-9198-688ec00a8256`	2026-02-06 12:25:08 -08:00
Josh McKinney	e416e578bb	core: preconnect Responses websocket for first turn (#10698 ) ## Problem The first user turn can pay websocket handshake latency even when a session has already started. We want to reduce that initial delay while preserving turn semantics and avoiding any prompt send during startup. Reviewer feedback also called out duplicated connect/setup paths and unnecessary preconnect state complexity. ## Mental model `ModelClient` owns session-scoped transport state. During session startup, it can opportunistically warm one websocket handshake slot. A turn-scoped `ModelClientSession` adopts that slot once if available, restores captured sticky turn-state, and otherwise opens a websocket through the same shared connect path. If startup preconnect is still in flight, first turn setup awaits that task and treats it as the first connection attempt for the turn. Preconnect is handshake-only. The first `response.create` is still sent only when a turn starts. ## Non-goals This change does not make preconnect required for correctness and does not change prompt/turn payload semantics. It also does not expand fallback behavior beyond clearing preconnect state when fallback activates. ## Tradeoffs The implementation prioritizes simpler ownership and shared connection code over header-match gating for reuse. The single-slot cache keeps lifecycle straightforward but only benefits the immediate next turn. Awaiting in-flight preconnect has the same app-level connect-timeout semantics as existing websocket connect behavior (no new timeout class introduced by this PR). ## Architecture `core/src/client.rs`: - Added session-level preconnect lifecycle state (`Idle` / `InFlight` / `Ready`) carrying one warmed websocket plus optional captured turn-state. - Added `pre_establish_connection()` startup warmup and `preconnect()` handshake-only setup. 
- Deduped auth/provider resolution into `current_client_setup()` and websocket handshake wiring into `connect_websocket()` / `build_websocket_headers()`. - Updated turn websocket path to adopt preconnect first, await in-flight preconnect when present, then create a new websocket only when needed. - Ensured fallback activation clears warmed preconnect state. - Added documentation for lifecycle, ownership, sticky-routing invariants, and timeout semantics. `core/src/codex.rs`: - Session startup invokes `model_client.pre_establish_connection(...)`. - Turn metadata resolution uses the shared timeout helper. `core/src/turn_metadata.rs`: - Centralized shared timeout helper used by both turn-time metadata resolution and startup preconnect metadata building. `core/tests/common/responses.rs` + websocket test suites: - Added deterministic handshake waiting helper (`wait_for_handshakes`) with bounded polling. - Added startup preconnect and in-flight preconnect reuse coverage. - Fallback expectations now assert exactly two websocket attempts in covered scenarios (startup preconnect + turn attempt before fallback sticks). ## Observability Preconnect remains best-effort and non-fatal. Existing websocket/fallback telemetry remains in place, and debug logs now make preconnect-await behavior and preconnect failures easier to reason about. ## Tests Validated with: 1. `just fmt` 2. `cargo test -p codex-core websocket_preconnect -- --nocapture` 3. `cargo test -p codex-core websocket_fallback -- --nocapture` 4. `cargo test -p codex-core websocket_first_turn_waits_for_inflight_preconnect -- --nocapture`	2026-02-06 19:08:24 +00:00
jif-oai	aab61934af	Handle required MCP startup failures across components (#10902 ) Summary - add a `required` flag for MCP servers everywhere config/CLI data is touched so mandatory helpers can be round-tripped - have `codex exec` and `codex app-server` thread start/resume fail fast when required MCPs fail to initialize	2026-02-06 17:14:37 +01:00
Eric Traut	dd80e332c4	Removed the "remote_compaction" feature flag (#10840 ) This feature is always on now	2026-02-05 23:54:57 -08:00
Anton Panasenko	4ee039744e	feat: expose detailed metrics to runtime metrics (#10699 )	2026-02-05 18:22:30 -08:00
pakrym-oai	dbe47ea01a	Send beta header with websocket connects (#10727 )	2026-02-05 15:05:02 -08:00
sayan-oai	378f1cabe8	go back to auto-enabling web_search for azure (#10820 ) ###### What Remove special-casing that prevented auto-enabling `web_search` for Azure model provider users. Addresses #10071, #10257. ###### Why Azure fixed their responsesapi implementation; `web_search` is now supported on models it wasn't before (like `gpt-5.1-codex-max`). This request now works: ``` curl "$AZURE_API_ENDPOINT" -H "Content-Type: application/json" -H "Authorization: Bearer $AZURE_API_KEY" -d '{ "model": "gpt-5.1-codex-max", "tools": [ { "type": "web_search" } ], "tool_choice": "auto", "input": "Find the sunrise time in Paris today and cite the source." }' ``` ###### Tests Tested with above curl, removed Azure-specific tests.	2026-02-05 14:57:07 -08:00
jif-oai	428a9f6035	feat: wait for backfill to be ready (#10790 )	2026-02-05 20:45:16 +00:00
sayan-oai	5fdf6f5efa	chore: rm web-search-eligible header (#10660 ) default-enablement of web_search is now client-side, no need to send eligibility headers to backend. Tested locally, headers no longer sent. will wait for corresponding backend change to deploy before merging	2026-02-05 11:48:34 -08:00
Owen Lin	3582b74d01	fix(auth): isolate chatgptAuthTokens concept to auth manager and app-server (#10423 ) So that the rest of the codebase (like TUI) doesn't need to be concerned whether ChatGPT auth was handled by Codex itself or passed in via app-server's external auth mode.	2026-02-05 10:46:06 -08:00
jif-oai	9ee746afd6	Leverage state DB metadata for thread summaries (#10621 ) Summary: - read conversation summaries and cwd info from the state DB when possible so we no longer rely on rollout files for metadata and avoid extra I/O - persist CLI version in thread metadata, surface it through summary builders, and add the necessary DB migration hooks - simplify thread listing by using enriched state DB data directly rather than reading rollout heads Testing: - Not run (not requested)	2026-02-05 16:39:11 +00:00
jif-oai	41f3b1ba0b	feat: add memory tool (#10637 ) Add a tool for memory to retrieve a full memory based on the memory ID	2026-02-05 16:16:31 +00:00
jif-oai	97582ac52d	Allow user shell commands to run alongside active turns (#10513 ) Summary - refactor user shell command execution into a shared helper and add modes for standalone vs active-turn execution - run user shell commands asynchronously when a turn is already active so they don’t replace or abort the current turn - extend the tests to cover the new behavior and add the generated Codex environment manifest Testing - Not run (not requested)	2026-02-05 11:11:00 +00:00
Dylan Hurd	fe8b474acd	fix(core,app-server) resume with different model (#10719 ) ## Summary When resuming with a different model, we should also append a developer message with the model instructions ## Testing - [x] Added unit tests	2026-02-05 00:40:05 -08:00
Charley Cunningham	dc7007beaa	Fix remote compaction estimator/payload instruction small mismatch (#10692 ) ## Summary This PR fixes a deterministic mismatch in remote compaction where pre-trim estimation and the `/v1/responses/compact` payload could use different base instructions. Before this change: - pre-trim estimation used model-derived instructions (`model_info.get_model_instructions(...)`) - compact payload used session base instructions (`sess.get_base_instructions()`) After this change: - remote pre-trim estimation and compact payload both use the same `BaseInstructions` instance from session state. ## Changes - Added a shared estimator entry point in `ContextManager`: - `estimate_token_count_with_base_instructions(&self, base_instructions: &BaseInstructions) -> Option<i64>` - Kept `estimate_token_count(&TurnContext)` as a thin wrapper that resolves model/personality instructions and delegates to the new helper. - Updated remote compaction flow to fetch base instructions once and reuse it for both: - trim preflight estimation - compact request payload construction - Added regression coverage for parity and behavior: - unit test verifying explicit-base estimator behavior - integration test proving remote compaction uses session override instructions and trims accordingly ## Why this matters This removes a deterministic divergence source where pre-trim could think the request fits while the actual compact request exceeded context because its instructions were longer/different. ## Scope In scope: - estimator/payload base-instructions parity in remote compaction Out of scope: - retry-on-`context_length_exceeded` - compaction threshold/headroom policy changes - broader trimming policy changes ## Codex author: `codex fork 019c2b24-c2df-7b31-a482-fb8cf7a28559`	2026-02-04 23:24:06 -08:00
Dylan Hurd	e482978261	fix(core) switching model appends model instructions (#10651 ) ## Summary When switching models, we should append the instructions of the new model to the conversation as a developer message. ## Test - [x] Adds a unit test	2026-02-05 05:50:38 +00:00
Dylan Hurd	a05aadfa1b	chore(config) Default Personality Pragmatic (#10705 ) ## Summary Switch back to Pragmatic personality ## Testing - [x] Updated unit tests	2026-02-04 21:22:47 -08:00


520 Commits