codex

mirror of https://github.com/openai/codex.git synced 2026-05-24 21:14:51 +00:00

Author	SHA1	Message	Date
pakrym-oai	71e4c6fa17	Move codex module under session (#18249 ) ## Summary - rename the core codex module root to session/mod.rs without using #[path] - move the codex module directory and tests under core/src/session - remove session/mod.rs reexports so call sites use explicit child module paths ## Testing - cargo test -p codex-core --lib - cargo check -p codex-core --tests - just fmt - just fix -p codex-core - git diff --check	2026-04-17 16:18:53 +00:00
rhan-oai	5779be314a	[codex-analytics] add compaction analytics event (#17155 ) - event for compaction analytics - introduces thread-connection and thread metadata caches for data denormalization, expected to be useful for denormalization onto core emitted events in general - threads analytics event client into core (mirrors approved implementation in #16640) - denormalizes key thread metadata: thread_source, subagent_source, parent_thread_id, as well as app-server client and runtime metadata) - compaction strategy defaults to memento, forward compatible with expected prefill_compaction strategy 1. Manual standalone compact, local `INFO \| 2026-04-09 17:35:50 \| codex_backend.routers.analytics_events \| analytics_events.track_analytics_events:526 \| Tracked codex_compaction_event event params={'thread_id': '019d74d0-5cfb-70c0-bef9-165c3bf9b2df', 'turn_id': '019d74d0-d7f6-7c81-acc6-aae2030243d6', 'product_surface': 'codex', 'app_server_client': {'product_client_id': 'CODEX_CLI', 'client_name': 'codex-tui', 'client_version': '0.0.0', 'rpc_transport': 'in_process', 'experimental_api_enabled': True}, 'runtime': {'codex_rs_version': '0.0.0', 'runtime_os': 'macos', 'runtime_os_version': '26.4.0', 'runtime_arch': 'aarch64'}, 'trigger': 'manual', 'reason': 'user_requested', 'implementation': 'responses', 'phase': 'standalone_turn', 'strategy': 'memento', 'status': 'completed', 'active_context_tokens_before': 20170, 'active_context_tokens_after': 4830, 'started_at': 1775781337, 'completed_at': 1775781350, 'thread_source': 'user', 'subagent_source': None, 'parent_thread_id': None, 'error': None, 'duration_ms': 13524} \| ` 2. Auto pre-turn compact, local `INFO \| 2026-04-09 17:37:30 \| codex_backend.routers.analytics_events \| analytics_events.track_analytics_events:526 \| Tracked codex_compaction_event event params={'thread_id': '019d74d2-45ef-71d1-9c93-23cc0c13d988', 'turn_id': '019d74d2-7b42-7372-9f0e-c0da3f352328', 'product_surface': 'codex', 'app_server_client': {'product_client_id': 'CODEX_CLI', 'client_name': 'codex-tui', 'client_version': '0.0.0', 'rpc_transport': 'in_process', 'experimental_api_enabled': True}, 'runtime': {'codex_rs_version': '0.0.0', 'runtime_os': 'macos', 'runtime_os_version': '26.4.0', 'runtime_arch': 'aarch64'}, 'trigger': 'auto', 'reason': 'context_limit', 'implementation': 'responses', 'phase': 'pre_turn', 'strategy': 'memento', 'status': 'completed', 'active_context_tokens_before': 20063, 'active_context_tokens_after': 4822, 'started_at': 1775781444, 'completed_at': 1775781449, 'thread_source': 'user', 'subagent_source': None, 'parent_thread_id': None, 'error': None, 'duration_ms': 5497} \| ` 3. Auto mid-turn compact, local `INFO \| 2026-04-09 17:38:28 \| codex_backend.routers.analytics_events \| analytics_events.track_analytics_events:526 \| Tracked codex_compaction_event event params={'thread_id': '019d74d3-212f-7a20-8c0a-4816a978675e', 'turn_id': '019d74d3-3ee1-7462-89f6-2ffbeefcd5e3', 'product_surface': 'codex', 'app_server_client': {'product_client_id': 'CODEX_CLI', 'client_name': 'codex-tui', 'client_version': '0.0.0', 'rpc_transport': 'in_process', 'experimental_api_enabled': True}, 'runtime': {'codex_rs_version': '0.0.0', 'runtime_os': 'macos', 'runtime_os_version': '26.4.0', 'runtime_arch': 'aarch64'}, 'trigger': 'auto', 'reason': 'context_limit', 'implementation': 'responses', 'phase': 'mid_turn', 'strategy': 'memento', 'status': 'completed', 'active_context_tokens_before': 20325, 'active_context_tokens_after': 14641, 'started_at': 1775781500, 'completed_at': 1775781508, 'thread_source': 'user', 'subagent_source': None, 'parent_thread_id': None, 'error': None, 'duration_ms': 7507} \| ` 4. Remote /responses/compact, manual standalone `INFO \| 2026-04-09 17:40:20 \| codex_backend.routers.analytics_events \| analytics_events.track_analytics_events:526 \| Tracked codex_compaction_event event params={'thread_id': '019d74d4-7a11-78a1-89f7-0535a1149416', 'turn_id': '019d74d4-e087-7183-9c20-b1e40b7578c0', 'product_surface': 'codex', 'app_server_client': {'product_client_id': 'CODEX_CLI', 'client_name': 'codex-tui', 'client_version': '0.0.0', 'rpc_transport': 'in_process', 'experimental_api_enabled': True}, 'runtime': {'codex_rs_version': '0.0.0', 'runtime_os': 'macos', 'runtime_os_version': '26.4.0', 'runtime_arch': 'aarch64'}, 'trigger': 'manual', 'reason': 'user_requested', 'implementation': 'responses_compact', 'phase': 'standalone_turn', 'strategy': 'memento', 'status': 'completed', 'active_context_tokens_before': 23461, 'active_context_tokens_after': 6171, 'started_at': 1775781601, 'completed_at': 1775781620, 'thread_source': 'user', 'subagent_source': None, 'parent_thread_id': None, 'error': None, 'duration_ms': 18971} \| `	2026-04-10 13:03:54 -07:00
Dylan Hurd	6c36e7d688	fix(app-server) revert null instructions changes (#17047 )	2026-04-07 15:18:34 -07:00
Ahmed Ibrahim	24c598e8a9	Honor null thread instructions (#16964 ) - Treat explicit null thread instructions as a blank-slate override while preserving omitted-field fallback behavior. - Preserve null through rollout resume/fork and keep explicit empty strings distinct. - Add app-server v2 start/fork coverage for the tri-state instruction params.	2026-04-07 04:10:19 +00:00
rhan-oai	756c45ec61	[codex-analytics] add protocol-native turn timestamps (#16638 ) --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/16638). * #16870 * #16706 * #16659 * #16641 * #16640 * __->__ #16638	2026-04-06 16:22:59 -07:00
Ahmed Ibrahim	af8a9d2d2b	remove temporary ownership re-exports (#16626 ) Stacked on #16508. This removes the temporary `codex-core` / `codex-login` re-export shims from the ownership split and rewrites callsites to import directly from `codex-model-provider-info`, `codex-models-manager`, `codex-api`, `codex-protocol`, `codex-feedback`, and `codex-response-debug-context`. No behavior change intended; this is the mechanical import cleanup layer split out from the ownership move. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-03 00:33:34 -07:00
Michael Bolin	aa2403e2eb	core: remove cross-crate re-exports from lib.rs (#16512 ) ## Why `codex-core` was re-exporting APIs owned by sibling `codex-` crates, which made downstream crates depend on `codex-core` as a proxy module instead of the actual owner crate. Removing those forwards makes crate boundaries explicit and lets leaf crates drop unnecessary `codex-core` dependencies. In this PR, this reduces the dependency on `codex-core` to `codex-login` in the following files: ``` codex-rs/backend-client/Cargo.toml codex-rs/mcp-server/tests/common/Cargo.toml ``` ## What - Remove `codex-rs/core/src/lib.rs` re-exports for symbols owned by `codex-login`, `codex-mcp`, `codex-rollout`, `codex-analytics`, `codex-protocol`, `codex-shell-command`, `codex-sandboxing`, `codex-tools`, and `codex-utils-path`. - Delete the `default_client` forwarding shim in `codex-rs/core`. - Update in-crate and downstream callsites to import directly from the owning `codex-` crate. - Add direct Cargo dependencies where callsites now target the owner crate, and remove `codex-core` from `codex-rs/backend-client`.	2026-04-01 23:06:24 -07:00
Andrei Eternal	267499bed8	[hooks] use a user message > developer message for prompt continuation (#14867 ) ## Summary Persist Stop-hook continuation prompts as `user` messages instead of hidden `developer` messages + some requested integration tests This is a followup to @pakrym 's comment in https://github.com/openai/codex/pull/14532 to make sure stop-block continuation prompts match training for turn loops - Stop continuation now writes `<hook_prompt hook_run_id="...">stop hook's user prompt<hook_prompt>` - Introduces quick-xml dependency, though we already indirectly depended on it anyway via syntect - This PR only has about 500 lines of actual logic changes, the rest is tests/schema ## Testing Example run (with a sessionstart hook and 3 stop hooks) - this shows context added by session start, then two stop hooks sending their own additional prompts in a new turn. The model responds with a single message addressing both. Then when that turn ends, the hooks detect that they just ran using `stop_hook_active` and decide not to infinite loop test files for this (unzip, move codex -> .codex): [codex.zip](https://github.com/user-attachments/files/26075806/codex.zip) ``` › cats • Running SessionStart hook: lighting the observatory SessionStart hook (completed) warning: Hi, I'm a session start hook for wizard-tower (startup). hook context: A wimboltine stonpet is an exotic cuisine from hyperspace • Cats are tiny zen wizards, my friend: equal parts nap, mystery, and chaos. If you want, we can talk cat facts, cat breeds, cat names, or build something cat-themed in this repo. • Running Stop hook: checking the tower wards • Running Stop hook: sacking the guards • Running Stop hook: hiring the guards Stop hook (completed) warning: Wizard Tower Stop hook reviewed the completed reply (177 chars). Stop hook (blocked) warning: Wizard Tower Stop hook continuing conversation feedback: cook the stonpet Stop hook (blocked) warning: Wizard Tower Stop hook continuing conversation feedback: eat the cooked stonpet • Stonpet’s cooked, aloha style: flash-seared over a blue quasiflame, glazed with nebula salt, and rested until the hyperspace juices settle. Now we eat with gratitude, my friend. One mindful bite in, and the flavor is pure cosmic surf: smoky, bright, and totally out of this dimension. • Running Stop hook: checking the tower wards • Running Stop hook: sacking the guards • Running Stop hook: hiring the guards Stop hook (completed) warning: Wizard Tower Stop hook reviewed the completed reply (285 chars). Stop hook (completed) warning: Wizard Tower Stop hook saw a second pass and stayed calm to avoid a loop. Stop hook (completed) warning: Wizard Tower Stop hook saw a second pass and stayed calm to avoid a loop. ```	2026-03-19 10:53:08 -07:00
Michael Bolin	b77fe8fefe	Apply argument comment lint across codex-rs (#14652 ) ## Why Once the repo-local lint exists, `codex-rs` needs to follow the checked-in convention and CI needs to keep it from drifting. This commit applies the fallback `/param/` style consistently across existing positional literal call sites without changing those APIs. The longer-term preference is still to avoid APIs that require comments by choosing clearer parameter types and call shapes. This PR is intentionally the mechanical follow-through for the places where the existing signatures stay in place. After rebasing onto newer `main`, the rollout also had to cover newly introduced `tui_app_server` call sites. That made it clear the first cut of the CI job was too expensive for the common path: it was spending almost as much time installing `cargo-dylint` and re-testing the lint crate as a representative test job spends running product tests. The CI update keeps the full workspace enforcement but trims that extra overhead from ordinary `codex-rs` PRs. ## What changed - keep a dedicated `argument_comment_lint` job in `rust-ci` - mechanically annotate remaining opaque positional literals across `codex-rs` with exact `/param/` comments, including the rebased `tui_app_server` call sites that now fall under the lint - keep the checked-in style aligned with the lint policy by using `/param/` and leaving string and char literals uncommented - cache `cargo-dylint`, `dylint-link`, and the relevant Cargo registry/git metadata in the lint job - split changed-path detection so the lint crate's own `cargo test` step runs only when `tools/argument-comment-lint/` or `rust-ci.yml` changes - continue to run the repo wrapper over the `codex-rs` workspace, so product-code enforcement is unchanged Most of the code changes in this commit are intentionally mechanical comment rewrites or insertions driven by the lint itself. ## Verification - `./tools/argument-comment-lint/run.sh --workspace` - `cargo test -p codex-tui-app-server -p codex-tui` - parsed `.github/workflows/rust-ci.yml` locally with PyYAML --- -> #14652 * #14651	2026-03-16 16:48:15 -07:00
pakrym-oai	477a2dd345	Add code_mode_only feature (#14617 ) Summary - add the code_mode_only feature flag/config schema and wire its dependency on code_mode - update code mode tool descriptions to list nested tools with detailed headers - restrict available tools for prompt and exec descriptions when code_mode_only is enabled and test the behavior Testing - Not run (not requested)	2026-03-13 13:30:19 -07:00
Rasmus Rygaard	53d5972226	Reapply "Pass more params to compaction" (#14298 ) (#14521 ) This reverts commit `8af97ce4b0`. Confirmed that this runs locally without the previous issues with tool use	2026-03-12 23:27:21 +00:00
Anton Panasenko	77b0c75267	feat: search_tool migrate to bring you own tool of Responses API (#14274 ) ## Why to support a new bring your own search tool in Responses API(https://developers.openai.com/api/docs/guides/tools-tool-search#client-executed-tool-search) we migrating our bm25 search tool to use official way to execute search on client and communicate additional tools to the model. ## What - replace the legacy `search_tool_bm25` flow with client-executed `tool_search` - add protocol, SSE, history, and normalization support for `tool_search_call` and `tool_search_output` - return namespaced Codex Apps search results and wire namespaced follow-up tool calls back into MCP dispatch	2026-03-11 17:51:51 -07:00
Rasmus Rygaard	7f22329389	Revert "Pass more params to compaction" (#14298 )	2026-03-11 12:33:10 -07:00
Rasmus Rygaard	2621ba17e3	Pass more params to compaction (#14247 ) Pass more params to /compact. This should give us parity with the /responses endpoint to improve caching. I'm torn about the MCP await. Blocking will give us parity but it seems like we explicitly don't block on MCPs. Happy either way	2026-03-11 12:33:09 -07:00
Owen Lin	289ed549cf	chore(otel): rename OtelManager to SessionTelemetry (#13808 ) ## Summary This is a purely mechanical refactor of `OtelManager` -> `SessionTelemetry` to better convey what the struct is doing. No behavior change. ## Why `OtelManager` ended up sounding much broader than what this type actually does. It doesn't manage OTEL globally; it's the session-scoped telemetry surface for emitting log/trace events and recording metrics with consistent session metadata (`app_version`, `model`, `slug`, `originator`, etc.). `SessionTelemetry` is a more accurate name, and updating the call sites makes that boundary a lot easier to follow. ## Validation - `just fmt` - `cargo test -p codex-otel` - `cargo test -p codex-core`	2026-03-06 16:23:30 -08:00
Won Park	fa2306b303	image-gen-core (#13290 ) Core tool-calling for image-gen, handles requesting and receiving logic for images using response API	2026-03-03 23:11:28 -08:00
Ahmed Ibrahim	0aeb55bf08	Record realtime close marker on replacement (#13058 ) ## Summary - record a realtime close developer message when a new realtime session replaces an active one - assert the replacement marker through the mocked responses request path --------- Co-authored-by: Codex <noreply@openai.com> Co-authored-by: Charles Cunningham <ccunningham@openai.com>	2026-03-01 13:54:12 -08:00
Charley Cunningham	695957a348	Unify rollout reconstruction with resume/fork TurnContext hydration (#12612 ) ## Summary This PR unifies rollout history reconstruction and resume/fork metadata hydration under a single `Session::reconstruct_history_from_rollout` implementation. The key change from main is that replay metadata now comes from the same reconstruction pass that rebuilds model-visible history, instead of doing a second bespoke rollout scan to recover `previous_model` / `reference_context_item`. ## What Changed ### Unified reconstruction output `reconstruct_history_from_rollout` now returns a single `RolloutReconstruction` bundle containing: - rebuilt `history` - `previous_model` - `reference_context_item` Resume and fork both consume that shared output directly. ### Reverse replay core The reconstruction logic moved into `codex-rs/core/src/codex/rollout_reconstruction.rs` and now scans rollout items newest-to-oldest. That reverse pass: - derives `previous_model` - derives whether `reference_context_item` is preserved or cleared - stops early once it has both resume metadata and a surviving `replacement_history` checkpoint History materialization is still bridged eagerly for now by replaying only the surviving suffix forward, which keeps the history result stable while moving the control flow toward the future lazy reverse loader design. ### Removed bespoke context lookup This deletes `last_rollout_regular_turn_context_lookup` and its separate compaction-aware scan. The previous model / baseline metadata is now computed from the same replay state that rebuilds history, so resume/fork cannot drift from the reconstructed transcript view. ### `TurnContextItem` persistence contract `TurnContextItem` is now treated as the replay source of truth for durable model-visible baselines. This PR keeps the following contract explicit: - persist `TurnContextItem` for the first real user turn so resume can recover `previous_model` - persist it for later turns that emit model-visible context updates - if mid-turn compaction reinjects full initial context into replacement history, persist a fresh `TurnContextItem` after `Compacted` so resume/fork can re-establish the baseline from the rewritten history - do not treat manual compaction or pre-sampling compaction as creating a new durable baseline on their own ## Behavior Preserved - rollback replay stays aligned with `drop_last_n_user_turns` - rollback skips only user turns - incomplete active user turns are dropped before older finalized turns when rollback applies - unmatched aborts do not consume the current active turn - missing abort IDs still conservatively clear stale compaction state - compaction clears `reference_context_item` until a later `TurnContextItem` re-establishes it - `previous_model` still comes from the newest surviving user turn that established one ## Tests Targeted validation run for the current branch shape: - `cd codex-rs && cargo test -p codex-core --lib codex::rollout_reconstruction_tests -- --nocapture` - `cd codex-rs && just fmt` The branch also extracts the rollout reconstruction tests into `codex-rs/core/src/codex/rollout_reconstruction_tests.rs` so this logic has a dedicated home instead of living inline in `codex.rs`.	2026-02-27 13:50:45 -08:00
jif-oai	3404ecff15	feat: add post-compaction sub-agent infos (#12774 ) Co-authored-by: Codex <noreply@openai.com>	2026-02-26 18:55:34 +00:00
Charley Cunningham	07aefffb1f	core: bundle settings diff updates into one dev/user envelope (#12417 ) ## Summary - bundle contextual prompt injection into at most one developer message plus one contextual user message in both: - per-turn settings updates - initial context insertion - preserve `<model_switch>` across compaction by rebuilding it through canonical initial-context injection, instead of relying on strip/reattach hacks - centralize contextual user fragment detection in one shared definition table and reuse it for parsing/compaction logic - keep `AGENTS.md` in its natural serialized format: - `# AGENTS.md instructions for {dirname}` - `<INSTRUCTIONS>...</INSTRUCTIONS>` - simplify related tests/helpers and accept the expected snapshot/layout updates from bundled multi-part messages ## Why The goal is to converge toward a simpler, more intentional prompt shape where contextual updates are consistently represented as one developer envelope plus one contextual user envelope, while keeping parsing and compaction behavior aligned with that representation. ## Notable details - the temporary `SettingsUpdateEnvelope` wrapper was removed; these paths now return `Vec<ResponseItem>` directly - local/remote compaction no longer rely on model-switch strip/restore helpers - contextual user detection is now driven by shared fragment definitions instead of ad hoc matcher assembly - AGENTS/user instructions are still the same logical context; only the synthetic `<user_instructions>` wrapper was replaced by the natural AGENTS text format ## Testing - `just fmt` - `cargo test -p codex-app-server codex_message_processor::tests::extract_conversation_summary_prefers_plain_user_messages -- --exact` - `cargo test -p codex-core compact::tests::collect_user_messages_filters_session_prefix_entries --lib -- --exact` - `cargo test -p codex-core --test all 'suite::compact::snapshot_request_shape_pre_turn_compaction_strips_incoming_model_switch' -- --exact` - `cargo test -p codex-core --test all 'suite::compact_remote::snapshot_request_shape_remote_pre_turn_compaction_strips_incoming_model_switch' -- --exact` - `cargo test -p codex-core --test all 'suite::client::includes_apps_guidance_as_developer_message_when_enabled' -- --exact` - `cargo test -p codex-core --test all 'suite::client::includes_developer_instructions_message_in_request' -- --exact` - `cargo test -p codex-core --test all 'suite::client::includes_user_instructions_message_in_request' -- --exact` - `cargo test -p codex-core --test all 'suite::client::resume_includes_initial_messages_and_sends_prior_items' -- --exact` - `cargo test -p codex-core --test all 'suite::review::review_input_isolated_from_parent_history' -- --exact` - `cargo test -p codex-exec --test all 'suite::resume::exec_resume_last_respects_cwd_filter_and_all_flag' -- --exact` - `cargo test -p core_test_support context_snapshot::tests::full_text_mode_preserves_unredacted_text -- --exact` ## Notes - I also ran several targeted `compact`, `compact_remote`, `prompt_caching`, `model_visible_layout`, and `event_mapping` tests while iterating on prompt-shape changes. - I have not claimed a clean full-workspace `cargo test` from this environment because local sandbox/resource conditions have previously produced unrelated failures in large workspace runs.	2026-02-26 00:12:08 -08:00
Charley Cunningham	bb0ac5be70	Fix compaction context reinjection and model baselines (#12252 ) ## Summary - move regular-turn context diff/full-context persistence into `run_turn` so pre-turn compaction runs before incoming context updates are recorded - after successful pre-turn compaction, rely on a cleared `reference_context_item` to trigger full context reinjection on the follow-up regular turn (manual `/compact` keeps replacement history summary-only and also clears the baseline) - preserve `<model_switch>` when full context is reinjected, and inject it before the rest of the full-context items - scope `reference_context_item` and `previous_model` to regular user turns only so standalone tasks (`/compact`, shell, review, undo) cannot suppress future reinjection or `<model_switch>` behavior - make context-diff persistence + `reference_context_item` updates explicit in the regular-turn path, with clearer docs/comments around the invariant - stop persisting local `/compact` `RolloutItem::TurnContext` snapshots (only regular turns persist `TurnContextItem` now) - simplify resume/fork previous-model/reference-baseline hydration by looking up the last surviving turn context from rollout lifecycle events, including rollback and compaction-crossing handling - remove the legacy fallback that guessed from bare `TurnContext` rollouts without lifecycle events - update compaction/remote-compaction/model-visible snapshots and compact test assertions (including remote compaction mock response shape) ## Why We were persisting incoming context items before spawning the regular turn task, which let pre-turn compaction requests accidentally include incoming context diffs without the new user message. Fixing that exposed follow-on baseline issues around `/compact`, resume/fork, and standalone tasks that could cause duplicate context injection or suppress `<model_switch>` instructions. This PR re-centers the invariants around regular turns: - regular turns persist model-visible context diffs/full reinjection and update the `reference_context_item` - standalone tasks do not advance those regular-turn baselines - compaction clears the baseline when replacement history may have stripped the referenced context diffs ## Follow-ups (TODOs left in code) - `TODO(ccunningham)`: fix rollback/backtracking baseline handling more comprehensively - `TODO(ccunningham)`: include pending incoming context items in pre-turn compaction threshold estimation - `TODO(ccunningham)`: inject updated personality spec alongside `<model_switch>` so some model-switch paths can avoid forced full reinjection - `TODO(ccunningham)`: review task turn lifecycle (`TurnStarted`/`TurnComplete`) behavior and emit task-start context diffs for task types that should have them (excluding `/compact`) ## Validation - `just fmt` - CI should cover the updated compaction/resume/model-visible snapshot expectations and rollout-hydration behavior - I did not rerun the full local test suite after the latest resume-lookup / rollout-persistence simplifications	2026-02-20 23:13:08 -08:00
Charley Cunningham	67e577da53	Handle model-switch base instructions after compaction (#11659 ) Strip trailing <model_switch> during model-switch compaction request, and append <model_switch> after model switch compaction	2026-02-13 19:02:53 -08:00
Celia Chen	641d5268fa	chore: persist turn_id in rollout session and make turn_id uuid based (#11246 ) Problem: 1. turn id is constructed in-memory; 2. on resuming threads, turn_id might not be unique; 3. client cannot no the boundary of a turn from rollout files easily. This PR does three things: 1. persist `task_started` and `task_complete` events; 1. persist `turn_id` in rollout turn events; 5. generate turn_id as unique uuids instead of incrementing it in memory. This helps us resolve the issue of clients wanting to have unique turn ids for resuming a thread, and knowing the boundry of each turn in rollout files. example debug logs ``` 2026-02-11T00:32:10.746876Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=8 turn=Turn { id: "019c4a07-d809-74c3-bc4b-fd9618487b4b", items: [UserMessage { id: "item-24", content: [Text { text: "hi", text_elements: [] }] }, AgentMessage { id: "item-25", text: "Hi. I’m in the workspace with your current changes loaded and ready. Send the next task and I’ll execute it end-to-end." }], status: Completed, error: None } 2026-02-11T00:32:10.746888Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=9 turn=Turn { id: "019c4a18-1004-76c0-a0fb-a77610f6a9b8", items: [UserMessage { id: "item-26", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-27", text: "Hello. Ready for the next change in `codex-rs`; I can continue from the current in-progress diff or start a new task." }], status: Completed, error: None } 2026-02-11T00:32:10.746899Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=10 turn=Turn { id: "019c4a19-41f0-7db0-ad78-74f1503baeb8", items: [UserMessage { id: "item-28", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-29", text: "Hello. Send the specific change you want in `codex-rs`, and I’ll implement it and run the required checks." }], status: Completed, error: None } ``` backward compatibility: if you try to resume an old session without task_started and task_complete event populated, the following happens: - If you resume and do nothing: those reconstructed historical IDs can differ next time you resume. - If you resume and send a new turn: the new turn gets a fresh UUID from live submission flow and is persisted, so that new turn’s ID is stable on later resumes. I think this behavior is fine, because we only care about deterministic turn id once a turn is triggered.	2026-02-11 03:56:01 +00:00
Ahmed Ibrahim	5e01450963	Strip unsupported images from prompt history to guard against model switch (#11349 ) - Make `ContextManager::for_prompt` modality-aware and strip input_image content when the active model is text-only. - Added a test for multi-model -> text-only model switch	2026-02-10 11:58:00 -08:00
Charley Cunningham	9450cd9ce5	core: add focused diagnostics for remote compaction overflows (#11133 ) ## Summary - add targeted remote-compaction failure diagnostics in compact_remote logging - log the specific values needed to explain overflow timing: - last_api_response_total_tokens - estimated_tokens_of_items_added_since_last_successful_api_response - estimated_bytes_of_items_added_since_last_successful_api_response - failing_compaction_request_body_bytes - simplify breakdown naming and remove last_api_response_total_bytes_estimate (it was an approximation and not useful for debugging) ## Why When compaction fails with context_length_exceeded, we need concrete, low-ambiguity numbers that map directly to: 1) what the API most recently reported, and 2) what local history added since then. This keeps the failure logs actionable without adding broad, noisy metrics. ## Testing - just fmt - cargo test -p codex-core	2026-02-09 12:42:20 -08:00
Ahmed Ibrahim	ba8b5d9018	Treat compaction failure as failure state (#10927 ) - Return compaction errors from local and remote compaction flows.\n- Stop turns/tasks when auto-compaction fails instead of continuing execution.	2026-02-06 13:51:46 -08:00
Charley Cunningham	143daadb31	core: refresh developer instructions after compaction replacement history (#10574 ) ## Summary When replaying compacted history (especially `replacement_history` from remote compaction), we should not keep stale developer messages from older session state. This PR trims developer- role messages from compacted replacement history and reinjects fresh developer instructions derived from current turn/session state. This aligns compaction replay behavior with the intended "fresh instructions after summary" model. ## Problem Compaction replay had two paths: - `Compacted { replacement_history: None }`: rebuilt with fresh initial context - `Compacted { replacement_history: Some(...) }`: previously used raw replacement history as-is The second path could carry stale developer instructions (permissions/personality/collab-mode guidance) across session changes. ## What Changed ### 1) Added helper to refresh compacted developer instructions - File: `codex-rs/core/src/compact.rs` - Function: `refresh_compacted_developer_instructions(...)` Behavior: - remove all `ResponseItem::Message { role: "developer", .. }` from compacted history - append fresh developer messages from current `build_initial_context(...)` ### 2) Applied helper in remote compaction flow - File: `codex-rs/core/src/compact_remote.rs` - After receiving compact endpoint output, refresh developer instructions before replacing history and persisting `replacement_history`. ### 3) Applied helper while reconstructing history from rollout - File: `codex-rs/core/src/codex.rs` - In `reconstruct_history_from_rollout(...)`, when processing `Compacted` entries with `replacement_history`, refresh developer instructions instead of directly replacing with raw history. ## Non-Goals / Follow-up This PR does not address the existing first-turn-after-resume double-injection behavior. A follow-up PR will handle resume-time dedup/idempotence separately. If you want, I can also give you a shorter “squash-merge friendly” version of the description. ## Codex author `codex fork 019c25e6-706e-75d1-9198-688ec00a8256`	2026-02-06 12:25:08 -08:00
Charley Cunningham	dc7007beaa	Fix remote compaction estimator/payload instruction small mismatch (#10692 ) ## Summary This PR fixes a deterministic mismatch in remote compaction where pre-trim estimation and the `/v1/responses/compact` payload could use different base instructions. Before this change: - pre-trim estimation used model-derived instructions (`model_info.get_model_instructions(...)`) - compact payload used session base instructions (`sess.get_base_instructions()`) After this change: - remote pre-trim estimation and compact payload both use the same `BaseInstructions` instance from session state. ## Changes - Added a shared estimator entry point in `ContextManager`: - `estimate_token_count_with_base_instructions(&self, base_instructions: &BaseInstructions) -> Option<i64>` - Kept `estimate_token_count(&TurnContext)` as a thin wrapper that resolves model/personality instructions and delegates to the new helper. - Updated remote compaction flow to fetch base instructions once and reuse it for both: - trim preflight estimation - compact request payload construction - Added regression coverage for parity and behavior: - unit test verifying explicit-base estimator behavior - integration test proving remote compaction uses session override instructions and trims accordingly ## Why this matters This removes a deterministic divergence source where pre-trim could think the request fits while the actual compact request exceeded context because its instructions were longer/different. ## Scope In scope: - estimator/payload base-instructions parity in remote compaction Out of scope: - retry-on-`context_length_exceeded` - compaction threshold/headroom policy changes - broader trimming policy changes ## Codex author: `codex fork 019c2b24-c2df-7b31-a482-fb8cf7a28559`	2026-02-04 23:24:06 -08:00
pakrym-oai	0e8d359da9	Session-level model client (#10664 ) Make ModelClient a session-scoped object. Move state that is session level onto the client, and make state that is per-turn explicit on corresponding methods. Stop taking a huge Config object, instead only pass in values that are actually needed. --------- Co-authored-by: Josh McKinney <joshka@openai.com>	2026-02-04 16:58:48 -08:00
pakrym-oai	7f20357611	Stop client from being state carrier (#10595 ) I'd like to make client session wide. This requires shedding all random state it has to carry.	2026-02-04 09:05:37 -08:00
pakrym-oai	cbfd2a37cc	Trim compaction input (#10374 ) Two fixes: 1. Include trailing tool output in the total context size calculation. Otherwise when checking whether compaction should run we ignore newly added outputs. 2. Trim trailing tool output/tool calls until we can fit the request into the model context size. Otherwise the compaction endpoint will fail to compact. We only trim items that can be reproduced again by the model (tool calls, tool call outputs).	2026-02-02 19:03:11 -08:00
pakrym-oai	03fcd12e77	Do not append items on override turn context (#10354 )	2026-02-01 18:51:26 -08:00
Charley Cunningham	ec4a2d07e4	Plan mode: stream proposed plans, emit plan items, and render in TUI (#9786 ) ## Summary - Stream proposed plans in Plan Mode using `<proposed_plan>` tags parsed in core, emitting plan deltas plus a plan `ThreadItem`, while stripping tags from normal assistant output. - Persist plan items and rebuild them on resume so proposed plans show in thread history. - Wire plan items/deltas through app-server protocol v2 and render a dedicated proposed-plan view in the TUI, including the “Implement this plan?” prompt only when a plan item is present. ## Changes ### Core (`codex-rs/core`) - Added a generic, line-based tag parser that buffers each line until it can disprove a tag prefix; implements auto-close on `finish()` for unterminated tags. `codex-rs/core/src/tagged_block_parser.rs` - Refactored proposed plan parsing to wrap the generic parser. `codex-rs/core/src/proposed_plan_parser.rs` - In plan mode, stream assistant deltas as: - Normal text → `AgentMessageContentDelta` - Plan text → `PlanDelta` + `TurnItem::Plan` start/completion (`codex-rs/core/src/codex.rs`) - Final plan item content is derived from the completed assistant message (authoritative), not necessarily the concatenated deltas. - Strips `<proposed_plan>` blocks from assistant text in plan mode so tags don’t appear in normal messages. (`codex-rs/core/src/stream_events_utils.rs`) - Persist `ItemCompleted` events only for plan items for rollout replay. (`codex-rs/core/src/rollout/policy.rs`) - Guard `update_plan` tool in Plan Mode with a clear error message. (`codex-rs/core/src/tools/handlers/plan.rs`) - Updated Plan Mode prompt to: - keep `<proposed_plan>` out of non-final reasoning/preambles - require exact tag formatting - allow only one `<proposed_plan>` block per turn (`codex-rs/core/templates/collaboration_mode/plan.md`) ### Protocol / App-server protocol - Added `TurnItem::Plan` and `PlanDeltaEvent` to core protocol items. (`codex-rs/protocol/src/items.rs`, `codex-rs/protocol/src/protocol.rs`) - Added v2 `ThreadItem::Plan` and `PlanDeltaNotification` with EXPERIMENTAL markers and note that deltas may not match the final plan item. (`codex-rs/app-server-protocol/src/protocol/v2.rs`) - Added plan delta route in app-server protocol common mapping. (`codex-rs/app-server-protocol/src/protocol/common.rs`) - Rebuild plan items from persisted `ItemCompleted` events on resume. (`codex-rs/app-server-protocol/src/protocol/thread_history.rs`) ### App-server - Forward plan deltas to v2 clients and map core plan items to v2 plan items. (`codex-rs/app-server/src/bespoke_event_handling.rs`, `codex-rs/app-server/src/codex_message_processor.rs`) - Added v2 plan item tests. (`codex-rs/app-server/tests/suite/v2/plan_item.rs`) ### TUI - Added a dedicated proposed plan history cell with special background and padding, and moved “• Proposed Plan” outside the highlighted block. (`codex-rs/tui/src/history_cell.rs`, `codex-rs/tui/src/style.rs`) - Only show “Implement this plan?” when a plan item exists. (`codex-rs/tui/src/chatwidget.rs`, `codex-rs/tui/src/chatwidget/tests.rs`) <img width="831" height="847" alt="Screenshot 2026-01-29 at 7 06 24 PM" src="https://github.com/user-attachments/assets/69794c8c-f96b-4d36-92ef-c1f5c3a8f286" /> ### Docs / Misc - Updated protocol docs to mention plan deltas. (`codex-rs/docs/protocol_v1.md`) - Minor plumbing updates in exec/debug clients to tolerate plan deltas. (`codex-rs/debug-client/src/reader.rs`, `codex-rs/exec/...`) ## Tests - Added core integration tests: - Plan mode strips plan from agent messages. - Missing `</proposed_plan>` closes at end-of-message. (`codex-rs/core/tests/suite/items.rs`) - Added unit tests for generic tag parser (prefix buffering, non-tag lines, auto-close). (`codex-rs/core/src/tagged_block_parser.rs`) - Existing app-server plan item tests in v2. (`codex-rs/app-server/tests/suite/v2/plan_item.rs`) ## Notes / Behavior - Plan output no longer appears in standard assistant text in Plan Mode; it streams via `PlanDelta` and completes as a `TurnItem::Plan`. - The final plan item content is authoritative and may diverge from streamed deltas (documented as experimental). - Reasoning summaries are not filtered; prompt instructs the model not to include `<proposed_plan>` outside the final plan message. ## Codex Author `codex fork 019bec2d-b09d-7450-b292-d7bcdddcdbfb`	2026-01-30 18:59:30 +00:00
Ahmed Ibrahim	b7edeee8ca	compaction (#10034 ) # External (non-OpenAI) Pull Request Requirements Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed: https://github.com/openai/codex/blob/main/docs/contributing.md If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes. Include a link to a bug report or enhancement request.	2026-01-28 11:36:11 -08:00
Dylan Hurd	8b3521ee77	feat(core) update Personality on turn (#9644 ) ## Summary Support updating Personality mid-Thread via UserTurn/OverwriteTurn. This is explicitly unused by the clients so far, to simplify PRs - app-server and tui implementations will be follow-ups. ## Testing - [x] added integration tests	2026-01-22 12:04:23 -08:00
Dylan Hurd	714151eb4e	feat(personality) introduce model_personality config (#9459 ) ## Summary Introduces the concept of a config model_personality. I would consider this an MVP for testing out the feature. There are a number of follow-ups to this PR: - More sophisticated templating with validation - In-product experience to manage this ## Testing - [x] Testing locally	2026-01-20 11:06:14 -08:00
Dylan Hurd	675f165c56	fix(core) Preserve base_instructions in SessionMeta (#9427 ) ## Summary This PR consolidates base_instructions onto SessionMeta / SessionConfiguration, so we ensure `base_instructions` is set once per session and should be (mostly) immutable, unless: - overridden by config on resume / fork - sub-agent tasks, like review or collab In a future PR, we should convert all references to `base_instructions` to consistently used the typed struct, so it's less likely that we put other strings there. See #9423. However, this PR is already quite complex, so I'm deferring that to a follow-up. ## Testing - [x] Added a resume test to assert that instructions are preserved. In particular, `resume_switches_models_preserves_base_instructions` fails against main. Existing test coverage thats assert base instructions are preserved across multiple requests in a session: - Manual compact keeps baseline instructions: core/tests/suite/compact.rs:199 - Auto-compact keeps baseline instructions: core/tests/suite/compact.rs:1142 - Prompt caching reuses the same instructions across two requests: core/tests/suite/prompt_caching.rs:150 and core/tests/suite/prompt_caching.rs:157 - Prompt caching with explicit expected string across two requests: core/tests/suite/prompt_caching.rs:213 and core/tests/suite/prompt_caching.rs:222 - Resume with model switch keeps original instructions: core/tests/suite/resume.rs:136 - Compact/resume/fork uses request 0 instructions for later expected payloads: core/tests/suite/compact_resume_fork.rs:215	2026-01-19 21:59:36 -08:00
jif-oai	1aed01e99f	renaming: task to turn (#8963 )	2026-01-09 17:31:17 +00:00
jif-oai	69898e3dba	clean: all history cloning (#8916 )	2026-01-08 18:17:18 +00:00
jif-oai	9ba27cfa0a	feat: add compaction event (#7289 )	2025-11-25 16:12:14 +00:00
Celia Chen	9bce050385	[app-server & core] introduce new codex error code and v2 app-server error events (#6938 ) This PR does two things: 1. populate a new `codex_error_code` protocol in error events sent from core to client; 2. old v1 core events `codex/event/stream_error` and `codex/event/error` will now both become `error`. We also show codex error code for turncompleted -> error status. new events in app server test: ``` < { < "method": "codex/event/stream_error", < "params": { < "conversationId": "019aa34c-0c14-70e0-9706-98520a760d67", < "id": "0", < "msg": { < "codex_error_code": { < "response_stream_disconnected": { < "http_status_code": 401 < } < }, < "message": "Reconnecting... 2/5", < "type": "stream_error" < } < } < } { < "method": "error", < "params": { < "error": { < "codexErrorCode": { < "responseStreamDisconnected": { < "httpStatusCode": 401 < } < }, < "message": "Reconnecting... 2/5" < } < } < } < { < "method": "turn/completed", < "params": { < "turn": { < "error": { < "codexErrorCode": { < "responseTooManyFailedAttempts": { < "httpStatusCode": 401 < } < }, < "message": "exceeded retry limit, last status: 401 Unauthorized, request id: 9a1b495a1a97ed3e-SJC" < }, < "id": "0", < "items": [], < "status": "failed" < } < } < } ```	2025-11-20 23:06:55 +00:00
Celia Chen	72a1453ac5	Revert "[core] add optional status_code to error events (#6865 )" (#6955 ) This reverts commit `c2ec477d93`. # External (non-OpenAI) Pull Request Requirements Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed: https://github.com/openai/codex/blob/main/docs/contributing.md If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes. Include a link to a bug report or enhancement request.	2025-11-20 01:26:14 +00:00
Celia Chen	c2ec477d93	[core] add optional status_code to error events (#6865 ) We want to better uncover error status code for clients. Add an optional status_code to error events (thread error, error, stream error) so app server could uncover the status code from the client side later. in event log: ``` < { < "method": "codex/event/stream_error", < "params": { < "conversationId": "019a9a32-f576-7292-9711-8e57e8063536", < "id": "0", < "msg": { < "message": "Reconnecting... 5/5", < "status_code": 401, < "type": "stream_error" < } < } < } < { < "method": "codex/event/error", < "params": { < "conversationId": "019a9a32-f576-7292-9711-8e57e8063536", < "id": "0", < "msg": { < "message": "exceeded retry limit, last status: 401 Unauthorized, request id: 9a0cb03a485067f7-SJC", < "status_code": 401, < "type": "error" < } < } < } ```	2025-11-19 19:51:21 +00:00
jif-oai	3e9e1d993d	chore: consolidate compaction token usage (#6894 )	2025-11-19 11:26:01 +00:00
pakrym-oai	75f38f16dd	Run remote auto compaction (#6879 )	2025-11-19 00:43:58 -08:00
jif-oai	c56d0c159b	fix: local compaction (#6844 )	2025-11-18 22:18:10 +00:00
Ahmed Ibrahim	3de8790714	Add the utility to truncate by tokens (#6746 ) - This PR is to make it on path for truncating by tokens. This path will be initially used by unified exec and context manager (responsible for MCP calls mainly). - We are exposing new config `calls_output_max_tokens` - Use `tokens` as the main budget unit but truncate based on the model family by Introducing `TruncationPolicy`. - Introduce `truncate_text` as a router for truncation based on the mode. In next PRs: - remove truncate_with_line_bytes_budget - Add the ability to the model to override the token budget.	2025-11-18 11:36:23 -08:00
jif-oai	838531d3e4	feat: remote compaction (#6795 ) Co-authored-by: pakrym-oai <pakrym@openai.com>	2025-11-18 16:51:16 +00:00

48 Commits