codex

mirror of https://github.com/openai/codex.git synced 2026-04-26 15:45:02 +00:00

Author	SHA1	Message	Date
sayan-oai	060a320e7d	fix: show user warning when using default fallback metadata (#11690 ) ### What It's currently unclear when the harness falls back to the default, generic `ModelInfo`. This happens when the `remote_models` feature is disabled or the model is truly unknown, and can lead to bad performance and issues in the harness. Add a user-facing warning when this happens so they are aware when their setup is broken. ### Tests Added tests, tested locally.	2026-02-15 18:46:05 -08:00
Charley Cunningham	f24669d444	Persist complete TurnContextItem state via canonical conversion (#11656 ) ## Summary This PR delivers the first small, shippable step toward model-visible state diffing by making `TurnContextItem` more complete and standardizing how it is built. Specifically, it: - Adds persisted network context to `TurnContextItem`. - Introduces a single canonical `TurnContext -> TurnContextItem` conversion path. - Routes existing rollout write sites through that canonical conversion helper. No context injection/diff behavior changes are included in this PR. ## Why this change The design goal is to make `TurnContextItem` the canonical source of truth for context-diff decisions. Before this PR: - `TurnContextItem` did not include all TurnContext-derived environment inputs needed for v1 completeness. - Construction was duplicated at multiple write sites. This PR addresses both with a minimal, reviewable change. ## Changes ### 1) Extend `TurnContextItem` with network state - Added `TurnContextNetworkItem { allowed_domains, denied_domains }`. - Added `network: Option<TurnContextNetworkItem>` to `TurnContextItem`. - Kept backward compatibility by making the new field optional and skipped when absent. Files: - `codex-rs/protocol/src/protocol.rs` ### 2) Canonical conversion helper - Added `TurnContext::to_turn_context_item(collaboration_mode)` in core. - Added internal helper to derive network fields from `config_layer_stack.requirements().network`. Files: - `codex-rs/core/src/codex.rs` ### 3) Use canonical conversion at rollout write sites - Replaced ad hoc `TurnContextItem { ... }` construction with `to_turn_context_item(...)` in: - sampling request path - compaction path Files: - `codex-rs/core/src/codex.rs` - `codex-rs/core/src/compact.rs` ### 4) Update fixtures/tests for new optional field - Updated existing `TurnContextItem` literals in tests to include `network: None`. - Added protocol tests for: - deserializing old payloads with no `network` - serializing when `network` is present Files: - `codex-rs/core/tests/suite/resume_warning.rs` - No replay/diff logic changes. - Persisted rollout `TurnContextItem` now carries additional network context when available. - Older rollout lines without `network` remain readable.	2026-02-12 17:22:44 -08:00
Michael Bolin	a4cc1a4a85	feat: introduce Permissions (#11633 ) ## Why We currently carry multiple permission-related concepts directly on `Config` for shell/unified-exec behavior (`approval_policy`, `sandbox_policy`, `network`, `shell_environment_policy`, `windows_sandbox_mode`). Consolidating these into one in-memory struct makes permission handling easier to reason about and sets up the next step: supporting named permission profiles (`[permissions.PROFILE_NAME]`) without changing behavior now. This change is mostly mechanical: it updates existing callsites to go through `config.permissions`, but it does not yet refactor those callsites to take a single `Permissions` value in places where multiple permission fields are still threaded separately. This PR intentionally does not change the on-disk `config.toml` format yet and keeps compatibility with legacy config keys. ## What Changed - Introduced `Permissions` in `core/src/config/mod.rs`. - Added `Config::permissions` and moved effective runtime permission fields under it: - `approval_policy` - `sandbox_policy` - `network` - `shell_environment_policy` - `windows_sandbox_mode` - Updated config loading/building so these effective values are still derived from the same existing config inputs and constraints. - Updated Windows sandbox helpers/resolution to read/write via `permissions`. - Threaded the new field through all permission consumers across core runtime, app-server, CLI/exec, TUI, and sandbox summary code. - Updated affected tests to reference `config.permissions.*`. - Renamed the struct/field from `EffectivePermissions`/`effective_permissions` to `Permissions`/`permissions` and aligned variable naming accordingly. ## Verification - `just fix -p codex-core -p codex-tui -p codex-cli -p codex-app-server -p codex-exec -p codex-utils-sandbox-summary` - `cargo build -p codex-core -p codex-tui -p codex-cli -p codex-app-server -p codex-exec -p codex-utils-sandbox-summary`	2026-02-12 14:42:54 -08:00
Owen Lin	efc8d45750	feat(app-server): experimental flag to persist extended history (#11227 ) This PR adds an experimental `persist_extended_history` bool flag to app-server thread APIs so rollout logs can retain a richer set of EventMsgs for non-lossy Thread > Turn > ThreadItems reconstruction (i.e. on `thread/resume`). ### Motivation Today, our rollout recorder only persists a small subset (e.g. user message, reasoning, assistant message) of `EventMsg` types, dropping a good number (like command exec, file change, etc.) that are important for reconstructing full item history for `thread/resume`, `thread/read`, and `thread/fork`. Some clients want to be able to resume a thread without lossiness. This lossiness is primarily a UI thing, since what the model sees are `ResponseItem` and not `EventMsg`. ### Approach This change introduces an opt-in `persist_full_history` flag to preserve those events when you start/resume/fork a thread (defaults to `false`). This is done by adding an `EventPersistenceMode` to the rollout recorder: - `Limited` (existing behavior, default) - `Extended` (new opt-in behavior) In `Extended` mode, persist additional `EventMsg` variants needed for non-lossy app-server `ThreadItem` reconstruction. We now store the following ThreadItems that we didn't before: - web search - command execution - patch/file changes - MCP tool calls - image view calls - collab tool outcomes - context compaction - review mode enter/exit For command executions in particular, we truncate the output using the existing `truncate_text` from core to store an upper bound of 10,000 bytes, which is also the default value for truncating tool outputs shown to the model. This keeps the size of the rollout file and command execution items returned over the wire reasonable. And we also persist `EventMsg::Error` which we can now map back to the Turn's status and populates the Turn's error metadata. #### Updates to EventMsgs To truly make `thread/resume` non-lossy, we also needed to persist the `status` on `EventMsg::CommandExecutionEndEvent` and `EventMsg::PatchApplyEndEvent`. Previously it was not obvious whether a command failed or was declined (similar for apply_patch). These EventMsgs were never persisted before so I made it a required field.	2026-02-12 19:34:22 +00:00
Michael Bolin	476c1a7160	Remove `test-support` feature from `codex-core` and replace it with explicit test toggles (#11405 ) ## Why `codex-core` was being built in multiple feature-resolved permutations because test-only behavior was modeled as crate features. For a large crate, those permutations increase compile cost and reduce cache reuse. ## Net Change - Removed the `test-support` crate feature and related feature wiring so `codex-core` no longer needs separate feature shapes for test consumers. - Standardized cross-crate test-only access behind `codex_core::test_support`. - External test code now imports helpers from `codex_core::test_support`. - Underlying implementation hooks are kept internal (`pub(crate)`) instead of broadly public. ## Outcome - Fewer `codex-core` build permutations. - Better incremental cache reuse across test targets. - No intended production behavior change.	2026-02-10 22:44:02 -08:00
Celia Chen	641d5268fa	chore: persist turn_id in rollout session and make turn_id uuid based (#11246 ) Problem: 1. turn id is constructed in-memory; 2. on resuming threads, turn_id might not be unique; 3. client cannot no the boundary of a turn from rollout files easily. This PR does three things: 1. persist `task_started` and `task_complete` events; 1. persist `turn_id` in rollout turn events; 5. generate turn_id as unique uuids instead of incrementing it in memory. This helps us resolve the issue of clients wanting to have unique turn ids for resuming a thread, and knowing the boundry of each turn in rollout files. example debug logs ``` 2026-02-11T00:32:10.746876Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=8 turn=Turn { id: "019c4a07-d809-74c3-bc4b-fd9618487b4b", items: [UserMessage { id: "item-24", content: [Text { text: "hi", text_elements: [] }] }, AgentMessage { id: "item-25", text: "Hi. I’m in the workspace with your current changes loaded and ready. Send the next task and I’ll execute it end-to-end." }], status: Completed, error: None } 2026-02-11T00:32:10.746888Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=9 turn=Turn { id: "019c4a18-1004-76c0-a0fb-a77610f6a9b8", items: [UserMessage { id: "item-26", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-27", text: "Hello. Ready for the next change in `codex-rs`; I can continue from the current in-progress diff or start a new task." }], status: Completed, error: None } 2026-02-11T00:32:10.746899Z DEBUG codex_app_server_protocol::protocol::thread_history: built turn from rollout items turn_index=10 turn=Turn { id: "019c4a19-41f0-7db0-ad78-74f1503baeb8", items: [UserMessage { id: "item-28", content: [Text { text: "hello", text_elements: [] }] }, AgentMessage { id: "item-29", text: "Hello. Send the specific change you want in `codex-rs`, and I’ll implement it and run the required checks." }], status: Completed, error: None } ``` backward compatibility: if you try to resume an old session without task_started and task_complete event populated, the following happens: - If you resume and do nothing: those reconstructed historical IDs can differ next time you resume. - If you resume and send a new turn: the new turn gets a fresh UUID from live submission flow and is persisted, so that new turn’s ID is stable on later resumes. I think this behavior is fine, because we only care about deterministic turn id once a turn is triggered.	2026-02-11 03:56:01 +00:00
Dylan Hurd	8b3521ee77	feat(core) update Personality on turn (#9644 ) ## Summary Support updating Personality mid-Thread via UserTurn/OverwriteTurn. This is explicitly unused by the clients so far, to simplify PRs - app-server and tui implementations will be follow-ups. ## Testing - [x] added integration tests	2026-01-22 12:04:23 -08:00
charley-oai	fe641f759f	Add collaboration_mode to TurnContextItem (#9583 ) ## Summary - add optional `collaboration_mode` to `TurnContextItem` in rollouts - persist the current collaboration mode when recording turn context (sampling + compaction) ## Rationale We already persist turn context data for resume logic. Capturing collaboration mode in the rollout gives us the mode context for each turn, enabling follow‑up work to diff mode instructions correctly on resume. ## Changes - protocol: add optional `collaboration_mode` field to `TurnContextItem` - core: persist collaboration mode alongside other turn context settings in rollouts	2026-01-21 14:14:21 -08:00
Dylan Hurd	675f165c56	fix(core) Preserve base_instructions in SessionMeta (#9427 ) ## Summary This PR consolidates base_instructions onto SessionMeta / SessionConfiguration, so we ensure `base_instructions` is set once per session and should be (mostly) immutable, unless: - overridden by config on resume / fork - sub-agent tasks, like review or collab In a future PR, we should convert all references to `base_instructions` to consistently used the typed struct, so it's less likely that we put other strings there. See #9423. However, this PR is already quite complex, so I'm deferring that to a follow-up. ## Testing - [x] Added a resume test to assert that instructions are preserved. In particular, `resume_switches_models_preserves_base_instructions` fails against main. Existing test coverage thats assert base instructions are preserved across multiple requests in a session: - Manual compact keeps baseline instructions: core/tests/suite/compact.rs:199 - Auto-compact keeps baseline instructions: core/tests/suite/compact.rs:1142 - Prompt caching reuses the same instructions across two requests: core/tests/suite/prompt_caching.rs:150 and core/tests/suite/prompt_caching.rs:157 - Prompt caching with explicit expected string across two requests: core/tests/suite/prompt_caching.rs:213 and core/tests/suite/prompt_caching.rs:222 - Resume with model switch keeps original instructions: core/tests/suite/resume.rs:136 - Compact/resume/fork uses request 0 instructions for later expected payloads: core/tests/suite/compact_resume_fork.rs:215	2026-01-19 21:59:36 -08:00
jif-oai	1aed01e99f	renaming: task to turn (#8963 )	2026-01-09 17:31:17 +00:00
jif-oai	116059c3a0	chore: unify conversation with thread name (#8830 ) Done and verified by Codex + refactor feature of RustRover	2026-01-07 17:04:53 +00:00
Anton Panasenko	cbc5fb9acf	chore: save more about turn context in rollout log file (#8458 ) ### Motivation - Persist richer per-turn configuration in rollouts so resumed/forked sessions and tooling can reason about the exact instruction inputs and output constraints used for a turn. ### Description - Extend `TurnContextItem` to include optional `base_instructions`, `user_instructions`, and `developer_instructions`. - Record the optional `final_output_json_schema` associated with a turn. - Add an optional `truncation_policy` to `TurnContextItem` and populate it when writing turn-context rollout items. - Introduce a protocol-level `TruncationPolicy` representation and convert from core truncation policy when recording. ### Testing - `cargo test -p codex-protocol` (pass)	2025-12-22 19:51:07 -08:00
Michael Bolin	dc61fc5f50	feat: support allowed_sandbox_modes in requirements.toml (#8298 ) This adds support for `allowed_sandbox_modes` in `requirements.toml` and provides legacy support for constraining sandbox modes in `managed_config.toml`. This is converted to `Constrained<SandboxPolicy>` in `ConfigRequirements` and applied to `Config` such that constraints are enforced throughout the harness. Note that, because `managed_config.toml` is deprecated, we do not add support for the new `external-sandbox` variant recently introduced in https://github.com/openai/codex/pull/8290. As noted, that variant is not supported in `config.toml` today, but can be configured programmatically via app server.	2025-12-19 21:09:20 +00:00
Michael Bolin	0a7021de72	fix: enable resume_warning that was missing from mod.rs (#8333 ) This test was introduced in https://github.com/openai/codex/pull/6507, but was not included in `mod.rs`. It does not appear that it was getting compiled?	2025-12-19 19:21:47 +00:00
Michael Bolin	3d4ced3ff5	chore: migrate from Config::load_from_base_config_with_overrides to ConfigBuilder (#8276 ) https://github.com/openai/codex/pull/8235 introduced `ConfigBuilder` and this PR updates all call non-test call sites to use it instead of `Config::load_from_base_config_with_overrides()`. This is important because `load_from_base_config_with_overrides()` uses an empty `ConfigRequirements`, which is a reasonable default for testing so the tests are not influenced by the settings on the host. This method is now guarded by `#[cfg(test)]` so it cannot be used by business logic. Because `ConfigBuilder::build()` is `async`, many of the test methods had to be migrated to be `async`, as well. On the bright side, this made it possible to eliminate a bunch of `block_on_future()` stuff.	2025-12-18 16:12:52 -08:00
gt-oai	9352c6b235	feat: Constrain values for approval_policy (#7778 ) Constrain `approval_policy` through new `admin_policy` config. This PR will: 1. Add a `admin_policy` section to config, with a single field (for now) `allowed_approval_policies`. This list constrains the set of user-settable `approval_policy`s. 2. Introduce a new `Constrained<T>` type, which combines a current value and a validator function. The validator function ensures disallowed values are not set. 3. Change the type of `approval_policy` on `Config` and `SessionConfiguration` from `AskForApproval` to `Constrained<AskForApproval>`. The validator function is set by the values passed into `allowed_approval_policies`. 4. `GenericDisplayRow`: add a `disabled_reason: Option<String>`. When set, it disables selection of the value and indicates as such in the menu. This also makes it unselectable with arrow keys or numbers. This is used in the `/approvals` menu. Follow ups are: 1. Do the same thing to `sandbox_policy`. 2. Propagate the allowed set of values through app-server for the extension (though already this should prevent app-server from setting this values, it's just that we want to disable UI elements that are unsettable). Happy to split this PR up if you prefer, into the logical numbered areas above. Especially if there are parts we want to gavel on separately (e.g. admin_policy). Disabled full access: <img width="1680" height="380" alt="image" src="https://github.com/user-attachments/assets/1fb61c8c-1fcb-4dc4-8355-2293edb52ba0" /> Disabled `--yolo` on startup: <img width="749" height="76" alt="image" src="https://github.com/user-attachments/assets/0a1211a0-6eb1-40d6-a1d7-439c41e94ddb" /> CODEX-4087	2025-12-17 16:19:27 +00:00
Ahmed Ibrahim	cb9a189857	make `model` optional in config (#7769 ) - Make Config.model optional and centralize default-selection logic in ModelsManager, including a default_model helper (with codex-auto-balanced when available) so sessions now carry an explicit chosen model separate from the base config. - Resolve `model` once in `core` and `tui` from config. Then store the state of it on other structs. - Move refreshing models to be before resolving the default model	2025-12-10 11:19:00 -08:00
jif-oai	530db0ad73	feat: warning switch model on resume (#6507 ) <img width="1259" height="40" alt="Screenshot 2025-11-11 at 14 01 41" src="https://github.com/user-attachments/assets/48ead3d2-d89c-4d8a-a578-82d9663dbd88" />	2025-11-12 11:13:37 +00:00

18 Commits