codex

mirror of https://github.com/openai/codex.git synced 2026-05-27 22:44:23 +00:00

Author	SHA1	Message	Date
Eric Traut	96836e15ed	Improve goal continuation based on feedback (#22045 ) ## Summary This PR updates the goal continuation prompt to address feedback from early adopters. There are two primary changes: 1. Goal continuation and budget-limit steering prompts now use hidden user-context messages instead of hidden developer messages. 2. The goal continuation prompt is refined to improve the model's ability to fully complete the active goal rather than stop at a smaller or merely passing subset. The user-message transition is important for two reasons. First, it eliminates an issue where older steering messages could be responded to again after a new turn. Second, it works better with compaction because user messages are treated differently from developer messages during compaction. The prompt refinements make persistence explicit, ground work in current evidence, encourage `update_plan` for multi-step progress visibility, and require stronger completion audits before calling `update_goal`. It also removes the elapsed-time reporting in the prompt; I saw evidence that this was causing the model to shortcut work as it became nervous about time. These changes were tested with evals. Chriss4123 has also been running independent evals in [#19910](https://github.com/openai/codex/issues/19910), and many of the improvements in this PR were suggested by him. ## Verification - Tested with evals. - Added and updated focused `codex-core` coverage for hidden goal user context, continuation and budget-limit request shape, prompt rendering, and objective delimiter escaping.	2026-05-11 09:51:21 -07:00
sayan-oai	77d9223e9f	[codex] compact network context rendering (#21875 ) ## Why The model-visible `<network>` context currently repeats indentation and a pair of XML tags for every allowed or denied domain. Large domain sets spend a surprising amount of prompt budget on that scaffolding instead of the actual policy values. ## What changed - Render allowed domains as one comma-separated `<allowed>` value instead of one element per domain. - Render denied domains the same way. - Keep the full allow/deny domain sets model-visible while updating the serialization and settings-update coverage for the denser shape. ## Example Before: ```xml <network enabled="true"> <allowed>api.example.test</allowed> <allowed>cdn.example.test</allowed> <denied>blocked.example.test</denied> </network> ``` After: ```xml <network enabled="true"><allowed>api.example.test,cdn.example.test</allowed><denied>blocked.example.test</denied></network> ``` ## Validation - `cargo test -p codex-core environment_context` - `cargo test -p codex-core build_settings_update_items_emits_environment_item_for_network_changes` - Ran a local `codex` session with a real network context containing 121 allowed domains and 42 denied domains, then inspected the raw prompt with `raw_token_viewer_cli.py`. With the same domain set, the rendered `<network>` section shrank from 7,175 characters across 161 lines to 3,666 characters on one line, and the containing environment-context block fell from 6,428 tokens to 5,379 tokens.	2026-05-09 03:52:48 +00:00
starr-openai	63a27ad6c6	Avoid hard-coded environment context shell (#21390 ) ## Summary - make resolved turn environment shell metadata optional instead of hard-coding bash - render environment context shells from explicit environment metadata when present, falling back to the existing session shell - update environment context tests for inherited PowerShell-style fallback and explicit per-environment shell override ## Testing - Not run (not requested; formatted with `just fmt`). Co-authored-by: Codex <noreply@openai.com>	2026-05-06 19:54:26 +00:00
starr-openai	905987c08f	Prepare selected environment plumbing (#20669 ) ## Why This is a prep PR in the multi-environment process-tool stack. It separates ownership/config cleanup from the behavior change that teaches process tools to route by selected environment, so the follow-up PR can focus on model-facing `environment_id` behavior. ## Stack 1. https://github.com/openai/codex/pull/20646 - `EnvironmentContext` rendering for selected environments 2. https://github.com/openai/codex/pull/20669 - selected-environment ownership and tool config prep (this PR) 3. https://github.com/openai/codex/pull/20647 - process-tool `environment_id` routing ## What Changed - keep the resolved turn environment list wrapped in `ResolvedTurnEnvironments` through `TurnContext` instead of unwrapping it back to a raw `Vec` - add `TurnContext::resolve_path_against` so cwd-relative path resolution has one shared helper - replace the old tool config boolean with `ToolEnvironmentMode::{None, Single, Multiple}` ## Testing - Tests not run locally; this prep refactor is covered by GitHub CI for the stack. Co-authored-by: Codex <noreply@openai.com>	2026-05-04 17:55:49 +00:00
starr-openai	2952beb009	Surface multi-environment choices in environment context (#20646 ) ## Why The model needs a way to see which environments are available during a multi-environment turn without changing the legacy single-environment prompt surface or pulling replay/persistence changes into the same review. ## Stack 1. https://github.com/openai/codex/pull/20646 - `EnvironmentContext` rendering for selected environments (this PR) 2. https://github.com/openai/codex/pull/20669 - selected-environment ownership and tool config prep 3. https://github.com/openai/codex/pull/20647 - process-tool `environment_id` routing ## What Changed - extend `environment_context` so multi-environment turns render an `<environments>` block with the selected environment ids and cwd values - keep zero- and single-environment turns on the existing cwd-only render path - keep replay and persistence paths on the legacy surface for now so this PR stays scoped to live prompt rendering - add focused coverage in `codex-rs/core/src/context/environment_context_tests.rs` ## Testing - CI --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-01 22:11:06 +00:00
jif-oai	431ebeaef7	feat: split memories part 2 (#19860) Keep extracting memories out of core and move the write trigger into the app-server. This is temporary; the trigger should move to the client level in a follow-up. This makes core fully independent of `codex-memories-write`. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-28 13:03:28 +02:00
Andrey Mishchenko	35bc6e3d01	Delete unused ResponseItem::Message.end_turn (#19605) This field is unused. Delete it.	2026-04-26 17:18:09 -07:00
Michael Bolin	4d7ce3447d	permissions: make runtime config profile-backed (#19606 ) ## Why This supersedes #19391. During stack repair, GitHub marked #19391 as merged into a temporary stack branch rather than into `main`, so the runtime-config change needed a fresh PR. `PermissionProfile` is now the canonical permissions shape after #19231 because it can distinguish `Managed`, `Disabled`, and `External` enforcement while also carrying filesystem rules that legacy `SandboxPolicy` cannot represent cleanly. Core config and session state still needed to accept profile-backed permissions without forcing every profile through the strict legacy bridge, which rejected valid runtime profiles such as direct write roots. The unrelated CI/test hardening that previously rode along with this PR has been split into #19683 so this PR stays focused on the permissions model migration. ## What Changed - Adds `Permissions.permission_profile` and `SessionConfiguration.permission_profile` as constrained runtime state, while keeping `sandbox_policy` as a legacy compatibility projection. - Introduces profile setters that keep `PermissionProfile`, split filesystem/network policies, and legacy `SandboxPolicy` projections synchronized. - Uses a compatibility projection for requirement checks and legacy consumers instead of rejecting profiles that cannot round-trip through `SandboxPolicy` exactly. - Updates config loading, config overrides, session updates, turn context plumbing, prompt permission text, sandbox tags, and exec request construction to carry profile-backed runtime permissions. - Preserves configured deny-read entries and `glob_scan_max_depth` when command/session profiles are narrowed. - Adds `PermissionProfile::read_only()` and `PermissionProfile::workspace_write()` presets that match legacy defaults. 
## Verification - `cargo test -p codex-core direct_write_roots` - `cargo test -p codex-core runtime_roots_to_legacy_projection` - `cargo test -p codex-app-server requested_permissions_trust_project_uses_permission_profile_intent` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19606). * #19395 * #19394 * #19393 * #19392 * __->__ #19606	2026-04-26 13:29:54 -07:00
Michael Bolin	789f387982	permissions: remove legacy read-only access modes (#19449 ) ## Why `ReadOnlyAccess` was a transitional legacy shape on `SandboxPolicy`: `FullAccess` meant the historical read-only/workspace-write modes could read the full filesystem, while `Restricted` tried to carry partial readable roots. The partial-read model now belongs in `FileSystemSandboxPolicy` and `PermissionProfile`, so keeping it on `SandboxPolicy` makes every legacy projection reintroduce lossy read-root bookkeeping and creates unnecessary noise in the rest of the permissions migration. This PR makes the legacy policy model narrower and explicit: `SandboxPolicy::ReadOnly` and `SandboxPolicy::WorkspaceWrite` represent the old full-read sandbox modes only. Split readable roots, deny-read globs, and platform-default/minimal read behavior stay in the runtime permissions model. ## What changed - Removes `ReadOnlyAccess` from `codex_protocol::protocol::SandboxPolicy`, including the generated `access` and `readOnlyAccess` API fields. - Updates legacy policy/profile conversions so restricted filesystem reads are represented only by `FileSystemSandboxPolicy` / `PermissionProfile` entries. - Keeps app-server v2 compatible with legacy `fullAccess` read-access payloads by accepting and ignoring that no-op shape, while rejecting legacy `restricted` read-access payloads instead of silently widening them to full-read legacy policies. - Carries Windows sandbox platform-default read behavior with an explicit override flag instead of depending on `ReadOnlyAccess::Restricted`. - Refreshes generated app-server schema/types and updates tests/docs for the simplified legacy policy shape. ## Verification - `cargo check -p codex-app-server-protocol --tests` - `cargo check -p codex-windows-sandbox --tests` - `cargo test -p codex-app-server-protocol sandbox_policy_` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). 
Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19449). * #19395 * #19394 * #19393 * #19392 * #19391 * __->__ #19449	2026-04-24 17:16:58 -07:00
xl-openai	1e560f33e1	feat: Compress skill paths with root aliases (#19098) Add skill root tracking so model-visible skill lists can use short path aliases when absolute paths would exceed the metadata budget.	2026-04-24 15:49:07 -07:00
jif-oai	120aa07d81	Make MultiAgentV2 interruption markers assistant-authored (#19124 ) ## Why `MultiAgentV2` follow-up messages are delivered to agents as assistant-authored `InterAgentCommunication` envelopes. When `followup_task` used `interrupt: true`, the interrupted-turn guidance was still persisted as a contextual user message, so model-visible history made a system-generated interruption boundary look user-authored. This keeps interruption guidance consistent with the rest of the v2 inter-agent message stream while preserving the legacy marker shape for non-v2 sessions. ## What changed - Make `interrupted_turn_history_marker` feature-aware. - Record the interrupted-turn marker as an assistant `OutputText` message when `Feature::MultiAgentV2` is enabled. - Keep the existing user contextual fragment for non-v2 sessions. - Apply the same feature-aware marker to interrupted fork snapshots. - Add coverage for the live `followup_task` interrupt path and the helper-level v2 marker shape. ## Testing - `cargo test -p codex-core multi_agent_v2_followup_task_interrupts_busy_child_without_losing_message -- --nocapture` - `cargo test -p codex-core multi_agent_v2_interrupted_marker_uses_assistant_output_message -- --nocapture` - `cargo test -p codex-core interrupted_fork_snapshot -- --nocapture`	2026-04-24 13:39:26 +02:00
jif-oai	a2f868c9d6	feat: drop spawned-agent context instructions (#19127 ) ## Why MultiAgentV2 children should not receive an extra model-visible developer fragment just because they were spawned. The parent/configured developer instructions should carry through normally, but the dedicated `<spawned_agent_context>` block is no longer desired. ## What changed - Removed the `SpawnAgentInstructions` context fragment and its `<spawned_agent_context>` wrapper. - Stopped appending spawned-agent instructions in `codex-rs/core/src/tools/handlers/multi_agents_v2/spawn.rs`. - Updated subagent notification coverage to assert inherited parent developer instructions without expecting the spawned-agent wrapper. ## Verification - `cargo test -p codex-core --test all spawned_multi_agent_v2_child_inherits_parent_developer_context -- --nocapture` - `cargo test -p codex-core --test all skills_toggle_skips_instructions_for_parent_and_spawned_child -- --nocapture` - `cargo test -p codex-core --test all subagent_notifications -- --nocapture`	2026-04-23 18:54:45 +02:00
Won Park	83ec1eb5d6	Rename approvals reviewer variant to auto-review (#19056 ) ## Why `approvals_reviewer` now uses `auto_review` as the canonical config/API value after #18504, but the Rust enum variant and nearby helper/test names still used `GuardianSubagent` / guardian approval wording. That made follow-up code and reviews confusing even though the external value had already moved to Auto-review. ## What changed - Renamed `ApprovalsReviewer::GuardianSubagent` to `ApprovalsReviewer::AutoReview`. - Updated protocol, app-server, config, core, TUI, exec, and analytics test callsites. - Renamed nearby helper/test names from guardian approval wording to Auto-review wording where they refer to the approvals reviewer mode. - Preserved wire compatibility: - `auto_review` remains the canonical serialized value. - `guardian_subagent` remains accepted as a legacy alias. This intentionally does not rename the `[features].guardian_approval` key, `Feature::GuardianApproval`, `core/src/guardian`, analytics event names, or app-server Guardian review event types. ## Verification - `cargo test -p codex-protocol approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent` - `cargo test -p codex-app-server-protocol approvals_reviewer_serializes_auto_review_and_accepts_legacy_guardian_subagent` - `cargo test -p codex-config approvals_reviewer` - `cargo test -p codex-tui update_feature_flags` - `cargo test -p codex-core permissions_instructions` - `cargo test -p codex-tui permissions_selection`	2026-04-22 17:22:35 -07:00
pakrym-oai	2a226096f6	Split DeveloperInstructions into individual fragments. (#18813)	2026-04-21 10:22:36 -07:00
pakrym-oai	4c2e730488	Organize context fragments (#18794) Organize context fragments under `core/context`. Implement the same trait on all of them.	2026-04-20 22:39:17 -07:00
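
Commit 77d9223e9f in the log above changes the `<network>` context so that each domain set is rendered as one comma-separated value rather than one element per domain. The following is a minimal sketch of that serialization shape, not the code from the PR: the function name `render_network`, its signature, and the choice to omit an empty `<allowed>` or `<denied>` element are all assumptions made for illustration.

```rust
/// Hypothetical helper (not taken from the codex source): renders a
/// `<network>` block in the compact, single-line shape shown in the
/// "After" example of the commit message.
fn render_network(enabled: bool, allowed: &[&str], denied: &[&str]) -> String {
    let mut out = format!("<network enabled=\"{enabled}\">");
    // Assumption: an empty domain set produces no element at all.
    if !allowed.is_empty() {
        // One element holding every allowed domain, comma-separated.
        out.push_str(&format!("<allowed>{}</allowed>", allowed.join(",")));
    }
    if !denied.is_empty() {
        // Denied domains use the same dense shape.
        out.push_str(&format!("<denied>{}</denied>", denied.join(",")));
    }
    out.push_str("</network>");
    out
}
```

Called with the domains from the commit's own example, `render_network(true, &["api.example.test", "cdn.example.test"], &["blocked.example.test"])` yields the one-line "After" form. The saving comes from paying for the tag pair once per set instead of once per domain.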

15 Commits
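
Commit 63a27ad6c6 in the log above makes per-environment shell metadata optional and falls back to the session shell when it is absent. A rough sketch of that fallback rule follows. The `TurnEnvironment` struct, its `shell` field, and the `shell_for_context` function are hypothetical names chosen for this example and are not the types used in `codex-core`.

```rust
/// Hypothetical stand-in for a resolved turn environment.
/// The field name is an assumption, not the real codex type.
struct TurnEnvironment {
    /// `None` means the environment carries no explicit shell metadata.
    shell: Option<String>,
}

/// Picks the shell name to render for one environment: the explicit
/// per-environment value when present, otherwise the session shell.
fn shell_for_context<'a>(env: &'a TurnEnvironment, session_shell: &'a str) -> &'a str {
    env.shell.as_deref().unwrap_or(session_shell)
}
```

With a PowerShell session, an environment whose `shell` is `None` inherits `"powershell"`, while one with `Some("zsh")` renders `"zsh"`. Modeling the metadata as an `Option` keeps the "not specified" case distinct from any particular default such as bash.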