codex

mirror of https://github.com/openai/codex.git synced 2026-05-22 03:54:18 +00:00

Author	SHA1	Message	Date
jif-oai	8b25d90733	chore: code mode render truncation	2026-05-19 15:53:57 +02:00
jif-oai	b3ae3de405	Defer v1 multi-agent tools behind tool search (#23144) Summary: defer v1 multi-agent tools when `tool_search` and namespace tools are available; keep concise searchable descriptions and move the v1 usage guidance into developer instructions; add targeted coverage. Testing: not run per request; ran `just fmt`.	2026-05-19 15:04:35 +02:00
jif-oai	80fdd4688f	Add `body_after_prefix` auto-compact token limit scope (#22870 ) ## Why `model_auto_compact_token_limit` has only been able to budget the full active context. That makes it hard to set a small "growth since compaction" budget for sessions that preserve a large carried window prefix: the preserved prefix can consume the whole budget and force immediate repeated compaction. This PR adds an opt-in `body_after_prefix` scope so callers can apply `model_auto_compact_token_limit` to sampled output and later growth after the current carried prefix, while still forcing compaction before the full model context window is exhausted. ## What changed - Adds `AutoCompactTokenLimitScope` with the existing `total` behavior as the default and a new `body_after_prefix` mode: [`config_types.rs`](`973806b1cb/codex-rs/protocol/src/config_types.rs (L24-L37)`). - Threads `model_auto_compact_token_limit_scope` through config loading, `Config`, `core-api`, and app-server v2 schema/TypeScript generation. - Records the first observed input-token count for a `body_after_prefix` compaction window and uses it as the baseline when deciding whether the scoped auto-compaction budget is exhausted: [`turn.rs`](`973806b1cb/codex-rs/core/src/session/turn.rs (L743-L781)`). - Keeps a hard context-window cap in `body_after_prefix`, so scoped budgeting cannot let the active context overrun the usable window. ## Verification Added compact-suite coverage for the two key behaviors: `body_after_prefix` does not re-compact just because the carried prefix is larger than the scoped budget, and it still compacts when the total active context reaches the configured context window: [`compact.rs`](`973806b1cb/codex-rs/core/tests/suite/compact.rs (L3003-L3128)`).	2026-05-19 10:19:46 +00:00
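The budgeting rule described in this entry can be sketched as follows. This is a minimal illustration, not the code in `turn.rs`: the function name, its parameters, and the fallback used when no baseline has been recorded yet are assumptions, and only the two scope names come from the commit message.

```rust
/// Hypothetical mirror of the two scopes named in the commit message.
/// The real type lives in `config_types.rs` and may be shaped differently.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AutoCompactTokenLimitScope {
    Total,
    BodyAfterPrefix,
}

/// Returns true when auto-compaction should run (illustrative signature).
///
/// `baseline_input_tokens` stands for the first input-token count observed
/// in the current compaction window, i.e. the size of the carried prefix.
pub fn should_auto_compact(
    scope: AutoCompactTokenLimitScope,
    limit: u64,
    context_window: u64,
    baseline_input_tokens: Option<u64>,
    active_tokens: u64,
) -> bool {
    // Hard cap: never let the active context overrun the usable window,
    // regardless of which scope is selected.
    if active_tokens >= context_window {
        return true;
    }
    match scope {
        // Existing behavior: the limit budgets the full active context.
        AutoCompactTokenLimitScope::Total => active_tokens >= limit,
        // Scoped behavior: the limit budgets only growth past the baseline.
        AutoCompactTokenLimitScope::BodyAfterPrefix => {
            // Assumption: with no baseline recorded yet, growth counts as zero.
            let baseline = baseline_input_tokens.unwrap_or(active_tokens);
            active_tokens.saturating_sub(baseline) >= limit
        }
    }
}
```

With a 50,000-token carried prefix and a 10,000-token limit, the `Total` scope would compact immediately, while `BodyAfterPrefix` waits until the context has grown by 10,000 tokens or reached the window cap.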
jif-oai	05e171094d	Remove ToolsConfig from tool planning (#22835) ## Why `codex-tools` is meant to hold reusable tool primitives, but `ToolsConfig` had become a second copy of core runtime decisions instead of a small shared contract. It carried provider capabilities, auth/model gates, permission and environment state, web/search/image feature gates, multi-agent settings, and goal availability from core into `codex-tools` ([definition](`22dd9ad392/codex-rs/tools/src/tool_config.rs (L97)`), [stored on each `TurnContext`](`22dd9ad392/codex-rs/core/src/session/turn_context.rs (L87)`)). Every session/context variant then had to build and mutate that snapshot before assembling tools. This PR removes that master object instead of renaming it. Tool planning now reads the live `TurnContext`, where `codex-core` already owns those decisions, while `codex-tools` keeps only reusable primitives and a generic `ToolSetBuilder`/`ToolSet` accumulator. ## What Changed - Removed `ToolsConfig` / `ToolsConfigParams` from `codex-tools`; the crate keeps the shared helpers that still belong there, including request-user-input mode selection, shell backend/type resolution, `UnifiedExecShellMode`, and `ToolEnvironmentMode`. - Replaced config-snapshot planning with `ToolRouter::from_turn_context` and a `spec_plan` pipeline over `CoreToolPlanContext`, deriving provider capabilities, auth gates, model support, feature gates, environment count, goal support, multi-agent options, web search, and image generation from the authoritative turn state. - Added generic `codex_tools::ToolSetBuilder` / `ToolSet`, plus the small core adapter needed to accumulate `CoreToolRuntime` values and hosted model specs. - Added the `tool_family::shell` registration module and moved shell/unified-exec/memory accounting call sites to read the narrow per-turn fields directly. - Narrowed `TurnContext` to the remaining explicit per-turn fields needed by planning: `available_models`, `unified_exec_shell_mode`, and `goal_tools_supported`. - Reworked MCP exposure and tool-search setup so deferred/direct MCP behavior is driven by the current turn rather than a precomputed config snapshot. - Replaced the large expected-spec fixture tests with focused behavior-level coverage for shell tools, environments, goal and agent-job gates, MCP direct/deferred exposure, tool search, request-plugin-install, code mode, multi-agent mode, hosted tools, and extension executor dispatch. ## Verification - `cargo check -p codex-tools` - `cargo check -p codex-core --lib` - `cargo test -p codex-tools` - `cargo test -p codex-core spec_plan --lib` - `cargo test -p codex-core router --lib`	2026-05-19 11:24:09 +02:00
jif-oai	ba57aab13a	feat: dedicated goal DB (#23300) ## Why Thread goals are moving toward extension-owned runtime behavior, but their persisted state was still stored in the shared state database. This makes the goal store harder to isolate and keeps future storage splits tied to ad hoc runtime plumbing. This PR gives goals their own SQLite database while keeping the existing `StateRuntime` entry point. The goal is to make this the pattern for adding more dedicated runtime databases later. This also reduces load on the existing DB and reduces contention. ## Limitation Thread preview from goal is no longer supported. I'm looking into this. [EDIT]: solved ## What changed - Added a dedicated `goals_1.sqlite` database with its own `goals_migrations` directory. - Moved `thread_goals` creation into the goals DB migration set. - Dropped the old `thread_goals` table from the main state DB with a normal state migration. There is intentionally no backfill for existing goal rows. - Changed `GoalStore` to be backed only by the goals DB pool. - Removed the old goal-write side effect that filled empty `threads.preview` values from the goal objective. - Added shared runtime DB path metadata so startup, telemetry, `codex doctor`, and repair handling can include future DBs without bespoke path lists. - Updated Bazel compile data so the new goals migration directory is available to `sqlx::migrate!`. ## Verification - `cargo check --tests -p codex-state -p codex-cli -p codex-core -p codex-app-server` - `just fix -p codex-state` - `just fix -p codex-cli` - `just fix -p codex-app-server`	2026-05-19 11:11:41 +02:00
jif-oai	826b2182ed	Preserve context baselines for full-history agent forks (#23352 ) ## Why Full-history agent forks should continue from the same prompt prefix as the parent. Dropping the stored `TurnContext` baseline forced the child to rebuild startup context on its first turn, which can duplicate developer instructions and also loses the cache continuity that a full-history fork is supposed to preserve. Truncated forks are different: once we keep only the last N turns, the original prompt prefix is no longer intact, so the child must establish a fresh context baseline. ## What changed - Preserve `RolloutItem::TurnContext` when forking with `SpawnAgentForkMode::FullHistory`, and keep dropping it for truncated forks: `4090717d94/codex-rs/core/src/agent/control.rs (L98-L126)` and `4090717d94/codex-rs/core/src/agent/control.rs (L399-L401)` - Remove the special-case MultiAgentV2 usage-hint filtering path. Full-history fork now preserves the cached developer prefix instead of trying to reconstruct part of it. - Extend the fork coverage to assert both sides of the contract: full-history forks keep the parent reference baseline, while last-N forks rebuild context after truncation: `4090717d94/codex-rs/core/src/agent/control_tests.rs (L603-L759)` and `4090717d94/codex-rs/core/src/agent/control_tests.rs (L854-L977)` ## Verification - `cargo test -p codex-core spawn_agent_can_fork_parent_thread_history_with_sanitized_items -- --nocapture` - `RUST_MIN_STACK=16777216 cargo test -p codex-core spawn_agent_fork_last_n_turns_keeps_only_recent_turns -- --nocapture`	2026-05-19 10:34:24 +02:00
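The fork contract in this entry (full-history forks keep the stored baseline, truncated forks drop it) can be sketched roughly as below. The item payloads, the `LastNTurns` variant name, and the rule that one message equals one turn are simplifying assumptions for illustration; only `RolloutItem::TurnContext` and `SpawnAgentForkMode::FullHistory` are named in the commit message.

```rust
/// Simplified stand-in for rollout items; the real enum has more variants.
#[derive(Clone, Debug, PartialEq)]
pub enum RolloutItem {
    /// Stored context baseline (payload type is hypothetical).
    TurnContext(String),
    /// One turn of conversation, collapsed to a single item for this sketch.
    Message(String),
}

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum SpawnAgentForkMode {
    FullHistory,
    /// Hypothetical name for the truncated mode.
    LastNTurns(usize),
}

/// Builds the child's starting history from the parent's rollout.
pub fn fork_history(items: &[RolloutItem], mode: SpawnAgentForkMode) -> Vec<RolloutItem> {
    match mode {
        // The prompt prefix is intact, so the baseline stays valid and is kept.
        SpawnAgentForkMode::FullHistory => items.to_vec(),
        // The prefix is no longer intact: drop the baseline so the child
        // rebuilds context, and keep only the most recent turns.
        SpawnAgentForkMode::LastNTurns(n) => {
            let turns: Vec<RolloutItem> = items
                .iter()
                .filter(|item| matches!(item, RolloutItem::Message(_)))
                .cloned()
                .collect();
            let skip = turns.len().saturating_sub(n);
            turns.into_iter().skip(skip).collect()
        }
    }
}
```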
viyatb-oai	3009e23644	core: expose permission profile picker metadata (#22928 ) ## Why The `/permissions` picker needs a config-level way to distinguish legacy anonymous presets from named permission-profile mode. That signal cannot be inferred reliably in the TUI, especially for the edge case where `default_permissions = ":workspace"` is present without a `[permissions]` table. ## What changed - Expose whether the merged config is explicitly in permission-profile mode. - Expose the configured custom permission profile IDs alongside the built-in profile semantics. - Add regression coverage for profile mode detection and custom profile metadata, including the `default_permissions = ":workspace"` case. - Update the thread-manager sample config literal to match the expanded config shape. ## Stack 1. This PR: config metadata needed by downstream permission-profile consumers. 2. [#22931](https://github.com/openai/codex/pull/22931): refresh active permission profiles through runtime/session/network state. 3. [#21559](https://github.com/openai/codex/pull/21559): switch `/permissions` to the profile-aware TUI picker. ## Verification - `cargo check -p codex-thread-manager-sample` - `cargo test -p codex-core default_permissions_can_select_builtin_profile_without_permissions_table` - `cargo test -p codex-core permissions_profiles_allow_direct_write_roots_outside_workspace_root`	2026-05-18 23:26:17 -07:00
sayan-oai	1dd9bf9a74	Remove explicit connector tool undeferral (#23390 ) ## Summary - remove the explicit-connector carveout that kept mentioned app tools directly exposed instead of deferred - keep the surviving explicit-mention reconstruction only for analytics, preserving `codex_app_mentioned` and `codex_app_used.invoke_type` - trim the now-unused prompt/tool-exposure plumbing and refresh coverage around always-defer behavior ## Verification - `just fmt` - `cargo test -p codex-analytics` - `cargo test -p codex-core` (one transient timeout in `shell_snapshot::tests::macos_zsh_snapshot_includes_sections`; isolated rerun passed) - `cargo test -p codex-core --lib shell_snapshot::tests::macos_zsh_snapshot_includes_sections` - `cargo test -p codex-core --test all explicit_app_mentions_respect_always_defer` - `cargo test -p codex-core --lib mcp_tool_exposure::tests::always_defer_feature_defers_apps_too` - `just fix -p codex-analytics` - `just fix -p codex-core`	2026-05-18 21:33:46 -07:00
Eric Traut	a668379abf	[5 of 7] Replace OverrideTurnContext with ThreadSettings (#22508 ) Stack position: [5 of 7] ## Summary This PR adds `Op::ThreadSettings`, a queued settings-only update mechanism for changing stored thread settings without starting a new turn. It also removes the legacy `Op::OverrideTurnContext` in the same layer, so reviewers can see the replacement and deletion together. ## Changes - Add `Op::ThreadSettings` for settings-only queued updates. - Emit `ThreadSettingsApplied` with the effective thread settings snapshot after core applies an update. - Route settings-only updates through the same submission queue as user input. - Migrate remaining `OverrideTurnContext` tests and callers to the queued `Op::ThreadSettings` path. - Delete `Op::OverrideTurnContext` from the core protocol and submission loop. This stack addresses #20656 and #22090. ## Stack 1. [1 of 7] [Add thread settings to UserInput](https://github.com/openai/codex/pull/23080) 2. [2 of 7] [Remove UserInputWithTurnContext](https://github.com/openai/codex/pull/23081) 3. [3 of 7] [Remove UserTurn](https://github.com/openai/codex/pull/23075) 4. [4 of 7] [Placeholder for OverrideTurnContext cleanup](https://github.com/openai/codex/pull/23087) 5. [5 of 7] [Replace OverrideTurnContext with ThreadSettings](https://github.com/openai/codex/pull/22508) (this PR) 6. [6 of 7] [Add app-server thread settings API](https://github.com/openai/codex/pull/22509) 7. [7 of 7] [Sync TUI thread settings](https://github.com/openai/codex/pull/22510)	2026-05-18 21:03:51 -07:00
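As a rough picture of the end state this stack describes, the sketch below routes both ops through one queue, where only `UserInput` starts a turn. The `Op` variant names and `ThreadSettingsOverrides` come from the PR text; the field names, the `Thread` struct, and the `drain` method are hypothetical and stand in for the real submission loop.

```rust
use std::collections::VecDeque;

/// Hypothetical payload: the real struct carries more settings than a model name.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct ThreadSettingsOverrides {
    pub model: Option<String>,
}

#[derive(Clone, Debug, PartialEq)]
pub enum Op {
    /// User input, optionally carrying settings overrides for the turn.
    UserInput { text: String, settings: ThreadSettingsOverrides },
    /// Settings-only update; does not start a new turn.
    ThreadSettings(ThreadSettingsOverrides),
}

/// Illustrative thread state, not the real session type.
#[derive(Debug, Default)]
pub struct Thread {
    pub model: String,
    pub turns_started: usize,
}

impl Thread {
    fn apply(&mut self, overrides: &ThreadSettingsOverrides) {
        if let Some(model) = &overrides.model {
            self.model = model.clone();
        }
    }

    /// Processes queued ops in submission order.
    pub fn drain(&mut self, queue: &mut VecDeque<Op>) {
        while let Some(op) = queue.pop_front() {
            match op {
                Op::UserInput { settings, .. } => {
                    self.apply(&settings);
                    self.turns_started += 1;
                }
                Op::ThreadSettings(settings) => self.apply(&settings),
            }
        }
    }
}
```

Because both ops share the queue, a settings update submitted before a user input is guaranteed to be applied before that input's turn begins.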
pakrym-oai	9e9a62dc28	[codex] Extract turn skill and plugin injections (#23396 ) ## Why `run_turn` had accumulated the turn-scoped skill, plugin, app, MCP, connector-selection, and analytics setup inline. That made the orchestration path harder to scan even though the actual turn item injection still needs to stay in `run_turn` so ordering is explicit. ## What changed This extracts that setup into `build_skills_and_plugins`, which returns the combined injection `ResponseItem`s and the explicitly enabled connector IDs. `run_turn` now keeps the required orchestration pieces: context update recording, user input handling, connector selection merge, and the explicit per-item `record_conversation_items` calls for injection items. The refactor keeps the change LOC-neutral in `core/src/session/turn.rs` and preserves the existing response-item based injection path. ## Validation - `cargo test -p codex-core collect_explicit_app_ids_from_skill_items` - `just fix -p codex-core`	2026-05-18 20:33:27 -07:00
Eric Traut	1a25d8b6e5	[3 of 7] Remove UserTurn (#23075 ) Stack position: [3 of 7] ## Summary This PR finishes the input-op consolidation by moving the remaining `Op::UserTurn` callers onto `Op::UserInput` and deleting `Op::UserTurn`. This touches a lot of files, but it is a low-risk mechanical migration. ## Stack 1. [1 of 7] [Add thread settings to UserInput](https://github.com/openai/codex/pull/23080) 2. [2 of 7] [Remove UserInputWithTurnContext](https://github.com/openai/codex/pull/23081) 3. [3 of 7] [Remove UserTurn](https://github.com/openai/codex/pull/23075) (this PR) 4. [4 of 7] [Placeholder for OverrideTurnContext cleanup](https://github.com/openai/codex/pull/23087) 5. [5 of 7] [Replace OverrideTurnContext with ThreadSettings](https://github.com/openai/codex/pull/22508) 6. [6 of 7] [Add app-server thread settings API](https://github.com/openai/codex/pull/22509) 7. [7 of 7] [Sync TUI thread settings](https://github.com/openai/codex/pull/22510)	2026-05-18 19:56:00 -07:00
Eric Traut	e811234484	[2 of 7] Remove UserInputWithTurnContext (#23081 ) Stack position: [2 of 7] ## Summary This PR removes the overlapping `Op::UserInputWithTurnContext` variant now that `Op::UserInput` can carry thread settings overrides directly. ## Stack 1. [1 of 7] [Add thread settings to UserInput](https://github.com/openai/codex/pull/23080) 2. [2 of 7] [Remove UserInputWithTurnContext](https://github.com/openai/codex/pull/23081) (this PR) 3. [3 of 7] [Remove UserTurn](https://github.com/openai/codex/pull/23075) 4. [4 of 7] [Placeholder for OverrideTurnContext cleanup](https://github.com/openai/codex/pull/23087) 5. [5 of 7] [Replace OverrideTurnContext with ThreadSettings](https://github.com/openai/codex/pull/22508) 6. [6 of 7] [Add app-server thread settings API](https://github.com/openai/codex/pull/22509) 7. [7 of 7] [Sync TUI thread settings](https://github.com/openai/codex/pull/22510)	2026-05-18 19:41:33 -07:00
Eric Traut	84d941d07f	[1 of 7] Add thread settings to UserInput (#23080 ) Stack position: [1 of 7] ## Summary The first three PRs in this stack are a cleanup pass before the actual thread settings API work. Today, core has several overlapping "user input" ops: `UserInput`, `UserInputWithTurnContext`, and `UserTurn`. They differ mostly in how much next-turn state they carry, which makes the later queued thread settings update harder to reason about and review. This PR starts that cleanup by adding the shared `ThreadSettingsOverrides` payload and allowing `Op::UserInput` to carry it. Existing variants remain in place here, so this layer is mostly a behavior-preserving API shape change plus mechanical constructor updates. ## End State After PR3 By the end of PR3, `Op::UserInput` is the only "user input" core op. It can carry optional thread settings overrides for callers that need to update stored defaults with a turn, while callers without updates use empty settings. `Op::UserInputWithTurnContext` and `Op::UserTurn` are deleted. ## End State After PR5 By the end of PR5, core will have only two ops for this area: - `Op::UserInput` for user-input-bearing submissions. - `Op::ThreadSettings` for settings-only updates. ## Stack 1. [1 of 7] [Add thread settings to UserInput](https://github.com/openai/codex/pull/23080) (this PR) 2. [2 of 7] [Remove UserInputWithTurnContext](https://github.com/openai/codex/pull/23081) 3. [3 of 7] [Remove UserTurn](https://github.com/openai/codex/pull/23075) 4. [4 of 7] [Placeholder for OverrideTurnContext cleanup](https://github.com/openai/codex/pull/23087) 5. [5 of 7] [Replace OverrideTurnContext with ThreadSettings](https://github.com/openai/codex/pull/22508) 6. [6 of 7] [Add app-server thread settings API](https://github.com/openai/codex/pull/22509) 7. [7 of 7] [Sync TUI thread settings](https://github.com/openai/codex/pull/22510)	2026-05-18 18:48:35 -07:00
sayan-oai	daa11820b0	Remove ToolSearch feature toggle (#23389 ) ## Summary - mark `ToolSearch` as removed and ignore stale config writes for its legacy key - make search tool exposure depend only on model capability, not a feature toggle - remove app-server enablement support and prune now-obsolete test coverage/setup ## Verification - `cargo test -p codex-features` - `cargo test -p codex-tools` - `cargo test -p codex-core search_tool_requires_model_capability` - `cargo test -p codex-app-server experimental_feature_enablement_set_` ## Notes - This keeps the legacy config key as a no-op for compatibility while removing the ability to toggle the behavior off cleanly. - No developer-facing docs update outside the touched app-server README was needed.	2026-05-19 01:24:39 +00:00
xl-openai	6b54ced108	cleanup: Remove skill env var dependency prompting (#22721) Deletes the skill env var dependency prompt feature and its runtime path. `env_var` entries in skill dependency metadata are now silently ignored during skill loading.	2026-05-19 01:24:19 +00:00
pakrym-oai	17d552fb4d	[codex] Remove external websocket session resets (#23384 ) ## Why Compaction now installs replacement history inside the session, but the turn and compaction callers were still reaching into `ModelClientSession` to reset websocket transport state after that install. That made a transport-level reset part of the compaction API even though websocket incremental request selection already checks whether the next request is a strict extension of the previous one and falls back to a full `response.create` when it is not. ## What changed - Removed the compaction-side calls to `reset_websocket_session` from `compact.rs` and `session/turn.rs`. - Simplified pre-sampling and mid-turn compaction helpers so they return `CodexResult<()>` instead of carrying a reset flag. - Made `ModelClientSession::reset_websocket_session` private to `client.rs`, leaving only the websocket timeout recovery path inside the client as a caller. ## Validation - `cargo test -p codex-core --test all responses_websocket_creates_on_non_prefix` - `cargo test -p codex-core --test all steered_user_input_waits_for_model_continuation_after_mid_turn_compact` - `cargo test -p codex-core --test all pre_sampling_compact_runs_on_switch_to_smaller_context_model`	2026-05-19 01:13:38 +00:00
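The reason an external reset is unnecessary can be illustrated with a strict-extension check like the one below. It is a sketch under assumptions: items are modeled as plain strings, and `RequestPlan` and `plan_request` are hypothetical names rather than anything in `client.rs`.

```rust
/// What the client would send next (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
pub enum RequestPlan<'a> {
    /// Send only the items appended since the previous request.
    Incremental(&'a [String]),
    /// Send the whole input again, as a full `response.create`.
    FullCreate(&'a [String]),
}

/// Chooses an incremental request only when `next` strictly extends `previous`.
pub fn plan_request<'a>(previous: &[String], next: &'a [String]) -> RequestPlan<'a> {
    let strict_extension =
        !previous.is_empty() && next.len() > previous.len() && next.starts_with(previous);
    if strict_extension {
        RequestPlan::Incremental(&next[previous.len()..])
    } else {
        // Compaction replaces history, so the prefix check fails here and
        // the full request is sent without any explicit transport reset.
        RequestPlan::FullCreate(next)
    }
}
```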
pakrym-oai	afa0101ae2	[codex] Move pending input into input queue (#22728 ) ## Why Pending model input was split across `Session`, `TurnState`, and the agent mailbox. That made it easy for new paths to manage queued user input or mailbox delivery outside the intended ownership boundary. This PR consolidates the model-facing input lifecycle behind the session input queue so turn-local pending input, next-turn queued items, and mailbox delivery coordination are owned in one place. ## What Changed - Added `session/input_queue.rs` to own pending input queues and mailbox delivery coordination. - Removed the standalone `agent/mailbox.rs` channel wrapper and store mailbox items directly in the input queue. - Moved pending-input mutations off `TurnState`; `TurnState` now exposes the queue-owned storage directly for now. - Routed abort cleanup, mailbox delivery phase changes, next-turn queued items, and active-turn pending input through `InputQueue`. - Boxed stack-heavy agent resume/fork startup futures that the refactor pushed over the default test stack. - Updated session, task, goal, stream-event, and multi-agent call sites and tests to use the new queue ownership. ## Verification - `cargo test -p codex-core --lib agent::control::tests` - `cargo test -p codex-core --lib agent::control::tests::resume_closed_child_reopens_open_descendants -- --exact` - `cargo test -p codex-core --lib agent::control::tests::spawn_agent_fork_last_n_turns_keeps_only_recent_turns -- --exact` - `cargo test -p codex-core --lib agent::control::tests::resume_thread_subagent_restores_stored_nickname_and_role -- --exact` - `cargo test -p codex-core` was also run; it completed with 1814 passed, 4 ignored, and one timeout in `agent::control::tests::resume_thread_subagent_restores_stored_nickname_and_role`, which passed when rerun in isolation.	2026-05-18 15:43:01 -07:00
Matthew Zeng	a66e0e9c4b	Include plugin id in plugin MCP tool metadata (#23353 ) Adding the id of the plugin that contains the MCP (if any) so we can apply filters at plugin level. ## Summary - carry the plugin owner into MCP runtime provenance - attach `plugin_id` to outbound plugin-backed MCP tool-call `_meta` - avoid misattributing user-configured MCP servers that shadow plugin server names ## Testing - `just fmt` - `just fix -p codex-mcp` - `just fix -p codex-core` - `cargo test -p codex-mcp` - `cargo test -p codex-core plugin_mcp_tool_call_request_meta_includes_plugin_id` - `cargo test -p codex-core to_mcp_config_omits_plugin_id_when_user_server_shadows_plugin_mcp` - `cargo test -p codex-core rebuild_preserving_session_layers_refreshes_plugin_derived_mcp_config` - `git diff --check` ## Notes - Attempted `cargo test -p codex-core`; it aborted in `agent::control::tests::resume_agent_from_rollout_skips_descendants_when_parent_resume_fails` with a stack overflow before the full suite completed.	2026-05-18 15:33:33 -07:00
pakrym-oai	f2368b7de6	[codex] Trim unused TurnContextItem fields (#22709 ) ## Why `TurnContextItem` is the durable baseline used to reconstruct context diffs across resume/fork. Most of the old persisted-only fields on it are no longer read, so keeping them in rollout snapshots adds schema surface and state that can drift without affecting reconstruction. `summary` is the exception: older Codex versions require it to deserialize `turn_context` records, so keep writing a default compatibility value until that schema surface can be removed safely. ## What changed - Removed the unused persisted fields from `TurnContextItem`: trace ids, user/developer instructions, output schema, and truncation policy. - Kept `summary` with a compatibility comment and made `TurnContext::to_turn_context_item` write `ReasoningSummary::Auto` instead of live turn state. - Updated rollout/context reconstruction fixtures for the retained summary field. ## Verification - `cargo test -p codex-protocol --lib turn_context_item` - `cargo test -p codex-rollout resume_candidate_matches_cwd_reads_latest_turn_context` - `cargo test -p codex-state turn_context` - `cargo test -p codex-core --lib new_default_turn_captures_current_span_trace_id` - `cargo test -p codex-core --lib record_initial_history_resumed_turn_context_after_compaction_reestablishes_reference_context_item` - `cargo test -p codex-core --test all emits_warning_when_resumed_model_differs` - `git diff --check`	2026-05-18 21:54:36 +00:00
jif-oai	c69cde3547	Add tool lifecycle extension contributor (#23309 ) ## Why Extensions that need to track runtime progress currently have no typed host signal for tool execution. The goal extension in particular needs to observe tool attempts without inspecting tool payloads, owning tool implementations, or staying coupled to core-only runtime plumbing. This adds a narrow lifecycle contributor API for host-owned tool execution: extensions can observe when an accepted tool call starts and how it finishes, while policy hooks and tool handlers continue to own payload rewriting, blocking, and execution. Relevant code: - [`ToolLifecycleContributor`](`3ad2850ffc/codex-rs/ext/extension-api/src/contributors.rs (L119)`) defines the extension-facing observer contract. - [`tool_lifecycle.rs`](`3ad2850ffc/codex-rs/ext/extension-api/src/contributors/tool_lifecycle.rs`) defines the typed start/finish inputs, source, and outcome enums. - [`notify_tool_start` / `notify_tool_finish`](`3ad2850ffc/codex-rs/core/src/tools/lifecycle.rs`) bridges core tool dispatch into the extension registry. ## What Changed - Added `ToolLifecycleContributor` to `codex-extension-api`, including: - `ToolStartInput` - `ToolFinishInput` - `ToolCallSource` - `ToolCallOutcome` - Added registration and lookup support on `ExtensionRegistryBuilder` / `ExtensionRegistry`. - Wired core tool dispatch to notify lifecycle contributors for: - accepted tool starts - completed tool calls, including the tool output success marker - pre-tool-use blocks - failures before or after the handler runs - cancellation/abort in the parallel tool path - Registered the goal extension as a lifecycle contributor and added the outcome filter it will use for goal progress accounting. ## Test Coverage - Added `dispatch_notifies_tool_lifecycle_contributors` to cover lifecycle notification ordering and outcomes for successful and handler-failed tool calls.	2026-05-18 21:55:57 +02:00
Celia Chen	4dbca61e20	fix: default unknown tool schemas to empty schemas (#22380 ) ## Why Some tool providers, especially MCP servers and dynamic tool sources, can supply schema nodes that omit `type` and have no recognized JSON Schema shape hints. Previously, `sanitize_json_schema` filled those unknown nodes in as `string`, which made the schema parseable but invented a scalar constraint that the provider did not specify. For description-only fields, that could incorrectly steer tool arguments away from the provider's actual accepted shape. The Responses API accepts permissive empty schemas such as `{}` at nested property positions, so Codex should preserve that permissive meaning instead of coercing unknown schema nodes into a misleading scalar type. ## What Changed - Changed the no-hints fallback in `codex-rs/tools/src/json_schema.rs` to clear unrecognized object schema nodes to `{}`. - Empty schemas now remain `{}` rather than becoming `type: "string"`. - Description-only or otherwise metadata-only nested property schemas now become `{}` while surrounding object/array/string/number inference still applies when recognized hints are present. - Updated `codex-tools` and `codex-core` tests to cover top-level empty schemas, nested empty schemas, metadata-only malformed schemas, dynamic tools, and MCP tool specs. ## Verification - `cargo test -p codex-tools` - `cargo test -p codex-core test_mcp_tool_property_missing_type_defaults_to_empty_schema` - Manually verified the real Responses API behavior for both empty-schema positions: - Top-level function `parameters: {}` is accepted and echoed back as `{"type":"object","properties":{}}`; when forced to call the tool, Responses emitted empty object arguments: `"arguments": "{}"`. - Nested property schema `{}` is accepted and preserved as `{}`; when forced to call a tool with `metadata.extra`, Responses emitted `"arguments": "{\"metadata\":{\"extra\":\"codex schema sanitizer behavior\"}}"`.	2026-05-18 12:41:10 -07:00
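A reduced sketch of the fallback described in this entry follows. It uses a tiny hand-rolled `Json` type so it stays dependency-free, and it recognizes only one hint (`properties` implies `object`); the real `sanitize_json_schema` works on actual JSON values and recognizes more hints, so treat the names and the hint set here as assumptions.

```rust
use std::collections::BTreeMap;

/// Minimal JSON stand-in, just enough for schema nodes in this sketch.
#[derive(Clone, Debug, PartialEq)]
pub enum Json {
    Str(String),
    Obj(BTreeMap<String, Json>),
}

/// Sanitizes a schema node in place (illustrative, not the real function).
pub fn sanitize(node: &mut Json) {
    let Json::Obj(map) = node else { return };
    if !map.contains_key("type") {
        if map.contains_key("properties") {
            // Recognized hint: infer the type instead of guessing.
            map.insert("type".to_string(), Json::Str("object".to_string()));
        } else {
            // No recognized hints: keep the permissive meaning by clearing
            // the node to `{}` rather than inventing `type: "string"`.
            map.clear();
            return;
        }
    }
    if let Some(Json::Obj(props)) = map.get_mut("properties") {
        for child in props.values_mut() {
            sanitize(child);
        }
    }
}
```

Under these assumptions, a description-only nested property becomes `{}` while its parent is still inferred to be an object.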
starr-openai	10f7dc6eb5	codex: route global AGENTS reads through LOCAL_FS (#23343 ) ## Summary - make `load_global_instructions` read through an `ExecutorFileSystem` - call global AGENTS reads with explicit `LOCAL_FS` so they stay tied to local codex-home state ## Validation - `bazel test --bes_backend= --bes_results_url= --test_filter=instruction_sources_include_global_before_agents_md_docs //codex-rs/core:core-unit-tests` on `dev`	2026-05-18 19:26:10 +00:00
Eric Traut	55f6bbc667	goals: keep pause transitions explicit (#23088 ) ## Problem This addresses several user-reported cases where active goals were paused even though the user had not explicitly asked for that transition: - the guardian approval-review circuit breaker interrupted a turn and implicitly paused the goal - a shutdown in one app-server instance could pause a goal while a second instance was still actively running the same thread - steering-style interrupts could also pause the goal even though they are meant to redirect work, not stop the goal lifecycle The common problem was that core treated `TurnAbortReason::Interrupted` as an implicit request to transition the persisted goal to `paused`. That made unrelated interrupt paths mutate goal state as a side effect, and in the multi-app-server case it allowed stale process teardown to pause a live goal owned by another running client. After this change, transitioning a goal to `paused` is always an explicit action performed by a client or another intentional goal-state mutation. It is never an implicit transition triggered by generic interrupt handling. Refs #22884. ## What changed - Remove the goal runtime path that paused active goals after interrupted task aborts. - Drop the now-unused abort reason from `GoalRuntimeEvent::TaskAborted`. - Update the focused regression coverage so an interrupted active goal still accounts usage but remains `active`.	2026-05-18 11:58:40 -07:00
Eric Traut	0d344aca9b	goal: pause continuation loops on usage limits and blockers (#23094) Addresses #22833, #22245, #23067 ## Why `/goal` can keep synthesizing turns even when the next turn cannot make meaningful progress. Hard usage exhaustion can replay failing turns, and repeated permission or external-resource blockers can keep burning tokens while waiting for user or system intervention. ## What changed - Add resumable `blocked` and `usageLimited` goal states. As with `paused`, goal continuation stops with these states. - Move to `usageLimited` after usage-limit failures. - Allow the built-in `update_goal` tool to set `blocked` only under explicit repeated-impasse guidance. Updated the goal continuation prompt to specify that the agent should use `blocked` only when it has made at least three attempts to get past an impasse. Most of the files touched by this PR changed because of the small app-server protocol update. ## Validation I manually reproduced a number of situations where an agent can run into a true impasse and verified that it properly enters the `blocked` state. I then resumed and verified that it once again entered the `blocked` state several turns later if the impasse still exists. I also manually reproduced the usage-limit condition by creating a simulated Responses API endpoint that returns 429 errors with the appropriate error message. Verified that the goal runtime properly moves the goal into the `usageLimited` state and that the TUI updates appropriately. Verified that `/goal resume` resumes (and immediately goes back into the `usageLimited` state if appropriate). ## Follow-up PRs Small changes will be needed to the GUI clients to properly handle the two new states.	2026-05-18 11:28:53 -07:00
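The state behavior in this entry can be modeled as a small transition function. The sketch lists only the states the commit message names, and the `GoalEvent` enum and method names are hypothetical; the real goal runtime likely has additional states and events.

```rust
/// Goal states named in the commit message (others may exist).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum GoalStatus {
    Active,
    Paused,
    Blocked,
    UsageLimited,
}

/// Hypothetical events used to drive the sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum GoalEvent {
    UsageLimitFailure,
    ModelReportedBlocked,
    Resume,
}

impl GoalStatus {
    /// Continuation turns are synthesized only while the goal is active.
    pub fn continues(self) -> bool {
        self == GoalStatus::Active
    }

    /// Applies one event and returns the resulting state.
    pub fn on_event(self, event: GoalEvent) -> GoalStatus {
        match (self, event) {
            (GoalStatus::Active, GoalEvent::UsageLimitFailure) => GoalStatus::UsageLimited,
            (GoalStatus::Active, GoalEvent::ModelReportedBlocked) => GoalStatus::Blocked,
            // All three stopped states are resumable.
            (
                GoalStatus::Paused | GoalStatus::Blocked | GoalStatus::UsageLimited,
                GoalEvent::Resume,
            ) => GoalStatus::Active,
            // Any other combination leaves the state unchanged.
            (status, _) => status,
        }
    }
}
```

Resuming a usage-limited goal returns it to `Active`, and a further usage-limit failure moves it straight back, which matches the resume behavior described in the validation notes.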
starr-openai	9286ff2805	Fix remote turn diff display roots (#23261 ) ## Why `TurnDiffTracker` computes a display root so turn diffs can be rendered repo-relative. For remote exec-server turns, the selected turn `cwd` may exist only inside the selected environment, but `run_turn` was discovering the git root through the local host filesystem. When that lookup failed, nested remote-session diffs fell back to the nested `cwd` and showed `/tmp/...`-prefixed paths instead of repo-relative paths. ## What changed - Resolve the diff display root from the primary selected turn environment when one exists, using that environment's filesystem and `cwd`. - Add `codex_git_utils::get_git_repo_root_with_fs(...)` so git-root discovery can run against an `ExecutorFileSystem`, including remote environments. - Reuse that helper from `resolve_root_git_project_for_trust(...)` and add coverage for `.git` gitdir-pointer detection. ## Validation - Devbox Bazel: `//codex-rs/core:core-unit-tests --test_filter=get_git_repo_root_with_fs_detects_gitdir_pointer` - Devbox Docker-backed remote-env repro: `//codex-rs/core:core-all-test --test_filter=apply_patch_turn_diff_paths_stay_repo_relative_when_session_cwd_is_nested`	2026-05-18 10:53:49 -07:00
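The idea of running git-root discovery against an abstract filesystem can be sketched as below. The trait, the in-memory implementation, and the function name are hypothetical stand-ins; the actual `ExecutorFileSystem` trait and `get_git_repo_root_with_fs(...)` signature are not shown in the commit message and may well differ (for example, by being async).

```rust
use std::collections::HashSet;
use std::path::{Path, PathBuf};

/// Hypothetical minimal filesystem abstraction for this sketch.
pub trait SketchFs {
    fn exists(&self, path: &Path) -> bool;
}

/// In-memory filesystem standing in for a remote environment.
pub struct InMemoryFs(pub HashSet<PathBuf>);

impl SketchFs for InMemoryFs {
    fn exists(&self, path: &Path) -> bool {
        self.0.contains(path)
    }
}

/// Walks up from `cwd` and returns the first directory containing `.git`.
/// An existence check covers both a `.git` directory and a `.git` file
/// that holds a gitdir pointer.
pub fn find_repo_root(fs: &dyn SketchFs, cwd: &Path) -> Option<PathBuf> {
    cwd.ancestors()
        .find(|dir| fs.exists(&dir.join(".git")))
        .map(Path::to_path_buf)
}
```

Because the lookup goes through the supplied filesystem rather than the local host, a nested `cwd` that exists only in the remote environment can still resolve to its repository root.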
pakrym-oai	82061660ae	[codex] Remove legacy shell output formatting paths (#22706 ) ## Why The client and tool pipeline still carried compatibility code for legacy structured shell output. Current shell and apply_patch responses are already plain text for model consumption, so keeping a JSON-serialization path plus shell-item rewrite logic makes the request formatter and tests preserve a format we do not need anymore. ## What Changed - Removed the client-side shell output rewrite from `core/src/client_common.rs`. - Removed the structured exec-output formatter and the shell `freeform` switch so tool emitters use one model-facing formatter. - Collapsed apply_patch/shell serialization tests around the remaining plain-text output expectations and removed duplicate one-variant parameterized cases. - Kept the `ApplyPatchModelOutput::ShellCommandViaHeredoc` compatibility input shape, but no longer treats it as a separate output-format mode. ## Validation - `cargo test -p codex-core client_common` - `cargo test -p codex-core shell_serialization` - `cargo test -p codex-core apply_patch_cli` - `just fix -p codex-core` ## Documentation No external Codex documentation update is needed.	2026-05-18 09:57:54 -07:00
jif-oai	b631d92170	chore: make token usage async (#23305 ) Make the `TokenUsageContributor` async. This will be required for future extension and it's basically free	2026-05-18 15:59:06 +02:00
jif-oai	500ef67ed1	chore: goal resumed metrics (#23301 ) Add metrics for goal resume	2026-05-18 15:19:23 +02:00
jif-oai	7ee7fe239f	chore: isolate thread goal storage behind GoalStore (#23295 ) ## Why Thread goal persistence is being prepared for a dedicated storage boundary. Before that split, goal-specific reads, writes, accounting, and cleanup were exposed directly on `StateRuntime`, so core and app-server callsites stayed coupled to the full runtime instead of a goal-specific store. This PR introduces that boundary without changing the goal wire API or current persistence behavior. Callers now go through `StateRuntime::thread_goals()` and the new `GoalStore`, while `GoalStore` still uses the existing state DB pool underneath. ## What changed - Added `GoalStore` in `state/src/runtime/goals.rs` and exposed it from `StateRuntime` via `thread_goals()`. - Moved thread-goal reads, writes, status updates, pause, delete, and usage accounting onto `GoalStore`. - Updated core session goal handling, app-server goal RPCs, resume snapshots, and goal tests to use the store boundary. - Kept thread deletion responsible for cascading goal cleanup by deleting the goal through the store only after a thread row is removed. ## Testing - Existing goal persistence, resume, and accounting tests were updated to exercise the new `GoalStore` access path.	2026-05-18 14:47:05 +02:00
jif-oai	9531e932ef	Make extension lifecycle hooks async (#23291 ) ## Why Extension lifecycle hooks sit on the host/extension boundary, but the current trait surface only allows synchronous callbacks. That forces extensions that need to seed, rehydrate, observe, or flush extension-owned state during thread and turn transitions to either block inside the callback or move async work into separate host plumbing. This PR makes those lifecycle callbacks awaitable so extension implementations can perform async work directly at the lifecycle point where the host already has the relevant session, thread, or turn stores available. ## What changed - Makes `ThreadLifecycleContributor` and `TurnLifecycleContributor` async in `codex-extension-api`. - Awaits thread start/resume/stop and turn start/stop/abort lifecycle callbacks from `codex-core`. - Updates the guardian and memories extensions to implement the async lifecycle trait surface. - Updates the existing lifecycle tests to use async contributor implementations. - Adds `async-trait` to the crates that now expose or implement these async object-safe lifecycle traits. ## Testing - Existing `codex-core` lifecycle tests were updated to cover async implementations for thread stop and turn abort ordering.	2026-05-18 13:53:58 +02:00
Michael Bolin	0a83353ca3	test: reduce core sandbox policy test setup (#23036 ) ## Why `SandboxPolicy` is a legacy compatibility shape, but several core tests still used it for ordinary turn setup even when the runtime path now carries `PermissionProfile`. With the first cleanup PR merged, this follow-up trims more core test scaffolding so remaining `SandboxPolicy` matches are easier to classify as production compatibility, legacy-boundary coverage, or explicit conversion tests. ## What Changed - Updated apply-patch handler and runtime tests to pass `PermissionProfile` directly. - Changed sandboxing test helpers to build permission profiles without first creating `SandboxPolicy` values. - Converted request-permissions integration turns to pass `PermissionProfile` through the test helper, leaving legacy sandbox projection at the `Op::UserTurn` boundary. - Converted unified exec integration helpers and direct turn submissions to use `PermissionProfile` values instead of `SandboxPolicy` setup. - Removed now-unused `SandboxPolicy` imports from the touched core tests. ## Test Plan - `just fmt` - `cargo test -p codex-core --lib tools::sandboxing::tests` - `cargo test -p codex-core --lib tools::runtimes::apply_patch::tests` - `cargo test -p codex-core --lib tools::handlers::apply_patch::tests` - `cargo test -p codex-core --lib unified_exec::process_manager::tests` - `cargo test -p codex-core --test all request_permissions::` - `cargo test -p codex-core --test all unified_exec::` - `just fix -p codex-core`	2026-05-17 08:39:41 -07:00
jif-oai	545ede569c	Make multi-agent v2 tool namespace configurable (#23147 ) ## Summary - Add `features.multi_agent_v2.tool_namespace` with config/schema validation for Responses-compatible namespace values. - Thread the resolved namespace into `ToolsConfig` for normal turns and review turns. - Wrap MultiAgentV2 tool specs and registry names in the configured namespace when namespace tools are supported, while falling back to the plain tool names when they are not. ## Validation - `just fmt` - `just write-config-schema` - `cargo test -p codex-features multi_agent_v2_feature_config -- --nocapture` - `cargo test -p codex-core test_build_specs_multi_agent_v2 -- --nocapture` - `cargo test -p codex-core multi_agent_v2_config -- --nocapture` - `cargo test -p codex-core multi_agent_v2_rejects_invalid_tool_namespace -- --nocapture` - `cargo test -p codex-tools` - `git diff --check`	2026-05-17 15:27:43 +02:00
sayan-oai	061a614d85	multiagent: trim model-visible description, cap to 5 models (#23069 ) ## Why The `spawn_agent` model override guidance is uncapped and bloating context. We need to trim down each entry and cap total entries. picked 5 as cap, we can change ## What changed - Cap the model override summaries shown in `spawn_agent` to the first 5 picker-visible models, preserving the existing priority ordering from the models manager. - Condense each rendered entry to the actionable pieces the model needs: - use the model slug as the label - render compact reasoning effort lists with the default marked inline - render only service tier IDs, and omit the clause when no tiers are available - Update coverage so the compact formatter shape and the top-5 cap are exercised, and keep the end-to-end request assertion aligned with real model metadata. ## Example Before: `- gpt-5.4 ('gpt-5.4\'): Strong model for everyday coding. Default reasoning effort: medium. Supported reasoning efforts: low (Fast responses with lighter reasoning), medium (Balances speed and reasoning depth for everyday tasks), high (Greater reasoning depth for complex problems), xhigh (Extra high reasoning depth for complex problems). Supported service tiers: priority (Fast: 1.5x speed, increased usage).` After: `- 'gpt-5.4': Strong model for everyday coding. Reasoning efforts: low, medium (default), high, xhigh. Service tiers: priority.`	2026-05-16 13:43:30 -07:00
Michael Bolin	d91bc15618	test: construct permission profiles directly (#23030 ) ## Why `SandboxPolicy` is now a legacy compatibility shape, but several tests still built a `SandboxPolicy` only to immediately convert it into `PermissionProfile` for APIs that already accept canonical runtime permissions. Those detours make it harder to audit where legacy sandbox policy is still required, because boundary-only usages are mixed together with ordinary test setup. ## What Changed - Updated tests in `codex-core`, `codex-exec`, `codex-analytics`, and `codex-config` to construct `PermissionProfile` values directly when the code under test takes a permission profile. - Changed exec-policy, request-permissions, session, and sandbox test helpers to pass `PermissionProfile` through instead of converting from `SandboxPolicy` internally. - Left `SandboxPolicy` in place where tests are explicitly exercising legacy compatibility or request/response boundaries. ## Test Plan - `cargo test -p codex-analytics -p codex-config` - `cargo test -p codex-core --lib safety::tests` - `cargo test -p codex-core --lib exec_policy::tests::` - `cargo test -p codex-core --lib exec::tests` - `cargo test -p codex-core --lib guardian_review_session_config` - `cargo test -p codex-core --lib tools::network_approval::tests` - `cargo test -p codex-core --lib tools::runtimes::shell::unix_escalation::tests` - `cargo test -p codex-core --lib managed_network` - `cargo test -p codex-core --test all request_permissions::` - `cargo test -p codex-exec sandbox` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/23030). * #23036 * __->__ #23030	2026-05-16 12:12:37 -07:00
Eric Traut	941e7f825e	Improve goal completion usage reporting (#22907 ) ## Why Goal completion follow-up turns currently receive a preformatted English usage sentence such as `time used: 2586 seconds`. That nudges the model to echo an awkward raw seconds count in the final reply, even though the tool result already exposes structured usage fields like `goal.timeUsedSeconds`, `goal.tokensUsed`, and `goal.tokenBudget`. ## What changed - Replace the preformatted completion usage sentence with guidance to read the structured goal fields from the tool result. - Preserve token-budget reporting while allowing the model to phrase elapsed time in a concise, human-friendly way that fits the response language. - Update core coverage for both the generated completion guidance and the session flow that forwards it back to the model. ## Verification Previously, it would have output a final message indicating that it "worked for 303 seconds". Now it shows the following: <img width="286" height="35" alt="image" src="https://github.com/user-attachments/assets/d7011880-9449-46a7-856f-4e50ae00eb45" />	2026-05-16 11:49:40 -07:00
Michael Bolin	108234b5eb	core: set permission profiles from snapshots (#22920 ) ## Why #22891 moved the TUI turn-command path to pass `ActivePermissionProfile` instead of the full `PermissionProfile`, but the remaining config/session bridge still accepted the concrete `PermissionProfile` and active profile id as separate arguments. That shape made it too easy for future callers to update the concrete profile and active profile id out of sync. This PR makes the trusted session snapshot path pass one coherent value into `Permissions`, while keeping `requirements.toml` enforcement owned by the existing constrained permission state. ## What Changed - Added `PermissionProfileSnapshot` as the public snapshot value for trusted session/config synchronization. - Changed `Permissions::set_permission_profile_from_session_snapshot()` and `replace_permission_profile_from_session_snapshot()` to take a `PermissionProfileSnapshot`. - Updated the replacement path to derive its constrained `PermissionProfile` from the snapshot, so callers cannot pass a separate profile that disagrees with the snapshot. - Removed the internal tuple-style `PermissionProfileState::set_active_permission_profile()` mutation path. - Updated core session projection and TUI call sites to construct explicit legacy or active snapshots. - Documented the snapshot constructors so legacy use and id/profile mismatch hazards are called out at the API boundary. - Added a focused config test that verifies snapshot updates still respect existing permission constraints. ## How To Review 1. Start with `codex-rs/core/src/config/resolved_permission_profile.rs`; `PermissionProfileSnapshot` is the public wrapper, while `ResolvedPermissionProfile` stays internal. 2. Check `codex-rs/core/src/config/mod.rs` to confirm both session-snapshot setters validate through `PermissionProfileState` and no longer accept loose profile/id pairs. 3. Skim `codex-rs/core/src/session/session.rs` for the session projection path; it now builds the snapshot before installing it. 4. Skim the TUI changes as call-site migration from loose argument pairs to explicit snapshot construction. ## Verification - `cargo test -p codex-core permission_snapshot_setter_preserves_permission_constraints` - `cargo test -p codex-tui status_permissions_` - `cargo test -p codex-tui session_configured_preserves_profile_workspace_roots` - `just fix -p codex-core -p codex-tui`	2026-05-16 07:26:18 -07:00
Curtis 'Fjord' Hawthorne	8543e39885	Preserve image detail in app-server inputs (#20693 ) ## Summary - Add optional image detail to user image inputs across core, app-server v2, thread history/event mapping, and the generated app-server schemas/types. - Preserve requested detail when serializing Responses image inputs: omitted detail stays on the existing `high` default, while explicit `original` keeps local images on the original-resolution path. - Support `high`/`original` consistently for tool image outputs, including MCP `codex/imageDetail`, code-mode image helpers, and `view_image`.	2026-05-15 15:04:04 -07:00
Michael Bolin	8df2d96860	core: construct test permission profiles directly (#22795 ) ## Why The core migration is trying to make `PermissionProfile` the shape tests and runtime code reason about, leaving `SandboxPolicy` only where legacy behavior is explicitly under test. The local `permission_profile_for_sandbox_policy()` test helpers kept new permission-profile tests mentally tied to the old sandbox model even when the equivalent profile is straightforward. ## What Changed - Removed the `permission_profile_for_sandbox_policy()` helper from the network proxy spec tests and session tests. - Replaced legacy conversions for read-only, workspace-write, and full-access cases with `PermissionProfile::read_only()`, `PermissionProfile::workspace_write()`, and `PermissionProfile::Disabled`. - Constructed the external-sandbox session test's `PermissionProfile::External` directly, while preserving the legacy `SandboxPolicy` only where the test still exercises legacy config update behavior. ## How To Review This PR is intentionally test-only. Review the two touched files and check that each replacement preserves the old legacy mapping: - `SandboxPolicy::new_read_only_policy()` -> `PermissionProfile::read_only()` - `SandboxPolicy::new_workspace_write_policy()` -> `PermissionProfile::workspace_write()` - `SandboxPolicy::DangerFullAccess` -> `PermissionProfile::Disabled` - `SandboxPolicy::ExternalSandbox { network_access: Restricted }` -> `PermissionProfile::External { network: Restricted }` ## Verification - `cargo test -p codex-core requirements_allowed_domains_are_a_baseline_for_user_allowlist` - `cargo test -p codex-core start_managed_network_proxy_applies_execpolicy_network_rules` - `cargo test -p codex-core session_configured_reports_permission_profile_for_external_sandbox` - `cargo test -p codex-core managed_network_proxy_decider_survives_full_access_start` - `just fix -p codex-core` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22795). * #22891 * __->__ #22795	2026-05-15 13:09:25 -07:00
Boyang Niu	c15613f2b6	Forward apps MCP product SKU from Codex config (#22872 ) This adds `apps_mcp_product_sku` as a toplevel config.toml key. We pass the given value as a header when listing MCPs for the client, allowing connectors to be filtered per product entry point. --------- Co-authored-by: Codex <noreply@openai.com>	2026-05-15 11:52:14 -07:00
Michael Bolin	4c80435eba	telemetry: tag sandboxes from permission profiles (#22791 ) ## Why Sandbox telemetry tags should be derived from the active permission profile, not from a legacy `SandboxPolicy`, so the tagging code stays aligned with the permissions migration and does not preserve a policy-shaped production helper only for tests. ## What Changed - Removed the production `sandbox_tag(&SandboxPolicy, ...)` helper. - Updated sandbox tag tests to construct the relevant `PermissionProfile` values directly. - Kept the platform-specific sandbox tag behavior under the existing `permission_profile_sandbox_tag` path. ## How To Review The production change is in `codex-rs/core/src/sandbox_tags.rs`. Most of the diff is test cleanup that replaces legacy policy setup with permission profiles, so review the expected tag assertions rather than the old helper mechanics. ## Verification - `cargo test -p codex-core sandbox_tag` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22791). * #22795 * #22792 * __->__ #22791	2026-05-15 10:58:50 -07:00
Michael Bolin	aeca1cba6f	context: remove legacy permissions instructions helper (#22790 ) ## Why The permissions instruction builder should consume the new permissions model directly. Keeping a `SandboxPolicy` conversion helper in this path encourages new code to route through legacy sandbox policy values even when the caller already has a `PermissionProfile`. ## What Changed - Removed `PermissionsInstructions::from_policy`. - Removed the test that exercised that legacy helper. - Left the existing profile-based instruction coverage in place. ## How To Review Review `codex-rs/core/src/context/permissions_instructions.rs` first. This PR is intentionally narrow: the production behavior should be unchanged for profile callers, and the deleted surface was only a convenience adapter from `SandboxPolicy`. ## Verification - `cargo test -p codex-core builds_permissions_from_profile` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22790). * #22795 * #22792 * #22791 * __->__ #22790	2026-05-15 10:11:16 -07:00
Chris Bookholt	9facdccb37	Ignore configured hooks in git helpers (#22843 ) ## What - Internal Git helper commands now ignore configured hook directories during repository bookkeeping. ## Why - These helper flows should stay consistent even when a repository has hook-directory configuration of its own. ## How - Pass a command-local `core.hooksPath` override in the shared helper path and the Git-info helper path. - Add regressions for the baseline index rewrite flow and the metadata status flow. ## Validation - `cargo fmt --manifest-path /Users/bookholt/code/codex/codex-rs/Cargo.toml --all --check` - `cargo test --manifest-path /Users/bookholt/code/codex/codex-rs/Cargo.toml -p codex-git-utils` - `cargo test --manifest-path /Users/bookholt/code/codex/codex-rs/Cargo.toml -p codex-core test_get_has_changes_`	2026-05-15 10:07:54 -07:00
Michael Bolin	68ccfdc905	guardian: use permission profile for review sandbox (#22789 ) ## Why `SandboxPolicy` is being pushed back toward legacy config loading and compatibility boundaries. Guardian review sessions already want the built-in read-only permission behavior; carrying that as an active `PermissionProfile` makes the review sandbox follow the new permissions path instead of configuring the child session through the legacy policy API. ## What Changed - Configure the guardian review session with `PermissionProfile::read_only()`. - Send the read-only profile through the guardian child `Op::UserTurn`. - Keep the legacy `sandbox_policy` field populated with `SandboxPolicy::new_read_only_policy()` declared next to the profile so the two remain visibly in sync until the compatibility field goes away. ## How To Review Start in `codex-rs/core/src/guardian/review_session.rs`. The important check is that both the guardian config and the child turn now use the read-only permission profile, while the remaining `SandboxPolicy::ReadOnly` assignment is only the compatibility field required by the current turn protocol. ## Verification - `cargo test -p codex-core guardian_review_session_config_clears_parent_developer_instructions` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22789). * #22795 * #22792 * #22791 * #22790 * __->__ #22789	2026-05-15 08:59:31 -07:00
jif-oai	cccde930ce	Move memory prompt injection to app-server extension (#22841 ) ## Why Memory prompt injection should be owned by the extension path that app-server composes at runtime, not by an inlined special case inside `codex-core`. This keeps `codex-core` focused on session orchestration while allowing the memories extension to own its app-server prompt behavior. ## What Changed - Registers `codex-memories-extension` in the app-server extension registry. - Moves the memory developer-instruction injection out of `core/src/session/mod.rs` and into the memories extension prompt contributor. - Adds config-change handling so the extension keeps its per-thread memory settings in sync after startup. - Leaves memories read/retrieval tools unregistered for now so this PR only changes prompt injection. - Removes the stale `cargo-shear` ignore now that app-server depends on the extension crate. ## Validation Not run locally; validation is left to CI.	2026-05-15 16:19:34 +02:00
jif-oai	5d30764fe9	Run compact hooks for remote compaction v2 (#22828 ) ## Why Remote compaction v2 is the `/responses` implementation of session-history compaction, but it still needs to preserve the observable contract of the legacy `/responses/compact` path. In particular, users and integrations that rely on `PreCompact` and `PostCompact` hooks should not see different behavior when `remote_compaction_v2` is enabled. ## What Changed - Runs `PreCompact` before issuing the remote compaction v2 request, including `Interrupted` analytics when a pre-hook stops execution. - Runs `PostCompact` after a successful v2 compaction and aborts the turn if the post-hook stops execution. - Adds `compact_remote_parity` coverage that compares legacy and v2 compaction across manual transcript shapes, automatic pre-turn compaction, automatic mid-turn compaction, hook payloads, replacement history, follow-up request payloads, and API-key `service_tier=fast` behavior. - Registers the new parity suite under `core/tests/suite`. Relevant code: - [`compact_remote_v2.rs`](`af63745cb5/codex-rs/core/src/compact_remote_v2.rs`) - [`compact_remote_parity.rs`](`af63745cb5/codex-rs/core/tests/suite/compact_remote_parity.rs`) ## Verification - Added `core/tests/suite/compact_remote_parity.rs` to assert parity between legacy remote compaction and remote compaction v2 for the affected request, hook, rollout-history, and follow-up paths. - Existing `compact_remote_v2` unit coverage still exercises v2 replacement-history retention and compaction-output collection.	2026-05-15 15:26:21 +02:00
jif-oai	c03cea4ca2	Remove zombie tools spec module (#22820 ) ## Summary - move tool_user_shell_type out of the old tools::spec module and call it from tools directly - attach the remaining spec planning model tests under spec_plan - delete core/src/tools/spec.rs ## Tests - just fmt - cargo test -p codex-core tools::spec_plan Note: a broader cargo test -p codex-core run on the earlier PR-head worktree still hit the pre-existing stack overflow in agent::control::tests::spawn_agent_fork_last_n_turns_keeps_only_recent_turns.	2026-05-15 13:44:58 +02:00
jif-oai	6f1a01fbdd	Simplify tool executor and registry plumbing (#22636 ) ## Why The tool runtime path still had a typed output associated type on `ToolExecutor`, plus a core-only `RegisteredTool` adapter and extension-only executor aliases. That made every new shared tool runtime carry extra adapter plumbing before it could participate in core dispatch, extension tools, hook payloads, telemetry, and model-visible spec generation. This PR moves output erasure to the shared executor boundary so core and extension tools can use the same execution contract directly. ## What Changed - Changed `codex_tools::ToolExecutor` to return `Box<dyn ToolOutput>` instead of an associated `Output` type. - Removed the extension-specific `ExtensionToolExecutor` / `ExtensionToolOutput` aliases and exposed `ToolExecutor<ToolCall>` plus `ToolOutput` through `codex-extension-api`. - Reworked core tool registration around `CoreToolRuntime` and `ToolRegistry::from_tools`, removing the extra `RegisteredTool` / `ToolRegistryBuilder` layer. - Consolidated model-visible spec planning and registry construction in `core/src/tools/spec_plan.rs`, including deferred tool search and code-mode-only filtering. - Added `ToolOutput` helpers for post-tool-use hook ids and inputs so MCP, unified exec, extension, and other boxed outputs preserve the same hook payload behavior. - Updated core handlers, memories tools, and the related registry/spec/router tests to use the simplified contract. ## Test Coverage - Updated coverage for tool spec planning, registry lookup, deferred tool search registration, extension tool routing, post-tool-use hook payloads, dispatch tracing, guardian output extraction, and memories extension tool execution.	2026-05-15 11:47:54 +02:00
jif-oai	0322ac3df8	[codex] Use compaction_trigger item for remote compaction v2 (#22809 ) ## Why Remote compaction v2 was still using `context_compaction` as both the request trigger and the compacted output shape. The Responses API now has the landed contract for this flow: Codex sends a dedicated `{ "type": "compaction_trigger" }` input item, and the backend returns the standard `compaction` output item with encrypted content. This aligns the v2 path with that wire contract while preserving the existing local compacted-history post-processing behavior. ## What changed - Add `ResponseItem::CompactionTrigger` and regenerate the app-server protocol schema fixtures. - Send `compaction_trigger` from `remote_compaction_v2` instead of a payload-less `context_compaction`. - Collect exactly one backend `compaction` output item, then reuse the existing compacted-history rebuilding path. - Treat the trigger item as a transient request marker rather than model output or persisted rollout/memory content. ## Verification - `cargo test -p codex-protocol compaction_trigger` - `cargo test -p codex-core remote_compact_v2` - `cargo test -p codex-core compact_remote_v2` - `cargo test -p codex-core responses_websocket_sends_response_processed_after_remote_compaction_v2` - `just write-app-server-schema` - `cargo test -p codex-app-server-protocol schema_fixtures`	2026-05-15 11:40:35 +02:00
Michael Bolin	8a5306ff88	app-server: use permission ids and runtime workspace roots (#22611 ) ## Why This PR builds on [#22610](https://github.com/openai/codex/pull/22610) and is the app-server side of the migration from mutable per-turn `SandboxPolicy` replacement toward selecting immutable permission profiles by id plus mutable runtime workspace roots. Once permission profiles can carry their own immutable `workspace_roots`, app-server no longer needs to mutate the selected `PermissionProfile` just to represent thread-specific filesystem context. The mutable part now lives on the thread as explicit `runtimeWorkspaceRoots`, while `:workspace_roots` remains symbolic until the sandbox is realized for a turn. ## What Changed - Replaced the v2 permission-selection wrapper surface with plain profile ids for `thread/start`, `thread/resume`, `thread/fork`, and `turn/start`. - Removed the API surface for profile modifications (`PermissionProfileSelectionParams`, `PermissionProfileModificationParams`, `ActivePermissionProfileModification`). - Added experimental `runtimeWorkspaceRoots` fields to the thread lifecycle and turn-start APIs. - Threaded runtime workspace roots through core session/thread snapshots, turn overrides, app-server request handling, and command execution permission resolution. - Kept session permission state symbolic so later runtime root updates and cwd-only implicit-root retargeting rebind `:workspace_roots` correctly. - Updated the embedded clients just enough to send and restore the new thread state. - Refreshed the generated schema/TypeScript artifacts and the app-server README to match the new contract. ## Verification Targeted coverage for this layer lives in: - `codex-rs/app-server-protocol/src/protocol/v2/tests.rs` - `codex-rs/app-server/tests/suite/v2/thread_start.rs` - `codex-rs/app-server/tests/suite/v2/thread_resume.rs` - `codex-rs/app-server/tests/suite/v2/turn_start.rs` - `codex-rs/core/src/session/tests.rs` The key regression checks exercise that: - `runtimeWorkspaceRoots` resolve against the effective cwd on thread start. - Profile-declared workspace roots are excluded from the runtime workspace roots returned by app-server. - A turn-level runtime workspace-root update persists onto the thread and is returned by `thread/resume`. - A named permission profile selected on one turn remains symbolic so a later runtime-root-only turn update changes the actual sandbox writes. - A cwd-only turn update retargets the implicit runtime cwd root while preserving additional runtime roots. - The protocol fixtures and generated client artifacts stay in sync with the string-based permission selection contract. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22611). * #22612 * __->__ #22611	2026-05-14 23:00:05 -07:00
guinness-oai	4f2918dd7f	[codex] Add opaque desktop config namespace (#22584 ) ## Summary - reserve an explicit opaque `desktop` namespace in `ConfigToml` - expose `desktop` directly in the app-server v2 `config/read` response - keep `config/value/write` and `config/batchWrite` as the only mutation seam for paths like `desktop.someKey` - regenerate the config/app-server schema outputs and document the new contract ## Why The desktop settings work wants one durable, user-editable home for app-owned preferences in `~/.codex/config.toml`, without forcing Rust to model every individual desktop setting key. This PR is only the enabling Rust/app-server layer. It gives the Electron app a first-class config namespace it can read and write through the existing config APIs, while leaving the actual desktop migration to the app PR. ## Behavior and design notes - Opaque but explicit: `desktop` is first-class at the typed config root, while its children remain app-owned and open-ended. - Strict validation still works: arbitrary nested `desktop.` keys are accepted instead of being rejected as unknown config. - Existing config APIs stay the seam: `config/read` returns the bag, and dotted writes such as `desktop.someKey` continue to flow through `config/value/write` / `config/batchWrite` rather than a bespoke RPC. - No new consumer behavior: Core/TUI do not start depending on desktop preferences. This only preserves and exposes the namespace for callers that intentionally use it. - Same persistence machinery: hand-edited `config.toml` keeps using the existing TOML edit/write path; this PR does not introduce a second serializer or side channel. - TOML-friendly values: the namespace is intended for ordinary JSON-shaped setting values that map cleanly into TOML: strings, numbers, booleans, arrays, and nested object/table values. This PR does not add special handling for TOML-only edge cases such as datetimes. ## Layering semantics Reads keep using the ordinary effective config pipeline, so `desktop` participates in the same layered `config/read` behavior as the rest of `ConfigToml`. Writes still target user config through the existing config service. ## Why this is the shape The alternative would be teaching Rust about each desktop setting as it is added. That would make ordinary app preferences into a cross-repo change, which is exactly the coupling we want to avoid. This keeps the contract small: 1. Rust owns one opaque `desktop` namespace in `config.toml`. 2. The desktop app owns the schema and meaning of individual keys inside it. 3. The existing config APIs remain the transport and mutation surface. That is the piece the desktop settings PR needs in order to move forward cleanly. ## Verification - `cargo test -p codex-config strict_config_accepts_opaque_desktop_keys` - `cargo test -p codex-core desktop_toml_round_trips_opaque_nested_values` - `cargo test -p codex-core config_schema_matches_fixture` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-app-server --test all desktop_settings`	2026-05-15 02:34:21 +00:00


3336 Commits