codex

mirror of https://github.com/openai/codex.git synced 2026-04-24 14:45:27 +00:00

Author	SHA1	Message	Date
sayan-oai	c10f95ddac	Update models.json and related fixtures (#19323 ) Supersedes #18735. The scheduled rust-release-prepare workflow force-pushed `bot/update-models-json` back to the generated models.json-only diff, which dropped the test and snapshot updates needed for CI. This PR keeps the latest generated `models.json` from #18735 and adds the corresponding fixture updates: - preserve model availability NUX in the app-server model cache fixture - update core/TUI expectations for the new `gpt-5.4` `xhigh` default reasoning - refresh affected TUI chatwidget snapshots for the `gpt-5.5` default/model copy changes Validation run locally while preparing the fix: - `just fmt` - `cargo test -p codex-app-server model_list` - `cargo test -p codex-core includes_no_effort_in_request` - `cargo test -p codex-core includes_default_reasoning_effort_in_request_when_defined_by_model_info` - `cargo test -p codex-tui --lib chatwidget::tests` - `cargo insta pending-snapshots` --------- Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>	2026-04-24 11:14:13 +02:00
sayan-oai	e083b6c757	chore: apply truncation policy to unified_exec (#19247 ) we were not respecting turn's `truncation_policy` to clamp output tokens for `unified_exec` and `write_stdin`. this meant truncation was only being applied by `ContextManager` before the output was stored in-memory (so it _was_ being truncated from model-visible context), but the full output was persisted to rollout on disk. now we respect that `truncation_policy` and `ContextManager`-level truncation remains a backup. ### Tests added tests, tested locally.	2026-04-24 00:17:39 -07:00
Michael Bolin	4816b89204	permissions: make profiles represent enforcement (#19231 ) ## Why `PermissionProfile` is becoming the canonical permissions abstraction, but the old shape only carried optional filesystem and network fields. It could describe allowed access, but not who is responsible for enforcing it. That made `DangerFullAccess` and `ExternalSandbox` lossy when profiles were exported, cached, or round-tripped through app-server APIs. The important model change is that active permissions are now a disjoint union over the enforcement mode. Conceptually: ```rust pub enum PermissionProfile { Managed { file_system: FileSystemSandboxPolicy, network: NetworkSandboxPolicy, }, Disabled, External { network: NetworkSandboxPolicy, }, } ``` This distinction matters because `Disabled` means Codex should apply no outer sandbox at all, while `External` means filesystem isolation is owned by an outside caller. Those are not equivalent to a broad managed sandbox. For example, macOS cannot nest Seatbelt inside Seatbelt, so an inner sandbox may require the outer Codex layer to use no sandbox rather than a permissive one. ## How Existing Modeling Maps Legacy `SandboxPolicy` remains a boundary projection, but it now maps into the higher-fidelity profile model: - `ReadOnly` and `WorkspaceWrite` map to `PermissionProfile::Managed` with restricted filesystem entries plus the corresponding network policy. - `DangerFullAccess` maps to `PermissionProfile::Disabled`, preserving the “no outer sandbox” intent instead of treating it as a lax managed sandbox. - `ExternalSandbox { network_access }` maps to `PermissionProfile::External { network }`, preserving external filesystem enforcement while still carrying the active network policy. - Split runtime policies that legacy `SandboxPolicy` cannot faithfully express, such as managed unrestricted filesystem plus restricted network, stay `Managed` instead of being collapsed into `ExternalSandbox`. - Per-command/session/turn grants remain partial overlays via `AdditionalPermissionProfile`; full `PermissionProfile` is reserved for complete active runtime permissions. ## What Changed - Change active `PermissionProfile` into a tagged union: `managed`, `disabled`, and `external`. - Keep partial permission grants separate with `AdditionalPermissionProfile` for command/session/turn overlays. - Represent managed filesystem permissions as either `restricted` entries or `unrestricted`; `glob_scan_max_depth` is non-zero when present. - Preserve old rollout compatibility by accepting the pre-tagged `{ network, file_system }` profile shape during deserialization. - Preserve fidelity for important edge cases: `DangerFullAccess` round-trips as `disabled`, `ExternalSandbox` round-trips as `external`, and managed unrestricted filesystem + restricted network stays managed instead of being mistaken for external enforcement. - Preserve configured deny-read entries and bounded glob scan depth when full profiles are projected back into runtime policies, including unrestricted replacements that now become `:root = write` plus deny entries. - Regenerate the experimental app-server v2 JSON/TypeScript schema and update the `command/exec` README example for the tagged `permissionProfile` shape. ## Compatibility Legacy `SandboxPolicy` remains available at config/API boundaries as the compatibility projection. Existing rollout lines with the old `PermissionProfile` shape continue to load. The app-server `permissionProfile` field is experimental, so its v2 wire shape is intentionally updated to match the higher-fidelity model. ## Verification - `just write-app-server-schema` - `cargo check --tests` - `cargo test -p codex-protocol permission_profile` - `cargo test -p codex-protocol preserving_deny_entries_keeps_unrestricted_policy_enforceable` - `cargo test -p codex-app-server-protocol permission_profile_file_system_permissions` - `cargo test -p codex-app-server-protocol serialize_client_response` - `cargo test -p codex-core session_configured_reports_permission_profile_for_external_sandbox` - `just fix` - `just fix -p codex-protocol` - `just fix -p codex-app-server-protocol` - `just fix -p codex-core` - `just fix -p codex-app-server`	2026-04-23 23:02:18 -07:00
Celia Chen	e8d8080818	feat: let model providers own model discovery (#18950 ) ## Why `codex-models-manager` had grown to own provider-specific concerns: constructing OpenAI-compatible `/models` requests, resolving provider auth, emitting request telemetry, and deciding how provider catalogs should be sourced. That made the manager harder to reuse for providers whose model catalog is not fetched from the OpenAI `/models` endpoint, such as Amazon Bedrock. This change moves provider-specific model discovery behind provider-owned implementations, so the models manager can focus on refresh policy, cache behavior, picker ordering, and model metadata merging. ## What Changed - Introduced a `ModelsManager` trait with separate `OpenAiModelsManager` and `StaticModelsManager` implementations. - Added `ModelsEndpointClient` so OpenAI-compatible HTTP fetching lives outside `codex-models-manager`. - Moved `/models` request construction, provider auth resolution, timeout handling, and request telemetry into `codex-model-provider` via `OpenAiModelsEndpoint`. - Added provider-owned `models_manager(...)` construction so configured OpenAI-compatible providers use `OpenAiModelsManager`, while static/catalog-backed providers can return `StaticModelsManager`. - Added an Amazon Bedrock static model catalog for the GPT OSS Bedrock model IDs. - Updated core/session/thread manager code and tests to depend on `Arc<dyn ModelsManager>`. - Moved offline model test helpers into `codex_models_manager::test_support`. ## Metadata References The Bedrock catalog metadata is based on the official Amazon Bedrock OpenAI model documentation: - [Amazon Bedrock OpenAI models](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-openai.html) lists the Bedrock model IDs, text input/output modalities, and `128,000` token context window for `gpt-oss-20b` and `gpt-oss-120b`. - [Amazon Bedrock `gpt-oss-120b` model card](https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-openai-gpt-oss-120b.html) lists the `bedrock-runtime` model ID `openai.gpt-oss-120b-1:0`, the `bedrock-mantle` model ID `openai.gpt-oss-120b`, text-only modalities, and `128K` context window. - [OpenAI `gpt-oss-120b` model docs](https://developers.openai.com/api/docs/models/gpt-oss-120b) document configurable reasoning effort with `low`, `medium`, and `high`, plus text input/output modality. The display names, default reasoning effort, and priority ordering are Codex-local catalog choices. ## Test Plan - Manually verified app-server model listing with an AWS profile: ```shell CODEX_HOME="$(mktemp -d)" cargo run -p codex-app-server-test-client -- \ --codex-bin ./target/debug/codex \ -c 'model_provider="amazon-bedrock"' \ -c 'model_providers.amazon-bedrock.aws.profile="codex-bedrock"' \ -c 'model_providers.amazon-bedrock.aws.region="us-west-2"' \ model-list ``` The response returned the Bedrock catalog with `openai.gpt-oss-120b-1:0` as the default model and `openai.gpt-oss-20b-1:0` as the second listed model, both text-only and supporting low/medium/high reasoning effort.	2026-04-24 04:28:25 +00:00
Michael Bolin	040976b218	tests: isolate approval fixtures from host rules (#18288 ) ## Why Several approval-focused tests were unintentionally sensitive to host-level rule files. On machines with broader allowed command prefixes, commonly allowed commands such as `/bin/date` could bypass the approval path these tests were meant to exercise, making the fixtures depend on the developer or CI host configuration. ## What changed - Pins the approval matrix fixture to the explicit user reviewer so it does not inherit a host reviewer. - Changes OTel approval fixtures to request `/usr/bin/touch codex-otel-approval-test`, avoiding a command that may be pre-approved by local rules. - Clears the config layer stack for the permissions-message assertion that needs to compare only the permissions text under test. ## Verification - `env -u CODEX_SANDBOX_NETWORK_DISABLED cargo test -p codex-core --test all approval_matrix_covers_all_modes -- --nocapture` - `env -u CODEX_SANDBOX_NETWORK_DISABLED cargo test -p codex-core --test all permissions_messages -- --nocapture`	2026-04-23 14:12:09 -07:00
Eric Traut	a50cb205b7	Stabilize plugin MCP tools test (#19191 ) ## Summary The plugin MCP tool-listing test could hide MCP startup failures by polling `ListMcpTools` until its own 30s deadline. If the plugin MCP server startup had already failed or timed out, the session-owned MCP manager would keep returning an empty tool list, so CI only reported `discovered tools: []` instead of the startup state that mattered. This makes the test synchronize on `McpStartupComplete` for the sample plugin MCP server before asserting listed tools, and gives the Bazel-launched test server a larger startup window. ## Notes Confidence is about 80%. The source path strongly supports the RCA: a failed MCP startup is represented as an empty tool list through `ListMcpTools`, so the old polling contract could not distinguish "not ready yet" from "startup already failed." I could not retrieve the CI execution-log artifact to confirm the exact hidden startup error, but the observed Ubuntu Bazel failure matches this path: repeated `ListMcpTools` responses with no tools until the test-local timeout fired. I think this is the right solution because it keeps plugin behavior unchanged and fixes only the test contract. Future startup failures should now report the `McpStartupComplete` failure/cancellation instead of timing out on an empty tool snapshot. This test was introduced in https://github.com/openai/codex/pull/12864.	2026-04-23 14:08:40 -07:00
Michael Bolin	f90cc0ee64	tui: carry permission profiles on user turns (#18285 ) ## Why Per-turn permission overrides should use the same canonical profile abstraction as session configuration. That lets TUI submissions preserve exact configured permissions without round-tripping through legacy sandbox fields. ## What changed This adds `permission_profile` to user-turn operations, threads it through TUI/app-server submission paths, fills the new field in existing test fixtures, and adds coverage that composer submission includes the configured profile. ## Verification - `cargo test -p codex-tui permissions -- --nocapture` - `cargo test -p codex-core --test all permissions_messages -- --nocapture` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18285). * #18288 * #18287 * #18286 * __->__ #18285	2026-04-23 11:54:17 -07:00
jif-oai	a2f868c9d6	feat: drop spawned-agent context instructions (#19127 ) ## Why MultiAgentV2 children should not receive an extra model-visible developer fragment just because they were spawned. The parent/configured developer instructions should carry through normally, but the dedicated `<spawned_agent_context>` block is no longer desired. ## What changed - Removed the `SpawnAgentInstructions` context fragment and its `<spawned_agent_context>` wrapper. - Stopped appending spawned-agent instructions in `codex-rs/core/src/tools/handlers/multi_agents_v2/spawn.rs`. - Updated subagent notification coverage to assert inherited parent developer instructions without expecting the spawned-agent wrapper. ## Verification - `cargo test -p codex-core --test all spawned_multi_agent_v2_child_inherits_parent_developer_context -- --nocapture` - `cargo test -p codex-core --test all skills_toggle_skips_instructions_for_parent_and_spawned_child -- --nocapture` - `cargo test -p codex-core --test all subagent_notifications -- --nocapture`	2026-04-23 18:54:45 +02:00
Abhinav	305825abd9	Support MCP tools in hooks (#18385 ) ## Summary Lifecycle hooks currently treat `PreToolUse`, `PostToolUse`, and `PermissionRequest` as Bash-only flows - hook schema constrains `tool_name` to `Bash` - hook input assumes a command-shaped `tool_input` - core hook dispatch path passes only shell command strings That means hooks cannot target MCP tools even though MCP tool names are model-visible and stable This change generalizes those hook paths so they can match and receive payloads for MCP tools while preserving the existing Bash behavior. ## Reviewer Notes I think these are the key files - `codex-rs/core/src/tools/handlers/mcp.rs` - `codex-rs/core/src/mcp_tool_call.rs` Otherwise the changes across apply_patch, shell, and unified_exec are mainly to rewire everything to be `tool_input` based instead of just `command` so that it'll make sense for MCP tools. ## Changes - Allow `PreToolUse`, `PostToolUse`, and `PermissionRequest` hook inputs to carry arbitrary `tool_name` and `tool_input` values instead of hard-coding `Bash` and command-only payloads. - Add MCP hook payload support through `McpHandler`, using the model-visible tool name from `ToolInvocation` and the raw MCP arguments as `tool_input`. - Include MCP tool responses in `PostToolUse` by serializing `McpToolOutput` into the hook response payload. - Run `PermissionRequest` hooks for MCP approval requests after remembered approval checks and before falling back to user-facing MCP elicitation. - Preserve exact matching for literal hook matchers like `Bash` and `mcp__memory__create_entities`, while keeping regex matcher support for patterns like `mcp__memory__.` and `mcp__.__write.*`. --------- Co-authored-by: Andrei Eternal <eternal@openai.com> Co-authored-by: Codex <noreply@openai.com>	2026-04-23 07:33:57 +00:00
Eric Traut	bbff4ee61a	Add safety check notification and error handling (#19055 ) Adds a new app-server notification that fires when a user account has been flagged for potential safety reasons.	2026-04-22 22:24:12 -07:00
Andrei Eternal	2b2de3f38b	codex: support hooks in config.toml and requirements.toml (#18893 ) ## Summary Support the existing hooks schema in inline TOML so hooks can be configured from both `config.toml` and enterprise-managed `requirements.toml` without requiring a separate `hooks.json` payload. This gives enterprise admins a way to ship managed hook policy through the existing requirements channel while still leaving script delivery to MDM or other device-management tooling, and it keeps `hooks.json` working unchanged for existing users. This also lays the groundwork for follow-on managed filtering work such as #15937, while continuing to respect project trust gating from #14718. It does not implement `allow_managed_hooks_only` itself. NOTE: yes, it's a bit unfortunate that the toml isn't formatted as closely as normal to our default styling. This is because we're trying to stay compatible with the spec for plugins/hooks that we'll need to support & the main usecase here is embedding into requirements.toml ## What changed - moved the shared hook serde model out of `codex-rs/hooks` into `codex-rs/config` so the same schema can power `hooks.json`, inline `config.toml` hooks, and managed `requirements.toml` hooks - added `hooks` support to both `ConfigToml` and `ConfigRequirementsToml`, including requirements-side `managed_dir` / `windows_managed_dir` - treated requirements-managed hooks as one constrained value via `Constrained`, so managed hook policy is merged atomically and cannot drift across requirement sources - updated hook discovery to load requirements-managed hooks first, then per-layer `hooks.json`, then per-layer inline TOML hooks, with a warning when a single layer defines both representations - threaded managed hook metadata through discovered handlers and exposed requirements hooks in app-server responses, generated schemas, and `/debug-config` - added hook/config coverage in `codex-rs/config`, `codex-rs/hooks`, `codex-rs/core/src/config_loader/tests.rs`, and `codex-rs/core/tests/suite/hooks.rs` ## Testing - `cargo test -p codex-config` - `cargo test -p codex-hooks` - `cargo test -p codex-app-server config_api` ## Documentation Companion updates are needed in the developers website repo for: - the hooks guide - the config reference, sample, basic, and advanced pages - the enterprise managed configuration guide --------- Co-authored-by: Michael Bolin <mbolin@openai.com>	2026-04-22 21:20:09 -07:00
Dylan Hurd	5e71da1424	feat(request-permissions) approve with strict review (#19050 ) ## Summary Allow the user to approve a request_permissions_tool request with the condition that all commands in the rest of the turn are reviewed by guardian, regardless of sandbox status. ## Testing - [x] Added unit tests - [x] Ran locally	2026-04-23 01:56:32 +00:00
Matthew Zeng	8f0a92c1e5	Fix relative stdio MCP cwd fallback (#19031 )	2026-04-22 17:52:17 -07:00
Andrei Eternal	eed0e07825	hooks: emit Bash PostToolUse when exec_command completes via write_stdin (#18888 ) Fixes #16246. ## Why `exec_command` already emits `PreToolUse`, but long-running unified exec commands that finish on a later `write_stdin` poll could miss the matching `PostToolUse`. That left the Bash hook lifecycle inconsistent, broke expectations around `tool_use_id` and `tool_input.command`, and meant `PostToolUse` block/replacement feedback could fail to replace the final session output before it reached model context. This keeps the fix scoped to the `exec_command` / `write_stdin` lifecycle. Broader non-Bash hook expansion is still out of scope here and remains tracked separately in #16732. ## What changed - Compute and store `PostToolUsePayload` while handlers still have access to their concrete output type, and carry `tool_use_id` through that payload. - Preserve the original hook-facing `exec_command` string through unified exec state (`ExecCommandRequest`, `ProcessEntry`, `PreparedProcessHandles`, and `ExecCommandToolOutput`) via `hook_command`, and remove the now-unused `session_command` output metadata. - Emit exactly one Bash `PostToolUse` for long-running `exec_command` sessions when a later `write_stdin` poll observes final completion, using the original `exec_command` call id and hook-facing command. - Keep one-shot `exec_command` behavior aligned with the same payload construction, including interactive completions that return a final result directly. - Apply `PostToolUse` block/replacement feedback before the final `write_stdin` completion output is sent back to the model. - Keep `write_stdin` itself out of `PreToolUse` matching so it continues to act as transport/polling for the original Bash tool call. - Restore plain matcher behavior for tool-name matchers such as `Bash` and `Edit\|Write`, while still treating patterns with regex characters (for example `mcp__.*`) as regexes. - Add unit coverage for unified exec payload construction and parallel session separation, plus a core integration regression that verifies a blocked `PostToolUse` replaces the final `write_stdin` output in model context. ## Testing - `cargo test -p codex-hooks` - `cargo test -p codex-core post_tool_use_payload` - `cargo test -p codex-core post_tool_use_blocks_when_exec_session_completes_via_write_stdin`	2026-04-22 17:14:22 -07:00
Michael Bolin	6ca038bbd1	rollout: persist turn permission profiles (#18281 ) ## Why Resume and reconstruction need to preserve the permissions that were active for each user turn. If rollouts only keep legacy sandbox fields, replay cannot faithfully represent profile-shaped overrides introduced earlier in the stack. ## What changed This records `permission_profile` on user-turn rollout events, reconstructs it through history/state extraction, and updates rollout reconstruction and related fixtures to keep the field explicit. ## Verification - `cargo test -p codex-core --test all permissions_messages -- --nocapture` - `cargo test -p codex-core --test all request_permissions -- --nocapture` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18281). * #18288 * #18287 * #18286 * #18285 * #18284 * #18283 * #18282 * __->__ #18281	2026-04-22 17:00:29 -07:00
Michael Bolin	d3dd0d759b	exec-server: expose arg0 alias root to fs sandbox (#19016 ) ## Why The post-merge `rust-ci-full` run for #18999 still failed the Ubuntu remote `suite::remote_env` sandboxed filesystem tests. That run checked out merge commit `ddde50c611e4800cb805f243ed3c50bbafe7d011`, so the arg0 guard lifetime fix was present. The Docker-backed failure had two remaining pieces: - The sandboxed filesystem helper needs to execute Codex through the `codex-linux-sandbox` arg0 alias path. The helper sandbox was only granting read access to the real Codex executable parent, so the alias parent also has to be visible inside the helper sandbox. - The remote-env tests were building sandbox contexts with `FileSystemSandboxContext::new()`, which captures the local test runner cwd. In the Docker remote exec-server, that host checkout path does not exist, so spawning the filesystem helper failed with `No such file or directory` before the helper could process the request. ## What Changed - Track all helper runtime read roots instead of a single root. - Add both the real Codex executable parent and the `codex-linux-sandbox` alias parent to sandbox readable roots. - Avoid sending an unused local cwd in remote filesystem sandbox contexts when the permission profile has no cwd-dependent entries. - Build the Docker remote-env test sandbox contexts with a cwd path that exists inside the container. - Add unit coverage for the alias-parent root and remote sandbox cwd handling. ## Verification - `cargo test -p codex-exec-server` - `cargo test -p codex-core remote_test_env_sandboxed_read_allows_readable_root` - `just fix -p codex-exec-server` - `just fix -p codex-core`	2026-04-22 21:34:22 +00:00
Michael Bolin	18a26d7bbc	app-server: accept permission profile overrides (#18279 ) ## Why `PermissionProfile` is becoming the canonical permissions shape shared by core and app-server. After app-server responses expose the active profile, clients need to be able to send that same shape back when starting, resuming, forking, or overriding a turn instead of translating through the legacy `sandbox`/`sandboxPolicy` shorthands. This still needs to preserve the existing requirements/platform enforcement model. A profile-shaped request can be downgraded or rejected by constraints, but the server should keep the user's elevated-access intent for project trust decisions. Turn-level profile overrides also need to retain existing read protections, including deny-read entries and bounded glob-scan metadata, so a permission override cannot accidentally drop configured protections such as `*/.env = deny`. ## What changed - Adds optional `permissionProfile` request fields to `thread/start`, `thread/resume`, `thread/fork`, and `turn/start`. - Rejects ambiguous requests that specify both `permissionProfile` and the legacy `sandbox`/`sandboxPolicy` fields, including running-thread resume requests. - Converts profile-shaped overrides into core runtime filesystem/network permissions while continuing to derive the constrained legacy sandbox projection used by existing execution paths. - Preserves project-trust intent for profile overrides that are equivalent to workspace-write or full-access sandbox requests. - Preserves existing deny-read entries and `globScanMaxDepth` when applying turn-level `permissionProfile` overrides. - Updates app-server docs plus generated JSON/TypeScript schema fixtures and regression coverage. ## Verification - `cargo test -p codex-app-server-protocol schema_fixtures` - `cargo test -p codex-core session_configuration_apply_permission_profile_preserves_existing_deny_read_entries` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18279). * #18288 * #18287 * #18286 * #18285 * #18284 * #18283 * #18282 * #18281 * #18280 * __->__ #18279	2026-04-22 13:34:33 -07:00
cassirer-openai	f67383bcba	[rollout_trace] Record core session rollout traces (#18877 ) ## Summary Wires rollout trace recording into `codex-core` session and turn execution. This records the core model request/response, compaction, and session lifecycle boundaries needed for replay without yet tracing every nested runtime/tool boundary. ## Stack This is PR 2/5 in the rollout trace stack. - [#18876](https://github.com/openai/codex/pull/18876): Add rollout trace crate - [#18877](https://github.com/openai/codex/pull/18877): Record core session rollout traces - [#18878](https://github.com/openai/codex/pull/18878): Trace tool and code-mode boundaries - [#18879](https://github.com/openai/codex/pull/18879): Trace sessions and multi-agent edges - [#18880](https://github.com/openai/codex/pull/18880): Add debug trace reduction command ## Review Notes This layer is the first live integration point. The important review question is whether trace recording is isolated from normal session behavior: trace failures should not become user-visible execution failures, and recording should preserve the existing turn/session lifecycle semantics. The PR depends on the reducer/data model from the first stack entry and only introduces the core recorder surface that later PRs use for richer runtime and relationship events.	2026-04-22 17:00:48 +00:00
rhan-oai	213b17b7a3	[codex-analytics] guardian review TTFT plumbing and emission (#17696 ) ## Why Guardian analytics includes time-to-first-token, but the Guardian reviewer runs as a normal Codex session and `TurnCompleteEvent` did not expose TTFT. The timing needs to flow through the standard turn-completion protocol so Guardian review analytics can consume the same value as the rest of the session machinery. ## What changed Adds optional `time_to_first_token_ms` to `TurnCompleteEvent` and populates it from `TurnTiming`. The value is carried through app-server thread history, rollout reconstruction, TUI/app-server adapters, and Guardian review session handling. Guardian review analytics now captures TTFT from the reviewer turn-complete event when available. Existing tests and fixtures are updated to set the new optional field to `None` where TTFT is not relevant. ## Verification - `cargo clippy -p codex-tui --tests -- -D warnings` - `cargo clippy -p codex-core --lib --tests -- -D warnings` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/17696). * __->__ #17696 * #17695 * #17693 * #18278 * #18953	2026-04-22 01:52:48 -07:00
Michael Bolin	e18fe7a07f	test(core): move prompt debug coverage to integration suite (#18916 ) ## Why `build_prompt_input` now initializes `ExecServerRuntimePaths`, which requires a configured Codex executable path. The previous inline unit test in `core/src/prompt_debug.rs` built a bare `test_config()` and then failed before it could assert anything useful: ```text Codex executable path is not configured ``` This coverage is also integration-shaped: it drives the public `build_prompt_input` entry point through config, thread, and session setup rather than testing a small internal helper in isolation. Bazel CI did not catch this earlier because the affected test was behind the same wrapped Rust unit-test path fixed by #18913. Before that launcher/sharding fix, the outer `workspace_root_test` changed the working directory for Insta compatibility while the inner `rules_rust` sharding wrapper still expected its runfiles working directory. In practice, Bazel could report success without executing the Rust test cases in that shard. Once #18913 makes the wrapper run the Rust test binary directly and shard with libtest arguments, this stale unit test actually runs and exposes the missing `codex_self_exe` setup. ## What Changed - Moved `build_prompt_input_includes_context_and_user_message` out of `core/src/prompt_debug.rs`. - Added `core/tests/suite/prompt_debug_tests.rs` and registered it from `core/tests/suite/mod.rs`. - Builds the test config with `ConfigBuilder` and provides `codex_self_exe` using the current test executable, matching the runtime-path invariant required by prompt debug setup. - Preserves the existing assertions that the generated prompt input includes both the debug user message and project-specific user instructions. ## Verification - `cargo test -p codex-core --test all prompt_debug_tests::build_prompt_input_includes_context_and_user_message` - `bazel test //codex-rs/core:core-all-test --test_arg=prompt_debug_tests::build_prompt_input_includes_context_and_user_message --test_output=errors` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18916). * #18913 * __->__ #18916	2026-04-22 01:08:25 +00:00
Felipe Coury	09ebc34f17	fix(core): emit hooks for apply_patch edits (#18391 ) Fixes https://github.com/openai/codex/issues/16732. ## Why `apply_patch` is Codex's primary file edit path, but it was not emitting `PreToolUse` or `PostToolUse` hook events. That meant hook-based policy, auditing, and write coordination could observe shell commands while missing the actual file mutation performed by `apply_patch`. The issue also exposed that the hook runtime serialized command hook payloads with `tool_name: "Bash"` unconditionally. Even if `apply_patch` supplied hook payloads, hooks would either fail to match it directly or receive misleading stdin that identified the edit as a Bash tool call. ## What Changed - Added `PreToolUse` and `PostToolUse` payload support to `ApplyPatchHandler`. - Exposed the raw patch body as `tool_input.command` for both JSON/function and freeform `apply_patch` calls. - Taught tool hook payloads to carry a handler-supplied hook-facing `tool_name`. - Preserved existing shell compatibility by continuing to emit `Bash` for shell-like tools. - Serialized the selected hook `tool_name` into hook stdin instead of hardcoding `Bash`. - Relaxed the generated hook command input schema so `tool_name` can represent tools other than `Bash`. ## Verification Added focused handler coverage for: - JSON/function `apply_patch` calls producing a `PreToolUse` payload. - Freeform `apply_patch` calls producing a `PreToolUse` payload. - Successful `apply_patch` output producing a `PostToolUse` payload. - Shell and `exec_command` handlers continuing to expose `Bash`. Added end-to-end hook coverage for: - A `PreToolUse` hook matching `^apply_patch$` blocking the patch before the target file is created. - A `PostToolUse` hook matching `^apply_patch$` receiving the patch input and tool response, then adding context to the follow-up model request. - Non-participating tools such as the plan tool continuing not to emit `PreToolUse`/`PostToolUse` hook events. Also validated manually with a live `codex exec` smoke test using an isolated temp workspace and temp `CODEX_HOME`. The smoke test confirmed that a real `apply_patch` edit emits `PreToolUse`/`PostToolUse` with `tool_name: "apply_patch"`, a shell command still emits `tool_name: "Bash"`, and a denying `PreToolUse` hook prevents the blocked patch file from being created.	2026-04-21 22:00:40 -03:00
starr-openai	1d4cc494c9	Add turn-scoped environment selections (#18416 ) ## Summary - add experimental turn/start.environments params for per-turn environment id + cwd selections - pass selections through core protocol ops and resolve them with EnvironmentManager before TurnContext creation - treat omitted selections as default behavior, empty selections as no environment, and non-empty selections as first environment/cwd as the turn primary ## Testing - ran `just fmt` - ran `just write-app-server-schema` - not run: unit tests for this stacked PR --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-21 17:48:33 -07:00
starr-openai	ddbe2536be	Support multiple managed environments (#18401 ) ## Summary - refactor EnvironmentManager to own keyed environments with default/local lookup helpers - keep remote exec-server client creation lazy until exec/fs use - preserve disabled agent environment access separately from internal local environment access ## Validation - not run (per Codex worktree instruction to avoid tests/builds unless requested) --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-21 15:29:35 -07:00
efrazer-oai	be75785504	fix: fully revert agent identity runtime wiring (#18757 ) ## Summary This PR fully reverts the previously merged Agent Identity runtime integration from the old stack: https://github.com/openai/codex/pull/17387/changes It removes the Codex-side task lifecycle wiring, rollout/session persistence, feature flag plumbing, lazy `auth.json` mutation, background task auth paths, and request callsite changes introduced by that stack. This leaves the repo in a clean pre-AgentIdentity integration state so the follow-up PRs can reintroduce the pieces in smaller reviewable layers. ## Stack 1. This PR: full revert 2. https://github.com/openai/codex/pull/18871: move Agent Identity business logic into a crate 3. https://github.com/openai/codex/pull/18785: add explicit AgentIdentity auth mode and startup task allocation 4. https://github.com/openai/codex/pull/18811: migrate auth callsites through AuthProvider ## Testing Tests: targeted Rust checks, cargo-shear, Bazel lock check, and CI.	2026-04-21 14:30:55 -07:00
jif-oai	15b8cde2a4	chore: default multi-agent v2 fork to all (#18873 ) Default sub-agents v2 to `all` for the fork mode	2026-04-21 21:54:58 +01:00
Michael Bolin	f8562bd47b	sandboxing: intersect permission profiles semantically (#18275 ) ## Why Permission approval responses must not be able to grant more access than the tool requested. Moving this flow to `PermissionProfile` means the comparison must be profile-shaped instead of `SandboxPolicy`-shaped, and cwd-relative special paths such as `:cwd` and `:project_roots` must stay anchored to the turn that produced the request. ## What changed This implements semantic `PermissionProfile` intersection in `codex-sandboxing` for file-system and network permissions. The intersection accepts narrower path grants, rejects broader grants, preserves deny-read carve-outs and glob scan depth, and materializes cwd-dependent special-path grants to absolute paths before they can be recorded for reuse. The request-permissions response paths now use that intersection consistently. App-server captures the request turn cwd before waiting for the client response, includes that cwd in the v2 approval params, and core stores the requested profile plus cwd for direct TUI/client responses and Guardian decisions before recording turn- or session-scoped grants. The TUI app-server bridge now preserves the app-server request cwd when converting permission approval params into core events. ## Verification - `cargo test -p codex-sandboxing intersect_permission_profiles -- --nocapture` - `cargo test -p codex-app-server request_permissions_response -- --nocapture` - `cargo test -p codex-core request_permissions_response_materializes_session_cwd_grants_before_recording -- --nocapture` - `cargo check -p codex-tui --tests` - `cargo check --tests` - `cargo test -p codex-tui app_server_request_permissions_preserves_file_system_permissions`	2026-04-21 10:23:01 -07:00
pakrym-oai	2a226096f6	Split DeveloperInstructions into individual fragments. (#18813 ) Split DeveloperInstructions into individual fragments.	2026-04-21 10:22:36 -07:00
pash-openai	dc1a8f2190	[tool search] support namespaced deferred dynamic tools (#18413 ) Deferred dynamic tools need to round-trip a namespace so a tool returned by `tool_search` can be called through the same registry key that core uses for dispatch. This change adds namespace support for dynamic tool specs/calls, persists it through app-server thread state, and routes dynamic tool calls by full `ToolName` while still sending the app the leaf tool name. Deferred dynamic tools must provide a namespace; non-deferred dynamic tools may remain top-level. It also introduces `LoadableToolSpec` as the shared function-or-namespace Responses shape used by both `tool_search` output and dynamic tool registration, so dynamic tools use the same wrapping logic in both paths. Validation: - `cargo test -p codex-tools` - `cargo test -p codex-core tool_search` --------- Co-authored-by: Sayan Sisodiya <sayan@openai.com>	2026-04-21 14:13:08 +08:00
Celia Chen	cefcfe43b9	feat: add a built-in Amazon Bedrock model provider (#18744 ) ## Why Codex needs a first-class `amazon-bedrock` model provider so users can select Bedrock without copying a full provider definition into `config.toml`. The provider has Codex-owned defaults for the pieces that should stay consistent across users: the display `name`, Bedrock `base_url`, and `wire_api`. At the same time, users still need a way to choose the AWS credential profile used by their local environment. This change makes `amazon-bedrock` a partially modifiable built-in provider: code owns the provider identity and endpoint defaults, while user config can set `model_providers.amazon-bedrock.aws.profile`. For example: ```toml model_provider = "amazon-bedrock" [model_providers.amazon-bedrock.aws] profile = "codex-bedrock" ``` ## What Changed - Added `amazon-bedrock` to the built-in model provider map with: - `name = "Amazon Bedrock"` - `base_url = "https://bedrock-mantle.us-east-1.api.aws/v1"` - `wire_api = "responses"` - Added AWS provider auth config with a profile-only shape: `model_providers.<id>.aws.profile`. - Kept AWS auth config restricted to `amazon-bedrock`; custom providers that set `aws` are rejected. - Allowed `model_providers.amazon-bedrock` through reserved-provider validation so it can act as a partial override. - During config loading, only `aws.profile` is copied from the user-provided `amazon-bedrock` entry onto the built-in provider. Other Bedrock provider fields remain hard-coded by the built-in definition. - Updated the generated config schema for the new provider AWS profile config.	2026-04-21 00:54:05 +00:00
guinness-oai	ca3246f77a	[codex] Send realtime transcript deltas on handoff (#18761 ) ## Summary - Track how many realtime transcript entries have already been attached to a background-agent handoff. - Attach only entries added since the previous handoff as `<transcript_delta>` instead of resending the accumulated transcript snapshot. - Update the realtime integration test so the second delegation carries only the second transcript delta. ## Validation - `just fmt` - `cargo test -p codex-api` - `cargo test -p codex-core inbound_handoff_request_sends_transcript_delta_after_each_handoff` - `cargo build -p codex-cli -p codex-app-server` ## Manual testing Built local debug binaries at: - `codex-rs/target/debug/codex` - `codex-rs/target/debug/codex-app-server`	2026-04-20 16:46:15 -07:00
viyatb-oai	33fa952426	fix: fix stale proxy env restoration after shell snapshots (#17271 ) ## Summary This fixes a stale-environment path in shell snapshot restoration. A sandboxed command can source a shell snapshot that was captured while an older proxy process was running. If that proxy has died and come back on a different port, the snapshot can otherwise put old proxy values back into the command environment, which is how tools like `pip` end up talking to a dead proxy. The wrapper now captures the live process environment before sourcing the snapshot and then restores or clears every proxy env var from the proxy crate's canonical list. That makes proxy state after shell snapshot restoration match the current command environment, rather than whatever proxy values happened to be present in the snapshot. On macOS, the Codex-generated `GIT_SSH_COMMAND` is refreshed when the SOCKS listener changes, while custom SSH wrappers are still left alone. --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-20 16:39:17 -07:00
guinness-oai	1029742cf7	Add realtime silence tool (#18635 ) ## Summary Adds a second realtime v2 function tool, `remain_silent`, so the realtime model has an explicit non-speaking action when the collaboration mode or latest context says it should not answer aloud. This is stacked on #18597. ## Design - Advertise `remain_silent` alongside `background_agent` in realtime v2 conversational sessions. - Parse `remain_silent` function calls into a typed `RealtimeEvent::NoopRequested` event. - Have core answer that function call with an empty `function_call_output` and deliberately avoid `response.create`, so no follow-up realtime response is requested. - Keep the event hidden from app-server/TUI surfaces; it is operational plumbing, not user-visible conversation content.	2026-04-20 15:43:20 -07:00
Thibault Sottiaux	54bd07d28c	[codex] prefer inherited spawn agent model (#18701 ) This updates the spawn-agent tool contract so subagents are presented as inheriting the parent model by default. The visible model list is now framed as optional overrides, the model parameter tells callers to leave it unset and the delegation guidance no longer nudges models toward picking a smaller/mini override. Fixes reports that 5.4 would occasionally pick 5.2 or lower as sub-agents.	2026-04-20 22:34:08 +00:00
guinness-oai	126bd6e7a8	Update realtime handoff transcript handling (#18597 ) ## Summary This PR aims to improve integration between the realtime model and the codex agent by sharing more context with each other. In particular, we now share full realtime conversation transcript deltas in addition to the delegation message. realtime_conversation.rs now turns a handoff into: ``` <realtime_delegation> <input>...</input> <transcript_delta>...</transcript_delta> </realtime_delegation> ``` ## Implementation notes The transcript is accumulated in the realtime websocket layer as parsed realtime events arrive. When a background-agent handoff is requested, the current transcript snapshot is copied onto the handoff event and then serialized by `realtime_conversation.rs` into the hidden realtime delegation envelope that Codex receives as user-turn context. For Realtime V2, the session now explicitly enables input audio transcription, and the parser handles the relevant input/output transcript completion events so the snapshot includes both user speech and realtime model responses. The delegation `<input>` remains the actual handoff request, while `<transcript_delta>` carries the surrounding conversation history for context. Reviewers should note that the transcript payload is intended for Codex context sharing, not UI rendering. The realtime delegation envelope should stay hidden from the user-facing transcript surface, while still being included in the background-agent turn so Codex can answer with the same conversational context the realtime model had.	2026-04-20 14:04:09 -07:00
Ahmed Ibrahim	316cf0e90b	Update models.json (#18586 ) - Replace the active models-manager catalog with the deleted core catalog contents. - Replace stale hardcoded test model slugs with current bundled model slugs. - Keep this as a stacked change on top of the cleanup PR.	2026-04-20 10:27:01 -07:00
Michael Bolin	dcec516313	protocol: canonicalize file system permissions (#18274 ) ## Why `PermissionProfile` needs stable, canonical file-system semantics before it can become the primary runtime permissions abstraction. Without a canonical form, callers have to keep re-deriving legacy sandbox maps and profile comparisons remain lossy or order-dependent. ## What changed This adds canonicalization helpers for `FileSystemPermissions` and `PermissionProfile`, expands special paths into explicit sandbox entries, and updates permission request/conversion paths to consume those canonical entries. It also tightens the legacy bridge so root-wide write profiles with narrower carveouts are not silently projected as full-disk legacy access. ## Verification - `cargo test -p codex-protocol root_write_with_read_only_child_is_not_full_disk_write -- --nocapture` - `cargo test -p codex-sandboxing permission -- --nocapture` - `cargo test -p codex-tui permissions -- --nocapture`	2026-04-20 09:57:03 -07:00
jif-oai	ff6a5804d2	nit: telepathy to chronicle in tests (#18652 )	2026-04-20 11:51:55 +01:00
Dylan Hurd	49403e3676	chore(multiagent) skills instructions toggle (#18596 ) ## Summary Support toggling the skills message off. ## Test Plan - [x] Updated unit tests	2026-04-19 21:11:52 -07:00
Adrian	e5b52a3caa	Persist and prewarm agent tasks per thread (#17978 ) ## Summary - persist registered agent tasks in the session state update stream so the thread can reuse them - prewarm task registration once identity registration succeeds, while keeping startup failures best-effort - isolate the session-side task lifecycle into a dedicated module so AgentIdentityManager and RegisteredAgentTask do not leak across as many core layers ## Testing - cargo test -p codex-core startup_agent_task_prewarm - cargo test -p codex-core cached_agent_task_for_current_identity_clears_stale_task - cargo test -p codex-core record_initial_history_	2026-04-19 15:45:28 -07:00
Ahmed Ibrahim	996aa23e4c	[5/6] Wire executor-backed MCP stdio (#18212 ) ## Summary - Add the executor-backed RMCP stdio transport. - Wire MCP stdio placement through the executor environment config. - Cover local and executor-backed stdio paths with the existing MCP test helpers. ## Stack ```text o #18027 [6/6] Fail exec client operations after disconnect │ @ #18212 [5/6] Wire executor-backed MCP stdio │ o #18087 [4/6] Abstract MCP stdio server launching │ o #18020 [3/6] Add pushed exec process events │ o #18086 [2/6] Support piped stdin in exec process API │ o #18085 [1/6] Add MCP server environment config │ o main ``` --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-18 21:47:43 -07:00
pakrym-oai	53b1570367	Update image outputs to default to high detail (#18386 ) Do not assume the default `detail`.	2026-04-18 11:01:12 -07:00
jif-oai	e3c2acb9cd	Revert "[codex] drain mailbox only at request boundaries" (#18325 ) ## Summary - Reverts PR #17749 so queued inter-agent mail can again preempt after reasoning/commentary output item boundaries. - Applies the revert to the current `codex/turn.rs` module layout and restores the prior pending-input test expectations/snapshots. ## Testing - `just fmt` - `cargo test -p codex-core --test all pending_input` - `cargo test -p codex-core` failed in unrelated `tools::js_repl::tests::js_repl_imported_local_files_can_access_repl_globals`: dotslash download hit `mktemp: mkdtemp failed ... Operation not permitted` in the sandbox temp dir. Co-authored-by: Codex <noreply@openai.com>	2026-04-18 09:53:48 -07:00
Ahmed Ibrahim	5bb193aa88	Add max context window model metadata (#18382 ) Adds max_context_window to model metadata and routes core context-window reads through resolved model info. Config model_context_window overrides are clamped to max_context_window when present; without an override, the model context_window is used.	2026-04-17 21:48:14 -07:00
pakrym-oai	120bbf46c1	Update image resizing to fit 2048 square bounds (#18384 ) We don't have to downsize to 768 height.	2026-04-17 16:31:03 -07:00
richardopenai	139fa8b8f2	[codex] Propagate rate limit reached type (#18227 ) ## Summary First PR in the split from #17956. - adds the core/app-server `RateLimitReachedType` shape - maps backend `rate_limit_reached_type` into Codex rate-limit snapshots - carries the field through app-server notifications/responses and generated schemas - updates existing constructors/tests for the new optional field ## Validation - `cargo test -p codex-backend-client` - `cargo test -p codex-app-server-protocol` - `cargo test -p codex-app-server rate_limits` - `cargo test -p codex-tui workspace_` - `cargo test -p codex-tui status_` - `just fmt` - `just fix -p codex-backend-client` - `just fix -p codex-app-server-protocol` - `just fix -p codex-app-server` - `just fix -p codex-tui`	2026-04-17 13:37:25 -07:00
Michael Bolin	2c2ed51876	ci: make Windows Bazel clippy catch core test imports (#18350 ) ## Why Unused imports in `core/tests/suite/unified_exec.rs` in the Windows build were not caught by Bazel CI on https://github.com/openai/codex/pull/18096. I spot-checked https://github.com/openai/codex/actions/workflows/rust-ci-full.yml?query=branch%3Amain and noticed that builds were consistently red. This revealed that our Cargo builds _were_ properly catching these issues, identifying a Windows-specific coverage hole in the Bazel clippy job. The Windows Bazel clippy job uses `--skip_incompatible_explicit_targets` so it can lint a broad target set without failing immediately on targets that are genuinely incompatible with Windows. However, with the default Windows host platform, `rust_test` targets such as `//codex-rs/core:core-all-test` could be skipped before the clippy aspect reached their integration-test modules. As a result, the imports in `core/tests/suite/unified_exec.rs` were not being linted by the Windows Bazel clippy job at all. The clippy diagnostic that Windows Bazel should have surfaced was: ```text error: unused import: `codex_config::Constrained` --> core\tests\suite\unified_exec.rs:8:5 \| 8 \| use codex_config::Constrained; \| ^^^^^^^^^^^^^^^^^^^^^^^^^ \| = note: `-D unused-imports` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(unused_imports)]` error: unused import: `codex_protocol::permissions::FileSystemAccessMode` --> core\tests\suite\unified_exec.rs:11:5 \| 11 \| use codex_protocol::permissions::FileSystemAccessMode; \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ error: unused import: `codex_protocol::permissions::FileSystemPath` --> core\tests\suite\unified_exec.rs:12:5 \| 12 \| use codex_protocol::permissions::FileSystemPath; \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ error: unused import: `codex_protocol::permissions::FileSystemSandboxEntry` --> core\tests\suite\unified_exec.rs:13:5 \| 13 \| use codex_protocol::permissions::FileSystemSandboxEntry; \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ error: unused import: `codex_protocol::permissions::FileSystemSandboxPolicy` --> core\tests\suite\unified_exec.rs:14:5 \| 14 \| use codex_protocol::permissions::FileSystemSandboxPolicy; \| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``` ## What changed - Run the Windows Bazel clippy job with the MSVC host platform via `--windows-msvc-host-platform`, matching the Windows Bazel test job. This keeps `--skip_incompatible_explicit_targets` while ensuring Windows `rust_test` targets such as `//codex-rs/core:core-all-test` are still linted. - Remove the unused imports from `core/tests/suite/unified_exec.rs`. - Add `--print-failed-action-summary` to `.github/scripts/run-bazel-ci.sh` so Bazel action failures can be summarized after the build exits. ## Failure reporting Once the coverage issue was fixed, an intentionally reintroduced unused import made the Windows Bazel clippy job fail as expected. That exposed a separate usability problem: because the job keeps `--keep_going`, the top-level Bazel output could still end with: ```text ERROR: Build did NOT complete successfully FAILED: ``` without the underlying rustc/clippy diagnostic being visible in the obvious part of the GitHub Actions log. To keep `--keep_going` while making failures actionable, the wrapper now scans the captured Bazel console output for failed actions and prints the matching rustc/clippy diagnostic block. When a diagnostic block is found, it is emitted both as a GitHub `::error` annotation and as plain expanded log output, rather than being hidden in a collapsed group. ## Verification To validate the CI path, I intentionally introduced an unused import in `core/tests/suite/unified_exec.rs`. The Windows Bazel clippy job failed as expected, confirming that the integration-test module is now covered by Bazel clippy. The same failure also verified that the wrapper surfaces the matching clippy diagnostics directly in the Actions output.	2026-04-17 18:19:58 +00:00
sayan-oai	6991be7ead	enable tool search over dynamic tools (#18263 ) ## Summary - Normalize deferred MCP and dynamic tools into `ToolSearchEntry` values before constructing `ToolSearchHandler`. - Move the tool-search entry adapter out of `tools/handlers` and into `tools/tool_search_entry.rs` so the handlers directory stays focused on handlers. - Keep `ToolSearchHandler` operating over one generic entry list for BM25 search, namespace grouping, and per-bucket default limits. ## Why Follow-up cleanup for #17849. The dynamic tool-search support made the handler juggle source-specific MCP and dynamic tool lists, index arithmetic, output conversion, and namespace emission. This keeps source adaptation outside the handler so the search loop itself is smaller and source-agnostic. ## Validation - `just fmt` - `cargo test -p codex-core tools::handlers::tool_search::tests` - `git diff --check` - `cargo test -p codex-core` currently fails in unrelated `plugins::manager::tests::list_marketplaces_ignores_installed_roots_missing_from_config`; rerunning that single test fails the same way at `core/src/plugins/manager_tests.rs:1692`. --------- Co-authored-by: pash <pash@openai.com>	2026-04-18 02:07:59 +08:00
jif-oai	cfc23eee3d	feat: config aliases (#18140 ) Rename `no_memories_if_mcp_or_web_search` → `disable_on_external_context` with backward compatibility While doing so, we add a key alias system on our layer merging system. What we try to avoid is a case where a company managed config use an old name while the user has a new name in it's local config (which would make the deserialization fail)	2026-04-17 18:26:09 +01:00
Abhinav	8494e5bd7b	Add PermissionRequest hooks support (#17563 ) ## Why We need `PermissionRequest` hook support! Also addresses: - https://github.com/openai/codex/issues/16301 - run a script on Hook to do things like play a sound to draw attention but actually no-op so user can still approve - can omit the `decision` object from output or just have the script exit 0 and print nothing - https://github.com/openai/codex/issues/15311 - let the script approve/deny on its own - external UI what will run on Hook and relay decision back to codex ## Reviewer Note There's a lot of plumbing for the new hook, key files to review are: - New hook added in `codex-rs/hooks/src/events/permission_request.rs` - Wiring for network approvals `codex-rs/core/src/tools/network_approval.rs` - Wiring for tool orchestrator `codex-rs/core/src/tools/orchestrator.rs` - Wiring for execve `codex-rs/core/src/tools/runtimes/shell/unix_escalation.rs` ## What - Wires shell, unified exec, and network approval prompts into the `PermissionRequest` hook flow. - Lets hooks allow or deny approval prompts; quiet or invalid hooks fall back to the normal approval path. - Uses `tool_input.description` for user-facing context when it helps: - shell / `exec_command`: the request justification, when present - network approvals: `network-access <domain>` - Uses `tool_name: Bash` for shell, unified exec, and network approval permission-request hooks. - For network approvals, passes the originating command in `tool_input.command` when there is a single owning call; otherwise falls back to the synthetic `network-access ...` command. <details> <summary>Example `PermissionRequest` hook input for a shell approval</summary> ```json { "session_id": "<session-id>", "turn_id": "<turn-id>", "transcript_path": "/path/to/transcript.jsonl", "cwd": "/path/to/cwd", "hook_event_name": "PermissionRequest", "model": "gpt-5", "permission_mode": "default", "tool_name": "Bash", "tool_input": { "command": "rm -f /tmp/example" } } ``` </details> <details> <summary>Example `PermissionRequest` hook input for an escalated `exec_command` request</summary> ```json { "session_id": "<session-id>", "turn_id": "<turn-id>", "transcript_path": "/path/to/transcript.jsonl", "cwd": "/path/to/cwd", "hook_event_name": "PermissionRequest", "model": "gpt-5", "permission_mode": "default", "tool_name": "Bash", "tool_input": { "command": "cp /tmp/source.json /Users/alice/export/source.json", "description": "Need to copy a generated file outside the workspace" } } ``` </details> <details> <summary>Example `PermissionRequest` hook input for a network approval</summary> ```json { "session_id": "<session-id>", "turn_id": "<turn-id>", "transcript_path": "/path/to/transcript.jsonl", "cwd": "/path/to/cwd", "hook_event_name": "PermissionRequest", "model": "gpt-5", "permission_mode": "default", "tool_name": "Bash", "tool_input": { "command": "curl http://codex-network-test.invalid", "description": "network-access http://codex-network-test.invalid" } } ``` </details> ## Follow-ups - Implement the `PermissionRequest` semantics for `updatedInput`, `updatedPermissions`, `interrupt`, and suggestions / `permission_suggestions` - Add `PermissionRequest` support for the `request_permissions` tool path --------- Co-authored-by: Codex <noreply@openai.com>	2026-04-17 14:45:47 +00:00
sayan-oai	d0047de7cb	add token-based tool deferral behind feature flag (#18097 ) add new `tool_search_always_defer_mcp_tools` feature flag that always defers all mcp tools rather than deferring once > 100 deferrable tools. add new tests, also move `mcp_exposure` tests into dedicated file rather than polluting `codex_tests`.	2026-04-17 18:34:06 +08:00

1 2 3 4 5 ...

1087 Commits