codex

mirror of https://github.com/openai/codex.git synced 2026-05-21 19:45:26 +00:00

Author	SHA1	Message	Date
Michael Bolin	8a75001bfb	merge commit for archive created by Sapling	2026-05-12 18:28:43 -07:00
Michael Bolin	4d1ad0b7f9	config: add strict config parsing	2026-05-12 18:28:35 -07:00
Michael Bolin	2dcae67ab3	merge commit for archive created by Sapling	2026-05-12 18:06:50 -07:00
Michael Bolin	e5a8668c86	config: add strict config parsing	2026-05-12 18:06:41 -07:00
Michael Bolin	351f911c4c	Merge `68a66c514b` into sapling-pr-archive-bolinfest	2026-05-12 17:57:19 -07:00
Michael Bolin	68a66c514b	config: add strict config parsing	2026-05-12 17:57:13 -07:00
Michael Bolin	4f1c824d55	Merge `ad0b1199f0` into sapling-pr-archive-bolinfest	2026-05-12 17:56:53 -07:00
Michael Bolin	ad0b1199f0	config: add strict config parsing	2026-05-12 17:56:47 -07:00
Michael Bolin	35a82ef83e	merge commit for archive created by Sapling	2026-05-12 17:41:42 -07:00
Michael Bolin	5a0813d812	app-server: select permission profiles by id	2026-05-12 17:40:17 -07:00
Michael Bolin	8bafd4f6a5	permissions: move workspace roots onto thread state	2026-05-12 17:40:17 -07:00
Michael Bolin	273135dc31	Merge `dc0f61af1f` into sapling-pr-archive-bolinfest	2026-05-12 17:30:38 -07:00
Tom	c51c65ad09	Unify thread metadata updates above store (#22236 ) - make ThreadStore::update_thread_metadata accept a broad range of metadata patches - keep ThreadStore::append_items as raw canonical history append (no metadata side effects) - in the local store, write these metadata updates to a combination of sqlite and rollout jsonl files for backwards-compat. It special cases which fields need to go into jsonl vs sqlite vs whatever, confining the awkwardness to just this implementation - in remote stores we can simply persist the metadata directly to a database, no special casing required. - move the "implicit metadata updates triggered by appending rollout items" from the RolloutRecorder (which is local-threadstore-specific) to the LiveThread layer above the ThreadStore, inside of a private helper utility called ThreadMetadataSync. LiveThread calls ThreadStore append_items and update_metadata separately. - Add a generic update metadata method to ThreadManager that works on both live threads and "cold" threads - Call that ThreadManager method from app server code, so app server doesn't need to worry about whether the thread is live or not	2026-05-13 00:28:15 +00:00
Michael Bolin	251d79117e	Merge `8153d12590` into sapling-pr-archive-bolinfest	2026-05-12 17:24:10 -07:00
pakrym-oai	f11ad1eacb	[codex] Add search term coverage for tool_search (#22398 ) ## Why `tool_search` already had solid end-to-end coverage for discovery and follow-up execution, but it did not prove that distinct pieces of indexed search text actually work in integration. In particular, we were not exercising whether unique tool names, descriptions, namespaces, underscore-expanded dynamic names, and schema-property terms were sufficient to surface the expected deferred tools. This change adds focused integration coverage for those term sources so regressions in search text construction are caught by a real `TestCodex` flow instead of only by lower-level unit tests. ## What changed - added a small helper in `core/tests/suite/search_tool.rs` to assert that a `tool_search_output` contains an expected namespace child tool - added an MCP integration test that issues several `tool_search_call`s and verifies distinct query terms match the expected app tools: - exact tool name: `calendar_timezone_option_99` - tool description phrase: `uploaded document` - top-level schema property: `starts_at` - added a dynamic-tool integration test that verifies distinct query terms match the expected deferred dynamic tool: - exact name: `quasar_ping_beacon` - underscore-expanded name: `quasar ping beacon` - description phrase: `saffron metronome` - namespace: `orbit_ops` - schema property: `chrono_spec` ## Validation - `cargo test -p codex-core tool_search_matches_` ## Docs No documentation update needed.	2026-05-13 00:24:07 +00:00
Michael Bolin	8153d12590	config: add strict config parsing	2026-05-12 17:23:56 -07:00
Michael Bolin	9e7cdbd0d2	core: box multi-agent handler futures (#22266 ) ## Why This is the base PR in the split stack for the permissions migration. It isolates stack-safety work that had been mixed into the larger permissions PR, so reviewers can evaluate the async-future changes separately from the permissions model changes in #22267. The main risk this addresses is large or recursive multi-agent futures overflowing smaller runner stacks. A follow-up review also called out that `shutdown_live_agent` must remain quiescent: callers should not remove a live agent from tracking or release its spawn slot until the worker loop has actually terminated. ## What Changed - Boxes the large async futures in the multi-agent spawn, resume, and close tool handlers. - Boxes the `AgentControl` spawn and recursive close/shutdown paths that can otherwise build very deep futures. - Keeps `shutdown_live_agent` waiting for thread termination before removing/releasing the live agent, preserving the previous shutdown ordering while still boxing the recursive close path. ## Verification Strategy The focused local coverage was `cargo test -p codex-core multi_agents`, which exercises the multi-agent spawn/resume/close handlers, cascade close/resume behavior, and the shutdown path touched by this PR. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/22266). * #22330 * #22329 * #22328 * #22327 * __->__ #22266	2026-05-12 17:22:25 -07:00
Michael Bolin	dc0f61af1f	app-server: select permission profiles by id	2026-05-12 17:18:17 -07:00
Michael Bolin	f68960302a	permissions: move workspace roots onto thread state	2026-05-12 17:18:17 -07:00
Channing Conger	589b820d6e	code-mode: Add pending-aware code mode execution (#22280 ) Introduce execute_to_pending and wait_to_pending APIs that freeze pending-mode runtimes until an explicit resume, while preserving the existing continuously-running execute path. Add runtime and service coverage for pending, resume, completion, and freeze behavior.	2026-05-12 17:16:57 -07:00
pakrym-oai	0173f71143	Refactor namespaced tool spec registration (#22256) ## Summary This refactor makes tool handlers the owner of the specs they can publish, so registry construction can register handlers once and separately publish only the specs that should be model-visible. The main motivation is deferred tools: MCP and dynamic tools still need handlers registered up front, but deferred tools should be discoverable through `tool_search` rather than emitted in the initial tool spec list. ## What changed - `McpHandler` and `DynamicToolHandler` can return their own `ToolSpec`. - `build_tool_registry_builder` now collects handlers, registers them through the no-spec path, and publishes only non-deferred handler specs. - Deferred MCP and dynamic tool names are combined into one `all_deferred_tools` set that drives spec filtering, code-mode deferred-tool signaling, and `tool_search` registration. - `tool_search` registration now requires both deferred tools and `namespace_tools`. - Namespace specs are merged in `spec_plan`, preserving top-level spec order, sorting tools within each namespace, and backfilling empty namespace descriptions. - Hosted web search and image-generation specs are included in the collected spec vector before namespace merge/publication, and tool-name tests that should not care about hosted relative order now compare sets. ## Testing - `cargo test -p codex-core tools::spec::tests:: -- --nocapture` - `cargo test -p codex-core tools::spec_plan::tests:: -- --nocapture` - `cargo test -p codex-core tools::router::tests::specs_filter_deferred_dynamic_tools -- --nocapture` - `cargo test -p codex-core suite::prompt_caching::prompt_tools_are_consistent_across_requests -- --nocapture` - `just fmt` - `just fix -p codex-core` - `cargo test -p codex-core -- --skip tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed` passed the library suite after skipping the known stack-overflowing unit test. Full `cargo test -p codex-core` currently hits a stack overflow in `tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`; the same focused test reproduces on `origin/main`.	2026-05-12 17:09:14 -07:00
richardopenai	b6e718591b	[codex] Remove workspace owner usage nudge gate (#20509 ) ## Summary - make workspace owner nudge handling unconditional in the TUI now that it is fully rolled out - keep `workspace_owner_usage_nudge` as a removed no-op compatibility flag so old configs/app overrides remain accepted during rollout - remove flag-disabled test setup ## Companion PR - https://github.com/openai/openai/pull/876351 removes the Codex Apps Statsig rollout gate override after this change is available to the app/runtime path ## Validation - `just write-config-schema` - `just fmt` - `cargo test -p codex-features` - `cargo test -p codex-tui status_and_layout`	2026-05-12 17:07:33 -07:00
Michael Bolin	3d9af7a5b5	Merge `1f12d07fd1` into sapling-pr-archive-bolinfest	2026-05-12 16:51:01 -07:00
Michael Bolin	1b233d3714	core: box multi-agent handler futures	2026-05-12 16:50:48 -07:00
Michael Bolin	1f12d07fd1	docs: clarify permissions thread lifecycle API	2026-05-12 16:50:48 -07:00
Michael Bolin	9d5c8bfd88	app-server: test empty workspace roots roundtrip	2026-05-12 16:50:48 -07:00
Michael Bolin	9f75b476b1	app-server: test persisted active permission profile	2026-05-12 16:50:48 -07:00
Michael Bolin	41199009ea	permissions: move workspace roots onto thread state	2026-05-12 16:50:48 -07:00
Anton Panasenko	ac466c0dbd	feat(exec-server): use protobuf relay frames (#22343 ) ## Why Remote exec-server now needs one executor websocket to serve multiple harness JSON-RPC sessions. Rendezvous routes by `stream_id`, and the exec-server side needs to use the same stable relay frame contract instead of a hand-rolled JSON shape. The relay protocol also needs to make ownership boundaries clear: harness and executor endpoints own sequencing, acks, retries, duplicate suppression, segmentation, and reassembly; rendezvous only routes frames. ## What Changed - Add the checked-in `codex.exec_server.relay.v1.RelayMessageFrame` proto plus generated prost bindings for `codex-exec-server`. - Encode remote harness/executor relay traffic as binary protobuf websocket frames while keeping local websocket JSON-RPC unchanged. - Demux executor-side relay streams into independent `ConnectionProcessor` sessions keyed by `stream_id`. - Add a programmatic `RemoteExecutorConfig::with_bearer_token(...)` constructor for non-CLI callers and integration tests. - Add an integration test that starts the remote executor against a fake registry/rendezvous websocket and verifies two virtual streams share one executor websocket without cross-talk, including per-stream reset behavior. - Document the remote relay envelope, sequence ranges, `ack`/`ack_bits`, and endpoint responsibilities in `exec-server/README.md`. ## Verification - `cargo test -p codex-exec-server --test relay multiplexed_remote_executor_routes_independent_virtual_streams -- --exact` - `cargo test -p codex-exec-server --test relay` - `cargo test -p codex-exec-server` passed outside the sandbox. The sandboxed run hit macOS `sandbox-exec: sandbox_apply: Operation not permitted` in filesystem sandbox tests.	2026-05-12 16:50:45 -07:00
Felipe Coury	6dc3b3d7c8	test(tui): relax configured pet load timeout (#22392 ) ## Why Windows CI has been timing out in `configured_pet_load_is_deferred_until_after_construction` while waiting for the deferred configured-pet load event. The test still needs to prove construction returns before the pet image is available, but the background load slices the built-in pet spritesheet into frame cache files. That work can exceed the old 2 second deadline on slower or more contended CI machines. ## What Changed - Increased the test wait for `ConfiguredPetLoaded` from 2 seconds to 30 seconds. - Kept the post-construction assertion intact so the test still verifies that the pet is not loaded synchronously during `ChatWidget` construction. ## How to Test Targeted tests: - `cargo test -p codex-tui configured_pet_load_is_deferred_until_after_construction` - `just argument-comment-lint` Additional check: - `cargo test -p codex-tui` was run, but the broader crate suite did not complete successfully due to unrelated existing failures: - `status::tests::status_permissions_full_disk_managed_without_network_is_external_sandbox` - `status::tests::status_permissions_full_disk_managed_with_network_is_danger_full_access` - later abort in `tests::fork_last_filters_latest_session_by_cwd_unless_show_all` from stack overflow	2026-05-12 16:50:35 -07:00
pakrym-oai	960d42ddae	code-mode: carry nested tool kind through runtime (#22377 ) ## Why Code mode only used nested spec lookup at execution time to rediscover whether a nested tool should be invoked as a function tool or a freeform tool. That information is already present in the enabled tool metadata that code mode builds to expose `tools.*` and `ALL_TOOLS`, so re-looking it up from the router was redundant and kept execution coupled to a separate spec lookup path. ## What Changed - thread `CodeModeToolKind` through the code-mode runtime `ToolCall` event and `CodeModeNestedToolCall` - emit the nested tool kind directly from the V8 callback using the already-enabled tool metadata - build nested tool payloads from the propagated kind instead of calling `find_spec` - remove the now-unused `find_spec` plumbing from the router and parallel runtime helpers - add unit coverage for function vs freeform payload shaping and update affected router tests ## Testing - `cargo test -p codex-code-mode` - `cargo test -p codex-core code_mode::tests` - `cargo test -p codex-core extension_tool_bundles_are_model_visible_and_dispatchable` - `cargo test -p codex-core model_visible_specs_filter_deferred_dynamic_tools`	2026-05-12 23:34:37 +00:00
Dylan Hurd	8123bddb16	chore(config) include_collaboration_mode_instructions (#22383 ) ## Summary Adds include_collaboration_mode_instructions, which is a config equivalent to include_permissions_instructions for collaboration modes. Desired for situations where we want to disable this instruction from entering the context ## Testing - [x] Added unit test	2026-05-12 15:50:10 -07:00
Michael Bolin	c9a38fad25	merge commit for archive created by Sapling	2026-05-12 15:45:03 -07:00
pakrym-oai	862b2122ee	tools: remove is_mutating dispatch gating (#22382 ) ## Why Tool dispatch had two serialization mechanisms: - `supports_parallel_tool_calls` decides whether a tool participates in the shared parallel-execution lock. - `is_mutating` separately gated some calls inside dispatch. That second hook no longer carried its weight. The remaining parallel-support flag is already the per-tool concurrency policy, so keeping a second mutating gate made dispatch harder to follow and left behind extra session plumbing that only existed for that path. ## What changed - Removed `is_mutating` from tool handlers and deleted the `tool_call_gate` path that existed only to support it. - Simplified dispatch and routing to rely on the existing per-tool `supports_parallel_tool_calls` boolean. - Dropped the now-unused handler overrides and related session/test scaffolding. - Kept the router/parallel tests focused on the surviving per-tool behavior. - Removed the unused `codex-utils-readiness` dependency from `codex-core` as a follow-up fix for `cargo shear`. ## Testing - `cargo test -p codex-core parallel_support_does_not_match_namespaced_local_tool_names` - `cargo test -p codex-core mcp_parallel_support_uses_handler_data` - `cargo test -p codex-core tools_without_handlers_do_not_support_parallel`	2026-05-12 22:44:54 +00:00
Michael Bolin	01f7453617	docs: clarify permissions thread lifecycle API	2026-05-12 15:40:08 -07:00
Michael Bolin	c7e1e99166	app-server: test empty workspace roots roundtrip	2026-05-12 15:40:08 -07:00
Michael Bolin	ecf9d1b6ec	app-server: test persisted active permission profile	2026-05-12 15:40:08 -07:00
Michael Bolin	0e434f4d02	permissions: move workspace roots onto thread state	2026-05-12 15:40:08 -07:00
Michael Bolin	522e00e341	core: box multi-agent handler futures	2026-05-12 15:40:08 -07:00
Chris Bookholt	5e3ee5eddf	[codex] Tighten unified exec sandbox setup (#22207 ) ## Summary - tighten unified exec sandbox initialization - preserve the requested process workdir independently from sandbox setup - add regression coverage for the updated invariant ## Validation - Ran `/tmp/cargo-tools/bin/just fmt`. - Ran the targeted `codex-core` regression test successfully. - Ran `cargo test -p codex-core`; it did not complete cleanly because unrelated existing agent/config-loader tests failed and the run later aborted on a stack overflow in `tools::handlers::multi_agents::tests::tool_handlers_cascade_close_and_resume_and_keep_explicitly_closed_subtrees_closed`. Co-authored-by: Codex <noreply@openai.com>	2026-05-12 08:41:00 -07:00
Michael Bolin	18a35b5192	Merge `4eed799314` into sapling-pr-archive-bolinfest	2026-05-12 08:09:39 -07:00
Michael Bolin	4eed799314	docs: clarify permissions thread lifecycle API	2026-05-12 08:09:26 -07:00
Michael Bolin	0c3bde9fea	app-server: test empty workspace roots roundtrip	2026-05-12 08:09:26 -07:00
Michael Bolin	ba9d443843	app-server: test persisted active permission profile	2026-05-12 08:09:26 -07:00
Michael Bolin	d4060ad5a2	permissions: move workspace roots onto thread state	2026-05-12 08:09:26 -07:00
Michael Bolin	fcc338495a	core: box multi-agent handler futures	2026-05-12 08:09:26 -07:00
Michael Bolin	be0c37d42b	merge commit for archive created by Sapling	2026-05-12 07:59:24 -07:00
Michael Bolin	d86158a53b	docs: clarify permissions thread lifecycle API	2026-05-12 07:59:00 -07:00
Michael Bolin	3dfd98bddd	app-server: test empty workspace roots roundtrip	2026-05-12 07:59:00 -07:00
Michael Bolin	f4ffc1e89a	app-server: test persisted active permission profile	2026-05-12 07:59:00 -07:00
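
Note on `9e7cdbd0d2` (core: box multi-agent handler futures): the commit message describes boxing large and recursive async futures so that cascading close/shutdown paths do not build very deep futures. The sketch below illustrates the general Rust technique only. `Agent`, `close_agent`, and `block_on` are hypothetical names invented for this illustration and are not taken from the codex sources; the tiny executor is an assumption that keeps the example free of external crates.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical agent tree (illustration only, not a codex type).
pub struct Agent {
    pub children: Vec<Agent>,
}

/// Recursively "closes" an agent and its descendants, returning how many
/// agents were visited. Each recursive call returns a heap-allocated,
/// type-erased future via `Box::pin`, so the parent future stores only a
/// pointer instead of embedding the child's state inline. A plain
/// `async fn` that awaited itself directly would be rejected by the
/// compiler because its future type would be infinitely sized.
pub fn close_agent(agent: &Agent) -> Pin<Box<dyn Future<Output = usize> + '_>> {
    Box::pin(async move {
        let mut closed = 1; // count this agent
        for child in &agent.children {
            closed += close_agent(child).await;
        }
        closed
    })
}

/// Waker that does nothing; sufficient here because the futures above
/// never return `Poll::Pending`.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal polling loop standing in for a real async runtime
/// (an assumption made so the sketch needs only the standard library).
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        std::thread::yield_now();
    }
}
```

Calling `block_on(close_agent(&root))` on a root with two children, one of which has a child of its own, visits four agents. The trade-off is one heap allocation per recursive call in exchange for a bounded future size at each level.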

15364 Commits