Commit Graph

999 Commits

Author SHA1 Message Date
Rasmus Rygaard
53d5972226 Reapply "Pass more params to compaction" (#14298) (#14521)
This reverts commit 8af97ce4b0.

Confirmed that this runs locally without the previous issues with tool
use
2026-03-12 23:27:21 +00:00
Anton Panasenko
651717323c feat(search_tool): gate search_tool on model supports_search_tool field (#14502) 2026-03-12 16:03:50 -07:00
pakrym-oai
a2546d5dff Expose code-mode tools through globals (#14517)
Summary
- make all code-mode tools accessible as globals so callers only need
`tools.<name>`
- rename text/image helpers and key globals (store, load, ALL_TOOLS,
etc.) to reflect the new shared namespace
- update the JS bridge, runners, descriptions, router, and tests to
follow the new API

Testing
- Not run (not requested)
2026-03-12 15:43:59 -07:00
Jack Mousseau
a314c7d3ae Decouple request permissions feature and tool (#14426) 2026-03-12 14:47:08 -07:00
pakrym-oai
04e14bdf23 Rename exec session IDs to cell IDs (#14510)
- Update the code-mode executor, wait handler, and protocol plumbing to
use cell IDs instead of session IDs for node communication
- Switch tool metadata, wait description, and suite tests to refer to
cell IDs so user-visible messages match the new terminology

**Testing**
- Not run (not requested)
2026-03-12 14:05:30 -07:00
pakrym-oai
dadffd27d4 Fix MCP tool calling (#14491)
Properly escape mcp tool names and make tools only available via
imports.
2026-03-12 13:38:52 -07:00
pakrym-oai
a5a4899d0c Skip nested tool call parallel test on Windows (#14505)
**Summary**
- disable the `code_mode_nested_tool_calls_can_run_in_parallel` test on
Windows where `exec_command` is unavailable

**Testing**
- Not run (not requested)
2026-03-12 13:32:11 -07:00
pakrym-oai
25e301ed98 Add parallel tool call test (#14494)
Summary
- pin tests to `test-gpt-5.1-codex` so code-mode suites exercise that
model explicitly
- add a regression test that ensures nested tool calls can execute in
parallel and assert on timing
- refresh `codex-rs/Cargo.lock` for the updated dependency tree (add
`codex-utils-pty`, drop `codex-otel`)

Testing
- Not run (not requested)
2026-03-12 12:10:14 -07:00
pakrym-oai
d1b03f0d7f Add default code-mode yield timeout (#14484)
Summary
- expose the default yield timeout through code mode runtime so the
handler, wait tool, and protocol share the same 10s value that matches
unified exec
- document the timeout change in the tool descriptions and propagate the
value all the way into the runner metadata
- adjust Cargo.lock to keep the dependency tree in sync with the added
code mode tool dependency

Testing
- Not run (not requested)
2026-03-12 12:06:23 -07:00
pakrym-oai
cfe3f6821a Cleanup code_mode tool descriptions (#14480)
Move to separate files and clarify a bit.
2026-03-12 11:13:35 -07:00
pakrym-oai
2f03b1a322 Dispatch tools when code mode is not awaited directly (#14437)
## Summary
- start a code mode worker once per turn and let it pump nested tool
calls through a dedicated queue
- simplify code mode request/response dispatch around request ids and
generic runner-unavailable errors
- clean up the code mode process API and runner protocol plumbing

## Testing
- not run yet
2026-03-12 09:00:20 -07:00
viyatb-oai
e99e8e4a6b fix: follow up on linux sandbox review nits (#14440)
## Summary
- address the follow-up review nits from #13996 in a separate PR
- make the approvals test command a raw string and keep the
managed-network path using env proxy routing
- inline `--apply-seccomp-then-exec` in the Linux sandbox inner command
builder
- remove the bubblewrap-specific sandbox metric tag path and drop the
`use_legacy_landlock` shim from `sandbox_tag`/`TurnMetadataState::new`
- restore the `Feature` import that `origin/main` currently still needs
in `connectors.rs`

## Testing
- `cargo test -p codex-linux-sandbox`
- focused `codex-core` tests were rerun/started, but the final
verification pass was interrupted when I pushed at request
2026-03-11 23:59:50 -07:00
viyatb-oai
04892b4ceb refactor: make bubblewrap the default Linux sandbox (#13996)
## Summary
- make bubblewrap the default Linux sandbox and keep
`use_legacy_landlock` as the only override
- remove `use_linux_sandbox_bwrap` from feature, config, schema, and
docs surfaces
- update Linux sandbox selection, CLI/config plumbing, and related
tests/docs to match the new default
- fold in the follow-up CI fixes for request-permissions responses and
Linux read-only sandbox error text
2026-03-11 23:31:18 -07:00
pakrym-oai
f6c6128fc7 Support waiting for code_mode sessions (#14295)
## Summary
- persist the code mode runner process in the session-scoped code mode
store
- switch the runner protocol from `init` to `start` with explicit
session ids
- handle runner-side session processing without the init waiter queue

## Validation
- just fmt
- cargo check -p codex-core
- node --check codex-rs/core/src/tools/code_mode_runner.cjs
2026-03-11 23:13:54 -07:00
Ahmed Ibrahim
367a8a2210 Clarify spawn agent authorization (#14432)
- Clarify that spawn_agent requires explicit user permission for
delegation or parallel agent work.
- Add a regression test covering the new description text.
2026-03-11 23:03:07 -07:00
Matthew Zeng
ba5b94287e [apps] Add tool_suggest tool. (#14287)
- [x] Add tool_suggest tool.
- [x] Move chatgpt/src/connectors.rs and core/src/connectors.rs into a
dedicated mod so that we have all the logic and global cache in one
place.
- [x] Update TUI app link view to support rendering the installation
view for mcp elicitation.

---------

Co-authored-by: Shaqayeq <shaqayeq@openai.com>
Co-authored-by: Eric Traut <etraut@openai.com>
Co-authored-by: pakrym-oai <pakrym@openai.com>
Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
Co-authored-by: guinness-oai <guinness@openai.com>
Co-authored-by: Eugene Brevdo <ebrevdo@users.noreply.github.com>
Co-authored-by: Charlie Guo <cguo@openai.com>
Co-authored-by: Fouad Matin <fouad@openai.com>
Co-authored-by: Fouad Matin <169186268+fouad-openai@users.noreply.github.com>
Co-authored-by: xl-openai <xl@openai.com>
Co-authored-by: alexsong-oai <alexsong@openai.com>
Co-authored-by: Owen Lin <owenlin0@gmail.com>
Co-authored-by: sdcoffey <stevendcoffey@gmail.com>
Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: Won Park <won@openai.com>
Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>
Co-authored-by: celia-oai <celia@openai.com>
Co-authored-by: gabec-openai <gabec@openai.com>
Co-authored-by: joeytrasatti-openai <joey.trasatti@openai.com>
Co-authored-by: Leo Shimonaka <leoshimo@openai.com>
Co-authored-by: Rasmus Rygaard <rasmus@openai.com>
Co-authored-by: maja-openai <163171781+maja-openai@users.noreply.github.com>
Co-authored-by: pash-openai <pash@openai.com>
Co-authored-by: Josh McKinney <joshka@openai.com>
2026-03-11 22:06:59 -07:00
Owen Lin
5bc82c5b93 feat(app-server): propagate traces across tasks and core ops (#14387)
## Summary

This PR keeps app-server RPC request trace context alive for the full
lifetime of the work that request kicks off (e.g. for `thread/start`,
this is `app-server rpc handler -> tokio background task -> core op
submissions`). Previously we lose trace lineage once the request handler
returns or hands work off to background tasks.

This approach is especially relevant for `thread/start` and other RPC
handlers that run in a non-blocking way. In the near future we'll most
likely want to make all app-server handlers run in a non-blocking way by
default, and only queue operations that must operate in order (e.g.
thread RPCs per thread?), so we want to make sure tracing in app-server
just generally works.

Depends on https://github.com/openai/codex/pull/14300

**Before**
<img width="155" height="207" alt="image"
src="https://github.com/user-attachments/assets/c9487459-36f1-436c-beb7-fafeb40737af"
/>


**After**
<img width="299" height="337" alt="image"
src="https://github.com/user-attachments/assets/727392b2-d072-4427-9dc4-0502d8652dea"
/>

## What changed

- Keep request-scoped trace context around until we send the final
response or error, or the connection closes.
- Thread that trace context through detached `thread/start` work so
background startup stays attached to the originating request.
- Pass request trace context through to downstream core operations,
including:
  - thread creation
  - resume/fork flows
  - turn submission
  - review
  - interrupt
  - realtime conversation operations
- Add tracing tests that verify:
  - remote W3C trace context is preserved for `thread/start`
  - remote W3C trace context is preserved for `turn/start`
  - downstream core spans stay under the originating request span
  - request-scoped tracing state is cleaned up correctly
- Clean up shutdown behavior so detached background tasks and spawned
threads are drained before process exit.
2026-03-11 20:18:31 -07:00
Anton Panasenko
77b0c75267 feat: search_tool migrate to bring you own tool of Responses API (#14274)
## Why

to support a new bring your own search tool in Responses
API(https://developers.openai.com/api/docs/guides/tools-tool-search#client-executed-tool-search)
we migrating our bm25 search tool to use official way to execute search
on client and communicate additional tools to the model.

## What
- replace the legacy `search_tool_bm25` flow with client-executed
`tool_search`
- add protocol, SSE, history, and normalization support for
`tool_search_call` and `tool_search_output`
- return namespaced Codex Apps search results and wire namespaced
follow-up tool calls back into MCP dispatch
2026-03-11 17:51:51 -07:00
Curtis 'Fjord' Hawthorne
8791f0ab9a Let models opt into original image detail (#14175)
## Summary

This PR narrows original image detail handling to a single opt-in
feature:

- `image_detail_original` lets the model request `detail: "original"` on
supported models
- Omitting `detail` preserves the default resized behavior

The model only sees `detail: "original"` guidance when the active model
supports it:

- JS REPL instructions include the guidance and examples only on
supported models
- `view_image` only exposes a `detail` parameter when the feature and
model can use it

The image detail API is intentionally narrow and consistent across both
paths:

- `view_image.detail` supports only `"original"`; otherwise omit the
field
- `codex.emitImage(..., detail)` supports only `"original"`; otherwise
omit the field
- Unsupported explicit values fail clearly at the API boundary instead
of being silently reinterpreted
- Unsupported explicit `detail: "original"` requests fall back to normal
behavior when the feature is disabled or the model does not support
original detail
2026-03-11 15:25:07 -07:00
Curtis 'Fjord' Hawthorne
5a89660ae4 Add js_repl cwd and homeDir helpers (#14385)
## Summary

This PR adds two read-only path helpers to `js_repl`:

- `codex.cwd`
- `codex.homeDir`

They are exposed alongside the existing `codex.tmpDir` helper so the
REPL can reference basic host path context without reopening direct
`process` access.

## Implementation

- expose `codex.cwd` and `codex.homeDir` from the js_repl kernel
- make `codex.homeDir` come from the kernel process environment
- pass session dependency env through js_repl kernel startup so
`codex.homeDir` matches the env a shell-launched process would see
- keep existing shell `HOME` population behavior unchanged
- update js_repl prompt/docs and add runtime/integration coverage for
the new helpers
2026-03-11 14:44:44 -07:00
Charley Cunningham
f5bb338fdb Defer initial context insertion until the first turn (#14313)
## Summary
- defer fresh-session `build_initial_context()` until the first real
turn instead of seeding model-visible context during startup
- rely on the existing `reference_context_item == None` turn-start path
to inject full initial context on that first real turn (and again after
baseline resets such as compaction)
- add a regression test for `InitialHistory::New` and update affected
deterministic tests / snapshots around developer-message layout,
collaboration instructions, personality updates, and compact request
shapes

## Notes
- this PR does not add any special empty-thread `/compact` behavior
- most of the snapshot churn is the direct result of moving the initial
model-visible context from startup to the first real turn, so first-turn
request layouts no longer contain a pre-user startup copy of permissions
/ environment / other developer-visible context
- remote manual `/compact` with no prior user still skips the remote
compact request; local first-turn `/compact` still issues a compact
request, but that request now reflects the lack of startup-seeded
context

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-11 12:33:10 -07:00
Ahmed Ibrahim
c32c445f1c Clarify locked role settings in spawn prompt (#14283)
- tell agents when a role pins model or reasoning effort so they know
those settings are not changeable
- add prompt-builder coverage for the locked-setting notes
2026-03-11 12:33:10 -07:00
Ahmed Ibrahim
8f8a0f55ce spawn prompt (#14362)
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2026-03-11 12:33:10 -07:00
pakrym-oai
65b325159d Add ALL_TOOLS export to code mode (#14294)
So code mode can search for tools.
2026-03-11 12:33:10 -07:00
Rasmus Rygaard
7f22329389 Revert "Pass more params to compaction" (#14298) 2026-03-11 12:33:10 -07:00
Channing Conger
fd4a673525 Responses: set x-client-request-id as convesration_id when talking to responses (#14312)
Right now we're sending the header session_id to responses which is
ignored/dropped. This sets a useful x-client-request-id to the
conversation_id.
2026-03-11 12:33:10 -07:00
Ahmed Ibrahim
a4d884c767 Split spawn_csv from multi_agent (#14282)
- make `spawn_csv` a standalone feature for CSV agent jobs
- keep `spawn_csv -> multi_agent` one-way and preserve restricted
subagent disable paths
2026-03-11 12:33:09 -07:00
Ahmed Ibrahim
39c1bc1c68 Add realtime start instructions config override (#14270)
- add `realtime_start_instructions` config support
- thread it into realtime context updates, schema, docs, and tests
2026-03-11 12:33:09 -07:00
pakrym-oai
01792a4c61 Prefix code mode output with success or failure message and include error stack (#14272) 2026-03-11 12:33:09 -07:00
Ahmed Ibrahim
c8446d7cf3 Stabilize websocket response.failed error delivery (#14017)
## What changed
- Drop failed websocket connections immediately after a terminal stream
error instead of awaiting a graceful close handshake before forwarding
the error to the caller.
- Keep the success path and the closed-connection guard behavior
unchanged.

## Why this fixes the flake
- The failing integration test waits for the second websocket stream to
surface the model error before issuing a follow-up request.
- On slower runners, the old error path awaited
`ws_stream.close().await` before sending the error downstream. If that
close handshake stalled, the test kept waiting for an error that had
already happened server-side and nextest timed it out.
- Dropping the failed websocket immediately makes the terminal error
observable right away and marks the session closed so the next request
reconnects cleanly instead of depending on a best-effort close
handshake.

## Code or test?
- This is a production logic fix in `codex-api`. The existing websocket
integration test already exercises the regression path.
2026-03-11 12:33:09 -07:00
pakrym-oai
8a099b3dfb Rename code mode tool to exec (#14254)
Summary
- update the code-mode handler, runner, instructions, and error text to
refer to the `exec` tool name everywhere that used to say `code_mode`
- ensure generated documentation strings and tool specs describe `exec`
and rely on the shared `PUBLIC_TOOL_NAME`
- refresh the suite tests so they invoke `exec` instead of the old name

Testing
- Not run (not requested)
2026-03-11 12:33:09 -07:00
Celia Chen
c1a424691f chore: add a separate reject-policy flag for skill approvals (#14271)
## Summary
- add `skill_approval` to `RejectConfig` and the app-server v2
`AskForApproval::Reject` payload so skill-script prompts can be
configured independently from sandbox and rule-based prompts
- update Unix shell escalation to reject prompts based on the actual
decision source, keeping prefix rules tied to `rules`, unmatched command
fallbacks tied to `sandbox_approval`, and skill scripts tied to
`skill_approval`
- regenerate the affected protocol/config schemas and expand
unit/integration coverage for the new flag and skill approval behavior
2026-03-11 12:33:09 -07:00
pakrym-oai
83b22bb612 Add store/load support for code mode (#14259)
adds support for transferring state across code mode invocations.
2026-03-11 12:33:09 -07:00
Rasmus Rygaard
2621ba17e3 Pass more params to compaction (#14247)
Pass more params to /compact. This should give us parity with the
/responses endpoint to improve caching.

I'm torn about the MCP await. Blocking will give us parity but it seems
like we explicitly don't block on MCPs. Happy either way
2026-03-11 12:33:09 -07:00
pakrym-oai
07c22d20f6 Add code_mode output helpers for text and images (#14244)
Summary
- document how code-mode can import `output_text`/`output_image` and
ensure `add_content` stays compatible
- add a synthetic `@openai/code_mode` module that appends content items
and validates inputs
- cover the new behavior with integration tests for structured text and
image outputs

Testing
- Not run (not requested)
2026-03-11 12:33:08 -07:00
pakrym-oai
3d41ff0b77 Add model-controlled truncation for code mode results (#14258)
Summary
- document that `@openai/code_mode` exposes
`set_max_output_tokens_per_exec_call` and that `code_mode` truncates the
final Rust-side output when the budget is exceeded
- enforce the configured budget in the Rust tool runner, reusing
truncation helpers so text-only outputs follow the unified-exec wrapper
and mixed outputs still fit within the limit
- ensure the new behavior is covered by a code-mode integration test and
string spec update

Testing
- Not run (not requested)
2026-03-11 12:33:08 -07:00
pakrym-oai
ee8f84153e Add output schema to MCP tools and expose MCP tool results in code mode (#14236)
Summary
- drop `McpToolOutput` in favor of `CallToolResult`, moving its helpers
to keep MCP tooling focused on the final result shape
- wire the new schema definitions through code mode, context, handlers,
and spec modules so MCP tools serialize the exact output shape expected
by the model
- extend code mode tests to cover multiple MCP call scenarios and ensure
the serialized data matches the new schema
- refresh JS runner helpers and protocol models alongside the schema
changes

Testing
- Not run (not requested)
2026-03-11 12:33:08 -07:00
Won Park
722e8f08e1 unifying all image saves to /tmp to bug-proof (#14149)
image-gen feature will have the model saving to /tmp by default + at all
times
2026-03-11 12:33:08 -07:00
Ahmed Ibrahim
91ca20c7c3 Add spawn_agent model overrides (#14160)
- add `model` and `reasoning_effort` to the `spawn_agent` schema so the
values pass through
- validate requested models against `model.model` and only check that
the selected model supports the requested reasoning effort

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-11 12:33:08 -07:00
pakrym-oai
00ea8aa7ee Expose strongly-typed result for exec_command (#14183)
Summary
- document output types for the various tool handlers and registry so
the API exposes richer descriptions
- update unified execution helpers and client tests to align with the
new output metadata
- clean up unused helpers across tool dispatch paths

Testing
- Not run (not requested)
2026-03-11 12:33:07 -07:00
Eric Traut
7144f84c69 Fix release-mode integration test compiler failure (#13603)
Addresses #13586

This doesn't affect our CI scripts. It was user-reported.

Summary
- add `wiremock::ResponseTemplate` and `body_string_contains` imports
behind `#[cfg(not(debug_assertions))]` in
`codex-rs/core/tests/suite/view_image.rs` so release builds only pull
the helpers they actually use
2026-03-11 12:33:07 -07:00
Ahmed Ibrahim
aa6a57dfa2 Stabilize incomplete SSE retry test (#13879)
## What changed
- The retry test now uses the same streaming SSE test server used by
production-style tests instead of a wiremock sequence.
- The fixture is resolved via `find_resource!`, and the test asserts
that exactly two outbound requests were sent.

## Why this fixes the flake
- The old wiremock sequence approximated early-close behavior, but it
did not reproduce the same streaming semantics the real client sees.
- That meant the retry path depended on mock implementation details
instead of on the actual transport behavior we care about.
- Switching to the streaming SSE helper makes the test exercise the real
early-close/retry contract, and counting requests directly verifies that
we retried exactly once rather than merely hoping the sequence aligned.

## Scope
- Test-only change.
2026-03-09 22:34:44 -07:00
Ahmed Ibrahim
2e24be2134 Use realtime transcript for handoff context (#14132)
- collect input/output transcript deltas into active handoff transcript
state
- attach and clear that transcript on each handoff, and regenerate
schema/tests
2026-03-09 22:30:03 -07:00
Matthew Zeng
566e4cee4b [apps] Fix apps enablement condition. (#14011)
- [x] Fix apps enablement condition to check both the feature flag and
that the user is not an API key user.
2026-03-09 22:25:43 -07:00
xl-openai
0c33af7746 feat: support disabling bundled system skills (#13792)
Support disable bundled system skills with a config:

[skills.bundled]
enabled = false
2026-03-09 22:02:53 -07:00
pakrym-oai
710682598d Export tools module into code mode runner (#14167)
**Summary**
- allow `code_mode` to pass enabled tools metadata to the runner and
expose them via `tools.js`
- import tools inside JavaScript rather than relying only on globals or
proxies for nested tool calls
- update specs, docs, and tests to exercise the new bridge and explain
the tooling changes

**Testing**
- Not run (not requested)
2026-03-09 21:59:09 -07:00
pakrym-oai
d71e042694 Enforce single tool output type in codex handlers (#14157)
We'll need to associate output schema with each tool. Each tool can only
have on output type.
2026-03-09 21:49:44 -07:00
pakrym-oai
da616136cc Add code_mode experimental feature (#13418)
A much narrower and more isolated (no node features) version of js_repl
2026-03-09 20:56:27 -07:00
Dylan Hurd
6da84efed8 feat(approvals) RejectConfig for request_permissions (#14118)
## Summary
We need to support allowing request_permissions calls when using
`Reject` policy

<img width="1133" height="588" alt="Screenshot 2026-03-09 at 12 06
40 PM"
src="https://github.com/user-attachments/assets/a8df987f-c225-4866-b8ab-5590960daec5"
/>

Note that this is a backwards-incompatible change for Reject policy. I'm
not sure if we need to add a default based on our current use/setup

## Testing
- [x] Added tests
- [x] Tested locally
2026-03-09 18:16:54 -07:00
Dylan Hurd
c1defcc98c fix(core) RequestPermissions + ApplyPatch (#14055)
## Summary
The apply_patch tool should also respect AdditionalPermissions

## Testing
- [x] Added unit tests

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-09 16:11:19 -07:00