Commit Graph

2695 Commits

Author SHA1 Message Date
pakrym-oai
c0528b9bd9 Move code mode tool files under tools/code_mode and split functionality (#14476)
- **Summary**
- migrate the code mode handler, service, worker, process, runner, and
bridge assets into the `tools/code_mode` module tree
- split Execution, protocol, and handler logic into dedicated files and
relocate the tool definition into `code_mode/spec.rs`
- update core references and tests to stitch the new organization
together
- **Testing**
  - Not run (not requested)
2026-03-12 09:54:11 -07:00
viyatb-oai
a30b807efe fix(cli): support legacy use_linux_sandbox_bwrap flag (#14473)
## Summary
- restore `use_linux_sandbox_bwrap` as a removed feature key so older
`--enable` callers parse again
- keep it as a no-op by leaving runtime behavior unchanged
- add regression coverage for the legacy `--enable` path

## Testing
- Not run (updated and pushed quickly)
2026-03-12 16:33:58 +00:00
pakrym-oai
2f03b1a322 Dispatch tools when code mode is not awaited directly (#14437)
## Summary
- start a code mode worker once per turn and let it pump nested tool
calls through a dedicated queue
- simplify code mode request/response dispatch around request ids and
generic runner-unavailable errors
- clean up the code mode process API and runner protocol plumbing

## Testing
- not run yet
2026-03-12 09:00:20 -07:00
Michael Bolin
0c8a36676a fix: move inline codex-rs/core unit tests into sibling files (#14444)
## Why
PR #13783 moved the `codex.rs` unit tests into `codex_tests.rs`. This
applies the same extraction pattern across the rest of `codex-rs/core`
so the production modules stay focused on runtime code instead of large
inline test blocks.

Keeping the tests in sibling files also makes follow-up edits easier to
review because product changes no longer have to share a file with
hundreds or thousands of lines of test scaffolding.

## What changed
- replaced each inline `mod tests { ... }` in `codex-rs/core/src/**`
with a path-based module declaration
- moved each extracted unit test module into a sibling `*_tests.rs`
file, using `mod_tests.rs` for `mod.rs` modules
- preserved the existing `cfg(...)` guards and module-local structure so
the refactor remains structural rather than behavioral

## Testing
- `cargo test -p codex-core --lib` (`1653 passed; 0 failed; 5 ignored`)
- `just fix -p codex-core`
- `cargo fmt --check`
- `cargo shear`
2026-03-12 08:16:36 -07:00
Jack Mousseau
745ed4e5e0 Use granted permissions when invoking apply_patch (#14429) 2026-03-12 01:30:13 -07:00
Matthew Zeng
23e55d7668 [elicitation] User-friendly tool call messages. (#14403)
- [x] Add a curated set of tool call messages and human-readable tool
param names.
2026-03-12 00:35:21 -07:00
Jack Mousseau
19d0949aab Handle pre-approved permissions in zsh fork (#14431) 2026-03-12 00:27:11 -07:00
viyatb-oai
e99e8e4a6b fix: follow up on linux sandbox review nits (#14440)
## Summary
- address the follow-up review nits from #13996 in a separate PR
- make the approvals test command a raw string and keep the
managed-network path using env proxy routing
- inline `--apply-seccomp-then-exec` in the Linux sandbox inner command
builder
- remove the bubblewrap-specific sandbox metric tag path and drop the
`use_legacy_landlock` shim from `sandbox_tag`/`TurnMetadataState::new`
- restore the `Feature` import that `origin/main` currently still needs
in `connectors.rs`

## Testing
- `cargo test -p codex-linux-sandbox`
- focused `codex-core` tests were rerun/started, but the final
verification pass was interrupted when I pushed at request
2026-03-11 23:59:50 -07:00
viyatb-oai
04892b4ceb refactor: make bubblewrap the default Linux sandbox (#13996)
## Summary
- make bubblewrap the default Linux sandbox and keep
`use_legacy_landlock` as the only override
- remove `use_linux_sandbox_bwrap` from feature, config, schema, and
docs surfaces
- update Linux sandbox selection, CLI/config plumbing, and related
tests/docs to match the new default
- fold in the follow-up CI fixes for request-permissions responses and
Linux read-only sandbox error text
2026-03-11 23:31:18 -07:00
xl-openai
b5f927b973 feat: refactor on openai-curated plugins. (#14427)
- Curated repo sync now uses GitHub HTTP, not local git.
- Curated plugin cache/versioning now uses commit SHA instead of local.
- Startup sync now always repairs or refreshes curated plugin cache from
tmp (auto update to the lastest)
2026-03-11 23:18:58 -07:00
pakrym-oai
f6c6128fc7 Support waiting for code_mode sessions (#14295)
## Summary
- persist the code mode runner process in the session-scoped code mode
store
- switch the runner protocol from `init` to `start` with explicit
session ids
- handle runner-side session processing without the init waiter queue

## Validation
- just fmt
- cargo check -p codex-core
- node --check codex-rs/core/src/tools/code_mode_runner.cjs
2026-03-11 23:13:54 -07:00
Ahmed Ibrahim
367a8a2210 Clarify spawn agent authorization (#14432)
- Clarify that spawn_agent requires explicit user permission for
delegation or parallel agent work.
- Add a regression test covering the new description text.
2026-03-11 23:03:07 -07:00
Matthew Zeng
ba5b94287e [apps] Add tool_suggest tool. (#14287)
- [x] Add tool_suggest tool.
- [x] Move chatgpt/src/connectors.rs and core/src/connectors.rs into a
dedicated mod so that we have all the logic and global cache in one
place.
- [x] Update TUI app link view to support rendering the installation
view for mcp elicitation.

---------

Co-authored-by: Shaqayeq <shaqayeq@openai.com>
Co-authored-by: Eric Traut <etraut@openai.com>
Co-authored-by: pakrym-oai <pakrym@openai.com>
Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
Co-authored-by: guinness-oai <guinness@openai.com>
Co-authored-by: Eugene Brevdo <ebrevdo@users.noreply.github.com>
Co-authored-by: Charlie Guo <cguo@openai.com>
Co-authored-by: Fouad Matin <fouad@openai.com>
Co-authored-by: Fouad Matin <169186268+fouad-openai@users.noreply.github.com>
Co-authored-by: xl-openai <xl@openai.com>
Co-authored-by: alexsong-oai <alexsong@openai.com>
Co-authored-by: Owen Lin <owenlin0@gmail.com>
Co-authored-by: sdcoffey <stevendcoffey@gmail.com>
Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: Won Park <won@openai.com>
Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>
Co-authored-by: celia-oai <celia@openai.com>
Co-authored-by: gabec-openai <gabec@openai.com>
Co-authored-by: joeytrasatti-openai <joey.trasatti@openai.com>
Co-authored-by: Leo Shimonaka <leoshimo@openai.com>
Co-authored-by: Rasmus Rygaard <rasmus@openai.com>
Co-authored-by: maja-openai <163171781+maja-openai@users.noreply.github.com>
Co-authored-by: pash-openai <pash@openai.com>
Co-authored-by: Josh McKinney <joshka@openai.com>
2026-03-11 22:06:59 -07:00
sayan-oai
917c2df201 chore: use AVAILABLE and ON_INSTALL as default plugin install and auth policies (#14407)
make `AVAILABLE` the default plugin installPolicy when unset in
`marketplace.json`. similarly, make `ON_INSTALL` the default authPolicy.

this means, when unset, plugins are available to be installed (but not
auto-installed), and the contained connectors will be authed at
install-time.

updated tests.
2026-03-11 20:33:17 -07:00
Owen Lin
5bc82c5b93 feat(app-server): propagate traces across tasks and core ops (#14387)
## Summary

This PR keeps app-server RPC request trace context alive for the full
lifetime of the work that request kicks off (e.g. for `thread/start`,
this is `app-server rpc handler -> tokio background task -> core op
submissions`). Previously we lose trace lineage once the request handler
returns or hands work off to background tasks.

This approach is especially relevant for `thread/start` and other RPC
handlers that run in a non-blocking way. In the near future we'll most
likely want to make all app-server handlers run in a non-blocking way by
default, and only queue operations that must operate in order (e.g.
thread RPCs per thread?), so we want to make sure tracing in app-server
just generally works.

Depends on https://github.com/openai/codex/pull/14300

**Before**
<img width="155" height="207" alt="image"
src="https://github.com/user-attachments/assets/c9487459-36f1-436c-beb7-fafeb40737af"
/>


**After**
<img width="299" height="337" alt="image"
src="https://github.com/user-attachments/assets/727392b2-d072-4427-9dc4-0502d8652dea"
/>

## What changed

- Keep request-scoped trace context around until we send the final
response or error, or the connection closes.
- Thread that trace context through detached `thread/start` work so
background startup stays attached to the originating request.
- Pass request trace context through to downstream core operations,
including:
  - thread creation
  - resume/fork flows
  - turn submission
  - review
  - interrupt
  - realtime conversation operations
- Add tracing tests that verify:
  - remote W3C trace context is preserved for `thread/start`
  - remote W3C trace context is preserved for `turn/start`
  - downstream core spans stay under the originating request span
  - request-scoped tracing state is cleaned up correctly
- Clean up shutdown behavior so detached background tasks and spawned
threads are drained before process exit.
2026-03-11 20:18:31 -07:00
Ahmed Ibrahim
bf5e997b31 Include spawn agent model metadata in app-server items (#14410)
- add model and reasoning effort to app-server collab spawn items and
notifications
- regenerate app-server protocol schemas for the new fields

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-11 19:25:21 -07:00
viyatb-oai
c2d5458d67 fix: align core approvals with split sandbox policies (#14171)
## Stack

   fix: fail closed for unsupported split windows sandboxing #14172
   fix: preserve split filesystem semantics in linux sandbox #14173
-> fix: align core approvals with split sandbox policies #14171
   refactor: centralize filesystem permissions precedence #14174

## Why This PR Exists

This PR is intentionally narrower than the title may suggest.

Most of the original split-permissions migration already landed in the
earlier `#13434 -> #13453` stack. In particular:

- `#13439` already did the broad runtime plumbing for split filesystem
and network policies.
- `#13445` already moved `apply_patch` safety onto filesystem-policy
semantics.
- `#13448` already switched macOS Seatbelt generation to split policies.
- `#13449` and `#13453` already handled Linux helper and bubblewrap
enforcement.
- `#13440` already introduced the first protocol-side helpers for
deriving effective filesystem access.

The reason this PR still exists is that after the follow-on
`[permissions]` work and the new shared precedence helper in `#14174`, a
few core approval paths were still deciding behavior from the legacy
`SandboxPolicy` projection instead of the split filesystem policy that
actually carries the carveouts.

That means this PR is mostly a cleanup and alignment pass over the
remaining core consumers, not a fresh sandbox backend migration.

## What Is Actually New Here

- make unmatched-command fallback decisions consult
`FileSystemSandboxPolicy` instead of only legacy `DangerFullAccess` /
`ReadOnly` / `WorkspaceWrite` categories
- thread `file_system_sandbox_policy` into the shell, unified-exec, and
intercepted-exec approval paths so they all use the same split-policy
semantics
- keep `apply_patch` safety on the same effective-access rules as the
shared protocol helper, rather than letting it drift through
compatibility projections
- add loader-level regression coverage proving legacy `sandbox_mode`
config still builds split policies and round-trips back without semantic
drift

## What This PR Does Not Do

This PR does not introduce new platform backend enforcement on its own.

- Linux backend parity remains in `#14173`.
- Windows fail-closed handling remains in `#14172`.
- The shared precedence/model changes live in `#14174`.

## Files To Focus On

- `core/src/exec_policy.rs`: unmatched-command fallback and approval
rendering now read the split filesystem policy directly
- `core/src/tools/sandboxing.rs`: default exec-approval requirement keys
off `FileSystemSandboxPolicy.kind`
- `core/src/tools/handlers/shell.rs`: shell approval requests now carry
the split filesystem policy
- `core/src/unified_exec/process_manager.rs`: unified-exec approval
requests now carry the split filesystem policy
- `core/src/tools/runtimes/shell/unix_escalation.rs`: intercepted exec
fallback now uses the same split-policy approval semantics
- `core/src/safety.rs`: `apply_patch` safety keeps using effective
filesystem access rather than legacy sandbox categories
- `core/src/config/config_tests.rs`: new regression coverage for legacy
`sandbox_mode` no-drift behavior through the split-policy loader

## Notes

- `core/src/codex.rs` and `core/src/codex_tests.rs` are just small
fallout updates for `RequestPermissionsResponse.scope`; they are not the
point of the PR.
- If you reviewed the earlier `#13439` / `#13445` stack, the main review
question here is simply: “are there any remaining approval or
patch-safety paths that still reconstruct semantics from legacy
`SandboxPolicy` instead of consuming the split filesystem policy
directly?”

## Testing
- cargo test -p codex-core
legacy_sandbox_mode_config_builds_split_policies_without_drift
- cargo test -p codex-core request_permissions
- cargo test -p codex-core intercepted_exec_policy
- cargo test -p codex-core
restricted_sandbox_requires_exec_approval_on_request
- cargo test -p codex-core
unmatched_on_request_uses_split_filesystem_policy_for_escalation_prompts
- cargo test -p codex-core explicit_
- cargo clippy -p codex-core --tests -- -D warnings
2026-03-12 02:23:22 +00:00
viyatb-oai
f276325cdc refactor: centralize filesystem permissions precedence (#14174)
## Stack

   fix: fail closed for unsupported split windows sandboxing #14172
   fix: preserve split filesystem semantics in linux sandbox #14173
   fix: align core approvals with split sandbox policies #14171
-> refactor: centralize filesystem permissions precedence #14174

## Summary
- add a shared per-path split filesystem precedence helper in
`FileSystemSandboxPolicy`
- derive readable, writable, and unreadable roots from the same
most-specific resolution rules
- add regression coverage for nested `write` / `read` / `none` carveouts
and legacy bridge enforcement detection

## Testing
- cargo test -p codex-protocol
- cargo clippy -p codex-protocol --tests -- -D warnings
2026-03-12 01:35:44 +00:00
Anton Panasenko
77b0c75267 feat: search_tool migrate to bring you own tool of Responses API (#14274)
## Why

to support a new bring your own search tool in Responses
API(https://developers.openai.com/api/docs/guides/tools-tool-search#client-executed-tool-search)
we migrating our bm25 search tool to use official way to execute search
on client and communicate additional tools to the model.

## What
- replace the legacy `search_tool_bm25` flow with client-executed
`tool_search`
- add protocol, SSE, history, and normalization support for
`tool_search_call` and `tool_search_output`
- return namespaced Codex Apps search results and wire namespaced
follow-up tool calls back into MCP dispatch
2026-03-11 17:51:51 -07:00
Curtis 'Fjord' Hawthorne
8791f0ab9a Let models opt into original image detail (#14175)
## Summary

This PR narrows original image detail handling to a single opt-in
feature:

- `image_detail_original` lets the model request `detail: "original"` on
supported models
- Omitting `detail` preserves the default resized behavior

The model only sees `detail: "original"` guidance when the active model
supports it:

- JS REPL instructions include the guidance and examples only on
supported models
- `view_image` only exposes a `detail` parameter when the feature and
model can use it

The image detail API is intentionally narrow and consistent across both
paths:

- `view_image.detail` supports only `"original"`; otherwise omit the
field
- `codex.emitImage(..., detail)` supports only `"original"`; otherwise
omit the field
- Unsupported explicit values fail clearly at the API boundary instead
of being silently reinterpreted
- Unsupported explicit `detail: "original"` requests fall back to normal
behavior when the feature is disabled or the model does not support
original detail
2026-03-11 15:25:07 -07:00
Curtis 'Fjord' Hawthorne
5a89660ae4 Add js_repl cwd and homeDir helpers (#14385)
## Summary

This PR adds two read-only path helpers to `js_repl`:

- `codex.cwd`
- `codex.homeDir`

They are exposed alongside the existing `codex.tmpDir` helper so the
REPL can reference basic host path context without reopening direct
`process` access.

## Implementation

- expose `codex.cwd` and `codex.homeDir` from the js_repl kernel
- make `codex.homeDir` come from the kernel process environment
- pass session dependency env through js_repl kernel startup so
`codex.homeDir` matches the env a shell-launched process would see
- keep existing shell `HOME` population behavior unchanged
- update js_repl prompt/docs and add runtime/integration coverage for
the new helpers
2026-03-11 14:44:44 -07:00
Charley Cunningham
f5bb338fdb Defer initial context insertion until the first turn (#14313)
## Summary
- defer fresh-session `build_initial_context()` until the first real
turn instead of seeding model-visible context during startup
- rely on the existing `reference_context_item == None` turn-start path
to inject full initial context on that first real turn (and again after
baseline resets such as compaction)
- add a regression test for `InitialHistory::New` and update affected
deterministic tests / snapshots around developer-message layout,
collaboration instructions, personality updates, and compact request
shapes

## Notes
- this PR does not add any special empty-thread `/compact` behavior
- most of the snapshot churn is the direct result of moving the initial
model-visible context from startup to the first real turn, so first-turn
request layouts no longer contain a pre-user startup copy of permissions
/ environment / other developer-visible context
- remote manual `/compact` with no prior user still skips the remote
compact request; local first-turn `/compact` still issues a compact
request, but that request now reflects the lack of startup-seeded
context

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-11 12:33:10 -07:00
Ahmed Ibrahim
c32c445f1c Clarify locked role settings in spawn prompt (#14283)
- tell agents when a role pins model or reasoning effort so they know
those settings are not changeable
- add prompt-builder coverage for the locked-setting notes
2026-03-11 12:33:10 -07:00
viyatb-oai
52a3bde6cc feat(core): emit turn metric for network proxy state (#14250)
## Summary
- add a per-turn `codex.turn.network_proxy` metric constant
- emit the metric from turn completion using the live managed proxy
enabled state
- add focused tests for active and inactive tag emission
2026-03-11 12:33:10 -07:00
Ahmed Ibrahim
8f8a0f55ce spawn prompt (#14362)
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2026-03-11 12:33:10 -07:00
pakrym-oai
65b325159d Add ALL_TOOLS export to code mode (#14294)
So code mode can search for tools.
2026-03-11 12:33:10 -07:00
sayan-oai
7b2cee53db chore: wire through plugin policies + category from marketplace.json (#14305)
wire plugin marketplace metadata through app-server endpoints:
- `plugin/list` has `installPolicy` and `authPolicy`
- `plugin/install` has plugin-level `authPolicy`

`plugin/install` also now enforces `NOT_AVAILABLE` `installPolicy` when
installing.


added tests.
2026-03-11 12:33:10 -07:00
Owen Lin
fa1242c83b fix(otel): make HTTP trace export survive app-server runtimes (#14300)
## Summary

This PR fixes OTLP HTTP trace export in runtimes where the previous
exporter setup was unreliable, especially around app-server usage. It
also removes the old `codex_otel::otel_provider` compatibility shim and
switches remaining call sites over to the crate-root
`codex_otel::OtelProvider` export.

## What changed

- Use a runtime-safe OTLP HTTP trace exporter path for Tokio runtimes.
- Add an async HTTP client path for trace export when we are already
inside a multi-thread Tokio runtime.
- Make provider shutdown flush traces before tearing down the tracer
provider.
- Add loopback coverage that verifies traces are actually sent to
`/v1/traces`:
  - outside Tokio
  - inside a multi-thread Tokio runtime
  - inside a current-thread Tokio runtime
- Remove the `codex_otel::otel_provider` shim and update remaining
imports.

## Why

I hit cases where spans were being created correctly but never made it
to the collector. The issue turned out to be in exporter/runtime
behavior rather than the span plumbing itself. This PR narrows that gap
and gives us regression coverage for the actual export path.
2026-03-11 12:33:10 -07:00
pakrym-oai
548583198a Allow bool web_search in ToolsToml (#14352)
Summary
- add a custom deserializer so `[tools].web_search` can be a bool
(treated as disabled) or a config object
- extend core and app-server tests to cover bool handling in TOML config

Testing
- Not run (not requested)
2026-03-11 12:33:10 -07:00
Rasmus Rygaard
7f22329389 Revert "Pass more params to compaction" (#14298) 2026-03-11 12:33:10 -07:00
Channing Conger
fd4a673525 Responses: set x-client-request-id as convesration_id when talking to responses (#14312)
Right now we're sending the header session_id to responses which is
ignored/dropped. This sets a useful x-client-request-id to the
conversation_id.
2026-03-11 12:33:10 -07:00
Fouad Matin
f385199cc0 fix(arc_monitor): api path (#14290)
This PR just fixes the API path for ARC monitor.
2026-03-11 12:33:10 -07:00
pakrym-oai
12ee9eb6e0 Add snippets annotated with types to tools when code mode enabled (#14284)
Main purpose is for code mode to understand the return type.
2026-03-11 12:33:09 -07:00
Ahmed Ibrahim
a4d884c767 Split spawn_csv from multi_agent (#14282)
- make `spawn_csv` a standalone feature for CSV agent jobs
- keep `spawn_csv -> multi_agent` one-way and preserve restricted
subagent disable paths
2026-03-11 12:33:09 -07:00
Ahmed Ibrahim
39c1bc1c68 Add realtime start instructions config override (#14270)
- add `realtime_start_instructions` config support
- thread it into realtime context updates, schema, docs, and tests
2026-03-11 12:33:09 -07:00
pakrym-oai
31bf1dbe63 Make unified exec session_id numeric (#14279)
It's a number on the write_stdin input, make it a number on the output
and also internally.
2026-03-11 12:33:09 -07:00
pakrym-oai
01792a4c61 Prefix code mode output with success or failure message and include error stack (#14272) 2026-03-11 12:33:09 -07:00
Ahmed Ibrahim
c8446d7cf3 Stabilize websocket response.failed error delivery (#14017)
## What changed
- Drop failed websocket connections immediately after a terminal stream
error instead of awaiting a graceful close handshake before forwarding
the error to the caller.
- Keep the success path and the closed-connection guard behavior
unchanged.

## Why this fixes the flake
- The failing integration test waits for the second websocket stream to
surface the model error before issuing a follow-up request.
- On slower runners, the old error path awaited
`ws_stream.close().await` before sending the error downstream. If that
close handshake stalled, the test kept waiting for an error that had
already happened server-side and nextest timed it out.
- Dropping the failed websocket immediately makes the terminal error
observable right away and marks the session closed so the next request
reconnects cleanly instead of depending on a best-effort close
handshake.

## Code or test?
- This is a production logic fix in `codex-api`. The existing websocket
integration test already exercises the regression path.
2026-03-11 12:33:09 -07:00
Ahmed Ibrahim
285b3a5143 Show spawned agent model and effort in TUI (#14273)
- include the requested sub-agent model and reasoning effort in the
spawn begin event\n- render that metadata next to the spawned agent name
and role in the TUI transcript

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-11 12:33:09 -07:00
pakrym-oai
8a099b3dfb Rename code mode tool to exec (#14254)
Summary
- update the code-mode handler, runner, instructions, and error text to
refer to the `exec` tool name everywhere that used to say `code_mode`
- ensure generated documentation strings and tool specs describe `exec`
and rely on the shared `PUBLIC_TOOL_NAME`
- refresh the suite tests so they invoke `exec` instead of the old name

Testing
- Not run (not requested)
2026-03-11 12:33:09 -07:00
maja-openai
e77b2fd925 prompt changes to guardian (#14263)
## Summary
  - update the guardian prompting
- clarify the guardian rejection message so an action may still proceed
if the user explicitly approves it after being informed of the risk

  ## Testing
  - cargo run on selected examples
2026-03-11 12:33:09 -07:00
Celia Chen
c1a424691f chore: add a separate reject-policy flag for skill approvals (#14271)
## Summary
- add `skill_approval` to `RejectConfig` and the app-server v2
`AskForApproval::Reject` payload so skill-script prompts can be
configured independently from sandbox and rule-based prompts
- update Unix shell escalation to reject prompts based on the actual
decision source, keeping prefix rules tied to `rules`, unmatched command
fallbacks tied to `sandbox_approval`, and skill scripts tied to
`skill_approval`
- regenerate the affected protocol/config schemas and expand
unit/integration coverage for the new flag and skill approval behavior
2026-03-11 12:33:09 -07:00
pakrym-oai
83b22bb612 Add store/load support for code mode (#14259)
adds support for transferring state across code mode invocations.
2026-03-11 12:33:09 -07:00
Rasmus Rygaard
2621ba17e3 Pass more params to compaction (#14247)
Pass more params to /compact. This should give us parity with the
/responses endpoint to improve caching.

I'm torn about the MCP await. Blocking will give us parity but it seems
like we explicitly don't block on MCPs. Happy either way
2026-03-11 12:33:09 -07:00
Leo Shimonaka
889b4796fc feat: Add additional macOS Sandbox Permissions for Launch Services, Contacts, Reminders (#14155)
Add additional macOS Sandbox Permissions levers for the following:

- Launch Services
- Contacts
- Reminders
2026-03-11 12:33:09 -07:00
pakrym-oai
07c22d20f6 Add code_mode output helpers for text and images (#14244)
Summary
- document how code-mode can import `output_text`/`output_image` and
ensure `add_content` stays compatible
- add a synthetic `@openai/code_mode` module that appends content items
and validates inputs
- cover the new behavior with integration tests for structured text and
image outputs

Testing
- Not run (not requested)
2026-03-11 12:33:08 -07:00
Ahmed Ibrahim
ce1d9abf11 Clarify close_agent tool description (#14269)
- clarify the `close_agent` tool description so it nudges models to
close agents they no longer need
- keep the change scoped to the tool spec text only

Co-authored-by: Codex <noreply@openai.com>
2026-03-11 12:33:08 -07:00
gabec-openai
a67660da2d Load agent metadata from role files (#14177) 2026-03-11 12:33:08 -07:00
pakrym-oai
3d41ff0b77 Add model-controlled truncation for code mode results (#14258)
Summary
- document that `@openai/code_mode` exposes
`set_max_output_tokens_per_exec_call` and that `code_mode` truncates the
final Rust-side output when the budget is exceeded
- enforce the configured budget in the Rust tool runner, reusing
truncation helpers so text-only outputs follow the unified-exec wrapper
and mixed outputs still fit within the limit
- ensure the new behavior is covered by a code-mode integration test and
string spec update

Testing
- Not run (not requested)
2026-03-11 12:33:08 -07:00
pakrym-oai
ee8f84153e Add output schema to MCP tools and expose MCP tool results in code mode (#14236)
Summary
- drop `McpToolOutput` in favor of `CallToolResult`, moving its helpers
to keep MCP tooling focused on the final result shape
- wire the new schema definitions through code mode, context, handlers,
and spec modules so MCP tools serialize the exact output shape expected
by the model
- extend code mode tests to cover multiple MCP call scenarios and ensure
the serialized data matches the new schema
- refresh JS runner helpers and protocol models alongside the schema
changes

Testing
- Not run (not requested)
2026-03-11 12:33:08 -07:00