codex

mirror of https://github.com/openai/codex.git synced 2026-05-18 18:22:39 +00:00

Author	SHA1	Message	Date
Peter Bakkum	51080fbc4e	Refine realtime footer and local context controls	2026-03-11 07:41:40 -07:00
Peter Bakkum	6512f0fe92	Improve realtime footer status and response recovery	2026-03-11 07:17:02 -07:00
Peter Bakkum	302f5e648b	Log realtime session setup and function calls	2026-03-11 06:47:10 -07:00
Peter Bakkum	f514f39994	Move realtime recording meter to footer	2026-03-11 06:46:46 -07:00
Peter Bakkum	7a5acf8433	Merge branch 'main' into dev/pbakkum/codex-realtimeapi-port	2026-03-11 04:56:07 -07:00
Channing Conger	2cfa106091	Responses: set x-client-request-id as conversation_id when talking to responses (#14312 ) Right now we're sending the header session_id to responses which is ignored/dropped. This sets a useful x-client-request-id to the conversation_id.	2026-03-10 23:46:05 -07:00
Peter Bakkum	6038a5e151	Fix typed realtime text follow-up flow	2026-03-10 22:37:15 -07:00
Peter Bakkum	f81d16c20e	Tighten realtime control-tool prompting	2026-03-10 22:06:22 -07:00
Peter Bakkum	29e856b588	Add realtime queue and runtime control tools	2026-03-10 21:59:00 -07:00
Fouad Matin	78280f872a	fix(arc_monitor): api path (#14290 ) This PR just fixes the API path for ARC monitor.	2026-03-11 02:50:38 +00:00
gabec-openai	052ec629b1	Add keyboard based fast switching between agents in TUI (#13923 )	2026-03-10 19:41:51 -07:00
pakrym-oai	816e447ead	Add snippets annotated with types to tools when code mode enabled (#14284 ) Main purpose is for code mode to understand the return type.	2026-03-10 19:20:15 -07:00
Ahmed Ibrahim	cc417c39a0	Split spawn_csv from multi_agent (#14282 ) - make `spawn_csv` a standalone feature for CSV agent jobs - keep `spawn_csv -> multi_agent` one-way and preserve restricted subagent disable paths	2026-03-11 01:42:50 +00:00
Ahmed Ibrahim	5b10b93ba2	Add realtime start instructions config override (#14270 ) - add `realtime_start_instructions` config support - thread it into realtime context updates, schema, docs, and tests	2026-03-10 18:42:05 -07:00
pakrym-oai	566897d427	Make unified exec session_id numeric (#14279 ) It's a number on the write_stdin input, make it a number on the output and also internally.	2026-03-10 18:38:39 -07:00
pakrym-oai	24b8d443b8	Prefix code mode output with success or failure message and include error stack (#14272 )	2026-03-10 18:33:52 -07:00
Peter Bakkum	103d384781	Update realtime prompt to clarify queuing behavior and add under-development features warning	2026-03-10 18:10:03 -07:00
pash-openai	cec211cabc	render local file links from target paths (#13857 ) Co-authored-by: Josh McKinney <joshka@openai.com>	2026-03-10 18:00:48 -07:00
Ahmed Ibrahim	3f7cb03043	Stabilize websocket response.failed error delivery (#14017 ) ## What changed - Drop failed websocket connections immediately after a terminal stream error instead of awaiting a graceful close handshake before forwarding the error to the caller. - Keep the success path and the closed-connection guard behavior unchanged. ## Why this fixes the flake - The failing integration test waits for the second websocket stream to surface the model error before issuing a follow-up request. - On slower runners, the old error path awaited `ws_stream.close().await` before sending the error downstream. If that close handshake stalled, the test kept waiting for an error that had already happened server-side and nextest timed it out. - Dropping the failed websocket immediately makes the terminal error observable right away and marks the session closed so the next request reconnects cleanly instead of depending on a best-effort close handshake. ## Code or test? - This is a production logic fix in `codex-api`. The existing websocket integration test already exercises the regression path.	2026-03-10 17:59:41 -07:00
Peter Bakkum	7cce23f6a6	Add realtime voice controls and interruption handling	2026-03-10 17:49:04 -07:00
Ahmed Ibrahim	567ad7fafd	Show spawned agent model and effort in TUI (#14273 ) - include the requested sub-agent model and reasoning effort in the spawn begin event - render that metadata next to the spawned agent name and role in the TUI transcript --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-11 00:46:25 +00:00
pakrym-oai	37f51382fd	Rename code mode tool to exec (#14254 ) Summary - update the code-mode handler, runner, instructions, and error text to refer to the `exec` tool name everywhere that used to say `code_mode` - ensure generated documentation strings and tool specs describe `exec` and rely on the shared `PUBLIC_TOOL_NAME` - refresh the suite tests so they invoke `exec` instead of the old name Testing - Not run (not requested)	2026-03-11 00:30:16 +00:00
maja-openai	16daab66d9	prompt changes to guardian (#14263 ) ## Summary - update the guardian prompting - clarify the guardian rejection message so an action may still proceed if the user explicitly approves it after being informed of the risk ## Testing - cargo run on selected examples	2026-03-10 17:05:43 -07:00
Ahmed Ibrahim	f6e966e64a	Stabilize pipe process stdin round-trip test (#14013 ) ## What changed - keep the explicit stdin-close behavior after writing so the child still receives EOF deterministically - on Windows, stop using `python -c` for the round-trip assertion and instead run a native `cmd.exe` pipeline that reads one line from stdin with `set /p` and echoes it back - send `\r\n` on Windows so the stdin payload matches the platform-native line ending the shell reader expects ## Why this fixes flakiness The failing branch-local flake was not in `spawn_pipe_process` itself. The child exited cleanly, but the Windows ARM runner sometimes produced an empty stdout string when the test used Python as the stdin consumer. That makes the test sensitive to Python startup and stdin-close timing rather than the pipe primitive we actually want to validate. Switching the Windows path to a native `cmd.exe` reader keeps the assertion focused on our pipe behavior: bytes written to stdin should come back on stdout before EOF closes the process. The explicit `\r\n` write removes line-ending ambiguity on Windows. ## Scope - test-only - no production logic change	2026-03-10 17:00:49 -07:00
Celia Chen	295b56bece	chore: add a separate reject-policy flag for skill approvals (#14271 ) ## Summary - add `skill_approval` to `RejectConfig` and the app-server v2 `AskForApproval::Reject` payload so skill-script prompts can be configured independently from sandbox and rule-based prompts - update Unix shell escalation to reject prompts based on the actual decision source, keeping prefix rules tied to `rules`, unmatched command fallbacks tied to `sandbox_approval`, and skill scripts tied to `skill_approval` - regenerate the affected protocol/config schemas and expand unit/integration coverage for the new flag and skill approval behavior	2026-03-10 23:58:23 +00:00
pakrym-oai	18199d4e0e	Add store/load support for code mode (#14259 ) adds support for transferring state across code mode invocations.	2026-03-10 16:53:53 -07:00
Rasmus Rygaard	f8ef154a6b	Pass more params to compaction (#14247 ) Pass more params to /compact. This should give us parity with the /responses endpoint to improve caching. I'm torn about the MCP await. Blocking will give us parity but it seems like we explicitly don't block on MCPs. Happy either way	2026-03-10 16:39:57 -07:00
Leo Shimonaka	de2a73cd91	feat: Add additional macOS Sandbox Permissions for Launch Services, Contacts, Reminders (#14155 ) Add additional macOS Sandbox Permissions levers for the following: - Launch Services - Contacts - Reminders	2026-03-10 23:34:47 +00:00
joeytrasatti-openai	e4bc352782	Add ephemeral flag support to thread fork (#14248 ) ### Summary This PR adds first-class ephemeral support to thread/fork, bringing it in line with thread/start. The goal is to support one-off completions on full forked threads without persisting them as normal user-visible threads. ### Testing	2026-03-10 16:34:27 -07:00
pakrym-oai	8b33485302	Add code_mode output helpers for text and images (#14244 ) Summary - document how code-mode can import `output_text`/`output_image` and ensure `add_content` stays compatible - add a synthetic `@openai/code_mode` module that appends content items and validates inputs - cover the new behavior with integration tests for structured text and image outputs Testing - Not run (not requested)	2026-03-10 16:25:27 -07:00
Ahmed Ibrahim	bf936fa0c1	Clarify close_agent tool description (#14269 ) - clarify the `close_agent` tool description so it nudges models to close agents they no longer need - keep the change scoped to the tool spec text only Co-authored-by: Codex <noreply@openai.com>	2026-03-10 16:25:08 -07:00
Ahmed Ibrahim	44bfd2f12e	Increase sdk workflow timeout to 15 minutes (#14252 ) - raise the sdk workflow job timeout from 10 to 15 minutes to reduce false cancellations near the current limit --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-10 16:24:55 -07:00
gabec-openai	b73228722a	Load agent metadata from role files (#14177 )	2026-03-10 16:21:48 -07:00
pakrym-oai	e791559029	Add model-controlled truncation for code mode results (#14258 ) Summary - document that `@openai/code_mode` exposes `set_max_output_tokens_per_exec_call` and that `code_mode` truncates the final Rust-side output when the budget is exceeded - enforce the configured budget in the Rust tool runner, reusing truncation helpers so text-only outputs follow the unified-exec wrapper and mixed outputs still fit within the limit - ensure the new behavior is covered by a code-mode integration test and string spec update Testing - Not run (not requested)	2026-03-10 15:57:14 -07:00
pakrym-oai	c7e28cffab	Add output schema to MCP tools and expose MCP tool results in code mode (#14236 ) Summary - drop `McpToolOutput` in favor of `CallToolResult`, moving its helpers to keep MCP tooling focused on the final result shape - wire the new schema definitions through code mode, context, handlers, and spec modules so MCP tools serialize the exact output shape expected by the model - extend code mode tests to cover multiple MCP call scenarios and ensure the serialized data matches the new schema - refresh JS runner helpers and protocol models alongside the schema changes Testing - Not run (not requested)	2026-03-10 15:25:19 -07:00
Dylan Hurd	15163050dc	app-server: propagate nested experimental gating for AskForApproval::Reject (#14191 ) ## Summary This change makes `AskForApproval::Reject` gate correctly anywhere it appears inside otherwise-stable app-server protocol types. Previously, experimental gating for `approval_policy: Reject` was handled with request-specific logic in `ClientRequest` detection. That covered a few request params types, but it did not generalize to other nested uses such as `ProfileV2`, `Config`, `ConfigReadResponse`, or `ConfigRequirements`. This PR replaces that ad hoc handling with a generic nested experimental propagation mechanism. ## Testing seeing this when run app-server-test-client without experimental api enabled: ``` initialize response: InitializeResponse { user_agent: "codex-toy-app-server/0.0.0 (Mac OS 26.3.1; arm64) vscode/2.4.36 (codex-toy-app-server; 0.0.0)" } > { > "id": "50244f6a-270a-425d-ace0-e9e98205bde7", > "method": "thread/start", > "params": { > "approvalPolicy": { > "reject": { > "mcp_elicitations": false, > "request_permissions": true, > "rules": false, > "sandbox_approval": true > } > }, > "baseInstructions": null, > "config": null, > "cwd": null, > "developerInstructions": null, > "dynamicTools": null, > "ephemeral": null, > "experimentalRawEvents": false, > "mockExperimentalField": null, > "model": null, > "modelProvider": null, > "persistExtendedHistory": false, > "personality": null, > "sandbox": null, > "serviceName": null > } > } < { < "error": { < "code": -32600, < "message": "askForApproval.reject requires experimentalApi capability" < }, < "id": "50244f6a-270a-425d-ace0-e9e98205bde7" < } [verified] thread/start rejected approvalPolicy=Reject without experimentalApi ``` --------- Co-authored-by: celia-oai <celia@openai.com>	2026-03-10 22:21:52 +00:00
Won Park	28934762d0	unifying all image saves to /tmp to bug-proof (#14149 ) image-gen feature will have the model saving to /tmp by default + at all times	2026-03-10 15:13:12 -07:00
Peter Bakkum	ab5451c4ae	Port realtime websocket handoff flow	2026-03-10 14:17:16 -07:00
Ahmed Ibrahim	2895d3571b	Add spawn_agent model overrides (#14160 ) - add `model` and `reasoning_effort` to the `spawn_agent` schema so the values pass through - validate requested models against `model.model` and only check that the selected model supports the requested reasoning effort --------- Co-authored-by: Codex <noreply@openai.com>	2026-03-10 14:04:04 -07:00
Owen Lin	9b3332e62f	fix(python-sdk): stop checking in codex binaries; stage pinned runtim… (#14232 ) …e package ## Summary This changes the Python SDK packaging model so we no longer commit `codex` binaries into `sdk/python`. Instead, published SDK builds now depend on a separate `codex-cli-bin` runtime package that carries the platform-specific `codex` binary. The SDK and runtime can be staged together with an exact version pin, so the published Python SDK still resolves to a Codex version we know is compatible. The SDK now resolves `codex` in this order: - `AppServerConfig.codex_bin` if explicitly set - installed `codex-cli-bin` runtime package There is no `PATH` fallback anymore. Published installs either use the pinned runtime or fail loudly, and local development uses an explicit `codex_bin` override when working from the repo. ## What changed - removed checked-in binaries from `sdk/python/src/codex_app_server/bin` - changed `AppServerClient` to resolve `codex` from: - explicit `AppServerConfig.codex_bin` - installed `codex-cli-bin` - kept `AppServerConfig.codex_bin` override support for local/dev use - added a new `sdk/python-runtime` package template for the pinned runtime - updated `scripts/update_sdk_artifacts.py` to stage releasable SDK/runtime packages instead of downloading binaries into the repo - made `codex-cli-bin` build as a platform-specific wheel - made `codex-cli-bin` wheel-only by rejecting `sdist` builds - updated docs/tests to match the new packaging flow and explicit local-dev contract ## Why Checking in six platform binaries made the repo much heavier and tied normal source changes to release artifacts. This keeps the compatibility guarantees we want, but moves them into packaging: - the published SDK can depend on an exact `codex-cli-bin==...` - the runtime package carries the platform-specific binary - users still get a pinned runtime - the repo no longer needs to store those binaries It also makes the runtime contract stricter and more predictable: - published installs never silently fall back to an arbitrary `codex` on `PATH` - local development remains supported through explicit `codex_bin` - `codex-cli-bin` is distributed as platform wheels only, which avoids unsafe source-distribution installs for a package that embeds a prebuilt binary ## Validation - ran targeted Python SDK tests: - `python3 -m pytest sdk/python/tests/test_artifact_workflow_and_binaries.py sdk/python/tests/test_client_rpc_methods.py sdk/python/tests/test_contract_generation.py` - exercised the staging flow with a local dummy binary to verify SDK/runtime staging end to end - verified the staged runtime package builds a platform-specific wheel (`Root-Is-Purelib: false`) rather than a universal `py3-none-any` wheel - added test coverage for the explicit-only runtime resolution model - added test coverage that `codex-cli-bin` rejects `sdist` builds --------- Co-authored-by: sdcoffey <stevendcoffey@gmail.com>	2026-03-10 13:53:08 -07:00
alexsong-oai	22d0aea5ba	Add granular metrics for cloud requirements load (#14108 )	2026-03-10 13:44:26 -07:00
xl-openai	2544bd02a2	feat: Allow sync with remote plugin status. (#14176 ) Add forceRemoteSync to plugin/list. When it is set to True, we will sync the local plugin status with the remote one (backend-api/plugins/list).	2026-03-10 13:32:59 -07:00
Matthew Zeng	bda9e55c7e	add(core): arc_monitor (#13936 ) ## Summary - add ARC monitor support for MCP tool calls by serializing MCP approval requests into the ARC action shape and sending the relevant conversation/policy context to the `/api/codex/safety/arc` endpoint - route ARC outcomes back into MCP approval flow so `ask-user` falls back to a user prompt and `steer-model` blocks the tool call, with guardian/ARC tests covering the new request shape - update the TUI approval copy from “Approve Once” to “Allow” / “Allow for this session” and refresh the related snapshots --------- Co-authored-by: Fouad Matin <fouad@openai.com> Co-authored-by: Fouad Matin <169186268+fouad-openai@users.noreply.github.com>	2026-03-10 13:16:47 -07:00
Charlie Guo	01e2c3b8d9	Add OpenAI Docs skill (#13596 ) ## Summary - add the OpenAI Docs skill under codex-rs/skills/src/assets/samples/openai-docs - include the skill metadata, assets, and GPT-5.4 upgrade reference files - exclude the test harness and test fixtures ## Testing - not run (skill-only asset copy)	2026-03-10 12:37:23 -07:00
Eugene Brevdo	027afb8858	[skill-creator] Add forward-testing instructions (#13600 ) This updates the `skill-creator` sample skill to explicitly cover forward-testing as part of the skill authoring workflow. The guidance now treats subagent-based validation as a first-class step for complex or fragile skills, with an emphasis on preserving evaluation integrity and avoiding leaked context. The sample initialization script is also updated so newly created skills point authors toward forward-testing after validation. Together, these changes make the sample more opinionated about how skills should be iterated on once the initial implementation is complete. - Add new guidance to `SKILL.md` on protecting validation integrity, when to use subagents for forward-testing, and how to structure realistic test prompts without leaking expected answers. - Expand the skill creation workflow so iteration explicitly includes forward-testing for complex skills, including approval guidance for expensive or risky validation runs.	2026-03-10 12:08:48 -07:00
guinness-oai	b33edebd6a	Mark incomplete resumed turns interrupted when idle (#14125 ) Fixes a Codex app bug where quitting the app mid-run could leave the reopened thread stuck in progress and non-interactable. On cold thread resume, app-server could return an idle thread with a replayed turn still marked in progress. This marks incomplete replayed turns as interrupted unless the thread is actually active.	2026-03-10 10:57:03 -07:00
pakrym-oai	46e6661d4e	Reuse McpToolOutput in McpHandler (#14229 ) We already have a type to represent the MCP tool output, reuse it instead of the custom McpHandlerOutput	2026-03-10 10:41:41 -07:00
Ahmed Ibrahim	77a02909a8	Stabilize split PTY output on Windows (#14003 ) ## Summary - run the split stdout/stderr PTY test through the normal shell helper on every platform - use a Windows-native command string instead of depending on Python to emit split streams - assert CRLF line endings on Windows explicitly ## Why this fixes the flake The earlier PTY split-output test used a Python one-liner on Windows while the rest of the file exercised shell-command behavior. That made the test depend on runner-local Python availability and masked the real Windows shell output shape. Using a native cmd-compatible command and asserting the actual CRLF output makes the split stdout/stderr coverage deterministic on Windows runners.	2026-03-10 10:25:29 -07:00
pakrym-oai	e52afd28b0	Expose strongly-typed result for exec_command (#14183 ) Summary - document output types for the various tool handlers and registry so the API exposes richer descriptions - update unified execution helpers and client tests to align with the new output metadata - clean up unused helpers across tool dispatch paths Testing - Not run (not requested)	2026-03-10 09:54:34 -07:00
Eric Traut	e4edafe1a8	Log ChatGPT user ID for feedback tags (#13901 ) There are some bug investigations that currently require us to ask users for their user ID even though they've already uploaded logs and session details via `/feedback`. This frustrates users and increases the time for diagnosis. This PR includes the ChatGPT user ID in the metadata uploaded for `/feedback` (both the TUI and app-server).	2026-03-10 09:57:41 -06:00


4453 Commits