## Why
PR #13783 moved the `codex.rs` unit tests into `codex_tests.rs`. This
applies the same extraction pattern across the rest of `codex-rs/core`
so the production modules stay focused on runtime code instead of large
inline test blocks.
Keeping the tests in sibling files also makes follow-up edits easier to
review because product changes no longer have to share a file with
hundreds or thousands of lines of test scaffolding.
## What changed
- replaced each inline `mod tests { ... }` in `codex-rs/core/src/**`
with a path-based module declaration (sketched below)
- moved each extracted unit test module into a sibling `*_tests.rs`
file, using `mod_tests.rs` for `mod.rs` modules
- preserved the existing `cfg(...)` guards and module-local structure so
the refactor remains structural rather than behavioral
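For reference, a minimal sketch of the pattern, assuming a module `foo.rs` with a sibling `foo_tests.rs` (the names are illustrative):

```rust
// In foo.rs: the inline `mod tests { ... }` becomes a path-based declaration
// pointing at the sibling file. `#[path]` resolves relative to the directory
// containing foo.rs, so foo_tests.rs sits next to it.
#[cfg(test)]
#[path = "foo_tests.rs"]
mod tests;

// foo_tests.rs then holds the extracted tests and reaches the parent module
// via `use super::*;`, exactly as the inline module did.
```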
## Testing
- `cargo test -p codex-core --lib` (`1653 passed; 0 failed; 5 ignored`)
- `just fix -p codex-core`
- `cargo fmt --check`
- `cargo shear`
## Why
To support the new bring-your-own search tool in the Responses API
(https://developers.openai.com/api/docs/guides/tools-tool-search#client-executed-tool-search),
we are migrating our BM25 search tool to the official way of executing
search on the client and communicating additional tools to the model.
## What
- replace the legacy `search_tool_bm25` flow with client-executed
`tool_search`
- add protocol, SSE, history, and normalization support for
`tool_search_call` and `tool_search_output` (see the sketch after this list)
- return namespaced Codex Apps search results and wire namespaced
follow-up tool calls back into MCP dispatch
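As a rough illustration only, the two new item kinds could be modeled along the lines below; every type and field name in this sketch is an assumption, not the actual codex protocol definition:

```rust
use serde::{Deserialize, Serialize};

// Hedged sketch of the `tool_search_call` / `tool_search_output` items.
#[derive(Serialize, Deserialize)]
#[serde(tag = "type", rename_all = "snake_case")]
enum ToolSearchItem {
    ToolSearchCall {
        call_id: String,
        query: String, // what the model asked to search for
    },
    ToolSearchOutput {
        call_id: String,
        results: Vec<String>, // e.g. namespaced tool names returned to the model
    },
}
```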
## Summary
- document output types for the various tool handlers and registry so
the API exposes richer descriptions
- update unified execution helpers and client tests to align with the
new output metadata
- clean up unused helpers across tool dispatch paths
## Testing
- Not run (not requested)
This changes the web_search tool spec in codex-core to use dedicated
Responses-API payload structs instead of shared config types and custom
serializers.
Previously, `ToolSpec::WebSearch` stored `WebSearchFilters` and
`WebSearchUserLocation` directly and relied on hand-written serializers
to shape the outgoing JSON. This worked, but it mixed config/schema
types with the OpenAI Responses payload contract and created an easy
place for drift if those shared types changed later.
### Why
This keeps the boundary clearer:
- app-server/config/schema types stay focused on config
- Responses tool payload types stay focused on the OpenAI wire format
It also makes the serialization behavior obvious from the structs
themselves, instead of hiding it in custom serializer functions.
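Directionally, the dedicated payload types look something like the sketch below; the field names and exact shapes are illustrative, not the real codex-core definitions:

```rust
use serde::Serialize;

// Hedged sketch: the serialization behavior is visible on the structs
// themselves instead of living in custom serializer functions.
#[derive(Serialize)]
struct WebSearchToolPayload {
    #[serde(rename = "type")]
    kind: &'static str, // e.g. "web_search"
    #[serde(skip_serializing_if = "Option::is_none")]
    filters: Option<WebSearchFiltersPayload>,
    #[serde(skip_serializing_if = "Option::is_none")]
    user_location: Option<WebSearchUserLocationPayload>,
}

#[derive(Serialize)]
struct WebSearchFiltersPayload {
    #[serde(skip_serializing_if = "Option::is_none")]
    allowed_domains: Option<Vec<String>>,
}

#[derive(Serialize)]
struct WebSearchUserLocationPayload {
    #[serde(rename = "type")]
    kind: &'static str, // e.g. "approximate"
    #[serde(skip_serializing_if = "Option::is_none")]
    country: Option<String>,
    #[serde(skip_serializing_if = "Option::is_none")]
    city: Option<String>,
}
```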
Previously, we could only configure whether web search was on or off.
This PR enables sending along a web search config, which includes
everything the Responses API supports: filters, location, etc.
Add `web_search_tool_type` to `model_info`; it can be populated from the
backend and will be used to filter which models can use `web_search`
with images and which can't.
Added a small unit test.
- add a local Fast mode setting in codex-core (similar to how model id
is currently stored on disk locally)
- send `service_tier=priority` on requests when Fast is enabled (sketched below)
- add `/fast` in the TUI and persist it locally
- feature flag
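A tiny sketch of the request-side change; the struct and field names are assumptions, only `service_tier=priority` comes from this PR:

```rust
// Hedged sketch: attach the priority service tier only when Fast mode is on.
struct ResponsesRequest {
    service_tier: Option<String>,
    // ... other request fields
}

fn apply_fast_mode(request: &mut ResponsesRequest, fast_enabled: bool) {
    request.service_tier = fast_enabled.then(|| "priority".to_string());
}
```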
## Summary
This changes `custom_tool_call_output` to use the same output payload
shape as `function_call_output`, so freeform tools can return either
plain text or structured content items.
The main goal is to let `js_repl` return image content from nested
`view_image` calls in its own `custom_tool_call_output`, instead of
relying on a separate injected message.
## What changed
- Changed `custom_tool_call_output.output` from `string` to
`FunctionCallOutputPayload`
- Updated freeform tool plumbing to preserve structured output bodies
- Updated `js_repl` to aggregate nested tool content items and attach
them to the outer `js_repl` result
- Removed the old `js_repl` special case that injected `view_image`
results as a separate pending user image message
- Updated normalization/history/truncation paths to handle multimodal
`custom_tool_call_output`
- Regenerated app-server protocol schema artifacts
## Behavior
Direct `view_image` calls still return a `function_call_output` with
image content.
When `view_image` is called inside `js_repl`, the outer `js_repl`
`custom_tool_call_output` now carries:
- an `input_text` item if the JS produced text output
- one or more `input_image` items from nested tool results
So the nested image result now stays inside the `js_repl` tool output
instead of being injected as a separate message.
## Compatibility
This is intended to be backward-compatible for resumed conversations.
Older histories that stored `custom_tool_call_output.output` as a plain
string still deserialize correctly, and older histories that used the
previous injected-image-message flow also continue to resume.
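A minimal sketch of how that backward compatibility can work with serde; the type names are illustrative, not the exact codex-core definitions:

```rust
use serde::Deserialize;

// Hedged sketch: `output` accepts either the legacy plain string or the new
// structured payload.
#[derive(Deserialize)]
#[serde(untagged)]
enum CustomToolOutput {
    // Older rollouts stored `custom_tool_call_output.output` as a string.
    Legacy(String),
    // Newer rollouts store structured content items.
    Structured { content: Vec<OutputContentItem> },
}

#[derive(Deserialize)]
#[serde(tag = "type", rename_all = "snake_case")]
enum OutputContentItem {
    InputText { text: String },
    InputImage { image_url: String },
}
```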
Added regression coverage for resuming a pre-change rollout containing:
- string-valued `custom_tool_call_output`
- legacy injected image message history
Took over the work that @aaronl-openai started here:
https://github.com/openai/codex/pull/10397
Now that app-server clients are able to set up custom tools (called
`dynamic_tools` in app-server), we should expose a way for clients to
pass in not just text, but also image outputs. This is something the
Responses API already supports for function call outputs, where you can
pass in either a string or an array of content outputs (text, image,
file):
https://platform.openai.com/docs/api-reference/responses/create#responses_create-input-input_item_list-item-function_tool_call_output-output-array-input_image
So let's just plumb it through in Codex (with the caveat that we only
support text and image for now). This is implemented end-to-end across
app-server v2 protocol types and core tool handling.
## Breaking API change
NOTE: This introduces a breaking change with dynamic tools, but I think
it's ok since this concept was only recently introduced
(https://github.com/openai/codex/pull/9539) and it's better to get the
API contract correct. I don't think there are any real consumers of this
yet (not even the Codex App).
Old shape:
`{ "output": "dynamic-ok", "success": true }`
New shape:
```
{
  "contentItems": [
    { "type": "inputText", "text": "dynamic-ok" },
    { "type": "inputImage", "imageUrl": "data:image/png;base64,AAA" }
  ],
  "success": true
}
```
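For reference, the new shape above maps onto serde types roughly like this; the type names are illustrative, not the actual app-server protocol definitions:

```rust
use serde::{Deserialize, Serialize};

// Hedged sketch of the new dynamic tool output shape.
#[derive(Serialize, Deserialize)]
#[serde(rename_all = "camelCase")]
struct DynamicToolCallOutput {
    content_items: Vec<DynamicToolContentItem>,
    success: bool,
}

#[derive(Serialize, Deserialize)]
#[serde(tag = "type", rename_all = "camelCase")]
enum DynamicToolContentItem {
    InputText { text: String },
    #[serde(rename_all = "camelCase")]
    InputImage { image_url: String },
}
```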
## Summary
Introduces the concept of a `model_personality` config. I would consider
this an MVP for testing out the feature. There are a number of
follow-ups to this PR:
- More sophisticated templating with validation
- In-product experience to manage this
## Testing
- [x] Testing locally
## Summary
This PR consolidates base_instructions onto SessionMeta /
SessionConfiguration, so we ensure `base_instructions` is set once per
session and should be (mostly) immutable, unless:
- overridden by config on resume / fork
- sub-agent tasks, like review or collab
In a future PR, we should convert all references to `base_instructions`
to consistently use the typed struct, so it's less likely that we put
other strings there. See #9423. However, this PR is already quite
complex, so I'm deferring that to a follow-up.
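A rough sketch of where `base_instructions` now lives; the field shapes are illustrative rather than the exact `SessionConfiguration` definition:

```rust
// Hedged sketch: base instructions are set once when the session is created
// and stay fixed for its lifetime, except for the cases listed above.
#[derive(Clone, Debug)]
pub struct SessionConfiguration {
    /// Captured at session creation; only replaced on resume/fork overrides
    /// or for sub-agent tasks such as review or collab.
    pub base_instructions: Option<String>,
    // ... other per-session settings (model, cwd, approval policy, ...)
}
```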
## Testing
- [x] Added a resume test to assert that instructions are preserved. In
particular, `resume_switches_models_preserves_base_instructions` fails
against main.
Existing test coverage that asserts base instructions are preserved
across multiple requests in a session:
- Manual compact keeps baseline instructions:
core/tests/suite/compact.rs:199
- Auto-compact keeps baseline instructions:
core/tests/suite/compact.rs:1142
- Prompt caching reuses the same instructions across two requests:
core/tests/suite/prompt_caching.rs:150 and
core/tests/suite/prompt_caching.rs:157
- Prompt caching with explicit expected string across two requests:
core/tests/suite/prompt_caching.rs:213 and
core/tests/suite/prompt_caching.rs:222
- Resume with model switch keeps original instructions:
core/tests/suite/resume.rs:136
- Compact/resume/fork uses request 0 instructions for later expected
payloads: core/tests/suite/compact_resume_fork.rs:215
- Merge `ModelFamily` into `ModelInfo`
- Remove logic for adding instructions to apply patch
- Add compaction limit and visible context window to `ModelInfo`
Add `web_search_cached` feature to config. Enables `web_search` tool
with access only to cached/indexed results (see
[docs](https://platform.openai.com/docs/guides/tools-web-search#live-internet-access)).
This takes precedence over the existing `web_search_request`, which
continues to enable `web_search` over live results as it did before.
`web_search_cached` is disabled for review mode, as `web_search_request`
is.
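Roughly, the precedence works like the sketch below; the function and enum names are illustrative, only the two config flags come from this change:

```rust
// Hedged sketch of the precedence between the two web search flags.
enum WebSearchMode {
    CachedOnly, // cached/indexed results only
    Live,       // live internet access, as before
}

fn resolve_web_search(web_search_cached: bool, web_search_request: bool) -> Option<WebSearchMode> {
    if web_search_cached {
        Some(WebSearchMode::CachedOnly) // takes precedence
    } else if web_search_request {
        Some(WebSearchMode::Live)
    } else {
        None // tool disabled (and always disabled in review mode)
    }
}
```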
This PR moves `ModelsFamily` to `openai_models`. It also propagates
`ModelsManager` to session services and uses it to drive the model
family. We also make `derive_default_model_family` private because it's
a step towards what we want: one place that provides model configuration.
This is a second step toward having one source of truth for model
information and config: `ModelsManager`.
Next steps would be to remove `ModelsFamily` from config. That's a large
change because it's used in 41 places, mostly before `codex` launches.
We also need to make `find_family_for_model` private, which is also big
because it's used in 21 places, almost all of them tests.
Instead of returning structured output and then re-formatting it into
freeform, return the freeform output from the `shell_command` tool directly.
Keep `shell` as the default tool for GPT-5.
## Summary
- update documentation, example configs, and automation defaults to
reference gpt-5.1 / gpt-5.1-codex
- bump the CLI and core configuration defaults, model presets, and error
messaging to the new models while keeping the model-family/tool coverage
for legacy slugs
- refresh tests, fixtures, and TUI snapshots so they expect the upgraded
defaults
## Testing
- `cargo test -p codex-core
config::tests::test_precedence_fixture_with_gpt5_profile`
------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_6916c5b3c2b08321ace04ee38604fc6b)
Core event to app server event mapping:
1. `codex/event/reasoning_content_delta` -> `item/reasoning/summaryTextDelta`
2. `codex/event/reasoning_raw_content_delta` -> `item/reasoning/textDelta`
3. `codex/event/agent_message_content_delta` -> `item/agentMessage/delta`
4. `codex/event/agent_reasoning_section_break` -> `item/reasoning/summaryPartAdded`
Also added a change in core to pass down content index, summary index
and item id from events.
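Expressed over the wire method names from the list above, the mapping is roughly the following (a sketch; the real code maps typed events rather than strings):

```rust
// Hedged sketch of the core -> app-server event mapping, keyed by method name.
fn map_core_event(method: &str) -> Option<&'static str> {
    Some(match method {
        "codex/event/reasoning_content_delta" => "item/reasoning/summaryTextDelta",
        "codex/event/reasoning_raw_content_delta" => "item/reasoning/textDelta",
        "codex/event/agent_message_content_delta" => "item/agentMessage/delta",
        "codex/event/agent_reasoning_section_break" => "item/reasoning/summaryPartAdded",
        _ => return None,
    })
}
```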
Tested with `git checkout owen/app_server_test_client && cargo run
-p codex-app-server-test-client -- send-message-v2 "hello"` and verified
that the new events are emitted correctly.
Adds `AgentMessageContentDelta`, `ReasoningContentDelta`, and
`ReasoningRawContentDelta` item streaming events while maintaining
compatibility with the old events.
---------
Co-authored-by: Owen Lin <owen@openai.com>
In this PR, I am exploring migrating the task kind to an invocation of
Codex. The main reason is getting rid of the multiple
`ConversationHistory` states and streamlining our context/history
management.
This approach depends on opening a channel between the sub-codex and
codex. This channel is responsible for forwarding `interactive`
(`approvals`) and `non-interactive` events. The `task` is responsible
for handling those events.
This opens the door for implementing `codex as a tool`, replacing
`compact` and `review`, and potentially subagents.
One consideration is that this code is very similar to `app-server`,
especially in the approval handling. If in the future we want an
interactive `sub-codex`, we should consider using `codex-mcp`.
## Summary
This PR is an alternative approach to #4711: instead of changing our
storage, it parses out shell calls in the client and reserializes them
on the fly before we send them out as part of the request.
What this changes:
1. Adds additional serialization logic when
`ApplyPatchToolType::Freeform` is in use.
2. Adds a `--custom-apply-patch` flag to enable this setting on a
session-by-session basis.
This change is delicate, but is not meant to be permanent. It is meant
to be the first step in a migration:
1. (This PR) Add in-flight serialization with config
2. Update model_family default
3. Update serialization logic to store turn outputs in a structured
format, with logic to serialize based on model_family setting.
4. Remove this rewrite in-flight logic.
## Test Plan
- [x] Additional unit tests added
- [x] Integration tests added
- [x] Tested locally
# Tool System Refactor
- Centralizes tool definitions and execution in `core/src/tools/*`:
specs (`spec.rs`), handlers (`handlers/*`), router (`router.rs`),
registry/dispatch (`registry.rs`), and shared context (`context.rs`).
One registry now builds the model-visible tool list and binds handlers.
- Router converts model responses to tool calls; Registry dispatches
with consistent telemetry via `codex-rs/otel` and unified error
handling. Function, Local Shell, MCP, and experimental `unified_exec`
all flow through this path; legacy shell aliases still work.
- Rationale: reduce per‑tool boilerplate, keep spec/handler in sync, and
make adding tools predictable and testable.
Example: `read_file`
- Spec: `core/src/tools/spec.rs` (see `create_read_file_tool`,
registered by `build_specs`).
- Handler: `core/src/tools/handlers/read_file.rs` (absolute `file_path`,
1-indexed `offset`, `limit`, `L#: ` prefixes, safe truncation); a rough sketch follows below.
- E2E test: `core/tests/suite/read_file.rs` validates the tool returns
the requested lines.
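A rough sketch of the line-prefixing behavior described above (not the actual handler code; the signature is illustrative and long-line truncation is omitted):

```rust
// Hedged sketch: format a window of a file with `L#: ` prefixes.
// `offset` is 1-indexed; `limit` caps the number of returned lines.
fn format_read_file(contents: &str, offset: usize, limit: usize) -> String {
    contents
        .lines()
        .enumerate()
        .skip(offset.saturating_sub(1))
        .take(limit)
        .map(|(idx, line)| format!("L{}: {}", idx + 1, line))
        .collect::<Vec<_>>()
        .join("\n")
}
```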
## Next steps
- Decompose `handle_container_exec_with_params`
- Add parallel tool calls
We currently get information about rate limits in the response headers.
We want to forward it to clients for better transparency.
UI/UX plans have been discussed, and this information is needed.
## Summary
Resolves a merge conflict between #3597 and #3560, and adds tests to
double check our apply_patch configuration.
## Testing
- [x] Added unit tests
---------
Co-authored-by: dedrisian-oai <dedrisian@openai.com>
## 📝 Review Mode -- Core
This PR introduces the Core implementation for Review mode:
- New op `Op::Review { prompt: String }`: spawns a child review task
with isolated context, a review-specific system prompt, and a
`Config.review_model`.
- `EnteredReviewMode`: emitted when the child review session starts.
Every event from this point onwards reflects the review session.
- `ExitedReviewMode(Option<ReviewOutputEvent>)`: emitted when the review
finishes or is interrupted, with optional structured findings:
```json
{
"findings": [
{
"title": "<≤ 80 chars, imperative>",
"body": "<valid Markdown explaining *why* this is a problem; cite files/lines/functions>",
"confidence_score": <float 0.0-1.0>,
"priority": <int 0-3>,
"code_location": {
"absolute_file_path": "<file path>",
"line_range": {"start": <int>, "end": <int>}
}
}
],
"overall_correctness": "patch is correct" | "patch is incorrect",
"overall_explanation": "<1-3 sentence explanation justifying the overall_correctness verdict>",
"overall_confidence_score": <float 0.0-1.0>
}
```
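In Rust terms, the structured findings correspond roughly to types like these (a sketch inferred from the JSON above, not the exact codex-core definitions):

```rust
// Hedged sketch of the review output payload; names mirror the JSON above.
struct ReviewOutputEvent {
    findings: Vec<ReviewFinding>,
    overall_correctness: String, // "patch is correct" | "patch is incorrect"
    overall_explanation: String,
    overall_confidence_score: f64,
}

struct ReviewFinding {
    title: String,
    body: String,
    confidence_score: f64, // 0.0 - 1.0
    priority: u8,          // 0 - 3
    code_location: CodeLocation,
}

struct CodeLocation {
    absolute_file_path: String,
    line_range: LineRange,
}

struct LineRange {
    start: u32,
    end: u32,
}
```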
## Questions
### Why separate out its own message history?
We want the review thread to match the training of our review models as
much as possible -- that means using a custom prompt, removing user
instructions, and starting a clean chat history.
We also want to make sure the review thread doesn't leak into the parent
thread.
### Why do this as a mode, vs. sub-agents?
1. We want review to be a synchronous task, so it's fine for now to do a
bespoke implementation.
2. We're still unclear about the final structure for sub-agents. We'd
prefer to land this quickly and then refactor into sub-agents without
rushing that implementation.
When item IDs are sent to the Responses API, it loads them from the
database and ignores the provided values. This adds extra latency.
Not having a mode that stores requests also allows us to simplify the
code.
## Breaking change
The `disable_response_storage` configuration option is removed.