codex

mirror of https://github.com/openai/codex.git synced 2026-06-01 19:02:59 +00:00

Author	SHA1	Message	Date
jif-oai	623707ab58	feat: add wait tool implementation for collab (#9088 ) Add implementation for the `wait` tool. For this we consider every status other than `PendingInit` and `Running` as terminal. The `wait` tool call will return either after a given timeout or when the tool reaches a terminal status. A few points to note: * A channel is preferred to prevent some races (just looping on `get_status()` could "miss" a terminal status) * The order of operations is very important: we need to first subscribe and then check the last known status to prevent race conditions * If the channel gets dropped, we return an error on purpose	2026-01-12 12:16:24 +00:00
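The subscribe-then-check ordering described in the `wait` commit above can be sketched with the standard library alone. This is a minimal illustration, not the code from the PR: `Agent`, `Inner`, `wait`, and the `Completed`/`Errored` variants are hypothetical names, and a plain `std::sync::mpsc` channel stands in for whatever channel type the real implementation uses.

```rust
use std::sync::mpsc::{channel, Receiver, RecvTimeoutError, Sender};
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Only `PendingInit` and `Running` come from the commit message;
/// the other variants are assumptions for illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Status {
    PendingInit,
    Running,
    Completed,
    Errored,
}

impl Status {
    /// Every status other than `PendingInit` and `Running` is terminal.
    pub fn is_terminal(self) -> bool {
        !matches!(self, Status::PendingInit | Status::Running)
    }
}

struct Inner {
    last: Status,
    subscribers: Vec<Sender<Status>>,
}

/// Hypothetical status holder that broadcasts every change to subscribers.
pub struct Agent {
    inner: Mutex<Inner>,
}

impl Agent {
    pub fn new() -> Self {
        Agent {
            inner: Mutex::new(Inner {
                last: Status::PendingInit,
                subscribers: Vec::new(),
            }),
        }
    }

    /// Record the new status and notify subscribers, dropping closed ones.
    pub fn set_status(&self, status: Status) {
        let mut inner = self.inner.lock().unwrap();
        inner.last = status;
        inner.subscribers.retain(|tx| tx.send(status).is_ok());
    }

    pub fn subscribe(&self) -> Receiver<Status> {
        let (tx, rx) = channel();
        self.inner.lock().unwrap().subscribers.push(tx);
        rx
    }

    pub fn get_status(&self) -> Status {
        self.inner.lock().unwrap().last
    }
}

/// Returns `Ok(Some(status))` on a terminal status, `Ok(None)` on timeout,
/// and `Err` if the status channel is dropped.
pub fn wait(agent: &Agent, timeout: Duration) -> Result<Option<Status>, String> {
    // 1. Subscribe first, so no update sent after this point can be missed.
    let rx = agent.subscribe();
    // 2. Only then check the last known status.
    let current = agent.get_status();
    if current.is_terminal() {
        return Ok(Some(current));
    }
    let deadline = Instant::now() + timeout;
    loop {
        let remaining = deadline.saturating_duration_since(Instant::now());
        match rx.recv_timeout(remaining) {
            Ok(status) if status.is_terminal() => return Ok(Some(status)),
            Ok(_) => continue, // non-terminal update, keep waiting
            Err(RecvTimeoutError::Timeout) => return Ok(None),
            Err(RecvTimeoutError::Disconnected) => {
                return Err("status channel dropped".to_string())
            }
        }
    }
}
```

If the two steps were swapped, a terminal status set between the check and the subscription would never reach the waiter, which is the race the commit message warns about.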
jif-oai	86f81ca010	feat: testing harness for collab 1 (#8983 )	2026-01-12 11:17:05 +00:00
charley-oai	6709ad8975	Label attached images so agent can understand in-message labels (#8950 ) Agent wouldn't "see" attached images and would instead try to use the view_file tool: <img width="1516" height="504" alt="image" src="https://github.com/user-attachments/assets/68a705bb-f962-4fc1-9087-e932a6859b12" /> In this PR, we wrap image content items in XML tags with the name of each image (now just a numbered name like `[Image #1]`), so that the model can understand inline image references (based on name). We also put the image content items above the user message which the model seems to prefer (maybe it's more used to definitions being before references). We also tweak the view_file tool description which seemed to help a bit Results on a simple eval set of images: Before <img width="980" height="310" alt="image" src="https://github.com/user-attachments/assets/ba838651-2565-4684-a12e-81a36641bf86" /> After <img width="918" height="322" alt="image" src="https://github.com/user-attachments/assets/10a81951-7ee6-415e-a27e-e7a3fd0aee6f" /> ```json [ { "id": "single_describe", "prompt": "Describe the attached image in one sentence.", "images": ["image_a.png"] }, { "id": "single_color", "prompt": "What is the dominant color in the image? Answer with a single color word.", "images": ["image_b.png"] }, { "id": "orientation_check", "prompt": "Is the image portrait or landscape? Answer in one sentence.", "images": ["image_c.png"] }, { "id": "detail_request", "prompt": "Look closely at the image and call out any small details you notice.", "images": ["image_d.png"] }, { "id": "two_images_compare", "prompt": "I attached two images. Are they the same or different? Briefly explain.", "images": ["image_a.png", "image_b.png"] }, { "id": "two_images_captions", "prompt": "Provide a short caption for each image (Image 1, Image 2).", "images": ["image_c.png", "image_d.png"] }, { "id": "multi_image_rank", "prompt": "Rank the attached images from most colorful to least colorful.", "images": ["image_a.png", "image_b.png", "image_c.png"] }, { "id": "multi_image_choice", "prompt": "Which image looks more vibrant? Answer with 'Image 1' or 'Image 2'.", "images": ["image_b.png", "image_d.png"] } ] ```	2026-01-09 21:33:45 -08:00
jif-oai	e2e3f4490e	chore: add approval metric (#8970 )	2026-01-09 13:10:31 +00:00
jif-oai	e9c548c65e	chore: non mutable btree when building specs (#8969 )	2026-01-09 12:21:55 +00:00
jif-oai	568b938c80	feat: first pass on clb tool (#8930 )	2026-01-09 11:54:05 +00:00
Thibault Sottiaux	51dd5af807	fix: treat null MCP resource args as empty (#8917 ) Handle null tool arguments in the MCP resource handler so optional resource tools accept null without failing, preserving normal JSON parsing for non-null payloads and improving robustness when models emit null; this avoids spurious argument parse errors for list/read MCP resource calls.	2026-01-08 17:47:02 -08:00
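The null-handling fix above can be illustrated without any JSON library. The helper below is hypothetical (`normalize_arguments` is not a name from the PR) and assumes the raw arguments arrive as a string: a literal `null` payload is mapped to an empty object before normal parsing, and everything else passes through untouched.

```rust
/// Hypothetical helper: treat a literal JSON `null` arguments payload as an
/// empty object, and leave every other payload for the normal JSON parser.
pub fn normalize_arguments(raw: &str) -> &str {
    if raw.trim() == "null" {
        "{}"
    } else {
        raw
    }
}
```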
jif-oai	c9c6560685	nit: parse_arguments (#8927 )	2026-01-08 19:49:17 +00:00
jif-oai	da667b1f56	chore: drop useless interaction_input (#8907 )	2026-01-08 15:01:07 +00:00
Thibault Sottiaux	267c05fb30	fix: stabilize list_dir pagination order (#8826 ) Sort list_dir entries before applying offset/limit so pagination matches the displayed order, update pagination/truncation expectations, and add coverage for sorted pagination. This ensures stable, predictable directory pages when list_dir is enabled.	2026-01-08 03:51:47 -08:00
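The ordering fix above comes down to sorting before slicing. A minimal sketch, assuming a zero-based `offset` and plain string entries (the real `list_dir` tool may index and represent entries differently; `list_page` is a hypothetical name):

```rust
/// Hypothetical pagination helper: sort first, then apply offset/limit,
/// so each page is a contiguous slice of the displayed order.
pub fn list_page(mut entries: Vec<String>, offset: usize, limit: usize) -> Vec<String> {
    entries.sort();
    entries.into_iter().skip(offset).take(limit).collect()
}
```

Applying `skip`/`take` before sorting would instead page over the file system's arbitrary enumeration order, so consecutive pages could repeat or omit entries relative to what is displayed.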
jif-oai	634650dd25	feat: metrics capabilities (#8318 ) Add metrics capabilities to Codex. The `README.md` is up to date. This will not be merged with the metrics before this PR of course: https://github.com/openai/codex/pull/8350	2026-01-08 11:47:36 +00:00
Owen Lin	66450f0445	fix: implement 'Allow this session' for apply_patch approvals (#8451 ) Summary This PR makes “ApprovalDecision::AcceptForSession / don’t ask again this session” actually work for `apply_patch` approvals by caching approvals based on absolute file paths in codex-core, properly wiring it through app-server v2, and exposing the choice in both TUI and TUI2. - This brings `apply_patch` calls to be at feature-parity with general shell commands, which also have a "Yes, and don't ask again" option. - This also fixes VSCE's "Allow this session" button to actually work. While we're at it, also split the app-server v2 protocol's `ApprovalDecision` enum so execpolicy amendments are only available for command execution approvals. Key changes - Core: per-session patch approval allowlist keyed by absolute file paths - Handles multi-file patches and renames/moves by recording both source and destination paths for `Update { move_path: Some(...) }`. - Extend the `Approvable` trait and `ApplyPatchRuntime` to work with multiple keys, because an `apply_patch` tool call can modify multiple files. For a request to be auto-approved, we will need to check that all file paths have been approved previously. - App-server v2: honor AcceptForSession for file changes - File-change approval responses now map AcceptForSession to ReviewDecision::ApprovedForSession (no longer downgraded to plain Approved). - Replace `ApprovalDecision` with two enums: `CommandExecutionApprovalDecision` and `FileChangeApprovalDecision` - TUI / TUI2: expose “don’t ask again for these files this session” - Patch approval overlays now include a third option (“Yes, and don’t ask again for these files this session (s)”). - Snapshot updates for the approval modal. 
Tests added/updated - Core: - Integration test that proves ApprovedForSession on a patch skips the next patch prompt for the same file - App-server: - v2 integration test verifying FileChangeApprovalDecision::AcceptForSession works properly User-visible behavior - When the user approves a patch “for session”, future patches touching only those previously approved file(s) will no longer prompt again during that session (both via app-server v2 and TUI/TUI2). Manual testing Tested both TUI and TUI2 - see screenshots below. TUI: <img width="1082" height="355" alt="image" src="https://github.com/user-attachments/assets/adcf45ad-d428-498d-92fc-1a0a420878d9" /> TUI2: <img width="1089" height="438" alt="image" src="https://github.com/user-attachments/assets/dd768b1a-2f5f-4bd6-98fd-e52c1d3abd9e" />	2026-01-07 20:11:12 +00:00
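The "all file paths must have been approved previously" rule from the `apply_patch` commit above can be sketched as a set lookup. `SessionPatchApprovals` and its methods are hypothetical names, not the `Approvable` trait or `ApplyPatchRuntime` from the PR, and the sketch assumes the caller has already resolved paths to absolute form (including both source and destination for a move).

```rust
use std::collections::HashSet;
use std::path::PathBuf;

/// Hypothetical per-session allowlist keyed by absolute file paths.
#[derive(Default)]
pub struct SessionPatchApprovals {
    approved: HashSet<PathBuf>,
}

impl SessionPatchApprovals {
    /// Record every path touched by a patch the user approved "for session".
    pub fn approve_for_session<I: IntoIterator<Item = PathBuf>>(&mut self, paths: I) {
        self.approved.extend(paths);
    }

    /// A patch is auto-approved only if every path it touches was approved
    /// earlier; an empty path list is never auto-approved in this sketch.
    pub fn is_auto_approved(&self, paths: &[PathBuf]) -> bool {
        !paths.is_empty() && paths.iter().all(|p| self.approved.contains(p))
    }
}
```

A patch that touches one approved file and one new file still prompts, which matches the "touching only those previously approved file(s)" wording in the commit message.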
jif-oai	1253d19641	chore: drop useless feature flags (#8850 )	2026-01-07 19:54:32 +00:00
Ahmed Ibrahim	9179c9deac	Merge Modelfamily into modelinfo (#8763 ) - Merge ModelFamily into ModelInfo - Remove logic for adding instructions to apply patch - Add compaction limit and visible context window to `ModelInfo`	2026-01-07 10:35:09 -08:00
jif-oai	116059c3a0	chore: unify conversation with thread name (#8830 ) Done and verified by Codex + refactor feature of RustRover	2026-01-07 17:04:53 +00:00
jif-oai	4cef89a122	chore: rename unified exec sessions (#8822 ) Renaming done by Codex	2026-01-07 16:12:47 +00:00
sayan-oai	54ded1a3c0	add web_search_cached flag (#8795 ) Add `web_search_cached` feature to config. Enables `web_search` tool with access only to cached/indexed results (see [docs](https://platform.openai.com/docs/guides/tools-web-search#live-internet-access)). This takes precedence over the existing `web_search_request`, which continues to enable `web_search` over live results as it did before. `web_search_cached` is disabled for review mode, as `web_search_request` is.	2026-01-06 14:53:59 -08:00
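The precedence described above (cached over live, both off in review mode) fits in one small function. This is a sketch under assumed names: `WebSearchMode` and `resolve_web_search` are hypothetical, and the three booleans stand in for however the real config exposes the `web_search_cached` and `web_search_request` features.

```rust
/// Hypothetical result of resolving the two feature flags.
#[derive(Debug, PartialEq, Eq)]
pub enum WebSearchMode {
    Disabled,
    Cached,
    Live,
}

/// `cached` and `request` mirror the `web_search_cached` and
/// `web_search_request` flags; `review_mode` disables both.
pub fn resolve_web_search(cached: bool, request: bool, review_mode: bool) -> WebSearchMode {
    if review_mode {
        return WebSearchMode::Disabled;
    }
    if cached {
        // `web_search_cached` takes precedence over `web_search_request`.
        WebSearchMode::Cached
    } else if request {
        WebSearchMode::Live
    } else {
        WebSearchMode::Disabled
    }
}
```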
jif-oai	8858012fd1	chore: emit unified exec begin only when PTY exist (#8780 )	2026-01-06 13:12:54 +00:00
pakrym-oai	96fdbdd434	Add ExecPolicyManager (#8349 ) Move exec policy management into services to keep turn context immutable.	2025-12-22 09:59:32 -08:00
Dylan Hurd	33e1d0844a	feat(windows) start powershell in utf-8 mode (#7902 ) ## Summary Adds a FeatureFlag to enforce UTF8 encoding in powershell, particularly Windows Powershell v5. This should help address issues like #7290. Notably, this PR does not include the ability to parse `apply_patch` invocations within UTF8 shell commands (calls to the freeform tool should not be impacted). I am leaving this out of scope for now. We should address before this feature becomes Stable, but those cases are not the default behavior at this time so we're okay for experimentation phase. We should continue cleaning up the `apply_patch::invocation` logic and then can handle it more cleanly. ## Testing - [x] Adds additional testing	2025-12-22 09:36:44 -08:00
Ahmed Ibrahim	f0dc6fd3c7	Rename OpenAI models to models manager (#8346 )	2025-12-19 16:20:05 -08:00
Anton Panasenko	3429de21b3	feat: introduce ExternalSandbox policy (#8290 ) ## Description Introduced the `ExternalSandbox` policy to cover the use case where the sandbox is defined by the outside environment. Effectively, it translates to `SandboxMode#DangerFullAccess` for the file system (since the sandbox is configured at the container level) and a configurable `network_access` (either Restricted or Enabled by the outside environment). For example, you can configure the `ExternalSandbox` policy as part of the `sendUserTurn` v1 app_server API: ``` { "conversationId": <id>, "cwd": <cwd>, "approvalPolicy": "never", "sandboxPolicy": { "type": "external-sandbox", "network_access": "enabled"/"restricted" }, "model": <model>, "effort": <effort>, .... } ```	2025-12-18 17:02:03 -08:00
Michael Bolin	3d4ced3ff5	chore: migrate from Config::load_from_base_config_with_overrides to ConfigBuilder (#8276 ) https://github.com/openai/codex/pull/8235 introduced `ConfigBuilder` and this PR updates all call non-test call sites to use it instead of `Config::load_from_base_config_with_overrides()`. This is important because `load_from_base_config_with_overrides()` uses an empty `ConfigRequirements`, which is a reasonable default for testing so the tests are not influenced by the settings on the host. This method is now guarded by `#[cfg(test)]` so it cannot be used by business logic. Because `ConfigBuilder::build()` is `async`, many of the test methods had to be migrated to be `async`, as well. On the bright side, this made it possible to eliminate a bunch of `block_on_future()` stuff.	2025-12-18 16:12:52 -08:00
jif-oai	45c164a982	nit: doc (#8186 )	2025-12-17 15:29:29 +00:00
jif-oai	2e7e4f6ea6	nit: drop dead branch with `unified_exec` tool (#8182 )	2025-12-17 13:55:13 +00:00
jif-oai	813bdb9010	feat: fallback unified_exec to shell_command (#8075 )	2025-12-17 10:29:45 +00:00
jif-oai	d7482510b1	nit: trace span for regular task (#8053 ) Logs are too spammy --------- Co-authored-by: Anton Panasenko <apanasenko@openai.com>	2025-12-16 16:53:15 +00:00
Dylan Hurd	b9d1a087ee	chore(shell_command) fix freeform timeout output (#7791 ) ## Summary Adding an additional integration test for timeout_ms ## Testing - [x] these are tests	2025-12-15 19:26:39 -08:00
Ahmed Ibrahim	d802b18716	fix parallel tool calls (#7956 )	2025-12-16 01:28:27 +00:00
Anton Panasenko	ad7b9d63c3	[codex] add otel tracing (#7844 )	2025-12-12 17:07:17 -08:00
Michael Bolin	9009490357	fix: use PowerShell to parse PowerShell (#7607 ) Previous to this PR, we used a hand-rolled PowerShell parser in `windows_safe_commands.rs` to take a `&str` of PowerShell script and see if it is equivalent to a list of `execvp(3)` invocations, and if so, we then test each using `is_safe_powershell_command()` to determine if the overall command is safe: `6e6338aa87/codex-rs/core/src/command_safety/windows_safe_commands.rs (L89-L98)` Unfortunately, our PowerShell parser did not recognize `@(...)` as a special construct, so it was treated as an ordinary token. This meant that the following would erroneously be considered "safe:" ```powershell ls @(calc.exe) ``` The fix introduced in this PR is to do something comparable to what we do for Bash/Zsh, which is to use a "proper" parser to derive the list of `execvp(3)` calls. For Bash/Zsh, we rely on https://crates.io/crates/tree-sitter-bash, but there does not appear to be a crate of comparable quality for parsing PowerShell statically (https://github.com/airbus-cert/tree-sitter-powershell/ is the best thing I found). Instead, in this PR, we use a PowerShell script to parse the input PowerShell program to produce the AST.	2025-12-12 13:06:49 -08:00
Ahmed Ibrahim	b7fa7ca8e9	Update Model Info (#7853 )	2025-12-11 14:06:07 -08:00
Ahmed Ibrahim	b9fb3b81e5	Chore: limit find family visibility (#7891 ) a little bit more code quality of life	2025-12-11 13:30:56 -08:00
jif-oai	29381ba5c2	feat: add shell snapshot for shell command (#7786 )	2025-12-11 13:46:43 +00:00
Eric Traut	c4af707e09	Removed experimental "command risk assessment" feature (#7799 ) This experimental feature received lukewarm reception during internal testing. Removing from the code base.	2025-12-10 09:48:11 -08:00
zhao-oai	e0fb3ca1db	refactoring with_escalated_permissions to use SandboxPermissions instead (#7750 ) helpful in the future if we want more granularity for requesting escalated permissions: e.g when running in readonly sandbox, model can request to escalate to a sandbox that allows writes	2025-12-10 17:18:48 +00:00
jif-oai	0ad54982ae	chore: rework unified exec events (#7775 )	2025-12-10 10:30:38 +00:00
jif-oai	7836aeddae	feat: shell snapshotting (#7641 )	2025-12-09 18:36:58 +00:00
pakrym-oai	ac5fa6baf8	Do not emit start/end events for write stdin (#7561 )	2025-12-08 15:23:02 -08:00
jif-oai	da983c1761	feat: add is-mutating detection for shell command handler (#7729 )	2025-12-08 18:42:09 +00:00
zhao-oai	c2bdee0946	proposing execpolicy amendment when prompting due to sandbox denial (#7653 ) Currently, we only show the “don’t ask again for commands that start with…” option when a command is immediately flagged as needing approval. However, there is another case where we ask for approval: When a command is initially auto-approved to run within sandbox, but it fails to run inside sandbox, we would like to attempt to retry running outside of sandbox. This will require a prompt to the user. This PR addresses this latter case	2025-12-08 17:55:20 +00:00
Eric Traut	acb8ed493f	Fixed regression for chat endpoint; missing tools name caused litellm proxy to crash (#7724 ) This PR addresses https://github.com/openai/codex/issues/7051	2025-12-08 00:49:51 -08:00
Dylan Hurd	a8cbbdbc6e	feat(core) Add login to shell_command tool (#6846 ) ## Summary Adds the `login` parameter to the `shell_command` tool - optional, defaults to true. ## Testing - [x] Tested locally	2025-12-05 11:03:25 -08:00
Ahmed Ibrahim	7b359c9c8e	Call models endpoint in models manager (#7616 ) - Introduce `with_remote_overrides` and update `refresh_available_models` - Put `auth_manager` instead of `auth_mode` on `models_manager` - Remove `ShellType` and `ReasoningLevel` to use already existing structs	2025-12-04 18:28:03 -08:00
Ahmed Ibrahim	6e6338aa87	Inline response recording and remove process_items indirection (#7310 ) - Inline response recording during streaming: `run_turn` now records items as they arrive instead of building a `ProcessedResponseItem` list and post‑processing via `process_items`. - Simplify turn handling: `handle_output_item_done` returns the follow‑up signal + optional tool future; `needs_follow_up` is set only there, and in‑flight tool futures are drained once at the end (errors logged, no extra state writes). - Flattened stream loop: removed `process_items` indirection and the extra output queue. - Tests: relaxed `tool_parallelism::tool_results_grouped` to allow any completion order while still requiring matching call/output IDs.	2025-12-04 12:17:54 -08:00
jif-oai	36edb412b1	fix: release session ID when not used (#7592 )	2025-12-04 17:42:16 +00:00
zhao-oai	3d35cb4619	Refactor execpolicy fallback evaluation (#7544 ) ## Refactor of the `execpolicy` crate To illustrate why we need this refactor, consider an agent attempting to run `apple | rm -rf ./`. Suppose `apple` is allowed by `execpolicy`. Before this PR, `execpolicy` would consider `apple` and `pear` and only render one rule match: `Allow`. We would skip any heuristics checks on `rm -rf ./` and immediately approve `apple | rm -rf ./` to run. To fix this, we now thread a `fallback` evaluation function into `execpolicy` that runs when no `execpolicy` rules match a given command. In our example, we would run `fallback` on `rm -rf ./` and prevent `apple | rm -rf ./` from being run without approval.	2025-12-03 23:39:48 -08:00
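The fallback idea in the `execpolicy` commit above can be sketched as follows: each command in the pipeline is checked against the rules, any command with no matching rule goes through `fallback`, and the most restrictive outcome wins. `Decision`, `evaluate_pipeline`, and the closure-based rule lookup are hypothetical and are not the `execpolicy` crate's API; "most restrictive wins" is an assumption about how the per-command results are combined.

```rust
/// Hypothetical decision type; variants are ordered from least to most
/// restrictive so that `max()` picks the strictest one.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Decision {
    Allow,
    Prompt,
    Forbidden,
}

/// Evaluate every command of a pipeline. `match_rule` returns `None` when no
/// policy rule matches, in which case `fallback` (the heuristics) decides.
pub fn evaluate_pipeline(
    commands: &[&str],
    match_rule: impl Fn(&str) -> Option<Decision>,
    fallback: impl Fn(&str) -> Decision,
) -> Decision {
    commands
        .iter()
        .map(|&cmd| match_rule(cmd).unwrap_or_else(|| fallback(cmd)))
        .max()
        // An empty pipeline is treated conservatively in this sketch.
        .unwrap_or(Decision::Prompt)
}
```

With a rule that allows `apple` and a fallback that prompts on `rm`, the pipeline `apple | rm -rf ./` evaluates to `Prompt` rather than `Allow`, which is the behavior the commit describes.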
zhao-oai	e925a380dc	whitelist command prefix integration in core and tui (#7033 ) this PR enables TUI to approve commands and add their prefixes to an allowlist: <img width="708" height="605" alt="Screenshot 2025-11-21 at 4 18 07 PM" src="https://github.com/user-attachments/assets/56a19893-4553-4770-a881-becf79eeda32" /> note: we only show the option to whitelist the command when 1) command is not multi-part (e.g `git add -A && git commit -m 'hello world'`) 2) command is not already matched by an existing rule	2025-12-03 23:17:02 -08:00
Ahmed Ibrahim	cee37a32b2	Migrate model family to models manager (#7565 ) This PR moves `ModelsFamily` to `openai_models`. It also propagates `ModelsManager` to session services and uses it to drive the model family. We also make `derive_default_model_family` private because it's a step towards what we want: one place that gives model configuration. This is a second step towards having one source of truth for model information and config: `ModelsManager`. Next steps would be to remove `ModelsFamily` from config. That's a massive change because it's used in 41 places, mostly before launching `codex`. Also, we need to make `find_family_for_model` private. That's also a big change because it's used in 21 places, nearly all of them tests.	2025-12-03 18:49:47 -08:00
jif-oai	42ae738f67	feat: model warning in case of apply patch (#7494 )	2025-12-03 09:07:31 +00:00


776 Commits