codex

mirror of https://github.com/openai/codex.git synced 2026-04-30 09:26:44 +00:00

Author	SHA1	Message	Date
Ahmed Ibrahim	b519267d05	Account for encrypted reasoning for auto compaction (#7113 ) - The total token used returned from the api doesn't account for the reasoning items before the assistant message - Account for those for auto compaction - Add the encrypted reasoning effort in the common tests utils - Add a test to make sure it works as expected	2025-11-22 03:06:45 +00:00
Michael Bolin	67975ed33a	refactor: inline sandbox type lookup in process_exec_tool_call (#7122 ) `process_exec_tool_call()` was taking `SandboxType` as a param, but in practice, the only place it was constructed was in `codex_message_processor.rs` where it was derived from the other `sandbox_policy` param, so this PR inlines the logic that decides the `SandboxType` into `process_exec_tool_call()`. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/7122). * #7112 * __->__ #7122	2025-11-21 22:53:05 +00:00
pakrym-oai	e52cc38dfd	Use use_model (#7121 )	2025-11-21 22:10:52 +00:00
Ahmed Ibrahim	d5f661c91d	enable unified exec for experiments (#7118 )	2025-11-21 13:10:01 -08:00
jif-oai	bce030ddb5	Revert "fix: read `max_output_tokens` param from config" (#7088 ) Reverts openai/codex#4139	2025-11-21 11:40:02 +01:00
Yorling	c9e149fd5c	fix: read `max_output_tokens` param from config (#4139 ) Request param `max_output_tokens` is documented in `https://github.com/openai/codex/blob/main/docs/config.md`, but nowhere uses the item in config, this commit read it from config for GPT responses API. see https://github.com/openai/codex/issues/4138 for issue report. Signed-off-by: Yorling <shallowcloud@yeah.net>	2025-11-20 22:46:34 -08:00
Eric Traut	bacdc004be	Fixed two tests that can fail in some environments that have global git rewrite rules (#7068 ) This fixes https://github.com/openai/codex/issues/7044	2025-11-20 22:45:40 -08:00
pakrym-oai	ab5972d447	Support all types of search actions (#7061 ) Fixes the ``` { "error": { "message": "Invalid value: 'other'. Supported values are: 'search', 'open_page', and 'find_in_page'.", "type": "invalid_request_error", "param": "input[150].action.type", "code": "invalid_value" } ``` error. The actual-actual fix here is supporting absent `query` parameter.	2025-11-20 20:45:28 -08:00
pakrym-oai	767b66f407	Migrate coverage to shell_command (#7042 )	2025-11-21 03:44:00 +00:00
Ahmed Ibrahim	1388e99674	fix flaky `tool_call_output_exceeds_limit_truncated_chars_limit` (#7043 ) I am suspecting this is flaky because of the wall time can become 0, 0.1, or 1.	2025-11-20 16:36:29 -08:00
Michael Bolin	f56d1dc8fc	feat: update process_exec_tool_call() to take a cancellation token (#6972 ) This updates `ExecParams` so that instead of taking `timeout_ms: Option<u64>`, it now takes a more general cancellation mechanism, `ExecExpiration`, which is an enum that includes a `Cancellation(tokio_util::sync::CancellationToken)` variant. If the cancellation token is fired, then `process_exec_tool_call()` returns in the same way as if a timeout was exceeded. This is necessary so that in #6973, we can manage the timeout logic external to the `process_exec_tool_call()` because we want to "suspend" the timeout when an elicitation from a human user is pending. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/6972). * #7005 * #6973 * __->__ #6972	2025-11-20 16:29:57 -08:00
Ahmed Ibrahim	9be310041b	migrate `collect_tool_identifiers_for_model` to `test_codex` (#7041 ) Maybe it solved flakiness	2025-11-20 16:02:50 -08:00
Xiao-Yong Jin	0fbcdd77c8	core: make shell behavior portable on FreeBSD (#7039 ) - Use /bin/sh instead of /bin/bash on FreeBSD/OpenBSD in the process group timeout test to avoid command-not-found failures. - Accept /usr/local/bin/bash as a valid SHELL path to match common FreeBSD installations. - Switch the shell serialization duration test to /bin/sh for improved portability across Unix platforms. With this change, `cargo test -p codex-core --lib` runs and passes on FreeBSD.	2025-11-20 16:01:35 -08:00
Ahmed Ibrahim	54ee302a06	Attempt to fix `unified_exec_formats_large_output_summary` flakiness (#7029 ) second attempt to fix this test after https://github.com/openai/codex/pull/6884. I think this flakiness is happening because yield_time is too small for a 10,000 step loop in python.	2025-11-20 14:38:04 -08:00
pakrym-oai	856f97f449	Delete shell_command feature (#7024 )	2025-11-20 14:14:56 -08:00
pakrym-oai	52d0ec4cd8	Delete tiktoken-rs (#7018 )	2025-11-20 11:15:04 -08:00
LIHUA	397279d46e	Fix: Improve text encoding for shell output in VSCode preview (#6178 ) (#6182 ) ## 🐛 Problem Users running commands with non-ASCII characters (like Russian text "пример") in Windows/WSL environments experience garbled text in VSCode's shell preview window, with Unicode replacement characters (�) appearing instead of the actual text. Issue: https://github.com/openai/codex/issues/6178 ## 🔧 Root Cause The issue was in `StreamOutput<Vec<u8>>::from_utf8_lossy()` method in `codex-rs/core/src/exec.rs`, which used `String::from_utf8_lossy()` to convert shell output bytes to strings. This function immediately replaces any invalid UTF-8 byte sequences with replacement characters, without attempting to decode using other common encodings. In Windows/WSL environments, shell output often uses encodings like: - Windows-1252 (common Windows encoding) - Latin-1/ISO-8859-1 (extended ASCII) ## 🛠️ Solution Replaced the simple `String::from_utf8_lossy()` call with intelligent encoding detection via a new `bytes_to_string_smart()` function that tries multiple encoding strategies: 1. UTF-8 (fast path for valid UTF-8) 2. Windows-1252 (handles Windows-specific characters in 0x80-0x9F range) 3. Latin-1 (fallback for extended ASCII) 4. Lossy UTF-8 (final fallback, same as before) ## 📁 Changes ### New Files - `codex-rs/core/src/text_encoding.rs` - Smart encoding detection module - `codex-rs/core/tests/suite/text_encoding_fix.rs` - Integration tests ### Modified Files - `codex-rs/core/src/lib.rs` - Added text_encoding module - `codex-rs/core/src/exec.rs` - Updated StreamOutput::from_utf8_lossy() - `codex-rs/core/tests/suite/mod.rs` - Registered new test module ## ✅ Testing - 5 unit tests covering UTF-8, Windows-1252, Latin-1, and fallback scenarios - 2 integration tests simulating the exact Issue #6178 scenario - Demonstrates improvement over the previous `String::from_utf8_lossy()` approach All tests pass: ```bash cargo test -p codex-core text_encoding cargo test -p codex-core test_shell_output_encoding_issue_6178 ``` ## 🎯 Impact - ✅ Eliminates garbled text in VSCode shell preview for non-ASCII content - ✅ Supports Windows/WSL environments with proper encoding detection - ✅ Zero performance impact for UTF-8 text (fast path) - ✅ Backward compatible - UTF-8 content works exactly as before - ✅ Handles edge cases with robust fallback mechanism ## 🧪 Test Scenarios The fix has been tested with: - Russian text ("пример") - Windows-1252 quotation marks (""test") - Latin-1 accented characters ("café") - Mixed encoding content - Invalid byte sequences (graceful fallback) ## 📋 Checklist - [X] Addresses the reported issue - [X] Includes comprehensive tests - [X] Maintains backward compatibility - [X] Follows project coding conventions - [X] No breaking changes --------- Co-authored-by: Josh McKinney <joshka@openai.com>	2025-11-20 11:04:11 -08:00
pakrym-oai	30ca89424c	Always fallback to real shell (#6953 ) Either cmd.exe or `/bin/sh`.	2025-11-20 10:58:46 -08:00
hanson-openai	b5dd189067	Allow unified_exec to early exit (if the process terminates before yield_time_ms) (#6867 ) Thread through an `exit_notify` tokio `Notify` through to the `UnifiedExecSession` so that we can return early if the command terminates before `yield_time_ms`. As Codex review correctly pointed out below 🙌 we also need a `exit_signaled` flag so that commands which finish before we start waiting can also exit early. Since the default `yield_time_ms` is now 10s, this means that we don't have to wait 10s for trivial commands like ls, sed, etc (which are the majority of agent commands 😅) --------- Co-authored-by: jif-oai <jif@openai.com>	2025-11-20 13:34:41 +01:00
zhao-oai	65c13f1ae7	execpolicy2 core integration (#6641 ) This PR threads execpolicy2 into codex-core. activated via feature flag: exec_policy (on by default) reads and parses all .codexpolicy files in `codex_home/codex` refactored tool runtime API to integrate execpolicy logic --------- Co-authored-by: Michael Bolin <mbolin@openai.com>	2025-11-19 16:50:43 -08:00
zhao-oai	72af589398	storing credits (#6858 ) Expand the rate-limit cache/TUI: store credit snapshots alongside primary and secondary windows, render “Credits” when the backend reports they exist (unlimited vs rounded integer balances)	2025-11-19 10:49:35 -08:00
Ahmed Ibrahim	d62cab9a06	fix: don't truncate at new lines (#6907 )	2025-11-19 17:05:48 +00:00
Ahmed Ibrahim	d5dfba2509	feat: arcticfox in the wild (#6906 ) --------- Co-authored-by: jif-oai <jif@openai.com>	2025-11-19 16:31:06 +00:00
Dylan Hurd	15b5eb30ed	fix(core) Support changing /approvals before conversation (#6836 ) ## Summary Setting `/approvals` before the start of a conversation was not updating the environment_context for a conversation. Not sure exactly when this problem was introduced, but this should reduce model confusion dramatically. ## Testing - [x] Added unit test to reproduce bug, confirmed fix with update - [x] Tested locally	2025-11-19 11:32:48 +00:00
Ahmed Ibrahim	efebc62fb7	Move shell to use `truncate_text` (#6842 ) Move shell to use the configurable `truncate_text` --------- Co-authored-by: pakrym-oai <pakrym@openai.com>	2025-11-19 01:56:08 -08:00
pakrym-oai	75f38f16dd	Run remote auto compaction (#6879 )	2025-11-19 00:43:58 -08:00
Ahmed Ibrahim	0440a3f105	flaky-unified_exec_formats_large_output_summary (#6884 )	2025-11-19 00:00:37 -08:00
pakrym-oai	ee0484a98c	shell_command returns freeform output (#6860 ) Instead of returning structured out and then re-formatting it into freeform, return the freeform output from shell_command tool. Keep `shell` as the default tool for GPT-5.	2025-11-18 23:38:43 -08:00
Ahmed Ibrahim	793063070b	fix: typos in model picker (#6859 )	2025-11-19 06:29:02 +00:00
jif-oai	c56d0c159b	fix: local compaction (#6844 )	2025-11-18 22:18:10 +00:00
jif-oai	8ddae8cde3	feat: review in app server (#6613 )	2025-11-18 21:58:54 +00:00
Dylan Hurd	29ca89c414	chore(config) enable shell_command (#6843 ) ## Summary Enables shell_command as default for `gpt-5` and `codex-` models. ## Testing - [x] Updated unit tests	2025-11-18 12:46:02 -08:00
Ahmed Ibrahim	3de8790714	Add the utility to truncate by tokens (#6746 ) - This PR is to make it on path for truncating by tokens. This path will be initially used by unified exec and context manager (responsible for MCP calls mainly). - We are exposing new config `calls_output_max_tokens` - Use `tokens` as the main budget unit but truncate based on the model family by Introducing `TruncationPolicy`. - Introduce `truncate_text` as a router for truncation based on the mode. In next PRs: - remove truncate_with_line_bytes_budget - Add the ability to the model to override the token budget.	2025-11-18 11:36:23 -08:00
jif-oai	838531d3e4	feat: remote compaction (#6795 ) Co-authored-by: pakrym-oai <pakrym@openai.com>	2025-11-18 16:51:16 +00:00
Dylan Hurd	28ebe1c97a	fix(windows) shell_command on windows, minor parsing (#6811 ) ## Summary Enables shell_command for windows users, and starts adding some basic command parsing here, to at least remove powershell prefixes. We'll follow this up with command parsing but I wanted to land this change separately with some basic UX. NOTE: This implementation parses bash and powershell on both platforms. In theory this is possible, since you can use git bash on windows or powershell on linux. In practice, this may not be worth the complexity of supporting, so I don't feel strongly about the current approach vs. platform-specific branching. ## Testing - [x] Added a bunch of tests - [x] Ran on both windows and os x	2025-11-17 22:23:53 -08:00
Dylan Hurd	2b7378ac77	chore(core) Add shell_serialization coverage (#6810 ) ## Summary Similar to #6545, this PR updates the shell_serialization test suite to cover the various `shell` tool invocations we have. Note that this does not cover unified_exec, which has its own suite of tests. This should provide some test coverage for when we eventually consolidate serialization logic. ## Testing - [x] These are tests	2025-11-17 19:10:56 -08:00
Ahmed Ibrahim	ddcc60a085	Update defaults to gpt-5.1 (#6652 ) ## Summary - update documentation, example configs, and automation defaults to reference gpt-5.1 / gpt-5.1-codex - bump the CLI and core configuration defaults, model presets, and error messaging to the new models while keeping the model-family/tool coverage for legacy slugs - refresh tests, fixtures, and TUI snapshots so they expect the upgraded defaults ## Testing - `cargo test -p codex-core config::tests::test_precedence_fixture_with_gpt5_profile` ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_6916c5b3c2b08321ace04ee38604fc6b)	2025-11-17 17:40:11 -08:00
Dylan Hurd	497fb4a19c	fix(core) serialize shell_command (#6744 ) ## Summary Ensures we're serializing calls to `shell_command` ## Testing - [x] Added unit test	2025-11-16 23:16:51 -08:00
Ahmed Ibrahim	0b28e72b66	Improve compact (#6692 ) This PR does the following: - Add compact prefix to the summary - Change the compaction prompt - Allow multiple compaction for long running tasks - Filter out summary messages on the following compaction Considerations: - Filtering out the summary message isn't the most clean - Theoretically, we can end up in infinite compaction loop if the user messages > compaction limit . However, that's not possible in today's code because we have hard cap on user messages. - We need to address having multiple user messages because it confuses the model. Testing: - Making sure that after compact we always end up with one user message (task) and one summary, even on multiple compaction.	2025-11-15 07:17:51 +00:00
pakrym-oai	018a2d2e50	Ignore unified_exec_respects_workdir_override (#6693 )	2025-11-14 15:00:31 -08:00
pakrym-oai	cfcc87a953	Order outputs before inputs (#6691 ) For better caching performance all output items should be rendered in the order they were produced before all new input items (for example, all function_call before all function_call_output).	2025-11-14 14:54:11 -08:00
jif-oai	63c8c01f40	feat: better UI for unified_exec (#6515 )	2025-11-14 16:31:12 +01:00
pakrym-oai	6c384eb9c6	tests: replace mount_sse_once_match with mount_sse_once for SSE mocking (#6640 )	2025-11-13 18:04:05 -08:00
Ahmed Ibrahim	2a6e9b20df	Promote shared helpers for suite tests (#6460 ) ## Summary - add `TestCodex::submit_turn_with_policies` and extend the response helpers with reusable tool-call utilities - update the grep_files, read_file, list_dir, shell_serialization, and tools suites to rely on the shared helpers instead of local copies - make the list_dir helper return `anyhow::Result` so clippy no longer warns about `expect` ## Testing - `just fix -p codex-core` - `cargo test -p codex-core --test all suite::grep_files::grep_files_tool_collects_matches` - `cargo test -p codex-core suite::grep_files::grep_files_tool_collects_matches -- --ignored` (filter requests ignored tests so nothing runs, but the build stays clean) ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_69112d53abac83219813cab4d7cb6446)	2025-11-13 17:12:10 -08:00
Ahmed Ibrahim	9890ceb939	Avoid double truncation (#6631 ) 1. Avoid double truncation by giving 10% above the tool default constant 2. Add tests that fails when const = 1	2025-11-13 16:59:31 -08:00
pakrym-oai	7b027e7536	Revert "Revert "Overhaul shell detection and centralize command generation for unified exec"" (#6607 ) Reverts openai/codex#6606	2025-11-13 16:45:17 -08:00
Dylan Hurd	2c1b693da4	chore(core) Consolidate apply_patch tests (#6545 ) ## Summary Consolidates our apply_patch tests into one suite, and ensures each test case tests the various ways the harness supports apply_patch: 1. Freeform custom tool call 2. JSON function tool 3. Simple shell call 4. Heredoc shell call There are a few test cases that are specific to a particular variant, I've left those alone. ## Testing - [x] This adds a significant number of tests	2025-11-13 15:52:39 -08:00
pakrym-oai	041d6ad902	Migrate prompt caching tests to test_codex (#6605 ) To hopefully fix the flakiness	2025-11-13 09:19:38 -08:00
pakrym-oai	e6995174c1	Revert "Overhaul shell detection and centralize command generation for unified exec" (#6606 ) Reverts openai/codex#6577	2025-11-13 08:43:00 -08:00
pakrym-oai	d28e912214	Overhaul shell detection and centralize command generation for unified exec (#6577 ) This fixes command display for unified exec. All `cd`s and `ls`es are now parsed. <img width="452" height="237" alt="image" src="https://github.com/user-attachments/assets/ce92d81f-f74c-485a-9b34-1eaa29290ec6" /> Deletes a ton of tests that were doing nothing from shell.rs. --------- Co-authored-by: Pavel Krymets <pavel@krymets.com>	2025-11-13 08:28:09 -08:00
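
Note on commit 397279d46e above: its message describes a decoding fallback order for shell output (UTF-8, then Windows-1252, then Latin-1, then lossy UTF-8). The following is a minimal Rust sketch of that order, not the code in `codex-rs/core/src/text_encoding.rs`. The names `decode_shell_output`, `decode_windows_1252`, and `decode_latin_1` are hypothetical, and the sketch assumes whole-buffer decoding with no heuristics. Because Latin-1 maps every byte to a character, the final lossy UTF-8 step can never be reached here and is left out.

```rust
/// Windows-1252 code points for bytes 0x80..=0x9F. `None` marks the five
/// bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) that the encoding leaves undefined.
const CP1252_HIGH: [Option<char>; 32] = [
    Some('\u{20AC}'), None,             Some('\u{201A}'), Some('\u{0192}'),
    Some('\u{201E}'), Some('\u{2026}'), Some('\u{2020}'), Some('\u{2021}'),
    Some('\u{02C6}'), Some('\u{2030}'), Some('\u{0160}'), Some('\u{2039}'),
    Some('\u{0152}'), None,             Some('\u{017D}'), None,
    None,             Some('\u{2018}'), Some('\u{2019}'), Some('\u{201C}'),
    Some('\u{201D}'), Some('\u{2022}'), Some('\u{2013}'), Some('\u{2014}'),
    Some('\u{02DC}'), Some('\u{2122}'), Some('\u{0161}'), Some('\u{203A}'),
    Some('\u{0153}'), None,             Some('\u{017E}'), Some('\u{0178}'),
];

/// Hypothetical helper: strict Windows-1252 decoding.
/// Returns `None` if any byte is undefined in Windows-1252.
fn decode_windows_1252(bytes: &[u8]) -> Option<String> {
    bytes
        .iter()
        .map(|&b| match b {
            0x80..=0x9F => CP1252_HIGH[(b - 0x80) as usize],
            // Outside 0x80..=0x9F, Windows-1252 agrees with Latin-1.
            _ => Some(b as char),
        })
        .collect()
}

/// Hypothetical helper: Latin-1 decoding. Every byte maps to the Unicode
/// code point with the same value, so this cannot fail.
fn decode_latin_1(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| b as char).collect()
}

/// Hypothetical stand-in for the `bytes_to_string_smart()` function named
/// in the commit message; the real function may differ.
fn decode_shell_output(bytes: &[u8]) -> String {
    // 1. Fast path: the bytes are already valid UTF-8.
    if let Ok(text) = std::str::from_utf8(bytes) {
        return text.to_owned();
    }
    // 2. Windows-1252, which covers the 0x80-0x9F punctuation range.
    if let Some(text) = decode_windows_1252(bytes) {
        return text;
    }
    // 3. Latin-1 as the last resort in this sketch.
    decode_latin_1(bytes)
}
```

For example, `decode_shell_output(b"caf\xE9")` takes the Windows-1252 path because a lone `0xE9` is not valid UTF-8, while `"пример".as_bytes()` is returned unchanged through the UTF-8 fast path.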


885 Commits