codex

mirror of https://github.com/openai/codex.git synced 2026-04-28 00:25:56 +00:00

Author	SHA1	Message	Date
Ahmed Ibrahim	146d54cede	Add collaboration_mode override to turns (#9408 )	2026-01-16 21:51:25 -08:00
Ahmed Ibrahim	ebdd8795e9	Turn-state sticky routing per turn (#9332 ) - capture the header from SSE/WS handshakes, store it per ModelClientSession using `OnceLock`, echo it on turn-scoped requests, and add SSE+WS integration tests for within-turn persistence + cross-turn reset. - keep `x-codex-turn-state` sticky within a user turn to maintain routing continuity for retries/tool follow-ups.	2026-01-16 09:30:11 -08:00
charley-oai	4a9c2bcc5a	Add text element metadata to types (#9235 ) Initial type tweaking PR to make the diff of https://github.com/openai/codex/pull/9116 smaller This should not change any behavior, just adds some fields to types	2026-01-14 16:41:50 -08:00
pakrym-oai	9f8d3c14ce	Fix flakiness in WebSocket tests (#9169 ) The connection was being added to the list after the WebSocket response was sent, so the test could sometimes race and observe connections before the list was updated. After this change, the connection and request are added to the list before the response is sent.	2026-01-13 15:09:59 -08:00
pakrym-oai	2d56519ecd	Support response.done and add integration tests (#9129 ) The agent loop using a persistent incremental web socket connection.	2026-01-13 16:12:30 +00:00
Ahmed Ibrahim	cbca43d57a	Send message by default mid turn. queue messages by tab (#9077 ) https://github.com/user-attachments/assets/03838730-4ddc-44df-a2c7-cb8ecda78660	2026-01-12 23:06:35 -08:00
pakrym-oai	490c1c1fdd	Add model client sessions (#9102 ) Maintain a long-running session.	2026-01-13 01:15:56 +00:00
zbarsky-openai	2a06d64bc9	feat: add support for building with Bazel (#8875 ) This PR configures Codex CLI so it can be built with [Bazel](https://bazel.build) in addition to Cargo. The `.bazelrc` includes configuration so that remote builds can be done using [BuildBuddy](https://www.buildbuddy.io). If you are familiar with Bazel, things should work as you expect, e.g., run `bazel test //... --keep-going` to run all the tests in the repo, but we have also added some new aliases in the `justfile` for convenience: - `just bazel-test` to run tests locally - `just bazel-remote-test` to run tests remotely (currently, the remote build is for x86_64 Linux regardless of your host platform). Note we are currently seeing the following test failures in the remote build, so we still need to figure out what is happening here: ``` failures: suite::compact::manual_compact_twice_preserves_latest_user_messages suite::compact_resume_fork::compact_resume_after_second_compaction_preserves_history suite::compact_resume_fork::compact_resume_and_fork_preserve_model_history_view ``` - `just build-for-release` to build release binaries for all platforms/architectures remotely To setup remote execution: - [Create a buildbuddy account](https://app.buildbuddy.io/) (OpenAI employees should also request org access at https://openai.buildbuddy.io/join/ with their `@openai.com` email address.) - [Copy your API key](https://app.buildbuddy.io/docs/setup/) to `~/.bazelrc` (add the line `build --remote_header=x-buildbuddy-api-key=YOUR_KEY`) - Use `--config=remote` in your `bazel` invocations (or add `common --config=remote` to your `~/.bazelrc`, or use the `just` commands) ## CI In terms of CI, this PR introduces `.github/workflows/bazel.yml`, which uses Bazel to run the tests _locally_ on Mac and Linux GitHub runners (we are working on supporting Windows, but that is not ready yet). 
Note that the failures we are seeing in `just bazel-remote-test` do not occur on these GitHub CI jobs, so everything in `.github/workflows/bazel.yml` is green right now. The `bazel.yml` uses extra config in `.github/workflows/ci.bazelrc` so that macOS CI jobs build _remotely_ on Linux hosts (using the `docker://docker.io/mbolin491/codex-bazel` Docker image declared in the root `BUILD.bazel`) using cross-compilation to build the macOS artifacts. Then these artifacts are downloaded locally to GitHub's macOS runner so the tests can be executed natively. This is the relevant config that enables this: ``` common:macos --config=remote common:macos --strategy=remote common:macos --strategy=TestRunner=darwin-sandbox,local ``` Because of the remote caching benefits we get from BuildBuddy, these new CI jobs can be extremely fast! For example, consider these two jobs that ran all the tests on Linux x86_64: - Bazel 1m37s https://github.com/openai/codex/actions/runs/20861063212/job/59940545209?pr=8875 - Cargo 9m20s https://github.com/openai/codex/actions/runs/20861063192/job/59940559592?pr=8875 For now, we will continue to run both the Bazel and Cargo jobs for PRs, but once we add support for Windows and running Clippy, we should be able to cutover to using Bazel exclusively for PRs, which should still speed things up considerably. We will probably continue to run the Cargo jobs post-merge for commits that land on `main` as a sanity check. Release builds will also continue to be done by Cargo for now. Earlier attempt at this PR: https://github.com/openai/codex/pull/8832 Earlier attempt to add support for Buck2, now abandoned: https://github.com/openai/codex/pull/8504 --------- Co-authored-by: David Zbarsky <dzbarsky@gmail.com> Co-authored-by: Michael Bolin <mbolin@openai.com>	2026-01-09 11:09:43 -08:00
jif-oai	1aed01e99f	renaming: task to turn (#8963 )	2026-01-09 17:31:17 +00:00
Ahmed Ibrahim	81caee3400	Add 5s timeout to models list call + integration test (#8942 ) - Enforce a 5s timeout around the remote models refresh to avoid hanging /models calls.	2026-01-08 18:06:10 -08:00
Ahmed Ibrahim	0d3e673019	remove `get_responses_requests` and `get_responses_request_bodies` to use in-place matcher (#8858 )	2026-01-08 13:57:48 -08:00
Michael Bolin	1e29774fce	fix: leverage codex_utils_cargo_bin() in codex-rs/core/tests/suite (#8887 ) This eliminates our dependency on the `escargot` crate and better prepares us for Bazel builds: https://github.com/openai/codex/pull/8875.	2026-01-08 14:56:16 +00:00
Michael Bolin	7520d8ba58	fix: leverage find_resource! macro in load_sse_fixture_with_id (#8888 ) This helps prepare us for Bazel builds: https://github.com/openai/codex/pull/8875.	2026-01-08 09:34:05 -05:00
jif-oai	116059c3a0	chore: unify conversation with thread name (#8830 ) Done and verified by Codex + refactor feature of RustRover	2026-01-07 17:04:53 +00:00
jif-oai	1dd1355df3	feat: agent controller (#8783 ) Added an agent control plane that lets sessions spawn or message other conversations via `AgentControl`. `AgentBus` (core/src/agent/bus.rs) keeps track of the last known status of a conversation. ConversationManager now holds shared state behind an Arc so AgentControl keeps only a weak back-reference, the goal is just to avoid explicit cycle reference. Follow-ups: * Build a small tool in the TUI to be able to see every agent and send manual message to each of them * Handle approval requests in this TUI * Add tools to spawn/communicate between agents (see related design) * Define agent types	2026-01-06 19:08:02 +00:00
Ahmed Ibrahim	66b7c673e9	Refresh on models etag mismatch (#8491 ) - Send models etag - Refresh models on 412 - This wires `ModelsManager` to `ModelFamily` so we don't mutate it mid-turn	2026-01-01 11:41:16 -08:00
Michael Bolin	e61bae12e3	feat: introduce codex-utils-cargo-bin as an alternative to assert_cmd::Command (#8496 ) This PR introduces a `codex-utils-cargo-bin` utility crate that wraps/replaces our use of `assert_cmd::Command` and `escargot::CargoBuild`. As you can infer from the introduction of `buck_project_root()` in this PR, I am attempting to make it possible to build Codex under [Buck2](https://buck2.build) as well as `cargo`. With Buck2, I hope to achieve faster incremental local builds (largely due to Buck2's [dice](https://buck2.build/docs/insights_and_knowledge/modern_dice/) build strategy, as well as benefits from its local build daemon) as well as faster CI builds if we invest in remote execution and caching. See https://buck2.build/docs/getting_started/what_is_buck2/#why-use-buck2-key-advantages for more details about the performance advantages of Buck2. Buck2 enforces stronger requirements in terms of build and test isolation. It discourages assumptions about absolute paths (which is key to enabling remote execution). Because the `CARGO_BIN_EXE_` environment variables that Cargo provides are absolute paths (which `assert_cmd::Command` reads), this is a problem for Buck2, which is why we need this `codex-utils-cargo-bin` utility. My WIP-Buck2 setup sets the `CARGO_BIN_EXE_` environment variables passed to a `rust_test()` build rule as relative paths. `codex-utils-cargo-bin` will resolve these values to absolute paths, when necessary. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/8496). * #8498 * __->__ #8496	2025-12-23 19:29:32 -08:00
Michael Bolin	3d4ced3ff5	chore: migrate from Config::load_from_base_config_with_overrides to ConfigBuilder (#8276 ) https://github.com/openai/codex/pull/8235 introduced `ConfigBuilder` and this PR updates all non-test call sites to use it instead of `Config::load_from_base_config_with_overrides()`. This is important because `load_from_base_config_with_overrides()` uses an empty `ConfigRequirements`, which is a reasonable default for testing so the tests are not influenced by the settings on the host. This method is now guarded by `#[cfg(test)]` so it cannot be used by business logic. Because `ConfigBuilder::build()` is `async`, many of the test methods had to be migrated to be `async`, as well. On the bright side, this made it possible to eliminate a bunch of `block_on_future()` stuff.	2025-12-18 16:12:52 -08:00
jif-oai	ae57e18947	feat: close unified_exec at end of turn (#8052 )	2025-12-16 12:16:43 +00:00
Ahmed Ibrahim	d802b18716	fix parallel tool calls (#7956 )	2025-12-16 01:28:27 +00:00
xl-openai	5d77d4db6b	Reimplement skills loading using SkillsManager + skills/list op. (#7914 ) refactor the way we load and manage skills: 1. Move skill discovery/caching into SkillsManager and reuse it across sessions. 2. Add the skills/list API (Op::ListSkills/SkillsListResponse) to fetch skills for one or more cwds. Also update app-server for VSCE/App; 3. Trigger skills/list during session startup so UIs preload skills and handle errors immediately.	2025-12-14 09:58:17 -08:00
Michael Bolin	b1905d3754	fix: added test helpers for platform-specific paths (#7954 ) This addresses post-merge feedback from https://github.com/openai/codex/pull/7856.	2025-12-13 00:14:12 +00:00
jif-oai	29381ba5c2	feat: add shell snapshot for shell command (#7786 )	2025-12-11 13:46:43 +00:00
xl-openai	b36ecb6c32	Inject SKILL.md when it's explicitly mentioned. (#7763 ) 1. Skills load once in core at session start; the cached outcome is reused across core and surfaced to TUI via SessionConfigured. 2. TUI detects explicit skill selections, and core injects the matching SKILL.md content into the turn when a selected skill is present.	2025-12-10 13:59:17 -08:00
Ahmed Ibrahim	cb9a189857	make `model` optional in config (#7769 ) - Make Config.model optional and centralize default-selection logic in ModelsManager, including a default_model helper (with codex-auto-balanced when available) so sessions now carry an explicit chosen model separate from the base config. - Resolve `model` once in `core` and `tui` from config. Then store the state of it on other structs. - Move refreshing models to be before resolving the default model	2025-12-10 11:19:00 -08:00
Ahmed Ibrahim	222a491570	load models from disk and set a ttl and etag (#7722 ) # External (non-OpenAI) Pull Request Requirements Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed: https://github.com/openai/codex/blob/main/docs/contributing.md If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes. Include a link to a bug report or enhancement request.	2025-12-08 13:43:04 -08:00
Pavel Krymets	f48d88067e	Fix unified_exec on windows (#7620 ) Fix unified_exec on windows Requires removal of PSUEDOCONSOLE_INHERIT_CURSOR flag so child processes don't attempt to wait for cursor position response (and timeout). https://github.com/wezterm/wezterm/compare/main...pakrym:wezterm:PSUEDOCONSOLE_INHERIT_CURSOR?expand=1 --------- Co-authored-by: pakrym-oai <pakrym@openai.com>	2025-12-05 20:09:43 +00:00
Dylan Hurd	a8cbbdbc6e	feat(core) Add login to shell_command tool (#6846 ) ## Summary Adds the `login` parameter to the `shell_command` tool - optional, defaults to true. ## Testing - [x] Tested locally	2025-12-05 11:03:25 -08:00
Ahmed Ibrahim	d08efb1743	Wire `with_remote_overrides` to construct model families (#7621 ) - This PR wires `with_remote_overrides` and make the `construct_model_families` an async function - Moves getting model family a level above to keep the function `sync` - Updates the tests to local, offline, and `sync` helper for model families	2025-12-05 10:40:15 -08:00
Ahmed Ibrahim	903b7774bc	Add models endpoint (#7603 ) - Use the codex-api crate to introduce models endpoint. - Add `models` to codex core tests helpers - Add `ModelsInfo` for the endpoint return type	2025-12-04 12:57:54 -08:00
Ahmed Ibrahim	9b2055586d	remove `model_family` from `config` (#7571 ) - Remove `model_family` from `config` - Make sure to still override config elements related to `model_family` like supporting reasoning	2025-12-04 11:57:58 -08:00
Ahmed Ibrahim	cee37a32b2	Migrate model family to models manager (#7565 ) This PR moves `ModelsFamily` to `openai_models`. It also propagates `ModelsManager` to session services and use it to drive model family. We also make `derive_default_model_family` private because it's a step towards what we want: one place that gives model configuration. This is a second step at having one source of truth for models information and config: `ModelsManager`. Next steps would be to remove `ModelsFamily` from config. That's massive because it's being used in 41 occasions mostly pre launching `codex`. Also, we need to make `find_family_for_model` private. It's also big because it's being used in 21 occasions ~ all tests.	2025-12-03 18:49:47 -08:00
jif-oai	51307eaf07	feat: retroactive image placeholder to prevent poisoning (#6774 ) If an image can't be read by the API, it will poison the entire history, preventing any new turn on the conversation. This detects such cases and replaces the image with a placeholder.	2025-12-03 11:35:56 +00:00
Dylan Hurd	5b25915d7e	fix(apply_patch) tests for shell_command (#7307 ) ## Summary Adds test coverage for invocations of apply_patch via shell_command with heredoc, to validate behavior. ## Testing - [x] These are tests	2025-12-01 15:09:22 -08:00
Josh McKinney	ec49b56874	chore: add cargo-deny configuration (#7119 ) - add GitHub workflow running cargo-deny on push/PR - document cargo-deny allowlist with workspace-dep notes and advisory ignores - align workspace crates to inherit version/edition/license for consistent checks	2025-11-24 12:22:18 -08:00
Ahmed Ibrahim	b519267d05	Account for encrypted reasoning for auto compaction (#7113 ) - The total token used returned from the api doesn't account for the reasoning items before the assistant message - Account for those for auto compaction - Add the encrypted reasoning effort in the common tests utils - Add a test to make sure it works as expected	2025-11-22 03:06:45 +00:00
pakrym-oai	767b66f407	Migrate coverage to shell_command (#7042 )	2025-11-21 03:44:00 +00:00
pakrym-oai	75f38f16dd	Run remote auto compaction (#6879 )	2025-11-19 00:43:58 -08:00
jif-oai	838531d3e4	feat: remote compaction (#6795 ) Co-authored-by: pakrym-oai <pakrym@openai.com>	2025-11-18 16:51:16 +00:00
Dylan Hurd	2b7378ac77	chore(core) Add shell_serialization coverage (#6810 ) ## Summary Similar to #6545, this PR updates the shell_serialization test suite to cover the various `shell` tool invocations we have. Note that this does not cover unified_exec, which has its own suite of tests. This should provide some test coverage for when we eventually consolidate serialization logic. ## Testing - [x] These are tests	2025-11-17 19:10:56 -08:00
pakrym-oai	6c384eb9c6	tests: replace mount_sse_once_match with mount_sse_once for SSE mocking (#6640 )	2025-11-13 18:04:05 -08:00
Ahmed Ibrahim	2a6e9b20df	Promote shared helpers for suite tests (#6460 ) ## Summary - add `TestCodex::submit_turn_with_policies` and extend the response helpers with reusable tool-call utilities - update the grep_files, read_file, list_dir, shell_serialization, and tools suites to rely on the shared helpers instead of local copies - make the list_dir helper return `anyhow::Result` so clippy no longer warns about `expect` ## Testing - `just fix -p codex-core` - `cargo test -p codex-core --test all suite::grep_files::grep_files_tool_collects_matches` - `cargo test -p codex-core suite::grep_files::grep_files_tool_collects_matches -- --ignored` (filter requests ignored tests so nothing runs, but the build stays clean) ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_69112d53abac83219813cab4d7cb6446)	2025-11-13 17:12:10 -08:00
Celia Chen	b8ec97c0ef	[App-server] add new v2 events: `item/reasoning/delta`, `item/agentMessage/delta` & `item/reasoning/summaryPartAdded` (#6559 ) core event to app server event mapping: 1. `codex/event/reasoning_content_delta` -> `item/reasoning/summaryTextDelta`. 2. `codex/event/reasoning_raw_content_delta` -> `item/reasoning/textDelta`. 3. `codex/event/agent_message_content_delta` -> `item/agentMessage/delta`. 4. `codex/event/agent_reasoning_section_break` -> `item/reasoning/summaryPartAdded`. Also added a change in core to pass down content index, summary index and item id from events. Tested with the `git checkout owen/app_server_test_client && cargo run -p codex-app-server-test-client -- send-message-v2 "hello"` and verified that new events are emitted correctly.	2025-11-14 00:25:01 +00:00
Dylan Hurd	2c1b693da4	chore(core) Consolidate apply_patch tests (#6545 ) ## Summary Consolidates our apply_patch tests into one suite, and ensures each test case tests the various ways the harness supports apply_patch: 1. Freeform custom tool call 2. JSON function tool 3. Simple shell call 4. Heredoc shell call There are a few test cases that are specific to a particular variant, I've left those alone. ## Testing - [x] This adds a significant number of tests	2025-11-13 15:52:39 -08:00
pakrym-oai	041d6ad902	Migrate prompt caching tests to test_codex (#6605 ) To hopefully fix the flakiness	2025-11-13 09:19:38 -08:00
pakrym-oai	f97874093e	Set verbosity to low for 5.1 (#6568 ) And improve test coverage	2025-11-13 01:40:52 +00:00
pakrym-oai	7d9ad3effd	Fix otel tests (#6541 ) Mount responses only once, remove unneeded retries and add a final assistant message to complete the turn.	2025-11-12 16:35:34 +00:00
zhao-oai	980886498c	Add user command event types (#6246 ) adding new user command event, logic in TUI to render user command events	2025-11-10 19:18:45 +00:00
Celia Chen	d3187dbc17	[App-server] v2 for account/updated and account/logout (#6175 ) V2 for `account/updated` and `account/logout` for app server. correspond to old `authStatusChange` and `LogoutChatGpt` respectively. Followup PRs will make other v2 endpoints call `account/updated` instead of `authStatusChange` too.	2025-11-03 22:01:33 -08:00
jif-oai	0508823075	test: undo (#6034 )	2025-10-31 14:46:24 +00:00


85 Commits