codex

mirror of https://github.com/openai/codex.git synced 2026-04-26 15:45:02 +00:00

Author	SHA1	Message	Date
Michael Bolin	476c1a7160	Remove `test-support` feature from `codex-core` and replace it with explicit test toggles (#11405 ) ## Why `codex-core` was being built in multiple feature-resolved permutations because test-only behavior was modeled as crate features. For a large crate, those permutations increase compile cost and reduce cache reuse. ## Net Change - Removed the `test-support` crate feature and related feature wiring so `codex-core` no longer needs separate feature shapes for test consumers. - Standardized cross-crate test-only access behind `codex_core::test_support`. - External test code now imports helpers from `codex_core::test_support`. - Underlying implementation hooks are kept internal (`pub(crate)`) instead of broadly public. ## Outcome - Fewer `codex-core` build permutations. - Better incremental cache reuse across test targets. - No intended production behavior change.	2026-02-10 22:44:02 -08:00
pakrym-oai	fbb3a30953	Remove WebSocket wire format (#10179 ) I'd like WireApi to go away (when chat is removed) and WebSockets is still responses API just over a different transport.	2026-01-29 13:50:53 -08:00
pakrym-oai	998e88b12a	Use test_codex more (#9961 ) Reduces boilderplate.	2026-01-26 18:52:10 -08:00
Dylan Hurd	8b3521ee77	feat(core) update Personality on turn (#9644 ) ## Summary Support updating Personality mid-Thread via UserTurn/OverwriteTurn. This is explicitly unused by the clients so far, to simplify PRs - app-server and tui implementations will be follow-ups. ## Testing - [x] added integration tests	2026-01-22 12:04:23 -08:00
Ahmed Ibrahim	146d54cede	Add collaboration_mode override to turns (#9408 )	2026-01-16 21:51:25 -08:00
charley-oai	4a9c2bcc5a	Add text element metadata to types (#9235 ) Initial type tweaking PR to make the diff of https://github.com/openai/codex/pull/9116 smaller This should not change any behavior, just adds some fields to types	2026-01-14 16:41:50 -08:00
pakrym-oai	2d56519ecd	Support response.done and add integration tests (#9129 ) The agent loop using a persistent incremental web socket connection.	2026-01-13 16:12:30 +00:00
jif-oai	1aed01e99f	renaming: task to turn (#8963 )	2026-01-09 17:31:17 +00:00
Ahmed Ibrahim	0d3e673019	remove `get_responses_requests` and `get_responses_request_bodies` to use in-place matcher (#8858 )	2026-01-08 13:57:48 -08:00
jif-oai	116059c3a0	chore: unify conversation with thread name (#8830 ) Done and verified by Codex + refactor feature of RustRover	2026-01-07 17:04:53 +00:00
jif-oai	1dd1355df3	feat: agent controller (#8783 ) Added an agent control plane that lets sessions spawn or message other conversations via `AgentControl`. `AgentBus` (core/src/agent/bus.rs) keeps track of the last known status of a conversation. ConversationManager now holds shared state behind an Arc so AgentControl keeps only a weak back-reference, the goal is just to avoid explicit cycle reference. Follow-ups: * Build a small tool in the TUI to be able to see every agent and send manual message to each of them * Handle approval requests in this TUI * Add tools to spawn/communicate between agents (see related design) * Define agent types	2026-01-06 19:08:02 +00:00
Michael Bolin	e61bae12e3	feat: introduce codex-utils-cargo-bin as an alternative to assert_cmd::Command (#8496 ) This PR introduces a `codex-utils-cargo-bin` utility crate that wraps/replaces our use of `assert_cmd::Command` and `escargot::CargoBuild`. As you can infer from the introduction of `buck_project_root()` in this PR, I am attempting to make it possible to build Codex under [Buck2](https://buck2.build) as well as `cargo`. With Buck2, I hope to achieve faster incremental local builds (largely due to Buck2's [dice](https://buck2.build/docs/insights_and_knowledge/modern_dice/) build strategy, as well as benefits from its local build daemon) as well as faster CI builds if we invest in remote execution and caching. See https://buck2.build/docs/getting_started/what_is_buck2/#why-use-buck2-key-advantages for more details about the performance advantages of Buck2. Buck2 enforces stronger requirements in terms of build and test isolation. It discourages assumptions about absolute paths (which is key to enabling remote execution). Because the `CARGO_BIN_EXE_` environment variables that Cargo provides are absolute paths (which `assert_cmd::Command` reads), this is a problem for Buck2, which is why we need this `codex-utils-cargo-bin` utility. My WIP-Buck2 setup sets the `CARGO_BIN_EXE_` environment variables passed to a `rust_test()` build rule as relative paths. `codex-utils-cargo-bin` will resolve these values to absolute paths, when necessary. --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/8496). * #8498 * __->__ #8496	2025-12-23 19:29:32 -08:00
Michael Bolin	3d4ced3ff5	chore: migrate from Config::load_from_base_config_with_overrides to ConfigBuilder (#8276 ) https://github.com/openai/codex/pull/8235 introduced `ConfigBuilder` and this PR updates all call non-test call sites to use it instead of `Config::load_from_base_config_with_overrides()`. This is important because `load_from_base_config_with_overrides()` uses an empty `ConfigRequirements`, which is a reasonable default for testing so the tests are not influenced by the settings on the host. This method is now guarded by `#[cfg(test)]` so it cannot be used by business logic. Because `ConfigBuilder::build()` is `async`, many of the test methods had to be migrated to be `async`, as well. On the bright side, this made it possible to eliminate a bunch of `block_on_future()` stuff.	2025-12-18 16:12:52 -08:00
Ahmed Ibrahim	d802b18716	fix parallel tool calls (#7956 )	2025-12-16 01:28:27 +00:00
xl-openai	5d77d4db6b	Reimplement skills loading using SkillsManager + skills/list op. (#7914 ) refactor the way we load and manage skills: 1. Move skill discovery/caching into SkillsManager and reuse it across sessions. 2. Add the skills/list API (Op::ListSkills/SkillsListResponse) to fetch skills for one or more cwds. Also update app-server for VSCE/App; 3. Trigger skills/list during session startup so UIs preload skills and handle errors immediately.	2025-12-14 09:58:17 -08:00
xl-openai	b36ecb6c32	Inject SKILL.md when it's explicitly mentioned. (#7763 ) 1. Skills load once in core at session start; the cached outcome is reused across core and surfaced to TUI via SessionConfigured. 2. TUI detects explicit skill selections, and core injects the matching SKILL.md content into the turn when a selected skill is present.	2025-12-10 13:59:17 -08:00
Ahmed Ibrahim	cb9a189857	make `model` optional in config (#7769 ) - Make Config.model optional and centralize default-selection logic in ModelsManager, including a default_model helper (with codex-auto-balanced when available) so sessions now carry an explicit chosen model separate from the base config. - Resolve `model` once in `core` and `tui` from config. Then store the state of it on other structs. - Move refreshing models to be before resolving the default model	2025-12-10 11:19:00 -08:00
Ahmed Ibrahim	9b2055586d	remove `model_family` from `config (#7571 ) - Remove `model_family` from `config` - Make sure to still override config elements related to `model_family` like supporting reasoning	2025-12-04 11:57:58 -08:00
Ahmed Ibrahim	cee37a32b2	Migrate model family to models manager (#7565 ) This PR moves `ModelsFamily` to `openai_models`. It also propagates `ModelsManager` to session services and use it to drive model family. We also make `derive_default_model_family` private because it's a step towards what we want: one place that gives model configuration. This is a second step at having one source of truth for models information and config: `ModelsManager`. Next steps would be to remove `ModelsFamily` from config. That's massive because it's being used in 41 occasions mostly pre launching `codex`. Also, we need to make `find_family_for_model` private. It's also big because it's being used in 21 occasions ~ all tests.	2025-12-03 18:49:47 -08:00
Dylan Hurd	5b25915d7e	fix(apply_patch) tests for shell_command (#7307 ) ## Summary Adds test coverage for invocations of apply_patch via shell_command with heredoc, to validate behavior. ## Testing - [x] These are tests	2025-12-01 15:09:22 -08:00
jif-oai	838531d3e4	feat: remote compaction (#6795 ) Co-authored-by: pakrym-oai <pakrym@openai.com>	2025-11-18 16:51:16 +00:00
Dylan Hurd	2b7378ac77	chore(core) Add shell_serialization coverage (#6810 ) ## Summary Similar to #6545, this PR updates the shell_serialization test suite to cover the various `shell` tool invocations we have. Note that this does not cover unified_exec, which has its own suite of tests. This should provide some test coverage for when we eventually consolidate serialization logic. ## Testing - [x] These are tests	2025-11-17 19:10:56 -08:00
Ahmed Ibrahim	2a6e9b20df	Promote shared helpers for suite tests (#6460 ) ## Summary - add `TestCodex::submit_turn_with_policies` and extend the response helpers with reusable tool-call utilities - update the grep_files, read_file, list_dir, shell_serialization, and tools suites to rely on the shared helpers instead of local copies - make the list_dir helper return `anyhow::Result` so clippy no longer warns about `expect` ## Testing - `just fix -p codex-core` - `cargo test -p codex-core --test all suite::grep_files::grep_files_tool_collects_matches` - `cargo test -p codex-core suite::grep_files::grep_files_tool_collects_matches -- --ignored` (filter requests ignored tests so nothing runs, but the build stays clean) ------ [Codex Task](https://chatgpt.com/codex/tasks/task_i_69112d53abac83219813cab4d7cb6446)	2025-11-13 17:12:10 -08:00
Dylan Hurd	2c1b693da4	chore(core) Consolidate apply_patch tests (#6545 ) ## Summary Consolidates our apply_patch tests into one suite, and ensures each test case tests the various ways the harness supports apply_patch: 1. Freeform custom tool call 2. JSON function tool 3. Simple shell call 4. Heredoc shell call There are a few test cases that are specific to a particular variant, I've left those alone. ## Testing - [x] This adds a significant number of tests	2025-11-13 15:52:39 -08:00
pakrym-oai	041d6ad902	Migrate prompt caching tests to test_codex (#6605 ) To hopefully fix the flakiness	2025-11-13 09:19:38 -08:00
pakrym-oai	f97874093e	Set verbosity to low for 5.1 (#6568 ) And improve test coverage	2025-11-13 01:40:52 +00:00
Dylan Hurd	4a55646a02	chore: testing on freeform apply_patch (#5952 ) ## Summary Duplicates the tests in `apply_patch_cli.rs`, but tests the freeform apply_patch tool as opposed to the function call path. The good news is that all the tests pass with zero logical tests, with the exception of the heredoc, which doesn't really make sense in the freeform tool context anyway. @jif-oai since you wrote the original tests in #5557, I'd love your opinion on the right way to DRY these test cases between the two. Happy to set up a more sophisticated harness, but didn't want to go down the rabbit hole until we agreed on the right pattern ## Testing - [x] These are tests	2025-10-30 10:40:48 -07:00
jif-oai	060637b4d4	feat: deprecation warning (#5825 ) <img width="955" height="311" alt="Screenshot 2025-10-28 at 14 26 25" src="https://github.com/user-attachments/assets/99729b3d-3bc9-4503-aab3-8dc919220ab4" />	2025-10-29 12:29:28 +00:00
jif-oai	6745b12427	chore: testing on apply_path (#5557 )	2025-10-23 17:00:48 +01:00
pakrym-oai	5cd8803998	Add a baseline test for resume initial messages (#5466 )	2025-10-21 11:45:01 -07:00
Dylan	0a0a10d8b3	fix: apply_patch shell_serialization tests (#4786 ) ## Summary Adds additional shell_serialization tests specifically for apply_patch and other cases. ## Test Plan - [x] These are all tests	2025-10-14 13:00:49 -07:00
Gabriel Peal	1d17ca1fa3	[MCP] Add support for MCP Oauth credentials (#4517 ) This PR adds oauth login support to streamable http servers when `experimental_use_rmcp_client` is enabled. This PR is large but represents the minimal amount of work required for this to work. To keep this PR smaller, login can only be done with `codex mcp login` and `codex mcp logout` but it doesn't appear in `/mcp` or `codex mcp list` yet. Fingers crossed that this is the last large MCP PR and that subsequent PRs can be smaller. Under the hood, credentials are stored using platform credential managers using the [keyring crate](https://crates.io/crates/keyring). When the keyring isn't available, it falls back to storing credentials in `CODEX_HOME/.credentials.json` which is consistent with how other coding agents handle authentication. I tested this on macOS, Windows, WSL (ubuntu), and Linux. I wasn't able to test the dbus store on linux but did verify that the fallback works. One quirk is that if you have credentials, during development, every build will have its own ad-hoc binary so the keyring won't recognize the reader as being the same as the write so it may ask for the user's password. I may add an override to disable this or allow users/enterprises to opt-out of the keyring storage if it causes issues. <img width="5064" height="686" alt="CleanShot 2025-09-30 at 19 31 40" src="https://github.com/user-attachments/assets/9573f9b4-07f1-4160-83b8-2920db287e2d" /> <img width="745" height="486" alt="image" src="https://github.com/user-attachments/assets/9562649b-ea5f-4f22-ace2-d0cb438b143e" />	2025-10-03 13:43:12 -04:00
pakrym-oai	5c7d9e27b1	Add notifier tests (#4064 ) Proposal: 1. Use anyhow for tests and avoid unwrap 2. Extract a helper for starting a test instance of codex	2025-09-23 14:25:46 +00:00

33 Commits