Commit Graph

21 Commits

Author SHA1 Message Date
Dylan Hurd
f3bbcc987d test(core): stabilize ARM bazel remote-model and parallelism tests (#11330)
## Summary
- keep wiremock MockServer handles alive through async assertions in
remote model suite tests
- assert /models request count in remote_models_hide_picker_only_models
- use a slightly higher parallel timing threshold on aarch64 while
keeping existing x86 threshold

## Validation
- just fmt
- targeted tests:
- cargo test -p codex-core --test all
suite::remote_models::remote_models_merge_replaces_overlapping_model --
--exact
- cargo test -p codex-core --test all
suite::remote_models::remote_models_hide_picker_only_models -- --exact
- cargo test -p codex-core --test all
suite::tool_parallelism::shell_tools_run_in_parallel -- --exact
- soak loop: 40 iterations of all three targeted tests

## Notes
- cargo test -p codex-core has one unrelated local-env failure in
shell_snapshot::tests::try_new_creates_and_deletes_snapshot_file from
exported certificate env content in this workspace.
- local bazel test //codex-rs/core:core-all-test failed to build due
missing rust-objcopy in this host toolchain.
2026-02-10 10:57:50 -08:00
Fouad Matin
693bac1851 fix(protocol): approval policy never prompt (#11288)
This removes overly directed language about how the model should behave
when it's in `approval_policy=never` mode.

---------

Co-authored-by: Dylan Hurd <dylan.hurd@openai.com>
2026-02-10 09:27:46 -08:00
gt-oai
54b401aa5f Deflake mixed parallel tools timing test (#11193)
```
        FAIL [   1.903s] (1926/3311) codex-core::all suite::tool_parallelism::mixed_parallel_tools_run_in_parallel
  stdout ───

    running 1 test
    test suite::tool_parallelism::mixed_parallel_tools_run_in_parallel ... FAILED

    failures:

    failures:
        suite::tool_parallelism::mixed_parallel_tools_run_in_parallel

    test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 684 filtered out; finished in 1.86s
    
  stderr ───

    thread 'suite::tool_parallelism::mixed_parallel_tools_run_in_parallel' (205083) panicked at core/tests/suite/tool_parallelism.rs:74:5:
    expected parallel execution to finish quickly, got 1.406255993s
    stack backtrace:
       0: __rustc::rust_begin_unwind
                 at /rustc/254b59607d4417e9dffbc307138ae5c86280fe4c/library/std/src/panicking.rs:689:5
       1: core::panicking::panic_fmt
                 at /rustc/254b59607d4417e9dffbc307138ae5c86280fe4c/library/core/src/panicking.rs:80:14
       2: all::suite::tool_parallelism::assert_parallel_duration
                 at ./tests/suite/tool_parallelism.rs:74:5
       3: all::suite::tool_parallelism::mixed_parallel_tools_run_in_parallel::{{closure}}
                 at ./tests/suite/tool_parallelism.rs:206:5
       4: <core::pin::Pin<P> as core::future::future::Future>::poll
                 at /home/runner/.rustup/toolchains/1.93.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/future.rs:133:9
       5: tokio::runtime::park::CachedParkThread::block_on::{{closure}}
                 at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/park.rs:284:71
       6: tokio::task::coop::with_budget
                 at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/task/coop/mod.rs:167:5
       7: tokio::task::coop::budget
                 at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/task/coop/mod.rs:133:5
       8: tokio::runtime::park::CachedParkThread::block_on
                 at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/park.rs:284:31
       9: tokio::runtime::context::blocking::BlockingRegionGuard::block_on
                 at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/context/blocking.rs:66:14
      10: tokio::runtime::scheduler::multi_thread::MultiThread::block_on::{{closure}}
                 at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/scheduler/multi_thread/mod.rs:89:22
      11: tokio::runtime::context::runtime::enter_runtime
                 at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/context/runtime.rs:65:16
      12: tokio::runtime::scheduler::multi_thread::MultiThread::block_on
                 at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/scheduler/multi_thread/mod.rs:88:9
      13: tokio::runtime::runtime::Runtime::block_on_inner
                 at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/runtime.rs:370:50
      14: tokio::runtime::runtime::Runtime::block_on
                 at /home/runner/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.49.0/src/runtime/runtime.rs:342:18
      15: all::suite::tool_parallelism::mixed_parallel_tools_run_in_parallel
                 at ./tests/suite/tool_parallelism.rs:208:7
      16: all::suite::tool_parallelism::mixed_parallel_tools_run_in_parallel::{{closure}}
                 at ./tests/suite/tool_parallelism.rs:178:52
      17: core::ops::function::FnOnce::call_once
                 at /home/runner/.rustup/toolchains/1.93.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:250:5
      18: core::ops::function::FnOnce::call_once
                 at /rustc/254b59607d4417e9dffbc307138ae5c86280fe4c/library/core/src/ops/function.rs:250:5
    note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
```
2026-02-09 15:16:54 +00:00
jif-oai
71e63f8d10 fix: flaky test (#10644) 2026-02-04 17:59:22 +00:00
jif-oai
33dc93e4d2 Enable parallel shell tools (#10505)
Summary
- mark the shell-related tools as supporting parallel tool calls so
exec_command, shell_command, etc. can run concurrently
- update expectations in tool parallelism tests to reflect the new
parallel behavior
- drop the unused serial duration helper from the suite

Testing
- Not run (not requested)
2026-02-03 18:05:02 +00:00
Dylan Hurd
8b3521ee77 feat(core) update Personality on turn (#9644)
## Summary
Support updating Personality mid-Thread via UserTurn/OverwriteTurn. This
is explicitly unused by the clients so far, to simplify PRs - app-server
and tui implementations will be follow-ups.

## Testing
- [x] added integration tests
2026-01-22 12:04:23 -08:00
charley-oai
41e38856f6 Reduce burst testing flake (#9549)
## Summary

- make paste-burst tests deterministic by injecting explicit timestamps
instead of relying on wall clock timing
- add time-aware helpers for input/submission paths so tests can drive
the burst heuristic precisely
- update burst-related tests to flush using computed timeouts while
preserving behavior assertions
- increase timeout slack in
shell_tools_start_before_response_completed_when_stream_delayed to
reduce flakiness
2026-01-21 16:42:31 -08:00
Ahmed Ibrahim
146d54cede Add collaboration_mode override to turns (#9408) 2026-01-16 21:51:25 -08:00
charley-oai
4a9c2bcc5a Add text element metadata to types (#9235)
Initial type tweaking PR to make the diff of
https://github.com/openai/codex/pull/9116 smaller

This should not change any behavior, just adds some fields to types
2026-01-14 16:41:50 -08:00
jif-oai
1aed01e99f renaming: task to turn (#8963) 2026-01-09 17:31:17 +00:00
Thibault Sottiaux
230a045ac9 chore: stabilize core tool parallelism test (#8805)
Set login=false for the shell tool in the timing-based parallelism test
so it does not depend on slow user login shells, making the test
deterministic without user-facing changes. This prevents occasional
flakes when running locally.
2026-01-07 09:26:47 +00:00
jif-oai
2bf57674d6 fix: flaky test 6 (#8175) 2025-12-17 11:59:13 +00:00
Ahmed Ibrahim
d802b18716 fix parallel tool calls (#7956) 2025-12-16 01:28:27 +00:00
pakrym-oai
e52cc38dfd Use use_model (#7121) 2025-11-21 22:10:52 +00:00
pakrym-oai
767b66f407 Migrate coverage to shell_command (#7042) 2025-11-21 03:44:00 +00:00
Ahmed Ibrahim
ddcc60a085 Update defaults to gpt-5.1 (#6652)
## Summary
- update documentation, example configs, and automation defaults to
reference gpt-5.1 / gpt-5.1-codex
- bump the CLI and core configuration defaults, model presets, and error
messaging to the new models while keeping the model-family/tool coverage
for legacy slugs
- refresh tests, fixtures, and TUI snapshots so they expect the upgraded
defaults

## Testing
- `cargo test -p codex-core
config::tests::test_precedence_fixture_with_gpt5_profile`


------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_6916c5b3c2b08321ace04ee38604fc6b)
2025-11-17 17:40:11 -08:00
pakrym-oai
cfcc87a953 Order outputs before inputs (#6691)
For better caching performance all output items should be rendered in
the order they were produced before all new input items (for example,
all function_call before all function_call_output).
2025-11-14 14:54:11 -08:00
pakrym-oai
9c903c4716 Add ItemStarted/ItemCompleted events for UserInputItem (#5306)
Adds a new ItemStarted event and delivers UserMessage as the first item
type (more to come).


Renames `InputItem` to `UserInput` considering we're using the `Item`
suffix for actual items.
2025-10-20 13:34:44 -07:00
jif-oai
f52320be86 feat: grep_files as a tool (#4820)
Add `grep_files` to be able to perform more action in parallel
2025-10-08 11:02:50 +01:00
jif-oai
338c2c873c bug: fix flaky test (#4878)
Fix flaky test by warming up the tools
2025-10-07 19:32:49 +01:00
jif-oai
dc3c6bf62a feat: parallel tool calls (#4663)
Add parallel tool calls. This is configurable at model level and tool
level
2025-10-05 16:10:49 +00:00