Commit Graph

1614 Commits

Author SHA1 Message Date
jif-oai
847a6092e6 fix: reduce usage of open_if_present (#11344) 2026-02-10 19:25:07 +00:00
pakrym-oai
0639c33892 Compare full request for websockets incrementality (#11343)
Tools can dynamically change mid-turn now. We need to be more thorough
about reusing incremental connections.
2026-02-10 19:14:36 +00:00
Michael Bolin
548afa5749 core: remove stale apply_patch SandboxPolicy TODO in seatbelt (#11345)
The `TODO` in `core/src/seatbelt.rs` claimed that `apply_patch` still needed to honor `SandboxPolicy`. That was true when the comment was added, but it is no longer true.

Analysis:
- The TODO was introduced in #1762, when seatbelt code was split out of `exec.rs`.
- `apply_patch` sandboxing was later implemented in #1705.
- Today, `apply_patch` calls are routed through the tool orchestrator and delegated to `ApplyPatchRuntime`, which executes via `execute_env()` using the active sandbox attempt policy.
- On macOS, the sandbox transform path for that execution still builds seatbelt args with `create_seatbelt_command_args(command, policy, sandbox_policy_cwd)`, so the same `SandboxPolicy` gates `apply_patch` writes and network behavior.

Because this behavior is already enforced, the TODO is stale and removing it avoids implying missing sandbox coverage where none exists.

No functional behavior change; comment-only cleanup.
2026-02-10 19:10:02 +00:00
guinness-oai
099ed802b2 Treat first rollout session_meta as canonical thread identity (#11241)
During thread/fork, the new rollout includes the fork’s own session_meta
plus copied history that can contain older session_meta entries from the
source thread. thread/list was overwriting metadata on later
session_meta lines, so a fork could be reported with the source thread’s
thread_id. This fix only uses the first session_meta, so the fork keeps
its own ID.
2026-02-10 10:32:11 -08:00
Matthew Zeng
48e415bdef [apps] Improve app installation flow. (#11249)
- [x] Add buttons to start the installation flow and verify installation
completes.
- [x] Hard refresh apps list when the /apps view opens.
2026-02-10 17:59:43 +00:00
Shijie Rao
c4b771a16f Fix: update parallel tool call exec approval to approve on request id (#11162)
### Summary

In parallel tool call, exec command approvals were not approved at
request level but at a turn level. i.e. when a single request is
approved, the system currently treats all requests in turn as approved.

### Before

https://github.com/user-attachments/assets/d50ed129-b3d2-4b2f-97fa-8601eb11f6a8

### After

https://github.com/user-attachments/assets/36528a43-a4aa-4775-9e12-f13287ef19fc
2026-02-10 09:38:00 -08:00
pakrym-oai
3322b99900 Remove ApiPrompt (#11265)
Keep things simple and build a full Responses API request request right
in the model client
2026-02-10 16:12:31 +00:00
jif-oai
e57892b211 feat: phase 2 consolidation (#11306)
Consolidation phase of memories

Cleaning and better handling of concurrency
2026-02-10 14:31:16 +00:00
jif-oai
d735df1f50 Extract hooks into dedicated crate (#11311)
Summary
- move `core/src/hooks` implementation into a new `codex-hooks` crate
with its own manifest
- update `codex-rs` workspace and `codex-core` crate to depend on the
extracted `hooks` crate and wire up the shared APIs
- ensure references, modules, and lockfile reflect the new crate layout

Testing
- Not run (not requested)
2026-02-10 13:42:17 +00:00
jif-oai
1d5eba0090 feat: align memory phase 1 and make it stronger (#11300)
## Align with the new phase-1 design

Basically we know run phase 1 in parallel by considering:
* Max 64 rollouts
* Max 1 month old
* Consider the most recent first

This PR also adds stronger parallelization capabilities by detecting
stale jobs, retry policies, ownership of computation to prevent double
computations etc etc
2026-02-10 13:42:09 +00:00
jif-oai
223fadc760 Fix spawn_agent input type (#11304) 2026-02-10 12:16:39 +00:00
jif-oai
87ccc5bbae feat: add connector capabilities to sub-agents (#11191) 2026-02-10 11:53:01 +00:00
jif-oai
6049ff02a0 memories: add extraction and prompt module foundation (#11200)
## Summary
- add the new `core/src/memories` module (phase-one parsing, rollout
filtering, storage, selection, prompts)
- add Askama-backed memory templates for stage-one input/system and
consolidation prompts
- add module tests for parsing, filtering, path bucketing, and summary
maintenance

## Testing
- just fmt
- cargo test -p codex-core --lib memories::
2026-02-10 10:10:24 +00:00
Michael Bolin
44ebf4588f feat: retain NetworkProxy, when appropriate (#11207)
As of this PR, `SessionServices` retains a
`Option<StartedNetworkProxy>`, if appropriate.

Now the `network` field on `Config` is `Option<NetworkProxySpec>`
instead of `Option<NetworkProxy>`.

Over in `Session::new()`, we invoke `NetworkProxySpec::start_proxy()` to
create the `StartedNetworkProxy`, which is a new struct that retains the
`NetworkProxy` as well as the `NetworkProxyHandle`. (Note that `Drop` is
implemented for `NetworkProxyHandle` to ensure the proxies are shutdown
when it is dropped.)

The `NetworkProxy` from the `StartedNetworkProxy` is threaded through to
the appropriate places.


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/11207).
* #11285
* __->__ #11207
2026-02-10 02:09:23 -08:00
alexsong-oai
9fded117ac feat: support configurable metric_exporter (#10940) 2026-02-10 08:14:28 +00:00
viyatb-oai
3391e5ea86 feat(sandbox): enforce proxy-aware network routing in sandbox (#11113)
## Summary
- expand proxy env injection to cover common tool env vars
(`HTTP_PROXY`/`HTTPS_PROXY`/`ALL_PROXY`/`NO_PROXY` families +
tool-specific variants)
- harden macOS Seatbelt network policy generation to route through
inferred loopback proxy endpoints and fail closed when proxy env is
malformed
- thread proxy-aware Linux sandbox flags and add minimal bwrap netns
isolation hook for restricted non-proxy runs
- add/refresh tests for proxy env wiring, Seatbelt policy generation,
and Linux sandbox argument wiring
2026-02-10 07:44:21 +00:00
alexsong-oai
91704c5672 feat: add SkillPolicy to skill metadata and support allow_implicit_invocation (#11244)
Tested by setting the policy in agents/openai.yaml to true, false, and
leaving it unset (default).
```
policy:
  allow_implicit_invocation: false
```
<img width="847" height="289" alt="Screenshot 2026-02-09 at 3 42 41 PM"
src="https://github.com/user-attachments/assets/d3476264-3355-47cf-894a-4ffba53e3481"
/>
2026-02-09 23:13:27 -08:00
Matthew Zeng
005e040f97 [apps] Add thread_id param to optionally load thread config for apps feature check. (#11279)
- [x] Add thread_id param to optionally load thread config for apps
feature check
2026-02-09 23:10:26 -08:00
Eric Traut
bb974c78de Disable dynamic model refresh for custom model providers (#11239)
The dynamic model refresh feature (`https://api.openai.com/v1/models`
endpoint) is currently gated on a runtime check for an auth method other
than API Key. It should be gated on a check specifically for ChatGPT
Auth because some custom model providers (e.g. for local models) use no
auth mechanism. A call to `self.auth_manager.auth_mode()` will return
`None` in this case.

Addresses #11213
2026-02-09 21:36:09 -08:00
Owen Lin
53741013ab fix(app-server): for external auth, replace id_token with chatgpt_acc… (#11240)
…ount_id and chatgpt_plan_type

### Summary
Following up on external auth mode which was introduced here:
https://github.com/openai/codex/pull/10012

Turns out some clients have a differently shaped ID token and don't have
a chosen workspace (aka chatgpt_account_id) encoded in their ID token.
So, let's replace `id_token` param with `chatgpt_account_id` and
`chatgpt_plan_type` (optional) when initializing the external ChatGPT
auth mode (`account/login/start` with `chatgptAuthTokens`).

The client was able to test end-to-end with a Codex build from this
branch and verified it worked!
2026-02-09 20:48:58 -08:00
Michael Bolin
862ab63071 chore: change ConfigState so it no longer depends on a single config.toml file for reloading (#11262)
If anything, it should depend on `ConfigLayerStack`.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/11262).
* #11207
* __->__ #11262
2026-02-09 19:26:39 -08:00
Ahmed Ibrahim
d1df3bd63b Revert "Revert "Update models.json"" (#11256)
Reverts openai/codex#11255
2026-02-09 19:22:41 -08:00
Ahmed Ibrahim
a1abd53b6a Remove offline fallback for models (#11238)
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2026-02-09 16:58:54 -08:00
Dylan Hurd
d65f09b913 fix(feature) UnderDevelopment feature must be off (#11242)
## Summary
1. Bump RemoteModels to Stable
2. Assert that all UnderDevelopment features are off by default

## Testing
- [x] Added unit test
2026-02-09 15:14:15 -08:00
Ahmed Ibrahim
481145e959 Use longest remote model prefix matching (#11228)
Match model metadata by longest matching remote slug prefix before local
fallback.

- Update `get_model_info` to prefer the most specific remote slug prefix
for the requested model.
- Add an integration test to assert `gpt-5.3-codex-test` resolves to
`gpt-5.3-codex` over `gpt-5.3`.
2026-02-09 15:05:56 -08:00
Matthew Zeng
d90df4761b [apps] Add gated instructions for Apps. (#10924)
- [x] Add gated instructions for Apps.
2026-02-09 14:48:09 -08:00
jif-oai
ffd4bd345c feat: tie shell snapshot to cwd (#11231)
Fix for this: https://github.com/openai/codex/issues/11223

Basically we tie the shell snapshot to a `cwd` to handle `cwd`-based env
setups
2026-02-09 22:14:39 +00:00
jif-oai
c2ca51273f feat: use a notify instead of grace to close ue process (#11219) 2026-02-09 22:14:33 +00:00
xl-openai
cca13fb03a skill-creator: Remove invalid reference. (#10960)
Remove references to two files that do not exist.
2026-02-09 13:37:27 -08:00
xl-openai
a33ee46e3b feat: extend skills/list to support additional roots. (#10835)
Add an optional perCwdExtraUserRoots
2026-02-09 13:30:38 -08:00
jif-oai
74ecd6e3b2 state: add memory consolidation lock primitives (#11199)
## Summary
- add a migration for memory_consolidation_locks
- add acquire/release lock primitives to codex-state runtime
- add core/state_db wrappers and cwd normalization for memory queries
and lock keys

## Testing
- cargo test -p codex-state memory_consolidation_lock_
- cargo test -p codex-core --lib state_db::
2026-02-09 21:04:20 +00:00
Anton Panasenko
becc3a0424 feat: search_tool (#10657)
**Why We Did This**
- The goal is to reduce MCP tool context pollution by not exposing the
full MCP tool list up front
- It forces an explicit discovery step (`search_tool_bm25`) so the model
narrows tool scope before making MCP calls, which helps relevance and
lowers prompt/tool clutter.

**What It Changed**
- Added a new experimental feature flag `search_tool` in
`core/src/features.rs:90` and `core/src/features.rs:430`.
- Added config/schema support for that flag in
`core/config.schema.json:214` and `core/config.schema.json:1235`.
- Added BM25 dependency (`bm25`) in `Cargo.toml:129` and
`core/Cargo.toml:23`.
- Added new tool handler `search_tool_bm25` in
`core/src/tools/handlers/search_tool_bm25.rs:18`.
- Registered the handler and tool spec in
`core/src/tools/handlers/mod.rs:11` and `core/src/tools/spec.rs:780` and
`core/src/tools/spec.rs:1344`.
- Extended `ToolsConfig` to carry `search_tool` enablement in
`core/src/tools/spec.rs:32` and `core/src/tools/spec.rs:56`.
- Injected dedicated developer instructions for tool-discovery workflow
in `core/src/codex.rs:483` and `core/src/codex.rs:1976`, using
`core/templates/search_tool/developer_instructions.md:1`.
- Added session state to store one-shot selected MCP tools in
`core/src/state/session.rs:27` and `core/src/state/session.rs:131`.
- Added filtering so when feature is enabled, only selected MCP tools
are exposed on the next request (then consumed) in
`core/src/codex.rs:3800` and `core/src/codex.rs:3843`.
- Added E2E suite coverage for
enablement/instructions/hide-until-search/one-turn-selection in
`core/tests/suite/search_tool.rs:72`,
`core/tests/suite/search_tool.rs:109`,
`core/tests/suite/search_tool.rs:147`, and
`core/tests/suite/search_tool.rs:218`.
- Refactored test helper utilities to support config-driven tool
collection in `core/tests/suite/tools.rs:281`.

**Net Behavioral Effect**
- With `search_tool` **off**: existing MCP behavior (tools exposed
normally).
- With `search_tool` **on**: MCP tools start hidden, model must call
`search_tool_bm25`, and only returned `selected_tools` are available for
the next model call.
2026-02-09 12:53:50 -08:00
Charley Cunningham
9450cd9ce5 core: add focused diagnostics for remote compaction overflows (#11133)
## Summary
- add targeted remote-compaction failure diagnostics in compact_remote
logging
- log the specific values needed to explain overflow timing:
  - last_api_response_total_tokens
  - estimated_tokens_of_items_added_since_last_successful_api_response
  - estimated_bytes_of_items_added_since_last_successful_api_response
  - failing_compaction_request_body_bytes
- simplify breakdown naming and remove
last_api_response_total_bytes_estimate (it was an approximation and not
useful for debugging)

## Why
When compaction fails with context_length_exceeded, we need concrete,
low-ambiguity numbers that map directly to:
1) what the API most recently reported, and
2) what local history added since then.

This keeps the failure logs actionable without adding broad, noisy
metrics.

## Testing
- just fmt
- cargo test -p codex-core
2026-02-09 12:42:20 -08:00
jif-oai
c2bfd1e473 Revert "chore: enable sub agents" (#11230)
Reverts openai/codex#11173
2026-02-09 20:22:38 +00:00
pakrym-oai
ccd17374cb Move warmup to the task level (#11216)
Instead of storing a special connection on the client level make the
regular task responsible for establishing a normal client session and
open a connection on it.

Then when the turn is started we pass in a pre-established session.
2026-02-09 10:57:52 -08:00
Eric Traut
9346d321d2 Fixed bug in file watcher that results in spurious skills update events and large log files (#11217)
On some platforms, the "notify" file watcher library emits events for
file opens and reads, not just file modifications or deletes. The
previous implementation didn't take this into account.

Furthermore, the `tracing.info!` call that I previously added was
emitting a lot of logs. I had assumed incorrectly that `info` level
logging was disabled by default, but it's apparently enabled for this
crate. This is resulting in large logs (hundreds of MB) for some users.
2026-02-09 10:33:57 -08:00
jif-oai
cfce286459 tools: remove get_memory tool and tests (#11198)
Drop this memory tool as the design changed
2026-02-09 17:47:36 +00:00
Charley Cunningham
0883e5d3e5 core: account for all post-response items in auto-compact token checks (#11132)
## Summary
- change compaction pre-check accounting to include all items added
after the last model-generated item, not only trailing codex-generated
outputs
- use that boundary consistently in get_total_token_usage() and
get_total_token_usage_breakdown()
- update history tests to cover user/tool-output items after the last
model item

## Why
last_token_usage.total_tokens is API-reported for the last successful
model response. After that point, local history may gain additional
items (user messages, injected context, tool outputs). Compaction
triggering must account for all of those items to avoid late compaction
attempts that can overflow context.

## Testing
- just fmt
- cargo test -p codex-core
2026-02-09 08:34:38 -08:00
gt-oai
9fe925b15a Load requirements on windows (#10770)
We support requirements on Unix, loading from
`/etc/codex/requirements.toml`. On MacOS, we also support MDM.

Now, on Windows, we'll load requirements from
`%ProgramData%\OpenAI\Codex\requirements.toml`
2026-02-09 16:05:38 +00:00
jif-oai
284c03ceab chore: enable sub agents (#11173) 2026-02-09 11:25:37 +00:00
jif-oai
753821c90f chore: enable shell snapshot (#11172) 2026-02-09 11:23:59 +00:00
jif-oai
6cf61725d0 feat: do not close unified exec processes across turns (#10799)
With this PR we do not close the unified exec processes (i.e. background
terminals) at the end of a turn unless:
* The user interrupt the turn
* The user decide to clean the processes through `app-server` or
`/clean`

I made sure that `codex exec` correctly kill all the processes
2026-02-09 10:27:46 +00:00
Michael Bolin
383b45279e feat: include NetworkConfig through ExecParams (#11105)
This PR adds the following field to `Config`:

```rust
pub network: Option<NetworkProxy>,
```

Though for the moment, it will always be initialized as `None` (this
will be addressed in a subsequent PR).

This PR does the work to thread `network` through to `execute_exec_env()`, `process_exec_tool_call()`, and `UnifiedExecRuntime.run()` to ensure it is available whenever we span a process.
2026-02-09 03:32:17 +00:00
Michael Bolin
ff74aaae21 chore: reverse the codex-network-proxy -> codex-core dependency (#11121) 2026-02-08 17:03:24 -08:00
Matthew Zeng
45b7763c3f [apps] Improve app loading. (#10994)
There are two concepts of apps that we load in the harness:

- Directory apps, which is all the apps that the user can install.
- Accessible apps, which is what the user actually installed and can be
$ inserted and be used by the model. These are extracted from the tools
that are loaded through the gateway MCP.

Previously we wait for both sets of apps before returning the full apps
list. Which causes many issues because accessible apps won't be
available to the UI or the model if directory apps aren't loaded or
failed to load.

In this PR we are separating them so that accessible apps can be loaded
separately and are instantly available to be shown in the UI and to be
provided in model context. We also added an app-server event so that
clients can subscribe to also get accessible apps without being blocked
on the full app list.

- [x] Separate accessible apps and directory apps loading.
- [x] `app/list` request will also emit `app/list/updated` notifications
that app-server clients can subscribe. Which allows clients to get
accessible apps list to render in the $ menu without being blocked by
directory apps.
- [x] Cache both accessible and directory apps with 1 hour TTL to avoid
reloading them when creating new threads.
- [x] TUI improvements to redraw $ menu and /apps menu when app list is
updated.
2026-02-08 15:24:56 -08:00
Michael Bolin
181b721ba5 feat: include [experimental_network] in <environment_context> (#11044)
If `NetworkConstraints` is set, then include the relevant settings on `<environment_context>`. Example:

```xml
<environment_context>
  <cwd>/repo</cwd>
  <shell>bash</shell>
  <network enabled="true">
    <allowed>api.example.com</allowed>
    <allowed>*.openai.com</allowed>
    <denied>blocked.example.com</denied>
  </network>
</environment_context>
```
2026-02-08 15:16:50 -08:00
Matthew Zeng
9f1009540b Upgrade rmcp to 0.14 (#10718)
- [x] Upgrade rmcp to 0.14
2026-02-08 15:07:53 -08:00
Tom
409ec76fbc Gate view_image tool by model input_modalities (#11051)
- Plumb input modalities from model catalog through the openai model
protocol. Default to text and image.
- Conditionally add the view_image tool only if input modalities support
image.
2026-02-08 10:45:26 -08:00
Eric Traut
b3de6c7f2b Defer persistence of rollout file (#11028)
- Defer rollout persistence for fresh threads (`InitialHistory::New`):
keep rollout events in memory and only materialize rollout file + state
DB row on first `EventMsg::UserMessage`.
- Keep precomputed rollout path available before materialization.
- Change `thread/start` to build thread response from live config
snapshot and optional precomputed path.
- Improve pre-materialization behavior in app-server/TUI: clearer
invalid-request errors for file-backed ops and a friendlier `/fork` “not
ready yet” UX.
- Update tests to match deferred semantics across
start/read/archive/unarchive/fork/resume/review flows.
- Improved resilience of user_shell test, which should be unrelated to
this change but must be affected by timing changes

For Reviewers:
* The primary change is in recorder.rs
* Most of the other changes were to fix up broken assumptions in
existing tests

Testing:
* Manually tested CLI
* Exercised app server paths by manually running IDE Extension with
rebuilt CLI binary
* Only user-visible change is that `/fork` in TUI generates visible
error if used prior to first turn
2026-02-07 23:05:03 -08:00
pakrym-oai
6d08298f4e Fallback to HTTP on UPGRADE_REQUIRED (#10824)
Allow the server to trigger a connection downgrade in case the protocol
changes in incompatible ways.
2026-02-08 05:06:33 +00:00