Commit Graph

69 Commits

Author SHA1 Message Date
github-actions[bot]
3626399811 Update models.json (#11274)
Automated update of models.json.

---------

Co-authored-by: aibrahim-oai <219906144+aibrahim-oai@users.noreply.github.com>
Co-authored-by: Ahmed Ibrahim <aibrahim@openai.com>
Co-authored-by: Sayan Sisodiya <sayan@openai.com>
2026-02-10 14:28:18 -08:00
Anton Panasenko
becc3a0424 feat: search_tool (#10657)
**Why We Did This**
- The goal is to reduce MCP tool context pollution by not exposing the
full MCP tool list up front
- It forces an explicit discovery step (`search_tool_bm25`) so the model
narrows tool scope before making MCP calls, which helps relevance and
lowers prompt/tool clutter.

**What It Changed**
- Added a new experimental feature flag `search_tool` in
`core/src/features.rs:90` and `core/src/features.rs:430`.
- Added config/schema support for that flag in
`core/config.schema.json:214` and `core/config.schema.json:1235`.
- Added BM25 dependency (`bm25`) in `Cargo.toml:129` and
`core/Cargo.toml:23`.
- Added new tool handler `search_tool_bm25` in
`core/src/tools/handlers/search_tool_bm25.rs:18`.
- Registered the handler and tool spec in
`core/src/tools/handlers/mod.rs:11` and `core/src/tools/spec.rs:780` and
`core/src/tools/spec.rs:1344`.
- Extended `ToolsConfig` to carry `search_tool` enablement in
`core/src/tools/spec.rs:32` and `core/src/tools/spec.rs:56`.
- Injected dedicated developer instructions for tool-discovery workflow
in `core/src/codex.rs:483` and `core/src/codex.rs:1976`, using
`core/templates/search_tool/developer_instructions.md:1`.
- Added session state to store one-shot selected MCP tools in
`core/src/state/session.rs:27` and `core/src/state/session.rs:131`.
- Added filtering so when feature is enabled, only selected MCP tools
are exposed on the next request (then consumed) in
`core/src/codex.rs:3800` and `core/src/codex.rs:3843`.
- Added E2E suite coverage for
enablement/instructions/hide-until-search/one-turn-selection in
`core/tests/suite/search_tool.rs:72`,
`core/tests/suite/search_tool.rs:109`,
`core/tests/suite/search_tool.rs:147`, and
`core/tests/suite/search_tool.rs:218`.
- Refactored test helper utilities to support config-driven tool
collection in `core/tests/suite/tools.rs:281`.

**Net Behavioral Effect**
- With `search_tool` **off**: existing MCP behavior (tools exposed
normally).
- With `search_tool` **on**: MCP tools start hidden, model must call
`search_tool_bm25`, and only returned `selected_tools` are available for
the next model call.
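
Roughly the one-shot selection flow described above; `SessionState`, `record_selection`, and `filter_mcp_tools` are illustrative names, not the actual items in `core/src/state/session.rs` or `core/src/codex.rs`:

```rust
use std::collections::{HashMap, HashSet};

#[derive(Clone)]
struct ToolSpec; // placeholder for the real tool spec type

/// Illustrative stand-in for the per-session state that stores the
/// one-shot list of MCP tools returned by `search_tool_bm25`.
#[derive(Default)]
struct SessionState {
    selected_mcp_tools: Option<HashSet<String>>,
}

impl SessionState {
    /// Called by the `search_tool_bm25` handler with the `selected_tools`
    /// it returned to the model.
    fn record_selection(&mut self, tools: Vec<String>) {
        self.selected_mcp_tools = Some(tools.into_iter().collect());
    }

    /// Called when building the tool list for the next model request.
    /// With the feature on, MCP tools stay hidden unless they were
    /// selected; the selection is consumed so it applies to one call only.
    fn filter_mcp_tools(
        &mut self,
        search_tool_enabled: bool,
        all_mcp_tools: HashMap<String, ToolSpec>,
    ) -> HashMap<String, ToolSpec> {
        if !search_tool_enabled {
            return all_mcp_tools; // existing behavior: expose everything
        }
        match self.selected_mcp_tools.take() {
            Some(selected) => all_mcp_tools
                .into_iter()
                .filter(|(name, _)| selected.contains(name))
                .collect(),
            None => HashMap::new(), // nothing selected yet: hide all MCP tools
        }
    }
}
```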
2026-02-09 12:53:50 -08:00
jif-oai
cfce286459 tools: remove get_memory tool and tests (#11198)
Drop this memory tool as the design changed
2026-02-09 17:47:36 +00:00
jif-oai
41f3b1ba0b feat: add memory tool (#10637)
Add a memory tool to retrieve a full memory based on its memory ID
2026-02-05 16:16:31 +00:00
Dylan Hurd
e482978261 fix(core) switching model appends model instructions (#10651)
## Summary
When switching models, we should append the instructions of the new
model to the conversation as a developer message.

## Test
- [x] Adds a unit test
2026-02-05 05:50:38 +00:00
Eric Traut
7bcc552325 Added support for live updates to skills (#10478)
Add a centralized FileWatcher in codex-core (using notify) that watches
skill roots from the config layer stack (recursive)

Send `SkillsChanged` events when relevant file system changes are
detected

On `SkillsChanged`:
* Invalidate the skills cache immediately in ThreadManager
* Emit EventMsg::SkillsUpdateAvailable to active sessions
~~* Broadcast a new app-server notification:
SkillsListUpdatedNotification~~

This change does not inject new items into the event stream. That means
the agent will not know about new skills, so it won't be able to
implicitly invoke new skills. It also won't know about changes to
existing skills, so if it has already read the contents of a modified
skill, it will not honor the new behavior.

This change also does not detect modifications to AGENTS.md.

I plan to address these limitations in a follow-on PR modeled after
#9985. Injection of new skills and AGENTS was deemed too risky, hence the
need to split the feature into two stages. The changes in this PR were
designed to easily accommodate the second stage once we have some other
foundational changes in place.
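
A rough sketch of the watcher wiring using the `notify` crate; the channel and the `SkillsEvent` type are illustrative, not the actual codex-core events:

```rust
use std::path::PathBuf;
use std::sync::mpsc;

use notify::{recommended_watcher, RecommendedWatcher, RecursiveMode, Watcher};

/// Illustrative signal; the real code emits EventMsg::SkillsUpdateAvailable.
enum SkillsEvent {
    SkillsChanged,
}

/// Watch all skill roots recursively and collapse any file-system change
/// under them into a single SkillsChanged signal.
fn watch_skill_roots(
    roots: &[PathBuf],
) -> notify::Result<(RecommendedWatcher, mpsc::Receiver<SkillsEvent>)> {
    let (tx, rx) = mpsc::channel();
    let mut watcher = recommended_watcher(move |res: notify::Result<notify::Event>| {
        if res.is_ok() {
            let _ = tx.send(SkillsEvent::SkillsChanged);
        }
    })?;
    for root in roots {
        watcher.watch(root, RecursiveMode::Recursive)?;
    }
    // The caller must keep the returned watcher alive; on each SkillsChanged
    // it invalidates the skills cache and notifies active sessions.
    Ok((watcher, rx))
}
```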

Testing: In addition to automated tests, I did manual testing to confirm
that newly-created skills, deleted skills, and renamed skills are
reflected in the TUI skill picker menu. Also confirmed that
modifications to behaviors for explicitly-invoked skills are honored.

---------

Co-authored-by: Xin Lin <xl@openai.com>
2026-02-04 15:25:03 -08:00
viyatb-oai
08926a3fb7 chore(arg0): advisory-lock janitor for codex tmp paths (#10039)
## Description

### What changed
- Switch the arg0 helper root from `~/.codex/tmp/path` to
`~/.codex/tmp/path2`
- Add `Arg0PathEntryGuard` to keep both the `TempDir` and an exclusive
`.lock` file alive for the process lifetime
- Add a startup janitor that scans `path2` and deletes only directories
whose lock can be acquired
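
Roughly how an advisory-lock janitor like this works, sketched with the `fs2` crate; the lock-file name and layout are illustrative, not the exact codex-arg0 paths:

```rust
use std::fs::{self, File};
use std::path::Path;

use fs2::FileExt;

/// Scan the tmp root and delete only entries whose `.lock` file can be
/// locked exclusively, i.e. entries no live process is still holding.
fn clean_stale_entries(root: &Path) -> std::io::Result<()> {
    for entry in fs::read_dir(root)? {
        let dir = entry?.path();
        if !dir.is_dir() {
            continue;
        }
        let lock_path = dir.join(".lock"); // illustrative lock-file name
        let Ok(lock_file) = File::open(&lock_path) else {
            continue; // no lock file: leave it alone in this sketch
        };
        // try_lock_exclusive fails while the owning process holds the lock,
        // so live entries are skipped and only orphaned ones are removed.
        if lock_file.try_lock_exclusive().is_ok() {
            let _ = fs::remove_dir_all(&dir);
        }
    }
    Ok(())
}
```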

### Tests
- `cargo clippy -p codex-arg0`
- `cargo clippy -p codex-core`
- `cargo test -p codex-arg0`
- `cargo test -p codex-core`
2026-02-03 21:38:31 +00:00
Dylan Hurd
0f9858394b feat(core,tui,app-server) personality migration (#10307)
## Summary
Keep existing users on Pragmatic, to preserve behavior while new users
default to Friendly

## Testing
- [x] Tested locally
- [x] add integration tests
2026-01-31 17:25:14 -07:00
sayan-oai
31d1e49340 fix: dont auto-enable web_search for azure (#10266)
Seeing issues with Azure after default-enabling web search: #10071,
#10257.

We need to work with Azure to fix this on the API side; for now, we're turning
off the default-enabling of web_search for Azure.

The diff is big because logic was moved so it can be reused.
2026-01-30 22:52:37 +00:00
jif-oai
f8056e62d4 nit: actually run tests (#10217) 2026-01-30 10:02:46 +01:00
pakrym-oai
3b1cddf001 Fall back to http when websockets fail (#10139)
I expect not all proxies work with WebSockets, so we fall back to HTTP when
WebSockets fail.
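
A minimal sketch of that fallback shape; the transport enum and probe closure are illustrative, not the actual client types:

```rust
/// Illustrative transports: prefer the WebSocket path, and if the
/// connection cannot be established (e.g. a proxy strips the Upgrade),
/// fall back to plain HTTP.
enum Transport {
    WebSocket,
    Http,
}

fn choose_transport(try_websocket: impl Fn() -> Result<(), String>) -> Transport {
    match try_websocket() {
        Ok(()) => Transport::WebSocket,
        Err(err) => {
            eprintln!("websocket connect failed ({err}); falling back to http");
            Transport::Http
        }
    }
}
```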
2026-01-29 10:36:21 -08:00
jif-oai
3878c3dc7c feat: sqlite 1 (#10004)
Add a `.sqlite` database to be used to store rollout metadata (and
later logs)
This PR is phase 1:
* Add the database and the required infrastructure
* Add a backfill of the database
* Persist the newly created rollout both in files and in the DB
* When we need to get metadata or a rollout, consider the `JSONL` as the
source of truth but compare the results with the DB and show any errors
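
A sketch of the phase-1 shape using `rusqlite`; the table name, columns, and reconciliation helper are illustrative, not the actual schema:

```rust
use rusqlite::{params, Connection};

/// Open (or create) the metadata database and ensure the illustrative
/// `rollouts` table exists.
fn open_db(path: &str) -> rusqlite::Result<Connection> {
    let conn = Connection::open(path)?;
    conn.execute(
        "CREATE TABLE IF NOT EXISTS rollouts (
             id         TEXT PRIMARY KEY,
             created_at TEXT NOT NULL,
             jsonl_path TEXT NOT NULL
         )",
        [],
    )?;
    Ok(conn)
}

/// Persist a newly created rollout in the DB in addition to the JSONL file.
fn record_rollout(
    conn: &Connection,
    id: &str,
    created_at: &str,
    jsonl_path: &str,
) -> rusqlite::Result<()> {
    conn.execute(
        "INSERT OR REPLACE INTO rollouts (id, created_at, jsonl_path) VALUES (?1, ?2, ?3)",
        params![id, created_at, jsonl_path],
    )?;
    Ok(())
}

/// Phase-1 reads: the JSONL stays the source of truth; the DB value is only
/// compared against it, and mismatches are reported rather than trusted.
fn check_against_jsonl(db_created_at: &str, jsonl_created_at: &str) -> Result<(), String> {
    if db_created_at == jsonl_created_at {
        Ok(())
    } else {
        Err(format!(
            "rollout metadata mismatch: db={db_created_at} jsonl={jsonl_created_at}"
        ))
    }
}
```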
2026-01-28 15:29:14 +01:00
Ahmed Ibrahim
c900de271a Warn users on enabling underdevelopment features (#9954)
<img width="938" height="73" alt="image"
src="https://github.com/user-attachments/assets/a2d5ac46-92c5-4828-b35e-0965c30cdf36"
/>
2026-01-27 01:58:05 +00:00
Dylan Hurd
96a72828be feat(core) ModelInfo.model_instructions_template (#9597)
## Summary
#9555 is the start of a rename, so I'm starting to standardize here.
Sets up `model_instructions` templating with a strongly-typed object for
injecting a personality block into the model instructions.
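
A rough sketch of a strongly-typed template context for `model_instructions`; the struct shapes and placeholder syntax are illustrative, not the actual `ModelInfo` fields:

```rust
/// Illustrative, strongly-typed context injected into the model
/// instructions template instead of ad-hoc string concatenation.
struct ModelInstructionsContext {
    personality: String,
}

/// Illustrative stand-in for ModelInfo.model_instructions_template.
struct ModelInfo {
    model_instructions_template: String,
}

impl ModelInfo {
    /// Render the template by substituting the personality block. Real code
    /// would use proper templating; a single placeholder keeps this short.
    fn render_instructions(&self, ctx: &ModelInstructionsContext) -> String {
        self.model_instructions_template
            .replace("{{personality}}", &ctx.personality)
    }
}

fn main() {
    let info = ModelInfo {
        model_instructions_template: "You are Codex.\n\n{{personality}}".to_string(),
    };
    let ctx = ModelInstructionsContext {
        personality: "Personality: friendly.".to_string(),
    };
    println!("{}", info.render_instructions(&ctx));
}
```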

## Testing
- [x] Added tests
- [x] Ran locally
2026-01-21 18:11:18 -08:00
Shijie Rao
57ec3a8277 Feat: request user input tool (#9472)
### Summary
* Add a `requestUserInput` tool that the model can use to gather
feedback or ask questions mid-turn.


### Tool input schema
```
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "requestUserInput input",
  "type": "object",
  "additionalProperties": false,
  "required": ["questions"],
  "properties": {
    "questions": {
      "type": "array",
      "description": "Questions to show the user (1-3). Prefer 1 unless multiple independent decisions block progress.",
      "minItems": 1,
      "maxItems": 3,
      "items": {
        "type": "object",
        "additionalProperties": false,
        "required": ["id", "header", "question"],
        "properties": {
          "id": {
            "type": "string",
            "description": "Stable identifier for mapping answers (snake_case)."
          },
          "header": {
            "type": "string",
            "description": "Short header label shown in the UI (12 or fewer chars)."
          },
          "question": {
            "type": "string",
            "description": "Single-sentence prompt shown to the user."
          },
          "options": {
            "type": "array",
            "description": "Optional 2-3 mutually exclusive choices. Put the recommended option first and suffix its label with \"(Recommended)\". Only include \"Other\" option if we want to include a free form option. If the question is free form in nature, do not include any option.",
            "minItems": 2,
            "maxItems": 3,
            "items": {
              "type": "object",
              "additionalProperties": false,
              "required": ["value", "label", "description"],
              "properties": {
                "value": {
                  "type": "string",
                  "description": "Machine-readable value (snake_case)."
                },
                "label": {
                  "type": "string",
                  "description": "User-facing label (1-5 words)."
                },
                "description": {
                  "type": "string",
                  "description": "One short sentence explaining impact/tradeoff if selected."
                }
              }
            }
          }
        }
      }
    }
  }
}
```

### Tool output schema
```
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "requestUserInput output",
  "type": "object",
  "additionalProperties": false,
  "required": ["answers"],
  "properties": {
    "answers": {
      "type": "object",
      "description": "Map of question id to user answer.",
      "additionalProperties": {
        "type": "object",
        "additionalProperties": false,
        "required": ["selected"],
        "properties": {
          "selected": {
            "type": "array",
            "items": { "type": "string" }
          },
          "other": {
            "type": ["string", "null"]
          }
        }
      }
    }
  }
}
```
2026-01-19 10:17:30 -08:00
Ahmed Ibrahim
1478a88eb0 Add collaboration developer instructions (#9424)
- Add additional instructions when they are available
- Make sure to update them whenever they change, on either UserInput or UserTurn
2026-01-18 01:31:14 +00:00
Ahmed Ibrahim
4d787a2cc2 Renew cache ttl on etag match (#9174)
so we don't do unnecessary fetches
2026-01-14 01:21:41 +00:00
pakrym-oai
2d56519ecd Support response.done and add integration tests (#9129)
The agent loop now uses a persistent, incremental WebSocket connection.
2026-01-13 16:12:30 +00:00
Ahmed Ibrahim
cbca43d57a Send message by default mid turn. queue messages by tab (#9077)
https://github.com/user-attachments/assets/03838730-4ddc-44df-a2c7-cb8ecda78660
2026-01-12 23:06:35 -08:00
pakrym-oai
490c1c1fdd Add model client sessions (#9102)
Maintain a long-running session.
2026-01-13 01:15:56 +00:00
Ahmed Ibrahim
87f7226cca Assemble sandbox/approval/network prompts dynamically (#8961)
- Add a single builder for developer permissions messaging that accepts
SandboxPolicy and approval policy. This builder now drives the developer
“permissions” message that’s injected at session start and any time
sandbox/approval settings change.
- Trim EnvironmentContext to only include cwd, writable roots, and
shell; removed sandbox/approval/network duplication and adjusted XML
serialization and tests accordingly.

Follow-up: adding a config value to replace the developer permissions
message for custom sandboxes.
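
A condensed sketch of the single builder described above; the enum variants and wording are illustrative, not the real SandboxPolicy/approval types:

```rust
/// Illustrative subsets of the real sandbox and approval policies.
enum SandboxPolicy {
    ReadOnly,
    WorkspaceWrite,
    DangerFullAccess,
}

enum ApprovalPolicy {
    Never,
    OnRequest,
    OnFailure,
}

/// One builder produces the developer "permissions" message, both at
/// session start and whenever sandbox/approval settings change.
fn build_permissions_message(sandbox: &SandboxPolicy, approval: &ApprovalPolicy) -> String {
    let sandbox_line = match sandbox {
        SandboxPolicy::ReadOnly => "Filesystem sandbox: read-only.",
        SandboxPolicy::WorkspaceWrite => "Filesystem sandbox: writes allowed inside the workspace.",
        SandboxPolicy::DangerFullAccess => "Filesystem sandbox: disabled (full access).",
    };
    let approval_line = match approval {
        ApprovalPolicy::Never => "Approvals: never ask the user.",
        ApprovalPolicy::OnRequest => "Approvals: ask when a command needs escalation.",
        ApprovalPolicy::OnFailure => "Approvals: ask only after a sandboxed command fails.",
    };
    format!("{sandbox_line}\n{approval_line}")
}
```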
2026-01-12 23:12:59 +00:00
charley-oai
d7cdcfc302 Add some tests for image attachments (#9080)
Some extra tests for https://github.com/openai/codex/pull/8950
2026-01-12 13:41:50 -08:00
pakrym-oai
acfd94f625 Add hierarchical agent prompt (#8996) 2026-01-09 13:47:37 -08:00
Channing Conger
21c6d40a44 Add feature for optional request compression (#8767)
Adds a new feature
`enable_request_compression` that compresses requests to the codex-backend
using zstd. It is currently wired up only for the codex-backend, so it only takes
effect for OpenAI providers using chatgpt::auth, even when the feature is enabled.

Also added a new info log line for evaluating the compression ratio and
the overhead of compressing before sending the request. You can enable it with
`RUST_LOG=$RUST_LOG,codex_client::transport=info`

```
2026-01-06T00:09:48.272113Z  INFO codex_client::transport: Compressed request body with zstd pre_compression_bytes=28914 post_compression_bytes=11485 compression_duration_ms=0
```
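
Roughly what the compression step looks like with the `zstd` crate; the compression level and helper name are assumptions, not the exact implementation:

```rust
use std::io::Cursor;
use std::time::Instant;

/// Compress a request body with zstd and report the ratio, mirroring the
/// info line shown above. Level 3 here is an assumed default.
fn compress_request_body(body: &[u8]) -> std::io::Result<Vec<u8>> {
    let start = Instant::now();
    let compressed = zstd::encode_all(Cursor::new(body), 3)?;
    println!(
        "Compressed request body with zstd pre_compression_bytes={} post_compression_bytes={} compression_duration_ms={}",
        body.len(),
        compressed.len(),
        start.elapsed().as_millis()
    );
    Ok(compressed)
}
```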
2026-01-07 13:21:40 -08:00
Ahmed Ibrahim
187924d761 Override truncation policy at model info level (#8856)
We used to override the truncation policy by comparing the model info against
the config value in the context manager. A better way to do it is to construct
the model info using the config value.
2026-01-07 13:06:20 -08:00
jif-oai
116059c3a0 chore: unify conversation with thread name (#8830)
Done and verified by Codex + refactor feature of RustRover
2026-01-07 17:04:53 +00:00
sayan-oai
54ded1a3c0 add web_search_cached flag (#8795)
Add `web_search_cached` feature to config. Enables `web_search` tool
with access only to cached/indexed results (see
[docs](https://platform.openai.com/docs/guides/tools-web-search#live-internet-access)).

This takes precedence over the existing `web_search_request`, which
continues to enable `web_search` over live results as it did before.

`web_search_cached` is disabled for review mode, as `web_search_request`
is.
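
A small sketch of the precedence described above; the config struct and mode enum are illustrative, not the actual feature plumbing:

```rust
/// Illustrative view of the two config flags.
struct WebSearchConfig {
    web_search_request: bool, // live results (existing flag)
    web_search_cached: bool,  // cached/indexed results only (new flag)
}

#[derive(Debug, PartialEq)]
enum WebSearchMode {
    Disabled,
    Cached,
    Live,
}

/// `web_search_cached` takes precedence over `web_search_request`, and
/// both are disabled in review mode.
fn resolve_web_search(cfg: &WebSearchConfig, review_mode: bool) -> WebSearchMode {
    if review_mode {
        return WebSearchMode::Disabled;
    }
    if cfg.web_search_cached {
        WebSearchMode::Cached
    } else if cfg.web_search_request {
        WebSearchMode::Live
    } else {
        WebSearchMode::Disabled
    }
}
```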
2026-01-06 14:53:59 -08:00
Ahmed Ibrahim
66b7c673e9 Refresh on models etag mismatch (#8491)
- Send models etag
- Refresh models on 412
- This wires `ModelsManager` to `ModelFamily` so we don't mutate it
mid-turn
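
A sketch of the refresh-on-412 behavior; the cache struct and the two closures are stand-ins for the real `ModelsManager` plumbing:

```rust
/// Illustrative cache for the models list plus its etag.
struct ModelsCache {
    etag: Option<String>,
    models_json: String,
}

/// Send the cached models etag with the request; a 412 response means the
/// server has newer models, so refetch and replace the cache. The refresh
/// only swaps the cache, so the ModelFamily in use is never mutated mid-turn.
fn ensure_models_fresh(
    cache: &mut ModelsCache,
    send_with_etag: impl Fn(Option<&str>) -> u16,
    fetch_models: impl Fn() -> (String, Option<String>),
) {
    if send_with_etag(cache.etag.as_deref()) == 412 {
        let (models_json, etag) = fetch_models();
        cache.models_json = models_json;
        cache.etag = etag;
    }
}
```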
2026-01-01 11:41:16 -08:00
Michael Bolin
0a7021de72 fix: enable resume_warning that was missing from mod.rs (#8333)
This test was introduced in https://github.com/openai/codex/pull/6507,
but was not included in `mod.rs`. It does not appear that it was getting
compiled?
2025-12-19 19:21:47 +00:00
xl-openai
b36ecb6c32 Inject SKILL.md when it's explicitly mentioned. (#7763)
1. Skills load once in core at session start; the cached outcome is
reused across core and surfaced to TUI via SessionConfigured.
2. TUI detects explicit skill selections, and core injects the matching
SKILL.md content into the turn when a selected skill is present.
2025-12-10 13:59:17 -08:00
jif-oai
7836aeddae feat: shell snapshotting (#7641) 2025-12-09 18:36:58 +00:00
Ahmed Ibrahim
53a486f7ea Add remote models feature flag (#7648)
2025-12-07 09:47:48 -08:00
Dylan Hurd
a8cbbdbc6e feat(core) Add login to shell_command tool (#6846)
## Summary
Adds the `login` parameter to the `shell_command` tool - optional,
defaults to true.

## Testing
- [x] Tested locally
2025-12-05 11:03:25 -08:00
Dylan Hurd
37c36024c7 chore(core): test apply_patch_cli on Windows (#7554)
## Summary
These tests pass on windows, let's enable them.

## Testing
- [x] These are more tests
2025-12-04 10:39:45 -08:00
Ahmed Ibrahim
71504325d3 Migrate model preset (#7542)
- Introduce `openai_models` in `/core`
- Move `PRESETS` under it
- Move `ModelPreset`, `ModelUpgrade`, and `ReasoningEffortPreset` to
`protocol`
- Introduce `Op::ListModels` and `EventMsg::AvailableModels`

Next steps:
- migrate `app-server` and `tui` to use the introduced Operation
2025-12-03 20:30:43 +00:00
LIHUA
397279d46e Fix: Improve text encoding for shell output in VSCode preview (#6178) (#6182)
## 🐛 Problem

Users running commands with non-ASCII characters (like Russian text
"пример") in Windows/WSL environments experience garbled text in
VSCode's shell preview window, with Unicode replacement characters (�)
appearing instead of the actual text.

**Issue**: https://github.com/openai/codex/issues/6178

## 🔧 Root Cause

The issue was in `StreamOutput<Vec<u8>>::from_utf8_lossy()` method in
`codex-rs/core/src/exec.rs`, which used `String::from_utf8_lossy()` to
convert shell output bytes to strings. This function immediately
replaces any invalid UTF-8 byte sequences with replacement characters,
without attempting to decode using other common encodings.

In Windows/WSL environments, shell output often uses encodings like:

- Windows-1252 (common Windows encoding)
- Latin-1/ISO-8859-1 (extended ASCII)

## 🛠️ Solution

Replaced the simple `String::from_utf8_lossy()` call with intelligent
encoding detection via a new `bytes_to_string_smart()` function that
tries multiple encoding strategies:

1. **UTF-8** (fast path for valid UTF-8)
2. **Windows-1252** (handles Windows-specific characters in 0x80-0x9F
range)
3. **Latin-1** (fallback for extended ASCII)
4. **Lossy UTF-8** (final fallback, same as before)
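
A simplified sketch of the layered decode; the Windows-1252 step assumes the `encoding_rs` crate, and the real `bytes_to_string_smart()` keeps the lossy-UTF-8 fallback and smarter heuristics:

```rust
/// Try UTF-8, then Windows-1252, then Latin-1 as a last resort.
fn bytes_to_string_smart(bytes: &[u8]) -> String {
    // 1. Fast path: valid UTF-8 passes through unchanged.
    if let Ok(s) = std::str::from_utf8(bytes) {
        return s.to_string();
    }
    // 2. Windows-1252 decodes the 0x80-0x9F range (curly quotes, dashes, ...);
    //    had_errors is true only for the handful of unmapped bytes.
    let (decoded, _, had_errors) = encoding_rs::WINDOWS_1252.decode(bytes);
    if !had_errors {
        return decoded.into_owned();
    }
    // 3. Latin-1 maps every byte 1:1 to U+0000..U+00FF, so it cannot fail
    //    and doubles as the final fallback in this sketch.
    bytes.iter().map(|&b| b as char).collect()
}
```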

## 📁 Changes

### New Files

- `codex-rs/core/src/text_encoding.rs` - Smart encoding detection module
- `codex-rs/core/tests/suite/text_encoding_fix.rs` - Integration tests

### Modified Files

- `codex-rs/core/src/lib.rs` - Added text_encoding module
- `codex-rs/core/src/exec.rs` - Updated StreamOutput::from_utf8_lossy()
- `codex-rs/core/tests/suite/mod.rs` - Registered new test module

## Testing

- **5 unit tests** covering UTF-8, Windows-1252, Latin-1, and fallback
scenarios
- **2 integration tests** simulating the exact Issue #6178 scenario
- **Demonstrates improvement** over the previous
`String::from_utf8_lossy()` approach

All tests pass:

```bash
cargo test -p codex-core text_encoding
cargo test -p codex-core test_shell_output_encoding_issue_6178
```

## 🎯 Impact

- **Eliminates garbled text** in VSCode shell preview for non-ASCII
content
- **Supports Windows/WSL environments** with proper encoding detection
- **Zero performance impact** for UTF-8 text (fast path)
- **Backward compatible** - UTF-8 content works exactly as before
- **Handles edge cases** with robust fallback mechanism

## 🧪 Test Scenarios

The fix has been tested with:

- Russian text ("пример")
- Windows-1252 quotation marks (“test”)
- Latin-1 accented characters ("café")
- Mixed encoding content
- Invalid byte sequences (graceful fallback)

## 📋 Checklist

- [X] Addresses the reported issue
- [X] Includes comprehensive tests
- [X] Maintains backward compatibility
- [X] Follows project coding conventions
- [X] No breaking changes

---------

Co-authored-by: Josh McKinney <joshka@openai.com>
2025-11-20 11:04:11 -08:00
zhao-oai
65c13f1ae7 execpolicy2 core integration (#6641)
This PR threads execpolicy2 into codex-core.

- Activated via feature flag: `exec_policy` (on by default)
- Reads and parses all `.codexpolicy` files in `codex_home/codex`
- Refactored the tool runtime API to integrate execpolicy logic

---------

Co-authored-by: Michael Bolin <mbolin@openai.com>
2025-11-19 16:50:43 -08:00
jif-oai
838531d3e4 feat: remote compaction (#6795)
Co-authored-by: pakrym-oai <pakrym@openai.com>
2025-11-18 16:51:16 +00:00
Dylan Hurd
2c1b693da4 chore(core) Consolidate apply_patch tests (#6545)
## Summary
Consolidates our apply_patch tests into one suite, and ensures each test
case tests the various ways the harness supports apply_patch:
1. Freeform custom tool call
2. JSON function tool
3. Simple shell call
4. Heredoc shell call

There are a few test cases that are specific to a particular variant,
I've left those alone.

## Testing
- [x] This adds a significant number of tests
2025-11-13 15:52:39 -08:00
pakrym-oai
6c36318bd8 Use codex-linux-sandbox in unified exec (#6480)
Unified exec isn't working on Linux because we don't provide the correct
arg0.

The library we use for pty management doesn't allow setting arg0
separately from executable. Use the same aliasing strategy we use for
`apply_patch` for `codex-linux-sandbox`.

Use `#[ctor]` hack to dispatch codex-linux-sandbox calls.


Addresses https://github.com/openai/codex/issues/6450
2025-11-10 17:17:09 -08:00
Eric Traut
0c647bc566 Don't retry "insufficient_quota" errors (#6340)
This PR makes an "insufficient quota" error fatal so we don't attempt to
retry it multiple times in the agent loop.

We have multiple bug reports from users about intermittent retry
behaviors, and this could explain some of them. With this change, we'll
eliminate the retries and surface a clear error message.

The PR is a nearly identical copy of [this
PR](https://github.com/openai/codex/pull/4837) contributed by
@abimaelmartell. The original PR has gone stale. Rather than wait for
the contributor to resolve merge conflicts, I wanted to get this change
in.
2025-11-06 15:12:01 -08:00
Eric Traut
c4ebe4b078 Improved token refresh handling to address "Re-connecting" behavior (#6231)
Currently, when the access token expires, we attempt to use the refresh
token to acquire a new access token. This works most of the time.
However, there are situations where the refresh token is expired,
exhausted (already used to perform a refresh), or revoked. In those
cases, the current logic treats the error as transient and attempts to
retry it repeatedly.

This PR changes the token refresh logic to differentiate between
permanent and transient errors. It also changes callers to treat the
permanent errors as fatal rather than retrying them. And it provides
better error messages to users so they understand how to address the
problem. These error messages should also help us further understand why
we're seeing examples of refresh token exhaustion.
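
A sketch of the permanent-versus-transient split; the error variants and the classification rule are illustrative:

```rust
/// Illustrative classification of token-refresh failures.
enum RefreshError {
    /// The refresh token is expired, exhausted, or revoked; retrying cannot
    /// help, so the caller surfaces a "please log in again" message instead
    /// of looping.
    Permanent(String),
    /// Network hiccups, 5xx responses, timeouts: safe to retry.
    Transient(String),
}

fn should_retry(err: &RefreshError) -> bool {
    match err {
        RefreshError::Permanent(_) => false,
        RefreshError::Transient(_) => true,
    }
}

fn classify_refresh_failure(status: u16, body: &str) -> RefreshError {
    // Illustrative rule: 4xx responses reporting an invalid grant are
    // permanent; everything else is treated as transient.
    if (400..500).contains(&status) && body.contains("invalid_grant") {
        RefreshError::Permanent(
            "refresh token expired, exhausted, or revoked; please log in again".into(),
        )
    } else {
        RefreshError::Transient(format!("token refresh failed with status {status}"))
    }
}
```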

Here is the error message in the CLI. The same text appears within the
extension.

<img width="863" height="38" alt="image"
src="https://github.com/user-attachments/assets/7ffc0d08-ebf0-4900-b9a9-265064202f4f"
/>

I also correct the spelling of "Re-connecting", which shouldn't have a
hyphen in it.

Testing: I manually tested these code paths by adding temporary code to
programmatically cause my refresh token to be exhausted (by calling the
token refresh endpoint in a tight loop more than 50 times). I then
simulated an access token expiration, which caused the token refresh
logic to be invoked. I confirmed that the updated logic properly handled
the error condition.

Note: We earlier discussed the idea of forcefully logging out the user
at the point where token refresh failed. I made several attempts to do
this, and all of them resulted in a bad UX. It's important to surface
this error to users in a way that explains the problem and tells them
that they need to log in again. We also previously discussed deleting
the auth.json file when this condition is detected. That also creates
problems because it effectively changes the auth status from logged in
to logged out, and this causes odd failures and inconsistent UX. I think
it's therefore better not to delete auth.json in this case. If the user
closes the CLI or VSCE and starts it again, we properly detect that the
access token is expired and the refresh token is "dead", and we force
the user to go through the login flow at that time.

This should address aspects of #6191, #5679, and #5505
2025-11-05 10:51:57 -08:00
jif-oai
0508823075 test: undo (#6034) 2025-10-31 14:46:24 +00:00
Dylan Hurd
4a55646a02 chore: testing on freeform apply_patch (#5952)
## Summary
Duplicates the tests in `apply_patch_cli.rs`, but tests the freeform
apply_patch tool as opposed to the function call path. The good news is
that all the tests pass with zero logical changes, with the exception of
the heredoc, which doesn't really make sense in the freeform tool
context anyway.

@jif-oai since you wrote the original tests in #5557, I'd love your
opinion on the right way to DRY these test cases between the two. Happy
to set up a more sophisticated harness, but didn't want to go down the
rabbit hole until we agreed on the right pattern

## Testing
- [x] These are tests
2025-10-30 10:40:48 -07:00
Ahmed Ibrahim
13e1d0362d Delegate review to codex instance (#5572)
In this PR, I am exploring migrating task kind to an invocation of
Codex. The main reason would be getting rid of multiple
`ConversationHistory` states and streamlining our context/history
management.

This approach depends on opening a channel between the sub-codex and
codex. This channel is responsible for forwarding `interactive`
(`approvals`) and `non-interactive` events. The `task` is responsible
for handling those events.

This opens the door for implementing `codex as a tool`, replacing
`compact` and `review`, and potentially subagents.

One consideration is that this code is very similar to `app-server`, especially
in the approval part. If in the future we want an interactive
`sub-codex`, we should consider using `codex-mcp`.
2025-10-29 21:04:25 +00:00
jif-oai
060637b4d4 feat: deprecation warning (#5825)
<img width="955" height="311" alt="Screenshot 2025-10-28 at 14 26 25"
src="https://github.com/user-attachments/assets/99729b3d-3bc9-4503-aab3-8dc919220ab4"
/>
2025-10-29 12:29:28 +00:00
Abhishek Bhardwaj
89591e4246 feature: Add "!cmd" user shell execution (#2471)
feature: Add "!cmd" user shell execution

This change lets users run local shell commands directly from the TUI by
prefixing their input with ! (e.g. !ls). Output is truncated to keep the
exec cell usable, and Ctrl-C cleanly
interrupts long-running commands (e.g. !sleep 10000).

**Summary of changes**

- Route Op::RunUserShellCommand through a dedicated UserShellCommandTask
(core/src/tasks/user_shell.rs), keeping the task logic out of codex.rs.
- Reuse the existing tool router: the task constructs a ToolCall for the
local_shell tool and relies on ShellHandler, so no manual MCP tool
lookup is required.
- Emit exec lifecycle events (ExecCommandBegin/ExecCommandEnd) so the
TUI can show command metadata, live output, and exit status.

**End-to-end flow**

**TUI handling**

1. ChatWidget::submit_user_message (TUI) intercepts messages starting
with !.
2. Non-empty commands dispatch Op::RunUserShellCommand { command };
empty commands surface a help hint.
3. No UserInput items are created, so nothing is enqueued for the model.

**Core submission loop**

4. The submission loop routes the op to handlers::run_user_shell_command
(core/src/codex.rs).
5. A fresh TurnContext is created and Session::spawn_user_shell_command
enqueues UserShellCommandTask.

**Task execution**

6. UserShellCommandTask::run emits TaskStartedEvent, formats the
command, and prepares a ToolCall targeting local_shell.
7. ToolCallRuntime::handle_tool_call dispatches to ShellHandler.

**Shell tool runtime**

8. ShellHandler::run_exec_like launches the process via the unified exec
runtime, honoring sandbox and shell policies, and emits
ExecCommandBegin/End.
9. Stdout/stderr are captured for the UI, but the task does not turn the
resulting ToolOutput into a model response.

**Completion**

10. After ExecCommandEnd, the task finishes without an assistant
message; the session marks it complete and the exec cell displays the
final output.

**Conversation context**

- The command and its output never enter the conversation history or the
model prompt; the flow is local-only.
- Only exec/task events are emitted for UI rendering.
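
A stripped-down sketch of the interception in steps 1-3 above; aside from `Op::RunUserShellCommand`, the names are illustrative:

```rust
/// Illustrative subset of the ops the TUI can dispatch to core.
enum Op {
    RunUserShellCommand { command: String },
    UserInput { text: String },
}

/// Non-empty "!" commands become Op::RunUserShellCommand, an empty "!"
/// surfaces a usage hint, and nothing is ever enqueued for the model.
fn submit_user_message(input: &str) -> Option<Op> {
    if let Some(rest) = input.strip_prefix('!') {
        let command = rest.trim();
        if command.is_empty() {
            eprintln!("usage: !<command>, e.g. !ls");
            return None;
        }
        return Some(Op::RunUserShellCommand {
            command: command.to_string(),
        });
    }
    Some(Op::UserInput {
        text: input.to_string(),
    })
}
```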

**Demo video**


https://github.com/user-attachments/assets/fcd114b0-4304-4448-a367-a04c43e0b996
2025-10-29 00:31:20 -07:00
Ahmed Ibrahim
7226365397 Centralize truncation in conversation history (#5652)
Move the truncation logic into conversation history so it can be used on any
tool output. This will help us avoid edge cases while truncating tool calls
and MCP calls.
2025-10-27 14:05:35 -07:00
jif-oai
6745b12427 chore: testing on apply_path (#5557) 2025-10-23 17:00:48 +01:00
pakrym-oai
5cd8803998 Add a baseline test for resume initial messages (#5466) 2025-10-21 11:45:01 -07:00