Commit Graph

346 Commits

Author SHA1 Message Date
xl-openai
42fa4c237f Support SKILL.toml file. (#9125)
We’re introducing a new SKILL.toml to hold skill metadata so Codex can
deliver a richer Skills experience.

Initial focus is the interface block:
```
[interface]
display_name = "Optional user-facing name"
short_description = "Optional user-facing description"
icon_small = "./assets/small-400px.png"
icon_large = "./assets/large-logo.svg"
brand_color = "#3B82F6"
default_prompt = "Optional surrounding prompt to use the skill with"
```

All fields are exposed via the app server API.
display_name and short_description are consumed by the TUI.
2026-01-15 11:20:04 -08:00
jif-oai
da44569fef nit: clean unified exec background processes (#9304)
To fix the occurences where the End event is received after the listener
stopped listenning
2026-01-15 18:34:33 +00:00
jif-oai
3fc487e0e0 feat: basic tui for event emission (#9209) 2026-01-15 15:53:02 +00:00
charley-oai
4a9c2bcc5a Add text element metadata to types (#9235)
Initial type tweaking PR to make the diff of
https://github.com/openai/codex/pull/9116 smaller

This should not change any behavior, just adds some fields to types
2026-01-14 16:41:50 -08:00
Josh McKinney
27da8a68d3 fix(tui): disable double-press quit shortcut (#9220)
Disables the default Ctrl+C/Ctrl+D double-press quit UX (keeps the code
path behind a const) while we rethink the quit/interrupt flow.

Tests:
- just fmt
- cargo clippy --fix --all-features --tests --allow-dirty --allow-no-vcs
-p codex-tui
- cargo test -p codex-tui --lib
2026-01-14 20:28:18 +00:00
Ahmed Ibrahim
8e937fbba9 Get model on session configured (#9191)
- Don't try to precompute model unless you know it from `config`
- Block `/model` on session configured
- Queue messages until session configured
- show "loading" in status until session configured
2026-01-14 10:20:41 -08:00
jif-oai
6a939ed7a4 feat: emit events around collab tools (#9095)
Emit the following events around the collab tools. On the `app-server`
this will be under `item/started` and `item/completed`
```
#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabAgentSpawnBeginEvent {
    /// Identifier for the collab tool call.
    pub call_id: String,
    /// Thread ID of the sender.
    pub sender_thread_id: ThreadId,
    /// Initial prompt sent to the agent. Can be empty to prevent CoT leaking at the
    /// beginning.
    pub prompt: String,
}

#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabAgentSpawnEndEvent {
    /// Identifier for the collab tool call.
    pub call_id: String,
    /// Thread ID of the sender.
    pub sender_thread_id: ThreadId,
    /// Thread ID of the newly spawned agent, if it was created.
    pub new_thread_id: Option<ThreadId>,
    /// Initial prompt sent to the agent. Can be empty to prevent CoT leaking at the
    /// beginning.
    pub prompt: String,
    /// Last known status of the new agent reported to the sender agent.
    pub status: AgentStatus,
}

#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabAgentInteractionBeginEvent {
    /// Identifier for the collab tool call.
    pub call_id: String,
    /// Thread ID of the sender.
    pub sender_thread_id: ThreadId,
    /// Thread ID of the receiver.
    pub receiver_thread_id: ThreadId,
    /// Prompt sent from the sender to the receiver. Can be empty to prevent CoT
    /// leaking at the beginning.
    pub prompt: String,
}

#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabAgentInteractionEndEvent {
    /// Identifier for the collab tool call.
    pub call_id: String,
    /// Thread ID of the sender.
    pub sender_thread_id: ThreadId,
    /// Thread ID of the receiver.
    pub receiver_thread_id: ThreadId,
    /// Prompt sent from the sender to the receiver. Can be empty to prevent CoT
    /// leaking at the beginning.
    pub prompt: String,
    /// Last known status of the receiver agent reported to the sender agent.
    pub status: AgentStatus,
}

#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabWaitingBeginEvent {
    /// Thread ID of the sender.
    pub sender_thread_id: ThreadId,
    /// Thread ID of the receiver.
    pub receiver_thread_id: ThreadId,
    /// ID of the waiting call.
    pub call_id: String,
}

#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabWaitingEndEvent {
    /// Thread ID of the sender.
    pub sender_thread_id: ThreadId,
    /// Thread ID of the receiver.
    pub receiver_thread_id: ThreadId,
    /// ID of the waiting call.
    pub call_id: String,
    /// Last known status of the receiver agent reported to the sender agent.
    pub status: AgentStatus,
}

#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabCloseBeginEvent {
    /// Identifier for the collab tool call.
    pub call_id: String,
    /// Thread ID of the sender.
    pub sender_thread_id: ThreadId,
    /// Thread ID of the receiver.
    pub receiver_thread_id: ThreadId,
}

#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, JsonSchema, TS)]
pub struct CollabCloseEndEvent {
    /// Identifier for the collab tool call.
    pub call_id: String,
    /// Thread ID of the sender.
    pub sender_thread_id: ThreadId,
    /// Thread ID of the receiver.
    pub receiver_thread_id: ThreadId,
    /// Last known status of the receiver agent reported to the sender agent before
    /// the close.
    pub status: AgentStatus,
}
```
2026-01-14 17:55:57 +00:00
Josh McKinney
4283a7432b tui: double-press Ctrl+C/Ctrl+D to quit (#8936)
## Problem

Codex’s TUI quit behavior has historically been easy to trigger
accidentally and hard to reason
about.

- `Ctrl+C`/`Ctrl+D` could terminate the UI immediately, which is a
common key to press while trying
  to dismiss a modal, cancel a command, or recover from a stuck state.
- “Quit” and “shutdown” were not consistently separated, so some exit
paths could bypass the
  shutdown/cleanup work that should run before the process terminates.

This PR makes quitting both safer (harder to do by accident) and more
uniform across quit
gestures, while keeping the shutdown-first semantics explicit.

## Mental model

After this change, the system treats quitting as a UI request that is
coordinated by the app
layer.

- The UI requests exit via `AppEvent::Exit(ExitMode)`.
- `ExitMode::ShutdownFirst` is the normal user path: the app triggers
`Op::Shutdown`, continues
rendering while shutdown runs, and only ends the UI loop once shutdown
has completed.
- `ExitMode::Immediate` exists as an escape hatch (and as the
post-shutdown “now actually exit”
signal); it bypasses cleanup and should not be the default for
user-triggered quits.

User-facing quit gestures are intentionally “two-step” for safety:

- `Ctrl+C` and `Ctrl+D` no longer exit immediately.
- The first press arms a 1-second window and shows a footer hint (“ctrl
+ <key> again to quit”).
- Pressing the same key again within the window requests a
shutdown-first quit; otherwise the
  hint expires and the next press starts a fresh window.

Key routing remains modal-first:

- A modal/popup gets first chance to consume `Ctrl+C`.
- If a modal handles `Ctrl+C`, any armed quit shortcut is cleared so
dismissing a modal cannot
  prime a subsequent `Ctrl+C` to quit.
- `Ctrl+D` only participates in quitting when the composer is empty and
no modal/popup is active.

The design doc `docs/exit-confirmation-prompt-design.md` captures the
intended routing and the
invariants the UI should maintain.

## Non-goals

- This does not attempt to redesign modal UX or make modals uniformly
dismissible via `Ctrl+C`.
It only ensures modals get priority and that quit arming does not leak
across modal handling.
- This does not introduce a persistent confirmation prompt/menu for
quitting; the goal is to keep
  the exit gesture lightweight and consistent.
- This does not change the semantics of core shutdown itself; it changes
how the UI requests and
  sequences it.

## Tradeoffs

- Quitting via `Ctrl+C`/`Ctrl+D` now requires a deliberate second
keypress, which adds friction for
  users who relied on the old “instant quit” behavior.
- The UI now maintains a small time-bounded state machine for the armed
shortcut, which increases
  complexity and introduces timing-dependent behavior.

This design was chosen over alternatives (a modal confirmation prompt or
a long-lived “are you
sure” state) because it provides an explicit safety barrier while
keeping the flow fast and
keyboard-native.

## Architecture

- `ChatWidget` owns the quit-shortcut state machine and decides when a
quit gesture is allowed
  (idle vs cancellable work, composer state, etc.).
- `BottomPane` owns rendering and local input routing for modals/popups.
It is responsible for
consuming cancellation keys when a view is active and for
showing/expiring the footer hint.
- `App` owns shutdown sequencing: translating
`AppEvent::Exit(ShutdownFirst)` into `Op::Shutdown`
  and only terminating the UI loop when exit is safe.

This keeps “what should happen” decisions (quit vs interrupt vs ignore)
in the chat/widget layer,
while keeping “how it looks and which view gets the key” in the
bottom-pane layer.

## Observability

You can tell this is working by running the TUIs and exercising the quit
gestures:

- While idle: pressing `Ctrl+C` (or `Ctrl+D` with an empty composer and
no modal) shows a footer
hint for ~1 second; pressing again within that window exits via
shutdown-first.
- While streaming/tools/review are active: `Ctrl+C` interrupts work
rather than quitting.
- With a modal/popup open: `Ctrl+C` dismisses/handles the modal (if it
chooses to) and does not
arm a quit shortcut; a subsequent quick `Ctrl+C` should not quit unless
the user re-arms it.

Failure modes are visible as:

- Quits that happen immediately (no hint window) from `Ctrl+C`/`Ctrl+D`.
- Quits that occur while a modal is open and consuming `Ctrl+C`.
- UI termination before shutdown completes (cleanup skipped).

## Tests

- Updated/added unit and snapshot coverage in `codex-tui` and
`codex-tui2` to validate:
  - The quit hint appears and expires on the expected key.
- Double-press within the window triggers a shutdown-first quit request.
- Modal-first routing prevents quit bypass and clears any armed shortcut
when a modal consumes
    `Ctrl+C`.

These tests focus on the UI-level invariants and rendered output; they
do not attempt to validate
real terminal key-repeat timing or end-to-end process shutdown behavior.

---
Screenshot:
<img width="912" height="740" alt="Screenshot 2026-01-13 at 1 05 28 PM"
src="https://github.com/user-attachments/assets/18f3d22e-2557-47f2-a369-ae7a9531f29f"
/>
2026-01-14 17:42:52 +00:00
Ahmed Ibrahim
bdae0035ec Render exec output deltas inline (#9194) 2026-01-14 08:11:12 -08:00
jif-oai
7532f34699 fix: drop double waiting header in TUI (#9145) 2026-01-14 09:52:34 +00:00
Marius Wichtner
3c711f3d16 Fix spinner/Esc interrupt when MCP startup completes mid-turn (#8661)
## **Problem**

Codex’s TUI uses a single “task running” indicator (spinner + Esc interrupt hint)
to communicate “the UI is busy”. In practice, “busy” can mean two different
things: an agent turn is running, or MCP servers are still starting up. Without a
clear contract, those lifecycles can interfere: startup completion can clear the
spinner while a turn is still in progress, or the UI can appear idle while MCP is
still booting. This is user-visible confusion during the most important moments
(startup and the first turn), so it was worth making the contract explicit and
guarding it.

## **Mental model**

`ChatWidget` is the UI-side adapter for the `codex_core::protocol` event stream.
It receives `EventMsg` events and updates two major UI surfaces: the transcript
(history/streaming cells) and the bottom pane (composer + status indicator).

The key concept after this change is that the bottom pane’s “task running”
indicator is treated as **derived UI-busy state**, not “agent is running”. It is
considered active while either:
- an agent turn is in progress (`TurnStarted` → completion/abort), or
- MCP startup is in progress (`McpStartupUpdate` → `McpStartupComplete`).

Those lifecycles are tracked independently, and the bottom-pane indicator is
defined as their union.

## **Non-goals**

- This does not introduce separate UI indicators for “turn busy” vs “MCP busy”.
- This does not change MCP startup behavior, ordering guarantees, or core
  protocol semantics.
- This does not rework unrelated status/header rendering or transcript layout.

## **Tradeoffs**

- The “one flag represents multiple lifecycles” approach remains lossy: it
  preserves correct “busy vs idle” semantics but cannot express *which* kind of
  busy is happening without further UI changes.
- The design keeps complexity low by keeping a single derived boolean, rather
  than adding a more expressive bottom-pane state machine. That’s chosen because
  it matches existing UX and minimizes churn while fixing the confusion.

## **Architecture**

- `codex-core` owns the actual lifecycles and emits `codex_core::protocol`
  events.
- `ChatWidget` owns the UI interpretation of those lifecycles. It is responsible
  for keeping the bottom pane’s derived “busy” state consistent with the event
  stream, and for updating the status header when MCP progress updates arrive.
- The bottom pane remains a dumb renderer of the single “task running” flag; it
  does not learn about MCP or agent turns directly.

## **Observability**

- When working: the spinner/Esc hint stays visible during MCP startup and does
  not disappear mid-turn when `McpStartupComplete` arrives; startup status
  headers can update without clearing “busy” for an active turn.
- When broken: you’ll see the spinner/hint flicker off while output is still
  streaming, or the UI appears idle while MCP startup status is still changing.

## **Tests**

- Adds/strengthens a regression test that asserts MCP startup completion does
  not clear the “task running” indicator for an active turn (in both `tui` and
  `tui2` variants).
- These tests prove the **contract** (“busy is the union of turn + startup”) at
  the UI boundary; they do not attempt to validate MCP startup ordering,
  real-world startup timing, or backend integration behavior.

Fixes #7017

Signed-off-by: 2mawi2 <2mawi2@users.noreply.github.com>
Co-authored-by: 2mawi2 <2mawi2@users.noreply.github.com>
Co-authored-by: Josh McKinney <joshka@openai.com>
2026-01-13 13:56:09 -08:00
Matthew Zeng
e25d2ab3bf Fresh tooltips (#9130)
Fresh tooltips
2026-01-13 13:06:35 -08:00
charley-oai
57ba758df5 Fix queued messages during /review (#9122)
Sending a message during /review interrupts the review, whereas during
normal operation, sending a message while the agent is running will
queue the message. This is unexpected behavior, and since /review
usually takes a while, it takes away a potentially useful operation.

Summary
- Treat review mode as an active task for message queuing so inputs
don’t inject into the running review turn.
- Prevents user submissions from rendering immediately in the transcript
while the review continues streaming.
- Keeps review UX consistent with normal “task running” behavior and
avoids accidental interrupt/replacement.

Notes
- This change only affects UI queuing logic; core review flow and task
lifecycle remain unchanged.
2026-01-13 11:23:22 -08:00
Ahmed Ibrahim
17ab5f6a52 Show tab queue hint in footer (#9138)
- show the Tab queue hint in the footer when a task is running with
Steer enabled
- drop the history queue hint and add footer snapshots
2026-01-13 00:32:53 -08:00
Ahmed Ibrahim
18b737910c Handle image paste from empty paste events (#9049)
Handle image paste on empty paste events.

- Intent: make image paste work in terminals that emit empty paste
events.
- Approach: route paste events through an image-aware handler and read
the clipboard when text is empty.
- That's best effort to detect it. Some terminals don't send the empty
signal.
2026-01-12 23:39:59 -08:00
Ahmed Ibrahim
cbca43d57a Send message by default mid turn. queue messages by tab (#9077)
https://github.com/user-attachments/assets/03838730-4ddc-44df-a2c7-cb8ecda78660
2026-01-12 23:06:35 -08:00
Chriss4123
12779c7c07 fix(tui): show in-flight coalesced tool calls in transcript overlay (#8246)
### Problem
Ctrl+T transcript overlay can omit in-flight coalesced tool calls because it
renders only committed transcript cells while the main viewport can render the
current in-flight ChatWidget.active_cell immediately.

### Mental model
The UI has both committed transcript cells (finalized HistoryCell entries) and
an in-flight active cell that can mutate in place while streaming, often
representing a coalesced exec/tool group. The transcript overlay renders
committed cells plus a render-only live tail derived from the current active
cell. The live tail is cached and only recomputed when its cache key changes,
which is derived from terminal width (wrapping), active-cell revision
(in-place mutations), stream continuation (spacing), and animation tick
(time-based visuals).

### Non-goals
This does not change coalescing rules, flush boundaries, or when active cells
become committed. It does not change tool-call semantics or transcript
persistence; it is a rendering-only improvement for the overlay.

### Tradeoffs
This adds cache invalidation complexity: correctness depends on bumping an
active-cell revision (and/or providing an animation tick) when the active cell
mutates in place. The mechanism is implemented in both codex-tui and codex-tui2,
which keeps behavior consistent but risks drift if future changes are not
applied in lockstep.

### Architecture
App special-cases transcript overlay draws to sync a live tail from ChatWidget
into TranscriptOverlay. TranscriptOverlay remains the owner of committed
transcript cells; the live tail is an optional appended renderable.
HistoryCell::transcript_animation_tick() allows time-dependent transcript output
(spinner/shimmer) to invalidate the cached tail without requiring data mutation.

### Observability
Manual verification is to open Ctrl+T while an exploring/coalesced active cell
is still in-flight and confirm the overlay includes the same in-flight tool-call
group the main viewport shows. The overlay is kept in sync by App passing an
active-cell key and transcript lines into TranscriptOverlay::sync_live_tail; the
key must change when the active cell mutates or animates.

### Tests
Snapshot tests validate that the transcript overlay renders a live tail appended
after committed cells and that identical keys short-circuit recomputation. Unit
tests validate that active-cell revision bumps occur on specific in-place
mutations (e.g. unified exec wait cell command display becoming known late) so
cached tails are invalidated.

## Documentation patches (module, type, function)

### Module-level docs (invariants + mechanisms)
- codex-rs/tui/src/app_backtrack.rs:1
- codex-rs/tui/src/chatwidget.rs:1
- codex-rs/tui/src/pager_overlay.rs:1
- codex-rs/tui/src/history_cell.rs:1
- codex-rs/tui2/src/app_backtrack.rs:1
- codex-rs/tui2/src/chatwidget.rs:1
- codex-rs/tui2/src/pager_overlay.rs:1
- codex-rs/tui2/src/history_cell.rs:1

### Type-level docs (cache key + invariants)
- codex-rs/tui/src/chatwidget.rs (ChatWidget.active_cell_revision, ActiveCellTranscriptKey)
- codex-rs/tui/src/pager_overlay.rs (TranscriptOverlay live tail storage model)
- codex-rs/tui/src/history_cell.rs (HistoryCell::transcript_animation_tick, UnifiedExecWaitCell::update_command_display)
- Mirrored in codex-rs/tui2/src/chatwidget.rs, codex-rs/tui2/src/pager_overlay.rs, codex-rs/tui2/src/history_cell.rs

### Function-level docs (why/when/guarantees/pitfalls)
- codex-rs/tui/src/app_backtrack.rs (overlay_forward_event)
- codex-rs/tui/src/chatwidget.rs (active_cell_transcript_key, active_cell_transcript_lines)
- codex-rs/tui/src/pager_overlay.rs (sync_live_tail, take_live_tail_renderable)
- codex-rs/tui/src/history_cell.rs (transcript_animation_tick, UnifiedExecWaitCell::update_command_display)
- Mirrored in codex-rs/tui2 equivalents where present

### Validation performed
- cd codex-rs && just fmt
- cd codex-rs && cargo test -p codex-tui
- cd codex-rs && cargo test -p codex-tui2

## Design inconsistencies / risks

- Cache invalidation is a distributed responsibility: any future in-place active
  cell transcript mutation that forgets to bump active_cell_revision (or expose
  an animation tick) can leave the transcript overlay live tail out of sync with
  the main viewport.
- TranscriptOverlay tail handling assumes a structural invariant that the live
  tail, when present, is exactly one trailing renderable after the committed cell
  renderables; if renderable construction changes in a way that violates that
  assumption, tail insertion/removal logic becomes incorrect.
- codex-tui and codex-tui2 duplicate the live-tail mechanism; the documentation
  is aligned, but the implementation can still drift unless changes continue to
  be applied in lockstep.
2026-01-13 03:06:11 +00:00
Anton Panasenko
4223948cf5 feat: wire fork to codex cli (#8994)
## Summary
- add `codex fork` subcommand and `/fork` slash command mirroring resume
- extend session picker to support fork/resume actions with dynamic
labels in tui/tui2
- wire fork selection flow through tui bootstraps and add fork-related
tests
2026-01-12 10:09:11 -08:00
charley-oai
6709ad8975 Label attached images so agent can understand in-message labels (#8950)
Agent wouldn't "see" attached images and would instead try to use the
view_file tool:
<img width="1516" height="504" alt="image"
src="https://github.com/user-attachments/assets/68a705bb-f962-4fc1-9087-e932a6859b12"
/>

In this PR, we wrap image content items in XML tags with the name of
each image (now just a numbered name like `[Image #1]`), so that the
model can understand inline image references (based on name). We also
put the image content items above the user message which the model seems
to prefer (maybe it's more used to definitions being before references).

We also tweak the view_file tool description which seemed to help a bit

Results on a simple eval set of images:

Before
<img width="980" height="310" alt="image"
src="https://github.com/user-attachments/assets/ba838651-2565-4684-a12e-81a36641bf86"
/>

After
<img width="918" height="322" alt="image"
src="https://github.com/user-attachments/assets/10a81951-7ee6-415e-a27e-e7a3fd0aee6f"
/>

```json
[
  {
    "id": "single_describe",
    "prompt": "Describe the attached image in one sentence.",
    "images": ["image_a.png"]
  },
  {
    "id": "single_color",
    "prompt": "What is the dominant color in the image? Answer with a single color word.",
    "images": ["image_b.png"]
  },
  {
    "id": "orientation_check",
    "prompt": "Is the image portrait or landscape? Answer in one sentence.",
    "images": ["image_c.png"]
  },
  {
    "id": "detail_request",
    "prompt": "Look closely at the image and call out any small details you notice.",
    "images": ["image_d.png"]
  },
  {
    "id": "two_images_compare",
    "prompt": "I attached two images. Are they the same or different? Briefly explain.",
    "images": ["image_a.png", "image_b.png"]
  },
  {
    "id": "two_images_captions",
    "prompt": "Provide a short caption for each image (Image 1, Image 2).",
    "images": ["image_c.png", "image_d.png"]
  },
  {
    "id": "multi_image_rank",
    "prompt": "Rank the attached images from most colorful to least colorful.",
    "images": ["image_a.png", "image_b.png", "image_c.png"]
  },
  {
    "id": "multi_image_choice",
    "prompt": "Which image looks more vibrant? Answer with 'Image 1' or 'Image 2'.",
    "images": ["image_b.png", "image_d.png"]
  }
]
```
2026-01-09 21:33:45 -08:00
jif-oai
1aed01e99f renaming: task to turn (#8963) 2026-01-09 17:31:17 +00:00
gt-oai
5b5a5b92b5 Add config to disable /feedback (#8909)
Some enterprises do not want their users to be able to `/feedback`.

<img width="395" height="325" alt="image"
src="https://github.com/user-attachments/assets/2dae9c0b-20c3-4a15-bcd3-0187857ebbd8"
/>

Adds to `config.toml`:

```toml
[feedback]
enabled = false
```

I've deliberately decided to:
1. leave other references to `/feedback` (e.g. in the interrupt message,
tips of the day) unchanged. I think we should continue to promote the
feature even if it is not usable currently.
2. leave the `/feedback` menu item selectable and display an error
saying it's disabled, rather than remove the menu item (which I believe
would raise more questions).

but happy to discuss these.

This will be followed by a change to requirements.toml that admins can
use to force the value of feedback.enabled.
2026-01-09 16:33:48 +00:00
iceweasel-oai
6372ba9d5f Elevated sandbox NUX (#8789)
Elevated Sandbox NUX:

* prompt for elevated sandbox setup when agent mode is selected (via
/approvals or at startup)
* prompt for degraded sandbox if elevated setup is declined or fails
* introduce /elevate-sandbox command to upgrade from degraded
experience.
2026-01-08 16:23:06 -08:00
pakrym-oai
634764ece9 Immutable CodexAuth (#8857)
Historically we started with a CodexAuth that knew how to refresh it's
own tokens and then added AuthManager that did a different kind of
refresh (re-reading from disk).

I don't think it makes sense for both `CodexAuth` and `AuthManager` to
be mutable and contain behaviors.

Move all refresh logic into `AuthManager` and keep `CodexAuth` as a data
object.
2026-01-08 11:43:56 -08:00
gt-oai
f07b8aa591 Warn in /model if BASE_URL set (#8847)
<img width="763" height="349" alt="Screenshot 2026-01-07 at 18 37 59"
src="https://github.com/user-attachments/assets/569d01cb-ea91-4113-889b-ba74df24adaf"
/>

It may not make sense to use the `/model` menu with a custom
OPENAI_BASE_URL. But some model proxies may support it, so we shouldn't
disable it completely. A warning is a reasonable compromise.
2026-01-07 21:24:18 +00:00
Ahmed Ibrahim
c31960b13a remove unnecessary todos (#8842)
> // todo(aibrahim): why are we passing model here while it can change?

we update it on each turn with `.with_model`

> //TODO(aibrahim): run CI in release mode.

although it's good to have, release builds take double the time tests
take.

> // todo(aibrahim): make this async function

we figured out another way of doing this sync
2026-01-07 10:43:10 -08:00
jif-oai
116059c3a0 chore: unify conversation with thread name (#8830)
Done and verified by Codex + refactor feature of RustRover
2026-01-07 17:04:53 +00:00
jif-oai
4cef89a122 chore: rename unified exec sessions (#8822)
Renaming done by Codex
2026-01-07 16:12:47 +00:00
Thibault Sottiaux
124a09e577 fix: handle /review arguments in TUI (#8823)
Handle /review <instructions> in the TUI and TUI2 by routing it as a
custom review command instead of plain text, wiring command dispatch and
adding composer coverage so typing /review text starts a review directly
rather than posting a message. User impact: /review with arguments now
kicks off the review flow, previously it would just forward as a plain
command and not actually start a review.
2026-01-07 13:14:55 +00:00
charley-oai
3389465c8d Enable model upgrade popup even when selected model is no longer in picker (#8802)
With `config.toml`:
```
model = "gpt-5.1-codex"
```
(where `gpt-5.1-codex` has `show_in_picker: false` in
[`model_presets.rs`](https://github.com/openai/codex/blob/main/codex-rs/core/src/models_manager/model_presets.rs);
this happens if the user hasn't used codex in a while so they didn't see
the popup before their model was changed to `show_in_picker: false`)

The upgrade picker used to not show (because `gpt-5.1-codex` was
filtered out of the model list in code). Now, the filtering is done
downstream in tui and app-server, so the model upgrade popup shows:

<img width="1503" height="227" alt="Screenshot 2026-01-06 at 5 04 37 PM"
src="https://github.com/user-attachments/assets/26144cc2-0b3f-4674-ac17-e476781ec548"
/>
2026-01-06 19:32:27 -08:00
Owen Lin
8b7ec31ba7 feat(app-server): thread/rollback API (#8454)
Add `thread/rollback` to app-server to support IDEs undo-ing the last N
turns of a thread.

For context, an IDE partner will be supporting an "undo" capability
where the IDE (the app-server client) will be responsible for reverting
the local changes made during the last turn. To support this well, we
also need a way to drop the last turn (or more generally, the last N
turns) from the agent's context. This is what `thread/rollback` does.

**Core idea**: A Thread rollback is represented as a persisted event
message (EventMsg::ThreadRollback) in the rollout JSONL file, not by
rewriting history. On resume, both the model's context (core replay) and
the UI turn list (app-server v2's thread history builder) apply these
markers so the pruned history is consistent across live conversations
and `thread/resume`.

Implementation notes:
- Rollback only affects agent context and appends to the rollout file;
clients are responsible for reverting files on disk.
- If a thread rollback is currently in progress, subsequent
`thread/rollback` calls are rejected.
- Because we use `CodexConversation::submit` and codex core tracks
active turns, returning an error on concurrent rollbacks is communicated
via an `EventMsg::Error` with a new variant
`CodexErrorInfo::ThreadRollbackFailed`. app-server watches for that and
sends the BAD_REQUEST RPC response.

Tests cover thread rollbacks in both core and app-server, including when
`num_turns` > existing turns (which clears all turns).

**Note**: this explicitly does **not** behave like `/undo` which we just
removed from the CLI, which does the opposite of what `thread/rollback`
does. `/undo` reverts local changes via ghost commits/snapshots and does
not modify the agent's context / conversation history.
2026-01-06 21:23:48 +00:00
jif-oai
740bf0e755 chore: clear background terminals on interrupt (#8786) 2026-01-06 19:01:07 +00:00
Anton Panasenko
807f8a43c2 feat: expose outputSchema to user_turn/turn_start app_server API (#8377)
What changed
- Added `outputSchema` support to the app-server APIs, mirroring `codex
exec --output-schema` behavior.
- V1 `sendUserTurn` now accepts `outputSchema` and constrains the final
assistant message for that turn.
- V2 `turn/start` now accepts `outputSchema` and constrains the final
assistant message for that turn (explicitly per-turn only).

Core behavior
- `Op::UserTurn` already supported `final_output_json_schema`; now V1
`sendUserTurn` forwards `outputSchema` into that field.
- `Op::UserInput` now carries `final_output_json_schema` for per-turn
settings updates; core maps it into
`SessionSettingsUpdate.final_output_json_schema` so it applies to the
created turn context.
- V2 `turn/start` does NOT persist the schema via `OverrideTurnContext`
(it’s applied only for the current turn). Other overrides
(cwd/model/etc) keep their existing persistent behavior.

API / docs
- `codex-rs/app-server-protocol/src/protocol/v1.rs`: add `output_schema:
Option<serde_json::Value>` to `SendUserTurnParams` (serialized as
`outputSchema`).
- `codex-rs/app-server-protocol/src/protocol/v2.rs`: add `output_schema:
Option<JsonValue>` to `TurnStartParams` (serialized as `outputSchema`).
- `codex-rs/app-server/README.md`: document `outputSchema` for
`turn/start` and clarify it applies only to the current turn.
- `codex-rs/docs/codex_mcp_interface.md`: document `outputSchema` for v1
`sendUserTurn` and v2 `turn/start`.

Tests added/updated
- New app-server integration tests asserting `outputSchema` is forwarded
into outbound `/responses` requests as `text.format`:
  - `codex-rs/app-server/tests/suite/output_schema.rs`
  - `codex-rs/app-server/tests/suite/v2/output_schema.rs`
- Added per-turn semantics tests (schema does not leak to the next
turn):
  - `send_user_turn_output_schema_is_per_turn_v1`
  - `turn_start_output_schema_is_per_turn_v2`
- Added protocol wire-compat tests for the merged op:
  - serialize omits `final_output_json_schema` when `None`
  - deserialize works when field is missing
  - serialize includes `final_output_json_schema` when `Some(schema)`

Call site updates (high level)
- Updated all `Op::UserInput { .. }` constructions to include
`final_output_json_schema`:
  - `codex-rs/app-server/src/codex_message_processor.rs`
  - `codex-rs/core/src/codex_delegate.rs`
  - `codex-rs/mcp-server/src/codex_tool_runner.rs`
  - `codex-rs/tui/src/chatwidget.rs`
  - `codex-rs/tui2/src/chatwidget.rs`
  - plus impacted core tests.

Validation
- `just fmt`
- `cargo test -p codex-core`
- `cargo test -p codex-app-server`
- `cargo test -p codex-mcp-server`
- `cargo test -p codex-tui`
- `cargo test -p codex-tui2`
- `cargo test -p codex-protocol`
- `cargo clippy --all-features --tests --profile dev --fix -- -D
warnings`
2026-01-05 10:27:00 -08:00
Ahmed Ibrahim
2de731490e Remove model family from tui (#8488)
- Remove model family from tui
2026-01-02 11:30:04 -08:00
sayan-oai
bf732600ea [chore] add additional_details to StreamErrorEvent + wire through (#8307)
### What

Builds on #8293.

Add `additional_details`, which contains the upstream error message, to
relevant structures used to pass along retryable `StreamError`s.

Uses the new TUI status indicator's `details` field (shows under the
status header) to display the `additional_details` error to the user on
retryable `Reconnecting...` errors. This adds clarity for users for
retryable errors.

Will make corresponding change to VSCode extension to show
`additional_details` as expandable from the `Reconnecting...` cell.

Examples:
<img width="1012" height="326" alt="image"
src="https://github.com/user-attachments/assets/f35e7e6a-8f5e-4a2f-a764-358101776996"
/>

<img width="1526" height="358" alt="image"
src="https://github.com/user-attachments/assets/0029cbc0-f062-4233-8650-cc216c7808f0"
/>
2025-12-24 10:07:38 -08:00
Ahmed Ibrahim
40de81e7af Remove reasoning format (#8484)
This isn't very useful parameter. 

logic:
```
if model puts `**` in their reasoning, trim it and visualize the header.
if couldn't trim: don't render
if model doesn't support: don't render
```

We can simplify to:
```
if could trim, visualize header.
if not, don't render
```
2025-12-23 16:01:46 -08:00
sayan-oai
53eb2e9f27 [tui] add optional details to TUI status header (#8293)
### What

Add optional `details` field to TUI's status indicator header. `details`
is shown under the header with text wrapping and a max height of 3
lines.

Duplicated changes to `tui2`.

### Why

Groundwork for displaying error details under `Reconnecting...` for
clarity with retryable errors.

Basic examples
<img width="1012" height="326" alt="image"
src="https://github.com/user-attachments/assets/dd751ceb-b179-4fb2-8fd1-e4784d6366fb"
/>

<img width="1526" height="358" alt="image"
src="https://github.com/user-attachments/assets/bbe466fc-faff-4a78-af7f-3073ccdd8e34"
/>

Truncation example
<img width="936" height="189" alt="image"
src="https://github.com/user-attachments/assets/f3f1b5dd-9050-438b-bb07-bd833c03e889"
/>

### Tests
Tested locally, added tests for truncation.
2025-12-23 12:40:40 -08:00
sayan-oai
4673090f73 feat: open prompt in configured external editor (#7606)
Add `ctrl+g` shortcut to enable opening current prompt in configured
editor (`$VISUAL` or `$EDITOR`).


- Prompt is updated with editor's content upon editor close.
- Paste placeholders are automatically expanded when opening the
external editor, and are not "recompressed" on close
- They could be preserved in the editor, but it would be hard to prevent
the user from modifying the placeholder text directly, which would drop
the mapping to the `pending_paste` value
- Image placeholders stay as-is
- `ctrl+g` explanation added to shortcuts menu, snapshot tests updated



https://github.com/user-attachments/assets/4ee05c81-fa49-4e99-8b07-fc9eef0bbfce
2025-12-22 15:12:23 -08:00
jif-oai
7a8407bbb6 chore: un-ship undo (#8424) 2025-12-22 09:53:03 +01:00
Ahmed Ibrahim
f0dc6fd3c7 Rename OpenAI models to models manager (#8346)
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2025-12-19 16:20:05 -08:00
Michael Bolin
dc61fc5f50 feat: support allowed_sandbox_modes in requirements.toml (#8298)
This adds support for `allowed_sandbox_modes` in `requirements.toml` and
provides legacy support for constraining sandbox modes in
`managed_config.toml`. This is converted to `Constrained<SandboxPolicy>`
in `ConfigRequirements` and applied to `Config` such that constraints
are enforced throughout the harness.

Note that, because `managed_config.toml` is deprecated, we do not add
support for the new `external-sandbox` variant recently introduced in
https://github.com/openai/codex/pull/8290. As noted, that variant is not
supported in `config.toml` today, but can be configured programmatically
via app server.
2025-12-19 21:09:20 +00:00
jif-oai
6c76d17713 feat: collapse "waiting" of unified_exec (#8257)
Screenshots here but check the snapshot files to see it better
<img width="712" height="408" alt="Screenshot 2025-12-18 at 11 58 02"
src="https://github.com/user-attachments/assets/84a2c410-0767-4870-84d1-ae1c0d4c445e"
/>
<img width="523" height="352" alt="Screenshot 2025-12-18 at 11 17 41"
src="https://github.com/user-attachments/assets/d029c7ea-0feb-4493-9dca-af43a0c70c52"
/>
2025-12-18 17:03:43 -08:00
xl-openai
358a5baba0 Support skills shortDescription. (#8278)
Allow SKILL.md to specify a more human-readable short description as
skill metadata.
2025-12-18 23:13:18 +00:00
jif-oai
4fb0b547d6 feat: add /ps (#8279)
See snapshots for view of edge cases
This is still named `UnifiedExecSessions` for consistency across the
code but should be renamed to `BackgroundTerminals` in a follow-up

Example:
<img width="945" height="687" alt="Screenshot 2025-12-18 at 20 12 53"
src="https://github.com/user-attachments/assets/92f39ff2-243c-4006-b402-e3fa9e93c952"
/>
2025-12-18 21:09:06 +00:00
Jeremy Rose
be274cbe62 tui: improve rendering of search cell (#8273)
before:

<img width="795" height="150" alt="Screenshot 2025-12-18 at 10 48 01 AM"
src="https://github.com/user-attachments/assets/6f4d8856-b4c2-4e2a-b60a-b86f82b956a0"
/>

after:

<img width="795" height="150" alt="Screenshot 2025-12-18 at 10 48 39 AM"
src="https://github.com/user-attachments/assets/dd0d167a-5d09-4bb7-9d36-95a2eb1aaa83"
/>
2025-12-18 11:05:26 -08:00
jif-oai
aea47b6553 feat: add name to beta features (#8266)
Add a name to Beta features

<img width="906" height="153" alt="Screenshot 2025-12-18 at 16 42 49"
src="https://github.com/user-attachments/assets/d56f3519-0613-4d9a-ad4d-38b1a7eb125a"
/>
2025-12-18 16:59:46 +00:00
Ahmed Ibrahim
774bd9e432 feat: model picker (#8209)
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2025-12-17 16:12:35 -08:00
jif-oai
f74e0cda92 feat: unified exec footer (#8117)
# With `unified_exec`
Known tools are correctly casted
<img width="1150" height="312" alt="Screenshot 2025-12-16 at 19 27 28"
src="https://github.com/user-attachments/assets/24150ee5-e88d-461b-a459-483c24784196"
/>
If a session exit the turn, we render it with the "Ran ..."
<img width="1168" height="355" alt="Screenshot 2025-12-16 at 19 27 58"
src="https://github.com/user-attachments/assets/3f00b60c-2d57-4f9d-a201-9cc8388957cb"
/>
If a session does not exit during the turn, it is closed at the end of
the turn but this is not rendered
<img width="642" height="342" alt="Screenshot 2025-12-16 at 19 34 37"
src="https://github.com/user-attachments/assets/c2bd9283-7017-4915-ba73-c52199b0b28e"
/>

# Without `unified_exec`
No changes
<img width="740" height="603" alt="Screenshot 2025-12-16 at 19 31 21"
src="https://github.com/user-attachments/assets/ca5d90fe-a9b2-42ba-bcd7-3e98c4ed22e8"
/>
2025-12-17 17:12:04 +00:00
jif-oai
ac6ba286aa feat: experimental menu (#8071)
This will automatically render any `Stage::Beta` features.

The change only gets applied to the *next session*. This started as a
bug but actually this is a good thing to prevent out of distribution
push

<img width="986" height="288" alt="Screenshot 2025-12-15 at 15 38 35"
src="https://github.com/user-attachments/assets/78b7a71d-0e43-4828-a118-91c5237909c7"
/>


<img width="509" height="109" alt="Screenshot 2025-12-15 at 17 35 44"
src="https://github.com/user-attachments/assets/6933de52-9b66-4abf-b58b-a5f26d5747e2"
/>
2025-12-17 17:08:03 +00:00
gt-oai
9352c6b235 feat: Constrain values for approval_policy (#7778)
Constrain `approval_policy` through new `admin_policy` config.

This PR will:
1. Add a `admin_policy` section to config, with a single field (for now)
`allowed_approval_policies`. This list constrains the set of
user-settable `approval_policy`s.
2. Introduce a new `Constrained<T>` type, which combines a current value
and a validator function. The validator function ensures disallowed
values are not set.
3. Change the type of `approval_policy` on `Config` and
`SessionConfiguration` from `AskForApproval` to
`Constrained<AskForApproval>`. The validator function is set by the
values passed into `allowed_approval_policies`.
4. `GenericDisplayRow`: add a `disabled_reason: Option<String>`. When
set, it disables selection of the value and indicates as such in the
menu. This also makes it unselectable with arrow keys or numbers. This
is used in the `/approvals` menu.

Follow ups are:
1. Do the same thing to `sandbox_policy`.
2. Propagate the allowed set of values through app-server for the
extension (though already this should prevent app-server from setting
this values, it's just that we want to disable UI elements that are
unsettable).

Happy to split this PR up if you prefer, into the logical numbered areas
above. Especially if there are parts we want to gavel on separately
(e.g. admin_policy).

Disabled full access:
<img width="1680" height="380" alt="image"
src="https://github.com/user-attachments/assets/1fb61c8c-1fcb-4dc4-8355-2293edb52ba0"
/>

Disabled `--yolo` on startup:
<img width="749" height="76" alt="image"
src="https://github.com/user-attachments/assets/0a1211a0-6eb1-40d6-a1d7-439c41e94ddb"
/>

CODEX-4087
2025-12-17 16:19:27 +00:00
xl-openai
4897efcced Add public skills + improve repo skill discovery and error UX (#8098)
1. Adds SkillScope::Public end-to-end (core + protocol) and loads skills
from the public cache directory
2. Improves repo skill discovery by searching upward for the nearest
.codex/skills within a git repo
3. Deduplicates skills by name with deterministic ordering to avoid
duplicates across sources
4. Fixes garbled “Skill errors” overlay rendering by preventing pending
history lines from being injected during the modal
5. Updates the project docs “Skills” intro wording to avoid hardcoded
paths
2025-12-17 01:35:49 -08:00