Commit Graph

44 Commits

Author SHA1 Message Date
pakrym-oai
2004173cd7 Move message history out of core (#21278)
## Why

Message history was implemented inside `codex-core` and surfaced through
core protocol ops and `SessionConfiguredEvent` fields even though the
current consumer is TUI-local prompt recall. That made core own UI
history persistence and exposed `history_log_id` / `history_entry_count`
through surfaces that app-server and other clients do not need.

This change moves message history persistence out of core and keeps the
recall plumbing local to the TUI.

## What changed

- Added a new `codex-message-history` crate for appending, looking up,
trimming, and reading metadata from `history.jsonl`.
- Removed core protocol history ops/events: `AddToHistory`,
`GetHistoryEntryRequest`, and `GetHistoryEntryResponse`.
- Removed `history_log_id` and `history_entry_count` from
`SessionConfiguredEvent` and updated exec/MCP/test fixtures accordingly.
- Updated the TUI to dispatch local app events for message-history
append/lookup and keep its persistent-history metadata in TUI session
state.

## Validation

- `cargo test -p codex-message-history -p codex-protocol`
- `cargo test -p codex-exec event_processor_with_json_output`
- `cargo test -p codex-mcp-server outgoing_message`
- `cargo test -p codex-tui`
- `just fix -p codex-message-history -p codex-protocol -p codex-core -p
codex-tui -p codex-exec -p codex-mcp-server`
2026-05-06 08:35:42 -07:00
jif-oai
a98623511b feat: add session_id (#20437)
## Summary

Related to
https://openai.slack.com/archives/C095U48JNL9/p1777537279707449
TLDR:
We update the meaning of session ids and thread ids:
* thread_id stays as now
* session_id become a shared id between every thread under a /root
thread (i.e. every sub-agent share the same session id)

This PR introduces an explicit `SessionId` and threads it through the
protocol/client boundary so `session_id` and `thread_id` can diverge
when they need to, while preserving compatibility for older serialized
`session_configured` events.

---------

Co-authored-by: Codex <noreply@openai.com>
2026-05-06 10:48:37 +02:00
rhan-oai
b3d4f1a9f0 [codex-analytics] rework thread_source for thread analytics (#20949)
## Summary
- make `thread_source` an explicit optional thread-level field on
`thread/start`, `thread/fork`, and returned thread payloads
- persist `thread_source` in rollout/session metadata so resumed live
threads retain the original value
- replace the old best-effort `session_source` -> `thread_source`
mapping with an explicit caller-supplied analytics classification

## Why
Before this change, analytics `thread_source` was populated by a
best-effort mapping from `session_source`. `session_source` describes
the runtime/client surface, not the actual thread-level origin, so that
projection was not accurate enough to distinguish cases such as `user`,
`subagent`, `memory_consolidation`, and future thread origins reliably.

Making `thread_source` explicit keeps one thread-level analytics field
while letting callers provide the real classification directly instead
of recovering it indirectly from `session_source`.

## Impact
For new analytics events, `thread_source` now reflects the explicit
thread-level classification supplied by the caller rather than an
inferred value derived from `session_source`. Existing protocol fields
remain optional; callers that omit `threadSource` now produce `null`
instead of a best-effort inferred value.

## Validation
- `just write-app-server-schema`
- `cargo test -p codex-analytics -p codex-core -p
codex-app-server-protocol --no-run`
- `cargo test -p codex-app-server-protocol
generated_ts_optional_nullable_fields_only_in_params`
- `cargo test -p codex-analytics
thread_initialized_event_serializes_expected_shape`
- `cargo test -p codex-core
resume_stopped_thread_from_rollout_preserves_thread_source`
2026-05-06 02:12:31 +00:00
Michael Bolin
ac4332c05b permissions: expose active profile metadata (#20095) 2026-04-29 20:54:59 -07:00
Michael Bolin
bf38def44e permissions: make SessionConfigured profile-only (#19774)
## Why

`SessionConfiguredEvent` is the internal event that tells clients what
permissions are active for a session. Emitting both `sandbox_policy` and
`permission_profile` leaves two possible authorities and forces every
consumer to decide which one to honor. At this point in the migration,
the profile is expressive enough to represent managed, disabled, and
external sandbox enforcement, so the internal event can be profile-only.

The wire compatibility concern is older serialized events or rollout
data that only contain `sandbox_policy`; those still need to
deserialize.

## What Changed

- Removes `sandbox_policy` from `SessionConfiguredEvent` and makes
`permission_profile` required.
- Adds custom deserialization so old payloads with only `sandbox_policy`
are upgraded to a cwd-anchored `PermissionProfile`.
- Updates core event emission and TUI session handling to sync
permissions from the profile directly.
- Updates app-server response construction to derive the legacy
`sandbox` response field from the active thread snapshot instead of from
`SessionConfiguredEvent`.
- Updates yolo-mode display logic to treat both
`PermissionProfile::Disabled` and managed unrestricted filesystem plus
enabled network as full-access, while still preserving the distinction
between no sandbox and external sandboxing.

## Verification

- `cargo test -p codex-protocol session_configured_event --lib`
- `cargo test -p codex-protocol serialize_event --lib`
- `cargo test -p codex-exec session_configured --lib`
- `cargo test -p codex-app-server
thread_response_permission_profile_preserves_enforcement --lib`
- `cargo test -p codex-core
session_configured_reports_permission_profile_for_external_sandbox
--lib`
- `cargo test -p codex-tui session_configured --lib`
- `cargo test -p codex-tui
yolo_mode_includes_managed_full_access_profiles --lib`


































---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/19774).
* #19900
* #19899
* #19776
* #19775
* __->__ #19774
2026-04-27 22:06:47 -07:00
Michael Bolin
082fc4f632 protocol: report session permission profiles (#18282)
## Why

Clients that observe `SessionConfigured` need the same canonical
permission view that app-server thread responses provide. Reporting the
profile in protocol events lets clients keep their local state
synchronized without reinterpreting legacy sandbox fields.

## What changed

This adds `permission_profile` to `SessionConfigured` and propagates it
through core, exec JSON output, MCP server messages, and TUI
history/widget handling.

## Verification

- `cargo test -p codex-tui permissions -- --nocapture`
- `cargo test -p codex-core --test all permissions_messages --
--nocapture`















































---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18282).
* #18288
* #18287
* #18286
* #18285
* #18284
* #18283
* __->__ #18282
2026-04-22 21:29:32 -07:00
pakrym-oai
dd1321d11b Spread AbsolutePathBuf (#17792)
Mechanical change to promote absolute paths through code.
2026-04-14 14:26:10 -07:00
Michael Bolin
61dfe0b86c chore: clean up argument-comment lint and roll out all-target CI on macOS (#16054)
## Why

`argument-comment-lint` was green in CI even though the repo still had
many uncommented literal arguments. The main gap was target coverage:
the repo wrapper did not force Cargo to inspect test-only call sites, so
examples like the `latest_session_lookup_params(true, ...)` tests in
`codex-rs/tui_app_server/src/lib.rs` never entered the blocking CI path.

This change cleans up the existing backlog, makes the default repo lint
path cover all Cargo targets, and starts rolling that stricter CI
enforcement out on the platform where it is currently validated.

## What changed

- mechanically fixed existing `argument-comment-lint` violations across
the `codex-rs` workspace, including tests, examples, and benches
- updated `tools/argument-comment-lint/run-prebuilt-linter.sh` and
`tools/argument-comment-lint/run.sh` so non-`--fix` runs default to
`--all-targets` unless the caller explicitly narrows the target set
- fixed both wrappers so forwarded cargo arguments after `--` are
preserved with a single separator
- documented the new default behavior in
`tools/argument-comment-lint/README.md`
- updated `rust-ci` so the macOS lint lane keeps the plain wrapper
invocation and therefore enforces `--all-targets`, while Linux and
Windows temporarily pass `-- --lib --bins`

That temporary CI split keeps the stricter all-targets check where it is
already cleaned up, while leaving room to finish the remaining Linux-
and Windows-specific target-gated cleanup before enabling
`--all-targets` on those runners. The Linux and Windows failures on the
intermediate revision were caused by the wrapper forwarding bug, not by
additional lint findings in those lanes.

## Validation

- `bash -n tools/argument-comment-lint/run.sh`
- `bash -n tools/argument-comment-lint/run-prebuilt-linter.sh`
- shell-level wrapper forwarding check for `-- --lib --bins`
- shell-level wrapper forwarding check for `-- --tests`
- `just argument-comment-lint`
- `cargo test` in `tools/argument-comment-lint`
- `cargo test -p codex-terminal-detection`

## Follow-up

- Clean up remaining Linux-only target-gated callsites, then switch the
Linux lint lane back to the plain wrapper invocation.
- Clean up remaining Windows-only target-gated callsites, then switch
the Windows lint lane back to the plain wrapper invocation.
2026-03-27 19:00:44 -07:00
Charley Cunningham
bc24017d64 Add Smart Approvals guardian review across core, app-server, and TUI (#13860)
## Summary
- add `approvals_reviewer = "user" | "guardian_subagent"` as the runtime
control for who reviews approval requests
- route Smart Approvals guardian review through core for command
execution, file changes, managed-network approvals, MCP approvals, and
delegated/subagent approval flows
- expose guardian review in app-server with temporary unstable
`item/autoApprovalReview/{started,completed}` notifications carrying
`targetItemId`, `review`, and `action`
- update the TUI so Smart Approvals can be enabled from `/experimental`,
aligned with the matching `/approvals` mode, and surfaced clearly while
reviews are pending or resolved

## Runtime model
This PR does not introduce a new `approval_policy`.

Instead:
- `approval_policy` still controls when approval is needed
- `approvals_reviewer` controls who reviewable approval requests are
routed to:
  - `user`
  - `guardian_subagent`

`guardian_subagent` is a carefully prompted reviewer subagent that
gathers relevant context and applies a risk-based decision framework
before approving or denying the request.

The `smart_approvals` feature flag is a rollout/UI gate. Core runtime
behavior keys off `approvals_reviewer`.

When Smart Approvals is enabled from the TUI, it also switches the
current `/approvals` settings to the matching Smart Approvals mode so
users immediately see guardian review in the active thread:
- `approval_policy = on-request`
- `approvals_reviewer = guardian_subagent`
- `sandbox_mode = workspace-write`

Users can still change `/approvals` afterward.

Config-load behavior stays intentionally narrow:
- plain `smart_approvals = true` in `config.toml` remains just the
rollout/UI gate and does not auto-set `approvals_reviewer`
- the deprecated `guardian_approval = true` alias migration does
backfill `approvals_reviewer = "guardian_subagent"` in the same scope
when that reviewer is not already configured there, so old configs
preserve their original guardian-enabled behavior

ARC remains a separate safety check. For MCP tool approvals, ARC
escalations now flow into the configured reviewer instead of always
bypassing guardian and forcing manual review.

## Config stability
The runtime reviewer override is stable, but the config-backed
app-server protocol shape is still settling.

- `thread/start`, `thread/resume`, and `turn/start` keep stable
`approvalsReviewer` overrides
- the config-backed `approvals_reviewer` exposure returned via
`config/read` (including profile-level config) is now marked
`[UNSTABLE]` / experimental in the app-server protocol until we are more
confident in that config surface

## App-server surface
This PR intentionally keeps the guardian app-server shape narrow and
temporary.

It adds generic unstable lifecycle notifications:
- `item/autoApprovalReview/started`
- `item/autoApprovalReview/completed`

with payloads of the form:
- `{ threadId, turnId, targetItemId, review, action? }`

`review` is currently:
- `{ status, riskScore?, riskLevel?, rationale? }`
- where `status` is one of `inProgress`, `approved`, `denied`, or
`aborted`

`action` carries the guardian action summary payload from core when
available. This lets clients render temporary standalone pending-review
UI, including parallel reviews, even when the underlying tool item has
not been emitted yet.

These notifications are explicitly documented as `[UNSTABLE]` and
expected to change soon.

This PR does **not** persist guardian review state onto `thread/read`
tool items. The intended follow-up is to attach guardian review state to
the reviewed tool item lifecycle instead, which would improve
consistency with manual approvals and allow thread history / reconnect
flows to replay guardian review state directly.

## TUI behavior
- `/experimental` exposes the rollout gate as `Smart Approvals`
- enabling it in the TUI enables the feature and switches the current
session to the matching Smart Approvals `/approvals` mode
- disabling it in the TUI clears the persisted `approvals_reviewer`
override when appropriate and returns the session to default manual
review when the effective reviewer changes
- `/approvals` still exposes the reviewer choice directly
- the TUI renders:
- pending guardian review state in the live status footer, including
parallel review aggregation
  - resolved approval/denial state in history

## Scope notes
This PR includes the supporting core/runtime work needed to make Smart
Approvals usable end-to-end:
- shell / unified-exec / apply_patch / managed-network / MCP guardian
review
- delegated/subagent approval routing into guardian review
- guardian review risk metadata and action summaries for app-server/TUI
- config/profile/TUI handling for `smart_approvals`, `guardian_approval`
alias migration, and `approvals_reviewer`
- a small internal cleanup of delegated approval forwarding to dedupe
fallback paths and simplify guardian-vs-parent approval waiting (no
intended behavior change)

Out of scope for this PR:
- redesigning the existing manual approval protocol shapes
- persisting guardian review state onto app-server `ThreadItem`s
- delegated MCP elicitation auto-review (the current delegated MCP
guardian shim only covers the legacy `RequestUserInput` path)

---------

Co-authored-by: Codex <noreply@openai.com>
2026-03-13 15:27:00 -07:00
pash-openai
07e532dcb9 app-server service tier plumbing (plus some cleanup) (#13334)
followup to https://github.com/openai/codex/pull/13212 to expose fast
tier controls to app server
(majority of this PR is generated schema jsons - actual code is +69 /
-35 and +24 tests )

- add service tier fields to the app-server protocol surfaces used by
thread lifecycle, turn start, config, and session configured events
- thread service tier through the app-server message processor and core
thread config snapshots
- allow runtime config overrides to carry service tier for app-server
callers

cleanup:
- Removing useless "legacy" code supporting "standard" - we moved to
None | "fast", so "standard" is not needed.
2026-03-03 02:35:09 -08:00
Michael Bolin
1af2a37ada chore: remove codex-core public protocol/shell re-exports (#12432)
## Why

`codex-rs/core/src/lib.rs` re-exported a broad set of types and modules
from `codex-protocol` and `codex-shell-command`. That made it easy for
workspace crates to import those APIs through `codex-core`, which in
turn hides dependency edges and makes it harder to reduce compile-time
coupling over time.

This change removes those public re-exports so call sites must import
from the source crates directly. Even when a crate still depends on
`codex-core` today, this makes dependency boundaries explicit and
unblocks future work to drop `codex-core` dependencies where possible.

## What Changed

- Removed public re-exports from `codex-rs/core/src/lib.rs` for:
- `codex_protocol::protocol` and related protocol/model types (including
`InitialHistory`)
  - `codex_protocol::config_types` (`protocol_config_types`)
- `codex_shell_command::{bash, is_dangerous_command, is_safe_command,
parse_command, powershell}`
- Migrated workspace Rust call sites to import directly from:
  - `codex_protocol::protocol`
  - `codex_protocol::config_types`
  - `codex_protocol::models`
  - `codex_shell_command`
- Added explicit `Cargo.toml` dependencies (`codex-protocol` /
`codex-shell-command`) in crates that now import those crates directly.
- Kept `codex-core` internal modules compiling by using `pub(crate)`
aliases in `core/src/lib.rs` (internal-only, not part of the public
API).
- Updated the two utility crates that can already drop a `codex-core`
dependency edge entirely:
  - `codex-utils-approval-presets`
  - `codex-utils-cli`

## Verification

- `cargo test -p codex-utils-approval-presets`
- `cargo test -p codex-utils-cli`
- `cargo check --workspace --all-targets`
- `just clippy`
2026-02-20 23:45:35 -08:00
Michael Bolin
abbd74e2be feat: make sandbox read access configurable with ReadOnlyAccess (#11387)
`SandboxPolicy::ReadOnly` previously implied broad read access and could
not express a narrower read surface.
This change introduces an explicit read-access model so we can support
user-configurable read restrictions in follow-up work, while preserving
current behavior today.

It also ensures unsupported backends fail closed for restricted-read
policies instead of silently granting broader access than intended.

## What

- Added `ReadOnlyAccess` in protocol with:
  - `Restricted { include_platform_defaults, readable_roots }`
  - `FullAccess`
- Updated `SandboxPolicy` to carry read-access configuration:
  - `ReadOnly { access: ReadOnlyAccess }`
  - `WorkspaceWrite { ..., read_only_access: ReadOnlyAccess }`
- Preserved existing behavior by defaulting current construction paths
to `ReadOnlyAccess::FullAccess`.
- Threaded the new fields through sandbox policy consumers and call
sites across `core`, `tui`, `linux-sandbox`, `windows-sandbox`, and
related tests.
- Updated Seatbelt policy generation to honor restricted read roots by
emitting scoped read rules when full read access is not granted.
- Added fail-closed behavior on Linux and Windows backends when
restricted read access is requested but not yet implemented there
(`UnsupportedOperation`).
- Regenerated app-server protocol schema and TypeScript artifacts,
including `ReadOnlyAccess`.

## Compatibility / rollout

- Runtime behavior remains unchanged by default (`FullAccess`).
- API/schema changes are in place so future config wiring can enable
restricted read access without another policy-shape migration.
2026-02-11 18:31:14 -08:00
Michael Bolin
44ebf4588f feat: retain NetworkProxy, when appropriate (#11207)
As of this PR, `SessionServices` retains a
`Option<StartedNetworkProxy>`, if appropriate.

Now the `network` field on `Config` is `Option<NetworkProxySpec>`
instead of `Option<NetworkProxy>`.

Over in `Session::new()`, we invoke `NetworkProxySpec::start_proxy()` to
create the `StartedNetworkProxy`, which is a new struct that retains the
`NetworkProxy` as well as the `NetworkProxyHandle`. (Note that `Drop` is
implemented for `NetworkProxyHandle` to ensure the proxies are shutdown
when it is dropped.)

The `NetworkProxy` from the `StartedNetworkProxy` is threaded through to
the appropriate places.


---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/11207).
* #11285
* __->__ #11207
2026-02-10 02:09:23 -08:00
Michael Bolin
66447d5d2c feat: replace custom mcp-types crate with equivalents from rmcp (#10349)
We started working with MCP in Codex before
https://crates.io/crates/rmcp was mature, so we had our own crate for
MCP types that was generated from the MCP schema:


8b95d3e082/codex-rs/mcp-types/README.md

Now that `rmcp` is more mature, it makes more sense to use their MCP
types in Rust, as they handle details (like the `_meta` field) that our
custom version ignored. Though one advantage that our custom types had
is that our generated types implemented `JsonSchema` and `ts_rs::TS`,
whereas the types in `rmcp` do not. As such, part of the work of this PR
is leveraging the adapters between `rmcp` types and the serializable
types that are API for us (app server and MCP) introduced in #10356.

Note this PR results in a number of changes to
`codex-rs/app-server-protocol/schema`, which merit special attention
during review. We must ensure that these changes are still
backwards-compatible, which is possible because we have:

```diff
- export type CallToolResult = { content: Array<ContentBlock>, isError?: boolean, structuredContent?: JsonValue, };
+ export type CallToolResult = { content: Array<JsonValue>, structuredContent?: JsonValue, isError?: boolean, _meta?: JsonValue, };
```

so `ContentBlock` has been replaced with the more general `JsonValue`.
Note that `ContentBlock` was defined as:

```typescript
export type ContentBlock = TextContent | ImageContent | AudioContent | ResourceLink | EmbeddedResource;
```

so the deletion of those individual variants should not be a cause of
great concern.

Similarly, we have the following change in
`codex-rs/app-server-protocol/schema/typescript/Tool.ts`:

```
- export type Tool = { annotations?: ToolAnnotations, description?: string, inputSchema: ToolInputSchema, name: string, outputSchema?: ToolOutputSchema, title?: string, };
+ export type Tool = { name: string, title?: string, description?: string, inputSchema: JsonValue, outputSchema?: JsonValue, annotations?: JsonValue, icons?: Array<JsonValue>, _meta?: JsonValue, };
```

so:

- `annotations?: ToolAnnotations` ➡️ `JsonValue`
- `inputSchema: ToolInputSchema` ➡️ `JsonValue`
- `outputSchema?: ToolOutputSchema` ➡️ `JsonValue`

and two new fields: `icons?: Array<JsonValue>, _meta?: JsonValue`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10349).
* #10357
* __->__ #10349
* #10356
2026-02-02 17:41:55 -08:00
pap-openai
1ef5455eb6 Conversation naming (#8991)
Session renaming:
- `/rename my_session`
- `/rename` without arg and passing an argument in `customViewPrompt`
- AppExitInfo shows resume hint using the session name if set instead of
uuid, defaults to uuid if not set
- Names are stored in `CODEX_HOME/sessions.jsonl`

Session resuming:
- codex resume <name> lookup for `CODEX_HOME/sessions.jsonl` first entry
matching the name and resumes the session

---------

Co-authored-by: jif-oai <jif@openai.com>
2026-01-30 10:40:09 +00:00
jif-oai
83775f4df1 feat: ephemeral threads (#9765)
Add ephemeral threads capabilities. Only exposed through the
`app-server` v2

The idea is to disable the rollout recorder for those threads.
2026-01-24 14:57:40 +00:00
Anton Panasenko
c26fe64539 feat: show forked from session id in /status (#9330)
Summary:
- Add forked_from to SessionMeta/SessionConfiguredEvent and persist it
for forked sessions.
- Surface forked_from in /status for tui + tui2 and add snapshots.
2026-01-16 13:41:46 -08:00
Michael Bolin
0c09dc3c03 feat: add threadId to MCP server messages (#9192)
This favors `threadId` instead of `conversationId` so we use the same
terms as https://developers.openai.com/codex/sdk/.

To test the local build:

```
cd codex-rs
cargo build --bin codex
npx -y @modelcontextprotocol/inspector ./target/debug/codex mcp-server
```

I sent:

```json
{
  "method": "tools/call",
  "params": {
    "name": "codex",
    "arguments": {
      "prompt": "favorite ls option?"
    },
    "_meta": {
      "progressToken": 0
    }
  }
}
```

and got:

```json
{
  "content": [
    {
      "type": "text",
      "text": "`ls -lah` (or `ls -alh`) — long listing, includes dotfiles, human-readable sizes."
    }
  ],
  "structuredContent": {
    "threadId": "019bbb20-bff6-7130-83aa-bf45ab33250e"
  }
}
```

and successfully used the `threadId` in the follow-up with the
`codex-reply` tool call:

```json
{
  "method": "tools/call",
  "params": {
    "name": "codex-reply",
    "arguments": {
      "prompt": "what is the long versoin",
      "threadId": "019bbb20-bff6-7130-83aa-bf45ab33250e"
    },
    "_meta": {
      "progressToken": 1
    }
  }
}
```

whose response also has the `threadId`:

```json
{
  "content": [
    {
      "type": "text",
      "text": "Long listing is `ls -l` (adds permissions, owner/group, size, timestamp)."
    }
  ],
  "structuredContent": {
    "threadId": "019bbb20-bff6-7130-83aa-bf45ab33250e"
  }
}
```

Fixes https://github.com/openai/codex/issues/3712.
2026-01-13 22:14:41 -08:00
jif-oai
116059c3a0 chore: unify conversation with thread name (#8830)
Done and verified by Codex + refactor feature of RustRover
2026-01-07 17:04:53 +00:00
xl-openai
5d77d4db6b Reimplement skills loading using SkillsManager + skills/list op. (#7914)
refactor the way we load and manage skills:
1. Move skill discovery/caching into SkillsManager and reuse it across
sessions.
2. Add the skills/list API (Op::ListSkills/SkillsListResponse) to fetch
skills for one or more cwds. Also update app-server for VSCE/App;
3. Trigger skills/list during session startup so UIs preload skills and
handle errors immediately.
2025-12-14 09:58:17 -08:00
xl-openai
b36ecb6c32 Inject SKILL.md when it's explicitly mentioned. (#7763)
1. Skills load once in core at session start; the cached outcome is
reused across core and surfaced to TUI via SessionConfigured.
2. TUI detects explicit skill selections, and core injects the matching
SKILL.md content into the turn when a selected skill is present.
2025-12-10 13:59:17 -08:00
Ahmed Ibrahim
71504325d3 Migrate model preset (#7542)
- Introduce `openai_models` in `/core`
- Move `PRESETS` under it
- Move `ModelPreset`, `ModelUpgrade`, `ReasoningEffortPreset`,
`ReasoningEffortPreset`, and `ReasoningEffortPreset` to `protocol`
- Introduce `Op::ListModels` and `EventMsg::AvailableModels`

Next steps:
- migrate `app-server` and `tui` to use the introduced Operation
2025-12-03 20:30:43 +00:00
Michael Bolin
a75321a64c fix: add more fields to ThreadStartResponse and ThreadResumeResponse (#6847)
This adds the following fields to `ThreadStartResponse` and
`ThreadResumeResponse`:

```rust
    pub model: String,
    pub model_provider: String,
    pub cwd: PathBuf,
    pub approval_policy: AskForApproval,
    pub sandbox: SandboxPolicy,
    pub reasoning_effort: Option<ReasoningEffort>,
```

This is important because these fields are optional in
`ThreadStartParams` and `ThreadResumeParams`, so the caller needs to be
able to determine what values were ultimately used to start/resume the
conversation. (Though note that any of these could be changed later
between turns in the conversation.)

Though to get this information reliably, it must be read from the
internal `SessionConfiguredEvent` that is created in response to the
start of a conversation. Because `SessionConfiguredEvent` (as defined in
`codex-rs/protocol/src/protocol.rs`) did not have all of these fields, a
number of them had to be added as part of this PR.

Because `SessionConfiguredEvent` is referenced in many tests, test
instances of `SessionConfiguredEvent` had to be updated, as well, which
is why this PR touches so many files.
2025-11-18 21:18:43 -08:00
Michael Bolin
5881c0d6d4 fix: remove mcp-types from app server protocol (#4537)
We continue the separation between `codex app-server` and `codex
mcp-server`.

In particular, we introduce a new crate, `codex-app-server-protocol`,
and migrate `codex-rs/protocol/src/mcp_protocol.rs` into it, renaming it
`codex-rs/app-server-protocol/src/protocol.rs`.

Because `ConversationId` was defined in `mcp_protocol.rs`, we move it
into its own file, `codex-rs/protocol/src/conversation_id.rs`, and
because it is referenced in a ton of places, we have to touch a lot of
files as part of this PR.

We also decide to get away from proper JSON-RPC 2.0 semantics, so we
also introduce `codex-rs/app-server-protocol/src/jsonrpc_lite.rs`, which
is basically the same `JSONRPCMessage` type defined in `mcp-types`
except with all of the `"jsonrpc": "2.0"` removed.

Getting rid of `"jsonrpc": "2.0"` makes our serialization logic
considerably simpler, as we can lean heavier on serde to serialize
directly into the wire format that we use now.
2025-10-01 02:16:26 +00:00
Michael Bolin
d9dbf48828 fix: separate codex mcp into codex mcp-server and codex app-server (#4471)
This is a very large PR with some non-backwards-compatible changes.

Historically, `codex mcp` (or `codex mcp serve`) started a JSON-RPC-ish
server that had two overlapping responsibilities:

- Running an MCP server, providing some basic tool calls.
- Running the app server used to power experiences such as the VS Code
extension.

This PR aims to separate these into distinct concepts:

- `codex mcp-server` for the MCP server
- `codex app-server` for the "application server"

Note `codex mcp` still exists because it already has its own subcommands
for MCP management (`list`, `add`, etc.)

The MCP logic continues to live in `codex-rs/mcp-server` whereas the
refactored app server logic is in the new `codex-rs/app-server` folder.
Note that most of the existing integration tests in
`codex-rs/mcp-server/tests/suite` were actually for the app server, so
all the tests have been moved with the exception of
`codex-rs/mcp-server/tests/suite/mod.rs`.

Because this is already a large diff, I tried not to change more than I
had to, so `codex-rs/app-server/tests/common/mcp_process.rs` still uses
the name `McpProcess` for now, but I will do some mechanical renamings
to things like `AppServer` in subsequent PRs.

While `mcp-server` and `app-server` share some overlapping functionality
(like reading streams of JSONL and dispatching based on message types)
and some differences (completely different message types), I ended up
doing a bit of copypasta between the two crates, as both have somewhat
similar `message_processor.rs` and `outgoing_message.rs` files for now,
though I expect them to diverge more in the near future.

One material change is that of the initialize handshake for `codex
app-server`, as we no longer use the MCP types for that handshake.
Instead, we update `codex-rs/protocol/src/mcp_protocol.rs` to add an
`Initialize` variant to `ClientRequest`, which takes the `ClientInfo`
object we need to update the `USER_AGENT_SUFFIX` in
`codex-rs/app-server/src/message_processor.rs`.

One other material change is in
`codex-rs/app-server/src/codex_message_processor.rs` where I eliminated
a use of the `send_event_as_notification()` method I am generally trying
to deprecate (because it blindly maps an `EventMsg` into a
`JSONNotification`) in favor of `send_server_notification()`, which
takes a `ServerNotification`, as that is intended to be a custom enum of
all notification types supported by the app server. So to make this
update, I had to introduce a new variant of `ServerNotification`,
`SessionConfigured`, which is a non-backwards compatible change with the
old `codex mcp`, and clients will have to be updated after the next
release that contains this PR. Note that
`codex-rs/app-server/tests/suite/list_resume.rs` also had to be update
to reflect this change.

I introduced `codex-rs/utils/json-to-toml/src/lib.rs` as a small utility
crate to avoid some of the copying between `mcp-server` and
`app-server`.
2025-09-30 07:06:18 +00:00
pakrym-oai
0f9a796617 Use anyhow::Result in tests for error propagation (#4105) 2025-09-23 13:31:36 -07:00
jif-oai
c6fd056aa6 feat: reasoning effort as optional (#3527)
Allow the reasoning effort to be optional
2025-09-12 12:06:33 -07:00
Michael Bolin
9bbeb75361 feat: include reasoning_effort in NewConversationResponse (#3506)
`ClientRequest::NewConversation` picks up the reasoning level from the user's defaults in `config.toml`, so it should be reported in `NewConversationResponse`.
2025-09-11 21:04:40 -07:00
Michael Bolin
bec51f6c05 chore: enable clippy::redundant_clone (#3489)
Created this PR by:

- adding `redundant_clone` to `[workspace.lints.clippy]` in
`cargo-rs/Cargol.toml`
- running `cargo clippy --tests --fix`
- running `just fmt`

Though I had to clean up one instance of the following that resulted:

```rust
let codex = codex;
```
2025-09-11 11:59:37 -07:00
Michael Bolin
2a76a08a9e fix: include rollout_path in NewConversationResponse (#3352)
Adding the `rollout_path` to the `NewConversationResponse` makes it so a
client can perform subsequent operations on a `(ConversationId,
PathBuf)` pair. #3353 will introduce support for `ArchiveConversation`.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/3352).
* #3353
* __->__ #3352
2025-09-09 00:11:48 -07:00
Gabriel Peal
c8fab51372 Use ConversationId instead of raw Uuids (#3282)
We're trying to migrate from `session_id: Uuid` to `conversation_id:
ConversationId`. Not only does this give us more type safety but it
unifies our terminology across Codex and with the implementation of
session resuming, a conversation (which can span multiple sessions) is
more appropriate.

I started this impl on https://github.com/openai/codex/pull/3219 as part
of getting resume working in the extension but it's big enough that it
should be broken out.
2025-09-07 23:22:25 -04:00
Michael Bolin
3f40fbc0a8 chore: improve serialization of ServerNotification (#3193)
This PR introduces introduces a new
`OutgoingMessage::AppServerNotification` variant that is designed to
wrap a `ServerNotification`, which makes the serialization more
straightforward compared to
`OutgoingMessage::Notification(OutgoingNotification)`. We still use the
latter for serializing an `Event` as a `JSONRPCMessage::Notification`,
but I will try to get away from that in the near future.

With this change, now the generated TypeScript type for
`ServerNotification` is:

```typescript
export type ServerNotification =
  | { "method": "authStatusChange", "params": AuthStatusChangeNotification }
  | { "method": "loginChatGptComplete", "params": LoginChatGptCompleteNotification };
```

whereas before it was:

```typescript
export type ServerNotification =
  | { type: "auth_status_change"; data: AuthStatusChangeNotification }
  | { type: "login_chat_gpt_complete"; data: LoginChatGptCompleteNotification };
```

Once the `Event`s are migrated to the `ServerNotification` enum in Rust,
it should be considerably easier to work with notifications on the
TypeScript side, as it will be possible to `switch (message.method)` and
check for exhaustiveness.

Though we will probably need to introduce:

```typescript
export type ServerMessage = ServerRequest | ServerNotification;
```

and then we still need to group all of the `ServerResponse` types
together, as well.
2025-09-04 17:49:50 -07:00
Ahmed Ibrahim
f2036572b6 Replay EventMsgs from Response Items when resuming a session with history. (#3123)
### Overview

This PR introduces the following changes:
	1.	Adds a unified mechanism to convert ResponseItem into EventMsg.
2. Ensures that when a session is initialized with initial history, a
vector of EventMsg is sent along with the session configuration. This
allows clients to re-render the UI accordingly.
	3. 	Added integration testing

### Caveats

This implementation does not send every EventMsg that was previously
dispatched to clients. The excluded events fall into two categories:
	•	“Arguably” rolled-out events
Examples include tool calls and apply-patch calls. While these events
are conceptually rolled out, we currently only roll out ResponseItems.
These events are already being handled elsewhere and transformed into
EventMsg before being sent.
	•	Non-rolled-out events
Certain events such as TurnDiff, Error, and TokenCount are not rolled
out at all.

### Future Directions

At present, resuming a session involves maintaining two states:
	•	UI State
Clients can replay most of the important UI from the provided EventMsg
history.
	•	Model State
The model receives the complete session history to reconstruct its
internal state.

This design provides a solid foundation. If, in the future, more precise
UI reconstruction is needed, we have two potential paths:
1. Introduce a third data structure that allows us to derive both
ResponseItems and EventMsgs.
2. Clearly divide responsibilities: the core system ensures the
integrity of the model state, while clients are responsible for
reconstructing the UI.
2025-09-04 04:47:00 +00:00
Michael Bolin
970e466ab3 fix: switch to unbounded channel (#2874)
#2747 encouraged me to audit our codebase for similar issues, as now I
am particularly suspicious that our flaky tests are due to a racy
deadlock.

I asked Codex to audit our code, and one of its suggestions was this:

> **High-Risk Patterns**
>
> All `send_*` methods await on a bounded
`mpsc::Sender<OutgoingMessage>`. If the writer blocks, the channel fills
and the processor task blocks on send, stops draining incoming requests,
and stdin reader eventually blocks on its send. This creates a
backpressure deadlock cycle across the three tasks.
>
> **Recommendations**
> * Server outgoing path: break the backpressure cycle
> * Option A (minimal risk): Change `OutgoingMessageSender` to use an
unbounded channel to decouple producer from stdout. Add rate logging so
floods are visible.
> * Option B (bounded + drop policy): Change `send_*` to try_send and
drop messages (or coalesce) when the queue is full, logging a warning.
This prevents processor stalls at the cost of losing messages under
extreme backpressure.
> * Option C (two-stage buffer): Keep bounded channel, but have a
dedicated “egress” task that drains an unbounded internal queue, writing
to stdout with retries and a shutdown timeout. This centralizes
backpressure policy.

So this PR is Option A.

Indeed, we previously used a bounded channel with a capacity of `128`,
but as we discovered recently with #2776, there are certainly cases
where we can get flooded with events.

That said, `test_shell_command_approval_triggers_elicitation` just
failed one one build when I put up this PR, so clearly we are not out of
the woods yet...

**Update:** I think I found the true source of the deadlock! See
https://github.com/openai/codex/pull/2876
2025-08-28 22:20:10 -07:00
Michael Bolin
74d2741729 chore: require uninlined_format_args from clippy (#2845)
- added `uninlined_format_args` to `[workspace.lints.clippy]` in the
`Cargo.toml` for the workspace
- ran `cargo clippy --tests --fix`
- ran `just fmt`
2025-08-28 11:25:23 -07:00
Eric Traut
dacff9675a Added new auth-related methods and events to mcp server (#2496)
This PR adds the following:
* A getAuthStatus method on the mcp server. This returns the auth method
currently in use (chatgpt or apikey) or none if the user is not
authenticated. It also returns the "preferred auth method" which
reflects the `preferred_auth_method` value in the config.
* A logout method on the mcp server. If called, it logs out the user and
deletes the `auth.json` file — the same behavior in the cli's `/logout`
command.
* An `authStatusChange` event notification that is sent when the auth
status changes due to successful login or logout operations.
* Logic to pass command-line config overrides to the mcp server at
startup time. This allows use cases like `codex mcp -c
preferred_auth_method=apikey`.
2025-08-20 20:36:34 -07:00
Michael Bolin
a269754668 remove mcp-server/src/mcp_protocol.rs and the code that depends on it (#2360) 2025-08-18 00:29:18 -07:00
Parker Thompson
a075424437 Added allow-expect-in-tests / allow-unwrap-in-tests (#2328)
This PR:
* Added the clippy.toml to configure allowable expect / unwrap usage in
tests
* Removed as many expect/allow lines as possible from tests
* moved a bunch of allows to expects where possible

Note: in integration tests, non `#[test]` helper functions are not
covered by this so we had to leave a few lingering `expect(expect_used`
checks around
2025-08-14 17:59:01 -07:00
Michael Bolin
a62510e0ae fix: verify notifications are sent with the conversationId set (#2278)
This updates `CodexMessageProcessor` so that each notification it sends
for a `EventMsg` from a `CodexConversation` such that:

- The `params` always has an appropriate `conversationId` field.
- The `method` is now includes the name of the `EventMsg` type rather
than using `codex/event` as the `method` type for all notifications. (We
currently prefix the method name with `codex/event/`, but I think that
should go away once we formalize the notification schema in
`wire_format.rs`.)

As part of this, we update `test_codex_jsonrpc_conversation_flow()` to
verify that the `task_finished` notification has made it through the
system instead of sleeping for 5s and "hoping" the server finished
processing the task. Note we have seen some flakiness in some of our
other, similar integration tests, and I expect adding a similar check
would help in those cases, as well.
2025-08-13 17:54:12 -07:00
Michael Bolin
37fc4185ef fix: update OutgoingMessageSender::send_response() to take Serialize (#2263)
This makes `send_response()` easier to work with.
2025-08-13 14:29:13 -07:00
Dylan
094d7af8c3 [mcp-server] Populate notifications._meta with requestId (#1704)
## Summary
Per the [latest MCP
spec](https://modelcontextprotocol.io/specification/2025-06-18/basic#meta),
the `_meta` field is reserved for metadata. In the [Typescript
Schema](0695a497eb/schema/2025-06-18/schema.ts (L37-L40)),
`progressToken` is defined as a value to be attached to subsequent
notifications for that request.

The
[CallToolRequestParams](0695a497eb/schema/2025-06-18/schema.ts (L806-L817))
extends this definition but overwrites the params field. This ambiguity
makes our generated type definitions tricky, so I'm going to skip
`progressToken` field for now and just send back the `requestId`
instead.
 
In a future PR, we can clarify, update our `generate_mcp_types.py`
script, and update our progressToken logic accordingly.

## Testing
- [x] Added unit tests
- [x] Manually tested with mcp client
2025-07-28 13:32:09 -07:00
aibrahim-oai
5a0079fea2 Changing method in MCP notifications (#1684)
- Changing the codex/event type
2025-07-26 10:35:49 -07:00
Michael Bolin
018003e52f feat: leverage elicitations in the MCP server (#1623)
This updates the MCP server so that if it receives an
`ExecApprovalRequest` from the `Codex` session, it in turn sends an [MCP
elicitation](https://modelcontextprotocol.io/specification/draft/client/elicitation)
to the client to ask for the approval decision. Upon getting a response,
it forwards the client's decision via `Op::ExecApproval`.

Admittedly, we should be doing the same thing for
`ApplyPatchApprovalRequest`, but this is our first time experimenting
with elicitations, so I'm inclined to defer wiring that code path up
until we feel good about how this one works.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1623).
* __->__ #1623
* #1622
* #1621
* #1620
2025-07-19 01:32:03 -04:00
Michael Bolin
11fd3123be chore: introduce OutgoingMessageSender (#1622)
Previous to this change, `MessageProcessor` had a
`tokio::sync::mpsc::Sender<JSONRPCMessage>` as an abstraction for server
code to send a message down to the MCP client. Because `Sender` is cheap
to `clone()`, it was straightforward to make it available to tasks
scheduled with `tokio::task::spawn()`.

This worked well when we were only sending notifications or responses
back down to the client, but we want to add support for sending
elicitations in #1623, which means that we need to be able to send
_requests_ to the client, and now we need a bit of centralization to
ensure all request ids are unique.

To that end, this PR introduces `OutgoingMessageSender`, which houses
the existing `Sender<OutgoingMessage>` as well as an `AtomicI64` to mint
out new, unique request ids. It has methods like `send_request()` and
`send_response()` so that callers do not have to deal with
`JSONRPCMessage` directly, as having to set the `jsonrpc` for each
message was a bit tedious (this cleans up `codex_tool_runner.rs` quite a
bit).

We do not have `OutgoingMessageSender` implement `Clone` because it is
important that the `AtomicI64` is shared across all users of
`OutgoingMessageSender`. As such, `Arc<OutgoingMessageSender>` must be
used instead, as it is frequently shared with new tokio tasks.

As part of this change, we update `message_processor.rs` to embrace
`await`, though we must be careful that no individual handler blocks the
main loop and prevents other messages from being handled.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/1622).
* #1623
* __->__ #1622
* #1621
* #1620
2025-07-19 00:30:56 -04:00