Files
codex/codex-rs/MODELS_STARTUP_PLAN.md
Ahmed Ibrahim d0516eac5d fix
2025-12-22 18:06:47 -08:00

5.4 KiB
Raw Blame History

Models Startup + TUI Gating Plan

Goal: let the TUI render immediately and accept input, while ensuring we do not lie about the model actually used. Model selection (/model) and turn submission are gated until the session is configured. If /models refresh fails, core keeps using the fallback models seeded at startup (from models.json) via the existing RwLock.

This plan is intended to be executed top-to-bottom. Each step is checked off when completed.


Step 1 - Core: Bound /models refresh + prevent duplicate refreshes

  • Add a 5s timeout to the remote /models request in core/src/models_manager/manager.rs::refresh_available_models.
    • On timeout/error, do not overwrite remote_models (fallback seeded at startup remains available automatically).
    • Continue to log failures for visibility.
  • Add a refresh_lock: tokio::sync::Mutex<()> field to ModelsManager.
    • Initialize it in ModelsManager::new and ModelsManager::with_provider.
    • Acquire it around the refresh logic so concurrent calls do not issue duplicate /models requests.

Acceptance:

  • refresh_available_models returns promptly (<= ~5s) even when /models hangs.
  • Concurrent list_models / get_model callers do not duplicate the refresh.

Step 2 - Protocol: SessionConfigured includes the selected model family metadata

  • Change SessionConfiguredEvent to carry model_family metadata (not a plain model string).
    • Implemented as codex_protocol::openai_models::ModelFamily (a type alias of ModelInfo, matching the /models payload shape).
    • Consumers derive the model slug from event.model_family.slug.

Acceptance:

  • UIs can render the selected model name (and gate features) without making additional /models calls after startup completes.
    • No placeholder model family exists or is required.

Step 3 - TUI: Boot without blocking on models

Files:

  • tui/src/app.rs

  • tui2/src/app.rs

  • Remove startup awaits that currently block the first render:

    • Do not call ModelsManager::get_model(...).await during startup.
    • Do not call ModelsManager::list_models(...).await during startup.
    • Do not run the existing model migration prompt at startup.
  • Construct ChatWidget immediately with model_family: None (no placeholder).

Acceptance:

  • The TUI event loop begins immediately (frame scheduled before any /models IO).

Step 4 - TUI: Truthful readiness gating (Loading/Ready only)

Design:

  • Ready is defined as "we have received EventMsg::SessionConfigured", which includes the model actually used.
  • While Loading, allow typing, but queue submissions and disable /model.

Files:

  • tui/src/chatwidget.rs

  • tui2/src/chatwidget.rs

  • Keep any model-family-dependent behavior gated until SessionConfigured.

    • No placeholder: ChatWidget stores model_family: Option<ModelFamily>.
    • Once SessionConfigured arrives, hydrate model_family from event.model_family.
  • Gate turn submission:

    • While not configured, pressing Enter enqueues into the existing queued_user_messages queue and updates the queued display.
    • Prevent maybe_send_next_queued_input from sending anything until the session is configured.
    • After session configured:
      • If there is no CLI-provided initial_user_message, start draining the queued inputs by submitting exactly one (the existing turn-completion loop will send the rest sequentially).
      • If there is an initial_user_message, do not start draining immediately (avoid sending queued inputs before the initial message begins a turn).
  • Gate /model:

    • While not configured, show an info message explaining /model is disabled until startup completes.
  • Gate other model-family-dependent flows (e.g. popups/status helpers) so they early-return instead of assuming a model family exists.

Acceptance:

  • Users can type immediately.
  • Users can press Enter multiple times during startup; submits are queued and later executed in order.
  • /model is unavailable until the actual model is known (SessionConfigured).

Step 5 - Migration: Schedule for next run (like update)

Goal: never interrupt the user during startup. Migration UX becomes a "pending notice" rendered on the next run.

  • After SessionConfigured, compute whether a migration notice should be scheduled using the current ModelsManager model list (from the RwLock).
  • Persist a "pending migration notice" to a separate file under codex_home (similar to version.json) so we dont overcrowd config.toml.
  • On next run, display the notice as a history cell (non-modal), then clear the pending notice (and record it as seen so it won't reappear).

Acceptance:

  • No migration prompt blocks startup.
  • If migration is relevant, the user sees it next run similarly to update notices.

Step 6 - Formatting, lint, and tests

  • Run just fmt (required after Rust changes).
  • Ask before running just fix -p codex-core / just fix -p codex-tui / just fix -p codex-tui2.
  • Run targeted tests:
    • cargo test -p codex-core
    • cargo test -p codex-tui
    • cargo test -p codex-tui2 (if changed)
  • Ask before running cargo test --all-features (since core changed).

Notes:

  • cargo test -p codex-core must be run outside the sandbox in this environment (wiremock binds an OS port; some seatbelt tests invoke sandbox-exec).