fix(tui): reduce startup and new-session latency (#17039)

## TL;DR

- Fetches `account/rateLimits/read` asynchronously so the TUI can continue
starting without waiting for the rate-limit response.
- Fixes the `/status` card so it no longer leaves a stale "refreshing
limits..." notice in terminal history.

## Problem

The TUI bootstrap path fetched account rate limits synchronously
(`account/rateLimits/read`) before the event loop started for
ChatGPT/OpenAI-authenticated startups. This added ~670 ms of blocking
latency in the measured hot-start case, even though rate-limit data is
not needed to render the initial UI or accept user input. The delay was
especially noticeable on hot starts where every other RPC
(`account/read`, `model/list`, `thread/start`) completed in under 70 ms
total.

Moving that fetch to the background also exposed a `/status` UI bug: the
status card is flattened into terminal scrollback when it is inserted. A
transient "refreshing limits in background..." line could not be cleared
later, because the async completion updated the retained `HistoryCell`,
not the already-written terminal history.

## Mental model

Before this change, `AppServerSession::bootstrap()` performed three
sequential RPCs: `account/read` → `model/list` →
`account/rateLimits/read`. The result of the third call was baked into
`AppServerBootstrap` and applied to the chat widget before the event
loop began.

After this change, `bootstrap()` only performs two RPCs (`account/read`
+ `model/list`), and rate-limit fetching is kicked off as an async
background task immediately after the first frame is scheduled. A new
enum, `RateLimitRefreshOrigin`, tags each fetch so the event handler
knows whether the result came from the startup prefetch or from a
user-initiated `/status` command; they have different completion
side-effects.
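The reordering can be sketched as a minimal model (step labels and the two flag names come from this PR; everything else is a stand-in for the real async RPC plumbing):

```rust
// A minimal model of the new startup ordering. The key point: the first
// frame is scheduled *before* the rate-limit fetch is even requested.
fn startup_sequence(requires_openai_auth: bool, has_chatgpt_account: bool) -> Vec<&'static str> {
    let mut steps = vec![
        "account/read",         // still on the critical path
        "model/list",           // still on the critical path
        "schedule-first-frame", // UI becomes responsive here
    ];
    // Fire-and-forget prefetch, only for ChatGPT/OpenAI-authenticated sessions.
    if requires_openai_auth && has_chatgpt_account {
        steps.push("spawn account/rateLimits/read");
    }
    steps
}

fn main() {
    let steps = startup_sequence(true, true);
    // The ~670 ms fetch now happens after the frame is scheduled.
    assert_eq!(steps[2], "schedule-first-frame");
    assert_eq!(steps[3], "spawn account/rateLimits/read");
    // Sessions that don't need OpenAI auth skip the prefetch entirely.
    assert_eq!(startup_sequence(false, false).len(), 3);
    println!("ok");
}
```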

The `get_login_status()` helper (used outside the main app flow) was
also decoupled: it previously called the full `bootstrap()` just to
check auth mode, wasting model-list and rate-limit work. It now calls
the narrower `read_account()` directly.
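A sketch of the decoupling (the types and return values here are stand-ins; the real `read_account()` is an async RPC on `AppServerSession`):

```rust
// Stand-in for the account/read response.
struct AccountInfo {
    auth_mode: &'static str,
}

struct AppServerSession;

impl AppServerSession {
    // Previously inlined in bootstrap(); now callable on its own.
    fn read_account(&self) -> AccountInfo {
        AccountInfo { auth_mode: "chatgpt" } // stand-in for the RPC result
    }
}

// One RPC instead of the full bootstrap() pipeline: no model/list,
// no rate-limit fetch, just the auth-mode check.
fn get_login_status(session: &AppServerSession) -> &'static str {
    session.read_account().auth_mode
}

fn main() {
    assert_eq!(get_login_status(&AppServerSession), "chatgpt");
    println!("ok");
}
```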

For `/status`, this PR keeps the background refresh request but stops
printing transient refresh notices into status history when cached
limits are already available. If a refresh updates the cache, the next
`/status` command will render the new values.

## Non-goals

- This change does not alter the rate-limit data itself.
- This change does not introduce caching, retries, or staleness
management for rate limits.
- This change does not affect the `model/list` or `thread/start` RPCs;
they remain on the critical startup path.

## Tradeoffs

- **Stale-on-first-render**: The status bar will briefly show no
rate-limit info until the background fetch completes; observed
background fetches landed roughly in the 400-900 ms range after the UI
appeared. This is acceptable because the user cannot meaningfully act on
rate-limit data in the first fraction of a second.
- **Error silence on startup prefetch**: If the startup prefetch fails,
the error is logged but the UI is not notified (unlike `/status` refresh
failures, which go through the status-command completion path). This
avoids surfacing transient network errors as a startup blocker.
- **Static `/status` history**: `/status` output is terminal history,
not a live widget. The card now avoids progress-style language that
would appear stuck in scrollback; users can run `/status` again to see
newly cached values.
- **`account_auth_mode` field removed from `AppServerBootstrap`**: The
only consumer was `get_login_status()`, which no longer goes through
`bootstrap()`. The field was dead weight.

## Architecture

### New types

- `RateLimitRefreshOrigin` (in `app_event.rs`): A `Copy` enum
distinguishing `StartupPrefetch` from `StatusCommand { request_id }`.
Carried through `RefreshRateLimits` and `RateLimitsLoaded` events so the
handler applies the right completion behavior.
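A sketch of the enum and its completion dispatch (the variant shapes match the PR; the returned labels are illustrative descriptions of the side-effects, not real APIs):

```rust
// Each rate-limit fetch is tagged with its origin so the completion
// handler knows which side-effects to apply.
#[derive(Clone, Copy, Debug, PartialEq)]
enum RateLimitRefreshOrigin {
    StartupPrefetch,
    StatusCommand { request_id: u64 },
}

fn completion_action(origin: RateLimitRefreshOrigin, ok: bool) -> &'static str {
    match (origin, ok) {
        // Startup prefetch success: apply snapshots, schedule a repaint.
        (RateLimitRefreshOrigin::StartupPrefetch, true) => "apply snapshots + schedule frame",
        // Startup prefetch failure: log a warning, leave the UI alone.
        (RateLimitRefreshOrigin::StartupPrefetch, false) => "log warning only",
        // /status refresh: finish the matching request, success or failure.
        (RateLimitRefreshOrigin::StatusCommand { .. }, _) => "finish_status_rate_limit_refresh",
    }
}

fn main() {
    use RateLimitRefreshOrigin::*;
    assert_eq!(completion_action(StartupPrefetch, false), "log warning only");
    assert_eq!(
        completion_action(StatusCommand { request_id: 7 }, false),
        "finish_status_rate_limit_refresh"
    );
    println!("ok");
}
```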

### Modified types

- `AppServerBootstrap`: Lost `account_auth_mode` and
`rate_limit_snapshots`; gained `requires_openai_auth: bool` (passed
through from the account response so the caller can decide whether to
fire the prefetch).

### Control flow

1. `bootstrap()` returns with `requires_openai_auth` and
`has_chatgpt_account`.
2. After scheduling the first frame, `App::run_inner` fires
`refresh_rate_limits(StartupPrefetch)` if both flags are true.
3. When `RateLimitsLoaded { StartupPrefetch, Ok(..) }` arrives,
snapshots are applied and a frame is scheduled to repaint the status
bar.
4. When `RateLimitsLoaded { StartupPrefetch, Err(..) }` arrives, the
error is logged and no UI update occurs.
5. `/status`-initiated refreshes continue to use `StatusCommand {
request_id }` and call `finish_status_rate_limit_refresh` on completion
(success or failure).
6. `/status` history cells with cached rate-limit rows no longer render
an additional "refreshing limits" notice; the async refresh updates the
cache for future status output.
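Step 5's per-request bookkeeping can be sketched as follows (the field name `refreshing_status_outputs` appears in the tests; the struct is otherwise a stand-in). Overlapping refreshes complete independently because each is keyed by its own `request_id`:

```rust
use std::collections::HashSet;

#[derive(Default)]
struct Chat {
    refreshing_status_outputs: HashSet<u64>,
    next_request_id: u64,
}

impl Chat {
    // Each /status command registers a fresh request id; the
    // RefreshRateLimits event carries it in StatusCommand { request_id }.
    fn dispatch_status(&mut self) -> u64 {
        let id = self.next_request_id;
        self.next_request_id += 1;
        self.refreshing_status_outputs.insert(id);
        id
    }

    // Completion (success or failure) retires only the matching request.
    fn finish_status_rate_limit_refresh(&mut self, request_id: u64) {
        self.refreshing_status_outputs.remove(&request_id);
    }
}

fn main() {
    let mut chat = Chat::default();
    let first = chat.dispatch_status();
    let second = chat.dispatch_status();
    // Finishing the first request leaves the second still in flight.
    chat.finish_status_rate_limit_refresh(first);
    assert_eq!(chat.refreshing_status_outputs.len(), 1);
    chat.finish_status_rate_limit_refresh(second);
    assert!(chat.refreshing_status_outputs.is_empty());
    println!("ok");
}
```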

### Extracted method

- `AppServerSession::read_account()`: Factored out of `bootstrap()` so
that `get_login_status()` can call it independently without triggering
model-list or rate-limit work.

## Observability

- The existing `tracing::warn!` for rate-limit fetch failures is
preserved for the startup path.
- No new metrics or spans are introduced. The startup-time improvement
is observable via the existing `ready` timestamp in TUI startup logs.

## Tests

- Existing tests in `status_command_tests.rs` are updated to match on
`RateLimitRefreshOrigin::StatusCommand { request_id }` instead of a bare
`request_id`.
- Focused `/status` tests now assert that status history avoids
transient refresh text, continues to request an async refresh, and uses
refreshed cached limits in future status output.
- No new tests are added for the startup prefetch path because it is a
fire-and-forget spawn with no observable side-effect other than the
widget state update, which is already covered by the
snapshot-application tests.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Commit 359e17a852 (parent 80ebc80be5), authored by Felipe Coury on
2026-04-07 22:16:09 -03:00, committed by GitHub. 12 changed files with
235 additions and 133 deletions.


```diff
@@ -15,49 +15,50 @@ async fn status_command_renders_immediately_and_refreshes_rate_limits_for_chatgp
         other => panic!("expected status output before refresh request, got {other:?}"),
     };
     assert!(
-        rendered.contains("refreshing limits"),
-        "expected /status to explain the background refresh, got: {rendered}"
+        !rendered.contains("refreshing limits"),
+        "expected /status to avoid transient refresh text in terminal history, got: {rendered}"
     );
     let request_id = match rx.try_recv() {
-        Ok(AppEvent::RefreshRateLimits { request_id }) => request_id,
+        Ok(AppEvent::RefreshRateLimits {
+            origin: RateLimitRefreshOrigin::StatusCommand { request_id },
+        }) => request_id,
         other => panic!("expected rate-limit refresh request, got {other:?}"),
     };
     pretty_assertions::assert_eq!(request_id, 0);
 }
 
 #[tokio::test]
-async fn status_command_updates_rendered_cell_after_rate_limit_refresh() {
+async fn status_command_refresh_updates_cached_limits_for_future_status_outputs() {
     let (mut chat, mut rx, _op_rx) = make_chatwidget_manual(/*model_override*/ None).await;
     set_chatgpt_auth(&mut chat);
     chat.dispatch_command(SlashCommand::Status);
-    let cell = match rx.try_recv() {
-        Ok(AppEvent::InsertHistoryCell(cell)) => cell,
+    match rx.try_recv() {
+        Ok(AppEvent::InsertHistoryCell(_)) => {}
         other => panic!("expected status output before refresh request, got {other:?}"),
-    };
+    }
     let first_request_id = match rx.try_recv() {
-        Ok(AppEvent::RefreshRateLimits { request_id }) => request_id,
+        Ok(AppEvent::RefreshRateLimits {
+            origin: RateLimitRefreshOrigin::StatusCommand { request_id },
+        }) => request_id,
         other => panic!("expected rate-limit refresh request, got {other:?}"),
     };
-    let initial = lines_to_single_string(&cell.display_lines(/*width*/ 80));
-    assert!(
-        initial.contains("refreshing limits"),
-        "expected initial /status output to show refresh notice, got: {initial}"
-    );
     chat.on_rate_limit_snapshot(Some(snapshot(/*percent*/ 92.0)));
     chat.finish_status_rate_limit_refresh(first_request_id);
     drain_insert_history(&mut rx);
-    let updated = lines_to_single_string(&cell.display_lines(/*width*/ 80));
-    assert_ne!(
-        initial, updated,
-        "expected refreshed /status output to change"
-    );
+    chat.dispatch_command(SlashCommand::Status);
+    let refreshed = match rx.try_recv() {
+        Ok(AppEvent::InsertHistoryCell(cell)) => {
+            lines_to_single_string(&cell.display_lines(/*width*/ 80))
+        }
+        other => panic!("expected refreshed status output, got {other:?}"),
+    };
     assert!(
-        !updated.contains("refreshing limits"),
-        "expected refresh notice to clear after background update, got: {updated}"
+        refreshed.contains("8% left"),
+        "expected a future /status output to use refreshed cached limits, got: {refreshed}"
     );
 }
@@ -81,46 +82,41 @@ async fn status_command_overlapping_refreshes_update_matching_cells_only() {
     set_chatgpt_auth(&mut chat);
     chat.dispatch_command(SlashCommand::Status);
-    let first_cell = match rx.try_recv() {
-        Ok(AppEvent::InsertHistoryCell(cell)) => cell,
+    match rx.try_recv() {
+        Ok(AppEvent::InsertHistoryCell(_)) => {}
         other => panic!("expected first status output, got {other:?}"),
-    };
+    }
     let first_request_id = match rx.try_recv() {
-        Ok(AppEvent::RefreshRateLimits { request_id }) => request_id,
+        Ok(AppEvent::RefreshRateLimits {
+            origin: RateLimitRefreshOrigin::StatusCommand { request_id },
+        }) => request_id,
         other => panic!("expected first refresh request, got {other:?}"),
     };
     chat.dispatch_command(SlashCommand::Status);
-    let second_cell = match rx.try_recv() {
-        Ok(AppEvent::InsertHistoryCell(cell)) => cell,
+    let second_rendered = match rx.try_recv() {
+        Ok(AppEvent::InsertHistoryCell(cell)) => {
+            lines_to_single_string(&cell.display_lines(/*width*/ 80))
+        }
         other => panic!("expected second status output, got {other:?}"),
     };
     let second_request_id = match rx.try_recv() {
-        Ok(AppEvent::RefreshRateLimits { request_id }) => request_id,
+        Ok(AppEvent::RefreshRateLimits {
+            origin: RateLimitRefreshOrigin::StatusCommand { request_id },
+        }) => request_id,
         other => panic!("expected second refresh request, got {other:?}"),
     };
     assert_ne!(first_request_id, second_request_id);
+    assert!(
+        !second_rendered.contains("refreshing limits"),
+        "expected /status to avoid transient refresh text in terminal history, got: {second_rendered}"
+    );
     chat.finish_status_rate_limit_refresh(first_request_id);
-    let first_after_failure = lines_to_single_string(&first_cell.display_lines(/*width*/ 80));
-    let second_still_refreshing = lines_to_single_string(&second_cell.display_lines(/*width*/ 80));
-    assert!(
-        !first_after_failure.contains("refreshing limits"),
-        "expected first status cell to stop refreshing after its request completed, got: {first_after_failure}"
-    );
-    assert!(
-        second_still_refreshing.contains("refreshing limits"),
-        "expected later status cell to keep refreshing until its own request completes, got: {second_still_refreshing}"
-    );
     pretty_assertions::assert_eq!(chat.refreshing_status_outputs.len(), 1);
     chat.on_rate_limit_snapshot(Some(snapshot(/*percent*/ 92.0)));
     chat.finish_status_rate_limit_refresh(second_request_id);
-    let second_after_success = lines_to_single_string(&second_cell.display_lines(/*width*/ 80));
-    assert!(
-        !second_after_success.contains("refreshing limits"),
-        "expected second status cell to refresh once its own request completed, got: {second_after_success}"
-    );
     assert!(chat.refreshing_status_outputs.is_empty());
 }
```