mirror of
https://github.com/openai/codex.git
synced 2026-05-03 02:46:39 +00:00
feat(tui): syntax highlighting via syntect with theme picker (#11447)
## Summary Adds syntax highlighting to the TUI for fenced code blocks in markdown responses and file diffs, plus a `/theme` command with live preview and persistent theme selection. Uses syntect (~250 grammars, 32 bundled themes, ~1 MB binary cost) — the same engine behind `bat`, `delta`, and `xi-editor`. Includes guardrails for large inputs, graceful fallback to plain text, and SSH-aware clipboard integration for the `/copy` command. <img width="1554" height="1014" alt="image" src="https://github.com/user-attachments/assets/38737a79-8717-4715-b857-94cf1ba59b85" /> <img width="2354" height="1374" alt="image" src="https://github.com/user-attachments/assets/25d30a00-c487-4af8-9cb6-63b0695a4be7" /> ## Problem Code blocks in the TUI (markdown responses and file diffs) render without syntax highlighting, making it hard to scan code at a glance. Users also have no way to pick a color theme that matches their terminal aesthetic. ## Mental model The highlighting system has three layers: 1. **Syntax engine** (`render::highlight`) -- a thin wrapper around syntect + two-face. It owns a process-global `SyntaxSet` (~250 grammars) and a `RwLock<Theme>` that can be swapped at runtime. All public entry points accept `(code, lang)` and return ratatui `Span`/`Line` vectors or `None` when the language is unrecognized or the input exceeds safety guardrails. 2. **Rendering consumers** -- `markdown_render` feeds fenced code blocks through the engine; `diff_render` highlights Add/Delete content as a whole file and Update hunks per-hunk (preserving parser state across hunk lines). Both callers fall back to plain unstyled text when the engine returns `None`. 3. **Theme lifecycle** -- at startup the config's `tui.theme` is resolved to a syntect `Theme` via `set_theme_override`. At runtime the `/theme` picker calls `set_syntax_theme` to swap themes live; on cancel it restores the snapshot taken at open. On confirm it persists `[tui] theme = "..."` to config.toml. ## Non-goals - Inline diff highlighting (word-level change detection within a line). - Semantic / LSP-backed highlighting. - Theme authoring tooling; users supply standard `.tmTheme` files. ## Tradeoffs | Decision | Upside | Downside | | ------------------------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------- | | syntect over tree-sitter / arborium | ~1 MB binary increase for ~250 grammars + 32 themes; battle-tested crate powering widely-used tools (`bat`, `delta`, `xi-editor`). tree-sitter would add ~12 MB for 20-30 languages or ~35 MB for full coverage. | Regex-based; less structurally accurate than tree-sitter for some languages (e.g. language injections like JS-in-HTML). | | Global `RwLock<Theme>` | Enables live `/theme` preview without threading Theme through every call site | Lock contention risk (mitigated: reads vastly outnumber writes, single UI thread) | | Skip background / italic / underline from themes | Terminal BG preserved, avoids ugly rendering on some themes | Themes that rely on these properties lose fidelity | | Guardrails: 512 KB / 10k lines | Prevents pathological stalls on huge diffs or pastes | Very large files render without color | ## Architecture ``` config.toml ─[tui.theme]─> set_theme_override() ─> THEME (RwLock) │ ┌───────────────────────────────────────────┘ │ markdown_render ─── highlight_code_to_lines(code, lang) ─> Vec<Line> diff_render ─── highlight_code_to_styled_spans(code, lang) ─> Option<Vec<Vec<Span>>> │ │ (None ⇒ plain text fallback) │ /theme picker ─── set_syntax_theme(theme) // live preview swap ─── current_syntax_theme() // snapshot for cancel ─── resolve_theme_by_name(name) // lookup by kebab-case ``` Key files: - `tui/src/render/highlight.rs` -- engine, theme management, guardrails - `tui/src/diff_render.rs` -- syntax-aware diff line wrapping - `tui/src/theme_picker.rs` -- `/theme` command builder - `tui/src/bottom_pane/list_selection_view.rs` -- side content panel, callbacks - `core/src/config/types.rs` -- `Tui::theme` field - `core/src/config/edit.rs` -- `syntax_theme_edit()` helper ## Observability - `tracing::warn` when a configured theme name cannot be resolved. - `Config::startup_warnings` surfaces the same message as a TUI banner. - `tracing::error` when persisting theme selection fails. ## Tests - Unit tests in `highlight.rs`: language coverage, fallback behavior, CRLF stripping, style conversion, guardrail enforcement, theme name mapping exhaustiveness. - Unit tests in `diff_render.rs`: snapshot gallery at multiple terminal sizes (80x24, 94x35, 120x40), syntax-highlighted wrapping, large-diff guardrail, rename-to-different-extension highlighting, parser state preservation across hunk lines. - Unit tests in `theme_picker.rs`: preview rendering (wide + narrow), dim overlay on deletions, subtitle truncation, cancel-restore, fallback for unavailable configured theme. - Unit tests in `list_selection_view.rs`: side layout geometry, stacked fallback, buffer clearing, cancel/selection-changed callbacks. - Integration test in `lib.rs`: theme warning uses the final (post-resume) config. ## Cargo Deny: Unmaintained Dependency Exceptions This PR adds two `cargo deny` advisory exceptions for transitive dependencies pulled in by `syntect v5.3.0`: | Advisory | Crate | Status | |----------|-------|--------| | RUSTSEC-2024-0320 | `yaml-rust` | Unmaintained (maintainer unreachable) | | RUSTSEC-2025-0141 | `bincode` | Unmaintained (development ceased; v1.3.3 considered complete) | **Why this is safe in our usage:** - Neither advisory describes a known security vulnerability. Both are "unmaintained" notices only. - `bincode` is used by syntect to deserialize pre-compiled syntax sets. Again, these are **static vendored artifacts** baked into the binary at build time. No user-supplied bincode data is ever deserialized. - Attack surface is zero for both crates; exploitation would require a supply-chain compromise of our own build artifacts. - These exceptions can be removed when syntect migrates to `yaml-rust2` and drops `bincode`, or when alternative crates are available upstream.
This commit is contained in:
@@ -652,10 +652,83 @@ fn link() {
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn code_block_unhighlighted() {
|
||||
fn code_block_known_lang_has_syntax_colors() {
|
||||
let text = render_markdown_text("```rust\nfn main() {}\n```\n");
|
||||
let expected = Text::from_iter([Line::from_iter(["", "fn main() {}"])]);
|
||||
assert_eq!(text, expected);
|
||||
let content: Vec<String> = text
|
||||
.lines
|
||||
.iter()
|
||||
.map(|l| {
|
||||
l.spans
|
||||
.iter()
|
||||
.map(|s| s.content.clone())
|
||||
.collect::<String>()
|
||||
})
|
||||
.collect();
|
||||
// Content should be preserved; ignore trailing empty line from highlighting.
|
||||
let content: Vec<&str> = content
|
||||
.iter()
|
||||
.map(std::string::String::as_str)
|
||||
.filter(|s| !s.is_empty())
|
||||
.collect();
|
||||
assert_eq!(content, vec!["fn main() {}"]);
|
||||
|
||||
// At least one span should have non-default style (syntax highlighting).
|
||||
let has_colored_span = text
|
||||
.lines
|
||||
.iter()
|
||||
.flat_map(|l| l.spans.iter())
|
||||
.any(|sp| sp.style.fg.is_some());
|
||||
assert!(has_colored_span, "expected syntax-highlighted spans with color");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn code_block_unknown_lang_plain() {
|
||||
let text = render_markdown_text("```xyzlang\nhello world\n```\n");
|
||||
let content: Vec<String> = text
|
||||
.lines
|
||||
.iter()
|
||||
.map(|l| {
|
||||
l.spans
|
||||
.iter()
|
||||
.map(|s| s.content.clone())
|
||||
.collect::<String>()
|
||||
})
|
||||
.collect();
|
||||
let content: Vec<&str> = content
|
||||
.iter()
|
||||
.map(std::string::String::as_str)
|
||||
.filter(|s| !s.is_empty())
|
||||
.collect();
|
||||
assert_eq!(content, vec!["hello world"]);
|
||||
|
||||
// No syntax coloring for unknown language — all spans have default style.
|
||||
let has_colored_span = text
|
||||
.lines
|
||||
.iter()
|
||||
.flat_map(|l| l.spans.iter())
|
||||
.any(|sp| sp.style.fg.is_some());
|
||||
assert!(!has_colored_span, "expected no syntax coloring for unknown lang");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn code_block_no_lang_plain() {
|
||||
let text = render_markdown_text("```\nno lang specified\n```\n");
|
||||
let content: Vec<String> = text
|
||||
.lines
|
||||
.iter()
|
||||
.map(|l| {
|
||||
l.spans
|
||||
.iter()
|
||||
.map(|s| s.content.clone())
|
||||
.collect::<String>()
|
||||
})
|
||||
.collect();
|
||||
let content: Vec<&str> = content
|
||||
.iter()
|
||||
.map(std::string::String::as_str)
|
||||
.filter(|s| !s.is_empty())
|
||||
.collect();
|
||||
assert_eq!(content, vec!["no lang specified"]);
|
||||
}
|
||||
|
||||
#[test]
|
||||
@@ -721,16 +794,25 @@ Here is a code block that shows another fenced block:
|
||||
.collect::<String>()
|
||||
})
|
||||
.collect();
|
||||
// Filter empty trailing lines for stability; the code block may or may
|
||||
// not emit a trailing blank depending on the highlighting path.
|
||||
let trimmed: Vec<&str> = {
|
||||
let mut v: Vec<&str> = lines.iter().map(std::string::String::as_str).collect();
|
||||
while v.last() == Some(&"") {
|
||||
v.pop();
|
||||
}
|
||||
v
|
||||
};
|
||||
assert_eq!(
|
||||
lines,
|
||||
trimmed,
|
||||
vec![
|
||||
"Here is a code block that shows another fenced block:".to_string(),
|
||||
String::new(),
|
||||
"```md".to_string(),
|
||||
"# Inside fence".to_string(),
|
||||
"- bullet".to_string(),
|
||||
"- `inline code`".to_string(),
|
||||
"```".to_string(),
|
||||
"Here is a code block that shows another fenced block:",
|
||||
"",
|
||||
"```md",
|
||||
"# Inside fence",
|
||||
"- bullet",
|
||||
"- `inline code`",
|
||||
"```",
|
||||
]
|
||||
);
|
||||
}
|
||||
@@ -993,3 +1075,36 @@ fn nested_item_continuation_paragraph_is_indented() {
|
||||
]);
|
||||
assert_eq!(text, expected);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn code_block_preserves_trailing_blank_lines() {
|
||||
// A fenced code block with an intentional trailing blank line must keep it.
|
||||
let md = "```rust\nfn main() {}\n\n```\n";
|
||||
let text = render_markdown_text(md);
|
||||
let content: Vec<String> = text
|
||||
.lines
|
||||
.iter()
|
||||
.map(|l| {
|
||||
l.spans
|
||||
.iter()
|
||||
.map(|s| s.content.clone())
|
||||
.collect::<String>()
|
||||
})
|
||||
.collect();
|
||||
// Should have: "fn main() {}" then "" (the blank line).
|
||||
// Filter only to content lines (skip leading/trailing empty from rendering).
|
||||
assert!(
|
||||
content.iter().any(|c| c == "fn main() {}"),
|
||||
"expected code line, got {content:?}"
|
||||
);
|
||||
// The trailing blank line inside the fence should be preserved.
|
||||
let code_start = content.iter().position(|c| c == "fn main() {}").unwrap();
|
||||
assert!(
|
||||
content.len() > code_start + 1,
|
||||
"expected a line after 'fn main() {{}}' but content ends: {content:?}"
|
||||
);
|
||||
assert_eq!(
|
||||
content[code_start + 1], "",
|
||||
"trailing blank line inside code fence was lost: {content:?}"
|
||||
);
|
||||
}
|
||||
|
||||
Reference in New Issue
Block a user