FIX: WSL Paste image does not work (#6793)

## Related issues:  
- https://github.com/openai/codex/issues/3939  
- https://github.com/openai/codex/issues/2292  
- https://github.com/openai/codex/issues/7528 (after the correction in
https://github.com/openai/codex/pull/3990)

**Area:** `codex-cli` (image handling / clipboard & file uploads)  
**Platforms affected:** WSL (Ubuntu on Windows 10/11). No behavior
change on native Linux/macOS/Windows.

## Summary

This PR fixes image pasting and file uploads when running `codex-cli`
inside WSL. Previously, image operations failed silently or with
permission errors because paths weren't properly mapped between Windows
and WSL filesystems.

## Visual Result

<img width="1118" height="798" alt="image"
src="https://github.com/user-attachments/assets/14e10bc4-6b71-4d1f-b2a6-52c0a67dd069"
/>

## Last Rust-Cli

<img width="1175" height="859" alt="image"
src="https://github.com/user-attachments/assets/7ef41e29-9118-42c9-903c-7116d21e1751"
/>

## Root cause

The CLI assumed native Linux/Windows environments and didn't handle the
WSL↔Windows boundary:

- Used Linux paths for files that lived on the Windows host
- Missing path normalization between Windows (`C:\...`) and WSL
(`/mnt/c/...`)
- Clipboard access failed under WSL

### Why `Ctrl+V` doesn't work in WSL terminals

Most terminal emulators used with WSL (Windows Terminal, ConEmu, etc.)
intercept `Ctrl+V` at the terminal level to paste text from the Windows
clipboard. The keypress never reaches the CLI application, so the
clipboard image handler is never triggered. Under WSL, users need to
press `Ctrl+Alt+V` instead.

## Changes

### WSL detection & path mapping

- Detects WSL by checking `/proc/version` for "microsoft" or "WSL",
falling back to the `WSL_DISTRO_NAME` and `WSL_INTEROP` env vars
- Maps Windows drive paths to WSL mount paths (`C:\...` → `/mnt/c/...`)
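
The drive-path mapping described above can be illustrated with a small sketch. This is not the code from this PR: `map_windows_drive_path` is a hypothetical name, and the sketch assumes the default `/mnt` automount root (WSL allows a different root to be configured in `/etc/wsl.conf`, which this sketch ignores).

```rust
use std::path::PathBuf;

/// Hypothetical helper (not the PR's implementation): map a Windows drive
/// path such as `C:\Users\a.png` to `/mnt/c/Users/a.png`.
/// Assumes the default `/mnt` automount root.
/// Returns `None` for anything that is not an absolute drive-letter path.
fn map_windows_drive_path(input: &str) -> Option<PathBuf> {
    let mut chars = input.chars();
    let drive = chars.next()?;
    if !drive.is_ascii_alphabetic() || chars.next()? != ':' {
        return None;
    }
    let rest = chars.as_str();
    // Require a separator after the colon so `C:relative` is rejected.
    if !(rest.starts_with('\\') || rest.starts_with('/')) {
        return None;
    }
    let mut mapped = PathBuf::from(format!("/mnt/{}", drive.to_ascii_lowercase()));
    // Accept both separators and skip empty components.
    for part in rest
        .split(|c: char| c == '\\' || c == '/')
        .filter(|p| !p.is_empty())
    {
        mapped.push(part);
    }
    Some(mapped)
}
```

Returning `Option` lets a caller fall through to treating the input as an ordinary Linux path when it does not look like a drive path.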

### Clipboard fallback for WSL

- When clipboard access fails under WSL, falls back to PowerShell to
extract images from the Windows clipboard
- Saves to a temp file and maps the path back to WSL

### UI improvements

- Shows `Ctrl+Alt+V` hint on WSL (many terminals intercept plain
`Ctrl+V`)
- Better error messages for unreadable images
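
As a rough illustration of the WSL-aware hint, the sketch below models the paste shortcut as an ordered list of bindings. The types are simplified stand-ins for those in `footer.rs`, the label strings are made up for the example, and it assumes the overlay displays the first binding whose condition holds.

```rust
/// Simplified stand-in for the footer's display conditions.
#[derive(Clone, Copy)]
enum DisplayCondition {
    Always,
    WhenUnderWsl,
}

/// Simplified stand-in for the footer state; only the WSL flag matters here.
struct ShortcutsState {
    is_wsl: bool,
}

impl DisplayCondition {
    fn matches(self, state: &ShortcutsState) -> bool {
        match self {
            DisplayCondition::Always => true,
            DisplayCondition::WhenUnderWsl => state.is_wsl,
        }
    }
}

/// Hypothetical binding: a display label plus the condition that enables it.
struct ShortcutBinding {
    label: &'static str,
    condition: DisplayCondition,
}

// Order matters: the WSL-specific binding comes before the general one.
const PASTE_IMAGE_BINDINGS: &[ShortcutBinding] = &[
    ShortcutBinding {
        label: "ctrl + alt + v",
        condition: DisplayCondition::WhenUnderWsl,
    },
    ShortcutBinding {
        label: "ctrl + v",
        condition: DisplayCondition::Always,
    },
];

/// Return the label of the first binding whose condition matches (assumed rule).
fn paste_image_hint(state: &ShortcutsState) -> Option<&'static str> {
    PASTE_IMAGE_BINDINGS
        .iter()
        .find(|binding| binding.condition.matches(state))
        .map(|binding| binding.label)
}
```

Because the `Always` binding is last, it acts as the fallback on every non-WSL platform without any extra branching.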

## Performance

- Negligible overhead. The fallback adds a single FS copy to a temp file
only when needed.
- Direct streaming remains the default.

## Files changed

- `protocol/src/lib.rs` – Added platform detection module  
- `protocol/src/models.rs` – Added WSL path mapping for local images  
- `protocol/src/platform.rs` – New module with WSL detection utilities  
- `tui/src/bottom_pane/chat_composer.rs` – Added base64 data URL support
and WSL path mapping
- `tui/src/bottom_pane/footer.rs` – WSL-aware keyboard shortcuts  
- `tui/src/clipboard_paste.rs` – PowerShell clipboard fallback

## How to reproduce the original bug (pre-fix)

1. Run `codex-cli` inside WSL2 on Windows.  
2. Paste an image from the Windows clipboard or drag an image from
`C:\...` into the terminal.
3. Observe that the image is not attached (silent failure) or an error
is logged; no artifact reaches the tool.

## How to verify the fix

1. Build this branch and run `codex-cli` inside WSL2.  
2. Paste from clipboard and drag from both Windows and WSL paths.  
3. Confirm that the image appears in the tool and the CLI shows a single
concise info line (no warning unless fallback was used).

I’m happy to adjust paths, naming, or split helpers into a separate
module if you prefer.

## How to try this branch

If you want to try this before it’s merged, you can use my Git branch:

Repository: https://github.com/Waxime64/codex.git  
Branch: `wsl-image-2`

1. Start WSL on your Windows machine.
2. Clone the repository and switch to the branch:
   ```bash
   git clone https://github.com/Waxime64/codex.git
   cd codex
   git checkout wsl-image-2
   # then go into the Rust workspace root, e.g.:
   cd codex-rs
   ```
3. Build the TUI binary:
   ```bash
   cargo build -p codex-tui --bin codex-tui --release
   ```
4. Install the binary:
   ```bash
   sudo install -m 0755 target/release/codex-tui /usr/local/bin/codex
   ```
5. From the project directory where you want to use Codex, start it
   with:
   ```bash
   cd /path/to/your/project
   /usr/local/bin/codex
   ```

On WSL, use `Ctrl+Alt+V` to paste an image from the Windows clipboard
into the chat.
This commit is contained in:
Maxime Savard
2025-12-04 13:50:20 -05:00
committed by GitHub
parent 37c36024c7
commit ce0b38c056
4 changed files with 162 additions and 61 deletions


@@ -257,6 +257,8 @@ impl ChatComposer {
return false;
};
// normalize_pasted_path already handles Windows → WSL path conversion,
// so we can directly try to read the image dimensions.
match image::image_dimensions(&path_buf) {
Ok((w, h)) => {
tracing::info!("OK: {pasted}");


@@ -1,3 +1,5 @@
#[cfg(target_os = "linux")]
use crate::clipboard_paste::is_probably_wsl;
use crate::key_hint;
use crate::key_hint::KeyBinding;
use crate::render::line_utils::prefix_lines;
@@ -94,10 +96,19 @@ fn footer_lines(props: FooterProps) -> Vec<Line<'static>> {
]);
vec![line]
}
FooterMode::ShortcutOverlay => shortcut_overlay_lines(ShortcutsState {
use_shift_enter_hint: props.use_shift_enter_hint,
esc_backtrack_hint: props.esc_backtrack_hint,
}),
FooterMode::ShortcutOverlay => {
#[cfg(target_os = "linux")]
let is_wsl = is_probably_wsl();
#[cfg(not(target_os = "linux"))]
let is_wsl = false;
let state = ShortcutsState {
use_shift_enter_hint: props.use_shift_enter_hint,
esc_backtrack_hint: props.esc_backtrack_hint,
is_wsl,
};
shortcut_overlay_lines(state)
}
FooterMode::EscHint => vec![esc_hint_line(props.esc_backtrack_hint)],
FooterMode::ContextOnly => vec![context_window_line(
props.context_window_percent,
@@ -115,6 +126,7 @@ struct CtrlCReminderState {
struct ShortcutsState {
use_shift_enter_hint: bool,
esc_backtrack_hint: bool,
is_wsl: bool,
}
fn ctrl_c_reminder_line(state: CtrlCReminderState) -> Line<'static> {
@@ -271,6 +283,7 @@ enum DisplayCondition {
Always,
WhenShiftEnterHint,
WhenNotShiftEnterHint,
WhenUnderWSL,
}
impl DisplayCondition {
@@ -279,6 +292,7 @@ impl DisplayCondition {
DisplayCondition::Always => true,
DisplayCondition::WhenShiftEnterHint => state.use_shift_enter_hint,
DisplayCondition::WhenNotShiftEnterHint => !state.use_shift_enter_hint,
DisplayCondition::WhenUnderWSL => state.is_wsl,
}
}
}
@@ -352,10 +366,18 @@ const SHORTCUTS: &[ShortcutDescriptor] = &[
},
ShortcutDescriptor {
id: ShortcutId::PasteImage,
bindings: &[ShortcutBinding {
key: key_hint::ctrl(KeyCode::Char('v')),
condition: DisplayCondition::Always,
}],
// Show Ctrl+Alt+V when running under WSL (terminals often intercept plain
// Ctrl+V); otherwise fall back to Ctrl+V.
bindings: &[
ShortcutBinding {
key: key_hint::ctrl_alt(KeyCode::Char('v')),
condition: DisplayCondition::WhenUnderWSL,
},
ShortcutBinding {
key: key_hint::ctrl(KeyCode::Char('v')),
condition: DisplayCondition::Always,
},
],
prefix: "",
label: " to paste images",
},


@@ -2,7 +2,7 @@ use std::path::Path;
use std::path::PathBuf;
use tempfile::Builder;
#[derive(Debug)]
#[derive(Debug, Clone)]
pub enum PasteImageError {
ClipboardUnavailable(String),
NoImage(String),
@@ -119,19 +119,113 @@ pub fn paste_image_as_png() -> Result<(Vec<u8>, PastedImageInfo), PasteImageErro
/// Convenience: write to a temp file and return its path + info.
#[cfg(not(target_os = "android"))]
pub fn paste_image_to_temp_png() -> Result<(PathBuf, PastedImageInfo), PasteImageError> {
let (png, info) = paste_image_as_png()?;
// Create a unique temporary file with a .png suffix to avoid collisions.
let tmp = Builder::new()
.prefix("codex-clipboard-")
.suffix(".png")
.tempfile()
.map_err(|e| PasteImageError::IoError(e.to_string()))?;
std::fs::write(tmp.path(), &png).map_err(|e| PasteImageError::IoError(e.to_string()))?;
// Persist the file (so it remains after the handle is dropped) and return its PathBuf.
let (_file, path) = tmp
.keep()
.map_err(|e| PasteImageError::IoError(e.error.to_string()))?;
Ok((path, info))
// First attempt: read image from system clipboard via arboard (native paths or image data).
match paste_image_as_png() {
Ok((png, info)) => {
// Create a unique temporary file with a .png suffix to avoid collisions.
let tmp = Builder::new()
.prefix("codex-clipboard-")
.suffix(".png")
.tempfile()
.map_err(|e| PasteImageError::IoError(e.to_string()))?;
std::fs::write(tmp.path(), &png)
.map_err(|e| PasteImageError::IoError(e.to_string()))?;
// Persist the file (so it remains after the handle is dropped) and return its PathBuf.
let (_file, path) = tmp
.keep()
.map_err(|e| PasteImageError::IoError(e.error.to_string()))?;
Ok((path, info))
}
Err(e) => {
#[cfg(target_os = "linux")]
{
try_wsl_clipboard_fallback(&e).or(Err(e))
}
#[cfg(not(target_os = "linux"))]
{
Err(e)
}
}
}
}
/// Attempt WSL fallback for clipboard image paste.
///
/// If clipboard is unavailable (common under WSL because arboard cannot access
/// the Windows clipboard), attempt a WSL fallback that calls PowerShell on the
/// Windows side to write the clipboard image to a temporary file, then return
/// the corresponding WSL path.
#[cfg(target_os = "linux")]
fn try_wsl_clipboard_fallback(
error: &PasteImageError,
) -> Result<(PathBuf, PastedImageInfo), PasteImageError> {
use PasteImageError::ClipboardUnavailable;
use PasteImageError::NoImage;
if !is_probably_wsl() || !matches!(error, ClipboardUnavailable(_) | NoImage(_)) {
return Err(error.clone());
}
tracing::debug!("attempting Windows PowerShell clipboard fallback");
let Some(win_path) = try_dump_windows_clipboard_image() else {
return Err(error.clone());
};
tracing::debug!("powershell produced path: {}", win_path);
let Some(mapped_path) = convert_windows_path_to_wsl(&win_path) else {
return Err(error.clone());
};
let Ok((w, h)) = image::image_dimensions(&mapped_path) else {
return Err(error.clone());
};
// Return the mapped path directly without copying.
// The file will be read and base64-encoded during serialization.
Ok((
mapped_path,
PastedImageInfo {
width: w,
height: h,
encoded_format: EncodedImageFormat::Png,
},
))
}
/// Try to call a Windows PowerShell command (several common names) to save the
/// clipboard image to a temporary PNG and return the Windows path to that file.
/// Returns None if no command succeeded or no image was present.
#[cfg(target_os = "linux")]
fn try_dump_windows_clipboard_image() -> Option<String> {
// Powershell script: save image from clipboard to a temp png and print the path.
// Force UTF-8 output to avoid encoding issues between powershell.exe (UTF-16LE default)
// and pwsh (UTF-8 default).
let script = r#"[Console]::OutputEncoding = [System.Text.Encoding]::UTF8; $img = Get-Clipboard -Format Image; if ($img -ne $null) { $p=[System.IO.Path]::GetTempFileName(); $p = [System.IO.Path]::ChangeExtension($p,'png'); $img.Save($p,[System.Drawing.Imaging.ImageFormat]::Png); Write-Output $p } else { exit 1 }"#;
for cmd in ["powershell.exe", "pwsh", "powershell"] {
match std::process::Command::new(cmd)
.args(["-NoProfile", "-Command", script])
.output()
{
// Executing PowerShell command
Ok(output) => {
if output.status.success() {
// Decode as UTF-8 (forced by the script above).
let win_path = String::from_utf8_lossy(&output.stdout).trim().to_string();
if !win_path.is_empty() {
tracing::debug!("{} saved clipboard image to {}", cmd, win_path);
return Some(win_path);
}
} else {
tracing::debug!("{} returned non-zero status", cmd);
}
}
Err(err) => {
tracing::debug!("{} not executable: {}", cmd, err);
}
}
}
None
}
#[cfg(target_os = "android")]
@@ -202,10 +296,19 @@ pub fn normalize_pasted_path(pasted: &str) -> Option<PathBuf> {
}
#[cfg(target_os = "linux")]
fn is_probably_wsl() -> bool {
std::env::var_os("WSL_DISTRO_NAME").is_some()
|| std::env::var_os("WSL_INTEROP").is_some()
|| std::env::var_os("WSLENV").is_some()
pub(crate) fn is_probably_wsl() -> bool {
// Primary: Check /proc/version for "microsoft" or "WSL" (most reliable for standard WSL).
if let Ok(version) = std::fs::read_to_string("/proc/version") {
let version_lower = version.to_lowercase();
if version_lower.contains("microsoft") || version_lower.contains("wsl") {
return true;
}
}
// Fallback: Check WSL environment variables. This handles edge cases like
// custom Linux kernels installed in WSL where /proc/version may not contain
// "microsoft" or "WSL".
std::env::var_os("WSL_DISTRO_NAME").is_some() || std::env::var_os("WSL_INTEROP").is_some()
}
#[cfg(target_os = "linux")]
@@ -253,40 +356,6 @@ pub fn pasted_image_format(path: &Path) -> EncodedImageFormat {
#[cfg(test)]
mod pasted_paths_tests {
use super::*;
#[cfg(target_os = "linux")]
use std::ffi::OsString;
#[cfg(target_os = "linux")]
struct EnvVarGuard {
key: &'static str,
original: Option<OsString>,
}
#[cfg(target_os = "linux")]
impl EnvVarGuard {
fn set(key: &'static str, value: &str) -> Self {
let original = std::env::var_os(key);
unsafe {
std::env::set_var(key, value);
}
Self { key, original }
}
}
#[cfg(target_os = "linux")]
impl Drop for EnvVarGuard {
fn drop(&mut self) {
if let Some(original) = &self.original {
unsafe {
std::env::set_var(self.key, original);
}
} else {
unsafe {
std::env::remove_var(self.key);
}
}
}
}
#[cfg(not(windows))]
#[test]
@@ -420,7 +489,11 @@ mod pasted_paths_tests {
#[cfg(target_os = "linux")]
#[test]
fn normalize_windows_path_in_wsl() {
let _guard = EnvVarGuard::set("WSL_DISTRO_NAME", "Ubuntu-24.04");
// This test only runs on actual WSL systems
if !is_probably_wsl() {
// Skip test if not on WSL
return;
}
let input = r"C:\\Users\\Alice\\Pictures\\example image.png";
let result = normalize_pasted_path(input).expect("should convert windows path on wsl");
assert_eq!(


@@ -49,6 +49,10 @@ pub(crate) const fn ctrl(key: KeyCode) -> KeyBinding {
KeyBinding::new(key, KeyModifiers::CONTROL)
}
pub(crate) const fn ctrl_alt(key: KeyCode) -> KeyBinding {
KeyBinding::new(key, KeyModifiers::CONTROL.union(KeyModifiers::ALT))
}
fn modifiers_to_string(modifiers: KeyModifiers) -> String {
let mut result = String::new();
if modifiers.contains(KeyModifiers::CONTROL) {