Files
codex/codex-rs/core/src/session
Eric Traut 9dda71dbae Warn on invalid UTF-8 in AGENTS.md files (#23232)
Fixes #23223.

## Why

Malformed AGENTS instructions should not fail silently. The reported
issue had invalid UTF-8 in a global `AGENTS.md`; before this change,
Codex treated that decode failure like a missing file, so the personal
instructions disappeared without a user-visible explanation and the
rollout had no `# AGENTS.md instructions` block.

Project-level AGENTS files already used lossy decoding, so their
instructions still appeared, but invalid bytes were replaced without
telling the user. Global and project AGENTS files should behave
consistently: keep usable instruction text when possible, and surface a
diagnostic when bytes had to be replaced.

## What changed

Global `AGENTS.override.md` and `AGENTS.md` loading now reads bytes and
decodes with replacement characters on invalid UTF-8, matching
project-level AGENTS behavior. Both global and project AGENTS loading
now emit a startup warning when invalid UTF-8 is found, and both keep
the instruction text with invalid byte sequences replaced.

Missing files, non-file candidates, empty files, and the existing
`AGENTS.override.md` before `AGENTS.md` precedence keep their current
behavior.

## How users see it

The warnings flow through the existing startup warning surface.
App-server clients receive config-time startup warnings as
`configWarning` notifications during initialization, and thread startup
emits startup warnings as thread-scoped `warning` notifications.

Global AGENTS invalid UTF-8 warnings can appear on both surfaces.
Project-level AGENTS invalid UTF-8 warnings are discovered while
building thread instructions, so they appear as thread-scoped `warning`
notifications. Clients that render warning notifications in the
conversation surface show the message as a visible diagnostic instead of
silently hiding or altering instructions.
2026-05-19 21:56:46 -07:00
..