Commit Graph

17 Commits

Author SHA1 Message Date
Shijie Rao
d544adf71a Feat: plan mode prompt update (#9495)
### Summary
* Added instruction on using `request_user_input`
* Added the output to be json with `plan` key and the actual plan as the
value.
* Remove `PLAN.md` write because that gets into sandbox issue. We can
add it back later.
2026-01-19 12:17:29 -08:00
jif-oai
3c28c85063 prompt 3 (#9479) 2026-01-19 11:56:38 +00:00
Ahmed Ibrahim
f72f87fbee Add collaboration modes test prompts (#9443)
# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.
2026-01-18 11:39:08 -08:00
jif-oai
7905e99d03 prompt collab (#9367) 2026-01-16 15:12:41 +01:00
jif-oai
393a5a0311 chore: better orchestrator prompt (#9301) 2026-01-15 18:11:43 +00:00
jif-oai
05b960671d feat: add agent roles to collab tools (#9275)
Add `agent_type` parameter to the collab tool `spawn_agent` that
contains a preset to apply on the config when spawning this agent
2026-01-15 13:33:52 +00:00
jif-oai
3d322fa9d8 feat: add collab prompt (#9208)
Adding a prompt for collab tools. This is only for internal use and the
prompt won't be gated for now as it is not stable yet.

The goal of this PR is to provide the tool required to iterate on the
prompt
2026-01-14 09:58:15 -08:00
Eric Traut
c4af707e09 Removed experimental "command risk assessment" feature (#7799)
This experimental feature received lukewarm reception during internal
testing. Removing from the code base.
2025-12-10 09:48:11 -08:00
jif-oai
6382dc2338 chore: enable parallel tc (#7589) 2025-12-09 17:00:56 +00:00
jif-oai
4985a7a444 fix: parallel tool call instruction injection (#6893) 2025-11-19 11:01:57 +00:00
jif-oai
f5d9939cda feat: enable parallel tool calls (#6796) 2025-11-18 17:10:14 +00:00
Ahmed Ibrahim
0b28e72b66 Improve compact (#6692)
This PR does the following:
- Add compact prefix to the summary
- Change the compaction prompt
- Allow multiple compaction for long running tasks
- Filter out summary messages on the following compaction

Considerations:
- Filtering out the summary message isn't the most clean
- Theoretically, we can end up in infinite compaction loop if the user
messages > compaction limit . However, that's not possible in today's
code because we have hard cap on user messages.
- We need to address having multiple user messages because it confuses
the model.

Testing:
- Making sure that after compact we always end up with one user message
(task) and one summary, even on multiple compaction.
2025-11-15 07:17:51 +00:00
Eric Traut
d5853d9c47 Changes to sandbox command assessment feature based on initial experiment feedback (#6091)
* Removed sandbox risk categories; feedback indicates that these are not
that useful and "less is more"
* Tweaked the assessment prompt to generate terser answers
* Fixed bug in orchestrator that prevents this feature from being
exposed in the extension
2025-11-01 14:52:23 -07:00
jif-oai
611e00c862 feat: compactor 2 (#6027)
Co-authored-by: pakrym-oai <pakrym@openai.com>
2025-10-31 14:27:08 -07:00
Ahmed Ibrahim
13e1d0362d Delegate review to codex instance (#5572)
In this PR, I am exploring migrating task kind to an invocation of
Codex. The main reason would be getting rid off multiple
`ConversationHistory` state and streamlining our context/history
management.

This approach depends on opening a channel between the sub-codex and
codex. This channel is responsible for forwarding `interactive`
(`approvals`) and `non-interactive` events. The `task` is responsible
for handling those events.

This opens the door for implementing `codex as a tool`, replacing
`compact` and `review`, and potentially subagents.

One consideration is this code is very similar to `app-server` specially
in the approval part. If in the future we wanted an interactive
`sub-codex` we should consider using `codex-mcp`
2025-10-29 21:04:25 +00:00
Eric Traut
f8af4f5c8d Added model summary and risk assessment for commands that violate sandbox policy (#5536)
This PR adds support for a model-based summary and risk assessment for
commands that violate the sandbox policy and require user approval. This
aids the user in evaluating whether the command should be approved.

The feature works by taking a failed command and passing it back to the
model and asking it to summarize the command, give it a risk level (low,
medium, high) and a risk category (e.g. "data deletion" or "data
exfiltration"). It uses a new conversation thread so the context in the
existing thread doesn't influence the answer. If the call to the model
fails or takes longer than 5 seconds, it falls back to the current
behavior.

For now, this is an experimental feature and is gated by a config key
`experimental_sandbox_command_assessment`.

Here is a screen shot of the approval prompt showing the risk assessment
and summary.

<img width="723" height="282" alt="image"
src="https://github.com/user-attachments/assets/4597dd7c-d5a0-4e9f-9d13-414bd082fd6b"
/>
2025-10-24 15:23:44 -07:00
jif-oai
ea225df22e feat: context compaction (#3446)
## Compact feature:
1. Stops the model when the context window become too large
2. Add a user turn, asking for the model to summarize
3. Build a bridge that contains all the previous user message + the
summary. Rendered from a template
4. Start sampling again from a clean conversation with only that bridge
2025-09-12 13:07:10 -07:00