Merge branch 'agentydragon-15-agent-worktree-sandbox-configuration' into agentydragon

Rai (Michael Pokorny)
2025-06-25 00:19:41 -07:00
4 changed files with 103 additions and 20 deletions


@@ -1,17 +1,17 @@
+++
id = "15"
title = "Agent Worktree Sandbox Configuration"
status = "Not started"
dependencies = "" # No prerequisites
last_updated = "2025-06-25T01:40:09.512675"
status = "Done"
dependencies = "02,07,09,11,14,29"
last_updated = "2025-06-25T12:00:00.000000"
+++
# Task 15: Agent Worktree Sandbox Configuration
## Status
**General Status**: Not started
**Summary**: Enhance the task scaffolding script to launch a Codex agent in a sandboxed worktree where only the task directory (and system temp dir) is writable, Git commands run without prompts, and all file I/O under the worktree is auto-approved.
**General Status**: Done
**Summary**: Enhanced the task scaffolding script to launch a Codex agent in a sandboxed worktree with writable worktree and TMPDIR, auto-approved file I/O and Git operations, and network disabled.
## Goal
@@ -34,17 +34,18 @@ The `create-task-worktree.sh --agent` invocation:
## Implementation
**How it was implemented**
*(Not implemented yet)*
- Modify `create-task-worktree.sh --agent`:
- Detect `$TMPDIR` (or default `/tmp`) and include it in the writable mount list.
- Invoke the agent via `codex debug landlock` (or chosen sandbox command) with `--writable-root` for the worktree and tempdir.
- Add approval predicates to auto-allow any file I/O under the worktree path and Git commands there.
- Update the script's help text (`-h|--help`) to document the sandbox behavior and tempdir whitelist.
- Add tests or example runs verifying sandbox restrictions and approvals.
- Extended `create-task-worktree.sh` `--agent` mode to launch the Codex agent under a Landlock+seccomp sandbox by invoking `codex debug landlock --full-auto`, which grants write access only to the worktree (`cwd`) and the platform temp folder (`TMPDIR`), and disables network.
- Updated the `-a|--agent` help text to reflect the new sandbox behavior and tempdir whitelist.
- Added `agentydragon/tasks/15-sandbox-test.sh`, a test script demonstrating allowed writes inside the worktree and TMPDIR and blocked writes to directories outside those paths.
**How it works**
*(Not implemented yet)*
When `--agent` is used, the script switches to the task worktree, then starts the sandbox so that only the worktree and the system tempdir are writable. Inside that sandbox, Git and other file operations under the worktree proceed without prompts, while writes elsewhere on the host are blocked.
When invoked with `--agent`, `create-task-worktree.sh` changes into the task worktree and launches:
```bash
codex debug landlock --full-auto codex "$(< "$repo_root/agentydragon/prompts/developer.md")"
```
The `--full-auto` flag configures Landlock to allow disk writes under the current directory and the system temp directory, disable network access, and automatically approve commands on success. As a result, any file I/O and Git operations in the worktree proceed without approval prompts, while writes outside the worktree and TMPDIR are blocked by the sandbox.
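The write policy described above can be approximated in plain shell when reasoning about which paths are expected to be writable. This is only an illustrative sketch: `is_writable_under_policy` is a hypothetical helper, not part of `create-task-worktree.sh` or of Codex, and it does simple prefix matching on path strings rather than enforcing anything.

```bash
#!/usr/bin/env bash
# Hypothetical helper: predicts whether a path falls under the writable roots
# described above (worktree and temp dir). It enforces nothing by itself.
is_writable_under_policy() {
  local target="$1" worktree="$2" tmpdir="${3:-${TMPDIR:-/tmp}}"
  local root
  for root in "$worktree" "$tmpdir"; do
    root="${root%/}"   # strip one trailing slash so "/tmp/" and "/tmp" behave alike
    case "$target" in
      "$root"|"$root"/*) return 0 ;;   # the root itself or anything beneath it
    esac
  done
  return 1
}
```

For example, `is_writable_under_policy "$HOME/sandbox_test_dir/outside_test" "$PWD"` is expected to return non-zero unless `$HOME` lies under the worktree or the temp dir. Symlinks and `..` components are not resolved in this sketch.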
## Notes


@@ -0,0 +1,40 @@
#!/usr/bin/env bash
# Test script for Task 15: verify sandbox restrictions and allowances
set -euo pipefail
# Determine worktree root (script is placed under agentydragon/tasks, two levels below the root)
worktree_root="$(cd "$(dirname "$0")"/../.. && pwd)"
echo "Running sandbox tests in worktree: $worktree_root"
# Test write inside worktree
echo -n "Test: write inside worktree... "
if codex debug landlock --full-auto /usr/bin/env bash -c "touch '$worktree_root/inside_test'"; then
echo "PASS"
else
echo "FAIL" >&2
exit 1
fi
# Test write inside TMPDIR
tmpdir=${TMPDIR:-/tmp}
echo -n "Test: write inside TMPDIR ($tmpdir)... "
if codex debug landlock --full-auto /usr/bin/env bash -c "touch '$tmpdir/tmp_test'"; then
echo "PASS"
else
echo "FAIL" >&2
exit 1
fi
# Prepare external directory under HOME to test outside worktree/TMPDIR
external_dir="$HOME/sandbox_test_dir"
mkdir -p "$external_dir"
rm -f "$external_dir/outside_test"
echo -n "Test: write outside allowed paths ($external_dir)... "
if codex debug landlock --full-auto /usr/bin/env bash -c "touch '$external_dir/outside_test'"; then
echo "FAIL: outside write succeeded" >&2
exit 1
else
echo "PASS"
fi


@@ -0,0 +1,41 @@
+++
id = "37"
title = "Session State Persistence and Debug Instrumentation"
status = "Not started"
dependencies = ""
last_updated = "2025-06-25T23:00:00.000000"
+++
## Summary
Persist session runtime state and capture raw request/response data and supplemental metadata to a session-specific directory.
## Goal
Collect and persist all relevant session state (beyond the rollout transcript) in a dedicated directory under `.codex/sessions/<UUID>/`, to aid debugging and allow post-mortem analysis.
## Acceptance Criteria
- All session data (transcript, logs, raw OpenAI API requests/responses, approval events, and other runtime metadata) is written under `.codex/sessions/<session_id>/`.
- Existing rollout transcript continues to be written to `sessions/rollout-<UUID>.jsonl`, now moved or linked into the session directory.
- Logging configuration respects `--debug-log` and writes to the session directory when set to a relative path.
- A selector flag (e.g. `--persist-session`) enables or disables writing persistent state.
- No change to default behavior when persistence is disabled (i.e. backward compatibility).
- Minimal integration test or manual verification steps demonstrate that files appear correctly and no extraneous error logs occur.
## Implementation
**How it was implemented**
- Add a new CLI flag `--persist-session` to the TUI and server binaries to enable session persistence.
- Compute a session directory under `$CODEX_HOME/sessions/<UUID>/` and create it at startup when persistence is enabled.
- After initializing the rollout file (`rollout-<UUID>.jsonl`), move or symlink it into the session directory.
- Configure tracing subscriber file layer and `--debug-log` default path to write logs into the same session directory (e.g. `session.log`).
- Instrument the OpenAI HTTP client layer to dump raw request and response bodies into `session_oai_raw.log` in that directory.
- In the message sequencing logic, add debug spans to record approval and cancellation events into `session_meta.log`.
**How it works**
- When `--persist-session` is active, all file outputs (rollout transcript, debug logs, raw API dumps, metadata logs) are collated under a single session directory.
- If disabled (default), writes occur in the existing locations (`rollout-<UUID>.jsonl`, `$CODEX_HOME/log/`), preserving current behavior.
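As a rough illustration of the path selection described above, the following shell sketch maps the persistence setting to output locations. The function name `session_paths` is hypothetical, and the assumption that the legacy rollout file sits under `$CODEX_HOME/sessions/` is taken from the acceptance criteria rather than from any existing code; the real logic would live in the Codex binaries, not in shell.

```bash
#!/usr/bin/env bash
# Illustrative only: prints where artifacts would land for one session.
# session_paths is a hypothetical name; file names follow the task description.
session_paths() {
  local codex_home="$1" uuid="$2" persist="${3:-false}"
  if [ "$persist" = true ]; then
    # Persistence enabled: everything is collated under one session directory.
    local dir="$codex_home/sessions/$uuid"
    printf 'rollout=%s\n' "$dir/rollout-$uuid.jsonl"
    printf 'log=%s\n' "$dir/session.log"
    printf 'raw=%s\n' "$dir/session_oai_raw.log"
    printf 'meta=%s\n' "$dir/session_meta.log"
  else
    # Persistence disabled (default): assumed existing layout.
    printf 'rollout=%s\n' "$codex_home/sessions/rollout-$uuid.jsonl"
    printf 'log_dir=%s\n' "$codex_home/log"
  fi
}
```

Calling `session_paths "$CODEX_HOME" "$uuid" true` lists the four per-session files, while omitting the third argument lists only the default locations.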
## Notes
- This feature streamlines troubleshooting by co-locating all session artifacts.
- Ensure directory creation and file writes handle permission errors gracefully and fall back cleanly when disabled.
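
One way to picture the graceful handling of directory-creation failures is the shell sketch below. It is an assumption-laden illustration, not the planned implementation: `ensure_session_dir` is a hypothetical helper, and the choice to warn on stderr and signal the caller to continue without persistence is only one possible policy.

```bash
#!/usr/bin/env bash
# Hypothetical helper: try to create the session directory. On failure, warn
# on stderr and return non-zero so the caller can continue without persistence.
ensure_session_dir() {
  local dir="$1"
  if mkdir -p "$dir" 2>/dev/null && [ -w "$dir" ]; then
    echo "$dir"   # success: report the usable directory on stdout
    return 0
  fi
  echo "warning: cannot use session directory $dir; persistence disabled" >&2
  return 1
}
```

A caller could write `session_dir="$(ensure_session_dir "$CODEX_HOME/sessions/$uuid")" || session_dir=""` and treat an empty value as "use the default locations".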


@@ -125,16 +125,17 @@ fi
echo "Done."
if [ "$agent_mode" = true ]; then
echo "Launching codex agent for task $task_slug in $worktree_path"
echo "Launching Codex agent for task $task_slug in sandboxed worktree"
prompt_file="$repo_root/agentydragon/prompts/developer.md"
if [ ! -f "$prompt_file" ]; then
echo "Error: developer prompt file not found at $prompt_file" >&2
exit 1
fi
cd "$worktree_path"
if [ "$interactive_mode" = true ]; then
codex "$(<"$prompt_file")"
else
codex exec "$(<"$prompt_file")"
# Launch the agent under Landlock+seccomp sandbox: writable only in cwd and TMPDIR, network disabled
cmd=(codex debug landlock --full-auto --cd "$worktree_path")
if [ "${interactive_mode:-}" != true ]; then
cmd+=(exec)
fi
prompt=$(<"$prompt_file")
"${cmd[@]}" "$prompt"
fi