Merge branch 'agentydragon-15-agent-worktree-sandbox-configuration' into agentydragon

2026-04-24 14:45:27 +00:00 · 2025-06-25 00:19:41 -07:00
parent 05a49d8036 e83f5e8e6c
commit e6f8f37104
4 changed files with 103 additions and 20 deletions
--- a/agentydragon/tasks/15-agent-worktree-sandbox-configuration.md
+++ b/agentydragon/tasks/15-agent-worktree-sandbox-configuration.md
@@ -1,17 +1,17 @@
 +++
 id = "15"
 title = "Agent Worktree Sandbox Configuration"
-status = "Not started"
-dependencies = "" # No prerequisites
-last_updated = "2025-06-25T01:40:09.512675"
+status = "Done"
+dependencies = "02,07,09,11,14,29"
+last_updated = "2025-06-25T12:00:00.000000"
 +++

 # Task 15: Agent Worktree Sandbox Configuration

 ## Status

-**General Status**: Not started  
-**Summary**: Enhance the task scaffolding script to launch a Codex agent in a sandboxed worktree where only the task directory (and system temp dir) is writable, Git commands run without prompts, and all file I/O under the worktree is auto-approved.
+**General Status**: Done  
+**Summary**: Enhanced the task scaffolding script to launch a Codex agent in a sandboxed worktree with writable worktree and TMPDIR, auto-approved file I/O and Git operations, and network disabled.

 ## Goal

@@ -34,17 +34,18 @@ The `create-task-worktree.sh --agent` invocation:
 ## Implementation

 **How it was implemented**  
-*(Not implemented yet)*
- Modify `create-task-worktree.sh --agent`:
-  - Detect `$TMPDIR` (or default `/tmp`) and include it in the writable mount list.
-  - Invoke the agent via `codex debug landlock` (or chosen sandbox command) with `--writable-root` for the worktree and tempdir.
-  - Add approval predicates to auto-allow any file I/O under the worktree path and Git commands there.
- Update the script’s help text (`-h|--help`) to document the sandbox behavior and tempdir whitelist.
- Add tests or example runs verifying sandbox restrictions and approvals.
+- Extended `create-task-worktree.sh` `--agent` mode to launch the Codex agent under a Landlock+seccomp sandbox by invoking `codex debug landlock --full-auto`, which grants write access only to the worktree (`cwd`) and the platform temp folder (`TMPDIR`), and disables network.  
+- Updated the `-a|--agent` help text to reflect the new sandbox behavior and tempdir whitelist.  
+- Added `agentydragon/tasks/15-sandbox-test.sh`, a test script demonstrating allowed writes inside the worktree and TMPDIR and blocked writes to directories outside those paths.  

 **How it works**  
-*(Not implemented yet)*  
-When `--agent` is used, the script switches to the task worktree, then starts the sandbox so that only the worktree and the system tempdir are writable. Inside that sandbox, Git and other file operations under the worktree proceed without prompts, while writes elsewhere on the host are blocked.
+When invoked with `--agent`, `create-task-worktree.sh` changes into the task worktree and launches:
+
+```bash
+codex debug landlock --full-auto codex "$(<"$repo_root/agentydragon/prompts/developer.md")"
+```
+
+The `--full-auto` flag configures the sandbox to allow disk writes under the current directory and the system temp directory, disable network access, and run commands without asking for approval up front. As a result, file I/O and Git operations in the worktree proceed without approval prompts, while writes outside the worktree and TMPDIR are blocked by the sandbox.

 ## Notes

--- a/agentydragon/tasks/15-sandbox-test.sh
+++ b/agentydragon/tasks/15-sandbox-test.sh
@@ -0,0 +1,40 @@
+#!/usr/bin/env bash
+# Test script for Task 15: verify sandbox restrictions and allowances
+set -euo pipefail
+
+# Determine worktree root (script lives in agentydragon/tasks, two levels below the root)
+worktree_root="$(cd "$(dirname "$0")/../.." && pwd)"
+
+echo "Running sandbox tests in worktree: $worktree_root"
+
+# Test write inside worktree
+echo -n "Test: write inside worktree... "
+if codex debug landlock --full-auto /usr/bin/env bash -c "touch '$worktree_root/inside_test'"; then
+  echo "PASS"
+else
+  echo "FAIL" >&2
+  exit 1
+fi
+
+# Test write inside TMPDIR
+tmpdir=${TMPDIR:-/tmp}
+echo -n "Test: write inside TMPDIR ($tmpdir)... "
+if codex debug landlock --full-auto /usr/bin/env bash -c "touch '$tmpdir/tmp_test'"; then
+  echo "PASS"
+else
+  echo "FAIL" >&2
+  exit 1
+fi
+
+# Prepare external directory under HOME to test outside worktree/TMPDIR
+external_dir="$HOME/sandbox_test_dir"
+mkdir -p "$external_dir"
+rm -f "$external_dir/outside_test"
+
+echo -n "Test: write outside allowed paths ($external_dir)... "
+if codex debug landlock --full-auto /usr/bin/env bash -c "touch '$external_dir/outside_test'"; then
+  echo "FAIL: outside write succeeded" >&2
+  exit 1
+else
+  echo "PASS"
+fi
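The three checks in `15-sandbox-test.sh` share one shape (try a write inside the sandbox, compare the outcome with an expectation), so they could be folded into a single helper. The following is only a sketch under stated assumptions: `expect_write`, `sandbox_cmd`, and the `SANDBOX_CMD` override are hypothetical names that do not appear in the commit, and the default command prefix is copied from the test script as-is.

```bash
#!/usr/bin/env bash
# Sketch: table-driven variant of the sandbox checks (hypothetical helper names).
set -euo pipefail

# Command prefix that runs its arguments inside the sandbox. SANDBOX_CMD is an
# assumed override so the helper can be exercised without Codex installed.
read -r -a sandbox_cmd <<< "${SANDBOX_CMD:-codex debug landlock --full-auto}"

# expect_write <allow|deny> <path>
# Tries to create <path> inside the sandbox, prints PASS when the outcome matches
# the expectation, and otherwise prints FAIL to stderr and returns non-zero.
expect_write() {
  local expectation=$1 target=$2 outcome=deny
  rm -f -- "$target"
  # Pass the path as a positional argument so spaces or quotes in it are safe.
  if "${sandbox_cmd[@]}" /usr/bin/env bash -c 'touch -- "$1"' _ "$target" 2>/dev/null; then
    outcome=allow
  fi
  rm -f -- "$target"   # do not leave probe files behind
  if [ "$outcome" = "$expectation" ]; then
    echo "PASS"
  else
    echo "FAIL: expected $expectation, got $outcome for $target" >&2
    return 1
  fi
}

# Intended usage, mirroring the three checks in the test script:
#   expect_write allow "$worktree_root/inside_test"
#   expect_write allow "${TMPDIR:-/tmp}/tmp_test"
#   expect_write deny  "$HOME/sandbox_test_dir/outside_test"
```

For the `deny` case the parent directory still has to exist beforehand (the test script creates it with `mkdir -p`); otherwise `touch` fails for a reason unrelated to the sandbox and the check passes vacuously.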
--- a/agentydragon/tasks/37-session-state-persistence.md
+++ b/agentydragon/tasks/37-session-state-persistence.md
@@ -0,0 +1,41 @@
+++
+id = "37"
+title = "Session State Persistence and Debug Instrumentation"
+status = "Not started"
+dependencies = ""
+last_updated = "2025-06-25T23:00:00.000000"
+++
+
+## Summary
+Persist session runtime state and capture raw request/response data and supplemental metadata to a session-specific directory.
+
+## Goal
+Collect and persist all relevant session state (beyond the rollout transcript) in a dedicated directory under `.codex/sessions/<UUID>/`, to aid debugging and allow post-mortem analysis.
+
+## Acceptance Criteria
+
+- All session data (transcript, logs, raw OpenAI API requests/responses, approval events, and other runtime metadata) is written under `.codex/sessions/<session_id>/`.
+- Existing rollout transcript continues to be written to `sessions/rollout-<UUID>.jsonl`, now moved or linked into the session directory.
+- Logging configuration respects `--debug-log` and writes to the session directory when set to a relative path.
+- A selector flag (e.g. `--persist-session`) enables or disables writing persistent state.
+- No change to default behavior when persistence is disabled (i.e. backward compatibility).
+- Minimal integration test or manual verification steps demonstrate that files appear correctly and no extraneous error logs occur.
+
+## Implementation
+
+**How it was implemented**  
+- Add a new CLI flag `--persist-session` to the TUI and server binaries to enable session persistence.
+- Compute a session directory under `$CODEX_HOME/sessions/<UUID>/` and create it at startup when persistence is enabled.
+- After initializing the rollout file (`rollout-<UUID>.jsonl`), move or symlink it into the session directory.
+- Configure the tracing subscriber's file layer and the `--debug-log` default path to write logs into the same session directory (e.g. `session.log`).
+- Instrument the OpenAI HTTP client layer to dump raw request and response bodies into `session_oai_raw.log` in that directory.
+- In the message sequencing logic, add debug spans to record approval and cancellation events into `session_meta.log`.
+
+**How it works**  
+- When `--persist-session` is active, all file outputs (rollout transcript, debug logs, raw API dumps, metadata logs) are collated under a single session directory.
+- If disabled (default), writes occur in the existing locations (`rollout-<UUID>.jsonl`, `$CODEX_HOME/log/`), preserving current behavior.
+
+## Notes
+
+- This feature streamlines troubleshooting by co-locating all session artifacts.
+- Ensure directory creation and file writes handle permission errors gracefully and fall back cleanly when disabled.
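The directory layout planned in Task 37 can be illustrated with a small shell sketch. This is not the planned implementation, which would live in the Codex binaries; it only shows the intended on-disk result. `init_session_dir` is a hypothetical helper, and the sketch assumes the existing rollout file sits at `<codex_home>/sessions/rollout-<UUID>.jsonl`.

```bash
# init_session_dir <codex_home> <session_id>
# Creates <codex_home>/sessions/<session_id>/ and prints its path.
# Returns non-zero, without aborting the caller, if the directory cannot be created.
init_session_dir() {
  local codex_home=$1 session_id=$2
  local session_dir="$codex_home/sessions/$session_id"
  # Assumed current location of the rollout transcript.
  local rollout="$codex_home/sessions/rollout-$session_id.jsonl"

  if ! mkdir -p -- "$session_dir" 2>/dev/null; then
    echo "warning: cannot create $session_dir; session persistence disabled" >&2
    return 1
  fi

  # Link the existing rollout transcript into the session directory, if present.
  if [ -f "$rollout" ]; then
    ln -sf -- "$rollout" "$session_dir/rollout-$session_id.jsonl"
  fi

  printf '%s\n' "$session_dir"
}
```

Returning an error instead of exiting matches the note about handling permission errors gracefully: a caller can fall back to the existing log locations when the session directory is unavailable.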
--- a/agentydragon/tasks/create-task-worktree.sh
+++ b/agentydragon/tasks/create-task-worktree.sh
@@ -125,16 +125,17 @@ fi
 echo "Done."

 if [ "$agent_mode" = true ]; then
-  echo "Launching codex agent for task $task_slug in $worktree_path"
+  echo "Launching Codex agent for task $task_slug in sandboxed worktree"
  prompt_file="$repo_root/agentydragon/prompts/developer.md"
  if [ ! -f "$prompt_file" ]; then
    echo "Error: developer prompt file not found at $prompt_file" >&2
    exit 1
  fi
-  cd "$worktree_path"
-  if [ "$interactive_mode" = true ]; then
-    codex "$(<"$prompt_file")"
-  else
-    codex exec "$(<"$prompt_file")"
+  # Launch the agent under Landlock+seccomp sandbox: writable only in cwd and TMPDIR, network disabled
+  cmd=(codex debug landlock --full-auto --cd "$worktree_path")
+  if [ "${interactive_mode:-}" != true ]; then
+    cmd+=(exec)
  fi
+  prompt=$(<"$prompt_file")
+  "${cmd[@]}" "$prompt"
 fi
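The array-based command assembly in this hunk can be isolated into a function for a dry run. The sketch below is illustrative only: `build_agent_cmd` and `agent_cmd` are hypothetical names, the flags are copied from the hunk rather than checked against the Codex CLI, and nothing is executed. Quoting the path inside the array keeps a worktree path that contains spaces as a single argument.

```bash
# build_agent_cmd <worktree_path> [interactive:true|false]
# Fills the global array agent_cmd with the sandboxed launch command.
build_agent_cmd() {
  local worktree_path=$1 interactive=${2:-false}
  agent_cmd=(codex debug landlock --full-auto --cd "$worktree_path")
  # Non-interactive runs append the exec subcommand, as in the script.
  if [ "$interactive" != true ]; then
    agent_cmd+=(exec)
  fi
}

# Dry run: print the command shell-quoted instead of executing it.
build_agent_cmd "/tmp/my worktree" false
printf '%q ' "${agent_cmd[@]}"
echo
```

Printing with `%q` makes it easy to confirm how many arguments the sandbox would receive before the prompt is appended.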