Connor Holly


Autonomous Loop Design

Agent Orchestration

loops · verification · autonomy

A structured approach to designing AI agent loops that run iteratively toward a goal, with explicit completion conditions, failure ceilings, checkpoint recovery, and escalation rules — built through a deliberate interview protocol rather than ad-hoc prompting.

The Pattern

Before writing any loop logic, run a structured interview:

  1. What does "done" look like? Define a completion promise — a specific, unambiguous signal that the work is finished. This becomes a literal marker the agent outputs (e.g., <promise>DONE</promise>). No fuzzy "looks good enough."

  2. What are the failure modes? Enumerate what can go wrong: infinite retries, scope creep, silent errors, resource exhaustion. Each failure mode gets a mitigation.

  3. How do you verify each iteration? Pick from five verification strategies:

    • Code execution: Run tests, build commands, linters — binary pass/fail
    • Browser testing: Automated assertions against rendered UI
    • Data cross-reference: Compare output against a known-good dataset
    • State check: Verify file system, database, or API state matches expectations
    • Live execution: Actually run the artifact in a real environment

  4. What's the iteration ceiling? Always set a maximum iteration count: 10 for quick tasks, 50 for complex ones, never unlimited. When the ceiling is reached, the loop stops and reports what it accomplished.

  5. When should it escalate? Define thresholds for asking for help instead of retrying. Three consecutive failures on the same step. Verification passing but output looking wrong. Resource usage exceeding a budget.
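The interview's answers map directly onto loop structure. A minimal sketch, where `run_agent` and `verify` are hypothetical stand-ins for one agent pass and one of the five verification strategies:

```python
from typing import Callable

PROMISE = "<promise>DONE</promise>"  # completion marker the agent must emit verbatim

def run_loop(
    run_agent: Callable[[str], str],  # hypothetical: one agent pass, returns its output
    verify: Callable[[], bool],       # verification strategy -- binary pass/fail
    goal: str,
    max_iterations: int = 10,         # the iteration ceiling -- never unlimited
) -> dict:
    """Drive the agent toward `goal`; stop on verified completion or the ceiling."""
    for i in range(1, max_iterations + 1):
        output = run_agent(goal)
        # Done only when the agent declares completion AND verification passes.
        if PROMISE in output and verify():
            return {"status": "done", "iterations": i}
        # Otherwise: iterate again toward the same goal.
    # Ceiling reached: stop and report rather than retrying forever.
    return {"status": "ceiling_reached", "iterations": max_iterations}
```

Note that the promise alone is not enough: the agent's claim of completion is always cross-checked against an external verification signal.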

Key Decisions

Checkpoint files over in-memory state. Write a JSON state file after each iteration with the current milestone, attempt count, last error, and intermediate outputs. If the process crashes, the next run reads the checkpoint and resumes — no lost work.
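One way to sketch the checkpoint write/resume cycle; the file path and field names here are illustrative, not prescribed:

```python
import json
from pathlib import Path
from typing import Optional

CHECKPOINT = Path("loop_state.json")  # illustrative checkpoint location

def save_checkpoint(milestone: str, attempt: int,
                    last_error: Optional[str], outputs: list) -> None:
    """Persist loop state after each iteration so a crash loses no work."""
    state = {
        "milestone": milestone,
        "attempt": attempt,
        "last_error": last_error,
        "outputs": outputs,
    }
    CHECKPOINT.write_text(json.dumps(state, indent=2))

def load_checkpoint() -> dict:
    """Resume from the last checkpoint, or start fresh if none exists."""
    if CHECKPOINT.exists():
        return json.loads(CHECKPOINT.read_text())
    return {"milestone": "start", "attempt": 0, "last_error": None, "outputs": []}
```

Called at the end of every iteration, this makes the loop restartable: the next run reads the checkpoint, sees the current milestone and attempt count, and picks up where the crash left off.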

Completion promise over heuristic detection. Don't try to infer whether the agent is "done" from output analysis. Make the agent explicitly declare completion with a specific string. This is unambiguous and trivially parseable.
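Detection is then a plain substring or regex check. A sketch, assuming the `<promise>` marker format described above:

```python
import re
from typing import Optional

PROMISE_RE = re.compile(r"<promise>(.*?)</promise>")

def extract_promise(output: str) -> Optional[str]:
    """Return the declared completion value, or None if the agent hasn't finished."""
    match = PROMISE_RE.search(output)
    return match.group(1) if match else None
```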

Interview before implementation. The interview protocol forces you to think about failure modes and verification before writing the loop. Without it, you end up with loops that work on the happy path and break silently on everything else.

Escalation over infinite retry. Retrying the same failing step with the same approach wastes compute and time. After N failures, simplify the step, try an alternative approach, or surface the problem for human decision. Three retries is usually the right threshold.
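The escalation threshold can be sketched as a per-step counter of consecutive failures; names here are illustrative:

```python
from collections import defaultdict

class EscalationTracker:
    """Count consecutive failures per step; signal escalation past a threshold."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.failures = defaultdict(int)

    def record(self, step: str, succeeded: bool) -> bool:
        """Record one attempt; return True when the step should be escalated."""
        if succeeded:
            self.failures[step] = 0  # a success resets the streak
            return False
        self.failures[step] += 1
        return self.failures[step] >= self.threshold
```

When `record` returns True, the loop stops retrying that step as-is and instead simplifies it, tries an alternative approach, or surfaces the problem for a human decision.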

When to Use It

Autonomous code generation, automated refactoring, data migration, content creation pipelines — any iterative process where an AI agent works toward a goal over multiple passes. The pattern adds ~15 minutes of upfront design time and saves hours of debugging runaway loops, silent failures, and lost progress from crashes.