Connor Holly


Codex Tech Lead

Agent Orchestration

codex · delegation · agents

A pattern for orchestrating multiple AI agents as a technical lead — assigning named roles with explicit boundaries, enforcing evidence-based work, and using delivery gates between phases to maintain quality at scale.

The Pattern

One lead agent (the tech lead) decomposes work and assigns it to specialist agents. Each specialist has a named role, scoped responsibilities, and a self-contained prompt.

Tech Lead
  |
  +-- Researcher    (read-only: search, analyze, report findings)
  +-- Implementer   (write: create/modify code based on spec)
  +-- Reviewer      (read-only: verify implementation, run tests)
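The role assignments above can be sketched as data the tech lead checks before dispatching work. The type and field names here are illustrative, not a real Codex API:

```typescript
// Hypothetical role definitions; names and fields are illustrative.
type Role = {
  name: string;
  permissions: "read-only" | "read-write";
  scope: string; // what this agent is responsible for
};

const roles: Role[] = [
  { name: "Researcher",  permissions: "read-only",  scope: "search, analyze, report findings" },
  { name: "Implementer", permissions: "read-write", scope: "create/modify code based on spec" },
  { name: "Reviewer",    permissions: "read-only",  scope: "verify implementation, run tests" },
];

// The tech lead enforces the boundary before handing out a write task.
function canWrite(role: Role): boolean {
  return role.permissions === "read-write";
}
```

Encoding the boundary as data (rather than relying on each agent's prompt alone) means the lead can refuse to route a write task to a read-only role mechanically.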

Topology tiers:

  • Tiny (1 agent): Do it yourself. No orchestration overhead.
  • Small (2-3): Research + implement, or implement + review. Light coordination.
  • Medium (4-6): Full pipeline with research, implementation, review, and documentation.
  • Large (7+): Rarely justified. Coordination overhead usually exceeds the parallelism gains. Cap at 3-4 parallel agents for most work.

Core mechanics:

Self-contained prompts. Each agent's prompt includes all necessary context: file paths, schemas, constraints, expected output format. Agents share zero implicit context with each other or with the tech lead. If the agent needs to know a schema, paste the schema in the prompt. Never assume shared state.
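One way to keep prompts self-contained is to assemble them from labeled context sections, pasting in the schema rather than referring to it. The schema, paths, and task below are hypothetical; the point is that everything the agent needs travels in the prompt itself:

```typescript
// Hypothetical schema pasted verbatim into the prompt -- never referenced
// by name and assumed to be shared state.
const userSchema = `CREATE TABLE users (
  id    INTEGER PRIMARY KEY,
  email TEXT NOT NULL UNIQUE
);`;

// Build a prompt from a task plus explicit, labeled context sections.
function buildPrompt(task: string, context: Record<string, string>): string {
  const sections = Object.entries(context)
    .map(([label, body]) => `## ${label}\n${body}`)
    .join("\n\n");
  return `${task}\n\n${sections}`;
}

const prompt = buildPrompt(
  "Add a unique index on users.email in db/schema.sql.",
  {
    "Schema (db/schema.sql)": userSchema,
    "Constraint": "Do not modify existing columns.",
    "Expected output": "A unified diff against db/schema.sql.",
  },
);
```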

Evidence discipline. Agents must cite specific files, line numbers, error messages, or API responses to support their claims. "The auth module handles this" is rejected. "Line 47 of auth/middleware.ts checks the JWT expiration" is accepted. This makes hallucinated reasoning easy to catch: an unsupported claim fails the citation requirement before it can influence any decision.
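A crude version of this check can even be automated. The regex below is a sketch of my own (it only recognizes file-path and line-number citations, not error messages or API responses), but it distinguishes the two example claims from the text:

```typescript
// Sketch: accept a claim only if it cites a file (optionally with a line
// number) or a "line N of <file>" reference. Illustrative, not a robust
// parser, and it ignores other evidence forms like error messages.
const CITATION = /\b[\w./-]+\.(ts|js|py|go|rs)(:\d+)?\b|\bline \d+ of [\w./-]+/i;

function hasEvidence(claim: string): boolean {
  return CITATION.test(claim);
}
```

In practice the tech lead applies this standard by reading the specialist's report, but a mechanical pre-filter like this can bounce citation-free claims back immediately.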

Delivery gates. Work moves between phases only after explicit verification. Implementation doesn't start until the research report is reviewed. Code doesn't go to review until tests pass locally. Each gate has a binary pass/fail condition — no "looks good enough."
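The binary pass/fail character of a gate can be made concrete: each gate either returns true or hard-stops the pipeline. The gate names and checks below are hypothetical:

```typescript
// Sketch of binary delivery gates: a phase runs only after every earlier
// gate has passed. There is no partial credit.
type Gate = { name: string; check: () => boolean };

function runPipeline(gates: Gate[]): string[] {
  const passed: string[] = [];
  for (const gate of gates) {
    if (!gate.check()) {
      // Hard stop: a failing gate blocks everything downstream.
      throw new Error(`Gate failed: ${gate.name}`);
    }
    passed.push(gate.name);
  }
  return passed;
}
```

The design choice is that a gate throws rather than warns: a "looks good enough" result cannot leak into the next phase.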

Placeholder pattern. When two agents work on dependent tasks in parallel, Agent A writes a placeholder (e.g., /*PLACEHOLDER*/null) where it needs Agent B's output. The tech lead replaces the placeholder with real data once Agent B finishes. This enables parallel work on sequential dependencies.
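A minimal sketch of the splice step, assuming a string-based substitution (the draft and the substituted value are made up for illustration):

```typescript
// Agent A marks the spot where Agent B's output belongs.
const PLACEHOLDER = "/*PLACEHOLDER*/null";

// Agent A's draft, written before Agent B finished.
const draft = `const config = ${PLACEHOLDER};`;

// Once Agent B delivers, the tech lead splices in the real value.
function fillPlaceholder(source: string, value: string): string {
  return source.split(PLACEHOLDER).join(value);
}
```

Using a fixed, greppable marker means the tech lead can also verify at the final gate that no placeholder survived into the merged result.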

Key Decisions

Named roles over generic workers. "Researcher" with explicit read-only constraints produces better output than "Agent 2, do some research." Role semantics create behavioral boundaries — researchers don't try to implement, implementers don't second-guess the research.

3-4 agents max for parallel work. Beyond this threshold, the tech lead spends more time coordinating, resolving conflicts, and merging outputs than the agents save through parallelism. Sequential phases with fewer agents are usually faster end-to-end.

Prompt completeness over brevity. A 500-token prompt that includes all context produces better results than a 50-token prompt that assumes the agent will "figure it out." The cost of redundant context in prompts is negligible compared to the cost of an agent going off-track because it was missing information.

When to Use It

Multi-file code changes, feature implementations that span research and coding, codebase migrations, or any task where different phases require different capabilities (read-only analysis vs. code modification vs. testing). The overhead of role assignment and delivery gates pays off when the total work exceeds what a single agent can reliably do in one pass — typically anything touching more than 3-4 files or requiring more than 30 minutes of focused work.
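The rule of thumb above can be expressed as a simple check. The thresholds come from the text's rough "3-4 files or 30 minutes" range; the function itself is illustrative:

```typescript
// Heuristic sketch: orchestration overhead pays off only past these
// rough thresholds (taken from the text's "3-4 files / 30 minutes").
function shouldOrchestrate(filesTouched: number, estimatedMinutes: number): boolean {
  return filesTouched > 3 || estimatedMinutes > 30;
}
```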