AI Skills
Reusable patterns and workflows I give my AI agents
What These Are
AI agents start every session with zero memory. They don't remember what worked last time, what patterns your codebase follows, or what mistakes to avoid. Skills solve this. Each skill is a document that teaches an agent how to do a specific type of work: the architecture, the gotchas, the verification steps, the decisions that matter.
Think of them as procedural knowledge that persists across sessions. When I tell an agent to "run the overnight PR factory," it doesn't improvise. It loads the skill, follows the pattern, and produces consistent results. The skill encodes everything I've learned about that workflow so the agent doesn't have to rediscover it every time.
How I Use Them
I have 30+ agents running at any given time. Some run on cron schedules: monitoring production health, syncing data, checking for anomalies. Others are reactive, triggered by events like Sentry alerts or customer support escalations. But most of my day-to-day is managing agents directly in the terminal: spinning them up, giving them context, and reviewing their output.
Skills are keyword-triggered. When I mention "evaluate" or "pairwise comparison," the evaluation skill loads automatically. When I say "menu bar app," the SwiftUI skill loads. The agent gets exactly the context it needs without me having to explain the same architecture for the twentieth time.
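As a rough illustration of keyword triggering, here is a minimal sketch. The `Skill` shape, the `matching_skills` helper, the skill names, and the substring matching rule are all assumptions made for the example; the actual loading mechanism is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """A skill document plus the phrases that should trigger it (hypothetical shape)."""
    name: str
    keywords: tuple[str, ...]
    body: str


def matching_skills(message: str, skills: list[Skill]) -> list[Skill]:
    """Return every skill with at least one keyword appearing in the message."""
    text = message.lower()
    return [s for s in skills if any(k.lower() in text for k in s.keywords)]


# Placeholder skill bodies; real skills would hold the full document text.
SKILLS = [
    Skill("robust-eval-design", ("evaluate", "pairwise comparison"), "eval skill text"),
    Skill("swiftui-menu-bar", ("menu bar app",), "SwiftUI skill text"),
]

loaded = matching_skills("Can you evaluate these two prompts?", SKILLS)
print([s.name for s in loaded])  # ['robust-eval-design']
```

Plain substring matching is the simplest possible trigger; a real setup might prefer word-boundary matching to avoid accidental hits.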
Every skill started as a one-off solution to a real problem. It got used 3-5 times, proved its value, then got formalized. If I find myself explaining the same thing twice to an agent, it becomes a skill.
Why This Matters
For most tasks, the bottleneck with AI agents is context, not intelligence. The real shift is going from doing the work yourself to understanding how to scaffold and harness agents to do it. It's less about writing code and more about knowing how to give agents the right context, the right constraints, and the right verification steps. Skills are how I do that systematically instead of ad hoc.
These are primarily built for terminal-based agents. The patterns are architectural, though, not tool-specific: how to structure multi-agent orchestration, how to design evaluations that actually measure what matters, and how to gather context cheaply before spending on expensive model calls. They work regardless of which model or framework you're using.
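The idea of gathering context cheaply before an expensive call can be sketched as a two-stage function. Everything named here (`scout_then_expert`, `keyword_overlap`, `fake_expert`, the toy `repo`) is hypothetical, and both "models" are plain-Python stand-ins rather than real API calls.

```python
from typing import Callable


def scout_then_expert(
    question: str,
    files: dict[str, str],
    cheap_score: Callable[[str, str], float],
    expert: Callable[[str, dict[str, str]], str],
    top_k: int = 2,
) -> str:
    """Rank files with a cheap scorer, then pass only the top_k to the expensive call."""
    ranked = sorted(files, key=lambda path: cheap_score(question, files[path]), reverse=True)
    shortlist = {path: files[path] for path in ranked[:top_k]}
    return expert(question, shortlist)


def keyword_overlap(question: str, text: str) -> float:
    """Stand-in for a cheap model (assumption): counts shared words."""
    return float(len(set(question.lower().split()) & set(text.lower().split())))


def fake_expert(question: str, context: dict[str, str]) -> str:
    """Stand-in for an expensive model (assumption): reports which files it saw."""
    return f"answered using {sorted(context)}"


repo = {
    "auth.py": "login session token refresh",
    "billing.py": "invoice payment stripe",
    "README.md": "project overview",
}
answer = scout_then_expert(
    "where is the token refresh logic", repo, keyword_overlap, fake_expert, top_k=1
)
print(answer)  # answered using ['auth.py']
```

The expensive call only ever sees the shortlist, so its cost scales with `top_k` rather than with the size of the codebase.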
Most Used
Planner-Worker-Judge Loop
Autonomous multi-step execution with milestone-based planning, worker agents, and judge-gated quality control. No human in the loop.
agents, orchestration, autonomy
Gemini Swarm
3-stage parallel research pipeline using cheap models at scale. Context gathering, discovery, then deep analysis, at a fraction of the cost.
gemini, research, cost-optimization
Overnight PR Factory
Headless overnight batch that pulls tickets from a queue and turns them into draft PRs. tmux, git worktrees, completion promises, and optional judge gates.
codex, automation, PRs
Cross-Tool Orchestration
Use one AI tool as the orchestrator and another as the engineering team. Plan, spawn parallel workers, wait, judge, integrate.
orchestration, codex, claude
Robust Eval Design
Domain-agnostic evaluation framework: pairwise comparison, blinding, ground truth tracing, ablation testing, and SUT abstraction.
evals, quality, testing
Context Evolution
Framework for writing persistent documentation that survives across AI sessions. Anti-patterns, decision rationale, and session-end protocols.
memory, context, documentation
Flash Scouts
Fast codebase context gathering using cheap models before expensive ones. The scout finds the files; the expert reads them.
context, codebase, cost-optimization
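To make the Planner-Worker-Judge Loop entry above concrete, here is a minimal sketch of the control flow only. The function names, the retry limit, and the toy planner, worker, and judge are assumptions for illustration; a real skill would call actual agents where these stubs sit.

```python
from typing import Callable


def planner_worker_judge(
    goal: str,
    plan: Callable[[str], list[str]],
    work: Callable[[str, str], str],
    judge: Callable[[str, str], tuple[bool, str]],
    max_attempts: int = 3,
) -> dict[str, str]:
    """Run each milestone through a worker until the judge accepts it or attempts run out."""
    results: dict[str, str] = {}
    for milestone in plan(goal):
        feedback = ""
        for _ in range(max_attempts):
            output = work(milestone, feedback)
            accepted, feedback = judge(milestone, output)
            if accepted:
                results[milestone] = output
                break
        else:
            # No attempt passed the gate: stop instead of shipping unverified work.
            raise RuntimeError(f"judge rejected {milestone!r} after {max_attempts} attempts")
    return results


# Toy stand-ins (assumptions): the judge only accepts upper-case output.
def toy_plan(goal: str) -> list[str]:
    return [f"{goal}: step {i}" for i in (1, 2)]


def toy_work(milestone: str, feedback: str) -> str:
    return milestone.upper() if feedback else milestone


def toy_judge(milestone: str, output: str) -> tuple[bool, str]:
    ok = output.isupper()
    return ok, ("" if ok else "use upper case")


results = planner_worker_judge("ship", toy_plan, toy_work, toy_judge)
print(results)  # {'ship: step 1': 'SHIP: STEP 1', 'ship: step 2': 'SHIP: STEP 2'}
```

Feeding the judge's feedback into the next worker attempt is what lets the loop converge without a human in the middle.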
All Skills
Agent Orchestration
Planner-Worker-Judge Loop
Autonomous multi-step execution with milestone-based planning, worker agents, and judge-gated quality control. No human in the loop.
agents, orchestration, autonomy
Multi-Agent Orchestration with Convex
Scalable 3-tier agent topology (controller/orchestrator/worker) with real-time coordination via Convex as the message backbone.
agents, convex, real-time
Codex Tech Lead
Delegate coding work to parallel Codex agents with explicit role semantics, evidence discipline, and delivery gates.
codex, delegation, agents
Autonomous Loop Design
Design verification-first iterative loops with 5 verification strategies: code execution, browser testing, data cross-reference, state checks, and live execution.
loops, verification, autonomy
Gemini Swarm
3-stage parallel research pipeline using cheap models at scale. Context gathering, discovery, then deep analysis, at a fraction of the cost.
gemini, research, cost-optimization
Overnight PR Factory
Headless overnight batch that pulls tickets from a queue and turns them into draft PRs. tmux, git worktrees, completion promises, and optional judge gates.
codex, automation, PRs
Cross-Tool Orchestration
Use one AI tool as the orchestrator and another as the engineering team. Plan, spawn parallel workers, wait, judge, integrate.
orchestration, codex, claude
Production Release Guardrails
Staged promotion with health checks between each service, Sentry regression detection, scheduled follow-ups, and automated rollback via revert PRs.
deployment, safety, rollback
Parallel Ticket Triage
Spawn parallel planning agents to scope tickets into implementation plans with confidence ratings. Planning only, no implementation.
planning, linear, agents
Cloud Agent Deployment
Build production agents with the Claude Agent SDK. E2B sandboxing, local-to-cloud promotion workflow, and tool translation patterns.
deployment, SDK, agents
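The staged promotion described under Production Release Guardrails above can be sketched as a loop that deploys one service at a time and unwinds on the first failed health check. The `staged_promotion` function and its callbacks are hypothetical; the Sentry checks and revert PRs mentioned in the entry are represented only by plain callables.

```python
from typing import Callable


def staged_promotion(
    services: list[str],
    deploy: Callable[[str], None],
    healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
) -> bool:
    """Promote services in order; on the first failed health check, roll back in reverse."""
    promoted: list[str] = []
    for service in services:
        deploy(service)
        promoted.append(service)
        if not healthy(service):
            for done in reversed(promoted):
                rollback(done)
            return False
    return True


# Toy callbacks (assumptions) that record what happened instead of touching real systems.
log: list[tuple[str, str]] = []
ok = staged_promotion(
    ["api", "worker", "web"],
    deploy=lambda s: log.append(("deploy", s)),
    healthy=lambda s: s != "worker",
    rollback=lambda s: log.append(("rollback", s)),
)
print(ok, log)
```

In this run the `worker` check fails, so `web` is never deployed and the two promoted services are rolled back newest first.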
Evaluation & Quality
Robust Eval Design
Domain-agnostic evaluation framework: pairwise comparison, blinding, ground truth tracing, ablation testing, and SUT abstraction.
evals, quality, testing
Mini Evals
Lightweight evaluation harness for verifying skill triggering, system health, and regression detection across AI workflows.
evals, testing, verification
Context Gap Audit
Audit data freshness, completeness, and blind spots before making decisions. Red flags, remediation actions, and integration protocol.
data-quality, audit, analytics
Error Monitoring Automation
Automated Sentry error triage. List issues, generate AI-fixable prompts, create tickets with full stacktrace context, and prioritize by impact.
sentry, errors, monitoring
Analytics API Patterns
PostHog API patterns for insights, funnels, HogQL queries, survey responses, and dashboard management. InsightVizNode wrapper and common pitfalls.
posthog, analytics, API
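The pairwise comparison and blinding mentioned under Robust Eval Design above can be sketched as follows. The `blinded_pairwise` helper and the length-based judge are assumptions for illustration; a real judge would be a model or a human who sees only the two unlabeled outputs.

```python
import random
from typing import Callable


def blinded_pairwise(
    prompt: str,
    output_a: str,
    output_b: str,
    judge: Callable[[str, str, str], int],
    rng: random.Random,
) -> str:
    """Show the two outputs in random order and map the judge's pick back to 'A' or 'B'."""
    pair = [("A", output_a), ("B", output_b)]
    rng.shuffle(pair)  # the judge never learns which system produced which slot
    choice = judge(prompt, pair[0][1], pair[1][1])  # 0 = first slot, 1 = second slot
    return pair[choice][0]


def longer_is_better(prompt: str, first: str, second: str) -> int:
    """Stand-in judge (assumption): prefers the longer answer."""
    return 0 if len(first) >= len(second) else 1


rng = random.Random(0)
wins = {"A": 0, "B": 0}
for _ in range(10):
    winner = blinded_pairwise("Explain tmux", "short", "a much longer answer", longer_is_better, rng)
    wins[winner] += 1
print(wins)  # {'A': 0, 'B': 10}
```

Because the slot order is shuffled on every call, a judge with a position bias would split its votes, while a judge responding to content picks the same system each time, as this toy judge does.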
Context & Memory
Context Evolution
Framework for writing persistent documentation that survives across AI sessions. Anti-patterns, decision rationale, and session-end protocols.
memory, context, documentation
Flash Scouts
Fast codebase context gathering using cheap models before expensive ones. The scout finds the files; the expert reads them.
context, codebase, cost-optimization
Session Insights
Analyze AI coding session logs for usage patterns, tool frequency, and workflow bottlenecks. Turn transcripts into actionable data.
analytics, sessions, patterns
Session Shutdown Protocol
End-of-session ritual for capturing learnings, updating memory, and creating continuity for the next session.
memory, workflow, continuity
Developer Tools
SwiftUI Menu Bar Apps
Build native macOS menu bar apps with SwiftUI MenuBarExtra. Build, bundle, sign, install, and event-driven monitoring patterns.
swift, macos, native
tmux Multi-Agent Sessions
Manage parallel agent sessions with tmux. Spawn, monitor, and coordinate multiple AI processes from a single terminal.
tmux, terminal, agents
UI Debug Screenshots
Playwright-driven screenshot loops for responsive UI debugging. Capture every breakpoint, compare states, catch visual regressions.
playwright, debugging, visual
ASCII Diagrams
Unicode box-drawing character reference and diagram templates for terminal-native documentation and architecture diagrams.
diagrams, unicode, terminal
Quick Share
Share local files instantly via Cloudflare Quick Tunnel. One command to get a public URL for any file on your machine.
sharing, cloudflare, tunnels
Content & Docs
Interview-Driven PRDs
Create product requirement documents through structured interviews. Zero technical content — pure problem, solution, and acceptance criteria.
product, PRD, interviews
Document Converter
Markdown to branded HTML and PDF conversion pipeline. Consistent styling across outputs with template support.
markdown, PDF, conversion
Doc-Code Sync
Autonomous drift detection between documentation and code. Finds stale docs, outdated references, and missing coverage.
documentation, sync, automation
Programmatic Video with Remotion
Create videos programmatically with React and Remotion. Data-driven animations, batch rendering, and template composition.
video, react, remotion
AI Image Generation Patterns
Prompt engineering patterns for Gemini image generation. Style control, consistency techniques, and batch workflows.
images, gemini, prompts
Visual Research Pipeline
Web research, Chrome DevTools screenshots, and automated ticket creation. Competitive analysis and CRO audits with annotated visual evidence.
research, screenshots, CRO
Productivity
Meeting Prep Briefs
Generate concise meeting preparation briefs from calendar data. Context, talking points, and decision items — automated.
meetings, calendar, automation
Notion Operations
Notion CRUD operations via CLI scripts. Create, read, update pages and databases without touching the browser.
notion, CLI, automation
Experiment Operations
Experiment card discipline for product validation. Hypothesis, metric, threshold, and kill criteria — before writing any code.
experiments, product, validation
Agent Tools Design
Decision framework for choosing between MCP servers, CLI scripts, and direct API calls. When to build what.
tools, architecture, decisions
Convex Database Bootstrap
Zero-to-running Convex database setup. Schema design, function patterns, and real-time subscription architecture.
convex, database, setup