Building an AI team: cartoon robots lined up on an assembly line with a larger robot supervising

Last week, I wrote about building a virtual AI team instead of deploying OpenClaw. That post described the end state: 34 specialized agents organized into 11 groups, managed by a lightweight orchestrator, all running locally through Claude Code.

What I didn't talk about was how I got there. It didn't start with 34 agents. It started with 9 prompts and a web development project.

The Starting Point: A Collection of Prompts

I had a set of basic prompts that walked through the full lifecycle of building a website, from competitive research and brand identity through to SEO implementation and launch readiness. Each prompt was designed to be used manually and sequentially. I'd run the first one, copy its output, paste it into the next one, and so on.

It worked, but it was tedious. Every handoff was manual. Context got lost between steps. And if a client changed their mind about the brand direction at step 7, I was back to step 2 with a lot of painful copy-pasting ahead of me.

The prompts covered 9 distinct roles:

  1. Competitive Intelligence Analyst - Market research, competitor teardown, gap analysis
  2. Brand Identity System Designer - Colors, typography, UI direction
  3. Website Strategy Consultant - Site architecture, messaging, conversion strategy
  4. Full-Stack PHP Web Developer - Actually builds the site
  5. SEO Strategist - Keyword research, schema recommendations, content planning
  6. SEO Implementation Specialist - Applies the strategy as meta tags, JSON-LD, heading hierarchy
  7. Website Imagery Strategist - Visual content planning, AI image prompts
  8. Visual Enhancement Specialist - Animations, micro-interactions, performance polish
  9. QA Lead & Launch Readiness Specialist - Pre-launch validation across every dimension

The individual prompts were great. But together, they were just a checklist. They didn't work together as a team.

The Question That Changed Everything

Looking at these 9 prompts, I asked myself a simple question: what if they didn't need me to move context between them?

What if there was a 10th agent, acting as a project manager, whose only job was to understand the project, figure out which specialists to invoke, pass the right context between them, and pause when I needed to weigh in?

That question led to the architecture that eventually became my entire virtual team.

Designing the Web Dev PM

The first thing I realized was that collapsing all 9 prompts into one mega-prompt would be a mistake. When you front-load too many objectives, the model loses focus, and the quality of each individual step degrades.

Instead, I kept the 9 specialists as independent prompts — each one improved and structured with clear inputs, outputs, and quality standards — and built a Project Manager agent that orchestrated them.

The Project Manager's job:

  • Intake. Assess the project, ask sharp questions, produce a Project Brief.
  • Routing. Determine which specialists to invoke based on project type — new build, redesign, or targeted improvement.
  • Context bridging. Translate one specialist's output into the next specialist's input, preserving specific values (hex codes, keyword lists, sitemap structures) while summarizing everything else.
  • Checkpoints. Pause at natural breakpoints so I can review, approve, or redirect.
  • Change propagation. When I change my mind about the brand at step 7, the PM knows that means re-running the builder and visual polish — but not the market research.
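
The change-propagation rule is essentially a transitive closure over the dependency map. Here's a minimal sketch of that logic in Python; the specialist names and the dependency map are illustrative stand-ins, not the actual internals of the PM prompt:

```python
# Illustrative dependency map: each specialist -> the specialists it consumes.
DEPENDS_ON = {
    "architecture": {"market_research", "brand_identity"},
    "imagery": {"market_research", "brand_identity"},
    "build": {"architecture", "imagery"},
    "seo_implement": {"build", "seo_strategy"},
    "polish": {"build"},
    "launch_qa": {"seo_implement", "polish"},
}

def affected_by(changed: str) -> set[str]:
    """Return every specialist downstream of a changed step (transitive closure)."""
    affected: set[str] = set()
    frontier = {changed}
    while frontier:
        step = frontier.pop()
        for spec, deps in DEPENDS_ON.items():
            if step in deps and spec not in affected:
                affected.add(spec)
                frontier.add(spec)
    return affected

# A brand change invalidates imagery, architecture, the build, and everything
# after the build -- but never market research or SEO strategy.
print(sorted(affected_by("brand_identity")))
```

The useful property is what's *not* in the result: a brand change never touches market research, so the PM never wastes a run on it.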

The Execution Model

Here's where it gets interesting. Each specialist runs as a subagent via Claude Code's Task tool. That means each one gets its own context window, completely independent from the PM and from every other specialist.

All communication happens through files:

  • The PM writes an input file for each specialist (populated with the right variables and context from prior steps)
  • The specialist reads its own prompt file for operating instructions, reads the input file for project data, does its work, and writes its output to a known location
  • The PM reads the output, verifies it, updates a running Project Context Document, and moves on
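
The loop above can be sketched in a few lines. In reality the PM is a prompt driving Claude Code's Task tool, so `run_subagent` below is a stand-in for that spawn, and the directory layout is my own assumption:

```python
from pathlib import Path

PROJECT = Path("project")

def run_subagent(prompt: str, reads: Path, writes: Path) -> None:
    """Stand-in for spawning a specialist via the Task tool. Here it just
    echoes its input so the handoff mechanics are visible."""
    writes.write_text(f"[output of {prompt}]\n{reads.read_text()}")

def run_specialist(name: str, context: str) -> str:
    input_file = PROJECT / "inputs" / f"{name}.md"
    output_file = PROJECT / "outputs" / f"{name}.md"
    input_file.parent.mkdir(parents=True, exist_ok=True)
    output_file.parent.mkdir(parents=True, exist_ok=True)

    # 1. The PM writes the specialist's input file.
    input_file.write_text(context)

    # 2. The specialist runs in its own context window: it reads its prompt
    #    file and the input file, then writes to a known location.
    run_subagent(f"prompts/{name}.md", reads=input_file, writes=output_file)

    # 3. The PM reads the output, verifies it, and appends it to the running
    #    Project Context Document before moving on.
    output = output_file.read_text()
    with (PROJECT / "context.md").open("a") as ctx:
        ctx.write(f"\n## {name}\n{output}\n")
    return output
```

Nothing here is clever, and that's the point: every handoff leaves a file behind.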

The file-based approach gives you something most agent architectures don't: inspectable, persistent state.

When something goes wrong at step 8, I can look at the exact files that step 4 produced. No guessing about what the model "saw."

And if I pause a project for a week, the files are still there. The PM reads the project context and picks up where we left off.

The Dependency Graph

Not all 9 specialists need to run sequentially. Some have no dependencies on each other and can run in parallel:

P1 (Market Research) ──┐
                       ├─▶ P3 (Architecture) ─┐
P2 (Brand Identity) ───┤                      ├─▶ P4 (Build) ─┬─▶ P8 (Polish) ────────┐
                       └─▶ P7 (Imagery) ──────┘               │                       ├─▶ P9 (Launch QA)
P5 (SEO Strategy) ────────────────────────────────────────────┴─▶ P6 (SEO Implement) ─┘

Market research, brand identity, and SEO strategy all launch simultaneously. Architecture and imagery run in parallel once their dependencies complete. SEO implementation and visual polish run in parallel after the build. Launch QA runs last.

A full new-build pipeline has 4 phases with a checkpoint after each one. The PM won't proceed past a checkpoint without my explicit approval.
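
The parallel batches fall out of the dependency map mechanically: at each level, launch every specialist whose dependencies are all satisfied. A sketch of that scheduling logic (the names are illustrative; the 4 checkpointed phases group these levels at review boundaries):

```python
DEPENDS_ON = {
    "market_research": set(), "brand_identity": set(), "seo_strategy": set(),
    "architecture": {"market_research", "brand_identity"},
    "imagery": {"market_research", "brand_identity"},
    "build": {"architecture", "imagery"},
    "seo_implement": {"build", "seo_strategy"},
    "polish": {"build"},
    "launch_qa": {"seo_implement", "polish"},
}

def parallel_levels(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group specialists into levels; everything in a level can launch at once."""
    done: set[str] = set()
    levels: list[list[str]] = []
    while len(done) < len(deps):
        ready = sorted(s for s, d in deps.items() if s not in done and d <= done)
        if not ready:
            raise ValueError("cycle in dependency graph")
        levels.append(ready)
        done.update(ready)
    return levels

for i, level in enumerate(parallel_levels(DEPENDS_ON), 1):
    print(f"Level {i}: {', '.join(level)}")
```

Run against this map, the first level is the three no-dependency specialists (market research, brand identity, SEO strategy), exactly the simultaneous launch described above.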

What the PM Does at a Checkpoint

This is the part I got wrong on my first attempt. My initial version just paused and said, "How does this look?" That's too open-ended.

The revised checkpoint protocol presents:

  1. A summary of what was just completed and which specialists ran
  2. The 3-5 most important decisions that were made
  3. Specific items that need my judgment
  4. A recommendation on whether we're ready to proceed
  5. What happens next if I approve

When I respond, the PM understands three modes: approve, approve with changes, or reject. If I approve with changes, it figures out which specialists are affected and only re-runs those. It doesn't start the whole pipeline over.

From 9 Agents to 34

Once the web dev team was working, I saw the larger pattern more clearly. The architecture wasn't specific to web development. It was a general-purpose orchestration pattern:

A lightweight dispatcher that maintains a roster of specialists, analyzes incoming tasks, assembles context, and spawns agents as independent processes — with all communication happening through file handoffs.
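
To make the dispatcher pattern concrete, here's a toy sketch of a roster with declared capabilities matched against an incoming task. The real dispatcher is an LLM prompt that reasons about the task; this keyword match is only a stand-in to show the shape of the roster:

```python
# Illustrative roster: specialist name -> capability keywords.
ROSTER = {
    "ux_researcher": {"usability", "user interview", "survey"},
    "contract_analyst": {"contract", "clause", "liability"},
    "pricing_strategist": {"pricing", "tier", "discount"},
    "devils_advocate": {"stress-test", "risk", "assumption"},
}

def route(task: str) -> list[str]:
    """Pick every specialist whose capabilities match the task."""
    task_lower = task.lower()
    matches = [name for name, keywords in ROSTER.items()
               if any(kw in task_lower for kw in keywords)]
    # Fallback: when nothing matches, at least stress-test the idea.
    return matches or ["devils_advocate"]

print(route("Review the liability clause in this vendor contract"))
```

The dispatcher stays lightweight precisely because each roster entry points at a self-contained prompt file; routing is the only decision it makes up front.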

I started adding specialists. A UX researcher. A contract analyst. A pricing strategist. A devil's advocate whose only job is to stress-test ideas. Each one followed the same template: a self-contained prompt file with defined inputs, outputs, and quality standards.

The orchestrator evolved from a web-dev-specific PM into a general dispatcher that could route any task to the right specialist (or combination of specialists). The pipeline pattern - one agent's output becomes the next agent's input, all through files - scaled cleanly from 9 agents to 34.

That's the system I described in my previous post. But it didn't start as a grand architecture. It started with a practical problem: I had 9 web development prompts and I was tired of copying output between them.

What I'd Do Differently

If I were starting over, I'd change two things:

Start with the interface contract, not the prompts. I improved all 9 specialist prompts first, then figured out how to connect them. In hindsight, I should have defined the input/output contract between specialists first. What does the brand identity specialist produce? What does the builder expect to receive? Nailing that interface early would have saved revision cycles later.

Build the PM prompt with the execution model in mind from day one. My first PM prompt was platform-agnostic. It described what to do, but not how to do it. When I updated it for Claude Code's Task tool and file-based I/O, the prompt got significantly better because the execution model was concrete, not abstract.

Wrapping Up

I think my approach is straightforward to replicate:

  1. Start with your workflow. What are the distinct steps? Write each one as an independent prompt with clear inputs and outputs.
  2. Improve each prompt individually. Give each specialist a role, structured output format, self-verification checklist, and edge case handling.
  3. Map the dependencies. Which steps need outputs from other steps? Which can run in parallel?
  4. Build the orchestrator. A PM prompt that knows the roster, understands dependencies, handles intake, manages context bridging, and pauses at checkpoints.
  5. Run it in Claude Code. The PM launches specialists via the Task tool. Each specialist reads its instructions from a file, reads its inputs from a file, and writes its output to a file. The PM verifies and bridges.
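
As a concrete reference, here is one way the files for such a setup could be laid out. The names are my own convention, not a requirement of the pattern:

```
team/
  pm.md                    # the orchestrator / PM prompt
  prompts/
    market_research.md     # one self-contained prompt file per specialist
    brand_identity.md
    ...
  project/
    inputs/                # PM-written input files, one per specialist run
    outputs/               # specialist-written outputs
    context.md             # the running Project Context Document
```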

The key insight isn't the web development domain. It's the pattern. Any multi-step workflow where you're currently copying outputs between prompts can be turned into an orchestrated team. The PM doesn't need to be smart about your domain. It needs to be smart about managing the handoffs.

My web dev team is one of my most-used pipelines. A full website build runs all 9 specialists across 4 phases and produces a working site with SEO, visual polish, and a launch checklist. I only have to make decisions at 4 checkpoints instead of managing 9 manual handoffs.

But the real value was what it taught me about building the rest of my virtual team. Every new specialist that I add follows that same pattern. Every orchestration challenge has the same solution: clear interfaces, file-based handoffs, and a Project Manager that knows when to push forward and when to pause.

This article was originally published on LinkedIn.