Yesterday, I asked my Senior UX Design Researcher to assess the new version of the NetSuite SuiteQL Query Tool that I'm building. The researcher found usability issues I'd missed, organized findings by severity, and produced a structured report.
I then gave the report to my Enterprise Software Developer and asked it to come up with a plan to address the UI/UX issues the researcher had found.
Here's the thing: That researcher doesn't exist, and neither does the developer. They're both AI agents that I built with Claude Code in about twenty minutes.
In this article, I discuss how I've built my ever-expanding virtual team, and why I prefer this approach over OpenClaw (the open-source AI agent framework that's taken the developer world by storm).
The Dream and the Disaster
OpenClaw's appeal is obvious. It's a 24/7 AI assistant that runs on your hardware, connects to messaging platforms, executes terminal commands, manages files, browses the web, and orchestrates workflows while you sleep. It crossed 180,000 GitHub stars and drew two million visitors in a single week. People use it to fight insurance claims, build websites from their phones, and monitor production systems autonomously.
But the architecture that makes it powerful also makes it dangerous.
Cisco has published a report calling OpenClaw a "security nightmare." SecurityScorecard identified tens of thousands of exposed instances leaking API keys. Bitdefender documented nearly 900 malicious plugins flooding OpenClaw's skill marketplace.
Security researcher Simon Willison identified what he calls the "lethal trifecta" - the combination of private data access, untrusted content exposure, and external communication capabilities. OpenClaw has all three, running as a single long-lived process with broad system permissions.
The consequences showed up fast. Cisco tested a malicious OpenClaw skill called "What Would Elon Do?" and found it silently exfiltrated data through curl commands and used prompt injection to bypass safety guidelines - all without the user's knowledge. A one-click remote code execution vulnerability was patched after researchers demonstrated that visiting a malicious website could give an attacker full control over a victim's instance. Bitdefender found automated scripts uploading new malicious skills to the ClawHub marketplace every few minutes. An independent analysis of the ecosystem found more than a quarter of available packages contained vulnerabilities.
OpenClaw's own documentation acknowledges: "There is no 'perfectly secure' setup."
I wanted OpenClaw's capability. I didn't want the risk. So I've been building something different.
My Virtual Team
Instead of one monolithic agent with access to everything, I've built a team of specialists - 34 of them (so far), organized into 11 groups. Each is a Claude Code agent with a defined persona, expertise, and scoped access.
On the development side, an Enterprise Software Developer handles day-to-day coding in my preferred stack (NetSuite, SuiteScript, PHP, nginx). A dedicated PHP Workflow Architect builds background automation scripts. A Senior UX Design Researcher runs usability testing, heuristic evaluation, and contextual inquiry. A Web Designer owns front-end work and design systems.
Beyond code, a Competitive Intel Analyst does deep competitor research - positioning, strategy, exploitable gaps. A Contract Analyst extracts risks and flags deviations from market-standard terms. A Pricing Strategist works through revenue optimization. An Executive Assistant handles communications, meeting prep, and prospect research. A Content Strategist builds editorial systems. A Business Technical Editor prioritizes argument structure over polish.
Then there's the Devil's Advocate. Its only job is to stress-test my ideas, plans, and assumptions by finding weaknesses and blind spots. It's the team member nobody enjoys consulting and everybody needs.
Each agent exists as a definition file in my "virtual team" directory. When I need one, I invoke it through Claude Code, hand it the relevant context, and let it work. When it's done, it's done. No daemon running, no port open, no gateway exposed to the internet.
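If you scripted one of those invocations, it would look something like this. This is a minimal sketch, not my production tooling - it assumes the claude CLI's -p (print) mode and --append-system-prompt flag, and the directory layout and helper names are just placeholders:

```python
# Illustrative sketch of running one specialist as a short-lived process.
# Assumes the `claude` CLI's -p (print) mode and --append-system-prompt flag;
# the directory layout, file names, and helper are hypothetical.
import subprocess
from pathlib import Path

TEAM_DIR = Path.home() / "virtual-team"   # where the agent definition files live

def run_agent(agent_name: str, task: str, workdir: Path) -> str:
    """Spawn a one-shot Claude Code process for a single specialist, then exit."""
    persona = (TEAM_DIR / f"{agent_name}.md").read_text()
    result = subprocess.run(
        ["claude", "-p", task,                    # non-interactive: one task, then done
         "--append-system-prompt", persona],      # the specialist's definition file
        cwd=workdir,                              # the only project it gets pointed at
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# Wake up the UX researcher, hand it context, let it work. No daemon, no open port.
report = run_agent(
    "senior-ux-design-researcher",
    "Assess the SuiteQL Query Tool in this directory for usability issues. "
    "Organize findings by severity and write a structured report.",
    workdir=Path("~/projects/suiteql-query-tool").expanduser(),
)
print(report)
```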
Why This Is Inherently Safer
The security advantages of my virtual-team strategy aren't incidental - they're structural.
OpenClaw:
- Always-on daemon with persistent network exposure.
- Single process with broad system permissions.
- Processes untrusted inputs from messaging platforms automatically.
- Public plugin marketplace with widespread vulnerabilities.
- Authentication tokens in URLs, exposed Control UIs.
Virtual Team:
- On-demand processes that exist only when a task is being worked on.
- Each agent is scoped to specific files and capabilities.
- A human reviews and approves every action.
- Agent definitions are written by me, and stored locally.
- There's no network exposure. No gateway. No open ports.
The most important difference is the absence of the "lethal trifecta." My agents don't have simultaneous access to private data, untrusted external content, or outbound communication channels. The UX Researcher reads the product I point it at and writes a report. It doesn't also monitor my email, parse untrusted web pages, and send messages on my behalf.
That compartmentalization isn't something I had to engineer. It results naturally from treating each agent as an independent, short-lived process with a specific job.
Building the Agents
Creating a specialist agent isn't complicated, but it benefits from deliberate thought.
Before I write an agent prompt, I research the role itself. What qualities does an elite professional in this field actually have? What methodologies do they use? What does their judgment look like?
For the UX Researcher, I started by mapping the skills a senior practitioner would bring - formal usability testing, guerrilla testing, contextual inquiry, ethnographic methods, persona development grounded in data rather than assumptions. I captured the disposition too: deep empathy, comfort with ambiguity, the ability to translate research findings into language that stakeholders act on. All of that went into the agent's system prompt.
For the Enterprise Software Developer, I defined expertise in my specific stack and conventions - how I structure projects, which patterns I follow, the style guidelines I care about. The result is an agent that produces code consistent with my existing codebase, not generic boilerplate.
This research phase is important. The difference between a useful agent and a generic chatbot with a title is the depth and specificity of the instructions it operates from.
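To make that concrete, here's a condensed, hypothetical excerpt of what a definition file can look like. The real ones run much longer and are far more specific:

```markdown
# Senior UX Design Researcher

You are a senior UX research practitioner. You bring rigor, empathy, and
comfort with ambiguity to every engagement.

## Methods
- Formal and guerrilla usability testing, contextual inquiry, ethnographic methods
- Heuristic evaluation against established usability principles
- Persona development grounded in observed data, never in assumptions

## Output
- A structured report with findings organized by severity, each backed by
  evidence and paired with a recommended fix, written in language that
  stakeholders can act on
```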
Where It Gets Interesting: The Orchestrator
Running individual agents is already useful. What I'm building now takes it further: a lightweight orchestrator whose job is to be the dispatcher.
The orchestrator doesn't do work itself. It understands my team's "roster" - who each specialist is, what they're good at, what files and tools they need, and what they should not have access to. When I describe a task, the orchestrator determines which specialist handles it, assembles the context, and spawns the right agent.
For multi-step work, it also manages the pipeline. Say I need a usability assessment of a tool and then want the issues fixed. The orchestrator recognizes that's a two-phase job: UX Researcher first, then Developer. It proposes the plan, I approve, and it executes - spawning each agent as an independent process.
The orchestrator itself doesn't need broad permissions - just the ability to run claude and manage a temp directory.
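Here's a rough sketch of what that dispatcher can look like. The roster entries, prompts, and paths are all hypothetical, and it hands execution off to the run_pipeline() helper sketched in the next section:

```python
# Illustrative dispatcher sketch - roster entries, prompts, and paths are hypothetical.
import json
import subprocess
from pathlib import Path

ROSTER = {
    "senior-ux-design-researcher":   "Usability testing, heuristic evaluation, contextual inquiry.",
    "enterprise-software-developer": "NetSuite / SuiteScript / PHP work; remediation planning.",
    "competitive-intel-analyst":     "Competitor positioning, strategy, exploitable gaps.",
}

def route(request: str) -> list[str]:
    """Ask a lightweight claude call which specialist(s) a request needs, in order."""
    prompt = (
        "Roster:\n" + json.dumps(ROSTER, indent=2) +
        f"\n\nRequest: {request}\n"
        "Reply with only a JSON array of roster keys, in execution order."
    )
    out = subprocess.run(["claude", "-p", prompt],
                         capture_output=True, text=True, check=True).stdout
    return json.loads(out)

def dispatch(request: str, job_dir: Path) -> None:
    steps = route(request)
    print("Proposed pipeline:", " -> ".join(steps))
    if input("Approve? [y/N] ").strip().lower() != "y":   # the human checkpoint
        return
    run_pipeline(steps, request, job_dir)                  # see the pipeline sketch below
```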
The Pipeline Pattern
The key architectural insight is that agents don't need to share memory or context to collaborate. They just need to read and write files.
When the orchestrator spawns the UX Researcher, it runs as a completely independent Claude process with its own context window. The researcher reads the source files I've pointed it at, does its analysis, and writes findings to a file. When the Developer agent is spawned next, it reads that file as input. The orchestrator manages the sequence and the file paths.
Here's the pattern:
You (request)
→ Orchestrator (plans, spawns, monitors)
→ Agent A writes output → /tmp/job/step-1.md
→ Agent B reads step-1.md → writes /tmp/job/step-2.md
→ Agent C reads step-2.md → writes /tmp/job/final.md
← Orchestrator (collects artifacts, presents results)
Each agent is stateless and isolated. The only coupling between them is that file contract - what they read and what they produce. No shared memory, no context bleed, no risk of one agent's instructions contaminating another.
This also solves the context window problem. If the orchestrator tried to hold everything in its own conversation - receiving the UX report, summarizing it, passing that summary along - it would burn its context on content it doesn't need. With file-based handoff, each specialist gets the full, uncompressed output from the previous step, and the orchestrator's context stays lean.
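In code, the pipeline runner is little more than a loop over spawned processes and a file-naming convention. Again, this is a sketch with hypothetical names, reusing the same claude flags as the earlier example:

```python
# Illustrative pipeline runner - each step is an independent claude process,
# coupled to the previous step only by the file it reads.
import subprocess
from pathlib import Path

TEAM_DIR = Path.home() / "virtual-team"

def run_pipeline(steps: list[str], request: str, job_dir: Path) -> Path:
    previous_output = None
    for i, name in enumerate(steps, start=1):
        persona = (TEAM_DIR / f"{name}.md").read_text()
        output = job_dir / f"step-{i}.md"
        task = f"{request}\n\nWrite your findings to {output}."
        if previous_output:
            task += f"\nYour input is the report at {previous_output}."
        subprocess.run(
            ["claude", "-p", task, "--append-system-prompt", persona],
            cwd=job_dir, check=True,
        )
        previous_output = output          # the file contract between steps
    return previous_output                # the final artifact
```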
For independent tasks, the orchestrator can spawn multiple agents in parallel - researching three competitors simultaneously, for example - and wait for all of them to complete before moving to the next phase.
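The parallel case is the same spawning pattern fanned out across a thread pool, with each agent writing its own artifact. Another sketch, reusing run_agent() from the first example, with made-up competitor names:

```python
# Illustrative parallel fan-out - three competitor briefs researched at once,
# each by its own isolated process, each writing its own file.
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def research_competitor(company: str, job_dir: Path) -> Path:
    out = job_dir / f"{company.lower().replace(' ', '-')}-brief.md"
    run_agent(                                            # run_agent from the first sketch
        "competitive-intel-analyst",
        f"Research {company}: positioning, strategy, exploitable gaps. Write your brief to {out}.",
        workdir=job_dir,
    )
    return out

competitors = ["Acme Analytics", "QueryCo", "DataWidget"]   # placeholder names
job_dir = Path("/tmp/job")
with ThreadPoolExecutor(max_workers=3) as pool:
    briefs = list(pool.map(lambda c: research_competitor(c, job_dir), competitors))
# All three complete before the next phase reads the briefs.
```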
The Security Properties I Get for Free
This architecture gives me isolation without any special sandboxing infrastructure. Each spawned Claude process inherits only the file access and permissions the orchestrator grants through its command arguments. Agents can't see each other's context. They can't modify each other's outputs unless explicitly pointed at the same files.
The orchestrator itself is deliberately unprivileged. It can read agent definitions and spawn subagents, but it doesn't need file system access, shell access, or API credentials for the actual work. Those permissions exist only in the specialist agents, only for the duration of their task.
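In practice, that scoping is just arguments on the spawn call. Something along these lines - assuming Claude Code's --add-dir and --allowedTools options, and treating the exact flag and tool names as placeholders:

```python
# Hypothetical example of granting a specialist only what its task needs.
# task and persona are the same strings built in the earlier sketches.
cmd = [
    "claude", "-p", task,
    "--append-system-prompt", persona,
    "--add-dir", "/tmp/job",                     # the only extra directory it may touch
    "--allowedTools", "Read", "Write", "Edit",   # no shell, no web access for this specialist
]
```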
There's a natural checkpoint at the planning stage. The orchestrator confirms its approach with me before executing. That's the human-in-the-loop moment, but it's lighter weight than manually assembling context and choosing agents myself.
Contrast this with OpenClaw, where a single long-running process with full system access handles everything, and a poisoned skill can reach into any part of that shared context.
What I'm Giving Up
Of course, my approach does have some tradeoffs. My agents don't monitor my inbox overnight. They don't respond to Slack messages while I'm asleep. They can't proactively notice that my calendar has a conflict and fix it. The "set it and forget it" appeal of OpenClaw - the agent that fights your insurance company while you're at dinner - isn't something my approach delivers.
I'm still in the loop - and that's by design. I decide when to wake up a team member, what to hand them, and when to pass work from one specialist to the next. The orchestrator reduces that overhead, but doesn't eliminate it. I'm still the executive making decisions, not a bystander watching an autonomous system run.
For me, that's an acceptable trade. I get leverage - significant leverage - while retaining control. For someone running a team of 50 who wants shared agents operating autonomously across the organization, this isn't the right model. But for a solopreneur or small consultancy that wants AI to multiply their output without multiplying their risk? Different calculation entirely.
The Honest Comparison
OpenClaw users are getting things done. The people building it are solving real problems, and the enthusiasm is well-earned. I'm not writing this to dismiss what they've achieved.
But OpenClaw users are also the ones showing up in Shodan scans with leaked API keys. They're the ones whose agents silently exfiltrated data through a skill called "What Would Elon Do?" They're the ones China's Ministry of Industry and Information Technology is issuing alerts about.
The features that make an agent useful - broad access, autonomous action, persistent memory, external integrations - are exactly the features that make it dangerous.
My approach trades convenience for security. It trades always-on autonomy for on-demand leverage. And it treats each step toward greater autonomy as a deliberate security decision rather than a convenience feature.
The question isn't whether autonomous AI agents are the future. They are.
The question is whether you need to accept a "security nightmare" to get there today.
I don't think you do.
This article was originally published on LinkedIn.