This is Part 4 of a four-part series.
Part 1: The Repo You Didn't Scan
Part 2: Chasing the Nine-Tailed Fox
Part 3: That MCP Server You Just Installed
In Parts 1–3, we built a security foundation: automated scanning for repos, MCP servers, and skills before they touch your machine. This post is about what comes next: putting all your AI agents to work together.
Here's something that's true for most serious developers in 2026 but that nobody talks about: you're probably paying for two or three AI subscriptions (e.g., Anthropic for Claude Code, OpenAI for Codex). Each one has strengths the others don't — different models, different tool ecosystems, different reasoning patterns.
And yet most people use one CLI at a time.
The others sit idle — subscriptions you're paying for, agents you're not using. That's not an expense problem. It's an orchestration problem.
The most powerful pattern isn't having all your agents do the same thing. It's having them do different things and combine the results — independent perspectives that you synthesize into something better than any single model produces alone.
You already have the subscriptions. You already have the CLIs. What's missing are the patterns to make them productive together. Let's start with a concrete example.
Here's a pattern I use regularly. Before writing any code, I have two or three models independently spec the same feature. The instruction is wrapped in a skill that directs the master agent to launch independent sub-agents via shell scripts:
# Both agents run in parallel — independent, no shared context
~/skills/spec/scripts/claude_spec.sh &
~/skills/spec/scripts/codex_spec.sh &
wait
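For illustration, here's a minimal sketch of what one of those launch scripts could contain. The -p (non-interactive) flag, the prompt file, and the output path are assumptions for the sketch, not the exact contents of the skill's scripts:
#!/usr/bin/env bash
# claude_spec.sh: sketch of running Claude Code non-interactively on the shared prompt
set -euo pipefail
PROMPT="$(cat ~/skills/spec/prompt.md)"                 # same prompt the master agent hands every sub-agent
claude -p "$PROMPT" > ~/skills/spec/out/claude_spec.md  # each agent writes its own file, no shared context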
Each agent gets the same prompt from the master agent. They write to separate output files. Then the master agent reads both and applies a merge pattern.
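One way to picture that merge, if you were to run it by hand rather than inside the master agent (hypothetical paths; -p is Claude Code's non-interactive flag):
# Synthesize the two independent specs into one (hypothetical paths)
cat ~/skills/spec/out/claude_spec.md ~/skills/spec/out/codex_spec.md \
  | claude -p "Merge these two specs: keep what they agree on, flag conflicts, and preserve concerns only one of them raised." \
  > ~/skills/spec/out/merged_spec.md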
What's interesting is how consistently different models emphasize different things. Claude tends to be good at catching the "gotchas" in the requirements, while GPT is often very thorough on the system design side. These aren't hard rules — AI is probabilistic — but here that non-determinism works in our favor instead of being a hurdle.
The result is a spec that synthesizes the "collective intelligence" of the agents — genuinely different reasoning patterns, not the same model asked to review its own work — preventing some problems before any code is written. When two models independently converge on the same architectural decision, you know it's solid. When one surfaces a concern the other missed entirely, you've caught something that a single-model workflow never would.
I personally experimented with several variations of this that didn't work so well (YMMV).
The principle underneath all of this: independence is the asset. The moment you let the agents see each other's work before they've finished their own, you've collapsed two perspectives into one. This pattern works beyond specs — I use it for content brainstorming, architecture decisions, anything where independent perspectives beat groupthink.
The independence principle gets even more valuable after implementation. Once one agent writes the code, hand it to a different model for review.
Think about what happens when you ask a model to review code it just wrote. It has a sunk cost in every decision — it chose that data structure, that API shape, that error handling approach. When it "reviews" its own work, it's rationalizing, not reviewing. It will find surface issues (a missing null check, a typo) but miss the structural decisions because those are its decisions.
A different model has no such bias. It reads the code cold. It might question why you're using a map where a set would suffice, or why the retry logic doesn't have exponential backoff, or why a function is doing three things instead of one. These are exactly the observations that the authoring model would rationalize away — and exactly the ones that matter most.
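As a sketch of that hand-off, assuming the Codex CLI's non-interactive codex exec mode and a hypothetical diff file and prompt:
# One model wrote the change; a different one reviews it cold (hypothetical prompt and paths)
git diff main...HEAD > /tmp/change.diff
codex exec "Review the patch in /tmp/change.diff as if you had never seen this codebase: question data structures, error handling, and any function doing more than one job." > review_codex.md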
This applies to security review too. Have one model write the code, another run the security scan from Part 1, and a third review both the code and the scan results. Three perspectives, zero shared blind spots.
All of this requires your CLIs to share the same tools — the same MCP servers, the same skills, the same credentials. The good news: the setup is simpler than you'd expect.
The updated install-mcp and install-skill skills handle this. They build on the security foundation from Parts 1–3 — the same gate that scans every MCP server and skill before install now also configures all your CLIs in one step. Scan once, install everywhere.
For MCP servers, the key idea is the credential wrapper: secrets live in exactly one file, and every CLI points to a wrapper script that sources them at runtime.
~/.config/github-mcp/
├── credentials.env (chmod 600 — secrets live here, nowhere else)
└── run-mcp.sh (chmod 700 — sources credentials, launches server)
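For illustration, the wrapper could look something like this (the server binary and the variable it expects are assumptions based on the GitHub MCP server, not the exact script the skill generates):
#!/usr/bin/env bash
# run-mcp.sh: sketch of the credential wrapper every CLI launches instead of the server itself
set -euo pipefail
source ~/.config/github-mcp/credentials.env   # e.g. exports GITHUB_PERSONAL_ACCESS_TOKEN
exec github-mcp-server stdio                  # hand off to the actual MCP server process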
One set of credentials, one file with the right permissions. The install-mcp skill handles the per-CLI config formats automatically — Claude Code, Codex, Gemini CLI, Zencoder CLI. Rotate a key? Update one file. Revoke access? Delete one file.
For skills, it's even simpler: clone the skill once, symlink to every CLI's skill directory. The install-skill script does this in one command and supports install, uninstall, and listing across all CLIs. Update a skill? git pull in the source directory. Every CLI sees the change instantly.
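A sketch of that flow, with example paths (the repo URL and each CLI's skill directories are placeholders; your CLIs may use different locations):
# Clone once, then symlink into each CLI's skills directory (paths are examples)
git clone https://github.com/you/spec-skill ~/skills/spec
ln -s ~/skills/spec ~/.claude/skills/spec   # Claude Code
ln -s ~/skills/spec ~/.codex/skills/spec    # Codex (path assumed)
# Update later with: git -C ~/skills/spec pull, and every CLI picks it up instantly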
(Full setup details and per-CLI config examples are in the skills repo.)
Running the multi-agent pattern manually — launching scripts, managing output files, synthesizing results — is doable for a single task. But doing it consistently, across every feature, with the same quality? That's where the manual approach breaks down.
Zenflow lets you define custom workflows that coordinate multiple agents. A workflow might look like:
Spec: two agents draft the spec independently; a third merges the results.
Implement: one agent writes the code from the merged spec.
Review: a different model reads the implementation cold.
Verify: run the tests and the security review from Part 1.
Each step runs in its own context, enabling agents to perform at their full potential. You define the workflow once and run it every time you need it. And as you go deeper into the AI orchestration rabbit hole, I'm sure you'll build your own war chest of custom workflows that fit your job.
[Disclaimer: I'm the founder of Zencoder, parent company of Zenflow. We have about 50 engineers in the company, and as we transitioned to AI-First Engineering last year, we needed tools to orchestrate all the AI workflows we were running across our SDLC, which led to Zenflow. The basic version is free and works with your existing AI subscriptions.]
I started this series excited about open-source repos but cautious about the quality and security of the code in them, and I wanted to give the community tools to enjoy the crowdsourced awesomeness while staying responsible. It's only fair to close the arc by talking about how to keep your own vibe-coded contributions from ending up on top of security bulletins (never say never, though).
If we look at the workflow above, the verify step is where you integrate a security review, similar to the one we worked through in Part 1.
The reason security practices fail isn't that people don't care. It's friction. If it's manual, it gets skipped. If it's embedded in the workflow, it happens every time. The same agent that wrote the code runs the review — you just told it to, once, in the workflow definition.
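For instance, a verify step in the workflow definition can be nothing more than an instruction to the agent; the scan script path below is hypothetical, standing in for the Part 1 scanner:
# Sketch of a verify step: the agent that wrote the code runs the scan, every time
claude -p "Run ~/skills/security-review/scripts/scan.sh against this repo, summarize the findings, and block the merge on anything high severity." > verify_report.md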
Part 1: scan before you clone. Part 2: understand the recursive trap — your scanner reads untrusted content too. Part 3: secure the supply chain for MCP servers and skills. This post: turn those defenses into a force multiplier by orchestrating your agents together, and build the security review into your workflow so it happens without thinking.
The AI Agent Survival Guide isn't about being paranoid. It's about building the habits and automation that let you vibe code confidently — fast drafts backed by real review, multiple perspectives instead of one, security that runs because the workflow says so, not because you remembered.