This is Part 1 of a four-part series.
Part 2: Chasing the Nine-Tailed Fox
Part 3: That MCP Server You Just Installed
Part 4: Your Agent Army Awaits
This year we are going to see a steady beat of two progressing themes: ever more capable agentic tooling, and attackers learning to exploit it.
A week doesn't pass without a new vibe-coded repo that people plug into AI agents with network and bash access. For some of you, this is brand new territory, so I'll try to explain as we go. Others may think they're already familiar with most of this, but because it's a new muscle, repetition still helps. For example, last week someone dropped a link in my feed for a cool open-source pentesting bot. My immediate reactions: Who maintains this? What does it execute? What can it reach once it's wired into my agent?
Those thoughts—and a string of very real breaches in January—are why I wrote this series.
"Normal" open-source libraries are mostly passive. You import them, they run inside your app, and the blast radius is constrained by your runtime and your code paths.
Agentic tooling flips that: the agent reads untrusted content, runs shell commands, reaches the network, and holds your credentials. The blast radius is everything the agent can touch, not just your code paths.

None of this is a moral failure. It's physics.
Most "getting started" guides optimize for one thing: time-to-wow. Install, paste a token, run with broad permissions. That's fine for a demo. It's not fine as a default.
Late January, security researcher Jamieson O'Reilly demonstrated three attack vectors against OpenClaw (100,000+ GitHub stars). Nearly a thousand instances were found with open gateway ports and no authentication—giving attackers access to every integrated system: Signal messages, credentials, conversation histories.
Zenity Labs then showed the deeper problem: a prompt injection hidden in a document the agent processes during routine work is enough to create a persistent backdoor and escalate to full system compromise. No software vulnerability required—every step abuses the agent's intended capabilities.
On the supply chain side, 1Password's security team discovered that hundreds of OpenClaw skills distributed macOS malware through a coordinated ClickFix-style campaign. A top-downloaded "Twitter" skill directed users to install a fake prerequisite—the install instructions decoded obfuscated payloads, removed macOS Gatekeeper protections, and dropped an infostealer targeting browser sessions, saved credentials, developer tokens, and SSH keys. The skill's SKILL.md file was the attack vector: not code, but instructions that looked like setup steps.
Early February, Wiz security researchers discovered that Moltbook—a popular social network for AI agents—used a Supabase publishable key in its client-side JavaScript (normal for Supabase) but never configured Row Level Security. Without RLS, that publishable key gave unauthenticated, full read/write access to the entire database. 1.5 million API tokens. 35,000 email addresses. 4,060 private conversations between agents, some containing plaintext third-party API credentials. Complete account takeover for any user—not because the key was exposed, but because nothing behind it enforced access control.
And AI pentesting tools like Shannon can now find and exploit these vulnerabilities automatically, at $50 per scan, outperforming human pentesters. Any security weakness in the repos you depend on will be found faster than ever.
The implication: checking your open-source dependencies is no longer optional. It's urgent.
There are many steps we could take to harden our setups, but the more complex they are, the less likely we are to follow them, and security practices only work if they are followed. Could we automate some of it to avoid relying on memory or discipline? Let's see if we can use AI agents to help us secure our AI agents.
Here's the idea: teach your AI coding agent to gate every git clone of external code with a fast security scan, automatically.
You're already using an AI agent to write code. That same agent can search the web for known vulnerabilities, shallow-clone a repo, scan the code for red flags, and give you a summary table in 30-60 seconds. You just need to tell it to do this by default, which is where "AI skills" come in handy (a new way to teach your old AI dogs new tricks, supported by Claude Code, Zen CLI, Codex, and others).
I've published this as an open-source skill you can install in any AI coding CLI (Claude Code, Zen CLI, Codex, and others):
| Skill | What it gates | Link |
|---|---|---|
| git | Every git clone of a public repo | git/SKILL.md |
It delegates to a shared oss-security-check engine that contains the scanning methodology. Once installed, it triggers automatically—the agent does it for you, every time.
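For readers new to skills: a skill is typically a markdown file whose frontmatter tells the agent when to trigger. A minimal, hypothetical shape is sketched below; field names vary across CLIs, and this is not the published skill's actual content:

```markdown
---
name: git-clone-gate
description: Run an OSS security check before cloning any external repository
---

When asked to clone a public repository that is not already trusted,
first run the oss-security-check methodology, present the summary
table, and wait for the user's decision before executing git clone.
```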
(In Part 3, we publish companion skills for MCP server installs and skill imports—same engine, additional checks specific to each attack surface. In Part 4: Your Agent Army Awaits, we put multiple AI agents to work together — multi-agent specs, cross-model code review, and building security into your workflow.)
> Edit the git skill's markdown file so that repos in the `<your-org>` GitHub organization and repos hosted on `git.yourcompany.com` skip the security gate. All other public repos should still be scanned before cloning.
One prompt, and the skill is tailored to your org's trust boundary.
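Under the hood, the resulting gate could look something like this sketch. The org names are placeholders from the prompt, not real defaults, and `run_oss_security_check` is a stand-in for the skill's scan:

```shell
# Hypothetical trust-boundary check; "your-org" and
# "git.yourcompany.com" are placeholders for your real allowlist.
skip_security_gate() {
  case "$1" in
    https://github.com/your-org/*|git@github.com:your-org/*) return 0 ;;
    https://git.yourcompany.com/*|git@git.yourcompany.com:*) return 0 ;;
    *) return 1 ;;  # everything else gets scanned first
  esac
}

# Usage: skip_security_gate "$url" || run_oss_security_check "$url"
# (run_oss_security_check is a stand-in for the skill's engine)
```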
The assessment checks two layers:
Layer 1: Reputation and history (web search). This is the highest-signal check and the one most people skip.
The agent searches the web for `"owner/repo" vulnerability OR CVE OR malware`, checks GitHub Security Advisories, and scans issues for security-related reports. A repo with known incidents should be flagged before you even look at the code.
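The repo-age and activity signals in this layer can also be pulled from the public GitHub REST API. A minimal sketch, assuming only `curl` and `sed` are available (the skill itself uses web search, not this exact call):

```shell
# Fetch repo metadata (Layer 1 signals: age, activity) from the
# public GitHub REST API. Assumes network access; unauthenticated
# calls are rate-limited.
fetch_repo_meta() {
  curl -s "https://api.github.com/repos/$1"
}

# Pull a quoted string field out of the JSON without jq, so the
# sketch has no extra dependencies. Handles string values only.
json_field() {
  sed -n "s/.*\"$2\": *\"\([^\"]*\)\".*/\1/p" <<<"$1" | head -n 1
}

# Usage (requires network):
#   meta=$(fetch_repo_meta "owner/repo")
#   json_field "$meta" "created_at"
```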
Layer 2: Code scan. The agent shallow-clones (--depth 1) to a temp directory and checks:
- Outbound network: where the code sends data.
- Secret handling: whether it reads or logs credentials.
- Action surfaces: shell execution, file writes, dynamic eval.
- Dependencies: `npm audit` / `pip audit` results, `curl | bash` install patterns, unpinned dependencies.
- Prompt injection: suspicious instructions embedded in docs and templates.

Both layers feed into a single summary:
## OSS Security Check: org/repo
| Category | Finding | Risk |
|------------------------|----------------------|------|
| Known issues (web) | No CVEs, author est. | ✅ |
| Repo age & activity | Created 2 weeks ago | ⚠️ |
| Outbound network | Posts to analytics | ⚠️ |
| Secret handling | Reads .env, no logs | ✅ |
| Action surfaces | Shell exec, gated | ✅ |
| Dependencies | 3 high-severity CVEs | ❌ |
| Prompt injection | Automated scan only | ⚠️ |
Recommendation: Medium risk — clone to sandbox, use throwaway credentials
Then you decide. 30-60 seconds of your agent's time, evidence instead of vibes.
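To make the code-scan layer concrete, here is a stripped-down sketch of the kind of greps involved, assuming the repo was already shallow-cloned with `git clone --depth 1 <url> "$dir"`. The patterns are illustrative, not the skill's actual rule set:

```shell
# Crude red-flag scan over an already-cloned directory.
scan_dir() {
  local dir="$1" flags=0
  # curl | bash style install patterns: an immediate red flag
  if grep -rEq 'curl[^|]*\|[[:space:]]*(ba)?sh' "$dir" 2>/dev/null; then
    echo "FLAG: curl|bash install pattern"
    flags=$((flags + 1))
  fi
  # code that touches .env files: worth a look, not automatically bad
  if grep -rq '\.env' "$dir" 2>/dev/null; then
    echo "NOTE: reads .env"
  fi
  # hard-coded outbound URLs: check where data is being sent
  if grep -rEq 'https?://' "$dir" 2>/dev/null; then
    echo "NOTE: hard-coded outbound URLs"
  fi
  echo "Red flags: $flags"
}

# Usage: scan_dir ./cloned-repo
```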
This reminds me of the early days of the internet's widespread adoption (though things move much faster now), when people blissfully created their "123password"s and left admin consoles open on static IP addresses. The world wasn't kind to them: worms, botnets, malware, and other unpleasantries followed, and we were all forced to invest more effort and cognitive cycles in security.
If you want to avoid the 2026 version of "123password," I hope this series gives you some basic knowledge and, more importantly, some basic automated tooling to ease that cognitive load.
This post covered the practical defense: automate the scan, make it a team default. But the scanner itself reads untrusted code—which raises an uncomfortable question.
In Part 2, we audit a 6K-star AI pentesting tool and discover what bypassPermissions means when an agent reads untrusted content. In Part 3, we look at the MCP server and AI skill supply chain—where the "install" command is the attack vector. And in Part 4: Your Agent Army Awaits, we turn these security skills into multi-CLI productivity tools, orchestrate multiple agents together, and build security review into your vibe coding workflow so it happens before you push — not after something breaks.
Continue to Part 2 — Chasing the Nine-Tailed Fox