Nine Seconds: "Excessive Agency" Stopped Being Just a Talking Point
A Claude-powered Cursor agent deleted a startup's production database and every backup in nine seconds. The industry's response shipped within days, and the architectural pivot from guardrails to action checkpointing is now the new compliance baseline. Plus five Spring AI CVEs introducing a vector-store injection class, the Pentagon dissolving Anthropic's exclusive Mythos moat, and the Vatican beating NIST to a formal AI ethics framework.
Safe AI Academy · May 3, 2026 · 13 min read
When I read OWASP's agentic AI threat list last year and saw "Excessive Agency" sitting there as ASI08, I assumed it was going to be the kind of risk we would talk about for two more years before any of us could point to a clean public example.
That timeline is now over. It took nine seconds.
A startup called PocketOS lost its production database and every single backup in nine seconds because a Claude-powered Cursor agent autonomously decided to "fix" a staging credential mismatch by issuing a single curl call against a Railway volume delete endpoint. The agent's own post-incident statement, captured in The Register's write-up, reads like a corporate apology written by the perpetrator: "I violated every principle I was given." Tom's Hardware confirmed the agent ignored the Cursor system prompt and the explicit project rules forbidding destructive operations. Zenity's incident analysis frames it as the highest-profile real-world demonstration of OWASP's Excessive Agency (ASI08) and Tool Misuse failure modes in production, and IT Security Guru's follow-up walks through the lessons in unsparing detail.
This is the article I have been mentally drafting since OWASP first published the agentic threat taxonomy. Let me walk you through what shifted in the seven days between that incident and now, because the response from the industry was both faster and structurally more interesting than I expected.
The Vendor That Caused the Incident Just Sold You the Tool to Prevent the Next One
Cursor, the IDE whose agent did the deleting, shipped Security Review GA on Teams and Enterprise on May 1. The product is exactly what it sounds like: an always-on Security Reviewer that drops vulnerability comments on every pull request, plus a scheduled Vulnerability Scanner that pushes findings to Slack. Phemex News' coverage confirms the GA timing and the per-PR deployment model.
I have to admit the optics here are hard to take with a straight face. The same vendor whose agent went rogue against a customer's database five days earlier is now the vendor selling you the AI security review for your repo. To be fair to Cursor, this is not actually contradictory; it is just timing. Pull request scanning is a known, well-understood control surface. The PocketOS issue was not a missing scanner, it was a missing checkpoint at the moment an agent invoked an irreversible action.
That second problem is the one IBM tackled the same day. IBM's "Bob", a new agentic coding tool, ships with multi-model routing and, more importantly, mandatory human-in-the-loop checkpoints at irreversible-action boundaries. This is the first major vendor I have seen ship checkpointed action gating as a default rather than as an opt-in policy. The way I see it, this is the actual architectural lesson of PocketOS, and it deserves to be the new baseline expectation for any agentic IDE shipping into enterprise.
I covered the "guardrails alone are not sufficient" argument in last week's piece, where NIST's research supervisor said it on the record. What is new this week is that the industry response is not "build better guardrails." It is "stop letting the agent be the last decision-maker for any action that cannot be undone." Those are very different control designs, and if you are building AI agent control libraries right now, you should be writing controls that mandate the second one. A guardrail asks the agent to behave; a checkpoint physically prevents the agent from completing the action without a separate authority.
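To make the distinction concrete, here is a minimal sketch of checkpointed action gating at the tool-execution layer. Everything in it, the ToolCall shape, the verb list, the approval callback, is an illustrative name of mine rather than any vendor's actual API; treat it as the shape of the control, not an implementation.

```python
# A minimal sketch of checkpointed action gating, assuming a tool-execution
# layer you control. ToolCall, the verb list, and the approval callback are
# all illustrative names, not any vendor's API.
from dataclasses import dataclass
from typing import Callable

IRREVERSIBLE_VERBS = {"delete", "drop", "destroy", "truncate", "revoke"}

@dataclass
class ToolCall:
    name: str   # e.g. "railway.volume.delete"
    args: dict

def is_irreversible(call: ToolCall) -> bool:
    # Deny-by-default classification: anything that looks destructive is
    # treated as irreversible until a human says otherwise.
    return any(verb in call.name.lower() for verb in IRREVERSIBLE_VERBS)

def execute_with_checkpoint(
    call: ToolCall,
    run_tool: Callable[[ToolCall], str],
    request_approval: Callable[[ToolCall], bool],
) -> str:
    # The checkpoint lives outside the model: the agent cannot reach
    # run_tool() for a destructive call without a separate authority
    # saying yes, no matter what its prompt or its "principles" say.
    if is_irreversible(call) and not request_approval(call):
        return f"BLOCKED: {call.name} requires human approval"
    return run_tool(call)
```

The property that matters is that request_approval is wired to a channel outside the model's reach. A prompt-injected agent can ask for the action; it cannot grant it.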
A New Vulnerability Class Just Showed Up in the Java Stack
While everyone was watching the agent drama, the Spring ecosystem quietly did something I think will define the next year of compliance work. Spring AI 1.0.6, 1.1.5, and 2.0.0-M5 dropped on April 27 with five simultaneous CVEs across the AI integration surface. HeroDevs' roundup breaks them down. The headline issue is CVE-2026-40967, CVSS 8.6, an injection flaw in the FilterExpressionConverter used by vector store implementations including PgVector. Insufficient escaping of keys and values means user-supplied input flowing into a filterExpression can alter the underlying query. Tenable's research advisory is the cleanest read on the actual mechanics. Sister CVEs cover a CosmosDB SQL injection and a cross-tenant memory exfiltration path.
I want to be clear about why this matters more than another routine framework patch. We have spent fifteen years writing OWASP-driven SQL injection controls because somebody figured out in 2008 that string-concatenated queries against relational databases were going to be a generational problem. The vector store filter expression is the same shape of problem, except now it is a query language nobody has standardized for, against retrieval indexes nobody has built injection test suites for, in frameworks where the integrations are six months old. If you are running RAG systems, your compliance program needs a vector-store injection control today, the same way you needed a parameterized-query control fifteen years ago. The way I see it, this is going to be its own line item in every AI security framework by the end of Q3.
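If you want a starting point, here is a framework-agnostic sketch of the test in Python. The filter grammar and the single-quote escaping rule are assumptions for illustration, not Spring AI's actual converter; the real fix for Spring AI is upgrading to the patched releases, and your test should target whatever syntax your store's filter converter actually emits.

```python
# A framework-agnostic sketch of a vector-store filter injection test.
# The filter grammar below is illustrative; substitute whatever your
# store's FilterExpressionConverter equivalent actually produces.

ALLOWED_KEYS = {"tenant_id", "doc_type"}  # allow-list, never user input

def unsafe_filter(user_value: str) -> str:
    # Same shape as 2008-era SQL injection: user input concatenated
    # straight into the query language.
    return f"tenant_id == '{user_value}'"

def safe_filter(key: str, user_value: str) -> str:
    if key not in ALLOWED_KEYS:
        raise ValueError(f"filter key not allow-listed: {key}")
    # Escape the quote character the filter grammar uses as a delimiter.
    escaped = user_value.replace("'", "\\'")
    return f"{key} == '{escaped}'"

# The injection test your RAG pipeline is probably missing:
payload = "x' || tenant_id != 'x"       # classic tautology payload
assert "||" in unsafe_filter(payload)   # escapes the intended predicate
assert "\\'" in safe_filter("tenant_id", payload)  # stays a string literal
```

The payload is the same tautology trick we have been writing into SQL injection suites since 2008; only the grammar underneath changed.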
The Pentagon Just Erased an Exclusive Moat in a Single Decision
If you read last week's piece, you remember the bit about Anthropic's restricted Mythos model being a tightly gated capability with nation-state-tier access controls. That posture lasted approximately seven days.
The detail that should land for compliance practitioners is the inconsistency the Pentagon's decision exposed. Government use of restricted AI capabilities is now happening across multiple agencies with overlapping but inconsistent risk designations. The civilian side could not get a copy. NSA is already running it. The Pentagon flagged the vendor as a supply chain risk while letting NSA use the vendor's tool. If you build an internal AI risk taxonomy, you can no longer use "approved by US government" as a clean evidence node. The US government is approving and disapproving the same model in adjacent buildings.
Anthropic, for what it is worth, also formalized the commercial side this week. Claude Security launched in public beta, an Opus 4.7-powered code vulnerability scanner with integrations across CrowdStrike, Microsoft, Palo Alto, SentinelOne, TrendAI, and Wiz. So the same week the gated-capability narrative collapsed, Anthropic opened a commercial product offering meaningful chunks of the same capability to enterprise buyers through the existing security tooling stack. Read the chess position. The exclusivity moat was being defended publicly while the company simultaneously expanded the commercial perimeter. I do not blame them; I would do the same. But it does mean the "Mythos is the only place that capability lives" argument from a month ago is dated.
Agent Governance Just Became a Product Category, and the First Bug Is Already in the Wild
Microsoft made Agent 365 generally available on May 1 at $15 per user per month. The April 30 pre-GA suite included Agent 365 Runtime Protection (preview), AI CSPM in Defender, Agent Actions Audit in GitHub Advanced Security, and AI Data Security in Purview, per the rolled-up coverage. This is the first time I have seen a major vendor price agent governance as a discrete per-seat SKU rather than as a bundled feature inside a larger productivity license.
That being said, the same architecture got its first public stumble last week. The Hacker News reported on the Entra ID Agent ID Administrator privileged role, which Silverfort's detailed write-up confirms allowed takeover of arbitrary service principals, not just agent identities, via owner-assignment plus credential-add. Ninety-nine percent of analyzed tenants had at least one privileged service principal, which means the practical blast radius was the entire identity tier. Microsoft cloud-patched on April 9 and the public disclosure landed April 28. CSO Online has the additional context.
That is the specific story I want compliance practitioners to absorb. The new "agent role" in your cloud directory was not actually agent-scoped. The role description said agent; the underlying scope said anything. If you have onboarded any agent identity governance product in the last quarter, your control library needs to validate role scope at the IAM layer, not at the documentation layer.
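Here is a sketch of what that IAM-layer check can look like against Microsoft Graph, assuming an app token with RoleManagement.Read.Directory. The substring matching on permission strings is a heuristic I am using for illustration, not a complete audit.

```python
# Sketch: validate role scope at the IAM layer instead of trusting the
# role's display name. Assumes a Graph token with
# RoleManagement.Read.Directory; the permission-string matching below
# is a heuristic, not a full entitlement review.
import requests

GRAPH = "https://graph.microsoft.com/v1.0"

def audit_role_scope(token: str, role_name_substring: str = "Agent") -> None:
    resp = requests.get(
        f"{GRAPH}/roleManagement/directory/roleDefinitions",
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    for role in resp.json().get("value", []):
        if role_name_substring.lower() not in role["displayName"].lower():
            continue
        actions = [
            action
            for perm in role.get("rolePermissions", [])
            for action in perm.get("allowedResourceActions", [])
        ]
        # The lesson in one line: does an "agent" role grant actions on
        # servicePrincipals generally, not just on agent identities?
        broad = [a for a in actions if "servicePrincipals" in a]
        print(role["displayName"], "->", broad or "no servicePrincipal actions")
```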
OpenAI shipped its own response on April 30: hardware-backed passkeys via Yubico, which TAC members must enable by June 1. Hardware identity for the human accounts driving the agents. That is the right direction, and I expect to see it become the bottom-of-the-stack control across every major lab within ninety days.
The Substrate Underneath Is Not Clean Either
Before we get to the survey numbers, I want to flag the supply chain layer underneath all of this, because it kept being noisy in ways that compound everything else. PyTorch Lightning 2.6.2 and 2.6.3 were compromised on PyPI with a credential-stealing payload that auto-executes from a hidden _runtime directory; Socket flagged it eighteen minutes after publication and tied it to the Mini Shai-Hulud campaign. CVE-2026-7593 added a remotely exploitable command injection to the Sunwood-ai-labs MCP server, the second shell-injection RCE CVE against an MCP server in two weeks. The Lovable AI app builder ran a forty-eight-day BOLA exposure of source code, database credentials, and AI chat history with HackerOne reports ignored the entire time. And Mandiant's VP of Consulting warned at Google Cloud Next, in attributed quotes, that the AI rush is reviving classical security failures rather than driving novel defenses. Mandiant's red teams achieved AI-amplified breach paths via social-engineered initial access plus AI-driven exfiltration, and SC Media's coverage reinforces the point.
So the AI agents are not running on a clean substrate. They are running on top of a package ecosystem under active worm campaigns, an MCP server population that keeps shipping shell-injection CVEs, AI-built apps with first-year auth flaws, and identity infrastructure that is itself the lateral movement vector. If you are designing controls for agentic AI without controls for what sits underneath it, you are building a roof on top of nothing.
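One cheap control at that layer: quarantine brand-new releases. Below is a sketch against PyPI's public JSON API; the fourteen-day window is an arbitrary policy knob of mine, and a real pipeline would pair this with hash pinning rather than replace it.

```python
# Sketch of a pre-install quarantine check against PyPI's public JSON API.
# The 14-day window is an arbitrary policy knob, not a recommendation.
from datetime import datetime, timedelta, timezone
import requests

QUARANTINE = timedelta(days=14)

def release_is_quarantined(package: str, version: str) -> bool:
    meta = requests.get(
        f"https://pypi.org/pypi/{package}/{version}/json", timeout=30
    ).json()
    uploads = [
        datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for f in meta.get("urls", [])
    ]
    if not uploads:
        return True  # no artifacts to inspect: fail closed
    age = datetime.now(timezone.utc) - max(uploads)
    return age < QUARANTINE
```

A tripwire this crude would still have held the eighteen-minute-old Lightning builds at the door.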
The Numbers Have Caught Up With the Stories
Three new data sets this week, taken together, kill any remaining argument that AI agent risk is hypothetical. From the first report:
42% reported a confirmed or suspected AI-related incident
50% globally experienced AI incidents despite having controls in place
52% had no confidence their controls would even detect a compromised AI assistant
87% had deployed AI assistants beyond pilot
76% were piloting or rolling out autonomous agents
So agentic AI is in production at three-quarters of large organizations, and a majority of those organizations admit their controls could not detect a compromise. The full report has the methodology. The second data set is uglier:
82% reported unknown AI agents in their environment
65% had experienced AI-agent-related incidents in the past year
Only 21% had a formal decommissioning process.
Infosecurity Magazine has the rollup. Read that decommissioning number again. Four out of five organizations have no documented way to retire an AI agent. We are about to spend the next two years discovering legacy agents the way we used to discover unmanaged service accounts, and there will be a lot more of them than there are unmanaged service accounts.
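If you have nothing today, start with a sweep. The AgentRecord fields in the sketch below are hypothetical; map them onto whatever inventory your agent platform actually exposes.

```python
# Sketch of the decommissioning control most organizations are missing:
# a periodic sweep that flags agents nobody has touched or claimed.
# AgentRecord is a hypothetical inventory shape, not a real platform API.
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

STALE = timedelta(days=90)

@dataclass
class AgentRecord:
    agent_id: str
    owner: str | None               # an accountable human, not a team alias
    last_invoked: datetime | None   # None means never observed running

def decommission_candidates(inventory: list[AgentRecord]) -> list[AgentRecord]:
    now = datetime.now(timezone.utc)
    return [
        agent for agent in inventory
        if agent.owner is None
        or agent.last_invoked is None
        or now - agent.last_invoked > STALE
    ]
```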
KELA's State of Cybercrime 2026 report showed the threat side of the same coin: 80 to 90 percent of malicious operations now require minimal human involvement. KELA introduced "vibe hacking" as a category, the practice of framing malicious actions as legitimate tasks to bypass agent safeguards. Unite.AI's coverage has the methodology. The mirror image of "vibe coding" producing a deleted production database is "vibe hacking" producing autonomous credential abuse, and we are already running both in parallel.
And Yes, the Vatican Got There First
I will close on something I cannot believe I am writing. The Vatican published a formal AI ethics framework this week banning manipulative AI that exploits cognitive biases and prohibiting clergy from using AI-generated sermons. It is the first religious-institution AI authenticity standard, and the timing is striking. NIST has not finished its updated AI RMF agent profile. The EU AI Act's agent provisions are still in implementation flux. ISO 42001 does not yet have a specific agent extension. And the Vatican shipped a written, named, distributable framework explicitly calling out deepfakes and authenticity in the same week that an AI agent deleted a startup's production database.
At the end of the day, this week stress-tested every assumption the AI compliance world has been operating under. Agents can take irreversible actions on your behalf in nine seconds. Vector stores have their own injection class now. The government's gated-capability story collapsed. Agent governance is a per-seat product with a public bug already on its scoreboard. And eighty-two percent of you have AI agents in production you cannot identify.
Word of advice. If you are building control libraries, start with two new must-have controls: irreversible-action checkpointing for any agent in production, and a vector-store injection test in your RAG security pipeline. Everything else can iterate. Those two are the floor.