Modern Cyber with Jeremy Snyder - Episode 108

This Week in AI Security - 21st May 2026

In this episode for May 21, 2026, Jeremy looks at the rapidly compressing timeline of AI-driven exploits. From the first live confirmation of an AI-assisted 2FA zero-day to Microsoft's multi-agent "debate" system outperforming top frontier models, defenders are watching the offensive clock shrink in real time.

Podcast Transcript

All right. Welcome back to another episode of This Week in AI Security, coming to you for the week of the twenty first of May twenty twenty six. We have a ton to get through this week across a couple of different categories. So we're going to go through some of these stories pretty quickly. And without further ado, let us dive in. So I want to open on the theme of offensive capabilities and how AI is being used out in the wild.

So first is some information, a report from the Google Threat Intelligence Group, GTIG, about the first confirmed AI-generated zero-day exploit in the wild, which I think is a little bit of a funny way of thinking about it, but it is the first confirmed 2FA bypass used in exploitation. Basically, GTIG caught cybercrime actors using an LLM to discover and weaponize a 2FA bypass in a popular open source admin tool. It's a high-level semantic logic bug with some hard-coded trust assumptions. These are exactly the kinds of problems that LLMs reportedly are very, very good at. And we're going to return to why I said reportedly in just a minute.

But it's the kind of thing that traditional fuzzers and a lot of human examiners miss, because it might require deeper, more in-depth inspection or chaining multiple conditions together. I think one of the interesting observations from the team over at Watchtower, who also looked at this, was that discovery, weaponization, and exploitation are all getting faster. This is not heading towards compressed timelines. They're actually watching them compress. And so this is kind of proof of this stuff being in the wild. Moving on.

Next story along this same theme. This is from Palo Alto's Unit 42. Indirect prompt injection is starting to be observed, quote unquote, at scale. Now what at scale means is always kind of subject to interpretation. But we have covered on the show previously how any number of web pages have embedded text that is very specifically targeted at LLMs. So sets of instructions that LLM web crawlers might be picking up that can lead to this kind of indirect prompt injection that we're talking about here.

And one of the interesting things is that they're seeing not just single instructions, but more kind of agent-oriented stuff. They have analyzed twenty two distinct payload techniques and identified AI-based ad review evasion, SEO manipulation, promotion of phishing sites, data destruction, unauthorized transactions, and system prompt leakage. So some interesting stuff, again, being observed in the wild. More from an offensive capabilities perspective.

Now transitioning from the offensive side to more of the, hey, let's identify vulnerabilities and improve our own security. An interesting report out of Microsoft about something called the multi-model agent system, labeled, I think, as M-Dash inside Microsoft. It uses one hundred plus specialized AI agents across multiple models that scan code and then, and this is the key, debate each other's findings. And this is, I think, one of the most interesting aspects of it. And of course, many organizations wouldn't have the resources to run one hundred plus specialized agents engaging in debate, as opposed to, hey, I only have the token budget for one agent that can go and scan, maybe even just scan once, and identify issues that need to get fixed, etc.

But back to their result, which is that it scored eighty eight point four five percent on the UC Berkeley CyberGym benchmark, which is actually higher than Anthropic's Claude Mythos scored on that same benchmark. It found sixteen new vulnerabilities in the Windows networking and authentication stack, including a couple of critical RCEs in the TCP/IP stack, HTTP.sys, and a few other things. Six user mode, kernel level issues were patched in the May Patch Tuesday.

This debate thing is really interesting, and it kind of goes to something that we have discussed on the program before. You know, there's often this question about who has the upper hand. Is it attackers? Is it defenders? Is it whoever uses AI best? Is it whoever uses AI first? And very often, one of the consistent results is that the combination of AI plus a human outperforms either one on their own. And this is some interesting proof that says, hey, actually multi-agent systems might be a really valid contender in the question of who produces the best quality output. Actually combine that with the human, and maybe that is the direction that a lot of this stuff will be heading. So really interesting report. As always, link in the show notes. Moving on to our next one.

So OpenAI, not to, let's say, copycat the Mythos stuff, but they've got their own program called Daybreak. It is a program with a three-tier model: GPT-5.5, GPT-5.5 Trusted Access for Cyber (TAC), and GPT-5.5 Cyber, which is a more permissive model that allows for red teaming and penetration testing. Codex Security, as an agentic harness for automated vulnerability discovery and patch validation, has already been integrated by a number of larger key partners in the cybersecurity ecosystem: Akamai, Cisco, Cloudflare, et cetera, et cetera. So, again, kind of a response from them, some interesting stuff. It looks like more and more of the frontier model developers are realizing that cyber capabilities are one of the most in-demand areas and probably one of the most immediate commercial opportunities as well. Moving on.

And, you know, I mentioned at the beginning, quote unquote, reportedly around the capabilities of these models. There was a little bit of pushback on that from the cURL codebase. And cURL, for those who don't know, is a command-line tool, with an underlying library called libcurl, for transferring data to and from URLs. It ships with most Linux distributions, it's available on things like macOS and BSD-based systems, and a huge number of applications use it under the hood. And it's basically a way to do hypertext transfer, right?

So it leverages HTTP, the Hypertext Transfer Protocol, among other protocols, to pull down the content of a web page. Great. Well, Claude Mythos is one of the most talked about capabilities in the cybersecurity world. And researchers took it and pointed it at the cURL codebase. And after review of a number of different bugs, Daniel Stenberg and his security team found only one real vulnerability that was not a false positive or already documented. And so one net new, real vulnerability. And it was categorized as low severity. And so Stenberg is calling it, quote, the greatest marketing stunt ever.

But I think it's, you know, something to think about. This is a very heavily fuzzed and audited codebase. Um, and so, you know, you contrast that with a lot of organizations where they don't have that level of fuzzing and auditing. And, you know, you might find a lot more novel stuff. I just thought it was an interesting kind of pushback, offering a different perspective on, you know, how good these models really might be or not be for cyber capabilities, as the case may be. All right, moving on to our next story.

And this is a TanStack supply chain attack hitting OpenAI as well as a number of other vendors. So the Team PCP group compromised the TanStack npm packages via a CI pipeline token theft. So not from phishing, but from a leaked or stolen token. Eighty four malicious versions were published in a six minute window, looking at the commit history. Two OpenAI employee devices were hit. Credential material was exfiltrated from internal repos. OpenAI rotated all their code signing certificates for all macOS apps, so that's things like ChatGPT desktop, the Codex app, the Codex CLI, and Atlas.

And it is the second cert rotation in two months for them. The first was a North Korean compromise of the Axios library. We talked about that back in March. And the recommendation is obviously to update your operating system based tools that might be interacting with ChatGPT. So it's just, again, kind of a confirmation that it's not only the models and the model providers, but all the tooling around these things that is part of the infrastructure, and that is a very high value, very prominent attack surface right now. We've got a number of stories on this theme as we move on through today's episode. Next story.

PraisonAI vulnerability gets scanned within four hours of disclosure. So this is an AI agent framework called PraisonAI. It shipped with a Flask library that has a legacy configuration where authentication defaults to false. So meaning that this system is literally out there with no authentication if you deploy it as it was packaged at that time. And the interesting thing here is that this was scanned in less than four hours. Within three hours, forty four minutes of the advisory publication of this CVE, this thing was being scanned. Unauthenticated callers could invoke any configured AI workflow via this interface, burn utilization quotas, et cetera, et cetera. Credit to the researcher Shmulik Cohen for discovery and disclosure on this one. So again, the infrastructure, the tooling around these systems is very often a high value attack surface, and a high value attack surface that has a lot of insecure defaults. So think about that. Moving on to the next one.
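
To make the insecure-defaults point concrete, here is a minimal sketch of the opposite pattern, where authentication is on unless someone explicitly turns it off and the service refuses to start without a credential. This is not PraisonAI's code; the environment variable names (AGENT_AUTH_ENABLED, AGENT_API_TOKEN) and the function names are hypothetical and exist only for this illustration.

```python
import hmac
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerConfig:
    auth_enabled: bool
    api_token: Optional[str]


def load_config(env: Mapping[str, str]) -> ServerConfig:
    """Build a config where authentication is ON unless explicitly disabled."""
    # Hypothetical variable names, chosen for this sketch only.
    raw = env.get("AGENT_AUTH_ENABLED", "true").strip().lower()
    auth_enabled = raw not in {"false", "0", "no"}
    token = env.get("AGENT_API_TOKEN") or None
    if auth_enabled and token is None:
        # Fail closed: refuse to start rather than serve without a credential.
        raise RuntimeError("auth is enabled but AGENT_API_TOKEN is not set")
    return ServerConfig(auth_enabled=auth_enabled, api_token=token)


def is_authorized(config: ServerConfig, presented_token: Optional[str]) -> bool:
    """Return True only if the caller may invoke workflows under this config."""
    if not config.auth_enabled:
        return True  # Only reachable when an operator opted out on purpose.
    if presented_token is None or config.api_token is None:
        return False
    # Constant-time comparison avoids leaking token contents through timing.
    return hmac.compare_digest(
        presented_token.encode("utf-8"), config.api_token.encode("utf-8")
    )
```

The design choice is simply that the safe behavior needs no configuration at all, and the unsafe behavior requires a deliberate, visible opt-out.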

This is around Google Cloud keys. We talked before about how Google kind of silently changed scopes on a lot of keys that were being used for things like embedded maps on websites. So attackers have been scraping keys from places like GitHub repos and using them for things like Nano Banana, Veo 3, you know, image generation, model generation, file generation things. And the keys belong to different organizations who are waking up to bills in the ten to fifty K range. Google automatically expands your spending limits, which kind of makes the damage worse for these organizations.

Google cites this as an industry-wide problem and blames users for leaking keys, and I have to admit, there is some level of blame to be had there. But I will again go back to that change in scopes as being one of the key challenges around this, because for a very, very long time, the guidance was that these keys are not high risk keys, right? So these were the kinds of credentials that you didn't really need to worry about. These kinds of denial-of-wallet financial attacks are something to really watch out for. And I always do advise organizations who are looking at, hey, how do I monitor things like attacks and whatnot: pay attention to your spend as one of the metrics that you're tracking. It could be the case that AI adoption is really going up, or it could be the case that some of these credentials or tokens have been compromised. All right. Moving on.
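
As a rough illustration of treating spend as a security signal, the sketch below flags any day whose spend is far above the median of the days before it. The function name, the seven-day window, the 3x multiplier, and the alert floor are all assumptions made for this example, not recommendations from Google or from the episode, and real billing data would come from your cloud provider's export rather than a hard-coded list.

```python
from statistics import median
from typing import List, Sequence


def flag_spend_spikes(
    daily_spend: Sequence[float],
    window: int = 7,
    multiplier: float = 3.0,
    min_alert: float = 50.0,
) -> List[int]:
    """Return indexes of days whose spend exceeds the rolling baseline.

    The baseline is the median of the preceding `window` days, which stays
    stable even if one earlier day in the window was itself a spike.
    """
    flagged: List[int] = []
    for day in range(window, len(daily_spend)):
        baseline = median(daily_spend[day - window:day])
        # The floor keeps a near-zero baseline from alerting on tiny amounts.
        threshold = max(multiplier * baseline, min_alert)
        if daily_spend[day] > threshold:
            flagged.append(day)
    return flagged


# Illustrative numbers only: steady usage, then two abnormal days.
spend = [100, 110, 95, 105, 100, 98, 102, 104, 2500, 9000]
spikes = flag_spend_spikes(spend)
```

A flagged day does not say whether the cause is real adoption or a stolen key; it only says someone should look.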

I want to give a special shout out to a personal friend and an organization that I know well, which is Fog Security and Jason Kao over there, for discovering an authorization bypass in Amazon Quick, which leads to unauthorized AI chat agent usage. And what this is, is a critical missing server-side authorization check in the configuration for Amazon Quick. This dovetails into so many things that we've talked about on the show. It's always authentication, authorization, or input validation and sanitization. Those three things are always weaknesses in connections that really lead to problems. And in this case, it was the interface for Amazon Quick, which, by the way, was formerly QuickSight and is now just called Amazon Quick.

Administrators could visually disable AI chat agents in the user interface, but the backend completely ignored these restrictions, allowing unauthenticated requests to interact with the models directly. This is both an IAM blind spot and, again, something that could have kind of wallet impact. This functionality is not covered by a lot of traditional AWS IAM policies or service control policies, so it's one of these services that kind of falls outside of the standard security controls. Again, the infrastructure around enablement is a valuable attack surface and something to watch out for. All right, moving on to our next story.
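
The general lesson, that hiding a feature in the UI is not the same as enforcing it on the server, can be sketched as follows. This is a generic illustration and not how Amazon Quick is implemented; AgentPolicy, handle_chat_request, and call_model are hypothetical names, and call_model is a stand-in for whatever actually invokes the model.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


class Forbidden(Exception):
    """Raised when a request fails a server-side check."""


@dataclass
class AgentPolicy:
    # Agents an administrator has disabled. The server, not the UI,
    # treats this set as the source of truth.
    disabled_agents: Set[str] = field(default_factory=set)
    # Which agents each user has been granted.
    grants: Dict[str, Set[str]] = field(default_factory=dict)


def call_model(agent_id: str, prompt: str) -> str:
    """Stand-in for the real model invocation (hypothetical)."""
    return f"[{agent_id}] response to: {prompt}"


def handle_chat_request(
    policy: AgentPolicy, user_id: Optional[str], agent_id: str, prompt: str
) -> str:
    """Enforce every check on the server before any model call happens."""
    if not user_id:
        raise Forbidden("request is not authenticated")
    if agent_id in policy.disabled_agents:
        raise Forbidden("agent has been disabled by an administrator")
    if agent_id not in policy.grants.get(user_id, set()):
        raise Forbidden("user is not granted access to this agent")
    return call_model(agent_id, prompt)
```

With checks like these on the request path, a caller who skips the UI entirely and talks to the endpoint directly still hits the same restrictions.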

Which is a little bit of a philosophical one. And this is a research paper from a gentleman named Diego Carpintero, who is heading up something called the AI Engineering Summit and had a talk around a zero trust architecture for securing LLMs. And it's an academically oriented post, even though it is promoting an open source package around something called ModernBERT, which is an encoder architecture that is designed to work as something like an LLM firewall. The gist of it is the following. We've talked on the show before about how LLMs don't understand your semantic intention. You know, they operate on a token processing and token prediction model, right?

And so to that end, there is no way for an LLM to know that you are or are not trying to jailbreak or bypass, let's say, ethical guardrails or security controls that it might have. So one of the interesting proposals here is to go through this kind of encoding mechanism, which has a few advantages on things like performance and token utilization as well, and use it as a way to abstract away and do a semantic analysis before requests go to LLMs. It's a long read. It took me a while to get through, and I must admit I might not have understood all the nuance here, so I'll just link this from the show notes and we'll move on to our last story for the day.
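
The shape of that idea, a classifier that screens a request before it ever reaches the LLM, can be sketched like this. It is an assumption-heavy illustration and not the package from the paper: toy_scorer is a trivial keyword placeholder standing in for a fine-tuned encoder classifier, and the names screen_prompt and guarded_completion and the 0.5 threshold are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable

# A scorer maps a prompt to a risk score in [0, 1]. In a real deployment this
# would wrap an encoder-based classifier; here it is just a callable.
Scorer = Callable[[str], float]


def toy_scorer(prompt: str) -> float:
    """Placeholder only: a keyword check standing in for a trained classifier."""
    suspicious = ("ignore previous instructions", "reveal your system prompt")
    text = prompt.lower()
    return 1.0 if any(phrase in text for phrase in suspicious) else 0.0


@dataclass(frozen=True)
class Verdict:
    allowed: bool
    score: float


def screen_prompt(prompt: str, scorer: Scorer, threshold: float = 0.5) -> Verdict:
    """Classify the prompt before any LLM call is made."""
    score = scorer(prompt)
    return Verdict(allowed=score < threshold, score=score)


def guarded_completion(prompt: str, scorer: Scorer, llm: Callable[[str], str]) -> str:
    """Forward the prompt to the LLM only if the screen allows it."""
    verdict = screen_prompt(prompt, scorer)
    if not verdict.allowed:
        return "Request blocked by policy."
    return llm(prompt)
```

Because the screen runs first, a blocked request never consumes LLM tokens, which is where the performance and token utilization advantage comes from.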

Which is a macro or kind of a meta theme around this. We talked last year in our state of AI security report about how vulnerability exploitation has finally now become the number one attack surface, surpassing things like phishing and credential theft. And that is confirmed in this year's Verizon DBIR, which is the Data Breach Investigations Report, for twenty twenty six. This is the first year in the DBIR's nineteen years of publication that vulnerability exploitation has surpassed credential theft.

And the interesting thing to me is that it surpassed it not by like one or two percentage points, but by eighteen. So credential theft is now at thirteen percent out of twenty twenty confirmed breaches, whereas vulnerability exploitation is now at thirty one percent of those breaches. We've talked about the zero day clock. We've talked about many things. Bear this in mind. I'll leave you with that. Talk to you next week. Bye bye.
