Modern Cyber with Jeremy Snyder - Episode
115

This Week in AI Security - 18th June 2026

In this episode, Jeremy explores the fallout of the first US government-mandated global model kill switch, an unprecedented action taken against Anthropic's new Fable model.

This Week in AI Security - 18th June 2026

Podcast Transcript

All right. Welcome back to another episode of This Week in AI security, coming to you from Finland for the week of the eighteenth of June twenty twenty six. And while a lot of northern Europe may be ready to take the summer off, the AI industry is not. We've got a ton to jump into this week, so let's get started. I'm just going to go quickly through a couple of things around topics that we have talked about in any number of times on the show, and that is supply chain infrastructure around AI, etc. with a couple of stories to get started with.

First is one hundred and forty four packages within the master namespace compromised by a single hijacked contributor account. Some malicious code was hidden in a third party library called Key payday. Pushed through an automated publishing campaign only lasted eighty eight minutes, but it did get downloaded and consumed by some people who are users of those packages really highlights the speed and scale of supply chain attacks right now, really also highlights the need to not necessarily just be grabbing the latest version of any package that you use in your own infrastructure or your own stack right now. So do bear that in mind as you think about building out your own agents and applications, etc..

Moving on to the next one, we've got a vulnerability, a critical severity one in the bronze tech AI infrastructure platform, specifically something called the PraisonAI orchestration platform. The key issues here are some authentication evasion, some broken access controls, and automated agent manipulation. This has already been detected in the wild and has been used in exploits in the wild. Again, look at some of the architecture that you're using. Look at some of the components that you're building in. Maybe gravitate towards more established packages. While I know that is very difficult when you're trying to build on some of the latest technology and really be at the cutting edge as we all are right now in AI, but do bear that in mind. Do think a little bit about the packages that you're consuming, etc..

We've got another one on the JetBrains side. This is the JetBrains marketplace at least fifteen malicious plug ins. These are plug ins that do everything from coding assistant to other pieces of functionality. They silently exfiltrated AI provider credentials. And again, we've talked about how all AI consumption happens over interfaces. So interfaces become not only the attack surface of choice, but they are also the number one credential when it comes to things like connection keys. That is how attackers are going to use your AWS or similar environment to, let's say, manipulate models or to run up a big bill by using your keys to run models that they're going to consume. The malicious plugins. And the JetBrains case had been downloaded about twenty five thousand times before being flagged. There's additional risk of some Chrome extensions that are also around this ecosystem.

So moving on, we've got a really interesting, uh, unauthenticated remote code execution vulnerability in the Hugging Face Transformers Environment. So we have been talking about using things like third party tools and packages for a long time as an industry, as a way to get ahead. Why would you reinvent the wheel? Why would you go build something that already exists? And, you know, we've talked about a couple of the stories already in today's episode, but we're going to come into this one here as well. So this is a CVE with the ID of twenty twenty six four three seven two. And it proves that a single polluted line in an innocuous looking config file can potentially give an attacker full RCE on your inference servers. Even if you explicitly told the library not to run remote code.

So these AI frameworks are being built for speed and developer convenience, not enterprise security. We've talked about that trade off and the tension between speed and security. Right now it is an internal field that or it appears to be an internal field with a t t n implementation internal. But that could silently short circuit a security control that really highlights how AI libraries handle external data. And if you think about the blast radius of this, you've got, you know, potentially hundreds, thousands, maybe even millions of workflows, automated CI, CD pipelines, etc., that are pulling down models or communicating with models constantly. And if attacker poisons a community based model that you might be using, they aren't just getting data, they're getting access to your environment and potentially your credentials. If they implant a malicious flag inside some of that, that then exfiltrating data from your environment. It's one of the things that really speaks to a need to have constant inventory so that when a compromised environment like this or a compromised package or a compromised model is identified, you know, immediately what the impact to your organization might be around that.

All right. Moving on. We've got a really interesting story from the folks over at toxic around something that they're calling agent jacking, and they are referring to it as basically this kind of. Way to manipulate automated systems into executing malicious code. So let's break it down real quick. The attack specifically Weaponizes Sentry, which is a very, you know, popular application monitoring tool. It is something that you implement inside your code and it helps you and your software development team understand how your users are using your application. Also helps you identify errors inside the environment. So when a user, let's say, tries to navigate from page one to page two and encounters an error, something in the Sentry monitoring might help you identify what the error is. What is it? A parameter that didn't get passed correctly from page one to page two, etc. and very often Sentry connection keys are intentionally public, and they're embedded in front end JavaScript so that they can be out there collecting all of that user activity.

They, the attackers can pick up these keys. They can use those keys to inject malicious markdown instructions into your error logs. So what happens normally is that as a user is using your application, they hit an error. The Sentry credential is used to then send a request using that key to your Sentry back end saying oh, the user hit an error going page one to page two. Missing parameter, whatever the case may be. Well, knowing that key, it's not hard to figure out how to send your own Sentry formatted messages in a way that the Sentry system will then kind of interpret and potentially act upon. And of course, one of the most tempting and high value use cases is not to have a human process all of the century error messages, but have an AI agent process those Sentry error messages. You know, a developer might say, hey, find and fix my Sentry errors and then use that to update their code base with bug fixes, etc.. And the thing is, this will bypass WAFs. This will bypass EDR tools because this is a normal well formatted Sentry error log, etc.. So it's a really interesting kind of and I like the word agent jacking, but a hijacking of a very legitimate agent for a purpose that is very, very common and high value for an organization and an area where, again, many organizations are going to be look to deploy agents to improve code quality, improve efficiency, improve developer productivity, what have you.

All right, moving on to our next story. This is a one click save in a Microsoft platform that specifically combines an AI specific parameter to prompt injection via copilot search parameters with an HTML rich condition using Bing's image search endpoint. Now I know that's a lot, and it's a very complex chain. Very big kudos to the team over at Varonis who identified this. They communicated it to Microsoft. Microsoft identified, agreed, validated, tracked it as a high severity CVE. It is already patched on the Microsoft platform, so you don't need to worry about that aspect of it. But it does highlight how, you know, there are all these kind of edge cases around particular conditions that may not have been tested in your own QA process, because you can't anticipate every single string that a user might send to an agent. Right? This is one of the things about these agents is they are free form, human looking texts, natural language. And so it's very, very difficult to identify what's going to happen with something like a parameter to prompt injection, etc..

All right. Moving on. We've got a couple of stories around something called the Shai-Hulud Miasma. And of course, Shai-Hulud. You've probably heard that name a worm that was or has been out now for well over a year, I want to say, but it's now making its way into more AI use cases. Got a four point six megabyte JavaScript payload that is being dubbed Shai-Hulud Miasma that has been discovered. It's been validated by several organizations, research teams, etc. it exploits static code analysis on linked source code to find and hijacked AI tools, specifically targeting the exposure of credentials. One of the other interesting things around this is very similar to that last story about Sentry. Uh, we've seen now instances of attackers realizing that one of the ways to avoid detections of malware planted via some of these supply chains is to include instructions about topics that are off limits, such as, let's say, bomb making, nuclear weapons, bioweapons, etc. and that'll prevent the LLM from continuing to analyze the threat because they'll stop analyzing some of the log files and messages that might be coming out of these worms. And there are additional now kind of knock on effects where, um. These malware and worms can get into packages that are including in software development, in projects and so on. And they can do things like leverage an SKS logic around things like predictable naming conventions for, for storage buckets and so on.

So we've got a CVE identified for something here from the unit forty two team over at Palo Alto. They're calling it pickle in the middle exploit. This is a kind of a mini Shai-Hulud thing wrapped into a Python set of packages that are used in an SDK for building on the Vertex AI platform. And so effectively, the attacker identifies that you may be using this. They'll know that your SDK is going to probably predict a bucket name of a very specific format. And so the attacker goes and creates a bucket with that format and maybe leaves it open to you. And so what ends up happening is that you, as a user of one of these compromised SDK environments, you say, hey, I need to store some images or something like that. The SDK checks to see, does this bucket exist? It already does, because it's been created by that attacker who leaves it open for write permissions. And your data can be exfiltrated because you're pushing data out to a bucket that you assume exists in your infrastructure, but might exist in somebody else's infrastructure. It's also sometimes called the bucket squatting vulnerability around that. So think about this again. Again, I would probably point to a mitigation strategy of not necessarily grabbing the latest, but maybe pegging to a known good version.

All right. Moving on. We've talked about kind of the Claude Mythos effect and the Vulnpocalypse, fears of an upcoming, you know, huge surge in number of CVEs reported that is confirmed to be true. We're tracking towards sixty six zero zero zero CVEs. To put that in context, that is, well higher than any year prior to now. It is also directing the or kind of, let's say, urging the center for Cybersecurity and Infrastructure Security Agency to release a new directive BOD two thousand six hundred and four, completely throwing out patching timelines and moving towards a super fast three day deadline for the most dangerous total control software flaws. And if you've paid attention to CISA guidelines over the past, you will know that it's really, really rare to see anything as rapid as thirty days historically. You know, that's for the very, very most critical CVEs in the past. So moving towards a three day deadline is really, um, it's really kind of a major, you know, order of magnitude increase on that side.

All right. Uh, something else from the folks over at toxic. This is something that they're calling the lethal trifecta. When AI agent simultaneously has three dangerous capabilities access to sensitive private data, exposure to untrusted external content like emails or web pages, and the ability to execute outbound actions or communications. When these three things are all coincide. Attackers can weaponize indirect prompt injection. So that will be things like hiding malicious text instructions inside business data like web pages, documentation pages, etc. we've talked about this a couple of times. This is just kind of a nice, consolidated, concise way of framing the three things together.

All right, moving on to our last story for this week. And you know, I like to close on a little bit of a think piece. So let's talk about what's happened with Anthropic over the last couple of weeks. So on the ninth of June, they released their next generation ultra capable tier. This is the Claude Fable Family launch for a general commercial availability with some safety guardrails built in. And the Mythos five family, which had been highly restricted, and safety classifiers under something called Project Glasswing. Glasswing rather had been launched to an approved group of cyber defenders due to its privileges, and Anthropic faced some blowback for enforcing a thirty day data retention policy on Fable users, which was intended to monitor for jailbreaks but did raise some privacy concerns. There was a, uh, a vulnerability flagged by some researchers at Amazon who found a narrow jailbreak vulnerability. In Fable five, the flaw was flagged directly to senior U.S. national security officials, and that led on June twelfth, only five days later to the U.S. Department of Commerce, bypassing administrative delays in issuing a sweeping immediate export control directive, which meant that no foreign national. So only U.S. citizens could access this. And that led to a global deactivation of the model. So a really, really fast time frame here that really sends a lot of shocks around any number of issues, whether that is access to models, model cyber capabilities, sovereignty, which is a big, um, a big topic right now in the European Union, but exporting kind of, or sorry, uh, executing this kind of kill switch is really the first type of action that we've seen at this level. So something to keep an eye on. I'm sure this story is not done.

All right. A lot of stories, a lot we got through this week. I'm sure that was a lot to digest. We'll talk to you next week. Thanks so much.

Protect your AI Innovation

See how FireTail can help you to discover AI & shadow AI use, analyze what data is being sent out and check for data leaks & compliance. Request a demo today.