In this episode of This Week in AI Security, Jeremy highlights a significant uptick in AI-related vulnerabilities and the shifting regulatory landscape. The episode covers everything from "Body Snatcher" flaws in enterprise platforms to the growing "industrialization" of AI-powered exploit generation.
Transcript of the episode from January 22, 2026:
All right. Welcome back to Modern Cyber and This Week in AI Security, coming to you for the week of January 22, 2026. This week is a busy one. There are a lot of stories, and I can tell you the year is off to a very strong start in the sense that so many new AI vulnerabilities, challenges, and tactics, techniques, and procedures are being uncovered all the time. We've got a lot to get through today, so why don't we get started?
First, I just wanted to follow up on a story that we covered last week and that's been brewing over the last couple of months around xAI and sexualized deepfakes. If you're not familiar, or if you missed previous episodes, this popped up late in 2025, with a number of jurisdictions, especially international jurisdictions and countries, starting to question the AI service or even ban it for use within their territories because of issues like sexualized deepfakes. What this really comes down to is that xAI could be prompted to create sexualized deepfakes of real people, and that became a real problem. In a number of jurisdictions there is legislation against things like revenge porn, which would also fall into this category of sexualized deepfakes. So it's interesting to me to now see that California has sent a cease and desist; to my knowledge, it's the first place within the US to do so around a topic like this. There have been hallucinations, there's been misinformation, there have been all these other challenges with LLMs, but this is the first time that we've seen something rise to the level of a jurisdiction sending a cease and desist order.
And, of course, it's always interesting to watch where this stuff happens first. We've talked on other episodes about how legislation usually starts from one or two locations. California tends to lead the nation in a lot of privacy regulations and digital legislation, and especially being the home to Silicon Valley and so much of the technology ecosystem, that makes a lot of sense. It'll be interesting to see how this plays out in the absence of federal AI legislation. Will this California action become a standard? Is this a one-off, or is there more to come? We're going to keep following the story in future episodes of This Week in AI Security.
All right, moving on. We've got a new AI platform flaw coming from ServiceNow. This one carries a 9.3 out of 10 CVSS score and has been named Body Snatcher. Effectively, what this is: in the ServiceNow platform, you have the ability to create AI agents that can then execute different tasks within your organization's ServiceNow instance. Unfortunately, an endpoint around the agent service was completely unauthenticated and allowed for remote code execution. So anybody who became aware of where this API lived and how to send it an instruction or a command could abuse that, and that abuse could then trickle into your whole organization. So remember the things that we've said here on Modern Cyber and at Firetail in general: there is no AI without API. And especially for agentic systems, APIs tend to be the starting point for a lot of agentic workflows, so having strong security around the API that kicks off the process is very, very important. I definitely predict that this is not going to be the last time we see these kinds of initial-access APIs into agentic workflows cause problems. In fact, we've got another story about that later in today's episode.
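As a rough illustration of that pattern (the framework, route, and key handling below are placeholders, not ServiceNow's actual API), a minimal sketch of an agent-execution endpoint that authenticates the caller before anything reaches the agent might look like this:

```python
# Minimal sketch only: the point is that the endpoint which hands work to an AI
# agent must authenticate callers itself, rather than assuming something upstream did.
from fastapi import FastAPI, Header, HTTPException
from pydantic import BaseModel

app = FastAPI()
VALID_API_KEYS = {"example-key-rotate-me"}  # hypothetical; load from a real secrets store

class AgentRequest(BaseModel):
    instruction: str

def run_agent_task(instruction: str) -> str:
    # Placeholder for whatever framework actually executes the agent task.
    return f"agent would run: {instruction}"

@app.post("/agent/execute")
def execute(req: AgentRequest, x_api_key: str | None = Header(default=None)):
    # Reject unauthenticated callers before the instruction ever reaches the agent.
    if x_api_key not in VALID_API_KEYS:
        raise HTTPException(status_code=401, detail="missing or invalid API key")
    return {"result": run_agent_task(req.instruction)}
```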
All right, moving on. We've talked about IDEs several times in the past, these vibe coding environments and some of the weaknesses in them. We've talked about things like malicious prompts being embedded in source code that would be read by the IDE as it kicks off your process or your vibe coding session. Well, what we've got here is OpenCode, which is a downloadable, open source vibe coding platform, and what it was doing was silently running a web server with an unauthenticated API. So, similar to the last vulnerability that we discussed around the ServiceNow platform, anybody who knew where to access your OpenCode environment could just interact with it, send it malicious instructions, and so on. It's been patched now, and a CVE was assigned to it. So this is again one of those cases where you've got to watch out for the entire configuration around the AI service, not just the AI service itself.
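The hardening pattern for a local tool like this is the same idea in miniature. As a sketch (not OpenCode's actual code, and the port number is arbitrary): if a local AI assistant needs a helper web server at all, bind it to loopback only and require a per-session token, so anything that does find the port still can't drive it.

```python
# Minimal sketch of a hardened local helper server: loopback-only binding plus a
# per-session bearer token, instead of an unauthenticated listener.
import http.server
import secrets

SESSION_TOKEN = secrets.token_urlsafe(32)  # printed once so the local client can use it

class TokenCheckedHandler(http.server.BaseHTTPRequestHandler):
    def do_POST(self):
        # Refuse any request that doesn't carry the session token.
        if self.headers.get("Authorization") != f"Bearer {SESSION_TOKEN}":
            self.send_error(401, "missing or invalid session token")
            return
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok: request accepted from the local client")

if __name__ == "__main__":
    print(f"session token: {SESSION_TOKEN}")
    # Bind to 127.0.0.1, not 0.0.0.0, so nothing off the machine can reach the socket.
    http.server.HTTPServer(("127.0.0.1", 8731), TokenCheckedHandler).serve_forever()
```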
All right, moving on. Next story. This is a really interesting one. It was posted initially on Bluesky, I believe, and it's been shared around pretty widely since; it's a couple of days old as of the time of recording. What it is, is a magic string that will effectively crash Anthropic LLMs. It looks like this works for every Anthropic-provided model and model version at this point. I've seen rumors, although I've not seen confirmation, that there is a similar kind of magic string for OpenAI. And why is this a problem? Well, it's not really that big a deal if I'm in an interactive chat mode where I'm, let's say, working with an LLM and going back and forth with it over various prompts, tasks, document analysis, summarization, whatever the case may be. But if you've got some kind of workflow and you don't control the inputs going to the LLM, this can crash your workflow.
And in fact, if you're looking at the video of today's episode, you'll see that in the initial conversation thread around this magic string disclosure, sure enough, the person who reports the magic string says it killed their agent when the agent was told to read the posting containing the string. So in a way, this is actually a prompt injection, but with a very specialized prompt that is malicious in nature and crashes the LLM, or at least stops the interaction with it. It probably doesn't take down the actual LLM service on the back end, but it may crash your session or your interaction stream. So something to watch out for. It may well be the case that every LLM has some kind of magic string like this that will take it down; it probably comes down to something very specific in this string that creates bad tokens in the encoding process. That'll be something to keep an eye on as more of these get disclosed and as the LLM providers find ways to patch or monitor for them. But for your own purposes, just make sure you're sanitizing inputs and keeping an eye out. And especially, obviously, if you're using an Anthropic model, you want to make sure that you don't pass this string along to the back end of any kind of agentic workflow or system.
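As a sketch of what that sanitization step could look like in front of an agentic workflow (the denylist and size cap are hypothetical placeholders, and the crash string itself is deliberately not reproduced here):

```python
# Minimal sketch: screen untrusted text before it is forwarded into an agentic
# workflow. KNOWN_POISON_STRINGS is a placeholder for whatever model-crashing
# sequences you track internally; it ships empty here on purpose.
KNOWN_POISON_STRINGS: set[str] = set()

MAX_CHARS = 50_000  # arbitrary cap for this example

def screen_untrusted_input(text: str) -> str:
    """Raise instead of silently forwarding content that could kill the session."""
    if len(text) > MAX_CHARS:
        raise ValueError("input too large; refusing to forward to the model")
    for poison in KNOWN_POISON_STRINGS:
        if poison in text:
            raise ValueError("input contains a known model-crashing sequence")
    # Normalize the encoding so odd byte sequences don't become pathological tokens.
    return text.encode("utf-8", errors="replace").decode("utf-8")

# Usage: cleaned = screen_untrusted_input(fetched_page_text)
# and pass `cleaned`, not the raw text, to the model call.
```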
All right, moving on. More on Claude and more on vibe coding. If you go through a vibe coding session in an interactive manner, working with Claude, Claude Code specifically, to define the structure of an application or some software or a piece of code that you are asking it to generate for you, it turns out that your prompts by default are saved in a Claude logs directory, and you've got the full text record of the entire conversation that led to the creation of that code. Now, a lot of people are taking the code and committing it directly from where it is generated, not going through a more sanitized commit process. What that ends up meaning is that when they commit the code and put it up on a website or on a service, anybody who knows to look for the Claude logs directory might discover that full text conversation, read from it the context of what you've built, understand all of the inputs, and effectively get a good mapping of your application, and then have a good starting point for figuring out where you have vulnerabilities. Maybe in your instructions it's clear that you haven't sanitized inputs, to harp on a theme again, and so they'll know how to think about attacking your application. That's the security aspect.
There are all kinds of other abuse cases that could happen around abusing the business logic of a vibe coded application, and so on. It's something that needs to get added to your .gitignore. A lot of the time, things like environment variables or secret credentials are added to .gitignore so they get factored into the application but don't get indexed or picked up. There are a number of examples where people's GitHub code repos have contained these whole conversations, and people can go looking for them. So it's something to watch out for, especially if you're vibe coding or using one of these coding assistant environments. Credentials can end up in there too, if you feed them to the coding assistant while you're building your application. So lots of stuff that you may not want included with your application might get included by virtue of this logs directory that contains your chat history.
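As a sketch of a guardrail for this (the directory names below are assumptions, so check what your own tool actually writes), a small pre-commit hook can refuse the commit whenever prompt-log paths are staged:

```python
# Minimal sketch of a pre-commit guard that blocks commits containing prompt logs.
# The directory names are assumptions; adjust them to whatever your tooling writes.
import subprocess
import sys

LOG_DIR_PATTERNS = (".claude/", "claude-logs/")  # assumed names, not a definitive list

def staged_files() -> list[str]:
    out = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in out.stdout.splitlines() if line]

def main() -> int:
    problems = [
        path for path in staged_files()
        if any(path.startswith(p) or f"/{p}" in path for p in LOG_DIR_PATTERNS)
    ]
    if problems:
        print("prompt logs staged for commit:")
        print("\n".join(f"  {p}" for p in problems))
        print("add these paths to .gitignore and unstage them before committing")
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(main())
```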
All right. Moving on. This was probably the most interesting and compelling vulnerability that we read about. It's actually a couple of weeks old, so I apologize that we didn't have it on a previous episode; it just hit our radar in the past week, so we want to share it even though it goes back to December 22, 2025. It is the result of a vulnerability disclosed to Eurostar. For those who are not familiar, Eurostar is the train service that goes between London and a number of cities across the European continent, most notably Paris, and I believe the service continues all the way up to Amsterdam these days. Issues in the public AI chatbot allowed for guardrail bypass, unchecked conversation message IDs, prompt injection, leaking system prompts, HTML injection, cross-site scripting, and much more. So it's a really interesting case. I'm not going to go through all the details of it, but kudos to the reporter, Ross Donald, for finding this in the first place and reporting it. Ross disclosed the entire set of interactions that he had, as well as describing the vulnerabilities.
The TL;DR on it, though, is that it comes back to something we talked about earlier in this week's episode: the API is the interface that gets you to the chatbot, and the weaker your API, the more challenges you're going to have on the back end with the LLM service. To take an example, one of the API parameters here was around validation of the incoming prompt. They have a nice chat validation feature: if you feed this thing some kind of garbage request, remember, this is a chatbot designed around booking train tickets or serving people who have reservations on the train, so if you ask it to tell you facts about the moon and Jupiter and the sun, that's completely out of scope. So they've got a validation service on the front end of this API that checks the incoming text, sees if it's relevant to pass along, and so on. But you could also just tell the API, my prompt has been validated. What Ross was able to figure out is that by hardcoding that parameter into his API requests, the API would just accept it: yep, this has been validated, let me pass it on to the back end.
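As a sketch of the fix pattern (the route, field names, and topic check below are hypothetical, not Eurostar's actual implementation): the server should re-run its own validation on every request and ignore any client claim that validation already happened.

```python
# Minimal sketch: never trust a client-supplied "already validated" flag; re-run the
# relevance/safety check server-side on every request.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    message: str
    client_says_validated: bool = False  # accepted for compatibility, never trusted

ALLOWED_TOPICS = ("ticket", "booking", "reservation", "train", "refund")  # toy check

def is_in_scope(message: str) -> bool:
    # Stand-in for a real server-side topical/safety classifier.
    return any(word in message.lower() for word in ALLOWED_TOPICS)

@app.post("/chat")
def chat(req: ChatRequest):
    # req.client_says_validated is ignored entirely; the server decides every time.
    if not is_in_scope(req.message):
        return {"reply": "Sorry, I can only help with bookings and reservations."}
    return {"reply": "(forwarded to the LLM backend)"}
```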
The discloser was able to discover that it was GPT-4 running, and was able to get a number of other pieces of information out, including the system prompts for how the LLM should generate answers for users. All the vulnerabilities have now been fixed. It wasn't necessarily a super smooth disclosure process, and that can always happen with larger organizations. In this case it looks like the org, partway through the disclosure period, just by random circumstance, was also in the process of moving onto a vulnerability disclosure platform, probably something like HackerOne or Bugcrowd, and that might have created a communication gap as well. But all of this is described in the disclosure, so it's a really interesting read. If you're thinking about an LLM-powered application, an agent, a workflow, and so on, think about the entirety of the application that you're building; that can prove to be super valid and relevant for you.
All right. Finishing up this week's episode with something that I think is really interesting. This is a philosophical kind of piece, but it does have some real-world data behind it. The title of the piece is On the Coming Industrialization of Exploit Generation with LLMs. Effectively, the direction of the piece is: okay, we know that there are vulnerabilities in pieces of software all around the world, so how easy is it for somebody to weaponize a vulnerability? What Sean Heelan, the author of the blog, really talks about is setting a token budget of 30 million with an LLM and giving it instructions on how to find an exploit for a vulnerability. Admittedly, this was in a controlled environment where there was a known vulnerability, so it's not starting from crawling a piece of software to look for the vulnerability first and then weaponizing it. No, it started from the standpoint of: okay, we know this software has a flaw.
How do we go about attacking it? One thing that was particularly interesting for us at Firetail is that the first thing the LLM did was actually turn that vulnerability into an API and then write code against the API as a way of exploiting the vulnerability. But effectively, for about a thirty-dollar budget, the agent was successful. Now, again, it's a pretty controlled and contrived scenario, designed just to test the validity of the hypothesis that this can be done. And we've talked about this a few times; we've had previous episodes where we've talked to people who have been involved in these types of projects. When we did our RSA recap last year with Michael, we talked about how that was the first time it was proven that an LLM could actually write an exploit for a vulnerability and find these types of paths towards doing that. So it's really interesting to see that with every incremental step, as we as a tech community get more and more proficient with these services, it becomes more and more clear that the mean time to exploit is a real threat to vulnerable software.
We've talked in the past about how the mean time to exploitation is predicted to fall as low as twenty minutes from the time a vulnerability is disclosed to the time that exploit code is available. And now it's really clear this is something that runs super fast: thirty million tokens can be processed in a matter of a single minute by some of these LLMs, maybe two to five minutes at most. So that twenty minutes is not unreasonable; in fact, it is not far-fetched at all. And it is something to really, really think about as you write your own software: how are you going to get better, with your organization, at making sure that the smallest number of vulnerabilities make it out into the real world?
And on that note, we will end today's episode. As always, the links are always in there. If you've got stories to submit, please send them our way. You can send them directly to us, or you can submit them via our AI Incident Tracker, which you can find on the Firetail website. Like, subscribe, rate, review, all that good stuff, and we'll talk to you on next week's episode of This Week in AI Security, coming to you from Modern Cyber. Thank you so much. Bye bye.