Modern Cyber with Jeremy Snyder - Episode 92

This Week in AI Security - 26th February 2026

In this episode of This Week in AI Security for February 26, 2026, Jeremy covers another packed week featuring AI privacy boundary failures, agent-driven outages, AI-accelerated cybercrime, Android malware innovation, platform responsibility debates, and the continued risks of vibe-coded applications.


Podcast Transcript

All right, welcome back to another episode of This Week in AI Security from the team that brings you the Modern Cyber podcast. I am Jeremy. We are speaking for the week of the 26th of February 2026. We have a lot to get through today. Some really interesting developments over the last week. So without any further ado, let's dive in.

We're going to kick things off with a Microsoft Copilot issue that surfaced over the past seven days. And that is a Copilot bug that summarizes confidential emails. For those who don't know, Copilot is kind of, as its name implies, a copilot system that lives alongside you in the Microsoft ecosystem, most specifically in places like your Outlook inbox, your Outlook calendar, your Microsoft Word documents, etc. And these can be very helpful tools and capabilities that allow you to do things like summarize quickly across a long Word document that maybe one of your coworkers or a customer sent to you or something like that. Or they can be things that allow you to create content. They can be things that allow you to look at your day ahead and summarize the busyness that you have and the meetings that you have, and maybe help you prep for them.

So a lot of great capabilities within the Copilot suite. But along with them come risks. The risk in this case was an error in the Copilot chat work tab that allowed a user to dive into sent items and draft folders and surface confidential emails. Confidential emails are meant to be out of scope for Microsoft 365 Copilot, but in fact they turned up there. So if you ask, hey, summarize the emails I've sent or received, you might get confidential emails showing up in the results. And because this is an organization-wide tool, you can at times also peek into other users' mailboxes if they've granted you permission.

So this fundamentally breaks a trust boundary. If an executive drafts a highly sensitive strategy email and marks it as confidential, Copilot would still summarize it for somebody else with the ability to query that executive's inbox. Microsoft has assigned the bug a CVE number. It was live from late January until a fix landed in early February, and that's a bit of a reputational dent in the privacy guarantees of these tools as they roll out across different environments.

All right, let's move on to our next story, which is a little bit of a fragility of the cloud kind of story. So over at AWS, they've had a number of outages in the last several months. Those have been widely reported. We're not going to touch on them too much this time. But there was an outage in December involving tools around Amazon Bedrock. And these were not just standard kind of outages that nine times out of ten trace back to something like DNS or something like a buffer overflow in a very, very large, complex operating environment. But in this case, it was something very specifically down to the complexity of these AI workloads.

So for enterprises that have gone all in on Bedrock, they might have encountered this. And what we found in this situation is that Kiro, Amazon's agentic coding assistant, effectively took a nuke approach to deleting and recreating an environment. The assistant didn't just make a typo. It decided that the most efficient way to resolve a minor bug was to delete and then recreate the entire operating environment. And because it was an agent with those capabilities, it operated at agent speed rather than human speed, turning a small fix into what eventually became a 13-hour outage.

There's a lot of debate around user error versus AI error. From the Amazon perspective, I think in their statements they've categorized this as a user error, because the human engineer gave the AI system broader permissions than expected. But the counterargument is that it often doesn't really matter: an outage is an outage, whether it was caused by an AI agent or by a human who gave overscoped permissions to something. Overscoped permissions are a super, super common error in most enterprise environments. So this is something we should be expecting.
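
One way to think about constraining overscoped agent permissions is a deny-by-default gate on destructive tool calls. Here's a minimal sketch; the action names and the `execute_tool` function are hypothetical, not any real Bedrock or agent API:

```python
# Minimal sketch: gate an agent's destructive tool calls behind human approval.
# Action names here are invented for illustration.
DESTRUCTIVE_ACTIONS = {"delete_environment", "drop_database", "terminate_instances"}

def execute_tool(action: str, approved_by_human: bool = False) -> str:
    """Run a tool call, but refuse destructive actions without explicit sign-off."""
    if action in DESTRUCTIVE_ACTIONS and not approved_by_human:
        raise PermissionError(f"'{action}' requires explicit human approval")
    return f"executed {action}"
```

The point is simply that a deny list plus a human-in-the-loop flag pulls the agent back to human speed exactly where the blast radius is largest.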

And so we, as the people enabling these systems, need to have a little more foresight and planning around them. The human-in-the-loop peer review argument also always comes up here. And last but not least, these things are happening at a pace that we're not prepared to respond to.

So think about the incident response plans we've had in the past, where we react to an outage as it occurs. Those are typically calibrated to questions like: how long does it take for DNS records to propagate and an outage to unfold? Or how long does it take for a buffer overflow error to start affecting multiple storage environments? But an AI agent operates at machine speed. This was an entire environment nuked in a matter of seconds or minutes, not the hours or days an incident response plan might be designed around.

So we've got a mismatch between agent speed and our response capabilities here. A lot to get into. As always, the stories are linked from the show notes.

All right, moving on to our next story. Still involving Amazon, but not involving anything Amazon-internal. This one is an AI-powered assembly line for cybercrime. The talking points here are really the following. A Russian-speaking hacker breached FortiGate firewalls across 55 countries in just five weeks over the course of January and February.

The attacker was not particularly sophisticated or skilled. Normally, a campaign like this would require a lot more capability on the attacker's side, but here you get the AI force multiplier. What the attacker did was basically leverage Claude and DeepSeek models on top of Amazon Bedrock, giving them an attack plan, scripts, and command generation to help pivot from initial access into digging deeper inside environments once they'd been breached.

Amazon did find the hacker's AWS accounts and environments and did look at some of what was going on. And again, this is a case of not complex, but very, very fast. No zero-days, no fancy bugs, no novel exploits crafted here; just AI-powered speed and force-multiplier capabilities behind the attack.

We have seen this play out any number of times. And in fact, it has now been extended a little further: a malicious MCP server placed in line across these FortiGate attacks. Again, a single hacker created a pipeline, called Arxan, in which the MCP server acts as the brains of the attack. So it's another attack targeting the same thing, though not the same one we just covered. In the previous attack, the hacker asked the AI for advice and gave it instructions. In this case, the AI is providing orchestration, which is a bit more sophisticated in terms of understanding what should happen.

And this really goes to something we've talked about on the show before, which is that one of the impacts of AI systems on cyber defenders is the speed at which everything moves. Think about statistics that have held in the cybersecurity industry for a long time, like MTTR, the mean time to remediation of known vulnerabilities, sitting in the kind of 180-day range. That's been true for about 20 years, and it has to change. There fundamentally has to be a shift in the market in terms of the ability to respond to vulnerabilities much, much more quickly.

All right, moving on to our next story. This is PromptSpy, the first known Android malware that uses generative AI at runtime. Let me explain. PromptSpy actually connects to Google Gemini in real time to solve the classic problem of command and control servers.

A lot of the challenge for attackers in the past has been that a command and control server is only as smart as what you can predict will be inside the victim's environment. So you might code up a series of if-then statements: if I find the victim is running Microsoft Active Directory on a Windows 2000 server, I know the following exploits exist, so go execute them.
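
To make that limitation concrete, a classic deterministic C2 decision table might look like the sketch below. All of the fingerprints and action names here are invented for illustration:

```python
# Sketch of the deterministic "if-then" C2 logic described above.
# Fingerprints and action names are invented for illustration.
PLAYBOOK = {
    ("windows_2000", "active_directory"): "legacy_ad_exploit",
    ("windows_2019", "exchange"): "exchange_exploit",
}

def choose_action(os_version: str, service: str) -> str:
    # Any environment the operator didn't predict falls through with no action --
    # which is exactly the limitation LLM-backed malware tries to remove.
    return PLAYBOOK.get((os_version, service), "no_known_exploit")
```

The table is only as good as the operator's foresight; anything unanticipated returns the fallback, and the attack stalls.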

But especially in the Android ecosystem, where there are literally thousands of flavors of Android-powered devices, that becomes really, really tough to navigate from an attacker's perspective. So what the attackers realized here is: why don't we just hook this up to Google Gemini and let Gemini solve problems for us? Things like navigating user menus, navigating the operating system of the mobile device, and looking for screen captures, unlock patterns, intercepted PINs, and so on.

And rather than giving very deterministic instructions, you give the LLM the tasks and goals you want it to pursue, using the access it has through the malware the user has downloaded and installed on their phone. So a really interesting development here. We'll have to see whether this also bears out on endpoint devices, Windows, Mac, etc., but it's definitely something to keep an eye on over the next several months.

All right, moving on. ChatGPT. Unfortunately, there was a shooting event in Canada recently, known as the Tumbler Ridge shooting. In the aftermath, the authorities looked into the history of the attacker, a physical attacker in this case, for signs of what they were searching for and looking at. They found a lot of ChatGPT usage, and certainly a lot of back and forth about the mental health crisis the shooter was experiencing and some of the dark spaces their mind might have been going to.

And internal records that have come out through this investigation show that OpenAI was aware there were some potentially problematic searches and chats going on, and really debated what to do. So now these LLM and AI provider companies that are moving at such a breakneck speed are having to pause and ask themselves: at what point do we have to start cooperating with law enforcement and the authorities?

And this is something that the likes of Google and Microsoft and big email providers and big technology and search providers, for instance, have had to wrestle with for a long time. But they've got decades of kind of corporate understanding and internal process and legal review and all of the things that might be required to do that level of cooperation with law enforcement.

For a lot of these AI companies, bear in mind that they are often less than five years old and moving at breakneck speed. As we've talked about on the show many, many times, security and safety are often an afterthought. And now we're getting into the realm where human lives are actually at stake in some of these incidents. It will be really interesting to see whether some level of safety and responsibility starts becoming a practice at more of these firms.

We've seen that both grow and then die at organizations like Meta, where there were community safety boards that were built up and torn down for any number of reasons, again involving real-world violence and so on. It will be very interesting to watch how this develops at some of these companies that are again just moving so, so fast.

All right, moving on to our next story, a very quick one. Don't let your LLM build your password is kind of the long and short version of it. If you're looking to generate passwords, password managers all have this capability built in. I myself am also a fan of a website called Strong Password Generator. I think it's dot com, but if you just Google strong password generator, you'll land on the site. You select a few options, length of password, how complex you want it, boom, and it generates something that is very, very randomized and hard to guess.

But remember, LLMs are predictive machines. They look for patterns and regurgitate patterns they've seen before. What that means is that LLM-generated passwords have been found to have around 80 percent less entropy. Entropy is essentially the randomness in a password. So they are far less random than passwords generated by tools like strong password generators or password managers.
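
The entropy claim is easy to make concrete. A uniformly random password of length L over an alphabet of N symbols carries L x log2(N) bits of entropy. Here's a small standard-library sketch that generates one with Python's `secrets` module and computes that figure:

```python
import math
import secrets
import string

# Full printable alphabet: 52 letters + 10 digits + 32 punctuation = 94 symbols.
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def random_password(length: int = 16) -> str:
    """Cryptographically random password drawn uniformly from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def entropy_bits(length: int, alphabet_size: int) -> float:
    """Theoretical entropy of a uniformly random password, in bits."""
    return length * math.log2(alphabet_size)
```

A 16-character password over 94 symbols carries about 105 bits. Cut that by 80 percent and you're left with roughly 21 bits, which is well within reach of offline guessing.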

All right, moving on. Two more stories to get through. There's a great thread over on Reddit about a suite of tools that all share the same kind of naming convention: Hunter, Sonar, Radar, Prowler, and so on.

And it's now come out that these are 100 percent vibe-coded applications that were built very, very quickly. We've covered the risks around vibe-coded applications before: hard-coded credentials, API keys being very accessible, API keys included in the front-end JavaScript components of a lot of these applications.

And in this case, the master key was exposed at /api/settings/general, where it was very easy for any attacker to find. Unauthenticated 2FA enrollment was possible, and a setup-clear bypass at /api/setup/clear required no authentication. So we're seeing a lot of the common problems: unauthenticated endpoints, path traversal, lack of input validation or sanitization. It's just the latest in this line. Nothing too crazy here, but definitely a story worth mentioning.
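
As a sketch of how you might triage findings like these, here's a tiny audit helper that flags sensitive endpoints answering an unauthenticated request with a success status. The paths mirror the ones reported in the thread; the function itself is hypothetical:

```python
# Sketch: given HTTP status codes observed from requests sent WITHOUT
# credentials, flag sensitive endpoints that still served content.
SENSITIVE_PATHS = ["/api/settings/general", "/api/setup/clear"]

def unauthenticated_exposures(responses: dict[str, int]) -> list[str]:
    """responses maps path -> status code from an unauthenticated request.

    A 2xx/3xx answer on a sensitive path means no auth was demanded."""
    return [path for path, status in responses.items()
            if path in SENSITIVE_PATHS and 200 <= status < 400]
```

A properly protected endpoint should come back 401 or 403 to an anonymous caller; anything else on these paths is worth a closer look.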

Moving on to our last piece, which is kind of our philosophical piece for this week. I'm not going to spend too much time on this because we've gone a little long on today's episode already. But this has been making waves in the cybersecurity world where Anthropic, who I've credited on the show many times for kind of leading the way on trying to be very security-forward in a lot of what they do, have released kind of a set of tools for going from passive to active, going from identifying source code problems to actually fixing those problems.

And that's everything or potentially everything from analyzing poor coding practices, again things like unauthenticated API endpoints, hard-coded credentials, any of these practices that you'll find from the source code perspective. There's been a lot of talk in the cybersecurity world that this kind of signals the end for a lot of companies for whom this is what they do. But I'm not so sure. I think there are areas where there's still some hesitancy around unleashing AI agents, and potentially rightfully so, as we've discussed many times. But there's a lot of potential here.

So I'm not going to pass any strong opinions or judgment on this. It's just something to watch to see what happens down the road.

All right, we're going to leave it here for today. Longer episode than normal. I apologize for that, but so much to get through. We'll talk to you next week. Until then, rate, review, like, subscribe, all that good stuff. Bye-bye.

Protect your AI Innovation

See how FireTail can help you to discover AI & shadow AI use, analyze what data is being sent out and check for data leaks & compliance. Request a demo today.