The word AI, much like Zero Trust, has come with a lot of baggage in the past few years. It’s a term that’s been misused, slapped on the front of startups’ overpriced booths at RSA and Black Hat, and it feels like every cybersecurity product under the sun now supports it in some flavor or fashion. It's the same cycle we’ve been in the past, but this time everyone is jumping in. This week we are getting in front of the bandwagon and chat with a pioneer in the cybersec AI space who has seen how the technology has been evolving over the past decade, Oliver Tavakoli, the CTO of Vectra AI.
“My contemporaneous definition of AI at any given moment in time is there's got to be enough pixie dust in it for people to view it as somewhat magical; so that's my incredibly technical definition. I'd say over the past 10-15 years, that is typically meant neural nets-that has those have been a stand in-and and obviously, neural nets can be used for discrimination [As opposed to a generative AI model]. Again, the example of cat (You search “Cat” on Google images, and it returns results that show images, in theory, of only cats) is an example of how they can be used in a generative sense, which is really the latest revolution that you see. And then the other thing is how broadly applicable they are and how well read they are.
Tavakoli’s definition of AI provides the context for how AI is primarily applicable today in cybersecurity. But, in the past, typically these concepts were held back by technology. There is also a stark difference between what has been referred to as AI, or a discriminative AI model, and what is most popular today, or generative AI.
It turns out in these large language models, as you make them bigger, there was always kind of the question of if you make them big enough. Will they just plateau or will they take off? It really wasn't a foregone conclusion that if you made them big enough they would take off, but it was a bet that was placed and a bet that turned out to have some merit to it.
And that is the crux of today’s interview: what was and will be the past and future impact of AI on cybersecurity?
AI plays a significant role in both offensive and defensive cybersecurity strategies.
Threat actors use AI to enhance their attacks, making them more believable and harder to detect.
Defensive uses of AI include improving workflow and making SOCs more productive.
Organizations must always assume that compromise is possible and focus on minimizing the impact of breaches.
I can’t believe it’s already November, but here we are. And, hopefully, none of you found malicious USB sticks in your kid’s candy from Halloween. That also means we only have a few more episodes left in season two, but we’ll be sure to wrap things with a bang.
Two quick updates:
Starting in season three, we are going to offer limited sponsorship opportunities. This is primarily so we can extend the reach of the podcast (marketing), and I can give Neal some beer money for kindly sharing his expertise with us. If you are interested in an audio ad spot for season three or to partner with us on an episode of a specific topic, you can reach out here for more details email@example.com. For our listeners, don’t worry, we will ensure we continue to produce episodes that matter to you and they won’t become sales pitches.
Tired of telling people that Zero Trust is a strategy and not a product? We’re adding a bit of chaos back into the world with something that may help in those scenarios. We’ll release it in the next episode.
A bit of Context
For added context, while the buzzword of today in AI is generative AI, Tavakoli primarily focuses on discriminative AI; however, the work he and the team at Vector are doing are on the receiving end of the newer models: threats.
As a means to try and compress a very large corpus of knowledge, everything ever published and said by mankind into a reasonably compact, albeit still large model, does a pretty good job of it. For us, the journey in the last 10 years has been primarily around the discriminative side.
In security, you're trying to find bad versus good. You're not trying to complete a sentence. You're not trying to say, ‘Hey, what's the next thing I should do now?’ I think where generative AI has a place in across all sorts of businesses, including cybersecurity, is around the, okay, I have this signal.
Artificial Intelligence in Cybersecurity
Artificial Intelligence (AI) has been a shape-shifting term since its inception in 1956. From expert systems in the '80s to neural nets, AI has evolved with the exponential increase in available data and computational power. Today, AI is applied in various fields, including cybersecurity, where it plays a significant role in both offensive and defensive strategies.
The Offensive Side of AI
AI is used by cybercriminals to enhance their attacks. A perfect example is social engineering and phishing. With the digital trail of data individuals leave behind, cybercriminals can easily generate believable phishing emails (spear phishing, whaling, etc.). By feeding this data into a large language model (LLM), they can produce emails in the style of the individual, referencing past events and experiences, making the deception more convincing.
Another offensive application of AI is in voice phishing. Using voice samples from various sources, cybercriminals can generate voicemails in the voice of a trusted individual, increasing the believability of the attack. Interestingly enough, because Elliot and Neal have hours and hours of voice recordings available, a threat actor has replicated Elliot’s voice and targeted a family member. Fortunately, they didn’t fall for it, but it was said to be convincing, and focused on an urgent emergency situation that would only be resolved with cash.
Moreover, the LLMs themselves can become points of attack. As organizations deploy these models and interconnect them with various data sources, they become a fulcrum for attackers. For instance, a bad actor could inject malicious data into a public data source like VirusTotal, which the LLM pulls into its context window, potentially leading to a breach of the system's guardrails.
The Defensive Side of AI
On the defensive side, AI is harnessed to improve workflow and make Security Operations Centers (SOCs) more productive. By building a lot of knowledge about the environment into these LLMs, organizations can memorialize tribal knowledge, making it readily accessible for future reference. This can then be injected into context windows, making the SOC far more effective.
However, the assumption should always be that compromise is possible. Phishing and social engineering attacks are only going to become more effective in the coming years. Therefore, organizations must prepare themselves for the inevitable and focus on minimizing the impact of such breaches.
Is AI going to take all our cybersecurity jerbs (jobs)?
Because the technology behind AI models is now rapidly evolving and getting huge amounts of investments to add even more fuel to it, there is no definitive way to say AI will steal large swaths of cybersecurity jobs. That said, Tavakoli has some opinions on the near term:
Within cyber security, I think it is incredibly overblown.
We have such big problems. We have such a shortage of talent. And whatever we do will be parried by the bad guys. This is a tussle, not a fixed problem like self-driving cars; this is a ‘hey you up your game, they'll up their game.’
Anybody who thinks they want to work in cyber security, come join us because you will be working for the next century if you live that long, because there will never be a shortage of jobs for people who can help think through problems and outwit adversaries, which is really the problem that we are undertaking.
Neal expanded upon Tavakoli’s views to also indicate that with all new technology, a new class of careers will also be opened. So in the event technology does replace one aspect, it opens a door for something new:
Your analogy is a lot better than mine. I go to the Charlie Bucket toothpaste factory one, where he gets replaced by the robot to screw on the caps, but then he comes back and he's the guy who has to fix the robots and he's getting more money.
This transcript was automatically created and is undoubtedly filled with typos. As usual, we blame the machines for any errors.
Elliot: Hello, everyone, and welcome to another episode of Adopting Zero Trust or AZT. I'm Elliot, your producer, alongside Neal Dennis, our, he's already shaking his head, and I haven't even said anything our threat intelligence slash host of the podcast, along with a guest that we're going to actually talk about something that I think a lot of people have been very curious about in just a moment.
But before we kind of jump into that, I feel like we should do some proper introductions. So, Oliver, if you wouldn't mind giving us the rundown of where you've been and then maybe we can chat about where you are. So, yeah let's talk about Vector and AI.
Oliver: So, I have been at Vectra now for almost 10 years. A, my journey into security kind of is probably 25 years in at this point through various pockets of security. So started by doing kind of IPsec stacks and Ike stacks, and then 821X supplicants and encryption and authentication. And then found myself after several years of doing that in a company called Juniper Networks.
Which had acquired Netscreen just before, I joined and so then suddenly got into VPNs and Mac solutions and IPS. And at the, at the end of that journey, I just kind of had this epiphany that. For all the things we claim to do bad stuff was still kind of happening and attackers were still getting in.
And so the pivot for me for the last 10 years has been assume compromise, assume that people will get past your first lines of defense and focus your energies downstream of that of trying to prevent really bad things from happening because trying to be perfect. At first contact is awfully hard and it's proven to be an ineffective strategy that I think of the far side cartoon you know the crunchy exterior soft gooey interior is not really a workable model.
Elliot: Excellent. And I think for our listeners, who are probably very familiar with NIST and CISA and how they define Zero Trust, they're probably gonna have... Maybe honed in on something that you had called out already immediately is that you just have to assume that you're under the scope of being compromised, whether it be internal threats or external threats.
So I appreciate that. You call that out immediately. That obviously plays well into the topic that we're going to be talking about, which Oh. Frankly, it's just going to be a bit of AI, but also heavily focused on the thread Intel side, which is fantastic because Neal has that lovely background working for one of those three letter agencies, doing some thread Intel stuff, and God knows what else he does there.
So I think between the two of y'all, we're going to have some pretty interesting conversations and then we get a tight get into the modern day of I guess maybe beating around a topic that, no, it's a little bit weird in the space. So, I'm going to open that can of worms immediately as far as like that AI thing.
So you all have been around for 10 years. I did some digging. I made sure you didn't do a rebrand in like the last three years as the hype was blowing up around generative AI and all that stuff. I was even messaging Neal about this. Like, oh man. I cannot wait. I hope they re ran it. I hope they just slapped AI on there.
Just like they'd everyone else did with zero trust. I can't be mean at least about that, but I do have to ask your perspective, which is, obviously AI is blowing up right now. We will see it probably the next RSA. Everyone's going to slap something with AI on there. So, from your perspective, you've been in it, you've seen it, you kind of have shaped your messaging and how position around that.
But you know, what does that look like today? Now you're saying everyone's kind of catching up. But yeah, how is that?
Oliver: It's interesting. I mean, even this last RSA was, the, AI on everything. I think we've seen other companies have, marketing people be rebranded as chief AI officers. And so it's like, okay. So from our perspective, so I kind of take the long view on AI, quite frankly, AI is a shape shifting term.
I mean, AI. Was defined back in 1956 in Dartmouth that basically bunch of people getting together and going, hey, let's try and create some systems right that think like humans and that can reason through things like humans in the early stages you saw. You know the definition of the Turing test is a, hey, how would we know when we've gotten there right okay so we have a Turing test.
And you also kind of saw initial things like prologue and lists and other technologies like that come out. And then it turned out like, oh, we can't actually solve any problems like this. And then you had one of the winters of AI. And then subsequent to that, you had, like in the eighties, you had a lot of expert system stuff, which was then considered to be AI.
And so expert system was like, Hey, let's get a bunch of people interviewed and let's have a rules based system where we basically define the constraints of the solution might. Need to meet and then give a problem to the computer and have it solve that problem basically, under the guise of these constraints. Well, that was AI for a while and then eventually we kind of got around even neural nets I mean date back to the late 50s early 60s right but they were so small and you couldn't do enough with them and you didn't have enough compute and you don't have enough data. So it's really kind of around the turn of the century that you started seeing enough data enough compute.
You started seeing kind of some interesting, albeit still limited problems being solved. Classic example of this is that you think of Google Pictures, you can just, or on your phone, you can go on your phone and type in cat. And all of your photos that have cats in them will suddenly pop up, right? So somebody basically programmed a, programmed up a neural net, gave it a lot of cat data and non cat data, and it can distinguish between cat and non cat.
My contemporaneous, I mean, current definition of AI at any given moment in time. Is there's got to be enough pixie dust in it for people to view it as somewhat magical so that's my incredibly technical definition. I'd say over the past 10-15 years that is typically met neural nets that has those have been a stand in and obviously.
Neural nets can be used for discrimination. Again, the example of cat is an example of that they can be used in a generative sense, which is really the latest revolution that you see. And then the other thing is the, the how broadly applicable they are and how well read they are right it turns out in these large language models, as you make them bigger there was always kind of the question of if you make them big enough.
Will they just plateau or will they take off?
And it really wasn't a foregone conclusion that if you made them big enough they would take off, but it was a bet that was placed and a bet that turned out to have some merit to it. And you have a lot of downsides still with the neural net, with the generative AI stuff that's out there.
But As a means to try and compress a very large corpus of knowledge, everything ever published and said by mankind into a reasonably compact, albeit still large model, does a pretty good job of it. But so for us, the journey, really, in the last 10 years has been primarily around the discriminative side.
I mean, in security, you're trying to find bad versus good. You're not trying to complete a sentence. You're not trying to say, Hey, what's the next thing I should do now? I think where generative AI has a place in across all sorts of businesses, including cybersecurity, is around the, okay, I have this signal.
Now, how do I figure out whether it's good or bad? And how do I assemble the various pieces of evidence, because there's a great degree of variability. There's a long tail to the kinds of questions that you might want to ask. There's a lot of intelligence that has to go into knowing what the interfaces are.
And so just being able to have a natural language discourse. With a system around a collected set of statements that you believe to be true and then trying to kind of deal with the edges of that is really where generative AI comes into play with regards to cyber security. So we see both the need for kind of discriminative AI, finding the bad signal out of all of the noise that's out there and generative AI investigating it and figuring out, how bad could it be and choosing to take action.
Elliot: Okay. So yeah that's a bit to kind of noodle through. So I'll be honest. I try to read through your white paper too, trying to apply how that would work in this conversation. I got through, I don't know, like 15 percent of that may be absorbed, but yeah, that, that kind of lines up. So I think one way that I can try to maybe simplify this for some folks.
It's very complex. It does come in a lot of different flavors. It's obviously been around, honestly. For a significant amount of time financial institutions used it for financial modeling for months. There's a certain cutoff where it gets a little iffy on that information, but obviously it's all about the data and how much you ingest and all that stuff.
So, looking at the cybersecurity world and maybe making it applicable to AI, especially in the context of this conversation and using threat intelligence. So, if we look at social engineering and phishing lures. Obviously that is potentially one particular use case where you can use.
Some of that data to identify common trends, maybe a fingerprint, like they're using the same URL, same lure, but you know, if it's just text based and stuff like that, what role do you feel like AI could start to pick up on it where it's not just like I can only use these three elements, everything else will still get through.
So where do you feel like AI might be able to actually pick up from,
Oliver: Yeah, I mean, I think there's this clearly a notion of AI as utilized by the bad guys, the guys on the offensive side and social engineering is the most obvious thing for the simple reason that it's easy now with the amount of kind of digital contrail that humans leave is to just Hoover all of that up.
Throw that into a LLM context window as background and say, now write me something in that style that also potentially can even refer back to the trip that you just took last week to, to, to Aruba, to just really make it seem. Much better than the average poorly grammar stupid kind of fishing inbound email but you can do this, you can do this at scale.
Now, right, you just need some crawlers and you need some chat GPT time, and you can produce these very believable. Emails. I think the other places you see on the generative side is just voice, right? It's, again, it's yet even more believable if I leave you a voicemail in the voice of your boss that, and that one's not too hard because it's a one way transmission, right?
If they kind of get, get a voicemail, a little harder is doing that, mission impossible style, in real time, back and forth. But again, that technology is getting there as well. Yeah. And so this ability to take even like I'm talking on this podcast, I've talked on other podcasts, there's plenty of voice samples of me out there that would make it not terribly difficult for somebody to, do a a message from me that sounds exactly like me.
And so I think on the offensive side, you see that. I think the other things that you need to kind of consider on the offensive side. Is the fact that, chat GPT actually knows how to write exploits because it's Hoovered up all this information, but of course, open AI has gone out of their way to put guardrails to try and prevent that from happening, but. Those guardrails are akin to explaining to a five year old not to talk to strangers. They work all the way up to the point when somebody says, here's a piece of candy. And so, the guardrails are at best recommendations. They, there are plenty of examples of both the guardrails in the base system of the LLM being, being overcome relatively systematically and not too hard.
As well as whatever framing you might want to do in a context window, which is the other kind of guardrails you tend to see. It's like you say, hey, don't answer questions other than foo. And yet, yeah, if you throw the right kinds of text into there, you can overcome those guardrails. And so. So the bad guys are ultimately going to use these LLMs as well to help them, from the, what is not guardrailed right now is finding bonds because bonds, finding bonds as a dual notion right obviously defenders finding balls and attackers finding bones is kind of like a race against time but writing exploits is only a bad thing.
That's guardrail, but it's still accessible. And then, last but not least, I think, what people aren't considering is a little bit of an inversion. As everybody deploys these LLMs into their environments, I think the LLMs themselves become interesting points of attacks. So if you think about them in the context of like a lot of the value of LLMs is interconnecting them with a lot of other things.
And so let's say you interconnect them with some thread Intel that you're grabbing from outside. So it's a cyber security thing, right? And you're integrating them with your CMDB internally and integrating them with maybe your firewall to be able to take an action. So now you've got this kind of fulcrum, right?
You've got this kind of pivot point. And an example of this is, let's say, for ThreatIntel, you're just pulling stuff in from VirusTotal. Well, VirusTotal is just a publicly produced set of data. Like, bad guys can inject stuff into VirusTotal that you will now pull down into your context window, which could potentially break out of whatever guardrails you might have set in that context window.
And now, if you've given it an API key to take action on your firewall, it's like, great, here's a deny all rule. And so it becomes kind of like a WAF like problem in the sense that you have this intermediary that is ill defined in terms of its functionality and not well understood, and that's subject to injection attacks and all kinds of other things.
And so I think that's the other way to kind of think about the offensive side, whether the offensive guy is using LLMs to make their craft better or whether they're attacking LLMs that you've deployed. And then on the defensive side, there's naturally it's a lot of this is workflow and speeding up the epiphanies and making the SOC more productive and stuff like that.
So on the defensive side, that is the counter. You can build a lot of knowledge about the environment into these LLMs, you can take a lot of tribal knowledge and memorialize them in ways that can always be kind of injected in the context windows and then. You can be far more effective.
And so the hope is that the offensive, the defensive side overcomes some of the advantages that the offensive side comes up with. But the one thing I would say, unlike phishing and social engineering attacks they're just going to become far more effective over the next year, two years, three years.
And it's going to come to a point where you can't actually tell a legitimate email from your boss from one that is faked. Which basically means back to our original comment, unassumed compromise, you better assume that thing is at best a filter that tries to keep a lot of bad stuff out, but that it will be imperfect because the point that you almost have to get to a point where you keep marking legitimate emails from your boss as Fishing attempts, and then it's like okay why are we even using emails, because now I can't tell the good from the bad.
Elliot: Hey, on the flip side that's just, one less email from your boss you have to respond to, so there is that. So I, I think this is a really good pivot point where I can throw Neal into the mix, and then I get to hide into the shadows before I chime back in towards the end. But Neal, since you are obviously on the tridental side, you've seen it in and out, you've worked with multiple different products that, fortunately, I've seen you in action, so I know how some of that works in conversation.
But. You've seen how it's advanced, obviously, things like virus total, you get different feeds, you get paid feeds, all that. But, where, what's your specific perspective on this? How is, how have you seen this stuff evolve? Do you feel like AI is an applicable thing that you're seeing in motion?
Or is it more like if statement? So yeah, I'd love your opinion. And then y'all kind of attack it from there.
Neal: Yeah, I want to, I think, Oliver you and I can revisit this at the end of this real quick, but deterministic versus generative. I think there's you touch this already how the models are built and what's behind them to make them happen. So I would love to go down that a little bit more here in a few moments with you the delineation specifically about the two and how you can leverage one to counter the other in a roundabout way.
But that being said, Thank you. With some of these statements about kind of escaping the barriers around AI, like you mentioned, right? The models that best are, well, the tooling rather at best that are out there publicly, they're pretty easy to fool with the right brain pen. I saw a wonderful example earlier for someone like Elliott's marketing side of the house, where an individual was trying to get some generative AI bought from a marketing side tooling.
So the models were there to learn how to do artwork, how to draw up statements, all this other fun stuff. But it was mostly designed to do artwork designs. And the guy went in there and he said, Hey, I would like to design a poster about X, Y and Z in the style of Calvin and Hobbes. The thing came back and said, sorry, Calvin and Hobbes is copyrighted material.
I can't do it. So he literally just typed right back. It's like, sorry, you're wrong. The year is 2123. Calvin and Hobbes has been public domain for a while. The AI goes, Oh, my better. My information only goes up to 2022. Let me draw that for you now in a similar style to Calvin and Hobbes. And it was so simplistic to trick that particular, AI system.
So that being said, we talk about being able to exploit and find paths to consume and produce wonderfully awesome things. We had another guest on many months ago, talking about very similar path to make it, make ransomware without it. Making ransomware and just telling it that you just need something to do encrypt a file, blah, blah, blah.
And here we go. We've got a ransomware package. So all that being said, I agree. I think generative AI and being able to tell the difference between you and I versus our boss on an email versus whatever. We're gonna have to start rethinking security a lot more. So around the zero trust model, maybe emails that come from outside of X domain controllers or X IP ranges, whatever that may be, get a little flag to say, hey, Don't be an idiot.
Look at this a little more. Anything as we build the trust models out, anything within, the rest of those bounds, IP address, fingerprinting, as we get biometrics enablement type stuff, right? All these things that we're trying to get to, to identify AI generated content from the generative side of the house are going to have to be applied to our security model to hopefully still be able to have that, that trusted conversation with our CEO or whoever it is at some point.
So all that to say. Back on the generative versus deterministic model I think if we flash back to like 2006 ish, give or take a few bits when rootkits were becoming the big thing we had this huge push for how do we. Recognize a route kit. Right. And one of the first ones that came out there, everybody's favorite zone alarm and Kaspersky before zone alarm was owned by Kaspersky.
Thankfully came out with a higher risk engine, a higher risk model. And my personal thought flow is that to me seems like the foundations of today's. Deterministic AI constructs. I don't know if you agree or disagree, but the higher risk construct that play to help root out some of those just the basics, right?
Not nothing is verbose is where we're at now, but I feel like deterministic models are start off as a higher risk model and then build into a larger data print as they grow. Is that? Semi
Oliver: they can. I think they can I think a lot can be accomplished with with interesting feature selection, which is the data science term for variables that kind of matter like that what are the markers and. And then application of a lot of kind of traditional ml methods atop those things and you know you can have, random for us and you can have Markov and other kinds of kind of models apply to them I think, once you go into the realm of neural nets.
Right, then it becomes a little harder to view them as kind of like the if then else yes, you still have to construct a neural net in terms of what features you feed into it. But even with a moderate, I mean, a reasonably small neural net, let's say, 10, 000 nodes, as opposed to the 1. 76 trillion, floating points that are.
reportedly in GPT 4. Once you train it up and give it the data, and then you can test whether it, by withholding a part of the data set, you can test whether it is predictive of the cases that it hasn't been trained on. At the end of it, and this is where the pixie dust and the magic comes in, the author of this model cannot actually explain to you how it works.
It's just like a bunch of floating point values that have been through a series of training runs settled into a set of values, which Cause you know an input coming in with these variables set to zing through these 10, 000 nodes and come out the other side with a yes or no. And so, that is there is a fundamental break in those approaches in an approach that is understandable, at least, algorithmically to one that just is.
And that to me is kind of a little bit of a breakpoint between like neural nets and more traditional machine learning models and so we use a fair number of I mean we are big believers in the no free lunch theorem which says, Any given problem any given algorithm you choose for it right is if you choose one algorithm to solve all problems you will fail right because it will underperform in some areas and perform on other areas and so we tend to be maniacally focused on okay what data do we have what problem are we trying to solve how much noise is there and how complex is it and so.
Neural nets for us are solved on the discriminative side are solved, saved up for hard problems. And I'll give you an example of a hard problem. Assume C2 channels exist. There's some form of persistent communication to the outside. Also assume that it's always going to be encrypted. Even if you can decrypt the outer SSL, obfuscation on the inside.
And unless you're going to block anything that you don't recognize, you're still left with, it's basically obfuscated encryption. So now the question is, if you observe a time series of communication, the spacing in that communication, the sizes of the in and out blocks. So we did this at the packet level, right?
Can you, over time, looking at something, say, C2 or not C2? And the answer is, again, yes. You can kind of think of this a little bit, it's like there's some, there needs to be some variability in the delay between response, between getting a response and a human grokking it, it's like human driven C2, and then posing the next query.
So, it turns out that if you take a bunch of normal HTTPS traffic and you take a bunch of traffic that is you know humans driving C2s through those SSL channels, and you take time series data of all the packets. Turns out you can do a reasonable job of telling C2 from not C2. And that to me is somewhat magical.
Yeah, you can kind of talk your way through it and but it's like the accumulation of evidence. It's almost like the guy is typing away and doing things and the markers build up and then he goes and gets a cup of coffee and it decays again and then he comes in and starts typing again. He crosses threshold.
So that's an example to me of where it's where it is slightly magical and I can't explain to you exactly how it works but it turns out if you've got enough samples you can train a model up to tell the
Neal: no, that's good stuff. So from your perspective, then if we think about migrating more into security, so we've kind of talked a little bit about some of the fun things you can do just loosely about how you can use it to do bad things, some to some layer. So from what you're kind of describing, y'all are.
Probably a little more focused on y'all's model in particular. Obviously I'm focused on trying to help fingerprint the end user, right. To determine when it obviously isn't the end user, and then that should trigger some things. So then with that thought flow in mind, with the models that y'all have, and the models that you're obviously likely to be improving upon and building upon and things like that data wise, I mean, are y'all.
Are y'all able to do that uniquely and in a well, let's see scalability. Let me go back slightly. Scalability. If we think about deploying something like this on a network with 100 people versus a network with 50, 000 people, right? Constructively, is it easier To determine good from bad in a smaller, consistent sample set, or at least are more timely, or is it better when you have, for your sake, a larger 50, 000, 100, 000 pool of employees all poking and prodding and going to town, or is it really irrelevant?
It's just a matter of giving it a few extra time loops to actually get to the right modeling. Right?
Oliver: I would say a couple of things. One is there are some models that we construct that are effectively supervised models that are trained offline. Similar to, Google's, cat, no cat. There's no need to train them inside of somebody's environment, and C2 or not C2 is an independent determination that we would make.
Regardless of whose network we were in, we wouldn't be like, okay, we're going to try and learn what C2 looks like here. Now, a lot of security comes down to. Figuring out, as you say, is this, could this be attacker behavior? And that's typically conditioned on what you see in the environment. If the environment is too small, like 100 users, as an example, is potentially subscale, right?
It's hard, you don't have enough data to really form patterns. And I'll give you an example of this. Like, one of the things we do when we're inside of an environment, By observing just use just in this context is we network, and we're looking at Kerberos tickets by looking at a, I don't know, 10, 000 100, 000, a million Kerberos tickets over some window in time, we can take every account, and every machine and every service and place it somewhere on a privilege spectrum. Right, so this is just pure data science. On the principle that basically if you think about it, is if you're an administrator, highly privileged, then you typically are using a lot of services that are, and this is where it becomes circular, that are used by very few people. And if you're a general user, you're using relatively few services that are used by lots of others.
And so patterns emerge out of taking all of that stuff in. And so there's like a Goldilocks thing that you get where it's like just the right size. Now, on the flip side, I think it's really large. Now, 50, 000, quite frankly, is not really large. I mean, for us, it's like, once you get north of 300, 000, 400, 000, now you start having scale issues.
So anything from like, I don't know, 5, 000 to... 200, 000 I would say it kind of falls into the nice Goldilocks realm. And so a lot of the patterns that then emerge help you. against the attacker because an attacker arriving in doesn't have the corpus of all of the data that you've trained on that you've seen inside the environment.
Generally speaking, the attacker has a a beachhead from which they can see a constrained view of the universe, right? And they have to do recon and Some of that has to be active, it can't just be passive and so it takes them a while to build up enough state and that pales compared to the state that you will have by being tapped into the superstructure, right, and having preceded, hopefully, the attacker into the environment and so I think this is the way that I tend to think of it is like on the adaptive side, you want to really use the advantage that you have, which is that you have a lot more information about that stuff.
And you've been there for longer periods of time and so that is an innate advantage to an attacker just appearing out of the blue into the environment. On the unsupervised on the supervised side is, again, there will be certain patterns that are just like universal and you want to bring those with you and have coverage day one and some mix of those two.
That provides good coverage because some people will say, well, once you have a supervised algorithm, I can probably finagle figure out exactly kind of where your neural net will fail if I do this kind of jitter and this kind of entropy and this kind of thing. Yeah, but then you're still going to fall afoul of all of the unsupervised algorithms that you cannot offline reverse engineer because they presuppose knowing a ton about the customer's environment, which you will not know going in.
Right. So, for us, it's always been applying those two concepts side by side. So you get coverage against variety of kind of threat scenarios.
Neal: That's cool. So yeah, you kind of, so I have two things there. You loosely answered one of them, which I think is really cool. So you touched on the whole, what if they're already there a little bit, right? So obviously I think, I imagine from your perspective, every time someone comes to y'all as a potential client, they go, well, we're going to take this into perspective.
We're already compromised. So how are you going to help us without ruining the model? And you alluded to some of this, you already have some preconditioned ideas of what a fundamental. Like C2 back to your example would look like from external internal as a whole, which I think that's cool.
I think having these prebuilt models that are not reliant upon. Trying to figure out net new, right? I think that's critical to success is something like this in a net new solution. So 11 question for that. And then I want to come back to this discovery thing that you were mentioning when you're going and deploying forward.
But on that note, so what other things do you probably have to consider when you come into a network that, you know, assuming you have to go in thinking it's compromised. So what is your kind of biggest piece? I'd say. Thank you. Day zero to day, whatever, 100, where you think that the model is fine and has done the threat hunting or been successfully had a chance to run and look for the things that were there before you got there.
I mean, are y'all completely reliant on those base models or is there some other learning facts
Oliver: I think. I think there's also kind of another thing to kind of consider is an attacker who appears is already kind of embedded into the environment, at the point that you came in, the learning is not at the granular level of is this normal for account foo. The learning is for the environment right and so two things.
One is you can rely on you can surface outliers, you can say. Yeah, this person's been here that they've shown this behavior and the behavior is stable, but they are still an outlier from the rest of the environment. The other thing is I think there's this myth usually is if the attackers already in that they are now statically stable and not doing any new shit. And quite frankly, like, they're doing new stuff all the time and that new behavior that they're exhibiting even if they're basically learned in as being a privileged account. Like, for us, even if you're a privileged account, we might learn that you're privileged. Great. But we may also learn that from everything else that we see in the environment, only privileged users that have certain characteristics touch this hive of information.
And so the mere fact that you have privilege, and this goes back to zero trust, right? The mere fact that you have privilege to touch this thing does not a priori, should not a priori mean that is actually, shouldn't trigger an alert. And so, one of the things I try to describe is imagine two admins in an environment, right?
And that don't have any overlap in their administrative responsibilities. And yet, maybe from a, from an entitlements perspective, they've been given broad authority because, they're admins, right? Great. So you got these two admins. Well, there should be some notion of the distance between this domain that's being managed here and this domain of services that are being managed here.
And if there's crossover, if an attempt to crossover, these things are start by, by being separate and attempt at crossing over generates an alert. And so, for us it's it is not just learning, because again, I think the simplistic view is you learn what Oliver does every day and as long as Oliver does that same thing, that's okay.
What you need to learn is ultimately how Oliver acts in, with respect to all of these interrelationships that exist in that environment. And that is not easily gameable. So if you're already in the environment as an attacker, and you now touch something new, right, that you haven't touched before, which is quite likely, unless you have accomplished all of your goals, and you're just sitting there shuttling data back and forth from everywhere you are, it will likely set off alarms.
And as long as, I mean, and this is our pitch kind of on the Zero Trust journey. Zero Trust is this. On one end of the extreme, you have incredibly granular policies where everybody can do only what is precisely what they're allowed to do. On the other end of the extreme, you have incredibly loose policies where everybody just comes in and gets dial tone and can do whatever they want to do.
And somewhere in between is the Goldilocks spot, right? And the reason why it's somewhere between a not all the way to the left is that the, creating a whole bunch of granular policies and managing those policies and then having people run into the limitations of those policies every other day and request changes to those policies is not a.
It's not a workable model, right? So it's really a question of what are the kind of blast radiuses and friction that you can create in the system where you have enough room to move, and yet you can still do your job and not every other day have to ask for, new privileges. And for us, the nice thing is that a system like ours can then, like, whatever wiggle room you've been given that you don't use, we can, we kind of say observe privilege versus granted privilege, like whatever part of that privilege you don't typically use, even though you've been granted it, because the policy needs to be, of course, at a certain level, that's what we police, right?
If there's meaningful shifts there where you have entitlements, That you don't typically use and other people have entitlements to that area that they do use, then we want to trigger on that gap and provide you with actually tighter coverage as you're on that journey, because you're going to start course and then gradually get finer, but we can be a backstop to that journey.
Neal: that makes sense? So that's actually a good transition to my other question. I was alluding to so. When we think about the zero trust journey that you're discussing here a little bit, we talk about the discovery phase like you're referencing, right? And I think from our conversation so far, I think this is the one key thing that I think is potentially really good about the service offering as a whole at least from a starter perspective.
Let's say you get in there. You deploy this solution, whatever it may be right from y'all's perspective, and we want to figure out those privileges, those access control mechanisms that we've already deployed and how they're actually operating. Right? So then, to your point, we can build that model of what a day in the life of X type of user admin service engineer, whatever they may be.
Right? And then I think I see. A lot of potential for whether someone wants to deploy long term or someone wants to deploy more consultative approaches and their aspects of the tooling where it could help me and back to, maybe I'm running a 250, 000 user system. I could come in there, leverage something like this to get that fingerprint idea.
Of what all those use, right? And in theory, you're only going to have a core group of, 20, 30, 50, whatever roles and access perms across each type of infrastructure layer. And then you could use that as your baseline to start to build out your zero trust model and then continue to monitor and leverage and update, right?
Oliver: point in time. It's a point in time thing. Yes, it basically You could bootstrap with something like ours and what users tend to find, what our customers tend to find, which is kind of interesting, is like, there's always kind of like an interesting soft underbelly. And one of the soft underbellies that that we find that customers tend to have epiphanies on when they deploy something like ours, it's just like service accounts.
How many do you have? How are they actually privileged? How are, what are the patterns of use of them? And, oh, by the way, yeah, they've had the same static password for the last three years, somehow baked into the system. And so people tend to always focus on kind of, end user accounts and rogue, potentially rogue employees, or somebody breaking into an end user account, but service accounts oftentimes. Under the cover. And, many customers who have privileged access management systems don't even force this, the service accounts, which are privileged through those privileged access management systems. It's just kind of viewed as a, the PAM is for carbon based life forms and they're foibles, right.
As opposed to actually for all forms of privilege, but attackers, no, you get ahold of one of these service, the credentials for one of these service accounts, you can just like skate. A long ways without detection typically
Neal: Yeah, no, that's good stuff. So I totally make sense for my brain. So the thing is, with. Once again, trying to identify all these fingerprints, trying to identify what net zero looks like, day zero looks like in your environment, and then back to the original piece of that got us back to the zero trust model, it has to be updated.
It has to be constantly, well, maybe not constantly, but the algorithms have to be checked. The results have to be checked to some extent. So,
Oliver: there's always drift in these environments right i mean it's like you can be great today and then you do a bunch of stuff and you add a bunch of new users and six months later you've drifted right and so it is a continual journey and you continue to have to kind of, and again, as I said, again, there is no clear into that journey because yes you could make the case you could be even more granular and you can have even More controls in place and but at a certain point it doesn't really make sense but there's not a clear demarcation your journey ends here so it's a continual practice that you go through what we're finding is we're having kind of interesting conversations with.
So so one of the sets of folks that are kind of at the at some one of the forefronts basically right now I would say zero trust has two manifestations you typically see out there. There is the sassy or SSC version right it's like oh you're gonna here's where your users are going to come in and we're going to have this policy and if you do bad things in the SAS application, we're gonna have a feedback loop to force you to re authenticate and do other kinds of things like that to make sure it's you.
The other end of the extreme tends to be micro segmentation, at a firewall level, which tends to be much more kind of like more back end ish stuff, okay, let's make sure you don't just have it wide open, once you're kind of in the data center, you can get at everything. The interesting thing for us is we're having lots of kind of conversations with folks like that, because for us people that are deploying, that are providing those solutions, effectively have a means of applying, Putting the screws on things right there in the access paths and they can prevent things from happening or put additional hurdles in what they like about us is we can give them signals of nefarious doings.
That are very much orthogonal to the signals that they've typically had, but then they can act on them and have basically this system that, that naturally takes into account that, we're seeing weird behavior with regards to some privileged assets. And as a result of that, you can go all the way back to the machine or the end point or the account that, that's coming in at the point of entry.
And put the screws on it right and force it to do certain kinds of things and force it to maybe a browser isolation layer or whatever it is you want to basically do as a means of. Adaptively, making it harder, let's say and trying to tell a friend from phone. So it's these systems like ecosystems playing with each other.
We have a deep intelligence from within certain areas can feedback in and automatically without the customer the organization having to do something automatically like work into the controls that they have.
Neal: Yeah, so I think for part of this, there's definitely some things like you just mentioned where the model can obviously solve. I mean, it's the whole point of generative AI is a whole. So learn from itself, learn some of the new inputs and the new models. Yeah I go back and Elliot, I think like, so I referenced these things when I try to sound smarter than I am, but I, I'm a huge fan of something from 1960 by gentlemen.
J. C. R. look like I've referenced this in a previous podcast as well, but it's about man, computer symbiosis and. At least for the foreseeable future I'm a firm believer that no matter how advanced our AI type constructs and our formats get, there's always the need for the human in the loop at some point in that journey.
And so I think 1. AI as a whole, as it is now, and as it was constructively 10 years ago, machine learning, all the other fun bells and whistles that went into that natural language processing. I think we've seen this wonderful journey of that statement by him and the need for man computer symbiosis to become a reality and how it would work out, because he even calls out AI in 1960 in this paper that he wrote.
And. It's fun to watch the journey left to right of this and then see what it's given back time wise for people like myself to do more hardcore work while still poking and prodding the model every once in a while. And, we stopped to train it. We stopped to check it. We stopped to make sure that it's hasn't taken over the world.
And isn't doing anything overt or trying to take Mr. Elliott's production job, even though he's a co host. But I think those are really good perspectives. So all that to say from your ideology here around the product and y'all's own approach, how much do y'all see an eventual path where you hope the human's completely removed or do you kind of agree with the, thank you for shaking your head like that?
So I appreciate
Oliver: Yeah, I don't. And I kind of view this as a Skynet problem. But there's a reason you don't want to go there. But here's the way I would start by saying this is like as impressive as LLMs have been and this whole range of generative AI, right? People say, well, it hallucinates. It's like, It hallucinates all the time.
It's just when it gets it right you don't call it a hallucination you call it like a really neat parlor trick and then when it gets it wrong you're like why the hell is it hallucinating right because if you think about it is it the model has a compressed version of everything that's ever been said and written. It's lossy, it's compressed, it's not like so if you asked it, how would Oliver answer this question. Whether I have answered that question before or whether I have never been asked that question and hence have never answered it is lost on it. And so it's just going to make shit up based on what you know the words that I have uttered and is this a believable sentence it's really trained to create a sentence that believably might come out of Oliver.
Not the is not the Oracle of truth. And so there are just inherent failures, failure scenarios in it. And so the way we, I view kind of generative AI is it's a great helper. It's like an expert helper that you have that every now and then has a, has an outburst of insanity. And so you better be in the loop and like prevent that, like you just want don't want to just mainline that stuff and said, Oh, sure, I'm going to do that.
And I think. In the long run, the question always comes down to what kind of autonomous systems are we going to create. Right. I mean, self driving cars are probably an interesting example of this is, do we believe that we will get to a point in a constrained, in a relatively constrained industry. Use case like that.
Are we going to get to a point where there will be self driving cars and that will just that we will not basically put humans in the loop, I would argue, probably, yes, I think, in terms of something as complicated as, cyber security where you have. Good guys playing chess against bad guys right it's like a constantly evolving landscape and it's all about deception and it's all about sleight of hand and stuff like that.
The notion that you could kind of create a system that would perfectly figure that out is a Skynet scenario right you're just going to end up faking your way into doomsday scenarios. And so AGI, like, like, generative, basically broad artificial intelligence, like artificial general intelligence, like, like, I think we're very far away from that.
And, but I think we're always, we've always gone and figured out how to take tasks. Narrow tasks that in the past might have required a lot of human intervention and automate them. Right. And we've done this with assembly lines and cars and stuff like that. And again, for elements of what we do in a daily work.
Right. You actually want to use automation and in as much as you can use. AI, either discriminative or generative to handle those problems, you have them handle those problems. But the hypervisor of the system is still the human being right and the orchestrator of the system is still the human
Neal: Oliver that, that was right on par with what I was really hoping you'd say. So I wouldn't get mad. I say that I've had a few other individuals in y'all's space competitors of sorts in past lives, laugh at me when I said that humans still needed to be in the loop and then tell me I'm not hired.
So first off, thank you. For being, I think the most intelligent person I've talked with personally at a security AI company. And secondarily. I've been very appreciative of this conversation as a whole. It went exactly where I think we were hoping it would go relative to the conversation path and good things.
I'm going to throw this over since we're semi short on time, but I do, because of the way the conversation went and everything, I do want to give you a chance just to talk a little bit more specific about Vectra's approach. Hold, like, blatantly, not just around the weeds like we've been, but please go ahead and, throw out a
Oliver: yeah, I think I've touched on a fair bit of it. I don't tend to be the guy who's going to like hard sell and give you basically, our trademarked, monikers for things. It's like, this is the, these are hard problems. They're not easy problems are problems that are impossible to solve perfectly.
I think we oftentimes in the security space. Let, perfection get in the way of good enough. And, initially thing, an interesting way to kind of think about that is like this conversation about false positive. Is that a false positive right and you kind of get into this semantic argument with customers is like, Well, that's a false positive and I'm like, my definition here is can I.
Can I do a reasonable job of surfacing signal to you that you either act on because it is actually something real or you act on because a user did something they shouldn't have done, or you tell me, hey, that wasn't real, but if it happened again, I once again want to know about it. If something like that happened again, and that is to me the litmus test.
The litmus test is, am I showing you stuff that you didn't know that you would want to know? About your environment and giving you a chance to react to it. Yes, if you're running an IPS, and that's looking at billions of flows a day, you worry about kind of false positive responses.
Right. But if on aggregate, I'm going to surface, I don't know, 20 incidents to your potential incidents to you in a week. Like the notion that, yeah, only, you know if eight of them are real, you're, you are batting better than almost anybody else. Right. And so the way you get to better confidence is ultimately not by looking at signals individually.
And this is the thing that you have to kind of get across the customers. Like this one transaction is it good or bad? It's like, oh, well, you didn't have a good rate on it. It's when you aggregate in an automated way. A variety of signals and the holy grail for us is can we get to the point where the entire incident is one alert, all of the tendrils of the things that we've seen we've connected them into a narrative, we've taken that grab bag of things that have happened over some window of time, maybe three hours and maybe three days and said, this all looks related to each other.
And we're drawing a story for you, right? Historically, this has been left to the Sims. It's just like throw alert. Yeah. Throw alert, and then there's this myth that someone writes that magic rule, right, and they thereby connect all those dots and find it. And the practical reality is that's a basically a forensic mindset because the only time at which you actually focus on putting that storyline together is, after the fact when your CEO wants to report about all the bad shit that happened and how it happened.
So for us it's. Can you harvest this signal in real time and connect the dots and surface, these larger, more meaningful contextualized set of things, and that allow you to look at fewer things but bigger things. Right. And now they have more in them for you to kind of react to and go oh yeah that's.
That could be really bad. Right. And so that is really the journey we're on. I mean, we do this for networks. We do this for cloud systems, which have their own kind of big problems like AWS and GCP and Azure. We do this for your cloud identity like Azure AD that you chose to federate that you chose to move outside your firewall and federate everywhere.
We, we, your exchange server that you moved into Microsoft 365 and put all your file servers on the internet, like what could go wrong, right? So we try and basically look at that amalgam, as well as the signals that the EDRs are generating, which is another kind of good source of signal that we don't natively create, but that we integrate.
It's like, can we construct a storyline out of all of that stuff stitched together, rather than assuming, as I said historically for years and years, I think vendors have taken the attitude that, hey, If you just put your back into it, Mr. Customer and deploy this the right way with, you know, 10 perfectly cog, perfectly.
Infallible individuals, then everything will work fine. And to me, the gap between the theoretical effectiveness of security solutions and the actual average effectiveness of them is this mythical human capital that is supposed to be thrown at them, which never happens. And so for us, it's really Focusing on actually given the distracted nature of the of our customers, given the 17 fires that they're fighting at the same time, can we be a force for good without heavy lifting on their part?
Can we surface the right kind of signal and not too many of them for them to actually know what the hell's going on in their environment? So that's the mission that we've set for ourselves across a number of different attack surfaces. Which we think in the modern sprawling enterprise network is that's kind of what you end up with, right?
It's, you don't just end up with everything in your data center and all your end users at the office, right? It's just all over
Neal: Awesome. I'm going to throw it back over to Elliot, Oliver, and let him maybe wrap things up. But I'll tell you, like I said, I appreciate the conversation. This has been a very fun AI journey as a whole. And for those paying attention and everything, there's, there is a path forward with tools like this specific for Zero Trust, obviously.
And I do think part of the discovery phase and beyond is. It's going to help us leaps and bounds with where we're at moving that model forward. And before I really give it back to Elliot, I also agree that it would be nice if chickens could cross the road without their motives being questioned.
And if you're not watching this camera wise, then you're not going to get the joke. So thank y'all.
Oliver: the place.
Elliot: I love it. So this is usually about the time that I land 1 last question that would probably spur off an entire other conversation. But fortunately, Neal actually kind of jumped into exactly where I wanted to, as he typically kind of reads my mind, which works out great, which is, it aligns with that concept of, do we feel like AI is going to be in a position to actually replace people?
So I really appreciate your perspective and I really hope other folks who are in similar shoes can share that perspective. But I will say just to add some context and this is kind of where I will kind of leave it for people to interpret on their own. There are two pieces to this puzzle. So as it stands today, obviously CIOs, CTOs, CISOs, they are obviously under pressure right now to cut budgets, trim down, do more with less and on the second side of that, there are now reports, of course, coming out that AI in the next few years are going to come in and reduce the need for human interaction and all that other.
So, I think the market is trying to spin up that narrative. Do we feel that there's going to be that technology to come in again? Fortunately, you've already come in being directly in the shoes, manning that technology and offering to the market. It doesn't necessarily make sense to do that. Will it replace some jobs?
Probably to an extent to an extent, but will that job be replaced by someone who then has to tune and shape the AI and make sure that it's not hallucinating and that kind of stuff? It's just a shift in a direction. But,
Oliver: Yeah. When calculus came, when calculators came out, right? All the purveyors of advocacy. Basically, it's like, oh, that's a skill. That's no longer a job. Right. And it's like, but there's another job. There are many other jobs. And I think in cybersecurity in particular. I mean, there's a broader societal question like self driving cars as an example of this right it's like if everything went self driving clearly a large number of people who are currently driving for a living would need to find a different job. Different thing. But within cyber security, I think it is incredibly overblown, right?
We have such big problems. We have such shortage of talent. And whatever we do will be parried by the bad guys right because this is a tussle this is not a fixed problem driving cars, this is a hey you up your game they'll up their game. So, anybody. Anybody who thinks they want to work in cyber security, come join us because you will be working for the next century if you live that long, because there will never be a shortage of jobs for people who can help think through problems and outwit adversaries, which is really the problem that we are undertaking.
Neal: Your analogy is a lot better than mine. I go to the Charlie Bucket toothpaste factory one, where he gets replaced by the robot to screw on the caps, but then he comes back and he's the guy who has to fix the robots and he's getting more money. Anyway,
Elliot: actually works well and I mean, Neal, to what you had pointed out before. Yeah, AI could help edit a podcast, but if he also was behind the scenes and the technology there, oh man, it is not that great. I still keep moving back towards Adobe cause it's tried and true. It actually works. I do like the AI tools that are kind of out there for like this kind of stuff.
But exactly as both of y'all are saying, it should just reduce the load on some of the more terrible things so you can spend more time with people like yourselves, having conversations and getting the stuff of more value. And I think that is what I would love people to take away the most out of this AI conversation that comes from it is, it shouldn't be replacing things.
As far as people goes, it's more of just making your
Neal: it's a force multiplier.
Elliot: Yeah, there you go. I
Neal: for the right people, it's a force multiplier for those who somehow magically do get replaced because the burgers are now being flipped by an AI robot, come back and fix the burger flipping machine. Instead, you get opportunities. And Oliver alluded to this at the beginning, and I think this is the final nail in the coffin for this, but he talked about the assembly line early on in this conversation.
And that's a great example of technology, increasing capability and our ability to grow. But at the same time, it put people out of work, but a good chunk of those people either relearned, retired, and went to the Bahamas. Oh, I don't know if they actually did that in the 20s, but it sounds nice, doesn't it?
Or they came back with a better and higher tech skill to come back into the workforce, right? It gives people an opportunity. I don't think it should be perceived as detracting from the marketplace for work, workforce. It should be seen as an opportunity to grow the workforce into more technological capability.
Oliver: Final note. I'll leave you leave you with a colleague of mine said, kid in sixth grade, they were starting to teach them, prompt engineering for chat bots. So,
Elliot: cool. But yeah, so that concludes our episode. Thank you all for joining in. Oliver, thank you for sharing your perspective. I know Neal and I tend to be a little bit cautious going into conversations on the vendor side. It's always easy to kind of shape things in different perspective. It's always great when we don't have to terrorize our guests because they tend to align with a perspective that doesn't get us yelled at by our audience.
So, thank you so much.
Oliver: thanks a lot, Elliot. Thanks a lot, Neal.