The Glaring Problems of SIEMs and Current UEBA Solutions

Apr 4

We as cybersecurity professionals depend on our tools to perform our jobs. Whether the tools have to do with endpoint protection, firewall protection, vulnerability scanning, network access control, or some other kind of cybersecurity tool, we utilize them in order to protect our business environments. Because these tools are made by different companies, serve different purposes, and all produce their own kinds of data, we attempt to unite them by centralizing all these data sources into SIEMs in order to see correlations between these disparate data sets.

There’s a huge problem with this, however: there’s simply too much data. With all the competing priorities on our to-do lists, we only have so much time in any given day to make use of what our SIEM is collecting. Most professional cybersecurity organizations preach being proactive in security, but cybersecurity teams rarely have the time, money, or manpower to do so. What makes this worse is that most SIEMs hold large troves of useful data, but that data tends to be used only when an incident response activity is in progress. The rest of the time, the data just sits in the SIEM with nothing making good use of it.

Now let’s talk about the elephant in the room: user and entity behavior analytics (UEBA) solutions. Granted, current UEBA solutions can use the large data sets in your SIEM (or other data stores) to establish a “normal” for users and systems within an organization by utilizing artificial intelligence. From there, the UEBA solutions flag anomalies as potential threats, and those threats are your responsibility to validate. Unfortunately, in their current state, these UEBA solutions provide varying levels of value to cybersecurity professionals. Interestingly enough, I believe the problem lies in the fact that UEBA solutions use the vast amounts of data in your environment to learn the “normal” for your users and systems.

UEBA solutions incorporate unsupervised learning algorithms to learn about your users and systems. In layman’s terms, unsupervised learning is a type of machine learning that learns from data without human supervision. In essence, you give data to an AI model so it can learn properties of the data. This kind of learning is useful for detecting anomalies, as the AI learns what’s normal for the data and can tell you when new data points are anomalous.

The biggest upside of current UEBA solutions is that you can train the model with as much data as you want, so naturally, you’re going to want to feed it all of that delicious data you’ve been collecting in your environment. The biggest downside is that UEBA solutions that attempt to learn everything about your environment tend to produce less useful results as the amount of training data grows. Because business environments are diverse, with users and entities performing all kinds of different tasks, the AI has to learn what’s normal for each of them. What makes this challenging is that what’s normal for one user or system may not be normal for another. The AI model therefore has to learn a little bit of everything, which doesn’t bode well for determining a normal for any one user when technically everything can be normal in some form. As a result, you may get figures like the following in the main dashboard of the UEBA solution:

1,000,000 anomalies

500,000 threats

For my cybersecurity professionals who work in larger environments that utilize some form of a UEBA (or machine learning) solution, I’m certain you’ve seen numbers like these. I’m also certain that you have not triaged all of these.
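To make the point about "what’s normal for one user may not be normal for another" concrete, here is a minimal sketch in Python. It is a toy z-score check, not the unsupervised algorithm any particular UEBA product uses, and the user names, login counts, and threshold are all made-up assumptions for illustration. It compares a single baseline pooled across every user with a baseline kept per user.

```python
from statistics import mean, stdev

# Hypothetical daily login counts per user (illustrative data, not from a real SIEM).
history = {
    "analyst": [4, 5, 6, 5, 4, 6, 5],
    "service_account": [480, 500, 520, 510, 490, 505, 495],
}

def z_score(value, baseline):
    """Number of sample standard deviations `value` sits from the baseline mean."""
    return (value - mean(baseline)) / stdev(baseline)

def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value whose absolute z-score exceeds the (assumed) threshold."""
    return abs(z_score(value, baseline)) > threshold

# One global baseline that pools every user's history together.
global_baseline = [count for counts in history.values() for count in counts]

# New observation: the analyst suddenly logs in 60 times in one day.
new_value = 60

global_flag = is_anomalous(new_value, global_baseline)
per_user_flag = is_anomalous(new_value, history["analyst"])

print("flagged by pooled baseline:  ", global_flag)
print("flagged by per-user baseline:", per_user_flag)
```

In this toy data the pooled baseline stretches from single digits to several hundred, so 60 logins looks unremarkable, while the analyst’s own baseline makes the same value stand out. Real models are far more sophisticated, but the same tension between breadth and sensitivity applies.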

I realize that my argument is broad and that UEBA solutions differ from each other under the hood: some use multiple models to better represent aspects of users and systems, some use unsupervised deep learning algorithms, and some let users tune their implementations to a certain extent in order to lower the number of anomalies and threats. However, even if we assume a perfect UEBA solution, you still have the following issues:

An entirely new interface to learn on top of all the other dashboards that cybersecurity professionals already have to monitor
The fact that by design, you can’t tell your UEBA specifically what to learn, especially when people or systems change
The knowledge of system administration that is required in order to even implement a UEBA solution in the first place
The constant upkeep that UEBA solutions require, which means that a subject matter expert will have to be on standby for it
Not knowing in detail why something is anomalous or how anomalous it is
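As a rough picture of the context the last point is asking for, the sketch below ranks the features of one observation by how far each deviates from that user’s own baseline. It is only an illustration under assumed data: the feature names, values, and the `explain` helper are hypothetical, and this is not how any specific UEBA product reports its findings.

```python
from statistics import mean, stdev

# Hypothetical per-user baseline of daily activity (illustrative values only).
baseline = {
    "logins": [5, 6, 4, 5, 6, 4, 5],
    "mb_uploaded": [20, 25, 22, 18, 24, 21, 24],
    "hosts_accessed": [3, 2, 3, 4, 3, 2, 4],
}

def explain(observation, baseline):
    """Return (feature, z-score) pairs, most deviant feature first."""
    scores = {
        feature: (observation[feature] - mean(values)) / stdev(values)
        for feature, values in baseline.items()
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

# Today's activity for the same user: upload volume is far outside the baseline.
today = {"logins": 5, "mb_uploaded": 400, "hosts_accessed": 4}

ranked = explain(today, baseline)
top_feature, top_score = ranked[0]

for feature, score in ranked:
    print(f"{feature}: {score:+.1f} standard deviations from this user's normal")
```

An output like this answers both questions at once: which behavior drove the alert (the "why") and by how much it deviated (the "how anomalous"), which is exactly what a bare anomaly count leaves out.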

At a time when the use of AI is on the rise and cybersecurity teams are expected to do more with less, there has to be a middle ground where cybersecurity teams can make better use of the vast amounts of data in their SIEM while using AI to gain better context for that data.

Contact QFunction today to see how we implement a customized AI solution directly into your SIEM!


Ryan Smith
