How to Enhance Your SIEM with AI

May 7

The Evolution of SIEM Systems

Security Information and Event Management systems, or SIEMs for short, are critical to any cybersecurity organization today. With the amount of data generated by workstations, servers, cybersecurity tools, and other sources, it’s imperative that all of the different parts of your environment log to a centralized location. Otherwise, you end up with a bunch of different tools and systems that all create data in their own formats and within their own silos, making it very difficult to establish correlations between your data. This becomes a problem if your organization ever has a security breach: the incident response process will require you to go into every system involved and manually extract the data needed to determine the root cause of the breach and how best to remediate it. Having all the necessary data ready in your SIEM makes this process that much easier. SIEMs also allow you to see the security posture of your organization, including high-level views of trends in your security data such as currently open investigations, the number of vulnerabilities discovered in your assets, the number of security-related alerts generated over a given timeframe, and more. For team leads, CISOs, and other C-suite executives, knowing these numbers is critical to understanding how well your organization is handling cybersecurity and what needs to be addressed or prioritized for your team.

Innovation in SIEMs has mainly occurred through the various means of detecting malicious activity as well as the visibility they allow into your organization. Cybersecurity threats have become quite complex, and creating alerts for every single threat out there is simply infeasible. Luckily, SIEMs have stepped up to the challenge by providing out-of-the-box alerts that can immediately be enabled, assuming you’re collecting the correct data for them. These alerts can be as simple as detecting lateral movement between Windows systems, or as complex as detecting cryptocurrency miners that are actively working and hogging resources in your systems. If the built-in alerts aren’t enough, SIEMs make it simple enough to create alerts for whatever is important for your cybersecurity team or organization. In terms of visibility, SIEMs allow you to ingest all kinds of data, regardless of its format. On top of this, most SIEMs provide their own apps and integrations that directly supplement the data you’re collecting, such as a dedicated app for firewall data where all the data is neatly presented with matching visualizations. SIEMs also provide the ability to create your own apps that allow you to present your security data however you see fit. In short, the only things that limit the innovation of SIEMs are imagination and technical ability.

While they are very effective, SIEMs are not perfect. The utility that cybersecurity teams get out of a SIEM is directly correlated with the amount of data they feed it. Simply put, the more data you provide, the more visibility you gain and the more you can do. However, ingesting this data comes at a price, either in money or in time. Depending on your SIEM, you can be charged based on the amount of data you keep in it, the number of searches you perform on your data, or some other measure. The costs can quickly add up if you’re in a large organization. From a time perspective, it will most likely require a dedicated person or team to maintain SIEM operations (e.g. making sure that any security data feeds to the SIEM are functional, addressing any alerts produced by the SIEM, performing upgrades on the SIEM, etc.). You also have to invest in learning the search language that your SIEM provides, which can become complex when you need more sophisticated queries against the collected data. Even assuming that your SIEM is perfect and your costs are manageable, you still have the issue that most of the security data you collect remains dormant until there’s an active need for it, such as an incident response activity. Fortunately, with the rise of artificial intelligence, we are quickly approaching the next frontier of the SIEM, one that has the potential to address all of these issues and more: the AI SIEM.

What is an AI SIEM?

As you can probably already deduce, an AI SIEM is any SIEM that is enhanced by artificial intelligence in some form. It’s no secret that AI is making waves through all industries, and cybersecurity is no exception. From a practical perspective, an AI SIEM would be able to handle the following use cases, each of which would provide immediate benefit to security teams:

Proactive threat hunting and anomaly analysis for your collected security data sources
The detection of security threats and breaches not yet known by threat intelligence
Anomaly detection for SIEM data sources to determine when there are data issues or outages
The translation of a human question to a search query understandable by the SIEM

Proactive threat hunting and anomaly analysis will likely be the first benefit that an AI SIEM delivers, and it naturally leads into the detection of security threats not yet known by threat intelligence. The same anomaly detection technology can be used to detect anomalies in SIEM data ingestion, such as when data sources are no longer sending as expected. The last use case, translating human questions into SIEM search queries, is an area that’s actively being worked on, and it will most likely reach most mainstream SIEMs within a couple of years, possibly sooner. We will focus on the first two benefits, as they are possible with the technology that we have today.
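To make the data ingestion case concrete, here is a minimal sketch of one way to flag a data source that has stopped sending as expected. It is not taken from any particular SIEM: the function names, the hourly-count input format, and the 3.5 threshold are all assumptions for illustration, and the technique (a robust z-score built on the median and the median absolute deviation) is just one simple option among many.

```python
from statistics import median

def robust_z(history, latest):
    """Robust z-score of `latest` against `history` (median and MAD based)."""
    med = median(history)
    mad = median(abs(x - med) for x in history)
    if mad == 0:
        # Perfectly flat history: any change at all is treated as extreme.
        if latest == med:
            return 0.0
        return float("inf") if latest > med else float("-inf")
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return 0.6745 * (latest - med) / mad

def find_ingestion_anomalies(hourly_counts, latest_counts, threshold=3.5):
    """Return {source: score} for sources whose latest count is far from baseline.

    hourly_counts: {source_name: [past event counts per hour]}  (assumed format)
    latest_counts: {source_name: event count for the most recent hour}
    A source missing from latest_counts is treated as having sent 0 events.
    """
    anomalies = {}
    for source, history in hourly_counts.items():
        score = robust_z(history, latest_counts.get(source, 0))
        if abs(score) >= threshold:
            anomalies[source] = score
    return anomalies
```

In practice the hourly counts would come from a scheduled search against the SIEM’s own ingestion metadata. A strongly negative score suggests a feed has gone quiet, while a strongly positive one suggests a sudden flood of events; either can be worth an alert.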

Why Integrate AI with Your SIEM?

While SIEM vendors are most likely in the process of adding artificial intelligence to their products, it’s better to take advantage of the AI capabilities that we have today and implement them yourself. The incentives are the following:

You gain control over how AI learns your data
You’re not limited to vendor provided solutions
You gain a better understanding of AI and how it can work in your SIEM
You avoid vendor lock-in, just in case you decide to switch SIEMs in the future

The biggest incentive to implement an AI SIEM yourself is the control you gain over your data. You can train the AI on whatever data you desire in your SIEM, in exactly the manner you want to train it, without interference from outside sources. For example, most organizations have sensitive users or systems that could be considered the “crown jewels”. These are the users or systems that, if compromised, can wreak havoc within your organization. This can be a sensitive database, a set of highly privileged users, or a combination of these. For some older or larger companies, the crown jewels may not be the most state-of-the-art systems, but rather the end-of-life workhorse systems that run mission-critical software that hasn’t been ported to modern operating systems. Regardless of the situation, most if not all companies have sensitive users or systems that deserve higher levels of protection than other systems in the environment. With AI, you can establish a “normal” for these sensitive users and systems based on their logs held within your SIEM, and you can alert when they start performing activity that deviates from that established normal. This provides much higher-fidelity monitoring: you can detect anomalies involving your crown jewels that may be associated with threats, and create alerts within your SIEM based on those anomalies. And because you control exactly which AI algorithms are used to learn your data, you can focus on whatever parts of the data are most important for your sensitive users or systems. This type of capability goes beyond what most SIEMs provide out of the box at the moment, and implementing something like this can put your AI-powered SIEM head and shoulders above what normal SIEMs can provide.
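As a rough illustration of what establishing a “normal” for a sensitive account could look like, the sketch below learns which source hosts and hours of day appear in an account’s historical logons, then lists the ways a new event deviates from that profile. This is deliberately far simpler than a trained machine learning model, and everything in it is an assumption made for the example: the event field names (user, src_host, time), the ISO 8601 timestamp format, and the choice of features.

```python
from collections import defaultdict
from datetime import datetime

def build_baseline(events):
    """Learn, per user, the source hosts and hours of day seen in past logons."""
    baseline = defaultdict(lambda: {"hosts": set(), "hours": set()})
    for event in events:
        profile = baseline[event["user"]]
        profile["hosts"].add(event["src_host"])
        profile["hours"].add(datetime.fromisoformat(event["time"]).hour)
    return dict(baseline)

def deviations(event, baseline):
    """Return the reasons (possibly none) that an event departs from baseline."""
    profile = baseline.get(event["user"])
    if profile is None:
        return ["no baseline for user"]
    reasons = []
    if event["src_host"] not in profile["hosts"]:
        reasons.append("new source host")
    if datetime.fromisoformat(event["time"]).hour not in profile["hours"]:
        reasons.append("unusual hour")
    return reasons
```

A set-membership profile like this only works for accounts whose behavior is very regular, such as service accounts. For human users, a statistical or learned model that tolerates some variation would usually be a better fit, which is where the choice of algorithm starts to matter.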

The second incentive is that you’re not locked in to vendor-provided solutions. SIEM providers do allow some sort of AI integration with their SIEMs, but a lot of them force you to conform to their provided algorithms and methodologies. While convenient, the provided AI algorithms may not meet your use cases. Any SIEM system should allow you to customize it to fit your use cases, and that includes how you choose to implement your AI-based SIEM.

The first two incentives naturally lead into the third incentive of gaining a better understanding of AI and how you can integrate your SIEM with AI. For better or worse, AI isn’t going anywhere anytime soon, and getting a better understanding of how it works will most likely be a requirement in the cybersecurity realm in the future.

Finally, the last incentive protects you from any changes that you may need in your cybersecurity processes in the future. Vendor lock-in is a serious issue for most cybersecurity departments, and being able to implement AI in your SIEM platform regardless of the vendor will help your team’s future flexibility.

AI SIEM Implementation Strategies

Some SIEM systems natively provide data science and deep learning capabilities. For example, Splunk provides the Splunk App for Data Science and Deep Learning (DSDL), which integrates machine learning and deep learning libraries like scikit-learn, TensorFlow, and PyTorch. When possible, it’s best to utilize native applications, as they work directly within the SIEM platform. For SIEMs that don’t provide this capability, you would follow this high-level blueprint:

Create your model based off data collected from your SIEM
Save the model either on a dedicated server in your environment or within your cloud environment
Create a program or script that periodically collects new data from your SIEM and feeds it to the model for it to perform inference
Take the results of the model’s inference and ingest them back into your SIEM

These steps are very high level, and their implementation is completely environment dependent. The major cloud providers (Azure, AWS, GCP) all provide the ability to host trained models for a price. If you decide against cloud hosting your model, you even have the option of keeping your model offline; the only consideration is making sure that the script that retrieves new data from your SIEM can reach your trained model to perform inference. As you can see, you have a lot of flexibility in implementing AI in your SIEM. In the future, more SIEM vendors will most likely provide the means to directly host models within the SIEM itself, as Splunk does today.
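The four blueprint steps can be sketched in a few dozen lines of Python. Treat this as a skeleton under stated assumptions rather than a real integration: the “model” is just a mean and standard deviation saved as JSON, the bytes_out field is a made-up example feature, and fetch_events and send_to_siem are hypothetical callables that you would replace with your own SIEM’s search and ingestion mechanisms.

```python
import json
from statistics import mean, pstdev

def train_model(values):
    """Step 1: 'train' a trivial model, the mean and std dev of one numeric field."""
    return {"mean": mean(values), "std": pstdev(values)}

def save_model(model, path):
    """Step 2: persist the model (here a JSON file; could be a server or cloud store)."""
    with open(path, "w") as f:
        json.dump(model, f)

def load_model(path):
    with open(path) as f:
        return json.load(f)

def score(model, value):
    """Distance of a value from the training mean, in standard deviations."""
    if model["std"] == 0:
        return 0.0  # flat training data: this toy model cannot rank anything
    return abs(value - model["mean"]) / model["std"]

def run_inference_cycle(model_path, fetch_events, send_to_siem, threshold=3.0):
    """Steps 3 and 4: score new events and send anomalous ones back to the SIEM.

    fetch_events: hypothetical callable returning an iterable of event dicts
    send_to_siem: hypothetical callable that ingests one result dict
    Returns the number of results sent.
    """
    model = load_model(model_path)
    sent = 0
    for event in fetch_events():
        s = score(model, event["bytes_out"])
        if s >= threshold:
            # Tag the result so it can be searched and alerted on in the SIEM.
            send_to_siem({**event, "anomaly_score": round(s, 2),
                          "detector": "zscore_sketch"})
            sent += 1
    return sent
```

A scheduler such as cron would call run_inference_cycle periodically. Passing the fetch and send functions in as arguments keeps the model logic independent of any one SIEM, which is the same property that protects you from vendor lock-in.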

How to Start Enhancing Your SIEM with AI

AI is understandably an intimidating new frontier for cybersecurity. While it is new, there hasn’t been a better time to start experimenting with an AI-powered SIEM for your cybersecurity use cases. If you would like to start enhancing your SIEM with AI, contact QFunction to see how we can help you explore AI in your cybersecurity practice. If you’re interested in getting consultation for your SIEM, check out our QFunction SIEM Services. And if you’re interested in an example of how AI can be used directly within your SIEM, check out this blog post on how the Splunk DSDL app was used to determine malicious behavior on domain controllers!

Ryan Smith