Detection as Code: A Maturity Framework

Detection as Code (DaC) is transforming how we handle threat detection by merging software engineering practices with security operations. Breaking down DaC into functional areas can enhance your organization's detection capabilities and maturity.

Daniel Wyleczuk-Stern


Introduction

In the ever-evolving landscape of cybersecurity, the methodologies and tools we rely on are continually advancing. One of the latest trends is Detection as Code (DaC), which promises to revolutionize threat detection by integrating software engineering practices into security operations. Having worked in an organization that practices DaC for several years, I’ve come to realize that the real question isn’t simply whether or not to adopt DaC.

If we want to adopt DaC, where do we start? How do we know if we’re doing a good job at it? How do we convince leadership that it’s an area worth investing in? At Snowflake, we had been executing on aspects of DaC since long before I arrived. However, we quickly learned that writing our detections in an IDE wasn’t sufficient to achieve the operational results we wanted. Like most principles in cybersecurity, DaC can be broken down into multiple functional areas, each with its own levels of maturity. I’ll share my insights into these functional areas, the various maturity levels within them, and how to improve your competency in each area.

Understanding Detection as Code

Detection as Code (DaC) leverages automation and version control to manage, deploy, and maintain detection rules, policies, and configurations. By treating detection rules as code, DaC enables rigorous testing, continuous integration/continuous deployment (CI/CD), and monitoring practices to enhance the reliability and efficiency of threat detection. A key aspect of DaC isn’t just writing detections in YAML or Python, but also applying the same development and deployment rigor as software engineers.

Functional Areas and Maturity Levels

DaC can be divided into four functional areas: programming language, testing, integration & automation, and monitoring. Each area has four maturity levels: ad hoc, structured, automated, and optimized. Let’s explore each area in detail.

Programming Language

The choice of language for defining detections significantly impacts capabilities. Languages vary in flexibility and tools for translating ideas into code. Many teams start with markup languages, such as YAML. YAML has almost no barrier to entry and is extremely human readable.

Example Sigma rule written in YAML (Image Source: https://github.com/SigmaHQ/sigma/blob/master/rules/cloud/aws/cloudtrail/aws_cloudtrail_disable_logging.yml)

The Sigma Project is a well-known DaC project focused on YAML. However, expressing complex logic in YAML can be challenging.

A step up from YAML is query languages such as SQL or SPL. Query languages are specifically designed for analyzing large amounts of data in databases or other data stores (such as wherever you keep your logs). The SIEM I’m most familiar with that is purely SQL-based is SnowAlert (a now mostly deprecated project).

General-purpose languages like Python offer the most flexibility in translating detection logic to code. Many modern detection engineering teams are moving towards Python or Go for interfacing with their data due to their flexible structures and rich libraries.
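As a rough illustration of what this looks like in practice, here is a minimal sketch of a detection written as a plain Python function. The CloudTrail field names and the rule/title entry points loosely mirror platforms like Panther, but everything here is illustrative rather than any specific product’s API.

```python
# Minimal sketch of a detection written as a plain Python function.
# The CloudTrail field names and the rule()/title() entry points are
# illustrative, not a specific platform's API.

LOGGING_TAMPERING_EVENTS = {"StopLogging", "DeleteTrail", "UpdateTrail"}


def rule(event: dict) -> bool:
    """Return True when a CloudTrail event disables or tampers with logging."""
    return (
        event.get("eventSource") == "cloudtrail.amazonaws.com"
        and event.get("eventName") in LOGGING_TAMPERING_EVENTS
    )


def title(event: dict) -> str:
    """Build a human-readable alert title from the triggering event."""
    actor = event.get("userIdentity", {}).get("arn", "unknown actor")
    return f"CloudTrail tampering: {event.get('eventName')} by {actor}"
```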

An optimized team uses the best tool for the job, seamlessly integrating markup, query, and general-purpose languages.

Maturity Levels

  • Ad hoc: Markup language
  • Structured: Query language
  • Automated: General-purpose language
  • Optimized: Any combination of the above/best tool for the job

Testing

Testing your detections is a critical and sadly often neglected area. I’ve broken down testing into five different areas (unit testing, linting, performance testing, integration testing, and end-to-end testing), which can seem even more daunting. You may say to yourself, “that’s such a huge task, why even bother?” It’s important to realize that significant improvements can be achieved with just a modicum of testing.

Unit Testing

Unit testing focuses on ensuring your detection logic is correct by passing sample input to it and verifying that the output matches what you expect.

Unit Testing 1

Unit Testing 2

In the above example based on Panther, we pass an example input into our rule along with an expected output.
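For illustration, here is roughly what such a test looks like outside of any particular platform’s harness, written as a plain pytest-style test against the hypothetical rule() function sketched earlier; the sample events are hand-written, not real log data.

```python
# Sketch of unit tests for a detection rule, runnable with pytest.
# The imported module name is hypothetical; the events are hand-crafted samples.
from cloudtrail_tampering import rule  # hypothetical module containing rule()


def test_stop_logging_fires():
    event = {"eventSource": "cloudtrail.amazonaws.com", "eventName": "StopLogging"}
    assert rule(event) is True


def test_unrelated_event_does_not_fire():
    event = {"eventSource": "ec2.amazonaws.com", "eventName": "RunInstances"}
    assert rule(event) is False
```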

Linting

Linting is basic static analysis for code quality. Linting can speed up code reviews and ensure highly standardized code even with multiple engineers contributing.

Linting 1

Linting 2

Image Source: https://plainenglish.io/blog/how-to-lint-properly-your-sql-code-5ae21e6f9f5c

Performance Testing

Performance testing runs your queries or detections and ensures they complete in a timely and cost-efficient way.
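Performance testing can be as heavyweight as replaying against full production datasets, or as light as the sketch below, which times a rule against a batch of synthetic events and fails if it exceeds a budget; the batch size and threshold are arbitrary assumptions, as is the module name.

```python
# Sketch of a lightweight performance test: time the rule over synthetic
# events and fail if it exceeds a (deliberately arbitrary) budget.
import time

from cloudtrail_tampering import rule  # hypothetical module containing rule()


def test_rule_stays_within_time_budget():
    events = [
        {"eventSource": "cloudtrail.amazonaws.com", "eventName": "StopLogging"}
    ] * 100_000
    start = time.perf_counter()
    for event in events:
        rule(event)
    elapsed = time.perf_counter() - start
    assert elapsed < 2.0, f"rule() took {elapsed:.2f}s for 100k events"
```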

Integration Testing

Integration testing ensures that the disparate systems which interface with each other do so as expected. This is most common when you have other components in your detection platform, such as a SOAR.

End-to-End Testing

End-to-end testing occurs when you fully test your system from data generation all the way to case management and potentially closure.

End-to-End

Image Source: https://github.com/WithSecureLabs/leonidas

Testing Priority and Difficulty

Not all testing methodologies are created equal. Some tests are hard to set up or may not be a high priority. For your organization, you should decide on the areas where you’re seeing the most operational failures and work to set up testing for those areas. Note that the difficulty ratings below may vary depending on your tech stack and organizational structure.

Test Type    | Priority | Difficulty
Unit Testing | High     | Low
Linting      | Low      | Low
Performance  | Medium   | Medium
Integration  | Medium   | Medium
End-to-End   | Low      | High

Maturity Levels

  • Ad hoc: No or minimal testing
  • Structured: Some testing in all areas or strong testing in some areas but missing in others
  • Automated: Some testing in all areas with strong capabilities in a few categories. A sandbox/dev environment exists and is used
  • Optimized: Appropriate testing for all components with code coverage measurements. Tests required for new detections, with appropriate mechanisms in place (i.e. robust criteria and automated expiration) for break-glass scenarios

Automation and Integration

Developing appropriate automation and integrations is critical to improving analyst response time and decreasing the likelihood of errors. Any time your SOC analyst has to context switch, go into another system for information, or look something up, it increases time to respond and adds unnecessary friction. I firmly believe that as detection engineers, our job is to give as much information to the SOC as possible so they can do what they do best: analysis and response. Integrations empower your analysts and therefore should be a top priority for any robust detection team.

Suppression

One of the most common detections is “impossible travel”. When an analyst gets this alert, the first step in their playbook is usually to reach out to the user to confirm that it was in fact them logging into a system. We can use automation to suppress this alert, though.

Was This You?

Image Source: https://www.tines.com/blog/chatbots-for-security-and-it-teams-part-3-creating-a-slack-chatbot/

If the user clicks “I recognize this activity” then the alert is closed without further (or any) action needed from the analyst.
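The orchestration details differ across SOAR and chatops tools, so the sketch below keeps them as injected callables; only the control flow matters here (prompt the user, auto-close on confirmation, escalate otherwise), and all the names and fields are hypothetical.

```python
# Sketch of an "impossible travel" suppression flow. The prompt, close, and
# escalate callables stand in for whatever your SOAR/chatops tooling provides.
from typing import Callable


def handle_impossible_travel(
    alert: dict,
    prompt_user: Callable[[str, str], str],
    close_alert: Callable[[str, str], None],
    escalate: Callable[[str], None],
) -> None:
    """Ask the user to confirm the login and auto-close the alert if they do."""
    answer = prompt_user(
        alert["user_email"],
        f"We saw a login from {alert['new_location']}. Was this you?",
    )
    if answer == "I recognize this activity":
        close_alert(alert["id"], "user confirmed travel")
    else:
        escalate(alert["id"])
```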

Enrichment

Sometimes the data you need doesn’t exist in your logs because it’s too expensive or otherwise infeasible to ingest. In this case, you’ll want integrations that allow you to query real-time data in order to enrich an alert. Looking at the example where one can escalate privileges through EC2 user data, we can see that we can detect this activity by watching CloudTrail for the “ModifyInstanceAttribute” event. However, the actual user data is removed from the log because it’s very common for secrets to be in this data.

ModifyInstanceAttribute

So an analyst’s first step would be to run a command to pull this data.

Manual Command

However, rather than rely on an analyst to have all the necessary permissions and access, the better solution is to fold these capabilities into your detection platform, automatically run this command when the alert fires, and enrich the alert accordingly. Thus, you have a faster and more secure system that allows your responders to do their analysis in a timely and efficient way.
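As a sketch of what that enrichment step might look like, assuming the detection platform hands your code the CloudTrail event and the execution role has ec2:DescribeInstanceAttribute (verify the requestParameters field names against your own events):

```python
# Sketch of an enrichment step: when the ModifyInstanceAttribute alert fires,
# pull the instance's current user data and attach it to the alert.
import base64

import boto3


def enrich_with_user_data(event: dict) -> str:
    """Fetch and decode the EC2 user data referenced by a CloudTrail event."""
    instance_id = event["requestParameters"]["instanceId"]
    ec2 = boto3.client("ec2", region_name=event.get("awsRegion", "us-east-1"))
    resp = ec2.describe_instance_attribute(InstanceId=instance_id, Attribute="userData")
    encoded = resp.get("UserData", {}).get("Value", "")
    return base64.b64decode(encoded).decode("utf-8", errors="replace") if encoded else ""
```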

Generation

This may be the most common integration that detection engineers will support in their environment. Let’s say you use a Cloud Detection and Response tool. While a basic implementation may involve directly sending alerts to your case management system, a mature DaC team will ingest those alerts into their SIEM just like any other data. From there, alerts can be normalized, suppressed, and enriched as appropriate.
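A sketch of that normalization step might look like the following; both the vendor’s field names and the internal schema here are invented for illustration.

```python
# Sketch of normalizing a third-party (e.g. CDR) alert into the same shape
# as native detections before suppression and enrichment. Field names on
# both sides are invented for illustration.

def normalize_cdr_alert(raw: dict) -> dict:
    return {
        "title": raw.get("rule_name", "Unknown CDR alert"),
        "severity": str(raw.get("severity", "medium")).lower(),
        "entity": raw.get("resource_id"),
        "event_time": raw.get("detected_at"),
        "source": "cdr",
        "raw": raw,  # keep the original payload for the analyst
    }
```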

Maturity Levels

  • Ad hoc: No or minimal integrations
  • Structured: Some integrations set up, but setting up new integrations is high effort. Missing tooling and standards
  • Automated: Process and tooling exist for developing new integrations; can automate suppression or enrichment for key detections
  • Optimized: Integrations for suppression and enrichment. Can easily incorporate 3rd party alerts. Strong standards and tooling for deployment of new integrations

Monitoring

As our systems grow in complexity, it becomes increasingly important to monitor their inputs, activity, and outputs. Identifying and resolving failures or errors in your system as quickly as possible is the goal. I generally break down monitoring into 3 parts: inputs, executions, and outputs. Are you getting good data? Are you monitoring the performance or run time failures in your system? Are you generating useful alerts?

Inputs

Data quality can be (and has been many times from other writers) its own blog post. For the purpose of detection engineering, I focus on five areas of data quality:

  • Format: ensuring your data is well structured and parsable (please stop using syslog)
  • Volume: monitoring for spikes or drops in data volume (a simple sketch of this check follows the list)
  • Completeness: ensuring you’re collecting data from all systems or accounts
  • Timing: ensuring that data reaches your detection engine soon after the event occurs
  • Data contracts: your ability to work with system owners to ensure they improve their system in a timely manner if any area of data quality drops
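As a sketch of the volume check mentioned above, one simple approach is to compare the most recent count for a log source against a trailing baseline; the window and threshold below are arbitrary and would be tuned per source.

```python
# Sketch of a simple volume check: flag a log source whose latest daily count
# deviates sharply from its trailing seven-day average. The window and the
# 50% threshold are arbitrary assumptions.

def volume_anomaly(daily_counts: list[int], threshold: float = 0.5) -> bool:
    """Return True if the most recent day deviates more than `threshold` from baseline."""
    if len(daily_counts) < 8:
        return False  # not enough history to judge
    *history, latest = daily_counts[-8:]
    baseline = sum(history) / len(history)
    if baseline == 0:
        return latest > 0
    return abs(latest - baseline) / baseline > threshold
```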

Execution

Monitoring the runtime of your system can further be broken down into two main categories.

Performance/Cost

Are you watching for spikes in cost or drops in performance? Do you find out about an issue only when you get the bill, or have you set up monitoring to rapidly alert on cost spikes?

Failure Handling

This has large overlap with business continuity and disaster recovery and will make your compliance team happy. When a rule fails, does your system identify this and re-run it automatically? Do you get alerted when it repeatedly happens so you can take action? Or do things silently fail? If you wrote all these great detections but something is silently failing, then you’re not executing an effective detection engineering system.

Outputs

This is the area of monitoring that most security operations teams are familiar with and start off with: labeling your alerts as true or false positives and working to reduce the false positives. As you mature, you can grow in this area with improved labeling or by proactively suppressing alerts that start spiking and overwhelming your triage team. As an example, at Snowflake, we’ll quarantine a particular detector and create a ticket for our detection engineers if it has too many false positive closure statuses within a particular time window.
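A sketch of that kind of quarantine logic is below; the closure records and the disable/ticketing callables are placeholders for your own case management and workflow tooling, and the threshold is an arbitrary assumption.

```python
# Sketch of output monitoring: find detectors with too many false-positive
# closures in a window and quarantine them. Closure records and the
# disable/ticket callables are placeholders for your own tooling.
from collections import Counter
from typing import Callable, Iterable

FP_THRESHOLD = 10  # arbitrary threshold for illustration


def find_noisy_detectors(closures: Iterable[dict]) -> list[str]:
    """closures: records like {"detector": "impossible_travel", "status": "false_positive"}"""
    fp_counts = Counter(
        c["detector"] for c in closures if c["status"] == "false_positive"
    )
    return [name for name, count in fp_counts.items() if count >= FP_THRESHOLD]


def quarantine(
    detectors: list[str],
    disable: Callable[[str], None],
    open_ticket: Callable[[str], None],
) -> None:
    for detector in detectors:
        disable(detector)      # pause the noisy detector
        open_ticket(detector)  # ask detection engineering to tune it
```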

Maturity Levels

  • Ad hoc: Find out about missing data during forensics, get told about detection issues from the SOC
  • Structured: Collecting output metrics, basic dashboards and visualizations
  • Automated: Notified about failures, established SLAs and SLOs
  • Optimized: Automated suppressions, data/performance contracts, automatic recovery capabilities

Maturity Matrix

  • Ad Hoc
    Language: Markup language
    Testing: No or minimal testing
    Integration: No or minimal integrations
    Monitoring: Find out about missing data during forensics, get told about detection issues from the SOC
  • Structured
    Language: Query language
    Testing: Some testing in all areas or strong testing in some areas but missing in others
    Integration: Some integrations set up, but setting up new integrations is high effort. Missing tooling and standards
    Monitoring: Collecting output metrics, basic dashboards and visualizations
  • Automated
    Language: General-purpose language
    Testing: Some testing in all areas with strong capabilities in a few categories. A sandbox/dev environment exists and is used
    Integration: Process and tooling exist for developing new integrations; can automate suppression or enrichment for key detections
    Monitoring: Notified about failures, established SLAs and SLOs
  • Optimized
    Language: Any combination of the above/best tool for the job
    Testing: Appropriate testing for all components with code coverage measurements. Tests required for new detections, with appropriate mechanisms in place (i.e. robust criteria and automated expiration) for break-glass scenarios
    Integration: Integrations for suppression and enrichment. Can easily incorporate 3rd party alerts. Strong standards and tooling for deployment of new integrations
    Monitoring: Automated suppressions, data/performance contracts, automatic recovery capabilities

Setting Goals

You may read through this blog post and wonder how you’ll ever reach optimized in all categories. It’s my belief that this shouldn’t be the goal for all teams. Security is ultimately a function of risk management, and you shouldn’t overinvest relative to how much risk you mitigate. Similarly, when adopting DevOps principles for your detections, you may end up over-engineering when resources could have been better invested in other areas. As an example, if you’re a smaller shop where your only detection engineer also runs vulnerability management and incident response, then shooting for optimized across the board may not be realistic. In that case, it may be more important that you have strong integrations and automation to scale your engineer. Meanwhile, a shop that has a ton of legacy components may want to invest in monitoring to ensure all areas are working as expected. Reason through your particular situation and where your pain points are to decide where on the matrix you want to end up before setting unreachable goals.

Conclusion

Detection as Code (DaC) represents a significant advancement in cybersecurity, bringing software engineering principles into threat detection processes. By breaking DaC down into functional areas (programming language, testing, integration & automation, and monitoring), we can assess our current maturity and pinpoint areas for improvement.

Each functional area offers a pathway from basic, ad hoc practices to optimized, comprehensive systems that enhance efficiency, reliability, and response times. Implementing DaC requires a commitment to continuous improvement, leveraging the best tools and practices available.

As cybersecurity threats continue to evolve, adopting DaC can provide a robust framework for staying ahead. By treating detection rules as code, integrating rigorous testing, and ensuring seamless automation and monitoring, we can significantly enhance our security posture.

Whether your team is just starting out with DaC or looking to optimize an existing system, focusing on these key areas will help you build a more resilient and responsive security operation. The journey to maturity in DaC is incremental, but the benefits—reduced response times, fewer errors, and more effective threat detection—are well worth the investment.

(Note, this is a blog post version of a talk I gave a while ago with some additional updates. You can find the original slides here.)
