IOC hunting at scale
As the holiday season approaches and our schedules hopefully begin to open up, many of us find ourselves with a bit more time on our hands. This time could be perfectly spent delving into some hunting activities. And if you’re into hunting threats and sifting through vast amounts of data, the KQL External Data operator might be the holiday gift for you!
This powerful capability enables you to seamlessly incorporate external data into your KQL queries, such as GitHub IOC lists or MISP Feeds. This data can be dynamically loaded in your KQL query to hunt for matches across all your devices.
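In this blog, we share ready-to-use hunting queries for suspicious named pipes, connections to Tor nodes, and CISA Known Exploited Vulnerabilities, along with pointers to MISP-based hunting queries.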
Theory
Before diving into the fun part of the blog, the hunting queries, I first want to cover some theoretical background. The concept is based on the externaldata operator in KQL. The operator is designed to query data stored in external storage systems, such as Azure Blob Storage or GitHub. The data is read from external storage and returned as a table. The most common supported data formats are CSV, JSON, MultiJSON and TXT. For the full list, have a look at the Microsoft documentation.
The externaldata operator leverages dynamic external input. The advantage is that there is no long array to maintain in the query itself. Every time the query runs, the latest version of the external content is retrieved. This helps, for example, with indicators of compromise, as you only want to query for fresh IOCs to prevent false positives.
For public resources, the link to the file is enough. For example, the query below loads the recent malware MD5 hashes from abuse.ch into KQL.
let MalwareSampleMD5 = externaldata(MD5: string)[@"https://bazaar.abuse.ch/export/txt/md5/recent"] with (format="txt", ignoreFirstRecord=True);
MalwareSampleMD5
The above example works well for public information, but what if you want to enrich your queries based on internal information? The solution is to store the data in a storage blob and connect to it with a SAS token. Such a request would be similar to the one below.
externaldata(Timestamp:datetime, ProductId:string, ProductDescription:string)
[
h@"https://mycompanystorage.blob.core.windows.net/archivedproducts/2019/01/01/part-00000-7e967c99-cf2b-4dbb-8c53-ce388389470d.csv.gz?...SAS...",
h@"https://mycompanystorage.blob.core.windows.net/archivedproducts/2019/01/02/part-00000-ba356fa4-f85f-430a-8b5a-afd64f128ca4.csv.gz?...SAS...",
h@"https://mycompanystorage.blob.core.windows.net/archivedproducts/2019/01/03/part-00000-acb644dc-2fc6-467c-ab80-d1590b23fc31.csv.gz?...SAS..."
]
with(format="csv")
| summarize count() by ProductId
The externaldata operator can be used both as a variable (in a let statement) and inline. You can also query multiple external sources at once, although I do not recommend this for performance reasons.
This blog only discusses a few example queries; there are many more available to try:
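To make the difference between the two forms concrete, here is a minimal sketch of both, reusing the abuse.ch MD5 export from earlier. It assumes the DeviceFileEvents table with its Timestamp and MD5 columns (Defender XDR naming), and it assumes the export may start with comment lines beginning with "#", which are filtered out. Treat it as an illustration rather than a tuned detection.

```kql
// Variable form: load the list once with let, then reference it by name.
// Assumption: lines starting with "#" are comments in the export.
let RecentMD5 = externaldata(MD5: string)[@"https://bazaar.abuse.ch/export/txt/md5/recent"] with (format="txt")
| where MD5 !startswith "#";
DeviceFileEvents
| where Timestamp > ago(24h)
| where MD5 in (RecentMD5)
| project Timestamp, DeviceName, FileName, FolderPath, MD5
```

```kql
// Inline form: the externaldata call sits directly inside the join.
DeviceFileEvents
| where Timestamp > ago(24h)
| where isnotempty(MD5)
| join kind=inner (
    externaldata(MD5: string)[@"https://bazaar.abuse.ch/export/txt/md5/recent"] with (format="txt")
    | where MD5 !startswith "#"
  ) on MD5
| project Timestamp, DeviceName, FileName, FolderPath, MD5
```

The variable form tends to read better when the same list is used more than once in a query; the inline form keeps short, one-off lookups compact.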
- Threat Hunting - Rules that start with TI Feed use the externaldata operator.
- MISP Feeds - KQL-MISP implementation queries.
- Threat Intelligence feeds - Based on the list of feeds you can create your own queries to hunt for suspicious activities.
Suspicious Named Pipes
A named pipe is a named, one-way or duplex pipe for communication between the pipe server and one or more pipe clients, which makes it an easy form of communication between related and unrelated processes. Many offensive tools leverage this functionality too, and some of them use standardized pipe names. These standardized pipe names can be used to detect the execution of malicious software in your environment.
Some examples of offensive tools that leverage standardized pipe names are:
- Cobalt Strike
- Havoc C2
- PsExec
The query below gets the Suspicious Named Pipe List from the GitHub repository of mthcht; this list contains over 300 named pipes to monitor. The query ingests the list dynamically and searches the DeviceEvents table for matches. In the case below, multiple PsExec named pipes have been detected.
Query:
let NamePipes = externaldata(pipe_name: string, metadata_description: string, metadata_tool:string, metadata_category: string, metadata_link: string, metadata_priority:string, metadata_fp_risk: string, metadata_severity: string, metadata_tool_type: string, metadata_usage: string, metadata_comment: string, metadata_reference: string)[@"https://raw.githubusercontent.com/mthcht/awesome-lists/refs/heads/main/Lists/suspicious_named_pipe_list.csv"] with (format="csv", ignoreFirstRecord=True);
let StandardizedPipes = NamePipes
| project pipe_name = replace_string(tolower(pipe_name), "*", "");
DeviceEvents
| where Timestamp > ago(24h)
| where ActionType == "NamedPipeEvent"
| where split(tolower(AdditionalFields.PipeName), "\\")[-1] has_any(StandardizedPipes)
| extend PipeName = AdditionalFields.PipeName, PipeNameChild = split(tolower(AdditionalFields.PipeName), "\\")[-1]
| project-reorder Timestamp, PipeName, DeviceName, AccountName
Tor
The second example is based on connections made to Tor nodes. While Tor has legitimate uses for protecting personal privacy and circumventing censorship, connections from corporate devices to Tor nodes are often unwanted. They can be detected using the dynamic IP list of Tor nodes provided by dan.me.uk, which lets you query the most recent nodes each time the query is executed.
let TorNodes = externaldata(IP:string )[@"https://www.dan.me.uk/torlist/?full"] with (format="txt", ignoreFirstRecord=False);
let IPs = TorNodes
| distinct IP;
DeviceNetworkEvents
| where ActionType == "ConnectionSuccess"
| where RemoteIP in (IPs)
| project-reorder TimeGenerated, DeviceName, RemoteIP, InitiatingProcessAccountName, InitiatingProcessCommandLine
Performance: The queries provided in this blog can be resource-intensive, as the externaldata operator needs to fetch and parse the external data each time a query runs. Especially for Defender XDR customers, it is important to be aware of the CPU quota every tenant has. It is recommended to use time-based filters so that you only query the last 24 hours or the last 7 days, for example. Summarizing the data before the external enrichment can also reduce the performance impact.
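As a rough sketch of both recommendations, the Tor hunt could be bounded in time and summarized before it is matched against the external list. The 24-hour window, the summary column names (Connections, FirstSeen, LastSeen) and the use of Timestamp (the Defender XDR column name; Sentinel uses TimeGenerated) are assumptions for illustration, not measured optimizations.

```kql
// Load the Tor node list once and deduplicate it.
let TorNodes = externaldata(IP: string)[@"https://www.dan.me.uk/torlist/?full"] with (format="txt")
| distinct IP;
DeviceNetworkEvents
// Time-based filter: only look at the last 24 hours (assumed window).
| where Timestamp > ago(24h)
| where ActionType == "ConnectionSuccess"
// Summarize first, so the external list is matched against one row
// per device and remote IP instead of every single connection event.
| summarize Connections = count(), FirstSeen = min(Timestamp), LastSeen = max(Timestamp) by DeviceName, RemoteIP
| where RemoteIP in (TorNodes)
| sort by Connections desc
```

The trade-off is that the summarized output no longer carries per-event details such as the initiating command line; those can be looked up afterwards for the devices that match.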
CISA Known Exploited Vulnerabilities
The CISA Known Exploited Vulnerabilities Catalog (CISA KEV) helps organizations prioritize vulnerabilities. The CISA KEV catalog can be combined with the vulnerability information in the DeviceTvmSoftwareVulnerabilities table to enrich and prioritize vulnerabilities. The query below creates a column chart with the total number of vulnerable devices for each active CISA KEV vulnerability found in your environment. It is highly recommended to patch the vulnerabilities found in the list, as they are already actively exploited by adversaries. The column chart can also be easily shared with management to give priority to the top 10 most active vulnerabilities in your environment.
let KnowExploitesVulnsCISA = externaldata(cveID: string, vendorProject: string, product: string, vulnerabilityName: string, dateAdded: datetime, shortDescription: string, requiredAction: string, dueDate: datetime,
notes: string)[@"https://www.cisa.gov/sites/default/files/csv/known_exploited_vulnerabilities.csv"] with (format="csv", ignoreFirstRecord=True);
DeviceTvmSoftwareVulnerabilities
| join kind=inner KnowExploitesVulnsCISA on $left.CveId == $right.cveID
| summarize TotalDevices = dcount(DeviceId) by CveId
| sort by TotalDevices
| render columnchart with(title="Active CVEIds CISA KEV")
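If you prefer a short table over a chart, a variation along the following lines limits the output to the ten CVEs affecting the most devices and keeps some KEV context next to each one. This is a sketch that copies the schema from the query above; the variable name KEV and the choice of vulnerabilityName and dueDate as context columns are my own assumptions.

```kql
// Same KEV source and column schema as the query above.
let KEV = externaldata(cveID: string, vendorProject: string, product: string, vulnerabilityName: string, dateAdded: datetime, shortDescription: string, requiredAction: string, dueDate: datetime, notes: string)[@"https://www.cisa.gov/sites/default/files/csv/known_exploited_vulnerabilities.csv"] with (format="csv", ignoreFirstRecord=true);
DeviceTvmSoftwareVulnerabilities
| join kind=inner KEV on $left.CveId == $right.cveID
// Keep the KEV name and due date alongside the device count.
| summarize TotalDevices = dcount(DeviceId) by CveId, vulnerabilityName, dueDate
// Return only the ten CVEs that affect the most devices.
| top 10 by TotalDevices desc
```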
What is the CISA Known Exploited Vulnerabilities Catalog? “The Known Exploited Vulnerabilities Catalog is developed for the benefit of the cybersecurity community and network defenders—and to help every organization better manage vulnerabilities and keep pace with threat activity—CISA maintains the authoritative source of vulnerabilities that have been exploited in the wild: the Known Exploited Vulnerability (KEV) catalog. CISA strongly recommends all organizations review and monitor the KEV catalog and prioritize remediation of the listed vulnerabilities to reduce the likelihood of compromise by known threat actors” - Source: Cybersecurity and Infrastructure Security Agency
For more CISA KEV and vulnerability queries see GitHub.
MISP
MISP is a well-known open source threat intelligence platform that can be combined with KQL. With the externaldata operator, there is no need for additional connectors or extra infrastructure to hunt for IOC matches across your data. I have written a partial KQL-MISP implementation. The implementation is based on the externaldata operator, which collects the data externally and ingests it into a KQL query. This can be used for detection rules, threat hunting, enrichment based on incident entities, or incident response scenarios where you quickly want to know if known bad IPs are found in a tenant.
Putting all the MISP queries in this blog would not be suitable, but there is plenty to hunt for listed on GitHub.
The image below shows a subset of the currently supported MISP connectors.
For the full implementation status and documentation see GitHub.
Questions? Feel free to reach out to me on any of my socials.