YARA rules are powerful pattern-matching tools for identifying, classifying, and detecting malicious activity. Malware analysts, security researchers, and incident responders use them to defend against malware and hunt for bad guys. They are also one of the fundamental pieces of tactical intelligence you will share with operational teams as a cyber threat intelligence analyst.
YARA is a key concept for any cyber security professional to learn, and this guide will teach you everything you need to know. You will discover what YARA rules are and how to use them, explore how to create your own YARA rules with useful tips, and unlock the best practices for using YARA rules in the real world.
Let’s dive in and begin elevating our cyber security skills!
What Are YARA Rules?
YARA is a pattern-matching tool that searches data, such as memory dumps, packet captures, or binary files, for patterns you define. In cyber security, it is used primarily to detect malware and hunt for indicators on endpoints, but its general-purpose matching makes it useful in other fields too.
You can think of YARA as grep or regular expressions on steroids with its advanced pattern-matching features, extension modules, and wide adoption in the malware analysis and incident response community.
As a threat intelligence analyst, you must become familiar with understanding, handling, and creating your own YARA rules. They are essential to your tactical intelligence toolkit and help operational teams proactively hunt for malicious activity.
The team behind VirusTotal maintains YARA to help malware researchers identify and classify malware samples. The tool, which has the tagline “The pattern-matching Swiss knife for malware researchers (and everyone else),” has become a staple in the cyber industry. It is widely used by researchers, analysts, and security tooling.
Let’s explore what a YARA rule actually looks like so you can begin using them.
Anatomy of a YARA Rule
The most basic YARA rule will consist of three main sections.
- Meta Section: This contains metadata about the rule, such as its name, a brief description of what it detects, the author, and other references.
- Strings Section: This section defines the binary strings or patterns the rule should match against. They can be plaintext strings (enclosed in double quotes), hexadecimal byte sequences (enclosed in curly brackets), or regular expressions (enclosed with forward slashes).
- Condition Section: Where the logic of the rule is stored. This section defines the criteria required to trigger the rule using Boolean logic and the data described in the Strings Section.
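To make the three sections concrete, here is a minimal sketch of a rule. The rule name, metadata values, and strings are all invented for illustration and do not describe a real threat.

```yara
rule Example_Three_Sections
{
    meta:
        // Free-form key/value pairs; they do not affect matching
        description = "Illustrative rule showing the three sections"
        author = "Example Analyst"

    strings:
        // Plaintext string, matched as ASCII and UTF-16, ignoring case
        $text = "malicious-config.dat" ascii wide nocase
        // Hexadecimal byte sequence (here, the start of a typical PE file)
        $hex = { 4D 5A 90 00 03 00 00 00 }
        // Regular expression for a made-up command-and-control URL
        $regex = /https?:\/\/[a-z0-9.]{4,64}\/gate\.php/

    condition:
        // The hex string must sit at offset 0, plus either other string
        $hex at 0 and ( $text or $regex )
}
```

The condition is where the strings are tied together: each `$identifier` evaluates to true if that string was found, and Boolean operators combine them.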
YARA rules are typically broken into individual files ending in .yar, each targeting a specific malware variant, TTP, or threat actor. However, you can combine multiple YARA rules into a single file or import other rules into a file.
YARA rules can also include comments. Use // for a single-line comment or /* <comment> */ for a multi-line comment.
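As a rough sketch, a single file holding two rules with both comment styles might look like the following. The rule names and strings are hypothetical, and the include line is left commented out because shared_rules.yar is an assumed file name that would need to exist on disk.

```yara
/*
    Multi-line comment: this file groups two illustrative rules.
    Neither rule describes a real threat.
*/

// include "shared_rules.yar"   // uncomment to pull in rules from another file

rule Example_First_Rule
{
    strings:
        $a = "first-example-marker"   // single-line comment after a string
    condition:
        $a
}

rule Example_Second_Rule
{
    strings:
        $b = "second-example-marker"
    condition:
        // A rule can reference another rule defined earlier in the file
        $b and not Example_First_Rule
}
```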
Once you have created a YARA rule or set of rules, you can use the YARA command line tool or another security tool that uses the YARA engine to test a target, such as a file, folder, or process, against your rules.
On the Windows command line, execute yara.exe [options] <rule-file(s)> <target> to scan a file with the YARA executable.
Here, the sample.exe and sample2.exe files are scanned using the rule.yar YARA rule. You can see sample.exe matches against a rule in this file named MaliciousFile, whereas the sample2.exe file does not (no output).
You can download the YARA command line tool from VirusTotal’s GitHub page. Other popular alternative applications that allow you to run YARA rules, along with other common detection rule formats like IOC lists and Sigma rules, include:
- Thor and Thor Lite by Nextron Systems.
- Invoke-Yara to run YARA through a PowerShell script.
- The popular performant endpoint visibility platform osquery.
- Yara Scanner.
In addition to applications, there are add-ons that allow you to use YARA with popular tools (e.g., Burp Suite, IDA Pro, Binary Ninja, Cutter, Firefox) and wrappers/bindings for popular programming languages (e.g., C#, Go, OCaml, Java, Rust, etc.). This compatibility allows you to integrate YARA into your own custom tools.
YARA itself is written in C, but if you are a Python fan, don’t worry. The official yara-python bindings and plenty of other Python modules let you create, parse, and run YARA rules.
Hex Special Values
One of YARA’s key features is its ability to match various types of strings against files. The most popular of these is hexadecimal byte sequences because attackers find them difficult to hide or obfuscate in binary files. Because of this, YARA includes special hex values to make pattern matching more versatile.
These include:
- Wildcards that let you use question marks to stand in for unknown values in a hexadecimal string. ?? matches any single byte, and a single ? in place of one hex digit matches any nibble. For example, 01 ?? 03 would match any sequence that starts with 01, followed by any byte, and then 03.
- Jumps that represent a variable-length run of arbitrary bytes, written as a range in square brackets using the hyphen character (-). For example, 01 [2-4] 05 would match any sequence that starts with 01, followed by any two to four bytes, and then 05.
- Negation using the tilde (~) to exclude a byte value (available from YARA 4.3.0). For example, 01 ~00 03 would match 01, followed by any byte except 00, and then 03.
- Alternatives to provide different options for a given fragment of a hex string. For example, ( 62 B4 | 56 ) will match 62 B4 or 56. This is similar to the OR Boolean operator.
You can even chain these special values together to create complex patterns.
rule AlternativesExample2
{
    strings:
        $hex_string = { F4 23 ( 62 B4 | 56 | 45 ?? 67 ) 45 }
    condition:
        $hex_string
}
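To see each special value in isolation, the sketch below gives every construct its own string. The byte values are arbitrary placeholders chosen for readability, and the negation string assumes YARA 4.3.0 or later.

```yara
rule Hex_Special_Values_Sketch
{
    strings:
        // Wildcard: ?? is any byte, A? is any byte whose high nibble is A
        $wildcard = { E2 34 ?? C8 A? FB }
        // Jump: between 4 and 6 arbitrary bytes may sit in the gap
        $jump = { F4 23 [4-6] 62 B4 }
        // Negation: the third byte may be anything except 00 (YARA 4.3.0+)
        $negation = { F4 23 ~00 62 B4 }
        // Alternatives: either 62 B4 or 56 may appear before the final 45
        $alternatives = { F4 23 ( 62 B4 | 56 ) 45 }

    condition:
        any of them
}
```

Keeping a few fixed bytes on either side of a wildcard or jump gives the engine something concrete to anchor on, which generally helps scanning speed.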
This is just a taste of YARA’s string-matching capabilities. For a complete reference guide on text and regular expression capabilities, see the documentation on writing rules.
YARA Modules
YARA’s most powerful feature is its module extensions, which provide additional functionality beyond the core features of the YARA engine. These modules allow you to integrate tools, services, and data sources with YARA to extend its capabilities.
Popular modules include:
- PE: Match against the fields in a Portable Executable (PE) header. These are Windows executable files.
- ELF: Match against the fields in an Executable and Linkable Format (ELF) header. These are Linux executable files.
- Magic: Match against the “magic bytes” that distinguish a file’s type.
- Hash: Calculate and match against a file’s MD5, SHA1, or SHA256 hash.
- Math: Calculate certain values from portions of the file and match against them, such as entropy and common statistical functions like mean and deviation.
- Dotnet: Match against attributes and features of the .NET file format.
- Time: Allows you to use temporal conditions in your rules, such as comparing to the current time.
- Console: A module for logging information to standard output (stdout) during condition execution.
- LNK: Match against attributes and features of the LNK file format.
For instance, to search for files with the filename explorer.exe that are of uncommon size, you can use the following YARA rule:
rule Suspicious_Size_explorer_exe {
    meta:
        description = "Detects uncommon file size of explorer.exe"
        author = "Florian Roth"
        score = 60
        date = "2015-12-21"
    condition:
        uint16(0) == 0x5a4d
        and filename == "explorer.exe"
        and ( filesize < 1000KB or filesize > 3000KB )
}
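This rule uses the uint16(0) function to read the two bytes at offset 0 and compare them with the magic bytes 0x5a4d (the MZ header present in all PE files), so it only matches PE files. It then compares filename with explorer.exe and uses a filesize range to find files less than 1000KB or greater than 3000KB (uncommon for the genuine explorer.exe binary). Note that filename is not part of core YARA or one of the modules listed above. It is an external variable that the scanning tool must supply, for example through the -d option of the YARA command line tool. The score field helps an analyst triage the detection.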
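For comparison, here is a small sketch of a rule that does rely on one of the modules listed earlier. It imports the PE module and checks header fields and imported functions. The rule name, section count, and choice of imports are illustrative assumptions, not a tested detection.

```yara
import "pe"

rule PE_Module_Sketch
{
    meta:
        description = "Illustrative use of the PE module; values are placeholders"

    condition:
        // Cheap check first: the file must start with the MZ header
        uint16(0) == 0x5A4D
        // Few sections is unusual for many legitimate applications
        and pe.number_of_sections <= 3
        // Imports commonly associated with process injection
        and pe.imports("kernel32.dll", "VirtualAllocEx")
        and pe.imports("kernel32.dll", "WriteProcessMemory")
}
```

A module must be imported at the top of the rule file before its fields and functions can be used in a condition.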
You can use community modules or write your own to express more complex conditions. Modules are written in C and built into YARA at compile time. Creating your own YARA modules is outside the scope of this article. For a complete guide, read the documentation for writing your own modules.
Now that you have a basic understanding of YARA rules and how to create them, let’s explore some tips to help you get started.
Tips for Writing Effective YARA Rules
YARA rules can range from simple ones that match a text string to extremely complex ones that use multiple modules and conditions to avoid triggering false positives. This can make mastering YARA rules difficult.
At times, YARA can seem like a never-ending battle to create a detection rule that accounts for all potential false positives but is not too specific to exclude malware variations. This can quickly add complexity to your rule, impact the performance, and make it hard to maintain.
Here are some useful tips for writing effective YARA rules to help you get started and avoid common pitfalls.
Read the Documentation
The documentation is the first step in creating YARA rules! Here, you will find everything you need to start, including instructions on writing rules and references to using specific tool features.
Start Simple
Begin by defining the purpose and objective of your rule and create simple detection logic that meets these requirements. Don’t try to go all in from the start by importing several modules or matching all edge cases. Start simple and build up your rule over time to reduce false positives by only including the code you need. The simpler the rule, the faster it will run.
Use Descriptive Names
Use a name that clearly describes the purpose or functionality of the YARA rule. Using generic names or numbers is useless if you intend to share it with the wider community. It also makes your rules harder to manage as your ruleset grows.
Document Rules With Metadata
Include metadata! Your rule should always include documentation that describes its purpose, the type of threat it targets, and any relevant references or sources it uses (you cannot include all this in a name). This data will help other analysts (and future you) understand and use your rules effectively.
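As one possible layout, the sketch below pairs a descriptive rule name with a filled-in meta section. The malware family, author, URL, and string are fictional, and the hash value is a placeholder to be replaced with the hash of a confirmed sample.

```yara
rule MAL_ExampleFamily_Loader
{
    meta:
        description = "Detects the loader component of the fictional ExampleFamily malware"
        author = "Example Analyst"
        date = "2024-01-15"
        // Placeholder URL; point this at the report the rule is based on
        reference = "https://example.com/examplefamily-report"
        // Placeholder; replace with the SHA256 of a confirmed sample
        hash = "replace-with-sha256-of-a-confirmed-sample"
        score = 75

    strings:
        $cfg = "ExampleFamily_cfg_v2" ascii wide

    condition:
        uint16(0) == 0x5A4D and $cfg
}
```

Meta keys are free-form, so what matters most is choosing a set of fields and using them consistently across your ruleset.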
Test Thoroughly
YARA rules will generate false positives (trigger on legitimate binaries) or false negatives (not trigger when you want them to). The only way to find these is to rigorously test your rule against diverse samples, including malicious and benign files. This will allow you to evaluate its effectiveness and performance and make necessary changes to improve accuracy.
Collaborate With Others
YARA has been widely adopted in cyber by a thriving community of enthusiasts. Use this to your advantage. Collaborate with other cyber security professionals, exchange rules and insights, and contribute to community-driven threat intelligence efforts. This will help you create better rules and strengthen the overall cyber landscape.
Leverage Contextual Information
Contextual information, such as file metadata and attributes, can be incredibly helpful in enhancing the accuracy of your YARA rules. If you know a piece of malware is always a certain size or always includes a certain PE header attribute, you can instantly filter out anything that doesn’t match these requirements. This will reduce the range of false positives and let you narrow in on what you want to search for (specific strings or mutexes).
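A rough sketch of this idea is shown below, assuming a hypothetical sample that is always a small PE file with exactly three sections and creates a known mutex. All of the values are invented for illustration.

```yara
import "pe"

rule Contextual_Filter_Sketch
{
    strings:
        // Hypothetical mutex name used by the imagined malware
        $mutex = "Global\\ExampleMutex_7f3a" ascii wide

    condition:
        // Context checks narrow the candidate files before string matching matters
        uint16(0) == 0x5A4D
        and filesize < 500KB
        and pe.number_of_sections == 3
        and $mutex
}
```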
Tune How Specific Your Rules Are
Not all YARA rules are designed to fulfill the same purpose. Some are used to hunt potentially malicious behavior and generate more false positives to spread a wider net. Others are focused on detecting threats and will only trigger if a specific malicious thing is found. Ensure you define your rule’s purpose. It will affect how specific you make it.
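The difference can be as small as the condition. The two sketches below share the same illustrative strings: the hunting rule fires on any single string, while the detection rule requires all of them in a small file. The names and the size limit are assumptions made for the example.

```yara
rule HUNT_Suspicious_PowerShell_Strings
{
    strings:
        $s1 = "FromBase64String" ascii wide nocase
        $s2 = "DownloadString" ascii wide nocase
        $s3 = "-EncodedCommand" ascii wide nocase

    condition:
        // Wide net: one hit is enough to surface the file for review
        any of ($s*)
}

rule DETECT_PowerShell_Downloader_Sketch
{
    strings:
        $s1 = "FromBase64String" ascii wide nocase
        $s2 = "DownloadString" ascii wide nocase
        $s3 = "-EncodedCommand" ascii wide nocase

    condition:
        // Narrow net: every string must be present in a small file
        filesize < 100KB and all of ($s*)
}
```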
If you are still struggling to get started or want to know how your rules compare to others, check out these free learning resources:
- The YARA toolkit: An excellent collection of YARA tools by Thomas Roccia. It includes a YARA editor, generator, scanner, code generation tools, and database of YARA rules you can search through.
- Yara-Rules GitHub repository: A great collection of YARA rules that you can use to hunt for malware or as inspiration to create your own rules.
- Awesome YARA: A curated list of awesome YARA rules, tools, and resources to accelerate your learning journey.
Real-World Use of YARA Rules
At this point, you may think, “YARA rules are great. Why not use them for everything?” Unfortunately, like everything in security, they have moments of brilliance and real-world limitations, so let’s explore when it is right to use them.
When to Use YARA Rules
YARA is a versatile technology that cyber security analysts can use in various scenarios. These include:
- Malware Analysis: Write rules that identify, classify, and detect malware using known malware signatures or patterns within files or data.
- Threat Hunting: Create rules that allow you to proactively search for IOCs or suspicious patterns across systems or networks.
- Incident Response: Use YARA rules to help you identify and contain compromised systems during a security incident. You can even incorporate them into your incident response playbooks.
- Threat Intelligence: Share YARA rules with the cyber security community to help others defend against emerging cyber threats and foster collaboration.
- Detection Engineering: Use YARA to create custom detections tailored to your specific environment based on your threat model or threat profile.
This versatility allows any organization to strengthen its security posture by leveraging YARA rules effectively and offers the following key benefits:
- Customizability: You can tailor YARA rules to fit your use case, detect specific malware samples or malicious behavior, and tune them to filter out false positives relevant to your environment. This flexibility makes them excellent for building custom detections or threat hunts.
- Granularity: YARA lets you match patterns, characteristics, or indicators within files or data at a fine-grained level. This is perfect for creating rules that match specific indicators of compromise (IOCs), such as malware signatures and behavioral patterns.
- Compatibility: YARA is cross-platform compatible with tools that let you run rules on various operating systems and architectures (e.g., Windows, Linux, macOS). This allows you to use YARA rules across diverse environments and enables rule sharing.
- Community: The YARA project is backed by a large and active community of researchers, software developers, and cyber security professionals who regularly contribute new tools and share rules. This community support is the driving factor behind the security industry’s widespread adoption of YARA.
- Proactive defense: The sharing of threat intelligence and YARA rules allows cyber security professionals to proactively defend their organizations from threats. It empowers analysts to build detection rules and generate threat hunts that mitigate and identify threats before they escalate.
When NOT to Use YARA Rules
Despite their many benefits, YARA rules are not perfect for every situation. Their limitations mainly revolve around performance issues (running a YARA scan is resource-intensive) and skill issues (learning how to use YARA effectively takes time and dedication).
Here are some common limitations you will face when trying to use YARA rules:
- Vendor and tool support: Although many open-source tools and projects support YARA rules, commercial vendor support is low. Due to the performance overhead of running YARA rules, many SIEM or EDR providers do not allow users to create and run their own custom rules. However, technological advancements are slowly changing this.
- Complexity: Writing effective YARA rules requires malware analysis skills and a good understanding of the YARA syntax, regular expressions, and binary file formats. These concepts often require a steep learning curve and can be challenging for beginners to grasp.
- Performance impact: Scanning files or processes with YARA rules can be an intensive process that requires significant computational resources. This can impact the system’s responsiveness and negatively affect the end-user experience, particularly if you use a large ruleset. This impact on business operations may be deemed unacceptable.
- Rule maintenance: Building your own set of custom detections requires keeping up with emerging threats, new malware families, and changes in attack techniques. This requires regular maintenance and reviews of your YARA rules, which can become time-consuming and resource-intensive.
- Scalability: Applying YARA rules to large datasets or high-volume traffic streams can impose performance overhead on systems and networks. This makes scaling YARA rules across an enterprise environment difficult.
These limitations can make automating and scaling YARA across your enterprise challenging. You must carefully consider the implications of using YARA before incorporating it into your real-world workflows. However, if you are smart about where and when you use YARA rules, the benefits will outweigh the limitations for most organizations.
Know What You Are Scanning For
Another important factor to consider when using YARA rules is the type of malware you run them against. These days, most malware is packed (encrypted or obfuscated to make it difficult to detect) and then unpacked at runtime. YARA struggles to detect packed malware because the strings and byte sequences your rules look for are hidden until the malware unpacks itself.
The real value of YARA comes when you write rules that detect things in unpacked malware. This could mean executing YARA rules against a running process’s memory or performing dynamic malware analysis.
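One way to check what you are dealing with is a hunting rule that flags files that look packed, so you know string-based rules are unlikely to match them on disk. The sketch below uses the Math module for this. The entropy threshold and size limit are assumed values that would need tuning, and high entropy alone does not prove a file is malicious.

```yara
import "math"

rule HUNT_Likely_Packed_PE
{
    meta:
        description = "Illustrative check for high-entropy PE files; thresholds are assumptions"

    condition:
        uint16(0) == 0x5A4D
        // Cap the size so entropy is not calculated over very large files
        and filesize < 10MB
        // Entropy near 8.0 suggests compressed or encrypted content
        and math.entropy(0, filesize) >= 7.2
}
```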
Before you scan all your systems using YARA rules, think about what your YARA rules are actually searching for.
Conclusion
Using and understanding YARA rules is a fundamental skill for anyone in a technical cyber security role. Their powerful pattern-matching capabilities allow you to identify, classify, and detect malicious activity in files, data, and running processes. This makes them a key piece of intelligence for threat hunting, incident response, and malware analysis.
This guide has covered everything you need to start using YARA rules and crafting your own custom rules. You saw what a YARA rule looks like, how to scan files using them, and various tips on writing effective rules. Now, it’s time to apply this knowledge.
Apply the advice on real-world use and start putting this awesome technology to work!
Frequently Asked Questions
What Does YARA Stand For?
YARA does not officially stand for anything. The tool’s creator simply used it as its name. However, the expansions usually quoted as a joke are “YARA: Another Recursive Acronym” and “Yet Another Ridiculous Acronym,” which poke fun at the technology industry’s overuse of meaningless acronyms for tools and projects.
What Is the Purpose of YARA Rules?
YARA is a powerful pattern-matching tool for identifying and classifying malware. In cyber security, it is used alongside YARA rules to search for patterns across various data formats (e.g., memory dumps, packet captures, or binary files) and detect malicious activity on endpoint devices. Researchers, analysts, and incident responders use the tool to share, analyze, and find indicators (IOCs).
What Is the Difference Between Sigma Rules and YARA Rules?
YARA and Sigma are two technologies that cyber security professionals use to detect malicious activity. YARA is used to identify specific patterns or characteristics associated with malicious files. Sigma is designed to detect security incidents in log data generated by security appliances, operating systems, and applications.
Both play a role in detecting malicious activity. You should use YARA to detect and hunt for malware at a low level in binary executables, files, and process memory. Meanwhile, you should use Sigma to find suspicious or malicious activity in your log data using an Endpoint Detection and Response (EDR) or Security Information and Event Management (SIEM) tool.
Does CrowdStrike Use YARA Rules?
Yes. CrowdStrike is a popular Endpoint Detection and Response (EDR) tool widely used in enterprise environments. It uses YARA, alongside other technologies, to detect malware present on endpoint devices. However, the tool uses its own proprietary YARA rules that are not publicly available and does not allow users to upload their own detection rules.
CrowdStrike has a feature called MalQuery, a malware search engine that allows you to search for malware using YARA rules. Security researchers and analysts use it to search and attribute samples during an investigation to aid in threat hunting, prevention, and YARA rule creation.
Who Invented YARA Rules?
Software developer and malware researcher Victor Manuel Alvarez created YARA. He publicly released it in 2007 as an open-source tool for identifying and classifying malware. Since then, it has been picked up and maintained by the team behind VirusTotal and has seen mass adoption by the cyber security industry for threat detection, analysis, and classification.