If you work in cybersecurity or tech, you’re likely familiar with hashing. A cryptographic hash function generates a fixed-size hash value from any given input data. This is a one-way process, making it computationally infeasible to reverse-engineer the original data from the hash value. Hashing is excellent for identifying specific files or binaries, such as a particular malware sample. However, its effectiveness in detection diminishes quickly because even a single change in the input (like altering one character) will result in a completely different hash value. Changing a binary’s hash is incredibly easy; malware authors might inject random words, or a binary might be compiled with specific information for a targeted organization, resulting in a different hash for each impacted organization.

This limitation in traditional hashing methods has led to the development of fuzzy hashing algorithms, which offer a more flexible approach to identifying similar but not identical files. However, many security analysts are still unaware of fuzzy hashing and how it works. Even those who are aware often struggle to use these algorithms effectively for detection or hunting in their environments. This is primarily because logs typically don’t include fuzzy hash values, and even when SIEM/XDR tools provide TLSH values for binaries, they rarely offer mechanisms for organizations to incorporate their own TLSH hashes and acceptable distance thresholds. Consequently, TLSH hashing is often used reactively after a detection, with analysts providing values to search for other samples in tools like Stairwell or VirusTotal for additional infrastructure and Indicators of Compromise (IoCs). This is disappointing because the potential of algorithms like TLSH for detection and threat hunting is immense. This blog post will explore ways to operationalize TLSH in your environment for proactive detection.

While this blog post will focus on TLSH, it’s important to be aware of other common fuzzy hashing algorithms such as SSDEEP. These algorithms are specific implementations of broader concepts like Locality-sensitive hashing and Context-Triggered Piecewise Hashing.

TLSH - Trend Micro Locality-Sensitive Hash

TLSH is a custom implementation of a Locality-sensitive Hashing algorithm developed by the smart folks over at Trend Micro. If you want to learn how it works under the hood, I recommend reading some of the official material which can be found here.

Fuzzy Hashing/TLSH in English

Imagine you have a bunch of different puzzles, each with a unique picture. Traditional hashing is like taking a photo of each puzzle once it’s completed. If you change even a single piece, the photo changes completely, making it hard to tell if the new photo is similar to the original.

TLSH, or Trend Micro Locality-Sensitive Hashing, works differently. Instead of taking a photo of the entire puzzle, it looks at various parts of the puzzle and gives each part a score based on its characteristics. It then combines these scores into a summary. If you change a few pieces of the puzzle, the summary changes only a little, allowing TLSH to recognize that the two puzzles are still similar.

This way, even if a file or a piece of data is slightly altered, TLSH can still tell that it is similar to the original. It’s like having a way to see the overall pattern of a puzzle rather than focusing on individual pieces, making it easier to spot similarities and differences.
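
If you want to see this behavior for yourself, here’s a quick sketch using the py-tlsh library (covered in more detail later in this post). The file names are just placeholders for any two closely related binaries you have lying around, and keep in mind that TLSH needs a reasonable amount of input (roughly 50 bytes or more, with some variance) to produce a hash at all:

import tlsh

# Two closely related files, e.g. two builds of the same tool
# (the file names here are placeholders)
with open("sample_v1.bin", "rb") as f:
    hash_v1 = tlsh.hash(f.read())
with open("sample_v2.bin", "rb") as f:
    hash_v2 = tlsh.hash(f.read())

# A small distance means the "summaries" are close, even though the
# files' SHA256 values would be completely different
print(tlsh.diff(hash_v1, hash_v2))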

TLSH Detection and False Positive Rate

The score that TLSH hashing comparisons generate measures the distance between two files. A higher score indicates less similarity, while a lower score indicates more similarity. This scoring and distance measurement is why TLSH is not commonly exposed in modern security tools outside of threat intelligence platforms like Stairwell or VirusTotal that allow you to search through files. Unlike simply checking if a SHA256 hash matches a list of known malware hashes or if a domain matches a list of known malicious domains, TLSH requires evaluating the degree of similarity, making its integration into existing systems more complex.

Additionally, different Security Operations Centers (SOCs) have varying levels of tolerance for false positive alerts. Fortunately, Trend Micro conducted extensive research on the detection and false positive rates for various TLSH comparison scores, as well as how another Locality-Sensitive Hashing Algorithm known as Nilsimsa compares. The full results of this research can be found here.

TLSH Scoring Chart

As you can see, a score of less than 30 has a false positive rate of only 0.00181% and a detection rate of 32.2%, while a score of 100 maintains a reasonable false positive rate of 6.43% with an impressive 94.5% detection rate.

This means that if you’re able to maintain a high-quality list of TLSH hashes, you can achieve a low false positive rate in your SOC while having a 94.5% chance of matching a known binary when it is observed, even if its cryptographic hash is different.
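
One simple way to put that trade-off into practice is to keep two thresholds in your configuration: one for alerting and one for hunting. The values below are just an illustration pulled from the chart above; pick the numbers that match your own false positive tolerance:

# Illustrative distance thresholds based on the Trend Micro research above;
# tune these to your SOC's false positive tolerance
TLSH_THRESHOLDS = {
    "alert": 30,   # ~0.00181% false positive rate, ~32.2% detection rate
    "hunt": 100,   # ~6.43% false positive rate, ~94.5% detection rate
}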

Operationalizing TLSH Hashing

Don’t assume your EDR tooling isn’t already using TLSH matching for detection behind the scenes; it may well be, even if it doesn’t provide an option for you to add your own IoCs. Trend Micro, the developer of TLSH, is a perfect example of this usage, and it is likely that other vendors are using it as well. That said, there can be significant value in managing your own TLSH hash dataset. For instance, maintaining a list of files that bypass the delivery phase of the Kill Chain within your organization, or that evaded your EDR, can help detect and prevent similar threats in the future. Let’s talk about some of the ways we might use it in the wild.
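
What that dataset looks like is entirely up to you, but here’s a minimal sketch assuming you drop interesting samples into a directory and want a simple JSON watchlist of their TLSH values (the directory and file names are hypothetical):

import json
import os
import tlsh

def build_tlsh_watchlist(sample_dir, output_path):
    # Compute a TLSH value for every sample you want to track,
    # e.g. files that slipped past email filtering or your EDR
    watchlist = {}
    for name in os.listdir(sample_dir):
        path = os.path.join(sample_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            tlsh_value = tlsh.hash(f.read())
        # tlsh.hash() can't produce a value for inputs that are too
        # small or too uniform ("TNULL" or an empty string, depending
        # on the library version)
        if tlsh_value not in ("", "TNULL"):
            watchlist[name] = tlsh_value
    with open(output_path, "w") as f:
        json.dump(watchlist, f, indent=2)

build_tlsh_watchlist("evaded_samples/", "tlsh_watchlist.json")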

Identification of Other Recent Infrastructure

Using tools like Stairwell, abuse.ch, VirusTotal, or other similar services, you can use the TLSH hash to find other recently submitted samples that might be similar but use different infrastructure, such as other domains, IP addresses, or follow-on malware. This can be incredibly useful when you detect malware in your environment, as it allows you to identify related infrastructure that you might not have originally detected.
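As a rough sketch of that kind of pivot, here’s roughly what a TLSH query against abuse.ch’s MalwareBazaar could look like. The query name and parameters are based on my reading of their API documentation and may change over time (abuse.ch also requires an auth key for API access these days), so verify against the current docs before relying on it:

import requests

def find_similar_samples(tlsh_value, auth_key, limit=50):
    # Pivot on a TLSH value via MalwareBazaar's API; check the current
    # documentation for the exact query name and parameters
    response = requests.post(
        "https://mb-api.abuse.ch/api/v1/",
        data={"query": "get_tlsh", "tlsh": tlsh_value, "limit": limit},
        headers={"Auth-Key": auth_key},
        timeout=30,
    )
    response.raise_for_status()
    return response.json().get("data", [])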

Tracking Malware Campaigns

You can group together malware samples with low distance scores to identify and track different variants of the same malware family, such as Emotet’s various epochs. Additionally, by monitoring for larger jumps in the distance scores (that are still within your false positive threshold), you may potentially identify significant changes in techniques or payloads. This may indicate a new malware campaign, providing opportunities for further detection and analysis.
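
Here’s a naive sketch of what that grouping could look like. For large datasets you’d want something smarter (the TLSH project has published clustering tooling of its own), but the idea is simply to bucket samples whose distance to a group’s representative falls under your chosen threshold:

import tlsh

def group_samples(tlsh_values, max_distance=30):
    # Naive single-pass grouping: each group is represented by the
    # first TLSH value assigned to it, and a sample joins the first
    # group whose representative is within max_distance
    groups = []
    for value in tlsh_values:
        for group in groups:
            if tlsh.diff(group[0], value) <= max_distance:
                group.append(value)
                break
        else:
            groups.append([value])
    return groups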

Detection and Threat Hunting

If you’re fortunate enough that some of your tools output the TLSH hash of a binary executing in your environment but don’t provide a way to supply your own TLSH values, you can use tools such as Jupyter Notebooks to hit your tooling’s API and perform the comparison yourself. Decide on an acceptable false positive threshold based on the chart above; you might alert directly on matches below a lower distance score and reserve a higher score for threat hunting.

Here’s a basic Python implementation for calculating the TLSH distance and comparing it to a threshold:

First, install the py-tlsh package:

pip install py-tlsh

Then:

import tlsh

def measure_tlsh(user_supplied_tlsh, max_distance, tlsh_to_compare):
    # tlsh.diff() returns the distance between two TLSH values;
    # lower scores mean the files are more similar
    score = tlsh.diff(user_supplied_tlsh, tlsh_to_compare)
    if score < max_distance:
        return "match"
    return "no match"

Retroactive Hunting

As you identify new TLSH hashes for the various malware families and tools you may track, running historical lookups and comparisons can be a great way to find a tool that may have been missed in the past.
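
Here’s a sketch of that kind of retroactive sweep, assuming you can export historical file or process events with their TLSH values to a CSV (the column names are made up for illustration, and measure_tlsh is the function from earlier):

import csv
import json

def retro_hunt(export_path, watchlist_path="tlsh_watchlist.json", max_distance=100):
    # Load the watchlist once, then walk a historical export (one row
    # per observed binary) and flag anything within hunting distance
    with open(watchlist_path) as f:
        watchlist = json.load(f)
    hits = []
    with open(export_path, newline="") as f:
        for row in csv.DictReader(f):
            for name, known_tlsh in watchlist.items():
                if measure_tlsh(known_tlsh, max_distance, row["tlsh"]) == "match":
                    hits.append((row["hostname"], row["sha256"], name))
    return hits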

Closing

As you can see, learning how to operationalize TLSH hashing can have a powerful impact from a detection perspective, a hunting perspective, a grouping and tracking perspective, and during incident response for identifying new infrastructure. Not only that, but it raises the bar for adversaries targeting your organization by forcing them to change how their malware works or to use other methods such as adding binary padding or UPX packing to evade detection, which themselves can become detectable toolmarks.