Classification
ReversingLabs uses a classification algorithm that places analyzed files into the following buckets:
- No threats found (unclassified)
- Goodware/known
- Suspicious
- Malicious
The classification of a sample is based on a comprehensive assessment of its assigned risk factor, threat level, and trust factor; however, it can be manually or automatically overridden when necessary.
Risk score
A risk score is a value representing the trustworthiness or malicious severity of a sample. Risk score is expressed as a number from 0 to 10, with 0 indicating whitelisted samples from a reputable origin, and 10 indicating the most dangerous threats. At a glance:
Classification | Trust factor | Threat level | Risk score | Severity | Comment |
---|---|---|---|---|---|
0 (no threats found) | N/A | N/A | N/A | ⬜ N/A | No threats found. Please submit the sample to Spectra Intelligence for classification. |
1 (goodware/known) | 0 | N/A | 0 | 🟩 Clean | File comes from a very trustworthy domain or has a very trustworthy certificate. Examples: HP, IBM, Microsoft, Oracle, Intel, Dell, Sony, Google... |
1 (goodware/known) | 1 | N/A | 1 | 🟩 Clean | File comes from a trustworthy domain or has a trustworthy certificate. Examples: php.net, mit.edu, postgresql.org, redhat.de, opera.com, nasa.gov... |
1 (goodware/known) | 2 | N/A | 2 | 🟩 Clean | File comes from a usually trusted domain. Examples: softpedia.com, sourceforge.net, cnet.com... |
1 (goodware/known) | 3 | N/A | 3 | 🟩 Likely clean | File comes from another known site. |
1 (goodware/known) | 4 | N/A | 4 | 🟩 Possibly clean | Some valid but not very trusted certificates. |
1 (goodware/known) | 5 | N/A | 5 | 🟩 Low | Low trust source, no whitelisted certificates. |
2 (suspicious) | N/A | 0 | 5 | 🟨 Low | More information about sample required for final classification. |
2 (suspicious) | N/A | 1 | 6 | 🟨 Low | More information about sample required for final classification. |
2 (suspicious) | N/A | 2 | 7 | 🟨 Low | More information about sample required for final classification. |
2 (suspicious) | N/A | 3 | 8 | 🟨 Low | More information about sample required for final classification. |
2 (suspicious) | N/A | 4 | 9 | 🟨 Low | More information about sample required for final classification. |
2 (suspicious) | N/A | 5 | 10 | 🟨 Low | More information about sample required for final classification. |
3 (malicious) | N/A | 0 | N/A | 🟧 Low | Low trust source, no whitelisted certificates. |
3 (malicious) | N/A | 1 | 6 | 🟧 Low | Adware, potentially unwanted apps, tools for masking malware (packers). |
3 (malicious) | N/A | 2 | 7 | 🟥 Medium | Spyware. |
3 (malicious) | N/A | 3 | 8 | 🟥 Medium | Tools used to introduce malware or to use infected machines for denial-of-service attacks. |
3 (malicious) | N/A | 4 | 9 | 🟥 High | Malicious browser extensions, fake antivirus software, rootkits. |
3 (malicious) | N/A | 5 | 10 | 🟥 High | Virus, worm, trojan, keylogger, infostealer. Most dangerous threats. |
Files with no threats found don't get assigned a risk score and are therefore unclassified.
Values from 0 to 5 are reserved for samples classified as goodware/known, and take into account the source and structural metadata of the file, among other things. Since goodware samples do not have threat names associated with them, they receive a description based on their risk score.
Risk scores from 6 to 10 are reserved for suspicious and malicious samples, and express their severity. They are calculated by a proprietary ReversingLabs algorithm based on many factors, such as file origin, threat type, how frequently the threat occurs in the wild, YARA rules, and more. Lesser threats like adware get a risk score of 6, while ransomware and trojans always get a risk score of 10.
Malware type and risk score
In cases where multiple threats are detected and there are no other factors (such as user overrides) involved, the final classification is always the one that presents the biggest threat. If they belong to the same risk score group, malware types are prioritized in this order:
Risk score | Malware types |
---|---|
10 | EXPLOIT > BACKDOOR > RANSOMWARE > INFOSTEALER > KEYLOGGER > WORM > VIRUS > CERTIFICATE > PHISHING > FORMAT > TROJAN |
9 | ROOTKIT > COINMINER > ROGUE > BROWSER |
8 | DOWNLOADER > DROPPER > DIALER > NETWORK |
7 | SPYWARE > HYPERLINK > SPAM > MALWARE |
6 | ADWARE > HACKTOOL > PUA > PACKED |
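The selection rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the ReversingLabs implementation: the `(malware type, risk score)` tuple shape and the `final_threat` name are hypothetical, and user overrides are ignored. The priority lists are copied from the table.

```python
from typing import List, Tuple

# Priority order within each risk score group, copied from the table above.
TYPE_PRIORITY = {
    10: ["EXPLOIT", "BACKDOOR", "RANSOMWARE", "INFOSTEALER", "KEYLOGGER",
         "WORM", "VIRUS", "CERTIFICATE", "PHISHING", "FORMAT", "TROJAN"],
    9: ["ROOTKIT", "COINMINER", "ROGUE", "BROWSER"],
    8: ["DOWNLOADER", "DROPPER", "DIALER", "NETWORK"],
    7: ["SPYWARE", "HYPERLINK", "SPAM", "MALWARE"],
    6: ["ADWARE", "HACKTOOL", "PUA", "PACKED"],
}

# Hypothetical input shape: (malware type, risk score).
Detection = Tuple[str, int]


def final_threat(detections: List[Detection]) -> Detection:
    """Pick the highest risk score; break ties by type priority."""
    if not detections:
        raise ValueError("at least one detection is required")

    def sort_key(detection: Detection):
        malware_type, risk_score = detection
        order = TYPE_PRIORITY.get(risk_score, [])
        name = malware_type.upper()
        # Types missing from the table rank after all listed types.
        rank = order.index(name) if name in order else len(order)
        return (-risk_score, rank)

    return min(detections, key=sort_key)
```

For example, given a trojan and ransomware that both score 10 plus adware that scores 6, this sketch selects the ransomware detection, because RANSOMWARE precedes TROJAN in the risk score 10 row.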
Threat level and trust factor
The risk score table describes how the risk score relates to the threat level and trust factor used by the File Reputation API.
The main difference is that the risk score maps all classifications onto one numerical scale (0-10), while the File Reputation API uses two different scales for different classifications.
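As a rough illustration of how the two scales fold into one, the following sketch looks up a risk score from a classification plus its trust factor or threat level. It mirrors the rows of the risk score table as printed and nothing more. The function name, the parameter names, and the `"unclassified"` label are assumptions made for this example, not File Reputation API field names.

```python
from typing import Optional

# Rows of the risk score table, as printed in this document.
GOODWARE = {trust: trust for trust in range(6)}          # trust factor 0-5 -> risk 0-5
SUSPICIOUS = {level: level + 5 for level in range(6)}    # threat level 0-5 -> risk 5-10
MALICIOUS = {level: level + 5 for level in range(1, 6)}  # threat level 1-5 -> risk 6-10
MALICIOUS[0] = None                                      # the table lists N/A for this row


def risk_score(classification: str,
               trust_factor: Optional[int] = None,
               threat_level: Optional[int] = None) -> Optional[int]:
    """Return the table's risk score, or None where the table says N/A."""
    label = classification.lower()
    if label in ("known", "goodware"):
        return GOODWARE[trust_factor]
    if label == "suspicious":
        return SUSPICIOUS[threat_level]
    if label == "malicious":
        return MALICIOUS[threat_level]
    if label == "unclassified":  # hypothetical label for "no threats found"
        return None
    raise ValueError(f"unknown classification: {classification!r}")
```

Values outside the ranges listed in the table raise `KeyError` instead of being guessed.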
Nomenclature
The following classifications are equivalent:
File Reputation API | Spectra Analyze | Spectra Detect Worker |
---|---|---|
known | goodware | 1 (in the Worker report) |
In the Worker report, the risk score is called `rca_factor`.
Deciding sample priority
The risk score table highlights that a sample's risk score and its classification don't correlate perfectly. This means that a sample's risk score cannot be interpreted on its own, and that the primary criterion in deciding a sample's priority is its classification.
A suspicious classification can be the result of heuristics or a possible early detection. A suspicious file may be declared malicious or known at a later time if new information is received that changes its threat profile, or if the user manually modifies its status.
The system always considers a malicious sample with a risk score of 6 as a higher threat than a suspicious sample with a risk score of 10, meaning that samples classified as malicious always supersede suspicious samples, regardless of the calculated risk score.
The reason for this is certainty: a malicious sample is decidedly malicious, while suspicious samples need more data to confirm the detected threat. ReversingLabs continually works to reduce the number of suspicious samples.
While a suspicious sample with a risk score of 10 does deserve user attention and shouldn't be ignored, a malicious sample with a risk score of 10 should be triaged as soon as possible.
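The ordering described in this section amounts to a two-level sort: classification first, risk score second. A minimal sketch follows, assuming each sample is a dictionary with hypothetical `classification` and `risk_score` keys; these are not field names from any ReversingLabs report format.

```python
# Lower rank sorts first: malicious always outranks suspicious.
CLASSIFICATION_RANK = {"malicious": 0, "suspicious": 1}


def triage_order(samples):
    """Sort samples by classification first, then by descending risk score."""
    def sort_key(sample):
        # Any other classification sorts after malicious and suspicious.
        rank = CLASSIFICATION_RANK.get(sample["classification"], 2)
        # Samples without a risk score are treated as 0 for ordering only.
        score = sample.get("risk_score") or 0
        return (rank, -score)

    return sorted(samples, key=sort_key)


queue = triage_order([
    {"name": "a", "classification": "suspicious", "risk_score": 10},
    {"name": "b", "classification": "malicious", "risk_score": 6},
    {"name": "c", "classification": "malicious", "risk_score": 10},
])
print([sample["name"] for sample in queue])
```

Here the malicious sample with a risk score of 6 is placed ahead of the suspicious sample with a risk score of 10, matching the rule that classification takes precedence over the score.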
Malware naming standard
The ReversingLabs detection string consists of three main parts separated by dots. All three parts are mandatory.
platform-subplatform.type.familyname
- The first part of the string indicates the platform targeted by the malware. This string is always one of the strings listed in the Platform string table. If the platform is Archive, Audio, ByteCode, Document, Image or Script, then it has a subplatform string. Platform and subplatform strings are divided by a hyphen (`-`). The lists of available strings for Archive, Audio, ByteCode, Document, Image and Script subplatforms can be found in their respective tables.
- The second part of the detection string describes the malware type. Strings that appear as malware type descriptions are listed in the Type string table.
- The third and last part of the detection string represents the malware family name, i.e. the name given to a particular malware strain. Names "Agent", "Gen", "Heur", and other similar short generic names are not allowed. Names can't be shorter than three characters, and can't contain only numbers. Special characters (apart from `-`) must be avoided as well. The `-` character is only allowed in exploit (CVE/CAN) names (for example CVE-2012-0158).
Examples
If a trojan is designed for the Windows 32-bit platform and has the family name "Adams", its detection string will look like this:
Win32.Trojan.Adams
If some backdoor malware is a PHP script with the family name "Jones", the detection string will look like this:
Script-PHP.Backdoor.Jones
A potentially unwanted application designed for Android that has the family name "Smith" will have the following detection string:
Android.PUA.Smith
Some examples of detections with invalid family names are:
Win32.Dropper.Agent
ByteCode-MSIL.Keylogger.Heur
Script-JS.Hacktool.Gen
Android.Backdoor.12345
Document-PDF.Exploit.KO
Android.Spyware.1a
Android.Spyware.Not-a-CVE
Win32.Trojan.Blue_Banana
Win32.Ransomware.Hydra:Crypt
Win32.Ransomware.HDD#Cryptor
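The naming rules can also be checked mechanically. The sketch below is one possible validator, written under several assumptions: the list of generic names contains only the three the text names, exploit names are taken to follow a `CVE-YYYY-NNNN` or `CAN-YYYY-NNNN` pattern, and the platform and type strings are not checked against the Platform and Type string tables, which are not reproduced here. The function name `validate_detection` is hypothetical.

```python
import re

# Platforms that carry a subplatform, per the naming rules above.
PLATFORMS_WITH_SUBPLATFORM = {"Archive", "Audio", "ByteCode", "Document", "Image", "Script"}
# Assumption: only the three names given in the text; "similar" names are not enumerated.
GENERIC_NAMES = {"agent", "gen", "heur"}
# Assumption: exploit names look like CVE-2012-0158 or CAN-YYYY-NNNN.
EXPLOIT_NAME = re.compile(r"(CVE|CAN)-\d{4}-\d+")
PLAIN_NAME = re.compile(r"[A-Za-z0-9]+")


def validate_detection(detection: str) -> list:
    """Return a list of rule violations; an empty list means the string passes."""
    parts = detection.split(".")
    if len(parts) != 3 or not all(parts):
        return ["expected three non-empty parts separated by dots"]
    platform_part, _malware_type, family = parts
    problems = []

    # Platform and subplatform are divided by a hyphen.
    platform, _, subplatform = platform_part.partition("-")
    if platform in PLATFORMS_WITH_SUBPLATFORM and not subplatform:
        problems.append("platform requires a subplatform")

    if EXPLOIT_NAME.fullmatch(family):
        return problems  # hyphens are allowed in CVE/CAN names
    if family.lower() in GENERIC_NAMES:
        problems.append("generic family name")
    if len(family) < 3:
        problems.append("family name shorter than three characters")
    if family.isdigit():
        problems.append("family name contains only numbers")
    if not PLAIN_NAME.fullmatch(family):
        problems.append("family name contains special characters")
    return problems
```

Under these assumptions, the three valid examples above return an empty list, and each of the invalid examples returns at least one violation.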