Functionally similar files

The RHA (ReversingLabs Hashing Algorithm) identifies code similarity between unknown samples and previously seen malware samples. Files have the same RHA1 hash when they are functionally similar.

This API provides a list of SHA1 hashes of files that are functionally similar to the provided file (SHA1 hash) at the selected precision level (grouped by their RHA1 hash).

Precision level represents the degree to which a file is functionally similar to another file. Supported precision levels are 25% and 50% for PE files, and 25% for MachO and ELF executable files. A higher precision level matches fewer files, but the matched files are more functionally similar.
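
The supported combinations can be written down as a small lookup table. This is only an illustrative sketch: the identifiers are the rha1_type values this API accepts in its request path, while the names RHA1_TYPE_BY_FORMAT and rha1_type_for are hypothetical helpers, not part of the API.

```python
# Lookup from (file format, precision in percent) to the rha1_type identifier.
RHA1_TYPE_BY_FORMAT = {
    ("PE", 25): "pe01",
    ("PE", 50): "pe02",
    ("ELF", 25): "elf01",
    ("MachO", 25): "macho01",  # spelled as in the macho01 request example
}


def rha1_type_for(file_format: str, precision: int) -> str:
    """Return the rha1_type for a format and precision, or raise ValueError."""
    try:
        return RHA1_TYPE_BY_FORMAT[(file_format, precision)]
    except KeyError:
        raise ValueError(
            f"no RHA1 precision level {precision}% for {file_format} files"
        ) from None
```

For instance, rha1_type_for("PE", 50) gives "pe02", while asking for a 50% precision level on an ELF file raises ValueError because only 25% is supported for that format.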

The 'extended' option provides SHA1 hashes and related sample reputation information in the response, along with some other properties such as sample size, sample type and download availability.

By using this API, the user can easily get a list of all functionally similar files for the provided file (SHA1 hash) found in the Spectra Intelligence cloud, along with their reputation information.

The user can also filter results by sample classification, for example to return only malicious files that are functionally similar to the provided one.

General Info about Requests/Responses

  • This query returns a list of SHA1 hashes of all files that are functionally similar to the requested SHA1 at the requested precision level (grouped by the RHA1 hash)
  • All requests support the format query field, which supports two options: xml and json
  • The default response format is XML

Group By RHA1 Single Query

This query returns a list containing all SHA1 hashes of functionally similar samples for the requested SHA1 sample hash and RHA1 precision level.

If the extended option is selected, each SHA1 hash in the list will be expanded with additional metadata:

  • reputation information - classification, threat level, trust factor, malware family name, threat name, malware type, targeted platform and subplatform;
  • SHA1, MD5, and SHA256 hashes;
  • sample size, sample type, download availability, first and last seen dates (UTC);

Request

GET /api/group_by_rha1/v1/query/{rha1_type}/{hash_value}[/{next_page_sha1}]?[format=xml|json]&[limit={1-1000}]&[extended=true|false]&[classification=KNOWN|MALICIOUS|SUSPICIOUS|UNKNOWN]

Path parameters:

  • rha1_type
    • The RHA1 precision level, that is, the degree to which a file is functionally similar to another file. A higher precision level matches fewer files, but the matched files are more functionally similar. Supported values: pe01, elf01, machO01 (25% precision level) and pe02 (50% precision level)
    • Required
  • hash_value
    • Must be a valid SHA1 file hash
    • Required
  • next_page_sha1
    • An optional parameter used for pagination. It is the SHA1 hash of the first sample on the next page.
    • Optional

Query parameters:

  • format
    • Specifies the response format, with possible values being xml (default) and json
    • Optional
  • limit
    • The maximum number of sample SHA1 hashes to return. The value must be an integer in the range from 1 to 1000 (1000 is the default value)
    • Optional
  • extended
    • Selects the data set to return. Supported values are true for the extended data set and false for the non-extended data set (default)
    • Optional
  • classification
    • If this parameter is provided in the request, the query will return a filtered list of samples that match the requested classification. Supported values are: KNOWN, SUSPICIOUS, MALICIOUS, UNKNOWN
    • Optional
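
As a rough illustration of how the path and query parameters above fit together, the sketch below assembles the request path for a single query. The function name build_query_path and its keyword names are hypothetical; the host name and authentication are not covered in this section and are left out. The macho01 spelling follows the request examples on this page.

```python
import re
from urllib.parse import urlencode

RHA1_TYPES = {"pe01", "pe02", "elf01", "macho01"}
CLASSIFICATIONS = {"KNOWN", "MALICIOUS", "SUSPICIOUS", "UNKNOWN"}


def build_query_path(rha1_type, sha1, next_page_sha1=None, *, fmt=None,
                     limit=None, extended=None, classification=None):
    """Return the path plus query string for a Group By RHA1 single query."""
    if rha1_type not in RHA1_TYPES:
        raise ValueError(f"unsupported rha1_type: {rha1_type!r}")
    if not re.fullmatch(r"[0-9a-fA-F]{40}", sha1):
        raise ValueError("hash_value must be a 40-character hexadecimal SHA1")
    if limit is not None and not 1 <= limit <= 1000:
        raise ValueError("limit must be an integer from 1 to 1000")
    if classification is not None and classification not in CLASSIFICATIONS:
        raise ValueError(f"unsupported classification: {classification!r}")

    path = f"/api/group_by_rha1/v1/query/{rha1_type}/{sha1}"
    if next_page_sha1:
        # Optional trailing path segment used for pagination.
        path += f"/{next_page_sha1}"

    # Only parameters that were explicitly given are sent; the service
    # applies its documented defaults for the rest.
    params = {}
    if fmt is not None:
        params["format"] = fmt
    if limit is not None:
        params["limit"] = limit
    if extended is not None:
        params["extended"] = "true" if extended else "false"
    if classification is not None:
        params["classification"] = classification

    query = urlencode(params)
    return f"{path}?{query}" if query else path
```

Validating the arguments locally is a design choice of this sketch, so that malformed requests fail before they reach the network.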

Response

  • The query returns a list of at most limit samples. If the limit parameter is not provided in the request, up to 1000 records are returned by default.
  • Fields in the response depend on the selected data set. If the extended parameter is not set to true, only the list of sample SHA1 hashes will be returned.
  • If the requested hash doesn't exist in the database records, the server responds with HTTP status code 404 and the message "Requested data was not found".
{
  "rl": {
    "group_by_rha1": {
      "query_sha1": "string",
      "rha1_type": "string",
      "sha1_list": []
    }
  }
}
  • query_sha1
    • Requested SHA1
  • rha1_type
    • Requested RHA1 type
  • next_page_sha1
    • First SHA1 on the next page. This hash value can be used as the next_page_sha1 path parameter in the next request to retrieve the next page of results
  • sha1_list
    • A list of hashes of functionally similar files

rl.group_by_rha1.sha1_list[]

{
  "sha1": "string",
  "sha256": "string",
  "md5": "string",
  "classification": "string",
  "sample_type": "string",
  "sample_size": 0,
  "sample_available": 0,
  "trust_factor": 0,
  "threat_level": 0,
  "first_seen": 0,
  "last_seen": 0
}
  • sha1
    • SHA1
  • sha256
    • SHA256
  • md5
    • MD5
  • first_seen (optional)
    • time when the sample was first seen in the ReversingLabs system (UTC)
  • last_seen (optional)
    • time when the sample was last seen in the ReversingLabs system (UTC)
  • sample_type
    • sample type
  • sample_size
    • sample size
  • sample_available
    • sample download availability status
  • classification
    • sample's classification
  • platform (optional)
    • platform targeted by the malware
  • subplatform (optional)
    • subplatform targeted by the malware
  • threat_name (optional)
    • threat name for malicious samples
  • malware_type (optional)
    • malware type for malicious samples
  • malware_family (optional)
    • malware family for malicious samples
  • threat_level (optional)
    • threat level of the sample
  • trust_factor (optional)
    • trust factor of the sample
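
Since several of the fields above are optional, client code reading an extended response should not assume they are present. The following sketch picks out malicious entries and reads the optional fields defensively. The helper name malicious_summary is hypothetical, and comparing the classification case-insensitively is an assumption, because this section does not state the letter case used in responses.

```python
def malicious_summary(sha1_list):
    """Return a compact record for each malicious entry of an extended sha1_list."""
    rows = []
    for entry in sha1_list:
        # Assumption: letter case of the classification value may vary.
        if str(entry.get("classification", "")).upper() != "MALICIOUS":
            continue
        rows.append({
            "sha1": entry["sha1"],
            # Optional fields: .get() returns None when a field is absent.
            "threat_name": entry.get("threat_name"),
            "malware_family": entry.get("malware_family"),
            "malware_type": entry.get("malware_type"),
            "threat_level": entry.get("threat_level"),
        })
    return rows
```

The same filtering can be done on the server side with the classification query parameter; a client-side pass like this one is mainly useful when a single response has to be split by classification.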

Examples

Format query field

These examples request different response formats:

/api/group_by_rha1/v1/query/pe01/1b85cbfa30e181c505ba15211db33247c1f8a63f?format=json
/api/group_by_rha1/v1/query/pe01/1b85cbfa30e181c505ba15211db33247c1f8a63f?format=xml

Extended optional parameter used

These examples use different values of the extended parameter:

/api/group_by_rha1/v1/query/pe01/1b85cbfa30e181c505ba15211db33247c1f8a63f?extended=true
/api/group_by_rha1/v1/query/pe01/1b85cbfa30e181c505ba15211db33247c1f8a63f?extended=false

Limit query field

These examples use different query response limits:

/api/group_by_rha1/v1/query/pe01/1b85cbfa30e181c505ba15211db33247c1f8a63f?limit=1
/api/group_by_rha1/v1/query/pe01/1b85cbfa30e181c505ba15211db33247c1f8a63f?limit=100

Different precision levels (RHA1 type)

These examples request different rha1_type precision levels:

/api/group_by_rha1/v1/query/pe01/1b85cbfa30e181c505ba15211db33247c1f8a63f
/api/group_by_rha1/v1/query/pe02/1b85cbfa30e181c505ba15211db33247c1f8a63f
/api/group_by_rha1/v1/query/elf01/8983043176164b960e10b34307f52e88db894b71
/api/group_by_rha1/v1/query/macho01/f1c2712a3881ca795b5eadf65077788198522362

next_page_sha1 path parameter

These examples use the next_page_sha1 path parameter to request the next page of results:

/api/group_by_rha1/v1/query/pe01/1b85cbfa30e181c505ba15211db33247c1f8a63f/cd3710af6638b99666a19b4f098b8788723397ab
/api/group_by_rha1/v1/query/pe02/1b85cbfa30e181c505ba15211db33247c1f8a63f/cd3710af6638b99666a19b4f098b8788723397ab