YARA retro hunting

The ReversingLabs YARA Retro Hunting service enables users to run their own YARA rules and retroactively match them against files from the ReversingLabs sample set. The YARA Retro Hunting sample set is based on the last 90 days of stored samples, excluding samples larger than 200 MB and archives. Samples extracted from archives are not excluded.

Please note that this service cannot be used without access to the YARA Hunting service (TCA-0303). Users need to upload their YARA rules to the service using the API described in the YARA Hunting service (TCA-0303) documentation.

In this document, the term ruleset refers to a collection of one or more YARA rules defined in a single YARA source text. One ruleset can contain up to 300 rules. Rules inside a single ruleset may be interdependent; that is, one rule can reference another (see the Referencing other rules section in the official YARA documentation).

For users who need to create fine-grained, complex rulesets, YARA provides modules as a way of extending the default set of features. The ReversingLabs YARA Retro Hunting Service supports the following YARA modules: PE, ELF, Math, Hash, Time, and Dotnet.

The retro-admin API allows users to manage their own YARA retro hunts. A RESTful interface provides operations to enable, start, check status, and cancel a particular retro hunt. Retro hunts are referenced by a ruleset name.

The retro-matches API is used to retrieve retro hunt matches generated for a specific user.

This API is rate limited to 1 request per second.
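
A client can stay under the stated limit of 1 request per second by spacing its calls. The following is a minimal client-side sketch, not part of the service; the `Throttle` class and its parameters are hypothetical names introduced here for illustration.

```python
import time


class Throttle:
    """Spaces successive calls at least `min_interval` seconds apart."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock    # injectable so the behavior can be tested
        self._sleep = sleep
        self._last = None      # time of the previous call, if any

    def wait(self):
        """Block until at least `min_interval` has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `wait()` immediately before each request keeps consecutive requests at least one second apart.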

This service has specific explanations for certain response status codes. See the individual endpoint sections for more information.

YARA Retro Administration API

Start Retro Hunt

This query starts the retro hunt for the specified ruleset.

POST /api/yara/admin/v1/ruleset/start-retro-hunt

Request Format

    Content-Type: application/json

{ "ruleset_name" : <ruleset_name> }
  • ruleset_name
    • String containing the name of a YARA ruleset previously uploaded by the user. The name should be between 3 and 48 characters long, and conform to the following regular expression: ^[a-z,A-Z,0-9,_-]*$
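
The naming constraints above can be checked on the client before sending a request. This is a sketch that uses the regular expression exactly as documented; `is_valid_ruleset_name` is a hypothetical helper name. Note that, read literally, the character class also accepts commas, which may or may not be what the service enforces.

```python
import re

# Pattern copied verbatim from the documentation. The commas sit inside the
# character class, so a literal "," is accepted by this expression as written.
RULESET_NAME_RE = re.compile(r"^[a-z,A-Z,0-9,_-]*$")


def is_valid_ruleset_name(name):
    """Return True if `name` is 3 to 48 characters long and matches the pattern."""
    return 3 <= len(name) <= 48 and RULESET_NAME_RE.fullmatch(name) is not None
```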

Response Format

    Content-Type: application/json

{ "ruleset_name" : <ruleset_name>,
"ruleset_sha1": <ruleset_sha1> }
  • ruleset_name - string specified by the user in the request
  • ruleset_sha1 - hex-encoded string representing the SHA1 digest of the ruleset text

Status Codes

Code  Name         Description
200   OK           The ruleset was successfully queued for retro hunt.
400   Bad Request  The received request was malformed and cannot be processed.
403   Forbidden    The server understood the request but is refusing to fulfill it, or the quota has been reached.
404   Not Found    The ruleset with ruleset_name does not exist for the authenticated user.
409   Conflict     Retro hunt for the ruleset with ruleset_name has already been started.
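
As an illustration of the request described above, the sketch below builds the POST request with Python's standard library without sending it. The host name in `BASE_URL` is a placeholder, `build_start_request` is a hypothetical helper, and authentication is omitted because this section does not describe it.

```python
import json
import urllib.request

# Placeholder host: an assumption, not the real service address.
BASE_URL = "https://api.example.invalid"


def build_start_request(ruleset_name, base_url=BASE_URL):
    """Build (but do not send) the start-retro-hunt POST request."""
    body = json.dumps({"ruleset_name": ruleset_name}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/api/yara/admin/v1/ruleset/start-retro-hunt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` once credentials have been added in whatever way the account requires. A 409 response then means the hunt for that ruleset was already started.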

Check Retro Hunt Status

This query checks the retro hunt status for the specified ruleset.

GET /api/yara/admin/v1/ruleset/{ruleset_name}/status-retro-hunt

Request Format

  • ruleset_name
    • String containing the name of a YARA ruleset previously uploaded by the user. The name should be between 3 and 48 characters long, and conform to the following regular expression: ^[a-z,A-Z,0-9,_-]*$

Response Format

    Content-Type: application/json

{ "ruleset_name" : <ruleset_name>,
"retro_status" : <retro_status>,
"reason" : <reason>,
"progress" : <progress>,
"start_time" : <start_time>,
"finish_time" : <finish_time>,
"estimated_finish_time" : <estimated_finish_time> }
  • ruleset_name
    • String specified by the user in the request
  • retro_status
    • Current ruleset retro hunt status. The value can be one of the following: PENDING, IN_VALIDATION, VALIDATION_FAILURE, VALIDATION_SUCCESS, RUNNING, FINISHED, CANCELLED
  • reason
    • Textual description of the reason for the current status (if applicable)
  • progress
    • Estimated retro hunt progress, expressed as a floating-point number in the interval [0, 1] (0 means that the hunt has not yet started, 1 means that the hunt is finished)
  • start_time
    • Timestamp in the YYYY-MM-DDThh:mm:ss format indicating when the retro hunt was started. Returns null if the retro hunt has not been started yet
  • finish_time
    • Timestamp in the YYYY-MM-DDThh:mm:ss format indicating when the retro hunt was finished. Returns null if the retro hunt is still running
  • estimated_finish_time
    • Timestamp in the YYYY-MM-DDThh:mm:ss format indicating when the retro hunt is estimated to finish. Returns null if it has already finished

Status Codes

Code  Name         Description
200   OK           The ruleset retro hunt status was successfully retrieved.
400   Bad Request  The received request was malformed and cannot be processed.
404   Not Found    The ruleset with ruleset_name does not exist for the authenticated user.
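
A client polling this endpoint typically needs to decide whether the hunt is over and to parse the timestamps. The sketch below works on an already-decoded response body. Treating VALIDATION_FAILURE, FINISHED, and CANCELLED as final states is an assumption made here, not something the documentation states, and `summarize_status` is a hypothetical helper.

```python
from datetime import datetime

# Assumption: these statuses mean no further progress will be reported.
TERMINAL_STATUSES = {"VALIDATION_FAILURE", "FINISHED", "CANCELLED"}


def parse_time(value):
    """Parse a YYYY-MM-DDThh:mm:ss timestamp; null (None) stays None."""
    if value is None:
        return None
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")


def summarize_status(payload):
    """Condense a decoded status response into a small dictionary."""
    return {
        "status": payload["retro_status"],
        "done": payload["retro_status"] in TERMINAL_STATUSES,
        "percent": round(payload["progress"] * 100, 1),
        "start_time": parse_time(payload.get("start_time")),
        "finish_time": parse_time(payload.get("finish_time")),
        "estimated_finish_time": parse_time(payload.get("estimated_finish_time")),
    }
```

A polling loop would call the status endpoint, pass the decoded body to `summarize_status`, and stop once `done` is true.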

Cancel Retro Hunt

This query cancels the retro hunt for the specified ruleset.

POST /api/yara/admin/v1/ruleset/cancel-retro-hunt

Request Format

    Content-Type: application/json

{ "ruleset_name" : <ruleset_name> }
  • ruleset_name
    • String containing the name of a YARA ruleset previously uploaded by the user. The name should be between 3 and 48 characters long, and conform to the following regular expression: ^[a-z,A-Z,0-9,_-]*$

Response Format

    Content-Type: application/json

{ "ruleset_name" : <ruleset_name>,
"ruleset_sha1" : <ruleset_sha1> }
  • ruleset_name - string specified by the user in the request
  • ruleset_sha1 - hex-encoded string representing the SHA1 digest of the ruleset text

Status Codes

Code  Name         Description
200   OK           Retro hunt for the specified ruleset was successfully cancelled.
400   Bad Request  The received request was malformed and cannot be processed.
403   Forbidden    The server understood the request but is refusing to fulfill it, or the quota has been reached.
404   Not Found    The ruleset with ruleset_name does not exist for the authenticated user.

YARA Retro Matches Feed API

Fetch Retro Matches

This query returns a recordset of YARA ruleset matches in the specified time range for a particular user.

GET /api/feed/yara/retro/v1/query/{time_format}/{time_value}[?format=xml|json] 

Request Format

  • time_format
    • Format in which the time value will be specified. Supported values are: timestamp - number of seconds since 1970-01-01 00:00:00; utc - UTC date in the YYYY-MM-DDThh:mm:ss format
  • time_value
    • The value defining the start of the requested time range
  • format
    • Optional parameter defining the format in which the resulting data will be returned. Supported values are: xml (default), json

The time range is defined as the time period from the time specified by the time_value parameter up to the time when the request is being made. The earliest supported time value is May 20 2016 00:00h UTC (timestamp 1463702400). The latest supported time value is 10 seconds before the current time. Specifying a time value outside this period will yield a "400 Bad Request" status code.
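
The bounds above can be checked locally before a query is sent, which avoids a "400 Bad Request" round trip. This is a minimal sketch; the function names are hypothetical, and the constants are taken from the paragraph above.

```python
import calendar
import time

EARLIEST_TIMESTAMP = 1463702400   # 2016-05-20 00:00:00 UTC, per the text above
LATEST_OFFSET_SECONDS = 10        # latest value is 10 seconds before "now"


def utc_to_timestamp(value):
    """Convert a YYYY-MM-DDThh:mm:ss UTC string to seconds since 1970-01-01."""
    return calendar.timegm(time.strptime(value, "%Y-%m-%dT%H:%M:%S"))


def is_supported_time_value(timestamp, now=None):
    """Return True if `timestamp` lies within the documented supported period."""
    now = time.time() if now is None else now
    return EARLIEST_TIMESTAMP <= timestamp <= now - LATEST_OFFSET_SECONDS
```

For example, `utc_to_timestamp("2016-05-20T00:00:00")` returns 1463702400, the earliest supported value.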

The feed will return at most 1000 records, starting from the earliest one. However, if a single second contains more than 1000 matches, all of them will be returned in a single query.

When a ruleset reaches 10 000 matches, it will be capped and will no longer store new matches. To continue collecting new matches, the ruleset has to be created again under a new name.

The response data provides the latest timestamp until which events are included. To fetch the next recordset, the timestamp from the response field last_timestamp should be increased by 1 and used in the next query as the time_value parameter.

The maximum time span for a single request is limited to 24 hours.

All time values are in UTC, independent of the input format.
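
The paging rule described above (reuse last_timestamp plus 1 as the next time_value) can be sketched as a generator. The `fetch` argument is a hypothetical callable supplied by the caller that performs the HTTP request for a given timestamp and returns the decoded JSON document. Stopping on the first empty page and treating last_timestamp as a number of seconds are assumptions of this sketch.

```python
def build_query_path(time_value, fmt="json"):
    """Build the feed path for a timestamp-based query."""
    return "/api/feed/yara/retro/v1/query/timestamp/{}?format={}".format(
        int(time_value), fmt
    )


def iter_matches(fetch, start_timestamp):
    """Yield match entries page by page, advancing by last_timestamp + 1."""
    cursor = int(start_timestamp)
    while True:
        feed = fetch(cursor)["rl"]["feed"]
        entries = feed["entries"]
        if not entries:
            return  # assumption: an empty page means nothing more to read
        for entry in entries:
            yield entry
        # Next query starts one second after the last event already returned.
        cursor = int(feed["last_timestamp"]) + 1
```

Because of the rate limit, a real `fetch` implementation should wait at least one second between calls.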

Example Requests

Get all YARA matches from 2016-05-20 00:00:00 UTC, specified as a timestamp:

/api/feed/yara/retro/v1/query/timestamp/1463702400

Get all YARA matches from 2016-05-20 00:00:00 UTC, specified as a date string:

/api/feed/yara/retro/v1/query/utc/2016-05-20T00:00:00

Fetch all YARA matches from 2016-05-20 00:00:00 in the JSON format:

/api/feed/yara/retro/v1/query/timestamp/1463702400?format=json

Fetch all YARA matches from 2016-05-20 00:00:00 in the XML format:

/api/feed/yara/retro/v1/query/timestamp/1463702400?format=xml

Response Format

A query response will contain zero or more match entries that are found within the specified time period.

A single entry represents a match produced by a specific YARA ruleset against a particular sample file. The entry contains information about the sample file (SHA1, file size) as well as information on one or more rules from the ruleset that matched the sample file (rule name, tags, meta fields and their values).

Additional data propagated from the YARA engine is a list of matching strings, contained in matched_data objects. Each matched_data object is a three-element tuple of the form (int match_offset, base64_string string_identifier, base64_string matched_string).
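
Since two of the three matched_data elements are base64-encoded, a client has to decode them before use. The sketch below assumes that each matched_data object arrives as a three-element JSON array in the order given above; `decode_matched_data` is a hypothetical helper.

```python
import base64


def decode_matched_data(item):
    """Decode one (match_offset, string_identifier, matched_string) triple."""
    offset, identifier_b64, matched_b64 = item
    # The identifier is decoded to text; undecodable bytes are replaced.
    identifier = base64.b64decode(identifier_b64).decode("utf-8", "replace")
    # The matched string may be arbitrary binary data, so it is kept as bytes.
    matched = base64.b64decode(matched_b64)
    return int(offset), identifier, matched
```

Keeping the matched string as bytes avoids corrupting binary patterns such as file headers.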

The entry also contains information about sample download availability as a boolean value in the sample_available field.

If a sample was matched by several rulesets, each will produce its own entry.

JSON Response Format

{ "rl":
  { "feed":
    { "name": "YARA Match Continuous Feed",
      "time_range": { "from": "YYYY-MM-DDTHH:MM:SS",
                      "to": "YYYY-MM-DDTHH:MM:SS" },
      "last_timestamp": timestamp_value,
      "entries": [
        { "timestamp": timestamp_value,
          "sha1": sample_sha1_value,
          "file_type": file_type_value,
          "file_size": file_size_in_bytes,
          "ruleset_sha1": ruleset_sha1_value,
          "ruleset_name": ruleset_name,
          "rule": [
            { "identifier": rule_name,
              "meta": [[name_0, value_0], ..., [name_n, value_n]],
              "tag": [tag_0, ..., tag_m]
            },
            ...
            { "identifier": rule_name,
              "meta": [[name_0, value_0], ..., [name_n, value_n]],
              "tag": [tag_0, ..., tag_m]
            }
          ],
          "sample_available": boolean
        },
        ...
      ]
    }
  }
}
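
To show how the structure above might be consumed, the following sketch walks a decoded JSON response and yields one summary per entry. The field names follow the template above; `summarize_entries` and the shape of its output are hypothetical choices made for illustration.

```python
def summarize_entries(document):
    """Yield (sha1, ruleset_name, rules, sample_available) for each entry.

    `rules` maps each matched rule name to a dictionary built from its
    [name, value] meta pairs.
    """
    feed = document["rl"]["feed"]
    for entry in feed["entries"]:
        rules = {
            rule["identifier"]: dict(rule.get("meta", []))
            for rule in entry.get("rule", [])
        }
        yield (
            entry["sha1"],
            entry["ruleset_name"],
            rules,
            bool(entry.get("sample_available", False)),
        )
```

If a meta name repeats within one rule, the dictionary keeps only the last value, so callers that need every pair should read the meta list directly.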