YARA retro hunting
The ReversingLabs YARA Retro Hunting service enables users to run their own YARA rules and retroactively match them against files from the ReversingLabs sample set. The YARA Retro Hunting sample set is based on the last 90 days of stored samples, excluding samples larger than 200 MB and archives. Samples extracted from archives are not excluded.
Please note that this service cannot be used without access to the YARA Hunting service (TCA-0303). Users need to upload their YARA rules to the service using the API described in the YARA Hunting service (TCA-0303) documentation.
In this document, the term ruleset refers to a collection of one or more YARA rules defined in a single YARA source text. One ruleset can contain up to 300 rules. Rules within a single ruleset can be interdependent; that is, one rule can reference another (see the Referencing other rules section in the official YARA documentation).
For users who need to create fine-grained, complex rulesets, YARA provides modules as a way of extending the default set of features. The ReversingLabs YARA Retro Hunting Service supports the following YARA modules: PE, ELF, Math, Hash, Time, and Dotnet.
The retro-admin API allows users to manage their own YARA retro hunts. A RESTful interface provides operations to enable, start, check status, and cancel a particular retro hunt. Retro hunts are referenced by a ruleset name.
The retro-matches API is used to retrieve retro hunt matches generated for a specific user.
This API is rate limited to 1 request per second.
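One way to stay within the limit of 1 request per second is to throttle calls on the client side. The following is a minimal sketch, not part of the service itself: the `Throttle` class and its parameters are hypothetical names chosen for illustration, and the clock and sleep functions are injectable only so the behavior can be tested without waiting.

```python
import time


class Throttle:
    """Block so that consecutive calls to wait() are at least min_interval seconds apart."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call; None before the first call

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                # Too soon after the previous request: sleep for the difference.
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `wait()` immediately before each request keeps the client at or below one request per second, assuming all requests go through the same `Throttle` instance.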
This service has specific explanations for certain response status codes. See the individual endpoint sections for more information.
YARA Retro Administration API
Start Retro Hunt
This query starts the retro hunt for the specified ruleset.
POST /api/yara/admin/v1/ruleset/start-retro-hunt
Request Format
Content-Type: application/json
{ "ruleset_name" : <ruleset_name> }
ruleset_name
- String containing the name of a YARA ruleset previously uploaded by the user. The name should be between 3 and 48 characters long, and conform to the following regular expression: ^[a-z,A-Z,0-9,_-]*$
Response Format
Content-Type: application/json
{ "ruleset_name" : <ruleset_name>,
"ruleset_sha1": <ruleset_sha1> }
- ruleset_name - string specified by the user in the request
- ruleset_sha1 - hex-encoded string representing the SHA1 digest of the ruleset text
Status Codes
Code | Name | Description |
---|---|---|
200 | OK | The ruleset was successfully queued for retro hunt. |
400 | Bad Request | The received request was malformed and cannot be processed. |
403 | Forbidden | The server understood the request but is refusing to fulfill it, or the quota has been reached. |
404 | Not Found | The ruleset with ruleset_name does not exist for the authenticated user. |
409 | Conflict | Retro hunt for the ruleset with ruleset_name has already been started. |
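The request described above could be assembled as in the following sketch. It only builds the request object and does not send it. The host in `BASE_URL` is a placeholder, the function names are hypothetical, and authentication is omitted because this section does not describe it. The name pattern is copied verbatim from the parameter description; note that, as written, the character class also admits commas.

```python
import json
import re
import urllib.request

# Placeholder host (assumption); substitute the base URL of your own account.
BASE_URL = "https://api.example.com"

# Pattern copied verbatim from the ruleset_name description.
RULESET_NAME_RE = re.compile(r"^[a-z,A-Z,0-9,_-]*$")


def validate_ruleset_name(name):
    """Return True if name is 3 to 48 characters long and matches the documented pattern."""
    return 3 <= len(name) <= 48 and RULESET_NAME_RE.fullmatch(name) is not None


def build_start_request(ruleset_name, base_url=BASE_URL):
    """Build (but do not send) the POST request that starts a retro hunt."""
    if not validate_ruleset_name(ruleset_name):
        raise ValueError("invalid ruleset name: %r" % ruleset_name)
    body = json.dumps({"ruleset_name": ruleset_name}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/api/yara/admin/v1/ruleset/start-retro-hunt",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Validating the name locally avoids spending a rate-limited request on a call that would be rejected with 400 Bad Request.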
Check Retro Hunt Status
This query checks the retro hunt status for the specified ruleset.
GET /api/yara/admin/v1/ruleset/{ruleset_name}/status-retro-hunt
Request Format
ruleset_name
- String containing the name of a YARA ruleset previously uploaded by the user. The name should be between 3 and 48 characters long, and conform to the following regular expression: ^[a-z,A-Z,0-9,_-]*$
Response Format
Content-Type: application/json
{ "ruleset_name" : <ruleset_name>,
"retro_status" : <retro_status>,
"reason" : <reason>,
"progress" : <progress>,
"start_time" : <start_time>,
"finish_time" : <finish_time>,
"estimated_finish_time" : <estimated_finish_time> }
ruleset_name
- String specified by the user in the request
retro_status
- Current ruleset retro hunt status. The value can be one of the following: PENDING, IN_VALIDATION, VALIDATION_FAILURE, VALIDATION_SUCCESS, RUNNING, FINISHED, CANCELLED
reason
- Textual description of the reason for the current status (if applicable)
progress
- Estimated retro hunt progress, expressed as a floating-point number in the interval [0, 1] (0 meaning that the hunt has not yet started, 1 meaning that the hunt is finished)
start_time
- Timestamp in the YYYY-MM-DDThh:mm:ss format indicating when the retro hunt was started. Returns null if the retro hunt has not been started yet
finish_time
- Timestamp in the YYYY-MM-DDThh:mm:ss format indicating when the retro hunt was finished. Returns null if the retro hunt is still running
estimated_finish_time
- Timestamp in the YYYY-MM-DDThh:mm:ss format indicating when the retro hunt is estimated to finish. Returns null if it has already finished
Status Codes
Code | Name | Description |
---|---|---|
200 | OK | The ruleset retro hunt status was successfully retrieved. |
400 | Bad Request | The received request was malformed and cannot be processed. |
404 | Not Found | The ruleset with ruleset_name does not exist for the authenticated user. |
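A client polling this endpoint typically needs to decide whether to keep polling. The sketch below interprets a status response body under stated assumptions: the helper names are hypothetical, and treating VALIDATION_FAILURE, FINISHED, and CANCELLED as the states after which polling can stop is an inference from the status names, not something this section states.

```python
import json
from datetime import datetime

# Status values listed in the retro_status description.
KNOWN_STATUSES = {
    "PENDING", "IN_VALIDATION", "VALIDATION_FAILURE", "VALIDATION_SUCCESS",
    "RUNNING", "FINISHED", "CANCELLED",
}

# Assumption: no further progress is reported once one of these is reached.
TERMINAL_STATUSES = {"VALIDATION_FAILURE", "FINISHED", "CANCELLED"}


def parse_time(value):
    """Parse a YYYY-MM-DDThh:mm:ss timestamp; pass null (None) through unchanged."""
    if value is None:
        return None
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")


def summarize_status(response_text):
    """Turn a status response body into a small dict that is easy to act on."""
    data = json.loads(response_text)
    status = data["retro_status"]
    if status not in KNOWN_STATUSES:
        raise ValueError("unexpected retro_status: %r" % status)
    progress = data.get("progress")
    return {
        "ruleset_name": data["ruleset_name"],
        "status": status,
        "done": status in TERMINAL_STATUSES,
        "percent": None if progress is None else round(float(progress) * 100, 1),
        "start_time": parse_time(data.get("start_time")),
        "finish_time": parse_time(data.get("finish_time")),
        "estimated_finish_time": parse_time(data.get("estimated_finish_time")),
    }
```

A polling loop would call `summarize_status` on each response and stop once `done` is true, while keeping to the rate limit of 1 request per second.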
Cancel Retro Hunt
This query cancels the retro hunt for the specified ruleset.
POST /api/yara/admin/v1/ruleset/cancel-retro-hunt
Request Format
Content-Type: application/json
{ "ruleset_name" : <ruleset_name> }
ruleset_name
- String containing the name of a YARA ruleset previously uploaded by the user. The name should be between 3 and 48 characters long, and conform to the following regular expression: ^[a-z,A-Z,0-9,_-]*$
Response Format
Content-Type: application/json
{ "ruleset_name" : <ruleset_name>,
"ruleset_sha1" : <ruleset_sha1> }
- ruleset_name - string specified by the user in the request
- ruleset_sha1 - hex-encoded string representing the SHA1 digest of the ruleset text
Status Codes
Code | Name | Description |
---|---|---|
200 | OK | Retro hunt for the specified ruleset was successfully cancelled. |
400 | Bad Request | The received request was malformed and cannot be processed. |
403 | Forbidden | The server understood the request but is refusing to fulfill it, or the quota has been reached. |
404 | Not Found | The ruleset with ruleset_name does not exist for the authenticated user. |
YARA Retro Matches Feed API
Fetch Retro Matches
This query returns a recordset of YARA ruleset matches in the specified time range for a particular user.
GET /api/feed/yara/retro/v1/query/{time_format}/{time_value}[?format=xml|json]
Request Format
time_format
- Format in which the time value will be specified. Supported values are: timestamp - number of seconds since 1970-01-01 00:00:00; utc - UTC date in the YYYY-MM-DDThh:mm:ss format
time_value
- The value defining the start of the requested time range
format
- Optional parameter defining the format in which the resulting data will be returned. Supported values are: xml (default), json
The time range is defined as the time period from the time specified by the time_value parameter up to the time when the request is being made. The earliest supported time value is May 20 2016 00:00h UTC (timestamp 1463702400). The latest supported time value is 10 seconds before the current time. Specifying a time value outside this period will yield a "400 Bad Request" status code.
The feed will return at most 1000 records, starting from the earliest one. However, if a single second contains more than 1000 matches, all of them will be returned in a single query.
When a ruleset reaches 10 000 matches, it will be capped and will no longer store new matches. To continue collecting new matches, the ruleset has to be created again under a new name.
The response data provides the latest timestamp until which events are included. To fetch the next recordset, the timestamp from the response field last_timestamp should be increased by 1 and used in the next query as the time_value parameter.
The maximum time span for a single request is limited to 24 hours.
All time values are in UTC, independent of the input format.
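The paging rule described above (take last_timestamp from the response, add 1, and use the result as the next time_value) can be sketched as follows. The function names are hypothetical, only the timestamp time format is handled, and `fetch` is a caller-supplied callable that takes a request path and returns the decoded JSON response as a dict. Stopping when a page has no entries is an assumption; this section does not specify what an exhausted feed returns.

```python
import time

EARLIEST_TIMESTAMP = 1463702400  # 2016-05-20 00:00:00 UTC, the earliest supported value
LATEST_OFFSET_SECONDS = 10       # the latest supported value is 10 seconds before "now"


def feed_path(time_value, fmt="json"):
    """Build the request path for the timestamp time format."""
    return "/api/feed/yara/retro/v1/query/timestamp/%d?format=%s" % (time_value, fmt)


def check_time_value(time_value, now=None):
    """Return True if time_value lies inside the supported period."""
    now = int(time.time()) if now is None else now
    return EARLIEST_TIMESTAMP <= time_value <= now - LATEST_OFFSET_SECONDS


def next_time_value(last_timestamp):
    """Start of the next recordset: one second after the last one returned."""
    return last_timestamp + 1


def iter_entries(fetch, start, now=None):
    """Yield match entries page by page, starting at the timestamp `start`."""
    time_value = start
    while check_time_value(time_value, now):
        feed = fetch(feed_path(time_value))["rl"]["feed"]
        entries = feed.get("entries", [])
        if not entries:
            return  # assumption: an empty page means there is nothing more to fetch
        for entry in entries:
            yield entry
        time_value = next_time_value(feed["last_timestamp"])
```

Checking the time value before each request avoids a call that would return 400 Bad Request. The `fetch` callable is also the natural place to enforce the rate limit of 1 request per second.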
Example Requests
Get all YARA matches from 2016-05-20 00:00:00 as timestamp:
/api/feed/yara/retro/v1/query/timestamp/1463702400
Get all YARA matches from 2016-05-20 00:00:00 UTC as date string:
/api/feed/yara/retro/v1/query/utc/2016-05-20T00:00:00
Fetch all YARA matches from 2016-05-20 00:00:00 in the JSON format:
/api/feed/yara/retro/v1/query/timestamp/1463702400?format=json
Fetch all YARA matches from 2016-05-20 00:00:00 in the XML format:
/api/feed/yara/retro/v1/query/timestamp/1463702400?format=xml
Response Format
A query response will contain zero or more match entries that are found within the specified time period.
A single entry represents a match produced by a specific YARA ruleset against a particular sample file. The entry contains information about the sample file (SHA1, file size) as well as information on one or more rules from the ruleset that matched the sample file (rule name, tags, meta fields and their values).
Additional data propagated from the YARA engine is a list of matching strings contained in matched_data objects. The matched_data object is a three-element tuple of the form (int match_offset, base64_string string_identifier, base64_string matched_string).
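Because two of the three matched_data elements are base64-encoded, they need decoding before use. The following is a minimal sketch assuming each matched_data item arrives as a three-element sequence in the order given above; the function name is hypothetical. The matched string is kept as bytes, since matched content may be binary.

```python
import base64


def decode_matched_data(item):
    """Decode one (match_offset, string_identifier, matched_string) item."""
    offset, identifier_b64, matched_b64 = item
    # Identifiers are decoded as text; undecodable bytes are replaced, not raised.
    identifier = base64.b64decode(identifier_b64).decode("utf-8", errors="replace")
    # The matched content stays as raw bytes.
    matched = base64.b64decode(matched_b64)
    return int(offset), identifier, matched
```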
The entry also contains information about sample download availability as a boolean value in the sample_available field.
If a sample was matched by several rulesets, each will produce its own entry.
JSON Response Format
{ "rl":
{ "feed":
{ "name": "YARA Match Continuous Feed",
"time_range": { "from": "YYYY-MM-DDTHH:MM:SS",
"to": "YYYY-MM-DDTHH:MM:SS" },
"last_timestamp": timestamp_value,
"entries": [
{ "timestamp": timestamp_value,
"sha1": sample_sha1_value,
"file_type": file_type_value,
"file_size": file_size_in_bytes,
"ruleset_sha1": ruleset_sha1_value,
"ruleset_name": ruleset_name,
"rule": [
{ "identifier": rule_name,
"meta": [[name_0, value_0], ..., [name_n, value_n]],
"tag": [tag_0, ..., tag_m]
},
...
{ "identifier": rule_name,
"meta": [[name_0, value_0], ..., [name_n, value_n]],
"tag": [tag_0, ..., tag_m]
}
],
"sample_available": boolean
},
...
]
}
}
}
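A JSON response with the structure above could be flattened into per-sample rows as in the sketch below. The function name and the shape of the returned rows are illustrative choices, not part of the API. Converting the meta pairs to a dict is a convenience that keeps only the last value if a meta name is repeated.

```python
import json


def summarize_entries(response_text):
    """Return (last_timestamp, rows) extracted from a JSON feed response body."""
    feed = json.loads(response_text)["rl"]["feed"]
    rows = []
    for entry in feed.get("entries", []):
        rules = []
        for rule in entry.get("rule", []):
            rules.append({
                "identifier": rule["identifier"],
                "meta": dict(rule.get("meta", [])),  # [[name, value], ...] to dict
                "tags": list(rule.get("tag", [])),
            })
        rows.append({
            "sha1": entry["sha1"],
            "ruleset_name": entry["ruleset_name"],
            "file_size": entry.get("file_size"),
            "sample_available": entry.get("sample_available"),
            "rules": rules,
        })
    return feed.get("last_timestamp"), rows
```

The returned last_timestamp, increased by 1, serves as the time_value of the next query.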