URL threat intelligence (TCA-0403)

This service returns threat intelligence data for the submitted URL. The report contains the ReversingLabs URL classification status, URL reputation from various reputation sources, metadata for performed URL analyses, and the maliciousness of files found on the submitted URL. The service also provides the option to get a list of these downloaded files.

URL classification information is URL-specific, and won't propagate to other resources hosted on the domain, or the domain itself.

URL classification is based on the proprietary ReversingLabs algorithm that takes into account the 3rd party URL reputation and the latest file reputation of all files downloaded across all performed URL analyses.

This means that a URL will be classified as malicious for as long as the sample that originally caused this classification is classified as malicious, regardless of whether it's present during a new scan.

To determine whether the potentially malicious sample is still present on the URL, users can list the files downloaded across all analyses, during the latest analysis, or during a specific analysis, together with the first and last download timestamps.

Users can send requests to three different endpoints:

URL report endpoint

This endpoint returns:

  • ReversingLabs URL classification
  • Third-party URL reputation and categorization
  • Metadata for the last analysis performed on the submitted URL
  • Counters of samples downloaded from the URL, mapped to their classification status (malicious, suspicious, known, unknown)
  • The most common threats downloaded from the submitted URL

If multiple analyses were performed on the submitted URL, the response will contain metadata for each performed analysis, and file counters will show the total number of samples downloaded across all analyses.

Downloaded files endpoint

  • Provides a list of hashes for files downloaded from the submitted URL: across all analyses, during the last analysis, or those downloaded during a specific analysis.
  • The results can be filtered to return samples with specific classifications. If requested, the endpoint can return extended metadata for each file.

Extended records contain:

  • Additional sample properties: SHA1 hash, MD5 hash, SHA256 hash, sample size, sample type, download availability of the sample, first and last seen dates, first and last download time.
  • Sample reputation information: classification, threat level, trust factor, malware family name, malware type, threat name, targeted platform and subplatform.

URL analysis notification feed

This feed serves a continuous list of previously submitted URLs whose analyses have completed and whose reports are ready. It can also be used as an additional source of interesting URLs.

This API is rate limited to 5 requests per second.
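A client can stay under this limit by spacing its own requests. The following is a minimal sketch, not part of the service: the `Throttle` class and its parameters are hypothetical names, and it assumes that evenly spacing calls 0.2 seconds apart is an acceptable way to respect a limit of 5 requests per second.

```python
import time


class Throttle:
    """Spaces calls so that at most `rate` calls start per second."""

    def __init__(self, rate=5, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate
        self.clock = clock    # injectable so the class can be tested without waiting
        self.sleep = sleep
        self.next_allowed = None  # earliest time at which the next call may start

    def wait(self):
        """Block until the next request may be sent and return the delay applied."""
        now = self.clock()
        delay = 0.0
        if self.next_allowed is not None and now < self.next_allowed:
            delay = self.next_allowed - now
            self.sleep(delay)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval
        return delay
```

Calling `wait()` immediately before each request is enough; the first call never sleeps.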

URL report endpoint

This service returns the report for the submitted URL. The report contains the ReversingLabs URL classification status, URL reputation from various reputation sources, metadata for performed URL analyses, statistics of files found on the submitted URL mapped to their classification, and an overview of the most common threats.

Request

POST /api/networking/url/v1/report/query/{format}

Path parameters:

  • format
    • Defines the POST body format. The following values are supported: xml and json
    • Required

Request body:

{
"rl": {
"query": {
"url": "string",
"response_format": "string"
}
}
}
  • url
    • The URL for which to retrieve the report. Provide the full URL of a website including the protocol (https://www.example.org). Only http and https protocols are supported. If the protocol is missing from the submitted URL, http will be automatically prepended to the URL. Note that URL normalization is performed during request submission, so duplicate and empty elements in the URL may be automatically removed or converted.
    • Required
  • response_format
    • Defines the response format. The following values are supported: xml (default) and json.
    • Optional
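The request described above can be assembled as follows. This is a sketch only: `build_report_request` is a hypothetical helper, it handles the JSON body format only, and it stops short of sending anything because the host name and authentication scheme are not covered in this section.

```python
import json

# Path from the Request section, with {format} fixed to "json".
REPORT_PATH = "/api/networking/url/v1/report/query/json"


def build_report_request(url, response_format="json"):
    """Return (path, body_text) for a URL report query using a JSON POST body."""
    if response_format not in ("xml", "json"):
        raise ValueError("response_format must be 'xml' or 'json'")
    body = {"rl": {"query": {"url": url, "response_format": response_format}}}
    return REPORT_PATH, json.dumps(body)
```

The returned body text can then be sent as the POST payload by whichever HTTP client is in use.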

Response

If there were multiple analyses performed on the submitted URL, the counters will include files downloaded across all of them. The report will also list the most common threats downloaded from the URL.

Regardless of whether the submitted URL was sent for analysis, the URL classification status and third-party URL reputation information will always be returned.

{
"rl": {
"sha1": "string",
"base64": "string",
"analysis": {},
"third_party_reputations": {},
"last_seen": "string",
"classification": "string",
"reason": "string",
"categories": [],
"requested_url": "string"
}
}
  • sha1
    • The SHA1 hash of the normalized submitted URL.
  • base64
    • The Base64 encoding of the normalized submitted URL.
  • analysis
    • Info on performed static analyses.
  • dynamic_analysis
    • Info on performed dynamic analyses.
  • third_party_reputations
    • Info from third parties on this URL.
  • last_seen
    • The last time when the requested URL received an indicator that updated its report. This can be the last time when we checked the URL reputation against third-party sources, the last time we obtained metadata for the requested URL from ReversingLabs static/dynamic file processing services (related files), or the last time the requested URL has been crawled or analyzed by the ReversingLabs Cloud Sandbox.
  • classification
    • URL classification based on the proprietary ReversingLabs algorithm that takes into account the 3rd party URL reputation and the latest file reputation of all files downloaded across all performed URL analyses. This means that a URL will be classified as malicious for as long as the sample that originally caused this classification is classified as malicious, regardless of whether the sample is present during a new scan.
  • categories
    • URL categorization according to the source, e.g. phishing. Not all sources provide categorization information.
  • requested_url
    • The submitted URL
  • reason
    • Reason why the URL was given a classification. This parameter is only shown if the classification is not unknown. Possible values are:
      • whitelist, blacklist - The URL was found on a ReversingLabs curated whitelist/blacklist.
      • file_reputation - Classification based on the downloaded content.
      • sandbox - Classification based on dynamic analysis (ReversingLabs Cloud Sandbox).
      • third_party_reputation, domain_third_party_reputation - Classification based on third-party reputation sources.
      • user_override - Classification overridden by you, or a user belonging to your organization.
      • analyst_override - Classification overridden by a ReversingLabs analyst.
  • threat_level
    • Malware severity indicator expressed as an integer between 0 and 5. Values from 1 to 5 indicate threats from lowest to highest severity, 0 is reserved for known URIs. For unknown URIs, this value is omitted.
    • In real-world situations, threat level values are typically interpreted in the following way:
      • Threat Level 4, 5 - immediate response required (e.g., different types of Trojans, URI found on a blocklist or matches a known malware regex)
      • Threat Level 2, 3 - should be examined within 24 hours (e.g., first stage exploits, URLs with homoglyph variations)
      • Threat Level 1 - not urgent, but should be periodically reviewed (e.g. Adware / PUA, misleading subdomains).
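The threat level interpretation above could be turned into a simple triage rule. This is an illustrative sketch: the function name and the returned labels are hypothetical, and it assumes `threat_level` appears directly under `rl` in a parsed JSON report.

```python
def triage_priority(report):
    """Map rl.threat_level from a parsed report to a triage label."""
    level = report.get("rl", {}).get("threat_level")
    if level is None:
        return "unknown"          # the value is omitted for unknown URIs
    if not 0 <= level <= 5:
        raise ValueError("threat_level must be between 0 and 5")
    if level == 0:
        return "known"            # 0 is reserved for known URIs
    if level >= 4:
        return "immediate"        # levels 4 and 5
    if level >= 2:
        return "within_24h"       # levels 2 and 3
    return "periodic_review"      # level 1
```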

rl.analysis

{
"first_analysis": "string",
"analysis_history": [],
"last_analysis": {},
"analysis_count": 0,
"statistics": {}
}
  • first_analysis
    • Time when the URL was analyzed for the first time (UTC)
  • analysis_history
  • last_analysis
    • Metadata about the last analysis.
  • analysis_count
    • The total number of analyses performed on the submitted URL
  • statistics
    • File counters mapped to classification. They include files downloaded across all performed analyses.

rl.analysis.analysis_history[] and rl.analysis.last_analysis

{
"analysis_id": "string",
"analysis_time": "string",
"http_response_code": 0,
"availability_status": "string",
"domain": "string",
"final_url": "string",
"serving_ip_address": "string"
}
  • final_url
    • The URL that was analyzed after redirecting from the submitted URL
  • availability_status
    • Indicates whether the analyzed URL was available or not (at the time of the analysis). Possible values are: online, offline
  • http_response_code
    • The HTTP status code returned by the analyzed site
  • analysis_time
    • Time when the report was generated
  • domain
    • The domain that hosts the URL
  • serving_ip_address
    • The resolved IP address of the analyzed URL
  • analysis_id
    • The unique identifier of the analysis
  • redirection_count
    • If the submitted URL redirects, this is the total count of redirections.
  • redirection_chain
    • The chain of redirections from the submitted URL to the final URL. The maximum number of listed redirections is 30.

rl.analysis.statistics

{
"unknown": 0,
"known": 0,
"suspicious": 0,
"malicious": 0,
"total": 0
}
  • total
    • The total number of files downloaded from the URL
  • known
    • The number of files classified as KNOWN
  • suspicious
    • The number of files classified as SUSPICIOUS
  • malicious
    • The number of files classified as MALICIOUS
  • unknown
    • The number of files without classification

rl.dynamic_analysis.analysis_history[] and rl.dynamic_analysis.last_analysis

{
"analysis_id": "string",
"analysis_time": "string",
"platform": "string",
"classification": "string",
"risk_score": 0,
"threat_type": [
"string"
],
"browser": "string",
"optional_parameters": "string"
}
  • analysis_id
    • The unique identifier of the analysis.
  • analysis_time
    • Time when the report was generated.
  • platform
    • States which platform was used to detonate the URL.
  • classification
    • Contains classification value that matches the requested URL sample. Supported values are: MALICIOUS, SUSPICIOUS, CLEAN, UNKNOWN.
  • risk_score
    • Value representing the trustworthiness or malicious severity of a sample. Risk score is expressed as a number from 0 to 10, with 0 indicating whitelisted samples from a reputable origin, and 10 indicating the most dangerous threats.
  • threat_type
    • Defines the type of a threat. This field is not shown if the classification is UNKNOWN.
  • browser
    • Browser-specific dynamic analysis detonation. The browser used for URL detonation is chrome.
  • optional_parameters
    • An object containing a list of optional parameters used during analysis. Possible values: internet_simulation, sample_name, geolocation, locale.
    • If internet_simulation is true, dynamic analysis was performed without connecting to the internet and a simulated network was used instead. If this value is false, the report is the same as if the parameter is omitted from the response. HTTPS traffic information is not monitored during analysis when internet_simulation is set to true.
    • sample_name is a custom file name and/or extension provided on submission. Custom extensions impact which application was used to open and run the file.
    • geolocation is a geographic location associated with the sample's network activity, reflecting the configured country from which the network traffic is egressed, set via VPN or similar routing methods. Supported geographic location values are: us (default), uk, in, br, de, jp, sg, it, es, fr, tor.
    • locale setting reflects the configured OS language, region, and keyboard layout to simulate a specific country or environment for anti-evasion or targeted analysis purposes. Supported locale values are: en-US (default), en-GB, pt-BR, de-DE, ja-JP, it-IT, es-ES, fr-FR.

rl.third_party_reputations.sources[]

{
"source": "string",
"update_time": "string",
"detection": "string"
}
  • source
    • Name of the third party source
  • detection
    • Detection for the submitted URL. The possible values are malicious/clean/undetected. If the URL is on a blacklist, it will be labeled as malicious. Whitelisted URLs will be considered clean, and if the source does not have any information about the URL, the value will be undetected.
  • categories
    • URL categorization according to the source, e.g. phishing. Not all sources provide categorization information.
  • update_time
    • Time when the information from the source was last updated
  • detect_time
    • Time when the URL was last detected or given a category by the source.

rl.third_party_reputations.statistics

{
"total": 0,
"malicious": 0,
"suspicious": 0,
"clean": 0,
"undetected": 0
}
  • total
    • The total number of consulted URL reputation sources
  • malicious
    • The number of sources that consider the URL malicious
  • suspicious
    • The number of sources that consider the URL suspicious
  • clean
    • The number of sources that consider the URL clean
  • undetected
    • The number of sources that do not have info about the URL
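To see which sources produced each verdict, the `sources` list can be grouped by its `detection` value. A minimal sketch, assuming the report has already been parsed from JSON; `sources_by_detection` is a hypothetical helper.

```python
def sources_by_detection(report):
    """Group third-party source names by their detection value."""
    reputations = report.get("rl", {}).get("third_party_reputations", {})
    grouped = {}
    for entry in reputations.get("sources", []):
        # Assumption: an entry without a detection field is treated as undetected.
        detection = entry.get("detection", "undetected")
        grouped.setdefault(detection, []).append(entry.get("source"))
    return grouped
```

For example, `sources_by_detection(report).get("malicious", [])` lists the sources that flagged the URL as malicious.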

Response format (no report)

  • If the analysis report does not exist, the API will return a 200 response and the third_party_reputations section. The analysis section will be empty.
{
"rl": {
"requested_url": "string",
"classification": "string",
"analysis": {},
"third_party_reputations": {
"statistics": "string",
"sources": [
"string",
"string",
"string"
]
}
}
}
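Because this case also returns a 200 response, a client has to inspect the body to tell it apart from a full report. A small sketch of such a check, with `has_analysis_report` as a hypothetical helper operating on a parsed JSON response:

```python
def has_analysis_report(report):
    """Return True if rl.analysis carries data, False if it is empty or absent."""
    return bool(report.get("rl", {}).get("analysis"))
```

When this returns False, only the classification and third-party reputation fields are meaningful.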

Examples - report retrieval

Retrieve a report

/api/networking/url/v1/report/query/json
{
"rl": {
"query": {
"url": "https://psychology-degree-programs-us.today/",
"response_format": "json"
}
}
}

Retrieve a report for a URL with third party detections only

Retrieving a URL report in JSON format, using a JSON POST body. The submitted URL was not analyzed (submitted to the TCA-0404 service), but third party detections exist.

/api/networking/url/v1/report/query/json
{
"rl": {
"query": {
"url": "http://icayus.com/wTpWgvg",
"response_format": "json"
}
}
}

URL downloaded files endpoint

Request

POST /api/networking/url/v1/downloaded_files/query/{format}
  • format
    • Defines the POST body format. The following values are supported: xml and json.
    • Required

Request body:

{
"rl": {
"query": {
"url": "string",
"analysis_id": "string",
"last_analysis": "string",
"response_format": "string",
"limit": "string",
"extended": "string",
"classification": "string"
}
}
}
  • url
    • The URL for which to retrieve a list of files. Provide the full URL of a website including the protocol (https://www.example.org). Only http and https protocols are supported. If the protocol is missing from the submitted URL, http will be automatically prepended to the URL. Note that URL normalization is performed during request submission, so duplicate and empty elements in the URL may be automatically removed or converted. This parameter returns a list of all files downloaded from a URL in all analyses. url cannot be used in combination with analysis_id.
    • Optional
  • analysis_id
    • A string provided by the Analyze URL API response, or the URL Analysis Notification Feed. If provided in the request, this parameter returns a list of files downloaded from a URL during the specific analysis matching the ID. analysis_id cannot be used in combination with url.
    • Optional
  • last_analysis
    • Boolean; if set to true, the listed files will be exclusively the ones that were downloaded during the most recent analysis. Default value: false. last_analysis cannot be used in combination with analysis_id.
    • Optional
  • response_format
    • Defines the response format. The following values are supported: xml (default) and json
    • Optional
  • limit
    • The number of files to return in the response. Default value: 1000
    • Optional
  • extended
    • Boolean; if set to true, the response contains the extended data set. If set to false, it contains the non-extended data set. Default value: false
    • Optional
  • classification
    • If this parameter is provided in the request, the response will contain only samples that match the requested classification. Supported values are: KNOWN, SUSPICIOUS, MALICIOUS, UNKNOWN
    • Optional
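The parameter constraints above can be checked on the client before a request is sent. The sketch below builds the JSON body and rejects the combinations the descriptions rule out. `build_downloaded_files_query` is a hypothetical helper, and requiring one of `url` or `analysis_id` is an assumption, since the source marks both as optional.

```python
import json

ALLOWED_CLASSIFICATIONS = {"KNOWN", "SUSPICIOUS", "MALICIOUS", "UNKNOWN"}


def build_downloaded_files_query(url=None, analysis_id=None, last_analysis=False,
                                 limit=None, extended=False, classification=None,
                                 response_format="json"):
    """Return the JSON body text for a downloaded files query."""
    if url and analysis_id:
        raise ValueError("url cannot be used in combination with analysis_id")
    if not url and not analysis_id:
        # Assumption: a query needs one of the two identifiers to look anything up.
        raise ValueError("provide either url or analysis_id")
    if last_analysis and analysis_id:
        raise ValueError("last_analysis cannot be used in combination with analysis_id")
    if classification is not None and classification not in ALLOWED_CLASSIFICATIONS:
        raise ValueError("unsupported classification: %r" % (classification,))

    query = {"response_format": response_format}
    if url:
        query["url"] = url
    else:
        query["analysis_id"] = analysis_id
    # Optional parameters are only included when they differ from the defaults.
    if last_analysis:
        query["last_analysis"] = True
    if limit is not None:
        query["limit"] = limit
    if extended:
        query["extended"] = True
    if classification is not None:
        query["classification"] = classification
    return json.dumps({"rl": {"query": query}})
```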

Response

The response will contain metadata for files downloaded from the submitted URL. Empty fields are not included in the response.

{
"rl": {
"requested_url": "string",
"first_analysis": "string",
"last_analysis": "string",
"analysis_count": 0,
"total_files_count": 0,
"files": [],
"next_page": "string"
}
}
  • requested_url
    • The submitted URL
  • first_analysis
    • Time when the URL was analyzed for the first time (UTC)
  • last_analysis
    • Time when the URL was last analyzed (UTC)
  • analysis_count
    • The total number of times the URL was analyzed
  • total_files_count
    • The total number of files downloaded from the submitted URL (all crawls)
  • files
    • A list of files and their metadata.
      • If url is submitted: all files downloaded across all analyses
      • If analysis_id is submitted: only files downloaded during that analysis (crawl)
      • If url is submitted with last_analysis=true: only files downloaded during the last analysis (crawl)
  • next_page
    • This value can be used with the page parameter in the next request to retrieve the next page of records

rl.files[]

{
"first_download": "string",
"classification": "string",
"sample_type": "string",
"sha256": "string",
"sample_size": 0,
"sample_available": 0,
"sha1": "string",
"last_download": "string",
"first_seen": "string",
"threat_level": 0,
"trust_factor": 0,
"md5": "string",
"last_seen": "string"
}
  • first_download
    • Time when the file was first downloaded (UTC)
  • classification
    • File classification. Can be one of the following: KNOWN, MALICIOUS, SUSPICIOUS, UNKNOWN
  • sample_type
    • File type, as detected by Spectra Core
  • sha256
    • SHA256 of the file
  • sample_size
    • File size (in bytes)
  • sample_available
    • Indicates whether the sample is present in the ReversingLabs storage and available for download (true) or if it's not available (false).
  • sha1
    • The SHA1 hash of the file
  • last_download
    • Time when the file was last downloaded (UTC)
  • first_seen
    • Time when the sample was first seen in the ReversingLabs system (UTC)
  • md5
    • MD5 of the file
  • trust_factor
    • Trustworthiness indicator for goodware samples, expressed as an integer between 0 and 5, where 0 indicates the most trusted samples (highest confidence). Applies to known samples only
  • threat_name
    • Complete malware threat name. Conforms to the ReversingLabs Malware naming standard: platform-subplatform.type.familyname. Applies to malicious and suspicious samples only
  • threat_level
    • Malware severity indicator for suspicious and malicious samples, expressed as an integer between 0 and 5, where 5 indicates the most dangerous threats (highest severity). Applies to malicious and suspicious samples only
  • malware_type
    • The type part of the full threat name detected for the sample (for example, Trojan, Adware, Rootkit...). Conforms to the ReversingLabs Malware naming standard. Applies to malicious and suspicious samples only
  • malware_family
    • The familyname part of the full threat name detected for the sample (for example, Marsdaemon, Orcus, Androrat...). Applies to malicious and suspicious samples only
  • platform
    • The platform targeted by the malware
  • subplatform
    • The subplatform targeted by the malware
  • last_seen
    • Time when the sample was last seen in the ReversingLabs system (UTC)
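The threat_name format described above (platform-subplatform.type.familyname) can be split into its parts when only the full name is at hand. This is a sketch with a hypothetical helper name, and it assumes that a name without a hyphen in its first segment has no subplatform.

```python
def parse_threat_name(threat_name):
    """Split a threat name of the form platform-subplatform.type.familyname."""
    parts = threat_name.split(".", 2)
    if len(parts) != 3:
        raise ValueError("expected platform[-subplatform].type.familyname")
    head, malware_type, family = parts
    # Assumption: the subplatform is optional and separated by the first hyphen.
    platform, _, subplatform = head.partition("-")
    return {
        "platform": platform,
        "subplatform": subplatform or None,
        "malware_type": malware_type,
        "malware_family": family,
    }
```

In extended records the same parts are already returned as separate fields, so this is mainly useful for names obtained elsewhere.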

Examples - file metadata

Retrieve files downloaded from a URL with extended metadata

Get extended metadata about MALICIOUS files downloaded from http://185.242.104.78/wftp/ using a JSON request. Request the response in JSON, and limit it to 1 item.

Request:

/api/networking/url/v1/downloaded_files/query/json
{
"rl": {
"query": {
"url": "http://185.242.104.78/wftp/",
"limit": 1,
"extended": true,
"classification": "MALICIOUS",
"response_format": "json"
}
}
}

Response:

{
"rl": {
"requested_url": "http://185.242.104.78/wftp/",
"first_analysis": "2020-04-06T15:38:20",
"last_analysis": "2020-04-14T13:04:35",
"analysis_count": 6,
"total_files_count": 5,
"files": [
{
"sha1": "0d092b9cad4e5219313c589b9dd73b514af22bdf",
"first_download": "2020-03-30T11:19:54",
"last_download": "2020-03-31T11:22:32",
"classification": "MALICIOUS",
"md5": "80e101f716a4b97ea309e5f8d0ae654b",
"sha256": "a92e3c5ee2574135d81299050998115d847fa9289559adf1eee8e860baa7063e",
"sample_available": true,
"first_seen": "2020-03-25T01:07:44",
"last_seen": "2020-04-07T21:02:00",
"sample_type": "Text/VBS",
"sample_size": 19836,
"trust_factor": 5,
"threat_level": 5,
"threat_name": "Script-VBS.Trojan.Valyria",
"malware_family": "Valyria",
"malware_type": "Trojan",
"platform": "Script",
"subplatform": "VBS"
}
],
"next_page": "15856537526475f6b4c1d2867907b170ee65eae5ae82340546"
}
}

URL analysis notification endpoint

This service provides a continuous list of completed analyses up to 90 days into the past. The records enter the feed once the submitted URL is analyzed to completion and the report is ready.

Request

GET /api/networking/url/v1/notifications/query/latest[/page/{page}]?[format=xml|json]&[limit=1-1000]
GET /api/networking/url/v1/notifications/query/from/{time_format}/{start_time}[/page/{page}]?[format=xml|json]&[limit=1-1000]

Path parameters:

  • time_format
    • Defines the time format. Possible values are: utc, timestamp
    • Required
  • start_time
    • Accepts values in the format set by time_format. If the format is set to utc, the value should be formatted as YYYY-MM-DDThh:mm:ss. If the format is set to timestamp, the value should be expressed as the number of seconds since 1970-01-01 00:00:00
    • Required
  • page
    • An optional pagination parameter used for retrieving the next page of the results. The pagination value for the next page is provided in the previous request response
    • Optional

Query parameters:

  • format
    • Defines the response format. Supported values are xml and json; the default is xml
    • Optional
  • limit
    • The maximum number of reports to return in the response. Accepted values are numbers between 1 and 1000; if the parameter is not provided in the request, it defaults to 1000
    • Optional
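The two request forms and their parameters can be combined into a single path builder. The sketch below is illustrative: `build_feed_path` is a hypothetical helper, it returns only the path and query string (no host), and it validates `limit` against the 1 to 1000 range given in the query parameter description.

```python
from urllib.parse import urlencode

FEED_BASE = "/api/networking/url/v1/notifications/query"


def build_feed_path(start_time=None, time_format="timestamp", page=None,
                    response_format="json", limit=None):
    """Return the path and query string for the notification feed.

    Without start_time the `latest` form is used; otherwise the `from` form.
    """
    if start_time is None:
        path = FEED_BASE + "/latest"
    else:
        if time_format not in ("utc", "timestamp"):
            raise ValueError("time_format must be 'utc' or 'timestamp'")
        path = "%s/from/%s/%s" % (FEED_BASE, time_format, start_time)
    if page:
        path += "/page/%s" % page
    params = {"format": response_format}
    if limit is not None:
        if not 1 <= limit <= 1000:
            raise ValueError("limit must be between 1 and 1000")
        params["limit"] = limit
    return path + "?" + urlencode(params)
```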

Response

The response will contain a list of completed URL analyses, starting from the requested time, or from the last hour if the latest endpoint is used. The records are sorted in ascending order by their stored time.

The endpoint returns at most limit records. If the limit value is not provided in the request, up to 1000 records are returned by default.

{
"rl": {
"next_page": "string",
"urls": []
}
}
  • next_page
    • This value can be used with the page parameter in the next request to retrieve the next page of records
  • urls
    • List of URLs
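Following next_page from one response to the next could look like the sketch below. `iter_feed_records` and its `fetch_page` callback are hypothetical: the callback receives the page value (None for the first request) and returns the parsed JSON response. The sketch assumes that iteration should stop when a page has no records or no next_page value, and it caps the number of pages because the feed is continuous.

```python
def iter_feed_records(fetch_page, max_pages=10):
    """Yield records from rl.urls across pages by following next_page."""
    page = None
    for _ in range(max_pages):
        rl = fetch_page(page).get("rl", {})
        urls = rl.get("urls", [])
        yield from urls
        page = rl.get("next_page")
        # Assumption: an empty page or a missing next_page means nothing more to read.
        if not urls or not page:
            break
```

Keeping the HTTP call inside `fetch_page` leaves transport, authentication, and rate limiting to the caller.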

rl.urls[]

{
"url": "string",
"availability_status": "string",
"final_url": "string",
"analysis_id": "string",
"analysis_time": "string"
}
  • url
    • The submitted URL
  • final_url
    • The URL that was analyzed after redirecting from the submitted URL
  • availability_status
    • Indicates whether the submitted URL was available or not (at the time of the analysis). Possible values: online, offline
  • analysis_time
    • Time when the report was generated
  • analysis_id
    • The unique identifier of the analysis

Examples - URL Analysis Notification feed

Retrieve records from a specific timestamp

Retrieving information starting from timestamp 1586869174, listing five feed records, in JSON format.

Request:

/api/networking/url/v1/notifications/query/from/timestamp/1586869174?format=json&limit=5

Response:

{
"rl": {
"urls": [
{
"url": "http://45.95.168.106/mips",
"final_url": "http://45.95.168.106/mips",
"availability_status": "offline",
"analysis_time": "2020-07-14T08:57:44",
"analysis_id": "15947161334231c0"
},
{
"url": "https://eicar.org/?page_id=3950",
"final_url": "https://eicar.org/?page_id=3950",
"availability_status": "online",
"analysis_time": "2020-07-14T09:31:49",
"analysis_id": "159471910992e314"
},
{
"url": "https://maximizemeasly.com/.well-known/pki-validation/BillingsCollections/odvr/",
"final_url": "https://maximizemeasly.com/.well-known/pki-validation/BillingsCollections/odvr/",
"availability_status": "offline",
"analysis_time": "2020-07-14T10:43:05",
"analysis_id": "1594722480864193"
},
{
"url": "http://45.95.168.106/mips",
"final_url": "http://45.95.168.106/mips",
"availability_status": "offline",
"analysis_time": "2020-07-14T13:00:29",
"analysis_id": "15947316295131c0"
},
{
"url": "https://maximizemeasly.com/.well-known/pki-validation/BillingsCollections/odvr/",
"final_url": "https://maximizemeasly.com/.well-known/pki-validation/BillingsCollections/odvr/",
"availability_status": "offline",
"analysis_time": "2020-07-14T22:47:24",
"analysis_id": "1594766844714193"
}
],
"next_page": "15874700660fc075d084e05839478b2f2e6056ec64b5f8ee48"
}
}

Retrieve a specific number of records

Retrieving five records in JSON format, starting from UTC time 2020-06-08T00:00:00 and providing the next page parameter 15916074711674f496007368cf5a54b5492ab59bdeaeea71c6.

Request:

/api/networking/url/v1/notifications/query/from/utc/2020-06-08T00:00:00/page/15916074711674f496007368cf5a54b5492ab59bdeaeea71c6?format=json&limit=5

Response:

{
"rl": {
"urls": [
{
"url": "http://45.95.168.106/mips",
"final_url": "http://45.95.168.106/mips",
"availability_status": "offline",
"analysis_time": "2020-07-14T13:00:29",
"analysis_id": "15947316295131c0"
},
{
"url": "http://107.174.206.110/bins",
"final_url": "http://107.174.206.110/bins",
"availability_status": "offline",
"analysis_time": "2020-07-14T14:03:39",
"analysis_id": "159473541917280d"
},
{
"url": "http://194.15.36.104/bins",
"final_url": "http://194.15.36.104/bins",
"availability_status": "offline",
"analysis_time": "2020-07-14T14:18:45",
"analysis_id": "159473632593312e"
},
{
"url": "https://maximizemeasly.com/.well-known/pki-validation/BillingsCollections/odvr/",
"final_url": "https://maximizemeasly.com/.well-known/pki-validation/BillingsCollections/odvr/",
"availability_status": "offline",
"analysis_time": "2020-07-14T14:46:54",
"analysis_id": "1594738014894193"
},
{
"url": "https://ubadrium.com/",
"final_url": "https://ubadrium.com/",
"availability_status": "online",
"analysis_time": "2020-07-14T16:05:04",
"analysis_id": "159474270457d32c"
}
],
"next_page": "15916142651a3d50c943dc141d2713cc149f19449039e29e3e"
}
}