# T1000 Documentation

> T1000 threat intelligence appliance documentation. This file contains all documentation content in a single document following the llmstxt.org standard.

## Getting started with T1000

This guide walks you through authorizing the T1000 appliance, applying your license, creating an API user, and verifying the setup with a file reputation lookup.

## Prerequisites

Before you begin:

- T1000 appliance deployed and powered on (see [Deployment](./deployment.md))
- Network configured via the VM console (see [Management](./management.md))
- Access to a web browser on a machine that can reach the appliance IP
- `curl` installed for API testing

**Tip: Initial credentials**

Your initial administrator username and password for the web management interface are provided by [ReversingLabs Support](mailto:support@reversinglabs.com). Contact support before proceeding if you do not have them.

## Step 1: Log in to the web management interface

Open a browser and navigate to the appliance management interface on port 10000:

```
http://<appliance-ip>:10000
```

Log in with the default credentials provided by [ReversingLabs Support](mailto:support@reversinglabs.com). On first login you are prompted to set a new password; do this before proceeding.

**Note:** The web management interface may be unresponsive for 10–60 minutes after each restart while the database completes internal verification. Wait for it to become available before continuing.

## Step 2: Obtain your license (authorize the appliance)

The T1000 appliance cannot retrieve database updates or respond to API requests until it is authorized. Authorization links the appliance to your ReversingLabs account.

1. In the web management interface, navigate to **RL Appliance > Authorization**.
2. Copy the values from the following fields:
   - **Appliance Type**
   - **Appliance ID**
   - **Appliance Key**
   - **Appliance Version**
   - **Appliance Username**
   - **Expiration Date** (shows "N/A" on an unlicensed appliance; include it anyway)
3. Send all of these values to [support@reversinglabs.com](mailto:support@reversinglabs.com) to request your authorization token.
4. When ReversingLabs Support responds with the token, paste it into the **Token** field on the Authorization page.
5. Select **Authorize**.
6. Restart the appliance after successful authorization (**RL Appliance > Dashboard > Reboot**).

After restart, the appliance begins downloading the latest database updates from Spectra Intelligence. The Authorization page will show the license expiry date and the number of available updates.

## Step 3: Create an API user

REST API access requires a dedicated user account. The default admin account cannot be used for API calls.

1. Navigate to **RL Appliance > User Management**.
2. Enter a username (alphanumeric only, must not be `admin`).
3. Select **Add User**.
4. Note the generated 8-character password shown in the **User info** section.

Usernames always have the `u/` prefix. For example, if you enter `analyst`, the full username is `u/analyst`. Use this full prefixed form in all API requests.

**Note:** Passwords are auto-generated and cannot be manually set. Use the **Reset password** button to generate a new one if needed.

## Step 4: Verify with an EICAR hash lookup

Once the appliance has downloaded its initial database update, verify it is working by querying the EICAR test file hash, a well-known test sample that every threat intelligence database should classify as malicious.
```bash
curl -u "u/<username>:<password>" \
  "http://<appliance-ip>/api/databrowser/malware_presence/query/sha1/list?format=json" \
  -H "Content-Type: application/json" \
  -d '{"rl": {"query": {"hash_type": "sha1", "hashes": ["3395856ce81f2b7382dee72602f798b642f14d45"]}}}'
```

A successful response confirms the appliance is authorized, the database has loaded, and API access is working:

```json
{
  "rl": {
    "malware_presence": {
      "entries": [
        {
          "sha1": "3395856ce81f2b7382dee72602f798b642f14d45",
          "status": "MALICIOUS",
          "threat_level": 5,
          "classification": {
            "classification": "malware",
            "type": "Virus",
            "platform": "DOS",
            "family_name": "EICAR-Test-File"
          }
        }
      ]
    }
  }
}
```

If the response returns `UNKNOWN` or an authentication error, see [Troubleshooting](#troubleshooting).

## Troubleshooting

| Symptom | Likely cause | Action |
|---|---|---|
| Web interface unreachable on port 10000 | Database verification still in progress | Wait up to 60 minutes after restart |
| Authorization page shows "N/A" for all fields | Appliance not yet networked | Configure network via VM console first |
| EICAR returns `UNKNOWN` | Database update not yet complete | Wait for updates to finish; check update status on the Authorization page |
| API returns `401` | Wrong username format or password | Ensure username includes the `u/` prefix |
| API returns `429` | License expired | Re-authorize the appliance and contact [support@reversinglabs.com](mailto:support@reversinglabs.com) |

## Next steps

- [Configuration](./configuration.md): proxy settings, certificate management, REST API protocol
- [Management](./management.md): network settings, DNS, NTP, password reset via VM console
- [File Threat Intelligence API](/SpectraIntelligence/API/FileThreatIntel/): full API reference for hash lookups and file analysis reports

---

## T1000

# T1000: Network Appliance for File Reputation

ReversingLabs T1000 Appliance provides on-premises access to an up-to-date copy of ReversingLabs Spectra Intelligence, the industry's most comprehensive
source for threat intelligence and file reputation data. With a local database, customers do not incur latency penalties and privacy risks associated with the Internet. The T1000 Appliance uses a NoSQL database optimized for data replication, and supports advanced searches across billions of file records in milliseconds.

## T1000 R1 and XG

T1000 R1 and T1000 XG share the same core data model, but XG exposes additional sample metadata and analysis endpoints. While R1 supports only the TCA-0101 (Malware Presence) API, XG additionally provides TCA-0104 (File Analysis), TCA-0103 (Historic multi-AV scan records) and the XG-CFS forensic sampling service.

**Info:** The documentation for API endpoints titled TCA-XXXX is mirrored from [Spectra Intelligence](/SpectraIntelligence/API/FileThreatIntel). The functionality is equivalent in terms of requests: URL structure, request parameters, and so on. In terms of responses, certain information won't be present in T1000:

- history
- hashes that are not SHA256, SHA1 or MD5:
  - SHA384
  - SHA512
  - RIPEMD160
- scanner metadata:
  - version used for this scanning report
  - update timestamp

Where such information is available with a direct call to Spectra Intelligence, T1000 will return `null`.

When sending requests, use the username and the password created with the Appliance management interface.

### Response Status Codes

| Code | Description |
|------|-------------|
| 200 | The request has succeeded. |
| 400 | The request could not be understood by the server due to malformed syntax. |
| 401 | The request requires user authentication. |
| 403 | The server understood the request, but is refusing to fulfill it. |
| 404 | The server has not found anything matching the request URI. |
| 429 | License has expired. |
| 500 | The server encountered an unexpected condition which prevented it from fulfilling the request. |
| 503 | The server is currently unable to handle the request due to a temporary overloading or maintenance of the server. |

---

## Deployment

## Appliance Deployment

The T1000 virtual machine is provided as an OVA file or AMI file. When provided as an OVA file, it is stored on a USB hard disk provided by ReversingLabs. A file containing the MD5 hash of the OVA file (for example, `ReversingLabs-T1000-R1-YYYY-MM-DD.ova-md5`) is present in the root folder of the USB disk.

The method of deployment is standard for the underlying infrastructure. Follow one of these two resources for deployment:

- [VMware vSphere](https://techdocs.broadcom.com/us/en/vmware-cis/vsphere/vsphere/7-0/vsphere-virtual-machine-administration-guide-7-0/deploying-ovf-templatesvm-admin/deploy-an-ovf-template-flex-and-h5vm-admin.html)
- [AWS AMI](https://docs.aws.amazon.com/imagebuilder/latest/userguide/ib-tutorials.html)

## Minimum VM requirements for T1000 Virtual Appliance

Shared requirements:

- Virtual environment compatible with VMware vSphere Hypervisor 5.5 (ESXi 5.5, virtual machine hardware version 10)
- 64 GB RAM
- 16 CPUs
- All disks must be SSD

Specific requirements:

- T1000 XG: 18 TB free disk space (thin provisioning) with room to grow to 24 TB over the next 12 months (24 TB thick provisioning)
- T1000 R1: 16 TB free disk space
- T1000 AV: 20 TB free disk space

---

## Configuration

The T1000 appliance has a built-in graphical user interface that provides configuration options for authorization, proxy settings, dashboard, upgrade, user management, exporting and importing accounts, replicator endpoint settings, certificate management, and other system tool modules. Customers can access this interface in a web browser.

**Note:** The Web Management Interface is not fully usable while the database is going through internal verification. Depending on the hardware, this process can take between 10 and 60 minutes each time the appliance is restarted.
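Because the interface can stay unavailable for 10 to 60 minutes after a restart, automation that depends on it may want to poll before proceeding. The following is a minimal sketch, not part of the appliance: it assumes the login page answers plain HTTP requests on port 10000 once verification has finished, and the helper names `http_probe` and `wait_until_ready`, their parameters, and the example address are all hypothetical.

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=5):
    """Return True if the URL answers with any HTTP response at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded (for example with 401 or 403), so it is up.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timed out, or otherwise unreachable.
        return False


def wait_until_ready(probe, max_wait=3600, interval=30, sleep=time.sleep):
    """Call probe() every `interval` seconds until it returns True.

    Returns the number of seconds waited, or None if `max_wait` elapsed.
    """
    waited = 0
    while waited <= max_wait:
        if probe():
            return waited
        sleep(interval)
        waited += interval
    return None


# Example call (hypothetical address, assumption only):
# wait_until_ready(lambda: http_probe("http://192.0.2.10:10000"))
```

The probe is passed in as a function so the waiting logic can be exercised without a real appliance. If the interface is served over HTTPS with the default self-signed certificate, this probe would report it as unavailable unless certificate handling is added.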
## Web management

To access the management interface for the first time, open the following address in a web browser:

```
<protocol>://<appliance-ip>:10000
```

![](./_static/images/t1000-web-login.png)

On the login page, use the provided username and password to log in. After the first login, you must set a new password and log in with it.

When first started, the T1000 appliance contains no authorization data. This makes the appliance unable to connect to Spectra Intelligence, and unable to retrieve the latest updates. Furthermore, the fields for total remaining cloud updates, total cloud updates, and update status are blank.

After authorization, T1000 will download the latest database updates from Spectra Intelligence. By default, feed updates will be retrieved from `https://data.reversinglabs.com/replicator/v1` on port 443. To change the Replicator URL, see the section [T1000 Replicator Endpoint Module](#t1000-replicator-endpoint-module).

When the appliance is authorized, the fields on the login page will contain information about the appliance status:

- **Total remaining cloud updates** - the number of available updates not yet downloaded
- **Total cloud updates** - total number of available updates for download
- **Update status** - current status of downloaded updates

To log out from the appliance, select **Logout** from the upper right menu that displays the username.

## Authorization

The T1000 appliance needs to be authorized to be able to update the local database. Navigate to the main menu and select **RL Appliance > Authorization** to load the Authorization module.

![](./_static/images/t1000-authorization-module.png)

If an email client is configured on the system, use the link on the right side under the **Authorize** button. If an email client is not configured, copy the content of the **Appliance Type**, **Appliance ID**, **Appliance Key**, **Appliance Version**, **Appliance Username** and **Expiration Date** fields and send it to support@reversinglabs.com.
Note that in the unauthorized appliance state, the **Appliance Username** and **Expiration Date** fields will display as "N/A".

![](./_static/images/t1000-authorization-token.png)

Copy the received token into the **Token** field and **Authorize** the appliance. After successful authorization, restart the VM.

When the appliance is successfully authorized, create a user to access the REST API.

## Licensing

The T1000 appliance uses licenses with expiration dates. In the initial phase, when the license is in an inactive state, certain modules (excluding the Authorization module), configurations and API access are restricted. In this state all displayed information will indicate "Your license is not activated."

Upon successful authorization, all modules become accessible, and information regarding the license status and expiry date becomes visible within the Authorization module.

## Managing users

This module manages users that are connecting to the REST API of the appliance. Navigate to the main menu and select **RL Appliance > User Management**.

The username can contain only alphanumeric characters, and it cannot be `admin`. Usernames always have the `u/` prefix and must be used with that prefix. Usernames and passwords are case-sensitive.

![](./_static/images/t1000-user-info.png)

When a user has been successfully added, the "User info" section will display the new username and its generated 8-character-long password.

Users can be disabled, or deleted. Passwords cannot be manually changed, but a new password can be generated using the **Reset password** button.

### Importing and exporting accounts

The exporting and importing accounts functionality is used to export or import an existing account database to or from another ReversingLabs T1000 appliance. Navigate to the main menu and select **RL Appliance > Export and import accounts** to access this functionality.
**Note:** When importing accounts with the same name (e.g., `u/test`), the existing account(s) will be overwritten.

------------------------

The **RL Appliance** menu contains additional configuration options.

- You can set a proxy (HTTP/HTTPS).
- You can start/stop the appliance on the Dashboard page.
- You can switch between HTTP and HTTPS for API access on the REST API Protocol toggling page.
- Upgrading is done by uploading an upgrade file on the Upgrade page. The machine needs to be rebooted after upgrade. **Warning: Restart the appliance only if the upgrade was successful.**
- The replicator endpoint page allows changing the URL for the replicator service. Custom settings here may break the appliance, so please contact support if you need a custom URL for the replicator service. **Changing this setting requires an appliance restart.**
- The Certificate Management page allows replacing the SSL certificate and key for the web server. By default, the machine uses a self-signed certificate from ReversingLabs.

## Help

The built-in ReversingLabs Appliance Help documentation contains basic information about the modules and the REST API provided by the T1000 appliance. To access this documentation, click the **help** link inside the search box.

## System configuration

### Network interfaces

**Tools > Network Configuration > Network Interfaces**

![](./_static/images/t1000-network-interfaces.png)

At the top, under **Interfaces Active Now**, are the interfaces that are currently enabled and have an IP address assigned. All loopback, Ethernet and PPP interfaces will be shown, although not all will be editable.

At the bottom, under **Interfaces Activated at Boot Time**, are the interfaces that have been configured to be activated at boot. The two lists will not necessarily be the same, as some interface types (such as PPP) are not activated at boot time, and do not appear in the second list.

#### How to change the IP address

1. If the interface appears under both **Interfaces Active Now** and **Interfaces Activated at Boot Time** (as most of the editable ones do), click its name in the lower list. This will open a dialog for editing its settings.
2. To assign a different address, enter it into the **IP Address** field. To enable dynamic IP address assignment by a DHCP server, select the **From DHCP** option.
3. If necessary, change the **Netmask** field. If the **Netmask** or **IP Address** field is changed, the broadcast address must also be set based on the new netmask and IP address.
4. When editing an active interface, the **MTU** and **Hardware address** fields will be available. The **MTU** field should be edited only by experienced users because it can seriously impact network performance or completely cut the system off from the rest of the network. The hardware address should only be changed if the network card needs a different Ethernet address, which is rarely necessary.
5. If editing a boot-time interface, make sure the **Activate at boot?** option is set to **Yes** so that the interface is brought up when the system starts. If editing an active interface, make sure the **Status** option is set to **Up** so that it can be used immediately.
6. When done editing a boot-time interface, click the **Save & Apply** button to save changes for use at boot time, and to make them immediately active. When editing an active interface, click **Save** to apply changes.

#### How to configure routing

**Tools > Network Configuration > Routing & Gateways**

Any system attached to a large network needs to know the address of the default gateway. In some cases, the system itself may be a gateway as well, perhaps forwarding data between a local area network and a broadband connection. In this case, it must be configured to forward incoming packets that are destined for some other address.
In some cases, the traffic destined for certain networks may have to be sent through another router instead of the default gateway. If more than one IP network shares the same LAN, the traffic for any of those networks must be sent using the correct interface. If either of these is the case on your network, static or local routes need to be configured so that the system knows where to send packets for certain destinations.

To change the default gateway used by your system or enable packet forwarding, follow these steps.

1. Enter the IP address of the default gateway into the **Default router** field.
2. Enter the name of the network interface that must be used to reach the default router into the **Default route device** field. On some Linux distributions this field is optional, meaning that the system will set it automatically. On others, there is a **Gateway** field next to the **Default router** field.
3. To enable routing, set the **Act as router?** option to **Yes**.
4. On RedHat, Mandrake, MSC and Turbo Linux, static routes can be set up using the **Static routes** table. Each static route must be in a new row containing the following information:
   - In the **Interface** column, enter the interface that will be used to reach the router, such as eth0
   - In the **Network** column, enter the address of the remote network, such as 192.168.5.0
   - In the **Netmask** column, enter the network's netmask, such as 255.255.255.0
   - In the **Gateway** column, enter the IP address of a router that knows how to forward data to the network, such as 192.168.4.1
5. On distributions mentioned in step 4, routing to additional IP networks can be set using the **Local routes** table. Each route needs to be in a new row containing the following information:
   - In the **Interface** column, enter the name of the interface that the LAN is connected to, such as eth1
   - In the **Network** column, enter the address of the additional IP network, such as 192.168.3.0
6. Click the **Save** button when done modifying the settings. Changes are not activated immediately; they only take effect on the next boot.

If the system's primary network connection is via PPP dialup, then the default gateway will be assigned automatically when connecting, and removed when disconnected. Therefore it is not necessary to set it up with this dialog.

#### How to change the hostname or DNS settings

**Tools > Network Configuration > DNS Client**

Enter the new hostname (composed of letters, numbers, underscores and dots) into the **Hostname** field. Click the **Save** button to immediately apply the change. The browser will redirect to the main Network Configuration module page.

If there is a local DNS server running on the network, don't forget to update the entry for the reconfigured system there as well.

To change the system's DNS settings, follow these steps:

1. Click the **DNS Client** icon on the main Network Configuration module page to open the configuration dialog.
2. Enter the addresses of up to three servers into the **DNS servers** field. If the first is not available, the system will try the second, and then the third. Most networks will have at least a primary and secondary DNS server to increase reliability in case one fails.
3. The **Resolution order** field can be used to control where the system will look when resolving hostnames and IP addresses. Generally, the defaults are reasonable, with the `/etc/hosts` file listed first and DNS later. However, if NIS is used for hostname resolution, it must be selected somewhere in the order.
4. In the **Search domains** field, enter any domain names to append automatically when resolving hostnames. For example, if *foo.com* was on the list and the user ran the command `telnet server1`, then the IP address for *server1.foo.com* would be looked up.
5. When done modifying the settings, click the **Save** button. Any changes will take effect immediately in all programs running on the system.
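The effect of the **Search domains** field in step 4 can be illustrated with a small sketch. It lists the names that would be tried for a short hostname under a simplified model; real resolvers apply further rules (such as the `ndots` option), and the `candidate_names` helper is hypothetical, not something the appliance provides.

```python
def candidate_names(hostname, search_domains):
    """Return the names to try for `hostname`, in order (simplified model)."""
    # A trailing dot marks the name as already fully qualified.
    if hostname.endswith("."):
        return [hostname.rstrip(".")]
    # Each search domain is appended to the short name in list order.
    candidates = [f"{hostname}.{domain}" for domain in search_domains]
    # In this simplified model, the name as typed is tried last.
    candidates.append(hostname)
    return candidates


print(candidate_names("server1", ["foo.com"]))
# prints ['server1.foo.com', 'server1']
```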
If the system's only network connection is via dial-up, the DNS servers may be assigned automatically by the ISP depending on the PPP configuration.

##### Editing host addresses

On a small network with only a few systems, there is an option of not running a DNS server at all, but instead keeping the addresses of every system in the hosts file on each system.

![](./_static/images/t1000-address.png)

To view the addresses on the current system, click the **Host Addresses** icon on the main Network Configuration module page. There will always be an entry for `localhost`, and probably one for the local system's hostname as well. If the system's IP address or hostname has been changed, the host addresses list will probably not reflect the change, which could cause problems.

To change a host address, follow these steps:

1. Click a host IP address in the list, which opens the configuration dialog.
2. Enter the new address into the **IP Address** field.
3. Enter any hostnames into the **Hostnames** field. It is always a good idea to enter both the short and long forms of any hostname, such as `server1.foo.com` and `server1`, so that both can be used.
4. Click the **Save** button. If there are no errors in the dialog, the browser will return to the list of hosts and addresses.

Extra host addresses can be added by clicking the *Add a new host address* link above or below the list. There are no restrictions on the same hostname being associated with two different IP addresses, or the same IP address appearing twice in the list.

### Running processes

This module can be used to view, kill, re-prioritize, and run processes on the system. When opened for the first time from the System category, the main page will display a tree of processes.

![](./_static/images/t1000-running-processes1.png)

#### Starting a Process

The module can also be used to run simple commands, either in the foreground so that their output is displayed, or in the background as daemons.
This can be useful for running a command without having to log in via telnet or SSH (or if a firewall is preventing a telnet or SSH login).

The following steps describe the procedure for starting a process.

1. On the main page of the module, click the **Run** link next to the display mode options. This redirects to the dialog for starting a new process.
2. Enter the command into the **Command to run** field.
3. If the command is something that will take a long time to run, the **Run mode** option can be set to **Run in background** to automatically put the process in the background. To see the output from the command, leave the option set to **Wait until complete**.
4. Enter any input to be fed to the command into the **Input to command** field.
5. Click the **Run** button to execute the command. If the **Wait until complete** option was selected, any output from the command will be displayed.

#### Viewing, stopping, or prioritizing a process

To see the full details of any running process, click its entry in the **Process ID** column in any of the sections on the main page. This opens the process information page.

![](./_static/images/t1000-running-processes3.png)

The process can be stopped with a `TERM` signal by clicking the **Terminate Process** button. Because this signal can be ignored by some commands, the **Kill Process** button can be used to send a `KILL` signal if the termination fails. Unless the process hangs inside a kernel system call, killing it is guaranteed to succeed.

Other signals can be sent by selecting the type of signal next to the **Send Signal** button before clicking it.
Some of the more useful signals include:

- `HUP`: for many server processes, this signal will cause them to re-read their configuration files
- `STOP`: suspends the process until a `CONT` signal is received
- `CONT`: resumes a process that has been suspended by a `STOP` signal

The information page can also be used to change the `nice` level of a running process, giving it a higher or lower priority. To change the priority of a process, select a new level from the **Nice level** list, and then click the **Change** button. Lower levels mean higher priorities, so a process with a nice level of 10 will get less CPU time than one with a level of 5.

### Bootup and shutdown

This module allows creating and editing scripts that run at bootup and shutdown time. The main page of the module displays a list of all available actions (whether or not they are started at boot), and a short description for each.

![](./_static/images/t1000-bootup-shutdown-module.png)

To avoid the risk of losing data on the local hard drives, the system should always be rebooted or shut down with the appropriate commands, instead of turning off the power or pressing the reset button. If the system was improperly shut down and uses a non-journaling filesystem, it will perform a lengthy file system check with `fsck` at the next boot.

To reboot the system, follow these steps:

1. At the bottom of the Bootup and Shutdown module page, click the **Reboot System** button. This opens a new page prompting for reboot confirmation.
2. Click the **Reboot System** button on the confirmation page. The shutdown process starts immediately, and the current console session is automatically logged out. After all the shutdown scripts have been executed, the system will boot up again.

The procedure for shutting down the system is nearly identical, and is triggered using the **Shutdown System** button at the bottom of the page.

---

## Management

The T1000 appliance must be configured on first boot.
Network settings are configured via the VM console (reachable from vSphere Client or vSphere Web client) on TTY1. The VM console interface makes it possible to set up IP and DNS, run shutdown or restart procedures, and reset the web interface password.

When connected to the VM console, a menu and basic information such as MAC, IP, and GATEWAY address are displayed on TTY1. The addresses visible in the screenshot below are only for illustrative purposes. Users need to use the addresses that are valid in their production network environments.

First, choose how the appliance will obtain the IP: statically or dynamically (DHCP). The currently active setup is indicated with an \[active\] mark beside the menu item.

![](./_static/images/t1000-appliance-management-console.png)

Use the numbers to select individual settings. For example, press 1 to choose the interface, press 2 to configure a static IP address, and so on. Select values with `Enter`, confirm with `y`, or cancel with `n`.

After changing network settings, wait for the *Network configuration has been saved* message, and then reboot the appliance.

1. **Interface selection**: The default interface is `eth0`, but this can be changed.
2. **Static IP settings**:
   - The networking information uses dotted decimal notation.
   - Specify each of the networking values (IP address, subnet mask, broadcast address, gateway IP address).
3. **Dynamic IP (DHCP) settings**
4. **DNS settings**
5. **NTP settings**
6. **Reset the password**: This resets the Web management interface password to its original default.
7. **Shutdown/restart**

---

## T1000 R1

---

## File reputation (TCA-0101)

---

## T1000 XG

---

## File reputation (TCA-0101)(XG)

---

## Historic multi-AV scan records (TCA-0103)

---

## File analysis (TCA-0104)

---

## Cyber Forensic Service (XG CFS)

## Introduction

XG CFS provides all available cyber forensic Spectra Intelligence XG metadata for the requested sample(s) on the T1000 XG appliance. The service supports single and bulk queries.

## XG CFS Single Query

Returns information for a single hash.

### Request

- Requests can be sent using the GET method or the POST method.
- Both methods use Basic Authentication.
- Both methods support specifying hashes in the URL.
- POST allows specifying hashes in the request body as form data.

```
GET /api/xg/cfs/1/hashinfo/lookup.{format}
```

```
POST /api/xg/cfs/1/hashinfo/lookup.{format}
```

Path parameters:

- `format`: Specifies the response format. Supported values: `xml`, `json`

Query parameters:

- `md5`
- `sha1`
- `sha256`

Examples:

```
GET /api/xg/cfs/1/hashinfo/lookup.json?sha1=550a0e228ff317c74d62b668d260eb0a60bfeb39
```

### Response

```json5
{
  "hashinfo": {
    "peheadermetadata": {
      "version": "",
      "description": "",
      "language": "",
      "companyname": "",
      "originalname": "",
      "codepage": "",
      "productname": "",
      "productversion": "",
      "fileversion": ""
    },
    "fileinfo": {
      "firstseendateutc": "2024-12-09T16:49:11Z",
      "iscontainer": false,
      "crc32": "",
      "filesizebytes": 11401,
      "ssdeep": "",
      "md5": "9c966d6b81788f1a0026aef16355927a",
      "sha1": "00020c16c9a3f1ba16f0d339bdca3d64af4eb12a",
      "ispeformat": false,
      "firstseenname": "",
      "sha256": "fc2a50fff96621a8e72916d4feca6aaf773da8eb17422f1a9645348ee9f28c8a",
      "isexecutable": false
    },
    "trust": 8,
    "threat": 0,
    "certificate": {
      "commonname": "",
      "certificateexinfo": {
        "validfromdateutc": "",
        "publisher": "",
        "issuerthumbprint": "",
        "name": "",
        "serialnumber": "",
        "validtodateutc": "",
        "thumbprint": ""
      }
    }
  },
  "request": "/api/xg/cfs/1/hashinfo/lookup.json?sha1=00020c16c9a3f1ba16f0d339bdca3d64af4eb12a"
}
```

The response contains one top-level `hashinfolookup` object for the requested hash. The `request` object contains the data submitted in the request.

The response code 404 is returned with the message "Missing Hashinfo for sample" when a hash is not found in the database.

`hashinfo.certificate`

- `certificateexinfo`: Detailed view of certificate information for the requested hash. Contains metadata such as thumbprint of certificate issuer, certificate name, certificate publisher (organization name), certificate serial number, certificate thumbprint, date the certificate is valid from (in UTC), date the certificate is valid until (in UTC).
- `commonname`: The certificate common name.

---------------

`hashinfo.fileinfo`

- `crc32`: CRC32 value of the requested sample.
- `filesizebytes`: Sample size in bytes.
- `iscontainer`: Indicates whether or not the requested sample is a container.
- `isexecutable`: Indicates whether or not the requested sample is executable.
- `ispeformat`: Indicates whether or not the file format of the requested sample is PE (Portable Executable).
- `firstseendateutc`: First seen date of the requested sample (in UTC).
- `firstseenname`: First seen name of the requested sample.
- `md5`: MD5 value of the requested sample.
- `sha1`: SHA1 value of the requested sample.
- `sha256`: SHA256 value of the requested sample.
- `ssdeep`: SSDEEP value of the requested sample (if available).

----------------

`hashinfo.peheadermetadata`

- `codepage`: Codepage metadata from the PE header of the requested sample.
- `companyname`: Company name metadata from the PE header of the requested sample.
- `description`: Description metadata from the PE header of the requested sample.
- `fileversion`: File version metadata from the PE header of the requested sample.
- `language`: Language metadata from the PE header of the requested sample.
- `originalname`: Original name metadata from the PE header of the requested sample. - `productname`: Product name metadata from the PE header of the requested sample. - `productversion`: Product version metadata from the PE header of the requested sample. - `version`: Version metadata from the PE header of the requested sample. ## XG CFS Bulk Query This query returns nearly the same data as the single query, but for multiple sample hashes in a single response. It is more network-efficient than sending multiple single queries. ### Request - Requests can be sent using the GET method or the POST method. - Both methods use Basic Authentication. - Both methods support specifying hashes in the URL. - POST allows specifying hashes in the request body as form data. ``` GET /api/xg/cfs/1/hashinfos/lookup.{format} ``` ``` POST /api/xg/cfs/1/hashinfos/lookup.{format} ``` Path parameters: - `format`: Specifies the response format. Supported values: `xml`, `json` Query parameters: - `md5` - `sha1` - `sha256` When requesting a list of hashes, they should be submitted as multiple `hash_type=hash_value` pairs. Hashes can be submitted as part of the request URL or in the POST body (form data). The hashes can be serialized in one or both of the following ways: 1. One argument, comma-separated list: A set of hashes may be a comma-delimited list in a single argument, like so: ``` md5=hash1,hash2,hash3 ``` 2. 
Multiple arguments of the same type: A set of hashes may each be a separate argument of the same type, like so: ``` md5=hash1&md5=hash2&md5=hash3 ``` #### Examples GET requests: ``` GET /api/xg/cfs/1/hashinfos/lookup.json?sha1=550a0e228ff317c74d62b668d260eb0a60bfeb39,a183f2a0906357488256945754592faa4bd4f7ba ``` ``` GET /api/xg/cfs/1/hashinfos/lookup.json?sha1=550a0e228ff317c74d62b668d260eb0a60bfeb39&sha1=a183f2a0906357488256945754592faa4bd4f7ba ``` POST request: ``` POST /api/xg/cfs/1/hashinfos/lookup.json ``` Form data: ``` sha1=550a0e228ff317c74d62b668d260eb0a60bfeb39&sha1=a183f2a0906357488256945754592faa4bd4f7ba ``` ### Response ```json5 { "totalcount": 1, "request": "/api/xg/cfs/1/hashinfos/lookup.json", "hashinfos": [ { "certificate": { "commonname": "", "certificateexinfo": { "validfromdateutc": "", "publisher": "", "issuerthumbprint": "", "name": "", "serialnumber": "", "validtodateutc": "", "thumbprint": "" } }, "requestsha1": "00020c16c9a3f1ba16f0d339bdca3d64af4eb12a", "peheadermetadata": { "version": "", "description": "", "language": "", "companyname": "", "originalname": "", "codepage": "", "productname": "", "productversion": "", "fileversion": "" }, "isfound": true, "threat": 0, "fileinfo": { "firstseendateutc": "2024-12-09T16:49:11Z", "iscontainer": false, "crc32": "", "filesizebytes": 11401, "ssdeep": "", "md5": "9c966d6b81788f1a0026aef16355927a", "sha1": "00020c16c9a3f1ba16f0d339bdca3d64af4eb12a", "ispeformat": false, "firstseenname": "", "sha256": "fc2a50fff96621a8e72916d4feca6aaf773da8eb17422f1a9645348ee9f28c8a", "isexecutable": false }, "trust": 8 } ] } ``` The response for the bulk request contains one `hashinfoslookup` object with one `hashinfo` object for each requested hash. The `hashinfo` object in the bulk query response contains the same fields as the `hashinfo` object in the single query response. 
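As a rough illustration of the two hash serializations described in the Request section, the following Python sketch builds both query-string forms and picks the found hashes out of a parsed bulk response. The function names are hypothetical and are not part of any ReversingLabs client library. Sending the request itself (appliance host, Basic Authentication) is deliberately left out.

```python
from urllib.parse import urlencode

# Endpoint path copied from the bulk query request description.
BULK_PATH = "/api/xg/cfs/1/hashinfos/lookup.json"
HASH_TYPES = {"md5", "sha1", "sha256"}


def comma_separated_query(hash_type, hashes):
    """Serialize hashes as one argument holding a comma-separated list."""
    if hash_type not in HASH_TYPES:
        raise ValueError(f"unsupported hash type: {hash_type}")
    return f"{hash_type}=" + ",".join(hashes)


def repeated_argument_query(hash_type, hashes):
    """Serialize hashes as one argument per hash, all of the same type."""
    if hash_type not in HASH_TYPES:
        raise ValueError(f"unsupported hash type: {hash_type}")
    return urlencode([(hash_type, value) for value in hashes])


def bulk_get_path(hash_type, hashes):
    """Build the path and query string for a bulk GET request."""
    return BULK_PATH + "?" + comma_separated_query(hash_type, hashes)


def found_hashes(response, hash_type):
    """Return the requested hashes that a parsed JSON response marks as found."""
    key = "request" + hash_type  # for example "requestsha1"
    return [
        entry[key]
        for entry in response.get("hashinfos", [])
        if entry.get("isfound")
    ]
```

The output of `repeated_argument_query` can be used either after the `?` in a GET URL or as the form data of a POST request, since both carry the same `hash_type=hash_value` pairs.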
Additionally, the following fields are returned only in the bulk query response: - `totalcount`: Indicates how many hashes were submitted in the request. - `isfound`: For each requested hash, indicates if the hash was found in the ReversingLabs database. - Depending on the requested hash type, the response includes one of: - `requestmd5` - `requestsha1` - `requestsha256` --- ## Analysis Timeout Issues File analysis timeouts can occur when processing complex or large files that require extensive analysis time. Understanding the causes and solutions helps ensure successful file processing. ## Common Causes Analysis timeouts typically happen due to: - **Large file sizes** - Files approaching or exceeding the size limits for your appliance tier - **Deep nesting** - Archives containing multiple layers of compressed files - **Extensive unpacking** - Files that trigger recursive decompression operations - **Complex file structures** - Files with intricate internal structures requiring detailed parsing - **Resource constraints** - Insufficient RAM or CPU allocation for the analysis workload ## Configuration Options ### Spectra Analyze The analysis timeout can be adjusted in the appliance configuration: 1. Navigate to **Administration > Configuration** 2. Locate the analysis timeout setting 3. Increase the timeout value based on your file processing requirements 4. Save the configuration changes ### File Inspection Engine Use the `--analysis-timeout` flag to control the per-file time limit: ```bash rl-scan --analysis-timeout 300 /path/to/file ``` The timeout value is specified in seconds. ## Troubleshooting Steps If analysis timeouts persist: 1. **Increase allocated resources** - Ensure the appliance or container has sufficient RAM (32 GB+ recommended) and CPU cores 2. **Check decompression ratio limits** - Verify that recursive unpacking isn't exceeding configured limits 3. **Review file characteristics** - Examine the file structure to identify potential issues 4. 
**Monitor system resources** - Check if the appliance is under heavy load from concurrent analyses 5. **Adjust timeout values** - Increase timeout settings for complex file processing workflows ## Related Topics - [Platform Requirements](/General/DeploymentAndIntegration/PlatformRequirements) - Hardware specifications for different appliance tiers - [How Spectra Core analysis works](/General/AnalysisAndClassification/SpectraCoreAnalysis) - Understanding the analysis process --- ## Antivirus Result Availability When a sample is uploaded or rescanned in Spectra Intelligence, it will usually get new antivirus results **within 30 minutes**. When a sample has new antivirus results, they are available in the relevant APIs, for example [TCA-0104 File analysis](/SpectraIntelligence/API/FileThreatIntel/tca-0104/). --- ## Certificate Revocation ReversingLabs maintains a certificate revocation database that is updated with each [Spectra Core](/General/AnalysisAndClassification/SpectraCoreAnalysis) release. Because the database is offline, some recently revoked certificates may not appear as revoked until the next update. Certificate Authority (CA) revocation alone is not sufficient to classify a sample as malicious. Most CAs backdate revocations to the certificate's issuance date, regardless of when or whether the certificate was abused. When additional context is available, ReversingLabs adjusts the revocation date to reflect the most appropriate point in time. If a certificate is whitelisted, this correction is not applied. ## Searching for Revoked Certificates You can find samples signed with revoked certificates using **Advanced Search** with the `tag:cert-revoked` keyword. Advanced Search is available both through the [Spectra Analyze user interface](/SpectraAnalyze/search-page/) and as the [TCA-0320 Advanced Search](/SpectraIntelligence/API/MalwareHunting/tca-0320/) API. 
--- ## File Classification and Risk Scoring — ReversingLabs # Classification File classification assigns a risk score (0-10) and threat verdict (malicious, suspicious, goodware, or unknown) to every analyzed file using ReversingLabs Spectra Core. The classification algorithm combines YARA rules, machine learning, heuristics, certificate validation, and file similarity matching to determine security status. YARA rules take precedence as the most authoritative signal, followed by other detection methods that contribute to the final verdict. The classification of a sample is based on a comprehensive assessment of its assigned risk factor, threat level, and trust factor; however, it can be manually or automatically overridden when necessary. Based on this evaluation, files are placed into one of the following buckets: - No threats found (unclassified) - Goodware/known - Suspicious - Malicious The classification process weighs signals from all available sources to arrive at the most accurate verdict. Some signals are considered more authoritative than others and take priority. For example, [Spectra Core](/General/AnalysisAndClassification/SpectraCoreAnalysis) YARA rules always take precedence because they are written and curated by ReversingLabs analysts. These rules provide the highest degree of accuracy, as they target specific, named threats. This does not mean that other classification methods are less important. Similarity matching, heuristics, and machine learning still contribute valuable signals and may produce additional matches. In cases where multiple detections apply, YARA rules simply serve as the deciding factor for the final classification. ## Risk score A risk score is a value representing the trustworthiness or malicious severity of a sample. Risk score is expressed as a number from 0 to 10, with 0 indicating whitelisted samples from a reputable origin, and 10 indicating the most dangerous threats. 
At a glance: Files with no threats found don't get assigned a risk score and are therefore **unclassified**. Values from 0 to 5 are reserved for samples classified as **goodware/known**, and take into account the source and structural metadata of the file, among other things. Since goodware samples do not have threat names associated with them, they receive a description based on their risk score. Risk scores from 6 to 10 are reserved for **suspicious** and **malicious** samples, and express their severity. They are calculated by a ReversingLabs proprietary algorithm, and based on many factors such as file origin, threat type, how frequently it occurs in the wild, YARA rules, and more. Lesser threats like adware get a risk score of 6, while ransomware and trojans always get a risk score of 10. ### Malware type and risk score In cases where multiple threats are detected and there are no other factors (such as user overrides) involved, the final classification is always the one that presents the biggest threat. If they belong to the same risk score group, malware types are prioritized in this order: | Risk score | Malware types | |------------|---------------------------------------------------------------------------------------------------------------------| | 10 | EXPLOIT > BACKDOOR > RANSOMWARE > INFOSTEALER > KEYLOGGER > WORM > VIRUS > CERTIFICATE > PHISHING > FORMAT > TROJAN | | 9 | ROOTKIT > COINMINER > ROGUE > BROWSER | | 8 | DOWNLOADER > DROPPER > DIALER > NETWORK | | 7 | SPYWARE > HYPERLINK > SPAM > MALWARE | | 6 | ADWARE > HACKTOOL > PUA > PACKED | ## Threat level and trust factor The [risk score table](#risk-score) describes the relationship between the risk score, and the threat level and trust factor used by the [File Reputation API](/SpectraIntelligence/API/FileThreatIntel/tca-0101). 
The main difference is that the risk score maps all classifications onto one numerical scale (0-10), while the File Reputation API uses two different scales for different classifications. ### Nomenclature The following classifications are equivalent: | File Reputation API | Spectra Analyze | Spectra Detect Worker | | ------------------- | --------------- | ------------------------ | | known | goodware | 1 (in the Worker report) | In the Worker report, the [risk score](#risk-score) is called `rca_factor`. ## Deciding sample priority The [risk score table](#risk-score) highlights that a sample's risk score and its classification are not perfectly correlated. This means that a sample's risk score cannot be interpreted on its own, and that the primary criterion in deciding a sample's priority is its classification. Samples classified as suspicious can be a result of heuristics, or a possible early detection. A suspicious file may be declared malicious or known at a later time if new information is received that changes its threat profile, or if the user manually modifies its status. The system always considers a malicious sample with a risk score of 6 as a higher threat than a suspicious sample with a risk score of 10, meaning that samples classified as malicious always supersede suspicious samples, regardless of the calculated risk score. The reason for this is certainty - a malicious sample is decidedly malicious, while suspicious samples need more data to confirm the detected threat. ReversingLabs continuously works to reduce the number of suspicious samples. While a suspicious sample with a risk score of 10 does deserve user attention and shouldn't be ignored, a malicious sample with a risk score of 10 should be triaged as soon as possible. 
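The ordering rule described in this section can be sketched as a sort key: classification is compared first, and the risk score only breaks ties within the same classification. This is a minimal Python sketch, not the product's implementation. The `Sample` type and function names are hypothetical, and placing every classification other than malicious and suspicious in one last group is an assumption made for the example.

```python
from dataclasses import dataclass

# Lower rank means higher priority. The text above only states that malicious
# outranks suspicious; grouping all other classifications last is an assumption.
CLASSIFICATION_RANK = {"malicious": 0, "suspicious": 1}
OTHER_RANK = 2


@dataclass(frozen=True)
class Sample:
    sha1: str
    classification: str
    risk_score: int


def triage_key(sample):
    """Sort by classification first, then by descending risk score."""
    rank = CLASSIFICATION_RANK.get(sample.classification.lower(), OTHER_RANK)
    return (rank, -sample.risk_score)


def triage_order(samples):
    """Return samples in the order they should be reviewed."""
    return sorted(samples, key=triage_key)
```

With this key, a malicious sample with a risk score of 6 is listed before a suspicious sample with a risk score of 10, which matches the rule stated above.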
## Acting on classification and risk tolerance When appliances or integrations surface data from ReversingLabs Threat Intelligence lookups, the **classification** and **rca_factor** fields are the primary signal to act upon. - Always evaluate the `classification` field first: it determines whether a sample is known goodware, suspicious, or malicious and therefore dictates priority. - Use `rca_factor` to adjust the risk aversion profile and suppress alerts for low-level threats when appropriate. Lower factors correspond to less severe detections. Additional fields worth reviewing: - `propagated`: `true` means one of the extracted child files caused the verdict, so you may need to investigate nested artifacts. - `scan_results`: contains all malicious detections found in the file; the worst detection is propagated as the final file verdict. - `result`: verbal verdict name. - `factor`: can usually be ignored because it is actionable only in conjunction with clean samples. - `propagated_source`: contains the extracted file hash that was found to be malicious and triggered propagation. ### Risk tolerance profiles 1. **Low risk tolerance**: the user wants to be alerted about any possible threat. In this profile, act on both **Suspicious** and **Malicious** verdicts, which generates the maximum number of matches. 2. **High risk tolerance**: the user only wants to be notified about highly impactful threats. In this profile, filter for `classification:Malicious` and `rca_factor >= 7` so only the most severe detections surface. For deployments using Spectra Analyze, the [Risk Tolerance Levels](/SpectraAnalyze/Administration/users-personalization/risk-tolerance-levels) guide explains how appliance sensitivity settings choose which analysis sources count toward classification. 
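The two risk tolerance profiles can be expressed as a small filter. The sketch below is illustrative only: it assumes the report has already been reduced to a flat dictionary with a string `classification` and a numeric `rca_factor`, which may not match the exact structure of a real Worker report. The function name and the `"low"` / `"high"` profile labels are hypothetical.

```python
def should_alert(report, profile):
    """Decide whether a report should raise an alert under a tolerance profile.

    report:  dict with "classification" (string) and "rca_factor" (number);
             this flat shape is an assumption made for the example.
    profile: "low" (alert on any possible threat) or
             "high" (alert only on highly impactful threats).
    """
    classification = str(report.get("classification", "")).lower()
    rca_factor = report.get("rca_factor") or 0

    if profile == "low":
        # Low risk tolerance: act on both Suspicious and Malicious verdicts.
        return classification in {"suspicious", "malicious"}
    if profile == "high":
        # High risk tolerance: Malicious verdicts with rca_factor >= 7 only.
        return classification == "malicious" and rca_factor >= 7
    raise ValueError(f"unknown risk tolerance profile: {profile!r}")
```

For example, `should_alert({"classification": "Suspicious", "rca_factor": 9}, "high")` returns `False`, because the high tolerance profile filters on the classification before it considers the factor.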
For more context on interpreting the classification structure and `rca_factor`, refer to the [Spectra Detect Analysis](/SpectraDetect/Usage/Analysis) documentation, which outlines the fields returned in Worker reports. ## Malware naming standard --- ## Handling False Positives # Handling False Positives A false positive occurs when a legitimate file is incorrectly classified as malicious. While ReversingLabs strives for high accuracy, false positives can occasionally happen due to the complexity of malware detection across hundreds of file formats and millions of samples. ## What You Can Do If you encounter a false positive, you have several options: ### 1. Local Classification Override On Spectra Analyze, you can immediately override the classification using the classification override feature: - Navigate to the file's Sample Details page - Use the classification override option to manually set the file as goodware - The override takes effect immediately on your appliance - All users on the same appliance will see the updated classification ### 2. Spectra Intelligence Reclassification Request Submit a reclassification request through Spectra Intelligence: - The override propagates across all appliances connected to the same Spectra Intelligence account - Other appliances in your organization will automatically receive the updated classification - This is the recommended approach for organization-wide corrections ### 3. 
Goodware Overrides Use Goodware Overrides to propagate trusted parent classifications to extracted child files: - If a trusted parent file (e.g., from Microsoft or another reputable vendor) contains files that trigger false positives, the parent's goodware classification can automatically override the child files - This is particularly useful for legitimate installers that may contain components flagged by heuristics ## How ReversingLabs Handles False Positive Reports If a customer reports a false positive (through Zendesk, or by contacting the Support team at support@reversinglabs.com), the first thing we do is re-scan the sample to make sure that the results are up-to-date. If the results are still malicious, our Threat Analysis team will: 1. Conduct our own research of the software and the vendor 2. Contact the AV scanners and notify them of the issue 3. Change the classification in our system (we do not wait for AVs to correct the issue) --- If the file is confirmed to be a false positive, we begin by analyzing why the incorrect classification occurred. Then we try to correct the result by making adjustments related to file relationships, certificates, and AV product detection velocity (for example, whether detections are being added or removed). We also re-scan and reanalyze samples, adjust or add sources and, if necessary, manually investigate the file. If these efforts do not yield a correct result, we have the ability to **manually override the classification**, but we only do so after thorough analysis confirms the file is benign. --- ## ReversingLabs malware naming standard The ReversingLabs detection string consists of three main parts separated by dots. All three parts are mandatory and always appear. ``` platform-subplatform.type.familyname ``` 1. The first part of the string indicates the **platform** targeted by the malware. This string is always one of the strings listed in the [Platform string](#platform-string) table. 
If the platform is Archive, Audio, ByteCode, Document, Image or Script, then it has a subplatform string. Platform and subplatform strings are divided by a hyphen (`-`). The lists of available strings for Archive, Audio, ByteCode, Document, Image and Script subplatforms can be found in their respective tables. 2. The second part of the detection string describes the **malware type**. Strings that appear as malware type descriptions are listed in the [Type string](#type-string) table. 3. The third and last part of the detection string represents the malware family name, i.e. the name given to a particular malware strain. Names "Agent", "Gen", "Heur", and other similar short generic names are not allowed. Names can't be shorter than three characters, and can't contain only numbers. Special characters (apart from `-`) must be avoided as well. The `-` character is only allowed in exploit (CVE/CAN) names (for example CVE-2012-0158). #### Examples If a trojan is designed for the Windows 32-bit platform and has the family name "Adams", its detection string will look like this: ``` Win32.Trojan.Adams ``` If some backdoor malware is a PHP script with the family name "Jones", the detection string will look like this: ``` Script-PHP.Backdoor.Jones ``` Some potentially unwanted application designed for Android that has the family name "Smith" will have the following detection string: ``` Android.PUA.Smith ``` Some examples of detections with invalid family names are: ``` Win32.Dropper.Agent ByteCode-MSIL.Keylogger.Heur Script-JS.Hacktool.Gen Android.Backdoor.12345 Document-PDF.Exploit.KO Android.Spyware.1a Android.Spyware.Not-a-CVE Win32.Trojan.Blue_Banana Win32.Ransomware.Hydra:Crypt Win32.Ransomware.HDD#Cryptor ``` #### Platform string The platform string indicates the operating system that the malware is designed for. The following table contains the available strings and the operating systems for which they are used. 

| String | Short description |
| ----------- | ----------------- |
| ABAP | SAP / R3 Advanced Business Application Programming environment |
| Android | Applications for Android OS |
| AOL | America Online environment |
| Archive | Archives. See [Archive subplatforms](#archive-subplatforms) for more information. |
| Audio | Audio. See [Audio subplatforms](#audio-subplatforms) for more information. |
| BeOS | Executable content for Be Inc. operating system |
| Boot | Boot, MBR |
| Binary | Binary native type |
| ByteCode | ByteCode, platform-independent. See [ByteCode subplatforms](#bytecode-subplatforms) for more information. |
| Blackberry | Applications for Blackberry OS |
| Console | Executables or applications for old consoles (e.g. Nintendo, Amiga, ...) |
| Document | Documents. See [Document subplatforms](#document-subplatforms) for more information. |
| DOS | DOS, Windows 16 bit based OS |
| EPOC | Applications for EPOC mobile OS |
| Email | Emails. See [Email subplatforms](#email-subplatforms) for more information. |
| Firmware | BIOS, Embedded devices (mp3 players, ...) |
| FreeBSD | Executable content for 32-bit and 64-bit FreeBSD platforms |
| Image | Images. See [Image subplatforms](#image-subplatforms) for more information. |
| iOS | Applications for Apple iOS (iPod, iPhone, iPad…) |
| Linux | Executable content for 32 and 64-bit Linux operating systems |
| MacOS | Executable content for Apple Mac OS, OS X |
| Menuet | Executable content for Menuet OS |
| Novell | Executable content for Novell OS |
| OS2 | Executable content for IBM OS/2 |
| Package | Software packages. See [Package subplatforms](#package-subplatforms) for more information. |
| Palm | Applications for Palm mobile OS |
| Script | Scripts. See [Script subplatforms](#script-subplatforms) for more information. |
| Shortcut | Shortcuts |
| Solaris | Executable content for Solaris OS |
| SunOS | Executable content for SunOS platform |
| Symbian | Applications for Symbian OS |
| Text | Text native type |
| Unix | Executable content for the UNIX platform |
| Video | Videos |
| WebAssembly | Binary format for executable code in Web pages |
| Win32 | Executable content for 32-bit Windows operating systems |
| Win64 | Executable content for 64-bit Windows operating systems |
| WinCE | Executable content for Windows Embedded Compact OS |
| WinPhone | Applications for Windows Phone |

##### Archive subplatforms

| String | Short description |
| ------ | ----------------- |
| ACE | WinAce archives |
| AR | AR archives |
| ARJ | ARJ (Archived by Robert Jung) archives |
| BZIP2 | Bzip2 archives |
| CAB | Microsoft Cabinet archives |
| GZIP | GNU Zip archives |
| ISO | ISO image files |
| JAR | JAR (Java ARchive) archives |
| LZH | LZH archives |
| RAR | RAR (Roshal Archive) archives |
| 7ZIP | 7-Zip archives |
| SZDD | Microsoft SZDD archives |
| TAR | Tar (tarball) archives |
| XAR | XAR (eXtensible ARchive) archives |
| ZIP | ZIP archives |
| ZOO | ZOO archives |
| *Other Archive identification* | All other valid [Spectra Core](/General/AnalysisAndClassification/SpectraCoreAnalysis) identifications of Archive type |

##### Audio subplatforms

| String | Short description |
| ------ | ----------------- |
| WAV | Wave Audio File Format |
| *Other Audio identification* | All other valid Spectra Core identifications of Audio type |

##### ByteCode subplatforms

| String | Short description |
| ------ | ----------------- |
| JAVA | Java bytecode |
| MSIL | MSIL bytecode |
| SWF | Adobe Flash |

##### Document subplatforms

| String | Short description |
| ------ | ----------------- |
| Access | Microsoft Office Access |
| CHM | Compiled HTML |
| Cookie | Cookie files |
| Excel | Microsoft Office Excel |
| HTML | HTML documents |
| Multimedia | Multimedia containers that aren't covered by other platforms (e.g. ASF) |
| Office | File that affects multiple Office components |
| OLE | Microsoft Object Linking and Embedding |
| PDF | PDF documents |
| PowerPoint | Microsoft Office PowerPoint |
| Project | Microsoft Office Project |
| Publisher | Microsoft Office Publisher |
| RTF | RTF documents |
| Visio | Microsoft Office Visio |
| XML | XML and XML metafiles (ASX) |
| Word | Microsoft Office Word |
| *Other Document identification* | All other valid Spectra Core identifications of Document type |

##### Email subplatforms

| String | Short description |
| ------ | ----------------- |
| MIME | Multipurpose Internet Mail Extensions |
| MSG | Outlook MSG file format |

##### Image subplatforms

| String | Short description |
| ------ | ----------------- |
| ANI | File format used for animated mouse cursors on Microsoft Windows |
| BMP | Bitmap images |
| EMF | Enhanced Metafile images |
| EPS | Adobe Encapsulated PostScript images |
| GIF | Graphics Interchange Format |
| JPEG | JPEG images |
| OTF | OpenType Font |
| PNG | Portable Network Graphics |
| TIFF | Tagged Image File Format |
| TTF | Apple TrueType Font |
| WMF | Windows Metafile images |
| *Other Image identification* | All other valid Spectra Core identifications of Image type |

##### Package subplatforms

| String | Short description |
| ------ | ----------------- |
| NuGet | NuGet packages |
| DEB | Debian Linux DEB packages |
| RPM | Linux RPM packages |
| WindowStorePackage | Packages for distributing and installing Windows apps |
| *Other Package identification* | All other valid Spectra Core identifications of Package type |

##### Script subplatforms

| String | Short description |
| ------ | ----------------- |
| ActiveX | ActiveX scripts |
| AppleScript | AppleScript scripts |
| ASP | ASP scripts |
| AutoIt | AutoIt scripts (Windows) |
| AutoLISP | AutoCAD LISP scripts |
| BAT | Batch scripts |
| CGI | CGI scripts |
| CorelDraw | CorelDraw scripts |
| Ferite | Ferite scripts |
| INF | INF Script, Windows installer scripts |
| INI | INI configuration file |
| IRC | IRC, mIRC, pIRC/Pirch Script |
| JS | Javascript, JScript |
| KiXtart | KiXtart scripts |
| Logo | Logo scripts |
| Lua | Lua scripts |
| Macro | Macro (e.g. VBA, AmiPro macros, Lotus123 macros) |
| Makefile | Makefile configuration |
| Matlab | Matlab scripts |
| Perl | Perl scripts |
| PHP | PHP scripts |
| PowerShell | PowerShell scripts, Monad (MSH) |
| Python | Python scripts |
| Registry | Windows Registry scripts |
| Ruby | Ruby scripts |
| Shell | Shell scripts |
| Shockwave | Shockwave scripts |
| SQL | SQL scripts |
| SubtitleWorkshop | SubtitleWorkshop scripts |
| WinHelp | WinHelp Script |
| WScript | Windows Scripting Host related scripts (can be VBScript, JScript, …) |
| *Other Script identification* | All other valid Spectra Core identifications of Script type |

#### Type string

This string is used to describe the general type of malware. The following table contains the available strings and describes what each malware type is capable of. For a catalog of common software weaknesses that enable malware, see [CWE](https://cwe.mitre.org/) maintained by MITRE. CISA maintains advisories on actively exploited vulnerabilities at [cisa.gov/known-exploited-vulnerabilities](https://www.cisa.gov/known-exploited-vulnerabilities).

| String | Description |
| ----------- | ----------- |
| Adware | Presents unwanted advertisements |
| Backdoor | Bypasses device security and allows remote access |
| Browser | Browser helper objects, toolbars, and malicious extensions |
| Certificate | Classification derived from certificate data |
| Coinminer | Uses system resources for cryptocurrency mining without the user's permission |
| Dialer | Applications used for war-dialing and calling premium numbers |
| Downloader | Downloads other malware or components |
| Dropper | Drops malicious artifacts including other malware |
| Exploit | Exploits for various vulnerabilities, CVE/CAN entries |
| Format | Malformations of the file format. Classification derived from graylisting, validators on unpackers |
| Hacktool | Software used in hacking attacks, that might also have a legitimate use |
| Hyperlink | Classifications derived from extracted URLs |
| Infostealer | Steals personal info, passwords, etc. |
| Keylogger | Records keystrokes |
| Malware | New and recently discovered malware not yet named by the research community |
| Network | Networking utilities, such as tools for DoS, DDoS, etc. |
| Packed | Packed applications (UPX, PECompact…) |
| Phishing | Email messages (or documents) that pose as a trustworthy entity to mislead the victim into opening malicious links, disclosing personal information, or opening malicious files. |
| PUA | Potentially unwanted applications (hoax, joke, misleading...) |
| Ransomware | Malware which encrypts files and demands money for decryption |
| Rogue | Fraudulent AV installs and scareware |
| Rootkit | Provides undetectable administrator access to a computer or a mobile device |
| Spam | Other junk mail that does not unambiguously fall into the Phishing category, but contains unwanted or illegal content. |
| Spyware | Collects personal information and spies on users |
| Trojan | Allows remote access, hides in legit applications |
| Virus | Self-replicating file/disk/USB infectors |
| Worm | Self-propagating malware with exploit payloads |

---

## Risk score reference table

---

## How Spectra Core analysis works

# How Spectra Core Analysis Works

All ReversingLabs products are powered by [Spectra Core](https://www.reversinglabs.com/products/spectra-core) - the engine that analyzes every file and sample. The process of analyzing software involves several steps, and the final output is a set of analysis reports. To better understand the source and significance of the information contained in those reports, it's helpful to learn what Spectra Core does in the background of ReversingLabs products.

This page provides an overview of the Spectra Core analysis process and explains what happens with files in each of the analysis steps. The following main steps have dedicated sections where they are described in detail:

1. [Identification](#1-identification)
2. [Unpacking](#2-unpacking)
3. [Validation](#3-validation)
4. [Metadata processing](#4-metadata-processing)
5. [Classification](#5-classification)

## Automated static analysis

When you scan a file with Spectra Core, the engine automatically performs static analysis on the file and all files extracted from it. Automated static analysis is also referred to as **complex binary analysis**. This unique approach to software analysis decomposes files, collects their metadata, and classifies them in terms of the security risk they pose to end-users. Files are analyzed recursively, which means that every file extracted from the software package goes through the same analysis process as its container software package.

As implemented in Spectra Core, automated static analysis does not require access to the source code (like SAST tools typically do).
It can directly examine compiled software binaries to determine their structure, dependencies and behaviors. In addition to analyzing software binaries (which is the primary use-case), Spectra Core can analyze library code and source code for specific scripting languages. Another benefit of automated static analysis is that **files are not executed during the analysis process**. All available data is extracted even if the files are compressed, executable, or damaged - regardless of their target OS or platform. Because the analysis process does not execute any files, it can be completed in milliseconds and performed on very large files without significant performance penalties. All these features of automated static analysis give Spectra Core a unique advantage - it can analyze post-build artifacts and detect more novel, sophisticated software supply chain attacks than SCA tools are able to. SCA tools typically analyze package managers, manifest files, or source code repositories to find vulnerabilities. They are limited by the need for known signatures of open source dependencies that have to be cross-referenced against a vulnerability database. Being used in pre-build environments, SCA tools lack visibility into deep file structures and build process tampering evidence - insights that Spectra Core readily provides. ## The Spectra Core analysis process The process starts with the input file. The analysis engine performs several distinct steps on every object it extracts from the input file. The following diagram illustrates the flow that every object goes through. ### 1. Identification Format identification is the initial step of the Spectra Core analysis process. To successfully perform the subsequent analysis steps, we first need to know the file format of every object we are analyzing. 
Specifically, this step analyzes the object structure to determine whether it's **binary** or **text**, and assigns the analyzed object a unique file format description. This description - file format identification - instructs the analysis engine on which rules and modules to use for further file processing.

Two main approaches are used for format identification:

- **Signatures** - created by ReversingLabs researchers to identify **binary** file formats based on their unique features. For example, Windows .exe files start with the bytes "MZ", while PNG files start with the byte 0x89 followed by "PNG". Signatures describe expectations of what a file format should contain. Using heuristics, the analysis process checks whether those expectations align with the actual file structure. In addition to signatures, the analysis process also evaluates any relevant YARA rules (built into the engine as well as user-provided). If there are multiple matches, those from signatures take priority over YARA rule matches.
- **Machine learning models** - created and trained by ReversingLabs researchers to identify **textual** file formats based on statistical text identification. The models are able to recognize basic text objects as scripting languages and distinguish software source code from other types of textual content.

**Note: ✅ Completing the identification step**

The results of the format identification step are:

- File hashes - calculated by the analysis engine
- File format descriptions - represented as File type/File subtype/Identification (for example, `Binary/Archive/ZIP`). If there are multiple versions of a file format, they can be identified through the additional `version` field.

After the format has been identified, the file is directed either to the proper unpacking module according to its signature, or to the validation step.

### 2.
Unpacking Unpacking, also referred to as **file decomposition**, is a step in the Spectra Core analysis process where the analyzed file is taken apart to extract all available components and metadata. During the unpacking process, the analysis engine eliminates obfuscation, encryption, compression, and any other protections that may have been applied to the file and its contents. The engine has built-in mechanisms to prevent infinite recursion, and supports configuring the decompression ratio and unpacking depth (how many layers of a file to extract). Different file formats require different unpacking approaches because of their structure and complexity. Because static analysis does not execute a file, it requires **unpackers** - specialized tools for parsing and unpacking individual file formats. ReversingLabs develops in-house static unpackers tailored to specific file formats, and Spectra Core relies on those unpackers during analysis. Generally speaking, goodware file formats are easier to unpack because their structure is known and well-defined, and file behavior can be observed from the format definition. File formats commonly used for malware are good at hiding code, which makes their unpacking more challenging. To create an unpacker for malware file formats, researchers have to identify each format and document its structure. The unpacker must be able to simulate file execution so that its code can be reconstructed and its behavior observed. Any obfuscation and protection artifacts must also be removed to allow extracting further objects. Information about the file behavior allows the unpacker - and consequently, the analysis process - to reveal the original software intent and to let users understand the true meaning of the code that was packed in that particular file format. 
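
The recursion safeguards mentioned in this section (a cap on unpacking depth and on decompression ratio) can be illustrated with a minimal Python sketch. It is not Spectra Core's implementation: it only handles ZIP archives through the standard `zipfile` module, and the names `unpack`, `make_zip`, `MAX_DEPTH`, and `MAX_RATIO`, along with their values, are hypothetical choices made for this example.

```python
import io
import zipfile

MAX_DEPTH = 3    # hypothetical cap on how many nested layers are extracted
MAX_RATIO = 100  # hypothetical cap on uncompressed size / compressed size


def unpack(data: bytes, name: str = "input", depth: int = 0) -> list[str]:
    """Recursively visit an object and return one note per visited item."""
    indent = "  " * depth
    notes = [f"{indent}{name} ({len(data)} bytes)"]
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return notes  # leaf object: nothing further to extract
    if depth >= MAX_DEPTH:
        notes.append(f"{indent}  depth limit reached, not extracting")
        return notes
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Sizes come from the archive headers, so this check runs
            # before any member is actually decompressed.
            ratio = info.file_size / max(info.compress_size, 1)
            if ratio > MAX_RATIO:
                notes.append(f"{indent}  {info.filename} skipped, ratio {ratio:.0f}")
                continue
            # Every extracted member goes through the same routine
            # as its container, one level deeper.
            notes.extend(unpack(archive.read(info), info.filename, depth + 1))
    return notes


def make_zip(members: dict[str, bytes]) -> bytes:
    """Build an in-memory ZIP archive (helper for the demo only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for member_name, content in members.items():
            archive.writestr(member_name, content)
    return buffer.getvalue()


inner = make_zip({"readme.txt": b"hello"})
outer = make_zip({"inner.zip": inner, "bomb.bin": b"\0" * 1_000_000})
for line in unpack(outer, "outer.zip"):
    print(line)
```

In this sketch the highly compressible `bomb.bin` member is skipped by the ratio check, while `inner.zip` is opened and its contents are visited in turn. The ratio check trusts the sizes declared in the archive headers; a more defensive version would also count bytes while decompressing.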
The ability to unpack a file format makes it possible for the Spectra Core analysis engine to extract a wealth of metadata and critical information often not available from other tools. The collected metadata includes but is not limited to: format header details, strings (including secrets and URIs), function names, library dependencies, and file segments. Unpacking greatly increases the surface that can be analyzed and helps file classification by providing more metadata to look at. This makes it easier to confirm classification verdicts and increases the chance to catch every threat. **Note: ✅ COMPLETING THE UNPACKING STEP** After the file has been successfully unpacked, all collected metadata and the unpacked file content are passed to the validator assigned to the file format. The validator then performs integrity checks on the available data. ### 3. Validation Validation is a step in the Spectra Core analysis process where the **structure** and the **digital signatures** of the analyzed file are verified according to specific criteria for each file format. In the validation step, the previously identified file format is checked against its specification (the formal definition of the file format by its designer). In other words, the validation process looks for differences between the file format specification and its implementation. By doing this, we can gather additional information about the file format and detect anomalies in it. Any malformations that violate the file format specification are further examined to determine if they are capable of triggering potentially malicious behavior. Such malformations may be reported as known vulnerabilities. ReversingLabs uses these malformation patterns to create heuristics for potential future exploits and predictive vulnerability detection. Multiple validators may be used to verify a file format. 
They are called successively, first to last, or until one of them acknowledges that it recognizes and can handle the specific file format. If validation fails for one of them, the entire file is marked as invalid. Detected issues are reported as validation warnings or errors, depending on their severity. In addition to performing integrity checks of the file format structure, the validation step also verifies any digital certificates that have been used for code signing. Depending on its status, a certificate may influence the classification of files signed with it. The validation step assigns one of the following statuses to every detected certificate: - Valid certificate - Invalid certificate - Bad checksum - Bad signature - Malformed certificate - Self-signed certificate - Impersonation attempt - Expired certificate - Untrusted certificate - Revoked certificate **Note: ✅ COMPLETING THE VALIDATION STEP** After the file has been validated, all collected metadata is processed, evaluated, and transformed into actionable information that can be used to deliver the final file classification. ### 4. Metadata processing Metadata processing is a step in the Spectra Core analysis process where all previously collected metadata is translated into **human-readable**, **explainable information**. That information is used to produce or support the final file classification. Most of it is surfaced in Spectra Core analysis reports. In this step, metadata is converted into **capabilities** and **indicators**. They build up on the file format properties and platform-specific features of the analyzed file to describe software behavior and intent in more detail. The goal is to make it clearer what the analyzed code means and what each object is trying to do. #### Indicators Indicators can be described as behavior markers that are triggered when a specific pattern is found in the collected metadata or in the file content. An indicator may be triggered for multiple reasons. 
While some indicators can only be found in specific file formats, most are universal and therefore generally applicable. Indicators contribute to the final file classification, but not in equal measure. Those deemed highly relevant are better at describing the detected malware type, while those with less relevant contributions help solidify the machine learning detection.

#### Capabilities

Based on the indicators triggered on a file, the analysis engine infers that the file exhibits a specific behavior, or that it is capable of performing specific actions. Similar software behaviors are grouped into broader categories - capabilities - according to the features they have in common. For example, a file can have the filesystem capability, which is a broad description that says the file can access the filesystem or perform filesystem operations, but doesn't describe which operation will actually take place. More fine-grained software behavior descriptions are derived from the indicators (e.g. "Accesses the httpd.conf file").

#### Tags

The metadata processing step also assigns tags to files based on their properties, such as certificate information, software behaviors, and file contents. Some tags can only be applied to specific file types (for example, web browsers or mobile applications). Tags are visible in [Spectra Analyze](/SpectraAnalyze/system-and-user-tags) and can be queried through the [Spectra Intelligence Advanced Search (TCA-0320)](/SpectraIntelligence/API/MalwareHunting/tca-0320) API. In SAFE reports generated by Spectra Assure, tags appear for all unpacked files and for URIs in the Networking section, where they can be used for filtering.

**Note: ✅ COMPLETING THE METADATA PROCESSING STEP**

After the metadata has been fully processed, the file receives its classification status in the next step of the analysis.

### 5.
Classification Classification is a step in the Spectra Core analysis process where the analysis engine produces a **verdict** on whether the analyzed file contains threats harmful to the end-user. Multiple technologies are used for file classification: - format identification - signatures (byte pattern matches) - file structure validation - extracted file hierarchy - file similarity (RHA1) - certificates - machine learning - heuristics (for scripts and fileless malware) - YARA rules included in the analysis engine They are shipped with the analysis engine and can be used offline, without connecting to any external sources. Their coverage varies based on threat and file format type. In other words, not all technologies can detect all threat types, and not all of them work on all file formats. Those default classification abilities of the Spectra Core platform can be extended with **threat intelligence from the ReversingLabs Cloud** to retrieve file reputation information, and with **custom YARA rules for user-assisted classification**. Some classification approaches are more specific than others, with signatures being the most specific. The final classification result relies on the information from all analysis steps, and it is a combination of all technologies applicable to the file format. It will always match one of the technologies even though they may have differing results between them. Because of differences in how malicious files and malware families behave, some files might end up classified as malicious by one technology, and still be considered goodware by others. This doesn’t negate or diminish the final classification. #### Explainable Machine Learning Spectra Core is the first and only solution on the market that relies on [Explainable Machine Learning (xAI)](https://www.reversinglabs.com/blog/machine-learning-for-humans) for threat detection. 
Explainable Machine Learning was launched by ReversingLabs in 2020 as a predictive threat detection method that can detect novel malware. It focuses on providing threat analysts with human-readable insights into machine learning-driven classifications. The goal of ReversingLabs Explainable Machine Learning is to go beyond the basic verdict of "goodware vs malware", and to help analysts understand **what type of threat was found**, **why it was detected**, and **what to do with it next**. To achieve that, the classification system combines: - **explainability** (by surfacing software behaviors in the form of indicators), - **relevance** (by ranking behaviors based on their contribution to the final verdict), - and **transparency** (by displaying why each software behavior was triggered). Using natural language to provide clear explanations for classification decisions helps security analysts understand how analyzed software behaves and what malware is capable of doing to the system. This transparency fosters trust, facilitates informed decision-making, and makes the logic behind machine learning classification verdicts easier to follow. Over the years, ReversingLabs threat analysts and researchers have carefully transformed raw code and metadata produced by static analysis into indicators - descriptions of software intent. Those indicators are used in training machine learning (ML) models to recognize if a file is malicious based on the described software functionality and behavior. Many of the threats in the training datasets are hand-picked by ReversingLabs experts and fully, correctly labeled so that ML models can learn what constitutes a specific threat type, and distinguish it from other threat types as well as from clean software. This allows ML models to proactively detect and describe threats - even brand new malware - without the need for additional training. 
When Spectra Core scans a file and extracts some indicators from it, ML models can match them against the indicators they have learned to recognize as typical for malware or a specific threat type. Some indicators are more meaningful in the context of a malware or threat type, so they contribute more to the classification. When the model decides that something is malicious, the decision can be verified through indicators and reasons why they were triggered. This makes the decision more transparent, relevant, and explainable in terms that are familiar to human analysts. ReversingLabs ML models are tailored to threat types to increase accuracy and [continuously improved](https://www.reversinglabs.com/blog/how-to-harden-ml-models-against-adversarial-attacks) to boost their resilience. All classification models can detect if a file is malicious or not. The PE (Portable Executable) malware classifier is also able to provide the information on the detected threat type. The exact threat type indicates higher confidence in the classification result, while threats that get assigned a generic threat type ("Malware") may point to new, emerging malware. The following ML models are used for malware classification: - PE malware classifier - detects if a file is malicious (that covers all the threat types) and if it is a specific malware type (one of **Backdoor**, **Downloader**, **Infostealer**, **Keylogger**, **PUA**, **Ransomware**, **Worm**) - Script classifiers - apply to `Text/