Version: File Inspection Engine 1.2.0

Docker

Docker image

The File Inspection Engine Docker image can either be obtained from the ReversingLabs container registry or provided as a .tar file.

Pulling the Docker image

To pull the Docker image from the ReversingLabs container registry:

  1. Log in to the Docker Registry

    Log in using your cloud username and password:

    docker login registry.reversinglabs.com
  2. Pull the Docker image

    Pull the file-inspection-engine Docker image with the specified tag:

    docker pull registry.reversinglabs.com/fie/file-inspection-engine:1.2.0

Loading the Docker image from a .tar file

If you have received the Docker image as a .tar archive (plain or gzip-compressed, as in the example below), load it with the following command:

docker image load -i file-inspection-engine-1.2.0.tar.gz
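
Before loading a transferred archive, it can be worth confirming that it is readable. The following is a minimal sketch, assuming GNU tar or bsdtar (both detect compression automatically when reading); the helper name verify_image_archive is hypothetical and not part of the product.

```shell
# Hypothetical helper: succeed only if the archive can be listed end to end.
verify_image_archive() {
    archive="$1"
    # Listing the contents forces tar to read (and decompress) the whole file.
    tar -tf "$archive" > /dev/null
}

# Usage:
#   verify_image_archive file-inspection-engine-1.2.0.tar.gz && \
#   docker image load -i file-inspection-engine-1.2.0.tar.gz
```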

Running the application

The File Inspection Engine (FIE) reads its license from an environment variable called RL_LICENSE. This license, provided by ReversingLabs, must be passed to the application at startup.

To start the application, use the following command. If the image was pulled from the registry, use the full image name, including the registry prefix (for example, registry.reversinglabs.com/fie/file-inspection-engine:1.2.0).

docker run --rm -it -e RL_LICENSE="$(cat <path/to/license/file>)" -v /path/on/host/threat-data:/rl/threat-data -v /path/on/host/tmp:/rl/tmp --net host file-inspection-engine:1.2.0

In this example, the container runs on the host network, so no port mapping is needed.
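
Because the license is read with cat at startup, a missing or empty license file would silently produce an empty RL_LICENSE value. The sketch below guards against that; the helper name check_license_file is hypothetical and the license path is a placeholder.

```shell
# Hypothetical helper: fail early if the license file is missing or empty.
check_license_file() {
    license_file="$1"
    # -s is true only for an existing file with a size greater than zero.
    if [ ! -s "$license_file" ]; then
        echo "License file '$license_file' is missing or empty" >&2
        return 1
    fi
}

# Usage (paths are placeholders):
#   check_license_file /path/to/license/file && \
#   docker run --rm -it -e RL_LICENSE="$(cat /path/to/license/file)" \
#     -v /path/on/host/threat-data:/rl/threat-data \
#     -v /path/on/host/tmp:/rl/tmp \
#     --net host file-inspection-engine:1.2.0
```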

If you’re not using the host network, you’ll need to map the container’s port to the host.

The HTTP server uses port 8000 by default, but you can change it:

  • To map the port to a different host port:

    docker run --rm -it -p 127.0.0.1:80:8000 file-inspection-engine:1.2.0
  • To change the HTTP port used by the container:

    docker run --rm -it -p 127.0.0.1:80:9000 file-inspection-engine:1.2.0 --http-address :9000
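
In both cases the container side of the -p mapping has to match the port the HTTP server listens on. The sketch below only prints the resulting command and keeps the two values in step; the helper name fie_run_command is hypothetical.

```shell
# Hypothetical helper: print a docker run command whose -p mapping and
# --http-address flag agree on the container port.
fie_run_command() {
    host_port="$1"
    container_port="${2:-8000}"   # 8000 is the documented default
    cmd="docker run --rm -it -p 127.0.0.1:${host_port}:${container_port} file-inspection-engine:1.2.0"
    # Only pass --http-address when the container port differs from the default.
    if [ "$container_port" != "8000" ]; then
        cmd="$cmd --http-address :${container_port}"
    fi
    printf '%s\n' "$cmd"
}

fie_run_command 80        # default container port
fie_run_command 80 9000   # custom container port
```

The two calls print the same commands as the examples above.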

Storage and mounting considerations

FIE uses two directories inside the container for storage:

  1. /rl/threat-data, which it uses to assign file classifications.
  2. /rl/tmp, which it uses to store file uploads, unpacked files, and file analysis reports.

The /rl/threat-data directory holds roughly 20 GB of malicious data and 1 GB of suspicious data. Additional free space is needed during updates, because files are downloaded in full before they replace the existing ones.
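
A quick free-space check on the host directory can catch sizing problems before the first synchronization. This is a sketch only: the helper name has_free_space is hypothetical, and the 45 GB threshold in the usage comment is an illustrative assumption (about twice the data size plus a margin), not a documented requirement.

```shell
# Hypothetical helper: succeed only if the filesystem holding DIR has at
# least REQUIRED_GB gigabytes available.
has_free_space() {
    dir="$1"
    required_gb="$2"
    # df -P prints one line per filesystem; -k reports 1024-byte blocks.
    # Column 4 of the second line is the available space.
    available_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    required_kb=$((required_gb * 1024 * 1024))
    [ "$available_kb" -ge "$required_kb" ]
}

# Usage (path and threshold are placeholders):
#   has_free_space /path/on/host/threat-data 45 || echo "Not enough free space" >&2
```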

Threat data synchronization starts shortly after the application is up and running and continues at regular intervals, configurable via the --cloud-update-interval parameter.

Initial synchronization involves larger files, while subsequent updates use incremental changes (typically < 100 KB per segment). Data is divided into 256 segments per classification, and each segment may require multiple updates, which can increase the total download size to several megabytes, especially with less frequent updates.
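
As a rough worked example, the figures above give an upper bound for one round of incremental updates to a single classification, assuming every one of the 256 segments changes and each update is at the 100 KB limit. Real updates are typically smaller, so this is a ceiling rather than an expected value.

```shell
segments=256          # segments per classification (from the text above)
max_update_kb=100     # assumed upper bound for one incremental update
total_kb=$((segments * max_update_kb))
echo "${total_kb} KB, about $((total_kb / 1024)) MB"
```

This prints "25600 KB, about 25 MB". When updates run infrequently and segments need several updates each, the total grows by the same multiple.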

Because the initial synchronization downloads the full data set, a container started "bare" (without any threat data mounted at startup) must first pull in around 20 GB of data every time it starts. This radically decreases the performance of FIE, so mounting an external volume is essential, for example:

docker run --rm -it -e RL_LICENSE="$(cat <license file>)" -v /path/on/host:/rl/threat-data file-inspection-engine:1.2.0

This allows reusing threat data between containers, for example by transferring it to an air-gapped instance. Mounting an external volume also means that you avoid the performance costs associated with writing to disk inside the container.

warning

Reusing threat data must be done in read-only mode.

By default, each FIE container continuously monitors and updates its /rl/threat-data directory. When several containers reuse the same threat data, their writes to that directory can interfere with one another. Therefore, make sure that containers which reuse the same source of threat data do not write to it.

This can be accomplished by turning off cloud updates for all containers which reuse the same data.

Note: Do not share threat data between containers even when only one of them is writing to it. The read-only containers could read stale or partially updated data while an update is in progress. Because each container is aware only of its own threat data updates, it cannot detect that another container is in the middle of an update.
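
Docker can additionally enforce the no-write rule at the mount level by appending :ro to the volume argument. The sketch below only prints such an argument. The helper name is hypothetical, and whether FIE runs cleanly against a read-only mount once cloud updates are turned off is an assumption to verify in your environment.

```shell
# Hypothetical helper: print the -v argument for a read-only bind mount
# of the threat data directory.
readonly_threat_data_mount() {
    host_dir="$1"
    # The :ro suffix makes Docker mount the directory read-only in the container.
    printf '%s\n' "-v ${host_dir}:/rl/threat-data:ro"
}

readonly_threat_data_mount /path/on/host
```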

Selecting the mount type

The two main factors to consider when choosing a mount type are persistence and speed.

You want the /rl/threat-data directory to be persistent and have good read speed (as that's where the application will look when classifying files), and you want good write speed for /rl/tmp.

If you're working directly with the threat data (as described in the air-gapped instance section), select a regular bind mount. This allows you to freely interact with the downloaded data from the host system.

You could also select a Docker volume if you need a persistent source of data, but do not intend to directly interact with it.

For /rl/tmp, persistence is not important, but write speed is. A tmpfs mount is a good fit: the underlying static analysis engine performs many disk writes, and because tmpfs mounts are held entirely in RAM, they offer the highest write throughput. Note, however, that this requires allocating more RAM than, for example, a bind mount.
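
One way to combine these choices is Docker's --mount flag, with a named volume for the threat data and a tmpfs mount for /rl/tmp. This is a sketch under stated assumptions: the volume name fie-threat-data is hypothetical, and the 4 GiB tmpfs size is illustrative, not a documented requirement. The script prints the command for review rather than running it.

```shell
# Illustrative size for the RAM-backed /rl/tmp mount (an assumption):
# 4 GiB, expressed in bytes as tmpfs-size expects.
tmpfs_bytes=$((4 * 1024 * 1024 * 1024))

# "fie-threat-data" is a hypothetical Docker volume name.
threat_data_mount="type=volume,source=fie-threat-data,destination=/rl/threat-data"
tmp_mount="type=tmpfs,destination=/rl/tmp,tmpfs-size=${tmpfs_bytes}"

# "-e RL_LICENSE" without a value forwards the variable from the calling
# shell, where it must already be exported.
echo "docker run --rm -it -e RL_LICENSE --mount ${threat_data_mount} --mount ${tmp_mount} file-inspection-engine:1.2.0"
```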

Manual threat data synchronization

The File Inspection Engine retrieves updates automatically.

If you want to pre-download threat data so your customers can start using it immediately, or if you prefer to manually sync the data, use the threat-data command included in the image. This command is also used to download threat data in air-gapped environments.

If manual threat data updates occur less often than once per week, incremental updates may take longer than a full database download. Performance depends on system resources, network bandwidth, and the deployment environment. Incremental updates are the recommended default; if they prove slow, weigh these factors and opt for a full download where necessary.

Supported Options

The threat-data command supports the following options in addition to username and password:

  • RL_PARANOID_MODE Download data collection for suspicious files.
  • RL_PROXY_ADDRESS Specify a proxy server address if you need to connect to the cloud via a proxy.
  • RL_RETRY_COUNT The number of retries if a segment fails to download during update.
  • RL_LOG_JSON Defines the log output format as either JSON or colored plain text.

Sync Command

To manually sync the threat data, use the sync sub-command, which requires specifying the threat data directory:

./threat-data sync /threat/data/dir

To execute this via Docker, run:

docker run --rm -it \
-e RL_CLOUD_USERNAME=username \
-e RL_CLOUD_PASSWORD=password \
-e RL_PARANOID_MODE=true \
-v ./external/dir:/rl/threat-data:z \
--entrypoint ./threat-data \
registry.reversinglabs.com/fie/file-inspection-engine:1.2.0 \
sync /rl/threat-data

If you need to treat suspicious files as malicious, make sure to set the RL_PARANOID_MODE option to true in the command.

Important:

  • The threat-data command only supports configuration via environment variables.
  • We recommend pre-downloading the threat data once and including it in your distribution for multiple users, as a full threat data download is more resource-intensive than incremental updates.
  • Do not run the threat-data command concurrently with the application if both are accessing the same directory.
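
Since the threat-data command is configured only through environment variables, a small wrapper can check that the credentials are present before anything runs. The sketch below prints the command instead of executing it; the function name threat_data_sync_command is hypothetical. It relies on Docker's behavior of forwarding a variable from the calling environment when -e is given a name without a value, which keeps the password out of the command line. It omits -it on the assumption that the sync is run from a script without a terminal.

```shell
# Hypothetical wrapper: print the sync command only if both credentials
# are set in the calling shell.
threat_data_sync_command() {
    host_dir="$1"
    if [ -z "${RL_CLOUD_USERNAME:-}" ] || [ -z "${RL_CLOUD_PASSWORD:-}" ]; then
        echo "RL_CLOUD_USERNAME and RL_CLOUD_PASSWORD must be set" >&2
        return 1
    fi
    # "-e NAME" forwards NAME from the calling environment, so both
    # variables must be exported before the printed command is run.
    printf '%s\n' "docker run --rm -e RL_CLOUD_USERNAME -e RL_CLOUD_PASSWORD -v ${host_dir}:/rl/threat-data:z --entrypoint ./threat-data registry.reversinglabs.com/fie/file-inspection-engine:1.2.0 sync /rl/threat-data"
}

# Usage:
#   export RL_CLOUD_USERNAME=username RL_CLOUD_PASSWORD=password
#   threat_data_sync_command ./external/dir
```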

Air-gapped manual threat data synchronization

For air-gapped environments, follow the process below to synchronize threat data. First, download the threat data on a machine with internet access, then transfer the data to the air-gapped instance.

  1. Start a File Inspection Engine (FIE) instance on a machine with internet access. Once the data sync is complete, stop the FIE instance that was used for downloading, and then proceed to step 2.

    Alternatively, run the following command to manually sync the threat data:

    docker run --rm -it \
    -e RL_CLOUD_USERNAME=username \
    -e RL_CLOUD_PASSWORD=password \
    -v /external/dir:/rl/threat-data \
    --entrypoint ./threat-data \
    file-inspection-engine:1.2.0 \
    sync /rl/threat-data

    /external/dir represents the path on the host system where the threat data is stored. If the directory contains older threat data, it will be incrementally updated.

    note

    If using paranoid mode, set the environment variable RL_PARANOID_MODE=true.

    Upon successful synchronization, the log should show the message "Threat data fully updated". If errors occur, rerun the command to retry. Then proceed to step 2.

  2. Stop a production FIE instance (or create a new one) in the air-gapped environment.

  3. Copy the threat data from /external/dir on the internet-connected machine to the corresponding threat data directory used by the air-gapped FIE instance. Ensure that the transferred data is placed in the directory where the application would normally download it if it were online. For further assistance, contact ReversingLabs Support.

  4. Restart or deploy the air-gapped FIE instance with the updated threat intelligence data.
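
The copy in step 3 can be done in any way that preserves the directory contents. One possible approach, sketched below, packs the directory into a single archive with a checksum so the transfer can be verified on the air-gapped side. The helper names are hypothetical, and the sketch assumes GNU coreutils (sha256sum) and tar are available on both machines.

```shell
# Hypothetical helper for the internet-connected machine: archive the
# threat data directory and write a checksum file next to the archive.
pack_threat_data() {
    src_dir="$1"
    archive="$2"
    tar -czf "$archive" -C "$src_dir" . &&
    # Record the checksum against the bare file name so it still verifies
    # after the files are copied to a different directory.
    (cd "$(dirname "$archive")" &&
        sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256")
}

# Hypothetical helper for the air-gapped machine: verify the checksum,
# then extract into the threat data directory.
unpack_threat_data() {
    archive="$1"
    dest_dir="$2"
    (cd "$(dirname "$archive")" &&
        sha256sum -c "$(basename "$archive").sha256") &&
    mkdir -p "$dest_dir" &&
    tar -xzf "$archive" -C "$dest_dir"
}

# Usage (paths are placeholders):
#   pack_threat_data /external/dir /transfer/threat-data.tar.gz
#   unpack_threat_data /transfer/threat-data.tar.gz /path/on/host/threat-data
```

Keep the archive outside the directory being packed, and extract only while the air-gapped FIE instance is stopped, as described in step 2.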