docs: update air-gapped docs (aquasecurity#7160)
Signed-off-by: knqyf263 <[email protected]>
Co-authored-by: knqyf263 <[email protected]>
itaysk and knqyf263 authored Aug 9, 2024
1 parent 59c1541 commit 08cc14b
Showing 11 changed files with 178 additions and 183 deletions.
228 changes: 124 additions & 104 deletions docs/docs/advanced/air-gap.md
@@ -1,142 +1,162 @@
# Air-Gapped Environment
# Advanced Network Scenarios

Trivy can be used in air-gapped environments. Note that an allowlist is [here][allowlist].
Trivy needs to connect to the internet occasionally in order to download relevant content. This document explains Trivy's network connectivity requirements and how to set up Trivy in particular scenarios.

## Air-Gapped Environment for vulnerabilities
## Network requirements

### Download the vulnerability database
At first, you need to download the vulnerability database for use in air-gapped environments.
Trivy's databases are distributed as OCI images via GitHub Container Registry (GHCR):

=== "Trivy"
- <https://ghcr.io/aquasecurity/trivy-db>
- <https://ghcr.io/aquasecurity/trivy-java-db>
- <https://ghcr.io/aquasecurity/trivy-checks>

```
TRIVY_TEMP_DIR=$(mktemp -d)
trivy --cache-dir $TRIVY_TEMP_DIR image --download-db-only
tar -cf ./db.tar.gz -C $TRIVY_TEMP_DIR/db metadata.json trivy.db
rm -rf $TRIVY_TEMP_DIR
```
The following hosts are required in order to fetch them:

=== "oras >= v0.13.0"
Please follow [oras installation instruction][oras].
- `ghcr.io`
- `pkg-containers.githubusercontent.com`

Download `db.tar.gz`:
The databases are pulled by Trivy using the [OCI Distribution](https://github.com/opencontainers/distribution-spec) specification, which is a simple HTTPS-based protocol.

```
$ oras pull ghcr.io/aquasecurity/trivy-db:2
```
[VEX Hub](https://github.com/aquasecurity/vexhub) is distributed from GitHub over HTTPS.
The following hosts are required in order to fetch it:

=== "oras < v0.13.0"
Please follow [oras installation instruction][oras].
- `api.github.com`
- `codeload.github.com`
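
A quick way to confirm that a firewall or proxy permits the hosts listed above is to attempt an HTTPS connection to each one. The following is a minimal sketch, assuming `curl` is installed; the function names and the `REQUIRED_HOSTS` variable are illustrative and not part of Trivy.

```shell
# Hosts taken from the lists above: the first two serve the OCI databases,
# the last two serve VEX Hub.
REQUIRED_HOSTS="ghcr.io pkg-containers.githubusercontent.com api.github.com codeload.github.com"

# Succeed if curl gets an HTTP response from the host.
# HTTP error statuses still count as reachable because --fail is not used.
check_host() {
  curl --silent --output /dev/null --connect-timeout 5 "https://$1"
}

# Print one "ok" or "FAIL" line per required host.
check_all() {
  for host in $REQUIRED_HOSTS; do
    if check_host "$host"; then
      echo "ok   $host"
    else
      echo "FAIL $host"
    fi
  done
}
```

Calling `check_all` on the machine that runs Trivy prints one line per host. A `FAIL` line only shows that this probe could not connect; it does not distinguish between DNS, proxy, and TLS interception problems.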

Download `db.tar.gz`:
## Running Trivy in air-gapped environment

```
$ oras pull -a ghcr.io/aquasecurity/trivy-db:2
```
An air-gapped environment refers to situations where the network connectivity from the machine Trivy runs on is blocked or restricted.

### Download the Java index database[^1]
Java users also need to download the Java index database for use in air-gapped environments.
In an air-gapped environment it is your responsibility to update the Trivy databases on a regular basis.

!!! note
Your container image may contain JAR files even though you don't use Java directly.
In that case, you also need to download the Java index database.
## Offline Mode

=== "Trivy"
By default, Trivy will attempt to download the latest databases. If the download fails, the scan might fail. To avoid this behavior, you can tell Trivy not to attempt to download database files:

```
TRIVY_TEMP_DIR=$(mktemp -d)
trivy --cache-dir $TRIVY_TEMP_DIR image --download-java-db-only
tar -cf ./javadb.tar.gz -C $TRIVY_TEMP_DIR/java-db metadata.json trivy-java.db
rm -rf $TRIVY_TEMP_DIR
```
=== "oras >= v0.13.0"
Please follow [oras installation instruction][oras].
- `--skip-db-update` to skip updating the main vulnerability database.
- `--skip-java-db-update` to skip updating the Java index database.
- `--skip-check-update` to skip updating the misconfiguration database.

Download `javadb.tar.gz`:
```shell
trivy image --skip-db-update --skip-java-db-update --offline-scan --skip-check-update myimage
```
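
The command above also passes `--offline-scan`, which is not in the flag list: by default Trivy issues API requests when scanning Java applications (for example `pom.xml` dependencies), and this flag suppresses them. To avoid retyping the flags, they can be wrapped in a small shell function. This is only a sketch; the function name and the `TRIVY_BIN` variable are hypothetical conveniences, not part of Trivy.

```shell
# Run an image scan with every download and API request disabled.
# TRIVY_BIN (hypothetical) lets you point at a specific trivy binary.
# Usage: trivy_offline_image <image> [extra trivy flags...]
trivy_offline_image() {
  "${TRIVY_BIN:-trivy}" image \
    --skip-db-update \
    --skip-java-db-update \
    --skip-check-update \
    --offline-scan \
    "$@"
}
```

With this defined, `trivy_offline_image myimage` is equivalent to the full command shown above.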

## Self-Hosting

```
$ oras pull ghcr.io/aquasecurity/trivy-java-db:1
```
### OCI Databases

=== "oras < v0.13.0"
Please follow [oras installation instruction][oras].
You can host the databases on your own local OCI registry.

Download `javadb.tar.gz`:
First, make a copy of the databases in a container registry that is accessible to Trivy. The databases are in:

```
$ oras pull -a ghcr.io/aquasecurity/trivy-java-db:1
```
- `ghcr.io/aquasecurity/trivy-db:2`
- `ghcr.io/aquasecurity/trivy-java-db:1`
- `ghcr.io/aquasecurity/trivy-checks:0`
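
One way to make the copies is with ORAS. The sketch below assumes a recent ORAS release that provides the `oras cp` command, and that you are already logged in to the target registry if it requires authentication. `myregistry.local`, `TARGET_REGISTRY`, `DRY_RUN`, and the function names are placeholders chosen for this example.

```shell
# Target registry is a placeholder; override it with your own host.
TARGET_REGISTRY="${TARGET_REGISTRY:-myregistry.local}"

# Print one "source destination" pair per Trivy database artifact.
mirror_pairs() {
  for ref in trivy-db:2 trivy-java-db:1 trivy-checks:0; do
    printf '%s %s\n' "ghcr.io/aquasecurity/${ref}" "${TARGET_REGISTRY}/${ref}"
  done
}

# Copy each artifact with ORAS; set DRY_RUN=1 to only print the commands.
mirror_all() {
  mirror_pairs | while read -r src dst; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "oras cp ${src} ${dst}"
    else
      oras cp "${src}" "${dst}"
    fi
  done
}
```

Running `DRY_RUN=1 mirror_all` first shows the three copy commands without contacting any registry. The tags are preserved on the destination, so the repository names can be used without a tag in Trivy's repository flags only if Trivy's default tags match those used here.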

Then, tell Trivy to use the local registry:

### Transfer the DB files into the air-gapped environment
The way of transfer depends on the environment.
```shell
trivy image \
--db-repository myregistry.local/trivy-db \
--java-db-repository myregistry.local/trivy-java-db \
--checks-bundle-repository myregistry.local/trivy-checks \
myimage
```

=== "Vulnerability db"
```
$ rsync -av -e ssh /path/to/db.tar.gz [user]@[host]:dst
```
#### Authentication

=== "Java index db[^1]"
```
$ rsync -av -e ssh /path/to/javadb.tar.gz [user]@[host]:dst
```
If the registry requires authentication, you can configure it as described in the [private registry authentication document](../advanced/private-registries/index.md).

### Put the DB files in Trivy's cache directory
You have to know where to put the DB files. The following command shows the default cache directory.
### VEX Hub

You can host a copy of VEX Hub on your own internal server.

First, make a copy of VEX Hub in a location that is accessible to Trivy.

1. Download the [VEX Hub](https://github.com/aquasecurity/vexhub) archive from: <https://github.com/aquasecurity/vexhub/archive/refs/heads/main.zip>.
1. Download the [VEX Hub Repository Manifest](https://github.com/aquasecurity/vex-repo-spec#2-repository-manifest) file from: <https://github.com/aquasecurity/vexhub/blob/main/vex-repository.json>.
1. Create or identify an internal HTTP server that can serve the VEX Hub repository in your environment (e.g., `https://server.local`).
1. Make the downloaded archive file available for serving from your server (e.g., `https://server.local/main.zip`).
1. Change the [Location URL](https://github.com/aquasecurity/vex-repo-spec?tab=readme-ov-file#locations-subfields) field in the downloaded manifest file to the URL of the archive file on your server (e.g., `url: https://server.local/main.zip`).
1. Make the manifest file available for serving from your server under the `/.well-known` path (e.g., `https://server.local/.well-known/vex-repository.json`).
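
The download and manifest-rewrite steps above can be sketched in shell. This assumes `curl` and `sed` are available and that the downloaded manifest contains the upstream archive URL verbatim; the function names are illustrative, and `https://server.local` is the same placeholder used in the steps.

```shell
UPSTREAM_ARCHIVE_URL="https://github.com/aquasecurity/vexhub/archive/refs/heads/main.zip"

# Download the VEX Hub archive into the current directory as main.zip.
download_archive() {
  curl --fail --location --output main.zip "$UPSTREAM_ARCHIVE_URL"
}

# Print the manifest with the upstream archive URL replaced by an internal one.
# Assumption: the upstream URL appears verbatim in the manifest, and the
# internal URL contains no "|" or "&" characters (they are special to sed).
# Usage: rewrite_manifest_url <manifest-file> <internal-archive-url>
rewrite_manifest_url() {
  sed "s|${UPSTREAM_ARCHIVE_URL}|$2|g" "$1"
}
```

For example, `rewrite_manifest_url vex-repository.json https://server.local/main.zip` writes the modified manifest to standard output, which can then be redirected into the `.well-known` directory of your server's document root. It is worth inspecting the result to confirm that the URL was actually replaced.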

Then, tell Trivy to use the local VEX Repository:

1. Locate your [Trivy VEX configuration file](../supply-chain/vex/repo/#configuration-file) by running `trivy vex repo init`. Make the following changes to the file.
1. Disable the default VEX Hub repo (`enabled: false`)
1. Add your internal VEX Hub repository as a [custom repository](../supply-chain/vex/repo/#custom-repositories) with the URL pointing to your local server (e.g., `url: https://server.local`).

#### Authentication

If your server requires authentication, you can configure it as described in the [VEX Repository Authentication document](../supply-chain/vex/repo/#authentication).

## Manual cache population

You can also download the database files manually and populate the Trivy cache directory with them yourself.

### Downloading the DB files

On a machine with internet access, pull the database container archive from the public registry into your local workspace:

Note that these examples operate in the current working directory.

=== "Using ORAS"
This example uses [ORAS](https://oras.land), but you can use any other container registry manipulation tool.

```shell
oras pull ghcr.io/aquasecurity/trivy-db:2
```
$ ssh user@host
$ trivy -h | grep cache
--cache-dir value cache directory (default: "/home/myuser/.cache/trivy") [$TRIVY_CACHE_DIR]
```
=== "Vulnerability db"
Put the DB file in the cache directory + `/db`.

```
$ mkdir -p /home/myuser/.cache/trivy/db
$ cd /home/myuser/.cache/trivy/db
$ tar xvf /path/to/db.tar.gz -C /home/myuser/.cache/trivy/db
x trivy.db
x metadata.json
$ rm /path/to/db.tar.gz
```

=== "Java index db[^1]"
Put the DB file in the cache directory + `/java-db`.

```
$ mkdir -p /home/myuser/.cache/trivy/java-db
$ cd /home/myuser/.cache/trivy/java-db
$ tar xvf /path/to/javadb.tar.gz -C /home/myuser/.cache/trivy/java-db
x trivy-java.db
x metadata.json
$ rm /path/to/javadb.tar.gz
```



In an air-gapped environment it is your responsibility to update the Trivy databases on a regular basis, so that the scanner can detect recently-identified vulnerabilities.

### Run Trivy with the specific flags.
In an air-gapped environment, you have to specify `--skip-db-update` and `--skip-java-db-update`[^1] so that Trivy doesn't attempt to download the latest database files.
In addition, if you want to scan `pom.xml` dependencies, you need to specify `--offline-scan` since Trivy tries to issue API requests for scanning Java applications by default.

You should now have a file called `db.tar.gz`. Next, extract it to reveal the db files:

```shell
tar -xzf db.tar.gz
```
$ trivy image --skip-db-update --skip-java-db-update --offline-scan alpine:3.12

You should now have 2 new files, `metadata.json` and `trivy.db`. These are the Trivy DB files.

=== "Using Trivy"
This example uses Trivy to pull the database container archive. The `--cache-dir` flag makes Trivy download the database files into our current working directory. The `--download-db-only` flag tells Trivy to only download the database files, not to scan any images.

```shell
trivy image --cache-dir . --download-db-only
```

## Air-Gapped Environment for misconfigurations
You should now have 2 new files, `metadata.json` and `trivy.db`. These are the Trivy DB files. Copy them over to the air-gapped environment.

No special measures are required to detect misconfigurations in an air-gapped environment.
### Populating the Trivy Cache

### Run Trivy with `--skip-check-update` option
In an air-gapped environment, specify `--skip-check-update` so that Trivy doesn't attempt to download the latest misconfiguration checks.
In order to populate the cache, you need to identify the location of the cache directory. If it is under the default location, you can run the following command to find it:

```shell
trivy -h | grep cache
```
$ trivy conf --skip-policy-update /path/to/conf

For the example, we will assume the `TRIVY_CACHE_DIR` variable holds the cache location:

```shell
TRIVY_CACHE_DIR=/home/user/.cache/trivy
```

[allowlist]: ../references/troubleshooting.md
[oras]: https://oras.land/docs/installation
Put the Trivy DB files in the Trivy cache directory under a `db` subdirectory:

```shell
# ensure cache db directory exists
mkdir -p ${TRIVY_CACHE_DIR}/db
# copy the db files
cp /path/to/trivy.db /path/to/metadata.json ${TRIVY_CACHE_DIR}/db/
```

### Java DB

For Java DB the process is the same, except for the following:

1. Image location is `ghcr.io/aquasecurity/trivy-java-db:1`
2. Archive file name is `javadb.tar.gz`
3. DB file name is `trivy-java.db`
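
If you transfer the archives rather than the extracted files, extraction and cache population can be combined into one step for both databases. The following is a sketch only: `populate_cache` is a hypothetical helper, and the `java-db` subdirectory name is taken from the earlier version of this page rather than from the steps above.

```shell
# Extract a downloaded database archive into the matching cache subdirectory.
# Usage: populate_cache <archive> <cache-dir> <db|java-db>
populate_cache() {
  archive=$1
  cache_dir=$2
  subdir=$3
  # Ensure the target subdirectory exists before extracting into it.
  mkdir -p "${cache_dir}/${subdir}"
  tar -xzf "${archive}" -C "${cache_dir}/${subdir}"
}
```

For instance, `populate_cache /path/to/db.tar.gz "${TRIVY_CACHE_DIR}" db` handles the vulnerability database, and `populate_cache /path/to/javadb.tar.gz "${TRIVY_CACHE_DIR}" java-db` handles the Java DB.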

## Misconfigurations scanning

Note that the misconfigurations checks bundle is also embedded in the Trivy binary (at build time), and will be used as a fallback if the external database is not available. This means that you can still scan for misconfigurations in an air-gapped environment using the Checks from the time of the Trivy release you are using.

[^1]: This is only required to scan `jar` files. More information about `Java index db` [here](../coverage/language/java.md)
The misconfiguration scanner can be configured to load checks from a local directory, using the `--config-check` flag. In an air-gapped scenario you can copy the checks library from [Trivy checks repository](https://github.com/aquasecurity/trivy-checks) into a local directory, and load it with this flag. See more in the [Misconfiguration scanner documentation](../scanner/misconfiguration/index.md).
2 changes: 1 addition & 1 deletion docs/docs/compliance/contrib-compliance.md
@@ -35,7 +35,7 @@ Additional information is provided below.

#### 1. Referencing a check that is already part of Trivy

Trivy has a comprehensive list of checks as part of its misconfiguration scanning. These can be found in the `trivy-policies/checks` directory ([Link](https://github.com/aquasecurity/trivy-checks/tree/main/checks)). If the check is present, the `AVD_ID` and other information from the check has to be used.
Trivy has a comprehensive list of checks as part of its misconfiguration scanning. These can be found in the `trivy-checks/checks` directory ([Link](https://github.com/aquasecurity/trivy-checks/tree/main/checks)). If the check is present, the `AVD_ID` and other information from the check has to be used.

Note: Take a look at the more generic compliance specs that are already available in Trivy. If you are adding new compliance spec to Kubernetes e.g. AWS EKS CIS Benchmarks, chances are high that the check you would like to add to the new spec has already been defined in the general `k8s-ci-v.000.yaml` compliance spec. The same applies for creating specific Cloud Provider Compliance Specs and the [generic compliance specs](https://github.com/aquasecurity/trivy-checks/tree/main/specs/compliance) available.

6 changes: 2 additions & 4 deletions docs/docs/references/troubleshooting.md
@@ -203,10 +203,7 @@ Trivy v0.23.0 or later requires Trivy DB v2. Please update your local database o
!!! error
FATAL failed to download vulnerability DB

If trivy is running behind corporate firewall, you have to add the following urls to your allowlist.

- ghcr.io
- pkg-containers.githubusercontent.com
If Trivy is running behind a corporate firewall, refer to the connectivity requirements described [here][network].

### Denied

@@ -271,4 +268,5 @@ $ trivy clean --all
```

[air-gapped]: ../advanced/air-gap.md
[network]: ../advanced/air-gap.md#network-requirements
[redis-cache]: ../../vulnerability/examples/cache/#cache-backend
25 changes: 12 additions & 13 deletions docs/docs/scanner/misconfiguration/check/builtin.md
@@ -1,21 +1,20 @@
# Built-in Checks

## Check Sources
Built-in checks are mainly written in [Rego][rego] and Go.
Those checks are managed under [trivy-checks repository][trivy-checks].
## Checks Sources
Trivy has an extensive library of misconfiguration checks that is maintained at <https://github.com/aquasecurity/trivy-checks>.
Trivy checks are mainly written in [Rego][rego], while some checks are written in Go.
See [here](../../../coverage/iac/index.md) for the list of supported config types.

For suggestions or issues regarding policy content, please open an issue under the [trivy-checks][trivy-checks] repository.
## Checks Bundle
When performing a misconfiguration scan, Trivy will automatically download the relevant Checks bundle. The bundle is cached locally and Trivy will reuse it for subsequent scans on the same machine. Trivy takes care of updating the cache automatically, so normally users can be oblivious to it.

## Check Distribution
Trivy checks are distributed as an OPA bundle on [GitHub Container Registry][ghcr] (GHCR).
When misconfiguration detection is enabled, Trivy pulls the OPA bundle from GHCR as an OCI artifact and stores it in the cache.
Those checks are then loaded into Trivy OPA engine and used for detecting misconfigurations.
If Trivy is unable to pull down newer checks, it will use the embedded set of checks as a fallback. This is also the case in air-gap environments where `--skip-policy-update` might be passed.

## Update Interval
## Checks Distribution
Trivy checks are distributed as an [OPA bundle][opa-bundle] hosted in the following GitHub Container Registry: <https://ghcr.io/aquasecurity/trivy-checks>.
Trivy checks for updates to OPA bundle on GHCR every 24 hours and pulls it if there are any updates.

### External connectivity
Trivy needs to connect to the internet to download the bundle. If you are running Trivy in an air-gapped environment, or a tightly controlled network, please refer to the [Advanced Network Scenarios document](../../../advanced/air-gap.md).
The Checks bundle is also embedded in the Trivy binary (at build time), and will be used as a fallback if Trivy is unable to download the bundle. This means that you can still scan for misconfigurations in an air-gapped environment using the Checks from the time of the Trivy release you are using.

[rego]: https://www.openpolicyagent.org/docs/latest/policy-language/
[trivy-checks]: https://github.com/aquasecurity/trivy-checks
[ghcr]: https://github.com/aquasecurity/trivy-checks/pkgs/container/trivy-checks
[opa-bundle]: https://www.openpolicyagent.org/docs/latest/management-bundles/
22 changes: 10 additions & 12 deletions docs/docs/scanner/misconfiguration/custom/data.md
@@ -1,15 +1,14 @@
# Custom Data

Custom checks may require additional data in order to determine an answer.
Custom checks may require additional data in order to reach a decision. You can use the `--data` flag to pass arbitrary data files to Trivy for use when evaluating Rego checks.
Trivy recursively searches the specified data paths for JSON (`*.json`) and YAML (`*.yaml`) files.

For example, an allowed list of resources that can be created.
Instead of hardcoding this information inside your policy, Trivy allows passing paths to data files with the `--data` flag.
For example, consider an allowed list of resources that can be created.
Instead of hardcoding this information inside your policy, you can maintain the list in a separate file.

Given the following yaml file:
Example data file:

```bash
$ cd examples/misconf/custom-data
$ cat data/ports.yaml [~/src/github.com/aquasecurity/trivy/examples/misconf/custom-data]
```yaml
services:
ports:
- "20"
@@ -19,17 +18,16 @@ services:
- "23/tcp"
```
This can be imported into your policy:
Example usage in a Rego check:
```rego
import data.services

ports := services.ports
```

Then, you need to pass data paths through `--data` option.
Trivy recursively searches the specified paths for JSON (`*.json`) and YAML (`*.yaml`) files.
Example loading the data file:

```bash
$ trivy conf --policy ./policy --data data --namespaces user ./configs
```
trivy config --config-check ./checks --data ./data --namespaces user ./configs
```