Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(misconfig): Add flag to supply checks bundle locally #7029

Closed
simar7 opened this issue Jun 26, 2024 · 14 comments
Closed

feat(misconfig): Add flag to supply checks bundle locally #7029

simar7 opened this issue Jun 26, 2024 · 14 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. scan/misconfiguration Issues relating to misconfiguration scanning

Comments

@simar7
Copy link
Member

simar7 commented Jun 26, 2024

Today we have the --checks-bundle-repository flag that can be use to allow Trivy to download a bundle from a specified registry URL.

However there can be instances where a bundle is built locally and needs to be used without pushing up to a registry. For this purpose we can add a --checks-bundle-path to specify a bundle available from disk.

This is also needed for automating testing of every trivy-checks bundle that we release within the same pipeline on every PR.

@simar7 simar7 added kind/feature Categorizes issue or PR as related to a new feature. scan/misconfiguration Issues relating to misconfiguration scanning labels Jun 26, 2024
@simar7 simar7 added this to the v0.54.0 milestone Jun 26, 2024
@knqyf263
Copy link
Collaborator

I may be misguided as I don't understand the context very well, but I will leave a few comments.

However there can be instances where a bundle is built locally and needs to be used without pushing up to a registry.

For testing purposes, you can put the checks bundle under /path/to/cache/policy and update the metadata so it looks like the latest one.

policyDir: filepath.Join(cacheDir, "policy"),

trivy/pkg/policy/policy.go

Lines 163 to 166 in 0ccdbfb

// No need to update if it's been within a day since the last update.
if c.clock.Now().Before(meta.DownloadedAt.Add(updateInterval)) {
return false, nil
}

That's what we're doing for the vulnerability DB.

cacheDir := dbtest.InitDB(t, fixtures)
defer db.Close()
dbDir := filepath.Dir(db.Path(cacheDir))
metadataFile := filepath.Join(dbDir, "metadata.json")
f, err := os.Create(metadataFile)
require.NoError(t, err)
err = json.NewEncoder(f).Encode(metadata.Metadata{
Version: db.SchemaVersion,
NextUpdate: time.Now().Add(24 * time.Hour),
UpdatedAt: time.Now(),
})
require.NoError(t, err)

If you really need to configure the bundle path, we might want to introduce a local repository similar to Maven. Then, use --checks-bundle-repository file:///path/to/local/bundle.
https://maven.apache.org/guides/introduction/introduction-to-repositories.html

@simar7
Copy link
Member Author

simar7 commented Jun 27, 2024

Yeah we could do those changes to get around, I just thought that it would be nice (and probably easier for the user) to have a flag.

On a related note, my original motivation for adding a flag also partially came from reading the air gap install procedure we have today. It seems a little involved to me to have to poke around the file system given that we could enable the user to provide the database (or a bundle in case of misconfiguration scanning) to us with a flag.

@knqyf263
Copy link
Collaborator

On a related note, my original motivation for adding a flag also partially came from reading the air gap install procedure we have today. It seems a little involved to me to have to poke around the file system given that we could enable the user to provide the database (or a bundle in case of misconfiguration scanning) to us with a flag.

--cache-dir can configure the cache directory. Do you mean we should provide more granular options? In general, I'd keep DB-related flags consistent with the bundle flags. If we provide --checks-bundle-path, we should provide --db-path, --java-db-path, etc.

@simar7
Copy link
Member Author

simar7 commented Jun 27, 2024

Do you mean we should provide more granular options?

Yes. Although as you mentioned earlier, we can add maven style flags which enable file:// and https:// for local and remote.

In general, I'd keep DB-related flags consistent with the bundle flags

Yes likewise, I think we should add such a flag for both DB and bundle. If we go the maven style route (to re-use the existing flag), we can ensure that it is backwards compatible. If not, adding a new flag for both DB and bundle.

@knqyf263
Copy link
Collaborator

However there can be instances where a bundle is built locally and needs to be used without pushing up to a registry.

I generally understand the main points, but I'd like to delve a bit deeper into why the --cache-dir option is insufficient as the requirement you shared above is achieved with --cache-dir. I've thought of the following items, but since I haven't heard any specific complaints from users, I'd like to confirm your thoughts:

  1. Currently, the path after --cache-dir is fixed. For example, the vulnerability database must be stored in $CACHE_DIR/db/trivy.db, where the path db/trivy.db is fixed. Similarly, the checks bundle is fixed to $CACHE_DIR/policy/content. There may be a need to set these freely. For instance, something like --checks-bundle-path /myproject/checks.

  2. Currently, with --cache-dir, the root directory is fixed. There might be use cases where users want to place the vulnerability database and checks bundle in completely different paths. For example: --cache-dir /tmp/trivy --db-path /mycache/db/ --checks-bundle-path /myproject/checks

While these options would be convenient, considering the compatibility with trivy clean --all, it might be more convenient to place everything under $CACHE_DIR. On the other hand, --checks-bundle-path might be necessary precisely because users don't want it to be deleted as cache.

I'm generally open to this idea, but I want to understand the requirements correctly. For instance, if the first point is the main reason, it might be better to specify a relative path after $CACHE_DIR. Depending on the requirements, there might be different solutions.

@simar7
Copy link
Member Author

simar7 commented Jun 28, 2024

Sorry but yes you're right there are two different things to solve here.

  1. Cache directory (temporary, not user provided, e.g. scan results, plugin index, metadata etc.)
  2. Artifact directory (permanent, user provided, e.g. checks bundle and trivy db)

Today we basically use the cache directory for both purposes. Which is fine and for the purposes of trivy clean works better as we only have one location to clean up. For most purposes, a user IMO is less interested in knowing where the metadata or plugin index lives as compared to where the checks bundle or policy db is picked up from because the former cannot be supplied by them, but the latter can be.

My suggestion is simply only addressing the fact that today users need to understand the structure of cache dir (to be able to place artifacts in there) and to me that seems a little involved. As for caching the supplied bundle (and trivy db), we could still very well place it within the cache dir, which works well with trivy clean as we discussed above.

As far as testing is concerned, none of this matters as we can set it up the way you described above.

Let me know if it makes sense!

@knqyf263
Copy link
Collaborator

knqyf263 commented Jul 1, 2024

My suggestion is simply only addressing the fact that today users need to understand the structure of cache dir (to be able to place artifacts in there) and to me that seems a little involved.

Yes, I agree. The more artifacts that are cached, the more difficult it becomes to understand the directory structure. Is this a problem that can be avoided by careful documentation? Or would it be better to have a flag specifying a local path for each artifact, as you originally suggested?

As far as testing is concerned, none of this matters as we can set it up the way you described above.

Right, we are familiar with the directory structure, so we can just put the checks bundle (and other artifacts) in a defined path for testing.

So I think our focus in this discussion should be on the issue of users having to know the directory structure in an air-gapped environment. The --cache-dir flag should be sufficient in all but air-gapped environments, as artifacts are automatically downloaded and placed in the appropriate path.

@DmitriyLewen @nikpivkin Please feel free to share your thoughts.

@DmitriyLewen
Copy link
Contributor

I'm not sure we need this flag.
We already have a large number of flags and it’s better to try to make do with them (and teach users this).

I don't have much experience with checks-bundle, but in terms of source code, checks-bundle is very similar to trivy-db (we have our own repository, we check and update checks-bundle before every run).
So i would say that it will be enough to update docs (I think we have written quite detailed instructions for installing trivy-db (https://aquasecurity.github.io/trivy/v0.52/docs/advanced/air-gap/)).

On the other hand, if checks-bundles are not updated frequently and users often use their own checks-bundles - perhaps we can treat checks-bundles as plugins (i mean move them to ~/.trivy). But in this case, we must remove checks-bundle from the trivy clean and, update the logic for updating/removing them.

Second option is appropriate for me, but I am more inclined to the first option.

@simar7
Copy link
Member Author

simar7 commented Jul 3, 2024

Fair enough, thanks for the feedback! Based on this discussion let's start with the documentation first, I will add that in. I will close this issue.

@simar7 simar7 closed this as not planned Won't fix, can't repro, duplicate, stale Jul 3, 2024
@itaysk
Copy link
Contributor

itaysk commented Jul 12, 2024

I'm also wondering about this. we have --db-repository and --java-db-repository flags, why in the air-gapped doc we are telling people to surgically replace the db in the cache and not point trivy at the self-hosted db?

@knqyf263
Copy link
Collaborator

Do you mean they have access to the self-hosted DB in their air-gapped environment? We can mention that case in the doc.

@itaysk
Copy link
Contributor

itaysk commented Jul 12, 2024

yes that's what I meant. I guess we created those flags for this use case, but in the doc that was supposed to highlight them it's missing.

@knqyf263
Copy link
Collaborator

As far as I remember, the flag was originally added by GitLab or other companies so they can use their own DB, and was not for air-gapped. But yes, this flag must be useful for such use cases as well. We should document it.

@knqyf263 knqyf263 removed this from Trivy Roadmap Jul 23, 2024
@knqyf263 knqyf263 removed this from the v0.54.0 milestone Jul 23, 2024
@nikpivkin
Copy link
Contributor

@simar7 This PR extends the possible sources for databases and adds their download over HTTP(S). Therefore, there is no need to add support for a new flag to load the bundle from the local system, as it will be possible to simply add support for the file protocol. Is this still relevant?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. scan/misconfiguration Issues relating to misconfiguration scanning
Projects
None yet
Development

No branches or pull requests

5 participants