Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(misconf): Filtering findings for Terraform modules based on attributes #7180

Open
simar7 opened this issue Jul 17, 2024 Discussed in #7157 · 8 comments
Open

feat(misconf): Filtering findings for Terraform modules based on attributes #7180

simar7 opened this issue Jul 17, 2024 Discussed in #7157 · 8 comments
Assignees
Labels
kind/feature Categorizes issue or PR as related to a new feature. scan/misconfiguration Issues relating to misconfiguration scanning
Milestone

Comments

@simar7
Copy link
Member

simar7 commented Jul 17, 2024

TODO:

  • Trivy should support accessing individual blocks in the case of for_each or count variables. This would allow to target specific resources such as described in the example below.
  • In order to specify more granular ignores for the above attributes, Trivy should allow such a configuration to be passed in via the .trivyignore.yaml format for ease of use and readability.
  • While displaying the misconfigurations for such cases, we should expand the blocks to specifically mention which iterated item is being shown.

Discussed in #7157

Originally posted by acdha July 12, 2024

Description

Like many people, I use the terraform-aws-modules/vpc/aws module and that module creates a variety of resources using large lists for things like ACLs:

module "vpc" {
  source                 = "terraform-aws-modules/vpc/aws"
  version                = "~> 5.8"public_inbound_acl_rules = concat(
    [
      for i in range(length(local.default_ingress_acls)) :
      merge(
        {
          rule_number = 1000 + i,
          rule_action = local.default_ingress_acls[i]["action"]
        },
        local.default_ingress_acls[i]
      )
    ],
…

This creates a problem on every project when using Trivy because you'll get findings for the resources created using count or for_each blocks inside the module based on those lists:

CRITICAL: Network ACL rule allows ingress from public internet.

Opening up ACLs to the public internet is potentially dangerous. You should restrict access to IP addresses or ranges that explicitly require it where possible.

See https://avd.aquasec.com/misconfig/avd-aws-0105

 terraform-aws-modules/vpc/aws/.terraform/modules/vpc/main.tf:204
   via terraform-aws-modules/vpc/aws/.terraform/modules/vpc/main.tf:191-206 (aws_network_acl_rule.public_inbound[13])
    via vpc.tf:55-250 (module.vpc)

 191   resource "aws_network_acl_rule" "public_inbound" {
 ...   
 204 [   cidr_block      = lookup(var.public_inbound_acl_rules[count.index], "cidr_block", null)
 ...   
 206   }

(this also repeats for AVD-AWS-102, etc.)

I think Trivy needs explicit support for filtering things like this safely. The current design for attribute filtering does not support indexed resources (as every input to that module uses) and since these rules are usually false-positives (most people use AWS to host public applications so they'll need to support ingress HTTP, HTTPS, ICMP for Path MTU discovery, etc.), it's really tempting to just toss #trivy:ignore:avd-aws-0102 trivy:ignore:avd-aws-0105 into that file — greatly increasing the odds of Trivy not catching an actual security problem when someone adds another rule – and even if indexed access were possible, it'd be risky because the indexes are not stable (e.g. in my example above, some of the rules come from a different module which has common firewall rules and an update could easily shift all of the index numbers).

Since Trivy has full knowledge of the resources created in a module, it seems like it should be possible to have some way to use that hierarchy to select a resource which would otherwise not be in scope at this point:

# #trivy:ignore:avd-aws-0102[module_resource_name=aws_network_acl_rule.public_inbound,type=ingress,protocol=1]
module "vpc" {
  source                 = "terraform-aws-modules/vpc/aws"

I don't love that syntax and it seems like it might be preferable to say that complex filtering has to be done in a .trivyignore.yaml file where it could look more like this:

- id: AVD-AWS-0102
  statement: ICMP Path MTU discovery
  paths:
    - vpc.tf
  resources:
    - path: module.vpc.aws_network_acl_rule.public_inbound
      attributes:
        type: ingress
        protocol: 1
- id: AVD-AWS-0105
  statement: Public web app ingress HTTPS
  paths:
    - vpc.tf
  resources:
    - path: module.vpc.aws_network_acl_rule.public_inbound
      attributes:
        type: ingress
        protocol: 6
        from_port: 443
        to_port: 443

One other thought which came out of this is that the error messages could be improved somewhat to list the attributes of the resource in question. When you have large lists being fed into a loop it'd be really nice to be able to see cidr_block=0.0.0.0/0 from_port=22 to_port=22 so you could easily know which source definitely it refers to rather than having to count them.

Target

None

Scanner

Misconfiguration

@simar7 simar7 added kind/feature Categorizes issue or PR as related to a new feature. scan/misconfiguration Issues relating to misconfiguration scanning labels Jul 17, 2024
@nikpivkin
Copy link
Contributor

nikpivkin commented Jul 17, 2024

@simar7 I'll divide this into 3 tasks.

  1. I implemented filtering by count and each objects, but we decided not to add that. It would be easy to add it. 835351d
  2. I think we need to have a better discussion about the file scheme. We need to somehow distinguish resources from different projects. For example, there could be a case where the scan target is a directory containing unrelated root modules foo and bar, which have resources with the same logical path. Maybe we should add some kind of entrypoint field? The path field for a resource makes no sense if it is in a remote module. Only the logical path is enough.
  3. Right now we don't display the actual value of attributes in the report. What would it look like? Now for displaying misconfig we just read the file and display a piece of code according to the location of the finding. In order to display the expanded block in the report, we need to modify the HCL file, overwrite the file in the file system, and update the finding metadata to refer to the new location.

@simar7 simar7 added this to the v0.55.0 milestone Aug 28, 2024
@knqyf263 knqyf263 modified the milestones: v0.55.0, v0.56.0 Sep 5, 2024
@simar7 simar7 modified the milestones: v0.56.0, v0.57.0 Sep 26, 2024
@nikpivkin
Copy link
Contributor

The first point has already been implemented. What about 2 and 3?

@simar7 simar7 modified the milestones: v0.57.0, v0.58.0 Oct 29, 2024
@simar7
Copy link
Member Author

simar7 commented Nov 1, 2024

Maybe we should add some kind of entrypoint field?

This sounds like it needs uniqueness right? What if we add the root module itself as the entry point? What other options do we have?

In order to display the expanded block in the report, we need to modify the HCL file, overwrite the file in the file system, and update the finding metadata to refer to the new location.

Actually as I thought of this more, expanding HCL and displaying has another problem of shifting indexes. This may cause the output to be indeterministic. I see two ways here that we can go:

  1. Keep it as is for the output when it comes to showing count or for_each types.
  2. Group results for count and for_each types. This is similar to the idea we had about "similar results" but a little less involved as only results in a single loop are labeled together. We can think of a way to output this meaningfully. Maybe something like the following:
CRITICAL: Network ACL rule allows ingress from public internet.

Opening up ACLs to the public internet is potentially dangerous. You should restrict access to IP addresses or ranges that explicitly require it where possible.

See https://avd.aquasec.com/misconfig/avd-aws-0105

 terraform-aws-modules/vpc/aws/.terraform/modules/vpc/main.tf:204
   via terraform-aws-modules/vpc/aws/.terraform/modules/vpc/main.tf:191-206 (aws_network_acl_rule.public_inbound[13])
    via vpc.tf:55-250 (module.vpc)
+      and 6 similar results originating from

 191   resource "aws_network_acl_rule" "public_inbound" {
 ...   
 204 [   cidr_block      = lookup(var.public_inbound_acl_rules[count.index], "cidr_block", null)
 ...   
 206   }

Just an idea, I'm open to other ideas as well.

@nikpivkin
Copy link
Contributor

nikpivkin commented Nov 1, 2024

I think the 3rd point is not about grouping results (we already have an issue for that). Right now, the user needs to evaluate the attribute values themselves to determine which parameters caused the misconfiguration in one of the resource instances created with for-each or count.

For example, in the following expression cidr_block = lookup(var.public_inbound_acl_rules[count.index], “cidr_block”, null), you must first define an index, then look up the rule at that index to find out which specific CIDR caused the incorrect setting. If the attribute expression is more complex, this complicates the process of analyzing and identifying the cause.

@simar7
Copy link
Member Author

simar7 commented Nov 1, 2024

If we handle the order of indexes to be deterministic always (sort), what if we build the HCL in memory? We can omit line numbers from such a case (or use the ones from original file). The only difference in this output compared to what we have will be the populated indexes and values, rest should remain the same.

We could limit this to a certain extent maybe by only expanding for a limited set of indexes so we don't run into memory issues since we will have to do this all in memory.

@nikpivkin
Copy link
Contributor

nikpivkin commented Nov 1, 2024

Will this be displayed next to the code snippet?

...
 terraform-aws-modules/vpc/aws/.terraform/modules/vpc/main.tf:204
   via terraform-aws-modules/vpc/aws/.terraform/modules/vpc/main.tf:191-206 (aws_network_acl_rule.public_inbound[13])
    via vpc.tf:55-250 (module.vpc)

 191   resource "aws_network_acl_rule" "public_inbound" {
 ...   
 204 [   cidr_block      = lookup(var.public_inbound_acl_rules[count.index], "cidr_block", null)
 ...   
 206   }

Rendered:
   resource "aws_network_acl_rule" "public_inbound" {
   ...   
   [   cidr_block      = "172.16.0.0/24"
   ...
   }

If so, should there be a flag to disable this? I don't think it will affect memory if we only do it for findings. We can use the hclwrite package to do this.

@simar7
Copy link
Member Author

simar7 commented Nov 1, 2024

Will this be displayed next to the code snippet?

...
 terraform-aws-modules/vpc/aws/.terraform/modules/vpc/main.tf:204
   via terraform-aws-modules/vpc/aws/.terraform/modules/vpc/main.tf:191-206 (aws_network_acl_rule.public_inbound[13])
    via vpc.tf:55-250 (module.vpc)

 191   resource "aws_network_acl_rule" "public_inbound" {
 ...   
 204 [   cidr_block      = lookup(var.public_inbound_acl_rules[count.index], "cidr_block", null)
 ...   
 206   }

Rendered:
   resource "aws_network_acl_rule" "public_inbound" {
   ...   
   [   cidr_block      = "172.16.0.0/24"
   ...
   }

If so, should there be a flag to disable this? I don't think it will affect memory if we only do it for findings. We can use the hclwrite package to do this.

Yes, that looks good to me. If memory isn't an issue, why should we have the need to disable this? What if we only display the rendered output and not the actual code snippet? Like I said, we could omit line numbers or omit them altogether if they don't line up well with the rendered output.

We could keep it under a feature flag (disabled by default). It could be a similar to the --trace flag we have today since it really only helps users debug the origin of the misconfiguration.

@nikpivkin
Copy link
Contributor

I think the code snippet is still needed to match between the finding and the source code. It doesn't make sense for the rendered output to have line numbers, since the expression may take several lines (e.g. function call or grouping with parentheses), but the value will have one line.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. scan/misconfiguration Issues relating to misconfiguration scanning
Projects
Status: No status
Development

No branches or pull requests

3 participants