I am enabling AWS Macie 2 using Terraform, and I am defining a default classification job as follows:

resource "aws_macie2_account" "member" {}

resource "aws_macie2_classification_job" "member" {
  job_type = "ONE_TIME"
  name     = "S3 PHI Discovery default"
  s3_job_definition {
    bucket_definitions {
      account_id = var.account_id
      buckets    = ["S3 BUCKET NAME 1", "S3 BUCKET NAME 2"]
    }
  }
  depends_on = [aws_macie2_account.member]
} 

AWS Macie needs a list of S3 buckets to analyze. I am wondering if there is a way to select all buckets in an account, using a wildcard or some other method. Our production accounts contain hundreds of S3 buckets and hard-coding each value in the s3_job_definition is not feasible.

Any ideas?

Dimi
  • The Terraform AWS provider does not support a data source for listing S3 buckets at this time, unfortunately. For things like this (data sources that Terraform doesn't support), the common approach is to use the AWS CLI through an external data source. – Jordan Jul 15 '21 at 00:35
  • Thanks @Jordan. This is exactly what I've been trying as a workaround. – Dimi Jul 15 '21 at 09:10
  • I've added details in an answer so you can close the question. – Jordan Jul 15 '21 at 18:00
  • Great, accepted it. Thank you for the details. – Dimi Jul 16 '21 at 10:12

1 Answer

The Terraform AWS provider does not support a data source for listing S3 buckets at this time, unfortunately. For things like this (data sources that Terraform doesn't support), the common approach is to use the AWS CLI through an external data source.
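With the stock `external` data source from the hashicorp/external provider, that approach could look roughly like the sketch below. It assumes the AWS CLI is installed and already has credentials for the target account; the names `bucket_names` and `names` are chosen here purely for illustration. The `external` data source only accepts a flat JSON object of string values, so the `--query` expression joins the bucket names into a single comma-separated string, which is split again on the Terraform side.

```hcl
data "external" "bucket_names" {
  // The program is executed directly (no shell), so the JMESPath
  // expression needs no extra quoting or escaping.
  program = [
    "aws", "s3api", "list-buckets",
    "--output", "json",
    "--query", "{names: join(',', Buckets[].Name)}",
  ]
}

locals {
  // Split the comma-separated string back into a list; compact() drops
  // the empty string that split() yields when no names are returned.
  bucket_names = compact(split(",", data.external.bucket_names.result.names))
}
```

Bucket names cannot contain commas, so the comma is a safe separator here.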

These are modules that I like to use for CLI/shell commands:

Using the data source version, it would look something like:

module "list_buckets" {
  source  = "Invicton-Labs/shell-data/external"
  version = "0.1.6"

  // Since the command is the same on both Unix and Windows, it's ok to just
  // specify one and not use the `command_windows` input arg
  command_unix = "aws s3api list-buckets --output json"

  // You want Terraform to fail if it can't get the list of buckets for some reason
  fail_on_error = true

  // Specify your AWS credentials as environment variables
  environment = {
    AWS_PROFILE = "myprofilename"
    // Alternatively, although not recommended:
    // AWS_ACCESS_KEY_ID = "..."
    // AWS_SECRET_ACCESS_KEY = "..."
  }
}

output "buckets" {
  // We specified JSON format for the output, so decode it to get a list
  value = jsondecode(module.list_buckets.stdout).Buckets
}

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

Outputs:

buckets = [
  {
    "CreationDate" = "2021-07-15T18:10:20+00:00"
    "Name" = "bucket-foo"
  },
  {
    "CreationDate" = "2021-07-15T18:11:10+00:00"
    "Name" = "bucket-bar"
  },
]
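
To tie this back to the question, the decoded bucket names can then be fed into the classification job. This is a minimal sketch, assuming the `list_buckets` module shown above and that the credentials it uses belong to the account in `var.account_id`; the `bucket_names` local is a name picked here for illustration.

```hcl
locals {
  // Keep only the "Name" field of each bucket in the decoded CLI response
  bucket_names = [
    for b in jsondecode(module.list_buckets.stdout).Buckets : b.Name
  ]
}

resource "aws_macie2_classification_job" "member" {
  job_type = "ONE_TIME"
  name     = "S3 PHI Discovery default"
  s3_job_definition {
    bucket_definitions {
      account_id = var.account_id
      // Replaces the hard-coded list from the question
      buckets    = local.bucket_names
    }
  }
  depends_on = [aws_macie2_account.member]
}
```

Keep in mind that `list-buckets` returns buckets from every region, while a Macie job, as far as I know, only analyzes buckets in the region where it runs, so the list may need filtering before it is passed to the job.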
Jordan