Good day,

Our team utilizes a module that creates Linux instances with a standard configuration in user_data as defined below.

resource "aws_instance" "this" {
...
user_data = templatefile("${path.module}/user_data.tp", { hostname = upper("${local.prefix}${count.index + 1}"), domain = local.domain })
...
}

Contents of the user_data.tp:

#cloud-config
repo_update: true
repo_upgrade: all

preserve_hostname: false
hostname: ${hostname}
fqdn: ${hostname}.${domain}
manage_etc_hosts: false

runcmd:
  - 'echo "preserve_hostname: true" >> /etc/cloud/cloud.cfg.d/99_hostname.cfg'

What is the best way to modify this module such that the contents of user_data.tp are always executed and optionally another block could be passed to install certain packages or execute certain shell scripts?

I'm assuming it involves using cloudinit_config and a multipart mime configuration, but would appreciate any suggestions.

Thank you.

Kimmel
  • What do you mean by always executed: every apply, every restart, or something else entirely? Also, maybe you could give an example on the "extra" bits. You can already run arbitrary commands like installing software in user_data. – theherk Oct 24 '21 at 18:01
  • Possible duplicate: https://stackoverflow.com/questions/43642308/multiple-user-data-file-use-in-terraform – Mark B Oct 24 '21 at 18:32
  • @theherk, sorry for the ambiguity. I want the user_data only executed at initial provisioning. I want to be able to modify the module such that persons can call it and optionally pass another block to be appended to the user_data. The above user_data.tp needs to be standard for all instances, but a user may want to install specific packages and execute a bash script on their instances. – Kimmel Oct 24 '21 at 18:54
  • @Mark, I think that’s what I need. Is this still the best way? – Kimmel Oct 24 '21 at 18:56
  • @Kimmel yes, if I knew of a better way I would have posted it. – Mark B Oct 24 '21 at 19:15

1 Answer

Since you showed a cloud-config template I'm assuming here that you're preparing a user_data for an AMI that runs cloud-init on boot. That means this is perhaps more of a cloud-init question than a Terraform question, but I understand that you also want to know how to translate the cloud-init-specific answer into a workable Terraform configuration.

The User-data Formats documentation describes various possible ways to format user_data for cloud-init to consume. You mentioned multipart MIME in your question and indeed that could be a viable answer here if you want cloud-init to interpret the two payloads separately, rather than as a single artifact. The cloud-init docs talk about the tool make-mime, but the Terraform equivalent of that is the cloudinit_config data source belonging to the hashicorp/cloudinit provider:

variable "extra_cloudinit" {
  type = object({
    content_type = string
    content      = string
  })

  # This makes the variable optional to set,
  # and var.extra_cloudinit will be null if not set.
  default = null
}

data "cloudinit_config" "user_data" {
  # set "count" to be whatever your aws_instance count is set to
  count = ...

  part {
    content_type = "text/cloud-config"
    content      = templatefile(
      "${path.module}/user_data.tp",
      {
        hostname = upper("${local.prefix}${count.index + 1}")
        domain   = local.domain
      }
    )
  }

  dynamic "part" {
    # If var.extra_cloudinit is null then this
    # will produce a zero-element list, or otherwise
    # it'll produce a one-element list.
    for_each = var.extra_cloudinit[*]
    content {
      content_type = part.value.content_type
      content      = part.value.content

      # NOTE: should probably also set merge_type
      # here to tell cloud-init how to merge these
      # two:
      # https://cloudinit.readthedocs.io/en/latest/topics/merging.html
    }
  }
}

resource "aws_instance" "example" {
  count = length(data.cloudinit_config.user_data)

  # ...
  user_data = data.cloudinit_config.user_data[count.index].rendered
}
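
For illustration, a caller of the module might pass a shell script as the extra part roughly like this. Treat it as a sketch only: the module label `linux_instances`, the source path `./modules/linux-instance`, and the script contents are all hypothetical, and any other arguments the module requires are omitted. Only `extra_cloudinit` and its two attributes come from the variable declared above.

```hcl
module "linux_instances" {
  # Hypothetical path; substitute wherever the module actually lives.
  source = "./modules/linux-instance"

  extra_cloudinit = {
    # cloud-init treats text/x-shellscript parts as scripts to run
    # during initial provisioning.
    content_type = "text/x-shellscript"
    content      = <<-EOT
      #!/bin/bash
      # Example only: assumes a yum-based AMI.
      yum install -y htop
    EOT
  }
}
```

Callers that leave `extra_cloudinit` unset get only the standard cloud-config part. If the extra part is itself `text/cloud-config`, the `part` block also accepts a `merge_type` argument, which should be chosen according to the cloud-init merging documentation linked in the comment above.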

If you expect that the extra cloud-init configuration will always come in the form of extra cloud-config YAML values then an alternative approach would be to merge the two data structures together within Terraform and then yamlencode the merged result:

variable "extra_cloudinit" {
  type = any

  # This makes the variable optional to set,
  # and var.extra_cloudinit will be an empty object if not set.
  default = {}

  validation {
    condition     = can(merge(var.extra_cloudinit, {}))
    error_message = "Must be an object to merge with the built-in cloud-init settings."
  }
}

locals {
  cloudinit_config = merge(
    var.extra_cloudinit,
    {
      repo_update  = true
      repo_upgrade = "all"
      # etc, etc
    },
  )
}

resource "aws_instance" "example" {
  # There is no cloudinit_config data source in this variant, so
  # set "count" to be whatever your aws_instance count is set to
  count = ...

  # ...
  user_data = <<EOT
#cloud-config
${yamlencode(local.cloudinit_config)}
EOT
}

A disadvantage of this approach is that Terraform's merge function only ever performs a shallow merge: when both objects define the same top-level key, one value replaces the other outright. cloud-init itself, by contrast, has various other merging options. However, an advantage is that the resulting single YAML document will generally be simpler than a multipart MIME payload and thus probably easier to review for correctness in the terraform plan output.
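
To make the shallow-merge caveat concrete, here is a small sketch. The local names and values (`builtin`, `extra`, the `runcmd` entries, the `htop` package) are hypothetical and only there to show the behavior; `merge` and `concat` are standard Terraform functions.

```hcl
locals {
  # Hypothetical values, only to show the shallow-merge behavior.
  builtin = {
    repo_update = true
    runcmd      = ["echo built-in"]
  }
  extra = {
    packages = ["htop"]
    runcmd   = ["echo extra"]
  }

  # Later arguments win per top-level key, so runcmd is replaced
  # wholesale: the result keeps packages from extra, but its runcmd
  # contains only the built-in entry.
  merged = merge(local.extra, local.builtin)

  # One way to keep both lists is to combine that key explicitly.
  merged_with_both_runcmd = merge(
    local.merged,
    { runcmd = concat(local.builtin.runcmd, local.extra.runcmd) },
  )
}
```

Combining keys by hand like this only scales to a few known list-valued keys such as `runcmd` or `packages`. If callers need general-purpose merging, the multipart approach lets cloud-init do the merge instead.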

Martin Atkins
  • Thorough answer as always. I greatly appreciate it. I actually went with your first option yesterday based on a similar answer you provided late last year. Only real difference is that I used concat to combine my static and dynamic lists so I only required a dynamic “part”. I actually like your yamlencode option even more though, and might go this route. One quick question- in my “extra_cloudinit” variable I used “[ ]” as the default instead of null. Does this change the behavior? – Kimmel Oct 26 '21 at 02:35
  • If you declared `extra_cloudinit` as a list to be concatenated then `[]` would be a good "do nothing" default value indeed. I used `null` here because I'd declared it as a single object and so the logical way to represent its absence was as a `null`, which `[*]` then turned into the empty list when needed. – Martin Atkins Oct 27 '21 at 00:38
  • Anyone seeing this answer: vote it up please. – JdeHaan Jun 23 '22 at 07:21