  • I am trying to provision multiple Windows EC2 instances with Terraform's remote-exec provisioner using null_resource.

$ terraform -v
Terraform v0.12.6
provider.aws v2.23.0
provider.null v2.1.2

  • Originally, I was using three remote-exec provisioners (two of which rebooted the instance) directly on the aws_instance, without null_resource, and for a single instance everything worked fine.
  • I then needed to increase the count and, based on several links, ended up using null_resource. I have since reduced the issue to the point where I cannot run even one remote-exec provisioner on more than two Windows EC2 instances using null_resource.

Terraform template to reproduce the error message:

//VARIABLES

variable "aws_access_key" {
  default = "AK"
}
variable "aws_secret_key" {
  default = "SAK"
}
variable "instance_count" {
  default = "3"
}
variable "username" {
  default = "Administrator"
}
variable "admin_password" {
  default = "Password"
}
variable "instance_name" {
  default = "Testing"
}
variable "vpc_id" {
  default = "vpc-id"
}

//PROVIDERS
provider "aws" {
  access_key = "${var.aws_access_key}"
  secret_key = "${var.aws_secret_key}"
  region     = "ap-southeast-2"
}

//RESOURCES
resource "aws_instance" "ec2instance" {
  count         = "${var.instance_count}"
  ami           = "Windows AMI"
  instance_type = "t2.xlarge"
  key_name      = "ec2_key"
  subnet_id     = "subnet-id"
  vpc_security_group_ids = ["${aws_security_group.ec2instance-sg.id}"]
  tags = {
    Name = "${var.instance_name}-${count.index}"
  }
}

resource "null_resource" "nullresource" {
  count = "${var.instance_count}"
  connection {
    type     = "winrm"
    host     = "${element(aws_instance.ec2instance.*.private_ip, count.index)}"
    user     = "${var.username}"
    password = "${var.admin_password}"
    timeout  = "10m"
  }
  provisioner "remote-exec" {
    inline = [
      "powershell.exe Write-Host Instance_No=${count.index}"
    ]
  }
//   provisioner "local-exec" {
//     command = "powershell.exe Write-Host Instance_No=${count.index}"
//   }
//   provisioner "file" {
//       source      = "testscript"
//       destination = "D:/testscript"
//   }
}
resource "aws_security_group" "ec2instance-sg" {
  name        = "${var.instance_name}-sg"
  vpc_id      = "${var.vpc_id}"


//   RDP
  ingress {
    from_port   = 3389
    to_port     = 3389
    protocol    = "tcp"
    cidr_blocks = ["CIDR"]
  }

//   WinRM access from the machine running TF to the instance
  ingress {
    from_port   = 5985
    to_port     = 5985
    protocol    = "tcp"
    cidr_blocks = ["CIDR"]
  }

  tags = {
    Name        = "${var.instance_name}-sg"
  }

}
//OUTPUTS
output "private_ip" {
  value = "${aws_instance.ec2instance.*.private_ip}"
}

Observations:

  • Terraform runs the remote-exec provisioner on all three instances, reports one null_resource as created, and then hangs at "Still creating..." for the other two; the apply never reaches "Apply complete!".
  • Don't oversimplify like that please, it's not helpful to work out where you're going wrong. Reduce your example to a [mcve] that other people can run but still see the same error as you. – ydaetskcoR Aug 06 '19 at 07:11
  • Thanks for the feedback. I have modified the code so it can be reproduced now. – st_rt_dl_8 Aug 07 '19 at 02:15
  • @ydaetskcoR Added the template so that it's easy to reproduce than earlier snippet. – st_rt_dl_8 Aug 11 '19 at 12:58

3 Answers


Update: what eventually did the trick was downgrading Terraform to v0.11.14, as per this issue comment.
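If you do downgrade, Terraform's required_version constraint can guard against accidentally re-running the config with a newer binary. A minimal sketch (valid in both 0.11 and 0.12):

terraform {
  required_version = "0.11.14"
}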

A few things you can try:

  1. Inline remote-exec:
resource "aws_instance" "ec2instance" {
  count         = "${var.instance_count}"
  # ...
  provisioner "remote-exec" {
    connection {
      # ...
    }
    inline = [
      # ...
    ]
  }
}

Now you can refer to self inside the connection block to get the instance's private IP.
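For example, a sketch of the connection block using the question's settings (self refers to the aws_instance being provisioned):

connection {
  type     = "winrm"
  host     = "${self.private_ip}" # the instance's own private IP
  user     = "${var.username}"
  password = "${var.admin_password}"
  timeout  = "10m"
}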

  2. Add triggers to null_resource:
resource "null_resource" "nullresource" {
  triggers = {
    host    = "${element(aws_instance.ec2instance.*.private_ip, count.index)}" # Rerun when IP changes
    version = "${timestamp()}" # ...or rerun every time
  }
  # ...
}

You can use the triggers attribute to recreate null_resource and thus re-execute remote-exec.
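Alternatively, you can force a one-off re-run of a single null_resource without changing its triggers by tainting it (0.12 resource-address syntax shown):

$ terraform taint 'null_resource.nullresource[0]'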

Aleksi
  • Thanks for the suggestion Aleksi. I had already tried #1 and ended up using null_resource when I faced the same issue with remote-exec inside the aws_instance block. Tried #2, however, still the same issue: Terraform skipped running the provisioner on one of the instances and got stuck at "Still creating...". – st_rt_dl_8 Aug 13 '19 at 10:16
  • 1
    How about adding `sleep` at the beginning and end of your `inline` command? As per [this answer](https://stackoverflow.com/a/51777995/1763012). Some people also [report a similar issue](https://github.com/hashicorp/terraform/issues/22006#issuecomment-509588621) being fixed by downgrading to terraform `v11.14`, that might not be an option in your case though? – Aleksi Aug 13 '19 at 10:46
  • 1
    Tried that once again (Put sleep before and after the command). Did not work. What happened was, as mentioned in the original post, Terraform ran the provisioner on all the three instances, showed that the creating of one resource completed and then got stuck with "Still creating..." message for the other two instances and never showed the "Apply complete!" green message. Though [this](https://github.com/hashicorp/terraform/issues/22006#issuecomment-509588621) issue talks about file provisioner, I will still try downgrading and update soon. – st_rt_dl_8 Aug 13 '19 at 13:26
  • I downgraded to v0.11.14 and that magically worked. Seems like a bug in v0.12.6. Thanks a ton for your time on this! It really sucks to spend weeks on an issue only to find out something like this :) I will try to bring this to Hashicorp's attention. Meanwhile, could you please write the same in the answer so that I can accept it? – st_rt_dl_8 Aug 14 '19 at 04:48
  • I am having a VERY similar problem with the `chef` provisioner: [stack overflow link](https://stackoverflow.com/questions/57929171/chef-provisioner-in-terraform-hangs-when-provisioning-more-than-one-resource). I believe I started all this on a version later than 0.11.14, so it's always been there... I went through all this time to upgrade the syntax to be 0.12-compliant, but if it solves the multiple-provisioner issue, maybe it'd be worth it to revert it all. I logged my issue [on their github](https://github.com/hashicorp/terraform/issues/22722). – Max Cascone Sep 19 '19 at 16:58
  • 2
    Yes... that seems to have done it... 3 parallel `chef` provisions using `null_resource` went smoothly and successfully after downgrading and reverting syntax to `v.0.11.14`. I don't know how I didn't find this thread sooner. But at least it works now. – Max Cascone Sep 19 '19 at 20:33
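For reference, the sleep workaround suggested in the comments above would look roughly like this in the question's inline block (a sketch only; the delays are arbitrary, with Start-Sleep as the PowerShell equivalent of sleep):

provisioner "remote-exec" {
  inline = [
    "powershell.exe Start-Sleep -Seconds 10",
    "powershell.exe Write-Host Instance_No=${count.index}",
    "powershell.exe Start-Sleep -Seconds 10"
  ]
}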

I used this trigger in my null_resource and it works perfectly for me. It also works when the number of instances is increased, and it runs the configuration on all instances. I am using Terraform with OpenStack.

triggers = {
  instance_ids = join(",", openstack_compute_instance_v2.swarm-cluster-hosts[*].id)
}
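A minimal sketch of how this trigger might sit in a counted null_resource (resource names here are illustrative):

resource "null_resource" "configure" {
  count = length(openstack_compute_instance_v2.swarm-cluster-hosts)

  # Recreate this null_resource (and re-run its provisioners) whenever
  # the set of instance IDs changes
  triggers = {
    instance_ids = join(",", openstack_compute_instance_v2.swarm-cluster-hosts[*].id)
  }

  # connection and provisioner blocks go here, as in the question
}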

Niaz Hussain

Terraform 0.12.26 resolved a similar issue for me (multiple file provisioners when deploying multiple VMs).

Hope this helps you: https://github.com/hashicorp/terraform/issues/22006

DarrenS