tl;dr: I want to create an AMI with a "partial sysprep" so that SSM can connect when I launch an instance type other than the original from that AMI, while keeping everything else the same. It only needs to update the metadata/KMS routes.
I recently hit an issue where SSM was unreachable if I deployed an instance type other than the one the AMI was originally created from. This turned out to be because the different instance type was launched into a different Availability Zone, while the routes used to reach SSM had been saved in the image and still pointed at the Availability Zone of the original instance.
The solution to this was to shut down with sysprep before creating the AMI. However, that opened other issues:
1. Launching systems off the sysprep'd AMI takes 2+ minutes for SSM to become available, as opposed to instantly when sysprep is not used.
2. More importantly, part of my launch script downloads an exe to the desktop and installs it using SSM's AWS-RunPowerShellScript document. This part now fails, I believe because the desktop, etc. isn't created until I RDP into the new instance. I've tested with a 15-minute sleep, with the same result. That portion of the code runs fine after I've RDP'd into the instance.
I have:
1. Confirmed the exe installer runs fine when the AMI is not sysprep'd. However, in this mode, I am stuck with only the instance type for which the original AMI was created.
2. Tried a 15-minute sleep before downloading/running the installer when sysprep is used. This did not work.
3. Confirmed that on the sysprep'd image, the installer downloads and runs if I have RDP'd into the instance to initialize the desktop, etc.
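For illustration, here is a minimal sketch of the kind of download-and-install step described above, written as a helper that builds the arguments for boto3's `ssm.send_command`. The helper name `build_install_command`, the instance ID, the URL, and the destination path are all hypothetical placeholders, not my actual script, and any silent-install flags the installer needs are left out.

```python
def build_install_command(instance_id, installer_url, dest_path):
    """Build keyword arguments for boto3's ssm.send_command (hypothetical helper)."""
    commands = [
        # Download the installer to the given path on the instance.
        f'Invoke-WebRequest -Uri "{installer_url}" -OutFile "{dest_path}"',
        # Run the installer and wait for it to finish.
        f'Start-Process -FilePath "{dest_path}" -Wait',
    ]
    return {
        "InstanceIds": [instance_id],
        "DocumentName": "AWS-RunPowerShellScript",
        "Parameters": {"commands": commands},
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    kwargs = build_install_command(
        "i-0123456789abcdef0",
        "https://example.com/installer.exe",
        r"C:\Users\Administrator\Desktop\installer.exe",
    )
    for command in kwargs["Parameters"]["commands"]:
        print(command)
```

With boto3 installed and credentials configured, the result would be passed as `boto3.client("ssm").send_command(**kwargs)`. Keeping the destination path as a parameter makes it easy to test whether the failure depends on the desktop folder existing.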
This is all related to metadata/KMS routes described at the bottom of the page here: https://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/ec2launch.html#ec2launch-inittasks
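To make the route problem concrete, here is a small sketch of why persistent routes baked into an image go stale. It generates Windows `route` commands for the link-local metadata and KMS addresses listed on that page, pinned to a subnet's default gateway. The helper `route_commands` and both gateway addresses are assumptions for illustration; EC2Launch sets these routes up through its own scripts, not through this code.

```python
# Link-local endpoints that need a route via the subnet's gateway.
LINK_LOCAL_TARGETS = {
    "169.254.169.254": "instance metadata",
    "169.254.169.250": "KMS",
    "169.254.169.251": "KMS",
}


def route_commands(gateway):
    """Return persistent 'route add' commands pinned to one gateway (hypothetical helper)."""
    return [
        f"route -p add {ip} MASK 255.255.255.255 {gateway}"
        for ip in LINK_LOCAL_TARGETS
    ]


if __name__ == "__main__":
    # Hypothetical gateways for two subnets in different Availability Zones.
    original = route_commands("10.0.1.1")
    relaunched = route_commands("10.0.2.1")
    for saved, needed in zip(original, relaunched):
        print("saved in AMI:", saved)
        print("needed now:  ", needed)
```

Because each route names a specific gateway, routes captured in the AMI only work in a subnet with that same gateway, which is why an instance launched elsewhere cannot reach the metadata service until the routes are regenerated.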
When an AMI is created without using sysprep, if an instance launches off that AMI into a different Availability Zone, SSM is unreachable and the following error occurs in logs:
2019-08-28 22:39:12 ERROR [func1 @ coremanager.go.245] [instanceID=i-0d6c57bbfe2db46af] error occurred trying to start core module. Plugin name: StartupProcessor. Error: Internal error occurred by startup processor: runtime error: invalid memory address or nil pointer dereference
2019-08-28 22:39:27 ERROR [SetWebSocket @ controlchannel.go.89] [MessageGatewayService] Failed to get controlchannel token, error: CreateControlChannel failed with error: createControlChannel request failed: unexpected response from the service Unauthorized request.
The expected behavior is to be able to launch instances from an AMI with everything preconfigured (including the desktop, etc., which has to be fully reinitialized when sysprep is used), and to have the new instance update its metadata/KMS routes so that SSM is reachable.