11

I have to take certain actions during an AWS Auto Scaling scale-in event. The EC2 instance should be able to save some logs and reports to an S3 bucket. This can take anywhere from 5 to 15 minutes.

I already have a script that gets called on termination:

ln -s /etc/ec2-termination /etc/rc0.d/S01ec2-termination

However, the script ends abruptly within 5 minutes. I am looking at leveraging AWS lifecycle hooks to extend the EC2 instance's lifetime. The documentation is not clear on how to invoke a script in a way similar to a user-data script.

There are ways of using AWS Lambda or SNS to receive a notification, which could potentially be used to inform the EC2 instance.

But I would like to know if there is a simpler solution to this problem. Is there a way to register a script with lifecycle hooks that gets called on a scale-in event?

rahul gupta
  • Why not just run a cron job to collect logs at regular intervals to save time at shutdown? #Sweet :) – raevilman May 09 '18 at 07:28
  • A cron job is an excellent alternative approach. If you're using autoscaling, it's good to design the system so that it doesn't depend on a specific instance being available for long periods of time. – arowell May 26 '21 at 21:19

5 Answers

11

No.

The fact that the instance is being terminated is managed within the AWS infrastructure. Auto Scaling does not have the ability to reach "into" the EC2 instance to trigger anything.

Instead, you would need to write some code on the instance that checks whether the instance is in the termination state and then takes appropriate action.

An example might be:

  • The Lifecycle Hook sends a notification via Amazon SNS
  • Amazon SNS triggers an AWS Lambda function
  • The Lambda function could add a tag to the instance (eg Terminating = Yes)
  • A script on the EC2 instance is triggered every 15 seconds to check the tags associated with the EC2 instance (on which it is running). If it finds the tag, it triggers the shutdown process.

(Be careful that the script doesn't trigger again during the shutdown process otherwise it might try performing the shutdown process every 15 seconds!)
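A minimal sketch of the on-instance side of the tag approach, including a guard against re-triggering. It assumes Python with boto3 on the instance and an instance role that allows `ec2:DescribeTags`. The names `has_terminating_tag`, `maybe_shutdown`, the marker file path, and the `run_shutdown` callback are hypothetical and only for illustration.

```python
def has_terminating_tag(ec2_client, instance_id):
    """Return True if the instance carries the tag Terminating=Yes."""
    response = ec2_client.describe_tags(
        Filters=[
            {"Name": "resource-id", "Values": [instance_id]},
            {"Name": "key", "Values": ["Terminating"]},
        ]
    )
    return any(tag["Value"] == "Yes" for tag in response["Tags"])


def maybe_shutdown(ec2_client, instance_id, marker_path, run_shutdown):
    """Run the shutdown routine at most once, guarded by a marker file."""
    if not has_terminating_tag(ec2_client, instance_id):
        return False
    try:
        # Mode "x" fails if the file already exists, so a later poll
        # that still sees the tag does not start the shutdown again.
        with open(marker_path, "x") as marker:
            marker.write("shutdown started\n")
    except FileExistsError:
        return False
    run_shutdown()
    return True
```

The poller would call `maybe_shutdown(boto3.client("ec2"), instance_id, marker_path, callback)` every 15 seconds. Creating the marker before running the callback is what keeps a slow shutdown from being started twice.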

Alternatively, store the shutdown information in the Systems Manager Parameter Store or a database, but using Tags seems nicely scalable!

Updated version:

Thanks to raevilman for the idea:

  • The Lifecycle Hook sends a notification via Amazon SNS
  • Amazon SNS triggers an AWS Lambda function
  • The Lambda function calls the AWS Systems Manager Run Command to trigger code on the instance

Much simpler!
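The three steps above could look roughly like this in the Lambda function. This is a sketch, not a tested deployment: it assumes a Python runtime with boto3, that the instance runs the SSM agent, and that the script to run is the `/etc/ec2-termination` script from the question. The helper name `build_send_command_args` is made up for illustration.

```python
import json


def build_send_command_args(event, script_path="/etc/ec2-termination"):
    """Turn an SNS-delivered lifecycle message into SSM SendCommand arguments."""
    message = json.loads(event["Records"][0]["Sns"]["Message"])
    # Auto Scaling also sends a test notification when the hook is set up;
    # it is not a termination transition, so it is skipped here.
    if message.get("LifecycleTransition") != "autoscaling:EC2_INSTANCE_TERMINATING":
        return None
    return {
        "InstanceIds": [message["EC2InstanceId"]],
        "DocumentName": "AWS-RunShellScript",
        "Parameters": {"commands": [script_path]},
    }


def lambda_handler(event, context):
    # Imported here so the helper above can be exercised without boto3.
    import boto3

    args = build_send_command_args(event)
    if args is None:
        return {"sent": False}
    response = boto3.client("ssm").send_command(**args)
    return {"sent": True, "command_id": response["Command"]["CommandId"]}
```

The Lambda execution role would need permission for `ssm:SendCommand` on the document and the instance.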

John Rotenstein
  • Rather than having a script run every 15 seconds to check for tags, the Lambda function in step 2 can send an AWS SSM command to the EC2 instance :) That also covers your "be careful" point about the script running again while the system is shutting down. HIH – raevilman May 09 '18 at 07:22
  • Too many resources and integrations just to run a script on termination. – LP13 Nov 18 '22 at 21:39
9

Yes, you can run a shell script on your terminating EC2 instance using AWS Systems Manager.

  1. Configure Lifecycle Hooks for your Autoscaling group. You can do this from the EC2 console or CLI:

    aws autoscaling put-lifecycle-hook \
        --lifecycle-hook-name my-lifecycle-hook \
        --auto-scaling-group-name My_AutoScalingGroup \
        --lifecycle-transition autoscaling:EC2_INSTANCE_TERMINATING \
        --default-result CONTINUE \
        --region us-east-2

Set the heartbeat timeout according to how long your script takes to run. Now, when your ASG scales in, your instances go into the Terminating:Wait state, during which your script will run.

  2. Set up a CloudWatch Events rule that is triggered when an instance changes to the Terminating:Wait status and that targets a Systems Manager Run Command to execute the shell script on your instance. You can do this from the console.

Alternative Solution: Your Lifecycle Hook sends a message containing the instance ID to SQS when an instance changes to the Terminating:Wait status. On receiving the message, SQS triggers a Lambda function that sends the Run Command through Systems Manager to execute the shell script on your terminating instance.
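With either variant, the instance stays in Terminating:Wait until the heartbeat timeout expires unless something completes the lifecycle action. Below is a hedged sketch of unpacking the lifecycle message from an SQS-triggered Lambda event and completing the hook once the script has finished. It assumes Python with boto3; `lifecycle_messages` and `complete_termination` are illustrative names, and where the completion call runs (for example at the end of the script, or in a follow-up function) is a design choice the answer does not prescribe.

```python
import json


def lifecycle_messages(sqs_event):
    """Yield termination lifecycle messages from an SQS-triggered Lambda event."""
    for record in sqs_event.get("Records", []):
        message = json.loads(record["body"])
        if message.get("LifecycleTransition") == "autoscaling:EC2_INSTANCE_TERMINATING":
            yield message


def complete_termination(autoscaling_client, message):
    """Tell Auto Scaling to proceed instead of waiting for the heartbeat timeout."""
    autoscaling_client.complete_lifecycle_action(
        LifecycleHookName=message["LifecycleHookName"],
        AutoScalingGroupName=message["AutoScalingGroupName"],
        InstanceId=message["EC2InstanceId"],
        LifecycleActionResult="CONTINUE",
    )
```

Each message carries the `EC2InstanceId`, so the same loop is where a Run Command could be targeted at that specific instance.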


Shitij Mathur
  • We tried this, but we could not figure out how to have the cloudwatch event pass the instance id to the "SSM Run Command". Any insight into how to do that? – bpeikes Jul 09 '21 at 20:06
  • Yeah, same here. I'm not sure on how to actually target *that* specific instance – Merricat Aug 26 '21 at 17:31
  • @bpeikes I figured it out. see my answer in [this post](https://stackoverflow.com/a/68945715/8845253). Essentially, you need to invoke a Lambda function from EventBridge, instead of invoking the SSM Command directly. Then in Lambda you can parse the event data (json), extract the instanceId, then use the ssm api to SendCommand to that instance. – Merricat Aug 27 '21 at 19:02
6

Here is a solution using Lifecycle Hooks, Automation and Run Command, based on this article:

Resources:
  MyTerminationHook:
    Type: AWS::AutoScaling::LifecycleHook
    Properties:
      AutoScalingGroupName: !Ref MyAutoScalingGroup
      DefaultResult: CONTINUE
      HeartbeatTimeout: 900
      LifecycleTransition: autoscaling:EC2_INSTANCE_TERMINATING

  MyTerminationDocument:
    Type: AWS::SSM::Document
    Properties:
      DocumentType: Automation
      Content:
        description: 'Run command before terminating instance'
        schemaVersion: '0.3'
        assumeRole: !GetAtt MyTerminationDocumentRole.Arn
        parameters:
          instanceId:
            type: String
        mainSteps:
          - name: RunCommand
            action: aws:runCommand
            inputs:
              DocumentName: AWS-RunShellScript
              InstanceIds:
                - '{{ instanceId }}'
              TimeoutSeconds: 60
              Parameters:
                commands: /etc/my-termination-script.sh
                executionTimeout: '900'
          - name: TerminateInstance
            action: aws:executeAwsApi
            inputs:
              Api: CompleteLifecycleAction
              AutoScalingGroupName: !Ref MyAutoScalingGroup
              InstanceId: '{{ instanceId }}'
              LifecycleActionResult: CONTINUE
              LifecycleHookName: !Ref MyTerminationHook
              Service: autoscaling

  MyTerminationRule:
    Type: AWS::Events::Rule
    Properties:
      EventPattern:
        source:
          - aws.autoscaling
        detail-type:
          - EC2 Instance-terminate Lifecycle Action
        detail:
          AutoScalingGroupName:
            - !Ref MyAutoScalingGroup
      Targets:
        - Id: my-termination-document
          Arn: !Sub 'arn:aws:ssm:${AWS::Region}:${AWS::AccountId}:automation-definition/${MyTerminationDocument}:$DEFAULT'
          RoleArn: !GetAtt MyTerminationRuleRole.Arn
          InputTransformer:
            InputPathsMap:
              instanceId: '$.detail.EC2InstanceId'
            InputTemplate: '{"instanceId":[<instanceId>]}'

  MyTerminationRuleRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service: events.amazonaws.com
            Action: sts:AssumeRole
      Policies:
        - PolicyName: start-automation
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - ssm:StartAutomationExecution
                Resource: !Sub 'arn:aws:ssm:${AWS::Region}:${AWS::AccountId}:automation-definition/${MyTerminationDocument}:$DEFAULT'

  MyTerminationDocumentRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service: ssm.amazonaws.com
            Action: sts:AssumeRole
      Policies:
        - PolicyName: run-command-and-complete-lifecycle
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - autoscaling:CompleteLifecycleAction
                Resource: !Sub 'arn:aws:autoscaling:${AWS::Region}:${AWS::AccountId}:autoScalingGroup:*:autoScalingGroupName/${MyAutoScalingGroup}'
              - Effect: Allow
                Action:
                  - ssm:DescribeInstanceInformation
                  - ssm:ListCommands
                  - ssm:ListCommandInvocations
                Resource: '*'
              - Effect: Allow
                Action:
                  - ssm:SendCommand
                Resource: 'arn:aws:ssm:*::document/AWS-RunShellScript'
              - Effect: Allow
                Action:
                  - ssm:SendCommand
                Resource: !Sub 'arn:aws:ec2:${AWS::Region}:${AWS::AccountId}:instance/*'

The permissions required to deploy these are:

          - Sid: CreateDocument
            Effect: Allow
            Action:
              - "ssm:CreateDocument"
              - "ssm:GetDocument"
              - "ssm:DeleteDocument"
              - "ssm:ListTagsForResource"
            Resource: !Sub "arn:aws:ssm:<...>"
          - Sid: InstallLifecycleHook
            Effect: Allow
            Action:
              - "autoscaling:DeleteLifecycleHook"
              - "autoscaling:PutLifecycleHook"
            Resource: !Sub "arn:aws:autoscaling:<...>"
          - Sid: ManageRules
            Effect: Allow
            Action:
              - "events:PutRule"
              - "events:ListRules"
              - "events:DescribeRule"
              - "events:DeleteRule"
              - "events:PutTargets"
              - "events:RemoveTargets"
            Resource: !Sub "arn:aws:events:<...>"

There might be more; these are the ones I had to add to our existing deployment policy. They may also not all be required, but I was fed up redeploying and adding them piecemeal so I added some of the Rule ones as an educated guess.

daniu
yurez
  • Excellent, we got this running. Took some time to figure out what permissions were required so I added the ones that were missing for us to the answer. – daniu Feb 07 '23 at 13:18
1

Depending on what you want to achieve, this approach needs only two things and is much simpler:

  1. The Lifecycle Hook sends a notification to SQS
  2. The app reads the SQS queue and performs the action
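A rough sketch of step 2, assuming the app is written in Python with boto3 and already knows the queue URL and its own instance ID. The function name `poll_for_termination` is hypothetical.

```python
import json


def poll_for_termination(sqs_client, queue_url, my_instance_id):
    """Return True if a termination message for this instance was received."""
    response = sqs_client.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,
        WaitTimeSeconds=20,  # long polling, to avoid a tight request loop
    )
    for msg in response.get("Messages", []):
        body = json.loads(msg["Body"])
        if (
            body.get("LifecycleTransition") == "autoscaling:EC2_INSTANCE_TERMINATING"
            and body.get("EC2InstanceId") == my_instance_id
        ):
            # Delete only our own message; messages for other instances
            # become visible again after the queue's visibility timeout.
            sqs_client.delete_message(
                QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"]
            )
            return True
    return False
```

If every instance in the group reads the same queue, a message may be picked up by the wrong instance first and only reach the right one after the visibility timeout, so the heartbeat timeout on the hook should leave room for that delay.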
user740413
0

Just a correction to my earlier comment: SSM Run Command works against an instance in the Auto Scaling group if the instance is terminated due to an Auto Scaling event, but not if you terminate the instance manually.