37

I want to automatically toggle alarms on and off during specific periods of time so that they do not fire during maintenance windows. I doubt that an easy or direct method exists, since I could not find one in the documentation. Does anyone know of a different approach to achieve this while still using CloudWatch alarms, or did I miss an obvious solution?

jmsb
  • 2
    I perfectly agree with you that it would be a great feature of CloudWatch. I have scenarios where I consider the metric alarming during business hours but it hasn't the same meaning after 22h for instance. I can handle this with a Lambda or something like this, but as it seems a common requirement it would be an awesome built-in feature – Danilo Gomes Mar 20 '19 at 23:44
  • 1
    @jmsb I ran into the same problem and was able to sort it out without lambdas. https://don-016.medium.com/pause-aws-cloudwatch-alarms-during-blackout-windows-1dc188ee9c40. Fair warning - I wrote that post (that's why I didn't want to post it as an answer) – Don Victor Feb 05 '21 at 09:14

10 Answers

10

I got here while looking for a way to disable alerts for my machine, which performs backups every Saturday between 11:00 and 11:30. The only solution I found is to create cron jobs that disable and re-enable the particular alarms at the right times:

59 10 * * 6 ec2-user aws cloudwatch disable-alarm-actions --alarm-names "Alarm-1" "Alarm-2"
31 11 * * 6 ec2-user aws cloudwatch enable-alarm-actions --alarm-names "Alarm-1" "Alarm-2"

Your node needs to have access to CloudWatch, obviously. I gave it CloudWatchFullAccess.

Remigiusz
10

Another idea is to use metric math expressions. In the simplest case a combination of IF and HOUR could help. This can be done directly in CloudWatch without Lambda functions.

An example to illustrate the idea: Assume a metric m that takes only natural numbers as values and an alarm triggered if m=0. Then you could use the expression

IF(HOUR(m) > 8 && HOUR(m) < 18, m, m+0.01)

instead of m in the alarm. Note that HOUR returns the hour in UTC, so adjust the bounds to your time zone. The 0.01 is added only when the hour is outside the range 9 to 17, that is, whenever the condition HOUR(m) > 8 && HOUR(m) < 18 is false. If m is actually 0 during those hours, the extra 0.01 ensures that the alarm checking for 0 is not triggered. 0.01 is an arbitrary value that must be small enough not to change the meaning of your metric. Finding such a number might not be possible for all metrics. I think you get the idea.
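To make the behaviour of that expression concrete, here is a small Python emulation of it. This is not CloudWatch code: `masked_value` and `alarm_fires` are hypothetical names, and the UTC hour is passed in explicitly instead of being read from the data point's timestamp.

```python
def masked_value(m: float, hour_utc: int, offset: float = 0.01) -> float:
    """Emulate IF(HOUR(m) > 8 && HOUR(m) < 18, m, m + 0.01) for one data point."""
    if 8 < hour_utc < 18:
        return m           # inside the window: the metric is passed through
    return m + offset      # outside the window: nudge the value away from 0


def alarm_fires(value: float) -> bool:
    """The alarm from the example: it triggers when the value equals 0."""
    return value == 0


if __name__ == "__main__":
    # A metric value of 0 at different UTC hours.
    for hour in (8, 9, 17, 18):
        print(hour, alarm_fires(masked_value(0, hour)))
```

Running it shows that a value of 0 only triggers the alarm for hours 9 through 17, because both comparisons in the expression are strict.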

To add a math expression to an alarm definition via the CloudWatch console, click buttons in the following order:

  1. During alarm creation: Select metric -> View graphed metrics -> Add math expression
  2. Or during editing an alarm: Edit (in the first step Specify metric and conditions) -> Add math expression
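The same alarm can probably be defined through the API instead of the console. The following is a minimal sketch using boto3's `put_metric_alarm` with a `Metrics` list; the function names, the period, the statistic, and the comparison operator are assumptions chosen for illustration, not values from this answer.

```python
def build_alarm_kwargs(alarm_name, namespace, metric_name):
    """Build put_metric_alarm arguments that alarm on the masked expression."""
    return {
        "AlarmName": alarm_name,
        # m only takes natural numbers, so "<= 0" is the same as "m = 0".
        "ComparisonOperator": "LessThanOrEqualToThreshold",
        "Threshold": 0,
        "EvaluationPeriods": 1,
        "Metrics": [
            {
                # The raw metric; it is only an input to the expression.
                "Id": "m",
                "MetricStat": {
                    "Metric": {"Namespace": namespace, "MetricName": metric_name},
                    "Period": 300,   # assumed 5-minute period
                    "Stat": "Sum",   # assumed statistic
                },
                "ReturnData": False,
            },
            {
                # The expression the alarm actually evaluates.
                "Id": "masked",
                "Expression": "IF(HOUR(m) > 8 && HOUR(m) < 18, m, m + 0.01)",
                "Label": "m with off-hours offset",
                "ReturnData": True,
            },
        ],
    }


def create_alarm(kwargs):
    """Send the definition to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3

    boto3.client("cloudwatch").put_metric_alarm(**kwargs)
```

Exactly one entry in `Metrics` has `ReturnData` set to true, which tells CloudWatch which time series the alarm watches.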

(This question is one of the top google search results so I wanted to add a workaround which helped me.)

mrteutone
  • 1
    A similar approach is now also described by AWS: https://aws.amazon.com/blogs/mt/enhance-cloudwatch-metrics-with-metric-math-functions/ – mrteutone Dec 15 '21 at 22:37
  • 1
    Sadly there's no support for time zones here, so if you have daylight saving time, your suppression will break once a year, and/or be too suppressed for half the year. I'm posting this just after the clocks change for a reason :( – Kurru Nov 07 '22 at 14:36
8

Expanding on mrteutone's answer, first create a separate metric alarm for a metric like

IF(8 <= HOUR(TIME_SERIES(1)) && HOUR(TIME_SERIES(1)) < 18, 1, 0)

and set it to alarm when the value > 0: this gives you an alarm that is always in ALARM state between 8-18 UTC.

You can then combine this with your original metric alarm into a Composite Alarm which only goes off when both metric alarms trigger.
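As a rough sketch of that last step, a composite alarm can be created with boto3's `put_composite_alarm`. The helper names and the alarm names passed in are hypothetical; the rule string uses the `ALARM("...")` and `AND` syntax of composite alarm rules.

```python
def business_hours_rule(time_alarm, metric_alarm):
    """Rule that is true only while both child alarms are in ALARM state."""
    return f'ALARM("{time_alarm}") AND ALARM("{metric_alarm}")'


def create_composite_alarm(name, time_alarm, metric_alarm, action_arns):
    """Create the composite alarm (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the rule builder works without boto3

    boto3.client("cloudwatch").put_composite_alarm(
        AlarmName=name,
        AlarmRule=business_hours_rule(time_alarm, metric_alarm),
        AlarmActions=action_arns,  # e.g. the SNS topic ARNs to notify
    )
```

For this to suppress notifications, the notification actions presumably need to live on the composite alarm only; if the original metric alarm keeps its own actions, it will still notify outside the window.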

Aleksi
  • 1
    My answer has the problem of adding an artificial number to a metric and therefore potentially creating a misleading graph. Your approach does not have this problem. Also, such a "0/1 business hour alarm switch" can be reused in many composite alarms. I like it. – mrteutone Oct 28 '22 at 21:13
5

It's not automatic but it can be done:

http://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_EnableAlarmActions.html

What you want to do is this: right before the maintenance window starts, disable the alarm actions. When the window ends, enable the alarm actions again.
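If the maintenance itself is driven by a script, one way to pair the two calls is a context manager. This is only a sketch: `alarm_actions_disabled` is a hypothetical helper, and `client` is assumed to be a boto3 CloudWatch client (or anything exposing the same two methods).

```python
from contextlib import contextmanager


@contextmanager
def alarm_actions_disabled(client, alarm_names):
    """Disable alarm actions for the duration of the with-block."""
    client.disable_alarm_actions(AlarmNames=alarm_names)
    try:
        yield
    finally:
        # Runs even if the maintenance step raises, so alarms are not left muted.
        client.enable_alarm_actions(AlarmNames=alarm_names)


# Usage sketch (run_maintenance is a placeholder for your own code):
#
#   import boto3
#   with alarm_actions_disabled(boto3.client("cloudwatch"), ["Alarm-1"]):
#       run_maintenance()
```

The `finally` clause is the main design point: the actions are re-enabled even when the maintenance step fails.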

Mircea
2

This works not only for hours but also for days. I want my alarm to trigger only during business hours, between 1300 and 2300 UTC on weekdays. The alarm threshold is 0, so any value greater than 0 would not trigger the alarm. This is the expression used:

IF(DAY(m1)>5 OR HOUR(m1)>=23 OR HOUR(m1)<13, 100, m1)

I get a metric value of 100 during nights and weekends, and the actual value I want to alarm on during business hours

Tony BenBrahim
2

You can automate this by creating an EventBridge rule with a cron or rate schedule expression that invokes a Lambda function.

Then, you can use your Lambda function to enable or disable an alarm (or even multiple alarms together) according to your desired schedules.

disable_alarm = client.disable_alarm_actions(AlarmNames=alarm_names)

Here's a good tutorial: https://medium.com/geekculture/terraform-structure-for-enabling-disabling-alarms-in-batches-5c4f165a8db7
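A single Lambda function can serve both schedules if each EventBridge rule passes a constant JSON input to it. The sketch below assumes an event shaped like `{"action": "disable", "alarm_names": ["Alarm-1"]}`; those key names and the helper `toggle_alarm_actions` are made up for this example and are not part of any AWS API.

```python
def toggle_alarm_actions(client, action, alarm_names):
    """Enable or disable the actions of the given alarms."""
    if action == "disable":
        client.disable_alarm_actions(AlarmNames=alarm_names)
    elif action == "enable":
        client.enable_alarm_actions(AlarmNames=alarm_names)
    else:
        raise ValueError(f"unknown action: {action!r}")
    return {"action": action, "alarms": alarm_names}


def lambda_handler(event, context):
    """Entry point; expects the assumed 'action' and 'alarm_names' keys."""
    import boto3  # bundled with the Lambda Python runtime

    client = boto3.client("cloudwatch")
    return toggle_alarm_actions(client, event["action"], event["alarm_names"])
```

One rule would then fire at the start of the window with `"action": "disable"` and a second rule at the end with `"action": "enable"`.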

x89
0

Yeah, just as above, you can do it using the CLI. Also, in the case of custom metrics, you can stop sending data from your scripts, which stops the alarm from firing as long as the alarm does not treat missing data as breaching.

Another way is to write a script that defines the alarm with put-metric-alarm (CloudWatch CLI). It could have one function that creates the alarms and one that deletes them (delete-alarms). Call them when needed.
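A create/delete pair along those lines might look like the sketch below, using boto3's `put_metric_alarm` and `delete_alarms`. The alarm definition is a made-up example, and `client` is assumed to be a boto3 CloudWatch client.

```python
# Hypothetical alarm definition, kept in the script so it can be re-created.
EXAMPLE_ALARM = {
    "AlarmName": "Alarm-1",
    "Namespace": "MyNamespace",      # assumed custom namespace
    "MetricName": "MyMetric",        # assumed metric name
    "Statistic": "Sum",
    "Period": 300,
    "EvaluationPeriods": 1,
    "Threshold": 0,
    "ComparisonOperator": "LessThanOrEqualToThreshold",
}


def create_alarm(client, definition):
    """Create (or overwrite) the alarm from its stored definition."""
    client.put_metric_alarm(**definition)


def delete_alarm(client, definition):
    """Delete the alarm before the maintenance window starts."""
    client.delete_alarms(AlarmNames=[definition["AlarmName"]])
```

Compared with disabling alarm actions, this approach removes the alarm entirely during the window, so the definition has to live in the script or in version control.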

Ranvijay Jamwal
0

Agreed. I felt this pain with "presence of data" alarms during off hours. I would look at AWS Lambda's scheduled CloudWatch triggers, which are our new go-to scheduler for actions invoked no more often than once per minute.

https://docs.aws.amazon.com/lambda/latest/dg/with-scheduled-events.html
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/cloudwatch.html#CloudWatch.Client.disable_alarm_actions

hundel
0

If you are looking for a longer pause, this workaround might be useful:

  • Create a new SNS topic
  • Don't subscribe to the new topic
  • Edit your alarm to use the new topic
semaphore
0

If you're interested in using Lambda, you can use the functions below.

Enable Alarm actions

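import boto3

# Create the client once so that warm invocations reuse it.
cloudwatch = boto3.client('cloudwatch')


def lambda_handler(event, context):
    # Re-enable the actions of the listed alarms; replace 'AlarmName' with your own.
    response = cloudwatch.enable_alarm_actions(
        AlarmNames=['AlarmName']
    )
    return response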

Disable Alarm actions

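import boto3

# Create the client once so that warm invocations reuse it.
cloudwatch = boto3.client('cloudwatch')


def lambda_handler(event, context):
    # Disable the actions of the listed alarms; replace 'AlarmName' with your own.
    response = cloudwatch.disable_alarm_actions(
        AlarmNames=['AlarmName']
    )
    return response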
sachin_ur