I need to get the sum of MB sent to a bunch of CloudWatch log groups, as data rather than in the console. But first I need to get the sums working for just two groups.

I started with an AWS support article. Then I grabbed the metric names I needed from the CloudWatch console. Then I looked at the docs for the get-metric-data CLI command.


Between the three, this is the closest I got:

aws cloudwatch get-metric-data --profile default --metric-data-queries file://./.temp/metric-data-queries.json  \
--start-time 2019-12-04T00:00:00Z --end-time 2019-12-18T00:00:00Z

Where the query file looks like this:

[
    {
        "Id": "mbSum",
        "MetricStat": {
            "Metric": {
                "Namespace": "AWS/Logs",
                "MetricName": "IncomingBytes",
                "Dimensions": [
                    {
                        "Name": "LogGroupName",
                        "Value": "/aws/lambda/prd-***-lambda"
                    },
                    {
                        "Name": "LogGroupName",
                        "Value": "/aws/lambda/prd-****-lambda"
                    }
                    ... 98 more, down the road, but just two for now
                ]
            },
            "Period": 1209600,
            "Stat": "Sum",
            "Unit": "Megabytes"
        }
    }
]

The result I got was:

{
    "MetricDataResults": [
        {
            "Id": "mbSum",
            "Label": "IncomingBytes",
            "Timestamps": [],
            "Values": [],
            "StatusCode": "Complete"
        }
    ],
    "Messages": []
}

I'd expect a zero in there if there were no results. I tried with a period of 300 (as the get-metric-data doc suggests); no change. The information I have regarding period is contradictory/unclear. What am I missing here?

jcollum
2 Answers

Getting this working with the AWS CLI was a huge hassle. I ended up grabbing a Python script from this answer and modifying it a little:

#!/usr/bin/env python3

# Outputs all log groups with > 1 GB of IncomingBytes in the past x days

import boto3
from datetime import datetime as dt
from datetime import timedelta

days_to_check = 30

# Set up the session before creating clients, so both clients use the profile
boto3.setup_default_session(profile_name="default")
logs_client = boto3.client('logs')
cloudwatch_client = boto3.client('cloudwatch')

end_date = dt.today().isoformat(timespec='seconds')
start_date = (dt.today() - timedelta(days=days_to_check)).isoformat(timespec='seconds')
print("looking from %s to %s" % (start_date, end_date))

paginator = logs_client.get_paginator('describe_log_groups')
pages = paginator.paginate()
page_c = 0
total_checked = 0

for page in pages:
  page_c += 1
  for json_data in page['logGroups']:
    total_checked += 1
    log_group_name = json_data.get("logGroupName")

    print(f"Page {page_c}: checking {log_group_name}                                    ", end="\r", flush=True)

    cw_response = cloudwatch_client.get_metric_statistics(
        Namespace='AWS/Logs',
        MetricName='IncomingBytes',
        Dimensions=[
            {
                'Name': 'LogGroupName',
                'Value': log_group_name
            },
        ],
        StartTime=start_date,
        EndTime=end_date,
        Period=3600 * 24 * days_to_check,
        Statistics=['Sum'],
        Unit='Bytes'
    )
    if cw_response.get("Datapoints"):
        stats_data = cw_response.get("Datapoints")[0]
        stats_sum = stats_data.get("Sum")
        sum_GB = stats_sum / (1000 * 1000 * 1000)
        if sum_GB > 1.0:
            print("   **** %s exceeded 1GB log sent, total %.2f GB **** " % (log_group_name, sum_GB))

print(f"Done. Checked {total_checked} logs.                                         ")

Worth noting that we have thousands of log groups, so the CLI was going to be a difficult solution for this. If anyone wants to improve it, go for it. My Python is meh.
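One possible improvement (a sketch, untested against a real account): instead of one get_metric_statistics call per log group, GetMetricData accepts many queries per request (the per-call limit is several hundred; check the current quota), so the group names could be batched. `build_queries` and `BATCH_SIZE` below are hypothetical names, not part of boto3:

```python
# Sketch: batch log group names into GetMetricData query payloads so one
# API call covers many log groups instead of one call per group.
# build_queries and BATCH_SIZE are hypothetical, not part of boto3.

BATCH_SIZE = 500  # assumed GetMetricData per-call query limit; check the current quota

def build_queries(log_group_names, period_seconds):
    """Yield lists of MetricDataQuery dicts, one query per log group."""
    batch = []
    for i, name in enumerate(log_group_names):
        batch.append({
            # Ids must start with a lowercase letter and be unique per request
            "Id": f"q{i}",
            "MetricStat": {
                "Metric": {
                    "Namespace": "AWS/Logs",
                    "MetricName": "IncomingBytes",
                    "Dimensions": [{"Name": "LogGroupName", "Value": name}],
                },
                "Period": period_seconds,
                "Stat": "Sum",
            },
        })
        if len(batch) == BATCH_SIZE:
            yield batch
            batch = []
    if batch:
        yield batch
```

Each batch would then go to `cloudwatch_client.get_metric_data(MetricDataQueries=batch, StartTime=..., EndTime=...)`, matching results back to log groups by `Id`.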

jcollum

These metrics are emitted with the unit Bytes, and CloudWatch does not perform any unit conversion automatically. Change the unit to Bytes (or don't specify the unit at all) and use metric math to convert bytes to megabytes.

Here is a simplified request that sums up all the incoming bytes for all log groups:

[
    {
        "Id": "mbSum",
        "Expression": "SUM(SEARCH('{AWS/Logs,LogGroupName} MetricName=\"IncomingBytes\"', 'Sum', 1209600))/1000000",
        "ReturnData": true
    }
]

The response I got on my test account:

{
    "MetricDataResults": [
        {
            "Timestamps": [
                "2019-12-04T00:00:00Z"
            ],
            "StatusCode": "Complete",
            "Values": [
                4.844451
            ],
            "Id": "mbSum",
            "Label": "mbSum"
        }
    ]
}

Saved the payload in query.json and executed this command:

aws cloudwatch get-metric-data --metric-data-queries file://query.json  \
--start-time 2019-12-04T00:00:00Z --end-time 2019-12-18T00:00:00Z
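
As an aside, the dimension-based approach from the question can also be made to work without SEARCH: each log group is a separate metric, so the payload needs one MetricStat query per log group, combined with a metric-math expression. A sketch with placeholder log group names and query Ids:

```json
[
    {
        "Id": "b1",
        "MetricStat": {
            "Metric": {
                "Namespace": "AWS/Logs",
                "MetricName": "IncomingBytes",
                "Dimensions": [
                    { "Name": "LogGroupName", "Value": "/aws/lambda/first-lambda" }
                ]
            },
            "Period": 1209600,
            "Stat": "Sum"
        },
        "ReturnData": false
    },
    {
        "Id": "b2",
        "MetricStat": {
            "Metric": {
                "Namespace": "AWS/Logs",
                "MetricName": "IncomingBytes",
                "Dimensions": [
                    { "Name": "LogGroupName", "Value": "/aws/lambda/second-lambda" }
                ]
            },
            "Period": 1209600,
            "Stat": "Sum"
        },
        "ReturnData": false
    },
    {
        "Id": "mbSum",
        "Expression": "(b1+b2)/1000000",
        "ReturnData": true
    }
]
```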
Dejan Peretin
  • You sure? https://docs.aws.amazon.com/cli/latest/reference/cloudwatch/get-metric-data.html disagrees with this. Changed it to Bytes, no difference. – jcollum Dec 18 '19 at 19:32
  • Yeah, from the doc you linked: 'If you specify a unit, the operation returns only data that was collected with that unit specified. If you specify a unit that does not match the data collected, the results of the operation are null. CloudWatch does not perform unit conversions'. – Dejan Peretin Dec 18 '19 at 19:45
  • Next thing to check would be region and creds. Make sure the region in the default profile matches the one where the metrics are and the creds match the account. – Dejan Peretin Dec 18 '19 at 19:46
  • OK but I changed it to Bytes and still got an empty result set. Same result for no units specified. – jcollum Dec 18 '19 at 19:47
  • Tried adding `--region us-west-2` (and the other 3 US regions) to the CLI call -- no change in result. – jcollum Dec 18 '19 at 19:49
  • I updated the answer with a request that I tested on my account, to rule out any issue with the request payload. – Dejan Peretin Dec 18 '19 at 19:54
  • Ah, I see, you listed all the dimensions under the same metric. Those should be separate metrics. See if my SEARCH expression works for you. – Dejan Peretin Dec 18 '19 at 20:01
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/204504/discussion-between-jcollum-and-unkindness-of-datapoints). – jcollum Dec 18 '19 at 20:02