34

I have a set of cloudwatch logs in json format that contain a username field. How can I write a cloudwatch metric query that counts the number of unique users per month?

postelrich
  • 3,274
  • 5
  • 38
  • 65

5 Answers5

36

Now you can count unique field values using the count_distinct instruction inside CloudWatch Insights queries.

Example:

fields userId, @timestamp
| stats count_distinct(userId)

More info on CloudWatch Insights: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AnalyzingLogData.html

Leonardo Kuffo
  • 901
  • 9
  • 10
  • 7
    I know this is an old comment but I just wanted to flag that if the number of unique values is too high, count_distinct provides an APPROXIMATION. I had a situation where the real number was something like 1760 and count_distinct was returning 1470. We spent a few hours trying to figure out why https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_QuerySyntax.html#:~:text=specified%20field%20name.-,count_distinct,-(fieldName%3A%20LogField) – Gabriel Apr 07 '22 at 18:24
26

You can now do this! Using CloudWatch Insights.

API: https://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/API_StartQuery.html

I am working on a similar problem and my query for this API looks something like:

fields @timestamp, @message
| filter @message like /User ID/
| parse @message "User ID: *" as @userId
| stats count(*) by @userId

To get the User Ids. Right now this returns with a list of them then counts for each one. Getting a total count of unique can either be done after getting the response or probably by playing with the query more.

You can easily play with queries using the CloudWatch Insights page in the AWS Console.

ImperviousPanda
  • 426
  • 6
  • 13
18

I think you can achieve that by following query:

Log statement being parsed: "Trying to login user: abc ....."

fields @timestamp, @message
| filter @message like /Trying to login user/
| parse @message "Trying to login user: * and " as user
| sort @timestamp desc
| stats count(*) as loginCount by user | sort loginCount desc

This will print the table in such a way,

# user loginCount
1 user1 10
2 user2 15
......
Dharman
  • 30,962
  • 25
  • 85
  • 135
Sahil Mahajan
  • 203
  • 2
  • 6
  • I had to filter out a huge list of error logs and extract certain unique token from all of them. This answer saved the day. :) Thanks! – Abhay Maniyar Apr 06 '22 at 09:02
  • This answer worked great for me. I wanted to extract some data as a substring where the useful text was at the end of the log line. +1 – iam.Carrot Mar 01 '23 at 11:54
2

I don't think you can.

Amazon CloudWatch Logs can scan log files for a specific string (eg "Out of memory"). When it encounters this string, it will increment a metric. You can then create an alarm for "When the number of 'Out of memory' errors exceeds 10 over a 15-minute period".

However, you are seeking to count unique users, which does not translate well into this method.

You could instead use Amazon Athena, which can run SQL queries against data stored in Amazon S3. For examples, see:

John Rotenstein
  • 241,921
  • 22
  • 380
  • 470
0

If I need to see all the distinct counts and not just a number then I do this.

fields @timestamp, @message
| sort @timestamp desc
| stats count_distinct(field_1) as myHeader by field_1

Inspired by @sahil-mahajan's answer