We have a Kafka cluster that we monitor using a Grafana dashboard provided by AWS CloudWatch. For the past few days we have been seeing some abnormal behavior. I don't know much about the SumOffsetLag metric. How can a spike in this value degrade performance, and how can we improve it?
1 Answer
Consumer-lag monitoring
Monitoring consumer lag allows you to identify slow or stuck consumers that aren't keeping up with the latest data available in a topic. When necessary, you can then take remedial actions, such as scaling or rebooting those consumers. To monitor consumer lag, you can use Amazon CloudWatch or open monitoring with Prometheus.
Consumer lag metrics quantify the difference between the latest data written to your topics and the data read by your applications. Amazon MSK provides the following consumer-lag metrics, which you can get through Amazon CloudWatch or through open monitoring with Prometheus: EstimatedMaxTimeLag, EstimatedTimeLag, MaxOffsetLag, OffsetLag, and SumOffsetLag. Amazon MSK supports consumer lag metrics for clusters with Apache Kafka 2.2.1 or a later version.
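To make the offset-based metrics concrete, here is a minimal sketch of how they relate to each other. It assumes that the per-partition lag is the log end offset minus the consumer group's committed offset, that SumOffsetLag is the total of those lags across a topic's partitions, and that MaxOffsetLag is the largest of them. The `PartitionOffsets` type, the `summarize_lag` helper, and the sample offsets are all made up for illustration; this is not how MSK computes the metrics internally.

```python
from typing import Dict, NamedTuple


class PartitionOffsets(NamedTuple):
    """Hypothetical snapshot of one partition for one consumer group."""
    log_end_offset: int    # offset of the next message to be written
    committed_offset: int  # last offset committed by the consumer group


def offset_lag(p: PartitionOffsets) -> int:
    # Lag in number of offsets; clamp at 0 in case the snapshot is stale.
    return max(p.log_end_offset - p.committed_offset, 0)


def summarize_lag(partitions: Dict[int, PartitionOffsets]) -> Dict[str, object]:
    # Per-partition lag, then the topic-level sum and maximum.
    lags = {pid: offset_lag(p) for pid, p in partitions.items()}
    return {
        "OffsetLag": lags,
        "SumOffsetLag": sum(lags.values()),
        "MaxOffsetLag": max(lags.values(), default=0),
    }


# Illustrative numbers only: partition 1 is far behind the others.
sample = {
    0: PartitionOffsets(log_end_offset=1000, committed_offset=990),
    1: PartitionOffsets(log_end_offset=2000, committed_offset=1500),
    2: PartitionOffsets(log_end_offset=500, committed_offset=500),
}
summary = summarize_lag(sample)
print(summary)
```

With these sample values a single slow partition accounts for almost all of the summed lag, so when SumOffsetLag spikes it is worth checking whether MaxOffsetLag spikes with it (one stuck partition or consumer) or whether the lag is spread evenly (the whole consumer group is under-provisioned).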

Plagiarized - https://docs.aws.amazon.com/msk/latest/developerguide/consumer-lag.html – OneCricketeer Dec 03 '22 at 17:37