Please help me understand how the Google Cloud Pub/Sub metric subscription/num_undelivered_messages behaves with a pull subscription.

From the docs, subscription/num_undelivered_messages is:

Number of unacknowledged messages (a.k.a. backlog messages) in a subscription. Sampled every 60 seconds. After sampling, data is not visible for up to 120 seconds.

And for pull delivery, from the docs:

In pull delivery, your subscriber application initiates requests to the Cloud Pub/Sub server to retrieve messages. The subscribing application explicitly calls the pull method, which requests messages for delivery.

I set up a pull subscription against the Google public topic projects/pubsub-public-data/topics/taxirides-realtime, which is supposed to provide a continuous stream of taxi ride data.

My requirement is to calculate the number of taxi rides in the past hour. The usual approach that came to mind is to pull all messages from the topic and aggregate over them.

However, while searching I found these two links, link1 and link2, which I feel could solve the problem, but question 1 below leaves me in doubt about that approach.

So overall, my questions are:
1. How does a Pub/Sub subscription determine the value of num_undelivered_messages for a topic, even when the subscription has not made any pull call? I can see this metric in Stackdriver Monitoring by filtering on the subscription ID.
2. What is the right way to calculate the number of messages present in a topic over a certain duration?
Neeraj Kumar

1 Answer

The number of undelivered messages is established based on when the subscription is created. Any messages published after that are messages that should be delivered to the subscription. Therefore, any of these messages not pulled and acked by the subscription will count toward num_undelivered_messages.

For solving your particular problem, it would be better to read the feed and aggregate the data. The stats like num_undelivered_messages are useful for examining the health of subscribers, e.g., if the count is building up, it could indicate that something is wrong with the subscribers or that the data published has changed in some way. You could look at the difference in the number between the end of your desired time interval and the beginning to get an estimate of the number of messages published in that time frame, assuming you aren't also consuming and acking any messages.
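The difference-based estimate described above can be sketched as follows. This is a minimal illustration, assuming the backlog samples have already been fetched from monitoring as (timestamp, value) pairs; the function name `estimate_published` and the sample values are hypothetical, and the result is only meaningful if nothing is consumed and acked from the subscription during the interval.

```python
from datetime import datetime, timedelta, timezone


def estimate_published(samples, start, end):
    """Estimate messages published in [start, end] from backlog samples.

    samples: iterable of (datetime, int) pairs, each a sampled value of
    num_undelivered_messages. Assumes no messages were acked in the interval.
    """
    ordered = sorted(samples, key=lambda pair: pair[0])

    def value_at(moment):
        # Use the most recent sample taken at or before the given moment.
        earlier = [value for ts, value in ordered if ts <= moment]
        if not earlier:
            raise ValueError(f"no sample at or before {moment.isoformat()}")
        return earlier[-1]

    return value_at(end) - value_at(start)


# Made-up sample values, one hour apart at the boundaries.
t0 = datetime(2019, 11, 27, 14, 0, tzinfo=timezone.utc)
samples = [
    (t0, 100),
    (t0 + timedelta(minutes=30), 150),
    (t0 + timedelta(hours=1), 220),
]
print(estimate_published(samples, t0, t0 + timedelta(hours=1)))  # 120
```

Because the metric is sampled every 60 seconds and can lag by up to 120 seconds, the boundaries are only accurate to within a few minutes.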

However, it is important to keep in mind that the time at which messages are published in this feed may not exactly correspond to the time at which a taxi ride occurred. Imagine there was an issue with the publisher and it was unable to publish the messages for a period of time and then once fixed, published all of the messages that had built up during that time. In this scenario, the timestamp in the messages themselves indicating when the taxi ride occurred would not match the time at which the message was received by Cloud Pub/Sub.
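Reading the feed and aggregating on the event time carried in each message, rather than on the publish time, might look like the sketch below. It assumes each payload is a JSON object with a `ride_id` field and an ISO 8601 `timestamp` field; those field names, the helper `count_rides`, and the sample payloads are assumptions for illustration and should be checked against the actual messages. Pulling the messages from the subscription is left out.

```python
import json
from datetime import datetime, timedelta, timezone


def count_rides(payloads, window_end, window=timedelta(hours=1)):
    """Count distinct rides whose event time falls in (window_end - window, window_end].

    payloads: iterable of JSON-encoded message bodies (bytes or str).
    Field names "ride_id" and "timestamp" are assumed, not confirmed.
    """
    window_start = window_end - window
    ride_ids = set()
    for raw in payloads:
        event = json.loads(raw)
        # Event time comes from the message body, not from the publish time.
        event_time = datetime.fromisoformat(event["timestamp"])
        if window_start < event_time <= window_end:
            ride_ids.add(event["ride_id"])
    return len(ride_ids)


# Made-up payloads: ride "a" reports twice, ride "c" is older than one hour.
payloads = [
    b'{"ride_id": "a", "timestamp": "2019-11-27T14:10:00+00:00"}',
    b'{"ride_id": "a", "timestamp": "2019-11-27T14:20:00+00:00"}',
    b'{"ride_id": "b", "timestamp": "2019-11-27T14:45:00+00:00"}',
    b'{"ride_id": "c", "timestamp": "2019-11-27T13:30:00+00:00"}',
]
now = datetime(2019, 11, 27, 15, 0, tzinfo=timezone.utc)
print(count_rides(payloads, now))  # 2
```

Counting distinct ride IDs rather than messages is a design choice for the case where one ride produces several messages; if every message is a separate ride, counting messages in the window is enough. Real payloads may use a timestamp format that needs a more lenient parser than `datetime.fromisoformat`.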

Kamal Aboul-Hosn
  • Thank you @Kamal Aboul-Hosn. The explanation is very helpful. Regarding the point about finding the difference: can the examples mentioned here https://cloud.google.com/monitoring/custom-metrics/reading-metrics#aligning be modified for this use case? – Neeraj Kumar Nov 27 '19 at 15:04
  • For example, by passing start and end times an hour apart, changing the filter to 'metric.type = "pubsub.googleapis.com/subscription/num_undelivered_messages" AND resource.label.subscription_id = "subscription-name"', and setting the aligner to ALIGN_SUM? I ran this modified program and got a big difference between the metric values in the Stackdriver Monitoring graph and in the program. For example, when Stackdriver was showing 38M, my program was showing 30M. Could you suggest something, please? – Neeraj Kumar Nov 27 '19 at 15:04
  • I don't think ALIGN_SUM is what you want. It's going to show the sum of the points in the interval. I have found what works is to set aligner to ALIGN_NEXT_OLDER, alignment period to 3600s, and the start and end time to an hour apart. You will get back two data points and you take the difference between those two. – Kamal Aboul-Hosn Nov 27 '19 at 20:31
  • As suggested, I tried ALIGN_NEXT_OLDER, but it returns only 1 data point. – Neeraj Kumar Nov 28 '19 at 00:49
  • @NeerajKumar Hello, I have reached a similar conclusion to yours. Did you find a solution to this particular issue? – Malcode Feb 13 '22 at 20:02
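
For reference, the query discussed in these comments (ALIGN_NEXT_OLDER, a 3600s alignment period, start and end an hour apart) can be written out as a Cloud Monitoring REST `timeSeries.list` URL. This is only a sketch of the request parameters: the helper name `backlog_query_url`, the project ID and the subscription name are placeholders, the request still has to be sent with valid credentials, and how many data points come back is not something this sketch settles.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def backlog_query_url(project_id, subscription_id, end_time):
    """Build a timeSeries.list URL for the backlog metric over the past hour.

    end_time must be a timezone-aware datetime.
    """
    start_time = end_time - timedelta(hours=1)
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # RFC 3339 timestamps in UTC
    params = {
        "filter": (
            'metric.type = "pubsub.googleapis.com/subscription/num_undelivered_messages" '
            f'AND resource.labels.subscription_id = "{subscription_id}"'
        ),
        "interval.startTime": start_time.astimezone(timezone.utc).strftime(fmt),
        "interval.endTime": end_time.astimezone(timezone.utc).strftime(fmt),
        "aggregation.alignmentPeriod": "3600s",
        "aggregation.perSeriesAligner": "ALIGN_NEXT_OLDER",
    }
    base = f"https://monitoring.googleapis.com/v3/projects/{project_id}/timeSeries"
    return base + "?" + urlencode(params)


# "my-project" and "subscription-name" are placeholders.
end = datetime(2019, 11, 27, 15, 0, tzinfo=timezone.utc)
print(backlog_query_url("my-project", "subscription-name", end))
```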