0

I would like to know how to get the number of messages per topic in Kafka through the Java API. I don't want to use the command line tool mentioned in the post linked below. Any idea how to do this?

PS: I don't want to loop through the Kafka consumer stream to work out the count. I am trying to get this count at the beginning (before consuming from Kafka).

Java, How to get number of messages in a topic in apache kafka

amateur

1 Answer

3

Using the new KafkaConsumer, you can call seekToBeginning(...) and seekToEnd(...), compute the difference between the largest and smallest offset for each partition, and sum those differences.

Seeking does not consume any messages. Keep in mind that seeking is lazy, i.e., you need to call position(...) to actually trigger the seek. Because of this laziness, neither seek method returns anything. However, position(...) will give you the offsets you can use for your computation.

See http://kafka.apache.org/0100/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html
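The steps above could be sketched roughly as follows. Since a real KafkaConsumer needs a running broker, this sketch assumes a hypothetical `OffsetSeeker` interface that only mirrors the three calls (seekToBeginning, seekToEnd, position) and identifies partitions by plain integers; `FakeSeeker` and its offsets are made up for illustration. With the real client, the partitions would be TopicPartition objects passed to assign(...) before seeking.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

public class TopicMessageCount {

    // Hypothetical stand-in for the three KafkaConsumer calls used here.
    // Partitions are plain ints instead of TopicPartition objects.
    interface OffsetSeeker {
        void seekToBeginning(Collection<Integer> partitions);
        void seekToEnd(Collection<Integer> partitions);
        long position(int partition);
    }

    // Sums (latest offset - earliest offset) over all given partitions.
    static long countMessages(OffsetSeeker consumer, Collection<Integer> partitions) {
        // Seek to the start, then read position(...) to get the earliest offsets.
        consumer.seekToBeginning(partitions);
        Map<Integer, Long> earliest = new HashMap<>();
        for (int p : partitions) {
            earliest.put(p, consumer.position(p));
        }

        // Seek to the end, then read position(...) again for the latest offsets.
        consumer.seekToEnd(partitions);
        long total = 0;
        for (int p : partitions) {
            total += consumer.position(p) - earliest.get(p);
        }
        return total;
    }

    // In-memory fake with fixed begin/end offsets per partition (illustration only).
    static class FakeSeeker implements OffsetSeeker {
        private final long[] begin;
        private final long[] end;
        private final long[] current;

        FakeSeeker(long[] begin, long[] end) {
            this.begin = begin;
            this.end = end;
            this.current = new long[begin.length];
        }

        public void seekToBeginning(Collection<Integer> partitions) {
            for (int p : partitions) {
                current[p] = begin[p];
            }
        }

        public void seekToEnd(Collection<Integer> partitions) {
            for (int p : partitions) {
                current[p] = end[p];
            }
        }

        public long position(int partition) {
            return current[partition];
        }
    }

    public static void main(String[] args) {
        // Partition 0 spans offsets 0..10, partition 1 spans 5..7.
        OffsetSeeker fake = new FakeSeeker(new long[] {0, 5}, new long[] {10, 7});
        System.out.println(countMessages(fake, Arrays.asList(0, 1))); // prints 12
    }
}
```

No records are polled at any point; only offsets are read, which matches the requirement of knowing the count before consuming.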

Matthias J. Sax
  • 1
    This works well except on the MapR version of Kafka (MapR Streams). The offset there is the actual byte offset of the start of the record and not the "increment by one" offset used in the original Kafka... So taking the difference between earliest and latest only tells you the sum of the messages (and not even that, really) with mapR. – Chris Gerken Aug 07 '16 at 02:56
  • 1
    The question does not say anything about using MapR Streams (which is AFAIK not a MapR version of Kafka, but a new system -- only API compatible). – Matthias J. Sax Aug 07 '16 at 09:08
  • 1
    You're correct about that. The question was about "real" Kafka, that's why the +1 – Chris Gerken Aug 07 '16 at 10:42