2

I'm planning to skip the start of the topic and only read messages from a certain timestamp to the end. Any hints on how to achieve this?

Mickael Maison
Silithus

2 Answers

8

I'm guessing you are using kafka-python (https://github.com/dpkp/kafka-python), since you mentioned "KafkaConsumer".

You can use the offsets_for_times() method to look up, for each partition, the earliest offset whose timestamp is greater than or equal to a given timestamp (in milliseconds since the epoch): https://kafka-python.readthedocs.io/en/master/apidoc/KafkaConsumer.html#kafka.KafkaConsumer.offsets_for_times
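
Since the timestamps are expected in milliseconds, here is a minimal sketch of building the dictionary that offsets_for_times() takes. The helper names `to_epoch_ms` and `build_timestamp_query` are made up for illustration and are not part of kafka-python; the partitions would normally be the TopicPartition objects returned by your consumer's assignment().

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(dt: datetime) -> int:
    """Convert a timezone-aware datetime to milliseconds since the Unix epoch."""
    if dt.tzinfo is None:
        # Naive datetimes are ambiguous (local time vs. UTC), so reject them.
        raise ValueError("pass a timezone-aware datetime")
    # Integer division of timedeltas avoids float rounding.
    return (dt - EPOCH) // timedelta(milliseconds=1)


def build_timestamp_query(partitions, dt: datetime) -> dict:
    """Map every partition to the same millisecond timestamp."""
    ts = to_epoch_ms(dt)
    return {tp: ts for tp in partitions}


# Hypothetical usage with a kafka-python consumer named `consumer`:
#   query = build_timestamp_query(consumer.assignment(),
#                                 datetime(2021, 1, 1, tzinfo=timezone.utc))
#   offsets = consumer.offsets_for_times(query)
```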

Then seek to that offset using seek(): https://kafka-python.readthedocs.io/en/master/apidoc/KafkaConsumer.html#kafka.KafkaConsumer.seek
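
Putting the two calls together, a rough sketch could look like the following. `seek_to_timestamp` is a made-up helper name, and the broker address and topic in the usage comments are placeholders. It assumes partitions are already assigned (for example via assign(); with subscribe() the assignment is usually empty until the first poll()). Seeking to the end when the lookup returns None is my own choice, not something the library requires.

```python
def seek_to_timestamp(consumer, timestamp_ms):
    """Position each assigned partition at the first message whose
    timestamp is >= timestamp_ms (milliseconds since the epoch)."""
    partitions = consumer.assignment()
    if not partitions:
        raise RuntimeError("no partitions assigned yet")

    # One lookup for all partitions: {TopicPartition: OffsetAndTimestamp or None}
    offsets = consumer.offsets_for_times({tp: timestamp_ms for tp in partitions})

    for tp in partitions:
        found = offsets.get(tp)
        if found is None:
            # No message at or after the timestamp: skip to the end
            # so only new messages are read (an assumption of this sketch).
            consumer.seek_to_end(tp)
        else:
            consumer.seek(tp, found.offset)
    return offsets


# Hypothetical usage (placeholder broker and topic):
#   from kafka import KafkaConsumer, TopicPartition
#   consumer = KafkaConsumer(bootstrap_servers="localhost:9092")
#   consumer.assign([TopicPartition("my-topic", 0)])
#   seek_to_timestamp(consumer, 1609459200000)
#   for message in consumer:
#       print(message.offset, message.timestamp)
```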

Hope this helps!

Mickael Maison
0

I got around it, but I'm not sure about the values I got back from the method. I have a KafkaConsumer (ck), and I got the partitions for the topic with the assignment() method. With those, I can build a dictionary mapping each topic partition to the timestamp I'm interested in (in this case 100).

Side question: should I use 0 in order to get all the messages?

I can pass that dictionary as the argument to offsets_for_times(). However, the values I get back are all None:

# one timestamp (ms since the epoch) per assigned TopicPartition
zz = dict(zip(ck.assignment(), [100] * len(ck.assignment())))
z = ck.offsets_for_times(zz)
z.values()

dict_values([None, None, None])

Diego