
While going through the docs, I came across the following:

Automatic Commit

The easiest way to commit offsets is to allow the consumer to do it for you. If you configure enable.auto.commit=true, then every five seconds the consumer will commit the largest offset your client received from poll(). The five-second interval is the default and is controlled by setting auto.commit.interval.ms. Just like everything else in the consumer, the automatic commits are driven by the poll loop. Whenever you poll, the consumer checks if it is time to commit, and if it is, it will commit the offsets it returned in the last poll.
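For reference, the two settings named in that passage can be written out as plain consumer properties. This is only a minimal sketch: the class name `AutoCommitConfig`, the broker address, and the group id are placeholders I made up for illustration, and only `enable.auto.commit` and `auto.commit.interval.ms` come from the quoted text.

```java
import java.util.Properties;

public class AutoCommitConfig {
    // Builds consumer properties with auto-commit turned on.
    // intervalMs is the value for auto.commit.interval.ms (5000 is the default
    // mentioned in the quoted passage).
    static Properties autoCommitProps(int intervalMs) {
        Properties props = new Properties();
        // Assumption: placeholder broker address and group id, not from the question.
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "demo-group");
        // The two settings discussed in the quoted passage.
        props.setProperty("enable.auto.commit", "true");
        props.setProperty("auto.commit.interval.ms", Integer.toString(intervalMs));
        return props;
    }

    public static void main(String[] args) {
        Properties props = autoCommitProps(5000);
        System.out.println(props.getProperty("enable.auto.commit"));
        System.out.println(props.getProperty("auto.commit.interval.ms"));
    }
}
```

These properties would normally be handed to the consumer's constructor together with the deserializer settings, which are left out here.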

And also, from the question "How does kafka consumer auto commit work?":

The auto-commit check is called in every poll and it checks that the "time elapsed is greater than the configured time". If so, the offset is committed.

If the commit interval is 5 seconds and poll is called only every 7 seconds, the commit will happen only every 7 seconds.

However, on a closer look, the auto commit doesn't seem to actually happen every 5 seconds (or whatever interval is configured through "auto.commit.interval.ms"). It happens on a poll() call, and only if the "time elapsed" since the previous commit is greater than "auto.commit.interval.ms". The actual gap between two commits is therefore "auto.commit.interval.ms" plus however long it takes until the next poll() runs, which means the offset is not necessarily committed at every interval configured through "auto.commit.interval.ms".
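To make my reading concrete, here is a small simulation of the check as I understand it from the quotes above. It is a simplified model and not the actual Kafka client code: the class `AutoCommitSim` and the method `commitTimes` are hypothetical names, and it assumes the commit timer starts at time 0 and is reset to "now + interval" whenever a commit fires.

```java
import java.util.ArrayList;
import java.util.List;

public class AutoCommitSim {
    // Returns the times (in ms) at which a commit would fire if poll() is
    // called every pollEveryMs, starting at t=0 and stopping after untilMs.
    // Assumption: the commit check runs only inside poll(), as the quotes describe.
    static List<Long> commitTimes(long intervalMs, long pollEveryMs, long untilMs) {
        if (pollEveryMs <= 0) {
            throw new IllegalArgumentException("pollEveryMs must be positive");
        }
        List<Long> commits = new ArrayList<>();
        long deadline = intervalMs; // first commit is due one interval after start
        for (long now = 0; now <= untilMs; now += pollEveryMs) {
            if (now >= deadline) {            // "time elapsed" has reached the interval
                commits.add(now);             // commit happens on this poll
                deadline = now + intervalMs;  // next commit is due one interval later
            }
        }
        return commits;
    }

    public static void main(String[] args) {
        // Interval 5 s, poll every 7 s: prints [7000, 14000, 21000]
        System.out.println(commitTimes(5000, 7000, 21000));
        // Interval 5 s, poll every 2 s: prints [6000, 12000, 18000]
        System.out.println(commitTimes(5000, 2000, 20000));
    }
}
```

Under this model, polling every 7 seconds commits every 7 seconds, and polling every 2 seconds commits every 6 seconds, so in neither case does a commit land exactly on the 5-second interval.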

Please add your thoughts

Update #1

The more details I go through, the more confusing this gets. Can someone add more details about this: does a poll() call happen in the background every 5 seconds, separate from the poll() call issued by the consumer application? I came across this statement:

The poll() call is issued in the background at the set auto.commit.interval.ms.

OneCricketeer
Nag
  • I would check the more recent docs. The definitive guide was published 4 years ago when Kafka 0.10 was released. There have been a lot of updates that have changed the behaviour of the clients (for example, heartbeats are sent in a different thread nowadays, and not from `poll()`). Read the "definitive guide" as a high level intro, and then jump to the docs in the confluent website. About your question, poll might be called more often depending on other configuration properties or the amount of data going through the topic. – Augusto Jun 07 '20 at 08:41
  • Hmm, this is adding a lot of confusion now. As I understand it, poll is triggered by an explicit call to the poll() method from the consumer application. However, I read a phrase such as "The poll() call is issued in the background at the set auto.commit.interval.ms.", which allows conflicting interpretations: does it mean poll() will be called "automatically" every 5 sec (configured through auto.commit.interval.ms) without the consumer application triggering another poll() call? I also need to check :-( – Nag Jun 07 '20 at 08:54
  • The code is open source, too. You could read it for what really happens rather than the documentation. Just a thought – OneCricketeer Jun 09 '20 at 03:43
  • That is right. I guess this needs a simple answer; I don't think it requires looking into the code unless it is a corner case – Nag Jun 09 '20 at 05:22
  • This is very confusing. The best explanation I've found so far: https://docs.confluent.io/platform/current/clients/consumer.html#message-handling – Ant Feb 16 '23 at 22:31
