I have a system with Kafka that looks like this (all consumers are within a single consumer group):

Producer ---[ 1 topic, 1 partition] ---> Consumer1
                                    |--> Consumer2
                                    ...
                                    |--> Consumern

In each consumer I poll for messages, then do an expensive computation (anywhere from 1s up to 60s). If the operation is successful, I commit the offset.

Can it happen that before I commit, another consumer will start to process the same message? I need to guarantee that once the message is picked up, it is executed exactly once - unless the processing fails mid-way.

zool
There is a discussion of this topic here: https://stackoverflow.com/questions/25896109/in-apache-kafka-why-cant-there-be-more-consumer-instances-than-partitions/40062081#comment128811425_40062081 The short answer: nope :) – Nikita_kharkov_ua Jul 10 '22 at 15:04

1 Answer

I'm not sure exactly what you mean by multiple consumers consuming from one partition.

But the rule of thumb is this: no matter how many consumers you have in a single consumer group, at any given point in time each partition is assigned to exactly one consumer in that group. With a single partition, the remaining consumer instances sit idle until the active consumer dies or leaves the group and a rebalance hands the partition to one of them. When to commit the offset after the poll is up to you. You can have at-most-once (commit the offset, then process the message), at-least-once (commit the offset only after processing the message), or exactly-once semantics.
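
To make the difference between the first two commit orderings concrete, here is a minimal sketch in plain Python. It is a toy simulation, not the Kafka client API: `Partition`, `run_consumer`, `simulate`, and the `crash_on` parameter are all hypothetical names invented for illustration, and the "crash" is assumed to happen between the two steps (commit and process) of one record.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    """Toy stand-in for one partition plus the group's committed offset (hypothetical)."""
    records: list
    committed: int = 0  # offset of the next record the group will read

    def poll(self):
        """Return (offset, record) at the committed position, or None when drained."""
        if self.committed < len(self.records):
            return self.committed, self.records[self.committed]
        return None


def run_consumer(partition, processed, commit_first, crash_on=None):
    """Consume until the partition is drained or a simulated crash occurs.

    commit_first=True  -> at-most-once  (commit, then process)
    commit_first=False -> at-least-once (process, then commit)
    crash_on names a record on which this consumer dies between the two steps.
    """
    while (polled := partition.poll()) is not None:
        offset, record = polled
        if commit_first:
            partition.committed = offset + 1
            if record == crash_on:
                return  # died after commit, before processing: record is lost
            processed.append(record)
        else:
            processed.append(record)
            if record == crash_on:
                return  # died after processing, before commit: record is redelivered
            partition.committed = offset + 1


def simulate(commit_first):
    """Run one consumer that dies on "b", then a replacement that takes over."""
    partition = Partition(records=["a", "b", "c"])
    processed = []
    run_consumer(partition, processed, commit_first, crash_on="b")
    run_consumer(partition, processed, commit_first)
    return processed


print(simulate(commit_first=True))   # at-most-once:  ['a', 'c']
print(simulate(commit_first=False))  # at-least-once: ['a', 'b', 'b', 'c']
```

With commit-then-process the record `"b"` is never processed, while with process-then-commit it is processed twice by two different consumers. If duplicates are unacceptable under at-least-once, the processing step itself generally has to be idempotent.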

ChristDist
But what if I have thousands of messages in one partition and need to read them as fast as I can? With only one consumer, it takes far too long. Can you suggest a solution? I guess my question is related to the main one. – Nikita_kharkov_ua Jul 10 '22 at 15:07