I noticed that the consumer configuration has two IDs. One is group.id (mandatory) and the other is consumer.id (not mandatory).

What is the difference between these 2 IDs?

Ivan Aracki
Gnana
  • Based on the consumer groups, we can determine the messaging model of the consumer: "If all consumer instances have the same consumer group, then this works just like a traditional queue balancing load over the consumers. If all consumer instances have different consumer groups, then this works like publish-subscribe and all messages are broadcast to all the consumers." As per the above statement, to act as publish-subscribe every consumer group should have a unique name. Then why can't that be the consumer ID? Why a consumer group ID? What does "group" mean? – Gnana Jan 02 '16 at 05:57

2 Answers

Consumer groups are a Kafka abstraction that supports both point-to-point and publish/subscribe messaging. A consumer can join a consumer group (let us say group_1) by setting its group.id to group_1. Consumer groups are also a way of supporting parallel consumption of the data, i.e. different consumers of the same consumer group consume data in parallel from different partitions.
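
To make the two messaging models concrete, here is a minimal plain-Java simulation (it does not use the Kafka client). The class `GroupDeliverySketch`, the method `deliver` and the consumer names are hypothetical, and splitting messages round-robin inside a group is a simplification: real Kafka splits work by partition, not by message.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupDeliverySketch {

    /**
     * Simulates delivery: every group sees every message, but inside a
     * group each message is handed to exactly one member.
     * consumerToGroup maps a (hypothetical) consumer name to its group.id.
     */
    public static Map<String, List<String>> deliver(
            Map<String, String> consumerToGroup, List<String> messages) {
        Map<String, List<String>> membersByGroup = new LinkedHashMap<>();
        Map<String, List<String>> received = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : consumerToGroup.entrySet()) {
            membersByGroup
                    .computeIfAbsent(e.getValue(), g -> new ArrayList<>())
                    .add(e.getKey());
            received.put(e.getKey(), new ArrayList<>());
        }
        for (List<String> members : membersByGroup.values()) {
            for (int i = 0; i < messages.size(); i++) {
                // Simplification: round-robin by message index.
                String owner = members.get(i % members.size());
                received.get(owner).add(messages.get(i));
            }
        }
        return received;
    }

    public static void main(String[] args) {
        List<String> messages = List.of("m0", "m1", "m2", "m3");

        // Same group.id: the two consumers share the load (queue model).
        Map<String, String> sameGroup = new LinkedHashMap<>();
        sameGroup.put("consumer-a", "group_1");
        sameGroup.put("consumer-b", "group_1");
        System.out.println(deliver(sameGroup, messages));
        // prints {consumer-a=[m0, m2], consumer-b=[m1, m3]}

        // Different group.ids: each consumer gets everything (pub/sub model).
        Map<String, String> differentGroups = new LinkedHashMap<>();
        differentGroups.put("consumer-a", "group_1");
        differentGroups.put("consumer-b", "group_2");
        System.out.println(deliver(differentGroups, messages));
        // prints {consumer-a=[m0, m1, m2, m3], consumer-b=[m0, m1, m2, m3]}
    }
}
```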

In addition to group.id, each consumer also identifies itself to the Kafka broker using consumer.id. This is used by Kafka to identify the currently ACTIVE consumers of a particular consumer group.

Read this documentation for more details.

Aravind Yarram
  • Please give more examples to help me understand. What does "different consumers of the same consumer group consume data in parallel from different partitions" mean? How do I add different consumers to the same consumer group? I ran a Java consumer program with group.id=group1 and it consumed messages. I ran the same program again without changing group.id. The first program consumed messages, but the second consumer couldn't. What does "group" mean here? – Gnana Jan 01 '16 at 18:03
  • If you haven't partitioned the topic, then only one consumer is going to consume messages. If you want both consumers (belonging to the same consumer group) to consume messages in parallel, then you need to partition your topic. I'd recommend you read the documentation. – Aravind Yarram Jan 01 '16 at 18:27
  • Thanks for the additional information. I created one topic, demo, with 3 partitions on one broker. I ran the program with demo-group and it consumes messages. I ran the program again with the same group ID, but the second program couldn't consume messages. If I change the group ID to demo-group1, it works. Please explain the consumer group concept. – Gnana Jan 01 '16 at 18:51
  • I read the tutorial. I need to understand how to create a consumer group that has more than one consumer using the Java API. I am using the Java simple consumer API. Please help me with that. – Gnana Jan 01 '16 at 19:21
  • Reading the documentation for `consumer.id`, the description states "Generated automatically if not set." I assume this means that, if set manually, consumer.id should be unique for each consumer? I wonder what happens if you reuse the consumer ID across consumers (like how one shares the consumer group ID): do problems arise, or does it work fine but make tracking/debugging active consumers in the group messy? – David Nov 08 '18 at 22:43
  • @David That might be worth creating a new post for – OneCricketeer Dec 05 '18 at 23:44
  • There is no `consumer.id` in the documentation. Has it been replaced by `client_id`? but then it is same for every consumer by default. – y_159 Aug 16 '21 at 12:25

The figure below describes the difference between group.id and consumer.id really well.

[Figure: one topic with four partitions, read by one consumer group containing three consumers]

In this example, we have one topic with four partitions, one consumer group and three consumers inside the consumer group. Consumer 0 and consumer 2 read from one partition each, whereas consumer 1 reads from two partitions.

The group.id identifies the consumer group. It is the unique name/ID of your consumer group across your Kafka cluster. A consumer group can have one or more consumers, but only as many of them as there are partitions in the topic can read at the same time; any additional consumers stay idle.

Here we have four partitions, and three consumers have joined the consumer group: Consumer 0, Consumer 1 and Consumer 2. The ID (or name) of each consumer is its consumer.id. At most four consumers (i.e. consumer.ids) can be active because the topic has four partitions.
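
The partition arithmetic above can be sketched in plain Java. This is only an illustration: `PartitionAssignmentSketch` and `assign` are hypothetical names, and the round-robin rule is an assumption rather than Kafka's actual assignor, so which consumer ends up with the extra partition may differ from the figure.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionAssignmentSketch {

    /**
     * Spreads partitions 0..partitionCount-1 over the consumers of ONE
     * group, round-robin. Each partition gets exactly one owner, so
     * consumers beyond partitionCount end up with an empty list (idle).
     */
    public static Map<String, List<Integer>> assign(
            List<String> consumers, int partitionCount) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        if (consumers.isEmpty()) {
            return assignment; // nobody to assign to
        }
        for (int partition = 0; partition < partitionCount; partition++) {
            String owner = consumers.get(partition % consumers.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Four partitions, three consumers: one consumer reads two partitions.
        System.out.println(assign(
                List.of("consumer-0", "consumer-1", "consumer-2"), 4));
        // prints {consumer-0=[0, 3], consumer-1=[1], consumer-2=[2]}

        // Four partitions, five consumers: the fifth consumer has nothing to read.
        System.out.println(assign(
                List.of("c0", "c1", "c2", "c3", "c4"), 4));
        // prints {c0=[0], c1=[1], c2=[2], c3=[3], c4=[]}
    }
}
```

The same sketch also covers the single-partition case raised in the comments: with one partition and two consumers in the same group, the second consumer receives an empty assignment.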

Each consumer has a unique consumer.id across its consumer group. If you don't define a consumer.id, the Kafka client (Java, Python, Node.js, etc) usually chooses a random ID.
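
As a rough sketch of how two consumers end up in the same group, the following builds the configuration for each of them with `java.util.Properties` only. It assumes a recent Java client, where the optional per-instance label is `client.id`; the class `ConsumerConfigSketch`, the helper `consumerProps`, the broker address `localhost:9092` and the IDs are placeholders chosen for illustration.

```java
import java.util.Properties;

public class ConsumerConfigSketch {

    /** Builds consumer settings; groupId is shared, clientId is per instance. */
    public static Properties consumerProps(String groupId, String clientId) {
        Properties props = new Properties();
        // Placeholder broker address (assumption, adjust to your cluster).
        props.put("bootstrap.servers", "localhost:9092");
        // Same value for every consumer that should share the work.
        props.put("group.id", groupId);
        // Optional label for this particular consumer instance.
        props.put("client.id", clientId);
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        // Two instances of the same application: same group, different labels.
        Properties first = consumerProps("demo-group", "consumer-0");
        Properties second = consumerProps("demo-group", "consumer-1");
        System.out.println(first.getProperty("group.id")
                .equals(second.getProperty("group.id")));   // prints true
        System.out.println(first.getProperty("client.id")
                .equals(second.getProperty("client.id")));  // prints false
    }
}
```

Each `Properties` object would then be passed to its own consumer instance; both instances join `demo-group` and split the topic's partitions between them.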

Another good example to understand the relationship between group.id and consumer.id is the following figure:

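[Figure: topic T1 with four partitions, read by two consumer groups, group 1 with four consumers and group 2 with two consumers]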

Topic T1 has four partitions, and two consumer groups exist, group 1 and group 2. Group 1 holds four consumers (maxed out), while group 2 holds two consumers (space for 2 more consumers). Both consumers in group 2 read from two partitions each. In group 1, each consumer just reads from one partition.

This is a good example of why two separate consumer groups cannot share the same name (group.id). If Kafka allowed that, offset tracking for a partition would break, because a consumer in group 1 would overwrite the offset committed by a consumer in group 2 (and vice versa).
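
The offset argument can be illustrated with a small in-memory model. This is a simplified sketch, not Kafka's implementation: `OffsetStoreSketch` and its methods are hypothetical, and the only assumption carried over from Kafka is that a committed offset is looked up by group, topic and partition.

```java
import java.util.HashMap;
import java.util.Map;

public class OffsetStoreSketch {

    // Committed offsets, keyed by group.id + topic + partition.
    private final Map<String, Long> committed = new HashMap<>();

    private static String key(String groupId, String topic, int partition) {
        return groupId + "|" + topic + "|" + partition;
    }

    public void commit(String groupId, String topic, int partition, long offset) {
        committed.put(key(groupId, topic, partition), offset);
    }

    /** Returns the committed offset, or -1 if the group has none yet. */
    public long committedOffset(String groupId, String topic, int partition) {
        return committed.getOrDefault(key(groupId, topic, partition), -1L);
    }

    public static void main(String[] args) {
        OffsetStoreSketch store = new OffsetStoreSketch();

        // Distinct group names: each group keeps its own position.
        store.commit("group-1", "T1", 0, 42L);
        store.commit("group-2", "T1", 0, 7L);
        System.out.println(store.committedOffset("group-1", "T1", 0)); // prints 42
        System.out.println(store.committedOffset("group-2", "T1", 0)); // prints 7

        // An unrelated application reusing the name "group-1" lands on the
        // same key and overwrites the first application's position.
        store.commit("group-1", "T1", 0, 3L);
        System.out.println(store.committedOffset("group-1", "T1", 0)); // prints 3
    }
}
```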

Jay
  • In your last example, is "Consumer X" the consumer id? Does that mean that in that picture the group 2 has 2 consumers which have the same consumer.id of the consumers of group 1? – Bakuriu Dec 01 '22 at 18:21