0

I would like to know whether a Kafka consumer (Java client) can read and process multiple messages in parallel, i.e. using multiple threads. Should I use RxJava?

1) Is this a good approach? 2) As I understand it, Kafka treats each thread as a separate consumer. Please correct me if I'm wrong.

3) I would also like to run the Java client as a daemon service on Linux so that it runs continuously, polls Kafka for messages, and reads and processes them. Is this a good approach?

shiv455

2 Answers

1

Kafka supports parallel processing through partitions: you can start several consumers, each assigned one or more partitions. Because a partition is read by only one consumer in the group, messages within a partition can still be processed in order in this mode.

Of course, you can also start multiple threads to process multiple messages within one consumer, but then in-order processing within a partition is no longer guaranteed.
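
One way to get threads inside a single consumer without losing per-partition order is to give each partition its own single-threaded worker. The following is only a sketch under assumptions: `Msg` is a hypothetical stand-in for a Kafka record, the batch stands in for what a poll would return, and offset committing is left out entirely.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a Kafka record: only the fields this sketch needs.
final class Msg {
  final int partition;
  final long offset;

  Msg(int partition, long offset) {
    this.partition = partition;
    this.offset = offset;
  }
}

class PartitionOrderedDispatch {

  // Hands every message to the single-threaded worker of its partition and
  // returns, per partition, the offsets in the order they were handled.
  static Map<Integer, List<Long>> process(List<Msg> batch) {
    Map<Integer, ExecutorService> workers = new HashMap<>();
    Map<Integer, List<Long>> handled = new HashMap<>();

    for (Msg m : batch) {
      // One worker thread per partition: partitions run in parallel,
      // messages of the same partition run one after another.
      ExecutorService worker =
          workers.computeIfAbsent(m.partition, p -> Executors.newSingleThreadExecutor());
      List<Long> log = handled.computeIfAbsent(m.partition, p -> new CopyOnWriteArrayList<>());
      worker.execute(() -> log.add(m.offset)); // real processing would go here
    }

    // Wait for all workers to drain before returning the result.
    for (ExecutorService worker : workers.values()) {
      worker.shutdown();
      try {
        worker.awaitTermination(10, TimeUnit.SECONDS);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
    return handled;
  }

  public static void main(String[] args) {
    List<Msg> batch = List.of(
        new Msg(0, 0), new Msg(1, 0), new Msg(0, 1),
        new Msg(2, 0), new Msg(1, 1), new Msg(0, 2));
    System.out.println(process(batch));
  }
}
```

If the same batch were pushed into a shared multi-threaded pool instead, offsets 0, 1 and 2 of partition 0 could finish in any order, which is the loss of ordering described above.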

Joey
  • Let me check my understanding: if I have 3 consumers (same group) and 3 partitions, each partition will be assigned to one consumer. Each consumer can read and process only one message at a time, so with 3 consumers only 3 messages are processed at a time, right? Also, I didn't get the second part of your answer, "sequence processing in one partition can't be assured": how does this affect the consumer, since it just reads messages without knowing which partition they came from? – shiv455 Feb 24 '16 at 13:01
  • (1) Right. (2) A partition is assigned to only one consumer, so messages in the same partition can be processed sequentially. – Joey Feb 25 '16 at 01:24
  • When you say "so messages in same partition can be processed sequentially", doesn't that mean we cannot achieve high throughput if each consumer processes only one message at a time? For example, the producer publishes 40 msgs per sec but a consumer processes only 1 msg per sec. Even with multiple consumers, say 10, that is only 10 messages per sec? – shiv455 Feb 25 '16 at 08:50
  • You can create 10 partitions for the topic and then start 10 consumers to consume 10 messages per second. – Joey Feb 27 '16 at 02:37
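
The arithmetic in these comments can be written down as a small sketch. It assumes every consumer handles one message at a time at a fixed rate and that there is one partition per consumer; `partitionsNeeded` is a name made up for this illustration.

```java
class PartitionSizing {

  // Smallest number of partitions (one consumer each) whose combined
  // processing rate keeps up with the producers.
  static int partitionsNeeded(double producedPerSec, double processedPerSecPerConsumer) {
    if (processedPerSecPerConsumer <= 0) {
      throw new IllegalArgumentException("per-consumer rate must be positive");
    }
    return (int) Math.ceil(producedPerSec / processedPerSecPerConsumer);
  }

  public static void main(String[] args) {
    // 40 msgs/s produced, 1 msg/s per consumer: 40 partitions are needed.
    System.out.println(partitionsNeeded(40, 1)); // prints 40
    // With only 10 consumers the backlog grows by 40 - 10 = 30 msgs every second.
    System.out.println(40 - 10 * 1);             // prints 30
  }
}
```

So with the numbers from the comment above, 10 partitions and 10 consumers reach 10 messages per second, and keeping up with 40 messages per second takes 40 of each or faster processing per message.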
0

Okay, there are a lot of questions here.

> Would like to know if it is possible to read and process multiple messages by a Kafka consumer (Java client) parallely

The Kafka client for Java only supports serial processing within a single consumer. You can parallelize consumption up to the number of partitions your topic has by creating several threads with one consumer each. Threads are tricky, though, so I would suggest using a library for that, e.g. rapids-kafka-client.

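import static org.apache.kafka.clients.consumer.ConsumerConfig.GROUP_ID_CONFIG;
import static org.apache.kafka.clients.consumer.ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG;
import static org.apache.kafka.clients.consumer.ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG;
import org.apache.kafka.common.serialization.StringDeserializer;
// ConsumerConfig below is the builder class from rapids-kafka-client, not Kafka's own.

public static void main(String[] args){
  ConsumerConfig.<String, String>builder()
      .prop(KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName())
      .prop(VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName())
      .prop(GROUP_ID_CONFIG, "stocks")
      .topics("stock_changed")
      .consumers(7) // number of consumers to run in parallel
      .callback((ctx, record) -> {
        // called for each consumed record
        System.out.printf("status=consumed, value=%s%n", record.value());
      })
      .build()
      .consume()
      .waitFor();
}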
  1. Is it a good approach to do so?

Yes, no problem with that; partitions were created to make parallelism possible.

  2. And also as per my understanding Kafka treats even each thread as a consumer. Please correct me if I'm wrong.

In the vanilla Kafka client library, a consumer is a class that fetches messages from one or more partitions of a topic. The client doesn't support multiple threads by itself (a KafkaConsumer instance is not thread-safe), so you either create the threads yourself or use a library (e.g. rapids-kafka-client) to run one consumer per thread and consume the topic's partitions in parallel.

  3. And also would like to make the Java client a daemon service running on Linux so that it runs continuously, polls Kafka for messages, and reads and processes them. Is this a good approach?

Yes: use the library, write your code, build a jar, run it, and let it keep processing data.
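
For a long-running service it also helps to stop the poll loop cleanly when Linux sends SIGTERM. A minimal sketch of that pattern follows, assuming a `BlockingQueue` as a hypothetical stand-in for the Kafka consumer and a made-up `PollingService` class; with the real client the shutdown hook would typically call `consumer.wakeup()` to break out of a blocking poll.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class PollingService implements Runnable {
  private final BlockingQueue<String> source; // hypothetical stand-in for the consumer
  private final AtomicBoolean running = new AtomicBoolean(true);
  private final AtomicInteger processed = new AtomicInteger();

  PollingService(BlockingQueue<String> source) {
    this.source = source;
  }

  @Override
  public void run() {
    try {
      while (running.get()) {
        // Bounded wait so the stop flag is re-checked regularly.
        String msg = source.poll(100, TimeUnit.MILLISECONDS);
        if (msg != null) {
          processed.incrementAndGet(); // real processing would go here
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  int processed() {
    return processed.get();
  }

  // Asks the loop to finish and waits for the worker thread; true if it stopped in time.
  boolean stopAndJoin(Thread worker, long timeoutMs) {
    running.set(false);
    try {
      worker.join(timeoutMs);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return !worker.isAlive();
  }

  public static void main(String[] args) throws InterruptedException {
    BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    PollingService service = new PollingService(queue);
    Thread worker = new Thread(service, "poller");
    worker.start();

    // Runs on SIGTERM (e.g. a service manager stopping the process) and on normal exit.
    Runtime.getRuntime().addShutdownHook(new Thread(() -> service.stopAndJoin(worker, 5000)));

    queue.put("hello");
    queue.put("world");

    // Demo only: stop after a short while. A real daemon would simply
    // call worker.join() here and rely on the shutdown hook.
    Thread.sleep(300);
    service.stopAndJoin(worker, 5000);
    System.out.println("processed=" + service.processed());
  }
}
```

Packaged as a jar, a process like this can be started by whatever service manager the machine uses, and the hook gives in-flight work a chance to finish before the JVM exits.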

deFreitas