You're right that the documentation specifies that Kafka consumers are not thread-safe. However, it also says that you should run consumers on separate threads,
not processes. That's quite different. See here for an answer with more specifics, geared towards Java/JVM:
https://stackoverflow.com/a/15795159/236528
In general, you can have as many consumers as you want on a Kafka topic. Some of these might share a group id, in which case, all the partitions for that topic will be distributed across all the consumers active at any point in time.
There's much more detail on the Javadoc for the Kafka Consumer, linked at the bottom of this answer, but I copied the two thread/consumer models suggested by the documentation below.
1. One Consumer Per Thread
A simple option is to give each thread its own consumer instance. Here
are the pros and cons of this approach:
PRO: It is the easiest to implement
PRO: It is often the fastest as no inter-thread co-ordination is needed
PRO: It makes in-order processing on a per-partition basis very easy to implement (each thread just processes messages in the order it receives them).
CON: More consumers means more TCP connections to the cluster (one per thread). In general Kafka handles connections very efficiently so this is generally a small cost.
CON: Multiple consumers means more requests being sent to the server and slightly less batching of data which can cause some drop in I/O throughput.
CON: The number of total threads across all processes will be limited by the total number of partitions.
2. Decouple Consumption and Processing
Another alternative is to have one or more consumer threads that do
all data consumption and hands off ConsumerRecords instances to a
blocking queue consumed by a pool of processor threads that actually
handle the record processing. This option likewise has pros and cons:
PRO: This option allows independently scaling the number of consumers
and processors. This makes it possible to have a single consumer that
feeds many processor threads, avoiding any limitation on partitions.
CON: Guaranteeing order across the processors requires particular care
as the threads will execute independently an earlier chunk of data may
actually be processed after a later chunk of data just due to the luck
of thread execution timing. For processing that has no ordering
requirements this is not a problem.
CON: Manually committing the
position becomes harder as it requires that all threads co-ordinate to
ensure that processing is complete for that partition. There are many
possible variations on this approach. For example each processor
thread can have its own queue, and the consumer threads can hash into
these queues using the TopicPartition to ensure in-order consumption
and simplify commit.
In my experience, option #1 is the best for starting out, and you can upgrade to option #2 only if you really need it. Option #2 is the only way to extract the maximum performance from the kafka consumer, but its implementation is more complex. So, give option #1 a try first, and see if it's good enough for your specific use case.
The full Javadoc is available at this link:
https://kafka.apache.org/23/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html