
I would like to write stream data from Spark to Kafka. I know that I can use KafkaUtils to read from Kafka, but KafkaUtils doesn't provide an API to write to Kafka.

I checked a past question and sample code.

Is the above sample code the simplest way to write to Kafka? If I adopt an approach like that sample, I must create many classes...

Do you know a simpler way, or a library that helps with writing to Kafka?

  • As a supplementary note: I am writing my Spark application in Scala, so I would like to know a simple approach that can be used from Scala. – tamagohan2 Jul 10 '16 at 11:39

1 Answer


Have a look here:

Basically, that blog post summarises your options, which appear in different variations in the link you provided.

If we look at your task head-on, we can make several assumptions:

  • Your output data is divided into several partitions, which may (and quite often will) reside on different machines
  • You want to send the messages to Kafka using the standard Kafka Producer API
  • You don't want to pass data between machines before the actual sending to Kafka

Given those assumptions, your set of solutions is pretty limited: you either have to create a new Kafka producer for each partition and use it to send all the records of that partition, or you can wrap this logic in some sort of factory / sink. Either way, the essential operation remains the same: you'll still request a producer object for each partition and use it to send that partition's records (see the sketch below).
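For illustration, here is a minimal sketch of the first option in Scala, assuming a DStream[String] with String-serialized messages; the helper name writeToKafka and the topic / brokers parameters are placeholders for this example, not an existing API:

    import java.util.Properties
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}
    import org.apache.spark.streaming.dstream.DStream

    // Hypothetical helper: writes every record of the stream to a Kafka topic.
    def writeToKafka(stream: DStream[String], topic: String, brokers: String): Unit = {
      stream.foreachRDD { rdd =>
        rdd.foreachPartition { records =>
          // Build the producer inside foreachPartition, so it is created on the
          // executor and nothing non-serializable crosses the driver boundary.
          val props = new Properties()
          props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, brokers)
          props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
            "org.apache.kafka.common.serialization.StringSerializer")
          props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
            "org.apache.kafka.common.serialization.StringSerializer")
          val producer = new KafkaProducer[String, String](props)

          // One producer sends all records of this partition, then shuts down.
          records.foreach { record =>
            producer.send(new ProducerRecord[String, String](topic, record))
          }
          producer.close()
        }
      }
    }

Note that creating a producer per partition per batch carries some overhead; the factory / sink variant mentioned above typically hides this behind a lazily initialized per-JVM producer, so each executor reuses one connection across batches.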

I suggest you continue with one of the examples in the provided link; the code is pretty short, and any library you find would most probably do the exact same thing behind the scenes.
