4

I'm trying to send multiple .mp4 files as Kafka stream messages.
I tried to follow the same approach as for text messages, but it did not work.

Does this mean I need a special Encoder/Decoder/Serializer/Deserializer when producing and consuming? How do I configure my producer and consumer for that?

Viacheslav Shalamov
Partha
  • Can you show your producer and consumer code and configuration? – Viacheslav Shalamov Mar 18 '19 at 16:01
  • Please add more details to your question to make sure you get the most relevant answers and to keep this question useful for the SO community. – Viacheslav Shalamov Mar 18 '19 at 16:22
  • Streaming large files to Kafka (and videos are typically fairly large) isn't very common. The default maximum record size in Apache Kafka is about 1 MB; if you want to send larger records, you'll need to raise `message.max.bytes` on the broker (or `max.message.bytes` on the topic). Keep in mind that sending larger records will cause longer GC pauses. Are you sure you mean to send video to Kafka? What about putting the file on shared storage like S3 and passing along metadata that references the asset? – Chris Matta Mar 18 '19 at 17:46
  • Well, it is not recommended to stream such big files (i.e. video or large audio files) over Kafka. Instead, you should consider storing the files in some other location and passing a reference through the Kafka queue, as Chris also mentioned. – Nishu Tayal Mar 18 '19 at 19:30

2 Answers

1

create file similar to text file creation

By this, I assume that you are following an example of setting up a producer and a consumer that send text/JSON messages over Kafka.

In your case, you need to serialise your video file/piece/chunk to bytes, send the raw bytes to Kafka, read them in the consumer, and deserialise them back into your video file/piece/chunk.
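If you go with pieces/chunks rather than one record per file, the splitting and reassembly could look roughly like the sketch below. This is only an illustration: `VideoChunker` and the `CHUNK_SIZE` value are hypothetical names and numbers I picked, not anything Kafka provides, and the sketch does not show how you would tag chunks with a file id and sequence number.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splits a file's bytes into chunks and joins them back.
final class VideoChunker {
    // Assumed chunk size, chosen to stay well below Kafka's default ~1 MB record limit.
    static final int CHUNK_SIZE = 512 * 1024;

    /** Splits data into consecutive chunks of at most chunkSize bytes each. */
    static List<byte[]> split(byte[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        int offset = 0;
        while (offset < data.length) {
            int length = Math.min(chunkSize, data.length - offset);
            chunks.add(Arrays.copyOfRange(data, offset, offset + length));
            offset += length;
        }
        return chunks;
    }

    /** Concatenates the chunks, in list order, back into a single byte array. */
    static byte[] join(List<byte[]> chunks) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] chunk : chunks) {
            out.write(chunk, 0, chunk.length);
        }
        return out.toByteArray();
    }
}
```

Each chunk would then be sent as the value of one record. Kafka only preserves order within a partition, so sending all chunks of one file with the same key (for example the file name) is one way to keep them in order on the consumer side.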

To send raw bytes through kafka, you need to use ByteArraySerializer in producer and ByteArrayDeserializer in consumer.

See:
https://kafka.apache.org/20/javadoc/index.html?org/apache/kafka/common/serialization/ByteArrayDeserializer.html
https://kafka.apache.org/20/javadoc/org/apache/kafka/common/serialization/ByteArraySerializer.html

So, in your configuration you need to specify the following properties (assuming that you don't use keys, only values):

producer:

"key.serializer":"org.apache.kafka.common.serialization.StringSerializer"
"value.serializer":"org.apache.kafka.common.serialization.ByteArraySerializer"

consumer:

"key.deserializer":"org.apache.kafka.common.serialization.StringDeserializer"
"value.deserializer":"org.apache.kafka.common.serialization.ByteArrayDeserializer"
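In Java, those settings could be put together roughly as follows. This is a sketch under assumptions: the class name `VideoKafkaConfig`, the broker address `localhost:9092`, and the group id you pass in are placeholders of mine; only the four serializer/deserializer properties come from the configuration above.

```java
import java.util.Properties;

// Hypothetical helper that builds the producer/consumer settings listed above.
final class VideoKafkaConfig {
    // Assumption: a broker reachable at this address; replace it with your own.
    static final String BOOTSTRAP_SERVERS = "localhost:9092";

    /** Producer settings: String keys, raw byte[] values. */
    static Properties producerProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", BOOTSTRAP_SERVERS);
        props.setProperty("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.setProperty("value.serializer",
                "org.apache.kafka.common.serialization.ByteArraySerializer");
        return props;
    }

    /** Consumer settings: String keys, raw byte[] values. */
    static Properties consumerProps(String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", BOOTSTRAP_SERVERS);
        props.setProperty("group.id", groupId);
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        return props;
    }
}
```

With the kafka-clients library on the classpath, these objects can be passed to `new KafkaProducer<String, byte[]>(props)` and `new KafkaConsumer<String, byte[]>(props)`.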

If you just want to send an mp4 file, read it as bytes like this (in Java; see "File to byte[] in Java"):

byte[] array = Files.readAllBytes(new File("/path/to/file").toPath());

On the other side, in the consumer, you receive that byte array and save it to a file.
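A minimal sketch of both file-side steps, assuming the whole file fits in memory and in a single record (the class and method names here are hypothetical, chosen only for illustration):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for the file side of the pipeline (no Kafka calls here).
final class VideoFileIo {
    /** Producer side: reads the whole file into memory as the record value. */
    static byte[] readVideo(Path source) throws IOException {
        return Files.readAllBytes(source);
    }

    /** Consumer side: writes the received bytes to target, replacing any existing file. */
    static void writeVideo(Path target, byte[] payload) throws IOException {
        Files.write(target, payload);
    }
}
```

In the consumer loop, the `payload` argument would be the `byte[]` returned by `record.value()` on each `ConsumerRecord<String, byte[]>`.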

Viacheslav Shalamov
  • String Serializer/Deserializer is fine for a text file, but what do we do with an mp4 file? Do we need codec code, etc.? – Partha Mar 18 '19 at 16:29
  • I just told you above. MP4 file -> byte array -> kafka -> byte array -> MP4 file – Viacheslav Shalamov Mar 18 '19 at 16:31
  • What language do you use? Seriously, please edit your question and provide more information about what you are doing and how, what you have researched, and what you have already tried. – Viacheslav Shalamov Mar 18 '19 at 16:31
1

I have an example of doing it using JS.

You can visit: https://gitlab.com/dimimpov/streaming-video-apache-kafka

Basically, you read a file into a byte array, send it through Kafka, and then stream the byte array to a video player that can play chunks.