Questions tagged [apache-kafka-streams]

Related to Apache Kafka's built-in stream processing engine called Kafka Streams, which is a Java library for building distributed stream processing apps using Apache Kafka.

Kafka Streams is a Java library for building fault-tolerant distributed stream processing applications using streams of data records from topics in Apache Kafka.

Kafka Streams is a library for building streaming applications, specifically applications that transform input Kafka topics into output Kafka topics (or calls to external services, or updates to databases, or whatever). It lets you do this with concise code in a way that is distributed and fault-tolerant.

Documentation: https://kafka.apache.org/documentation/streams/
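To make the description above concrete, here is a minimal sketch of a Streams application that transforms one input topic into one output topic. Topic names, the broker address, and the application id are placeholders, and the post-1.0 StreamsBuilder API is assumed.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class UppercasePipe {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-pipe");        // placeholder application id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");     // placeholder broker address
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        builder.<String, String>stream("input-topic")                            // placeholder topic names
               .mapValues(v -> v.toUpperCase())                                  // per-record transformation
               .to("output-topic");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```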

3924 questions
171 votes · 3 answers

Kafka: Consumer API vs Streams API

I recently started learning Kafka and ended up with these questions. What is the difference between Consumer and Streams? To me, any tool/application that consumes messages from Kafka is a consumer in the Kafka world. How is Streams different, as this…
sabtharishi
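A rough illustration of the distinction asked about above, assuming hypothetical topic names and externally supplied Properties: with the Consumer API you own the poll loop, offsets, and any downstream producer; with the Streams API the same read-transform-write pipeline is declared as a topology and the library manages threading, offsets, and fault tolerance.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;

public class ConsumerVsStreams {

    // Consumer API: you write and manage the poll loop yourself.
    static void consumerStyle(Properties consumerProps) {
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
            consumer.subscribe(Collections.singletonList("input-topic"));        // placeholder topic
            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(100))) {
                    System.out.println(record.value().toUpperCase());            // your own processing code
                }
            }
        }
    }

    // Streams API: the equivalent pipeline declared as a topology.
    static void streamsStyle(Properties streamsProps) {
        StreamsBuilder builder = new StreamsBuilder();
        builder.<String, String>stream("input-topic")                            // placeholder topics
               .mapValues(String::toUpperCase)
               .to("output-topic");
        new KafkaStreams(builder.build(), streamsProps).start();
    }
}
```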
49 votes · 5 answers

Akka Stream Kafka vs Kafka Streams

I am currently working with Akka Stream Kafka to interact with Kafka and I was wondering what the differences with Kafka Streams are. I know that the Akka-based approach implements the reactive specifications and handles back-pressure,…
48 votes · 6 answers

Handling bad messages using Kafka's Streams API

I have a basic stream processing flow which looks like master topic -> my processing in a mapper/filter -> output topics and I am wondering about the best way to handle "bad messages". This could potentially be things like messages that I can't…
bm1729
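One common way to deal with such "bad messages" is to catch the failure inside the operator and drop (or reroute) the record, so a single malformed message does not stop the application. A minimal sketch, assuming String-keyed records whose values should parse as longs; topic names are placeholders, and the dead-letter option is only hinted at in a comment.

```java
import java.util.Collections;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.ValueMapper;

public class SkipBadMessages {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> master =
                builder.stream("master-topic", Consumed.with(Serdes.String(), Serdes.String()));

        // Catch the parse failure inside the operator and emit zero records for bad input,
        // so one malformed message does not kill the StreamThread.
        ValueMapper<String, Iterable<Long>> safeParse = value -> {
            try {
                return Collections.singletonList(Long.parseLong(value));
            } catch (NumberFormatException e) {
                // Alternatively, send the raw value to a dead-letter topic here.
                return Collections.emptyList();
            }
        };

        KStream<String, Long> parsed = master.flatMapValues(safeParse);
        parsed.to("valid-longs-topic", Produced.with(Serdes.String(), Serdes.Long()));
    }
}
```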
43 votes · 3 answers

How to send final kafka-streams aggregation result of a time windowed KTable?

What I'd like to do is this: consume records from a numbers topic (Longs), aggregate (count) the values for each 5-second window, and send the FINAL aggregation result to another topic. My code looks like this: KStream longs =…
odavid
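For the windowed-final-result question above, newer Kafka Streams versions (2.1+) offer suppress(), which holds back intermediate updates until the window closes. A hedged sketch with placeholder topic names and a short grace period:

```java
import java.time.Duration;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.Suppressed;
import org.apache.kafka.streams.kstream.Suppressed.BufferConfig;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.apache.kafka.streams.kstream.WindowedSerdes;

public class FinalWindowCounts {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        KStream<String, Long> longs =
                builder.stream("numbers", Consumed.with(Serdes.String(), Serdes.Long()));  // placeholder topic

        longs.groupByKey(Grouped.with(Serdes.String(), Serdes.Long()))
             .windowedBy(TimeWindows.of(Duration.ofSeconds(5)).grace(Duration.ofSeconds(1)))
             .count()
             // Emit one result per key and window only after the window has closed.
             .suppress(Suppressed.untilWindowCloses(BufferConfig.unbounded()))
             .toStream()
             .to("counts",
                 Produced.with(WindowedSerdes.timeWindowedSerdeFrom(String.class), Serdes.Long()));
    }
}
```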
38 votes · 1 answer

Kafka Streams API: KStream to KTable

I have a Kafka topic where I send location events (key=user_id, value=user_location). I am able to read and process it as a KStream: KStreamBuilder builder = new KStreamBuilder(); KStream locations = builder …
Guido
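A sketch of one way to obtain a KTable for the question above: group the stream by key and reduce to the newest value, so the table always holds the latest location per user_id. Topic and store names are placeholders, and the newer StreamsBuilder API is assumed rather than the KStreamBuilder shown in the excerpt.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;

public class LatestLocationTable {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // If you can read the topic directly, builder.table("user-locations", ...) is the simplest route.
        KStream<String, String> locations =
                builder.stream("user-locations", Consumed.with(Serdes.String(), Serdes.String()));

        // Turn the KStream into a KTable by keeping the newest value per key.
        KTable<String, String> latest = locations
                .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
                .reduce((oldValue, newValue) -> newValue,
                        Materialized.as("latest-location-store"));   // placeholder store name
    }
}
```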
37 votes · 8 answers

UnsatisfiedLinkError: /tmp/snappy-1.1.4-libsnappyjava.so Error loading shared library ld-linux-x86-64.so.2: No such file or directory

I am trying to run a Kafka Streams application in kubernetes. When I launch the pod I get the following exception: Exception in thread "streams-pipe-e19c2d9a-d403-4944-8d26-0ef27ed5c057-StreamThread-1" java.lang.UnsatisfiedLinkError:…
el323
33 votes · 1 answer

Why does Apache Kafka Streams use RocksDB, and how is it possible to change it?

While investigating new features in Apache Kafka 0.9 and 0.10, we used KStreams and KTables. Interestingly, Kafka Streams uses RocksDB internally. See Introducing Kafka Streams: Stream Processing Made Simple. RocksDB is not…
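Regarding the "change it" part of the question, state stores are pluggable: passing an in-memory store supplier to Materialized swaps out RocksDB for that particular aggregation. A minimal sketch with placeholder topic and store names:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.Stores;

public class InMemoryStateExample {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // By default this count() is materialized in RocksDB; passing an in-memory
        // store supplier swaps the backing store without changing the topology.
        builder.stream("events", Consumed.with(Serdes.String(), Serdes.String()))   // placeholder topic
               .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
               .count(Materialized.<String, Long>as(Stores.inMemoryKeyValueStore("event-counts"))
                                  .withValueSerde(Serdes.Long()));
    }
}
```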
30 votes · 2 answers

Kafka Streaming Concurrency?

I have some basic Kafka Streaming code that reads records from one topic, does some processing, and outputs records to another topic. How does Kafka streaming handle concurrency? Is everything run in a single thread? I don't see this mentioned in…
clay
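As a sketch of the knob involved: parallelism in Kafka Streams comes from the input partitions (tasks), which are spread across stream threads and application instances; the thread count per instance is a configuration value. The application id and broker address below are placeholders.

```java
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class StreamsThreadingConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");        // placeholder application id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");     // placeholder broker address

        // One KafkaStreams instance runs num.stream.threads StreamThreads (default 1);
        // the topic's partitions (tasks) are distributed over all threads of all instances.
        props.put(StreamsConfig.NUM_STREAM_THREADS_CONFIG, 4);
    }
}
```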
29 votes · 4 answers

Handling exceptions in Kafka streams

I have gone through multiple posts, but most of them are about handling bad messages, not about exception handling while processing them. I want to know how to handle messages that have been received by the stream application when there is an…
Thiru
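A hedged sketch of the two hooks usually mentioned for this: a deserialization exception handler for records that cannot be read, and an uncaught exception handler for failures thrown by your own processing code. Topic names and ids are placeholders.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.errors.LogAndContinueExceptionHandler;

public class StreamsErrorHandling {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "error-handling-app");    // placeholder application id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");     // placeholder broker address
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // Corrupt records that cannot be deserialized are logged and skipped
        // instead of killing the StreamThread.
        props.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,
                  LogAndContinueExceptionHandler.class);

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("input-topic").to("output-topic");                        // placeholder topology

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        // Exceptions thrown by your own processing code still terminate the thread;
        // register a handler to at least observe and react to them.
        streams.setUncaughtExceptionHandler((thread, throwable) ->
                System.err.println("Stream thread " + thread + " died: " + throwable));
        streams.start();
    }
}
```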
28 votes · 1 answer

Difference between Apache Samza and Apache Kafka Streams (focus on parallelism and communication)

In Samza and Kafka Streams, data stream processing is performed in a sequence/graph (called "dataflow graph" in Samza and "topology" in Kafka Streams) of processing steps (called "job" in Samza and "processor" in Kafka Streams). I will refer to…
Lukas Probst
26 votes · 1 answer

Is it possible to access message headers with Kafka Streams?

With the addition of Headers to the records (ProducerRecord & ConsumerRecord) in Kafka 0.11, is it possible to get these headers when processing a topic with Kafka Streams? When calling methods like map on a KStream it provides arguments of the key…
Nathan Myles
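Headers are not exposed on map(), but since Kafka 2.0 they can be read through the ProcessorContext in the Processor API, for example via transformValues(). A minimal sketch with a hypothetical "trace-id" header and placeholder topic names:

```java
import org.apache.kafka.common.header.Headers;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.ValueTransformerWithKey;
import org.apache.kafka.streams.processor.ProcessorContext;

public class ReadHeadersExample {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        KStream<String, String> withHeaderInfo = builder
                .stream("input-topic", Consumed.with(Serdes.String(), Serdes.String()))
                .transformValues(() -> new ValueTransformerWithKey<String, String, String>() {
                    private ProcessorContext context;

                    @Override
                    public void init(ProcessorContext context) {
                        this.context = context;   // keep a handle so headers() can be read per record
                    }

                    @Override
                    public String transform(String key, String value) {
                        Headers headers = context.headers();   // headers of the record being processed
                        return headers.lastHeader("trace-id") == null ? value : value + "!";
                    }

                    @Override
                    public void close() { }
                });

        withHeaderInfo.to("output-topic");
    }
}
```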
24 votes · 1 answer

How to commit manually with Kafka Streams?

Is there a way to commit manually with Kafka Streams? Usually, when using the KafkaConsumer, I do something like below: while (true) { ConsumerRecords records = consumer.poll(100); for (ConsumerRecord record :…
Glide
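There is no direct equivalent of consumer.commitSync() in Kafka Streams: offsets are committed automatically every commit.interval.ms, and the closest manual control is ProcessorContext.commit(), which only requests an early commit. A sketch with placeholder topic names:

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.ValueTransformerWithKey;
import org.apache.kafka.streams.processor.ProcessorContext;

public class RequestCommitExample {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        KStream<String, String> processed = builder
                .stream("input-topic", Consumed.with(Serdes.String(), Serdes.String()))
                .transformValues(() -> new ValueTransformerWithKey<String, String, String>() {
                    private ProcessorContext context;

                    @Override
                    public void init(ProcessorContext context) {
                        this.context = context;
                    }

                    @Override
                    public String transform(String key, String value) {
                        // commit() only *requests* a commit; Streams performs it as soon as possible.
                        // Otherwise offsets are committed automatically every commit.interval.ms.
                        context.commit();
                        return value;
                    }

                    @Override
                    public void close() { }
                });

        processed.to("output-topic");
    }
}
```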
23 votes · 3 answers

Does the Kafka Python API support stream processing?

I have used Kafka Streams in Java. I could not find a similar API in Python. Does Apache Kafka support stream processing in Python?
22 votes · 2 answers

Difference between idempotence and exactly-once in Kafka Streams

I was going through the documentation, and what I understood is that we can achieve exactly-once transactions by enabling idempotence=true. idempotence: the idempotent producer enables exactly-once for a producer against a single topic. Basically each single message…
Sandeep
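As a configuration-level sketch of the distinction: enable.idempotence only de-duplicates a single producer's retries per partition, while processing.guarantee=exactly_once in Streams additionally wraps consuming, state updates, and producing in transactions. The application id and broker address are placeholders.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.streams.StreamsConfig;

public class ExactlyOnceConfig {
    public static void main(String[] args) {
        // Plain producer: idempotence alone only removes duplicates caused by retries
        // when writing to a single topic-partition.
        Properties producerProps = new Properties();
        producerProps.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);

        // Kafka Streams: exactly-once combines idempotent producers with transactions,
        // so consuming, state updates and producing commit atomically.
        Properties streamsProps = new Properties();
        streamsProps.put(StreamsConfig.APPLICATION_ID_CONFIG, "eos-app");          // placeholder application id
        streamsProps.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder broker address
        streamsProps.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);
    }
}
```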
22 votes · 1 answer

Difference between KTable and local store

What is the difference between these entities? As I understand it, a KTable is a simple Kafka topic with the compaction deletion policy. Also, if logging is enabled for the KTable, then there is also a changelog, and then the deletion policy is compact,delete. Local store -…
Nikita Ryanov
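A small sketch of how the two relate: the KTable is the table abstraction over the (compacted) topic, while the named Materialized store is the local state store that physically holds its current view and is what interactive queries read from. Topic and store names are placeholders.

```java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;

public class TableBackedByStore {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // The KTable is the logical view over the topic; "users-store" is the local
        // state store (plus changelog topic) that backs it.
        KTable<String, String> users = builder.table(
                "users",                                                  // placeholder compacted topic
                Consumed.with(Serdes.String(), Serdes.String()),
                Materialized.as("users-store"));                          // placeholder store name

        // After KafkaStreams has started, the backing store can be queried directly, e.g.:
        // streams.store(StoreQueryParameters.fromNameAndType("users-store",
        //         QueryableStoreTypes.keyValueStore()));
    }
}
```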