0

I've been reading a lot about Kafka's use as an event store and as a potential candidate for CQRS. Since messages in Kafka have a limited retention time, how will events be replayed after the messages have been deleted from the disk where Kafka retains them?

Logically, when these messages are stored externally from Kafka (after reading them from Kafka topics) in a database (SQL/NoSQL), that would make more sense as an event store than Kafka itself.

Given the above, and assuming my understanding is correct, what is the real use case for Kafka in CQRS when its actual intent was just to be a high-throughput messaging system?

Hary
  • 1,127
  • 4
  • 24
  • 51
  • 1
    Does this answer your question? [Using Kafka as a (CQRS) Eventstore. Good idea?](https://stackoverflow.com/questions/17708489/using-kafka-as-a-cqrs-eventstore-good-idea) – Jonas Nov 15 '19 at 19:35
  • 1
    "limited retention time" is not true. In fact you can have infinite retention and use it as a datastore: [It's Okay To Store Data In Kafka - Confluent](https://www.confluent.io/blog/okay-store-data-apache-kafka/) – Paizo Nov 15 '19 at 19:37
  • is that a typical use case for kafka to save events indefinitely and use it as true event store? – Hary Nov 15 '19 at 19:45

2 Answers

0

You can use Kafka as an event store for CQRS. You can use Kafka Streams to process all events generated by commands, store a snapshot of your entities in a changelog topic, and persist that changelog topic in one or more NoSQL databases that meet your requirements. All events can also be stored in a relational database (e.g. PostgreSQL). What's important to know is that Kafka can be used as a store (it stores files in a highly available way) or as a message queue.
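The snapshot idea above can be sketched without Kafka itself. Below is a minimal Python sketch (the event shape and field names are illustrative, not a Kafka Streams API) that folds command events into the latest full state per entity, which is what a changelog topic would hold:

```python
# Sketch: fold a stream of (entity_id, changed_fields) events into
# the latest snapshot per entity, analogous to what a Kafka Streams
# aggregation materializes into a changelog topic.
# Event shape and field names are hypothetical, for illustration only.

def build_snapshots(events):
    """Return the latest full state per entity after applying all events."""
    snapshots = {}
    for entity_id, changed_fields in events:
        state = snapshots.setdefault(entity_id, {})
        state.update(changed_fields)  # later events overwrite earlier fields
    return snapshots

events = [
    (10, {"name": "Apple", "price": 10}),
    (20, {"name": "Orange", "price": 20}),
    (10, {"price": 30}),  # update event carrying only the changed field
]

print(build_snapshots(events))
# {10: {'name': 'Apple', 'price': 30}, 20: {'name': 'Orange', 'price': 20}}
```

The snapshot dictionary plays the role of the changelog topic: a query-side database can be loaded from it instead of replaying every raw event.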

MihaiGhita
  • 137
  • 6
Is Kafka Streams then just a regular Java application that I can containerize as well? – Hary Nov 19 '19 at 20:54
Yes. Kafka Streams is a library that can be imported into your Java app. Kafka Streams is a wrapper around the Kafka Consumer and Producer classes that provides stream processing like map, filter, join and aggregation over data inside a Kafka cluster. – MihaiGhita Nov 20 '19 at 07:32
0

Retention time: you can set the retention time as long as you want, or even keep messages in the topic forever.
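For example, these real topic-level settings disable size- and time-based deletion entirely (applied here as a per-topic override; the topic name would be your own):

```properties
# Per-topic overrides: a value of -1 means "never delete".
retention.ms=-1
retention.bytes=-1
```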

Using Kafka as the data store: sure, you can. Kafka has a feature named log compaction. Consider the following scenario:

  • Insert product with ID=10, Name=Apple, Price=10
  • Insert product with ID=20, Name=Orange, Price=20
  • Update product with ID=10, Price becomes 30

When log compaction is turned on for a topic, a background job will periodically clean up messages on that topic. The job checks whether any messages share the same key and keeps only the final one. With the above scenario, the messages written to Kafka have the following format:

  • Message 1: Key=10, Name=Apple, Price=10
  • Message 2: Key=20, Name=Orange, Price=20
  • Message 3: Key=10, Name=Apple, Price=30 (every update includes all fields so the message is self-contained)

After the log compaction, the topic will become:

  • Message 1: Key=20, Name=Orange, Price=20
  • Message 2: Key=10, Name=Apple, Price=30 (only the latest record with Key=10 is kept)

In practice, this log compaction feature is what allows Kafka to serve as persistent data storage.
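The keep-only-the-latest-per-key behavior above can be sketched in a few lines of Python (this merely simulates what compaction leaves behind; it is not a Kafka API):

```python
# Simulates Kafka log compaction: for each key, only the most recent
# message survives, and survivors keep their relative log order.

def compact(log):
    latest = {}                # key -> last message seen for that key
    for key, value in log:
        latest.pop(key, None)  # drop the stale entry so the re-insert lands last
        latest[key] = value
    return list(latest.items())

log = [
    (10, {"name": "Apple", "price": 10}),
    (20, {"name": "Orange", "price": 20}),
    (10, {"name": "Apple", "price": 30}),  # self-contained update
]

print(compact(log))
# [(20, {'name': 'Orange', 'price': 20}), (10, {'name': 'Apple', 'price': 30})]
```

Note that Orange now comes first: the surviving Apple record is the later write, so it sits after Orange in the compacted log, exactly as in the compacted topic shown above.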

hqt
  • 29,632
  • 51
  • 171
  • 250