
I would like to consume an event from a Kafka topic, insert it into a database, perform some queries, remove the event from the database, and produce the results of the queries back to a topic.

This looks like what was asked in "External system queries during Kafka Stream processing".

As an example, I could perform geolocation queries against a spatial DB: first insert the coordinate from the received message, perform a lookup to compute some neighborhood, remove the coordinate from the DB, and forward a result message containing the neighborhood downstream.

I could use a Transformer to perform the queries and forward the enriched messages downstream, as suggested in the first solution in the link above. However, I have some performance concerns with Kafka Streams when doing this, as it results in one DB query per event.
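
For reference, here is a minimal sketch of that per-event approach, assuming a JDBC-accessible spatial DB; the points table, the neighborhood() SQL function, the JDBC URL, and the String-serialized coordinates are all hypothetical placeholders, not a definitive implementation:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    import org.apache.kafka.streams.KeyValue;
    import org.apache.kafka.streams.kstream.Transformer;
    import org.apache.kafka.streams.processor.ProcessorContext;

    // Insert the coordinate, look up its neighborhood, delete it again:
    // three DB round trips for every single event that flows through.
    public class GeoLookupTransformer
            implements Transformer<String, String, KeyValue<String, String>> {

        private Connection db;

        @Override
        public void init(ProcessorContext context) {
            try {
                // hypothetical spatial DB; any JDBC URL would do
                db = DriverManager.getConnection("jdbc:postgresql://localhost/spatial");
            } catch (SQLException e) {
                throw new RuntimeException(e);
            }
        }

        @Override
        public KeyValue<String, String> transform(String key, String coordinate) {
            try {
                try (PreparedStatement ins = db.prepareStatement(
                        "INSERT INTO points(id, geom) VALUES (?, ?)")) {
                    ins.setString(1, key);
                    ins.setString(2, coordinate);
                    ins.executeUpdate();
                }
                String neighborhood;
                try (PreparedStatement q = db.prepareStatement(
                        "SELECT neighborhood(geom) FROM points WHERE id = ?")) {
                    q.setString(1, key);
                    try (ResultSet rs = q.executeQuery()) {
                        rs.next();
                        neighborhood = rs.getString(1);
                    }
                }
                try (PreparedStatement del = db.prepareStatement(
                        "DELETE FROM points WHERE id = ?")) {
                    del.setString(1, key);
                    del.executeUpdate();
                }
                return KeyValue.pair(key, neighborhood);
            } catch (SQLException e) {
                // an uncaught exception here kills the stream thread,
                // which is exactly the failure mode asked about below
                throw new RuntimeException(e);
            }
        }

        @Override
        public void close() {
            try { db.close(); } catch (SQLException ignored) { }
        }
    }

It would be wired into the topology with something like stream.transform(GeoLookupTransformer::new).to("results").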

  • Could I use some stream pattern, like a windowed stream, to batch several coordinates together and perform only one query per batch? (See the second sketch below.)
  • How would Kafka Streams react in case of a DB failure?
  • Would the Kafka Streams thread get stuck?
  • Is there some timeout after which the streams application would fail?
  • Once the DB is back, would the Kafka Streams thread recover gracefully?
  • What would be the corner cases of such a design?
  • Would it be better to develop an external service that performs the DB queries, e.g. with Kafka Connect or Alpakka, reading requests from one topic and writing responses to another?
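
To make the first bullet concrete, here is a minimal sketch of the batching idea using a state store and a wall-clock punctuator instead of a window: results are forwarded from the punctuator, one batched DB round trip at a time. The 5-second interval, the "coordinate-buffer" store name, and the queryNeighborhoods helper are all assumptions:

    import java.time.Duration;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.kafka.streams.KeyValue;
    import org.apache.kafka.streams.kstream.Transformer;
    import org.apache.kafka.streams.processor.ProcessorContext;
    import org.apache.kafka.streams.processor.PunctuationType;
    import org.apache.kafka.streams.state.KeyValueIterator;
    import org.apache.kafka.streams.state.KeyValueStore;

    // Buffers incoming coordinates in a state store and flushes them to the
    // DB in one batched query from a wall-clock punctuator, instead of one
    // query per event.
    //
    // Wiring (elsewhere in the topology):
    //   builder.addStateStore(Stores.keyValueStoreBuilder(
    //       Stores.persistentKeyValueStore("coordinate-buffer"),
    //       Serdes.String(), Serdes.String()));
    //   stream.transform(BatchingGeoTransformer::new, "coordinate-buffer")
    //         .to("results");
    public class BatchingGeoTransformer
            implements Transformer<String, String, KeyValue<String, String>> {

        private ProcessorContext context;
        private KeyValueStore<String, String> buffer;

        @Override
        @SuppressWarnings("unchecked")
        public void init(ProcessorContext context) {
            this.context = context;
            this.buffer = (KeyValueStore<String, String>)
                    context.getStateStore("coordinate-buffer");
            context.schedule(Duration.ofSeconds(5),
                    PunctuationType.WALL_CLOCK_TIME, this::flush);
        }

        @Override
        public KeyValue<String, String> transform(String key, String coordinate) {
            buffer.put(key, coordinate);
            return null; // nothing forwarded here; results come out of flush()
        }

        private void flush(long timestamp) {
            List<KeyValue<String, String>> batch = new ArrayList<>();
            try (KeyValueIterator<String, String> it = buffer.all()) {
                while (it.hasNext()) {
                    batch.add(it.next());
                }
            }
            if (batch.isEmpty()) {
                return;
            }
            // single round trip for the whole batch (hypothetical helper that
            // would run e.g. one multi-row INSERT plus one spatial JOIN)
            for (KeyValue<String, String> result : queryNeighborhoods(batch)) {
                context.forward(result.key, result.value);
                buffer.delete(result.key);
            }
        }

        private List<KeyValue<String, String>> queryNeighborhoods(
                List<KeyValue<String, String>> batch) {
            throw new UnsupportedOperationException("DB-specific batch query goes here");
        }

        @Override
        public void close() { }
    }

Note that buffered records are only as durable as the (changelogged) state store, and an exception thrown from the punctuator would still kill the stream thread, so the failure-handling questions above apply to the batched flush as well.
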
  • How is "best approach" to be determined??? Minimal resource usage, code line count, least likely to fail, etc.??? – Ramón J Romero y Vigil Jun 04 '19 at 17:11
  • Can you elaborate on what role the database is playing here? Writing to and reading from the database as part of a stream process sounds like a pattern that's not always appropriate. – Robin Moffatt Jun 07 '19 at 09:08
  • Thanks for your comments. I updated my question to be more precise, with a usage example. – Aurélien Jul 16 '19 at 17:04
