I would like to consume an event from a Kafka topic, insert it into a database, run some queries, delete the event from the database, and produce the query results back to another topic.
This looks similar to what was asked in External system queries during Kafka Stream processing.
As an example, I could perform geolocation queries against a spatial DB: I would first insert the coordinates from the received message, perform a lookup to compute a neighborhood, remove the coordinates from the DB, and forward downstream a result message containing the neighborhood.
I could use a transformer to perform the queries and forward the enriched messages downstream, as suggested in the first solution in the previous link. However, I have some performance concerns with Kafka Streams when doing this, as it results in one DB query per event.
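For reference, a minimal sketch of the per-event transformer approach I have in mind is below. `SpatialDb`, `Coordinate` and `Neighborhood` are hypothetical placeholders for whatever DB client and model types the real application would use:

```java
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;

// Per-event enrichment: insert the coordinates, query, clean up, forward.
// SpatialDb, Coordinate and Neighborhood are hypothetical placeholder types.
public class GeoEnrichTransformer
        implements Transformer<String, Coordinate, KeyValue<String, Neighborhood>> {

    private final SpatialDb db; // assumed to be used only by this stream thread

    public GeoEnrichTransformer(SpatialDb db) {
        this.db = db;
    }

    @Override
    public void init(ProcessorContext context) { }

    @Override
    public KeyValue<String, Neighborhood> transform(String key, Coordinate coord) {
        db.insert(key, coord);                    // 1. insert the coordinates
        Neighborhood hood = db.neighborhood(key); // 2. spatial lookup
        db.delete(key);                           // 3. remove them again
        return KeyValue.pair(key, hood);          // 4. forward the enriched record
    }

    @Override
    public void close() { }
}
```

It would be wired into the topology with something like `stream.transform(() -> new GeoEnrichTransformer(db)).to("neighborhoods")` (topic name hypothetical). This is exactly the design that worries me: every record costs a full DB round trip.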
- Could I use some streaming pattern, such as a windowed stream, to batch several coordinates together and perform only one query per batch (see the sketch after this list)?
- How would Kafka Streams react in case of a DB failure?
- Would the Kafka Streams thread get stuck?
- Is there a timeout after which the stream application would fail?
- Once the DB is back, would the Kafka Streams thread recover gracefully?
- What would be the corner cases of such a design ?
- Would it be better to develop an external service that performs the DB queries with Kafka Connect or Alpakka, reading the requests from one topic and writing the responses to another?
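Regarding the batching question above, here is a sketch of what I imagine: a transformer that buffers incoming records and flushes them with a single multi-row query on a wall-clock punctuation. For simplicity the buffer is an in-memory list (a persistent state store would survive restarts), and `SpatialDb.neighborhoods(...)` is a hypothetical batch API, as are the types from the previous sketch:

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.processor.PunctuationType;

// Buffers coordinates and flushes them with one multi-row DB query per
// punctuation interval instead of one query per event.
public class BatchedGeoEnrichTransformer
        implements Transformer<String, Coordinate, KeyValue<String, Neighborhood>> {

    private final SpatialDb db;
    private final List<KeyValue<String, Coordinate>> buffer = new ArrayList<>();
    private ProcessorContext context;

    public BatchedGeoEnrichTransformer(SpatialDb db) {
        this.db = db;
    }

    @Override
    public void init(ProcessorContext context) {
        this.context = context;
        // Flush the buffer every 5 seconds of wall-clock time.
        context.schedule(Duration.ofSeconds(5), PunctuationType.WALL_CLOCK_TIME,
                timestamp -> flush());
    }

    @Override
    public KeyValue<String, Neighborhood> transform(String key, Coordinate coord) {
        buffer.add(KeyValue.pair(key, coord)); // just buffer the record
        return null;                           // null means "emit nothing now"
    }

    private void flush() {
        if (buffer.isEmpty()) return;
        // One round trip: insert, query and delete the whole batch
        // (db.neighborhoods is a hypothetical batch API).
        Map<String, Neighborhood> results = db.neighborhoods(buffer);
        for (KeyValue<String, Coordinate> entry : buffer) {
            context.forward(entry.key, results.get(entry.key));
        }
        buffer.clear();
    }

    @Override
    public void close() { }
}
```

Would something along these lines be considered idiomatic, or is the external-service approach with Kafka Connect or Alpakka the safer design given the failure-handling questions above?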