MSK & building aggregate tables (e.g. for analytics)

Question

I use MSK and manually build aggregate tables of my streams in my application code (e.g. TypeScript in a Node.js web service). I have a lot of data (approaching 1M events per day), and I want to productionise different real-time 'views' on the incoming stream. For example, for some sales data I might want to create these views: sales per customer (table schema: customer, sum_of_sales); sales per day (table schema: date, sum_of_sales); sales per customer per day (table schema: date, customer, sum_of_sales).

Today, to achieve this I would scaffold three tables (in an RDBMS or something like DynamoDB) and then, in my application code, insert/upsert into each table for every sales event that arrives. The scaffolding around that feels a little tedious. Is there a better way that doesn't require writing a bunch of code in my web service to pull from the consumer and upsert the data into a table?
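For illustration, the per-event upsert described above could look roughly like the sketch below. It is a minimal sketch under stated assumptions: the `SaleEvent` shape and all function names are hypothetical, and in-memory Maps stand in for the three real tables.

```typescript
// Hypothetical event shape; field names are assumptions, not a real schema.
interface SaleEvent {
  customer: string;
  date: string; // ISO day, e.g. "2022-05-05"
  amount: number;
}

// In-memory Maps stand in for the three tables (an RDBMS or DynamoDB in practice).
interface SalesViews {
  perCustomer: Map<string, number>;       // key: customer
  perDay: Map<string, number>;            // key: date
  perCustomerPerDay: Map<string, number>; // key: `${date}|${customer}`
}

function emptyViews(): SalesViews {
  return {
    perCustomer: new Map(),
    perDay: new Map(),
    perCustomerPerDay: new Map(),
  };
}

// The "upsert": add `amount` to the running sum under `key`, starting from 0.
function upsertSum(table: Map<string, number>, key: string, amount: number): void {
  table.set(key, (table.get(key) ?? 0) + amount);
}

// Fold one sales event into all three views.
function applySale(views: SalesViews, e: SaleEvent): void {
  upsertSum(views.perCustomer, e.customer, e.amount);
  upsertSum(views.perDay, e.date, e.amount);
  upsertSum(views.perCustomerPerDay, `${e.date}|${e.customer}`, e.amount);
}
```

A real consumer would call something like `applySale` once per message and would also have to cope with redelivered events, which this sketch ignores.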

All I would expect my code in my web service to do is provide APIs (e.g. REST APIs) to fetch data from these views. E.g. a client makes a REST request to get all sales in the last 7 days for customers X, Y and Z.
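As a rough sketch of the kind of lookup such an endpoint would perform, the function below filters rows of the (date, customer, sum_of_sales) view by customer and by a trailing window of days. The row type, the function name `salesInLastDays`, and the choice to pass `today` in explicitly are all assumptions made for illustration; a real service would push this filter into the database query.

```typescript
// Hypothetical row shape matching the (date, customer, sum_of_sales) view.
interface DailyCustomerSales {
  date: string; // ISO day, "YYYY-MM-DD"
  customer: string;
  sum_of_sales: number;
}

// Return the rows for the given customers that fall within the last `days`
// days, counting `today` as the final day of the window.
function salesInLastDays(
  rows: DailyCustomerSales[],
  customers: string[],
  days: number,
  today: string,
): DailyCustomerSales[] {
  const wanted = new Set(customers);
  const start = new Date(`${today}T00:00:00Z`);
  start.setUTCDate(start.getUTCDate() - (days - 1));
  const startDay = start.toISOString().slice(0, 10);
  // ISO day strings sort chronologically, so plain string comparison works.
  return rows.filter(
    (r) => wanted.has(r.customer) && r.date >= startDay && r.date <= today,
  );
}
```

For example, with `today` set to `"2022-05-09"` and `days` set to 7, the window runs from 2022-05-03 to 2022-05-09 inclusive.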

There seem to be a lot of technologies out there, but my use case is fairly trivial and, from the not-so-brief look I took, nothing seems to do this.

Thanks

If it's noteworthy, I currently keep my data indefinitely.

You could use Kafka Connect to write to the databases or ksqlDB to process data without node, but you'll still need to write APIs to expose those tables... Pinot or Druid are also better suited for such analytics and both integrate with Kafka — OneCricketeer, May 05 '22 at 18:04
my understanding is I cannot use kSQL with MSK. Or am I getting something wrong? — friartuck, May 07 '22 at 04:06
MSK is just Kafka, I can't think why it wouldn't be possible, but like I said, Pinot or Druid are more suited for the actual analytical queries — OneCricketeer, May 07 '22 at 14:55
from https://stackoverflow.com/questions/67562194/ksql-in-aws-msk `MSK doesn't offer KSQL because it's against the Confluent Licensing`. Seems like I cannot use ksql with MSK. — friartuck, May 09 '22 at 05:50
As I answered there, you need to install it separately (EC2 / EKS, for example) and connect it to MSK. Doesn't mean it cannot be used — OneCricketeer, May 09 '22 at 12:55

0 Answers