Is it possible to BULK load data from a Kafka queue directly to SQL Server?

Question

SQL Server offers BULK INSERT functionality. It reads from a file, e.g. a CSV file, and inserts the rows into a table.

As I understand it, this has clear drawbacks when working with Kafka:

You would have to transform each Kafka message to CSV.
You would then have to write the transformed message to disk so that BULK INSERT can access the file.

My question is how to overcome these drawbacks; something about this whole process looks wrong. What worries me most is the second drawback, writing to disk. Could I write the file to memory instead and then run BULK INSERT over it?

Perhaps you can send the data in JSON format? For newer versions of SQL Server anyway. https://stackoverflow.com/questions/57615642/trying-to-insert-pandas-dataframe-to-temporary-table/57616645#57616645 — Jason Cook, Oct 22 '21 at 11:24
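As a rough illustration of the JSON idea in the comment above, the sketch below builds a single parameterized statement that hands a whole in-memory batch to SQL Server's OPENJSON (available in SQL Server 2016 and later). The table dbo.events, its columns, and the helper name build_openjson_insert are hypothetical, and running the statement through pyodbc is an assumption about the driver, not something the question specifies.

```python
import json


def build_openjson_insert(messages):
    """Return (sql, params) that inserts a batch of dicts in one statement.

    dbo.events and its columns (id, payload) are hypothetical names.
    """
    sql = (
        "INSERT INTO dbo.events (id, payload) "
        "SELECT id, payload "
        "FROM OPENJSON(?) "
        "WITH (id INT '$.id', payload NVARCHAR(200) '$.payload')"
    )
    # The whole batch travels as one JSON array string, so nothing is
    # written to disk on the client side.
    return sql, (json.dumps(messages),)


# Pretend these dicts were decoded from Kafka messages.
batch = [{"id": 1, "payload": "a"}, {"id": 2, "payload": "b"}]
sql, params = build_openjson_insert(batch)

# Assuming a pyodbc cursor, the statement would be run as:
#   cursor.execute(sql, params)
```

This avoids both the CSV conversion and the temporary file, at the cost of SQL Server parsing the JSON on each call.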

OneCricketeer · Answer 1 · 2021-10-22T15:24:12.150

1

Sure, it's "possible", but ideally you wouldn't use this BULK INSERT method from a CSV.

Instead, you can use the Kafka Connect JDBC sink. It runs as a Kafka consumer, buffers records in memory rather than in a file, and then writes them with regular INSERT INTO table VALUES queries.
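To make the JDBC sink suggestion more concrete, here is a minimal sketch that builds the JSON payload for registering such a connector through the Kafka Connect REST API. The connector name, topic, database, and credentials are made-up placeholders, and the helper jdbc_sink_payload is hypothetical; the configuration keys are those of the Confluent JDBC sink connector, so check them against the version you install.

```python
import json


def jdbc_sink_payload(name, topic, jdbc_url, user, password):
    """Build the request body for creating a JDBC sink connector."""
    return {
        "name": name,
        "config": {
            "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
            "topics": topic,
            "connection.url": jdbc_url,
            "connection.user": user,
            "connection.password": password,
            # Plain INSERT statements, sent in batches from memory.
            "insert.mode": "insert",
            "batch.size": "3000",
            # Let the connector create the target table if it is missing.
            "auto.create": "true",
            "tasks.max": "1",
        },
    }


# All values below are placeholders for illustration only.
payload = jdbc_sink_payload(
    name="sqlserver-sink",
    topic="events",
    jdbc_url="jdbc:sqlserver://localhost:1433;databaseName=demo",
    user="demo_user",
    password="demo_password",
)
body = json.dumps(payload, indent=2)
print(body)

# Assuming a Connect worker on its usual REST port, the body would be
# sent with: POST http://localhost:8083/connectors
# (Content-Type: application/json)
```

No CSV file or BULK INSERT is involved here; the connector consumes the topic and issues the inserts itself.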

If you only want to query Kafka data with SQL functions, you don't need to load it into a relational database; you can use ksqlDB or PrestoDB, for example.

edited Oct 22 '21 at 15:24

answered Oct 22 '21 at 13:46

OneCricketeer


1

You wouldn't use the CSV file method (or Python, as you'd originally tagged the question)... Kafka Connect is completely open source if you want to read how it works. It runs as a Java process and works fine locally. For example, you can read this 3-part guide that uses MySQL (as a source): https://www.confluent.io/blog/simplest-useful-kafka-connect-data-pipeline-world-thereabouts-part-1/ or this post about working with primary keys in the sink connector: https://rmoff.net/2021/03/12/kafka-connect-jdbc-sink-deep-dive-working-with-primary-keys/ – OneCricketeer Oct 22 '21 at 15:22
