
I have a Windows Service application that receives a stream of data in the following format:

IDX|20120512|075659|00000002|3|AALI                 |Astra Agro Lestari Tbk.                                     |0|ORDI_PREOPEN|12  |00000001550.00|00000001291.67|00001574745000|00001574745000|00500|XDS1BXO1|                                        |00001574745000|›§
IDX|20120512|075659|00000022|3|ALMI                 |Alumindo Light Metal Industry Tbk.                          |0|ORDI        |33  |00000001300.00|00000001300.00|00000308000000|00000308000000|00500|--U3---2|                                        |00000308000000|õÄ

The data arrives in millions of rows, in sequence (00000002 ... 00198562), and I have to parse it and insert the rows into a database table in that sequence.

My question is: what is the most efficient way to insert this data into my database? I have tried a simple approach: open a SqlConnection, build a string of SQL INSERT statements, and execute it with a SqlCommand. However, this method is taking too long.

I have read that I could use SQL's BULK INSERT, but it has to read from a text file. Is it possible to use BULK INSERT in this scenario? (I have never used it before.)

Thank you

Update: I'm aware of SqlBulkCopy, but it requires me to have a DataTable first. Is this good for performance? If possible, I want to insert directly from my data source into SQL Server without building an in-memory DataTable.

Arief
  • What do you mean by "insert them according to the sequence"? Why does it matter what order you insert rows? They are stored in the order specified in the [clustered index](http://msdn.microsoft.com/en-us/library/aa933131.aspx), regardless of insertion order. – Mark Byers May 24 '12 at 05:03
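
To illustrate the point in the comment above (the table and column names here are hypothetical, not from the question): if the destination table has a clustered primary key on the sequence column, SQL Server keeps rows ordered by that key no matter what order they are inserted in.

// Hypothetical DDL: the clustered index, not insertion order, dictates physical row order.
var ddl = @"CREATE TABLE dbo.Quotes (
                SequenceNo int NOT NULL,   -- plus the remaining feed columns
                CONSTRAINT PK_Quotes PRIMARY KEY CLUSTERED (SequenceNo));";
new SqlCommand(ddl, conn).ExecuteNonQuery();   // conn: an open SqlConnection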

3 Answers


If you are writing this in C#, you might want to look at the SqlBulkCopy class:

Lets you efficiently bulk load a SQL Server table with data from another source.
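
For illustration, a minimal sketch of typical usage (SqlBulkCopy lives in System.Data.SqlClient; the destination table name and sourceReader are placeholders, not from the question):

using (var conn = new SqlConnection(connectionString))
using (var bulk = new SqlBulkCopy(conn))
{
    conn.Open();
    bulk.DestinationTableName = "dbo.Quotes";  // hypothetical destination table
    bulk.BatchSize = 10000;                    // send rows in batches instead of one huge transaction
    bulk.WriteToServer(sourceReader);          // sourceReader: any IDataReader over the incoming feed
}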

Mark Byers
  • ;) Amazing how people supposedly never look at the documentation - SqlBulkCopy sort of springs out as worth at least checking. Note: it is trivial (an hour's work) to write a converter that lets SqlBulkCopy use an enumeration of objects as its data source (a sketch of such a converter follows these comments) ;) – TomTom May 24 '12 at 04:42
  • Hi Mark, thanks for your answer. I have updated my question; could you comment on that please? Thanks. – Arief May 24 '12 at 04:53
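
Following up on TomTom's comment and the update in the question: below is a minimal sketch of such a converter, assuming each parsed row arrives as an object[] in destination-column order (the class name and shape are invented for illustration). SqlBulkCopy mostly calls Read(), FieldCount and GetValue(), so the remaining IDataReader members can safely throw until you actually need them (implement GetName/GetOrdinal if you map columns by name).

using System;
using System.Collections.Generic;
using System.Data;

sealed class EnumerableDataReader : IDataReader
{
    private readonly IEnumerator<object[]> _rows;
    private object[] _current;

    public EnumerableDataReader(IEnumerable<object[]> rows, int fieldCount)
    {
        _rows = rows.GetEnumerator();
        FieldCount = fieldCount;
    }

    public int FieldCount { get; }

    public bool Read()   // advance to the next parsed row
    {
        if (!_rows.MoveNext()) return false;
        _current = _rows.Current;
        return true;
    }

    public object GetValue(int i) => _current[i];
    public bool IsDBNull(int i) => _current[i] == null || _current[i] is DBNull;
    public int GetValues(object[] v) { int n = Math.Min(v.Length, _current.Length); Array.Copy(_current, v, n); return n; }
    public void Dispose() => _rows.Dispose();
    public void Close() => Dispose();
    public bool NextResult() => false;
    public int Depth => 0;
    public bool IsClosed => false;
    public int RecordsAffected => -1;

    // Not required for an ordinal-mapped bulk copy:
    public DataTable GetSchemaTable() => throw new NotSupportedException();
    public string GetName(int i) => throw new NotSupportedException();
    public int GetOrdinal(string name) => throw new NotSupportedException();
    public object this[int i] => GetValue(i);
    public object this[string name] => throw new NotSupportedException();
    public Type GetFieldType(int i) => typeof(object);   // good enough for bulk copy
    public string GetDataTypeName(int i) => "object";
    public bool GetBoolean(int i) => (bool)GetValue(i);
    public byte GetByte(int i) => (byte)GetValue(i);
    public char GetChar(int i) => (char)GetValue(i);
    public Guid GetGuid(int i) => (Guid)GetValue(i);
    public short GetInt16(int i) => (short)GetValue(i);
    public int GetInt32(int i) => (int)GetValue(i);
    public long GetInt64(int i) => (long)GetValue(i);
    public float GetFloat(int i) => (float)GetValue(i);
    public double GetDouble(int i) => (double)GetValue(i);
    public decimal GetDecimal(int i) => (decimal)GetValue(i);
    public string GetString(int i) => (string)GetValue(i);
    public DateTime GetDateTime(int i) => (DateTime)GetValue(i);
    public long GetBytes(int i, long fieldOffset, byte[] buffer, int bufferOffset, int length) => throw new NotSupportedException();
    public long GetChars(int i, long fieldOffset, char[] buffer, int bufferOffset, int length) => throw new NotSupportedException();
    public IDataReader GetData(int i) => throw new NotSupportedException();
}

With this in place, sbc.WriteToServer(new EnumerableDataReader(parsedRows, fieldCount)) streams rows from the parser straight into SQL Server, with no intermediate DataTable, which also addresses the update in the question.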

First, download the free LumenWorks.Framework.IO.Csv library.

Second, use code like this:

var sr = new StreamReader(yourStream);
var sbc = new SqlBulkCopy(connectionString) { DestinationTableName = "dbo.YourTable" };
// the feed has no header row and is pipe-delimited
sbc.WriteToServer(new LumenWorks.Framework.IO.Csv.CsvReader(sr, false, '|'));

Yeah, it is really that easy.

Vasyl Boroviak
  • I was looking for a library that copes with CSV fields with embedded quotation marks and newlines. This CsvReader is perfect, and easily plugs into SqlBulkCopy. I can now load millions of rows per minute using only a few lines of code. Thanks for sharing. – Martin May 07 '14 at 14:08
  • Hi, how can I insert values with a check? E.g., I want to check whether the record already exists in the db; if it exists, I want to append it to another table. – Nitin Sawant Sep 24 '14 at 12:54
  • @NitinSawant Then it won't be a *bulk* insert any more, but a regular one-by-one insert. – Vasyl Boroviak Sep 29 '14 at 08:23
  • @NitinSawant You need to change the approach: use bulk insert to push everything into table A, then perform a join against your other table to decide where each row goes; databases should be used for set-based operations, as they perform really poorly row by row (a sketch follows below) – Paul Hatcher Jun 20 '15 at 10:41
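
For @NitinSawant's scenario above, a hypothetical sketch of that staging-table pattern (StagingQuotes, Quotes, OtherQuotes, and SequenceNo are invented names): bulk copy everything into StagingQuotes first, then route the rows with set-based statements instead of checking them one by one.

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(@"
    INSERT INTO dbo.OtherQuotes             -- rows that already exist in Quotes
    SELECT s.* FROM dbo.StagingQuotes AS s
    WHERE EXISTS (SELECT 1 FROM dbo.Quotes AS q WHERE q.SequenceNo = s.SequenceNo);

    INSERT INTO dbo.Quotes                  -- genuinely new rows
    SELECT s.* FROM dbo.StagingQuotes AS s
    WHERE NOT EXISTS (SELECT 1 FROM dbo.Quotes AS q WHERE q.SequenceNo = s.SequenceNo);", conn))
{
    conn.Open();
    cmd.ExecuteNonQuery();
}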

You can use SSIS (SQL Server Integration Services) to move the data from a source data flow to a destination data flow. The source can be a text file and the destination a SQL Server table, and the package performs the transfer in bulk-insert mode.

mehdi lotfi