
I am inserting large amounts of data into a table.

For example once every 15 minutes, N records of data become available to be inserted into the table.

My question is: what should I do if inserting N records takes more than 15 minutes? That is, the next insertion cannot begin because the previous one is still in progress.

Please assume that I'm already using the best hardware I can afford, and that even dropping indexes before starting the insert does not bring it under 15 minutes.

My preference is not to drop the indexes anyway, because the table is being queried at the same time. What's the best practice in such a scenario?

P.S. I don't have any actual code; I am just thinking through a possible scenario.

B Faley
  • What tool/language are you using to insert the data? – Bohemian Aug 29 '15 at 08:20
  • Use staging tables (see the sketch after these comments): https://technet.microsoft.com/en-us/library/cc966380.aspx – Lukasz Szozda Aug 29 '15 at 08:25
  • There is a lot of missing information: how do you perform the insert? If it's row by row, there are several optimizations you can do. What is the volume (#records/storage) for each transaction? – Yosi Dahari Aug 29 '15 at 08:39
  • @Yosi Assume all the optimizations have been done and the volume for each transaction is the optimum value. Still, the amount of data that becomes available to insert in 15 minutes is more than what we can insert at the same time. – B Faley Aug 29 '15 at 08:53
  • If we have to assume you have done everything that is possible, then we can't really help you. – James Z Aug 29 '15 at 08:55
  • @Meysam - I cannot assume that until I've seen your code. – Yosi Dahari Aug 29 '15 at 08:55
  • @Bohemian The latest version of the .NET Framework (C#) and SQL Server 2014. – B Faley Aug 29 '15 at 08:55
  • @Meysam - writing "latest version" is problematic; tomorrow it might not be the latest. – Yosi Dahari Aug 29 '15 at 09:31
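
A minimal sketch of the staging-table idea from the comment above: bulk-load into an index-free staging table, then move the rows into the indexed, queried table with one set-based insert. All table names, columns, and the file path are invented for illustration:

-- Hypothetical index-free staging table for the incoming feed.
CREATE TABLE dbo.ReadingsStaging (
    SensorId INT           NOT NULL,
    ReadAt   DATETIME2     NOT NULL,
    Value    DECIMAL(18,4) NOT NULL
);

-- Fast path: bulk-load the raw file (no indexes to maintain here).
BULK INSERT dbo.ReadingsStaging
FROM 'C:\feeds\readings.dat'          -- assumed file location
WITH (FIELDTERMINATOR = ',', TABLOCK);

-- Then one set-based insert into the real (indexed) table.
INSERT INTO dbo.Readings (SensorId, ReadAt, Value)
SELECT SensorId, ReadAt, Value
FROM dbo.ReadingsStaging;

TRUNCATE TABLE dbo.ReadingsStaging;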

2 Answers


If you are receiving/loading a large quantity of data every quarter hour, you have an operational requirement, not an application requirement, so use an operational solution.

All databases have a "bulk insert" utility; SQL Server is no exception, and even calls the command BULK INSERT:

BULK INSERT mytable FROM 'my_data_file.dat'

Such utilities are built for raw speed and will outstrip any alternative application solution.

Write a shell script to receive the data into a file, formatting it as required using shell utilities, and invoke BULK INSERT.
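
A fuller invocation might look something like this; the file path, terminators, and batch size below are assumptions to adapt to the actual feed:

BULK INSERT mytable
FROM 'C:\feeds\my_data_file.dat'       -- assumed file location
WITH (
    FIELDTERMINATOR = ',',    -- column delimiter in the file (assumed)
    ROWTERMINATOR   = '\n',   -- line ending (assumed)
    BATCHSIZE       = 100000, -- each batch commits as its own transaction
    TABLOCK                   -- table-level lock, enables minimal logging
);

Committing in batches (BATCHSIZE) keeps the transaction log from ballooning on a very large file, which matters if the load routinely runs close to the 15-minute window.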

Wire the process up to crontab (or the equivalent Windows scheduler such as AT if you are running on Windows).

Bohemian

The first thing is to look for basic optimizations for inserts. You can find many posts about it:
  • What is the fastest way to insert large number of rows
  • Insert 2 million rows into SQL Server quickly
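
To illustrate the kind of basic optimization those posts describe: a single multi-row insert avoids per-row round trips and per-statement overhead. Table and column names here are invented:

-- Slow pattern: one INSERT statement (and often one round trip) per record.
-- INSERT INTO dbo.Readings (SensorId, ReadAt, Value) VALUES (1, '2015-08-29T08:00:00', 42.0);

-- Faster pattern: one statement carrying many rows.
INSERT INTO dbo.Readings (SensorId, ReadAt, Value)
VALUES (1, '2015-08-29T08:00:00', 42.0),
       (2, '2015-08-29T08:00:00', 17.5),
       (3, '2015-08-29T08:00:00', 99.9);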

The second thing is to see why it takes more than 15 minutes. Many things can explain that: locks, isolation level, etc. So try to challenge those constraints (for example, can some portion of the queries read uncommitted records?).
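
For example, if some of the readers can tolerate it, these are the two usual SQL Server knobs for keeping queries from blocking behind the insert. MyDb is a placeholder, and switching on READ_COMMITTED_SNAPSHOT requires exclusive access to the database:

-- Per-query: allow this reader to see uncommitted rows (dirty reads),
-- so it does not wait on the running insert's locks.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT COUNT(*) FROM dbo.Readings;

-- Database-wide: row versioning, so readers get the last committed
-- version of a row instead of waiting on writers' locks.
ALTER DATABASE MyDb SET READ_COMMITTED_SNAPSHOT ON;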

The third thing is finding the right batch size for the inserts: consider splitting the data into several smaller chunks with intermediate commits. Many inserts in one transaction without committing can have a bad effect on the server (in terms of the log file and locks, since it must be able to roll back the entire transaction).
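
A minimal chunking loop, assuming the incoming rows land in a staging table first; the names and the 50,000-row chunk size are illustrative, and the target must allow OUTPUT INTO (no triggers or foreign keys on it):

DECLARE @ChunkSize INT = 50000;  -- tune to your log/lock tolerance (assumed value)
DECLARE @Moved INT = 1;

WHILE @Moved > 0
BEGIN
    BEGIN TRANSACTION;

    -- Move one chunk from staging into the target table, committing
    -- between chunks so log space and locks get released each time.
    DELETE TOP (@ChunkSize) FROM dbo.ReadingsStaging
    OUTPUT deleted.SensorId, deleted.ReadAt, deleted.Value
    INTO dbo.Readings (SensorId, ReadAt, Value);

    SET @Moved = @@ROWCOUNT;     -- capture before COMMIT resets it

    COMMIT TRANSACTION;
END;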

Yosi Dahari
  • What if "taking more than 15 minutes" is simply due to the sheer amount of data after all? – B Faley Aug 29 '15 at 12:46
  • @Meysam - x records per 15 minutes is the incoming rate; the outgoing rate depends on how fast you are able to execute the inserts. – Yosi Dahari Aug 29 '15 at 13:08