1

I have a scenario where I need to divide the insertion (insert) of records to a table in that way it wouldn't be so slow, currently I have 81K rows of records to insert, the other day using the current process took about 4-5 hrs to complete. I wanted to enhance the thread process or just divide the insert into batch say by 20's (81,000/20) and process 4k of rows everytime using for loops. Which one would be best recommendation.

This is my code currently:

iProcs = 81000/20;
Thread[] threads = new Thread[iProcs]
                    for (int i = 0; i < iProcs; i++)
                {
                    //range of values to get
                    iStart = iRange * i;
                    if (i == iProcs - 1)  //for last processor use the rest of the list
                        iEnd = packageList.Count - iStart;
                    else
                        iEnd = iRange;

                    var listSubset = packageList.GetRange(iStart, iEnd).ToList();
                    Thread myThread = new Thread(
                        delegate()
                        {
                            service.PostToClient(listSubset);
                        }
                    );
                    myThread.Start();
                    threads[i] = myThread;
                }
// all threads should complete before we continue with main.
foreach (Thread thread in threads) { thread.Join(); }

The above code is pretty slow, I haven't test this one below if this is more effecient:

for (int i = 0; i <= iProcs; i++)
                {

                    iStart = i * iRange;
                    // lets add what's been processed
                    iEnd =  iRange;

                    // find out how much record is left to process
                    int cntleft = totalRec - iStart;

                    if (cntleft < iEnd)
                        iEnd =  cntleft;


                    // process data save images to db
                    var listSubset = packageList.GetRange(iStart, iEnd).ToList();
                   //   service.PostToDB(listSubset);


                }

I'm haven't done much Thread programs so I'm not really an expert or even mediocre on it.

Any response is appreciated. Thank you.

PinoyDev
  • 949
  • 1
  • 10
  • 21
  • 3
    You're creating 4050 threads. This is BAD! Note that having a lot of threads does NOT necessarily make your application faster. In fact, having way too many threads can actually cause the program to be even slower compared to a single threaded scenario, since there's an overhead of switching between threads. Many modern personal computers can handle 2-8 threads "at a time". You need to reconsider this. – Arian Motamedi Sep 09 '13 at 16:53
  • You could check here http://stackoverflow.com/questions/145304/when-to-use-thread-pool-in-c?rq=1 I'm quite certain its almost the same question exept he creates them , not , in a for loop – Don Thomas Boyle Sep 09 '13 at 16:53
  • 2
    Best recommendation:) - profile your code to know what is slow. At very least look at CPU/disk/network usage and see if you can spot bottleneck. Than optimize/solve it. Randomly changing code to use threading will likely cause more trouble than it worth, especially if you are new to it. – Alexei Levenkov Sep 09 '13 at 16:54
  • why don't you use parallel? – swcraft Sep 09 '13 at 16:57
  • 4
    Don't implement your own threading for this. This is exactly the kind of thing which `Parallel.For` is designed to do efficiently. – Cory Nelson Sep 09 '13 at 16:57
  • Thank you so much guys!, you all have been a great help, I'll take a look at parallel threads, definitely check the health monitor for different statuses. – PinoyDev Sep 09 '13 at 17:01
  • 1
    If this is a SQL Server database, then you should take a look at SQLBulkCopy(see here http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlbulkcopy.aspx). It will handle most of this stuff for you and has incredible performance. If it takes more than a few minutes to insert 80K records, then you've got a network or SQL Server problem. – RBarryYoung Sep 09 '13 at 17:05
  • yes it is SQL db, I wonder if I can still use SQLBulk Copy when im passing it via wcf web service – PinoyDev Sep 09 '13 at 17:07
  • You should keep in mind *actual number of thread can execute simultaneously is equal to the number of cores you have* rest all OS pretends to execute simultaneously but not. – Sriram Sakthivel Sep 09 '13 at 17:32
  • Where is the data coming from? If it's coming out of another database or a flat file, you're going to get the best performance from a SQL Server Integration Services (SSIS) package. Even if the data is springing fully formed out of the head of .NET, you may still want to dump it to a flat file and have SSIS do the actual inserts. – criticalfix Sep 09 '13 at 19:32
  • 1
    Inserting 81k records should *not* be a problem for any database. You must be doing something else. – Brian Rasmussen Sep 09 '13 at 20:09

1 Answers1

1
  1. Fist of all use SqlBulkCopy(for sql server) or other BulkCopy (depends on DB vendor) if possible
  2. If you need insert data asap, so perform some benchmark with multiple inserting threads on real hardware. By this data you can optimize number of inserting threads for best time.
sh1ng
  • 2,808
  • 4
  • 24
  • 38