
EDIT: I followed the steps from *Fastest Way of Inserting in Entity Framework* and got even worse results, so this is not a duplicate.

My goal is to create a seed method to fill one of the tables in LocalDB. The method will add 182,500 rows (to mimic a year of activity data for 500 devices) for further testing. I may want to run it a few more times with a different number of devices, so even more rows will be generated. That is why I need to insert the rows as efficiently as possible.

    protected void SeedReportDataTable(int numberOfTerminals)
    {
        var rand = new Random();
        var tidsList = new List<string>();

        // generate a list of random tid numbers
        for (int i = 0; i < numberOfTerminals; i++)
        {
            var randomTid = rand.Next(100000, 1000000).ToString(); // generate a 6-digit tid number
            while (tidsList.Contains(randomTid)) { randomTid = rand.Next(100000, 1000000).ToString(); } // eliminate duplicates
            tidsList.Add(randomTid);
        }

        // for each tid number, generate a year of activity history
        var recordsList = new BlockingCollection<ReportData>();
        int year = Convert.ToInt32(DateTime.Now.Year);

        Parallel.ForEach(tidsList, tid =>
        {
            // for each month
            for (int month = 1; month <= 12; month++)
            {
                // for each day
                for (int day = 1; day <= DateTime.DaysInMonth(year, month); day++)
                {
                    var record = new ReportData
                    {
                        Tid = tid,
                        Active = Convert.ToBoolean(
                            rand.Next(0, 11)), // generate a random activity state with roughly a 1-in-10 chance of being inactive
                        Date = new DateTime(year, month, day)
                    };
                    recordsList.Add(record);
                }
            }
        });
        // assign unique primary keys to the records before adding them to the database context
        var keyValue = 1;

        foreach (var record in recordsList)
        {
            record.Id = keyValue++;
        }

        // split the list into chunks
        int chunkSize = 1000;

        for (int recordsSkipped = 0; recordsSkipped < recordsList.Count; recordsSkipped += chunkSize)
        {
            // swap out the context for each chunk
            using (var db = new dbEntities())
            {
                db.Configuration.AutoDetectChangesEnabled = false;
                db.Configuration.ValidateOnSaveEnabled = false;
                // add to the database one chunk at a time
                db.ReportData.AddRange(recordsList.Skip(recordsSkipped).Take(chunkSize));
                db.SaveChanges();
            }
        }
    }

Running this code took 30 minutes to complete. Before that I ran a version ending with:

using (var db = new dbEntities())
{
    db.ReportData.AddRange(recordsList);
    db.SaveChanges();
}

and it took 15 minutes, which is still slower than I expected.

Why did my "improvements" fail?

What can I do to make it insert rows faster?

  • Possible duplicate of [Fastest Way of Inserting in Entity Framework](https://stackoverflow.com/questions/5940225/fastest-way-of-inserting-in-entity-framework) – Reyan Chougle Aug 26 '17 at 10:42
  • Are you going to use LocalDB in production or Sql Server? If you are worried about production performance then you should benchmark against Sql Server if that is what would be running in production. – Igor Aug 26 '17 at 10:43
  • That said, for seeding a DB with that many records, EF is not the most performant choice. Consider using SQL Bulk Insert instead if performance is paramount when inserting a large number of records (millions or more). Even then you should also check Sql Server for bottlenecks (i.e. index updates and statistics updates across inserts can slow down such a batch job as well). – Igor Aug 26 '17 at 10:45
  • I use localdb to have data for testing my other methods for generating pdf reports. This method is purely for testing. **It is not a duplicate because I followed tips from that topic and got even worse results.** – adriangutowski Aug 26 '17 at 10:49
  • If it is for testing then why do you care if it is very fast or not? – Igor Aug 26 '17 at 10:59
  • I already explained: I might want to use this method again with even more rows to add, and I don't want to wait 15-30 minutes each time. Also, I tried to improve it with tips from the other topic and made it twice as slow. Could you explain why? – adriangutowski Aug 26 '17 at 11:03
  • 1. You create your dbContext 1850 times, so it will be slow. 2. Try to increase your chunk size to 10000 or even more, if that's not a problem. – sTrenat Aug 26 '17 at 11:05
  • Try something like this: `using (var db = new dbEntities()) { db.Configuration.AutoDetectChangesEnabled = false; db.Configuration.ValidateOnSaveEnabled = false; /* add to the database chunk by chunk */ for (int recordsSkipped = 0; recordsSkipped < recordsList.Count; recordsSkipped += chunkSize) { db.ReportData.AddRange(recordsList.Skip(recordsSkipped).Take(chunkSize)); db.SaveChanges(); } }` – sTrenat Aug 26 '17 at 11:08
  • `It is not a duplicate because I followed tips from that topic and got even worse results` Can you show us the version of the code you tested using `SqlBulkCopy` (and share with us how long it took to execute)? – mjwills Aug 26 '17 at 11:19
  • @mjwills The question is not what is the fastest way to insert records into a SQL Server database from .NET code. It's still a duplicate of the linked question, but `SqlBulkCopy` has nothing in common with the (database-agnostic) Entity Framework solution. Which unfortunately doesn't exist :( – Ivan Stoev Aug 26 '17 at 12:25
  • Is that correct @gutowskiap ? Is it more important to you that you use Entity Framework? Or is it more important that it is fast? – mjwills Aug 26 '17 at 12:28
  • BlockingCollection has no random access as far as I know, so the repeated `recordsList.Skip(...).Take(...)` in the loop will take more time the more elements are skipped before taking. You need to use a different data structure for Skip/Take or use a different chunking approach. – grek40 Aug 26 '17 at 12:35
  • @mjwills It is true that I wanted to improve my existing code (using EF). SqlBulkCopy is an interesting solution, but I would have to convert my list to a DataTable. **Now I'm testing my code with a 10000 chunkSize.** – adriangutowski Aug 26 '17 at 12:35
  • `I would have to convert my list to datatable.` https://stackoverflow.com/questions/3913371/sqlbulkcopy-from-a-list – mjwills Aug 26 '17 at 12:48
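For completeness, the `SqlBulkCopy` route the comments point at bypasses EF change tracking entirely. Below is a minimal sketch under a few assumptions: the target table is `dbo.ReportData`, its columns match the `ReportData` properties from the question, and `BulkSeeder`, `ToDataTable`, and the connection string are illustrative names, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public class ReportData
{
    public int Id { get; set; }
    public string Tid { get; set; }
    public bool Active { get; set; }
    public DateTime Date { get; set; }
}

public static class BulkSeeder
{
    // Convert the in-memory records to a DataTable whose columns
    // mirror the dbo.ReportData table (assumed schema).
    public static DataTable ToDataTable(IEnumerable<ReportData> records)
    {
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Tid", typeof(string));
        table.Columns.Add("Active", typeof(bool));
        table.Columns.Add("Date", typeof(DateTime));

        foreach (var r in records)
            table.Rows.Add(r.Id, r.Tid, r.Active, r.Date);

        return table;
    }

    // Stream all rows into the table in one bulk operation,
    // instead of issuing one INSERT per entity as EF does.
    public static void BulkInsert(string connectionString, DataTable table)
    {
        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = "dbo.ReportData";
            bulk.BatchSize = 10000; // rows sent to the server per round trip
            bulk.WriteToServer(table);
        }
    }
}
```

This is the list-to-DataTable conversion the comments discuss, followed by a single bulk write; for a row count like 182,500 this approach is generally far faster than per-entity inserts, though the exact timing depends on the server.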

1 Answer


When I add my seeding method to Configuration.cs and run the `update-database` command, it takes less than 5 minutes to insert all the rows.

It works best when calling Context.AddRange() only once.

        dbContext.Configuration.AutoDetectChangesEnabled = false;
        dbContext.Configuration.ValidateOnSaveEnabled = false;
        dbContext.ReportData.AddRange(recordsList);
        dbContext.SaveChanges();
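As a sketch of how this fits together, the snippet above would sit inside the EF6 migrations `Configuration` class, whose `Seed` method runs on `update-database`. The `dbEntities` context and `ReportData` set are from the question; `BuildRecords` is a hypothetical stand-in for the record-generation code shown earlier. This is a migrations configuration fragment, not a standalone program.

```csharp
using System.Collections.Generic;
using System.Data.Entity.Migrations;

// EF6 migrations configuration; Seed runs when update-database is executed.
internal sealed class Configuration : DbMigrationsConfiguration<dbEntities>
{
    protected override void Seed(dbEntities context)
    {
        // BuildRecords is a placeholder for the generation logic from the question.
        List<ReportData> recordsList = BuildRecords();

        // Disable per-entity bookkeeping, add everything once, save once.
        context.Configuration.AutoDetectChangesEnabled = false;
        context.Configuration.ValidateOnSaveEnabled = false;
        context.ReportData.AddRange(recordsList);
        context.SaveChanges();
    }
}
```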