0

Good morning.

Today I'm facing problem with EF 6 and AddRange method. There is an WPF application which works with ~100000 records. I wrote import functionality, which imports data from .csv files and there is a problem.

It looks like this:

 private void FileImport()
    {
        //Open dialog to choose file
        OpenFileDialog ofd = new OpenFileDialog();

        string fileName = string.Empty;

        if (ofd.ShowDialog() == true)
        {
            fileName = ofd.FileName;
        }

        if (!string.IsNullOrEmpty(fileName))
        {
            //getting all lines
            var lines = File.ReadAllLines(fileName).ToList();

            //File requirements says that there cannot be empty values in first element
            if (lines.Any(line => line.Split(';')[0].Equals("null")))
            {
                MessageBox.Show("BLA BLA BLA");
            }
            else
            {

                List<List<string>> splitLines = new List<List<string>>();

                //split lines into smaller list. For every sublist in list we will do things separatly in separate threads to get it faster.
                for (int i = 0; i < lines.Count; i += 1000)
                {
                    splitLines.Add(lines.GetRange(i, Math.Min(1000, lines.Count - i)));
                }

                var taskList = new List<Task>();

                List<ut_katabcdx_file_filters> filterList = new List<ut_katabcdx_file_filters>();

                foreach (var list in splitLines)
                {
                    //define a Task
                    var t = new Task(() =>
                    {
                        foreach (var line in list)
                        {
                            var filters = line.Split(';');

                            //split line into elements array. It must have 6 elemets
                            if (filters.Count() == 6)
                            {

                                //Temporary declaration for parsing. If element that pretends to be decimals are empty we set its value to -100000.
                                decimal temp;
                                int tempi;
                                decimal? proga = filters[1].Equals("") ? -100000 : (decimal.TryParse(filters[1], out temp) ? (decimal?)temp : null);
                                decimal? progb = filters[2].Equals("") ? -100000 : (decimal.TryParse(filters[2], out temp) ? (decimal?)temp : null);
                                int? plan_sprz_rok = filters[3].Equals("") ? -100000 : (int.TryParse(filters[3], out tempi) ? (int?)tempi : null);

                                ut_katabcdx_file_filters filter = new ut_katabcdx_file_filters()
                                {
                                    indeks = filters[0],
                                    //produkty_iz = ProduktyIzChecked ? (filters[1].Equals("null") ? null : filters[1]) : string.Empty,
                                    proga = ProgaChecked ? proga : -100000,
                                    progb = ProgbChecked ? progb : -100000,
                                    plan_sprz_rok = PlanSprzRokChecked ? plan_sprz_rok : -100000,
                                    kat_tech = KatTechChecked ? (filters[4].Equals("null") ? null : filters[4]) : string.Empty,
                                    kat_handl = KatHandlChecked ? (filters[5].Equals("null") ? null : filters[5]) : string.Empty,
                                };

                                filterList.Add(filter);
                            }
                        }
                    });

                    taskList.Add(t);
                    t.Start();
                }

                //wait for all tasks to end
                Task.WaitAll(taskList.ToArray());

                using (var ctx = new ABCDXContext())
                {
                    ctx.ut_katabcdx_file_filters.AddRange(filterList);
                    ctx.SaveChanges();
                    string param_xml = GetParamXml();
                    Products = new ObservableCollection<ut_katabcdx_wytwor>(ctx.WytworFileUpdate(param_xml));
                }
            }

        }
    }

When I'm debuging code It stops at ctx.ut_katabcdx_file_filters.AddRange(filterList); and does not go further. I've checked filterList.Count and there is an about 60000 rows. I've checked database table also, but It's empty.

Is it because big amount of data or I'm doing something not properly ? I would be very greatful for any advices.

fanarek
  • 367
  • 3
  • 16
  • 2
    When you say it "does not go further", you need to wait until it is finished before you see anything in the database. Inserting 60k records is not a quick task, especially for EF. You may want to look at other tools for bulk inserting data, – DavidG May 08 '19 at 12:22
  • @DavidG Can it last even 20 minutes ? – fanarek May 08 '19 at 12:52
  • It could take hours, depends on a lot of factors here. – DavidG May 08 '19 at 12:56
  • 1
    Short answer - don't use EF for this. Longer answer - read https://stackoverflow.com/questions/13722014/insert-2-million-rows-into-sql-server-quickly (this should get it from > 20 minutes to < 10 seconds). https://stackoverflow.com/questions/5940225/fastest-way-of-inserting-in-entity-framework is also worth a read. – mjwills May 08 '19 at 13:01
  • 1
    When inserting a very large number of rows like this, it is FAR more performant to use SQL bulk copy. There are several Nuget packages that add bulk insert support to Entity Framework 6. – Bradley Uffner May 08 '19 at 13:41

1 Answers1

0

I agree with DavidG's comment about trying to insert 60k records using AddRange. It will be slow. You'll either have to simply wait for it to finish, or find an alternative.

I have found that your Context will be happier when you call SaveChanges periodically. It seems that piling up a large number of items in the change tracker is a huge performance hit.

So you could do something like this:

using (var ctx = new ABCDXContext())
{
    var count = 0;
    foreach (var filter in filterList)
    {
        ctx.ut_katabcdx_file_filters.Add(filter);
        count++;
        if (count > 100)
        {
            ctx.SaveChanges();
            count = 0;
        }
    }
    if (count > 0)
        ctx.SaveChanges();
    string param_xml = GetParamXml();
    Products = new ObservableCollection<ut_katabcdx_wytwor>(ctx.WytworFileUpdate(param_xml));
}

...just to break up the save into blocks. You can adjust count to a value that balances performance for your scenario.

This answer provides additional suggestions you might try. Comments there suggest good results.

DonBoitnott
  • 10,787
  • 6
  • 49
  • 68
  • 2
    Word of warning with this code, it will only insert batches of 100 records. If you have 199 for example, those last 99 will be lost. – DavidG May 08 '19 at 12:59
  • 1
    Probably calling Add method is not a good idea. It can cause additional change detection in context. As for me, it would be better to divide list into small chunks (like 1000 records) and call AddRange(chunkList); with SaveChanges() – svoychik May 08 '19 at 13:01
  • 1
    Manual batching can often be replaced with MoreLinq's `Batch` to avoid the kind of issue that @DavidG (rightfully) highlights. – mjwills May 08 '19 at 13:07
  • @DavidG True, forgot the additional call to save, which I've now added. – DonBoitnott May 08 '19 at 13:18
  • @svoychik Never tried that, but it's not a bad idea. – DonBoitnott May 08 '19 at 13:18
  • You should really check that `count > 0` before doing that last save. – DavidG May 08 '19 at 13:43
  • 1
    @DavidG Why? If change tracker has 0 nothing will happen. Should be zero overhead in the attempt if there's nothing to do. I went ahead and added it...no harm either way, I suppose. – DonBoitnott May 08 '19 at 15:39