0

I had a table which has more than 200,000 records for any particular month.

Getting records from a table is not a problem it is working as expected but searching through records shows very slow performance

var listEmpShiftDetails =ctx.tblTO_ShiftSchedule
                            .Where(m => m.CompanyId == companyId &&
                                        m.ShiftDate >= fromdate &&
                                        m.ShiftDate <= todate)
                            .Select(m => m).ToList();

Records fetched from database around: 200 000

var data = (from a in ctx.tblEmployee
             join b in ctx.tblTO_Entry  on a.Id equals b.EmployeeId
             where a.CompanyId == companyId && b.CompanyId == companyId &&
             (b.Entry_Date >= fromDate && b.Entry_Date <= toDate)                 
             select new { a, b }).ToList();

*ote: No database called are made in below code.all the data is fetched above

Linq Query to fetch one by one record

 foreach (var item in data) // Data consist of employee details 3k Records
   {          
       if (listEmpShiftDetails
           .Any(m => m.EmployeeId == item.a.Id && 
                     m.ShiftDate == item.b.Entry_Date))
       {
            var shiftDetails = listEmpShiftDetails
                               .Where(m => m.EmployeeId == item.a.Id && 
                                           m.ShiftDate ==item.b.Entry_Date)
                               .Select(m => m)
                               .FirstOrDefault();
                  //Other Calculations    
          }
    }

Above 2 Lines takes too much time to execute, below is output from Visual Studio. How to improve the performance?

Profiler Output

enter image description here

Community
  • 1
  • 1
Nilesh Gajare
  • 6,302
  • 3
  • 42
  • 73
  • 5
    You're executing the same query twice. Just do it once. – Enigmativity Jul 09 '18 at 10:19
  • "2 Lakh Records"??? – Enigmativity Jul 09 '18 at 10:20
  • 200,000 records – Nilesh Gajare Jul 09 '18 at 10:20
  • Do you need the first line at all - just check the result of the second returning anything. – PaulF Jul 09 '18 at 10:20
  • If at all possible you should do the filtering in sql, it is by far the most efficient way. – Esko Jul 09 '18 at 10:22
  • PaulF & Enigmativity i wil remove 1st line and once again run the profiler to check the result – Nilesh Gajare Jul 09 '18 at 10:23
  • @Enigmativity not even twice, 6k times! – Mafii Jul 09 '18 at 10:24
  • @Enigmativity after updating code performance has increased but it still take 4-5 min to execute. Is there any other solution to reduce time? – Nilesh Gajare Jul 09 '18 at 10:41
  • Where does the employee data come from - if it is from the database can you modify the query to only get the records you want. – PaulF Jul 09 '18 at 10:46
  • Employee data comes from database and that is quick, it doesent take time. Problem in searching through records using Linq – Nilesh Gajare Jul 09 '18 at 10:49
  • My point is exactly that - avoid using linq by changing the query so you only get the data you want from the database. You have 3k employee data + 200k shift data that you are manually searching - all of that could be done quickly by the database engine. – PaulF Jul 09 '18 at 10:53
  • Another possibility is using a [Linq Join](https://learn.microsoft.com/en-us/dotnet/csharp/linq/perform-inner-joins) followed by a [GroupBy & get the first of the group](https://stackoverflow.com/questions/19012986/how-to-get-first-record-in-each-group-using-linq) - that may be quicker than your current code - but it will be slower than doing the correct database query. – PaulF Jul 09 '18 at 10:57
  • 2
    @Nilesh - You should show us your full code. We should see how all the data is retrieved from the database and how you're preparing it. Then we can give you a proper answer. – Enigmativity Jul 09 '18 at 11:23
  • Perhaps a `Parallel.ForEach` can help a little. – Drag and Drop Jul 09 '18 at 11:29
  • @Enigmativity i had updated the code – Nilesh Gajare Jul 09 '18 at 12:04
  • @Nilesh - That can't be your actual code. In your `foreach` loop `item` has an `Id` property but you're showing that `data` only has two properties `a` and `b`. What's real and what's not? – Enigmativity Jul 09 '18 at 12:09
  • @Enigmativity updated foreach with actual code – Nilesh Gajare Jul 09 '18 at 12:19
  • @Nilesh - It still doesn't look like actual code that you run. The casing is off. Nonetheless, I think I've created a single query for you to try. – Enigmativity Jul 09 '18 at 12:54
  • What do you think `.Select(m => m)` does? – NetMage Jul 09 '18 at 17:15
  • Just for reference, what version of the .net framework are you using. Also, it might be helpful to try and narrow down if its any one specific part that is slow. How I would tackle this. First, i would rewrite the linq query(the 46% one), as is, as a standard `for loop` and see if your performance is the same. Then i'd try an empty `for loop`, is the performance the same? No? Now try it with half the statements inside the original loop, etc. Narrow down exactly what is slow. – Taekahn Jul 10 '18 at 02:35

3 Answers3

0

var listEmpShiftDetails = Records Fetched from Database around :- 2 Lakh Recordd;

foreach (var item in data) // Data consist of employee details 3k Records
{
     var selectedItem = listEmpShiftDetails.FirstOrDefault(m => m.EmployeeId == item.Id && 
     m.ShiftDate == item.Entry_Date);
  if (selectedItem != null)
  {
     // Other Calculations    
  }
}

No need to iterate same query several times. you just need first item if it matches otherwise null. Hope above query gives you better performance.

zacs
  • 132
  • 6
0

To start with, do it this way to avoid executing the query twice:

foreach (var item in data) // Data consist of employee details 3k Records
{
    var shiftDetails = listEmpShiftDetails.Where(m => m.EmployeeId == item.Id && m.ShiftDate == item.Entry_Date).FirstOrDefault()
    if (shiftDetails != null)
    {
        //Other Calculations    
    }
}

Next, it appears that you're doing some sort of join it would be ideal to see what makes up data so that we could further suggest a way to improve the time significantly.

It's possible that this might give you some improvement:

var query =
(
    from a in ctx.tblEmployee.Where(x => x.CompanyId == companyId)
    join b in ctx.tblTO_Entry.Where(x => x.CompanyId == companyId) on a.Id equals b.EmployeeId
    where b.Entry_Date >= fromDate
    where b.Entry_Date <= toDate
    join m in ctx.tblTO_ShiftSchedule.Where(x => x.CompanyId == companyId) on new
    {
        a.Id,
        b.Entry_Date
    } equals new
    {
        Id = m.EmployeeId,
        Entry_Date = m.ShiftDate
    } into g
    from m2 in g.Where(x => x.ShiftDate >= fromDate).Where(x => x.ShiftDate <= toDate).Take(1)
    select m2
).ToList();

foreach (var shiftDetails in query)
{
    //Other Calculations   
}
Enigmativity
  • 113,464
  • 11
  • 89
  • 172
  • Must be a copy past from op, but to far on the right to be noticed. Or It a leftover because Op do a select but didn't show the detail of it. – Drag and Drop Jul 09 '18 at 11:34
  • But if the select is not needed, It should be remove as Select will need an other instance of the where select enumerable iterator. Little perf gain. – Drag and Drop Jul 09 '18 at 11:50
0

What Comes on my head after looking your query:

  1. Extra query load, no need to check If(){}.
  2. If your "listEmpShiftDetails" is IQueryable then listEmpShiftDetails.Any() is OK. But if it is List then it hampers the persormance. Worth spending 5 mins Count Vs Any performance.
  3. Keep your query simple.

    var shiftDetails = listEmpShiftDetails.FirstOrDefault(m => m.EmployeeId == item.Id && m.ShiftDate == item.Entry_Date);
    
Bikram
  • 483
  • 6
  • 16