0

I have two very large tables, both have over 10K recodes in them. I need to use Linq to query from these two tables. See below example, You can see I need to use group join.

enter image description here

Below is my query, it works. But it takes about 6-7 minutes to finish. TableA has over 10K records. TableB has more records than tableA.

from a in tableA
join b in tableB on new {ID = a.OrderID, Name = a.Option} 
             equals new{ID = b.OrderID, Name = b.SelectedOption} into jgroup
select new{
    OrderID = a.OrderID,
    SelectedOption = a.Option,
    SelectedOptionValue = 
             jgroup.Select(g => g.SelectedOptionValue).SingleOrDefault()
}

Is there a better solution to make this faster using linq to SQL.

Any advice will be appreciated. Thanks!

Lin
  • 15,078
  • 4
  • 47
  • 49
  • Does it make a difference if you change the SelectedOptionValue selected to `jgroup.FirstOrDefault().SelectedOptionValue`? – It'sNotALie. Nov 26 '13 at 20:56
  • Do you have an appropriate index on Table B? – neilh Nov 26 '13 at 20:58
  • hi @It'sNotALie, thanks. But it does not make a differece. – Lin Nov 26 '13 at 21:01
  • 2
    Most likely the problem is not with the LINQ query itself. You should try the same query inside Sql Server and post the Execution Plan here so we can take a look and suggest index improvements – tucaz Nov 26 '13 at 21:01
  • is this linq to objects, or with a query provider? If the latter, what is the query provider? – Servy Nov 26 '13 at 21:06
  • thanks @tucaz, I know the LINQ query doesn't have problems, what I want to know is a better solution for query a large data using group join. – Lin Nov 26 '13 at 21:07
  • @Lin there is a better way. Indexing. But in order to help you with it that we need to see a query plan – tucaz Nov 26 '13 at 21:11
  • @tucaz I haven't seen my query plan yet, you gave me a good idea. Let me check it, if I can't figure it out, I'll post it, thanks. – Lin Nov 26 '13 at 21:36

1 Answers1

0

The linq statement might not be translating optimally. I think a more explicit left join syntax would work best

var query = 
  from a in tableA
  from b in tableB.Where(x => a.OrderID == x.OrderID && a.Option == x.SelectedOption).DefaultIfEmpty()
  select new
  {
    OrderID = a.OrderID,
    SelectedOption = a.Option,
    SelectedOptionValue = b.SelectedOptionValue
  };
Aducci
  • 26,101
  • 8
  • 63
  • 67
  • This isn't a left join, it's a cross join, and will perform quite a bit worse on larger data sets (in addition to simply not returning the same results). – Servy Nov 26 '13 at 21:25
  • @Servy - This gets translated to a left join in Linq-to-Sql. Take a look at this question: http://stackoverflow.com/questions/700523/linq-to-sql-left-outer-join – Aducci Nov 26 '13 at 21:28
  • If there are multiple matching entries in the second table for a given item in the first table your code has two output rows, the OP's code will have one. They are clearly not producing the same type of join. – Servy Nov 26 '13 at 21:33
  • @Servy - The OP uses `SingleOrDefault`. OP would use `FirstOrDefault` if there was a chance for multiple matches. – Aducci Nov 26 '13 at 21:40
  • I wouldn't expect the query provider to have different SQL representations for those two methods; I'd expect it to treat them identically. – Servy Nov 26 '13 at 21:41