7

I have a data call in a Linq to Entities powered data access layer that is designed to make paged calls.

In doing so, I need to select a subset of the data, say 50 rows, but also get the count of all matches to know how many total matches exist to be paged through.

Currently, I'm doing the following:

var queryResult = DatabaseContext.Table
    .Where(x => !x.IsDeleted)
    .Where(p => (
            p.PropertyOne.ToLower().Contains(query) ||
            p.PropertyTwo.ToLower().Contains(query) 
            ));

int count = queryResult.Count();

var returnData = queryResult
    .OrderBy(i => i.ID)
    .Skip(start).Take((length))
    .Select(y => new ObjectDTO
    {
        PropertyOne = y.PropertyOne,
        PropertyTwo = y.PropertyTwo
    }
    .AsEnumerable();

This results in two costly database operations. The COUNT operation for some reason actually takes longer than the SELECT operation.

Is there a way to get count and a subset in the same operation?

The logical flow to me says we do the following:

  • Look at Table
  • Find Items in Table that match a condition
  • Get a Count of all matches
  • Return a numbered subset of the matches

This seems possible in one operation, but I cannot figure out how.

Attempt One, Slower

Tried D Stanley's suggestion of casting the full result set to a List and doing count and memory in paging, but it is roughly 2x slower (6.9s avg vs 3.9s avg)

It's worth mentioning the data set is roughly 25,000 records, with over a dozen related tables that are searched in a JOIN.

Wesley
  • 5,381
  • 9
  • 42
  • 65
  • 1
    What if you enumerate the `queryResult` to a `List` and use `.Count` on the list, is it faster? – mausworks Mar 23 '15 at 18:59
  • 2
    FYI the count takes longer because it has to go through all the rows in your table where as the second one can stop as soon as it get `start` + `length` matches. – juharr Mar 23 '15 at 18:59
  • @diemaus That would pull a lot more data than is needed and likely would be slower depending on the amount of data. – juharr Mar 23 '15 at 19:02
  • @juharr as you said, it depends on how much data you are retrieving. Why not try it out? :) – mausworks Mar 23 '15 at 19:10
  • You didn't use the value of `count` in the code. I mean `int count = queryResult.Count();` – Hui Zhao Mar 23 '15 at 19:49
  • @HuiZhao Happens later, wasn't relevant here – Wesley Mar 24 '15 at 00:02
  • You can try Entity Framework Extended https://github.com/loresoft/EntityFramework.Extended This will allow you to to do both operations in a single call to the database. It's still not likely to be quick given the nature of the query. – toasties Mar 23 '15 at 19:34

2 Answers2

0

It may be possible, but it probably won't be much faster because of the criteria you're using. Since you're searching for text within a column value, you cannot use an index and thus must do a table scan. You could do a single query to get all records and do the Count and Skip/Take in linq-to-objects:

var queryResult = DatabaseContext.Table
    .Where(x => !x.IsDeleted)
    .OrderBy(i => i.ID)
    .Where(p => (
            p.PropertyOne.ToLower().Contains(query) ||
            p.PropertyTwo.ToLower().Contains(query) 
            ))
    .ToList();

int count = queryResult.Count();  // now this will be a linq-to-objects query

var returnData = queryResult
    .Skip(start).Take((length))
    .AsEnumerable();

but you'd have to try it to see if it would be any faster.

D Stanley
  • 149,601
  • 11
  • 178
  • 240
  • But, of course this will be problem if there are not few rows in the table. – Farhad Jabiyev Mar 23 '15 at 19:06
  • Possible, but it may still be faster that doing _two table scans_ on the database. – D Stanley Mar 23 '15 at 19:38
  • 2
    Once you've called `.ToList()`, you've got a list in memory. Just access the `Count` property instead of calling the `.Count()` method -- though LINQ should see the underlying property and use that anyway. – afrazier Mar 23 '15 at 20:22
  • The only downside is I'm casting to a data transfer object with a SELECT statement I didn't show. That might change things. Let me update the question. – Wesley Mar 23 '15 at 20:24
  • Then just add the `Select` to the base query - then you're only transferring the columns you need over the wire and it doesn't affect the `Count`. – D Stanley Mar 23 '15 at 20:29
  • I'll try it both ways and let you know – Wesley Mar 23 '15 at 20:35
  • 1
    Unfortunately @DStanley, casting to the list and filtering in memory is actually slower on most tables. – Wesley Mar 23 '15 at 22:48
  • @afrazier fun fact: `IEnumerable.Count()` internally looks at .Count on Lists (or `ICollection`s) so the performance-impact is very slight. However; it does not do this for `IEnumerable.Any()` which coincidentally means that on lists, `.Count()` is actually _faster_ than `.Any()`. source: http://referencesource.microsoft.com/#System.Core/System/Linq/Enumerable.cs,1194 – mausworks Mar 25 '15 at 20:38
0

How about something like this:

db.Products
   .Take(10)
   .Select(p => new 
                {
                    Total = db.Products.Count, 
                    Product = p
                })

If that's no good, it may help to know that in SQL, you get the total results and one page of them using OVER().

MGOwen
  • 6,562
  • 13
  • 58
  • 67