139

I'm trying to determine how to count the matching rows on a table using the EntityFramework.

The problem is that each row might have many megabytes of data (in a Binary field). Of course the SQL would be something like this:

SELECT COUNT(*) FROM [MyTable] WHERE [fkID] = '1';

I could load all of the rows and then find the Count with:

var owner = context.MyContainer.Where(t => t.ID == '1');
owner.MyTable.Load();
var count = owner.MyTable.Count();

But that is grossly inefficient. Is there a simpler way?


EDIT: Thanks, all. I've moved the DB from a private attached so I can run profiling; this helps but causes confusions I didn't expect.

And my real data is a bit deeper, I'll use Trucks carrying Pallets of Cases of Items -- and I don't want the Truck to leave unless there is at least one Item in it.

My attempts are shown below. The part I don't get is that CASE_2 never access the DB server (MSSQL).

var truck = context.Truck.FirstOrDefault(t => (t.ID == truckID));
if (truck == null)
    return "Invalid Truck ID: " + truckID;
var dlist = from t in ve.Truck
    where t.ID == truckID
    select t.Driver;
if (dlist.Count() == 0)
    return "No Driver for this Truck";

var plist = from t in ve.Truck where t.ID == truckID
    from r in t.Pallet select r;
if (plist.Count() == 0)
    return "No Pallets are in this Truck";
#if CASE_1
/// This works fine (using 'plist'):
var list1 = from r in plist
    from c in r.Case
    from i in c.Item
    select i;
if (list1.Count() == 0)
    return "No Items are in the Truck";
#endif

#if CASE_2
/// This never executes any SQL on the server.
var list2 = from r in truck.Pallet
        from c in r.Case
        from i in c.Item
        select i;
bool ok = (list.Count() > 0);
if (!ok)
    return "No Items are in the Truck";
#endif

#if CASE_3
/// Forced loading also works, as stated in the OP...
bool ok = false;
foreach (var pallet in truck.Pallet) {
    pallet.Case.Load();
    foreach (var kase in pallet.Case) {
        kase.Item.Load();
        var item = kase.Item.FirstOrDefault();
        if (item != null) {
            ok = true;
            break;
        }
    }
    if (ok) break;
}
if (!ok)
    return "No Items are in the Truck";
#endif

And the SQL resulting from CASE_1 is piped through sp_executesql, but:

SELECT [Project1].[C1] AS [C1]
FROM   ( SELECT cast(1 as bit) AS X ) AS [SingleRowTable1]
LEFT OUTER JOIN  (SELECT 
    [GroupBy1].[A1] AS [C1]
    FROM ( SELECT 
        COUNT(cast(1 as bit)) AS [A1]
        FROM   [dbo].[PalletTruckMap] AS [Extent1]
        INNER JOIN [dbo].[PalletCaseMap] AS [Extent2] ON [Extent1].[PalletID] = [Extent2].[PalletID]
        INNER JOIN [dbo].[Item] AS [Extent3] ON [Extent2].[CaseID] = [Extent3].[CaseID]
        WHERE [Extent1].[TruckID] = '....'
    )  AS [GroupBy1] ) AS [Project1] ON 1 = 1

[I don't really have Trucks, Drivers, Pallets, Cases or Items; as you can see from the SQL the Truck-Pallet and Pallet-Case relationships are many-to-many -- although I don't think that matters. My real objects are intangibles and harder to describe, so I changed the names.]

NVRAM
  • 6,947
  • 10
  • 41
  • 44

7 Answers7

147

Query syntax:

var count = (from o in context.MyContainer
             where o.ID == '1'
             from t in o.MyTable
             select t).Count();

Method syntax:

var count = context.MyContainer
            .Where(o => o.ID == '1')
            .SelectMany(o => o.MyTable)
            .Count()

Both generate the same SQL query.

Jerther
  • 5,558
  • 8
  • 40
  • 59
Craig Stuntz
  • 125,891
  • 12
  • 252
  • 273
  • 1
    Why the `SelectMany()`? Is it needed? Wouldn't it work proper without it? – Jo Smo Jul 04 '15 at 17:29
  • 1
    @JoSmo, no, that's a totally different query. – Craig Stuntz Jul 08 '15 at 18:28
  • Thank you for clearing that up for me. Just wanted to be sure. :) – Jo Smo Jul 08 '15 at 18:39
  • How do I find the row number of the searched result (assuming it existed)? – puretppc Mar 18 '16 at 14:00
  • 2
    Can you tell me why is it different with the SelectMany? I don't understand. I do it without SelectMany but it gets really slow because I have over 20 million records. I tried the answer from Yang Zhang and works great, just wanted to know what the SelectMany does. – mikesoft Oct 20 '16 at 17:45
  • Sorry to ress the topic, but it'll be awesome to know the difference between SelectMany or not. – Austin Felipe May 19 '17 at 23:07
  • 2
    @AustinFelipe Without the call to SelectMany, the query would return the number of rows in MyContainer with the ID equal to '1'. The SelectMany call returns all rows in MyTable that belong to the previous result of the query (meaning the result of `MyContainer.Where(o => o.ID == '1')`) – sbecker Aug 07 '17 at 09:14
63

I think you want something like

var count = context.MyTable.Count(t => t.MyContainer.ID == '1');

(edited to reflect comments)

Kevin
  • 8,353
  • 3
  • 37
  • 33
  • 1
    No, he needs the count of the entities in MyTable referenced by the one entity with ID = 1 in MyContainer – Craig Stuntz May 20 '09 at 21:47
  • 3
    Incidentally, if t.ID is a PK, then count in the code above will *always* be 1. :) – Craig Stuntz May 20 '09 at 21:50
  • 2
    @Craig, you're right, I should have used t.ForeignTable.ID. Updated. – Kevin May 20 '09 at 22:07
  • 2
    Well this is short and simple. My choice is: `var count = context.MyTable.Count(t => t.MyContainer.ID == '1');` not long and ugly: `var count = (from o in context.MyContainer where o.ID == '1' from t in o.MyTable select t).Count();` But it depends on coding style... – C.L. May 08 '13 at 12:11
  • make sure you include "using System.Linq", or this wont work – CountMurphy Mar 02 '16 at 16:13
17

As I understand it, the selected answer still loads all of the related tests. According to this msdn blog, there is a better way.

http://blogs.msdn.com/b/adonet/archive/2011/01/31/using-dbcontext-in-ef-feature-ctp5-part-6-loading-related-entities.aspx

Specifically

using (var context = new UnicornsContext())

    var princess = context.Princesses.Find(1);

    // Count how many unicorns the princess owns 
    var unicornHaul = context.Entry(princess)
                      .Collection(p => p.Unicorns)
                      .Query()
                      .Count();
}
Quickhorn
  • 1,171
  • 12
  • 23
  • 5
    There is no need to make extra `Find(1)` request. Just create the entity and attach to the context: `var princess = new PrincessEntity{ Id = 1 }; context.Princesses.Attach(princess);` – tenbits Feb 04 '16 at 16:27
15

This is my code:

IQueryable<AuctionRecord> records = db.AuctionRecord;
var count = records.Count();

Make sure the variable is defined as IQueryable then when you use Count() method, EF will execute something like

select count(*) from ...

Otherwise, if the records is defined as IEnumerable, the sql generated will query the entire table and count rows returned.

Yang Zhang
  • 4,540
  • 4
  • 37
  • 34
10

Well, even the SELECT COUNT(*) FROM Table will be fairly inefficient, especially on large tables, since SQL Server really can't do anything but do a full table scan (clustered index scan).

Sometimes, it's good enough to know an approximate number of rows from the database, and in such a case, a statement like this might suffice:

SELECT 
    SUM(used_page_count) * 8 AS SizeKB,
    SUM(row_count) AS [RowCount], 
    OBJECT_NAME(OBJECT_ID) AS TableName
FROM 
    sys.dm_db_partition_stats
WHERE 
    OBJECT_ID = OBJECT_ID('YourTableNameHere')
    AND (index_id = 0 OR index_id = 1)
GROUP BY 
    OBJECT_ID

This will inspect the dynamic management view and extract the number of rows and the table size from it, given a specific table. It does so by summing up the entries for the heap (index_id = 0) or the clustered index (index_id = 1).

It's quick, it's easy to use, but it's not guaranteed to be 100% accurate or up to date. But in many cases, this is "good enough" (and put much less burden on the server).

Maybe that would work for you, too? Of course, to use it in EF, you'd have to wrap this up in a stored proc or use a straight "Execute SQL query" call.

Marc

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • 1
    It won't be a full table scan due to the FK reference in the WHERE. Only details of the master will be scanned. The performance issue he was having was from loading blob data, not record count. Presuming there aren't typically tens of thouusands + of detail records per master record, I wouldn't "optimize" something which isn't actually slow. – Craig Stuntz May 21 '09 at 12:55
  • OK, yes, in that case, you'll only select a subset - that should be fine. As for the blob data - I was under the impression you could set a "deferred loading" on any column in any of your EF tables to avoid loading it, so that might help. – marc_s May 21 '09 at 16:35
  • Is there a way to use this SQL with the EntityFramework? Anyway, in this case I only needed to know there were matching rows, but I intentionally asked the question more generally. – NVRAM May 22 '09 at 17:38
6

Use the ExecuteStoreQuery method of the entity context. This avoids downloading the entire result set and deserializing into objects to do a simple row count.

   int count;

    using (var db = new MyDatabase()){
      string sql = "SELECT COUNT(*) FROM MyTable where FkId = {0}";

      object[] myParams = {1};
      var cntQuery = db.ExecuteStoreQuery<int>(sql, myParams);

      count = cntQuery.First<int>();
    }
goosemanjack
  • 940
  • 9
  • 7
  • 6
    If you write `int count = context.MyTable.Count(m => m.MyContainerID == '1')` then the generated SQL will resemble exactly what you're doing, but the code is much nicer. No entities are loaded into memory as such. Try it out in LINQPad if you like -- it will show you the SQL used under the covers. – Drew Noakes Aug 04 '12 at 16:27
  • 1
    In-line SQL . . not my favorite thing. – Duanne Mar 02 '17 at 21:18
3

I think this should work...

var query = from m in context.MyTable
            where m.MyContainerId == '1' // or what ever the foreign key name is...
            select m;

var count = query.Count();
bytebender
  • 7,371
  • 2
  • 31
  • 54
  • This is the direction I went at first, too, but it's my understanding that unless you've manually added it, m will have a MyContainer property but no MyContainerId. Hence, what you want to examine is m.MyContainer.ID. – Kevin May 20 '09 at 22:14
  • If MyContainer is the parent and MyTable are the children in the relationship then you had to establish that relationship with some foreign key, I am not sure how else you would know which MyTable entities where a associated with a MyContainer entity... But maybe I made an assumption about the structure... – bytebender May 21 '09 at 15:45