-2

I have a linq query which tries to fetch approximately 500K records from DB. I have a Count() which eventually timing out.

I want to know if my linq query contains 5000 or more records or not. I don't count of all records, just need to check if linq contains 5000 records.

Is there any effective way to check if there 5000 or more records in linq without calling Count()? I am using EF core 3.1.

Linq Query :

  var results = (from a in RepoContext.Employee
                          join b in RepoContext.Program on a.ProgramId equals b.ProgramId 
                          where a.ActiveFlag == true
                                && b.ClientId == 2
                          select new RAManufacturerDto
                          {

                              BusinessName = a.BusinessName,
                              ClientId = a.ClientId.Value,
                              ClientName = b.ClientName
                              DCode = b.DCode,
                              StoreId = b.StoreId,
                              ProgramId = a.ProgramId
                          });

bool isRecordsLimitReached = results.Count() > 5000;

I am getting an error when trying to do Count() on result. I just want to get if it contains more than 5000 records.

Panagiotis Kanavos
  • 120,703
  • 13
  • 188
  • 236
Sachin
  • 459
  • 1
  • 7
  • 22
  • 2
    This would be a much better question if you could provide some context. Can you show us some code? – Robert Harvey Oct 14 '21 at 17:39
  • @sachin please show us the code, without seeing it’s impossible to tell – Vivek Nuna Oct 14 '21 at 17:45
  • @RobertHarvey I edited my question. – Sachin Oct 14 '21 at 17:50
  • @viveknuna I edited my question. – Sachin Oct 14 '21 at 17:51
  • If you need only the count, don’t select the columns, just do `Count`. You can also add conditions in `Count` in place of `Where` – Vivek Nuna Oct 14 '21 at 17:51
  • @viveknuna, I need to know if that list contains more than 5000 records, That linq query is part of some shared method. – Sachin Oct 14 '21 at 17:53
  • You can also create index of `ClientId` in database, if possible. It will improve the time. And what about `ProgramId` column? – Vivek Nuna Oct 14 '21 at 17:56
  • @Sachin 50K rows is no data at all. LINQ is *NOT* SQL or a replacement for SQL, and won't even query any database without an ORM like Entity Framework. This means the question itself is meaningless without knowing which ORM you used. – Panagiotis Kanavos Oct 14 '21 at 17:56
  • So you want to call a shared method that returns data and change its behavior so it only returns the count? Have you considered asking the shared method owners to provide an overload, or an optional parameter, or a separate method? – Aaron Bertrand Oct 14 '21 at 17:59
  • @Sachin there's no reason to use JOINs in LINQ either. It's the ORM's job to generate any JOINs necessary from the relation between entities. If your `Employee` class had a `Program` property, EF would generate the proper JOIN. – Panagiotis Kanavos Oct 14 '21 at 17:59
  • @Sachin as for a timeout, 50K rows isn't a lot. Even if there were no indexes there shouldn't be a timeout. It's quite possible that the query is getting blocked for some reason. Perhaps some other call started a transaction to edit the rows that match the search criteria. so your query has to wait. You should use indexes that cover the join and filter columns ie `ProgramId` in both tables and `ClientId` – Panagiotis Kanavos Oct 14 '21 at 18:02
  • Pity you don't use EF5; getting the query it will execute is trivial. It's [bit more work](https://stackoverflow.com/questions/37527783/get-sql-code-from-an-entity-framework-core-iqueryablet) in EF31 - take a look at what it executes, post the plan, run it in SSMS and see how long it really takes etc – Caius Jard Oct 14 '21 at 18:11

3 Answers3

0

Use linq directly instead, suppose you have your dbcontext properly configured and such relation between the two defined in onmodelcreating, move the code into the RepoContext. ( if in doubt to define relations check this out: https://learn.microsoft.com/en-us/ef/core/modeling/relationships?tabs=fluent-api%2Cfluent-api-simple-key%2Csimple-key )

    DbSet<Employee> Employees {get;set;}
    DbSet<Program> Programs {get;set;}

    public bool HasMoreThan(int number) {
      var count = RepoContext.Employee.Include(x => x.Program).Count(y => 
      y.ActiveFlag == true && y.Program.ClientId == 2);
      return count >= number;
    }
T. Nielsen
  • 835
  • 5
  • 18
0

Just try to skip 5000 records and if there are records - we have reached our goal:

bool isRecordsLimitReached = results.Skip(5000).Any();
Svyatoslav Danyliv
  • 21,911
  • 3
  • 16
  • 32
0

If you only need to get the count, you don't need to select all of those properties. You can optimize this query by selecting only one of the non nullable properties.

var results = (from a in RepoContext.Employee
               join b in RepoContext.Program on a.ProgramId equals b.ProgramId 
               where a.ActiveFlag == true && 
               b.ClientId == 2
               select a.Id); //Id = Primary Key of the Employee database

bool isRecordsLimitReached = results.Count() > 5000;

If you still get a timeout with this then you may need to add an index to the foreign key on the Program table if it is not already there. An extra index for ActiveFlag and ClientId would not hurt either.

izzy255
  • 21
  • 3