EF Core version: 3.1.

Here's my method:

public static ILookup<string, int> GetClientCountLookup(DepotContext context, DateRange dateRange)
    => context
        .Flows
        .Where(e => e.TimeCreated >= dateRange.Start.Date && e.TimeCreated <= dateRange.End.Date)
        .GroupBy(e => e.Customer)
        .Select(g => new { g.Key, Count = g.Count() })
        .ToLookup(k => k.Key, e => e.Count);

All fields used are indexed.

Here's the generated query:

SELECT [f].[Customer] AS [Key], COUNT(*) AS [Count]
FROM [Flows] AS [f]
WHERE ([f].[TimeCreated] >= @__dateRange_Start_Date_0) AND ([f].[TimeCreated] <= @__dateRange_End_Date_1)
GROUP BY [f].[Customer]

When that query is executed as plain SQL, the execution time is 100ms. When the same query runs through EF Core with the ToLookup method, the execution time is 3200ms.

What's even more weird: the execution time in EF Core seems totally independent of the data sample size (depending on the date range we can be counting hundreds or hundreds of thousands of records).

WHAT THE HECK IS HAPPENING HERE?

The query I pasted is the real query EF Core sends. The code fragment I pasted first executes in 3200ms. Then I took the exact generated SQL and executed it as a SQL query in Visual Studio: it took 100ms.

It doesn't make any sense to me. I have used EF Core for a long time and it seems to perform reasonably. Most queries (plain, simple, without date ranges) are fast; results are fetched immediately (in less than 200ms).

In my application I built a really HUGE query with like 4 multi-column joins and subqueries... Guess what: it fetches 400 rows in 3200ms. It also fetches 4000 rows in 3200ms. And when I remove most of the joins and includes, even remove the subquery: still 3200ms. Or 4000ms, depending on my Internet connection or the server's momentary state and load.

It's like a constant lag, and I pinpointed it to the exact query I pasted first.

I know the ToLookup method forces the query to finally execute and fetch all its results, but in my case (real-world data) there are exactly 5 rows.
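
As I understand it (a sketch of the server/client split, not EF Core internals), everything up to Select() is translated into the SQL above and runs on the server; ToLookup() only enumerates the rows that come back and builds the lookup in memory:

var rows = context.Flows
    .Where(e => e.TimeCreated >= dateRange.Start.Date && e.TimeCreated <= dateRange.End.Date)
    .GroupBy(e => e.Customer)
    .Select(g => new { g.Key, Count = g.Count() })
    .AsEnumerable();                                      // the SQL executes here; 5 rows come back

var lookup = rows.ToLookup(r => r.Key, r => r.Count);     // purely in-memory, negligible cost

So building the lookup itself shouldn't be where the 3 seconds go.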

The results look like this:

|------------|-------|
| Key        | Count |
|------------|-------|
| Customer 1 | 500   |
| Customer 2 | 50    |
| Customer 3 | 10    |
| Customer 4 | 5     |
| Customer 5 | 1     |

Fetching 5 rows from the database takes over 3 seconds?! It's ridiculous. If the whole table were fetched and the rows then grouped and counted client-side, that would add up. But the generated query returns literally 5 rows.

What is happening here and what am I missing?

Please, DO NOT ASK ME TO PROVIDE THE FULL CODE. It is confidential, part of a project for my client, and I am not allowed to disclose my client's trade secrets, not here nor in any other question. I know it's hard to tell what happens when you don't have my database and the whole application, but the question here is purely theoretical. Either you know what's going on, or you don't. As simple as that. The question is very hard though.

I can only tell you that the RDBMS is MS SQL Express running remotely on an Ubuntu server. The times measured are the times of executing either code tests (NUnit) or queries against the remote DB, all performed from my AMD Ryzen 7 8-core 3.40GHz machine. The server lives on Azure, with roughly 2 cores of an i5 2.4GHz or something like that.

Harry
  • Might I suggest you have a read here and check that your data types are all lined up as you would expect: https://stackoverflow.com/questions/15767803/entity-framework-query-slow-but-same-sql-in-sqlquery-is-fast – Paddy May 20 '20 at 07:38
  • 1
  • The database is not fetching 4 rows, but fetching 566 rows and grouping them. My guess is that your problem is determining the rows within the desired interval, and it will be solved if you add an index for the [TimeCreated] column. – Marc Guillot May 20 '20 at 07:44
  • @MarcGuillot You're thinking exactly as I was thinking - I made sure `TimeCreated` has an index (roughly as in the sketch below these comments). I also made sure only 5 rows are actually fetched. How? I tried to fetch 5000 rows and count them explicitly, client side - that takes significantly longer. Not with my query though; it really fetches only 5 rows. And I probably solved this by accident... – Harry May 20 '20 at 08:06
  • It's not clear if you're just looking at a cold start of the application. If you can't get under 3200ms I think it's just that. – Gert Arnold May 20 '20 at 08:56
  • @GertArnold Exactly that. It's a unit test, and the first test is a cold start. I changed the benchmark code to perform a dummy query just before starting to measure time, and now it measures 200ms for the query, which is a good time considering framework overhead and all. It seems the overhead is constant, so when I run way more complex queries returning way more rows, they don't get considerably slower. – Harry May 20 '20 at 09:08
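
For reference, this is roughly how such an index is declared in EF Core (a sketch with an assumed entity type name, not the project's actual mapping):

// Sketch: a plain index on TimeCreated, declared in OnModelCreating.
// The entity type name Flow is assumed from the Flows DbSet; the real model is confidential.
protected override void OnModelCreating(ModelBuilder modelBuilder) {
    modelBuilder.Entity<Flow>()
        .HasIndex(e => e.TimeCreated);
}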

1 Answer

Here's the test:

[Test]
public void Clients() {
    var dateRange = new DateRange {
        Start = new DateTime(2020, 04, 06),
        End = new DateTime(2020, 04, 11)
    };
    var q1 = DataContext.Flows;                    // dummy query
    var q2 = DataContext.Flows                     // the grouping query under test
        .Where(e => e.TimeCreated >= dateRange.Start.Date && e.TimeCreated <= dateRange.End.Date)
        .GroupBy(e => e.Customer)
        .Select(g => new { g.Key, Count = g.Count() });
    var q3 = DataContext.Flows;                    // another dummy query
    var t0 = DateTime.Now;
    var x = q1.Any();                              // first query executed: pays the one-time initialization cost
    var t1 = DateTime.Now - t0;
    t0 = DateTime.Now;
    var l = q2.ToLookup(g => g.Key, g => g.Count); // the grouping query, now running on a warm context
    var t2 = DateTime.Now - t0;
    t0 = DateTime.Now;
    var y = q3.Any();                              // another trivial query for comparison
    var t3 = DateTime.Now - t0;
    TestContext.Out.WriteLine($"t1 = {t1}");
    TestContext.Out.WriteLine($"t2 = {t2}");
    TestContext.Out.WriteLine($"t3 = {t3}");
}

Here's the test result:

t1 = 00:00:00.6217045 // the time of dummy query
t2 = 00:00:00.1471722 // the time of grouping query
t3 = 00:00:00.0382940 // the time of another dummy query

Yep: 147ms is the grouping that previously took 3200ms. What happened? A dummy query was executed before it.

That explains why the results hardly depended on the data sample size!

The huge, unexplainable time is INITIALIZATION, not the actual query time. If not for the dummy query before it, all that time would elapse on the ToLookup line of code: that line would initialize the DbContext, create the connection to the database, and only then perform the actual query and fetch the data.

So as the final answer I can say my test methodology was wrong. I measured the time of the first query sent through my DbContext. That is wrong: the context and connection should be initialized before the times are measured, and I can do that by performing any query before the measured ones.
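
For example, a minimal sketch of how I would measure it now (Stopwatch instead of DateTime.Now; the warm-up query is arbitrary, the names are the same as in the test above):

// Warm-up: any cheap query initializes the DbContext and opens the connection,
// so the measured section below pays only for the actual query.
_ = DataContext.Flows.Any();

var sw = System.Diagnostics.Stopwatch.StartNew();
var lookup = DataContext.Flows
    .Where(e => e.TimeCreated >= dateRange.Start.Date && e.TimeCreated <= dateRange.End.Date)
    .GroupBy(e => e.Customer)
    .Select(g => new { g.Key, Count = g.Count() })
    .ToLookup(g => g.Key, g => g.Count);
sw.Stop();
TestContext.Out.WriteLine($"grouping query: {sw.ElapsedMilliseconds}ms");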

Well, another question appears: why is THE FIRST query so slow, why is the initialization so expensive? If my Blazor app used the DbContext as Transient (instantiated each time it's injected), would it pay that cost on every injection? I don't think so, because that's how my application worked before a major redesign and it didn't have noticeable lags (I would notice a 3-second lag when changing between pages). But I'm not sure. Now my application uses a scoped DbContext, so there is one per user session, and I won't see the initialization overhead at all; so the method of measuring time after a dummy query seems accurate.
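
For reference, this is the kind of registration I mean (a sketch; the DepotContext options and the connection string name are placeholders). AddDbContext registers the context as scoped by default, and the lifetime can be passed explicitly:

// Scoped (the default): in Blazor Server this means one DbContext per user circuit/session.
services.AddDbContext<DepotContext>(options =>
    options.UseSqlServer(Configuration.GetConnectionString("Depot")));

// Transient: a new DbContext instance every time it is injected.
services.AddDbContext<DepotContext>(options =>
    options.UseSqlServer(Configuration.GetConnectionString("Depot")),
    ServiceLifetime.Transient);

As far as I know, EF Core caches the built model per context type and ADO.NET pools the connections, so only the very first context instance in the process should pay the full initialization cost, not every injected instance.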

Harry
  • Oh, and the huge query with 3 joins and a subquery completes in 500ms. Totally acceptable. – Harry May 20 '20 at 08:52