Joining GUID to String - Entity Framework Produces Slow Query

Question

To summarize the issue, I have a query that Entity Framework generates based on my LINQ query, which is joining two tables on a value that is stored as a UNIQUEIDENTIFIER in one and a (Nullable) VARCHAR in another.

Because of the two different data types, I have to call ToString() on the GUID value in the LINQ query.

The SQL code that is produced, although admirable, ends up being slow because, rather than just compare the two columns (as I might in a standard SQL query), it tries to convert the UNIQUEIDENTIFIER columns in the one table and tries to catch the Nulls.

If I take the query that Entity Framework produces and simply remove the part where it tries to do the conversion, the query goes from 30 seconds to 0 seconds when there are approximately 300 records to return.

So, assuming that I'm stuck with tables that have different data types to store the same values, what I would like to know is whether I could tell Entity Framework to NOT attempt the conversion on this column and just do a straight up comparison. (We do this all the time in queries and stored procedures and it works just fine and is fast.)

If there is not a way to tell Entity Framework, in this particular instance to do that, would I just need to write a stored procedure and call it, rather than trying to write the query in LINQ?

Here is the method that returns notifications for the user.

var notifications = from un in _context.UserNotifications
                                join n in _context.Notifications on un.NotificationId equals n.Id
                                join r in _context.Referrals on n.txt_assess_seq_no equals r.seq_no.ToString()
                                where un.UserId == userId & !un.IsRead
                                orderby n.create_timestamp descending
                                select n;

This gives me the results I want, but the SQL query generated by Entity Framework is slow because of the joining of Notifications.txt_assess_seq_no (which is a VARCHAR in the table) and Referrals.seq_no (which is a UNIQUIDENTIFIER).

Here is the SQL that is generated:

SELECT 
    [GroupBy1].[A1] AS [C1]
    FROM ( SELECT 
        COUNT(1) AS [A1]
        FROM   (SELECT [Extent1].[UserId] AS [UserId], [Extent2].[txt_assess_seq_no] AS [txt_assess_seq_no]
            FROM  [dbo].[UserNotifications] AS [Extent1]
            INNER JOIN [dbo].[ContactDetails] AS [Extent2] ON [Extent1].[NotificationId] = [Extent2].[seq_no]
            WHERE [Extent1].[IsRead] = 0 ) AS [Filter1]
        INNER JOIN [dbo].[Referrals] AS [Extent3] ON ([Filter1].[txt_assess_seq_no] = (LOWER( CAST( [Extent3].[seq_no] AS nvarchar(max))))) OR (([Filter1].[txt_assess_seq_no] IS NULL) AND (LOWER( CAST( [Extent3].[seq_no] AS nvarchar(max))) IS NULL))
        WHERE [Filter1].[UserId] = 'USER ID (GUID) AS STRING'
    )  AS [GroupBy1]

If I simply change this line:

INNER JOIN [dbo].[Referrals] AS [Extent3] ON ([Filter1].[txt_assess_seq_no] = (LOWER( CAST( [Extent3].[seq_no] AS nvarchar(max))))) OR (([Filter1].[txt_assess_seq_no] IS NULL) AND (LOWER( CAST( [Extent3].[seq_no] AS nvarchar(max))) IS NULL))

to:

([Filter1].[txt_assess_seq_no] = [Extent3].[seq_no])

It is super duper fast and is just what I want, but I'm not quite sure how to get there. I'd like to keep it written in LINQ, but if I just have to use a stored procedure or some old fashioned SQL code, I will do that.

Any thoughts or advice would be greatly appreciated!

What you really should do is fix the tables. When you mix datatypes you are setting yourself up for performance issues. — Sean Lange, Jul 21 '17 at 20:32
I know, but in this case, that is not possible for reasons that you'd probably not like to hear. I did mention that *assuming* that is not possible, would you have any advice. I do appreciate you answering, though! — Bill Kron, Jul 21 '17 at 20:34
Well what is happening is that EF realizes the datatypes are different and does an explicit conversion. Your code is just doing an implicit conversion. I don't know much of anything about EF other than a lot of the sql it creates is horrible for performance. — Sean Lange, Jul 21 '17 at 20:36
The thing is that SQL Server handles comparing UNIQUEIDENTIFIERs to VARHCHARs just fine (it does its own implicit conversion) and that is not a performance problem in our other applications where we write custom code against this same database. I'd like to know if EF has some way of being told to NOT bother on a per-query basis. Thanks again! — Bill Kron, Jul 21 '17 at 20:39
You have been lucky. Implicit conversions can cause some serious performance issues. But the conversion code that EF generated is not so whippy. Sorry but I can't help you with EF stuff. Hopefully somebody will come along and help. — Sean Lange, Jul 21 '17 at 20:49
The implicit conversion goes the other direction from the explicit one. Uniqueidintifier has higher prececedence than string so without an explicit cast of guid to string you get the reverse. — Martin Smith, Jul 21 '17 at 21:11
Martin, I'm not sure I totally understand. Is this something I can address in the LINQ query itself? I am calling ToString() on the GUID property there, but I assume that is not the same as what you are saying here? — Bill Kron, Jul 21 '17 at 21:13
Don't know. in the LINQ query what happens if you convert the string to Guid rather than calling ToString on the Guid? But even if this does happen to work you should fix your datatypes instead. — Martin Smith, Jul 21 '17 at 21:15
I just tried to do a Guid.Parse(r.seq_no) inside the LINQ query, but it fails at runtime because the method cannot be translated into a store expression. There must be another way to do this, so that I can still join on the columns. I will research and report back. Thank you for the idea! — Bill Kron, Jul 21 '17 at 21:26
What about `new Guid(string)`? I rarely use EF and wouldn't be surprise if that fails as well. The answer given doesn't seem optimistic. — Martin Smith, Jul 21 '17 at 21:28

David Browne - Microsoft · Accepted Answer · 2017-07-22T18:11:00.663

2

As noted SQL Server will perform an implicit conversion in the join, converting the (N)VARCHAR value to UNIQUEIDENTIFIER for comparison. Note this is the opposite of your explicit conversion. There's no way in EF to cause a LINQ query to generate a query with an implicit conversion here.

You might try at the EF Core project, and suggest some way to specify in implicit-conversion comparison between different types.

Anyhoo, your performance problem may not relate to the conversion difference. Try setting

db.Configuration.UseDatabaseNullSemantics = true;

which will simplify the null handling in the generated query.

For a join, you can rewrite as a filtered crossjoin (ie non-ANSI join) to get this behavior. eg

using System;
using System.Collections.Generic;
using System.Data.Entity;
using System.Data.Entity.SqlServer;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace ConsoleApp6
{

    class Foo
    {
        public int Id   { get; set; }
        public string Name { get; set; }
        public Guid UID { get; set; }
    }

    class Bar
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string UID { get; set; }
    }
    class Db: DbContext
    {
        public DbSet<Foo> Foo { get; set; }
        public DbSet<Bar> Bar { get; set; }
    }
    class Program
    {

        static void Main(string[] args)
        {
            Database.SetInitializer(new DropCreateDatabaseAlways<Db>());

            using (var db = new Db())
            {
                db.Configuration.UseDatabaseNullSemantics = true;

                var q = from f in db.Foo
                        from b in db.Bar
                        where b.UID == f.UID.ToString()
                        select new { f, b };
                Console.WriteLine(q.ToString());


                Console.ReadKey();

            }
        }
    }
}

outputs

SELECT
    [Extent1].[Id] AS [Id],
    [Extent1].[Name] AS [Name],
    [Extent1].[UID] AS [UID],
    [Extent2].[Id] AS [Id1],
    [Extent2].[Name] AS [Name1],
    [Extent2].[UID] AS [UID1]
    FROM  [dbo].[Foos] AS [Extent1]
    CROSS JOIN [dbo].[Bars] AS [Extent2]
    WHERE [Extent2].[UID] = (LOWER( CAST( [Extent1].[UID] AS nvarchar(max))))

edited Jul 22 '17 at 18:11

answered Jul 21 '17 at 21:24

David Browne - Microsoft

80,331
6
39
67

This sounds promising. Will this just impact this query or does this setting go outside the scope of this query in any way? – Bill Kron Jul 21 '17 at 21:29
1

You can set this in your DbContext constructor to apply to all queries, or you can set it on a DbContext instance to apply to just selected queries. – David Browne - Microsoft Jul 21 '17 at 21:36
I just had a chance to try this by adding it to the DbContext constructor, but it still produces the exact same query. Any other thoughts, David? – Bill Kron Jul 21 '17 at 22:39
I was looking around and saw something about the fact that this could still happen if you're trying to join on a field that could still be nullable, which I am. So, in that case, it was suggested to do an explicit null check inside the query, which I did add, but it didn't change the generated query. All I did was add: & n.txt_assess_seq_no != null to the LINQ WHERE clause. Although, I'm not sure that is even right. Here was the answer I was referring to: https://stackoverflow.com/a/39101728/912854 – Bill Kron Jul 21 '17 at 23:11
1

@BillKron Please note that the option applies to comparison operators, but not for joins (David's colleagues seem to forgot them :). So additionally to turning the option to `true`, replace `join r in _context.Referrals on n.txt_assess_seq_no equals r.seq_no.ToString()` with `from r in _context.Referrals where n.txt_assess_seq_no == r.seq_no.ToString()` – Ivan Stoev Jul 21 '17 at 23:53
1

Ivan, thank you for the additional help! I will report back with the results later this evening. – Bill Kron Jul 21 '17 at 23:55
1

@DavidBrowne-Microsoft (and Ivan, too) - Much appreciation to you both. This is EXACTLY what I was looking for; an EF configuration settting that allowed me to keep the existing query in LINQ with very minor changes. Ivan, I was able to make the adjustments as you indicated. David, as a result, the previously offending WHERE clause now looks like this: WHERE ( [Filter1].[txt_assess_seq_no] = ( Lower(Cast([Extent3].[seq_no] AS NVARCHAR(max))) ) ) Perfection! And it's super duper fast. What more can a guy ask for? Thank you, both, so very much for your help! – Bill Kron Jul 22 '17 at 04:03
@IvanStoev - I hate to ask, but you were so helpful on this issue and thought you could help me with something similar. If you are willing and able, I have a new SO question here: https://stackoverflow.com/questions/49887361/troubleshooting-linq-query-performance-in-asp-net-mvc and would love your feedback if you have the time. Thank you. – Bill Kron Apr 18 '18 at 16:17

score 1 · Answer 2 · answered Jul 21 '17 at 21:54

1

You could however create a stored proc that uses the implicit conversion and run that from EF.

answered Jul 21 '17 at 21:54

HLGEM

94,695
15
113
186

Joining GUID to String - Entity Framework Produces Slow Query

2 Answers2