15

I have a very strange behavior when using TPH on EF 6.1.3. Here is a basic example to reproduce :

public class BaseType
{
    public int Id { get; set; }
}
public class TypeA : BaseType
{
    public string PropA { get; set; }
}
public class TypeB : BaseType
{
    public decimal PropB { get; set; }
    public OneEnum PropEnum { get; set; }
}
public class TypeC : TypeB
{
    public int PropC { get; set; }
}

public enum OneEnum
{
    Foo,
    Bar
}

public partial class EnumTestContext : DbContext
{
    public EnumTestContext()
    {
        this.Database.Log = s => { Debug.WriteLine(s); };
    }
    public DbSet<BaseType> BaseTypes { get; set; }
}

class Program
{
    static void Main(string[] args)
    {
        Database.SetInitializer(new DropCreateDatabaseAlways<EnumTestContext>());
        using (var context = new EnumTestContext())
        {
            context.BaseTypes.Add(new TypeA() { Id = 1, PropA = "propA" });
            context.BaseTypes.Add(new TypeB() { Id = 2, PropB = 4.5M, /*PropEnum = OneEnum.Bar*/ });
            context.BaseTypes.Add(new TypeC() { Id = 3, PropB = 4.5M, /*PropEnum = OneEnum.Foo,*/ PropC = 123 });
            context.SaveChanges();

            var onetype = context.BaseTypes.Where(b => b.Id == 1).FirstOrDefault();

            Console.WriteLine("typeof {0} with {1}", onetype.GetType().Name, onetype.Id);
        }

        Console.WriteLine("Press any key to exit...");
        Console.ReadKey();
    }
}

This code works perfectly, but the generated query is extrememly weird and complex, especialy there are a lot of CASE WHEN

SELECT 
    [Limit1].[C1] AS [C1], 
    [Limit1].[Id] AS [Id], 
    [Limit1].[C2] AS [C2], 
    [Limit1].[C3] AS [C3], 
    [Limit1].[C4] AS [C4], 
    [Limit1].[C5] AS [C5]
    FROM ( SELECT TOP (1) 
        [Extent1].[Id] AS [Id], 
        CASE WHEN ([Extent1].[Discriminator] = N'BaseType') THEN '0X' WHEN ([Extent1].[Discriminator] = N'TypeA') THEN '0X0X' WHEN ([Extent1].[Discriminator] = N'TypeB') THEN '0X1X' ELSE '0X1X0X' END AS [C1], 
        CASE WHEN ([Extent1].[Discriminator] = N'BaseType') THEN CAST(NULL AS varchar(1)) WHEN ([Extent1].[Discriminator] = N'TypeA') THEN [Extent1].[PropA] WHEN ([Extent1].[Discriminator] = N'TypeB') THEN CAST(NULL AS varchar(1)) END AS [C2], 
        CASE WHEN ([Extent1].[Discriminator] = N'BaseType') THEN CAST(NULL AS decimal(18,2)) WHEN ([Extent1].[Discriminator] = N'TypeA') THEN CAST(NULL AS decimal(18,2)) WHEN ([Extent1].[Discriminator] = N'TypeB') THEN [Extent1].[PropB] ELSE [Extent1].[PropB] END AS [C3], 
        CASE WHEN ([Extent1].[Discriminator] = N'BaseType') THEN CAST(NULL AS int) WHEN ([Extent1].[Discriminator] = N'TypeA') THEN CAST(NULL AS int) WHEN ([Extent1].[Discriminator] = N'TypeB') THEN [Extent1].[PropEnum] ELSE [Extent1].[PropEnum] END AS [C4], 
        CASE WHEN ([Extent1].[Discriminator] = N'BaseType') THEN CAST(NULL AS int) WHEN ([Extent1].[Discriminator] = N'TypeA') THEN CAST(NULL AS int) WHEN ([Extent1].[Discriminator] = N'TypeB') THEN CAST(NULL AS int) ELSE [Extent1].[PropC] END AS [C5]
        FROM [dbo].[BaseTypes] AS [Extent1]
        WHERE ([Extent1].[Discriminator] IN (N'TypeA',N'TypeB',N'TypeC',N'BaseType')) AND (1 = [Extent1].[Id])
    )  AS [Limit1]

Except the cost of multiple and useless THEN CAST(NULL as X), the query is large (> 50 KBs) in my project because I have a lot of derived classes, containing a lot a properties. As you can expect, my DBA team is not happy to see this kind of queries to our databases.

If I remove the enumeration property on TypeB, the request is much more cleaner. Same thing if I have only two hierarchy levels, aka class TypeC : BaseType (compared to 3 in the example because class TypeC : TypeB).

Is there any settings or model configuration or workaround to avoid this strange behavior ?

Update

Here is the generated query if I remove TypeB.PropEnum

SELECT TOP (1) 
    [Extent1].[Discriminator] AS [Discriminator], 
    [Extent1].[Id] AS [Id], 
    [Extent1].[PropA] AS [PropA], 
    [Extent1].[PropB] AS [PropB], 
    [Extent1].[PropC] AS [PropC]
    FROM [dbo].[BaseTypes] AS [Extent1]
    WHERE ([Extent1].[Discriminator] IN (N'TypeA',N'TypeB',N'TypeC',N'BaseType')) AND (1 = [Extent1].[Id])

Update 2

A common solution is to create a separate property the integer value and ignore the enum property. This works, but it's quite confusing to have 2 properties for the same purpose.

public class TypeB : BaseType
{
    public decimal PropB { get; set; }

    public int PropEnumValue { get; set; }

    [NotMapped]
    public OneEnum PropEnum
    {
        get { return (OneEnum)PropEnumValue; }
        set { PropEnumValue = (int)value; }
    }
}

Update 3

I've found a bug on codeplex : https://entityframework.codeplex.com/workitem/2117. It doesn't seems to be solved.

Cybermaxs
  • 24,378
  • 8
  • 83
  • 112
  • Performance-wise, EF does not work very well with large hierarchies of classes, even in case of TPH (at least that was the case with EF 5). I strongly suggest against using EF if you are planning on creating huge hierarchies of classes – RX_DID_RX Nov 25 '15 at 10:30
  • maybe, but why do I have this problem only with enums ? If I remove all enums, the SQL is always clean even with large hierarchies – Cybermaxs Nov 25 '15 at 12:37
  • Sorry, the last version I've used (EF 5) did not have support for enums. We were stuck storing enums as ints. Maybe you should try something similar. Try store your enum value as integer and see how this affects performance – RX_DID_RX Nov 25 '15 at 20:05
  • @Cybermaxs Why are you not satisfied with your solution (the one with additional property) ? It's exactly what i used on one of my projects and it was kind of ok... SQL doesn't support Enumerations directly so it's essential to use proxy of some sort isn't it ? And it can very well be a property. – Fabjan Nov 27 '15 at 10:59
  • First, Enums are supported by EF6. By default, they're stored as int in my DB. It's not the problem here. I am not satisfied with my solution because I have to update my model (initially correct) just to fix the generated query. It's also quite confusing to have 2 properties for the same *value* ; the int version should not be used directly, and doesn't have constraints compared to the enum version. – Cybermaxs Nov 27 '15 at 12:15
  • Can't you include the db values of the ints on your enum? Like `public enum OneEnum : int { Foo = 1, Bar = 2 }` – Alexander Derck Nov 28 '15 at 16:48
  • Looks like a bug to me - if the query is simple when using an int property having an enum property should not change this. Internally enum types are converted to the underlying enum type and therefore should follow rules for its underlying type. – Pawel Nov 29 '15 at 18:15
  • @AlexanderDerck : same query – Cybermaxs Nov 30 '15 at 08:50
  • @Pawel : that's why there is this question – Cybermaxs Nov 30 '15 at 08:50
  • @Cybermaxs - file it at https://entityframework.codeplex.com/WorkItem/Create?ProjectName=entityframework – Pawel Dec 02 '15 at 05:56

2 Answers2

2

About using EF/Large queries

I've done some work with EF6 and semi-large hierarchies. There are a few things you should consider. First of all why isn't your DBA team not happy with these kind of queries. Of course these arn't the queries they would write but assuming management doesn't want you to spend the time to write every single query from scratch they'll have to live with the fact that you use an ORM framework and that ORM framework might cause queries that are a bit larger.

Now if they have specific performance concerns you SHOULD address those.

What you can do

now what can you do to clean up your queries.

1) Make all classes that could be abstract abstract.

2) Make all other classes sealed.

3) In your linq queries cast to concrete types where possible(using OfType() ). This might even work better than an .Select(x => x as SomethingHere). If you have a particular nasty query it might take some experimentation what tunes your query from linq best.

explanation what i've found through experimentation

As you notice with your queries it's checking the discriminator. If you queries get a bit more complex (and i expect those 50k queries to be one of those) you'll see that it adds in code for string concatenation to check every possible combination. you see that happening a bit in the

THEN '0X' WHEN ([Extent1].[Discriminator] = N'TypeA') THEN '0X0X' 

part. I've did some POCs trying to figure out this behaviour and what seems to be happening is that entity framework is translating properties to 'aspects' (my term). For example an class will have a "PropertyA" if the translated string contains either '0X' or '0X0X'. PropertyB it might translate to "R2D2" and PropertyC to "C3P0". that way if a classname is translated to "R2D2C3P0". it knows it has both PropertyB & PropertyC. It has to take in account some hidden derived types and all supertypes. Now if enity framework can be more sure about your class hierarchy (by making classes sealed) it can simplify the logic here. And in my experience the string building logic EF generates can be even more complex than then ones you're showing here. That is why making classes abstract/sealed EF can be smarter about this and reduce your queries.

Another performance tip

Now also make sure you have proper indexes on the discriminator column. (You could do this from your DbMigration script inside entity framework).

'Desparate' performance measure

Now if all else fails make your discriminator an int. This will hurt the readability of your database/queries a LOT, but it helps performance. (and you could even have all your classes automatically emit a property that contains the class name so you keep some readability of types inside your database).

UPDATE:

after some more research after the comment from RX_DID_RX it turns out you can only seal/make poco's abstract if you don't use dynamic proxy generation. (lazy loading & change tracking). In my particular app we didn't use this so it worked well for us but i have to revert my earlier recommendation.

For more detail an EF6 specific link http://www.entityframeworktutorial.net/Types-of-Entities.aspx

adding indexes, and playing with casting in linq queries can still help though.

Batavia
  • 2,497
  • 14
  • 16
  • I really appreciate the comment ... but it doesn't seem to be focused on my problem (enum with TPH) – Cybermaxs Nov 30 '15 at 14:38
  • You're problem is that EF6 generates queries which are too large according your DBA. I show a way so EF can simplify your queries (with the remark that i think your DBA shouldn't complain about query size as long as they don't bring down the server in terms of performance). AFAIK this should help your problems with and without enums - and you do remark seeing the same thing happening in 2 vs 3 hierarchy levels – Batavia Nov 30 '15 at 15:26
  • "Make classes abstract/sealed" - don't know much about EF 6, but back in EF 4 it was a bad idea. See: https://msdn.microsoft.com/library/dd468057(v=vs.100).aspx – RX_DID_RX Dec 02 '15 at 08:43
  • hey RX_DID_RX, i didn't know that. certainly worth further investigation. My experience however is that queries become simpler and performance was improved (in our particular situation) when sealing/abstracting classes. But maybe that was in combination with some other things we did – Batavia Dec 02 '15 at 09:16
0

From Batavia answer about queries: "Now if they have specific performance concerns you SHOULD address those" and don't waste time with other queries. And also, don't waste time to understand why EF generates a query (if you trace LINQ queries with Include you will be negatively impressed about the generated queries).
Other queries you need to address is query that are not compatible with your EF Provider (like queries with CROSS JOINs that sometimes EF generate).

About performance in SQL statements (in DML you can find several other questions also on stackoverflow):
- if you want you can use stored procedures;
- there is a missing feature in EF. You can't run a SQL query and map it to a class using the mapping defined in EF. You can find an implementation here Entity framework Code First - configure mapping for SqlQuery (actually it could need some fix to work with TPH).

Community
  • 1
  • 1
bubi
  • 6,414
  • 3
  • 28
  • 45