
I'm using Entity Framework 6 Code First to store a large POCO model in a database. The model happens to have 1000s of properties (don't ask lol**), meaning it must be split into multiple SQL tables (the column limit is 1024). I know this is normally done by specifying the individual columns like so:

    modelBuilder.Entity<HugeEntity>().Map(m =>
    {
        m.Properties(e => new { e.Prop1, e.Prop2 });
        m.ToTable("HugeEntity1");
    }).Map(m =>
    {
        m.Properties(e => new { e.Prop3, e.Prop4 });
        m.ToTable("HugeEntity2");
    });

I'm wondering if there's any way to do this without having to specify the properties individually. Ideally it could partition an entity automatically based on a given column limit (viz. 1000).

Even if there is no standard way, what is the easiest hack to get this to work? The properties on the model are subject to change, so I would really like to avoid listing them exhaustively in more than one place.

Any advice appreciated!

**CONTEXT: This is a domain model, representing user-entered data that should be captured on a certain web page. It is also exposed in a WebAPI. My team considered a key/value pair approach, but decided this would make the data more difficult to consume by future BI applications that hit the WebAPI.

Kyle McClellan
  • Reflection comes to mind as a possible hack. Loop over the properties and split them equally among tables. I don't know if there is a "proper" way to do this. – Bradley Uffner Jun 28 '17 at 19:07
  • I'm going to be that person and "ask". – Zach M. Jun 28 '17 at 19:07
  • If you are able to, you might want to look into a Document Store database (mongoDB as an example). Those work great when each object can have an independent schema. – gunr2171 Jun 28 '17 at 19:09
  • I would say that any object with 1000s of properties is certainly not a POCO. It is some kind of crazy object that needs to be refactored. Working with that would be painful. Intellisense must die when it tries to load the properties into cache. – Sean Lange Jun 28 '17 at 19:12
  • Enjoy your hell that you're about to create. – Joe Phillips Jun 28 '17 at 19:12
  • Why not just one entity model per thousand properties... Doesn't solve the actual problem but... – Antoine Pelletier Jun 28 '17 at 19:58
  • I'm voting up this question because there are people out there who have this problem, and I presume that the comments and answers will be good ones for such people to see. However, if anybody is running into this problem due to the effects of translating model inheritance into the database via EF, then this is the wrong area. Look up TPH, TPT, and TPC, which are EF inheritance strategies. – pwilcox Jun 28 '17 at 21:38

2 Answers


You say "don't ask" to us. But your biggest problem is that you're saying it to yourself. If your model is getting to have even more than 50 fields, then you need to ask what's going on. I wonder if it's not worth taking a breath and revisiting some more abstract concepts in computing. I would start with Database Normalization. 1K dynamic properties tells me that you're in desperate need of this.

And by the way, "Database Normalization" concepts aren't specific to SQL databases per se. You should be normalizing your POCO models just the same where you can. Granted, OO languages have some non-relational concepts, but that's no excuse for the extremes you're describing.
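
To make that concrete, here is a minimal sketch of the idea (the entity and property names are invented for illustration, they aren't from the question): cohesive groups of fields become their own entities, related one-to-one back to the root, instead of hundreds of sibling columns:

    using System.ComponentModel.DataAnnotations;
    using System.ComponentModel.DataAnnotations.Schema;

    // Root entity: identity plus one navigation per cohesive group of fields.
    public class Applicant
    {
        public int Id { get; set; }
        public virtual ApplicantAddress Address { get; set; }
        // ...further groups as entities, not further columns
    }

    // One cohesive group, stored in its own table.
    public class ApplicantAddress
    {
        // Shared-primary-key one-to-one: the child's PK doubles as the FK.
        [Key, ForeignKey(nameof(Applicant))]
        public int ApplicantId { get; set; }

        public string Street { get; set; }
        public string City { get; set; }

        public virtual Applicant Applicant { get; set; }
    }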

Another paradigm worth exploring is Domain-Driven Design. I'm less fluent in it myself, which is why I'm not saying you should start there. But from my early experience implementing it, I'd say the learning curve has been worth it.

I should be careful not to condescend. I don't exactly have all my tables in the highest normal forms myself. But I'll tell you that in the areas where I don't, the headaches are worse.

pwilcox
  • You'd get my vote for this alone: _But your biggest problem is that you're saying it to yourself._ I also think the OP is not questioning the design but instead trying to work around it, and I believe all those columns are most probably properties that could be saved as rows with only a few columns. – t3chb0t Jun 29 '17 at 03:58
  • Added some context to original post. With it, I'd be curious to know if you still feel as strongly that redesign is required... – Kyle McClellan Jun 29 '17 at 14:15
  • No, not really. The only thing I would say is that if it is in fact already in a production environment, then of course you'll be living with the design for awhile and the transfer must be done with caution. Also, a Key-Value approach has other disadvantages than even what your team considered. It comes at the expense of your IDE not telling you that one change will not jive well with some other dependent object. So don't be in the habit of using it too freely. – pwilcox Jun 29 '17 at 15:26
  • Also, you mention that the POCO is "subject to change". This is why I said it had "dynamic" properties. That might be overkill. But it seems that you are implicitly using dynamic properties (for instance, I imagine that half the times that your POCO gets used, many of its properties are irrelevant for the purpose). A key-value approach just makes the "dynamicness" of it become explicit. Any dynamic approach, explicit or implicit, should be converted to a strongly-typed approach if possible. And it's probably more possible than you think. – pwilcox Jun 29 '17 at 15:46

Figured out a way to do it. I had to employ LINQ expressions and the `dynamic` keyword:

    // Requires: System, System.Collections.Generic, System.Data.Entity,
    // System.Linq, System.Linq.Expressions, System.Reflection
    private static void SplitIntoTables<T>(DbModelBuilder modelBuilder, IReadOnlyCollection<PropertyInfo> properties, int columnLimit) where T : class
    {
        // Add half a table of headroom so that uneven hash buckets are
        // unlikely to push any one table over the column limit.
        var numberOfTables = (int)Math.Ceiling((properties.Count + (double)columnLimit / 2) / columnLimit);
        var paramExp = Expression.Parameter(typeof(T));

        var tableIndex = 0;

        // Bucket properties by a hash of their name, so a property's table is
        // determined by the name alone rather than its position in the list.
        // Mask off the sign bit: GetHashCode() can be negative, and a negative
        // key would otherwise create extra buckets.
        foreach (var tableGroup in properties.GroupBy(p => (p.Name.GetHashCode() & int.MaxValue) % numberOfTables))
        {
            // Build a strongly typed lambda (e => e.SomeProp) for each property.
            var expressions = tableGroup.Select(p => Expression.Lambda(
                typeof(Func<,>).MakeGenericType(typeof(T), p.PropertyType),
                Expression.Property(paramExp, p), paramExp));

            modelBuilder.Entity<T>().Map(m =>
            {
                foreach (var exp in expressions)
                {
                    // "dynamic" picks the correct Property<T> overload at runtime.
                    m.Property((dynamic)exp);
                }
                m.ToTable($"{typeof(T).Name}_{++tableIndex}");
            });
        }
    }
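
For anyone wiring this up, here is a minimal usage sketch (the `HugeEntity` and `Id` names, and the choice to exclude the key from the split, are my assumptions rather than part of the answer):

    public class HugeContext : DbContext
    {
        public DbSet<HugeEntity> HugeEntities { get; set; }

        protected override void OnModelCreating(DbModelBuilder modelBuilder)
        {
            // Split every scalar property except the key; EF repeats the key
            // column in each table that takes part in entity splitting.
            var properties = typeof(HugeEntity)
                .GetProperties()
                .Where(p => p.Name != nameof(HugeEntity.Id))
                .ToList();

            SplitIntoTables<HugeEntity>(modelBuilder, properties, columnLimit: 1000);
        }
    }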
Kyle McClellan
  • Didn't see your response in my earlier comments above. When I described "dynamic" there, I wasn't referring to the keyword. Here's some info about the best practices of using the [dynamic](https://stackoverflow.com/questions/31859016/is-the-use-of-dynamic-considered-a-bad-practice) keyword. – pwilcox Jun 29 '17 at 16:12
  • This is slick. Thx for sharing :-) – t3chb0t Jun 29 '17 at 16:13
  • `p.Name.GetHashCode() % numberOfTables` why are you using the hash-code here? Is this another trick? – t3chb0t Jun 29 '17 at 16:15
  • @t3chb0t Thanks! The hashcode makes it so that which table a property goes to is determined by its name alone (not its placement in the collection). That way the model can be changed more easily without affecting the tables as much. One can be added to the middle (for instance) without causing all the properties to shift one. – Kyle McClellan Jun 29 '17 at 19:56