11

This may be too much of an opinion-based question but here goes:

I've found an interesting quirk with Entity Framework and database migrations. It seems that whenever we create a foreign key it also creates an index on that column.

I read this SO question: Entity Framework Code First Foreign Key adding Index as well and everyone seems to say it's a great, efficient idea but I don't see how; indexing a column is very circumstance-specific. For instance, EF is indexing FKs on my table that are almost never (~1%) used for searches and are also on a source table, meaning that even when I join other tables, I'm searching the FK's linked table using it's PK...there's no benefit from having the FK indexed in that scenario (that I'm aware of).

My question:

Am I missing something? Is there some reason why I would want to index a FK column that is never searched and is always on the source table in any joins?

My plan is to remove some of these questionable indexes but I wanted to to confirm that there's not some optimization concept that I'm missing.

Community
  • 1
  • 1
  • 2
    *This is not an opinionated question.* It is about configuring EF to be less stupidly opinionated. YMMV. – user2864740 Apr 17 '15 at 19:02
  • 1
    Entity Framework would usually want it indexed because it speeds up the process of populating the navigation properties on the independent objects. – Asad Saeeduddin Apr 17 '15 at 19:03
  • @Asad Indexes only speed up queries .. when they are usable. They also slow down updates. A good database design uses indices as appropriate. *The OP clearly says that this scenario is ~1%.* – user2864740 Apr 17 '15 at 19:03
  • 4
    That's why you should never let an ORM dictate the structure of your database. – Alejandro Apr 17 '15 at 19:05
  • 1
    @Alejandro I would argue that is the *number one reason* why *not* to use EF [code first / migrations] .. and why I dislike EF [code first / migrations]. However, the OP *is* using EF [code first / migrations] .. – user2864740 Apr 17 '15 at 19:06
  • @user2864740 If the user has entities on the other side of the relationship that have navigation properties, the navigation properties will be filled by doing a query that filters by the foreign key column. Since navigation properties are so ubiquitous in EF, I don't think a slightly redundant index is such a horrible design decision. – Asad Saeeduddin Apr 17 '15 at 19:07
  • 2
    Just relax, bro. EF is working as it should. – NoWar Apr 17 '15 at 19:07
  • @Asad I think EF [code first / migrations] is a horrid opinionated design. Opinions are like.. and EF has more than one. – user2864740 Apr 17 '15 at 19:07
  • 6
    @user2864740 I don't know if the EF devs murdered your dog and eloped with your wife or something, but there's no sense in being irrationally upset about EF. It's a tool; it helps some people for their use cases, maybe you're not one of those people. "EF is horrid opinionated design" is not productive or relevant to the OP's question. – Asad Saeeduddin Apr 17 '15 at 19:10
  • @Asad I am not "irrationally upset". I dislike EF for *good reasons* (if you are curious what they are, search for EF and my profile); I use EF *a lot* with *many* different schemes/databases. Once moving outside of EF-owns-the-schema it becomes a leaky ship. EF *is* opinionated and *is* horrible when stepping outside this opinionated realm which is why I call it "horrid": it is the EF way or bust. It is *more harmful* to assume that fighting EF means that EF is correct - because of how much it binds the model. If the tool 'wins', you 'lose'. – user2864740 Apr 17 '15 at 19:28
  • 1
    @user2864740 Many people like the opinionated "this is how you need to do it, and we'll make things a little faster if you play nice" feel. It's great for small projects where you don't really care about the ORM and just want to get it out of your way. Like I said, different people have different use cases: EF works great for some, if you want a lot of customization you have options like NHibernate. – Asad Saeeduddin Apr 17 '15 at 19:29
  • @Asad "It feels great for small project" - RoR feels great for a todo-list in 5 minutes as well. Let me know how 'they' feel when working on a non-trivial project, especially when such includes existing schemas. – user2864740 Apr 17 '15 at 19:31
  • 3
    This is quickly turning into the opinion war I was trying to avoid. I greatly appreciate all inputs but am getting the vibe that it boils down to my original thought: create smart indexes based on my data. EF will make a good guess but if it does something I don't like, change it. –  Apr 17 '15 at 19:31
  • It isn't an "opinion war". **It is a *fact* that EF is opinionated**, which is were it started - *I am defending the OPs stance against EF* and against statements like "EF is working as it should" (which is itself an opinion). It is my *opinion* that EF has stupid opinions. This does not change the *fact* that it is opinionated. – user2864740 Apr 17 '15 at 19:32

4 Answers4

12

In EF Code First, the general reason why you would model a foreign key relationship is for navigability between entities. Consider a simple scenario of Country and City, with eager loading defined for the following LINQ statement:

var someQuery = 
   db.Countries
     .Include(co => co.City)
     .Where(co => co.Name == "Japan")
     .Select(...);

This would result in a query along the lines of:

SELECT *
FROM Country co
INNER JOIN City ci
  ON ci.CountryId = co.ID
WHERE co.Name = 'Japan';

Without an Index on the foreign key on City.CountryId, SQL will need to scan the Cities table in order to filter the cities for the Country during a JOIN.

The FK index will also have performance benefits if rows are deleted from the parent Country table, as referential integrity will need to detect the presence of any linked City rows (whether the FK has ON CASCADE DELETE defined or not).

TL;DR

Indexes on Foreign Keys are recommended, even if you don't filter directly on the foreign key, it will still be needed in Joins. The exceptions to this seem to be quite contrived:

  • If the selectivity of the foreign key is very low, e.g. in the above scenario, if 50% of ALL cities in the countries table were in Japan, then the Index would not be useful.

  • If you don't actually ever navigate across the relationship.

  • If you never delete rows from the parent table (or attempt update on the PK) .

One additional optimization consideration is whether to use the foreign key in the Clustered Index of the child table (i.e. cluster Cities by Country). This is often beneficial in parent : child table relationships where it is common place to retrieve all child rows for the parent simultaneously.

StuartLC
  • 104,537
  • 17
  • 209
  • 285
  • 2
    The OP claims the index is irrelevant for his/her queries. This mostly just reiterates what is. – user2864740 Apr 17 '15 at 19:36
  • 1
    I believe you may have missed the point about Joins. Despite OP not explicitly filtering on a FK, he / she will likely be using it heavily in Joins. – StuartLC Apr 17 '15 at 19:41
  • 1
    @StuartLC I think the OP's use case is one where the join is only ever carried out one way: i.e. data is selected from the dependent table and joined onto the table that *doesn't* have the foreign key column, in which case the index doesn't benefit you. – Asad Saeeduddin Apr 17 '15 at 19:44
  • Correct @Asad. We have a `People` table and an `Addresses` table, for example. The `People.AddressID` FK was Indexed by EF but I only ever start from a `People` row and search for the `Addresses` record; I never find an `Addresses` row and then search the `People.AddressID` column for a matching record. –  Apr 17 '15 at 20:20
  • 7
    @Kalmino - If you never do a reverse navigation, then why model it? – Erik Funkenbusch Apr 17 '15 at 21:42
  • @ErikFunkenbusch - The FK relationship exists so that we can start from the People table and find their FK related information, like Address. The part that seems to imply a modeled reverse navigation is the Index, which is what I was questioning in the first place; I don't think it should be there. –  Apr 20 '15 at 13:08
  • 2
    @Kalmino - I did not ask you why you model the People -> Address relationship, I asked why you are modeling the Address -> People relationship. ie, why do you have a reverse navigational property? – Erik Funkenbusch Apr 20 '15 at 14:03
  • @StuartLC given the OP, you should have elaborated on why the database performance will be impacted with deletes and updates if EF chose not to add an index. – Jeff Fischer Mar 09 '23 at 02:48
  • 1
    @JeffFischer well, you've added your answer, and you're right, I missed out on RI checks on inserts and FK updates. +1 – StuartLC Mar 09 '23 at 08:01
1

Short answer. No.

To expand slightly, at the database create time, entity framework does not know how many records each table or entity will have, nor does it know how the entities will be queried.

*In my opinion * the creation of a foreign key is more likely to be right than wrong, I had massive performance issues using a different ORM which took longer to diagnose because I thought I had read in the documentation that it behaved the same way.

You can check the Sql statement that EF produces and run it manually if you want to double check.

You know your data better than EF does, and it should work just fine if you drop the index manually.

IIRC you can create 1 way navigation properties if you use the right naming convention, although this was some time ago, and I never checked whether the index was created.

Alejandro
  • 7,290
  • 4
  • 34
  • 59
ste-fu
  • 6,879
  • 3
  • 27
  • 46
1

The accepted answer insufficiently answers the question at hand, so I'm going to give it another go.

The direct answer to the question is that: High performance database design will always prefer having indexes on Foreign Keys because they're used during the check process for inserts, updates and deletes.

When any data is being inserted, updated or deleted constraints must be checked against existing foreign keys.

The OP is solely focused on the query patterns which misses the point. Indexes in this case are dual purpose: query patterns (which are not at play in this scenario) and FK constraint maintenance.

Many production apps have suffered by doing deletes on tables with FKs that are missing corresponding indexes. The data set and cascading delete just has to be large enough.

Here's a decent resource to read further - https://www.dataversity.net/foreign-keys-and-the-delete-performance-issue/

As the article mentions,

There is a very basic, yet very common misunderstanding set by developers and beginner modelers

and unfortunately the opinionated conversation has ensued from this. There is no opinion when it comes to good database design principles that must be followed or suffer the performance consequence.

Jeff Fischer
  • 2,063
  • 1
  • 17
  • 12
0

Change the conflict FK (Foreign Key) name in ApplicationDbContextModelSnapshot file with another one. Then add migration again. It will override to it and not gonna give error.