7

A colleague claims that he used foreign keys to optimize queries in past. I think that foreign keys are used only when inserting or updating data in a table. I don't see how they can be used to speed up the search.

How can a foreign key help when creating an execution plan? Am I missing something? If so, then under what circumstances does it help?

(We use PostgreSQL, which I don't have much experience with. Is it possible that it behaves differently than, say, Oracle or MySQL?)

Martin Grey
  • 744
  • 2
  • 12
  • 26
  • On PostgreSQL at least a foreign key does not boost performance, you are right it is simple checked on INSERT, UPDATE, DELETE or TRUNCATE. But make no change in SELECT. – MatheusOl Sep 08 '16 at 11:32
  • The index will improve the performance for DELETE statements on the parent table as the database will need to check if there are no more child rows - and that check will be faster if the FK column is indexed –  Sep 09 '16 at 09:03
  • I don't know about PostgreSQL but I know it helps on e.g. MSSQL. Even when it doesn't help right now, it might in the future as the program (specifically the optimizer/planner) is still under active development. – deroby Sep 09 '16 at 10:02
  • Does this answer your question? [Does Foreign Key improve query performance?](https://stackoverflow.com/questions/507179/does-foreign-key-improve-query-performance) – philipxy Mar 09 '22 at 04:45
  • @philipxy that is a similar question for Microsoft SQL Server, while I was interested in PostgreSQL. Different database systems sometimes behave in different ways. – Martin Grey Mar 15 '22 at 15:12

3 Answers3

8

Yes, foreign keys can definitely improve performance of queries, but it depends on the Database you are using and often whether these keys are 'enforced' or not.

In Oracle and SQL Server having foreign keys definitely can increase performance when reading / joining multiple tables on their foreign key

Why? Having an checked/validated foreign key gives the query optimizer extra information regarding the relation 2 tables have.

It knows, that when a child table is inner joined to a parent table:

  1. That the parent table has the same amount of records or less compared to the child table.
  2. That all keys in the child exists in the parent.

This all helps the query optimizer in estimating the rows that are going to be processed. This esimation being right is really important for most (if not all) query optimizers.

Proof for this general fact can been seen by the recent addition of foreign keys in the form of metadata only to Hadoop Hive. The goal of this addition is do help the CBO (Cost Based Optimizer), this Hive Jira entry explains...

Furthermore, having (bitmap) indexes on foreign keys also improves performance in Oracle when using fact tables: 'A bitmap index should be built on each of the foreign key columns of the fact table or tables'. See the following link...

Foreign keys, for obvious reasons will cost you extra when inserting / updating data: extra work has to be done by the database compared with NOT having fk's

You can easily see this in SQL server (for example) by investigating Explain plans.

I do not know Postgresql, but my approach to validate the effect of FK's would be to look at explain plans. Do they differ when FK's are enabled / disabled / dropped?

[Edit] I actually found this proof that FK's can enable read performance in Postgresql but the reason for this is somewhat different: BECAUSE FK's are enabled, the query in the example can be changed to be more performant.

  • I tried to disable (or even drop) FKs on various tables, but execution plans were still the same. The optimizer can guess table sizes from index sizes. I don't think the FK itself can give it anything it doesn't already know. – Martin Grey Sep 08 '16 at 14:32
  • Hmm, It could be that Postgresql is different from Oracle / SQL server in this behaviour. I am really shure it has effect there. Are the plans **also the same** with fk's enabled / disabled when you alter the query by **including a row-restricting filter** on the Parent or Child table? – Rogier Werschkull Sep 09 '16 at 08:37
  • 1
    I think the emphasis here is on "*can* improve performance", right? As you say it gives the optimiser more information to go on, and can give better estimates of join and query result cardinality. No guarantees of course, and in many cases there will be no effect on the optimiser because statistics (max/min/distinct values) on the joined columns will strongly suggest that all values in the child table's column are present in the parent table's column. – David Aldridge Sep 09 '16 at 11:19
  • 1
    Furthermore, if you imagine a query such as "select from child table where a row exists in the parent table", this can be transformed to "select from child table" in the presence of a foreign key. – David Aldridge Sep 09 '16 at 11:20
  • The two main points of this answer are wrong and/or misleading. The parent table can still have more elements than the child table, no problem. A parent member does not have to have a child, even with a foreign key from child to parent. The optimizer does indeed know that all keys of the child table exist in the parent. So the result is between 1 and the number of elements in the child table. This is the same as if we have no foreign key but in which the target table's fields form a unique or primary key (which is a prerequisit for a foreign key in postgres but not necessarily in other DBs). – Christopher Gertz Oct 02 '21 at 20:53
3

Without an index on foreign key(s) the whole child table needs to be scanned to verify if any rows are referencing the key(s) you're trying to delete or update in the parent (i.e. foreign) table.

So yes, in this particular case having indexes will greatly improve performance.

Vedran
  • 10,369
  • 5
  • 50
  • 57
  • 1
    Conversely - if you will **never** allow deletion of parent rows or updates of the parent PK column, having the index will only incur a cost on child table inserts and updates. – Vedran Oct 06 '22 at 09:10
2

Primary Keys and Unique constraints create a corresponding INDEX. But not FK constrains:

declaration of a foreign key constraint does not automatically create an index on the referencing columns.

https://www.postgresql.org/docs/current/static/ddl-constraints.html

So you are right. But often it is a good idea to have an index on FK

Vao Tsun
  • 47,234
  • 13
  • 100
  • 132