4

I would like to discuss this simple postgres query and figure out if I have got a bit of the theory behind postgres and DBs in general. Here it comes:

-- Query 1
SELECT posts.*, users.* 
         FROM posts INNER JOIN users 
            ON posts.user_id = users.id
         WHERE 
            posts.user_id = :id
         ORDER BY posts.creation_time

It clearly involves two tables (users and posts) which are joined together. There is a index on (posts.creation_time, posts.user_id) to speed up the search.

My understanding is that I ALSO need a foreign key that associates posts.user_id to users.id not only to enforce a form of referencial integrity as such, but olso, and probably more importantly, to speed up the ON posts.user_id = users.id bit of the query.

Am I right in saying so?

Now conside this version of the query:

-- Query 2
SELECT posts.*, users.* 
         FROM posts INNER JOIN users 
            ON posts.user_id = users.id
         WHERE 
            users.id = :id
         ORDER BY posts.creation_time

Assuming the aforementioned foreign key (posts.creation_time, posts.user_id) exists, would this query scale, taking into account the variable in the equation lives on the other side of the join?

My guess is that it won't scale

Thanks

nourdine
  • 7,407
  • 10
  • 46
  • 58
  • Check the execution plan and you'll know (but in general: indexes speed up queries, foreign key don't have an impact on performance - only on the correctness of the data) –  Mar 24 '14 at 21:04
  • FKs are for maintaining the referential integrity. FKs can have impact on performance too. Just consider the additional checks on delete for instance, which have to be made to maintain the referential integrity. – ivicaa Mar 24 '14 at 21:06
  • ok but in general and levaing aside referencial integrity: do you need to index the field involved in the join in order to speed up the matching? and if so, should it be part of a multicolumn index along with the other fields involved in the query (posts.user_id, posts.creation_time) or should it be an independent index? – nourdine Mar 24 '14 at 21:20
  • Yes, you should definitely index the columns that are used in a join (or a where clause). Whether or not a multi-column join is better you need to verify by looking at the execution plan –  Mar 24 '14 at 22:27

2 Answers2

0

Check this post:

Postgres and Indexes on Foreign Keys and Primary Keys

Foreign keys are necessary for maintaining referential integrity of the data. Indexes is what helps to make queries faster. Creating a foreign key will not automatically create an index.

Community
  • 1
  • 1
ivicaa
  • 605
  • 1
  • 6
  • 17
0

If an index on the FK is required for a certain query, depends on the queries execution plan. If you start on the child, you use the PK index to find the corresponding parent rows. If you start on the parent side, you need one to efficiently find the corresponding child rows. But remember that there are some implicit queries that might take place, e.g. deleting a row from the parent table has to ensure that there are no corresponding child rows. Thus in most cases it is a good idea to put an index on a FK.

I am not sure about postgresql, but in oracle not creating indexes on FK may even lead to table locks when deleting parent rows.

Also rember that you can usually use only one index per table in a given query for conditions that are combined via "and" (explicit or implicit for joins). In your example I would put a combined index on table posts with columns user_id, creation_time (in this order), to be suitable for both

WHERE posts.user_id = :id

and

ORDER BY posts.creation_time

simultaneously.

Drunix
  • 3,313
  • 8
  • 28
  • 50
  • "*you can usually use only one index per table*" - Postgres can happily use more than one query per table in a single statement. –  Mar 24 '14 at 22:26
  • You can have more than one index on a table, but only one is used in a given query (except the table appears more than once in the query) – Drunix Mar 25 '14 at 05:29
  • No. Postgres *can* use more than one index in a single query for a single table. http://sqlfiddle.com/#!15/55cc4/1 this is the plan a bit nicer: http://explain.depesz.com/s/ZurA –  Mar 25 '14 at 07:56
  • Yes, you are right for or-queries. I was thinking of and-queries (explicit in where or implicit via join). Will edit my statement. – Drunix Mar 25 '14 at 08:49
  • @Drunix >> Talking about Query 1: I guess you meant I should create an index like this: (`user_id`, `creation_time`) an not (`id`, `creation_time`). `id` is in the child table (users). – nourdine Mar 25 '14 at 09:37