0

Currently I'm working in an environment where nothing - absolutely nothing in our database is relational besides what we define being relational through stored procedures and other functions.

I have no idea currently what would happen to our system if we decided to actually connect everything through proper constraining, but what I'm asking is this:

Would there be any performance increases, in implementing proper constraining?

Dale K
  • 25,246
  • 15
  • 42
  • 71
  • You have index actually or not ??? – ScaisEdge Oct 31 '19 at 08:20
  • Some things are indexed, most things are not. but definitely a good place to start.. – Mark Mazur Lussenburg Oct 31 '19 at 08:26
  • Good performance require good indexing .. connstrains are primarly ... unique index .. – ScaisEdge Oct 31 '19 at 08:28
  • If anything, in terms of straight performance, you'll likely lose some. Every insert or update has to check the data integrity as part of the operation. – Dale K Oct 31 '19 at 09:25
  • Enforcing data integrity through your code is likely to perform way worse that letting the database enforce it for you (and you will get less integrity after everything is said and done). See [this answer](https://stackoverflow.com/a/20873843/533120). – Branko Dimitrijevic Oct 31 '19 at 10:08
  • There are already some excellent answers, but the performance changes you will see from implementing referential integrity will be very hard to notice if your application is already performing well; if your application is not performing well, implementing referential integrity is not the most obvious change to make. – Neville Kuyt Oct 31 '19 at 11:03

2 Answers2

2

Primary keys and foreign keys are not performance boosters, they are not designed for performance and the reason to use them has nothing to do with performance.

They exist for a much more important reason - to make sure your data is valid.
This is called data integrity, and it's one of the main reasons why relational databases are used on the first place.

Primary keys are implemented as a unique index on a non-nullable column (or a group of non-nullable columns) which means they might help with query performance (as with all indexes, they can boost select performance and somewhat degrade insert/update/delete performance) - but that is not what they are used for - it's merely a side-effect caused by how they are implemented by the database.

So, to answer your question - You might expect some performance gain in select statements when adding primary keys, but much, much more important - you gain data integrity by adding keys (both primary and foreign) to your database.

Zohar Peled
  • 79,642
  • 10
  • 69
  • 121
  • This makes a lot of sense! thank you and I'll take this with me :) – Mark Mazur Lussenburg Oct 31 '19 at 10:05
  • @ZoharPeled . . . Primary keys are the default clustering key for tables in SQL Server. That means that they *do* have a performance impact. The first sentence is wrong. – Gordon Linoff Oct 31 '19 at 11:28
  • @GordonLinoff Yes, they are default clustering key, and they do have a performance impact - but that's not their main goal, it's actually a side effect, which is the main point of my answer here. – Zohar Peled Oct 31 '19 at 11:31
1

Would there be any performance increases, in implementing proper constraining?

Yes, there likely will be.

Imagine you want to insert a "child" row which cannot exist without the corresponding "parent" row. How would you check for it in a stored procedure?

Well, naively, you would:

IF EXISTS (SELECT * FROM PARENT WHERE ...)
    INSERT INTO CHILD VALUES (...);
ELSE
    THROW ...;

But of course, a concurrent transaction might delete the parent row between IF and INSERT, so you'd need to lock the parent row:

...SELECT * FROM PARENT WITH(UPDLOCK)...

But now that lock is held to the end of the transaction, blocking anyone wishing to modify parent. Or SQL Server might decide to escalate to table lock, at which point your concurrency goes down the drain...

Letting the database enforce FKs will likely allow for more concurrency and better performance.


A SELECT query can also benefit from declarative foreign keys.

If you have an INNER JOIN, but select columns from only the child table, the query optimizer may skip accessing the parent table completely - it already knows that if the child row exists the parent row must exist too, so it doesn't need to check for parent existence explicitly.

Simply omitting the parent from the query is easy enough if you have just one query, but may not be very practical if you have layers of views and inline table-valued functions. In that case you'd like to reuse the existing code without having to modify it just to cull the "extra" processing that you don't need, so you can let the query optimizer cull it for you.

Branko Dimitrijevic
  • 50,809
  • 10
  • 93
  • 167