2

I am currently examining the options available to me for re-designing a database schema. Unfortunately the person that originally designed the database used GUID datatype for the primary keys and the clustered indexes (I'm well aware of the problems that this causes and it is now evident in the performance of the database).

Having read several articles, my intention is to use auto incrementing integer identity columns in each table - these will be used to form the clustered index.

My question is, are there any dangers in using this column as the primary key?

Currently I've carried out tests in which I've used a combination of identity columns for the clustered index and GUIDs for the primary key and there has been a significant performance improvement. However, my understanding is that there will be even more performance improvements if I use the smaller identity column as my primary key too. My main concern is maintaining the integrity of identity values in relation to mirroring, etc - but perhaps I'm worrying about nothing.

We plan to eventually host the database on Azure

Thanks

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
  • http://stackoverflow.com/questions/11938044/what-are-the-best-practices-for-using-a-guid-as-a-primary-key-specifically-rega – Haytem BrB Jan 25 '16 at 13:04

2 Answers2

1

Use identity integer primary keys as they are easier to index, faster to scan, and use less index storage.

For your mirroring concern, keep an integration key column that is a GUID type - just don't make it the primary key column.

Jordan Parmer
  • 36,042
  • 30
  • 97
  • 119
  • Thanks for your reply Jordan. Should I have any concerns at all about mirroring? Thanks. – Gary Neilson Jan 25 '16 at 14:14
  • It depends on how you mirror and why. If you do a disk copy or use a tool that maintains foreign keys then you are fine. If you are rebuilding the database by migrating the data over via inserts, the identity keys will change (start over) so you will lose foreign key integrity. For instance, let's say you have an application that works in a local, disconnected database. You then want to sync records to a remote database when it becomes online. Having an integration key can help since the keys don't rehydrate. Otherwise, you don't need to worry about it. – Jordan Parmer Jan 26 '16 at 23:03
  • From what I understand the mirroring will be configured within Azure so, based on what you've explained I shouldn't have anything to worry about. Thanks for your help. Much appreciated. – Gary Neilson Jan 27 '16 at 18:45
0

Go with integer IDENTITY PKs. The only problem is if you don't create appropriate FKs, as then you'll probably end up with 'orphan' rows in related tables.

Also remember in tables that are going to get a lot of rows added (e.g. Log tables) even if the number of rows at any one time is small (due to deletions for whatever treason) the IDENTITY will keep increasing. In busy tables this can eventually reach the limits of int, and a fuss will arise. It should be possible to fore-see this coming during design, and if that happens then use bigint (64 bit) instead of int (32 bit). I've never seen a performance drop when I've used bigint. Mind you, it has to be a lot of rows to go over 21 billion.

simon at rcl
  • 7,326
  • 1
  • 17
  • 24