6

I'm trying to shift my design sensibilities from the LAMP stack to the Microsoft stack, and I just thought of something - when would I want to use a GUID? What benefits/drawbacks does it have compared to the old, reliable auto-incremented int?

Major Productions
  • 5,914
  • 13
  • 70
  • 149

4 Answers4

14

Your experience never having needed anything beyond autoincremental ids might suggest that GUIDs are often a solution in search of a problem. Use them when and if you ever run into a requirement where your familiar pattern doesn't work. The microsoftiness is irrelevant.

The only realistic scenario I've seen is merging tables from two sources.

dkretz
  • 37,399
  • 13
  • 80
  • 138
4

"old, reliable auto-incremented int" depends rather strongly on just how scalable your database needs to be. Auto incrementing stops working in the trivial case when you have a setup with at least two masters. It's not too difficult to work around that, of course, because it's such a common problem; Different database engines may coordinate the sequence between the masters, for instance only one master may allocate from any given sequence.

When you get into sharding the data, it's normally desirable to know the shard from the key. An auto incremented id doesn't contain information about which shard should host that record.

GUID's solve the problem in a different way; two distinct masters have distinct host identifiers (typically a MAC address). Since that is used in computing a new GUID, distinct masters cannot create guid's that collide. Furthermore, since the host is part of the ID, It can be used to directly identify the shard that holds the record.

A third option is to use no surrogate keys at all (neither auto increment integers nor guids).

SingleNegationElimination
  • 151,563
  • 33
  • 264
  • 304
  • It's quite usual to see multiple GUIDs from the same host; or devices without a MAC address. – dkretz Mar 31 '11 at 02:39
  • Just out of curiosity, how often do you see duplicates? – Dave Mar 31 '11 at 03:03
  • That answer went mostly over my head. I haven't had to work with anything more complex than 1 db at a time. Still, +1 for providing me with some terms to research. ;-) – Major Productions Mar 31 '11 at 11:33
  • 4
    Here is and example of what he is saying. You have two databases in two different datacenters each creating records. If you try and merge those, then you have to ensure that the primary keys are different. One way is to set auto-increment to be odd in one and even in the other. Another is to use GUIDs ... since they supposedly never collide., – Dave Apr 10 '11 at 03:08
3

One problem with GUIDs:

Because they are not sequential, the database will have to work hard to update indexes. With a sequential id, it can usually just append it to the end (more or less). Since GUIDs are random it has to fit it into an existing block. That said, we use GUIDs for some tables and they seem to work fine even under fairly heavy load.

Dave
  • 13,518
  • 7
  • 42
  • 51
1

I would recommend to use int ID instead of Guid for the following reasons:

  • Int ID has a size of 4 bytes (32 bits) however Guid has size of 16 bytes (128 bits) which 4 times larger. in this case you might have performance problems and storage implaication
  • By default GUIDs are not sequential (be carefull)
  • It's easy to visualize links and relationships between tables