6

I have a user_info table in five different locations. Now I need to integrate all the rows into a central user_info table. To do this I need a unique id for each row in source tables. Because when all the rows come into the central table, then each user must have a unique ID.

Now my questions are:

  1. if I use uniqueidentifier NEWID() for each source table then is it will be unique for globally & life time or have any chance to be duplicate?

  2. How does SQL Server create the NEWID()? I need to know the key generation structure.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
riad
  • 7,144
  • 22
  • 59
  • 70

1 Answers1

8

Yes, there is no chance of a duplicate between machines.

NEWID() is based on a combination of a pseudo random number (from the clock) and the MAC address of the primary NIC.

However, inserting random numbers like this as the clustered key on a table is terrible for performance. You should consider either NEWSEQUENTIALID() or a COMB-type function for generating GUIDs that still offer the collision-avoidance benefits of NEWID() while still maintaining acceptable INSERT performance.

Mykola
  • 3,343
  • 6
  • 23
  • 39
richardtallent
  • 34,724
  • 14
  • 83
  • 123
  • 1
    Any documentation about what windows API call SQL Server uses for `NEWID`? I know that `NEWSEQUENTIALID` uses the MAC address as this is [clearly visible in the GUID](http://stackoverflow.com/a/3398847/73226) but couldn't see any documentation for NEWID. – Martin Smith Sep 23 '12 at 11:29
  • `CoCreateGuid` most likely? Eric Lippert had a [blog post series][1] about GUIDs in general. See also: http://stackoverflow.com/questions/39771/is-a-guid-unique-100-of-the-time For `NEWID()` the first hex digit of the third section is always 4 (when I try it), for `NEWSEQUENTIALID()` it uses the MAC address mentioned above. I can't remember what happens when there is no MAC address in the computer - anyone know what happens in SQL in this case? [1]: http://blogs.msdn.com/b/ericlippert/archive/2012/05/07/guid-guide-part-three.aspx – Seph Sep 23 '12 at 12:24
  • How accurate time does newid() use? It sounds like if I fetch two ids at the same time I should get a duplicate, yet I've never seen it happening. – jumxozizi Sep 19 '16 at 10:09
  • @tubelius, the newid() value is based on a pseudorandom number generated *from* the clock and other factors. It is not the clock value itself, so the chances of duplication are so infinitesimal that it isn't even worth worrying about. If you want the timestamp to be *explicitly* built into the value (to make it sort roughly sequentially without the disadvantages of newsequentiaid()), you'll need a COMB library. I have an open-source one (RT.Comb) on Github / Nuget. – richardtallent Sep 20 '16 at 04:48
  • @richardtallent No, you should mention that there is a negligible change. Saying there is no change means that there is zero probability, which is obvious not! – kelalaka Sep 17 '19 at 20:56