14

Possible Duplicate:
Is a GUID unique 100% of the time?

After reading all the posts on GUIDs, I am still unclear about one simple thing:

Do GUIDs generated on different machines also remain unique?

I have read about GUID uniqueness on a single machine, but I still don't know whether it holds across different machines.

Rohit Garg
  • But how do two systems manage to create different GUIDs? How does a GUID generator work? How does it create a unique GUID every time? – Rohit Garg Dec 28 '12 at 18:54
  • They are random. There are so many random combinations that we simply "trust" that it will be unique. You won't live long enough to see a dupe - unless you're using sequential guids :). – Paul Fleming Dec 28 '12 at 18:57
  • 2
    Take a random number between 1 and 100. Do it again. The odds of you getting the same number the second time are 1 in 100. Now take a random number between 1 and 5,316,911,983,139,663,491,615,228,241,121,400,000. That number doesn't have a name. – Paul Fleming Dec 28 '12 at 18:59
  • @flem - Or a flawed RND generator feeding the GUID. It also depends on the GUID type according to the UUID spec. – OnoSendai Dec 28 '12 at 18:59

3 Answers

15

It is generally accepted that a new random GUID will ALWAYS be unique. Probabilistically this is not true, but the likelihood of generating a dupe is so small we don't need to care about it.

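The odds of any two given random guids being identical are 1 in 5,316,911,983,139,663,491,615,228,241,121,400,000.

With many guids, every pair is a chance to collide (the birthday problem), so the odds of at least one duplicate grow roughly with the square of the count. If you generate 1 million guids on each of 1 million computers (10^12 in total), the odds of a duplicate anywhere among them are roughly 1 in 10 trillion.

Take 1 billion guids on each of 1 billion computers (10^18 in total) and the odds rise to roughly 9%, but that is far more guids than any real system will ever hold.

At realistic volumes the numbers speak for themselves: you're not going to generate a dupe.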

In case you're wondering where I'm getting these numbers, the value part of a random GUID is 122 bits. 2^122 is 5.3169119831396634916152282411214 x 10^36.

Some more crazy figures...
If you generate 1 million guids per second, it would take about 168,486,464,147,580,370,470,736 years to run through all 2^122 values, after which a duplicate is unavoidable.

@viggity mentioned that some guids have 48 bits taken by a MAC address. The numbers are still staggering, which is why those bits can be spared. Taking the above example of 1 million guids per second (on the same computer), it would still take about 598,584,166 years to exhaust the remaining 74 bits. That's 600 million years. That's about as long as complex animal life has existed on Earth. Or if you're a Young Earth Creationist, that's at least 60 thousand times the age of Earth.
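As a rough check of the year counts above, here is a minimal Python sketch. It assumes 122 usable bits for a random guid, 74 bits once 48 go to a MAC address, a rate of 1 million guids per second, and a 365.2425-day year. The function and constant names are illustrative only, and the results match the quoted figures approximately rather than digit for digit, since they depend on the year length chosen.

```python
# Assumptions (not from any GUID spec): 1 million guids per second and a
# 365.2425-day year. 122 bits = random guid; 74 bits = 122 minus a 48-bit MAC.
SECONDS_PER_YEAR = 31_556_952
RATE_PER_SECOND = 1_000_000


def years_to_exhaust(bits: int) -> int:
    """Whole years needed to generate 2**bits guids at RATE_PER_SECOND."""
    return 2**bits // (RATE_PER_SECOND * SECONDS_PER_YEAR)


print(f"2^122 = {2**122:,}")
print(f"122 bits: {years_to_exhaust(122):,} years")
print(f" 74 bits: {years_to_exhaust(74):,} years")
```

Integer arithmetic is used throughout so that no precision is lost on the 37-digit value of 2^122.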

Paul Fleming
  • 2
    Just to add to the discussion, a quote from the Wikipedia entry about GUID: 'Some flawed GUID-generating implementations rely on pseudorandom number generators that use random number seed sources that turn out to be predictable'. So mathematically there's 2^122 possible values (6 bits reserved to indicate a random GUID), but for practical means it can be limited by the RND function. – OnoSendai Dec 28 '12 at 18:55
  • 3
    +1 for blowing my mind with probabilities – JOpuckman Dec 28 '12 at 19:02
  • 2
    Totally off-topic, a creationist would probably use 14 bit GUIDS, possibly defying a 122 bit GUID, claiming it represents a number that is greater than the age of the Earth and thus isn't real. – Wim Ombelets Dec 28 '12 at 19:29
  • @Wimbo - controversial! :) – Paul Fleming Dec 28 '12 at 19:36
  • 4
    -1 for wrong probabilities and completely ignoring 4/5 UUID generation schemes. The odds of a duplicate are actually significantly higher than your figures, due to the birthday problem. Now, with 122 random bits, a billion still isn't enough to make collisions likely. But depending on the type you generate, how fast you generate them, and how many distinct machines you use, the odds are 1 due to the pigeonhole principle. Some UUIDs include time stamps, MAC addresses, "domain IDs", and other values which are decidedly not unique for an extended period of time or for a given computer. –  Dec 28 '12 at 20:13
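
The birthday-problem point raised in the last comment can be checked numerically. The sketch below is an approximation, not a statement about any particular GUID library: it assumes all 122 bits are uniformly random (an ideal version 4 generator) and uses the standard estimate p ≈ 1 - exp(-n(n-1)/(2N)). The function names are made up for illustration.

```python
import math


def collision_probability(n: int, bits: int = 122) -> float:
    """Approximate chance of at least one duplicate among n values drawn
    uniformly at random from 2**bits possibilities (birthday estimate)."""
    space = 2**bits
    # p ~= 1 - exp(-n(n-1) / (2 * space)); expm1 keeps precision for tiny p.
    return -math.expm1(-n * (n - 1) / (2 * space))


def count_for_probability(p: float, bits: int = 122) -> float:
    """Approximate number of values needed to reach collision probability p."""
    return math.sqrt(2 * 2**bits * math.log(1 / (1 - p)))


for n in (10**9, 10**12, 10**18):
    print(f"{n:.0e} guids: p ~= {collision_probability(n):.3g}")
print(f"50% chance after ~{count_for_probability(0.5):.3g} guids")
```

The estimate only covers ideal random guids. Timestamp-based or MAC-based versions, or a weak random number generator, need a separate analysis.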
5

GUIDs are "practically" universally unique.

A GUID is a 128-bit integer (16 bytes) that can be used across all computers and networks wherever a unique identifier is required. Such an identifier has a very low probability of being duplicated.

From the MSDN

Tilak
  • "universal" implies "different machines", since all machines are part of the "universe" – John Saunders Dec 28 '12 at 18:50
  • But how do two systems manage to create different GUIDs? How does a GUID generator work? How does it create a unique GUID every time? – Rohit Garg Dec 28 '12 at 18:57
  • Read [Guid Generation Algorithm](http://en.wikipedia.org/wiki/Globally_unique_identifier#Algorithm) on wikipedia. – Tilak Dec 28 '12 at 19:01
  • DO NOT TRUST THESE NUMBERS. I have tested GUID v4 randoms in JS, PHP, Ruby, and several distributed DBMS and found many collisions within a few thousand - few million tests. Always test for uniqueness. See also: http://usecuid.org – Eric Elliott Jan 05 '15 at 19:28
2

I remember hearing somewhere that if you visualized the IPv4 address space (32 bits) as being the size of a postage stamp, IPv6 (128 bits) would be the size of our solar system. Generating a dupe is just not going to happen.

Also, if anything, you're more likely to get a duplicate on the same machine than on two different machines, because most Guid generation algorithms will embed your computer's NIC MAC address within the guid (it's 48 bits). There are also algorithms that don't embed the MAC address and are purely random. See: http://guid.us/
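
To make the two families concrete, here is a small sketch using Python's standard `uuid` module (chosen for illustration; the thread itself is not about Python). `uuid1()` builds a time-based guid with a 48-bit node field, and `uuid4()` builds a purely random one. The node value `0x001122334455` is a made-up placeholder, not a real MAC address.

```python
import uuid

# Version 1: timestamp + clock sequence + 48-bit node. The node is usually
# the MAC address; Python falls back to a random 48-bit number if none is found.
u1 = uuid.uuid1()
# Version 4: 122 random bits plus the fixed version/variant bits.
u4 = uuid.uuid4()

print(u1, "version", u1.version, "node", f"{u1.node:012x}")
print(u4, "version", u4.version)

# Two version 1 guids built with the same (hypothetical) node value share
# their last 12 hex digits and differ only in the time/clock-sequence part.
a = uuid.uuid1(node=0x001122334455)
b = uuid.uuid1(node=0x001122334455)
print(str(a)[-12:], str(b)[-12:], a != b)
```

With a fixed node, uniqueness on one machine rests entirely on the timestamp and clock sequence, which is why same-machine collisions are the case to watch for in this scheme.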

Edit: another fun scale example. The volume of the earth is roughly 10^27 cubic centimeters, meaning that every cubic centimeter of the ENTIRE VOLUME of the earth could have 340,000,000,000 guids all to itself. This number is mind-bogglingly big.

Alternatively, every square NANOmeter of the surface of the earth could have roughly 650,000 guids all to itself.
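
The two scale figures above can be sanity-checked with a few lines of Python. This is back-of-the-envelope arithmetic: it counts all 2^128 bit patterns, takes the 10^27 cubic centimeter volume as given, and assumes a spherical earth with a mean radius of 6,371 km for the surface area.

```python
import math

TOTAL_GUIDS = 2**128        # every 128-bit pattern
EARTH_VOLUME_CM3 = 1e27     # rough figure quoted above
EARTH_RADIUS_M = 6.371e6    # assumed mean radius, in meters

# Surface of a sphere in square meters, then converted to square nanometers
# (1 m^2 = 1e18 nm^2).
surface_nm2 = 4 * math.pi * EARTH_RADIUS_M**2 * 1e18

per_cm3 = TOTAL_GUIDS / EARTH_VOLUME_CM3
per_nm2 = TOTAL_GUIDS / surface_nm2

print(f"guids per cubic centimeter: {per_cm3:.3g}")
print(f"guids per square nanometer: {per_nm2:.3g}")
```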

viggity
  • > most Guid generation algorithms will embed your computers NIC MAC address within the guid (it's 48 bits) Wow, that's silly. Ripping out 48 bits for that? Is there any reason to do that? – vbullinger Dec 28 '12 at 19:01
  • http://guid.us/Test/GUID shows CCCCCCCC-CCCC-CCCC-CCCC-CCCCCCCCCCCC is a valid guid – Rohit Garg Dec 28 '12 at 19:02
  • 3
    @vbullinger Yes, there is a reason: it's so that two GUIDs wouldn't collide even if the remaining portion were identical (that portion used a timestamp, so two machines might try to create a GUID at the same time). Such GUIDs have fallen out of practice though; these days it's much more common to see GUIDs that are entirely based on one big random number. – Servy Dec 28 '12 at 19:11
  • 2
    MAC addresses fell out of favor because GUIDs generated in a VM tend NOT to be unique. Generate just 1000 GUIDs a second in a VM, and using the MAC address GUID generation version, you will get the same GUID generated. – StarPilot Jun 09 '15 at 22:04