5

Is there a way to generate every time a 100% new GUID without any chance to collide within entire application?

Since I cannot answer my question in eight hours, I come up with the solution:

internal static class GuidGenerator
{
    private static readonly HashSet<Guid> _guids = new HashSet<Guid>();

    internal static Guid GetOne()
    {
        Guid result;

        lock (_guids)
            while (!_guids.Add(result = Guid.NewGuid())) ;

        return result;
    }
    internal static void Utilize(Guid guid)
    {
        lock (_guids)
            _guids.Remove(guid);
    }
}

Is this code solves the problem within the app?

EDIT: Uh, its getting complicated. Thread safety kills the speed.

AgentFire
  • 8,944
  • 8
  • 43
  • 90
  • 7
    You'll have to use an infinite length GUID. – Nasreddine Nov 02 '11 at 13:34
  • http://stackoverflow.com/questions/39771/is-a-guid-unique-100-of-the-time – Sangram Nandkhile Nov 02 '11 at 13:37
  • 6
    @AgentFire: Arent GUID's suppose to be globally unique? For entire app, I believe `.NewGuid()` is enough. Unless you are talking about unique id's across the **universes**? – KMån Nov 02 '11 at 13:37
  • The short answer is no. Mind you - getting a duplicate GUID is extremely unlikely event. I can't stress how extremely unlikely it is. Having said that - I've experienced a duplicate GUID generation when using cloned VMs, but that's another can-of-worms altogether. Is there a reason why Guid.NewGuid() isn't enough? – Goran Nov 02 '11 at 13:38
  • Alright, so my generator works on 100% for the application, is that right? – AgentFire Nov 02 '11 at 13:39
  • Yes, there is a reason. They CAN collide and it can cause the entire server application fail. That's why i have created the generator and asking if it is okay. – AgentFire Nov 02 '11 at 13:39
  • 1
    @AgentFire - There is nothing like 100%. `99,99999999999999999` percents:P – Petar Minchev Nov 02 '11 at 13:40
  • 1
    Assuming your hardware works perfectly. A random GUID collision is much more unlikely than faulty hardware in practice. – CodesInChaos Nov 02 '11 at 13:40
  • 15
    @AgentFire This is the code equivalent of staying indoors on a sunny, cloudless day in order to avoid being struck by lightning. – Daniel Pratt Nov 02 '11 at 13:42
  • **Unlikely** does not satisfy me. But 100% do. So i ask again, is the code is right? Or there is some pitfalls in memory/other things? – AgentFire Nov 02 '11 at 13:43
  • 1
    @AgentFire you are trying to solve year 10000 problem :) when year will be of 5 digits (year 10k problem) – Surjit Samra Nov 02 '11 at 13:46
  • 3
    @AgentFire There will never be a 100% guarantee that you will not encounter a GUID collision so long as you use a finite-length GUID. What Daniel and CodeInChaos are saying is that the chance of a collision actually occurring in practice is so miniscule that your application is much, much more likely to fail for other reasons, such as a failure of the hardware running it, than because of a GUID collision. – Jared Ng Nov 02 '11 at 13:47
  • 1
    This is like somebody playing russian roulette while smoking a cigarette and then worrying about lung cancer. – CodesInChaos Nov 02 '11 at 13:47
  • @AgentFire I've seen a large scale app depend soley on Guid.NewGuid() with no problems at all. I think there are `340,282,366,920,938,500,000,000,000,000,000,000,000` (approximate) possible Guids. This is a 32 character hex based Guid. – Brandon Buck Nov 02 '11 at 13:48
  • 1
    I got it.. So am i right: right now i have to let it go (i mean, like adding some GUIDs in some Lists/Hashsets in my program and intentionally not to worry about a collision, which can and will result in a fault? Is that are u telling me? – AgentFire Nov 02 '11 at 13:50
  • @AgentFire Well, use your method, benchmark the number of times you have to loop through to get a new Guid, run it for a while. See if it's worth it yourself. – Brandon Buck Nov 02 '11 at 13:54
  • Yeah it will take 1 operation.. But im still confused. Allowing a chance.. :/ – AgentFire Nov 02 '11 at 13:57
  • Your new thread safe code is utterly broken. And I doubt the CPU cost of locking is significant int practice. You will likely run out of memory before the CPU cost matters. – CodesInChaos Nov 02 '11 at 14:00
  • Your code does not defend against Maxwell's Demon :) – Goran Nov 02 '11 at 14:09
  • @CodeInChaos I'm not gonna run out of memory since i unregister each `System.Guid` back from the memory every time i dont need it. – AgentFire Nov 03 '11 at 05:24
  • 3
    Why don't you just use an integer counter that you increment? – CodesInChaos Nov 03 '11 at 11:59

11 Answers11

26

No, there isn't any way to generate absolutely unique GUIDs. There are only 3.40282367 × 1038 possible GUIDs so as galaxies collide so will these identifiers. Even for a single application, it depends on how many GUIDs the application has. Unless your app is bigger than all of Google's indexers combined, you don't need to lose sleep over this. Just use Guid.NewGuid().

Mark Cidade
  • 98,437
  • 31
  • 224
  • 236
17

Sure. A GUID is just a 128-bit value. So use a 128-bit integer (e.g. represented by two ulong values) and increment it. When you've reached the maximum value for the 128-bit integer type, you've generated all possible GUIDs. For example:

public IEnumerable<Guid> GetAllGuids()
{
    unchecked
    {
        byte[] buffer = new byte[16];
        ulong x = 0UL;
        do
        {
           byte[] high = BitConverter.GetBytes(x);
           Array.Copy(high, 0, buffer, 0, 8);
           ulong y = 0UL;
           do
           {
               y++;
               byte[] low = BitConverter.GetBytes(y);
               Array.Copy(low, 0, buffer, 8, 8);
               yield return new Guid(buffer);
           } while (y != 0UL);
           x++;
        } while (x != 0UL);
    }
}

Notes:

  • This is definitely not as efficient as it might be.
  • Iterating over all possible ulong values is a pain - I don't like using do...while...
  • As noted in comments, this will produce values which are not valid UUIDs

Of course, this is in no way random...

In practice, as others have mentioned, the chances of collisions from Guid.NewGuid are incredibly small.

Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194
  • So then you write an app to generate all of the 128 bit values and put them in a database, and then `select top 1 guid from AllGuids order by newid()` Genius! – asawyer Nov 02 '11 at 13:40
  • 2
    @Sangram: I'm not sure what point you're trying to make. The chances of collision are small but non-zero using normal approaches. My code generates every possible GUID, without any repetition. – Jon Skeet Nov 02 '11 at 13:40
  • +1 for yield return new Guid(buffer); and there is typo "} while (y != 0UL;" – Surjit Samra Nov 02 '11 at 13:57
  • >I don't like using `do...while` ... Make it just `while { }`. – AgentFire May 18 '12 at 05:47
  • @AgentFire: Obviously I would have done if it were simple to do so - but it's not. How would you write the while loop to cover all ulong values? – Jon Skeet May 18 '12 at 05:53
  • Like that `while (++y != 0UL) { ... .GetBytes(y - 1) ... }` – AgentFire May 18 '12 at 07:51
  • 1
    @AgentFire: That's far worse than using the do/while version, IMO. – Jon Skeet May 18 '12 at 07:51
  • Actually, for that code u've provided the `while (++y != 0UL) { ... .GetBytes(y) ... }` is quite enough but I suspect u wanted to place the incrementor to the end of the loop. – AgentFire May 18 '12 at 08:00
  • 1
    @AgentFire: Yes, I preferred that way. I personally don't like using the prefix/postfix increment within a larger expression - the do/while is clearer to me than that. – Jon Skeet May 18 '12 at 08:12
  • @AgentFire: Except that would last forever, instead of stopping after it exhausts all 64 bit values. I'm happy enough with it at the moment... – Jon Skeet May 18 '12 at 08:27
  • 1
    @AgentFire: No, I don't, actually. Try writing the full thing, and you may see why I didn't use it. Hint: if you use `y < ulong.MaxValue` you'll never see `y == ulong.MaxValue`... (There's never a value of `y` at the beginning of the loop which means you should terminate.) – Jon Skeet May 18 '12 at 08:38
  • 1
    There is a bug in the code. The second Array.Copy should be Array.Copy(low, 0, buffer, 8, 8); As written the code will just keep generating a 0 GUID. – Craig W. Apr 05 '13 at 21:51
  • 1
    @Medinoc: Are those validated anywhere? The result may not be a valid `Guid` according to various definitions, but I don't *think* .NET itself will complain. I've added a note about this. – Jon Skeet Oct 30 '13 at 13:57
7

Not 100%. But if your GUID generator works well, the collision probability is very very small. This can practically count as 0.

A randomly generated (kind 4) guid has about 120 random bits. From the birthday problem you can see that collisions get likely once you generate about 2^60 or 10^18 GUIDs, which is a damn lot.

So simply using Guid.NewGuid() should be good enough.


Your proposed solution isn't a good idea IMO:

  • It can take a lot of memory if you have a lot of GUIDs
  • Since you need to know all GUIDs locally, there is no reason to use a GUID in the first place. A simple integer counter would do the job just as well.
  • Random GUID collisions are less likely than faulty hardware corrupting your data structure.

Your code itself looks correct to me. i.e. if you register all GUIDs and your hardware works perfectly, and the software has no other bugs you are guaranteed no collisions.

And of course it's not threadsafe either, which is unexpected for a static method.

CodesInChaos
  • 106,488
  • 23
  • 218
  • 262
4

If you use a finite number of characters, then according to the Pigeonhole( also called Dirichlet) principle there is always a chance you will receive a collision.

Petar Minchev
  • 46,889
  • 11
  • 103
  • 119
3

If you need unique GUID in your context, just start with 00000000-0000-0000-0000-000000000000 and use incrementation. All generaged GUIDs will be unique unless you reach FFFFFFFF-FFFF-FFFF-FFFF-FFFFFFFFFFFF

Tom Kris
  • 1,227
  • 1
  • 8
  • 15
3
var newGuid = Guid.NewGuid();

http://msdn.microsoft.com/en-us/library/system.guid.newguid.aspx

Edit - I agree with what @David Heffernan says. You can use the mechanism in place for generating the best unique identifier, but there are very few things in this universe that you can count on 100%.

asawyer
  • 17,642
  • 8
  • 59
  • 87
2

Storing current GUIDs is impractical for anything but a small number of GUIDs - which would have an incredibly low chance of collision anyway.

In a real-world scenario where you're generating probably millions or even billions of GUIDS on a regular basis, the overhead for storing 128 bit values to ensure unique GUIDs becomes impractical. (16 bytes per GUID)

For a mere 10,000,000,000 GUIDs, you need 160,000,000,000 bytes = 156,250,000 Kbytes = 152,588 Mbytes = 149 Gbytes

Lookup times in a large table will also make generating new unique GUIDs slow to a virtual crawl (in CPU time scale), particularly as new GUIDs collide with existing values causing the generation of new GUIDs - which then need to be checked, etc.

Generating a 'random' 128 bit value, or even using something like (current time * processor clock) will probably be 'close enough' - only 45 bits are enough to store millisecond counts for 1,000 years. 128 bits gives you 9,671,406,556,917,033,397,649,408 times that many values.

Regardless of what you do, collisions WILL occur with 128 bit values, even if using a counter.

2

It depends what you want. If you want uniqueness amongst GUIDs that you generate, that can be achieved. Just maintain a list of GUIDs and whenever you need to create a new one, do this is a loop until you find one that is not in your list.

If you want some sort of global uniqueness, whereby global means out of all GUIDs in use across the entire planet, then that can never be achieved.

David Heffernan
  • 601,492
  • 42
  • 1,072
  • 1,490
2

You can use Guid.NewGuid(). It will generate the GUID for you, and I don't believe you'll have confliction with another GUID.

1

Guid.NewGuid() is the least likely way to generate a GUID that won't collide with another. There is no way to be 100% sure unless you generate a GUID and look at existing GUIDs to make sure they do not exist.

KallDrexx
  • 27,229
  • 33
  • 143
  • 254
0

GUID http://msdn.microsoft.com/en-us/library/system.guid.newguid.aspx

Edit - I agree with what @David Heffernan says. You can use the mechanism in place for generating the best unique identifier, but there are very few things in this universe that you can count on 100%.

  • Half of your answer is just a link (without any explanation), another half is your opinion about other answer. – Pochmurnik Sep 11 '19 at 06:35