12

What would be the easiest way to code a function in .NET to generate a GUID based on a seed so that I can have greater confidence about its uniqueness?

string GenerateSeededGuid(int seed) { /* code here */ }

Ideally, the seed would come from CryptGenRandom which describes its random number generation as follows:

The data produced by this function is cryptographically random. It is far more random than the data generated by the typical random number generator such as the one shipped with your C compiler.

This function is often used to generate random initialization vectors and salt values.

Software random number generators work in fundamentally the same way. They start with a random number, known as the seed, and then use an algorithm to generate a pseudo-random sequence of bits based on it. The most difficult part of this process is to get a seed that is truly random. This is usually based on user input latency, or the jitter from one or more hardware components.

With Microsoft CSPs, CryptGenRandom uses the same random number generator used by other security components. This allows numerous processes to contribute to a system-wide seed. CryptoAPI stores an intermediate random seed with every user. To form the seed for the random number generator, a calling application supplies bits it might have—for instance, mouse or keyboard timing input—that are then combined with both the stored seed and various system data and user data such as the process ID and thread ID, the system clock, the system time, the system counter, memory status, free disk clusters, the hashed user environment block. This result is used to seed the pseudorandom number generator (PRNG). [...] If an application has access to a good random source, it can fill the pbBuffer buffer with some random data before calling CryptGenRandom. The CSP then uses this data to further randomize its internal seed. It is acceptable to omit the step of initializing the pbBuffer buffer before calling CryptGenRandom.

danijar
CJ7
  • 2
    Generate a random number. Convert to a GUID .. but I *thought* NewGuid created an UUIDv4 (read: *random*) already. See http://en.wikipedia.org/wiki/Globally_unique_identifier (note the bit set talked about in the Algorithm section) to verify what NewGuid returns. –  Nov 02 '12 at 02:19
  • Side note: if all bits are random it may not be completely valid GUID/UUID as 8 bits represent version/variant: see - http://en.wikipedia.org/wiki/UUID – Alexei Levenkov Nov 02 '12 at 02:24
  • 2
    This is same question you asked earlier today and deleted it. You confuse random with unique. The purpose of GUID is to be unique not random - WAY DIFFERENT. GUID is not even in the System.Security.Cryptography namespace. GUID is based on MAC for uniqueness. If you want to seed then use a proper algorithm in the cryptography namespace. – paparazzo Nov 02 '12 at 02:24
  • @pst The question states MAC, time. Regardless would you seriously use .NET GUID for cryptography? – paparazzo Nov 02 '12 at 02:30
  • @Blam: one of the entropy components for NewGuid is time, so how could that provide uniqueness if executed at the same time on the same machine? – CJ7 Nov 02 '12 at 02:32
  • @pst: according to this article it does use MAC/Time: http://blogs.msdn.com/b/oldnewthing/archive/2008/06/27/8659071.aspx – CJ7 Nov 02 '12 at 02:34
  • @pst: my question did not specify .NET version. As it happens, I am using .NET 2.0. – CJ7 Nov 02 '12 at 02:38
  • 1
    @CJ7 And just how are you going to execute NewGuid more than once at the SAME time? – paparazzo Nov 02 '12 at 02:39
  • @Blam: Two different processes/threads on the same machine. – CJ7 Nov 02 '12 at 02:41
  • @CJ7 PLEASE post your code where a single machine (or even multiple machines) generate duplicate NEWGUID. – paparazzo Nov 02 '12 at 02:45
  • @pst: question edited. If that Raymond Chen article is so patently false, why does MS keep it on its site and why is it so widely recommended, even here on SO? – CJ7 Nov 02 '12 at 02:52
  • @Blam: it's not a matter of whether I currently have evidence of that. Should I wait until it happens in a production environment? – CJ7 Nov 02 '12 at 02:53
  • @CJ7 If you don't trust NEWGUID to be unique then don't use it. You are the one that asserted it would not be unique. If you have evidence NEWGUID is not unique then present it (and really embarrass Microsoft). – paparazzo Nov 02 '12 at 02:59
  • 1
    @Blam: MS don't claim it is unique – CJ7 Nov 02 '12 at 03:04
  • @pst: please provide a ref for .NET 2.0 `NewGuid` using GUIDv4. – CJ7 Nov 02 '12 at 03:07
  • @CJ7 There isn't one AFAIK. Microsoft guarantees "a very low chance of collisions", which is all that a GUID generator can do - there is no "master" list - GUIDv4, which has been used since Windows 2000, is "random" (except for the version bits). While there is no contract that GUIDv4 is used (and in Windows 98/ME it was not the case!) I would feel safe to say that all future versions will follow a generator that is "at least as unique" as the GUIDv4 implementation. –  Nov 02 '12 at 03:08
  • @CJ: You may find [my GUID blog post](http://nitoprograms.blogspot.com/2010/11/few-words-on-guids.html) helpful. .NET has never used v1 GUIDs, but even if they did, v1 GUIDs cannot collide by being generated at the same time on the same machine, as clearly spelled out in RFC4122. I also have a [GUID decoder](http://stephencleary.com/#applications_GuidDecoder) on my home page that is fun to play with. – Stephen Cleary Nov 02 '12 at 04:12
  • @StephenCleary: in a nutshell, what is stopping a GUID collision on the same machine at the same time? – CJ7 Nov 02 '12 at 06:59
  • 1
    @CJ7: RFC4122 states that the v1 UTC "timestamp" is in 100-ns intervals. If two v1 GUIDs are generated within the same 100-ns window, then the algorithm will stall for up to 100 ns so it gets a different timestamp. This is all covered by RFC4122. – Stephen Cleary Nov 02 '12 at 11:15
  • @StephenCleary: does this apply if the GUID generation is being done in separate processes/threads? – CJ7 Nov 02 '12 at 12:50
  • @CJ7: Yes. The v1 GUID generation is a machine-wide algorithm. – Stephen Cleary Nov 02 '12 at 13:09
  • @CJ7 If statistically unique is not good enough for your application, then no random generator (with or without a seed) is going to guarantee absolute uniqueness. What is the business problem you are trying to solve? – paparazzo Nov 02 '12 at 13:58

4 Answers

22

tl;dr: use Guid.NewGuid instead of trying to invent another "more random" approach. (The only reason I can think of to create a UUIDvX from a seed is when a predictable, resettable sequence is desired. However, a GUID might not be the best approach for that either [2].)

By the very definition of a finite range (128 bits minus 6 version/variant bits, so 122 random bits for v4), there are only so many "unique" identifiers, albeit an astronomically large number of them.

Due to the Pigeonhole Principle there are only so many Pigeonholes. If Pigeons keep reproducing, eventually there will not be enough Holes for each Pigeon. Due to the Birthday Paradox, assuming complete randomness, two Pigeons will fight for the same Pigeonhole long before all the holes are filled. Because there is no Master Pigeonhole List [1], this cannot be prevented. Also, not all animals are Pigeons [3].

While there are no guarantees as to which GUID generator will be used, .NET uses the underlying OS call, which is a GUIDv4 (aka Random UUID) generator since Windows 2k. As far as I know - or care, really - this is as good a random as it gets for such a purpose. It has been well vetted for over a decade and has not been replaced.


From Wikipedia:

.. only after generating 1 billion UUIDs every second for the next 100 years, the probability of creating just one duplicate would be about 50%. The probability of one duplicate would be about 50% if every person on earth owns 600 million UUIDs.
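
Figures like the one quoted above come from the standard birthday-bound approximation. Here is a rough sketch of that calculation, assuming 122 random bits per v4 GUID and a perfectly uniform generator (an idealization). The class and method names are made up for illustration:

```csharp
using System;

public static class GuidCollisionEstimate
{
    // Birthday-bound approximation: p ≈ 1 - exp(-n^2 / (2N)), where n is the number
    // of GUIDs generated and N = 2^randomBits is the number of possible values.
    // Assumes every value is equally likely, which is an idealization.
    public static double CollisionProbability(double generated, int randomBits = 122)
    {
        double space = Math.Pow(2.0, randomBits);
        double x = (generated * generated) / (2.0 * space);
        // For tiny x, 1 - exp(-x) rounds to 0 in double precision; x itself is the better estimate.
        return x < 1e-8 ? x : 1.0 - Math.Exp(-x);
    }
}
```

For example, `GuidCollisionEstimate.CollisionProbability(1e9 * 60 * 60 * 24 * 365 * 100)` estimates the chance of at least one duplicate after a century of generating a billion GUIDs per second.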

[1] While there are still a finite set of Pigeonholes, UUIDv1 (aka MAC UUID) - assuming unique time-space - is guaranteed to generate deterministically unique numbers (with some "relatively small" theoretical maximum number of UUIDs generated per second on a given machine). Different broods of Pigeons living in different parallel dimensions - awesome!

[2] Twitter uses Snowflakes in parallel dimensions in its own distributed Unique-ID scheme.

[3] Rabbits like to live in Burrows, not Pigeonholes. The use of a GUID also acts as an implicit parallel partition. It is only when a duplicate GUID is used for the same purpose that collision-related problems can arise. Just think of how many duplicate auto-increment database primary keys there are!

Community
21

All you really need to do in your GenerateSeededGuid method is to create a 128-bit random number and then convert it to a Guid. Something like:

    public Guid GenerateSeededGuid(int seed)
    {
        // The same seed always yields the same byte sequence, and therefore the same Guid.
        var r = new Random(seed);
        var guid = new byte[16];
        r.NextBytes(guid);

        return new Guid(guid);
    }
Sani Huttunen
  • 12
    +1 For "from a seed", although 1) this can end up with invalid UUIDs (wrong version bits) and 2) **generates more collisions** as the seed is limited in range; GUIDs are *designed* to "have a very low collision chance", so there is no reason to believe that a custom one-off randomly generated approach would work better or be more random than the GUIDv4 found in Windows 2k+ –  Nov 02 '12 at 03:10
  • 2
    @pst: I agree. Personally I see no reason not to use `NewGuid`. – Sani Huttunen Nov 02 '12 at 03:25
  • 3
    Unit tests that do approvals of whole docs - those benefit from this technique. Thanks! – George Mauer Nov 27 '17 at 00:42
  • Guids are **not** guaranteed to be random, only unique. – Enigmativity Nov 15 '18 at 11:23
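
The version-bit concern raised in the comments above can be handled by overwriting two bytes before constructing the Guid. This is a minimal sketch, not the answerer's code: it assumes the byte layout used by the `Guid(byte[])` constructor (first three fields little-endian), and `SeededGuidSketch` and `SeededV4Guid` are hypothetical names. It does nothing about the second concern: an `int` seed still allows only 2^32 distinct results.

```csharp
using System;

public static class SeededGuidSketch
{
    // Produces a repeatable Guid for a given seed, shaped like a version 4 UUID.
    public static Guid SeededV4Guid(int seed)
    {
        var bytes = new byte[16];
        new Random(seed).NextBytes(bytes);

        // Guid(byte[]) reads the third field as a little-endian 16-bit value,
        // so its high nibble (the version) lives in bytes[7]. Force it to 4.
        bytes[7] = (byte)((bytes[7] & 0x0F) | 0x40);

        // The top two bits of bytes[8] carry the variant. Force them to binary 10.
        bytes[8] = (byte)((bytes[8] & 0x3F) | 0x80);

        return new Guid(bytes);
    }
}
```

Calling `SeededGuidSketch.SeededV4Guid(42)` twice in the same runtime returns the same value, which is the property that makes this useful in tests.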
2
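    public static Guid SeededGuid(int seed, Random random = null)
    {
        // Reuse the caller's generator if one is passed; otherwise start a new sequence from the seed.
        random ??= new Random(seed);
        // Random.Next's upper bound is exclusive, so 0x10000 yields the full 16-bit range.
        return Guid.Parse(string.Format("{0:X4}{1:X4}-{2:X4}-{3:X4}-{4:X4}-{5:X4}{6:X4}{7:X4}",
            random.Next(0, 0x10000), random.Next(0, 0x10000),
            random.Next(0, 0x10000),
            random.Next(0, 0x1000) | 0x4000,   // version nibble fixed to 4
            random.Next(0, 0x4000) | 0x8000,   // variant bits fixed to binary 10
            random.Next(0, 0x10000), random.Next(0, 0x10000), random.Next(0, 0x10000)));
    }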

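    // Example 1: a fresh Random is created from the seed on each call, so both calls return the same Guid
    SeededGuid("Test".GetHashCode());
    SeededGuid("Test".GetHashCode());

    // Example 2: the shared Random advances between calls, so each call returns the next Guid in a repeatable sequence
    var random = new Random("Test".GetHashCode());
    SeededGuid("Test".GetHashCode(), random);
    SeededGuid("Test".GetHashCode(), random);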

This method is based on the PHP UUID v4 implementation posted at https://www.php.net/manual/en/function.uniqid.php#94959

William Magno
0

This is a bit old, but there is no need for a random generator. This is useful for testing purposes, but not for general use.

    public static Guid GenerateSeededGuid<T>(T value)
    {
        byte[] bytes = new byte[16];
        BitConverter.GetBytes(value.GetHashCode()).CopyTo(bytes, 0);
        return new Guid(bytes);
    }
Calimero100582
  • GetHashCode does not always return the same value for the same input. It can for example change with restart of the application. – EKS Apr 24 '20 at 12:14
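
One way around the `GetHashCode` instability mentioned in the comment above is to hash the input with a fixed algorithm instead. This is a sketch under stated assumptions, not part of the original answer: it takes the first 16 bytes of a SHA-256 digest of the UTF-8 string, the names `StableGuidSketch` and `FromString` are hypothetical, and the result is not an RFC 4122 name-based (v3/v5) UUID because no namespace is mixed in and no version bits are set.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class StableGuidSketch
{
    // Derives a Guid from a string so that the same input gives the same
    // Guid across processes and application restarts.
    public static Guid FromString(string value)
    {
        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(value));

            // A Guid holds 16 bytes; keep the first half of the 32-byte digest.
            var bytes = new byte[16];
            Array.Copy(hash, bytes, 16);
            return new Guid(bytes);
        }
    }
}
```

As with the other seeded approaches here, this suits tests and fixtures that need repeatable identifiers rather than general-purpose ID generation.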