6

I'm looking at using a Guid as a random anonymous visitor identifier for a website (stored both as a cookie client-size, and in a db server-side), and I wanted a cryptographically strong way of generating Guids (so as to minimize the chance of collisions).

For the record, there are 16 bytes (or 128 bits) in a Guid.

This is what I have in mind:

/// <summary>
/// Generate a cryptographically strong Guid
/// </summary>
/// <returns>a random Guid</returns>
private Guid GenerateNewGuid()
{
    byte[] guidBytes = new byte[16]; // Guids are 16 bytes long
    RNGCryptoServiceProvider random = new RNGCryptoServiceProvider();
    random.GetBytes(guidBytes);
    return new Guid(guidBytes);
}

Is there a better way to do this?

Edit: This will be used for two purposes, a unique Id for a visitor, and a transaction Id for purchases (which will briefly be the token needed for viewing/updating sensitive information).

Christopher Stevenson
  • 2,843
  • 20
  • 25
  • 7
    If you want to do minimize the chance of collisions use a normal guid, it already has all the protections you need in place. Now having non predictable random id's, that is a different requirement with a different solution. So what is your *real* requirements, no collisions or unpredictability (with a higher chance of collisions over the same keyspace)? – Scott Chamberlain Dec 10 '13 at 23:39

5 Answers5

8

In answer to the OP's actual question whether this is cryptographically strong, the answer is yes since it is created directly from RNGCryptoServiceProvider. However the currently accepted answer provides a solution that is most definitely not cryptographically secure as per this SO answer:

Is Microsoft's GUID generator cryptographically secure.

Whether this is the correct approach architecturally due to theoretical lack of uniqueness (easily checked with a db lookup) is another concern.

Community
  • 1
  • 1
James Westgate
  • 11,306
  • 8
  • 61
  • 68
0

So, what you're building is not technically a GUID. A GUID is a Globally Unique Identifier. You're building a random string of 128 bits. I suggest, like the previous answerer, that you use the built-in GUID generation methods. This method has a (albeit tremendously small) chance of generating duplicate GUID's.

There are a few advantages to using the built-in functionality, including cross-machine uniqueness [partially due to the MAC Address being referenced in the guid, see here: http://en.wikipedia.org/wiki/Globally_Unique_Identifier.

Regardless of whether you use the built in methods, I suggest that you not expose the Purchase GUID to the customer. The standard method used by Microsoft code is to expose a Session GUID that identifies the customer and expires comparatively quickly. Cookies track customer username and saved passwords for session creation. Thus your 'short term purchase ID' is never actually passed to (or, more importantly, received from) the client and there is a more durable wall between your customers' personal information and the Interwebs at large.

Dylan Brams
  • 2,089
  • 1
  • 19
  • 36
  • I didn't want to get into architecture details, but I have a session to maintain between two websites (one for shopping and address management, and one for payment processing--two different subdomains within a single domain). The current design is to use links (or redirects) for customers to go from one to the other as needed when purchasing. – Christopher Stevenson Dec 11 '13 at 00:49
0

Collisions are theoretically impossible (it's not Globally Unique for nothing), but predictability is a whole other question. As Christopher Stevenson correctly points out, given a few previously generated GUIDs it actually becomes possible to start predicting a pattern within a much smaller keyspace than you'd think. GUIDs guarantee uniqueness, not predictability. Most algorithms take it into account, but you should never count on it, especially not as transaction Id for purchases, however briefly. You're creating an open door for brute force session hijacking attacks.

To create a proper unique ID, take some random stuff from your system, append some visitor specific information, and append a string only you know on the server, and then put a good hash algorithm over the whole thing. Hashes are meant to be unpredictable and unreversable, unlike GUIDs.

To simplify: if uniqueness is all you care about, why not just give all your visitors sequential integers, from 1 to infinity. Guaranteed to be unique, just terribly predictable that when you just purchased item 684 you can start hacking away at 685 until it appears.

Niels Keurentjes
  • 41,402
  • 9
  • 98
  • 136
0

To avoid collisions:

  1. If you can't keep a global count, then use Guid.NewGuid().
  2. Otherwise, increment some integer and use 1, 2, 3, 4...

"But isn't that ridiculously easy to guess?"

Yes, but accidental and deliberate collisions are different problems with different solutions, best solved separately, note least because predictability helps prevent accidental collision while simultaneously making deliberate collision easier.

If you can increment globally, then number 2 guarantees no collisions. UUIDs were invented as a means to approximate that without the ability to globally track.

Let's say we use incrementing integers. Let's say the ID we have in a given case is 123.

We can then do something like:

private static string GetProtectedID(int id)
{
  using(var sha = System.Security.Cryptography.SHA1.Create())
  {
    return string.Join("", sha.ComputeHash(Encoding.UTF8.GetBytes(hashString)).Select(b => b.ToString("X2"))) + id.ToString();
  }
}

Which produces 09C495910319E4BED2A64EA16149521C51791D8E123. To decode it back to the id we do:

private static int GetIDFromProtectedID(string str)
{
  int chkID;
  if(int.TryParse(str.Substring(40), out chkID))
  {
    string chkHash = chkID.ToString() + "this is my secret seed kjٵتשڪᴻᴌḶḇᶄ™∞ﮟﻑfasdfj90213";
    using(var sha = System.Security.Cryptography.SHA1.Create())
    {
      if(string.Join("", sha.ComputeHash(Encoding.UTF8.GetBytes(hashString)).Select(b => b.ToString("X2"))) == str.Substring(0, 40))
        return chkID;
    }
  }
  return 0;//or perhaps raise an exception here.
}

Even if someone guessed from that they were given number 123, it wouldn't let them deduce that the id for 122 was B96594E536C9F10ED964EEB4E3D407F183FDA043122.

Alternatively, the two could be given as separate tokens, and so on.

Jon Hanna
  • 110,372
  • 10
  • 146
  • 251
-3

I generally just use Guid.NewGuid();

http://msdn.microsoft.com/en-us/library/system.guid.newguid(v=vs.110).aspx

TGH
  • 38,769
  • 12
  • 102
  • 135
  • 2
    Isn't NewGuid() [somewhat predictable](http://stackoverflow.com/questions/2621563/how-random-is-system-guid-newguid-take-two)? – Christopher Stevenson Dec 11 '13 at 00:26
  • I can't really speak to the algorithm, but my impression is that due to the large number of possible guids, it's hard to guess guids that have been generated your system. Collisions are extremely rare, and you could of course append that with a constraint in a database if you are worried. Seems like the OP is mostly concerned with uniqueness, so the predictability might even be a lesser concern. – TGH Dec 11 '13 at 00:32
  • Assuming uniform probability for simplicity, the probability of one duplicate would be about 50% if every person on earth owned 600 million GUIDs. – dspiegs Dec 11 '13 at 00:38
  • @dspiegs collisions are not the issue at hand, but the chance of brute forcing somebody else's session GUID while they're making a purchase. – Niels Keurentjes Dec 11 '13 at 00:44
  • @NielsKeurentjes by brute force do you mean just counting up from 0 to 2^128? because that is never going to happen. – dspiegs Dec 11 '13 at 00:55
  • @dspiegs you're overestimating the keyspace. A GUID's uniqueness partially stems from using *predictable* elements like a local MAC address. Hence when you've seen a few GUIDs from the same server, you can derive a pattern. It's related to a [birthday attack](https://en.wikipedia.org/wiki/Birthday_attack) from that point on. I'm not saying there's a viable attack vector right now, but given time it's not unlikely that one will appear due to the entire nature of GUIDs being generated to be unique amongst a few billion computers. – Niels Keurentjes Dec 11 '13 at 00:58
  • To put it differently: GUID algorithms use MAC addresses to ensure being 'globally unique'. A MAC is 6 bytes, so the keyspace is 2^48. This means that if you know the MAC, the remainder of the GUID only has a complexity of 2^80 all of a sudden, which is bruteforcable with today's hardware. And there are more predictable elements in a GUID's algorithm to avoid depending on possibly cloned MACs. – Niels Keurentjes Dec 11 '13 at 01:03
  • @NielsKeurentjes 1) Most GUIDs algorithms these days don't use the MAC address anymore 2) In the OP's case he's generating all of the guids (on a web server), not a potentially malicious user; a malicious user changing their MAC address won't affect the GUID at all. – Servy Dec 11 '13 at 01:05
  • 1
    @Servy 1) some still do, and you don't know how they're going to change it in .NET 5.0, and it doesn't matter since other algorithms still take measures to be unique amongst billions of computers, limiting the keyspace severely and making it predictable, 2) that's the whole problem, they're all generated on the same machine, hence susceptible to patterns. – Niels Keurentjes Dec 11 '13 at 01:13
  • Randomly generated GUIDs only use 6 bits to specify they are random. I am sure Guid.NewGuid() is good enough. OP is worried about collisions, not predictability. It may be possible to predict his next GUID if someone logs all the GUIDs created, but he will never get a duplicate. – dspiegs Dec 12 '13 at 17:47