97

This relates to this question. I am using the code below from this answer to generate a UUID in JavaScript:

'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, function(c) {
    var r = Math.random()*16|0, v = c == 'x' ? r : (r&0x3|0x8);
    return v.toString(16);
});

This solution appeared to be working fine, but I am getting collisions. Here's what I have:

  • A web application running in Google Chrome.
  • 16 users.
  • about 4000 UUIDs have been generated in the past two months by these users.
  • I got about 20 collisions - e.g., a new UUID generated today was the same as about two months ago (different user).

What is causing this issue and how can I avoid it?

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Muxa
  • 5,563
  • 6
  • 46
  • 56
  • 2
    Combine a good random number with the current time (in milliseconds). The odds of the random number colliding at exactly the same time are really, really, really low. – jfriend00 Aug 02 '11 at 06:33
  • 7
    @jfriend00 if you need to do that then it is not a "good random number", not even a decent pseudo-random number. – Attila O. Jul 17 '12 at 15:14
  • 2
    what does the `(r&0x3|0x8)` portion mean / evaluation to? – Kristian Dec 11 '14 at 17:42
  • What about appending a Date.now().toString() to it? – Vitim.us May 15 '15 at 04:48
  • 4
    There's a big problem in your architecture, unrelated to UUIDs -- client may intentionally generate colliding IDs. Generate IDs only by a system you trust. As a workaround, though, prepend client-generated IDs with user_id, so that adversary/faulty client can only collide with themselves (and handle that on server side). – Dzmitry Lazerka Apr 15 '16 at 20:29
  • What about using a timestamp to scramble data when generating the UUIDs (eg : by calling new Date()) ? Would it help to reduce the collisions on Chrome ? – tigrou Jul 03 '17 at 21:45

6 Answers6

37

My best guess is that Math.random() is broken on your system for some reason (bizarre as that sounds). This is the first report I've seen of anyone getting collisions.

node-uuid has a test harness that you can use to test the distribution of hex digits in that code. If that looks okay then it's not Math.random(), so then try substituting the UUID implementation you're using into the uuid() method there and see if you still get good results.

[Update: Just saw Veselin's report about the bug with Math.random() at startup. Since the problem is only at startup, the node-uuid test is unlikely to be useful. I'll comment in more detail on the devoluk.com link.]

broofa
  • 37,461
  • 11
  • 73
  • 73
  • 1
    Thanks, I'm going with uuid.js now, since it uses browser's strong crypto if available. Will see if there are any collisions. – Muxa Aug 31 '11 at 22:21
  • can you provide a link to the uuid.js code you're referring to? (sorry, not sure which lib you mean.) – broofa Sep 07 '11 at 16:51
  • 10
    Had no collisions so far :) – Muxa Feb 12 '12 at 09:41
  • Anyway, if it's Chrome and only when starting, your app could generate and discard a row of, say, ten guids using the above function :) – Vinko Vrsalovic Jan 21 '14 at 05:48
  • The problem is with limited entropy you get from Math.random(). For some browsers the entropy is as low as just 41 bits all together. Calling Math.random() multiple times won't raise the entropy. If you really want unique v4 UUIDs you need to use a cryptographically strong RNG that produces at least 122bit entropy per UUID generated. – mlehmk Sep 25 '15 at 12:28
36

Indeed there are collisions, but only under Google Chrome. Check out my experience on the topic in Google Chrome random number generator issue

It seems like collisions only happen on the first few calls of Math.random. Because if you just run the createGUID / testGUIDs method above (which obviously was the first thing I tried), it just works without any collisions whatsoever.

So to make a full test one needs to restart Google Chrome, generate 32 byte, restart Chrome, generate, restart, generate, etc.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Veselin Kulov
  • 361
  • 2
  • 2
21

Just so that other folks can be aware of this - I was running into a surprisingly large number of apparent collisions using the UUID generation technique mentioned here. These collisions continued even after I switched to seedrandom for my random number generator. That had me tearing my hair out, as you can imagine.

I eventually figured out that the problem was (almost?) exclusively associated with Google's web crawler bots. As soon as I started ignoring requests with "googlebot" in the user-agent field, the collisions disappeared. I'm guessing that they must cache the results of JS scripts in some semi-intelligent way, with the end result that their spidering browser can't be counted on to behave the way that normal browsers do.

Just an FYI.

Ken Smith
  • 20,305
  • 15
  • 100
  • 147
  • 2
    Ran into the same issue with our metrics system. Was seeing thousands of UUID collisions using the 'node-uuid' module to generate session IDs in browser. Turns out it was googlebot all along. Thanks! – domkck Mar 27 '17 at 17:47
4

The answer that originally posted this UUID solution was updated on 2017-06-28:

A good article from Chrome developers discussing the state of Math.random PRNG quality in Chrome, Firefox, and Safari. tl;dr - As of late-2015 it's "pretty good", but not cryptographic quality. To address that issue, here's an updated version of the above solution that uses ES6, the crypto API, and a bit of JS wizardy I can't take credit for:

function uuidv4() {
  return ([1e7]+-1e3+-4e3+-8e3+-1e11).replace(/[018]/g, c =>
    (c ^ crypto.getRandomValues(new Uint8Array(1))[0] & 15 >> c / 4).toString(16)
  )
}

console.log(uuidv4());
Luke Willis
  • 8,429
  • 4
  • 46
  • 79
4

I just ran a rudimentary test of 100,000 iterations in Chrome using the UUID algorithm you posted, and I didn't get any collisions. Here's a code snippet:

var createGUID = function() {
    return 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, function(c) {
        var r = Math.random()*16|0, v = c == 'x' ? r : (r&0x3|0x8);
        return v.toString(16);
    });
}

var testGUIDs = function(upperlimit) {
    alert('Doing collision test on ' + upperlimit + ' GUID creations.');
    var i=0, guids=[];
    while (i++<upperlimit) {
        var guid=createGUID();
        if (guids.indexOf(guid)!=-1) {
            alert('Collision with ' + guid + ' after ' + i + ' iterations');
        }
        guids.push(guid);
    }
    alert(guids.length + ' iterations completed.');
}

testGUIDs(100000);
Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
user533676
  • 575
  • 3
  • 11
  • 4
    Yes, I ran some local tests too and got no collisions. Collisions happen between UUIDs what are generated on different user's machines. I might need to generate some data on different machines and check for collisions. – Muxa Aug 02 '11 at 22:58
  • 2
    Also, i've noticed that collisions are between UUIDs generated 3-4 weeks apart. – Muxa Aug 03 '11 at 00:05
  • Very odd. What platform are you running on? – user533676 Aug 03 '11 at 03:14
  • 2
    It seems unlikely that there's a flaw so basic in V8's Math.random(), but Chromium 11 added support for strong random number generation using the window.crypto.getRandomValues API if you want to try it instead. See http://blog.chromium.org/2011/06/new-chromium-security-features-june.html. – user533676 Aug 03 '11 at 03:23
  • Running on combination of Windows 7 and Windows XP. – Muxa Aug 03 '11 at 11:02
1

The answers here deal with "what's causing the issue?" (Chrome Math.random seed issue) but not "how can I avoid it?".

If you are still looking for how to avoid this issue, I wrote this answer a while back as a modified take on Broofa's function to get around this exact problem. It works by offsetting the first 13 hex numbers by a hex portion of the timestamp, meaning that even if Math.random is on the same seed it will still generate a different UUID unless generated at the exact same millisecond.

Briguy37
  • 8,342
  • 3
  • 33
  • 53