
I am referring to the function in the answer on PHP random string generator (I have removed the default $length param).

The code below uses that function, but when you run it you will see that some strings appear multiple times in the array. If the output is truly random, how can I be getting these results? Do I need to change something to produce really random strings?

function generateRandomString($length) {
    $characters = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $charactersLength = strlen($characters);
    $randomString = '';

    for ($i = 0; $i < $length; $i++) {
        $randomString .= $characters[rand(0, $charactersLength - 1)];
    }

    return $randomString;
}

// Generate 5000 strings, grouping them by value so duplicates are easy to count.
$arr = [];
for ($i = 0; $i < 5000; $i++) {
    $r = generateRandomString(8);
    $arr[$r][$i] = $r;
}

foreach ($arr as $key=>$rand){
    if(count($rand) > 1){
        echo "$key has ".count($rand).' results<br>';
    }
}

Example results:

ASCn2Db1 has 2 results
4ceXoUgh has 2 results
fCdzjEAV has 2 results
QRkXxAUJ has 2 results
Antony
  • Creating random strings does _not_ mean that the strings are guaranteed to be unique! Why should that be the case? A random algorithm has no history. On the contrary: with short strings and many attempts you have to expect repetitions. There are only so many possible combinations... – arkascha Aug 13 '15 at 09:43
  • Are you saving these random strings in a database? If so, you can easily get rid of duplicate strings using a while loop – Happy Coding Aug 13 '15 at 09:47
  • True - I had just thought we were unlikely to see SO many duplicates – Antony Aug 13 '15 at 09:49
  • Generated 5000000 random strings with this algorithm on my machine and had no duplicates. – Erki Aring Aug 13 '15 at 10:23
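
The while-loop idea from the comments can be sketched roughly as follows. This is only an illustration: generateUniqueStrings() is a hypothetical helper name, it keeps the seen values in memory rather than in a database, and it assumes random_int() is available (PHP 7+).

```php
<?php
// Hypothetical helper: keeps drawing strings until $count distinct values exist.
// Assumes $count is far smaller than 62 ** $length, otherwise the loop may never end.
function generateUniqueStrings($count, $length)
{
    $alphabet = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $max = strlen($alphabet) - 1;
    $seen = [];

    while (count($seen) < $count) {
        $candidate = '';
        for ($i = 0; $i < $length; $i++) {
            $candidate .= $alphabet[random_int(0, $max)];
        }
        // Using the string as an array key silently drops any duplicate.
        $seen[$candidate] = true;
    }

    // PHP converts all-digit keys such as "12345678" to integers, so cast back.
    return array_map('strval', array_keys($seen));
}

$strings = generateUniqueStrings(5000, 8);
echo count($strings), ' unique strings', PHP_EOL;
```

Uniqueness here is enforced by the caller, not by the generator; the random source itself still has no memory of earlier output.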

4 Answers


Random is not unique. Roll a die and you will get the same number multiple times.

BobbyTables
  • Fair enough, but with 5000 loops, each building an 8-character string from 62 possible characters, what are the odds that over 900 results (which is what I saw) will be duplicates? I'd say that's too high. – Antony Aug 13 '15 at 09:47
  • Humans are not good at estimating probability distributions ;) – BobbyTables Aug 13 '15 at 09:52
  • Having 900 duplicates in 5000 loops obviously cannot be called random, no matter how bad you are at estimating. – Erki Aring Aug 13 '15 at 10:20

I provided an answer to the question you referenced in your question, but it's buried pretty far down under insecure answers and I'm not surprised everyone missed it.

The bug in your code that is causing so many duplicate values is here:

for ($i = 0; $i < $length; $i++) {
    $randomString .= $characters[rand(0, $charactersLength - 1)];
}

The problem is rand(). If you want a high quality PHP random string generator you need to use a better random number generator to power it. You have three good options here:

  • random_int() (PHP 7+ only)
  • random_compat, which exposes a compatible interface for random_int() in PHP 5 projects (5.2+)
  • RandomLib (PHP 5.3.2+)

TL;DR use random_int(), never rand() or mt_rand().
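
As a minimal sketch of that advice, here is the function from the question with only the random source swapped. The name generateSecureRandomString() is made up for this example; it assumes PHP 7+ (or the random_compat polyfill on PHP 5).

```php
<?php
// Sketch only: same alphabet and loop as the question, but the index
// comes from random_int() instead of rand().
function generateSecureRandomString($length)
{
    $characters = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $maxIndex = strlen($characters) - 1;
    $randomString = '';

    for ($i = 0; $i < $length; $i++) {
        // random_int() draws a uniformly distributed integer in [0, $maxIndex]
        // from the operating system's cryptographically secure source.
        $randomString .= $characters[random_int(0, $maxIndex)];
    }

    return $randomString;
}

echo generateSecureRandomString(8), PHP_EOL;
```

random_int() throws an exception if no suitable random source is available, so a failure is loud rather than a silent fallback to weaker output.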

Even with a secure random number generator, if you have short strings and a sufficiently large sample size, collisions are inevitable due to the birthday problem. Use longer strings and they will be far less frequent.
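
To put rough numbers on the birthday problem, the sketch below uses the standard approximation p ≈ 1 - exp(-n(n-1)/(2N)) for n samples drawn uniformly from N possibilities. collisionProbability() is a hypothetical helper, and the uniformity assumption is exactly what a weak rand() can violate.

```php
<?php
// Birthday-problem approximation: chance of at least one duplicate among
// $samples values drawn uniformly from $space equally likely values.
function collisionProbability($samples, $space)
{
    $expectedPairs = $samples * ($samples - 1) / (2 * $space);
    // 1 - exp(-x), written with expm1() so it stays accurate for tiny x.
    return -expm1(-$expectedPairs);
}

// 62-character alphabet, as in the question.
printf("5000 strings of length 8: %.2e\n", collisionProbability(5000, pow(62, 8)));
printf("5000 strings of length 4: %.2f\n", collisionProbability(5000, pow(62, 4)));
```

For 8 characters the result is on the order of 10^-8, so hundreds of duplicates in 5000 draws cannot be explained by chance alone under a uniform generator. For 4 characters the chance of at least one duplicate is already better than even.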

Scott Arciszewski

I actually came across PHP: How to generate a random, unique, alphanumeric string? which seemed to answer my question. In short, it uses the openssl_random_pseudo_bytes function as the random source. When I ran it with 4-character strings I got one collision, but with 5 characters I got none, so with 8 I should be fine.
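
A minimal sketch of that approach, not the code from the linked answer: it assumes the OpenSSL extension is loaded, and randomHexString() is a name invented for this example. Note that it produces hexadecimal strings, so each character has 16 possible values rather than 62.

```php
<?php
// Sketch, assuming ext-openssl is available.
function randomHexString($length)
{
    // Each random byte becomes two hex characters, so round the byte count up.
    $bytes = openssl_random_pseudo_bytes((int) ceil($length / 2));

    // Trim in case an odd length was requested.
    return substr(bin2hex($bytes), 0, $length);
}

echo randomHexString(8), PHP_EOL;
```

With a hex alphabet an 8-character string has 16^8 (about 4.3 billion) possible values, far fewer than 62^8, so longer strings are needed for the same collision resistance.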

Antony

I remember reading an article about this some time ago and I found it again.
In the article, the author renders the output of the random function running on Windows as a bitmap, which shows that it is not totally random.

I would recommend using mt_rand() instead; you can read more about mt_rand() here

Oliver Nybroe