16

A while back I wrote a random string generator that builds a string by repeatedly appending a character picked with mt_rand() from a pool string until the desired length is reached.

public function getPassword ()
{
    if ($this -> password == '')
    {
        $pw             = '';
        $charListEnd    = strlen (static::CHARLIST) - 1;
        for ($loops = mt_rand ($this -> min, $this -> max); $loops > 0; $loops--)
        {
            $pw .= substr (static::CHARLIST, mt_rand (0, $charListEnd), 1);
        }
        $this -> password   = $pw;
    }
    return $this -> password;
}

(CHARLIST is a class constant containing a pool of characters for the password. $min and $max are length constraints.)

Today, while researching something else entirely, I stumbled upon the following code:

function generateRandomString ($length = 10) {    
    return substr(str_shuffle ("0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"), 0, $length);
}

This accomplishes pretty much the same effect as my looping mt_rand() based code in one line. I really like it for that simple reason, fewer lines of code is always a good thing. :)

But when I looked up str_shuffle in PHP's manual the documentation on it was pretty light. One thing I was really keen to learn was what algorithm does it use for randomness? The manual doesn't mention what kind of randomization is done to get the shuffled string. If it uses rand() instead of mt_rand() then sticking to my current solution may be better after all.

So basically I'd like to know how str_shuffle randomizes the string. Is it using rand() or mt_rand()? I'm using my random string function to generate passwords, so the quality of the randomness matters.

UPDATE: As has been pointed out, the str_shuffle method is not equivalent to the code I'm already using and will be less random due to the string's characters remaining the same as the input, only with their order changed. However I'm still curious as to how the str_shuffle function randomizes its input string.
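To make the difference described in the update concrete, here is a minimal sketch comparing the two approaches, assuming PHP 7 or later. The names `POOL`, `pickWithReplacement` and `pickByShuffle` are hypothetical and exist only for this illustration; the pool is the one from the one-liner above, not my actual CHARLIST.

```php
<?php
// Hypothetical pool for illustration (same characters as the one-liner above).
const POOL = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';

// Loop approach: every position is drawn independently, so repeats can occur.
function pickWithReplacement(int $length): string
{
    $out  = '';
    $last = strlen(POOL) - 1;
    for ($i = 0; $i < $length; $i++) {
        $out .= substr(POOL, mt_rand(0, $last), 1);
    }
    return $out;
}

// Shuffle approach: the result is a prefix of a permutation of the pool,
// so no character repeats and the result cannot be longer than the pool.
function pickByShuffle(int $length): string
{
    return substr(str_shuffle(POOL), 0, $length);
}

echo pickWithReplacement(10), "\n";
echo pickByShuffle(10), "\n";
echo strlen(pickByShuffle(100)), "\n"; // capped at strlen(POOL)
```

With the shuffle version, a request for more characters than the pool holds silently returns a shorter string, which is another reason the two are not interchangeable.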

GordonM

3 Answers

37

A better solution would be mt_rand(), which uses the Mersenne Twister and is a much better generator.

> As has been pointed out, the str_shuffle method is not equivalent to the code I'm already using and will be less random due to the string's characters remaining the same as the input, only with their order changed. However I'm still curious as to how the str_shuffle function randomizes its input string.

To make the outputs comparable, let's restrict each function to the values 0 and 1 and look at a visual representation of what each one produces.

Simple Test Code

header("Content-type: image/png");
$im = imagecreatetruecolor(512, 512) or die("Cannot Initialize new GD image stream");
$white = imagecolorallocate($im, 255, 255, 255);
for($y = 0; $y < 512; $y ++) {
    for($x = 0; $x < 512; $x ++) {
        if (testMTRand()) { //change each function here 
            imagesetpixel($im, $x, $y, $white);
        }
    }
}
imagepng($im);
imagedestroy($im);

function testMTRand() {
    return mt_rand(0, 1);
}

function testRand() {
    return rand(0, 1);
}

function testShuffle() {
    return substr(str_shuffle("01"), 0, 1);
}

Output testRand()

[Image: 512×512 pixel plot generated with testRand()]

Output testShuffle()

[Image: 512×512 pixel plot generated with testShuffle()]

Output testMTRand()

[Image: 512×512 pixel plot generated with testMTRand()]

> So basically I'd like to know how str_shuffle randomizes the string. Is it using rand() or mt_rand()? I'm using my random string function to generate passwords, so the quality of the randomness matters.

You can clearly see that str_shuffle() produces almost the same output as rand().
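The manual does not say which algorithm str_shuffle() uses, so the following is only a userland sketch of a Fisher-Yates shuffle, the usual way to shuffle a sequence. It is an assumption that PHP's internal implementation follows this pattern; check the source of your PHP version to be sure. The function name `shuffleString` and the `$rng` parameter are hypothetical. The sketch makes one thing visible: the quality of the shuffle is entirely determined by the generator that supplies the indices.

```php
<?php
// Userland Fisher-Yates shuffle over the bytes of a string (PHP 7+).
// $rng is a hypothetical parameter: any callable that returns an int
// in the inclusive range [min, max], such as 'rand' or 'mt_rand'.
function shuffleString(string $str, callable $rng): string
{
    for ($i = strlen($str) - 1; $i > 0; $i--) {
        $j = $rng(0, $i);      // pick a position among the not yet fixed bytes
        $tmp     = $str[$i];   // swap bytes $i and $j
        $str[$i] = $str[$j];
        $str[$j] = $tmp;
    }
    return $str;
}

echo shuffleString('0123456789', 'mt_rand'), "\n";
echo shuffleString('0123456789', 'rand'), "\n";
```

Passing the generator in as a callable is only a convenience for comparing generators side by side; the built-in function offers no such parameter.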

Baba
  • 6
    Annoyingly pedantic nitpick: Different algorithms can have the same output. It's also possible that they only behave the same when the range is [0,1]. *Very* unlikely though. Either way, +1. I'm a sucker for nifty pictures :). – Corbin Dec 29 '12 at 08:59
  • 1
    `testShuffle` doesn't produce almost the same output as `testRand`, it produces the exact opposite (in your test) :-) – OlavJ Jan 16 '13 at 11:47
  • I was judging by your output. If you invert your output from testRand(), it is exactly the same as testShuffle(). – OlavJ Jan 21 '13 at 14:03
  • Are you sure? I think the two images are 100% opposite of each other, and Paint.net seems to agree with me. – OlavJ Jan 21 '13 at 14:43
  • I'll do some tests myself. That would be an interesting find. – Baba Jan 21 '13 at 14:50
  • Use http://www.catenarysystems.com/demos/compare/comparator.aspx and set the y-coordinate to 180; it's only about 75% similar. – Baba Jan 21 '13 at 15:04
  • Why set a different y-coordinate? When aligned (x- and y-coordinates at 0) that site verifies what I said: they are 100% opposite of each other. – OlavJ Jan 21 '13 at 15:25
  • I still don't want to believe `rand` and `shuffle` are that messed up – Baba Jan 21 '13 at 15:35
  • I really don't know. I just noticed that the images you generated seemed to be the inverse of each other, and the verifications I did confirmed this. – OlavJ Jan 22 '13 at 07:47
  • Generated a new one; it's the same. – Baba Jan 22 '13 at 16:26
  • `mt_rand` is fine for simulation purposes, but it's not a good choice for security. Generating a password with `mt_rand` is almost as bad as using `rand`. Its seed is too small, and its outputs can be predicted by observing only a small part of the output. – CodesInChaos Oct 31 '13 at 16:59
  • 1
    Quick update: [starting from PHP 7.1](http://php.net/manual/en/migration71.incompatible.php#migration71.incompatible.rand-srand-aliases), `rand()` is an alias to `mt_rand()`, so the above three test methods will be the same – Razor Apr 05 '18 at 20:36
3

Please be aware that this method should not be used if your application is really focused on security. The Mersenne Twister is NOT cryptographically secure. A PRNG can yield values which statistically appear to be random but still are easy to break.
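If security matters, one option on PHP 7.0 and later is to draw each character with random_int(), which is backed by the operating system's cryptographically secure generator. The sketch below is only an illustration under that assumption; the function name `securePassword` and its default pool are hypothetical, not part of any library.

```php
<?php
// Sketch for PHP 7.0+: pick every character with random_int(), which draws
// from the operating system's CSPRNG instead of the Mersenne Twister.
// The function name and the default pool are hypothetical.
function securePassword(int $length, string $pool = 'abcdefghjkmnpqrstuvwxyz23456789'): string
{
    $last = strlen($pool) - 1;
    if ($length < 1 || $last < 0) {
        throw new InvalidArgumentException('Need a positive length and a non-empty pool.');
    }
    $pw = '';
    for ($i = 0; $i < $length; $i++) {
        // Each position is chosen independently, so characters may repeat.
        $pw .= $pool[random_int(0, $last)];
    }
    return $pw;
}

echo securePassword(16), "\n";
```

The structure is the same as the mt_rand() loop in the question; only the source of randomness changes.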

mepilk
0

Still not cryptographically secure, but here is a way to use str_shuffle() while allowing character repetition, thereby improving complexity...

function generate_password($length = 8, $strength = 3) {
    // Clamp the length to the range 6..32
    if ($length < 6) $length = 6;
    if ($length > 32) $length = 32;
    // Excludes [0, O, o, 1, I, i, L, l] on purpose for readability
    $lower = 'abcdefghjkmnpqrstuvwxyz';
    $chars = $lower;
    if ($strength >= 2) $chars .= '23456789';
    if ($strength >= 3) $chars .= strtoupper($lower);
    if ($strength >= 4) $chars .= '!@#$%&?';
    // Repeat the pool so a character can occur more than once in the result
    return substr(str_shuffle(str_repeat($chars, $length)), 0, $length);
}

$chars is repeated $length times before the string is shuffled, which makes this a little better than shuffling only a single occurrence of each character.

We only use this in systems that do not store sensitive information ;)

Mavelo
  • One improvement could be to build the string one by one checking the last character to ensure 2 of the same are not in sequence, but you get the idea ;) – Mavelo Sep 30 '17 at 17:42
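
The idea in the comment above could look roughly like the sketch below (PHP 7+). The function name `passwordNoAdjacentRepeats` is hypothetical, and it uses mt_rand() only to stay in line with the answer, so it is still not cryptographically secure. Note also that rejecting adjacent repeats slightly reduces the number of possible passwords.

```php
<?php
// Build the password one character at a time and skip any candidate that
// equals the previous character. Hypothetical helper, not a library function.
function passwordNoAdjacentRepeats(int $length, string $chars): string
{
    // With fewer than two distinct characters the loop below could never finish.
    if (count(array_unique(str_split($chars))) < 2) {
        throw new InvalidArgumentException('Pool needs at least two distinct characters.');
    }
    $last = strlen($chars) - 1;
    $pw   = '';
    $prev = '';
    while (strlen($pw) < $length) {
        $c = $chars[mt_rand(0, $last)];
        if ($c !== $prev) {   // accept only if it differs from the last character
            $pw  .= $c;
            $prev = $c;
        }
    }
    return $pw;
}

echo passwordNoAdjacentRepeats(12, 'abcdefghjkmnpqrstuvwxyz23456789'), "\n";
```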