The standard way to create a cryptographically secure token using PHP seems to be:

$token = bin2hex(openssl_random_pseudo_bytes(16));

As I understand it, on Linux (which I always use) this draws on /dev/urandom, which is continually fed by the many unpredictable things going on in the operating system, and that makes the output nigh impossible to predict.

My function looks more like this, so that I can specify a character length rather than a byte length (though I rarely use it; see below):

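function token($charLength = 32) {

    // Each byte produces 2 hexadecimal characters, so the byte length is half the char length
    // (rounded up so that an odd char length can be trimmed to size below)
    $byteLength = (int) ceil($charLength / 2);

    // Generate token
    $token = bin2hex(openssl_random_pseudo_bytes($byteLength));

    // Trim to the requested number of characters
    return substr($token, 0, $charLength);

}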

Is it the unpredictability that makes it secure? I can't help thinking it's less secure because the output is hexadecimal, and is therefore easier to guess or brute-force than a string of the same length that also draws on the rest of the alphabet, uppercase letters, other symbols, etc.

Is this why people describe tokens by bit length rather than by character length?

Consider instead:

function randomString($length,
                      $alpha = true,
                      $alphau = true,
                      $numeric = true,
                      $specialChars = '') {

    $string = $specialChars;

    if($alpha === true) {

        $string .= 'abcdefghijklmnopqrstuvwxyz';

    }

    if($alphau === true) {

        $string .= 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';

    }

    if($numeric === true) {

        $string .= '0123456789';

    }

    $array      = str_split($string);
    $string     = '';

    for($counter = 0; $counter < $length; $counter ++) {

        $string .= $array[array_rand($array)];

    }

    return $string;

}

In the context of web development when would you use the first function over the second for:

  1. Creating a random password for a password reset
  2. Creating a one-time use token (e.g. for a forgotten password link)
  3. Creating a salt for a password hash (e.g. bcrypt, sha512, PBKDF2)
  4. Creating a token for a “remember me” cookie token

In all instances I would use randomString() over token() so I guess I'm asking if and why I'm wrong in any of the above.

My rationale in relation to the above points:

  1. A 12 char random password with uppercase, lowercase and numbers is hard to guess; plus I lock people out for 15 mins after 5 failed login attempts
  2. A 64 char random string; if someone tried brute-forcing the token to reset a password, the firewall would pick up on it
  3. Salts should be assumed to be public anyway; so long as they're different per password, a precomputed rainbow table is of no use
  4. My remember me token is a 128 char random string stored in a cookie; it is salted and hashed with SHA-512 in the database
texelate

1 Answer

The primary concern with a random number generator is generally not the form of its output, but how predictable the process that generates it is. Your basic question is why you shouldn't use array_rand (which internally uses php_rand) instead of openssl_random_pseudo_bytes for cryptographic purposes. The answer lies in how each function produces its values: array_rand takes a much more predictable (and reproducible) approach. See Pádraic Brady's article "Predicting Random Numbers In PHP – It’s Easier Than You Think!" for more detail: http://blog.astrumfutura.com/2013/03/predicting-random-numbers-in-php-its-easier-than-you-think/.
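To make the reproducibility point concrete, here is a minimal sketch. It assumes PHP 7.1 or later, where array_rand() draws from the Mersenne Twister generator that mt_srand() seeds; pickChars() is a hypothetical helper that mirrors the loop in randomString(). A real attacker would have to recover the seed rather than set it, but the sketch shows that the whole string is fixed once the generator state is known.

```php
<?php
// Hypothetical helper mirroring the loop in randomString().
function pickChars(array $alphabet, int $length): string
{
    $out = '';
    for ($i = 0; $i < $length; $i++) {
        $out .= $alphabet[array_rand($alphabet)];
    }
    return $out;
}

$alphabet = str_split('abcdefghijklmnopqrstuvwxyz0123456789');

// Assumption: PHP 7.1+, where mt_srand() seeds the generator behind array_rand().
mt_srand(12345);
$first = pickChars($alphabet, 16);

mt_srand(12345);
$second = pickChars($alphabet, 16);

// Same seed, same "random" string.
var_dump($first === $second);
```

openssl_random_pseudo_bytes() offers no equivalent way to replay its output from a small seed value.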

Concerning the output of random number generators, password/key strength against brute-force attacks is often measured in entropy. This is usually stated in bits: the more bits, the better. The Wikipedia page on password strength (http://en.wikipedia.org/wiki/Password_strength) has some great reference tables for determining the entropy of passwords at different lengths and with various combinations of character types. The openssl_random_pseudo_bytes() function uses all 256 possible byte values, giving a full 8 bits of entropy per byte (bin2hex() then spreads each byte over two characters, so 4 bits per hexadecimal character). At best, your randomString() function would produce 5.954 bits of entropy per character (log2 of 62, for the full alphanumeric set), and only if its source were unpredictable.
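The per-character figures can be reproduced with a short calculation. This is only a sketch: bitsPerChar() and totalBits() are hypothetical helper names, and the numbers assume every character is chosen uniformly and unpredictably from the alphabet.

```php
<?php
// Bits of entropy per character for a uniform choice from $alphabetSize symbols.
function bitsPerChar(int $alphabetSize): float
{
    return log($alphabetSize, 2);
}

// Total bits of entropy for $length independently chosen characters.
function totalBits(int $alphabetSize, int $length): float
{
    return $length * bitsPerChar($alphabetSize);
}

printf("hex:          %.3f bits/char\n", bitsPerChar(16));
printf("alphanumeric: %.3f bits/char\n", bitsPerChar(62));
printf("base64:       %.3f bits/char\n", bitsPerChar(64));
printf("raw byte:     %.3f bits/byte\n", bitsPerChar(256));

// A 32 char hex token encodes 16 random bytes.
printf("32 hex chars: %.0f bits\n", totalBits(16, 32));
```

Seen this way, hexadecimal output is less dense per character, but the strength of a token depends on the total number of bits behind it and not on how those bits are written down.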

A crypto strong random number should be used in every security-related scenario where the ability to guess one of these numbers would harm your site in some way. The only item in your list of four where I don't see a crypto strong random number being required is the salt for a password hash. A salt value must be universally unique. It can certainly be produced by a crypto random number generator (CRNG), but this is not required, as the resulting value can be made public. See https://security.stackexchange.com/questions/8246/what-is-a-good-enough-salt-for-a-saltedhash
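If a custom alphabet is still wanted for the password case, one possible approach is to keep the character-picking loop but draw each index from a cryptographically secure source. The sketch below is an assumption on my part and not code from the question: secureRandomString() is a hypothetical name, and it relies on random_int(), which is available from PHP 7.0.

```php
<?php
// Sketch only: secureRandomString() is a hypothetical name, not from the question.
// Assumes PHP 7.0+ for random_int(), which draws from the operating system's CSPRNG.
function secureRandomString(int $length, string $alphabet): string
{
    if ($length < 1 || $alphabet === '') {
        throw new InvalidArgumentException('Need a positive length and a non-empty alphabet.');
    }

    $max = strlen($alphabet) - 1;
    $out = '';
    for ($i = 0; $i < $length; $i++) {
        // random_int() returns a uniformly distributed integer in [0, $max].
        $out .= $alphabet[random_int(0, $max)];
    }
    return $out;
}

$alphanumeric = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
echo secureRandomString(12, $alphanumeric), "\n";
```

This addresses predictability only; the entropy per character is still capped by the size of the alphabet.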

Brice Williams
  • Hi, and thanks for your reply. I have since found this post: http://stackoverflow.com/a/2595372/2338825 Do you think if I adapted my randomString() function to work like this I'd have cryptographically safe numbers that aren't limited to hexadecimal? I.e. good enough to use in all four scenarios. – texelate Dec 31 '14 at 15:45
  • The referenced function is better from the predictability side, but still limits the entropy in the resulting output. If you are concerned about the portability of the output from openssl_random_pseudo_bytes I would recommend using the base64_encode() function (http://php.net/manual/en/function.base64-encode.php) to encode the binary output. You will then end up with values that look something like: VGhpcyBpcyBhbiBlbmNvZGVkIHN0cmluZw== – Brice Williams Dec 31 '14 at 15:54
  • I think I'm getting it now. So when you create a value with openssl_random_pseudo_bytes you effectively treat each byte as a symbol, correct? If so, I think I was getting hung up on the textual representation of the binary. So, just to confirm, for creating a reset key or a remember me token, bin2hex(openssl_random_pseudo_bytes(64)) would be just fine? A random password might be better with http://stackoverflow.com/a/2595372/2338825 – texelate Dec 31 '14 at 17:14
  • Both bin2hex and base64_encode are encoding functions that turn binary data into a string representation. base64_encode will result in a shorter output string than bin2hex, but both are valid for use. Using openssl_random_pseudo_bytes(64) will result in a very strong random number with 512 bits of entropy (8 * 64). – Brice Williams Dec 31 '14 at 18:34
  • Base64 encoding 64 bytes of openssl_random_pseudo_bytes output will result in an 88 character string. This is likely larger than you need for passwords, which typically have much lower entropy (a strong password storage technique like bcrypt compensates by making each guess more expensive). As an example, base64_encode(openssl_random_pseudo_bytes(6)) will result in an 8 character string with 48 bits of entropy (6 * 8), as opposed to an 8 character user-entered password that might achieve 47.6 bits (8 * 5.954) at best. – Brice Williams Dec 31 '14 at 18:34
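
The string lengths quoted in the comments above can be checked with a short sketch. It assumes the OpenSSL extension is loaded; the variable names are illustrative only.

```php
<?php
// Assumes the OpenSSL extension is available, as in the rest of the thread.
$raw64 = openssl_random_pseudo_bytes(64); // 64 bytes = 512 bits
$raw6  = openssl_random_pseudo_bytes(6);  // 6 bytes  = 48 bits

// Hex uses 2 characters per byte; Base64 uses 4 characters per 3 bytes (padded).
$hex64 = bin2hex($raw64);
$b64of64 = base64_encode($raw64);
$b64of6 = base64_encode($raw6);

printf("64 bytes as hex:    %d chars\n", strlen($hex64));
printf("64 bytes as base64: %d chars\n", strlen($b64of64));
printf("6 bytes as base64:  %d chars\n", strlen($b64of6));
```

The encoding changes only how many characters are needed to carry the bytes; the entropy is set by the number of random bytes requested.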