77

I've been trying to find a more Pythonic way of generating random strings in Python that also scales well. Typically, I see something similar to

''.join(random.choice(string.letters) for i in xrange(len))

It's slow if you want to generate a long string, because it calls random.choice once per character.

I've been thinking about random.getrandbits for a while, and figuring out how to convert its output to an array of bytes, then encode that. Using Python 2.6 I came across the bytearray object, which isn't documented. Somehow I got it to work, and it seems really fast.

It generates a 50-million-character random string on my notebook in just about 3 seconds.

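import base64
import random

def rand1(leng):
    nbits = leng * 6 + 1  # base64 encodes 6 bits per output character
    bits = random.getrandbits(nbits)
    uc = u"%0x" % bits
    newlen = int(len(uc) / 2) * 2 # we have to make the string an even length
    ba = bytearray.fromhex(uc[:newlen])
    # bytes(ba) equals str(ba) on Python 2.6+ and also works on Python 3
    return base64.urlsafe_b64encode(bytes(ba))[:leng]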
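
For readers on Python 3, the same idea can be sketched without the hex round trip. This is only a sketch under assumptions: it assumes Python 3.2+ (for int.to_bytes), and rand1_py3 is a hypothetical name, not part of the code above.

```python
import base64
import random


def rand1_py3(leng):
    """Return `leng` URL-safe base64 characters built from one getrandbits call."""
    # base64 turns 3 bytes into 4 characters; request slightly more than needed
    # so that the slice below never reaches the '=' padding.
    nbytes = (leng * 3) // 4 + 1
    bits = random.getrandbits(nbytes * 8)
    # to_bytes pads with leading zero bytes, so the length is always nbytes.
    raw = bits.to_bytes(nbytes, "big")
    return base64.urlsafe_b64encode(raw)[:leng].decode("ascii")
```

Because to_bytes always pads to the requested size, there is no odd-length problem to work around here. Like the original, this uses the random module, so it is not suitable for secrets.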

edit

heikogerlach pointed out that an odd number of characters was causing the issue. I added new code to make sure fromhex always receives an even number of hex digits.

Still curious if there's a better way of doing this that's just as fast.

mikelikespie
  • 1
    How do I make this so that it will only include numbers, letters, and underscore? (This includes a dash) – wenbert Dec 30 '10 at 06:49
  • 2
    @wenbert ''.join(random.choice(string.letters+string.digits+"_") for i in xrange(length)) – yanjost Aug 31 '11 at 09:56
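
The one-liner in the comment above is Python 2 (string.letters, xrange). A rough Python 3 equivalent, assuming Python 3.6+ for random.choices, might look like the sketch below; ALPHABET and random_identifier are hypothetical names chosen for illustration.

```python
import random
import string

# Letters, digits and underscore, as asked in the comment above.
ALPHABET = string.ascii_letters + string.digits + "_"


def random_identifier(length):
    """Return `length` characters drawn from ALPHABET."""
    # random.choices (Python 3.6+) draws all k samples in a single call.
    return "".join(random.choices(ALPHABET, k=length))
```

This still relies on the non-cryptographic random module, so it fits identifiers and test data rather than tokens or passwords.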

5 Answers

130
import os
random_string = os.urandom(string_length)

And if you need a URL-safe string:

import os
random_string = os.urandom(string_length).hex() 

(Note that random_string is twice as long as string_length in that case, because each byte becomes two hex digits. bytes.hex() requires Python 3.5 or later.)
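
If an exact number of hex characters is wanted, one possible approach is to request half as many bytes (rounded up) and trim. A minimal sketch, assuming Python 3.5+; random_hex is a hypothetical helper name:

```python
import os


def random_hex(length):
    """Return exactly `length` hex digits sourced from os.urandom."""
    # Two hex digits per byte, so round the byte count up for odd lengths.
    nbytes = (length + 1) // 2
    return os.urandom(nbytes).hex()[:length]
```

For example, random_hex(7) reads 4 bytes, produces 8 hex digits, and drops the last one.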

jmny
Seun Osewa
  • Ah! So simple. I didn't think it was cross-platform, but apparently it is. – mikelikespie Apr 24 '09 at 09:17
  • Just a followup, it's really odd, but at least on OS X, the getrandbits method is 2-3x faster. – mikelikespie Apr 24 '09 at 09:25
  • 9
    That's probably because os.urandom will be a cryptographically secure PRNG (usually a stream cipher) while random is a "normal" PRNG which are usually way faster to calculate. – Joey Apr 24 '09 at 12:29
  • 6
    Is there a way to use this to generate ASCII strings rather than unicode? For example, so the string can be used in a URL. – Derek Dahmer Feb 06 '10 at 02:07
  • 8
    You could use random.choice, string.digits, and string.letters like the first example: >>> import random, string >>> ''.join(random.choice(string.letters + string.digits) for i in xrange(10)) 'FywhcRLmh1' (I'm assuming you aren't generating an enormous string like the op since it's for a URL...) – JJ Geewax Mar 19 '10 at 18:45
  • 1
    For URLs one may want to use `string.ascii_letters`. – jholster May 22 '10 at 16:21
  • @Derek: You can encode the random string in base64 for a url. – Seun Osewa Oct 23 '10 at 13:11
  • 63
    Specifically, I've used this: base64.urlsafe_b64encode(os.urandom(30)) – jricher Mar 29 '11 at 15:21
  • 4
    Sorry for re-posting in an old thread. Is there any way to use `os.urandom(string_length)` and get ASCII letters only? ... As python is an interpreted language, the loop which generates one byte at a time seems quite costly. – BiGYaN Oct 19 '11 at 04:26
  • 2
    @BiGYaN: jricher gave a solution for that which returns a base64 encoded string, that is: ASCII letters only. – ereOn Sep 04 '12 at 16:17
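
The base64 suggestion from the comments can be turned into a fixed-length helper without a per-character Python loop. This is a sketch assuming Python 3; urlsafe_token is a hypothetical name. Note that URL-safe base64 output contains '-' and '_' as well as letters and digits, so it is ASCII but not letters-only.

```python
import base64
import os


def urlsafe_token(length):
    """Return `length` URL-safe ASCII characters derived from os.urandom."""
    # 3 random bytes give 4 base64 characters; ask for a little extra so the
    # slice below never includes the '=' padding at the end.
    nbytes = (length * 3) // 4 + 1
    encoded = base64.urlsafe_b64encode(os.urandom(nbytes))
    return encoded[:length].decode("ascii")
```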
10

Sometimes a UUID is short enough, and if you don't like the dashes you can always .replace('-', '') them.

from uuid import uuid4

random_string = str(uuid4())

If you want a specific length without dashes (up to 32 characters):

random_string_length = 16
random_string = str(uuid4()).replace('-', '')[:random_string_length]
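
The same thing can be written with the UUID object's hex attribute, which is already the 32-character form without dashes. A minimal sketch; short_id is a hypothetical helper name, and the length check is an added assumption about how out-of-range requests should be handled:

```python
from uuid import uuid4


def short_id(length=16):
    """Return the first `length` hex digits of a random UUID (32 at most)."""
    if not 0 < length <= 32:
        raise ValueError("a UUID only has 32 hex digits")
    # .hex is the dash-free, lowercase hexadecimal form of the UUID.
    return uuid4().hex[:length]
```

One caveat: a version 4 UUID is not random in every position. The 13th hex digit is always 4 and the 17th is one of 8, 9, a, or b, so slices that include those positions carry slightly less randomness than their length suggests.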
Joelbitar
6

Taken from the 1023290 bug report at Python.org:

junk_len = 1024
junk = (("%%0%dX" % junk_len) % random.getrandbits(junk_len * 8)).decode("hex")

Also, see the issues 923643 and 1023290

fdr
2

It seems the fromhex() method expects an even number of hex digits. Your string is 75 characters long. Be aware that something[:-1] excludes the last element! Just use something[:].

  • There was a trailing L with the __hex__(). I rewrote the sample code. Anyways, I think you were right on with it requiring an even number of digits – mikelikespie Apr 24 '09 at 09:17
2

Regarding the last example, the following fix makes sure the hex string always has an even length, whatever the junk_len value. Padding to junk_len * 2 digits matches the two hex digits needed per byte:

junk_len = 1024
junk =  (("%%0%dX" % (junk_len * 2)) % random.getrandbits(junk_len * 8)).decode("hex")
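
The .decode("hex") call only exists on Python 2 strings. On Python 3, a comparable result can be sketched with int.to_bytes (assuming Python 3.2+), which pads to a fixed size and so avoids the even-length issue altogether:

```python
import random

junk_len = 1024
# to_bytes pads with leading zero bytes, so junk is always junk_len bytes long.
junk = random.getrandbits(junk_len * 8).to_bytes(junk_len, "big")
```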
user115995