For a project, I need a method of creating thousands of random strings while keeping collisions low. I'm looking for them to be only 12 characters long and uppercase only. Any suggestions?
-
You mean you don't want any lowercase digits? – martineau Aug 19 '13 at 17:02
-
Hmm, yeah, that should be clarified :) – Maarten Bodewes Aug 19 '13 at 17:02
-
Don't forget to read this page about [the default random number generator in python](http://docs.python.org/2/library/random.html). The chance of collisions seems to be fully dependent on the size of the "random strings", but that does not mean that an attacker cannot re-create the random numbers; the random numbers generated are *not cryptographically secure*. – Maarten Bodewes Aug 19 '13 at 17:10
-
Hah, right. I meant alphanumeric. – Brandon Aug 20 '13 at 15:08
7 Answers
CODE:
from random import choice
from string import ascii_uppercase
print(''.join(choice(ascii_uppercase) for i in range(12)))
OUTPUT:
5 examples:
QPUPZVVHUNSN
EFJACZEBYQEB
QBQJJEEOYTZY
EOJUSUEAJEEK
QWRWLIWDTDBD
EDIT:
If you need only digits, use the `digits` constant instead of the `ascii_uppercase` one from the `string` module.
3 examples:
229945986931
867348810313
618228923380
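To put a rough number on "keeping collisions low", the birthday approximation can be applied to the 26^12 possible strings. This is only a sketch: `collision_probability` is a hypothetical helper name, and it assumes every string is drawn uniformly and independently, which the `random` module only approximates.

```python
import math
from string import ascii_uppercase


def collision_probability(n_strings: int, alphabet_size: int, length: int) -> float:
    """Approximate chance that at least two of n_strings random strings coincide."""
    space = alphabet_size ** length  # number of distinct possible strings
    # Birthday approximation: p ~ 1 - exp(-n * (n - 1) / (2 * N))
    return -math.expm1(-n_strings * (n_strings - 1) / (2 * space))


for n in (1_000, 10_000, 1_000_000):
    p = collision_probability(n, len(ascii_uppercase), 12)
    print(f"{n:>9} strings: p ~ {p:.3e}")
```

A small probability is still not a guarantee of uniqueness; if duplicates must never occur, the generated values have to be checked against those already issued.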

-
yeah, well this is misleading: *"12 digits long and uppercase"* -- since digits can't be uppercased – Peter Varo Aug 19 '13 at 17:01
-
And if you need alphanumeric, i.e. ASCII uppercase plus digits, then `from string import digits` and `print(''.join(choice(ascii_uppercase + digits) for i in range(12)))` – Sandeep Kanabar Jan 05 '17 at 12:45
-
Does this give a unique ID each time? What if I call this function from multiple threads (e.g. 2 of them) 10000 times? What is the probability of a collision, i.e. getting the same ID at a given point in time? – AnilJ Sep 06 '17 at 22:43
-
@AnilJ for further info on how the `random` module is working, please read the official documentation on it: https://docs.python.org/3/library/random.html – Peter Varo Sep 07 '17 at 07:44
-
Well, digits is not on Python3. You can use `string.hexdigits` to get a mix of '0123456789abcdefABCDEF', or just `string.digits + string.ascii_letters` for all letters. – goetz Oct 31 '17 at 01:20
-
-
@PeterVaro Few years late, but can you elaborate on that ? I do not understand how a digit can be uppercased. – Itération 122442 Jul 26 '21 at 15:09
With Django, you can use the `get_random_string` function in the `django.utils.crypto` module.
get_random_string(length=12,
allowed_chars=u'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789')
Returns a securely generated random string.
The default length of 12 with the a-z, A-Z, 0-9 character set returns
a 71-bit value. log_2((26+26+10)^12) =~ 71 bits
Example:
get_random_string()
u'ngccjtxvvmr9'
get_random_string(4, allowed_chars='bqDE56')
u'DDD6'
But if you don't want to depend on Django, here is a standalone version of that code:
Code:
import random
import hashlib
import time

SECRET_KEY = 'PUT A RANDOM KEY WITH 50 CHARACTERS LENGTH HERE !!'

try:
    random = random.SystemRandom()
    using_sysrandom = True
except NotImplementedError:
    import warnings
    warnings.warn('A secure pseudo-random number generator is not available '
                  'on your system. Falling back to Mersenne Twister.')
    using_sysrandom = False

def get_random_string(length=12,
                      allowed_chars='abcdefghijklmnopqrstuvwxyz'
                                    'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'):
    """
    Returns a securely generated random string.
    The default length of 12 with the a-z, A-Z, 0-9 character set returns
    a 71-bit value. log_2((26+26+10)^12) =~ 71 bits
    """
    if not using_sysrandom:
        # This is ugly, and a hack, but it makes things better than
        # the alternative of predictability. This re-seeds the PRNG
        # using a value that is hard for an attacker to predict, every
        # time a random string is required. This may change the
        # properties of the chosen random sequence slightly, but this
        # is better than absolute predictability.
        random.seed(
            hashlib.sha256(
                ("%s%s%s" % (
                    random.getstate(),
                    time.time(),
                    SECRET_KEY)).encode('utf-8')
            ).digest())
    return ''.join(random.choice(allowed_chars) for i in range(length))
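On Python 3.6 and later, the standard-library `secrets` module offers a shorter route to the same goal. The following is a minimal sketch, not Django's implementation; the name `secure_random_string` and its uppercase-plus-digits default are assumptions chosen to match the question.

```python
import secrets
from string import ascii_uppercase, digits


def secure_random_string(length: int = 12,
                         allowed_chars: str = ascii_uppercase + digits) -> str:
    """Build a string by picking each character with the OS-backed CSPRNG."""
    return ''.join(secrets.choice(allowed_chars) for _ in range(length))


print(secure_random_string())                    # 12 uppercase letters and digits
print(secure_random_string(8, ascii_uppercase))  # 8 uppercase letters only
```

Because `secrets` draws from the operating system's randomness source, no `SECRET_KEY` or re-seeding fallback is involved in this variant.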

Could make a generator:
from string import ascii_uppercase
import random
from itertools import islice

def random_chars(size, chars=ascii_uppercase):
    # Endless stream of single random characters; the object() sentinel never matches.
    selection = iter(lambda: random.choice(chars), object())
    while True:
        yield ''.join(islice(selection, size))

random_gen = random_chars(12)
print(next(random_gen))
# LEQIITOSJZOQ
print(next(random_gen))
# PXUYJTOTHWPJ
Then just pull from the generator when they're needed, either with `next(random_gen)` one at a time, or in bulk with, for instance, `random_200 = list(islice(random_gen, 200))`.

-
@martineau can take one at a time, set up ones with different variables, can slice off to take n many at a time etc... The main difference is that it's in effect an iterable itself, instead of repeatedly calling a function... – Jon Clements Aug 19 '13 at 17:12
-
-
`functools.partial` can fix parameters, and `list(itertools.islice(gen, n))` isn't any better than `[func() for _ in xrange(n)]` – user2357112 Aug 19 '13 at 17:58
-
@user2357112 by building a generator, there's an advantage over resuming its state, than setting up and calling up a function repeatedly... Also the `list` and `islice` will work at the implementation level instead of as a list-comp that could leak its `_` (in Py 2.x) variable and has to build an unnecessary range constraint that's otherwise handled... Also, it's also harder to build on top of functions, rather than streams... – Jon Clements Aug 19 '13 at 18:05
-
Resuming a generator's state vs calling a function repeatedly isn't an advantage, and if you want to set up fixed parameters, `functools.partial` can do that. The fact that `list` and `islice` are in C would be an advantage if there weren't a Python-level generator and several Python-level function calls in the inner loop. Leaking the loop variable is annoying, but no reason to avoid using list comprehensions. – user2357112 Aug 19 '13 at 18:14
-
If you use a generator, getting a single random string is `next(random_chars(n))`, whereas with a regular function it's just `random_chars(n)`. Looping over `k` random strings is `for s in islice(random_chars(n), k):`, whereas with a regular function, it's `for i in xrange(k): s = random_chars(n)`. I find the `islice` and `next` calls to be warning signs that you don't actually want a generator here. – user2357112 Aug 19 '13 at 18:19
-
@user2357112 depends on the use-case... I was just offering another option... If it's to associate a userid in a file (for instance) with a random password, then `dict(zip(fileobj, random_gen))` is perhaps better than using a dict comp with a call() as the value). If it's going to be arbitrarily used then I'd go for the approach already suggested, but what's the point of offering a duplicate answer ;) – Jon Clements Aug 19 '13 at 18:26
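#!/bin/python3
import random
import string

def f(n: int) -> str:
    # choices() over a bytes object yields ints; bytes() packs them in one step.
    return bytes(random.choices(string.ascii_uppercase.encode('ascii'), k=n)).decode('ascii')

This runs faster for very large n, since it avoids string concatenation.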

For cryptographically strong pseudo-random bytes you might use the pyOpenSSL wrapper around OpenSSL.
It provides the `bytes` function to gather a pseudo-random sequence of bytes.
from OpenSSL import rand
b = rand.bytes(7)
BTW, 12 uppercase letters is a little bit more than 56 bits of entropy. You will only have to read 7 bytes.
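The step from random bytes to 12 letters is not shown above. One way to sketch it with only the standard library (swapping `secrets` in for pyOpenSSL, which is an assumption rather than this answer's method) is to draw one integer below 26^12 and write it in base 26. Since 26^12 is about 56.4 bits, slightly more than the 56 bits in 7 bytes, drawing directly below 26^12 keeps every string equally likely. `random_uppercase` is a hypothetical name.

```python
import math
import secrets
from string import ascii_uppercase


def random_uppercase(length: int = 12) -> str:
    """Draw one uniform integer below 26**length and encode it in base 26."""
    base = len(ascii_uppercase)
    value = secrets.randbelow(base ** length)  # uniform over all possible strings
    chars = []
    for _ in range(length):
        value, digit = divmod(value, base)  # peel off one base-26 digit
        chars.append(ascii_uppercase[digit])
    return ''.join(chars)


print(f"entropy: {math.log2(26 ** 12):.1f} bits")
print(random_uppercase())
```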

-
Wouldn't 12 randomly selected uppercase letters correspond to ~56.4 bits worth of entropy? – DSM Aug 19 '13 at 17:40
This function generates a random string of UPPERCASE letters with the specified length,
e.g. length = 6 will generate a random sequence like the following:
YLNYVQ
import random as r

def generate_random_string(length):
    random_string = ''
    random_str_seq = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    for i in range(0, length):
        # Never true for i in range(length), so no '-' separator is ever inserted.
        if i % length == 0 and i != 0:
            random_string += '-'
        random_string += str(random_str_seq[r.randint(0, len(random_str_seq) - 1)])
    return random_string

-
With above code `random_str_seq = "ABC@#$%^!&_+|*()OPQRSTUVWXYZ"` can give you even more complex results. – Iqra. Jan 18 '19 at 11:39
A random generator function without duplicates, using a set to store the values that have already been generated. Note that this costs some memory for very long strings or large amounts, and it will probably slow down a bit. The generator stops at the given amount or when the maximum number of possible combinations is reached.
Code:
#!/usr/bin/env python
from typing import Generator
from random import SystemRandom as RND
from string import ascii_uppercase, digits

def string_generator(size: int = 1, amount: int = 1) -> Generator[str, None, None]:
    """
    Return x random strings of a fixed length.
    :param size: string length, defaults to 1
    :type size: int, optional
    :param amount: amount of random strings to generate, defaults to 1
    :type amount: int, optional
    :yield: Yield composed random string if unique
    :rtype: Generator[str, None, None]
    """
    CHARS = list(ascii_uppercase + digits)
    LIMIT = len(CHARS) ** size
    count, check, string = 0, set(), ''
    while LIMIT > count < amount:
        string = ''.join(RND().choices(CHARS, k=size))
        if string not in check:
            check.add(string)
            yield string
            count += 1

for my_count, my_string in enumerate(string_generator(12, 20)):
    print(my_count, my_string)
Output:
0 IESUASWBRHPD
1 JGGO1THKLC9K
2 BW04A5GWBA7K
3 KDQTY72BV1S9
4 FAOL5L28VVMN
5 NLDNNBGHTRTI
6 2RV6TE6BCQ8K
7 B79B8FBPUD07
8 89VXXRHPUN41
9 DFC8QJUY6HRB
10 FXYYDKVQHC5Z
11 57KTZE67RSCU
12 389H1UT7N6CI
13 AKZMN9XITAVB
14 6T9ACH3GDAYG
15 CH8RJUQMTMBE
16 SPQ7E02ZLFD3
17 YD6JFXGIF3YF
18 ZUSA2X6OVNCN
19 JQRH6LR229Y4
