32

I want to generate a 5-character string (key) for each user on my website, something like a BBM PIN.

The key will contain numbers and uppercase English letters:

  • AU1B7
  • Y56AX
  • M0K7A

How can I also be confident that the strings are unique, even if I generate millions of them?

In the most pythonic way possible, how can I do this?

Yax
  • 1
    `base64.b32encode('12345') == 'GEZDGNBV'` – Paulo Scardine Sep 25 '14 at 05:02
  • 1
    see also https://github.com/jbittel/base32-crockford – Paulo Scardine Sep 25 '14 at 05:06
  • 2
    You can generate them with `django.utils.crypto.get_random_string(5, string.ascii_uppercase+string.digits)`. You might want to restrict the set of characters so that you don't generate potentially confusing strings such as `l1I1l` which might be indecipherable in certain fonts. Uniqueness is going to require that you persist the set of allocated strings. – mhawke Sep 25 '14 at 05:25
  • 1
    You can't have both "random" *and* "unique". Replace the "random" with "random-looking". – Ignacio Vazquez-Abrams Sep 25 '14 at 06:37
  • @IgnacioVazquez-Abrams: Why didn't you register that observation instead of your conclusion? It's misleading. – Yax Sep 25 '14 at 06:39
  • 3
    Because I am the crusher of dreams, the bringer of woe and despair. WOE AND DESPAIR. – Ignacio Vazquez-Abrams Sep 25 '14 at 06:49
  • @IgnacioVazquez-Abrams: I was at your site, your book. This isn't WOE AND DESPAIR. Those who bring woe and despair do it the right way. They are not angry. You just have blown up digits. Not all have their lives tied to coding, some do it for the fun of it. – Yax Sep 25 '14 at 06:56

12 Answers

70

My favourite is

import uuid 
uuid.uuid4().hex[:6].upper()

If you are using Django, you can set the unique constraint on this field to make sure it is unique. https://docs.djangoproject.com/en/dev/ref/models/fields/#django.db.models.Field.unique

zzart
  • 2
    +1 for the length of the code. But I was curious about how you are sure that you are generating unique values ? Doesn't it generate repeated values ? – d-coder Sep 25 '14 at 07:30
  • Sure there could be collisions! http://en.wikipedia.org/wiki/Universally_unique_identifier#Random_UUID_probability_of_duplicates The longer the uuid, the slimmer the chances of collision. Basically there needs to be some further logic to it, as it all depends on HOW you are going about generating and storing them. – zzart Sep 25 '14 at 10:06
  • 11
    This isn't going to generate AU1B7, Y56AX, nor M0K7A as alpha characters are limited to A..F. It's not going to generate guaranteed unique values either. You could add the code to show how to handle collisions. – mhawke Sep 25 '14 at 10:10
  • 1
    You could handle collisions like [in this SO question](http://stackoverflow.com/questions/3928016/) – gatlanticus Apr 26 '17 at 23:51
  • 1
    This generates a hex string. You literally can't generate millions (plural) with it. But you can generate a million. – Caveman Dec 10 '19 at 17:02
21

From Python 3.6 you can use the secrets module to generate nice random strings. https://docs.python.org/3/library/secrets.html#module-secrets

import secrets
# token_hex(5) reads 5 random bytes and returns them as 10 hex characters
print(secrets.token_hex(5))
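
Since token_hex() only yields the characters 0-9 and a-f, here is a minimal sketch of getting exactly 5 characters from uppercase letters and digits with the same module. The names ALPHABET and make_key are made up for this illustration, and uniqueness still has to be checked separately.

```python
import secrets
import string

# Assumed alphabet for illustration: uppercase letters plus digits (36 symbols)
ALPHABET = string.ascii_uppercase + string.digits

def make_key(length=5):
    """Return a random key drawn from ALPHABET using the OS entropy source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

print(make_key())
```
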
Michael Stachura
  • 2
    This generates a hex string. You literally can't generate millions (plural) with it. But you can generate a million. – Caveman Dec 10 '19 at 17:02
19

A more secure and shorter way of doing this is to use Django's crypto module.

from django.utils.crypto import get_random_string
code = get_random_string(5)

The get_random_string() function returns a securely generated random string and uses the secrets module under the hood.

You can also pass allowed_chars:

from django.utils.crypto import get_random_string
import string

code = get_random_string(5, allowed_chars=string.ascii_uppercase + string.digits)
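
A comment on the question suggests restricting the character set so that look-alike characters do not end up in a key. Below is a rough sketch of that idea using only the standard library rather than Django; the choice of which characters to drop (I, O, 0 and 1) and the names UNAMBIGUOUS and readable_key are assumptions for illustration. The same string could be passed as allowed_chars.

```python
import secrets
import string

# Assumption: treat I/1 and O/0 as the confusable pairs and drop all four
CONFUSABLE = set("IO01")
UNAMBIGUOUS = "".join(
    ch for ch in string.ascii_uppercase + string.digits if ch not in CONFUSABLE
)

def readable_key(length=5):
    """Return a random key that avoids the confusable characters."""
    return "".join(secrets.choice(UNAMBIGUOUS) for _ in range(length))

print(len(UNAMBIGUOUS))        # 32 symbols remain
print(len(UNAMBIGUOUS) ** 5)   # number of distinct 5-character keys
print(readable_key())
```

The trade-off is a smaller key space: 32 symbols instead of 36.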
rockikz
Reza Abbasi
4

I'm not sure about any short, cryptic ways, but it can be implemented with a simple, straightforward function, assuming that you save all the generated strings in a set:

import random

def generate(unique):
    chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890"
    while True:
        value = "".join(random.choice(chars) for _ in range(5))
        # Retry until we hit a value that has not been issued yet
        if value not in unique:
            unique.add(value)
            return value

unique = set()
for _ in range(10):
    print(generate(unique))
Elias Zamaria
nu11p01n73R
  • 1
    Just to highlight that it's stated on Python's `secrets` [module page](https://docs.python.org/3/library/secrets.html#module-secrets): "In particular, secrets should be used in preference to the default pseudo-random number generator in the random module, which is designed for modelling and simulation, not security or cryptography." – Shadi Sep 30 '19 at 10:08
  • Upvote for using python sets for a collection of unique elements. – Caveman Dec 10 '19 at 17:00
4

If you can afford to lose '0', '1', '8' and '9' in the generated keys (the Base32 alphabet is A-Z and 2-7), there is a very pythonic solution to getting a random key.

import os
import base64

base64.b32encode(os.urandom(3))[:5].decode('utf-8')

Since you are going for uniqueness, you have a problem: 36 * 36 * 36 * 36 * 36 = 60'466'176 possible keys, so random generation will almost certainly produce collisions if you generate millions. A set gives fast membership checks, so we keep the issued keys in one:

some_set = set()

def generate():
    return base64.b32encode(os.urandom(3))[:5].decode('utf-8')

def generate_unique():
    string = generate()
    while string in some_set:
        string = generate()
    some_set.add(string)
    return string
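
To put a number on the collision claim above, here is a small sketch of the standard birthday-problem approximation, assuming keys are drawn uniformly at random from all 36^5 possibilities. The function name collision_probability is made up for this example.

```python
import math

SPACE = 36 ** 5  # 60,466,176 possible 5-character keys

def collision_probability(n_keys, space=SPACE):
    """Approximate chance of at least one duplicate among n_keys random keys.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2 * space)).
    """
    return -math.expm1(-n_keys * (n_keys - 1) / (2 * space))

for n in (1_000, 10_000, 100_000, 1_000_000):
    print(n, round(collision_probability(n), 4))
```

Under that assumption, a duplicate becomes more likely than not somewhere around ten thousand keys, well before the first million, which is why the retry loop (or a database constraint) is needed.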

However, since uniqueness is usually more important, I'd recommend generating a unique code for each of the numbers from 0 to 36^5 - 1. We can use a large prime and a modulo to make a pseudo-random number like this. Because the multiplier shares no factor with 36^5, every number maps to a different code.


prime_number = 60466181
# 36 symbols, so every possible value of `hashed % 36` maps to a character
characters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890'

def num_to_code(n: int) -> str:
    string = ''
    hashed = hash_number(n)
    # Emit five base-36 digits, least significant first
    for _ in range(5):
        charnumber = hashed % 36
        hashed //= 36
        string += characters[charnumber]
    return string

def hash_number(n: int, rounds: int = 20) -> int:
    # Repeatedly multiply by the prime modulo 36 ** 5 to scramble the number
    if rounds <= 0:
        return n
    hashed = (n * prime_number) % (36 ** 5)
    return hash_number(hashed, rounds - 1)

if __name__ == '__main__':
    code = num_to_code(1)
    print(code)

Here are the results of generating codes for 0-5. The same inputs will always produce the same sequence.

0 AAAAA (easily fixable ofc)
1 ZGQR9
2 ON797
3 DUMQ6
4 31384
5 R8IP3
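
As a sanity check on the idea, the sketch below restates the scheme in closed form and adds a decoder. It assumes a 36-character alphabet with '0' appended at the end, and it needs Python 3.8+ for the modular inverse via pow(). The names encode, decode, MULTIPLIER and INVERSE are invented for this illustration. A code that can be decoded back to its number cannot collide with another one.

```python
MODULUS = 36 ** 5
# Assumption: 36 symbols, with '0' appended so that index 35 is valid
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890"

# Twenty rounds of "multiply by the prime, reduce mod 36**5" collapse into
# a single multiplication by prime**20 mod 36**5.
MULTIPLIER = pow(60466181, 20, MODULUS)
# The inverse exists because the multiplier is coprime to 36**5 (Python 3.8+).
INVERSE = pow(MULTIPLIER, -1, MODULUS)

def encode(n):
    """Map an integer in [0, 36**5) to a 5-character code."""
    if not 0 <= n < MODULUS:
        raise ValueError("n is outside the range that 5 characters can hold")
    hashed = (n * MULTIPLIER) % MODULUS
    chars = []
    for _ in range(5):
        hashed, digit = divmod(hashed, 36)
        chars.append(ALPHABET[digit])  # least significant digit first
    return "".join(chars)

def decode(code):
    """Recover the integer that produced a code."""
    hashed = 0
    for ch in reversed(code):
        hashed = hashed * 36 + ALPHABET.index(ch)
    return (hashed * INVERSE) % MODULUS

print(encode(1), decode(encode(1)))
```

Because the mapping is reversible, the codes are scrambled rather than secret: anyone who knows the constants can recover the counter.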
Caveman
3

If you have a way of associating each user with a unique ID (for example, the primary key in Django or Flask), you can do something like this:

Note: This does not generate a fixed length.

We left-pad the user_id with zeros to five digits to keep the generated length more or less constant

import base64

user_id = 1

# left-pad the ID with zeros to at least five characters
number_generate = str(user_id).rjust(5,"0")

# Base32-encode the padded ID and strip any '=' padding
base64.b32encode(bytes(number_generate, 'utf-8')).decode('utf-8').replace('=','')
Erisan Olasheni
1
import random
import string

size = 5
''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(size))

This will generate a short code, but codes can be duplicated, so check that each one is unique in your database before saving.

def generate(size=5):
    code = ''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(size))
    # check_if_duplicate() stands for your own lookup against the database
    if check_if_duplicate(code):
        return generate(size=size)
    return code

Or use Django's unique constraint and handle the exceptions.
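
The "let the constraint reject duplicates and retry" approach can be sketched without Django. The example below uses the standard-library sqlite3 module as a stand-in for the ORM; the table layout, the retry limit and the names new_code and insert_user are all assumptions for illustration. In Django the equivalent exception would be django.db.IntegrityError.

```python
import secrets
import sqlite3
import string

ALPHABET = string.ascii_uppercase + string.digits

def new_code(length=5):
    """Generate a random candidate code; it may already be taken."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def insert_user(conn, name, max_attempts=10):
    """Insert a user, retrying with a fresh code if the UNIQUE column rejects it."""
    for _ in range(max_attempts):
        code = new_code()
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO users (name, code) VALUES (?, ?)", (name, code)
                )
            return code
        except sqlite3.IntegrityError:
            continue  # duplicate code, try another one
    raise RuntimeError("could not allocate a unique code")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT NOT NULL, code TEXT NOT NULL UNIQUE)")
codes = [insert_user(conn, f"user{i}") for i in range(1000)]
print(len(codes), len(set(codes)))
```

Letting the database enforce uniqueness avoids the race between "check" and "save" that a separate duplicate lookup has when several requests run at once.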

Tianqi Tong
1

There is a function in django that does what you're looking for (credits to this answer):

Django provides the function get_random_string() which will satisfy the alphanumeric string generation requirement. You don't need any extra package because it's in the django.utils.crypto module.

>>> from django.utils.crypto import get_random_string
>>> unique_id = get_random_string(length=32)
>>> unique_id
u'rRXVe68NO7m3mHoBS488KdHaqQPD6Ofv'

You can also vary the set of characters with allowed_chars:

>>> short_genome = get_random_string(length=32, allowed_chars='ACTG')
>>> short_genome
u'CCCAAAAGTACGTCCGGCATTTGTCCACCCCT'
Vedant Agarwala
1

I have a unique field named 'systemCode' in a lot of my models. I generate it in code, but sometimes it takes its value from user input, so I have to check the value before saving and, if it matches an existing one, regenerate it as a unique value.

This is how I generate unique strings in this scenario:


This is my standard model class:

class ClassOne(models.Model):
   name = models.CharField(max_length=100)
   systemCode = models.CharField(max_length=25, blank=True, null=True, unique=True)
   ....

I use the save() method to generate the systemCode and check that it is unique:

    def save(self, *args, **kwargs):
        systemCode = self.systemCode
        if not systemCode:
            systemCode = uuid.uuid4().hex[:6].upper()
        while ClassOne.objects.filter(systemCode=systemCode).exclude(pk=self.pk).exists():
            systemCode = uuid.uuid4().hex[:6].upper()
        self.systemCode = systemCode
        super(ClassOne, self).save(*args, **kwargs)

But I have the same systemCode field in all my models, so I use a function to generate the value.

This is how to generate a unique value for all models using the saveSystemCode() function:

import uuid 

def saveSystemCode(inClass, inCode, inPK, prefix):
    systemCode = inCode
    if not systemCode:
        systemCode = uuid.uuid4().hex[:6].upper()

    while inClass.objects.filter(systemCode=systemCode).exclude(pk=inPK).exists():
        systemCode = uuid.uuid4().hex[:6].upper()

    return systemCode

class ClassOne(models.Model):
    name = models.CharField(max_length=100)
    systemCode = models.CharField(max_length=25, blank=True, null=True, unique=True)
    ....

    def save(self, *args, **kwargs):
        self.systemCode = saveSystemCode(ClassOne, self.systemCode, self.pk, 'one_')
        super(ClassOne, self).save(*args, **kwargs)


class ClassTwo(models.Model):
    name = models.CharField(max_length=100)
    systemCode = models.CharField(max_length=25, blank=True, null=True, unique=True)
    ....

    def save(self, *args, **kwargs):
        self.systemCode = saveSystemCode(ClassTwo, self.systemCode, self.pk, 'two_')
        super(ClassTwo, self).save(*args, **kwargs)

class ClassThree(models.Model):
    name = models.CharField(max_length=100)
    systemCode = models.CharField(max_length=25, blank=True, null=True, unique=True)
    ....

    def save(self, *args, **kwargs):
        self.systemCode = saveSystemCode(ClassThree, self.systemCode, self.pk, 'three_')
        super(ClassThree, self).save(*args, **kwargs)

The while loop in the 'saveSystemCode' function prevents the same value from being saved again.

mevaka
1

To generate a unique one, you can use the command below:

import uuid 
str(uuid.uuid1())[:5]
Eslamspot
1

Here is a solution that generates codes of length 5 (or any other length) and writes them to a file:

import shortuuid as su
n = int(input("# codes to gen: "))
l = int(input("code length: "))
shou = su.ShortUUID(alphabet="QWERTYUIOPASDFGHJKLZXCVBNM0123456789")
codes = set()
LEN_CNT = 0

with open('file.txt', 'w') as file:
    while len(codes) < n:
        cd = shou.random(length=l)
        codes.add(cd)
        if len(codes) > LEN_CNT:
            LEN_CNT = len(codes)
            file.write(f"{cd}\n")

(shortuuid sometimes generates duplicate codes, so I use a set to deal with that)

jarh1992
0

At the time of writing this answer, there is an actively maintained package that generates short UUIDs:

https://github.com/skorokithakis/shortuuid

For Django support, have a look here:

https://github.com/skorokithakis/shortuuid#django-field

Yaser Alnajjar