How can I generate a unique ID in Python?

Question

I need to generate a unique ID based on a random value.

Can you be more specific on what kind of unique id. Does it need to be a number? or can it contain letters? Give some examples of the type of id. — MitMaro, Jul 31 '09 at 02:54
Possibly relevant, all objects have a unique id `id(my_object)`, or `id(self)`. That was sufficient for me considering everything that is in python is an object and has a numeric id; strings: `id('Hello World')` classes: `id('Hello World')`, everything has an id. — ThorSummoner, Oct 01 '15 at 22:06
Actually I was having trouble with the use of id, it seemed to have some relationship with the variable name, where variables with the same name were getting ids the same of the variable just replaced. Probably avoid using id unless you have good unit testing and are sure it behaves the way you want it to. — ThorSummoner, Oct 01 '15 at 22:56

score 203 · Answer 1 · edited Aug 25 '14 at 21:03


Perhaps uuid.uuid4() might do the job. See uuid for more information.


answered Jul 31 '09 at 02:54

Michael Aaron Safyan


29

Watch out, the library underlying that module is buggy and tends to fork a process that keeps FDs open. That's fixed in newer versions, but most people probably don't have that yet, so I generally avoid this module. Caused me major headaches with listen sockets... – Glenn Maynard Jul 31 '09 at 03:14
2

@Glenn: Any more details on which version is buggy? I'm using this in production code (and about to roll out to more uses of it in a newer release). Now I'm scared! – Matthew Schinckel Nov 06 '10 at 07:38
2

@Matthew: I don't know if it's since been fixed, but the uuid backend that uses uuidlib forked without closing FDs, so the TCP sockets I had open at the time would never be closed and I couldn't reopen the port later. I'd have to manually kill `uuidd` as root. I worked around this by setting `uuid._uuid_generate_time` and `uuid._uuid_generate_random` to None so the `uuid` module never used the native implementation. (That should really be an option anyway; generating V4 random UUIDs causing a daemon to be started is completely unnecessary.) – Glenn Maynard Nov 06 '10 at 11:22
48

It's been 5 years, is that still an issue? – Jim Hall Apr 02 '16 at 15:44
44

@GlennMaynard Still wondering 9 years later – PascalVKooten Apr 08 '18 at 10:18
1

It is also extremely slow. Generating ~1 million uids cost me around 30s. – Gellweiler Aug 08 '18 at 10:59
3

The issue seems to be with os.urandom on some systems. UUID(int=random.getrandbits(128), version=4) performs fine. – Gellweiler Aug 08 '18 at 11:19
3

From the future, wondering if the bug got resolved in the past. – Haris Nov 04 '20 at 13:57
Is this still an issue? – Mithun Kinarullathil Jan 21 '22 at 20:47
3

@MithunKinarullathil no issue now – Azhar Uddin Sheikh Sep 28 '22 at 10:09
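
The `UUID(int=random.getrandbits(128), version=4)` alternative mentioned in the comments above could be sketched roughly as follows. The helper name `fast_uuid4` is made up for illustration. Note the trade-off: `random` is not a cryptographically secure generator, so IDs built this way are predictable, whereas `uuid.uuid4()` draws its bytes from `os.urandom`.

```python
import random
import uuid

def fast_uuid4():
    """Build a version-4 UUID from the `random` module (hypothetical helper).

    random.getrandbits uses the Mersenne Twister, which is NOT
    cryptographically secure, so do not use these IDs as secrets.
    """
    # Passing version=4 makes the UUID constructor set the version
    # and variant bits on top of the 128 random bits.
    return uuid.UUID(int=random.getrandbits(128), version=4)

print(fast_uuid4())
```

Whether this is actually faster depends on the platform, so it is worth timing both variants before switching.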

DreadPirateShawn · Answer 2 · 2013-10-22T13:24:53.250

169

You might want Python's UUID functions:

21.15. uuid — UUID objects according to RFC 4122

For example:

import uuid
print(uuid.uuid4())

7d529dd4-548b-4258-aa8e-23e34dc8d43d
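
Depending on where the ID ends up (a URL, a database column, a log line), a different representation of the same UUID may be handier. A small sketch of the common ones, using only attributes of the standard `uuid.UUID` class:

```python
import uuid

new_id = uuid.uuid4()   # random (version 4) UUID

as_text = str(new_id)   # canonical form: 36 characters including hyphens
as_hex = new_id.hex     # 32 hex characters, no hyphens
as_int = new_id.int     # the same value as a 128-bit integer

print(as_text)
print(as_hex)
print(as_int)
```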

edited Oct 22 '13 at 13:24

answered Jul 31 '09 at 02:54


2

This should be the answer. More detailed ! – Mayur Mahajan May 19 '20 at 16:10
3

What if we want only a specific number of bits? – Muhammad Rafeh Atique Jul 08 '20 at 07:33
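
For the fixed-number-of-bits case raised in the last comment, one possible approach (not a UUID, just a random integer of the requested width) is the standard `secrets` module. The helper names `random_id` and `random_hex_id` are hypothetical. Keep in mind that fewer bits means collisions sooner: by the birthday bound they become likely after roughly 2**(bits/2) IDs.

```python
import secrets

def random_id(bits=64):
    """Return a random non-negative integer below 2**bits (hypothetical helper)."""
    return secrets.randbits(bits)

def random_hex_id(bits=64):
    """Same idea as a zero-padded hex string; assumes bits is a multiple of 4."""
    width = bits // 4  # one hex digit encodes 4 bits
    return format(secrets.randbits(bits), "0{}x".format(width))

print(random_id(32))
print(random_hex_id(64))
```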

SingleNegationElimination · Answer 3 · 2014-09-10T00:17:21.723

28

Unique and random are mutually exclusive. Perhaps you want this?

import random
def uniqueid():
    seed = random.getrandbits(32)
    while True:
        yield seed
        seed += 1

Usage:

import itertools

unique_sequence = uniqueid()
id1 = next(unique_sequence)
id2 = next(unique_sequence)
id3 = next(unique_sequence)
ids = list(itertools.islice(unique_sequence, 1000))

No two returned IDs are the same (unique), and the sequence starts from a randomized seed value.

edited Sep 10 '14 at 00:17

answered Jul 31 '09 at 04:04


7

This isn't unique. If I start it twice, and generate a million values each time, the chances of a collision between the two runs is significant. I'd have to store the last "seed" each time to avoid that--and then there's no point to having the seed; it's just a sequence generator. *Statistically* unique IDs typically are generated from random data; at least one class of UUIDs works that way. – Glenn Maynard Jul 31 '09 at 04:38
1

It is unique so long as each unique sequence comes from only a single invocation of uniqueid. there is no guarantee of uniqueness across generators. – SingleNegationElimination Jul 31 '09 at 05:14
15

With that condition, even a counter is unique. – Glenn Maynard Jul 31 '09 at 07:17
1

Is there something wrong with generating unique IDs by means of a counter? This is a common practice in database design. – SingleNegationElimination Jul 31 '09 at 17:30
Your usage wouldn't work that way (at least not in 2.7): you'd need to call unique_sequence.next(). – Gerald Senarclens de Grancy Aug 24 '11 at 19:30
There is nothing wrong with generating sequential unique ID numbers in a database, but that's typically because an atomic (uninterrupted) operation is performed to guarantee that the number returned is unique, so any database operation can rely on getting a unique ID for the context it is operating in. – Avery Payne Jul 26 '12 at 22:59

score 9 · Answer 4 · answered Aug 14 '13 at 18:07


Maybe this will work for you:

str(uuid.uuid4().fields[-1])[:5]


Pjl


2

could you elaborate on your resulting str – serup Feb 15 '16 at 12:32
5

Yeah, understanding the transformation and whether it is still unique would be nice – alisa Mar 15 '16 at 22:00
6

How does this ensure uniqueness? – Hassan Baig Sep 15 '17 at 11:13
@Pjl Thank you so much! It's a very nice way to get a unique ID. – Muhammad Rafeh Atique Jul 08 '20 at 07:36
10

This is not a good solution. See https://gist.github.com/randlet/65c3812c648517e365f1d774a0122d18 – randlet May 26 '21 at 01:43
how can I add a comment as an answer? – Bernardo Troncoso Dec 10 '21 at 19:03
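
On the uniqueness questions in the comments: the expression keeps at most five decimal digits, so it can produce at most 100,000 distinct strings. A minimal sketch that makes this concrete (the wrapper name `short_id` is made up for illustration):

```python
import uuid

def short_id():
    # The expression from the answer: the first five decimal digits of the
    # UUID's last field (a 48-bit integer).
    return str(uuid.uuid4().fields[-1])[:5]

# At most five decimal digits means at most 100,000 distinct strings,
# so 100,001 draws must contain at least one repeat (pigeonhole principle).
ids = [short_id() for _ in range(100_001)]
print(len(ids) - len(set(ids)), "duplicates")
```

In practice repeats show up far earlier than that, so this is only suitable where occasional collisions are acceptable.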

score 7 · Answer 5 · edited Feb 13 '14 at 12:00

import time
import random
import socket
import hashlib

def guid( *args ):
    """
    Generates a universally unique ID.
    Any arguments only create more randomness.
    """
    # Python 3: int replaces long, and the L literal suffix is gone
    t = int( time.time() * 1000 )
    r = int( random.random()*100000000000000000 )
    try:
        a = socket.gethostbyname( socket.gethostname() )
    except OSError:
        # if we can't get a network address, just imagine one
        a = random.random()*100000000000000000
    data = str(t)+' '+str(r)+' '+str(a)+' '+str(args)
    # hashlib needs bytes in Python 3, so encode the string first
    data = hashlib.md5(data.encode('utf-8')).hexdigest()

    return data

score 4 · Answer 6 · answered Apr 01 '13 at 10:46

Here is an implementation:

from datetime import datetime
import random

def __uniqueid__():
    """
      generate unique id with length 17 to 21.
      ensure uniqueness even with daylight savings events (clocks adjusted one-hour backward).

      if you generate 1 million ids per second during 100 years, you will generate
      2**25 (approx sec per year) * 10**6 (1 million id per sec) * 100 (years) = about 3.4 * 10**15 unique ids.

      with 17 digits (radix 16) id, you can represent 16**17 = 295147905179352825856 ids (around 2.9 * 10**20).
      In fact, as we need far less than that, we agree that the format used to represent id (seed + timestamp reversed)
      do not cover all numbers that could be represented with 35 digits (radix 16).

      if you generate 1 million id per second with this algorithm, it will increase the seed by less than 2**12 per hour
      so if a DST occurs and backward one hour, we need to ensure to generate unique id for twice times for the same period.
      the seed must be at least 1 to 2**13 range. if we want to ensure uniqueness for two hours (100% contingency), we need 
      a seed for 1 to 2**14 range. that's what we have with this algorithm. You have to increment seed_range_bits if you
      move your machine by airplane to another time zone or if you have a glucky wallet and use a computer that can generate
      more than 1 million ids per second.

      one word about predictability: this algorithm is absolutely NOT designed to generate unpredictable unique ids.
      you can add a sha-1 or sha-256 digest step at the end of this algorithm, but you will lose guaranteed uniqueness and enter the world of collision probabilities.
      hash algorithms ensure that the same id generated here always gives the same hash, but two different ids (a pair of ids) can
      produce the same hash with a very small probability. You would rather opt for a bijective function that maps a
      35 digits (or more) number to a 35 digits (or more) number, based on a block cipher and a secret key. read papers on breaking PRNG algorithms
      in order to be convinced that problems could occur as soon as you use the random library :)

      1 million id per second ?... on a Intel(R) Core(TM)2 CPU 6400 @ 2.13GHz, you get :

      >>> timeit.timeit(uniqueid,number=40000)
      1.0114529132843018

      an average of 40000 id/second
    """
    mynow=datetime.now
    sft=datetime.strftime
    # store old datetime each time in order to check if we generate during same microsecond (glucky wallet !)
    # or if daylight savings event occurs (when clocks are adjusted backward) [rarely detected at this level]
    old_time=mynow() # fake init - on very speed machine it could increase your seed to seed + 1... but we have our contingency :)
    # manage seed
    seed_range_bits=14 # max range for seed
    seed_max_value=2**seed_range_bits - 1 # seed could not exceed 2**nbbits - 1
    # get random seed
    seed=random.getrandbits(seed_range_bits)
    current_seed=str(seed)
    # producing new ids
    while True:
        # get current time 
        current_time=mynow()
        if current_time <= old_time:
            # previous id generated in the same microsecond or Daylight saving time event occurs (when clocks are adjusted backward)
            seed = max(1,(seed + 1) % seed_max_value)
            current_seed=str(seed)
        # generate new id (concatenate seed and timestamp as numbers)
        #newid=hex(int(''.join([sft(current_time,'%f%S%M%H%d%m%Y'),current_seed])))[2:-1]
        newid=int(''.join([sft(current_time,'%f%S%M%H%d%m%Y'),current_seed]))
        # save current time
        old_time=current_time
        # return a new id
        yield newid

""" you get a new id for each call of uniqueid() """
# Python 3: generators expose __next__ instead of the Python 2 next method
uniqueid=__uniqueid__().__next__

import unittest
class UniqueIdTest(unittest.TestCase):
    def testGen(self):
        for _ in range(3):
            m=[uniqueid() for _ in range(10)]
            self.assertEqual(len(m),len(set(m)),"duplicates found !")

hope it helps !

score 4 · Answer 7 · answered Mar 14 '14 at 12:36

This will work very quickly, but it will not generate random values; it generates monotonically increasing ones (for a given thread).

import threading

_uid = threading.local()
def genuid():
    if getattr(_uid, "uid", None) is None:
        _uid.tid = threading.current_thread().ident
        _uid.uid = 0
    _uid.uid += 1
    return (_uid.tid, _uid.uid)

It is thread safe, and working with tuples may have benefits over strings (they are shorter, if nothing else). If you do not need thread safety, feel free to remove the threading bits: instead of threading.local, use a plain attribute container such as types.SimpleNamespace() (a bare object() does not accept new attributes) and drop tid altogether.
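
A rough way to see the per-thread behaviour is to run the generator from several threads at once and check that no tuple repeats. The sketch below repeats `genuid` so it is self-contained; the `worker` function and the barrier are additions for the demonstration. One caveat worth stating: Python may recycle a thread identifier after a thread exits, so the `(tid, uid)` pairs are only guaranteed distinct among threads that are alive at the same time, which is why the barrier is there.

```python
import threading

_uid = threading.local()

def genuid():
    # Same generator as above: one counter per thread, tagged with the thread id.
    if getattr(_uid, "uid", None) is None:
        _uid.tid = threading.current_thread().ident
        _uid.uid = 0
    _uid.uid += 1
    return (_uid.tid, _uid.uid)

NUM_THREADS = 4
IDS_PER_THREAD = 1000

results = []                               # list.append is atomic in CPython
barrier = threading.Barrier(NUM_THREADS)   # keeps all threads alive together

def worker():
    barrier.wait()                         # all idents are distinct from here on
    for _ in range(IDS_PER_THREAD):
        results.append(genuid())

threads = [threading.Thread(target=worker) for _ in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(results), len(set(results)))
```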

Hope that helps.

score 3 · Answer 8 · answered Jul 31 '09 at 02:54


Maybe the uuid module?

answered Jul 31 '09 at 02:54

zenazn

