56

I need to generate unique 64-bit integers in Python. I've looked at the uuid module, but the UUIDs it generates are 128-bit integers, so that won't work.

Do you know of any way to generate unique 64-bit integers in Python? Thanks.

Continuation
  • 2
    How unique do they need to be? unique for that program, or unique for every ID ever generated by any program on any computer (which is what UUID gives you)? – Dave Kirby Aug 20 '10 at 11:56
  • Dave - these are document IDs. Every ID ever generated needs to be unique. I could have multiple servers, each running Python processes. – Continuation Aug 20 '10 at 20:34
  • Why not simply assign sequential numbers? They're unique. – S.Lott Aug 21 '10 at 18:04
  • 2
    @S.Lott - How do you coordinate different Python processes on different machines to assign sequential numbers? – Continuation Aug 22 '10 at 03:29
  • 1
    (1) Why does that matter? Is it a requirement? If it's a requirement, then why isn't this requirement in the question? (2) That's what database servers are for. – S.Lott Aug 23 '10 at 00:43

5 Answers

76

Just mask the 128-bit int:

>>> import uuid
>>> uuid.uuid4().int & (1<<64)-1
9518405196747027403L
>>> uuid.uuid4().int & (1<<64)-1
12558137269921983654L

These are more or less random, so you have a tiny chance of a collision.

Perhaps the first 64 bits of uuid1 are safer to use:

>>> uuid.uuid1().int>>64
9392468011745350111L
>>> uuid.uuid1().int>>64
9407757923520418271L
>>> uuid.uuid1().int>>64
9418928317413528031L

These are largely based on the clock, so they are much less random, but the uniqueness is better.
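For Python 3 (where integers no longer print with a trailing `L`), the two approaches above could be wrapped up roughly as follows. This is only a sketch: the names `MASK64`, `random_id64` and `time_id64` are made up for illustration and are not part of the `uuid` module. Note that the low 64 bits of a version 4 UUID include the two fixed variant bits, so 62 of those bits are actually random.

```python
import uuid

MASK64 = (1 << 64) - 1  # all of the low 64 bits set


def random_id64():
    """Low 64 bits of a random (version 4) UUID."""
    return uuid.uuid4().int & MASK64


def time_id64():
    """High 64 bits of a version 1 UUID (the timestamp fields plus the version)."""
    return uuid.uuid1().int >> 64


print(random_id64())
print(time_id64())
```

Both functions return non-negative integers below 2**64; whether either is unique enough depends on how many IDs are generated and on how many machines.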

John La Rooy
  • 3
    uuid1 reveals MAC address and time - uuid4 is more secure. – Lukas Cenovsky Oct 02 '10 at 18:55
  • 7
    Right-shifting by 64 bits removes the MAC address and time, leaving only the clock. – Glyph Jun 01 '12 at 17:10
  • @LukasCenovsky, The uuid1 will be more likely to be unique precisely for that reason. Depends whether security is required or not, but the trade off is that for uuid4, collisions will be more likely – John La Rooy Jul 08 '12 at 10:10
  • 1
    @JohnLaRooy the part with uuid1 is incorrect or misleading, because it creates UNSIGNED integers, not integers (an integer should be signed by default). I think that the correct way is something like this: int.from_bytes(uuid.uuid1().bytes, byteorder='big', signed=True) >> 64 – Stan Prokop Aug 14 '18 at 08:02
  • @JohnLaRooy, if you believe stanProkop improves your answer would you be willing to update it? My guess is that it is improved by the extra bit. – Robert Lugg Sep 13 '19 at 19:20
30

64 bits unique

What's wrong with counting? A simple counter will create unique values. This is the simplest and it's easy to be sure you won't repeat a value.
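One possible way to "just count" when several processes are involved is sketched below. It is an assumption-laden illustration, not a recommendation from the original answer: the class name `IdCounter`, the 16-bit/48-bit split and the `worker_id` parameter are all hypothetical, and each process would need its own `worker_id` assigned by hand or by configuration. Persisting the counter across restarts is not shown.

```python
import itertools
import threading


class IdCounter:
    """Hands out 64-bit IDs: a 16-bit worker number followed by a 48-bit counter."""

    def __init__(self, worker_id, start=0):
        if not 0 <= worker_id < (1 << 16):
            raise ValueError("worker_id must fit in 16 bits")
        self._prefix = worker_id << 48          # upper 16 bits identify the process
        self._counter = itertools.count(start)  # lower 48 bits simply count up
        self._lock = threading.Lock()           # keep increments safe across threads

    def next_id(self):
        with self._lock:
            n = next(self._counter)
        if n >= (1 << 48):
            raise OverflowError("counter exhausted")
        return self._prefix | n


ids = IdCounter(worker_id=3)
print(ids.next_id(), ids.next_id())
```

Two counters with different `worker_id` values can never produce the same ID, because the upper 16 bits differ. A database sequence achieves the same thing centrally.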

Or, if counting isn't good enough, try this.

>>> import random
>>> random.getrandbits(64)
5316191164430650570L

Depending on how you seed and use your random number generator, repeats should be rare, though random values are never guaranteed to be unique.

You can, of course, do this incorrectly and get a repeating sequence of random numbers. Take great care with how you handle seeds in a program that starts and stops.
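To get a feel for how likely a repeat is with purely random 64-bit values, the standard birthday approximation can be evaluated directly. The helper name `collision_probability` is made up for this sketch, and it assumes the values are uniformly distributed and independent.

```python
import math


def collision_probability(n, bits=64):
    """Approximate chance that n uniformly random `bits`-bit values contain a duplicate."""
    space = 2 ** bits
    # Birthday approximation: p ~ 1 - exp(-n * (n - 1) / (2 * space))
    return -math.expm1(-n * (n - 1) / (2 * space))


for n in (10**6, 10**9, 2**32):
    print(n, collision_probability(n))
```

Under these assumptions the chance of at least one duplicate is negligible for a million IDs, roughly 2.7% for a billion, and about 39% once 2**32 IDs have been generated.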

S.Lott
  • 1
    No matter how good your seeds are you are likely to get repeats after approximately 2^32 IDs have been generated if you use the getrandbits() method. – President James K. Polk Aug 21 '10 at 15:17
  • 2
    The sequence is theoretically longer. "It produces 53-bit precision floats and has a period of 2**19937-1." Why would getrandbits() not have the full period? Does it generate multiple numbers? Even if it generates 64 distinct values and uses only one bit, the resulting period would be 2^311. – S.Lott Aug 21 '10 at 18:03
  • How big is the seed? If you use the same seed you would get the same random numbers – dalore Jul 23 '15 at 21:17
  • 1
    How would you implement that "just count"? – buhtz Feb 16 '22 at 14:57
  • "Why would getrandbits() not have the full period?" It may have the full period, but there are only 2**64 distinct 64-bit integers, so you can't get a sequence of 2**19937-1 unique ones. Assuming a random distribution, you'd expect duplicates to start cropping up around the 2**32 mark. – towr Jul 05 '22 at 14:15
9

A 64-bit random number from the OS's random number generator rather than a PRNG:

>>> from struct import unpack; from os import urandom
>>> unpack("!Q", urandom(8))[0]
12494068718269657783L
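On Python 3 the same idea can be written without `struct`. The following is a small sketch using only standard-library calls; the variable names are illustrative.

```python
import os
import secrets

# Eight bytes from the OS generator, read as an unsigned big-endian integer
# (the same result as unpacking with the "!Q" format).
from_urandom = int.from_bytes(os.urandom(8), byteorder="big")

# The secrets module (Python 3.6+) draws on the same OS source.
from_secrets = secrets.randbits(64)

print(from_urandom, from_secrets)
```

Either value is a non-negative integer below 2**64. As with any random scheme, uniqueness is probabilistic rather than guaranteed.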
Glyph
2

You can use uuid4(), which generates a random 128-bit UUID. Right-shift (>>) the 128-bit integer by 64 bits, which keeps the upper 128 - 64 = 64 bits.

from uuid import uuid4

bit_size = 64
sized_unique_id = uuid4().int >> bit_size
print(sized_unique_id)
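One thing worth checking with this approach is how many of the remaining 64 bits are actually random. The upper half of a version 4 UUID contains the 4-bit version field, which is always 4, so the shifted value carries 60 random bits. The short check below illustrates this; the helper name `version_nibble` is made up for the example.

```python
from uuid import uuid4


def version_nibble(value64):
    """Bits 12-15 of the upper 64 bits hold the UUID version field."""
    return (value64 >> 12) & 0xF


samples = [uuid4().int >> 64 for _ in range(1000)]
print({version_nibble(s) for s in samples})  # → {4}
```

Every sample has the same value in those four bits, so they contribute nothing to uniqueness.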
Chuma Umenze
  • 1
    You'd do better to simply generate the bytes directly with e.g. `os.urandom(8)`, or `secrets.randbelow(2**64)`. For one thing, only 122 of the 128 bits of a uuid4 are randomly generated; the other 6 are fixed. Your method only gives you 60 random bits, not 64, which increases the chance of a random collision. – Mark Dickinson Nov 07 '18 at 18:59
  • the bit size is not 64. it's 60~64 – Xiao Jul 04 '20 at 03:34
0

Why not try this?

import uuid

uid = uuid.uuid1()

# Representations of uuid1()
print(repr(uid.bytes))  # k\x10\xa1n\x02\xe7\x11\xe8\xaeY\x00\x16>\x99\x0b\xdb
print(uid.int)          # 142313746482664936587190810281013480411
print(uid.hex)          # 6b10a16e02e711e8ae5900163e990bdb
  
sultanmyrza