8

In a Java application, files are created whose filename is a UUID generated from a protein sequence (e.g. TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN) using the function UUID.nameUUIDFromBytes. For this sequence the result is the UUID c6a0deb5-0c4f-3961-9d19-3f0fde0517c2.

UUID.nameUUIDFromBytes doesn't take a namespace as a parameter, whereas uuid.uuid3 in Python does. According to "What namespace does the JDK use to generate a UUID with nameUUIDFromBytes?", the namespace should have been passed as part of the name, but it's no longer possible to change the Java code.

Is there a way to create a custom namespace in the Python code such that it will produce the same UUIDs as the Java code?

Jon

3 Answers

18

nameUUIDFromBytes takes only one parameter, which is supposed to be the concatenation of the namespace and the name, just as you say. The namespace is supposed to be a UUID, and as far as I know, no null value is defined for it.
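To make the mechanics concrete, here is a minimal Python sketch of what a name-based (version 3) UUID computation looks like when the whole input is a single byte array, as with nameUUIDFromBytes. The helper name `name_uuid_from_bytes` is hypothetical, and the sketch assumes the Java side hashed the UTF-8 bytes of the sequence:

```python
import hashlib
import uuid


def name_uuid_from_bytes(data: bytes) -> uuid.UUID:
    """Hypothetical helper: MD5 the input, then stamp version and variant bits."""
    digest = hashlib.md5(data).digest()  # 16 bytes
    # version=3 makes uuid.UUID overwrite the version and RFC 4122 variant bits.
    return uuid.UUID(bytes=digest, version=3)


# Assumption: the Java code passed the UTF-8 bytes of the sequence.
sequence = "TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN"
print(name_uuid_from_bytes(sequence.encode("utf-8")))
```

Because no namespace bytes are prepended here, this corresponds to hashing the name alone, which is what happens when the Java caller passes only the name.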

A "null uuid" can be passed to Python's uuid3 like this. This should work as long as the namespace has a bytes attribute (tested with Python 2 and 3):

import uuid

class NULL_NAMESPACE:
    bytes = b''
uuid.uuid3(NULL_NAMESPACE, 'TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN')
# returns: UUID('c6a0deb5-0c4f-3961-9d19-3f0fde0517c2')
André Laszlo
  • Superb. I was trying for ages to turn an empty string into bytes but couldn't find the trick. – Jon Jan 14 '15 at 09:37
  • In Python 3 it is no longer possible to mix Unicode strings with encoded strings - it will result in a `TypeError: Can't convert 'bytes' object to str implicitly`. I think that `bytes` should be a binary string: `bytes = ''.encode('utf-8')` – Jarek Przygódzki Jan 14 '15 at 09:43
  • @JarekPrzygódzki Good point. I changed `''` to `b''` for Python 3 compatibility. – André Laszlo Jan 14 '15 at 09:47
  • 2
    Rather than create the class, you can use some (somewhat ugly) code to do this inline: `type('', (), dict(bytes=b''))()` (from [this answer](https://stackoverflow.com/a/1123054) to another question). So then you can do this as a one liner if needed, e.g.: `python -c "import uuid; print uuid.uuid3(type('', (), dict(bytes=b''))(), 'TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN')"`. – Rik Jul 12 '17 at 21:00
  • 1
    @Rik - the creation of the class is quite a good approach if you ever need to reuse the constant in any place in your project but I agree with your solution also; as a combined solution you can have `class Numeric(Enum): NULL_NAMESPACE = type('', (), dict(bytes=b''))() uuid.uuid3(Numeric.NULL_NAMESPACE, 'TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN') # returns: UUID('c6a0deb5-0c4f-3961-9d19-3f0fde0517c2') ` – Mache Feb 07 '19 at 15:13
2

In case it is helpful: if you want to handle this on the Java side and include a namespace, you can use the following:

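import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

UUID namespaceUUID = UUID.fromString("9db60607-6b12-41eb-8848-eafd26681583");
String myString = "sometextinhere";
// Encode explicitly as UTF-8 so the bytes match what Python's uuid3 hashes.
byte[] nameBytes = myString.getBytes(StandardCharsets.UTF_8);

// 16 bytes for the namespace UUID, followed by the name bytes.
ByteBuffer buffer = ByteBuffer.wrap(new byte[16 + nameBytes.length]);
buffer.putLong(namespaceUUID.getMostSignificantBits());
buffer.putLong(namespaceUUID.getLeastSignificantBits());
buffer.put(nameBytes);

byte[] uuidBytes = buffer.array();

UUID myUUID = UUID.nameUUIDFromBytes(uuidBytes);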

This will provide the same output UUID as the following Python:

import uuid

namespaceUUID = uuid.UUID('9db60607-6b12-41eb-8848-eafd26681583')
myUUID = uuid.uuid3(namespaceUUID, 'sometextinhere')
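The reason the two sides agree can be checked from Python alone. The sketch below rebuilds the same byte layout the Java snippet above writes into its ByteBuffer (two big-endian 64-bit halves of the namespace, then the name bytes) and compares the result with uuid.uuid3. The variable names are illustrative, and UTF-8 encoding of the name is an assumption about the Java side:

```python
import hashlib
import struct
import uuid

namespace = uuid.UUID("9db60607-6b12-41eb-8848-eafd26681583")
name = "sometextinhere"

# Split the 128-bit namespace into the two halves that Java exposes via
# getMostSignificantBits() and getLeastSignificantBits().
msb = namespace.int >> 64
lsb = namespace.int & ((1 << 64) - 1)

# ByteBuffer writes longs big-endian by default; ">QQ" gives the same 16 bytes.
java_style_buffer = struct.pack(">QQ", msb, lsb) + name.encode("utf-8")

# MD5 the buffer and stamp the version 3 and variant bits.
manual = uuid.UUID(bytes=hashlib.md5(java_style_buffer).digest(), version=3)

print(manual)
print(manual == uuid.uuid3(namespace, name))
```

If the name contains non-ASCII characters, the comparison only holds when both sides encode it the same way.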
turbomerl
0

@turbomerl's answer is right, but the Python code had a few mistakes. Here is the corrected version:

import uuid

a = 'sometextinhere'
namespaceUUID = uuid.UUID('9db60607-6b12-41eb-8848-eafd26681583')
print(str(uuid.uuid3(namespaceUUID, a)))