I need to convert a GUID to a large integer. That part works, but during testing I noticed something that I would like explained, please ;)

If I do the following:

        var g = Guid.NewGuid();     // 86736036-6034-43c5-9b85-1c833837dbea
        var p = g.ToByteArray();
        var x = new BigInteger(p);  // -28104782885366703164142972435490971594

but if I do the same thing in Python, I get a different result:

        import uuid
        x = uuid.UUID('86736036-6034-43c5-9b85-1c833837dbea')
        print(x)
        print(x.int)  # 178715616993326703606264498842288774122

Can someone with better knowledge of both Python and .NET help explain this?

m1nkeh
  • I should learn to read the question – TheGeneral Jun 04 '18 at 11:14
  • It seems that the C# code is treating it as a signed integer, which I think doesn't make sense. – GolezTrol Jun 04 '18 at 11:15
  • indeed... even if i do get a positive output for x, it still does not match the python value.... all i am after is a 128bit numerical representation of the guid – m1nkeh Jun 04 '18 at 11:16
  • i should also add that this site: http://guid-convert.appspot.com/ returns the "python" value – m1nkeh Jun 04 '18 at 11:17
  • Yeah, but that's how integers work internally. Not so familiar with BigInteger, but I think C# still follows two's complement logic, and it will just treat the first bit as the sign bit. Maybe you can prepend an extra byte with value 0 in C#'s byte array to fool it into treating it as a positive number. – GolezTrol Jun 04 '18 at 11:18
  • My previous statement is confirmed by [The Docs](https://msdn.microsoft.com/en-us/library/dd268207(v=vs.110).aspx): *"The constructor expects positive values in the byte array to use sign-and-magnitude representation, and negative values to use two's complement representation. In other words, if the highest-order bit of the highest-order byte in value is set, the resulting BigInteger value is negative."* – GolezTrol Jun 04 '18 at 11:20
  • i *think* i follow, could you indulge me with a bit of pseudo-code? are you suggesting that i add a new element to the front of the byte array? how would this look :| – m1nkeh Jun 04 '18 at 11:21
  • Does [this](https://stackoverflow.com/questions/5649190/byte-to-unsigned-biginteger) help? – ProgrammingLlama Jun 04 '18 at 11:21
  • @john yeah, looks to be the same thing... i will give it a swirl! – m1nkeh Jun 04 '18 at 11:22
  • So, at the end not at the front. Little mixup because of the endianness, but it looks like John's link solves it all. – GolezTrol Jun 04 '18 at 11:24
  • Possible duplicate of [byte\[\] to unsigned BigInteger?](https://stackoverflow.com/questions/5649190/byte-to-unsigned-biginteger) – Christian Gollhardt Jun 04 '18 at 11:27
  • [`Guid.ToByteArray()`](https://msdn.microsoft.com/en-us/library/system.guid.tobytearray(v=vs.110).aspx) does *not* preserve the byte order, so it is incompatible with Python, whose `UUID.hex` *does* preserve the order of the bytes – Freggar Jun 04 '18 at 11:28
  • excellent, thanks @Freggar, so which is the more 'correct' ? if such a thing is possible... – m1nkeh Jun 04 '18 at 11:28
  • Possible duplicate of [Why does Guid.ToByteArray() order the bytes the way it does?](https://stackoverflow.com/questions/9195551/why-does-guid-tobytearray-order-the-bytes-the-way-it-does) – mjwills Jun 04 '18 at 11:46

2 Answers

Encoding a GUID to its component bytes is a non-standardized operation that is dealt with differently on Windows/Microsoft platforms (IMO in a most confusing fashion).

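        // Requires: using System; using System.Globalization; using System.Numerics;
        var g = Guid.Parse("86736036-6034-43c5-9b85-1c833837dbea");
        var guidHex = $"0{g:N}"; // 32 hex digits, no dashes; the leading 0 keeps the parsed value non-negative
        var pythonicUuidIntValue = BigInteger.Parse(guidHex, NumberStyles.HexNumber);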

This will give you the Pythonic value from C#.

The reason `.ToByteArray()` gives a different result is stated in its documentation:

The order of the beginning four-byte group and the next two two-byte groups is reversed, whereas the order of the last two-byte group and the closing six-byte group is the same.

Knowing this, it's probably possible to write a method that doesn't involve a trip through strings. An exercise for the reader.
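As a rough illustration of the byte-order point, here is a minimal Python sketch that derives both numbers from the same GUID. It assumes that Python's `UUID.bytes_le` uses the same layout as `Guid.ToByteArray()` (first three groups byte-reversed, last eight bytes unchanged) and that `int.from_bytes(..., signed=True)` behaves like the `BigInteger(byte[])` constructor; the variable names are only for illustration.

```python
import uuid

u = uuid.UUID("86736036-6034-43c5-9b85-1c833837dbea")

# Assumed to match Guid.ToByteArray(): the 4-byte group and the two 2-byte
# groups are byte-reversed, the trailing eight bytes keep their textual order.
mixed = u.bytes_le

# Assumed to match new BigInteger(byte[]): little-endian, two's complement.
# The last byte (0xea) has its top bit set, so the result is negative.
dotnet_style = int.from_bytes(mixed, byteorder="little", signed=True)

# What UUID.int does: the 16 bytes in textual order, read as one unsigned
# big-endian number.
python_style = int.from_bytes(u.bytes, byteorder="big", signed=False)

print(mixed.hex())
print(dotnet_style)
print(python_style)
```

If those assumptions hold, the two printed integers should correspond to the C# and Python values in the question: the same sixteen bytes, read in a different order and with a different sign convention.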

spender
  • @Freggar Someone already asked: https://stackoverflow.com/questions/9195551/why-does-guid-tobytearray-order-the-bytes-the-way-it-does – spender Jun 04 '18 at 11:45

Just out of curiosity, swapping some bytes here and there :-) and then adding an additional byte if necessary for the sign.

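        var g = Guid.NewGuid();
        var bytes = g.ToByteArray();

        // bytes[3] ends up as the most significant byte. If its top bit is set,
        // allocate one extra (zero) byte so BigInteger does not read the value as negative.
        var bytes2 = new byte[bytes[3] >= 0x80 ? bytes.Length + 1 : bytes.Length];

        // Last eight bytes: stored in textual order, so reverse them for little-endian.
        bytes2[0] = bytes[15];
        bytes2[1] = bytes[14];
        bytes2[2] = bytes[13];
        bytes2[3] = bytes[12];
        bytes2[4] = bytes[11];
        bytes2[5] = bytes[10];
        bytes2[6] = bytes[9];
        bytes2[7] = bytes[8];

        // Third group (2 bytes): already little-endian, copy as-is.
        bytes2[8] = bytes[6];
        bytes2[9] = bytes[7];

        // Second group (2 bytes): already little-endian, copy as-is.
        bytes2[10] = bytes[4];
        bytes2[11] = bytes[5];

        // First group (4 bytes): already little-endian, copy as-is.
        bytes2[12] = bytes[0];
        bytes2[13] = bytes[1];
        bytes2[14] = bytes[2];
        bytes2[15] = bytes[3];

        var bi2 = new BigInteger(bytes2);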

(I've tested on 1,000,000 random Guids and the result is equivalent to the one obtained with @spender's method).
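For anyone who wants to check the swap from the Python side, the following is a small sketch of the same idea, not a port of the exact code above. `guid_bytes_to_int` is a hypothetical helper name, and `UUID.bytes_le` is assumed to stand in for the output of `Guid.ToByteArray()`. It always appends the zero byte instead of testing the top bit, which is harmless because an extra leading zero does not change the value.

```python
import uuid


def guid_bytes_to_int(mixed: bytes) -> int:
    """Hypothetical helper: turn 16 bytes in the Guid.ToByteArray() layout
    into the unsigned 128-bit integer that Python's UUID.int reports."""
    if len(mixed) != 16:
        raise ValueError("expected exactly 16 bytes")

    # Undo the per-group reversal to get the bytes in textual (big-endian) order.
    big_endian = mixed[3::-1] + mixed[5:3:-1] + mixed[7:5:-1] + mixed[8:]

    # Reverse the whole thing for a little-endian reader and append 0x00 so a
    # signed two's-complement interpretation can never come out negative.
    little_endian = big_endian[::-1] + b"\x00"
    return int.from_bytes(little_endian, byteorder="little", signed=True)


u = uuid.UUID("86736036-6034-43c5-9b85-1c833837dbea")
print(guid_bytes_to_int(u.bytes_le) == u.int)
```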

xanatos
  • ok, i am still a bit lost on to *why* these things needs to be swapped about.. i will read the supporting posts, and come back... many thanks! – m1nkeh Jun 04 '18 at 11:59
  • @m1nkeh Because someone decided that their internal format is different. BigInteger is little-endian, while Guid is big-endian, *but* a Guid isn't a single big number; it is many small numbers (one of 4 bytes, two of 2 bytes, and 8 of one byte). – xanatos Jun 04 '18 at 12:03
  • what does your theirs line do btw? – m1nkeh Jun 04 '18 at 12:04
  • @m1nkeh I copy the bytes in the "correct" order, swapping some of them (you can see that the indexers on the right aren't simply going 15...0, but are 15...8, 6, 7, 4, 5, 0, 1, 2, 3) – xanatos Jun 04 '18 at 12:09
  • @m1nkeh There is an annotation [here](https://en.wikipedia.org/wiki/Universally_unique_identifier#Encoding) about the format used by Microsoft (the second one) – xanatos Jun 04 '18 at 12:10
  • i can't mark two responses as the answer i'm afraid, so i picked this one due to it (and its comments) being most helpful... – m1nkeh Jun 07 '18 at 10:18