
I am creating a GUID like this

Guid g = new Guid(new byte[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0xA, 0xB, 0xC, 0xD, 0xE, 0xF });
Console.WriteLine(g);

This outputs

03020100-0504-0706-0809-0a0b0c0d0e0f

According to Wikipedia, a GUID has four parts, which explains why the byte order switches in four groups. However, the Wikipedia article also states that all parts are stored in big-endian format, and the first three parts here are obviously not big-endian. The ToByteArray() method of the Guid returns the bytes in the very same order used for creation. What is the explanation for this behavior?

Stilgar

2 Answers


It appears that MS are storing the five parts in a structure. The first 4 parts are either 2 or 4 bytes long and are therefore probably stored as a native type (i.e. WORD and DWORD) in little-endian format. The last part is 6 bytes long and is therefore handled differently (probably as an array).

Does the spec state that the GUID is stored in big-endian order, or that the parts are stored in that order but the layout of the individual parts may be implementation-specific?

EDIT:

From the UUID spec, section 4.1.2. Layout and Byte Order (emphasis mine):

To minimize confusion about bit assignments within octets, the UUID
record definition is defined only in terms of fields that are
integral numbers of octets. The fields are presented with the most
significant one first.

...

In the absence of explicit application or presentation protocol
specification to the contrary, a UUID is encoded as a 128-bit object, as follows:

The fields are encoded as 16 octets, with the sizes and order of the fields defined above, and with each field encoded with the Most Significant Byte first (known as network byte order).

It might be that MS have stored the bytes in the correct order, but have not bothered to convert the WORD and DWORD parts from network to host order for presentation (which appears to be OK according to the spec, at least by my unskilled reading of it).
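To make the difference concrete, here is a minimal sketch in C that renders the same 16 bytes two ways: once with every field most significant byte first, as the spec text above describes, and once with the first three fields byte-swapped, which reproduces the output in the question. The function name `format_uuid` and the `swap_first_three` flag are hypothetical, introduced only for illustration; this is not how .NET implements it internally.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Render 16 bytes as 8-4-4-4-12 hex text (36 characters plus NUL).
 * swap_first_three == 0: every field most significant byte first.
 * swap_first_three != 0: the 4-, 2- and 2-byte fields are read as
 * little-endian integers, matching the output seen in the question. */
void format_uuid(const uint8_t bytes[16], int swap_first_three, char out[37])
{
    uint8_t b[16];
    uint8_t t;

    assert(bytes != NULL && out != NULL);
    memcpy(b, bytes, sizeof b);

    if (swap_first_three) {
        /* Reverse the 4-byte field. */
        t = b[0]; b[0] = b[3]; b[3] = t;
        t = b[1]; b[1] = b[2]; b[2] = t;
        /* Reverse the two 2-byte fields. */
        t = b[4]; b[4] = b[5]; b[5] = t;
        t = b[6]; b[6] = b[7]; b[7] = t;
    }

    snprintf(out, 37,
             "%02x%02x%02x%02x-%02x%02x-%02x%02x-"
             "%02x%02x-%02x%02x%02x%02x%02x%02x",
             b[0], b[1], b[2], b[3], b[4], b[5], b[6], b[7],
             b[8], b[9], b[10], b[11], b[12], b[13], b[14], b[15]);
}
```

Called with the bytes 0x00 through 0x0F, the swapped form gives the string from the question, while the unswapped form starts with 00010203-0405-0607. The last eight bytes come out the same either way.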

Grhm
  • According to Wikipedia (I did not check the references), the UUID standard, of which GUID is supposed to be an implementation, states that parts should be encoded in big-endian. Both the UUID and the GUID specs define four parts of sizes 4, 2, 2 and 8 bytes, in this order. – Stilgar Apr 17 '12 at 12:27
  • Indeed, and when displayed the last 8-byte part is normally shown as 2 bytes-6 bytes, both of which appear to be correctly big-endian (as shown in your example). – Grhm Apr 17 '12 at 12:29
  • Yeah, the last 8 bytes are displayed as 2-6 in the string representation, probably for readability reasons, but they are part of the same data part. The real question is whether Guid is violating the standard or there is some other explanation. – Stilgar Apr 17 '12 at 12:32
  • Good find. I wonder if we should update the Wikipedia article now. – Stilgar Apr 17 '12 at 12:51

I'm no expert here, but the Wikipedia page you mention also says:

However, the reference for a commonly[4] used structure of the data type doesn't mention byte ordering

That citation ([4]) points to http://msdn.microsoft.com/en-us/library/aa373931(VS.85).aspx which subsequently identifies how Microsoft implement a GUID as

typedef struct _GUID {
  DWORD Data1;
  WORD  Data2;
  WORD  Data3;
  BYTE  Data4[8];
} GUID;

Since the first three fields are native integer types and the last 8 bytes are stored as a byte array, I think this explains the behaviour you are seeing.
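As a rough illustration, the following C sketch fills a struct of the same shape from the 16 input bytes, reading the three integer fields as little-endian, which is what a plain memory copy into such a struct would amount to on x86. The names `guid_fields` and `guid_from_le_bytes` are hypothetical, and the fixed-width types `uint32_t`, `uint16_t` and `uint8_t` stand in for DWORD, WORD and BYTE as an assumption so that the example compiles without the Windows headers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same shape as the GUID struct above, using portable fixed-width types. */
typedef struct {
    uint32_t data1;
    uint16_t data2;
    uint16_t data3;
    uint8_t  data4[8];
} guid_fields;

/* Decode 16 bytes with the integer fields read as little-endian.
 * The shifts make the result independent of the host's own byte order. */
guid_fields guid_from_le_bytes(const uint8_t b[16])
{
    guid_fields g;

    assert(b != NULL);
    g.data1 = (uint32_t)b[0]
            | ((uint32_t)b[1] << 8)
            | ((uint32_t)b[2] << 16)
            | ((uint32_t)b[3] << 24);
    g.data2 = (uint16_t)(b[4] | (b[5] << 8));
    g.data3 = (uint16_t)(b[6] | (b[7] << 8));
    memcpy(g.data4, b + 8, sizeof g.data4);  /* byte array: order unchanged */
    return g;
}
```

For the bytes 0x00 through 0x0F this yields data1 = 0x03020100, data2 = 0x0504 and data3 = 0x0706. Printing those integers most significant digit first gives the first three groups of the string in the question, while data4 keeps its input order.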

pms1969
  • So DWORD and WORD are little endian for some reason? – Stilgar Apr 17 '12 at 12:22
  • http://en.wikipedia.org/wiki/Endianness It would depend on your architecture. On an x86 architecture, yes. – pms1969 Apr 17 '12 at 12:34
  • But this would also mean that GUID violates the UUID standard? Also that the Wikipedia article is kind of misleading (stating that a GUID stores data parts in Big Endian format) – Stilgar Apr 17 '12 at 12:36
  • @Stilgar: The uuid standard and GUID article only state they are stored in big-endian format - neither appears to explicitly state how the GUID/UUID is rendered in a human readable format. – Grhm Apr 17 '12 at 12:42
  • Big endian = human readable. Humans write most significant digits first (at least in all left-right scripts). https://lwn.net/Articles/628233/ – MarcH Jul 29 '15 at 06:09