
Consider the following unit test:

    [TestMethod]
    public void TestByteToString()
    {
        var guid = new Guid("61772f3ae5de5f4a8577eb1003c5c054");
        var guidString = guid.ToString("n");
        var byteString = ToHexString(guid.ToByteArray());

        Assert.AreEqual(guidString, byteString);
    }

    private String ToHexString(Byte[] bytes)
    {
        var hex = new StringBuilder(bytes.Length * 2);
        foreach(var b in bytes)
        {
            hex.AppendFormat("{0:x2}", b);
        }
        return hex.ToString();
    }

Here's the result:

Assert.AreEqual failed. Expected:<61772f3ae5de5f4a8577eb1003c5c054>. Actual:<3a2f7761dee54a5f8577eb1003c5c054>.

svick
Daniel Schaffer

3 Answers


Well, they are the same after the first eight bytes. And those first eight hold the same values, just with the bytes of each field (one four-byte group and two two-byte groups) in reverse order.

Basically, when a GUID is created from a string, the string is read as "big-endian": highest-order byte to the left. However, when stored internally (on an Intel-ish machine), the bytes of each multi-byte field are ordered "little-endian": highest-order byte to the right.
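You can see both orderings outside .NET, too. As a sketch, Python's `uuid` module exposes the same two layouts: `bytes` is the big-endian "as written" order (matching `guid.ToString("n")`), while `bytes_le` uses the little-endian field layout that `Guid.ToByteArray()` produces:

```python
import uuid

g = uuid.UUID("61772f3ae5de5f4a8577eb1003c5c054")

# bytes: big-endian, i.e. the order the digits appear in the string
print(g.bytes.hex())     # 61772f3ae5de5f4a8577eb1003c5c054

# bytes_le: first three fields byte-swapped, the in-memory layout
print(g.bytes_le.hex())  # 3a2f7761dee54a5f8577eb1003c5c054
```

The second line reproduces exactly the "Actual" value from the failing assert.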

James Curran

If you compare the results, you can see that the first three groups are reversed:

61 77 2f 3a   e5 de   5f 4a   8577eb1003c5c054
3a 2f 77 61   de e5   4a 5f   8577eb1003c5c054

That's because in the GUID structure, these three groups are defined as a DWORD and two WORDs rather than as individual bytes:

{0x00000000,0x0000,0x0000,{0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00}}

so in memory, an Intel processor stores each of them in little-endian order (most significant byte last).
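The effect of that field layout can be sketched with Python's `struct` module, packing the same three leading fields of the question's GUID both ways (`I` is a 4-byte DWORD, `H` a 2-byte WORD; `>` is big-endian, `<` little-endian):

```python
import struct

# The first three fields of the question's GUID, read as numbers
a, b, c = 0x61772f3a, 0xe5de, 0x5f4a  # DWORD, WORD, WORD

print(struct.pack(">IHH", a, b, c).hex())  # 61772f3ae5de5f4a
print(struct.pack("<IHH", a, b, c).hex())  # 3a2f7761dee54a5f
```

The two outputs match the first eight bytes of the expected and actual strings in the question.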

ivan_pozdeev
dtb

A GUID is structured as follows:

int a
short b
short c
byte[8] d

So for the part represented by `a`, your code gets the bytes reversed. All other parts are transformed correctly.

ChrisF
  • why would `b` and `c` not be reversed too? – Sebastian Jan 18 '12 at 20:32
  • @SebastianGodelet - because they are `short` rather than `int`. – ChrisF Jan 18 '12 at 21:03
  • I thought that everything greater than one byte is subject to endianness: `short s = 0xaf21;` could be stored as |af|21| or |21|af| – Sebastian Jan 18 '12 at 21:47
  • @SebastianGodelet - Possibly, but endianness is usually at the `int` level. – ChrisF Jan 18 '12 at 21:49
  • not trying to be nasty, but I think there's a slight misunderstanding here... 16-bit integers (cf. `System.Int16`) are also subject to endianness, e.g. the Unicode transfer encodings UTF-16LE and UTF-16BE (but not UTF-8) – Sebastian Jan 19 '12 at 19:11
  • @SebastianGodelet - It's OK. It's been a while since I had to do anything serious with endianness so my memory could well be faulty. – ChrisF Jan 19 '12 at 21:05
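Sebastian's point about 16-bit values can be checked directly. A quick sketch with Python's `struct` module, using his `0xaf21` example, shows that a 2-byte value is indeed byte-swapped between the two orders:

```python
import struct

s = 0xaf21
print(struct.pack("<H", s).hex())  # little-endian: 21af
print(struct.pack(">H", s).hex())  # big-endian:    af21
```

This is why the two WORD fields of the GUID come out reversed in `ToByteArray()` as well, not just the leading DWORD.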