2

I've stumbled into some odd behavior of a custom IComparer implementation and can't seem to figure out how to correct it and get the expected behavior.

I created a static class providing an extension method to System.Guid that allows the first 4 bytes of a Guid to be overwritten with a specified Int32 value. This is done for the purpose of allowing creation of semi-sequential Guids which are index-friendly for high population database tables.

public static class GuidExt
{
    public static Guid Sequence(this Guid obj, int index)
    {
        byte[] b = obj.ToByteArray();
        BitConverter.GetBytes(index).CopyTo(b, 0);
        return new Guid(b);
    }
}

Pretty straight forward and works exactly as expected.

The custom Comparer class I've created is designed to allow Guids to be sorted in ascending order of the inserted Int32 portion of a Guid. The implementation is as follows:

public class GuidSequenceComparer : IComparer<Guid>
{
    public int Compare(Guid x, Guid y)
    {
        var xBytes = x.ToByteArray();
        var yBytes = y.ToByteArray();
        byte[] xIndexBytes = new byte[4];
        for (int i = 0; i < 4; i++)
        {
            xIndexBytes[i] = xBytes[0];
        }
        byte[] yIndexBytes = new byte[4];
        for (int i = 0; i < 4; i++)
        {
            yIndexBytes[i] = yBytes[i];
        }
        var xIndex = BitConverter.ToInt32(xIndexBytes, 0);
        var yIndex = BitConverter.ToInt32(yIndexBytes, 0);

        return xIndex.CompareTo(yIndex);

        //// The following was used to test if any sort was being performed
        //// and reverses the ordering (see below paragraph)
        // if (xIndex > yIndex)
        // {
        //     return -1;
        // }
        // if (xIndex < yIndex)
        // {
        //     return 1
        // }
        // return 0;
    }
}

When I use this custom comparer on a List, it does perform a sort, but it sorts the index position within the List object! No idea why this is happening but I confirmed it by reversing the results of the comparison as shown in the commented out section and it simply reversed the order of the List, not based on the int values within the Guid, but based on the existing positions within the List. I'm completely lost as to why this is happening.

Here's the simple test I used within a console app, if you would like to try and recreate the behavior.

        List<Guid> guidList = new List<Guid>();

        guidList.Add(Guid.NewGuid().Sequence(5));
        guidList.Add(Guid.NewGuid().Sequence(3));
        guidList.Add(Guid.NewGuid().Sequence(8));
        guidList.Add(Guid.NewGuid().Sequence(1));

        Console.WriteLine("unsorted:");
        foreach (Guid item in guidList)
        {
            Console.WriteLine(item);
        }

        guidList.Sort(new GuidSequenceComparer());

        Console.WriteLine("sorted:");
        foreach (Guid item in guidList)
        {
            Console.WriteLine(item);
        }

        Console.ReadLine();
Mike Johnson
  • 686
  • 5
  • 13
  • whats wrong with an identity column? – Jodrell Apr 30 '13 at 13:03
  • 2
    `xBytes[0]` in the loop does not seem correct. – Magnus Apr 30 '13 at 13:05
  • lol Yea I completely missed that. That was the issue and all works properly now. Ah boy, late night code binges... – Mike Johnson Apr 30 '13 at 13:07
  • 1
    @MikeJohnson Also you can write `xBytes.CopyTo(xIndexBytes, 0);` and skip the loop. – Magnus Apr 30 '13 at 13:08
  • @Magnus Excellent suggestion to boost performance. Thank you. – Mike Johnson Apr 30 '13 at 13:10
  • @Jodrell This has nothing to do with identity columns. This was the requirement I was given. And it doesn't affect the uniqueness of the Guid as there is still 96 bits of entropy within the Guid after the int is assigned to its 4 leading bytes. – Mike Johnson Apr 30 '13 at 13:13
  • If was just that when you wrote "for the purpose of allowing creation of semi-sequential Guids which are index-friendly for high population database tables", I thought, an identity column is good for that. Indexing GUIDs, even ones that start sequentially is going to suck. – Jodrell Apr 30 '13 at 13:24
  • You could just view a guid as a 128 bit integer, http://stackoverflow.com/questions/7981190/how-to-generate-absolutely-unique-guids – Jodrell Apr 30 '13 at 13:35
  • Their requirement was that they wanted this implementation to act as an extension to the Guid type as it would allow for the internal optimization of the type itself while meeting their needs for index-friendlyness and still providing enough entropy through the remaining unmodified bytes. – Mike Johnson May 01 '13 at 18:08

3 Answers3

3

In your comparer, I think the line

        xIndexBytes[i] = xBytes[0];

should be

        xIndexBytes[i] = xBytes[i];

Not sure if it's the problem, but this is exactly the kind of thing that happens to me from time to time :-)

User @Magnus beat me to it but you can avoid the copy loop with the Array.Copy function, like this:

Array.Copy(xBytes, 0, xIndexBytes, 0, 4);
Jalayn
  • 8,934
  • 5
  • 34
  • 51
3

I'd optimize your compare a little

public int Compare(Guid x, Guid y)
{
    var xBytes = x.ToByteArray();
    var yBytes = y.ToByteArray();
    int result = 0;
    for (int i = 0; i < 4; i++)
    {
        var result = xBytes[i].CompareTo(yBytes[i]);
        if (result != 0)
        {
            break;
        }
    }

    return result;
}

Buffer.BlockCopy is a bit quicker than Array.Copy too,

public static Guid Sequence(this Guid source, int index)
{
    var buffer = source.ToByteArray();
    Buffer.BlockCopy(BitConvertor.GetBytes(index), 0, buffer, 0, sizeof(int));
    return new Guid(buffer);
}

or how about this, no copying required

public static Guid Sequence(this Guid source, int index)
{
    var buffer = source.ToByteArray();
    return new Guid(
        index,
        BitConvertor.ToInt16(buffer, 4),
        BitConvertor.ToInt16(buffer, 6),
        buffer[8],
        buffer[9],
        buffer[10],
        buffer[11],
        buffer[12],
        buffer[13],
        buffer[14],
        buffer[15]);
}
Jodrell
  • 34,946
  • 5
  • 87
  • 124
3

You may want to consider adjusting your extension methods to this:

public static class GuidExt
{
    public static Guid Sequence(this Guid g, int sequenceNum)
    {
        var bytes = g.ToByteArray();

        BitConverter.GetBytes(sequenceNum).CopyTo(bytes, 0);

        return new Guid(bytes);
    }

    public static int GetSequenceNum(this Guid g)
    {
        return BitConverter.ToInt32(g.ToByteArray(), 0);
    }
}

There's quite a bit of extra gyration in your methods that you just don't need. Interestingly enough, that was the source of the error that Jalayn found.

You can then change your comparer to do something much simpler:

public int Compare(Guid x, Guid y)
{
    return x.GetSequenceNum().CompareTo(y.GetSequenceNum());
}

Hope this helps.

FMM
  • 4,289
  • 1
  • 25
  • 44