1

I am trying to implement the IComparable interface in my custom object so that List.Sort() can sort them alphabetically.

My object has a field called _name which is a string type, and I want it to sort based on that. Here is the method I implemented:

    public int CompareTo(object obj)
    {
        //Int reference table:
        //1 or greater means the current instance occurs after obj
        //0 means both elements occur in the same position
        //-1 or less means the current instance occurs before obj

        if (obj == null)
            return 1;

        Upgrade otherUpgrade = obj as Upgrade;

        if (otherUpgrade != null)
            return _name.CompareTo(otherUpgrade.Name);
        else
            throw new ArgumentException("Passed object is not an Upgrade.");
    }

Not sure if I did something wrong or if it's just the way the string CompareTo works, but basically my List was sorted like this:

  • Test Upgrade
  • Test Upgrade 10
  • Test Upgrade 11
  • Test Upgrade 12
  • Test Upgrade 13
  • Test Upgrade 14
  • Test Upgrade 15
  • Test Upgrade 2
  • Test Upgrade 3
  • Test Upgrade 4
  • Test Upgrade 5

I want them to be sorted like this:

  • Test Upgrade
  • Test Upgrade 2
  • Test Upgrade 3
  • ...etc
TheGateKeeper
  • 4,420
  • 19
  • 66
  • 101

5 Answers5

7

You want "natural order" -- the collation that a human who is familiar with the conventions of English would choose -- as opposed to what you've got, which is "lexicographic" collation: assign every letter a strict ordering, and then sort by each letter in turn.

Jeff has a good article on some of the ins and outs here, with links to different algorithms that try to solve the problem:

http://www.codinghorror.com/blog/2007/12/sorting-for-humans-natural-sort-order.html

and Raymond discussed how Windows dealt with it here:

http://technet.microsoft.com/en-us/magazine/hh475812.aspx

Basically the problem is: natural order collation requires solving an artificial intelligence problem; you're trying to emulate what a human would do, and that can be surprisingly tricky. Good luck!

Eric Lippert
  • 647,829
  • 179
  • 1,238
  • 2,067
  • It can be an incredibly complex problem, but it depends a little on the source data. If we know that the numbers in the strings will only ever be integers, it's simplified a lot. Was there ever any consideration on including something like this in the BCL (as a nother type of StringComparer, for example)? – Bennor McCarthy Feb 01 '13 at 02:27
5

Strings are sorted in lexicographic order. You'll have to either format all your numbers to have the same length (eg: Test Upgrade 02) or parse the number in your comparer and incorporate it in your comparison logic.

drch
  • 3,040
  • 1
  • 18
  • 29
3

The reason this happens is that you are doing string comparison, which has no explicit knowledge of numbers. It orders each string by the respective character codes of each character.

To get the effect you want will require a bit more work. See this question: Sort on a string that may contain a number

Community
  • 1
  • 1
Bennor McCarthy
  • 11,415
  • 1
  • 49
  • 51
2

AlphaNumeric Sorting

public class AlphanumComparatorFast : IComparer
{
    public int Compare(object x, object y)
    {
    string s1 = x as string;
    if (s1 == null)
    {
        return 0;
    }
    string s2 = y as string;
    if (s2 == null)
    {
        return 0;
    }

    int len1 = s1.Length;
    int len2 = s2.Length;
    int marker1 = 0;
    int marker2 = 0;

    // Walk through two the strings with two markers.
    while (marker1 < len1 && marker2 < len2)
    {
        char ch1 = s1[marker1];
        char ch2 = s2[marker2];

        // Some buffers we can build up characters in for each chunk.
        char[] space1 = new char[len1];
        int loc1 = 0;
        char[] space2 = new char[len2];
        int loc2 = 0;

        // Walk through all following characters that are digits or
        // characters in BOTH strings starting at the appropriate marker.
        // Collect char arrays.
        do
        {
        space1[loc1++] = ch1;
        marker1++;

        if (marker1 < len1)
        {
            ch1 = s1[marker1];
        }
        else
        {
            break;
        }
        } while (char.IsDigit(ch1) == char.IsDigit(space1[0]));

        do
        {
        space2[loc2++] = ch2;
        marker2++;

        if (marker2 < len2)
        {
            ch2 = s2[marker2];
        }
        else
        {
            break;
        }
        } while (char.IsDigit(ch2) == char.IsDigit(space2[0]));

        // If we have collected numbers, compare them numerically.
        // Otherwise, if we have strings, compare them alphabetically.
        string str1 = new string(space1);
        string str2 = new string(space2);

        int result;

        if (char.IsDigit(space1[0]) && char.IsDigit(space2[0]))
        {
        int thisNumericChunk = int.Parse(str1);
        int thatNumericChunk = int.Parse(str2);
        result = thisNumericChunk.CompareTo(thatNumericChunk);
        }
        else
        {
        result = str1.CompareTo(str2);
        }

        if (result != 0)
        {
        return result;
        }
    }
    return len1 - len2;
    }
}

Usage :

using System;
using System.Collections;

class Program
{
    static void Main()
    {
    string[] highways = new string[]
    {
        "100F",
        "50F",
        "SR100",
        "SR9"
    };
    //
    // We want to sort a string array called highways in an
    // alphanumeric way. Call the static Array.Sort method.
    //
    Array.Sort(highways, new AlphanumComparatorFast());
    //
    // Display the results
    //
    foreach (string h in highways)
    {
        Console.WriteLine(h);
    }
    }
}

Output

50F
100F
SR9
SR100

Parimal Raj
  • 20,189
  • 9
  • 73
  • 110
0

Thanks for all the replies guys. I did my own method and it seems to work fine. It doesn't work for all cases but it works for my scenario. Here is the code for anyone who is interested:

    /// <summary>
    /// Compares the upgrade to another upgrade
    /// </summary>
    /// <param name="obj"></param>
    /// <returns></returns>
    public int CompareTo(object obj)
    {
        //Int reference table:
        //1 or greater means the current instance occurs after obj
        //0 means both elements occur in the same position
        //-1 or less means the current instance occurs before obj

        if (obj == null)
            return 1;

        Upgrade otherUpgrade = obj as Upgrade;

        if (otherUpgrade != null)
        {
            //Split strings into arrays
            string[] splitStringOne = _name.Split(new char[] { ' ' });
            string[] splitStringTwo = otherUpgrade.Name.Split(new char[] { ' ' });

            //Will hold checks to see which comparer will be used
            bool sameWords = false, sameLength = false, bothInt = false;

            //Will hold the last part of the string if it is an int
            int intOne = 0, intTwo = 0;

            //Check if they have the same length
            sameLength = (splitStringOne.Length == splitStringTwo.Length);

            if (sameLength)
            {
                //Check to see if they both end in an int
                bothInt = (int.TryParse(splitStringOne[splitStringOne.Length - 1], out intOne) && int.TryParse(splitStringTwo[splitStringTwo.Length - 1], out intTwo));

                if (bothInt)
                {
                    //Check to see if the previous parts of the string are equal
                    for (int i = 0; i < splitStringOne.Length - 2; i++)
                    {
                        sameWords = (splitStringOne[i].ToLower().Equals(splitStringTwo[i].ToLower()));

                        if (!sameWords)
                            break;
                    }
                }
            }

            //If all criteria is met, use the customk comparer
            if (sameWords && sameLength && bothInt)
            {
                if (intOne < intTwo)
                    return -1;
                else if (intOne > intTwo)
                    return 1;
                else //Both equal
                    return 0;
            }
            //Else use the default string comparer
            else
                return _name.CompareTo(otherUpgrade.Name);            
        }
        else
            throw new ArgumentException("Passed object is not an Upgrade.");
    }

Will work for strings spaced out using a " " character, like so:

Test data:

  • Hello 11
  • Hello 2
  • Hello 13

Result

  • Hello 2
  • Hello 11
  • Hello 13

Will not work for data such as Hello11 and Hello2 since it cannot split them. Not case sensitive.

TheGateKeeper
  • 4,420
  • 19
  • 66
  • 101