4

I have an app which is displaying a data set which allows me to call out to custom .NET code and am stuck on a sorting problem. One column in my data set contains both strings and numeric data, and I want to sort the strings alphabetically and the numbers numerically. All I can do is take the current value the sorter is working with, and return something.

If my list is {"-6", "10", "5"}, I want to produce strings from those numbers that would sort them alphabetically. What I've come up with is making them all positive, then padding with zeros, like this:

public object Evaluate(object currentValue)
{
    //add 'a' to beginning of non-numbers, 'b' to beginning of numbers so that numbers come second
    string sortOrder = "";
    if(!currentValue.IsNumber)
        sortOrder = "a" + currentValue;
    else
    {
        sortOrder = "b"
        double number = Double.Parse(currentValue);

        //add Double.MaxValue to our number so that we 'hopefully' get rid of negative numbers, but don't go past Double.MaxValue
        number += (Double.MaxValue / 2)

        //pad with zeros so that 5 comes before 10 alphabetically:
        //"0000000005"
        //"0000000010"
        string paddedNumberString = padWithZeros(number.ToString())


        //"b0000000005"
        //"b0000000010"
        sortOrder += paddedNumberString;
    }
}

Problem:
If I simply return the number, then they get sorted alphabetically and 10 will come before 5, and I don't even know what will happen with negative numbers.

Solution?:
One hack I thought of was trying to convert from doubles (8 bytes) to unsigned longs (8 bytes). That would get rid of negative numbers since they would be starting at 0. The problem of 10 coming before 5 still remains though. For that, perhaps pad with 0s or something...

It seems like this should be possible, but I have the dumb today and can't smart.

example data:
'cat'
'4'
'5.4'
'dog'
'-400'
'aardvark'
'12.23.34.54'
'i am a sentence'
'0'

that should be sorted to:
'12.23.34.54'
'aardvark'
'cat'
'dog'
'i am a sentence'
'-400'
'0'
'4'
'5.4'

MStodd
  • 4,716
  • 3
  • 30
  • 50
  • Can you give some example values for this column? Are they just text with a double appended? – Lance U. Matthews Feb 10 '12 at 16:57
  • Is it possible to send in a comparer here? http://msdn.microsoft.com/en-us/library/cfttsh47.aspx – jrsconfitto Feb 10 '12 at 17:07
  • Comparison functions don't work when you only have one value. I can return a string from the value the sorter is currently evaluating. I'll make that more clear in my description – MStodd Feb 10 '12 at 17:31
  • After seeing your edit and some of your comments I'm not sure what you want to achieve, but my answer doesn't seem to answer the question you are asking so I'm deleting it. I think you could improve this question by better explaining how you intend to use `Evaluate` to sort. – Martin Liversage Feb 10 '12 at 18:10

4 Answers4

4

Not very efficient, but a simple comparison algorithm which first separates between numbers and non-numbers, then sorts between them would work - see code below. The inneficiency comes from the fact that we'll do the string to double conversion quite a few times, so you can do a pre-processing of the numbers (i.e., store their double values in a List<double?>) then use those instead of always doing the parsing.

public class StackOverflow_9231493
{
    public static void Test()
    {
        List<string> list = new List<string>
        {
            "cat",
             "4",
             "5.4",
             "dog",
             "-400",
             "aardvark",
             "12.23.34.54",
             "i am a sentence",
             "0" ,
        };

        list.Sort(new Comparison<string>(delegate(string s1, string s2)
        {
            double d1, d2;
            bool isNumber1, isNumber2;
            isNumber1 = double.TryParse(s1, out d1);
            isNumber2 = double.TryParse(s2, out d2);
            if (isNumber1 != isNumber2)
            {
                return isNumber2 ? -1 : 1;
            }
            else if (!isNumber1)
            {
                return s1.CompareTo(s2);
            }
            else
            {
                return Math.Sign(d1 - d2);
            }
        }));

        Console.WriteLine(string.Join("\n", list));
    }
}

Update based on comments:

If you want to only return something without using the comparer directly, you can use the same logic, but wrap the values in a type which knows how to do the comparison as you want, as shown below.

public class StackOverflow_9231493
{
    public class Wrapper : IComparable<Wrapper>
    {
        internal string value;
        private double? dbl;

        public Wrapper(string value)
        {
            if (value == null) throw new ArgumentNullException("value");
            this.value = value;
            double temp;
            if (double.TryParse(value, out temp))
            {
                dbl = temp;
            }
        }

        public int CompareTo(Wrapper other)
        {
            if (other == null) return -1;
            if (this.dbl.HasValue != other.dbl.HasValue)
            {
                return other.dbl.HasValue ? -1 : 1;
            }
            else if (!this.dbl.HasValue)
            {
                return this.value.CompareTo(other.value);
            }
            else
            {
                return Math.Sign(this.dbl.Value - other.dbl.Value);
            }
        }
    }
    public static void Test()
    {
        List<string> list = new List<string>
        {
            "cat",
             "4",
             "5.4",
             "dog",
             "-400",
             "aardvark",
             "12.23.34.54",
             "i am a sentence",
             "0" ,
        };

        List<Wrapper> list2 = list.Select(x => new Wrapper(x)).ToList();
        list2.Sort();
        Console.WriteLine(string.Join("\n", list2.Select(w => w.value)));
    }
}
carlosfigueira
  • 85,035
  • 14
  • 131
  • 171
  • Can't use comparison. See description – MStodd Feb 10 '12 at 18:03
  • 1
    Looks like everybody got downvoted. Is that really appropriate here? Perhaps they do qualify for the "This answer is not useful" label on the downvote arrow, but until the most recent edit we didn't even know the signature of the function we were implementing. – Lance U. Matthews Feb 10 '12 at 18:09
  • Sorry, I don't stackoverflow enough. Thanks for explaining, but the part in bold was there the whole time, just not bold. Hopefully it's clear now – MStodd Feb 10 '12 at 18:28
  • Already edited, with another alternative which returns a single value for each member and can be compared the way you need. – carlosfigueira Feb 10 '12 at 19:09
2

I have a solution for you but it requires an arbitrary, fixed maximum string size but requires no other information about the set

First, define a custom character set as follows:

public class CustomChar
{
    public static readonly int Base;
    public static readonly int BitsPerChar;

    public char Original { get; private set; }
    public int Target { get; private set; }

    private static readonly Dictionary<char, CustomChar> Translation;

    private static void DefineOrderedCharSet(string charset)
    {
        foreach (var t in charset)
        {
            new CustomChar(t);
        }
    }

    static CustomChar()
    {
        Translation = new Dictionary<char, CustomChar>();
        DefineOrderedCharSet(",-.0123456789 aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ");
        BitsPerChar = (int)Math.Ceiling(Math.Log(Translation.Count, 2));
        Base = (int) Math.Pow(2, BitsPerChar);
    }

    private CustomChar(char original)
    {
        Original = original;

        if(Translation.Count > 0)
        {
            Target = Translation.Max(x => x.Value.Target) + 1;
        }
        else
        {
            Target = 0;
        }

        Translation[original] = this;
    }

    public static CustomChar Parse(char original)
    {
        return Translation[original];
    }
}

Then define a construct for handling conversion from string to a System.Numeric.BigInteger as follows

public class CustomString
{
    public string String { get; private set; }
    public BigInteger Result { get; private set; }
    public const int MaxChars = 600000;

    public CustomString(string source)
    {
        String = source;
        Result = 0;

        for (var i = 0; i < String.Length; i++)
        {
            var character = CustomChar.Parse(String[i]);
            Result |= (BigInteger)character.Target << (CustomChar.BitsPerChar * (MaxChars - i - 1));
        }

        double doubleValue;

        if (!double.TryParse(source, out doubleValue))
        {
            return;
        }

        Result = new BigInteger(0x7F) << (MaxChars * CustomChar.BitsPerChar);
        var shifted = (BigInteger)(doubleValue * Math.Pow(2, 32));
        Result += shifted;
    }

    public static implicit operator CustomString(string source)
    {
        return new CustomString(source);
    }
}

Notice the ctor for CustomString finds doubles and augments their BigInteger representations to organize things for a numeric value sort.

This is a fairly quick throw-together but gets your described output from the test:

class Program
{
    public static string[] Sort(params CustomString[] strings)
    {
        return strings.OrderBy(x => x.Result).Select(x => x.String).ToArray();
    }

    static void Main()
    {
        var result = Sort(
            "cat",
            "4",
            "5.4",
            "dog",
            "-400",
            "aardvark",
            "12.23.34.54",
            "i am a sentence",
            "0");

        foreach (var str in result)
        {
            Console.WriteLine(str);
        }

        Console.ReadLine();
    }
}
mlorbetske
  • 5,529
  • 2
  • 28
  • 40
1

I suspect you're after something called 'Natural Sort order'. Attwood has a post on it: http://www.codinghorror.com/blog/2007/12/sorting-for-humans-natural-sort-order.html

There's a couple examples of implementations in the post.

Frederik
  • 2,921
  • 1
  • 17
  • 26
0

I'm assuming your data is of type string and not object. The following function can be called with a Comparison<string> delegate.

static int CompareTo(string string1, string string2)
{
    double double1, double2;

    // Add null checks here if necessary...

    if (double.TryParse(string1, out double1))
    {
        if (double.TryParse(string2, out double2))
        {
            // string1 and string2 are both doubles

            return double1.CompareTo(double2);
        }
        else
        {
            // string1 is a double and string2 is text; string2 sorts first

            return 1;
        }
    }
    else if (double.TryParse(string2, out double2))
    {
        // string1 is text and string2 is a double; string1 sorts first

        return -1;
    }
    else
    {
        // string1 and string2 are both text

        return string1.CompareTo(string2);
    }
}

You can test it like this:

static void Main(string[] args)
{
    var list = new List<string>() {
        "cat",
        "4",
        "5.4",
        "dog",
        "-400",
        "aardvark",
        "12.23.34.54",
        "i am a sentence",
        "0"
    };

    list.Sort(CompareTo);
    foreach (var item in list)
        Console.WriteLine(item);
}
Lance U. Matthews
  • 15,725
  • 6
  • 48
  • 68