Consider the following list of two string elements. Sorting it with Sort() or ordering it with LINQ's .OrderBy() gives an unexpected result: a.1.10-a- is the first element in the newly ordered list.

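    var list = new List<string>
    {
        "a.1.1-a-",
        "a.1.10-a-",
    };

    // In-place sort with the default comparer
    list.Sort();
    foreach (var l in list)
        Console.WriteLine(l);

    Console.WriteLine("------");

    // LINQ ordering with the default comparer
    foreach (var l in list.OrderBy(x => x))
        Console.WriteLine(l);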

Actual results:

a.1.10-a-
a.1.1-a-
------
a.1.10-a-
a.1.1-a-

However, after removing the second letter a (the one between the hyphens) from each of the elements, the output changes to:

a.1.1--
a.1.10--
------
a.1.1--
a.1.10--

I've reproduced this at https://dotnetfiddle.net/NBF3Pf

However, running the same code at https://try.dot.net/ gives the expected results both with and without the letter a near the end of the two strings.

I have tried converting each string to a list of chars and then to a list of ints. The two lists are identical up to the position where one has 0 (ASCII code 48) and the other has - (ASCII code 45). 48 is greater than 45, yet the sort still places a.1.10-a- first.

EDIT: I get the same results when using list.Sort(StringComparer.InvariantCulture);

Could anyone explain why this is happening?

  • Wait, your question isn't about the difference in `OrderBy()` vs `Sort()`, is it? Both ways of sorting have the same output. The question is why the default string comparer (which both sorting algorithms use) sorts this way. Can you [edit] your question to remove this confusion? Then I can unmark as duplicate. – CodeCaster Jan 21 '19 at 11:41
  • .NET offloads string comparison to the runtime, which (on Windows) offloads it to Windows APIs. This behavior is well documented on MSDN and in duplicate questions I can't find right now. – CodeCaster Jan 21 '19 at 11:46
  • You need a custom sort that splits the fields and converts the digits to integers. – jdweng Jan 21 '19 at 11:51
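
The custom sort suggested in the last comment could look roughly like the sketch below. It is only an illustration under stated assumptions: FieldComparer is a hypothetical name, the separators are assumed to be '.' and '-', and numeric fields are assumed to fit in a long. Fields that are all digits are compared as integers, numeric fields sort before non-numeric ones, and everything else is compared ordinally.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical comparer (not part of .NET): splits each string on '.' and '-'
// and compares the resulting fields one by one.
sealed class FieldComparer : IComparer<string>
{
    private static readonly char[] Separators = { '.', '-' };

    public int Compare(string x, string y)
    {
        if (ReferenceEquals(x, y)) return 0;
        if (x is null) return -1;
        if (y is null) return 1;

        string[] a = x.Split(Separators);
        string[] b = y.Split(Separators);
        int shared = Math.Min(a.Length, b.Length);

        for (int i = 0; i < shared; i++)
        {
            // NumberStyles.None accepts digits only (no sign, no whitespace).
            bool aIsNumber = long.TryParse(a[i], NumberStyles.None, CultureInfo.InvariantCulture, out long na);
            bool bIsNumber = long.TryParse(b[i], NumberStyles.None, CultureInfo.InvariantCulture, out long nb);

            int result;
            if (aIsNumber && bIsNumber) result = na.CompareTo(nb);        // 1 before 10
            else if (aIsNumber != bIsNumber) result = aIsNumber ? -1 : 1; // numbers before text
            else result = string.CompareOrdinal(a[i], b[i]);              // plain code-unit order

            if (result != 0) return result;
        }

        // All shared fields tie: fewer fields first, then an ordinal tiebreak
        // so that two different strings never compare as equal.
        int byLength = a.Length.CompareTo(b.Length);
        return byLength != 0 ? byLength : string.CompareOrdinal(x, y);
    }
}

class Program
{
    static void Main()
    {
        var list = new List<string> { "a.1.10-a-", "a.1.1-a-" };
        list.Sort(new FieldComparer());
        foreach (var l in list)
            Console.WriteLine(l); // a.1.1-a- then a.1.10-a-
    }
}
```

Because this comparer never consults culture data, its result should not change between machines, at the cost of ignoring any linguistic ordering rules.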

1 Answer

The default string comparison depends on your current culture, so it can give different results on different machines. Try specifying the culture explicitly to get consistent results:

    list.Sort(StringComparer.InvariantCulture);
    foreach(var l in list)
        Console.WriteLine(l);
    Console.WriteLine();

    foreach(var aa in list.OrderBy(x=>x, StringComparer.InvariantCulture))
        Console.WriteLine(aa);

You could consider using StringComparer.Ordinal, depending on what result you want. I suspect your current culture may be using CompareOptions.StringSort, which, according to the documentation:

Indicates that the string comparison must use the string sort algorithm. In a string sort, the hyphen and the apostrophe, as well as other nonalphanumeric symbols, come before alphanumeric characters.
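
To make the difference concrete, here is a minimal sketch (not taken from the question's fiddle) that sorts the two strings with StringComparer.Ordinal and then prints the raw results of culture-sensitive comparisons with and without CompareOptions.StringSort. The class name Program is just a placeholder. The two culture-sensitive results are deliberately left without an expected value, since they can vary with the runtime and the platform's collation data:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

class Program
{
    static void Main()
    {
        var list = new List<string> { "a.1.10-a-", "a.1.1-a-" };

        // Ordinal compares UTF-16 code units, so '-' (45) sorts before '0' (48).
        list.Sort(StringComparer.Ordinal);
        Console.WriteLine(string.Join(", ", list)); // a.1.1-a-, a.1.10-a-

        // Culture-sensitive comparison under the invariant culture.
        // A negative value means the first string sorts first, a positive value the second.
        CompareInfo compare = CultureInfo.InvariantCulture.CompareInfo;
        Console.WriteLine(compare.Compare("a.1.1-a-", "a.1.10-a-", CompareOptions.None));
        Console.WriteLine(compare.Compare("a.1.1-a-", "a.1.10-a-", CompareOptions.StringSort));
    }
}
```

If the last two lines print values with different signs on your machine, the hyphen is being weighted differently by the two algorithms, which would be consistent with the behaviour described in the question.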

Joe
  • Just tried with `StringComparer.InvariantCulture`, but I get the same results. – Stefan Kovacs Jan 21 '19 at 11:54
  • @StefanKovacs - your DotNetFiddle uses `StringComparer.InvariantCulture` for the `Sort` method, but not for the `OrderBy` method. – Joe Jan 21 '19 at 11:57
  • My question was about why adding or removing an `a` after the second-to-last `-` in my inputs would change the result. – Stefan Kovacs Jan 21 '19 at 12:10
  • 1
    @StefanKovacs - I suspect the string sort algorithm is being used and is treating hyphens as something akin to word separators. But it looks like you want `StringComparer.Ordinal` as I already mentioned. – Joe Jan 21 '19 at 15:45