I've run into a strange sort order for a list of strings in C#:

 var s = new List<string>();
 s.Add("as");
 s.Add("a_");
 s.Add("a0");

 s.Sort();

I was expecting this code to sort the list as:

a0
a_
as

It actually resulted in:

a_
a0
as

Can someone help me understand why a_ was sorted before a0 when the ASCII value of _ is 95 and the ASCII value of 0 is 48?

Trevor
Ido Ran
  • This isn't strange sorting, there's many factors that come into play. SO has more than a few examples of this as well as the internet, have you looked into them? – Trevor Jan 02 '20 at 21:22
  • Depends on the sorting algorithm, which is selected by the [default comparator](https://learn.microsoft.com/en-us/dotnet/api/system.collections.generic.list-1.sort?view=netframework-4.8), and on its data content. – Markus Zeller Jan 02 '20 at 21:22

1 Answer

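By default, strings are sorted using CurrentCulture: List<string>.Sort() with no arguments uses the default comparer, which for strings is culture-sensitive. Culture-sensitive collation does not follow raw character values, and it places punctuation such as _ before digits, which is why a_ came before a0.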

Use StringComparer.Ordinal to sort strings by their Unicode (not ASCII) code-point values; strictly, it compares the UTF-16 code units one by one, with no culture rules applied.

List<string> list = ...;
list.Sort(comparer: StringComparer.Ordinal);
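To see the difference side by side, here is a minimal sketch that sorts the three strings from the question with two comparers. The names SortDemo and SortedWith are made up for illustration, and StringComparer.InvariantCulture is used as a stand-in for the current culture only so the run does not depend on machine settings; the culture-sensitive result can vary with culture and runtime, so no output is claimed for it here.

```csharp
using System;
using System.Collections.Generic;

public static class SortDemo
{
    // Hypothetical helper: returns a sorted copy, leaving the input untouched.
    public static List<string> SortedWith(IEnumerable<string> items, IComparer<string> comparer)
    {
        var copy = new List<string>(items);
        copy.Sort(comparer);
        return copy;
    }

    public static void Main()
    {
        var input = new[] { "as", "a_", "a0" };

        // Ordinal: compares UTF-16 code unit values, so '0' (48) < '_' (95) < 's' (115).
        var ordinal = SortedWith(input, StringComparer.Ordinal);
        Console.WriteLine(string.Join(", ", ordinal)); // a0, a_, as

        // Culture-sensitive: linguistic collation rules, not raw character values.
        // The question reports a_, a0, as for the asker's current culture.
        var cultural = SortedWith(input, StringComparer.InvariantCulture);
        Console.WriteLine(string.Join(", ", cultural));
    }
}
```

Passing the comparer explicitly at every call site makes the intended ordering visible in the code instead of leaving it to whatever culture the thread happens to run under.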
Dai
  • How does ASCII figure into this? – Robert Harvey Jan 02 '20 at 21:23
  • @RobertHarvey The OP explicitly mentioned the ASCII values of characters. I said that it sorts by Unicode code-points and not ASCII values to remind everyone that .NET uses UTF-16 internally (though the integer values of the first 128 characters are the same in ASCII and UTF-16). – Dai Jan 02 '20 at 21:25
  • The first 128 code points of Unicode are the same characters as ASCII, in the same order. So if the text is guaranteed to be only ASCII characters this will work. – Joe Sewell Jan 02 '20 at 21:26