10

If you have strings like:

"file_0"
"file_1"
"file_2"
"file_3"
"file_4"
"file_5"
"file_6"
"file_11"

How can you sort them so that "file_11" comes after "file_6" (since 11 > 6) instead of right after "file_1"?

Do I have to parse the string and convert it into a number for this?

Windows Explorer in Windows 7 sorts files the way I want.

Joan Venge

6 Answers

12

Do I have to parse the string and convert it into a number for this?

Essentially, yes; but LINQ may help:

// Skip the five-character "file_" prefix and order by the remaining digits as an int.
var sorted = arr.OrderBy(s => int.Parse(s.Substring(5)));
foreach (string s in sorted) {
    Console.WriteLine(s);
}
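
If the prefix is not always exactly five characters, one possible variation is to take everything after the last underscore instead of using a fixed offset. The sketch below assumes every string ends in digits that fit in an int; the names `SuffixSortSketch`, `TrailingNumber`, and `SortBySuffix` are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SuffixSortSketch
{
    // Hypothetical helper: parses whatever follows the last '_' as an int.
    // If there is no '_', LastIndexOf returns -1 and the whole string is parsed.
    // Throws FormatException when the suffix is not a valid integer.
    public static int TrailingNumber(string name)
    {
        int start = name.LastIndexOf('_') + 1;
        return int.Parse(name.Substring(start));
    }

    // Returns a new list ordered by the numeric suffix; the input is not modified.
    public static List<string> SortBySuffix(IEnumerable<string> names)
    {
        return names.OrderBy(n => TrailingNumber(n)).ToList();
    }
}
```

For example, `SuffixSortSketch.SortBySuffix(new[] { "file_11", "file_1", "file_6", "file_2" })` yields file_1, file_2, file_6, file_11. Note that this orders by the number alone and ignores the text before it.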
Marc Gravell
10

To handle sorting of intermixed strings and numbers for any kind of format, you can use a class like this to split the strings into string and number components and compare them:

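using System;
using System.Collections.Generic;

// Splits a string into alternating text and number runs so the numeric runs compare by value.
public class StringNum : IComparable<StringNum> {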

   private List<string> _strings;
   private List<int> _numbers;

   public StringNum(string value) {
      _strings = new List<string>();
      _numbers = new List<int>();
      int pos = 0;
      bool number = false;
      while (pos < value.Length) {
         int len = 0;
         while (pos + len < value.Length && Char.IsDigit(value[pos+len]) == number) {
            len++;
         }
         if (number) {
            _numbers.Add(int.Parse(value.Substring(pos, len)));
         } else {
            _strings.Add(value.Substring(pos, len));
         }
         pos += len;
         number = !number;
      }
   }

   public int CompareTo(StringNum other) {
      int index = 0;
      while (index < _strings.Count && index < other._strings.Count) {
         int result = _strings[index].CompareTo(other._strings[index]);
         if (result != 0) return result;
         if (index < _numbers.Count && index < other._numbers.Count) {
            result = _numbers[index].CompareTo(other._numbers[index]);
            if (result != 0) return result;
         } else {
            return index == _numbers.Count && index == other._numbers.Count ? 0 : index == _numbers.Count ? -1 : 1;
         }
         index++;
      }
      return index == _strings.Count && index == other._strings.Count ? 0 : index == _strings.Count ? -1 : 1;
   }

}

Example:

List<string> items = new List<string> {
  "item_66b",
  "999",
  "item_5",
  "14",
  "file_14",
  "26",
  "file_2",
  "item_66a",
  "9",
  "file_10",
  "item_1",
  "file_1"
};

items.Sort((a,b)=>new StringNum(a).CompareTo(new StringNum(b)));

foreach (string s in items) Console.WriteLine(s);

Output:

9
14
26
999
file_1
file_2
file_10
file_14
item_1
item_5
item_66a
item_66b
Guffa
  • @nawfal: If you call `Sort` without the comparer, it will use the default string comparison and you get a different result. Besides, the `List<T>` class did not exist prior to .NET 2.0. – Guffa Jun 23 '11 at 07:45
  • @Guffa no, I meant one has to implement all of what you wrote, including the interface, like `public class StringNum : IComparable<StringNum>`. But the last line of code `items.Sort((a,b)=>new StringNum(a).CompareTo(new StringNum(b)));` won't work in .NET 2.0, I suppose. Or would it? I just used `items.Sort()` instead of it, and got the code working! – nawfal Jun 23 '11 at 18:03
  • @nawfal: The lambda expression won't work in C# 2.0, so you just write it using a delegate instead: `items.Sort(delegate(string a, string b){ return new StringNum(a).CompareTo(new StringNum(b)); });`. – Guffa Jun 23 '11 at 18:47
  • @Guffa thanks for that. But then how did using just `.Sort()` work for me? Do you know why? When I merely tried to do `.Sort()` without any `IComparable`, the code didn't even run; it gave me an exception. See this link: http://www.codedigest.com/Articles/CSHARP/84_Sorting_in_Generic_List.aspx It has an `: IComparable` example in which they did just `.Sort()` (the first method in that link). So I guess mine should work – nawfal Jun 23 '11 at 19:18
  • @nawfal: There had to be some other reason for the exception. The `StringNum` class has no relation to the `String` class, so if you don't use it in the `Sort` call it won't change the result at all. In the example that you are reading they have a `List<Customer>` where the `Customer` class implements `IComparable`, so the `Sort` method will use that without specifying a comparison in the call. That doesn't work with a `List<string>` as you can't change the implementation of the `String` class. – Guffa Jun 23 '11 at 19:40
  • @Guffa ya you are right. I'm going for your piece of code `items.Sort(delegate(string a, string b){ return new StringNum(a).CompareTo(new StringNum(b)); });` . Thanks for that ! :) – nawfal Jun 24 '11 at 12:39
  • @JoshG: Yes, that is true. However, it's sometimes hard to determine when a number actually is negative. For example, in a string like `"abc-45-def"`, the first dash could either be a separator or indicating a negative number. – Guffa Nov 01 '13 at 21:39
  • @Guffa: nice work, but sorting these two lists (which have the same elements), produces different results: {"b10", "b", "a", "a10"}, {"a10","a","b","b10"}. Is it a bug or a feature ;) ? – Razvan Socol May 23 '15 at 18:38
  • @RazvanSocol: That's a bug, of course. :) Thanks for spotting it. I overlooked how different length strings should be sorted. I fixed the code in the answer. – Guffa May 24 '15 at 08:31
9

You could import the StrCmpLogicalW function and use that to sort the strings. This is the very same function that Explorer itself uses for file names.

It won't help if you want to avoid P/Invoke or need to stay compatible with other systems, though.

Joey
6

The following code based on Joey's suggestion works for me (extension method to string[]):

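using System;
using System.Runtime.InteropServices;

// Extension methods must live in a static class; this class name is illustrative.
public static class LogicalSortExtensions
{
    public static void SortLogical(this string[] files)
    {
        Array.Sort<string>(files, new Comparison<string>(StrCmpLogicalW));
    }

    [DllImport("shlwapi.dll", CharSet=CharSet.Unicode, ExactSpelling=true)]
    private static extern int StrCmpLogicalW(String x, String y);
}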
Yousef
2

A simple way is to pad the numeric portion like so:

file_00001
file_00002
file_00010
file_00011

etc.

But this relies on knowing the maximum value the numeric portion can take.
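
The padding does not have to change the names themselves. As a rough sketch, each run of digits can be padded only while building a sort key. The width of 10 and the names `PaddedKeySketch`, `PadNumbers`, and `SortPadded` are assumptions for illustration, and the same caveat applies: a number with more digits than the chosen width would still sort incorrectly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class PaddedKeySketch
{
    // Assumed maximum number of digits in any numeric run.
    private const int Width = 10;

    // Left-pads every run of ASCII digits with zeros,
    // e.g. "file_11" becomes "file_0000000011".
    public static string PadNumbers(string value)
    {
        return Regex.Replace(value, "[0-9]+", m => m.Value.PadLeft(Width, '0'));
    }

    // Orders the original strings by their padded keys using ordinal comparison.
    // The strings themselves are returned unchanged.
    public static List<string> SortPadded(IEnumerable<string> values)
    {
        return values.OrderBy(v => PadNumbers(v), StringComparer.Ordinal).ToList();
    }
}
```

With this, `PaddedKeySketch.SortPadded(new[] { "file_11", "file_1", "file_6", "file_0" })` returns file_0, file_1, file_6, file_11, because the padded keys compare correctly as plain strings.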

Mitch Wheat
2

I used the following approach in a project a while ago. It's not particularly efficient, but it performed well enough when the number of items to sort was not huge. It splits the strings to compare into arrays on the '_' character and then compares the arrays element by element. For the last element, it tries to parse both values as ints and, if that succeeds, compares them numerically.

It also exits early if the input strings contain a different number of elements: if you compare "file_nbr_1" to "file_23", it does not compare the parts individually but falls back to a regular string comparison on the full strings:

char[] splitChars = new char[] { '_' };
string[] strings = new[] {
    "file_1",
    "file_8",
    "file_11",
    "file_2"
};

Array.Sort(strings, delegate(string x, string y)
{
    // split the strings into arrays on each '_' character
    string[] xValues = x.Split(splitChars);
    string[] yValues = y.Split(splitChars);

    // if the arrays are of different lengths, just
    // make a regular string comparison on the full values
    if (xValues.Length != yValues.Length)
    {
        return x.CompareTo(y);
    }

    // So, the arrays are of equal length, compare each element
    for (int i = 0; i < xValues.Length; i++)
    {
        if (i == xValues.Length - 1)
        {
            // we are looking at the last element of the arrays

            // first, try to parse the values as ints
            int xInt = 0;
            int yInt = 0;
            if (int.TryParse(xValues[i], out xInt) 
                && int.TryParse(yValues[i], out yInt))
            {
                // if parsing the values as ints was successful 
                // for both values, make a numeric comparison 
                // and return the result
                return xInt.CompareTo(yInt);
            }
        }

        if (string.Compare(xValues[i], yValues[i], 
            StringComparison.InvariantCultureIgnoreCase) != 0)
        {
            break;
        }
    }

    return x.CompareTo(y);

});
Fredrik Mörk