
Each string in my array starts with two letters, followed by two or three digits, and sometimes ends with another letter.

Examples: RS01, RS10, RS32A, RS102, RS80, RS05A, RS105A, RS105B

I tried to sort this using the default Array.Sort but it came back with this...

RS01
RS05A
RS10
RS102
RS105A
RS105B
RS32A
RS80

But I need it like this..

RS01
RS05A
RS10
RS32A
RS80
RS102
RS105A
RS105B

Any ideas?

spajce
Sammy
  • Which .NET Framework version are you using? – spajce Feb 21 '13 at 22:01
  • Is it always starting with "RS"? If not, do you want to sort alphabetically first, then numerical? – Tim Schmelter Feb 21 '13 at 22:02
  • It will always start with "RS" – Sammy Feb 21 '13 at 22:02
  • Do you want to take the number inside the string as a number instead of a string? I swear there's something in Win32 for that... – It'sNotALie. Feb 21 '13 at 22:08
  • @Sammy If it always starts with "RS" I'd suggest just trimming that and then adding it back in when you're done; it will probably make parsing it easier. Then you only need to check if the last character is a letter; if it is, store it and parse the number out of the start, if not, parse the whole thing as a number. – Servy Feb 21 '13 at 22:10
  • @ofstream Well, in the general case, I believe it's an insolvable problem. It's called a "natural sort". There are a number of different implementations you can find of them. – Servy Feb 21 '13 at 22:12
  • @Servy Found it. http://msdn.microsoft.com/en-gb/library/windows/desktop/bb759947(v=vs.85).aspx StrCmpLogicalW. now if I was better at P/Invoke I would write an answer. – It'sNotALie. Feb 21 '13 at 22:13
  • Huh, seems that someone's already wrapped it. Link: http://stackoverflow.com/questions/248603/natural-sort-order-in-c-sharp/248613#248613 – It'sNotALie. Feb 21 '13 at 22:16

3 Answers


Here is sorting with custom comparison delegate and regular expressions:

string[] array = { "RS01", "RS10", "RS32A", "RS102", 
                   "RS80", "RS05A", "RS105A", "RS105B" };

Array.Sort(array, (s1, s2) =>
    {
        Regex regex = new Regex(@"([a-zA-Z]+)(\d+)([a-zA-Z]*)");
        var match1 = regex.Match(s1);                                        
        var match2 = regex.Match(s2);

        // prefix
        int result = match1.Groups[1].Value.CompareTo(match2.Groups[1].Value);
        if (result != 0)
            return result;

        // number 
        result = Int32.Parse(match1.Groups[2].Value)
                        .CompareTo(Int32.Parse(match2.Groups[2].Value));

        if (result != 0)
            return result;

        // suffix
        return match1.Groups[3].Value.CompareTo(match2.Groups[3].Value);
    });

UPDATE: after a little refactoring, everything now lives in a separate comparer class. Usage:

Array.Sort(array, new RSComparer());

Comparer itself:

public class RSComparer : IComparer<string>
{
    private Dictionary<string, RS> entries = new Dictionary<string, RS>();

    public int Compare(string x, string y)
    {
        if (!entries.ContainsKey(x))
            entries.Add(x, new RS(x));

        if (!entries.ContainsKey(y))
            entries.Add(y, new RS(y));

        return entries[x].CompareTo(entries[y]);
    }

    private class RS : IComparable
    {
        public RS(string value)
        {
            Regex regex = new Regex(@"([A-Z]+)(\d+)([A-Z]*)");
            var match = regex.Match(value);
            Prefix = match.Groups[1].Value;
            Number = Int32.Parse(match.Groups[2].Value);
            Suffix = match.Groups[3].Value;
        }

        public string Prefix { get; private set; }
        public int Number { get; private set; }
        public string Suffix { get; private set; }

        public int CompareTo(object obj)
        {
            RS rs = (RS)obj;
            int result = Prefix.CompareTo(rs.Prefix);
            if (result != 0)
                return result;

            result = Number.CompareTo(rs.Number);
            if (result != 0)
                return result;

            return Suffix.CompareTo(rs.Suffix);
        }
    }
}
Sergey Berezovskiy
  • Looks like the only solution here that would work. Might be better to do the parsing before you start the sort so that you aren't parsing each one over and over, but for a small data set that's not an issue. – Servy Feb 21 '13 at 22:16
  • @Servy thanks! And yes, that works, verified :) Actually, I'd better go with comparer instead of delegate (code will not be that messy then) – Sergey Berezovskiy Feb 21 '13 at 22:17
  • @Sammy take a look at the updated answer: I made a comparer, so your code will look cleaner. I also created an inner class RS for holding the prefix, number, and suffix (you may want to use this class instead of strings in your application). And one last thing: the comparer caches already-parsed entries, so it's more efficient this time :) – Sergey Berezovskiy Feb 21 '13 at 22:29

You can use this LINQ query:

var strings = new[] { 
    "RS01","RS05A","RS10","RS102","RS105A","RS105B","RS32A","RS80"
};
strings = strings.Select(str => new
{
    str,
    num = int.Parse(String.Concat(str.Skip(2).TakeWhile(Char.IsDigit))),
    version = String.Concat(str.Skip(2).SkipWhile(Char.IsDigit))
})
.OrderBy(x => x.num).ThenBy(x => x.version)
.Select(x => x.str)
.ToArray();


Result:

RS01
RS05A
RS10
RS32A
RS80
RS102
RS105A
RS105B
Tim Schmelter
  • @granadaCoder: LINQ is not always better. But it can be more readable and shorter (not always). It's also not always the most efficient, but often that doesn't matter. – Tim Schmelter Feb 21 '13 at 22:44

You'll want to write a custom comparer class implementing IComparer<string>; it's pretty straightforward to break your strings into components. When you call Array.Sort, give it an instance of your comparer and you'll get the results you want.
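
A minimal sketch of what such a comparer might look like, splitting each string by hand rather than with a regular expression. The class name RsKeyComparer and the helper Split are made up for illustration, and the code assumes every item is non-null and contains at least one digit (letters, then digits, then optional letters), as described in the question:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer (the name is illustrative, not from any library).
// Assumes each key is non-null and shaped like: letters + digits + optional letters.
public sealed class RsKeyComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        string prefixX, suffixX, prefixY, suffixY;
        int numberX, numberY;
        Split(x, out prefixX, out numberX, out suffixX);
        Split(y, out prefixY, out numberY, out suffixY);

        // 1. Leading letters, compared as plain text.
        int result = string.CompareOrdinal(prefixX, prefixY);
        if (result != 0) return result;

        // 2. Digits, compared as a number so that 80 sorts before 102.
        result = numberX.CompareTo(numberY);
        if (result != 0) return result;

        // 3. Trailing letters (empty when absent, so "RS105" sorts before "RS105A").
        return string.CompareOrdinal(suffixX, suffixY);
    }

    // Breaks "RS105A" into "RS", 105 and "A".
    // int.Parse throws FormatException if the string contains no digits.
    private static void Split(string s, out string prefix, out int number, out string suffix)
    {
        int start = 0;
        while (start < s.Length && !char.IsDigit(s[start])) start++;

        int end = start;
        while (end < s.Length && char.IsDigit(s[end])) end++;

        prefix = s.Substring(0, start);
        number = int.Parse(s.Substring(start, end - start));
        suffix = s.Substring(end);
    }
}

public static class Program
{
    public static void Main()
    {
        string[] array = { "RS01", "RS10", "RS32A", "RS102",
                           "RS80", "RS05A", "RS105A", "RS105B" };

        Array.Sort(array, new RsKeyComparer());

        // Prints: RS01 RS05A RS10 RS32A RS80 RS102 RS105A RS105B
        Console.WriteLine(string.Join(" ", array));
    }
}
```

This version re-parses both strings on every comparison, which is fine for a small array; for larger inputs it may be worth parsing each item once up front and sorting on the parsed values instead.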

Ben