I have a requirement to sort some strings that contain data like this:

var strings = new List<string>{"2009 Arrears","2008 Arrears","2008 Arrears Interest","2009 Arrears Interest"};

And they want the results ordered like this:

  1. "2009 Arrears"
  2. "2009 Arrears Interest"
  3. "2008 Arrears"
  4. "2008 Arrears Interest"

It seems like I need a function that checks whether the string starts with a number. If so, it should extract all digits up to the first non-digit character, so that I can sort by the numeric part descending and then by the remaining characters ascending. I am having trouble writing a method that gets the leading digits of a string. What would be an efficient way to do that?

MackM
Rob Packwood

2 Answers

public int GetLeadingNumber(string input)
{
    char[] chars = input.ToCharArray();
    int lastValid = -1;

    for(int i = 0; i < chars.Length; i++)
    {
        if(Char.IsDigit(chars[i]))
        {
            lastValid = i;
        }
        else
        {
            break;
        }
    }

    if(lastValid >= 0)
    {
        return int.Parse(new string(chars, 0, lastValid + 1));
    }
    else
    {
        return -1;
    }
}

Though this is likely the most efficient approach, the regular expression solutions offered by other posters are more concise and may be clearer, depending on how much processing you'll do on the string.
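To tie the helper back to the ordering asked for in the question, here is a minimal sketch of how a leading-number parser could feed a LINQ sort. The names `LeadingNumberSort`, `Split` and `SortByYearThenTitle` are made up for illustration. It also assumes ASCII digits only, a return of -1 when there is no usable leading number, and ordinal comparison for the remaining text; adjust those to your data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LeadingNumberSort
{
    // Splits "2009 Arrears" into (2009, "Arrears").
    // Assumption: Number is -1 when the string has no leading digits
    // or the digit run does not fit in an int.
    public static (int Number, string Rest) Split(string input)
    {
        int length = 0;
        // Count leading ASCII digits without copying the string first.
        while (length < input.Length && input[length] >= '0' && input[length] <= '9')
        {
            length++;
        }

        string rest = input.Substring(length).Trim();
        if (length > 0 && int.TryParse(input.Substring(0, length), out int number))
        {
            return (number, rest);
        }
        return (-1, rest);
    }

    // Orders by leading number descending, then by the remaining text ascending.
    public static List<string> SortByYearThenTitle(IEnumerable<string> items)
    {
        return items
            .OrderByDescending(s => Split(s).Number)
            .ThenBy(s => Split(s).Rest, StringComparer.Ordinal)
            .ToList();
    }

    public static void Main()
    {
        var strings = new List<string>
        {
            "2009 Arrears", "2008 Arrears", "2008 Arrears Interest", "2009 Arrears Interest"
        };

        foreach (string s in SortByYearThenTitle(strings))
        {
            Console.WriteLine(s);
        }
    }
}
```

Comparing the parsed `int` rather than the digit string keeps the ordering correct even if the numbers have different lengths (for example "950" versus "2009").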

Adam Robinson

A regex would split this up nicely:

var match = Regex.Match(text, @"^(\d+) (.*)$");

Then match.Groups[1].Value is the year, and match.Groups[2].Value is the title ("Arrears", "Arrears Interest", etc.). Note that match.Groups[0] is always the entire match, so the capture groups start at index 1.

You can use LINQ to apply the sort (year descending, title ascending):

string[] titles = new[] { "2008 Arrears", "2009 Arrears" };

var sortedTitles = 
    from title in titles
    let match = Regex.Match(title, @"^(\d+) (.*)$")
    orderby match.Groups[1].Value descending, match.Groups[2].Value
    select title;

listBox.ItemsSource = sortedTitles.ToArray();  // for example

A regex may not be the fastest solution; here's an alternative that's still kept nice and clean with LINQ:

var sortedTitles =
    from title in titles
    let year = new string(title.TakeWhile(ch => char.IsDigit(ch)).ToArray())
    let remainder = title.Substring(year.Length).Trim()
    orderby year descending, remainder
    select title;
Ben M
  • Note that this expression will require the numbers to be followed by a space (unless the Regex is created/invoked with the `RegexOptions.IgnorePatternWhitespace` option). – Fredrik Mörk Sep 30 '09 at 17:55
  • Yes--I assumed the space is a guaranteed element of the string. – Ben M Sep 30 '09 at 17:57
  • I know this is a very late comment, but since you use a greedy \d+, you could simply follow the space with "?" so that it is not mandatory. If your leading number is a recent year (anything from 1000 up, actually), you might also use @"^(\d{4}) ?(.*)$" for performance reasons (the engine may be faster if you tell it to look for exactly four digits). But then you have to be prepared for strings with errors in the leading date, which you should be anyway. – Ando Jurai Jun 06 '20 at 06:37
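
For reference, the optional-space pattern from the last comment could be used roughly as follows. This is only a sketch: `RegexYearSort`, `Split` and `Sort` are hypothetical names. It assumes a title with no leading number should get -1 and so sort last, and it parses the captured digits to an `int` so that the comparison is numeric rather than textual.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexYearSort
{
    // Group 1: leading digits. Group 2: the rest, after an optional single space.
    private static readonly Regex LeadingNumber = new Regex(@"^(\d+) ?(.*)$");

    public static (int Number, string Rest) Split(string title)
    {
        Match match = LeadingNumber.Match(title);
        if (match.Success && int.TryParse(match.Groups[1].Value, out int number))
        {
            return (number, match.Groups[2].Value);
        }
        // Assumption: titles without a usable leading number get -1.
        return (-1, title);
    }

    // Leading number descending, then the remaining text ascending.
    public static string[] Sort(string[] titles)
    {
        return titles
            .OrderByDescending(t => Split(t).Number)
            .ThenBy(t => Split(t).Rest, StringComparer.Ordinal)
            .ToArray();
    }

    public static void Main()
    {
        string[] titles = { "2008Arrears", "2009 Arrears Interest", "2009 Arrears" };
        Console.WriteLine(string.Join(", ", Sort(titles)));
    }
}
```

Because the space is optional, "2008Arrears" and "2008 Arrears" both yield the remainder "Arrears".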