3

I have a simple solution for a small problem, but as a newbie developer, I always want to learn the more correct/better ways to do something. At the moment, I get a string that looks more or less like this...

Some Random Text, Reference Number: Supp-1234 and some more random text...

Now if we assume this scenario and we want to get only the digit part of the reference number, we can do something like this...

int firstCharacter = mystring.IndexOf("Supp-", StringComparison.InvariantCultureIgnoreCase);
myIssueId = firstCharacter != -1 ? mystring.Substring(firstCharacter + 5, 4) : "";

But now suppose we get the reference number Supp-12345; this won't work any more. You could try something like this...

int firstCharacter = mystring.IndexOf("Supp-", StringComparison.InvariantCultureIgnoreCase);
if (firstCharacter != -1)
{
    string temp = mystring.Substring(firstCharacter + 5, 5);
    try
    {
        int x = Convert.ToInt32(temp); // to ensure it is indeed a number
    }
    catch (Exception)
    {
        temp = mystring.Substring(firstCharacter + 5, 4); // it is 4 digits, not 5
    }
    myIssueId = temp;
}

Now my question is this: how do I improve this code? It's a bit too messy for my liking. Any ideas will be appreciated.

Vivek Jain
KapteinMarshall
  • 1
    This seems like a job for regular expressions: http://msdn.microsoft.com/en-us/library/hs600312.aspx. Here you'd use something like `Supp-(?<issue>\d+)`, then the substring representing only the digits would be in the capture group `issue`. – millimoose Aug 12 '13 at 12:02
  • 1
    As much as I hate to say it, this is the type of problem Regular Expressions were created to solve. – Binary Worrier Aug 12 '13 at 12:02
  • That's smart... regular expressions... I remember using these at university... – KapteinMarshall Aug 12 '13 at 12:03
  • Regex I think is your answer http://stackoverflow.com/questions/4734116/c-sharp-find-and-extract-number-from-a-string – Squirrel5853 Aug 12 '13 at 12:04
  • @BinaryWorrier Why hate to say it? REs aren't inherently bad, just widely overused for problems that are too complex. Finding a simple pattern in free-form text is fine. – millimoose Aug 12 '13 at 12:07
  • @millimoose: because I find any regex longer than a few characters prohibitively obtuse. This is a perfect example of when to use a regex, but I've seen an unfortunate number of inappropriate regex "solutions". The most important thing to learn when using Regex is when _NOT_ to use Regex. – Binary Worrier Aug 12 '13 at 12:13
  • From what I have read and learned from my older, and supposedly wiser, brother, regular expressions tend to be very fast! May I ask you guys, when is it not a good idea to use a regular expression? – KapteinMarshall Aug 12 '13 at 12:16
  • When not to use Regular Expressions http://programmers.stackexchange.com/questions/113237/when-you-should-not-use-regular-expressions – Binary Worrier Aug 12 '13 at 12:33
  • @KapteinMarshall Regular expressions can be very fast as well as very slow, depending on what they do. (Backtracking a lot can be very expensive, and backtracking is essential to handling repeated constructs in the middle of a pattern.) They will frequently be faster than using string operations in scripting languages where the RE engine is implemented in C and doesn't need to touch the interpreter state while it works. In C# I'd expect REs to be mostly a wash with well-written parsing code, so just use what's appropriate. – millimoose Aug 12 '13 at 12:49

1 Answer

3

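You are looking for some digits that are preceded by the string "Supp-". This can be translated into a regular expression using a lookbehind, like

(?<=Supp-)\d+

You'd use it like myIssueId = Regex.Match(input, @"(?<=Supp-)\d+").Value. Note the verbatim string (@"..."), so that \d is not treated as a C# escape sequence. If nothing matches, Value is the empty string, which mirrors the "" fallback in your original code.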
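To put that in context, here is a minimal sketch of how the lookbehind pattern could be wrapped up so it handles reference numbers of any length. The class name IssueIdParser and the method ExtractIssueId are hypothetical names chosen for illustration, and passing RegexOptions.IgnoreCase is an assumption made to mirror the case-insensitive IndexOf call in the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; the names are not from the original post.
public static class IssueIdParser
{
    // (?<=Supp-) is a lookbehind: "Supp-" must appear immediately before the
    // digits, but it is not included in the matched value.
    // \d+ matches one or more digits, so 4-digit and 5-digit IDs both work.
    // IgnoreCase is an assumption, mirroring the question's case-insensitive IndexOf.
    private static readonly Regex IssuePattern = new Regex(
        @"(?<=Supp-)\d+",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    // Returns the digits following "Supp-", or "" when no reference number is found.
    public static string ExtractIssueId(string text)
    {
        if (text == null)
        {
            return "";
        }

        Match match = IssuePattern.Match(text);
        return match.Success ? match.Value : "";
    }
}
```

With this sketch, ExtractIssueId("Reference Number: Supp-1234 and more text") gives "1234", the same call with Supp-12345 gives "12345", and a string without a reference number gives "". No try/catch or fixed-length Substring is needed, because the pattern itself decides where the digits end.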

Jens