
In my ASP.NET project I need to validate user input against some basic data types, such as integers, decimals and dates.

Which approach is better in terms of performance: Regex.IsMatch() or TryParse()?

Thanks in advance.

Ian Kemp
Thanushka
  • Do you really care? Compared to everything else ASP.NET does, either approach is insignificant in terms of performance impact. – mmix Apr 19 '11 at 10:50
  • Use the [Ajax Control Toolkit](http://ajaxcontroltoolkit.codeplex.com/) and be happy. – Soner Gönül Apr 19 '11 at 10:52
  • Use what is most appropriate / easiest to maintain. Performance is a non-issue in this case :) – Eben Roux Apr 19 '11 at 11:50

4 Answers


TryParse and Regex.IsMatch are used for two fundamentally different things. Regex.IsMatch tells you if the string in question matches some particular pattern. It returns a yes/no answer. TryParse actually converts the value if possible, and tells you whether it succeeded.

Unless you're very careful in crafting the regular expression, Regex.IsMatch can return true when TryParse will return false. For example, consider the simple case of parsing a byte. With TryParse you have:

byte b;
bool isGood = byte.TryParse(myString, out b);

If myString holds an integer between 0 and 255, TryParse will return true and store the value in b.

Now, let's try with Regex.IsMatch. Let's see, what should that regular expression be? We can't just say @"\d+" or even @"\d{1,3}". Specifying the format becomes a very difficult job. You have to handle leading 0s, leading and trailing white space, and allow 255 but not 256.

And that's just for parsing a 3-digit number. The rules get even more complicated when you're parsing an int or long.
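
To make the gap concrete, here is a minimal sketch that puts a naive pattern next to byte.TryParse. The names ByteCheck, LooksLikeByte and IsByte are made up for this illustration, and the explicit NumberStyles.Integer and invariant culture are my own choices rather than anything the question specified.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class ByteCheck
{
    // Form check only: one to three ASCII digits.
    // The pattern knows nothing about the 0..255 range.
    public static bool LooksLikeByte(string s)
    {
        return Regex.IsMatch(s, @"^[0-9]{1,3}$");
    }

    // Value check: true only if the text parses to something that fits in a byte.
    // NumberStyles.Integer allows surrounding white space and a leading sign.
    public static bool IsByte(string s)
    {
        byte value;
        return byte.TryParse(s, NumberStyles.Integer, CultureInfo.InvariantCulture, out value);
    }
}
```

With these two helpers, "256" passes LooksLikeByte but fails IsByte, while " 255 " fails the pattern and passes IsByte. Making the pattern agree with the parser on both inputs is exactly the tedious part.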

Regular expressions are great for determining form. They suck when it comes to determining value. Since our standard data types all have limits, checking the value is part of figuring out whether or not the input is valid.

You're better off using TryParse whenever possible, if only to save yourself the headache of trying to come up with a reliable regular expression that will do the validation. It's likely (I'd say almost certain) that a particular TryParse for any of the native types will execute faster than the equivalent regular expression.

The above said, I've probably spent more time on this answer than your Web page will spend executing your TryParse or Regex.IsMatch--total throughout its entire life. The time to execute these things is so small in the context of everything else your Web site is doing, any time you spend pondering the problem is wasted.

Use TryParse if you can, because it's easier. Otherwise use Regex.
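
As a rough sketch of what that looks like for the types mentioned in the question: the InputValidator class and its method names are hypothetical, and I'm assuming culture-invariant input, which may not match what your users actually type.

```csharp
using System;
using System.Globalization;

public static class InputValidator
{
    // Whole numbers that fit in an Int32; out-of-range input fails just like garbage does.
    public static bool IsInteger(string s)
    {
        int value;
        return int.TryParse(s, NumberStyles.Integer, CultureInfo.InvariantCulture, out value);
    }

    // Decimal numbers with '.' as the decimal separator (invariant culture).
    public static bool IsDecimal(string s)
    {
        decimal value;
        return decimal.TryParse(s, NumberStyles.Number, CultureInfo.InvariantCulture, out value);
    }

    // Dates in any format the invariant culture recognises.
    public static bool IsDate(string s)
    {
        DateTime value;
        return DateTime.TryParse(s, CultureInfo.InvariantCulture, DateTimeStyles.None, out value);
    }
}
```

In real code you would usually keep the parsed value instead of discarding it, since you will need it right after validation anyway.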

Jim Mischel

As others would say, the best way to answer that is to measure it ;)

    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;
    using System.Text.RegularExpressions;

    class Program
    {
        static void Main(string[] args)
        {
            // Per-call timings in milliseconds, one list per scenario.
            // Note: the first iteration also pays one-time JIT and regex
            // construction costs, which skews the means slightly.
            List<double> meansFailedTryParse = new List<double>();
            List<double> meansFailedRegEx = new List<double>();
            List<double> meansSuccessTryParse = new List<double>();
            List<double> meansSuccessRegEx = new List<double>();

            for (int i = 0; i < 1000; i++)
            {
                // Input that is not a valid integer.
                string input = "123abc";

                int res;
                bool res2;
                var sw = Stopwatch.StartNew();
                res2 = Int32.TryParse(input, out res);
                sw.Stop();
                meansFailedTryParse.Add(sw.Elapsed.TotalMilliseconds);

                sw = Stopwatch.StartNew();
                res2 = Regex.IsMatch(input, @"^[0-9]*$");
                sw.Stop();
                meansFailedRegEx.Add(sw.Elapsed.TotalMilliseconds);

                // Input that is a valid integer.
                input = "123";
                sw = Stopwatch.StartNew();
                res2 = Int32.TryParse(input, out res);
                sw.Stop();
                meansSuccessTryParse.Add(sw.Elapsed.TotalMilliseconds);

                sw = Stopwatch.StartNew();
                res2 = Regex.IsMatch(input, @"^[0-9]*$");
                sw.Stop();
                meansSuccessRegEx.Add(sw.Elapsed.TotalMilliseconds);
            }

            Console.WriteLine("Failed TryParse mean execution time     " + meansFailedTryParse.Average());
            Console.WriteLine("Failed Regex mean execution time        " + meansFailedRegEx.Average());

            Console.WriteLine("Successful TryParse mean execution time " + meansSuccessTryParse.Average());
            Console.WriteLine("Successful Regex mean execution time    " + meansSuccessRegEx.Average());
        }
    }
Bruno
  • tl;dr for those who actually need an answer: `int.TryParse()` is always faster than `Regex.IsMatch()` (up to an order of magnitude so). It's also faster than a Regex object constructed with `RegexOptions.Compiled`. Make sure you use the appropriate type (e.g. `ulong`) for your situation, depending on the length of your numbers, and bear in mind that `decimal.TryParse()` is significantly slower than `int.TryParse()`. If neither of these works for you, `stringVariable.All(char.IsDigit)` provides an acceptable middle ground in terms of performance, and it's nice and concise too. – Ian Kemp Apr 16 '14 at 16:12

Don't try to make regexes do everything.

Sometimes a simple regex will get you 90% of the way, but making it do everything you need multiplies its complexity ten times or more.

Then I often find that the simplest solution is to use the regex to check the form and then rely on good old code for the value checking.

Take a date, for example: use a regex to check that the input matches a date format, then use capturing groups to pull out the individual parts and check their values in code.
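
Something along these lines, as a sketch only. The yyyy-MM-dd format, the DateFormCheck class and the group names are my assumptions for the example; swap in whatever format you actually accept.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DateFormCheck
{
    // Form only: four digits, two digits, two digits, separated by hyphens.
    private static readonly Regex IsoDate =
        new Regex(@"^(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})$");

    public static bool IsValidDate(string s)
    {
        if (s == null) return false;

        Match m = IsoDate.Match(s);
        if (!m.Success) return false;

        // The regex guarantees these groups contain only digits, so Parse cannot throw.
        int year = int.Parse(m.Groups["year"].Value);
        int month = int.Parse(m.Groups["month"].Value);
        int day = int.Parse(m.Groups["day"].Value);

        // Value checks live in plain code, where they are easy to read.
        if (year < 1 || month < 1 || month > 12) return false;
        return day >= 1 && day <= DateTime.DaysInMonth(year, month);
    }
}
```

The regex stays short because it only describes the shape, and DateTime.DaysInMonth takes care of month lengths and leap years.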

Daniel

I'd guess TryParse is quicker, but more importantly, it's more expressive.

The regular expressions can get pretty ugly when you consider all the valid values for each data type you're using. For example, with DateTime you have to ensure the month is between 1 and 12, and that the day is within the valid range for that particular month.
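
A quick illustration of the difference, assuming a fixed yyyy-MM-dd input format (the DateCheck class and its method names are invented for this sketch):

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class DateCheck
{
    // Form-only pattern: accepts any digits in the right positions.
    public static bool MatchesDatePattern(string s)
    {
        return Regex.IsMatch(s, @"^[0-9]{4}-[0-9]{2}-[0-9]{2}$");
    }

    // TryParseExact checks the form and the calendar in one call.
    public static bool IsRealDate(string s)
    {
        DateTime value;
        return DateTime.TryParseExact(
            s, "yyyy-MM-dd", CultureInfo.InvariantCulture, DateTimeStyles.None, out value);
    }
}
```

Here "2011-02-30" satisfies MatchesDatePattern but not IsRealDate. Teaching the pattern about month lengths and leap years is where it gets ugly.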

Misko