61

I allow users to enter a regular expression to match IP addresses, for IP filtering in a related system. I would like to validate that the entered regular expressions are valid, as a lot of users will mess up, though with good intentions.

I can of course do a Regex.IsMatch() inside a try/catch and see if it blows up that way, but are there any smarter ways of doing it? Speed is not an issue as such, I just prefer to avoid throwing exceptions for no reason.

Andy Lester
Mark S. Rasmussen
  • do you mean blowing up on creating the actual Regex? new regex(str) ? – Nicholas Mancuso Oct 20 '08 at 14:52
  • Allowing the users to enter a start and end value for each octet (or a similar solution) might be worth considering instead of regex. – Greg Oct 20 '08 at 14:56
  • You might also consider using CIDR (192.168.0.0/24) if your IP address regex is for ranges. http://en.wikipedia.org/wiki/CIDR – Richard Szalay Nov 21 '09 at 10:07
  • I have a method to test whether a RegEx is valid, but it just wraps the regex in a Try/Catch. I'm not sure if there's a better way to do this, but I couldn't find one. – Jon Tackabury Oct 20 '08 at 14:46
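As a rough illustration of the CIDR suggestion in the comments above, here is a minimal sketch of an IPv4-only range check that sidesteps user-supplied regexes entirely. The class and method names (`CidrSketch`, `IsInCidrRange`) are made up for this example; only `IPAddress.TryParse` and `GetAddressBytes` come from the framework.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

public static class CidrSketch
{
    // Hypothetical helper: true if `address` lies inside `cidr` (e.g. "192.168.0.0/24").
    // IPv4 only; anything malformed simply yields false.
    public static bool IsInCidrRange(string address, string cidr)
    {
        if (address == null || cidr == null) return false;

        string[] parts = cidr.Split('/');
        if (parts.Length != 2) return false;

        if (!IPAddress.TryParse(parts[0], out var network)) return false;
        if (!IPAddress.TryParse(address, out var candidate)) return false;
        if (!int.TryParse(parts[1], out int prefix) || prefix < 0 || prefix > 32) return false;

        if (network.AddressFamily != AddressFamily.InterNetwork ||
            candidate.AddressFamily != AddressFamily.InterNetwork)
        {
            return false;
        }

        // A shift by 32 is a no-op on uint in C#, so the /0 case is handled explicitly.
        uint mask = prefix == 0 ? 0u : uint.MaxValue << (32 - prefix);
        return (ToUInt32(network) & mask) == (ToUInt32(candidate) & mask);
    }

    private static uint ToUInt32(IPAddress ip)
    {
        byte[] b = ip.GetAddressBytes(); // big-endian (network order)
        return ((uint)b[0] << 24) | ((uint)b[1] << 16) | ((uint)b[2] << 8) | b[3];
    }
}
```

With this approach the user input is validated by parsing rather than by compiling a pattern, so no exception handling is involved for bad input.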

9 Answers

59

I think exceptions are OK in this case.

Just make sure to short-circuit and eliminate the exceptions you can:

private static bool IsValidRegex(string pattern)
{
    if (string.IsNullOrWhiteSpace(pattern)) return false;

    try
    {
        Regex.Match("", pattern);
    }
    catch (ArgumentException)
    {
        return false;
    }

    return true;
}
Shimmy Weitzhandler
Jeff Atwood
  • 1
    I wonder will JIT compiler be smart or dumb enough to optimize away the whole try catch block because the return value of a pure function is not used? – deerchao Nov 23 '13 at 16:44
  • Would `IsMatch()` be any faster/better than `Match()`, seeing that we don't actually want to perform a match? Just like testing for primality is infinitely (well, almost) faster than actually finding the factors. – dotNET Aug 28 '16 at 12:41
  • 1
    `IsMatch()` [calls](https://github.com/dotnet/corefx/blob/master/src/System.Text.RegularExpressions/src/System/Text/RegularExpressions/Regex.cs) `internal Match Run(bool quick, int prevlen, string input, int beginning, int length, int startat)` with `quick` set to `true` while `Match()` calls it with `quick` set to `false`. It is indeed a bit faster, about **1-5%** according to my simple tests. – Mikael Dúi Bolinder Dec 29 '17 at 09:14
  • 4
    How about just `new Regex(pattern)`? – Drew Noakes Mar 16 '18 at 20:53
  • Question specifically asks if it can be done without handling an exception. – Bretton Wade Aug 12 '19 at 18:35
  • Does `RegexParseException` need to be handled as well? – Jess Apr 27 '22 at 21:00
  • Never mind, `RegexParseException` is an `ArgumentException`. Catching ArgumentException will handle both! – Jess Apr 27 '22 at 21:10
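Following the `new Regex(pattern)` suggestion in the comments above, here is a minimal sketch of a `TryParse`-style wrapper. It still catches the exception internally, since the framework exposes no public validation API, but the caller gets a bool, the constructed `Regex`, and the parser's message. `RegexValidation` and `TryCreateRegex` are hypothetical names, not framework members.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexValidation
{
    // Hypothetical helper (not a framework API): wraps the Regex constructor
    // so callers never see the exception.
    public static bool TryCreateRegex(string pattern, out Regex regex, out string error)
    {
        regex = null;
        error = null;

        // Short-circuit the cases that need no parsing at all.
        if (string.IsNullOrWhiteSpace(pattern))
        {
            error = "Pattern is empty.";
            return false;
        }

        try
        {
            // The constructor parses the pattern; no match has to be run.
            regex = new Regex(pattern);
            return true;
        }
        catch (ArgumentException ex)
        {
            // Also covers RegexParseException, which derives from ArgumentException.
            error = ex.Message;
            return false;
        }
    }
}
```

Returning the message lets the UI tell the user what is wrong with the pattern instead of just rejecting it.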
42

As long as you catch very specific exceptions, just do the try/catch.

Exceptions are not evil if used correctly.

Robert Deml
  • 6
    The question specifically asks if it can be done without handling an exception. – Bretton Wade Aug 12 '19 at 18:34
  • 4
    `Exceptions are not evil if used correctly` but they are expensive to throw as they include a dump of the stack trace inside of them. Exceptions should be used for actual errors (ideally) and not for testing inputs – Liam Feb 03 '21 at 09:29
  • 1
    `but they are expensive to throw` I just spent the day trying to convert MS `RegexParser.ScanRegex()` into something that will return just an `enum` representing the error. I would bet that while exceptions may be memory expensive to throw, that there would be a significant performance cost to adding a bunch of checks. if I finish it I'll benchmark it. – Michael Wagner Dec 12 '21 at 23:19
  • 1
    I guess I'll eat my hat https://github.com/mwagnerEE/BenchmarkResults/blob/main/ToThrowOrNotToThrow-report-github.md – Michael Wagner Dec 13 '21 at 03:16
7

Not without a lot of work. Regex parsing can be pretty involved, and there's nothing public in the Framework to validate an expression.

System.Text.RegularExpressions.RegexNode.ScanRegex() looks to be the main function responsible for parsing an expression, but it's internal (and throws exceptions for any invalid syntax anyway). So you'd be required to reimplement the parse functionality - which would undoubtedly fail on edge cases or Framework updates.

I think just catching the ArgumentException is as good an idea as you're likely to have in this situation.

Liam
Mark Brackett
  • 1
    We have TryParse to deal with potentially malformed numbers. We should have a TryRegex to do the same thing--return a failure rather than an exception. Debugging user-entered regexes is annoying! – Loren Pechtel Aug 16 '16 at 20:42
  • I think its actually `System.Text.RegularExpressions.RegexParser.ScanRegex()` Source: https://referencesource.microsoft.com/#System/regex/system/text/regularexpressions/RegexParser.cs – Michael Wagner Dec 12 '21 at 19:35
5

I have always used the function below and have had no problems with it. It relies on both an exception and a timeout, but it works. Note that it requires .NET Framework 4.5 or later.

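    public static bool IsValidRegexPattern(string pattern, string testText = "", int maxSecondTimeOut = 20)
    {
        if (string.IsNullOrEmpty(pattern)) return false;
        try
        {
            // Construct inside the try: the constructor is what throws on a malformed pattern.
            Regex re = new Regex(pattern, RegexOptions.None, TimeSpan.FromSeconds(maxSecondTimeOut));
            re.IsMatch(testText);
        }
        catch (ArgumentException) { return false; }          // malformed pattern or invalid timeout
        catch (RegexMatchTimeoutException) { return false; } // match exceeded the timeout
        return true;
    }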
MiMFa
2

A malformed regex isn't the worst of reasons for an exception.

Unless you restrict yourself to a very limited subset of regex syntax, and then write a regex (or a parser) for that, I think you have no other way of testing whether it is valid than to try to build a state machine from it and make it match something.

Tomalak
2

Depending on who the target is for this, I'd be very careful. It's not hard to construct regexes that can backtrack on themselves and eat a lot of CPU and memory -- they can be an effective Denial of Service vector.
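One way to limit that risk is to give every match against a user-supplied pattern a timeout. Below is a minimal sketch, assuming .NET Framework 4.5 or later, where the `Regex.IsMatch` overload that takes a `TimeSpan` is available. `SafeMatch` and `TryIsMatch` are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SafeMatch
{
    // Hypothetical helper: returns false if the pattern is invalid or the
    // match does not finish within `timeout`; otherwise sets `isMatch`.
    public static bool TryIsMatch(string pattern, string input, TimeSpan timeout, out bool isMatch)
    {
        isMatch = false;
        try
        {
            isMatch = Regex.IsMatch(input, pattern, RegexOptions.None, timeout);
            return true;
        }
        catch (RegexMatchTimeoutException)
        {
            // The engine gave up after `timeout`; treat the pattern as unusable.
            return false;
        }
        catch (ArgumentException)
        {
            // Malformed pattern, null argument, or an out-of-range timeout.
            return false;
        }
    }
}
```

Nested quantifiers such as `(a+)+$` run against a long string of `a` characters ending in a non-matching character are a commonly cited example of heavy backtracking. Whether a particular runtime version actually hits the timeout on such input depends on its optimizations, so the timeout is a safety net rather than a guarantee of detection.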

Clinton Pierce
0

In .NET, unless you write your own regular expression parser (which I would strongly advise against), you're almost certainly going to need to wrap the creation of the new Regex object with a try/catch.

theraccoonbear
0

This is my solution. It returns an enum indicating whether the pattern is usable and, if it is, hands back the compiled regex through an out parameter that you can use directly in your calling code. Regards.

namespace ProgrammingTools.Regex
{
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text.RegularExpressions;    

    public enum eValidregex { No, Yes, YesButUseCompare }

    public class RegEx_Validate
    {
        public static eValidregex IsValidRX ( string pattern , out Regex RX )
        {
            RX = null; 

            if ( string.IsNullOrEmpty( pattern ) )
                return eValidregex.No;

            List<char> c1 = new List<char>
            {
                '\\' , '.' , '(' , ')' , '{' , '}' , '^' , '$' , '+' , '*' , '?' , '[' , ']', '|'
            };

            if ( c1.Count( e => pattern.Contains( e ) ) > 0 )
            {
                TimeSpan ts_timeout = new TimeSpan(days: 0,hours: 0,minutes: 0,seconds: 1,milliseconds: 0);

                try
                {
                    RX = new Regex( pattern , RegexOptions.Compiled | RegexOptions.IgnoreCase , ts_timeout );
                    return eValidregex.Yes;
                }
                catch ( ArgumentNullException )
                {
                    return eValidregex.No;
                }
                catch ( ArgumentOutOfRangeException )
                {
                    return eValidregex.No;
                }
                catch ( ArgumentException )
                {
                    return eValidregex.No;
                }
            }
            else
            {
                return eValidregex.YesButUseCompare;
            }

        }

    }

}
Johan B
-3

Using the following method, you can check whether your regular expression is valid. Here, testPattern is the pattern you want to check.

public static bool VerifyRegEx(string testPattern)
{
    bool isValid = true;
    if ((testPattern != null) && (testPattern.Trim().Length > 0))
    {
        try
        {
            Regex.Match("", testPattern);
        }
        catch (ArgumentException)
        {
            // BAD PATTERN: Syntax error
            isValid = false;
        }
    }
    else
    {
        //BAD PATTERN: Pattern is null or blank
        isValid = false;
    }
    return (isValid);
}
Alex