ASP.NET 4.5 / C#

I've been using a RegularExpressionValidator for multi-line textboxes that looks like this:

<asp:RegularExpressionValidator
ID="rev_txtEducationProfessional_courseDescription"
runat="server"
Display="None"
ValidationGroup="educationProfessional"
ControlToValidate="txtEducationProfessional_courseDescription"
ValidationExpression="(\s|.){0,500}$"
ErrorMessage="Course Description: 500 characters maximum"
EnableClientScript="false" />

I haven't had any issues with this until now, but the one I'm seeing seems very odd. If I type 501 letter Xs in the textbox, the validator fires fine. If I type 499 letter Xs with a couple of carriage returns, the validator fires fine.

A user contacted me today about the site hanging, and I traced the issue to the validator for the multi-line textbox. The user is submitting well over 500 characters, but instead of failing validation the site hangs and crashes. When I run in debug, it is Page.Validate that hangs (as you may have guessed, all my validation is server side).

Below is the text that causes the problem. If I paste in this text up to 500 characters, everything is fine; once I hit 501, the site hangs. Something specific to this text, in combination with how I am validating, "(\s|.){0,500}$", is going very wrong, but I'm lost as to what the exact issue is. Here is the entry that is the thorn in my side:


COMBINES DETAILED LECTURE REGARDING THE SCIENTIFIC BASIS FOR FRICTION RIDGE IDENTIFICATION WITH INTENSE LATENT PRINT COMPARISON PRACTICAL EXERCISES. THE LECTURE MATERIAL INCORPORATES THE RIDGEOLOGY CONCEPTS OF DAVID ASHBAUGH WITH THE PRACTICAL APPLICATION TECHNIQUES DEVELOPED BY PAT WERTHEIM.

RECOGNITION OF RIDGE PATTERN CLUES, IDENTIFYING WHICH FINGER OR WHICH AREA OF THE SKIN MADE A PARTICULAR LATENT IMPRESSION. THE CORRECT USE OF 3RD LEVEL DETAIL. PHILOSOPHY AND ACE-V METHODOLOGY OF COMPARING AND IDENTIFYING LATENT PRINTS.


I tried removing the hyphen. Also, the text came from Word, so I tried copying it into Notepad to remove any special formatting, and I tried removing the carriage return. Usually the site runs for several minutes and then says the page is not available, without a specific error. I just tried in debug and have been waiting what seems like forever for a response. I will add anything useful that comes up. I'm very curious whether I have done something obviously wrong, or whether I can work around this by modifying the regex.

Thanks all!

iAwardYouNoPoints
  • I'm wondering if copying from Word to notepad in an attempt to remove any special characters wasn't successful. I haven't tried typing the test in manually. If that works, then I could modify it section by section to find the offender. Still, I'd like to have validation that could handle this (assuming this is the issue). – iAwardYouNoPoints Jan 06 '14 at 19:17

2 Answers

Not sure if anyone else will have this problem, but I have seen this RegEx quoted several times for validating textbox length:

ValidationExpression="(\s|.){0,500}$"

The fix was to use this:

ValidationExpression="^[\S\s]{1,500}$"

I'm no expert with regular expressions, but I found an old post that was extremely helpful. Quoting from it:

"... the cause of the pathological case you're experiencing, typically referred to as "catastrophic" or "super-linear" backtracking. The reason the backtracking is so expensive (computationally) is that the characters matched by the dot and "\s" overlap ("\s" matches more than just newlines), which in this case causes the number of choice points to consider while backtracking to blow up at a near-exponential rate. So, it's not an infinite loop, but it might take until the end of time to complete (if your server or client doesn't crash or die before then, which is fairly likely). Some regex libraries such as PCRE set a hard cap on the number of backtracking steps allowed to avoid this kind of case, but that doesn't really solve the problem."

http://regexadvice.com/forums/thread/37591.aspx
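
For anyone who wants to see the difference outside of a page, here is a rough console sketch comparing the two patterns. It is only an illustration under a few assumptions: the class name, the `Probe` helper, and the 501-character test string are made up for this example, and the whole-input check (match starts at index 0 and spans the entire string) is my understanding of how the validator judges a match, not something taken from its source. The 2-second limit is arbitrary. Each space in the input can be matched by either "\s" or ".", which is where the overlapping choices described in the quote come from.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

static class LengthRegexProbe
{
    // Hypothetical helper: runs one pattern with a time limit and reports
    // whether the match covers the whole input, or whether the engine gave up.
    static string Probe(string pattern, string input, TimeSpan timeout)
    {
        var regex = new Regex(pattern, RegexOptions.None, timeout);
        try
        {
            Match m = regex.Match(input);
            // Assumed whole-input rule: match at index 0 spanning the string.
            bool wholeInput = m.Success && m.Index == 0 && m.Length == input.Length;
            return wholeInput ? "valid" : "invalid";
        }
        catch (RegexMatchTimeoutException)
        {
            return "gave up after " + timeout.TotalSeconds + "s";
        }
    }

    static void Main()
    {
        // Made-up stand-in for the user's text: "WORD " repeated 100 times
        // (500 characters, 100 of them spaces) plus one extra character.
        var sb = new StringBuilder();
        for (int i = 0; i < 100; i++) sb.Append("WORD ");
        sb.Append('X');
        string input = sb.ToString();

        TimeSpan limit = TimeSpan.FromSeconds(2);

        // Original pattern: "\s" and "." both match a space.
        Console.WriteLine(Probe(@"(\s|.){0,500}$", input, limit));

        // Replacement pattern: one character class, anchored at the start.
        Console.WriteLine(Probe(@"^[\S\s]{1,500}$", input, limit));
    }
}
```

On the .NET Framework 4.5 setup from the question I would expect the first call to hit the time limit and the second to report "invalid" almost immediately, but other runtime versions may optimize the first pattern differently, so treat that as an expectation rather than a guarantee.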

This was a frustrating issue to troubleshoot. Hope this helps someone down the road.

Take care all.

iAwardYouNoPoints

Yes, this could be catastrophic backtracking. With regular expressions (whether you use the .NET API directly or the RegularExpressionValidator), you should always take care to follow the Best Practices for Regular Expressions in the .NET Framework.

As general advice, try to rewrite your regex pattern (the previous answer probably fixed the issue). When you have doubts about how your pattern performs in the worst-case scenario, test it with a tool like RegexBuddy.

As a mitigation, you may also add a timeout to your validator (the MatchTimeout attribute, in milliseconds). This related post also discusses how to configure a timeout globally.
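
As a sketch of what that could look like, here is the validator from the question with the rewritten expression from the other answer and a MatchTimeout added. The 2000 ms value is only an illustrative assumption, not a recommendation; pick whatever limit suits your pages.

```aspx
<%-- MatchTimeout is in milliseconds; 2000 is an arbitrary example value --%>
<asp:RegularExpressionValidator
    ID="rev_txtEducationProfessional_courseDescription"
    runat="server"
    Display="None"
    ValidationGroup="educationProfessional"
    ControlToValidate="txtEducationProfessional_courseDescription"
    ValidationExpression="^[\S\s]{1,500}$"
    MatchTimeout="2000"
    ErrorMessage="Course Description: 500 characters maximum"
    EnableClientScript="false" />
```

The timeout only bounds how long a bad pattern can run; rewriting the pattern is still what removes the backtracking problem.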

lrodriguez