1

I'm looking for a regular expression that will match the string ##test## in a text. The string can be surrounded by any word or non-word character, whitespace, or newline, so anything can appear to the left and to the right of it. I need to use it as the ValidationExpression of an ASP.NET RegularExpressionValidator control.

Thanks in advance for your help! Regards.

<asp:RegularExpressionValidator ID="RegularExpressionValidator9" runat="server" ControlToValidate="TextBox1"
            Display="Dynamic" ValidationGroup="Mail" ErrorMessage="Insert !"
            ValidationExpression="(\w|\s|\n)##test##(\w|\s|\n)" >*</asp:RegularExpressionValidator>
        <asp:Button ID="Button1" runat="server" Text="Button" 
            ValidationGroup="Mail" Onclick="Button1_Click" />
John
  • 4
    You really should post something that you have attempted, your code, anything besides a list of demands. This isn't a freelance site. – austinbv Apr 18 '11 at 22:40
  • Just match the string "##test##" with the pattern `/##test##/`, and be done with it. – tchrist Apr 19 '11 at 02:25
  • @tchrist: He said there could be other text before or after the bit he's matching. But the validator implicitly anchors the match at both ends, so he has to "pad" the regex to match that stuff. He may not *care* about the rest of the text, but he still has to consume it. – Alan Moore Apr 19 '11 at 04:57

2 Answers

3

This should do it:

ValidationExpression="[\s\S]*##test##[\s\S]*"

I can see three problems with your regex, all having to do with the (\w|\s|\n) portion:

  • It will match exactly one character. You need it to match zero or more characters; adding the * quantifier does that.

  • It's needlessly redundant and gratuitously inefficient. [\w\s] matches the same things: a word character (\w) or a whitespace character (\s, which already includes \n). And whenever you have a choice between using an alternation ((a|b|c)) or a character class ([abc]), the character class should always be the first tool you reach for. It may look like a trivial choice, but it can have a huge impact on performance.

  • It leaves out a lot of characters, most notably punctuation characters like !, +, ., etc.
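
To make the three problems concrete, here is a minimal JavaScript sketch that compares the original expression with the corrected one. The helper `matchesWholeInput` is a hypothetical name, and it only approximates the validator's behaviour by assuming, as noted in the comments on the question, that the expression has to account for the entire input.

```javascript
// Hypothetical helper: approximates the validator by requiring that the
// match covers the whole input (an assumption, not the control's real code).
function matchesWholeInput(pattern, value) {
  const m = new RegExp(pattern).exec(value);
  return m !== null && m[0] === value;
}

const original = "(\\w|\\s|\\n)##test##(\\w|\\s|\\n)";
const fixed = "[\\s\\S]*##test##[\\s\\S]*";

const samples = [
  "a##test##b",                    // exactly one word character on each side
  "##test##",                      // nothing on either side
  "some text ##test## more text",  // more than one character on each side
  "!##test##!",                    // punctuation on each side
  "line one\n##test##\nline two",  // newlines around the marker
];

for (const s of samples) {
  console.log(
    JSON.stringify(s),
    "original:", matchesWholeInput(original, s),
    "fixed:", matchesWholeInput(fixed, s)
  );
}
```

Under that assumption, the original expression accepts only the first sample, while the corrected one accepts all five.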

If you're wondering why I used [\s\S]* and not .*, it's because the validation can be performed either in the browser or on the server. In JavaScript regexes, as in most other flavors, the . metacharacter doesn't match newlines. Most of the others also support a "single-line" or "dot-matches-all" mode, but JavaScript did not at the time of writing (the s flag was only added in ES2018).
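
The difference is easy to see in a short JavaScript sketch. The `^` and `$` anchors are an assumption added here to stand in for the validator's whole-input matching, and the sample strings are made up for illustration.

```javascript
// Without any flags, "." stops at a line break, while [\s\S] matches any character.
const dotPattern = new RegExp("^.*##test##.*$");
const classPattern = new RegExp("^[\\s\\S]*##test##[\\s\\S]*$");

const singleLine = "before ##test## after";
const multiLine = "first line\n##test##\nlast line";

// Both patterns accept input that stays on one line.
console.log(dotPattern.test(singleLine), classPattern.test(singleLine));

// Only the character-class version can consume the newlines around the marker.
console.log(dotPattern.test(multiLine), classPattern.test(multiLine));
```

The first line prints `true true` and the second `false true`, which is why the dot-based version rejects multi-line input in the browser.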

Of course, you could just force it to run on the server only, but you might as well get into the habit of dumbing down your regexes to JavaScript's level when you can. :-/

Alan Moore
  • I don't think so. See my comment under @John's answer. – Alan Moore Apr 19 '11 at 05:03
  • Thanks for the very good explanation. It was really useful. I have also used (.|\s)*##test##(.|\s)* but yours is exactly what I need. – John Apr 19 '11 at 20:52
  • @John: Just so you know, `(.|\s)` is **not** equivalent to `[\s\S]`. In fact, it's much worse than what you started with. For the reason why, see the explanation in [this answer](http://stackoverflow.com/questions/2407870/javascript-regex-hangs-using-v8/2408599#2408599). And this applies to all regex flavors, not just JavaScript; it's just that most flavors don't require this kind of hackage in the first place. – Alan Moore Apr 19 '11 at 22:52
0

Something like this?

.*(##test##).*
Town
  • Not quite. The dot character loses its special meaning when it appears in a character class, so `[\r\n.]` matches a carriage return, a linefeed, or a dot. – Alan Moore Apr 19 '11 at 05:01
  • 1
    @Alan Moore: Good point, I'll remove that bit and +1 yours for the good explanation. – Town Apr 19 '11 at 10:02