I am using an ASP.NET RegularExpressionValidator to check whether a textarea contains HTML or encoded HTML. I need the validator to work client side because I have ValidateRequest set to true on the page. My regexes are meant to match any string that does not contain a less-than character followed by an alpha character, or an ampersand followed by one or more alpha characters and ending in a semicolon.

^((?![<]{1}[a-z]{1}).)*$
^((?![&]{1}[a-z]+;).)*$
  • Why are you using `{1}`? It's redundant. I'm not sure what that's meant to do, but `{x}` doesn't do anything unless x is 2 or greater. – Justin Morgan - On strike Dec 11 '14 at 17:10
  • By the way, unless you just need something quick & dirty and aren't worried about accuracy, [regex is the wrong tool for validating HTML](http://stackoverflow.com/a/1732454/399649). If this is for something important, like protecting against XSS, regex won't cut it. For example, `< script>` will get past your pattern because of the whitespace. Use a real HTML parser instead. – Justin Morgan - On strike Dec 11 '14 at 17:19
  • It is ok if < script gets past the client side validation. It is not valid html, anyway, and shouldn't be an issue even if it were saved to the database. I am doing more validation server side. I just need this client side validation to work so my end users don't get hit with the application error page. –  Dec 11 '14 at 17:40
  • `< script src="http://foo.bar">` is valid HTML. The whitespace after the `<` doesn't matter. – Justin Morgan - On strike Dec 11 '14 at 18:26
  • Negative. Try the following code on a test html page. It will render the text as plain text in the browser but not execute the script: < script>alert(""); –  Dec 11 '14 at 18:30
  • Well, I'll be! Looks like it applies to all HTML tags as well. Thanks, you've taught me something new. I would caution you that not all browsers are standards-compliant, but as long as you have stronger server-side validation, you should be good. – Justin Morgan - On strike Dec 11 '14 at 19:29

1 Answer

JavaScript does not have the Single-Line option that lets a period match any character, including line breaks. Use the following in place of your period: [\s\S]

^((?![<]{1}[a-z]{1})[\s\S])*$
^((?![&]{1}[a-z]+;)[\s\S])*$
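
To see the difference, here is a minimal sketch that runs the original and the corrected patterns against multi-line input in plain JavaScript. The variable names and the `isPlainText` helper are made up for illustration and are not part of the validator. It also assumes only the pattern string can be supplied, so the newer `s` (dotAll) flag is not used.

```javascript
// Pattern from the question: "." does not match line breaks in JavaScript.
const dotNoTag = /^((?![<]{1}[a-z]{1}).)*$/;

// Patterns from the answer: [\s\S] matches any character, line breaks included.
const anyNoTag = /^((?![<]{1}[a-z]{1})[\s\S])*$/;
const anyNoEntity = /^((?![&]{1}[a-z]+;)[\s\S])*$/;

// Hypothetical helper: true when the text passes both checks.
function isPlainText(text) {
  return anyNoTag.test(text) && anyNoEntity.test(text);
}

const multiLine = "first line\nsecond line";

console.log(dotNoTag.test(multiLine));          // false: "." stops at the line break
console.log(anyNoTag.test(multiLine));          // true
console.log(isPlainText(multiLine));            // true
console.log(isPlainText("hello\n<b>bold</b>")); // false: "<" followed by a letter
console.log(isPlainText("a &amp; b"));          // false: encoded entity
console.log(isPlainText("1 < 2 & 3"));          // true: no tag or entity
```

The first result is the failure mode from the question: harmless text with a line break is rejected because the period cannot consume the newline and `$` only matches at the very end of the input.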
Jason Williams