3

I am really bad with regexps. What I want is a regexp that does not match any HTML tags (for user input validation).

What I want is the negation of this:

<[^>]+>

What I currently have is

public class MessageViewModel
{
    [Required]
    [RegularExpression(@"<[^>]+>", ErrorMessage = "No html tags allowed")]
    public string UserName { get; set; }
}

but it does the opposite of what I want: it only allows usernames with HTML tags.

Anarion

2 Answers

8

Regular expressions have no general operator for "negative" matches.

But they can do "positive" matches, and you can then strip everything they find out of the string.


Edit - after the question was updated, things became a little clearer. Try this:

public class MessageViewModel
{
    [Required]
    [RegularExpression(@"^(?!.*<[^>]+>).*", ErrorMessage = "No html tags allowed")]
    public string UserName { get; set; }
}

Explanation:

^            # start of string
(?!          # negative look-ahead (a position not followed by...)
  .*         #   anything
  <[^>]+>    #   something that looks like an HTML tag
)            # end look-ahead
.*           # match the remainder of the string
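
A minimal sketch of how the pattern behaves, assuming (as far as I know) that [RegularExpression] only accepts a value when the match spans the entire input. The IsFullMatch helper and the class name are hypothetical; they only emulate that rule with a plain Regex so the pattern can be tried outside of MVC.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NoHtmlTagDemo
{
    // Pattern from the answer above.
    public const string NoTags = @"^(?!.*<[^>]+>).*";

    // Hypothetical helper: the match has to cover the whole input,
    // which is the rule [RegularExpression] is assumed to apply.
    public static bool IsFullMatch(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success && m.Index == 0 && m.Length == input.Length;
    }

    public static void Main()
    {
        Console.WriteLine(IsFullMatch(NoTags, "Anarion"));        // True
        Console.WriteLine(IsFullMatch(NoTags, "<b>Anarion</b>")); // False
        Console.WriteLine(IsFullMatch(NoTags, "a < b"));          // True: no closing '>'
        Console.WriteLine(IsFullMatch(NoTags, "a < b > c"));      // False: "< b >" looks like a tag
        Console.WriteLine(IsFullMatch(NoTags, "line1\nline2"));   // False: '.' stops at the newline
    }
}
```

The last case shows why the trailing .* matters under the full-match rule, and why input that contains a line break is rejected even when it has no tags.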
Tomalak
  • I am using a [RegularExpression] attribute in C# and I want to prevent users from entering any HTML tags – Anarion Feb 01 '15 at 17:32
  • Exactly what I wanted, but I could not find a clear syntax for it. Regexes are the most complex thing I have come across so far – Anarion Feb 01 '15 at 17:45
  • Instead of blacklisting things (like HTML tags), think about whitelisting. That's a lot more secure and much easier to maintain. `[RegularExpression(@"^[\p{L}\p{N}]+$", ErrorMessage = "Only letters and numbers. No spaces allowed.")]` See http://www.regular-expressions.info/unicode.html for help selecting the proper Unicode character ranges. The non-Unicode version of this is `"^[a-zA-Z0-9]+$"`. – Tomalak Feb 01 '15 at 17:51
  • It seems to work with an inline s modifier. `^(?s)(?!.*<[^>]+>).*` – Wesley Egbertsen Jun 15 '16 at 15:26
  • @Wesley Of course, that's the alternative. The .NET regex flavor supports that modifier; not all flavors do. – Tomalak Jun 15 '16 at 15:30
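
The two suggestions in the comments above (whitelisting, and the inline s modifier) can be tried side by side. This is only a sketch: the class name and the IsFullMatch helper are hypothetical, and the helper assumes the attribute requires the match to cover the whole input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommentPatternsDemo
{
    // Whitelist from the comments: Unicode letters and digits only.
    public const string Whitelist = @"^[\p{L}\p{N}]+$";

    // Blacklist with the inline s modifier, so '.' also matches '\n'.
    public const string NoTagsDotAll = @"^(?s)(?!.*<[^>]+>).*";

    // Hypothetical helper: the match has to span the entire input.
    public static bool IsFullMatch(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success && m.Index == 0 && m.Length == input.Length;
    }

    public static void Main()
    {
        Console.WriteLine(IsFullMatch(Whitelist, "Anarion42"));         // True
        Console.WriteLine(IsFullMatch(Whitelist, "Jos\u00E9"));         // True: non-ASCII letter
        Console.WriteLine(IsFullMatch(Whitelist, "two words"));         // False: space not allowed
        Console.WriteLine(IsFullMatch(Whitelist, "<b>"));               // False

        Console.WriteLine(IsFullMatch(NoTagsDotAll, "line1\nline2"));   // True: newline now allowed
        Console.WriteLine(IsFullMatch(NoTagsDotAll, "line1\n<b>bold")); // False: tag on the second line
    }
}
```

The whitelist is the stricter of the two: it rejects anything not explicitly allowed, while the blacklist only rejects what looks like a tag.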
0

This question is a bit old, but I recently came across almost the same need.

I wanted to allow ">" or "<" when there is a space next to it, so that it is not interpreted as an HTML tag.

Maybe it's not perfect, but it did the job for me:

^((?!(<\S)|(\S>)).)*$

You can test it at regex101.com
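
A small sketch of what this pattern accepts and rejects. The class name and the IsFullMatch helper are hypothetical, and the helper assumes the attribute requires the match to span the entire input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpacedBracketDemo
{
    // Pattern from this answer: at every position, refuse "<" followed by
    // a non-space character and any non-space character followed by ">".
    public const string SpacedBrackets = @"^((?!(<\S)|(\S>)).)*$";

    // Hypothetical helper: the match has to span the entire input.
    public static bool IsFullMatch(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success && m.Index == 0 && m.Length == input.Length;
    }

    public static void Main()
    {
        Console.WriteLine(IsFullMatch(SpacedBrackets, "1 < 2 and 3 > 2")); // True
        Console.WriteLine(IsFullMatch(SpacedBrackets, "< b >"));           // True: spaces inside
        Console.WriteLine(IsFullMatch(SpacedBrackets, "<script>"));        // False
        Console.WriteLine(IsFullMatch(SpacedBrackets, "a<b"));             // False: no space after '<'
        Console.WriteLine(IsFullMatch(SpacedBrackets, "x>"));              // False: no space before '>'
    }
}
```

Compared with the accepted pattern, this one accepts "< b >" but rejects "a<b", so the choice depends on which of those inputs should be allowed.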

Kradhox
  • Given that there is already an accepted answer (from a long time ago), could you explain in which cases your solution would be an interesting alternative, to help users make their choice? – manu190466 Dec 09 '20 at 22:40