0

I am trying to create a regex validation attribute in ASP.NET MVC to validate that an entered email address has the .edu TLD.

I have tried the following but the expression never validates to true...

[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+edu

and

\w.\w@{1,1}\w[.\w]?.edu

Can anyone provide some insight?

Winner Crespo
stephen776
  • I'd separate this into two different problems: the (well-known to be horrifying) problem of [validating an email address](http://stackoverflow.com/search?q=regex+valid+email+&submit=search), full stop, and the simpler problem of validating that a string ends in `.edu`... – Dan J Jan 16 '12 at 19:15
  • @djacobson: This sounds a lot like an answer. – H.B. Jan 16 '12 at 19:16
  • I like the idea of validating that the entry is an email and that it ends with .edu separately. I can use the MVC DataAnnotationsExtension library to check for a valid email with no problem. Can someone provide an updated regex to check that the last 4 characters of the string are ".edu", or would it be easier/better to check this with the string classes in .NET? – stephen776 Jan 16 '12 at 19:38
  • Do you need to accept international languages or not? If so, check out the accepted answer, and then mason's answer, at: http://stackoverflow.com/questions/2049502/what-characters-are-allowed-in-email-address for converting to punycode first. – Adam Tuliper Jan 16 '12 at 19:58
  • @Adam Tuliper: Edu can hardly be international anyway, it's for North American educational institutions, other countries have ac.uk or no separate domain at all - uu.se, helsinki.fi etc. – tripleee Jan 16 '12 at 20:05
  • @tripleee good point, except my address is adam.カメラ-ポー@lehigh.edu j/k : ) – Adam Tuliper Jan 16 '12 at 21:07
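
The two-step idea from the comments can be sketched as follows, in JavaScript for illustration only. The helper names `looksLikeEmail`, `hasEduTld` and `isEduEmail` are hypothetical, and the loose email check is just a stand-in for a real validator such as the DataAnnotationsExtension one mentioned above.

```javascript
// Hypothetical stand-in for a real email validator: exactly one "@" with
// non-empty, whitespace-free text on both sides. Deliberately loose.
function looksLikeEmail(value) {
  return /^[^\s@]+@[^\s@]+$/.test(value);
}

// Plain string check for the .edu TLD; no regex needed for this part.
function hasEduTld(value) {
  return value.toLowerCase().endsWith('.edu');
}

// Combine the two independent checks.
function isEduEmail(value) {
  return looksLikeEmail(value) && hasEduTld(value);
}

console.log(isEduEmail('student@example.edu')); // true
console.log(isEduEmail('student@example.com')); // false
```

On the .NET side the second check would presumably be something like `value.EndsWith(".edu", StringComparison.OrdinalIgnoreCase)`.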

4 Answers

5

This should work for you:

^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.+-]+\.edu$

Breakdown since you said you were weak at RegEx:

^ Beginning of string

[a-zA-Z0-9._%+-]+ one or more letters, digits, dots, underscores, percent signs, plus signs or dashes

@ a literal @

[a-zA-Z0-9.+-]+ one or more letters, digits, dots, plus signs or dashes (the dot is what lets subdomains such as cs.example.edu through)

\.edu a literal .edu

$ End of string
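
A quick way to sanity-check the pattern is to run it over a few sample addresses. This is only a sketch in JavaScript; the constant name `eduEmail` and the sample addresses are made up for illustration.

```javascript
// Pattern copied from the answer above.
const eduEmail = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.+-]+\.edu$/;

// Illustrative inputs only.
const samples = [
  'you@example.edu',        // plain .edu address
  'you@cs.example.edu',     // subdomain
  'first.last@example.edu', // dot in the local part
  'you@example.com',        // wrong TLD
  'you@example.education',  // ".edu" is not at the end of the string
];

for (const address of samples) {
  console.log(address, eduEmail.test(address));
}
```

The first three samples match and the last two do not.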

mynameiscoffey
  • You should allow for subdomains like `you@cs.example.edu` as well. The localpart regex is also slightly too tight, although I could quickly only identify fairly theoretical problems like not allowing `*@example.edu` or `uucp!bang!path@example.edu`. – tripleee Jan 18 '12 at 12:38
  • @tripleee I don't follow - the regex already matches `you@cs.example.edu` correctly. – mynameiscoffey Jan 18 '12 at 15:54
  • Sorry, my bad, should have used my glasses. +1 – tripleee Jan 18 '12 at 16:34
1

If you're using ASP.NET MVC validation attributes, your regular expression has to work as a JavaScript regex, not only as a C# one, because client-side validation evaluates the pattern in the browser. Most of the syntax is the same, but you have to be wary of the differences.

You want your attribute to look like the following:

 [RegularExpression(@"([0-9]|[a-z]|[A-Z])+@([0-9]|[a-z]|[A-Z])+\.edu$", ErrorMessage = "text to display to user")]

The reason you put the @ before the string is to make it a verbatim string literal. Without it, I believe C# would try to interpret backslash sequences such as \. itself before the pattern ever reaches the regex engine.

(a|b|c) matches either an 'a', a 'b' or a 'c'. [a-z] matches any single character between a and z, and similarly for capital letters and digits, so ([0-9]|[a-z]|[A-Z]) matches any alphanumeric character.

([0-9]|[a-z]|[A-Z])+ matches 1 or more alphanumeric characters. + in a regular expression means 1 or more of the previous item.

@ is for the '@' symbol in an email address. If it doesn't work you might have to escape it, but I don't know of any special meaning for @ in a JavaScript regex.

Let's simplify it more

[RegularExpression(@"\w+@\w+\.edu$", ErrorMessage = "text to display to user")]

\w matches any alphanumeric character or an underscore.

Read the regex documentation at https://developer.mozilla.org/en/JavaScript/Guide/Regular_Expressions for more information.
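
To see what the simplified pattern does and does not accept, here is a small JavaScript sketch. The variable names and sample strings are made up for illustration, and the second pattern adds a leading ^ on the assumption that the whole value, not just its tail, is supposed to match.

```javascript
// Simplified pattern from the answer, as written (no leading anchor).
const suffixOnly = /\w+@\w+\.edu$/;
// Same pattern with an added ^ so the whole string has to match (assumption).
const wholeString = /^\w+@\w+\.edu$/;

// Without ^ only the end of the string has to look like an address.
console.log(suffixOnly.test('not an email!! x@example.edu'));  // true
console.log(wholeString.test('not an email!! x@example.edu')); // false

console.log(wholeString.test('jane@example.edu'));     // true
// \w does not include ".", so dotted names and subdomains are rejected.
console.log(wholeString.test('jane.doe@example.edu')); // false
console.log(wholeString.test('jane@cs.example.edu'));  // false
```

If dotted local parts or subdomains need to pass, the character classes have to be widened, for example to `[\w.]`.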

  • this allows @#$%#%$#$@#$.mycollege!@!@!!@.edu, not valid for an email : ) – Adam Tuliper Jan 16 '12 at 19:52
  • Although, after a bit of consideration: international addresses have chars we wouldn't allow. So one may want to allow everything, to avoid excluding them, unless you convert to punycode, in which case the standard rules apply and you can apply exclusions. See my note on the OP's question. – Adam Tuliper Jan 16 '12 at 19:58
  • I've edited my answer again to explain the symbols in the regular expression. You can modify it to allow other symbols that I might not have included – Sam I am says Reinstate Monica Jan 16 '12 at 20:03
  • What's with the three separate character classes when the much simpler and quicker `[A-Za-z0-9]` would do? Also you don't allow for subdomains, various special characters, etc. In particular, dot is very common in addresses like `firstname.lastname@example.edu`. – tripleee Jan 18 '12 at 12:41
0

There are many possible patterns; a very simple one would be:

\S+@\S+\.\S+\.edu
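
A short JavaScript sketch of how this pattern behaves; the name `loosePattern` and the sample addresses are made up for illustration. Two things stand out: the two literal dots mean the domain needs at least one label before `.edu` plus another before that, and without ^ and $ anchors extra text around the match is ignored.

```javascript
// Pattern copied from the answer above (unanchored).
const loosePattern = /\S+@\S+\.\S+\.edu/;

console.log(loosePattern.test('you@cs.example.edu'));          // true
// Only one dot in the domain, so "\.\S+\.edu" cannot match.
console.log(loosePattern.test('you@example.edu'));             // false
// No $ anchor, so a longer domain that merely contains ".edu" passes.
console.log(loosePattern.test('you@cs.example.edu.evil.com')); // true
```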
Mubarek
-1

Try this:

Regex regex = new Regex(@"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.(edu)$", RegexOptions.IgnoreCase);
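
For comparison, a rough JavaScript counterpart of the line above, where the `i` flag plays the role of `RegexOptions.IgnoreCase`. This is only a sketch; the name `eduEmailIgnoreCase` and the sample addresses are made up for illustration.

```javascript
// Same pattern; the trailing "i" flag makes matching case-insensitive.
const eduEmailIgnoreCase = /^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.(edu)$/i;

console.log(eduEmailIgnoreCase.test('Jane.Doe@Example.EDU')); // true
console.log(eduEmailIgnoreCase.test('jane.doe@example.edu')); // true
console.log(eduEmailIgnoreCase.test('jane.doe@example.org')); // false
```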

ANSWER UPDATED...

Desolator