
Adding simple e-mail validation to my code, I created the following function:

def isValid(email: String): Boolean = if("""(?=[^\s]+)(?=(\w+)@([\w\.]+))""".r.findFirstIn(email) == None)false else true

This passes emails like bob@testmymail.com and rejects ones like bobtestmymail.com, but addresses containing space characters slip through: for example, bob @testmymail also returns true.

I'm probably being silly here...

pagoda_5b
Jack
  • 1
    Correct regex for validating emails would be huge. See [here](http://stackoverflow.com/questions/201323/using-a-regular-expression-to-validate-an-email-address) for example. – Andrew Logvinov Dec 17 '12 at 10:48
  • Thanks Andrew, I realise that, which is why I'm just adding a 'simple' rule here ;-) i.e. does it have an '@', does it have a '.', and is it free of spaces. It's just a simple check to catch typos rather than to cater for all cases. Rewriting it is on my to-do list :-p – Jack Dec 17 '12 at 10:59
  • 2
    Also not an answer—although I'm a little unclear on the question, anyway—but rolling your own email validator probably isn't a good idea. It's possible you're already using a framework that provides one, and if not, [Apache Commons does](http://commons.apache.org/validator/apidocs/org/apache/commons/validator/routines/EmailValidator.html). – Travis Brown Dec 17 '12 at 11:21
  • Check this link: http://haacked.com/archive/2007/08/21/i-knew-how-to-validate-an-email-address-until-i.aspx – wleao Dec 17 '12 at 13:40
  • I've copied your code in a scala worksheet and your regex is not returning true to emails with spaces. – wleao Dec 17 '12 at 14:00
  • 1
    I think spaces are allowed by [RFC 2822](http://tools.ietf.org/html/rfc2822#section-3.2.5) if enclosed in double quotes: "paolo falabella"@testdomain.com would thus be theoretically valid (but I do agree that the owner of such an address must have by now given up trying to use it as a login on the internet...) – Paolo Falabella Dec 17 '12 at 14:20
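
The "simple" rule described in the comments (contains an '@', contains a '.', contains no spaces) can be sketched as a whole-string match. This is only an illustration of that typo check, not a standards-compliant validator; the object name `SimpleEmailCheck`, the method name `looksLikeEmail`, and the pattern itself are assumptions, not something taken from the question.

```scala
object SimpleEmailCheck {
  // Assumed pattern: some non-space, non-'@' characters, one '@',
  // then a domain part that contains at least one '.'.
  private val simpleEmail = """[^\s@]+@[^\s@]+\.[^\s@]+""".r

  // matches() requires the WHOLE input to fit the pattern,
  // so a space anywhere in the string makes the check fail.
  def looksLikeEmail(email: String): Boolean =
    email != null && simpleEmail.pattern.matcher(email).matches()
}
```

Because the entire input must match, `bob @testmymail.com` is rejected. Quoted local parts such as the RFC 2822 example above are rejected too, which is a deliberate simplification here.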

5 Answers

28

My function is inspired by the one the Play Framework uses (see PlayFramework) and uses the regexp presented in the W3C recommendation. It passes all the tests suggested in the other answers. Hope it helps.

private val emailRegex = """^[a-zA-Z0-9\.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$""".r


def check(e: String): Boolean = e match{
    case null                                           => false
    case e if e.trim.isEmpty                            => false
    case e if emailRegex.findFirstMatchIn(e).isDefined  => true
    case _                                              => false
}
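
For comparison, here is a more compact sketch of the same check. It wraps the input in `Option` to handle `null` and uses `matches()` so the whole input must fit the pattern. The regex is copied from the answer above; the object name `EmailCheck` is hypothetical, and this is a variant for illustration rather than the Play Framework's own code.

```scala
object EmailCheck {
  // Same pattern as in the answer above.
  private val emailRegex =
    """^[a-zA-Z0-9\.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$""".r

  // Option(e) is None for null, so exists(...) yields false in that case.
  def check(e: String): Boolean =
    Option(e).exists(s => s.trim.nonEmpty && emailRegex.pattern.matcher(s).matches())
}
```

Note that this pattern does not require a dot in the domain, so an input like `user@provider` is accepted.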
John
  • 1
    This is pretty good, but testing it against certain valid and invalid cases, I can still find problems, but people shouldn't have such weird e-mails anyway! – bbarker Jun 12 '18 at 15:10
4

I tested your regex and it does catch simple emails, so I looked at how you call it and saw that you are using findFirstIn. I believe that is the problem: findFirstIn searches for a match anywhere in the string, so it skips past the part containing spaces and matches a later substring. In your case it is better to use unapplySeq, which must match the whole string, and check whether it returns a Some.

def isValid(email: String): Boolean =
   if("""(?=[^\s]+)(?=(\w+)@([\w\.]+))""".r.findFirstIn(email) == None)false else true

def isValid2(email: String): Boolean =
  """(\w+)@([\w\.]+)""".r.unapplySeq(email).isDefined

isValid("test@gmail.com")                        //> res0: Boolean = true

isValid("t es t@gmailcom")                       //> res1: Boolean = true

isValid("b ob @tes tmai l.com")                  //> res2: Boolean = false

isValid2("test@gmail.com")                       //> res3: Boolean = true

isValid2("t es t@gmailcom")                      //> res4: Boolean = false

isValid2("b ob @tes tmai l.com")                 //> res5: Boolean = false

// but neither function handles the following cases well:
// I recommend using a proper regex pattern to match emails
isValid("test@gma.i.l.c.o.m")                    //> res6: Boolean = true

isValid("test@gmailcom")                         //> res7: Boolean = true

isValid2("test@gma.i.l.c.o.m")                   //> res8: Boolean = true

isValid2("test@gmailcom")                        //> res9: Boolean = true
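
The difference between searching for a match and matching the whole string can be shown side by side. The sketch below reuses the simple pattern from `isValid2`; the object and method names are hypothetical. `pattern.matcher(...).matches()` is the underlying `java.util.regex` call, and a Scala pattern match on the `Regex` extractor has the same whole-string behaviour.

```scala
object MatchModes {
  private val simple = """(\w+)@([\w\.]+)""".r

  // Substring search: true if the pattern occurs ANYWHERE in the input.
  def containsMatch(s: String): Boolean = simple.findFirstIn(s).isDefined

  // Whole-string match via the underlying java.util.regex.Matcher.
  def fullMatch(s: String): Boolean = simple.pattern.matcher(s).matches()

  // Whole-string match via the Regex extractor (one binder per capture group).
  def fullMatchViaExtractor(s: String): Boolean = s match {
    case simple(_, _) => true
    case _            => false
  }
}
```

With the input `t es t@gmailcom`, `containsMatch` finds the substring `t@gmailcom` and returns true, while both whole-string variants return false.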
wleao
  • 1
    This solution is not really valid in all cases: isValid2("user@provider") will result in TRUE – sonix Oct 10 '14 at 11:47
2
scala> def isValid(email : String): Boolean = if("""^[-a-z0-9!#$%&'*+/=?^_`{|}~]+(\.[-a-z0-9!#$%&'*+/=?^_`{|}~]+)*@([a-z0-9]([-a-z0-9]{0,61}[a-z0-9])?\.)*(aero|arpa|asia|biz|cat|com|coop|edu|gov|info|int|jobs|mil|mobi|museum|name|net|org|pro|tel|travel|[a-z][a-z])$""".r.findFirstIn(email) == None)false else true
isValid: (email: String)Boolean

scala> isValid("""bob@test.com""")
res0: Boolean = true

scala> isValid("""bob @test.com""")
res1: Boolean = false

scala> isValid("""bobtest.com""")  
res2: Boolean = false
0xAX
  • """\b[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*\b""".r this one is not that strict – wleao Dec 17 '12 at 13:41
0
scala> val email4 = """([\w\.!#$%&*+/=?^_`{|}~-]+)@([\w]+)([\.]{1}[\w]+)+""".r
email4: scala.util.matching.Regex = ([\w\.!#$%&*+/=?^_`{|}~-]+)@([\w]+)([\.]{1}[\w]+)+

scala> email4.pattern.matcher("arun#eja-729@gmail..gov").matches
res10: Boolean = false

scala> email4.pattern.matcher("arun#eja-729@gmail.com.gov").matches
res11: Boolean = true

scala> email4.pattern.matcher("arun#eja-729@gmail.com.gov.in").matches
res12: Boolean = true

scala> email4.pattern.matcher("arun#eja-729@gmail.com.gov.in.").matches
res13: Boolean = false

scala> email4.pattern.matcher("arun#eja-729@gmail.com.").matches
res14: Boolean = false

scala> email4.pattern.matcher("arun#eja-729@gmail.com").matches
res15: Boolean = true

scala> email4.pattern.matcher("arun#eja-729@gmail..com").matches
res16: Boolean = false

scala> email4.pattern.matcher("arun#eja-729@gmail.ap.com").matches
res17: Boolean = true

scala> email4.pattern.matcher("arun#eja-729@gmail.ap.com.").matches
res18: Boolean = false
0

The regex below is intended to accept email addresses with a minimum length of 10 and a maximum length of 30 characters.

scala> val email4 = """[([\w\.!#$%&*+/=?^_`{|}~-]+)@([\w]+)([\.]{1}[\w]+)+]{10,30}""".r
email4: scala.util.matching.Regex = [([\w\.!#$%&*+/=?^_`{|}~-]+)@([\w]+)([\.]{1}[\w]+)+]{10,30}

scala> email4.pattern.matcher("ar@g.com").matches
res19: Boolean = false

scala> email4.pattern.matcher("ar@gmail.com").matches
res20: Boolean = true

scala> email4.pattern.matcher("ar1234567890@gmail1234567890.com").matches
res21: Boolean = false

scala> email4.pattern.matcher("ar1234567890@gmail123456780.com").matches
res22: Boolean = false

scala> email4.pattern.matcher("ar1234567890@gma.com").matches
res23: Boolean = true

scala> email4.pattern.matcher("ar1234567890@gmghfjdfcga.com").matches
res24: Boolean = true

scala> email4.pattern.matcher("ar1234567890@gmghfjdfcga1.com").matches
res25: Boolean = true

scala> email4.pattern.matcher("ar1234567890@gmghfjdfcga111.com").matches
res26: Boolean = false

scala> email4.pattern.matcher("ar1234567890@gmghfjdfcga11.com").matches
res27: Boolean = true
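
One caveat about the pattern above: the square brackets around the whole expression appear to turn it into a single character class, so any run of 10 to 30 characters drawn from that set would match, with or without an '@'. A possible alternative, sketched here as an assumption rather than as the answer's intended method, keeps the earlier `email4` pattern and adds a lookahead for the length bound. The object and method names are hypothetical.

```scala
object LengthBoundedEmail {
  // (?=.{10,30}$) asserts that the whole input is 10 to 30 characters long;
  // the rest is the email4 pattern from the previous answer, unchanged.
  private val bounded =
    """(?=.{10,30}$)([\w\.!#$%&*+/=?^_`{|}~-]+)@([\w]+)([\.]{1}[\w]+)+""".r

  def isValid(email: String): Boolean =
    email != null && bounded.pattern.matcher(email).matches()
}
```

The lookahead consumes no input, so the length rule and the structural rule are checked independently on the same string.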