3

I am trying to accept an email address only if it has no whitespace or blank at the end. I have tried two variants:

pattern="^[A-Za-z0-9._%+-]+@[a-z0-9.-]+.[a-z]{1,4}[^\s]+$">

pattern="^[A-Za-z0-9._%+-]+@[a-z0-9.-]+.[a-z]{1,4}\S$">

Neither works: the input still accepts whitespace at the end. My full input line is:

input type="email" id="guestUserEmail" name="guestEmail" data-pattern-error="Email is invalid" data-required-error="Please enter email address" required pattern="^[A-Za-z0-9._%+-]+@[a-z0-9.-]+.[a-z]{1,4}$"

I referred to this answer: "regex for no whitespace at the beginning and at the end but allow in the middle".

Please suggest!

Jeevan Bodas
  • use at the end `*$` instead of `+$` – Arvind Katte Dec 29 '17 at 10:17
  • you can do this way also, first trim that email `String` using `trim()` function then apply regular expression. – Arvind Katte Dec 29 '17 at 10:21
  • @ArvindKatte I have tried using * and + alternately; neither works. About trimming, I did suggest that, but the specific requirement is to show an error when a space is added at the end. I am unable to understand why ^\s or \S is not working – Jeevan Bodas Dec 29 '17 at 10:31
  • use this one `^[A-Za-z0-9._%+-]+@[a-z0-9.-]+.[a-z]{1,4}[^\\S]+$` it worked for me – Arvind Katte Dec 29 '17 at 10:41
  • yes you have to use, `\\S` instead of `\s` – Arvind Katte Dec 29 '17 at 10:42
  • As in the majority of attempts at using regex to "validate" email addresses, you came up with a regex which rejects valid email addresses. I don't think you will receive any answers which correct this basic flaw. – tripleee Dec 29 '17 at 11:20
  • @tripleee can you please mention some of the valid email addresses which the above-mentioned regex rejects, so that I can test them? – Jeevan Bodas Dec 29 '17 at 11:26
  • The domain name part fails to allow dashes in the TLD. The localpart rejects `*` and doubtlessly some other allowed characters. Neither part obviously copes with internationalized email addresses, though that may well be out of scope here. http://emailregex.com/ has a fairly comprehensive test suite, though that also covers variations in how comments, real names, and other adornments are coded in the email `From:` header. – tripleee Dec 29 '17 at 11:41
  • @tripleee Thanks for valuable inputs. I will try to modify the regex accordingly. Do you have any idea about not allowing whitespace at the end (Barring the Trim option) – Jeevan Bodas Dec 29 '17 at 11:43
  • The email regex you apparently started with doesn't allow trailing whitespace to begin with. The site I linked to above has more regexes, many of them better, none of which permit trailing whitespace, by quick glance. – tripleee Dec 29 '17 at 11:48
  • @JeevanBodas triplee's [emailRegEx](http://emailregex.com/) link is spot on. If you can't get a valid regex out that site, then your method of testing is erroneous. If you wish to get help with that aspect, more info is needed. – zer00ne Dec 29 '17 at 11:55
  • @tripleee I used the pattern given under HTML section ^[a-zA-Z0-9.!#$%&’*+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$ from the site suggested. It allows whitespace at the end & also it allows "n" characters after domain_name "." eg: test@gmail.commmmmmmmmmmmmmmm – Jeevan Bodas Dec 29 '17 at 11:57
  • @tripleee if you note the regex featured in [emailRegex](http://emailregex.com/) they also escape dots `\.` – zer00ne Dec 29 '17 at 11:57
  • @JeevanBodas see [Fail 5](https://regex101.com/r/N967YZ/3/tests) – zer00ne Dec 29 '17 at 11:59
  • Dots are literal *in character classes* (inside square brackets). In fact, `[.]` and `\.` are both valid ways to match a single literal dot outside of a character class. – tripleee Dec 29 '17 at 12:02
  • @zer00ne Yes over here https://regex101.com/r/N967YZ/1/tests everything is good. In previous comment i was referring http://emailregex.com html section pattern – Jeevan Bodas Dec 29 '17 at 12:04
  • @tripleee you are correct, sir. Answer has been corrected, regardless note that the last dot still needs escaping. – zer00ne Dec 29 '17 at 12:06
  • No, the regex does not permit whitespace at the end. It *correctly* fails to impose any constraints on the TLD; somebody could register the TLD `commmmmmmmmmmmmmmm` and the regex has no way to tell whether that has happened. If you want to restrict to *currently* valid TLDs you need to enumerate them (currently some 400-odd IIRC) and keep the regex up to date as more are registered and some old ones abandoned. If you want to arbitrarily restrict which TLDs are "more real", please share your criteria and reasoning in the question. – tripleee Dec 29 '17 at 12:08
  • @JeevanBodas are you using Safari? Safari does not support `` – zer00ne Dec 29 '17 at 12:09
  • @zer00ne I am using Chrome – Jeevan Bodas Dec 29 '17 at 12:09

5 Answers

2

Update

HTML by default collapses whitespace. This means:

  • If there is more than one whitespace character between chars, it will render as a single whitespace char.

    • ex. this string has a double space right  here.
      will render as:
      this string has a double space right here.
  • Leading and trailing whitespace are stripped (this doesn't happen with strings, hence the necessity for methods such as trim())

    • ex.    abc123@email.com   
      will render as:
      abc123@email.com

So if you have a billion spaces after the email address, they will be automatically stripped in an <input type="email">. The Demo has your <input> wrapped in a <form>, and the form will actually send data to a real test server. If you send a valid email address with trailing whitespace, look at the response: you'll see that the value has no trailing whitespace.
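
The stripping described above can be approximated outside the browser. This is only a sketch, assuming the documented sanitization for a single-address email input (strip newlines, then strip leading and trailing ASCII whitespace); `sanitizeEmailValue` is a hypothetical helper name, not a browser API.

```javascript
// Hypothetical helper that mimics what a browser is documented to do to the
// value of <input type="email"> before the pattern is checked.
function sanitizeEmailValue(raw) {
  return raw
    .replace(/[\r\n]/g, "")                        // strip newlines
    .replace(/^[\t\n\f\r ]+|[\t\n\f\r ]+$/g, "");  // strip leading/trailing ASCII whitespace
}

// The pattern from the Demo; the pattern attribute must match the whole value.
const demoPattern = /^[A-Za-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{1,4}$/;

const typed = "abc123@email.com   ";
const seenByPattern = sanitizeEmailValue(typed);

console.log(JSON.stringify(seenByPattern));    // "abc123@email.com"
console.log(demoPattern.test(typed));          // false: the raw string has trailing spaces
console.log(demoPattern.test(seenByPattern));  // true: the spaces are gone before matching
```

If this model is right, the regex never gets to see the trailing space, which would explain why no variant of the pattern can report it.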


Have you considered escaping the periods?

A . outside a character class means ANY one char

A \. means a literal dot or period
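
A quick way to see the difference, as a small sketch in JavaScript (the sample addresses are made up for illustration):

```javascript
// Outside a character class, an unescaped dot matches any single character.
const unescaped = /^abc@email.com$/;
// An escaped dot, or a dot inside a character class, matches only a literal dot.
const escaped = /^abc@email\.com$/;
const inClass = /^abc@email[.]com$/;

console.log(unescaped.test("abc@emailXcom")); // true: "." matched the "X"
console.log(escaped.test("abc@emailXcom"));   // false
console.log(inClass.test("abc@emailXcom"));   // false
console.log(escaped.test("abc@email.com"));   // true
```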

It looks like this works:

[A-Za-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{1,4}\S$

See tests at RegEx101


Demo

input {
  font: inherit
}
<form id='contact' action='https://httpbin.org/post' method='post'>
  <input type="email" id="guestEmail" name="guestEmail" pattern="^[A-Za-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{1,4}$" required placeholder='abc123@email.com'>
  <input type='submit'>
</form>
zer00ne
  • It works fine in the link provided but I dont know it helps in my html – Jeevan Bodas Dec 29 '17 at 10:56
  • Yeah...I ran 6 tests of which the first three should be valid email addresses and the other three are invalid email addresses which passed as invalid. Do you have any pattern that isn't covered by these 6 tests? – zer00ne Dec 29 '17 at 11:09
  • I am trying with email test@gmail.com & it is not throwing an error, which means the \S part at the end is not working – Jeevan Bodas Dec 29 '17 at 11:16
  • A dot in a character class is just a literal dot. – tripleee Dec 29 '17 at 11:20
  • @tripleee yes my point exactly. `abc@emailXcom` would be the same as `abc@email.com` if you don't escape the dot. – zer00ne Dec 29 '17 at 11:28
  • @JeevanBodas See [Fail 4](https://regex101.com/r/N967YZ/2/tests) it does detect a space as invalid – zer00ne Dec 29 '17 at 11:31
  • @zer00ne yes it works fine on the regex tester but still I am unable to figure out why it is not working for me. My line is pattern="^[A-Za-z0-9\._%+-]+@[a-z0-9\.-]+\.[a-z]{1,4}\S$" – Jeevan Bodas Dec 29 '17 at 11:41
  • You are using the `pattern` attribute on input? ex. ` – zer00ne Dec 29 '17 at 11:44
  • @zer00ne yes I am using pattern attribute, full html input field is pasted in question. – Jeevan Bodas Dec 29 '17 at 12:08
1

In a Java string literal, a single \ is treated as the start of an escape sequence, so you have to add one more backslash, like this: \\.

Modify your String pattern, like this

^[A-Za-z0-9._%+-]+@[a-z0-9.-]+.[a-z]{1,4}[^\\S]+$

This should resolve the issue.
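
One caveat worth checking before relying on this: doubling the backslash is a string-literal rule (Java, or JavaScript as in the sketch below), whereas an HTML `pattern` attribute is used verbatim. The sketch assumes the attribute text reaches the regex engine unchanged, and uses `String.raw` to stand in for it; under that assumption `[^\\S]` is a class meaning "any character except a backslash or the letter S".

```javascript
// In a string literal, "\\S" is the two characters \S by the time the regex sees it.
const fromLiteral = new RegExp("^\\S+$");
console.log(fromLiteral.source);       // ^\S+$
console.log(fromLiteral.test("a b"));  // false: the space is rejected

// String.raw keeps both backslashes, as an HTML attribute value would.
const asInAttribute = new RegExp(
  String.raw`^[A-Za-z0-9._%+-]+@[a-z0-9.-]+.[a-z]{1,4}[^\\S]+$`
);
console.log(asInAttribute.test("test@gmail.commmmmmmm")); // true: the {1,4} limit is bypassed
console.log(asInAttribute.test("test@gmail.com  "));      // true: spaces are not excluded either
```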

Arvind Katte
  • I tried this it detects whitespace & throws error but now it has stop detecting range {1,4} & is accepting "n" characters after "." – Jeevan Bodas Dec 29 '17 at 10:57
0

I think your regexes are ok:

https://regex101.com/r/j0MXV5/1

https://regex101.com/r/oN9v8U/1

Maybe it is an option to use blur and trim both the leading and trailing whitespace:

onblur="this.value=this.value.trim();"

Or only the trailing whitespace:

onblur="this.value=this.value.replace(/\s+$/, '');"

<form>
    <input type="email" id="guestUserEmail" name="guestEmail" data-pattern-error="Email is invalid" data-required-error="Please enter email address" required pattern="^[A-Za-z0-9._%+-]+@[a-z0-9.-]+.[a-z]{1,4}[^\\S]+$" onblur="this.value=this.value.replace(/\s+$/, '');">
</form>
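
The difference between the two handlers, sketched on a plain string (the sample value is made up; no DOM is involved here):

```javascript
const typed = "  test@gmail.com  ";

// trim() removes whitespace from both ends.
const bothEnds = typed.trim();
// replace(/\s+$/, "") removes only the trailing run of whitespace.
const trailingOnly = typed.replace(/\s+$/, "");

console.log(JSON.stringify(bothEnds));     // "test@gmail.com"
console.log(JSON.stringify(trailingOnly)); // "  test@gmail.com"

// The same trailing-whitespace test can report the problem instead of fixing it.
console.log(/\s$/.test(typed));            // true
```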
The fourth bird
  • This option works but specific requirement given to me is to show error message on blank or whitespace added after email. – Jeevan Bodas Dec 29 '17 at 11:21
0

You are using an HTML input field of type email. The defined and documented behavior of this element is to accept but remove any leading or trailing whitespace. If that's not acceptable, don't use this element.

Your attempts at "fixing" the regex are misdirected. Basic regex reading skills should tell you that the apparent original regex only allows alphabetics just before end of string. Your attempts would basically change it from "mostly nearly correct email" to "mostly nearly correct email, followed by any junk at all, as long as it's not whitespace".
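
To make that concrete, here is a small sketch comparing the original pattern from the question with the `[^\s]+$` variant (the test strings are made up for illustration):

```javascript
// Original pattern from the question: the string must end in 1-4 lowercase letters.
const original = /^[A-Za-z0-9._%+-]+@[a-z0-9.-]+.[a-z]{1,4}$/;
// First attempted variant: anything that is not whitespace may follow.
const variant = /^[A-Za-z0-9._%+-]+@[a-z0-9.-]+.[a-z]{1,4}[^\s]+$/;

console.log(original.test("test@gmail.com "));   // false: trailing space already rejected
console.log(variant.test("test@gmail.com "));    // false
console.log(original.test("test@gmail.com#!?")); // false
console.log(variant.test("test@gmail.com#!?"));  // true: junk after the address is accepted
```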

Generally speaking, you fix a too-permissive regex by constraining it more (dropping some stuff it would previously accept, perhaps refactoring the regex along the way; for example, to only allow internal dashes in a domain part, you have to split it into first character, optional middle perhaps with dashes, slightly less optional last character without a dash) but certainly not by adding new matching possibilities -- especially not a broad generic character class repeated an arbitrary number of times.

As already noted in comments, there is no way to completely match every valid email address with a regex, but the regular built-in validation in HTML5 is pretty much guaranteed to do a better job than a random regex you found on some PHP forum.

tripleee
0

Use this regex. It will validate your email with or without whitespace at the start or end of the email address:

^([^\S/.]{0,})+(([^<>()[\]\.,;:\s@\"]+(\.[^<>()[\]\.,;:\s@\"]+)*)|(\".+\"))@(([^<>()[\]\.,;:\s@\"]+\.)+[^<>()[\]\.,;:\s@\"]{2,})+([^\S/.]{0,})$
TomBay