2

I'm trying to find out how to remove all invalid characters from an email address.

Example: email = "taeo͝';st@yy.com" ("." is a valid email character), and the result should be email = "taest@yy.com".

I'm using the following email pattern:

String email_pattern = "^[^[_A-Za-z0-9-\\+]+(\\.[_A-Za-z0-9-]+)*@"+ "[A-Za-z0-9-]+(\\.[A-Za-z0-9]+)*(\\.[A-Za-z]{2,})$]";

String modifiedEmail = email.replaceAll(email_pattern,"");

But the above code gives email = "aest@yy.com", whereas I expected "taest@yy.com".

Any suggestions or a better approach would be appreciated.

Protector one
TP_JAVA
  • What's wrong with a [unicode email address](http://stackoverflow.com/questions/3844431/are-email-addresses-allowed-to-contain-non-alphanumeric-characters)? –  Dec 26 '12 at 22:01

3 Answers

11

Here is a nice blog post on why you shouldn't filter your email addresses: http://davidcel.is/blog/2012/09/06/stop-validating-email-addresses-with-regex/

TL;DR: Check that there is an @ (and optionally a period) and send a test mail.
David suggests using this regular expression:

/.+@.+\..+/
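
As a rough illustration, the expression above could be applied in Java along these lines. The class and method names (`LooseEmailCheck`, `looksLikeEmail`) are made up for this sketch; it only checks the shape of the string and does not replace sending a test mail.

```java
import java.util.regex.Pattern;

class LooseEmailCheck {
    // The same expression as above, written as a Java string literal.
    private static final Pattern LOOSE_EMAIL = Pattern.compile(".+@.+\\..+");

    // Hypothetical helper: true if the whole string has the form something@something.something
    static boolean looksLikeEmail(String candidate) {
        // matches() requires the entire input to match the pattern
        return candidate != null && LOOSE_EMAIL.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeEmail("taest@yy.com"));       // true
        System.out.println(looksLikeEmail("no-at-sign.example")); // false
    }
}
```

Anything that passes this check would still need a confirmation mail to prove the address actually exists.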
Protector one
Huluk
  • Great article. I love this solution. I also like how I can actually memorize this regular expression for future use. – Protector one May 11 '16 at 09:09
  • My use case is different, I create my own e-mail addresses from names, names that might contain spaces, 8-bit characters etc. Any good way to do that? – d-b Mar 27 '23 at 21:44
1

I resolved it by using replaceAll with a negated character class, which strips every character that is not in the allowed set.

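String email = "testo͝';@.com.my";
// Remove every character outside the allowed set. The '-' is escaped so that
// "+-/" is not read as a range (which would also let ',' through).
String EMAIl_PATTERN = "[^a-zA-Z0-9!#$%&@'*+\\-/=?^_`{|}~.]+";
String modifiedEmail = email.replaceAll(EMAIl_PATTERN, "");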
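
One detail worth checking in a whitelist character class of this kind: a `-` placed between two characters is read as a range, so `+-/` admits `+ , - . /` and a comma survives the cleanup. The small sketch below (class and method names are made up for illustration, and the character set is shortened) compares the two spellings.

```java
class HyphenRangeDemo {
    // Unescaped '-' between '+' and '/' forms a range covering + , - . /
    static String stripWithRange(String s) {
        return s.replaceAll("[^a-zA-Z0-9@.+-/]+", "");
    }

    // Escaped '-' is a literal hyphen, so ',' is no longer in the allowed set.
    static String stripWithLiteralHyphen(String s) {
        return s.replaceAll("[^a-zA-Z0-9@.+\\-/]+", "");
    }

    public static void main(String[] args) {
        System.out.println(stripWithRange("a,b@x.com"));         // a,b@x.com
        System.out.println(stripWithLiteralHyphen("a,b@x.com")); // ab@x.com
    }
}
```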
TP_JAVA
0

Further thinking: You could also test for known providers of email addresses that can be used without authentication (e.g. http://trashmail.com/).
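
A minimal sketch of such a check, assuming you maintain your own list of domains: the names `DisposableDomainCheck` and `usesDisposableDomain` are hypothetical, and only trashmail.com comes from the example above. `Set.of` requires Java 9 or later.

```java
import java.util.Locale;
import java.util.Set;

class DisposableDomainCheck {
    // Hypothetical blocklist; a real one would be loaded from a maintained source.
    private static final Set<String> DISPOSABLE_DOMAINS = Set.of("trashmail.com");

    // True if the part after the last '@' is on the blocklist (case-insensitive).
    static boolean usesDisposableDomain(String email) {
        int at = email.lastIndexOf('@');
        if (at < 0 || at == email.length() - 1) {
            return false; // no domain part to inspect
        }
        String domain = email.substring(at + 1).toLowerCase(Locale.ROOT);
        return DISPOSABLE_DOMAINS.contains(domain);
    }

    public static void main(String[] args) {
        System.out.println(usesDisposableDomain("bob@TrashMail.com")); // true
        System.out.println(usesDisposableDomain("bob@yy.com"));        // false
    }
}
```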