
I'm trying to make a regex to match email addresses, like any of these:

example@website.com
first.last@website.org
joe87_smith@web.net

I've written this regex:

$pattern = "/[\-\.\_a-z0-9]+(\@){1}[\-\.\_a-zA-Z0-9]+(\.){1}[\-a-z0-9]+/i";

and here is some code that I am using to test it:

$str = "test_last@test.com was the email address associated with another one, another.test@other.org";
$pattern = "/[\-\.\_a-z0-9]+(\@){1}[\-\.\_a-zA-Z0-9]+(\.){1}[\-a-z0-9]+/i";
preg_match_all($pattern, $str, $matches);
var_dump($matches);

(The text between the emails is filler.) The pattern is supposed to do the following:

  1. Check for a username that can include one or more periods, dashes, underscores, or alphanumeric characters.
  2. Check for one and only one (required) "@" sign.
  3. Check for a domain or any number of subdomains (alphanumeric + periods + dashes)
  4. Check for a period followed by alphanumeric or dash characters.

When I test the code above, I get this output:

array(3) {
    [0] => array(2) {
        [0] => string(22) "test_last@test.com was"
        [1] => string(22) "another.test@other.org"
    }
    [1] => array(2) {
        [0] => string(1) "@"
        [1] => string(1) "@"
    }
    [2] => array(2) {
        [0] => string(1) " "
        [1] => string(1) "r"
    }
 }

Why is it matching so many other characters, such as single @ signs and the letter "r"? Why does the very first email contain the word "was"? To my knowledge, I never tested for spaces...

apparatix

3 Answers


To answer the question from the comments: the problem was the capturing groups in the regex, which make preg_match_all return the text matched by each group in separate arrays alongside the full matches.

Changing the regex to:

/[\-\.\_a-z0-9]+[\@]{1}[\-\.\_a-zA-Z0-9]+[\.]{1}[\-a-z0-9]+/

Returned:

Array
(
    [0] => Array
        (
            [0] => test_last@test.com
            [1] => another.test@other.org
        )

)

This is with the OP's test text.
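As a rough sketch of the difference (not the OP's exact code; the single-quoted patterns and the names `$withGroups`, `$noGroups`, `$unescaped` and `$m1` to `$m3` are made up for illustration): each pair of parentheses adds one sub-array to the result. The third pattern is an assumption: leaving the dot in the second group unescaped reproduces the "was" and "r" output from the question, because `.` then matches any character, including a space.

```php
<?php
$str = "test_last@test.com was the email address associated with another one, another.test@other.org";

// Capturing groups: preg_match_all() adds one extra sub-array per group.
$withGroups = '/[-._a-z0-9]+(@)[-._a-z0-9]+(\.)[-a-z0-9]+/i';
preg_match_all($withGroups, $str, $m1);
// $m1[0] holds the full matches, $m1[1] the "@" captures, $m1[2] the "." captures.

// No groups: only the full matches are returned.
$noGroups = '/[-._a-z0-9]+@[-._a-z0-9]+\.[-a-z0-9]+/i';
preg_match_all($noGroups, $str, $m2);

// Assumed cause of the stray " was": an unescaped dot matches ANY character,
// so the greedy domain class swallows "test.com" and "." matches the space.
$unescaped = '/[-._a-z0-9]+(@)[-._a-z0-9]+(.)[-a-z0-9]+/i';
preg_match_all($unescaped, $str, $m3);

var_dump($m1, $m2, $m3);
```

If a group is needed for repetition or alternation but not for capture, `(?:...)` groups without adding anything to the result array.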

Sammaye

PHP now has built-in filters for checking things like e-mail validity. Specifically, you might want to look into filter_var() with the FILTER_VALIDATE_EMAIL filter.

Sample usage:

// filter_var() returns the address on success, or false on failure
$valid_email = filter_var($email, FILTER_VALIDATE_EMAIL);
if ($valid_email !== false) {
    echo "Hooray!";
}

All three of your sample e-mail addresses should print "Hooray!".
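A minimal sketch that runs the three sample addresses from the question through the filter, plus one obviously invalid string for contrast (the `$candidates` and `$results` names and the invalid sample are illustrative, not from the question):

```php
<?php
$candidates = [
    "example@website.com",
    "first.last@website.org",
    "joe87_smith@web.net",
    "not an email", // illustrative invalid input
];

$results = [];
foreach ($candidates as $email) {
    // filter_var() returns the filtered string on success and false on failure,
    // so compare against false explicitly.
    $results[$email] = filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

var_dump($results);
```

Note that this validates one address at a time. To pull addresses out of a longer text, as in the question, you would still need to split or match candidates first and then filter each one.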

noko
  • Note: if you only want to accept emails with complete domains, be aware that FILTER_VALIDATE_EMAIL will also accept incomplete domains such as `example@example`. Even though this is a valid email, it might be undesired behaviour in most web apps. – Sammaye Aug 16 '12 at 07:25

Validating email addresses (with a regexp or otherwise) is problematic; see the question "Using a regular expression to validate an email address".

Community
mkataja