31

My string of text looks like this:

johndoe@domain.com (John Doe)

I need to get just the part before the @ and nothing else. The text comes from a SimpleXML object, if that matters.

The code I have looks like this:

$authorpre = $key->{"author"};
$re1 = '((?:[a-z][a-z]+))';

if ($c = preg_match_all ("/".$re1."/is", $authorpre, $matches))
{
    $author = $matches[1][0];
}

Sometimes the username has numbers or an underscore before the @ symbol, and that seems to be where the regex stops matching.

Peter Mortensen
mrpatg
  • Your regexp has an outer capturing group `()` and an inner non-capturing group `(?:)`. The inner non-capturing group may be unnecessary given that you want to capture what is inside. The `[a-z]` means capture a lower-case letter. The `[a-z]+` means capture 1 or more lower-case letters. So effectively your expression means capture anything that is 2 or more lower-case letters long. If you were to put a `^` at the very front of your expression it would ensure that matching only takes place from the _beginning_ of the text. – PP. Nov 25 '09 at 17:04
  • Won't be very fun, I fear. Some example strings you may want to test: `"John Doe"@example.com (John Doe)`, `"(>'.')>"@example.com (John Doe)`, `foo@[192.168.2.1] (John Doe)`, `^.^@example.com (John Doe)`, `"a@b@c"@example.com (John Doe)`. Yes, those are all valid e-mail addresses :-) – Joey Nov 25 '09 at 17:06
  • @Johannes: `"a@b@c"@example.com (John Doe)` is really allowed? That really complicates things... – Welbog Nov 25 '09 at 17:08
  • 2
    Welbog: http://en.wikipedia.org/wiki/E-mail_address ... you can quote the local-part which allows for characters otherwise not allowed. – Joey Nov 25 '09 at 17:13

12 Answers

96

The regular expression that will match and capture any character until it reaches the @ character:

([^@]+)

That seems like what you need. It'll handle all kinds of freaky variations on e-mail addresses.
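As a rough sketch of how that pattern could be applied to the string from the question: the helper name `localPartByRegex` is made up for illustration, and the `^` anchor and trailing `@` are additions to the pattern above so that a string without any `@` yields no match instead of the whole string.

```php
<?php
// Hypothetical helper, not part of the original answer.
function localPartByRegex(string $address): ?string
{
    // ^ anchors at the start; [^@]+ captures everything up to the first @.
    if (preg_match('/^([^@]+)@/', $address, $m) === 1) {
        return $m[1];
    }
    return null; // no @ found
}

echo localPartByRegex('john_doe99@domain.com (John Doe)'), "\n";
```

Because the character class excludes only `@`, digits and underscores in the username are captured as well.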


I'm not sure why Ben James deleted his answer, since I feel it's better than mine. I'm going to post it here (unless he undeletes his answer):

Why use regex instead of string functions?

$parts = explode("@", "johndoe@domain.com");
$username = $parts[0];

You don't need regular expressions in this situation at all. I think using explode is a much better option, personally.


As Johannes Rössel points out in the comments, e-mail address parsing is rather complicated. If you want to be 100% sure that you will be able to handle any technically-valid e-mail address, you're going to have to write a routine that will handle quoting properly, because both solutions listed in my answer will choke on addresses like "a@b"@example.com. There may be a library that handles this kind of parsing for you, but I am unaware of it.
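For what it's worth, a routine along those lines might look roughly like the sketch below. It is only an approximation, not a full RFC-compliant parser: it assumes double quotes delimit quoted sections and that a backslash escapes the next character, and it ignores other parts of the grammar such as comments before the address. The function name `localPartUnquoted` is hypothetical.

```php
<?php
// Hypothetical helper: returns the text before the first @ that is not
// inside double quotes, or null if there is no such @.
function localPartUnquoted(string $address): ?string
{
    $inQuotes = false;
    $length = strlen($address);
    for ($i = 0; $i < $length; $i++) {
        $ch = $address[$i];
        if ($ch === '\\') {
            $i++; // skip the escaped character that follows the backslash
            continue;
        }
        if ($ch === '"') {
            $inQuotes = !$inQuotes; // entering or leaving a quoted section
        } elseif ($ch === '@' && !$inQuotes) {
            return substr($address, 0, $i);
        }
    }
    return null;
}

echo localPartUnquoted('"a@b"@example.com (John Doe)'), "\n";
```

A character-by-character scan like this tends to be easier to follow than a regex that tries to track quoting state.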

Community
Welbog
  • Depending on how intense your regex can get, I personally like the explode function. It fits well with what you're looking for. – Anthony Forloney Nov 25 '09 at 17:06
  • 3
    What's with the e-mail address `"a@b"@example.com`? – Joey Nov 25 '09 at 17:07
  • 1
    The fun never ends with source routes in e-mail addresses: http://www.remote.org/jochen/mail/info/address.html – PP. Nov 25 '09 at 17:09
  • @Johannes: Is the `@` character allowed in the domain portion of the address? Because, if not, both solutions could still work as long as they look for the *last* `@` character instead of the first. – Welbog Nov 25 '09 at 17:09
  • 1
    Taking everything before the *last* `@` should work, yes. Unless Jane Doe comes with the cool idea to use `"j@nedoe"@example.com (J@ne Doe)` ... – Joey Nov 25 '09 at 17:11
  • @Johannes: Good catch. So, last `@` character before the first `(` character, then. Does anyone have a counterexample for that one? – Welbog Nov 25 '09 at 17:12
  • @Greg: No, because of Johannes' counterexample of the address `"a@b"@example.com`. – Welbog Nov 25 '09 at 17:13
  • Maybe `"j@n(= doe"@example.com (J@ne Doe)`? :D – Joey Nov 25 '09 at 17:13
  • (Ok, this is getting weird. I'm beginning to understand why e-mail addresses get much more restricted by many e-mail providers.) – Joey Nov 25 '09 at 17:14
  • @Johannes: Damn it. Does PHP have any built-in or third-party e-mail parsing libraries? – Welbog Nov 25 '09 at 17:15
  • Not that I know of. Still, I think your idea is perfectly fine for roughly 100 % of all e-mail addresses *in use*. But I still want to have one of those addresses one day to pester applications with poor validation routines :-) – Joey Nov 25 '09 at 17:16
  • @Johannes: So do I... Just as a subtle protest to whoever it is who came up with such a complicated grammar for e-mail addresses. – Welbog Nov 25 '09 at 17:17
  • However, if you just search for the first *unquoted* `@` and take everything before that, it should work. – Joey Nov 25 '09 at 17:22
  • @Johannes: Yeah, but deciding whether a character is quoted or not is bordering on the upper limit of regular expressions' domain. It's possible, sure, but it won't be pretty. – Welbog Nov 25 '09 at 17:24
  • Oh, and quoting can also be done with a backslash ... Yes, it's definitely unpretty :-) – Joey Nov 25 '09 at 17:28
  • To bypass the @ problem, just count the number of items in the array. If it's 2, then the email does not contain any extra @. If it has more, you just have to take all the items in the array except the last one and join them with a @ :D – AntonioCS Nov 25 '09 at 17:34
  • @AntonioCS: That works too, unless there's a `@` in the parenthetical string like Johannes' example `"j@n(= doe"@example.com (J@ne Doe)`. Counting the `@`s will lead you to believe that `ne Doe` is the domain. – Welbog Nov 25 '09 at 17:39
  • 1
    Your regexp snippet "[^"]*" does not correctly match a quoted string, since a quoted string may contain escaped quote characters. For instance, "contains \"quotes\"" is a valid address. It would be better with "(?:[^"]|\\.)*". – markusk Nov 25 '09 at 18:08
  • @markusk: This is exactly why I wouldn't use a regular expression in this situation. I'm just going to pull it down. People who want to see it can look in the revision history. – Welbog Nov 25 '09 at 18:10
  • What about using the limit in the explode function? Something like this `implode(explode('@',$email,-1),'');` That would give you everything before the last '@'. – willwashburn Nov 10 '14 at 13:52
  • Would need to implode with `@` so that any non-final occurrences are restored: `implode(explode('@',$email,-1),'@');` – GaryJ May 20 '15 at 20:13
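The limit-based idea from the last two comments could be sketched as follows. The sample address is just an illustration, and the separator is passed to `implode` first, since the reversed argument order shown in the comments is no longer accepted as of PHP 8.

```php
<?php
$email = '"a@b"@example.com';

// A negative limit makes explode() return every piece except the last one.
$pieces = explode('@', $email, -1);

// Re-join with @ so that any non-final occurrences are restored.
$local = implode('@', $pieces);

echo $local, "\n";
```

Note that if `$email` contains no `@` at all, `explode` with a negative limit returns an empty array, so `$local` ends up as an empty string.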
7

@OP, if you only want to get everything before the last @, just use string/array functions. There is no need for a complicated regex. Explode on "@", then remove the last element, which is the domain part:

$str = '"peter@john@doe"@domain.com (John Doe)';
$s = explode("@",$str);
array_pop($s); #remove last element.
$s = implode("@",$s);
print $s;

output

$ php test.php
"peter@john@doe"
ghostdog74
6

Maybe this variant is a bit slower than explode(), but it takes only one line:

$name = preg_replace('/@.*?$/', '', $email);
Peter Mortensen
Leksat
4
<?php
$email  = 'name@example.com';
$domain = strstr($email, '@');
echo $domain; // prints @example.com

$user = strstr($email, '@', true); // As of PHP 5.3.0
echo $user; // prints name
?>

source

munjal
4

I used preg_replace

$email_username = preg_replace('/@.*/', '', $_POST['email']);
michalzuber
  • This will fail on `"()<>[]:,;@\\"!#$%&'-/=?^_\`{}| ~.a"@example.org` and other emails with quoted `@` locals. Read more: https://stackoverflow.com/a/38787343/2943403 – mickmackusa May 06 '20 at 06:55
3

My suggestion:

$email = 'johndoe@domain.com';
$username = substr($email, 0, strpos($email, '@'));

// Output (in $username): johndoe
webcoder
  • This will fail on `"()<>[]:,;@\\"!#$%&'-/=?^_\`{}| ~.a"@example.org` and other emails with quoted `@` locals. Read more: https://stackoverflow.com/a/38787343/2943403 – mickmackusa May 06 '20 at 07:00
2

I'd go with $author = str_replace(strrchr($authorpre, '@'), '', $authorpre);
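An equivalent way to cut at the last `@`, sketched here with `strrpos` and `substr` so that the missing-`@` case is explicit. The helper name `localPartBeforeLastAt` is invented for this example.

```php
<?php
// Hypothetical helper: everything before the last @, or null if there is none.
function localPartBeforeLastAt(string $address): ?string
{
    $pos = strrpos($address, '@'); // position of the last @, or false
    if ($pos === false) {
        return null;
    }
    return substr($address, 0, $pos);
}

echo localPartBeforeLastAt('johndoe@domain.com (John Doe)'), "\n";
```

Like any "last `@`" approach, this still gives the wrong result when the trailing display name itself contains an `@`.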

Arkh
2

You could start by using mailparse_rfc822_parse_addresses to parse the address and extract just the address specification without any display name. Then, you could extract the part before @ with the regexp (.*)@.

markusk
1

Use something like this:

list($username, $domain) = explode('@', $email . "@"); // the appended "@" is a trick: see the note below

With this solution, both parts of the email address are assigned to variables in a single line.

Note on ."@": appending the extra @ guarantees that explode returns at least two elements, so list does not raise an error when $email contains no @.

Peter Mortensen
kante
  • 1
    This will fail on `"()<>[]:,;@\\"!#$%&'-/=?^_\`{}| ~.a"@example.org` and other emails with quoted `@` locals. Read more: https://stackoverflow.com/a/38787343/2943403 – mickmackusa May 06 '20 at 06:56
1

In case someone is still looking in 2020, here's a regex that can pick out the text before the '@':

^(\S+)(?=@)
Amjad Desai
  • This will not match valid addresses like `" "@example.org`. See [valid/invalid addresses](https://en.wikipedia.org/wiki/Email_address#Examples) – Toto Jun 28 '20 at 08:31
0

Basic example:

    $email = "linuxUser@IsGrand.com";
    if (filter_var($email, FILTER_VALIDATE_EMAIL)) {
        list($user, $domain) = explode('@', trim($email) . "@");
    } else {
        echo "Unable to get account info ....";
    }

Complex example: something like this can populate first-name and last-name fields:

1) If the email is valid, split it into its two parts, user and domain.
2) Otherwise, fall back to default values.
3) Use the parts from the email address only when no decoded value is available.

Code:

    if (filter_var($email, FILTER_VALIDATE_EMAIL)) {
        list($fname, $lname) = explode('@', trim($email) . "@");
    } else {
        $fname = "Xdefault";
        $lname = "Ydefault";
    }

    $fname = (!empty($decoded['firstname'][0]))  ? $decoded['firstname'][0] : $fname ;
    $lname = (!empty($decoded['lastname'][0]))  ? $decoded['lastname'][0] : $lname ;
Mike Q
  • This will fail on `"()<>[]:,;@\\"!#$%&'-/=?^_\`{}| ~.a"@example.org` and other emails with quoted `@` locals. Read more: https://stackoverflow.com/a/38787343/2943403 – mickmackusa May 06 '20 at 06:59
0

I would also like to suggest a non regex solution here as it may be useful in most cases:

strstr('n.shah@xyz.co', '@', true)

output:

n.shah

Nabeel MS