Regex explanation - how does it work?

Question

I came across "REGEX", but I don't really understand the symbols it uses, for example... and many more

rege = /^([A-Za-z0-9_\-\.])+\@([A-Za-z0-9_\-\.])+\.([A-Za-z]{2,4})$/;

I use the expression above for email verification and it works, but as I said, I don't fully understand how it works. Does anyone have a simple tutorial or website that explains it?

I am asking because I am going to write one myself for float verification.

It should accept digits 0-9 only, plus the decimal point (.), with length = 5.

example: 99.9

@Drakosha Do you mean he should google `/^([A-Za-z0-9_\-\.])+\@([A-Za-z0-9_\-\.])+\.([A-Za-z]{2,4})$/`? — Ram, Sep 09 '12 at 18:24
Dear Drakosha, I did google it and search Stack Overflow, but none of it made me fully understand; maybe it is my poor English. So I'm asking for a good website, thanks — user1493339, Sep 09 '12 at 18:24
Don't write your own "is a number" regex. There are plenty of examples of such expressions online; use one of those. — Lone Shepherd, Sep 09 '12 at 18:24
[Duplicate of #5908817](http://stackoverflow.com/questions/5908817/regular-expression-for-dummies) — mekwall, Sep 09 '12 at 18:27
Dear L.B, now I know that regex stands for regular expression! ^_^ Thanks, this will help me with searching later. — user1493339, Sep 09 '12 at 18:27
What do you mean by length? Are you referring to precision or scale? for 99.873, the precision is 5 and the scale is 3. So what is length for this number? — Kash, Sep 09 '12 at 19:22
You can check everything here in the explanation panel http://regex101.com/r/oV8rZ0/1 — Ties, Aug 06 '14 at 15:46

Matthew Blancarte · Answer 1 · 2012-09-09T18:35:32.137

Beginning of a string:

^

This part of the string may contain the characters A-Z, a-z, 0-9, underscores, hyphens, and periods.

([A-Za-z0-9_\-\.])

Match one or more characters using the previous ruleset:

+

Continuing with an '@':

\@

Continuing with one or more characters from the same set (the domain name):

([A-Za-z0-9_\-\.])+

Continuing with a period:

\.

This part of the string may contain the characters A-Z and a-z, and must be 2, 3, or 4 characters in length.

([A-Za-z]{2,4})

End of a string:

$
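As a quick check of the breakdown above, here is a minimal JavaScript sketch that runs the pattern from the question against a few inputs. The sample addresses and the names `emailPattern` and `samples` are made up for illustration; they are not from the original post.

```javascript
// The pattern from the question, copied unchanged.
const emailPattern = /^([A-Za-z0-9_\-\.])+\@([A-Za-z0-9_\-\.])+\.([A-Za-z]{2,4})$/;

// Hypothetical inputs, each chosen to exercise one part of the pattern.
const samples = [
  "john.doe@example.com",    // every part matches
  "john.doe@example",        // no period followed by a top-level domain
  "john doe@example.com",    // a space is not in the allowed character set
  "john.doe@example.museum"  // last part is longer than 4 letters
];

for (const s of samples) {
  console.log(s, emailPattern.test(s));
}
// john.doe@example.com true
// john.doe@example false
// john doe@example.com false
// john.doe@example.museum false
```

The last case shows that the {2,4} limit rejects longer top-level domains, so this pattern is only a rough filter and not a complete email validator.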

score 2 · Answer 2 · answered Sep 09 '12 at 18:31

The regex is, as you might know, defined between the two / characters.

The first thing in the regex is the ^ character, which indicates the beginning of the string, so the match cannot start partway through.

Then a group is defined within the parentheses (); this allows you to build a subpattern and return a partial match. The square brackets [] hold a list of characters that are allowed to match, in your case A-Za-z0-9_\-\., so basically every letter, number, underscore, dash and dot character. The + means that this group matches 1 or more occurrences.

Then the @ sign is matched by \@, the \ escapes the next character so the default functionality is ignored and the character itself is matched.

Then there is another group like the first one, a . character is matched and the top level domain (only letter characters with a length of 2 to 4) is matched.

The $ means the end of the string, so the whole string has to match and not only its beginning.
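To see what the ^ and $ anchors change, here is a small JavaScript sketch comparing an anchored and an unanchored pattern. The digit-only patterns and the names `anchored`, `unanchored` and `input` are chosen only for illustration.

```javascript
// Same character rule, with and without the ^ ... $ anchors.
const anchored = /^\d+$/;  // the whole string must consist of digits
const unanchored = /\d+/;  // digits anywhere in the string are enough

const input = "abc123"; // hypothetical test string

console.log(anchored.test(input));   // false: "abc" is not made of digits
console.log(unanchored.test(input)); // true: "123" is found inside the string
console.log(anchored.test("123"));   // true: the entire string is digits
```

For validation you normally want the anchored form, otherwise any string that merely contains a valid part would pass.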

score 1 · Answer 3 · answered Sep 09 '12 at 18:35

The enclosing /slashes/ define the start and end of the pattern
"^" means beginning of string
Anything inside (parentheses) will be matched as a separate group/part
The characters inside [brackets] mean "match any one of these"
"+" means "match whatever came before, one or more times"
"\x" means escape, which makes it possible to catch regex language characters, such as "*"
\@ means the "@" character escaped (not always necessary)
{x,y} means "match whatever came before, x to y times"
"$" means end of string

So, your pattern means:

A string starting with one or more url-safe characters (alphanumeric, ".", "-" and "_")
Then an "@"
Then one or more url-safe characters again
Then a dot "."
And lastly, 2 to 4 chars from the latin alphabet (e.g. "com", "eu", or "info")

...which is an email address
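Two of the building blocks listed above, escaping and the {x,y} quantifier, can be tried in isolation. This is a minimal JavaScript sketch; the patterns and the names `anyChar`, `literalDot` and `twoToFour` are illustrative only.

```javascript
// An unescaped "." matches any character; "\." matches only a literal dot.
const anyChar = /^a.c$/;
const literalDot = /^a\.c$/;

console.log(anyChar.test("abc"));    // true
console.log(anyChar.test("a.c"));    // true
console.log(literalDot.test("abc")); // false
console.log(literalDot.test("a.c")); // true

// {2,4} repeats the preceding item between 2 and 4 times.
const twoToFour = /^[A-Za-z]{2,4}$/;

console.log(twoToFour.test("a"));      // false: only 1 letter
console.log(twoToFour.test("info"));   // true: 4 letters
console.log(twoToFour.test("museum")); // false: 6 letters
```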

Experiment with it here: RegExr

score 1 · Answer 4 · edited May 23 '17 at 11:44

For a good tutorial, you can start with this link. Though this link speaks in generic terms about regexes, you should be aware (and the link also explains) that there are subtle variations between regex syntax and behavior among different regex engines used by different languages.

The breakdown of your regex is:

// The enclosing slashes denote it is a regex pattern.
^$ These denote the start and end of the string (or of each line when the multiline flag is set).
([A-Za-z0-9_\-\.])+ The square brackets [] denote a character class, which matches ONE character out of those listed inside the brackets. The + quantifier outside the class means that the class may repeat 1 or more times. In your regex, the character class matches a word character (alphanumeric or underscore), a hyphen, or a dot. This can be rewritten as ([-\w.]+). Notice that most special characters do not need a backslash inside a character class: the dot is literal there, and the hyphen is literal when it comes first or last. For email address validation, this attempts to match the "local-part".
\@ This matches the at sign (@) in the email address. It does not actually need to be escaped with a backslash.
([A-Za-z0-9_\-\.])+ This denotes the same as the "local-part", only now it attempts to match the domain address of the email.
\. This matches the period between the domain and the top-level domain. It needs to be escaped with a backslash because an unescaped period is a metacharacter that matches any character (except newlines).
([A-Za-z]{2,4}) This denotes alphabetic case-insensitive string from 2 to 4 characters in length. This represents the top level domain name like com, org, etc...

So the above regex can be better written as ^([-\w.]+)@([-\w.]+)\.([A-Za-z]{2,4})$.
Here the + quantifiers are placed inside the parentheses. The most common reason to use () is to capture a part of the pattern so that you can extract it later. Hence, if you need to capture the "local-part" or "domain" of the email address, it makes more sense to capture the whole repeated character class than only its last repetition, which is what your regex captures.
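The effect of moving the + inside the parentheses can be seen by comparing the captured groups. In JavaScript, a group with the quantifier outside it keeps only the text from its final repetition. The sketch below assumes a made-up address, and the variable names are illustrative.

```javascript
// Quantifier outside the group: each group captures a single character.
const quantifierOutside = /^([-\w.])+@([-\w.])+\.([A-Za-z]{2,4})$/;
// Quantifier inside the group: each group captures the whole run.
const quantifierInside = /^([-\w.]+)@([-\w.]+)\.([A-Za-z]{2,4})$/;

const address = "john.doe@example.com"; // hypothetical input

const outside = quantifierOutside.exec(address);
const inside = quantifierInside.exec(address);

// slice(1) drops the full match and keeps only the captured groups.
console.log(outside.slice(1)); // [ 'e', 'e', 'com' ]
console.log(inside.slice(1));  // [ 'john.doe', 'example', 'com' ]
```

Both patterns accept the same strings; they differ only in what you can extract afterwards.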

You can see this regex in action here at RegexPal

As for matching a float, this has already been discussed here.
But if you need a precision of 5 for your float, then it is a whole different deal.
Try this which uses a positive lookahead:

^[-+]?(?=\d+(\.\d+)?)(\.?\d){1,5}$

and play with it here
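Below is a hedged JavaScript sketch for trying a five-digit float pattern of this kind. It compares the pattern `^[-+]?(?=\d+(\.\d+)?)(\.?\d){1,5}$` with a variant of my own that adds a `$` inside the lookahead; without that anchor, the lookahead only checks that the number starts with a digit, so inputs with several dots get through. The names `asGiven`, `anchoredLookahead` and `inputs` are illustrative, and "length" is assumed here to mean at most 5 digits in total.

```javascript
// The lookahead has no end anchor, so it only checks the first digits.
const asGiven = /^[-+]?(?=\d+(\.\d+)?)(\.?\d){1,5}$/;
// Assumed variant: "$" inside the lookahead allows at most one dot.
const anchoredLookahead = /^[-+]?(?=\d+(\.\d+)?$)(\.?\d){1,5}$/;

// Hypothetical test inputs.
const inputs = ["99.9", "12345", "123456", "1.2.3", ".5"];

for (const s of inputs) {
  console.log(s, asGiven.test(s), anchoredLookahead.test(s));
}
// 99.9 true true
// 12345 true true
// 123456 false false
// 1.2.3 true false
// .5 false false
```

Both versions count digits only, so "99.9" uses 3 of the 5 allowed digits; if "length = 5" is meant to include the decimal point, the pattern would need to be adjusted.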

score 1 · Answer 5 · answered Sep 09 '12 at 19:23

May I recommend the following website: http://www.regex101.com

Here's a link to an explanation of your regular expression: http://regex101.com/r/aE3yB5

You can also insert test strings and test your expression.

The website will explain any given expression and find errors in them if there are any.

I hope this helps!
