Understand Regular Expression

Question

What does the regular expression \w+(?=@) mean? I'm trying to understand how this expression pulls users from the AD. Currently it omits all characters before a special character. Ex: Viny.Trucker results only in .Truck. I want the whole name Viny.Trucker to be extracted. Any help is greatly appreciated.

user489998 · Answer 1 · 2014-04-16T13:50:04.683

1

It means: find one or more word characters (a-z, A-Z, 0-9 and _) immediately followed by an @ symbol. The @ symbol won't be included in the matched characters. If you want to include the '.' character in the expression, you can try (\w|\.)+(?=@). Note that the dot has to be escaped there, because an unescaped . matches any character.
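A quick way to see the difference is to run both kinds of pattern against a dotted name. This is only a sketch in Python's `re` module (the question does not say which engine is in use), the address is a made-up sample, and `[\w.]+` is an equivalent character-class spelling of "word character or literal dot":

```python
import re

# Hypothetical sample value; the real AD data is not shown in the question.
address = "Viny.Trucker@example.com"

# Original pattern: word characters only, so the match cannot cross the dot.
original = re.search(r"\w+(?=@)", address)

# Variant whose character class also allows a literal dot.
with_dot = re.search(r"[\w.]+(?=@)", address)

print(original.group())  # Trucker
print(with_dot.group())  # Viny.Trucker
```

With word characters alone, only the run directly in front of the @ can match, which is why the part before the dot gets lost.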

edited Apr 16 '14 at 13:50

answered Apr 16 '14 at 10:35

user489998


`\w` also includes `[0-9_]`. – Toto Apr 16 '14 at 11:56

score 0 · Answer 2 · answered Apr 16 '14 at 10:24


Try this: http://www.regexr.com/. It comes with a very nice explanation when you hover over the characters. And here is your regex: http://www.regexr.com/38ndp


scx


score 0 · Answer 3 · answered Apr 16 '14 at 10:30


It matches the characters before an @-character.

\w+    Match at least one word-character, i.e. letters, numbers and underscores.
(?=@)  Require an @-character to follow, without including it in the match.
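To illustrate that the lookahead checks for the @ without consuming it, here is a small sketch using Python's `re` module (an assumption, since the question does not name the engine) and a made-up input string:

```python
import re

text = "jan@example.com"  # hypothetical input

m = re.search(r"\w+(?=@)", text)

matched = m.group()        # the word characters in front of the @
next_char = text[m.end()]  # the character right after the match

print(matched)    # jan
print(next_char)  # @
```

The match ends just before the @, so the @ is still there for whatever comes after the match.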


Jan Aagaard


score 0 · Answer 4 · answered Apr 16 '14 at 10:40

Jan Aagaard is right, it matches all word characters before an @.

How does it do this?

\w+   Matches one or more word characters.
(?=@) This is called a positive lookahead. It requires an @ to come right after the matched characters, but doesn't include it in the match.

A better alternative

Now, it seems that you want to match every character before an @. The easy way to do this is

[^@]+(?=@)

If you know that there will always be an @ in a line, you could even use

[^@]+

and just take the first match.
If only a certain set of characters is allowed before the @, you could even specify those in a character class. Simply put all of them in square brackets, e.g.:

[a-zA-Z0-9._]
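The three alternatives above can be compared side by side. This is a sketch in Python's `re` module with made-up input lines; appending `+(?=@)` to the character class is my own assumption about how it would be used in a full pattern:

```python
import re

line = "Viny.Trucker@example.com"  # hypothetical input

# Everything that is not an @, up to the first @.
before_at = re.search(r"[^@]+(?=@)", line).group()

# Same idea without the lookahead: take the first run of non-@ characters.
first_chunk = re.search(r"[^@]+", line).group()

# Explicit whitelist of allowed characters in front of the @.
restricted = re.search(r"[a-zA-Z0-9._]+(?=@)", line).group()

print(before_at, first_chunk, restricted)

# Without an @ in the line, the two [^@] variants behave differently.
no_at = "no at sign here"
print(re.search(r"[^@]+(?=@)", no_at))      # None
print(re.search(r"[^@]+", no_at).group())   # no at sign here
```

The last two lines show why the shorter pattern is only safe when an @ is guaranteed: without the lookahead it returns the whole line.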

Pedro Lobito · Answer 5 · 2014-04-16T11:04:26.377

It's searching for one or more letters, digits or underscores followed by an @:

\w+(?=@)


Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
   Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=@)»
   Match the character “@” literally «@»

Positive and Negative Lookahead

Negative lookahead is indispensable if you want to match something not followed by something else. When explaining character classes, this tutorial explained why you cannot use a negated character class to match a q not followed by a u. Negative lookahead provides the solution: q(?!u). The negative lookahead construct is the pair of parentheses, with the opening parenthesis followed by a question mark and an exclamation point. Inside the lookahead, we have the trivial regex u.

Positive lookahead works just the same. q(?=u) matches a q that is followed by a u, without making the u part of the match. The positive lookahead construct is a pair of parentheses, with the opening parenthesis followed by a question mark and an equals sign. You can use any regular expression inside the lookahead (but not lookbehind, as explained below). Any valid regular expression can be used inside the lookahead. If it contains capturing groups then those groups will capture as normal and backreferences to them will work normally, even outside the lookahead. (The only exception is Tcl, which treats all groups inside lookahead as non-capturing.) The lookahead itself is not a capturing group. It is not included in the count towards numbering the backreferences. If you want to store the match of the regex inside a lookahead, you have to put capturing parentheses around the regex inside the lookahead, like this: (?=(regex)). The other way around will not work, because the lookahead will already have discarded the regex match by the time the capturing group is to store its match.

http://www.regular-expressions.info/lookaround.html
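The q/u examples from the quoted tutorial can be tried out directly. The sketch below uses Python's `re` module (my choice, other engines may differ in details), and the test words and the inner pattern `u\w` are made up for illustration:

```python
import re

# Positive lookahead: the q must be followed by a u, but the u is not matched.
pos = re.search(r"q(?=u)", "quit")
print(pos.group(), pos.end())  # q 1

# Negative lookahead: keep only words containing a q NOT followed by a u.
# A q at the very end of a word also qualifies, since nothing follows it.
words = ["quit", "Iraq", "qatar"]
neg_words = [w for w in words if re.search(r"q(?!u)", w)]
print(neg_words)  # ['Iraq', 'qatar']

# Capturing parentheses inside the lookahead store what the lookahead saw,
# while the overall match is still just the q.
cap = re.search(r"q(?=(u\w))", "quit")
print(cap.group(), cap.group(1))  # q ui
```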
