What is the purpose of ?: and .*? before \K in regular expressions?

Question

I have a regular expression that should match words with a . between them as potential URLs, but not those preceded by @, since those are assumed to be emails.

This is the regex that I have:

(?:\@(https?:\/\/)?(\w+(\-*\w+)*\.)[a-zA-Z\.]+[\w+\/?\#?\??\=\%\&\-]+.*?)*\K(https?:\/\/)?(\w+(\-*\w+)*\.)[a-zA-Z\.]+[\w+\/?\#?\??\=\%\&\-]+

This does not work correctly for the last email address in the input.

For example, for the string

twitter.com facebook.com kamur@test.com ksou@uni.edu vimal@gsomething.com balaji@sweets.com john wayne <johnwayne@dc.com> 20,000.00

I expect the matches to be twitter.com and facebook.com.

But it also matches dc.com.

How is it working? How do you expect it to work? Please provide an [MCVE]. Thanks! — jpaugh, Sep 25 '18 at 23:00
This pattern doesn't make any sense, don't waste your time with it and read tutorials. — Casimir et Hippolyte, Sep 25 '18 at 23:02
?: Means non capturing Group, .*? Means non greedy zero or more of any character. — Poul Bak, Sep 25 '18 at 23:09
The matches, you showed, can really mean anything and does not have to be urls. The regex starts correctly with 'https?:'. — Poul Bak, Sep 25 '18 at 23:17
Possible duplicate of [Reference - What does this regex mean?](https://stackoverflow.com/questions/22937618/reference-what-does-this-regex-mean) — Ken White, Sep 25 '18 at 23:36

K.Dᴀᴠɪs · Answer 1 · 2018-09-25T23:51:02.830

In your (?:\@(https?:\/\/), the ? in https?: will match either http or https. The ? literally means 0 or 1 of the character s. The : you refer to in https?: is matching a literal :, nothing special.

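The ?: is a different case: when it comes directly after an unescaped opening parenthesis, as in (?:, it marks the group as non-capturing. The group still groups its contents, but the text it matches is not stored as a numbered group.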

Escaped: \(?:, not a non-capturing group
Not-Escaped: (?:, is a non-capturing group
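
A minimal sketch of that difference, using Python's re module purely for illustration (the question's own engine must support \K, which Python's re does not, so \K is left out here). The sample strings and variable names are made up for this example:

```python
import re

text = "https://example.com"

# Capturing group: the scheme is stored as group 1, the host as group 2.
capturing = re.match(r"(https?://)(\w+\.\w+)", text)

# Non-capturing group: (?:...) still groups, but stores nothing,
# so the host becomes group 1.
non_capturing = re.match(r"(?:https?://)(\w+\.\w+)", text)

# Escaped parenthesis: \( is a literal "(", so \(?: means
# "an optional literal ( followed by a literal :".
escaped = re.search(r"\(?:", "(:")

print(capturing.groups())      # ('https://', 'example.com')
print(non_capturing.groups())  # ('example.com',)
print(escaped.group())         # (:
```

The same grouping rules apply in PCRE-style engines; only the availability of \K differs.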

The next portion of your question, what does the .*? in [\w+\/?\#?\??\=\%\&\-]+.*? refer to?

. will match any character
* is a quantifier that will match the preceding . (any character) 0 to unlimited times
*? makes * non-greedy (lazy): it matches as few characters as possible while still allowing the rest of the pattern to match.
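
To make the greedy versus non-greedy behaviour concrete, here is a small sketch in Python's re module. The input string and variable names are invented for the example and are not taken from the question:

```python
import re

text = "<a> <b>"

# Greedy: .* grabs as much as it can, then backtracks to the last ">".
greedy = re.search(r"<.*>", text).group()

# Lazy: .*? grabs as little as it can, stopping at the first ">".
lazy = re.search(r"<.*?>", text).group()

# With nothing after it, a lazy .*? is satisfied by matching zero characters.
trailing = re.search(r"a.*?", "abc").group()

print(greedy)    # <a> <b>
print(lazy)      # <a>
print(trailing)  # a
```

The last case shows that a lazy .*? only consumes characters when whatever follows it in the pattern forces it to.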
