regex to find domain without those instances being part of subdomain.domain

Question

I'm new to regex. I need to find instances of example.com in a .sql file in Notepad++, excluding instances that are part of subdomain.example.com.

From this answer, I've tried using ^((?!subdomain))\.example\.com$, but this does not work.

I tested this in Notepad++ and at https://regex101.com/r/kS1nQ4/1, but it doesn't work in either.

Help appreciated.

Can you give an example of a matching URL as well as one which does not match? Maybe you could write a regex which matches positives rather than excluding negatives. — Tim Biegeleisen, Apr 14 '16 at 04:21
Ahh. Thanks @GiorgiNakeuri. This doesn't work in Notepad++ though, which is supposed to be [using the standard PCRE (Perl) syntax](http://docs.notepad-plus-plus.org/index.php/Regular_Expressions). — Steve, Apr 14 '16 at 04:46
If you really need the matches to be flush left, all you need is `^example\.com$` which matches lines with nothing else on them. If not, please clarify how the `^` factors in, and/or maybe show a real snippet of text from which you need the matches to be extracted. — tripleee, Apr 14 '16 at 05:31
@tripleee, the matches are not on one line with nothing else on them, they are interspersed randomly through a PHPmyAdmin `.sql` dump. — Steve, Apr 14 '16 at 05:52

Giorgi Nakeuri · Answer 1 · 2016-04-14T06:02:55.550

1

The simple pattern

^example\.com$

with the g, m and i flags matches lines that contain only example.com.

https://regex101.com/r/sJ5fE9/1

If the match can occur somewhere in the middle of a string, you can use a negative lookbehind to check that there is no dot immediately before it:

(?<!\.)example\.com

https://regex101.com/r/sJ5fE9/2
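To see what the lookbehind does outside regex101, here is a minimal Python sketch. The helper name `find_bare_domain` and the sample strings are made up for illustration, and Python's `re` module is not the engine Notepad++ uses, so treat this as a check of the pattern's logic rather than of Notepad++ behaviour.

```python
import re

# Negative lookbehind: match example.com only when the character
# immediately before it is not a dot.
BARE_DOMAIN = re.compile(r"(?<!\.)example\.com", re.IGNORECASE)


def find_bare_domain(text):
    """Return the start offsets of example.com occurrences not preceded by a dot.

    Hypothetical helper, named here for illustration only.
    """
    return [m.start() for m in BARE_DOMAIN.finditer(text)]


if __name__ == "__main__":
    # Made-up sample text standing in for a line of the .sql dump.
    sample = "http://example.com http://subdomain.example.com mail@example.com"
    for offset in find_bare_domain(sample):
        print(offset, sample[offset:offset + 11])
```

Note that the lookbehind rejects any dotted prefix, not only `subdomain.`, so `www.example.com` is skipped as well.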

edited Apr 14 '16 at 06:02

answered Apr 14 '16 at 04:28


timolawl · Answer 2 · 2016-04-14T04:34:31.397

0

Here's a solution that takes the protocols/prefixes into account:

/^(www\.)?(http:\/\/www\.)?(https:\/\/www\.)?example\.com$/

edited Apr 14 '16 at 04:34

answered Apr 14 '16 at 04:23


tripleee · Accepted Answer · 2016-04-14T06:34:55.167

Without access to example text, it's a bit hard to guess what you really need, but the regular expression

(^|\s)example\.com\>

will find example.com where it is preceded by nothing or by whitespace, and followed by a word boundary. (You could still get a false match on example.com.pk, because the position between the final m and the following period is a word boundary. Provide better examples in your question if you want better answers.)
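A rough Python sketch of that behaviour, including the example.com.pk false match. Python's `re` has no `\>` anchor, so `\b` is swapped in as the closest equivalent here. The sample lines are invented, and the `strict` variant with its extra `(?!\.\w)` lookahead is an assumption about what a tighter pattern could look like, not something from the question.

```python
import re

# Invented sample lines; the real dump was never shown in the question.
lines = [
    "example.com",
    "see example.com here",
    "http://example.com/page",
    "subdomain.example.com",
    "example.com.pk",
]

# \b stands in for \> because Python's re has no end-of-word anchor.
basic = re.compile(r"(^|\s)example\.com\b")

# Assumed tightening: also refuse a following ".<word character>",
# which is what lets example.com.pk through in the basic pattern.
strict = re.compile(r"(^|\s)example\.com\b(?!\.\w)")

basic_hits = [line for line in lines if basic.search(line)]
strict_hits = [line for line in lines if strict.search(line)]

print(basic_hits)
print(strict_hits)
```

Neither variant matches `http://example.com/page`, because the slash before the domain is neither whitespace nor the start of the line. That may matter for URLs embedded in a dump.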

If you specifically want to use a lookaround: the negative lookahead you used (as the name implies) specifies what the regex should not match at this point. So (?!subdomain\.)example trivially always matches, because the text at that position is example, not subdomain. The negative lookahead can never fail there.

You might be better served by a lookbehind:

(?<!subdomain\.)example\.com

Demo: https://regex101.com/r/kS1nQ4/3
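The difference between the two lookarounds can be checked with a short Python sketch. The sample string is made up, and Python's `re` stands in for the Notepad++ engine, so this only illustrates the logic.

```python
import re

# Made-up sample: one occurrence inside the subdomain, one bare.
sample = "subdomain.example.com and example.com"

# Lookahead: asks whether the text starting HERE is "subdomain.".
# Since "example" must also start here, the check can never fail.
lookahead = re.compile(r"(?!subdomain\.)example\.com")

# Lookbehind: asks whether the text ending HERE is "subdomain.".
lookbehind = re.compile(r"(?<!subdomain\.)example\.com")

lookahead_hits = [m.start() for m in lookahead.finditer(sample)]
lookbehind_hits = [m.start() for m in lookbehind.finditer(sample)]

print(lookahead_hits)   # both occurrences
print(lookbehind_hits)  # only the bare one
```

The lookahead version reports both occurrences, while the lookbehind version reports only the one not preceded by `subdomain.`.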
