String that doesn't contain character group

Question

I wrote a regex for finding URLs in text:

/(http[^\s]+)/g

Now I need the same thing, except that the matched URL must not contain a certain substring. For instance, I want all the URLs that don't contain the word google.

How can I do that?

http://stackoverflow.com/questions/1395177/regex-to-exclude-a-specific-string-constant — Nvan, Nov 06 '15 at 14:09
@Onilol, that would break the regex because any url that contains `g`, `o`, `l`, or `e` after the protocol would match your condition. Square brackets mean it'll match one character of the group (or not match, with `^`), not the entire string. — ps2goat, Nov 06 '15 at 14:13

Wiktor Stribiżew · Accepted Answer · 2015-11-09T07:59:56.287

Here is a way to achieve that:

http:\/\/(?!\S*google)\S+


JS:

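// Match http:// URLs that do not contain "google"
var re = /http:\/\/(?!\S*google)\S+/g;
var str = 'http://ya.ru http://yahoo.com http://google.com';
var m;

// With the g flag, exec() returns successive matches and then null
while ((m = re.exec(str)) !== null) {
    document.getElementById("r").innerHTML += m[0] + "<br/>";
}

<div id="r"></div>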

Regex breakdown:

http:\/\/ - the literal sequence http://
(?!\S*google) - a negative lookahead that checks forward from the current position (i.e. right after http://); if zero or more non-whitespace characters followed by google can be matched there, the match fails.
\S+ - one or more non-whitespace characters (this is needed because the lookahead above does not consume the characters it checks).
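
If the excluded word is not fixed, the same lookahead can be built at runtime. The following is only a sketch: the helper names escapeRegExp and urlsWithout are made up for illustration and are not part of the original answer, and like the pattern above it only handles http:// URLs. The word is escaped first so that characters such as . are treated literally inside the lookahead.

```javascript
// Hypothetical helper: escape regex metacharacters so `word` is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: return all http:// URLs in `text` that do not contain `word`.
function urlsWithout(text, word) {
  // Same shape as the pattern above: http:// + negative lookahead + \S+
  var re = new RegExp("http://(?!\\S*" + escapeRegExp(word) + ")\\S+", "g");
  // match() returns null when nothing matches, so fall back to an empty array
  return text.match(re) || [];
}

console.log(urlsWithout("http://ya.ru http://yahoo.com http://google.com", "google"));
// [ 'http://ya.ru', 'http://yahoo.com' ]
```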

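Note that if a URL may be followed by punctuation (a comma, for example), you can add \b at the end of the pattern. The word boundary forces \S+ to back off to the last word character, so the trailing punctuation is left out of the match: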

// re1 keeps trailing punctuation; re2 ends the match at the last word character
var re1 = /http:\/\/(?!\S*google)\S+/g;
var re2 = /http:\/\/(?!\S*google)\S+\b/g;
document.write(
  JSON.stringify(
    'http://ya.ru, http://yahoo.com, http://google.com'.match(re1)
  ) + "<br/>"
);

document.write(
  JSON.stringify(
    'http://ya.ru, http://yahoo.com, http://google.com'.match(re2)
  )
);
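
Outside a browser, document.write is not available. A minimal sketch of the same comparison using console.log (for example in Node.js) is shown below; the variable names text, withPunctuation, trimmed and slashCase are chosen here for illustration.

```javascript
var re1 = /http:\/\/(?!\S*google)\S+/g;
var re2 = /http:\/\/(?!\S*google)\S+\b/g;
var text = 'http://ya.ru, http://yahoo.com, http://google.com';

// Without \b the comma after each URL is part of the match
var withPunctuation = text.match(re1);
console.log(JSON.stringify(withPunctuation));
// ["http://ya.ru,","http://yahoo.com,"]

// With \b the match ends at the last word character
var trimmed = text.match(re2);
console.log(JSON.stringify(trimmed));
// ["http://ya.ru","http://yahoo.com"]

// Side effect of \b: a trailing slash is dropped as well
var slashCase = 'http://ya.ru/'.match(re2);
console.log(JSON.stringify(slashCase));
// ["http://ya.ru"]
```

The last case is worth keeping in mind: \b trims any trailing non-word characters, not just commas and periods, so a URL ending in / loses that slash.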

Sorry, I have little time, I will add some more explanation later. Main thing is that a look-ahead will fail a match if the non-whitespaces contain `google`. — Wiktor Stribiżew, Nov 06 '15 at 14:30
