I wrote a regex for finding URLs in text:

/(http[^\s]+)/g

Now I need the same thing, except that the match must not contain a certain substring. For instance, I want all URLs that don't contain the word google.

How can I do that?

Aleksa
  • http://stackoverflow.com/questions/1395177/regex-to-exclude-a-specific-string-constant – Nvan Nov 06 '15 at 14:09
  • Have you tried `(?!^google)`? – Onilol Nov 06 '15 at 14:09
  • @Onilol, that would break the regex because any URL that contains `g`, `o`, `l`, or `e` after the protocol would match your condition. Square brackets mean it'll match one character of the group (or not match, with `^`), not the entire string. – ps2goat Nov 06 '15 at 14:13
  • @ps2goat my bad! Still caffeinating – Onilol Nov 06 '15 at 14:13

1 Answer

Here is a way to achieve that:

http:\/\/(?!\S*google)\S+

JS:

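// Match http:// URLs that do not contain "google" anywhere after the protocol
var re = /http:\/\/(?!\S*google)\S+/g;
var str = 'http://ya.ru http://yahoo.com http://google.com';
var m;

// With the g flag, exec() resumes after the previous match on every call
while ((m = re.exec(str)) !== null) {
    document.getElementById("r").innerHTML += m[0] + "<br/>";
}
<div id="r"></div>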

Regex breakdown:

  • http:\/\/ - the literal sequence http://
  • (?!\S*google) - a negative lookahead that checks forward from the current position (i.e. right after http://); if it finds zero or more non-whitespace characters followed by google, the match fails at that position.
  • \S+ - 1 or more non-whitespace characters (this is necessary since the lookahead above does not consume the characters it checks).
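
The breakdown above can be turned into a small reusable helper. This is only a sketch: the function name findUrlsWithout, its parameters, and the https? part (which also accepts https:// links, as the pattern in the question did) are assumptions added for illustration and are not part of the original answer. The excluded word is escaped so that characters such as . are matched literally.

```javascript
// Hypothetical helper (not from the original answer): returns every URL in
// `text` that does not contain `banned` anywhere after the protocol.
function findUrlsWithout(text, banned) {
  // Escape regex metacharacters so `banned` is treated as literal text.
  const escaped = banned.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Assumption: https? also accepts https:// links.
  const re = new RegExp("https?:\\/\\/(?!\\S*" + escaped + ")\\S+", "g");
  // match() returns null when nothing matches, so fall back to an empty array.
  return text.match(re) || [];
}

const urls = findUrlsWithout(
  "http://ya.ru https://yahoo.com http://google.com http://maps.google.com/x",
  "google"
);
console.log(urls); // [ 'http://ya.ru', 'https://yahoo.com' ]
```

Because the lookahead scans the whole run of non-whitespace characters, a URL is rejected when the word appears in the host or anywhere in the path. The check is case-sensitive; adding the i flag would be one way to relax that.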

Note that if the URL may be followed by punctuation, you can add \b right at the end of the pattern so that the match ends at the last word character:

var re1 = /http:\/\/(?!\S*google)\S+/g; 
var re2 = /http:\/\/(?!\S*google)\S+\b/g; 
document.write(
  JSON.stringify(
    'http://ya.ru, http://yahoo.com, http://google.com'.match(re1)
  ) + "<br/>"
);

document.write(
  JSON.stringify(
    'http://ya.ru, http://yahoo.com, http://google.com'.match(re2)
  )
);
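
For anyone trying this outside a browser, here is a console-based sketch of the same comparison. The variable names are mine, and the last line shows a side effect worth keeping in mind: \b backs the match up to the last word character, so a trailing slash is dropped too.

```javascript
// The two patterns from the snippet above, under illustrative names.
const withoutBoundary = /http:\/\/(?!\S*google)\S+/g;
const withBoundary = /http:\/\/(?!\S*google)\S+\b/g;
const text = "http://ya.ru, http://yahoo.com, http://google.com";

// Without \b the trailing commas are part of each match.
console.log(text.match(withoutBoundary)); // [ 'http://ya.ru,', 'http://yahoo.com,' ]

// With \b the match backtracks to the last word character.
console.log(text.match(withBoundary)); // [ 'http://ya.ru', 'http://yahoo.com' ]

// Caveat: a trailing slash is a non-word character, so it is trimmed as well.
console.log("http://ya.ru/ ".match(withBoundary)); // [ 'http://ya.ru' ]
```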
Wiktor Stribiżew