regex ignore (stop capture) strings encapsulated in square brackets

Asked Jun 21 '18 at 12:10

Active Jun 21 '18 at 14:44

Viewed 123 times

I have a string of text that contains several domains, such as google.de, www.google.de, etc. I want to capture these but ignore any domain enclosed in square brackets. At the moment I have the following:

https://regex101.com/r/t8IMd1/3

It doesn't ignore the bracketed domain, even though I used a negative lookahead.

What do I have to do if more than one domain is on one line?

I don't understand it at the moment, so I'll list all the requirements and hope someone can provide an explained solution:

domain names can be www.domain.de || domain.de || domain.de/something
it's a multiline text string, so the domains can occur next to each other on one line or on different lines
they are separated by one or more whitespace characters
when domains are enclosed in [domain] or [noscript]domain[/noscript], they have to be ignored
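
To make the requirements above concrete, here is a minimal sketch in Python using only the standard `re` module. It is one possible approach, not a definitive solution: bracketed text is matched first and thrown away, and only what lands in capture group 1 is kept. The function name `find_domains` and the sample text are made up for illustration, and the domain sub-pattern is a deliberately simple assumption (an optional `www.`, one label, a 2 to 5 letter TLD, an optional path), so it is not a general-purpose domain matcher.

```python
import re

# Alternatives are tried left to right. The first two consume bracketed text
# without capturing anything; only the third fills group 1.
PATTERN = re.compile(
    r"""
    \[noscript\].*?\[/noscript\]    # a [noscript]...[/noscript] block: consumed, ignored
    | \[[^\]]*\]                    # any other [bracketed] text: consumed, ignored
    | ((?:www\.)?[\w-]+\.[A-Za-z]{2,5}\b(?:/\S*)?)   # group 1: simplified domain + optional path
    """,
    re.VERBOSE | re.DOTALL,
)


def find_domains(text):
    """Return every domain in text that is not enclosed in square brackets."""
    # finditer scans the whole string, so several domains on one line
    # (or spread over several lines) are all found.
    return [m.group(1) for m in PATTERN.finditer(text) if m.group(1)]


sample = (
    "google.de www.google.de [ignored.de]\n"
    "[noscript]hidden.de[/noscript] example.de/something"
)
print(find_domains(sample))
```

Because the bracketed alternatives come first, the regex engine swallows `[ignored.de]` and the whole `[noscript]` block before the domain alternative gets a chance to match inside them. Several domains on the same line need no special handling, since each one is a separate match.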

edited Jun 21 '18 at 14:44


jmcclane

You may use [`\[[^]]*](*SKIP)(*F)|(?:www\.)?\w+\.[A-Za-z]{2,5}`](https://regex101.com/r/3BC4IC/2) – Wiktor Stribiżew Jun 21 '18 at 12:12
and be careful that's not a good regex for matching domain names. – revo Jun 21 '18 at 12:14
Thx, what do I have to do if more than one domain is on one line? – jmcclane Jun 21 '18 at 13:59
