Negative lookahead to match itself in RegExp

I want to use a RegExp to match an image URL. There are two cases:

In the first case, it is a normal image URL ending with .png, possibly followed by a query string. The RegExp should match it in its entirety.

https://test.com/cat.png?pkey=abcd

In the second case, the URL is wrapped in an [img][/img] tag, and the RegExp is expected not to match anything.

[img]https://test.com/cat.png?pkey=abcd[/img]

I have written this regular expression, which works well in the first case:

(https:\/\/.*?\.png(?:\?[\w&=_-]*)?)(?!\[)

However, it does not work in the second case: the URL is still matched, up to its penultimate character. How can I modify my RegExp to achieve my goal?
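For reference, the behaviour can be reproduced with a short JavaScript sketch (JavaScript is assumed here because the linked demo appears to use it; the variable names are illustrative only). After the lookahead fails on `[`, the engine backtracks and gives up the final `d`, so the lookahead then sees `d` instead of `[` and succeeds:

```javascript
// The original pattern from the question, unchanged.
const original = /(https:\/\/.*?\.png(?:\?[\w&=_-]*)?)(?!\[)/;

const plain = "https://test.com/cat.png?pkey=abcd";
const wrapped = "[img]https://test.com/cat.png?pkey=abcd[/img]";

const plainMatch = plain.match(original);
const wrappedMatch = wrapped.match(original);

// Case 1: the whole URL is matched, as intended.
console.log(plainMatch && plainMatch[0]);
// → https://test.com/cat.png?pkey=abcd

// Case 2: a match is still found, one character short of the full URL.
console.log(wrappedMatch && wrappedMatch[0]);
// → https://test.com/cat.png?pkey=abc
```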

Here's the RegExr link: https://regexr.com/7eohf

logi-kal
linjunchao
  • In such cases, a safer approach is to match and capture what you need, and just match what you do not need. `\[img][^]*?\[\/img]|(https:\/\/[^?\]\[\s]*?\.png(?:\?[\w&=_\-]*)?)`, something like that. – Wiktor Stribiżew Jun 01 '23 at 09:37
  • If you have found an answer that works for you, please [accept](https://stackoverflow.com/help/accepted-answer) it. – InSync Jun 01 '23 at 10:50
  • If [possessive quantifiers](https://www.regular-expressions.info/possessive.html) are supported: [`https:\/\/\S*?\.png(?:\?[-\w&=]*)?+(?!\[)`](https://regex101.com/r/Cf7LPt/1) but from your demo it looks like you are using JS. – bobble bubble Jun 01 '23 at 11:17
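
The "match what you do not need, capture what you need" idea from the first comment could be sketched in JavaScript roughly as follows. The pattern is the one given in that comment; the helper name `extractImageUrls` and the sample text are hypothetical, introduced only for illustration:

```javascript
// First alternative consumes a whole [img]...[/img] block without capturing.
// Second alternative captures a bare .png URL in group 1.
const pattern = /\[img][^]*?\[\/img]|(https:\/\/[^?\]\[\s]*?\.png(?:\?[\w&=_\-]*)?)/g;

// Hypothetical helper: keep only the matches where group 1 took part.
function extractImageUrls(text) {
  return Array.from(text.matchAll(pattern))
    .map((m) => m[1])
    .filter((url) => url !== undefined);
}

const sample =
  "a https://test.com/cat.png?pkey=abcd b " +
  "[img]https://test.com/dog.png?pkey=xyz[/img]";

console.log(extractImageUrls(sample));
// → [ 'https://test.com/cat.png?pkey=abcd' ]
```

Because the wrapped URL is consumed by the first alternative, the second alternative never gets a chance to start inside it, which avoids relying on a lookahead at all.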

1 Answer

Try moving the negative lookahead forward so that it sits immediately after .png:

(https:\/\/.*?\.png(?!\S*\[)(?:\?[\w&=_-]*)?)

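Here \S matches any non-whitespace character (you can replace it with a different class if you wish). Placed there, the lookahead rejects the URL as soon as a [ appears anywhere in the rest of the non-whitespace run, so the engine cannot backtrack into a shorter match.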

See a demo here.
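
A quick way to check this pattern against both cases from the question is a small JavaScript sketch like the one below (the variable names are illustrative; JavaScript is assumed because the question's demo appears to use it):

```javascript
// Lookahead placed right after ".png": fail if a "[" follows
// anywhere in the remaining run of non-whitespace characters.
const early = /(https:\/\/.*?\.png(?!\S*\[)(?:\?[\w&=_-]*)?)/;

const plainUrl = "https://test.com/cat.png?pkey=abcd";
const wrappedUrl = "[img]https://test.com/cat.png?pkey=abcd[/img]";

const earlyPlain = plainUrl.match(early);
const earlyWrapped = wrappedUrl.match(early);

console.log(earlyPlain && earlyPlain[0]);
// → https://test.com/cat.png?pkey=abcd

console.log(earlyWrapped);
// → null
```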

Alternatively, you can require that the last matched character is not followed by any character in [?\w&=_-]:

(https:\/\/.*?\.png(?:\?[\w&=_-]*)?)(?![?\w&=_-]|\[)
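
As a rough check of this alternative, here is a minimal JavaScript sketch run against the two inputs from the question (variable names are illustrative). Since the lookahead also rejects any character the query part could have consumed, backtracking to a shorter query string no longer produces a match:

```javascript
// Trailing lookahead: the match may not be followed by "[" or by
// any character that could still belong to the URL's query string.
const trailing = /(https:\/\/.*?\.png(?:\?[\w&=_-]*)?)(?![?\w&=_-]|\[)/;

const bareUrl = "https://test.com/cat.png?pkey=abcd";
const taggedUrl = "[img]https://test.com/cat.png?pkey=abcd[/img]";

const trailingBare = bareUrl.match(trailing);
const trailingTagged = taggedUrl.match(trailing);

console.log(trailingBare && trailingBare[0]);
// → https://test.com/cat.png?pkey=abcd

console.log(trailingTagged);
// → null
```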
logi-kal