
I've searched everywhere and couldn't find any similar problem.

I want to get the entire string between two delimiters, but only when that string contains a specific piece of text.

Example:

I want to get the whole string between `"` characters that contains the text `hotelscombined.com`.

This is the string to be searched from:
<a id="fake-id" href="http://hotelscombined.com?param=1&param=2&param=3">"Hello I'm a link"</a>

The result should be `http://hotelscombined.com?param=1&param=2&param=3`, and not `Hello I'm a link` or `fake-id`.

This is not limited to HTML/XML attributes. It should also work on text that is not HTML/XML.

Example:

Content to be searched: Hello world I'm not an "html or xml" I could be "just random text"

Text to be matched: anything that contains the word `random` and is surrounded by quotes `"`.

So `"just random text"` is matched.

I tried it at https://regex101.com/r/qI8bS4/1 using the regex `\".*?(hotelscombined\.com).*?\"`. However, it seems to be greedy and reaches the next quote `"`.
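The over-match can be reproduced outside regex101 with a minimal sketch. Python is an assumption here, since the question does not name a language, and the variable names are only illustrative. The lazy `.*?` still starts at the first `"` in the string and is free to step over later quotes until it finds the text:

```python
import re

# The sample string from the question.
html = ('<a id="fake-id" href="http://hotelscombined.com?param=1&param=2&param=3">'
        '"Hello I\'m a link"</a>')

# The regex from the question: "." also matches the quote character,
# so the lazy quantifier can cross from one quoted string into the next.
attempt = re.search(r'".*?(hotelscombined\.com).*?"', html)

# The whole match begins at the quote before fake-id, not at the href value.
print(attempt.group(0))
# "fake-id" href="http://hotelscombined.com?param=1&param=2&param=3"
```

So the match starts too early rather than ending too late: the closing side stops at the first `"` after the text, but the opening side is anchored to the wrong quote.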

Oleksi
JohnnyQ
  • @nhahtdh Duplicate? Seriously? Please read the problem first. This is not limited to HTML elements. – JohnnyQ Nov 10 '15 at 04:03
  • Try [`href="([^"]*?hotelscombined\.com.*?)"`](https://regex101.com/r/qI8bS4/3) – Tushar Nov 10 '15 at 04:06
  • @Tushar Hey, thanks! That works. I can see you used `[^"]*?`; as I understand it, this matches everything except `"`. Is that correct? I also added it at the end, and that seems to work fine as well: [`"([^"]*?hotelscombined\.com[^"]*?)"`](https://regex101.com/r/qI8bS4/4) – JohnnyQ Nov 10 '15 at 04:12
  • Yes, you can see the explanation on regex101 in the top right corner. – Tushar Nov 10 '15 at 04:13
  • Thanks! I didn't know that. – JohnnyQ Nov 10 '15 at 04:14

1 Answer


Thanks to @Tushar for helping.

I've come up with this solution: `"([^"]*?hotelscombined\.com[^"]*?)"`

In general, the pattern looks like this: `DELIM([^DELIM]*?TEXT[^DELIM]*?)DELIM`

Where:
DELIM = the delimiter character
TEXT = the specific text you want enclosed

The key is to use `[^DELIM]*?` to match any character except your delimiter, so the match can never cross a delimiter, and to make the quantifier lazy.
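The general pattern above can be wrapped in a small helper. This is only a sketch in Python, which is an assumption because the thread does not name a language; `find_delimited` and its parameters are hypothetical names. It assumes the delimiter is a single character, because it is placed inside a character class:

```python
import re

def find_delimited(text, needle, delim='"'):
    """Return every delim-enclosed string in text that contains needle."""
    if len(delim) != 1:
        raise ValueError("delim must be a single character")
    d = re.escape(delim)   # keep characters such as ] or ^ literal
    n = re.escape(needle)  # treat the search text literally, e.g. the dot in .com
    # DELIM([^DELIM]*?TEXT[^DELIM]*?)DELIM
    pattern = re.compile(f'{d}([^{d}]*?{n}[^{d}]*?){d}')
    return pattern.findall(text)

html = ('<a id="fake-id" href="http://hotelscombined.com?param=1&param=2&param=3">'
        '"Hello I\'m a link"</a>')
plain = 'Hello world I\'m not an "html or xml" I could be "just random text"'

print(find_delimited(html, 'hotelscombined.com'))
# ['http://hotelscombined.com?param=1&param=2&param=3']
print(find_delimited(plain, 'random'))
# ['just random text']
```

One caveat: the regex has no notion of opening versus closing delimiters. In a string like `"a" random "b"`, the text between the closing quote of `"a"` and the opening quote of `"b"` is also surrounded by quotes, so it would be returned as a match.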

JohnnyQ