-2

I am trying to scrape titles from an HTML page. I am using a regular expression to fetch the titles, as follows:

Regex:

<h6 class="panel-title"><i class="icon-file-music2"><\/i>(.*?)<i class=" icon-info3 text-success" data-popup="tooltip" data-html="true" title="(.*)" data-placement="bottom"><\/i>\n<\/h6>

The code to parse:

<div class="panel-heading">
                                    <h6 class="panel-title"><i class="icon-file-music2"></i> TRXD - Sometimes.mp3
<i class=" icon-info3 text-success" data-popup="tooltip" data-html="true" title="<b>Uploaded on:</b><br/> 2017-11-14 06:56:54<br>  <b>Downloads:</b><br/> 7" data-placement="bottom"></i>

</h6>
                                    </div>

Regex101 Link

The part I am trying to get back is TRXD - Sometimes.mp3. The regular expression I have used doesn't seem to work; I would appreciate it if someone could explain what I'm doing wrong.

InvalidSyntax
  • [H̸̡̪̯ͨ͊̽̅̾̎Ȩ̬̩̾͛ͪ̈́̀́͘ ̶̧̨̱̹̭̯ͧ̾ͬC̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍M̲̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) - Don't parse HTML with regex. – ctwheels Nov 20 '17 at 21:56
  • Regex is the wrong tool - Use something that will load the DOM and go from there – Ed Heal Nov 20 '17 at 21:57
  • You need to enable the single-line modifier `s` and add a 2nd newline at the end: `\n\n<\/h6>` – Aran-Fey Nov 20 '17 at 21:58
  • @Rawing Please convert to an answer. Thanks – InvalidSyntax Nov 20 '17 at 22:04
  • This question won't be useful for any future readers. Rather than answering it, I'd rather close it as a typo/unhelpful. – Aran-Fey Nov 20 '17 at 22:06
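
For readers who want to follow the advice in the comments and avoid regex altogether, here is a minimal sketch using Python's standard-library `html.parser`. The question does not say which language is in use, so Python is an assumption, and the class name `PanelTitleParser` is hypothetical. It collects the text inside every `<h6 class="panel-title">` element:

```python
from html.parser import HTMLParser

# Sample markup copied from the question (indentation shortened).
html = '''<div class="panel-heading">
    <h6 class="panel-title"><i class="icon-file-music2"></i> TRXD - Sometimes.mp3
<i class=" icon-info3 text-success" data-popup="tooltip" data-html="true" title="<b>Uploaded on:</b><br/> 2017-11-14 06:56:54<br>  <b>Downloads:</b><br/> 7" data-placement="bottom"></i>

</h6>
    </div>'''


class PanelTitleParser(HTMLParser):
    """Hypothetical helper: gathers the text of <h6 class="panel-title"> elements."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        # Start collecting text when the target heading opens.
        if tag == "h6" and dict(attrs).get("class") == "panel-title":
            self._in_title = True
            self._chunks = []

    def handle_data(self, data):
        if self._in_title:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        # On </h6>, join the collected text and trim surrounding whitespace.
        if tag == "h6" and self._in_title:
            self.titles.append("".join(self._chunks).strip())
            self._in_title = False


parser = PanelTitleParser()
parser.feed(html)
parser.close()
titles = parser.titles
print(titles)
```

Because the parser treats the quoted `title="..."` value as an attribute, the markup embedded in it never reaches `handle_data`, and the newlines around the title are simply stripped.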

1 Answer

-1

Here is a working example: https://regex101.com/r/6uWm8E/1. Your pattern is missing captures for the newlines.
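
To illustrate the newline issue outside regex101, the following is a sketch in Python (an assumption, since the question does not name a language). It is not necessarily the pattern behind the link above. It keeps the question's pattern but makes three changes: `re.DOTALL` lets `.` match the newline after the file name, `\s*` replaces the single `\n` so the blank line before `</h6>` is accepted, and the second group is made lazy so it stops at the first closing quote that is followed by `data-placement`.

```python
import re

# Sample markup copied from the question (indentation shortened).
html = '''<div class="panel-heading">
    <h6 class="panel-title"><i class="icon-file-music2"></i> TRXD - Sometimes.mp3
<i class=" icon-info3 text-success" data-popup="tooltip" data-html="true" title="<b>Uploaded on:</b><br/> 2017-11-14 06:56:54<br>  <b>Downloads:</b><br/> 7" data-placement="bottom"></i>

</h6>
    </div>'''

pattern = re.compile(
    r'<h6 class="panel-title"><i class="icon-file-music2"></i>(.*?)'
    r'<i class=" icon-info3 text-success" data-popup="tooltip" data-html="true" '
    r'title="(.*?)" data-placement="bottom"></i>\s*</h6>',
    re.DOTALL,  # equivalent of the single-line "s" modifier
)

match = pattern.search(html)
# Group 1 still carries the surrounding space and newline, so trim it.
title = match.group(1).strip() if match else None
print(title)
```

This remains tied to the exact attribute order and spacing in the sample, so any change in the page's markup will break the match.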

antoni