I want to get all the download links in the html using NSRegularExpression.

For example, the html content is like this:

<a href="http://xxxx.com/file.mp3">text info</a>

and I want to get all strings like this:

href="http://xxxx.com/file.mp3"

Right now I am using this pattern:

NSString *pattern = @"(?<=href=\").+?\\.(mp3)";

but it does not work so well.

khelwood
David L
  • Relevant: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – Mike Weller May 29 '13 at 13:52
  • What exactly do you mean by "It does not work so well"? Do you have any test cases? – Monolo May 29 '13 at 14:32

1 Answer

As I mentioned in my comment, this question is a bit underspecified, but taking it at face value, you want to extract the href attribute from any <a> tag in the string if the file name extension is .mp3. I hope I got this right.

To be honest, I would have expected that you only needed the URL, but for now we'll go with the href attribute.

Your pattern is basically right; there is just no need for the positive lookbehind, whose effect is to keep the href=" part out of the match. With this pattern you should get what you need:

NSString *pattern = @"href=\"[^\"]+\\.mp3\"";

Notice that the URL is matched as a run of characters that are not quotation marks. Otherwise you risk matching a stray ".mp3" string somewhere else in the HTML text.
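For completeness, here is a minimal sketch of how the pattern could be applied with NSRegularExpression. The helper name MP3Hrefs and its urlOnly flag are my own, purely for illustration. I have also added a capture group around the URL, which is not part of the pattern above, so that the same expression can return either the whole href attribute (group 0) or just the URL (group 1).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (name and signature are assumptions for this sketch).
// Returns every href="....mp3" attribute found in `html`,
// or only the URLs themselves when `urlOnly` is YES.
static NSArray<NSString *> *MP3Hrefs(NSString *html, BOOL urlOnly) {
    // Group 0 is the whole attribute; group 1 is the URL between the quotes.
    NSString *pattern = @"href=\"([^\"]+\\.mp3)\"";

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }

    NSMutableArray<NSString *> *results = [NSMutableArray array];
    NSRange whole = NSMakeRange(0, html.length);
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:html options:0 range:whole];

    for (NSTextCheckingResult *match in matches) {
        // Pick the capture group according to what the caller asked for.
        NSRange range = [match rangeAtIndex:(urlOnly ? 1 : 0)];
        [results addObject:[html substringWithRange:range]];
    }
    return results;
}
```

Calling MP3Hrefs(html, NO) gives the strings asked for in the question; passing YES gives the bare URLs, which is probably what you need for downloading. The match is case-sensitive as written, so pass NSRegularExpressionCaseInsensitive as the options value if the page may contain HREF or .MP3.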

Monolo