Possible Duplicate:
RegEx match open tags except XHTML self-contained tags
How to parse and process HTML/XML with PHP?

I currently have this line of code as part of an image download script:

preg_match_all('|<img.*?src=[\'"](.*?)[\'"].*?>|i', $content, $matches);

I need to alter this to include:

id="iwi"

within the preg_match_all command. The img is always in this format:

I've tried a few different variations and keep getting errors. Finally I tried it without the quotes, as below, and still get nothing. Is my syntax wrong?

preg_match_all('|<img.*?id=iwi.*?src=[\'"](.*?)[\'"].*?>|i', $content, $matches);
Rocco The Taco
  • If you don't know regex, there are [easier ways](http://stackoverflow.com/questions/3577641/how-to-parse-and-process-html-xml-with-php), like `qp($html)->find("img#iwi")->attr("src")`. – mario Jan 20 '13 at 01:53
  • Is the `id` attribute before the `src` attribute? How did you try adding quotes? Also, if you need to extract a lot of attributes like this, do yourself a favour and use an HTML parser. – Ry- Jan 20 '13 at 01:54
  • I need to keep this code in place and alter it only. The image is always in this format: – Rocco The Taco Jan 20 '13 at 02:17
  • It's 2013. Use an XML parser. –  Jan 20 '13 at 02:22
  • And how did you try it with the quotes? (And is the semicolon really there?) – Ry- Jan 20 '13 at 02:25
  • Like this: preg_match_all('||i', $content, $matches); – Rocco The Taco Jan 20 '13 at 02:32
  • @RoccoTheTaco: The quotes are in the wrong place. They should be around `iwi`, not `id=iwi`. `preg_match_all('||i', $content, $matches);` – Ry- Jan 20 '13 at 02:39
  • Well, it's a good thing that attributes in XML tags always have to be in a given, fixed order. Oh, wait.... –  Jan 20 '13 at 02:39

2 Answers

4

This is the number one problem with The Pony He Comes: you don't know whether the tag will be `<img id="iwi" src="image.png" />` or `<img src="image.png" id="iwi" />`, and a regex written for one attribute order misses the other.

Instead, you should use a parser:

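libxml_use_internal_errors(true); // keep warnings about sloppy real-world HTML quiet
$dom = new DOMDocument();
$dom->loadHTML($content);
$img = $dom->getElementById("iwi"); // null when no element has id="iwi"
$src = $img !== null ? $img->getAttribute("src") : null;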
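
If you want something you can drop into the download script, here is a minimal sketch of the same parser idea wrapped in a function. The name `findImageSrcById` and its signature are my own invention for illustration, not anything from the question. It loops over the `<img>` elements and compares the `id` attribute directly, so attribute order doesn't matter, and it returns `null` when nothing matches.

```php
<?php
// Hypothetical helper: the name and signature are illustrative only.
function findImageSrcById(string $html, string $id): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    // Collect parser warnings internally instead of emitting them;
    // real-world HTML is rarely well-formed.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $loaded = $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return null;
    }

    // Check every <img>; the position of id relative to src is irrelevant here.
    foreach ($dom->getElementsByTagName('img') as $img) {
        if ($img->getAttribute('id') === $id) {
            return $img->getAttribute('src');
        }
    }

    return null; // no <img> with that id
}

var_dump(findImageSrcById('<img src="image.png" id="iwi" />', 'iwi'));
```

The explicit `null` return is there so the calling script can skip the download cleanly when the image isn't on the page.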
Niet the Dark Absol
1

If you insist on using the preg functions despite all the opposing views, these patterns also work:

// [\'"]* makes the quotes optional, since some markup omits them; * means "zero or more" (use ? for "zero or one").
// Both patterns assume id appears before src inside the tag.
preg_match_all('~<img.*?id=[\'"]*([^\s\'"]*).*?src=[\'"]*([^\s\'"]*).*?>~i', $content, $matches);
preg_match_all('~<img.*?id=[\'"]*(?P<id>[^\s\'"]*).*?src=[\'"]*(?P<src>[^\s\'"]*).*?>~i', $content, $matches);
print_r($matches);
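
If the attribute order can vary, one possible tweak is a lookahead, which checks for the `id` without consuming it, so `src` can sit on either side. This is only a sketch under a few assumptions: the helper name `matchImageSrcsById` is made up for illustration, the tag contains no `>` inside attribute values, and the `src` value contains no whitespace. A parser is still the more robust choice.

```php
<?php
// Hypothetical helper: the name and signature are illustrative only.
function matchImageSrcsById(string $content, string $id): array
{
    // Escape the id so characters such as "." or "~" are matched literally.
    $quotedId = preg_quote($id, '~');

    // (?=...) is a lookahead: it requires id=... somewhere before the closing ">"
    // but consumes nothing, so src may come before or after id.
    // \s in front of "id" and "src" avoids matching data-id= or data-src=.
    $pattern = '~<img\b(?=[^>]*\sid=[\'"]?' . $quotedId . '[\'"]?[\s/>])'
             . '[^>]*\ssrc=[\'"]?([^\s\'">]+)~i';

    preg_match_all($pattern, $content, $matches);

    return $matches[1]; // captured src values, in document order
}

$content = '<img id="iwi" src="a.png" />'
         . '<img src="b.png" id="iwi" />'
         . '<img src="c.png" id="other" />';
print_r(matchImageSrcsById($content, 'iwi'));
```

Note that the `i` flag makes the id comparison case-insensitive too. Drop it if that matters for your markup.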
Kerem