
This is the code that works, locally.

$str = <<<SSS
  <H1 class="prodname">Alison Raffaele Reality Base</H1>Foundation, Skintone 1 - Fairest&nbsp;1 fl oz (30 m)<p class="tip"><table id="TblProdForkSellCopy" width="100%" border="0"><tr><td class="contenttd"><p>Get full, flawless coverage with this luxurious oil-free formula. Continually refreshes and re-hydrates your skin for 12+ hours - and guards against premature aging by deflecting damaging free radicals. </p></td></tr></table><p></p>
SSS;

preg_match("~</[hH]1>(.+?)<p~",$str,$name)  ;
var_dump($name) ;

But it doesn't work when the actual page is parsed. Why? Link to the page. Is there anything wrong with my code? I copied and pasted the HTML exactly from the page. By "doesn't work" I mean it matches too much: when matched locally, the first '<p' isn't included, but in my actual script (where the page is downloaded from the net) the match includes the '<p' tag for some reason.

Thanks

gyaani_guy
  • "But doesn't work when the page is actually parsed" < What exactly do you mean by that? Parsed? – yankee May 19 '12 at 13:40
  • I mean when I try to parse it with the regex: fetch the page with curl > make a Simple HTML DOM doc > parse it with the regex. – gyaani_guy May 19 '12 at 13:43
  • Please refrain from parsing HTML with RegEx as it will [drive you insane](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454). Use an [HTML parser](http://stackoverflow.com/questions/292926/robust-mature-html-parser-for-php) instead. – Madara's Ghost May 19 '12 at 13:56
  • You speak the truth, Truth. It has driven me quite insane.. – gyaani_guy May 19 '12 at 16:49
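
As a rough illustration of the parser route suggested in the comments, here is a minimal sketch using PHP's built-in DOMDocument and DOMXPath. The markup is a shortened copy of the snippet from the question, and the assumption that the description is the text node directly after the heading is taken from that snippet only; the live page may be structured differently.

```php
<?php
// Shortened copy of the markup from the question (assumed representative).
$html = '<H1 class="prodname">Alison Raffaele Reality Base</H1>'
      . 'Foundation, Skintone 1 - Fairest&nbsp;1 fl oz (30 m)<p class="tip"></p>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // keep warnings about sloppy markup quiet
$doc->loadHTML($html);
libxml_clear_errors();

// The HTML parser lowercases element names, so //h1 also finds <H1>.
$xpath = new DOMXPath($doc);
$h1 = $xpath->query('//h1[@class="prodname"]')->item(0);

// Text inside the heading.
$title = $h1 ? trim($h1->textContent) : null;

// Assumption: the description is the text node right after the heading.
$next = $h1 ? $h1->nextSibling : null;
$detail = ($next && $next->nodeType === XML_TEXT_NODE) ? trim($next->textContent) : null;

var_dump($title, $detail);
```

Because the parser works on the document tree rather than on raw characters, line breaks or extra attributes in the downloaded page should not change what is selected.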

1 Answer


Try this:

/<h1[^>]*>([^<]+)/i

It's not working because your pattern ignores the attributes of the HTML tag. The [^>]* matches everything before the closing > (the attributes), such as the class="prodname" part of your example. The i flag makes the match case-insensitive, so it matches both h and H.
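
A minimal sketch of the pattern in use, run against a shortened copy of the markup from the question. Note that it captures the text inside the heading. The second pattern is an extrapolation that is not part of the answer above: it applies the same negated character class after the closing tag, on the assumption that the text following the heading is what is wanted.

```php
<?php
// Shortened copy of the markup from the question (assumed representative).
$str = '<H1 class="prodname">Alison Raffaele Reality Base</H1>'
     . 'Foundation, Skintone 1 - Fairest&nbsp;1 fl oz (30 m)<p class="tip">';

// [^>]* skips any attributes in the opening tag;
// ([^<]+) captures everything up to the next "<".
preg_match('/<h1[^>]*>([^<]+)/i', $str, $name);
echo $name[1] ?? '(no match)', "\n";   // Alison Raffaele Reality Base

// Extrapolation (not from the answer): the same idea after the closing tag.
// [^<]+ cannot run past the next "<", so "<p" is never part of the capture.
preg_match('~</h1>([^<]+)~i', $str, $after);
echo $after[1] ?? '(no match)', "\n";  // Foundation, Skintone 1 - Fairest&nbsp;1 fl oz (30 m)
```

Unlike .+?, a negated class such as [^<]+ also matches line breaks, so it behaves the same whether or not the downloaded markup contains newlines.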

The Mask