I am fetching a page with cURL in PHP and want to extract one section of it. That section opens and closes with the HTML5 <section> tag, like this:

<section id="postingbody">
   blah blah blah content
</section>

I am not sure how to get the match working properly. I just need to fill in the pattern portion here:

preg_match("/ id=\"postingbody\"\">???????<\/section>/i", $compiled_results, $matches2);

Edit

So here is an example section of the content

<section id="postingbody">
    Looking to find a side job ( working your own hours ) or career in the new media field & internet marketing? Web design, graphic design, SEO, Printing & Internet marketing company looking to hire a sales team member. We have 10+ years experience in the Web design & marketing field. Work your own hours, competitive commission rates, we can also train the right candidates for sales. Our office is located in New Jersey.<br>
</section>

The patterns suggested here don't seem to work on this content.

halfer
MrTechie
    Have a [look & a smile](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – RienNeVaPlu͢s Dec 16 '13 at 03:39

2 Answers

Try this:

preg_match("/(?s)<section id=\"postingbody\">((?:.)*?)<\/section>/i", $compiled_results, $matches2);

Edit: For example, the following code works as expected for me (the full match is in $matches2[0] and the section content is in $matches2[1]):

$compiled_results = '<section id="postingbody">
    Looking to find a side job ( working your own hours ) or career in the new media field & internet marketing? Web design, graphic design, SEO, Printing & Internet marketing company looking to hire a sales team member. We have 10+ years experience in the Web design & marketing field. Work your own hours, competitive commission rates, we can also train the right candidates for sales. Our office is located in New Jersey.<br>
</section>';
preg_match("/(?s)<section id=\"postingbody\">((?:.)*?)<\/section>/i", $compiled_results, $matches2);
var_dump($matches2);
elixenide

Regex is not always well suited to this type of HTML/XML parsing. It is better to use a DOM parser in PHP.
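As a rough illustration of the DOM approach, here is a minimal sketch using PHP's built-in DOMDocument and DOMXPath. The sample markup is a made-up stand-in for the cURL response, and the variable names $text and $html are only illustrative. Note that loadHTML() may guess the wrong character encoding if the page does not declare one.

```php
<?php
// Hypothetical sample input; in the question this would be the cURL response body.
$compiled_results = '<html><body><section id="postingbody">
    Work your own hours.<br>
</section></body></html>';

// Collect libxml warnings internally instead of printing them;
// older libxml versions complain about HTML5 tags such as <section>.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($compiled_results);
libxml_clear_errors();

// Select the <section> element by its id attribute.
$xpath = new DOMXPath($doc);
$node = $xpath->query('//section[@id="postingbody"]')->item(0);

// Plain text of the section, and its markup, or null if it was not found.
$text = $node !== null ? trim($node->textContent) : null;
$html = $node !== null ? $doc->saveHTML($node) : null;

var_dump($text);
```

Unlike a regex, this keeps working if the tag gains extra attributes or the attribute quoting changes.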

However, if you really have to use a regex, this one should work for you with the s flag (DOTALL):

preg_match('# id="postingbody">.*?</section>#is', $compiled_results, $matches2);
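The pattern above has no capture group, so $matches2[0] holds the whole match, including the leading id="postingbody"> and the closing tag. If only the inner content is wanted, a variant with a capture group might look like the sketch below. The sample input and the $pattern and $body names are assumptions for illustration.

```php
<?php
// Hypothetical sample input standing in for the fetched page.
$compiled_results = '<div><section id="postingbody">
    line one<br>
    line two
</section></div>';

// "#" as the delimiter avoids escaping "/".
// s = dot also matches newlines, i = case-insensitive.
$pattern = '#<section\s+id="postingbody"\s*>(.*?)</section>#is';

$body = null;
// preg_match() returns 1 on a match, 0 on no match, false on error.
if (preg_match($pattern, $compiled_results, $matches2) === 1) {
    // Group 1 is the text between the opening and closing tags.
    $body = trim($matches2[1]);
}

var_dump($body);
```

Checking the return value before reading $matches2[1] avoids an undefined-index notice when the section is missing from the page.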
anubhava