
I have a string that looks like:

">ANY CONTENT</span>(<a id="show

I need to fetch ANY CONTENT. However, there are spaces between </span> and (<a id="show

Here is my preg_match:

$success = preg_match('#">(.*?)</span>\s*\(<a id="show#s', $basicPage, $content);

\s* is there to match the spaces. I get an empty array!

Any idea how to fetch CONTENT?

user311509

2 Answers


Use a real HTML parser. Regular expressions are not really suitable for the job. See this answer for more detail.

You can use DOMDocument::loadHTML() to parse into a structured DOM object that you can then query, like this very basic example (you need to do error checking though):

$dom = new DOMDocument;
$dom->loadHTML($data);
$span = $dom->getElementsByTagName('span');
$content = $span->item(0)->textContent;
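
As a rough sketch of the error checking mentioned above, something like the following could work. The contents of $data here are a made-up sample (the real page markup is not shown in the question), and suppressing libxml warnings is an assumption about how you want malformed HTML handled:

```php
<?php
// Hypothetical sample input; the real $data would be the fetched page.
$data = '<div><span class="x">ANY CONTENT</span>   (<a id="show" href="#">more</a>)</div>';

// Collect libxml parse warnings internally instead of emitting them.
libxml_use_internal_errors(true);

$dom = new DOMDocument;
$content = null;
if ($dom->loadHTML($data)) {
    $spans = $dom->getElementsByTagName('span');
    // item(0) returns null when there is no <span>, so check length first.
    if ($spans->length > 0) {
        $content = trim($spans->item(0)->textContent);
    }
}
libxml_clear_errors();

var_dump($content);
```

If the page has several span elements, item(0) is only the first one, so you would still need to pick the right node (for example by looping over $spans and checking attributes).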
spencercw

I just had to define the opening "> more precisely. There were too many "> sequences in the page, so the pattern didn't know which one to choose specifically. As a result, it returned everything from an earlier "> up to the (

Solution:

.">

Sample:

$success = preg_match('#\.">(.*?)</span>\s*\(<a id="show#s', $basicPage, $content);
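
To illustrate the difference, here is a small sketch run against a made-up page fragment. The markup in $basicPage is hypothetical, as is the title attribute ending in a dot; it is only there so that the wanted "> is preceded by a literal dot, as the \. in the pattern implies:

```php
<?php
// Hypothetical page fragment: an earlier "> comes before the one we want.
$basicPage = '<p class="intro">Intro</p>'
           . '<span title="v1.">ANY CONTENT</span>' . "\n   "
           . '(<a id="show" href="#">more</a>)';

// Original pattern: the match starts at the first "> on the page.
preg_match('#">(.*?)</span>\s*\(<a id="show#s', $basicPage, $loose);

// Tightened pattern: only a "> preceded by a literal dot can start the match.
$success = preg_match('#\.">(.*?)</span>\s*\(<a id="show#s', $basicPage, $content);

// preg_match returns 1 on a match, 0 on no match, and false on error.
if ($success === 1) {
    echo $content[1], PHP_EOL;
}
```

With this sample, $loose[1] also contains the markup between the first "> and the span, while $content[1] holds only the span text.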
user311509