1

I have a very long HTML string, and I want to add an incrementing id value to each p tag in PHP. My original string:

$mystring="
<p> my very long text with a lot of words ....</p>
<p></p>
<p> my other paragraph with a very long text ...</p>
(...)
";

Result that I want:

$myparsestring= "
<p id=1>my very long text with a lot of words ....</p>
<p id=2> my other paragraph with a very long text ...</p>
";

As far as I can see, I could use getElementsByTagName() or a regex (maybe with a split).

Which approach would you recommend for this?

Echilon

2 Answers

3

If you're planning on parsing HTML, try using DOM with XPath.

Here is a quick example:

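$doc = new DOMDocument();
$doc->loadHTML($mystring);          // DOMXPath needs a DOMDocument, not a raw string
$xpath = new DOMXPath($doc);
$query = '//p';                     // every <p> element in the document
$entries = $xpath->query($query);   // DOMNodeList of the matching nodes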

Don't use regex. If all you plan on doing is parsing HTML like this, use this method unless you've got a specific reason for using regex.
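
To tie this back to the question, here is a minimal sketch of how the DOM approach could assign the ids. It is not part of the original answer: the function name numberParagraphs is made up for illustration, and it assumes the DOM extension is available and that text outside the p tags can be dropped, as in the desired result.

```php
<?php
// Hypothetical helper: give every non-empty <p> an incrementing id
// and return only those paragraphs, one per line.
function numberParagraphs(string $html): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // collect parse warnings instead of printing them
    $doc->loadHTML($html);
    libxml_clear_errors();

    $out = '';
    $id  = 0;
    foreach ($doc->getElementsByTagName('p') as $p) {
        if (trim($p->textContent) === '') {
            continue;                   // skip empty paragraphs such as <p></p>
        }
        $p->setAttribute('id', (string) ++$id);
        $out .= $doc->saveHTML($p) . "\n";
    }
    return $out;
}

$mystring = "
<p> my very long text with a lot of words ....</p>
<p></p>
<p> my other paragraph with a very long text ...</p>
";

echo numberParagraphs($mystring);
```

Note that saveHTML() serializes the attribute with quotes (id="1"), which differs slightly from the unquoted form in the question. If the input contains non-ASCII text, you may also need to tell loadHTML() about the encoding.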

piddl0r
0

You can go with regex like this:

$mystring="
<p> my very long text with a lot of words ....</p>
<p></p>
<p> my other paragraph with a very long text ...</p>
(...)
";

// This matches every <p> tag that has some text in it; empty <p></p> tags are skipped.
preg_match_all('/<p>(?<=^|>)[^><]+?(?=<|$)<\/p>/s', $mystring, $matches);

$myparsestring = '';
for( $k=0; $k<sizeof( $matches[0] ); $k++ )
{
    $myparsestring .= str_replace( '<p', '<p id='.($k+1), $matches[0][$k] );
}

echo htmlspecialchars( $myparsestring );

And the output/result:

<p id=1> my very long text with a lot of words ....</p>
<p id=2> my other paragraph with a very long text ...</p>
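
If you would rather number the paragraphs in place in a single pass, a variant with preg_replace_callback() might look like the sketch below. It is only an illustration under the same assumptions as above (plain <p> tags with no attributes and no nested markup). The function name numberWithRegex is hypothetical, and unlike the loop above it leaves empty <p></p> tags and any text outside the paragraphs untouched.

```php
<?php
// Hypothetical helper: add id=N to every <p> that contains some text,
// leaving the rest of the string as it is.
function numberWithRegex(string $html): string
{
    $id = 0;
    $result = preg_replace_callback(
        // <p>, optional whitespace, at least one non-space character, no nested tags
        '/<p>(\s*[^<>\s][^<>]*)<\/p>/',
        function (array $m) use (&$id): string {
            return '<p id=' . ++$id . '>' . $m[1] . '</p>';
        },
        $html
    );
    return $result ?? $html;            // fall back to the input if the regex engine fails
}

echo htmlspecialchars(numberWithRegex("<p>first</p><p></p><p> second</p>"));
```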
Peon
  • 2
    They ... just ... never ... stop. Tony The Pony .. He Comes .... http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – bobah Nov 14 '12 at 11:48