Find <p> tag in a very long text

Question

I have a very long HTML string, and in PHP I want to add an incrementing id value to each p tag. My original string:

$mystring="
<p> my very long text with a lot of words ....</p>
<p></p>
<p> my other paragraph with a very long text ...</p>
(...)
";

Result that I want:

$myparsestring= "
<p id=1>my very long text with a lot of words ....</p>
<p id=2> my other paragraph with a very long text ...</p>
";

As far as I can see, I could use getElementsByTagName() or a regex (maybe with split).

Which approach would you recommend for this job?

piddl0r · Answer 1 · 2012-11-15T00:17:06.110

3

If you're planning on parsing HTML, try using DOM with XPath.

Here is a quick example:

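// $mystring is the HTML string from the question.
$doc = new DOMDocument();
$doc->loadHTML($mystring);
$xpath = new DOMXPath($doc);
$query = '//*/p';
$entries = $xpath->query($query);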

Don't use regex. If all you plan on doing is parsing HTML like this, use this method unless you've got a specific reason for using regex.
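To get from the query to the numbered output the question asks for, something along these lines might work. It is only a sketch: addParagraphIds() is a hypothetical helper name, the wrapper <div> is just a trick to load a fragment without the html/body shell, and it assumes the text is plain ASCII or that you handle the encoding yourself before calling loadHTML(). Empty paragraphs are dropped, as in the desired result. Note that DOM writes the attribute quoted (id="1"), not bare as in the question.

```php
<?php
// Hypothetical helper (not from the answer above): number every non-empty <p>.
function addParagraphIds(string $html): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // keep warnings about sloppy markup quiet
    // Wrap the fragment in one root element so the NOIMPLIED flag behaves predictably.
    $doc->loadHTML('<div>' . $html . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);

    // Remove <p> elements that are empty or contain only whitespace.
    foreach (iterator_to_array($xpath->query('//p[not(normalize-space())]')) as $empty) {
        $empty->parentNode->removeChild($empty);
    }

    // Number the remaining paragraphs in document order, starting at 1.
    $id = 0;
    foreach ($xpath->query('//p') as $p) {
        $p->setAttribute('id', (string) ++$id);
    }

    // Serialize only the children of the wrapper <div>, not the wrapper itself.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$mystring = "
<p> my very long text with a lot of words ....</p>
<p></p>
<p> my other paragraph with a very long text ...</p>
";

echo addParagraphIds($mystring);
```

Because the parser does the tokenizing, this keeps working if the paragraphs later gain attributes or nested tags, which is where a hand-written pattern tends to break.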

edited Nov 15 '12 at 00:17

answered Nov 14 '12 at 11:15

piddl0r


An XPath string you can try: "//*/p". It gets all p tags. – Davuz Nov 14 '12 at 11:16
regex or not ? that is the question... ;-) thx a lot for the example – Fulgence Ridal Nov 14 '12 at 19:07
Don't use regex. If all you plan on doing is parsing HTML like this, use this method unless you've got a specific reason for using regex. – piddl0r Nov 14 '12 at 19:30

Peon · Answer 2 · 2012-11-14T11:33:25.530

0

You can go with regex like this:

$mystring="
<p> my very long text with a lot of words ....</p>
<p></p>
<p> my other paragraph with a very long text ...</p>
(...)
";

// Match every <p> element that has some text in it; the empty <p></p> is skipped.
// The pattern assumes bare <p> tags: no attributes and no nested markup.
preg_match_all('/<p>[^<>]+<\/p>/', $mystring, $matches);

$myparsestring = '';
foreach ($matches[0] as $k => $paragraph)
{
    // Number paragraphs from 1 and put each one on its own line.
    $myparsestring .= str_replace('<p>', '<p id=' . ($k + 1) . '>', $paragraph) . "\n";
}

echo htmlspecialchars($myparsestring);

And the output/result:

<p id=1> my very long text with a lot of words ....</p>
<p id=2> my other paragraph with a very long text ...</p>
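If the rest of the string should stay in place rather than being rebuilt from the matches, one variation is preg_replace_callback() with a counter. This is a sketch under the same assumptions as above (bare <p> tags with no attributes, and plain text rather than nested tags directly after the opening tag); the names $id and $numbered are only for illustration. Unlike the loop, it leaves the empty <p></p> in the string, without an id.

```php
<?php
$mystring = "
<p> my very long text with a lot of words ....</p>
<p></p>
<p> my other paragraph with a very long text ...</p>
";

$id = 0; // running paragraph counter, captured by reference in the callback

// Match a bare <p> only when at least one non-whitespace character of text
// follows before the next tag, so empty paragraphs are not numbered.
$numbered = preg_replace_callback(
    '/<p>(?=[^<>]*[^<>\s])/',
    function (array $m) use (&$id) {
        return '<p id=' . (++$id) . '>';
    },
    $mystring
);

echo htmlspecialchars($numbered);
```

It is still a regex over HTML, so it stays as fragile as the pattern; the DOM route is the safer one once the markup gets less regular.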

edited Nov 14 '12 at 11:33

answered Nov 14 '12 at 11:13

Peon


2

They ... just ... never ... stop. Tony The Pony .. He Comes .... http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – bobah Nov 14 '12 at 11:48
