
I want to divide this HTML into several parts. Each <h2> or <h3>, together with the <p> and <span> elements that follow it, should be one part. I tried explode(array('<h2>','<h3>'), ...), but it caused a warning; explode does not support multiple delimiters.

How can I do this properly? Thanks.

$text=<<<EOT
<h2>title1</h2>
<p>something</p>
<span>something</span>
<h3>title2</h3>
<p>something</p>
<p>something</p>
<p>something</p>
<h2>title3</h2>
<span>something</span>
<h2>title4</h2>
<span>something</span>
<p>something</p>
<p>something</p>
EOT;

foreach ($text as $result) { 
    $arr = explode(array('<h2>','<h3>'),$result);
    reset($arr); 
    foreach($arr as $line){
        echo $line.'<hr />';
    } 
}

Warning: Invalid argument supplied for foreach() on line 23;

My expected output is:

<h2>title1</h2>
<p>something</p>
<span>something</span>
___________________________
<h3>title2</h3>
<p>something</p>
<p>something</p>
<p>something</p>
___________________________
<h2>title3</h2>
<span>something</span>
___________________________
<h2>title4</h2>
<span>something</span>
<p>something</p>
<p>something</p>
___________________________
cj333

3 Answers


You should use a parser for this kind of task. I use Zend Framework, which has a DOM query component (Zend_Dom_Query). Otherwise you can use PHP's built-in DOM classes (DOMDocument, DOMElement). Then you can query the DOM with XPath or CSS selectors. Example:

<?php

$text=<<<EOT
<h2>title1</h2>
<p>something</p>
<span>something</span>
<h3>title2</h3>
<p>something</p>
<p>something</p>
<p>something</p>
<h2>title3</h2>
<span>something</span>
<h2>title4</h2>
<span>something</span>
<p>something</p>
<p>something</p>
EOT;


require_once 'Zend/Dom/Query.php';

$dom = new Zend_Dom_Query($text);
$results = $dom->query('h2');

foreach ($results as $domEl) {
    var_dump($domEl->nodeValue);
}
// outputs:
// string(6) "title1"
// string(6) "title3"
// string(6) "title4"

Edit: Given your expected output, my example doesn't exactly fit your needs, but you still need a parser for that kind of HTML manipulation, because the parser splits the HTML into elements that you can manipulate as tokens rather than as text.
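
To get closer to the expected output with a parser, here is a minimal sketch that uses PHP's built-in DOMDocument instead of Zend_Dom_Query (so it assumes the DOM extension is enabled). The wrapper div with id "root" and the $sections array are names made up for this illustration, and the sample HTML is shortened.

```php
<?php
// Shortened version of the sample markup from the question.
$html = <<<EOT
<h2>title1</h2>
<p>something</p>
<span>something</span>
<h3>title2</h3>
<p>something</p>
<h2>title3</h2>
<span>something</span>
EOT;

$doc = new DOMDocument();
// Wrap the fragment in a known container so it is easy to find after parsing.
// The @ silences libxml notices about the markup being a fragment.
@$doc->loadHTML('<div id="root">' . $html . '</div>');

$xpath = new DOMXPath($doc);
$root  = $xpath->query('//div[@id="root"]')->item(0);

$sections = array();
foreach ($root->childNodes as $node) {
    if ($node->nodeType !== XML_ELEMENT_NODE) {
        continue; // skip the whitespace text nodes between tags
    }
    if ($node->nodeName === 'h2' || $node->nodeName === 'h3') {
        $sections[] = ''; // a heading starts a new section
    }
    if ($sections) { // ignore anything that appears before the first heading
        $sections[count($sections) - 1] .= $doc->saveHTML($node) . "\n";
    }
}

echo implode("<hr />\n", $sections);
```

Each entry of $sections holds one heading plus the elements that follow it, so other separators or per-section processing can be applied instead of the final echo.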

Fabio
  • You can get different elements by running different queries on the same DOM object, or, if you know XPath, you can build a complex query that retrieves all the elements you need. If you have never heard of XPath, go with separate queries: XPath is powerful but harder to learn. – Fabio Jul 01 '11 at 13:30

You can use preg_split() to split on several different delimiters at once by using a regular expression:

$text = <<<EOT
<h2>title1</h2>
<p>something</p>
...
EOT;

$arr = preg_split("#(?=<h[23]>)#", $text);

if(isset($arr[0]) && trim($arr[0])=='') array_shift($arr); // remove first block if empty

foreach($arr as $block){
    echo $block."<hr />\n";
}
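
If the headings may carry attributes or uppercase tag names, the same lookahead idea can be loosened a little. The following is only a sketch under that assumption; the sample HTML is invented for the illustration, and the PREG_SPLIT_NO_EMPTY flag stands in for the manual array_shift() above.

```php
<?php
// Invented sample with an attribute and an uppercase tag.
$text = <<<EOT
<h2 class="intro">title1</h2>
<p>something</p>
<H3>title2</H3>
<p>something</p>
EOT;

// (?=...) is a lookahead: it splits in front of each heading without consuming it.
// [\s>] allows attributes after the tag name; the i modifier ignores case.
// PREG_SPLIT_NO_EMPTY drops an empty leading piece, if there is one.
$blocks = preg_split('#(?=<h[23][\s>])#i', $text, -1, PREG_SPLIT_NO_EMPTY);

foreach ($blocks as $block) {
    echo trim($block) . "\n<hr />\n";
}
```

A regular expression like this still treats the HTML as plain text, so it is best kept to simple, predictable markup.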
Floern

Ok, first, the warning comes from the foreach, not from the explode. You are trying to loop over a string ($text in this case) instead of an array.

Second, even if $text were an array and $result a string, you are passing an array as the delimiter in the explode() call, but that function expects its first parameter to be a string.
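
To illustrate the second point: since explode() accepts only a single string delimiter, one possible workaround is to put the same marker in front of every heading first and then explode on that marker. This is just a sketch; the "\x00" marker is an arbitrary choice that is assumed never to occur in the HTML, and the variable names are made up.

```php
<?php
$text = "<h2>title1</h2>\n<p>something</p>\n<h3>title2</h3>\n<p>something</p>\n";

// str_replace() does accept arrays, so it can prefix both heading tags
// with one common marker that explode() can then split on.
$marker = "\x00"; // assumption: this byte never appears in the input
$marked = str_replace(
    array('<h2>', '<h3>'),
    array($marker . '<h2>', $marker . '<h3>'),
    $text
);

// array_filter() with strlen drops the empty piece before the first marker;
// array_values() renumbers the remaining pieces from 0.
$parts = array_values(array_filter(explode($marker, $marked), 'strlen'));

foreach ($parts as $part) {
    echo $part . "<hr />\n";
}
```

This only matches the exact strings '<h2>' and '<h3>', so headings with attributes would need a parser or a regular expression instead.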

I'd recommend having a look at How to parse HTML with PHP?, or searching SO for these terms, to find the many posts dealing with how to parse HTML with PHP.

Jürgen Thelen