I have the HTML below:

<p>text1</p>
 <ul>
   <li>list-a1</li>
   <li>list-a2</li>
   <li>list-a3</li>
 </ul>
<p>text2</p>
 <ul>
   <li>list-b1</li>
   <li>list-b2</li>
   <li>list-b3</li>
 </ul>
<p>text3</p>

Does anyone have an idea how to parse this HTML file with PHP to get the output below? It should be a nested array: the first level holds the text of each "p" tag and the second level holds the items of the "ul" tag, because every "p" tag is followed by a "ul" tag.

Array
(
    [0] => Array
        (
            [value] => text1
            [il] => Array
                (
                    [0] => list-a1
                    [1] => list-a2
                    [2] => list-a3
                )
        )
    [1] => Array
        (
            [value] => text2
            [il] => Array
                (
                    [0] => list-b1
                    [1] => list-b2
                    [2] => list-b3
                )
        )
)

I can't use string replacement or strip all the tags, because I am working with the DOM like this:

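// $doc is a DOMDocument that has already loaded the HTML.
$links2 = [];
$links3 = [];

// Collect the text of every <p>, skipping paragraphs that contain 'document.'.
foreach ($doc->getElementsByTagName('p') as $link) {
    $dont = $link->textContent;
    if (strpos($dont, 'document.') === false) {
        $links2[] = array('value' => $dont);
    }
}

// Collect the text of every <ul>, skipping lists that contain 'favorisContribuer'.
foreach ($doc->getElementsByTagName('ul') as $link) {
    $dont2 = $link->nodeValue;
    if (strpos($dont2, 'favorisContribuer') === false) {
        $links3[] = array('il' => $dont2);
    }
}
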
filip
  • Possible duplicate of [How do you parse and process HTML/XML in PHP?](http://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php) – Sean Jan 03 '17 at 01:34
  • I think `strip_tags` would do it. That's not really parsing, but you don't really seem to care about what the elements are. If in a browser, use `nl2br` after the strip. – chris85 Jan 03 '17 at 02:12
  • Check this: https://eval.in/707966 – B.Kocaman Jan 03 '17 at 06:50
  • Thank you for the reply, but I can't use replace or remove all the tags, because I have a long HTML document, not only the snippet shown. I want a solution that uses DOM methods. – filip Jan 03 '17 at 07:06

1 Answer

You could use the DOMDocument class (http://php.net/manual/en/class.domdocument.php).

The example below loads the HTML and outputs its text content with line breaks between the pieces.

<?php

$html = '
    <p>text1</p>
    <ul>
        <li>list-a1</li>
        <li>list-a2</li>
        <li>list-a3</li>
    </ul>
    <p>text2</p>
    <ul>
        <li>list-b1</li>
        <li>list-b2</li>
        <li>list-b3</li>
    </ul>
    <p>text3</p>
';

$doc = new DOMDocument();
$doc->loadHTML($html);

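// Turn each run of whitespace that contains a line break into a single <br>,
// so the result does not depend on tabs vs. spaces in the source indentation.
$textContent = trim($doc->textContent);
$textContent = preg_replace('/\s*\n\s*/', '<br>', $textContent);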

echo '
    <!DOCTYPE html>
    <html>
    <head>
        <title></title>
    </head>
    <body>
        ' . $textContent . '
    </body>
    </html>
';

?>

However, I would suggest using JavaScript to find the content and send it to PHP instead.
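If the nested array from the question is the goal, a PHP-only route that stays with DOM methods might look like the sketch below. It assumes each list sits directly after its paragraph as a sibling, as in the sample HTML. The function name `parseParagraphLists` is hypothetical, and the `il` key is simply copied from the question. Paragraphs with no `ul` right after them (such as text3) are skipped.

```php
<?php
// Hypothetical helper (not part of the original answer): pairs each <p> with
// the <ul> that directly follows it and returns a nested array.
function parseParagraphLists(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them,
    // which helps with long, imperfect real-world markup.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $result = [];
    foreach ($doc->getElementsByTagName('p') as $p) {
        // Skip whitespace text nodes to reach the next element sibling.
        $next = $p->nextSibling;
        while ($next !== null && $next->nodeType !== XML_ELEMENT_NODE) {
            $next = $next->nextSibling;
        }

        // Keep only paragraphs that are directly followed by a <ul>.
        if ($next === null || strtolower($next->nodeName) !== 'ul') {
            continue;
        }

        // Gather the text of every <li> inside that <ul>
        // (this includes items of nested lists, if there are any).
        $items = [];
        foreach ($next->getElementsByTagName('li') as $li) {
            $items[] = trim($li->textContent);
        }

        $result[] = ['value' => trim($p->textContent), 'il' => $items];
    }

    return $result;
}

// Usage with a shortened version of the HTML from the question.
$html = '<p>text1</p>
 <ul>
   <li>list-a1</li>
   <li>list-a2</li>
 </ul>
<p>text3</p>';
print_r(parseParagraphLists($html));
```

The filters from the question (for example skipping paragraphs that contain 'document.') could be added as extra `continue` checks at the top of the loop.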

Andrew Larsen
  • You asked for a PHP solution, so I provided a PHP-only solution. As I said, I would suggest using JavaScript for this task. – Andrew Larsen Jan 03 '17 at 01:48