How to cut this HTML string with a PHP script

Question

I am learning to grab data with cURL. This is my code.

function readHTML($url) {
    $data = curl_init();
    curl_setopt($data, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($data, CURLOPT_URL, $url);
    $result = curl_exec($data);
    curl_close($data);
    return $result;
}

$codeHTML = readHTML('http://website.com/');
$ex1 = explode('ol class=tabcont>', $codeHTML);
$ex2 = explode('/ol>', $ex1[1]);
echo $ex2[0];

I have a problem with the HTML code this outputs:

<ul>
<li>content</li>
<li>content</li>
<li>content</li>
<li>content</li>
<li>content</li>
<li>content</li>
<li>content</li>
<li>content</li>
</ul>

I want to cut the extra <li></li> elements with PHP so that the code looks like this:

<ul>
<li>content</li>
<li>content</li>
<li>content</li>
<li>content</li>
<li>content</li>
</ul>

How can I do it? Sorry, my English is bad. :) Thanks.

Are you looking for pagination? Or do you just want to minimize the number of rows displayed on a single page? — Daryl Gill, Apr 09 '14 at 00:18
if you want to parse *existing* html, there's some options too: http://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php — Jorg, Apr 09 '14 at 00:19
No, the HTML is the output from grabbing and it shows up like that. I want to cut it down to five li tags. I want to use the explode function, but the li code is all the same. Thanks for editing :) — user3513136, Apr 09 '14 at 00:27
If you're grabbing html, please consider parsing it with the [DomDocument](http://www.php.net/manual/en/class.domdocument.php) class. This will make your life easier, as you can easily delete nodes and just have 5 (or however many) `li` tags within `ul`. — Dave Chen, Apr 09 '14 at 00:35
@user3513136 I wouldn't recommend using explode for scraping HTML. There are great extensions made just for this type of work. — Dave Chen, Apr 09 '14 at 01:12
Hehe, I am just learning PHP. I must study hard again :) — user3513136, Apr 09 '14 at 01:22

score 2 · Accepted Answer · answered Apr 09 '14 at 01:09

Since you are grabbing this HTML rather than hard-coding it, I feel using DOMDocument is appropriate.

<?php

$html = '<ul>
<li>content</li>
<li>content</li>
<li>content</li>
<li>content</li>
<li>content</li>
<li>content</li>
<li>content</li>
<li>content</li>
</ul>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$ul    = $dom->getElementsByTagName('ul')->item(0);
$count = 0;

$toRemove = array();

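// Collect every <li> after the fifth. Nodes are gathered first and removed
// afterwards, because removing while iterating childNodes would skip items.
foreach ($ul->childNodes as $node) {
    // nodeName exists on every node type; text nodes have no tagName
    if ($node->nodeName === 'li' && $count++ >= 5) {
        $toRemove[] = $node;
    }
}
foreach ($toRemove as $node) {
    $ul->removeChild($node);
}

// Drop the doctype, then replace <html> with the <ul> so only the list is printed
$dom->removeChild($dom->firstChild);
$dom->replaceChild($dom->firstChild->firstChild->firstChild, $dom->firstChild);
echo $dom->saveHTML();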

Output:

<ul><li>content</li>
<li>content</li>
<li>content</li>
<li>content</li>
<li>content</li>



</ul>

The empty lines are left over from the newlines around the removed <li> tags. You can remove them too by also checking for #text nodes.
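As a rough sketch of that #text check, the loop can also collect whitespace-only text nodes for removal. The function name `trimList` and its `$limit` parameter are made up for this illustration and are not part of the answer's code; passing the `<ul>` node to `saveHTML()` is simply another way to print only the list.

```php
<?php
// Hypothetical helper (name and signature are assumptions for illustration):
// keep the first $limit <li> children of the first <ul> and drop the
// whitespace-only text nodes that would otherwise leave blank lines.
function trimList(string $html, int $limit = 5): string
{
    $dom = new DOMDocument();
    $dom->loadHTML($html);

    $ul = $dom->getElementsByTagName('ul')->item(0);
    if ($ul === null) {
        return ''; // no <ul> in the input
    }

    $count    = 0;
    $toRemove = array();

    foreach ($ul->childNodes as $node) {
        if ($node->nodeName === 'li') {
            // Every <li> past the limit is marked for removal
            if ($count++ >= $limit) {
                $toRemove[] = $node;
            }
        } elseif ($node->nodeName === '#text' && trim($node->nodeValue) === '') {
            // Whitespace-only text node (the newlines between the tags)
            $toRemove[] = $node;
        }
    }

    foreach ($toRemove as $node) {
        $ul->removeChild($node);
    }

    // Serialize only the <ul> node instead of the whole document
    return $dom->saveHTML($ul);
}
```

Calling `echo trimList($html);` with the `$html` string from above should print the list with five `<li>` items and no blank lines. The exact line breaks in the result depend on how libxml serializes the node.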

@user3513136 If you are satisfied with this answer please accept it with a checkmark. I would ask again that you reconsider using explode to scrape html data. If anything changes, nothing will work. — Dave Chen, Apr 09 '14 at 01:28
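To illustrate the point about explode, here is a minimal sketch of selecting the scraped list with DOMXPath instead of splitting on tag fragments. It assumes the target is an `<ol class="tabcont">` element, which is only inferred from the explode delimiters in the question; the function name `extractTabcont` is hypothetical.

```php
<?php
// Hypothetical helper (an assumption for illustration, not from the thread):
// return the inner HTML of the first <ol class="tabcont"> in a page.
function extractTabcont(string $html): string
{
    $dom = new DOMDocument();

    // Scraped pages are often not valid HTML, so keep parser warnings quiet
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $list  = $xpath->query('//ol[@class="tabcont"]')->item(0);
    if ($list === null) {
        return ''; // the page layout changed or the list is missing
    }

    // Concatenate the serialized children to get the inner HTML
    $inner = '';
    foreach ($list->childNodes as $child) {
        $inner .= $dom->saveHTML($child);
    }
    return $inner;
}
```

With the `readHTML()` function from the question, usage could look like `echo extractTabcont(readHTML('http://website.com/'));`. Unlike the explode version, a missing list gives an empty string rather than an undefined index error.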
