php preg_replace html menu, footer

Question

I am trying to parse HTML with Simple HTML DOM and remove the page menu and footer. (In this example I chose http://codex.buddypress.org/developer-docs/the-bp-global/, but later it may be other URLs.) But my code returns Fatal error: Call to a member function find() on a non-object. What is wrong? Thanks.

require('simple_html_dom.php');
$webch = curl_init();
curl_setopt($webch, CURLOPT_URL, "http://codex.buddypress.org/developer-docs/the-bp-global/");
curl_setopt($webch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($webch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 GTB5');
$htmls = curl_exec($webch);
curl_close($webch);
$html = str_get_html($htmls);
$html = preg_replace('#<div(.*?)id="(.*?)head(.*?)"(.*?)>.*</div>#is', '', $html);
$html = preg_replace('#<div(.*?)class="(.*?)head(.*?)"(.*?)>.*</div>#is', '', $html);
$html = preg_replace('#<div(.*?)id="(.*?)menu(.*?)"(.*?)>.*</div>#is', '', $html);
$html = preg_replace('#<div(.*?)class="(.*?)menu(.*?)"(.*?)>.*</div>#is', '', $html);
$html = preg_replace('#<div(.*?)id="(.*?)foot(.*?)"(.*?)>.*</div>#is', '', $html);
$html = preg_replace('#<div(.*?)class="(.*?)foot(.*?)"(.*?)>.*</div>#is', '', $html);
foreach($html->find('a') as $element){
   echo $element.'<hr />';
}

score 0 · Accepted Answer · edited May 23 '17 at 10:25

str_get_html looks like a function from an HTML DOM parser. What it returns is not a string but an object, yet you are treating it as a string. preg_replace expects a string as input and returns a string, which is then assigned to $html.

Your problem is that you then call $html->find, which means you expect $html to be an object like the one returned by str_get_html. It no longer is, because you just overwrote it with the string returned by preg_replace.
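The type change described above can be reproduced without the library. This is only a minimal sketch: FakeDom is a hypothetical stand-in for the object that str_get_html returns, and the sample markup and pattern are made up for illustration. It assumes, as the behaviour in the question suggests, that the real object can be converted to a string.

```php
<?php
// Hypothetical stand-in for the object returned by str_get_html();
// the real class lives in simple_html_dom.php.
class FakeDom
{
    private $markup;

    public function __construct($markup)
    {
        $this->markup = $markup;
    }

    // Lets the object be converted to its markup when used as a string.
    public function __toString()
    {
        return $this->markup;
    }

    public function find($selector)
    {
        return array(); // placeholder, not a real selector engine
    }
}

$html = new FakeDom('<div id="menu">nav</div><a href="/x">link</a>');
$before = is_object($html); // an object at this point, so find() is available

// preg_replace() works on the string form and returns a plain string.
$html = preg_replace('#<div id="menu">.*?</div>#is', '', (string) $html);
$after = is_object($html);  // no longer an object, so $html->find() would be fatal

var_dump($before, $after, $html);
```

After the preg_replace call, $html holds only text, which is why the later $html->find('a') fails with "Call to a member function find() on a non-object".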

What you probably want is either one of these two things:

1. Do the string processing (using preg_replace) on $htmls before calling $html = str_get_html($htmls);. After that statement, $html is no longer a string, and any string processing you do on it will be wrong.
2. Do whatever you are doing with the actual tools available in the library you are using (Simple HTML DOM Parser, as far as Google can tell). Something like $html->find('div.menu', 0)->outertext = ''; for example, which blanks the first matching element (find() without an index returns an array of elements, so you would loop over it to handle every match).

I would recommend the second option (if it does what you want), because processing HTML with regular expressions is not a good idea.
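As a rough illustration of removing elements through a DOM instead of regular expressions, here is a sketch that swaps in PHP's built-in DOMDocument and DOMXPath in place of Simple HTML DOM, so it runs without any extra file. The sample markup, the keyword list and the variable names are assumptions for illustration only, and the matching here is case-sensitive, unlike the patterns in the question.

```php
<?php
// Sketch using PHP's built-in DOM extension instead of Simple HTML DOM.
// The sample markup and the keywords are assumptions for illustration.
$source = '<html><body>'
    . '<div id="site-header"><a href="/">Home</a></div>'
    . '<div class="main-menu"><a href="/docs">Docs</a></div>'
    . '<div id="content"><a href="/article">Article</a></div>'
    . '<div class="footer"><a href="/about">About</a></div>'
    . '</body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // real-world pages are rarely valid
$doc->loadHTML($source);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Build a condition matching any div whose id or class contains a keyword.
$conditions = array();
foreach (array('head', 'menu', 'foot') as $keyword) {
    $conditions[] = "contains(@id, '$keyword')";
    $conditions[] = "contains(@class, '$keyword')";
}
$query = '//div[' . implode(' or ', $conditions) . ']';

// Copy the matches into an array so the loop does not depend on the
// tree that is being modified.
foreach (iterator_to_array($xpath->query($query)) as $div) {
    if ($div->parentNode !== null) {
        $div->parentNode->removeChild($div);
    }
}

// Collect the links that remain after the header, menu and footer are gone.
$links = array();
foreach ($doc->getElementsByTagName('a') as $a) {
    $links[] = $a->getAttribute('href');
}

echo implode("\n", $links), "\n";
```

With a page fetched by curl, $source would be the string returned by curl_exec. Because the document stays an object the whole time, the link lookup at the end keeps working.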
