
I am trying to parse HTML with Simple HTML DOM and remove the page menu and footer. (In this example I chose http://codex.buddypress.org/developer-docs/the-bp-global/, but later it may be other URLs.) But my code returns Fatal error: Call to a member function find() on a non-object. What is wrong? Thanks.

require('simple_html_dom.php');
$webch = curl_init();
curl_setopt($webch, CURLOPT_URL, "http://codex.buddypress.org/developer-docs/the-bp-global/");
curl_setopt($webch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($webch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 GTB5');
$htmls = curl_exec($webch);
curl_close($webch);
$html = str_get_html($htmls);
$html = preg_replace('#<div(.*?)id="(.*?)head(.*?)"(.*?)>.*</div>#is', '', $html);
$html = preg_replace('#<div(.*?)class="(.*?)head(.*?)"(.*?)>.*</div>#is', '', $html);
$html = preg_replace('#<div(.*?)id="(.*?)menu(.*?)"(.*?)>.*</div>#is', '', $html);
$html = preg_replace('#<div(.*?)class="(.*?)menu(.*?)"(.*?)>.*</div>#is', '', $html);
$html = preg_replace('#<div(.*?)id="(.*?)foot(.*?)"(.*?)>.*</div>#is', '', $html);
$html = preg_replace('#<div(.*?)class="(.*?)foot(.*?)"(.*?)>.*</div>#is', '', $html);
foreach($html->find('a') as $element){
   echo $element.'<hr />';
}
fish man

1 Answer


str_get_html looks like a function from an HTML DOM parser. It returns an object, not a string, yet you are treating its result as a string. preg_replace works on strings and returns a string, which you then assign to $html.

Your problem is that you then call $html->find(), which expects $html to be an object like the one returned by str_get_html. It no longer is one, because you have just overwritten it with the string returned by preg_replace.

What you probably want is either one of these two things:

  • Do the string processing (using preg_replace) on $htmls before calling $html = str_get_html($htmls);. After that statement $html is an object, and passing it through preg_replace turns it back into a string, so find() stops working.
  • Do the job with the tools available in the library you are using (Simple HTML DOM Parser, as far as Google can tell). Note that find() without an index returns an array of elements, so loop over the matches and blank each one, for example: foreach ($html->find('div.menu') as $div) $div->outertext = '';

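I would recommend the second option (if it does what you want), because processing HTML with regular expressions is not a good idea. For instance, the greedy .*</div> in your patterns matches all the way to the last </div> on the page, so it can remove far more than the menu or footer.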
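The second option could look roughly like the sketch below. It is only an illustration, not a tested solution for the BuddyPress page. The sample markup in $source is made up and stands in for the downloaded page, and matching any div whose id or class contains "head", "menu" or "foot" is an assumption carried over from the patterns in the question. One caveat: setting outertext to an empty string changes only the generated markup, so the sketch re-parses the saved output before searching for links.

```php
<?php
// Assumes simple_html_dom.php (the library from the question) is in the same directory.
require 'simple_html_dom.php';

// Hypothetical sample markup standing in for the page fetched with cURL.
$source = '<div id="main-menu"><a href="/home">Home</a></div>'
        . '<div class="content"><a href="/docs">Docs</a></div>'
        . '<div class="site-footer"><a href="/about">About</a></div>';

$html = str_get_html($source);
if (!$html) {
    exit('Could not parse the HTML');
}

// Blank every div whose id or class attribute contains one of the words.
foreach (array('head', 'menu', 'foot') as $word) {
    foreach (array('id', 'class') as $attr) {
        foreach ($html->find("div[{$attr}*={$word}]") as $div) {
            $div->outertext = '';
        }
    }
}

// Re-parse the saved markup so later find() calls no longer see the blanked divs.
$clean = str_get_html($html->save());
if (!$clean) {
    exit('Nothing left after removing the matched blocks');
}

// Collect the href of every remaining link.
$links = array();
foreach ($clean->find('a') as $a) {
    $links[] = $a->href;
}
print_r($links);
```

For the real page, replace $source with the string returned by curl_exec and keep the check on the result of str_get_html, since it returns false when it cannot build a document (for example, from an empty string).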

jadkik94