I am trying to scrape content from a site with simple_html_dom, using this code:

$html = file_get_html('http://www.aswaqcity.com/thread1230092.html');
//echo $html;
// Find all article blocks
foreach($html->find('/html/body/div[2]/div[1]/div/div/div/table[1]/tbody/tr[2]/td[2]') as $article) {
    $item['title']      = $article->find('/div[1]/strong', 0)->plaintext;
    $articles[] = $item;
}

print_r($articles); 

I got the XPath from Firebug, but nothing is scraped.

Cœur
  • @Enissay So, are the answers to [this question](http://stackoverflow.com/questions/9378107/how-to-use-xpath-in-php-simple-html-dom-parser) wrong? Not familiar with PHP, just curious. It seems to me XPath expressions can be used: http://simplehtmldom.sourceforge.net/manual.htm#section_find. – Mathias Müller Jan 01 '15 at 22:07
  • @MathiasMüller Scratch that, both are supported (my bad)... I tried to explore the code, but it looks like it has some encoding problem when displaying the result, which I couldn't solve... – Enissay Jan 01 '15 at 22:22
  • Please explain what you are trying to find on this page. What would be the expected output? @Enissay No worries - I misread specifications all the time myself. – Mathias Müller Jan 01 '15 at 22:24

1 Answer

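Most likely the tbody isn't really in the page source. Browsers add tbody elements to the DOM whenever they are missing, so an XPath copied from Firebug describes the browser's DOM rather than the raw HTML that simple_html_dom parses.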
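
To illustrate the tbody point, here is a minimal sketch using PHP's built-in DOMDocument and DOMXPath instead of simple_html_dom (a swapped-in parser, chosen only because it ships with PHP). The markup is hypothetical and merely stands in for a page source that has no tbody; the real page's structure may differ. On a libxml2-based parser, the path containing tbody should match nothing, while the same path without it should match the cell.

```php
<?php
// Hypothetical markup standing in for the raw page source: there is no <tbody>.
$source = '<html><body><table>'
        . '<tr><td>header</td></tr>'
        . '<tr><td>first</td><td><div><strong>Thread title</strong></div></td></tr>'
        . '</table></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($source);          // parses the markup as given, unlike a browser
$xpath = new DOMXPath($doc);

// Path as a browser inspector would typically report it (includes tbody).
$withTbody = $xpath->query('/html/body/table/tbody/tr[2]/td[2]/div[1]/strong');

// The same path with the tbody step removed.
$withoutTbody = $xpath->query('/html/body/table/tr[2]/td[2]/div[1]/strong');

echo 'matches with tbody: ', $withTbody->length, "\n";
echo 'matches without tbody: ', $withoutTbody->length, "\n";
if ($withoutTbody->length > 0) {
    echo 'title: ', $withoutTbody->item(0)->textContent, "\n";
}
```

If the first count is zero and the second is not, dropping tbody from the copied expression is the likely fix.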

Also, you should be using CSS selectors instead of XPath; that's the whole point of using simple-html-dom.
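
A minimal sketch of the CSS-selector approach, assuming simple_html_dom.php is available next to the script. The inline markup and the selector are hypothetical stand-ins; the selector for the real page has to be worked out from its actual source (view source, not the inspector).

```php
<?php
// Assumption: the simple_html_dom library file sits next to this script.
require_once 'simple_html_dom.php';

// Hypothetical markup; the real page's structure may differ.
$html = str_get_html(
    '<table>'
    . '<tr><td>header</td></tr>'
    . '<tr><td>first</td><td><div><strong>Thread title</strong></div></td></tr>'
    . '</table>'
);

$articles = [];

// Descendant selector: any <strong> inside a <div> inside a table cell.
// No tbody step is needed, so browser-inserted elements do not matter.
foreach ($html->find('table td div strong') as $strong) {
    $articles[] = ['title' => trim($strong->plaintext)];
}

print_r($articles);
```

For the live page, file_get_html() with the URL would replace str_get_html(), as in the question.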

hakre
pguardiario