I am using simple_html_dom.php from http://simplehtmldom.sourceforge.net to obtain the complete URLs of all images on a Wikipedia page. I am mostly searching for companies and organisations. The script below works for a few pages, but for many searches (YouTube in this example, and also Facebook, among others) I get "Fatal error: Call to a member function find() on a non-object...". I am aware that this is because $html is not an object. Which method is most likely to succeed in returning the URLs? Please see the code below. Any help is greatly appreciated.
<html>
<body>
<h2>Search</h2>
<form method="post">
Search: <input type="text" name="q" value="YouTube"/>
<input type="submit" value="Submit">
</form>
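<?php
include 'simple_html_dom.php';

if (isset($_POST['q'])) {
    // Build the article title: capitalise each word and use underscores for spaces.
    $search = $_POST['q'];
    $search = ucwords($search);
    $search = str_replace(' ', '_', $search);
    // For some pages (e.g. YouTube) $html is not an object here,
    // so the find() call below raises the fatal error.
    $html = file_get_html("http://en.wikipedia.org/wiki/$search");
?>
<h2>Search results for '<?php echo htmlspecialchars($search); ?>'</h2>
<ol>
<?php foreach ($html->find('img') as $element): ?>
    <li><?php echo htmlspecialchars($element->src); ?></li>
<?php endforeach; ?>
</ol>
<?php
}
?>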
</body>
</html>
I have now followed the advice in the comments below (though I am probably making a mistake), and when I click Submit I get warnings along the lines of:
Warning: DOMDocument::loadHTMLFile(): ID ref_media_type_table_note_2 already defined in http://en.wikipedia.org/wiki/YouTube, line: 270 in...
Warning: DOMDocument::loadHTMLFile(): ID ref_media_type_table_note_2 already defined in http://en.wikipedia.org/wiki/YouTube, line: 501 in...
Please see my amended code below:
<html>
<body>
<form method="post"> Search:
<input type="text" name="q" value="YouTube"/>
<input type="submit" value="Submit"> </form>
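<?php
if (isset($_POST['q'])) {
    // Build the article title: capitalise each word and use underscores for spaces.
    $search = $_POST['q'];
    $search = ucwords($search);
    $search = str_replace(' ', '_', $search);

    // Load the page with PHP's built-in DOM extension; this call emits the warnings shown above.
    $doc = new DOMDocument();
    $doc->loadHTMLFile("http://en.wikipedia.org/wiki/$search");

    // Print the src attribute of every <img>, one per line.
    foreach ($doc->getElementsByTagName('img') as $image) {
        echo htmlspecialchars($image->getAttribute('src')), "<br>\n";
    }
}
?>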
</body>
</html>
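For reference, here is a minimal sketch of one way the DOMDocument approach might be made quieter. It is not a confirmed fix. It assumes PHP 7 or later, that the "ID ... already defined" messages are libxml parser warnings that can be collected with libxml_use_internal_errors() instead of printed, and that some src values may be protocol-relative (starting with //) or root-relative (starting with /). The function name extractImageUrls() and its $scheme and $host parameters are hypothetical, introduced only for this illustration.

```php
<?php
// Sketch only: extractImageUrls(), $scheme and $host are hypothetical names.
function extractImageUrls(string $html, string $scheme = 'https', string $host = 'en.wikipedia.org'): array
{
    // Nothing to parse; also avoids passing an empty string to loadHTML().
    if ($html === '') {
        return [];
    }

    // Collect parser warnings internally instead of printing them,
    // remembering the previous setting so it can be restored afterwards.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    // Discard the collected warnings and restore the earlier behaviour.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($doc->getElementsByTagName('img') as $image) {
        $src = $image->getAttribute('src');
        if ($src === '') {
            continue; // <img> without a usable src attribute
        }
        if (strpos($src, '//') === 0) {
            // Protocol-relative URL: prepend the assumed scheme.
            $src = $scheme . ':' . $src;
        } elseif (strpos($src, '/') === 0) {
            // Root-relative URL: prepend the assumed scheme and host.
            $src = $scheme . '://' . $host . $src;
        }
        $urls[] = $src;
    }

    return $urls;
}
```

To use it, the page could be fetched first, for example with file_get_contents("http://en.wikipedia.org/wiki/$search"), and passed to extractImageUrls() only if the result is not false. Checking the fetch result before parsing is the same kind of guard that would avoid calling find() on a non-object in the first script.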