I'm trying to scrape some product details from a website using the following code:
$list_url = "http://www.topshop.com/en/tsuk/category/sale-offers-436/sale-799";
$html = file_get_contents($list_url);
echo $html;
However, I'm getting this error:
Warning: file_get_contents(http://www.topshop.com/en/tsuk/category/sale-offers-436/sale-799) [function.file-get-contents]: failed to open stream: HTTP request failed! HTTP/1.0 403 Forbidden in /homepages/19/d361310357/htdocs/shopaholic/rss/topshop_f_uk.php on line 123
I gather that the site is returning the 403 to block scraping. Is there a way around this, perhaps by using cURL and setting a user agent?
If not, is there another way of getting basic product data like item name and price?
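To make the user-agent idea concrete, this is roughly what I have in mind, using a stream context rather than cURL. It is only a sketch: the helper names `build_context` and `fetch_html` and the user-agent string are my own placeholders, and I don't know whether the site will accept it.

```php
<?php
// Build a stream context that sends a User-Agent header with the request.
// The function name and the timeout value are illustrative assumptions.
function build_context($userAgent) {
    return stream_context_create([
        'http' => [
            'method'     => 'GET',
            'user_agent' => $userAgent,
            'timeout'    => 10, // seconds
        ],
    ]);
}

// Fetch a URL with the given user agent; returns null if the request fails.
function fetch_html($url, $userAgent) {
    $html = @file_get_contents($url, false, build_context($userAgent));
    return $html === false ? null : $html;
}
```

I would then call it as `$html = fetch_html($list_url, 'Mozilla/5.0 (compatible; ExampleBot/1.0)');` and check for `null` before going any further.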
EDIT
For context, whatever approach is used to fetch the page, I'd still like to be able to parse the resulting HTML like this:
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
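To show the kind of extraction I'm aiming for, here is a minimal sketch run against a made-up HTML string. The markup, the `product`, `name` and `price` class names, and the sample items are all assumptions for illustration; the real page will be structured differently.

```php
<?php
// Hypothetical markup standing in for the fetched page.
$html = <<<HTML
<html><body>
  <div class="product"><span class="name">Striped Tee</span><span class="price">12.00</span></div>
  <div class="product"><span class="name">Denim Skirt</span><span class="price">20.00</span></div>
</body></html>
HTML;

// Real-world HTML is rarely valid, so collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Collect name and price for each product node.
$products = [];
foreach ($xpath->query('//div[@class="product"]') as $node) {
    $products[] = [
        'name'  => trim($xpath->evaluate('string(.//span[@class="name"])', $node)),
        'price' => trim($xpath->evaluate('string(.//span[@class="price"])', $node)),
    ];
}

print_r($products);
```

So the fetching step only needs to give me the page source as a string in `$html`; the parsing can stay as it is.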