php dom scraping - best method for grabbing product prices

Question

I'm using simpleHtmlDom to do some basic screen scraping, but I'm having trouble grabbing product prices. Sometimes it works and sometimes it doesn't. I also sometimes get multiple prices, for example when the website shows something like "normally $100... now $79.99". Any suggestions? Currently, I'm using this:

$prices = array();
$prices[] = $html->find("[class*=price]", 0)->innertext;
$prices[] = $html->find("[class*=msrp]", 0)->innertext;
$prices[] = $html->find("[id*=price]", 0)->innertext;
$prices[] = $html->find("[id*=msrp]", 0)->innertext;
$prices[] = $html->find("[name*=price]", 0)->innertext;
$prices[] = $html->find("[name*=msrp]", 0)->innertext;

One website that I have no idea how to grab the price from is Victoria's Secret: the price looks like it's just floating around in random HTML.

do you have any particular question? We cannot come up with a one size fits all solution for any possible markup out there. Have a look at http://stackoverflow.com/questions/3577641/how-to-parse-and-process-html-with-php for some tips about parsing HTML with PHP. — Gordon, Dec 07 '11 at 15:26
I'm looking to see what other methods people are using to grab product prices as well as to grab the correct prices. I realize that there isn't a "single solution" to this, but there must be something better than what I'm currently doing. — Stanley, Dec 07 '11 at 15:45

score 1 · Answer 1 · answered Dec 08 '11 at 00:06

First of all, don't use simplehtmldom. Use the built-in DOM functions or a library that's based on them. If you want to extract all prices from a page, you could try something like this:

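// Single quotes keep "$100" literal (no variable interpolation).
$html = '<html><body>normally $100... now $79.99</body></html>';
$dom = new DOMDocument();
libxml_use_internal_errors(true); // real pages are rarely valid HTML
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Visit every text node containing a dollar sign and pull out the amounts.
foreach ($xpath->query('//text()[contains(., "$")]') as $node) {
    // Matches $100 and $79.99, but not the trailing dots in "$100...".
    preg_match_all('/(\$\d[\d,]*(?:\.\d+)?)/', $node->nodeValue, $m);
    print_r($m);
}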
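
To tie this back to the selectors in the question, here is a minimal sketch that applies the same regex idea only to elements whose class or id contains "price" or "msrp", and converts the matches to numbers. The function name extractPrices and the sample markup are made up for illustration, and treating the lowest value as the current price is only a heuristic that happens to fit the "normally $100... now $79.99" case; it may well be wrong on other sites.

```php
<?php
// Hypothetical helper (not part of any library): collect dollar amounts from
// elements whose class or id mentions "price" or "msrp", returned as floats.
function extractPrices(string $html): array
{
    $dom = new DOMDocument();
    // Real-world pages are rarely valid HTML; keep parser warnings quiet.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    // XPath contains() is case-sensitive, so "Price" would not match here.
    $query = '//*[contains(@class,"price") or contains(@class,"msrp")'
           . ' or contains(@id,"price") or contains(@id,"msrp")]';

    $prices = [];
    // An empty result list simply skips the loop, so a missing element
    // does not cause an error.
    foreach ($xpath->query($query) as $node) {
        // textContent includes the text of nested elements.
        if (preg_match_all('/\$(\d[\d,]*(?:\.\d+)?)/', $node->textContent, $m)) {
            foreach ($m[1] as $raw) {
                $prices[] = (float) str_replace(',', '', $raw);
            }
        }
    }
    // A wrapper and its child can both match, which yields duplicates.
    return array_values(array_unique($prices));
}

// Assumed sample markup, loosely modelled on the example in the question.
$sample = '<div class="product-price"><span class="msrp">normally $100</span>'
        . ' <span id="sale-price">now $79.99</span></div>';

$prices = extractPrices($sample);
print_r($prices);

// Heuristic only: assume the lowest amount is the current selling price.
echo min($prices), "\n";
```

Note that min() raises an error on an empty array in PHP 8, so check that extractPrices() found something before calling it on a real page.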
