I am using PHP's DOMXPath to scrape some websites.

I am currently following this tutorial to traverse XPaths.

I am currently scraping this site to extract the character names and Steam IDs (the messy XPath below is what retrieves a single Steam ID).

The problem is that the page contains multiple Steam IDs and character names, but the XPath that I painstakingly created matches only one.

How should I scrape all of the Steam IDs instead of just one of them?

$xpath = new DomXPath($this->ourTeamHTML);

/* Set HTTP response header to plain text for debugging output */
header("Content-type: text/plain");

$steamName = $xpath->query('//*[@id="wrapper"]/section/div/div[1]/div[2]/div[2]/div[1]/div/div/div[1]/div/div[1]/h5/b');
/* Traverse the DOMNodeList object to output each DomNode's nodeValue */
foreach ($steamName as $node) {
    echo "Steam Name: " . $node->nodeValue . "\n";
}
theGreenCabbage

1 Answer

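Your XPath is too verbose. Because it spells out the full path with element indexes, it is hard to read and tends to break on slight changes in the page source. The positional predicates such as div[1] and div[2] also pin the match to one specific branch of the tree, which is likely why only one node is returned. Try the following simpler XPath: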

//*[@id="wrapper"]//div[@class='col-md-12']//h5/b

It worked for me, returning all Steam IDs and character names (32 elements in total) from the linked page (tested with Firefox's FirePath add-on).
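
For reference, here is a minimal sketch of how that expression could be used from PHP. The sample markup, the values in it, and the helper name extractSteamEntries are assumptions made up for illustration; the real page's structure may differ, so treat this as a starting point rather than a drop-in solution.

```php
<?php
// Hypothetical stand-in markup: the real page's structure is assumed, not verified.
$html = <<<HTML
<div id="wrapper">
  <div class="col-md-12">
    <div><h5><b>PlayerOne</b></h5></div>
    <div><h5><b>STEAM_0:0:11111</b></h5></div>
    <div><h5><b>PlayerTwo</b></h5></div>
    <div><h5><b>STEAM_0:1:22222</b></h5></div>
  </div>
</div>
HTML;

// Hypothetical helper: returns the text of every <b> matched by the simpler XPath.
function extractSteamEntries(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely well-formed, so collect parser warnings quietly.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // No positional indexes, so every matching <b> under #wrapper is returned.
    $nodes = $xpath->query("//*[@id='wrapper']//div[@class='col-md-12']//h5/b");

    $values = [];
    foreach ($nodes as $node) {
        $values[] = trim($node->nodeValue);
    }
    return $values;
}

$entries = extractSteamEntries($html);
foreach ($entries as $entry) {
    echo "Steam entry: " . $entry . "\n";
}
```

Note that @class='col-md-12' matches only when the class attribute is exactly that string. If the real elements carry additional classes, contains(@class, 'col-md-12') is a common, looser alternative.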

har07