
I'm trying to parse the table from this URL.

www.goo.gl/QaU5hB

Specifically, I want the MOBILE version of the table, not the full 'calendar' layout.

This is my current code:

    <?php
    $url = "https://www.officeholidays.com/2018/01.php";
    $html = file_get_contents($url);

    // Collect libxml parse warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc = new \DOMDocument();
    if ($doc->loadHTML($html))
    {
        // Build the copied table in a separate output document.
        $result = new \DOMDocument();
        $result->formatOutput = true;
        $table = $result->appendChild($result->createElement("table"));
        $tr = $table->appendChild($result->createElement("tr"));

        $table->setAttribute("class", "table table-striped");
        $tr->setAttribute("class", "roe");

        $xpath = new \DOMXPath($doc);

        // Note: every row below is appended to $tr, so the copied rows end up
        // nested inside the outer <tr class="roe">.
        $newRow = $tr->appendChild($result->createElement("tr"));

        // Copy the header cells.
        foreach ($xpath->query("//table[@class='info-table mobile_ad']/tr/th") as $header)
        {
            $newRow->appendChild($result->createElement("th", trim($header->nodeValue)));
        }

        // Copy each row, carrying over its class attribute if it has one.
        foreach ($xpath->query("//table[@class='info-table mobile_ad']/tr") as $row)
        {
            $newRow = $tr->appendChild($result->createElement("tr"));

            if ($row->hasAttribute("class"))
            {
                $newRow->setAttribute("class", $row->getAttribute("class"));
            }

            // Copy each cell's text and, if present, its class attribute.
            foreach ($xpath->query("./td", $row) as $cell)
            {
                $newCell = $newRow->appendChild($result->createElement("td", trim($cell->nodeValue)));

                if ($cell->hasAttribute("class"))
                {
                    $newCell->setAttribute("class", $cell->getAttribute("class"));
                }
            }
        }

        echo $result->saveXML($result->documentElement);
    }
    ?>

This produces a 'plain text' copy of the table. However, some rows in the source table have classes, depending on the type of holiday.

Is there a way to copy these classes over as well?

I already have the CSS set up on my end, so I just need to know whether the classes can be copied across too. Or does PHP's HTML DOM only work with plain text?

Thanks
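
For reference, here is a minimal sketch of what I mean by copying classes, run against a small inline HTML string rather than the live page. The sample markup, the class names `regional` and `day`, and the helper name `copyTableWithClasses` are assumptions for illustration only; the real page's classes may differ.

```php
<?php
// Hypothetical helper: copies a table's rows and cells into a new document,
// carrying over each element's class attribute when one exists.
function copyTableWithClasses(string $html, string $tableQuery): string
{
    libxml_use_internal_errors(true); // tolerate imperfect real-world HTML
    $doc = new DOMDocument();
    if (!$doc->loadHTML($html)) {
        return '';
    }
    libxml_clear_errors();

    $xpath  = new DOMXPath($doc);
    $result = new DOMDocument();
    $table  = $result->appendChild($result->createElement('table'));

    // "//tr" also matches rows wrapped in <thead> or <tbody>.
    foreach ($xpath->query($tableQuery . '//tr') as $row) {
        $newRow = $table->appendChild($result->createElement('tr'));
        if ($row->hasAttribute('class')) {
            $newRow->setAttribute('class', $row->getAttribute('class'));
        }
        // Copy header and data cells in document order.
        foreach ($xpath->query('./th | ./td', $row) as $cell) {
            $newCell = $result->createElement($cell->nodeName);
            $newCell->appendChild($result->createTextNode(trim($cell->textContent)));
            if ($cell->hasAttribute('class')) {
                $newCell->setAttribute('class', $cell->getAttribute('class'));
            }
            $newRow->appendChild($newCell);
        }
    }
    return $result->saveHTML($table);
}

// Assumed sample markup, standing in for the real page.
$sample = '<table class="info-table mobile_ad">'
        . '<tr><th>Date</th><th>Holiday</th></tr>'
        . '<tr class="regional"><td class="day">Jan 1</td><td>New Year</td></tr>'
        . '</table>';

echo copyTableWithClasses($sample, "//table[@class='info-table mobile_ad']");
```

In this sketch the rows are appended directly to the new `<table>`, and `hasAttribute`/`getAttribute`/`setAttribute` on the `DOMElement` objects are what carry the classes across.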

  • You may not be able to rely on consistent web layout, so your parse code may fail in the future. There is an API, but it's not working yet. – StackSlave Dec 06 '17 at 00:44
  • Hi, Yeah I see that there's an API, but like you say, it doesn't appear to be functioning correctly. This is the main reason why I'm going for this method at the moment. Is it possible to do what I'm asking for? Bringing across the table row class (if one exists) using my code above? – Ashley Taylor Dec 06 '17 at 15:29
  • I don't suppose anyone has any idea how to do this? – Ashley Taylor Dec 08 '17 at 19:10
  • [DOMDocument](http://php.net/manual/en/class.domdocument.php) is PHP, somewhat like JavaScript, for scraping. You might have to look up some of the ways to access get methods that return Arrays. Here are a couple of [useful](https://stackoverflow.com/questions/2909849/loop-over-domdocument) [links](http://php.net/manual/en/domnodelist.item.php) that may help in your endeavor. – StackSlave Dec 12 '17 at 05:44
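
As a small illustration of the `DOMNodeList` links in the last comment: the query methods return a `DOMNodeList` rather than a plain PHP array, so it is walked with `->length` and `->item($i)` (or `foreach`). A minimal sketch, where the inline markup is an assumption made up for the example:

```php
<?php
$doc = new DOMDocument();
// Assumed sample markup, for illustration only.
$doc->loadHTML('<ul><li class="a">one</li><li>two</li></ul>');
$items = $doc->getElementsByTagName('li');

// DOMNodeList is not a PHP array; index it with ->item($i).
$classes = [];
for ($i = 0; $i < $items->length; $i++) {
    $node = $items->item($i);
    // getAttribute() returns an empty string when the attribute is missing.
    $classes[] = $node->getAttribute('class');
}
print_r($classes);
```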
