I am trying to get the inner HTML of a <p> tag and save it as a .txt file. It is a very simple page; there is only one <p> on it. I tried using getElementsByTagName('p') as per: Using PHP to get DOM Element. Unfortunately, it didn't work for me, but maybe I'm missing something. My code is:
<?php
$dataPage = file_get_contents('http://www.somedataurl.com');
$doc = new DOMDocument;
$doc->loadHTML($dataPage);
$dataNodeList = $doc->getElementsByTagName('p');
$dataNode = $dataNodeList->item(0);
function innerHTML($node) {
    return implode(array_map([$node->ownerDocument, "saveHTML"],
        iterator_to_array($node->childNodes)));
}
$theData = innerHTML($dataNode);
header('Content-Type: text/plain');
$filename = date('Y-m-d') . '.txt';
file_put_contents($filename, $theData);
The error log is giving me:
PHP Notice: Undefined property:: DOMNodeList (line 10)
PHP Notice: Undefined property:: DOMNodeList (line 11)
PHP Catchable fatal error (line 11)
These errors sound rather alarming, especially the last one.
Question: Is there a better tool I can use other than getElementsByTagName(), since I am only dealing with one <p>? Or can this way work if I adjust a few things?
[...] <p> tags?
– Phil May 16 '18 at 02:53
[...] <p> tags are there, but judging by "Undefined property" I think my script might not be finding them for some reason. My only other hunch was a data structure error, DOMNodeList vs. node
– Arash Howaida May 16 '18 at 03:05
[...] <p> with no class.
– Arash Howaida May 16 '18 at 03:11
[...] <p> tag you think is there isn't coming from JavaScript
– Phil May 16 '18 at 03:25
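For what it's worth, here is a minimal sketch of how the approach in the question might be adjusted. It follows up on the two hunches in the comments: that a DOMNodeList rather than a single node may be reaching innerHTML(), and that the <p> may not be in the markup PHP actually receives. The sample `$html` string, the `firstParagraphHtml()` helper, and writing to the temp directory are assumptions made for illustration, not part of the original script. The type declarations assume PHP 7.1 or later.

```php
<?php
// Hypothetical sample markup standing in for the fetched page (an assumption;
// the real page at the URL in the question is not known here).
$html = '<html><body><p>Hello <b>world</b>, 42</p></body></html>';

// Returns the inner HTML of a single DOMNode by serializing each child node.
// The DOMNode type declaration rejects a DOMNodeList passed by mistake.
function innerHTML(DOMNode $node): string
{
    $out = '';
    foreach ($node->childNodes as $child) {
        $out .= $node->ownerDocument->saveHTML($child);
    }
    return $out;
}

// Hypothetical helper: inner HTML of the first <p> in $html, or null if none.
function firstParagraphHtml(string $html): ?string
{
    $doc = new DOMDocument();
    // Collect libxml parse warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // getElementsByTagName() returns a DOMNodeList;
    // item(0) returns a single node, or null when the list is empty.
    $node = $doc->getElementsByTagName('p')->item(0);
    return $node === null ? null : innerHTML($node);
}

$theData = firstParagraphHtml($html);
if ($theData === null) {
    // No <p> in the markup PHP received, e.g. if the page builds it with JavaScript.
    error_log('No <p> element found');
} else {
    // Assumption: written to the temp directory so the sketch runs anywhere;
    // the question writes the file next to the script instead.
    $filename = sys_get_temp_dir() . '/' . date('Y-m-d') . '.txt';
    file_put_contents($filename, $theData);
}
```

If `firstParagraphHtml()` returns null for the real page, dumping the string returned by file_get_contents() would show whether the <p> is present in the raw response at all.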