I am trying to retrieve the contents of a table (everything under <tbody>) from a URL and show it on my page.
It can also be everything under <table>, as long as <thead>...</thead> is removed.
I have searched many references on this forum but have not been able to get the result I want.
The HTML structure is shown in the image (the actual code is too lengthy to paste here): [1]: https://i.stack.imgur.com/SgwM1.png
I would appreciate it if you could show me the light.
My sample code:
    $url = 'https://xxxxxx.com/tracking/SUA000085003';
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $cl = curl_exec($ch);
    $dom = new DOMDocument();
    $dom->loadHTML($cl);
    $dom->validate();
    $rows = $dom->getElementsByTagName("tr");
    foreach ($rows as $row) {
        $cells = $row->getElementsByTagName('td');
        foreach ($cells as $cell) {
            print $cell->nodeValue; // print cells' content as 124578
            echo "<BR>";
        }
    }
The result I got is:
https://xxxxxx.com/tracking/SUA000085003
15 May 202101:35:33
the goods left the warehouse in guangzhou
15 May 202101:35:33
arrived at sorting facility
14 May 202123:35:33
express operation is complete
The URL in the result comes from <thead>...</thead> inside the <table>.
I would like to remove this text entirely, or show only the text after the last /, which is SUA000085003 in this example.
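
To make the goal concrete, here is a minimal sketch of the behaviour I am after. It does not fetch anything: the $html string is a made-up stand-in for the real page, which I only know from the screenshot, and the cell contents are shortened examples. It assumes the URL sits in a <td> inside <thead>, and it uses DOMXPath to pick only the rows under <tbody>. The names $lines and $trackingId are just for illustration.

```php
<?php
// Hypothetical sample markup; the real page structure is only known from the screenshot.
$html = <<<HTML
<table>
  <thead><tr><td>https://xxxxxx.com/tracking/SUA000085003</td></tr></thead>
  <tbody>
    <tr><td>15 May 2021</td><td>the goods left the warehouse in guangzhou</td></tr>
    <tr><td>14 May 2021</td><td>express operation is complete</td></tr>
  </tbody>
</table>
HTML;

// Collect parser warnings internally instead of printing them for imperfect HTML.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Option 1: read only the rows inside <tbody>, so nothing from <thead> is included.
$lines = [];
foreach ($xpath->query('//table/tbody/tr') as $row) {
    $cells = [];
    foreach ($xpath->query('./td', $row) as $cell) {
        $cells[] = trim($cell->nodeValue);
    }
    $lines[] = implode(' | ', $cells);
}

// Option 2: keep the header text but cut it down to the part after the last "/".
$headerCell = $xpath->query('//table/thead//td')->item(0);
$headerText = trim($headerCell->nodeValue);
$trackingId = substr($headerText, strrpos($headerText, '/') + 1);

echo $trackingId, "<br>";
foreach ($lines as $line) {
    echo $line, "<br>";
}
```

With the real page, $html would be replaced by the string returned from curl_exec(), and the XPath expressions may need adjusting if the actual markup differs from what I assumed above.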