I'm trying to parse a table from an HTML web page, but I'm having trouble.
Here is approximately what my HTML looks like:
<tbody>
<tr class="even">
<td class="time">Monday 20:10</td>
<td class="place">Paris 14</td>
</tr>
<tr class="odd">
<td class="time">Monday 21:00</td>
<td class="place">Paris 13</td>
</tr>
</tbody>
EDIT: Here is my PHP:
<?php
$url = 'https://www.gymsuedoise.com/loc/dt/?id=64';
$options = array(
    CURLOPT_RETURNTRANSFER => true, // return web page
    CURLOPT_HEADER => false, // don't return headers
    CURLOPT_FOLLOWLOCATION => true, // follow redirects
    CURLOPT_ENCODING => "", // handle all encodings
    CURLOPT_USERAGENT => "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:18.0) Gecko/20100101 Firefox/18.0", // something like Firefox
    CURLOPT_AUTOREFERER => true, // set referer on redirect
    CURLOPT_CONNECTTIMEOUT => 120, // timeout on connect
    CURLOPT_TIMEOUT => 120, // timeout on response
    CURLOPT_MAXREDIRS => 10, // stop after 10 redirects
);
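// Fetch the page
$curl = curl_init($url);
curl_setopt_array($curl, $options);
$content = curl_exec($curl);
curl_close($curl);
// Parse the HTML; @ suppresses warnings about malformed markup
$dom = new DOMDocument();
@$dom->loadHTML($content);
$xpath = new DOMXPath($dom);
$tables = $dom->getElementsByTagName('tbody');
$rows = $tables->item(0)->getElementsByTagName('tr');
// Initialise the result array and the row counter before the loop
$liste_element = array();
$i = 0;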
foreach ($rows as $row)
{
    $cols = $row->getElementsByTagName('td');
    $date = $cols->item(0)->nodeValue;
    $liste_element[$i]['date'] = trim($date);
    $intensite = $cols->item(2)->nodeValue;
    $liste_element[$i]['intensite'] = trim($intensite);
    $animateur = $cols->item(3)->nodeValue;
    $liste_element[$i]['animateur'] = trim($animateur);
    $forfait = $cols->item(5)->nodeValue;
    $liste_element[$i]['forfait'] = trim($forfait);
    $i++;
}
echo '<pre>';
print_r ($liste_element);
echo '</pre>';
?>
My issue is that my script can't scrape anything from the 6th column (i.e. item(5)) of the table, as it contains only pictures and no text.
How could I scrape the content of the alt or title attribute of the <img> tag?
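
For reference, here is a minimal sketch of one direction that might work, assuming each picture cell holds at most one <img>. The sample markup, the "forfait" class, the attribute values and the cell_value() helper are all hypothetical and only serve to illustrate the idea; the real page may be structured differently. It looks for an <img> inside the cell with getElementsByTagName() and reads its attribute with getAttribute(), falling back to the cell text when there is no image:

```php
<?php
// Hypothetical sample markup: the real page's cells and attribute values may differ.
$html = '<table><tbody>'
      . '<tr class="even"><td class="time">Monday 20:10</td>'
      . '<td class="forfait"><img src="a.png" alt="Basic" title="Basic pass"></td></tr>'
      . '<tr class="odd"><td class="time">Monday 21:00</td>'
      . '<td class="forfait">No picture</td></tr>'
      . '</tbody></table>';

$dom = new DOMDocument();
@$dom->loadHTML($html); // @ suppresses warnings about malformed markup

/**
 * Hypothetical helper: return the given attribute of the first <img>
 * inside a cell, or the cell's trimmed text when it contains no image.
 */
function cell_value(DOMElement $cell, $attribute = 'alt')
{
    $images = $cell->getElementsByTagName('img');
    if ($images->length > 0) {
        return trim($images->item(0)->getAttribute($attribute));
    }
    return trim($cell->nodeValue);
}

$result = array();
foreach ($dom->getElementsByTagName('tr') as $row) {
    $cols = $row->getElementsByTagName('td');
    // Index 1 because this sample has only two columns;
    // on the real table this would presumably be item(5).
    $result[] = array(
        'date'    => trim($cols->item(0)->nodeValue),
        'forfait' => cell_value($cols->item(1), 'alt'),
    );
}

print_r($result);
```

Passing 'title' instead of 'alt' as the second argument would read the title attribute instead. If a cell can contain several images, the helper would need to loop over $images rather than take only the first one.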