I'm building a script that gives me a product array by parsing the HTML from a list of websites.
I believe I'm doing everything right, but for some reason I'm having a lot of difficulty with just one website: Makita.ca.
I'm using DOMXPath to retrieve the elements, and I am feeding it the raw HTML that I get from makita.ca.
The pictures I want are the ones on the left side of the page.
Please also note that the only thing I need is the link of each image, not the actual image.
The page in question is http://www.makita.ca/index2.php?event=tool&id=100
    $productArray = array();
    $Dom = new DOMDocument();
    @$Dom -> loadHTML($this->html);
    $xpath = new DOMXPath($Dom);
    echo $xpath -> query('//*[@id="content_other"]/table[2]/tbody/tr/td[1]/table/tbody/tr[4]/td/table/tbody/tr[1]/td/div/a/img')->length;
    if($xpath -> query('//*[@id="content_other"]/table[2]/tbody/tr/td[1]/table/tbody/tr[4]/td/table')->length > 0)
    {
        for($i=0;$i<$xpath->query('//*[@id="content_other"]/table[2]/tbody/tr/td[1]/table/tbody/tr[4]/td/table/tbody/tr')->length;$i++)
        {
            if($xpath->query('//*[@id="content_other"]/table[2]/tr/td[1]/table/tr[4]/td/table/tr['.$i.']/td/div/a/img') > 0)
                $productArray['picture'][] = $xpath -> query('//*[@id="content_other"]/table[2]/tr/td[1]/table/tr[4]/td/table/tr['.$i.']/td/div/a/img')->item(0)->nodeValue;
        }
    }
Do you see what my mistake is? I'm really lost at this point.
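A reduced test case might help isolate this. The sketch below uses a made-up HTML snippet (the markup, the file names, and the variable names are my assumptions, not Makita's real page) to compare the same XPath with and without tbody, and to read the link from the src attribute instead of nodeValue. As far as I know, browsers add tbody to the DOM they show in the inspector, while DOMDocument::loadHTML() keeps tables as they appear in the source, so a path copied from the browser may not match.

```php
<?php
// Hypothetical, simplified markup; NOT the real makita.ca page.
$html = '<html><body><div id="content_other">'
      . '<table>'
      . '<tr><td><div><a href="big1.jpg"><img src="thumb1.jpg"></a></div></td></tr>'
      . '<tr><td><div><a href="big2.jpg"><img src="thumb2.jpg"></a></div></td></tr>'
      . '</table>'
      . '</div></body></html>';

$dom = new DOMDocument();
libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Path the way a browser inspector would show it (with tbody).
$withTbody = $xpath->query('//*[@id="content_other"]/table/tbody/tr/td/div/a/img');

// Same path without tbody, matching what is literally in the source.
$withoutTbody = $xpath->query('//*[@id="content_other"]/table/tr/td/div/a/img');

// Run the query once and walk the node list instead of re-querying per row.
$pictures = array();
foreach ($withoutTbody as $img) {
    // An <img> has no text content, so nodeValue is empty;
    // the link is stored in the src attribute.
    $pictures[] = $img->getAttribute('src');
}

echo $withTbody->length, "\n";
echo $withoutTbody->length, "\n";
print_r($pictures);
```

If the first number is 0 and the second is not, the tbody steps in my original paths would be the first thing to remove.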
Edit:
OK, for test purposes I am echoing the length returned by the query() method, which should tell me how many elements match the query.
I retyped the whole query by hand so that it can't contain any non-ASCII characters: '//*[@id="content_other"]/table[2]//tr/td[1]/table//tr[4]/td/table//tr[1]/td/div/a/img'. The result is 0.
So I removed the end of the query part by part:
//*[@id="content_other"]/table[2]//tr/td[1]/table//tr[4]/td/table//tr[1]/td/div/a = 0
//*[@id="content_other"]/table[2]//tr/td[1]/table//tr[4]/td/table//tr[1]/td/div = 0
//*[@id="content_other"]/table[2]//tr/td[1]/table//tr[4]/td/table//tr[1]/td = 0
//*[@id="content_other"]/table[2]//tr/td[1]/table//tr[4]/td/table//tr[1] = 0
//*[@id="content_other"]/table[2]//tr/td[1]/table//tr[4]/td/table = 0
//*[@id="content_other"]/table[2]//tr/td[1]/table//tr[4]/td = 0
//*[@id="content_other"]/table[2]//tr/td[1]/table//tr = 5
Wooo, I got some matching elements here! OK, let's try the last element, which is the one I need. Since I assume the index is zero-based, to get tr number 5 I need to enter this as the path:
//*[@id="content_other"]/table[2]//tr/td[1]/table//tr[4]
But I still get 0... so I don't know what to do any more.
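One thing worth checking here: XPath positions are 1-based, and a predicate such as //tr[4] selects a tr that is the fourth tr child of its own parent, not the fourth match overall. The sketch below uses hypothetical markup (five rows spread over three tables, invented for illustration, with an invented id "box") that reproduces the same pattern of 5 matches without an index and 0 matches with [4].

```php
<?php
// Hypothetical markup: five rows spread over three sibling tables.
$html = '<div id="box">'
      . '<table><tr><td>r1</td></tr><tr><td>r2</td></tr></table>'
      . '<table><tr><td>r3</td></tr><tr><td>r4</td></tr></table>'
      . '<table><tr><td>r5</td></tr></table>'
      . '</div>';

$dom = new DOMDocument();
libxml_use_internal_errors(true); // collect parse warnings instead of printing them
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Every tr below the div, regardless of which table it sits in.
$all = $xpath->query('//*[@id="box"]//tr')->length;

// Positions start at 1, so index 0 can never match.
$zeroth = $xpath->query('//*[@id="box"]//tr[0]')->length;

// tr[1] is evaluated per parent: the first row of EACH table.
$firsts = $xpath->query('//*[@id="box"]//tr[1]')->length;

// No single table has a fourth row, so this matches nothing.
$fourthChild = $xpath->query('//*[@id="box"]//tr[4]')->length;

// Parentheses apply the index to the whole result set instead.
$fourthOverall = $xpath->query('(//*[@id="box"]//tr)[4]');
$fourthText = trim($fourthOverall->item(0)->textContent);

echo "$all $zeroth $firsts $fourthChild $fourthText\n";
```

If the real page behaves like this sample, wrapping the expression in parentheses before the index, or using a loop counter that starts at 1, may be what my query needs.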