I am trying to scrape the full list of items from a URL so that I can store them in a database and use them later. The problem is that I only get the first item. I want to get every item on the page, then go on to page 2, then 3, then 4, and so on, and scrape all the links if possible.
For each post I want the link (http:..............html) and the title. Then I want to move on to the next page, do the same for all the pages, and store the results in a database.
Here is the code I used:
$url = 'http://newyork.craigslist.org/search/jjj?addFour=part-time';
$timeout = 10;
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_FRESH_CONNECT, true);
curl_setopt($ch, CURLOPT_TIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$data = curl_exec($ch);
curl_close($ch);
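// Return the first capture group of the first match only.
function get_matched($pattern, $data)
{
    // preg_match() stops after the first match, so at most one <p> is captured.
    preg_match($pattern, $data, $match);
    // Avoid an undefined-index notice when nothing matches.
    return isset($match[1]) ? $match[1] : '';
}

$pattern = "/<p>(.*?)<\/p>/";
$caty = get_matched($pattern, $data);
echo $caty;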
How can I do this?
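
For reference, here is a minimal sketch of the direction I think this needs to go, not a tested solution. It assumes each post on the listing page is a `<p>` containing a link, and that later pages can be requested with an `s` offset parameter in steps of 100; neither assumption is confirmed anywhere above. The names `extract_posts`, `scrape_all` and the `$fetch` callback are hypothetical.

```php
<?php
// Hypothetical helper: collect the URL and title of every link inside a <p>.
// Assumption: each post on the listing page is a <p> that contains its link.
function extract_posts(string $html, string $baseUrl): array
{
    if (trim($html) === '') {
        return [];
    }
    $doc = new DOMDocument();
    // Real-world HTML rarely validates, so collect parser warnings silently.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $posts = [];
    foreach ($xpath->query('//p//a[@href]') as $a) {
        $href  = $a->getAttribute('href');
        $title = trim($a->textContent);
        if ($title === '') {
            continue; // skip image-only or empty links
        }
        // Turn site-relative links ("/mnh/...") into absolute ones.
        if (strpos($href, '/') === 0 && strpos($href, '//') !== 0) {
            $href = rtrim($baseUrl, '/') . $href;
        }
        $posts[] = ['url' => $href, 'title' => $title];
    }
    return $posts;
}

// Hypothetical pagination loop. $fetch is any callable that takes a URL and
// returns the page HTML (for example a wrapper around the cURL calls).
// Assumption: the site accepts an "s" offset parameter, $pageSize per page.
function scrape_all(callable $fetch, string $baseUrl, string $path,
                    int $pageSize = 100, int $maxPages = 5): array
{
    $all = [];
    for ($page = 0; $page < $maxPages; $page++) {
        $sep   = strpos($path, '?') === false ? '?' : '&';
        $html  = (string) $fetch($baseUrl . $path . $sep . 's=' . ($page * $pageSize));
        $posts = extract_posts($html, $baseUrl);
        if (count($posts) === 0) {
            break; // an empty page is treated as the end of the results
        }
        $all = array_merge($all, $posts);
    }
    return $all;
}
```

The cURL code from the question could be wrapped in a function and passed in as `$fetch`, for example `scrape_all('fetch_page', 'http://newyork.craigslist.org', '/search/jjj?addFour=part-time')`, where `fetch_page` is a name I made up. Each returned row has a `url` and a `title`, which could then be inserted into the database with a PDO prepared statement.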