
I'm using cURL to scrape two websites, both of them with the same PHP script (which is run every 30 minutes by a cron job). The request is very simple:

//website 1
$ch = curl_init();
$url = 'url';
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($ch);
curl_close($ch);

//website 2
$ch2 = curl_init();
$url2 = 'url';
curl_setopt($ch2, CURLOPT_URL, $url2);
curl_setopt($ch2, CURLOPT_RETURNTRANSFER, true);
$result2 = curl_exec($ch2);
curl_close($ch2);

My question is: what is the best practice in cases like this to prevent running out of memory (it hasn't happened yet, but who knows) and to maximize execution speed?
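
For speed, I was also wondering whether running both requests in parallel with `curl_multi` would be a reasonable approach. A minimal sketch of what I mean (the URLs are placeholders):

//run both requests in parallel instead of one after the other
$urls = ['url', 'url2'];
$mh = curl_multi_init();
$handles = [];

foreach ($urls as $i => $u) {
    $handles[$i] = curl_init($u);
    curl_setopt($handles[$i], CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $handles[$i]);
}

//drive both transfers until they are done
do {
    $status = curl_multi_exec($mh, $active);
    if ($active) {
        curl_multi_select($mh); //wait for activity instead of busy-looping
    }
} while ($active && $status === CURLM_OK);

//collect the results and release every handle
$results = [];
foreach ($handles as $i => $h) {
    $results[$i] = curl_multi_getcontent($h);
    curl_multi_remove_handle($mh, $h);
    curl_close($h);
}
curl_multi_close($mh);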

Is there a way to free memory after each cURL request?
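
For example, would reusing a single handle and dropping each response once it has been processed be enough? Another sketch (again with placeholder URLs):

//reuse one handle for both sites, free the response between fetches
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

foreach (['url', 'url2'] as $u) {
    curl_setopt($ch, CURLOPT_URL, $u);
    $result = curl_exec($ch);

    //... process $result here ...

    unset($result); //drop the response body before the next fetch
}

curl_close($ch);
$ch = null; //drop the last reference so the resources can be reclaimed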

Thank you! :D

emma
  • `curl_close` is supposed to free up any allocated resources, so you should be fine memory-wise. – Jeto Aug 18 '18 at 07:56
  • Hey @Jeto, thank you! Are there any other suggestions regarding better practices that I could implement? X_X – emma Aug 18 '18 at 07:59
  • Some potential hints [here](https://stackoverflow.com/questions/19467449/how-to-speed-up-curl-in-php). I'm no cURL expert so can't say if they're reliable but they seem like reasonable things to try. – Jeto Aug 18 '18 at 08:52
  • Depending on how much data there is on the website you are scraping, you might want to adjust the memory limit PHP is allowed to use. Use `ini_get('memory_limit')` to see the current limit. – Vinay Aug 18 '18 at 14:44
  • Hey @Vinay, I was thinking about that, but isn't that bad practice? I mean, I would like to use as few resources as possible, even if that means splitting my code. It is run by a cron job anyway, so I don't mind slowing it down a little just to use fewer resources. But is this the right approach, or should I consider some other one? – emma Aug 18 '18 at 14:50
  • Yes, if done without proper consideration you'll just be wasting your resources. Depending on your PHP version, the limit might be 8MB for PHP 5.3 or 128MB for PHP 7.0. So unless your fetched HTML exceeds 8MB, there's nothing to worry about. Since PHP has a built-in GC (garbage collector), it would help to add `$ch = null` after `curl_close($ch);` to hint the GC to recover the memory used by the cURL variable and resource handles for website 1 before proceeding to scrape website 2 (see the sketch below). – Vinay Aug 18 '18 at 15:03
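
Putting the hints from the comments together (check the current limit, close the handle, null the variable), a minimal sketch of what that could look like; the URL is a placeholder:

//see how much headroom the script has
echo 'memory_limit: ' . ini_get('memory_limit') . "\n";

$ch = curl_init('url');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($ch);
curl_close($ch);
$ch = null; //hint the GC that the handle can be collected

//peak memory actually used so far, to compare against the limit
echo 'peak usage: ' . memory_get_peak_usage(true) . " bytes\n";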

0 Answers