I'm curious whether anyone has recommendations on the best way to use PHP/cURL (or another technology entirely) to download content from a website. Right now I'm using curl_multi to run 10 requests at a time, which helps somewhat (rough sketch of my current loop below).
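For reference, this is a minimal sketch of the rolling-window pattern I'm using around curl_multi, where a new request starts as soon as an old one finishes instead of waiting on a whole batch. The function name, the options, and `handle_body()` are placeholders for illustration, not my exact code:

```php
<?php
// Rolling-window sketch: keep $window transfers in flight at all times.
// $urls is assumed to be a plain array of page URLs; handle_body() stands
// in for whatever is done with each response.

function fetch_all(array $urls, int $window = 10): void
{
    $mh = curl_multi_init();
    $inFlight = 0;

    $add = function (string $url) use ($mh, &$inFlight): void {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_FOLLOWLOCATION => true,
            CURLOPT_TIMEOUT        => 30,
        ]);
        curl_multi_add_handle($mh, $ch);
        $inFlight++;
    };

    // Prime the window with the first $window URLs.
    while ($inFlight < $window && $urls) {
        $add(array_shift($urls));
    }

    do {
        curl_multi_exec($mh, $running);
        if ($running) {
            curl_multi_select($mh, 1.0); // wait for activity instead of spinning
        }

        // Harvest completed transfers and immediately top the window back up,
        // so we never sit idle waiting on the slowest request in a batch.
        while ($done = curl_multi_info_read($mh)) {
            $ch = $done['handle'];
            handle_body(curl_multi_getcontent($ch)); // placeholder
            curl_multi_remove_handle($mh, $ch);
            curl_close($ch);
            $inFlight--;

            if ($urls) {
                $add(array_shift($urls));
            }
        }
    } while ($inFlight > 0);

    curl_multi_close($mh);
}
```

For scale: 100K pages in 16 hours works out to under 2 pages per second, so I suspect even widening the window well past 10 would help before I resort to more machines, assuming the remote server tolerates it.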
I need to request about 100K pages daily, which gets tedious (it takes about 16 hours right now). My initial thought is to set up multiple virtual machines and split up the task (roughly along the lines of the sketch below), but I'm wondering if there's something else I'm missing besides parallelization. (I know you can always throw more machines at the problem, heh.)
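If I do go the multiple-VM route, I'm picturing something as simple as this for the split, where each machine takes every Nth URL so no two overlap ($allUrls, $workers, and $workerId are made-up names for illustration):

```php
<?php
// Hypothetical shard selection: $workerId (0..$workers-1) would come from
// each VM's own config; $allUrls is assumed to be a 0-indexed array.
$workers = 4; // assumption: 4 VMs
$myUrls = array_values(array_filter(
    $allUrls,
    fn ($i) => $i % $workers === $workerId,
    ARRAY_FILTER_USE_KEY
));
fetch_all($myUrls);
```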
Thanks in advance!