
I have the following PHP function that reads page URLs from an array and fetches the HTML content of each page for parsing. The code works fine.

public function fetchContent($HyperLinks){
    foreach($HyperLinks as $link){
        $content = file_get_html($link);
        foreach($content->find('blablabla') as $result){
            $this->HyperLink[] = $result->xmltext;
        }
    }//foreach
    return $this->HyperLink;
}

The problem is that the code is very slow: it takes about 1 second to fetch and parse each page. Since there is a very large number of pages to read, I am looking for a parallel version of the code above. Each page is only a few kilobytes.

I searched and found the exec command, but I cannot figure out how to use it. I want a function that I can call N times in parallel so that the execution takes less time. The function would take one link as input, like below:

public function FetchContent($HyperLink){
  // reading and parsing code
}

I tried this exec call:

print_r(exec("FetchContent",$HyperLink ,$this->Title[]));

but it does not work. I also replaced "FetchContent" with "FetchContent($HyperLink)" and removed the second parameter, but neither works.

Thanks. Please let me know if anything is missing. Any suggestion that helps me quickly process the content of many pages, at least 200-500, is welcome.

Espanta
  • You're looking for threading, see [this question](http://stackoverflow.com/questions/209774/does-php-have-threading) for more details. – georg Nov 10 '14 at 14:03
  • `curl_multi_exec` http://php.net/manual/en/function.curl-multi-exec.php – Steve Nov 10 '14 at 14:05
  • Thanks, let me check. – Espanta Nov 10 '14 at 16:04
  • @Steve, what about the computing and parsing part? I think pthreads is a better solution, isn't it? – Espanta Nov 10 '14 at 17:03
  • No, I don't think so. 99% of the processing time is going to be network latency. Threading in web applications can have serious implications, even in languages that are naturally multithreaded. – Steve Nov 10 '14 at 18:26
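
The `curl_multi_exec` suggestion in the comments could look roughly like the sketch below. It assumes the PHP cURL extension is installed. The function name `fetchAll` and the `$batchSize` parameter are hypothetical and do not come from the question. Downloads run concurrently within each batch, and a failed transfer is stored as `null`.

```php
<?php
/**
 * Hypothetical helper: fetch many URLs concurrently with curl_multi_*.
 * Returns an array mapping each unique URL to its body, or null on failure.
 */
function fetchAll(array $urls, int $batchSize = 20): array
{
    $results = [];
    // Work in batches so only $batchSize connections are open at a time.
    foreach (array_chunk(array_unique($urls), $batchSize) as $batch) {
        $mh = curl_multi_init();
        $handles = [];
        foreach ($batch as $url) {
            $ch = curl_init($url);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // keep body in memory
            curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
            curl_setopt($ch, CURLOPT_TIMEOUT, 10);
            curl_multi_add_handle($mh, $ch);
            $handles[$url] = $ch;
        }

        // Drive all transfers until none is still running.
        do {
            $status = curl_multi_exec($mh, $running);
            if ($running) {
                curl_multi_select($mh, 1.0); // block until there is activity
            }
        } while ($running && $status === CURLM_OK);

        // Read the per-transfer result codes and collect the bodies.
        while (($info = curl_multi_info_read($mh)) !== false) {
            $url = array_search($info['handle'], $handles, true);
            $results[$url] = $info['result'] === CURLE_OK
                ? curl_multi_getcontent($info['handle'])
                : null;
        }

        foreach ($handles as $ch) {
            curl_multi_remove_handle($mh, $ch);
            curl_close($ch);
        }
        curl_multi_close($mh);
    }
    return $results;
}
```

Each returned body could then be parsed in an ordinary loop. If `file_get_html` in the question comes from the Simple HTML DOM library (an assumption), that library's `str_get_html($html)` accepts a string instead of a URL, so the existing `find()` logic would carry over unchanged. This keeps the parsing single-threaded, which matches the comment above that most of the time is spent waiting on the network.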
