I need some help. I need a script that opens and reads all .csv files in the folder 'csv/files' and then runs the code inside the "if" block for each one. When I had only one file, it worked fine. I put together the script below, but it does not work, and no error message appears either. Can somebody look at my code and tell me what I am doing wrong?

<?php 
foreach (glob("*.csv") as $filename) {
    echo $filename."<br />";

    if (($handle = fopen($filename, "r")) !== FALSE) {
        while (($data = fgetcsv($handle, 1000, ";")) !== FALSE) {
            $url = $data[0];
            $path = $data[1];

            $ch = curl_init($url);
            $fp = fopen($path, 'wb');
            curl_setopt($ch, CURLOPT_FILE, $fp);
            curl_setopt($ch, CURLOPT_HEADER, 0);
            curl_exec($ch);
            curl_close($ch);
            fclose($fp);

        }
        fclose($handle);
    }
}
?>
SajithNair
user3197269
  • Maybe the script just hangs... Reading multiple files and sending HTTP requests is quite a heavy and time-consuming process. – Leri Mar 04 '14 at 07:59
  • Try reading glob("*.csv") into a variable and then looping. Otherwise it will end up in an infinite loop. – SajithNair Mar 04 '14 at 08:03
  • @SajithNair No, it won't: http://stackoverflow.com/a/14854568/1283847 – Leri Mar 04 '14 at 08:08
  • Well, it loads the page and stops, so I think there is no infinite loop. – user3197269 Mar 04 '14 at 08:11
  • I think you should refer to this: http://stackoverflow.com/questions/6413762/reading-off-multiple-csv-files – Avinash Babu Mar 04 '14 at 08:17
  • But I don't know how many files there will be: 2 or 100. That's why I need something that will open and read all .csv files. – user3197269 Mar 04 '14 at 08:23
  • Well, I changed one line and it worked! The line is now: foreach (glob("./csv/files/*.*") as $filename ... Two hours lost because of this one line. Thank you, guys. – user3197269 Mar 04 '14 at 09:07
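
The fix in the last comment comes down to where glob() looks: a relative pattern such as "*.csv" is resolved against the current working directory, not against the folder that holds the data. A minimal sketch of the idea, assuming the files live in csv/files next to the script (the helper name listCsvFiles is hypothetical and not part of the original code):

```php
<?php
// Hypothetical helper: return every .csv file in $dir, or an empty array.
function listCsvFiles($dir)
{
    if (!is_dir($dir)) {
        return []; // wrong folder: nothing to read, and no error is raised
    }

    // glob() can return false on error, so normalise the result to an array.
    $files = glob(rtrim($dir, '/') . '/*.csv');

    return $files === false ? [] : $files;
}

// Anchor the folder to the script's own location instead of the working directory.
foreach (listCsvFiles(__DIR__ . '/csv/files') as $filename) {
    echo $filename, PHP_EOL;
}
```

An empty result from glob() is not an error, which is probably why the original script stopped silently instead of reporting a problem.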

1 Answer

This is a prime candidate for multi-threading. Here's some code to do it; Worker and Stackable come from the pthreads extension:

<?php
class WebWorker extends Worker {
    public function run() {}
}

class WebTask extends Stackable {

    public function __construct($input, $output) {
        $this->input  = $input;
        $this->output = $output;
        $this->copied = 0;
    }

    public function run() {
        $data = file_get_contents($this->input);
        if ($data) {
            file_put_contents(
                $this->output, $data);
            $this->copied = strlen($data);
        }
    }

    public $input;
    public $output;
    public $copied;
}

class WebPool {
    public function __construct($max) {
        $this->max = $max;
        $this->workers = [];
    }

    public function submit(WebTask $task) {
        // rand() is inclusive on both ends, so stop at max - 1
        // to create at most $max workers.
        $random = rand(0, $this->max - 1);

        if (isset($this->workers[$random])) {
            return $this->workers[$random]
                ->stack($task);
        } else {
            $this->workers[$random] = new WebWorker();
            $this->workers[$random]
                ->start();
            return $this->workers[$random]
                ->stack($task);
        }
    }

    public function shutdown() {
        foreach ($this->workers as $worker)
            $worker->shutdown();
    }

    protected $max;
    protected $workers;
}

$pool = new WebPool(8);
$work = [];
$start = microtime(true);

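foreach (glob("csv/*.csv") as $path) {
    $handle = fopen($path, "r");

    if ($handle) {
        while (($line = fgetcsv($handle, 0, ";")) !== false) {
            // Skip blank or incomplete lines (need input and output)
            if (count($line) < 2) {
                continue;
            }
            $wid = count($work);
            $work[$wid] = new WebTask(
                $line[0], $line[1]);
            $pool->submit($work[$wid]);
        }
        fclose($handle);
    }
}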

$pool->shutdown();
$runtime = microtime(true) - $start;

$total = 0;
foreach ($work as $job) {
    printf(
        "[%s] %s -> %s %.3f kB\n", 
        $job->copied ? "OK" : "FAIL",
        $job->input, 
        $job->output, 
        $job->copied/1024);
    $total += $job->copied;
}
printf( 
    "[TOTAL] %.3f kB in %.3f seconds\n", 
    $total/1024, $runtime);
?>

This creates a pool with a maximum number of threads. It then reads through a directory of semicolon-separated CSV files in which each line is input;output, and for each line it submits a task to the pool that reads the input and writes the output asynchronously, while the main thread continues to read CSV files.
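
To make the input;output line format concrete, here is a small sketch that parses such lines from a string. The function name parsePairs is hypothetical, and skipping blank or incomplete lines is an assumption about how bad input should be handled, not something the code above does:

```php
<?php
// Hypothetical helper: turn "input;output" lines into [input, output] pairs.
function parsePairs($text)
{
    $pairs = [];

    // \R matches any newline sequence (\n, \r\n, \r).
    foreach (preg_split('/\R/', $text) as $line) {
        if (trim($line) === '') {
            continue; // ignore blank lines
        }

        // Delimiter, enclosure and escape are passed explicitly.
        $fields = str_getcsv($line, ';', '"', '\\');

        if (count($fields) >= 2) {
            $pairs[] = [trim($fields[0]), trim($fields[1])];
        }
    }

    return $pairs;
}
```

Each pair returned this way maps directly onto the two constructor arguments of WebTask.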

I have used the simplest input/output functions, file_get_contents and file_put_contents, so that you can see how it works without cURL.
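
If you would rather keep cURL, as in the question, the body of a task could look roughly like the sketch below. The function name downloadTo, the 30 second timeout, and the choice to follow redirects are assumptions made for illustration; only standard cURL options are used:

```php
<?php
// Hypothetical helper: download $url to $path, returning true on success.
function downloadTo($url, $path)
{
    // Open the target first so a bad path fails before any request is made.
    $fp = @fopen($path, 'wb');
    if ($fp === false) {
        return false;
    }

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_FILE, $fp);            // stream the body into $fp
    curl_setopt($ch, CURLOPT_HEADER, 0);            // do not write headers to the file
    curl_setopt($ch, CURLOPT_FAILONERROR, true);    // treat HTTP >= 400 as a failure
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // assumed: follow redirects
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);          // assumed limit, in seconds

    $ok = curl_exec($ch);
    if ($ok === false) {
        error_log('Download failed: ' . curl_error($ch));
    }

    curl_close($ch);
    fclose($fp);

    return $ok !== false;
}
```

Unlike the loop in the question, this reports failures instead of ignoring the return value of curl_exec(). A failed download can still leave an empty or partial file at $path.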

The worker is selected at random when a task is submitted to the pool, which may not be desirable. It is possible to detect whether a worker is busy, but that would complicate the example.
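
A simpler alternative to random selection is round-robin, which spreads tasks evenly without asking workers whether they are busy. Note that this is a different technique from the busy detection mentioned above. The class name RoundRobinSlots is hypothetical; this is a plain PHP sketch that does not depend on pthreads:

```php
<?php
// Hypothetical helper: hands out worker slots 0 .. $max - 1 in turn.
final class RoundRobinSlots
{
    private $max;
    private $next = 0;

    public function __construct($max)
    {
        if ($max < 1) {
            throw new InvalidArgumentException('max must be at least 1');
        }
        $this->max = $max;
    }

    // Return the current slot, then advance and wrap around at $max.
    public function next()
    {
        $slot = $this->next;
        $this->next = ($this->next + 1) % $this->max;

        return $slot;
    }
}
```

In WebPool::submit(), the call to rand() could then be replaced with a call to next() on an instance created in the constructor, so that every worker receives roughly the same number of tasks.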

Further reading:

Leri
Joe Watkins