I have a symfony 1.4 task that opens a CSV file with more than 200,000 lines using SplFileObject.

This is the code responsible for opening and reading the file:

//open the CSV file for read
      $this->file = new SplFileObject($this->filePath, 'r');
      $this->file->setFlags(SplFileObject::READ_CSV);
      $this->file->setCsvControl($this->delimiter);
      $this->file->seek($this->options['limit']);

      $this->limit = $this->options['limit'];

      //read the file
      $i = $this->limit + 1;
      foreach ($this->file as $row)
      {
        if ($this->file->valid())
        {
          //initialize invoices and clients object
          $this->initInvoicesArray($row);
          $this->prepareInvoice();

          //clear the row
          unset($row);

          $this->key = $this->file->key();

          //add msisdn to msisdn array, and client to clients array
          self::$InvoicesArray[$this->msisdn] = $this->client;

          //get valid users from table, and save to payment_notifications
          $this->prepareAndSaveValidClients();

          if ($i % $this->options['count'] === 0)
          {
            //launch process again with new $key
            $this->logSection('Payment Notification', $i. 'Lines Processed, Memory Usage : ' . memory_get_usage());
            $this->log($this->fileSys->execute("./symfony mediapi:send-payment-notifications --application=" . $this->options['application']
                    . " --env=" . $this->options['env'] .
                    " --limit=" . $i .
                    ' --count=' . $this->options['count']));

          }
        }

        $i++;
      }

As you can see, I launch the task again as a new process each time the line counter reaches a multiple of the count option (1000 lines here). The limit option determines the line in the file at which the current process must start.

prepareAndSaveValidClients() validates the data with a SELECT, prepares the objects and saves them to the database. Here is the function:
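
One detail worth checking in the loop above: `foreach` rewinds an iterator before it starts, so a `seek()` issued just before the loop is likely discarded and each process would read from line 0 again. A minimal sketch of one way to start at an offset is to wrap the file in a `LimitIterator`. The helper name `readCsvChunk` and the demo file are assumptions for illustration, not part of the original task:

```php
<?php
// Hypothetical helper: return $count CSV rows starting at 0-based line $offset.
function readCsvChunk($path, $offset, $count, $delimiter = ',')
{
    $file = new SplFileObject($path, 'r');
    $file->setFlags(SplFileObject::READ_CSV);
    // Enclosure and escape are passed explicitly to avoid relying on defaults.
    $file->setCsvControl($delimiter, '"', '\\');

    $rows = array();
    // SplFileObject is a SeekableIterator, so LimitIterator seeks to $offset
    // itself and stops after $count rows.
    foreach (new LimitIterator($file, $offset, $count) as $row) {
        $rows[] = $row;
    }

    return $rows;
}

// Demo data (assumption): a 10-line CSV written to a temporary file.
$path = tempnam(sys_get_temp_dir(), 'csv');
$lines = '';
for ($i = 0; $i < 10; $i++) {
    $lines .= $i . ',row' . $i . "\n";
}
file_put_contents($path, $lines);

$chunk = readCsvChunk($path, 3, 4);
unlink($path);
```

In the task, `$offset` would correspond to the limit option and `$count` to the count option, so a process reads only its own slice and can exit once the slice is done.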

  public function prepareAndSaveValidClients() {

    //get valid clients from database
    $this->query = Doctrine_Query::create()
        ->select('p.gender, p.email2, u.username, u.first_name, u.last_name, u.email_address, u.is_active, p.msisdn, p.user_id, p.city_id, p.street, p.zipcode, p.msisdn_status')
        ->from('sfGuardUser u')
        ->innerJoin('u.Profile as p ON u.id = p.user_id')
        ->whereIn('p.msisdn', $this->msisdn)
        ->whereIn('p.status', self::$AllowedStatus);

    //fetch valid clients result
    $this->results = $this->query->fetchArray();

    //unset the query to free memory
    $this->query->free(true);
    unset($this->query);

    //instantiate an object collection for payment_notifications
    $collection = new Doctrine_Collection("payment_notifications");

    //save valid clients in payment_notifications table
    if (!empty($this->results))
    {
      foreach ($this->results as $key => $client)
      {
        $invoice = self::$InvoicesArray[$client['Profile']['msisdn']];

        //prepare invoices
        $this->initInvoicesArray($invoice);
        $this->prepareInvoice();

        //prepare userprofile
        $this->prepareUserProfile($client);
        $this->prepareClient();

        $paymentNotifications = new paymentNotifications();
        $paymentNotifications->fromArray($this->client);

        $collection->add($paymentNotifications);

        //clear $client and $paymentNotifications
        unset($client);
        unset($paymentNotifications);
      }

      //save the collection
      $collection->save();

      //freeing the collection
      $collection->free(true);

      //clear memory
      $this->results = array();
      unset($collection);

      self::$MsisdnArray = array();
      self::$InvoicesArray = array();

    }
  }

The problem is that the value reported by memory_get_usage() increases from one process to the next:

Process 1 : 1000 Lines Processed, Memory Usage : 40947336
Process 2 : 2000 Lines Processed, Memory Usage : 69401208
Process 3 : 3000 Lines Processed, Memory Usage : 98444272
Process 4 : 4000 Lines Processed, Memory Usage : 126308156
...

How can I start a new process and free the memory used by the previous one?
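
To narrow down whether memory grows inside a single process because of objects that reference each other, one option is to force a cycle collection and compare memory_get_usage() before and after. The sketch below only illustrates the mechanism; the `Node` class is hypothetical and stands in for any pair of objects holding references to each other:

```php
<?php
// Hypothetical class used only to build a reference cycle.
class Node
{
    public $peer;
}

function makeCycle()
{
    $a = new Node();
    $b = new Node();
    $a->peer = $b;
    $b->peer = $a; // $a and $b now reference each other
}                  // both leave scope here, but refcounting alone cannot free them

gc_enable();
gc_collect_cycles(); // start from a clean state

$before = memory_get_usage();
for ($i = 0; $i < 100; $i++) {
    makeCycle();
}
$withCycles = memory_get_usage();

// Ask the cycle collector to reclaim the unreachable objects.
$collected = gc_collect_cycles();
$after = memory_get_usage();

echo "before: $before, with cycles: $withCycles, after gc: $after\n";
```

If calling `gc_collect_cycles()` after each batch makes no difference in the real task, the growth is probably held by live references (for example static arrays or an ORM's internal tables) rather than by uncollected cycles.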

Thank you

SmootQ
  • "and free the previous process memory usage" - why do you think it's not freed? – Karoly Horvath Apr 16 '15 at 11:00
  • Thanks for the question. I used the memory_get_usage() function to determine the memory usage of the current process... as you can see, it's increasing. The task ends with a "memory exhausted" fatal error, and only 10% of lines are not executed. – SmootQ Apr 16 '15 at 11:02
  • "the Memory Usage of the **current** process" – Karoly Horvath Apr 16 '15 at 11:03
  • I can only assume that this may have something to do with Doctrine and possibly with circular references, too. You have to debug your code deeper to find out where and why objects are not being destroyed. – Aleksander Wons Apr 16 '15 at 11:20
  • @KarolyHorvath I don't mean "of" the current process only, but "of" the current process plus the previous processes... It seems that memory is not freed. – SmootQ Apr 16 '15 at 12:07
  • @awons, thank you... I will debug and see. – SmootQ Apr 16 '15 at 12:07
  • You should definitely look at `pcntl_fork`, here is a snippet about that: https://gist.github.com/pborreli/396110 – j0k Apr 22 '15 at 11:58
  • I'd take a look at this: http://stackoverflow.com/questions/584960/whats-better-at-freeing-memory-with-php-unset-or-var-null. In my experience, creating Doctrine objects when working with large datasets isn't a good idea, as you are finding out. I'd recommend using SQL for your inserts, which will require more coding, but allow dramatically lower memory usage. – moote Apr 22 '15 at 16:55
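
The last suggestion (plain SQL inserts instead of Doctrine objects) could look roughly like the sketch below. It is a minimal illustration using PDO with an in-memory SQLite database; the helper name `insertNotifications` and the `msisdn` and `amount` columns are assumptions, since the real payment_notifications schema is not shown in the question:

```php
<?php
// Hypothetical helper: insert one batch of rows with a single prepared
// statement inside one transaction, without building ORM objects.
function insertNotifications(PDO $pdo, array $rows)
{
    // Column names are assumed for illustration.
    $stmt = $pdo->prepare(
        'INSERT INTO payment_notifications (msisdn, amount) VALUES (:msisdn, :amount)'
    );

    $pdo->beginTransaction();
    try {
        foreach ($rows as $row) {
            $stmt->execute(array(
                ':msisdn' => $row['msisdn'],
                ':amount' => $row['amount'],
            ));
        }
        $pdo->commit();
    } catch (Exception $e) {
        // Undo the partial batch and let the caller decide what to do.
        $pdo->rollBack();
        throw $e;
    }

    return count($rows);
}

// Demo setup (assumption): an in-memory SQLite table standing in for the real one.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE payment_notifications (msisdn TEXT, amount REAL)');

$inserted = insertNotifications($pdo, array(
    array('msisdn' => '212600000001', 'amount' => 120.5),
    array('msisdn' => '212600000002', 'amount' => 89.0),
));
```

Each row here is a plain array that can be discarded after the batch, so memory use should stay roughly flat per batch instead of growing with the number of record objects created.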
