6

I have table with 100 000+ rows, and I want to select all of it in doctrine and to do some actions with each row, in symfony2 with doctrine I try to do with this query:

    $query = $this->getDefaultEntityManager()
        ->getRepository('AppBundle:Contractor')
        ->createQueryBuilder('c')
        ->getQuery()->iterate();

    foreach ($query as $contractor) {
        // doing something
    }

but then I get memory leak, because I think It wrote all data in memory.

I have more experience in ADOdb, in that library when I do so:

$result = $ADOdbObject->Execute('SELECT * FROM contractors');
   while ($arrRow = $result->fetchRow()) {
        // do some action
   }

I do not get any memory leak.

So how to select all data from the table and do not get memory leak with doctrine in symfony2 ?

Question EDIT

When I try to delete foreach and just do iterate, I also get memory leak:

$query = $this->getDefaultEntityManager()
            ->getRepository('AppBundle:Contractor')
            ->createQueryBuilder('c')
            ->getQuery()->iterate();
Donatas Veikutis
  • 974
  • 2
  • 15
  • 36

3 Answers3

6

The normal approach is to use iterate().

$q = $this->getDefaultEntityManager()->createQuery('select u from AppBundle:Contractor c');
$iterableResult = $q->iterate();
foreach ($iterableResult as $row) {
    // do something
}

However, as the doctrine documentation says this can still result in errors.

Results may be fully buffered by the database client/ connection allocating additional memory not visible to the PHP process. For large sets this may easily kill the process for no apparant reason.

The easiest approach to this would be to simply create smaller queries with offsets and limits.

//get the count of the whole query first
$qb = $this->getDefaultEntityManager();
$qb->select('COUNT(u)')->from('AppBundle:Contractor', 'c');
$count = $qb->getQuery()->getSingleScalarResult();

//lets say we go in steps of 1000 to have no memory leak
$limit = 1000;
$offset = 0;

//loop every 1000 > create a query > loop the result > repeat
while ($offset < $count){
    $qb->select('u')
        ->from('AppBundle:Contractor', 'c')
        ->setMaxResults($limit)
        ->setFirstResult($offset);
    $result = $qb->getQuery()->getResult();
    foreach ($result as $contractor) {
        // do something
    }
    $offset += $limit;
}

With this heavy datasets this will most likely go over the maximum execution time, which is 30 seconds by default. So make sure to manually change set_time_limit in your php.ini. If you just want to update all datasets with a known pattern, you should consider writing one big update query instead of looping and editing the result in PHP.

oshell
  • 8,923
  • 1
  • 29
  • 47
  • 1
    Hi @Hosch great suggestion! You can avoid the maximum execution time on a single method overriding the value with the call `set_time_limit(0)` as described in the [doc](http://php.net/manual/en/function.set-time-limit.php) – Matteo Oct 21 '15 at 09:49
  • with iterate I still get memory leak, but second offer is ok :) – Donatas Veikutis Oct 22 '15 at 07:24
  • Yeah, I had the same problem once on cronjob and solved it with this kind of loop. – oshell Oct 22 '15 at 07:32
0

Try using this approach:

foreach ($query as $contractor) {
    // doing something

    $this->getDefaultEntityManager()->detach($contractor);
    $this->getDefaultEntityManager()->clear($contractor);
    unset($contractor); // tell to the gc the object is not in use anymore

}

Hope this help

Matteo
  • 37,680
  • 11
  • 100
  • 115
0

If you really need to get all the records, I'd suggest you to use database_connection directly. Look at its interface and choose method which won't load all the data into memory (and won't map the records to your entity).

You could use something like this (assuming this code is in controller):

$db = $this->get('database_connection');
$query = 'select * from <your_table>';
$sth = $db->prepare($query);
$sth->execute();
while($row = $sth->fetch()) {
    // some stuff
}

Probably it's not what you need because you might want to have objects after handling all the collection. But maybe you don't need the objects. Anyway think about this.

Alex
  • 1,073
  • 9
  • 20