29

I'm having trouble locating the cause of a memory leak in my script. I have a simple repository method that increments a 'count' column on my entity by X amount:

public function incrementCount($id, $amount)
{
    $query = $this
        ->createQueryBuilder('e')
        ->update('MyEntity', 'e')
        ->set('e.count', 'e.count + :amount')
        ->where('e.id = :id')
        ->setParameter('id', $id)
        ->setParameter('amount', $amount)
        ->getQuery();

    $query->execute();
}

Problem is, if I call this in a loop the memory usage balloons on every iteration:

$entityManager = $this->getContainer()->get('doctrine')->getManager();
$myRepository = $entityManager->getRepository(MyEntity::class);
while (true) {
    $myRepository->incrementCount("123", 5);
    $doctrineManager->clear();
    gc_collect_cycles();
}

What am I missing here? I've tried ->clear(), as per Doctrine's advice on batch processing. I even tried gc_collect_cycles(), but still the issue remains.

I'm running Doctrine 2.4.6 on PHP 5.5.

Jonathan
  • 13,947
  • 17
  • 94
  • 123

7 Answers

61

I just ran into the same issue; these are the things that fixed it for me:

--no-debug

As the OP mentioned in their answer, setting --no-debug (ex: php bin/console <my_command> --no-debug) is crucial for performance/memory in Symfony console commands. This is especially true when using Doctrine: without it, Doctrine goes into debug mode, which consumes a huge amount of additional memory that grows on each iteration. See the Symfony docs here and here for more info.

--env=prod

You should also always specify the environment. By default, Symfony uses the dev environment for console commands. The dev environment usually isn't optimized for memory, speed, CPU, etc. If you want to iterate over thousands of items, you should probably be using the prod environment (ex: php bin/console <my_command> --env=prod). See here and here for more info.

Tip: I created an environment called console that I specifically configured for running console commands. Here is info about how to create additional Symfony environments.

php -d memory_limit=YOUR_LIMIT

If running a big update, you should probably choose how much memory is acceptable for it to consume. This is especially important if you think there might be a leak. You can specify the memory for the Command by using php -d memory_limit=x (ex: php -d memory_limit=256M). Note: you can set the limit to -1 (usually the default for the php cli) to let the command run with no memory limit but this is obviously dangerous.

A Well-Formed Console Command for Batch Processing

A well formed console command for running a big update using the above tips would look like:

php -d memory_limit=256M bin/console <acme>:<your_command> --env=prod --no-debug

Use Doctrine's IterableResult

Another huge win when using Doctrine's ORM in a loop is to use Doctrine's IterableResult (see the Doctrine Batch Processing docs). This won't help in the example provided, but usually this kind of processing is done over the results of a query.
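A minimal sketch of the idea (the entity name and variables here are illustrative, not from the OP's code):

```php
// Sketch: stream a large result set with iterate() instead of
// hydrating everything into memory at once.
$q = $em->createQuery('SELECT e FROM MyEntity e');

foreach ($q->iterate() as $row) {
    $entity = $row[0]; // iterate() yields rows of the form [index => entity]
    // ... process $entity ...
    $em->detach($entity); // drop it from the identity map to free memory
}
```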

Flush Periodically

If part of what you are doing is making changes to the data, you should flush periodically instead of on each iteration. Flushing is expensive and slow. The less often you flush, the faster your command will finish. Keep in mind, however, that Doctrine will hold the unflushed data in memory. So the less often that you flush, the more memory you will need.

You can use something like the following to flush every 100 iterations:

if ($count % 100 === 0) {
    $this->em->flush();
}

Also make sure to flush again at the end of your loop (for flushing the last < 100 entries).
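Putting those two pieces together, a batched loop would look roughly like this (the $items variable is illustrative):

```php
// Sketch: flush every 100 iterations, then once more after the loop
// to persist the final (< 100) unflushed changes.
$count = 0;
foreach ($items as $item) {
    // ... modify/persist entities here ...
    $count++;
    if ($count % 100 === 0) {
        $this->em->flush();
    }
}
$this->em->flush(); // flush the last partial batch
```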

Clear the EntityManager

You may also want to clear after you flush:

$this->em->flush();
$this->em->clear(); // Detach ALL objects from Doctrine.

Or

$this->em->flush();
$this->em->clear(MyEntity::class); // Detach all MyEntity objects from Doctrine.
$this->em->clear(MyRelatedEntity::class); // Detach all MyRelatedEntity objects from Doctrine.

Output the memory usage as you go

It can be really helpful to keep track of how much memory your command is consuming while it is running. You can do that by outputting the value returned by PHP's built-in memory_get_usage() function.

$output->writeln(memory_get_usage());

Example

$memUse = round(memory_get_usage() / 1000000, 2).'MB';
$this->output->writeln('Processed '.$i.' of '.$totalCount.' (mem: '.$memUse.')');

Roll Your Own Batches

It may also be helpful to roll your own batches. You can do this by using a start offset and limit, just like you would for pagination. I was able to process 4 million rows using only 90MB of RAM this way.

Here's some example code:


protected function execute(InputInterface $input, OutputInterface $output) {
    /* ... */
    $totalCount = $this->getTotalCount();
    $batchSize = 10000;
    $i = 0;
    while ($i < $totalCount) {
        $i = $this->processBatch($i, $batchSize, $totalCount);
    }
    /* ... */
}

private function processBatch(int $start, int $limit, int $totalCount): int {
    /* @var $q \Doctrine\ORM\Query */
    $q = $this->em->createQueryBuilder()
        ->select('e')
        ->from('AcmeExampleBundle:MyEntity', 'e')
        ->setFirstResult($start)
        ->setMaxResults($limit)
        ->getQuery();

    /* @var $iterableResult \Doctrine\ORM\Internal\Hydration\IterableResult */
    $iterableResult = $q->iterate(null, \Doctrine\ORM\Query::HYDRATE_SIMPLEOBJECT);

    $i = $start;
    foreach ($iterableResult as $row) {
        /* @var $myEntity \App\Entity\MyEntity */
        $myEntity = $row[0];

        $this->processOne($myEntity);

        if (0 === ($i % 1000)) {
            $memUse = round(memory_get_usage() / 1000000, 2).'MB';
            $this->output->writeln('Processed '.$i.' of '.$totalCount.' (mem: '.$memUse.')');
        }
        $this->em->detach($myEntity);
        $i++;
    }

    return $i;
}

private function processOne(MyEntity $myEntity): void {
    // Do entity processing here.
}

private function getTotalCount(): int {
    /* @var $q \Doctrine\ORM\Query */
    $q = $this->em
        ->createQueryBuilder()
        ->select('COUNT(e.id)')
        ->from('AcmeExampleBundle:MyEntity', 'e')
        ->getQuery();

    return (int) $q->getSingleScalarResult();
}

Good luck!

Collin Krawll
  • 2,210
  • 17
  • 15
  • 3
    for me $this->entityManager->getConnection()->getConfiguration()->setSQLLogger(null); fixed the issue – max4ever Feb 25 '20 at 14:41
33

I resolved this by adding --no-debug to my command. It turns out that in debug mode, the profiler was storing information about every single query in memory.

Jonathan
  • 13,947
  • 17
  • 94
  • 123
  • 2
    Every doctrine query was increasing memory usage by about 4k. This fixed it for me. – roberttstephens May 14 '15 at 20:00
  • Where exactly should I add the `--no-debug`? Can you please specify? –  Feb 03 '16 at 15:01
  • 2
    http://stackoverflow.com/a/10913115/3757139 shows how you can disable the SQL logging programmatically with `$em->getConnection()->getConfiguration()->setSQLLogger(null);` – Samuel Jan 02 '17 at 17:20
  • I thought `doctrine.dbal.connections.main.logging: false` configuration will have the same effect as `--no-debug`, but as it turns out it is not. I have memory leak even with `logging: false` in doctrine config. – Nuryagdy Mustapayev May 02 '22 at 12:38
14

Doctrine keeps a log of every query you make. If you make lots of queries (as normally happens in loops), Doctrine can cause a huge memory leak.

You need to disable the Doctrine SQL Logger to overcome this.

I recommend doing this only for the loop part.

Before loop, get current logger:

$sqlLogger = $em->getConnection()->getConfiguration()->getSQLLogger();

And then disable the SQL Logger:

$em->getConnection()->getConfiguration()->setSQLLogger(null);

Do loop here: foreach() / while() / for()

After loop ends, put back the Logger:

$em->getConnection()->getConfiguration()->setSQLLogger($sqlLogger);
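Putting the three steps together, the loop would be wrapped like this (the $items variable is illustrative):

```php
$config = $em->getConnection()->getConfiguration();

// 1. Save the current logger and disable SQL logging for the loop.
$sqlLogger = $config->getSQLLogger();
$config->setSQLLogger(null);

// 2. Run the query-heavy loop.
foreach ($items as $item) {
    // ... queries here ...
}

// 3. Restore the logger afterwards.
$config->setSQLLogger($sqlLogger);
```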

mFlorin
  • 476
  • 4
  • 7
13

For me it was clearing Doctrine or, as the documentation says, detaching all entities:

$this->em->clear(); //Here em is the entity manager.

So inside my loop I flush every 1000 iterations and detach all entities (I don't need them anymore):

foreach ($reader->getRecords() as $position => $value) {
    $this->processValue($value, $position);
    if ($position % 1000 === 0) {
        $this->em->flush();
        $this->em->clear();
    }
    $this->progress->advance();
}

Hope this helps.

PS: here's the documentation.

amcastror
  • 528
  • 5
  • 15
  • this works like a charm when you don't have complex references and don't create entities that are connected to a single object. But after clear() you can always recreate a reference, or a partial reference if that is enough for your batch processing. – Eloar Feb 25 '21 at 21:59
  • This was the solution to all my problems. And I only read data, never wrote any. – Fels Jul 01 '21 at 14:28
  • `$this->em->clear(MyClass::class);` was helpful to clear only the entities of a class, since i wanted to keep some of another class around in the manager. I also needed the --no-debug, as mentioned in other solutions. – Brad G Oct 26 '21 at 02:23
6

You're wasting memory on each iteration. A much better way would be to prepare the query once and swap the arguments many times. For example:

class MyEntityRepository extends EntityRepository
{
    private $updateQuery = null;

    public function incrementCount($id, $amount)
    {
        if ($this->updateQuery === null) {
            $this->updateQuery = $this->createQueryBuilder('e')
                ->update('MyEntity', 'e')
                ->set('e.count', 'e.count + :amount')
                ->where('e.id = :id')
                ->getQuery();
        }

        $this->updateQuery
            ->setParameter('id', $id)
            ->setParameter('amount', $amount)
            ->execute();
    }
}

As you mentioned, you can employ batch processing here, but try this out first and see how well (if at all) it performs...

Jovan Perovic
  • 19,846
  • 5
  • 44
  • 85
  • Thanks for the answer, but unfortunately this makes no difference to the memory leak. I expect this is because the `$updateQuery` variable is already being cleaned up by the garbage collector. Nonetheless your suggestion should make things slightly faster, so I will implement it. Any other ideas?! – Jonathan Oct 28 '14 at 22:23
  • I didn't expect that. How many iterations are we talking about here? – Jovan Perovic Oct 28 '14 at 22:49
  • Up to 10,000 iterations. – Jonathan Oct 28 '14 at 22:58
  • Ok, I think that calls for a native query (via plain `Connection` object). See this: http://stackoverflow.com/questions/3325012/execute-raw-sql-using-doctrine-2 – Jovan Perovic Oct 29 '14 at 08:23
2

I had similar issues with a memory leak. I'm running Doctrine in a Symfony 5.2 project. More specifically, I built a never-ending command which processes entries from one table, retrieves entries from another table, and creates 2 new entries in other tables. (Event processing.)

I solved my memory leak in two steps.

  1. I use --no-debug when running the command (as already suggested by Jonathan)
  2. I added $this->entityManager->clear(); at the end of the loop

In order to see and identify the leaks, I used the following line to output the current memory usage:

$output->writeln('Memory Usage in MB: ' . memory_get_usage() / 1024 / 1024);

Maybe this helps anyone still fighting memory leaks.

dj_thossi
  • 774
  • 1
  • 9
  • 18
1

I encountered the same issue and disabling the query cache helped me.

$query = $this
    ->createQueryBuilder('e')
    ->update('MyEntity', 'e')
    ->set('e.count', 'e.count + :amount')
    ->where('e.id = :id')
    ->setParameter('id', $id)
    ->setParameter('amount', $amount)
    ->getQuery()
    ->useQueryCache(false); // <-- this line

Milan Miscevic
  • 146
  • 1
  • 4