19

I have written a Symfony command to import some data from an API. It works but the problem is my PHP memory usage increases when I insert a big JSON in my database. And my unitOfWork increases by '2' to after each activty import.

I have already unset all my used objects, and I have read the documentation of Symfony2 when you want to do massive batch: http://www.doctrine-project.org/blog/doctrine2-batch-processing.html

But when I use $em->clear() my entity manager gives this error:

Notice: Undefined index: 000000007b56ea7100000000e366c259 in path-to-application\vendor\doctrine\lib\Doctrine\ORM\UnitOfWork.php line 2228

Here is my complete code :

 /**
 * @see Command
 */
protected function configure() {
  $this
    ->setName('ks:user:runkeepersync')
    ->setDescription('Synchroniser les activités d\'un utilisateur runkeeper')
    ->setDefinition(array(
      new InputArgument('access_token', InputArgument::REQUIRED, 'Access token'),
    ))
}

/**
 * @see Command
 */
protected function execute(InputInterface $input, OutputInterface $output) {
  $accessToken = $input->getArgument('access_token');
  $em = $this->getContainer()->get('doctrine')->getEntityManager();
  $UserHasServices = $em->getRepository('KsUserBundle:UserHasServices')->findOneByToken($accessToken);
  if (!is_object($UserHasServices) ) {
    echo "Impossible de trouver l'utilisateur qui possède le jeton ".$accessToken."";
  }
  $user    = $UserHasServices->getUser();
  $service = $UserHasServices->getService();
  echo "avant de requérir l'api : ".memory_get_usage()."\n";
  try {
    $rkApi = $this->getContainer()->get('ks_user.runkeeper');
    $rkApi->setAccessToken($accessToken);
    $activities  = $rkApi->getFitnessActivities(0,25);
    $nbParPages  = 25;
    $nomberActivitites = $activities->size;
    $aActivities = $activities->items;
    $nbPages =  floor ($nomberActivitites/$nbParPages);
    $aEndurance = array("Running", "Cycling", "Mountain Biking", "Walking", "Hiking", "Downhill Skiing", "Cross-Country Skiing", "Snowboarding", "Skating","Wheelchair", "Rowing", "Elliptical", "Other");
    $aEnduranceUnderWater = array("Swimming");
    $enduranceOnEarthType = $em->getRepository('KsActivityBundle:SportType')->findOneByLabel("endurance");
    if (!is_object($enduranceOnEarthType) ) {
      echo "Impossible de trouver le type de sport d'endurance";
    }
    $enduranceUnderWaterType = $em->getRepository('KsActivityBundle:SportType')->findOneByLabel("endurance_under_water");
    if (!is_object($enduranceUnderWaterType) ) {
      echo "Impossible de trouver le type de sport d'endurance sous l'eau ";
    }
    echo "Après avoir récupéré 25 activités : ".memory_get_usage()."\n";
    $a = 0;
    for($i=0;$i<=$nbPages;$i++){
      if($i!=0){
        $activities  = $rkApi->getFitnessActivities($i,25);
        $aActivities = $activities->items;
      }
      foreach ($aActivities as $activity) {
        $a = $a+1;
        $codeSport = $this->formatNameSport($activity->type);
        $sport = $em->getRepository('KsActivityBundle:Sport')->findOneByCodeSport($codeSport);
        if (!is_object($sport) ) {
          $sport = new \Ks\ActivityBundle\Entity\Sport();
          $sport->setLabel($codeSport);
          $sport->setCodeSport($codeSport);
          $sport->setSportType($enduranceOnEarthType);
          $em->persist($sport);
          $em->flush();
        }
        $activityDetail = json_decode($rkApi->requestJSONHealthGraph($activity->uri));
        if(in_array($activity->type, $aEndurance)){
          $urlActivitieDetail = $activityDetail->activity;
          $ActivitySessionEnduranceOnEarth = new \Ks\ActivityBundle\Entity\ActivitySessionEnduranceOnEarth($user);
          isset($activity->total_distance)? $ActivitySessionEnduranceOnEarth->setDistance($activity->total_distance) : "";
          isset($activity->duration)? $ActivitySessionEnduranceOnEarth->setDuration($this->secondesToTimeDuration($activity->duration)) : "";
          isset($activity->start_time)?  $ActivitySessionEnduranceOnEarth->setIssuedAt(new \DateTime($activity->start_time)) : "";
          $ActivitySessionEnduranceOnEarth->setModifiedAt(new \DateTime('Now'));
          $ActivitySessionEnduranceOnEarth->setSport($sport);
          isset($activityDetail->total_calories)?  $ActivitySessionEnduranceOnEarth->setCalories($activityDetail->total_calories) : "";
          isset($activityDetail->climb)?  $ActivitySessionEnduranceOnEarth->setElevationGain($activityDetail->climb) : "";
          $maxElevation = 0;
          $minElevation = 10000;
          if(isset($activityDetail->path)){
            foreach($activityDetail->path as $gpsPoint){
              if($gpsPoint->altitude > $maxElevation){
                $maxElevation = $gpsPoint->altitude;
              }
              if($gpsPoint->altitude < $minElevation){
                $minElevation = $gpsPoint->altitude;
              }
            }
            $ActivitySessionEnduranceOnEarth->setElevationMin($minElevation);
            $ActivitySessionEnduranceOnEarth->setElevationMax($maxElevation);
          }
          $em->persist($ActivitySessionEnduranceOnEarth);
          $em->flush();
          //Pour chaque activité on a un identifiant relatif au service qu'on synchronise
          $ActivityComeFromService = new \Ks\ActivityBundle\Entity\ActivityComeFromService();
          $ActivityComeFromService->setActivity($ActivitySessionEnduranceOnEarth);
          $ActivityComeFromService->setService($service);
          $ActivityComeFromService->setIdWebsiteActivityService($activity->uri);
          $ActivityComeFromService->setSourceDetailsActivity($rkApi->requestJSONHealthGraph($activity->uri));
          $ActivityComeFromService->setTypeSource("JSON");
          $em->persist($ActivityComeFromService);
          $em->flush();
          echo "Import de l'activite num ".$a." type :".$activity->type." effectue avec success \n";
          unset($ActivitySessionEnduranceOnEarth);
          unset($ActivityComeFromService);
          echo "UnitOFWOrk -> ".$em->getUnitOfWork()->size()."\n";
        }
        if(in_array($activity->type, $aEnduranceUnderWater)){
          $ActivitySessionEnduranceUnderWater = new \Ks\ActivityBundle\Entity\ActivitySessionEnduranceUnderWater($user);
          isset($activity->total_distance)? $ActivitySessionEnduranceUnderWater->setDistance($activity->total_distance) : "";
          isset($activity->duration)? $ActivitySessionEnduranceUnderWater->setDuration($this->secondesToTimeDuration($activity->duration)) : "";
          isset($activity->start_time) && !empty($activity->start_time)?  $ActivitySessionEnduranceUnderWater->setIssuedAt(new \DateTime($activity->start_time)) : "";
          $ActivitySessionEnduranceUnderWater->setModifiedAt(new \DateTime('Now'));
          $ActivitySessionEnduranceUnderWater->setSport($sport);
          isset($activityDetail->total_calories)?  $ActivitySessionEnduranceUnderWater->setCalories($activityDetail->total_calories) : "";
          isset($activityDetail->notes)?  $ActivitySessionEnduranceUnderWater->setDescription($activityDetail->notes) : "";
          $em->persist($ActivitySessionEnduranceUnderWater);
          $em->flush();
          $ActivityComeFromService = new \Ks\ActivityBundle\Entity\ActivityComeFromService();
          $ActivityComeFromService->setActivity($ActivitySessionEnduranceUnderWater);
          $ActivityComeFromService->setService($service);
          $ActivityComeFromService->setIdWebsiteActivityService($activity->uri);
          $ActivityComeFromService->setSourceDetailsActivity($rkApi->requestJSONHealthGraph($activity->uri));
          $ActivityComeFromService->setTypeSource("JSON");
          $em->persist($ActivityComeFromService);
          $em->flush();
          echo "Import de l'activité num ".$a." type :".$activity->type." effectué avec succès\n";
          unset($ActivitySessionEnduranceUnderWater);
          unset($ActivityComeFromService);
        }
        echo "Après chaque activité : ".memory_get_usage()."\n";
        unset($sport);
        unset($activityDetail);
        $em->clear();
      }
    }
  } catch (\Exception $e) {
    throw $e;
  }
}

Thanks, @AdrienBrault. I have tested with --env=prod --no-debug, and it is true that it consumes less memory, but the memory still increase. How can I really clear the entity manager? and stabilize the memory?

halfer
  • 19,824
  • 17
  • 99
  • 186
psylo66
  • 608
  • 11
  • 26
  • The first step to decrease memory usage on commands is to run them on the prod environment and with debug disabled : `php app/console command --env=prod --no-debug` – AdrienBrault Jul 09 '12 at 10:51
  • 1
    Thanks, @AdrienBrault I have tested with --env=prod --no-debug, and it is true that it consumes less memory, but the memory still increase ... How can i do to really clear the entity manager ? and stabilize the memory ? – psylo66 Jul 09 '12 at 12:29
  • @Hosh, why not add a comment under the answer you do not like, explaining why it is not a good solution? They may be willing to offer further help or advice. – halfer Jul 27 '14 at 22:07
  • My bad, allow me to explain. My problem with the current answer is it's not a fix, it's a workaround. I don't see the need to reset the manager. There's clearly a different issue going on that is breaking Doctrine. While resetting this (if it works at all, I've not tried it) is fine as a workaround, it's not a fix. Anyway, doing this in production sounds like a really bad idea. Hope that makes things clearer! – GManz Jul 27 '14 at 23:58
  • try to remove usage of ```unset``` from your code. – Alexey B. Jul 28 '14 at 07:18

2 Answers2

10

Symfony logs all SQL queries in dev environment, so first you need to disable it

// disable logger
$em->getConnection()->getConfiguration()->setSQLLogger(null);

You may use event listeners on entities, it may also increase memory usage. You can disable them like so

// remove all listeners
foreach ($em->getEventManager()->getListeners() as $event => $listeners) {
    foreach ($listeners as $listener) {
        $em->getEventManager()->removeEventListener($event, $listener);
    }
}

Remove unset from your code, there is no need for them, as you clear entity manager every step of your loop.

// save and clear
$em->flush();
$em->getUnitOfWork()->clear();

Remember that doctrine can optimize your queries, and improve perfomance if you group queries into one flush. So the best practice would be to execute flush once over some parts of your data. For example:

// collect 100 entities and then save them
if (($i % 100) == 0) {
    $em->flush();
    $em->getUnitOfWork()->clear();
}
Th. Ma.
  • 9,432
  • 5
  • 31
  • 46
Alexey B.
  • 11,965
  • 2
  • 49
  • 73
  • This is not exactly the solution. But your solution led me to the solution. It turns out one of the event listeners I was overriding (by overloading the class parameter for it) was doing something that caused this. Having removed that definition (and manually only removing that event listener) solved this issue. I will give you the bounty. – GManz Aug 03 '14 at 22:43
  • How did you get rid of "undefined index" ? And can you write code for the extended solution you talked about ? – Cyril CHAPON Dec 13 '16 at 10:16
  • also with getUntiOfWork instead of the plain clear call i get the undefined index - Anybody come up with a real solution? – user3714751 Nov 18 '20 at 12:09
1

Try to reset the entity manager with:

$this->getContainer()->get('doctrine')->resetEntityManager();

and then:

$em = $this->getContainer()->get('doctrine')->getEntityManager();
Besnik
  • 6,469
  • 1
  • 31
  • 33