4

Original question

I am doing a fetch join in Doctrine, using a join table that has a composite key and no other defined fields, and getting incorrect data loaded into my entity. Is this a bug in Doctrine, or am I doing something wrong?

What follows is a simple example that illustrates the problem.

Create three entities:

/** 
 * @ORM\Entity
 * @ORM\Table(name="driver")
 */  
class Driver
{   
    /** 
     * @ORM\Id
     * @ORM\Column(type="integer")
     * @ORM\GeneratedValue(strategy="AUTO")
     */  
    private $id;

    /** 
     * @ORM\Column(type="string", length=255);
     */  
    private $name;

    /** 
     * @ORM\OneToMany(targetEntity="DriverRide", mappedBy="driver")
     */  
    private $driverRides;

    function getId() { return $this->id; } 
    function getName() { return $this->name; } 
    function getDriverRides() { return $this->driverRides; }
}

/**
 * @ORM\Entity
 * @ORM\Table(name="driver_ride")
 */
class DriverRide
{
    /**
     * @ORM\Id
     * @ORM\ManyToOne(targetEntity="Driver", inversedBy="driverRides")
     * @ORM\JoinColumn(name="driver_id", referencedColumnName="id")
     */
    private $driver;

    /**
     * @ORM\Id
     * @ORM\ManyToOne(targetEntity="Car", inversedBy="carRides")
     * @ORM\JoinColumn(name="car", referencedColumnName="brand")
     */
    private $car;

    function getDriver() { return $this->driver; }
    function getCar() { return $this->car; }
}

/**
 * @ORM\Entity
 * @ORM\Table(name="car")
 */
class Car
{
    /**
     * @ORM\Id
     * @ORM\Column(type="string", length=25)
     * @ORM\GeneratedValue(strategy="NONE")
     */
    private $brand;

    /**
     * @ORM\Column(type="string", length=255);
     */
    private $model;

    /**
     * @ORM\OneToMany(targetEntity="DriverRide", mappedBy="car")
     */
    private $carRides;

    function getBrand() { return $this->brand; }
    function getModel() { return $this->model; }
    function getCarRides() { return $this->carRides; }
}

Populate the corresponding database tables with this data:

INSERT INTO driver (id, name) VALUES (1, 'John Doe');

INSERT INTO car (brand, model) VALUES ('BMW', '7 Series');
INSERT INTO car (brand, model) VALUES ('Crysler', '300');
INSERT INTO car (brand, model) VALUES ('Mercedes', 'C-Class');
INSERT INTO car (brand, model) VALUES ('Volvo', 'XC90');
INSERT INTO car (brand, model) VALUES ('Dodge', 'Dart');

INSERT INTO driver_ride (driver_id, car) VALUES (1, 'Crysler');
INSERT INTO driver_ride (driver_id, car) VALUES (1, 'Mercedes');
INSERT INTO driver_ride (driver_id, car) VALUES (1, 'Volvo');
INSERT INTO driver_ride (driver_id, car) VALUES (1, 'BMW');
INSERT INTO driver_ride (driver_id, car) VALUES (1, 'Dodge');

Use this code to hydrate a Driver entity and display its contents:

$qb = $em->createQueryBuilder();
$driver = $qb->select('d, dr, c')
   ->from('Driver', 'd')
   ->leftJoin('d.driverRides', 'dr')
   ->leftJoin('dr.car', 'c')
   ->where('d.id = 1')
   ->getQuery()->getSingleResult();

print '<p>' . $driver->getName() . ':';
foreach ($driver->getDriverRides() as $ride) {
    print '<br>' . $ride->getCar()->getBrand() . ' ' . $ride->getCar()->getModel();
}

Expected output:

John Doe:
BMW 7 Series
Crysler 300
Dodge Dart
Mercedes C-Class
Volvo XC90


Actual output:

John Doe:
BMW 7 Series
Dodge Dart
Dodge Dart
Volvo XC90
Volvo XC90

There is a strange duplication going on here with the associated entities. Specifically, child entity #3 is duplicated as #2, #5 is duplicated as #4, etc., and child entity #2, #4, etc. are not loaded at all. This pattern is consistent and reproducible.

Is there something wrong with my code? Why is Doctrine failing to correctly map the data from the database tables to the entities?

Additional remarks

The same problem occurs if one queries for all rides (aka associations between drivers and cars) and performs a fetch-join. This is an important case, because the answer by @qaqar-haider does not work for this scenario.

Assume the following table data

INSERT INTO driver (id, name) VALUES (1, 'John Doe');
INSERT INTO driver (id, name) VALUES (2, 'Erika Mustermann');

INSERT INTO car (brand, model) VALUES ('BMW', '7 Series');
INSERT INTO car (brand, model) VALUES ('Crysler', '300');
INSERT INTO car (brand, model) VALUES ('Mercedes', 'C-Class');

INSERT INTO driver_ride (driver_id, car) VALUES (1, 'Crysler');
INSERT INTO driver_ride (driver_id, car) VALUES (1, 'Mercedes');
INSERT INTO driver_ride (driver_id, car) VALUES (1, 'BMW');
INSERT INTO driver_ride (driver_id, car) VALUES (2, 'BMW');
INSERT INTO driver_ride (driver_id, car) VALUES (2, 'Crysler');

Imagine the following query with fetch-joins

$qb = $em->createQueryBuilder();
$rides = $qb->select('dr, d, c')
   ->from('DriverRide', 'dr')
   ->leftJoin('dr.driver', 'd')
   ->leftJoin('dr.car', 'c')
   ->getQuery()->getResult();

foreach ($rides as $ride) {
    print $ride->driver->getName() . ': ' . $ride->getCar()->getModel();
}

Expected output:

John Doe: Chrysler
John Doe: Mercedes
John Doe: BMW
Erika Mustermann: BMW
Erika Mustermann: Chrysler


Actual output:

John Doe: Chrysler
John Doe: Mercedes
John Doe: Mercedes
Erika Mustermann: BMW
Erika Mustermann: BMW

Once again, the number of results is correct, but the associations are mixed up.

This is a simplified test case of the original question and is not subject to any WHERE or GROUP BY clause.

user2690527
  • 1,729
  • 1
  • 22
  • 38
jake stayman
  • 1,687
  • 13
  • 22
  • Did you find a solution? I encountered the very same problem yesterday and it kept me busy several hours without finding an error in my code? I also get strange duplications and missing associations. – user2690527 Aug 31 '16 at 10:41
  • A thing to consider: if in the initial query use `getArrayResult()` instead of `getSingleResult()`, then the `$driver` will actually contain the right cars, in the right order. – Alexander Guz Aug 31 '16 at 21:49
  • I am all but certain that this is a bug in Doctrine. I filed a [bug report](https://github.com/doctrine/doctrine2/issues/5762) over 4 months ago, followed by a [pull request](https://github.com/doctrine/doctrine2/pull/5776) with a simple bug fix. Sadly, neither received any response from the Doctrine project maintainers. – jake stayman Sep 01 '16 at 16:35

3 Answers3

3

This is a bug in Doctrine. Unfortunately, the relevant Doctrine source code is quite ugly, so I am not certain that my bug fix is the best one, but it does fix the problem. I am copying the explanation from my bug report here for completeness:


The main cause of this bug is located in the ObjectHydrator class. In the hydrateRowData() method, $resultPointers is used to store a reference to the most recently hydrated object of each entity type. When a child entity needs to be linked to its parent, this method looks in $resultPointers to find a reference to the parent and, if it finds one, links the child to that parent.

The problem is that $resultPointers is an instance/object (rather than local/method) variable that does not get reinitialized every time hydrateRowData() is called, and so it may retain a reference to an entity that was hydrated the previous time the method was called rather than the current time.

In this particular example, the Car entity is hydrated before the DriverRide entity each time hydrateRowData() is called. When the method looks for the parent of the Car entity, it finds nothing the first time (because DriverRide has not yet been processed at all) and, on every subsequent call, finds a reference to the DriverRide object that was hydrated the previous time the method was called (because DriverRide has not yet been processed for the current row, and $resultPointers still retains a reference to the result of processing for the previous row).

The bug disappears when additional fields are added to DriverRide only because doing so happens to cause the DriverRide entity to be processed before the Car entity in hydrateRowData(). The duplication of records happens because some other weird part of this method causes the child entity to be identified as not fetch joined (and so lazy loaded) every other time the method is called (not counting the first time), and so those times (the third, fifth, etc.) the link between child and parent happens to turn out correctly.

I believe that the fundamental problem is that $resultPointers is neither a local variable nor reinitialized each time hydrateRowData() is called. I cannot think of any scenario in which you would need a reference to an object that was hydrated with data from the previous row of data, so I would recommend simply reinitializing this variable at the beginning of this method. That should fix the bug.

jake stayman
  • 1,687
  • 13
  • 22
  • It's 2020 and I ran into exact same problem: https://sergiupopa.me/clear-your-doctrine-collections though without converting from many-to-many to one-to-many... Long live Doctrine – sergio Aug 06 '20 at 17:47
0

This is a nice question. I've spent some time trying to find out the answer, but I managed to achieve the expected result only by modifying the entity mappings - why the initial problem arises I still cannot understand, but I will continue to investigate.

So, this is what I've done. Instead of manually defining a DriverRide entity, let Doctrine define it for you with many-to-many mapping, as described here.

The entities will look like:

/**
 * @Entity
 * @Table(name="cars")
 */
class Car
{
    /**
     * @Id
     * @Column(type="string", length=25)
     * @GeneratedValue(strategy="NONE")
     */
    protected $brand;

    /**
     * @Column(type="string", length=255);
     */
    protected $model;

    public function getBrand() { return $this->brand; }
    public function getModel() { return $this->model; }

    public function setBrand($brand) { $this->brand = $brand; }
    public function setModel($model) { $this->model = $model; }
}

/**
 * @Entity
 * @Table(name="drivers")
 */
class Driver
{
    /**
     * @Id
     * @Column(type="integer")
     * @GeneratedValue(strategy="AUTO")
     */
    protected $id;

    /**
     * @Column(type="string", length=255);
     */
    protected $name;

    // This will create a third table `driver_rides`

    /**
     * @ManyToMany(targetEntity="Car")
     * @JoinTable(
     *     name="driver_rides",
     *     joinColumns={@JoinColumn(name="driver_id", referencedColumnName="id")},
     *     inverseJoinColumns={@JoinColumn(name="car", referencedColumnName="brand")}
     * )
     */
    protected $cars;

    public function __construct()
    {
        $this->cars = new ArrayCollection();
    }

    function getId() { return $this->id; }
    function getName() { return $this->name; }
    function getCars() { return $this->cars; }

    public function setName($name) { $this->name = $name; }
    public function setCars($cars) { $this->cars = $cars; }

    public function addCar(\Car $car) { $this->cars[] = $car; return $this; }
    public function removeCar(\Car $car) { $this->cars->removeElement($car); }
}

And the query to get what you want:

$qb = $entityManager->createQueryBuilder();

/** @var $driver Driver */
$driver = $qb->select('d, dr')
    ->from('Driver', 'd')
    ->leftJoin('d.cars', 'dr')  // join with `driver_rides` table
    ->where('d.id = 1')
    ->getQuery()
    ->getSingleResult();

printf("%s:\n", $driver->getName());
foreach ($driver->getCars() as $car) {
    printf("%s %s\n", $car->getBrand(), $car->getModel());
}
Alexander Guz
  • 1,334
  • 12
  • 31
  • Thanks. I know the problem does not occur if one uses many-to-many-associations. In my case I stumbled across the problem after I converted a many-to-many association into two one-to-many associations. This was necessary because I wanted to store an additional attribute with the association. It seems that two things are required to trigger the problem: (a) A many-to-many association broken up into two one-to-many associations with an intermediate entity in the middle (b) The intermediate entity has a composed primary key. – user2690527 Sep 01 '16 at 07:00
  • I meanwhile believe I get close to the problem. What we are talking about is the use case 3 of the chapter "Composite and Foreign Keys as Primary Key" of the Doctrine Docs ("Join table with meta data"). However, in chapter 28 "Limitations and Known Issues" it is clearly said that foreign keys must always point to the primary key of the associated entity, otherwise funny errors may occur. But this is exactly what we have here: The many-to-many endpoints (car and driver) only point to one half of the primary key of the jon table. – user2690527 Sep 01 '16 at 07:05
  • Sorry, forget my last comment with the part about "pointing to one half of the primary key of the join table". I mixed up owning and inverse side of the association in my head. Of course, the foreign keys point to the primary key of `car` and `driver` respectively and these are complete primary keys. My fault! However, the problem vanishes if the join table `driver_ride` gets an independent primary key (e.g. just a plain `int`) and if a composed primary key is not used. – user2690527 Sep 01 '16 at 07:24
-1

You have to group it by a brand, othervise it would give you desired results Group By brand

qb = $em->createQueryBuilder();
    $driver = $qb->select('d, dr, c')
       ->from('Driver', 'd')
       ->leftJoin('d.driverRides', 'dr')
       ->leftJoin('dr.car', 'c')
       ->where('d.id = 1')
        ->groupBy('c.brand')
       ->getQuery()->getSingleResult();
Waqar Haider
  • 929
  • 10
  • 33
  • Grouping does not help here in any way. See my additional test case at the second half of the question. Your modified query returns the same result by pure coincidence, because the example database only contains records of one driver having only done at most one ride on each car. However, the result is not the same anymore as soon as you have more drivers and more rides. Your query is semantically different from the original query. So this is *not* a solution. Moreover, please correct your first sentence. It does not even make sense. – user2690527 Aug 31 '16 at 15:55
  • can you show me the sql, show me generated sql getSql() @user2690527 – Waqar Haider Aug 31 '16 at 16:16
  • `SELECT r0_.id AS id_0, r0_.name AS name_1, r1_.brand AS brand_2, r1_.model AS model_3, r2_.driver_id AS driver_id_4, r2_.car AS car_5, FROM driver_ride r2_ INNER JOIN driver r0_ ON r2_.driver_id = r0_.id INNER JOIN car r1_ ON r2_.car = r1_.brand` – user2690527 Aug 31 '16 at 17:02
  • It shouldn't be inner Join it should be leftJoin – Waqar Haider Sep 01 '16 at 05:57
  • your querybuilder is selecting form driver and your sql is selecting from driver_ride thats strange, – Waqar Haider Sep 01 '16 at 06:03
  • Sorry, my fault. I posted the wrong SQL because I was playing around with the query and forgot to revert the changes. However, the SQL is OK and the DB returns the correct results. – user2690527 Sep 01 '16 at 07:26