34

I'm new to Doctrine and there are still some blurred areas for me. In this case I'm inserting new record in the database using a loop and the entity manager. It works fine but I noticed that Doctrine make one insert query by entity, which can become pretty huge.

Using Doctrine2 and Symfony 2.3, I would like to know how we can set it up so it would make only 1 insert query with all the values in it (we are talking of 1 entity only of course).

What I mean is changing this :

INSERT INTO dummy_table VALUES (x1, y1)    
INSERT INTO dummy_table VALUES (x2, y2)

Into

INSERT INTO dummy_table VALUES (x1, y1), (x2, y2)

Here is my code :

$em = $this->container->get('doctrine')->getManager();

foreach($items as $item){
    $newItem = new Product($item['datas']);
    $em->persist($newItem);
}

$em->flush();
Touki
  • 7,465
  • 3
  • 41
  • 63
Molkobain
  • 1,457
  • 1
  • 12
  • 17
  • For what matter would you combine these queries into a single one ? – Touki Sep 06 '13 at 09:55
  • 1
    I was thinking performance improvements. This is just an example, in pratice it's most likely around 20 entities to insert. So making only connection would be much faster than n connections. EDIT : I found [this answer](http://stackoverflow.com/questions/1793169/which-is-faster-multiple-single-inserts-or-one-multiple-row-insert) about this very topic. – Molkobain Sep 06 '13 at 12:15
  • I might warn you, doctrine adds quite a lot of overhead on each of the inserts you do (manages state and such), so for really big inserts i'd choose DBAL queries instead of ORM relationships instead. // just my 2 cents – Sebastiaan Hilbers Jan 14 '14 at 14:51

5 Answers5

57

According to this answer, Doctrine2 does not allow you to combine multiple INSERT statements into one:

Some people seem to be wondering why Doctrine does not use multi-inserts (insert into (...) values (...), (...), (...), ...

First of all, this syntax is only supported on mysql and newer postgresql versions. Secondly, there is no easy way to get hold of all the generated identifiers in such a multi-insert when using AUTO_INCREMENT or SERIAL and an ORM needs the identifiers for identity management of the objects. Lastly, insert performance is rarely the bottleneck of an ORM. Normal inserts are more than fast enough for most situations and if you really want to do fast bulk inserts, then a multi-insert is not the best way anyway, i.e. Postgres COPY or Mysql LOAD DATA INFILE are several orders of magnitude faster.

These are the reasons why it is not worth the effort to implement an abstraction that performs multi-inserts on mysql and postgresql in an ORM.

You can read more about Doctrine2 batch processing here: https://www.doctrine-project.org/projects/doctrine-orm/en/latest/reference/batch-processing.html

You can either switch to DBAL or resort to processing your data in small batches by flushing your entity manager after a set amount of inserts:

$batchSize = 20;

foreach ($items as $i => $item) {
     $product = new Product($item['datas']);

     $em->persist($product);

     // flush everything to the database every 20 inserts
     if (($i % $batchSize) == 0) {
         $em->flush();
         $em->clear();
    }
}

// flush the remaining objects
$em->flush();
$em->clear();
Benoit Duffez
  • 11,839
  • 12
  • 77
  • 125
ukliviu
  • 3,136
  • 1
  • 24
  • 30
  • Thanks, now I know it's not possible through Doctrine. – Molkobain Sep 06 '13 at 12:23
  • 3
    +1 Flush the remaining objects...the documentation should really include this as if your batch size is smaller than the number of results nothing happens – Carlton Jan 06 '14 at 17:32
  • 4
    Pay attention that `$em->clear()` can have wrong effects with other entities you not dealing with in your function in direct way. For example, you can get some exception by relation to your User of Category of Product... So, it's more handy to clear only entities with type you want. In this case it will be `$em->clear('Product')` – FlameStorm Feb 20 '17 at 18:25
2

You can try this fork https://github.com/stas29a/doctrine2. It implements exactly what you want. I tested it in MySQL and it works fine and 5 times faster than that batch processing. This fork get a first inserted id and increments it in php for getting other id's. It works for most cases but not in all. So you need to understand what are you doing when using this fork.

s29a
  • 41
  • 1
2

You can use executeUpdate($query, array $params = array(), array $types = array()) method of DriverConnection interface to perform this action. However it's little tricky to bind multiple parameters.

Data:

$postMetaData = [
    [
        'post_id' => $product->getId(),
        'meta_key' => '_visibility',
        'meta_value' => 'visible',
    ],
    [
        'post_id' => $product->getId(),
        'meta_key' => '_stock_status',
        'meta_value' => $insert['in_stock'] ? 'instock' : 'outofstock',
    ]
];

Bulk update method:

public function updateOrCreateBulk($posts, \Doctrine\DBAL\Connection $connection)
{

    $placeholders = [];
    $values = [];
    $types = [];

    foreach ($posts as $columnName => $value) {
        $placeholders[] = '(?)';
        $values[] = array_values($value);
        $types[] = \Doctrine\DBAL\Connection::PARAM_INT_ARRAY;
    }

    return $connection->executeUpdate(
        'INSERT INTO `wp_postmeta` (`post_id`, `meta_key`, `meta_value`)  VALUES ' . implode(', ', $placeholders) . ' ON DUPLICATE KEY UPDATE `meta_value` = VALUES(`meta_value`)',
        $values,
        $types
    );
}
Sviatoslav Oleksiv
  • 2,538
  • 2
  • 13
  • 12
0

Thanks, now I know it's not possible through Doctrine. – @Molkobain

you can use for one shot

$em->merge($testCustomer);
then persist flush

it will work prefect

Idrees
  • 1
-10

I have not tested it but it seems possible to do this with a collection.

$collection = new Doctrine_Collection('tablename');
$collection->add($record1);
$collection->add($record2);
$collection->add($record3);
$collection->add($record4);
$collection->save();

Of course you should have the add in the loop.

idipous
  • 2,868
  • 3
  • 30
  • 45
  • I'll try it but if we have to specify the table's name every time I want to insert multiple values, it becomes a little too "complicated" of a solution to me. – Molkobain Sep 06 '13 at 12:20