
I am trying to insert about 600 entries. I was using the ORM, but after some investigation, and according to the docs, Core seems to be the fastest option.

My implementation takes nearly 58 seconds to do the insert, while the docs report: "SQLAlchemy Core: Total time for 100000 records 0.206904888153 secs".

This is my implementation:

import time

from sqlalchemy import inspect


def bulk_insert_objects(self, simulation, default_mechanism, model, skus):
    t0 = time.time()
    table = inspect(model).local_table
    self.session.execute(
        table.insert(),
        [
            dict(
                simulation_id=simulation.simulation_id,
                product_id=i['product_id'],
                store_level_2_id=i['store_level_2_id'],
                data=default_mechanism,
            )
            for i in skus
        ],
    )
    print(
        "SQLAlchemy Core: Total time for " + str(len(skus)) +
        " records " + str(time.time() - t0) + " secs"
    )

Total length of skus is 600.

My terminal output looks like this; the INSERT INTO statement itself is correct, but the operation takes a long time to finish:

INFO sqlalchemy.engine.base.Engine INSERT INTO pricing.simulation_skus (simulation_id, product_id, data) VALUES (%(simulation_id)s, %(product_id)s, %(data)s)
2021-03-31 12:58:49,393 INFO sqlalchemy.engine.base.Engine ({'simulation_id': 185, 'product_id': 3859, 'data': '{"mechanism_cd": "dollar_amount"}'}, {'simulation_id': 185, 'product_id': 3859, 'data': '{"mechanism_cd": "dollar_amount"}'}, {'simulation_id': 185, 'product_id': 3859, 'data': '{"mechanism_cd": "dollar_amount"}'}, {'simulation_id': 185, 'product_id': 3859, 'data': '{"mechanism_cd": "dollar_amount"}'}, {'simulation_id': 185, 'product_id': 3859, 'data': '{"mechanism_cd": "dollar_amount"}'}, {'simulation_id': 185, 'product_id': 3859, 'data': '{"mechanism_cd": "dollar_amount"}'}, {'simulation_id': 185, 'product_id': 3859, 'data': '{"mechanism_cd": "dollar_amount"}'}, {'simulation_id': 185, 'product_id': 3861, 'data': '{"mechanism_cd": "dollar_amount"}'}  ... displaying 10 of 630 total bound parameter sets ...  {'simulation_id': 185, 'product_id': 723605, 'data': '{"mechanism_cd": "dollar_amount"}'}, {'simulation_id': 185, 'product_id': 723605, 'data': '{"mechanism_cd": "dollar_amount"}'})

After roughly 58 seconds, it finally completes.

HOOOPLA
  • Does this answer your question? [Bulk insert with SQLAlchemy ORM](https://stackoverflow.com/questions/3659142/bulk-insert-with-sqlalchemy-orm) – Shahan M Mar 31 '21 at 19:13
  • It doesn't actually. I tried using the core API, the result is really slow – HOOOPLA Mar 31 '21 at 19:26
  • Which DBMS are you using, and what driver? – Ilja Everilä Mar 31 '21 at 19:35
  • SQLAlchemy/psycopg2 – HOOOPLA Mar 31 '21 at 19:38
  • Does that make a difference? @IljaEverilä – HOOOPLA Mar 31 '21 at 19:55
  • So PostgreSQL/psycopg2, and it might. The way you are executing ends up using DB-API `executemany`, which is not that great in psycopg2 (it is just a loop of `execute` calls), unless you are using a more recent version of SQLAlchemy that defaults to using the fast execution helpers of psycopg2. The log message you have shown indicates the former. This would mean you are doing 600+ round trips to the DB, which depending on your network can be quite slow. 58s / 600 -> a little under 100ms. What version of SQLAlchemy are you using? – Ilja Everilä Apr 01 '21 at 09:42
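To illustrate the round-trip arithmetic in the comment above, a back-of-the-envelope sketch (the 58 s and the 630 bound parameter sets are the figures from the question's log; everything else is just arithmetic):

```python
# If each of the 630 bound parameter sets becomes its own execute()
# round trip, the observed 58 s works out to roughly 92 ms per
# statement -- plausible network latency per round trip, not the
# cost of the INSERT itself.
total_seconds = 58
statements = 630  # "630 total bound parameter sets" in the log

ms_per_statement = total_seconds / statements * 1000
print(f"~{ms_per_statement:.0f} ms per round trip")  # ~92 ms
```

At that rate the row count barely matters; the per-statement latency dominates, which is why 600 rows take almost a minute.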

1 Answer


You can use bulk_insert_mappings(). The following code should do the trick. You can take a look at https://docs.sqlalchemy.org/en/13/_modules/examples/performance/bulk_inserts.html for more details.

table = inspect(model)
self.session.bulk_insert_mappings(
    table,
    [
        dict(
            simulation_id=simulation.simulation_id,
            product_id=i['product_id'],
            store_level_2_id=i['store_level_2_id'],
            data=default_mechanism,
        )
        for i in skus
    ],
)
Shahan M
  • It's still really slow just doing 600 inserts; it takes me about 68 seconds. – HOOOPLA Mar 31 '21 at 19:25
  • Could you try with `table = inspect(model)`, based on this documentation: https://docs.sqlalchemy.org/en/14/orm/mapping_styles.html? I updated the code above to reflect this. – Shahan M Mar 31 '21 at 19:30
  • Yup, doing it exactly like that made no difference to the speed. I still get approx. 58 seconds to insert 650 entries. I tried Core, bulk_insert, bulk_save... same performance – HOOOPLA Mar 31 '21 at 19:36
  • Technically Core should've been the fastest method to do this, but I feel like something else is off here.. – HOOOPLA Mar 31 '21 at 19:56
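Following up on Ilja Everilä's comment above: if the bottleneck really is psycopg2's plain `executemany` (one round trip per row), the fix sits at the driver/dialect level, not in choosing between Core and `bulk_insert_mappings()`. A configuration sketch, assuming SQLAlchemy 1.3+ with the psycopg2 dialect (the DSN below is a placeholder):

```python
from sqlalchemy import create_engine

# Placeholder DSN -- substitute your own connection string.
# executemany_mode="values" tells the psycopg2 dialect to use
# psycopg2.extras.execute_values() for executemany-style INSERTs,
# collapsing the 600+ single-row round trips into a handful of
# multi-row statements. (On SQLAlchemy 1.2 the rough equivalent
# is use_batch_mode=True.)
engine = create_engine(
    "postgresql+psycopg2://user:password@host/dbname",
    executemany_mode="values",
)
```

With this in place, the original `session.execute(table.insert(), [...])` call should go from one round trip per row to one (or a few) multi-row statements, which is where the quoted "0.2 s for 100000 records" numbers come from.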