0

I want to insert thousands of rows in to Oracle db using Python. For this I am trying to insert bulk_insert_mappings method of a sqlalchemy session. I am following this tutorial where it shows how to load a csv file into the database fast. I failed because bulk_insert_mappings expects a mapper object as well which they don't pass.

The code to create the connection and the mapping without a csv:

from sqlalchemy.sql import select, sqltypes
from sqlalchemy.schema import CreateSchema
from sqlalchemy import create_engine, MetaData, Table, inspect, engine, Column, String, DateTime, insert
from sqlalchemy.orm import sessionmaker
from sqlalchemy_utils import get_mapper
import pytz
import datetime
import pandas as pd

engine_url = engine.URL.create(
            drivername='oracle',
            username='ADMIN',
            password='****',
            host='***',
            database='****',    
        )


oracle_engine = create_engine(engine_url, echo=False)
Session = sessionmaker(bind=oracle_engine)

base = datetime.datetime.today().replace(tzinfo=pytz.utc)
date_list = [base - datetime.timedelta(days=x) for x in range(20)]
df = pd.DataFrame(date_list, columns = ['date_time'])

I use the following line of code to create the table if doesnt exist:

df[:0].to_sql('test_insert_df', oracle_engine, schema='ADMIN', if_exists='replace')

The I used this line to insert data into the table:

with Session() as session:
    session.bulk_insert_mappings(df.to_dict('records'))

The traceback I receive is the following:

TypeError: bulk_insert_mappings() missing 1 required positional argument: 'mappings'

How can I create the mapper if I dont use the sqlalchemy ORM to create the table? Looking at this question I know how to create the mapper object with a sqlalchemy model but not otherwise.

Also I have the option of inserting using bulk_insert_objects method but that would also need a model which I dont have

PS: I am doing this because I want to insert many rows into a Oracle database using sqlalchemy. If you have any better solution it is also welcome.

1 Answers1

1

The simplest way to insert would be to use the dataframe's to_sql method, but if you want to use SQLAlchemy you can use its core features to insert the data. There's no need to use the ORM in your use case. Something like this should work, assuming you are using SQLAlchemy 1.4 and you can arrange your data as a list of dicts:

import sqlalchemy as sa

...

tbl = sa.Table('test_insert_of', sa.MetaData(), autoload_with=engine)
ins = tbl.insert()

with engine.begin() as conn:
    # This will automatically commit
    conn.execute(ins, list_of_dicts)

If you have a lot of data you might want to insert in batches.

snakecharmerb
  • 47,570
  • 11
  • 100
  • 153
  • Does this code insert in batches or insert one row at a time? *to_sql* was a bit slow and using *method='multi'* option was crashing for me. – Asfandyar Abbasi May 12 '22 at 09:41
  • It will insert everything in `list_of_dicts`, using the oracle cursor's `executemany` method ([docs](https://docs.sqlalchemy.org/en/14/core/tutorial.html#executing-multiple-statements)); whether it creates a single statement with multiple `VALUES` or multiple statements I don't (no Oracle system to check). If you want to commit in batches you would have to divide the list yourself - [examples](https://stackoverflow.com/q/312443/5320906). – snakecharmerb May 12 '22 at 11:07