
I have read about bulk inserts in the SQLAlchemy documentation, but in my application the insert seems to be taking more time than it should.

First I'll show you how I'm doing it:

from sqlalchemy import Table

def insert_all(self, table_name, schema_name, df):
    if df.empty:
        return None

    # Convert the DataFrame into a list of per-row dicts
    values = df.to_dict(orient='records')

    # Reflect the target table from the database
    table = Table(table_name, self._metadata, autoload=True, schema=schema_name)

    # Execute a single executemany-style INSERT with all rows
    return self._connection.execute(table.insert(), values)

Where `values` is of the form:

[{'col1': 1.0, 'col2': 0.5}, 
 {'col1': 2.0, 'col2': 0.75}, ...]
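
To make that format concrete, here is a tiny self-contained sketch of how such a list comes out of `to_dict` (the column names are just the examples from above):

import pandas as pd

df = pd.DataFrame({'col1': [1.0, 2.0], 'col2': [0.5, 0.75]})
print(df.to_dict(orient='records'))
# [{'col1': 1.0, 'col2': 0.5}, {'col1': 2.0, 'col2': 0.75}]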

But the number of columns may vary. Using this method, it takes about 15 to 20 minutes to insert 400,000 records into MS SQL Server, i.e. roughly 330 to 440 rows per second.

Could you please tell me if this is normal, and if not, what I can do to make this process faster?

Hasan Jawad
  • Why are you not using [`to_sql()`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_sql.html)? – univerio Nov 16 '17 at 18:58
  • Are you using pyodbc? [This](https://stackoverflow.com/questions/45484171/sqlalchemy-bulk-insert-is-slower-than-building-raw-sql) might be related. – Ilja Everilä Nov 16 '17 at 19:26
  • @univerio I also tested `to_sql()`; the timings are the same. @IljaEverilä I am using pyodbc, and thanks for the reference. I'm looking at it and there may be a solution (see the sketch after these comments). I will have to test them and get back to you guys. – Hasan Jawad Nov 17 '17 at 09:40
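
The linked question suggests pyodbc's default row-by-row `executemany` is the bottleneck. A minimal sketch of the workaround discussed there, enabling pyodbc's `fast_executemany` through a SQLAlchemy event listener (the connection string is a placeholder, and this requires pyodbc >= 4.0.19):

from sqlalchemy import create_engine, event

# Placeholder connection string; substitute your own server/DSN and credentials
engine = create_engine('mssql+pyodbc://user:password@my_dsn')

@event.listens_for(engine, 'before_cursor_execute')
def receive_before_cursor_execute(conn, cursor, statement, parameters, context, executemany):
    # Only switch on the fast path for executemany-style bulk inserts
    if executemany:
        cursor.fast_executemany = True  # pyodbc >= 4.0.19

On SQLAlchemy 1.3 and later the same effect is available directly as `create_engine(url, fast_executemany=True)` for the `mssql+pyodbc` dialect.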
