I have read about bulk inserts in the SQLAlchemy documentation, but for my application it seems to be taking more time than it should.
First, here is how I'm doing it:
    def insert_all(self, table_name, schema_name, df):
        if not df.empty:
            values = df.to_dict(orient='records')
            table = Table(table_name, self._metadata, autoload=True, schema=schema_name)
            result_proxy = self._connection.execute(table.insert(), values)
            return result_proxy
        else:
            return None
Where values is of the form:
[{'col1': 1.0, 'col2': 0.5},
{'col1': 2.0, 'col2': 0.75}, ...]
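To make that concrete, the list comes straight from pandas' to_dict(orient='records'); a minimal check with made-up data:

```python
import pandas as pd

# Hypothetical two-column frame matching the shape described above
df = pd.DataFrame({"col1": [1.0, 2.0], "col2": [0.5, 0.75]})

# orient='records' yields one dict per row, keyed by column name
values = df.to_dict(orient="records")
print(values)
# [{'col1': 1.0, 'col2': 0.5}, {'col1': 2.0, 'col2': 0.75}]
```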
The number of columns may vary, though. Using this method, it takes about 15 to 20 minutes to insert 400,000 records into MS SQL Server.
Could you please tell me if this is normal, and if not, what I can do to make this process faster?
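In case it helps, here is a self-contained reproduction of the pattern. This is only a sketch: it targets an in-memory SQLite database instead of my real MS SQL Server, and the table name `measurements` and its columns are invented for the example, so it demonstrates the mechanics rather than the timing. (I have also seen pyodbc's `fast_executemany` engine flag suggested for MSSQL workloads like this, but I have not benchmarked it yet.)

```python
import pandas as pd
from sqlalchemy import create_engine, MetaData, Table, Column, Float

# In-memory SQLite stands in for the real MSSQL target so this runs anywhere.
# For MSSQL via pyodbc, one suggested speedup (untested by me) is:
#   create_engine("mssql+pyodbc://...", fast_executemany=True)
engine = create_engine("sqlite://")

metadata = MetaData()
table = Table(
    "measurements", metadata,   # hypothetical table for the example
    Column("col1", Float),
    Column("col2", Float),
)
metadata.create_all(engine)

df = pd.DataFrame({"col1": [1.0, 2.0], "col2": [0.5, 0.75]})
values = df.to_dict(orient="records")

# Passing a list of dicts makes SQLAlchemy issue a DBAPI executemany()
with engine.begin() as conn:
    conn.execute(table.insert(), values)

# Verify the rows landed
with engine.connect() as conn:
    rows = conn.execute(table.select()).fetchall()
print(len(rows))
```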