0

I have a pandas data frame called customer_df of length 11k. I am creating a SQL query to insert values of the df to a table customer_details on postgres.

my code right now is

insert_string = 'insert into  customer_details (col1,col2,col3,col4) values '
for i in range(len(customer_df)):
    insert_string = insert_string  + ('''('{0}','{1}','{2}','{3}'),'''.format(customer_df.iloc[i]['customer'],customer_df.iloc[i]['customerid'],customer_df.iloc[i]['loc'],customer_df.iloc[i]['pin']))

upload_to_postgres(insert_string)

this string finally gives me something like

insert into customer_details (col1,col2,col3,col4) values ('John',23,'KA',560021),('Ram',67,'AP',642918),(values of cust 3) .... (values of cust 11k) which is sent to postgres using the upload_to_postgres.

This process of creating the string takes around 30secs to 1min. Is there any better optimized way to reduce the timings?

Rakesh SK
  • 47
  • 1
  • 3

2 Answers2

0

You can use pandas to create and read that df faster here's a pandas cheat sheet and the library download link https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf https://pandas.pydata.org/pandas-docs/stable/getting_started/install.html

Caio Navarro
  • 48
  • 1
  • 13
  • OP is using pandas: "I have a pandas data frame". This is not an answer. – BigBen Jan 13 '21 at 18:25
  • 1
    I didn't see where he used that sorry. But if he's already using pandas just use this guide https://pandas.pydata.org/pandas-docs/stable/user_guide/enhancingperf.html – Caio Navarro Jan 13 '21 at 18:29
  • The answer space is for *answers* to the question, not for providing links. Please see [how to answer](https://stackoverflow.com/help/how-to-answer). Especially since you've had some lower-quality answers recently. – BigBen Jan 13 '21 at 18:29
0

The multiple insert is not the right way. At each row you create a transaction, so the best solution is to do a bulk insert. If you can create a file like a CSV and use the bulk action copy, your performance increases of order of magnitude. I've no example here, but i've found this article that exlain well the methodology.

Glauco
  • 1,385
  • 2
  • 10
  • 20