
I have a dataframe with six columns and around 27000 rows.
I'm trying to load this dataframe into my SQL Server (not localhost), but it takes forever.

Does anyone know of a faster way to load it than this?
27000 rows shouldn't take long. Reading from the database is no problem. :-)

for index, row in predict.iterrows():
    params = [(row.account_no, row.group_company, row.customer_type, row.invoice_date, row.lower, row.upper)]
    cursor.fast_executemany = True
    cursor.executemany("INSERT INTO ML.predictions (account_no, group_company, customer_type, invoice_date, lower, upper) VALUES (?,?,?,?,?,?)",
                       params)
    bachelor.commit()

  • What is "forever" here? – roganjosh Nov 21 '21 at 16:30
  • Your indentation is messed up, so it's not clear whether you have `commit()` inside of the loop or whether that's a typo. If it's in the loop then your question is contradictory - you're using a bulk insertion method to insert single rows – roganjosh Nov 21 '21 at 16:31
  • You can pass bulk data to SQL Server using JSON. See https://stackoverflow.com/questions/60745932/update-sql-server-database-using-stored-procedure-with-table-as-paramater-using/60746532#60746532 – David Browne - Microsoft Nov 21 '21 at 16:35
  • Does this answer your question? [How to speed up bulk insert to MS SQL Server using pyodbc](https://stackoverflow.com/questions/29638136/how-to-speed-up-bulk-insert-to-ms-sql-server-using-pyodbc) – Charlieface Nov 21 '21 at 17:08
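
For reference, the linked question above boils down to: set fast_executemany once and send all the rows in a single executemany call instead of one call per row. A minimal sketch of that, assuming bachelor is the pyodbc connection and predict is the dataframe from the question with the six columns in the same order:

cursor = bachelor.cursor()
cursor.fast_executemany = True  # set once, before the bulk call

sql = ("INSERT INTO ML.predictions "
       "(account_no, group_company, customer_type, invoice_date, lower, upper) "
       "VALUES (?, ?, ?, ?, ?, ?)")

# one executemany over all ~27000 rows instead of one call per row
params = list(predict.itertuples(index=False, name=None))
cursor.executemany(sql, params)
bachelor.commit()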

1 Answer


Thanks for your answers - I tried them all apart from the JSON approach; if that one didn't work, it's probably me getting it wrong.

This was my solution:

records = [str(tuple(x)) for x in predict.values]


insert_ = """
INSERT INTO ml.predictions(account_no, group_company, customer_type, invoice_date, lower, upper) VALUES
"""


def chunker(seq, size):
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))

for batch in chunker(records, 1000):
    rows = ','.join(batch)
    insert_rows = insert_ + rows
    cursor.execute(insert_rows)
    bachelor.commit()
Kristian
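
One caveat with building the VALUES list from str(tuple(x)): it only produces valid SQL as long as every value happens to render as a literal SQL Server accepts (no embedded quotes, no NaN/None, dates quoted the way Python reprs them). A parameterized variant of the same chunked multi-row insert avoids that; a sketch, reusing the insert_ string and chunker from the answer, with the chunk held to 300 rows so that 300 x 6 = 1800 parameters stays under SQL Server's 2100-parameter limit per statement:

records = [tuple(x) for x in predict.values]  # keep the real values, not their string form

for batch in chunker(records, 300):
    placeholders = ",".join(["(?, ?, ?, ?, ?, ?)"] * len(batch))  # one group per row
    flat_params = [value for row in batch for value in row]       # flatten row-major
    cursor.execute(insert_ + placeholders, flat_params)
    bachelor.commit()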