1

I'd like to append a column into the table existing in sqlite3 database, using values stored in a pandas Series.

My original DataFrame df looks like:

   a  b
0  1  2
1  3  4

And this is stored as a table in sqlite3 also. If I add a column to df as:

df['c'] = df.a + df.b

then df will be:

   a  b  c
0  1  2  3
1  3  4  7

whereas the table in the sqlite3 db is not changed yet. What I want to do is to append a column ('c') into the table in sqlite3 and fill its values with df['c'].

What I tried is:

con = sqlite3.connect('data/a.db')
cur = con.cursor()
cur.execute('alter table temptable add column c integer')
con.commit()
cur.execute('update temptable set c=?', df.c)
con.commit()
con.close()

However, it is not working. Is there a possible way to perform bulk update for the new column 'c' in sqlite3? The number of rows is usually around 100,000,000.

overfit
  • 349
  • 1
  • 12
  • so you have around 100mm row in your database ? then you need to think about use chunk – BENY Feb 07 '20 at 05:05

1 Answers1

1

Assuming ALTER TABLE worked and you were able to add the new column, try using sqlite3's executemany method to insert values into the new column. The accepted answer to this SO question shows you how to do it (Note that you'll need a primary key on your table)

As an alternative, this link shows you how to use DataFrame.to_sql to update the entire table using the dataframe without writing sql query yourself.

Emrah Diril
  • 1,687
  • 1
  • 19
  • 27