I have a database which I store as a `.db` file on my disk. I implemented all the functions necessary for managing this database using `sqlite3`. However, I noticed that updating the rows in the table takes a large amount of time. My database currently has 608042 rows. The database has one table - let's call it `Table1`. This table consists of the following columns:
`id | name | age | address | job | phone | income`

(the `id` value is generated automatically when a row is inserted into the database).
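For reference, the table was created with something along these lines (the exact column types here are illustrative):

```python
self.cursor.execute("""
    CREATE TABLE IF NOT EXISTS Table1 (
        id      INTEGER PRIMARY KEY,  -- generated automatically on insert
        name    TEXT,
        age     INTEGER,
        address TEXT,
        job     TEXT,
        phone   TEXT,
        income  REAL
    )
""")
```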
After reading in all the rows, I perform some operations on the values from the rows (ML algorithms for predicting the income), and then I have to update the value of `income` for each row (thus, for each of the 608042 rows I perform an SQL `UPDATE` operation).
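In simplified form, the read-and-predict step looks like this (`predict_income` is just a stand-in for my actual model):

```python
# Read all rows into memory, then compute a new income for each row.
self.cursor.execute("SELECT id, name, age, address, job, phone, income FROM Table1")
rows = self.cursor.fetchall()

new_incomes = [predict_income(row) for row in rows]  # placeholder model call
```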
In order to update, I'm using the following function (copied from my class):
```python
def update_row(self, new_value, idkey):
    update_query = "UPDATE Table1 SET income = ? WHERE name = ?"
    self.cursor.execute(update_query, (new_value, idkey))
    self.db.commit()
```
And I call this function for each person registered in the database (in pseudocode):

```python
for i in rows:  # one call per each of the 608042 rows
    update_row(new_income_i, i.name)
```

(the value of `new_income_i` is different for each `i`).
This takes a huge amount of time, even though the dataset is not giant. Is there any way to speed up the updating of the database? Should I use something other than `sqlite3`? Or, instead of storing the database as a `.db` file, should I store it in memory (using `sqlite3.connect(":memory:")`)?
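In case it matters, this is roughly what I mean by the in-memory variant (the file name is just a placeholder):

```python
import sqlite3

disk_db = sqlite3.connect("mydata.db")   # placeholder file name
mem_db = sqlite3.connect(":memory:")
disk_db.backup(mem_db)                   # copy the whole database into RAM

# ... perform all the reads/updates against mem_db instead of disk_db ...

mem_db.backup(disk_db)                   # write the updated rows back to disk
mem_db.close()
disk_db.close()
```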