I'm using Python with MySQLdb to perform some calculations on a large database of about 75 million rows. The basic program flow is to iterate over every row, process the data, and then use a second cursor to write the new data back into that row. The processing is too complex to do in pure SQL.
import MySQLdb

db1 = MySQLdb.connect(host=HOST, user=USERNAME, passwd=PASSWORD, db=DATABASE)
db1.ping(True)
cursor1 = db1.cursor()

db2 = MySQLdb.connect(host=HOST, user=USERNAME, passwd=PASSWORD, db=DATABASE)
db2.ping(True)
cursor2 = db2.cursor()

cursor1.execute("SELECT * FROM table")
for row in cursor1:
    new_data = process_stuff(row)
    cursor2.execute("UPDATE table SET column = %s", (new_data,))
    db2.commit()

db1.close()
db2.close()
This works, usually. Quite often, though, it silently quits with no error message. Sometimes, but not always, it throws a "MySQL server has gone away" error.
It's also rather slow. I know it's a fairly large database, but it takes about 24 hours to process everything. The machine is decent, with 24 GB of RAM, so should it really take this long? I've spent hours fiddling with my.cnf, to no real gain.
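In case it helps to have something concrete to critique: would walking the table in chunks by primary key and committing once per chunk be a better direction? Below is a rough sketch of what I have in mind. It assumes the table has an integer primary key id, reuses the placeholder names table/column from above, and process_stuff is my own function:

import MySQLdb

BATCH_SIZE = 10000  # rows per chunk; just a guess at a sensible size

db = MySQLdb.connect(host=HOST, user=USERNAME, passwd=PASSWORD, db=DATABASE)
cursor = db.cursor()

last_id = 0
while True:
    # fetch one manageable chunk instead of the whole 75M-row result set
    cursor.execute(
        "SELECT id, column FROM table WHERE id > %s ORDER BY id LIMIT %s",
        (last_id, BATCH_SIZE))
    rows = cursor.fetchall()
    if not rows:
        break
    for row in rows:
        new_data = process_stuff(row)
        cursor.execute("UPDATE table SET column = %s WHERE id = %s",
                       (new_data, row[0]))
    db.commit()  # one commit per chunk rather than one per row
    last_id = rows[-1][0]

db.close()

Is something along those lines likely to help with the disconnects and the runtime, or is the problem elsewhere?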