0

I have a huge (1GB+) database dump that I want to load into new databases on other servers. I tried parsing it line by line and executing each into mysql, but it unfortunately doesn't split lines evenly into commands and just fails on the incomplete ones.

filename='/var/test.sql'
fp = open(filename)
while True:
        a = fp.readline()
        if not a:
           break
        cursor.execute(a) #fails most of the time

It is also way too large to load the entire thing into memory call that. Furthermore, the python MySQLdb module does not support the source command.

EDITED

File includes a bunch of insert and create statements. Where its failing is on the inserts of large tables that contain raw text. There are all sorts of semi-colons and newlines in the raw text so hard to split commands based on that.

Community
  • 1
  • 1
JiminyCricket
  • 7,050
  • 7
  • 42
  • 59

3 Answers3

1

Assuming queries do end on line boundaries, you could just add lines together until they make a complete query.

Something like:

filename='/var/test.sql'
fp = open(filename)
lines = ''
while True:
        a = fp.readline()
        if not a:
           break
        try:
           cursor.execute(lines + a)
           lines = ''
        except e:
           lines += a

If it's only insert statements, you could look for lines ending ; and with the next line starting "INSERT".

filename='/var/test.sql'
fp = open(filename)
lines = ''
while True:
        a = fp.readline()
        if not a:
           break
        if lines.strip().endswith(';') and a.startswith('insert'):
           cursor.execute(lines)
           lines = a
        else:
           lines += a
# Catch the last one
cursor.execute(lines)

edit: replaced trim() with strip() & realised we don't need to execute the line a in second code example.

rjmunro
  • 27,203
  • 20
  • 110
  • 132
  • Thanks RJ. Good idea, I tried that but the dumps contain text that include semi-colons and newlines – JiminyCricket May 13 '11 at 16:34
  • @john machin @Xavier, yes, I meant strip. I wanted to take the newlines away before lines ended with ;. Unfortunately, python is not the language I've been programming with the most recently. – rjmunro May 14 '11 at 18:06
  • @JiminyCricket It has lines that end in ";" where the next line starts with "insert" *in the data*. What about the first method that just tries adding more lines to the query until it works? – rjmunro May 14 '11 at 18:08
1

Any reason you can't spawn out a process to do it for you?

import subprocess

fd = open(filename, 'r')
subprocess.Popen(['mysql', '-u', username, '-p{}'.format(password), '-h', hostname, database], stdin=fd).wait()

You may want to tailor that a little as the password will be exposed to ps.

linuts
  • 6,608
  • 4
  • 35
  • 37
0

I think, sometimes, we should choose the other ways to do the job effectively. I prefer to use this stuff for large data: http://www.mysqldumper.net/

anhtran
  • 2,006
  • 4
  • 27
  • 53