0

The Python code I have looks like this

f = open('data.tsv', 'r')

for line in f:
    a = line.split("\t")
    e_id = a[1]
    name = a[2]
    start = a[3]
    end = a[4]
    loc = a[5]
    tags = a[6]
    url = a[7]
    cur = con.cursor(mdb.cursors.DictCursor)
    cur.execute("INSERT INTO data_table VALUES (" + e_id + "," + name + "," + start + "," + end + "," + loc + "," + tags + "," + url + ");")

The "loc" piece of data very often has a comma in it because a lot of it is formatted "City, State", so MySQL interprets it as a value. I can't directly comment the commas out because they're being stored as variables. Is there a way around this?

Xonal
  • 1,340
  • 4
  • 20
  • 33
  • I imagine that your data is safe since you're not escaping it in anyway? – Brandon Buck May 31 '13 at 14:41
  • Yes, it is. And adding ' around the values makes the code say I exceed the column count – Xonal May 31 '13 at 14:47
  • Do you have any `'` inside of any other values? If your data is all correctly input then you will not get that warning. Unless you do not have 7 columns in the table. – Brandon Buck May 31 '13 at 14:56
  • 1
    Regardless of whether the data is 'safe' or not, use the `cur.execute("INSERT INTO data_table VALUES (%s,%s,%s,%s,%s,%s,%s)",(e_id,name,start,end,loc,tags,url))` syntax suggested by Miquel - it'll save you a lot of headaches. – Aya May 31 '13 at 15:00

3 Answers3

3

Be careful when building queries in this format. Before and after concatenating an string you need to put commas. For example,

INSERT INTO data_table VALUES ('" + e_id + "','" + name + "','" + start + "','" + end + "','" + loc + "','" + tags + "','" + url + "');")

I hope it helps to solve your issue.

Also, you can use specific python syntax to perform the inserts:

cur.execute("INSERT INTO data_table VALUES (%s,%s,%s,%s,%s,%s,%s)",(e_id,name,start,end,loc,tags,url)); 
glglgl
  • 89,107
  • 13
  • 149
  • 217
Miquel
  • 858
  • 11
  • 20
2

Why is no one talking about prepared statements? This is exactly the use case. It's an easier syntax, and it's completely safe.

cur.execute("INSERT INTO data_table VALUES (?,?,?,?,?,?,?)", (e_id,name,start,end,loc,tags,url))

(syntax may be slightly different. See Does Python support MySQL prepared statements? )

Community
  • 1
  • 1
korylprince
  • 2,969
  • 1
  • 18
  • 27
  • Ah, I just saw Miguel's answer. However prepared statements aren't python-specific. They're supported by the database. – korylprince May 31 '13 at 15:12
1

Your true problem likes in that your "City, State" line is a sequence of characters. You cannot directly insert a sequence of characters in SQL like this:

INSERT INTO test VALUES (My String);

Instead, you treat it like a string. MySQL expects strings to be wrapped in single quotes, so you would change the above to:

INSERT INTO test VALUES ('My String');

Now you've got the string 'My String' stored. Now, generating that dynamically isn't much different, except you're going to have to make sure it's wrapped in single quotes - like:

loc = "City, State"
sql = "INSERT INTO test VALUES ('" + loc + "');"

Notice the single quotes around where I'm inserting location.

The rest is additional information

This is a dangerous operation though because I'm allowing any value to directly enter my database, even if that values is "'); DROP DATABASE test; -- " which would do some damage. You'd think it's harmless but after plugged in you get:

INSERT INTO test VALUES(''); DROP DATABASE test; -- ');

And so now I've just lost all of my data. To fix this you want to escape values before putting them in your database which is as simple as MySQLdb.escape_string(str). Then you just:

loc = "'); DROP DATABASE test; -- "
loc = MySQLdb.escape_string(loc);
sql = "INSERT INTO test VALUES ('" + loc + "');"

And the result is:

INSERT INTO test VALUES ('\'); DROP DATABASE test; -- ');

Which will not result in any kind of damage to your database. This is not the only vulnerability that you open yourself to and it's a very simplified example.

Brandon Buck
  • 7,177
  • 2
  • 30
  • 51
  • 4
    Fine answer, but it lacks a pointer to the right direction, namely how to do it: namely with `cur.execute("INSERT INTO data_table VALUES (%s,%s,%s,%s,%s,%s,%s)",(e_id,name,start,end,loc,tags,url))`. – glglgl May 31 '13 at 15:18
  • @glglgl I didn't demonstrate that as it was already done. I chose to use the code he was using and explain why it was incorrect instead of show him another way to do it. That way he can understand what was wrong with the original snippet of code. I'm not teaching Python, I'm answering a question with already written code. – Brandon Buck May 31 '13 at 15:32