4

My objective is to store a JSON object into a MySQL database field of type json, using the mysql.connector library.

import mysql.connector
import json

jsonData = json.dumps(origin_of_jsonData)

cnx = mysql.connector.connect(**config_defined_elsewhere)
cursor = cnx.cursor()
cursor.execute('CREATE DATABASE dataBase')
cnx.database = 'dataBase'
cursor = cnx.cursor()
cursor.execute('CREATE TABLE table (id_field INT NOT NULL, json_data_field JSON NOT NULL, PRIMARY KEY (id_field))')

Now, the code below WORKS just fine, the focus of my question is the use of '%s':

insert_statement = "INSERT INTO table (id_field, json_data_field) VALUES (%s, %s)"
values_to_insert = (1, jsonData)
cursor.execute(insert_statement, values_to_insert)

My problem with that: I am very strictly adhering to the use of '...{}'.format(aValue) (or f'...{aValue}') when combining variable aValue(s) into a string, thus avoiding the use of %s (whatever my reasons for that, let's not debate them here - but it is how I would like to keep it wherever possible, hence my question).

In any case, I am simply unable, whichever way I try, to create something that stores the jsonData into the mySql dataBase using something that resembles the above structure and uses '...{}'.format() (in whatever shape or form) instead of %s. For example, I have (among many iterations) tried

insert_statement = "INSERT INTO table (id_field, json_data_field) VALUES ({}, {})".format(1, jsonData)
cursor.execute(insert_statement)

but no matter how I turn and twist it, I keep getting the following error:

ProgrammingError: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '[some_content_from_jsonData})]' at line 1

Now my question(s):

1) Is there a way to avoid the use of %s here that I am missing?

2) If not, why? What is it that makes this impossible? Is it the cursor.execute() function, or is it the fact that it is a JSON object, or is it something completely different? Shouldn't {}.format() be able to do everything that %s could do, and more?

Dov Benyomin Sohacheski
  • 7,133
  • 7
  • 38
  • 64
  • Probably you might be interested in reading [*Python best practice and securest to connect to MySQL and execute queries*](https://stackoverflow.com/questions/7929364/python-best-practice-and-securest-to-connect-to-mysql-and-execute-queries). `"some string %s" % "other string"` (note `%` instead of `,` between both string)is equivalent to `"some string {}".format("other string")` and you should be avoiding both of these in your query – Moinuddin Quadri Feb 06 '18 at 22:39
  • 2
    You should use `%s`. This is not the situation for string substitution. It is SQL parameter insertion. It just happens to use `%s`, so it looks a bit like the `%` operator. – khelwood Feb 06 '18 at 22:39
  • You don't say why you want to avoid `%s`, but it is *unequivocally wrong* to do so, and exposes you to the risk of SQL injection. You should use it. – Daniel Roseman Feb 06 '18 at 22:42
  • You can try to use the `__repr__` of the string, which will (most of the time but not all of the time) place the string in single-quotes, so use this: `"INSERT INTO table (id_field, json_data_field) VALUES ({}, {!r})".format(1, jsonData)` However, this result is brittle. Best to use the built-in SQL parameter insertion, for various reasons really. – juanpa.arrivillaga Feb 06 '18 at 22:43
  • Thank you very much for the answer! Especially the explanations (explicit and implicit) that this has nothing to do with string substitution, but SQL parameter insertion that just happens to look like string substitution is the information I was looking for... probably seems a stupid question now, but if you are just getting started, those ARE the types of questions that come up :) So again, thank you! –  Feb 06 '18 at 23:10

2 Answers2

7

First of all: NEVER DIRECTLY INSERT YOUR DATA INTO YOUR QUERY STRING!

Using %s in a MySQL query string is not the same as using it in a python string. In python, you just format the string and 'hello %s!' % 'world' becomes 'hello world!'. In SQL, the %s signals parameter insertion. This sends your query and data to the server separately. You are also not bound to this syntax. The python DB-API specification specifies more styles for this: DB-API parameter styles (PEP 249). This has several advantages over inserting your data directly into the query string:

Prevents SQL injection

Say you have a query to authenticate users by password. You would do that with the following query (of course you would normally salt and hash the password, but that is not the topic of this question):

SELECT 1 FROM users WHERE username='foo' AND password='bar'

The naive way to construct this query would be:

"SELECT 1 FROM users WHERE username='{}' AND password='{}'".format(username, password)

However, what would happen if someone inputs ' OR 1=1 as password. The formatted query would then become

SELECT 1 FROM users WHERE username='foo' AND password='' OR 1=1

which will allways return 1. When using parameter insertion:

execute('SELECT 1 FROM users WHERE username=%s AND password=%s', username, password)

this will never happen, as the query will be interpreted by the server separately.

Performance

If you run the same query many times with different data, the performance difference between using a formatted query and parameter insertion can be significant. With parameter insertion, the server only has to compile the query once (as it is the same every time) and execute it with different data, but with string formatting, it will have to compile it over and over again.

Jesse Bakker
  • 2,403
  • 13
  • 25
  • Hi Jesse - thank you VERY much indeed, especially for also adding the information on performance (which is very relevant in my case)! –  Feb 06 '18 at 23:19
  • @Mephisto3 for inserting many rows, take a look at the `executemany` function – Jesse Bakker Feb 06 '18 at 23:38
  • Thank you again! And ALSO thank you for pointing me to marking an answer as accepted... obviously, I am new to many things, one of which is Stackoverflow (at least as somebody who posts a question)... –  Feb 07 '18 at 01:56
5

In addition to what was said above, I would like to add some details that I did not immediately understand, and that other (newbies like me ;)) may also find helpful:

1) "parameter insertion" is meant for only for values, it will not work for table names, column names, etc. - for those, the Python string substitution works fine in the sql syntax defintion

2) the cursor.execute function requires a tuple to work (as specified here, albeit not immediately clear, at least to me: https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursor-execute.html)

EXAMPLE for both in one function:

def checkIfRecordExists(column, table, condition_name, condition_value):
    ...
    sqlSyntax = 'SELECT {} FROM {} WHERE {} = %s'.format(column, table, condition_name)
    cursor.execute(sqlSyntax, (condition_value,))

Note both the use of .format in the initial sql syntax definition and the use of (condition_value,) in the execute function.