
I have a Python script that uses PyMySQL to connect to a MySQL database and insert rows into it. Some of the columns in the target table are of type json.

I know that in order to insert JSON, we can run something like:

my_json = {"key": "value"}

insert_query = """INSERT INTO my_table (my_json_column) VALUES ('%s')""" % (json.dumps(my_json))

cursor = connection.cursor()
cursor.execute(insert_query)
connection.commit()

The problem in my case is that the JSON is variable and I don't have much control over it (it comes from an API call to a third-party endpoint), so my script keeps throwing new errors for non-valid JSON values.

For example, the JSON could very well contain a stringified JSON document as a value, so my_json would look like:

{"key": "{\"key_str\":\"val_str\"}"}

→ In this case, running the usual insert script throws an error: [ERROR] OperationalError: (3140, 'Invalid JSON text: "Missing a comma or \'}\' after an object member." at position 1234 in value for column \'my_table.my_json_column\'.')
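To illustrate, here's a minimal reproduction of what happens to the nested value before it even reaches the server (my_table / my_json_column are placeholder names from my script):

```python
import json

# The API sometimes returns a value that is itself a stringified JSON document
my_json = {"key": "{\"key_str\":\"val_str\"}"}

dumped = json.dumps(my_json)
# json.dumps escapes the inner double quotes with backslashes
print(dumped)

# Interpolating this into an SQL string literal is where it goes wrong:
# MySQL consumes the backslashes as escape characters, so the server
# ends up parsing {"key": "{"key_str":"val_str"}"}, which is invalid JSON
query = """INSERT INTO my_table (my_json_column) VALUES ('%s')""" % (dumped)
print(query)
```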

Another example is JSON that contains a single quotation mark in one of the values, something like:

{"key" : "Here goes my value with a ' quotation mark"}

→ In this case, the usual insert script returns an error like the one below, unless I manually escape those single quotation marks by replacing them. [ERROR] ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'key': 'Here goes my value with a ' quotation mark' at line 1")
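Again for illustration (placeholder table/column names), the single quote ends up unescaped inside the SQL string literal, which is what trips the parser:

```python
import json

my_json = {"key": "Here goes my value with a ' quotation mark"}

# json.dumps only escapes double quotes, so the single quote passes through as-is
dumped = json.dumps(my_json)

# The ' inside the value terminates the SQL string literal early,
# leaving the rest of the JSON as garbage for the SQL parser
query = """INSERT INTO my_table (my_json_column) VALUES ('%s')""" % (dumped)
print(query)
```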

So my question is the following: are there any best practices I might be missing that I can use to keep my script from breaking in the two scenarios above, as well as in any other cases where the JSON might break the insert query?

I read some existing posts (like this one here, or this one) where it's recommended to insert the JSON into a string or a blob column, but I'm not sure whether that's good practice, or whether other issues (like string length limitations, for example) might arise from using a string column instead of a json one.

Thanks !

Malek
