0

I am reading a data table from an API that returns me the data in JSON format, and one of the columns is itself a JSON string. I succeed in creating a Pandas dataframe for the overall table, but in the process of reading it, double quotes in the JSON string get converted to single quotes, and I can't parse the nested JSON.

I can't provide a reproducible example, but here is the key code:

myResult = requests.get(myURL, headers = myHeaders).text
myDF = pd.read_json(myResult, orient = "records", dtype = {"custom": str}, encoding = "unicode_escape")

Where custom is the nested JSON string. Try as I might by setting the dtype and encoding arguments, I cannot force Pandas to preserve the double quotes in the string.

So what started off as:

"custom": {"Field1":"Value1","Field2":"Value2"}

gets into the dataframe as:

{'Field1':'Value1','Field2':'Value2'}

I found this question which suggests using a custom parser for read_csv - but I can't see that this option is available for read_json.

I found a few suggestions here but the only one I could try was manually replacing the double quotes - and this causes fresh errors because there are apostrophes contained within the nested field values themselves...

The JSON strings are formatted correctly within myResult so it's the parsing applied by read_json that's the problem. Is there any way to change that or do I need to find some other way of reading this in?

Tom Wagstaff
  • 1,443
  • 2
  • 13
  • 15
  • when you say nestet json how deep is it nested? Your DataFrame has only two dimensions, so if its double-nested it only can save it as you have shown. Can you provide the JSON String? – Florian H Nov 13 '18 at 15:38
  • You now have a dictionary...just use it as a dictionary, unless you need to perform string operations on it? – John R Nov 13 '18 at 15:39
  • OK thanks guys - at least I haven't missed a simple option to force it in as-is. Yes I can use it as a dictionary – Tom Wagstaff Nov 13 '18 at 16:13

0 Answers0