0

I need to analyze metadata from here: http://jmcauley.ucsd.edu/data/amazon/links.html

However, the metadata files there are nested and use single quotes instead of double quotes, so they are not valid JSON. As a result, I can't parse them as JSON and use json_normalize to flatten the data into a Pandas DataFrame.

Example:

{'A':'1', 'B':{'c':['1','2'], 'd':['3','4']}}

I need to flatten this into a Pandas DataFrame with the columns A, B.c and B.d. Following the guideline given in the link, I used eval to get A and B, but I can't get B.c and B.d.

Could you please suggest a way to do this?

jonrsharpe

3 Answers

1

JSON cannot have keys or string values enclosed in single quotes. If you have to parse a single-quoted string as a dict, you can use ast.literal_eval:

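import ast

import pandas as pd

# Simulate the raw single-quoted text as it would be read from the file
data = str({'A':'1', 'B':{'c':['1','2'], 'd':['3','4']}})
# Safely parse the Python-literal string into a dict
data_dict = ast.literal_eval(data)

# Flatten the parsed dict (not the raw string); pd.json_normalize needs pandas >= 1.0
data_normalized = pd.json_normalize(data_dict)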

https://stackoverflow.com/a/21154138/13561487
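The same idea can be extended to a whole file. The following is only a minimal sketch, assuming each line of the metadata file holds one Python-literal dict; the in-memory `raw` text and the `parse_records` helper are hypothetical stand-ins, not part of the dataset's own tooling.

```python
import ast
import io

import pandas as pd

# Hypothetical stand-in for a metadata file: one Python-literal dict per line.
raw = io.StringIO(
    "{'A': '1', 'B': {'c': ['1', '2'], 'd': ['3', '4']}}\n"
    "{'A': '2', 'B': {'c': ['5'], 'd': ['6', '7']}}\n"
)

def parse_records(lines):
    """Yield one dict per non-empty line, parsed without executing any code."""
    for line in lines:
        line = line.strip()
        if line:
            yield ast.literal_eval(line)

records = list(parse_records(raw))

# json_normalize accepts a list of dicts and joins nested keys with a dot.
df = pd.json_normalize(records)
print(df.columns.tolist())
```

For a real file, `raw` would be replaced by an open file handle; very large files may need to be processed in chunks rather than collected into one list.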

Community
0

That's a dict, not JSON. If you want to convert it to a DataFrame, just do this:

import pandas as pd

d = {'A':'1', 'B':{'c':['1','2'], 'd':['3','4']}}
df = pd.DataFrame(d)

   A       B
c  1  [1, 2]
d  1  [3, 4]
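Note that this layout puts c and d in the index rather than in columns named B.c and B.d. As a rough comparison, assuming pandas 1.0 or later, the sketch below builds both shapes from the same dict; the variable names `wide` and `flat` are only illustrative.

```python
import pandas as pd

d = {'A': '1', 'B': {'c': ['1', '2'], 'd': ['3', '4']}}

# Passing the dict straight to DataFrame: inner keys become the row index.
wide = pd.DataFrame(d)

# json_normalize: inner keys become dotted column names in a single row.
flat = pd.json_normalize(d)

print(wide.columns.tolist(), wide.index.tolist())
print(flat.columns.tolist(), flat.shape)
```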
NYC Coder
0

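If your problem is loading this text into a Python dict, you can try a couple of things:

  1. replace single quotes with double quotes -> json.loads(data.replace("'",'"')) (this breaks if any value itself contains a quote character)
  2. read it as a Python dict -> eval(data) (only for trusted input, since eval executes arbitrary code; ast.literal_eval(data) is the safer equivalent)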
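To illustrate how the two options can differ, here is a small sketch using a made-up record whose value contains apostrophes; the keys `title` and `tags` are hypothetical and not taken from the dataset. It uses ast.literal_eval in place of eval, which parses literals only.

```python
import ast
import json

# Hypothetical record: the title itself contains single quotes.
text = "{'title': \"Rock 'n' Roll\", 'tags': ['a', 'b']}"

# Parsing as a Python literal keeps the inner apostrophes intact.
parsed = ast.literal_eval(text)

# Blind quote replacement turns the inner apostrophes into stray double
# quotes, so the result is no longer valid JSON.
try:
    json.loads(text.replace("'", '"'))
    replaced_ok = True
except json.JSONDecodeError:
    replaced_ok = False

print(parsed["title"], replaced_ok)
```

If the data never contains quote characters inside values, the replacement approach may still be good enough.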
mojozinc