I have a nested JSON like this
{ key1: "value1", key2: { key21: { key211: "value211" } }
If I create a MultiIndex from tuples of keys I get
[(key1, nan, nan), (key2, key21, key211)]
As you can see with this I cannot do
df[key1] = 'scalar'
because def[key1]
is a dataframe which will result into level mismatch while assignment.
So, how to use dataframe for such JSON ?
What I know so far -
Normalise - I can normalise the data and operate. But I have to convert back to JSON. pandas has no out of the box API for this. So I can loop on rows and make objects. But if multiple rows makes one object in case of any key being a list, I have to group or index then make it. In any case, I would like to avoid looping.
What I am thinking -
Maybe a combination of Index and MultiIndex. So that
df[key1]
is Series and I can also useto_dict
API of pandas DataFrame. But I am not able to figure out how to do this or even if this is possible.Is there a way to take care of
nan
in my MultiIndex, so that I can accessdf[key1]
as a series. This way I don't have to worry about how deep my levels go.
UPDATE 1
I still could not figure out to handle this scenario with MultiIndex, so I ended up working with flat/normalised data. So anyone with a deadline, I would recommend working on flat structure and use the utilities given in this question - Inverse of Pandas json_normalize