I have nested JSON that I would like to unpack into a pandas DataFrame. I can do it with the following code. Is there a way to modify the code so that it no longer needs the global variable?
import pandas as pd

d = {
    "name": "Vertebrates",
    "children": [
        {
            "name": "Mammals",
            "children": [
                {
                    "name": "human"
                },
                {
                    "name": "chimpanzee"
                }
            ]
        },
        {
            "name": "Birds",
            "children": [
                {
                    "name": "chicken"
                },
                {
                    "name": "turkey"
                }
            ]
        }
    ]
}

path = []  # global list holding the names of the ancestors of the current node

def unpack(d):
    global path
    if len(d) == 1:
        # leaf: the dict has only a "name" key
        yield d['name'], path
    else:
        path.append(d['name'])
        for item in d['children']:
            yield from unpack(item)
        # rebind the global to a new list without the last item
        path = path[:-1]

pd.DataFrame.from_dict({key: value for key, value in unpack(d)}, orient='index')
EDIT:
I actually started with path as a keyword argument. The issue was that I was getting this:
('human', ['Vertebrates', 'Mammals'])
('chimpanzee', ['Vertebrates', 'Mammals'])
('chicken', ['Vertebrates', 'Mammals', 'Birds'])
('turkey', ['Vertebrates', 'Mammals', 'Birds'])
where, for chicken and turkey, path still contains 'Mammals', because the line "path = path[:-1]" had no effect in that version. So I decided to use a global variable to make sure the last item is removed whenever a branch of the recursion finishes.
SOLVED: blhsing's answer solves the problem by removing the append call. bigwillydos's answer also does the trick.
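For reference, here is a minimal sketch of what dropping the append can look like. It is my own reconstruction and may differ from the exact code in either answer. It assumes path is passed as an argument and extended with "path + (name,)", which builds a new tuple for each call, so nothing has to be undone afterwards. The parameter name node and the check for a 'children' key (instead of len(d) == 1) are my choices for illustration.

```python
d = {
    "name": "Vertebrates",
    "children": [
        {"name": "Mammals", "children": [{"name": "human"}, {"name": "chimpanzee"}]},
        {"name": "Birds", "children": [{"name": "chicken"}, {"name": "turkey"}]},
    ],
}

def unpack(node, path=()):
    """Yield (leaf_name, ancestors) pairs from a nested name/children dict."""
    if 'children' not in node:
        # leaf: report the ancestors collected on the way down
        yield node['name'], list(path)
    else:
        for item in node['children']:
            # path + (...) creates a new tuple; the caller's path is untouched
            yield from unpack(item, path + (node['name'],))

rows = dict(unpack(d))
```

The resulting rows dict can be passed to pd.DataFrame.from_dict(rows, orient='index') exactly as in the original code.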
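I didn't realize that inside a recursive call, path.append(...) modifies the list in place, so the change is visible to every caller that shares that list, whereas "path = path[:-1]" only rebinds the local name to a new list and has no effect outside the current call. That is why I was getting an accumulated path for the later names.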
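The difference can be seen in isolation with a small sketch, with no recursion involved. The function names add_level and drop_level are hypothetical and exist only for this illustration.

```python
def add_level(path):
    # append mutates the list object the caller passed in
    path.append('Mammals')

def drop_level(path):
    # slicing builds a new list; the assignment rebinds only the local name
    path = path[:-1]

shared = ['Vertebrates']

add_level(shared)
after_add = list(shared)    # the caller sees the appended item

drop_level(shared)
after_drop = list(shared)   # the caller's list is unchanged by the rebinding
```

Here after_add and after_drop are both ['Vertebrates', 'Mammals'], which is the same accumulation I was seeing for chicken and turkey.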