I have a task to transform data from csv file to nested JSON file using python for web usage. I tried the Python code in this article. The desired output would be one member_id
shows one time in the JSON file, same for the tag_name
under one member_id
. The problem is when I tried groupby
with only member_id
, the tag_name
'm1' would show multiple times under 'abc123' for example. If I tried groupby
with both member_id
and tag_name
, 'abc123' would show up two times for tag 'm1' and 'm2'. I've been googling for a while but most resolutions only for one nested(not sure if I use the right term). Please let me know if any possible way to do this.
Sample Code:
import json
import pandas as pd
df = pd.read_csv('../detail.csv', sep=',', header=0
, index_col=False
, dtype = {'member_id':str,'tag_name':str,'detail_name':str,'detail_value':str} )
group = df.groupby(['member_id','tag_name'])
finalList, finalDict = [], {}
for key, value in group:
dictionary, dictionary1, dictList, dictList1 = {}, {}, [], []
j = group.get_group(key).reset_index(drop=True)
dictionary['member_id'] = j.at[0,'member_id']
dictionary1['tag_name'] = j.at[0,'tag_name']
for i in j.index:
anotherDict = {}
anotherDict['detail_name'] = j.at[i,'detail_name']
anotherDict['detail_value'] = j.at[i,'detail_value']
dictList1.append(anotherDict.copy())
dictionary1['detail'] = dictList1
dictList.append(dictionary1)
dictionary['tag'] = dictList
finalList.append(dictionary)
json.dumps(finalList,ensure_ascii = False)
detail.csv:
member_id, tag_name, detail_name, detail_value
-------------------------------------------------------
abc123, m1, Service_A, 20
abc123, m1, Service_B, 20
abc123, m2, Service_C, 10
xyz456, m3, Service A, 5
xyz456, m3, Service A, 10
Desired Output JSON:
{ "member_id": "abc123",
"tag":[ {"tag_name": "m1",
"detail":[{ "detail_name": "Service_A",
"detail_value": "20"},
{ "detail_name": "Service_B",
"detail_value": "20"}]},
{"tag_name": "m2",
"detail":[{ "detail_name": "Service_C",
"detail_value": "10"}]}]},
{ "member_id": "xyz456",
"tag":[{"tag_name": "m3",
"detail":[{ "detail_name": "Service_A",
"detail_value": "5"},
{ "detail_name": "Service_A",
"detail_value": "10"}]}]}