0

lets say i have a dataframe value

e.g. test_data.csv (contain below data)

effective_date,ds,id,id_type,e_data,create_id,create_timestamp
2021-07-26,am,27,a_id,"{""cup_id"": ""ffdsds"", ""rate"": ""B"", ""direct"": ""stable"", ""dl_tstmp"": ""2021-07-26 00:00:00"", ""inst_id"": 1213, ""src_p_tstmp"": ""2021-07-26 00:00:00"", ""inst_name"": ""abc corp""}",MA,2021-07-26 00:00:00
2021-07-26,am,24,a_id,"{""cup_id"": ""ererwe"", ""rate"": ""AB"", ""direct"": ""improvent"", ""dl_tstmp"": ""2021-07-26 00:00:00"", ""inst_id"": 66641, ""src_p_tstmp"": ""2021-07-26 00:00:00"", ""inst_name"": ""xyz corp""}",MA,2021-07-26 00:00:00
2021-07-27,am,22,a_id,"{""cup_id"": ""34kf3"", ""rate"": ""AA"", ""direct"": ""improvent"", ""dl_tstmp"": ""2021-07-26 00:00:00"", ""inst_id"": 6871, ""src_p_tstmp"": ""2021-07-26 00:00:00"", ""inst_name"": ""rimr corp""}",MA,2021-07-26 00:00:00
2021-07-27,am,32,a_id,"{""cup_id"": ""5gh23"", ""rate"": ""AAA"", ""direct"": ""downfall"", ""dl_tstmp"": ""2021-07-26 00:00:00"", ""inst_id"": 98795, ""src_p_tstmp"": ""2021-07-26 00:00:00"", ""inst_name"": ""prst corp""}",MA,2021-07-26 00:00:00


import pandas as pd
df = pd.read_csv("test_data.csv")

enter image description here

in which e_data column is in json_format not in dictionary format enter image description here

which i wanted to distribute into separate column format hence the expected output is as follows

enter image description here

  • Possibly a duplicate to (https://stackoverflow.com/questions/21104592/json-to-pandas-dataframe) – Manny Jul 28 '21 at 03:51
  • Did your query solved? if so then try considering [accepting](https://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work/5235#5235) to signal others that the issue is resolved. If not, you can provide feedback so the answer can be improved (or removed altogether) – Anurag Dabas Aug 14 '21 at 06:28

1 Answers1

1

Firstly convert the string dict to real dict:

from ast import literal_eval

df['e_data']=df['e_data'].map(literal_eval)

Finally:

try join()+DataFrame()+tolist() and pop() for removing 'e_data' column:

df=df.join(pd.DataFrame(df.pop('e_data').tolist()))

OR

df=df.join(df['e_data'].apply(pd.Series)).drop('e_data',1)
Anurag Dabas
  • 23,866
  • 9
  • 21
  • 41