How to expand one DataFrame column holding JSON-formatted values into separate columns (Python)

Question

Let's say I have a DataFrame loaded from a CSV file,

e.g. test_data.csv (containing the data below):

effective_date,ds,id,id_type,e_data,create_id,create_timestamp
2021-07-26,am,27,a_id,"{""cup_id"": ""ffdsds"", ""rate"": ""B"", ""direct"": ""stable"", ""dl_tstmp"": ""2021-07-26 00:00:00"", ""inst_id"": 1213, ""src_p_tstmp"": ""2021-07-26 00:00:00"", ""inst_name"": ""abc corp""}",MA,2021-07-26 00:00:00
2021-07-26,am,24,a_id,"{""cup_id"": ""ererwe"", ""rate"": ""AB"", ""direct"": ""improvent"", ""dl_tstmp"": ""2021-07-26 00:00:00"", ""inst_id"": 66641, ""src_p_tstmp"": ""2021-07-26 00:00:00"", ""inst_name"": ""xyz corp""}",MA,2021-07-26 00:00:00
2021-07-27,am,22,a_id,"{""cup_id"": ""34kf3"", ""rate"": ""AA"", ""direct"": ""improvent"", ""dl_tstmp"": ""2021-07-26 00:00:00"", ""inst_id"": 6871, ""src_p_tstmp"": ""2021-07-26 00:00:00"", ""inst_name"": ""rimr corp""}",MA,2021-07-26 00:00:00
2021-07-27,am,32,a_id,"{""cup_id"": ""5gh23"", ""rate"": ""AAA"", ""direct"": ""downfall"", ""dl_tstmp"": ""2021-07-26 00:00:00"", ""inst_id"": 98795, ""src_p_tstmp"": ""2021-07-26 00:00:00"", ""inst_name"": ""prst corp""}",MA,2021-07-26 00:00:00


import pandas as pd
df = pd.read_csv("test_data.csv")

Here the e_data column holds JSON-formatted strings, not Python dictionaries.

I want to split it into separate columns, one per JSON key (cup_id, rate, direct, dl_tstmp, inst_id, src_p_tstmp, inst_name), alongside the existing columns.

Possibly a duplicate of https://stackoverflow.com/questions/21104592/json-to-pandas-dataframe – Manny, Jul 28 '21 at 03:51
Was your query solved? If so, consider [accepting](https://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work/5235#5235) the answer to signal to others that the issue is resolved. If not, you can provide feedback so the answer can be improved (or removed altogether) – Anurag Dabas, Aug 14 '21 at 06:28

Anurag Dabas · Answer 1 · 2021-07-28T05:54:51.453

1

First, convert each string representation of a dict into a real dict:

from ast import literal_eval

df['e_data']=df['e_data'].map(literal_eval)
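Since the column holds JSON, `json.loads` from the standard library is an alternative parser worth considering. `literal_eval` happens to work on the sample rows because they contain only strings and numbers, but it rejects JSON literals such as `true`, `false`, and `null`. A minimal sketch of the difference, assuming a hypothetical one-row frame named `df_demo` whose `active` key is invented for illustration and is not in the question's data:

```python
import json
from ast import literal_eval

import pandas as pd

# Hypothetical one-row frame; the "active" key is invented to show a JSON literal
df_demo = pd.DataFrame(
    {"e_data": ['{"cup_id": "ffdsds", "inst_id": 1213, "active": true}']}
)

# json.loads understands JSON literals such as true/false/null
parsed = df_demo["e_data"].map(json.loads)

# literal_eval only accepts Python literals, so the bare name `true` is rejected
try:
    literal_eval(df_demo["e_data"].iloc[0])
    literal_eval_ok = True
except (ValueError, SyntaxError):
    literal_eval_ok = False
```

If the real data never contains those literals, either parser should give the same dictionaries.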

Finally, combine join(), DataFrame() and tolist(), using pop() to remove the 'e_data' column:

df=df.join(pd.DataFrame(df.pop('e_data').tolist()))

OR

df=df.join(df['e_data'].apply(pd.Series)).drop(columns='e_data')
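The steps above can be put together into one runnable sketch. It makes a few assumptions that are not part of the answer: the CSV is inlined through `io.StringIO` with two rows from the question whose `e_data` values are abridged to three keys, and the parsing is done while reading via the `converters` argument of `read_csv` with `json.loads` rather than `literal_eval`.

```python
import io
import json

import pandas as pd

# Two rows from the question's test_data.csv, with e_data abridged to three keys
csv_text = '''effective_date,ds,id,id_type,e_data,create_id,create_timestamp
2021-07-26,am,27,a_id,"{""cup_id"": ""ffdsds"", ""rate"": ""B"", ""inst_id"": 1213}",MA,2021-07-26 00:00:00
2021-07-26,am,24,a_id,"{""cup_id"": ""ererwe"", ""rate"": ""AB"", ""inst_id"": 66641}",MA,2021-07-26 00:00:00
'''

# converters= parses each e_data cell with json.loads while the file is read
df = pd.read_csv(io.StringIO(csv_text), converters={"e_data": json.loads})

# pop() removes e_data from df; the list of dicts becomes one column per key
expanded = pd.DataFrame(df.pop("e_data").tolist(), index=df.index)

# join() aligns on the shared index and appends the new columns
result = df.join(expanded)
print(result)
```

Passing `index=df.index` keeps the expanded columns aligned with the original rows even if the frame was filtered or re-indexed before this step.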

edited Jul 28 '21 at 05:54

answered Jul 28 '21 at 03:48


Hi @anurag dabas, this logic works fine here, but it does not work when I run it on my dataset. It may work on dictionaries but not on JSON: my column stores its values in JSON format, not as dictionaries – Anil Tiwari Jul 28 '21 at 04:35
@AnilTiwari so post that format in the question instead of posting a Series of dictionaries – Anurag Dabas Jul 28 '21 at 04:36
@AnilTiwari so how are you reading it? I mean, can you tell me the separator? – Anurag Dabas Jul 28 '21 at 04:44
Basically, it is a CSV – Anil Tiwari Jul 28 '21 at 04:49
@AnilTiwari yes sir, I know; that's why I am asking which separator you are using in the `read_csv()` method to load the CSV file – Anurag Dabas Jul 28 '21 at 04:51
I have changed the data, which should be more helpful; now you can simply read it with df = pd.read_csv("test_data.csv") – Anil Tiwari Jul 28 '21 at 05:40
@AnilTiwari updated answer...kindly have a look **:)** – Anurag Dabas Jul 28 '21 at 05:55
