How to convert JSON column in DataFrame to "normal" columns in DataFrame in Python Pandas?

Question

I have DataFrame like below:

ABT = pd.read_excel("ABT.xlsx")

DATA TYPES:

COL1 - float
COL2 - int
COL3 - object

COL1 COL2 COL3 COL4

1.2 5 {"X":"cc", "y":12} {"A":{1,2}, "B":{3,3}"}

0.0 2 {"X":"dd", "y":13} {"A":{0,1}, "B":{2,2}"}

2.22 0 {"X":"ee", "y":45} {"A":{5,5}, "B":{1,1}"}

... ... ... ...

And I need to have something like below:

| COL1 | COL2| COL3               | X   | y   | A  | B 
|------|-----|--------------------|-----|-----|----------
| 1.2  | 5   | {"X":"cc", "y":12} | cc  | 12  |
| 0.0  | 2   | {"X":"dd", "y":13} | dd  | 13  |
| 2.22 | 0   | {"X":"ee", "y":45} | ee  | 45  |
| ...  | ... | ...                | ... | ... |

I tried to use code like below, but it does not work: pd.json_normalize(ABT) because of error: AttributeError: 'str' object has no attribute 'values'

I also tried this one: pd.io.json.json_normalize(ABT.COL3[0]) but I have error: AttributeError: 'str' object has no attribute 'values'

How can I do that in Python Pandas ? I have a problem to image how should look output for values in COL4 ?

In my real DF: When I use ABT.head().to_dict('list') I have liek below:

{'COL1': [0.0],
 'COL2': [2],
 'COL3': [2162561990],
 'COL4': [1500.0],
 'COL5': [750.0],
 'COL6': ['{"paAccounts": {"mySector": 4, "otherSectors": 10}}'],
 'COL7': ['{"grade": "CC"}']}

mozway as I wrote in question pd.json_normalize(ABT) generate error — dingaro, Dec 22 '22 at 19:03
Then combine with `ast.literal_eval` (NB. you should provide a reproducible example as DataFrame constructor). — mozway, Dec 22 '22 at 19:04
mozway I am not able to provide constructor I can not create sample of JSON column, but could you simply present line of code and I will implement it on my DataFrame ? — dingaro, Dec 22 '22 at 19:10
mozway, it is not the whole DF, because I have many rows but sample is like below: — dingaro, Dec 22 '22 at 19:15
{'COL1': [0.0, 0.0, 0.0, 0.0, 0.0], 'COL2': [2, 0, 33, 33, 2], 'COL3': [2162561990, 2167912785, 599119703, 603482478, 2162561990], 'COL4': [1500.0, 500.0, 3500.0, 1500.0, 1500.0], 'COL5': [750.0, 0.0, 3500.0, 1485.0, 750.0], 'COL6': ['{"paAccounts": {"mySector": 4, "otherSectors": 10}, "npaAccounts": {"mySector": 0, "otherSectors": 0}} — dingaro, Dec 22 '22 at 19:15
Looks like your output is truncated, can you add it as edit to the question? — mozway, Dec 22 '22 at 19:20
mozway, I edited my question but I have a problem to even image how can look output for values in COL4 where we have nested JSON — dingaro, Dec 22 '22 at 19:27
I don't see the dictionary in your edit. I'm sorry but I can't help you without a reproducible input. Let me know if you add it, eventually. — mozway, Dec 22 '22 at 19:29
mozway, I edited my question, at the end you have sample data after use line: ABT.head().to_dict('list') — dingaro, Dec 22 '22 at 19:37
I'm really sorry but the output you provide is invalid. This cannot be the direct output of what I requested. You must have been editing it or failed to copy a part. — mozway, Dec 22 '22 at 19:44
mozway, You are right, I edited question again, now shoudl work good, could you check at the end of the question ? — dingaro, Dec 22 '22 at 19:47
do you have some idea mozway ? I checked the whole internet and I can not do that :/ — dingaro, Dec 22 '22 at 20:00
Try: `pd.json_normalize(df['COL7'].apply(ast.literal_eval))`, for COL6, the json is invalid — mozway, Dec 22 '22 at 20:06
I have error using your line on COL7: ValueError: malformed node or string: nan, and I edited COL6 because previously there were lack of "}" — dingaro, Dec 22 '22 at 20:08

COL1	COL2	COL3	COL4
1.2	5	{"X":"cc", "y":12}	{"A":{1,2}, "B":{3,3}"}
0.0	2	{"X":"dd", "y":13}	{"A":{0,1}, "B":{2,2}"}
2.22	0	{"X":"ee", "y":45}	{"A":{5,5}, "B":{1,1}"}
...	...	...	...

How to convert JSON column in DataFrame to "normal" columns in DataFrame in Python Pandas?

0 Answers0