Convert JSON data in data frame Python

Question

I am beginner of programming language, so it would be appreciated you help and support.

Here is DataFrame and one column' data is JSON type? of data.

ID, Name, Information
1234, xxxx, '{'age': 25, 'gender': 'male'}'
2234, yyyy, '{'age': 34, 'gender': 'female'}'
3234, zzzz, '{'age': 55, 'gender': 'male'}'

I would like to covert this DataFrame as below.

ID, Name, age, gender
1234, xxxx, 25, male
2234, yyyy, 34, female
3234, zzzz, 55, male

I found that ast.literal_eval() can convert str to dict type, but I have no idea how to write code of this issue.

Would you please give some example of a code which can solve this issue?

Kassian Sun · Answer 1 · 2020-08-13T00:44:36.023

If third column is a JSON string, ' is not valid, it should be ", so we need to fix this.
If the third column is a string representation of python dict, you can use eval to convert it.

A sample of code to split third column of type dict and merge into the original DataFrame:

data = [
  [1234, 'xxxx', "{'age': 25, 'gender': 'male'}"],
  [2234, 'yyyy', "{'age': 34, 'gender': 'female'}"],
  [3234, 'zzzz', "{'age': 55, 'gender': 'male'}"],
]

df = pd.DataFrame().from_dict(data)

df[2] = df[2].apply(lambda x: json.loads(x.replace("'", '"'))) # fix the data and convert to dict
merged = pd.concat([df[[0, 1]], df[2].apply(pd.Series)], axis=1)

Hi Kassian, thank you for your answer. I will try it to your suggestion too. — montjeu, Aug 14 '20 at 00:36

Trenton McKinney · Accepted Answer · 2020-08-13T01:12:16.207

Given `test.csv`

ID,Name,Information
1234,xxxx,"{'age': 25, 'gender': 'male'}"
2234,yyyy,"{'age': 34, 'gender': 'female'}"
3234,zzzz,"{'age': 55, 'gender': 'male'}"

Read the file in with pd.read_csv and use the converters parameter with ast.literal_eval, which will convert the data in the Information column from a str type to dict type.
Use pd.json_normalize to unpack the dict with keys as column headers and values in the rows
.join the normalized columns with df
.drop the Information column

import pandas as pd
from ast import literal_eval

df = pd.read_csv('test.csv', converters={'Information': literal_eval})

df = df.join(pd.json_normalize(df.Information))

df.drop(columns=['Information'], inplace=True)

# display(df)
     ID  Name  age  gender
0  1234  xxxx   25    male
1  2234  yyyy   34  female
2  3234  zzzz   55    male

If the data is not from a csv file

import pandas as pd
from ast import literal_eval

data = {'ID': [1234, 2234, 3234],
        'Name': ['xxxx', 'yyyy', 'zzzz'],
        'Information': ["{'age': 25, 'gender': 'male'}", "{'age': 34, 'gender': 'female'}", "{'age': 55, 'gender': 'male'}"]}

df = pd.DataFrame(data)

# apply literal_eval to Information
df.Information = df.Information.apply(literal_eval)

# normalize the Information column and join to df
df = df.join(pd.json_normalize(df.Information))

# drop the Information column
df.drop(columns=['Information'], inplace=True)

Hi Tranton, Thank you for your answer with detail information. Those detail help me to learn python much easier. Problem solved in CSV way you posted first. Thank you. — montjeu, Aug 14 '20 at 00:35

Convert JSON data in data frame Python

2 Answers2

Given test.csv

If the data is not from a csv file

Given `test.csv`