I have a dataframe df. Most of the columns are JSON strings, while some are lists of JSON objects. A preview of sample rows is shown below:
id movie genres
1 John [{'id': 28, 'name': 'Action'}, {'id': 12, 'name': 'Adventure'}, {'id': 878, 'name': 'Science Fiction'}]
2 Mike [{'id': 28, 'name': 'Action'}]
3 Jerry []
As shown above, the genres column holds a different number of items in each row.
I want to extract only the values of the 'name' keys and put them into separate columns. For example, if a row has three 'name' keys, three separate columns are needed to store the respective values (each 'name' is a genre). The new columns could be called 'genre1', 'genre2', etc.
I need at most 4 genre columns.
I tried this code:
pd.concat([df['genres'].apply(pd.Series)], axis=1)
It did not give me the output I need.
The output should be:
id movie genre1 genre2 genre3
1 John Action Adventure Science Fiction
2 Mike Action
3 Jerry None
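One possible way to get this shape is sketched below. It is not a definitive solution: it assumes each genres cell is either a Python list of dicts or a string that ast.literal_eval can parse into one, and the names MAX_GENRES and genre_names are hypothetical helpers introduced only for illustration.

```python
import ast

import pandas as pd

# Sample data mirroring the preview above (assumed to hold lists of dicts).
df = pd.DataFrame({
    "id": [1, 2, 3],
    "movie": ["John", "Mike", "Jerry"],
    "genres": [
        [{"id": 28, "name": "Action"}, {"id": 12, "name": "Adventure"},
         {"id": 878, "name": "Science Fiction"}],
        [{"id": 28, "name": "Action"}],
        [],
    ],
})

MAX_GENRES = 4  # hypothetical constant for the "4 columns max" requirement


def genre_names(cell, limit=MAX_GENRES):
    """Return up to `limit` genre names from one cell of the genres column."""
    # Assumption: a string cell is a Python-literal list, so parse it first.
    if isinstance(cell, str):
        cell = ast.literal_eval(cell)
    return [item["name"] for item in cell][:limit]


# One list of names per row, e.g. ['Action', 'Adventure', 'Science Fiction'].
names = df["genres"].apply(genre_names)

# One column per position; rows with fewer genres are padded with missing values.
genre_cols = pd.DataFrame(names.tolist(), index=df.index)
genre_cols.columns = [f"genre{i + 1}" for i in range(genre_cols.shape[1])]

# Replace the original genres column with the new genre columns.
result = pd.concat([df.drop(columns="genres"), genre_cols], axis=1)
print(result)
```

With this sample, only genre1 to genre3 are created because no row has a fourth genre. If four columns are always wanted, genre_cols could be reindexed to a fixed list of column names before the concat.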