How to split single column with list of dictionaries into multiple columns in pandas?

Question

Currently playing around with some data from an API using pandas, but one area I'm particularly interested in seems to be crammed into one column as a list of dictionaries.

How can I split this column out into four new columns?

Currently:

chip_plays
__________
[{"chip_name":"bboost",
"num_played":114585},
{"chip_name":"freehit",
"num_played":89437},
{"chip_name":"wildcard",
"num_played":263066},
{"chip_name":"3xc",
"num_played":216711}]

[{"chip_name":"bboost",
"num_played":79069},
{"chip_name":"freehit",
"num_played":41897},
{"chip_name":"wildcard",
"num_played":103086},
{"chip_name":"3xc",
"num_played":74713}]

...

Desired outcome:

bboost     freehit     wildcard     3xc
_________________________________________
114585     89437       263066       216711
79069      41897       103086       74713
...

In case it's useful, the api link is: https://fantasy.premierleague.com/api/bootstrap-static/

Screenshot of dataframe: https://share.getcloudapp.com/o0u8yYZn

Hello @lookofdisdain. Welcome to S.O. Please provide some code with what you have tried so far — alfonso.kim, Aug 07 '20 at 03:41

score 0 · Answer 1 · answered Jul 30 '20 at 16:20

0

pivot_table does the job:

df = pd.DataFrame([{"chip_name":"bboost",
"num_played":114585},
{"chip_name":"freehit",
"num_played":89437},
{"chip_name":"wildcard",
"num_played":263066},
{"chip_name":"3xc",
"num_played":216711}])

print(pd.pivot_table(df, values='num_played', columns='chip_name'))
chip_name      3xc  bboost  freehit  wildcard
num_played  216711  114585    89437    263066

answered Jul 30 '20 at 16:20

Rm4n

623
7
14

Would that work over multiple rows though? [screenshot] (https://share.getcloudapp.com/o0u8yYZn) – lookofdisdain Jul 30 '20 at 16:27
you can set the `values` as the list of desired rows – Rm4n Jul 30 '20 at 16:35

JPI93 · Answer 2 · 2020-07-30T20:05:00.087

PLEASE SEE UPDATE

The below example provides the output you are looking for given the response provided in your original question.

It creates the df you want by way of a dictionary comprehension which:

Iterates across the list (response)
Inserts each "chip_name" value as a key
Attributes a list containing the corresponding "num_played" value to each key

The resultant dictionary (data) is then passed as an argument to the pd.DataFrame() to create your desired output.

import pandas as pd

response = [{"chip_name":"bboost",
"num_played":114585},
{"chip_name":"freehit",
"num_played":89437},
{"chip_name":"wildcard",
"num_played":263066},
{"chip_name":"3xc",
"num_played":216711}]

data = { i['chip_name']: [i['num_played']] for i in response} 

df = pd.DataFrame(data)

Will give you the following df.

   bboost  freehit  wildcard     3xc
0  114585    89437    263066  216711

NOTE: The value passed doesn't necessarily have to be within a list, however if it isn't see this question for information about passing scalar values to pd.DataFrame().

UPDATE

Below example works for question following OP update to question. Fairly self explanatory:

Defines column names based on response content
Creates empty df with column names
Iterates across response list
Extracts row data using dictionary comprehension as in original answer
Create new df (temp) and append to existing to add new rows

import pandas as pd

response = [
    [{"chip_name":"bboost",
    "num_played":114585},
    {"chip_name":"freehit",
    "num_played":89437},
    {"chip_name":"wildcard",
    "num_played":263066},
    {"chip_name":"3xc",
    "num_played":216711}],

    [{"chip_name":"bboost",
    "num_played":79069},
    {"chip_name":"freehit",
    "num_played":41897},
    {"chip_name":"wildcard",
    "num_played":103086},
    {"chip_name":"3xc",
    "num_played":74713}]
]

columns = ['bboost', 'freehit', 'wildcard', '3xc']
df = pd.DataFrame(columns=columns)

for i in response:
    data = { j['chip_name']: [j['num_played']] for j in i }
    temp = pd.DataFrame(data=data)
    df = df.append(temp, ignore_index=True)

df looks like this:

   bboost freehit wildcard     3xc
0  114585   89437   263066  216711
1   79069   41897   103086   74713

Will work fine with variable response length.

NOTE: Seems there may be a typo on part of OP in updated question. My update works under the assumption that the response is a list of lists containing single item dictionaries as this seems to make the most sense based on example given.

thank you, worked perfectly. Sorry for any confusion with the original question — lookofdisdain, Jul 31 '20 at 09:21
No worries, all part of the experience ;) Glad this has helped — JPI93, Jul 31 '20 at 09:22

How to split single column with list of dictionaries into multiple columns in pandas?

2 Answers2

PLEASE SEE UPDATE

UPDATE