How to add two category lists to dictionary in python

Question

I have a dataframe

df =
          name     age     character
0          A        10       fire
1          A        15       water
2          A        20       earth
3          A        25       air
4          B        10       fire
5          B        7        air

I want to convert this dataframe to a dictionary so that the output is:

dic = {'A': [[10, 15, 20, 25], ['fire', 'water', 'earth', 'air']],
       'B': [[10, 7], ['fire', 'air']] }

Here is what I tried:

from collections import defaultdict
dic = defaultdict(list)
for i in range(len(df)):
    dic[df.loc[i, 'name']].append(df.loc[i, 'age'])
    dic[df.loc[i, 'name']].append(df.loc[i, 'character']) # this is wrong. It appends to existing list.

If I declare dic = defaultdict([[], []]), it raises an error saying that the first argument of defaultdict must be callable or None.
How can I build this dictionary correctly?

`dic = {'A': [10, 15, 20, 25], ['fire', 'water', 'earth', 'air'], 'B': [10, 7], ['fire', 'air'] }` makes no sense. — Błotosmętek, Feb 19 '20 at 17:45
Does this answer your question? [Split pandas dataframe based on groupby](https://stackoverflow.com/questions/23691133/split-pandas-dataframe-based-on-groupby) — noslenkwah, Feb 19 '20 at 17:47
@noslenkwah it is similar. Can groupby pandas df be converted to dictionary? — jayko03, Feb 19 '20 at 18:05
@jayko03 - [yes](https://stackoverflow.com/questions/29876184/groupby-results-to-dictionary-of-lists) — noslenkwah, Feb 19 '20 at 18:08
The title of the question is "GroupBy results to dictionary of lists". I'm not sure what to tell you except to read over the questions and answers again. — noslenkwah, Feb 19 '20 at 18:27
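
On the defaultdict error mentioned in the question: defaultdict expects a factory (something it can call to create the default value for a new key), not the default value itself. A minimal sketch of the loop with a lambda factory follows; the sample frame is rebuilt by hand here, so its exact dtypes are an assumption.

```python
from collections import defaultdict

import pandas as pd

# Sample frame rebuilt by hand from the question (dtypes are an assumption).
df = pd.DataFrame({
    "name": ["A", "A", "A", "A", "B", "B"],
    "age": [10, 15, 20, 25, 10, 7],
    "character": ["fire", "water", "earth", "air", "fire", "air"],
})

# The factory is called once per new key, so every name gets its own
# fresh pair of lists: [ages, characters].
dic = defaultdict(lambda: [[], []])

for name, age, character in zip(df["name"], df["age"], df["character"]):
    dic[name][0].append(age)        # first inner list collects ages
    dic[name][1].append(character)  # second inner list collects characters

dic = dict(dic)  # convert to a plain dict once it is filled
```

Iterating over the columns with zip also avoids the repeated df.loc lookups inside the loop.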

score 1 · Answer 1 · answered Feb 19 '20 at 18:23

Here's a solution that returns NumPy arrays (dtype=object), which behave much like nested lists:

{k: d[['age','character']].T.to_numpy() for k,d in df.groupby('name')}

Output:

{'A': array([[10, 15, 20, 25],
        ['fire', 'water', 'earth', 'air']], dtype=object), 
'B': array([[10, 7],
        ['fire', 'air']], dtype=object)}
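
If plain Python lists are needed rather than arrays, one possible variant of the same groupby idea is to call tolist() on each column of every group. This is a sketch, not the answer's original code, and the sample frame is rebuilt by hand as an assumption.

```python
import pandas as pd

# Sample frame rebuilt by hand from the question (an assumption).
df = pd.DataFrame({
    "name": ["A", "A", "A", "A", "B", "B"],
    "age": [10, 15, 20, 25, 10, 7],
    "character": ["fire", "water", "earth", "air", "fire", "air"],
})

# One entry per group: [list of ages, list of characters].
dic = {
    name: [group["age"].tolist(), group["character"].tolist()]
    for name, group in df.groupby("name")
}
```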

Andy L. · Accepted Answer · 2020-02-19T23:53:12.380

You can use a combination of pivot_table and to_dict:

dic = (df.pivot_table(columns='name', values=['age','character'], aggfunc=list)
         .to_dict('list'))

Out[107]:
{'A': [[10, 15, 20, 25], ['fire', 'water', 'earth', 'air']],
 'B': [[10, 7], ['fire', 'air']]}

If your dataframe has exactly the 3 columns name, age and character, you can simply omit the values= parameter:

dic = df.pivot_table(columns='name', aggfunc=list).to_dict('list')

As you said in the comments, to strip whitespace you need to pre-process df with str.strip before calling pivot_table, as follows:

# strip leading/trailing whitespace from every string (object) column in place
df.update(df.select_dtypes('object').apply(lambda x: x.str.strip()))
dic = df.pivot_table(columns='name', aggfunc=list).to_dict('list')
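
Stripping before grouping matters because 'A ' and 'A' would otherwise end up as two separate keys. The sketch below illustrates this on a small made-up frame (the data is hypothetical). It names the string columns explicitly instead of using select_dtypes, and builds the dictionary with groupby instead of pivot_table, purely to keep the example short.

```python
import pandas as pd

# Hypothetical data with stray whitespace in keys and values.
df = pd.DataFrame({
    "name": ["A ", "A", " B"],
    "age": [10, 15, 7],
    "character": ["fire ", " water", "air"],
})

# Strip the string columns first, so 'A ' and 'A' fall into the same group.
for col in ["name", "character"]:
    df[col] = df[col].str.strip()

dic = {
    name: [group["age"].tolist(), group["character"].tolist()]
    for name, group in df.groupby("name")
}
```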

edited Feb 19 '20 at 23:53

answered Feb 19 '20 at 19:10

This is very nice and works great. Is it possible to strip on keys and values? – jayko03 Feb 19 '20 at 21:13
@jayko03: could you explain more on what `keys` and `values` you want to strip? – Andy L. Feb 19 '20 at 21:22
For example, the key can be 'A ', or the value can be ['fire ', ' water'] .. something like this white space – jayko03 Feb 19 '20 at 22:53
@jayko03: ah, I understand now. You want to strip white spaces from values and keys. Just do `str.strip` on columns of `df` before calling `pivot_table` – Andy L. Feb 19 '20 at 23:44
@jayko03: check my updated answer for stripping whitespaces – Andy L. Feb 19 '20 at 23:53
