
I have a two-column df with a particular distribution of items. The first column contains repeated items. In the second column, no items are repeated.

I have been trying to create a dictionary whose keys are the distinct values of the first column and whose values are the corresponding items from the second column. Here are my table and the dictionary I would like to create, for a better understanding.

df
  col1 col2
0 A     1
1 A     2
2 A     3
3 A     4
4 A     9
5 A     C
6 B     2
7 B     3
8 B     4
9 B     29
10 B    34
...
dict
{'A': '1', '2','3','4','9','C', 'B': '2', '3','4','29','34'}

Could someone put me in the right direction?

2 Answers


Close. What you need is a dictionary of lists. The values are strings because col2 contains the non-numeric value C:

d = df.groupby('col1')['col2'].agg(list).to_dict()
print(d)
{'A': ['1', '2', '3', '4', '9', 'C'], 'B': ['2', '3', '4', '29', '34']}
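
To run this end to end, here is a minimal sketch. It assumes the frame is rebuilt by hand from the rows shown in the question (the rows hidden behind `...` are left out) and that col2 is stored as strings:

```python
import pandas as pd

# Sample frame rebuilt from the rows visible in the question (an assumption:
# the elided "..." rows are omitted). col2 is kept as strings because it
# mixes digits with the letter 'C'.
df = pd.DataFrame({
    "col1": ["A"] * 6 + ["B"] * 5,
    "col2": ["1", "2", "3", "4", "9", "C", "2", "3", "4", "29", "34"],
})

# Group by col1, collect each group's col2 values into a list,
# then turn the resulting Series into a plain dict.
d = df.groupby("col1")["col2"].agg(list).to_dict()
print(d)
```

Each key maps to a list that keeps the original row order within its group.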
jezrael

Try this:

new_dict = df.groupby('col1')['col2'].apply(list).to_dict()
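
As a quick check, `apply(list)` and `agg(list)` should give the same dictionary for this kind of grouping. The sketch below assumes a small hand-made frame (hypothetical data, not the full table from the question):

```python
import pandas as pd

# Tiny hypothetical frame with the same shape as the one in the question.
df = pd.DataFrame({
    "col1": ["A", "A", "B"],
    "col2": ["1", "C", "2"],
})

# Both calls collect each group's col2 values into a list.
via_apply = df.groupby("col1")["col2"].apply(list).to_dict()
via_agg = df.groupby("col1")["col2"].agg(list).to_dict()

print(via_apply == via_agg)
```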
Carsten