
I have a pandas DataFrame df that looks like this:

import pandas as pd
data = [[1, 'Jack', 'A'], [1, 'Jamie', 'A'], [1, 'Mo', 'B'], [1, 'Tammy', 'A'], [2, 'JJ', 'A'], [2, 'Perry', 'C']]
df = pd.DataFrame(data, columns=['id', 'name', 'class'])
> df
   id   name class
0   1   Jack     A
1   1  Jamie     A
2   1     Mo     B
3   1  Tammy     A
4   2     JJ     A
5   2  Perry     C

I want to convert this to a dictionary mydict where

> mydict[0]
{'id': '1', 'name': ['Jack', 'Jamie', 'Mo', 'Tammy'], 'class': ['A', 'A', 'B', 'A']}

> mydict[1]
{'id': '2', 'name': ['JJ', 'Perry'], 'class': ['A', 'C']}

and

> mydict[0:2]
{'id': ['1', '2'], 'name': [['Jack', 'Jamie', 'Mo', 'Tammy'],['JJ', 'Perry']], 'class': [['A', 'A', 'B', 'A'], ['A', 'C']]} 

I tried mydict = df.to_dict() but that didn't seem to work as intended.
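
For context, a minimal sketch of what `df.to_dict()` produces by default, using the same data as above. The variable names `by_column` and `as_lists` are only illustrative. Neither orientation groups rows by `id`, which is probably why the result did not match the desired shape.

```python
import pandas as pd

data = [[1, 'Jack', 'A'], [1, 'Jamie', 'A'], [1, 'Mo', 'B'],
        [1, 'Tammy', 'A'], [2, 'JJ', 'A'], [2, 'Perry', 'C']]
df = pd.DataFrame(data, columns=['id', 'name', 'class'])

# Default orient is 'dict': {column -> {row index -> value}}
by_column = df.to_dict()
print(by_column['name'][0])  # Jack

# orient='list' gives {column -> list of values}, still one entry per row,
# so rows sharing an id are not merged
as_lists = df.to_dict('list')
print(len(as_lists['name']))  # 6
```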

Adrian
  • I did look at that and tried its top answer, but it didn't work as intended. – Adrian May 07 '23 at 05:45
  • The linked duplicate doesn't really solve this issue – mozway May 07 '23 at 05:56
  • There is no such thing as a sliceable dictionary. It's definitely possible to create a custom mapping that can do this or perhaps even construct a Pandas object that can. But I guess either is relatively complicated. https://docs.python.org/3/glossary.html#term-mapping – Joooeey May 07 '23 at 07:01
  • @Adrian I retracted my close vote and voted to reopen the question. My apologies. I hope it gets reopened. – bonCodigo May 07 '23 at 14:33

2 Answers


You can use:

out = df.groupby('id', as_index=False).agg(list).to_dict('records')

Alternative:

# Note: the dict union operator (|) requires Python 3.9+
out = [{'id': k} | g.drop(columns='id').to_dict('list')
       for k, g in df.groupby('id', as_index=False)]

Output:

[{'id': 1,
  'name': ['Jack', 'Jamie', 'Mo', 'Tammy'],
  'class': ['A', 'A', 'B', 'A']},
 {'id': 2, 'name': ['JJ', 'Perry'], 'class': ['A', 'C']}]

Alternative, as dictionary:

out = df.groupby('id', as_index=False).agg(list).to_dict('index')

Output:

{0: {'id': 1,
     'name': ['Jack', 'Jamie', 'Mo', 'Tammy'],
     'class': ['A', 'A', 'B', 'A']},
 1: {'id': 2, 'name': ['JJ', 'Perry'], 'class': ['A', 'C']}}
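
For the sliced form shown in the question (`mydict[0:2]`), one possible sketch is to slice the grouped frame by position and convert it with `orient='list'`. The names `grouped`, `records`, `first` and `sliced` are illustrative, and the slice here is half-open like a Python list.

```python
import pandas as pd

data = [[1, 'Jack', 'A'], [1, 'Jamie', 'A'], [1, 'Mo', 'B'],
        [1, 'Tammy', 'A'], [2, 'JJ', 'A'], [2, 'Perry', 'C']]
df = pd.DataFrame(data, columns=['id', 'name', 'class'])

# One row per id; the remaining columns are collected into lists
grouped = df.groupby('id', as_index=False).agg(list)

# A list of records supports integer indexing directly
records = grouped.to_dict('records')
first = records[0]

# Positional slice of the grouped rows, returned as {column -> list of values}
sliced = grouped.iloc[0:2].to_dict('list')
print(sliced['name'])
```

The ids stay integers here. If string ids are wanted, as in the question's expected output, casting with `df.astype({'id': str})` before grouping should work.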
mozway

Code

mydict = df.groupby('id').agg(list).reset_index().T.to_dict()

Output

mydict[0]

{'id': 1, 'name': ['Jack', 'Jamie', 'Mo', 'Tammy'], 'class': ['A', 'A', 'B', 'A']}

mydict

{0: {'id': 1, 'name': ['Jack', 'Jamie', 'Mo', 'Tammy'], 'class': ['A', 'A', 'B', 'A']},
 1: {'id': 2, 'name': ['JJ', 'Perry'], 'class': ['A', 'C']}}
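
As a quick sanity-check sketch, transposing and calling `to_dict()` should give the same mapping as `to_dict('index')` on the untransposed frame, since both key the result by row position. The names `via_transpose` and `via_index` are illustrative.

```python
import pandas as pd

data = [[1, 'Jack', 'A'], [1, 'Jamie', 'A'], [1, 'Mo', 'B'],
        [1, 'Tammy', 'A'], [2, 'JJ', 'A'], [2, 'Perry', 'C']]
df = pd.DataFrame(data, columns=['id', 'name', 'class'])

grouped = df.groupby('id').agg(list).reset_index()

# Transpose so each original row becomes a column, then column -> {field -> value}
via_transpose = grouped.T.to_dict()

# Same structure without the transpose: row index -> {column -> value}
via_index = grouped.to_dict('index')

print(via_transpose == via_index)
```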
Panda Kim
  • Thanks, is it possible to create `mydict` so that it's sliceable? Please see my edited post, i.e., the output for `mydict[0:2]` – Adrian May 07 '23 at 05:53
  • You can use plain Python to convert the values of the dictionary into a list. – Panda Kim May 07 '23 at 05:57
  • @PandaKim but label-based slicing in Pandas is closed (including the 2) while list-slicing in Python is only half-open (excluding the 2). – Joooeey May 07 '23 at 06:45
  • @Joooeey could you elaborate on this a bit? Do you mean that `itertools` slicing can be applied to dictionaries but not to lists (limited)? – bonCodigo May 07 '23 at 14:39
  • Slicing in Pandas is complex. See link. In this particular case, it seems the OP wants to refer to the ids 0, 1, 2 with `0:2`. This only works when using label-based indexing in Pandas. In pure Python `0:2` only refers to 0 and 1. https://pandas.pydata.org/docs/user_guide/indexing.html#selection-by-label – Joooeey May 07 '23 at 15:41
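
A built-in dict cannot be sliced, but a small wrapper class can accept both integers and slices. The sketch below is one possible approach, not a pandas feature: `GroupedRecords` is a hypothetical helper name, and it assumes positional, half-open slicing (so `0:2` means positions 0 and 1), implemented with `DataFrame.iloc`.

```python
import pandas as pd


class GroupedRecords:
    """Hypothetical helper: positional int and slice access over grouped rows."""

    def __init__(self, df, key):
        # One row per key value; the other columns are collected into lists
        self._grouped = df.groupby(key, as_index=False).agg(list)

    def __len__(self):
        return len(self._grouped)

    def __getitem__(self, item):
        if isinstance(item, slice):
            # Half-open positional slice, like a Python list
            return self._grouped.iloc[item].to_dict('list')
        if isinstance(item, int):
            # Single position: one record as {column -> value}
            return self._grouped.iloc[item].to_dict()
        raise TypeError(f'unsupported index type: {type(item).__name__}')


data = [[1, 'Jack', 'A'], [1, 'Jamie', 'A'], [1, 'Mo', 'B'],
        [1, 'Tammy', 'A'], [2, 'JJ', 'A'], [2, 'Perry', 'C']]
df = pd.DataFrame(data, columns=['id', 'name', 'class'])

mydict = GroupedRecords(df, 'id')
print(mydict[1]['name'])    # ['JJ', 'Perry']
print(mydict[0:2]['name'])
```

Using `iloc` keeps the behaviour consistent with list slicing. Switching to `loc` would instead give label-based slices that include the end label.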