I read "How to create multiple dataframes from pandas groupby object"; however, I still do not understand how to create a dataframe for each person after I create my grouped_persons group with groupby.

What should I change in this code? I think this is part of my problem: 'df_'+ name +'1'

grouped_persons = df.groupby('Person')
for name, group in grouped_persons
    'df_'+ name +'1' = df.loc[(df.Person == name) & (df.ExpNum == 1)]

  File "", line 2
    for name, group in grouped_persons
                                      ^
SyntaxError: invalid syntax

Rebecca Ijekah

3 Answers

Suppose your DataFrame looks like this:

import pandas as pd

df = pd.DataFrame([['Tim', 1, 2],
                   ['Tim', 0, 2],
                   ['Claes', 1, 3],
                   ['Claes', 0, 1],
                   ['Emma', 1, 1],
                   ['Emma', 1, 2]], columns=['Person', 'ExpNum', 'Data'])

giving

>>> df
  Person  ExpNum  Data
0    Tim       1     2
1    Tim       0     2
2  Claes       1     3
3  Claes       0     1
4   Emma       1     1
5   Emma       1     2

Then you can get each group's dataframe directly from the pandas groupby object

grouped_persons = df.groupby('Person')

by calling get_group:

>>> grouped_persons.get_group('Emma')
  Person  ExpNum  Data
4   Emma       1     1
5   Emma       1     2

and there is no need to store those separately.
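
As a rough illustration of the point above, the following sketch (same sample data) iterates over the groupby object and pulls out one group on demand. The names `sizes` and `emma` are only illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame([['Tim', 1, 2], ['Tim', 0, 2],
                   ['Claes', 1, 3], ['Claes', 0, 1],
                   ['Emma', 1, 1], ['Emma', 1, 2]],
                  columns=['Person', 'ExpNum', 'Data'])

grouped_persons = df.groupby('Person')

# Iterating over a groupby object yields (group key, sub-DataFrame) pairs.
sizes = {name: len(group) for name, group in grouped_persons}

# A single group can be fetched whenever it is needed.
emma = grouped_persons.get_group('Emma')

print(sizes)
print(emma)
```

Because the groups are produced on request, nothing has to be copied into separately named variables beforehand.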

Note: Pandas version used was '0.23.1' but this feature might be available in some earlier versions as well.

Edit: If you are interested only in the entries with ExpNum == 1, I suggest applying that filter before the groupby, e.g.

grouped_persons_1 = df[df['ExpNum'] == 1].groupby('Person')
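
A small sketch of how the filtered groupby might then be used, assuming the same sample data; `tim_1` is a hypothetical name chosen for this example.

```python
import pandas as pd

df = pd.DataFrame([['Tim', 1, 2], ['Tim', 0, 2],
                   ['Claes', 1, 3], ['Claes', 0, 1],
                   ['Emma', 1, 1], ['Emma', 1, 2]],
                  columns=['Person', 'ExpNum', 'Data'])

# Keep only the rows of experiment 1, then group what is left by person.
grouped_persons_1 = df[df['ExpNum'] == 1].groupby('Person')

# Only Tim's ExpNum == 1 row survives the filter.
tim_1 = grouped_persons_1.get_group('Tim')
print(tim_1)
```

Note that a person with no ExpNum == 1 rows would not form a group at all, so get_group would raise a KeyError for that name.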
Kay Wittig

Use a dictionary for a variable number of variables.

One straightforward solution is to use tuple keys representing ('Person', 'ExpNum') combinations. You can achieve this by feeding a groupby object to tuple and then the result to dict.

Data from @KayWittig.

df = pd.DataFrame([['Tim', 1, 2], ['Tim', 0, 2],
                   ['Claes', 1, 3], ['Claes', 0, 1],
                   ['Emma', 1, 1], ['Emma', 1, 2]],
                  columns=['Person', 'ExpNum', 'Data'])

df_dict = dict(tuple(df.groupby(['Person', 'ExpNum'])))

print(df_dict)

{('Claes', 0):   Person  ExpNum  Data
               3  Claes       0     1,
 ('Claes', 1):   Person  ExpNum  Data
               2  Claes       1     3,
 ('Emma', 1):   Person  ExpNum  Data
               4   Emma       1     1
               5   Emma       1     2,
 ('Tim', 0):   Person  ExpNum  Data
               1    Tim       0     2,
 ('Tim', 1):   Person  ExpNum  Data
               0    Tim       1     2}
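
For completeness, here is a hedged sketch of how individual dataframes could be looked up in such a dictionary. The names `emma_1` and `missing` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([['Tim', 1, 2], ['Tim', 0, 2],
                   ['Claes', 1, 3], ['Claes', 0, 1],
                   ['Emma', 1, 1], ['Emma', 1, 2]],
                  columns=['Person', 'ExpNum', 'Data'])

df_dict = dict(tuple(df.groupby(['Person', 'ExpNum'])))

# Look up one (Person, ExpNum) combination by its tuple key.
emma_1 = df_dict[('Emma', 1)]

# dict.get returns None instead of raising KeyError when a
# combination does not occur in the data (Emma has no ExpNum == 0 rows).
missing = df_dict.get(('Emma', 0))

print(emma_1)
print(missing)
```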
jpp

You can store it in a dictionary like this. I have corrected some syntax errors in your code as well.

    grouped_persons = df.groupby('Person')
    multi_df = {}
    for name, group in grouped_persons:
        # Key each person's ExpNum == 1 rows by a string such as 'df_Tim1'.
        multi_df['df_' + name + '1'] = df[(df.Person == name) & (df.ExpNum == 1)]

Now you can get a stored dataframe back with multi_df['df_' + name + '1'], for example multi_df['df_myname1'] for a person called myname.
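
A more compact variant, offered only as a sketch: it filters first and builds the dictionary with a comprehension, reusing each group instead of re-filtering df. The sample data below is assumed for the example. Unlike the loop above, a person with no ExpNum == 1 rows gets no entry here rather than an empty dataframe.

```python
import pandas as pd

df = pd.DataFrame([['Tim', 1, 2], ['Tim', 0, 2],
                   ['Claes', 1, 3], ['Claes', 0, 1],
                   ['Emma', 1, 1], ['Emma', 1, 2]],
                  columns=['Person', 'ExpNum', 'Data'])

# Filter to experiment 1, group by person, and key each group by a string.
multi_df = {'df_' + name + '1': group
            for name, group in df[df['ExpNum'] == 1].groupby('Person')}

print(sorted(multi_df))
print(multi_df['df_Emma1'])
```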

Kavitha Madhavaraj