I want to subset a dataframe into individual dataframes.

So:

df:

     name    color    value
     joe     yellow   7.0
     mary    green    9.0
     pete    blue     8.0
     mary    red      8.8
     pete    blue     7.7
     joe     orange   2.0

I want to get:

df_joe

     name    color    value
     joe     yellow   7.0
     joe     orange   2.0

df_mary

     name    color    value
     mary    green    9.0
     mary    red      8.8

df_pete

     name    color    value
     pete    blue     8.0
     pete    blue     7.7

This is easy enough to do individually and manually, but I want to automate it in a loop or using `groupby`. There are lots of related answers on how to get this information, but none I have found discusses saving the split-out groups to several dataframes.

SO ACTUALLY THIS IS NOT A DUPLICATE QUESTION BECAUSE OF THE FOLLOWING:

I have tried to loop something like this:

names = ['joe','pete','mary']
for name in names:
   'df_' + name = df[df['name'] == name]

But I get an error assigning the dataframe subset to the newly constructed name.

How can I do this?

Windstorm1981

1 Answer

The best approach here is to create a dictionary of DataFrames from the groupby object:

dfs = dict(tuple(df.groupby('name')))
print (dfs)
{'joe':   name   color  value
0  joe  yellow    7.0
5  joe  orange    2.0, 'pete':    name color  value
2  pete  blue    8.0
4  pete  blue    7.7, 'mary':    name  color  value
1  mary  green    9.0
3  mary    red    8.8}

print (dfs['mary'])
   name  color  value
1  mary  green    9.0
3  mary    red    8.8
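
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the table in the question, and the dict comprehension is just an equivalent spelling of `dict(tuple(...))`; the loop variable name `group` is my own choice.

```python
import pandas as pd

# Sample data reconstructed from the table in the question.
df = pd.DataFrame({
    "name": ["joe", "mary", "pete", "mary", "pete", "joe"],
    "color": ["yellow", "green", "blue", "red", "blue", "orange"],
    "value": [7.0, 9.0, 8.0, 8.8, 7.7, 2.0],
})

# One DataFrame per distinct name, keyed by that name.
dfs = {name: group for name, group in df.groupby("name")}

# Each value is an ordinary DataFrame that keeps its original row labels.
for name, group in dfs.items():
    print(name, len(group))
```

Looking up `dfs['joe']` then does the job of a separate `df_joe` variable, and new names in the data need no code changes.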

But if you really need variables named by strings (not recommended, but possible):

for name, group in df.groupby('name'):
   globals()['df_' + name] = group

print (df_mary)
   name  color  value
1  mary  green    9.0
3  mary    red    8.8
jezrael