-1

I have a dataframe saved in a list. Is there a way to loop through the list to create separate dataframes based of a column value?

ex: Turn this

df

ID Colour Transport
0902 red car
0902 blue car
0105 red car
0105 yellow car
0105 orange boat

To this:

df1

ID Colour Transport
0902 red car
0902 blue car

df2

ID Colour Transport
0105 red car
0105 yellow car
0105 orange boat

I have searched for a way to do this but can't find it. Any ideas?

U13-Forward
  • 69,221
  • 14
  • 89
  • 114

2 Answers2

2

Try this:

df1 = df.loc[df['ID'] == 0902]
df2 = df.loc[df['ID'] == 0105]

Or this:

df1, df2 = [group for _, group in df.groupby('ID')]

Or if you want it dynamically:

dct = {f'df{idx}': group for _, group in df.groupby('ID')]}
print(dct)

Or:

dct = {}
for idx, v in enumerate(df['ID'].unique()):
    dct[f'df{idx}'] = df.loc[df['ID'] == v]

print(dct)

And print like this for specific dataframe:

print(dct['df1'])
U13-Forward
  • 69,221
  • 14
  • 89
  • 114
0

You can create variables df1 and df2 dynamically using locals():

for i, (ID, subdf) in enumerate(df.groupby('ID'), 1):
    locals()[f'df{i}'] = subdf

Output:

>>> df1
     ID  Colour Transport
2  0105     red       car
3  0105  yellow       car
4  0105  orange      boat

>>> df2
     ID Colour Transport
0  0902    red       car
1  0902   blue       car

Or you can create a dictionary indexed by the group ID:

dfs = dict(list(df.groupby('ID')))

Output:

>>> dfs['0105']
     ID  Colour Transport
2  0105     red       car
3  0105  yellow       car
4  0105  orange      boat

>>> dfs['0902']
     ID Colour Transport
0  0902    red       car
1  0902   blue       car
Corralien
  • 109,409
  • 8
  • 28
  • 52