I have a dataframe like this:
df:
col1   | col2   | col3
a1_123 | a1_232 | blue
a1_123 | a1_832 | orange
a1_143 | a1_232 | orange
a1_963 | a1_202 | purple
a1_000 | a1_912 | blue
a1_863 | a1_402 | blue
I want to write a function that creates a separate dataframe for each value in col3. The desired output should look something like this:
df_blue:
col1   | col2   | col3
a1_123 | a1_232 | blue
a1_000 | a1_912 | blue
a1_863 | a1_402 | blue
df_orange:
col1   | col2   | col3
a1_123 | a1_832 | orange
a1_143 | a1_232 | orange
df_purple:
col1   | col2   | col3
a1_963 | a1_202 | purple
I want to combine the filtering and the creation of the dataframes in one function. So far I only have the filtering part:
def df_filter(df_name, col_value, col_name):
    '''
    df_name: main dataframe
    col_value: value to search for
    col_name: column to search for the value
    '''
    filt = (df_name[col_name] == col_value)  # rows where col_name matches col_value
    return df_name[filt]
value_list = df['col3'].unique().tolist()
for value in value_list:
    # this is the missing part: create a dataframe named df_<value>
    # (e.g. df_blue) for each unique value in value_list
    pass
What is the most efficient way to do this? My main dataframe can get pretty large.
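For reference, one direction I have been considering is to group once on col3 and keep the pieces in a dict keyed by name, instead of creating separate variables. This is only a rough sketch that I have not benchmarked on large data. The helper name split_by_column and the dict name frames are my own placeholders, and the sample data is just the small table from above.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "col1": ["a1_123", "a1_123", "a1_143", "a1_963", "a1_000", "a1_863"],
    "col2": ["a1_232", "a1_832", "a1_232", "a1_202", "a1_912", "a1_402"],
    "col3": ["blue", "orange", "orange", "purple", "blue", "blue"],
})


def split_by_column(frame, col_name):
    """Return a dict mapping 'df_<value>' to the rows where col_name == value.

    Hypothetical helper: groupby scans the frame once instead of applying
    one boolean mask per unique value. Rows whose key is NaN are dropped
    by groupby's default settings.
    """
    return {
        f"df_{value}": group
        for value, group in frame.groupby(col_name, sort=False)
    }


frames = split_by_column(df, "col3")
print(frames["df_blue"])
```

With this, frames["df_blue"] would play the role of df_blue, which avoids generating variable names dynamically. I am not sure whether this is actually the most efficient option for a large dataframe.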