I have a dataframe constructed as follows:
import pandas as pd
df = pd.DataFrame({"taxon": ["taxa1", "taxa2", "taxa3", "taxa4", "taxa5"], "rank": ["genus", "genus", "family", "species", "species"]})
There are 3 different ranks in this example dataframe: genus, family and species. I want to extract the rows of df into a new dataframe for each rank, containing only the rows with that rank. The name of each new dataframe should be df_ followed by the name of the rank.
So as output I want 3 dataframes: df_genus, df_family, and df_species. Each of these contains the rows of the original df dataframe that have the corresponding rank.
I already tried several things, including:
ranks = ["genus","family","species"]
for rank in ranks:
    "df_"+str(rank) = df.loc[df["rank"]==rank]
but this raises an error: SyntaxError: can't assign to operator
How can I perform this operation?
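
For reference, here is a minimal sketch of one possible approach. It assumes that a dictionary keyed by the desired names is an acceptable stand-in for separately named variables, since Python cannot assign to a name built from a string expression (which is what triggers the SyntaxError above). The name frames is only illustrative and not part of the original attempt.

```python
import pandas as pd

df = pd.DataFrame({
    "taxon": ["taxa1", "taxa2", "taxa3", "taxa4", "taxa5"],
    "rank": ["genus", "genus", "family", "species", "species"],
})

# groupby("rank") yields (rank_value, rows_with_that_rank) pairs,
# so each sub-dataframe can be stored under a "df_<rank>" key.
# "frames" is a hypothetical name chosen for this sketch.
frames = {f"df_{rank}": group for rank, group in df.groupby("rank")}

# Individual dataframes are then looked up by key.
df_genus = frames["df_genus"]
df_family = frames["df_family"]
df_species = frames["df_species"]
```

The dictionary keeps the set of ranks open-ended: a rank that appears in the data later would get its own entry without any change to the code.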