Full code sample below
Let's take the iris dataset as an example. The dataframe has 150 rows, and the column species contains the three unique values ['setosa', 'versicolor', 'virginica']. They appear in the dataframe column in the same order as that list. But how do I specify a custom order?
The original order can be found using:
# In:
df['species'].unique()
# Out:
# array(['setosa', 'versicolor', 'virginica'], dtype=object)
A reverse alphabetical order is easily applied like this:
# In:
df_alpha = df.sort_values(by='species', ascending=False)
df_alpha['species'].unique()
# Out:
# array(['virginica', 'versicolor', 'setosa'], dtype=object)
But how can I specify the order to be ['virginica', 'setosa', 'versicolor']?
Code and reproducible data:
import pandas as pd
import plotly.express as px
df = px.data.iris()
df_alpha = df.sort_values(by='species', ascending=False)
df_alpha.tail()
Structure of the dataframe:
     sepal_length  sepal_width  petal_length  petal_width    species  species_id
0             5.1          3.5           1.4          0.2     setosa           1
1             4.9          3.0           1.4          0.2     setosa           1
2             4.7          3.2           1.3          0.2     setosa           1
3             4.6          3.1           1.5          0.2     setosa           1
4             5.0          3.6           1.4          0.2     setosa           1
...
145           6.7          3.0           5.2          2.3  virginica           3
146           6.3          2.5           5.0          1.9  virginica           3
147           6.5          3.0           5.2          2.0  virginica           3
148           6.2          3.4           5.4          2.3  virginica           3
149           5.9          3.0           5.1          1.8  virginica           3
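
For reference, here is a minimal sketch of two approaches that might give the order ['virginica', 'setosa', 'versicolor']. It is not necessarily the best way. To keep it runnable without plotly, it uses a small stand-in dataframe rather than the real iris data; the `row` column and the names `custom_order`, `df_cat`, `df_custom`, `df_key` and `rank` are made up for illustration. The first option converts species to an ordered categorical, and the second passes a `key` function to `sort_values`, which assumes pandas 1.1 or later.

```python
import pandas as pd

# Stand-in for the iris dataframe (hypothetical rows, not the real data)
df = pd.DataFrame({
    "row": [0, 1, 2, 3, 4, 5],
    "species": ["setosa", "setosa", "versicolor",
                "versicolor", "virginica", "virginica"],
})

# The order we want the species to appear in
custom_order = ["virginica", "setosa", "versicolor"]

# Option 1: make species an ordered categorical, then sort on it.
# Sorting a categorical column follows the category order, not the alphabet.
df_cat = df.copy()
df_cat["species"] = pd.Categorical(
    df_cat["species"], categories=custom_order, ordered=True
)
df_custom = df_cat.sort_values(by="species")

# Option 2: keep species as plain strings and sort by a rank lookup.
# The key function receives the column as a Series (assumes pandas >= 1.1).
rank = {name: i for i, name in enumerate(custom_order)}
df_key = df.sort_values(by="species", key=lambda s: s.map(rank))

print(list(df_custom["species"].unique()))
print(list(df_key["species"].unique()))
```

With the real data, the same steps should apply to the df returned by px.data.iris(). Note that option 1 changes the dtype of the species column to category, while option 2 leaves it as object.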