When there is a DataFrame like the following:
import pandas as pd
df = pd.DataFrame(1, index=[100, 29, 234, 1, 150], columns=['A'])
How can I sort this dataframe by index with each combination of index and column value intact?
When there is a DataFrame like the following:
import pandas as pd
df = pd.DataFrame(1, index=[100, 29, 234, 1, 150], columns=['A'])
How can I sort this dataframe by index with each combination of index and column value intact?
Dataframes have a sort_index
method which returns a copy by default. Pass inplace=True
to operate in place.
import pandas as pd
df = pd.DataFrame([1, 2, 3, 4, 5], index=[100, 29, 234, 1, 150], columns=['A'])
df.sort_index(inplace=True)
print(df.to_string())
Gives me:
A
1 4
29 2
100 1
150 5
234 3
Slightly more compact:
df = pd.DataFrame([1, 2, 3, 4, 5], index=[100, 29, 234, 1, 150], columns=['A'])
df = df.sort_index()
print(df)
Note:
sort
has been deprecated, replaced by sort_index
for this scenarioinplace
as it is usually harder to read and prevents chaining. See explanation in answer here:
Pandas: peculiar performance drop for inplace rename after dropnaIf the DataFrame index has name, then you can use sort_values()
to sort by the name as well. For example, if the index is named lvl_0
, you can sort by this name. This particular case is common if the dataframe is obtained from a groupby
or a pivot_table
operation.
df = df.sort_values('lvl_0')
If the index has name(s), you can even sort by both index and a column value. For example, the following sorts by both the index and the column A
values:
df = df.sort_values(['lvl_0', 'A'])
If you have a MultiIndex dataframe, then, you can sort by the index level by using the level=
parameter. For example, if you want to sort by the second level in descending order and the first level in ascending order, you can do so by the following code.
df = df.sort_index(level=[1, 0], ascending=[False, True])
If the indices have names, again, you can call sort_values()
. For example, the following sorts by indexes 'lvl_1'
and 'lvl_2'
.
df = df.sort_values(['lvl_1', 'lvl_2'])