276

I have a pandas dataframe. I want to print the unique values of one of its columns in ascending order. This is how I am doing it:

import pandas as pd
df = pd.DataFrame({'A':[1,1,3,2,6,2,8]})
a = df['A'].unique()
print a.sort()

The problem is that I am getting a None for the output.

ivanleoncz
MAS
  • 7
    `a.sort()` modifies `a` and does not return anything so replace by: `a.sort(); print a` – stellasia Aug 18 '15 at 12:13
  • Note: `unique()` returns a numpy.ndarray, so `sort()` is actually `numpy.ndarray.sort()` method. That's why the behavior is unexpected. `drop_duplicates()` returns a pandas series or dataframe, allowing use of `sort_values()`. – wisbucky May 10 '22 at 08:19

9 Answers

359

sorted(iterable): Return a new sorted list from the items in iterable.

CODE

import pandas as pd
df = pd.DataFrame({'A':[1,1,3,2,6,2,8]})
a = df['A'].unique()
print(sorted(a))

OUTPUT

[1, 2, 3, 6, 8]
Paul P
Vineet Kumar Doshi
  • 1
    This doesn't work if your column contains data with ambiguous boolean values, such as pandas' NAType - sorted() will raise a TypeError – Elliot Young Jul 08 '21 at 20:33
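To illustrate the comment above, here is a minimal sketch, assuming a nullable `Int64` column (the sample values are made up for this example). It checks whether `sorted()` fails on `pd.NA` and shows one pandas-only alternative, which by default places the missing value last.

```python
import pandas as pd

# Hypothetical data: a nullable integer column with one missing value
df = pd.DataFrame({'A': pd.array([3, 1, None, 2, 1], dtype='Int64')})

try:
    sorted(df['A'].unique())
    sorted_ok = True
except TypeError:
    # Comparing with pd.NA yields pd.NA, whose truth value is ambiguous
    sorted_ok = False

print("sorted() worked:", sorted_ok)

# Staying in pandas: sort_values() puts missing values last by default
result = df['A'].drop_duplicates().sort_values()
print(result.tolist())
```

If the missing values are not wanted at all, calling `dropna()` before `unique()` is another option.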
45

sort() sorts the array in place and returns None:

In [54]:
df = pd.DataFrame({'A':[1,1,3,2,6,2,8]})
a = df['A'].unique()
a.sort()
a

Out[54]:
array([1, 2, 3, 6, 8], dtype=int64)

So you have to print a in a separate statement, after the call to sort().

E.g.:

In [55]:
df = pd.DataFrame({'A':[1,1,3,2,6,2,8]})
a = df['A'].unique()
a.sort()
print(a)

[1 2 3 6 8]
EdChum
  • The reason is because `unique()` returns a numpy.ndarray, so `sort()` is actually `numpy.ndarray.sort()` method. That's why the behavior is unexpected. `drop_duplicates()` returns a pandas series or dataframe, allowing use of `sort_values()`. – wisbucky May 10 '22 at 08:21
33

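You can also use drop_duplicates() instead of unique(). It returns a Series, so sort it with sort_values() (Series.sort() no longer exists in current pandas):

import pandas as pd
df = pd.DataFrame({'A':[1,1,3,2,6,2,8]})
a = df['A'].drop_duplicates()
a = a.sort_values()  # returns a new sorted Series
print(a)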
Meloun
19

Fastest code

for large data frames:

df['A'].drop_duplicates().sort_values()
Serge Stroobandt
  • 9
    This answer would be more interesting if you provide the evidence for your claim – saQuist Sep 29 '21 at 15:00
  • 3
    `drop_duplicates()` is better than `unique()` because it can work with multiple cols (dataframes), not just single cols (series). – wisbucky May 10 '22 at 08:25
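A small sketch of the multi-column case mentioned in the comment above. The second column `B` and its values are hypothetical and not part of the original question.

```python
import pandas as pd

# Hypothetical two-column frame; column 'B' is invented for illustration
df = pd.DataFrame({'A': [1, 1, 3, 2, 2],
                   'B': ['x', 'x', 'y', 'z', 'x']})

# Unique (A, B) combinations, sorted by A and then by B
pairs = df[['A', 'B']].drop_duplicates().sort_values(['A', 'B'])
print(pairs)
```

`Series.unique()` has no direct equivalent for this, since it works on a single column only.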
15

Came across the question myself today. I think the reason that your code returns 'None' (exactly what I got by using the same method) is that

a.sort()

sorts a in place (a is a NumPy array, not a list) and returns None, so None is what gets printed. As I understand it, it modifies a rather than producing a new object. To see the result, call print(a) afterwards.

My solution, as I tried to keep everything in pandas:

pd.Series(df['A'].unique()).sort_values()
Bowen Liu
  • I like the `pandas` solution because it puts `NaN` values at the end and works with arrays of mixed types. – m13op22 Aug 01 '19 at 15:31
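The NaN part of that comment can be sketched as follows, assuming a float column with one missing value (the data is made up). Only the NaN placement is shown here; the mixed-type claim is not covered by this example.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a float column containing one NaN
df = pd.DataFrame({'A': [3.0, 1.0, np.nan, 2.0, 1.0]})

uniques = pd.Series(df['A'].unique())

# Default behaviour: NaN goes to the end (na_position='last')
nan_last = uniques.sort_values()

# NaN can be moved to the front instead
nan_first = uniques.sort_values(na_position='first')

print(nan_last.tolist())
print(nan_first.tolist())
```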
14

I prefer the oneliner:

print(sorted(df['Column Name'].unique()))
MDMoore313
7

I would suggest using numpy's sort, since that is what pandas does in the background anyway:

import numpy as np
np.sort(df.A.unique())

But doing it all in pandas is valid as well.
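For comparison, a short sketch of how `np.sort()` differs from the `ndarray.sort()` method used in the question: the function returns a sorted copy, while the method sorts in place and returns `None`. The variable names `b`, `c` and `ret` are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 3, 2, 6, 2, 8]})
a = df['A'].unique()   # values in order of first appearance

b = np.sort(a)         # new sorted array; a itself is left unchanged

c = a.copy()
ret = c.sort()         # sorts c in place and returns None

print(a, b, c, ret)
```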

Challensois
4

Another way is to use the set data type.

Some characteristics of sets: they are unordered, can hold mixed data types, cannot contain repeated elements, and are mutable.

Solving your question:

df = pd.DataFrame({'A':[1,1,3,2,6,2,8]})
sorted(set(df.A))

The result is a list:

[1, 2, 3, 6, 8]
1

Surprised no one suggested this:

df['A'].sort_values().unique()
russhoppa
  • 1
    Well, yes this works, but it doesn't make sense to do the sorting first (on the entire array) instead of last (on the reduced set). That's why every other answer does `set` -> `sort`. – tdy Apr 06 '23 at 21:23
  • Oh yeah it isn't as efficient. Looks cleaner but runs less well. – russhoppa Apr 07 '23 at 15:45
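One way to check the efficiency point from these comments on your own machine is a rough `timeit` sketch like the one below. The data size, the number of distinct values and the function names are arbitrary assumptions, and the timings will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical benchmark data: 200,000 rows drawn from 1,000 distinct values
rng = np.random.default_rng(0)
s = pd.Series(rng.integers(0, 1000, size=200_000))

def sort_then_unique():
    # Sorts every row first, then removes duplicates
    return s.sort_values().unique()

def unique_then_sort():
    # Removes duplicates first, then sorts only the distinct values
    return np.sort(s.unique())

# Both orders produce the same values
same_result = np.array_equal(sort_then_unique(), unique_then_sort())

for fn in (sort_then_unique, unique_then_sort):
    seconds = timeit.timeit(fn, number=5)
    print(f"{fn.__name__}: {seconds:.3f} s for 5 runs")
```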