6

Given the following data set, I want to get the number of unique values and the count of each unique value.

My data set:

Account_Type
Gold
Gold
Platinum
Gold

Output :

no of unique values : 2
unique values : [Gold,Platinum]
Gold : 3
Platinum :1 
Jacques Gaudin
Sidhartha
  • you may try set(). – Shiping Mar 28 '17 at 09:59
  • Does this answer your question? [Counting unique values in a column in pandas dataframe like in Qlik?](https://stackoverflow.com/questions/45759966/counting-unique-values-in-a-column-in-pandas-dataframe-like-in-qlik) – jdhao Oct 03 '22 at 06:50

3 Answers

8

Use pd.value_counts

pd.value_counts(df.Account_Type)

Gold        3
Platinum    1
Name: Account_Type, dtype: int64

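Get the number of unique values as well, combining the two Series with pd.concat (Series.append was removed in pandas 2.0):

s = pd.value_counts(df.Account_Type)
s1 = pd.Series({'nunique': len(s), 'unique values': s.index.tolist()})
pd.concat([s, s1])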

Gold                            3
Platinum                        1
nunique                         2
unique values    [Gold, Platinum]
dtype: object

Alternate Approach

df['col1'].value_counts(sort=True)                  # counts per unique value
df['col1'].value_counts(sort=True, normalize=True)  # proportions instead of counts
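
As a rough, self-contained sketch of the two calls above, the following rebuilds the question's data and puts counts and proportions side by side. The DataFrame construction and the names `counts`, `shares`, and `summary` are assumptions made for illustration; they are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question (assumed to live in a DataFrame column).
df = pd.DataFrame({"Account_Type": ["Gold", "Gold", "Platinum", "Gold"]})

# Occurrences of each unique value, most frequent first.
counts = df["Account_Type"].value_counts(sort=True)

# Same call with normalize=True: fractions that sum to 1 instead of raw counts.
shares = df["Account_Type"].value_counts(sort=True, normalize=True)

# Align both Series on their shared index for a compact overview.
summary = pd.DataFrame({"count": counts, "proportion": shares})
print(summary)
```

With this data, Gold accounts for three of four rows and Platinum for one.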
RomanHotsiy
piRSquared
  • When applying this nice code to a categorical variable of a df, I receive an error. Code: s = pd.value_counts(df.application_type); s1 = pd.Series({'nunique': len(s), 'unique values': s.index.tolist()}); s.append(s1). Error: TypeError: cannot append a non-category item to a CategoricalIndex. "application_type" is a category column from df. – NuValue Apr 20 '18 at 12:41
1

You can use set() to remove duplicates and then calculate the length:

len(set(data_set))

To count the occurrences of a particular value (assuming data_set is a list):

data_set.count(value)
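
A minimal pure-Python sketch of this approach, assuming the column has already been pulled out into a plain list called `data_set`. `collections.Counter` is swapped in here to count every value in one pass rather than calling `list.count` once per value; it is not part of the original answer.

```python
from collections import Counter

# Assumed input: the Account_Type column as a plain Python list.
data_set = ["Gold", "Gold", "Platinum", "Gold"]

# set() drops duplicates; sorting gives a stable order for display.
unique_values = sorted(set(data_set))
n_unique = len(unique_values)

# Counter tallies all values in a single pass over the list.
counts = Counter(data_set)

print(f"no of unique values : {n_unique}")
print(f"unique values : {unique_values}")
for value in unique_values:
    print(f"{value} : {counts[value]}")
```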

adabsurdum
0
    df['Account_Type'].unique()

returns the unique values of the specified column (in this case 'Account_Type') as a NumPy array.

To find the number of unique values, pass that array to len():

    len(df['Account_Type'].unique())

To find the respective counts of the unique values, you can use value_counts().
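
Putting these pieces together, a short sketch that prints output in the shape the question asks for might look like this. The DataFrame construction and variable names are assumptions for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Account_Type": ["Gold", "Gold", "Platinum", "Gold"]})
col = df["Account_Type"]

# unique() returns values in order of first appearance.
unique_values = col.unique().tolist()

# len() of the unique array gives the number of distinct values.
n_unique = len(unique_values)

# value_counts() gives the number of occurrences of each value.
counts = col.value_counts()

print(f"no of unique values : {n_unique}")
print(f"unique values : {unique_values}")
for value, count in counts.items():
    print(f"{value} : {count}")
```

Note that `col.nunique()` returns the same number here; it skips missing values by default, whereas `len(col.unique())` would count NaN as a value.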

missnomer