I have a DataFrame like this:

import pandas as pd

dict_data = {
    'Date':pd.Timestamp('20200720'),
    'Number': 123,
    'course':pd.Series(['Python', 'Quant', 'CFA', 'Finance', 'Python', 'Python', 'Finance', 'Finance']),
    'Company':['AA', 'BB', 'CC', 'DD', 'BB', 'BB', 'DD', 'CC']
}

pd.DataFrame(dict_data)

I can select a column, for example dict_data['course'], and it outputs all the data in that column. Is there a method that masks the duplicate values, so that the output looks like this?

0     Python
1      Quant
2        CFA
3    Finance
janicewww

1 Answer

You can call drop_duplicates() on the column, which keeps only the first occurrence of each value:

df = pd.DataFrame(dict_data)

In [1327]: df.course.drop_duplicates()
Out[1327]: 
0     Python
1      Quant
2        CFA
3    Finance
Name: course, dtype: object
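
If "mask" means keeping every row in place and blanking out the repeats instead of dropping them, something along these lines might work. This is only a sketch: it assumes the same sample data as the question, and the names is_repeat, deduped and masked are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': pd.Timestamp('20200720'),
    'Number': 123,
    'course': pd.Series(['Python', 'Quant', 'CFA', 'Finance',
                         'Python', 'Python', 'Finance', 'Finance']),
    'Company': ['AA', 'BB', 'CC', 'DD', 'BB', 'BB', 'DD', 'CC'],
})

# True for every value that has already appeared earlier in the column;
# the first occurrence of each value stays False (default keep='first').
is_repeat = df['course'].duplicated()

# Option 1: drop the repeats (same rows that drop_duplicates() keeps).
deduped = df['course'][~is_repeat]

# Option 2: keep all 8 rows but replace the repeats with NaN.
masked = df['course'].mask(is_repeat)

print(deduped)
print(masked)
```

Option 1 shortens the column to four rows, while Option 2 keeps the original index so the result still lines up with the other columns of df.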
Mayank Porwal