Hi, I want to remove all duplicate rows from a pandas DataFrame, keeping only the first occurrence of each. This is what I am doing:
import pandas as pd
df = pd.DataFrame({'col1':['A']*3+['B']*4+['C','B','A'],'col2':[2,3,4,2,4,2,1,3,4,4]})
print(df)
df.drop_duplicates(subset=None, keep='first', inplace=True, ignore_index=True)
This gives the correct result, but it exceeds the time limit on my system. Can someone suggest a faster solution?
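One variant that may be worth timing against the call above is building a boolean mask with `DataFrame.duplicated` and filtering with it. This is only a sketch on the same sample data. The names `baseline`, `mask`, and `deduped` are made up for illustration, and it is an assumption that this would be any faster on the real data, since both approaches have to hash every row.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    'col1': ['A'] * 3 + ['B'] * 4 + ['C', 'B', 'A'],
    'col2': [2, 3, 4, 2, 4, 2, 1, 3, 4, 4],
})

# Baseline: the call from the question, returning a new frame instead of
# modifying df in place so the two results can be compared.
baseline = df.drop_duplicates(keep='first', ignore_index=True)

# Variant: duplicated() marks every repeat after the first occurrence as True;
# inverting the mask keeps the first occurrence of each row.
mask = df.duplicated(keep='first')
deduped = df.loc[~mask].reset_index(drop=True)

print(deduped)
```

If only some columns define what counts as a duplicate, passing them through `subset=` to either call reduces the amount of data compared per row. Timing both versions on the real DataFrame (for example with `timeit`) is the only reliable way to tell whether either one helps.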