I am trying to remove duplicates based on multiple criteria:
1. Find duplicates in column df['A'].
2. Check column df['Status'] and prioritize OK over Open, and Open over Close.
3. If the duplicates have the same status, pick the latest one based on df['Col_1'].
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': ['11', '11', '12', np.nan, '13', '13', '14', '14', '15'],
    'Status': ['OK', 'Close', 'Close', 'OK', 'OK', 'Open', 'Open', 'Open', np.nan],
    'Col_1': [2000, 2001, 2000, 2000, 2000, 2002, 2000, 2004, 2000],
})
df
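To make the rule concrete, it can be read as "rank each row, then keep the best-ranked row per value of A". The following is only a rough sketch of that reading, not a confirmed solution. It assumes that a missing Status ranks below Close, that rows with a missing A are never treated as duplicates of each other, and that "latest" means the largest Col_1. The names `priority`, `_rank`, `ranked`, `keep`, and `result` are hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': ['11', '11', '12', np.nan, '13', '13', '14', '14', '15'],
    'Status': ['OK', 'Close', 'Close', 'OK', 'OK', 'Open', 'Open', 'Open', np.nan],
    'Col_1': [2000, 2001, 2000, 2000, 2000, 2002, 2000, 2004, 2000],
})

# Hypothetical ranking: a lower number means a higher priority (OK > Open > Close).
priority = {'OK': 0, 'Open': 1, 'Close': 2}

# Assumption: a missing Status gets the lowest priority of all.
ranked = (
    df.assign(_rank=df['Status'].map(priority).fillna(len(priority)))
      # Best status first; within the same status, largest Col_1 first.
      .sort_values(['_rank', 'Col_1'], ascending=[True, False])
)

# Keep the first (best) row per value of A.
# Assumption: rows where A is missing are all kept rather than deduplicated.
keep = ~ranked.duplicated(subset='A', keep='first') | ranked['A'].isna()

# Drop the helper column and restore the original row order.
result = ranked[keep].drop(columns='_rank').sort_index()
print(result)
```

Under these assumptions, the sample frame keeps the rows at index 0, 2, 3, 4, 7 and 8. Sorting first and deduplicating second avoids a groupby with a custom function, at the cost of a temporary helper column.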
Expected output:
I have tried different solutions, like the ones in the links below (using map or loc), but I am unable to find the correct way: