I am using Python with the pandas library. I have a DataFrame df, and I need to write a function that filters out consecutive duplicates, that is, removes every row whose value in column A is the same as in the row above.
Example:
import pandas as pd
df = pd.DataFrame({'A': {0: 1, 1: 2, 2: 2, 3: 3, 4: 4, 5: 5, 6: 5, 7: 5, 8: 6, 9: 7, 10: 7}, 'B': {0: 'a', 1: 'b', 2: 'c', 3: 'd', 4: 'e', 5: 'f', 6: 'g', 7: 'h', 8: 'i', 9: 'j', 10: 'k'}})
I wrote the code below:
total_len = len(df.index)
for i in range(total_len):
    if df['A'].loc[i] == df['A'].loc[i+1]:
        df['A'].drop(df['A'].index[i+1])
    else:
        df['A']
What am I doing wrong?
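
For reference, here is a minimal sketch of the behavior described above, assuming only column A decides whether a row counts as a duplicate. The function name drop_consecutive_duplicates and its parameters are hypothetical, chosen just for illustration. It also avoids three likely problems in the loop: df['A'].loc[i+1] reaches past the last label on the final iteration, drop returns a new object instead of modifying in place (so its result is discarded), and the drop is applied to the column df['A'] rather than to the rows of df.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 2, 3, 4, 5, 5, 5, 6, 7, 7],
    'B': list('abcdefghijk'),
})

def drop_consecutive_duplicates(frame, column):
    """Return the rows whose value in `column` differs from the row above."""
    # shift() moves every value down one row, so each row is compared
    # with its predecessor; the first row is compared with NaN and kept.
    keep = frame[column] != frame[column].shift()
    # Boolean indexing returns a new DataFrame; `frame` is left unchanged.
    return frame[keep]

result = drop_consecutive_duplicates(df, 'A')
print(result)
```

With the example data this keeps the rows labeled 0, 1, 3, 4, 5, 8 and 9. One caveat: consecutive NaN values would all be kept, because NaN never compares equal to NaN.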