I would like to add a repeat count for the duplicate rows. The current example only drops the duplicate rows.
import pandas as pd
data = [
    ['A', 'B', 'C'],
    ['A', 'B', 'C'],
    ['A', 'D', 'C'],
    ['A', 'D', 'C'],
]
df = pd.DataFrame(data, columns=['x1', 'x2', 'x3'])
print(df)
df1 = df.drop_duplicates(keep='first')
print(df1)
Expected output (the original frame, then the deduplicated frame with a count column):
: x1 x2 x3
: 0 A B C
: 1 A B C
: 2 A D C
: 3 A D C
: x1 x2 x3 count
: 0 A B C 2
: 2 A D C 2
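
For reference, here is a minimal sketch of one way the output above might be produced. It assumes a row counts as a duplicate only when all three columns match. The names `cols` and `counted` are made up for illustration. It uses `groupby(...).transform('size')` to attach the group size to every row before dropping duplicates, so the original index labels (0 and 2) are kept.

```python
import pandas as pd

data = [
    ['A', 'B', 'C'],
    ['A', 'B', 'C'],
    ['A', 'D', 'C'],
    ['A', 'D', 'C'],
]
df = pd.DataFrame(data, columns=['x1', 'x2', 'x3'])

# Columns that define a duplicate (assumption: all of them)
cols = list(df.columns)

# Attach to every row the number of rows that share its values
counted = df.assign(count=df.groupby(cols)[cols[0]].transform('size'))

# Keep the first occurrence of each row; the count column comes along
df1 = counted.drop_duplicates(subset=cols, keep='first')
print(df1)
```

If the original index does not matter, `df.groupby(cols).size().reset_index(name='count')` gives the same counts with a fresh 0..n-1 index. Note that `groupby` drops rows with NaN in the key columns by default, so data with missing values may need `dropna=False`.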