The example below counts duplicate rows by columns x1 and x2. The output has x1, x2 and the count. I expect it to keep x3 (taken from the first row of each group of duplicates) as well.
import re
import pandas as pd
data = [
['A','B','C'],
['A','B','D'],
['A','D','C'],
['A','D','C']
]
df = pd.DataFrame(data,columns=['x1','x2','x3'])
print(df)
df1 = df.groupby(['x1','x2']).size().reset_index()
print(df1)
Current output:
  x1 x2 x3
0  A  B  C
1  A  B  D
2  A  D  C
3  A  D  C
  x1 x2  0
0  A  B  2
1  A  D  2
Expected output:
  x1 x2 x3  0
0  A  B  C  2
1  A  D  C  2
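
One possible way to get this shape, offered as a sketch rather than the only approach: aggregate each (x1, x2) group with named aggregation, taking the first x3 value and the group size in the same call. The variable name result and the column name count are my own choices (the output above labels that column 0), and named aggregation assumes pandas 0.25 or later.

```python
import pandas as pd

# Same data as in the question.
data = [
    ['A', 'B', 'C'],
    ['A', 'B', 'D'],
    ['A', 'D', 'C'],
    ['A', 'D', 'C'],
]
df = pd.DataFrame(data, columns=['x1', 'x2', 'x3'])

# For each (x1, x2) group, keep the first x3 value and count the rows.
# "count" is an illustrative column name, not the "0" label from the question.
result = (
    df.groupby(['x1', 'x2'], as_index=False)
      .agg(x3=('x3', 'first'), count=('x3', 'size'))
)
print(result)
```

Note that 'first' picks the first non-null x3 in each group, so if x3 can contain NaN it may not match the literal first row. In that case, something like df.drop_duplicates(['x1', 'x2']) merged with the sizes might be a closer fit.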