I have two DataFrames, df0 and df1, with df1.shape[0] > df0.shape[0].
df0 and df1 have exactly the same columns.
Most of the rows of df0 are also in df1.
The indices of df0 and df1 are:
df0.index = range(df0.shape[0])
df1.index = range(df1.shape[0])
I then created dft
dft = pd.concat([df0, df1], axis=0, sort=False)
and removed duplicated rows with
dft.drop_duplicates(subset='this_col_is_not_index', keep='first', inplace=True)
I now have some duplicates in the index of dft. For example:
dft.loc[3].shape
returns
(2, 38)
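To make the situation reproducible, here is a minimal sketch on hypothetical toy data. The column names "key" and "val" and all values are assumptions for illustration; "key" plays the role of 'this_col_is_not_index'. It shows how the concatenation followed by drop_duplicates leaves repeated index labels.

```python
import pandas as pd

# Hypothetical toy frames; both use a default 0..n-1 index like df0 and df1.
df0 = pd.DataFrame({"key": ["a", "b", "c"], "val": [1, 2, 3]})
df1 = pd.DataFrame({"key": ["a", "x", "y", "z"], "val": [1, 7, 8, 9]})

# Same steps as in the question: stack the frames, then drop rows
# whose "key" value has already been seen.
dft = pd.concat([df0, df1], axis=0, sort=False)
dft = dft.drop_duplicates(subset="key", keep="first")

# Index labels that still occur more than once after deduplication.
dup_labels = dft.index[dft.index.duplicated(keep=False)].unique()
print(dup_labels.tolist())
print(dft.loc[1].shape)  # two rows share the label 1
```

The duplicates come from the index, not the data: concat keeps the original labels of both frames, and drop_duplicates only compares the chosen column.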
My aim is to change the index of the second row returned, so that the index 3 becomes unique.
This second row should instead be indexed dft.index.sort_values()[-1]+1, i.e. one more than the current largest index.
I would like to apply this operation on all duplicates.
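One possible way to express the operation I am after, as a sketch on hypothetical toy data (the frames and the "key" and "val" columns are made up for illustration): keep the first occurrence of each index label and give every later occurrence a fresh label, starting at the current maximum plus one. This assumes an integer index.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data reproducing the duplicated index labels.
df0 = pd.DataFrame({"key": ["a", "b", "c"], "val": [1, 2, 3]})
df1 = pd.DataFrame({"key": ["a", "x", "y", "z"], "val": [1, 7, 8, 9]})
dft = pd.concat([df0, df1], axis=0, sort=False)
dft = dft.drop_duplicates(subset="key", keep="first")

# True for every repeat of a label after its first occurrence.
is_repeat = dft.index.duplicated(keep="first")

# Copy the labels, then overwrite the repeats with consecutive new
# labels starting just above the current largest one.
new_index = dft.index.to_numpy().copy()
start = dft.index.max() + 1
new_index[is_repeat] = np.arange(start, start + is_repeat.sum())
dft.index = new_index

print(dft.index.tolist())
```

If the original labels do not need to be preserved at all, dft.reset_index(drop=True) or pd.concat(..., ignore_index=True) would also give a unique index, at the cost of renumbering every row.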
References:
Python Pandas: Get index of rows which column matches certain value