I have two data frames, df and df_copy. I would like to copy the data from df_copy, but only if the data is also identical. How do I do that?
import pandas as pd
d = {'Nameid': [100, 200, 300, 100]
, 'Name': ['Max', 'Michael', 'Susan', 'Max']
, 'Projectid': [100, 200, 200, 100]}
df = pd.DataFrame(data=d)
display(df.head(5))
df['nameid_index'] = df['Nameid'].astype('category').cat.codes
df['projectid_index'] = df['Projectid'].astype('category').cat.codes
display(df.head(5))
df_copy = df.copy()
df.drop(['Nameid', 'Name', 'Projectid'], axis=1, inplace=True)
df = df.drop([1, 3])
display(df.head(5))
[Output: contents of df]
[Output: contents of df_copy]
What I want: each row of df together with its matching data from df_copy, appearing exactly once.
I looked at Pandas Merging 101 and tried this merge:
df.merge(df_copy, on=['nameid_index', 'projectid_index'])
But in the result, the same row appears twice (the row for Max, with nameid_index 0 and projectid_index 0). I only want it once.
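For context, the duplicate seems to come from df_copy holding two rows with the same key pair (rows 0 and 3), so the single matching row in df is paired with both. A minimal sketch of one possible way around this is below. It assumes that rows of df_copy sharing the same nameid_index and projectid_index are exact duplicates, so dropping all but the first loses no information. The names keys, unique_copy and result are made up for the illustration.

```python
import pandas as pd

# Rebuild df_copy as in the question
df_copy = pd.DataFrame({
    'Nameid': [100, 200, 300, 100],
    'Name': ['Max', 'Michael', 'Susan', 'Max'],
    'Projectid': [100, 200, 200, 100],
})
df_copy['nameid_index'] = df_copy['Nameid'].astype('category').cat.codes
df_copy['projectid_index'] = df_copy['Projectid'].astype('category').cat.codes

# df keeps only the two index columns and rows 0 and 2, as in the question
df = df_copy[['nameid_index', 'projectid_index']].drop([1, 3])

keys = ['nameid_index', 'projectid_index']

# Assumption: rows with the same key pair are identical, so keep one per pair
unique_copy = df_copy.drop_duplicates(subset=keys)

# Each row of df can now match at most one row of unique_copy
result = df.merge(unique_copy, on=keys, how='left')
print(result)
```

If rows sharing a key pair could differ in their other columns, drop_duplicates would silently keep only the first of them, so that case would need a different rule for choosing which row to keep.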