You can use duplicated() with keep=False to flag every row that has a duplicate, then filter the flagged rows out. If you plan to assign columns to new_df later, call .copy() on the result so you don't trigger SettingWithCopyWarning.
new_df = df[~df.duplicated(subset=['A', 'C'], keep=False)].copy()
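As a quick sanity check, here is a minimal sketch with a hypothetical frame in which two rows share the same ('A', 'C') pair; both members of the pair are dropped because keep=False flags every row in a duplicate group, not just the later ones.

```python
import pandas as pd

# Hypothetical sample frame; rows 0 and 2 share the same ('A', 'C') pair.
df = pd.DataFrame({
    'A': ['foo', 'bar', 'foo', 'baz'],
    'B': [1, 2, 3, 4],
    'C': ['x', 'y', 'x', 'z'],
})

# keep=False marks every member of a duplicate group, so ~mask keeps
# only rows whose ('A', 'C') combination is unique.
new_df = df[~df.duplicated(subset=['A', 'C'], keep=False)].copy()

# Only rows 1 and 3 remain; both 'foo'/'x' rows are gone.
print(new_df)
```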

One nice feature of this method is that you can drop duplicates conditionally. For example, to drop duplicated rows only when column A equals 'foo', combine the duplicate mask with a boolean condition:
new_df = df[~( df.duplicated(subset=['A', 'B', 'C'], keep=False) & df['A'].eq('foo') )].copy()
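To illustrate the conditional drop, here is a sketch with a hypothetical frame containing two groups of fully duplicated rows; only the group where A equals 'foo' is removed, because the second condition fails for the 'bar' group.

```python
import pandas as pd

# Hypothetical frame: two fully duplicated 'foo' rows and two
# fully duplicated 'bar' rows.
df = pd.DataFrame({
    'A': ['foo', 'foo', 'bar', 'bar'],
    'B': [1, 1, 2, 2],
    'C': ['x', 'x', 'y', 'y'],
})

# A row is dropped only when it is a full duplicate AND its 'A' value
# is 'foo'; the duplicated 'bar' rows survive.
mask = df.duplicated(subset=['A', 'B', 'C'], keep=False) & df['A'].eq('foo')
new_df = df[~mask].copy()
```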

Also, if you don't want to write out column names, you can pass slices of df.columns to subset=; the same works for drop_duplicates().
# to consider all columns for identifying duplicates
df[~df.duplicated(subset=df.columns, keep=False)].copy()
# the same is true for drop_duplicates
df.drop_duplicates(subset=df.columns, keep=False)
# to consider columns in positions 0 and 2 (i.e. 'A' and 'C') for identifying duplicates
df.drop_duplicates(subset=df.columns[[0, 2]], keep=False)
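The positional variant can be sketched end to end like this, using a hypothetical frame where two rows agree on the columns in positions 0 and 2 but differ elsewhere:

```python
import pandas as pd

# Hypothetical frame: rows 0 and 1 agree on 'A' and 'C' but not 'B'.
df = pd.DataFrame({
    'A': ['foo', 'foo', 'bar'],
    'B': [1, 2, 3],
    'C': ['x', 'x', 'y'],
})

# df.columns[[0, 2]] selects 'A' and 'C' by position, so the labels
# never have to be spelled out; keep=False drops both matching rows.
new_df = df.drop_duplicates(subset=df.columns[[0, 2]], keep=False)
```

Since df.columns is a pandas Index, fancy-indexing it with a list of positions yields another Index, which subset= accepts like any sequence of labels.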