As the title states, I am trying to get rid of duplicate rows in a DataFrame with two columns, df[['Offering Family', 'Major Offering']].
I plan to merge the resulting DataFrame with another one I have on the Major Offering column, so only the Offering Family column will be carried over to the new DataFrame. I should note that I only want to drop rows whose values are repeated in both columns. If a value appears more than once in the Offering Family column but the corresponding value in the Major Offering column is different, that row should not be deleted. However, when I run the code below, I'm finding that I'm losing exactly those rows. Can anybody help? (A small made-up example of the result I'm after is at the end of this post.)
import pandas as pd

df = pd.read_excel(pipelineEx, sheet_name='Data')
# keep only the two columns I care about
dfMO = df[['Offering Family', 'Major Offering']].copy()
dfMO.filter(['Offering Family', 'Major Offering'])
# drop duplicate rows, keeping the first occurrence
dfMO = df.drop_duplicates(subset=None, keep="first", inplace=False)
# dfMO.drop_duplicates(keep=False, inplace=True)
print(dfMO)
dfMO.to_excel("Major Offering.xlsx")
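For clarity, here is a minimal sketch of the result I'm trying to get, using made-up data (the column values and the toy DataFrame name are just placeholders, not my real data): a row should only be dropped when the whole (Offering Family, Major Offering) pair repeats an earlier row.

import pandas as pd

# Made-up example: 'Storage' appears twice in Offering Family, but with two
# different Major Offering values, so both of those rows should be kept.
toy = pd.DataFrame({
    'Offering Family': ['Storage', 'Storage', 'Storage', 'Compute'],
    'Major Offering': ['Flash', 'Flash', 'Tape', 'Servers'],
})

# Drop a row only when BOTH column values repeat an earlier row,
# keeping the first occurrence of each (Offering Family, Major Offering) pair.
deduped = toy.drop_duplicates(subset=['Offering Family', 'Major Offering'], keep='first')
print(deduped)
# Expected result: Storage/Flash, Storage/Tape and Compute/Servers remain (three rows).

That is the behaviour I expect, but my real script above ends up removing rows where Offering Family repeats even though Major Offering differs.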