I'm trying to compare 2 columns in one dataframe(df1) with 2 columns in another dataframe(df2). After comparison, I want to select the rows where the first two columns do not match. You can see my attempts below and this what the dataframes look like [1]
import pandas as pd
fd1= 'Q37.xlsx'
fd2= 'Q43.xlsx'
df1 = pd.read_excel( fd1, sheetname='prio 1')
df2 = pd.read_excel( fd2, sheetname='prio 1')
closed_items= {} #items in fd1 but not in fd2
new_items={} #items in fd2 but not in fd1
In order to get closed_items, I've tried the following 3 things
closed_items.where(df1[df1['Code'].values!=df2[df2['Code'].values and
df1['Owner'].values != key in df1['Owner'].values)
and gotten
ValueError: Can only compare identically-labeled Series objects
I've also tried
Closed_items = df2.loc[(df2['Code'] != df1['Code']) and
df2.loc[(df2['Owner'] != df1['Owner'])]
And lastly I tried
for key in df1['Code'].values:
if key in df1['Code'].values != key in df1['Code'].values or key in
df1['Owner'].values != key in df1['Owner'].values:
closed_items.append()
else:
pass
Which gave this syntax
The truth value of an array with more than one element is ambiguous.
Use a.any() or a.all()
...
AFP= pd.ExcelWriter("AFP.xlsx", engine='xlsxwriter')
closed_items.to_excel(AFP, sheet_name='Closed', index=False)