I would like to match two columns, and if all of them are True. I need to drop the subset from a very large dataset.
I am running into 2 problem:
- when rowA.exprityDate = NaN and rowB.exprityDate = NaN, my condition returns false. I believe its's because NaN is a special condition. Is there any way around this. My matching condition state below:
- Is there any way to conduct a cross matching condition? rowA.baseCptyId[0] == rowB.extCptyId[1] and rowA.extcptyId[1] == RowB.extcptyID[0]...It's a bit more clearer using the table below where the Matched column is False for the first two rows.
Here is the matching condtion I am using:
combine_series = pd.DataFrame(dict(rowA = rowA, rowB = rowB))
combine_series['Matched'] = np.where(combine_series['rowA'] == combine_series['rowB'], True, False)
Here is the resulting Matched Column
rowA rowB Matched
baseCptyID 2231200 5900 False
extCptyID 5900 2231200 False
notional 3.4e+07 3.4e+07 True
startDate 2015-05-29 2015-05-29 True
expiryDate NaN NaN False
settlementDate 2020-06-29 2020-06-29 True
rate 0.03375 0.03375 True
spread NaN NaN False
paymentFreq PA PA True
resetFreq PA PA True
modelUsed FixedLeg FixedLeg True
PayoutCCY AUD AUD True
DayCountConv ACT/ACT ICMA ACT/ACT ICMA True
join_column 2231200 2231200 True