There are 2 .csv files - file1(with heads Order_ID, Amount, Commission, GST, etc.) and file2(with heads hbx_id, amount, status, etc.).
I have to tell that the “amount” in File2 matches with (“Amount” – “Commission” – “GST”) in File1. This has to be performed basis “Order_ID” in File1 and “hbx_id” in File2. I am required to put the status as “matched” against items that are matching else put “not matched”. For items in File2 which are not present in File1, I am supposed to enter the status “not present” in the status column in File2.
I solved the problem with the use of nested for loops and my code looked like this -
for i in file2.index:
flag = 0
for j in file1.index:
if (file1['Order_ID'][j]).count(file2['hbx_id'][i])>0:
flag = 1
if file1['Amount'][j]-file1['Commission'][j]-file1['GST'][j]==file2['amount'][i]:
file2.at[i,'Status'] = 'matched'
else:
file2.at[i,'Status'] = 'not matched'
if flag == 0:
file2.at[i,'Status'] = 'not present'
This worked just fine, but my question is that this would not be a feasible option if the datasets are big(coz of the for loop iterations). Is there any substitute for this method, a function from pandas, or any other library or something?
EDIT:- The expected output looks like this -
hbx_id tran_date amount Status
0 AMBLEY127928' 01-09-2019 36.00 matched
1 SIPTBT18469' 01-09-2019 65.50 not matched
2 'KITJUJ272150' 01-09-2019 37.28 matched
3 'LEGNOE40184' 01-09-2019 20.00 matched
4 hb-pay_9492291-1' 01-09-2019 313.77 not matched
5 'SF3C145362' 01-09-2019 25.00 matched
6 hb-AMBLEY127931-1' 01-09-2019 75.50 not matched
7 'HEANEM26099' 01-09-2019 19.62 matched
8 hb-COOINT242360-1' 01-09-2019 44.15 not present
9 'MUNDCT3880' 01-09-2019 142.00 matched
10 'KDA178133' 01-09-2019 24.53 matched
11 'KSCA1492699' 01-09-2019 9.81 matched
12 'FOROCH372249' 01-09-2019 14.72 matched
13 'CHAMXN202625' 01-09-2019 29.43 matched
14 'DOSFSX48229' 01-09-2019 49.06 matched
15 'ARAEZG242324' 01-09-2019 25.00 matched
16 hb-SHRWOG391499-1' 01-09-2019 19.56 matched
17 'MALSJU13377' 01-09-2019 16.50 matched
18 hb-OMKXDT420930-1' 01-09-2019 10.00 matched
19 'SCCA528997' 01-09-2019 20.00 not matched
file1.head()
Transaction_ID Order_ID Transaction_Date Updated_Date Transaction_Type Status MID Customer_ID Customer_Nickname Customer_Phone_No. ... Commission_Rate Product_Code GMV_Tier Transaction_Slab Request_Type Refund_Type Refund_Actor Split_Flag Split_MID Split_Id
0 '20190901111212800100168044885072100' hb-SIPJZO32656-1' '2019-09-01 00:54:14' '2019-09-01 00:54:33' 'ACQUIRING' 'SUCCESS' 'EatGoo08603428284199' NaN NaN 9804817853' ... NaN '51051000100000000010' NaN NaN 'SEAMLESS' NaN NaN NaN NaN NaN
1 '20190901111212800110168956985465456' 'SHRWOG391493' '2019-09-01 02:14:43' '2019-09-01 02:14:46' 'ACQUIRING' 'SUCCESS' 'EatGoo08603428284199' NaN NaN NaN ... NaN '51051000100000000001' NaN NaN 'SEAMLESS_3D_FORM' NaN NaN NaN NaN NaN
2 '20190901111212800110168835885639302' 'MIDYEY245461' '2019-09-01 00:23:36' '2019-09-01 00:23:39' 'ACQUIRING' 'SUCCESS' 'EatGoo08603428284199' NaN NaN NaN ... NaN '51051000100000000001' NaN NaN 'SEAMLESS_3D_FORM' NaN NaN NaN NaN NaN
3 '20190901111212800110168061185352899' 'KITJUJ272194' '2019-09-01 03:40:00' '2019-09-01 03:40:03' 'ACQUIRING' 'SUCCESS' 'EatGoo08603428284199' NaN NaN NaN ... NaN '51051000100000000001' NaN NaN 'SEAMLESS_3D_FORM' NaN NaN NaN NaN NaN
4 '20190901111212800110168140285515665' 'EXLRYB3995' '2019-09-01 00:08:21' '2019-09-01 00:08:21' 'ACQUIRING' 'SUCCESS' 'EatGoo08603428284199' '216810000026794940140' NaN '9515053146' ... NaN '51051000100000000001' NaN NaN 'AUTO_DEBIT' NaN NaN NaN NaN NaN
file2.head()
hbx_id tran_date amount
0 AMBLEY127928' 01-09-2019 36.00
1 SIPTBT18469' 01-09-2019 65.50
2 'KITJUJ272150' 01-09-2019 37.28
3 'LEGNOE40184' 01-09-2019 20.00
4 hb-pay_9492291-1' 01-09-2019 313.77