Hello how can I do to only the lines where val is different in the 2 dataframes.
The way I need to filter is the following:
For each row of F1 (take each id1 if it is not null search for id1 F2 ) compare the VAL and if its different return it. else look at id2 and do the same thing.
Notice that I can have id1 or id2 or both as shown below:
d2 = {'id1': ['X22', 'X13',np.nan,'X02','X14'],'id2': ['Y1','Y2','Y3','Y4',np.nan],'VAL1':[1,0,2,3,0]}
F1 = pd.DataFrame(data=d2)
d2 = {'id1': ['X02', 'X13',np.nan,'X22','X14'],'id2': ['Y4','Y2','Y3','Y1','Y22'],'VAL2':[1,0,4,3,1]}
F2 = pd.DataFrame(data=d2)
Where F1 is:
id1 id2 VAL1
0 X22 Y1 1
1 X13 Y2 0
2 NaN Y3 2
3 X02 Y4 3
4 X14 NaN 0
and F2 is:
id1 id2 VAL2
0 X02 Y4 1
1 X13 Y2 0
2 NaN Y3 4
3 X22 Y1 3
4 X14 Y22 1
Expected output:
d2 = {'id1': ['X02',np.nan,'X22','X14'],'id2': ['Y4','Y3','Y1',np.nan],'VAL1':[3,2,1,0],'VAL2':[1,4,3,1]}
F3 = pd.DataFrame(data=d2)
id1 id2 VAL1 VAL2
0 X02 Y4 3 1
1 NaN Y3 2 4
2 X22 Y1 1 3
3 X14 NaN 0 1