I have two almost identical pandas DataFrames with 5 common columns. I want to add the second DataFrame, which has a new column, to the first.
When the columns 'Lot name', 'WAFER' and 'SITE' match (green), I want the new value placed in the same row. Where they do not match, I want the value to be NaN, as shown below.
I have to do this for over 160 distinct columns, any of which may have matching Lot name, WAFER and SITE values.
I have tried the various merge (left, right, outer) and concat options, but I just can't seem to get it right. Any help or comments are appreciated.
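To make the desired result concrete, here is a minimal sketch with made-up data. The frames `df1` and `df2`, the value columns `param_1` and `param_2`, and all the values are hypothetical. It also assumes that all five shared columns (not only 'Lot name', 'WAFER' and 'SITE') can serve as merge keys, and that each key combination appears at most once per frame.

```python
import pandas as pd

# Assumption: the five columns shared by both frames are used as merge keys.
keys = ["LOT", "Lot name", "WAFER", "SITE", "PRODUCT"]

# Hypothetical first frame with one value column.
df1 = pd.DataFrame({
    "LOT": ["A1", "A1", "A2"],
    "Lot name": ["lotA", "lotA", "lotB"],
    "WAFER": [1, 1, 2],
    "SITE": [1, 2, 1],
    "PRODUCT": ["P", "P", "P"],
    "param_1": [0.10, 0.20, 0.30],
})

# Hypothetical second frame: same key columns, a different value column.
df2 = pd.DataFrame({
    "LOT": ["A1", "A2", "A3"],
    "Lot name": ["lotA", "lotB", "lotC"],
    "WAFER": [1, 2, 3],
    "SITE": [1, 1, 1],
    "PRODUCT": ["P", "P", "P"],
    "param_2": [1.0, 3.0, 4.0],
})

# An outer merge keeps every key combination from either frame.
# Matching keys share one row; missing values are filled with NaN.
merged = df1.merge(df2, on=keys, how="outer")
print(merged)
```

With this data, the rows for lotA/1/1 and lotB/2/1 carry both values, lotA/1/2 has NaN in `param_2`, and lotC/3/1 has NaN in `param_1`.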
Edit (follow-up question):
I am trying to use this in a loop, where each iteration generates a new DataFrame, assigned to temp, that needs to be merged with the result of the previous iterations. I cannot merge with an empty DataFrame, as it raises a merge error. How can I achieve this?
alldata = pd.DataFrame()
for i in range(len(operation)):
    # keep only the rows for this operation/parameter pair
    temp = data[data['OPE_NO'].isin([operation[i]])]
    temp = temp[temp['PARAM_NAME'].isin([parameter[i]])]
    temp = temp.reset_index(drop=True)
    temp = temp[["LOT",'Lot name','WAFER',"SITE","PRODUCT",'PARAM_VALUE_NUMBER']]
    # give the value column a name unique to this operation/parameter
    temp = temp.rename(columns={'PARAM_VALUE_NUMBER':'PMRM28LEMCKLYTFR.1~'+operation[i]+'~'+parameter[i]})
    # this is the merge that fails while alldata is still empty
    alldata.merge(temp,how='outer')
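One possible way around the empty-DataFrame problem, sketched under assumptions, is to collect each per-iteration frame in a list and merge them all afterwards with `functools.reduce`, so no merge ever involves an empty frame. The contents of `data`, `operation` and `parameter` below are made up for illustration. The sketch also assumes the five shared columns are the merge keys and that each key combination occurs at most once per operation/parameter pair. Note that `DataFrame.merge` returns a new frame rather than modifying in place, so the result has to be kept.

```python
from functools import reduce

import pandas as pd

# Assumption: these five shared columns identify a row.
keys = ["LOT", "Lot name", "WAFER", "SITE", "PRODUCT"]

# Hypothetical long-format input, one row per measured value.
data = pd.DataFrame({
    "LOT": ["A1", "A1", "A1", "A2"],
    "Lot name": ["lotA", "lotA", "lotA", "lotB"],
    "WAFER": [1, 1, 1, 2],
    "SITE": [1, 2, 1, 1],
    "PRODUCT": ["P", "P", "P", "P"],
    "OPE_NO": ["100", "100", "200", "200"],
    "PARAM_NAME": ["THK", "THK", "CD", "CD"],
    "PARAM_VALUE_NUMBER": [5.0, 5.5, 0.8, 0.9],
})
operation = ["100", "200"]   # hypothetical operation numbers
parameter = ["THK", "CD"]    # hypothetical parameter names

frames = []
for ope, par in zip(operation, parameter):
    # Select the rows for this operation/parameter pair.
    temp = data[(data["OPE_NO"] == ope) & (data["PARAM_NAME"] == par)]
    # Keep the keys plus the value, renamed to a unique column name.
    temp = temp[keys + ["PARAM_VALUE_NUMBER"]].rename(
        columns={"PARAM_VALUE_NUMBER": f"PMRM28LEMCKLYTFR.1~{ope}~{par}"}
    )
    frames.append(temp)

# Merge the collected frames pairwise; each merge result is passed on
# as the left side of the next merge. reduce() needs a non-empty list.
alldata = reduce(
    lambda left, right: left.merge(right, on=keys, how="outer"),
    frames,
)
print(alldata)
```

If a key combination can occur more than once within a single pair, the outer merge would multiply rows, so duplicates would need to be handled first.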