I have two PySpark dataframes with different values that I want to merge on some condition. This is what I have:
DF-1
date        person_surname  person_order_number  item
2017-08-09  pearson         1                    shoes
2017-08-09  zayne           3                    clothes
DF-2
date        person_surname  person_order_number  person_salary
2017-08-09  pearson         2                    $1000
2017-08-09  zayne           5                    $2000
I want to merge DF-1 and DF-2 such that the surnames of the people match and the person_order_number is merged correctly. So I want the following returned:
DF_pearson
date        person_surname  person_order_number  item     salary
2017-08-09  pearson         1                    shoes
2017-08-09  pearson         2                             $1000
DF_Zayne
date        person_surname  person_order_number  item     salary
2017-08-09  zayne           3                    clothes
2017-08-09  zayne           5                             $2000
How do I achieve this? I then want to perform operations on each of these dataframes as well.