I've three datasets:
dataset 1
Customer1 Customer2 Exposures + other columns
Nick McKenzie Christopher Mill 23450
Nick McKenzie Stephen Green 23450
Johnny Craston Mary Shane 12
Johnny Craston Stephen Green 12
Molly John Casey Step 1000021
dataset2 (unique Customers: Customer 1 + Customer 2)
Customer Age
Nick McKenzie 53
Johnny Craston 75
Molly John 34
Christopher Mill 63
Stephen Green 65
Mary Shane 54
Casey Step 34
Mick Sale
dataset 3
Customer1 Customer2 Exposures + other columns
Mick Sale Johnny Craston
Mick Sale Stephen Green
Exposures refers to Customer 1 only.
There are other columns omitted for brevity. Dataset 2 is built by getting unique customer 1 and unique customer 2: no duplicates are in that dataset. Dataset 3 has the same column of dataset 1.
I'd like to add the information from dataset 1 into dataset 2 to have
Final dataset
Customer Age Exposures + other columns
Nick McKenzie 53 23450
Johnny Craston 75 12
Molly John 34 1000021
Christopher Mill 63
Stephen Green 65
Mary Shane 54
Casey Step 34
Mick Sale
The final dataset should have all Customer1 and Customer 2 from both dataset 1 and dataset 3, with no duplicates. I have tried to combine them as follows
result = pd.concat([df2,df1,df3], axis=1)
but the result is not that one I'd expect. Something wrong is in my way of concatenating the datasets and I'd appreciate it if you can let me know what is wrong.