I have two data frames (df1
and df2
) which I wish to combine and, where there is a match on a common field, combine those rows' data, otherwise simply add them to the resulting data frame.
They share a common field (specimenid
) and where there are matches between df1
and df2
, I wish to combine those individual rows' data to produce a single row, featuring the fields of both data frames.
For example, my data frames might be something like this:
df1:
specimenid firstname age
001 bob 45
004 stuart 65
005 eunice 43
006 robert 20
007 james 40
df2:
specimenid surname salary department
001 hastings £45,000 sales
002 smith £28,500 accounting
007 bond £150,000 international relations
008 jennings £50,000 contracts
And the resulting data frame should look like this:
specimenid firstname age surname salary department
001 bob 45 hastings £45,000 sales
002 NA NA smith £28,500 accounting
004 stuart 65 NA NA NA
005 eunice 43 NA NA NA
006 robert 20 NA NA NA
007 james 40 bond £150,000 international relations
008 NA NA jennings £50,000 contracts
The resulting data frame will be a combination of the data frames' rows where there is a match on specimenid
in both df1
and df2
and for those records where there is no match, they will be added as new rows with NA
s populating the unmatched fields.
I want to include the data from both data frames regardless of whether there is a match or not. If there is a match, then I wish to combine the data frames' rows into a single entry.
I feel that this is a clear example and explanation, but if further clarification is needed then I will update the question.