I have two datasets like so:
training.csv
last_name ob1 ob2
Adam 2:01 2:02
Barry, S 3:30 2:50
Barry, D 2:45
Charlie 4:00
Don 2:00 1:50
Earl 2:50 2:30
Johnson, A 2:57 2:54
Johnson, T 3:15 3:10
and
racing.csv
last_name first_name 1mile-time 500m-time
Barry Sue 4:45 1:50
Don Regan 4:35 0:50
Earl Sage 4:50 1:30
Johnson Adam 4:37 1:54
Johnson Terry 4:50 2:10
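So I used merge(training, racing, by = "last_name", all = TRUE), but that does not match people who share a last name. In training.csv, a shared last name was entered as the last name and first initial separated by a comma (e.g. "Barry, S"), while racing.csv keeps the plain last name and the first name in separate columns.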
Another important thing to note: not everyone who goes to training makes the races, so there will be some unique names in training.csv that are not present in racing.csv.
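To make the problem reproducible, here is a small sketch that rebuilds both tables by hand and runs the merge from above. The column names `mile_time` and `m500_time` are my own stand-ins for `1mile-time` and `500m-time` (which are not syntactic R names), and I am assuming the lone times for "Barry, D" and "Charlie" belong in `ob1`.

```r
# Toy copies of the two files; values are taken from the tables above.
# Assumption: single times for "Barry, D" and "Charlie" are ob1 values.
training <- data.frame(
  last_name = c("Adam", "Barry, S", "Barry, D", "Charlie", "Don", "Earl",
                "Johnson, A", "Johnson, T"),
  ob1 = c("2:01", "3:30", "2:45", "4:00", "2:00", "2:50", "2:57", "3:15"),
  ob2 = c("2:02", "2:50", NA, NA, "1:50", "2:30", "2:54", "3:10"),
  stringsAsFactors = FALSE
)

# Assumption: mile_time / m500_time stand in for 1mile-time / 500m-time.
racing <- data.frame(
  last_name  = c("Barry", "Don", "Earl", "Johnson", "Johnson"),
  first_name = c("Sue", "Regan", "Sage", "Adam", "Terry"),
  mile_time  = c("4:45", "4:35", "4:50", "4:37", "4:50"),
  m500_time  = c("1:50", "0:50", "1:30", "1:54", "2:10"),
  stringsAsFactors = FALSE
)

# The merge as attempted: exact string match on last_name, keep all rows.
naive <- merge(training, racing, by = "last_name", all = TRUE)

# Rows that received both training and racing data.
matched <- naive[!is.na(naive$ob1) & !is.na(naive$mile_time), ]
print(matched$last_name)
```

Only the unshared names (Don and Earl) line up; "Barry, S" and "Barry" are different strings, so the shared-name rows end up as separate, half-empty rows.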
Desired output
last_name first_name ob1 ob2 1mile-time 500m-time
Adam Bob 2:01 2:02
Barry, S Sue 3:30 2:50 4:45 1:50
Barry, D Derrick 2:45
Charlie Charles 4:00
Don Regan 2:00 1:50 4:35 0:50
Earl Sage 2:50 2:30 4:50 1:30
Johnson, A Adam 2:57 2:54 4:37 1:54
Johnson, T Terry 3:15 3:10 4:50 2:10
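One possible direction, as a rough sketch rather than a definitive solution: build a helper key in racing that uses the "last name, initial" form whenever that form appears in training, and the plain last name otherwise, then merge on that key. The `key` and `with_initial` names are hypothetical, `mile_time` and `m500_time` stand in for `1mile-time` and `500m-time`, and this assumes the comma form is always written exactly as "Last, I" with a single space.

```r
# Toy copies of the two files (same assumptions as in the tables above:
# lone times go in ob1, and the racing columns are renamed to be syntactic).
training <- data.frame(
  last_name = c("Adam", "Barry, S", "Barry, D", "Charlie", "Don", "Earl",
                "Johnson, A", "Johnson, T"),
  ob1 = c("2:01", "3:30", "2:45", "4:00", "2:00", "2:50", "2:57", "3:15"),
  ob2 = c("2:02", "2:50", NA, NA, "1:50", "2:30", "2:54", "3:10"),
  stringsAsFactors = FALSE
)
racing <- data.frame(
  last_name  = c("Barry", "Don", "Earl", "Johnson", "Johnson"),
  first_name = c("Sue", "Regan", "Sage", "Adam", "Terry"),
  mile_time  = c("4:45", "4:35", "4:50", "4:37", "4:50"),
  m500_time  = c("1:50", "0:50", "1:30", "1:54", "2:10"),
  stringsAsFactors = FALSE
)

# Candidate key in the "Last, I" style used by training.csv.
with_initial <- paste0(racing$last_name, ", ", substr(racing$first_name, 1, 1))

# Use the "Last, I" form only if training actually contains it;
# otherwise fall back to the plain last name.
racing$key <- ifelse(with_initial %in% training$last_name,
                     with_initial, racing$last_name)

# Merge on the helper key; the key column keeps training's name, last_name.
merged <- merge(
  training,
  racing[, c("key", "first_name", "mile_time", "m500_time")],
  by.x = "last_name", by.y = "key", all = TRUE
)
print(merged)
```

With the toy data this gives one row per person in training, and "Johnson, A" picks up Adam Johnson's racing.csv times (4:37 and 1:54). The first names of people who never raced cannot be recovered from these two files alone, so `first_name` is NA for them here.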