I noticed that full_join has been doubling rows when I match on a column with duplicate ids. Is there a way to include the unique rows from two datasets without duplicating data? I could imagine making another unique identifier.
library(dplyr)
## two-row output, as expected
x <- tibble(id = c(1,2))
full_join(x, x, by="id")
#> # A tibble: 2 x 1
#> id
#> <dbl>
#> 1 1
#> 2 2
## rows with id 1 double: each of the two 1s matches both 1s (2 x 2 = 4 rows), plus one row for id 2
y <- tibble(id = c(1,1,2))
full_join(y, y, by="id")
#> # A tibble: 5 x 1
#> id
#> <dbl>
#> 1 1
#> 2 1
#> 3 1
#> 4 1
#> 5 2
Created on 2020-07-21 by the reprex package (v0.3.0)
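A minimal sketch of the "another unique identifier" idea mentioned in the question. It assumes that pairing the k-th occurrence of an id in one table with the k-th occurrence in the other is the behavior wanted. The `occurrence` column and the `y_keyed` and `result` names are hypothetical, introduced here only for illustration.

```r
library(dplyr)

y <- tibble(id = c(1, 1, 2))

# Number each repeat of an id (1st, 2nd, ...) so that the pair
# (id, occurrence) is unique within the table.
# `occurrence` is a hypothetical helper column, not part of the original data.
y_keyed <- y %>%
  group_by(id) %>%
  mutate(occurrence = row_number()) %>%
  ungroup()

# Joining on both columns matches the k-th duplicate only with the
# k-th duplicate on the other side, so no many-to-many expansion occurs.
result <- full_join(y_keyed, y_keyed, by = c("id", "occurrence"))

nrow(result)
#> [1] 3
```

If the two datasets carry different numbers of duplicates for an id, the unmatched occurrences would still be kept by full_join, with NA in any columns coming from the other table. Whether positional pairing like this is appropriate depends on what the duplicate rows actually represent.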