I'm using dplyr (actually dbplyr) and I have 2 tables. I've tried to reproduce samples below:
df1 <- tibble('y' = c(2001,2001,2001,2001,2002,2002,2002,2002), 'id' = c(7,8,7,8,7,8,7,8), 'c' = c(0,0,0,NA,NA,1,1,NA))
y id c
<dbl> <dbl> <dbl>
1 2001 7 0
2 2001 8 0
3 2001 7 0
4 2001 8 NA
5 2002 7 NA
6 2002 8 1
7 2002 7 1
8 2002 8 NA
and:
df2 <- tibble('y' = c(2001,2001,2002), 'id' = c(7,8,7), 'c' = c(1,1,1))
y id c
<dbl> <dbl> <dbl>
1 2001 7 1
2 2001 8 1
3 2002 7 1
What I want to do is merge the column c, so that I end up with df1, but with any relevant values, replaced in from df2.
y id c
<dbl> <dbl> <dbl>
1 2001 7 0
2 2001 8 0
3 2001 7 0
4 2001 8 1
5 2002 7 1
6 2002 8 1
7 2002 7 1
8 2002 8 NA
In this example I have only put one column, c, but in reality my data has hundreds of columns, and I would like to do this across all the columns.
I've seen this answer: Merge 2 columns into 1 column but its in Python, and only for 1 column.
I've tried using regular merges, but it doesn't work. I can do a merge, keeping both columns, then use a mutate with an ifelse statement, to check if the df2 column has a NA, and use the df2 column if it isn't, otherwise keep the df1 column. But this is only for 1 column, and I need this across multiple columns.
Any help would be appreciated