I have these two dataframes for example:
dates = c('2020-11-19', '2020-11-20', '2020-11-21')
df1 <- data.frame(dates, area = c('paris', 'london', 'newyork'),
rating = c(10, 5, 6),
rating2 = c(5, 6, 7))
df2 <- data.frame(dates, area = c('budapest', 'moscow', 'valencia'),
rating = c(1, 2, 1))
> df1
dates area rating rating2
1 2020-11-19 paris 10 5
2 2020-11-20 london 5 6
3 2020-11-21 newyork 6 7
> df2
dates area rating
1 2020-11-19 budapest 1
2 2020-11-20 moscow 2
3 2020-11-21 valencia 1
When performing an outer join using dplyr:
df <- df1 %>%
full_join(df2, by = c('dates', 'area'))
the result is like this:
dates area rating.x rating2 rating.y
1 2020-11-19 paris 10 5 NA
2 2020-11-20 london 5 6 NA
3 2020-11-21 newyork 6 7 NA
4 2020-11-19 budapest NA NA 1
5 2020-11-20 moscow NA NA 2
6 2020-11-21 valencia NA NA 1
i.e. the rating columns from the two dataframes are not blended together but two separate columns are created.
How do I get a result like this?
dates area rating rating2
1 2020-11-19 paris 10 5
2 2020-11-20 london 5 6
3 2020-11-21 newyork 6 7
4 2020-11-19 budapest 1 NA
5 2020-11-20 moscow 2 NA
6 2020-11-21 valencia 1 NA
Thanks to the solutions provided by @kybazzi, the desired result was obtained.
df <- df1 %>%
bind_rows(df2)
FOLLOW-UP
As a follow-up question, I would like to join the following to the joined dataframe:
df3 <- data.frame(dates, area = c('budapest', 'moscow', 'valencia'),
rating2 = c(3, 2, 5))
Using the same method, the result is this:
> df_final <- df %>%
+ bind_rows(df3)
> df_final
dates area rating rating2
1 2020-11-19 paris 10 5
2 2020-11-20 london 5 6
3 2020-11-21 newyork 6 7
4 2020-11-19 budapest 1 NA
5 2020-11-20 moscow 2 NA
6 2020-11-21 valencia 1 NA
7 2020-11-19 budapest NA 3
8 2020-11-20 moscow NA 2
9 2020-11-21 valencia NA 5
How do i get a result like this:
dates area rating rating2
1 2020-11-19 paris 10 5
2 2020-11-20 london 5 6
3 2020-11-21 newyork 6 7
4 2020-11-19 budapest 1 3
5 2020-11-20 moscow 2 2
6 2020-11-21 valencia 1 5