Create new dataframe from a subset dataframe

Question

I have a dataframe like this:

df = data.frame(id = c(1,1,3,4,4), stockoprice1 = c(3,4,2,45,1))

And a second like this:

df2 = data.frame(id = c(1,4), name = c("price1","price2"))

I would like to merge them in order to take results like this:

> dfexpected = data.frame(id = c(1,1,3,4,4), stockprice1 = c(3,4,2,45,1), name = c("price1","price1",NA,"price2","price2"))
> dfexpected
  id stockprice1   name
1  1           3 price1
2  1           4 price1
3  3           2   <NA>
4  4          45 price2
5  4           1 price2

Which is the appropriate merge option?

@kath no it is not duplicate. non of this options can produce the results — Elr Mant, Feb 09 '19 at 21:50

score 1 · Accepted Answer · answered Feb 09 '19 at 21:41

df = data.frame(id = c(1,1,3,4,4), stockoprice1 = c(3,4,2,45,1))

df2 = data.frame(id = c(1,4), name = c("price1","price2"))

dfexpected <- dplyr::left_join(df, df2)
dfexpected

#> Joining, by = "id"
#>   id stockoprice1   name
#> 1  1            3 price1
#> 2  1            4 price1
#> 3  3            2   <NA>
#> 4  4           45 price2
#> 5  4            1 price2

Created on 2019-02-10 by the reprex package (v0.2.0).

score 1 · Answer 2 · answered Feb 09 '19 at 21:55

you can use base merge while using tidyverse:

df %>% 
  merge(df2, by = "id", all = TRUE)

  id stockoprice1   name
1  1            3 price1
2  1            4 price1
3  3            2   <NA>
4  4           45 price2
5  4            1 price2

I love piped options, but base merge is really flexible for different kinds of joins by the options set from all, all.x, or all.y. It's easy to experiment to get the expected output.

Create new dataframe from a subset dataframe

2 Answers2