-1

I have two datasets, df and df2, that look like this:

(image of the two datasets not preserved; the code below rebuilds them)

I would like to combine these two datasets. How can I do that? I can't use rbind because df has more variables than df2. Sample data can be built with this code:

df<-structure(list(ID = c(1, 2, 3), High = c(25, 36, 75), weight = c(38, 
58, 36), date = c(1, 1, 1)), row.names = c(NA, -3L), class = c("tbl_df", 
"tbl", "data.frame"))

df2<-structure(list(ID = c(1, 2, 3, 1, 2, 3), weight = c(69, 58, 35, 
65, 24, 15), date = c(3, 3, 3, 2, 2, 2)), row.names = c(NA, -6L
), class = c("tbl_df", "tbl", "data.frame"))

The final outcome should look something like this: (image of the expected output not preserved)

Stataq

3 Answers

2

If your dataset is large, try the data.table package for operations like this. The package's vignettes are a good place to learn more.

Here is the code using the data.table package:

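library(data.table)

# Convert both tibbles to data.tables by reference (no copy is made)
setDT(df)
setDT(df2)

# Stack the rows; fill = TRUE adds any missing columns and fills them with NA
result <- rbindlist(list(df, df2), fill = TRUE)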

Note that the key argument is fill = TRUE: it allows the tables to have different columns, and any column missing from a table is filled with NA.
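To make the effect of fill = TRUE concrete, here is a minimal self-contained sketch. The names dt1, dt2, no_fill and stacked are only illustrative; the values are the ones from the question's sample data, rebuilt as data.tables.

```r
library(data.table)

# Same values as the question's sample data, built directly as data.tables
dt1 <- data.table(ID = c(1, 2, 3), High = c(25, 36, 75),
                  weight = c(38, 58, 36), date = c(1, 1, 1))
dt2 <- data.table(ID = c(1, 2, 3, 1, 2, 3),
                  weight = c(69, 58, 35, 65, 24, 15),
                  date = c(3, 3, 3, 2, 2, 2))

# Without fill = TRUE the column counts differ, so rbindlist() stops with an error
no_fill <- tryCatch(rbindlist(list(dt1, dt2)), error = function(e) "error")

# With fill = TRUE the columns are matched by name and the missing High becomes NA
stacked <- rbindlist(list(dt1, dt2), fill = TRUE)
```

After this runs, stacked has nine rows, and High is NA only in the six rows that came from dt2.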

  • Thanks so much. Is it possible to fill High for df2 using the values from df, so that the final data has a High value for every record? – Stataq Feb 12 '21 at 16:43
  • 1
    By that, do you mean that the rows from df2 should get the same High as the matching IDs in df? Try this: rbindlist(list(df, df2), fill = TRUE)[, High := max(High, na.rm = TRUE), by = ID]. This 'trick' should get the job done if that's what you meant. – Abhishek Arora Feb 12 '21 at 17:14
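The max() trick in the comment relies on each ID having exactly one non-missing High. A possible alternative, sketched here under the assumption that the first table holds one High per ID, is a data.table update join. The names dt1, dt2 and stacked are illustrative, not from the original posts.

```r
library(data.table)

# Same values as the question's sample data
dt1 <- data.table(ID = c(1, 2, 3), High = c(25, 36, 75),
                  weight = c(38, 58, 36), date = c(1, 1, 1))
dt2 <- data.table(ID = c(1, 2, 3, 1, 2, 3),
                  weight = c(69, 58, 35, 65, 24, 15),
                  date = c(3, 3, 3, 2, 2, 2))

# Stack the rows first; High is NA for the rows that came from dt2
stacked <- rbindlist(list(dt1, dt2), fill = TRUE)

# Update join: look up each row's ID in dt1 and copy dt1's High (i.High) into stacked
stacked[dt1, on = "ID", High := i.High]
```

The join states the intent (look up High by ID) directly, so it does not depend on how the non-missing values happen to aggregate.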
1

You could use bind_rows() from dplyr:

df1 <- tibble::tribble(
  ~ID, ~High, ~weight, ~date, 
  1, 25, 38, 1,
  2, 36, 58, 1,
  3, 75, 36, 1
)

df2 <- tibble::tribble(
  ~ID, ~weight, ~date, 
  1, 69, 3,
  2, 58, 3, 
  3, 35, 3, 
  1, 65, 2, 
  2, 24, 2,
  3, 15, 2
)


dplyr::bind_rows(df1, df2)
# # A tibble: 9 x 4
#      ID  High weight  date
#   <dbl> <dbl>  <dbl> <dbl>
# 1     1    25     38     1
# 2     2    36     58     1
# 3     3    75     36     1
# 4     1    NA     69     3
# 5     2    NA     58     3
# 6     3    NA     35     3
# 7     1    NA     65     2
# 8     2    NA     24     2
# 9     3    NA     15     2
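If every row should also carry a High value, as the asker requested in a comment on another answer, one possible dplyr approach is to look up High by ID before binding. This is only a sketch: high_by_id and filled are illustrative names, and it assumes the first table has exactly one High per ID.

```r
library(dplyr)

# Same values as the question's sample data
df1 <- tibble(ID = c(1, 2, 3), High = c(25, 36, 75),
              weight = c(38, 58, 36), date = c(1, 1, 1))
df2 <- tibble(ID = c(1, 2, 3, 1, 2, 3),
              weight = c(69, 58, 35, 65, 24, 15),
              date = c(3, 3, 3, 2, 2, 2))

# Lookup table with one High per ID, taken from df1
high_by_id <- select(df1, ID, High)

# Attach High to df2 by ID, then stack; bind_rows() matches columns by name
filled <- bind_rows(df1, left_join(df2, high_by_id, by = "ID"))
```

Because bind_rows() matches columns by name, it does not matter that High ends up as the last column of the joined df2.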

DaveArmstrong
1

Is this your expected output?

> merge(df1, df2, all = TRUE)
  ID weight date High
1  1     38    1   25
2  1     65    2   NA
3  1     69    3   NA
4  2     24    2   NA
5  2     58    1   36
6  2     58    3   NA
7  3     15    2   NA
8  3     35    3   NA
9  3     36    1   75
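One caveat worth checking before relying on merge() for this: with no by argument it joins on every column the two tables share (here ID, weight and date), so it matches a row-bind only when no row of the second table repeats those values from the first. The small sketch below uses made-up tables a and b (hypothetical data, not from the question) to show the difference.

```r
a <- data.frame(ID = c(1, 2), High = c(25, 36),
                weight = c(38, 58), date = c(1, 1))

# Hypothetical second table: its first row repeats ID, weight and date from a
b <- data.frame(ID = c(1, 1), weight = c(38, 69), date = c(1, 3))

# Full outer join on the shared columns: the repeated row is matched, not duplicated
merged <- merge(a, b, all = TRUE)

# Plain stacking keeps both copies: add the missing column, align the order, rbind
b$High <- NA_real_
stacked <- rbind(a, b[names(a)])

nrow(merged)   # 3
nrow(stacked)  # 4
```

With the question's sample data no such repeated row exists, which is why merge() returns all nine rows there.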
ThomasIsCoding