merge/join two dataframe with different number of columns and rows in R

Question

I have two dataframes that are .csv files looking like this :

data <- data.frame(ID=c(205,206,207,308,310,400,401,35,36,90), 
               year=c(2015, 2015,2015,2015,2015, 2015, 2015, 2015,2015, 2015),
               ghost=c(0,3,0.5,17.5,37.5,0,87.5,0,0,0),
               pumkin=c(3,3,3,3,0,0,37.5,0,0,15),
               pirate=c(3,3,3,3,0,0,37.5,0,87.5,3))

and the second one which has different observations of halloween costums for different years :

data1 <- data.frame(ID=c(1,12,13,14,15,16,17), 
                year=c(2017,2017,2017,2017,2017,2017,2017),
                ghost=c(0,3,0.5,17.5,37.5,0,87.5),
                mermaid=c(0.5,0.5,37.5,37.5,0,0.5,37.5))

What I would like to have, is one data frame that takes all the elements from both the dataframes which will have this form :

ID  year  ghost pumkin pirate mermaid 
205 2015  0
206 2015  3
207 2015  0.5
308 2015  17.5
...  ...
1   2017  0
12  2017  3 
13  2017  0.5

The columns that were not present in one of the dataframe will be added and fill with zeros.

Is there any way to do this ? I've tried using merge() and join() and I was exploring the possibility to use dplyr as well to manipulate my dataframes but it feels like there are two issues in my problem and no simple ways to do what i want.

Your help will be much appreciated, Happy halloween !

moodymudskipper · Answer 1 · 2017-11-02T08:17:17.563

dplyr::bind_rows will handle this

library(dplyr)
bind_rows(data,data1) %>% `[<-`(is.na(.),value=0)
# 1  205 2015   0.0    3.0    3.0     0.0
# 2  206 2015   3.0    3.0    3.0     0.0
# 3  207 2015   0.5    3.0    3.0     0.0
# 4  308 2015  17.5    3.0    3.0     0.0
# 5  310 2015  37.5    0.0    0.0     0.0
# 6  400 2015   0.0    0.0    0.0     0.0
# 7  401 2015  87.5   37.5   37.5     0.0
# 8   35 2015   0.0    0.0    0.0     0.0
# 9   36 2015   0.0    0.0   87.5     0.0
# 10  90 2015   0.0   15.0    3.0     0.0
# 11   1 2017   0.0    0.0    0.0     0.5
# 12  12 2017   3.0    0.0    0.0     0.5
# 13  13 2017   0.5    0.0    0.0    37.5
# 14  14 2017  17.5    0.0    0.0    37.5
# 15  15 2017  37.5    0.0    0.0     0.0
# 16  16 2017   0.0    0.0    0.0     0.5
# 17  17 2017  87.5    0.0    0.0    37.5

so fast, so simple, thank you sir ! – Pinceloup Nov 01 '17 at 14:26 — Pinceloup, Nov 01 '17 at 14:26
edited to deal with NAs – moodymudskipper Nov 02 '17 at 08:17 — moodymudskipper, Nov 02 '17 at 08:17

score 0 · Answer 2 · answered Nov 01 '17 at 14:28

0

to obtain your desired output and based on @Moody_Mudskipper's answer you can do

data2 <- bind_rows(data, data1)
data2[is.na(data2)] <- 0

answered Nov 01 '17 at 14:28

Mouad_Seridi

2,666
15
27

score 0 · Answer 3 · answered Nov 01 '17 at 14:34

To fill NA's with zeros you can do:

library(dplyr)
bind_rows(data, data1) %>%
  mutate_all(funs(ifelse(is.na(.), 0, .)))

or with data.table:

library(data.table)
rbindlist(list(data, data1), fill = TRUE)[,lapply(.SD, function(x){ifelse(is.na(x), 0, x)})]

Result:

     ID year ghost pumkin pirate mermaid
 1: 205 2015   0.0    3.0    3.0     0.0
 2: 206 2015   3.0    3.0    3.0     0.0
 3: 207 2015   0.5    3.0    3.0     0.0
 4: 308 2015  17.5    3.0    3.0     0.0
 5: 310 2015  37.5    0.0    0.0     0.0
 6: 400 2015   0.0    0.0    0.0     0.0
 7: 401 2015  87.5   37.5   37.5     0.0
 8:  35 2015   0.0    0.0    0.0     0.0
 9:  36 2015   0.0    0.0   87.5     0.0
10:  90 2015   0.0   15.0    3.0     0.0
11:   1 2017   0.0    0.0    0.0     0.5
12:  12 2017   3.0    0.0    0.0     0.5
13:  13 2017   0.5    0.0    0.0    37.5
14:  14 2017  17.5    0.0    0.0    37.5
15:  15 2017  37.5    0.0    0.0     0.0
16:  16 2017   0.0    0.0    0.0     0.5
17:  17 2017  87.5    0.0    0.0    37.5

merge/join two dataframe with different number of columns and rows in R

3 Answers3