0

I have two dataframes that are .csv files looking like this :

data <- data.frame(ID=c(205,206,207,308,310,400,401,35,36,90), 
               year=c(2015, 2015,2015,2015,2015, 2015, 2015, 2015,2015, 2015),
               ghost=c(0,3,0.5,17.5,37.5,0,87.5,0,0,0),
               pumkin=c(3,3,3,3,0,0,37.5,0,0,15),
               pirate=c(3,3,3,3,0,0,37.5,0,87.5,3))   

and the second one which has different observations of halloween costums for different years :

data1 <- data.frame(ID=c(1,12,13,14,15,16,17), 
                year=c(2017,2017,2017,2017,2017,2017,2017),
                ghost=c(0,3,0.5,17.5,37.5,0,87.5),
                mermaid=c(0.5,0.5,37.5,37.5,0,0.5,37.5))

What I would like to have, is one data frame that takes all the elements from both the dataframes which will have this form :

ID  year  ghost pumkin pirate mermaid 
205 2015  0
206 2015  3
207 2015  0.5
308 2015  17.5
...  ...
1   2017  0
12  2017  3 
13  2017  0.5

The columns that were not present in one of the dataframe will be added and fill with zeros.

Is there any way to do this ? I've tried using merge() and join() and I was exploring the possibility to use dplyr as well to manipulate my dataframes but it feels like there are two issues in my problem and no simple ways to do what i want.

Your help will be much appreciated, Happy halloween !

Pinceloup
  • 11
  • 5

3 Answers3

2

dplyr::bind_rows will handle this

library(dplyr)
bind_rows(data,data1) %>% `[<-`(is.na(.),value=0)
# 1  205 2015   0.0    3.0    3.0     0.0
# 2  206 2015   3.0    3.0    3.0     0.0
# 3  207 2015   0.5    3.0    3.0     0.0
# 4  308 2015  17.5    3.0    3.0     0.0
# 5  310 2015  37.5    0.0    0.0     0.0
# 6  400 2015   0.0    0.0    0.0     0.0
# 7  401 2015  87.5   37.5   37.5     0.0
# 8   35 2015   0.0    0.0    0.0     0.0
# 9   36 2015   0.0    0.0   87.5     0.0
# 10  90 2015   0.0   15.0    3.0     0.0
# 11   1 2017   0.0    0.0    0.0     0.5
# 12  12 2017   3.0    0.0    0.0     0.5
# 13  13 2017   0.5    0.0    0.0    37.5
# 14  14 2017  17.5    0.0    0.0    37.5
# 15  15 2017  37.5    0.0    0.0     0.0
# 16  16 2017   0.0    0.0    0.0     0.5
# 17  17 2017  87.5    0.0    0.0    37.5 
moodymudskipper
  • 46,417
  • 11
  • 121
  • 167
0

to obtain your desired output and based on @Moody_Mudskipper's answer you can do

data2 <- bind_rows(data, data1)
data2[is.na(data2)] <- 0
Mouad_Seridi
  • 2,666
  • 15
  • 27
0

To fill NA's with zeros you can do:

library(dplyr)
bind_rows(data, data1) %>%
  mutate_all(funs(ifelse(is.na(.), 0, .)))

or with data.table:

library(data.table)
rbindlist(list(data, data1), fill = TRUE)[,lapply(.SD, function(x){ifelse(is.na(x), 0, x)})]

Result:

     ID year ghost pumkin pirate mermaid
 1: 205 2015   0.0    3.0    3.0     0.0
 2: 206 2015   3.0    3.0    3.0     0.0
 3: 207 2015   0.5    3.0    3.0     0.0
 4: 308 2015  17.5    3.0    3.0     0.0
 5: 310 2015  37.5    0.0    0.0     0.0
 6: 400 2015   0.0    0.0    0.0     0.0
 7: 401 2015  87.5   37.5   37.5     0.0
 8:  35 2015   0.0    0.0    0.0     0.0
 9:  36 2015   0.0    0.0   87.5     0.0
10:  90 2015   0.0   15.0    3.0     0.0
11:   1 2017   0.0    0.0    0.0     0.5
12:  12 2017   3.0    0.0    0.0     0.5
13:  13 2017   0.5    0.0    0.0    37.5
14:  14 2017  17.5    0.0    0.0    37.5
15:  15 2017  37.5    0.0    0.0     0.0
16:  16 2017   0.0    0.0    0.0     0.5
17:  17 2017  87.5    0.0    0.0    37.5
acylam
  • 18,231
  • 5
  • 36
  • 45