Merge two columns maintaning missing values

Question

I am trying to add two columns. My dataframe is like this one:

data <- data.frame(a = c(0,1,NA,0,NA,NA),
                   x = c(NA,NA,NA,NA,1,0),
                   t = c(NA,2,NA,NA,2,0))

I want to add some of the columns like this:

yep  <- cbind.data.frame( data$a, data$x, rowSums(data[,c(1, 2)], na.rm = TRUE))

However the output looks like this:

> yep

      data$a  data$x   rowSums(data[,c(1, 2)], na.rm = TRUE)
  1        0      NA                                      0
  2        1      NA                                      1
  3       NA      NA                                      0
  4        0      NA                                      0
  5       NA       1                                      1
  6       NA       0                                      0

And I would like an oputput like this:

> yep

      data$a  data$x   rowSums(data[,c(1, 2)], na.rm = TRUE)
  1        0      NA                                      0
  2        1      NA                                      1
  3       NA      NA                                      NA
  4        0      NA                                      0
  5       NA       1                                      1
  6       NA       0                                      0

If the columns contain only NA values I want to leave the NA values.

How I could achive this?

score 4 · Answer 1 · answered Oct 17 '18 at 11:09

Base R:

data <- data.frame("a" = c(0,1,NA,0,NA,NA),
                   "x" = c(NA,NA,NA,NA,1,0),
                   "t" = c(NA,2,NA,NA,2,0)
)

yep <- cbind.data.frame( data$a, data$x, rs = rowSums(data[,c(1, 2)], na.rm = TRUE))
yep$rs[is.na(data$a) & is.na(data$x)] <- NA
yep

score 3 · Answer 2 · answered Oct 17 '18 at 11:07

3

You might try dplyr::coalesce

cbind.data.frame( data$a, data$x, dplyr::coalesce(data$a, data$x))
#  data$a data$x dplyr::coalesce(data$a, data$x)
#1      0     NA                               0
#2      1     NA                               1
#3     NA     NA                              NA
#4      0     NA                               0
#5     NA      1                               1
#6     NA      0                               0

answered Oct 17 '18 at 11:07

markus

25,843
5
39
58

1

This works fine with the example posted, but it doesn't get the `sum`. If there's a row with `a = x = 1` it will return `1` instead of `2`. – AntoniosK Oct 17 '18 at 11:12
1

@AntoniosK You are right of course. Given the example data, I though `coalesce` will be "enough" for OP's needs. – markus Oct 17 '18 at 11:31

score 3 · Accepted Answer · answered Oct 17 '18 at 11:17

3

Base R (ifelse):

cbind(data$a,data$x,ifelse(is.na(data$a) & is.na(data$x),NA,rowSums(data[,1:2],na.rm = TRUE)))

If you are looking for the column name then replace cbind with cbind.data.frame

Output:

      [,1] [,2] [,3]
[1,]    0   NA    0
[2,]    1   NA    1
[3,]   NA   NA   NA
[4,]    0   NA    0
[5,]   NA    1    1
[6,]   NA    0    0

answered Oct 17 '18 at 11:17

Saurabh Chauhan

3,161
2
19
46

1

All the answers are working. This one, also works when **a = x = 1**, this answer returns 2 instead 1. The answer of markus works perfect for my example and if some one want to use the value from the first colum (in cases where are no missing data in both columns). – Eric González Oct 17 '18 at 11:47

score 0 · Answer 4 · answered Oct 17 '18 at 11:15

base r ifelse

  data[['rowsum']]<-ifelse(is.na(data$a) & is.na(data$x),NA,ifelse(is.na(data$a),0,data$a)+ifelse(is.na(data$x),0,data$x))

     a  x  t rowsum
 1:  0 NA NA      0
 2:  1 NA  2      1
 3: NA NA NA     NA
 4:  0 NA NA      0
 5: NA  1  2      1
 6: NA  0  0      0

score 0 · Answer 5 · answered Oct 17 '18 at 11:48

Another base R approach.

If all the values in the rows are NA then return NA or else return sum of the row ignoring NA's.

#Select only the columns which we need
sub_df <- data[c("a", "x")]

sub_df$answer <- ifelse(rowSums(is.na(sub_df)) == ncol(sub_df), NA, 
                    rowSums(sub_df, na.rm = TRUE))

sub_df
#   a  x answer
#1  0 NA      0
#2  1 NA      1
#3 NA NA     NA
#4  0 NA      0
#5 NA  1      1
#6 NA  0      0

Merge two columns maintaning missing values

5 Answers5