4

I am trying to add two columns. My dataframe is like this one:

data <- data.frame(a = c(0,1,NA,0,NA,NA),
                   x = c(NA,NA,NA,NA,1,0),
                   t = c(NA,2,NA,NA,2,0))

I want to add some of the columns like this:

yep  <- cbind.data.frame( data$a, data$x, rowSums(data[,c(1, 2)], na.rm = TRUE))

However the output looks like this:

> yep

      data$a  data$x   rowSums(data[,c(1, 2)], na.rm = TRUE)
  1        0      NA                                      0
  2        1      NA                                      1
  3       NA      NA                                      0
  4        0      NA                                      0
  5       NA       1                                      1
  6       NA       0                                      0

And I would like an oputput like this:

> yep

      data$a  data$x   rowSums(data[,c(1, 2)], na.rm = TRUE)
  1        0      NA                                      0
  2        1      NA                                      1
  3       NA      NA                                      NA
  4        0      NA                                      0
  5       NA       1                                      1
  6       NA       0                                      0

If the columns contain only NA values I want to leave the NA values.

How I could achive this?

Jaap
  • 81,064
  • 34
  • 182
  • 193
Eric González
  • 465
  • 2
  • 10

5 Answers5

4

Base R:

data <- data.frame("a" = c(0,1,NA,0,NA,NA),
                   "x" = c(NA,NA,NA,NA,1,0),
                   "t" = c(NA,2,NA,NA,2,0)
)

yep <- cbind.data.frame( data$a, data$x, rs = rowSums(data[,c(1, 2)], na.rm = TRUE))
yep$rs[is.na(data$a) & is.na(data$x)] <- NA
yep
r.user.05apr
  • 5,356
  • 3
  • 22
  • 39
3

You might try dplyr::coalesce

cbind.data.frame( data$a, data$x, dplyr::coalesce(data$a, data$x))
#  data$a data$x dplyr::coalesce(data$a, data$x)
#1      0     NA                               0
#2      1     NA                               1
#3     NA     NA                              NA
#4      0     NA                               0
#5     NA      1                               1
#6     NA      0                               0
markus
  • 25,843
  • 5
  • 39
  • 58
  • 1
    This works fine with the example posted, but it doesn't get the `sum`. If there's a row with `a = x = 1` it will return `1` instead of `2`. – AntoniosK Oct 17 '18 at 11:12
  • 1
    @AntoniosK You are right of course. Given the example data, I though `coalesce` will be "enough" for OP's needs. – markus Oct 17 '18 at 11:31
3

Base R (ifelse):

cbind(data$a,data$x,ifelse(is.na(data$a) & is.na(data$x),NA,rowSums(data[,1:2],na.rm = TRUE)))

If you are looking for the column name then replace cbind with cbind.data.frame

Output:

      [,1] [,2] [,3]
[1,]    0   NA    0
[2,]    1   NA    1
[3,]   NA   NA   NA
[4,]    0   NA    0
[5,]   NA    1    1
[6,]   NA    0    0
Saurabh Chauhan
  • 3,161
  • 2
  • 19
  • 46
  • 1
    All the answers are working. This one, also works when **a = x = 1**, this answer returns 2 instead 1. The answer of markus works perfect for my example and if some one want to use the value from the first colum (in cases where are no missing data in both columns). – Eric González Oct 17 '18 at 11:47
0

base r ifelse

  data[['rowsum']]<-ifelse(is.na(data$a) & is.na(data$x),NA,ifelse(is.na(data$a),0,data$a)+ifelse(is.na(data$x),0,data$x))

     a  x  t rowsum
 1:  0 NA NA      0
 2:  1 NA  2      1
 3: NA NA NA     NA
 4:  0 NA NA      0
 5: NA  1  2      1
 6: NA  0  0      0
rahul
  • 561
  • 1
  • 5
  • 13
0

Another base R approach.

If all the values in the rows are NA then return NA or else return sum of the row ignoring NA's.

#Select only the columns which we need
sub_df <- data[c("a", "x")]

sub_df$answer <- ifelse(rowSums(is.na(sub_df)) == ncol(sub_df), NA, 
                    rowSums(sub_df, na.rm = TRUE))

sub_df
#   a  x answer
#1  0 NA      0
#2  1 NA      1
#3 NA NA     NA
#4  0 NA      0
#5 NA  1      1
#6 NA  0      0
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213