
I am very new to R, and I sincerely appreciate your help.

The following is part of my data:

subjectID  A B C D E F G H I J
S001       1 1 1 1 1 0 0
S002       1 1 1 0 0 0 0 

I want to sum each row across columns A to J, so that the data looks like this:

subjectID A B C D E F G H I J TOTAL
S001      1 1 1 1 1 0 0        5
S002      1 1 1 0 0 0 0        3

Thank you so much! I would like to sum only the cells where variables A to J == 1.

MichaelChirico
Alicia Chang
  • Try to search for `?rowSums`; it does exactly what the name suggests. Or use `apply(your_data, 1, function(x) sum(x[x == 1]))`, or `mutate` from the `dplyr` package. There are several ways to do this, but next time please provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) to help users help you, and **first** of all: try to search. I bet there are several answers to your question. – SabDeM Jul 31 '15 at 01:45
  • What do you mean by "if variable A to J == 1"? In your example, there is no value for H, I and J. –  Jul 31 '15 at 01:46
  • To add to my previous comment: do not forget to pass `na.rm = T` to the `sum` function. But given your output (with no values for H, I, J, most likely empty strings), I bet those columns are character. – SabDeM Jul 31 '15 at 01:54
  • @SabDeM that should be an answer. – Brandon Bertelsen Jul 31 '15 at 02:04
  • @BrandonBertelsen maybe, I am just waiting moderators because most likely this question will be marked as duplicate and furthermore a `dput` of the data just to test my code. – SabDeM Jul 31 '15 at 02:08
  • Many questions are duplicates and that's ok. It's good to have a lot of examples to even marginally different problems. – Brandon Bertelsen Jul 31 '15 at 02:13
  • @BrandonBertelsen I was thinking about how to do that with `dplyr`... maybe I am missing something but this is one of the very few cases in which `dplyr` is not a comfortable tool to work with. – SabDeM Jul 31 '15 at 02:33

2 Answers


As suggested, I am posting my comments as an answer. The first approach uses apply. Here df[-1] excludes the first column (which is not numeric), and x[x == 1] keeps only the elements of x equal to 1, where x is a single row because the second argument of apply is 1.

df$TOTAL <- apply(df[-1], 1, function(x) sum(x[x == 1], na.rm = TRUE))

Another way in base R, easier to write and (I bet) much faster, is:

df$TOTAL <- rowSums(df[-1] == 1, na.rm = TRUE)

Both give this result:

df
  subjectID A B C D E F G  H  I  J TOTAL
1      S001 1 1 1 1 1 0 0 NA NA NA     5
2      S002 1 1 1 0 0 0 0 NA NA NA     3
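
As a quick sanity check, here is a minimal, self-contained sketch that runs both approaches on a small made-up data frame and compares them. The toy data and the names `toy`, `total_apply` and `total_rowsums` are illustrative assumptions, not the asker's real data.

```r
# Hypothetical toy data mirroring the question; H is left as NA on purpose.
toy <- data.frame(
  subjectID = c("S001", "S002"),
  A = c(1L, 1L), B = c(1L, 1L), C = c(1L, 1L),
  D = c(1L, 0L), E = c(1L, 0L), F = c(0L, 0L),
  H = c(NA, NA)
)

# Row-wise sum of the values equal to 1, via apply()
total_apply <- apply(toy[-1], 1, function(x) sum(x[x == 1], na.rm = TRUE))

# Row-wise count of the cells equal to 1, via rowSums() on a logical matrix
total_rowsums <- rowSums(toy[-1] == 1, na.rm = TRUE)

print(total_apply)
print(total_rowsums)
```

For 0/1 data the two agree (5 and 3 here). They would differ only in how you read them: the apply version adds up the values that equal 1, while the rowSums version counts the cells that equal 1.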

Data

df <- structure(list(subjectID = structure(1:2, .Label = c("S001", 
"S002"), class = "factor"), A = c(1L, 1L), B = c(1L, 1L), C = c(1L, 
1L), D = c(1L, 0L), E = c(1L, 0L), F = c(0L, 0L), G = c(0L, 0L
), H = c(NA, NA), I = c(NA, NA), J = c(NA, NA)), .Names = c("subjectID", 
"A", "B", "C", "D", "E", "F", "G", "H", "I", "J"), class = "data.frame", row.names = c(NA, 
-2L))
SabDeM

Another option, similar to the one posted by SabDeM, uses sapply to select and sum only the numeric columns:

df$Total <- rowSums(df[ ,sapply(df, is.numeric)])

Output:

  subjectID A B C D E F G  H  I  J Total
1      S001 1 1 1 1 1 0 0 NA NA NA     5
2      S002 1 1 1 0 0 0 0 NA NA NA     3
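
One caveat, shown as a hedged sketch on made-up data: in the `dput` data above, H, I and J are logical `NA`, so `is.numeric` leaves them out and no `NA` reaches `rowSums`. If such a column were stored as a numeric `NA` instead, it would be selected, and `rowSums` would return `NA` unless `na.rm = TRUE` is passed. The data frame `df2` and the other names below are hypothetical. Note also that this approach sums the values themselves rather than counting cells equal to 1, which is the same thing only for 0/1 data.

```r
# Hypothetical data: H is stored as integer NA rather than logical NA.
df2 <- data.frame(
  subjectID = c("S001", "S002"),
  A = c(1L, 1L), B = c(1L, 0L),
  H = c(NA_integer_, NA_integer_)
)

# TRUE for A, B and H; FALSE for subjectID
num_cols <- sapply(df2, is.numeric)

# Without na.rm, any NA in a row makes that row's total NA
total_default <- rowSums(df2[, num_cols])

# With na.rm = TRUE the NA cells are skipped
total_na_rm <- rowSums(df2[, num_cols], na.rm = TRUE)

print(total_default)
print(total_na_rm)
```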
mpalanco