
I want to sum multiple columns of the data frames in a list and show only the sum, without the input columns used in the calculation. Here is an example:

ls <- list(data.frame(a=1, b=5, c=3, d=2), data.frame(a=NA, b=2, c=7, d=9))

ls
[[1]]
  a b c d
1 1 5 3 2

[[2]]
   a b c d
1 NA 2 7 9

My expected result is:

ls2
[[1]]
  c new
1 3   8

[[2]]
  c new
1 7  11

Any ideas how to do this? So far I have tried to adapt this answer to lists, without success and without omitting the input columns (a, b, d). I tried lapply:

lapply(ls, function(x) x$e <- rowSums(x[,c("a", "b", "d")], na.rm=T)) 
and 
ls$e <- lapply(ls, function(x) rowSums(x[,c("a", "b", "d")], na.rm=T)) 

Thank you in advance

Edit: Thanks Aeck and Abdou for your answers, which work fine with this example. However, I have >200 columns. Do you know a way that avoids writing out the columns that will remain? For example, deleting the columns that I use for the calculation, instead of naming all the columns to keep.

EDIT 2: Thanks for your improved code, it works well with the example data. However, it does not work with my real data set. I get the following error:

Error in rowSums(x[, columns_to_sum], na.rm = T) : 
 'x' must be an array of at least two dimensions

My list has about 96 matrices, each with 200 columns and one row. But I don't know how to prepare a reproducible example of my error. Any ideas?

N.Varela
  • Your other question is an exact dupe of this one. You can't post duplicate questions on SO. If this question is not good enough, then you should edit it, provide a minimal reproducible example and explain exactly what you are looking for. – David Arenburg Sep 25 '16 at 19:30

2 Answers


You should not name your list ls, because ls is a function.

lapply(myList, function(x) data.frame(c=x$c, new = rowSums(x[,c("a", "b", "d")], na.rm=T))) 
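As a self-contained sketch of the call above: the sample data is copied from the question, and the names myList and result are assumptions made for illustration.

```r
# Sample data from the question, stored under a name that does not
# shadow the base function ls()
myList <- list(data.frame(a = 1, b = 5, c = 3, d = 2),
               data.frame(a = NA, b = 2, c = 7, d = 9))

# Build a new data frame per list element: keep c, add the row sum of a, b, d
result <- lapply(myList, function(x) {
  data.frame(c = x$c,
             new = rowSums(x[, c("a", "b", "d")], na.rm = TRUE))
})

result
```

With this data, each element of result has only the columns c and new, and the NA in the second data frame is ignored because of na.rm = TRUE.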

Here is a solution where you specify the dropped columns only (after edit):

dropped <- c("a", "b", "d")
lapply(myList, function(x) {
  x$new <- rowSums(x[,dropped], na.rm=T)
  x[!names(x) %in% dropped]
  }) 
Aeck
  • If we're being pedantic, you shouldn't name your vector `drop` because `drop` is a function. I agree that `ls` is a more common function and more likely to cause confusion, but all the same... – Gregor Thomas Sep 09 '16 at 22:24
  • Instead of `x$new <- rowSums(x[,dropped], na.rm=T)` use `x$new <- ifelse(length((dropped)) > 1, rowSums(x[,dropped], na.rm=T), x[,dropped])` – Aeck Sep 25 '16 at 17:18
  • Thanks.. see my new question on matrices: http://stackoverflow.com/questions/39690633/r-how-to-sum-multiple-columns-of-matrices-in-a-list – N.Varela Sep 25 '16 at 18:59
  • However it is only used together with the data.frame `df` and does not occur alone (apart from that it obviously is the desired output). My suggested naming rule is admittedly more important when it comes to function definitions. – Aeck Sep 25 '16 at 20:00

Try:

lapply(ls, function(x) {
    x$new <- rowSums(x[,c("a", "b", "d")], na.rm=T)
    return(x[,c("c","new")])
})

Edit:

You can put the columns you wish to use rowSums on into a variable as follows:

lapply(ls, function(x) {
    columns_to_sum <- c("a", "b", "d")
    x$new <- rowSums(x[,columns_to_sum], na.rm=T)
    return(x[,!colnames(x) %in% columns_to_sum])
})

Here columns_to_sum is the variable that saves the names of the columns you wish to apply rowSums on.
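If the list elements are one-row matrices rather than data frames, one likely cause of the "'x' must be an array of at least two dimensions" error is that `[` drops a one-row (or one-column) selection to a plain vector by default. A minimal sketch, assuming named columns; the data and the names mats and res are hypothetical:

```r
# Hypothetical data: two one-row matrices with named columns
cols <- c("a", "b", "c", "d")
mats <- list(matrix(c(1, 5, 3, 2),  nrow = 1, dimnames = list(NULL, cols)),
             matrix(c(NA, 2, 7, 9), nrow = 1, dimnames = list(NULL, cols)))

columns_to_sum <- c("a", "b", "d")

res <- lapply(mats, function(x) {
  # drop = FALSE keeps the selection two-dimensional, so rowSums() accepts it
  new <- rowSums(x[, columns_to_sum, drop = FALSE], na.rm = TRUE)
  # matrices have no `$<-`, so attach the new column with cbind()
  keep <- !colnames(x) %in% columns_to_sum
  cbind(x[, keep, drop = FALSE], new = new)
})

res
```

The same drop = FALSE argument also keeps the selection two-dimensional when columns_to_sum holds a single column name.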

I hope this helps.

Abdou
  • @N.Varela `columns_to_sum` cannot be one column. It has to contain more than one column, otherwise the `rowSums` function won't work. – Abdou Sep 25 '16 at 16:54
  • I have >200 columns but only 1 row. Is this a problem for `rowSums`? – N.Varela Sep 25 '16 at 18:24
  • Thanks.. see my new question on matrices: http://stackoverflow.com/questions/39690633/r-how-to-sum-multiple-columns-of-matrices-in-a-list – N.Varela Sep 25 '16 at 18:59