2

I have a data frame like the following:

> example
  name X1.8 X1.8.1 X1.8.2
1    a    1      1      7
2    b   33      0      2
3    c    3     10     -1
4    a   -1     -1      4
5    d    5      8      5
6    e    7      6     12
7    a   -1      7      7
8    c    5     20      9

and I want to collapse(sum) the row with the same name (column 1) but ignore the value -1 while collapsing (summing). For example, the example above would become:

> example    # the goal
  name X1.8 X1.8.1 X1.8.2
1    a    1      8      18
2    b   33      0       2
3    c    8     30       9
4    d    5      8       5
5    e    7      6      12

> dput(example)
structure(list(name = structure(c(1L, 2L, 3L, 1L, 4L, 5L, 1L, 
3L), .Label = c("a", "b", "c", "d", "e", "f"), class = "factor"), 
    X1.8 = c(1, 33, 3, -1, 5, 7, -1, 5), X1.8.1 = c(1, 0, 10, 
    -1, 8, 6, 7, 20), X1.8.2 = c(7, 2, -1, 4, 5, 12, 7, 9)), row.names = c(NA, 
8L), class = "data.frame")

Edit for question: will this work if there are some rows with -1? For example,

  > example
     name X1.8 X1.8.1 X1.8.2
   1    a    1      1      7
   2    b   33      0      2
   3    c    3     10     -1
   4    a   -1     -1      4
   5    d    5      8      5
   6    e    7      6     12
   7    a   -1      7      7
   8    c    5     20      9
   9    f   -1     -1     -1
Math Avengers
  • 762
  • 4
  • 15

2 Answers2

1

You can remove -1 and sum rest of the values.

Using base R :

aggregate(.~name, example, function(x) sum(x[x!=-1]))

#  name X1.8 X1.8.1 X1.8.2
#1    a    1      8     18
#2    b   33      0      2
#3    c    8     30      9
#4    d    5      8      5
#5    e    7      6     12

In dplyr :

library(dplyr)

example %>%
  group_by(name) %>%
  summarise(across(everything(), ~sum(.[. != -1])))

and data.table :

library(data.table)
setDT(example)[, lapply(.SD, function(x) sum(x[x!=-1])), name]
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
  • hi, will this work if I have a row with all `-1` ? I added some more details in the question. – Math Avengers Jul 07 '20 at 20:42
  • 1
    @MathAvengers You can include .`drop = FALSE` in dplyr answer like `group_by(name, .drop = FALSE) %>%` which will return all values as 0 for `f` – Ronak Shah Jul 07 '20 at 22:46
0

As you are calculating sums you can set the -1 you want to ignore to 0 and use rowsum to get the sum per group.

x[x==-1] <- 0
rowsum(x[-1], x[,1])
#  X1.8 X1.8.1 X1.8.2
#a    1      8     18
#b   33      0      2
#c    8     30      9
#d    5      8      5
#e    7      6     12

another option is to set -1 to NA

x[x==-1] <- NA
rowsum(x[-1], x[,1], na.rm = TRUE)
GKi
  • 37,245
  • 2
  • 26
  • 48