-4

I have a data.table table in memory. It is made up of many rows with these columns

key, c1.min, c2.min, c3.min, c1.max, c2.max, c3.max, c1.sd, c2.sd, c3.sd

I would like to return a new table of

key, c1, c2, c3

where

c1 = c1.min + c1.max + c1.sd
c2 = c2.min + c2.max + c2.sd
c3 = c3.min + c3.max + c3.sd
ArunK
  • 1,731
  • 16
  • 35
wbm
  • 155
  • 1
  • 2
  • 13
  • 4
    That's ok but where's the problem? – talat Jun 06 '16 at 09:51
  • Also, for `c1` shouldn't it be `...+c1.max+...` ? – Sotos Jun 06 '16 at 09:53
  • Sotos, yes, fixed. @docendodiscimus, I would appreciate the help with the Syntax – wbm Jun 06 '16 at 10:12
  • apologies if this is uber simple. Any kind of pointer would be appreciated – wbm Jun 06 '16 at 10:26
  • 2
    Your downvoted probably because you didn't provide a reproducible example (see [mcve] and [here](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)) and you didn't show any efforts you made. You should be able to just do something like `dt[, list(c1 = c1.min + c1.max + c1.sd, c2 = c2.min + c2.max + c2.sd, c3 = c3.min + c3.max + c3.sd), by = key]` or similar – talat Jun 06 '16 at 10:35
  • 1
    ok - fair-enough, though it is part of a bigger problem I am trying to solve. My problem is the attributes are dynamic depending on circumstances so working towards a solution that works for c1, c2, ..., cn. Looks like I will need some lambda function. I will post solution when I have it and you might give me back my 2 points... – wbm Jun 06 '16 at 10:45
  • Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269). This will make it much easier for others to help you. – zx8754 Jun 06 '16 at 10:56

4 Answers4

1

Try this example:

library(data.table)

#dummy data
myData <- 
  data.table(
    data.frame(
      key = 1:10,
      c1.min = 1:10,
      c2.min = 10:19,
      c3.min = 100:109,
      c1.max = 1:10,
      c2.max = 1:10,
      c3.max = 1:10,
      c1.sd = 1:10,
      c2.sd = 1:10,
      c3.sd = 1:10))

# using basic regex match
cbind(key = myData$key,
      sapply(c("c1", "c2", "c3"),function(i){
        myColnames <- colnames(myData)
        rowSums(myData[, grepl(i, myColnames), with = FALSE])
      }))

# using manual sum
myData[ , list(key,
               c1 = c1.min + c1.max + c1.sd,
               c2 = c2.min + c2.max + c2.sd,
               c3 = c3.min + c3.max + c3.sd) ]
zx8754
  • 52,746
  • 12
  • 114
  • 209
1

Here is a slightly different option with melt. We specify the patterns in the measure argument, convert to 'long' format, then grouped by 'key' and specifying the .SDcols , get the sum of those columns.

melt(myData, measure = patterns("^c1", "^c2", "^c3"),
  value.name = c('c1', 'c2', 'c3'))[, lapply(.SD, sum) , key, .SDcols = c1:c3]
#    key c1 c2  c3
# 1:   1  3 12 102
# 2:   2  6 15 105
# 3:   3  9 18 108
# 4:   4 12 21 111
# 5:   5 15 24 114
# 6:   6 18 27 117
# 7:   7 21 30 120
# 8:   8 24 33 123
# 9:   9 27 36 126
#10:  10 30 39 129
akrun
  • 874,273
  • 37
  • 540
  • 662
1

Another alternative:

library(dplyr)
library(tidyr)

myData %>%
  gather(k, v, -key) %>%
  separate(k, into = c("l", "s")) %>%
  group_by(key, l) %>% 
  summarise(value = sum(v)) %>%
  spread(l, value)

Which gives:

#Source: local data frame [10 x 4]
#Groups: key [10]
#
#     key    c1    c2    c3
#*  <int> <int> <int> <int>
#1      1     3    12   102
#2      2     6    15   105
#3      3     9    18   108
#4      4    12    21   111
#5      5    15    24   114
#6      6    18    27   117
#7      7    21    30   120
#8      8    24    33   123
#9      9    27    36   126
#10    10    30    39   129
Steven Beaupré
  • 21,343
  • 7
  • 57
  • 77
0

You can use base packages

myData$c1 <- apply(myData[ ,c("c1.min","c1.max","c1.sd")] , 1 , sum)

myData$c2 <- apply(myData[ ,c("c2.min","c2.max","c2.sd")] , 1 , sum)

myData$c3 <- apply(myData[ ,c("c3.min","c3.max","c3.sd")] , 1 , sum)

myData <- myData[,c("key","c1","c2","c3")]

print(myData)
   key c1 c2  c3
1    1  3 12 102
2    2  6 15 105
3    3  9 18 108
4    4 12 21 111
5    5 15 24 114
6    6 18 27 117
7    7 21 30 120
8    8 24 33 123
9    9 27 36 126
10  10 30 39 129

Or you can define a function for sum columns

abc <- function(x)apply(x,1,sum)

c1 <- abc(myData[ ,c("c1.min","c1.max","c1.sd")])
c2 <- abc(myData[ ,c("c2.min","c2.max","c2.sd")])
c3 <- abc(myData[ ,c("c3.min","c3.max","c3.sd")])

mydata1 <- as.data.frame(cbind(Key=myData$key,c1,c2,c3)) 

> mydata1
   Key c1 c2  c3
1    1  3 12 102
2    2  6 15 105
3    3  9 18 108
4    4 12 21 111
5    5 15 24 114
6    6 18 27 117
7    7 21 30 120
8    8 24 33 123
9    9 27 36 126
10  10 30 39 129
Arun kumar mahesh
  • 2,289
  • 2
  • 14
  • 22