I have been struggling to get cumsum to work with multiple grouping conditions. In this case I need to compute a running temperature sum over each day of the year (doy) for each combination of year, site, canopy and treatment_abbr. Here is a sample of the simplified data:
site year doy canopy power treatment_abbr airtemp_comb_nearby gdd10
cfc 2009 1 closed 0 h2 -18.153490 0
cfc 2009 2 closed 0 h2 18.153490 8
cfc 2009 3 closed 0 h2 13.153490 3
cfc 2009 1 open 0 ac 12.490 2
cfc 2009 2 open 0 ac 16.912620 6
hwrc 2012 1 closed 0 dc 11.146437 1
hwrc 2012 2 closed 0 dc 2.005500 0
hwrc 2012 3 closed 0 dc 2.5500 0
hwrc 2012 4 closed 0 dc 22.1234 12
hwrc 2012 5 closed 0 dc 2.005500 0
My actual data set is quite large, so I want to use data.table. It seems to me that the following should work. It does create the new column "gddsum10", but it does not compute the running sum. Any idea what I am doing wrong here?
dt[order(doy), gddsum10:=cumsum(gdd10), by=c("year", "doy", "site",
"canopy", "treatment_abbr")]
I am looking for something along the lines of this, with the new column "gddsum10":
site year doy canopy power treatment_abbr airtemp_comb_nearby gdd10 gddsum10
cfc 2009 1 closed 0 h2 -18.153490 0 0
cfc 2009 2 closed 0 h2 18.153490 8 8
cfc 2009 3 closed 0 h2 13.153490 3 11
cfc 2009 1 open 0 ac 12.490 2 2
cfc 2009 2 open 0 ac 16.912620 6 8
hwrc 2012 1 closed 0 dc 11.146437 1 1
hwrc 2012 2 closed 0 dc 2.005500 0 1
hwrc 2012 3 closed 0 dc 2.5500 0 1
hwrc 2012 4 closed 0 dc 22.1234 12 13
hwrc 2012 5 closed 0 dc 2.005500 0 13
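One possible explanation, offered as a sketch rather than a confirmed fix: since "doy" is one of the `by` columns, each group in the sample holds a single row, so `cumsum(gdd10)` would simply return `gdd10`. The snippet below rebuilds the sample by hand (the `power` and `airtemp_comb_nearby` columns are left out for brevity, which is my own simplification), sorts by `doy` within each group, and drops `doy` from `by`.

```r
library(data.table)

# Sample rows from above, rebuilt by hand; power and airtemp_comb_nearby
# are omitted here (an assumption made only to keep the example short).
dt <- data.table(
  site           = c(rep("cfc", 5), rep("hwrc", 5)),
  year           = c(rep(2009L, 5), rep(2012L, 5)),
  doy            = c(1L, 2L, 3L, 1L, 2L, 1L, 2L, 3L, 4L, 5L),
  canopy         = c("closed", "closed", "closed", "open", "open",
                     rep("closed", 5)),
  treatment_abbr = c("h2", "h2", "h2", "ac", "ac", rep("dc", 5)),
  gdd10          = c(0, 8, 3, 2, 6, 1, 0, 0, 12, 0)
)

# Sort by reference so that doy is increasing within each group.
setorder(dt, site, year, canopy, treatment_abbr, doy)

# Group by everything except doy, so each group spans all days of one series.
dt[, gddsum10 := cumsum(gdd10), by = .(site, year, canopy, treatment_abbr)]

print(dt)
```

Note that `setorder` reorders the rows of `dt` in place; if the original row order matters, it would need to be saved in a column first.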