0

I have the below data table in R:

      n       s     b  y
1 a 1Q 1990 Rank1 10
2 a 2Q 1990 Rank1 43
3 b 1Q 1991 Rank2 42
4 b 3Q 2000 Rank3 66
5 a 4Q 1991 Rank1 55
6 c 1Q 2005 rank3 44

I want to sum up the y column based on the values of the rest columns. For example, in column n value "a", in column s value "Rank1". The caveat is that i want to sum all the numbers for each quarter after the initial one and not the others before it for example. for 1Q 1990 i want to add all other numbers for the quarters after 1Q 1990 up to 1Q 2005, for 2Q 1990 i want to add all numbers up to 1Q 2005 but not 1Q 1990.

So the final outcome should be something like this:

n       s     b   y
1 a 1Q 1990 Rank1 108
2 a 2Q 1990 Rank1  98
3 b 1Q 1991 Rank2  42
4 b 3Q 2000 Rank3  66
5 a 4Q 1991 Rank1  55
6 c 1Q 2005 rank3  44

I dont even know how to start building this.

Every bit of help is much appreciated.

Thanks

Maylo
  • 572
  • 5
  • 16
  • https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – jogo May 14 '18 at 13:39

2 Answers2

0

Here is a solution with data.table:

library("data.table")
DT <- fread(
"n       s     b  y
a 1Q.1990 Rank1 10
a 2Q.1990 Rank1 43
b 1Q.1991 Rank2 42
b 3Q.2000 Rank3 66
a 4Q.1991 Rank1 55
c 1Q.2005 rank3 44")
DT[, ysum:=rev(cumsum(rev(y))), by=.(n, b)][]
# > DT[, ysum:=rev(cumsum(rev(y))), by=.(n, b)][]
#    n       s     b  y ysum
# 1: a 1Q.1990 Rank1 10  108
# 2: a 2Q.1990 Rank1 43   98
# 3: b 1Q.1991 Rank2 42   42
# 4: b 3Q.2000 Rank3 66   66
# 5: a 4Q.1991 Rank1 55   55
# 6: c 1Q.2005 rank3 44   44

With base R you can do:

DT$ysum2 <- ave(DT$y, DT$n, DT$b, FUN=function(x) rev(cumsum(rev(x))))
jogo
  • 12,469
  • 11
  • 37
  • 42
-1

You can make use of the package "sqldf" to do SQL like queries on your dataframe. Before you do that, split up your composite column "a" to individual values, so that you can handle them easier.

https://www.rdocumentation.org/packages/sqldf/versions/0.4-11

https://cran.r-project.org/web/packages/sqldf/sqldf.pdf

Sammy
  • 47
  • 7
  • Thanks! Although there are lots of data with quite a few different combinations and setting up different queries for each of them is going to be harsh. Is there another way of doing that? – Maylo May 14 '18 at 13:34
  • @Sammy Please read https://stackoverflow.com/help/how-to-answer What you wrote is more a comment. – jogo May 14 '18 at 13:37