
I'm new to R, but I am trying to use it to aggregate losses observed from a severity distribution by an observation from a frequency distribution - essentially what rcompound does. However, I need a more granular approach, as I need to manipulate the severity distribution before 'aggregation'.

Let's take an example. Suppose you have:

rpois(10,lambda=3)

This gives you something like:

[1] 2 2 3 5 2 5 6 4 3 1

Additionally, suppose the severity of losses is determined by:

rgamma(20,shape=1,scale=10000)

So that we also have the following output:

 [1]   233.0257   849.5771  7760.4402   731.5646  8982.7640 24172.2369 30824.8424 22622.8826 27646.5168  1638.2333  6770.9010  2459.3722   782.0580 16956.1417  1145.4368  5029.0473  3485.6412  4668.1921  5637.8359 18672.0568

My question is: what is an efficient way to get R to take each Poisson observation in turn and then aggregate losses from my severity distribution? For example, the first Poisson observation is 2. Therefore, adding two observations (the first two) from my Gamma distribution gives 1082.61.
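To make the arithmetic concrete, using the first two sampled Gamma values above:

sum(c(233.0257, 849.5771))
[1] 1082.603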

I say this needs to be 'efficient' (in terms of run time) because:

- The Poisson parameter may become significantly large, i.e. up to 1000 or so.
- The number of realisations is likely to be up to 1,000,000, i.e. up to a million Poisson and Gamma observations to sort through.

Any help would be greatly appreciated.

Thanks, Dave.

  • What would be the desired output for the second Poisson entry 2? Would it also be 1082.61? Or would it be the sum of the next two entries in the gamma series, 7760.4402+731.5646=8492.0048? – user2474226 Nov 19 '19 at 16:10
  • Apologies, that is not clear. It would be the second of your suggestions, i.e. 8492.00. For completeness, this would make the third observation of the Poisson dist, "3", equal to the sum of "8982.7640 24172.2369 30824.8424". Any ideas? –  Nov 19 '19 at 16:32

1 Answer


It looks like you want to split the gamma vector at positions given by the cumulative sums of the Poisson vector.

The following function (from here) does the splitting:

# start a new piece of x at each index listed in pos
splitAt <- function(x, pos) unname(split(x, cumsum(seq_along(x) %in% pos)))

pois <- c(2, 2, 3, 5, 2, 5, 6, 4, 3, 1)
gam <- c(233.0257, 849.5771, 7760.4402, 731.5646, 8982.7640, 24172.2369, 30824.8424, 22622.8826, 27646.5168, 1638.2333, 6770.9010, 2459.3722, 782.0580, 16956.1417, 1145.4368, 5029.0473, 3485.6412, 4668.1921, 5637.8359, 18672.0568)
posits <- cumsum(pois)  # running totals: each entry marks where a group of gammas ends

Then do the following:

sapply(splitAt(gam, posits + 1), sum)
[1]  1082.603  8492.005 63979.843 61137.906 17738.200 19966.153 18672.057

According to the post I linked to above, the splitAt() function slows down for large arrays, so you could (if necessary) consider the alternatives proposed in that post. For my part, I generated 1e6 poissons and 1e6 gammas, and the above function ran in 0.78 sec on my machine.
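For example, one vectorised alternative (a sketch of my own, not necessarily one of the approaches from that post) is to skip the splitting step entirely: build a group index with rep() and take grouped sums with rowsum(). Here the gammas are drawn to match the total claim count, and Poisson draws of zero simply contribute no rows, so they are left at 0:

pois <- rpois(1e6, lambda = 3)
gam  <- rgamma(sum(pois), shape = 1, scale = 10000)

grp  <- rep(seq_along(pois), times = pois)  # the i-th Poisson draw owns the next pois[i] gammas
sums <- rowsum(gam, grp)                    # grouped sums, one row per non-zero Poisson draw

agg <- numeric(length(pois))                # draws with a count of zero stay at 0
agg[as.integer(rownames(sums))] <- sums

rowsum() does the grouped summation without materialising a list of split pieces.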

user2474226
  • This is exactly what I'm after. Thank you for sharing. Unfortunately, however, this function 'ignores' zero values generated by the Poisson distribution. Can you think of any way this may be added in? In other words, I'd like the sapply function to be able to return zero values where the Poisson distribution gives zero values. Hope this is clear! –  Nov 19 '19 at 21:07
  • Hmm, I'm not sure that the split method can be directly modified to take care of zero locations. You could create a vector of zero locations: `zero_ind <- which(pois==0)`, then strip the zeroes from `pois` (`pois <- pois[pois != 0]`), run the above code to obtain the relevant gammas, and finally insert 0s into the gamma output at the locations specified by the `zero_ind` vector. [See here](https://stackoverflow.com/questions/1493969/how-to-insert-elements-into-a-vector), for example. – user2474226 Nov 20 '19 at 14:35
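For what it's worth, here is a minimal sketch of that zero-handling approach (the small pois/gam vectors are made up for illustration, and splitAt() is the function defined in the answer above):

pois <- c(2, 0, 3, 0, 1)                 # frequencies, including zeros
gam  <- c(100, 200, 300, 400, 500, 600)  # sum(pois) severities

zero_ind <- which(pois == 0)             # remember where the zeros sit
pois_nz  <- pois[pois != 0]              # frequencies with the zero counts stripped out

agg <- sapply(splitAt(gam, cumsum(pois_nz) + 1), sum)

out <- numeric(length(pois))             # starts as all zeros
out[-zero_ind] <- agg                    # fill in the non-zero positions (assumes at least one zero)
out
[1]  300    0 1200    0  600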