Calculating the mean of every replication

Question

I have the following code:

set.seed(30)
nsim <- 50    ## NUMBER OF REPLICATIONS
demand <- c(12,13,24,12,13,12,14,10,11,10)

res <- replicate(nsim, {
    load <- runif(10,11,14)
    diff <- load - demand    ## DIFFERENCE BETWEEN DEMAND AND LOAD 
    return(sum(diff < 0))
})
res
[1] 6 5 7 4 4 5 4 3 6 4 5 5 5 4 2 5 3 3 3 5 3 2 4 6 5 4 4 3 5 6 4 4 3 6 5 3 5 5 4 3 3
[42] 6 4 4 4 6 6 5 4 5

I have a huge data set, and the question is: what is the fastest way of calculating the mean after every replication? For example, res in the first replication is 6, so the result should be 6/1 = 6; for the second, (6+5)/2 = 5.5; for the third, (6+5+7)/3 = 6; and for the last replication, sum(res)/nsim = 4.38.

Unless you're facing memory constraints, generate all your data at once and stick it in a matrix or data.frame, e.g. `sapply(seq(nsim), function(x){runif(10,11,14)})` or `matrix(runif(10 * nsim, 11, 14), nrow = nsim)`. Then apply your other steps in a vectorized fashion. — alistaire, Feb 09 '17 at 20:22
For me it is not clear how can I apply these procedures to my data frame and plot the LOLE divided by the number of iterations — kelamahim, Feb 09 '17 at 20:59
Possible duplicate of http://stackoverflow.com/questions/21982987/mean-per-group-in-a-data-frame — akrun, Feb 14 '17 at 19:38

score 2 · Answer 1 · answered Feb 09 '17 at 21:11

To illustrate my comment, you can generate a matrix where columns (or rows, if you prefer) represent replications, after which you can use R's matrix operations capabilities:

set.seed(47)    # make reproducible

nsim <- 50    ## NUMBER OF REPLICATIONS
demand <- c(12,13,24,12,13,12,14,10,11,10)

loads <- matrix(runif(10 * nsim, 11, 14), ncol = nsim)

diffs <- loads - demand    # with vector recycling
# or: diffs <- apply(loads, 2, `-`, demand)    
# or: diffs <- apply(loads, 2, function(x){x - demand})

res <- colSums(diffs < 0)    # count shortfalls (load below demand), as in the question
LOLE <- sum(res) / nsim

LOLE
#> [1] 4.3
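
If a running figure after each replication is wanted rather than one overall number, the matrix approach extends naturally. The following is only a sketch under assumptions: `nsim = 10000` is an arbitrary larger run chosen to make convergence visible, `running_lole` and `expected` are illustrative names, and shortfalls are counted as load below demand, as in the question. The reference value comes from `punif`, since each load is drawn from a uniform distribution on [11, 14].

```r
set.seed(47)    # arbitrary seed, only for reproducibility

nsim <- 10000   # assumed larger run, not from the question
demand <- c(12,13,24,12,13,12,14,10,11,10)

# one column per replication
loads <- matrix(runif(length(demand) * nsim, 11, 14), ncol = nsim)

# shortfalls per replication: elements where load is below demand
# (demand is recycled down each column)
res <- colSums(loads < demand)

# running mean after 1, 2, ..., nsim replications
running_lole <- cumsum(res) / seq_len(nsim)

# expected shortfalls per replication for Uniform(11, 14) loads
expected <- sum(punif(demand, min = 11, max = 14))

running_lole[nsim]    # same value as sum(res) / nsim
expected              # 13/3, about 4.33
```

The last element of `running_lole` is the overall mean, and the earlier elements show how the estimate settles toward `expected` as replications accumulate.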

alistaire

42,459
4
77
117

Ok it is clear but how to save the results of every iteration? For 50 simulations I want to have 50 LOLE results. As I increase the number of iterations the LOLE should converge – kelamahim Feb 09 '17 at 21:20
The `sum` in defining `LOLE` collapses all your simulations into one number. This code just replicates what you have above, saving the intermediate products. If that's not what you want, define what you want and rework your code accordingly. – alistaire Feb 09 '17 at 21:36
Apologies for the unclear instructions. I want the results for every iteration: for 50 simulations, 50 LOLE results, not just one result for 50 simulations. I just don't know how to use replicate, or any better approach, to achieve this. – kelamahim Feb 09 '17 at 21:39
So 50 replications of 50 simulations each (i.e. 2500 simulated elements)? You should edit your code; otherwise I'm just guessing from your description, as I have no idea what any of your variables represent. – alistaire Feb 09 '17 at 21:42
No, when I set nsim to 50 I want to have 50 LOLE results, not just one. Basically, for the first replication I want sum(res)/1; for the second, the sum(res) of the first two replications divided by 2; for the third, the sum of all three replications divided by 3. I can do it manually by changing nsim from 1 to 50 to get 50 LOLE results. – kelamahim Feb 10 '17 at 05:30
That's not what your code does. Edit it to clarify. You can accomplish it with `Reduce` with `accumulate = TRUE`, though. – alistaire Feb 10 '17 at 05:32

Uwe · Accepted Answer · 2017-02-15T20:56:32.923

In the edited version of the question (edit of Feb 11 at 5:53), the OP has specified the expected result. This indicates that the OP might be looking for a cumulative mean of the result vector res:

cumsum(res)/seq_along(res)
# [1] 6.000000 5.500000 6.000000 5.500000 5.200000 5.166667 5.000000 4.750000 4.888889
#[10] 4.800000 4.818182 4.833333 4.846154 4.785714 4.600000 4.625000 4.529412 4.444444
#[19] 4.368421 4.400000 4.333333 4.227273 4.217391 4.291667 4.320000 4.307692 4.296296
#[28] 4.250000 4.275862 4.333333 4.322581 4.312500 4.272727 4.323529 4.342857 4.305556
#[37] 4.324324 4.342105 4.333333 4.300000 4.268293 4.309524 4.302326 4.295455 4.288889
#[46] 4.326087 4.361702 4.375000 4.367347 4.380000

Alternatively, dplyr::cummean(res) can be used.
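
As a small check in base R only, the sketch below compares the `cumsum` form with the `Reduce(..., accumulate = TRUE)` route mentioned in the comments on the other answer. It uses just the first five values of res from the question, and `running_cumsum` and `running_reduce` are illustrative names, not part of the original code.

```r
res <- c(6, 5, 7, 4, 4)    # first five values of res from the question

# running mean via cumulative sum
running_cumsum <- cumsum(res) / seq_along(res)

# the same running mean via Reduce, keeping every intermediate sum
running_reduce <- Reduce(`+`, res, accumulate = TRUE) / seq_along(res)

all.equal(running_cumsum, running_reduce)
# [1] TRUE

running_cumsum
# [1] 6.0 5.5 6.0 5.5 5.2
```

Both give the same values; `cumsum` is vectorized, so it is likely the better fit for a huge res.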
