Hey guys, I know there is a lot out there on simulations, but I haven't found exactly what I need. I have a vector of view counts, i.e. the number of views on a video, in millions.

totalBeforeViews = (c( 1.19,2.29,2.05,1.96,2.07,1.77,1.50,1.77,4.49,9.76,6.55,5.17,6.56,10.31))

I want to run, say, 1000 replicated simulations of this data, so I am looking for a function that generates random values in a way that is informed by the vector above. I was thinking of doing this:

sdViewsBefore = sd(totalBeforeViews)
simulatedBeforeViews = rnorm(n = 1000, mean = totalBeforeViews, sd = sdViewsBefore)

However, this returns negative values, which I cannot use since a video can't have negative views. The end goal is to run 1000 t-tests on this data versus another data set I have. Any help is appreciated. Thanks

ColtonMSU
  • This gives you negative values because the Normal distribution is not constrained... maybe you should try a Poisson distribution. The 1000 t-tests might sound like a bad idea too... – Matias Andina Apr 07 '19 at 20:38
  • try `sample(1000, totalBeforeViews, replace=TRUE)` – Bensstats Apr 07 '19 at 20:47
  • Hey, so I had to do `sample(totalBeforeViews, 1000, replace=TRUE)`, but it seems to work. Thanks also, Matias. Can you explain why the t-tests seem like a bad idea? My thought process was to get these 1000 simulations of the two data sets, run the t-tests on them, and see how many times the p-value is significant. – ColtonMSU Apr 07 '19 at 21:08
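
For reference, a minimal sketch of the resampling idea from the comments. `sample()` takes the data vector first and the number of draws second. Here each replicate is a bootstrap resample of the same length as the original vector, which is one possible reading of "1000 replicated simulations"; the names `nReps` and `bootViews` are made up for illustration.

```r
totalBeforeViews <- c(1.19, 2.29, 2.05, 1.96, 2.07, 1.77, 1.50,
                      1.77, 4.49, 9.76, 6.55, 5.17, 6.56, 10.31)

set.seed(1)    # only so the sketch is reproducible
nReps <- 1000  # hypothetical name: number of simulated data sets

# Each column is one bootstrap replicate: 14 values drawn with
# replacement from the observed views, so none can be negative.
bootViews <- replicate(
  nReps,
  sample(totalBeforeViews, length(totalBeforeViews), replace = TRUE)
)

dim(bootViews)       # 14 rows, 1000 columns
all(bootViews >= 0)  # TRUE
```

Because every simulated value is one of the observed values, this never produces negative views, but it also never produces a value outside the observed range.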

1 Answer

You may consider a truncated normal distribution. I have not tested the code below yet, but it may help:

library(truncnorm)
rtruncnorm(n=1000, a=0, b=Inf, mean=totalBeforeViews, sd=sdViewsBefore)

In this link, the author provides a custom truncated normal sampler, which you can adapt to your own data without installing new packages.

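# Draw nnorm normal values, keep those inside [lwr, upr],
# then return n of the kept values.
mysamp <- function(n, m, s, lwr, upr, nnorm) {
  samp <- rnorm(nnorm, m, s)
  samp <- samp[samp >= lwr & samp <= upr]
  if (length(samp) >= n) {
    return(sample(samp, n))
  }
  # Too few draws survived the bounds
  stop(simpleError("Not enough values to sample from. Try increasing nnorm."))
}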

set.seed(42)
mysamp(n=10, m=39.74, s=25.09, lwr=0, upr=340, nnorm=1000)
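
If installing a package is not an option, another base-R route is inverse-CDF sampling, which does not discard any draws. This is only a sketch and not the linked author's code: `rtnormBase` is a made-up name, and using the sample mean and standard deviation of `totalBeforeViews` as the parameters is an assumption.

```r
totalBeforeViews <- c(1.19, 2.29, 2.05, 1.96, 2.07, 1.77, 1.50,
                      1.77, 4.49, 9.76, 6.55, 5.17, 6.56, 10.31)

# Hypothetical helper: n draws from a normal(m, s) truncated to [lwr, upr].
# Uniform draws are taken between the CDF values of the two bounds and
# then mapped back through the normal quantile function.
rtnormBase <- function(n, m, s, lwr = 0, upr = Inf) {
  pLow  <- pnorm(lwr, m, s)
  pHigh <- pnorm(upr, m, s)
  qnorm(runif(n, pLow, pHigh), m, s)
}

set.seed(42)
simulatedBeforeViews <- rtnormBase(
  n = 1000,
  m = mean(totalBeforeViews),  # assumption: centre on the sample mean
  s = sd(totalBeforeViews)     # assumption: use the sample sd
)

range(simulatedBeforeViews)  # both ends are non-negative
```

Note that truncating at 0 shifts the mean upward, so the mean of the simulated values will be somewhat larger than `mean(totalBeforeViews)`.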
liuminzhao
  • Unfortunately, it looks like I can't install that package on my version: `package ‘truncnorm’ is not available (for R version 3.3.2)`. If I install a newer version to try this, I will let you know how it goes. Thanks – ColtonMSU Apr 07 '19 at 21:18