Error in sample.int(length(x), size, replace, prob) : too few positive probabilities

Question

I am trying to simulate a response pattern from the IRT graded response model, using the mirt package. However, when I use the simdata function, it gives me the following error:

Error in sample.int(length(x), size, replace, prob): too few positive probabilities

The dataset I am trying to simulate is unidimensional, and each of the 40 items has 5 categories. Here is the code I wrote:

set.seed(1)
true.abilities <- matrix(c(rnorm(1000)), 1000)
set.seed(1)
a <- matrix(c(runif(40, 0.6, 1.9)), 40)
set.seed(3)
b <- matrix(c(rnorm(40*4)), 40)
b <- t(apply(b, 1, sort, decreasing=FALSE)) #sort since intercepts are ordered
Form.X <- simdata(a, b, 1000, Theta=true.abilities, itemtype='graded')

There was another similar question about this, but the answers didn't help me. I also found a code for simulating multidimensional graded response pattern, and it worked, but my code, for the unidimensional model, doesn't work.

Any ideas on how to fix this? Thank you!

Welcome to SO!, the above code did not produce any errors for me with `mirt_0.7.0` can you see output of `sessionInfo()` — Silence Dogood, Jul 05 '14 at 10:06
If you `debug(simdata)` you see that `P` is all zeros, which is why the next `apply` call craps out. Investigate how `P` is all zeros and you might get closer to a solution. — Roman Luštrik, Jul 05 '14 at 11:26
Thank you @RomanLuštrik, but I am new with R and with programming, thus I have no clue on how to investigate why the p is all zeros. — Daniela, Jul 05 '14 at 13:42
Ah, I fixed it. The problem was with `decreasing=FALSE`. It should be `TRUE`. — Daniela, Jul 05 '14 at 14:43
Here are a few tips on how to debug code. This has been made easier with Rstudio, which shows which line is about to be executed (when using `debug()`): http://stackoverflow.com/questions/4442518/general-suggestions-for-debugging-r — Roman Luštrik, Jul 06 '14 at 08:55
Just an aside; it's somewhat dangerous to simulate intercept parameters this way (i.e. with `sort(rnorm())`), especially if you are using this code in a larger simulation study. The issue is that if the sorted values are too close (e.g. `c(-1, 1, 1.0001)`) then when the observed response categories are drawn there will be some categories with very little endorsement. This will cause some pretty nasty estimation issues, and likely will result in non-convergence. So, it's generally better to specify the intercept spaces equally (or at least far enough apart) so that this does not occur. Cheers. — philchalmers, Jul 09 '14 at 03:44
Thank you for your suggestion, @philchalmers! Would those estimation issues result in the following error in R: "Error in if (tmp < 0) Mte <- exp(tmp) : Missing value where TRUE/FALSE needed" ? Would drawing the intercepts from a uniform distribution solve the problem? — Daniela, Jul 10 '14 at 05:48
I'm not really sure what that error means or where it comes from (`Mte` isn't a defined variable in the entire package!). No, uniform distributions can result in the same issue when sorted. You need to make sure that each intercept is relatively far away from each other (maybe by a minimum of 0.3) to avoid extremely sparse categories. — philchalmers, Jul 10 '14 at 15:19
Something like this would do it: `diffs <- t(apply(matrix(runif(75, .3, 1), 25), 1, cumsum)); diffs <- -(diffs - rowMeans(diffs)); d <- diffs + rnorm(25)` — philchalmers, Jul 10 '14 at 15:45

Error in sample.int(length(x), size, replace, prob) : too few positive probabilities

0 Answers0

Linked