Based on the proportion of slopes from the randomisation, greater or less than the slope from the observed data, I would like to calculate the expected probability of getting the observed slope. The observed slope is -0.2717.

Any help would be greatly appreciated, I am a newbie.

# Null distribution of slopes: refit the model after shuffling the response
histdata <- numeric(10000)
for (i in 1:10000) {
  histdata[i] <- coef(lm(sample(tcons) ~ tleave))[[2]]
}
hist(histdata)
abline(v = -0.2717, lwd = 3, lty = 2)  # observed slope
box()

data3 <- -0.2717 > histdata

This ^^ gives me 9954 that are not greater than the original and 46 that are greater.

  • How do you want to calculate this p-value? It's easier to help you when you provide a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input data so we can actually run your code to see what it's doing. – MrFlick Nov 20 '17 at 15:12
  • Turned histdata into input data: data2<- data.frame(histdata). It contains slope values from a loop, and I am trying to find the p-value for them. However, I can't run an ANOVA on it currently. Not sure if I need to change the object. – Barry Allen ' The Flash Nov 20 '17 at 15:17
  • This comment doesn't make any sense to me. Still have no idea what you are doing. I don't understand the desired output. It's unclear what these different code chunks have to do with your goal. Are you trying to create one plot or multiple plots? – MrFlick Nov 20 '17 at 15:32
  • Just looking to find the p value of "histdata" which contains 10000 slope values from a for-loop. Then plot this along with the mean of the 10000 slopes on a single dimension plot. – Barry Allen ' The Flash Nov 20 '17 at 15:54
  • That doesn't make sense statistically. How do you calculate the p-value of 10000 numbers? What's the p-value of 1, 7, 12? In order to have a p-value there needs to be some model (distributional assumption) and some test statistic. Some hypothesis to test. And a p-value is going to be on a completely different scale than the observations themselves, so how you would include them on the same plot isn't at all clear. – MrFlick Nov 20 '17 at 16:04
  • My mistake. Based on the proportion of slopes from the randomisation, greater or less than the slope from the observed data, I would like to calculate the expected probability of getting the observed slope. The observed slope is -0.2717. I have edited the post to reflect this. – Barry Allen ' The Flash Nov 20 '17 at 16:21

1 Answer

If you have the results of a randomization procedure in rand_vals and an observed value in obs_val, then the one-tailed p-value (quantifying support for the null hypothesis vs. the alternative hypothesis that the observed value is greater than the null value) is

mean(rand_vals >= obs_val)
  • Note that this is NOT ☢☣ (can't find a skull & crossbones emoji) the "probability of getting the observed slope". It is the probability of observing a value greater than or equal to the observed slope, if the null hypothesis is true.
  • In some cases it may be appropriate to include the observed value in the "randomization" set as well, i.e. mean(c(rand_vals, obs_val) >= obs_val); this won't make much difference if your randomization set is large.
  • A two-tailed p-value would be something like mean(abs(rand_vals) >= abs(obs_val))
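
As a rough end-to-end sketch of the points above: the question's tcons and tleave are not shown, so the data below are simulated stand-ins, and n_perm, p_upper, p_lower, p_lower_incl and p_two are names made up for illustration. Since the slope in the question is negative, the lower-tail version (<=) may be the direction of interest there; that is an assumption about the intended alternative, not something the question states.

```r
# Hypothetical stand-in data (the original tcons and tleave are not available)
set.seed(1)
tleave <- rnorm(30)
tcons  <- -0.3 * tleave + rnorm(30)

# Observed slope from the unshuffled data
obs_val <- coef(lm(tcons ~ tleave))[[2]]

# Null distribution: slope after shuffling the response, repeated n_perm times
n_perm <- 2000
rand_vals <- replicate(n_perm, coef(lm(sample(tcons) ~ tleave))[[2]])

# One-tailed p-values: proportion of randomized slopes at least as extreme
p_upper <- mean(rand_vals >= obs_val)  # alternative: slope larger than under the null
p_lower <- mean(rand_vals <= obs_val)  # alternative: slope smaller than under the null

# Same lower-tail proportion with the observed value added to the randomization set
p_lower_incl <- mean(c(rand_vals, obs_val) <= obs_val)

# Two-tailed version based on absolute values
p_two <- mean(abs(rand_vals) >= abs(obs_val))
```

Including the observed value guarantees the p-value can never be exactly zero, which is one reason that variant is sometimes preferred.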
Ben Bolker