I'm working on the HarvardX Data Science program. In the probability section, we are being asked to determine the probability of getting a score above 30. I have even plugged in the exact code that DataCamp reveals as the answer, but pnorm() still returns a value of 1 for any quantile I enter that's 27 or higher. I have attached my code for reference. Can someone point out what the issue is?

For reference see the image below:

[image: pnorm() code]

Oliver
  • What number do you expect? At some stage, probabilities become indistinguishable from 1. You are talking about values which are many standard deviations above the mean. Not really on-topic on Stack Overflow, but your problem seems to be with understanding normal distributions rather than with `pnorm()` – John Coleman Nov 04 '19 at 18:45
  • This code is not calculating the probability of getting a score above 30, it's calculating the probability of getting a value less than or equal to 30 from a standard normal distribution. Perhaps you were supposed to normalize the score first and subtract the lower tail probability from 1? – MrFlick Nov 04 '19 at 18:57
  • Yeah, I wouldn't normally have posted here, but I wasn't getting any response back from the course instructors. The answer is 1.86*10^-11, a very small value. We are supposed to calculate the probability of scoring above 30, so 1 - pnorm() is what is used, but we weren't told to correct the rounding. – emarkley08 Nov 04 '19 at 19:00
  • `pnorm(30,11,2.87,lower.tail = FALSE)`, even with the default display, yields `1.793456e-11` (as does `1- pnorm(30,11,2.87)`, but I guess you got thrown by how the intermediate result `pnorm(30,11,2.87)` was displayed). – John Coleman Nov 05 '19 at 12:19
  • In addition to the rounding problem, you should be aware that trusting a normal model of empirical data that many standard deviations away from the mean is problematic. No real-world data is truly normal, and the discrepancy between the model and the data often shows up on the tails. – John Coleman Nov 05 '19 at 19:14
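The points raised in the comments can be sketched in R as follows. This is only an illustration: the mean of 11 and standard deviation of 2.87 are taken from John Coleman's comment, and the variable names (`mu`, `sigma`, `p_lower`, `p_upper`, `z`, `p_upper_z`) are made up for the example.

```r
# Parameters as quoted in the comments (assumed to match the course exercise).
mu <- 11
sigma <- 2.87

# Lower tail, P(X <= 30): slightly below 1, but displayed as 1 at the
# default of 7 significant digits.
p_lower <- pnorm(30, mean = mu, sd = sigma)
print(p_lower)

# Upper tail, P(X > 30), requested directly instead of computing 1 - p_lower.
p_upper <- pnorm(30, mean = mu, sd = sigma, lower.tail = FALSE)
print(p_upper)  # the comment above reports 1.793456e-11

# The same upper tail via a standardized score on the standard normal.
z <- (30 - mu) / sigma
p_upper_z <- pnorm(z, lower.tail = FALSE)
print(p_upper_z)
```

Here `p_lower` is not actually equal to 1; only its printed form is rounded.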

1 Answer

R rounds values automatically when printing them to the console. Various methods exist to print more significant digits. Because R stores numbers as 64-bit double-precision floats, which carry roughly 15 to 16 significant decimal digits, we can print the first 15 decimal places using

sprintf("%.15f", pnorm(30, 11, 2.87))
# output
# [1] "0.999999999982065"

Anything past the 15 digits is not trustworthy.
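As a rough illustration of that precision limit, the sketch below inspects R's `.Machine` constants and shows what happens further out in the tail. The quantile 40 is an arbitrary value chosen for the example, not something from the question, and `p_sub` and `p_tail` are illustrative names.

```r
# Bits in the significand of a double (53 on IEEE 754 platforms).
sig_bits <- .Machine$double.digits
# Smallest positive x for which 1 + x != 1 (about 2.2e-16).
eps <- .Machine$double.eps
print(sig_bits)
print(eps)

# About ten standard deviations above the mean, the lower-tail probability
# is stored as exactly 1, so subtracting it from 1 leaves nothing.
p_sub <- 1 - pnorm(40, 11, 2.87)

# Asking for the upper tail directly keeps the tiny probability.
p_tail <- pnorm(40, 11, 2.87, lower.tail = FALSE)

print(p_sub == 0)
print(p_tail > 0)
```

This is one reason `lower.tail = FALSE` tends to be preferable to `1 - pnorm(...)` for extreme quantiles.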

Oliver
  • Awesome, thank you, we were not told to adjust the rounding so everything was just generating a value of 1. – emarkley08 Nov 04 '19 at 19:02
  • The rounding is only visual, keep in mind. Any calculations are performed with all available digits. The number of digits shown is controlled by `options(digits = n)`, so alternatively you could change the display globally using `options(digits = 15)` (the default is 7). In my experience, the GUI used (RStudio, for example) can set a different default to suit its UI. – Oliver Nov 04 '19 at 19:17
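A minimal sketch of the display options mentioned in this comment, assuming a fresh session with the default `digits` setting. The names `x`, `shown_default`, `shown_15`, `before` and `old` are illustrative only.

```r
x <- pnorm(30, 11, 2.87)

# Default display: 7 significant digits, so the value is shown as 1.
shown_default <- format(x)
print(shown_default)

# Per-call control, leaving the global setting untouched.
shown_15 <- format(x, digits = 15)
print(shown_15)
print(x, digits = 15)

# Global control: options() returns the previous settings so they can be restored.
before <- getOption("digits")
old <- options(digits = 15)
print(x)
options(old)

# The stored value never changed; only its printed form did.
print(x < 1)
```

Restoring the old options afterwards keeps the change from leaking into the rest of the session.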