26

Using R, it is trivial to calculate the quantiles for given probabilities in a sampled distribution:

x <- rnorm(1000, mean=4, sd=2)
quantile(x, .9) # results in 6.705755

However, I can't find an easy way to do the inverse—calculate the probability for a given quantile in the sample x. The closest I've come is to use pnorm() with the same mean and standard deviation I used when creating the sample:

pnorm(5, mean=4, sd=2) # results in 0.6914625

However, because this is calculating the probability from the full normal distribution, and not the sample x, it's not entirely accurate.

Is there a function that essentially does the inverse of quantile()? Something that essentially lets me do the same thing as pnorm() but with a sample? Something like this:

backwards_quantile(x, 5)

I've found the ecdf() function, but can't figure out a way to make it result in a single probability instead of a full equation object.

Andrew
  • 36,541
  • 13
  • 67
  • 93

3 Answers3

23

ecdf returns a function: you need to apply it.

f <- ecdf(x)
f( quantile(x,.91) )
# Equivalently:
ecdf(x)( quantile(x,.91) )
Vincent Zoonekynd
  • 31,893
  • 5
  • 69
  • 78
  • 4
    In the example shown in the original post, you would actually have to run `ecdf(x)(5)` in order to find the quantile of 5 given `x` (approx. 0.697 given seed 123). – Waldir Leoncio Feb 14 '14 at 14:05
3

Just for convenience, this function helps:

quantInv <- function(distr, value) ecdf(distr)(value)
set.seed(1)
x <- rnorm(1000, mean=4, sd=2)
quantInv(x, c(4, 5, 6.705755))
[1] 0.518 0.685 0.904
xm1
  • 1,663
  • 1
  • 17
  • 28
3

You more or less have the answer yourself. When you want to write

backwards_quantile(x, 5)

just write

ecdf(x)(5)

This corresponds to the inverse of quantile() with type=1. However, if you want other types (I favour the NIST standard, corresponding to Excel's Percentile.exc, which is type=6), you have more work to do.

In these latter cases, consider which use you are going to put it to. If all you want is to plot it, for instance, then consider

yVals<-seq(0,1,0.01)
plot(quantile(x,yVals,type=6))

But if you want the inverse for a single value, like 5, then you need to write a solving function to find the P that makes

quantile(x,P,type=6) = 5

For instance this, which uses binary search between the extreme values of x:

inverse_quantile<-function(x,y,d=0.01,type=1) {
  A<-min(x)
  B<-max(x)
  k<-(log((B-A)/d)/log(2))+1
  P=0.5
  for (i in 1:k) {
    P=P+ifelse((quantile(x,P,type=type)<y),2^{-i-1},-2^{-i-1})
  }
  P
}

So if you wanted the type 4 quantile of your set x for the number 5, with precision 0.00001, then you would write

inverse_quantile<-function(x,5,d=0.00001,type=4)