2

I've been getting unexpected results using the wtd.iqr function from the reldist package (version 1.6.6) to calculate a weighted interquartile range (as opposed to the unweighted interquartile range returned by IQR from the vanilla R stats package). To explore the problem, I tried comparing the output of reldist::wtd.iqr to the output of IQR.

To my surprise, I found that IQR and reldist::wtd.iqr return completely different output values for the same input values even when the input values are equally weighted (i.e. when weighting should make no difference).

> x <- rnorm(10000)
> wt <- rep(1, length(x))
> paste(c('IQR:', IQR(x), 'wtd.iqr:', reldist::wtd.iqr(x, weight = wt)))
[1] "IQR:"              "1.34879539936654"  "wtd.iqr:"          "0.675866062623211"
> 

In the above test, IQR seems always to return an output value approximately double the value that wtd.iqr returns for the same input values.

With input values that do not follow the above distribution, this relationship does not necessarily hold true: in fact, with real data, I sometimes get negative values from wtd.iqr, which I would have assumed to be impossible, but have never found that to happen with IQR.

In fact, it seems to me that wtd.iqr may actually be returning not the interquartile range but one of the quartiles. But if there's a bug here, it surely can't be such an obvious one --- can it?

Presumably the two functions define the concept of interquartile range in a different way, but there is no clue in the documentation. The documentation for IQR states that it 'computes interquartile range of the x values', while the documentation for wtd.iqr states that it 'Returns an empirical interquartile range from a weighted sample'.

tushaR
  • 3,083
  • 1
  • 20
  • 33
Westcroft_to_Apse
  • 1,503
  • 4
  • 20
  • 29
  • Right, so this question has been downvoted. It would be nice if the person who did that could leave a comment to explain why. – Westcroft_to_Apse Oct 25 '17 at 11:37
  • Looks like a bug to me. The `iqr` and `wtd.iqr` function of the `reldist` package both have the same description. **Compute weighted Interquartile range (iqr)** and yes it's returning 25% which is quite strange. – tushaR Oct 25 '17 at 11:50
  • Thanks, @TUSHAr. I've resorted to calculating the weighted iqr myself. But if it's a bug, it's such an obvious one that I'm surprised it slipped through... maybe I'm over-thinking it, though. – Westcroft_to_Apse Oct 25 '17 at 11:59
  • Good catch Westcroft_to_Apse. – tushaR Oct 25 '17 at 12:17

1 Answers1

3

No, you are not overthinking. It's actually a bug. I have gone through the function definition here: https://github.com/cran/reldist/blob/master/R/wtd.quantile.R . It's using the Hmisc package's wtd.quantile function internally and then subtracting it with the probability values .25 and .75. But the two calls are being made in separate lines and hence R is treating it as different calls, rather than calculating the expression a-b, it returns -b which is the last line. Hence, -ve value of the 25% is being returned.

IF you simply try this:

quantile(x = x,c(0,0.25,0.5,0.75,1))

#             0%                      25%                      50%                      75% 
#-3.903016136384592105202 -0.677263029412919159711 -0.012691140400805673433  0.636730086813689699632 
#                100% 
# 3.745404178709976328321 

reldist::wtd.iqr(x = x,weight = wt)

 #25% 
 #0.67726302941291915971 

The negative of 25%ile is returned. The last line is returned in an R function call.

This is the function definition:

wtd.iqr <- function(x, na.rm = FALSE, weight=FALSE) {
  wtd.quantile(x, q=0.75, na.rm = na.rm, weight=weight)
- wtd.quantile(x, q=0.25, na.rm = na.rm, weight=weight)
}

It should be:

wtd.iqr <- function(x, na.rm = FALSE, weight=FALSE) {
  wtd.quantile(x, q=0.75, na.rm = na.rm, weight=weight) - wtd.quantile(x, q=0.25, na.rm = na.rm, weight=weight)
}
tushaR
  • 3,083
  • 1
  • 20
  • 33