I am trying to understand how to compute the IQR (interquartile range).

According to this, this and this, I tried three solutions.

solution_1

import numpy

a = numpy.array([1, 2, 3, 4, 5, 6, 7])
q1_a = numpy.percentile(a, 25)  # first quartile
q3_a = numpy.percentile(a, 75)  # third quartile
q3_a - q1_a

solution_2

from scipy.stats import iqr
iqr(a)

solution_3

import numpy as np

q1_am = np.median(np.array([1, 2, 3, 4]))  # median of the lower half
q3_am = np.median(np.array([4, 5, 6, 7]))  # median of the upper half
q3_am - q1_am

All three give the same result, 3, which is correct.

When I tried another set of numbers, things got weird.

Both solution_1 and solution_2 output 0.95, which is not correct.

x = numpy.array([4.1, 6.2, 6.7, 7.1, 7.4, 7.4, 7.9, 8.1])
q1_x = numpy.percentile(x, 25)
q3_x = numpy.percentile(x, 75)
q3_x - q1_x

solution_3 gives 1.2, which is correct:

q1_xm = np.median(np.array([4.1, 6.2, 6.7, 7.25]))
q3_xm = np.median(np.array([7.25, 7.4, 7.9, 8.1]))
q3_xm - q1_xm

What am I missing with the solutions?

Any clue would be appreciated.

tel

2 Answers

You'll get your expected result with numpy.percentile if you set interpolation='midpoint' (since NumPy 1.22 this keyword is called method, and interpolation is deprecated):

import numpy

x = numpy.array([4.1, 6.2, 6.7, 7.1, 7.4, 7.4, 7.9, 8.1])
q1_x = numpy.percentile(x, 25, interpolation='midpoint')
q3_x = numpy.percentile(x, 75, interpolation='midpoint')
print(q3_x - q1_x)

This outputs:

1.2000000000000002

Setting interpolation='midpoint' also makes scipy.stats.iqr give the result you wanted:

import numpy
from scipy.stats import iqr

x = numpy.array([4.1, 6.2, 6.7, 7.1, 7.4, 7.4, 7.9, 8.1])
print(iqr(x, rng=(25,75), interpolation='midpoint'))

which outputs:

1.2000000000000002

See the interpolation parameter in the linked docs for more info on what the option actually does.
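
As a rough illustration of what the option changes, here is a minimal pure-Python sketch. It assumes the fractional position (n - 1) * q / 100 in the sorted data, which is the convention behind numpy.percentile's default linear behaviour. The names percentile_sketch and iqr_sketch are hypothetical helpers for this example, not part of NumPy or SciPy.

```python
import math


def percentile_sketch(data, q, interpolation="linear"):
    """Return the q-th percentile (0-100) of data for two interpolation modes."""
    values = sorted(data)
    # Fractional position of the percentile within the sorted data.
    pos = (len(values) - 1) * q / 100
    lower = values[math.floor(pos)]
    upper = values[math.ceil(pos)]
    if interpolation == "linear":
        # Weight the two neighbours by the fractional part of the position.
        return lower + (upper - lower) * (pos - math.floor(pos))
    if interpolation == "midpoint":
        # Ignore the fractional part and average the two neighbours.
        return (lower + upper) / 2
    raise ValueError(f"unsupported interpolation: {interpolation!r}")


def iqr_sketch(data, interpolation="linear"):
    """Interquartile range as the 75th minus the 25th percentile."""
    return (percentile_sketch(data, 75, interpolation)
            - percentile_sketch(data, 25, interpolation))


x = [4.1, 6.2, 6.7, 7.1, 7.4, 7.4, 7.9, 8.1]
print(iqr_sketch(x, "linear"))    # close to 0.95, up to floating-point rounding
print(iqr_sketch(x, "midpoint"))  # close to 1.2, up to floating-point rounding
```

With eight values, the 25th percentile sits at position 1.75, between 6.2 and 6.7. Linear interpolation gives 6.575 while midpoint gives 6.45, and that is where the 0.95 versus 1.2 difference comes from. For [1, 2, 3, 4, 5, 6, 7] both positions fall exactly halfway between two values, so the two modes agree.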

tel

Use numpy.quantile:

import numpy as np

x = np.array([4.1, 6.2, 6.7, 7.1, 7.4, 7.4, 7.9, 8.1])
q1_x = np.quantile(x, 0.25, interpolation='midpoint')
q3_x = np.quantile(x, 0.75, interpolation='midpoint')
print(q3_x - q1_x)

The output:

1.2000000000000002
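
For comparison, the manual split from solution_3 in the question can be generalised with the standard library alone. This is only a sketch under one assumption about the intended rule: when the number of values is odd, the middle value is counted in both halves, as in the question's first example. The name iqr_by_halves is hypothetical.

```python
from statistics import median


def iqr_by_halves(data):
    """IQR as the median of the upper half minus the median of the lower half."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    # Assumption: for odd n the middle value belongs to both halves.
    lower = values[:half + n % 2]
    upper = values[half:]
    return median(upper) - median(lower)


print(iqr_by_halves([1, 2, 3, 4, 5, 6, 7]))
print(iqr_by_halves([4.1, 6.2, 6.7, 7.1, 7.4, 7.4, 7.9, 8.1]))
```

On the two arrays from the question this gives 3 and roughly 1.2, the same values that interpolation='midpoint' produces for them. That agreement is only checked for these two inputs and is not claimed for data in general.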
Jon Jiang