
I calculated the upper quartile (Q3, the 75th percentile) and the lower quartile (Q1, the 25th percentile) using NumPy/Pandas and a TI-nspire, but I get different values. Why does this happen?

Working by hand, Q1 = (5+8)/2 = 6.5 and Q3 = (18+21)/2 = 19.5, so the NumPy/Pandas Q1 and Q3 look wrong to me. Why does NumPy/Pandas return these numbers?

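import numpy as np
import pandas as pd

data = np.array([2, 4, 5, 8, 10, 11, 12, 14, 17, 18, 21, 22, 25])

# 75th and 25th percentiles with NumPy's default (linear) interpolation
q75, q25 = np.percentile(data, [75, 25])
print(q75, q25)

# describe() reports the 25%, 50% and 75% quantiles among its statistics
df = pd.DataFrame(data)
df.describe()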

NumPy returns 18.0 and 8.0, and Pandas returns the same values, but the TI-nspire returns 19.5 and 6.5.


shin
  • Maybe [this](https://stackoverflow.com/questions/49025162/is-pandas-showing-the-wrong-percentile) helps – jezrael Jan 28 '20 at 12:37

2 Answers


This post and this post helped me understand it.

So if you have [7, 15, 36, 39, 40, 41], the six values sit at evenly spaced percentiles: 7 -> 0%, 15 -> 20%, 36 -> 40%, 39 -> 60%, 40 -> 80%, 41 -> 100%.

The default interpolation is linear: when the requested percentile falls between two data points i and j, NumPy returns i + (j - i) * fraction. You can set interpolation to midpoint instead, which calculates (i + j) / 2.
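To make the linear rule concrete, here is a minimal pure-Python sketch of it. The helper name `linear_percentile` is made up for illustration and is not part of NumPy; it assumes the input is already sorted and that the fractional index is (n - 1) * q / 100, which is consistent with the 0%, 20%, ..., 100% spacing above.

```python
import math

def linear_percentile(sorted_values, q):
    """Hypothetical helper: percentile q (0-100) of pre-sorted data, linear rule."""
    pos = (len(sorted_values) - 1) * q / 100  # fractional index into the data
    lo = math.floor(pos)                      # index of the lower neighbour i
    hi = math.ceil(pos)                       # index of the upper neighbour j
    fraction = pos - lo
    i, j = sorted_values[lo], sorted_values[hi]
    return i + (j - i) * fraction

small = [7, 15, 36, 39, 40, 41]
print([linear_percentile(small, q) for q in (25, 50, 75)])  # [20.25, 37.5, 39.75]

# Data from the question: the positions land exactly on data points
question = [2, 4, 5, 8, 10, 11, 12, 14, 17, 18, 21, 22, 25]
print(linear_percentile(question, 25), linear_percentile(question, 75))  # 8.0 18.0
```

With 13 values the 25% and 75% positions are the whole-number indices 3 and 9, so no interpolation happens at all, which is why the question sees 8.0 and 18.0.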

import numpy as np

data = np.array([7, 15, 36, 39, 40, 41])
# Note: NumPy 1.22 and later call this keyword `method`; `interpolation` is the older name
linear = np.percentile(data, [25, 50, 75], interpolation='linear')
mid = np.percentile(data, [25, 50, 75], interpolation='midpoint')
low = np.percentile(data, [25, 50, 75], interpolation='lower')
high = np.percentile(data, [25, 50, 75], interpolation='higher')
nearest = np.percentile(data, [25, 50, 75], interpolation='nearest')
print(linear, mid, low, high, nearest)
# Q1, median and Q3 taken as medians of the halves, for comparison
print(15, 37.5, 40)


So I found that none of these interpolation options makes Pandas/NumPy compute Q1 and Q3 exactly the way the TI-nspire does.
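If you want the calculator's numbers, one option is to compute them directly. The sketch below assumes the TI-nspire takes Q1 and Q3 as the medians of the lower and upper halves, leaving the overall median out when the count is odd. That assumption matches the 6.5 and 19.5 in the question, but I have not checked it against TI's documentation. The function name is made up for illustration.

```python
from statistics import median

def quartiles_median_of_halves(values):
    """Hypothetical helper: (Q1, Q3) as medians of the lower and upper halves.

    For an odd count the overall median belongs to neither half.
    Needs at least two values.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]   # values below the median position
    upper = ordered[-half:]  # values above the median position
    return median(lower), median(upper)

print(quartiles_median_of_halves([2, 4, 5, 8, 10, 11, 12, 14, 17, 18, 21, 22, 25]))  # (6.5, 19.5)
print(quartiles_median_of_halves([7, 15, 36, 39, 40, 41]))  # (15, 40)
```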

shin

You are in for a treat. They are both right.

Unlike most other descriptors, there are several different definitions of Q1 and Q3 in use. For datasets with a large number of observations, the different definitions give more or less the same result. For small datasets you will see differences, as you experienced.

MathWorld lists 5 (five!) different ways of computing quartiles. See http://mathworld.wolfram.com/Quartile.html
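As a small illustration that does not depend on NumPy, Python's standard library (3.8+) ships two of these definitions in `statistics.quantiles`. On the data from the question, the `inclusive` method gives the NumPy/Pandas numbers and the `exclusive` method happens to give the TI-nspire numbers. That agreement is specific to this dataset and should not be read as the calculator's actual rule.

```python
from statistics import quantiles

data = [2, 4, 5, 8, 10, 11, 12, 14, 17, 18, 21, 22, 25]

# 'inclusive': data treated as the whole population, 0-based position (n - 1) * p
print(quantiles(data, n=4, method='inclusive'))  # [8.0, 12.0, 18.0]

# 'exclusive' (the default): data treated as a sample, 1-based position (n + 1) * p
print(quantiles(data, n=4, method='exclusive'))  # [6.5, 12.0, 19.5]

# With many observations the two definitions nearly agree
big = list(range(1, 100_001))
q1_inclusive = quantiles(big, n=4, method='inclusive')[0]
q1_exclusive = quantiles(big, n=4, method='exclusive')[0]
print(q1_inclusive, q1_exclusive)
```

For the 13 values the two Q1 results differ by 1.5, while for the 100,000 values they differ by less than 1 on a scale of about 25,000.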

soegaard