I am trying to understand how matplotlib draws a box plot. This is the code I wrote:
import matplotlib.pyplot as plt
import pandas as pd
age = [20, 22, 22, 23, 23, 23, 23, 24, 24, 24, 24, 26, 26, 30]
df = pd.DataFrame(age, columns=['age'])
print(df['age'].describe())
This is the output:
count 14.000000
mean 23.857143
std 2.348720
min 20.000000
25% 23.000000
50% 23.500000
75% 24.000000
max 30.000000
Name: age, dtype: float64
From these quartiles I calculated the IQR and the lower and upper limits, L and U:
IQR = Q3 - Q1 = 24 - 23 = 1
L = Q1 - 1.5 * IQR = 23 - 1.5 * 1 = 21.5
U = Q3 + 1.5 * IQR = 24 + 1.5 * 1 = 25.5
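To rule out an arithmetic slip, the same calculation can be done in code. This is only a sketch of my hand calculation, assuming pandas' default (linear interpolation) quantiles, which are what describe() reports; the variable names are my own.

```python
import pandas as pd

age = [20, 22, 22, 23, 23, 23, 23, 24, 24, 24, 24, 26, 26, 30]
s = pd.Series(age, name="age")

# Quartiles with pandas' default linear interpolation (same as describe())
q1 = float(s.quantile(0.25))
q3 = float(s.quantile(0.75))

# Interquartile range and the 1.5 * IQR limits computed by hand above
iqr = q3 - q1
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

print("IQR:", iqr)
print("L:", lower)
print("U:", upper)
```

This gives the same values as my calculation by hand, so the difference does not come from the quartiles.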
However, the plot generated by matplotlib differs from what I calculated:
df.boxplot(column=['age'])
plt.show()
In the plot, the lower and upper extremes (the whisker ends) are at 22 and 24, not at 21.5 and 25.5.
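To read the whisker positions as numbers instead of estimating them from the picture, I also tried the sketch below. It assumes that `matplotlib.cbook.boxplot_stats` with its default `whis=1.5` computes the same statistics that the box plot draws; I have not verified how pandas passes the data through.

```python
from matplotlib import cbook

age = [20, 22, 22, 23, 23, 23, 23, 24, 24, 24, 24, 26, 26, 30]

# boxplot_stats returns one dict of statistics per dataset
stats = cbook.boxplot_stats(age, whis=1.5)[0]

print("Q1:", stats["q1"], "Q3:", stats["q3"], "IQR:", stats["iqr"])
print("lower whisker:", stats["whislo"])
print("upper whisker:", stats["whishi"])
print("points drawn as outliers:", list(stats["fliers"]))
```

The whiskers come out at 22 and 24, matching the plot. In this data, 22 and 24 are also the smallest and largest observations that lie inside the range from 21.5 to 25.5.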
What is the formula for L and U (lower and upper extreme) that matplotlib uses?
Thanks a lot for pointing out my mistakes.