Different results when calculating quantile in pandas (Python) and R

Question

Could you please tell me why the results differ when quantiles are calculated in pandas (Python) and in R?

Pandas code:

    print('p_new:   {:>5}   {:>5}     {:>5}'.format(
        round(self.pandas_data_frame['pending_new'].quantile(0.50), 2),
        round(self.pandas_data_frame['pending_new'].quantile(0.95), 2),
        round(self.pandas_data_frame['pending_new'].quantile(0.99), 2),
    ))

    print('new:     {:>5}   {:>5}   {:>5}'.format(
        round(self.pandas_data_frame['new'].quantile(0.50), 2),
        round(self.pandas_data_frame['new'].quantile(0.95), 2),
        round(self.pandas_data_frame['new'].quantile(0.99), 2),
    ))

results:

name         .50     .95     .99
p_new:       2.0    12.0    20.0
new:        52.0    78.0  106.06

R code:

dd = read.csv("stats.csv")
quantile(dd$pending_new, c(.50, .95, .99))
quantile(dd$new, c(.50, .95, .99))

results:

> quantile(dd$pending_new, c(.50, .95, .99))
50%  95%  99% 
2.0 13.1 34.0 
> quantile(dd$new, c(.50, .95, .99))
50%    95%    99% 
52.00  81.00 129.26

Use the sources ([pandas](https://pandas.pydata.org/pandas-docs/version/0.21/generated/pandas.DataFrame.quantile.html) and [r](http://stat.ethz.ch/R-manual/R-devel/library/stats/html/quantile.html)), Luke! — Nelewout, Jun 09 '18 at 08:58
There are many different [ways](https://en.wikipedia.org/wiki/Quantile#Estimating_quantiles_from_a_sample) to estimate quantiles. R uses one way by default, pandas uses another. — ayhan, Jun 09 '18 at 09:22

score 0 · Answer 1 · answered Sep 04 '19 at 08:28

In Python, all functions of the np.percentile() family have an optional argument, interpolation. Set this argument to 'midpoint' and your results should match the results in R. You can also read more about the Python function here: How to calculate 1st and 3rd quartiles?
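To see how much the choice of interpolation matters, here is a minimal sketch that computes the same three quantiles with each option that pandas' Series.quantile() accepts for interpolation. The sample values are made up for illustration and are not the asker's data; swap in your own column, for example the 'pending_new' one. Note that in np.percentile() this argument is called method from NumPy 1.22 onward, and that R's quantile() selects its estimator through the type argument (the documented default is type 7), so which option lines up with R depends on the type used there.

```python
import pandas as pd

# Hypothetical sample data; replace with your own column,
# e.g. pandas_data_frame['pending_new'].
s = pd.Series([1, 2, 3, 4, 10])

# Options accepted by Series.quantile's `interpolation` argument.
methods = ['linear', 'lower', 'higher', 'midpoint', 'nearest']

# One row per interpolation method, one column per quantile level.
results = pd.DataFrame(
    {m: s.quantile([0.50, 0.95, 0.99], interpolation=m) for m in methods}
).T

print(results)
```

Comparing the printed rows against R's output for the same data is a quick way to check which option, if any, reproduces R's numbers. If none does, the inputs themselves may differ (for instance in how missing values were read).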
