Why do np.std(X) and X.std() return different values?

Question

I am trying to calculate normalized scores for my dataset using mean normalization. When I write (X - np.mean(X))/np.std(X), I get different scores than with (X - X.mean())/X.std().

The problem seems to come from the standard deviation calculation: X.std() returns one set of values for the standard deviation, and np.std(X) returns different ones. Why is this happening?

What is `X`? (e.g. pandas DataFrame, xarray DataArray, etc.) – user7813790 Jul 24 '19 at 07:50
It's a DataFrame. I got it now. – Matt Jul 24 '19 at 08:51

score 5 · Accepted Answer · answered Jul 24 '19 at 07:50

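pandas uses the sample estimator by default (ddof=1, i.e. N-1 in the denominator, often called the unbiased estimator), whereas NumPy defaults to the population formula (ddof=0, i.e. N in the denominator).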

To make them behave the same, pass ddof=1 to numpy.std().
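As a minimal sketch of the difference (the sample values and variable names below are made up for illustration, and a 1-D Series stands in for the asker's DataFrame), the two defaults can be compared side by side:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
arr = np.array(values)
s = pd.Series(values)

pop_std = np.std(arr)      # NumPy default: ddof=0, divides by N
sample_std = s.std()       # pandas default: ddof=1, divides by N-1

# Matching ddof on either side makes the results agree.
numpy_as_pandas = np.std(arr, ddof=1)
pandas_as_numpy = s.std(ddof=0)

# Normalized scores agree once the same ddof is used on both sides.
z_numpy = (arr - np.mean(arr)) / np.std(arr, ddof=1)
z_pandas = ((s - s.mean()) / s.std()).to_numpy()

print(pop_std, sample_std)
print(np.allclose(z_numpy, z_pandas))
```

Going the other way also works: passing ddof=0 to X.std() should reproduce NumPy's default.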

Different std in pandas vs numpy

Aleksandr Chernov

Thank you. It gives the same answer now. – Matt Jul 24 '19 at 08:24
If it solved the issue and you are satisfied with this answer, please accept it, @Matt, and close the question. – Frayal Jul 24 '19 at 10:04
