
I am confused. I found a few formulas for computing the SD (standard deviation). This is the result of the NumPy library's std function:

>>> nums = np.array([65, 36, 52, 91, 63, 79])
>>> np.std(nums)
17.716909687891082

But I found another formula here: Standard deviation

With this formula and the same dataset, my result is 323.1666666666667. Which one is right? Or are they used for two different things?

EDIT: It seems I forgot about the square root.

Jason Aller
Toma Tomov
  • Are you sure you implemented that formula correctly? – Blorgbeard Jul 02 '19 at 17:12
  • Your answer is definitely incorrect, but do you know the difference between population standard deviation and sample standard deviation? – pault Jul 02 '19 at 17:14
  • Related: [click](https://stackoverflow.com/a/35584364/1534017) – Cleb Jul 02 '19 at 17:17
  • Just to add: This is a great example to check your intuition of standard deviation. Given 6 numbers, all between 36 and 91, consider carefully whether a standard deviation of 323 makes sense. – G. Anderson Jul 02 '19 at 17:21
  • I can see that it doesn't make sense. I was just curious why there are two formulas, but from @pault's comment I now know. Thank you, pault. I will read about those two variants. I am totally new to this area :) – Toma Tomov Jul 02 '19 at 17:36

2 Answers


NumPy is correct, of course. Here is the plain Python version:

from math import sqrt

data = [65, 36, 52, 91, 63, 79]

# Population standard deviation: square root of the mean squared deviation.
mean = sum(data) / len(data)
std = sqrt(sum((d - mean) ** 2 for d in data) / len(data))
print(std)   # 17.716909687891082
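
Regarding the population vs. sample distinction raised in the comments: the two formulas differ only in the divisor (n for population, n - 1 for sample). Below is a minimal sketch of that idea. The helper name `std_dev` and its `ddof` parameter are my own, chosen to mirror the parameter name NumPy uses; with the default `ddof=0` it should agree with `np.std`.

```python
from math import sqrt


def std_dev(data, ddof=0):
    """Hypothetical helper: ddof=0 divides by n (population),
    ddof=1 divides by n - 1 (sample)."""
    n = len(data)
    mean = sum(data) / n
    squared_deviations = sum((d - mean) ** 2 for d in data)
    return sqrt(squared_deviations / (n - ddof))


data = [65, 36, 52, 91, 63, 79]

population_sd = std_dev(data)       # divisor n, like np.std's default
sample_sd = std_dev(data, ddof=1)   # divisor n - 1

print(population_sd)
print(sample_sd)
```

The sample value comes out somewhat larger than the population value because the same sum of squared deviations is divided by a smaller number.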
hiro protagonist
  • @hiroprotagonist Why reinvent the wheel? There is a statistics module in the Python standard library. https://docs.python.org/3/library/statistics.html – balderman Jul 02 '19 at 17:22
  • Thank you! Will read about the difference between population and sample SD :) – Toma Tomov Jul 02 '19 at 17:37
  • @balderman This was not about reinventing the wheel; the OP already had a library that did what he wanted but could not get it to agree with what he thought the result should be. With this, he can confirm step by step where he went wrong. – hiro protagonist Jul 02 '19 at 18:19

Core Python: see pstdev in the statistics module.

import statistics
print(statistics.pstdev([65, 36, 52, 91, 63, 79]))

Output:

17.716909687891082
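
Since the question mentions a forgotten square root, here is a short sketch showing how the variance and the standard deviation relate, using only functions from the same statistics module (`pvariance`, `pstdev`, `stdev`). The variable names are just for illustration.

```python
import statistics

data = [65, 36, 52, 91, 63, 79]

# Population variance: mean squared deviation (divisor n).
population_variance = statistics.pvariance(data)

# Population standard deviation: square root of the population variance.
population_sd = statistics.pstdev(data)

# Sample standard deviation: same idea, but with divisor n - 1.
sample_sd = statistics.stdev(data)

print(population_variance)
print(population_sd)
print(sample_sd)
```

If a hand calculation gives a number in the hundreds for this dataset, it is most likely a variance rather than a standard deviation; taking the square root brings it back to the scale of the data.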
balderman