
The standard R sd() function uses a different standard deviation formula than numpy's np.std() does. R uses the same formula as MATLAB, so:

> sd(c(1,2,6))
[1] 2.645751
>>> np.std([1,2,6])
2.1602468994692869

What is an equivalent R function that produces the second (numpy) result?

knowads

1 Answer

sd(c(1,2,6))*sqrt(2/3)
[1] 2.160247

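From that I gather that R uses n-1 in the denominator and numpy uses n when calculating the variance. Multiplying sd() by sqrt((n-1)/n), which is sqrt(2/3) for n = 3, therefore converts R's result into numpy's.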
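To double-check the n-1 versus n relationship outside R, here is a minimal sketch in Python using only the standard library's statistics module rather than numpy itself. The variable names are just for illustration; statistics.stdev divides by n-1 (like R's sd()) and statistics.pstdev divides by n (like np.std() with its default ddof=0).

```python
import math
import statistics

x = [1, 2, 6]
n = len(x)

# Sample standard deviation: variance uses n - 1 in the denominator, as R's sd() does.
sample_sd = statistics.stdev(x)

# Population standard deviation: variance uses n in the denominator,
# as np.std() does with its default ddof=0.
population_sd = statistics.pstdev(x)

# Rescaling the sample SD by sqrt((n - 1) / n) should recover the population SD.
rescaled = sample_sd * math.sqrt((n - 1) / n)

print(sample_sd)
print(population_sd)
print(math.isclose(rescaled, population_sd))
```

Going the other way, np.std(x, ddof=1) should reproduce R's sd(x) on the numpy side, since ddof sets how much is subtracted from n in the denominator.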

Marsenau
  • And of course `n-1` is usually the more sensible choice. – Roland Jul 22 '16 at 19:50
  • Now that I think about it, my data is large enough that sqrt((n-1)/n) is always going to be greater than 0.999, so I think I'll just use sd. – knowads Jul 22 '16 at 21:08