
I have a dataset with 61 entries. What I am trying to do is calculate the variance.

I am doing it in two ways, but the results differ.

The first way is the following:

$\operatorname{Var}(X) = E(X^{2}) - (E X)^{2}$

so

> c = 0

> for( year in females$Salary )
+     c = c + (year^2)
> c/length(females$Salary) - mean(females$Salary)^2
[1] 286682.3

But when I use the built-in function:

> var(females$Salary)
[1] 291460.3

As you can see, the output is different. Why is this happening? Shouldn't they be the same?

LyzandeR
Darlyn

1 Answer

  • var in R computes the sample variance, the unbiased estimator of the variance, which has a denominator of n-1.

  • Your calculation uses the population variance formula, which has a denominator of n.
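To spell out how the two results relate, write $n$ for the number of observations and $\bar{x}$ for their mean (these symbols are my notation, not from the question). This is the standard identity, nothing specific to R:

$$s^{2} = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^{2} = \frac{n}{n-1}\left(\frac{1}{n}\sum_{i=1}^{n}x_i^{2}-\bar{x}^{2}\right)$$

With $n = 61$, multiplying your 286682.3 by $61/60$ gives roughly 291460.3, which is what var returned.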

Check this:

vec <- 1:100

#var uses the sample variance where the denominator is n-1 i.e. 99
var(vec)
#[1] 841.6667
1 / 99 * sum((vec - mean(vec))^2)
#[1] 841.6667

#this is what you use to calculate variance, which uses a denominator of n i.e. 100
mean(vec^2) - mean(vec)^2
#[1] 833.25
1 / 100 * sum((vec - mean(vec))^2)
#[1] 833.25
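If you want the denominator-n version without writing the sum yourself, one option is to rescale the output of var. This is only a sketch: pop_var is a hypothetical helper name, not a base R function, and it assumes x is a numeric vector with no missing values.

```r
# Hypothetical helper (not part of base R): population variance, denominator n
pop_var <- function(x) {
  n <- length(x)
  # var() divides by n - 1, so rescale to get a denominator of n
  var(x) * (n - 1) / n
}

vec <- 1:100
pop_var(vec)

# Should agree with the E(X^2) - (EX)^2 form up to floating-point error
all.equal(pop_var(vec), mean(vec^2) - mean(vec)^2)
```

For large n the two versions are nearly identical, so the choice mostly matters for small samples.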
LyzandeR