24

How can I calculate the population variance of my data using R?

I read that there is a package called popvar, but I have version 0.99.892 and I can't find the package.

YazminRios
  • 1
    What software do you have Version 0.99.892 of? The current R version is 3.3.0. Maybe RStudio? Don't confuse R and RStudio - RStudio is just a tool for writing R code. – Gregor Thomas Jun 09 '16 at 18:12
  • 3
    R's `var` function divides by n-1 by default. Multiplying the output of `var` by (n-1)/n will give you what you want. – Dason Jun 09 '16 at 18:14

4 Answers

33

The var() function in base R calculates the sample variance, which differs from the population variance by a factor of n / (n - 1). So one way to calculate the population variance is var(myVector) * (n - 1) / n, where n is the length of the vector. Here is an example:

x <- 1:10
var(x) * 9 / 10
[1] 8.25

From the definition of population variance:

sum((x - mean(x))^2) / 10
[1] 8.25 
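
For reference, the relationship between the two definitions can be written out as below. This is only a sketch of the standard formulas, with $\bar{x}$ denoting the mean of the data and $n$ its length; the symbols $s^2$ and $\sigma^2$ are notation introduced here, not taken from the answer.

$$
s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2,
\qquad
\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2 = \frac{n-1}{n}\,s^2
$$

For x <- 1:10 the sum of squared deviations is 82.5, so $s^2 = 82.5/9 \approx 9.1667$ and $\sigma^2 = 82.5/10 = 8.25$, matching both results above.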
Psidom
9

You already have a great answer, but I'd like to show that you can easily make your own convenience functions. It is surprising that a population variance/standard deviation function is not available in base R, even though one is available in Excel/Calc and other software. It wouldn't be difficult to provide such a function. It could be named sdp or sd.p, or be invoked as sd(x, pop = TRUE).

Here is a basic version of population variance with no type-checking:

  x <- 1:10
  varp <- function(x) mean((x-mean(x))^2)
  varp(x)
  ## [1] 8.25
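
If some input checking is wanted, the one-liner above could be extended along the following lines. This is only a sketch: the names varp_checked and sdp and the na.rm argument are hypothetical choices made for illustration, not functions that exist in base R.

```r
# Hypothetical helpers; the names and the na.rm argument are illustrative only.

# Population variance with basic input checks.
varp_checked <- function(x, na.rm = FALSE) {
  if (!is.numeric(x)) stop("x must be numeric")
  if (na.rm) x <- x[!is.na(x)]           # drop missing values on request
  if (length(x) == 0) return(NA_real_)   # nothing left to average
  mean((x - mean(x))^2)                  # divide by n, not n - 1
}

# Population standard deviation built on top of it.
sdp <- function(x, na.rm = FALSE) sqrt(varp_checked(x, na.rm = na.rm))

varp_checked(1:10)                 # 8.25, same as varp(x) above
sdp(c(2, 4, 4, 4, 5, 5, 7, 9))     # 2
varp_checked(c(1:10, NA), na.rm = TRUE)
```

Without na.rm = TRUE a missing value propagates and the result is NA, which mirrors how mean() behaves.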

To scale up to many columns at once, if speed is an issue, colSums and/or colMeans may be used (see: https://rdrr.io/r/base/colSums.html)
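
As a rough illustration of the colMeans idea, the population variance of every column of a numeric matrix can be computed in one pass. The helper name col_varp and the example matrix are made up for this sketch.

```r
# Hypothetical helper: population variance of each column of a numeric matrix.
col_varp <- function(m) {
  m <- as.matrix(m)
  centered <- sweep(m, 2, colMeans(m))  # subtract each column's mean (FUN defaults to "-")
  colMeans(centered^2)                  # mean squared deviation per column
}

# Made-up example data: two columns of length 10.
m <- cbind(a = 1:10, b = c(1, 3, 5, 7, 14, 1, 3, 5, 7, 14))
col_varp(m)   # a = 8.25, b = 20
```

This should give the same numbers as apply(m, 2, function(col) mean((col - mean(col))^2)), while avoiding a per-column function call.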

PatrickT
  • Also surprising that there is no skewness (with/without adjustment) or kurtosis or raw moments in base `R`... – PatrickT Oct 21 '17 at 16:21
2

You can find details on the PopVar package here: https://cran.r-project.org/web/packages/PopVar/index.html. You can install it with the command install.packages("PopVar"). Note that the name is case-sensitive (capital P, capital V).

Mekki MacAulay
  • It will be helpful if you explain how to use it. I tried **PopVar(c(1,2,3))** and got error: *Error: could not find function "PopVar"*. – Mohammed H Jul 14 '17 at 03:52
  • 1
    @HabeebPerwad You must first load the package `PopVar` to use the function `PopVar::PopVar` – Erdogan CEVHER Apr 17 '18 at 09:41
  • @ErdoganCEVHER `PopVar::` has only two tab completions, `PopVar::pop.predict` and `PopVar::x.val`. I couldn't find `PopVar::PopVar`. – Mohammed H Apr 17 '18 at 16:38
2

You can calculate the population variance with the following function:

pvar <- function(x) {
  sum((x - mean(x))^2) / length(x)
}

where x is a numeric vector that keeps the data of your population. For example:

> x <- c(1, 3, 5, 7, 14)
> pvar(x)
[1] 20
tzabal