
This is my tibble:

df <- tibble(x = c("a", "a", "a", "b", "b", "b"), y = c(1,2,3,4,6,8))
df
# A tibble: 6 x 2
  x         y
  <chr> <dbl>
1 a         1
2 a         2
3 a         3
4 b         4
5 b         6
6 b         8

I want to compute the population sd of y for each group of x.

I tried it with this function:

sqrt((n-1)/n) * sd(x)

Using dplyr, my attempt looked like this:

df %>%
  group_by(x) %>%
  summarise(sd = sqrt((length(df$y)-1)/length(df$y)) * sd(y)) %>%
  ungroup()

# A tibble: 2 x 2
  x        sd
* <chr> <dbl>
1 a     0.913
2 b     1.83 

Of course this is incorrect: length(df$y) ignores the grouping, so it uses n = 6 instead of n = 3. I should get

a = 0.8164966
b = 1.632993
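
As a quick sanity check of the arithmetic, here is a minimal base R sketch that applies the same formula to each group by hand. The variable names (y_a, y_b, pop_sd_a, and so on) are made up for illustration only.

```r
# Values of y for each group, copied from the tibble above
y_a <- c(1, 2, 3)
y_b <- c(4, 6, 8)

# Correct: use the group size (n = 3) in the correction factor
n <- length(y_a)
pop_sd_a <- sqrt((n - 1) / n) * sd(y_a)
pop_sd_b <- sqrt((n - 1) / n) * sd(y_b)

# What the pipeline above effectively did: n = 6 (all rows of df)
wrong_a <- sqrt((6 - 1) / 6) * sd(y_a)
wrong_b <- sqrt((6 - 1) / 6) * sd(y_b)

print(c(a = pop_sd_a, b = pop_sd_b))
print(c(a = wrong_a, b = wrong_b))
```

With n = 3 this reproduces the values I expect, and with n = 6 it reproduces the 0.913 and 1.83 from the dplyr output.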

Edit:

The output should be a tibble with the variable I grouped by and the sd for every group.

Dutschke

1 Answer


You can use the n() function, which returns the number of rows in the current group:

df %>%
    group_by(x) %>%
    summarise(sd = sqrt((n() - 1) / n()) * sd(y)) %>%
    ungroup()
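
If you need this in more than one place, one option is to wrap the formula in a small helper. The sketch below assumes a hypothetical function name, pop_sd, which is not part of dplyr or base R. Because summarise() passes only the current group's values of y to the helper, length() sees the group size here.

```r
library(dplyr)

# Hypothetical helper (not from any package): population sd of a numeric vector
pop_sd <- function(v) {
  m <- length(v)             # size of the vector passed in, i.e. the group size
  sqrt((m - 1) / m) * sd(v)  # rescale the sample sd to the population sd
}

df <- tibble(x = c("a", "a", "a", "b", "b", "b"), y = c(1, 2, 3, 4, 6, 8))

result <- df %>%
    group_by(x) %>%
    summarise(sd = pop_sd(y)) %>%
    ungroup()

result
```

Note that a group with a single row gives NA, because sd() returns NA for a vector of length one.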
Sirius
  • There is one ")" too many in sd(y)), right? But even after removing it I get the following error: Error in is.data.frame(x) : object 'y' not found – Dutschke Mar 05 '21 at 11:30
  • yes, well, it was elsewhere, but fixed now! – Sirius Mar 05 '21 at 11:33
  • It works now, but one more thing: why is it rounding? I get 0.816 and 1.63, but I should get 0.8164966 and 1.632993, right? That's what I get when I do it like this: sqrt((n-1)/n) * sd(x) – Dutschke Mar 05 '21 at 11:35
  • It's rounding to not fill your view with numbers. The numbers in the object are available with full precision. You could add `%>% pull(sd)` to get them as a vector. – Sirius Mar 05 '21 at 13:14
  • See also this page on how to control how many digits are shown with tibbles: https://stackoverflow.com/questions/51621518/how-to-make-tibbles-display-significant-digits – Sirius Mar 05 '21 at 13:15
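
To illustrate the rounding point from the comments, here is a small sketch of two ways to see more digits. It assumes the printing is handled by the pillar package (which tibble uses for display) and that its pillar.sigfig option is available in your installed version; the names result and sds are just illustrative.

```r
library(dplyr)

df <- tibble(x = c("a", "a", "a", "b", "b", "b"), y = c(1, 2, 3, 4, 6, 8))

result <- df %>%
    group_by(x) %>%
    summarise(sd = sqrt((n() - 1) / n()) * sd(y)) %>%
    ungroup()

# Option 1: extract the column as a plain numeric vector;
# the stored values were never rounded, only the tibble display was
sds <- result %>% pull(sd)
print(sds, digits = 7)

# Option 2: ask the tibble printer for more significant digits
# (assumption: the pillar.sigfig option, whose default is 3)
options(pillar.sigfig = 7)
print(result)
```

Either way the underlying numbers are the same; only the display changes.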