I am new to R and have run into a problem. I am analysing a data frame, and the question I am working on has three possible answers. I now want to obtain the share of each answer in my data frame. This is my code so far:

BES%>%
  group_by(y08) %>%
  summarise(count = n())

and it yields

  y08                         count
  <dbl+lbl>                   <int>
1 1 [Yes: trade union]          261
2 2 [Yes: staff association]     25
3 3 [No]                       1908

How can I obtain the absolute number of observations (the sum of the counts) and, based on that, the share of each option? I'd like to create a stratified sample based on these shares.

For the stratified sample, this is my code at the moment:

str_samp <-
  BES %>%
  mutate(strata = sample_size * share) %>%
  group_by(y08) %>%
  sample_n(strata) %>%
  ungroup()

sample_size is already defined, but I am struggling to define the share variable.

Thank you for your help!

  • Since your question got closed, please edit it with the respective demanded information (see the link I provided). Then we can reopen the question. Unless the linked question above already answers your question. – deschen Feb 24 '22 at 06:44

2 Answers

To get the share/proportions just do this:

BES%>%
  group_by(y08) %>%
  summarise(count = n()) %>%
  mutate(share = count/sum(count))
deschen
  • Thanks a lot for the quick response. I tried adding this to my stratified sampling code, but it always yields an error saying the sample size must be 3 or less. I understand this happens because I only have three strata, but I want to draw random samples within these strata. What am I doing wrong here? `BES %>% group_by(y08) %>% summarise(count = n()) %>% mutate(share = count/sum(count), unitss = round(share * sample_size)) %>% sample_n(unitss) %>% ungroup()` – CommanderKrieger Feb 23 '22 at 16:22
  • Please share a minimal reproducible example: https://stackoverflow.com/help/minimal-reproducible-example Otherwise it's hard to help you. – deschen Feb 23 '22 at 16:53
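
A likely cause of the error described in the comment above: `summarise()` collapses the data to one row per stratum, so `sample_n()` only has three rows left to draw from. Below is a minimal sketch of one way to sample within the strata instead. The toy `BES` is rebuilt from the counts shown in the question, and `sample_size = 200` and the column name `units` are hypothetical choices for illustration, not taken from the original data.

```r
library(dplyr)

set.seed(1)

# Toy stand-in for BES (assumption): one row per respondent,
# using the three counts reported in the question
BES <- tibble(y08 = rep(c(1, 2, 3), times = c(261, 25, 1908)))
sample_size <- 200  # hypothetical total sample size

str_samp <- BES %>%
  group_by(y08) %>%
  # n() is the stratum size; nrow(BES) is the overall number of rows,
  # so n() / nrow(BES) is the share of the stratum
  mutate(units = round(sample_size * n() / nrow(BES))) %>%
  # keep `units` randomly chosen rows inside each stratum
  filter(row_number() %in% sample.int(n(), size = first(units))) %>%
  ungroup()

# Rows drawn per stratum
strata_sizes <- count(str_samp, y08)
strata_sizes
```

Because the per-stratum sizes are rounded, their sum can differ slightly from `sample_size` for other counts, so it is worth checking `nrow(str_samp)` afterwards.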
0
BES %>% count(y08) %>% mutate(share=n/sum(n))
langtang
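
To illustrate both answers on concrete numbers, here is a small sketch that also keeps the overall total the question asks about. The toy `BES` is an assumption rebuilt from the counts in the question, and the column name `total` is introduced only for this example.

```r
library(dplyr)

# Toy stand-in for BES (assumption): one row per respondent
BES <- tibble(y08 = rep(c(1, 2, 3), times = c(261, 25, 1908)))

shares <- BES %>%
  count(y08) %>%             # shorthand for group_by(y08) %>% summarise(n = n())
  mutate(total = sum(n),     # absolute number of observations
         share = n / total)  # proportion of each answer

shares
```

The `share` column sums to 1, and `total` repeats the overall number of rows on every line, which makes it easy to reuse in later steps.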