I want to use dplyr for some data manipulation. Background: I have a survey weight and a bunch of variables (mostly Likert items). For each variable, I want the frequencies and percentages per category, both with and without the survey weight.
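As an example, let us just use the frequencies for the gender variable. The result should look like this (these numbers come from my full data set, not from the small example data further down):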
gender  freq  freq.weighted
1       292   922.2906
2       279   964.7551
9       6     21.7338
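To make the target concrete, here is a minimal base R sketch of the computation I am after, without dplyr. The data frame `toy` and its values are made up for illustration (they are an assumption, not my real survey data); only the column names follow my data. It also includes the percentage step.

```r
# Toy data (hypothetical values, not the real survey)
toy <- data.frame(
  gender = c("1", "2", "2", "1", "2"),
  survey_weight = c(2.5, 3.0, 1.5, 2.0, 4.0),
  stringsAsFactors = FALSE
)

# Unweighted frequency: number of rows per category
freq <- tapply(toy$survey_weight, toy$gender, length)
# Weighted frequency: sum of the survey weights per category
freq_weighted <- tapply(toy$survey_weight, toy$gender, sum)

target <- data.frame(
  gender = names(freq),
  freq = as.vector(freq),
  freq.weighted = as.vector(freq_weighted)
)

# Percentages per category, without and with the survey weight
target$perc <- target$freq / sum(target$freq) * 100
target$perc.weighted <- target$freq.weighted / sum(target$freq.weighted) * 100

print(target)
```

What I am looking for is the dplyr equivalent of this, wrapped in a function that takes the variable name as an argument.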
I will do this for many variables, so I decided to put the dplyr code inside a function. That way I only have to change the variable name and can type less.
library(dplyr)
library(lazyeval)  # provides interp()

# example data
gender<-c("2","2","1","2","2","2","2","2","2","2","2","2","1","1","2","2","2","2","2","2","1","2","2","2","2","2","2","2","2","2")
survey_weight<-c("2.368456","2.642901","2.926698","3.628653","3.247463","3.698195","2.776772","2.972387","2.686365","2.441820","3.494899","3.133106","3.253514","3.138839","3.430597","3.769577","3.367952","2.265350","2.686365","3.189538","3.029999","3.024567","2.972387","2.730978","4.074495","2.921552","3.769577","2.730978","3.247463","3.230097")
test_dataframe<-data.frame(gender,survey_weight)
# function
weighting.function <- function(dataframe, variable) {
  test_weighted <- dataframe %>%
    group_by_(variable) %>%
    summarise_(interp(freq = count(~weight)),
               interp(freq_weighted = sum(~weight)))
  return(test_weighted)
}
result_dataframe <- weighting.function(test_dataframe, "gender")
# this second step was left out in this example:
# mutate_(perc = interp(~freq / sum(~freq) * 100), perc_weighted = interp(~freq_weighted / sum(~freq_weighted) * 100))
This leads to the following error message:
Error in UseMethod("group_by_") :
no applicable method for 'group_by_' applied to an object of class "formula"
I have tried a lot of different things. First, I used freq=n() to count the frequencies, but I always got an error (I checked that plyr was loaded before dplyr and not afterwards, and it still didn't work).
Any ideas? I read the vignette on standard evaluation, but I always run into problems and have no idea what a solution could look like.