Assume I want to run this:
MS_date <- bind_inpatient_MSW %>%
  arrange(NRIC, APPROVED_DATE_BILL, APPROVED_DATE_FF_APPLICATION) %>%
  group_by(NRIC, APPROVED_DATE_BILL, APPROVED_DATE_FF_APPLICATION) %>%
  mutate(n_marital_status = n_distinct(MARITAL_STATUS, na.rm = TRUE))
and this:
TH_date <- bind_inpatient_MSW %>%
  arrange(NRIC, APPROVED_DATE_BILL) %>%
  group_by(NRIC, APPROVED_DATE_BILL) %>%
  mutate(n_TH = n_distinct(TYPE_OF_HOUSING, na.rm = TRUE))
These two differ only in the variables used to arrange and group the data frame, and in the name and source column of the added variable. I would like to write a user-defined function so that I don't have to write this more than once. I tried the following:
df_date <- function(df, grpby, cntby) {
  dfnew <- df %>%
    arrange(grpby) %>%
    group_by(grpby) %>%
    mutate(n = n_distinct(cntby, na.rm = TRUE))
  return(dfnew)
}
I then called it as
df_date(bind_inpatient_MSW, NRIC, APPROVED_DATE_BILL, APPROVED_DATE_FF_APPLICATION, MARITAL_STATUS)
and
df_date(bind_inpatient_MSW, NRIC, APPROVED_DATE_BILL, TYPE_OF_HOUSING)
Neither call works. How could I solve this?
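One direction that might work is sketched below. It rests on two observations about the attempt: the function declares three parameters but the calls pass four or five arguments, and bare column names handed to a function are not picked up by arrange(), group_by() or mutate() without some form of tidy evaluation. The sketch assumes dplyr 1.0 or later. It takes the grouping columns as a character vector (selected with across(all_of())) and the counted column as a bare name (forwarded with {{ }}). The name argument and the toy data frame are hypothetical and only there for illustration.

```r
library(dplyr)

# Hypothetical rewrite of df_date(), not a definitive implementation.
#   grpby: character vector of columns to sort and group by
#   cntby: bare (unquoted) column whose distinct values are counted
#   name:  name of the new count column (made-up argument)
df_date <- function(df, grpby, cntby, name = "n") {
  df %>%
    arrange(across(all_of(grpby))) %>%
    group_by(across(all_of(grpby))) %>%
    mutate("{name}" := n_distinct({{ cntby }}, na.rm = TRUE))
}

# Made-up data standing in for bind_inpatient_MSW
toy <- tibble(
  NRIC = c("A", "A", "A", "B"),
  APPROVED_DATE_BILL = c("2020-01-01", "2020-01-01", "2020-02-01", "2020-01-01"),
  TYPE_OF_HOUSING = c("HDB", "Private", "HDB", NA)
)

TH_date <- df_date(toy, c("NRIC", "APPROVED_DATE_BILL"),
                   TYPE_OF_HOUSING, name = "n_TH")
```

With the real data, the first case would presumably become df_date(bind_inpatient_MSW, c("NRIC", "APPROVED_DATE_BILL", "APPROVED_DATE_FF_APPLICATION"), MARITAL_STATUS, name = "n_marital_status"). Like the original pipelines, the result stays grouped, so an ungroup() may be needed afterwards.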