Assume I want to run this:

MS_date<-bind_inpatient_MSW %>% 
  arrange(NRIC,
          APPROVED_DATE_BILL,APPROVED_DATE_FF_APPLICATION) %>%
  group_by(NRIC,
           APPROVED_DATE_BILL,APPROVED_DATE_FF_APPLICATION) %>%
  mutate(n_marital_status=n_distinct(MARITAL_STATUS,na.rm=TRUE))

and this

TH_date<-bind_inpatient_MSW %>% 
  arrange(NRIC,
          APPROVED_DATE_BILL) %>%
  group_by(NRIC,
           APPROVED_DATE_BILL) %>%
  mutate(n_TH=n_distinct(TYPE_OF_HOUSING,na.rm=TRUE))

These two differ in the variables used to arrange and group the data frame, and in the variable that is added. I would like to write a user-defined function so that I don't have to write this more than once. I tried the following:

df_date<-function(df,grpby,cntby){
  dfnew<-df %>%
    arrange(grpby) %>%
    group_by(grpby) %>%
    mutate(n=n_distinct(cntby,na.rm=TRUE))
  return(dfnew)
}

I then called

df_date(bind_inpatient_MSW,NRIC,APPROVED_DATE_BILL,APPROVED_DATE_FF_APPLICATION,MARITAL_STATUS)

and

df_date(bind_inpatient_MSW,NRIC,APPROVED_DATE_BILL,TYPE_OF_HOUSING)

Neither call works. How can I solve this?

lmo
HNSKD
  • Check if the `cntby` exists, if it doesn't then assign `grpby` to `cntby`, and use `group_by_` and `arrange_`, to pass string as variable names. Possible duplicate of http://stackoverflow.com/questions/7964830/test-if-an-argument-of-a-function-is-set-or-not-in-r – zx8754 Mar 23 '17 at 08:51

1 Answer

You can try something like:

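library(dplyr)

# `group` and `ctnby` are column names passed as strings.
# Note: group_by_() is deprecated in current dplyr releases.
fun <- function(dat, group, ctnby) {
  dat %>%
    group_by_(group) %>%
    # do() runs the anonymous function once per group;
    # get() looks up the counted column by its name.
    do((function(., ctnby) {
      with(., data.frame(n = n_distinct(get(ctnby))))
    })(., ctnby))
}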

fun(mtcars,"cyl","hp")

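This takes the column names as strings: group_by_ groups by the named column, and do() with get() looks up the counted column inside each group, so the function never has to evaluate bare column names. Note that it returns one summary row per group rather than adding a column to the original rows.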
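With a more recent dplyr (1.0.0 or later is assumed here), the same idea can be sketched without the underscore verbs. The version below is only an illustration: it keeps the name df_date from the question, takes the grouping columns as a character vector and the counted column as a single string, and adds a hypothetical new_col argument for the name of the added column. Like the code in the question, it uses mutate, so every original row is kept and the result stays grouped.

```r
library(dplyr)

# Sketch only: `grpby` is a character vector of column names,
# `cntby` is one column name as a string, and `new_col` is a
# hypothetical argument naming the added column.
df_date <- function(df, grpby, cntby, new_col = paste0("n_", cntby)) {
  df %>%
    # all_of() selects the columns named in the character vector
    arrange(across(all_of(grpby))) %>%
    group_by(across(all_of(grpby))) %>%
    # .data[[cntby]] looks up the counted column by its string name;
    # !!new_col := sets the name of the new column from a string
    mutate(!!new_col := n_distinct(.data[[cntby]], na.rm = TRUE))
}

res <- df_date(mtcars, c("cyl", "gear"), "hp")
```

Under these assumptions, the two calls from the question would become df_date(bind_inpatient_MSW, c("NRIC", "APPROVED_DATE_BILL", "APPROVED_DATE_FF_APPLICATION"), "MARITAL_STATUS") and df_date(bind_inpatient_MSW, c("NRIC", "APPROVED_DATE_BILL"), "TYPE_OF_HOUSING", new_col = "n_TH").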

count