
I'm looking for a way to pass exactly one of three possible choices to a function and, based on that choice, create a single new variable in the dataset. Let's say we have the following dataset:

set.seed(10)
test <- data.frame(time_stamp = sample(seq(as.Date('1999/01/01'), as.Date('2012/01/01'), by="day"), 12))
test
#    time_stamp
# 1  2000-05-05
# 2  2009-03-09
# 3  2008-04-24
# 4  2011-03-22
# 5  2003-05-27
# 6  2003-01-01
# 7  2008-10-22
# 8  2003-10-13
# 9  2011-02-26
# 10 2008-08-27
# 11 2011-12-30
# 12 2001-07-18

My desired output when I run my function is the following:

test_fun(type = "halfs") 
#or more simply
test_fun(halfs)
#    time_stamp half_var
# 1  2000-05-05  H1 2000
# 2  2009-03-09  H1 2009
# 3  2008-04-24  H1 2008
# 4  2011-03-22  H1 2011
# 5  2003-05-27  H1 2003
# 6  2003-01-01  H1 2003
# 7  2008-10-22  H2 2008
# 8  2003-10-13  H2 2003
# 9  2011-02-26  H1 2011
# 10 2008-08-27  H2 2008
# 11 2011-12-30  H2 2011
# 12 2001-07-18  H2 2001

Based on the argument chosen, I run an if statement within a pipe. I thought I could do this by putting {} around the conditional statement, as mentioned here, but I can't figure it out. Here's the function:

    test_fun <- function(type = c("halfs", "quarts", "other")) {
      test %>% {
        if (type == "halfs") {
          mutate(half_var = ifelse(month(time_stamp) <= 6, paste('H1', year(time_stamp)), paste('H2', year(time_stamp))))
        }  else if (type == "quarts") {
          mutate(quarts_var = case_when(month(time_stamp) <= 3 ~ paste('q1', year(time_stamp)), 
                                        month(time_stamp) > 3 & month(time_stamp) <= 6 ~ paste('q2', year(time_stamp)),
                                        month(time_stamp) > 6 & month(time_stamp) <= 9 ~ paste('q3', year(time_stamp)),
                                        month(time_stamp) > 9 ~ paste('q4', year(time_stamp))))
        }  else (type == "other") {
          mutate(other = ifelse(month(time_stamp) <= 6, paste('H1', year(time_stamp)), paste('H2', year(time_stamp))))
        }
      }

}

I'm getting an error about unexpected brackets, but I think the problem has to do with using a conditional if within a pipe (all brackets are closed).

Another approach might be to use optional arguments, as suggested here: test_fun <- function(halfs, quarts = NULL, other = NULL). However, that signature implies that halfs must be supplied, which is not the case. What I really want is something like test_fun <- function(halfs = NULL, quarts = NULL, other = NULL) or test_fun <- function(...), which I can't get to work. A way around that might be to supply the data as an argument, test_fun <- function(test, halfs = NULL, quarts = NULL, other = NULL), but I can't figure that out either.

Any suggestions would be great.

user63230

2 Answers

The syntax error is real and must be addressed first. else (type == "other") is not valid R syntax, because a bare else does not take a condition. I think you meant else if (type == "other"). Since the if was missing, the parser did not expect the bracket that followed.

Also, when you pipe into a code block, the left-hand side is not inserted as the first argument for you; you need to use . to place it explicitly. Your mutates inside the {} should use mutate(., half_var=...)

library(dplyr)      # mutate(), case_when(), %>%
library(lubridate)  # month(), year()

test_fun <- function(type = c("halfs", "quarts", "other")) {
  # Reduce the default vector of choices to a single validated string,
  # so that each if () below tests a length-one condition
  type <- match.arg(type)
  test %>% {
    if (type == "halfs") {
      mutate(., half_var = ifelse(month(time_stamp) <= 6, paste('H1', year(time_stamp)), paste('H2', year(time_stamp))))
    }  else if (type == "quarts") {
      mutate(., quarts_var = case_when(month(time_stamp) <= 3 ~ paste('q1', year(time_stamp)), 
                                    month(time_stamp) > 3 & month(time_stamp) <= 6 ~ paste('q2', year(time_stamp)),
                                    month(time_stamp) > 6 & month(time_stamp) <= 9 ~ paste('q3', year(time_stamp)),
                                    month(time_stamp) > 9 ~ paste('q4', year(time_stamp))))
    }  else if (type == "other") {
      mutate(., other = ifelse(month(time_stamp) <= 6, paste('H1', year(time_stamp)), paste('H2', year(time_stamp))))
    }
  } 
}
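
To see the . placeholder in isolation, here is a minimal sketch that pipes a small data frame into a braced block. The names df and add_col are made up for illustration, and base transform() stands in for mutate() so that only magrittr is needed.

```r
library(magrittr)

# Hypothetical toy data; not the dataset from the question
df <- data.frame(x = 1:3)

# Hypothetical helper: picks one of two transformations inside a pipe
add_col <- function(type) {
  df %>% {
    # Inside braces the left-hand side is NOT inserted automatically,
    # so it has to be referenced explicitly as `.`
    if (type == "double") transform(., y = x * 2)
    else transform(., y = x + 1)
  }
}

add_col("double")$y  # 2 4 6
add_col("plus")$y    # 2 3 4
```

Dropping the . from either transform() call leaves the function with no data argument at all, which is the same problem the bare mutate(half_var = ...) calls ran into.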
MrFlick

These calculations are already available directly in yearmon and yearqtr in the zoo package so:

library(dplyr)  # mutate(), %>%
library(zoo)

test %>% 
  mutate(yearmon = as.yearmon(time_stamp),
         yearqtr = as.yearqtr(time_stamp),
         yearhalf = paste0(as.integer(yearmon), " H", (cycle(yearmon) > 6) + 1))

giving:

   time_stamp  yearmon yearqtr yearhalf
1  2005-08-07 Aug 2005 2005 Q3  2005 H2
2  2002-12-27 Dec 2002 2002 Q4  2002 H2
3  2004-07-19 Jul 2004 2004 Q3  2004 H2
4  2008-01-03 Jan 2008 2008 Q1  2008 H1
5  2000-02-08 Feb 2000 2000 Q1  2000 H1
6  2001-12-05 Dec 2001 2001 Q4  2001 H2
7  2002-07-26 Jul 2002 2002 Q3  2002 H2
8  2002-07-15 Jul 2002 2002 Q3  2002 H2
9  2006-12-29 Dec 2006 2006 Q4  2006 H2
10 2004-07-29 Jul 2004 2004 Q3  2004 H2
11 2007-06-16 Jun 2007 2007 Q2  2007 H1
12 2006-05-13 May 2006 2006 Q2  2006 H1
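
To make the yearhalf expression easier to follow, this small sketch breaks it into pieces for a single date. The date is picked arbitrarily for illustration and the variable names are not part of the code above.

```r
library(zoo)

d  <- as.Date("2008-10-22")   # arbitrary example date
ym <- as.yearmon(d)           # stored internally as year + (month - 1) / 12

yr   <- as.integer(ym)        # truncates to the year: 2008
mon  <- cycle(ym)             # position of the month within the year: 10
half <- (mon > 6) + 1         # FALSE + 1 = 1 for Jan-Jun, TRUE + 1 = 2 for Jul-Dec

paste0(yr, " H", half)        # "2008 H2"
format(as.yearqtr(d))         # "2008 Q4"
```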

Function

It is not clear that a function is really needed for this, but for completeness here is one:

test_fun <- function(x, type = c("month", "quarter", "half")) {
  type <- match.arg(type)
  ym <- as.yearmon(x)
  if (type == "month") ym
  else if (type == "quarter") as.yearqtr(x)
  else paste0(as.integer(ym), " H", (cycle(ym) > 6) + 1)
}

library(zoo)

test %>% 
  mutate(yearmonth = test_fun(time_stamp, "month"),
         yearqtr = test_fun(time_stamp, "quarter"),
         yearhalf = test_fun(time_stamp, "half"))
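
The match.arg(type) line is what turns the vector of choices in the signature into a single value. A short sketch of its behaviour, using a throwaway function name (pick_type) that is only for illustration:

```r
# Hypothetical function that does nothing but resolve its argument
pick_type <- function(type = c("month", "quarter", "half")) {
  match.arg(type)
}

pick_type()        # "month": with no argument, the first choice is used
pick_type("half")  # "half"
pick_type("q")     # "quarter": unambiguous partial matches are accepted

# Anything that is not one of the choices raises an error
res <- tryCatch(pick_type("year"), error = function(e) "error")
res                # "error"
```

This is also why the if/else chain in test_fun can end in a plain else: by that point type is guaranteed to be one of the three listed strings.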

Function with one argument

The subject line of the question asks for a function of one argument. I am not sure that is a good idea, since it implies hard-coding which column to use, but if you really want to do it anyway, the following shows how. The function actually takes a second argument in case you change your mind and want to specify the time_stamp column; if that argument is not specified, it defaults appropriately, provided the function is called within a mutate.

test_fun2 <- function(type = c("month", "quarter", "half"),
    x = parent.frame()$.data$time_stamp) {
  type <- match.arg(type)
  ym <- as.yearmon(x)
  if (type == "month") ym
  else if (type == "quarter") as.yearqtr(x)
  else paste0(as.integer(ym), " H", (cycle(ym) > 6) + 1)
}

library(zoo)

test %>% 
  mutate(month = test_fun2("month"),
         quarter = test_fun2("quarter"),
         halfs = test_fun2("half"))

Function which returns a subset of the three

If what you meant was that you want test_fun3 to return up to 3 columns, then:

test_fun3 <- function(x, month = FALSE, quarter = FALSE, half = FALSE) {
  ym <- as.yearmon(x)
  data <- data.frame(yearmon = ym,
    quarter = as.yearqtr(x),
    half = paste0(as.integer(ym), " H", (cycle(ym) > 6) + 1))
  data[c(month, quarter, half)]
}

test %>% 
  bind_cols(test_fun3(.$time_stamp, TRUE, TRUE))
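
The column selection in test_fun3 relies on indexing a data frame with a logical vector. A base R sketch of just that step, with made-up column values:

```r
# Hypothetical stand-in for the three-column data frame built in test_fun3
data <- data.frame(yearmon = c("Jan 2008", "Jul 2004"),
                   quarter = c("2008 Q1", "2004 Q3"),
                   half    = c("2008 H1", "2004 H2"))

# One logical per column: TRUE keeps the column, FALSE drops it
names(data[c(TRUE, TRUE, FALSE)])   # "yearmon" "quarter"
names(data[c(FALSE, FALSE, TRUE)])  # "half"
```

In the call above, test_fun3(.$time_stamp, TRUE, TRUE) leaves half at its default of FALSE, so only the yearmon and quarter columns are bound onto test.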
G. Grothendieck