Why do I have to set as.Date origin again after using ifelse? Is there a better way?

Question

The following function does work, but the last as.Date part was more or less a result of trial and error that I do not fully understand.

 ### This function creates a real date column out of year / period values that are
 ### saved in separate columns, plus it handles a 13th period in case of overlapping
 ### period terminology. Turns quarters into months.

realDate <- function(table, year = "year_col", period = "period_col") {
  if (is.character(table) == TRUE) {
    dframe <- get(table)
  } else {
    dframe <- table
  }

  x <- expression({resDate <- with(dframe,
                      as.Date(paste(get(year), "-",
                                    ifelse(get(period) > 9, get(period),
                                           paste("0", get(period), sep = "")),
                                    "-01", sep = "")))
  })

  y <- expression({resDate <- with(dframe, as.Date(paste(get(year) + 1, "-", "01", "-01", sep = "")))})

  #### I do not get this? Why do I have to do this?
  a <- ifelse(get(period) == 13, eval(y), eval(x))
  a <- as.Date(a, origin = "1970-01-01")

  return(a)
}

Instead, I tried to do it like this (because it was more intuitive to me):

{ ....
ifelse(get(period) == 13,eval(y),eval(x))
return(resDate)
}

This returned the correct values whenever the condition was FALSE (no) but returned NA whenever the condition was TRUE (yes). Why is that? And if I use the function above, why do I have to define the origin again? Why do I even have to call as.Date again?

EDIT:

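 a <- rep(2002:2010, 2)
 b <- rep(1:13, 2)
 # cbind() would return a matrix, and names() does not set its column names,
 # so build the data.frame directly
 d <- data.frame(year_col = a, period_col = b[seq_along(a)])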

P.S.: I found this thread on vectorized ifelse.

Giving an idea of the input would be nice, so we can try out some things. — Joris Meys, Dec 06 '10 at 11:44
Does it work correctly if you do `xx<-eval(x);yy<-eval(y);ifelse(get(period) == 13,yy,xx)`? — Marek, Dec 06 '10 at 12:10
Sorry for that. I edited my post; you can cbind any information to data.frame d as long as the two cols remain. — Matt Bannert, Dec 06 '10 at 12:16

score 2 · Accepted Answer · answered Dec 06 '10 at 12:24

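Your construct is "interesting", to say the least. To start with, neither x nor y gives visible output. I wonder why you use an assignment inside the expressions you pass to eval(). It gives you a resDate vector that is simply the result of whichever expression was evaluated last. That does not depend on the condition; it is the last one written in the call (eval(x) in your case). Both expressions are evaluated in full before ifelse picks values from them. That is also why your second version returns NA where the period is 13: resDate holds the result of x, and as.Date cannot parse a month of 13.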
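To illustrate the point about assignments inside eval(), here is a minimal sketch. The function `demo` and the variable `res` are made up for this illustration and are not part of the original code. It shows that the side-effect variable ends up holding whatever was evaluated last, while only the return value of ifelse depends on the condition:

```r
# Hypothetical toy function mirroring the structure of the question's code.
demo <- function() {
  x <- expression({res <- "from x"})
  y <- expression({res <- "from y"})
  # ifelse evaluates eval(y) (the "yes" argument) first, then eval(x) (the "no" argument)
  out <- ifelse(c(TRUE, FALSE), eval(y), eval(x))
  list(out = out, res = res)
}

r <- demo()
r$out  # "from y" "from x"  -> picked element by element according to the condition
r$res  # "from x"           -> simply the last assignment that ran
```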

Plus, the output you get is the numeric representation of your dates, not the Date object itself; that is in resDate. I guess that ifelse cannot determine the class of the output vector because you use eval() inside it. I'm surprised you get output at all; in fact, you're effectively relying on something that could be called a "bug" in R (Microsoft would call it a feature :-) ).
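For what it's worth, the loss of the Date class can be reproduced without any eval(): as far as the documentation of ifelse goes, the result takes its attributes (including the class) from the test argument, not from yes or no. A small sketch with made-up dates, showing why the second as.Date call with an origin is needed:

```r
d <- as.Date(c("2006-01-01", "2007-01-01"))

# The logical test carries no Date class, so the result is a plain number of days
res <- ifelse(c(TRUE, FALSE), d + 1, d)
class(res)  # "numeric"

# Converting back requires stating what day 0 is
fixed <- as.Date(res, origin = "1970-01-01")
fixed       # "2006-01-02" "2007-01-01"
```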

Your mistake is in your ifelse: get(period) looks for the column in your workspace rather than in the data frame, and it doesn't exist there. It should be get(period, dframe). Then it works. Presumably the only reason it works on your computer is that you have a period_col object in your workspace. A classic problem when debugging.
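A short sketch of the difference, using a made-up two-row data frame. The `[[` form is shown only as a common alternative for looking up a column by name; it is not what the question used:

```r
dframe <- data.frame(period_col = c(1, 13))
period <- "period_col"

# Without a second argument, get() searches the calling environment and its
# parents; in a fresh session there is no object called period_col there.
tryCatch(get(period), error = function(e) conditionMessage(e))

# Pointing get() at the data frame finds the column
get(period, dframe)

# Alternative (assumption: equivalent for this purpose): index the column by name
dframe[[period]]
```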

In any case, I'd make it:

realDate <- function (table,year="year_col",period="period_col"){
  if (is.character(table)){ # is.character(table) returns a boolean already.
      dframe <- get(table)
  } else {
      dframe <- table
  }
  year <- get(year,dframe)
  period <- get(period,dframe)

  year[period==13] <- year[period==13]+1
  period[period==13] <- 1

  as.Date(paste(year,"-",period,"-01",sep=""))
}

This is quite a bit faster than your own, has fewer pitfalls and conversions, and is more the R way of doing it. You could replace the year[...] and period[...] assignments with ifelse constructs, but using indices is generally faster.
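For comparison, the ifelse variant mentioned above might look like the sketch below. The name `realDateIfelse` and the three-row data frame are made up for illustration. Note that ifelse is applied to the plain numeric year and period columns here, before the conversion to Date, so no class is lost:

```r
# Hypothetical ifelse-based variant; takes a data frame directly.
realDateIfelse <- function(dframe, year = "year_col", period = "period_col") {
  yr  <- dframe[[year]]
  per <- dframe[[period]]
  is13 <- per == 13
  # Period 13 becomes period 1 of the following year
  as.Date(paste(ifelse(is13, yr + 1, yr), ifelse(is13, 1, per), "01", sep = "-"))
}

small <- data.frame(year_col = c(2006, 2006, 2007), period_col = c(12, 13, 1))
dates <- realDateIfelse(small)
dates  # "2006-12-01" "2007-01-01" "2007-01-01"
```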

EDIT :

This is easier for the data generation:

dframe <- data.frame(
    year_col= rep(2006:2007,each=13),
    period_col = rep(1:13,2)
)

realDate(dframe)
 [1] "2006-01-01" "2006-02-01" "2006-03-01" "2006-04-01" "2006-05-01" "2006-06-01" "2006-07-01" "2006-08-01" "2006-09-01"
[10] "2006-10-01" "2006-11-01" "2006-12-01" "2007-01-01" "2007-01-01" "2007-02-01" "2007-03-01" "2007-04-01" "2007-05-01"
[19] "2007-06-01" "2007-07-01" "2007-08-01" "2007-09-01" "2007-10-01" "2007-11-01" "2007-12-01" "2008-01-01"

Feel free to call it weird, annoying, stoopid ... you may choose. Though interesting applies too, because I learned a lot ;). I tend to forget that cols are not the only thing you can index in R. Thx as well for pointing out the classic debugging mistake, of course you were right with that too. Ah, and that # was nice too ... you made my day. Really had a major problem in that condition... why does get(period) not work without dframe? — Matt Bannert, Dec 06 '10 at 12:57
@ran2 : it's the same as calling period_col in your workspace instead of dframe$period_col . — Joris Meys, Dec 06 '10 at 13:02
+ I'd like to point out that the leading zero (for the periods) was not necessary when using as.Date(). — Matt Bannert, Dec 06 '10 at 13:03
