dfrm$dc <- c("dog", "cat", "rabbit")[ findInterval(dfrm$b, c(1, 2.5, 5.5, Inf)) ]
The `findInterval` approach will be much faster than nested `ifelse` strategies, and I'm guessing very much faster than a function that loops over un-nested `if` statements. Those of us working with bigger data do notice the difference when we pick inefficient algorithms.
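
As a rough illustration of the speed claim, here is a minimal sketch that recodes the same vector both ways and times each. The vector `b`, its size, and the helper names `recode_interval` and `recode_ifelse` are made up for this example. Timings depend on the machine and R version, so none are quoted.

```r
# Hypothetical data: one million values between 1 and 10, so every value
# falls inside the breaks and findInterval() never returns 0.
set.seed(42)
b <- runif(1e6, min = 1, max = 10)

# Lookup-by-interval version, same breaks and labels as the one-liner.
recode_interval <- function(x) {
  c("dog", "cat", "rabbit")[findInterval(x, c(1, 2.5, 5.5, Inf))]
}

# Equivalent nested ifelse() version (assumes all values are >= 1).
recode_ifelse <- function(x) {
  ifelse(x < 2.5, "dog", ifelse(x < 5.5, "cat", "rabbit"))
}

# Both versions must agree before the timings mean anything.
stopifnot(identical(recode_interval(b), recode_ifelse(b)))

print(system.time(recode_interval(b)))
print(system.time(recode_ifelse(b)))
```

For more careful measurements, repeat each call several times rather than relying on a single `system.time` run.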
This didn't actually address the request, but I don't think new users of R will always know the most expressive or efficient approach to a problem. A request to "use IF" sounded like an effort to translate the coding approach typical of the two major macro statistical processors, SPSS and SAS. The R `if` control structure is not generally an efficient approach to recoding a column, since its condition must be a single value: older versions of R evaluated only the first element of a longer vector (with a warning), and R 4.2.0 and later signal an error. On its own it doesn't process a column, whereas the `ifelse` function will do so. The `cut` function might have been used here (with appropriate `breaks` and `labels` parameters), although it would have delivered a `factor` value instead of a character value. The `findInterval` approach was chosen for its ability to return multiple levels (which a single `ifelse` cannot). I think chaining or nesting `ifelse` calls quickly becomes ugly and confusing after about 2 or 3 levels of nesting.
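
To make the comparison concrete, here is a small sketch of the alternatives discussed in this answer, run on a made-up vector `b` (the values are an assumption chosen only for illustration). It shows `ifelse` working elementwise with two outcomes, and `cut` producing the same labels as the `findInterval` lookup, but as a factor.

```r
# Hypothetical input; all values are >= 1 so each one lands in an interval.
b <- c(1, 2, 3, 5, 6, 8)

# ifelse() is vectorised: one result per element, but only two outcomes.
two_way <- ifelse(b < 2.5, "dog", "other")

# findInterval() returns an interval index, used here to pick a label.
by_interval <- c("dog", "cat", "rabbit")[findInterval(b, c(1, 2.5, 5.5, Inf))]

# cut() with right = FALSE uses the same left-closed intervals
# [1, 2.5), [2.5, 5.5), [5.5, Inf), but returns a factor.
by_cut <- cut(b, breaks = c(1, 2.5, 5.5, Inf),
              labels = c("dog", "cat", "rabbit"), right = FALSE)

print(is.factor(by_cut))
print(identical(as.character(by_cut), by_interval))
```

Wrapping the `cut` call in `as.character()` gives a character column if that is what is wanted.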