Possible Duplicate:
R if with vectorized statements

There are some similar questions here on how best to vectorize functions, but I can't yet find an example that applies an if-type function by row to a data frame.

Given a data frame, df, with a column "Year" that holds year values from 1912 to 2010, I simply want to test whether each year falls before a test year (e.g. 1948) and assign the character value "Yes" or "No" in another column. Should be easy...

Currently, I have written the code as follows:

i <- 1
# Use <= so the final row is tested as well
while (i <= nrow(df)) {
    if (df$Year[i] < 1948) {
        df$Test[i] <- "Yes"
    } else {
        df$Test[i] <- "No"
    }
    i <- i + 1
}

The above works, but is slow with large datasets, and I know that there must be a more "elegant" solution for this in R. Would a better approach use apply? Or is there something even simpler?

Thanks!

jsnider

2 Answers

ifelse makes more sense here.

df$Test <- ifelse(df$Year < 1948, "Yes", "No")

ifelse is a vectorized version of the if/else construct. When using R it almost always makes more sense to go with a vectorized solution if it's possible.
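
As a rough sketch of what the vectorized call does, the snippet below builds a small made-up data frame (the Year values are illustrative assumptions, not the asker's data) and checks that ifelse gives the same result as an explicit row-by-row loop.

```r
# Hypothetical sample data; the real df would hold years from 1912 to 2010
df <- data.frame(Year = c(1912, 1947, 1948, 2010))

# Vectorized: the comparison yields a logical vector, and ifelse picks
# "Yes" or "No" element by element
df$Test <- ifelse(df$Year < 1948, "Yes", "No")

# Explicit loop over the same rows, for comparison only
loop_result <- character(nrow(df))
for (i in seq_len(nrow(df))) {
  loop_result[i] <- if (df$Year[i] < 1948) "Yes" else "No"
}

# Both approaches should agree
print(identical(df$Test, loop_result))
print(df)
```

The comparison df$Year < 1948 is evaluated once for the whole column, so no per-row indexing or assignment into the data frame is needed.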

Dason

You want ifelse() instead. It is vectorized and, according to the help page, returns a value with the same shape as test, filled with elements selected from either yes or no depending on whether the corresponding element of test is TRUE or FALSE.

For example:

> years <- 1980:2000

> ifelse(years < 1986, "old", "young")
 [1] "old"   "old"   "old"   "old"   "old"   "old"   "young" "young" "young" "young" "young" "young" "young" "young" "young"
[16] "young" "young" "young" "young" "young" "young"

You can even nest ifelse() calls if you have more than two conditions, much like nesting =IF() in Excel, if you're familiar with that:

ifelse(years < 1986, "old", ifelse(years < 1996, "medium", "young"))
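
To make the nesting concrete, here is a small sketch on a hand-picked vector (the values are chosen only to hit each boundary and are an assumption for illustration). The inner ifelse is only consulted for elements where the outer test is FALSE.

```r
# Illustrative years chosen to sit on either side of each cutoff
years <- c(1980, 1985, 1986, 1995, 1996, 2000)

# Outer test handles "old"; the inner ifelse splits the rest
labels <- ifelse(years < 1986, "old",
                 ifelse(years < 1996, "medium", "young"))

print(labels)
# "old" "old" "medium" "medium" "young" "young"
```

Each additional category adds one more level of nesting, so this stays readable mainly for a small number of conditions.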
Chase