You may also want to consider using transform() to deal with recoding issues such as this. transform() will run slower than the logical indexing method, but it makes the intent of the code easier to digest. A good discussion of the pros and cons of the different methods can be found here. Consider:
set.seed(42)
df <- data.frame("first" = sample(1:5, 10e5, TRUE), "second" = sample(4:8, 10e5, TRUE))
df <- transform(df
, test = ifelse(first %in% 1:3 & second == 4, 1
, ifelse(first %in% 1:3 & second == 5, 2
, ifelse(first %in% 1:3 & second == 6, 3, NA)))
)
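For comparison, the logical indexing method mentioned above could look roughly like the sketch below. It is not taken from the linked discussion; the helper name low is my own, and I use a smaller sample so it runs quickly. The idea is to start the new column as NA and then overwrite only the rows that match each condition.

```r
set.seed(42)
df <- data.frame(first = sample(1:5, 1e5, TRUE), second = sample(4:8, 1e5, TRUE))

# Start with NA everywhere, so rows matching no condition stay NA
df$test <- NA_real_

# 'low' is a hypothetical helper: TRUE where first is 1, 2 or 3
low <- df$first %in% 1:3

# Overwrite only the rows that satisfy each condition
df$test[low & df$second == 4] <- 1
df$test[low & df$second == 5] <- 2
df$test[low & df$second == 6] <- 3

# Counts per recoded value, including the rows left as NA
table(df$test, useNA = "ifany")
```

This should give the same test column as the nested ifelse() version, at the cost of repeating df$ on every line.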
Secondly, the column names 1st and 2nd are not syntactically valid column names. Take a look at make.names() for more details on what constitutes valid column names. When working with a data.frame, you can use/abuse the check.names argument. For example:
> df <- data.frame("1st" = sample(1:5, 10e5, TRUE), "2nd" = sample(4:8, 10e5, TRUE), check.names = FALSE)
> colnames(df)
[1] "1st" "2nd"
> df <- data.frame("1st" = sample(1:5, 10e5, TRUE), "2nd" = sample(4:8, 10e5, TRUE), check.names = TRUE)
> colnames(df)
[1] "X1st" "X2nd"
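If you do keep the non-syntactic names, a minimal sketch of how you might work with them is below. The variable names valid, df2, a and b are made up for illustration. It shows calling make.names() directly, quoting such names with backticks or [[ ]], and repairing the names after the fact.

```r
# make.names() shows what R would turn each name into
valid <- make.names(c("1st", "2nd", "first"))
valid
# [1] "X1st"  "X2nd"  "first"

# With check.names = FALSE the original names are kept as-is
df2 <- data.frame("1st" = 1:3, "2nd" = 4:6, check.names = FALSE)

# Non-syntactic names must be quoted with backticks when using $ ...
a <- df2$`1st`
# ... or passed as strings to [[ ]]
b <- df2[["2nd"]]

# Names can also be repaired afterwards
names(df2) <- make.names(names(df2))
colnames(df2)
# [1] "X1st" "X2nd"
```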