Combining conditions

Question

My recode attempts

df$test[(df$1st==(1:3) & df$2nd <= 4)] <- 1
df$test[(df$1st==(1:3) & df$2nd <= 5)] <- 2
df$test[(df$1st==(1:3) & df$2nd <= 6)] <- 3

result in a "longer object length is not a multiple of shorter object length" warning and a lot of NAs in df$test, even though some recodes work correctly.
What am I missing? Any help appreciated.

dw

score 5 · Answer 1 · answered Dec 13 '10 at 11:38

The problem is in this line:

df$1st==(1:3)

You could use %in% instead:

df$1st %in% (1:3)

The warning appears because you compare vectors of different lengths (1:3 has length 3 and df$1st has length "only you know what"), and the longer length is not a multiple of the shorter one.
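To make the difference concrete, here is a minimal sketch on a made-up vector (x is a hypothetical stand-in for df$1st, not the asker's data). It shows how == recycles 1:3 position by position, while %in% tests each element for membership:

```r
# Hypothetical toy column standing in for df$1st
x <- c(1, 2, 3, 3, 2, 1, 2)

# == compares element by element, recycling 1:3 as 1, 2, 3, 1, 2, 3, 1.
# Length 7 is not a multiple of 3, so R would also emit the warning
# from the question (suppressed here to keep the output clean).
eq <- suppressWarnings(x == 1:3)

# %in% asks, for each element of x, whether it occurs anywhere in 1:3
isin <- x %in% 1:3

print(eq)    # TRUE TRUE TRUE FALSE TRUE FALSE FALSE
print(isin)  # TRUE TRUE TRUE TRUE TRUE TRUE TRUE
```

Rows where the recycled value happens to line up get recoded, which would explain why some recodes looked correct while the rest stayed NA.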

Besides, I think you missed that your values are overwritten: every row with df$2nd <= 4 also satisfies df$2nd <= 6, so all the 1s and 2s are overwritten by 3.
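A small sketch of the overwriting, using invented data and the syntactic names a and b in place of 1st and 2nd (both are assumptions for illustration). If the thresholds really are meant to be cumulative, one possible fix is to assign the widest condition first:

```r
# Hypothetical toy data; a and b stand in for the columns 1st and 2nd
d <- data.frame(a = c(1, 2, 3, 1), b = c(4, 5, 6, 7))

# Order from the question: each later, wider condition overwrites the earlier ones
d$test <- NA
d$test[d$a %in% 1:3 & d$b <= 4] <- 1
d$test[d$a %in% 1:3 & d$b <= 5] <- 2
d$test[d$a %in% 1:3 & d$b <= 6] <- 3
overwritten <- d$test

# Widest condition first, so the narrower conditions get the final say
d$test <- NA
d$test[d$a %in% 1:3 & d$b <= 6] <- 3
d$test[d$a %in% 1:3 & d$b <= 5] <- 2
d$test[d$a %in% 1:3 & d$b <= 4] <- 1

print(overwritten)  # 3 3 3 NA
print(d$test)       # 1 2 3 NA
```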

answered Dec 13 '10 at 11:38

Marek

Sorry, the overwriting only takes place in my example, which I wrote down too quickly and erroneously... – dw006 Dec 13 '10 at 12:00

NPE · Answer 2 · 2010-12-13T12:16:41.233

4

I am not sure what you're trying to achieve with df$1st==(1:3), but it probably doesn't do what you think it does. It recycles c(1,2,3) as many times as it needs to make it as long as df$1st, then compares the two element by element.

If you are trying to check if df$1st is between 1 and 3, you might want to spell it out:

df$1st>=1 & df$1st<=3
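For what it's worth, the spelled-out range check and %in% 1:3 agree on integer data but can differ on fractional or missing values. A quick sketch on a made-up vector (x is hypothetical):

```r
# Hypothetical values, including a non-integer and a missing value
x <- c(0, 1, 2.5, 3, 4, NA)

# Range check: TRUE for anything between 1 and 3, NA stays NA
in_range <- x >= 1 & x <= 3

# Membership check: TRUE only for exactly 1, 2 or 3; NA becomes FALSE
in_set <- x %in% 1:3

print(in_range)  # FALSE TRUE TRUE TRUE FALSE NA
print(in_set)    # FALSE TRUE FALSE TRUE FALSE FALSE
```

Which one fits depends on whether the column can hold non-integers or NAs.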

edited Dec 13 '10 at 12:16

answered Dec 13 '10 at 11:37

NPE

Thanks a lot, aix! The problem was indeed the 1:3; spelling it out worked. – dw006 Dec 13 '10 at 11:58

score 1 · Answer 3 · edited May 23 '17 at 11:58

You may also want to consider using transform() to deal with recoding issues such as this. transform() will run slower than the logical indexing method, but it makes the intent of the code easier to digest. A good discussion of the pros and cons of the different methods can be found here. Consider:

set.seed(42)
df <- data.frame("first" = sample(1:5, 10e5, TRUE), "second" = sample(4:8, 10e5, TRUE))

df <- transform(df,
    test = ifelse(first %in% 1:3 & second == 4, 1,
           ifelse(first %in% 1:3 & second == 5, 2,
           ifelse(first %in% 1:3 & second == 6, 3, NA)))
)
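For comparison, a rough sketch of the logical indexing method mentioned above, run next to the transform() version on a smaller made-up sample (the data frame d and the 1000-row size are assumptions chosen to keep the example quick). Both should produce the same test column:

```r
set.seed(42)
# Smaller hypothetical sample than in the answer, to keep the example fast
d <- data.frame(first = sample(1:5, 1000, TRUE), second = sample(4:8, 1000, TRUE))

# Nested ifelse() inside transform(), as in the answer
by_transform <- transform(d,
    test = ifelse(first %in% 1:3 & second == 4, 1,
           ifelse(first %in% 1:3 & second == 5, 2,
           ifelse(first %in% 1:3 & second == 6, 3, NA))))

# Logical indexing: start from NA and fill in each group separately
by_index <- d
by_index$test <- NA_real_
low <- by_index$first %in% 1:3
by_index$test[low & by_index$second == 4] <- 1
by_index$test[low & by_index$second == 5] <- 2
by_index$test[low & by_index$second == 6] <- 3

# Check that the two approaches agree
print(isTRUE(all.equal(by_transform$test, by_index$test)))
```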

Secondly, the column names 1st and 2nd are not syntactically valid column names. Take a look at make.names() for more details on what constitutes valid column names. When working with a data.frame, you can use/abuse the check.names argument. For example:

> df <- data.frame("1st" = sample(1:5, 10e5, TRUE), "2nd" = sample(4:8, 10e5, TRUE), check.names = FALSE)
> colnames(df)
[1] "1st" "2nd"
> df <- data.frame("1st" = sample(1:5, 10e5, TRUE), "2nd" = sample(4:8, 10e5, TRUE), check.names = TRUE)
> colnames(df)
[1] "X1st" "X2nd"
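If you do keep the non-syntactic names, the columns have to be accessed with backticks or [[ ]]. A short sketch on invented data (d and its values are hypothetical):

```r
# Hypothetical small data frame that keeps the non-syntactic names
d <- data.frame("1st" = c(1, 2, 5), "2nd" = c(4, 5, 6), check.names = FALSE)

# d$1st is a syntax error; quote the name with backticks or use [[ ]]
keep <- d$`1st` %in% 1:3 & d[["2nd"]] <= 5

# make.names() shows what check.names = TRUE would have produced
fixed <- make.names(c("1st", "2nd"))

print(keep)   # TRUE TRUE FALSE
print(fixed)  # "X1st" "X2nd"
```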
