There are different results between subset code in R

Question

The results of:

`BB = RB[RB$Rep %in% c("1", "3"), ]` and `Bb = subset(RB, Rep == c("1", "3"))` are different.

Please tell me what the problem is.

The second code is possibly taking only the first element. Instead, you should try: `Bb = subset(RB, Rep == "1" | Rep == "3")` — Duck, Aug 24 '20 at 15:03
Thank you! But I would like to know why the results are different when I run these two pieces of code. — LIOU bing, Aug 24 '20 at 15:14
Welcome to Stack Overflow! Please don't post code samples as photos. Just copy the text and paste it into your question. Then select it and press the "{ }" button to format it as code. — Howlium, Aug 24 '20 at 17:52

score 1 · Accepted Answer · answered Aug 24 '20 at 15:32

When you use `==`, the comparison is done element by element, in order.

Consider this example:

df <- data.frame(a = 1:6, b = c(1:3, 3:1))
df
#  a b
#1 1 1
#2 2 2
#3 3 3
#4 4 3
#5 5 2
#6 6 1

When you use `==`:

subset(df, b == c(1, 3))
#  a b
#1 1 1
#4 4 3

The 1st value of `b` is compared with 1 and the 2nd with 3. Because `c(1, 3)` is shorter than `b`, its values are recycled: the 3rd value of `b` is compared with 1 again, the 4th with 3, and so on to the end of the data frame. Only rows 1 and 4 match their paired value, so those are the rows returned.
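To make the recycling visible, here is a small sketch that builds the recycled vector by hand. The names `recycled` and `keep` are only for illustration and are not part of the original question.

```r
df <- data.frame(a = 1:6, b = c(1:3, 3:1))

# What c(1, 3) looks like once it is recycled to the length of b
recycled <- rep(c(1, 3), length.out = nrow(df))
recycled
# [1] 1 3 1 3 1 3

# Element-by-element comparison, the same thing b == c(1, 3) does
keep <- df$b == recycled
keep
# [1]  TRUE FALSE FALSE  TRUE FALSE FALSE

# Same rows as subset(df, b == c(1, 3))
df[keep, ]
```

Only positions 1 and 4 happen to line up with their paired value, which is why the other rows containing 1 or 3 are dropped.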

When you use `%in%`, each value of `b` is checked for membership in `c(1, 3)`, so every row where `b` is 1 or 3 is selected.

subset(df, b %in% c(1, 3))

#  a b
#1 1 1
#3 3 3
#4 4 3
#6 6 1
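As a rough check on the same example data, `%in%` here gives the same logical vector as writing out the two comparisons joined with `|`, which is what the comment under the question suggested. The names `in_set` and `either` are made up for this sketch.

```r
df <- data.frame(a = 1:6, b = c(1:3, 3:1))

# Membership test: is each value of b one of 1 or 3?
in_set <- df$b %in% c(1, 3)
in_set
# [1]  TRUE FALSE  TRUE  TRUE FALSE  TRUE

# The explicit "either 1 or 3" form
either <- df$b == 1 | df$b == 3
identical(in_set, either)
# [1] TRUE
```

With more than a couple of values to match, `%in%` is usually the shorter way to write this.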
