Having trouble using "subset" command in R

Question

I have been working on a project in R. I have used R before but never run into this problem. I am trying to take a subset of my full dataset (longitudinal data) using the "subset" command as follows:

subset(data, data$Code == c("A21", "A26", "A29", "A42", "A48", "A51", "B20"))

but the new dataset contains only the codes "A21", "A26", "A42", "A51", "B20". Can you please tell me why this happens?

I also checked whether the problem is in my code or in the main dataset, using

subset(data, data$Code == c("A29", "A48"))

and the new dataset looks all right. I am really confused about why the command picks up only some of the data.

Welcome to Stack Overflow. Please [make this question reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) by including a small representative dataset in a plain text format - for example the output from `dput(data)`, if that is not too large. — neilfws, Dec 12 '21 at 22:24

Pedro Cunha · Answer 1 · 2021-12-12T22:34:48.410

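The correct operator here is `%in%`, not `==`. `==` compares the column with your vector position by position, recycling the shorter vector, so a row is kept only when its code happens to line up with the same code in the recycled vector:

subset(data, Code %in% c(...))
# replace ... with the codes you want to include in the subset

(Also, inside `subset()` you don't need the `$` operator to refer to a data frame's columns.)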
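
A minimal sketch of the difference, on a made-up data frame since the original data was not posted (`data`, `Value`, and `wanted` here are assumptions for illustration only):

```r
# Hypothetical toy data; not the asker's real dataset.
data <- data.frame(
  Code  = c("A21", "A29", "A48", "A29", "A48", "B20"),
  Value = 1:6
)
wanted <- c("A21", "A29", "A48")

# `==` recycles `wanted` along the column, so row 4 ("A29") is compared
# with wanted[1] ("A21"), row 5 with wanted[2], and so on.
by_eq <- subset(data, Code == wanted)

# `%in%` asks, for every row, whether its code is anywhere in `wanted`.
by_in <- subset(data, Code %in% wanted)

nrow(by_eq)  # 3: only the rows that happen to line up with `wanted`
nrow(by_in)  # 5: every row whose code is in `wanted`
```

When the number of rows is not a multiple of the length of the vector, R also warns that the longer object length is not a multiple of the shorter one, which is a useful hint that `==` is being recycled.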

edited Dec 12 '21 at 22:34

answered Dec 12 '21 at 22:23


Thank you so much for your reply. That worked. But I used == before and it worked then. – Maliha Haider Dec 13 '21 at 20:13
It only seemed to work before; it did not really. Cases like this often go unnoticed in larger datasets. There's a good explanation if you follow the link to the duplicate question, above. – Pedro Cunha Dec 13 '21 at 23:02