
I have been working on a project in R. I have used R before but have never run into this problem. I am trying to take a subset of my full dataset (longitudinal data) using the "subset" command as follows:

subset(data, data$Code == c("A21", "A26", "A29", "A42", "A48", "A51", "B20"))

but the new dataset contains only the codes "A21", "A26", "A42", "A51", "B20". Can you please tell me why it is doing that?

I also checked whether there is a problem with the codes or the main dataset using

subset(data, data$Code == c("A29", "A48"))

and the new dataset looks all right. I am really confused about why the command picks up only some of the data.

    Welcome to Stack Overflow. Please [make this question reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) by including a small representative dataset in a plain text format - for example the output from `dput(data)`, if that is not too large. – neilfws Dec 12 '21 at 22:24

1 Answer


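The correct operator in this case is %in%, not ==. With ==, R compares the column to the vector element by element and recycles the shorter vector, so each row is tested against only one of the codes. %in% tests each row against the whole set: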

subset(data, Code %in% c(...))
# substitute ... for the codes you want to include in the subset

(also: inside subset() you don't need to use the $ operator to refer to a data.frame's columns)
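
To make the difference concrete, here is a minimal sketch on made-up data. The `data` frame, its `Visit` column, and the `wanted` vector are hypothetical stand-ins, since the asker's dataset is not shown:

```r
# Hypothetical toy data; the real dataset from the question is not available.
data <- data.frame(
  Code  = rep(c("A21", "A26", "A29"), each = 2),
  Visit = rep(1:2, times = 3)
)
wanted <- c("A21", "A29")

# == recycles `wanted` along the column:
# row 1 vs "A21", row 2 vs "A29", row 3 vs "A21", row 4 vs "A29", ...
eq_mask <- data$Code == wanted

# %in% asks, for each row, whether its Code appears anywhere in `wanted`
in_mask <- data$Code %in% wanted

subset(data, Code == wanted)    # keeps 2 rows (rows 1 and 6)
subset(data, Code %in% wanted)  # keeps 4 rows (all A21 and A29 rows)
```

Because the column length (6) is a multiple of the vector length (2), R recycles silently here. When the lengths do not divide evenly, R emits a "longer object length is not a multiple of shorter object length" warning, which is often the only hint that something is off.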

  • Thank you so much for your reply. That worked. But I used == before and it worked then. – Maliha Haider Dec 13 '21 at 20:13
  • The reason it worked before is that it only seemed to work; it did not really. It's fairly common for such cases to go unnoticed in larger datasets. There's a good explanation if you follow the link to the duplicate question, above. – Pedro Cunha Dec 13 '21 at 23:02
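
As a rough illustration of how == can appear to work on longitudinal data, consider the sketch below. The `visits` data frame and its layout are assumptions made up for this example, not the asker's data:

```r
# Hypothetical longitudinal data: 4 visits for each of two codes.
visits <- data.frame(
  Code  = rep(c("A29", "A48"), each = 4),
  Visit = rep(1:4, times = 2)
)
wanted <- c("A29", "A48")

by_eq <- subset(visits, Code == wanted)    # recycled, element-wise comparison
by_in <- subset(visits, Code %in% wanted)  # set membership

unique(by_eq$Code)  # both codes are present, so the result looks plausible
nrow(by_eq)         # 4: only half of the rows survive
nrow(by_in)         # 8: every row for the wanted codes
```

Checking `nrow()` of the result against `sum(visits$Code %in% wanted)`, rather than only looking at which codes appear, is one simple way to catch this kind of silent row loss.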