I am trying to write R code that compares two columns grouped by their ID. My idea is to filter the data frame so that it shows only the IDs that have been to both an initial and a review meeting.

This is my data frame:

ID  Initial Review  Type
P40 Yes             Meeting1
P40         Yes     Meeting2
P42 Yes             Meeting1
P42         No      Meeting2
P43 Yes             Meeting1
P43         Yes     Meeting2
P44 Yes             Meeting1
P44         No      Meeting2

This is what I am trying to achieve:

ID  Initial Review  Type
P40 Yes             Meeting1
P40         Yes     Meeting2
P43 Yes             Meeting1
P43         Yes     Meeting2

I have tried using the OR and AND logical operators. OR gives me the wrong result, and with AND I get an empty data frame.

tt %>% group_by(ID) %>% filter(Initial == "Yes" & Review == "Yes")
ranaz
    Please share data with `dput`, [**->help**](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610). – jay.sf Jul 15 '19 at 08:57

1 Answer

df %>% group_by(ID) %>% filter(any(Initial == "Yes") && any(Review == "Yes"))

Explanation: Initial == "Yes" is a vector of two elements, one for each row of a given ID. For example, for P40 it is c(TRUE, FALSE). Same for Review == "Yes", except that here the vector is c(FALSE, TRUE). Now, c(TRUE, FALSE) & c(FALSE, TRUE) is c(FALSE, FALSE), which is why you get an empty df.
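To see this outside of dplyr, here is a minimal base R sketch. The vectors `initial` and `review` are hypothetical stand-ins for the two P40 rows, and it assumes the blank cells are stored as empty strings rather than NA:

```r
# Hypothetical per-group values for P40 (blank cells assumed to be "")
initial <- c("Yes", "")
review  <- c("", "Yes")

initial == "Yes"                               # TRUE FALSE
review == "Yes"                                # FALSE TRUE

# Row-wise AND: no single row has "Yes" in both columns
row_wise <- (initial == "Yes") & (review == "Yes")
row_wise                                       # FALSE FALSE

# Group-wise check: does the group contain a "Yes" in each column?
group_wise <- any(initial == "Yes") && any(review == "Yes")
group_wise                                     # TRUE
```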

In my solution, each condition collapses to a single value per group. For P40, any(Initial == "Yes") is TRUE, and any(Review == "Yes") is also TRUE, and TRUE && TRUE is TRUE. Since filter() needs one value per row (two for P40), that single TRUE is recycled to every row of the group, and that is why you get both lines for P40.
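For a runnable check, the sketch below rebuilds the sample data from the question and applies the filter. It assumes the blank cells are empty strings (they could equally be NA in the real data), and the `ungroup()` at the end is an optional addition, not part of the original one-liner:

```r
library(dplyr)

# Sample data reconstructed from the question; blank cells assumed to be ""
tt <- data.frame(
  ID      = c("P40", "P40", "P42", "P42", "P43", "P43", "P44", "P44"),
  Initial = c("Yes", "", "Yes", "", "Yes", "", "Yes", ""),
  Review  = c("", "Yes", "", "No", "", "Yes", "", "No"),
  Type    = rep(c("Meeting1", "Meeting2"), times = 4),
  stringsAsFactors = FALSE
)

# Keep every row of an ID whose group has a "Yes" in both columns
result <- tt %>%
  group_by(ID) %>%
  filter(any(Initial == "Yes") && any(Review == "Yes")) %>%
  ungroup()

result
```

With this data, only the rows for P40 and P43 should remain, which matches the desired output in the question.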

January
  • Promise me that you will use `dput` in future ;-) – January Jul 15 '19 at 09:17
  • @January, OK, I promise. What if P40 has another row with "No" in the Initial column, but you do not want that row to show? If you want, I can edit the data frame to show you what I want. – ranaz Jul 15 '19 at 09:23
  • Ah, that is another question altogether with no example in your data. Do I understand correctly that you do want to show the user `P40` even though in another row it says "no"? Then after the above filter you just remove the rows which have a "No" in the first column. Either that or you need to write another question with more details as to what exactly you want. – January Jul 15 '19 at 09:25
  • Cheers @January – ranaz Jul 15 '19 at 09:48