I have a dataset which looks like this:

Names     Subject Trial
A0100_1   A0100   1
A0100_2   A0100   2
A0102_1   A0102   1
A0103_1   A0103   1
A0103_2   A0103   2

I want to keep only the rows for subjects that have both trial 1 and trial 2. Thanks in advance!

Cettt
steph18

1 Answer

Here is one possibility using the tidyverse package:

library(tidyverse)

mydata <- data.frame(subject = c("A0100", "A0100", "A0102", "A0103", "A0103"),
                     Trial = c(1,2,1,1,2))

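mydata %>% 
  mutate(dummy = 1) %>%                    # marker for "this trial exists"
  spread(Trial, dummy) %>%                 # one column per trial; NA if missing
  filter(`1` == `2`) %>%                   # NA comparisons drop incomplete subjects
  gather(trial, dummy, - subject) %>%      # back to one row per subject and trial
  select(-dummy)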

  subject trial
  <chr>   <chr>
1 A0100   1    
2 A0103   1    
3 A0100   2    
4 A0103   2  
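
In current tidyr, spread() and gather() are superseded by pivot_wider() and pivot_longer(). The same idea could be sketched with the newer functions roughly as follows; the name both_trials_long is only illustrative, and the row order may differ from the output above.

```r
library(dplyr)
library(tidyr)

mydata <- data.frame(subject = c("A0100", "A0100", "A0102", "A0103", "A0103"),
                     Trial = c(1, 2, 1, 1, 2))

# both_trials_long is a hypothetical name used for illustration
both_trials_long <- mydata %>%
  mutate(dummy = 1) %>%
  # one column per trial; missing trials become NA
  pivot_wider(names_from = Trial, values_from = dummy) %>%
  # a comparison involving NA is NA, so incomplete subjects are dropped
  filter(`1` == `2`) %>%
  # back to one row per subject and trial
  pivot_longer(-subject, names_to = "trial", values_to = "dummy") %>%
  select(-dummy)

both_trials_long
```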

Alternatively (and a bit shorter), you can use the count function and then do a semi join. This assumes each subject has at most one row per trial and that only trials 1 and 2 occur:

mydata %>% 
  count(subject) %>%
  filter(n == 2) %>%
  semi_join(mydata, ., by = "subject")

# A tibble: 4 x 2
  subject Trial
  <chr>   <dbl>
1 A0100       1
2 A0100       2
3 A0103       1
4 A0103       2
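
If a subject could have repeated trials or trial numbers other than 1 and 2, counting rows may not be enough. A minimal sketch that checks for both trials explicitly, assuming the same mydata as above (the name both_trials is only illustrative):

```r
library(dplyr)

mydata <- data.frame(subject = c("A0100", "A0100", "A0102", "A0103", "A0103"),
                     Trial = c(1, 2, 1, 1, 2))

# both_trials is a hypothetical name used for illustration
both_trials <- mydata %>%
  group_by(subject) %>%
  # keep a subject only if trial 1 and trial 2 both occur in its rows
  filter(all(c(1, 2) %in% Trial)) %>%
  ungroup()

both_trials
```

This keeps all original columns, so it should also work unchanged on a data frame that still contains the Names column.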
Cettt