
I have data over individuals that looks like this:

df <- data.frame(Col1 = c("A", "A", "B", "B"), Col2 = c("A50", "C50", "B60", "A70"), Col3 = c("A40", "A50", "A50", "A70"))
criteria <- "A50"

  Col1 Col2 Col3
    A  A50  A40
    A  C50  A50
    B  B60  A50
    B  A70  A70

I want to select each individual in Col1 for whom the criteria (A50) is fulfilled, in any of the columns, in at least two different observations (rows). That is, individual A will be selected, since he has A50 in two different observations. Individual B will not be selected, since he has A50 in only one observation; several occurrences of A50 within a single observation still count as one.

The problem is an extension to this one: Subset multiple columns in R with multiple matches

KGB91

2 Answers


Try this with dplyr 1.0.0 or later:

library(dplyr)

cols <- c('Col2', 'Col3') 

df %>%
  group_by(Col1) %>%
  filter(sum(colSums(cur_data()[cols] == criteria) >= 1) > 1)

#  Col1  Col2  Col3 
#  <chr> <chr> <chr>
#1 A     A50   A40  
#2 A     C50   A50  

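cur_data()[cols] selects only the columns named in cols for the current group, colSums() counts the number of matches in each of those columns, and sum(... >= 1) > 1 keeps the group only if more than one column contains a match. Note that cur_data() is deprecated as of dplyr 1.1.0 in favour of pick().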
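
To see what the filter condition evaluates to, the intermediate values can be reproduced in base R, one group at a time. This is only an illustrative sketch on the `df` built by the `data.frame()` call in the question; the helper name `matches_per_column` is made up for the example and is not part of dplyr.

```r
df <- data.frame(
  Col1 = c("A", "A", "B", "B"),
  Col2 = c("A50", "C50", "B60", "A70"),
  Col3 = c("A40", "A50", "A50", "A70")
)
criteria <- "A50"
cols <- c("Col2", "Col3")

# Hypothetical helper: number of matches in each column for one individual
matches_per_column <- function(id) {
  colSums(df[df$Col1 == id, cols] == criteria)
}

per_col_A <- matches_per_column("A")  # one match in Col2, one in Col3
per_col_B <- matches_per_column("B")  # no match in Col2, one in Col3

# Number of columns with at least one match;
# the filter keeps a group when this number exceeds 1
n_cols_A <- sum(per_col_A >= 1)
n_cols_B <- sum(per_col_B >= 1)
```

Here `n_cols_A` is 2 and `n_cols_B` is 1, so only individual A passes the filter.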

Ronak Shah
  • Hm, I just get an empty data frame when I run that code on the real data. – KGB91 Oct 09 '20 at 14:50
  • Works fine for the data you have shared. Could you share a representative sample then? – Ronak Shah Oct 09 '20 at 14:55
  • Indeed it does... I can't since the data is confidential: but it is one column with the ID of the individual (integer) and 15 columns (char) with a state (like X300). Each individual could appear 15 times or so in the data. – KGB91 Oct 09 '20 at 20:20
  • Ahhh! Column 11-15 were formatted as `logi`. My bad. Thanks, your solution works fine! – KGB91 Oct 09 '20 at 20:22
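
The `logi` problem from the comments above can be reproduced on a small made-up example. One plausible cause, which is an assumption since the real data was not shared, is that a column containing only NA is stored as logical, so every comparison against criteria gives NA and the counts become NA. The data frame `df2` and its contents are hypothetical.

```r
# Hypothetical data: Col3 is entirely NA, so R stores it as logical
df2 <- data.frame(
  Col1 = c(1L, 1L, 2L, 2L),
  Col2 = c("A50", "A50", "B60", "A70"),
  Col3 = c(NA, NA, NA, NA)
)
criteria <- "A50"
cols <- c("Col2", "Col3")

# Inspect the column types; "logical" is what str() shows as `logi`
col_classes <- sapply(df2[cols], class)

# Without na.rm, the all-NA column yields an NA count
counts_with_na <- colSums(df2[cols] == criteria)

# Store the state columns as character and ignore NAs when counting
df2[cols] <- lapply(df2[cols], as.character)
counts_clean <- colSums(df2[cols] == criteria, na.rm = TRUE)
```

Checking `sapply(df, class)` before filtering is a cheap way to spot such columns; whether `na.rm = TRUE` is appropriate depends on what a missing state means in the real data.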

Here is a base R option

u <- sapply(
  split(as.data.frame(df[-1] == criteria), df$Col1),
  function(x) all(rowSums(x) > 0) & all(colSums(x) > 0)
)
subset(df, Col1 %in% names(u)[u])  # %in% also works when several individuals qualify

which gives

  Col1 Col2 Col3
1    A  A50  A40
2    A  C50  A50
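
If the requirement is read as "the criteria appears in at least two different rows for the same individual", the matching rows can also be counted directly. The following is a sketch of that reading, not a replacement for the code above; `row_hit`, `n_hits` and `selected` are names chosen for the example.

```r
df <- data.frame(
  Col1 = c("A", "A", "B", "B"),
  Col2 = c("A50", "C50", "B60", "A70"),
  Col3 = c("A40", "A50", "A50", "A70")
)
criteria <- "A50"

# TRUE for each row where any state column equals the criteria
row_hit <- rowSums(df[-1] == criteria) > 0

# Number of matching rows per individual, repeated on each of that individual's rows
n_hits <- ave(as.integer(row_hit), df$Col1, FUN = sum)

# Keep individuals with at least two matching rows
selected <- df[n_hits >= 2, ]
```

A row with A50 in several columns counts only once here, because `row_hit` holds a single TRUE or FALSE per row.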
ThomasIsCoding