I have a data set with many rows and columns. I would like to filter it so that I keep only the rows below a row where `TrialCategory` has the value "Loss" or "MissedWin".

How do I do this?

For example, I have a data set that looks like this:

   ParticipantNumber Primary_Dx     Cue TrialCategory Target.RT
1              21054    Healthy Control       Neutral       203
2              21054    Healthy   Lose2   AvoidedLoss       186
3              21054    Healthy Control       Neutral       205
4              21054    Healthy    Win2           Win       222
5              21054    Healthy    Win2           Win       198
6              21054    Healthy    Win2           Win       271
7              21054    Healthy   Lose2   AvoidedLoss       259
8              21054    Healthy   Lose2   AvoidedLoss       223
9              21054    Healthy   Lose2   AvoidedLoss       257
10             21054    Healthy   Lose2   AvoidedLoss       204
11             21054    Healthy Control       Neutral       207
12             21054    Healthy   Lose2   AvoidedLoss       193
13             21054    Healthy Control       Neutral         0
14             21054    Healthy Control       Neutral       217
15             21054    Healthy    Win2           Win       208
16             21054    Healthy Control       Neutral       248
17             21054    Healthy    Win2           Win       184
18             21054    Healthy   Lose2   AvoidedLoss       217
19             21054    Healthy   Lose2          Loss       296
20             21054    Healthy Control       Neutral       296
21             21054    Healthy Control       Neutral         0
22             21054    Healthy Control       Neutral         0
23             21054    Healthy Control       Neutral       202
24             21054    Healthy Control       Neutral         0
25             21054    Healthy    Win2     MissedWin         0
26             21054    Healthy    Win2     MissedWin       207
27             21054    Healthy    Win2     MissedWin         0
28             21054    Healthy   Lose2          Loss         0
29             21054    Healthy Control       Neutral       201
30             21054    Healthy   Lose2          Loss         0
31             21054    Healthy    Win2     MissedWin         0
32             21054    Healthy   Lose2          Loss       233
33             21054    Healthy   Lose2          Loss       241
34             21054    Healthy    Win2           Win       223
35             21054    Healthy   Lose2          Loss         0
36             21054    Healthy   Lose2          Loss         0
37             21054    Healthy Control       Neutral       192
38             21054    Healthy    Win2           Win       211
39             21054    Healthy   Lose2   AvoidedLoss       208
40             21054    Healthy    Win2           Win       166
41             21054    Healthy    Win2     MissedWin         0
42             21054    Healthy    Win2           Win       191
43             21054    Healthy Control       Neutral       199
44             21054    Healthy    Win2           Win       205
45             21054    Healthy    Win2           Win       108
46             21054    Healthy Control       Neutral       174
47             21054    Healthy   Lose2   AvoidedLoss       218
48             21054    Healthy   Lose2   AvoidedLoss       227
49             21054    Healthy   Lose2   AvoidedLoss       236
50             21054    Healthy Control       Neutral       219

And I want it to look like this:

   ParticipantNumber Primary_Dx     Cue TrialCategory Target.RT
20             21054    Healthy Control       Neutral       296
29             21054    Healthy Control       Neutral       201
34             21054    Healthy    Win2           Win       223
37             21054    Healthy Control       Neutral       192
42             21054    Healthy    Win2           Win       191
KKhosra
  • Please don't post `dplyr` questions without including a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – cmaher Apr 23 '18 at 21:10
  • Please don't post *most* questions without including a reproducible example, `dplyr` is included in that set but it is far from exclusive. – r2evans Apr 23 '18 at 21:12
  • I apologize! I have added an example. Thanks! – KKhosra Apr 24 '18 at 14:36

1 Answer

Here's a stab:

head(mtcars)
#                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
# Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
# Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
# Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
# Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
# Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
# Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Let's say the Datsun is interesting because of its low HP, and I want to list it and everything below it.

library(dplyr)
head(mtcars) %>%
  filter(cumany(hp < 100))
#    mpg cyl disp  hp drat    wt  qsec vs am gear carb
# 1 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
# 2 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
# 3 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
# 4 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

You can see what it does literally with:

cumany(c(F,F,T,F,T,F,T))
# [1] FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE

and specifically:

head(mtcars)$hp < 100
# [1] FALSE FALSE  TRUE FALSE FALSE FALSE
cumany(head(mtcars)$hp < 100)
# [1] FALSE FALSE  TRUE  TRUE  TRUE  TRUE
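
For the data in the question, the same idea might look like the sketch below. The `trials` data frame is a made-up stand-in for the real data: only `TrialCategory` mirrors an actual column, and `Trial` is a hypothetical row counter added for illustration. Note that `cumany()` keeps everything from the first match onward, while the expected output in the question looks more like "only the row directly below a Loss/MissedWin, unless that row is itself one". The `lag()` variant attempts that second reading, so treat it as one interpretation of the question rather than a definitive answer.

```r
library(dplyr)

# Hypothetical toy data standing in for the question's data frame
trials <- data.frame(
  Trial = 1:8,
  TrialCategory = c("Neutral", "Loss", "Neutral", "MissedWin",
                    "MissedWin", "Loss", "Win", "Neutral"),
  stringsAsFactors = FALSE
)
targets <- c("Loss", "MissedWin")

# Reading 1: everything from the first Loss/MissedWin row onward
from_first <- trials %>%
  filter(cumany(TrialCategory %in% targets))
from_first$Trial
# [1] 2 3 4 5 6 7 8

# Reading 2: only rows directly below a Loss/MissedWin row
# that are not themselves Loss/MissedWin
just_after <- trials %>%
  filter(lag(TrialCategory) %in% targets,   # previous row matched
         !TrialCategory %in% targets)       # current row did not
just_after$Trial
# [1] 3 7
```

With more than one participant, adding `group_by(ParticipantNumber)` before `filter()` should keep `lag()` and `cumany()` from reaching across participants.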
r2evans