
I have a data frame df (roughly 331,000 observations of 4 variables) with Date (format "%Y-%m-%d"), ID (factor with 19 levels), Station (factor with 18 levels), and Presence (1/0). The data frame is set up so that each ID has a row for every Station on every date in a range spanning almost three years, and Presence records whether that person was at that Station on that day.

If you filter df by a single day and ID, you get the following dataset (which I'll call a 'group' from now on):

filter(df, Date == "2016-01-03" & ID == "Fred")
 Date           ID     Station       Presence
 <date>         <fct>  <fct>         <dbl>
 2016-01-03     Fred   Station1      0 
 2016-01-03     Fred   Station2      0 
 2016-01-03     Fred   Station3      0 
 2016-01-03     Fred   Station4      1
 2016-01-03     Fred   Station5      0 
 2016-01-03     Fred   Station6      0 
 2016-01-03     Fred   Station7      0 
 2016-01-03     Fred   Station8      0 
 2016-01-03     Fred   Station9      0 
 2016-01-03     Fred   Station10     0 
 2016-01-03     Fred   Station11     0 
 2016-01-03     Fred   Station12     0 
 2016-01-03     Fred   Station13     0
 2016-01-03     Fred   Station14     0 
 2016-01-03     Fred   Station15     0 
 2016-01-03     Fred   Station16     0 
 2016-01-03     Fred   Station17     0 
 2016-01-03     Fred   Station18     0 

I would like to remove rows within each group under the following conditions: if the group contains any row with Presence == 1, remove its rows with Presence == 0 (a group can contain several rows with Presence == 1, e.g. Fred was at Station4, Station9 and Station15 on 2016-01-06). But if the group has no rows with Presence == 1, keep all of its rows (so I can't simply remove every Presence == 0 row).

The above group would thus become:

 Date         ID      Station    Presence
 <date>       <fct>   <fct>      <dbl>
 2016-01-03   Fred    Station4   1

However, the following group would stay as it is (as there are no Presence == 1 within the group):

filter(df, Date == "2016-01-03" & ID == "Mark")
 Date         ID      Station     Presence
 <date>       <fct>   <fct>       <dbl>
 2016-01-03   Mark    Station1    0
 2016-01-03   Mark    Station2    0
 2016-01-03   Mark    Station3    0
 2016-01-03   Mark    Station4    0
 2016-01-03   Mark    Station5    0
 2016-01-03   Mark    Station6    0
 2016-01-03   Mark    Station7    0
 2016-01-03   Mark    Station8    0
 2016-01-03   Mark    Station9    0
 2016-01-03   Mark    Station10   0
 2016-01-03   Mark    Station11   0
 2016-01-03   Mark    Station12   0
 2016-01-03   Mark    Station13   0
 2016-01-03   Mark    Station14   0
 2016-01-03   Mark    Station15   0
 2016-01-03   Mark    Station16   0
 2016-01-03   Mark    Station17   0
 2016-01-03   Mark    Station18   0

I've thought of starting with

df %>%
  group_by(Date, ID) %>%

However, I don't know how to proceed from there. I don't know how to deal with the contrasting conditions.

m0nhawk
  • When asking for help, you should include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Make a smaller dataset we can copy/paste into R to test. You probably just need a group_by and an `any(Presence==1) & (Presence==0)` condition to identify the rows to remove. – MrFlick Jul 02 '18 at 14:09
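A minimal sketch of how the condition from the comment above could be used, assuming it is meant to flag the rows to drop and is therefore negated inside filter(). The `toy` data frame and the `kept` name are hypothetical, with two stations instead of 18 to keep it short.

```r
library(dplyr)

# Hypothetical toy data: Fred has one presence, Mark has none
toy <- tibble(
  Date     = as.Date("2016-01-03"),
  ID       = rep(c("Fred", "Mark"), each = 2),
  Station  = rep(c("Station1", "Station2"), times = 2),
  Presence = c(0, 1, 0, 0)
)

kept <- toy %>%
  group_by(Date, ID) %>%
  # drop a row only if it is a 0 inside a group that has at least one 1
  filter(!(any(Presence == 1) & Presence == 0)) %>%
  ungroup()

kept
```

On this toy input, Fred's group should shrink to its single Presence == 1 row while both of Mark's rows survive.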

1 Answer

library(tidyverse)

dat %>%
  group_by(Date, ID) %>%
  # keep every row of an all-zero group; otherwise keep only the Presence == 1 rows
  filter(all(Presence == 0) | Presence == 1)

# A tibble: 19 x 4
# Groups:   Date, ID [2]
   Date       ID    Station   Presence
   <chr>      <chr> <chr>        <int>
 1 2016-01-03 Fred  Station4         1
 2 2016-01-03 Mark  Station1         0
 3 2016-01-03 Mark  Station2         0
 4 2016-01-03 Mark  Station3         0
 5 2016-01-03 Mark  Station4         0
 6 2016-01-03 Mark  Station5         0
 7 2016-01-03 Mark  Station6         0
 8 2016-01-03 Mark  Station7         0
 9 2016-01-03 Mark  Station8         0
10 2016-01-03 Mark  Station9         0
11 2016-01-03 Mark  Station10        0
12 2016-01-03 Mark  Station11        0
13 2016-01-03 Mark  Station12        0
14 2016-01-03 Mark  Station13        0
15 2016-01-03 Mark  Station14        0
16 2016-01-03 Mark  Station15        0
17 2016-01-03 Mark  Station16        0
18 2016-01-03 Mark  Station17        0
19 2016-01-03 Mark  Station18        0
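The answer does not show how `dat` was built. The following is a sketch under the assumption that it holds just the two groups from the question (Fred and Mark on 2016-01-03, 18 stations each), with character columns as in the printed output. The `stations` and `result` names are introduced here for illustration only.

```r
library(dplyr)

# Assumed sample data, rebuilt from the two groups shown in the question
stations <- paste0("Station", 1:18)
dat <- tibble(
  Date     = "2016-01-03",
  ID       = rep(c("Fred", "Mark"), each = 18),
  Station  = rep(stations, times = 2),
  # Fred is present only at Station4; Mark is present nowhere
  Presence = c(as.integer(stations == "Station4"), rep(0L, 18))
)

result <- dat %>%
  group_by(Date, ID) %>%
  # all-zero groups are kept whole; mixed groups keep only their 1s
  filter(all(Presence == 0) | Presence == 1) %>%
  ungroup()

nrow(result)  # 19: one row for Fred plus all 18 rows for Mark
```

The trailing ungroup() is optional. It only drops the grouping so that later steps are not applied per Date and ID by accident.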
Onyambu