Subsetting a dataframe with repeated observations based on conditions

Question

Here is an example of my dataframe:

ID <- rep(c(1, 2, 3), each = 4)
value <- c(3, 3, 1, 1, 4, 4, 4, 4, 6, 6, 9, 9)
Group <- rep(c("Group 1", "Group 2", "Group 2", "Group 2", "Group 1", "Group 2"), each = 2)
data <- data.frame(ID, Group, value)
data
#>    ID   Group value
#> 1   1 Group 1     3
#> 2   1 Group 1     3
#> 3   1 Group 2     1
#> 4   1 Group 2     1
#> 5   2 Group 2     4
#> 6   2 Group 2     4
#> 7   2 Group 2     4
#> 8   2 Group 2     4
#> 9   3 Group 1     6
#> 10  3 Group 1     6
#> 11  3 Group 2     9
#> 12  3 Group 2     9

Some IDs appear in both Group 1 and Group 2 (for example, IDs 1 and 3), while others appear in only one group (for example, ID 2). For each ID, I want to keep only the rows from the group ("Group" column, either Group 1 or Group 2) that has the minimum "value", such that every ID is retained, including the IDs that appear in only one group. The output I expect is this:

    ID <- rep(c(1,1,2,2,2,2,3,3))
    value <-  rep(c(1,1,4,4,4,4,6,6))
    Group <-  rep(c("Group 2","Group 2","Group 2","Group 2","Group 2","Group 2","Group 1","Group 1"))
    data <- data.frame(ID, Group,value)
    data
#>   ID   Group value
#> 1  1 Group 2     1
#> 2  1 Group 2     1
#> 3  2 Group 2     4
#> 4  2 Group 2     4
#> 5  2 Group 2     4
#> 6  2 Group 2     4
#> 7  3 Group 1     6
#> 8  3 Group 1     6

So the number of distinct IDs after subsetting must be the same as the number of distinct IDs before subsetting. The only thing that changes is the number of rows, which decreases.

If I didn't specify the problem clearly enough, feel free to ask and I will try to explain it better. Thank you all in advance!

score 0 · Accepted Answer · answered May 20 '20 at 20:52

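We can group by 'ID' and filter the rows where 'value' equals the minimum within that ID. Every ID has a minimum, so no ID is dropped, including IDs that appear in only one group.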

library(dplyr)
data %>%
  group_by(ID) %>%
  filter(value == min(value))
# A tibble: 8 x 3
# Groups:   ID [3]
#     ID Group   value
#  <dbl> <chr>   <dbl>
#1     1 Group 2     1
#2     1 Group 2     1
#3     2 Group 2     4
#4     2 Group 2     4
#5     2 Group 2     4
#6     2 Group 2     4
#7     3 Group 1     6
#8     3 Group 1     6
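
For readers who prefer not to load dplyr, the same per-ID minimum filter can probably be sketched in base R with `ave()`. The names `min_per_id` and `result` are made up for this illustration and are not part of the original answer. The last line checks the requirement from the question that no ID is lost.

```r
# Rebuild the example data from the question
ID <- rep(c(1, 2, 3), each = 4)
value <- c(3, 3, 1, 1, 4, 4, 4, 4, 6, 6, 9, 9)
Group <- rep(c("Group 1", "Group 2", "Group 2", "Group 2", "Group 1", "Group 2"), each = 2)
data <- data.frame(ID, Group, value)

# ave() returns, for every row, the minimum of 'value' within that row's ID
# (min_per_id is a hypothetical helper name)
min_per_id <- ave(data$value, data$ID, FUN = min)

# Keep only the rows whose value equals the minimum for their ID
result <- data[data$value == min_per_id, ]

# Sanity check: the set of distinct IDs is unchanged by the subsetting
stopifnot(setequal(unique(result$ID), unique(data$ID)))
```

Note that both this sketch and the dplyr version keep every row that ties for the minimum. If an ID had the same minimum value in both groups, rows from both groups would be kept, which may or may not be what is wanted.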
