0

I've created the following MWE

test_data <- data.frame(ID = c(1,2,2,3,4,4,4,5),
                    Date_From = as.Date(c("2022-10-17", "2022-10-17", "2022-10-17", "2022-09-12", "2022-09-12",
                                          "2022-09-12","2022-09-02","2022-09-02")),
                    Date_To = as.Date(c("2022-10-24", "2022-10-24", "2022-10-24", "2022-09-17", "2022-09-17",
                                        "2022-09-17", "2022-09-07", "2022-09-07")),
                    Category = c("Cars", "Books", "Books", "Fruits", "Fish", "Fish", "Fish", "Fish"),
                    Value = c(10:3))

which outputs

  ID  Date_From    Date_To Category Value
1  1 2022-10-17 2022-10-24     Cars    10
2  2 2022-10-17 2022-10-24    Books     9
3  2 2022-10-17 2022-10-24    Books     8
4  3 2022-09-12 2022-09-17   Fruits     7
5  4 2022-09-12 2022-09-17     Fish     6
6  4 2022-09-12 2022-09-17     Fish     5
7  4 2022-09-02 2022-09-07     Fish     4
8  5 2022-09-02 2022-09-07     Fish     3

I want to create a subset of that data frame based on the following condition:

If the ID, Date_From, Date_To and Category column is the same for multiple rows, display only the first one of those and also display all other rows. In this particular example,

  • the rows 2-3 have the same values inside those columns and I want to display only the 2nd row.
  • the rows 5-6 have the same values inside those columns and I want to display only the 5th row.
  • also I want to display "all other rows" with non-matching columns => row 1,4,7,8

The final data frame should look like this

  ID  Date_From    Date_To Category Value
1  1 2022-10-17 2022-10-24     Cars    10
2  2 2022-10-17 2022-10-24    Books     9
3  3 2022-09-12 2022-09-17   Fruits     7
4  4 2022-09-12 2022-09-17     Fish     6
5  4 2022-09-02 2022-09-07     Fish     4
6  5 2022-09-02 2022-09-07     Fish     3
Vala
  • 133
  • 4
  • 1
    Use `slice` to subset groups by row number. `test_data %>% group_by(ID, Date_From, Date_To, Category) %>% slice(1)` – Gregor Thomas Oct 10 '22 at 14:10
  • A note on your call-out *"also I want to display "all other rows" with non-matching columns => row 1,4,7,8"* - rows with non-matching columns will each be their own group. Getting the first row of all groups will automatically keep these also. – Gregor Thomas Oct 10 '22 at 14:14

0 Answers0