I want to filter a data frame to include only rows that have matching values in certain columns.

My data:

library(lubridate)  # provides ymd()

df <- data.frame("Date" = ymd(c("2005-01-01", "2005-01-02", "2005-01-02", "2005-01-01", "2005-01-01")),
                 "Person" = c("John", "John", "John", "Maria", "Maria"),
                 "Job" = c("OR", "ER", "Heart", "Liver", "CV"),
                 "Type" = c("Day", "Night", "Night", "Day", "Night"))

I want to create a smaller data frame that keeps only the rows that share the same date, person, and type with at least one other row.

The data frame I want to see is this:

df1 <- data.frame("Date" = ymd(c("2005-01-02", "2005-01-02")),
                  "Person" = c("John", "John"),
                  "Job" = c("ER", "Heart"),
                  "Type" = c("Night", "Night"))
Cat

1 Answer

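We can use group_by and filter from dplyr: group by the three columns, then keep only the groups that contain more than one row (n() returns the number of rows in the current group):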

library(dplyr)

df %>%
  group_by(Date, Person, Type) %>%
  filter(n() > 1)

Output:

# A tibble: 2 x 4
# Groups:   Date, Person, Type [1]
  Date       Person Job   Type 
  <date>     <fct>  <fct> <fct>
1 2005-01-02 John   ER    Night
2 2005-01-02 John   Heart Night
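
For comparison, here is a minimal base R sketch of the same idea that avoids the dplyr dependency. It assumes the three key columns are Date, Person, and Type as in the question, and uses as.Date() in place of ymd() so it runs without lubridate; the names keys and is_dup are illustrative only.

```r
# Rebuild the example data from the question (as.Date() stands in for ymd())
df <- data.frame(
  Date   = as.Date(c("2005-01-01", "2005-01-02", "2005-01-02", "2005-01-01", "2005-01-01")),
  Person = c("John", "John", "John", "Maria", "Maria"),
  Job    = c("OR", "ER", "Heart", "Liver", "CV"),
  Type   = c("Day", "Night", "Night", "Day", "Night")
)

# Columns that must match for two rows to count as a pair
keys <- df[, c("Date", "Person", "Type")]

# duplicated() flags every repeat after the first occurrence;
# fromLast = TRUE flags every repeat before the last occurrence.
# Combining both marks all rows whose key combination occurs more than once.
is_dup <- duplicated(keys) | duplicated(keys, fromLast = TRUE)

df1 <- df[is_dup, ]
df1
```

Note that the grouped dplyr result stays grouped by Date, Person, and Type, so adding ungroup() at the end of that pipeline may be worthwhile if later steps should not operate per group.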
acylam