
I have a data frame from which I want to remove the rows for a Month if its associated Index occurs fewer than 2 times within an ID.

ID = c(rep("A", 5), rep("B", 5))
Month = c(1, 1, 2, 2, 3, 1, 2, 2, 3, 3)
Index = c("X1", "X1", "X2", "X2", "X3", "X1", "X2", "X2", "X2", "X3")
df = data.frame(ID, Month, Index, stringsAsFactors = FALSE)
df$Month <- as.factor(df$Month)
df

Here, X3 occurs only once for A and X1 occurs only once for B, so those rows should be deleted.

But if I filter my data using %in% for X3 and X1, those values are also removed from the rows of the other ID.

The deletion should be specific to each group.
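
To make the problem concrete, here is a minimal base R sketch of the global %in% filter described above. It rebuilds the example data so it runs on its own; the object name `global` is only illustrative and not part of the original code.

```r
# Rebuild the example data from the question
df <- data.frame(
  ID    = c(rep("A", 5), rep("B", 5)),
  Month = factor(c(1, 1, 2, 2, 3, 1, 2, 2, 3, 3)),
  Index = c("X1", "X1", "X2", "X2", "X3", "X1", "X2", "X2", "X2", "X3"),
  stringsAsFactors = FALSE
)

# A global filter ignores ID: every X1 and X3 row is dropped,
# including the two X1 rows of A, where X1 occurs twice
global <- df[!df$Index %in% c("X1", "X3"), ]
global
```

Only the X2 rows survive, which shows why the count has to be taken per group rather than over the whole data frame.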

adkane
  • Does this answer your question? [Using filter with count](https://stackoverflow.com/questions/26573285/using-filter-with-count) – camille Nov 11 '19 at 17:31
  • There's a *very* in-depth comparison of different methods that might help here: https://stackoverflow.com/q/43110349/5325862. Also many options here: https://stackoverflow.com/q/20204257/5325862 – camille Nov 11 '19 at 17:31

1 Answer


We can group by the three columns and keep only the groups that have more than one row:

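library(dplyr)
df %>%
  # one group per ID/Month/Index combination
  group_by(ID, Month, Index) %>%
  # n() is the group size; keep groups with at least two rows
  filter(n() > 1)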
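
As a rough cross-check without dplyr, the same per-group count can be sketched in base R with ave(). This is an alternative technique, not the code above; the names `n_full`, `n_id_index`, `kept_full` and `kept_id_index` are illustrative. It also shows that the result depends on the grouping columns: grouping by ID and Index only (an assumption about one possible reading of the question) keeps one more row.

```r
# Rebuild the example data from the question
df <- data.frame(
  ID    = c(rep("A", 5), rep("B", 5)),
  Month = factor(c(1, 1, 2, 2, 3, 1, 2, 2, 3, 3)),
  Index = c("X1", "X1", "X2", "X2", "X3", "X1", "X2", "X2", "X2", "X3"),
  stringsAsFactors = FALSE
)

# Size of each ID/Month/Index group, repeated on every row of that group
n_full <- ave(seq_len(nrow(df)), df$ID, df$Month, df$Index, FUN = length)
kept_full <- df[n_full > 1, ]

# Same idea, but counting within ID and Index only
n_id_index <- ave(seq_len(nrow(df)), df$ID, df$Index, FUN = length)
kept_id_index <- df[n_id_index > 1, ]

nrow(kept_full)      # 6
nrow(kept_id_index)  # 7
```

With all three columns in the grouping, B's single X2 row in month 3 is dropped as well; with ID and Index only, it is kept because X2 occurs three times for B.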
akrun