I am looking for some help with a filter function in R. Hope you guys can help me.

I have the following example table:

1st 2nd 3rd 4th
A K2 S2 13
B K1 S1 31
B K2 S1 68
A K1 S1 101
B K1 S1 129
A K2 S1 500
B K1 S1 129
A K2 S1 50

I want to filter out (remove) these specific row combinations from the data set, e.g.:

1st = "A" & 2nd ="K2" & !4th > 100
AND
1st = "A" & 2nd ="K1" & !4th > 50
AND
1st = "B" & 2nd ="K1" & !4th > 64

Is there any special filter to do that?

Knstntn
  • Just create a logical expression by changing the `=` to `==` and `AND` to `&`, then negate (`!`) the whole expression in `subset`, i.e. `subset(yourdata, yourexpr)` – akrun Jun 11 '22 at 20:39
  • Thanks for including some test data. You should also include the code you’ve tried. https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example You might find the `filter()` function in `library(dplyr)` useful. – John Polo Jun 11 '22 at 20:40
  • Please provide enough code so others can better understand or reproduce the problem. – Community Jun 12 '22 at 00:36
  • Thanks for the feedback first of all. My code was: ....filter(1st=="A" & 2nd == "K2" & !4th>100) %>% filter(1st=="A" & 2nd == "K1" & !4th>50) %>% filter........it did not deliver a result I wanted. – Knstntn Jun 12 '22 at 08:55
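One possible reading of the `subset()` suggestion in the comments, sketched in base R. The column names `first` to `fourth` are hypothetical stand-ins for `1st` to `4th` (which are not syntactic R names), and the sketch assumes the goal is to drop rows whose fourth value exceeds the threshold given for each combination. The three combinations are joined with `|` here because a row should be dropped if it matches any one of them.

```r
# Example table from the question, with hypothetical syntactic column names
dat <- data.frame(
  first  = c("A", "B", "B", "A", "B", "A", "B", "A"),
  second = c("K2", "K1", "K2", "K1", "K1", "K2", "K1", "K2"),
  third  = c("S2", "S1", "S1", "S1", "S1", "S1", "S1", "S1"),
  fourth = c(13, 31, 68, 101, 129, 500, 129, 50)
)

# TRUE for every row that matches at least one combination to be removed
to_drop <- with(dat,
  (first == "A" & second == "K2" & fourth > 100) |
  (first == "A" & second == "K1" & fourth > 50)  |
  (first == "B" & second == "K1" & fourth > 64)
)

# Negate the whole expression so subset() keeps everything else
result <- subset(dat, !to_drop)
```

Under these assumptions, `result` holds the rows whose fourth values are 13, 31, 68 and 50.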

1 Answer

The filter() function from dplyr keeps the rows for which every condition is TRUE; conditions separated by commas are combined with AND. You can think of each of your “combinations” as a set of logical criteria that identifies rows you want excluded from your results, so wrap each set as a whole in a NOT expression, !( ... ), and filter() keeps everything else.

For example:

library("tidyverse")

Var_1 <- c("A","B","B","A","B","A","B","A")
Var_2 <- c("K2","K1","K2","K1","K1","K2","K1","K2")
Var_3 <- c("S2","S1","S1","S1","S1","S1","S1","S1")
Var_4 <- c(13,31,68,101,129,500,129,50)

Test_Data_1 <- data.frame(Var_1,Var_2,Var_3,Var_4)

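# Each !( ... ) excludes the rows matching one combination;
# the commas between the conditions act as AND.
Test_Data_2 <- Test_Data_1 %>%
  filter(!(Var_1 == "A" & Var_2 == "K2" & Var_4 >= 100),
         !(Var_1 == "A" & Var_2 == "K1" & Var_4 >= 50),
         !(Var_1 == "B" & Var_2 == "K1" & Var_4 >= 64))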
  • Great, thanks a lot. My code was similar; the mistake I made was to place the exclamation mark before Var_4 and not before the whole line. (.....& !Var_4 >100...) – Knstntn Jun 12 '22 at 09:13
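The difference described in that last comment can be seen by comparing the two logical vectors directly. This is a minimal base R sketch using the answer's test vectors and the `> 100` threshold from the comment; the names `keep_wrong` and `keep_right` are made up for illustration. In R, `!` binds more loosely than comparison operators, so `!Var_4 > 100` is read as `!(Var_4 > 100)`, and the negation then applies only to that last comparison rather than to the whole combination.

```r
# Test vectors from the answer above
Var_1 <- c("A", "B", "B", "A", "B", "A", "B", "A")
Var_2 <- c("K2", "K1", "K2", "K1", "K1", "K2", "K1", "K2")
Var_4 <- c(13, 31, 68, 101, 129, 500, 129, 50)

# Negation on the last comparison only: parsed as
# Var_1 == "A" & Var_2 == "K2" & !(Var_4 > 100),
# which is TRUE only for A/K2 rows with Var_4 <= 100
keep_wrong <- Var_1 == "A" & Var_2 == "K2" & !Var_4 > 100

# Negation on the whole combination: FALSE only for A/K2 rows with Var_4 > 100
keep_right <- !(Var_1 == "A" & Var_2 == "K2" & Var_4 > 100)

which(keep_wrong)  # rows 1 and 8
which(keep_right)  # every row except row 6
```

Passing `keep_wrong` to filter() would keep only two rows, and chaining a second filter() for the A/K1 combination would then leave nothing, which would explain why the original pipeline did not give the expected result.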