I have data that looks something like:

print(dat)
i1  i2  node_id
 4   4        8
 4   5        8
 3   2        9
 5   1        8

Using either dplyr's filter() or subset() (preferably filter()), I would like to reverse-filter the data so that I get this:

 print(dat)
 i1  i2  node_id
 4   4        8
 4   5        8
 5   1        8

I say reverse filter because instead of filtering or sub-setting like:

dat<-filter(dat,node_id==8)
dat<-subset(dat,node_id==8)

I would like to do this by telling R to keep everything except the rows where node_id == 9. I have tried:

dat<-filter(dat,-node_id==9)
dat<-subset(dat,-node_id==9)

But neither works. Any suggestions? Thanks.

David Arenburg
costebk08

2 Answers

Reverse filtering for <, >, <=, >=

I know it's not specifically asked, but the same idea applies to reverse filtering on <, >, <= and >= conditions. If you want all rows with node_id < 9, the following works:

dat <- dat %>% filter(!node_id >= 9)

which is the same as

dat <- dat %>% filter(node_id < 9)
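
As a minimal sketch, the equivalence can be checked on the question's sample data. It uses base R's subset() so that it runs without extra packages; the same logical expressions can be passed to dplyr's filter(). The object names negated, direct, minus_try and not_nine are made up for illustration.

```r
# Sample data reconstructed from the question
dat <- data.frame(
  i1      = c(4, 4, 3, 5),
  i2      = c(4, 5, 2, 1),
  node_id = c(8, 8, 9, 8)
)

# `!` binds more loosely than `>=`, so !node_id >= 9 parses as !(node_id >= 9).
# The explicit parentheses only make that grouping visible.
negated <- subset(dat, !(node_id >= 9))
direct  <- subset(dat, node_id < 9)

# Unary minus binds more tightly than `==`, so -node_id == 9 compares the
# negated values (-8, -8, -9, -8) with 9 and matches no rows.
minus_try <- subset(dat, -node_id == 9)

# Logical negation is what keeps everything except node_id == 9
not_nine <- subset(dat, node_id != 9)

print(not_nine)
```

Here negated, direct and not_nine all hold the three rows with node_id 8, while minus_try is empty, which is why the attempt in the question returns nothing.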

Aks

subset() with multiple conditions

This is not directly relevant to the OP, but it could help someone who has a subset() call with multiple conditions.

Say you have a data frame dat and you have a subset of dat named A.df. You want to get B.df which excludes A.df from dat.

One approach is using ! to reverse the combination of conditions:

A.df <- subset(dat, Col1 %in% criteria | Col2 %in% criteria | Col3 %in% criteria)

becomes

B.df <- subset(dat, !(Col1 %in% criteria | Col2 %in% criteria | Col3 %in% criteria))

But this might not be suitable for nested subsets (i.e. a subset of a subset, and so on).
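
A small runnable sketch of the negation approach follows. The column names and criteria match the ones used above, but the data values are hypothetical and chosen only for illustration.

```r
# Hypothetical data; Col1..Col3 and `criteria` follow the names used above
dat <- data.frame(
  Col1 = c("a", "b", "c", "d", "e"),
  Col2 = c("x", "a", "y", "z", "x"),
  Col3 = c("p", "q", "r", "b", "s"),
  stringsAsFactors = FALSE
)
criteria <- c("a", "b")

# Rows where any of the three columns matches the criteria
A.df <- subset(dat, Col1 %in% criteria | Col2 %in% criteria | Col3 %in% criteria)

# Wrapping the whole combined condition in !( ) keeps the complement
B.df <- subset(dat, !(Col1 %in% criteria | Col2 %in% criteria | Col3 %in% criteria))

print(B.df)
```

With these values, A.df holds rows 1, 2 and 4 and B.df holds rows 3 and 5, so the two together cover every row of dat exactly once.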

Another approach is to use rownames() to exclude specific rows. This works both for a subset with multiple conditions and for nested subsets.

Say dat has rows 1, 2, 3, 4, 5 and A.df has rows 3 and 4. Excluding those rows leaves 1, 2, 5 for B.df.

dat$ID <- rownames(dat)
B.df <- subset(dat, !(ID %in% rownames(A.df)))

The first line takes the row names of dat (by default 1, 2, 3, ...) and stores them in a new column, ID.
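
The row-name approach can be sketched end to end with a nested subset. The columns value and group are hypothetical, and the sketch assumes A.df was built with subset() (or `[`), which keep the original row names of dat.

```r
# Hypothetical data with default row names "1" to "5"
dat <- data.frame(
  value = c(10, 20, 30, 40, 50),
  group = c("a", "b", "b", "b", "a"),
  stringsAsFactors = FALSE
)

# A nested subset: first by group, then by value; this keeps rows 3 and 4
A.df <- subset(subset(dat, group == "b"), value > 20)

# Store the row names in a column, then drop every row that appears in A.df
dat$ID <- rownames(dat)
B.df <- subset(dat, !(ID %in% rownames(A.df)))

print(B.df$ID)  # rows 1, 2 and 5 remain
```

Because the exclusion only looks at row names, it does not matter how many conditions or nesting levels produced A.df, as long as those row names still refer to rows of dat.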

Saftever