How to delete rows with a specific column condition in R?

Question

Hello all, my df looks like this:

I want to delete the records of any PID that has a row with Stage >= 4.

Expected output

Thanks in advance

`RowsOfInterest <- df[ , 'Stage' ] < 4; Result <- df[ RowsOfInterest , ];` — Tasos Papastylianou, Sep 10 '20 at 08:18

score 0 · Accepted Answer · answered Sep 10 '20 at 08:02

Keep only the PID groups in which every Stage value is less than 4.

library(dplyr)
df %>% group_by(PID) %>% filter(all(Stage < 4))

#    PID Stage
#  <int> <int>
#1   124     1
#2   124     3
#3   137     2
#4   137     3
#5   178     1
#6   178     2
#7   178     1

This can also be written in data.table:

library(data.table)
setDT(df)[, .SD[all(Stage < 4)], PID]

and in base R:

subset(df, ave(Stage < 4, PID, FUN = all))

data

df <- structure(list(PID = c(123L, 123L, 123L, 124L, 124L, 137L, 137L, 
153L, 153L, 153L, 167L, 167L, 178L, 178L, 178L, 187L, 187L), 
    Stage = c(1L, 2L, 4L, 1L, 3L, 2L, 3L, 1L, 4L, 5L, 4L, 5L, 
    1L, 2L, 1L, 3L, 4L)), class = "data.frame", row.names = c(NA, -17L))
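
To see why the filter has to work per PID rather than per row, here is a minimal base R sketch on a shortened, illustrative copy of the data (only PIDs 123 and 124; the names row_wise, keep and group_wise are made up for this example). It contrasts a plain row filter with the ave() approach:

```r
# Shortened illustrative data: PID 123 has one row with Stage >= 4.
df <- data.frame(
  PID   = c(123L, 123L, 123L, 124L, 124L),
  Stage = c(1L, 2L, 4L, 1L, 3L)
)

# Row-wise filter: drops only the offending row,
# so PID 123 still keeps its Stage 1 and Stage 2 rows.
row_wise <- df[df$Stage < 4, ]

# Group-wise filter: ave() applies all() within each PID and
# repeats the result on every row of that PID.
keep <- ave(df$Stage < 4, df$PID, FUN = all)
group_wise <- df[keep, ]

print(row_wise)
print(group_wise)
```

With this data the row-wise result still contains PID 123, while the group-wise result contains only PID 124.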

score 0 · Answer 2 · answered Sep 10 '20 at 08:07

A dplyr solution without group_by():

library(dplyr)

df %>% filter(!PID %in% PID[Stage >= 4])

#   PID Stage
# 1 124     1
# 2 124     3
# 3 137     2
# 4 137     3
# 5 178     1
# 6 178     2
# 7 178     1

Its base R version:

subset(df, !PID %in% PID[Stage >= 4])
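
The `%in%` expression can be unpacked into steps. This is only a sketch on a shortened, illustrative copy of the data; bad_pids, keep and result are hypothetical names introduced here:

```r
# Shortened illustrative data: PIDs 123 and 153 each reach Stage >= 4.
df <- data.frame(
  PID   = c(123L, 123L, 123L, 124L, 124L, 153L, 153L, 153L),
  Stage = c(1L, 2L, 4L, 1L, 3L, 1L, 4L, 5L)
)

# Step 1: collect the PIDs that have at least one row with Stage >= 4.
bad_pids <- unique(df$PID[df$Stage >= 4])

# Step 2: flag every row whose PID is not in that set.
keep <- !(df$PID %in% bad_pids)

# Step 3: keep only the flagged rows.
result <- df[keep, ]

print(bad_pids)
print(result)
```

unique() is not needed for correctness, since `%in%` ignores duplicates; it is used here only to make the intermediate set easier to read.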
