Is there a way to use a cleaner to omit specific cases? (R)

Question

I have a dataframe from which I want to omit cases where age is 30 or less. I know you can use na.omit to omit NA cases, but how would I omit specific cases like this?

Please refer to this guide on writing a good, easy-to-understand question, so others can understand your question thoroughly. https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example — Sinh Nguyen, Feb 08 '22 at 02:24
Your question is not very clear. Can you show some example input, and desired output? — neilfws, Feb 08 '22 at 02:40

score 0 · Answer 1 · answered Feb 08 '22 at 02:23

This seems to be more of a filtering problem than a matter of omitting missing values:

library(dplyr)  # provides tibble(), filter() and %>%

df <- tibble(age = c(20, 25, 30, 35, 40))

# keep only the rows where age is below 30
df %>% filter(age < 30)
# A tibble: 2 × 1
    age
  <dbl>
1    20
2    25
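If the intent is the reverse, i.e. dropping the rows with age 30 or less as the question's wording suggests, the comparison simply flips. A minimal base R sketch on the same toy ages (the names `older` and `older_subset` are only illustrative):

```r
# Toy data mirroring the example above (base R only, no packages needed)
df <- data.frame(age = c(20, 25, 30, 35, 40))

# Keep rows with age above 30, i.e. omit every case aged 30 or less.
# drop = FALSE keeps the result a data frame even with a single column.
older <- df[df$age > 30, , drop = FALSE]

# subset() expresses the same condition without repeating df$
older_subset <- subset(df, age > 30)

print(older)
```

Whether the boundary value 30 should be kept or dropped depends on the exact requirement; use `>=` instead of `>` to keep it.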

AugtPelle


AndrewGB · Answer 2 · 2022-02-08T04:05:23.963

With base R, you can filter out all rows where the age is 30 or greater, keeping only the ages below 30.

df[df$age < 30,]

    age values
  <int>  <dbl>
1    21  1.89 
2    22  1.01 
3    23  0.107
4    24  1.46 
5    25  1.17 
6    26  1.86 
7    27  1.77 
8    28  1.91 
9    29  0.594

Or with data.table:

library(data.table)

dt <- data.table(df)
dt[age < 30]
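One caveat with the bracket form, relevant if `age` itself can be missing: a logical index that contains NA produces an all-NA row, whereas `which()` and `subset()` drop it. A small sketch on hypothetical data (`df_na` is not part of the original example):

```r
# Hypothetical toy data with one missing age
df_na <- data.frame(age = c(21, NA, 35), values = c(1.5, 0.7, 1.2))

# A logical index containing NA yields an all-NA row for that position
with_bracket <- df_na[df_na$age < 30, ]

# which() returns only the positions that are TRUE, so the NA is dropped
with_which <- df_na[which(df_na$age < 30), ]

# subset() likewise treats NA in the condition as FALSE
with_subset <- subset(df_na, age < 30)

nrow(with_bracket)  # 2: the age-21 row plus an all-NA row
nrow(with_which)    # 1
nrow(with_subset)   # 1
```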

However, if you only want to drop NAs for the rows where the age is greater than 30, you can build a condition that is TRUE for rows with age greater than 30 and an NA in another column, and then negate it to exclude those rows.

df[!(df$age > 30 & is.na(df$values)),]

Or with subset:

subset(df, !(age > 30 & is.na(values)))

With tidyverse:

library(tidyverse)

df %>% 
  filter(!(age > 30 & is.na(values)))

With data.table:

dt <- data.table(df)
dt[!(age > 30 & is.na(values))]
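The data in this answer contains no NAs, so the conditional versions above return every row. To see the effect, here is a small sketch on hypothetical data with missing values on both sides of the age cut-off (`df_na`, `drop_row` and `kept` are illustrative names):

```r
# Hypothetical toy data: NAs in `values` on both sides of the age cut-off
df_na <- data.frame(
  age    = c(25, 28, 32, 36, 40),
  values = c(NA, 1.2, NA, 0.8, NA)
)

# TRUE only for rows that are both older than 30 and missing `values`
drop_row <- df_na$age > 30 & is.na(df_na$values)

# Negate the condition to keep everything else
kept <- df_na[!drop_row, ]

print(kept)

# For contrast, na.omit() removes every row with an NA, regardless of age
all_complete <- na.omit(df_na)
```

Here `kept` retains the age-25 row even though its `values` entry is NA, while `na.omit()` would discard it.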

Data

df <- structure(list(age = 21:40, 
                     values = c(1.88648780807853, 1.01084147393703, 
                                0.107075828593224, 1.46145519195125, 1.16910230834037, 1.85718628577888, 
                                1.7749991081655, 1.91132036875933, 0.594451983459294, 0.976039483677596, 
                                1.31880497187376, 1.82749796425924, 1.98314357083291, 0.57053042575717, 
                                0.722490054555237, 1.66634088428691, 0.702816031407565, 0.622223159298301, 
                                0.298387756571174, 1.6071562608704)), 
                class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -20L))
