0

I am trying to select all those households where the occupants age group are below 15 and between 30 to 40 from the below data frame. Basically I am looking for households with kids along with their parents

household Members_age
   100          75
   100          74
   100          30
   101          20
   101          50
   101          60
   102          35
   102          40
   102           5

Here only Household 102 satisfy the condition.

I got hint over here subset data based on condition in r but not able to add the 2nd condition to filter i.e. age between 30 - 40

Sample dataset:

library(dplyr)
library(sqldf)
data <- data.frame(household = c(100,100,100,101,101,101,102,102,102),
           Members_age = c(75,74,30,20,50,60,35,40,5))
jay.sf
  • 60,139
  • 8
  • 53
  • 110
joy_1379
  • 487
  • 3
  • 17

4 Answers4

1

We may use

subset(data, Members_age < 15 | (Members_age >30 & Members_age < 40))

-output

 household Members_age
7       102          35
9       102           5

According to the OP's post Here only Household 102 satisfy the condition.

akrun
  • 874,273
  • 37
  • 540
  • 662
1

We can group_by household and look for groups (households) that have both at least one age < 15 and at least one age between 30 and 40.

data %>%
  group_by(household) %>%
  filter(any(Members_age < 15) & any(between(Members_age, 30, 40))) %>%
  ungroup() %>%
  distinct(household)
# # A tibble: 1 x 1
# household
# <dbl>
#   1       102
Gregor Thomas
  • 136,190
  • 20
  • 167
  • 294
1

Using data.table.

library(data.table)
setDT(data)[Members_age < 15 | (Members_age > 30 & Members_age < 40)]
#    household Members_age
# 1:       102          35
# 2:       102           5
jay.sf
  • 60,139
  • 8
  • 53
  • 110
0

An alternative using dplyr

library(dplyr)

data %>% 
  filter(Members_age < 15 | (Members_age > 30 & Members_age < 40))
Harrison Jones
  • 2,256
  • 5
  • 27
  • 34