My simplified code takes two parameters, `age` and `gender`; however, I would like to pick cases by gender only or by age only. I am wondering how to overload the function as `getIDs(age)` and `getIDs(gender)` without duplicating the same code again and again; assume you have 50 parameters. I tried `getIDs(age, "")`, but I do not think it is a good idea.
getIDs <- function(age, gender) {
  # https://stackoverflow.com/a/40330110/54964
  ageIDs <- c(1, 2, 3)     # dummy IDs matching the age criterion
  genderIDs <- c(2, 3, 4)  # dummy IDs; genderIDs should not be used if gender is ""
  intersect(ageIDs, genderIDs)
}
Main data
ID,Age,Gender
100,69,male
101,75,female
102,84,female
103,,male
104,66,female
Data 2
DF <- structure(list(ID = 100:104, Age = c(69L, 75L, 84L, NA, 66L), Gender =
c("male", "female", "female", "male", "female")), .Names = c("ID", "Age",
"Gender"), row.names = c(NA, -5L), class = "data.frame")
Similarly for age: if `age == ""`, do not include the `ageIDs` subset in the intersection.
A parameter value that selects all males would be great, so that you do not need to pass "male", "male", ... explicitly.
Algorithm based on Roman's answer
I think this strategy becomes very cumbersome with 50 parameters, so a better way is still needed.
getIDs <- function(age, gender) {
  # https://stackoverflow.com/a/40330110/54964
  # So if you called this as getIDs(c(20, 30), "male")
  # you'd get the IDs of all males with age >= 20 and <= 30
  #
  # NULL = ALL
  # getIDs(age = c(1,2), gender = NULL)
  # getIDs(age = NULL, gender = "male")
  data <- read.csv("/home/masi/data.csv", header = TRUE, sep = ",")

  if (is.null(gender)) {
    genderIDs <- data$ID
  } else {
    genderIDs <- data$ID[which(data$Gender == gender)]
  }

  if (is.null(age)) {
    # No age filter: keep every ID, including rows whose Age is NA
    # (a c(0, 130) range would silently drop those rows)
    ageIDs <- data$ID
  } else if (length(age) == 1) {
    # A single value means an exact match
    ageIDs <- data$ID[which(data$Age == age)]
  } else {
    # Two values mean an inclusive range c(min, max)
    ageIDs <- data$ID[which(data$Age >= age[1] & data$Age <= age[2])]
  }

  intersect(ageIDs, genderIDs)
}
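To illustrate why the per-parameter `if (is.null(...))` blocks above grow quickly, here is a minimal sketch of one possible alternative, not a definitive solution: each filter is passed as a named argument whose name matches a column, and a single loop applies them. The helper name `getIDsBy` and its conventions are my own assumptions: `NULL` means no filter, a numeric vector of length 2 is read as an inclusive range (as in the code above), and anything else is matched with `%in%`. It uses the `DF` example data instead of reading the CSV file.

```r
# Example data from the question (IDs 100-104).
DF <- data.frame(
  ID = 100:104,
  Age = c(69L, 75L, 84L, NA, 66L),
  Gender = c("male", "female", "female", "male", "female"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: every named argument is one filter on a column of `data`.
#   NULL                    -> no filter on that column
#   numeric vector, length 2 -> inclusive range c(min, max)
#   anything else            -> set membership via %in%
getIDsBy <- function(data, ...) {
  criteria <- list(...)
  keep <- rep(TRUE, nrow(data))
  for (column in names(criteria)) {
    stopifnot(column %in% names(data))  # fail early on a misspelled column
    value <- criteria[[column]]
    if (is.null(value)) next            # NULL = ALL, skip this filter
    x <- data[[column]]
    if (is.numeric(value) && length(value) == 2) {
      hit <- x >= value[1] & x <= value[2]
    } else {
      hit <- x %in% value
    }
    # Rows with NA in a filtered column do not match that filter
    keep <- keep & !is.na(hit) & hit
  }
  data$ID[keep]
}

getIDsBy(DF, Gender = "male")                      # 100 103
getIDsBy(DF, Age = c(60, 70))                      # 100 104
getIDsBy(DF, Age = c(60, 80), Gender = "female")   # 101 104
getIDsBy(DF, Age = NULL, Gender = "male")          # 100 103
```

With this shape, adding a 50th parameter only means adding a column to the data; the function body stays the same. One caveat of the assumed convention: a numeric vector of length 2 is always treated as a range, so matching exactly two discrete numeric values would need a different rule.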
OS: Debian 8.5
R: 3.1.1