Is it possible to write something like a multiple if-then statement? For example, I want to specify a range of values that the column Age_OSM must fall within, depending on whether the individual is male or female. If they are male, Age_OSM must fall between 8 and 15 to be true, and anything else is false; if they are female, Age_OSM must fall between 15 and 25 to be true.

Data

example <- structure(list(
  ID = c("BLA61", "BLA203", "BLA216", "BLA53", "BLA33",
         "BLA23", "BLA205", "BLA202", "BLA36", "BLA38", "BLA215", "BLA221",
         "BLA219", "BLA60", "BLA48", "BLA248", "BLA217", "BLA1", "BLA270",
         "BLA224", "BLA16", "BLA213", "BLA74", "BLA300", "BLA17", "BLA214",
         "BLA228", "BLA80", "BLA31", "BLA79", "BLA42", "BLA283", "BLA307",
         "BLA25", "BLA238", "BLA27", "BLA24", "BLA21", "BLA14", "BLA2",
         "BLA211", "BLA294", "BLA2_2022", "BLA4_2022", "BLA5_2022", "BLA6_2022",
         "BLA7_2022", "BLA9_2022", "BLA10_2022", "BLA12_2022", "BLA13_2022",
         "BLA14_2022", "BLA15_2022", "BLA47", "BLA49"),
  Age_OSM = c("12",
              "12", "7", "13", "11", "13", "25", "25", "1", "9", "21", "16",
              "16", "16", "6", "17", "14", "9", "22", "14", "16", "22", "2",
              "13", "14", "17", "21", "19", "42", "8", "25", "16", "12", "27",
              "29", "10", "15", "3", "25", "19", "19", "34", "22", "15", "19",
              "19", "32", "8", "23", "19", "26", "18", "9", "23", "7"),
  Sex = c("Male",
          "Male", "Male", "Male", "Male", "Male", "Male", "Male", "Male",
          "Male", "Male", "Male", "Male", "Male", "Male", "Male", "Male",
          "Male", "Male", "Male", "Male", "Male", "Male", "Male", "Male",
          "Male", "Male", "Female", "Female", "Female", "Female", "Female",
          "Female", "Female", "Female", "Female", "Female", "Female", "Female",
          "Female", "Female", "Female", "Female", "Female", "Female", "Female",
          "Female", "Female", "Female", "Female", "Female", "Female", "Female",
          "Female", "Female")),
  row.names = c(NA, -55L), class = "data.frame")

First ten rows of the current data

        ID Age_OSM  Sex
1    BLA61      12 Male
2   BLA203      12 Male
3   BLA216       7 Male
4    BLA53      13 Male
5    BLA33      11 Male
6    BLA23      13 Male
7   BLA205      25 Male
8   BLA202      25 Male
9    BLA36       1 Male
10   BLA38       9 Male

Example output

        ID Age_OSM  Sex Result
1    BLA61      12 Male   true
2   BLA203      12 Male   true
3   BLA216       7 Male  false
4    BLA53      13 Male   true
5    BLA33      11 Male   true
6    BLA23      13 Male   true
7   BLA205      25 Male  false
8   BLA202      25 Male  false
9    BLA36       1 Male  false
10   BLA38       9 Male   true

2 Answers

3

Given the way you created this data, it seems like you might really dislike the tidyverse. Nevertheless, dplyr has a very useful function, case_when(), that solves this. This will get you what you seek:

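library(dplyr)

example %>%
  # Age_OSM is stored as character, so convert it before comparing
  mutate(Age_OSM = as.numeric(Age_OSM),
         Result = case_when(Sex == 'Male' & Age_OSM >= 8 & Age_OSM <= 15 ~ TRUE,
                            Sex == 'Female' & Age_OSM >= 15 & Age_OSM <= 25 ~ TRUE,
                            TRUE ~ FALSE))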

The perhaps strange-looking TRUE ~ FALSE at the end just means "for cases not explicitly covered by the conditions above, return FALSE".
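Because Result is itself a logical column, the same rule can also be written without case_when() by joining the two conditions with |. The following is only a minimal sketch, assuming dplyr is installed; the data frame toy and its IDs are made up for illustration, and between() is dplyr's inclusive range helper.

```r
library(dplyr)

# Hypothetical toy data in the same shape as `example` (ages stored as character)
toy <- data.frame(
  ID      = c("A1", "A2", "A3", "A4"),
  Age_OSM = c("12", "7", "15", "30"),
  Sex     = c("Male", "Male", "Female", "Female")
)

toy_checked <- toy %>%
  mutate(
    Age_OSM = as.numeric(Age_OSM),  # convert first so comparisons are numeric
    # between(x, left, right) is inclusive: left <= x & x <= right
    Result = (Sex == "Male"   & between(Age_OSM, 8, 15)) |
             (Sex == "Female" & between(Age_OSM, 15, 25))
  )

toy_checked$Result
# [1]  TRUE FALSE  TRUE FALSE
```

case_when() is probably still the better fit once there are more than two outcomes, or when the result is not a plain TRUE/FALSE.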

Justin
  • Hi @Justin, thank you! I created the data example using dput() from my own data. I should really work on creating data manually, though, when it's this simple. It's something I need to get better at. – confusedindividual Apr 21 '23 at 05:55
  • `dput()` is miles better than some of the ways people attempt to share their data on SO. See here for [more info](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – Godrim Apr 21 '23 at 06:07
  • @Justin Note that the column `Age_OSM` is of type "character" and needs to be converted to "numeric" for the comparisons. – DrEspresso Apr 21 '23 at 11:19
1

If you would like to stick with base R, you can also combine multiple logical expressions when subsetting a vector.

So you could first create a new column Result with all values FALSE, and then only set those values to TRUE which satisfy your conditions:

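# Age_OSM is stored as character; convert it so the comparisons are numeric
example$Age_OSM <- as.numeric(example$Age_OSM)

# Start with FALSE everywhere, then flip the rows that meet either condition
example$Result <- FALSE
example$Result[example$Sex == "Male" & example$Age_OSM >= 8 & example$Age_OSM <= 15] <- TRUE
example$Result[example$Sex == "Female" & example$Age_OSM >= 15 & example$Age_OSM <= 25] <- TRUE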

head(example, n = 10)
#        ID Age_OSM  Sex Result
# 1   BLA61      12 Male   TRUE
# 2  BLA203      12 Male   TRUE
# 3  BLA216       7 Male  FALSE
# 4   BLA53      13 Male   TRUE
# 5   BLA33      11 Male   TRUE
# 6   BLA23      13 Male   TRUE
# 7  BLA205      25 Male  FALSE
# 8  BLA202      25 Male  FALSE
# 9   BLA36       1 Male  FALSE
# 10  BLA38       9 Male   TRUE

(Note that I converted column Age_OSM to numeric, in order for the comparisons to work.)
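To illustrate why the conversion matters, here is a small sketch with made-up ages (the vector names are only for illustration). When one side of a comparison is character, R coerces the number to a string and compares the two as text, so "12" sorts before "8".

```r
# Ages stored as character, as in the dput() output
age_chr <- c("12", "7", "9")

# 8 and 15 are coerced to "8" and "15", and the comparison is done as text
chr_result <- age_chr >= 8 & age_chr <= 15
chr_result
# [1] FALSE FALSE FALSE

# After conversion the comparison is numeric and gives the intended answer
age_num <- as.numeric(age_chr)
num_result <- age_num >= 8 & age_num <= 15
num_result
# [1]  TRUE FALSE  TRUE
```

No error or warning is raised in the character case, which makes this kind of mistake easy to miss.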

DrEspresso