I'm looking for an easy way of joining a dataset in R to a lookup dataset. The lookup dataset has min, max and a label as below.
library(dplyr)
min_age <- c(0, .5, 1, 5, 17, 35, 50, 75)
max_age <- c(.5, 1, 5, 17, 35, 50, 75, 125)
age_lbl <- c("< 6 months", "6 months - 1 year", "1 year - 5 years",
"5 years - 17 years", "17 years - 35 years", "35 years - 50 years"
, "50 years - 75 years", "> 75 years")
age_lbl <- as.factor(age_lbl)
lkp_df <- data.frame(min_age = min_age, max_age = max_age, age_grp_lb = age_lbl)
The dataframe would look like below
lkp_df
min_age max_age age_grp_lb
0.0 0.5 < 6 months
0.5 1.0 6 months - 1 year
1.0 5.0 1 year - 5 years
5.0 17.0 5 years - 17 years
17.0 35.0 17 years - 35 years
35.0 50.0 35 years - 50 years
50.0 75.0 50 years - 75 years
75.0 125.0 > 75 years
I would now want to join it with a dataframe that has patient ages as below. The join should be based on min_age and max_age and obtain the age_grp_lbl values in the patient dataframe.
head(pat_df)
pat_id pat_age
1001 14.397769
1002 66.694280
1003 53.628013
1004 58.782156
1005 5.032531
1006 16.430463
I have a roundabout way of dealing with this.
# Introduce dummy column to obtain a cartesian
lkp_df <- lkp_df %>%
mutate(join = 1)
set.seed(6789)
pat_id <- seq(1001, 1075)
pat_age <- runif(75, 0, 95)
pat_df <- data.frame(pat_id, pat_age) %>%
mutate(pat_age_yrs = as.integer(pat_age),
pat_age_mths = as.integer(pat_age * 12)) %>%
# Introduce dummy column to obtain a cartesian
mutate(join = 1) %>%
# Create a cartesian product with the join column
inner_join(lkp_df) %>%
# Filter to keep only required records
filter(pat_age >= min_age & pat_age < max_age) %>%
# Keep only necessary columns
select(pat_id, pat_age, pat_age_yrs, pat_age_mths, age_grp_lb)
Can someone suggest a better way of dealing with similar situation. Thanks in advance. Masoud A R.