I want to check whether the age and the birth year match. What is the easiest way to do that?
I think I can just add the two together and check whether the sum is 2021.
But how would I exclude the rows that don't match?
You could define a little function that takes the date of birth as input and returns the actual age in years as an output:
get_age <- function(DOB) {
  # Whole years elapsed between the date of birth and the current moment
  floor(lubridate::interval(DOB, lubridate::now()) / lubridate::years(1))
}
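Because `get_age()` measures against `lubridate::now()`, its result changes depending on when you run it. If you need reproducible results, one option is to pass the reference date explicitly. The following is only a sketch: the name `get_age_at` and its `ref_date` parameter are assumptions for illustration and are not part of the function above.

```r
# Hypothetical variant of get_age(): `ref_date` is an added, illustrative parameter
get_age_at <- function(DOB, ref_date = Sys.Date()) {
  # Whole years elapsed between the date of birth and the chosen reference date
  floor(lubridate::interval(DOB, ref_date) / lubridate::years(1))
}

# Age on the last day of 2021 for someone born on 2015-09-11
get_age_at(as.Date("2015-09-11"), as.Date("2021-12-31"))
#> [1] 6
```

With the default `ref_date = Sys.Date()` it behaves like the original function, just measured against today's date rather than the current instant.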
For example, suppose you had a data frame of DOB and ages:
df <- data.frame(DOB = as.Date(c("1984-07-01", "1970-09-22",
                                 "2015-09-11", "1999-05-03")),
                 age = c(37, 51, 16, 22))
df
#>          DOB age
#> 1 1984-07-01  37
#> 2 1970-09-22  51
#> 3 2015-09-11  16
#> 4 1999-05-03  22
Now you want to see if the recorded ages match the actual age as calculated by the date of birth. We can add a column giving the calculated age like this:
df$real_age <- get_age(df$DOB)
df
#>          DOB age real_age
#> 1 1984-07-01  37       37
#> 2 1970-09-22  51       51
#> 3 2015-09-11  16        6
#> 4 1999-05-03  22       22
Finally, we can filter the data frame so that we keep only the rows where the recorded age equals the calculated age. Here that drops row 3, whose recorded age of 16 does not match:
df[df$age == df$real_age,]
#>          DOB age real_age
#> 1 1984-07-01  37       37
#> 2 1970-09-22  51       51
#> 4 1999-05-03  22       22
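If you only have the birth year rather than a full date of birth, the idea from the question (birth year plus age should equal 2021) can be sketched in base R as follows. The data frame `df2`, its column `birthyear`, and the variable `ref_year` are hypothetical names chosen for illustration. Note that someone whose birthday has not yet occurred in the reference year will sum to one less than that year, so this sketch accepts both values; that tolerance is an assumption you may want to tighten.

```r
# Hypothetical data: only the birth year is recorded, not the full date
df2 <- data.frame(birthyear = c(1984, 1970, 2015, 1999),
                  age       = c(37, 51, 16, 22))

ref_year <- 2021  # the year in which the ages were recorded (assumed)

# birthyear + age equals ref_year once the birthday has passed,
# and ref_year - 1 if it has not happened yet that year
total <- df2$birthyear + df2$age
keep  <- total == ref_year | total == ref_year - 1

# Keep only the rows where age and birth year are consistent
df2[keep, ]
```

Here the third row is dropped, because 2015 + 16 = 2031.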