I have a dataset in R with people's complete age stored as strings (e.g., "10 years 8 months 23 days"), and I need to transform it into a numeric variable that makes sense. I thought about converting it to the person's age in days, but that is hard because months have different numbers of days. So the best solution might be a double variable that shows age as something like 10.6 or 10.8: a numeric variable that carries the information that "10 years 8 months 5 days" is greater than "10 years 7 months 12 days".
Here is an example of the current variable I have:
library(tibble)
age <- tibble(complete_age =
c("10 years 8 months 23 days",
"9 years 11 months 7 days",
"11 years 3 months 1 day",
"8 years 6 months 12 days"))
age
# A tibble: 4 x 1
complete_age
<chr>
1 10 years 8 months 23 days
2 9 years 11 months 7 days
3 11 years 3 months 1 day
4 8 years 6 months 12 days
Here is an example of a possible outcome I would love to see (the values for age_num are approximate):
> age
# A tibble: 4 x 2
complete_age age_num
<chr> <dbl>
1 10 years 8 months 23 days 10.66
2 9 years 11 months 7 days 9.92
3 11 years 3 months 1 day 11.27
4 8 years 6 months 12 days 8.52
In summary, I have a dataset with the "complete_age" column, and I want to create the "age_num" column.
How can I do that in R? I'm having a hard time trying to use stringr and lubridate, but maybe this is the way to go?
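To make the target concrete, here is a minimal sketch of the kind of conversion I have in mind, not a definitive solution. It assumes that a month counts as 1/12 of a year and a day as 1/365.25 of a year, which is an approximation I chose for illustration. The helper name `age_to_years` is hypothetical, and the sketch uses base R regular expressions rather than stringr or lubridate.

```r
# Hypothetical helper: convert "Y years M months D days" strings to decimal years.
# Assumptions (mine): 1 month = 1/12 year, 1 day = 1/365.25 year.
age_to_years <- function(x) {
  # Extract the number that precedes a unit name (singular or plural).
  # A unit that does not appear in the string contributes 0.
  extract_unit <- function(unit) {
    pattern <- paste0("([0-9]+)[[:space:]]*", unit, "s?")
    hits <- regmatches(x, regexec(pattern, x))
    vapply(
      hits,
      function(hit) if (length(hit) >= 2) as.numeric(hit[2]) else 0,
      numeric(1)
    )
  }

  extract_unit("year") +
    extract_unit("month") / 12 +
    extract_unit("day") / 365.25
}

complete_age <- c("10 years 8 months 23 days",
                  "9 years 11 months 7 days",
                  "11 years 3 months 1 day",
                  "8 years 6 months 12 days")

# Decimal age in years, one value per input string
age_num <- age_to_years(complete_age)
```

With these assumptions, a day count of 30 or less always adds less than one month (30 / 365.25 is smaller than 1 / 12), so the ordering I care about is preserved. The exact decimals differ slightly from the rough values in my example table above.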