I have two columns, one with age, e.g. (34), and another with the date of the event, e.g. (2019-04-26:01:20:51). I would like to create a new column that returns the date of birth based on these two columns. Many thanks in advance for the help.
Hello Angela, welcome to SO. Could you provide some sample data for us to work with? You can get this using `dput()`. Also, what have you tried to solve this problem? – snair.stack Mar 21 '19 at 11:55
You want package `lubridate`, function `years`: try `DateEvent - lubridate::years(34)`. – Rui Barradas Mar 21 '19 at 15:56
2 Answers
Since there is no sample data available, I created a sample data frame from the values in the question. The code snippet is given below. You don't need any external package for this; `as.POSIXlt` should be enough.
df <- data.frame(event = c("2019-04-26 01:20:51"), age = c(34))
df$event <- as.POSIXlt(x = df$event, format = "%Y-%m-%d %H:%M:%S") # define format here
df$approx_DOB <- df$event # copy the event timestamp, then shift its year
df$approx_DOB$year <- df$event$year - df$age # no +1900 needed: $year is stored as years since 1900 and is edited in place
df$YearOfBirth <- (df$event$year + 1900) - df$age # year alone; +1900 converts to the calendar year
Output:
> df
event age approx_DOB YearOfBirth
1 2019-04-26 01:20:51 34 1985-04-26 01:20:51 1985
Bonus: You can further access the elements of a POSIXlt object using `$` and specifying the element required (e.g. `year`, `mon`, `mday`, etc.). You can then format the `approx_DOB` column accordingly. Check this answer for more info.
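Because an age only narrows the date of birth to a one-year window, it may help to compute both ends of that window rather than a single date. The following is a minimal base R sketch under stated assumptions: the values mirror the question's example, the names `earliest` and `latest` are illustrative, and the time zone is fixed to UTC to keep the date conversion unambiguous.

```r
# Base R only; `event` and `age` mirror the columns in the question.
event <- as.POSIXlt("2019-04-26 01:20:51", format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
age <- 34

# POSIXlt stores its fields as list elements:
# year is years since 1900, mon runs from 0 to 11.
event$year + 1900  # 2019
event$mon + 1      # 4
event$mday         # 26

# Latest possible birth date: the person turned `age` on the event date itself.
latest <- event
latest$year <- latest$year - age
latest <- as.Date(latest)

# Earliest possible birth date: the person turns `age + 1` the day after the event.
earliest <- event
earliest$year <- earliest$year - (age + 1)
earliest <- as.Date(earliest) + 1

c(earliest = earliest, latest = latest)
```

The `approx_DOB` value computed above corresponds to `latest`; the true date of birth can fall anywhere between the two bounds.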

Yes, this is what I thought as well: you can calculate the approximate DOB rather than the exact one. – GaB Mar 21 '19 at 12:21
Here is another example, using tidyverse and lubridate. I believe it is a better solution because it calculates the date of birth from the year only, and on big data sets the calculation is quicker. Subtracting the age from the full event date won't give you the exact date of birth, and treating the result as exact will probably cause problems. So here is my solution:
library(tidyverse)
library(lubridate)
df <- tibble::tibble(event = c("2018-04-26 02:30:10"), age = c(34))
df_separate <- df %>%
  dplyr::mutate(year = as.numeric(lubridate::year(event)),
                DOB_Y_approximated = year - age)
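This gives you the approximate year of birth (the true year can be one earlier, depending on whether the birthday had already passed at the event), which I think is a better output.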
# A tibble: 1 x 4
event age year DOB_Y_approximated
<chr> <dbl> <dbl> <dbl>
1 2018-04-26 02:30:10 34 2018 1984
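For comparison, the `lubridate::years` suggestion from the comments can be sketched as follows. This is only an illustration, assuming the event string parses with `ymd_hms` and using the illustrative name `latest_dob`. It uses lubridate's `%m-%` operator rather than a plain `-`, so that a result that does not exist (such as 29 February in a non-leap year) rolls back to the last valid day of the month instead of becoming `NA`.

```r
library(lubridate)

# One-row example mirroring the data in this answer.
event <- ymd_hms("2018-04-26 02:30:10", tz = "UTC")
age <- 34

# Subtract `age` years. This is the latest possible date of birth,
# i.e. the person turned `age` on the event date itself.
latest_dob <- as_date(event %m-% years(age))
latest_dob

# Rollback behaviour on a leap day: 2019 has no 29 February,
# so the result rolls back to the end of February.
as_date(ymd("2020-02-29") %m-% years(1))
```

As with the year-only approach, the result is an upper bound on the date of birth, not the exact date.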
