34

Right now, I am having 3 separate columns as year, month, and day in a data file in R. How do I merge these three columns into just one column and make R understand that it is date?

Here is what it looks like right now.

year   mon  day  gnp

1947    1   1   238.1   
1947    4   1   241.5   
1947    7   1   245.6   
1947    10  1   255.6   
1948    1   1   261.7   
1948    4   1   268.7
PKumar
  • 10,971
  • 6
  • 37
  • 52
viethaihp291
  • 507
  • 2
  • 6
  • 13

4 Answers4

39

Try:

  df$date <- as.Date(with(df, paste(year, mon, day,sep="-")), "%Y-%m-%d")
   df$date
  #[1] "1947-01-01" "1947-04-01" "1947-07-01" "1947-10-01" "1948-01-01"
  #[6] "1948-04-01"
akrun
  • 874,273
  • 37
  • 540
  • 662
31

Since your year, month and day types are numerical the best function to use is the make_date function from the lubridate package. The tidyverse style solution is therefore

library(tidyverse)
library(lubridate)

data %>%
  mutate(date = make_date(year, month, day))
mtedwards
  • 411
  • 4
  • 4
10

There is also a simpler solution using lubridate and magrittr:

df$date <- paste(df$year, df$mon, df$day, sep="-") %>% ymd() %>% as.Date()

This worked for me, even though I had days and months written in single (i.e. 1) and double (i.e. 01) digits. Parsing was correct as well.

Frank
  • 66,179
  • 8
  • 96
  • 180
Carrol
  • 1,225
  • 1
  • 16
  • 29
4

Or you could use the lubridate package, which makes working with dates and times in R much easier in general.

e.g.

df$date <- with(df, ymd(sprintf('%04d%02d%02d', year, mon, day)))
df$date
# [1] "1947-01-01 UTC" "1947-04-01 UTC" "1947-07-01 UTC" "1947-10-01 UTC"
# [5] "1948-01-01 UTC" "1948-04-01 UTC"

The ymd function takes a string representing Year, Month and Day, which could be "19470101", "1947-01-01", "1947/01/01", etc. Or there is also mdy and dmy if the elements are ordered differently. You can also optionally specify a time zone.

chri55c
  • 41
  • 3