I am new to R, but I am trying to learn it to make my data analyses more reproducible. My dates are entered in three columns for the drop-off date and three columns for the pickup date (one each for month, day, and year). I need R to recognize them as dates so I can calculate the time in the field as a fraction of a year (days/365). I installed the lubridate package and tried the mdy() function, but it gave me the following error message:

Error: Column `drop_off_date` must be length 150 (the number of rows) or one, not 450
In addition: Warning message:
All formats failed to parse. No formats found. 

I also tried using backticks, but that did not work either. I think it may be because of how my dates are set up in different columns, but I am not sure. This is the section of code I used for that:

mutate(drop_off_date = mdy(dropoff_month, dropoff_day, dropoff_year),
         pickup_date = mdy(pickup_month, pickup_day, pickup_year),

Does anyone have any suggestions for a different function or what I could fix to use this function?

Rachel
  • Use `paste` or `paste0` first, perhaps `mdy(paste0(dropoff_month, dropoff_day, dropoff_year))`? – r2evans Feb 17 '20 at 17:13
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Feb 17 '20 at 17:14
  • FYI, my first comment was untested; `paste0` (without a `sep=`) does not work with single-digit months or days. – r2evans Feb 17 '20 at 17:21

1 Answer

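The lubridate parsing functions such as mdy() take a single vector of strings, not separate month, day, and year arguments. When you pass three columns, mdy() tries to parse every value as a complete date, which is why you got 450 results for 150 rows and the "All formats failed to parse" warning. My first comment suggested paste0, but that does not work directly: without a separator, single-digit months or days cannot be parsed. Paste the parts together with a separator instead (such as paste's default " " space), as in the code below.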

library(lubridate)
### wrong
mdy(10, 13, 2018)
# Warning: All formats failed to parse. No formats found.
# [1] NA NA NA

### fixed
mdy(paste(10, 13, 2018))
# [1] "2018-10-13"

library(dplyr)
data.frame(y=c(2018,2019), m=c(10,9), d=c(30,1)) %>%
  mutate(date = mdy(paste(m, d, y)))
#      y  m  d       date
# 1 2018 10 30 2018-10-30
# 2 2019  9  1 2019-09-01
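
Applied to the setup in the question, this might look like the sketch below. The column names are taken from the question, but the sample values and the `field`, `days_in_field`, `years_in_field`, and `drop_off_alt` names are made up for illustration. It also shows lubridate's make_date(), which takes numeric year, month, and day directly and so avoids pasting altogether; treat it as one possible alternative rather than the required approach.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample data; only the column names come from the question
field <- data.frame(
  dropoff_month = c(5, 6),  dropoff_day = c(1, 15),  dropoff_year = c(2018, 2018),
  pickup_month  = c(10, 6), pickup_day  = c(30, 15), pickup_year  = c(2018, 2019)
)

field <- field %>%
  mutate(
    # paste() joins the parts with a space, giving mdy() one string per row
    drop_off_date = mdy(paste(dropoff_month, dropoff_day, dropoff_year)),
    pickup_date   = mdy(paste(pickup_month, pickup_day, pickup_year)),
    # alternative: build the date from the numeric parts, no pasting needed
    drop_off_alt  = make_date(dropoff_year, dropoff_month, dropoff_day),
    # elapsed days, then as a fraction of a 365-day year
    days_in_field  = as.numeric(difftime(pickup_date, drop_off_date, units = "days")),
    years_in_field = days_in_field / 365
  )

field$days_in_field
# [1] 182 365
```

Dividing by 365 follows the days/365 definition in the question; it ignores leap years, so a span that includes February 29 will come out slightly over the calendar-year fraction.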
r2evans