
I am using the R programming language. I am trying to take the difference between two date columns. Both dates are in the following format: 2010-01-01 12:01

When I bring my file into R, the dates are in "Factor" format. Here is my attempt to recreate the file in R:

# what my file looks like when I import it into R

date_1 = c("2010-01-01 13:01 ", "2010-01-01 14:01" )
date_2 = c("2010-01-01 15:01 ", "2010-01-01 16:01" )

file = data.frame(date_1, date_2)
file$date_1 = as.factor(file$date_1)
file$date_2 = as.factor(file$date_2)

Now, I am trying to create a new column which holds the difference between these dates (in minutes).

I first tried to convert both date variables into the appropriate "Date" formats:

#convert to date formats:
    
  file$date_a = as.POSIXlt(file$date_1,format="%Y-%m-%dT%H:%M")
  file$date_b = as.POSIXlt(file$date_2,format="%Y-%m-%dT%H:%M")

Then, I tried to take the difference:

file$diff = difftime(file$date_a, file$date_b, units="mins")

But this results in NA values:

> file

             date_1            date_2 date_a date_b    diff
1 2010-01-01 13:01  2010-01-01 13:01    <NA>   <NA> NA mins
2  2010-01-01 13:01  2010-01-01 13:01   <NA>   <NA> NA mins

Can someone please show me what I am doing wrong?

Thanks

Reference: How to get difference (in minutes) between two date strings?

stats_noob
  • I am not converting to factor. When I upload my file from Excel into R, the dates are already in "factor" format. I tried to replicate the conditions I am working with. – stats_noob May 05 '21 at 18:45

1 Answer

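There is no T in the date strings, so the format "%Y-%m-%dT%H:%M" never matches and the conversion returns NA. The format needs a space between the date and the time: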

difftime(as.POSIXct(file$date_1, format = '%Y-%m-%d %H:%M'),
       as.POSIXct(file$date_2, format = '%Y-%m-%d %H:%M'), units = 'mins')
#Time differences in mins
#[1] -120 -120
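
To get the difference as a new column, as the question asks, something along these lines may work. It is only a sketch against the example data from the question. The explicit as.character() and trimws() calls, the tz = "UTC" setting, and the names fmt and bad are assumptions added here for illustration; they are not part of the original post. The subtraction is written as date_b minus date_a so that later times in date_2 give positive minutes.

```r
# Recreate the example data from the question (factor columns, as after import)
file <- data.frame(
  date_1 = factor(c("2010-01-01 13:01 ", "2010-01-01 14:01")),
  date_2 = factor(c("2010-01-01 15:01 ", "2010-01-01 16:01"))
)

# Space (not "T") between the date and the time
fmt <- "%Y-%m-%d %H:%M"

# Assumption: a fixed time zone, so daylight-saving changes cannot shift the result.
# as.character() and trimws() make the factor-to-text step and the stray
# trailing spaces explicit.
file$date_a <- as.POSIXct(trimws(as.character(file$date_1)), format = fmt, tz = "UTC")
file$date_b <- as.POSIXct(trimws(as.character(file$date_2)), format = fmt, tz = "UTC")

# Difference in minutes, stored as a plain numeric column
file$diff <- as.numeric(difftime(file$date_b, file$date_a, units = "mins"))

# For comparison: the format with a literal "T" does not match and gives NA
bad <- as.POSIXct("2010-01-01 13:01", format = "%Y-%m-%dT%H:%M", tz = "UTC")
```

Swapping the two arguments of difftime() reproduces the negative values shown above.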
akrun
  • Thank you for your answer! I tried this code and got the following result: Time differences in mins [1] 0 0 ... I don't think it should be 0? – stats_noob May 05 '21 at 18:48
  • @Noob In the data you created, both columns are the same, so I am not sure how they could evaluate as different. When you read the data from Excel (not sure which package you are using), there is an option to specify the column type. If the real data contain seconds or milliseconds, that could account for the difference. – akrun May 05 '21 at 18:50
  • Sorry, I have made them different times and they are still returning 0. – stats_noob May 05 '21 at 18:50
  • @Noob Sorry, I couldn't replicate that with your new example, even with factor class. – akrun May 05 '21 at 18:53