How to subtract datetimes and store them in a separate column?

Question

i am working with csv file and i have a column with name "statistics_lastLocatedTime" as shown in csv file image i would like to subtract second row of "statistics_lastLocatedTime" from first row; third row from second row and so on till the last row and then store all these differences in a separate column and then combine this column to the other related columns as shown in the code given below:

##select related features
data <- read.csv("D:/smart tech/store/2016-10-11.csv")
(columns <- data[with(data, macAddress == "7c:11:be:ce:df:1d" ),
    c(2,10,11,38,39,48,50) ])
write.csv(columns, file = "updated.csv", row.names = FALSE)

## take time difference 
date_data <- read.csv("D:/R/data/updated.csv")
(dates <- date_data[1:40, c(2)])
NROW(dates)
for (i in 1:NROW(dates)) {
  j <- i+1
  r1 <- strptime(paste(dates[i]),"%Y-%m-%d %H:%M:%S")
  r2 <- strptime(paste(dates[j]),"%Y-%m-%d %H:%M:%S")
  diff <- as.numeric(difftime(r1,r2))
  print (diff)
}

## combine time difference with other related columns
combine <- cbind(columns, diff)
combine

now the problem is that i am able to get the difference of rows but not able to store these values as a column and then combine that column with other related columns. please help me. thanks in advance.

Why are you filtering your `date_data` to first 40 rows? This will not be able to cbind back to `columns`. — Parfait, Nov 20 '16 at 01:33
actually total number of rows in macAddress == "7c:11:be:ce:df:1d" are only 40 and it doesn't metter if i use (dates <- date_data[1:40, c(2)]) or (dates <- date_data[, c(2)]). commands will give the same output i.e. all 40 rows where macAddress == "7c:11:be:ce:df:1d" — Waqas Bari, Nov 20 '16 at 01:41
You want to calculate the diff in `statistics_lastLocatedTime`, and store that vector back in the dataframe. — smci, Nov 20 '16 at 07:31
There's no need for the for-loop, `diff()` can operate on vectors. There's no need to repeatedly convert each pair of datetimes from string into datetime, just read them in as datetimes with `read.csv()` — smci, Nov 20 '16 at 07:38
Related: [Specify custom Date format for colClasses argument in read.table/read.csv](http://stackoverflow.com/questions/13022299/specify-date-format-for-colclasses-argument-in-read-table-read-csv) — smci, Nov 20 '16 at 09:13

score 1 · Answer 1 · answered Nov 20 '16 at 02:42

Consider directly assigning the diff variable using vapply. Also, there is no need for the separate date_data df as all operations can be run on the columns df. Notice too the change in time format to align to the format currently in dataframe:

columns$diff <- vapply(seq(nrow(columns)), function(i){      
  r1 <- strptime(paste(columns$statistics_lastLocatedTime[i]),"%d-%m-%y %H:%M")
  r2 <- strptime(paste(columns$statistics_lastLocatedTime[i+1]),"%d-%m-%y %H:%M")

  diff <- difftime(r1, r2)      
}, numeric(1))

score 1 · Accepted Answer · edited May 23 '17 at 10:30

1

This is a four-liner:

Define a custom class 'myDate', and a converter function for your custom datetime, as per Specify custom Date format for colClasses argument in read.table/read.csv
Read in the datetimes as actual datetimes; no need to repeatedly convert later.
Simply use the vectorized diff operator on your date column (it sees their type, and automatically dispatches a diff function for POSIXct Dates). No need for for-loops:

.

setClass('myDate') # this is not strictly necessary
setAs('character','myDate', function(from) { 
  as.POSIXct(from, format='%d-%m-%y %H:%S', tz='UTC') # or whatever timezone
})

data <- read.csv("D:/smart tech/store/2016-10-11.csv",
            colClasses=c('character','myDate','myDate','numeric','numeric','integer','factor'))
# ...
data$date_diff <- c(NA, diff(data$statistics_lastLocatedTime))

Note that diff() produces a result of length one shorter than vector that we diff'ed. Hence we have to pad it (e.g. with a leading NA, or whatever you want).

edited May 23 '17 at 10:30

Community

1
1

answered Nov 20 '16 at 08:42

smci

32,567
20
113
146

Great answer! I'm adding `setAs()` to my library. However, you may need to add `setClass('myDate')` else the error: *no definition for class “myDate”*. And how did I miss `diff()`? Be sure to fix the ending quote of *integer* in colClasses. – Parfait Nov 20 '16 at 15:45
@Parfait no we don't need to add `setClass('myDate')`. It works fine without – smci Nov 20 '16 at 21:16
1

Turns out it is a warning and not an error, but still processes as this question indicates: http://stackoverflow.com/questions/14146341/using-setas-to-specify-colclasses-in-r – Parfait Nov 20 '16 at 22:14
@smci : thanks a lot for your help. 1) your closing bracket was missing in read.csv 2) my output was NA and when i removed format in as.POSIXct it worked fine – Waqas Bari Nov 22 '16 at 07:04
@WaqasBari: thanks but I don't understand how removing the format option changes anything; `format(NA...)` is still NA – smci Nov 22 '16 at 07:48
@WaqasBari by the way, `lubridate` package has much more user-friendly date-conversion fns. – smci Nov 22 '16 at 07:49
@smci i hadn't changed (format (NA...)) in your code but i got accurate results when i removed in it my code and use only as.POSIXct(from) . thanks for informing me about lubridate package. i used "date" and "chron" packages – Waqas Bari Nov 22 '16 at 08:06
Yeah, as.POSIXct is a pain. – smci Nov 22 '16 at 10:30

How to subtract datetimes and store them in a separate column?

2 Answers2