I am currently working on a project involving delivery timing data. The values can be either negative (indicating that the delivery was not late but actually ahead of the estimate) or positive (indicating that it was indeed late).
I would like to obtain the five-number summary and interquartile range by running fivenum() on the data. However, because all of the values end up positive after conversion, my statistics are not accurate. The following is an example of the data I am working with:
Delivery.Late Reaction.Time Time.Until.Send.To.Vendor
1 00:01:29 00:00:00 00:05:08
2 00:12:19 00:00:00 00:04:52
3 00:02:55 00:00:00 00:05:42
4 00:06:14 00:00:00 00:14:34
5 -00:06:05 00:00:00 00:01:42
6 00:09:58 00:00:00 00:02:56
From this, I am interested in the Delivery.Late variable and would like to perform exploratory / diagnostic statistics on it.
I have used the chron package to convert the column data into chronological objects but chron(object) always takes the absolute value of the time and turns it into a positive value. Here is a sample of my code:
library(chron)
feb_01_07 <- read.csv("~/filepath/data.csv")
#converting factor to time
feb_01_07[,19] <- chron(times=feb_01_07$Delivery.Late)
#Five number summary and interquartile range for $Delivery.Late column
fivenum(feb_01_07$Delivery.Late, na.rm=TRUE)
After running fivenum() I get the results:
[1] 00:01:29 00:02:55 00:06:09 00:09:58 00:12:19
This is inaccurate because the lowest number (the first term) should in fact be -00:06:05, not 00:01:29. The value -00:06:05 was converted to a positive chronological object, 00:06:05, and now sits in the middle of the data, where it feeds into the median instead.
How can I convert them to time objects while maintaining the negative values? Thanks so much for any insight!
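For reference, here is a minimal sketch of one possible workaround that sidesteps chron entirely: parse the sign separately and keep the durations as signed seconds in a plain numeric vector. It assumes every value is formatted as an optionally negated "HH:MM:SS" string. The helpers parse_signed_hms() and format_signed_hms() are hypothetical names written for this illustration, not functions from chron or base R, and the input vector simply repeats the six sample rows above.

```r
# Hypothetical helper: convert optionally negated "HH:MM:SS" strings
# into signed seconds (numeric), so the sign is never discarded.
parse_signed_hms <- function(x) {
  x <- trimws(as.character(x))          # also handles factor columns
  negative <- startsWith(x, "-")        # remember which values are early
  parts <- strsplit(sub("^-", "", x), ":", fixed = TRUE)
  secs <- vapply(parts, function(p) {
    p <- as.numeric(p)
    p[1] * 3600 + p[2] * 60 + p[3]      # hours, minutes, seconds
  }, numeric(1))
  ifelse(negative, -secs, secs)
}

# Hypothetical helper: format signed seconds back to "[-]HH:MM:SS"
# (fractional seconds are rounded for display only).
format_signed_hms <- function(secs) {
  s <- round(abs(secs))
  sprintf("%s%02d:%02d:%02d",
          ifelse(secs < 0, "-", ""),
          as.integer(s %/% 3600),
          as.integer((s %% 3600) %/% 60),
          as.integer(s %% 60))
}

# The six sample values of Delivery.Late shown above
late <- c("00:01:29", "00:12:19", "00:02:55",
          "00:06:14", "-00:06:05", "00:09:58")

late_secs <- parse_signed_hms(late)

summary_secs <- fivenum(late_secs)      # five-number summary in seconds
iqr_secs <- IQR(late_secs)              # interquartile range in seconds

format_signed_hms(summary_secs)         # summary as signed HH:MM:SS strings
```

Working in seconds keeps fivenum() and IQR() on an ordinary numeric vector, so the early delivery stays the minimum. Note that IQR() uses quantile-based quartiles, which can differ slightly from the spread between the hinges that fivenum() reports.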