I am trying to count the days between specific dates. I have two columns, both character vectors:

start.date <- c("2015-01-10","2015-01-11","2015-02-24")
end.date <- c("2015-03-10","2015-04-01","2015-06-13")
date10 <- data.frame(cbind(start.date,end.date))
date10$start.date <- as.character(date10$start.date)
date10$end.date <- as.character(date10$end.date)
str(date10)

The specific dates run from 2015-04-11 to 2015-07-10, so I generated every date in that range with seq():

library(lubridate)
sp.da1 <- ymd("2015-04-11")
sp.da2 <- ymd("2015-07-10")
inteval.da <- seq(sp.da1, sp.da2, by = 'day')

I want to know how many days fall between the specific dates. I tried seq(start.date, end.date, by = 'day') in the same way, but I get this error:

'from' must be of length 1

Please Help me!!!

talat
Dougie Hwang
  • Is your goal to create a sequence of dates for each row in the dataframe? Or do you just want to know the number of days in between the two dates? – A. Stam Sep 12 '17 at 08:35
  • If I knew how to create a sequence of dates for each row in the dataframe, I could find the number of days myself. Could you show me both? :) – Dougie Hwang Sep 12 '17 at 08:37
  • Of course, but if you just want to know the number of days in between, there are easier solutions. – A. Stam Sep 12 '17 at 08:38
  • OK, then could you show me the easier way? – Dougie Hwang Sep 12 '17 at 08:39
  • this was answered here: https://stackoverflow.com/questions/28122708/generate-rows-between-two-dates-into-a-data-frame-in-r – clancy Sep 11 '18 at 12:07

2 Answers

You are asking how many days of a given time interval fall inside a main time interval.

Let's first set up the three varying time intervals. Then we write a function that checks for every day x whether it is inside or outside the main interval. If we sum up the number of days inside the main interval, we'll have what you're searching for:

# Convert the character columns to Date; as.Date() dispatches to the character method itself
date10$start.date <- as.Date(date10$start.date, format = "%Y-%m-%d")
date10$end.date   <- as.Date(date10$end.date,   format = "%Y-%m-%d")

your_intervals <- Map(seq, from = date10[, 1], to = date10[, 2], by = "days")

your_intervals is a list of three Date vectors, each containing every day in the corresponding interval.
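To check what Map() returns here, a quick inspection along these lines may help. This is only a sketch: it rebuilds the question's three rows as plain vectors and uses base R's as.Date() rather than the lubridate call from the question.

```r
# Example rows from the question, converted with base R
start.date <- as.Date(c("2015-01-10", "2015-01-11", "2015-02-24"))
end.date   <- as.Date(c("2015-03-10", "2015-04-01", "2015-06-13"))

# Map() calls seq() once per pair, so each call gets a single 'from' and 'to'
your_intervals <- Map(seq, from = start.date, to = end.date, by = "day")

class(your_intervals)       # the container is a plain list
class(your_intervals[[1]])  # each element is a Date vector
lengths(your_intervals)     # days per interval, both endpoints included
# [1]  60  81 110
```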

is_in_interval <- function(x, l_bound = sp.da1, u_bound = sp.da2){
  # TRUE for days strictly between the bounds (the bounds themselves are excluded)
  return((x > l_bound) & (x < u_bound))
}

sapply(your_intervals, function(x) sum(is_in_interval(x)))
# [1] 0 0 63
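If only the counts are needed, the overlap can also be computed arithmetically without building every sequence. The sketch below is an alternative, not the method above: overlap_days() is a hypothetical helper name, and it counts both bounds of the main interval, so 2015-04-11 itself is included and the third row comes out one day higher than the strict count of 63 shown above.

```r
start.date <- as.Date(c("2015-01-10", "2015-01-11", "2015-02-24"))
end.date   <- as.Date(c("2015-03-10", "2015-04-01", "2015-06-13"))
sp.da1 <- as.Date("2015-04-11")
sp.da2 <- as.Date("2015-07-10")

# Hypothetical helper: days of [start, end] that fall inside [l_bound, u_bound],
# counting both endpoints. Dates are handled as day numbers.
overlap_days <- function(start, end, l_bound, u_bound) {
  first <- pmax(as.numeric(start), as.numeric(l_bound))  # later of the two starts
  last  <- pmin(as.numeric(end),   as.numeric(u_bound))  # earlier of the two ends
  pmax(last - first + 1, 0)                              # no overlap gives 0
}

overlap_days(start.date, end.date, sp.da1, sp.da2)
# [1]  0  0 64
```

Because nothing is passed to seq(), this version does not fail on rows whose end date precedes the start date; such rows simply yield 0.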
KenHBS
  • Error in seq.int(0, to0 - from, by) : wrong sign in 'by' argument I got this error.... Can you give me any clue?? – Dougie Hwang Sep 12 '17 at 11:16
  • Could it be that `date[, 2]` is smaller than `date[, 1]` in some of your cases? In the data you posted, it all works.. – KenHBS Sep 12 '17 at 11:23
First off: why aren't the columns start.date and end.date already of class Date? If you stored these columns as dates, you wouldn't have to convert them every time you want to use them as such. In your example code, you're passing character strings to the seq() function, which cannot build a date sequence from them.

The following code should give you a sequence of dates for every row in your dataframe:

apply(date10, 1, function(x) seq(ymd(x['start.date']), ymd(x['end.date']), 'day'))

What this code does is slice your dataframe up into rows and consider them one by one. This means you are passing one start date and one end date to the seq function each time, instead of a whole column of each. Passing whole columns was what caused your error.
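The difference between passing whole columns and passing one pair can be reproduced in isolation. This is a minimal sketch using base R's as.Date() on the question's example values; the exact wording of the error message may vary between R versions.

```r
start.date <- as.Date(c("2015-01-10", "2015-01-11", "2015-02-24"))
end.date   <- as.Date(c("2015-03-10", "2015-04-01", "2015-06-13"))

# Whole columns: seq() expects a single 'from' and a single 'to', so this fails.
# tryCatch() captures the error message instead of stopping the script.
res <- tryCatch(seq(start.date, end.date, by = "day"),
                error = function(e) conditionMessage(e))
res

# One pair at a time works
first_row <- seq(start.date[1], end.date[1], by = "day")
range(first_row)
```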

If you want to know the number of days in between, I would suggest this solution:

date10$diff <- ymd(date10$end.date) - ymd(date10$start.date)

This creates a column of the 'difftime' class. You can convert it to a simple integer by adding as.integer():

date10$diff <- as.integer(ymd(date10$end.date) - ymd(date10$start.date))
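The subtraction can be tried on its own with the question's example values. The sketch below uses base R's as.Date() in place of ymd(), and the explicit difftime() call is an optional variant that fixes the units rather than relying on the default.

```r
start.date <- as.Date(c("2015-01-10", "2015-01-11", "2015-02-24"))
end.date   <- as.Date(c("2015-03-10", "2015-04-01", "2015-06-13"))

d <- end.date - start.date
class(d)       # "difftime"
units(d)       # "days" when subtracting two Date vectors
as.integer(d)
# [1]  59  80 109

# Same result with the units stated explicitly
as.integer(difftime(end.date, start.date, units = "days"))
```

Note that this counts the gap between the two dates, so an interval that starts and ends on the same day gives 0.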
A. Stam
  • Error in seq.int(0, to0 - from, by) : wrong sign in 'by' argument I got this error. :( what is the problem?? – Dougie Hwang Sep 12 '17 at 08:51
  • That code isn't similar to the answer I posted - for example, why are you using `seq.int`? I cannot answer this question unless you provide more information. – A. Stam Sep 12 '17 at 08:59
  • This code is just a sample of my data. I actually have 20 variables and over 3,000 rows in a data.frame. As in the sample code, two of the variables are dates, so I tried to adapt your code to my situation. I can't post all of my data here. :( – Dougie Hwang Sep 12 '17 at 09:05
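One common cause of "wrong sign in 'by' argument" is a row whose end date comes before its start date, since seq() is then asked to step forward towards an earlier date. A possible way to find such rows before building any sequences is sketched below; the data frame dates is made up for illustration, and its second row is deliberately reversed.

```r
# Hypothetical data: the second row ends before it starts
dates <- data.frame(
  start.date = as.Date(c("2015-01-10", "2015-05-01")),
  end.date   = as.Date(c("2015-03-10", "2015-04-01"))
)

# A row is usable only if both dates are present and in order
valid <- !is.na(dates$start.date) & !is.na(dates$end.date) &
  dates$end.date >= dates$start.date

which(!valid)   # row numbers that would make seq() fail
# [1] 2

# Build sequences for the usable rows only
intervals <- Map(seq, from = dates$start.date[valid],
                 to = dates$end.date[valid], by = "day")
length(intervals)
```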