
I'd like to generate a list of random dates between a defined interval using R such that there is only one date for each month present in the interval.

I've tried using a variation of the code from another solution, but I can't seem to limit it to one date per month. I get multiple dates for a given month.

Here's my attempt

df = data.frame(Date=c(sample(seq(as.Date('2020/01/01'), as.Date('2020/09/01'), by="day"), 9)))

But I seem to get more than one date for a given month. Any inputs would be highly appreciated.

Varun

2 Answers


First, I create a table containing all the possible dates that you want to sample. In one column of this table I store the month number of each date, using the month() function from the lubridate package.

library(lubridate)

dates <- data.frame(
  days = seq(as.Date('2020/01/01'), as.Date('2020/09/01'), by="day")
)

dates$month <- month(dates$days)

Then the idea is to loop over the months with lapply(). On each iteration, I select from the dates table only the dates of that month and pass them to sample().

results <- lapply(1:9, function(x){
  sample_dates <- dates$days[dates$month == x]
  
  return(sample(sample_dates, size = 1))
})

df <- data.frame(
  dates = as.Date(unlist(results), origin = "1970-01-01")
)

This gives:

       dates
1 2020-01-19
2 2020-02-06
3 2020-03-26
4 2020-04-13
5 2020-05-16
6 2020-06-29
7 2020-07-06
8 2020-08-21
9 2020-09-01

In other words, the idea of this approach is to give sample() only the dates of one month on each iteration, so it chooses a date for that specific month only.
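
The same idea can probably be written in base R without lubridate. The following is only a sketch, not part of the answer above: it groups the days by a "year-month" label with split() and format(), which is my own substitution for month(). One possible advantage is that an interval spanning more than one year would keep January 2020 and January 2021 in separate groups. The set.seed() call is only there to make the sketch reproducible.

```r
set.seed(42)  # assumption: fixed seed, only for reproducibility

# All candidate days in the interval from the question
days <- seq(as.Date("2020-01-01"), as.Date("2020-09-01"), by = "day")

# Group the days by a "YYYY-MM" label, one group per month in the interval
by_month <- split(days, format(days, "%Y-%m"))

# Pick one day per group; indexing with sample.int() also works for a
# group that contains a single day (here September, which is just 09-01)
picked <- lapply(by_month, function(d) d[sample.int(length(d), 1)])

# Combine the picks into a single Date column
df <- data.frame(dates = do.call(c, unname(picked)))
```

Because the groups are built from the interval itself, there is no need to hard-code 1:9.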

Pedro Faria

How about this:

First, create a function that returns a random day from month 'month'. Then call it with lapply() for every month you need, here 1 to 9.

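x <- function(month){
  start <- as.Date(paste0('2020/', month, '/01'))
  # End on the last day of the month; ending on the 1st of month+1 could return a date from the next month
  end <- seq(start, by = "month", length.out = 2)[2] - 1
  sample(seq(start, end, by = "day"), 1)
}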

df <- data.frame(
  dates = as.Date(unlist(lapply(1:9,x)), origin = "1970-01-01")
)

If you also want the results in random order (not January, February, March...), you only need to add a sample():

df <- data.frame(
  dates = as.Date(unlist(sample(lapply(1:9,x))), origin = "1970-01-01")
)
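
To confirm that a result really has at most one date per month, a small check along these lines may help. The helper name one_per_month is hypothetical and not part of either answer; it simply looks for repeated "year-month" labels.

```r
# Hypothetical helper: TRUE when no "YYYY-MM" label occurs more than once
one_per_month <- function(d) !any(duplicated(format(d, "%Y-%m")))

ok  <- one_per_month(as.Date(c("2020-01-19", "2020-02-06", "2020-03-26")))  # TRUE
bad <- one_per_month(as.Date(c("2020-01-19", "2020-01-25", "2020-03-26")))  # FALSE
```

Calling one_per_month(df$dates) on any of the data frames built above should return TRUE.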
LocoGris