I am doing dynamic network analysis in R and I have a list of date ranges stored as two columns in a dataframe. I am trying to figure out how to create a comprehensive dataframe/list that contains every possible unique combination formed by picking one date from each range. The date ranges are all of different lengths.

For example:

Date1    Date2
1275     1277
1301     1303
1290     1291

I want to create a dataframe or list in which each column/item represents a unique possible combination of the dates within each of these ranges:

1      2      3      4      5      6      7...........?
1275   1276   1277   1275   1275   1275   1276........1277
1301   1301   1301   1302   1303   1302   1302........1303
1290   1290   1290   1290   1290   1291   1290........1291

My intuition from Java is to turn to nested loops, but avoiding loops is like the whole point of R, so I feel like I must be wrong in that intuition.

Based on "R- all pairwise combinations of elements from two lists" and "Calculating all possible combinations within a range in R", I've hacked together this:

makeAllDateCombos <- function(earlydatecolumn, latedatecolumn){
  allDCs <- NULL
  allDCs <- as.data.frame(lapply(earlydatecolumn, function(earlydatecolumn){
    lapply(latedatecolumn, function(latedatecolumn){
      (table(earlydatecolumn:latedatecolumn))
      })
  }))
  return(allDCs)
}

allTheDCs <- makeAllDateCombos(collab_dates$Date1, collab_dates$Date2)

However, this returns an error:

Error in data.frame(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,  : 
  arguments imply differing number of rows: 16, 27, 6, 21, 26, 1, 5, 4, 31, 11, 18, 13, 15, 23, 17, 7, 10, 9, 24, 19, 33, 20 
AlexB
  • `t(do.call(expand.grid, mapply(seq, Date1, Date2, SIMPLIFY = FALSE)))` might work. – r2evans Sep 29 '17 at 22:00
  • This is definitely on the right track. mapply(seq...) works, but the do.call(expand.grid, ...) crashes rstudio. – AlexB Sep 30 '17 at 08:51
  • @r2evans Thanks for your suggestion. With a small list of ranges (under 5) your solution works perfectly, so if you want to submit it as an answer I'll accept it. With the amount of memory I have, I will need to find another solution, perhaps subsample of the total permutations, for figuring out how changes in dating affect the dynamic network metrics. – AlexB Sep 30 '17 at 11:33
  • How large is the true source data? If you are aggregating or filtering that new matrix, it is possible to do a "lazy" `expand.grid` without exhausting memory and crashing R. – r2evans Sep 30 '17 at 14:58

1 Answer


With smaller numbers of combinations, this will work:

t(do.call(expand.grid, mapply(seq, collab_dates$Date1, collab_dates$Date2, SIMPLIFY = FALSE)))
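As a rough illustration of what that one-liner produces, here is a minimal sketch using the three example ranges from the question. The data frame is rebuilt by hand here, and `ranges`, `n_combos`, and `combos` are names chosen only for this example. It also shows how to count the combinations before expanding, since the count is the product of the range lengths and grows very quickly.

```r
# Example ranges from the question, rebuilt by hand for illustration
collab_dates <- data.frame(Date1 = c(1275, 1301, 1290),
                           Date2 = c(1277, 1303, 1291))

# One sequence per row: 1275..1277, 1301..1303, 1290..1291
ranges <- mapply(seq, collab_dates$Date1, collab_dates$Date2, SIMPLIFY = FALSE)

# The number of combinations is the product of the range lengths (3 * 3 * 2)
n_combos <- prod(lengths(ranges))

# expand.grid() returns one combination per row; t() flips that to
# one combination per column, matching the layout asked for
combos <- t(do.call(expand.grid, ranges))

dim(combos)      # rows = number of ranges, columns = number of combinations
combos[, 1:4]    # the first few combinations
```

Checking `n_combos` first is a cheap way to see whether the full grid has any chance of fitting in memory before calling `expand.grid`.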

Unfortunately, from your comment I infer that you have a relatively large number of combinations. Because the count is the product of the lengths of all the ranges, building every combination at once will exhaust memory. You may find https://stackoverflow.com/a/36144255/3358272 useful (slightly updated at https://gist.github.com/r2evans/e5531cbab8cf421d14ed). The point is to iterate over the combinations and do something with each one individually, rather than holding the whole grid in memory.
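To make the "one combination at a time" idea concrete, here is a minimal sketch. It is not the code from the linked answer or gist; `nth_combo` and `random_combo` are hypothetical helpers written for this example, and the "metric" computed is only a placeholder for whatever network statistic you need. The k-th combination is decoded from its index, in the same order `expand.grid` would use, so the full grid never has to exist.

```r
ranges <- list(1275:1277, 1301:1303, 1290:1291)
total  <- prod(lengths(ranges))   # a double, so no integer overflow

# Hypothetical helper: return the k-th combination (1-based) in
# expand.grid order, where the first range varies fastest
nth_combo <- function(ranges, k) {
  sizes <- lengths(ranges)
  idx <- k - 1                          # zero-based position in the grid
  out <- numeric(length(ranges))
  for (i in seq_along(ranges)) {
    out[i] <- ranges[[i]][idx %% sizes[i] + 1]
    idx <- idx %/% sizes[i]
  }
  out
}

# Process combinations one by one; only a single combination is in memory
# at a time (the spread of dates stands in for a real network metric)
spread <- vapply(seq_len(total), function(k) {
  combo <- nth_combo(ranges, k)
  max(combo) - min(combo)
}, numeric(1))

# Hypothetical helper for subsampling: draw one date from each range
# independently, which is a uniform draw over all combinations
random_combo <- function(ranges) {
  vapply(ranges, function(r) as.numeric(r[sample.int(length(r), 1)]), numeric(1))
}

set.seed(42)
samples <- replicate(5, random_combo(ranges))   # one sampled combination per column
```

Index decoding relies on doubles representing `k` exactly, so it is only reliable while the total stays below about 2^53. Beyond that, or when a full pass would simply take too long, drawing random combinations as in `random_combo` is probably the more practical route.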

r2evans