This question is partially inspired by Count the number of times two values co-occur within a group in R.

Using a similar dataframe:

df = data.frame(ID   = c(1,1,1,1, 2,2,2, 3, 4,4,4,4,4,4,4,4, 5,5,5),
                year = c(2018, 2018, 2020, 2020,
                         2020, 2020, 2020,
                         2011,
                         2019, 2019, 2019, 2019, 2020, 2020, 2020, 2020,
                         2018, 2019, 2020),
                code = c("A", "B", "C", "D",
                         "A", "B", "Q",
                         "G",
                         "A", "B", "Q", "G", "C", "D", "T", "S",
                         "S", "Z", "F"))

One of the answers addresses how to do this for pairs:

library(data.table)
setDT(df)
all_pairs <- function(x) {
  if (length(x) > 1) {
    sapply(combn(sort(x), 2, simplify = FALSE), paste, collapse = '')
  } else {
    c()
  }
}
df[,.(pairs = all_pairs(code)), .(ID, year)][,.N, .(pairs)]

I tried changing the 2 in combn() to 5 or 7 (to count groups of 5 or 7 codes instead of pairs), but had no luck.

I'm getting the following error:

Error in combn(sort(x), 5, simplify = FALSE) : n < m
hy9fesh
  • You get this error because there are fewer than 5 items in your group. You cannot, for example, select 5 items from the 3 items where ID = 2. – ekoam Jan 12 '22 at 23:42
  • I thought it was an error but I just tried it and you're right :) – hy9fesh Jan 12 '22 at 23:51
  • Hi @hy9fesh Can you please add the desired output to your post. – Henrik Jan 13 '22 at 11:15
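
Following the comment about group sizes, one possible way to avoid the "n < m" error is to skip any group that has fewer items than the requested combination size. The sketch below is only an illustration: all_combos() and its m argument are hypothetical names that do not appear in the linked answer, and it assumes that groups that are too small should simply contribute no combinations.

```r
# Hypothetical generalisation of all_pairs(); m is the combination size.
all_combos <- function(x, m) {
  # combn() stops with "n < m" when there are fewer than m items,
  # so return an empty character vector for such groups instead.
  if (length(x) < m) {
    return(character(0))
  }
  # Build every sorted m-item combination and collapse each into one string.
  vapply(combn(sort(x), m, simplify = FALSE), paste, character(1), collapse = "")
}

all_combos(c("B", "A", "Q"), 2)  # three pairs: "AB" "AQ" "BQ"
all_combos(c("B", "A", "Q"), 5)  # empty: only 3 items, 5 requested
```

With the data.table from the question, the call would then presumably become df[, .(combos = all_combos(code, 5)), .(ID, year)][, .N, .(combos)]. Note that in the sample data no (ID, year) group has more than 4 codes, so a size of 5 yields no combinations under that grouping; grouping by ID alone would give ID 4 eight codes to choose from.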
