This question is partially inspired by "Count the number of times two values co-occur within a group in R".
Using a similar dataframe:
df = data.frame(ID = c(1,1,1,1, 2,2,2, 3, 4,4,4,4,4,4,4,4, 5,5,5),
                year = c(2018, 2018, 2020, 2020,
                         2020, 2020, 2020,
                         2011,
                         2019, 2019, 2019, 2019, 2020, 2020, 2020, 2020,
                         2018, 2019, 2020),
                code = c("A", "B", "C", "D",
                         "A", "B", "Q",
                         "G",
                         "A", "B", "Q", "G", "C", "D", "T", "S",
                         "S", "Z", "F"))
One of the answers addresses how to count co-occurring pairs:
library(data.table)
setDT(df)
all_pairs <- function(x) {
  if (length(x) > 1) {
    sapply(combn(sort(x), 2, simplify = FALSE), paste, collapse = '')
  } else {
    c()
  }
}
df[,.(pairs = all_pairs(code)), .(ID, year)][,.N, .(pairs)]
I tried changing the 2 in combn() to a 5 or 7 (to count groups of 5 or 7 codes instead of pairs), but have had no luck.
I'm getting the following error:
Error in combn(sort(x), 5, simplify = FALSE) : n < m
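
For context, the error seems to come from the size check rather than from combn() itself: combn(x, m) stops with "n < m" whenever x has fewer than m elements, and the guard length(x) > 1 only protects the m = 2 case. In the sample data the largest (ID, year) group has four codes, so no group can supply a combination of 5 or 7. A minimal sketch of a guard that scales with the combination size is below; all_combos and its m argument are hypothetical names introduced here, not part of the linked answer.

```r
# Hypothetical generalisation of all_pairs(): m is the combination size.
all_combos <- function(x, m) {
  x <- sort(x)
  if (length(x) >= m) {
    # combn() is only reached when the group holds at least m codes
    vapply(combn(x, m, simplify = FALSE), paste,
           FUN.VALUE = character(1), collapse = "")
  } else {
    # groups that are too small contribute no combinations
    character(0)
  }
}

all_combos(c("A", "B", "Q"), 5)       # too small for m = 5: character(0)
all_combos(c("A", "B", "Q", "G"), 3)  # "ABG" "ABQ" "AGQ" "BGQ"
```

Assuming this behaves as intended, it could replace all_pairs in the original call, for example df[, .(combo = all_combos(code, 3)), .(ID, year)][, .N, .(combo)]. With m = 5 or 7 on this sample data the result would simply be empty, since no group is large enough.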