Here is a base R method using table
, crossprod
, combn
, and matrix subsetting.
# get counts of customer IDs
myMat <- crossprod(with(df, table(Order, CustomerID)))
myMat
CustomerID
CustomerID C1 C2 C3 C4
C1 3 0 2 0
C2 0 1 0 1
C3 2 0 2 0
C4 0 1 0 1
Note here that the diagonal is the count of total orders that each customer made and the (symmetric) off diagonals are the counts of orders that each customer shared.
# get all customer pairs
customers <- t(combn(rownames(myMat), 2))
# use matrix subsetting to pull out order counts and cbind.data.frame to put it together
cbind.data.frame(customers, myMat[customers])
1 2 myMat[customers]
1 C1 C2 0
2 C1 C3 2
3 C1 C4 0
4 C2 C3 0
5 C2 C4 1
6 C3 C4 0
You could use wrap this in setNames
to add names if desired to give specific variable names
setNames(cbind.data.frame(customers, myMat[customers]), c("c1", "c2", "counts"))
data
df <-
structure(list(Order = c("A", "B", "C", "D", "B", "C", "D"),
CustomerID = c("C1", "C1", "C1", "C2", "C3", "C3", "C4")), .Names = c("Order",
"CustomerID"), class = "data.frame", row.names = c(NA, -7L))