
I've been working on this for quite a long time and hope someone can help me. ;) I have sample data with two columns: CustomerID and Basket. It looks like the image below:

[Image: Sample]

What I want is a kind of co-occurrence matrix whose rows and columns are labeled with the basket names. This relates to an older post, linked below:

How to use R to create a word co-occurrence matrix.

Greetings from Germany!

Kevin

  • What have you tried? Please provide data in an easy-to-paste form. See [this post](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for tips on how to do just that. – Roman Luštrik May 16 '17 at 07:52

2 Answers


Based on the given link, are you looking for something like this?

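library(data.table)
library(qdapTools)
# Spread each customer's baskets into one column per customer,
# tabulate the basket counts per customer with mtabulate(),
# then compute t(X) %*% X to get the basket-by-basket co-occurrence matrix
crossprod(
  as.matrix(
    mtabulate(
      dcast(setDT(DT), rowid(CRMPrivatkundeID) ~ CRMPrivatkundeID,
            value.var = "Basket")[, -1]
    )
  )
)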
#               Esquinzo Playa Fleesensee Jandia Playa Masmavi Noblis Playa Granda Quinta die Ria
#Esquinzo Playa              1          1            0       0      0            0              0
#Fleesensee                  1         14            0       1      0            0              0
#Jandia Playa                0          0            1       0      1            0              0
#Masmavi                     0          1            0       1      0            0              0
#Noblis                      0          0            1       0      1            0              0
#Playa Granda                0          0            0       0      0            1              1
#Quinta die Ria              0          0            0       0      0            1              1

This also works without mtabulate():

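library(data.table)
# Count baskets per customer (one row per customer, one column per basket),
# drop the ID column, then compute t(X) %*% X
crossprod(
  as.matrix(
    dcast(DT, CRMPrivatkundeID ~ Basket,
          fun.aggregate = length, value.var = "Basket")[, -1]
  )
)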

Data

library(data.table)
f <- "Fleesensee"
DT <- data.table(
  CRMPrivatkundeID = rep(c(56, 172, 240, 306, 365, 423, 427), each = 2L),
  Basket = c(f, "Masmavi", f, f, f, "Esquinzo Playa", "Jandia Playa",
            "Noblis", "Quinta die Ria", "Playa Granda", rep(f, 4))
)
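The same matrix can also be sketched in base R, without data.table or qdapTools. This is a minimal sketch rather than the method of either answer: it rebuilds the sample data as a plain data.frame (the name `df` is an assumption), uses `table()` for the customer-by-basket counts, and adds a presence/absence variant that counts customers instead of basket pairs.

```r
# Same sample data as above, as a plain data.frame (`df` is a made-up name)
f <- "Fleesensee"
df <- data.frame(
  CRMPrivatkundeID = rep(c(56, 172, 240, 306, 365, 423, 427), each = 2L),
  Basket = c(f, "Masmavi", f, f, f, "Esquinzo Playa", "Jandia Playa",
             "Noblis", "Quinta die Ria", "Playa Granda", rep(f, 4))
)

# Customer x basket count matrix: counts[i, j] = how often customer i has basket j
counts <- unclass(table(df$CRMPrivatkundeID, df$Basket))

# t(counts) %*% counts: the count-weighted co-occurrence matrix
cooc_counts <- crossprod(counts)
cooc_counts["Fleesensee", "Fleesensee"]  # 14, as in the output above

# Variant: reduce counts to presence/absence first, so each cell is the
# number of customers who have both baskets at least once
present <- (counts > 0) * 1L
cooc_customers <- crossprod(present)
cooc_customers["Fleesensee", "Fleesensee"]  # 5 customers have Fleesensee
```

Which variant fits depends on whether repeated baskets for the same customer should weigh more. If only the pairs matter, the diagonal can be zeroed with `diag(cooc_customers) <- 0`.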
Uwe

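With data.table it works like this. Suppose your data.frame is named temp:

library(data.table)
setDT(temp)  # convert the data.frame to a data.table by reference
# One row per customer, one column per basket; each cell counts occurrences
dcast(temp, CustomerID ~ Basket, fun.aggregate = length, value.var = "Basket")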
amatsuo_net
  • Hey, thanks for your response. I get this message: Error in setDT(dat) : All elements in argument 'x' to 'setDT' must be of same length. I forgot to mention that the data has about 100,000 rows. – Kevin May 15 '17 at 15:38
  • The column Basket has 27 distinct values, if that helps ;) – Kevin May 16 '17 at 14:24
  • I thought this was a normal two-column `data.frame`. Could you paste the output of `str(dat)` in your question? – amatsuo_net May 16 '17 at 15:15
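
The structure check requested in the last comment could look like the sketch below. The real `dat` is never shown in the thread, so the object here is a hypothetical stand-in: a list whose elements differ in length, which is one plausible (but unconfirmed) way to end up with the quoted `setDT()` error.

```r
# Hypothetical stand-in for `dat`: a list with elements of unequal length,
# i.e. not a rectangular two-column table
dat <- list(
  CustomerID = c(56, 56, 172),
  Basket     = c("Fleesensee", "Masmavi")
)

str(dat)            # shows the type and length of each element
is.data.frame(dat)  # FALSE here; a regular two-column data.frame gives TRUE
lengths(dat)        # per-element lengths; these should all be equal
```

If `lengths(dat)` reports different values, the input has to be made rectangular (for example by re-reading the source file) before converting it to a data.table.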