I need to take a data set containing my product codes (e.g. ABC, CDE, EFG) and create a matrix with those codes on both axes, where each cell is a binary flag indicating whether that combination has occurred in my data set. I have found similar solutions using sparse matrices, but the function will not work with my data. Below is an example of what I need as a final result. For instance, ABC-ABC is obviously 1 because they are the same product, while a 1 at CDE-EFG indicates that product CDE was bought at the same time as product EFG. My question: what is the best way to create a product-affinity matrix to analyze this set of transaction data?

    ABC CDE EFG GHI 
ABC 1    0   0   0
CDE 0    1   1   0
EFG 1    1   1   0
GHI 0    0   0   1

EDIT: I am aware of the dplyr package and its affinity function. However, I cannot seem to get a successful run with my data. Perhaps I need to convert the data from a data frame to another type, but I am not sure whether that is the issue.
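
A minimal sketch of one way to build such a matrix in base R, not a definitive answer. It assumes the transactions are stored in long format, one row per order and product, in a data frame with the hypothetical columns `order_id` and `product`; the sample rows are invented for illustration and are not the actual data.

```r
# Hypothetical sample data: one row per (order, product) pair.
# The column names `order_id` and `product` are assumptions.
transactions <- data.frame(
  order_id = c(1, 1, 2, 3, 3, 4),
  product  = c("CDE", "EFG", "ABC", "ABC", "EFG", "GHI"),
  stringsAsFactors = FALSE
)

# Incidence table: rows are orders, columns are products,
# cells count how often a product appears in an order.
incidence <- table(transactions$order_id, transactions$product)

# crossprod(X) computes t(X) %*% X, so cell [i, j] is positive
# exactly when products i and j share at least one order.
co_counts <- crossprod(incidence)

# Collapse the counts to a binary flag; the diagonal is 1 for
# every product that appears at least once.
affinity <- (co_counts > 0) * 1L

print(affinity)
```

A matrix built this way is symmetric by construction, and `all(diag(affinity) == 1)` gives a quick sanity check that every product is flagged against itself.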

accortdr
  • What does your product code dataset look like? – acylam Jun 11 '18 at 21:03
  • useR: Currently, my code produces a large True-False matrix; however, nearly every cell that should be True comes out False. The accuracy of the code can be checked, at minimum, by verifying that every cell where the x-value equals the y-value is True (or 1). – accortdr Jun 12 '18 at 11:57
  • It would help if you give a [mcve]. See [this answer](https://stackoverflow.com/a/5963610/4996248) for what that means in R. Also, why not post the relevant parts of your current code if you want help optimizing it? – John Coleman Jun 12 '18 at 11:58
  • By "What does your dataset look like?" I meant posting your actual or a sample dataset in your question. It doesn't help that much to just _describe_ your data. Please refer to John Coleman's link for an example of how to post a sample. – acylam Jun 12 '18 at 13:58
  • The method is up in the air; I am just looking for recommendations on different ways to get the result I need. – accortdr Jun 12 '18 at 16:06
