
I'm trying to run a cluster analysis of user types on a website, using a unique visitor ID, the pages viewed, and the number of times each page was viewed. My idea is to cluster users who visited similar pages and then look at which pages those are, to identify shared behaviors.

The output I'm getting from Google Analytics is in long format, one row per visitor and page:

ID          Page                  Pageviews
abc123      example.com/pagea     2 
qwer123     example.com/pageb     3 
abc123      example.com/pageb     4
qwer123     example.com/pagec     5 
uiop123     example.com/pagea     6

Based on the clustering algorithms I'm looking at in R, it would be better to have it in wide format, one row per visitor and one column per page:

ID        example.com/pagea    example.com/pageb    example.com/pagec
abc123    2                    4                    0
qwer123   0                    3                    5
uiop123   6                    0                    0  

I could reshape the data in Excel and import it, but that seems silly. How can I reshape the data in R directly? Your help is much appreciated.
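
For reference, here is a minimal sketch of the reshape I have in mind, using base R's `xtabs`. It assumes the export has already been read into a data frame with columns `ID`, `Page`, and `Pageviews`. The names `ga`, `wide`, and `wide_df` are hypothetical, and the data frame is typed in by hand from the sample above rather than read from a real export.

```r
# Hypothetical data frame mirroring the Google Analytics sample above
ga <- data.frame(
  ID = c("abc123", "qwer123", "abc123", "qwer123", "uiop123"),
  Page = c("example.com/pagea", "example.com/pageb", "example.com/pageb",
           "example.com/pagec", "example.com/pagea"),
  Pageviews = c(2, 3, 4, 5, 6),
  stringsAsFactors = FALSE
)

# Sum Pageviews for every ID/Page pair; pairs that never occur become 0
wide <- xtabs(Pageviews ~ ID + Page, data = ga)

# Convert the contingency table to a plain data frame:
# one row per visitor (as row names), one column per page
wide_df <- as.data.frame.matrix(wide)

print(wide_df)
```

If this is right, `wide_df` should be usable as numeric input to something like `dist()` or `kmeans()`. I gather `tidyr::pivot_wider()` with `values_fill` might be an alternative, but I have not tried it.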

JAB
