I'm trying to run a cluster analysis of the types of users on a website, using a unique visitor ID, the pages each visitor viewed, and the number of times they viewed each page. My plan is to cluster users who went to similar pages and then look at what those pages are to identify similar behaviors.
The output I'm getting from Google Analytics looks like this:
ID       Page               Pageviews
abc123   example.com/pagea  2
qwer123  example.com/pageb  3
abc123   example.com/pageb  4
qwer123  example.com/pagec  5
uiop123  example.com/pagea  6
Based on the clustering algorithms I'm looking at in R, it would be better to have one row per visitor and one column per page, with 0 where a visitor never viewed a page:
ID       example.com/pagea  example.com/pageb  example.com/pagec
abc123   2                  4                  0
qwer123  0                  3                  5
uiop123  6                  0                  0
I can make that change in Excel and port it over, but that seems silly. How can I reshape the data in R directly? Any help is much appreciated.
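
For reference, here is a minimal sketch of the kind of reshape I'm after, using only base R. It assumes the Google Analytics export has already been read into a data frame; the name `visits` and the column names `ID`, `Page`, and `Pageviews` are my own placeholders matching the sample above, not anything Google Analytics guarantees. It relies on `xtabs()` to sum pageviews per visitor and page, which also fills unseen combinations with 0.

```r
# Sample data in the long format shown above (names are placeholders).
visits <- data.frame(
  ID = c("abc123", "qwer123", "abc123", "qwer123", "uiop123"),
  Page = c("example.com/pagea", "example.com/pageb", "example.com/pageb",
           "example.com/pagec", "example.com/pagea"),
  Pageviews = c(2, 3, 4, 5, 6),
  stringsAsFactors = FALSE
)

# Cross-tabulate: one row per ID, one column per Page, cells are summed
# Pageviews. ID/Page pairs that never occur come out as 0.
counts <- xtabs(Pageviews ~ ID + Page, data = visits)

# Convert the contingency table to a plain data frame, with the IDs as
# row names and the pages as columns.
wide <- as.data.frame.matrix(counts)

print(wide)
```

If a visitor/page pair appears more than once in the export, this sums the rows, which may or may not be what is wanted. The result could then be passed to something like `dist()` or `kmeans()`, though whether raw pageview counts should be scaled first is a separate question.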