
I would like to re-create the (ordered) list of pages each user went through when visiting my site using the Google Analytics API (core, v3). My understanding is that you need to:

  1. be able to tell visits apart
  2. gather the list of pages of that visit
  3. have a way of ordering those pages

1. Telling visits apart

Can be done using custom variables: OK

2. List of pages viewed

Can be done using the below dimensions: OK

ga:pagePath
ga:landingPagePath
ga:secondPagePath
ga:exitPagePath
ga:previousPagePath
ga:nextPagePath

3. Ordering the pages viewed

My impression is that this is not possible for the following reasons:

  • the absolute dimensions (e.g. ga:landingPagePath/ga:secondPagePath) only provide the first 2 levels
  • the relative dimensions (e.g. ga:previousPagePath/ga:nextPagePath) won't be enough to tell pages apart as soon as the same page appears multiple times in the navigation

For instance, let's say someone visits the below pages (numbers in brackets represent the order):

(1) A -> (2) B -> (3) A -> (4) B -> (5) C

If you try to pull the data via the API you quickly hit a wall:

dimensions=ga:landingPagePath                         -> (1) A        : OK
dimensions=ga:secondPagePath                          -> (2) B        : OK
dimensions=ga:pagePath,filters=ga:previousPagePath==B -> (3) A, (5) C : PROBLEM

At this point we need to find out whether A or C is the actual third page. This would be possible if we had the pageview timestamps, but unfortunately they don't seem to be available (you only have ga:timeOnPage and ga:avgTimeOnPage).
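To make the ambiguity concrete, here is a minimal Python sketch that works on the example visit locally rather than calling the API. The helper names `transitions` and `pages_after` are made up for illustration; `pages_after` only mimics what a filter on ga:previousPagePath would return.

```python
def transitions(path):
    """Return (previous, current) pairs for consecutive pages of one visit."""
    return list(zip(path, path[1:]))


def pages_after(path, previous):
    """Distinct pages seen right after `previous`.

    Mimics filtering on ga:previousPagePath: the position in the visit is lost,
    only the (previous, current) relationship survives.
    """
    return sorted({cur for prev, cur in transitions(path) if prev == previous})


visit = ["A", "B", "A", "B", "C"]
candidates = pages_after(visit, "B")
print(candidates)  # ['A', 'C']
```

Both A and C come back as "the page after B", and nothing in the result says which one was step (3) and which was step (5).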

Have you found a way to re-create the ordered list of pages users viewed when visiting your site using the Google Analytics API?

Max
  • `previousPagePath`, `pagePath` and `nextPagePath` are supposed to be used together. So if you filter by `previousPagePath==B`, then it means that `pagePath=A, C` and `nextPagePath=B` (the second B this time). – Eduardo Jun 03 '13 at 10:32
  • Thanks, I've corrected accordingly. However I believe the question itself remains open. – Max Jun 03 '13 at 10:43

1 Answer

You are trying to use Google Analytics to do something it was not designed to do. GA is an aggregate data analysis tool. You should be measuring groups of users not single users.

Using the userId as a customVariable is just a workaround to try to make GA into something it's not. This is considered a hack and as such comes with its drawbacks. The first problem that will arise is sampling. If your site has more than 500k visits in the period you are analyzing, reports are computed from a sample of those visits rather than from all of them. It may not bite you now, but when you grow it will. It's usually not a bad thing if you are doing aggregate analysis, but when you are doing reports on a user-by-user basis it can completely screw up your data, even with a 90% sample.

If you are aware of this and decide to move on, why not send a timestamp with the pageview as a customVariable? You are already using that hack to send the userId, so you might as well buy into the idea completely. Since you are doing user-by-user reports, clock differences between user machines shouldn't matter.
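Once each pageview carries a timestamp, rebuilding the order is a group-and-sort on the rows you pull back. The following is only a sketch in Python: it assumes each row has already been reduced to a `(visit_id, timestamp_ms, page_path)` tuple of strings, which is a made-up layout for illustration and not the actual shape of an API response.

```python
from collections import defaultdict


def ordered_paths(rows):
    """Group rows by visit and sort each visit's pages by timestamp.

    `rows` is an iterable of (visit_id, timestamp_ms, page_path) string tuples;
    this layout is an assumption, adapt it to however you export the data.
    """
    by_visit = defaultdict(list)
    for visit_id, timestamp_ms, page_path in rows:
        by_visit[visit_id].append((int(timestamp_ms), page_path))
    return {
        visit_id: [path for _, path in sorted(hits, key=lambda hit: hit[0])]
        for visit_id, hits in by_visit.items()
    }


# Rows deliberately out of order, as an aggregated report might return them.
rows = [
    ("visit-1", "1370255530000", "A"),
    ("visit-1", "1370255510000", "A"),
    ("visit-1", "1370255550000", "C"),
    ("visit-1", "1370255520000", "B"),
    ("visit-1", "1370255540000", "B"),
]
print(ordered_paths(rows))  # {'visit-1': ['A', 'B', 'A', 'B', 'C']}
```

Timestamps are only compared within one visit, so they all come from the same machine's clock.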

A better solution to your problem is probably to send the data to your own servers and aggregate it yourself. Whenever you fire a pageview to GA, just send the same hit to a server you control and aggregate on a user-by-user basis.

Eduardo
  • I also got the impression I was using `GA` for something it's not meant for. I am myself not interested in what happens on an individual basis, but other people in the business are (the background is too long for me to go over it). Right now what I'm thinking about doing is using `JavaScript` to track pages visited in a `cookie` and send that information to `GA` as an `event value`: we would then pull that data via the API and further process it programmatically. Luckily for us this would be done for a very small subset of our users and I believe we would not run into sampling problems. – Max Jun 03 '13 at 10:59
  • Every pageview appended into a single cookie? Watch out for cookie limitations http://stackoverflow.com/questions/2543851/chrome-cookie-size-limit – Eduardo Jun 03 '13 at 12:27
  • Assuming 4KB for the cookie size and 100B per URL path, that gives us 40 URLs, which will be more than enough for us. The good news is that there don't seem to be any limitations on the event value (developers.google.com/analytics/devguides/collection/…) and that some people have already made it work for strings up to 2KB (stackoverflow.com/questions/12280993/…), which will still be more than enough for us. Let's see how this turns out in the end... :) – Max Jun 10 '13 at 06:44