Very related and helpful: Send event to Google Analytics using API server sided


A download link is sent to a customer via email:

Hello,
Please find your product here: https://www.example.com/files/yourfile.zip

I'd like to track this download in Google Analytics, as a Goal conversion.

Unfortunately, when the user clicks on the link, the file is delivered directly by the web server, without going through an HTML page.

How can I track such a direct file download in Analytics?

  1. Should I add a dummy HTML page "in the middle" that uses the analytics.js tracking snippet, sends a download event to GA with ga('send', ...), and then redirects to the actual file after 500 milliseconds with setTimeout(redirect, 500)? Is that really a clean and safe solution? I see many small potential issues: is 500 ms enough? What kind of redirection should be used? Also, a user with JS disabled will never get the file, and if <noscript> is used as a fallback, no Goal conversion can be recorded.

  2. Is there a way to ask Apache (which serves yourfile.zip to the client) or PHP to send a tracking event to Google Analytics when this file is served?

  3. Another solution?

It seems that solution 2 would have the advantage of being 100% reliable, whether or not the client has JS enabled.

But on the other hand, I don't want to use a rarely used hack. What is the usual solution for this very common situation?

Basj
  • More accurate duplicate: https://stackoverflow.com/questions/32200016/send-event-to-google-analytics-using-api-server-sided – Basj Nov 27 '20 at 13:24

1 Answer

Google Analytics actually has a protocol for sending analytics data from arbitrary sources, the Measurement Protocol. See here: https://developers.google.com/analytics/devguides/collection/protocol/v1/

So having your webserver send an analytics event to Google is not as hacky as it might seem. I'm not sure if you can hook into Apache directly to generate these events. However, I do see at least two solutions.

1) Redirect all downloads to a server-side script which generates the desired analytics event and then sends the file.
2) Parse the server's logs and generate analytics events from them.

EDIT: Example for solution 1:
Make sure there is no whitespace before the opening <?php tag or after the closing ?> tag, because it would become part of the actual response sent to the client.

download.php:

<?php
    // Read ?file=xxx URL parameter; basename() strips any directory part
    // so a request such as ?file=../../etc/passwd cannot escape this directory
    $requestedFile = basename($_GET["file"] ?? "");

    // Requested file does not exist: answer 404 and do not track anything
    if ($requestedFile === "" || !is_file($requestedFile)) {
        http_response_code(404);
        exit;
    }

    // Read Google Analytics cookie, e.g. "GA1.3.1788966449.1501761573";
    // the client ID is made of the last two dot-separated fields
    $splitCookie = explode('.', $_COOKIE["_ga"] ?? "");
    if (count($splitCookie) >= 4) {
        $clientId = $splitCookie[2] . '.' . $splitCookie[3];
    } else {
        // No usable cookie (e.g. wget, or a browser that never visited the site):
        // fall back to a random anonymous ID
        $clientId = bin2hex(random_bytes(16));
    }

    // Create Google Analytics request data (see here https://developers.google.com/analytics/devguides/collection/protocol/v1/devguide)
    $data = array('v' => 1,
                  'tid' => 'UA-XXXXX-Y',
                  'cid' => $clientId,
                  't' => 'event',
                  'ec' => 'download',
                  'ea' => 'download',
                  'el' => $requestedFile);

    // Create the request options
    $options = array(
        'http' => array(
            'method' => 'POST',
            'header' => 'Content-Type: application/x-www-form-urlencoded',
            'content' => http_build_query($data),
            'timeout' => 2 // do not hold up the download if GA is slow
        )
    );

    $context = stream_context_create($options);

    // Send GA request; a failed request is logged but must not block the download
    $result = @file_get_contents('https://www.google-analytics.com/collect', false, $context);
    if ($result === FALSE) { error_log('Google Analytics request failed'); }

    // Set response headers for binary data
    header('Content-Type: application/octet-stream');
    header('Content-Length: ' . filesize($requestedFile));

    // Stream the requested file to the output (which is what the client receives)
    readfile($requestedFile);

    exit;
?>

.htaccess/mod_rewrite rules:

RewriteEngine on
RewriteRule ^download/(.*)$ download.php?file=$1 [L]

Note: it's been ages since I last wrote PHP code and I didn't test this, but it should give a pretty good idea of how to implement option 1).
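For option 2), a minimal sketch of the log-parsing step could look like the following. It assumes Apache's Common/Combined Log Format and a /download/ URL prefix. The function name downloadHitFromLogLine is made up for illustration. Since the default log formats do not record cookies, the _ga client ID is not available here, so a hash of the remote host is used as a stand-in, which means these hits will not be matched to the visitor's page views.

```php
<?php
// Sketch for option 2): turn one access-log line into Measurement Protocol data.
// Assumes Common/Combined Log Format and downloads served below /download/.
function downloadHitFromLogLine(string $line, string $propertyId): ?array
{
    // host ident user [time] "GET path protocol" status ...
    $pattern = '/^(\S+) \S+ \S+ \[[^\]]+\] "GET (\S+) [^"]*" (\d{3}) /';
    if (!preg_match($pattern, $line, $m)) {
        return null; // not a GET request, or not in the expected format
    }
    [, $host, $path, $status] = $m;

    // Only count successful requests for files below /download/
    if ($status !== '200' || strpos($path, '/download/') !== 0) {
        return null;
    }

    return [
        'v'   => 1,
        'tid' => $propertyId,
        // Stand-in client ID (assumption): the log has no _ga cookie
        'cid' => md5($host),
        't'   => 'event',
        'ec'  => 'download',
        'ea'  => 'download',
        'el'  => basename(parse_url($path, PHP_URL_PATH)),
    ];
}

$line = '203.0.113.7 - - [03/Aug/2017:10:01:00 +0200] "GET /download/yourfile.zip HTTP/1.1" 200 1234';
$hit = downloadHitFromLogLine($line, 'UA-XXXXX-Y');
echo http_build_query($hit), "\n";
```

The returned array can be sent to /collect in the same way as in download.php, for example from a cron job that remembers which log lines it has already processed.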

EDIT 2: If you send your tracking request to www.google-analytics.com/debug/collect you will receive some validation information telling you whether your request is valid or not (it will not track the event, though).
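A rough sketch of such a validation check is shown below. The function names isValidHit and validateHit are made up for illustration, and the response shape (a hitParsingResult array whose entries carry a valid flag) is how I remember the validation server's JSON, so treat it as an assumption and check it against an actual response.

```php
<?php
// Sketch: check a hit against the validation endpoint before sending it for real.

// Interpret the JSON returned by /debug/collect.
// Assumed shape: {"hitParsingResult": [{"valid": true|false, ...}], ...}
function isValidHit(string $json): bool
{
    $response = json_decode($json, true);
    return is_array($response)
        && ($response['hitParsingResult'][0]['valid'] ?? false) === true;
}

// Send the payload to the validation endpoint (needs network access).
function validateHit(array $data): bool
{
    $context = stream_context_create(['http' => [
        'method'  => 'POST',
        'header'  => 'Content-Type: application/x-www-form-urlencoded',
        'content' => http_build_query($data),
        'timeout' => 5,
    ]]);
    $json = @file_get_contents('https://www.google-analytics.com/debug/collect', false, $context);
    return $json !== false && isValidHit($json);
}
```

Calling validateHit($data) with the same $data array as in download.php should then tell you whether the hit would be accepted, without recording anything.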

EDIT 3: Okay, so I've checked with a page which uses analytics.js. The script sets the following cookies:

_ga=GA1.3.1788966449.1501761573
_gid=GA1.3.1010429060.1501761573

Later on in the collect requests it sets

cid:1788966449.1501761573
_gid:1010429060.1501761573

So it seems like you need to do a little string splitting with what you find in the _ga cookie. (I've updated the code above)
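As a standalone illustration of that splitting, here is a small sketch. The helper name clientIdFromGaCookie is made up, and it assumes the client ID is always the last two dot-separated fields of the cookie, as in the values observed above.

```php
<?php
// Sketch: extract the client ID from a _ga cookie value.
function clientIdFromGaCookie(?string $cookie): ?string
{
    $parts = explode('.', (string) $cookie);
    // Expect at least four fields, e.g. GA1 . 3 . 1788966449 . 1501761573
    if (count($parts) < 4) {
        return null; // cookie missing or in an unexpected format
    }
    // The client ID is the last two fields joined by a dot
    return implode('.', array_slice($parts, -2));
}

echo clientIdFromGaCookie('GA1.3.1788966449.1501761573'), "\n"; // 1788966449.1501761573
```

Returning null for a missing cookie lets the caller decide what to do, for example generate an anonymous ID for clients such as wget that send no cookies.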

EDIT 4: In case anyone's wondering, this is the request the analytics.js script generates with the cookie values mentioned above.

GET https://www.google-analytics.com/collect?v=1&_v=j56&a=1178408574&t=pageview&_s=1&dl=https%3A%2F%2Fdevelopers.google.com%2Fanalytics%2Fdevguides%2Fcollection%2Fanalyticsjs%2Fcommand-queue-reference&ul=de&de=UTF-8&dt=The%20ga%20Command%20Queue%20Reference%20%C2%A0%7C%C2%A0%20Analytics%20for%20Web%20(analytics.js)%20%C2%A0%7C%C2%A0%20Google%20Developers&sd=24-bit&sr=1920x1200&vp=1899x1072&je=0&_u=QDCAAAIhI~&jid=&gjid=&cid=1788966449.1501761573&tid=UA-41425441-2&_gid=1010429060.1501761573&z=1116872044
  • Thanks. I'm currently parsing the server logs, but it is quite difficult to extract the information manually. And then, to *send* this information to GA is even more difficult, I think. A fully automated solution seems complex (script to extract info from the server log + GA API to send this data to GA + `cron` job etc.). I imagine there should be a simpler solution, because *it's a very, very common situation to want to track downloads on a website*. How do non-programmer bloggers track the downloads of the free PDFs they give away? – Basj Aug 03 '17 at 09:01
  • 1) sounds nice. Can you give a little bit more information how it would work? – Basj Aug 03 '17 at 09:04
  • Apache provides mod_rewrite, which allows rewriting URLs on the server without the client noticing it. So you could rewrite something like /downloads/cool.zip to /download.php?file=cool.zip. The PHP script can then read the file parameter, talk to Google Analytics and then just send the file the user has requested. Alternatively, instead of giving direct download links to your users, you can send out the /download.php?file=something links, which wouldn't require any server-side URL rewrites. – Nicolas Ristock Aug 03 '17 at 09:09
  • Nice idea @NicolasRistock. But would PHP have access to the client UUID (stored in a client cookie)? PHP needs to say to GoogleAnalytics "Client #3298474 has downloaded a file", but does PHP have access to this UUID? – Basj Aug 03 '17 at 09:12
  • As far as I know, mod_rewrite forwards all request headers (there is at least an option for that). Since cookies are just a header the client sends, you should have access to them from the PHP script. – Nicolas Ristock Aug 03 '17 at 09:14
  • So this could be a 100% transparent solution for the user? Even using `wget thefile.zip` would deliver the right file, and no "in-the-middle" HTML page? – Basj Aug 03 '17 at 09:19
  • Exactly. Though, using plain wget would probably not send any cookies – Nicolas Ristock Aug 03 '17 at 09:27
  • Great! Would you include a small PHP example as an illustration of your solution 1)? – Basj Aug 03 '17 at 09:29
  • I've edited my answer with some example code. – Nicolas Ristock Aug 03 '17 at 10:01
  • Thanks a lot @Nicolas. Are you sure that `_gid`, which looks like `GA1.2.6xxxxxxxx.15017xxxxx`, is what `cid` is looking for? It looks different from: https://developers.google.com/analytics/devguides/collection/protocol/v1/parameters#cid. If we don't use the same format, 1 visitor visiting a page (sending to GA with the analytics.js snippet) and downloading a file (with this custom code) will be counted as 2 visitors. – Basj Aug 03 '17 at 10:23
  • I'm not sure that _gid is the correct cookie. https://developers.google.com/analytics/devguides/collection/analyticsjs/cookies-user-id talks about _ga being the client id so that might be what you're actually looking for. You can always try these things yourself though. Navigate to a Google tracked page without any cookies set. Then see which cookies the analytics.js script creates and check the network requests it makes. Let me know if you've found the correct cookie so I can edit my answer. – Nicolas Ristock Aug 03 '17 at 10:34
  • I just tried with a test GA account and a test site: 1. I visited index.html, a normal page with GA tracking, and 2. I downloaded a file with your code, sending `_gid` as client ID, from the same browser. It counts as 2 different users. But this [might be the solution indeed](https://developers.google.com/analytics/devguides/collection/analyticsjs/field-reference#clientId). – Basj Aug 03 '17 at 10:51
  • Update: I've tested every combination: `'cid' => $_COOKIE["_gid"]`, `'cid' => $_COOKIE["_ga"]`, `'uid' => $_COOKIE["_gid"]`, `'uid' => $_COOKIE["_ga"]` (uid and cid), and the result is the same: a visitor who visits an HTML page and then downloads a file via this PHP code is identified as 2 different visitors. – Basj Aug 03 '17 at 11:56
  • (edit: moved to answer edit because formatting doesn't work) – Nicolas Ristock Aug 03 '17 at 12:06
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/150952/discussion-between-basj-and-nicolas-ristock). – Basj Aug 03 '17 at 12:07
  • About edit3: so this means that internally, the `analytics.js` snippet that we put on pages, does the same: it sends a request to `/collect?v=...`. Good to know! – Basj Aug 03 '17 at 12:19
  • Wonderful solution! – Basj Aug 03 '17 at 13:06
  • If accessed from a web page, `ga(function(tracker) { console.log(tracker.get('clientId')); });` is useful to get the client ID, similar to parsing the `_ga` cookie. – Basj Aug 12 '17 at 00:56