I am trying to remove Personally Identifiable Information (PII) from the URLs of our Single Page Application (SPA) as registered by Google Tag Manager.
The URLs have the form /customer/1234/invoice/5678, which I want to send to GA4 as /customer/(redacted)/invoice/(redacted).
What I did is the following:
- In GTM, I created a Custom JavaScript variable called Page location without ids with the following content. (Note: I am using {{Page URL}} here, but I also tried window.location.href, with the same effect.)
function() {
// including timestamp for debugging purposes
var url = Date.now() + {{Page URL}}.replace(/\d{4}/g, '(redacted)');
// outputting to console for debugging purposes
console.log(url);
return url;
}
- In the GA4 configuration tag (which is fired on All Pages), I opened Fields to set and set the field page_location to {{Page location without ids}}.
- I started Preview in GTM and let GTM load the website. Tag Assistant comes up on the page, and GTM reports that it is connected.
- Everything seems well so far:
  - I open the developer console on the website and see some 20 output lines of the start page URL with timestamp, generated by my GTM script.
  - In GTM's Tag Assistant I can see the modified URL in both the GTM and GA4 containers, under Variables (in the GTM container assigned to Page location without ids, in the GA4 container assigned to dl (Page Location)).
  - In GA4, I can see the modified URL in DebugView, assigned to the page_location parameter.
- However, when I navigate to a page with ids in the URL:
  - The console outputs the redacted URL, good. (4 times actually, don't know why.)
  - However, the payload of the collect call shows the (redacted) starting page URL for the dl parameter. The actual page URL (redacted or not) is not included.
  - GTM shows a History event logged by the GTM container with the redacted URL in the Page location without ids variable, good. The Page Path and Page URL variables, however, are not redacted; I don't know if this is good or bad.
  - GTM shows for the GA4 container a Page View with the (redacted) starting page URL for the dl (Page Location) parameter!
  - And GA4's DebugView also shows the starting page URL as the page_location parameter.
So for some reason I am unable to push the redacted URL into the dl parameter for GA4; instead, GA4 keeps on using the redacted initial (starting page) URL.
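
For reference, here is a minimal sketch of the redaction step on its own, runnable outside GTM. It is not the variable shown above: the pattern \d{4} matches any run of exactly four digits, so a three-digit id would pass through and a longer id would be split into several matches. The sketch assumes instead that every id is a purely numeric, whole path segment. The function name redactIds and the example.com URL are hypothetical, and inside GTM the body would still need to be wrapped in an anonymous function that reads {{Page URL}}.

```javascript
// Hypothetical helper, not part of GTM or GA4.
// Assumption: ids are purely numeric and always fill a whole path segment.
function redactIds(url) {
  // Match "/<digits>" only when followed by "/", "?", "#" or the end of the string,
  // so partial digit runs inside other segments are left alone.
  return url.replace(/\/\d+(?=[\/?#]|$)/g, '/(redacted)');
}

// Path from the question
console.log(redactIds('/customer/1234/invoice/5678'));

// Ids of other lengths, followed by a query string
console.log(redactIds('https://example.com/customer/12/invoice/123456?tab=1'));
```

The sketch uses only ES5 syntax, since GTM Custom JavaScript variables have traditionally been limited to it.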