
I have the following script (used in Google Tag Manager):

function() {
  try {
    var cml = document.cookie.match("comagic_visitor.+=.+%7C%7C.+(\\d{6})\;")[1];
    if (cml !== undefined) {
    return cml;
    }   
  } catch(e) {}  
  return 'false';
}

It has to get the cookie value. The name of the cookie may change, but the first part of it always remains unchanged: "_comagic_visitor".

For some reason, when I run the code in the console I get the correct value:

PHPSESSID=3reongfce35dl150rbdkkllto0; region=2; region2=2; _gat_UA-XXXXXX-2=1; _ym_visorc_263098=w; _ga=GA1.3.26804606X.X431002649; _comagic_visitorTH17k=cASNWQ3N9mRZT8tSmUtTGs5IG9LaD7BPHtCCiEpq_fpSnSKGMcCsEG0kPVur16gH%7C%7C124972212; _comagic_sessionTH17k=203937260

VALUE: 972212

But using it with Tag Manager I get 937260, which as you can see is the last 6 digits of "_comagic_session".

Unfortunately I'm not good at debugging and my JS skills aren't good enough to figure out how to fix this. Any ideas on what I have to fix?

Paul
  • This isn't really a google-analytics question. There's no code attempting to pass the result to google-analytics, so I answered the question about fixing the cookie matching code. If you still need help with the google analytics portion, ask a new question focusing on the google analytics part. – Paul May 07 '15 at 14:43

1 Answer

Any ideas on what I have to fix?

You need to fix the regular expression used for matching. The argument in parentheses in document.cookie.match() is a regular expression; when it is given as a string, as here, it is converted to one.

From MDN, document.cookie is a string of all the cookies, separated by "; ". Since document.cookie is simply a String, document.cookie.match() is simply calling String.prototype.match(). String.prototype.match(regexp) returns an Array of matches for the regular expression parameter regexp, or null if there is no match.
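To illustrate what match() hands back, here is a minimal sketch. The sample string and variable names are made up for illustration and are not real cookies.

```javascript
// Hypothetical sample string in the same "name=value; name=value" shape.
var sample = "a=1; b=22";

// With a non-global regexp, match() returns an array:
// index 0 is the whole match, index 1 is the first capture group.
var hit = sample.match(/b=(\d+)/);
var wholeMatch = hit[0]; // "b=22"
var firstGroup = hit[1]; // "22"

// When nothing matches, match() returns null,
// so indexing the result with [1] would throw a TypeError.
var miss = sample.match(/c=(\d+)/); // null
```

The null case is presumably why the code in the question wraps the lookup in try/catch.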

The regexp you are using is:

comagic_visitor.+=.+%7C%7C.+(\\d{6})\;

This regexp means a match must satisfy all of the following conditions:

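  1. comagic_visitor matches the literal text comagic_visitor. The pattern is not anchored, so this can occur anywhere in the string.
  2. .+ followed by one or more other characters (any character except line terminators). This can be dangerous, because it can also match ; and run into other cookies.
  3. = followed by =
  4. .+ followed by one or more other characters. Again, this can be dangerous for the same reason.
  5. %7C%7C followed by the literal text %7C%7C, the URL-encoded form of ||. This is a little dangerous depending on where you use it: if the value were decoded before matching, it would be || instead, and an unescaped | means "or" in a regexp.
  6. .+ followed by one or more other characters
  7. (\\d{6}) the parentheses capture this part as the result of the match, and \d{6} is exactly 6 digits. The extra \ is there only because the pattern is written as a string; it would be unnecessary if you used /regexp/ instead of "regexp".
  8. \; is an escaped ; (the escape is not needed), so a ; must follow the six digits.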

Primary Issue: This regexp is much too loose and can match far more than intended. .+ is greedy: it matches as much as it can, which lets the regexp match the beginning of the desired cookie, run across all the cookies that follow, and capture the digits of some later cookie. Since the individual cookies in document.cookie are probably not guaranteed to be in any particular order, and the set of cookies can differ from one context to another, a greedy match can behave inconsistently. When no later cookie value ends in six digits followed by ;, you will get the correct result. At other times you won't: the .+ matches too much and the six digits are taken from that later cookie, which would explain getting 937260 from _comagic_session.
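The difference can be reproduced outside the browser. The sketch below uses the cookie string from the question and, as an assumption, a second version with one extra made-up cookie (other=1) appended, so that the _comagic_session value is followed by a ;. The pattern is the one from the question, written as a regexp literal.

```javascript
// Cookie string exactly as shown in the question.
var fromQuestion = "PHPSESSID=3reongfce35dl150rbdkkllto0; region=2; region2=2; _gat_UA-XXXXXX-2=1; _ym_visorc_263098=w; _ga=GA1.3.26804606X.X431002649; _comagic_visitorTH17k=cASNWQ3N9mRZT8tSmUtTGs5IG9LaD7BPHtCCiEpq_fpSnSKGMcCsEG0kPVur16gH%7C%7C124972212; _comagic_sessionTH17k=203937260";

// Assumption for illustration: one more (made-up) cookie after the session
// cookie, so the session value is now followed by ";".
var withExtra = fromQuestion + "; other=1";

// Same pattern as in the question, as a literal instead of a string.
var loose = /comagic_visitor.+=.+%7C%7C.+(\d{6});/;

// Only the visitor value is followed by ";" here, so the result is as hoped.
var looseOnQuestion = fromQuestion.match(loose)[1]; // "972212"

// Now the greedy .+ runs on into the session cookie and captures its digits.
var looseOnExtra = withExtra.match(loose)[1]; // "937260"
```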

Alternative #1: Write a short function that splits the cookie string on ; into an array of strings, feeds each string into match separately, and returns the first match. This prevents the regexp from making a bad match across the full cookie string.
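A minimal sketch of that approach follows. The function name findVisitorDigits is hypothetical, and the pattern is the one from the question with the final ; replaced by $, on the assumption that the six digits are the last characters of the cookie value.

```javascript
// Hypothetical helper: returns the last six digits of the first cookie whose
// name contains "comagic_visitor", or null if no cookie matches.
function findVisitorDigits(cookieString) {
  var parts = cookieString.split(";");
  for (var i = 0; i < parts.length; i++) {
    // Each part holds a single "name=value" pair, so even a greedy .+
    // cannot wander into a neighbouring cookie.
    var m = parts[i].match(/comagic_visitor.+=.+%7C%7C.+(\d{6})$/);
    if (m) {
      return m[1];
    }
  }
  return null;
}
```

It would be called as findVisitorDigits(document.cookie). The order of the cookies and whatever follows the visitor cookie no longer affect the result.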

Alternative #2: Fix the regexp to match only what you want. You can use https://regex101.com/ or a console window to test regular expressions. Possibly you can change the .+ to [^;]+, which changes "any char except newline" to "any char except ;". That might fix it: to match across multiple cookies in the full cookie string, the regexp has to be allowed to match a ;, and if we deny that, those false matches should be impossible.

Like this:

var cml = document.cookie.match(/comagic_visitor[^;]+(\d{6})\;/)[1];

This works for me in Node.js:

var d = "PHPSESSID=3reongfce35dl150rbdkkllto0; region=2; region2=2; _gat_UA-XXXXXX-2=1; _ym_visorc_263098=w; _ga=GA1.3.26804606X.X431002649; _comagic_visitorTH17k=cASNWQ3N9mRZT8tSmUtTGs5IG9LaD7BPHtCCiEpq_fpSnSKGMcCsEG0kPVur16gH%7C%7C124972212; _comagic_sessionTH17k=203937260";
var r = /comagic_visitor[^;]+(\d{6})\;/;
d.match(r)[1];
// ---> '972212'
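One caveat: the regexp above still requires a ; after the six digits, so it finds nothing when the visitor cookie happens to be the last one in the string. The sketch below is a slightly more defensive variant, not a tested Tag Manager setup. The function name getComagicVisitorDigits is hypothetical, and the 'false' fallback simply mirrors the code in the question.

```javascript
// Hypothetical helper; takes the cookie string as an argument so it can be
// tried in Node.js as well as in a browser.
function getComagicVisitorDigits(cookieString) {
  // [^;]+ keeps the match inside a single cookie.
  // (?:;|$) accepts either a following ";" or the end of the string.
  var m = cookieString.match(/comagic_visitor[^;]+(\d{6})(?:;|$)/);
  // match() returns null when nothing matches, so check before indexing.
  return m ? m[1] : 'false';
}
```

In Tag Manager the body would presumably go inside the anonymous function from the question, with document.cookie as the input.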
Abdul Aziz Barkat
Paul