
I am using Google Analytics to see which pages of a website are viewed the most, and I need a selection based on a regular expression.

What I need to select:

/index.php?page=1203
/index.php?page=12
/index.php?page=15&print=1

Basically, I first need to match the literal string /index.php?page= followed by a number of any length, digits only, so no commas. After this, anything can follow, so I am thinking a * will do. Please give one pattern with and one without the *, because I need to target both.

Thanks in advance!

jeroen

1 Answer


You can use this regex with a capture group:

\/index\.php\?page=(\d+)

After matching the literal text /index.php?page=, this captures one or more digits in the first capture group.
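As a rough illustration of the capture group, here is a minimal Python sketch. Python's `re` module is only a stand-in here, since Google Analytics uses its own regex engine; the helper name `page_id` and the sample paths are assumptions made for the example.

```python
import re

# Same pattern as above; group 1 holds the page number.
PAGE_RE = re.compile(r"\/index\.php\?page=(\d+)")


def page_id(path):
    """Return the digits after 'page=' as a string, or None if there is no match."""
    match = PAGE_RE.search(path)
    return match.group(1) if match else None


if __name__ == "__main__":
    samples = [
        "/index.php?page=1203",
        "/index.php?page=12",
        "/index.php?page=15&print=1",
        "/index.php?page=abc",  # no digits after 'page=', so no match
    ]
    for path in samples:
        print(path, "->", page_id(path))
```

Because the pattern is not anchored at the end, it also matches paths that carry extra parameters after the number; only the digits land in group 1.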

anubhava
  • Thanks for the quick answer, but Google Analytics throws an error saying it's an invalid Regex, the rules for Regex within GA are listed here: https://support.google.com/analytics/answer/1034324?hl=en. Not quite sure what the problem is. – jeroen Oct 22 '13 at 09:44
  • I edited my question, because I need to look for the literal string /index.php?page= and not index.php?page=. Not sure if this causes the error. – jeroen Oct 22 '13 at 09:45
  • The regex above is using a lookbehind: the construct `(?<=EXPRESSION)`. This is not supported in all regular expression engines (including JavaScript). – James S Oct 22 '13 at 09:45
  • Alright, I removed the lookbehind from my regex. Try it now; your number will be available in backreference #1. – anubhava Oct 22 '13 at 09:47
  • Thanks for the edit @anubhava, and the note by James. It works! – jeroen Oct 22 '13 at 09:47
  • Glad to know it worked. Sorry I didn't know much about `Google Analytics regex` capabilities that's why I gave lookbehind based regex earlier. – anubhava Oct 22 '13 at 09:48
  • Ta. Note the regex above will not include the &print=1 in your last example. If you need this bit as well then try `\/index\.php\?page=(\d+)(&[^\s]+)?` (i.e. the previous expression optionally followed by an & and one or more non-whitespace characters). – James S Oct 22 '13 at 09:54
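
The variant from the last comment can be sketched the same way. This is an illustration in Python rather than in Google Analytics itself; the function name `split_page_path` is hypothetical, and `fullmatch` is used so that a path only counts when the whole string fits the pattern (in a GA filter, anchoring the pattern with `$` would presumably play a similar role).

```python
import re

# Pattern from the comment: page number, then an optional '&...' suffix.
WITH_SUFFIX_RE = re.compile(r"\/index\.php\?page=(\d+)(&[^\s]+)?")


def split_page_path(path):
    """Return (page_number, suffix) if the whole path matches, else None.

    suffix is None when nothing follows the page number.
    """
    match = WITH_SUFFIX_RE.fullmatch(path)
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    for path in ["/index.php?page=12", "/index.php?page=15&print=1"]:
        print(path, "->", split_page_path(path))
```

Checking whether the second group is `None` separates the paths with nothing after the number from those with extra parameters, which covers both cases asked for in the question.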