2

What would the regular expression be to find URLs that contain both "subscription" and "stock", in any order?

/pay/subscription/confirmation?utk=67&key=13508952&efnc=d949fcf32f36e695b6f415ff20e47a20&stock=91

I'm trying to use this for filtering within Google Analytics.

I've looked at other answers such as this:

And they suggest this syntax:

(?=.*subscription)(?=.*stock) but this isn't even accepted as valid regex.


This produces the message "Your data request includes an invalid regular expression."

Dan
  • When you say it "doesn't work" in Google Analytics (e.g. `subscription.*stock`), what do you mean? Does nothing come up? – h2ooooooo Aug 08 '13 at 16:10
  • Do you want every URL that contains both “subscription” and “stock”, or does each URL only have to contain one of the words? And must the words be in that order, or could it be “stock” followed by “subscription”? – Rory O'Kane Aug 08 '13 at 16:11
  • I have beefed up my question to answer your point. – Dan Aug 08 '13 at 16:15
  • @Dan I think my answer should do what you want without using complex regex syntax – MDEV Aug 08 '13 at 16:18

2 Answers

2

If you're matching a string that contains both with some text in between (in any order):

(subscription.+stock|stock.+subscription)
MDEV
  • This seems to match anything in the query string but not the URL path, so it matches `/home/index?xxx=subscription&stock=333` but not `/subscription/index?xxx=ddd&stock=333` – Dan Aug 08 '13 at 16:41
  • @Dan It definitely matches both - what did you test it in that failed? http://regexr.com?35spb – MDEV Aug 08 '13 at 16:43
1

This is the regex you want:

subscription.*stock|stock.*subscription

The first possibility it matches is “subscription”, then optionally anything in between, then “stock”. The second possibility it matches is “stock”, then optionally anything in between, then “subscription”. These two alternatives cover the two possible orderings of the words.
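If you want to sanity-check the pattern outside Google Analytics, here is a minimal sketch using Python's `re` module. Python's engine is not the one Google Analytics uses, so this only confirms the logic of the alternation, not how the filter will behave. The second and third URLs are made up for illustration.

```python
import re

# Alternation that spells out both orderings of the two words.
pattern = re.compile(r"subscription.*stock|stock.*subscription")

urls = [
    # "subscription" first, then "stock" (shortened form of the URL in the question)
    "/pay/subscription/confirmation?utk=67&key=13508952&stock=91",
    # "stock" first, then "subscription" (hypothetical URL)
    "/stock/list?next=subscription",
    # only one of the two words (hypothetical URL)
    "/pay/subscription/confirmation?utk=67",
]

# search() looks for the pattern anywhere in the string, like an unanchored filter.
matches = [bool(pattern.search(url)) for url in urls]
print(matches)
```

Only the first two URLs should come back as matches, since the third contains "subscription" but not "stock".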

`(?=...)` is a lookahead. I'm guessing that the regex engine used by Google Analytics does not support that feature. In this case, though, the `(?=)(?=)` pattern only saves you from listing all possible orderings of the words. Since you have only two words, and so only two possible orderings, it's not too hard to write both possibilities out.
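To illustrate that the two forms accept the same strings, here is a rough comparison in Python, whose `re` module does support lookahead. This says nothing about what Google Analytics accepts; the helper name `same_result` and the sample URLs (apart from the two taken from the comments above) are assumptions for the demo.

```python
import re

# Lookahead form from the question: both words must appear somewhere ahead.
lookahead = re.compile(r"(?=.*subscription)(?=.*stock)")
# Lookahead-free form: write out both orderings explicitly.
alternation = re.compile(r"subscription.*stock|stock.*subscription")

def same_result(url: str) -> bool:
    """Return True if both patterns agree on whether the URL matches."""
    return bool(lookahead.search(url)) == bool(alternation.search(url))

samples = [
    "/subscription/index?xxx=ddd&stock=333",   # both words, subscription first
    "/home/index?xxx=subscription&stock=333",  # both words in the query string
    "/stock/index?plan=subscription",          # both words, stock first (hypothetical)
    "/subscription/index?xxx=ddd",             # only "subscription" (hypothetical)
    "/home/index",                             # neither word (hypothetical)
]

agreement = [same_result(url) for url in samples]
print(all(agreement))
```

The trade-off is that the alternation grows quickly: three words would already need six orderings, which is where lookahead becomes more attractive if the engine supports it.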

Rory O'Kane