This may look like a simple, silly question, but it is not. Please help me here.

I have a set of URLs in these two formats:

https://lenskart.sg/collections/abc/products/xyz

https://lenskart.sg/collections/abc/xyz

I only need the URLs that contain the word "collections" (the double quotes are only there to highlight the word) and do not contain the word "products".

How do I write a regex (regular expression) for this?

PS: I need to filter the URLs in Google Analytics using a regex. The best expression I have come up with so far is:

(collections/)(\w+)(/)(?!products)

But Google Analytics reports it as an invalid regex, even though it works fine in other regex testing tools. Maybe Google Analytics does not accept negative lookaheads. Here are a few links that support this:

Google Analytics Regex - Alternative to no negative lookahead
https://www.reddit.com/r/analytics/comments/5v6q4i/regex_expression_for_does_not_contain/
https://recalll.co/?q=negative%20lookahead%20-%20Google%20Analytics%20look%20ahead&type=code

Please help me here. It's a big issue for me.
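For reference, here is a minimal sketch of how the expression above behaves in an engine that does support lookaheads. It assumes Python's `re` module as the "other regex testing tool"; the helper name `is_wanted` is made up for illustration and has nothing to do with Google Analytics itself.

```python
import re

# The pattern from the question, unchanged. Python's re module accepts
# negative lookaheads, unlike the Google Analytics field described above.
PATTERN = re.compile(r"(collections/)(\w+)(/)(?!products)")

urls = [
    "https://lenskart.sg/collections/abc/products/xyz",
    "https://lenskart.sg/collections/abc/xyz",
]

def is_wanted(url: str) -> bool:
    """Hypothetical helper: True if the pattern is found anywhere in the URL."""
    return PATTERN.search(url) is not None

for url in urls:
    print(url, is_wanted(url))
```

So the logic of the pattern appears to be fine; the problem seems to be only that the lookahead syntax is rejected.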

Kartik

2 Answers


I do not think you need a complex regular expression at all; an include filter followed by an exclude filter should suffice.

Create an include filter, select request URL as the filter field, and enter "/collections/" as the filter pattern. This will dismiss all URLs that do not have "/collections/" in their path (or, to put it another way, it will only include URLs that match the pattern).

Then (order is important) create an exclude filter, select request URL as the filter field, and enter "/products" as the pattern.

Filters are applied in the order they are displayed in the view settings. Each subsequent filter works on the data the previous filter has returned, so it is often easier to split the work between multiple filters.
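The include-then-exclude chain can be sketched outside of Google Analytics like this. This is only a Python illustration of the ordering, not how Analytics implements filters; the function name `apply_filters` and the "/pages/about" path are made up for the example.

```python
import re

def apply_filters(urls, include_pattern, exclude_pattern):
    """Hypothetical helper: keep URLs matching include_pattern, then drop
    those that also match exclude_pattern, in that order."""
    # Step 1: the include filter keeps only URLs that contain the pattern.
    included = [u for u in urls if re.search(include_pattern, u)]
    # Step 2: the exclude filter runs on the output of step 1.
    return [u for u in included if not re.search(exclude_pattern, u)]

urls = [
    "/collections/abc/products/xyz",
    "/collections/abc/xyz",
    "/pages/about",  # made-up path that matches neither pattern
]

kept = apply_filters(urls, r"/collections/", r"/products")
print(kept)
```

Neither pattern needs a lookahead, because the "does not contain" part is handled by the second filter rather than by the regex.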

This is assuming that you are filtering in your view settings, but frankly if this is a filter in a report, it basically works the same way (you have to click the "advanced" link next to the filter box to access multiple filter conditions, and "Request Url" is called "Page" here, but otherwise it's basically the same).

Filters in reports do not support negative lookaheads; the (permanent) view filters allegedly do.

Eike Pierstorff
  • I have to use regex only, because I need to filter URLs in a Google Goal Filter, where I don't have include/exclude functionality. – Kartik Mar 19 '19 at 19:40

^\/lenskart\.sg\/collections\/abc((?!\/products).)*$

I'm not an expert, but the regex above matches the first and third of the following URLs, and not the second:

/lenskart.sg/collections/abc/
/lenskart.sg/collections/abc/products/xyz
/lenskart.sg/collections/abc/services
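A quick way to check this is the small Python sketch below. It assumes an engine with lookahead support (Python's `re` here), so it says nothing about whether Google Analytics will accept the pattern; the variable names are made up for the example.

```python
import re

# The pattern from this answer, unchanged. The (?!\/products) part is a
# negative lookahead applied before every character after ".../abc".
PATTERN = re.compile(r"^\/lenskart\.sg\/collections\/abc((?!\/products).)*$")

paths = [
    "/lenskart.sg/collections/abc/",
    "/lenskart.sg/collections/abc/products/xyz",
    "/lenskart.sg/collections/abc/services",
]

# Map each path to whether the whole string matches the pattern.
results = {path: PATTERN.match(path) is not None for path in paths}

for path, matched in results.items():
    print(path, matched)
```

Note that the pattern is anchored at "/lenskart.sg", so a full URL starting with "https://" would not match it as written.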