Google Cloud Platform lets you create log labels using the RE2 regex engine.
How can I create a regex that matches the path in the URL?
Example matches:
https://example.com/awesome --> "awesome"
https://example.com/awesome/path --> "awesome/path"
https://example.com/awesome/path/ --> "awesome/path"
https://example.com/awesome/path?arg1=123 --> "awesome/path"
Details:
- The domain and protocol are constant; they can be assumed to be https://example.com here.
- If there are multiple directories, they should all be matched, including the / between them.
- A trailing / should NOT be matched.
- Queries, e.g. ?arg1=123&arg2=456, should NOT be matched.
- It can be assumed that directory names contain only alphanumeric characters (a-zA-Z0-9), dashes (-), and underscores (_).
Note that Google RE2 is not the same as PCRE2; for example, it does not support lookarounds.
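
To make the requirements above concrete, here is a minimal sketch of a small harness that checks one candidate pattern against the four example URLs. It uses Go because Go's standard regexp package accepts RE2 syntax, so a pattern that works here should be a reasonable starting point. The pattern itself, the name pathPattern, and the helper extractPath are assumptions made for illustration; whether the GCP label extractor accepts this exact pattern has not been verified here.

```go
package main

import (
	"fmt"
	"regexp"
)

// pathPattern is a hypothetical candidate, not a confirmed solution.
// It anchors on the constant prefix, then captures one or more directory
// names separated by single slashes. Because the capture must end on a
// directory-name character, a trailing "/" and any "?query" are left out.
var pathPattern = regexp.MustCompile(
	`^https://example\.com/([a-zA-Z0-9_-]+(?:/[a-zA-Z0-9_-]+)*)`,
)

// extractPath returns the captured path, or "" if the URL does not match.
func extractPath(url string) string {
	m := pathPattern.FindStringSubmatch(url)
	if m == nil {
		return ""
	}
	return m[1] // contents of the single capture group
}

func main() {
	// The four example URLs from the question.
	urls := []string{
		"https://example.com/awesome",
		"https://example.com/awesome/path",
		"https://example.com/awesome/path/",
		"https://example.com/awesome/path?arg1=123",
	}
	for _, u := range urls {
		fmt.Printf("%s --> %q\n", u, extractPath(u))
	}
}
```

The sketch relies only on a character class, a non-capturing group, and repetition, and avoids lookarounds, which RE2 does not support. The wanted text is returned through a single capture group rather than as the whole match.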