I have a number of scenarios that I am trying to account for, but I can't seem to nail down my match string (#regexbeginner). Unfortunately, no JavaScript is possible, as this regex is being used within Adobe Analytics's Classification Rule Builder.
What I am after are three groups:
- Base URL (not including http[s]:\/\/www.)
- The tracking code (everything after the ?, but before the #)
- The hash (everything after the #)
The thing is, the tracking codes and hashes are optional. Both might appear, one of them might appear, or neither might appear. There can also never be more than one tracking code or more than one hash present in the URL, and the hash will never appear before the tracking code.
Here is where I have got to so far:
^http[s]:\/\/www.(.+\/.+)\?(.+)?#(.+)?
This works fine if there is both a tracking code and a hash, but it does not match at all if only one of them, or neither, is present.
Below are my test cases. All of them need to return three groups, but I understand that group 2 and/or group 3 may be empty.
- https://www.example.com/en-US/tires/wrangler-duratrac
- https://www.example.com/en-US/tires/wrangler-duratrac/sizes-specs
- https://www.example.com/en-US/tires/wrangler-duratrac#
- https://www.example.com/en-US/tires/wrangler-duratrac#121
- https://www.example.com/en-US/tires/wrangler-duratrac?
- https://www.example.com/en-US/tires/wrangler-duratrac?sku=150638601
- https://www.example.com/en-US/tires/wrangler-duratrac?sku=150638601#121
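To make the test cases above reproducible outside of Adobe Analytics, here is a minimal Python sketch that runs them against two patterns. ORIGINAL is the attempt quoted above. CANDIDATE is only one possible pattern that makes the ? and # delimiters optional; it is an assumption for illustration and has not been verified in Classification Rule Builder, whose regex flavor may differ from Python's. The helper name split_url is hypothetical.

```python
import re

# The attempt from the question: both '?' and '#' are mandatory here,
# so any URL missing either one cannot match.
ORIGINAL = re.compile(r"^http[s]:\/\/www.(.+\/.+)\?(.+)?#(.+)?")

# Hypothetical candidate (an assumption, not verified in Adobe's tool):
#   ([^?#]+)  base URL: everything up to the first '?' or '#'
#   \??       optional '?' delimiter
#   ([^#]*)   tracking code: may be empty
#   #?        optional '#' delimiter
#   (.*)      hash: may be empty
CANDIDATE = re.compile(r"^https?:\/\/www\.([^?#]+)\??([^#]*)#?(.*)$")

URLS = [
    "https://www.example.com/en-US/tires/wrangler-duratrac",
    "https://www.example.com/en-US/tires/wrangler-duratrac/sizes-specs",
    "https://www.example.com/en-US/tires/wrangler-duratrac#",
    "https://www.example.com/en-US/tires/wrangler-duratrac#121",
    "https://www.example.com/en-US/tires/wrangler-duratrac?",
    "https://www.example.com/en-US/tires/wrangler-duratrac?sku=150638601",
    "https://www.example.com/en-US/tires/wrangler-duratrac?sku=150638601#121",
]


def split_url(url):
    """Return (base, tracking_code, hash) or None if the URL does not match."""
    match = CANDIDATE.match(url)
    return match.groups() if match else None


if __name__ == "__main__":
    for url in URLS:
        original_matches = bool(ORIGINAL.match(url))
        print(f"{url}\n  original matches: {original_matches}\n  candidate groups: {split_url(url)}")
```

The candidate relies on negated character classes rather than .+ so that the base URL group cannot swallow the ? or #, which is what lets groups 2 and 3 come back empty instead of the whole match failing.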
Any help would be appreciated. Feel like this should be easy for someone with a little experience.
Thanks, Chris