3

I am trying to parse the following URL:

http://localhost:30001/catalog/search?tags=bed-green-big-33-22-ancient-5--2

Where:

  1. bed-green-big-33-22-ancient-5 is Group 1 (the filters)
  2. --2 is Group 2 (the page number), and it is optional

My regex attempt is:

 tags=(.*)--(\d*)

It captures exactly what I need when --2 is present, but it does not match when the optional --2 at the end is missing.

Results should be: bed-green-big-33-22-ancient-5, 2.

Wiktor Stribiżew
user3685080
  • There are lots and lots of posts on SO regarding parsing url. Finding how to get started is not difficult in a web search. Please show what you have tried. Also question is not very clear as to what goal and expected results are. Please also update your problem description. See; http://stackoverflow.com/help/how-to-ask – charlietfl Sep 27 '15 at 22:46
  • I tried a lot of variants. The most successful was tags=(.*)--(\d*), but [--2] is optional. I tried making it (--(\d*))? but that didn't work. Another variant was tags=(\w+)(?:\-(\w+))+ but it captures only 1 match (bed) – user3685080 Sep 27 '15 at 22:54
  • Update question itself with what you tried... and provide an explanation of what the expected results are. – charlietfl Sep 27 '15 at 22:55
  • possible duplicate of [How do you access the matched groups in a JavaScript regular expression?](http://stackoverflow.com/questions/432493/how-do-you-access-the-matched-groups-in-a-javascript-regular-expression) – WorldSEnder Sep 27 '15 at 22:56
  • Try this `tags=(.*)(--\d*)`. If it works for you, let me know – james jelo4kul Sep 27 '15 at 23:47
  • James, that didn't work for me. It works only if there is a --2, but that part is optional – user3685080 Sep 28 '15 at 03:26

2 Answers

2

Let's consider a simple one-regex approach.

Since your string is inside the query string, you might want to watch out for parameter boundaries (& and the initial ?) and use [&?] at the pattern start. Right now, .* will match everything to the end of the URL, even if there is more than one parameter. To match groups separated with - without overmatching past &, you can use a negated character class, [^&-].
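A small sketch of the overmatching problem described above. The extra `sort=asc` parameter is a made-up addition to the question's URL, used only to show where each pattern stops:

```javascript
// Hypothetical URL: "&sort=asc" is not in the original question.
var url = 'http://localhost:30001/catalog/search?tags=bed-green&sort=asc';

// Greedy .* runs to the end of the string and swallows the next parameter.
var greedy = /tags=(.*)/.exec(url)[1];

// [^&]* stops at the parameter boundary, so only the tags value is captured.
var wholeValue = /[&?]tags=([^&]*)/.exec(url)[1];

// [^&-]* additionally stops at the first hyphen, so it captures one segment.
var firstSegment = /[&?]tags=([^&-]*)/.exec(url)[1];

console.log(greedy);       // bed-green&sort=asc
console.log(wholeValue);   // bed-green
console.log(firstSegment); // bed
```

Because [^&-] matches a single segment, the full pattern has to repeat it with a hyphen in between, which is what the regex further down does.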

The next thing to consider is the optional part, --<NUMBER>. You need to group those characters and apply a ? quantifier to the group to make it optional (? means match 1 or 0 times). To keep the match result clean, it is advisable to use a non-capturing group for the wrapper, so that only the number itself is captured.

So, the regex will look like:

[&?]tags=([^&-]*(?:-[^&-]+)*)(?:--(\d+))?
  ^      |     Main         ||    ^Grp2^| 
 Start   |   capture        ||          |
boundary |    group         || Optional |

See regex demo (\n is added since this is a multiline demo).

JS:

var re = /[&?]tags=([^&\n-]*(?:-[^&\n-]+)*)(?:--(\d+))?/;
var str = 'http://localhost:30001/catalog/search?tags=bed-green-big-33-22-ancient-5--2';
var m = str.match(re);
if (m !== null) {
    document.getElementById("r").innerHTML = "First part: <b>" + m[1] + "</b><br/>Second part: <b>" + m[2] + "</b>";
}
<div id="r"/>
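As a rough sketch, the same pattern can be wrapped in a function that also handles the case where the page number is absent. The name `parseTags` and the shape of the returned object are assumptions made for illustration, not part of the original answer:

```javascript
// Hypothetical helper: parseTags and its return shape are illustrative only.
function parseTags(url) {
    var re = /[&?]tags=([^&-]*(?:-[^&-]+)*)(?:--(\d+))?/;
    var m = re.exec(url);
    if (m === null) {
        return null; // no tags parameter in the URL
    }
    return {
        // Group 1: the hyphen-separated filters
        filters: m[1] === '' ? [] : m[1].split('-'),
        // Group 2 is undefined when the optional "--<NUMBER>" part is missing
        page: m[2] === undefined ? null : parseInt(m[2], 10)
    };
}

var withPage = parseTags('http://localhost:30001/catalog/search?tags=bed-green-big-33-22-ancient-5--2');
var withoutPage = parseTags('http://localhost:30001/catalog/search?tags=bed-green-big');

console.log(withPage.filters.join('-'), withPage.page);       // bed-green-big-33-22-ancient-5 2
console.log(withoutPage.filters.join('-'), withoutPage.page); // bed-green-big null
```

Returning null for a missing page leaves the choice of a default (for example, page 1) to the caller.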
Wiktor Stribiżew
-1

Take a look at http://regex101.com. It will give you a breakdown of what your regex is doing as well as what it matches.

Since this is a pretty simple regex, I'm not going to give it to you directly, as I imagine you're learning regex, but I will give you some hints to get you started.

You can create groups using parentheses (). Think about where you need to start matching a group and match your URL up to that point, then start your group to pull out your string of tags. Once you get to the end of that, you have a "--", which you can match against to find the start of your second group for your page number.

thenaterhood