
I would like to parse URL query params intelligently using regex.

Things I've had to consider: 1) params can be in any order; 2) only certain params need to be matched.

Given a query string: "?param1=test1&param2=test2&param3=test3" I would like to run a JavaScript regex to parse param1's value and param3's value.

The regex I've come up with so far is:

/(?:[?&](?:param1=([^&]*)|param3=([^&]*)|[^&]*))+$/g

This regex seems to work fine for me on sites like https://regex101.com/.

However, when I run the JS snippet below, I always get undefined for $2, which is where param1's value should end up. Any help or suggestions?

"?param1=test1&param2=test2&param3=test3".replace(
/(?:[?&](?:param1=([^&]*)|param3=([^&]*)|[^&]*))+$/g,
 function ($0, $1, $2, $3) { return $0 + ' ' + $1 + ' ' + $2 + ' ' + $3; });

This returns $2 as undefined and $3 as test3. However, if I exclude both param2 and param3 from the url query string, I am successfully able to parse param1 as $2. A bit confused about that.

thanks!

prasanth
programmer33
  • What is your expected output of the replace? – Damon Nov 05 '16 at 06:12
  • I would like $0 and $1 to output 'test1' and 'test3'. It might actually be a JavaScript limitation: http://stackoverflow.com/questions/3537878/how-to-capture-an-arbitrary-number-of-groups-in-javascript-regexp – programmer33 Nov 05 '16 at 06:22
  • You are finding those capture groups, but this regex only MATCHES one time. I still don't understand what your expected output of this operation is. Do you want to extract the values from param 1 and 3 into an array like `['test1', 'test3']` or do you want to replace them? Do you want to replace the whole string? – Damon Nov 05 '16 at 06:48
  • JavaScript `replace` will replace only the matched text with the return value of the callback. That capture group is undefined because you are only accessing the final repetition of that alternation group. tl;dr: I can't see this regex solving any real-world problem the way it is written. You must give us the expected output for any meaningful assistance :). – Damon Nov 05 '16 at 07:07
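
The point about the final repetition can be checked directly. The sketch below runs the question's regex through `match` and then, as one possible alternative, collects every pair with a simpler global pattern. The helper name `collectParams` and the pair pattern are assumptions made for illustration, not something proposed in the thread. Note also that a `replace` callback receives the whole match first, then one argument per capture group, then the match offset.

```javascript
const query = "?param1=test1&param2=test2&param3=test3";

// The question's regex (without the g flag, so match() returns capture groups).
const original = /(?:[?&](?:param1=([^&]*)|param3=([^&]*)|[^&]*))+$/;
const m = query.match(original);

// Captures inside a repeated group are reset on every repetition, so only
// the last repetition (the one that matched param3) leaves a value behind.
console.log(m[1]); // undefined
console.log(m[2]); // test3

// Hypothetical helper: match one name=value pair at a time with a global
// regex and collect the pairs into a plain object.
function collectParams(queryString) {
  const pair = /[?&]([^=&]+)=([^&]*)/g;
  const params = {};
  for (const match of queryString.matchAll(pair)) {
    params[match[1]] = match[2];
  }
  return params;
}

const params = collectParams(query);
console.log(params.param1, params.param3); // test1 test3
```

Because each pair is matched separately, the order of the parameters in the query string does not matter, and unwanted parameters can simply be ignored.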

2 Answers

.*parameterName=([^&\s]+)

This pattern captures the parameter's value in group 1. For example, for this URL: https://www.youtube.com/watch?v=aiYpDDHKy18&list=RDaiYpDDHKy18&start_radio=1

.*list=([^&\s]+)

captures the list id, "RDaiYpDDHKy18", and

.*start_radio=([^&\s]+)

captures the value "1".
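
A minimal sketch of using this kind of pattern from JavaScript follows. The helper name `getParam` is hypothetical, and the sketch makes two changes that are assumptions rather than part of the answer: it uses the character class `[^&\s]`, and it replaces the leading `.*` with `[?&]` so that `list` cannot match inside a longer name such as `playlist`.

```javascript
// Hypothetical helper: returns the value of `name` in `url`, or null if absent.
function getParam(url, name) {
  // Escape regex metacharacters so the parameter name is matched literally.
  const escaped = name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // [?&] requires the name to start right after "?" or "&".
  const match = url.match(new RegExp("[?&]" + escaped + "=([^&\\s]+)"));
  return match ? match[1] : null;
}

const url = "https://www.youtube.com/watch?v=aiYpDDHKy18&list=RDaiYpDDHKy18&start_radio=1";
console.log(getParam(url, "list"));        // RDaiYpDDHKy18
console.log(getParam(url, "start_radio")); // 1
console.log(getParam(url, "missing"));     // null
```

The captured value is always a string, so `start_radio` comes back as "1" and needs `Number(...)` if a numeric value is wanted.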

David Rechtman

If they're in an arbitrary order, you can use lookaheads to find them and capture the param values. The pattern you want (for your test string) is this:

^(?=.*param1=([^&]+)|)(?=.*param2=([^&]+)|)(?=.*param3=([^&]+)|).+$


It'll also accept query strings that are missing parameters, because each lookahead alternates on either the parameter/value pair or the empty string, using the |s at the end.

You'll need an additional lookahead for each parameter you hope to grab, and there's no way around this. With your original pattern, the capture groups sit inside a repeated group, so the only thing you're gonna be hanging on to is the most recent match for a given capture group.
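
As a rough sketch of how the lookahead pattern could be used from JavaScript (the wrapper name `extractParams` and the shape of the returned object are assumptions made for illustration):

```javascript
// One optional lookahead per parameter, as in the pattern above.
const pattern = /^(?=.*param1=([^&]+)|)(?=.*param2=([^&]+)|)(?=.*param3=([^&]+)|).+$/;

// Hypothetical wrapper: returns the three captured values, or null when
// the string does not match at all (for example, an empty string).
function extractParams(query) {
  const m = query.match(pattern);
  if (!m) return null;
  // A parameter that is absent leaves its group undefined.
  return { param1: m[1], param2: m[2], param3: m[3] };
}

console.log(extractParams("?param1=test1&param2=test2&param3=test3"));
// { param1: 'test1', param2: 'test2', param3: 'test3' }

console.log(extractParams("?param3=c&param1=a"));
// { param1: 'a', param2: undefined, param3: 'c' }
```

Each lookahead scans the whole string from the start, which is why the parameter order does not affect which group a value lands in.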

Sebastian Lenartowicz