
My regex pattern:

const pattern = /^\/(test|foo|bar\/baz|en|ppp){1}/i;

const mat = pattern.exec(myURL);

I want to match:

www.mysite.com/bar/baz/myParam/...anything here

but not

www.mysite.com/bar/baz/?uid=100/..

myParam can be any string, with or without dashes. Anything else, such as a query string, may occur only after myParam, never immediately after baz.

I tried:

/^\/(test|foo|bar\/baz\/[^/?]*|en|ppp){1}/i;

Nothing works.

Nguyễn Văn Phong
  • I would recommend checking out a site like https://regex101.com/ which will allow you to put multiple possible test values and easily adjust your regex until you see what matches. – jwatts1980 Jan 22 '20 at 04:21
  • So would `www.mysite.com/test/bar/baz/?uid=100/` be a valid string? If not, then you should state explicitly that a param is not allowed **anywhere** following `bar/baz`. – Marc Lambrichs Jan 23 '20 at 04:56

3 Answers


This, I believe, is what you are asking for:

const myURL = "www.mysite.com/bar/baz/myParam/";
const myURL2 = "www.mysite.com/bar/baz/?uid=100";

const regex = /\/[^\?]\w+/gm;

console.log('with params', myURL.match(regex));
console.log('with queryParams', myURL2.match(regex));

You can test this and experiment further on Regex101, which also explains what each part of the regex does.

If it's not what you were asking for, there is a related question that approaches this without regex.

spersico
  • This does not seem to work for all the cases here, can you please check: https://regex101.com/r/gNSXnQ/2/ - it should not match `/foo/bar` alone, and it should not match `/foo/bar/?...something` either - it should only match `/foo/bar//..whatever`. - @Santiago Persico – geek090909 Jan 22 '20 at 05:52
  • So... only the last parameter? In the examples of my code snippet, the match should only be "myParam" for the first url and "baz" for the second one? – spersico Jan 22 '20 at 05:57
  • So basically "baz" in your example should be followed only by a string (with or without dashes, but not "?") @Santiago Persico – geek090909 Jan 22 '20 at 06:38
  • Yeah.... I don't understand the question then. Going to sleep. Good luck! – spersico Jan 22 '20 at 06:59

For the 2 example strings, you might use

^[^\/]+\/bar\/baz\/[\w-]+\/.*$

Regex demo

If you want to use the alternations as well, it might look like

^[^\/]+\/(?:test|foo|bar)\/(?:baz|en|ppp)\/[\w-]+\/.*$
  • ^ Start of string
  • [^\/]+ Match 1+ times any char except a /
  • \/ Match /
  • (?:test|foo|bar) Match 1 of the options
  • \/ Match /
  • (?:baz|en|ppp) Match 1 of the options
  • \/ Match /
  • [\w-]+ Match 1+ times a word char or -
  • \/ Match /
  • .* Match 0+ occurrences of any char except a newline
  • $ End of string

Regex demo
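
As a quick check of the first pattern above, here is a minimal sketch in JavaScript. The helper name `hasParamAfterBaz`, the extra sample URLs, and the `i` flag (carried over from the question) are assumptions for illustration, not part of the answer.

```javascript
// Pattern copied from the answer; the `i` flag is an assumption taken from the question.
const barBazWithParam = /^[^\/]+\/bar\/baz\/[\w-]+\/.*$/i;

// Hypothetical helper: true when a word/dash segment and a '/' follow "bar/baz/".
function hasParamAfterBaz(url) {
  return barBazWithParam.test(url);
}

const samples = [
  "www.mysite.com/bar/baz/myParam/...anything here", // param present
  "www.mysite.com/bar/baz/my-param/?uid=100",        // dashed param, then a query string
  "www.mysite.com/bar/baz/?uid=100/..",              // query string directly after baz
  "www.mysite.com/bar/baz/myParam",                  // param without a trailing '/'
];

for (const url of samples) {
  console.log(hasParamAfterBaz(url), url);
}
```

Note that the pattern requires a `/` after the param, so the last sample is rejected. If a bare `/bar/baz/myParam` should also pass, the final `\/.*` would have to become optional.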

The fourth bird

Using a negative lookahead or lookbehind will solve your problem. The question leaves two interpretations open:

  1. ?uid=100 is not allowed only when /bar/baz is the starting part of the path, so www.mysite.com/test/bar/baz?uid=100 should be valid.
  2. ?uid=100 is not allowed directly after /bar/baz anywhere in the string, which means that www.mysite.com/test/bar/baz/?uid=100 is invalid as well.

Option 1

In short:

\/(test|foo|bar\/baz(?!\/?\?)|en|ppp)(\/[-\w?=]+)*\/?

Explanation of the important parts:

|                        # OR
  bar                    #   'bar' followed by
  \/                     #   '/'   followed by
  baz                    #   'baz'
  (?!                    #   (negative lookahead) so, **not** followed by
    \/?                  #     0 or 1 times '/'
    \?                   #     '?'
  )                      #    END negative lookahead

and

  (                      # START group
    \/                   # '/'
    [-\w?=]+             # any word char, or '-','?','='
  )*                     # END group, occurrence 0 or more times
  \/?                    # optional '/'

Examples Option 1

You can make the lookahead even more specific with something like (?!\/?\?\w+=\w+) to make explicit that ?a=b is not allowed, but that's up to you.
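
To see how the Option 1 pattern behaves, here is a small sketch in JavaScript. The variable names, the sample URLs, and the `i` flag (carried over from the question) are assumptions for illustration. The pattern is unanchored, so it looks for the first position in the URL where it can match.

```javascript
// Option 1 pattern from above; the `i` flag is an assumption taken from the question.
const option1 = /\/(test|foo|bar\/baz(?!\/?\?)|en|ppp)(\/[-\w?=]+)*\/?/i;

const option1Samples = [
  "www.mysite.com/bar/baz/myParam/",      // param after baz
  "www.mysite.com/bar/baz/?uid=100",      // '/?' right after a leading bar/baz
  "www.mysite.com/bar/baz?uid=100",       // '?' right after a leading bar/baz
  "www.mysite.com/test/bar/baz?uid=100",  // bar/baz is not the starting part here
];

for (const url of option1Samples) {
  const match = option1.exec(url);
  // match[0] is the matched part of the path, or null when nothing matched.
  console.log(url, "->", match ? match[0] : null);
}
```

The lookahead only guards the `bar\/baz` alternative. Once another prefix such as `test` has matched, the trailing group is free to consume `/bar` and `/baz?uid=100`, which is why the last sample is accepted under this interpretation.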

Option 2

To make explicit that ?a=b is not allowed anywhere after bar/baz, we can use a negative lookbehind. Let's first find a pattern that rejects ?a=b when bar/baz directly precedes it.

Shorthand:

(?<!bar\/baz\/?)\?\w+=\w+

Explanation:

  (?<!                        # Negative lookbehind: do **not** match preceding
     bar\/baz                 # 'bar/baz'
     \/?                      # optional '/'
  )
  \?                          # match '?'
  \w+=\w+                     # match e.g. 'a=b'
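
The lookbehind fragment can be tried on its own. This is a minimal sketch that assumes a JavaScript engine with lookbehind support (added to the language in ES2018); the variable names and sample paths are illustrative only.

```javascript
// Fragment from above: a '?a=b' pair that is NOT directly preceded by 'bar/baz' or 'bar/baz/'.
// Requires an engine that supports lookbehind assertions (ES2018 and later).
const queryNotAfterBarBaz = /(?<!bar\/baz\/?)\?\w+=\w+/;

const fragmentSamples = [
  "/bar/baz/?uid=100",          // preceded by 'bar/baz/'
  "/bar/baz?uid=100",           // preceded by 'bar/baz'
  "/bar/baz/myParam/?uid=100",  // preceded by 'myParam/'
];

for (const path of fragmentSamples) {
  console.log(queryNotAfterBarBaz.test(path), path);
}
```

JavaScript allows the optional `\/?` inside the lookbehind; some other regex flavors only accept fixed-width lookbehinds, so the fragment may need rewriting elsewhere.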

Let's make this part of the complete regex:

\/(test|foo|en|ppp|bar\/baz)(\/?((?<!bar\/baz\/?)\?\w+=\w+|[-\w]+))*\/?$

Explanation:

\/                            # match '/'
(test|foo|en|ppp|bar\/baz)    # start with 'test', 'foo', 'en', 'ppp', 'bar/baz'
(\/?                          # optional '/'        
  ((?<!bar\/baz\/?)\?\w+=\w+  #   match '?a=b', with negative lookbehind (see above)
  |                           # OR
  [-\w]+)                     #   1 or more word chars or '-'
)*                            # repeat 0 or more times
\/?                           # optional match for closing '/'
$                             # end anchor

Examples Option 2
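
The complete Option 2 pattern can be exercised the same way. This is a sketch under the same assumptions: an engine with ES2018 lookbehind support, the `i` flag carried over from the question, and variable names and sample URLs chosen for illustration.

```javascript
// Option 2 pattern from above; the `i` flag is an assumption taken from the question.
const option2 = /\/(test|foo|en|ppp|bar\/baz)(\/?((?<!bar\/baz\/?)\?\w+=\w+|[-\w]+))*\/?$/i;

const option2Samples = [
  "www.mysite.com/bar/baz/myParam/",          // param after baz
  "www.mysite.com/bar/baz/myParam/?uid=100",  // query string after the param
  "www.mysite.com/bar/baz/?uid=100",          // query string directly after bar/baz
  "www.mysite.com/test/bar/baz/?uid=100",     // same, with a different prefix
  "www.mysite.com/test/?uid=100",             // query string after another prefix
];

for (const url of option2Samples) {
  console.log(option2.test(url), url);
}
```

Because the `/` inside the repeated group is optional, the same text can be split across iterations in many ways, so long inputs that fail to match may backtrack heavily. It is worth measuring that before applying the pattern to untrusted URLs.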

Marc Lambrichs