
I would like to match URLs strictly of the form /forumId/slug/ with Express' router.get(). Any additional path segments should result in a 404.

I came up with this route path: /:forumId([^?\/]{0,}):parameters1?/:slug([^?\/]{0,})?:parameters2? which, according to express-route-tester, behaves as expected.

However, Express also seems to match paths with extra tokens after the slug. Passing a JavaScript regular expression directly instead of a string has the same effect.

Here is the regex I used: /^\/(\d+)(?:\?[^\?\/]{0,})?(?:$|\/([^\?\/]{0,})(?:\?[^\?\/]{0,})?\/?$)/

Examples of strings that should match (those work):

/7721
/7721/
/7721?page=2
/7721/ForumTitle/
/7721/AnotherForumTitle?test/
/7721/YetAnotherForumTitle?page=2
/7721?page=3/ForumTitle?page=2
/7721?page=3/ForumTitle?page=2/

Example of a string that shouldn't match:

/7721?page=3/ForumTitle?page=2/threadId
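
For reference, here is a minimal sketch that runs the regex above against the example strings as plain strings, with no Express involved. The variable names (routeRegex, shouldMatch, shouldNotMatch) are only illustrative. It checks the regex in isolation and says nothing about which part of the URL Express actually hands to the matcher.

```javascript
// The regex from the question, applied to the raw example strings.
const routeRegex = /^\/(\d+)(?:\?[^\?\/]{0,})?(?:$|\/([^\?\/]{0,})(?:\?[^\?\/]{0,})?\/?$)/;

const shouldMatch = [
  '/7721',
  '/7721/',
  '/7721?page=2',
  '/7721/ForumTitle/',
  '/7721/AnotherForumTitle?test/',
  '/7721/YetAnotherForumTitle?page=2',
  '/7721?page=3/ForumTitle?page=2',
  '/7721?page=3/ForumTitle?page=2/',
];
const shouldNotMatch = ['/7721?page=3/ForumTitle?page=2/threadId'];

// One boolean per example: true when the regex behaves as intended.
const matched = shouldMatch.map((url) => routeRegex.test(url));
const rejected = shouldNotMatch.map((url) => !routeRegex.test(url));

console.log(matched.every(Boolean));  // true
console.log(rejected.every(Boolean)); // true
```

On raw strings the regex accepts and rejects exactly what the lists say, so any discrepancy has to come from what the router feeds into it.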

Express uses path-to-regexp to parse the string and, according to the docs (https://www.npmjs.com/package/path-to-regexp#user-content-usage), the 'end' option is set to true by default, which seems to confirm that the match should stop right after the slug.

What am I missing?

The version of Express is 4.16.4.

UPDATE: removing the query-parameter part (?page=2) from the regex solved the problem. Why is that? Was my regex flawed (I looked for possible greedy matches but couldn't find one), or is this expected behavior from path-to-regexp?

Kathandrax
  • Try this https://regexr.com/44hu7 and also remember about `^` and `$` operators for limit line. – Ihor Voronin Dec 07 '18 at 15:42
  • @IhorVoronin 'ForumTitle' is just an example of title but could be any series of characters. I edited the question to include the regex I used that didn't work either. – Kathandrax Dec 07 '18 at 17:13

1 Answer


As explained in the Express routing docs (https://expressjs.com/en/guide/routing.html#route-paths), "query strings are not part of the route path".

The part trying to catch the query parameters in /^\/(\d+)(?:\?[^\?\/]{0,})?(?:$|\/([^\?\/]{0,})(?:\?[^\?\/]{0,})?\/?$)/ was therefore unnecessary, resulting in the simplified /^\/(\d{1,})(?:$|\/([^\/]{0,})\/?$)/:

  • ^\/: beginning of the endpoint followed by a slash '/'
  • (\d{1,}): the forum identifier consisting of digits
  • (?:$|\/([^\/]{0,})\/?$): either the path ends there, or a slash follows with an optional slug and an optional trailing slash
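
The breakdown above can be checked with a short sketch that applies the simplified regex to path-only strings, i.e. with any query string already stripped. The sample paths and the names simplifiedRegex, samplePaths and results are illustrative, not taken from the actual application.

```javascript
// Simplified regex from the answer, applied to paths without a query string.
const simplifiedRegex = /^\/(\d{1,})(?:$|\/([^\/]{0,})\/?$)/;

const samplePaths = [
  '/7721',
  '/7721/',
  '/7721/ForumTitle',
  '/7721/ForumTitle/',
  '/7721/ForumTitle/threadId', // extra segment: should not match
];

// For each path, return the captured groups, or null when there is no match.
const results = samplePaths.map((path) => {
  const match = simplifiedRegex.exec(path);
  return match ? { forumId: match[1], slug: match[2] } : null;
});

results.forEach((result, i) => console.log(samplePaths[i], result));
```

A path with an extra segment after the slug yields null here, so the regex itself is strict as long as it only ever sees the path.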

This regex matched every provided example but also matched /7721/ForumTitle?page=2/test, which should result in a 404. On further investigation, I noticed that '/test' was being parsed as part of the query string: req.query was {"page": "2/test"}.
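
The split between path and query can be reproduced outside Express with Node's built-in WHATWG URL class. This is only a sketch of the parsing rule, not Express's own code path, and the base 'http://localhost' is a placeholder needed to parse a relative URL.

```javascript
// Everything after the first '?' belongs to the query, including later '/' and '?'.
const base = 'http://localhost'; // placeholder base, assumed for illustration

const first = new URL('/7721/ForumTitle?page=2/test', base);
console.log(first.pathname);                 // '/7721/ForumTitle'
console.log(first.searchParams.get('page')); // '2/test'

const second = new URL('/7721?page=3/ForumTitle?page=2/threadId', base);
console.log(second.pathname); // '/7721'
console.log(second.search);   // '?page=3/ForumTitle?page=2/threadId'

// The route regex only sees the pathname, so both URLs look like valid routes.
const pathRegex = /^\/(\d{1,})(?:$|\/([^\/]{0,})\/?$)/;
console.log(pathRegex.test(first.pathname));  // true
console.log(pathRegex.test(second.pathname)); // true
```

Seen this way, '/test' never reaches the route matcher: it is consumed as part of the value of the page parameter.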

This is odd, since the slash '/' is supposed to be a reserved delimiter character (see "Characters allowed in GET parameter"). Additional information would be helpful.

EDIT: to quote Douglas Wilson (from there), Express.js uses Node.js core to parse the paths according to the url spec. Node.js's URL module follows the WHATWG URL Standard.

Kathandrax