I cut my teeth on Perl. I'm pretty comfortable with regular expressions (but still prone to errors).

Why does (*) work as a regular expression in an Express route named param?

Why doesn't (.*) work as a regular expression in an Express route named param?

Is something like ([\\w:./]+) a more reliable way to do it?


I'm trying to use a route parameter that is intended to have slashes in the value.

For example, if the request is:

http://www.example.com/new/https://www.youtube.com/trending

... and I'm using this route:

app.get('/new/:url', (req, res) => {
  console.log('new')
  console.log(req.params.url)
})

I want url to equal https://www.youtube.com/trending

I understand that the path is split on the slashes, so I thought I could use a regular expression in parentheses after the named parameter to also match the slashes.

I tried /new/:url(.*), which I thought should greedily match anything, including the slashes, but this made the route fail completely. Why doesn't this work?

Through my own trial and error, I found that /new/:url([\\w:./]+) works. This makes sense to me, but seems unnecessarily complex. Is this "the right way"?

The one that perplexes me the most is one I found in a YouTube video example... Why does /new/:url(*) work? The * says 0 or more of the previous item, but there's nothing before the asterisk.

I have a feeling that the answer lies in this GitHub issue, but it's not clear to me from reading the thread exactly what's happening. Does (*) rely on a bug that's likely to be corrected in the next release of Express?

Vince
  • https://github.com/expressjs/express/issues/2495 – Adam Mar 19 '18 at 10:40
  • @Adam I already read that and it either doesn't answer my question or I just don't understand. If you read my question, you'll see I made a reference to that GitHub issue. – Vince Mar 19 '18 at 19:53

1 Answer

The first part of the question is answered by the referenced GitHub issue.

From the referenced GitHub issue, I understand that the asterisk (*) isn't treated as a quantifier at all in a route path. It acts as a wildcard that matches everything, which is why (*) works.

What the GitHub issue doesn't explain is .*. Taking the known bug into consideration, I'd have expected it to match any single character followed by everything else. However, through trial and error, I've determined that the dot (.) isn't a special character in this implementation. It's just a literal dot, so .* matches a literal dot followed by everything else.
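To make that behavior concrete, here is a minimal sketch in plain JavaScript, with no Express involved. It is a simplified model, not Express's actual code: toRouteRegex is a hypothetical helper, and the two rewriting steps inside it (escape every dot and slash, then turn every asterisk into a match-everything group) are assumptions based on what I observed and on the GitHub issue.

```javascript
// Hypothetical helper: a simplified model of how the pattern seems to be
// rewritten. This is NOT Express's code; both steps are assumptions.
function toRouteRegex(paramPattern) {
  const rewritten = paramPattern
    .replace(/([\/.])/g, '\\$1') // assumed step 1: dots and slashes become literals
    .replace(/\*/g, '(.*)');     // assumed step 2: every asterisk becomes (.*)
  return new RegExp('^(' + rewritten + ')$');
}

const value = 'https://www.youtube.com/trending';

const star = toRouteRegex('*');             // built from '^((.*))$'
const dotStar = toRouteRegex('.*');         // built from '^(\\.(.*))$'
const charClass = toRouteRegex('[\\w:./]+'); // no asterisk, so only escaping applies

console.log(star.test(value));          // true
console.log(dotStar.test(value));       // false: the value doesn't start with a dot
console.log(dotStar.test('.' + value)); // true
console.log(charClass.test(value));     // true
```

Under those assumptions, the model reproduces what I saw: (*) matches anything, (.*) only matches values that start with a literal dot, and a character class with + behaves like a normal regular expression.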

For example, if the request is:

http://www.example.com/new/https://www.youtube.com/trending

... and I'm using this route:

app.get('/new/:url(.*)', (req, res) => {
  console.log('new')
  console.log(req.params.url)
})

The route wouldn't be matched, but a request for

http://www.example.com/new/.https://www.youtube.com/trending

would match (note the dot preceding the https) and req.params.url would equal .https://www.youtube.com/trending.

I used the following code to test:

const express = require('express')
const app = express()
const port = process.env.PORT || 3000

app.get('/dotStar/:dotStar(.*)', (request, response) => {
  console.log(`test request, dotStar: ${request.params.dotStar}`)
  response.send(`dotStar: ${request.params.dotStar}`)
})

app.get('/star/:star(*)', (request, response) => {
  console.log(`test request, star: ${request.params.star}`)
  response.send(`star: ${request.params.star}`)
})

app.get('/regexStar/:regexStar([a-z][a-z-]*)', (request, response) => {
  console.log(`test request, regexStar: ${request.params.regexStar}`)
  response.send(`regexStar: ${request.params.regexStar}`)
})

app.get('/dotDotPlus/:dotDotPlus(..+)', (request, response) => {
  console.log(`test request, dotDotPlus: ${request.params.dotDotPlus}`)
  response.send(`dotDotPlus: ${request.params.dotDotPlus}`)
})

app.get('/regex/:regex([\\w:./-]+)', (request, response) => {
  console.log(`test request, regex: ${request.params.regex}`)
  response.send(`regex: ${request.params.regex}`)
})

app.listen(port, () => {
  console.log(`Listening on port ${port}...`)
})

-- Also found in this gist

Vince