
I am trying to match the path below with a regex.

# test-site could be anything like "shoe-site", "best-clock", etc.
/content/test-site

I expected the current regex rule, (\/content\/{1}\S+), to be adequate.

The problem is that it matches the entire path, such as /content/test-site/en_US/abc.html.

I need it to match only /content/test-site.

Example of path to be matched:

https://localhost:4502/content/test-site/en_US/book/city/sample-rooms/airport-inn/propertyCode.00000.html

The regex I've tried so far:

(\/content\/[a-z-]+)\/[a-z]{2}_[A-Z]{2}\/book(ing-path)?\/(sample-|sample-details)(.*).[0-9]{5}.*

/content/test-site is optional; it is sometimes absent from the URL.

What am I doing wrong and how can I fix it?

nyedidikeke
Sri
  • If "test-site" always consists of two words (alphanumeric or underscore) separated by `-`, you may use `\/content\/\w+-\w+`. See it here: https://regex101.com/r/vjx05U/1. If it can be anything except `/`, then you may use `\/content\/[^/]+`. Please note that `{1}` is redundant. – 41686d6564 stands w. Palestine Sep 05 '19 at 23:50
  • Just replace `\S+` with `[^\/]+` – Nick Sep 05 '19 at 23:53
  • Match for character repetition. See here https://stackoverflow.com/questions/1023902/it-is-possible-to-match-a-character-repetition-with-regex-how – MayDisplay Sep 05 '19 at 23:55
  • soo... are you matching against a full URL, or just the path? Because your "path to be matched" example is a full URL... – CrayonViolent Sep 06 '19 at 00:51
  • @CrayonViolent just the path - not full url. some times it would have /content/test-site, some times not. /en_US/... etc. – Sri Sep 06 '19 at 00:58

3 Answers


Regex to match any character except /:

content\/[^\/]+

This uses a character class. A character class beginning with a caret matches any character that is not in the class.

So, with javascript:

const url = '/content/test-site/en_US/abc.html';

const path = url.match(/content\/[^\/]+/)

console.log(path[0])
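
One detail worth noting: the pattern above has no leading `\/`, so the match is `content/test-site` without the initial slash, and `match` returns `null` when the prefix is absent. A minimal sketch, assuming the input is just the path (as the asker clarified in the comments) and that `content` is a fixed first segment; the helper name `siteRoot` is hypothetical:

```javascript
// Hypothetical helper: returns "/content/<site>" or null when the prefix is absent.
function siteRoot(path) {
  // ^ anchors at the start of the path; [^\/]+ stops at the next slash.
  const m = path.match(/^\/content\/[^\/]+/);
  return m ? m[0] : null;
}

console.log(siteRoot('/content/test-site/en_US/abc.html')); // "/content/test-site"
console.log(siteRoot('/en_US/abc.html'));                   // null
```

Anchoring with `^` is a design choice here: it prevents a match on a `/content/` segment that happens to appear deeper in the path.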
ManUtopiK

Here is another approach:

(?:\/[a-zA-Z0-9-]+){2}


Explanation:

(?:                 # Open a non-capturing group
\/[a-zA-Z0-9-]+     # Match a / followed by one or more letters, digits, or hyphens
)                   # Close the group
{2}                 # Repeat the group exactly twice, i.e. match two /-prefixed segments
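
To see how this pattern behaves in practice, here is a small JavaScript sketch; the sample paths are modelled on the ones in the question. Because the pattern does not mention `content`, it matches the first two consecutive segments made up only of letters, digits, and hyphens, wherever they occur:

```javascript
const twoSegments = /(?:\/[a-zA-Z0-9-]+){2}/;

const withPrefix = '/content/test-site/en_US/abc.html'.match(twoSegments);
console.log(withPrefix[0]); // "/content/test-site"

// Without the prefix, "en_US" contains an underscore, so the engine moves on
// and matches a later pair of segments instead.
const withoutPrefix = '/en_US/book/city/sample-rooms'.match(twoSegments);
console.log(withoutPrefix[0]); // "/book/city"
```

So when the prefix is optional, this pattern can return a different part of the path instead of nothing.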
vs97
  • Why did you assume that "content" isn't a constant? – 41686d6564 stands w. Palestine Sep 06 '19 at 00:01
  • @AhmedAbdelhameed in the question it doesn't mention anywhere that content is a constant. What if it is not? We don't know, I'm leaving this for OP to decide. My answer is just one of possible approaches. – vs97 Sep 06 '19 at 00:02
  • The OP used it as a constant (i.e., in his/her pattern). I wouldn't _assume_ otherwise, but maybe that's just me :) – 41686d6564 stands w. Palestine Sep 06 '19 at 00:05
  • @AhmedAbdelhameed I see your point, I don't think my answer does any harm though, if OP confirms that it is a constant I will delete answer, no problem. – vs97 Sep 06 '19 at 00:06
  • @AhmedAbdelhameed thanks for the response. can you tell me ?: why this is needed ? thanks. – Sri Sep 06 '19 at 00:17
  • @Sri this is a non-capturing group, see here https://stackoverflow.com/questions/3512471/what-is-a-non-capturing-group-in-regular-expressions – vs97 Sep 06 '19 at 00:19
  • /content/test-site is optional , in above regex it is saying +, I am not sure if I could do a whole group zero or one time. – Sri Sep 06 '19 at 00:28
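
Regarding the last comment: a whole group can be made optional by following it with `?`. A sketch under some assumptions: the prefix group is anchored with `^`, and only the beginning of the asker's longer pattern (`\/[a-z]{2}_[A-Z]{2}\/book`) is kept so the example stays short.

```javascript
// Group 1 (the site root) is optional; the locale and "book" segments are required.
const pattern = /^(\/content\/[a-z-]+)?\/[a-z]{2}_[A-Z]{2}\/book/;

const a = '/content/test-site/en_US/book/city/sample-rooms'.match(pattern);
console.log(a[1]); // "/content/test-site"

const b = '/en_US/book/city/sample-rooms'.match(pattern);
console.log(b[1]); // undefined: the path matched, but the optional group did not take part
```

Checking `b[1] === undefined` distinguishes "path matched without a site root" from "path did not match at all" (where `match` returns `null`).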

You can use a negated character class:

^(?:[^\/]*\/){2}[^\/]*


const path = '/content/test-site/en_US/abc.html';

const desired = path.match(/^(?:[^\/]*\/){2}[^\/]*/)

console.log(desired[0])
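
This pattern takes the first two segments of any path, so on a path without the prefix it returns something else (for example `/en_US/book`). A minimal sketch, assuming `content` is a fixed first segment as in the question's own pattern, that checks the result before using it; `extractSiteRoot` is a hypothetical name:

```javascript
// Hypothetical wrapper around the pattern above.
function extractSiteRoot(path) {
  const m = path.match(/^(?:[^\/]*\/){2}[^\/]*/);
  // Accept the match only when its first segment is the literal "content".
  return m && m[0].startsWith('/content/') ? m[0] : null;
}

console.log(extractSiteRoot('/content/test-site/en_US/abc.html')); // "/content/test-site"
console.log(extractSiteRoot('/en_US/book/city'));                  // null
```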
Code Maniac