1

I currently have a regex to match a URL subpath. It looks like this:

^(?!^__.*__$).[a-zA-Z0-9_.-]+$

I want to disallow ONLY strings with exactly 2 underscores at the beginning and at the end, because that form is reserved. Any number of underscores other than 2 should be allowed.

For example:

_should_work_
__should_work___
_should_work___
__should_not_work__

The problem is that the regex still fails to match even when the string has more than 2 underscores:

___should_work_but_doesnt__________

You can check out the regex here:

https://regex101.com/r/H9F1NN/1
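The behaviour described in the question can be reproduced with a small sketch. It runs the question's pattern, unchanged, against the question's own sample strings; the variable names are only illustrative. The likely cause is that `.*` inside the lookahead also matches underscores, so any string that starts with at least two underscores and ends with at least two underscores is rejected.

```javascript
// The pattern from the question, unchanged.
const original = /^(?!^__.*__$).[a-zA-Z0-9_.-]+$/;

// Sample strings taken from the question.
const samples = [
  '_should_work_',
  '__should_work___',
  '_should_work___',
  '__should_not_work__',
  '___should_work_but_doesnt__________',
];

// Map each sample to whether the original pattern accepts it.
const originalResults = Object.fromEntries(
  samples.map((s) => [s, original.test(s)])
);

console.log(originalResults);
```

Only `_should_work_` and `_should_work___` are accepted; the other three are rejected, including the two that have three or more underscores on one side.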

Kelok Chan
  • Maybe it would be easier to do a positive match, like for example `^__[^_]+__$`, and reject the strings that do match the regex. – Cristik Jan 18 '23 at 11:18
  • Maybe this one helps: [`^(?!__(?!_).*[^_]__$)\w+$`](https://regex101.com/r/q07vvO/1) – bobble bubble Jan 18 '23 at 13:36
  • @bobblebubble this one works, but sadly it does not work in Safari :/ https://stackoverflow.com/questions/51568821/works-in-chrome-but-breaks-in-safari-invalid-regular-expression-invalid-group – Kelok Chan Jan 21 '23 at 05:49
  • @KelokChan: So, if you want to `"disallow __test__ but allow anything else"`, which of the answers fit that criteria? I think your title, description and examples here are in sync, but there is a disconnect with the fiddle you gave. – Peter Thoeny Jan 21 '23 at 08:45
  • Also, what should `__` (string made of two underscores), or `___` (string made of three underscores) give? And is regex a must (if yes, why)? – Cristik Jan 21 '23 at 09:55
  • @KelokChan Have you tested it on Safari? I can't see any reason why it should not work there, it does not contain any lookbehind, just lookaheads which are supported afaik. However for simplicity and compatibility I'd stick to [@PeterThoeny's answer with a slight modification](https://stackoverflow.com/questions/75154645/regex-to-prevent-double-underscores-at-the-beginning-and-the-end-of-the-string#comment132688364_75165744). – bobble bubble Jan 21 '23 at 12:11

3 Answers

3

You can use

^(?!_(?!_))(?!(?:.*[^_])?_$)[\w.-]+$

See the regex demo.

Details:

  • ^ - start of string
  • (?!_(?!_)) - the string must not start with a _ that is not immediately followed by another _ char
  • (?!(?:.*[^_])?_$) - the string must not end with a _ that is immediately preceded by a char other than _, or that is the only char in the string
  • [\w.-]+ - one or more letters, digits, underscores, dots or hyphens
  • $ - end of string.
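To see what this pattern accepts in practice, here is a quick sketch that runs it over the sample strings used in the comments on this answer (plus a plain `ok`). The variable names are only illustrative.

```javascript
// The pattern from this answer, unchanged.
const pattern = /^(?!_(?!_))(?!(?:.*[^_])?_$)[\w.-]+$/;

// Sample strings from the comment thread, plus a plain 'ok'.
const inputs = [
  'ok', '_ok_', '_ok__', '__ok_too_',
  '__bad__', '__bad_as_well__', '___ok__', '___ok____',
];

// Split the inputs by whether the pattern matches them.
const accepted = inputs.filter((s) => pattern.test(s));
const rejected = inputs.filter((s) => !pattern.test(s));

console.log('accepted:', accepted);
console.log('rejected:', rejected);
```

The pattern rejects `_ok_`, `_ok__` and `__ok_too_` (a lone underscore at either end) and accepts the rest, including `__bad__`. That matches the fiddle's expected cases rather than the examples in the question text, which is the disconnect discussed in the comments.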
Wiktor Stribiżew
  • This incorrectly fails on `___ok__` and `___ok____` – Peter Thoeny Jan 19 '23 at 23:21
  • No, [both are matched fine](https://regex101.com/r/jAsJSp/3). – Wiktor Stribiżew Jan 20 '23 at 07:54
  • Try `['_ok_', '_ok__', '__ok_too_', '__bad__', '__bad_as_well__', '___ok__', '___ok____'].forEach(str => { console.log(str, '=>', /^(?!_(?!_))(?!(?:.*[^_])?_$)\w+$/.test(str)); });`. If I understand the OP and his examples, he wants to disallow *only* strings that have exactly 2 leading and trailing underscores, and allow strings with 1 or 3+ underscores on either side – Peter Thoeny Jan 20 '23 at 08:10
  • @PeterThoeny Your understanding is not correct. See [OP regex attempt fiddle](https://regex101.com/r/H9F1NN/1) where all expected cases are given. – Wiktor Stribiżew Jan 20 '23 at 08:12
  • Hmm, then there is a disconnect between the examples in the fiddle, and his title, description, and examples in the OP – Peter Thoeny Jan 20 '23 at 08:17
  • Basically I want to disallow `__test__` but allow anything else if that helps. The original fiddle has been modified by someone so it's not the same anymore – Kelok Chan Jan 21 '23 at 05:48
  • @KelokChan If you want to disallow `__test__` you allow `test`. – Wiktor Stribiżew Jan 21 '23 at 11:17
2

You can negate the test, which makes it easier:

const regex = /^__(?:[^_]|[^_].*[^_])__$/;
['ok', 'ok_too', '_ok_', '_ok__', '__ok_too_', '__bad__', '__bad_as_well__',
 '__b__', '___ok__', '___o__', '___ok____'
].forEach(str => {
  console.log(str, '=>', !regex.test(str));
});

Output:

ok => true
ok_too => true
_ok_ => true
_ok__ => true
__ok_too_ => true
__bad__ => false
__bad_as_well__ => false
__b__ => false
___ok__ => true
___o__ => true
___ok____ => true

Explanation of regex:

  • ^ -- anchor at start of string
  • __ -- expect two underscores
  • (?: -- start of non-capture group (for logical or)
    • [^_] -- expect a non-underscore char
  • | -- logical or
    • [^_].*[^_] -- expect a non-underscore char, any number of chars, and a non-underscore char
  • ) -- end of non-capture group
  • __ -- expect two underscores
  • $ -- anchor at end of string
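Because the negated test only detects the reserved form, it does not check the allowed characters on its own. A minimal sketch of combining the two checks follows; the helper name `isAllowedSubpath` is hypothetical, and the character class is assumed from the regex in the question.

```javascript
// Reserved form: exactly two underscores on each side (pattern from this answer).
const reservedName = /^__(?:[^_]|[^_].*[^_])__$/;

// Allowed characters, assumed from the character class in the question.
const allowedChars = /^[a-zA-Z0-9_.-]+$/;

// Hypothetical helper: true when the string uses only allowed characters
// and is not of the reserved __name__ form.
function isAllowedSubpath(s) {
  return allowedChars.test(s) && !reservedName.test(s);
}

['ok', '__bad__', '__b__', '___ok__', 'has space'].forEach((s) => {
  console.log(s, '=>', isAllowedSubpath(s));
});
```

Keeping the two checks separate keeps each regex simple and needs no lookarounds.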

Note that this regex avoids lookarounds, which are not universally supported.

Learn more about regex: https://twiki.org/cgi-bin/view/Codev/TWikiPresentation2018x10x14Regex

UPDATE 1: Changed regex from ^__[^_].*[^_]__$ to ^__(?:[^_]|[^_].*[^_])__$ to account for a single char, as in __x__, too.

Peter Thoeny
  • I like this idea but in my understanding e.g `__a__` should be false too. To fix this, you can simply add an optional group: `^__[^_](?:.*[^_])?__$` – bobble bubble Jan 21 '23 at 12:08
  • @bobblebubble: Good point, I updated the regex with a logical `or` to support that corner case. Your regex is good too – Peter Thoeny Jan 21 '23 at 22:21
1

You can do something like

^(?!^__[^_]+(_[^_]+)*__$).[a-zA-Z0-9_.-]+$
^(?!^__[^_].*(?<!_)__$).[a-zA-Z0-9_.-]+$

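Here [^_]+(_[^_]+)* matches a string that neither starts nor ends with an underscore and contains no consecutive underscores, while [^_].*(?<!_) matches any string that neither starts nor ends with an underscore. The second pattern relies on a lookbehind, which older Safari versions do not support.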
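As a rough check, the sketch below runs both patterns over a few sample strings. The variable names and the `__a__b__` input are illustrative additions, chosen to show a case where the two patterns disagree. It needs a JavaScript engine with lookbehind support.

```javascript
// First pattern: the inner part may not contain consecutive underscores.
const noRepeat = /^(?!^__[^_]+(_[^_]+)*__$).[a-zA-Z0-9_.-]+$/;

// Second pattern: uses a lookbehind to require a non-underscore before the final __.
const withLookbehind = /^(?!^__[^_].*(?<!_)__$).[a-zA-Z0-9_.-]+$/;

const cases = ['__bad__', '__bad_as_well__', '___ok__', '_ok_', '__a__b__'];

// Record what each pattern says about each input.
const comparison = cases.map((s) => ({
  input: s,
  first: noRepeat.test(s),
  second: withLookbehind.test(s),
}));

console.log(comparison);
```

Both patterns reject `__bad__` and `__bad_as_well__` and accept `___ok__` and `_ok_`. They differ on `__a__b__`: the first accepts it, because the inner part contains a double underscore, while the second rejects it.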

Etienne Laurin