I need to convert Markdown links to <a href=""> links, but the same regex that works in PHP does not work in JavaScript. What am I doing wrong?

Regex: \[([^]]*)\]\(([^\s^\)]*)[\s\)]

Here is the demo link where you can see for yourself: https://regex101.com/r/kZbw7g/5

It seems that the caret has a different meaning in JavaScript.

Any help is appreciated.

ffox003
  • Escape `]` inside a character class. Also, note that you do not have to escape `)` inside a character class as it is not a special char there. – Wiktor Stribiżew Apr 23 '18 at 12:35

1 Answer

Look at this

 \[
 ( [^]]* )                     # (1)
 \]
 \(
 ( [^\s^\)]* )                 # (2)
 [\s\)]

This is a readable, formatted version of your regex.

There are rules governing how character classes are parsed (in engines
other than Java and .NET).

A class is opened by an unescaped [.

Rule #1: The very next unescaped ] closes the class.

The only exception is when the ] is the first item in the class: []].

This exception holds for a negated class as well, [^]], because
the ] is still the first item in the class.

This is why your regex works in the PCRE demo.

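However, the JS engine has no such grace for ] being the first item.
It accepts [] as an empty class that matches nothing and [^] as a class
that matches any character, so your [^]]* is parsed as [^] followed by ]*.
So, the ] must be escaped: [\]] and [^\]]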

See the JS version here https://regex101.com/r/kZbw7g/6
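
As a minimal sketch of the fix in JavaScript, the snippet below applies the escaped pattern with String.prototype.replace. The function name markdownLinksToAnchors and the sample input are made up for illustration; the pattern is the one from the question with only the ] escaped.

```javascript
// Hypothetical helper (name not from the original post): converts
// Markdown links like [text](url) into <a href="url">text</a>.
function markdownLinksToAnchors(text) {
  // Same pattern as in the question, but with ] escaped inside the class.
  const fixed = /\[([^\]]*)\]\(([^\s^\)]*)[\s\)]/g;
  return text.replace(fixed, '<a href="$2">$1</a>');
}

// The unescaped pattern still compiles in JS, but [^] is read as
// "any character", so the link text can only be one arbitrary
// character followed by zero or more ] characters.
const unescaped = /\[([^]]*)\]\(([^\s^\)]*)[\s\)]/g;

const sample = "See [Google](https://google.com) now";

// Prints: See <a href="https://google.com">Google</a> now
console.log(markdownLinksToAnchors(sample));

// Prints the input unchanged, because the unescaped pattern finds no match.
console.log(sample.replace(unescaped, '<a href="$2">$1</a>'));
```

Note that the pattern is otherwise left as the question wrote it, so the second class [^\s^\)] still excludes a literal ^ as well as whitespace and ).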

Addendum:

Additionally, you can see how and why the grace exists for an unescaped
] when it is the first item.

It seems that originally, when the rules for class parsing were designed,
it was decided NOT to allow EMPTY classes [].

It could be that handling the fix-up state was more than the designers
wanted to take on. In other words, they wanted class parsing to need
very little intelligence.

Who knows why.

Here is the kicker: empty classes are not allowed.
If a ] comes immediately after the start of a class (first item),
it cannot possibly be the closing ], because that would be an illegal
empty class.

However, since []] does have a valid closing bracket, there is no
ambiguity: the first item ] is a literal.

Thus the First Item Exception was born.
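
The contrast with JavaScript can be checked directly. This is a small sketch, assuming a regex without the u flag in an engine such as Node.js or a browser; the variable names are illustrative only. JavaScript does allow the empty class [], so the first ] always closes the class there.

```javascript
// In JS, [] is a legal empty class that can never match,
// so /[]]/ means "empty class, then a literal ]".
const emptyThenBracket = /[]]/;
console.log(emptyThenBracket.test("]"));   // false

// [^] matches any single character (newlines included),
// so /^[^]]$/ means "any character, then a literal ]".
const anyThenBracket = /^[^]]$/;
console.log(anyThenBracket.test("a]"));    // true
console.log(anyThenBracket.test("a"));     // false

// Escaping the bracket gives the intended classes.
const onlyBracket = /^[\]]$/;
const notBracket = /^[^\]]$/;
console.log(onlyBracket.test("]"));        // true
console.log(notBracket.test("a"));         // true
console.log(notBracket.test("]"));         // false
```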

This is the proof:

PCRE https://regex101.com/r/v0uBHA/1
JS https://regex101.com/r/foqcpL/1
Python https://regex101.com/r/sko0Qv/1
Go https://regex101.com/r/Nhx8Ks/1