Regex for detecting url in plain form and in markdown

Question

I am trying to detect user input in a textarea that might be a URL (and, similarly, an email address) in any of these three formats:

1. Just a plain URL.
2. Markdown with a title: `[text](url "title")`
3. Markdown without a title: `[text](url)`

Now, I have a JavaScript regex for each of the three formats, and each one works by itself. But if I want to handle all 3, the first one prevents the second and third from ever activating. In my code, regex detection is triggered on 'space'. Therefore, if the first regex is active, the one for markdown with a title is never triggered.

I am wondering whether it is possible to write the regex for the 1st format so that it specifically excludes the 2nd and 3rd formats. Or, even better, is there a single regex that matches all 3?

Also, since I am not that good at regex, I'd love it if someone could explain the regex in their solution, so that I could try to do the same for email detection.

Thank you!

steinerkelvin · Accepted Answer · 2017-03-01T15:16:03.863

Firstly, the second regex already works for the third format, so we only need to join the first and second ones.

The simple way to do this is to use the | ("OR") character, like this:

(<firstRegex>)|(<secondRegex>)

Demo

The problem with this is that it messes up the capturing groups. If the regex matches the first pattern, the URL ends up in a different capturing group (the 4th in my demo) than if it matches the second one (the 2nd group).
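To make the group-numbering issue concrete, here is a minimal sketch. The two patterns are deliberately simple, hypothetical stand-ins (the question's actual regexes are not shown), and `findLinks` is an illustrative helper name. The markdown branch comes first here, so the group numbers differ from the demo, but the point is the same: the URL lands in a different group depending on which branch matched, so the code has to check both.

```javascript
// Hypothetical stand-ins, NOT the asker's regexes.
// Markdown link: group 1 = text, group 2 = url, group 3 = optional title.
const markdownLink = /\[([^\]]*)\]\((\S+?)(?:\s+"([^"]*)")?\)/;
// Plain URL (scheme required in this sketch): one capturing group.
const plainUrl = /(https?:\/\/[^\s()]+)/;

// Join both with "|". In the combined regex the plain URL becomes group 4,
// because the markdown branch already owns groups 1 to 3.
const combined = new RegExp(
  `(?:${markdownLink.source})|(?:${plainUrl.source})`,
  "g"
);

function findLinks(input) {
  const results = [];
  for (const m of input.matchAll(combined)) {
    if (m[2] !== undefined) {
      // The markdown branch matched: URL is in group 2.
      const title = m[3] === undefined ? null : m[3];
      results.push({ kind: "markdown", text: m[1], url: m[2], title });
    } else {
      // The plain branch matched: URL is in group 4.
      results.push({ kind: "plain", url: m[4] });
    }
  }
  return results;
}

console.log(findLinks('[a](http://a.com "t") and http://b.org'));
```

Because the regex engine reports the leftmost match, a markdown link is consumed as a whole before the plain branch gets a chance to match the URL inside it.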

Excluding markdown pattern on plain URL regex

Adding (?:^|[^\(\/]) to the beginning of the plain URL pattern forces the match to begin either at the start of the input or with one character that is neither an opening parenthesis nor a slash, which excludes the markdown case. The URL must then be extracted using a capturing group, since that leading character is included in the match.
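A small sketch of that prefix in use, assuming a simplified plain URL pattern that requires `http://` or `https://` (a hypothetical stand-in, not the regex from the question; `extractPlainUrl` is an illustrative name). A pattern with an optional scheme would likely need extra care, since the prefix character can otherwise be taken from inside the URL itself.

```javascript
// Prefix from the answer: start of input, or one character that is
// neither "(" nor "/". The URL body is a hypothetical stand-in that
// requires a scheme; the URL itself is captured in group 1.
const plainOnly = /(?:^|[^\(\/])(https?:\/\/[^\s()]+)/;

function extractPlainUrl(input) {
  const m = plainOnly.exec(input);
  // m[0] may include the leading character, so return the capture group.
  return m ? m[1] : null;
}

console.log(extractPlainUrl("see http://example.com "));   // plain URL is found
console.log(extractPlainUrl("[text](http://example.com")); // markdown in progress is skipped
```

In the "detect on space" scenario, this means a half-typed `[text](url` is left alone, so the user can still add the title.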

Demo

edited Mar 01 '17 at 15:16

answered Feb 27 '17 at 00:26

steinerkelvin

Thank you but it still has an issue. I should have explained the problem a little better. In my code, on 'space', the regex detection is triggered. Therefore, if I have the first regex, then the one with markdown title is never triggered. Any thoughts on how to avoid that? – geoboy Feb 27 '17 at 00:35
Sorry, I don't get it. Could you provide a JSFiddle demonstrating the regex failing? – steinerkelvin Feb 27 '17 at 00:41
JSFiddle is a little hard to reproduce in this case but let me try and explain again before the fiddle. My code tries to detect as the user types but lazily. So when you start typing in the url and add a space (or a comma), it will detect it as a url. All good. But now when you type in markdown format `[text](url` the moment you add space, the regex for the 1st case is triggered and creates a url, thus preventing the user from adding in the title ever. Does that explanation help? – geoboy Feb 27 '17 at 00:51
I think I got one regex that matches plain URLs but fails to match the markdown ones. I used [this regex](https://stackoverflow.com/a/3809435/1967121) and added this pattern `(?:^|[^\(])`, which will match anything but an opening parenthesis, to the beginning. [Demo](https://regex101.com/r/Jrl4RO/5) – steinerkelvin Feb 27 '17 at 01:39
oo, this is close but requires http. It would be ideal to combine the pattern that you added to maybe the 1st url matching pattern that I had? btw, appreciate your help! – geoboy Feb 27 '17 at 02:33
[it seems to work](https://regex101.com/r/fpoBwa/5). I had to add `/` to the exclusion. – steinerkelvin Feb 27 '17 at 12:02
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/136789/discussion-between-geoboy-and-kelvinss). – geoboy Feb 27 '17 at 19:30
