In a sentence similar to:
Lorem ipsum +dolor ++sit amet.
I'd like to match the +dolor but not the ++sit. I could do it with a lookbehind, but since JavaScript does not support it, I'm struggling to build a pattern for it.
So far I've tried it with:
(?:\+(.+?))(?=[\s\.!\!]) - but it matches both words
(?:\+{1}(.+?))(?=[\s\.!\!]) - the same here - both words are matched
and to my surprise a pattern like:
(?=\s)(?:\+(.+?))(?=[\s\.!\!])
doesn't match anything. I thought I could trick it by putting \s, or later also ^, before the + sign, but it doesn't seem to work like that.
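To make this reproducible, here is a minimal sketch that runs the three attempts against the example sentence. The variable names (sentence, attempts, results) are only for illustration:

```javascript
// Reproduction of the three attempts from the question.
const sentence = "Lorem ipsum +dolor ++sit amet.";

const attempts = [
  /(?:\+(.+?))(?=[\s\.!\!])/g,
  /(?:\+{1}(.+?))(?=[\s\.!\!])/g,
  /(?=\s)(?:\+(.+?))(?=[\s\.!\!])/g,
];

// String.prototype.match returns null when a global regex finds nothing,
// so fall back to an empty array to keep the output uniform.
const results = attempts.map((re) => sentence.match(re) || []);

// Print the attempt number followed by whatever it matched.
results.forEach((matches, i) => console.log(i + 1, matches));
```

The first two attempts both return +dolor and ++sit, and the third returns no matches at all.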
EDIT - background information:
It's not necessarily part of the question, but sometimes it's good to know what this is all good for, so here is a short explanation to clarify some of your questions/comments:
- any word, in any order, can be marked by either a + or a ++
- each word and its marking will be replaced by a <span> later
- cases like lorem+ipsum are considered invalid, because that would be like splitting a word (ro+om) or writing two words together as one word (myroom), so they have to be corrected anyway (the pattern may match them, but that is not an error); it should, however, at least match the normal cases like in the example above
- I use a lookahead like (?=[\s\.!\!]) so that I can match words in any language and not only \w characters
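To illustrate the last point, here is a small sketch of why I avoid \w. The German sample sentence and the variable names are made up for this example, and it assumes the regex is used without any Unicode-aware flags or classes:

```javascript
// Hypothetical sample containing a non-ASCII letter (ß).
const sample = "Ich mag +Straße sehr.";

// \w only covers [A-Za-z0-9_], so the match stops in front of the ß.
const withWordClass = sample.match(/\+(\w+)/);

// The lazy dot plus lookahead runs until whitespace or punctuation instead.
const withLookahead = sample.match(/\+(.+?)(?=[\s\.!\!])/);

console.log(withWordClass[1]); // "Stra"
console.log(withLookahead[1]); // "Straße"
```

The \w version cuts the word in the middle, which is why the patterns above rely on the lookahead to find the end of the word.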