Is there a regular expression reg, so that for any string str the results of str.split(".") and str.match(reg) are equivalent? If multiline should somehow matter, a solution for a single line would be sufficient.
As an example, consider the RegExp /[^\.]+/g: for the string "nice.sentance", "nice.sentance".split(".") gives the same result as "nice.sentance".match(/[^\.]+/g), namely ["nice", "sentance"]. However, this does not hold for every string. For the empty string "", for example, they give different results: "".split(".") returns [""] and "".match(/[^\.]+/g) returns null. This means /[^\.]+/g is not a solution, as it would need to work for any possible string.
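The comparison described above can be reproduced with a small check. This is only a sketch; the helper name sameAsSplit is hypothetical and not part of any library, and it compares the two results by their JSON representation.

```javascript
// Hypothetical helper: does str.match(reg) give the same array as str.split(".")?
function sameAsSplit(str, reg) {
  const expected = str.split(".");
  const actual = str.match(reg); // null when the global regex finds nothing
  return JSON.stringify(actual) === JSON.stringify(expected);
}

// The example from the question: both sides agree here.
console.log(JSON.stringify("nice.sentance".split(".")));       // ["nice","sentance"]
console.log(JSON.stringify("nice.sentance".match(/[^\.]+/g))); // ["nice","sentance"]

// The empty string: split yields [""], match yields null.
console.log(JSON.stringify("".split(".")));       // [""]
console.log(JSON.stringify("".match(/[^\.]+/g))); // null

console.log(sameAsSplit("nice.sentance", /[^\.]+/g)); // true
console.log(sameAsSplit("", /[^\.]+/g));              // false
```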
The question comes from a misinterpretation of another question here and left me wondering. I do not have a practical application for it at the moment and am interested because I could not find an answer; it looks like an interesting RegExp problem. It may, however, be impossible.
Things I have considered:
- Imho it is fairly clear that reg needs the global flag, removing capture groups as a possibility
- /[^\.]+/g does not match empty parts, e.g. for "", ".a" or "a..a"
- /[^\.]*/g produces additional empty strings after non-empty matches, because when iteration starts for the next match, it can fit in an empty match. E.g. for "a"
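To make the two failure modes listed above concrete, here is a small sketch that runs both candidate patterns against the strings mentioned. The function name compareCandidates is hypothetical and exists only for this illustration.

```javascript
// Hypothetical helper: collects the split result and both candidate match results.
function compareCandidates(str) {
  return {
    split: str.split("."),
    plus: str.match(/[^\.]+/g), // skips empty parts; null if nothing matches
    star: str.match(/[^\.]*/g), // adds an empty match after each non-empty one
  };
}

for (const str of ["", ".a", "a..a", "a"]) {
  console.log(JSON.stringify(str), JSON.stringify(compareCandidates(str)));
}
// ""     -> split [""],          plus null,      star [""]
// ".a"   -> split ["","a"],      plus ["a"],     star ["","a",""]
// "a..a" -> split ["a","","a"],  plus ["a","a"], star ["a","","","a",""]
// "a"    -> split ["a"],         plus ["a"],     star ["a",""]
```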
With features not currently available in JavaScript (but available in other languages), one could repair the previous flaw:
/(?<=^|\.)[^\.]*/g
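For engines that do support lookbehind assertions (they were added to the language in ECMAScript 2018), the pattern can be checked directly. The sketch below assumes such an engine, e.g. a current Node.js; the names lookbehindReg and agreesWithSplit are hypothetical.

```javascript
// Assumption: the engine supports lookbehind (ES2018+); older engines
// reject this literal with a syntax error.
const lookbehindReg = /(?<=^|\.)[^\.]*/g;

// Hypothetical helper: true when match and split produce the same array.
function agreesWithSplit(str) {
  const viaMatch = str.match(lookbehindReg);
  return JSON.stringify(viaMatch) === JSON.stringify(str.split("."));
}

for (const str of ["", "a", ".a", "a.", "a..a", "nice.sentance"]) {
  console.log(JSON.stringify(str), agreesWithSplit(str)); // true for each sample
}
```

A match may only start at the beginning of the string or directly after a dot, so the spurious empty match after a non-empty part can no longer occur.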
My conclusion here would be that genuinely empty parts need to be matched, but without "looking behind" they cannot be distinguished from the empty matches that occur between a non-empty match and the following dot or end of line. This seems too vague to count as a proper argument for it being impossible, but it may already be enough. There might, however, be a RegExp feature I don't know about, e.g. one that advances the index after a match without including the symbol, or something similar that could be used as a trick.
Allowing some correction step on the array resulting from match makes the problem trivial.
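As an illustration of why a correction step makes this trivial, here is one possible sketch (my own choice of correction, not the only one). It takes the result of /[^\.]*/g and drops every empty string that directly follows a non-empty match, since those are exactly the extra matches at the terminating dot or end of string. The function name splitViaMatch is hypothetical.

```javascript
// Hypothetical helper: emulates str.split(".") using match plus a filter step.
function splitViaMatch(str) {
  // /[^\.]*/g can always match the empty string, so the result is never null.
  const raw = str.match(/[^\.]*/g);
  // Drop an empty match only when the match right before it was non-empty.
  return raw.filter((part, i) => !(part === "" && i > 0 && raw[i - 1] !== ""));
}

for (const str of ["", "a", ".a", "a.", "a..a", "nice.sentance"]) {
  console.log(JSON.stringify(str), JSON.stringify(splitViaMatch(str)));
}
```

This relies on post-processing the array, so it does not answer the question of whether a single regular expression suffices.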
I found some related questions, which as expected utilize look-behind or capture groups though: