Is there a simple way to refer to one Javascript literal (e.g. "string") within another regexp literal?
I'm fairly familiar with Javascript Regexp but far from a guru. I'm trying to write a simple parser for a small handful of expression types. For example, one type is expressions like:
`value gender 1='Male' 2 ='Female' 3="Didn't answer" >3 = 'Other';`
Rather than write a whole parser in, say, Jison, and take on the attendant learning curve, I thought it would be simple enough to use RegExp.
It appears Javascript Regexp can't capture an arbitrary number of repeating subgroups, and there's no clear character to split on, so I'm parsing subgroups with their own regexps.
The following works okay, but the regexp literals are far from DRY, and all but unreadable. Each higher level construct repeats the lower level constructs.
var re_value_stmt = /value\s+(\w+)((?:\s+(?:[^=]+[=](?:(?:["][^"]+["])|(?:['][^']+[']))))+)/i
var re_value_clause = /([^=]+[=](\s*(?:(['][^']*['])|(["][^"]*["])))+)/ig
var re_value_elems = /([^=]+)[=]\s*(?:(?:[']([^']*)['])|(?:["]([^"]*)["]))/ig
console.log(re_value_elems.exec("1='Male'"));
console.log(re_value_clause.exec("1=\"Male\" 2=\"Female\""));
console.log(re_value_stmt.exec("value gender 1='Male' 2='Female'"));
For instance, `(?:(?:["][^"]+["])|(?:['][^']+[']))` just means `QuotedString`. Can I write that instead?
Is there a simple way to refer to one Javascript literal (e.g. "string") within another regexp literal? Specifying regexps by munging strings might work, but it also seems awkward and error-prone (e.g. needing to escape quote marks and escape the escapes).
Or is this already the poster child for why people create parsers based on grammars and move out of Regexp?
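To make the string-munging idea concrete, here is a minimal sketch of one way it might look, assuming an ES2020-capable runtime (tagged templates and `String.prototype.matchAll`). The helper `re` and the names `quotedString`, `valueElem`, `valueStmt`, and `parseValueStmt` are hypothetical and not part of the code above. The idea is to splice the `.source` of one regexp literal into another through a tagged template, whose raw strings keep backslashes as typed, so `\s` and `\w` need no double escaping.

```javascript
// Hypothetical helper: re(flags) returns a template tag that builds a RegExp.
// Any RegExp interpolated with ${...} contributes its .source (its own flags
// are ignored); the literal chunks are taken raw, so backslashes survive.
function re(flags) {
  return (strings, ...parts) => {
    const source = strings.raw
      .map((chunk, i) => {
        if (i >= parts.length) return chunk; // last chunk has no interpolation
        const part = parts[i];
        return chunk + (part instanceof RegExp ? part.source : String(part));
      })
      .join("");
    return new RegExp(source, flags);
  };
}

// Named building blocks (illustrative names and patterns, simplified from the
// originals). Group 1 or group 2 of quotedString holds the unquoted text.
const quotedString = /(?:"([^"]*)"|'([^']*)')/;
const valueElem = re("g")`\s*([^=\s][^=]*?)\s*=\s*${quotedString}`;
const valueStmt = re("i")`^value\s+(\w+)((?:${valueElem})+)`;

// Two-stage parse: match the whole statement, then walk its clauses.
function parseValueStmt(text) {
  const m = valueStmt.exec(text);
  if (!m) return null;
  const mapping = {};
  for (const e of m[2].matchAll(valueElem)) {
    // e[2] is set for a double-quoted value, e[3] for a single-quoted one.
    mapping[e[1]] = e[2] !== undefined ? e[2] : e[3];
  }
  return { name: m[1], mapping };
}

console.log(
  parseValueStmt(`value gender 1='Male' 2 ='Female' 3="Didn't answer" >3 = 'Other';`)
);
```

With the sample statement, this should report the name `gender` and map the labels `1`, `2`, `3`, and `>3` to their quoted text. Two caveats with this kind of splicing: capture groups inside a spliced fragment shift the group numbering of the outer regexp, and a backtick or `${` inside a pattern would still need escaping in the template.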