I have written the following regular expression:

(?<arg>(?<key>\w+)+=(?<quote>["'`])(?<value>(?:[^\k<quote>]|(?<=\\)\k<quote>)+\k<quote>))

but it doesn't work because a backreference cannot be used inside [^]. I looked for a solution in this thread and wrote this:

(?<arg>(?<key>\w+)+=(?<quote>["'`])(?<value>(?:(?!\k<quote>).|(?<=\\)\k<quote>)+\k<quote>))

However, it still doesn't work.

What am I doing wrong?

I want to extract all keys with values from strings like:

arg="value" arg='value' arg=`value` arg="value 'value'" arg='value \'value\' value' arg="value \"value\" value" arg=`value \`value\ value`

regex101 - online preview

Thomas Ayoub
  • See https://regex101.com/r/cUkNUz/3 – Wiktor Stribiżew Feb 21 '19 at 16:07
  • Thanks for your response, it works like a charm, but it includes the quote mark in the value group. Isn't there a way to make it simpler? I could just as well repeat the pattern for all three quotes; it would be similar in length and probably more efficient. –  Feb 21 '19 at 17:58
  • There was no Value group, look, [I added it](https://regex101.com/r/cUkNUz/4). – Wiktor Stribiżew Feb 21 '19 at 18:25
  • Now it works like it should, but I still think it can be optimized. 200ms for such a simple regex is a lot. Of course, thanks for your time, and I will use your regex if nothing else comes up. –  Feb 21 '19 at 18:44
  • It is not a simple regex, and it *is not* very efficient. However, regex101 shows 1 or 2 ms (JS regex option) with the current text input. If you need faster code, why not write a normal parser? Or are you trying to replicate an HTML parser? Use an existing one. – Wiktor Stribiżew Feb 21 '19 at 18:48
  • Ah, I see. Maybe it shows 180ms because I'm browsing on mobile. I think writing a parser is more difficult. I'll use it for chat messages containing commands. Thanks! –  Feb 21 '19 at 18:50

1 Answer

You may fix the regex by using the correct tempered greedy token:

(?<arg>               # Start arg group
  (?<key>\w+)         #  key group: 1+ word chars
  =                   # =
  (?<quote>['"`]?)    # quote group: an optional " ' or `
  (?<value>(?:(?!\k<quote>)[^\\])*(?:\\[\s\S](?:(?!\k<quote>)[^\\])*)*) # value group: any 0+ chars other than quote char with escaped quote chars allowed
  \k<quote>           # quote group value
)                     # end of arg group

See the regex demo.

A one-liner:

(?<arg>(?<key>\w+)=(?<quote>['"`]?)(?<value>(?:(?!\k<quote>)[^\\])*(?:\\[\s\S](?:(?!\k<quote>)[^\\])*)*)\k<quote>)

See demo.
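
For illustration, here is a minimal sketch of how the one-liner could be applied in JavaScript to collect every key/value pair. It assumes an engine with named capture groups and String.prototype.matchAll (for example, a recent Node.js or browser). The helper name parseArgs and the sample string are made up for this example and are not part of the original pattern.

```javascript
// One-liner pattern from above, with the g flag so matchAll can iterate over all matches.
const ARG_RE = /(?<arg>(?<key>\w+)=(?<quote>['"`]?)(?<value>(?:(?!\k<quote>)[^\\])*(?:\\[\s\S](?:(?!\k<quote>)[^\\])*)*)\k<quote>)/g;

// parseArgs is a hypothetical helper name used only for this sketch.
function parseArgs(text) {
  const result = [];
  for (const match of text.matchAll(ARG_RE)) {
    // Named groups are exposed on match.groups.
    const { key, quote, value } = match.groups;
    result.push({ key, quote, value });
  }
  return result;
}

// Sample input (an assumption for this sketch): three quote styles plus escaped quotes.
const sample = 'a="x" b=\'y z\' c="say \\"hi\\"" d=`tick` e="it\'s"';
const parsed = parseArgs(sample);
for (const { key, value } of parsed) {
  console.log(key + " => " + value);
}
```

Note that the value group holds the raw text between the quotes, so escaping backslashes (as in the c argument) are still present and would have to be removed in a separate step if unescaped values are needed.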

Wiktor Stribiżew