1

I have this regex (tested on regex101):

\[shortcode ([^ ]*)(?:[ ]?([^ ]*)="([^"]*)")*\]

I am trying to match it against this string:

[shortcode contact param1="test 2" param2="test1"]

Right now, the regex matches this:

[contact, param2, test1]

I would like it to match this:

[contact, param1, test 2, param2, test1]

How can I get the regex to capture every instance of the parameter pattern, rather than just the last one?
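
The behaviour described above can be reproduced with a short JavaScript sketch (JavaScript is an assumption here, chosen only for illustration; the pattern and input are copied from the question). A capture group inside a repeated group is overwritten on each iteration, so only the values from the final repetition remain:

```javascript
// The pattern from the question, unchanged.
const pattern = /\[shortcode ([^ ]*)(?:[ ]?([^ ]*)="([^"]*)")*\]/;
const input = '[shortcode contact param1="test 2" param2="test1"]';

const match = input.match(pattern);

// Drop the full match at index 0 and keep only the capture groups.
// Groups 2 and 3 sit inside the repeated (?:...)* group, so they hold
// only the values from its last iteration.
const groups = match.slice(1);
console.log(groups); // logs: ["contact", "param2", "test1"]
```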

Josh
  • See https://stackoverflow.com/questions/6579908/get-repeated-matches-with-preg-match-all/24269775 – Wiktor Stribiżew Mar 08 '19 at 12:42
  • @WiktorStribiżew So, if I were to use PHP for this, what I want is impossible? – Josh Mar 08 '19 at 12:46
  • Possible, with two regexps like [this](https://regex101.com/r/I0g6qW/4) and then [this one](https://regex101.com/r/I0g6qW/5). Or use something like `(?:\G(?!^)\s+|\[shortcode\s+(\S+)\s+)(\S+)="([^"]*)"` ([demo](https://regex101.com/r/I0g6qW/6)) – Wiktor Stribiżew Mar 08 '19 at 12:50
  • @WiktorStribiżew That worked a charm, thank you! If you post that as an answer I can confirm it. – Josh Mar 08 '19 at 13:08
  • @WiktorStribiżew Using the other regex you provided, then parsing the result further with some PHP. – Josh Mar 08 '19 at 13:15
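
The two-regex approach mentioned in the comments could look roughly like the sketch below. The exact patterns behind the regex101 links are not reproduced here; both patterns and the helper name parseShortcode are assumptions made for illustration. The first pass captures the shortcode name and the raw attribute text, and the second pass walks each name="value" pair:

```javascript
// Hypothetical helper: parse one shortcode string in two passes.
function parseShortcode(input) {
  // Pass 1: capture the shortcode name and the raw attribute text.
  const outer = /\[shortcode\s+(\S+)((?:\s+[^\s=]+="[^"]*")*)\s*\]/.exec(input);
  if (outer === null) return null;
  const [, name, rawAttributes] = outer;

  // Pass 2: collect every name="value" pair from the attribute text.
  const result = [name];
  for (const [, key, value] of rawAttributes.matchAll(/([^\s=]+)="([^"]*)"/g)) {
    result.push(key, value);
  }
  return result;
}

const parsed = parseShortcode('[shortcode contact param1="test 2" param2="test1"]');
console.log(parsed); // logs: ["contact", "param1", "test 2", "param2", "test1"]
```

Splitting the work this way keeps each pattern simple, at the cost of scanning the attribute text twice.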

2 Answers

0

Try the regex below.

regex101

Applied to your use case:

var testString = '[shortcode contact param1="test 2" param2="test1"]';

var regex = /[\w\s]+(?=[\="]|\")/gm;

var found = testString.match(regex);

If you log found, the result is:

["shortcode contact param1", "test 2", " param2", "test1"]

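The regex matches runs of alphanumeric characters, underscores, and whitespace, but only when the run is followed by = or ". Note that the first item combines the tag, the shortcode name, and the first parameter name, and that " param2" keeps its leading space, so the result still needs splitting and trimming.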

I hope this helps.

Rohini
0

You may use

'~(?:\G(?!^)\s+|\[shortcode\s+(\S+)\s+)([^\s=]+)="([^"]*)"~'

See the regex demo

Details

  • (?:\G(?!^)\s+|\[shortcode\s+(\S+)\s+) - either the end of the previous match followed by 1+ whitespaces (\G(?!^)\s+), or (|)
    • \[shortcode - the literal string [shortcode
    • \s+ - 1+ whitespaces
    • (\S+) - Group 1: one or more non-whitespace chars
    • \s+ - 1+ whitespaces
  • ([^\s=]+) - Group 2: 1+ chars other than whitespace and =
  • =" - a literal substring
  • ([^"]*) - Group 3: any 0+ chars other than "
  • " - a " char.

PHP demo

$re = '~(?:\G(?!^)\s+|\[shortcode\s+(\S+)\s+)([^\s=]+)="([^"]*)"~';
$str = '[shortcode contact param1="test 2" param2="test1"]';
$res = [];
if (preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0)) {
    foreach ($matches as $m) {
        array_shift($m);
        $res = array_merge($res, array_filter($m));
    }
}
print_r($res);
// => Array( [0] => contact [1] => param1  [2] => test 2 [3] => param2  [4] => test1 )
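
JavaScript has no \G anchor, so the pattern above does not carry over directly. A rough counterpart, offered only as a sketch, finds the shortcode header first and then uses the sticky flag (y) to consume attributes that follow one another without gaps. The function name parseContiguous and the split into two patterns are assumptions for illustration, not part of the PHP solution. Unlike the PHP pattern, this sketch still returns the name when no attribute follows.

```javascript
// Hypothetical helper: emulate \G-style contiguous matching with the sticky flag.
function parseContiguous(input) {
  // Step 1: find the header and capture the shortcode name.
  // The lookahead requires whitespace after the name, as the PHP pattern does.
  const header = /\[shortcode\s+(\S+)(?=\s)/g;
  const head = header.exec(input);
  if (head === null) return [];
  const result = [head[1]];

  // Step 2: a sticky regex matches only at lastIndex, so each attribute
  // must start exactly where the previous match ended.
  const attribute = /\s+([^\s=]+)="([^"]*)"/y;
  attribute.lastIndex = header.lastIndex;
  let m;
  while ((m = attribute.exec(input)) !== null) {
    result.push(m[1], m[2]);
  }
  return result;
}

const contiguous = parseContiguous('[shortcode contact param1="test 2" param2="test1"]');
console.log(contiguous); // logs: ["contact", "param1", "test 2", "param2", "test1"]
```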
Wiktor Stribiżew