
I'm trying to wrap my mind around regex for the first time.

For the string

I want you to MATCH THIS, you bastard regex, but also MATCH X THIS and yeah,
MATCH X X X THIS too.

Basically: a starting pattern, an end pattern, and an arbitrary number of repetitions of a pattern in between.

So I'd like successive calls to myregex.exec(string) to return

["MATCH", "THIS"]
["MATCH", "X", "THIS"]
["MATCH", "X", "X", "X", "THIS"]

I've tried variations of this

/(MATCH)\s+(X)?\s+(THIS)/

but no cigar...

Marek
  • So do you want to match all the whitespace-delimited uppercase words? – npinti Jul 15 '15 at 08:27
  • This is partially answered in http://stackoverflow.com/questions/5018487/regular-expression-with-variable-number-of-groups - see Tim Picker's comment. Short answer: in some flavours of regex you can ask for multiple captures of the same group, but not in JavaScript. – yu_sha Jul 15 '15 at 08:29
  • @yu_sha Ah, I see. Thanks! – Marek Jul 15 '15 at 08:45
  • Having said that - you can group all of it in one group and then split. – yu_sha Jul 15 '15 at 08:48

2 Answers

You can use a regular expression to match the entire sequence:

/MATCH\s+(?:X\s+)*THIS/g

To turn each match into an array of words, you can then use String.split() like this:

var out = document.getElementById( "out" );

function parse( string ){
  var re = /MATCH\s+(?:X\s+)*THIS/g;
  var matches = (string.match( re ) || [])
                  .map( function(m){ return m.split( /\s+/ ); } );
  out.innerHTML = JSON.stringify( matches );
}

parse( document.getElementById( "in" ).value );
textarea { width: 100%; }
<textarea id="in" onchange="parse( this.value )">I want you to MATCH THIS, you bad regex, but also MATCH X THIS and yeah, MATCH X X X THIS too.</textarea>
<p id="out"></p>
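The same idea can also be driven by RegExp.exec() in a loop, which is closer to the successive results asked for in the question. This is only a minimal sketch without the DOM; the helper name `matchSequences` and the shortened sample string are made up for illustration:

```javascript
// Hypothetical helper: returns one array of words per match.
function matchSequences(text) {
  // Create the regex per call so lastIndex always starts at 0.
  var re = /MATCH\s+(?:X\s+)*THIS/g;
  var results = [];
  var m;
  // With the g flag, each exec() call resumes after the previous match.
  while ((m = re.exec(text)) !== null) {
    // m[0] is the whole matched text, e.g. "MATCH X THIS".
    results.push(m[0].split(/\s+/));
  }
  return results;
}

var sample = "I want you to MATCH THIS, but also MATCH X THIS and yeah, MATCH X X X THIS too.";
console.log(JSON.stringify(matchSequences(sample)));
// [["MATCH","THIS"],["MATCH","X","THIS"],["MATCH","X","X","X","THIS"]]
```

The split happens after matching because JavaScript keeps only the last iteration of a repeated capture group, so the individual X's cannot be collected by the regex alone.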
MT0

Try moving the \s+ into the optional group and repeating that group with *:

/(MATCH)\s+(?:(X)\s+)*(THIS)/g

Note the g modifier to get all matches.
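To make the capture behaviour concrete, here is a rough sketch of what exec() returns for this pattern (written with `\s+` inside the group). The sample string and the `rows` array are only for illustration. A repeated capture group keeps just its last iteration, so each result holds at most one X:

```javascript
var re = /(MATCH)\s+(?:(X)\s+)*(THIS)/g;
var text = "MATCH THIS, MATCH X THIS and MATCH X X X THIS";
var rows = [];
var m;
// The g flag makes exec() continue from the end of the previous match.
while ((m = re.exec(text)) !== null) {
  // m[0] is the full match; m[1], m[2], m[3] are the capture groups.
  rows.push([m[0], m[1], m[2], m[3]]);
}
console.log(rows);
// rows[0] is ["MATCH THIS", "MATCH", undefined, "THIS"]
// rows[1] is ["MATCH X THIS", "MATCH", "X", "THIS"]
// rows[2] is ["MATCH X X X THIS", "MATCH", "X", "THIS"]
```

All three sequences are matched, but the second group is undefined when there is no X and holds a single "X" no matter how many were matched.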

Joe
  • `(?:(X))` - I would never have thought of that. A capture group inside a non-capturing group ... weird. It doesn't return every X for MATCH X X X THIS, but apparently that isn't possible in JavaScript... – Marek Jul 15 '15 at 08:44
  • Will match multiple `X`s but will only ever return a single `X` as a capture group using `RegExp.exec()` - and for `MATCH THIS` it will return an empty capture group for the `X`s. – MT0 Jul 15 '15 at 08:57