3

I have a string that contains either a number or the letter a, possibly followed by r or l.

In MATLAB, the following regexp call returns the whole match:

>> regexp('10r', '([0-9]*|a)(l|r)*', 'match')
ans = 
    '10r'

I would expect 10 and r separately, because I have two capture groups. Is there a way to get a cell array with both returned independently? I can't see it in the documentation.

RazerM

2 Answers

6

You want 'tokens' instead of 'match':

>> toks = regexp('10r', '([0-9]*|a)(l|r)*', 'tokens');
>> toks{1}
ans = 
    '10'    'r'

Or if you want to get fancy, name the tokens and get a struct array:

>> toks = regexp('10r', '(?<number>[0-9]*|a)(?<letter>l|r)*', 'names');
>> toks
toks = 
    number: '10'
    letter: 'r'
Peter
0

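If you want to match a string that either has a number or the letter a, possibly followed by r or l, note that the * in your pattern means 0 or more times, so [0-9]* can also match no digits at all and (l|r)* can match several letters in a row.

You can also use [0-9]+ to match at least one digit, and use a character class to match either r or l at most once.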

([0-9]+|a)([lr]?)

The pattern matches:

  • ([0-9]+|a) Capture group 1, match either 1+ digits 0-9 or match a
  • ([lr]?) Capture group 2, optionally match either l or r (the group captures an empty string when neither is present)
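
As a rough sketch of how this pattern could be combined with the 'tokens' option from the other answer (the variable names and the second test string 'a' are my own, purely for illustration), something like this should give you the two parts separately:

```matlab
pat = '([0-9]+|a)([lr]?)';

% 'once' returns the tokens of the first match as a flat 1x2 cell array
toks = regexp('10r', pat, 'tokens', 'once');
number = toks{1};   % text captured by group 1
letter = toks{2};   % text captured by group 2

% With no trailing l or r, group 2 still takes part in the match
% but captures an empty string, so it can be tested with isempty.
toksA = regexp('a', pat, 'tokens', 'once');
hasLetter = ~isempty(toksA{2});
```

Using 'once' avoids the extra toks{1} indexing step, at the cost of only looking at the first match in the string.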


To prevent partial matches, you could also use word boundaries:

\<([0-9]+|a)([lr]?)\>
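
To illustrate what the word boundaries change, here is a small comparison (the sample strings and variable names are made up for this sketch). Without \< and \> the pattern will happily match the 10r inside a longer word; with them, only a standalone word should match:

```matlab
loose  = '([0-9]+|a)([lr]?)';        % no word boundaries
strict = '\<([0-9]+|a)([lr]?)\>';    % anchored to the start and end of a word

samples = {'10r', 'x10r', '10rx'};   % hypothetical inputs

% With a cell array input, regexp returns one result per element;
% 'once' gives a char vector per element (empty when nothing matched).
looseHits  = regexp(samples, loose,  'match', 'once');
strictHits = regexp(samples, strict, 'match', 'once');

% Logical mask of which samples matched under each pattern
looseMatched  = ~cellfun(@isempty, looseHits);
strictMatched = ~cellfun(@isempty, strictHits);
```

Here looseMatched should be true for all three samples, while strictMatched should be true only for '10r'.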
The fourth bird