
(Note: I found a reasonable solution using String.split() instead of Regexp.match(), but I'm still interested in the theoretical regexp question.)

Given a string that may or may not end with the letter a, and may have any number of letters a in other positions, is there a regexp that lets me capture the trailing a if present as one group, and all previous characters as another? E.g.:

Input     Group 1   Group 2
'a'       ''*       'a'
'b'       'b'       ''
'ba'      'b'       'a'
'baaa'    'baa'     'a'
'baaab'   'baaab'   ''

* nil instead of the empty string would also be acceptable

Some things I've tried that haven't worked:

  • The naive approach: /^(.*)(a?)$/
  • The same, but with a numeric repetition limit: /^(.*)(a{0,1})$/
  • The same, but with an atomic group: /^(.*)((?>a?))$/
  • The same, but with negative lookahead in the first group: /^(.*(?!=a))(a?)$/

All of these fail to capture the trailing a if present:

input     expected     actual
'a'       '', 'a'      'a', ''
'ba'      'b', 'a'     'ba', ''
'baaa'    'baa', 'a'   'baaa', ''
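
For reference, a minimal Ruby sketch that reproduces the table above. The patterns are copied verbatim from the list; the `PATTERNS` hash, its labels, and the `captures_for` helper are names made up for this illustration only.

```ruby
# The four attempts listed above, keyed by labels invented for this sketch.
PATTERNS = {
  naive:     /^(.*)(a?)$/,
  counted:   /^(.*)(a{0,1})$/,
  atomic:    /^(.*)((?>a?))$/,
  lookahead: /^(.*(?!=a))(a?)$/
}.freeze

# Hypothetical helper: returns the capture groups as an array,
# or nil if the pattern does not match at all.
def captures_for(pattern, input)
  match = pattern.match(input)
  match && match.captures
end

# Print what each pattern captures for the inputs that end in "a".
%w[a ba baaa].each do |input|
  PATTERNS.each do |label, pattern|
    puts "#{input.inspect} #{label}: #{captures_for(pattern, input).inspect}"
  end
end
```

In every case the first group ends up holding the whole input and the second group is the empty string.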

The closest I've been able to come is to use | to split between the cases with and without a trailing a. This captures the right strings, but it produces twice as many capture groups, so I need some additional checking to decide whether to use the left or right pair of groups:

  • /^(?:(.*)(a)$|(.*[^a])()$)/
input     expected      actual
'a'       '', 'a'       '', 'a', nil, nil
'b'       'b', ''       nil, nil, 'b', ''
'ba'      'b', 'a'      'b', 'a', nil, nil
'baaa'    'baa', 'a'    'baa', 'a', nil, nil
'baaab'   'baaab', ''   nil, nil, 'baaab', ''
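
The additional checking can be kept fairly small. Below is a minimal sketch, assuming the pair of groups from the branch that did not match always comes back as nil (as in the table above). `ALTERNATION` and `split_trailing_a` are hypothetical names introduced here for illustration.

```ruby
# The alternation pattern from above, unchanged.
ALTERNATION = /^(?:(.*)(a)$|(.*[^a])()$)/

# Hypothetical helper: returns [leading_part, trailing_a] for the input,
# or nil if the pattern does not match.
def split_trailing_a(input)
  match = ALTERNATION.match(input)
  # An empty input matches neither branch, so match is nil in that case.
  return nil unless match

  # Only one branch can match, so exactly one pair of groups is non-nil.
  # compact drops the nil pair and keeps the matched pair, including
  # any empty-string captures.
  match.captures.compact
end

p split_trailing_a('baaa')
p split_trailing_a('baaab')
```

Using `compact` relies on empty captures being `''` rather than nil; an equivalent spelling would be `[match[1] || match[3], match[2] || match[4]]`.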

The solution I've found is to throw out Regexp.match entirely and just use String.split. This comes close enough for my purposes:

  • input.split(/(a?)$/)
input     expected      actual
'a'       '', 'a'       '', 'a'
'b'       'b', ''       'b' (close enough)
'ba'      'b', 'a'      'b', 'a'
'baaa'    'baa', 'a'    'baa', 'a'
'baaab'   'baaab', ''   'baaab' (close enough)
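
If the "close enough" rows need to match the expected column exactly, the missing second element can be padded in. A minimal sketch, where `split_on_trailing_a` is a hypothetical wrapper name and not something from the standard library:

```ruby
# Hypothetical wrapper around the split call above that always returns
# a two-element array [leading_part, trailing_a].
def split_on_trailing_a(input)
  # split drops trailing empty strings, so inputs without a trailing "a"
  # yield a single element; destructuring leaves tail as nil in that case.
  head, tail = input.split(/(a?)$/)
  [head || '', tail || '']
end

p split_on_trailing_a('b')
p split_on_trailing_a('baaa')
```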

This works, but I'd still like to know if there's a way to do it as a straight regexp match.

David Moles
