
I am trying to parse an input string using a regular expression, but I am having a problem capturing a repeating group: I always seem to match only the last instance of the group. I have tried using reluctant (non-greedy) quantifiers, but I seem to be missing something. Can someone help?

Regular expressions tried:

(OS)\\s((\\w{3})(([A-Za-z0-9]{2})|(\\w{3})(\\w{3}))\\/{0,1}){1,5}?\\r

(OS)\\s((\\w{3}?)(([A-Za-z0-9]{2}?)|(\\w{3}?)(\\w{3}?))\\/{0,1}?){1,5}?\\r

Input String:

OS BENKL/LHRBA/MANQFL\r\n

I always seem to capture only the last group, MANQFL (MAN QFL), but my aim is to get all three groups (there can be 1 to 5 groups):

(BEN KL), (LHR BA) and (MAN QFL).

Any help will be appreciated.

omshanti

1 Answer


When you repeat a capturing group in a regular expression, the capturing group only stores the text matched by its last iteration. If you need to capture multiple iterations, you'll need to use more than one regex. (.NET is a notable exception to this: its CaptureCollection provides the matches of all iterations of a capturing group.)
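
As a rough illustration of the "more than one regex" approach, here is a minimal sketch. It uses std::regex rather than boost::regex so that it needs no external library, and it assumes each group is three word characters followed by two or three alphanumerics, with groups separated by a slash. The function name parse_groups and both patterns are made up for this example and are not taken from the question.

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns the (first, second) parts of every group.
// Assumed format: "OS " + 1 to 5 groups such as BENKL or MANQFL, separated by '/'.
std::vector<std::pair<std::string, std::string>> parse_groups(const std::string& input) {
    std::vector<std::pair<std::string, std::string>> result;

    // Regex 1: validate the line and capture the whole group list as ONE string.
    // The repeated part is non-capturing, so nothing is lost to "last iteration wins".
    static const std::regex line_re(R"(OS\s((?:\w{3}[A-Za-z0-9]{2,3}/?){1,5})\r)");
    std::smatch line_match;
    if (!std::regex_search(input, line_match, line_re)) {
        return result;  // input does not look like an OS line
    }

    // Regex 2: matches a single group; the iterator visits every occurrence in turn.
    static const std::regex group_re(R"((\w{3})([A-Za-z0-9]{2,3}))");
    const std::string list = line_match[1].str();
    for (std::sregex_iterator it(list.begin(), list.end(), group_re), end; it != end; ++it) {
        result.emplace_back((*it)[1].str(), (*it)[2].str());
    }
    return result;
}
```

Calling parse_groups("OS BENKL/LHRBA/MANQFL\r\n") with these assumed patterns yields the pairs (BEN, KL), (LHR, BA) and (MAN, QFL). The same two-step idea should carry over to boost::regex, since the loop only works when the pattern being iterated describes one group rather than the whole line.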

Just a learner
  • My intention was to use boost::regex_search to achieve this so that I can loop, but the loop executes only once because it always matches the last instance. Is there any way to get around this? std::string::const_iterator start = str.begin(), end = str.end(); while (regex_search(start, end, what, expr)) { cout << what[0]; cout << what[1]; ... start += what.position() + what.length(); } – omshanti Jun 28 '10 at 15:06