I have the following set of strings:

some_param[name] 
some_param_0[name]

I wish to capture some_param, 0, name from them. My regex knowledge is pretty weak. I tried the following, but it doesn't work for both cases.

/^(\D+)_?(\d{0,2})\[?(.*?)\]?$/.exec("some_param_0[name]") //works except for the trailing underscore on "some_param"

What would be the correct regex?

Parag

3 Answers

/^(\w+?)_?(\d{0,2})(?:\[([^\[\]]*)\])?$/

(\w+?) uses a non-greedy quantifier to capture the identifier part without any trailing _.

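_? is greedy, so it takes the underscore whenever one is there; the lazy +? in the previous part stops as early as it can and leaves the underscore for it.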

(\d{0,2}) will capture 0-2 digits. It is greedy, so even if there is no _ between the identifier and digits, this will capture digits.

(?:...)? makes the square bracketed section optional.

\[([^\[\]]*)\] captures the contents of a square bracketed section that does not itself contain square brackets.

'some_param_0[name]'.match(/^(\w+?)_?(\d{0,2})(?:\[([^\[\]]*)\])?$/)

produces an array like:

["some_param_0[name]",  // The matched content in group 0.
 "some_param",          // The portion before the digits in group 1.
 "0",                   // The digits in group 2.
 "name"]                // The contents of the [...] in group 3.

Note that the non-greedy quantifier interacts with the bounded repetition in \d{0,2}: because \w also matches digits, group 1 absorbs any digits beyond the last two.

'x1234[y]'.match(/^(\w+?)_?(\d{0,2})(?:\[([^\[\]]*)\])?$/)

yields

["x1234[y]","x12","34","y"]
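To check the pattern against both strings from the question, here is a minimal sketch. The helper name parseParam and the choice to report a missing index or key as null are assumptions made for illustration, not part of the original answer.

```javascript
// Pattern from the answer above. parseParam is a hypothetical helper name.
const PARAM_RE = /^(\w+?)_?(\d{0,2})(?:\[([^\[\]]*)\])?$/;

function parseParam(input) {
  const m = PARAM_RE.exec(input);
  if (m === null) return null; // no match at all
  const [, name, index, key] = m;
  return {
    name,
    // Group 2 matches the empty string when there are no digits.
    index: index === "" ? null : index,
    // Group 3 is undefined when there is no [...] section.
    key: key === undefined ? null : key,
  };
}

console.log(parseParam("some_param[name]"));
console.log(parseParam("some_param_0[name]"));
```

Both calls should report "some_param" as the name; only the second has an index.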
Mike Samuel
  • I think he wants to remove the trailing underscore.. am I right? I tried with `/^([a-zA-Z_]+)(?:_(\d{0,2}))?(?:\[([^\[\]]*)\])?$/` but it looks like it doesn't work (at least, in Python) – redShadow Dec 13 '11 at 23:49
  • @redShadow, the RegExp in the OP leaves it out of capturing group 1, so I assumed the poster wanted it in. – Mike Samuel Dec 13 '11 at 23:52
  • Sorry, In the code comment I mentioned that I ideally wanted the trailing underscore to be ignored ("some_param" instead of "some_param_"). Should've made it clear in the question. – Parag Dec 13 '11 at 23:52
  • @fenderplayer, redShadow, Edited to make sure that `_` is not captured in group 1. – Mike Samuel Dec 13 '11 at 23:58
  • @MikeSamuel better to use `(?:_(\d+))?`, or you'll cut out the underscore in, e.g., `some_param_[name]`.. – redShadow Dec 14 '11 at 00:10
  • (I love infinite discussions on how to finely tune regular expressions.. you always end up learning something) – redShadow Dec 14 '11 at 00:12

Got it! (taking from Mike's answer):

/^(\D+)(?:_(\d+))?(?:\[([^\]]*)\])/

'some_param[name]' => ('some_param', None, 'name')
'some_param_0[name]' => ('some_param', '0', 'name')

(at least, in Python it works)
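Since the question uses JavaScript, here is a quick sketch of the same pattern run there. The variable names are made up for illustration. The difference from Python is that an unmatched optional group comes back as undefined rather than None.

```javascript
// Same pattern as above, written as a JavaScript regex literal.
const SHADOW_RE = /^(\D+)(?:_(\d+))?(?:\[([^\]]*)\])/;

// slice(1) drops the full match and keeps only the capture groups.
const withoutIndex = SHADOW_RE.exec("some_param[name]").slice(1);
const withIndex = SHADOW_RE.exec("some_param_0[name]").slice(1);

console.log(withoutIndex); // second entry is undefined: no _<digits> part
console.log(withIndex);
```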

UPDATE: A little extra from fiddling with it: named groups make the result cleaner:

^(?P<param>\D+)(?:_(?P<id>\d+))?(?:\[(?P<key>[^\]]*)\])
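The (?P<name>...) form above is Python syntax. In JavaScript (ES2018 and later) named groups are written (?<name>...) and are read from the groups property of the match. A sketch, assuming an engine with named-group support:

```javascript
// JavaScript spelling of the named-group pattern: (?<name>...) instead of (?P<name>...).
const NAMED_RE = /^(?<param>\D+)(?:_(?<id>\d+))?(?:\[(?<key>[^\]]*)\])/;

const { param, id, key } = NAMED_RE.exec("some_param_0[name]").groups;
console.log(param, id, key);

// When the optional _<digits> part is absent, id is undefined.
const noId = NAMED_RE.exec("some_param[name]").groups;
console.log(noId.param, noId.id, noId.key);
```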

redShadow

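Please check the following regexp: "(\w+)_(\d)\[(\w+)\]" (the square brackets must be escaped, otherwise they form a character class). It matches only the form with a single-digit suffix, such as some_param_0[name]. You can test it at http://rubular.com/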