I'm trying to parse a method signature that is in this format:

'function_name(foo=<str>, bar=<array>)'

From this, I want the name of the method and each argument with its type. Obviously I don't want the <, > characters, etc. The number of parameters is variable.

My question is: how can I get all the parameters with a regex? I'm using Python, but I'm just looking for a general idea. Do I need named groups and, if so, how can I use them to capture multiple parameters, each with its type, all in one regex?

orokusaki

1 Answer

You can't capture a variable number of groups with Python regular expressions: a repeated group keeps only its last match (see this). Instead, you can combine a regex with split().
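To make the limitation concrete, here is a minimal sketch (the pattern and the sample string are illustrative assumptions, not taken from the question). A repeated group does match every argument, but the match object retains only the last repetition:

```python
import re

# The outer non-capturing group repeats once per argument.
# The two inner capturing groups are overwritten on every repetition.
pattern = r'\w+\((?:(\w+)=<(\w+)>(?:, )?)*\)'

m = re.match(pattern, 'f(foo=<str>, bar=<array>)')
last_pair = m.groups()
print(last_pair)  # ('bar', 'array'): the 'foo' pair is lost
```

That is why the approach that follows captures the whole argument list as one string and takes it apart afterwards.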

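>>> import re
>>> # Split the signature into the name and the raw argument string.
>>> name, args = re.match(r'(\w+)\((.*)\)', 'function_name(foo=<str>, bar=<array>, baz=<int>)').groups()
>>> # Parse each argument; "if arg" skips the empty string produced by an empty argument list.
>>> args = [re.match(r'(\w+)=<(\w+)>', arg).groups() for arg in args.split(', ') if arg]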
>>> name, args
('function_name', [('foo', 'str'), ('bar', 'array'), ('baz', 'int')])

This matches a variable number of arguments (including 0). I have chosen not to allow additional whitespace; if your format isn't very strict, allow for it by adding \s* around the identifiers and by splitting on a pattern such as \s*,\s* with re.split() instead of on ', '.
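If you do need to tolerate extra whitespace, one possible variation is sketched below. It swaps split() for re.findall(), which returns one tuple per argument when the pattern has two groups. The names parse_signature, SIGNATURE, and ARGUMENT are hypothetical, chosen only for this example.

```python
import re

# Hypothetical helper patterns; \s* makes whitespace optional everywhere.
SIGNATURE = re.compile(r'\s*(\w+)\s*\((.*)\)\s*$')
ARGUMENT = re.compile(r'(\w+)\s*=\s*<\s*(\w+)\s*>')


def parse_signature(text):
    """Return (name, [(arg_name, arg_type), ...]) for a signature string."""
    match = SIGNATURE.match(text)
    if match is None:
        raise ValueError(f'not a signature: {text!r}')
    name, arg_text = match.groups()
    # findall() yields one (name, type) tuple per argument it finds,
    # and an empty list when there are no arguments.
    return name, ARGUMENT.findall(arg_text)


print(parse_signature('function_name( foo = <str> ,bar=<array> )'))
print(parse_signature('no_args()'))
```

Note the trade-off: findall() silently skips anything in the argument list that doesn't look like name=<type>, whereas the split() version fails loudly on a malformed argument. Pick whichever behaviour suits your input.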

moinudin
  • On second thought, that doesn't work. The params part only allows 2 params, no more, no less. – orokusaki Dec 20 '10 at 22:07
  • This is the exact problem I had in the first place. I can match multiple of the same pattern with `*`, etc, but I can't capture the multiple values into groups (without manually typing the same group regex 15 times in a row, which wouldn't be very clean at all). – orokusaki Dec 20 '10 at 22:14
  • @orokusaki Okay, seems it's not possible with a single regex. See my new answer, which combines regex with `split()`. It will work for any number, tested it. – moinudin Dec 20 '10 at 22:16