Assume that we have a string such as '{ CAN_READ, CAN_WRITE }' or '{ CAN_READ, CAN_WRITE, CAN_REMOVE }' and want to extract the elements (CAN_READ, CAN_WRITE, CAN_REMOVE). The number of elements can be arbitrary. I am trying to solve this with Python's regular expression module (re).
The regular expression I designed is like this: r'^\{(\s*[a-zA-Z0-9_]+\s*,?)+\s*\}$'
I think this regexp is correct, as re.match works. However, although I expected to get the elements with the groups() method of the result, it returns only the last match.
e.g.
>>> import re
>>> value='{ CAN_READ, CAN_WRITE }'
>>> re.match(r'^\{(\s*[a-zA-Z0-9_]+\s*,?)+\s*\}$', value).groups()
(' CAN_WRITE ',)
>>> value='{ CAN_READ, CAN_WRITE, CAN_REMOVE }'
>>> re.match(r'^\{(\s*[a-zA-Z0-9_]+\s*,?)+\s*\}$', value).groups()
(' CAN_REMOVE ',)
To test that the block (\s*[a-zA-Z0-9_]+\s*,?) is correct, I repeated it twice, and it worked:
>>> value='{ CAN_READ, CAN_WRITE }'
>>> re.match(r'^\{(\s*[a-zA-Z0-9_]+\s*,?)(\s*[a-zA-Z0-9_]+\s*,?)\s*\}$', value).groups()
(' CAN_READ,', ' CAN_WRITE ')
However, this works only when there are exactly two elements.
How can I get all repeated blocks?
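For reference, Python's re module keeps only the last capture of a repeated group, which is why groups() shows a single element above. One possible direction, sketched under assumptions rather than offered as a definitive answer, is to split the job in two: validate the overall shape first, then collect the names with re.findall. The helper name extract_elements and the two pattern names are hypothetical. The validation pattern is also stricter than the one in the question: it assumes commas are required between elements and that a trailing comma is not allowed.

```python
import re

# Hypothetical names. LIST_RE is an assumed, stricter variant of the
# question's pattern: elements must be comma-separated, no trailing comma.
LIST_RE = re.compile(r'\{\s*[a-zA-Z0-9_]+(?:\s*,\s*[a-zA-Z0-9_]+)*\s*\}')
ELEMENT_RE = re.compile(r'[a-zA-Z0-9_]+')


def extract_elements(value):
    """Return the element names, or None if value is not a well-formed list."""
    # Step 1: check the whole string has the expected '{ A, B, ... }' shape.
    if LIST_RE.fullmatch(value) is None:
        return None
    # Step 2: findall returns every non-overlapping match, not just the last.
    return ELEMENT_RE.findall(value)


print(extract_elements('{ CAN_READ, CAN_WRITE }'))
# ['CAN_READ', 'CAN_WRITE']
print(extract_elements('{ CAN_READ, CAN_WRITE, CAN_REMOVE }'))
# ['CAN_READ', 'CAN_WRITE', 'CAN_REMOVE']
```

Because the names are collected by findall rather than by a capturing group, the result does not depend on how many elements the string contains.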