
I'm developing a sort of "parser for a custom script" in Python using regexps. Please don't answer about whether regexps are a good solution or not for this kind of operation. It would be long (and off-topic) to explain why I chose regexps, even though I know the problems of using them for parsing.

Now I proceed with the question. We start with this scenario:

This is the line I will read from a file and need to parse with my regexp:

something = { call _ "string to ""capture"" " } #non consumed

Now I can do something like this:

import re
regex1 = re.compile(r'^([^"]*?)(_?)\s*"((?:""|[^"])*)"')
mystr = r'something = { call _ "string to ""capture"" " } #non consumed'
mymatch = re.search(regex1, mystr)

so I can obtain these capture groups:

  • 0: the whole of mystr up to and including the closing quote
  • 1: everything before the quotation (I need this match to verify something later)
  • 2: '_' or '' (depending on whether there is an underscore right before the quotation [there can be spaces between the underscore and the quotation])
  • 3: the quotation (where "" is treated as a character and not as a closing quote)
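For the sample line above, the groups can be inspected with a short sketch like the following. The values in the comments assume exactly that sample string, and the last step (turning each "" into a single quote character) is only my assumption about how the captured text would be used afterwards.

```python
import re

# Same pattern and sample line as in the question.
regex1 = re.compile(r'^([^"]*?)(_?)\s*"((?:""|[^"])*)"')
mystr = r'something = { call _ "string to ""capture"" " } #non consumed'

mymatch = regex1.search(mystr)
if mymatch is not None:
    # groups() returns groups 1..3 as a tuple (group 0 is not included).
    before, underscore, quoted = mymatch.groups()
    print(repr(before))      # 'something = { call '
    print(repr(underscore))  # '_'
    print(repr(quoted))      # 'string to ""capture"" '

    # Assumption: a doubled quote stands for one literal quote character.
    unescaped = quoted.replace('""', '"')
    print(repr(unescaped))   # 'string to "capture" '
```

Note that group 1 keeps the trailing space before the underscore, because the lazy `[^"]*?` has to consume it before `(_?)` can match.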

I need to know those groups, so using re.search is fine (because I can use mymatch.group(n) to check the value of each captured group).

But after I have used groups 1 to 3, I need to reduce mystr so that it contains only the part of the string that was not consumed by the successful regexp.

I could do this with:

mystr = mystr[ len(mymatch.group(0)): ]

So working code could look like this:

import re
regex1 = re.compile(r'^([^"]*?)(_?)\s*"((?:""|[^"])*)"')
mystr = r'something = { call _ "string to ""capture"" " } #non consumed'
mymatch = re.search(regex1, mystr)
# code here that uses mymatch.group(n)
mystr = mystr[ len(mymatch.group(0)): ] # clear from mystr what was parsed by the regexp

But I'd like to see if there are other ways to do this. Can you suggest other code approaches different from the one I provided?
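One variation I can think of, as a minimal sketch rather than a definitive answer: the match object already knows where the match stops, so `mymatch.end()` can replace `len(mymatch.group(0))`. Because the pattern is anchored with `^`, both give the same index here. The helper name `consume` is hypothetical, and the `None` check is an addition, since the code above would raise an AttributeError on a line that does not match.

```python
import re

regex1 = re.compile(r'^([^"]*?)(_?)\s*"((?:""|[^"])*)"')

def consume(text):
    """Return (groups, rest) on a match, or (None, text) when nothing matches.

    `consume` is a hypothetical helper name used only for this sketch.
    """
    m = regex1.match(text)
    if m is None:
        return None, text
    # m.end() is the index just past the whole match (group 0).
    return m.groups(), text[m.end():]

groups, rest = consume(r'something = { call _ "string to ""capture"" " } #non consumed')
print(groups)
print(repr(rest))  # ' } #non consumed'
```

Returning the groups and the remainder together keeps both actions (reading the groups and dropping the consumed text) in one place.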


Searches:

Not useful: it asks only about replacing, not about individual match groups. Here I am asking how to do both actions together in a good way.

Not useful: for (almost) the same reason as the first link.

Nobun
  • **good** is a relative term and encompasses many things. I think this question should be asked at http://codereview.stackexchange.com/ – Learner Nov 01 '15 at 12:50
  • I didn't know about Code Review. However, I edited my question to add "can you suggest other code approaches different from the one I provided?", hoping it is less ambiguous than before. I will probably move this question to Code Review as you pointed out. Thanks for the help. – Nobun Nov 01 '15 at 13:13
  • If http://codereview.stackexchange.com is the right place to ask this question (and it probably is, I think Sislam is right), may I ask an admin to move this question from here to Code Review? (That way I will avoid posting the same topic twice on Stack Exchange sites.) – Nobun Nov 01 '15 at 13:22
  • Too much of an example to be on topic for Code Review IMO. We ask for the real code there, not MCVEs. – RubberDuck Nov 01 '15 at 13:24

0 Answers