5

I have a regular expression which is very long.

 vpa_pattern = '(VAP) ([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}): (.*)'

My code to match the groups is as follows:

import re

class ReExpr:
    def __init__(self):
        self.string = None
        self.rematch = None

    def search(self, regexp, string):
        # Keep the match object so group() can be called afterwards.
        self.string = string
        self.rematch = re.search(regexp, self.string)
        return bool(self.rematch)

    def group(self, i):
        return self.rematch.group(i)

m = ReExpr()

if m.search(vpa_pattern, line):
    print(m.group(1))
    print(m.group(2))
    print(m.group(3))

I tried to split the regular expression pattern across multiple lines in the following ways:

vpa_pattern = '(VAP) \
    ([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}):\
    (.*)'

I even tried:

 vpa_pattern = re.compile(('(VAP) \
    ([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}):\
    (.*)'))

But neither of the above works. In the original pattern there is a single space between each closing parenthesis and the next opening parenthesis. I guess those spaces are not picked up correctly when I split the pattern across multiple lines.
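A minimal sketch of why the backslash attempts fail: a backslash followed by a newline inside a string literal joins the two lines, but the indentation of the continuation line stays inside the string, so the pattern ends up demanding extra spaces. The shortened pattern and the sample text are made up for illustration.

```python
import re

# Shortened stand-in for the question's pattern (an assumption for brevity).
broken = '(VAP) \
    (.*)'

# The continuation line's leading spaces are now part of the pattern.
print(repr(broken))

sample = 'VAP rest of line'  # made-up input with a single space after VAP

# The broken pattern requires more than one space, so it does not match.
print(re.search(broken, sample))

# The single-line equivalent matches as expected.
print(re.search('(VAP) (.*)', sample).groups())
```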

Naggappan Ramukannan

3 Answers

8

Look at the re.X (re.VERBOSE) flag. It allows comments and ignores whitespace in the pattern, except inside a character class or when escaped with a backslash.

a = re.compile(r"""\d +  # the integral part
               \.    # the decimal point
               \d *  # some fractional digits""", re.X)
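Applied to the pattern from the question, this might look like the sketch below. Because re.X discards unescaped whitespace, the literal spaces are written as `[ ]`. The sample line is made up for illustration.

```python
import re

# Assumption: a made-up input line in the shape the question's pattern expects.
line = "VAP 00:1A:2b:3C:4d:5E: station associated"

vpa_pattern = re.compile(r"""
    (VAP)                   # group 1: the literal tag
    [ ]                     # a literal space must be spelled out under re.X
    (                       # group 2: the MAC address
        [0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:
        [0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}
    )
    :[ ]                    # colon and space after the address
    (.*)                    # group 3: the rest of the line
""", re.X)

m = vpa_pattern.search(line)
groups = m.groups() if m else None
print(groups)
```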
Alex Shkop
  • +1 And it should also be noted that Python's `r"""raw multi-line string"""` syntax (used here) makes writing these self-documenting regexes much easier (because it completely avoids any backslash soup confusion). – ridgerunner May 28 '14 at 13:27
3

Python concatenates adjacent string literals, so a string can be written in parts across several lines if the parts are enclosed in parentheses:

>>> text = ("alfa" "beta"
... "gama")
...
>>> text
'alfabetagama'
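For the pattern from the question, the same technique could be sketched as follows. Each piece is a raw string, and the spaces that matter are kept inside the quotes. The sample line is made up for illustration.

```python
import re

# Adjacent literals inside the parentheses are joined at compile time;
# whitespace between the literals is not part of the resulting string.
vpa_pattern = (
    r'(VAP) '
    r'([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:'
    r'[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}): '
    r'(.*)'
)

# The original single-line pattern from the question, for comparison.
single_line = '(VAP) ([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}): (.*)'
print(vpa_pattern == single_line)

# Assumption: a made-up input line.
line = "VAP 00:1A:2b:3C:4d:5E: station associated"
match = re.search(vpa_pattern, line)
print(match.groups())
```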

Or, in a script:

text = ("alfa" "beta"
        "gama" "delta"
        "omega")
print(text)

will print

alfabetagamadeltaomega
Michael Myers
Jan Vlcinsky
1

It's actually quite simple. You already use the {} notation, so use it again. Instead of:

'([0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}):'

which is just a repeat of [0-9A-Fa-f]{2}: 6 times, you can use:

'([0-9A-Fa-f]{2}:){6}'

We can even simplify it further by using \d to represent digits:

'([\dA-Fa-f]{2}:){6}'

NOTE: Depending on which re function you use, you can pass in re.IGNORECASE (or re.I) and simplify that chunk down to [\da-f]{2}:

So your final regex is:

'(VAP) ([\dA-Fa-f]{2}:){6} (.*)'
Community
BeetDemGuise
  • A repeating group only captures the last repetition. Instead, use a repeating non-capturing group inside a capturing group. Note also that OP's regex does not capture the last colon. – Janne Karila May 28 '14 at 13:04
  • If the OP's regex doesn't capture the final `:` then what is the `:` here: `'...[0-9A-Fa-f]{2}): (.*)'` doing? – BeetDemGuise May 28 '14 at 13:25
  • The `()` define a group which the OP accesses as `m.group(2)`. The last `:` is outside the parentheses. – Janne Karila May 28 '14 at 14:40
  • I see. They'll both recognize the same strings, though it seems the group structure may be different. – BeetDemGuise May 28 '14 at 14:46
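The point raised in these comments can be sketched as follows: a repeated capturing group keeps only its last repetition, while a repeated non-capturing group `(?:...)` wrapped in one capturing group keeps the whole address. The sample line and variable names are made up for illustration.

```python
import re

# Assumption: a made-up input line.
line = "VAP 00:1A:2b:3C:4d:5E: station associated"

# Pattern from the answer: the capturing group itself is repeated.
repeated_capture = re.compile(r'(VAP) ([\dA-Fa-f]{2}:){6} (.*)')

# Variant suggested in the comments: repeat a non-capturing group inside
# a capturing group, and leave the final colon outside the group.
fixed = re.compile(r'(VAP) ((?:[\dA-Fa-f]{2}:){5}[\dA-Fa-f]{2}): (.*)')

last_only = repeated_capture.search(line).group(2)
full_mac = fixed.search(line).group(2)

print(last_only)  # only the final repetition survives
print(full_mac)   # the complete address, without the trailing colon
```

Both patterns accept the same input lines. Only the content of group 2 differs.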