I have this line from Android's logcat output:

"Could not find class android.app.Notification$Action$Builder, referenced from method b.a"

and I'm trying to apply a regular expression in Python to extract android.app.Notification$Action$Builder and b.a.

I use this code:

regexp = '\'([\w\d\.\$\:\-\[\]\<\>]+).*\s([\w\d\.\$\:\-\[\]\<\>]+)'
match = re.match(r'%s' % regexp, msg, re.M | re.I)

I tested the regular expression online and it works as expected, but it never matches in Python. Can someone give me some suggestions?

Thanks

  • `r'%s' % regexp` doesn't make the regular expression a raw string literal. Put the `r` on the `regex = r'...'` string instead. – Martijn Pieters Mar 16 '16 at 15:25
  • Not that it matters here. [You should be using `re.search`](https://docs.python.org/2/howto/regex.html#match-versus-search). – Martijn Pieters Mar 16 '16 at 15:29
  • Aside, `'%s' % some_string` is a non-sensical complexity. Prefer using `some_string` directly. In your case, prefer `re.match(regexp, msg, re.M | re.I)`. – Robᵩ Mar 16 '16 at 15:32
  • I'm not sure how you claim it works in an online regular expression tester. [It doesn't for me](https://regex101.com/r/nV5sU9/1). – Martijn Pieters Mar 16 '16 at 15:42

1 Answer

re.match() matches only at the start of a string. Use re.search() instead; see match() vs. search().
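
To make the difference concrete, here is a minimal sketch using the log line from the question. The pattern `class` is only an illustration and is not the final regular expression:

```python
import re

msg = ("Could not find class android.app.Notification$Action$Builder, "
       "referenced from method b.a")

# re.match only attempts a match at position 0 of the string,
# and msg starts with "Could", so this yields None.
at_start = re.match(r"class", msg)

# re.search scans forward and returns the first position that matches.
anywhere = re.search(r"class", msg)

print(at_start)          # None
print(anywhere.start())  # index where "class" begins in msg
```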

Note that you appear to misunderstand what a raw string literal is; r'%s' % string does not produce a special, different object. r'..' is just notation; it still produces a regular string object. Put the r on the original string literal instead (and if you use double quotes, you do not need to escape the single quote it contains):

regexp = r"'([\w\d\.\$\:\-\[\]\<\>]+).*\s([\w\d\.\$\:\-\[\]\<\>]+)"

For this specific regex it doesn't otherwise matter to the pattern produced.
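
A small sketch of why the `r` prefix only matters where the literal is written. The `\bfoo\b` pattern and the sample text are made up for illustration and are not part of the question:

```python
import re

# The r prefix affects how this literal is parsed: backslashes stay as-is.
pattern = r'\bfoo\b'

# Formatting through r'%s' changes nothing; the result is the same string.
formatted = r'%s' % pattern
same = (formatted == pattern)

# Without the prefix, '\b' is parsed as a single backspace character,
# so the regex engine never sees a word-boundary escape.
no_raw = '\bfoo\b'

text = "a foo b"
print(same)                                 # True
print(len(pattern), len(no_raw))            # 7 5
print(re.search(pattern, text) is not None) # True
print(re.search(no_raw, text) is not None)  # False
```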

Note that the pattern doesn't actually capture what you want to capture. Apart from the escaped ' at the start (which doesn't appear in your text at all), it won't work because it doesn't require dots and dollars to be part of the name. As such, you capture Could and b.a instead, the first and last words in the message.

I'd anchor on the words class and method instead, and perhaps require there to be dots in the class name:

regexp = r'class\s+((?:[\w\d\$\:\-\[\]\<\>]+\.)+[\w\d\$\:\-\[\]\<\>]+).*method ([\w\d.\$\:\-\[\]\<\>]+)'

Demo:

>>> import re
>>> regexp = r'class\s+((?:[\w\d\$\:\-\[\]\<\>]+\.)+[\w\d\$\:\-\[\]\<\>]+).*method ([\w\d.\$\:\-\[\]\<\>]+)'
>>> msg = "Could not find class android.app.Notification$Action$Builder, referenced from method b.a"
>>> re.search(regexp, msg, re.M | re.I)
<_sre.SRE_Match object at 0x1023072d8>
>>> re.search(regexp, msg, re.M | re.I).groups()
('android.app.Notification$Action$Builder', 'b.a')
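
If you need this in more than one place, the same idea could be wrapped in a small helper. This is only a sketch: the function name `parse_missing_class` and the group names `cls` and `method` are hypothetical, the redundant `\d` and unneeded escapes inside the character classes are dropped, and the `re.M | re.I` flags are left out on the assumption that the keywords always appear in lowercase on a single line.

```python
import re

# Precompiled variant of the pattern above, with named groups.
MISSING_CLASS = re.compile(
    r'class\s+(?P<cls>(?:[\w$:\-\[\]<>]+\.)+[\w$:\-\[\]<>]+)'
    r'.*method\s+(?P<method>[\w.$:\-\[\]<>]+)'
)


def parse_missing_class(line):
    """Return (class_name, method_name), or None if the line doesn't match."""
    found = MISSING_CLASS.search(line)
    if found is None:
        return None
    return found.group('cls'), found.group('method')


msg = ("Could not find class android.app.Notification$Action$Builder, "
       "referenced from method b.a")
print(parse_missing_class(msg))
```

Returning None for non-matching lines lets the caller skip unrelated logcat output without catching an AttributeError.
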
Martijn Pieters