
I have a strange problem. When I test my regex online it works fine, but in MicroPython it doesn't match.

regex: ()*<div>(.*?)<\/div>()* or <div>(.*?)<\/div> or <div>(.*?)</div>

toMatch:

&lt;Storage {}&gt;86400<div>Uhrzeit in Sekunden: 65567</div><div>Timer: 20833</div>

None of these match in Python, but they do online (http://regexr.com/ or https://pythex.org/).

This is just a short part of what I want to get. What I want is the data inside the divs.

EDIT: I am using MicroPython on an ESP8266. I am limited there and can't use an HTML parser.

Patrick
  • People, stop using regex to parse HTML! HTML parsers exist for a reason. Also, why are you using empty capture groups? You'll need to use `findall` in Python, not `match`. – DeepSpace Aug 06 '17 at 14:17
  • Sorry, I wasn't exact; I wanted to keep the question short. I am using MicroPython on an ESP8266, so I am limited there. – Claus Spitzer Aug 06 '17 at 14:26
  • MicroPython Regex is very much a subset of Python Regex. AND there are many as yet (April 2020) unfixed bugs in MicroPython Regex. Especially related to escaping characters. The [ure library docs](http://docs.micropython.org/en/latest/library/ure.html) and the [open Issues in MicroPython Repo](https://github.com/micropython/micropython/issues?q=is%3Aissue+is%3Aopen+%28ure%29+OR+%28regex%29+in%3Atitle+in%3Abody) are your best bets for what currently works and how. :-/ – Patrick Mar 27 '20 at 18:31

1 Answer

I suspect your problem is that you are not passing a raw string to re.compile(). If I do this I get what I think you want:

>>> import re
>>> rx = re.compile(r"<div>(.*?)<\/div>")
>>> rx.findall("&lt;Storage {}&gt;86400<div>Uhrzeit in Sekunden: 65567</div><div>Timer: 20833</div>")
['Uhrzeit in Sekunden: 65567', 'Timer: 20833']

You need a raw string because \ is both the Python string escape character and the regex escape character. Without it you have to put \\ in your regex when you mean \ and that very quickly becomes confusing.
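To make the raw-string point concrete, here is a small sketch in standard Python (the variable names are only for illustration). It shows that a raw string and a doubled backslash produce the same pattern text, and that an escape Python does recognise, such as \b, silently changes the pattern if the r prefix is left out:

```python
# Same pattern written two ways: doubled backslash vs. raw string.
escaped = "<div>(.*?)<\\/div>"
raw = r"<div>(.*?)<\/div>"
print(escaped == raw)  # both contain exactly one backslash before the slash

# "\b" is a Python escape (backspace, one character), while r"\b" keeps
# the backslash and the letter b (two characters), which is what a regex
# engine needs in order to see a word-boundary token.
without_prefix = "\b"
with_prefix = r"\b"
print(len(without_prefix), len(with_prefix))
```

Note that / has no special meaning in a Python regex, so <div>(.*?)</div> without any backslash describes the same match.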

BoarGules
  • There is no .findall in MicroPython Regex. I'd delete this answer since it is not applicable. – Patrick Mar 27 '20 at 18:31
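Since findall is not available there, one possible workaround is to call search repeatedly and cut the text after each hit. This is only a sketch: find_all_divs is a made-up helper name, the import fallback assumes the module is exposed as either re or ure depending on the firmware, and I have not verified it on an ESP8266 build. It sticks to search and group and drops the \/ escape, since escaping is reported as fragile in the comments above.

```python
try:
    import re          # CPython, and MicroPython builds that expose "re"
except ImportError:
    import ure as re   # assumption: older MicroPython firmware naming


def find_all_divs(text):
    """Collect the contents of every <div>...</div> using only search()."""
    rx = re.compile("<div>(.*?)</div>")  # "/" needs no escaping
    results = []
    while True:
        m = rx.search(text)
        if m is None:
            break
        results.append(m.group(1))
        # Skip past this match without relying on m.end(), which may be
        # missing: locate the matched text and slice just after it.
        whole = m.group(0)
        text = text[text.find(whole) + len(whole):]
    return results


sample = "&lt;Storage {}&gt;86400<div>Uhrzeit in Sekunden: 65567</div><div>Timer: 20833</div>"
print(find_all_divs(sample))
# ['Uhrzeit in Sekunden: 65567', 'Timer: 20833']
```

Each slice creates a new string, so on a memory-constrained board it may be worth keeping the input short or processing it in chunks.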