Python Regex - Find contents from a string between two '*'

Question

I have a text file, and I need to extract everything from the file between two '*'s. There can be multiple occurrences of the same. How would I do that using Regex? I am good at Python, but I haven't used Regex a lot, so it's my weakness.

I have tried quite a few variants but couldn't make it work. As I said, I'm not good at Regex so couldn't come up with anything substantial. — Varun Shah, Nov 16 '13 at 09:26
It's okay. Josha has answered beautifully well, given the documentation links & the explanation of the pattern too. :) — shad0w_wa1k3r, Nov 16 '13 at 09:30

Josha Inglis · Answer 1 · 2013-11-16T05:27:17.937

Notes:

Use the documentation, it's really helpful!
* normally means "zero or more of the preceding pattern", so to match a literal asterisk you need to escape it as \*
. matches any character except a newline. To match newlines as well, add the re.DOTALL flag
+ means "one or more" and is a greedy operator, meaning that .+ would normally capture everything between the first * and the last * (including any *'s in between). To prevent it from being greedy, we add ? after it, which tells it to stop at the first * it encounters.
() is a capture group: when the pattern contains one, findall returns only the text matched inside the parentheses!
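
To see those notes side by side, here is a small sketch. The sample strings and variable names are made up for illustration and are not from the question.

```python
import re

sample = "a *b* c *d* e"  # made-up sample string

# Greedy: .+ runs on to the last '*' in the string
greedy = re.findall(r'\*(.+)\*', sample)   # ['b* c *d']

# Non-greedy: .+? stops at the first closing '*'
lazy = re.findall(r'\*(.+?)\*', sample)    # ['b', 'd']

# Without the parentheses, findall returns the whole match, asterisks included
whole = re.findall(r'\*.+?\*', sample)     # ['*b*', '*d*']

# '.' does not match a newline unless re.DOTALL is set
multiline = "*x\ny*"
no_flag = re.findall(r'\*(.+?)\*', multiline)                      # []
with_flag = re.findall(r'\*(.+?)\*', multiline, flags=re.DOTALL)   # ['x\ny']
```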

And here is an example of that in action:

import re
pattern = re.compile(r'\*(.+?)\*', flags=re.DOTALL)
text = """Why hello *there my fine
fellow!* How for art thou
on *such a glorious day?*"""

results = pattern.findall(text)
# ['there my fine\nfellow!', 'such a glorious day?']

I am not sure whether the output for `text = '*one***two*'` is really the desired result. — Hyperboreus, Nov 16 '13 at 09:05
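
For reference, this is what the pattern from the answer returns on that input. The second pattern is only one possible alternative, assuming the contents should never contain a '*' themselves; it is not part of the original answer.

```python
import re

text = '*one***two*'

# Pattern from the answer: the second match starts at the middle '*'
# and has to swallow the next '*' to satisfy "at least one character"
lazy = re.findall(r'\*(.+?)\*', text, flags=re.DOTALL)
# ['one', '*two']

# Alternative (an assumption about what is wanted): forbid '*' inside the group
strict = re.findall(r'\*([^*]+)\*', text)
# ['one', 'two']
```

Note that [^*] also matches newlines, so the alternative does not need re.DOTALL.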
