I know there are a bunch of similar questions, but I've built on their answers with no success. I've dug here, here, here, here, and here; this question is closest to what I'm trying to do, but it's in PHP and I'm using Python 3.
My goal is to extract a substring from a body of text. The body is formatted like this:
**Header1**
thing1
thing2
thing3
thing4
**Header2**
dsfgs
sdgsg
rrrrrr
**Hello Dolly**
abider
abcder
ffffff
etc.
Formatting on SO is tough, but in the actual text there are no spaces, just a newline after each line.
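To make this reproducible, here is roughly what I believe the body looks like as a Python string. This is a reconstruction, not the real data: the variable name `body` matches my code, but the exact line endings are an assumption, and the real text could contain "\r\n" or trailing whitespace that I can't see.

```python
# Hypothetical reconstruction of the input; the real body may differ
# (for example, it could contain "\r\n" line endings or trailing spaces).
body = "\n".join([
    "**Header1**",
    "thing1", "thing2", "thing3", "thing4",
    "**Header2**",
    "dsfgs", "sdgsg", "rrrrrr",
    "**Hello Dolly**",
    "abider", "abcder", "ffffff",
])

# repr() makes invisible characters such as "\r" or trailing spaces visible
print(repr(body))
```

Running repr() on the real body in the same way should show whether it actually matches this layout.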
I want what's under Header2, so currently I have:
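import re

# Raw string so the backslashes reach the regex engine unchanged
found = re.search(r"\*\*Header2\*\*\n[^*]+", body)
if found:
    # Drop the header line itself and keep the lines under it
    items = found.group(0).strip().split('\n')[1:]
    print(items)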
But that's returning None. The various other regexes I've tried either haven't matched at all or grabbed too much (all of the remaining headers).
For what it's worth I've also tried:
\*\*Header2\*\*.+?^\**$
\*\*Header2\*\*[^*\s\S]+\*\*
and about 10 other permutations of those.
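For reference, this is a sketch of the behavior I'm after, not something I've confirmed against the real body. It assumes the problem might be "\r\n" line endings or trailing whitespace after the header, so it tolerates both. The function name `section_lines` and the `sample` string are made up for illustration; only the `re` calls are standard library.

```python
import re

def section_lines(body, header):
    """Return the non-empty lines under **header**, up to the next ** line."""
    pattern = re.compile(
        r"^\*\*" + re.escape(header) + r"\*\*[ \t]*\r?\n"  # the header line
        r"(.*?)"                                            # lazily capture the section
        r"(?=^\*\*|\Z)",                                    # stop before next header or at end
        re.DOTALL | re.MULTILINE,
    )
    match = pattern.search(body)
    if match is None:
        return None
    # splitlines() handles both "\n" and "\r\n"
    return [line.strip() for line in match.group(1).splitlines() if line.strip()]

# Hypothetical sample with mixed line endings (an assumption about the real data)
sample = "**Header1**\nthing1\n**Header2**\r\ndsfgs\r\nsdgsg\r\nrrrrrr\r\n**Hello Dolly**\nabider\n"
print(section_lines(sample, "Header2"))
```

re.MULTILINE lets ^ match at the start of each line, and re.DOTALL lets . cross newlines, which my earlier attempts did not set.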