Searching and removing a string between two markers using a regex

Question

I have some sample data similar to the below.

From: JoeBloggs
Subject: This is a subject line

Hello

What I am looking to do is remove everything from the word 'From' up to the final newline before 'Hello'. I would therefore need to look for two newlines in sequence (I think). I could run a regex.match for 'From:' and then replace it, but that would only replace the 'From:'. Is there anything I can do to achieve this?

Printing the string with repr() displays it as below:

'From: JoeBloggs\nSubject: This is a subject line\n\nHello'

Therefore, I need to run a regex command to find the word 'From: ' all the way until the following '\n\n' and then replace everything in between.

Is "From" the beginning of the string or is your example just one record in a long string? — Wups, Sep 18 '20 at 16:45
Thanks @Wups. It can appear anywhere in the string and it can also feature on more than one occasion — thefragileomen, Sep 18 '20 at 17:25

score 0 · Answer 1 · answered Sep 18 '20 at 16:50

import re

test = """From: JoeBloggs
Subject: This is a subject line

Hello"""

print(test)
print("-"*10)
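# Lazily skip everything up to the first blank line and capture the rest.
# re.DOTALL lets '.' match newlines; re.MULTILINE is not needed without ^ or $.
match = re.match(r'.*?\n\n(?P<content>.*)', test, re.DOTALL)
print(match.group("content"))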

Running this results in:

$ python test.py 
From: JoeBloggs
Subject: This is a subject line

Hello
----------
Hello

gergelykalman


@thefragileomen and you could start your pattern as `r'From:\s.*?`, in case you want to find multiple instances in a long string – RichieV Sep 18 '20 at 16:57
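
Building on that suggestion, here is a minimal sketch of removing every header block rather than only the first one. It assumes each block starts with the literal `From:` and ends at the first blank line (`\n\n`) after it; the helper name `strip_headers` and the two-record sample string are made up for illustration.

```python
import re

# Assumption: a header block runs from "From:" to the first blank line after it.
# re.DOTALL lets '.' cross newlines; the lazy '.*?' stops at the nearest '\n\n'.
HEADER_BLOCK = re.compile(r'From:.*?\n\n', re.DOTALL)


def strip_headers(text):
    """Remove every 'From: ... <blank line>' block from text."""
    return HEADER_BLOCK.sub('', text)


# Hypothetical sample with two header blocks in one string.
sample = (
    'From: JoeBloggs\nSubject: This is a subject line\n\nHello\n'
    'From: JaneDoe\nSubject: Another subject\n\nGoodbye'
)
print(repr(strip_headers(sample)))  # 'Hello\nGoodbye'
```

Note that this would also match a `From:` that happens to appear inside body text. If headers always start at the beginning of a line, adding `re.MULTILINE` and anchoring the pattern with `^From:` is one way to narrow it.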

score 0 · Answer 2 · answered Sep 18 '20 at 17:05

import re

pattern = r'.+?\n\n'
text = 'From: JoeBloggs\nSubject: This is a subject line\n\nHello'
replacement_text = 'replacement\n'

replaced_text = re.sub(pattern, replacement_text, text, flags=re.DOTALL)
print(replaced_text)

Output

replacement
Hello
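
One thing to watch with the unanchored pattern `.+?\n\n`: `re.sub` replaces every match by default, so in a longer text each chunk that ends in a blank line is affected, not only the header. The sketch below uses an invented multi-paragraph sample string and an empty replacement to show that behaviour, along with the `count=1` argument that limits the substitution to the first match.

```python
import re

pattern = r'.+?\n\n'
# Hypothetical sample: a header block followed by two body paragraphs.
text = 'From: JoeBloggs\nSubject: This is a subject line\n\nHello\n\nSecond paragraph'

# Default count=0 replaces all matches, so 'Hello\n\n' is removed as well.
every_match = re.sub(pattern, '', text, flags=re.DOTALL)

# count=1 stops after the first match, leaving the body untouched.
first_only = re.sub(pattern, '', text, count=1, flags=re.DOTALL)

print(repr(every_match))  # 'Second paragraph'
print(repr(first_only))   # 'Hello\n\nSecond paragraph'
```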

Nadeem Mehraj
