I want to extract specific parts of text files with Python.
Here is my code:
import re
with open('test1.txt') as test_text:
    data = test_text.read()
wanted_match = re.findall('start(\n.*?)+?end', data)
wanted_match_str = ",".join(wanted_match)
with open("output.txt", "w") as output:
    output.write(wanted_match_str)
My txt-files look like this (includes newlines):
blablabla start blobloblobloblo bloblo blobloblo end bla blablabla start blobloblobloblo bloblo blobloblo end bla blablabla
and so on. I want to extract only the bloblo parts of the text (everything between start and end) and write them to a file, leaving out the blabla parts. According to pythex (http://pythex.org) my regex should work, but all I get as output is a list of commas. Can you help me? Thanks in advance! majee
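A likely explanation for the commas: when a pattern contains a capture group, re.findall returns the group's text rather than the whole match, and a repeated group only keeps its last iteration. Here the last iteration is the bare newline just before end, so every result is "\n". The sketch below reproduces this and shows one possible alternative using a single non-repeated group with re.DOTALL. The sample string is a hypothetical stand-in for test1.txt, assuming start and end each sit on their own line.

```python
import re

# Hypothetical sample standing in for test1.txt (the real layout is an assumption).
data = (
    "blablabla\nstart\nblobloblobloblo\nbloblo\nblobloblo\nend\nbla\n"
    "blablabla\nstart\nblobloblobloblo\nend\nbla\n"
)

# Original pattern: findall returns the capture group, and a repeated group
# keeps only its last iteration, which here is the newline right before "end".
original = re.findall('start(\n.*?)+?end', data)

# Alternative sketch: one lazy group, with re.DOTALL so '.' also matches newlines.
blocks = re.findall(r'start(.*?)end', data, flags=re.DOTALL)

# Trim the newlines that surround each extracted block.
extracted = [block.strip() for block in blocks]

print(repr(",".join(original)))
print(extracted)
```

Joining extracted with "," (or "\n") and writing it to output.txt as in the original code should then give the bloblo parts only. Note that this simple pattern would also stop at the letters "end" inside a longer word, so word boundaries (\bstart\b, \bend\b) may be worth adding if that can happen in the real files.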