I'm happy to ask my first Python question! I would like to strip the beginning of the sample file below, that is, everything before the first occurrence of "article:". To do this I use re.sub from the re module.
Here is my file, sample.txt:
fdasfdadfa
adfadfasdf
afdafdsfas
adfadfadf
adfadsf
afdaf
article: name of the first article
aaaaaaa
aaaaaaa
aaaaaaa
article: name of the first article
bbbbbbb
bbbbbbb
bbbbbbb
article: name of the first article
ccccccc
ccccccc
ccccccc
And my Python code to parse this file:
import re

test = ''
for line in open('sample.txt'):
    test = test + line  # accumulate the whole file in one string
result = re.sub(r'.*article:', 'article:', test, 1, flags=re.S)
print(result)
Sadly this code only displays the last article. The output of the code:
article: name of the first article
ccccccc
ccccccc
ccccccc
Does anyone know how to strip only the beginning of the file so that all 3 articles are displayed?
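For reference, the behaviour described above is consistent with `.*` being greedy: with re.S it runs to the end of the text and backtracks only as far as the last "article:". A minimal sketch of one possible fix, assuming that is the cause, is to use the non-greedy `.*?` instead. The helper name strip_preamble and the shortened inline sample text are illustrative assumptions, not part of the original code.

```python
import re

# Shortened stand-in for sample.txt (assumption: same structure as the real file).
text = (
    "fdasfdadfa\n"
    "afdaf\n"
    "article: name of the first article\n"
    "aaaaaaa\n"
    "article: name of the first article\n"
    "bbbbbbb\n"
    "article: name of the first article\n"
    "ccccccc\n"
)

def strip_preamble(s):
    # Hypothetical helper: '.*?' is non-greedy, so the match stops at the
    # FIRST 'article:' rather than extending to the last one.
    return re.sub(r'.*?article:', 'article:', s, count=1, flags=re.S)

result = strip_preamble(text)
print(result)
```

If no "article:" is present, the pattern does not match and the text is returned unchanged. Slicing from the index returned by text.find('article:') would likely work as well, without a regex.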