I am pretty new to Python and got stuck on this problem.
I have a text file like this:
blahh
blah
blah
...
<start>
some stuff
</start>
even more blah blah blah
I want to delete everything before the <start> and everything after the </start>.
(The actual data comes from this link. I want to parse the HTML in the page with bs4, so I think I must first delete all the non-HTML parts.)
Can someone please tell me the best way to do this? I appreciate any help!
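To make the goal concrete, here is a minimal sketch of the behaviour I am after, using plain string slicing. The function name `extract_between` is hypothetical, the sample string stands in for the real file contents, and it assumes the markers appear exactly as `<start>` and `</start>`, with the first `</start>` after the opening marker being the one that counts.

```python
def extract_between(text, start_tag="<start>", end_tag="</start>"):
    """Return the part of text from start_tag through end_tag, inclusive."""
    begin = text.find(start_tag)
    if begin == -1:
        raise ValueError(f"{start_tag!r} not found")
    # Search for the closing marker only after the opening one.
    end = text.find(end_tag, begin)
    if end == -1:
        raise ValueError(f"{end_tag!r} not found after {start_tag!r}")
    return text[begin:end + len(end_tag)]


# Stand-in for the file contents (an assumption for illustration).
sample = (
    "blahh\n"
    "blah\n"
    "<start>\n"
    "some stuff\n"
    "</start>\n"
    "even more blah blah blah\n"
)

print(extract_between(sample))
```

For a real file, the text could be read with `open(path, encoding="utf-8").read()` (where `path` is whatever the file is called) and the result passed on to bs4.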