I'm very new to Python and I'm trying to write a program that extracts the text inside HTML tags (without the tags themselves) and writes it to a separate text file for later analysis. I referred to this and this as well, and was able to come up with the code below. But how can I write this as a separate function? Something like
def read_list('file1.txt')
and then do the same scraping? The reason I'm asking is that the output of this code (op1.txt)
will be used for stemming and then for other calculations afterwards. The output also isn't written line by line as intended. Thank you very much for any input!
from urllib.request import urlopen
from bs4 import BeautifulSoup

f = open('file1.txt', 'r')
for line in f:
    url = line
    html = urlopen(url)
    bs = BeautifulSoup(html, "html.parser")
    content = bs.find_all(['title', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'p'])
    with open('op1.txt', 'w', encoding='utf-8') as file:
        file.write(f'{content}\n\n')
        file.close()
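
One possible way to split this into functions is sketched below. It is only a rough sketch under a few assumptions: file1.txt holds one URL per line, and the names extract_text, scrape_to_file and TAGS are made up for illustration (only read_list comes from the question). Two things probably explain the output problem: writing content directly writes the whole result list, tags included, and if op1.txt is opened in 'w' mode inside the loop, each page overwrites the previous one. The sketch uses get_text() to drop the tags and opens the output file once, outside the loop.

```python
from urllib.request import urlopen

from bs4 import BeautifulSoup

# Tags whose text should be kept (same list as in the question).
TAGS = ['title', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'p']


def read_list(path):
    """Return the non-empty lines of `path` (assumed: one URL per line)."""
    with open(path, 'r', encoding='utf-8') as f:
        return [line.strip() for line in f if line.strip()]


def extract_text(html):
    """Return the text of each matching tag, without the tags themselves."""
    bs = BeautifulSoup(html, 'html.parser')
    texts = (tag.get_text(' ', strip=True) for tag in bs.find_all(TAGS))
    return [t for t in texts if t]  # skip tags that contain no text


def scrape_to_file(url_file, out_file):
    """Fetch every URL listed in `url_file` and write its text to `out_file`."""
    # Open the output once, outside the loop, so pages do not overwrite each other.
    with open(out_file, 'w', encoding='utf-8') as out:
        for url in read_list(url_file):
            with urlopen(url) as response:
                for text in extract_text(response.read()):
                    out.write(text + '\n')  # one tag's text per line
            out.write('\n')  # blank line between pages
```

Calling scrape_to_file('file1.txt', 'op1.txt') would then write one line per tag, with a blank line between pages. Keeping extract_text separate from the download step also means it can be tried on a plain HTML string before any URLs are fetched.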