data = re.sub('<[^>]*>', '', string=html).lower()
I want to crawl arbitrary pages. Since I can't know each page's structure in advance, I can't scrape only the content I want, so I'm asking here: is it valid to strip the HTML tags with a regular expression after scraping the page?
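For comparison, here is a minimal sketch of a parser-based alternative that uses only the standard library's `html.parser`. The names `TextExtractor` and `html_to_text` are made up for illustration. Unlike the regex, it drops the contents of `<script>` and `<style>` elements and decodes entities such as `&amp;`. It is a rough starting point under those assumptions, not a complete solution for every page.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects text, skipping <script>/<style> contents."""

    SKIP = {"script", "style"}

    def __init__(self):
        # convert_charrefs=True decodes entities like &amp; into plain text
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Joining with a space keeps neighbouring blocks apart; the trade-off is
    # that a word split by inline tags (e.g. W<b>or</b>ld) gains extra spaces.
    text = " ".join(parser.parts)
    return re.sub(r"\s+", " ", text).strip().lower()
```

Usage would look like `data = html_to_text(html)`, in place of the `re.sub` line. The regex version leaves script and style source code and undecoded entities in the output, which is usually the main reason to prefer a parser for unknown pages.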