I am trying to scrape a book from a website and while parsing it with Beautiful Soup I noticed that there were some errors. For example this sentence:
"You have more…
direct control over your skaa here. How many woul "Oh, a half dozen or so,"
The "more…"
and " woul" are both errors that occurred somewhere in the script.
Is there anyway to automatically clean mistakes like this up? Example code of what I have is below.
import requests
from bs4 import BeautifulSoup
url = 'http://thefreeonlinenovel.com/con/mistborn-the-final-empire_page-1'
res = requests.get(url)
text = res.text
soup = BeautifulSoup(text, 'html.parser')
print(soup.prettify())
trin = soup.tr.get_text()
final = str(trin)
print(final)