I'm trying to do some parsing with BeautifulSoup:
from bs4 import BeautifulSoup
import requests
import lxml
r = requests.get('https://pythonprogramming.net/parsememcparseface/')
page_text = r.text.encode('utf-8').decode('ascii', 'ignore')
soup = BeautifulSoup(page_text, 'lxml')
print(soup.find_all('p'))
I can't use find_all('p') because it raises a UnicodeEncodeError. Typing just soup.p works fine. I re-encoded the HTML into the variable page_text, dropping anything that isn't ASCII, but that is not enough. How can I overcome this error and access all paragraphs from the site?
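
One thing I suspect, but have not confirmed: the error may come from print rather than from find_all itself, since the parser can turn HTML entities back into non-ASCII characters that my console encoding cannot display. Below is a minimal sketch of a workaround under that assumption. The helper name safe_text and the sample string are made up for illustration; only str.encode with errors='replace' and sys.stdout.encoding are standard Python.

```python
import sys

def safe_text(text, encoding=None):
    """Replace characters that the target encoding cannot represent.

    Hypothetical helper: falls back to the console encoding
    (sys.stdout.encoding), then to UTF-8 if that is unavailable.
    """
    enc = encoding or sys.stdout.encoding or 'utf-8'
    # errors='replace' substitutes '?' for each unencodable character
    return text.encode(enc, errors='replace').decode(enc)

# Made-up sample standing in for the text of one paragraph
sample = 'caf\u00e9 and na\u00efve'
print(safe_text(sample, 'ascii'))
```

If that assumption holds, something like `for p in soup.find_all('p'): print(safe_text(p.get_text()))` should print every paragraph without raising, at the cost of showing '?' in place of characters the console cannot render.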