I am building a python3 web crawler/scraper using bs4. The program crashes whenever it meets a UNICODE code character like a Chinese symbol. How do I modify my scraper so that it supports UNICODE?
Here's the code:
import urllib.request
from bs4 import BeautifulSoup
def crawlForData(url):
r = urllib.request.urlopen(url)
soup = BeautifulSoup(r.read(),'html.parser')
result = [i.text.replace('\n', ' ').strip() for i in soup.find_all('p')]
for p in result:
print(p)
url = 'https://en.wikipedia.org/wiki/Adivasi'
crawlForData(url)