Friends, I am fetching an HTML page from Python (in a Jupyter notebook) as follows:
import urllib.request
with urllib.request.urlopen('http://python.org/') as response:
    html = response.read()
At this point I have the object I want to work with. However, when I try to split it with a regular expression, it fails:
import re
re.split(r'\W+', html)
The last command raises a TypeError:
cannot use a string pattern on a bytes-like object
What should I do?
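A likely cause is that response.read() returns a bytes object rather than a str, while r'\W+' is a str pattern, and the re module does not allow mixing the two. The sketch below shows two possible ways around this. It uses a small hard-coded bytes value (hypothetical, standing in for the downloaded page) so that it runs without network access, and it assumes the content is UTF-8 encoded, which may not hold for every site.

```python
import re

# Hypothetical stand-in for the bytes returned by response.read()
html = b"<title>Welcome to Python.org</title>"

# Option 1: decode the bytes to str, then split with a str pattern.
# UTF-8 is an assumption here; the real page may use another charset.
text = html.decode("utf-8")
words = re.split(r"\W+", text)

# Option 2: keep the data as bytes and use a bytes pattern (rb'...').
byte_words = re.split(rb"\W+", html)

print(words)
print(byte_words)
```

Decoding is usually the more convenient choice when the goal is text cleaning, since the rest of the code can then work with ordinary strings. With a real response, response.headers.get_content_charset() may report the declared encoding, although it can return None when the server does not send one.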