I was trying to install urllib for Python 3.6.1 using pip, but I cannot resolve the error it produces. The error looks like this:

[screenshot of the pip error output]

I searched online and found that one possible cause is that Python 3 is unable to identify a 0, and that I need to change the last digit to something else. I therefore tried to open the setup.py file in the build folder. I followed the path listed in the error to look through the hidden folders on my Mac, but I cannot find any pip-build-zur37k_r folder, even after making all hidden files visible.

I want to extract information using the urllib.request library and BeautifulSoup. When I run the following code:

from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen("https://www.pythonscraping.com/pages/page1.html")
bsObj = BeautifulSoup(html.read())
print(bsObj.h1)

I get this error:

[screenshot of the error output]

The code should return the following:

<h1>  An Interesting Title </h1>
Catarina Ferreira
Li Ke

2 Answers

Your error says that certificate verification failed, so the problem lies with the HTTPS connection to the website, not with your code. The call to urlopen() works for me, so perhaps you are behind a proxy server that is stricter about certificates.
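
To check that the failure really comes from certificate verification rather than from the scraping code, the exception can be caught and inspected. This is only a minimal sketch: the helper names `is_certificate_error` and `fetch` are hypothetical, and it assumes the failure surfaces as a `urllib.error.URLError` whose `reason` is an `ssl.SSLError`, which is how `urlopen()` normally reports TLS problems.

```python
import ssl
from urllib.error import URLError
from urllib.request import urlopen


def is_certificate_error(exc):
    """Return True if a URLError was caused by an SSL/TLS failure."""
    # urlopen() wraps low-level socket and SSL errors in URLError and
    # keeps the original exception in the `reason` attribute.
    return isinstance(exc, URLError) and isinstance(exc.reason, ssl.SSLError)


def fetch(url):
    """Fetch a URL, reporting SSL failures separately from other errors."""
    try:
        with urlopen(url) as response:
            return response.read()
    except URLError as exc:
        if is_certificate_error(exc):
            print("SSL problem (not a bug in the scraping code):", exc.reason)
        else:
            print("Other network problem:", exc.reason)
        return None
```

Calling `fetch("https://www.pythonscraping.com/pages/page1.html")` would then indicate which kind of failure occurred on the machine in question.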

BoarGules

The URL you are requesting presents an SSL certificate that Python cannot verify, so to request such a site you need to skip the SSL check, as below:

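from urllib.request import urlopen
from bs4 import BeautifulSoup
import ssl

# Build a context that skips certificate and hostname verification.
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
html = urlopen("https://www.pythonscraping.com/pages/page1.html", context=ctx)

# Name the parser explicitly to avoid BeautifulSoup's parser warning.
bsObj = BeautifulSoup(html.read(), "html.parser")
print(bsObj.h1)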

You should then get the expected result.
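
The context setup in this answer can also be wrapped in a small helper so that verification stays on by default and is only switched off deliberately. This is a sketch, not a recommended default: `make_context` and its `verify` parameter are hypothetical names introduced for illustration.

```python
import ssl


def make_context(verify=True):
    """Build an SSLContext; verify=False turns off certificate checks."""
    ctx = ssl.create_default_context()
    if not verify:
        # Order matters: hostname checking must be switched off first,
        # otherwise setting CERT_NONE raises ValueError.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx
```

It would be used as `urlopen(url, context=make_context(verify=False))`. An unverified connection can be intercepted without any warning, so this is probably best kept to sites you already trust.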

Jaysheel Utekar