I have written a Python program (using ThreadPoolExecutor from concurrent.futures) to collect and download HTML documents from this website (http://lis.ly.gov.tw/lydbc/lydbkmout?.ebe0C1E000901000000DC001E000000000000100000000C0370003dc5). The files are now saved on my computer (file:///Users/XXX.html). I tried to load one of them with requests and parse it with BeautifulSoup, but it fails before any parsing happens.
from bs4 import BeautifulSoup
import requests
url = 'file:///Users/martinchen/PycharmProjects/legislative%20yuan%20scratching/list_pages/list_page_1.html'
# Store the result in `response` so the `requests` module is not overwritten.
response = requests.get(url)  # this call raises the exception shown below
lytext = response.text
soup = BeautifulSoup(lytext, "html.parser")
Running this raises the following error:
requests.exceptions.InvalidSchema: No connection adapters were found for 'file:///Users/martinchen/PycharmProjects/legislative%20yuan%20scratching/list_pages/list_page_1.html'
How can I parse an HTML document that I have already downloaded to my own computer (file:///Users/XXX.html) the same way I would parse a page fetched from a web address (http://XXX.html)?
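For reference, here is a minimal sketch of one possible approach, not a definitive fix. As far as I understand, requests only ships connection adapters for http:// and https://, so a file:// URL has to be turned into an ordinary path and opened with open() instead. The helper names file_url_to_path and parse_local_html are made up for illustration. The file is opened in binary mode on the assumption that BeautifulSoup can then work out the encoding on its own.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname


def file_url_to_path(file_url):
    """Convert a file:// URL into a local filesystem path (hypothetical helper)."""
    parsed = urlparse(file_url)
    if parsed.scheme != "file":
        raise ValueError(f"not a file URL: {file_url}")
    # url2pathname decodes percent-escapes, e.g. %20 becomes a space.
    return Path(url2pathname(parsed.path))


def parse_local_html(file_url):
    """Open a downloaded HTML file and parse it (hypothetical helper)."""
    # Imported here so the path helper above works even without bs4 installed.
    from bs4 import BeautifulSoup

    # Binary mode: BeautifulSoup receives raw bytes and detects the encoding.
    with open(file_url_to_path(file_url), "rb") as f:
        return BeautifulSoup(f, "html.parser")
```

With this, something like soup = parse_local_html(url) would take the place of the requests.get(url) call. If the plain path is already known, passing it straight to open() skips the URL conversion entirely.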