Is it possible to get the HTML text of a web page using requests?

import requests

# Browser-like headers copied from Chrome.
headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'ru-RU,ru;q=0.9,en-US;q=0.8,en;q=0.7,uk;q=0.6',
    'Connection': 'keep-alive',
    'DNT': '1',
    'Host': 'labor.ny.gov',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': "Mozilla/5.0 (X11; Fedora; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36",
}

# The year is passed via params only; also keeping it in the URL
# would send warnYr twice (warnYr=2018&warnYr=2018).
params = {'warnYr': '2018'}

s = requests.Session()

response = s.get(
    'https://labor.ny.gov/app/warn/default.asp',
    headers=headers,
    params=params)
html = response.text

It doesn't work:

raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', OSError(0, 'Error'))      

A plain requests.get doesn't work either.
Is it impossible to get the HTML from this web page?
If so, why?

Stefan Becker
  • Could you provide sample data or a URL along with the desired output? – DirtyBit Mar 18 '19 at 07:56
  • https://lukasa.co.uk/2017/02/Configuring_TLS_With_Requests/ Here is a solution for you. – Yang HG Mar 18 '19 at 08:12
  • `url = "https://labor.ny.gov/app/warn/default.asp?warnYr=2018"; options = Options(); options.add_argument("--headless"); driver = webdriver.Chrome(executable_path="./chromedriver", chrome_options=options); driver.get(url); html = driver.page_source` @DirtyBit –  Mar 18 '19 at 08:38
  • But I want to use requests for this website =) –  Mar 18 '19 at 08:39
  • Possible duplicate of https://stackoverflow.com/questions/33410577/python-requests-exceptions-sslerror-eof-occurred-in-violation-of-protocol – Kamal Mar 18 '19 at 10:07
  • @Kamal HTTPSConnectionPool(host='labor.ny.gov', port=443): Max retries exceeded with url: /app/warn/default.asp?warnYr=2018&warnYr=2018 (Caused by SSLError(SSLError("bad handshake: SysCallError(-1, 'Unexpected EOF')",),)) –  Mar 18 '19 at 13:04
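
Both links in the comments point at the TLS handshake rather than at the HTTP headers, so one thing that may be worth trying is mounting a transport adapter that gives requests a custom `ssl.SSLContext`. The following is only a minimal sketch of that idea: `TLSAdapter`, `build_session` and `fetch_warn_html` are hypothetical names, the cipher string mentioned in the comments is an assumption, and whether any relaxed setting actually gets past this server's handshake has not been verified here.

```python
import ssl

import requests
from requests.adapters import HTTPAdapter


class TLSAdapter(HTTPAdapter):
    """Hypothetical helper: passes a custom SSLContext down to urllib3."""

    def __init__(self, ssl_context, **kwargs):
        # Must be stored before super().__init__(), which calls init_poolmanager().
        self._ssl_context = ssl_context
        super().__init__(**kwargs)

    def init_poolmanager(self, *args, **kwargs):
        kwargs["ssl_context"] = self._ssl_context
        return super().init_poolmanager(*args, **kwargs)


def build_session(ciphers=None):
    """Return a Session whose labor.ny.gov traffic uses a custom TLS context.

    `ciphers` is an optional OpenSSL cipher string. Which value (if any)
    this server needs is an assumption to experiment with, for example
    "DEFAULT@SECLEVEL=1" on OpenSSL 1.1 or newer.
    """
    context = ssl.create_default_context()
    if ciphers is not None:
        context.set_ciphers(ciphers)
    session = requests.Session()
    # Only this host gets the custom context; other hosts keep the defaults.
    session.mount("https://labor.ny.gov", TLSAdapter(context))
    return session


def fetch_warn_html(year="2018", ciphers=None):
    """Fetch the WARN page for one year and return its HTML (makes a network call)."""
    session = build_session(ciphers)
    response = session.get(
        "https://labor.ny.gov/app/warn/default.asp",
        params={"warnYr": year},
        timeout=30,
    )
    response.raise_for_status()
    return response.text
```

Calling `fetch_warn_html("2018")` first with the default context and then with a relaxed cipher string should at least show whether the error changes with the TLS settings. Mounting the adapter on the single host prefix keeps any weakened settings from applying to other sites reached through the same session.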

0 Answers