I need to retrieve this site (www.extra.com.br) in a Python project.
Neither
page = requests.get("https://www.extra.com.br/")
nor
page = requests.get("http://www.extra.com.br/")
works; the script hangs on this line.
So I tried using curl to retrieve the page and inspect it offline.
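To keep the script from blocking indefinitely while I debug this, I can at least put a timeout on the request. Below is a minimal sketch using only the standard library, so that requests itself is ruled out; the helper name `fetch` and the 10-second default are my own choices, not anything the site requires. (`requests.get` also accepts a `timeout` argument.)

```python
import urllib.request


def fetch(url, timeout=10.0):
    """Return (status, body) on success, or (None, description) on failure.

    `fetch` is a hypothetical helper for this question; `timeout` bounds
    each blocking socket operation, not the total duration of the request.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.read()
    except OSError as exc:
        # URLError, socket timeouts and connection resets are all
        # subclasses of OSError, so one clause covers them.
        return None, f"{type(exc).__name__}: {exc}"
```

Calling `fetch("https://www.extra.com.br/")` should then come back with either a response or an error description instead of leaving the script stuck.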
curl http://www.extra.com.br
also does not work. This is the output using -v.
- STATE: INIT => CONNECT handle 0x6dfd90; line 1392 (connection #-5000)
- Rebuilt URL to: https://www.extra.com.br/
- Added connection 0. The cache now contains 1 members
- STATE: CONNECT => WAITRESOLVE handle 0x6dfd90; line 1428 (connection #0)
  % Total % Received % Xferd Average Speed Time Time Time Current
  Dload Upload Total Spent Left Speed
  0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
- Trying 23.74.196.194...
- TCP_NODELAY set
- STATE: WAITRESOLVE => WAITCONNECT handle 0x6dfd90; line 1509 (connection #0)
- Connected to www.extra.com.br (23.74.196.194) port 443 (#0)
- STATE: WAITCONNECT => SENDPROTOCONNECT handle 0x6dfd90; line 1561 (connection #0)
  0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
- Marked for [keep alive]: HTTP default
- ALPN, offering h2
- ALPN, offering http/1.1
- Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
- successfully set certificate verify locations:
- CAfile: C:/Program Files/Git/mingw64/ssl/certs/ca-bundle.crt
- CApath: none
- TLSv1.2 (OUT), TLS header, Certificate Status (22): } [5 bytes data]
- TLSv1.2 (OUT), TLS handshake, Client hello (1): } [512 bytes data]
- STATE: SENDPROTOCONNECT => PROTOCONNECT handle 0x6dfd90; line 1575 (connection #0) { [5 bytes data]
- TLSv1.2 (IN), TLS handshake, Server hello (2): { [108 bytes data]
- TLSv1.2 (IN), TLS handshake, Certificate (11): { [2799 bytes data]
- TLSv1.2 (IN), TLS handshake, Server key exchange (12): { [333 bytes data]
- TLSv1.2 (IN), TLS handshake, Server finished (14): { [4 bytes data]
- TLSv1.2 (OUT), TLS handshake, Client key exchange (16): } [70 bytes data]
- TLSv1.2 (OUT), TLS change cipher, Client hello (1): } [1 bytes data]
- TLSv1.2 (OUT), TLS handshake, Finished (20): } [16 bytes data]
- TLSv1.2 (IN), TLS change cipher, Client hello (1): { [1 bytes data]
- TLSv1.2 (IN), TLS handshake, Finished (20): { [16 bytes data]
- SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
- ALPN, server accepted to use http/1.1
- Server certificate:
- subject: C=BR; ST=Sao Paulo; L=Sao Paulo; O=CNOVA Comercio Eletronico S.A.; OU=TI; CN=*.extra.com.br
- start date: Jul 17 00:00:00 2018 GMT
- expire date: Jul 17 12:00:00 2019 GMT
- subjectAltName: host "www.extra.com.br" matched cert's "*.extra.com.br"
- issuer: C=US; O=DigiCert Inc; CN=DigiCert SHA2 Secure Server CA
- SSL certificate verify ok.
- STATE: PROTOCONNECT => DO handle 0x6dfd90; line 1596 (connection #0) } [5 bytes data]
> GET / HTTP/1.1
> Host: www.extra.com.br
> User-Agent: curl/7.58.0
> Accept: */*
- STATE: DO => DO_DONE handle 0x6dfd90; line 1658 (connection #0)
- STATE: DO_DONE => WAITPERFORM handle 0x6dfd90; line 1783 (connection #0)
- STATE: WAITPERFORM => PERFORM handle 0x6dfd90; line 1799 (connection #0)
  0 0 0 0 0 0 0 0 --:--:-- 0:00:19 --:--:-- 0
- OpenSSL SSL_read: SSL_ERROR_SYSCALL, errno 10054
- Marked for [closure]: Transfer returned error
- multi_done
- stopped the pause stream!
  0 0 0 0 0 0 0 0 --:--:-- 0:00:19 --:--:-- 0
- Closing connection 0
- The cache now contains 0 members } [5 bytes data]
curl: (56) OpenSSL SSL_read: SSL_ERROR_SYSCALL, errno 10054
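The TLS handshake completes and the GET is sent as curl/7.58.0; the connection is only reset about 19 seconds later. One thing I have not ruled out, and this is purely an assumption on my part, is that the server treats non-browser clients differently. A rough sketch for testing that from Python is below; the helper `build_request` and every header value are placeholders I made up, not values the site is known to require.

```python
import urllib.request

# Hypothetical browser-like headers; the exact values are placeholders.
BROWSER_LIKE_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Accept": "text/html,application/xhtml+xml,*/*;q=0.8",
    "Accept-Language": "pt-BR,pt;q=0.9,en;q=0.8",
}


def build_request(url, headers=None):
    """Build a GET request that carries browser-like headers.

    Pass a custom `headers` dict to override the placeholder defaults.
    """
    chosen = dict(BROWSER_LIKE_HEADERS if headers is None else headers)
    return urllib.request.Request(url, headers=chosen)
```

The resulting object can be passed to `urllib.request.urlopen(request, timeout=10)` to see whether the behaviour changes; the equivalent with requests would be passing the same dict as `headers=` to `requests.get`.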