The following code fails with a certificate validation error:

 use URI;
 use Web::Scraper;
 # $scraper is a Web::Scraper object built elsewhere (not shown)
 my $res = $scraper->scrape( URI->new('https://example.com') );

After waiting about 2 min, I get the following error:

GET https://example.com failed: 500 Can't connect to example.com:443 (certificate verify failed)

Per a suggestion in the comments, I ran `openssl s_client -connect olms.dol-esa.gov:443`. It printed the following to the terminal instantly and then hung for about 2 minutes:

CONNECTED(00000003)
depth=3 O = Digital Signature Trust Co., CN = DST Root CA X3
verify error:num=19:self signed certificate in certificate chain
verify return:0
---
Certificate chain
 0 s:/CN=olms.dol-esa.gov/O=DEPARTMENT OF LABOR/L=Washington/ST=District of Columbia/C=US
   i:/C=US/O=IdenTrust/OU=TrustID Server/CN=TrustID Server CA A52
 1 s:/C=US/O=IdenTrust/OU=TrustID Server/CN=TrustID Server CA A52
   i:/C=US/O=IdenTrust/CN=IdenTrust Commercial Root CA 1
 2 s:/C=US/O=IdenTrust/CN=IdenTrust Commercial Root CA 1
   i:/O=Digital Signature Trust Co./CN=DST Root CA X3
 3 s:/O=Digital Signature Trust Co./CN=DST Root CA X3
   i:/O=Digital Signature Trust Co./CN=DST Root CA X3
---
Server certificate
-----BEGIN CERTIFICATE-----
<SNIP>
-----END CERTIFICATE-----
subject=/CN=olms.dol-esa.gov/O=DEPARTMENT OF LABOR/L=Washington/ST=District of Columbia/C=US
issuer=/C=US/O=IdenTrust/OU=TrustID Server/CN=TrustID Server CA A52
---
No client certificate CA names sent
---
SSL handshake has read 6233 bytes and written 615 bytes
---
New, TLSv1/SSLv3, Cipher is DES-CBC3-SHA
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
SSL-Session:
    Protocol  : TLSv1.2
    Cipher    : DES-CBC3-SHA
    Session-ID: 3E5B2FBD819EF7880143C874181B7D67D3B1A0CE7C319B35F276E1CE8D9B9A18
    Session-ID-ctx: 
    Master-Key: BE8A24B0350C48FCC3ECFA21AE896BF09F8978C481F3BE01E1E9B904A0BFB87098914DB6CD592BBC7634142A4B5C43FB
    Key-Arg   : None
    PSK identity: None
    PSK identity hint: None
    SRP username: None
    Start Time: 1486034348
    Timeout   : 300 (sec)
    Verify return code: 19 (self signed certificate in certificate chain)
---

After waiting about two minutes, the following was output:

read:errno=0
brian d foy
StevieD
  • This can mean that certificate validation is working properly but the URL you use is wrong, the server is not sending the needed chain certificates, you are missing an important root certificate, the server is sending a self-signed certificate, or similar. Or it could mean that SNI is in use and you are using a very old version of the SSL libraries that does not support it, so the server sends the wrong certificate. It is impossible to tell from this information, but you might provide the URL in question, the versions of the modules you use, etc. if you want more help. – Steffen Ullrich Jan 26 '17 at 05:52
  • The URI module does not matter at all here because it simply parses the URL. The relevant parts are Web::Scraper which depends on LWP which depends on IO::Socket::SSL which depends on Net::SSLeay which depends on OpenSSL for the SSL functionality. I've edited the question to reflect this a bit but I cannot add the needed information like real URL and version of the libraries to actually look at the problem because only the OP can provide these. – Steffen Ullrich Jan 26 '17 at 05:54
  • Perl 5.20.2, Debian Jessie with libssl-dev installed, LWP::UserAgent 6.17, Web::Scraper 0.38, IO::Socket::SSL 2.044, Net::SSLeay 1.80 – StevieD Feb 01 '17 at 22:06
  • Actually, it looks like it works now after upgrading all the modules. But it takes about 2 min before anything is returned. Web pages over http happen much quicker. – StevieD Feb 01 '17 at 22:15
  • And now I'm finding that half the time it works and other half I get this: GET https://olms.dol-esa.gov/query/getYearlyData.do:443 ==> 500 Can't connect to olms.dol-esa.gov:443 (certificate verify failed) (128s) Can't connect to olms.dol-esa.gov:443 (certificate verify failed) SSL connect attempt failed error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed at /usr/local/share/perl/5.20.2/LWP/Protocol/http.pm line 48, line 1. – StevieD Feb 01 '17 at 23:30
  • Using LWP against this site works perfectly for me on a freshly installed Debian jessie. I could not test with Web::Scraper because your code is incomplete. Knowing nothing about your environment, my guess is that you are behind some SSL-intercepting proxy which changes the certificate so that it is no longer trusted by the local CA store and which also causes the delay. Using `openssl s_client -connect olms.dol-esa.gov:443` and looking at the certificate chain (shown at the top of the output) might help to debug this problem. – Steffen Ullrich Feb 02 '17 at 05:55
  • Thanks for your continued help. I ran the command you suggested which output to the screen and then hung for two minutes after giving some errno message. I've added the output along with more details to the OP. – StevieD Feb 02 '17 at 11:29
  • I ran the same openssl command on another Debian box I have outside of the first box's network and got the same results. – StevieD Feb 02 '17 at 11:38
  • I don't know where the delay comes from, but the certificate chain it returns suggests that at least no active man in the middle is there. It might still be a firewall causing the delay (a packet capture might help to debug this problem). As for the validation errors: I don't get the *Verify return code: 19 (self signed certificate in certificate chain)* on jessie but instead *Verify return code: 0 (ok)*. This suggests that something is different with the CA store on your system. – Steffen Ullrich Feb 02 '17 at 11:43
  • OK, I am getting closer. I took an older version of jessie I had laying around and ran my script and had the same problems. I ran apt-get upgrade and the problems went away. So I think it's just a matter of upgrading jessie on this virtual machine (though I'm having problems with that). So there is something about this machine that is hosed. Thanks for your help troubleshooting. – StevieD Feb 02 '17 at 12:21
  • OK, got all packages updated and rebooted machine. Things are working quite nicely now. Thanks again for your help. Much appreciated. – StevieD Feb 02 '17 at 14:00

1 Answer

Web::Scraper uses LWP::UserAgent. Recent versions of LWP::UserAgent (6.00 and later) verify the server's hostname against its certificate unless you turn that feature off. Something else may be going on, but the question is short on details.

One of the constructor arguments to LWP::UserAgent is:

my $your_own_lwp_useragent_object = LWP::UserAgent->new(
    ssl_opts => { verify_hostname => 0 },
    # ... any other constructor options
);

You can construct your own user agent object and give it to Web::Scraper:

my $scraper = Web::Scraper->new(...);
$scraper->user_agent( $your_own_lwp_useragent_object );
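
Putting the two fragments together might look like the minimal sketch below. The `title` selector, the result key, and the 30-second timeout are illustrative assumptions, not taken from the question, and the scraper is built with the `scraper { ... }` block form from the Web::Scraper synopsis. Turning off `verify_hostname` relaxes certificate checking, so it is best treated as a diagnostic step rather than a permanent fix.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use Web::Scraper;

# User agent with hostname verification switched off (diagnostic use only).
# The timeout value is an assumption chosen for illustration.
my $ua = LWP::UserAgent->new(
    ssl_opts => { verify_hostname => 0 },
    timeout  => 30,
);

# Hypothetical scraper definition: grab the text of the <title> element.
my $scraper = scraper {
    process 'title', title => 'TEXT';
};

# Hand the custom user agent to the scraper before scraping.
$scraper->user_agent($ua);

# The actual request would then be made like this (needs network access):
# my $res = $scraper->scrape( URI->new('https://example.com') );
```

If the request succeeds with this user agent but fails with the default one, the problem lies in certificate or hostname verification rather than in the connection itself.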

Also see the answer at "Perl LWP::Simple HTTPS error". For more help, we'll need version details for the relevant modules and your openssl details.
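
One possible way to collect those version details is the short sketch below. The helper name `module_versions` and the list of modules are assumptions for illustration; the list simply follows the dependency chain described in the comments (Web::Scraper, LWP, IO::Socket::SSL, Net::SSLeay, OpenSSL).

```perl
use strict;
use warnings;

# Hypothetical helper: map each module name to its installed version,
# or to 'not installed' if it cannot be loaded.
sub module_versions {
    my @modules = @_;
    my %found;
    for my $module (@modules) {
        if ( eval("require $module; 1") ) {
            $found{$module} = $module->VERSION // 'unknown';
        }
        else {
            $found{$module} = 'not installed';
        }
    }
    return %found;
}

my @modules  = qw(Web::Scraper LWP::UserAgent LWP::Protocol::https
                  IO::Socket::SSL Net::SSLeay);
my %versions = module_versions(@modules);

printf "%-22s %s\n", 'Perl', $^V;
printf "%-22s %s\n", $_, $versions{$_} for @modules;

# Net::SSLeay can report which OpenSSL library it was built against.
if ( $versions{'Net::SSLeay'} ne 'not installed' ) {
    printf "%-22s %s\n", 'OpenSSL', Net::SSLeay::SSLeay_version();
}
```

Running `openssl version` in the shell shows the version of the command-line tool, which may differ from the library Net::SSLeay is linked against.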

Community
brian d foy