I am trying to scrape a Crunchbase page, but I get this error:

ryzal~/Desktop/Sites/scraper$ ruby scraper.rb
/Users/Ryzal/.rbenv/versions/2.3.1/lib/ruby/2.3.0/net/http.rb:933:in `connect_nonblock': SSL_connect returned=1 errno=0 state=SSLv3 read server hello A: sslv3 alert handshake failure (OpenSSL::SSL::SSLError)
    from /Users/Ryzal/.rbenv/versions/2.3.1/lib/ruby/2.3.0/net/http.rb:933:in `connect'
    from /Users/Ryzal/.rbenv/versions/2.3.1/lib/ruby/2.3.0/net/http.rb:863:in `do_start'
    from /Users/Ryzal/.rbenv/versions/2.3.1/lib/ruby/2.3.0/net/http.rb:858:in `start'
    from /Users/Ryzal/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/net-http-persistent-2.9.4/lib/net/http/persistent.rb:700:in `start'
    from /Users/Ryzal/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/net-http-persistent-2.9.4/lib/net/http/persistent.rb:631:in `connection_for'
    from /Users/Ryzal/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/net-http-persistent-2.9.4/lib/net/http/persistent.rb:994:in `request'
    from /Users/Ryzal/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/mechanize-2.7.5/lib/mechanize/http/agent.rb:274:in `fetch'
    from /Users/Ryzal/.rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/mechanize-2.7.5/lib/mechanize.rb:464:in `get'
    from scraper.rb:10:in `<main>'

Below is the code:

require 'nokogiri'
require 'mechanize'
require 'json'


agent = Mechanize.new do |a|
  a.ssl_version = :TLSv1
end

page = agent.get "https://www.crunchbase.com/app/search/people/0b543c0d8ea4c95cdf78a48583d501da2a76b26c"

member_links = page.links_with(href: %r{.*/person/\w+})

# each_with_index supplies the counter that was previously undefined
member_links.each_with_index do |link, member_counter|
  puts member_counter

  member = link.click

  # Get name
  name = member.search('#profile_header_heading').text.strip
  puts name
end

I have tried both of these solutions:

Ruby Mechanize https error
Mechanize getting "Errno::ECONNRESET: Connection reset by peer - SSL_connect"

But the same error still persists.

Please help, thanks!

Ryzal Yusoff

1 Answer

Try the following:

agent = Mechanize.new
agent.agent.http.verify_mode = OpenSSL::SSL::VERIFY_NONE
Santosh Sharma
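
A side note on the handshake error itself: the follow-up comment reports an HTTP 416, and an HTTP status can only arrive once the TLS handshake has completed. That suggests, though the thread does not confirm it, that dropping the `ssl_version = :TLSv1` pin from the question may be what got past the SSLError, rather than disabling certificate verification. The sketch below uses only Ruby's standard library to show the idea on a plain Net::HTTP client: ask for TLS 1.2 instead of TLS 1.0 and leave verification on. The helper name `build_client` and the choice of TLS 1.2 are assumptions for illustration; no connection is opened.

```ruby
require 'net/http'
require 'openssl'
require 'uri'

# Hypothetical helper: builds (but does not open) an HTTPS client for +url+.
# TLS 1.2 is an assumption about what the server accepts.
def build_client(url, tls_version: :TLSv1_2)
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == 'https')
  http.ssl_version = tls_version               # instead of the :TLSv1 pin
  http.verify_mode = OpenSSL::SSL::VERIFY_PEER # keep certificate checks on
  http
end

client = build_client('https://www.crunchbase.com/')
puts "TLS version: #{client.ssl_version}, verify_mode: #{client.verify_mode}"
```

With Mechanize, the equivalent change would be to set `a.ssl_version = :TLSv1_2` in the block from the question, or to leave `ssl_version` unset so the client and server negotiate a version.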
  • I got `.../gems/mechanize-2.7.5/lib/mechanize/http/agent.rb:323:in `fetch': 416 => Net::HTTPRequestedRangeNotSatisfiable for ...` – Ryzal Yusoff Jan 14 '17 at 22:52
  • @RizalYusoff Crunchbase blocks an IP address after 5 requests, so it returns error code `416`. As a workaround, you can try setting a proxy IP in Mechanize. – Santosh Sharma Jan 15 '17 at 08:32
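
The proxy suggestion in the last comment could be sketched roughly as follows. Mechanize provides `set_proxy(address, port)`; everything else here is an assumption for illustration: the class name `ProxyRotator`, the placeholder proxy addresses, and the five-request threshold taken from the comment. The rotator only needs an object that responds to `set_proxy`, so a stand-in agent is used to keep the block runnable without the gem or a network.

```ruby
# Placeholder proxies (documentation-range addresses); substitute real ones.
PROXIES = [['203.0.113.10', 8080], ['203.0.113.11', 8080]].freeze

# Hypothetical helper: switch to the next proxy every +per_proxy+ requests.
# +agent+ is anything that responds to set_proxy(host, port), e.g. Mechanize.
class ProxyRotator
  def initialize(agent, proxies, per_proxy: 5)
    @agent = agent
    @proxies = proxies
    @per_proxy = per_proxy
    @count = 0
  end

  # Call before each request; returns the proxy currently in use.
  def before_request
    index = (@count / @per_proxy) % @proxies.size
    @agent.set_proxy(*@proxies[index]) if (@count % @per_proxy).zero?
    @count += 1
    @proxies[index]
  end
end

# Stand-in agent so the sketch runs without the mechanize gem.
FakeAgent = Struct.new(:proxy) do
  def set_proxy(host, port)
    self.proxy = [host, port]
  end
end

agent = FakeAgent.new
rotator = ProxyRotator.new(agent, PROXIES)
6.times { rotator.before_request }
puts agent.proxy.inspect
```

With the real gem, `ProxyRotator.new(Mechanize.new, PROXIES)` plus a call to `rotator.before_request` ahead of each `agent.get` or `link.click` would apply the same rotation. Whether rotating proxies is acceptable depends on the site's terms of use.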