I wrote a simple function that handles fetching a URL:

def tender_page_get url, agent
  sleep(rand(6)+2)
  begin
    return agent.get(url).parser
  rescue Errno::ETIMEDOUT, Timeout::Error, Net::HTTPNotFound
    EYE.debug "--winter sleep #{url}"
    puts "-x-#{url}"
    sleep(300)
    tender_page_get url, agent
  rescue => e
    puts "-x-#{url}"
    EYE.debug "--unknown exception"
    EYE.debug "#{url} #{e.inspect}"
  end
end

The problem is that even though I am catching Net::HTTPNotFound in my first rescue block, I still see records like this in my log:

--unknown exception
{url} 404 => Net::HTTPNotFound

which means that this exception was caught by the second rescue block. What could be the reason for that?

blahdiblah
spacemonkey

1 Answer

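Mechanize raises a Mechanize::ResponseCodeError for a 404, not a Net::HTTPNotFound. (Net::HTTPNotFound is a response class, not an exception class, so it is never raised and the first rescue clause can never match it.) The to_s method on Mechanize::ResponseCodeError looks like this: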

def to_s
  "#{response_code} => #{Net::HTTPResponse::CODE_TO_OBJ[response_code]}"
end

This returns '404 => Net::HTTPNotFound', which makes it look as though Net::HTTPNotFound is the exception being raised.
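As a minimal sketch of why the first rescue never fires: rescue matches on the exception's class, not on its message. So that it runs without the mechanize gem, the code below uses a hypothetical stand-in class, StandInResponseCodeError, which copies only the to_s shown above. The classify helper is also made up for illustration. In real code you would presumably rescue Mechanize::ResponseCodeError and compare its response_code with "404".

```ruby
require 'net/http'

# Hypothetical stand-in for Mechanize::ResponseCodeError, for illustration only.
# It mimics the to_s shown above so the example runs with the standard library.
class StandInResponseCodeError < StandardError
  attr_reader :response_code

  def initialize(response_code)
    @response_code = response_code
    super()
  end

  def to_s
    "#{response_code} => #{Net::HTTPResponse::CODE_TO_OBJ[response_code]}"
  end
end

# Hypothetical helper: raises the given error and reports which rescue clause caught it.
def classify(error)
  raise error
rescue Net::HTTPNotFound
  # Never reached: Net::HTTPNotFound is a response class, so no raised
  # exception is ever an instance of it.
  :first_rescue
rescue StandInResponseCodeError => e
  # Match on the error class, then inspect the status code.
  e.response_code == '404' ? :not_found : :other_http_error
rescue StandardError
  :unknown
end

error = StandInResponseCodeError.new('404')
puts error.to_s       # the message only mentions Net::HTTPNotFound
puts classify(error)  # the class-based rescue is what catches it
```

In the question's function, the same idea would mean adding the Mechanize error class to the rescue list and checking the code inside that branch, rather than matching on the message string.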

David Tinker
In this scenario, is there a specific string that we can catch for an HTTP 404? Sorry, I am a beginner in Ruby, and although your response explains the cause of the issue, I could not work out a solution from it. – Arnab Jun 05 '20 at 19:03