
I'm writing a Rails app that includes a simple web crawler. It finds all links within a domain and stops whenever it finds a link that leads outside that domain. As is normal for Rails developers, I developed and tested the code mostly on my local machine, then deployed it to a staging server to try things out under real conditions.

When the crawler checks a URL that redirects to another domain, on my local machine the #open method returns an empty Tempfile object representing the redirection. It doesn't follow the redirect; it just indicates that one happened. I use this information to decide what message to feed back to the user.

However, on the server this same #open method generates a RuntimeError. I'm running the exact same Ruby (2.0.0 p576) and Rails (4.0.3) versions in both environments. I assumed that a given piece of Ruby code, for the same version of Ruby + Rails and the same Rails environment, would have the exact same behavior. It's pretty disconcerting to find that the same code and apparently same environment can have such different results.

Any idea why the same code acts differently on different machines? What files or settings should I look at, or what commands should I run, to identify where the difference comes from? I have isolated the problem to the following example.

Thanks in advance!

In the development environment:

Loading production environment (Rails 4.0.3)
2.0.0-p576 :001 > require 'uri'
 => false 
2.0.0-p576 :003 > open 'http://www.ruby-doc.org/' # loads fine
 => #<Tempfile:/var/folders/hz/czmbmhds46s37t_pz8j198g40000gn/T/open-uri20141029-42188-51i9ls> 
2.0.0-p576 :002 > open 'http://ndic.com'          # loads fine
 => #<Tempfile:/var/folders/hz/czmbmhds46s37t_pz8j198g40000gn/T/open-uri20141029-42188-12kbadl> 

In the production environment:

Loading production environment (Rails 4.0.3)
2.0.0-p576 :001 > require 'uri'
 => false 
2.0.0-p576 :004 > open 'http://www.ruby-doc.org/' # loads fine
 => #<Tempfile:/tmp/open-uri20141029-11034-1sq9rtm> 
2.0.0-p576 :002 > open 'http://ndic.com'          # error!?
RuntimeError: redirection forbidden: http://ndic.com -> https://ndic.com/
  from /usr/local/rvm/rubies/ruby-2.0.0-p576/lib/ruby/2.0.0/open-uri.rb:223:in `open_loop'
  from /usr/local/rvm/rubies/ruby-2.0.0-p576/lib/ruby/2.0.0/open-uri.rb:149:in `open_uri'
  from /usr/local/rvm/rubies/ruby-2.0.0-p576/lib/ruby/2.0.0/open-uri.rb:689:in `open'
  from /usr/local/rvm/rubies/ruby-2.0.0-p576/lib/ruby/2.0.0/open-uri.rb:34:in `open'
  from (irb):2
  from /usr/local/rvm/gems/ruby-2.0.0-p576/gems/railties-4.0.3/lib/rails/commands/console.rb:90:in `start'
  from /usr/local/rvm/gems/ruby-2.0.0-p576/gems/railties-4.0.3/lib/rails/commands/console.rb:9:in `start'
  from /usr/local/rvm/gems/ruby-2.0.0-p576/gems/railties-4.0.3/lib/rails/commands.rb:62:in `<top (required)>'
  from bin/rails:4:in `require'
  from bin/rails:4:in `<main>'

EDIT:

One commenter asked if the problem could be that the second environment (latest CentOS) lacks the packages to make HTTPS requests. My understanding of the OpenURI library is that this shouldn't matter; if an http:// request redirects to https://, the initial #open call should just return an object explaining the redirect (analogous to an HTTP response). I've tried directly loading an HTTPS URL like https://ndic.com, and in both environments this fails with an OpenSSL::SSL::SSLError. So I'm still stuck on the question of why the http:// (redirectable) request gets an error in only one environment.
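One way to look for the difference is to print the same set of facts on both machines and compare the output. The sketch below is only a starting point: the method name `open_uri_environment_report` is made up for illustration, and the choice of what to inspect (Ruby build, OpenSSL version, where `OpenURI.redirectable?` is defined, which open-uri related files are loaded) is an assumption about where such a difference might hide, not a confirmed diagnosis.

```ruby
require 'open-uri'
require 'openssl'

# Hypothetical helper: gathers facts that could plausibly differ between
# two machines that report the same Ruby and Rails versions.
def open_uri_environment_report
  {
    ruby_version:    "#{RUBY_VERSION}-p#{RUBY_PATCHLEVEL}",
    ruby_platform:   RUBY_PLATFORM,
    openssl_version: OpenSSL::OPENSSL_VERSION,
    # File and line where the redirect check is defined. If a gem has
    # redefined it, the path would likely point outside the standard library.
    redirect_check:  OpenURI.method(:redirectable?).source_location,
    # Every loaded file whose path mentions open-uri or open_uri.
    open_uri_files:  $LOADED_FEATURES.grep(/open[-_]uri/)
  }
end

open_uri_environment_report.each do |key, value|
  puts "#{key}: #{value.inspect}"
end
```

Running this in the Rails console in each environment and diffing the two outputs should show whether both machines are really executing the same open-uri code.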

Topher Hunt
  • Looks like `http://ndic.com` is being redirected to `https://ndic.com`, which doesn't seem to be allowed. – Surya Oct 29 '14 at 19:01
  • Which operating systems are you running on the server and on your local machine? – Paulo Henrique Oct 29 '14 at 19:06
  • @Paulo: OSX 10.9 and Linux CentOS latest. So the code is definitely working in different environments at a large scale; but I thought a basic method call like this would work the same on any supported platform. Note that the first HTTP call succeeds. – Topher Hunt Oct 29 '14 at 19:09
  • @User089247: Thank you, my concern here is to figure out *why* the redirection is blocked when run on one machine but not the other. And whether there's anything I can do to change that. – Topher Hunt Oct 29 '14 at 19:09
  • It might be related to your CentOS server not having the prerequisites to open HTTPS URIs. – Paulo Henrique Oct 29 '14 at 19:11
  • How did you install ruby on CentOS? – Paulo Henrique Oct 29 '14 at 19:11
  • @Paulo I installed Ruby as hastily as possible, using RVM. I failed to mention this, but direct HTTPS URLs fail with a separate error (`OpenSSL::SSL::SSLError`) in *both* environments, so I made the assumption that this error wasn't because of failure to handle HTTPS. Does that make sense? – Topher Hunt Oct 29 '14 at 19:15
  • Are you sure you are using the rvm ruby? Maybe you are using the OS one. And which version are you using (2.1.1)? – Paulo Henrique Oct 29 '14 at 19:17
  • @Paulo the code snippets above should confirm that I'm using the same Ruby in both environments, 2.0.0 p576, unless I misunderstood the Rails console output. The OS ruby is indeed different between environments (1.9.3 and 2.0.0) but again I don't believe I'm using that. – Topher Hunt Oct 29 '14 at 19:25
  • I would recommend you to try reinstalling rvm. Maybe you are missing some packages related to openssl or something like that. – Paulo Henrique Oct 29 '14 at 19:27
  • Check [this](https://bugs.ruby-lang.org/issues/859) and [this](https://github.com/jaimeiniesta/open_uri_redirections). – S. A. Oct 29 '14 at 19:28
  • Seems like an old bug: http://stackoverflow.com/a/10014941/645886 but it has been patched: https://bugs.ruby-lang.org/issues/5950 so I guess it should have worked. – Surya Oct 29 '14 at 19:29
  • Paulo, Sergio, User089247 thanks for the input. I'm trying other "real" redirects (that go to a whole different domain, not just HTTP -> HTTPS) and these reliably generate the "redirection forbidden" error in both environments. So the different handling *only* occurs when moving from HTTP to HTTPS. I'm no closer to determining why the difference but I'm clearer on the scope of the error, and it looks like I'll need a different HTTP solution altogether. Thanks for the help! – Topher Hunt Oct 29 '14 at 19:43
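
The "different HTTP solution" mentioned in the last comment could be sketched with `Net::HTTP` from the standard library, which returns redirect responses as ordinary objects instead of following them or raising. This is a minimal sketch under assumptions: `classify_redirect` and `probe` are hypothetical names, "same domain" is taken to mean an exact host match (so `www.example.com` and `example.com` count as different), and a HEAD request is assumed to be enough to see the redirect.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: classify a redirect relative to the requested URL.
# Returns :same_host, :scheme_change (same host, different scheme),
# or :external (different host).
def classify_redirect(from_url, to_url)
  from = URI.parse(from_url)
  to   = URI.join(from_url, to_url) # Location headers may be relative
  return :external unless from.host.to_s.downcase == to.host.to_s.downcase
  from.scheme == to.scheme ? :same_host : :scheme_change
end

# Hypothetical helper: issue a single HEAD request and report the result.
# Net::HTTP does not follow redirects, so a 3xx response is returned as-is.
def probe(url)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request_head(uri.request_uri)
  end
  if response.is_a?(Net::HTTPRedirection)
    { status: response.code, redirect: classify_redirect(url, response['location']) }
  else
    { status: response.code, redirect: nil }
  end
end

p classify_redirect('http://ndic.com', 'https://ndic.com/') # => :scheme_change
```

Calling `probe('http://ndic.com')` would make a real network request; because the redirect is inspected rather than followed, the result should not depend on how open-uri decides which redirects are permitted.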
