4

I am trying to download the latest.zip from WordPress.org using Net::HTTP. This is what I have got so far:

Net::HTTP.start("wordpress.org/") { |http|
  resp = http.get("latest.zip")
  open("a.zip", "wb") { |file| 
    file.write(resp.body)
  }
  puts "WordPress downloaded"
}

But this only gives me a 4-kilobyte 404 error HTML page (visible if I change the file name to a.txt). I suspect the URL is being redirected somehow, but I have no clue what I am doing. I am a newbie to Ruby.

the Tin Man
maetthew

2 Answers

8

My first question is why you would use Net::HTTP, or any code at all, to download something that curl or wget could handle more easily. Both are designed to make downloading files easy.

But since you want to download things using code, I'd recommend looking at Open-URI if you want to follow redirects. It's part of Ruby's standard library, and very useful for fast HTTP/FTP access to pages and files:

require 'open-uri'

open('latest.zip', 'wb') do |fo|
  # Since Ruby 3.0, Kernel#open no longer opens URLs; use URI.open instead.
  fo.print URI.open('http://wordpress.org/latest.zip').read
end

I just ran that, waited a few seconds for it to finish, ran unzip against the downloaded file "latest.zip", and it expanded into the directory containing their content.
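For large archives it may be preferable not to pull the whole body into a single Ruby string with `read`. Here is a minimal sketch of that idea, assuming Ruby 2.5 or later (for `URI.open`). `save_stream` is a hypothetical helper name introduced for illustration, not part of Open-URI:

```ruby
require 'open-uri'

# Hypothetical helper: copies any readable IO-like object to `path` in
# chunks and returns the number of bytes written.
def save_stream(source, path)
  File.open(path, 'wb') do |file|
    IO.copy_stream(source, file)
  end
end

# Usage (needs network access, so it is left commented out here):
#   URI.open('https://wordpress.org/latest.zip') do |remote|
#     save_stream(remote, 'latest.zip')
#   end
```

As far as I know, Open-URI already buffers larger responses in a temporary file, so the main gain here is avoiding a second full copy of the download in memory.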

Beyond Open-URI, there are HTTPClient and Typhoeus, among others, that make it easy to open an HTTP connection and send queries/receive data. They're very powerful and worth getting to know.

the Tin Man
  • I am using JRuby for an application where I could not use curb for some reason, so I started to look at Net::HTTP. I was not aware of Open-URI, which seems much more viable. Will look into it. Thank you VERY much for the tip! I would have liked to accept this answer as well, but I have already accepted an answer, and the question is specifically about Net::HTTP – maetthew Mar 22 '11 at 07:08
  • I would also recommend https://github.com/rubiii/httpi , which lets you code against a common network interface and lets you switch the library underneath. – Martin T. Apr 04 '12 at 08:46
  • I consider both of these to be great answers – Stone Nov 10 '12 at 08:29
  • @the Tin Man - Thank you so much. I was having problems with `IO:copy_stream`. Your solution helped me a ton! If possible please answer my question - "Ruby undefined method `copy_stream' for IO:Class". – itsh Feb 18 '16 at 17:13
6

Net::HTTP doesn't provide a nice way of following redirects. Here is a piece of code that I've been using for a while now:

require 'net/http'
class RedirectFollower
  class TooManyRedirects < StandardError; end

  attr_accessor :url, :body, :redirect_limit, :response

  def initialize(url, limit=5)
    @url, @redirect_limit = url, limit
  end

  def resolve
    raise TooManyRedirects if redirect_limit < 0

    self.response = Net::HTTP.get_response(URI.parse(url))

    if response.kind_of?(Net::HTTPRedirection)      
      self.url = redirect_url
      self.redirect_limit -= 1

      resolve
    end

    self.body = response.body
    self
  end

  def redirect_url
    if response['location'].nil?
      response.body.match(/<a href=\"([^>]+)\">/i)[1]
    else
      response['location']
    end
  end
end



wordpress = RedirectFollower.new('http://wordpress.org/latest.zip').resolve
puts wordpress.url
File.open("latest.zip", "wb") do |file|  # binary mode, so the zip is written byte-for-byte
  file.write wordpress.body
end
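
The class above uses the `Location` header as-is, which works when the server sends an absolute URL. Servers sometimes send a relative one, so a variant that resolves each hop against the current URL may be more robust. The following is a sketch under that assumption, not a replacement for the code above. `resolve_final_uri` and its `fetch:` parameter are hypothetical names introduced here so the redirect logic can be exercised without a network connection:

```ruby
require 'net/http'
require 'uri'

class TooManyRedirects < StandardError; end

# Hypothetical helper: follows up to `limit` redirects and returns
# [final_uri, final_response]. `fetch` is any callable that takes a URI
# and returns a Net::HTTPResponse; by default it performs a real GET.
def resolve_final_uri(url, limit: 5, fetch: ->(uri) { Net::HTTP.get_response(uri) })
  uri = URI.parse(url)

  (limit + 1).times do
    response = fetch.call(uri)
    return [uri, response] unless response.is_a?(Net::HTTPRedirection)

    location = response['location']
    raise ArgumentError, "redirect from #{uri} has no Location header" if location.nil?

    # URI.join resolves relative targets such as "/files/latest.zip"
    # against the URL that issued the redirect.
    uri = URI.join(uri.to_s, location)
  end

  raise TooManyRedirects, "gave up after #{limit} redirects"
end

# Usage (needs network access, so it is left commented out here):
#   uri, response = resolve_final_uri('http://wordpress.org/latest.zip')
#   File.binwrite('latest.zip', response.body)
```

Passing the fetcher in as a parameter is only a convenience for testing; in normal use the default, which calls `Net::HTTP.get_response`, should be enough.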
Mike Lewis