15

I have a list of URLs and need to check which of them are valid.

The code I used is

require 'net/http'

url = 'http://mysite.com'
res = Net::HTTP.get_response(URI.parse(url.to_s))
puts res.code

Here I can check for response code 200 to identify a valid URL. My concern is that the 'res' object returned contains the code, the body, and so on, so the response becomes heavy. Is there a way to get only the response code? I don't need any other information.

Amal Kumar S

5 Answers

11

I haven't checked whether this is possible with Net::HTTP, but you can use Curb, a Ruby binding for libcurl. See Curl::Easy#http_head.

With Net::HTTP you can also use HTTP#head, which requests headers from the server using the HEAD method.

From the HTTP/1.1 specification (RFC 2616), on the HEAD method:

9.4 HEAD

The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response. The metainformation contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request. This method can be used for obtaining metainformation about the entity implied by the request without transferring the entity-body itself. This method is often used for testing hypertext links for validity, accessibility, and recent modification.
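As a rough sketch of the Net::HTTP route, assuming each URL is a full http:// or https:// string: the helper below sends a HEAD request and returns only the status code, or nil if the URL is malformed or the request fails. The name head_status and the 5-second timeout are made up for illustration.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: returns the status code as a String (e.g. "200"),
# or nil when the URL is malformed, is not HTTP(S), or the request fails.
def head_status(url, timeout: 5)
  uri = URI.parse(url)
  # URI::HTTPS is a subclass of URI::HTTP, so this accepts both schemes.
  return nil unless uri.is_a?(URI::HTTP) && uri.host

  Net::HTTP.start(uri.host, uri.port,
                  use_ssl: uri.scheme == 'https',
                  open_timeout: timeout, read_timeout: timeout) do |http|
    # HEAD asks the server for the status line and headers only, no body.
    http.head(uri.request_uri).code
  end
rescue StandardError
  # Covers malformed URLs, DNS failures, refused connections, timeouts
  # and TLS errors; all of them are treated as "not valid" here.
  nil
end

# Usage with a list of URLs (performs real network requests):
#   urls = ['http://mysite.com']
#   urls.each { |url| puts "#{url} -> #{head_status(url).inspect}" }
```

Some servers answer HEAD differently from GET (for example with 405), so a code other than 200 does not always mean the URL is dead.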

Nakilon
Alex Kurkin
6

This is easiest in Faraday:

require 'faraday'

# one line to make request
response = Faraday.head url

# example with headers
resource_size = response.headers['Content-Length']
Turadg
6

The code I used is:

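require 'net/http'

path = '/' # placeholder: path of the resource to check
# Net::HTTP.start returns the value of the block, here the HEAD response.
response = Net::HTTP.start('upload.wikimedia.org', 80) { |http| http.head(path) }
puts response.code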
Nakilon
Amal Kumar S
3

A HEAD request could look like this:

require 'socket'

s = TCPSocket.open("google.com", 80)
# HTTP header lines end with CRLF, and an empty line terminates the request.
s.write "HEAD / HTTP/1.1\r\nHost: google.com\r\nConnection: close\r\n\r\n"

headline = s.gets
s.close

status = headline.scan(/\d\d\d/).first.to_i
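The scan(/\d\d\d/) call above picks the first run of three digits anywhere in the line. A slightly stricter alternative, sketched here under the assumption that the server replies with an HTTP/1.x status line, anchors the match to the start of the line. The name parse_status_line is hypothetical.

```ruby
# Hypothetical helper: extracts the three-digit status code from an
# HTTP/1.x status line such as "HTTP/1.1 200 OK".
# Returns an Integer, or nil when the line does not look like a status line.
def parse_status_line(line)
  match = %r{\AHTTP/\d\.\d (\d{3})\b}.match(line.to_s)
  match && match[1].to_i
end

parse_status_line("HTTP/1.1 200 OK\r\n")  # => 200
parse_status_line("HTTP/1.0 404 Not Found") # => 404
parse_status_line("not a status line")      # => nil
```

Returning nil for unexpected input avoids silently turning a garbled reply into status 0, which is what nil.to_i would produce.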
the Tin Man
moritz
0
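require "uri"
require "net/http"

uri = URI(url) # url is the string to check, as in the question
# request_uri falls back to "/" when the URL has no path, and includes any query string.
p Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") { |http| http.head(uri.request_uri) }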
Nakilon