1

Suppose I was giving this list of urls:

website.com/thispage

website.com/thatpage

website.com/thispageagain

website.com/thatpageagain

website.com/morepages

... could possibly be over say 1k urls.

What is the best/easiest way to kinda loop through this list and check whether or not the page is up?

kmatyaszek
  • 19,016
  • 9
  • 60
  • 65
iCodeLikeImDrunk
  • 17,085
  • 35
  • 108
  • 169
  • possible duplicate of [Python verify url goes to a page](http://stackoverflow.com/questions/4041443/python-verify-url-goes-to-a-page) – finnw Nov 17 '12 at 10:16

4 Answers4

6

Perform a HEAD request on each of them.

Use this library: http://docs.python-requests.org/en/latest/user/quickstart/#make-a-request

requests.head('http://httpbin.org/get').status_code
Marcin
  • 48,559
  • 18
  • 128
  • 201
5

Here is an example in Python

import httplib2

h = httplib2.Http()
listUrls = ['http://www.google.com','http://www.xkcd.com','http://somebadurl.com']
count = 0

for each in listUrls:
    try:
        response, content = h.request(listUrls[count])
        if response.status==200:
            print "UP"
    except httplib2.ServerNotFoundError:
        print "DOWN"
    count = count + 1
RussellJSmith
  • 381
  • 1
  • 3
  • 8
2

There's an SO answer showing how to perform HEAD requests in Python:

How do you send a HEAD HTTP request in Python 2?

Community
  • 1
  • 1
John Gaines Jr.
  • 11,174
  • 1
  • 25
  • 25
1

Open a pool of threads, open a Url for each, wait for a 200 or a 404. Rinse and repeat.

enjoylife
  • 3,801
  • 1
  • 24
  • 33