First off, I would like to point out that this is not a basic task that you can run on just any shared hosting provider. I assume you would get banned for sure.
So I assume you are able to compile software (on a VPS?) and start long-running processes in the background (using the PHP CLI). I would use Redis (I liked Predis very much as a PHP client library) to push messages onto a list. (P.S.: I would prefer to write this in node.js/Python (the explanation below works for PHP too), because I think this task can be coded pretty fast in these languages. I am going to try to write it and post the code on GitHub later.)
Redis:
Redis is an advanced key-value store.
It is similar to memcached but the
dataset is not volatile, and values
can be strings, exactly like in
memcached, but also lists, sets, and
ordered sets. All this data types can
be manipulated with atomic operations
to push/pop elements, add/remove
elements, perform server side union,
intersection, difference between sets,
and so forth. Redis supports different
kind of sorting abilities.
Then start a couple of worker processes that take messages from the list, blocking while none are available.
Blpop:
This is where Redis gets really
interesting. BLPOP and BRPOP are the
blocking equivalents of the LPOP and
RPOP commands. If the queue for any of
the keys they specify has an item in
it, that item will be popped and
returned. If it doesn't, the Redis
client will block until a key becomes
available (or the timeout expires -
specify 0 for an unlimited timeout).
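A minimal sketch of the push/blocking-pop pattern described above, in Python rather than PHP. It assumes the third-party redis-py client and a Redis server on localhost; the queue name `ping:queue`, the JSON job format, and the `max_jobs` stop condition are made up for illustration.

```python
import json


def enqueue_check(client, url, queue="ping:queue"):
    """Producer side: append one job to the end of the Redis list."""
    client.rpush(queue, json.dumps({"url": url}))


def run_worker(client, handle, queue="ping:queue", timeout=0, max_jobs=None):
    """Consumer side: block on BLPOP until a job arrives, then handle it.

    timeout=0 blocks indefinitely; max_jobs exists only so a demo can stop.
    Returns the number of jobs handled.
    """
    done = 0
    while max_jobs is None or done < max_jobs:
        item = client.blpop(queue, timeout=timeout)
        if item is None:  # the timeout expired and nothing was queued
            break
        _key, raw = item  # BLPOP returns (list name, value)
        handle(json.loads(raw))
        done += 1
    return done


def main():
    import redis  # third-party: pip install redis

    client = redis.Redis()  # assumption: Redis listening on localhost:6379
    enqueue_check(client, "http://example.com/")
    run_worker(client, print, max_jobs=1)
```

Calling `main()` pushes one job and pops it again. In a real setup the producer and the workers would be separate processes, and you would start as many workers as you need.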
Curl is not exactly pinging (ICMP echo), but I guess some servers could block ICMP requests for security reasons. I would first try to ping the host (using the nmap snippet), and fall back to curl if the ping fails, because pinging is faster than using curl.
Libcurl:
A free client-side URL transfer
library, supporting FTP, FTPS, Gopher
(protocol), HTTP, HTTPS, SCP, SFTP,
TFTP, TELNET, DICT, FILE, LDAP, LDAPS,
IMAP, POP3, SMTP and RTSP (the last
four—only in versions newer than
7.20.0 or 9 February 2010)
Ping:
Ping is a computer network
administration utility used to test
the reachability of a host on an
Internet Protocol (IP) network and to
measure the round-trip time for
messages sent from the originating
host to a destination computer. The
name comes from active sonar
terminology. Ping operates by sending
Internet Control Message Protocol
(ICMP) echo request packets to the
target host and waiting for an ICMP
response.
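The ping-first, HTTP-second idea could look roughly like this. It is a sketch in Python that shells out to the system `ping` instead of nmap. The `-c`/`-W` flags assume Linux iputils ping (other platforms interpret `-W` differently), and the function names are made up for illustration.

```python
import subprocess
import urllib.error
import urllib.request


def ping_host(host, timeout_s=2):
    """Send one ICMP echo with the system ping; True if a reply came back."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 1,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def http_responds(host, timeout_s=5):
    """True if the host answers an HTTP HEAD request at all, whatever the status."""
    request = urllib.request.Request("http://%s/" % host, method="HEAD")
    try:
        urllib.request.urlopen(request, timeout=timeout_s).close()
    except urllib.error.HTTPError:
        return True  # a 4xx/5xx answer still proves the server is alive
    except (urllib.error.URLError, OSError):
        return False
    return True


def is_up(host, ping=ping_host, http=http_responds):
    """Cheap ICMP check first; fall back to HTTP only when ping gets no reply."""
    return ping(host) or http(host)
```

The `ping` and `http` parameters are there so either check can be swapped out, for example when ICMP is known to be filtered for a host.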
But then you should do a HEAD request and retrieve only the headers to check whether the host is up. Otherwise you would also be downloading the content of the URL, which takes time and costs bandwidth.
HEAD:
The HEAD method is identical to GET
except that the server MUST NOT return
a message-body in the response. The
metainformation contained in the HTTP
headers in response to a HEAD request
SHOULD be identical to the information
sent in response to a GET request.
This method can be used for obtaining
metainformation about the entity
implied by the request without
transferring the entity-body itself.
This method is often used for testing
hypertext links for validity,
accessibility, and recent
modification.
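As a rough illustration of a HEAD-only check, here is a sketch using Python's standard `urllib` (with PHP's curl the equivalent would be the `CURLOPT_NOBODY` option). The function name `head_status` and the default timeout are my own choices.

```python
import urllib.error
import urllib.request


def head_status(url, timeout_s=5):
    """Issue a HEAD request; return the HTTP status code, or None if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout_s) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, just not with a 2xx status
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout, ...
```

Any numeric result means the server answered, so for a pure "is it up" check you would treat only `None` as down.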
Then each worker process should use curl_multi to have some sort of concurrency within each process. I think this link might provide a good implementation of this (except that it does not do HEAD requests).
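To show the idea of several checks in flight per worker, here is a sketch in Python. Note that it swaps curl_multi for a thread pool from the standard library, so it is a stand-in for the technique, not curl_multi itself. The `check` callable is hypothetical and could be any single-URL check, such as a HEAD request.

```python
from concurrent.futures import ThreadPoolExecutor


def check_many(urls, check, max_workers=10):
    """Run check(url) for every URL concurrently and return {url: result}."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(check, urls)  # results come back in input order
        return dict(zip(urls, results))
```

Since the checks spend nearly all their time waiting on the network, threads are enough here. `max_workers` caps how many requests a single worker process has open at once.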