Possible Duplicate:
Get random site names in bash
I'm making a program for the university that has to find the occurrences of the words on the web. I need to make an algorithm that finds sites and count the numbers of words used and after it has to record them and sort by how many times they are used. Therefore the most sites my program checks, the better. First of all I was thinking of calculating random IPs, but the problem is that the process takes really too much (I left the computer searching the whole night and it found only 15 sites). I guess this is because site's IPs aren't distributed evenly on the web and most of the IPs belongs to users or other services. Now I had a pair of new approach in mind and I wanted to know what you guys think:
what if I make random searches using some sort of a dictionary through google? The dictionary would start empty at the beginning and each time I perform a search, I check one site and add to the dictionary only the words that occur once, so that this won't send me to that site again, by corrupting the occurrences.
Is this easy?
The first thing I want to do is to search also random pages in the google search and not only the first one, how can this be done? I can't figure out how to calculate the max number of pages for that search and how to directly go to a specific page
thanks