I've been working on a script and would like some help. I want to go through a series of web pages and check whether each one is valid. The next step is to check each valid page for specific content. If the page holds that content, its URL should be added to a list.

import urllib2

National = []
Local = []
Sports = []
Culture = []

BASE_URL = "http://readingeagle.com/section.aspx?id="

def getPage(url):
    # Return the page body, or None if the request fails.
    try:
        response = urllib2.urlopen(url)
        return response.read()
    except urllib2.URLError:
        return None

# I would like to set up an iteration to check the 'entryid' from 1-100. If the term is found on the page, place the URL in the list.

if __name__ == "__main__":
    for i in range(1, 101):
        url = BASE_URL + str(i)
        page = getPage(url)
        # Skip pages that failed to load, then look for the search term.
        if page is not None and "national" in page:
            National.append(url)

    print(National)
icktoofay
  • Your question... is not a question! If you want other programmers to help you out, you should be much more specific: what is your problem? What do you think might be wrong? What have you tried so far to find a solution yourself? – mac Jul 05 '11 at 00:37

1 Answer

Here's my answer to the question of how to validate a given web site.

python check html valid

To check the content of the page, the available tools range from basic string methods and regular expressions to more sophisticated parsers such as lxml or BeautifulSoup.
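As a rough illustration of the string/regex option, here is a minimal sketch that tests whether a term appears in a page as a whole word, ignoring case. The function name `page_mentions` and the `sample` HTML are made up for this example, and it assumes Python 3.

```python
import re

def page_mentions(html, term):
    """Return True if `term` occurs in `html` as a whole word, ignoring case."""
    # re.escape keeps any special characters in the term from being
    # interpreted as regex syntax; \b anchors the match to word boundaries.
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, html, flags=re.IGNORECASE) is not None

# Hypothetical page content, used only for demonstration.
sample = "<html><body><h1>National News</h1></body></html>"
print(page_mentions(sample, "national"))  # True
print(page_mentions(sample, "sports"))    # False
```

A plain `term in html` test is simpler, but it also matches inside longer words (for example "national" inside "International"), which is why the sketch uses word boundaries.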

matchingSites = []
matchingSites.append(url) #Since you asked. :-p
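Putting the pieces together, the loop from the question could look roughly like the sketch below. It assumes Python 3, where `urllib2` was split into `urllib.request` and `urllib.error`. The helper names `fetch` and `find_matching_sites` are invented for illustration, and "valid" is taken here to mean only that the request succeeds.

```python
from urllib.request import urlopen

BASE_URL = "http://readingeagle.com/section.aspx?id="

def fetch(url):
    """Return the page text, or None if the request fails."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        # URLError, HTTPError and socket timeouts are all OSError subclasses;
        # ValueError covers malformed URLs.
        return None

def find_matching_sites(term, ids, fetch_page=fetch):
    """Collect the URLs whose page loads and contains `term` (case-insensitive)."""
    matching_sites = []
    for i in ids:
        url = BASE_URL + str(i)
        html = fetch_page(url)
        if html is not None and term.lower() in html.lower():
            matching_sites.append(url)
    return matching_sites
```

Calling `find_matching_sites("national", range(1, 101))` would check ids 1 through 100 over the network. The `fetch_page` parameter is there so the loop can be tried with a stand-in function instead of real requests.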
Peter Lyons