I'm using Python 2.7. When I run this code, it fails at print findPatTitle[i] with "IndexError: list index out of range". I took the code from the 13th Python tutorial on YouTube, and I'm pretty sure my copy is identical, so I don't understand why I get a range problem. Any ideas?

from urllib import urlopen
from BeautifulSoup import BeautifulSoup
import re

webpage = urlopen('http://feeds.huffingtonpost.com/huffingtonpost/LatestNews').read()

patFinderTitle = re.compile('<title>(.*)<title>')

patFinderLink = re.compile('<link rel.*href="(.*)" />')

findPatTitle = re.findall(patFinderTitle,webpage)
findPatLink = re.findall(patFinderLink,webpage)

listIterator = []
listIterator[:] = range(2,16)

for i in listIterator:
    print findPatTitle[i]
    print findPatLink[i]
    print "\n"
Burton Guster
Why are you using regex to parse the HTML when you have BeautifulSoup? o.O You shouldn't parse HTML with regex: http://stackoverflow.com/questions/590747/using-regular-expressions-to-parse-html-why-not – naeg Sep 06 '11 at 06:11

1 Answer

If your regex managed to find the title and link tags, findall returns a list of the matched strings. In that case, you can just iterate over each list and print its items.

Like:

for title in findPatTitle:
    print title

for link in findPatLink:
    print link
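
Before looping, it can help to check how many matches each pattern actually found. The sketch below runs against a small made-up snippet (sample_feed is an assumption, not the real Huffington Post feed, whose markup may differ). It compares the title pattern from the question, whose closing tag has no slash, with a corrected, non-greedy version. The corrected patterns are a suggestion, not something taken from the tutorial.

```python
import re

# Hypothetical stand-in for the downloaded feed; the real markup may differ.
sample_feed = (
    '<entry>\n<title>First story</title>\n'
    '<link rel="alternate" href="http://example.com/1" />\n</entry>\n'
    '<entry>\n<title>Second story</title>\n'
    '<link rel="alternate" href="http://example.com/2" />\n</entry>\n'
)

# Pattern as written in the question: the closing tag lacks its slash.
pat_title_original = re.compile('<title>(.*)<title>')
# Closing tag corrected; non-greedy so each title is matched separately.
pat_title_fixed = re.compile('<title>(.*?)</title>')
pat_link = re.compile('<link rel.*?href="(.*?)" />')

titles_original = pat_title_original.findall(sample_feed)
titles_fixed = pat_title_fixed.findall(sample_feed)
links = pat_link.findall(sample_feed)

# Print the match counts; a count of zero explains an IndexError on any index.
print("original title pattern: %d matches" % len(titles_original))
print("fixed title pattern: %d matches" % len(titles_fixed))
print("link pattern: %d matches" % len(links))
```

If a count comes back smaller than the largest index you plan to use, the loop will fail no matter how it is written.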

You are getting the IndexError because the loop indexes both lists at positions 2 through 15, and at least one of the title and link lists has fewer than 16 elements.
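
One way to avoid the error is to let the lists decide how far the loop goes instead of hard-coding the upper bound. This is a minimal sketch, with made-up lists standing in for findPatTitle and findPatLink. It relies on two facts: slicing past the end of a list is safe, and zip stops at the shorter of its inputs.

```python
# Made-up results standing in for findPatTitle and findPatLink;
# deliberately shorter than 16 items and of unequal length.
find_pat_title = ['feed title', 'site title', 'story A', 'story B', 'story C']
find_pat_link = ['feed link', 'site link', 'link A', 'link B']

# A slice past the end returns whatever exists (no IndexError),
# and zip stops at the shorter of the two slices.
pairs = list(zip(find_pat_title[2:16], find_pat_link[2:16]))

for title, link in pairs:
    print(title)
    print(link)
    print("")
```

The trade-off is that a short list is silently truncated, so printing len() of each list first is still a useful sanity check.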

Note that listIterator[:] = range(2,16) is a roundabout way to set up the loop. You could just use

for i in range(2, 16):
    # use i
Senthil Kumaran