I am very new to Python. Very new. I copied the following from a tutorial:

#!/usr/bin/python

from urllib import urlopen
from BeautifulSoup import BeautifulSoup

webpage = urlopen('http://feeds.huffingtonpost.com/huffingtonpost/LatestNews').read

patFinderTitle = re.compile('<title>(.*)</title>')

patFinderLink = re.compile('<link rel.*href="(.*)"/>')

findPatTitle = re.findall(patFinderTitle,webpage)

findPatLink = re.findall(patFinderLink,webpage)

listIterator = []
listIterator[:] = range(2,16)

for i in listIterator:
    print findPatTitle[i]
    print findPatLink[i]
    print "\n"

I get the error:

Traceback (most recent call last):
  File "test.py", line 8, in <module>
    patFinderTitle = re.compile('<title>(.*)</title>')
NameError: name 're' is not defined

What am I doing wrong?

Aran-Fey
user_78361084
  • Which tutorial did you copy this from? It is littered with errors. – johnsyweb Sep 19 '11 at 01:40
  • http://www.youtube.com/watch?v=Ap_DlSrT-iE&feature=related – user_78361084 Sep 19 '11 at 01:47
  • Then you should compare your code with the accompanying code here: http://www.newthinktank.com/2010/11/python-2-7-tutorial-pt-13-website-scraping/ . After a little tidy-up I've found that it works. – johnsyweb Sep 19 '11 at 01:57
  • I rolled back your edit to the question because chameleon questions are not acceptable. You can't just invalidate the efforts of the people who posted answers to your original question like that. – Aran-Fey Sep 21 '18 at 12:34

2 Answers


You need to import the regular expression module, re, in your code:

import re
re.compile('<title>(.*)</title>')
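
As a minimal sketch of the fix, the snippet below imports re before using it and runs the question's title pattern against a small inline string. The `sample` text is a made-up stand-in for the feed, not real data, and the syntax is Python 3 rather than the Python 2 used in the question.

```python
import re  # must be imported before any re.* call

# Hypothetical stand-in for the downloaded feed (not real data).
sample = "<title>First story</title>\n<title>Second story</title>"

# Same pattern as in the question; '.' does not match a newline,
# so each <title> line is matched separately.
pat_finder_title = re.compile('<title>(.*)</title>')
titles = re.findall(pat_finder_title, sample)
print(titles)
```

Without the import line, the call to re.compile raises the NameError shown in the question.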
TheOneTeam
  • thanks. I now get another error..please see my edits – user_78361084 Sep 19 '11 at 01:24
  • @user522962 `webpage = urlopen('http://feeds.huffingtonpost.com/huffingtonpost/LatestNews').read` should be `webpage = urlopen('http://feeds.huffingtonpost.com/huffingtonpost/LatestNews').read()` – razpeitia Sep 19 '11 at 01:56

Besides the missing import re, your program has another error, in this line:

webpage = urlopen('http://feeds.huffingtonpost.com/huffingtonpost/LatestNews').read

You left the () off after read at the end of the line. As written, webpage is a reference to the .read method, not the result of the .read() call, so re.findall receives a method object instead of the page text.
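
The difference can be seen without touching the network. The sketch below is only an illustration: it uses io.StringIO as a hypothetical stand-in for the object that urlopen returns, and it is written in Python 3 syntax rather than the Python 2 of the question.

```python
import io
import re

# Hypothetical stand-in for the urlopen() response; no network access needed.
response = io.StringIO("<title>Example</title>")

not_called = response.read     # bound method object; nothing has been read yet
print(callable(not_called))    # it is a callable, not the page text

try:
    # Passing the method itself to re.findall fails, because re expects text.
    re.findall('<title>(.*)</title>', not_called)
except TypeError as exc:
    print("TypeError:", exc)

webpage = response.read()      # calling the method returns the content
found = re.findall('<title>(.*)</title>', webpage)
print(found)
```

Only the version with the parentheses gives re.findall a string to search.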

PM 2Ring
John La Rooy