-1

I am getting a "TypeError: expected string or buffer" error in a simple Python script. I am trying to extract and print the titles of the posts on a Reddit page.

from urllib import urlopen
import re


worldNewsPage = urlopen("https://www.reddit.com/r/worldnews/")

collectTitle = re.compile('<p class="title"><a.*>(.*)</a>')

findTitle = re.findall(collectTitle, worldNewsPage)

listIterator = []
listIterator[:] = range(1,3)

for i in listIterator:
    print findTitle
    print
jamylak
TheSoma300

2 Answers

1

Change

worldNewsPage = urlopen("https://www.reddit.com/r/worldnews/")

to

worldNewsPage = urlopen("https://www.reddit.com/r/worldnews/").read()
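To see why the extra read() matters without hitting the network, here is a minimal sketch. It assumes Python 3 (where urlopen lives in urllib.request and read() returns bytes, so the result is decoded first); FakeResponse and sample_html are hypothetical stand-ins for the real response and page, and the pattern is the one from the question.

```python
import re


class FakeResponse(object):
    """Hypothetical stand-in for the file-like object urlopen() returns."""

    def __init__(self, data):
        self._data = data

    def read(self):
        # Like a real response, hand back the raw bytes of the page.
        return self._data


# Made-up markup shaped like what the question's regex expects.
sample_html = (
    b'<p class="title"><a href="/a">First headline</a></p>\n'
    b'<p class="title"><a href="/b">Second headline</a></p>\n'
)
response = FakeResponse(sample_html)

title_pattern = re.compile('<p class="title"><a.*>(.*)</a>')

# Passing the response object itself reproduces the TypeError,
# because re only accepts strings (or bytes-like objects).
try:
    re.findall(title_pattern, response)
    raised = False
except TypeError:
    raised = True

# Reading (and, on Python 3, decoding) gives re the string it needs.
page = response.read().decode("utf-8")
titles = title_pattern.findall(page)
print(titles)
```

With a real page you would replace FakeResponse(...) with the urlopen(...) call; the rest stays the same.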

Also, don't use regex to parse HTML. You can use BeautifulSoup instead.
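As a rough illustration of the parser-based approach, here is a sketch that uses the standard library's html.parser rather than BeautifulSoup, purely to stay dependency-free (BeautifulSoup would be shorter). It assumes Python 3 and assumes the markup looks the way the question's regex implies, with each title in an <a> inside <p class="title">; TitleCollector and sample_html are hypothetical names made up for this example.

```python
from html.parser import HTMLParser


class TitleCollector(HTMLParser):
    """Collects the text of <a> tags nested inside <p class="title">."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title_p = False
        self._in_link = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs. This is an exact match,
        # so a class attribute with several class names would be skipped.
        if tag == "p" and ("class", "title") in attrs:
            self._in_title_p = True
        elif tag == "a" and self._in_title_p:
            self._in_link = True
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            self.titles.append("".join(self._chunks).strip())
            self._in_link = False
        elif tag == "p":
            self._in_title_p = False

    def handle_data(self, data):
        # Text inside nested tags (e.g. <b>) is still part of the title.
        if self._in_link:
            self._chunks.append(data)


# Made-up sample page; no network access needed.
sample_html = (
    '<div><p class="title"><a href="/a">First headline</a></p>'
    '<p class="title"><a href="/b">Second <b>bold</b> headline</a></p>'
    '<p class="other"><a href="/c">Not a title</a></p></div>'
)

collector = TitleCollector()
collector.feed(sample_html)
collector.close()
print(collector.titles)
```

Unlike the regex, this does not depend on how the tags are laid out across lines, and it keeps title text that contains nested markup.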

Community
jamylak
0

urlopen returns a file-like object, not a string, so you have to call its read() method to get the contents you downloaded (just as you would with a file).

marcomg