My code got stuck on this function call:

feedparser.parse("http://...")

This worked before. Now the URL cannot even be opened in a browser. How would you handle this case? Is there a way to set a timeout? I'd like to continue as if nothing had happened (only printing a message or logging the issue).

xralf

3 Answers

Use the Python requests library for network I/O and feedparser for parsing only:

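import logging
from io import BytesIO

import feedparser
import requests

logger = logging.getLogger(__name__)

def fetch_feed(rss_feed):
    # Do the request with the requests library and a timeout
    try:
        resp = requests.get(rss_feed, timeout=20.0)
    except requests.Timeout:  # covers both connect and read timeouts
        logger.warning("Timeout when reading RSS %s", rss_feed)
        return None

    # Wrap the raw bytes in an in-memory stream object for feedparser
    content = BytesIO(resp.content)

    # Parse content
    return feedparser.parse(content)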
Mikko Ohtamaa
  • It is better than specifying the global timeout but it might not fix the issue due to the reason pointed out in my answer (`requests.get()` may block for much longer than the `timeout` value). Follow the link for details. – jfs Mar 24 '18 at 21:20
  • I like this solution. I have HTTP settings that work really well for my purposes, but wanted to use feedparser for the variations I find in RSS feeds. This allows me to do both. Thanks! – jsfa11 Jan 10 '19 at 17:00

You can specify a timeout globally using socket.setdefaulttimeout().
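A minimal sketch of the global-timeout approach, using only the standard library. The helper name default_socket_timeout and the 10-second value are illustrative assumptions, not part of feedparser or of this answer; the helper restores the previous default afterwards so the setting does not leak into unrelated code.

```python
import socket
from contextlib import contextmanager


@contextmanager
def default_socket_timeout(seconds):
    """Temporarily set the process-wide default timeout for newly created sockets.

    Hypothetical helper for illustration; it only wraps the standard
    socket.getdefaulttimeout() / socket.setdefaulttimeout() calls.
    """
    previous = socket.getdefaulttimeout()
    socket.setdefaulttimeout(seconds)
    try:
        yield
    finally:
        # Restore whatever default was in effect before (None means "no timeout").
        socket.setdefaulttimeout(previous)
```

It could then be used as `with default_socket_timeout(10.0): feed = feedparser.parse(url)`. The default is process-wide, so sockets created by other threads during the block would pick it up as well.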

The timeout only limits how long an individual socket operation may last. feedparser.parse() may perform many socket operations, so the total time spent on DNS resolution, establishing the TCP connection, and sending/receiving data may be much longer. See "Read timeout using either urllib2 or any other http library".
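One way to bound the total time, rather than each socket operation, is to run the call in a worker thread and stop waiting after an overall deadline. This is a sketch of a different technique from the socket timeout above, not something feedparser provides; call_with_deadline and its deadline parameter are hypothetical names, and the abandoned worker is not cancelled, it simply keeps running in the background until its own I/O finishes.

```python
import concurrent.futures
import logging

logger = logging.getLogger(__name__)


def call_with_deadline(func, *args, deadline=30.0):
    """Run func(*args) in a worker thread; return None if it exceeds `deadline` seconds.

    Hypothetical helper for illustration. The worker thread is abandoned,
    not killed, so a per-socket timeout is still worth setting.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args)
    try:
        return future.result(timeout=deadline)
    except concurrent.futures.TimeoutError:
        name = getattr(func, "__name__", repr(func))
        logger.warning("Gave up on %s after %.1f s", name, deadline)
        return None
    finally:
        # Do not block here waiting for a possibly stuck worker.
        executor.shutdown(wait=False)
```

For example, `feed = call_with_deadline(feedparser.parse, url, deadline=30.0)` would log a warning and carry on with `feed` set to None if parsing takes longer than 30 seconds overall. Note that the interpreter may still wait for the abandoned worker at exit.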

jfs
  • OK, I used it but don't know if it works because the URL with endless loading is active again. – xralf Mar 19 '12 at 15:33

According to the author's recommendation [1], you should use the requests library to make the HTTP request and pass the result to feedparser.

[1] https://github.com/kurtmckee/feedparser/pull/80

apporc