In this example RSS feed, the optional item element pubDate is included in every entry, but it is not available as an attribute of the entries returned by the Python module feedparser. This code:

import feedparser
rss_object = feedparser.parse("http://cyber.law.harvard.edu/rss/examples/rss2sample.xml")
for entry in rss_object.entries:
    print(entry.pubDate)

Causes the error AttributeError: object has no attribute 'pubDate', but I can successfully run print(entry.description) and see the contents of all the description tags.

Martin Burch

1 Answer

feedparser is an opinionated parser: it maps feed elements onto its own attribute names rather than simply returning the XML in a dictionary. The text of pubDate is available as entries[i].published.

The reference manual describes published as: "The date this entry was first published, as a string in the same format as it was published in the original feed."

Working code:

for entry in rss_object.entries:
    print(entry.published)

Note: published is extracted from one of several possible XML tags depending on the format of the feed. See the reference manual for a list.

The manual also claims the pubDate element is parsed "as a date" in entries[i].published_parsed. What published_parsed actually holds is a time.struct_time object, so if the original feed included time zones and you want to keep that information, you may want to re-parse the published string yourself.
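One way to do that re-parsing, as a rough sketch: RSS 2.0 pubDate values are normally RFC 822 date strings, so the standard library's email.utils.parsedate_to_datetime (Python 3.3+) can turn the published string into a timezone-aware datetime. The helper name parse_pub_date and the sample string below are made up for illustration; in practice you would pass in entry.published, and feeds that use a different date format would need a different parser.

```python
from email.utils import parsedate_to_datetime


def parse_pub_date(published):
    """Parse an RFC 822 date string, keeping the UTC offset it was written with.

    Hypothetical helper: `published` is assumed to be the raw string that
    feedparser exposes as entry.published for an RSS 2.0 feed.
    """
    return parsedate_to_datetime(published)


# Illustrative value only, not taken from a real feed.
sample = "Tue, 03 Jun 2003 09:39:21 -0400"
dt = parse_pub_date(sample)

print(dt.isoformat())   # 2003-06-03T09:39:21-04:00
print(dt.utcoffset())   # the original offset is preserved on the datetime
```

Inside the loop from the working code this would be called as parse_pub_date(entry.published), assuming every entry actually has a published value.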

Martin Burch
    Gosh. I wish I'd noticed this earlier when using feedparser. Everything else maps to standard RSS tags, so I was missing dates all the time. – Marc Maxmeister Jul 15 '19 at 20:15