I have written a piece of code that uses the System.ServiceModel.Syndication library to parse RSS feeds.

The problem is that one of my feeds (provided by Facebook) ends with the lines shown below, and the Syndication library fails to parse the feed. It reports that the text is invalid XML, and the cause appears to be this part:

  ...
  </channel>
  <access:restriction relationship="deny" xmlns:access="http://www.bloglines.com/about/specs/fac-1.0" />
</rss>

I'm sure I'm missing something here, because both the feed and the parsing library come from huge companies (Facebook and Microsoft respectively).

Can any of you help? Or, alternatively, can you suggest a better parser that doesn't rely on the validity of the XML?

P.S. Here is my RSS feed url:
http://www.facebook.com/feeds/page.php?id=202296766494181&format=rss20

Here is how I'm parsing the feed response:

var stringReader = new StringReader(resp);
var xreader = XmlReader.Create(stringReader);
var xfeed = System.ServiceModel.Syndication.SyndicationFeed.Load(xreader);

and the exception I get:

System.Xml.XmlException: 'Element' is an invalid XmlNodeType. Line 282, position 4.

at System.Xml.XmlReader.ReadEndElement() ...

Mo Valipour
  • Maybe this post can help you: connect.microsoft.com/VisualStudio/feedback/details/325421/syndicationfeed-load-fails-to-parse-datetime-against-a-real-world-feeds-ie7-can-read – tazyDevel Oct 16 '11 at 10:29

1 Answer

It seems that SyndicationFeed has a problem with the access:restriction element used by Facebook. See this recent thread: http://social.msdn.microsoft.com/Forums/ar/xmlandnetfx/thread/7045dc1c-1bd9-409a-9568-543e74f4578d

Michael Sun (MSFT) wrote "Just saw Martin's post! Very helpful! I also did some research about the issue. The element is from Bloglines, http://www.bloglines.com/index.html. It sounds like an extension facebook is using for its RSS 2.0 feeds, http://www.feedforall.com/access-namespace.htm. From this article, it seems Rss20FeedFormatter is not the only one which does not support the elements.

I agree with Martin to use XDocument (LINQ to XML) to parse the RSS feed. Or if you are building some large app via C#, the Facebook C# SDK can be helpful as well, http://facebooksdk.codeplex.com/"
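To illustrate the XDocument (LINQ to XML) suggestion quoted above, here is a minimal sketch that reads item titles and links straight from the RSS 2.0 XML. The class name `RssWithLinq` and the method `ReadItems` are made up for this example, and it assumes C# 7 or later (for tuples) and that the response body is already available as a string. Because the query only looks inside `<channel>`, the trailing access:restriction element is simply never visited.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of any library.
public static class RssWithLinq
{
    // Returns a (Title, Link) pair for every <item> under <channel>.
    // RSS 2.0 core elements have no namespace, so plain names are enough.
    public static (string Title, string Link)[] ReadItems(string rssXml)
    {
        XDocument doc = XDocument.Parse(rssXml);

        return doc.Root
            .Elements("channel")
            .Elements("item")
            .Select(i => (Title: (string)i.Element("title"),
                          Link: (string)i.Element("link")))
            .ToArray();
    }
}
```

This only covers RSS 2.0; an Atom feed would need a separate query using the Atom namespace, which is the drawback of going this route.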

Edit:

It seems, however, that the Atom feed does not suffer from this problem. So the easiest solution would be to use this link (http://www.facebook.com/feeds/page.php?id=202296766494181&format=atom10), i.e. to change the format parameter from rss20 to atom10:

    HttpWebRequest req = WebRequest.Create(@"http://www.facebook.com/feeds/page.php?id=202296766494181&format=atom10") as HttpWebRequest;
    req.UserAgent = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)";
    using (Stream responseStream = req.GetResponse().GetResponseStream())
    {
        using (XmlReader xr = XmlReader.Create(responseStream))
        {
            SyndicationFeed feed = SyndicationFeed.Load(xr);
        }
    }

Another alternative is to write a class that inherits from XmlTextReader and overrides the ReadEndElement method so that it skips any element after the channel closing tag. (Note that the code below comes without any guarantee, as I still consider myself a novice C# developer. Please feel free to correct any mistakes.)

using System;
using System.IO;
using System.Xml;

public class FaceBookReader : XmlTextReader
{
    public FaceBookReader(Stream stream)
        : base(stream) { }

    public FaceBookReader(String url)
        : base(url) { }

    public override void ReadEndElement()
    {
        // Remember which end tag is about to be consumed
        // (culture-invariant lowercasing avoids locale surprises).
        string elementTag = this.LocalName.ToLowerInvariant();

        base.ReadEndElement();

        // Once the channel end tag has been read, skip every element
        // that follows until we reach the next end tag, which should be that of rss.
        if (elementTag == "channel")
        {
            while (base.IsStartElement())
            {
                base.Skip();
            }
        }
    }
}
tazyDevel
  • Very useful info, but parsing with XDocument is a real pain, because then I need to support Atom, RSS, etc. separately... Any other solution you can think of? – Mo Valipour Oct 16 '11 at 21:46
  • Perhaps you can load the XML, remove the invalid access tag, and reload it using SyndicationFeed? – tazyDevel Oct 17 '11 at 06:21
  • I tried changing the format to atom10, and that seems to do the trick: the feed loads into a SyndicationFeed – tazyDevel Oct 17 '11 at 18:27
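The "load the XML, remove the access tag, and reload" idea from the comments could look roughly like the sketch below. The class `FeedSanitizer` and its method names are invented for illustration. It assumes the response is available as a string and that System.ServiceModel.Syndication is referenced (on newer .NET versions this is a separate NuGet package). Whether SyndicationFeed then accepts the cleaned feed has not been verified here. Only RSS documents are touched, so Atom feeds pass through unchanged and a single SyndicationFeed code path can still serve both formats.

```csharp
using System.Linq;
using System.ServiceModel.Syndication;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, not part of any library.
public static class FeedSanitizer
{
    // For an <rss> document, removes every direct child of the root
    // other than <channel> (for example access:restriction).
    // Documents with any other root element are returned unchanged.
    public static XDocument StripNonChannelChildren(string feedXml)
    {
        XDocument doc = XDocument.Parse(feedXml);

        if (doc.Root.Name == "rss")
        {
            doc.Root.Elements()
               .Where(e => e.Name != "channel")
               .Remove();
        }

        return doc;
    }

    // Cleans the XML and hands it to SyndicationFeed.Load.
    public static SyndicationFeed Load(string feedXml)
    {
        XDocument doc = StripNonChannelChildren(feedXml);
        using (XmlReader reader = doc.CreateReader())
        {
            return SyndicationFeed.Load(reader);
        }
    }
}
```

With the `resp` string from the question, the call would be `var xfeed = FeedSanitizer.Load(resp);`.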