5

Is it possible to determine from a System.ServiceModel.Syndication.SyndicationFeed instance what type of feed is being read? If all I have is the URL (blahblah.com/feed), it might be RSS or Atom, and depending on the type I want to do one thing or the other.

Is there a simple way to tell without parsing the document and looking for specific characters?

Dan Lowe
SelAromDotNet

1 Answer

11

Old question, but it deserves an answer.

There is a relatively simple way to determine whether you've got an RSS or an Atom feed. It does require reading, or at least trying to read, the document.

// Requires: using System; using System.Xml; using System.ServiceModel.Syndication;
public SyndicationFeed GetSyndicationFeedData(string urlFeedLocation)
{
    // Nothing to read without a location.
    if (String.IsNullOrEmpty(urlFeedLocation))
        return null;

    XmlReaderSettings settings = new XmlReaderSettings
        {
            IgnoreWhitespace = true,
            CheckCharacters = true,
            CloseInput = true,
            IgnoreComments = true,
            IgnoreProcessingInstructions = true,
            //DtdProcessing = DtdProcessing.Prohibit // .NET 4.0 option
        };

    using (XmlReader reader = XmlReader.Create(urlFeedLocation, settings))
    {
        if (reader.ReadState == ReadState.Initial)
            reader.MoveToContent();

        // now try reading...

        Atom10FeedFormatter atom = new Atom10FeedFormatter();
        // try to read it as an atom feed
        if (atom.CanRead(reader))
        {
            atom.ReadFrom(reader);
            return atom.Feed;
        }

        Rss20FeedFormatter rss = new Rss20FeedFormatter();
        // try reading it as an rss feed
        if (rss.CanRead(reader))
        {
            rss.ReadFrom(reader);
            return rss.Feed;
        }

        // neither?
        return null;
    }
}
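
If the caller also needs to know which format matched (which is what the question asks), the same CanRead checks can report it. The following is a minimal sketch, not part of the original answer: the FeedKind enum and the FeedSniffer class are hypothetical names introduced here for illustration, and the reader is assumed to be a fresh XmlReader positioned at the start of the document.

```csharp
using System;
using System.ServiceModel.Syndication;
using System.Xml;

// Hypothetical names: FeedKind and FeedSniffer are not part of the framework.
public enum FeedKind { Unknown, Atom10, Rss20 }

public static class FeedSniffer
{
    // Tries Atom 1.0 first, then RSS 2.0, and reports which formatter accepted the document.
    public static FeedKind Detect(XmlReader reader, out SyndicationFeed feed)
    {
        feed = null;

        Atom10FeedFormatter atom = new Atom10FeedFormatter();
        if (atom.CanRead(reader))
        {
            atom.ReadFrom(reader);
            feed = atom.Feed;
            return FeedKind.Atom10;
        }

        Rss20FeedFormatter rss = new Rss20FeedFormatter();
        if (rss.CanRead(reader))
        {
            rss.ReadFrom(reader);
            feed = rss.Feed;
            return FeedKind.Rss20;
        }

        // Neither formatter recognized the root element.
        return FeedKind.Unknown;
    }
}
```

A caller could then branch on the returned FeedKind (for example with a switch statement) instead of inspecting the SyndicationFeed instance, which does not appear to record the format it was loaded from.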
Cheeso
  • See, I thought of this and saw another example of it, but I don't remember why I didn't like it. It was so long ago, and this works, so consider this the best answer, thanks :) – SelAromDotNet Dec 16 '10 at 16:29
  • Ok so when I try this feed (http://en.espnf1.com/rss/motorsport/story/feeds/0.xml?type=2) which is of type Atom 2.0 your code does not work since the line atom.CanRead(reader) returns false. What is the solution here to handle Atom Ver. 2.0? – Marko Mar 09 '12 at 22:55
  • It's not atom 2.0, as far as I know. It looks to me like that feed is broken. It has junk in it. To work around it, I'd suggest fixing up the feed before trying to read it. I just tried this and it works for me here. – Cheeso Mar 10 '12 at 13:10
  • Well that's really surprising considering it comes from ESPN's Formula 1 site. Can you explain to me what is it that makes it a "junk" feed? What is missing and/or not formatted right? – Marko Mar 12 '12 at 13:06
  • Use "View Source" and you will see the broken formatting in it. There is a repeated `%]` in the feed, which appears to be a remnant of a broken generation algorithm. If I remove those fragments, the feed is readable with the code here. Surprisingly enough, there are bugs in data feeds. I suppose no one else has reported it because IE and other browsers tolerate the broken feed. Programs that are more strict, including those that depend on the SyndicationFeed classes, apparently do not tolerate the broken feed. – Cheeso Mar 12 '12 at 21:20
  • Thanks for the information. I noticed the %] earlier but didn't think much of it. It actually makes sense because it's in between the different node elements... – Marko Mar 13 '12 at 15:02
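
The workaround described in the comments above (fixing up the feed before trying to read it) might look roughly like the sketch below. It is an illustration under assumptions, not code from the thread: FeedCleanup and LoadAfterCleanup are hypothetical names, the raw feed text is assumed to have been downloaded into a string already, and blindly removing every `%]` would also strip that sequence from legitimate content, so treat it as specific to this one broken feed.

```csharp
using System;
using System.IO;
using System.ServiceModel.Syndication;
using System.Xml;

// Hypothetical helper; the name and approach are assumptions for illustration only.
public static class FeedCleanup
{
    // Removes the stray "%]" fragments from the raw text, then parses the result.
    public static SyndicationFeed LoadAfterCleanup(string rawXml)
    {
        // Assumption: "%]" never occurs in legitimate feed content.
        string cleaned = rawXml.Replace("%]", string.Empty);

        using (XmlReader reader = XmlReader.Create(new StringReader(cleaned)))
        {
            // SyndicationFeed.Load accepts both Atom 1.0 and RSS 2.0 documents.
            return SyndicationFeed.Load(reader);
        }
    }
}
```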