55

How can I find the RSS feed of a particular website? Is there a standard way to locate it?

Shan
  • 1
    Also, note that some websites have no feed at all. In that case you will not find any RSS link markup like the one described in the answers. In short: not all websites or blogs have RSS feeds. – Sunny Saxena Jan 06 '13 at 07:52

5 Answers

71

You might be able to find it by looking at the source of the home page (or blog). Look for a line that looks like this:

<link rel="alternate" type="application/rss+xml" title="RSS Feed" href="http://example.org/rss" />

The href value is the URL where the RSS feed is located.
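As a rough sketch of the same lookup done programmatically, the snippet below scans a page's HTML for link elements whose type marks a feed and returns their href values. The names FeedLinkParser and find_feed_links and the sample markup are made up for illustration. It also accepts application/atom+xml and resolves relative href values against the page URL, which are assumptions beyond what the answer states.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types that conventionally mark a feed in a <link> element.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collect the href of every <link> element whose type is a feed type."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        link_type = (attr.get("type") or "").strip().lower()
        href = attr.get("href")
        if link_type in FEED_TYPES and href:
            # href may be relative, so resolve it against the page URL.
            self.feeds.append(urljoin(self.base_url, href))


def find_feed_links(html, base_url):
    """Return the absolute feed URLs advertised in the given HTML."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds


sample = """
<html><head>
<link rel="stylesheet" type="text/css" href="/style.css" />
<link rel="alternate" type="application/rss+xml" title="RSS Feed" href="/rss" />
</head><body></body></html>
"""
print(find_feed_links(sample, "http://example.org/"))
# prints ['http://example.org/rss']
```

This uses only the standard library. An empty list means the page does not advertise a feed this way, not necessarily that no feed exists.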

Francois Deschenes
  • This is a standard element, right? So I can look for it on every website? – Shan Jun 13 '11 at 06:40
  • 3
    It is, although the title attribute's value might differ. – Francois Deschenes Jun 13 '11 at 06:44
  • 1
    Is there a similar way to find atom feeds? – Automatico Jul 17 '13 at 21:07
  • 4
    @Cort3z You bet! There may be something like this in there somewhere: `<link rel="alternate" type="application/atom+xml" href="…" />`. The key is to look for `application/atom+xml`. – Francois Deschenes Jul 17 '13 at 21:31
  • @FrancoisDeschenes Nice. It does seem a bit fragile though. Not everyone actually adds the type in there. Maybe the best bet is to search for links with the keyword feed, rss or atom in them. – Automatico Jul 17 '13 at 22:25
  • @Cort3z - There really is no perfect solution here. You could consider looking for an XML sitemap or crawling known paths, but otherwise it's up to the author of the feed to choose how it'll ultimately be publicized. – Francois Deschenes Jul 17 '13 at 22:39
  • Or, if you want it as a CSS selector, just open the console (`Ctrl`+`Shift`+`J` in Chrome) and enter `document.querySelector('link[type="application/rss+xml"]').href`. – metakermit Nov 22 '13 at 14:32
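The "crawl known paths" idea from the comments could be sketched as follows. The path list is a hypothetical set of common conventions, not a standard, and candidate_feed_urls and looks_like_feed are illustrative names. Fetching each candidate (for example with requests.get) is left out so the sketch stays self-contained. It only builds the candidate URLs and checks whether a response body parses as an RSS, Atom or RDF document.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Hypothetical list of commonly used feed paths; adjust for the sites you target.
COMMON_FEED_PATHS = ["feed", "rss", "rss.xml", "feed.xml", "atom.xml", "index.xml"]


def candidate_feed_urls(site_url):
    """Build candidate feed URLs by appending each common path to the site URL."""
    base = site_url if site_url.endswith("/") else site_url + "/"
    return [urljoin(base, path) for path in COMMON_FEED_PATHS]


def looks_like_feed(body):
    """Return True if body parses as XML whose root is <rss>, <feed> or <RDF>."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return False
    # Drop any "{namespace}" prefix, e.g. "{http://www.w3.org/2005/Atom}feed".
    local_name = root.tag.rsplit("}", 1)[-1]
    return local_name in {"rss", "feed", "RDF"}


print(candidate_feed_urls("http://example.org"))
print(looks_like_feed('<rss version="2.0"><channel></channel></rss>'))
```

Checking the parsed root element rather than the URL or the Content-Type header is a design choice here: servers label feeds inconsistently, while the root element is what a feed reader ultimately depends on.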
16

There are several ways to find the RSS feed of a website.

One is to fetch the page source and search for a link tag with type="application/rss+xml".

Its href attribute holds the URL of the site's RSS feed, if the site advertises one.

Here is a simple Python program that prints the RSS feed(s) of a website, if any.

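import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin


def get_rss_feed(website_url):
    """Print every RSS feed advertised in the page's <link> tags."""
    if website_url is None:
        print("URL should not be null")
        return
    # Fetch the page; the timeout keeps an unresponsive site from hanging the script.
    response = requests.get(website_url, timeout=10)
    # Name the parser explicitly so BeautifulSoup does not have to guess one.
    soup = BeautifulSoup(response.text, "html.parser")
    for link in soup.find_all("link", {"type": "application/rss+xml"}):
        href = link.get("href")
        if not href:
            continue
        # href may be relative, so resolve it against the page URL.
        print("RSS feed for " + website_url + " is --> " + urljoin(website_url, href))


get_rss_feed("http://www.extremetech.com/")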

Save this file with a .py extension and run it (it needs the requests and beautifulsoup4 packages installed). It prints the RSS feed URL of that website, if the site advertises one.

Google also used to provide an API for finding the RSS feeds of a website, the Google Feed API, but it has since been discontinued.

Xan
Ram Narayan
0

You need to loop through all the URLs on the website and find one that contains "rss".

That method won't work in some cases, for example when the URL in the href attribute looks something like feed.xml. In that case, loop through all tags that contain both href and rss, then read the URL from the href attribute.

If you want to do this in the browser, press CTRL+U to view the source, then CTRL+F to open the find box, and type rss. The RSS feed URL should show up right away.
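A minimal sketch of that scan, under some assumptions: it treats "feed" and "atom" as keywords alongside "rss", it looks at every attribute value of a tag that has an href (so a link whose href is feed.xml but whose type mentions rss still matches), and the names FeedHintParser and find_feed_like_links are invented for the example. Being a keyword heuristic, it can return false positives such as a "/feedback" page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Keywords that hint at a feed; "feed" and "atom" are additions to the answer's "rss".
KEYWORDS = ("rss", "feed", "atom")


class FeedHintParser(HTMLParser):
    """Collect href values from tags whose attributes mention a feed keyword."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.matches = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        # Search every attribute value, not just href, for a keyword.
        haystack = " ".join(value for value in attr.values() if value).lower()
        if any(keyword in haystack for keyword in KEYWORDS):
            url = urljoin(self.base_url, href)
            if url not in self.matches:
                self.matches.append(url)


def find_feed_like_links(html, base_url):
    """Return absolute URLs of links that look feed-related, in page order."""
    parser = FeedHintParser(base_url)
    parser.feed(html)
    return parser.matches


page = """
<link rel="alternate" type="application/rss+xml" href="/feed.xml">
<a href="/about">About</a>
<a href="http://example.org/blog/rss">Subscribe</a>
"""
print(find_feed_like_links(page, "http://example.org/"))
# prints ['http://example.org/feed.xml', 'http://example.org/blog/rss']
```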

Stefan Đorđević
0

Firefox's Tools menu now has a "Page Info" command. One of the tabs in that tool displays discovered feed info.

npskirk
-3

I needed to find sites with RSS feeds, and I was able to do that with Visual Studio (VB). The following code is just a fragment. It reads the page line by line and reports any line that references an "/rss" path. The original version died after the last line because it called IndexOf on a line that was Nothing; the loop below checks for Nothing first. That was all I needed, so I never polished it further, but it worked for me.

Imports System.Net
Imports System.IO

...
    Dim request As WebRequest = WebRequest.Create("http://www.[site]")

    Dim response As WebResponse = request.GetResponse()
    Dim responseStream As Stream = response.GetResponseStream()
    Dim reader As New StreamReader(responseStream)

    ' ReadLine returns Nothing at the end of the stream, so test before using the line.
    Dim line As String = reader.ReadLine()
    Dim intPos As Integer

    Do While line IsNot Nothing
        intPos = line.IndexOf("/rss")
        ' IndexOf returns -1 when there is no match; 0 is a valid match position.
        If intPos >= 0 Then
            MessageBox.Show(line + " " + intPos.ToString())
        End If
        line = reader.ReadLine()
    Loop

....