If I go to http://www.alexandriava.gov/rss.aspx in my browser, the Chrome developer tools console tells me that the server responds Status 200, and I get some XML and all is well in the cosmos.

If I write some code to access it remotely:

Node.JS

var h = require("http");

h.get("http://www.alexandriava.gov/rss.aspx", function(resp){
    console.log(resp);
}).on("error", function(err){
    console.error("ERROR ===========================");
    console.error(err);
});

I get status code 302, because the server is trying to redirect me to an ASP.NET error page. Just for grins, here are the response headers:

{ date: 'Fri, 06 Jun 2014 03:17:11 GMT',
  server: 'Microsoft-IIS/6.0',
  'x-powered-by': 'ASP.NET',
  'set-cookie':
   [ 'COASTATS=539132b724041115851869612717; domain=.alexandriava.gov; expires=Tue 30-Dec-2031 23:59:59 GMT; path=/',
     'ecm=user_id=0&isMembershipUser=0&site_id=&username=&new_site=/&unique_id=0&site_preview=0&langvalue=0&DefaultLanguage=1033&NavLanguage=1033&LastValidLanguageID=1033&DefaultCurrency=840&SiteCurrency=840&ContType=&UserCulture=1033&dm=www.alexandriava.gov&SiteLanguage=1033; path=/',
     'EktGUID=b56f532c-011d-4ccc-98cb-7a1b3e170fcf; expires=Sat, 06-Jun-2015 03:17:11 GMT; path=/',
     'EkAnalytics=0; expires=Sat, 06-Jun-2015 03:17:11 GMT; path=/' ],
  'x-aspnet-version': '2.0.50727',
  location: '/handle500.aspx?aspxerrorpath=/rss.aspx',
  'cache-control': 'private',
  'content-type': 'text/html; charset=utf-8',
  'content-length': '164' }

Even this very simple C# code

using (var reader = System.Xml.XmlReader.Create("http://www.alexandriava.gov/rss.aspx"))
{
    var rss = System.ServiceModel.Syndication.SyndicationFeed.Load(reader);
    return rss.Description.Text;
}

fails on the initial request. Status: "ProtocolError", Message: "The remote server returned an error: (500) Internal Server Error."

I don't understand enough about HTTP requests to know what the difference is between my browser and my code. The site I'm trying to read from supposedly uses its own RSS feed to generate its front page.

Thinking this might be related (Error when Parsing RSS), I tried the suggested Web.config change.

<configuration>
    <system.net>
        <settings>
            <httpWebRequest useUnsafeHeaderParsing="true" />
        </settings>
    </system.net>
</configuration>

But it didn't help.

What should I try next?

moron4hire

1 Answer

The problem is that the server seems to look for something specific in the "User-Agent" header, and it throws an error when it doesn't match whatever expectation it has.

To solve this, send a User-Agent header similar to the one your browser uses. (You can find it in the Network tab of the Chrome Developer Tools, in the request headers of the request that returned the 200 response.)

I used a user-agent like this:

User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.114 Safari/537.36

With that header set, I was able to download the RSS XML file successfully. When I did not explicitly set the User-Agent, I received the 500 error just as you did.

I believe in Node.JS you would do this to set the user-agent:

var h = require("http");

h.get({
    host: "www.alexandriava.gov", 
    path: "/rss.aspx", 
    headers: { 
      'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13'
    }
}, function(resp){
    console.log(resp);
}).on("error", function(err){
    console.error("ERROR ===========================");
    console.error(err);
});
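
A slightly fuller sketch of the same idea, which reports only the status code and body instead of dumping the whole response object. The helper names `buildOptions` and `fetchFeed` are made up for illustration and are not part of Node's API. It assumes a Node version that provides the global `URL` class.

```javascript
var http = require("http");

// Hypothetical helper: turn a feed URL into options for http.get,
// attaching the given User-Agent header.
function buildOptions(feedUrl, userAgent) {
    var parsed = new URL(feedUrl);
    return {
        host: parsed.hostname,
        port: parsed.port || 80,
        path: parsed.pathname + parsed.search,
        headers: { "User-Agent": userAgent }
    };
}

// Hypothetical helper: request the feed and pass (err, statusCode, body)
// to the callback once the whole body has arrived.
function fetchFeed(feedUrl, userAgent, callback) {
    http.get(buildOptions(feedUrl, userAgent), function (resp) {
        var body = "";
        resp.setEncoding("utf8");
        resp.on("data", function (chunk) { body += chunk; });
        resp.on("end", function () { callback(null, resp.statusCode, body); });
    }).on("error", callback);
}
```

Calling `fetchFeed("http://www.alexandriava.gov/rss.aspx", browserUserAgent, function (err, status, body) { ... })`, where `browserUserAgent` is a string copied from your browser, should then report a 200 if the server behaves as described above.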
drwatsoncode
  • Thanks. When I wrote this and wrote "difference between my browser and my code", user agent came to mind, but I didn't know how to set it in Node and it seemed to be a ridiculously bad bug to have on their end. What should I expect, though, eh? – moron4hire Jun 06 '14 at 12:59
  • I agree. It is a "ridiculously bad bug". I'm sure Node.JS sends some kind of user-agent by default. It is pretty silly for that to crash the server. I just discovered it was the User-Agent by looking at all the headers sent by Chrome and then using C# WebClient to send all the same headers. That worked, so I started removing the headers one by one and found that the 500 error occurred when I removed the User-Agent. – drwatsoncode Jun 06 '14 at 23:41
  • Thanks for your diligence. I was about ready to march into City Hall and demand to see the person in charge :P – moron4hire Jun 07 '14 at 02:27
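
The header-elimination approach described in the comments could be sketched roughly as follows in Node. This is a variation, not the C# WebClient code that was actually used: it leaves out each header individually rather than removing them cumulatively. The names `findRequiredHeaders` and `probe` are hypothetical; `probe(headers, cb)` stands for whatever function sends the request with the given headers and calls `cb(err, ok)` with `ok` set to true when the server answers successfully.

```javascript
// Hypothetical helper: starting from a set of headers known to work,
// drop one header at a time and record which omissions make the probe fail.
function findRequiredHeaders(headers, probe, done) {
    var names = Object.keys(headers);
    var required = [];

    function next(i) {
        if (i === names.length) { return done(null, required); }

        // Copy every header except the one under test.
        var trial = {};
        names.forEach(function (name) {
            if (name !== names[i]) { trial[name] = headers[name]; }
        });

        probe(trial, function (err, ok) {
            if (err) { return done(err); }
            if (!ok) { required.push(names[i]); }
            next(i + 1);
        });
    }

    next(0);
}
```

Against a server that behaves like the one in the question, the result would be expected to contain only `User-Agent`.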