I am having problems parsing a feed in C#.
I cannot get the authors of the feed to change it, so I have to handle it on my side.
I have tried passing the feed URL straight into XmlDocument.Load. I have also tried downloading it as text with WebClient, trimming the white space that for some reason appears in front of it, and then loading it with the LoadXml method.
You can see an example of the feed here: http://scotjobsnet.co.uk.ni.strategiesuk.net/testfeed.xml
I cannot get past either loading directly from the URL:
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(feedURL);
Or loading from a string:
XmlDocument xmlDoc = new XmlDocument();
string feedAsString = "";
// get from web as string
var webClient = new WebClient();
// Tell them who we are for white listing
webClient.Headers.Add("user-agent", "Mozilla/5.0 (compatible; Job Feed Importer;)");
// fetch feed as string
var content = webClient.OpenRead(feedURL);
var contentReader = new StreamReader(content);
var rssFeedAsString = contentReader.ReadToEnd();
rssFeedAsString = rssFeedAsString.Trim(); // remove any white space before the feed
xmlDoc.LoadXml(feedAsString);
The errors I get are:
Root element is missing.
Could not extract first items from feed string; Error The element with name 'jobs' and namespace '' is not an allowed feed format.
I want to use the XPath /jobs/job to loop through the feed nodes.
I have parsed feeds like this before with XmlDocument, passing in just a URL or, failing that, a string.
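To make the goal concrete, here is a minimal sketch of the kind of loop I am after once the document loads. Only the jobs and job element names come from the feed; the title child element, the JobFeedSketch class, the ReadJobTitles method and the inline sample XML are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class JobFeedSketch
{
    // Returns the text of the <title> child of every /jobs/job node.
    // "title" is a hypothetical child element used only for illustration;
    // a job without one contributes an empty string.
    public static List<string> ReadJobTitles(string feedXml)
    {
        var titles = new List<string>();

        var xmlDoc = new XmlDocument();
        xmlDoc.LoadXml(feedXml); // throws XmlException if the string is empty or malformed

        // Note: no trailing slash, "/jobs/job/" is not a valid XPath expression.
        XmlNodeList jobNodes = xmlDoc.SelectNodes("/jobs/job");
        foreach (XmlNode jobNode in jobNodes)
        {
            XmlNode titleNode = jobNode.SelectSingleNode("title");
            titles.Add(titleNode != null ? titleNode.InnerText : string.Empty);
        }

        return titles;
    }

    public static void Main()
    {
        // Made-up sample standing in for the real feed content.
        string sample =
            "<jobs>" +
            "<job><title>Developer</title></job>" +
            "<job><title>Tester</title></job>" +
            "</jobs>";

        foreach (string title in ReadJobTitles(sample))
        {
            Console.WriteLine(title);
        }
    }
}
```

With the real feed, the string passed to ReadJobTitles would be the trimmed text downloaded by WebClient rather than the inline sample.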
I am thinking of resorting to regular expressions to loop through the feed, using an expression such as <job>[\s\S]+></job>.
However I would rather use standard methods.
As I cannot get the feed changed, can anyone tell me what is wrong with the feed or the way I am parsing it? Forgive the use of var; I just nicked a snippet of code from an example that was using it. I am using strong types everywhere else and will convert it once I get it working.
Any help would be much appreciated.
Thanks