string url = "http://www.example.com/feed.xml";
var settings = new XmlReaderSettings();
settings.IgnoreComments = true;
settings.IgnoreProcessingInstructions = true;
settings.IgnoreWhitespace = true;
settings.XmlResolver = null;
settings.DtdProcessing = DtdProcessing.Parse;
settings.CheckCharacters = false;
var request = (HttpWebRequest)WebRequest.Create(url);
request.Timeout = 900000;
request.KeepAlive = true;
request.IfModifiedSince = lastModified;
var response = (HttpWebResponse)request.GetResponse();
Stream stream;
stream = response.GetResponseStream();
stream.ReadTimeout = 600000;
var xmlReader = XmlReader.Create(stream, settings);

while (!xmlReader.EOF)
{
...

When I try this on a large XML file (one that is also very slow to download), my Azure web app returns a blank page after a couple of minutes.

I saw this on Azure's Failed Request Tracing Logs:

ModuleName: DynamicCompressionModule

Notification: SEND_RESPONSE

HttpStatus: 500

HttpReason: Internal Server Error

HttpSubStatus: 19

ErrorCode: An operation was attempted on a nonexistent network connection. (0x800704cd)

As you can see, I have been "playing around" with the timeout settings. I also tried catching all exceptions, but nothing is caught.

Also, this works without problems when debugging the web app locally on my computer. It could be that the internet connection at my office is better than Azure's, resulting in the XML file being read quickly without any problems.

Any possible workarounds? Edit: I want to keep streaming the XML file (I'm avoiding downloading the whole file because the user has an option to read only the first N entries of the feed). If the problem described above can't be avoided, I would be happy if someone could at least help me display a meaningful message to the user instead of a blank page.

Paul0PT
  • I would definitely bet that the size and download time are causing some content proxy inside the Azure cloud to reset your connection. – dresende Nov 19 '15 at 21:53
  • That could also be caused by a timeout on the server your code connects to (here it would be "http://www.example.com/feed.xml"). – Simon Mourier Nov 23 '15 at 17:07
  • yes @SimonMourier , I believe you might be right. But this is very frustrating when I can't catch the exception and give the user a meaningful message. – Paul0PT Nov 23 '15 at 17:09
  • You might try the [ServiceStack](https://servicestack.net/) library. Check out this other SO thread http://stackoverflow.com/questions/10040680/servicestack-and-returning-a-stream – Chuck Savage Nov 23 '15 at 20:16
  • What is the size of this XML that you are trying to download? – Thiago Lunardi Nov 25 '15 at 19:55

3 Answers


Try using the WebClient class to get the XML file.

string xmlAsString;
using (var xmlWebClient = new WebClient())
{
    xmlWebClient.Encoding = Encoding.UTF8;
    xmlAsString = xmlWebClient.DownloadString(url);
}

// LoadXml parses a string of XML; Load would treat the argument as a file path or URL.
XmlDocument currentXml = new XmlDocument();
currentXml.LoadXml(xmlAsString);
BinaryGuy
  • Thanks Andrei! This seems like a good alternative, but I'm trying to avoid it, as I don't want to download 1GB+ XML files and load them into memory; that's why I would prefer streaming them. I will keep this in mind in case I don't find a better solution. – Paul0PT Nov 20 '15 at 10:48

You could just use

string url = "http://www.example.com/feed.xml";
using (var reader = XmlReader.Create(url))
{

This should work because XmlReader.Create accepts a URL directly. Streaming can then be exposed through yield return. This is probably your best bet, since you let the native component handle the streaming the way it wants. You could even read large text values in chunks via the ReadValueChunk method.
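As a rough sketch of the yield return idea, the iterator below hands back one element at a time, so a caller that only wants the first N entries can stop early without reading the rest of the feed. The class name FeedStreaming, the method StreamElements, and the element name "item" are assumptions made for illustration; substitute whatever your feed actually uses.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

static class FeedStreaming
{
    // Lazily yields every element whose local name equals elementName.
    // Nothing is read from the underlying stream until the caller iterates.
    public static IEnumerable<XElement> StreamElements(XmlReader reader, string elementName)
    {
        reader.MoveToContent();
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.LocalName == elementName)
            {
                // ReadFrom consumes the whole element and leaves the reader on the
                // node that follows it, so Read() must not be called again here.
                yield return (XElement)XNode.ReadFrom(reader);
            }
            else
            {
                reader.Read();
            }
        }
    }
}
```

A caller could then write something like `FeedStreaming.StreamElements(reader, "item").Take(n)` inside the using block; once Take has produced n elements the enumeration stops and disposing the reader closes the connection.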

Another consideration, and the one I would guess is the issue, is the size of your Azure instance. Azure instances have a notoriously small amount of memory unless on the highest tier.

I also do not see you disposing of any of your streams, which can lead to memory leaks and excessive memory usage.
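A minimal sketch of what disposing could look like, assuming the response, the stream and the reader are each wrapped in a using statement. The helper name TryReadFeed, its callback parameter and the message texts are hypothetical. It also turns the exception types this code path commonly throws into a message, but it can only report failures raised inside your own code, not a connection that is dropped somewhere in front of the web app.

```csharp
using System;
using System.IO;
using System.Net;
using System.Xml;

static class FeedReader
{
    // Returns null on success, or a message suitable for showing to the user.
    // "process" is a hypothetical callback that consumes the reader.
    public static string TryReadFeed(string url, Action<XmlReader> process)
    {
        try
        {
            WebRequest request = WebRequest.Create(url);
            // Each using disposes its resource even if process() throws.
            using (WebResponse response = request.GetResponse())
            using (Stream stream = response.GetResponseStream())
            using (XmlReader reader = XmlReader.Create(stream))
            {
                process(reader);
                return null;
            }
        }
        catch (WebException ex)
        {
            return "The feed could not be downloaded: " + ex.Message;
        }
        catch (IOException ex)
        {
            return "The connection was interrupted while reading the feed: " + ex.Message;
        }
        catch (XmlException ex)
        {
            return "The feed is not well-formed XML: " + ex.Message;
        }
    }
}
```

The sketch uses the base WebRequest and WebResponse types rather than the HttpWebRequest cast from the question, which is a simplification; timeouts and headers would still need to be set on the concrete request type.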

Considering that it works on your machine, that most personal computers are at least as powerful as an A3 instance (one tier below the top), and that an IDE cleans up memory leaks locally, it seems plausible that the Azure instance is the issue.

One potential solution would be to use file streaming. Memory streaming and file streaming behave very similarly beyond a certain size: one uses the file system, while the other uses a system file (IIRC pagefile.sys). Converting to a file stream would therefore have little impact on performance, with the drawback of having to clean up the file when you are done. But when dollars are a consideration, disk streaming is cheaper in the Azure world.
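One way the file streaming idea might look is sketched below, assuming a temporary file that deletes itself when closed so that no separate cleanup step is needed. The names TempSpool and SpoolToTempFile are hypothetical. Note that this copies the entire response to disk before any parsing starts, so it trades the early-stop behaviour of pure streaming for a lower memory footprint.

```csharp
using System;
using System.IO;

static class TempSpool
{
    // Copies "source" into a temp file and returns the file rewound to the start.
    // FileOptions.DeleteOnClose removes the file when the returned stream is disposed.
    public static FileStream SpoolToTempFile(Stream source)
    {
        string path = Path.GetTempFileName();
        var file = new FileStream(path, FileMode.Create, FileAccess.ReadWrite,
                                  FileShare.None, 81920, FileOptions.DeleteOnClose);
        try
        {
            source.CopyTo(file);
            file.Position = 0;
            return file;
        }
        catch
        {
            // Do not leave the temp file open (and undeleted) if the copy fails.
            file.Dispose();
            throw;
        }
    }
}
```

The returned FileStream can be passed to XmlReader.Create in place of the network stream and should be wrapped in a using statement by the caller.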

swestner

Try this:

static IEnumerable<XElement> StreamCustomerItem(string uri)
{
    using (XmlReader reader = XmlReader.Create(uri))
    {
        XElement name = null;
        XElement item = null;

        reader.MoveToContent();

        // Parse the file, save header information when encountered, and yield the
        // Item XElement objects as they are created.

        // loop through Customer elements
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element
                && reader.Name == "Customer")
            {
                // move to Name element
                while (reader.Read())
                {
                    if (reader.NodeType == XmlNodeType.Element &&
                        reader.Name == "Name")
                    {
                        name = XElement.ReadFrom(reader) as XElement;
                        break;
                    }
                }

                // loop through Item elements
                while (reader.Read())
                {
                    if (reader.NodeType == XmlNodeType.EndElement)
                        break;
                    if (reader.NodeType == XmlNodeType.Element
                        && reader.Name == "Item")
                    {
                        item = XElement.ReadFrom(reader) as XElement;
                        if (item != null)
                        {
                            XElement tempRoot = new XElement("Root",
                                new XElement(name)
                            );
                            tempRoot.Add(item);
                            yield return item;
                        }
                    }
                }
            }
        }
    }
}

static void Main(string[] args)
{
    XStreamingElement root = new XStreamingElement("Root",
        from el in StreamCustomerItem("Source.xml")
        select new XElement("Item",
            new XElement("Customer", (string)el.Parent.Element("Name")),
            new XElement(el.Element("Key"))
        )
    );
    root.Save("Test.xml");
    Console.WriteLine(File.ReadAllText("Test.xml"));
}

The Main method above writes XML of the following shape to Test.xml:

<?xml version="1.0" encoding="utf-8"?>
<Root>
  <Item>
    <Customer>A. Datum Corporation</Customer>
    <Key>0001</Key>
  </Item>
  <Item>
    <Customer>A. Datum Corporation</Customer>
    <Key>0002</Key>
  </Item>
  <Item>
    <Customer>A. Datum Corporation</Customer>
    <Key>0003</Key>
  </Item>
  <Item>
    <Customer>A. Datum Corporation</Customer>
    <Key>0004</Key>
  </Item>
  <Item>
    <Customer>Fabrikam, Inc.</Customer>
    <Key>0005</Key>
  </Item>
  <Item>
    <Customer>Fabrikam, Inc.</Customer>
    <Key>0006</Key>
  </Item>
  <Item>
    <Customer>Fabrikam, Inc.</Customer>
    <Key>0007</Key>
  </Item>
  <Item>
    <Customer>Fabrikam, Inc.</Customer>
    <Key>0008</Key>
  </Item>
  <Item>
    <Customer>Southridge Video</Customer>
    <Key>0009</Key>
  </Item>
  <Item>
    <Customer>Southridge Video</Customer>
    <Key>0010</Key>
  </Item>
</Root>
