C# Parsing XML with XmlReader and XmlWriter

Question

I am parsing this XML:

<?xml version="1.0" encoding="ISO-8859-2"?>
<tabela_kursow typ="A" uid="20a219">
   <numer_tabeli>219/A/NBP/2020</numer_tabeli>
   <data_publikacji>2020-11-09</data_publikacji> 
       <pozycja>
          <nazwa_waluty>bat (Tajlandia)</nazwa_waluty>
          <przelicznik>1</przelicznik>
          <kod_waluty>THB</kod_waluty>
          <kurs_sredni>0,1236</kurs_sredni>
       </pozycja>
       <pozycja>
         <nazwa_waluty>dolar amerykanski</nazwa_waluty>
         <przelicznik>1</przelicznik>
         <kod_waluty>USD</kod_waluty>
         <kurs_sredni>3,7787</kurs_sredni>
       </pozycja>
</tabela_kursow>

I am reading it with XmlReader from a URL and writing it with XmlWriter to my own XML file. I also have a list of permitted currencies, so I don't want to write every currency from the source XML to my file. How can I efficiently test whether a currency is on my list? I start writing the <pozycja> section before I know which currency it holds, and only when I reach the <kod_waluty> tag can I test whether I want this currency or not. Can I write the preceding lines to a buffer and decide later whether to keep them?

This is my current code, which reads all currencies without any condition:

    public void ApiCall()
    {
        bool today = false;
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Tools.Log("Start processing file: " + cf.rateUrl);
        XmlWriter writer = XmlWriter.Create(cf.outXml);
        writer.WriteStartDocument();

        //*** FOR TESTS ***
        using (XmlReader reader = XmlReader.Create(new StreamReader("LastA.xml", Encoding.GetEncoding(28592))))
        //*** FOR TESTS ***            
        //using (XmlReader reader = XmlReader.Create(new StreamReader(WebRequest.Create(cf.rateUrl).GetResponse().GetResponseStream(), Encoding.GetEncoding(28592))))
        {
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        if (reader.Name.Equals("data_publikacji"))
                        {
                            today = true;
                        }
                        writer.WriteStartElement(reader.Name);
                        while (reader.MoveToNextAttribute())
                        {
                            writer.WriteAttributeString(reader.Name, reader.Value);
                        }
                        break;
                    case XmlNodeType.Text:
                        writer.WriteNode(reader, true);
                        writer.WriteEndElement();
                        if (today)
                        {
                            writer.WriteStartElement("our_date");
                            var settings = new XmlReaderSettings();
                            settings.ConformanceLevel = ConformanceLevel.Fragment;
                            StringReader scontent = new StringReader(string.Format(cf.today.Year + "-" + cf.today.Month + "-" + cf.today.Day));
                            XmlReader ourDate = XmlReader.Create(scontent, settings);
                            writer.WriteNode(ourDate, true);
                            writer.WriteEndElement();
                            today = false;
                        }
                        break;
                    case XmlNodeType.EndElement:
                        writer.WriteNode(reader, false);
                        break;
                }
            }
        }
        writer.WriteEndDocument();
        writer.Close();
    }

Sorry, yes there is. I have updated the original post with a complete XML sample. – Pavel Matras, Nov 16 '20 at 10:24
@PavelMatras, it is much easier to achieve what you need by using XSLT. – Yitzhak Khabinsky, Nov 16 '20 at 16:43
The use of streaming XmlReader/XmlWriter is justified for large XML files, say, hundreds of megabytes. In your case, I would use LINQ to XML: XDocument/XElement. – Alexander Petrov, Nov 17 '20 at 11:04

Culme · Answer 1 · 2020-11-17T13:54:56.260

You can use an XPath expression to select only the nodes where "kod_waluty" matches a list of allowed currencies. In short:

xmlDoc.SelectNodes("//pozycja[contains('THB', kod_waluty)]")

If you work with this fiddle, I'm sure you'll be able to adapt it to what you need! https://dotnetfiddle.net/jv98UZ (link updated 17 Nov)

NB: The example only matches one currency, but you can list more codes (XXX YYY ZZZ) like this:

xmlDoc.SelectNodes("//pozycja[contains('THB XXX YYY ZZZ', kod_waluty)]")
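
One caveat with the expression above: XPath's contains() is a substring test, so a <kod_waluty> holding "HB" (or an empty string) would also match 'THB XXX YYY ZZZ'. A minimal sketch of an exact-match variant follows. The class and method names (XPathCurrencyFilter, SelectPermitted) are hypothetical, and it assumes the currency codes contain no quote characters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;

public static class XPathCurrencyFilter
{
    // Builds an expression such as
    //   //pozycja[kod_waluty='THB' or kod_waluty='USD']
    // and runs it against the document.
    // Assumption: the codes contain no quote characters.
    public static XmlNodeList SelectPermitted(XmlDocument doc, IEnumerable<string> codes)
    {
        string predicate = string.Join(" or ", codes.Select(c => $"kod_waluty='{c}'"));

        // An empty list would produce invalid XPath, so match nothing instead.
        string xpath = predicate.Length == 0
            ? "//pozycja[false()]"
            : $"//pozycja[{predicate}]";

        return doc.SelectNodes(xpath);
    }
}
```

Called as XPathCurrencyFilter.SelectPermitted(xmlDoc, new[] { "THB", "USD" }), this returns only the <pozycja> nodes whose code equals one of the listed values.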

EDIT: If you really like the flow in your example and want to change as little as possible, instead of writing directly to your XmlWriter "writer" you could write to a temporary object (an XmlNode perhaps?), and then once you have validated the currency you can either write the temporary object to "writer", or disregard it in case it's not valid.
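
The buffering idea can be sketched while staying with XmlReader and XmlWriter. This is one possible approach under stated assumptions, not the only one: each child of the root is read into a temporary XElement with XNode.ReadFrom, and is written out only if it passes the currency check. The names StreamingCurrencyFilter and CopyPermitted are hypothetical. The sketch assumes the input has no XML namespaces and that the root contains only child elements, as in the sample.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;
using System.Xml.Linq;

public static class StreamingCurrencyFilter
{
    // Copies the root element and its children from reader to writer,
    // dropping every <pozycja> whose <kod_waluty> is not in 'permitted'.
    // Only one child element is held in memory at a time.
    public static void CopyPermitted(XmlReader reader, XmlWriter writer, ISet<string> permitted)
    {
        reader.MoveToContent();                  // skip the declaration, land on the root element
        writer.WriteStartDocument();
        writer.WriteStartElement(reader.Name);   // assumption: no namespace prefix on the root
        writer.WriteAttributes(reader, false);   // copy the root attributes (typ, uid)
        reader.Read();                           // step into the root's content

        while (!reader.EOF && reader.NodeType != XmlNodeType.EndElement)
        {
            if (reader.NodeType != XmlNodeType.Element)
            {
                reader.Read();                   // skip whitespace and comments between children
                continue;
            }

            // Buffer one child. ReadFrom leaves the reader on the node
            // after the child's end tag, so no extra Read() is needed here.
            var child = (XElement)XNode.ReadFrom(reader);

            bool isRate = child.Name.LocalName == "pozycja";
            string code = ((string)child.Element("kod_waluty") ?? "").Trim();

            if (!isRate || permitted.Contains(code))
            {
                child.WriteTo(writer);           // keep headers and permitted currencies
            }
        }

        writer.WriteEndElement();
        writer.WriteEndDocument();
    }
}
```

Because only a single <pozycja> is buffered at any moment, memory use stays small even for large inputs, and the decision to keep or drop an element is made after its <kod_waluty> has been read.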

edited Nov 17 '20 at 13:54

answered Nov 16 '20 at 10:48


So the solution is to rewrite my code to use XmlDocument. Am I unable to do it with only XmlReader and XmlWriter? – Pavel Matras Nov 16 '20 at 11:37
I think you want XmlReader.ReadSubtree. That will give you another reader; you can then test for the currency and, if it matches, write it out. It's been a long time since I played with this. – Tony Hopkinson Nov 16 '20 at 13:50
I tried XmlReader.ReadSubtree and it looks nice, but when I work with the new instance created by ReadSubtree (e.g. calling Read()), the original XmlReader instance is also affected ;-( – Pavel Matras Nov 16 '20 at 14:30
@PavelMatras - that's correct. According to the [docs](https://learn.microsoft.com/en-us/dotnet/api/system.xml.xmlreader?view=net-5.0), `XmlReader` *Represents a reader that provides fast, noncached, forward-only access to XML data.* As such it doesn't have any way to look back to previous values, or peek at upcoming values. If your file is large, you can read it in chunks using `XElement.ReadFrom` as shown in [How to read large xml file without loading it in memory and using XElement](https://stackoverflow.com/a/18282052/3744182). – dbc Nov 17 '20 at 06:53
@PavelMatras If you want to work with an XmlReader later on, it is no problem to create an XmlReader from the (filtered) XmlDocument. There are many ways to achieve what you are looking for; to me it seems a good idea to get rid of the unwanted nodes first, and then start iterating over the ones that are left. – Culme Nov 17 '20 at 08:02
If speed, memory usage and overall optimization are factors here, certain methods should probably be avoided. I haven't really given that much thought here, to be honest. – Culme Nov 17 '20 at 08:10
I've only ever used XmlReader/XmlWriter for massive files. I'm sure there are other use cases, but most of the time it's just easier to read the entire thing in, manipulate it and write it out. – Tony Hopkinson Nov 17 '20 at 10:25
Also, surely ReadSubtree advancing the parent reader as well is not an issue if you write the subtree to the writer. – Tony Hopkinson Nov 17 '20 at 10:28
