
I have been looking all over for the best way to update XML in a file. I have just switched over to using XmlReader (coming from the XDocument method) for speed (not having to read the entire file into memory).

My XmlReader method works perfectly: when I need to read a value, it opens the XML, reads ONLY up to the node needed, then closes everything. It's very fast and effective.

Now that I have that working, I want to write a method that UPDATES XML that is already in place. I would like to keep to the same idea and ONLY read into memory what is needed: read up to the node I'm changing, then use the writer to UPDATE that value.

Everything I have seen has an XmlReader reading while an XmlWriter writes everything back out. If I did that, I assume I would have to let it run through the entire file, just like XDocument would. As an example, see this answer.

Is it possible to just use the reader, read up to the node I'm trying to edit, and then change the inner XML or something?

What's the fastest and most efficient method to update XML in a file?

  • I would like to only read into memory what I'm trying to edit, not the whole file.
  • I would also like to account for nodes that do not exist (that need to be added).
Arvo Bowen
  • Are you asking how to update the XML file in place? Because XML isn't designed for that. Or are you asking how to create a new file by modifying the old file without reading the entire old file into memory? If the latter, see http://blogs.msdn.com/b/mfussell/archive/2005/02/12/371546.aspx – dbc Mar 13 '16 at 17:57
  • Indeed I'm asking about "how to update it in place". Is that not possible? If I have a HUGE xml file and I want to update just the first node loading the entire XML file would take forever. I want to avoid that. As far as your second question the answer I referenced does that... – Arvo Bowen Mar 13 '16 at 18:13
  • Not really, see [Write to existing xml file without replacing it's contents (XmlWriter)](https://stackoverflow.com/questions/23699587/write-to-existing-xml-file-without-replacing-its-contents-xmlwriter): *XML files are just sequential text files. They are not a database or a random-access file. There is no way to just write into the middle of them.* All I can think to do is to manipulate the underlying streams as in https://stackoverflow.com/questions/10252974/write-xml-directly-to-disk-and-append-elements – dbc Mar 13 '16 at 18:24
  • I like using a combination of XmlReader and Linq. See my answer in the following post: http://stackoverflow.com/questions/35736230/reading-xml-and-storing-it-in-sql-server-getting-duplicates/35748711#35748711. Once you get an XmlElement it is very easy to update using Linq methods. – jdweng Mar 13 '16 at 20:30
  • @jdweng mind submitting an answer here? All I see in that answer you referenced is (1) No writing to an xml document (which is what my question is all about) and (2) The use of XDocument (reading the entire xml file) which my question explicitly says not to do. But I welcome an answer that I might be able to accept. ;) Thanks – Arvo Bowen Mar 13 '16 at 23:30
  • On the disk an XML file is just like a text file. If you happen to change the length of the file in the middle, it definitely needs to be fully rewritten to the disk. However, I think you should be able to just stream the data from the source, change what you need in between and then just write the result back on the disk without ever reading the full file in memory. You might just need a custom reader or writer. – xjuice Mar 14 '16 at 07:44
  • I'm not reading the entire xml at one time. I'm reading one node at a time (the tag "result") which repeats a lot of times. – jdweng Mar 14 '16 at 09:09
  • @jdweng No matter how you cut it `XmlDocument doc = new XmlDocument(); doc.Load(url);` loads the ENTIRE xml file at the location "url". That was what I was talking about. But that's completely irrelevant... I know how to read using XmlReader little by little just like you do later on in your code. That whole answer is based on reading and nothing to do with writing. My question has everything to do with writing. That's why I politely suggested that you submit an answer here showing how to accomplish what you are theorising. – Arvo Bowen Mar 14 '16 at 13:23
  • Ignore the Load() method in the code in the other posting. The example was for a web scraping project where the totalresults number was needed from the first web page. Later in the code an XmlReader is used, which doesn't load the entire XML. – jdweng Mar 14 '16 at 14:09

1 Answer


By design, XmlReader represents a "read-only, forward-only" view of the document and cannot be used to update the content. Using the Load method of XmlDocument, XDocument, or XElement will still cause the entire file to be read into memory. (Under the hood, XDocument and XElement still use an XmlReader.) However, you can combine a raw XmlReader with XElement by using the overloads of the Load method that take an XmlReader.

You don't describe your XML structure, but you would want to do something similar to this:

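// Requires: using System.Xml; using System.Xml.Linq;
using (var reader = XmlReader.Create(@"file://c:\test.xml"))
{
    // Note: loading from a reader in its initial state still materializes the whole root element.
    var document = XElement.Load(reader);
    document.Add(new XElement("branch", "leaves"));
    document.Save("Tree.xml");
}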

To find a specific node (for example, with a specific attribute value), you'd want to do something similar to this:

var node = document.Descendants("branch")
                   .SingleOrDefault(e => (string)e.Attribute("name") == "foo");
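
If holding the whole tree in memory is the concern, one option is to stream the file from an XmlReader to an XmlWriter and materialize only the element being changed. The following is a minimal sketch of that idea, not a definitive implementation. The class name `XmlStreamUpdate`, the method `UpdateFirstElement`, and its parameters are hypothetical names chosen for illustration. It assumes the target is the first element with a given local name, that the document has no DTD, and that the output goes to a separate file, because an XML file cannot be edited in the middle in place.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class XmlStreamUpdate
{
    // Copies inputPath to outputPath node by node, replacing the content of the
    // first element whose local name is elementName (hypothetical helper).
    // Returns true if such an element was found and rewritten.
    public static bool UpdateFirstElement(string inputPath, string outputPath,
                                          string elementName, string newValue)
    {
        bool updated = false;
        using (var reader = XmlReader.Create(inputPath))
        using (var writer = XmlWriter.Create(outputPath))
        {
            reader.Read();
            while (!reader.EOF)
            {
                if (!updated && reader.NodeType == XmlNodeType.Element
                             && reader.LocalName == elementName)
                {
                    // Materialize only this element. ReadFrom leaves the reader
                    // on the node after the element, so no extra Read() here.
                    var element = (XElement)XNode.ReadFrom(reader);
                    element.Value = newValue;   // attributes are kept
                    element.WriteTo(writer);
                    updated = true;
                }
                else if (reader.NodeType == XmlNodeType.Element)
                {
                    // Copy only the start tag so that children are still visited.
                    writer.WriteStartElement(reader.Prefix, reader.LocalName,
                                             reader.NamespaceURI);
                    writer.WriteAttributes(reader, false);
                    if (reader.IsEmptyElement)
                    {
                        writer.WriteEndElement();
                    }
                    reader.Read();
                }
                else if (reader.NodeType == XmlNodeType.EndElement)
                {
                    writer.WriteFullEndElement();
                    reader.Read();
                }
                else
                {
                    // Text, comments, the XML declaration, etc.: WriteNode copies
                    // the current node and advances the reader.
                    writer.WriteNode(reader, false);
                }
            }
        }
        return updated;
    }
}
```

The whole file is still read and rewritten once, but only the matched element is ever held as an XElement. A caller could then swap the new file over the original (for example with `File.Replace`), and could treat a `false` return value as the cue to add the missing element by some other means.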
Scott Dorman
  • Just to confirm: all the methods XmlDocument.Load(), XElement.Load() and XDocument.Load() still load the entire file, correct? The way your answer is worded makes it seem as if you can choose whether or not to load the entire file in memory. I was under the impression the only way to stream read/write is using XmlReader/XmlWriter. I thought all the Load() methods load the entire file in memory. – Arvo Bowen Mar 14 '16 at 17:56
  • @ArvoBowen That's almost correct. The `XmlDocument.Load()` will load the entire document into memory. The `XElement.Load()` and `XDocument.Load()` methods do not load the entire document into memory. Think of them as "load on demand". – Scott Dorman Mar 14 '16 at 18:03
  • I would have to disagree... `XDocument doc = XDocument.Load(@"file://c:\test.xml"); File.Delete(@"c:\test.xml"); Console.WriteLine(doc);` shows the entire xml file. As soon as the Load() method is called the entire xml file is in memory. As in my example, I Load() the file, then delete the actual xml file, then show the contents of the doc object and the entire file is returned. – Arvo Bowen Mar 14 '16 at 18:16
  • @ArvoBowen: I understand what you're saying. Both `XDocument.Load()` and `XElement.Load()` use an `XmlReader` under the hood. When you call `Console.WriteLine(doc)`. If you try a similar test but using a raw `XmlReader` and delete the file, do you get the same result? – Scott Dorman Mar 14 '16 at 18:25
  • No, because with XmlReader you have to do `XmlReader reader = XmlReader.Create(@"file://c:\test.xml"); reader.Read(); Console.WriteLine(reader.Value);` and that would only get you the declaration. The reader is COMPLETELY different from the DOM-based objects like XmlDocument and XDocument. The only way that I know of to stream read and write like that is the XmlReader/XmlWriter. Your answer is a good one and I wouldn't mind clicking that check, but it gives readers the wrong impression. – Arvo Bowen Mar 14 '16 at 18:33
  • @ArvoBowen I updated the answer to make it clearer that most of the `Load` methods still read the entire document into memory. You can, however, use an overload of the `Load` method which takes an `XmlReader`. – Scott Dorman Mar 14 '16 at 20:06
  • I know this post is really old, but does this answer need to be updated? For some reason the code isn't working. – inN0Cent Jan 29 '18 at 07:31