
I'm currently working on a program that reads and writes an XML file. While this is a simple task, I'm concerned about future issues.

My code reads the streamed data from the XML and checks every <x> element until one that matches a criterion is found. This works quite fast, since the file currently has about 100 <x> elements, but as more elements are added the task will get much slower, especially if the matching element is the last one in a very large file.

What approach should I take to minimize the impact of this? I was thinking about splitting the file into smaller ones (containing up to 1000 elements each) and reading from several of them at the same time. Is this a proper approach?

I'm coding in C#, in case it's relevant for a language-specific approach.
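
The scan described above looks roughly like this (a minimal sketch; the file name, the "id" attribute, and the matching value are placeholders for illustration):

    using System;
    using System.Xml;

    class Program
    {
        static void Main()
        {
            // Stream through the file and stop at the first <x> that matches.
            // Worst case this still reads the whole file, which is the concern.
            using (XmlReader reader = XmlReader.Create("data.xml"))
            {
                while (reader.Read())
                {
                    if (reader.NodeType == XmlNodeType.Element && reader.Name == "x")
                    {
                        // Hypothetical criterion: an "id" attribute with a given value.
                        if (reader.GetAttribute("id") == "42")
                        {
                            Console.WriteLine("Found the matching <x> element.");
                            break;
                        }
                    }
                }
            }
        }
    }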

Alex Coronas
  • Are you using XML Serialization? – Mihir Dave Mar 05 '18 at 08:09
  • How large do you expect the files to get, and what are your performance requirements? – Jon Skeet Mar 05 '18 at 08:10
  • Huge XML files require using XmlReader so you do not get out-of-memory errors. See my solution at: https://stackoverflow.com/questions/45822054/using-xmlreader-and-xpath-in-large-xml-file-c-sharp – jdweng Mar 05 '18 at 08:18

2 Answers


You should use one of the XML APIs available in .NET; which one depends on the size of the XML files. In this question there is a discussion of XDocument (LINQ to XML) versus XmlReader. To summarize: if your file fits in memory, use XDocument; if not, use XmlReader.
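
For illustration, here is a minimal sketch of both approaches. The file path, the <x> element name, and the "id" attribute used as the match criterion are assumptions for the example, not details from the question.

    using System;
    using System.Linq;
    using System.Xml;
    using System.Xml.Linq;

    static class XmlLookup
    {
        // Fits in memory: load the whole document, then query with LINQ to XML.
        public static XElement FindWithXDocument(string path, string id)
        {
            XDocument doc = XDocument.Load(path);
            return doc.Descendants("x")
                      .FirstOrDefault(e => (string)e.Attribute("id") == id);
        }

        // Too large for memory: stream forward and stop at the first match.
        public static XElement FindWithXmlReader(string path, string id)
        {
            using (XmlReader reader = XmlReader.Create(path))
            {
                while (reader.ReadToFollowing("x"))
                {
                    if (reader.GetAttribute("id") == id)
                    {
                        // Materialize only the matching element.
                        return (XElement)XNode.ReadFrom(reader);
                    }
                }
            }
            return null;
        }
    }

Note that the XmlReader version stops reading as soon as it finds a match, so its memory use stays flat no matter how large the file grows.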

Kristof U.

This sounds like a batch process in your case. Maybe this link will help you: https://www.codeproject.com/Articles/1155341/Batch-Processing-Patterns-with-Taskling. I have never done this in C#, only in Java, but it is a good way to solve this kind of task. Hope it helps.
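
I can't speak for Taskling's API, but as a rough sketch of the batching idea in plain .NET (the batch size of 1000 comes from the question; the file name and the match criterion are hypothetical, and loading via XDocument still assumes the file fits in memory):

    using System;
    using System.Linq;
    using System.Threading.Tasks;
    using System.Xml.Linq;

    class BatchScan
    {
        const int BatchSize = 1000; // batch size suggested in the question

        static void Main()
        {
            // Load the <x> elements once, split them into fixed-size batches,
            // and scan the batches in parallel instead of one long linear pass.
            var elements = XDocument.Load("data.xml").Descendants("x").ToList();

            var batches = elements
                .Select((element, index) => (element, index))
                .GroupBy(pair => pair.index / BatchSize, pair => pair.element);

            Parallel.ForEach(batches, batch =>
            {
                foreach (XElement e in batch)
                {
                    if ((string)e.Attribute("id") == "42") // hypothetical criterion
                        Console.WriteLine("Match found.");
                }
            });
        }
    }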

Dina Bogdan