I have a 2 GB XML file containing around 2.5 million records. I cannot load it in C#; it throws an OutOfMemoryException. Please help me resolve this in a simple way.
- Provide a [MCVE] in order to get help – Souvik Ghosh May 24 '18 at 06:39
- Hello Simran - Why not use XmlReader? – Prateek Shrivastava May 24 '18 at 06:39
- Is your app compiled as 64-bit? Are you using https://learn.microsoft.com/en-us/dotnet/framework/configure-apps/file-schema/runtime/gcallowverylargeobjects-element ? – mjwills May 24 '18 at 06:39
- Without a [mcve] documenting your specific problems, we can't do much more than point you to [How to parse very huge XML Files in C#?](https://stackoverflow.com/q/15772031), [What is the best way to parse (big) XML in C# Code?](https://stackoverflow.com/q/676274), [Large XML Parsing Efficiently](https://stackoverflow.com/q/29951809) and [How to read large xml file without loading it in memory and using XElement](https://stackoverflow.com/q/2249875). Also, be sure you're loading directly from a `Stream` and not reading into a `string` and parsing that. – dbc May 24 '18 at 06:44
- Set your project to 64-bit (if you can), job done; or parse it incrementally. – TheGeneral May 24 '18 at 06:49
- Combine the responses of Prateek and mjwills: compile as 64-bit AND use `XmlReader`. Don't load the whole file into memory. Don't use `XDocument`/`XmlDocument`/`XmlSerializer`. Write the result of your reading one piece at a time. – xanatos May 24 '18 at 06:51
- Show us the structure of your XML file and what data you want to extract, and I'll give you an XmlReader example. – Alexander Petrov May 25 '18 at 17:15
1 Answer
A simple and general methodology when you have these problems:
- As mjwills and TheGeneral wrote, compile as 64-bit (see the config sketch after this list).
- As Prateek wrote, use `XmlReader` (see the sketch after this list). Don't load the whole file into memory. Don't use `XDocument`/`XmlDocument`/`XmlSerializer`.
- If the size of the output is proportional to the size of the input (for example, you are converting between formats), write the result of your reading one piece at a time; if possible you shouldn't hold the whole output in memory at once. Read an object (a node) from the source file, process it, write the result to a new file or to a database, and discard it. The sketch after this list does exactly that.
- If the output is instead a summary of the input (for example, you are computing some statistics on it), so its size is much smaller than the input's, then it is normally fine to keep it in memory.
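
For the 64-bit point, a minimal sketch: targeting x64 lifts the ~2 GB address-space limit of a 32-bit process, and the `gcAllowVeryLargeObjects` setting mjwills linked is additionally needed only if a single object (one array or string) could itself exceed 2 GB. The property-page wording below is just illustrative of where the setting lives.

```xml
<!-- Project properties -> Build: set "Platform target" to x64
     (or <PlatformTarget>x64</PlatformTarget> in the .csproj). -->

<!-- app.config: only needed when a single object may exceed 2 GB on 64-bit. -->
<configuration>
  <runtime>
    <gcAllowVeryLargeObjects enabled="true" />
  </runtime>
</configuration>
```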

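Here is a minimal sketch of the streaming read-process-write loop described above. The file names, the `<record>` element and its `id`/`name` fields are assumptions for illustration; only one record is materialized at a time (as a tiny `XElement`), and each result is written out immediately instead of being accumulated.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

class Program
{
    static void Main()
    {
        long count = 0;

        // "records.xml", <record> and its id/name fields are hypothetical --
        // substitute the real file name and element names.
        var settings = new XmlReaderSettings { IgnoreWhitespace = true };

        using (XmlReader reader = XmlReader.Create("records.xml", settings))
        using (StreamWriter writer = new StreamWriter("records.csv"))
        {
            reader.MoveToContent();

            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "record")
                {
                    // Materialize only this one <record>; ReadFrom also advances
                    // the reader past the element it consumed.
                    var record = (XElement)XNode.ReadFrom(reader);

                    string id = (string)record.Attribute("id");
                    string name = (string)record.Element("name");

                    // Write the result immediately; nothing accumulates in memory.
                    writer.WriteLine($"{id},{name}");
                    count++;
                }
                else
                {
                    reader.Read();
                }
            }
        }

        Console.WriteLine($"Processed {count} records.");
    }
}
```

Note the `else { reader.Read(); }` branch: since `XNode.ReadFrom` already moves the reader past the element it consumed, calling `Read()` unconditionally at the top of the loop would skip every other record.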
xanatos