I need to create a C# Windows GUI application to load, search, and filter huge XML files (~1 GB). It is not for one specific XML format, so I need a generic solution. I need help approaching the problem: which libraries, frameworks, and UI controls can I use to load a huge XML file, and how can I represent it in the GUI so that I can search and filter it efficiently? I have tried:

  • Loading the XML file into a DataSet and displaying it in a DataGridView. But since my XML is deeply nested, this creates multiple tables, which makes it impossible to show in a single view.

  • Using the Excel interop library to import the XML into a worksheet, then into a DataTable, and displaying that in a DataGridView. It works fine for small files, but for files above 5 MB it keeps loading and becomes unresponsive.

  • I thought of loading it into an SQLite database and then displaying it in a DataGridView, but SQLite does not seem to provide a way to import XML files directly.

Please help me solve this problem. Thank you.

Joel S
  • Have you looked at some of the existing tools? XML Notepad 2007 for instance might give you a few ideas about how to (or how not to) tackle this. – peterG Dec 01 '16 at 14:17
  • You have to use an XmlReader. Simple solution: http://stackoverflow.com/questions/34274568/how-to-read-an-xml-file-by-using-xmlreader-in-c-sharp Complex solution: http://stackoverflow.com/questions/39805526/how-to-read-xml-file-having-different-hierarchy-in-net/39806338#comment67098235_39806338 – jdweng Dec 01 '16 at 14:20
  • @peterG Thanks. XML Notepad 2007 is nice and may be what I need to build my software on. It gave me the idea of representing the XML efficiently as a tree. – Joel S Dec 01 '16 at 18:13
  • @jdweng I saw your solutions. In both the simple and the complex solution, you specify names from the XML for easy representation, e.g. reader.ReadToFollowing("something", ns); but I want a generic solution that reads all of them. Can I use XmlReader to read a huge XML file element by element, or will it crash with an OutOfMemoryException? If so, how can I read it part by part without any prior knowledge? – Joel S Dec 01 '16 at 18:28
  • You can read element by element, but the code can become complicated because reading an element doesn't distinguish between parents and descendants. That is why in the complex solution I only used XmlReader to read a hierarchy of specific tags, and then used LINQ to XML to parse through the tags. It took me over two weeks with the complex solution to get final results. The person who posted the XML only sent me small pieces of the entire file. – jdweng Dec 01 '16 at 18:45
  • @jdweng OK. But I need prior knowledge of the tags and attributes of the XML to implement your method, right? Let's say I don't want to maintain the hierarchy or relationships and just want to populate all the elements, with their attributes or values, as a list. Is that possible? – Joel S Dec 01 '16 at 19:32
  • What happens if you have the same tag name at two different levels of the hierarchy? Often Name is a tag. At one level it could be a company name and at another level a person name. – jdweng Dec 01 '16 at 21:02
  • Yes, that is a problem. Leaving that aside and assuming I am able to handle it, how can I read all of them and represent them in the GUI as a tree-like structure? Holding them in a variable would take a hit on memory, right? Should I use a database then? – Joel S Dec 02 '16 at 08:58
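
The generic element-by-element reading discussed in these comments might look roughly like the sketch below. It is only an illustration under stated assumptions, not code from the thread: the `XmlFlattener` class, the `Flatten` method, and the row shape are hypothetical names. It keeps a list of the currently open element names so that each row carries its full path, which is one way to tell a `Name` under a company apart from a `Name` under a person without knowing the tags in advance.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper for illustration; the names are assumptions, not from the thread.
public static class XmlFlattener
{
    // Lazily yields one row per element start and one row per text node.
    // Only the names of the currently open elements are held in memory.
    public static IEnumerable<(string Path, string Attributes, string Text)> Flatten(TextReader input)
    {
        var settings = new XmlReaderSettings { IgnoreWhitespace = true, IgnoreComments = true };
        using (var reader = XmlReader.Create(input, settings))
        {
            var open = new List<string>(); // names of the elements we are currently inside
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                    {
                        // Must be read before moving onto the attributes.
                        bool isEmpty = reader.IsEmptyElement;
                        open.Add(reader.Name);
                        var attrs = new List<string>();
                        while (reader.MoveToNextAttribute())
                            attrs.Add(reader.Name + "=" + reader.Value);
                        reader.MoveToElement();
                        yield return ("/" + string.Join("/", open), string.Join("; ", attrs), "");
                        // An empty element such as <e/> produces no EndElement node.
                        if (isEmpty) open.RemoveAt(open.Count - 1);
                        break;
                    }
                    case XmlNodeType.Text:
                    case XmlNodeType.CDATA:
                        yield return ("/" + string.Join("/", open), "", reader.Value);
                        break;
                    case XmlNodeType.EndElement:
                        open.RemoveAt(open.Count - 1);
                        break;
                }
            }
        }
    }
}
```

Because the method is an iterator, a caller can wrap a file with `new StreamReader(path)` and stop enumerating after the first few hundred rows, or apply a filter on `Path` or `Text` while streaming, instead of collecting everything into a list first.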

1 Answer

Well, I don't think loading a 1 GB document into memory is a good approach. I am not even sure you can take all the memory you want, because a process has limits on how much memory it can use.

Personally, I would not load the document all at once but in pieces, with pagination, in order to limit memory usage and guarantee good performance.

That being said, if you take this approach you can use the methods you mentioned.
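
One possible way to do the "in pieces, with pagination" part is sketched below. It assumes the file has the common shape of one root element containing many record-like child elements; that assumption, and the names `XmlPager` and `ReadPage`, are mine and not part of the answer. The sketch streams with `XmlReader`, skips the records before the requested page, and materializes only the records on that page as `XElement` objects.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper for illustration; the names are assumptions, not from the answer.
public static class XmlPager
{
    // Returns the direct children of the root element with index in [skip, skip + take).
    public static List<XElement> ReadPage(Stream input, int skip, int take)
    {
        var page = new List<XElement>();
        var settings = new XmlReaderSettings { IgnoreWhitespace = true };
        using (var reader = XmlReader.Create(input, settings))
        {
            reader.MoveToContent();               // position on the root element
            if (reader.IsEmptyElement) return page;
            reader.Read();                        // step to the first node inside the root
            int index = 0;
            while (!reader.EOF && page.Count < take)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Depth == 1)
                {
                    if (index++ < skip)
                        reader.Skip();            // move past the whole subtree without building it
                    else
                        page.Add((XElement)XNode.ReadFrom(reader)); // build this record only
                    // Skip and ReadFrom both leave the reader on the next node,
                    // so no extra Read() call is needed here.
                }
                else if (reader.NodeType == XmlNodeType.EndElement && reader.Depth == 0)
                {
                    break;                        // reached the end of the root element
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return page;
    }
}
```

A call such as `XmlPager.ReadPage(File.OpenRead(path), 1000, 100)` would return records 1000 to 1099 for display in a grid or tree. The trade-off is that every page request re-reads the file from the start, so later pages take longer to reach; memory stays bounded by the page size rather than the file size.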


Useful Links

Reading parts of large files from drive

Sid
  • If I only take part of the XML file using a FileStream, how can I parse it as XML, since the closing tags might not be properly matched? – Joel S Dec 01 '16 at 18:30
  • If you could provide some additional info about the xml structure we can brainstorm about it :) – Sid Dec 01 '16 at 18:36