I have about 15,000 XML documents, each held as a string. Each document has about 1,000 nodes on average.

I do not know the node names or the hierarchical depth of the XML. For each document, I need to parse it into a List<string> elements and a List<string> values.

When a node has child nodes, the parent's name is added to List<string> elements and a null or empty string is added to List<string> values.

What are the possible ways of achieving this?

Edit: I suppose I just need to know how to parse one XML document; I can then call the same method in a loop for all 15,000 records.

P.S. I thought of using a Dictionary or a multi-dimensional List to hold key/value pairs, but that wasn't approved because it would significantly affect other applications. So it has to be a List of elements and a List of values.

C.J.
  • You could make an object with two List properties and then make a list of the paired objects. – Bit Apr 24 '14 at 20:58
  • @N4TKD my question is how to parse XML into List elements and List values, NOT how to make 2 Lists "associate" with each other. – C.J. Apr 24 '14 at 21:04
  • http://stackoverflow.com/questions/55828/how-does-one-parse-xml-files – Bit Apr 24 '14 at 21:06
  • @N4TKD You did not read the question properly. I did NOT know the node names and the hierarchical level. Even with LINQ to XML, I need to know at least the node names. I'm looking for a similar wild card like `Select * from table` in SQL. – C.J. Apr 24 '14 at 21:09
  • Please provide input example, and expected output. I don't think you need XQuery for this, XPath will probably suffice, and a sax-like approach might even be more reasonable. – Jens Erat Apr 24 '14 at 21:09
  • @JensErat I'm not sure that providing an example will help; the requirement is so generic that I may receive any kind of XML. Even if I provide an example now and you solve it, the next XML will have completely different nodes and structure. That is why I need something similar to `Select * from table`, where I don't have to know the column names yet can still find all the values. – C.J. Apr 24 '14 at 21:15
  • @C.J. I read it. You have to get all the names in the XML; here is an example: http://stackoverflow.com/questions/847978/c-sharp-how-can-i-get-all-elements-name-from-a-xml-file – Bit Apr 24 '14 at 21:18

1 Answer

You can use LINQ to XML to get all the nodes from the XML. Add `using System.Xml.Linq;` to your parsing class, then grab the data like this:

string xml = "your xml string";
var myXmlData = XElement.Parse(xml);

// Get the names of all nodes
var allNames = (from e in myXmlData.Descendants()
                select e.Name.LocalName).ToList();

// Get the value of each node - empty string for nodes with children
var allValues = (from e in myXmlData.Descendants()
                 select (e.HasElements ? "" : e.Value)).ToList();

This gives you two List<string> objects holding the corresponding names and values for your XML, both in document order, so the entries at the same index belong together.
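Since the same logic has to run for every one of the 15,000 documents, it may help to wrap it in a method. The following is only a sketch under a few assumptions: the class name `XmlFlattener` and the method name `Flatten` are made up for illustration, it fills both lists in a single pass instead of two queries, and it uses `DescendantsAndSelf()` so the root element is included (plain `Descendants()` skips the root). Attributes are ignored.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public static class XmlFlattener
{
    // Hypothetical helper: visits every element once, root included,
    // and returns two parallel lists of element names and values.
    public static (List<string> Elements, List<string> Values) Flatten(string xml)
    {
        var elements = new List<string>();
        var values = new List<string>();

        XElement root = XElement.Parse(xml);

        // DescendantsAndSelf() yields the root first, then all nested
        // elements in document order, whatever their names or depth.
        foreach (XElement e in root.DescendantsAndSelf())
        {
            elements.Add(e.Name.LocalName);

            // Parent nodes get an empty string; leaf nodes get their text.
            values.Add(e.HasElements ? string.Empty : e.Value);
        }

        return (elements, values);
    }
}
```

For the input `<order><id>7</id><item><name>pen</name></item></order>`, this returns the names `order`, `id`, `item`, `name` and the values `""`, `"7"`, `""`, `"pen"`. Calling `Flatten` inside a `foreach` over your XML strings would handle all the records, one pair of lists per document.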

Daniel Simpkins