Along the lines of "Best practices to parse xml files?", I am looking for a quick, simple method to take a carefully written XML schema and a bunch of XML instance documents built on it, and read them quickly into some DataSet-like object. (It can be anything with multiple tables, rows, and columns, really, as long as I can present it in grids, add some columns, make some changes, and push it along to granular objects or a database when I'm done interacting with it in aggregate.) I thought I would use XmlReader to do this, but its table creation is pretty unintelligent, perhaps because I'm working with globally declared, reusable types in the xsd? Or because I'm not calling it with the correct parameters?
Detailed sample code is below. To get to the problem, though: XmlReader doesn't seem to have any facility to interpret the relational nature of the schema. A child element must be linked to its parent in a table structure to be meaningful, but the resulting output doesn't do that. Nor does the reader understand that a list is just a collection of tables, set off to avoid trouble with colliding, unrelated siblings (though the sample code has none).
If I want to work with this data primarily in bulk (objects not required really), what is the best-practice, fewest-lines-of-code method to get it into C# while fully respecting the hierarchical structure defined in the xsd and carried out in the xml?
One further note: I'm not married to the XmlReader, and did investigate LINQ to XML and the xsd.exe object generator, but both of these implied writing more code (or cleaning up more) than I would have thought I'd have to do with a rigorous, readable xsd in place. I thought, several days ago, that this would be a quick transformation...
Sample schema, saved as FooSchema.xsd
<?xml version="1.0" encoding="utf-8"?>
<xs:schema
    targetNamespace="http://tempuri.org/FooSchema"
    elementFormDefault="qualified"
    xmlns="http://tempuri.org/FooSchema"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
>
  <xs:complexType name="FooNoteType">
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="dateStamp" type="xs:dateTime" use="optional" />
        <xs:attribute name="isBar" type="xs:boolean" use="optional" />
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
  <xs:complexType name="FooType">
    <xs:sequence>
      <xs:element name="stuff" type="xs:string" />
      <xs:element name="nonsense" type="xs:string" minOccurs="0" />
      <xs:element name="note" type="FooNoteType" minOccurs="0" />
    </xs:sequence>
    <xs:attribute name="isBar" type="xs:boolean" use="required" />
  </xs:complexType>
  <xs:complexType name="FooListType">
    <xs:sequence>
      <xs:element name="foo" type="FooType" maxOccurs="unbounded" />
    </xs:sequence>
  </xs:complexType>
  <xs:element name="fullaFoo">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="foos" type="FooListType" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
Sample instance FooData.xml--note the nested notes
<?xml version="1.0" encoding="utf-8" ?>
<fullaFoo xmlns="http://tempuri.org/FooSchema">
  <foos>
    <foo isBar="false">
      <stuff>first</stuff>
      <nonsense>item</nonsense>
      <note>This is the first note.</note>
    </foo>
    <foo isBar="true">
      <stuff>things</stuff>
      <nonsense>other things</nonsense>
    </foo>
    <foo isBar="false">
      <stuff>yet more information</stuff>
      <nonsense>sound, fury, etc.</nonsense>
    </foo>
    <foo isBar="true">
      <stuff>yes please</stuff>
      <nonsense>no thank you</nonsense>
      <note dateStamp="2012-02-12T00:00:00" isBar="false">RE good manners</note>
    </foo>
    <foo isBar="false">
      <stuff>last</stuff>
      <nonsense>item</nonsense>
    </foo>
  </foos>
</fullaFoo>
Rudimentary expected C# transformation
using System;
using System.Data;

// Loads the schema first so the tables come from the xsd, then reads the instance data.
void Test_ReadXmlToDataset()
{
    string pathRoot = @"C:\SomePath\";
    string pathSchema = pathRoot + "FooSchema.xsd";
    string pathData = pathRoot + "FooData.xml";
    DataSet dsTest = new DataSet("testXml");
    dsTest.ReadXmlSchema(pathSchema);
    dsTest.ReadXml(pathData);
    // rubber would meet road here...
    DisplayTableStructure(dsTest);
}

// Prints every table in the DataSet with its column names and types.
void DisplayTableStructure(DataSet dataSet)
{
    Console.WriteLine("\r\nTable structure \r\n");
    Console.WriteLine("Tables count=" + dataSet.Tables.Count);
    foreach (DataTable table in dataSet.Tables)
    {
        Console.WriteLine("\tTableName='" + table.TableName + "'.");
        Console.WriteLine("\tColumns count=" + table.Columns.Count);
        foreach (DataColumn column in table.Columns)
        {
            Console.WriteLine("\t\tColumnName='" + column.ColumnName
                + "', type = " + column.DataType);
        }
    }
    Console.ReadLine(); // keep the console window open
}
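Since the question is really about parent/child links, it may also help to list dataSet.Relations after loading, alongside the table dump above. What follows is only a minimal sketch: RelationInspector, LoadWithSchema, and DescribeRelations are hypothetical names introduced here for illustration, and the sketch makes no claim about which relations the loader actually produces for FooSchema.xsd.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical helper, not part of the original sample.
static class RelationInspector
{
    // Same load sequence as Test_ReadXmlToDataset: schema first, then data.
    public static DataSet LoadWithSchema(string schemaPath, string dataPath)
    {
        var dataSet = new DataSet("testXml");
        dataSet.ReadXmlSchema(schemaPath);
        dataSet.ReadXml(dataPath);
        return dataSet;
    }

    // Returns one line per DataRelation: name, parent and child tables,
    // the key columns on each side, and whether the relation is nested.
    public static List<string> DescribeRelations(DataSet dataSet)
    {
        var lines = new List<string>();
        foreach (DataRelation relation in dataSet.Relations)
        {
            string parentKeys = string.Join(",",
                relation.ParentColumns.Select(c => c.ColumnName));
            string childKeys = string.Join(",",
                relation.ChildColumns.Select(c => c.ColumnName));
            lines.Add(string.Format("{0}: {1}({2}) -> {3}({4}), nested={5}",
                relation.RelationName,
                relation.ParentTable.TableName, parentKeys,
                relation.ChildTable.TableName, childKeys,
                relation.Nested));
        }
        return lines;
    }
}
```

Calling RelationInspector.DescribeRelations(dsTest) right after DisplayTableStructure(dsTest) and writing each line to the console would show whether any relation ties foo rows to their foos parent, or note rows to their foo parent. An empty list would mean the tables were created without any links between them.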