I have a big XML file from which I take a small snippet using ReadFrom(). The resulting xmlsnippet contains leaf, sas and kir tags at varying positions (sometimes leaf comes before kir, sometimes the other way around).

Right now I use three foreach loops to get these values, which is bad logic and will be slow when the snippet itself is big.

Is there any way to use a single foreach loop with three if statements inside it to get the values?

arr is a custom array list.

var xdoc = new XDocument(xmlsnippet);
string xml = RemoveAllNamespaces(xdoc.ToString());
foreach (XElement element in XDocument.Parse(xml).Descendants("leaf"))
{
    arr.Add(new Test("leaf", element.Value, 2));
    break;
}
foreach (XElement element in XDocument.Parse(xml).Descendants("sas"))
{
    arr.Add(new Test("sas", element.Value, 2));
    break;
}

foreach (XElement element in XDocument.Parse(xml).Descendants("kir"))
{
    if (element.Value == "0")
        arr.Add(new Test("kir", "90", 2));
    break;
}
rene
peter
  • The first thing you could do: save the return value of `Parse()` in a variable, so that `Parse()` is only executed once. – Benjamin J. Apr 14 '18 at 08:01
  • You don't even need the Parse, if you use XNamespace with the nodes you're looking for. – rene Apr 14 '18 at 08:04
  • break is there ... – peter Apr 14 '18 at 08:05
  • So you want to find the first element with one of those given names, right? – rene Apr 14 '18 at 08:06
  • I can't tell the most efficient way to parse the XML without seeing it. The best way is to walk the file using a common parent of all three elements (leaf, sas and kir). Also, if the XML file is very large it is best to use an XmlReader. I usually use a combination of XmlReader and XElement and can give code if you post a sample of your XML. – jdweng Apr 14 '18 at 08:26

1 Answer

You only need to parse that xmlsnippet once (assuming it fits in memory) and then use XNamespace to qualify the right XElement. There is no need to call RemoveAllNamespaces, which I guess does what its name implies and probably does so in an awful way.

I used the following XML snippet as example input, notice the namespaces a, b and c:

var xmlsnippet = @"<root xmlns:a=""https://a.example.com"" 
    xmlns:b=""https://b.example.com"" 
    xmlns:c=""https://c.example.com"">
    <child>
    <a:leaf>42</a:leaf>
    <a:leaf>43</a:leaf>
    <a:leaf>44</a:leaf>
    <somenode>
    <b:sas>4242</b:sas>
    <b:sas>4343</b:sas>
    </somenode>
    <other>
    <c:kir>80292</c:kir>
    <c:kir>0</c:kir>
    </other>
    </child>
</root>";

Then use LINQ to return either an instance of your Test class or null if no element can be found. That Test class instance is then added to the ArrayList.

var arr = new ArrayList();

var xdoc = XDocument.Parse(xmlsnippet);

// add namespaces
var nsa = (XNamespace) "https://a.example.com";
var nsb = (XNamespace) "https://b.example.com";
var nsc = (XNamespace) "https://c.example.com";

var leaf = xdoc.Descendants(nsa + "leaf").
    Select(elem => new Test("leaf", elem.Value, 2)).FirstOrDefault();
if (leaf != null) {
    arr.Add(leaf);
}
var sas = xdoc.Descendants(nsb + "sas").
    Select(elem => new Test("sas", elem.Value, 2)).FirstOrDefault();
if (sas != null) {
    arr.Add(sas);
}
var kir = xdoc.
    Descendants(nsc + "kir").
    Where(ele => ele.Value == "0").
    Select(elem => new Test("kir", "90", 2)).
    FirstOrDefault();
if (kir != null) {
    arr.Add(kir);
}

I expect this to be the most efficient way to find those nodes if you want to stick with XDocument. If the XML is really huge you might consider using an XmlReader, but that probably only helps if memory is a problem.

If you want to do it in one LINQ query you can do this:

var q = xdoc
    .Descendants()
    .Where(elem => elem.Name.LocalName == "leaf" ||
                   elem.Name.LocalName == "sas" ||
                   // && binds tighter than ||; the parentheses make that explicit
                   (elem.Name.LocalName == "kir" && elem.Value == "0"))
    .GroupBy(k => k.Name.LocalName)
    .Select(k =>
        new Test(
            k.Key,
            // every group has at least one element, so First() is safe
            k.Key != "kir" ? k.First().Value : "90",
            2)
    );
arr.AddRange(q.ToList());

That query looks for all elements named leaf, sas or kir, groups them by element name and then takes the first element in each group. Notice the extra handling when the element name is kir: both the Where clause and the projection in Select need to deal with it. You might want to performance-test this, as I'm not sure how efficient it will be.
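If you prefer something closer to what the question literally asks for (one foreach with if statements inside), a minimal sketch could look like the following. The `Test` class here is a hypothetical stand-in, since the real one is not shown, and `SinglePass.Collect` is a name made up for this example. It matches on the local name, so namespaces are ignored, and like the queries above it keeps the first kir whose value is "0" rather than only inspecting the first kir.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical stand-in for the question's Test class, which is not shown.
public class Test
{
    public string Name { get; }
    public string Value { get; }
    public int Level { get; }

    public Test(string name, string value, int level)
    {
        Name = name;
        Value = value;
        Level = level;
    }
}

public static class SinglePass
{
    // Walks the document once and keeps the first leaf, the first sas,
    // and the first kir whose value is "0" (stored as "90").
    public static List<Test> Collect(string xml)
    {
        var result = new List<Test>();
        bool leafFound = false, sasFound = false, kirFound = false;

        foreach (XElement element in XDocument.Parse(xml).Descendants())
        {
            string name = element.Name.LocalName; // ignores the namespace

            if (!leafFound && name == "leaf")
            {
                result.Add(new Test("leaf", element.Value, 2));
                leafFound = true;
            }
            else if (!sasFound && name == "sas")
            {
                result.Add(new Test("sas", element.Value, 2));
                sasFound = true;
            }
            else if (!kirFound && name == "kir" && element.Value == "0")
            {
                result.Add(new Test("kir", "90", 2));
                kirFound = true;
            }

            if (leafFound && sasFound && kirFound)
            {
                break; // nothing left to look for
            }
        }

        return result;
    }
}
```

Note that the results come back in document order, not necessarily in leaf, sas, kir order, and that the loop stops as soon as all three have been found.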

For completeness here is an XmlReader version:

var state = FoundElement.NONE;
using (var xe = XmlReader.Create(new StringReader(xmlsnippet)))
{
    // ReadElementContentAsString() already moves the reader past the end tag,
    // so Read() is only called when nothing was consumed in this iteration.
    xe.Read();
    while (!xe.EOF)
    {
        if (xe.NodeType != XmlNodeType.Element)
        {
            xe.Read();
        }
        // if we have not yet found a specific element
        else if ((state & FoundElement.Leaf) == 0 && xe.LocalName == "leaf")
        {
            // add it ... do not change the order of those arguments
            arr.Add(new Test(xe.LocalName, xe.ReadElementContentAsString(), 2));
            // keep track of what we already handled
            state |= FoundElement.Leaf;
        }
        else if ((state & FoundElement.Sas) == 0 && xe.LocalName == "sas")
        {
            arr.Add(new Test(xe.LocalName, xe.ReadElementContentAsString(), 2));
            state |= FoundElement.Sas;
        }
        else if ((state & FoundElement.Kir) == 0 && xe.LocalName == "kir")
        {
            var localName = xe.LocalName; // we need this ...
            var cnt = xe.ReadElementContentAsString();  // ... because this moves the reader
            if (cnt == "0")
            {
                arr.Add(new Test(localName, "90", 2));
                state |= FoundElement.Kir;
            }
        }
        else
        {
            xe.Read();
        }
    }
}

And here is the enum with the different states.

[Flags]
enum FoundElement
{
   NONE = 0,
   Leaf = 1,
   Sas = 2,
   Kir = 4
}
rene