2

When parsing an xml document for its nodes or attributes, if the document is large, I would have a bunch of ifs and else statements.

Obviously, 100+ ifs does not make up maintainable code in the long run.

Rather than doing this, is there another better way? I read on Hanselman's blog about a friend of his who had the same situation and wrote loads of ifs/else if and generally poor code. Hanselman provided some snippets of a more maintainable way but the entire code isn't available so it's a little hard to understand exactly what (the whole picture) is going on. Life after if, else

I am using .NET 3.5 SO I have the full power of extension methods and LINQ available to me. However, I use .NET 2.0 a work so would also appreciate any solutions in v2.0. :)

My code looks very similar to the problem on Hanselman's site:

if (xmlNode.Attributes["a"].Value == "abc" {

}
else if (xmlNode.Attributes["b"].Value == "xyz"
{
wt = MyEnum.Haze;
}

I could just have a dictionary storing the values I am looking for as keys and perhaps a delegate in the value (or whatever I want to happen on finding a required value), so I could say if (containskey) get delegate and execute it, in pseudocode.

This sort of thing goes on and on. Obviously very naive way of coding. I have the same problem with parsing a text document for values, etc.

Thanks

GurdeepS
  • 65,107
  • 109
  • 251
  • 387

8 Answers8

3

If you need to map <condition on xml node> to <change of state> there's no way to avoid defining that mapping somewhere. It all depends on how many assumptions you can make about the conditions and what you do under those conditions. I think the dictionary idea is a good one. To offer as much flexibility as possible, I'd start like this:

Dictionary<Predicate<XmlNode>, Action> mappings;

Then start simplifying where you can. For example, are you often just setting wt to a value of MyEnum like in the example? If so, you want something like this:

Func<MyEnum, Action> setWt = val => 
    () => wt = val;

And for the presumably common case that you simply check if an attribute has a specific value, you'd want some convenience there too:

Func<string, string, Predicate<XmlNode>> checkAttr = (attr, val) => 
    node => node.Attributes[attr] == val;

Now your dictionary can contain items like:

 ...
 {checkAttr("a", "abc"), setWt(MyEnum.Haze)},
 ...

Which is nice and terse, but also isn't restricted to the simple <attribute, value> to <enum> mapping. OK, so now you have a big dictionary of these condition-action pairs, and you just say:

foreach(DictionaryEntry<Predicate<XmlNode>, Action> mapping in mappings)
{
     if (mapping.Key(xmlNode))
     {
         mapping.Value();
         break;
      }
}

If you avoid the lambda syntax and the dictionary initializers, you should be able to do that in 2.0.

1

The link you are referring to spells out one of my favorite approaches - populating a dictionary and using it as a map from your xml attributes to the values you're setting, etc.

Another "trick" I've used is taking an extension on that. If your logic around containing a specific attribute is more than just setting a value, you can make a dictionary of attribute names (or values) to delegates, where the delegate sets your value and optionally performs some logic.

This is nice because it works in .net 2 and .net3/3.5. The delegates can be nicer to setup in .net 3.5, though.

Once you have the map, then you can do a foreach loop on all of your attributes, and just lookup the delegate, if it exists, call it, if it doens't, move on/throw/etc - all up to you.

Reed Copsey
  • 554,122
  • 78
  • 1,158
  • 1,373
1

What you're doing here is executing a list of tests. For each test, if a predicate is true, execute an action. When a test passes, stop processing the list. Right?

A couple of people have suggested using a dictionary, but the problem with using a dictionary is that you don't control the order of the items in it. If you want to perform the tests in a specific order (which, as stated, you do), that's not going to work. So a list seems like the way to go.

Here's a functional way to do this, assuming that the predicates are examining an XmlElement.

Your tests are instances of a class:

class Test
{
    string Predicate { get; set; }
    Action Verb { get; set; }

    Test(string predicate, Action verb)
    {
       Predicate = predicate;
       Verb = verb;
    }

    bool Execute(XmlElement e)
    {
        if (e.SelectSingleNode(Predicate) != null)
        {
            Verb();
            return true;
        }
        return false;
    }
}

To populate the list of tests:

List<Test> tests = new List<Test>();
tests.Add(new Test("@foo = 'bar'", Method1));
tests.Add(new Test("@foo = 'baz'", Method2));
tests.Add(new Test("@foo = 'bat'", Method3));

To execute the tests:

foreach (Test t in tests)
{
   if (t.Execute()) break;
}

You've eliminated a lot of if/else clutter, but you've replaced it with this:

void Method1()
{
   ... do something here
}

void Method2()
{
   ... do something else here
}

If your method naming is good, though, this results in pretty clean code.

To use .NET 2.0, I think you need to add this to the code:

public delegate void Action();

because I think that type was defined in 3.0. I could be wrong.

Robert Rossney
  • 94,622
  • 24
  • 146
  • 218
0

Well, I would use LINQ in 3.5. However, have you thought about using a typed dataset; is this a possibility or is the schema too loose? You could infer the schema and still reduce a lot of the gobbeldy-gook code. This is one approach.

BobbyShaftoe
  • 28,337
  • 7
  • 52
  • 74
0

Depending on the document and your scenario and what you use the if/elses for... if it's for validation of your XML document, validate it against a schema. If it validates, you can assume safely that certain elements are present...

Emiel
  • 1,474
  • 1
  • 13
  • 16
0

It's a bit hard to tell what you need to do. If it's to set one variable based on an XML attribute then the one line approach Hanselman alluded to is the most elegant.

MyEnum wt = (MyEnum)Enum.Parse(typeof(MyEnum), xmlNode.Attributes["a"].Value, true);  

From the brief example you provided it looks like you may need to set the variable based on different XML attributes and if that's the case you may not be able to get around the need for a big if/else block.

If you do have a semblance of structure in the XML you are processing it can sometimes be easier to process an XML node as a DataRow. If your XML is all over the place then this approach isn't much good.

DataSet xmlDerivedSet = new DataSet();
xmlDerivedSet.ReadXml(xmlFilename);

foreach (DataRow row in xmlDerivedSet.Tables[0].Rows)
{
    MyClass xmlDerivedClass = new MyClass(row);
}
sipsorcery
  • 30,273
  • 24
  • 104
  • 155
0

If you are doing processing for each big node, you might also want to have some specific typed xml reader classes to more cleanly integrate separate it from the actual logic of the application.

Lets say the processing you are doing is for some good amount of customer data you are receiving in Xml. You could define a class like:

public class CustomerXmlReader
{
    public class CustomerXmlReader(XmlReader xml){}
    public Customer Read()
    { // read the next customer
    }
}

This way the rest of the application just keep working with the Customer object, and you avoid mixing it with the Xml processing.

eglasius
  • 35,831
  • 5
  • 65
  • 110
0

What Scott Hanselman is describing, once you clear away the implementation details, is a straightforward table-driven method. These are discussed in many books, such as Steve McConnell's "Code Complete", chapter 12 (either edition.) Also discussed on Stack Overflow, here.

Community
  • 1
  • 1
cygil
  • 3,614
  • 1
  • 18
  • 10