I've been working on cleaning up a very messy ASP.NET project, and I have a tool that measures project complexity in various ways, so I can show the results of my work: as I clean up, the complexity goes down.
One of my metrics was HTML markup line count, but I've realized this isn't a good measure, because line count is inflated by formatting. This snippet:
<span><em>This is bold</em></span>
should have the same score as the pretty printed version:
<span>
<em>This is bold</em>
</span>
But a simple line count scores the second snippet three times higher (3 lines versus 1).
What would be a better way to compute the complexity of markup, to capture the structural complexity, not just line count?
Update: Commenters asked what I mean by complexity. I mean how much structure the page has. My original example wasn't the best one, because the two snippets are structurally identical. My ultimate goal is to convert sloppy table-driven layouts to CSS, and I want to measure how much "less" code there is when that's done. Simply counting the number of nodes doesn't quite capture the nesting structure. Is there a metric that would capture both the node count and the nesting depth?
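To make the question concrete, here is a minimal Python sketch of one candidate metric, not a settled answer. It counts element nodes, tracks the maximum nesting depth, and sums the depth of every element, so that deeply nested markup scores higher than flat markup with the same node count. The names `StructureMeter` and `measure` and the depth-sum weighting are my own assumptions for illustration. It uses only the standard library's `html.parser`, and it assumes reasonably well-formed markup (unclosed tags would inflate the depth).

```python
from html.parser import HTMLParser

# Void elements never take a closing tag, so they must not change the depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class StructureMeter(HTMLParser):
    """Hypothetical helper: collects node count, max depth, and depth sum."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # depth of the currently open element
        self.node_count = 0   # number of elements seen
        self.max_depth = 0    # deepest nesting level reached
        self.depth_sum = 0    # sum of the depths of all elements

    def _count(self, depth):
        self.node_count += 1
        self.depth_sum += depth
        self.max_depth = max(self.max_depth, depth)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            # Counted as a leaf one level below the current element.
            self._count(self.depth + 1)
            return
        self.depth += 1
        self._count(self.depth)

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: a leaf, depth unchanged.
        self._count(self.depth + 1)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1


def measure(markup):
    """Return structural metrics; whitespace and line breaks are ignored."""
    meter = StructureMeter()
    meter.feed(markup)
    meter.close()
    return {
        "nodes": meter.node_count,
        "max_depth": meter.max_depth,
        "depth_sum": meter.depth_sum,
    }


if __name__ == "__main__":
    print(measure("<span><em>This is bold</em></span>"))
    print(measure("<table><tr><td><span>x</span></td></tr></table>"))
```

With this sketch, both snippets above get the same score (2 nodes, max depth 2, depth sum 3), while a `<table><tr><td><span>` nesting gets a depth sum of 10 against 3 for the equivalent `<div><span>`.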