While validating an XML file, I want to log every text node with empty content. A newline \n is also treated as a text node, but that is not what I want to report. In the XML below, 'parent' has two text nodes with content '\n' that are not interesting to me. The content of 'elem1' is '\n\n', which is an error and must be reported. 'elem2' has valid content. The content of 'book' is empty and must be reported.
In my first attempt I checked each text node for [\n\t\r] and ignored the ones that matched. But that way I would also ignore the text node inside elem1, which should have been reported as an error.
What am I doing wrong? (Note: I have to solve this without XSD validation.)
Update 1: I have added more \n between the elements. Now the first 'parent' node has 5 text nodes with content \n:
<root>
<parent>
<elem1>
</elem1>
<elem2>good content of el2</elem2>
<elem3> half so good
contentof el3</elem3>
</parent>
<parent>
<elem1>
</elem1>
<elem2>good content</elem2>
<elem3>good</elem3>
<elem4></elem4>
</parent>
<book></book>
</root>
Update 2, for more clarity: when a caller calls, say, validate("//parent/*"), I gather all nodes matching the given path and get a NodeList back. Then I start the validation for each node and its children.
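// XPath.evaluate returns Object, so the node-set result has to be cast to NodeList
NodeList result = (NodeList) xpathinstance.evaluate(path, currentNode, XPathConstants.NODESET);
for (int n = 0; n < result.getLength(); n++) {
    // check every matched node (and its children) for empty content
    validateThereAreNoGaps(result.item(n));
}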
When I arrive at the first 'parent' element, it shows 7 children (after the update of the example). Each \n between the element tags is treated as a text node.
As a next attempt, I am now trying to replace all \n with "" to get rid of them...
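To make the rule I am after concrete, here is a minimal sketch of it. It assumes that an element counts as "empty" only when it has no child elements and its text content is blank after trimming, so the whitespace between sibling tags is never looked at on its own. The names GapCheck, findEmptyLeaves and parse are hypothetical and not part of my real code; only the standard DOM calls are real.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: reports leaf elements whose content is only whitespace.
final class GapCheck {

    // Returns the names of all blank leaf elements at or below the given node.
    static List<String> findEmptyLeaves(Node node) {
        List<String> found = new ArrayList<>();
        collect(node, found);
        return found;
    }

    private static void collect(Node node, List<String> found) {
        // Text nodes (including the "\n" between tags) are skipped here;
        // they are only judged as part of their parent element.
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return;
        }
        boolean hasChildElement = false;
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                hasChildElement = true;
                collect(child, found);
            }
        }
        // Assumption: only leaf elements are checked; trim() strips \n, \t, \r and spaces.
        if (!hasChildElement && node.getTextContent().trim().isEmpty()) {
            found.add(node.getNodeName());
        }
    }

    // Parses an XML string into a DOM document (checked exceptions are wrapped).
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(
                "<root><parent>\n<elem1>\n\n</elem1>\n<elem2>good</elem2>\n</parent><book></book></root>");
        System.out.println(findEmptyLeaves(doc.getDocumentElement()));
    }
}
```

With this rule, the '\n' text nodes directly under 'parent' are ignored because 'parent' has child elements, while 'elem1' and 'book' are reported because they are leaves with blank content. Whether mixed content (text next to child elements) should also be checked is left open in this sketch.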