2

I have read many posts where people have asked about enforcing an order on the attributes of an XML element, and the general response is that it's not legal/required/allowed/relevant/other.

I am not looking for any response saying I shouldn't care about attribute order, so please don't reply if that's your view.

I have a real problem which needs a solution. A large corporate product treats the following two elements as different in the latest version of their product:

<objquestion allowmultiple="true" id="7432" idtext="7433" idvar="7429" parent="7430" questiontype="multchoice">

<objquestion id="7432" idtext="7433" idvar="7429" parent="7430" questiontype="multchoice" allowmultiple="true">

Particularly, if the "allowmultiple" attribute is after the "questiontype" it acts as a modifier to the question type. If it's before, it's ignored - it shouldn't be.

So, they are unlikely to fix their product in the short term.

I am manipulating this XML content using

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
doc = dbf.newDocumentBuilder().parse(new InputSource(path));

and the internal implementation will sort the attributes in the DOM node map. When it's written back to the file it writes the attributes in their now sorted order. I have a lot of code that is playing around with the Document object using XPath.

When I have finished manipulating the Document I currently write it back with

Transformer xformer = TransformerFactory.newInstance().newTransformer();
xformer.transform(new DOMSource(lc.getDocument()), new StreamResult(new File(paths[1])));

What I need to be able to do is to ensure the allowmultiple attribute is written AFTER the questiontype.

I have tried to understand if I can either affect the serialisation used to write the DOM tree back or if I can simply substitute a different implementation that does not parse the attributes into a sorted map initially. I guess both would work, but I've not been able to find out how to do this.

I looked at LSSerializer, but I am not sure how I can intercept that particular element. Would I have to extend a FileOutputStream and look for something?

I have read that SAX might not do the initial sorting, but I need to be able to drop in the parser without much new code and am not so strong with the whole XML world.

Can anyone suggest a way to do this?

Deduplicator
adb

3 Answers

4

The next Saxon release (9.5, due imminently) has a serialization attribute that allows you to control attribute order. It was added for legitimate use cases (it can improve human readability to have id attributes always come first, for example), and I slightly regret that it's going to end up being used for use cases like yours that result from the incompetence and irresponsibility of the programmers employed by large corporations, but so be it: if it solves a problem, I won't cry.

Michael Kay
  • Thanks Michael, that sounds promising. I'll look at that. I'm a bit at sea with all the different frameworks, so can you give me a pointer on which Saxon classes I'd need to serialize the org.w3c.dom.Document I have? – adb Apr 17 '13 at 06:01
  • As I say, this is a "next release" feature. But the easiest way to serialize a DOM using Saxon's serializer is to get an identity transformer using `new net.sf.saxon.TransformerFactoryImpl().newTransformer()`, set the serialization parameters you want, and then call the transform() method supplying a DOMSource and a StreamResult. – Michael Kay Apr 17 '13 at 07:24
  • I assume when you talk about setting the serialization parameters you are referring to the OutputProperties of the Transformer, but I'm wondering how the control would be done. Is there any discussion or info on how this will work? If the attributes are already parsed and sorted during the parsing process, I'm not sure I understand how this could be controlled during transformation output. – adb Apr 18 '13 at 06:58
  • Saxon 9.5 is now out, and for reference the feature is here: http://www.saxonica.com/documentation/index.html#!extensions/output-extras/attribute-order. It's a serialization feature because it controls the way the result tree is converted back to lexical XML. – Michael Kay Apr 19 '13 at 13:49
0

It sounds like a hack, but you can rename this attribute to something like x1allowmultiple so that it sorts last:

  • search and replace all occurrences of allowmultiple with x1allowmultiple
  • do the processing and create the output file with x1allowmultiple
  • search and replace all occurrences of x1allowmultiple with allowmultiple
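
The three steps above can be sketched in Java, with the first rename done on the DOM rather than on the raw text. This is only an illustration under assumptions: the class name `RenameHack` and the method `roundTrip` are hypothetical, and the trick works only if the serializer writes attributes in name order, which is the behaviour the question reports for the default JDK implementation.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
final class RenameHack {
    static final String REAL = "allowmultiple";
    static final String TEMP = "x1allowmultiple"; // temporary name that sorts after "questiontype"

    static String roundTrip(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            // Step 1: give the attribute its temporary name on every objquestion element.
            NodeList nodes = doc.getElementsByTagName("objquestion");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                if (e.hasAttribute(REAL)) {
                    e.setAttribute(TEMP, e.getAttribute(REAL));
                    e.removeAttribute(REAL);
                }
            }

            // Step 2: (any other DOM/XPath processing goes here) then serialize.
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter w = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(w));

            // Step 3: restore the real name in the serialized text.
            return w.toString().replace(" " + TEMP + "=", " " + REAL + "=");
        } catch (Exception ex) {
            throw new IllegalStateException("round trip failed", ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(
            "<root><objquestion allowmultiple=\"true\" id=\"7432\" questiontype=\"multchoice\"/></root>"));
    }
}
```

The final replace is plain text substitution, so it would also rewrite the temporary name if it happened to appear inside an attribute value preceded by a space; choosing a temporary name that cannot occur in the data avoids that.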
abinet
  • Thanks for the idea. That is my last ditch way to solve the problem, i.e. just run the final XML through a bit of sed to shift the attrs around. – adb Apr 17 '13 at 06:04
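
The sed-style fallback mentioned in the comment above could also be done in Java on the serialized string, once the Transformer has written it. The following is a minimal sketch, not a general XML rewriter: the class `AttributeReorder` and the method `moveAllowMultipleLast` are hypothetical names, and it assumes attribute values are double-quoted and contain no literal `>` character.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class AttributeReorder {
    // An <objquestion ...> start tag (assumes no '>' inside attribute values).
    private static final Pattern START_TAG = Pattern.compile("<objquestion\\b[^>]*>");
    // The allowmultiple attribute together with its leading whitespace.
    private static final Pattern ALLOW = Pattern.compile("\\s+allowmultiple=\"[^\"]*\"");

    static String moveAllowMultipleLast(String xml) {
        Matcher tags = START_TAG.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (tags.find()) {
            String tag = tags.group();
            Matcher attr = ALLOW.matcher(tag);
            if (attr.find()) {
                String attrText = attr.group().trim();
                // Cut the attribute out of the tag...
                String without = tag.substring(0, attr.start()) + tag.substring(attr.end());
                // ...and put it back just before the closing ">" or "/>".
                int close = without.endsWith("/>") ? without.length() - 2 : without.length() - 1;
                String head = without.substring(0, close).replaceAll("\\s+$", "");
                tag = head + " " + attrText + without.substring(close);
            }
            tags.appendReplacement(out, Matcher.quoteReplacement(tag));
        }
        tags.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String in = "<objquestion allowmultiple=\"true\" id=\"7432\" idtext=\"7433\" "
                  + "idvar=\"7429\" parent=\"7430\" questiontype=\"multchoice\">";
        System.out.println(moveAllowMultipleLast(in));
    }
}
```

Moving the attribute to the end of the tag guarantees it follows questiontype wherever that attribute sits, and tags without allowmultiple pass through unchanged.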
0

You can use this JAXB approach; see this example for element attribute ordering.

Community
MMendes