
Is there a way to tell the XML transformer to sort the attributes of every tag in a given XML document alphabetically? So let's say:

<MyTag paramter1="lol" andTheOtherThing="potato"/>

Would turn into

<MyTag andTheOtherThing="potato" paramter1="lol"/>

I saw how to format it from the examples I found here and here, but sorting the tag attributes would be the last issue I have.

I was hoping there was something like:

transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty(OutputKeys.SORTATT, "yes"); // <-- no such thing

The documentation seems to confirm that there is no such property: http://docs.oracle.com/javase/1.4.2/docs/api/javax/xml/transform/OutputKeys.html

filippo
  • If you can persuade the transformer to somehow use "canonical form", attributes should come out sorted alphabetically. Xerces has support for this in its [DOMConfiguration](http://xerces.apache.org/xerces2-j/javadocs/api/org/w3c/dom/DOMConfiguration.html). It could be a start. – forty-two Feb 08 '12 at 21:08

1 Answer


As forty-two mentioned, you can convert the XML to canonical XML, which will order the attributes alphabetically for you.

In Java we can use something like Apache's Canonicalizer:

org.apache.xml.security.c14n.Canonicalizer

Something like this (assuming that the Document inXMLDoc is already a DOM):

import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.apache.xml.security.c14n.Canonicalizer;
import org.w3c.dom.Document;

// The Apache XML Security library has to be initialized once before
// Canonicalizer.getInstance(...) can find its algorithms
org.apache.xml.security.Init.init();

// Canonicalize the original DOM
byte[] c14nOutputBytes = Canonicalizer.getInstance(Canonicalizer.ALGO_ID_C14N_WITH_COMMENTS)
        .canonicalizeSubtree(inXMLDoc.getDocumentElement());

// Parse the canonicalized bytes (if you want another DOM) or just use the bytes
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
// factory.set ... (set up the factory as needed)
DocumentBuilder parser = factory.newDocumentBuilder();

// Reparse to get another DOM with the attributes in alphabetical order
Document retDoc = parser.parse(new ByteArrayInputStream(c14nOutputBytes));

Canonicalizing changes other things as well, of course (the result is Canonical XML, see http://en.wikipedia.org/wiki/Canonical_XML), so expect differences beyond the attribute order.
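To make the attribute ordering concrete without any extra library, here is a minimal JDK-only sketch. It is not canonicalization and not what the Canonicalizer does internally: it is a hand-rolled writer that only sorts attributes by name. The class name SortedAttributeWriter and the method toSortedXml are made up for this illustration, and the sketch ignores namespaces, comments, processing instructions and the XML declaration.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of any library.
public class SortedAttributeWriter {

    /** Parses the XML text and re-serializes it with each element's attributes sorted by name. */
    public static String toSortedXml(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        StringBuilder out = new StringBuilder();
        write(root, out);
        return out.toString();
    }

    private static void write(Node node, StringBuilder out) {
        if (node instanceof Text) {
            // Text and CDATA content is written back as escaped character data
            out.append(escape(node.getNodeValue()));
            return;
        }
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // comments, processing instructions etc. are dropped in this sketch
        }
        // TreeMap iterates keys in String order (case-sensitive, uppercase before lowercase)
        Map<String, String> sorted = new TreeMap<>();
        NamedNodeMap attrs = node.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            sorted.put(attrs.item(i).getNodeName(), attrs.item(i).getNodeValue());
        }
        out.append('<').append(node.getNodeName());
        for (Map.Entry<String, String> e : sorted.entrySet()) {
            out.append(' ').append(e.getKey())
               .append("=\"").append(escape(e.getValue())).append('"');
        }
        if (!node.hasChildNodes()) {
            out.append("/>"); // empty elements stay in the short form
            return;
        }
        out.append('>');
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            write(child, out);
        }
        out.append("</").append(node.getNodeName()).append('>');
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toSortedXml("<MyTag paramter1=\"lol\" andTheOtherThing=\"potato\"/>"));
    }
}
```

Run on the tag from the question, this prints <MyTag andTheOtherThing="potato" paramter1="lol"/>. Unlike real canonicalization it leaves everything except the attribute order alone, so it may be a reasonable fallback when the other Canonical XML changes are unwanted.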

Jayson Lorenzen
  • Funny, I got it canonicalized all right, but one of the points in the definition (from Wikipedia) didn't seem to apply: the empty tags weren't expanded to the open/close format. Also, `.normalize()` from org.w3c.dom.Document seemed to do exactly the same. Am I missing something? – filippo Feb 10 '12 at 09:36
  • Sorry, the only example I had handy was using the Apache libs, as I have used them a lot. You can use Document.normalizeDocument with http://docs.oracle.com/javase/6/docs/api/org/w3c/dom/DOMConfiguration.html as well; both should conform to the W3C recommendation at http://www.w3.org/TR/xml-c14n. I do not know why it is not expanding empty elements; it may be a switch that needs to be set. – Jayson Lorenzen Feb 14 '12 at 16:43
  • Okay. Thanks anyway, I will take a look in the documentation. cheers. – filippo Feb 15 '12 at 09:48