Say you define the following:

class Person(name: String, age: Int) {
    def toXml =
        <person>
            <name>{ name }</name>
            <age>{ age }</age>
        </person>   
}

val persons = List(new Person("John", 32), new Person("Bob", 43))

Then generate some XML and save it to a file:

val personsXml = 
    <persons>
        { persons.map(_.toXml) }
    </persons>

scala.xml.XML.save("persons.xml", personsXml)

You end up with the following funny-looking text:

<persons>
        <person>
            <name>John</name>
            <age>32</age>
        </person><person>
            <name>Bob</name>
            <age>43</age>
        </person>
    </persons>

Now, of course, this is perfectly valid XML, but if you want it to be human-editable in a decent text editor, it would be preferable to have it formatted a little more nicely.

By changing the indentation at various points in the Scala XML literals (which makes the code look less nice), it's possible to generate variations of the above output, but it seems impossible to get it quite right. I understand why it is formatted this way, but I wonder whether there is a way to work around it.

Knut Arne Vedaa

6 Answers

You can use scala.xml.PrettyPrinter to format it. Sadly this does not work for large documents as it only formats into a StringBuilder and does not write directly into a stream or writer.
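
As a minimal sketch of that suggestion, the snippet below formats an in-memory element into a String. It assumes the scala-xml module is on the classpath; the sample data and the width/indent values (80 and 2) are illustrative choices, not something the answer prescribes.

```scala
import scala.xml.{Elem, PrettyPrinter}

// Hypothetical sample data, written on one line so the literal carries no layout whitespace
val personsXml: Elem =
  <persons><person><name>John</name><age>32</age></person><person><name>Bob</name><age>43</age></person></persons>

// First argument: maximum line width; second argument: indentation step in spaces
val printer = new PrettyPrinter(80, 2)

// format builds the whole result in memory and returns it as a String
val pretty: String = printer.format(personsXml)
println(pretty)
```

Because the whole result is held in memory, this is best suited to documents of moderate size, which matches the caveat above.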

Sagar V
Moritz

I could not find a way to use the PrettyPrinter and also specify the file encoding directly. The "solution" that I found was this:

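import java.io.FileOutputStream
import java.nio.channels.Channels
import scala.xml.{Node, PrettyPrinter}

val Encoding = "UTF-8"

def save(node: Node, fileName: String) = {

    // max width: 80 chars, indent: 2 spaces
    val pp = new PrettyPrinter(80, 2)
    val fos = new FileOutputStream(fileName)
    val writer = Channels.newWriter(fos.getChannel(), Encoding)

    try {
        // write the declaration by hand, since PrettyPrinter only formats the node
        writer.write("<?xml version='1.0' encoding='" + Encoding + "'?>\n")
        writer.write(pp.format(node))
    } finally {
        // closing the writer also closes the underlying channel
        writer.close()
    }

    fileName
}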
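
For comparison, here is a hedged variant of the same idea that lets java.nio.file.Files handle opening and closing the file. The name savePretty and the temp-file usage are hypothetical, and the sketch assumes scala-xml is available; like the version above, it builds the full string in memory first.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, Paths}
import scala.xml.{Node, PrettyPrinter}

// Hypothetical helper: pretty-print a node and write it as UTF-8 with an XML declaration
def savePretty(node: Node, fileName: String): Path = {
  val body = new PrettyPrinter(80, 2).format(node)
  val text = "<?xml version='1.0' encoding='UTF-8'?>\n" + body
  // Files.write opens, writes and closes the file in one call
  Files.write(Paths.get(fileName), text.getBytes(StandardCharsets.UTF_8))
}

// Usage: write a small document to a temporary file
val target = Files.createTempFile("persons", ".xml")
savePretty(<persons><person><name>John</name></person></persons>, target.toString)
```

Encoding the bytes explicitly keeps the file content consistent with the encoding named in the declaration.
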
gerferra

Thanks for the idea of "PrettyPrinter". That helped a lot.

I found this way to write XML elements to a file with proper indentation.

val xmlData = // your xml here

// max width: 80 chars
// indent:     2 spaces
val printer = new scala.xml.PrettyPrinter(80, 2)

scala.xml.XML.save("yourFileName.xml", scala.xml.XML.loadString(printer.format(xmlData)), "UTF-8", true, null)

I would much appreciate any feedback about the performance or any drawbacks of this implementation (using "XML.save()").

hel

This is a modification of @hel's answer that can write to a target location other than the current directory:

import scala.xml.{PrettyPrinter, XML}

val printer = new PrettyPrinter(80, 2)
val targetFile = new java.io.File("./mytargetdir/file.xml")
val prettyDoc = printer.format(document)
// FileWriter would use the platform default charset; encode as UTF-8 to match the declaration
val writer = new java.io.OutputStreamWriter(new java.io.FileOutputStream(targetFile), "UTF-8")
try XML.write(writer, XML.loadString(prettyDoc), "UTF-8", true, null)
finally writer.close()
Andrew Norman

Maybe this will be useful. When writing the code in your text editor, try not to put any extra tabs inside the XML literal, because they will be saved in the XML file.

I mean, your code should look like this:

val personsXml = 
<persons>
   { persons.map(_.toXml) }
</persons>

Instead of this:

val personsXml = 
    <persons>
        { persons.map(_.toXml) }
    </persons>

It worked perfectly for me.
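
To illustrate why the layout of the literal ends up in the file, the sketch below shows that the indentation inside an XML literal is kept as whitespace-only text nodes. It assumes scala-xml is on the classpath; the element names are illustrative, and scala.xml.Utility.trim is shown as one way to drop such nodes rather than as part of the approach above.

```scala
import scala.xml.{Elem, Text, Utility}

// The line breaks and indentation around <person/> become Text children of <persons>
val indented: Elem =
  <persons>
    <person/>
  </persons>

// Collect the children that consist of whitespace only
val whitespaceNodes = indented.child.collect { case t: Text if t.text.trim.isEmpty => t }
println(s"children: ${indented.child.size}, whitespace-only: ${whitespaceNodes.size}")

// Utility.trim returns a copy without whitespace-only text nodes
val trimmed = Utility.trim(indented)
println(trimmed)
```

Trimming first and then pretty-printing lets the source literal stay indented without that indentation leaking into the output.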

Kersh

Adapted from "DOMImplementationLS serialize to String in UTF-8 in Java" and "How to pretty print XML from Java?":

  def cleanXml(xml: String): String = {
    import org.w3c.dom.Node
    import org.w3c.dom.bootstrap.DOMImplementationRegistry
    import org.w3c.dom.ls.DOMImplementationLS
    import org.w3c.dom.ls.LSSerializer
    import org.xml.sax.InputSource
    import javax.xml.parsers.DocumentBuilderFactory
    import java.io.StringReader
    val src = new InputSource(new StringReader(xml))
    val document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(src).getDocumentElement
    val keepDeclaration = java.lang.Boolean.valueOf(xml.startsWith("<?xml"))
    val registry = DOMImplementationRegistry.newInstance()
    val impl = registry.getDOMImplementation("LS").asInstanceOf[DOMImplementationLS]
    val lsOutput = impl.createLSOutput
    lsOutput.setEncoding("UTF-8")
    import java.io.StringWriter
    val stringWriter = new StringWriter
    lsOutput.setCharacterStream(stringWriter)
    val writer = impl.createLSSerializer()
    writer.getDomConfig.setParameter("format-pretty-print", true)
    writer.getDomConfig.setParameter("xml-declaration", keepDeclaration)
    writer.write(document, lsOutput)
    stringWriter.toString
  }
raisercostin