
I need to validate XML, but the XSD I can generate for it is not suitable: the default one uses too many xsd:sequence elements, and the xsd:choice trick makes validation too permissive (not sure that is the right word).

So, is there a good way to turn this

<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book category="cooking">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="children">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
  <book category="web">
    <title lang="en">Learning XML</title>
    <author>Erik T. Ray</author>
    <year>2003</year>
    <price>39.95</price>
  </book>
</bookstore> 

into this

<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book category="cooking">
    <title lang="en"></title>
    <author></author>
    <year></year>
    <price></price>
  </book>
  <book category="children">
    <title lang="en"></title>
    <author></author>
    <year></year>
    <price></price>
  </book>
  <book category="web">
    <title lang="en"></title>
    <author></author>
    <year></year>
    <price></price>
  </book>
</bookstore> 

on Windows, using Python, Java, or Go? It's not a one-time job; I need to do it automatically.

struckoff
  • You can find a way to solve your problem with regex, check [this topic](https://stackoverflow.com/questions/7167279/regex-select-all-text-between-tags) – Thomas Dussaut Jul 06 '17 at 12:30

1 Answer

The right tool for XML transformations is XSLT. This one is dead easy. In XSLT 3.0 it's

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0">
  <xsl:output indent="yes"/>
  <xsl:mode on-no-match="shallow-copy"/>
  <xsl:template match="text()"/>
</xsl:stylesheet>

You said a Java solution is OK, so download Saxon-HE 9.8 and run this as

java net.sf.saxon.Transform -s:in.xml -xsl:trans.xsl -o:out.xml

If you prefer to use an XSLT 1.0 or 2.0 processor you can replace the xsl:mode declaration with the identity template rule which is easily googled.
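For reference, a sketch of what that XSLT 1.0 variant might look like, with the standard identity template standing in for xsl:mode. The explicit priority attribute is an addition here, not part of the 3.0 version: in 1.0 the identity rule and the text() rule would otherwise have the same default priority, and which one wins would depend on the processor.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output indent="yes"/>

  <!-- Identity template: copy every attribute and node unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Text nodes produce no output; the explicit priority makes this rule
       win over the identity template above -->
  <xsl:template match="text()" priority="1"/>
</xsl:stylesheet>
```

With either version, emptied elements will typically be serialized as self-closing tags such as <author/>, which is equivalent XML to <author></author>.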

Michael Kay