8

I want to be able to generate a complete XML file, given a set of XPath mappings.

The input could be specified in two mappings: (1) one that lists the XPath expressions and values; and (2) one that defines the appropriate namespaces.

/create/article[1]/id                 => 1
/create/article[1]/description        => bar
/create/article[1]/name[1]            => foo
/create/article[1]/price[1]/amount    => 00.00
/create/article[1]/price[1]/currency  => USD
/create/article[2]/id                 => 2
/create/article[2]/description        => some name
/create/article[2]/name[1]            => some description
/create/article[2]/price[1]/amount    => 00.01
/create/article[2]/price[1]/currency  => USD

For namespaces:

/create               => xmlns:ns1='http://predic8.com/wsdl/material/ArticleService/1/'
/create/article       => xmlns:ns1='http://predic8.com/material/1/'
/create/article/price => xmlns:ns1='http://predic8.com/common/1/'
/create/article/id    => xmlns:ns1='http://predic8.com/material/1/'

Note that it is also important to handle XPath attribute expressions. For example, I should be able to handle attributes such as:

/create/article/@type => richtext

The final output should then look something like:

<ns1:create xmlns:ns1='http://predic8.com/wsdl/material/ArticleService/1/'>
    <ns1:article xmlns:ns1='http://predic8.com/material/1/' type='richtext'>
        <name>foo</name>
        <description>bar</description>
        <ns1:price xmlns:ns1='http://predic8.com/common/1/'>
            <amount>00.00</amount>
            <currency>USD</currency>
        </ns1:price>
        <ns1:id xmlns:ns1='http://predic8.com/material/1/'>1</ns1:id>
    </ns1:article>
    <ns1:article xmlns:ns1='http://predic8.com/material/2/' type='richtext'>
        <name>some name</name>
        <description>some description</description>
        <ns1:price xmlns:ns1='http://predic8.com/common/2/'>
            <amount>00.01</amount>
            <currency>USD</currency>
        </ns1:price>
        <ns1:id xmlns:ns1='http://predic8.com/material/2/'>2</ns1:id>
    </ns1:article>
</ns1:create>

PS: This is a more detailed version of a previous question. Because of a series of further requirements and clarifications, I was advised to ask a broader question to address my needs.

Note also that I am implementing this in Java, so either a Java-based or an XSLT-based solution would be perfectly acceptable. Thanks.

Further note: I am really looking for a generic solution. The XML shown above is just an example.

Larry
  • Your requirements are too vague: "it is important that I also deal with any type of XPath expression" -> That would generally be an unsolvable set of equations. If you place a lot of restrictions on your mappings so they basically all look like in your example, then it's just a matter of looping through them and filling in a DOM. – Ingo Kegel Jul 09 '12 at 13:38
  • Ok, that’s a fair comment. How about we just restrict to node paths and attributes. I will update the Question. – Larry Jul 09 '12 at 13:41
  • So @Larry, what was your issue with my solution? – Sean B. Durkin Jul 09 '12 at 13:49
  • @SeanB.Durkin It didn’t seem generic, I guess I forgot to explicitly mention that in the question. For example, you seem to have hard-coded expressions such as `/s11:Envelope/s11:Body/ns1:create/article`. Also, I’m not sure why you have a restriction such that matches should always begin with abc[n], i.e. with a square-bracket expression, etc. In any case, I really need a generic solution. But thanks anyway, and if you have any other ideas here, please contribute. – Larry Jul 09 '12 at 13:54
  • Is there still a 'template' document? And if so, is there any part of it which is fixed? – Sean B. Durkin Jul 09 '12 at 14:03
  • @SeanB.Durkin I don’t think there will be any template. Dimitre’s recommendation from the previous question seemed to suggest that the notion of any template will be pointless. And I guess I agree, so I have decided to leave this out, and just depend on the expressions declarations for generating the XML file. – Larry Jul 09 '12 at 14:18
  • The problem as it stands is probably unsolvable in XSLT without a ridiculous level of complexity. You could solve it with Java. It would take a while to write in Java, but the principle should be easy enough that you don't need to ask in StackOverflow how to solve it in Java. Another option is to tag the question with XQuery and let the XQuery experts have a crack at it. – Sean B. Durkin Jul 09 '12 at 15:31
  • Ok, good suggestion, I will tag with XQuery. Although, it does seem useful for me to have this in StackOverflow, as after looking around a bit, it seems many would benefit from a possible solution to this. As after all, I’ve tried to pose the question generic enough, that I’m sure it would benefit many, in various different ways. – Larry Jul 09 '12 at 15:42
  • @Larry: What is the meaning of: `/create/article[@type] richtext` ? You haven't yet defined this. – Dimitre Novatchev Jul 10 '12 at 03:13
  • @SeanB.Durkin: There is nothing "unsolvable" -- with reasonable assumptions the solution is quite easy. – Dimitre Novatchev Jul 10 '12 at 03:50
  • @Larry: Good question again. +1. – Dimitre Novatchev Jul 10 '12 at 05:03
  • @DimitreNovatchev, Yes thanks, although I’ve got a couple of questions which I’ve posted as comments to your answer. – Larry Jul 10 '12 at 15:11

3 Answers

2

This problem has an easy solution if one builds upon the solution of the previous problem:

<xsl:stylesheet version="2.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:xs="http://www.w3.org/2001/XMLSchema"
     xmlns:my="my:my">
     <xsl:output omit-xml-declaration="yes" indent="yes"/>

     <xsl:key name="kNSFor" match="namespace" use="@of"/>
     <xsl:variable name="vStylesheet" select="document('')"/>

     <xsl:variable name="vPop" as="element()*">
        <item path="/create/article/@type">richtext</item>
        <item path="/create/article/@lang">en-us</item>
        <item path="/create/article[1]/id">1</item>
        <item path="/create/article[1]/description">bar</item>
        <item path="/create/article[1]/name[1]">foo</item>
        <item path="/create/article[1]/price[1]/amount">00.00</item>
        <item path="/create/article[1]/price[1]/currency">USD</item>
        <item path="/create/article[1]/price[2]/amount">11.11</item>
        <item path="/create/article[1]/price[2]/currency">AUD</item>
        <item path="/create/article[2]/id">2</item>
        <item path="/create/article[2]/description">some name</item>
        <item path="/create/article[2]/name[1]">some description</item>
        <item path="/create/article[2]/price[1]/amount">00.01</item>
        <item path="/create/article[2]/price[1]/currency">USD</item>

        <namespace of="create" prefix="ns1:"
                   url="http://predic8.com/wsdl/material/ArticleService/1/"/>
        <namespace of="article" prefix="ns1:"
                   url="xmlns:ns1='http://predic8.com/material/1/"/>
        <namespace of="@lang" prefix="xml:"
                   url="http://www.w3.org/XML/1998/namespace"/>
        <namespace of="price" prefix="ns1:"
                   url="xmlns:ns1='http://predic8.com/material/1/"/>
        <namespace of="id" prefix="ns1:"
                   url="xmlns:ns1='http://predic8.com/material/1/"/>
     </xsl:variable>

     <xsl:template match="/">
      <xsl:sequence select="my:subTree($vPop/@path/concat(.,'/',string(..)))"/>
     </xsl:template>

     <xsl:function name="my:subTree" as="node()*">
      <xsl:param name="pPaths" as="xs:string*"/>

      <xsl:for-each-group select="$pPaths" group-adjacent=
            "substring-before(substring-after(concat(., '/'), '/'), '/')">
        <xsl:if test="current-grouping-key()">
         <xsl:choose>
           <xsl:when test=
              "substring-after(current-group()[1], current-grouping-key())">

             <xsl:variable name="vLocal-name" select=
              "substring-before(concat(current-grouping-key(), '['), '[')"/>

             <xsl:variable name="vNamespace"
                           select="key('kNSFor', $vLocal-name, $vStylesheet)"/>


             <xsl:choose>
              <xsl:when test="starts-with($vLocal-name, '@')">
               <xsl:attribute name=
                 "{$vNamespace/@prefix}{substring($vLocal-name,2)}"
                    namespace="{$vNamespace/@url}">
                 <xsl:value-of select=
                  "substring(
                       substring-after(current-group(), current-grouping-key()),
                       2
                             )"/>
               </xsl:attribute>
              </xsl:when>
              <xsl:otherwise>
               <xsl:element name="{$vNamespace/@prefix}{$vLocal-name}"
                          namespace="{$vNamespace/@url}">

                    <xsl:sequence select=
                     "my:subTree(for $s in current-group()
                                  return
                                     concat('/',substring-after(substring($s, 2),'/'))
                                   )
                     "/>
                 </xsl:element>
              </xsl:otherwise>
             </xsl:choose>
           </xsl:when>
           <xsl:otherwise>
            <xsl:value-of select="current-grouping-key()"/>
           </xsl:otherwise>
         </xsl:choose>
         </xsl:if>
      </xsl:for-each-group>
     </xsl:function>
</xsl:stylesheet>

When this transformation is applied on any XML document (not used), the wanted, correct result is produced:

<ns1:create xmlns:ns1="http://predic8.com/wsdl/material/ArticleService/1/">
   <ns1:article xmlns:ns1="xmlns:ns1='http://predic8.com/material/1/" type="richtext"
                xml:lang="en-us"/>
   <ns1:article xmlns:ns1="xmlns:ns1='http://predic8.com/material/1/">
      <ns1:id>1</ns1:id>
      <description>bar</description>
      <name>foo</name>
      <ns1:price>
         <amount>00.00</amount>
         <currency>USD</currency>
      </ns1:price>
      <ns1:price>
         <amount>11.11</amount>
         <currency>AUD</currency>
      </ns1:price>
   </ns1:article>
   <ns1:article xmlns:ns1="xmlns:ns1='http://predic8.com/material/1/">
      <ns1:id>2</ns1:id>
      <description>some name</description>
      <name>some description</name>
      <ns1:price>
         <amount>00.01</amount>
         <currency>USD</currency>
      </ns1:price>
   </ns1:article>
</ns1:create>

Explanation:

  1. A reasonable assumption is made that throughout the generated document any two elements with the same local-name() belong to the same namespace -- this covers the predominant majority of real-world XML documents.

  2. The namespace specifications follow the path specifications. A namespace specification has the form: <namespace of="target element's local-name" prefix="wanted prefix" url="namespace-uri"/>

  3. Before generating an element with xsl:element, the appropriate namespace specification is selected using an index created by an xsl:key. The prefix and url attributes of this specification then supply the full element name and the namespace-uri passed to the xsl:element instruction.
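
For readers working in Java with the DOM API rather than XSLT, the three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: `NamespaceLookupSketch`, `NsSpec`, `newDocument` and `createElement` are hypothetical names that do not appear in the stylesheet, a plain map keyed by local name stands in for the xsl:key index, and the prefix is stored without the trailing colon.

```java
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NamespaceLookupSketch {

    /** Hypothetical holder for one namespace specification (prefix has no colon). */
    static final class NsSpec {
        final String prefix;
        final String uri;

        NsSpec(String prefix, String uri) {
            this.prefix = prefix;
            this.uri = uri;
        }
    }

    /** Creates an empty DOM document, wrapping the checked exception. */
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /**
     * Looks up the specification by local name (assumption 1 above: the same
     * local name always means the same namespace) and creates the element.
     * Names without a specification end up in no namespace.
     */
    static Element createElement(Document doc, Map<String, NsSpec> specs, String localName) {
        NsSpec spec = specs.get(localName);
        if (spec == null) {
            return doc.createElementNS(null, localName);
        }
        return doc.createElementNS(spec.uri, spec.prefix + ":" + localName);
    }

    public static void main(String[] args) {
        Map<String, NsSpec> specs = new HashMap<>();
        specs.put("create", new NsSpec("ns1", "http://predic8.com/wsdl/material/ArticleService/1/"));
        specs.put("price", new NsSpec("ns1", "http://predic8.com/common/1/"));

        Document doc = newDocument();
        Element create = createElement(doc, specs, "create");
        Element name = createElement(doc, specs, "name");
        System.out.println(create.getNodeName() + " in " + create.getNamespaceURI());
        System.out.println(name.getNodeName() + " in " + name.getNamespaceURI());
    }
}
```

Keying the map by local name keeps the lookup trivial, at the cost of the same restriction as the stylesheet: two elements with the same local name cannot be placed in different namespaces.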

Dimitre Novatchev
  • Thanks very much for your solution. However, I get the following error when trying to process: `Recoverable error on line 8 FODC0002: org.xml.sax.SAXParseException: Content is not allowed in prolog. Error on line 52 XPTY0004: An empty sequence is not allowed as the third argument of key() at my:subTree() (#35)`. Any ideas about what’s going wrong? thnx. – Larry Jul 10 '12 at 14:59
  • The other question is: does the above solution also support creation of attributes in the XML document. For example, in the question this is what I meant by having a specification such as `/create/article/@type => abcde`, etc. This means that, just like I can create nodes, I should also be able to create attributes, etc. – Larry Jul 10 '12 at 15:12
  • @Larry: [Re: attribute creation] This can be done the moment you show and explain a rule for attribute creation -- there is none at present. – Dimitre Novatchev Jul 10 '12 at 15:51
  • @Larry, as for your problems running the transformation, there are a few possible causes: a) you changed the code; b) you are not using a compliant XSLT 2.0 processor. I always test my code and copy and paste the result of running the transformation. I have verified that both Saxon 9.1.07 and Altova (XML-SPY) produce the result that I have provided. – Dimitre Novatchev Jul 10 '12 at 15:52
  • I think I worked out why the compilation issue is occurring: if I’m correct, you are loading the stylesheet itself with `select="document('')"`, but this returns an error. As a quick fix I replaced it with `select="document('transform.xslt')"`, which now works and returns the expected result, given that I’ve saved the transformation to `transform.xslt`. Perhaps there is another way to call document(self), since `document('')` doesn’t seem to work (so far, for me at least). – Larry Jul 10 '12 at 15:56
  • @Larry: You could get this issue if your program has the text of the XSLT code as a string -- and doesn't load the stylesheet from a file. – Dimitre Novatchev Jul 10 '12 at 15:59
  • Ok, thanks, I’ve fixed this now; your suggestion was correct. Now, getting back to attributes, why would a rule such as `/create/article[1]/@text => abc` not suffice to mean that we want to create an attribute `text` in the first `article` node, with the value `abc`? Or, alternatively, what would be your suggestion for this? I think if we can support creation of both nodes and attributes, we’re well on our way to a very good generic solution that I’m sure many would benefit from. Thnx. – Larry Jul 10 '12 at 16:14
  • @Larry: Yes, this is exactly how I imagined such a rule -- but you never presented such -- until now. – Dimitre Novatchev Jul 10 '12 at 16:24
  • Ok, so do you think you could please add into your solution? That would make it a complete solution. – Larry Jul 10 '12 at 16:26
  • Not a problem, but at the same time I would prefer that you edit the question and define this explicitly. As I said before, my work-hours have started, so I will look at this in 8hrs. from now. – Dimitre Novatchev Jul 10 '12 at 16:33
  • Ok thanks once again. Btw: I think I had already edited the question some time before, see the part where I mention XPath attributes. Although, if this doesn’t seem clear, let me know, and I will edit further to make it clear. – Larry Jul 10 '12 at 16:40
  • Oh, I see -this is now OK. Before it read: `/create/article[@type]` and was confusing. – Dimitre Novatchev Jul 10 '12 at 16:42
  • Yes, I know, I made an error with the syntax but later changed it. So I assume it should be clear now, and once again, I look forward for your solution when you’re available. – Larry Jul 10 '12 at 16:44
  • @Larry: I updated the answer and now attributes are successfully created -- even attributes in a namespace. – Dimitre Novatchev Jul 11 '12 at 02:44
  • As always, thanks very much Dimitre. I’ve accepted the answer, but as a further request, do you think we could have the solution such that the input XML file (which is otherwise unused at the moment) could contain all the expression specifications (as you have in vPop)? That way the XSLT file itself could then be generic. It would seem a nice design to me. (Or perhaps this could be asked as another question?!) – Larry Jul 11 '12 at 03:51
  • @Larry, Of course -- this is straightforward and almost nothing needs to be changed. Just change `` to `` where `$pRulesPath` is a global/external parameter that contains the path to the rules document. In addition, the rules must be wrapped into a single top element. – Dimitre Novatchev Jul 11 '12 at 04:11
  • Ok thanks. But do you think it could be possible to not have to write the rules to a file, for each instance that I want to run the transformer. Since, this means in my implementation, I would need to write a temporary file, process the transformation, and then remove it. I kind of like this to be dynamic. So, perhaps we could have this in a way that it could be passed as part of the input? As if I’m understanding correctly, the input XML is still unused? – Larry Jul 11 '12 at 04:59
  • @Larry, Yes, a compiled/parsed XML document can be passed as a parameter to the transformation -- read the documentation of the XSLT processor that you are using, for detailed API description and examples. – Dimitre Novatchev Jul 11 '12 at 05:29
  • Thanks again Dimitre, just to let you know I managed to get it all working well, where I can pass in the rules as a parameter. Although, it turned out I also needed to set `vStylesheet` variable to the rules file. In any case, I’ve also posted a follow-up question, perhaps you could take a look http://stackoverflow.com/questions/11437606. Thanks, and a certain (+1)! – Larry Jul 11 '12 at 16:59
  • @Larry: You are most welcome. Please, keep up the good work of posting such challenging questions. I will be glad to have a look at the new question after working hours -- 8hrs from now. – Dimitre Novatchev Jul 11 '12 at 17:57
1

I came across a similar situation where I had to convert a set of XPath/FQN-to-value mappings to XML. A simple generic solution is the following code, which can be enhanced to meet specific requirements.

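import java.io.File;
import java.io.StringWriter;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Map.Entry;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class XMLUtils {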
static public String transformToXML(Map<String, String> pathValueMap, String delimiter)
        throws ParserConfigurationException, TransformerException {

    DocumentBuilderFactory documentFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder documentBuilder = documentFactory.newDocumentBuilder();
    Document document = documentBuilder.newDocument();

    Element rootElement = null;

    Iterator<Entry<String, String>> it = pathValueMap.entrySet().iterator();
    while (it.hasNext()) {
        Entry<String, String> pair = it.next();
        // Compare string content, not references: != "" would not detect an empty key reliably.
        if (pair.getKey() != null && !pair.getKey().isEmpty() && rootElement == null) {
            String[] pathValuesplit = pair.getKey().split(delimiter);
            rootElement = document.createElement(pathValuesplit[0]);
            break;
        }
    }

    document.appendChild(rootElement);
    Element rootNode = rootElement;
    Iterator<Entry<String, String>> iterator = pathValueMap.entrySet().iterator();
    while (iterator.hasNext()) {
        Entry<String, String> pair = iterator.next();
        if (pair.getKey() != null && !pair.getKey().isEmpty() && rootElement != null) {
            String[] pathValuesplit = pair.getKey().split(delimiter);
            if (pathValuesplit[0].equals(rootElement.getNodeName())) {
                int i = pathValuesplit.length;

                Element parentNode = rootNode;
                int j = 1;

                while (j < i) {
                    Element child = null;

                    NodeList childNodes = parentNode.getChildNodes();
                    for (int k = 0; k < childNodes.getLength(); k++) {
                        if (childNodes.item(k).getNodeName().equals(pathValuesplit[j])
                                && childNodes.item(k) instanceof Element) {
                            child = (Element) childNodes.item(k);
                            break;
                        }
                    }

                    if (child == null) {
                        child = document.createElement(pathValuesplit[j]);
                        if (j == (i - 1)) {
                            child.appendChild(
                                    document.createTextNode(pair.getValue() == null ? "" : pair.getValue()));
                        }
                    }
                    parentNode.appendChild(child);
                    parentNode = child;
                    j++;
                }
            } else {
                // ignore any other root - add logger
                System.out.println("Data not processed for node: " + pair.getKey());
            }
        }
    }

    TransformerFactory transformerFactory = TransformerFactory.newInstance();
    Transformer transformer = transformerFactory.newTransformer();
    DOMSource domSource = new DOMSource(document);

    // to return a XMLstring in response to an API
     StringWriter writer = new StringWriter();
     StreamResult result = new StreamResult(writer);

     StreamResult resultToFile = new StreamResult(new File("C:/EclipseProgramOutputs/GeneratedXMLFromPathValue.xml"));
     transformer.transform(domSource, resultToFile);
     transformer.transform(domSource, result);

    return writer.toString();
}

public static void main(String args[])
{

    Map<String, String> pathValueMap = new HashMap<String, String>();
    String delimiter = "/";

    pathValueMap.put("create/article__1/id", "1");
    pathValueMap.put("create/article__1/description", "something");
    pathValueMap.put("create/article__1/name", "Book Name");
    pathValueMap.put("create/article__1/price/amount", "120" );
    pathValueMap.put("create/article__1/price/currency", "INR");
    pathValueMap.put("create/article__2/id", "2");
    pathValueMap.put("create/article__2/description", "something else");
    pathValueMap.put("create/article__2/name", "Book name 1");
    pathValueMap.put("create/article__2/price/amount", "2100");
    pathValueMap.put("create/article__2/price/currency", "USD");

    try {
        XMLUtils.transformToXML(pathValueMap, delimiter);
    } catch (ParserConfigurationException | TransformerException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }

}}

Output:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<create>
    <article__1>
        <id>1</id>
        <name>Book Name</name>
        <description>something</description>
        <price>
            <currency>INR</currency>
            <amount>120</amount>
        </price>
    </article__1>
    <article__2>
        <description>something else</description>
        <name>Book name 1</name>
        <id>2</id>
        <price>
            <currency>USD</currency>
            <amount>2100</amount>
        </price>
    </article__2>
</create>

To remove the __%num suffix, a regular expression can be applied to the final string, for example:

resultString = resultString.replaceAll("(__[0-9][0-9])|(__[0-9])", "");

This would do the cleaning job
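
As an alternative to encoding the position in the element name and stripping it afterwards, the positional predicate could be parsed directly while building the DOM. The following is a minimal sketch, not a drop-in replacement for the code above. It assumes every step is either `name`, `name[n]` with n >= 1, or a final `@attr`, and that a step without a predicate means the first child of that name (real XPath would select all of them). Missing earlier siblings are created as needed, and namespaces are ignored. `IndexedPathSketch` and its helper methods are hypothetical names.

```java
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class IndexedPathSketch {

    /** Builds a DOM from path -> value entries such as /create/article[2]/id. */
    static Document build(Map<String, String> pathValues) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        for (Map.Entry<String, String> entry : pathValues.entrySet()) {
            String[] steps = entry.getKey().replaceFirst("^/", "").split("/");
            Element current = doc.getDocumentElement();
            if (current == null) {                       // first path defines the root
                current = doc.createElement(stepName(steps[0]));
                doc.appendChild(current);
            } else if (!current.getNodeName().equals(stepName(steps[0]))) {
                throw new IllegalArgumentException("Second root in " + entry.getKey());
            }
            boolean attribute = false;
            for (int i = 1; i < steps.length; i++) {
                if (steps[i].startsWith("@")) {          // attribute step ends the path
                    current.setAttribute(steps[i].substring(1), entry.getValue());
                    attribute = true;
                    break;
                }
                current = nthChild(doc, current, stepName(steps[i]), stepIndex(steps[i]));
            }
            if (!attribute) {
                current.setTextContent(entry.getValue());
            }
        }
        return doc;
    }

    /** Step name without its predicate: article[2] -> article. */
    static String stepName(String step) {
        int bracket = step.indexOf('[');
        return bracket < 0 ? step : step.substring(0, bracket);
    }

    /** Position from the predicate, defaulting to 1 when there is none. */
    static int stepIndex(String step) {
        int bracket = step.indexOf('[');
        return bracket < 0 ? 1 : Integer.parseInt(step.substring(bracket + 1, step.indexOf(']')));
    }

    /** Returns the index-th child element called name, creating siblings up to it. */
    static Element nthChild(Document doc, Element parent, String name, int index) {
        int seen = 0;
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && n.getNodeName().equals(name) && ++seen == index) {
                return (Element) n;
            }
        }
        Element created = null;
        while (seen < index) {
            created = doc.createElement(name);
            parent.appendChild(created);
            seen++;
        }
        return created;
    }

    /** Serializes the document without an XML declaration. */
    static String toXml(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> paths = new LinkedHashMap<>();   // keeps insertion order
        paths.put("/create/article[1]/id", "1");
        paths.put("/create/article[1]/@type", "richtext");
        paths.put("/create/article[2]/id", "2");
        System.out.println(toXml(build(paths)));
    }
}
```

A `LinkedHashMap` is used in the example so that child elements appear in the order the paths were added; with a plain `HashMap` the element order would follow the map's iteration order instead.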

0

Interesting question. Let's assume that your input set of XPath expressions satisfies some reasonable constraints, for example if there is an X/article[2] then there is also (preceding it) an X/article[1]. And let's put the namespace part of the problem to one side for the moment.

Let's go for an XSLT 2.0 solution: we'll start with the input in the form

<paths>
<path value="1">/create/article[1]/id</path>
<path value="bar">/create/article[1]/description</path>
</paths>

and then we'll turn this into

<paths>
<path value="1"><step>create</step><step>article[1]</step><step>id</step></path>
   ...
</paths>

Now we'll call a function which does a grouping on the first step, and calls itself recursively to do grouping on the next step:

<xsl:function name="f:group">
  <xsl:param name="paths" as="element(path)*"/>
  <xsl:param name="step" as="xs:integer"/>
  <xsl:for-each-group select="$paths" group-by="step[$step]">
    <xsl:element name="{replace(current-grouping-key(), '\[.*', '')}">
      <xsl:choose>
        <xsl:when test="count(current-group()) gt 1">
           <xsl:sequence select="f:group(current-group(), $step+1)"/>
        </xsl:when>
        <xsl:otherwise>
           <xsl:value-of select="current-group()[1]/@value"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:element>
  </xsl:for-each-group>
</xsl:function>

That's untested, and there may well be details you have to adjust to get it working. But I think the basic approach should work.

The namespace part of the problem is perhaps best tackled by preprocessing the list of paths to add a namespace attribute to each step element; this can then be used in the xsl:element instruction to put the element in the right namespace.
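
Since the question mentions a Java implementation, the same grouping idea (split each path into steps, group on the current step, recurse on the next) can be sketched in Java as well. This is a rough illustration only: `StepGroupingSketch`, `Rule`, `group`, `escape` and `toXml` are hypothetical names, attributes and namespaces are left out, and the decision to recurse is based on whether any path in the group has a further step, which differs from the count-based test in the XSLT function above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class StepGroupingSketch {

    /** One rule: the path split into steps, plus the value for its leaf. */
    static final class Rule {
        final String[] steps;
        final String value;

        Rule(String path, String value) {
            this.steps = path.replaceFirst("^/", "").split("/");
            this.value = value;
        }
    }

    /** Groups the rules on the step at the given depth and recurses one level down. */
    static String group(List<Rule> rules, int depth) {
        // LinkedHashMap keeps groups in order of first appearance.
        Map<String, List<Rule>> groups = new LinkedHashMap<>();
        for (Rule rule : rules) {
            if (rule.steps.length > depth) {
                groups.computeIfAbsent(rule.steps[depth], k -> new ArrayList<>()).add(rule);
            }
        }
        StringBuilder xml = new StringBuilder();
        for (Map.Entry<String, List<Rule>> g : groups.entrySet()) {
            String name = g.getKey().replaceFirst("\\[.*", "");   // article[1] -> article
            boolean deeper = false;
            for (Rule rule : g.getValue()) {
                deeper |= rule.steps.length > depth + 1;
            }
            xml.append('<').append(name).append('>');
            if (deeper) {
                xml.append(group(g.getValue(), depth + 1));
            } else {
                xml.append(escape(g.getValue().get(0).value));
            }
            xml.append("</").append(name).append('>');
        }
        return xml.toString();
    }

    /** Minimal escaping for element text. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;");
    }

    /** Converts path -> value entries into an unindented XML string. */
    static String toXml(Map<String, String> pathValues) {
        List<Rule> rules = new ArrayList<>();
        for (Map.Entry<String, String> e : pathValues.entrySet()) {
            rules.add(new Rule(e.getKey(), e.getValue()));
        }
        return group(rules, 0);
    }

    public static void main(String[] args) {
        Map<String, String> paths = new LinkedHashMap<>();
        paths.put("/create/article[1]/id", "1");
        paths.put("/create/article[1]/description", "bar");
        paths.put("/create/article[2]/id", "2");
        System.out.println(toXml(paths));
    }
}
```

Because the predicate is part of the grouping key, `article[1]` and `article[2]` form separate groups and therefore separate elements, while the predicate itself is stripped from the element name.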

Michael Kay