I have the following HTML structure:

<document>
<ol>a question</ol>
<div>answer</div>
<div>answer</div>
<ol>another question</ol>
<div>answer</div>
<ol>question #3</ol>
...
</document>

I would like to take each <ol> node together with the <div> nodes that follow it, up to the next <ol> node, so I can group them in XML like this:

<vce>
  <topic>
   <question> ... </question>
   <answer> ... </answer>
  </topic>
  ...
</vce>

So far I have the following:

<xsl:for-each select="//body/ol">
  <document>

    <content name="question">
      <xsl:value-of select="." />
    </content>

    <content name="answer">
      <xsl:for-each
        select="./following-sibling::div"> <!-- !!! need code here !!! -->
        <xsl:value-of select="." />
      </xsl:for-each>
    </content>
  </document>
</xsl:for-each>

I get the questions just fine, but I'm having trouble with the answers. I have tried working with following, preceding, not, for-each-group, and so on. There are many similar questions, but none quite like this one, because my HTML file doesn't really have a parent-child structure.

RudyVerboven
  • Does your processor support XSLT 2.0? – michael.hor257k Nov 23 '16 at 14:47
  • No, I am using Watson Explorer which only supports 1.0, I think. – RudyVerboven Nov 23 '16 at 14:54
  • Have you checked this topic http://stackoverflow.com/questions/10859703/xpath-select-all-elements-between-two-specific-elements ? – ievche Nov 23 '16 at 14:57
  • I have already tried that one with `[count(preceding-sibling::ol)=1]`, but it only returns the answer for the first question; the others are empty. I have also tried `[count(preceding-sibling::ol)=position()]`, but for some questions it returns fragments of the answer, for some nothing, and for some the wrong fragments. – RudyVerboven Nov 23 '16 at 15:06
  • I'm not familiar with XSLT, but maybe you can try to increment that `1` by 1 for each ol, if that is possible. – ievche Nov 23 '16 at 15:33
  • http://stackoverflow.com/a/40769354/6805256 – uL1 Nov 23 '16 at 18:17
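
The counting approach discussed in the comments can probably be made to work in XSLT 1.0 along the following lines. Inside a predicate, `position()` refers to the `div` being tested rather than to the `ol` of the outer loop, so the index of the current `ol` has to be stored in a variable before the predicate is evaluated. This is only a sketch: the variable name `n` is made up for illustration, and it assumes that every `ol` and `div` is a direct child of the same `document` element.

```xml
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>

<xsl:template match="/document">
    <vce>
        <xsl:for-each select="ol">
            <!-- 1-based index of this ol among its ol siblings (hypothetical name) -->
            <xsl:variable name="n" select="count(preceding-sibling::ol) + 1" />
            <topic>
                <question>
                    <xsl:value-of select="." />
                </question>
                <!-- keep only the divs that have exactly $n ol elements before them -->
                <xsl:for-each select="following-sibling::div[count(preceding-sibling::ol) = $n]">
                    <answer>
                        <xsl:value-of select="." />
                    </answer>
                </xsl:for-each>
            </topic>
        </xsl:for-each>
    </vce>
</xsl:template>

</xsl:stylesheet>
```

Each question rescans all of its following siblings, so this may become slow on long documents.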

1 Answer

Try it this way:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:key name="answers" match="div" use="generate-id(preceding-sibling::ol[1])" />

<xsl:template match="/document">
    <vce>
        <xsl:for-each select="ol">
            <topic>
                <question>
                    <xsl:value-of select="." />
                </question>
                <xsl:for-each select="key('answers', generate-id())">
                    <answer>
                        <xsl:value-of select="." />
                    </answer>
                </xsl:for-each>
            </topic>
        </xsl:for-each>
    </vce>
</xsl:template>

</xsl:stylesheet>

When applied to the following test input:

XML

<document>
   <ol>question A</ol>
   <div>answer A1</div>
   <div>answer A2</div>
   <ol>question B</ol>
   <div>answer B1</div>
   <ol>question C</ol>
   <div>answer C1</div>
   <div>answer C2</div>
</document>

the result will be:

<?xml version="1.0" encoding="UTF-8"?>
<vce>
   <topic>
      <question>question A</question>
      <answer>answer A1</answer>
      <answer>answer A2</answer>
   </topic>
   <topic>
      <question>question B</question>
      <answer>answer B1</answer>
   </topic>
   <topic>
      <question>question C</question>
      <answer>answer C1</answer>
      <answer>answer C2</answer>
   </topic>
</vce>
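
If a top-level `xsl:key` is inconvenient in a given environment, the same grouping can probably be expressed inline. The following is a sketch of an alternative, not part of the stylesheet above: it filters the following `div` siblings by comparing the id of their nearest preceding `ol` with the id of the `ol` currently being processed. Inside the predicate, `.` is the `div` under test, while `current()` still refers to the `ol` of the outer `xsl:for-each`.

```xml
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:template match="/document">
    <vce>
        <xsl:for-each select="ol">
            <topic>
                <question>
                    <xsl:value-of select="." />
                </question>
                <!-- a div belongs to this ol if its nearest preceding ol is this very node -->
                <xsl:for-each select="following-sibling::div[generate-id(preceding-sibling::ol[1]) = generate-id(current())]">
                    <answer>
                        <xsl:value-of select="." />
                    </answer>
                </xsl:for-each>
            </topic>
        </xsl:for-each>
    </vce>
</xsl:template>

</xsl:stylesheet>
```

This trades the key's indexed lookup for a scan of the following siblings on every question, so the key-based version is likely the better choice for large inputs.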

michael.hor257k
  • I'm trying to understand your solution. The context node for `generate-id()` is the `ol` currently being processed by the outer for-each. Is this correct? And I think `generate-id(preceding-sibling::ol[1])` could also be written as `generate-id(preceding-sibling::ol)`. – Markus Nov 23 '16 at 15:37
  • @Markus The key is defined so that each `div` has the id of the immediately preceding `ol` (no, you cannot remove the `[1]` predicate) as its key value. This allows each `ol` to retrieve the corresponding `div`s by its id. – michael.hor257k Nov 23 '16 at 15:49
  • So if I pass a node set to `generate-id()` the first node of the node set (which is used to generate the ID, if I understand things correctly) is always selected in document order and not depending on the axis used? – Markus Nov 23 '16 at 16:09
  • It works! Thank you very much. I had to make some slight changes to get it to work in the Watson Explorer application, but I got it working. – RudyVerboven Nov 23 '16 at 19:48
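
On the `[1]` question raised in the comments above: in XPath 1.0, a positional predicate on a reverse axis such as `preceding-sibling` counts backwards from the context node, so `preceding-sibling::ol[1]` is the nearest preceding `ol`. Without the predicate, `generate-id()` receives the whole node-set and uses its first node in document order, which is the very first `ol`. The sketch below compares the two definitions side by side; the key names `nearest` and `first` are hypothetical and exist only for this comparison.

```xml
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<!-- hypothetical key names, used only to compare the two definitions -->
<xsl:key name="nearest" match="div" use="generate-id(preceding-sibling::ol[1])" />
<xsl:key name="first" match="div" use="generate-id(preceding-sibling::ol)" />

<xsl:template match="/document">
    <xsl:for-each select="ol">
        <!-- print how many divs each key returns for this ol -->
        <xsl:value-of select="." />
        <xsl:text>: nearest=</xsl:text>
        <xsl:value-of select="count(key('nearest', generate-id()))" />
        <xsl:text>, first=</xsl:text>
        <xsl:value-of select="count(key('first', generate-id()))" />
        <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

Run against the test input from the answer, the `nearest` key should return 2, 1 and 2 answers for questions A, B and C, whereas the `first` key should attach all 5 answers to question A and none to the others.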