Given an input document that consists of a series of same-level nodes, I want to select the nodes that occur between two flags (which are themselves nodes). A flag pair can be used more than once, and the final output should group together all the content found between flags with the same name. I am striking out on this.
Given this input document:
<root>
<p class="text">Hello world 1.</p>
<p class="text">Hello world 2.</p>
<p class="text">Hello world 3.</p>
<p class="excerptstartone">Dummy text</p> <!-- this flag identifies the start of the nodes I want to select -->
<p class="text">Hello world 4.</p>
<p class="text">Hello world 5.</p>
<p class="text">Hello world 6.</p>
<p class="excerptendone">Dummy text</p> <!-- this flag identifies the end of the nodes I want to select -->
<p class="text">Hello world 7.</p>
<p class="excerptstarttwo">Dummy text</p> <!-- this flag identifies the start of the nodes I want to select -->
<p class="text">Hello world 8.</p>
<p class="excerptendtwo">Dummy text</p> <!-- this flag identifies the end of the nodes I want to select -->
<p class="text">Hello world 9.</p>
<p class="excerptstartone">Dummy text for starting a new excerpt</p> <!-- this flag identifies the start of the nodes I want to select -->
<p class="text">Hello world 10.</p>
<p class="text">Hello world 11.</p>
<p class="excerptendone">Dummy text</p> <!-- this flag identifies the end of the nodes I want to select -->
<p class="text">Hello world 12.</p>
<p class="text">Hello world 13.</p>
<p class="text">Hello world 14.</p>
<p class="text">Hello world 15.</p>
<p class="text">Hello world 16.</p>
<p class="text">Hello world 17.</p>
</root>
I want this output:
<root>
<p class="excerptstartone">Dummy text</p>
<p class="text">Hello world 4.</p>
<p class="text">Hello world 5.</p>
<p class="text">Hello world 6.</p>
<p class="text">Hello world 10.</p>
<p class="text">Hello world 11.</p>
<p class="excerptendone">Dummy text</p>
<p class="excerptstarttwo">Dummy text</p>
<p class="text">Hello world 8.</p>
<p class="excerptendtwo">Dummy text</p>
</root>
Note: The flags will always start with "excerptstart" and "excerptend", and the suffixes of a flag pair will always match (that is, business rules guarantee that there will always be an "excerptendone" if there is an "excerptstartone").
This is what I have so far. I can find the collections I want as long as I hard-code the flag suffix (i.e., 'one', 'two'). I am stuck on generalizing it so that the suffix doesn't have to be hard-coded. (I should also say that I don't care about retaining the start/end paragraph "flags" in the result tree; I've hard-coded them here only to make the result tree easier to assess.)
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">
  <xsl:template match="root">
    <root>
      <!-- Excerpt "one": suffix hard-coded -->
      <p class="excerptstartone">Dummy text</p>
      <xsl:for-each select="p[@class='excerptstartone']">
        <!-- Siblings after this start flag that also precede the next matching end flag -->
        <xsl:sequence select="following-sibling::node() intersect following-sibling::p[@class='excerptendone'][1]/preceding-sibling::node()"/>
      </xsl:for-each>
      <p class="excerptendone">Dummy text</p>
      <!-- Excerpt "two": same logic, suffix hard-coded again -->
      <p class="excerptstarttwo">Dummy text</p>
      <xsl:for-each select="p[@class='excerptstarttwo']">
        <xsl:sequence select="following-sibling::node() intersect following-sibling::p[@class='excerptendtwo'][1]/preceding-sibling::node()"/>
      </xsl:for-each>
      <p class="excerptendtwo">Dummy text</p>
    </root>
  </xsl:template>
  <xsl:template match="text()"/>
</xsl:stylesheet>
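One possible direction for generalizing this, offered as an untested sketch rather than a confirmed solution: group the start flags by their suffix with XSLT 2.0's xsl:for-each-group, then reuse the same intersect expression with the end-flag class built from the grouping key. It assumes, per the business rules above, that every start flag has a matching end flag later among its siblings. The variable name $end is only illustrative.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">
  <xsl:template match="root">
    <root>
      <!-- One group per distinct suffix ('one', 'two', ...) -->
      <xsl:for-each-group select="p[starts-with(@class, 'excerptstart')]"
          group-by="substring-after(@class, 'excerptstart')">
        <!-- Assumption: the end flag class is 'excerptend' plus the same suffix -->
        <xsl:variable name="end"
            select="concat('excerptend', current-grouping-key())"/>
        <!-- Keep the first start flag of the group (optional) -->
        <xsl:copy-of select="current-group()[1]"/>
        <!-- For every start flag with this suffix, emit the siblings up to its end flag -->
        <xsl:for-each select="current-group()">
          <xsl:sequence select="following-sibling::node() intersect following-sibling::p[@class = $end][1]/preceding-sibling::node()"/>
        </xsl:for-each>
        <!-- Keep the first matching end flag (optional) -->
        <xsl:copy-of select="following-sibling::p[@class = $end][1]"/>
      </xsl:for-each-group>
    </root>
  </xsl:template>
</xsl:stylesheet>
```

Because the selection still uses following-sibling::node(), whitespace text nodes and any comments sitting between the flags are carried over as well; switching to following-sibling::p on both sides of the intersect would restrict the result to paragraphs.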