
I'm facing the following case of XSL transformation.

I have this:

<content>
  xml_and_text_0
  <break/>
  xml_and_text_1
  <break/>
  ...
  <break/>
  xml_and_text_n
</content>

I want to use XSL 2.0 to turn the above XML into this:

<content>
  <block>
    xml_and_text_0
  </block>
  <block>
    xml_and_text_1
  </block>
  ...
  <block>
    xml_and_text_n
  </block>
</content>

(I'd also like to skip any xml_and_text_k that is empty, but for the moment let's assume they're all non-empty.)

I'm thinking I could use an approach similar to the one in "XPath: select all following siblings until another sibling", but maybe there is a simpler approach (or a simpler XPath expression). For instance, is it possible to match all the siblings following/preceding the current item in a for-each loop?

EDIT: Note that xml_and_text_i is a mix of text and XML, similar to XHTML, which I want to wrap within <block>, so something like:

<break/>
this is an <ref id = "123">example</ref>, which is really <citation>awesome</citation>
<break/>

would become:

<block>this is an <ref id = "123">example</ref>, which is really <citation>awesome</citation></block>
zakmck

2 Answers


Your question is very confusing. If your real input contains both elements and text nodes in-between the break nodes, then so should your example.

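Apparently this question is about grouping. In XSLT 2.0 it can be solved quite easily with xsl:for-each-group, starting a new group at each break and copying everything in the group except the break itself: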

<xsl:stylesheet version="2.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/content">
    <xsl:copy>
        <xsl:for-each-group select="node()" group-starting-with="break">
            <block>
                <xsl:copy-of select="current-group()[not(self::break)]" />
            </block>
        </xsl:for-each-group>
    </xsl:copy>
</xsl:template>

</xsl:stylesheet>
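
If empty segments should be dropped, as the question mentions, one possible variation is to test each group before emitting a block. This is a sketch rather than a tested solution: the variable name items is made up for illustration, and the test relies on xsl:strip-space having already removed whitespace-only text nodes, so a group that holds nothing but its break ends up empty.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8"/>
    <!-- Whitespace-only text nodes are dropped from the input tree -->
    <xsl:strip-space elements="*"/>

    <xsl:template match="/content">
        <xsl:copy>
            <xsl:for-each-group select="node()" group-starting-with="break">
                <!-- Everything in the group except the separator itself -->
                <xsl:variable name="items" select="current-group()[not(self::break)]"/>
                <!-- Emit a block only when something is left to wrap -->
                <xsl:if test="exists($items)">
                    <block>
                        <xsl:copy-of select="$items"/>
                    </block>
                </xsl:if>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

Two consecutive break elements, or a break at the very start or end of content, then produce no block at all instead of an empty one.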
michael.hor257k
  • I don't think so. I need a new <block> for each xml_and_text_i, which contains both XML and text that I want to copy inside <block>. More precisely, it contains paragraphs in a markup language similar to XHTML, so you can read things like: "this is an example, which is really awesome". – zakmck Aug 24 '16 at 18:41
  • @zakmck Please edit your question and show an example input that the above stylesheet does not handle correctly. Also state if using XSLT 1.0 or 2.0. – michael.hor257k Aug 24 '16 at 18:58
  • In this example: ` Some text this is an example, which is really awesome Some other text ` Your XSL yields: ` Some text this is an , which is really Some other text ` But I expect: ` Some text this is an example, which is really awesome Some other text ` – zakmck Aug 24 '16 at 19:35
  • Please don't post code in comments - edit your question instead. And please answer the question regarding your XSLT processor's version. – michael.hor257k Aug 24 '16 at 19:49
  • I've edited my question too, I'd like to use XSL 2.0. – zakmck Aug 24 '16 at 19:59
  • Many thanks, I've seen your nice answer right now, please see my further comments on my answer. – zakmck Aug 29 '16 at 17:38

Thanks, michael.hor257k, for your solution (and sorry for the confusion). Your approach seems cleaner than mine, which is shown below and is based on XPath axes. My version also handles edge cases (content before the first separator, empty segments, no separator at all), but I guess yours could be adapted to do the same; I'll look into it.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">

    <!-- xsl:output method="xml" indent="yes" / -->

    <xsl:template match="/content">
        <content>

            <xsl:choose>

                <!-- First of all, distinguish the case where there is at least one separator, from the ones with no separator at all -->

                <xsl:when test="break">
                    <xsl:for-each select="break">

                        <!-- What do we have before the first separator? Create a block only if non-empty stuff -->
                        <xsl:if test="position() = 1">
                            <xsl:variable name="first_block"><xsl:copy-of select="preceding-sibling::node()" /></xsl:variable>
                            <xsl:if test = "normalize-space ( $first_block ) != ''" >
                                <xsl:message select="concat ( '1|', $first_block, '|' )" />
                                <block id = "{@id}"><xsl:copy-of select="$first_block" /></block>
                            </xsl:if>
                        </xsl:if>

                        <!-- What do we have after the current separator and before the next one (or the end)? -->
                        <xsl:variable name="block_content">
                            <xsl:choose>
                                <xsl:when test="following-sibling::break">
                                    <!-- select all that comes after current node and precedes the next separator -->
                                    <xsl:copy-of select="following-sibling::node() intersect following-sibling::break[1]/preceding-sibling::node()" />
                                </xsl:when>
                                <xsl:otherwise>
                                    <!-- Last separator: select everything that follows it, up to the end -->
                                    <xsl:copy-of select="following-sibling::node()" />
                                </xsl:otherwise>
                            </xsl:choose>                           
                        </xsl:variable>

                        <xsl:message select="concat ( '|', $block_content, '|' )" />
                        <xsl:message select="concat ( '_|', normalize-space ( $block_content ), '|_' )" />

                        <!-- Did we get something after the current separator? Create a block if yes -->
                        <xsl:if test = "normalize-space( $block_content ) != ''">
                            <block id = "{@id}"><xsl:copy-of select="$block_content" /></block>
                        </xsl:if>                       
                    </xsl:for-each>
                </xsl:when>

                <!-- When some content is available without any separator, create a virtual single block to represent it -->
                <xsl:otherwise>
                    <xsl:variable name="single_block"><xsl:copy-of select="node()" /></xsl:variable>
                    <xsl:if test = "normalize-space( $single_block ) != ''">
                        <block id = "00"><xsl:copy-of select = "$single_block" /></block>
                    </xsl:if>
                </xsl:otherwise>
            </xsl:choose>

        </content>

    </xsl:template>

</xsl:stylesheet> 

Given this input:

<content>
    Some <b>text</b>
  <break id = '1'/>
  this is an <ref id = "123">example</ref>, which is really <citation>awesome</citation>
  <break id = '2'/>
  Some other text
</content>

It produces this output:

<?xml version="1.0" encoding="UTF-8"?>
<content>
    <block id="1"> Some <b>text</b></block>
    <block id="1"> this is an <ref id="123">example</ref>, which is really <citation>awesome</citation></block>
    <block id="2"> Some other text </block>
</content>
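
For comparison, the grouping approach can probably be adapted to cover the same edge cases. The following is a minimal sketch, not a drop-in replacement: the variable names sep, items and id are made up for illustration, and the id fallback (the first break's id for the leading segment, '00' when there is no break at all) simply mirrors what the stylesheet above does.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:template match="/content">
        <content>
            <xsl:for-each-group select="node()" group-starting-with="break">
                <!-- The separator that opens this group (empty for content before the first break) -->
                <xsl:variable name="sep" select="current-group()[1][self::break]" />
                <!-- Everything in the group except the separator -->
                <xsl:variable name="items" select="current-group()[not(self::break)]" />
                <!-- Own separator's id, else the first break's id, else '00' -->
                <xsl:variable name="id" select="($sep/@id, ../break[1]/@id, '00')[1]" />
                <!-- Skip groups that contain only whitespace -->
                <xsl:if test="normalize-space(string-join($items/string(), '')) != ''">
                    <block id="{$id}"><xsl:copy-of select="$items" /></block>
                </xsl:if>
            </xsl:for-each-group>
        </content>
    </xsl:template>

</xsl:stylesheet>
```

With the sample input above, this should yield three blocks with ids 1, 1 and 2. Because group-starting-with also forms a group from whatever precedes the first break, the case with no separator at all needs no separate branch.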
zakmck