Remove new line between two XML nodes using XSLT

Question

I am very new to XSL.

I have an XHTML file, and I want only the header of the <section id="ch01lev2sec01"> element and the first paragraph of that section to be joined on one line, separated by the Unicode code point &#x2003; (em space).

The input is:

<section class="bodymatter" id="ch01body">
  <section id="ch01lev1sec01">
    <header>
      <h1 class="title">Assumptions Underlying Content Teaching</h1> 
    </header>
    <p>Most content area teachers assume it is their responsibility to cover their subject matter in a timely, accurate, and effective manner (<a class="biblioref" href="REF.xhtml#ch01bib033">Alvermann &amp; Moore, 1991</a>; <a class="biblioref" href="REF.xhtml#ch01bib034">Moore, 1996</a>). They also assume, for the most part, that textbooks are necessary for teaching and learning content (<a class="biblioref" href="REF.xhtml#ch01bib035">Wade &amp; Moje, 2000</a>). Finally, content area teachers tend to assume that by the time students enter middle and/or high school, they are strategic in their approach to reading and learning (<a class="biblioref" href="REF.xhtml#ch01bib036">Alvermann &amp; Nealy, 2004</a>). These assumptions influence teachers’ instructional decision making, their use of textbooks, and their perceptions of active and independent readers.</p>
    <section id="ch01lev2sec01">
      <header>
        <h1 class="title">Subject Matter</h1>
      </header>
      <p>The historical</p>
    </section>
  </section>
</section>

And the required output is:

<section class="bodymatter" id="ch01body">
  <section id="ch01lev1sec01">
    <header>
      <h1 class="title">Assumptions Underlying Content Teaching</h1>
    </header>
    <p>Most content area teachers assume it is their responsibility to cover their subject matter in a timely, accurate, and effective manner (<a class="biblioref" href="REF.xhtml#ch01bib033">Alvermann &amp; Moore, 1991</a>; <a class="biblioref" href="REF.xhtml#ch01bib034">Moore, 1996</a>). They also assume, for the most part, that textbooks are necessary for teaching and learning content (<a class="biblioref" href="REF.xhtml#ch01bib035">Wade &amp; Moje, 2000</a>). Finally, content area teachers tend to assume that by the time students enter middle and/or high school, they are strategic in their approach to reading and learning (<a class="biblioref" href="REF.xhtml#ch01bib036">Alvermann &amp; Nealy, 2004</a>). These assumptions influence teachers’ instructional decision making, their use of textbooks, and their perceptions of active and independent readers.</p>
    <section id="ch01lev2sec01">
      <header>
        <h1 class="title">Subject Matter</h1>
      </header>&#x2003;
      <p>The historical</p>
    </section>
  </section>
</section>

Post what you've tried so far. SO is not about doing the coding for you. — msp, Mar 22 '16 at 07:12
The XML is semantically identical either way. Are you _sure_ you need it to be arranged that way? Is this an example of an [XY Problem](http://xyproblem.info)? — Jim Garrison, Mar 22 '16 at 07:15
Yes. We actually have to export the XML file and then flow it into InDesign. — Nikhil Ranjan, Mar 22 '16 at 07:17

score 1 · Answer 1 · answered Mar 22 '16 at 09:15


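Assuming you are using the identity template, you need a template that matches the whitespace text node sitting between the header and the first p element of the section.

<xsl:template match="section[@id='ch01lev2sec01']/text()[preceding-sibling::header and not(preceding-sibling::p) and following-sibling::p]">

In this template you can then just output your extra character. The preceding-sibling::header test matters: without it, the whitespace text node before the header would also match and get an em space of its own.

Try this XSLT:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output method="xml" indent="no" />

    <!-- Identity template: copies every node and attribute unchanged. -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Text between the header and the first p: collapse it, then emit an em space. -->
    <xsl:template match="section[@id='ch01lev2sec01']/text()[preceding-sibling::header and not(preceding-sibling::p) and following-sibling::p]">
        <xsl:value-of select="normalize-space()" />
        <!-- No disable-output-escaping needed: U+2003 is an ordinary character. -->
        <xsl:text>&#x2003;</xsl:text>
    </xsl:template>
</xsl:stylesheet>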

answered Mar 22 '16 at 09:15

Tim C


Thanks a lot for this one. Can you guide me if I want to put a regular expression in the id, like `match="section[@id='ch([0-9]+)lev2sec([0-9]+)']"`? I want it to change all lev2 sections. – Nikhil Ranjan Mar 22 '16 at 09:34
If you are using XSLT 2.0, then you can use regular expressions, using the `matches` function. So your match would look something like this: `match="section[matches(@id,'ch\d+lev2sec\d+')]"` – Tim C Mar 22 '16 at 09:37
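As a rough sketch of what that could look like in a full stylesheet, assuming an XSLT 2.0 processor (XSLT 1.0 has no `matches` function) and the same un-namespaced input as in the question. The `^` and `$` anchors and the `preceding-sibling::header` test are additions for illustration, not part of the comment above:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output method="xml" indent="no" />

    <!-- Identity template: copies every node and attribute unchanged. -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Any section whose id looks like ch<digits>lev2sec<digits>.
         Assumption: ^ and $ are added so the whole id has to match.
         The preceding-sibling::header test restricts the rule to the
         text node between the header and the first p. -->
    <xsl:template match="section[matches(@id, '^ch\d+lev2sec\d+$')]/text()[preceding-sibling::header and not(preceding-sibling::p) and following-sibling::p]">
        <xsl:value-of select="normalize-space()" />
        <xsl:text>&#x2003;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```

If only XSLT 1.0 is available, a looser test such as `section[starts-with(@id, 'ch') and contains(@id, 'lev2sec')]` may be close enough, though it does not check for digits.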
I don't know why, but when I add ` Content Literacy and the Reading Process ` it does not convert; when I remove it, it converts. What is the problem? Could it be due to an attribute? – Nikhil Ranjan Mar 22 '16 at 12:10
It's probably because you are adding a default namespace `xmlns="http://www.w3.org/1999/xhtml"`, which means all un-prefixed elements in the XML will be in that namespace. My XSLT solution only matches elements in no namespace. – Tim C Mar 22 '16 at 12:36
Then what should I do to handle this situation? – Nikhil Ranjan Mar 23 '16 at 03:35
Take a look at http://stackoverflow.com/questions/36124743/transforming-xml-file-to-html-file-using-xsl as an example on how to handle namespaces in your XSLT. – Tim C Mar 23 '16 at 09:01
Thanks so much, Tim C. You are amazing. Really, thanks a lot once again. The main problem here was the namespace. – Nikhil Ranjan Mar 23 '16 at 12:24
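For readers who hit the same namespace problem, a minimal XSLT 1.0 sketch follows. It assumes the input elements are in the XHTML namespace `http://www.w3.org/1999/xhtml`; the `x` prefix is an arbitrary name chosen for this example, and the `preceding-sibling::x:header` test is added so that only the text node between the header and the first p is matched:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:x="http://www.w3.org/1999/xhtml"
                version="1.0">
    <xsl:output method="xml" indent="no" />

    <!-- Identity template: copies every node and attribute unchanged. -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- Element names carry the x: prefix (an assumed name), because an
         un-prefixed name in an XSLT 1.0 pattern only matches elements
         in no namespace. Attributes such as id stay un-prefixed. -->
    <xsl:template match="x:section[@id='ch01lev2sec01']/text()[preceding-sibling::x:header and not(preceding-sibling::x:p) and following-sibling::x:p]">
        <xsl:value-of select="normalize-space()" />
        <xsl:text>&#x2003;</xsl:text>
    </xsl:template>
</xsl:stylesheet>
```

In XSLT 2.0 the same effect can be had without a prefix by setting `xpath-default-namespace="http://www.w3.org/1999/xhtml"` on the `xsl:stylesheet` element.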
