My XML is being generated from a web form, and some users are inserting line breaks, which end up in the data as a literal \n, and broken entities, like &

I'm using some variables to convert and remove bad characters, but I don't know how to strip out these types of characters.

Here's the method I'm using to convert or strip out other bad characters. Let me know if you need to see the entire XSL. …

<xsl:variable name="smallcase" select="'abcdefghijklmnopqrstuvwxyz_aaea'" />
<xsl:variable name="uppercase" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ äãêÂ.,'" />
<xsl:variable name="linebreaks" select="'\n'" />
<xsl:variable name="nolinebreaks" select="' '" />

<xsl:value-of select="translate(Surname, $uppercase, $smallcase)"/>
<xsl:value-of select="translate(normalize-space(Office_photos), $linebreaks, $nolinebreaks)"/>

The text in the XML contains content like this:

<Office_photos>bn_1.jpg: Showing a little Red Sox Pride!&#13;\nLeft to right: 
 Tessa Michelle Summers, \nJulie Gross, Alexis Drzewiecki</Office_photos>

I'm trying to get rid of the \n characters inside the data.

Jim Maivald
  • I am not sure how translating from uppercase to lower case addresses the issue. Do you have a list of characters that you wish to keep - or conversely, a list of characters you want to discard? – michael.hor257k Feb 25 '14 at 15:35
  • I don't need to convert; the variable is working to convert alpha characters or remove "," and "." from the text. What I need is to remove `\n` characters. The variable ignores the "n" and only sees the slash "\". And variables only see the actual entities `&` but ignore broken ones `&amp;` – Jim Maivald Feb 25 '14 at 15:41
  • If you don't need to convert, then don't. Your code shows that you do. You say "*Here's the method I'm using to convert or strip out other bad characters.*" But that's **not** what you're doing. In any case, you haven't answered my question. – michael.hor257k Feb 25 '14 at 16:12
  • The problem I'm having is NOT with the conversion variable. That is working. I'm doing more than one thing in the XSLT. I only included it here to show that it DOES NOT work for stripping out `\n` characters. I need help getting rid of the line breaks and BROKEN entities. Yes, this XSLT was featured on another page, but this is a separate question. I was told a long time ago that separate questions should be on separate pages. – Jim Maivald Feb 25 '14 at 16:17
  • I modified the example to show the problem better. We are converting people's names from uppercase to lowercase and replacing foreign characters with standard English characters to create filenames. But inside the data there are line breaks that are ignored by the "translate" function – Jim Maivald Feb 25 '14 at 16:29
  • \n is not considered a single character in XML, but 2 characters, a backslash ('\') and an 'n'. You should use a replace function to remove '\n' from data. – Lingamurthy CS Feb 25 '14 at 16:54

1 Answer

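As Lingamurthy CS explains in the comments, \n is not treated as a single character in XML. It is simply parsed as two characters, a backslash and an n, without any special handling. Because translate() maps individual characters, it cannot match that two-character sequence as a unit.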
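
To illustrate why translate() is the wrong tool here, the following is a minimal sketch, assuming a made-up sample string and text output. Each character in the second argument is mapped to the character at the same position in the third, and characters with no counterpart are deleted, so the call below changes every backslash and every "n" independently.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text" />

   <xsl:template match="/">
      <!-- The sample string is made up for illustration. -->
      <!-- "\" maps to a space; "n" has no counterpart in the third argument, so every "n" is deleted. -->
      <xsl:value-of select="translate('Showing\nLeft to right', '\n', ' ')" />
      <!-- Result: Showig Left to right -->
   </xsl:template>
</xsl:stylesheet>
```

Note that the "n" in "Showing" is lost as well, which is why a string replace is needed rather than a character mapping.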

If this is literally what you want to change, though, then in XSLT 1.0 you will need to use a recursive template to replace the text (XSLT 2.0 has a replace function; XSLT 1.0 doesn't).
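
If an XSLT 2.0 processor happens to be available, the replace function mentioned above could be used roughly as follows. This is only a sketch: it assumes Office_photos is the document element and that text output is wanted. The second argument of replace() is a regular expression, so the backslash has to be escaped as '\\n'; a pattern of '\n' would match a real newline character instead.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text" />

   <!-- Assumes Office_photos is the document element, as in the sample data. -->
   <xsl:template match="/">
      <!-- '\\n' is a regular expression: an escaped backslash followed by the letter n. -->
      <!-- normalize-space() then collapses real whitespace such as carriage returns and line feeds. -->
      <xsl:value-of select="normalize-space(replace(Office_photos, '\\n', ' '))" />
   </xsl:template>
</xsl:stylesheet>
```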

A quick search on Stack Overflow finds one such template in the question "XSLT string replace".

To call this, instead of doing this:

<xsl:value-of select="translate(normalize-space(Office_photos), $linebreaks, $nolinebreaks)"/>

You would just do this

  <xsl:call-template name="string-replace-all">
     <xsl:with-param name="text" select="Office_photos" />
     <xsl:with-param name="replace" select="$linebreaks" />
     <xsl:with-param name="by" select="$nolinebreaks" /> 
  </xsl:call-template>

Try this XSLT

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output omit-xml-declaration="yes" indent="yes" />

   <xsl:variable name="linebreaks" select="'\n'" />
   <xsl:variable name="nolinebreaks" select="' '" />

   <xsl:template match="/">
      <xsl:call-template name="string-replace-all">
         <xsl:with-param name="text" select="Office_photos" />
         <xsl:with-param name="replace" select="$linebreaks" />
         <xsl:with-param name="by" select="$nolinebreaks" /> 
      </xsl:call-template>
   </xsl:template>

   <xsl:template name="string-replace-all">
     <xsl:param name="text" />
     <xsl:param name="replace" />
     <xsl:param name="by" />
     <xsl:choose>
       <xsl:when test="contains($text, $replace)">
         <xsl:value-of select="substring-before($text,$replace)" />
         <xsl:value-of select="$by" />
         <xsl:call-template name="string-replace-all">
           <xsl:with-param name="text" select="substring-after($text,$replace)" />
           <xsl:with-param name="replace" select="$replace" />
           <xsl:with-param name="by" select="$by" />
         </xsl:call-template>
       </xsl:when>
       <xsl:otherwise>
         <xsl:value-of select="$text" />
       </xsl:otherwise>
     </xsl:choose>
   </xsl:template>
</xsl:stylesheet>

(Credit to Mark Elliot who created the replace template)
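
The original call also wrapped the value in normalize-space(), which the call-template version drops. One possible way to keep both in XSLT 1.0 is to capture the template's result in a variable and normalize that. The sketch below assumes Office_photos is the document element and that text output is wanted; the variable name cleaned is hypothetical.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output method="text" />

   <xsl:template match="/">
      <!-- "cleaned" is a hypothetical name; it holds the text with each literal \n replaced by a space. -->
      <xsl:variable name="cleaned">
         <xsl:call-template name="string-replace-all">
            <xsl:with-param name="text" select="Office_photos" />
            <xsl:with-param name="replace" select="'\n'" />
            <xsl:with-param name="by" select="' '" />
         </xsl:call-template>
      </xsl:variable>
      <!-- Collapse real whitespace (carriage returns, line feeds, repeated spaces) afterwards. -->
      <xsl:value-of select="normalize-space($cleaned)" />
   </xsl:template>

   <!-- Same replace template as above, repeated so the sketch is self-contained. -->
   <xsl:template name="string-replace-all">
      <xsl:param name="text" />
      <xsl:param name="replace" />
      <xsl:param name="by" />
      <xsl:choose>
         <xsl:when test="contains($text, $replace)">
            <xsl:value-of select="substring-before($text, $replace)" />
            <xsl:value-of select="$by" />
            <xsl:call-template name="string-replace-all">
               <xsl:with-param name="text" select="substring-after($text, $replace)" />
               <xsl:with-param name="replace" select="$replace" />
               <xsl:with-param name="by" select="$by" />
            </xsl:call-template>
         </xsl:when>
         <xsl:otherwise>
            <xsl:value-of select="$text" />
         </xsl:otherwise>
      </xsl:choose>
   </xsl:template>
</xsl:stylesheet>
```

Replacing first and normalizing second means the spaces introduced by the replacement are collapsed too, so the sample Office_photos text should come out as a single line with single spaces between words.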

Tim C