
I have the following XML input:

<table>
    <tbody>
        <tr>
            <td style="width: 10px; margin-left: 10px;">td text</td>
            <td style="color: red; width: 25px; text-align: center; margin-left: 10px;">
                <span>span text</span>
            </td>
        </tr> 
    </tbody>
</table>

Please note that I have other nodes in the same document that should not be touched.

I want to remove certain attribute values from an element (in this case from td). Let's say I want to remove the width value within a style attribute. I don't know where in the style-attribute the width-value is set, it could be anywhere. The span in the td doesn't really matter (this and some other elements are there in the input).

I expect the output to be like this:

<table>
    <tbody>
        <tr>
            <td style="margin-left: 10px;">td text</td>
            <td style="color: red; text-align: center; margin-left: 10px;">
                <span>span text</span>
            </td>
        </tr> 
    </tbody>
</table>

I would prefer to use XSLT 1.0. I have not gotten the replace() function to work yet (but maybe I am doing something wrong).

I tried using this XSLT:

<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="td/@style">
        <xsl:attribute name="style">
            <xsl:value-of select="replace(., 'width:.[[:digit:]]+px;', '')" />
        </xsl:attribute>
    <xsl:apply-templates select="node()" /> 
</xsl:template> 

I am still a beginner in XSLT and this above doesn't work and I did not find a solution here. Also, I don't know the width value in advance, so I would need to match it with a regex (I used "width:.[[:digit:]]+px;") or something similar. Is there maybe an easier method that can remove any given property, so that I could remove text-align as well without having to think of a new regex?

I really hope that you can help me with this (surely easy) problem. Thank you in advance!

hasey
  • `replace` is XSLT/XPath 2.0 so I am not sure why you say "I prefer using XSLT1". As for a general solution, I think you would need to implement a CSS parser or maybe transformation first to handle a CSS style attribute value. So I don't think it is an easily solved problem, for instance a CSS `width` value is not restricted to the `px` unit. – Martin Honnen Aug 26 '16 at 10:01
  • "*this above doesn't work*" That's a useless description. You should report the exact result and reproduce verbatim any error messages received. – michael.hor257k Aug 26 '16 at 11:15

2 Answers


Let's say I want to remove the width value within a style attribute. I don't know where in the style-attribute the width-value is set, it could be anywhere.

Try:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<!-- identity transform -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="td/@style[contains(., 'width:')]">
    <xsl:attribute name="style">
        <xsl:value-of select="substring-before(., 'width:')" />
        <xsl:value-of select="substring-after(substring-after(., 'width:'), ';')" />
    </xsl:attribute>
</xsl:template> 

</xsl:stylesheet>

Note:

I want to remove certain attribute values from an element (in this case from td).

Actually, what you want is to remove certain properties from the style attribute. The above will work for removing a single property; if you want to remove more than one, you'll have to use a recursive template to do it.


Added:

Will there be an issue if the style contains border-width:1px, as this becomes border-?

Yes, this could be a problem. A possible solution would be:

<xsl:template match="td/@style">
    <xsl:variable name="style" select="concat(' ', .)" />
    <xsl:choose>
        <xsl:when test="contains($style, ' width:')">
            <xsl:attribute name="style">
                <xsl:value-of select="substring-before($style, ' width:')" />
                <xsl:value-of select="substring-after(substring-after($style, ' width:'), ';')" />
            </xsl:attribute>
        </xsl:when>
        <xsl:otherwise>
            <xsl:copy/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template> 

However, this assumes that the ; separator in the source document is always followed by a space (as it is in the given example). Otherwise it gets more complicated.
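For removing more than one property, or when the spacing around the ; separator is not uniform, the recursive template mentioned above could look roughly like the following. This is only a sketch under a few assumptions: the parameter name drop, its '|'-delimited format, and the template name filter-style are made up for illustration; property names are compared case-sensitively; and a ; inside a quoted value or url() is not handled.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<!-- Hypothetical parameter: property names to remove, each wrapped in '|'.
     Use e.g. '|width|text-align|' to remove both properties. -->
<xsl:param name="drop" select="'|width|'"/>

<!-- identity transform -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="td/@style">
    <!-- collect the declarations that survive the filter -->
    <xsl:variable name="kept">
        <xsl:call-template name="filter-style">
            <xsl:with-param name="rest" select="string(.)"/>
        </xsl:call-template>
    </xsl:variable>
    <xsl:attribute name="style">
        <!-- normalize-space() trims the trailing space left by the last declaration -->
        <xsl:value-of select="normalize-space($kept)"/>
    </xsl:attribute>
</xsl:template>

<!-- Processes one "name: value" declaration per call, then recurses on the remainder -->
<xsl:template name="filter-style">
    <xsl:param name="rest"/>
    <!-- first declaration: everything up to the next ';' (or the whole string if there is none) -->
    <xsl:variable name="decl" select="normalize-space(substring-before(concat($rest, ';'), ';'))"/>
    <xsl:variable name="name" select="normalize-space(substring-before($decl, ':'))"/>
    <!-- keep the declaration unless its exact property name is listed in $drop -->
    <xsl:if test="$decl != '' and not(contains($drop, concat('|', $name, '|')))">
        <xsl:value-of select="concat($decl, '; ')"/>
    </xsl:if>
    <xsl:if test="contains($rest, ';')">
        <xsl:call-template name="filter-style">
            <xsl:with-param name="rest" select="substring-after($rest, ';')"/>
        </xsl:call-template>
    </xsl:if>
</xsl:template>

</xsl:stylesheet>
```

Because the whole property name is compared, border-width is left alone when only width is listed. Note that if every property is removed, this sketch still writes an empty style attribute.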

michael.hor257k

Assuming you are using XSLT 2.0 (as replace is not supported in 1.0) you can use \d to match a digit in regular expressions, so you can write your pattern like so:

<xsl:value-of select="replace(., '(^| )width:\s*\d*px;?', '')" />

Note that \s* is used to match zero or more whitespace characters, so it allows for both width:10px and width: 10px. Also note that (^| ) is used to require either the start of the string or a space before width, so that properties like border-width are not matched.

If you wanted to handle units other than px you could do this:

<xsl:value-of select="replace(., '(^| )width:[^;]+;?', '')" />
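As an alternative to writing a new regex for each property, the style value can be split on ; and filtered by property name. The following is a minimal XSLT 2.0 sketch, not a full CSS parser: the variable name drop is an assumption for illustration, names are compared case-sensitively, and a ; inside a quoted value or url() would break it.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" encoding="UTF-8" indent="yes"/>

<!-- Hypothetical list of property names to remove; extend as needed,
     e.g. ('width', 'text-align') -->
<xsl:variable name="drop" select="('width')"/>

<!-- identity transform -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="td/@style">
    <!-- split on ';', skip blank tokens, drop declarations whose name is in $drop,
         then rejoin the remaining declarations with a single space -->
    <xsl:attribute name="style"
        select="string-join(
                    for $d in tokenize(., ';')
                                  [normalize-space(.)]
                                  [not(normalize-space(substring-before(., ':')) = $drop)]
                    return concat(normalize-space($d), ';'),
                    ' ')"/>
</xsl:template>

</xsl:stylesheet>
```

Since the comparison is against the complete property name, border-width is not affected when only width is listed.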

Read up on regular expressions at http://www.xml.com/pub/a/2003/06/04/tr.html.

Tim C
  • Thank you, but I think somehow my processor doesn't work with the replace() function as I get an error while checking "funcall(replace, [step("self", -1), literal-expr(width:\s*\d*px;), literal-expr()])".'" This is my declaration: Is my processor causing this issue? If so, I think I'd need a XSLT1.0 template :-( – hasey Aug 26 '16 at 10:06
  • It sounds like you are using XSLT 1.0. See http://stackoverflow.com/questions/25244370/how-can-i-check-which-xslt-processor-is-being-used-in-solr to find out what version you are using (The answer applies anywhere, not just in solr). – Tim C Aug 26 '16 at 10:08
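
One common way to check which XSLT version and processor are in use is to run a small stylesheet that prints the standard system properties. This is a generic sketch and may differ in detail from the linked answer; it can be applied to any XML input.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<!-- Prints the XSLT version the processor implements and the vendor name -->
<xsl:template match="/">
    <xsl:text>version: </xsl:text>
    <xsl:value-of select="system-property('xsl:version')"/>
    <xsl:text>&#10;vendor: </xsl:text>
    <xsl:value-of select="system-property('xsl:vendor')"/>
    <xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>
```

If the reported version is 1.0, replace() and other XPath 2.0 functions are not available.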