I posted a question yesterday with great success: the answer did not give me exactly what I needed, but it was more than enough to put me on the right path. I have run into another difficulty and am hoping for similar guidance.

I have a document with several different types of elements, some of which can be nested within others. I need to remove all tags and leave only the inner HTML whenever a certain element is present.

For example, if the element pnum is present, I need to take the whole element and remove any inner elements, leaving behind only the inner HTML.

input:

<li>
    <pnum>
     blah blah
    <linum>hello hello</linum>
    good bye
    <title>good morning</title>
    </pnum>
</li>

output:

<li>
    blah blah
    hello hello
    good bye
    good morning
</li>

I was able to do this using HtmlAgilityPack, but I had to traverse every node and the performance is not great. I am wondering whether there is a quicker XSLT transform I can perform on the document.

Thanks in advance!

1 Answer

I am not sure where you have taken the term innerHTML from, but since IE 4 it has usually included the markup, so your request to strip markup does not seem to be related to innerHTML.

As for XSLT, you can use

<xsl:template match="li[.//pnum]">
  <xsl:copy>
    <xsl:value-of select="."/>
  </xsl:copy>
</xsl:template>

to have any li element with a pnum descendant transformed to an li with only the text contents.
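
For context, a template like this is normally placed in a complete stylesheet. The following is a minimal sketch, assuming XSLT 1.0 and assuming that everything outside the matched li elements should be copied through unchanged; the identity template is an addition for illustration and is not part of the suggestion above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: all other nodes and attributes are copied as-is
       (the standard identity template). -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Any li with a pnum descendant is rebuilt as an li that holds
       only its string value, i.e. the concatenated descendant text. -->
  <xsl:template match="li[.//pnum]">
    <xsl:copy>
      <xsl:value-of select="."/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Note that `xsl:value-of` keeps the whitespace of the original text nodes, and that this sketch does not copy any attributes of the matched li; adding `<xsl:copy-of select="@*"/>` before the `xsl:value-of` would be one way to keep them.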

Martin Honnen
  • Sorry, innerHTML was just my way of trying to convey that I needed the text enclosed in the tags without the actual tags themselves. I will try your solution and report back shortly, thank you! – HelpMeWithXSLT Sep 22 '16 at 14:58
  • I tried this but it unfortunately does not seem to work at all. ` ` on the following does nothing, perhaps I made an error: ` blah ` – HelpMeWithXSLT Sep 22 '16 at 15:13
  • http://xsltransform.net/ejivdGY is an online sample with the input of your question and my suggestion and shows the wanted output. – Martin Honnen Sep 22 '16 at 15:16
  • And http://xsltransform.net/ejivdGY/1 is an adaptation that also processes the input from your comment. – Martin Honnen Sep 22 '16 at 15:48