I'm trying to transform the XML sample below to CSV, but I'm having difficulty matching nested elements that share the same name (Rule).

What is the XSLT transformation that can generate this structure?

File@path="filename1.txt" | Rule@id="3.1.6" | Message@severity="3" | Message@text="3480. ....."

File@path="filename1.txt" | Rule@id="3.5.19" | Message@severity="3" | Message@text="1281. ....."

File@path="filename2.txt" | Rule@id="3.1.6" | Message@severity="3" | Message@text="3480. ....."

File@path="filename2.txt" | Rule@id="3.5.3" | Message@severity="3" | Message@text="3219. ....."

The path would look like:

AnalysisData\dataroot type="per-file"\File\tree type="rules"\RuleGroup name="MISRA_C"\...\Rule id="[1-9]+\.[1-9]+\.[1-9]+"\Message

Input XML is:

<AnalysisData>
  <dataroot type="project">
  </dataroot>
  <dataroot type="per-file">
    <File path="filename1.txt">
      <Json>1.json</Json>
      <tree type="rules">
        <RuleGroup name="MISRA_C" total="2" active="2" >
          <Rule id="3" total="2" active="2" text="Mandatory" >
            <Rule id="3.1" total="1" active="1" text="Common" >
              <Rule id="3.1.6" total="1" active="1" text="Declarations and definitions" >
                <Message guid="qac-9.6.0-3480" total="1" active="1" severity="3" text="3480.  Object/function '%s', with internal linkage, has been defined in a header file." />
              </Rule>
            </Rule>
            <Rule id="3.5" total="1" active="1" text="MISRA Required Rules" >
              <Rule id="3.5.19" total="1" active="1" text="M3CM Rule-7.2 A &quot;u&quot; or &quot;U&quot; suffix shall be applied to all integer constants that are represented in an unsigned type" >
                <Message guid="qac-9.6.0-1281" total="1" active="1" severity="3" text="1281.  Integer literal constant is of an unsigned type but does not include a &quot;U&quot; suffix." />
              </Rule>
            </Rule>
          </Rule>
        </RuleGroup>
      </tree>
    </File>
    <File path="filename2.txt">
      <Json>2.json</Json>
      <tree type="rules">
        <RuleGroup name="CrossModuleAnalysis" total="11" active="11" >
          <Rule id="1" total="11" active="11" text="Maintainability" >
            <Rule id="1.1" total="11" active="11" text="CMA Declaration Standards" >
              <Message guid="rcma-2.0.0-1534" total="11" active="11" severity="2" text="1534.  The macro '%1s' is declared but not used within this project." />
            </Rule>
          </Rule>
        </RuleGroup>
        <RuleGroup name="MISRA_C" total="36" active="16" >
          <Rule id="3" total="20" active="0" text="Mandatory" >
            <Rule id="3.1" total="12" active="0" text="Common" >
              <Rule id="3.1.6" total="12" active="0" text="Declarations and definitions" >
                <Message guid="qac-9.6.0-3480" total="12" active="0" severity="3" text="3480.  Object/function '%s', with internal linkage, has been defined in a header file." />
              </Rule>
            </Rule>
            <Rule id="3.5" total="8" active="0" text="MISRA Required Rules" >
              <Rule id="3.5.3" total="8" active="0" text="M3CM Rule-2.1 A project shall not contain unreachable code" >
                <Message guid="qac-9.6.0-3219" total="8" active="0" severity="3" text="3219.  Static function '%s()' is not used within this translation unit." />
              </Rule>
            </Rule>
          </Rule>
          <Rule id="2" total="16" active="16" text="Minor" >
            <Rule id="2.1" total="16" active="16" text="Common" >
              <Rule id="2.1.15" total="16" active="16" text="Declarations and Definitions" >
                <Message guid="qac-9.6.0-3227" total="16" active="16" severity="2" text="3227.  The parameter '%s' is never modified and so it could be declared with the 'const' qualifier." />
              </Rule>
            </Rule>
          </Rule>
        </RuleGroup>
      </tree>
    </File>
  </dataroot>
</AnalysisData>
Flaviu

1 Answer

I usually suggest writing a template for the element that maps to one output line, and then using the separator attribute of xsl:value-of to join that line's values with the separator of your choice:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="3.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="#all"
  expand-text="yes">

  <xsl:output method="text"/>

  <xsl:template match="/" name="xsl:initial-template">
    <xsl:apply-templates 
      select="/AnalysisData/dataroot[@type = 'per-file']/File/tree/RuleGroup[@name = 'MISRA_C']//Rule[Message]"/>
  </xsl:template>
  
  <xsl:template match="Rule">
    <xsl:value-of select="ancestor::File/@path, @id, Message!(@severity, @text)" separator=" | "/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>

If you want to output each attribute value together with its element and attribute name, a function can be helpful:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="3.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:mf="http://example.com/mf"
  exclude-result-prefixes="#all"
  expand-text="yes">

  <xsl:output method="text"/>
  
  <xsl:function name="mf:line" as="xs:string*">
    <xsl:param name="atts" as="attribute()*"/>
    <xsl:sequence
      select="$atts ! (local-name(..) || '@' || local-name() || '=&quot;' || . || '&quot;')"/>
  </xsl:function>

  <xsl:template match="/" name="xsl:initial-template">
    <xsl:apply-templates 
      select="/AnalysisData/dataroot[@type = 'per-file']/File/tree/RuleGroup[@name = 'MISRA_C']//Rule[Message]"/>
  </xsl:template>
  
  <xsl:template match="Rule">
    <xsl:value-of select="mf:line((ancestor::File/@path, @id, Message!(@severity, @text)))" separator=" | "/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
Martin Honnen
  • I just found in a big input file that a rule can contain more than one message. Could you please tell me what I should change in the existing XSLTs? – Flaviu Oct 04 '21 at 09:15
  • @Flaviu, well, try on your own to adapt the suggested code, or at least give a clear description of what happens with the output if the rule has more than one message, do you want one line per message or one line per rule, which then has several columns for the messages? – Martin Honnen Oct 04 '21 at 09:24
  • Two messages appear per line (filename.txt;3.5.2;3;0654. [U] Using 'const' or 'volatile' in a function return type is undefined.;3;0914. [U] Source file does not end with a newline character.) and I want in this case two lines with same file name. That means one line per message with same file name. – Flaviu Oct 04 '21 at 10:48
  • As I said, write the template for the element you want to map to a line. So instead of `select="/AnalysisData/dataroot[@type = 'per-file']/File/tree/RuleGroup[@name = 'MISRA_C']//Rule[Message]"` try `select="/AnalysisData/dataroot[@type = 'per-file']/File/tree/RuleGroup[@name = 'MISRA_C']//Message"`, use `match="Message"` instead of `match="Rule"`, and adjust the selection of the "columns", e.g. `select="ancestor::File/@path, ../@id, @severity, @text"` or `select="mf:line((ancestor::File/@path, ../@id, @severity, @text))"`. – Martin Honnen Oct 04 '21 at 10:53
  • Works perfectly! Thanks again – Flaviu Oct 06 '21 at 06:56
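
For reference, the changes described in the last comment can be put together as one complete stylesheet. This is only a sketch assembled from the answer's second stylesheet and the comment's suggested `select` and `match` values; it has not been run against the larger input file mentioned in the comments. It reuses the `mf:line` helper and the `http://example.com/mf` namespace from the answer and assumes an XSLT 3.0 processor.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="3.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:mf="http://example.com/mf"
  exclude-result-prefixes="#all">

  <xsl:output method="text"/>

  <!-- Same helper as in the answer: formats each attribute as Element@name="value" -->
  <xsl:function name="mf:line" as="xs:string*">
    <xsl:param name="atts" as="attribute()*"/>
    <xsl:sequence
      select="$atts ! (local-name(..) || '@' || local-name() || '=&quot;' || . || '&quot;')"/>
  </xsl:function>

  <!-- Select every Message below the MISRA_C rule group, not every Rule -->
  <xsl:template match="/" name="xsl:initial-template">
    <xsl:apply-templates
      select="/AnalysisData/dataroot[@type = 'per-file']/File/tree/RuleGroup[@name = 'MISRA_C']//Message"/>
  </xsl:template>

  <!-- One output line per Message; the rule id now comes from the parent Rule -->
  <xsl:template match="Message">
    <xsl:value-of select="mf:line((ancestor::File/@path, ../@id, @severity, @text))" separator=" | "/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Because the template now matches Message rather than Rule, a Rule with several Message children should produce several lines that repeat the same File path and Rule id. For plain values without the element and attribute names, the comment's other variant, `select="ancestor::File/@path, ../@id, @severity, @text"`, can be used in the `xsl:value-of` instead of the `mf:line` call.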