1

I am on a red hat system and I have multiple XML files generated from various SOAP requests that are in a format that is not compatible with MySQL's LoadXML function. I need to load the data into MySQL tables. One table will be setup for each type of XML file, depending on the data received via the Soap XML API.

Sample format of one of the files is as this, but each file will have a different number of columns and different column names. I am trying to find a way to convert them to a compatible format in the most generic way possible since I will have to create any customized solution for each API request/response.

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
   <soap:Body>
      <dbd:DataRetrievalRequestResponse xmlns:dbd="dbd.v1">
         <DataObjects>
            <ObjectSelect>
               <mdNm>controller-ac</mdNm>
               <meNm>WALL-EQPT-A</meNm>
            </ObjectSelect>
            <DataInstances>
               <DataInstance>
                  <instanceId>DSS1</instanceId>
                  <Attribute>
                     <name>Name</name>
                     <value>DSS1</value>
                  </Attribute>
                  <Attribute>
                     <name>Operational Mode</name>
                     <value>mode-fast</value>
                  </Attribute>
                  <Attribute>
                     <name>Rate - Down</name>
                     <value>1099289</value>
                  </Attribute>
                  <Attribute>
                     <name>Rate - Up</name>
                     <value>1479899</value>
                  </Attribute>
                </DataInstance>
                <DataInstance>
                  <instanceId>DSS2</instanceId>
                  <Attribute>
                     <name>Name</name>
                     <value>DSS2</value>
                  </Attribute>
                  <Attribute>
                     <name>Operational Mode</name>
                     <value>mode-fast</value>
                  </Attribute>
                  <Attribute>
                     <name>Rate - Down</name>
                     <value>1299433</value>
                  </Attribute>
                  <Attribute>
                     <name>Rate - Up</name>
                     <value>1379823</value>
                  </Attribute>
                </DataInstance>
             </DataInstances>
          </DataObjects>
       </dbd:DataRetrievalRequestResponse>
    </soap:Body>
 </soap:Envelope>

Of course I want the data to be entered into a mysql table with column names 'id, Name, Group' rows for each unique instance

Name Operational Mode Rate - Down Rate - Up
DSS1 mode-fast 1099289 1479899
DSS2 mode-fast 1299433 1379823

Do I need to create an XSLT and preprocess this XML data from command line prior to running it to LoadXML to get it into a format that MySQL LoadXML function will accept? This would not be a problem, but I am not familiar with XSLT transformations.

Is there a way to reformat the above XML to straight CSV (preferred), or to another XML format that is compatible, such as the examples given in mysql documentation for loadxml?

<row>
  <field name='column1'>value1</field>
  <field name='column2'>value2</field>
</row>

I tried doing LOAD DATA INFILE and using ExtractValue function, but some of the values have spaces in them, and the delimiter for ExtractValue is hard coded to single-space. This makes it unusable as a workaround.

Danny Z
  • 113
  • 1
  • 6
  • The XML format allows for infinite variety of schemas and it is not possible to have a generic XSLT that would fit any possible input XML. Consider for example an XML that stores data about companies and their products; you would actually want to have 2 different XSLT stylesheets, one to import into the Companies table and another into Products. So even a customized XSLT could not fulfill both purposes, not to mention a generic one. As for the given example, it would be quite easy (provided it does have a single root element that's missing from your post). – michael.hor257k Jul 22 '22 at 20:59
  • Updated the head and tail of the xml sample and values are closer to actual to avoid any confusion – Danny Z Jul 22 '22 at 21:13
  • I expect some customization, but obviously the simpler the better. I don't know if going to XML is the right way to go just so I can import it to a database table. CSV would be the best intermediary format. – Danny Z Jul 22 '22 at 21:14
  • 1
    *"I expect some customization, but obviously the simpler the better."* We don't know how different the various responses you get are, so I cannot comment on that. If they all follow some common pattern, then you may be able to get by with minimal modifications, perhaps even none. As for XML vs. CSV, XSLT can produce either one. At this point I am not sure what your question is. If you're looking for someone to write your XSLT for you, that's not what we do here. – michael.hor257k Jul 22 '22 at 21:31
  • P.S. The edited response will error out because `` does not match ``. – michael.hor257k Jul 22 '22 at 21:34
  • fixed. It does seem to be what /can/ be done here. Looking for some help. The samples I'ves seen don't apply to the xml format I'm starting from. https://stackoverflow.com/questions/365312/xml-to-csv-using-xslt – Danny Z Jul 22 '22 at 23:32

1 Answers1

1

Your question is very general (which is fine!) so my answer is also quite general.

Firstly, it's certainly true that XSLT is an ideal generic tool for problems of this sort. I have absolutely no doubt that every one of your SOAP messages could be coerced into a suitable form, using an XSLT that's customised for each type of message, while still remaining structurally very similar, which is what you'd want if you're new to XSLT.

I'm not sure how familiar you are with XPath, XML, XML namespaces, etc, but I think the task here is simple enough to tackle, and if you do have any tricky XPath expressions to write you can always come back to StackOverflow and ask for help.

From what you've said it sounds like you're confident that each SOAP message can be mapped to a single table. I'm going to suggest an XSLT pattern that would be customisable for each type of SOAP message, where you have an xsl:for-each statement that iterates over each row, and within that you create a row element and populate it with fields.

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
   <!-- indent the output, for ease of reading -->
   <xsl:output indent="yes"/>
   <!-- process the document -->
   <xsl:template match="/">
      <!-- create the root element of the output -->
      <resultset>
         <!-- create each row of the output, by iterating over the
         repeating elements in the SOAP message -->
         <xsl:for-each 
            select="//DataInstance">
            <row>
               <!-- create each field -->
               <!-- This field is defined individually, and the value
                    is produced by evaluating the 'instanceId' xpath
                    relative to the current DataInstance -->
               <field name="id"><xsl:value-of select="instanceId"/></field>
               <!-- these field can be generated with a loop -->
               <xsl:for-each select="Attribute">
                  <field name="{name}"><xsl:value-of select="value"/></field>
               </xsl:for-each>
            </row>
         </xsl:for-each>
      </resultset>
   </xsl:template>
</xsl:stylesheet>

Result of this, run over your sample SOAP message:

<resultset>
   <row>
      <field name="id">DSS1</field>
      <field name="Name">DSS1</field>
      <field name="Operational Mode">mode-fast</field>
      <field name="Rate - Down">1099289</field>
      <field name="Rate - Up">1479899</field>
   </row>
   <row>
      <field name="id">DSS2</field>
      <field name="Name">DSS2</field>
      <field name="Operational Mode">mode-fast</field>
      <field name="Rate - Down">1299433</field>
      <field name="Rate - Up">1379823</field>
   </row>
</resultset>

If you can follow this general pattern, you should be able to write a custom XSLT for every kind of SOAP message in your collection. You will just need to modify the various XPath expressions in the stylesheet: //DataInstance means "every DataInstance" instanceId means "the instanceId that's a child of the current ("context") element. name means "the name element that's a child of the current element. value means "the value element that's a child of the current element.

In the example SOAP message you gave, the Attribute element maps to a field, so all those elements could be copied generically, with another xsl:for-each, but for your other documents you may have to just define each field element individually, as I did for the id element in my answer.

Conal Tuohy
  • 2,561
  • 1
  • 8
  • 15