
I need help parsing an XML response in Bash. Suppose the response looks like this (abridged to show only the information I need):

<?xml version="1.0" encoding="UTF-8"?>
<response>
    <submissions>
        <submission>
            <submission_id><![CDATA[90210]]></submission_id>
            <last_file_update_datetime><![CDATA[2017-06-18 02:47:14.39864+02]]></last_file_update_datetime>
        </submission>
        <submission>
            <submission_id><![CDATA[90211]]></submission_id>
            <last_file_update_datetime><![CDATA[2017-06-11 15:48:04.279135+02]]></last_file_update_datetime>
        </submission>
    </submissions>
</response>

For each <submission> block inside <submissions>, I want to extract the data into an array whose elements have this format:

{submission_id}#{last_file_update_datetime}#1

For example, the response above should produce the following when parsed:

90210#2017-06-18 02:47:14.39864+02#1
90211#2017-06-11 15:48:04.279135+02#1

How can I perform this in Bash?

Eduardo Perez
  • Just bash? Or bash+coreutils (like sed, awk, etc) ? – Jerry Jeremiah Jun 18 '17 at 21:29
  • Here are a couple other possibly duplicate questions: https://stackoverflow.com/questions/893585/how-to-parse-xml-in-bash https://stackoverflow.com/questions/2222150/extraction-of-data-from-a-simple-xml-file https://stackoverflow.com/questions/17333755/extract-xml-value-in-bash-script https://stackoverflow.com/questions/20248037/read-from-xml-to-bash – Jerry Jeremiah Jun 18 '17 at 21:34

1 Answer

In my experience, problems that involve reformatting XML are best handled with XSLT. I don't know if xsltproc is on your box, but if it is, here is some code that will get you what you want:

$ xsltproc stylesheet.xsl input.xml
90210#2017-06-18 02:47:14.39864+02#1
90211#2017-06-11 15:48:04.279135+02#1

$ cat stylesheet.xsl
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text" />

<!-- Drop whitespace-only text nodes so no blank or indented lines are emitted -->
<xsl:strip-space elements="*" />

<!-- One output line per submission: id#datetime#1 -->
<xsl:template match="submission">
  <xsl:value-of select="submission_id"/>#<xsl:value-of select="last_file_update_datetime"/>#1<xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>


$ cat input.xml
<?xml version="1.0" encoding="UTF-8"?>
<response>
    <submissions>
        <submission>
            <submission_id><![CDATA[90210]]></submission_id>
            <last_file_update_datetime><![CDATA[2017-06-18 02:47:14.39864+02]]></last_file_update_datetime>
        </submission>
        <submission>
            <submission_id><![CDATA[90211]]></submission_id>
            <last_file_update_datetime><![CDATA[2017-06-11 15:48:04.279135+02]]></last_file_update_datetime>
        </submission>
    </submissions>
</response>
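
To get those lines into a Bash array, as the question asks, one option is mapfile (Bash 4 or later). The sketch below is only an illustration: produce_records is a hypothetical stand-in that prints the two sample records in the requested format so the snippet runs on its own. In practice you would replace it with the xsltproc stylesheet.xsl input.xml call, assuming that command prints one record per line.

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for `xsltproc stylesheet.xsl input.xml`:
# prints one record per line in the id#datetime#1 format.
produce_records() {
  printf '%s\n' \
    '90210#2017-06-18 02:47:14.39864+02#1' \
    '90211#2017-06-11 15:48:04.279135+02#1'
}

# mapfile reads each input line into one array element;
# -t strips the trailing newline from every element.
mapfile -t submissions < <(produce_records)

# Walk the array and split each record on '#'.
for entry in "${submissions[@]}"; do
  IFS='#' read -r id updated flag <<< "$entry"
  printf 'id=%s updated=%s flag=%s\n' "$id" "$updated" "$flag"
done
```

Process substitution (< <(...)) is used instead of a pipe so that the array is filled in the current shell rather than in a subshell, where it would be lost once the pipeline ends.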
Mark