
I have an XML file with data that I need to extract with a Linux Bash script. For example, I need everything in the name and description tags, printed to the terminal in matching pairs (the first name next to the first description).

What is the best way to do this? I was able to do it in C# using XPath, but I can't do that using just a Linux Bash script, correct? I know I can get most of it done with awk, sed, grep, cut, etc., but I know I am not doing it properly. I am at a point where I have extracted the description data to one file and the names to another, but after a few hours I realized this is not the best or proper way.

The output should read like this:

  • name1 - description1
  • name2 - description2
  • name3 - (blank)
  • name4 - description3

Thank you!

If I can add any detail to make it clearer, please let me know! (Brain is a little fried from a long day at work)

Sample XML (simplified down to one instance):

<zabbix_export>
<templates>
    <template>
        <items>
            <item>
                <name>Available memory</name>
                <description>Available memory is defined as free+cached+buffers memory.</description>                  
            </item>
        </items>
    </template>
</templates>
</zabbix_export>

How it should read out:

<name>Available memory</name> - <description>Available memory is defined as free+cached+buffers memory.</description>
<name>Another Name</name> - <description>Another Description</description>
<name>Another Name</name> - <description>Another Description</description>
<name>Another Name</name> - <description>Another Description</description>
PolarisUser

3 Answers


command

xmlstarlet sel -t -m '//item/*' -c . -n input.xml |
    sed '2~2s/$/\n/' |
        awk -F'\n' -vRS='' '{print $1" - "$2}'

output

<name>Available memory</name> - <description>Available memory is defined as free+cached+buffers memory.</description>
<name>Available memory</name> - <description>Available memory is defined as free+cached+buffers memory.</description>
<name>Available memory</name> - <description>Available memory is defined as free+cached+buffers memory.</description>
kev
  • So, almost there!!! I got that to work properly, but some of my descriptions in my XML file are blank. Is there a way for it to skip the blank descriptions so it displays properly? – PolarisUser Jun 25 '12 at 22:35
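
One possible way to handle blank descriptions is to let xmlstarlet iterate over each `<item>` and print its two children itself, so the pairing no longer depends on line positions. This is a minimal sketch, assuming xmlstarlet is installed and that every item has at most one `<name>` and one `<description>`; `print_items` is a hypothetical helper name, not something from the answer above.

```shell
#!/usr/bin/env bash
# Sketch: print "name - description" for every <item>, one item per line.
# Assumes xmlstarlet is installed; print_items is a hypothetical helper name.
print_items() {
    # -m '//item'       iterate over each <item> element
    # -v 'name'         text of the <name> child (empty if absent)
    # -o ' - '          literal separator
    # -v 'description'  text of the <description> child (empty if blank or absent)
    # -n                newline after each item
    xmlstarlet sel -t -m '//item' -v 'name' -o ' - ' -v 'description' -n "$1"
}
```

Calling `print_items input.xml` prints the text without the surrounding tags, and an item with a blank description still gets its own line with nothing after the separator. If the tags should stay in the output, replacing each `-v` with `-c` should copy the elements instead of their text.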

You can launch the xpath command from Bash. It is installed with the Perl XML::XPath module and lets you run any XPath query and print the results to stdout very comfortably.

Jirka Hanika
  • Well, I have been trying to get it to work in Perl doing it that way, but I can't seem to get it to work (never used Perl before). I know it is possible to get it to print like this just using Bash Script. – PolarisUser Jun 25 '12 at 21:50
  • I really only want to use Linux Bash Script commands (not Perl or anything) – PolarisUser Jun 25 '12 at 21:52
  • @PolarisUser - and you get exactly that. `xpath` is a command-line utility that you run directly from Bash. You only need to know which package provides it in case it is not pre-installed on your system, which would surprise me. – Jirka Hanika Jun 25 '12 at 21:55
  • Okay, well I am at the point where I have all the data using XPath with Perl, but I can't seem to format it properly, like I posted :( – PolarisUser Jun 25 '12 at 21:57
  • @PolarisUser - You can follow this previous question (and its answer) and come back if you run into any other trouble: http://stackoverflow.com/questions/5629091/format-and-combine-output-of-xpath-in-bash – Jirka Hanika Jun 25 '12 at 22:04

If the xpath executable is not installed, you can install it like so:

cpan XML::XPath

That should download the necessary Perl module, and install the xpath binary.

Brian Minton