1

I'm seeking a simple method (utility, function, tool) for running boolean logic on data in XML format, most likely from a shell script. This has nothing to do with transforming XML or creating other documents; I only need simple logic decisions using the basic comparison operators <, >, =, !=.

Let's say we are given an XML file, 2012_04_21.xml, containing (among other things) the element <data>...<price>6.50</price>...</data>

My ideal tool would be along the lines of:

cooltool --input 2012_04_21.xml --eval "price <= 6.50"

returning either true or false, or nothing versus something, depending on the given logic.

grep works well for 'has' or '==' type operations. grep '<key>value</key>' 2012_04_21.xml prints either nothing or the matching line, which can be turned into a boolean.

But grep falls short for two reasons:

  1. It cannot do numeric comparisons such as price > 5.00.
  2. It cannot cope with hierarchies such as data/price > 5.00.

XPath logic is completely adequate, but I'm struggling to come up with a way of harnessing it in this situation.

xsltproc mylogic.xsl 2012_04_21.xml

Hmm, perhaps a combination such as xsltproc mylogic.xsl 2012_04_21.xml | grep true

I'll give that a whirl.
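
A minimal sketch of that xsltproc and grep combination. The stylesheet content is an assumption (the original mylogic.xsl is not shown), and price_check is a made-up helper name:

```shell
#!/bin/bash
# Hypothetical stand-in for mylogic.xsl: prints "true" or "false".
# Inside the select attribute, "<" has to be written as &lt;.
xsl=$(mktemp)
trap 'rm -f "$xsl"' EXIT
cat > "$xsl" << 'EOF'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:value-of select="boolean(//price[. &lt;= 6.50])"/>
  </xsl:template>
</xsl:stylesheet>
EOF

# price_check FILE (hypothetical helper): exit 0 when the stylesheet prints "true".
price_check() {
    xsltproc "$xsl" "$1" 2>/dev/null | grep -q true
}

if price_check 2012_04_21.xml; then
    echo "true"
else
    echo "false"
fi
```

Note that a missing or malformed input file also ends up in the "false" branch here, because grep then sees no output at all.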

Any other ideas are welcome.

Gabe Rainbow

2 Answers

3

This sounds like something you can do with a simple XPath query. I'm not sure how complex your source files are, but xmllint should be able to handle it:

xmllint --xpath "boolean(//price[text()<=6.50])" xmlfile.xml

You could write a utility script of your own. Note that my XPath query assumes that:

  1. Your XML files do not have a fixed structure
  2. Your XML files can contain repeated <price> tags

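#!/bin/bash
# Usage: ./ingest.sh FILE LIMIT
# Prints "true" if any <price> in FILE is <= LIMIT, otherwise "false".
xmllint --xpath "boolean(//price[text()<=$2])" "$1"

Run it as:

./ingest.sh xmlfile.xml 6.50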
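
To branch on the result in a shell script, the printed true/false can be mapped onto an exit status. This is only a sketch, assuming an xmllint build that supports --xpath; the helper name xml_test is made up for illustration:

```shell
#!/bin/bash
# xml_test FILE EXPR (hypothetical helper, not part of xmllint):
# exit 0 if the XPath expression EXPR is true for FILE, 1 if it is false,
# 2 if xmllint itself failed (missing file, malformed XML, no --xpath support).
xml_test() {
    local file=$1 expr=$2 result
    result=$(xmllint --xpath "boolean($expr)" "$file" 2>/dev/null) || return 2
    [ "$result" = "true" ]
}

# Example: branch on whether any <price> is at most 6.50.
if xml_test xmlfile.xml "//price[text()<=6.50]"; then
    echo "match"
else
    echo "no match (or xmllint failed)"
fi
```

Keeping the failure case on a separate exit status lets a caller tell "the condition is false" apart from "the file could not be evaluated".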

There are existing questions [1, 2, 3] you might want to look at if you would rather use grep.

[1] How to find information inside a xml tag using grep?
[2] How to parse XML in Bash?
[3] How to (e) grep XML for certain tag content?

lightonphiri
  • That's exactly what I'm seeking, and similar to the solution that used a standard XPath doc generating 'true' or 'false' within queries. Yours is better for these trivial cases since no extra doc is needed. But my damn xmllint has no --xpath. Thanks for the other links. – Gabe Rainbow Jan 01 '13 at 20:10
  • Seems like the xpath utility can do it too. – Gabe Rainbow Jan 01 '13 at 20:19
  • @user1869322, You probably have an older version of `libxml2`; check with `xmllint --version`. You might want to update it, if you haven't done so already. – lightonphiri Jan 01 '13 at 20:25
1

Given the following data

$ cat data.xml
<data>
    <price>10.50</price>
    <price>5.50</price>
</data>

Option 1: One liner xpath check

The following xmllint command prints the <price> elements that match the condition:

$ echo "cat //price[text()<=6.50]" | xmllint --shell data.xml | grep "<price>" && echo "found"
<price>5.50</price>
found

The exit code of the grep command can then be used in a shell script to test whether any <price> element matched the condition.
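
As a rough sketch of using that exit code, the pipeline can be wrapped in a function and tested with if. The helper name has_price_at_most is hypothetical, and grep -q is used so that only the exit status is produced:

```shell
#!/bin/bash
# has_price_at_most FILE LIMIT (hypothetical helper): exit 0 when at least
# one <price> element in FILE has a value <= LIMIT.
has_price_at_most() {
    echo "cat //price[text()<=$2]" | xmllint --shell "$1" 2>/dev/null | grep -q "<price>"
}

if has_price_at_most data.xml 6.50; then
    echo "found"
else
    echo "not found"
fi
```

The status of a pipeline is that of its last command, so the function succeeds exactly when grep sees a matching <price> line.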

Option 2: Generate an XML schema

The following shell script generates an XML schema that checks the entire document and includes a range restriction on the values of the price tag:

#!/bin/bash

LIMIT=$1

cat << EOF > data.xsd
<xsd:schema version="1.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="data">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element maxOccurs="unbounded" name="price">
            <xsd:simpleType>
                <xsd:restriction base="xsd:decimal">
                    <xsd:maxInclusive value="$LIMIT"/>
                </xsd:restriction>
            </xsd:simpleType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
EOF

xmllint --schema data.xsd data.xml

Runs as follows:

$ ./validate.sh 6.5
<?xml version="1.0"?>
<data>
    <price>10.50</price>
    <price>5.50</price>
</data>
data.xml:2: element price: Schemas validity error : Element 'price': [facet 'maxInclusive'] The value '10.50' is greater than the maximum value allowed ('6.5').
data.xml:2: element price: Schemas validity error : Element 'price': '10.50' is not a valid value of the local atomic type.
data.xml fails to validate
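
If only a yes/no answer is needed, the exit status of xmllint can be tested directly instead of reading the messages. A minimal sketch, reusing the schema above; the helper name all_prices_at_most is made up, and --noout suppresses the echoed document:

```shell
#!/bin/bash
# all_prices_at_most FILE LIMIT (hypothetical helper): exit 0 only when FILE
# validates against a generated schema that caps every <price> at LIMIT.
all_prices_at_most() {
    local file=$1 limit=$2 xsd status
    xsd=$(mktemp) || return 2
    cat > "$xsd" << EOF
<xsd:schema version="1.0" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="data">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element maxOccurs="unbounded" name="price">
          <xsd:simpleType>
            <xsd:restriction base="xsd:decimal">
              <xsd:maxInclusive value="$limit"/>
            </xsd:restriction>
          </xsd:simpleType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
EOF
    # xmllint exits with a nonzero status when validation fails.
    xmllint --noout --schema "$xsd" "$file" 2>/dev/null
    status=$?
    rm -f "$xsd"
    return "$status"
}

if all_prices_at_most data.xml 6.5; then
    echo "all prices within limit"
else
    echo "limit exceeded (or file invalid)"
fi
```

Unlike the XPath check in Option 1, which succeeds when any price matches, this succeeds only when every price is within the limit.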
Mark O'Connor