-1

Source XML

<?xml version='1.0' encoding='UTF-8'?>
<ProcessType xmlns:xmi="http://www.omg.org/XMI" xmi:version="2.0" defaultContext="Default">
    <node componentName="tRedshiftRow" componentVersion="0.102" offsetLabelX="0" offsetLabelY="0" posX="-32" posY="96">
    <elementParameter field="TECHNICAL" name="QUERYSTORE:QUERYSTORE_TYPE" value="BUILT_IN"/>
    <elementParameter field="TEXT" name="DBNAME" value="&quot;&quot;"/>
    <elementParameter field="TEXT" name="SCHEMA_DB" value="&quot;&quot;"/>
    <elementParameter field="MEMO_SQL" name="QUERY" value="&quot;DELETE FROM schema.tablename;&quot;"/>
    </node>
</ProcessType>

I want to get the DELETE statement only where tag is "QUERY", and write it in a text file.

Expected output : DELETE FROM schema.tablename;

I was trying the following way, which obviously didn't work out !

from lxml import etree, objectify
import xml.etree.ElementTree as ET

def convert_xml_to_comp():
    metadata = 'source.xml'
    parser = etree.XMLParser(remove_blank_text=True)
    tree = etree.parse(metadata, parser)
    root = tree.getroot()
    for elem in root.getiterator():
        # print(elem)
        i = elem.tag.find('}')
        if i >= 0:
            elem.tag = elem.tag[i+1 :]
    objectify.deannotate(root, cleanup_namespaces=True)
    tree.write('done.xml', pretty_print=True, xml_declaration=True, encoding='UTF-8')


tree = ET.parse('done.xml')
root = tree.getroot()

def get_sql_text():

    file = open( "newdelete.txt", "w")
    for root in tree.getroot():
        ### Get the elements' names ###
        for elementParameter in root.iterfind('elementParameter[@name="UNIQUE_NAME"]') :
                        name=elementParameter.get('value')
                        ### Get the elements' name and SQL ###
                        for elementParameter in root.iterfind('elementParameter[@name="QUERY"]') :
                            #print (root.attrib)
                            val=elementParameter.get('value')
                            print(root.find('val[@value="DELETE FROM schema.tablename;"]'))
    file.close() 
get_sql_text()
if __name__ == '__main__':
    convert_xml_to_comp()
larsks
  • 277,717
  • 41
  • 399
  • 399

1 Answers1

4

You do this all in a just a couple of statements using an xpath query. Something like:

>>> from lxml import etree
>>> doc = etree.parse(open('data.xml'))
>>> query = doc.xpath('//elementParameter[@name="QUERY"]')[0].get('value')
>>> print(query)
"DELETE FROM schema.tablename;"

This says "find all the elementParameter elements with name="QUERY" and then return the value of the value attribute of the first one.


To select just those elements that contain "DELETE" in their value attribute, use the contains() function:

>>> doc.xpath('//elementParameter[@name="QUERY" and contains(@value, "DELETE")]')
larsks
  • 277,717
  • 41
  • 399
  • 399
  • That worked in fixed scenario. So in some scenarios, I have multiple name="QUERY" which consists of different SQL query. Instead of providing fixed array position[0], how can I search the value only with delete statement. – AWSGeekCoder Jan 21 '20 at 15:41
  • @AWSGeekCoder - Try adding a predicate that uses `contains()`. Example: `//elementParameter[@name="QUERY"][contains(@value,'DELETE')]` – Daniel Haley Jan 21 '20 at 15:43
  • 1
    @AWSGeekCoder There is code **in this answer** that shows you how to print that value. – larsks Jan 21 '20 at 16:29
  • @AWSGeekCoder - You only need to update the XPath in the original answer larsks gave. So: `query = doc.xpath('//elementParameter[@name="QUERY"][contains(@value,'DELETE')]')[0].get('value')` (and then `print(query)`) or you can use the second xpath larsks gave. They are equivalent. – Daniel Haley Jan 21 '20 at 16:37
  • https://stackoverflow.com/questions/2893551/case-insensitive-matching-in-xpath – larsks Jan 21 '20 at 21:12
  • The solutions show the appropriate xpath expression for making case-insensitive matches. – larsks Jan 22 '20 at 11:52