
I'm using Python 3.3 in Eclipse with the PyDev plugin on Windows 7.

I need to parse an XML file using XPath and lxml. A static XPath expression works, but when I build the expression from variables it doesn't.

If I use this code:

from lxml import etree

# fullpath is the path to the XML file being parsed
xml = etree.parse(fullpath).getroot()
tree = etree.ElementTree(xml)

# prefix -> namespace URI mapping used by the XPath expression
nsmap = {'xis': 'http://www.xchanging.com/ACORD4ALLEDI/1',
         'ns': 'http://www.ACORD.org/standards/Jv-Ins-Reinsurance/1'}

p = tree.xpath('//xis:Line', namespaces=nsmap)
print(p)
for e in p:
    print(e.tag, e.text)

it works as I want; print(p) returns

 [<Element {http://www.xchanging.com/ACORD4ALLEDI/1}LloydsProcessingCode at 0x2730350>]

but if I change it to:

xml = etree.parse(fullpath).getroot()
tree = etree.ElementTree(xml)

nsmap = {'xis' : 'http://www.xchanging.com/ACORD4ALLEDI/1',
         'ns' : 'http://www.ACORD.org/standards/Jv-Ins-Reinsurance/1' }
header = 'Jv-Ins-Reinsurance'
ns = 'xis:'
path = "'//" + ns + header + "'"    
p = tree.xpath('%s' % path, namespaces=nsmap)
print ('p = %s' % p)
for e in p:
    print(e.tag, e.text)

the print call outputs:

p = //xis:Jv-Ins-Reinsurance

and I get an error: AttributeError: 'str' object has no attribute 'tag'.

How can I do this?

Thanks

zhangyangyu
user2565150

2 Answers


Can you try removing the single quotes? I think your path variable has one level of quoting too many. I would just use path = "//" + ns + header.

Emmanuel
  • Because it seems EASIER TO DEBUG, I like the general string-building approach of this solution better than the approach I found at http://stackoverflow.com/questions/16285816/lxml-html-parsing-with-xpath-and-variables – Love and peace - Joe Codeswell Feb 05 '15 at 19:46

You are building a string that contains literal quote characters. You don't need them; omit the ' characters.

path = "//" + ns + header
p = tree.xpath(path, namespaces=nsmap)

or use string formatting:

path = "//{}{}".format(ns, header)
p = tree.xpath(path, namespaces=nsmap)

Your original version was the equivalent of:

path = "'//xis:Jv-Ins-Reinsurance'"

(note the extra single quote characters).
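To see why the extra quotes matter: an XPath expression wrapped in single quotes is a string literal, so lxml evaluates it to a string instead of selecting elements, and looping over that string yields single characters with no .tag attribute. Here is a minimal sketch of the difference. The inline XML document and its root element are made up for illustration; only the namespace URI and the element name come from the question.

```python
from lxml import etree

# Hypothetical minimal document; only the namespace URI and the
# element name Jv-Ins-Reinsurance are taken from the question.
XML = (
    b'<xis:root xmlns:xis="http://www.xchanging.com/ACORD4ALLEDI/1">'
    b'<xis:Jv-Ins-Reinsurance>payload</xis:Jv-Ins-Reinsurance>'
    b'</xis:root>'
)
tree = etree.ElementTree(etree.fromstring(XML))
nsmap = {'xis': 'http://www.xchanging.com/ACORD4ALLEDI/1'}

ns = 'xis:'
header = 'Jv-Ins-Reinsurance'

# With the embedded quotes the expression is an XPath string literal,
# so the result is a string rather than a list of elements.
quoted = tree.xpath("'//" + ns + header + "'", namespaces=nsmap)

# Without them the expression is a location path and selects elements.
unquoted = tree.xpath("//" + ns + header, namespaces=nsmap)

print(type(quoted).__name__, quoted)
for e in unquoted:
    print(e.tag, e.text)
```

The first call should give back the path text itself as a string-like value, which matches the "p = //xis:Jv-Ins-Reinsurance" output in the question, while the second returns a list containing the matching element.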

Martijn Pieters