My goal was to retrieve all nodes that contain a specific text.
1- I can retrieve nodes that contain some text with the following query:
[node for node in root.xpath('//*[contains(.,"Carte de chaleur")]') ]
Out[62]:
[<Element workbook at 0x1818bc76e88>,
<Element worksheets at 0x1819b886dc8>,
<Element worksheet at 0x1819c156488>,
<Element layout-options at 0x1819c1564c8>,
<Element title at 0x1818e9509c8>,
<Element formatted-text at 0x1819c156c48>,
<Element run at 0x1818e955048>,
<Element worksheet at 0x1819c156a88>,
<Element layout-options at 0x1819c156fc8>,
<Element title at 0x1818e9508c8>,
<Element formatted-text at 0x1819c1565c8>,
<Element run at 0x1818e955088>]
But when I checked, only 2 of those elements actually contain the specific text themselves:
[node for node in root.xpath('//*[contains(.,"Carte de chaleur")]') if node.text and "Carte de chaleur" in node.text]
Out[66]: [<Element run at 0x1818e955048>, <Element run at 0x1818e955088>]
In fact, when I look at the path of one of these run nodes, I can see that 'workbook', 'worksheets', etc. are all its ancestor nodes:
run_node
Out[71]: <Element run at 0x1818e955048>
tree.getpath(run_node)
Out[72]: '/workbook/worksheets/worksheet[3]/layout-options/title/formatted-text/run[1]'
So why does this XPath query return all the ancestor nodes of the nodes I am looking for (which are just the 2 run nodes)?
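A minimal sketch of what seems to be going on, using a hypothetical miniature of the workbook (the XML below is made up for illustration, not my real file). In XPath 1.0, `.` inside `contains()` is the element's string-value, which is the text of all its descendants joined together, so every ancestor of a matching run also matches. Testing the element's own `text()` nodes instead appears to narrow the result:

```python
from lxml import etree

# Hypothetical miniature of the workbook structure, for illustration only.
xml = (
    "<workbook><worksheets><worksheet><layout-options><title>"
    "<formatted-text><run>Carte de chaleur</run></formatted-text>"
    "</title></layout-options></worksheet></worksheets></workbook>"
)
root = etree.fromstring(xml)

# "." is the element's string-value: the text of all its descendants joined.
via_string_value = [
    el.tag for el in root.xpath('//*[contains(., "Carte de chaleur")]')
]

# text() only selects text nodes that are direct children of the element.
via_own_text = [
    el.tag for el in root.xpath('//*[text()[contains(., "Carte de chaleur")]]')
]

print(via_string_value)  # every ancestor plus the run itself
print(via_own_text)      # only the run
```
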
2- If I want the nodes that have an attribute containing a specific text, I run this query:
root.xpath('//@*[contains(.,"bold")]/..')
Out[86]:
[<Element format at 0x18199f56948>,
<Element format at 0x18199f56148>]
(This makes sense: I want the elements that carry such an attribute, so I select the parent of each matching attribute node.)
Very strangely, this query does not produce the same result:
root.xpath('//*[contains(@*,"bold")]')
Yet, as I understand it, this last one means: "take any descendant element of the root that has any attribute containing the text "bold"", which should be the same as the preceding query.
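A possible explanation, sketched on hypothetical `format` elements (the attribute names `weight`, `size` and `id` are invented here): in XPath 1.0, `contains()` converts its first argument to a string, and converting a node-set to a string only uses its first node. So `contains(@*, "bold")` would test a single attribute per element, not all of them. Which attribute counts as "first" is implementation-dependent, so I am not asserting the exact result of that query here. Moving the test inside a predicate on `@*` checks each attribute in turn:

```python
from lxml import etree

# Hypothetical data: "bold" is the first attribute on one element
# and the second attribute on another.
root = etree.fromstring(
    "<styles>"
    '<format weight="bold" size="12" id="a"/>'
    '<format size="12" weight="bold" id="b"/>'
    '<format size="10" id="c"/>'
    "</styles>"
)

# Original working query: matching attribute nodes, then their parents.
parents_of_attr = [
    el.get("id") for el in root.xpath('//@*[contains(., "bold")]/..')
]

# Same intent written as an element predicate: each attribute is tested.
any_attr = [
    el.get("id") for el in root.xpath('//*[@*[contains(., "bold")]]')
]

# Node-set converted to a string: only one attribute per element is tested.
first_attr_only = [
    el.get("id") for el in root.xpath('//*[contains(@*, "bold")]')
]

print(parents_of_attr, any_attr, first_attr_only)
```
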
3- Can I retrieve the nodes whose attributes have given values, using several variables?
For one variable I could do:
root.xpath('//*[@name=$var]', var="[Petal_length]")
But is there a way to do something like:
root.xpath('//*[@name=$var1]//title[@format=$var2]', var1="[Petal_length]", var2="bold")
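For reference, a small self-contained sketch of the multi-variable form. As far as I can tell, lxml's `xpath()` takes XPath variables as keyword arguments, and passing several at once works the same way as passing one. The XML is hypothetical (my real `title` elements have no `format` attribute):

```python
from lxml import etree

# Hypothetical data, only to exercise two XPath variables at once.
root = etree.fromstring(
    "<worksheets>"
    '<worksheet name="[Petal_length]">'
    '<title format="bold">A</title>'
    '<title format="italic">B</title>'
    "</worksheet>"
    '<worksheet name="[Sepal_width]">'
    '<title format="bold">C</title>'
    "</worksheet>"
    "</worksheets>"
)

# Each keyword argument becomes an XPath variable of the same name.
matches = root.xpath(
    "//*[@name=$var1]//title[@format=$var2]",
    var1="[Petal_length]",
    var2="bold",
)
texts = [el.text for el in matches]
print(texts)  # ['A']
```

Passing the values as variables also avoids having to escape quotes or brackets inside the XPath string itself.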