
I searched online but couldn't find a working answer to my question. I need to change an XML attribute value from size="10.439" to size="10.238" wherever it occurs in the file. My code so far is:

import lxml.etree as etree
import re
parser = etree.XMLParser(remove_blank_text=True)
tree = etree.parse('fe3.xml', parser)
re.sub(r'size="10.439"','size="10.238"', tree)

But it doesn't work. What do I have to do to make it work?

If it helps, the size attribute is on the text tag of the XML, like this:

<pages>
<page>
<textbox>
<text size = "10.439"> hello
</text>
</textbox>
</page>
</pages>
Wiktor Stribiżew
Anna
  • You do not need any regex here. `re.sub(r'size="10.439"','size="10.238"', tree)` is the same as `tree.replace(r'size="10.439"','size="10.238"')`. Note you did not assign the value to a variable after replacing. – Wiktor Stribiżew May 07 '20 at 13:46
  • I get AttributeError: 'lxml.etree._ElementTree' object has no attribute 'replace' – Anna May 07 '20 at 13:48
  • So that was what you meant by "does not work". You can only use a regex with a string. You need to get the attributes you need with the XML parser and set the values with what you need, rather than using a regex against an XML structure. – Wiktor Stribiżew May 07 '20 at 13:52
  • I'm new to XML, I don't know how to do it, since I have to change it under condition – Anna May 07 '20 at 13:53
  • See [here](https://stackoverflow.com/questions/8171146/python-lxml-modify-attributes) for an example. – Wiktor Stribiżew May 07 '20 at 13:55
  • I see but how am I supposed to learn xpath from scratch? – Anna May 07 '20 at 13:57
  • @Anna - The [xpath tag info page](https://stackoverflow.com/tags/xpath/info) has some suggestions on XPath tutorials/online training. If you're working with XML, XPath is a must have skill in my opinion. – Daniel Haley May 07 '20 at 16:04
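
The approach suggested in the comments (find the elements with the parser, then set the attribute) can be sketched as follows. This is a minimal sketch, not the commenters' own code: it assumes the standard library's xml.etree.ElementTree instead of lxml so that it runs without extra dependencies, the sample XML from the question is inlined, and `replace_size` is a hypothetical helper name. The `iter()`, `get()` and `set()` calls used here are also available on lxml elements.

```python
import xml.etree.ElementTree as ET

# Sample document from the question, inlined so the sketch is self-contained.
XML = """<pages>
<page>
<textbox>
<text size = "10.439"> hello
</text>
</textbox>
</page>
</pages>"""


def replace_size(root, old="10.439", new="10.238"):
    """Set size=new on every element whose size attribute equals old.

    Hypothetical helper; returns the number of elements changed.
    """
    changed = 0
    for elem in root.iter():  # walk every element in the tree
        if elem.get("size") == old:  # the condition: only the old value
            elem.set("size", new)
            changed += 1
    return changed


root = ET.fromstring(XML)
count = replace_size(root)
result = ET.tostring(root, encoding="unicode")
print(result)
```

To apply this to a file, one option is to parse it with `ET.parse('fe3.xml')`, call `replace_size(tree.getroot())`, and save with `tree.write(...)`. Comparing attribute values as strings sidesteps both regex escaping and floating-point comparison.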

2 Answers

1

My dirty solution:

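import lxml.etree as etree
tree = etree.parse('fe3.xml', etree.XMLParser(remove_blank_text=True))
# tostring() returns bytes, so replace with byte literals; the result is bytes, not a tree
xml_bytes = etree.tostring(tree).replace(b'size="10.439"', b'size="10.238"')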
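
Two caveats about the byte-replacement route are worth sketching. The example below is an illustration under assumptions, not part of the original answer: it uses the standard library's xml.etree.ElementTree rather than lxml, and the inlined `raw` bytes stand in for the contents of fe3.xml. It shows that a literal replace on the raw file text can miss `size = "10.439"` written with spaces, whereas serializing the parsed tree first normalizes the attribute, and that the patched result has to be parsed again if a tree is needed afterwards.

```python
import xml.etree.ElementTree as ET

# Stand-in for the raw file contents (note the spaces around '=').
raw = b'<pages><page><textbox><text size = "10.439"> hello\n</text></textbox></page></pages>'

# A literal replace on the raw bytes finds nothing because of the spaces.
untouched = raw.replace(b'size="10.439"', b'size="10.238"')

# Serializing the parsed tree writes the attribute as size="10.439",
# so the same literal replace now matches.
serialized = ET.tostring(ET.fromstring(raw))
patched = serialized.replace(b'size="10.439"', b'size="10.238"')

# The patched result is bytes, so parse it again to get a tree back.
root = ET.fromstring(patched)
sizes = [el.get("size") for el in root.iter("text")]
print(sizes)
```

The replacement is still textual, so it would also rewrite any matching string that happens to appear in element text; setting the attribute through the parser avoids that.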
haja
0

I'm here again :)

from simplified_scrapy import SimplifiedDoc
html = '''
<pages>
<page>
<textbox>
<text size = "10.439"> hello
</text>
</textbox>
</page>
</pages>
'''
doc = SimplifiedDoc(html)
text = doc.select('text')
if text.size=='10.439':
  text.setAttr('size','10.238')
print(doc.html)

Result:

<pages>
<page>
<textbox>
<text size="10.238"> hello
</text>
</textbox>
</page>
</pages>
dabingsou