I've tried searching various questions and answers here on StackOverflow and cannot find a solution that works for my situation, so here is my issue.
I have 3 XML files that I am attempting to compare. The problem is reading the "Main" XML file one section at a time while keeping each section's information together. For example, I want to keep all the fields for the row with id 1 together and be able to use each one within the script.
Each row in this XML file can have any number of fields between its tags, but I only need 5 specific fields. I am fairly new to Python and extremely new to using Python to read anything more than a text file, so any help would be appreciated.
A sample of the xml is below.
Main XML:
<?xml version="1.0" encoding="ISO-8859-1" ?>
<resultset table="foo_bar">
<row>
<field name="id">1</field>
<field name="name">foo 1</field>
<field name="item 1">bar 1</field>
<field name="item 2">Accepted</field>
<field name="item 3">Accepted</field>
</row>
<row>
<field name="id">2</field>
<field name="name">foo 2</field>
<field name="item 1">bar 2</field>
<field name="item 2">Declined</field>
<field name="item 3">Accepted</field>
</row>
<row>
<field name="id">3</field>
<field name="name">foo 3</field>
<field name="item 1">bar 3</field>
<field name="item 2">Accepted</field>
<field name="item 3">Declined</field>
</row>
.....Continues
</resultset>
I have tried following the various answers for similar questions, but have had no success thus far.
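To make the goal concrete, here is a minimal sketch of the grouping described above: each row becomes a dictionary of its wanted fields, keyed by the row's id. It uses the standard library's `xml.etree.ElementTree` rather than lxml, and the names `SAMPLE`, `WANTED`, and `rows_by_id` are made up for illustration. The inline string is a trimmed copy of the sample; a real script would presumably parse the file instead.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample XML; a real script would parse the file on disk.
SAMPLE = """<resultset table="foo_bar">
<row>
<field name="id">1</field>
<field name="name">foo 1</field>
<field name="item 1">bar 1</field>
<field name="item 2">Accepted</field>
<field name="item 3">Accepted</field>
</row>
<row>
<field name="id">2</field>
<field name="name">foo 2</field>
<field name="item 1">bar 2</field>
<field name="item 2">Declined</field>
<field name="item 3">Accepted</field>
</row>
</resultset>"""

# The five field names of interest (hypothetical constant, taken from the sample).
WANTED = ("id", "name", "item 1", "item 2", "item 3")


def rows_by_id(root, wanted=WANTED):
    """Return {id text: {field name: field text}} with only the wanted fields."""
    result = {}
    for row in root.iter("row"):
        # Collect this row's fields so they stay together in one record.
        record = {
            field.get("name"): field.text
            for field in row.iter("field")
            if field.get("name") in wanted
        }
        row_id = record.get("id")
        if row_id is not None:
            result[row_id] = record
    return result


rows = rows_by_id(ET.fromstring(SAMPLE))
print(rows["1"]["item 2"])  # Accepted
```

With the rows keyed by id, the same lookup could be repeated against the other two files to compare them field by field.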
EDIT: I've tried multiple things, and I'll have to dig through the various .py scripts to find all of them. Here is the most recent attempt, based on the question posted here:
from lxml import etree as ET

def filter_by_itemid(doc, idlist):
    rowset = doc.xpath("//row")
    for elem in rowset.getchildren():
        if elem.get("*") not in idlist:
            rowset.remove(elem)
    return doc

doc = ET.parse("my.xml")
filter_by_itemid(doc, ['id', 'name', 'item 1', 'item 2', 'item 3'])
print(ET.tostring(doc))
I know I am doing something wrong somewhere, and the formatting of the xml (which I am unable to change at the source) isn't helping...
The error I receive is "AttributeError: 'list' object has no attribute 'getchildren'".
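For context on the error: in lxml, `doc.xpath("//row")` returns a Python list of matching elements, not a single element, so the list has no `getchildren()` method. Separately, `elem.get("*")` looks up an attribute literally named `*` and would return `None` for every field. A minimal sketch of a filter that loops over the rows and checks each field's `name` attribute follows. It uses the standard library's `xml.etree.ElementTree` instead of lxml, and the function name `filter_by_field_name` and the `extra` field in the inline sample are hypothetical, added only to show something being removed.

```python
import xml.etree.ElementTree as ET


def filter_by_field_name(root, keep):
    """Remove every <field> whose name attribute is not in keep."""
    # findall() returns a plain list of <row> elements, so iterate over it
    # instead of calling element methods on the list itself.
    for row in root.findall(".//row"):
        # Copy the children first; removing while iterating skips elements.
        for field in list(row):
            if field.get("name") not in keep:
                row.remove(field)
    return root


# Inline sample with one hypothetical unwanted field named "extra".
SAMPLE = """<resultset table="foo_bar">
<row>
<field name="id">1</field>
<field name="name">foo 1</field>
<field name="extra">not needed</field>
<field name="item 2">Accepted</field>
</row>
</resultset>"""

root = filter_by_field_name(ET.fromstring(SAMPLE), {"id", "name", "item 2"})
names = [field.get("name") for field in root.iter("field")]
print(names)  # ['id', 'name', 'item 2']
```

The same two loops should carry over to lxml, since its elements also support iteration, `get()`, and `remove()`.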