As you can see from the XML here, there are multiple <item> nodes, each with a set of children such as <summary>, <status> and <key>.

The problem I've encountered is that with minidom it's possible to get the values of firstChild and lastChild, but not necessarily any values in between.

I've created the code below, which doesn't work, but I think it's a close approximation of what I need to be doing:

import xml.dom.minidom

xml = xml.dom.minidom.parse(result) # or xml.dom.minidom.parseString(xml_string)

itemList = xml.getElementsByTagName('item')
for item in itemList[1:]:
    summaryList = item.getElementsByTagName('summary')
    statusList = item.getElementsByTagName('status')
    keyList = item.getElementsByTagName('key')

    lineText = (summaryList[0].nodeValue + " " + statusList[0].nodeValue  + " " + keyList[0].nodeValue)

    p = Paragraph(lineText, style)
    Story.append(p)
TheMightyLlama

2 Answers

Define a get_text() function that joins all of the text child nodes of the first matching element (see this answer):

import xml.dom.minidom


def get_text(elements):
    # Join the text child nodes of the first element in the list.
    return " ".join(t.nodeValue for t in elements[0].childNodes
                    if t.nodeType == t.TEXT_NODE)


# data is the XML string from the question
dom = xml.dom.minidom.parseString(data)
itemList = dom.getElementsByTagName('item')
for item in itemList[1:]:  # skip the first item, as in the question
    summaryList = item.getElementsByTagName('summary')
    statusList = item.getElementsByTagName('status')
    keyList = item.getElementsByTagName('key')

    print(get_text(summaryList))
    print(get_text(statusList))
    print(get_text(keyList))
    print("----")

prints:

Unapprove all pull request reviewers after major change
Needs Triage
STASH-4473
----
Allow using left/right arrow to move side by side diff left/right
Needs Triage
STASH-4478
----
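
For anyone who wants to try this without the original feed, here is a minimal self-contained sketch of the same idea. The SAMPLE document and the helper names child_text() and item_lines() are my own assumptions for illustration; the real XML from the question is not shown in this thread, so its actual structure may differ. The helper also returns an empty string when a tag is missing, instead of raising an IndexError.

```python
import xml.dom.minidom

# Hypothetical sample: the structure of the real feed is assumed, not known.
SAMPLE = """<rss><channel>
<item>
  <summary>Unapprove all pull request reviewers after major change</summary>
  <status>Needs Triage</status>
  <key>STASH-4473</key>
</item>
<item>
  <summary>Allow using left/right arrow to move side by side diff left/right</summary>
  <status>Needs Triage</status>
  <key>STASH-4478</key>
</item>
</channel></rss>"""


def child_text(parent, tag):
    """Return the text of the first <tag> descendant of parent, or '' if absent."""
    matches = parent.getElementsByTagName(tag)
    if not matches:
        return ""
    # The text lives in text (or CDATA) child nodes, not on the element itself.
    return "".join(n.data for n in matches[0].childNodes
                   if n.nodeType in (n.TEXT_NODE, n.CDATA_SECTION_NODE))


def item_lines(xml_string, tags=("summary", "status", "key")):
    """Build one space-separated line of text per <item> element."""
    dom = xml.dom.minidom.parseString(xml_string)
    return [" ".join(child_text(item, t) for t in tags)
            for item in dom.getElementsByTagName("item")]


for line in item_lines(SAMPLE):
    print(line)
```

Each returned line could then be passed to Paragraph(lineText, style) as in the question.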

Hope that helps.

alecxe
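How about something like this:

for item in itemList:
    # Element nodes have nodeValue None; the text lives in their text-node children.
    fields = [child for child in item.childNodes if child.nodeType == child.ELEMENT_NODE]
    lineText = ' '.join(t.nodeValue for f in fields for t in f.childNodes
                        if t.nodeType == t.TEXT_NODE)
    p = Paragraph(lineText, style)
    Story.append(p)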
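
One thing worth checking with any loop over item.childNodes is what it actually visits. In minidom, an element node's own nodeValue is None, and the whitespace between child elements shows up as separate text nodes. The sketch below prints each child of a small made-up <item> (the document and the names describe() and element_texts() are assumptions for illustration only) and then collects just the text inside the child elements.

```python
import xml.dom.minidom

# Made-up document, only for illustration.
doc = xml.dom.minidom.parseString("""<item>
  <summary>First</summary>
  <key>K-1</key>
</item>""")
item = doc.documentElement


def describe(node):
    """Return (kind, nodeName, nodeValue) for a child node."""
    kind = "element" if node.nodeType == node.ELEMENT_NODE else "text"
    return (kind, node.nodeName, node.nodeValue)


def element_texts(parent):
    """Collect the leading text of each child element, skipping whitespace nodes."""
    return [c.firstChild.data for c in parent.childNodes
            if c.nodeType == c.ELEMENT_NODE
            and c.firstChild is not None
            and c.firstChild.nodeType == c.TEXT_NODE]


children = [describe(c) for c in item.childNodes]
for entry in children:
    print(entry)

print(" ".join(element_texts(item)))
```

Joining nodeValue directly over item.childNodes would hit the None values of the element nodes, which is why the text has to be read from each element's own text child.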
desfido