0

I've been looking for answer for couple hours for now but I can't seem t find an easy answer for how to find a tag in an XML file and extract its value. For example:

<?xml version="1.0" encoding="UTF-8"?>
<record name="test111.xml" type="content">
    <item name="Metadata">
        <value>
            <item name="label1">
                <value>
                    <item name="dummy1">
                        <value />
                    </item>
                </value>
            </item>
            <item name="Source">
                <value>test</value>
            </item>
            <item name="Channel">
                <value>celebrate</value>
            </item>
            <item name="Catref">
                <value>cat150032</value>
            </item>
            <item name="ParentsSource" />
            <item name="MobileFriendly" />
            <item name="VirtualPath">
                <value>/testFolder/testItem</value>
            </item>
            <item name="VirtualFileName">
                <value>sweets/</value>
            </item>
            <item name="RefreshMode">
                <value>soft</value>
            </item>
            <item name="label2">
                <value>
                    <item name="dummy2">
                        <value />
                    </item>
                </value>
            </item>
        </value>
    </item>
</record>

In this instance, I'm trying to get the value of VirtualPath as/testFolder/testItem and maybe more.

from __future__ import print_function
import os

folderArray = ['\test1', '\test2', '\test3', '\test4']
path = 'D:\Projects\test'
for i in range(0, len(folderArray)):
    tempPath = path + folderArray[i]
    for root, dirs, files in os.walk(tempPath):
        for file in files:
            if file.endswith(".xml"):
                if (folderArray[0]):
                #  LOOK ONLY FOR VIRTUAL PATH
                    f = open(tempPath, 'r')
                    for line in f:
                        if "VirtualPath" in line:
                        # HERE FIND THE VALUE OF THE TAG

After opening the XML file I do not know how to look for a specific tag and find out its value. I know that I can do an indexof and substring the value out, but that is not the most efficient way.

EDIT

Found the answer after playing around a little. I know this has a duplicate but I didn't know about it before I've created this question. Answer:

from __future__ import print_function
import os
from xml.dom import minidom

folderArray = ['\test1', '\test2', '\test3', '\test4']
path = 'D:\Projects\test'
for i in range(0, len(folderArray)):
tempPath = path + folderArray[i]
for root, dirs, files in os.walk(tempPath):
    for file in files:
        if file.endswith(".xml"):
           xmldoc = minidom.parse(tempPath + '\\' + file)
           itemlist = xmldoc.getElementsByTagName('item')
           for s in itemlist:
               if (s.attributes['name'].value == 'VirtualPath'):
                   print(s.childNodes[0].firstChild.nodeValue)
Leustad
  • 143
  • 3
  • 13

0 Answers0